Organic reaction mechanism classification using machine learning

Abstract

A mechanistic understanding of catalytic organic reactions is crucial for the design of new catalysts, modes of reactivity and the development of greener and more sustainable chemical processes (refs. 1–13). Kinetic analysis lies at the core of mechanistic elucidation by facilitating direct testing of mechanistic hypotheses from experimental data. Traditionally, kinetic analysis has relied on the use of initial rates (ref. 14), logarithmic plots and, more recently, visual kinetic methods (refs. 15–18), in combination with mathematical rate law derivations. However, the derivation of rate laws and their interpretation require numerous mathematical approximations and, as a result, they are prone to human error and are limited to reaction networks with only a few steps operating under steady state. Here we show that a deep neural network model can be trained to analyse ordinary kinetic data and automatically elucidate the corresponding mechanism class, without any additional user input. The model identifies a wide variety of classes of mechanism with outstanding accuracy, including mechanisms out of steady state such as those involving catalyst activation and deactivation steps, and performs excellently even when the kinetic data contain substantial error or only a few time points. Our results demonstrate that artificial-intelligence-guided mechanism classification is a powerful new tool that can streamline and automate mechanistic elucidation. We are making this model freely available to the community and we anticipate that this work will lead to further advances in the development of fully automated organic reaction discovery and development.
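To make "ordinary kinetic data" concrete, the following is a minimal sketch of how a concentration-time profile for one simple catalytic mechanism could be simulated and then reduced to a few noisy time points. It is not the authors' published code. The mechanism (reversible substrate binding, product formation and optional deactivation of the free catalyst), the rate constants, the fixed-step Runge-Kutta integrator and the function names `rates`, `rk4_step`, `simulate` and `sample` are all illustrative assumptions.

```python
import random


def rates(state, k1, k_1, k2, kd):
    """Mass-action rates for cat + S <=> catS -> cat + P, plus cat -> inactive."""
    s, p, cat, cat_s = state
    bind = k1 * cat * s - k_1 * cat_s  # net substrate binding
    turn = k2 * cat_s                  # product-forming step
    dead = kd * cat                    # deactivation of free catalyst
    return (-bind, turn, turn - bind - dead, bind - turn)


def rk4_step(state, h, ks):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shift(slope, scale):
        return tuple(x + scale * d for x, d in zip(state, slope))
    a = rates(state, *ks)
    b = rates(shift(a, h / 2), *ks)
    c = rates(shift(b, h / 2), *ks)
    d = rates(shift(c, h), *ks)
    return tuple(x + h / 6 * (p + 2 * q + 2 * r + w)
                 for x, p, q, r, w in zip(state, a, b, c, d))


def simulate(s0, cat0, k1, k_1, k2, kd=0.0, t_end=40.0, steps=20000):
    """Return a list of (t, S, P, cat, catS) rows from t = 0 to t_end."""
    h = t_end / steps
    state = (s0, 0.0, cat0, 0.0)
    rows = [(0.0,) + state]
    for i in range(1, steps + 1):
        state = rk4_step(state, h, (k1, k_1, k2, kd))
        rows.append((i * h,) + state)
    return rows


def sample(rows, n_points=6, sigma=0.0, seed=0):
    """Pick n_points evenly spaced (t, S) pairs and add Gaussian noise to S."""
    rng = random.Random(seed)
    last = len(rows) - 1
    picks = [rows[round(i * last / (n_points - 1))] for i in range(n_points)]
    return [(t, s + rng.gauss(0.0, sigma)) for t, s, *_ in picks]


if __name__ == "__main__":
    # Hypothetical rate constants, chosen only so the reaction runs on this time scale.
    clean = simulate(s0=1.0, cat0=0.05, k1=10.0, k_1=1.0, k2=1.0)
    deact = simulate(s0=1.0, cat0=0.05, k1=10.0, k_1=1.0, k2=1.0, kd=0.2)
    for label, rows in (("no deactivation", clean), ("with deactivation", deact)):
        print(label, [(round(t, 1), round(s, 3)) for t, s in sample(rows, sigma=0.01)])
```

The two printed profiles differ only in the deactivation rate constant. Telling such profiles apart from a handful of noisy points is the kind of classification task the abstract describes, though the actual model, mechanism scope and error treatment are those of the paper, not of this sketch.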

Fig. 1: Relevance and state of the art on kinetic analysis.
Fig. 2: Mechanism scope and composition of data.
Fig. 3: Performance of the machine learning model on the test set with six time points per kinetic profile.
Fig. 4: Effect of error and number of data points on the performance of the machine learning model.
Fig. 5: Case studies with experimental kinetic data.

Data availability

The datasets generated for training, validation and testing are available from figshare: https://doi.org/10.48420/16965292.

Code availability

Trained models, weights and python scripts are available from https://doi.org/10.48420/16965271.

References

  1. Simonetti, M., Cannas, D. M., Just-Baringo, X., Vitorica-Yrezabal, I. J. & Larrosa, I. Cyclometallated ruthenium catalyst enables late-stage directed arylation of pharmaceuticals. Nat. Chem. 10, 724–731 (2018).

  2. Salazar, C. A. et al. Tailored quinones support high-turnover Pd catalysts for oxidative C-H arylation with O2. Science 370, 1454–1460 (2020).

  3. DiRocco, D. A. et al. A multifunctional catalyst that stereoselectively assembles prodrugs. Science 356, 426–430 (2017).

  4. Li, T. et al. Efficient, chemoenzymatic process for manufacture of the Boceprevir bicyclic [3.1.0]proline intermediate based on amine oxidase-catalyzed desymmetrization. J. Am. Chem. Soc. 134, 6467–6472 (2012).

  5. Nielsen, L. P., Stevenson, C. P., Blackmond, D. G. & Jacobsen, E. N. Mechanistic investigation leads to a synthetic improvement in the hydrolytic kinetic resolution of terminal epoxides. J. Am. Chem. Soc. 126, 1360–1362 (2004).

  6. van Dijk, L. et al. Mechanistic investigation of Rh(I)-catalysed asymmetric Suzuki–Miyaura coupling with racemic allyl halides. Nat. Catal. 4, 284–292 (2021).

  7. Camasso, N. M. & Sanford, M. S. Design, synthesis, and carbon-heteroatom coupling reactions of organometallic nickel(IV) complexes. Science 347, 1218–1220 (2015).

  8. Milo, A., Neel, A. J., Toste, F. D. & Sigman, M. S. A data-intensive approach to mechanistic elucidation applied to chiral anion catalysis. Science 347, 737–743 (2015).

  9. Butcher, T. W. et al. Desymmetrization of difluoromethylene groups by C-F bond activation. Nature 583, 548–553 (2020).

  10. Cho, E. J. et al. The palladium-catalyzed trifluoromethylation of aryl chlorides. Science 328, 1679–1681 (2010).

  11. Hutchinson, G., Alamillo-Ferrer, C. & Burés, J. Mechanistically guided design of an efficient and enantioselective aminocatalytic alpha-chlorination of aldehydes. J. Am. Chem. Soc. 143, 6805–6809 (2021).

  12. Schreyer, L. et al. Confined acids catalyze asymmetric single aldolizations of acetaldehyde enolates. Science 362, 216–219 (2018).

  13. Peters, B. K. et al. Scalable and safe synthetic organic electroreduction inspired by Li-ion battery chemistry. Science 363, 838–845 (2019).

  14. Michaelis, L. & Menten, M. L. Die Kinetik der Invertinwirkung. Biochem. Z. 49, 333–369 (1913).

  15. Blackmond, D. G. Reaction progress kinetic analysis: a powerful methodology for mechanistic studies of complex catalytic reactions. Angew. Chem. Int. Ed. Engl. 44, 4302–4320 (2005).

  16. Mathew, J. S. et al. Investigations of Pd-catalyzed ArX coupling reactions informed by reaction progress kinetic analysis. J. Org. Chem. 71, 4711–4722 (2006).

  17. Burés, J. A simple graphical method to determine the order in catalyst. Angew. Chem. Int. Ed. Engl. 55, 2028–2031 (2016).

  18. Burés, J. Variable time normalization analysis: general graphical elucidation of reaction orders from concentration profiles. Angew. Chem. Int. Ed. Engl. 55, 16084–16087 (2016).

  19. Shi, Y., Prieto, P. L., Zepel, T., Grunert, S. & Hein, J. E. Automated experimentation powers data science in chemistry. Acc. Chem. Res. 54, 546–555 (2021).

  20. Burger, B. et al. A mobile robotic chemist. Nature 583, 237–241 (2020).

  21. Bedard, A. C. et al. Reconfigurable system for automated optimization of diverse chemical reactions. Science 361, 1220–1225 (2018).

  22. Steiner, S. et al. Organic synthesis in a modular robotic system driven by a chemical programming language. Science 363, eaav2211 (2019).

  23. Clauset, A., Shalizi, C. R. & Newman, M. E. J. Power-law distributions in empirical data. SIAM Rev. 51, 661–703 (2009).

  24. Martinez-Carrion, A. et al. Kinetic treatments for catalyst activation and deactivation processes based on variable time normalization analysis. Angew. Chem. Int. Ed. Engl. 58, 10189–10193 (2019).

  25. Bernacki, J. P. & Murphy, R. M. Model discrimination and mechanistic interpretation of kinetic data in protein aggregation studies. Biophys. J. 96, 2871–2887 (2009).

  26. Pfluger, P. M. & Glorius, F. Molecular machine learning: the future of synthetic chemistry? Angew. Chem. Int. Ed. Engl. 59, 18860–18865 (2020).

  27. Segler, M. H. S., Preuss, M. & Waller, M. P. Planning chemical syntheses with deep neural networks and symbolic AI. Nature 555, 604–610 (2018).

  28. Raissi, M., Yazdani, A. & Karniadakis, G. E. Hidden fluid mechanics: learning velocity and pressure fields from flow visualizations. Science 367, 1026–1030 (2020).

  29. Hermann, J., Schätzle, Z. & Noé, F. Deep-neural-network solution of the electronic Schrödinger equation. Nat. Chem. 12, 891–897 (2020).

  30. Shields, B. J. et al. Bayesian reaction optimization as a tool for chemical synthesis. Nature 590, 89–96 (2021).

  31. Tunyasuvunakool, K. et al. Highly accurate protein structure prediction for the human proteome. Nature 596, 590–596 (2021).

  32. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

  33. Hueffel, J. A. et al. Accelerated dinuclear palladium catalyst identification through unsupervised machine learning. Science 374, 1134–1140 (2021).

  34. Haitao, X., Junjie, W. & Lu, L. In Proc. 1st International Conference on E-Business Intelligence 303–309 (Atlantis Press, 2010).

  35. Batista, G. E. A. P. A. et al. In Advances in Intelligent Data Analysis VI (eds Fazel Famili, A. et al.) 24–35 (Springer, 2005).

  36. Wei, J.-M., Yuan, X.-J., Hu, Q.-H. & Wang, S.-Q. A novel measure for evaluating classifiers. Expert Syst. Appl. 37, 3799–3809 (2010).

  37. Alberton, A. L., Schwaab, M., Schmal, M. & Pinto, J. C. Experimental errors in kinetic tests and its influence on the precision of estimated parameters. Part I—analysis of first-order reactions. Chem. Eng. J. 155, 816–823 (2009).

  38. Pacheco, H., Thiengo, F., Schmal, M. & Pinto, J. C. A family of kinetic distributions for interpretation of experimental fluctuations in kinetic problems. Chem. Eng. J. 332, 303–311 (2018).

  39. Storer, A. C., Darlison, M. G. & Cornish-Bowden, A. The nature of experimental error in enzyme kinetic measurments. Biochem. J. 151, 361–367 (1975).

  40. Valkó, É. & Turányi, T. In Lindner, E., Micheletti, A. & Nunes, C. (eds) Mathematical Modelling in Real Life Problems. Mathematics in Industry https://doi.org/10.1007/978-3-030-50388-8_3 (2020).

  41. Thiel, V., Wannowius, K. J., Wolff, C., Thiele, C. M. & Plenio, H. Ring-closing metathesis reactions: interpretation of conversion-time data. Chem. Eur. J. 19, 16403–16414 (2013).

  42. Joannou, M. V., Hoyt, J. M. & Chirik, P. J. Investigations into the mechanism of inter- and intramolecular iron-catalyzed [2 + 2] cycloaddition of alkenes. J. Am. Chem. Soc. 142, 5314–5330 (2020).

  43. Knapp, S. M. M. et al. Mechanistic studies of alkene isomerization catalyzed by CCC-pincer complexes of iridium. Organometallics 33, 473–484 (2014).

  44. Stroek, W., Keilwerth, M., Pividori, D. M., Meyer, K. & Albrecht, M. An iron-mesoionic carbene complex for catalytic intramolecular C-H amination utilizing organic azides. J. Am. Chem. Soc. 143, 20157–20165 (2021).

  45. Lehnherr, D. et al. Discovery of a photoinduced dark catalytic cycle using in situ LED-NMR spectroscopy. J. Am. Chem. Soc. 140, 13843–13853 (2018).

  46. Ludwig, J. R., Zimmerman, P. M., Gianino, J. B. & Schindler, C. S. Iron(III)-catalysed carbonyl-olefin metathesis. Nature 533, 374–379 (2016).

  47. Albright, H. et al. Catalytic carbonyl-olefin metathesis of aliphatic ketones: iron(III) homo-dimers as Lewis acidic superelectrophiles. J. Am. Chem. Soc. 141, 1690–1700 (2019).

  48. Janse van Rensburg, W., Steynberg, P. J., Meyer, W. H., Kirk, M. M. & Forman, G. S. DFT prediction and experimental observation of substrate-induced catalyst decomposition in ruthenium-catalyzed olefin metathesis. J. Am. Chem. Soc. 126, 14332–14333 (2004).

  49. van der Eide, E. F. & Piers, W. E. Mechanistic insights into the ruthenium-catalysed diene ring-closing metathesis reaction. Nat. Chem. 2, 571–576 (2010).

Acknowledgements

We thank the European Research Council for an Advanced Grant (no. 833337 to I.L.) and Research IT for assistance given and use of the Computational Shared Facility at The University of Manchester. We thank H. Plenio, P. J. Chirik, A. R. Chianese, M. Albrecht and C. S. Schindler for providing the numerical experimental kinetic data used in this study.

Author information

Contributions

I.L. and J.B. conceived the project. J.B. selected the mechanisms with input from I.L. I.L. wrote the code, created the datasets and trained the models with input from J.B. I.L. and J.B. analysed data. I.L. and J.B. wrote the manuscript.

Corresponding authors

Correspondence to Jordi Burés or Igor Larrosa.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Tiago Rodrigues and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Additional case study with experimental kinetic data.

Includes the reaction under study, the experimental kinetic data used as input for the AI model and the model's output. Symbols correspond to substrate concentration. Red triangles: lowest catalyst loading; yellow squares: medium catalyst loading; blue circles: highest catalyst loading. Data from ref. 47.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Burés, J., Larrosa, I. Organic reaction mechanism classification using machine learning. Nature 613, 689–695 (2023). https://doi.org/10.1038/s41586-022-05639-4

  • DOI: https://doi.org/10.1038/s41586-022-05639-4
