Using machine learning to go beyond potential energy surface benchmarking for chemical reactivity

A preprint version of the article is available at arXiv.

Abstract

We train an equivariant machine learning (ML) model to predict energies and forces for hydrogen combustion under conditions of finite temperature and pressure. This challenging case for reactive chemistry illustrates that ML potential energy surfaces are difficult to make complete, owing to overreliance on chemical intuition about which data are important for training. Instead, a ‘negative design’ data acquisition strategy using metadynamics as part of an active learning workflow helps to create an ML model that avoids unforeseen high-energy or unphysical energy configurations. This strategy converges the potential energy surfaces more rapidly, to the point that, when the query-by-committee models disagree, it is now more efficient to call the external ab initio source and continue the molecular dynamics than to retrain the ML model. With the hybrid ML–physics model we realize a reduction in cost of two orders of magnitude, allowing for prediction of the free-energy change in the transition-state mechanism for several hydrogen combustion reaction channels.
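The hybrid ML–physics idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the disagreement measure (largest standard deviation of any force component across the committee), the threshold value and the toy harmonic "reference" are all assumptions chosen for illustration. The sketch only shows the switching logic: use the committee mean when the models agree, and fall back to the reference calculation when they do not.

```python
import numpy as np

def committee_disagreement(force_predictions):
    """Largest standard deviation of any force component across the committee.

    force_predictions has shape (n_models, n_atoms, 3).
    """
    return float(np.std(force_predictions, axis=0).max())

def hybrid_forces(positions, committee, reference, threshold):
    """Return (forces, source) for one molecular dynamics step.

    committee: list of callables mapping positions -> forces (ML models).
    reference: callable mapping positions -> forces (stand-in for ab initio).
    threshold: hypothetical disagreement cutoff, in force units.
    """
    preds = np.stack([model(positions) for model in committee])
    if committee_disagreement(preds) > threshold:
        # Committee members disagree: defer to the reference method.
        return reference(positions), "reference"
    # Committee members agree: use their mean prediction.
    return preds.mean(axis=0), "ml"

# Toy stand-ins (assumptions): a harmonic reference and three "models"
# that match it near the origin and drift apart far from it.
def reference(x):
    return -x

def make_model(scale):
    def model(x):
        return -x * (1.0 + scale * x ** 2)
    return model

committee = [make_model(s) for s in (-0.02, 0.0, 0.02)]

near = np.full((2, 3), 0.1)   # well-sampled region: models agree
far = np.full((2, 3), 3.0)    # extrapolation region: models disagree

forces_near, source_near = hybrid_forces(near, committee, reference, 0.05)
forces_far, source_far = hybrid_forces(far, committee, reference, 0.05)
print(source_near, source_far)
```

In this toy setting the near-origin step is served by the committee mean and the far-from-origin step by the reference call, with no retraining in between.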

Fig. 1: Observations of the missing data in the ML model for hydrogen combustion and the addition of dilation data.
Fig. 2: Schematic illustration of AL workflow using query-by-committee and metadynamics.
Fig. 3: The AL data and error for rxn18.
Fig. 4: Schematic illustration of the new workflow for rebuilding the free-energy surface.
Fig. 5: Free-energy surface reconstructed from metadynamics using the hybrid model for hydrogen combustion.
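As a rough illustration of what reconstructing a free-energy surface from metadynamics involves, the sketch below sums deposited Gaussian hills along a single collective variable and negates the result. It assumes plain (non-tempered) metadynamics, for which the accumulated bias approximates the negative of the free energy up to a constant. The function names, the one-dimensional collective variable and the parameters are hypothetical, and the paper's actual collective variables and bias settings are not reproduced here.

```python
import numpy as np

def bias_potential(grid, centers, heights, sigma):
    """Sum of deposited Gaussian hills evaluated on a 1D grid of CV values."""
    grid = np.asarray(grid, dtype=float)[:, None]
    centers = np.asarray(centers, dtype=float)[None, :]
    heights = np.asarray(heights, dtype=float)[None, :]
    hills = heights * np.exp(-0.5 * ((grid - centers) / sigma) ** 2)
    return hills.sum(axis=1)

def free_energy_estimate(grid, centers, heights, sigma):
    """Plain-metadynamics estimate F(s) ~ -V(s), shifted so that min F = 0."""
    f = -bias_potential(grid, centers, heights, sigma)
    return f - f.min()

# One hill of height 2.0 deposited at s = 0 (illustrative numbers only).
grid = np.linspace(-1.0, 1.0, 5)
free_energy = free_energy_estimate(grid, centers=[0.0], heights=[2.0], sigma=0.5)
```

For well-tempered metadynamics the bias would additionally need rescaling by a factor that depends on the bias factor, which this sketch omits.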

Data availability

Coordinates of geometries, energies and forces for the original hydrogen combustion dataset (ref. 17) are available at https://doi.org/10.6084/m9.figshare.19601689. IRC dilation data and active learning-generated data (ref. 44) used in the training are available at https://doi.org/10.6084/m9.figshare.23290115.v1. Source data for Figs. 1, 3 and 5 are available with this paper.

Code availability

The full workflow code (ref. 45) can be found at https://github.com/THGLab/H2Combustion_AL.

References

  1. Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).

  2. Smith, J. S., Isayev, O. & Roitberg, A. E. ANI-1, a data set of 20 million calculated off-equilibrium conformations for organic molecules. Sci. Data 4, 170193 (2017).

  3. Smith, J. S., Nebgen, B., Lubbers, N., Isayev, O. & Roitberg, A. E. Less is more: sampling chemical space with active learning. J. Chem. Phys. 148, 241733 (2018).

  4. Smith, J. S. et al. Approaching coupled cluster accuracy with a general-purpose neural network potential through transfer learning. Nat. Commun. 10, 2903 (2019).

  5. Chmiela, S., Sauceda, H. E., Müller, K.-R. & Tkatchenko, A. Towards exact molecular dynamics simulations with machine-learned force fields. Nat. Commun. 9, 3887 (2018).

  6. Chmiela, S. et al. Machine learning of accurate energy-conserving molecular force fields. Sci. Adv. 3, e1603015 (2017).

  7. Schütt, K. T., Sauceda, H. E., Kindermans, P. J., Tkatchenko, A. & Müller, K. R. SchNet-a deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).

  8. Thomas, N. et al. Tensor field networks: rotation- and translation-equivariant neural networks for 3D point clouds. Preprint at https://arxiv.org/abs/1802.08219 (2018).

  9. Drautz, R. Atomic cluster expansion for accurate and transferable interatomic potentials. Phys. Rev. B 99, 014104 (2019).

  10. Anderson, B., Hy, T. S. & Kondor, R. Cormorant: covariant molecular neural networks. Adv. Neural Inf. Process. Syst. 32, 03573b32 (2019).

  11. Qiao, Z., Welborn, M., Anandkumar, A., Manby, F. R. & Miller, T. F. OrbNet: deep learning for quantum chemistry using symmetry-adapted atomic-orbital features. J. Chem. Phys. 153, 124111 (2020).

  12. Batzner, S. et al. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nat. Commun. 13, 2453 (2022).

  13. Glick, Z. L., Koutsoukas, A., Cheney, D. L. & Sherrill, C. D. Cartesian message passing neural networks for directional properties: fast and transferable atomic multipoles. J. Chem. Phys. 154, 224103 (2021).

  14. Schütt, K., Unke, O. & Gastegger, M. Equivariant message passing for the prediction of tensorial properties and molecular spectra. In Proc. 38th International Conference on Machine Learning, Proc. Machine Learning Research Vol. 139 (eds Meila, M. & Zhang, T.) 9377–9388 (PMLR, 2021).

  15. Unke, O. T. et al. SpookyNet: learning force fields with electronic degrees of freedom and nonlocal effects. Nat. Commun. 12, 7273 (2021).

  16. Haghighatlari, M. et al. NewtonNet: a Newtonian message passing network for deep learning of interatomic potentials and forces. Digit. Discov. 1, 333–343 (2022).

  17. Guan, X. et al. A benchmark dataset for hydrogen combustion. Sci. Data 9, 215 (2022).

  18. Mardirossian, N. & Head-Gordon, M. ωB97X-V: a 10-parameter, range-separated hybrid, generalized gradient approximation density functional with nonlocal correlation, designed by a survival-of-the-fittest strategy. Phys. Chem. Chem. Phys. 16, 9904–9924 (2014).

  19. Bertels, L. W., Newcomb, L. B., Alaghemandi, M., Green, J. R. & Head-Gordon, M. Benchmarking the performance of the ReaxFF reactive force field on hydrogen combustion systems. J. Phys. Chem. A 124, 5631–5645 (2020).

  20. Li, J., Zhao, Z., Kazakov, A. & Dryer, F. L. An updated comprehensive kinetic model of hydrogen combustion. Int. J. Chem. Kinet. 36, 566–575 (2004).

  21. Kulichenko, M. et al. Uncertainty-driven dynamics for active learning of interatomic potentials. Nat. Comput. Sci. 3, 230–239 (2023).

  22. Shapeev, A., Gubaev, K., Tsymbalov, E. & Podryabinkin, E. in Machine Learning Meets Quantum Physics Lecture Notes in Physics Vol. 968 (eds Schütt, K. et al.) 309–329 (Springer, 2020).

  23. Seung, H. S., Opper, M. & Sompolinsky, H. Query by committee. In Fifth Annual Workshop on Computational Learning Theory 287–294 (Association for Computing Machinery, 1992).

  24. Schran, C. et al. Machine learning potentials for complex aqueous systems made simple. Proc. Natl Acad. Sci. USA 118, e2110077118 (2021).

  25. Ang, S. J., Wang, W., Schwalbe-Koda, D., Axelrod, S. & Gómez-Bombarelli, R. Active learning accelerates ab initio molecular dynamics on reactive energy surfaces. Chem 7, 738–751 (2021).

  26. Zhang, S. et al. Exploring the frontiers of chemistry with a general reactive machine learning potential. Preprint at ChemRxiv https://doi.org/10.26434/chemrxiv-2022-15ct6 (2023).

  27. Khalak, Y., Tresadern, G., Hahn, D. F., de Groot, B. L. & Gapsys, V. Chemical space exploration with active learning and alchemical free energies. J. Chem. Theory Comput. 18, 6259–6270 (2022).

  28. Yang, M., Bonati, L., Polino, D. & Parrinello, M. Using metadynamics to build neural network potentials for reactive events: the case of urea decomposition in water. Catal. Today 387, 143–149 (2022).

  29. Laio, A. & Parrinello, M. Escaping free-energy minima. Proc. Natl Acad. Sci. USA 99, 12562–12566 (2002).

  30. Barducci, A., Bussi, G. & Parrinello, M. Well-tempered metadynamics: a smoothly converging and tunable free-energy method. Phys. Rev. Lett. 100, 020603 (2008).

  31. Ko, T. et al. Using diffusion maps to analyze reaction dynamics for a hydrogen combustion benchmark dataset. J. Chem. Theory Comput. 19, 5872–5885 (2023).

  32. van der Oord, C. et al. Hyperactive learning for data-driven interatomic potentials. NPJ Comput. Mater. 9, 168 (2023).

  33. Mueller, T., Hernandez, A. & Wang, C. Machine learning for interatomic potential models. J. Chem. Phys. 152, 050902 (2020).

  34. Ramakrishnan, R., Dral, P. O., Rupp, M. & von Lilienfeld, O. A. Big data meets quantum chemistry approximations: the Δ-machine learning approach. J. Chem. Theory Comput. 11, 2087–2096 (2015).

  35. Böselt, L., Thürlemann, M. & Riniker, S. Machine learning in QM/MM molecular dynamics simulations of condensed-phase systems. J. Chem. Theory Comput. 17, 2641–2658 (2021).

  36. Shao, Y. et al. Advances in molecular quantum chemistry contained in the Q-Chem 4 program package. Mol. Phys. 113, 184–215 (2014).

  37. Tribello, G. A., Bonomi, M., Branduardi, D., Camilloni, C. & Bussi, G. PLUMED 2: new feathers for an old bird. Comput. Phys. Commun. 185, 604–613 (2014).

  38. Larsen, A. H. et al. The atomic simulation environment—a Python library for working with atoms. J. Phys. Condensed Matter 29, 273002 (2017).

  39. Rupp, M., Tkatchenko, A., Müller, K.-R. & von Lilienfeld, O. A. Fast and accurate modeling of molecular atomization energies with machine learning. Phys. Rev. Lett. 108, 058301 (2012).

  40. Herman-Saffar, O. An approach for choosing number of clusters for k-means. Medium https://towardsdatascience.com/an-approach-for-choosing-number-of-clusters-for-k-means-c28e614ecb2c (2021).

  41. Epifanovsky, E. et al. Software for the frontiers of quantum chemistry: an overview of developments in the Q-Chem 5 package. J. Chem. Phys. 155, 084801 (2021).

  42. Van Voorhis, T. & Head-Gordon, M. A geometric approach to direct minimization. Mol. Phys. 100, 1713–1721 (2002).

  43. Khaliullin, R. Z., Cobar, E. A., Lochan, R. C., Bell, A. T. & Head-Gordon, M. Unravelling the origin of intermolecular interactions using absolutely localized molecular orbitals. J. Phys. Chem. A 111, 8753–8765 (2007).

  44. Guan, X., Heindel, J., Ko, T., Yang, C. & Head-Gordon, T. Hydrogen combustion supplemetary data from an active learning study. figshare https://doi.org/10.6084/m9.figshare.23290115.v1 (2023).

  45. Guan, X. Thglab/h2combustion_al: v1.0.0. Zenodo https://doi.org/10.5281/zenodo.8378075 (2023).

Acknowledgements

X.G., J.P.H. and T.H.-G. thank the CPIMS program, Office of Science, Office of Basic Energy Sciences, Chemical Sciences Division of the US Department of Energy under contract DE-AC02-05CH11231 for support of the machine learning approach to hydrogen combustion. T.K. and C.Y. thank the US Department of Energy via the Scientific Discovery through Advanced Computing (SciDAC) program for the collective variables. This work used computational resources provided by the National Energy Research Scientific Computing Center (NERSC), a US Department of Energy Office of Science User Facility operated under contract DE-AC02-05CH11231.

Author information

Contributions

X.G. and T.H.-G. designed the project. X.G. carried out the AIMD simulations, metadynamics calculations and active learning. X.G. and T.H.-G. designed the collective coordinates with the help of J.P.H., T.K. and C.Y. X.G. and T.H.-G. wrote the paper. All authors discussed the results and made comments and edits to the paper.

Corresponding author

Correspondence to Teresa Head-Gordon.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Computational Science thanks Benjamin Nebgen, David van der Spoel, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Kaitlin McCardle, in collaboration with the Nature Computational Science team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1 and 2, Discussion and Tables 1–4. Supplementary Fig. 1: Two representative structures that the original ML model predicts with large error. Supplementary Fig. 2: Spot checking the hybrid mode model for rxn18 for energies and forces. Supplementary Table 1: The 19 reactions contained in the hydrogen combustion benchmark dataset. Supplementary Table 2: Metadynamics collective variables used in the active learning and free-energy reconstruction. Supplementary Table 3: Total number of data points added in active learning for each reaction. Supplementary Table 4: AIMD committer analysis on identified free-energy transition state from the hybrid model at 500 K.

Source data

Source Data Fig. 1

Source data including MD trajectory for Fig. 1a and source geometry and energy data for Fig. 1b,c.

Source Data Fig. 3

Source data geometry and energy data for Fig. 3a,c,e,g and statistical source data for Fig. 3b,d,f,h.

Source Data Fig. 5

Atomic coordinates and statistical source data for Fig. 5.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Guan, X., Heindel, J.P., Ko, T. et al. Using machine learning to go beyond potential energy surface benchmarking for chemical reactivity. Nat Comput Sci 3, 965–974 (2023). https://doi.org/10.1038/s43588-023-00549-5

  • DOI: https://doi.org/10.1038/s43588-023-00549-5
