Abstract
We train an equivariant machine learning (ML) model to predict energies and forces for hydrogen combustion under conditions of finite temperature and pressure. This challenging case for reactive chemistry illustrates that ML potential energy surfaces are difficult to make complete, owing to overreliance on chemical intuition about which data are important for training. Instead, a ‘negative design’ data acquisition strategy using metadynamics as part of an active learning workflow helps to create an ML model that avoids unforeseen high-energy or unphysical configurations. This strategy converges the potential energy surface more rapidly, to the point where it becomes more efficient to call the external ab initio source whenever the query-by-committee models disagree, so that molecular dynamics can be propagated further in time without ML retraining. With the hybrid ML–physics model we realize a two orders of magnitude reduction in cost, allowing for prediction of the free-energy change in the transition-state mechanism for several hydrogen combustion reaction channels.
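The hybrid ML–physics idea in the abstract can be illustrated with a minimal sketch: a committee of models predicts forces, and when their predictions disagree by more than a threshold, the step falls back to a reference calculation. This is not the authors' implementation. The disagreement measure (largest per-component standard deviation), the threshold value, the function names and the one-dimensional toy models are all assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence, Tuple

Vector = List[float]                      # flattened coordinates or forces (toy representation)
ForceModel = Callable[[Vector], Vector]   # hypothetical signature: coordinates -> forces


def committee_disagreement(predictions: Sequence[Vector]) -> float:
    """Largest per-component standard deviation across committee members."""
    return max(pstdev(component) for component in zip(*predictions))


def hybrid_forces(
    coords: Vector,
    committee: Sequence[ForceModel],
    reference: ForceModel,
    threshold: float,
) -> Tuple[Vector, bool]:
    """Return (forces, used_reference).

    The committee mean is used when members agree; otherwise the
    (expensive) reference calculation is called instead.
    """
    predictions = [model(coords) for model in committee]
    if committee_disagreement(predictions) > threshold:
        return reference(coords), True
    averaged = [mean(component) for component in zip(*predictions)]
    return averaged, False


def make_toy_model(k: float, c: float) -> ForceModel:
    """Toy force model F = -k*x - c*x**3; stands in for a trained ML potential."""
    return lambda coords: [-k * x - c * x ** 3 for x in coords]


# Three toy committee members that agree near x = 0 and diverge far from it.
committee = [make_toy_model(0.98, -0.1), make_toy_model(1.00, 0.0), make_toy_model(1.02, 0.1)]
# Stand-in for the ab initio source (assumed harmonic here).
reference = make_toy_model(1.00, 0.0)

near_forces, near_used_reference = hybrid_forces([0.1], committee, reference, threshold=0.05)
far_forces, far_used_reference = hybrid_forces([2.0], committee, reference, threshold=0.05)
```

In this toy setting the committee agrees at the small displacement and its mean is used, while at the large displacement the members diverge and the reference is called. In a real workflow the threshold would have to be calibrated against the target accuracy.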
Data availability
Coordinates of geometries, energies and forces for the original hydrogen combustion dataset (ref. 17) are available at https://doi.org/10.6084/m9.figshare.19601689. IRC dilation data and active learning-generated data (ref. 44) used in the training are available at https://doi.org/10.6084/m9.figshare.23290115.v1. Source data for Figs. 1, 3 and 5 are available with this paper.
Code availability
The full workflow code (ref. 45) can be found at https://github.com/THGLab/H2Combustion_AL.
References
Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).
Smith, J. S., Isayev, O. & Roitberg, A. E. ANI-1, a data set of 20 million calculated off-equilibrium conformations for organic molecules. Sci. Data 4, 170193 (2017).
Smith, J. S., Nebgen, B., Lubbers, N., Isayev, O. & Roitberg, A. E. Less is more: sampling chemical space with active learning. J. Chem. Phys. 148, 241733 (2018).
Smith, J. S. et al. Approaching coupled cluster accuracy with a general-purpose neural network potential through transfer learning. Nat. Commun. 10, 2903 (2019).
Chmiela, S., Sauceda, H. E., Müller, K.-R. & Tkatchenko, A. Towards exact molecular dynamics simulations with machine-learned force fields. Nat. Commun. 9, 3887 (2018).
Chmiela, S. et al. Machine learning of accurate energy-conserving molecular force fields. Sci. Adv. 3, e1603015 (2017).
Schütt, K. T., Sauceda, H. E., Kindermans, P.-J., Tkatchenko, A. & Müller, K.-R. SchNet – a deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).
Thomas, N. et al. Tensor field networks: rotation- and translation-equivariant neural networks for 3D point clouds. Preprint at https://arxiv.org/abs/1802.08219 (2018).
Drautz, R. Atomic cluster expansion for accurate and transferable interatomic potentials. Phys. Rev. B 99, 014104 (2019).
Anderson, B., Hy, T. S. & Kondor, R. Cormorant: covariant molecular neural networks. Adv. Neural Inf. Process. Syst. 32, 03573b32 (2019).
Qiao, Z., Welborn, M., Anandkumar, A., Manby, F. R. & Miller, T. F. OrbNet: deep learning for quantum chemistry using symmetry-adapted atomic-orbital features. J. Chem. Phys. 153, 124111 (2020).
Batzner, S. et al. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nat. Commun. 13, 2453 (2022).
Glick, Z. L., Koutsoukas, A., Cheney, D. L. & Sherrill, C. D. Cartesian message passing neural networks for directional properties: fast and transferable atomic multipoles. J. Chem. Phys. 154, 224103 (2021).
Schütt, K., Unke, O. & Gastegger, M. Equivariant message passing for the prediction of tensorial properties and molecular spectra. In Proc. 38th International Conference on Machine Learning, Proc. Machine Learning Research Vol. 139 (eds Meila, M. & Zhang, T.) 9377–9388 (PMLR, 2021).
Unke, O. T. et al. SpookyNet: learning force fields with electronic degrees of freedom and nonlocal effects. Nat. Commun. 12, 7273 (2021).
Haghighatlari, M. et al. NewtonNet: a Newtonian message passing network for deep learning of interatomic potentials and forces. Digit. Discov. 1, 333–343 (2022).
Guan, X. et al. A benchmark dataset for hydrogen combustion. Sci. Data 9, 215 (2022).
Mardirossian, N. & Head-Gordon, M. ωB97X-V: a 10-parameter, range-separated hybrid, generalized gradient approximation density functional with nonlocal correlation, designed by a survival-of-the-fittest strategy. Phys. Chem. Chem. Phys. 16, 9904–9924 (2014).
Bertels, L. W., Newcomb, L. B., Alaghemandi, M., Green, J. R. & Head-Gordon, M. Benchmarking the performance of the ReaxFF reactive force field on hydrogen combustion systems. J. Phys. Chem. A 124, 5631–5645 (2020).
Li, J., Zhao, Z., Kazakov, A. & Dryer, F. L. An updated comprehensive kinetic model of hydrogen combustion. Int. J. Chem. Kinet. 36, 566–575 (2004).
Kulichenko, M. et al. Uncertainty-driven dynamics for active learning of interatomic potentials. Nat. Comput. Sci. 3, 230–239 (2023).
Shapeev, A., Gubaev, K., Tsymbalov, E. & Podryabinkin, E. in Machine Learning Meets Quantum Physics Lecture Notes in Physics Vol. 968 (eds Schütt, K. et al.) 309–329 (Springer, 2020).
Seung, H. S., Opper, M. & Sompolinsky, H. Query by committee. In Fifth Annual Workshop on Computational Learning Theory 287–294 (Association for Computing Machinery, 1992).
Schran, C. et al. Machine learning potentials for complex aqueous systems made simple. Proc. Natl Acad. Sci. USA 118, e2110077118 (2021).
Ang, S. J., Wang, W., Schwalbe-Koda, D., Axelrod, S. & Gómez-Bombarelli, R. Active learning accelerates ab initio molecular dynamics on reactive energy surfaces. Chem 7, 738–751 (2021).
Zhang, S. et al. Exploring the frontiers of chemistry with a general reactive machine learning potential. Preprint at ChemRxiv https://doi.org/10.26434/chemrxiv-2022-15ct6 (2023).
Khalak, Y., Tresadern, G., Hahn, D. F., de Groot, B. L. & Gapsys, V. Chemical space exploration with active learning and alchemical free energies. J. Chem. Theory Comput. 18, 6259–6270 (2022).
Yang, M., Bonati, L., Polino, D. & Parrinello, M. Using metadynamics to build neural network potentials for reactive events: the case of urea decomposition in water. Catal. Today 387, 143–149 (2022).
Laio, A. & Parrinello, M. Escaping free-energy minima. Proc. Natl Acad. Sci. USA 99, 12562–12566 (2002).
Barducci, A., Bussi, G. & Parrinello, M. Well-tempered metadynamics: a smoothly converging and tunable free-energy method. Phys. Rev. Lett. 100, 020603 (2008).
Ko, T. et al. Using diffusion maps to analyze reaction dynamics for a hydrogen combustion benchmark dataset. J. Chem. Theory Comput. 19, 5872–5885 (2023).
van der Oord, C. et al. Hyperactive learning for data-driven interatomic potentials. NPJ Comput. Mater. 9, 168 (2023).
Mueller, T., Hernandez, A. & Wang, C. Machine learning for interatomic potential models. J. Chem. Phys. 152, 050902 (2020).
Ramakrishnan, R., Dral, P. O., Rupp, M. & von Lilienfeld, O. A. Big data meets quantum chemistry approximations: the Δ-machine learning approach. J. Chem. Theory Comput. 11, 2087–2096 (2015).
Böselt, L., Thürlemann, M. & Riniker, S. Machine learning in QM/MM molecular dynamics simulations of condensed-phase systems. J. Chem. Theory Comput. 17, 2641–2658 (2021).
Shao, Y. et al. Advances in molecular quantum chemistry contained in the Q-Chem 4 program package. Mol. Phys. 113, 184–215 (2014).
Tribello, G. A., Bonomi, M., Branduardi, D., Camilloni, C. & Bussi, G. PLUMED 2: new feathers for an old bird. Comput. Phys. Commun. 185, 604–613 (2014).
Larsen, A. H. et al. The atomic simulation environment—a Python library for working with atoms. J. Phys. Condensed Matter 29, 273002 (2017).
Rupp, M., Tkatchenko, A., Müller, K.-R. & von Lilienfeld, O. A. Fast and accurate modeling of molecular atomization energies with machine learning. Phys. Rev. Lett. 108, 058301 (2012).
Herman-Saffar, O. An approach for choosing number of clusters for k-means. Medium https://towardsdatascience.com/an-approach-for-choosing-number-of-clusters-for-k-means-c28e614ecb2c (2021).
Epifanovsky, E. et al. Software for the frontiers of quantum chemistry: an overview of developments in the Q-Chem 5 package. J. Chem. Phys. 155, 084801 (2021).
Van Voorhis, T. & Head-Gordon, M. A geometric approach to direct minimization. Mol. Phys. 100, 1713–1721 (2002).
Khaliullin, R. Z., Cobar, E. A., Lochan, R. C., Bell, A. T. & Head-Gordon, M. Unravelling the origin of intermolecular interactions using absolutely localized molecular orbitals. J. Phys. Chem. A 111, 8753–8765 (2007).
Guan, X., Heindel, J., Ko, T., Yang, C. & Head-Gordon, T. Hydrogen combustion supplemetary data from an active learning study. figshare https://doi.org/10.6084/m9.figshare.23290115.v1 (2023).
Guan, X. Thglab/h2combustion_al: v1.0.0. Zenodo https://doi.org/10.5281/zenodo.8378075 (2023).
Acknowledgements
X.G., J.P.H. and T.H.-G. thank the CPIMS program, Office of Science, Office of Basic Energy Sciences, Chemical Sciences Division of the US Department of Energy under contract DE-AC02-05CH11231 for support of the machine learning approach to hydrogen combustion. T.K. and C.Y. thank the US Department of Energy via the Scientific Discovery through Advanced Computing (SciDAC) program for the collective variables. This work used computational resources provided by the National Energy Research Scientific Computing Center (NERSC), a US Department of Energy Office of Science User Facility operated under contract DE-AC02-05CH11231.
Author information
Contributions
X.G. and T.H.-G. designed the project. X.G. carried out the AIMD simulations, metadynamics calculations and active learning. X.G. and T.H.-G. designed the collective coordinates with the help of J.P.H., T.K. and C.Y. X.G. and T.H.-G. wrote the paper. All authors discussed the results and made comments and edits to the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Computational Science thanks Benjamin Nebgen, David van der Spoel, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Kaitlin McCardle, in collaboration with the Nature Computational Science team.
Supplementary information
Supplementary Information
Supplementary Figs. 1 and 2, Discussion and Tables 1–4. Supplementary Fig. 1: Two representative structures that the original ML model predicts with large error. Supplementary Fig. 2: Spot checking the hybrid mode model for rxn18 for energies and forces. Supplementary Table 1: The 19 reactions contained in the hydrogen combustion benchmark dataset. Supplementary Table 2: Metadynamics collective variables used in the active learning and free-energy reconstruction. Supplementary Table 3: Total number of data points added in active learning for each reaction. Supplementary Table 4: AIMD committor analysis on identified free-energy transition state from the hybrid model at 500 K.
Source data
Source Data Fig. 5
Atomic coordinates and statistical source data for Fig. 5.
About this article
Cite this article
Guan, X., Heindel, J.P., Ko, T. et al. Using machine learning to go beyond potential energy surface benchmarking for chemical reactivity. Nat Comput Sci 3, 965–974 (2023). https://doi.org/10.1038/s43588-023-00549-5
DOI: https://doi.org/10.1038/s43588-023-00549-5