Abstract
Computational screening in heterogeneous catalysis relies increasingly on machine learning models for predicting key input parameters due to the high cost of computing these directly using first-principles methods. This becomes especially relevant when considering complex materials spaces such as alloys, or complex reaction mechanisms with adsorbates that may exhibit bi- or higher-dentate adsorption motifs. Here we present a data-efficient approach to the prediction of binding motifs and associated adsorption enthalpies of complex adsorbates at transition metals and their alloys based on a customized Wasserstein Weisfeiler–Lehman graph kernel and Gaussian process regression. The model shows good predictive performance, not only for the elemental transition metals on which it was trained, but also for an alloy based on these transition metals. Furthermore, incorporation of minimal new training data allows for predicting an out-of-domain transition metal. We believe the model may be useful in active learning approaches, for which we present an ensemble uncertainty estimation approach.
Data availability
The DFT-calculated adsorption energies and relaxed coordinates of the simple and complex adsorbates databases, as well as all calculated features, are available at https://github.com/Wenbintum/WWL-GPR and on Zenodo (ref. 48). Source Data are provided with this paper.
Code availability
The source code of WWL-GPR is publicly available on GitHub at https://github.com/Wenbintum/WWL-GPR and on Zenodo (ref. 48). We provide predefined tasks for tutorial purposes and for reproducing the results presented in this work. The RBF-GPR is implemented with Scikit-learn (ref. 55), which is available at https://scikit-learn.org. The SISSO code (ref. 18) is available at https://github.com/rouyang2017/SISSO, and the XGBoost code (ref. 20) is available at https://github.com/dmlc/xgboost.
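For readers unfamiliar with the RBF-GPR baseline named above, the following is a minimal sketch of Gaussian process regression with a radial basis function kernel in Scikit-learn. It is not taken from the WWL-GPR repository: the synthetic data, the feature count, the added white-noise term and the starting hyperparameters are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Hypothetical stand-in data: 40 samples with 3 numerical features each.
# Real inputs would be the calculated features from the published database.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(40, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 - 0.3 * X[:, 2]

X_train, y_train = X[:30], y[:30]
X_test, y_test = X[30:], y[30:]

# RBF kernel with a signal-variance prefactor plus a white-noise term (an assumption).
# Kernel hyperparameters are optimized during fit() via the log marginal likelihood.
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-3)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0)
gpr.fit(X_train, y_train)

# Predictive mean and standard deviation for the held-out points.
y_pred, y_std = gpr.predict(X_test, return_std=True)
mae = float(np.mean(np.abs(y_pred - y_test)))
print(f"MAE on held-out points: {mae:.3f}")
```

The standard deviation returned by `predict(..., return_std=True)` is the uncertainty estimate of a single GPR model.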
References
Cao, A. et al. Mechanistic insights into the synthesis of higher alcohols from syngas on CuCo alloys. ACS Catal. 8, 10148–10155 (2018).
Chang, C. & Medford, A. J. Application of density functional tight binding and machine learning to evaluate the stability of biomass intermediates on the Rh(111) surface. J. Phys. Chem. C 125, 18210–18216 (2021).
Wang, Z., Li, Y., Boes, J., Wang, Y. & Sargent, E. CO2 Electrocatalyst design using graph theory. Preprint at https://doi.org/10.21203/rs.3.rs-66715/v1 (2020).
Nørskov, J. K., Abild-Pedersen, F., Studt, F. & Bligaard, T. Density functional theory in surface chemistry and catalysis. Proc. Natl Acad. Sci. USA 108, 937–943 (2011).
Choi, Y. & Liu, P. Mechanism of ethanol synthesis from syngas on Rh(111). J. Am. Chem. Soc. 131, 13054–13061 (2009).
Michel, C., Auneau, F., Delbecq, F. & Sautet, P. C–H Versus O–H bond dissociation for alcohols on a Rh(111) surface: a strong assistance from hydrogen bonded neighbors. ACS Catal. 1, 1430–1440 (2011).
Filot, I. A. W. et al. First-principles-based microkinetics simulations of synthesis gas conversion on a stepped rhodium surface. ACS Catal. 5, 5453–5467 (2015).
Gu, T., Wang, B., Chen, S. & Yang, B. Automated generation and analysis of the complex catalytic reaction network of ethanol synthesis from syngas on Rh(111). ACS Catal. 10, 6346–6355 (2020).
Tran, K. & Ulissi, Z. W. Active learning across intermetallics to guide discovery of electrocatalysts for CO2 reduction and H2 evolution. Nat. Catal. 1, 696–703 (2018).
Noh, J., Back, S., Kim, J. & Jung, Y. Active learning with non-ab initio input features toward efficient CO2 reduction catalysts. Chem. Sci. 9, 5152–5159 (2018).
Andersen, M., Levchenko, S. V., Scheffler, M. & Reuter, K. Beyond scaling relations for the description of catalytic materials. ACS Catal. 9, 2752–2759 (2019).
Wang, S.-H., Pillai, H. S., Wang, S., Achenie, L. E. & Xin, H. Infusing theory into deep learning for interpretable reactivity prediction. Nat. Commun. 12, 1–9 (2021).
Fung, V., Hu, G., Ganesh, P. & Sumpter, B. G. Machine learned features from density of states for accurate adsorption energy prediction. Nat. Commun. 12, 88 (2021).
Back, S. et al. Convolutional neural network of atomic surface structures to predict binding energies for high-throughput screening of catalysts. J. Phys. Chem. Lett. 10, 4401–4408 (2019).
Gu, G. H. et al. Practical deep-learning representation for fast heterogeneous catalyst screening. J. Phys. Chem. Lett. 11, 3185–3191 (2020).
Chanussot, L. et al. Open Catalyst 2020 (OC20) dataset and community challenges. ACS Catal. 11, 6059–6072 (2021).
Togninalli, M., Ghisu, E., Llinares-López, F., Rieck, B. & Borgwardt, K. Wasserstein Weisfeiler–Lehman graph kernels. In Adv Neural Inf Process Syst. Vol. 32 (NeurIPS, 2019).
Ouyang, R., Curtarolo, S., Ahmetcik, E., Scheffler, M. & Ghiringhelli, L. M. SISSO: a compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates. Phys. Rev. Mater. 2, 083802 (2018).
Ouyang, R., Ahmetcik, E., Carbogno, C., Scheffler, M. & Ghiringhelli, L. M. Simultaneous learning of several materials properties from incomplete databases with multi-task SISSO. J. Phys. Mater. 2, 024002 (2019).
Chen, T. & Guestrin, C. XGBoost: a scalable tree boosting system. In Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (ACM, 2016).
Medford, A. J. et al. Activity and selectivity trends in synthesis gas conversion to higher alcohols. Top. Catal. 57, 135–142 (2014).
Schumann, J. et al. Selectivity of synthesis gas conversion to C2+ oxygenates on fcc(111) transition-metal surfaces. ACS Catal. 8, 3447–3453 (2018).
Deimel, M., Reuter, K. & Andersen, M. Active site representation in first-principles microkinetic models: data-enhanced computational screening for improved methanation catalysts. ACS Catal. 10, 13729–13736 (2020).
Deringer, V. L., Caro, M. A. & Csányi, G. Machine learning interatomic potentials as emerging tools for materials science. Adv. Mater. 31, 1902765 (2019).
Gasteiger, J., Becker, F. & Günnemann, S. GemNet: universal directional graph neural networks for molecules. In Conference on Neural Information Processing Systems Vol. 34 (NeurIPS, 2021).
Wen, M., Blau, S. M., Spotte-Smith, E. W. C., Dwaraknath, S. & Persson, K. A. BonDNet: a graph neural network for the prediction of bond dissociation energies for charged molecules. Chem. Sci. 12, 1858–1868 (2021).
Tang, Y.-H. & de Jong, W. A. Prediction of atomization energy using graph kernel and active learning. J. Chem. Phys. 150, 044107 (2019).
Xie, T. & Grossman, J. C. Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Phys. Rev. Lett. 120, 145301 (2018).
Montoya, J. H. & Persson, K. A. A high-throughput framework for determining adsorption energies on solid surfaces. npj Comput. Mater. 3, 14 (2017).
Boes, J. R., Mamun, O., Winther, K. & Bligaard, T. Graph theory approach to high-throughput surface adsorption structure generation. J. Phys. Chem. A 123, 2281–2285 (2019).
Deshpande, S., Maxson, T. & Greeley, J. Graph theory approach to determine configurations of multidentate and high coverage adsorbates for heterogeneous catalysis. npj Comput. Mater. 6, 79 (2020).
Xu, W., Andersen, M. & Reuter, K. Data-driven descriptor engineering and refined scaling relations for predicting transition metal oxide reactivity. ACS Catal. 11, 734–742 (2020).
Rupp, M. Machine learning for quantum mechanics in a nutshell. Int. J. Quantum Chem. 115, 1058–1073 (2015).
Deringer, V. L. et al. Gaussian process regression for materials and molecules. Chem. Rev. 121, 10073–10141 (2021).
Bruix, A., Margraf, J. T., Andersen, M. & Reuter, K. First-principles-based multiscale modelling of heterogeneous catalysis. Nat. Catal. 2, 659–670 (2019).
Meskine, H., Matera, S., Scheffler, M., Reuter, K. & Metiu, H. Examination of the concept of degree of rate control by first-principles kinetic Monte Carlo simulations. Surf. Sci. 603, 1724–1730 (2009).
Medford, A. J. et al. Assessing the reliability of calculated catalytic ammonia synthesis rates. Science 345, 197–200 (2014).
Sutton, J. E., Guo, W., Katsoulakis, M. A. & Vlachos, D. G. Effects of correlated parameters and uncertainty in electronic-structure-based chemical kinetic modelling. Nat. Chem. 8, 331 (2016).
Döpking, S. & Matera, S. Error propagation in first-principles kinetic Monte Carlo simulation. Chem. Phys. Lett. 674, 28–32 (2017).
Flores, R. A. et al. Active learning accelerated discovery of stable iridium oxide polymorphs for the oxygen evolution reaction. Chem. Mater. 32, 5854–5863 (2020).
Kunkel, C., Margraf, J. T., Chen, K., Oberhofer, H. & Reuter, K. Active discovery of organic semiconductors. Nat. Commun. 12, 1–11 (2021).
Tran, K. et al. Methods for comparing uncertainty quantifications for material property predictions. Mach. Learn. Sci. Technol. 1, 025006 (2020).
Palmer, G. et al. Calibration after bootstrap for accurate uncertainty quantification in regression models. npj Comput. Mater. 8, 1–9 (2022).
Kuleshov, V., Fenner, N. & Ermon, S. Accurate uncertainties for deep learning using calibrated regression. In International Conference on Machine Learning 2796–2804 (MLR Press, 2018).
Giannozzi, P. et al. QUANTUM ESPRESSO: a modular and open-source software project for quantum simulations of materials. J. Phys. Condens. Matter 21, 395502 (2009).
Wellendorff, J. et al. Density functionals for surface science: exchange-correlation model development with Bayesian error estimation. Phys. Rev. B 85, 235149 (2012).
Huber, S. P. et al. AiiDA 1.0, a scalable computational infrastructure for automated reproducible workflows and data provenance. Sci. Data 7, 300 (2020).
Xu, W., Reuter, K. & Andersen, M. Predicting Binding Motifs of Complex Adsorbates Using Machine Learning with a Physics-Inspired Graph Representation (Zenodo, 2022); https://doi.org/10.5281/zenodo.6640198
Larsen, A. H. et al. The atomic simulation environment—a Python library for working with atoms. J. Phys. Condens. Matter 29, 273002 (2017).
Dal Corso, A. Pseudopotentials periodic table: from H to Pu. Comput. Mater. Sci. 95, 337–350 (2014).
Garrity, K. F., Bennett, J. W., Rabe, K. M. & Vanderbilt, D. Pseudopotentials for high-throughput DFT calculations. Comput. Mater. Sci. 81, 446–452 (2014).
Bartók, A. P., Kondor, R. & Csányi, G. On representing chemical environments. Phys. Rev. B 87, 184115 (2013).
Esterhuizen, J. A., Goldsmith, B. R. & Linic, S. Theory-guided machine learning finds geometric structure–property relationships for chemisorption on subsurface alloys. Chem 6, 3100–3117 (2020).
Andersen, M. & Reuter, K. Adsorption enthalpies for catalysis modeling through machine-learned descriptors. Acc. Chem. Res. 54, 2741–2749 (2021).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Acknowledgements
The authors gratefully acknowledge support from the Max Planck Computing and Data Facility (MPCDF) and the Jülich Supercomputing Centre (www.fz-juelich.de/ias/jsc). W.X. is grateful for support through the China Scholarship Council (CSC). M.A. acknowledges funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. 754513, the Aarhus University Research Foundation, the Danish National Research Foundation through the Center of Excellence ‘InterCat’ (grant agreement no. DNRF150) and VILLUM FONDEN (grant no. 37381).
Author information
Contributions
W.X. performed the DFT calculations, workflow and machine learning methods development. K.R. and M.A. conceived and supervised the project. All authors contributed to analyzing the data and writing the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Computational Science thanks Gyoung Na, Hongliang Xin and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Handling editor: Kaitlin McCardle, in collaboration with the Nature Computational Science team. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Sections 1–4, Figs. 1–9 and Tables 1–12.
Source data
Source Data Fig. 2
DFT-calculated versus machine learning-predicted (SISSO, RBF-GPR and WWL-GPR) adsorption enthalpies for the simple adsorbates database.
Source Data Fig. 3
DFT-calculated versus machine learning-predicted (RBF-GPR, WWL-GPR and XGBoost) adsorption enthalpies for the complex adsorbates database.
Source Data Fig. 4
Principal components 1 and 2 from kernel principal component analysis for the complex adsorbates database and the WWL-GPR model (all metals and Rh metal only).
Source Data Fig. 5
DFT-calculated adsorption enthalpies, machine learning-predicted adsorption enthalpies (WWL-GPR) and predicted uncertainties (single GPR model and ensemble model).
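The difference between a single-model uncertainty and an ensemble uncertainty can be illustrated with a short sketch. The ensemble below is built by bootstrap resampling of the training set, which is one common construction and an assumption here; the procedure used in the paper may differ. The function name `ensemble_predict`, the synthetic data and the kernel settings are hypothetical.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel


def ensemble_predict(X_train, y_train, X_test, n_models=10, seed=0):
    """Fit GPR models on bootstrap resamples; return mean and spread of predictions."""
    rng = np.random.default_rng(seed)
    n = len(X_train)
    preds = []
    for _ in range(n_models):
        # Draw n training indices with replacement (bootstrap resample).
        idx = rng.integers(0, n, size=n)
        # The white-noise term keeps the kernel matrix well conditioned
        # even when the resample contains duplicated points.
        kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-2)
        model = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
        model.fit(X_train[idx], y_train[idx])
        preds.append(model.predict(X_test))
    preds = np.asarray(preds)  # shape: (n_models, n_test)
    # Ensemble mean is the prediction; sample standard deviation is the uncertainty.
    return preds.mean(axis=0), preds.std(axis=0, ddof=1)


# Hypothetical stand-in data with 3 features per sample.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(40, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 - 0.3 * X[:, 2]

mean, spread = ensemble_predict(X[:30], y[:30], X[30:], n_models=10, seed=0)
print(mean.shape, spread.shape)
```

In an active learning loop, the test points with the largest `spread` would be natural candidates for the next first-principles calculations.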
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Xu, W., Reuter, K. & Andersen, M. Predicting binding motifs of complex adsorbates using machine learning with a physics-inspired graph representation. Nat Comput Sci 2, 443–450 (2022). https://doi.org/10.1038/s43588-022-00280-7
DOI: https://doi.org/10.1038/s43588-022-00280-7
This article is cited by
- Exploring catalytic reaction networks with machine learning. Nature Catalysis (2023)
- Fast evaluation of the adsorption energy of organic molecules on metals via graph neural networks. Nature Computational Science (2023)
- AdsorbML: a leap in efficiency for adsorption energy calculations using generalizable machine learning potentials. npj Computational Materials (2023)
- Machine-learning driven global optimization of surface adsorbate geometries. npj Computational Materials (2023)