Predicting binding motifs of complex adsorbates using machine learning with a physics-inspired graph representation

A preprint version of the article is available at arXiv.


Computational screening in heterogeneous catalysis relies increasingly on machine learning models for predicting key input parameters due to the high cost of computing these directly using first-principles methods. This becomes especially relevant when considering complex materials spaces such as alloys, or complex reaction mechanisms with adsorbates that may exhibit bi- or higher-dentate adsorption motifs. Here we present a data-efficient approach to the prediction of binding motifs and associated adsorption enthalpies of complex adsorbates at transition metals and their alloys based on a customized Wasserstein Weisfeiler–Lehman graph kernel and Gaussian process regression. The model shows good predictive performance, not only for the elemental transition metals on which it was trained, but also for an alloy based on these transition metals. Furthermore, incorporation of minimal new training data allows for predicting an out-of-domain transition metal. We believe the model may be useful in active learning approaches, for which we present an ensemble uncertainty estimation approach.
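The two generic ingredients named above, Gaussian process regression on a precomputed kernel matrix and an ensemble-based uncertainty estimate, can be illustrated with a minimal sketch. This is not the WWL-GPR implementation. The RBF kernel on toy one-dimensional inputs is only a stand-in for the Wasserstein Weisfeiler–Lehman graph kernel, the ensemble is built by random subsampling (an assumption, not necessarily the authors' scheme), and all function names and parameters (`rbf_kernel`, `gpr_fit`, `gpr_predict`, `ensemble_predict`, `noise`, `frac`) are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Stand-in similarity kernel (hypothetical; not the WWL graph kernel)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)


def gpr_fit(k_train, y, noise=1e-2):
    """Fit GPR weights from a precomputed (n_train, n_train) kernel matrix."""
    y_mean = y.mean()
    # Cholesky factor of the regularized kernel matrix K + noise * I.
    chol = np.linalg.cholesky(k_train + noise * np.eye(len(y)))
    # Two solves with the factor give alpha = (K + noise * I)^-1 (y - mean).
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y - y_mean))
    return alpha, y_mean


def gpr_predict(k_cross, alpha, y_mean):
    """Posterior mean; k_cross has shape (n_test, n_train)."""
    return k_cross @ alpha + y_mean


def ensemble_predict(x_train, y_train, x_test, n_models=20, frac=0.8, seed=0):
    """Mean and spread of GPR models fitted to random subsamples (assumed scheme)."""
    rng = np.random.default_rng(seed)
    n_sub = int(frac * len(y_train))
    preds = []
    for _ in range(n_models):
        # Each ensemble member sees a different subset of the training data.
        idx = rng.choice(len(y_train), size=n_sub, replace=False)
        k_train = rbf_kernel(x_train[idx], x_train[idx])
        alpha, y_mean = gpr_fit(k_train, y_train[idx])
        preds.append(gpr_predict(rbf_kernel(x_test, x_train[idx]), alpha, y_mean))
    preds = np.array(preds)
    return preds.mean(axis=0), preds.std(axis=0)


# Toy data: a smooth 1D function stands in for adsorption enthalpies.
x_train = np.linspace(0.0, 5.0, 30).reshape(-1, 1)
y_train = np.sin(x_train).ravel()
x_test = np.array([[2.5], [20.0]])  # one in-domain point, one far outside
mean, spread = ensemble_predict(x_train, y_train, x_test)
print("ensemble mean:", mean)
print("ensemble spread:", spread)
```

Because `gpr_fit` and `gpr_predict` only ever see kernel matrices, any positive semi-definite similarity measure between structures, including a graph kernel, could in principle be substituted for `rbf_kernel` without changing the regression code. The spread across ensemble members serves here as a heuristic uncertainty of the kind an active learning loop could use to select new calculations.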

Fig. 1: Schematic illustration of the WWL-GPR model.
Fig. 2: Parity plot of DFT-calculated versus machine learning-predicted adsorption enthalpies using SISSO, RBF-GPR and WWL-GPR.
Fig. 3: Parity plot of DFT-calculated versus machine learning-predicted adsorption enthalpies using RBF-GPR, WWL-GPR and XGBoost.
Fig. 4: Kernel principal component analysis.
Fig. 5: Estimated uncertainties versus absolute prediction errors for the single and ensemble models.

Data availability

The DFT-calculated adsorption energies and relaxed coordinates of the simple and complex adsorbates databases, as well as all calculated features, are available online and on Zenodo (ref. 48). Source Data are provided with this paper.

Code availability

The source code of WWL-GPR is publicly available on GitHub and Zenodo (ref. 48). We provide predefined tasks for tutorial purposes and for reproducing the results presented in this work. The RBF-GPR is implemented with Scikit-learn (ref. 55). Scikit-learn, the SISSO code (ref. 18) and the XGBoost code (ref. 20) are openly available.


  1. Cao, A. et al. Mechanistic insights into the synthesis of higher alcohols from syngas on CuCo alloys. ACS Catal. 8, 10148–10155 (2018).

  2. Chang, C. & Medford, A. J. Application of density functional tight binding and machine learning to evaluate the stability of biomass intermediates on the Rh(111) surface. J. Phys. Chem. C 125, 18210–18216 (2021).

  3. Wang, Z., Li, Y., Boes, J., Wang, Y. & Sargent, E. CO2 Electrocatalyst design using graph theory. Preprint at (2020).

  4. Nørskov, J. K., Abild-Pedersen, F., Studt, F. & Bligaard, T. Density functional theory in surface chemistry and catalysis. Proc. Natl Acad. Sci. USA 108, 937–943 (2011).

  5. Choi, Y. & Liu, P. Mechanism of ethanol synthesis from syngas on Rh(111). J. Am. Chem. Soc. 131, 13054–13061 (2009).

  6. Michel, C., Auneau, F., Delbecq, F. & Sautet, P. C–H Versus O–H bond dissociation for alcohols on a Rh(111) surface: a strong assistance from hydrogen bonded neighbors. ACS Catal. 1, 1430–1440 (2011).

  7. Filot, I. A. W. et al. First-principles-based microkinetics simulations of synthesis gas conversion on a stepped rhodium surface. ACS Catal. 5, 5453–5467 (2015).

  8. Gu, T., Wang, B., Chen, S. & Yang, B. Automated generation and analysis of the complex catalytic reaction network of ethanol synthesis from syngas on Rh(111). ACS Catal. 10, 6346–6355 (2020).

  9. Tran, K. & Ulissi, Z. W. Active learning across intermetallics to guide discovery of electrocatalysts for CO2 reduction and H2 evolution. Nat. Catal. 1, 696–703 (2018).

  10. Noh, J., Back, S., Kim, J. & Jung, Y. Active learning with non-ab initio input features toward efficient CO2 reduction catalysts. Chem. Sci. 9, 5152–5159 (2018).

  11. Andersen, M., Levchenko, S. V., Scheffler, M. & Reuter, K. Beyond scaling relations for the description of catalytic materials. ACS Catal. 9, 2752–2759 (2019).

  12. Wang, S.-H., Pillai, H. S., Wang, S., Achenie, L. E. & Xin, H. Infusing theory into deep learning for interpretable reactivity prediction. Nat. Commun. 12, 1–9 (2021).

  13. Fung, V., Hu, G., Ganesh, P. & Sumpter, B. G. Machine learned features from density of states for accurate adsorption energy prediction. Nat. Commun. 12, 88 (2021).

  14. Back, S. et al. Convolutional neural network of atomic surface structures to predict binding energies for high-throughput screening of catalysts. J. Phys. Chem. Lett. 10, 4401–4408 (2019).

  15. Gu, G. H. et al. Practical deep-learning representation for fast heterogeneous catalyst screening. J. Phys. Chem. Lett. 11, 3185–3191 (2020).

  16. Chanussot, L. et al. Open Catalyst 2020 (OC20) dataset and community challenges. ACS Catal. 11, 6059–6072 (2021).

  17. Togninalli, M., Ghisu, E., Llinares-López, F., Rieck, B. & Borgwardt, K. Wasserstein Weisfeiler–Lehman graph kernels. In Adv Neural Inf Process Syst. Vol. 32 (NeurIPS, 2019).

  18. Ouyang, R., Curtarolo, S., Ahmetcik, E., Scheffler, M. & Ghiringhelli, L. M. SISSO: a compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates. Phys. Rev. Mater. 2, 083802 (2018).

  19. Ouyang, R., Ahmetcik, E., Carbogno, C., Scheffler, M. & Ghiringhelli, L. M. Simultaneous learning of several materials properties from incomplete databases with multi-task SISSO. J. Phys. Mater. 2, 024002 (2019).

  20. Chen, T. & Guestrin, C. XGBoost: a scalable tree boosting system. In Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (ACM, 2016).

  21. Medford, A. J. et al. Activity and selectivity trends in synthesis gas conversion to higher alcohols. Top. Catal. 57, 135–142 (2014).

  22. Schumann, J. et al. Selectivity of synthesis gas conversion to C2+ oxygenates on fcc(111) transition-metal surfaces. ACS Catal. 8, 3447–3453 (2018).

  23. Deimel, M., Reuter, K. & Andersen, M. Active site representation in first-principles microkinetic models: data-enhanced computational screening for improved methanation catalysts. ACS Catal. 10, 13729–13736 (2020).

  24. Deringer, V. L., Caro, M. A. & Csányi, G. Machine learning interatomic potentials as emerging tools for materials science. Adv. Mater. 31, 1902765 (2019).

  25. Gasteiger, J., Becker, F. & Günnemann, S. GemNet: universal directional graph neural networks for molecules. In Conference on Neural Information Processing Systems Vol. 34 (NeurIPS, 2021).

  26. Wen, M., Blau, S. M., Spotte-Smith, E. W. C., Dwaraknath, S. & Persson, K. A. BonDNet: a graph neural network for the prediction of bond dissociation energies for charged molecules. Chem. Sci. 12, 1858–1868 (2021).

  27. Tang, Y.-H. & de Jong, W. A. Prediction of atomization energy using graph kernel and active learning. J. Chem. Phys. 150, 044107 (2019).

  28. Xie, T. & Grossman, J. C. Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Phys. Rev. Lett. 120, 145301 (2018).

  29. Montoya, J. H. & Persson, K. A. A high-throughput framework for determining adsorption energies on solid surfaces. npj Comput. Mater. 3, 14 (2017).

  30. Boes, J. R., Mamun, O., Winther, K. & Bligaard, T. Graph theory approach to high-throughput surface adsorption structure generation. J. Phys. Chem. A 123, 2281–2285 (2019).

  31. Deshpande, S., Maxson, T. & Greeley, J. Graph theory approach to determine configurations of multidentate and high coverage adsorbates for heterogeneous catalysis. npj Comput. Mater. 6, 79 (2020).

  32. Xu, W., Andersen, M. & Reuter, K. Data-driven descriptor engineering and refined scaling relations for predicting transition metal oxide reactivity. ACS Catal. 11, 734–742 (2020).

  33. Rupp, M. Machine learning for quantum mechanics in a nutshell. Int. J. Quantum Chem. 115, 1058–1073 (2015).

  34. Deringer, V. L. et al. Gaussian process regression for materials and molecules. Chem. Rev. 121, 10073–10141 (2021).

  35. Bruix, A., Margraf, J. T., Andersen, M. & Reuter, K. First-principles-based multiscale modelling of heterogeneous catalysis. Nat. Catal. 2, 659–670 (2019).

  36. Meskine, H., Matera, S., Scheffler, M., Reuter, K. & Metiu, H. Examination of the concept of degree of rate control by first-principles kinetic Monte Carlo simulations. Surf. Sci. 603, 1724–1730 (2009).

  37. Medford, A. J. et al. Assessing the reliability of calculated catalytic ammonia synthesis rates. Science 345, 197–200 (2014).

  38. Sutton, J. E., Guo, W., Katsoulakis, M. A. & Vlachos, D. G. Effects of correlated parameters and uncertainty in electronic-structure-based chemical kinetic modelling. Nat. Chem. 8, 331 (2016).

  39. Döpking, S. & Matera, S. Error propagation in first-principles kinetic Monte Carlo simulation. Chem. Phys. Lett. 674, 28–32 (2017).

  40. Flores, R. A. et al. Active learning accelerated discovery of stable iridium oxide polymorphs for the oxygen evolution reaction. Chem. Mater. 32, 5854–5863 (2020).

  41. Kunkel, C., Margraf, J. T., Chen, K., Oberhofer, H. & Reuter, K. Active discovery of organic semiconductors. Nat. Commun. 12, 1–11 (2021).

  42. Tran, K. et al. Methods for comparing uncertainty quantifications for material property predictions. Mach. Learn. Sci. Technol. 1, 025006 (2020).

  43. Palmer, G. et al. Calibration after bootstrap for accurate uncertainty quantification in regression models. npj Comput. Mater. 8, 1–9 (2022).

  44. Kuleshov, V., Fenner, N. & Ermon, S. Accurate uncertainties for deep learning using calibrated regression. In International Conference on Machine Learning 2796–2804 (MLR Press, 2018).

  45. Giannozzi, P. et al. QUANTUM ESPRESSO: a modular and open-source software project for quantum simulations of materials. J. Phys. Condens. Matter 21, 395502 (2009).

  46. Wellendorff, J. et al. Density functionals for surface science: exchange-correlation model development with Bayesian error estimation. Phys. Rev. B 85, 235149 (2012).

  47. Huber, S. P. et al. AiiDA 1.0, a scalable computational infrastructure for automated reproducible workflows and data provenance. Sci. Data 7, 300 (2020).

  48. Xu, W., Reuter, K. & Andersen, M. Predicting Binding Motifs of Complex Adsorbates Using Machine Learning with a Physics-Inspired Graph Representation (Zenodo, 2022).

  49. Larsen, A. H. et al. The atomic simulation environment—a Python library for working with atoms. J. Phys. Condens. Matter 29, 273002 (2017).

  50. Dal Corso, A. Pseudopotentials periodic table: from H to Pu. Comput. Mater. Sci. 95, 337–350 (2014).

  51. Garrity, K. F., Bennett, J. W., Rabe, K. M. & Vanderbilt, D. Pseudopotentials for high-throughput DFT calculations. Comput. Mater. Sci. 81, 446–452 (2014).

  52. Bartók, A. P., Kondor, R. & Csányi, G. On representing chemical environments. Phys. Rev. B 87, 184115 (2013).

  53. Esterhuizen, J. A., Goldsmith, B. R. & Linic, S. Theory-guided machine learning finds geometric structure–property relationships for chemisorption on subsurface alloys. Chem 6, 3100–3117 (2020).

  54. Andersen, M. & Reuter, K. Adsorption enthalpies for catalysis modeling through machine-learned descriptors. Acc. Chem. Res. 54, 2741–2749 (2021).

  55. Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).


Acknowledgements

The authors gratefully acknowledge support from the Max Planck Computing and Data Facility (MPCDF) and the Jülich Supercomputing Centre. W.X. is grateful for support through the China Scholarship Council (CSC). M.A. acknowledges funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. 754513, the Aarhus University Research Foundation, the Danish National Research Foundation through the Center of Excellence ‘InterCat’ (grant agreement no. DNRF150) and VILLUM FONDEN (grant no. 37381).

Author information

W.X. performed the DFT calculations, workflow and machine learning methods development. K.R. and M.A. conceived and supervised the project. All authors contributed to analyzing the data and writing the manuscript.

Corresponding author

Correspondence to Mie Andersen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Computational Science thanks Gyoung Na, Hongliang Xin and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Handling editor: Kaitlin McCardle, in collaboration with the Nature Computational Science team. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Sections 1–4, Figs. 1–9 and Tables 1–12.

Peer Review file

Source data

Source Data Fig. 2

DFT-calculated versus machine learning-predicted (SISSO, RBF-GPR and WWL-GPR) adsorption enthalpies for the simple adsorbates database.

Source Data Fig. 3

DFT-calculated versus machine learning-predicted (RBF-GPR, WWL-GPR and XGBoost) adsorption enthalpies for the complex adsorbates database.

Source Data Fig. 4

Principal components 1 and 2 from kernel principal component analysis for the complex adsorbates database and the WWL-GPR model (all metals and Rh metal only).

Source Data Fig. 5

DFT-calculated adsorption enthalpies, machine learning-predicted adsorption enthalpies (WWL-GPR) and predicted uncertainties (single GPR model and ensemble model).

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xu, W., Reuter, K. & Andersen, M. Predicting binding motifs of complex adsorbates using machine learning with a physics-inspired graph representation. Nat Comput Sci 2, 443–450 (2022).
