## Introduction

Adsorption of molecules or their fragments at transition-metal surfaces is a fundamental process for many technological applications, such as chemical sensing, molecular self-assembly, and heterogeneous catalysis. Because of the convoluted interplay between electron transfer and orbital coupling, chemical bonding can be formidably complex. Recent decades have brought major advances in spectroscopic tools1,2, which reveal orbitalwise information of chemisorbed systems and concurrently in predicting chemical reactivity at sites of interest via electronic factors, e.g., the number of valence d-electrons3, density of d-states at the Fermi level4, d-band center5, and d-band upper edge6,7. Compared with a full quantum-mechanics treatment of many-body systems, the simplicity of physics-inspired descriptors comes at a cost of limited generalization, particularly for high-throughput materials screening. Incorporation of multifidelity site features into reactivity models with machine learning (ML) algorithms has shown early promise for the prediction of adsorption energies, with an accuracy comparable to the typical error (~0.1−0.2 eV) of density functional theory (DFT) calculations8,9,10,11,12,13,14,15,16. However, the approach is largely black-box in nature, prohibiting its physical interpretation. Developing a theory-based, generalizable model of chemisorption that bridges the complexity of electronic descriptors, and predicts the binding affinity of active sites to key reaction intermediates with uncertainty quantification represents one of the biggest challenges in fundamental catalysis.

Here, we present a Bayesian inference approach to probing chemisorption processes at metal sites by learning from ab initio datasets. The model is built upon the basic framework of the d-band reactivity theory5, while employing a Newns–Anderson-type Hamiltonian17,18 to capture essential physics of adsorbate-substrate interactions. Such types of simplified Hamiltonians were originally used for describing magnetic properties of impurities in a bulk metallic host17, and later extended with success by Newns and Grimley to chemisorption at surfaces18,19. A basis set of orbitals consisting of the adsorbate and substrate states was used for solving the hybridization problem within a self-consistent Hartree–Fock scheme18. Despite a remarkable success in advancing the basic understanding of adsorption phenomena at surfaces, particularly for d-block metals6, its application in materials design remains limited due to the lack of accurate model parameters and meaningful error estimates. Bayesian inference produces the posterior probability distribution of model parameters under the influence of observations and prior knowledge20. With representative species, e.g., *O and *OH, we demonstrate the predictive performance and physical interpretability of Bayesian models for chemical bonding at a diverse range of intermetallics and near-surface alloys, bridging the complexity of electronic descriptors in search of novel catalytic materials.

## Results

### The d-band reactivity theory

Within the basic framework of the d-band reactivity theory for transition-metal surfaces, the formation of the adsorbate-metal bond conceptually takes place in two consecutive steps5, as illustrated in Fig. 1. First, the adsorbate frontier orbital (or orbitals) $$\left|a\right\rangle$$ at $${\epsilon }_{{\mathrm{a}}}^{0}$$ couples to the delocalized, free-electron-like sp-states of the metal substrate, leading to a Lorenzian-shaped resonance state at ϵa. Second, the adsorbate resonance state interacts with the localized, narrowly-distributed metal d-states, shifting up in energies due to the orthogonalization penalty for satisfying the Pauli principle, and then splitting into bonding and antibonding states. The first step interaction contributes a constant ΔE0 albeit often the largest part of chemical bonding. The variation in adsorption energies from one metal to another is determined by the metal d-states. This part of the interaction energy ΔEd can be further partitioned into orbital orthogonalization and orbital hybridization contributions21. To a first approximation, the orbital hybridization energy can be evaluated by the changes of integrated one-electron energies. The orbital orthogonalization cost is considered simply as proportional to the product of interatomic coupling matrix and overlap matrix, VS, or equivalently αV2, where α is the orbital overlap coefficient. The absolute value of V2 can be written as $$\beta {V}_{{\mathrm{ad}}}^{2}$$, in which the standard values of $${V}_{{\mathrm{ad}}}^{2}$$ relative to Cu are readily available on the Solid State Table22. The overall adsorption energy ΔE can then be written as the sum of the energy contributions from the sp-states ΔE0 and the d-states ΔEd, with the latter depending on the symmetry and degeneracy of adsorbate frontier orbitals. Another important information from this framework is the evolving density of states projected onto the adsorbate orbital(s) upon adsorption, ρa. A full account of the theoretical framework is presented in the “Methods” section.

There are a number of unknown parameters within the basic framework of the d-band reactivity theory as discussed above and detailed in “Methods” section, including the energy contribution from the sp-band ΔE0, adsorbate resonance energy ϵa relative to the Fermi level, sp-band chemisorption function Δ0, orbital overlap coefficient α, and orbital coupling coefficient β. By least-squares fitting of the adsorbate density of states and the integrated one-electron energy changes to those from DFT calculations23,24, the Schmickler model of electron transfer has been developed to understand H2 evolution/oxidation and OH adsorption at metal–electrolyte interfaces. However, the deterministic fitting of adsorption properties from a single surface is prone to overfitting or trapping into a locally optimal region, limiting its application in catalysis.

### Bayesian learning

We instead employ Bayesian learning to infer the vector of model parameters $$\overrightarrow{\theta }={(\Delta {E}_{0},{\epsilon }_{{\mathrm{a}}},{\Delta }_{0},\alpha ,\beta )}^{\prime}$$ from the evidence, i.e., ab initio adsorption properties, along with prior knowledge if available20. In Bayes’ view, those parameters are not deterministic point values, but rather probabilistic distributions reflecting the uncertainty of physical variables. The use of parameter distributions as opposed to computationally-derived point values has obvious advantages for uncertainty quantification. In the chemical sciences, Bayesian learning has been used for calibration and validation of thermodynamic models for the uptake of CO2 in mesoporous silica-supported amines25, designing the Bayesian error estimation functional with van der Waals correlations26, and identifying potentially active sites and mechanisms of catalytic reactions27, just to name a few. The Bayesian approach allows one to infer the posterior probability distribution $$P(\overrightarrow{\theta }| {\mathcal{D}})$$ for latent variables based on the prior $$P(\overrightarrow{\theta })$$ as well as the likelihood function $$P({\mathcal{D}}| \overrightarrow{\theta })$$ subject to the observation $${\mathcal{D}}$$. The mathematical relationship between the prior, observation, and posterior is given by the Bayes’ theorem20, $$P(\overrightarrow{\theta }| {\mathcal{D}})=P({\mathcal{D}}| \overrightarrow{\theta })P(\overrightarrow{\theta })/P({\mathcal{D}})$$. Our initial belief about likely parameter values is provided by weakly informative priors to minimize potential bias. For example, ΔE0 and ϵa can be estimated from DFT calculations of the adsorbate on a simple metal, e.g., sodium (Na) at the face-centered cubic (fcc) phase. Specifically, we took Normal for floating-point variables unrestricted in sign, LogNormal for non-negative parameters, and Uniform for others (see the details of Bayesian learning and parameter choices in the “Methods” section). Computing the normalizing constant $$P({\mathcal{D}})$$, denominator of the posterior distribution, is impossible in most practical scenarios. To avoid this complication, the Markov chain Monte Carlo (MCMC) method28, whose sampling criterion only depends on the relative posterior density of the newly explored point and its preceding point, is used. To compute the transition probability of each MCMC step, we define the sum of the (negative) logarithm of the likelihood functions corresponding to binding energies and projected density of states onto each adsorbate orbital with a hyperparameter λ adjusting the weight of two contributing metrics, see details in the “Methods” section. After a large number of MCMC samplings, burning (discard) of the first half of the trajectory and then thinning (1 out of 5 samplings) were performed before extracting converged values from the joint posterior distributions. The convergence of the MCMC sampling is checked by using parallel chains with different starting parameter sets such that the variance of interchain samplings is close or within 1.2–1.5 times to that of intrachains28. The complete code, named Bayeschem, is now available at a Github repository https://github.com/hlxin/bayeschem for public access.

### Model development

In Fig. 2a, we are showing the co-variance of the joint posterior distribution for each parameter pair and the 1D histogram of model parameters (ΔE0, ϵa, Δ0, α, and β) from MCMC simulations for *O adsorption at the fcc-hollow site of the {111}-terminated transition-metal surfaces (Cu, Ag, Au, Ni, Pd, Pt, Co, Rh, Ir, and Ru). We assume three degenerate O2p orbitals as used before29 for demonstration of the approach, while later extend it to multiorbital models. To attain converged posterior distributions, 200k MCMC sampling steps with the Metropolis–Hastings algorithm were performed in a multidimensional parameter space illustrated in Fig. 2b. In Fig. 2, the approximate contours for 68, 95, and 99% confidence regions are shown at the lower triangle, showing little to no correlation between latent-variable pairs.

With the converged Bayesian sampling, in Fig. 3a, it shows the model-predicted adsorption energies of *O at the fcc-hollow site of transition-metal surfaces, with a mean absolute error (MAE)  ~0.17 eV compared to DFT calculations. The standard deviation of model prediction using the posterior distribution of model parameters ($$\overrightarrow{\theta },\,\overrightarrow{\sigma }$$) is overlaid, providing for the first time uncertainty quantification of adsorption energies within the d-band reactivity theory. Figure 3b shows DFT-calculated and model-constructed projected density of states onto the O2p orbital using the posterior means of model parameters, taking Pt(111) as an example (see all the surfaces in Supplementary Fig. 1). The chemisorption function Δ(ϵ) and its Hilbert transform Λ(ϵ) along with the straight adsorbate line (ϵ − ϵa) are shown for the graphical solution of the Newns–Anderson model18. The intersects indicated by solid circles in Fig. 3b represent the O2p–Pt5d bonding and antibonding states, with the latter above the Fermi level, suggesting a strong covalent interaction of *O at Pt(111). Given the simplicity of the model, the clearly captured electronic structure of the adsorbate–substrate system and the reactivity trend are satisfying.

To extend the approach for adsorbates with multiple valence orbitals that possibly contribute to bonding, we have explicitly treated O2p states with the doubly degenerate pxy orbitals and the single pz orbital in Bayesian learning. We infer model parameters (ϵa, Δ0, and β) corresponding to each non-equivalent adsorbate orbital together with an orbital-independent α29 and a global parameter ΔE0. The posterior parameter distributions are shown in Supplementary Fig. 2. From the posterior means of model parameters, we can see that the orbital coupling coefficient β of pxy (1.67 eV−1) is smaller than that of pz (1.77 eV−1), consistent with the symmetry analysis, that the pxy orbitals that are parallel to a surface form π bonds with the d-states, while the pz orbital can interact through a stronger σ bond. A weaker coupling manifests itself in a narrower orbital splitting of π/π* than that of σ/σ*, which has been previously observed using the angle-resolved photoemission spectroscopy on Cu and Ni30. In Supplementary Figs. 3 and 4, it shows that the model-constructed projected density of states onto symmetry-resolved orbitals closely resemble the DFT-calculated distributions and the predicted values of *O adsorption energies have a MAE  ~0.17 eV. To demonstrate the robustness and generalizability of the approach, we have also optimized the Bayeschem model of *O at the atop configuration, see Supplementary Figs. 57. In this model scheme, an individual set of parameters is obtained for the adsorbate at a given site. Compared to the linear adsorption-energy scaling relations31 that link adsorption energies of different adsorbates, Bayeschem creates the connection between the electronic structure of a surface site and the adsorption energy.

To test the prediction capability of the Bayeschem model for unseen systems, we took the *OH species at the atop adsorption configuration as a case study because of its fundamental importance in understanding the nature of chemical bonding32, and practical interests as a key reactivity descriptor in transition metal catalysis33,34,35. Three frontier molecular orbitals, i.e., 3σ, 1π, and 4σ*, are assumed to be involved in chemical bonding32. Symmetry-resolved, molecular orbital density of states projected onto OH along with adsorption energies are used as the DFT ground truth Y in Eq. (6). With the Bayeschem model developed here (see Supplementary Figs. 810 for posterior parameter distributions, model-predicted adsorption energies and projected density of states on training samples), we predict *OH binding energies at a diverse range of intermetallics and near-surface alloys. Specifically, we included A3B, A′@AML, A-B@AML, A3B@AML, A@A3B, and A@AB3, where A (A′) represents ten fcc/hcp metals used in the model development and B covers d-metals across the periodic table (see ref. 36 for structural details and tabulated data). The coupling matrix element Vad for alloys is assumed to be constant from the Solid State Table22. Its dependence on the local chemical environment can be incorporated into the model using the tight-binding approximation33. The A sites of above-mentioned surfaces exhibit diverse characteristics of the metal d-states ranging from bulk-like semi-elliptic bands to free-atom-like discrete energy levels37, as illustrated in Fig. 4a using Pt and Ag3Pt as examples. Similar to previous observations of single-atom alloys with coinage metal hosts37,38, a reactive guest metal often exhibits peaky signatures within the d-band due to the energy misalignment of coupling dd orbitals7. A direct consequence of such diverse electronic properties of adsorption sites is that no single electronic descriptor can capture the local chemical reactivity accurately. Encouragingly, the Bayeschem model, parameterized using ten pristine transition-metal data, predicts *OH adsorption energies on 512 alloy surfaces with a MAE 0.16 eV, see Fig. 4b. The standard deviation of predicted *OH adsorption energies from the posterior distribution of model parameters is marked for uncertainty quantification. It shows a similar performance to data-driven ML models8,9,10,11 while outperforming the state-of-the-art electronic descriptors, e.g., the d-band center ϵd (MAE: 0.20 eV) and upper edge ϵu (MAE: 0.23 eV). The approach can be easily extended to more complex adsorbates than *O and *OH, e.g., *OOH, without losing its generalizability in the development workflow.

### Orbitalwise interpretation of chemical bonding

More importantly, the Bayesian framework with built-in physics allows us to quantitatively interrogate the underlying mechanism of chemical bonding, that is difficult to obtain from purely data-driven regression models. Taking *OH adsorption at the M (10 fcc/hcp metals) site of {111}-terminated Ag3M intermetallics as examples, Fig. 4c shows the partition of *OH adsorption energies resulting from the 2nd step interaction (ΔEd) into orbital orthogonalization and hybridization. As we can see, for 3d, 4d, and 5d series of the guest metal M, the orthogonalization and hybridization contributions decrease in magnitude from left to right across the periodic table, while the hybridization dominates the reactivity trends. The changes in $$\Delta {E}_{{\mathrm{d}}}^{{\mathrm{hyb}}}$$ can be understood from the simplified d-band model, with the position and occupancy of adsorbate–substrate antibonding states tracking with the d-band center or upper edge. The orthogonalization energy is proportional to the filling f and $${V}_{{\mathrm{ad}}}^{2}$$ (see Eq. (4)), which are offsetting each other to a certain extent ($${V}_{{\mathrm{ad}}}^{2}$$ decreases while f increases across 3d, 4d, and 5d series), leading to a less dominant role than the hybridization. The orbitalwise contributions shown in Fig. 4c with different fill patterns suggest that the sole contribution of *OH adsorption at d-metal surfaces is from the 1π orbital, while those from 3σ and 4σ* are too small to be visible. This is supported by projected molecular orbital density of states in Supplementary Fig. 7, which shows that 3σ and 4σ* are forming resonance states after their interactions with the sp-states of the metal site without noticeable splitting due to d-states. Thus, they do not contribute to the observed trend of *OH adsorption. The Bayesian-optimized orbital coupling coefficients of 3σ and 4σ* are rather small (0.12 and 0.001 as shown in Supplementary Fig. 5, respectively), supporting unfavorable orbital overlaps with the d-states. This rationalizes the observation that *OH prefers the nearly-parallel adsorption geometry on most of the d-metals to maximize the interaction of the 1π orbital with metal d-states, while *OH on Na(111) adsorbs more strongly in a up-straight orientation because of a lack of such directional interactions. This orbitalwise insight of chemical bonding could provide guidance in tailoring orbital-specific characteristics of the metal d-band for desired catalytic properties through site engineering. Despite an exclusive discussion about the d-metals, it is possible to extend the Bayeschem framework to p-block metals and alloys see Supplementary Fig. 11, unifying the reactivity theory of metal surfaces.

To conclude, we present the first Bayesian model of chemisorption by learning from ab initio adsorption properties. The model leverages the well-established d-band reactivity theory and a Newns–Anderson-type Hamiltonian for capturing essential physics of chemisorption processes. We demonstrated that the Bayeschem models of descriptor species, e.g., *O and *OH, optimized with pristine transition-metal data predicts adsorption energies at a diverse range of atomically-tailored metal sites with a MAE  ~0.1–0.2 eV while providing uncertainty quantification. Incorporation of physics-based models into data-driven ML algorithms, e.g., deep learning, might hold the promise toward developing highly accurate while interpretable reactivity models. Furthermore, this conceptual framework can be broadly applied to unravel orbital-specific factors governing adsorbate–substrate interactions, paving the path toward design strategies to go beyond adsorption-energy scaling limitations in catalysis.

## Methods

### DFT calculations

Spin-polarized DFT calculations were performed through Quantum ESPRESSO39 with ultrasoft pseudopotentials. The exchange-correlation was approximated within the generalized gradient approximation (GGA) with Perdew–Burke–Ernzerhof (PBE)40. {111}-terminated metal surfaces were modeled using (2  ×  2) supercells with four layers and a vacuum of 15 Å between two images. The bottom two layers were fixed while the top two layers and adsorbates were allowed to relax until a force criteria of .1 eV/Å. A plane wave energy cutoff of 500 eV was used. A Monkhorst-Pack mesh of 6 × 6 × 1 was used to sample the Brillouin zone, while for molecules and radicals only the Gamma point was used. Gas phase species of O and OH were used as the reference for adsorption energies of *O and *OH, respectively. The projected atomic and molecular density of states were obtained by projecting the eigenvectors of the full system at a denser k-point sampling (12 × 12 × 1) with a energy spacing 0.01 eV onto the ones of the part, as determined by gas-phase calculations. The convergence of DFT calculations was thoroughly tested to be within 0.05 eV. Further details and tabulated data can be found in the ref. 9.

### The d-band reactivity theory

To revisit the d-band theory of chemisorption along with new developments, let’s consider a metal substrate M in which electrons occupy a set of continuous states with one-electron wavefunctions $$\left|k\right\rangle$$ and eigenenergies ϵk, and an isolated adsorbate species A with a valence electron described by an atomic wavefunction $$\left|a\right\rangle$$ at $${\epsilon }_{{\mathrm{a}}}^{0}$$, see Fig. 1. When the adsorbate is brought close to the substrate, the two sets of states will overlap and hybridize with each other. The strength of such interactions is determined by the coupling integral $${V}_{{\mathrm{ak}}}\,=\,\langle a| \hat{{\mathcal{H}}}| k\rangle$$, where $$\hat{{\mathcal{H}}}$$ is the system Hamiltonian. Within the Newns–Anderson model of chemisorption17,18,19, $$\hat{{\mathcal{H}}}$$ is defined as,

$$\hat{{\mathcal{H}}}=\mathop{\sum }\limits_{\sigma }\left\{{\epsilon }_{{\mathrm{a}}\sigma }{n}_{{\mathrm{a}}\sigma }+\mathop{\sum }\limits_{{\mathrm{k}}}{\epsilon }_{{\mathrm{k}}}{n}_{{\mathrm{k}}\sigma }+\mathop{\sum }\limits_{{\mathrm{k}}}({V}_{{\mathrm{ak}}}{c}_{{\mathrm{k}}\sigma }^{\dagger }{c}_{{\mathrm{a}}\sigma }+H.c.)\right\},$$
(1)

where σ denotes the electron spin, n is the orbital occupancy operator, and c and c represent the creation and annihilation operator, respectively. The first two terms in Eq. (1) are the one-electron energies from the adsorbate and the substrate when they are infinitely separated in space. The last term captures the coupling, or intuitively electron hopping, between the adsorbate orbital $$\left|a\right\rangle$$ and a continuum of substrate states $$\left|k\right\rangle$$. If the one-electron states of the whole system can be described as a linear combination of the unperturbed adsorbate and substrate states, the one-electron Schrödinger equation can be solved using the Green’s function approach18. In Fig. 1, we illustrate the chemisorption process of a simple adsorbate onto a d-block metal site characterized by delocalized sp-states and localized d-states21. The interaction of the adsorbate state at $${\epsilon }_{{\mathrm{a}}}^{0}$$ with the structureless sp-states, typically accompanied with electron transfer from/to the Fermi sea, results in a broadened resonance (or so-called renormalized adsorbate state) at an effective energy level ϵa. Conceptually viewing chemical bonding as consecutive steps in Fig. 1, the renormalized adsorbate state then couples with the narrowly distributed d-states, shifting up in energies due to orbital orthogonalization that increases the kinetic energy of electrons and splitting into bonding and antibonding states. One important information from this framework is the evolving density of states projected onto the adsorbate orbital $$\left|a\right\rangle$$ upon adsorption

$${\rho }_{{\mathrm{a}}}(\epsilon )\,=\,\frac{1}{\pi }\frac{\Delta (\epsilon )}{{\left[\epsilon -({\epsilon }_{{\mathrm{a}}}+\Lambda (\epsilon ))\right]}^{2}\,+\,\Delta {(\epsilon )}^{2}},$$
(2)

in which spin is neglected for simplicity. The effective adsorbate energy level, ϵa, is determined by the image potential of a charged particle in front of conducting surfaces and the Coulomb repulsion between electrons in the same orbital18. The chemisorption function Δ(ϵ) includes contributions from the sp-states and the d-states

$$\Delta (\epsilon )\,=\,\pi \mathop{\sum }\limits_{{\mathrm{k}}}{V}_{{\mathrm{ak}}}^{2}\delta (\epsilon -{\epsilon }_{{\mathrm{k}}})\,=\,{\Delta }_{0}\,+\,{\Delta }_{{\mathrm{d}}}.$$
(3)

To simplify the matter, only the 2nd step interaction, i.e., the coupling of the renormalized adsorbate state with the substrate d-states, is explicitly considered in Eq. (2). As a new development in our approach, we include an energy-independent constant Δ0 along with Δd as the chemisorption function Δ(ϵ). The inclusion of Δ0 provides a lifetime broadening of the adsorbate state, serving as a mathematical trick to avoid burdensome sampling of the resonance, i.e., the Lorentzian distribution $${\tilde{\rho }}_{{\mathrm{a}}}$$ from the 1st step interaction in Fig. 1. Accordingly, ϵa represents the renormalized adsorbate state. Attributed to the narrowness of a typical metal d-band, Δd can be simplified as the projected density of d-states onto the metal site ρd(ϵ) modulated by an effective coupling integral squared V2, i.e., Δd πV2ρd(ϵ). Λ(ϵ) is the Hilbert transform of Δ(ϵ). In this framework, the interaction energy between the adsorbate and the substrate can be partitioned into two contributions, i.e., ΔE0 and ΔEd. ΔE0 is the energy change due to the interaction of the unperturbed adsorbate orbital(s) with the delocalized sp-states, while ΔEd is the energy contribution from further interactions with the localized d-states of the substrate. Since all d-block metals have a similar, free-electron-like sp-band, ΔE0 can be approximated as a surface-independent constant albeit the largest contribution to bonding21. To calculate ΔEd, we include both the attractive orbital hybridization $$\Delta {E}_{{\mathrm{d}}}^{{\mathrm{hyb}}}$$ and repulsive orbital orthogonalization $$\Delta {E}_{{\mathrm{d}}}^{{\mathrm{orth}}}$$29,41:

$$\begin{array}{l}\Delta {E}_{{\mathrm{d}}}^{{\mathrm{hyb}}}\,=\,\frac{2}{\pi }\mathop{\int}\nolimits_{-\infty }^{{\epsilon }_{{\rm{F}}}}{\tan }^{-1}\left[\frac{\Delta (\epsilon )}{\epsilon -{{\epsilon }}_{{\mathrm{a}}}-\Lambda (\epsilon )}\right]d\epsilon -\frac{2}{\pi }\mathop{\int}\nolimits_{-\infty }^{{\epsilon }_{{\rm{F}}}}{\tan }^{-1}\left[\frac{{\Delta }_{0}(\epsilon )}{\epsilon -{{\epsilon }}_{{\mathrm{a}}}}\right]d\epsilon \\ \Delta {E}_{{\mathrm{d}}}^{{\mathrm{orth}}}\,=\,2(\langle {\tilde{n}}_{{\mathrm{a}}}\rangle +f)\alpha \beta {V}_{{\mathrm{ad}}}^{2}.\hfill\end{array}$$
(4)

The constant 2 considers spin degeneracy of the orbital, $$\langle {\tilde{n}}_{{\mathrm{a}}}\rangle$$ is the occupancy of the renormalized adsorbate state by integrating the Lorentzian distribution $${\tilde{\rho }}_{{\mathrm{a}}}$$ up to the Fermi level ϵF (taken as 0), and f is the idealized d-band filling of the metal atom. The $${\tan }^{-1}$$ is defined to lie between  −π to 0 since Δ0 is a nonzero constant across the energy scale [−15, 15] eV. Thus there is no need to explicitly include localized states even if present below or above the d-band. In Eq. (4), α is termed the orbital overlap coefficient, i.e., S ≈ αV, in which the overlap integral S is linearly proportional to the coupling integral V for a given orbital. Similarly, the effective coupling integral squared V2 can be written as $$\beta {V}_{{\mathrm{ad}}}^{2}$$, where β denotes the orbital coupling coefficient and $${V}_{{\mathrm{ad}}}^{2}$$ characterizes the interorbital coupling strength when the bonding atoms are aligned along the z-axis at a given distance42. Its values of d-block metals relative to that of Cu are readily available on the Solid State Table22. It is important to note that β is in the chemisorption function, which determines both the adsorption energy and adsorbate density of states, whereas α only affects the orbital orthogonalization energy since overlap was not explicitly considered.

### Bayesian learning

Due to the computationally intensive nature of the MCMC algorithm, there is a need for a more efficient implementation of the Newns–Anderson model than what is obtained by Python and standard libraries like SciPy and NumPy. We make extensive use of Cython, a C++ extension to the standard Python, to speed up the performance (10–1000 times) of some CPU-intensive functions in the model, e.g., Hilbert transform. To perform MCMC sampling, we use PyMC, a flexible and extensible Python package which includes a wide selection of built-in statistical distributions and sampling algorithms43, e.g., Metropolis-Hastings. A “burn-in” of the first half of the samplings and then thinning (1 out of 5 samplings) was performed to ensure that subsequent ones are representative of the posterior distribution. Convergence of our MCMC-based sampling was verified using parallel chains28. The MCMC sampling results can be directly visualized using corner, a open-source Python module. We took Normal for floating-point variables unrestricted in sign, LogNormal for non-negative parameters, and Uniform for others. ΔE0 and ϵa can be estimated from DFT calculations of the adsorbate on a simple metal, e.g., sodium (Na) at the face-centered cubic (fcc) phase. Specifically, for *O, we used ΔE0 ~ N(−5.0, 1), ϵa ~ N(−5, 1), Δ0 ~ LN(1, 0.25), β ~ LN(2, 1), and α ~ U(0, 1). For *OH, we used ΔE0 ~ N(−3.0, 1), $${\epsilon }_{{\mathrm{a}}}^{3\sigma } \sim N(-6,1)$$, $${\epsilon }_{{\mathrm{a}}}^{1\pi } \sim N(-2,1)$$, and $${\epsilon }_{{\mathrm{a}}}^{4{\sigma }^{* }} \sim N(4,1)$$. We assume that the predicted adsorption properties from Eqs. (2) and (4) are subject to independent normal errors. Specifically, for the property Y and the surface i we have

$${Y}_{{\mathrm{i}}}={\hat{Y}}_{{\mathrm{i}}}(\overrightarrow{\theta })+\sigma {\epsilon }_{{\mathrm{i}}},\,i=1,\,2,\,\ldots ,\,n,$$
(5)

where ϵi is an independent and standard normal random variable and σ is the standard deviation, allowing for a mismatch between the model prediction $${\hat{Y}}_{{\mathrm{i}}}(\overrightarrow{\theta })$$ and the DFT ground truth Yi. In this approach, we define the likelihood function of the property Y from n observations44

$$P(Y| \overrightarrow{\theta },\,\sigma )\propto {\sigma }^{-n}\exp \left[-\frac{1}{2{\sigma }^{2}}\mathop{\sum }\limits_{i = 1}^{n}{\left\{{Y}_{{\mathrm{i}}}-{\hat{Y}}_{{\mathrm{i}}}(\overrightarrow{\theta })\right\}}^{2}\right],$$
(6)

where the sum runs over n training samples for the property Y, which is either the projected density of states onto an adsorbate orbital or adsorption energies. For adsorption energies, Yi and $${\hat{Y}}_{{\mathrm{i}}}$$ are scalar values with no ambiguity. For projected density of states, it is a vector of paired values, i.e., the one-electron energy of a state and its probability density, thus deserving a clarification. The mean squared residuals of model prediction from Eq. (2) for the surface i is used as $${\{{Y}_{{\mathrm{i}}}-{\hat{Y}}_{{\mathrm{i}}}(\overrightarrow{\theta })\}}^{2}$$ in Eq. (6). To compute the transition probability of each MCMC step, we define the sum of the (negative) logarithm of the likelihood functions corresponding to projected density of states onto each adsorbate orbital and binding energies with a hyper parameter λ adjusting the weight of two contributing metrics, i.e., $$-{\mathrm{ln}}\,({P}_{\Delta {\mathrm{E}}})-\lambda \sum {\mathrm{ln}}\,({P}_{{\rho }_{{\mathrm{a}}}})$$. To optimize this parameter, we varied it on a grid of 1.0e−3, 1.0e−2, 1.0e−1, and 1, and found that 1.0e−2 is the optimal value to obtain the best performance in adsorption energy prediction.