Introduction

Transition metals (TMs) supported on oxide surfaces are ubiquitous heterogeneous catalysts in chemical processes that produce fuel and value-added chemicals, as well as in processes central to renewable energy and environmental protection technologies. Among supported TM catalysts, single-atom catalysts (SACs) have attracted attention due to their enhanced performance in many applications1, including CO oxidation2,3,4,5,6,7,8,9,10, water–gas shift11,12,13, selective hydrogenation14,15,16,17, dehydrogenation18,19,20,21, and photocatalytic22 reactions. SACs with uniform TM dispersion expose every TM atom to the reaction environment, thus maximizing utilization of the expensive TM component. Key to achieving such uniform dispersion is the ability to control metal cluster sizes, which in turn is largely dictated by the TM’s strength of interaction with the support. Predicting the strength of metal/support interactions is a challenging task because there are multiple factors at play, such as the reducibility of the oxide, the electronegativity of the metal, and the structure of the interface. Herein, we use density functional theory (DFT) and statistical learning (SL) to identify physical descriptors that build a predictive model for describing metal atom binding on MgO(100) surfaces. Focus is placed on understanding and predicting how modifications of the unreactive MgO(100) surface, through the introduction of surface adsorbates or dopants, can enhance metal binding energies. We also focus on comparing the performance of various SL approaches, where we show that the physical descriptors identified with Dirichlet–Laplace prior23 are strong descriptors for predicting metal binding on MgO(100). Furthermore, we show that descriptors identified with MgO training sets also can be used to describe TM binding on modified CaO(100), BaO(100), and ZnO(100) surfaces—demonstrating the transferability of SL-derived descriptors beyond the MgO system they were trained to describe, albeit for very closely related systems.

Metal/support interactions impact both the morphology of the catalyst surface and oxidation states at the active site, which both influence catalytic performance24,25. The size of metal clusters on oxide supports is controlled by thermodynamic driving forces that usually cause small clusters to agglomerate (e.g., through Ostwald ripening)26. Metal/support interactions alter the chemical potential of each metal atom in the cluster, and thus influence thermodynamic stability with respect to cluster size27,28. These interactions also influence the kinetics of cluster formation, where an increase in the metal adsorption energy on the oxide surface generally reduces sintering rates leading to smaller particle sizes29,30. Metal/support interactions affect not only particle size distributions, but also alter the electronic state of the adsorbed metal31,32,33,34. This phenomenon, known as electronic metal–support interaction35, is caused by charge transfer between the metal and the support, which can alter reaction rates by affecting how strongly intermediates adsorb at the active site.

Adsorbates from the reaction environment, or dopants present in the oxide structure, can enhance charge transfer at the metal/support interface31,36,37,38,39,40. Addou et al.41 demonstrated that adsorbed hydrogen atoms on the rutile TiO2(011)-(2 × 1) surface alters the binding energy of Pd atoms to the support, which stabilizes small Pdn clusters (i.e., n ~ 1–3 atoms). Babucci et al.42 used Fourier-transform infrared spectroscopy to show that the electronic state of Ir(CO)2 can be altered on oxide supports that are modified with different ligands. They demonstrated that Ir complexes exhibit different reactivity toward 1,3-butene hydrogenation if the surface modifications shift the Ir oxidation state at the active site. Kumar et al.43 used Hammett reactivity studies, together with DFT, to show that the electronic state of Au during benzyl alcohol oxidation is influenced by electron donation from the supporting oxide. Similar effects have been observed on surfaces that are modified by introducing dopants in the lattice of the oxide. For example, Shao et al.44 reported that the electronic structure of CaO surfaces can be controlled by introducing Mo dopants that replace Ca atoms in the surface lattice. The additional valence electrons supplied by Mo migrate toward adsorbed Au clusters, thus enhancing Au binding on the modified CaO surfaces. This effect was shown to be less pronounced over Cr-doped MgO surfaces45,46, which demonstrates that the dopant effect varies between different types of dopants and oxide supports.

Given the impact that metal/support interactions can have on catalyst morphology and activity, it is clear that identifying physical descriptors for predicting interaction trends will be of high value. Toward this end, Campbell and Sellers47 proposed that the adsorbed metal’s enthalpy of oxide formation computed relative to the isolated metal atom (ΔHf,ox,atom) should be an effective descriptor for predicting metal adsorption energies on oxide surfaces; ΔHf,ox,atom captures the bonding strength between the metal atom and oxygen, and it should therefore correlate with the metal’s binding strength to oxide surfaces. They demonstrated a linear correlation between ΔHf,ox,atom and metal adhesion energies measured by adsorption calorimetry, thus deriving a simple model for predicting metal adsorption energies based on readily available reference data27. Using both isothermal titration calorimetry and DFT, Strayer et al.30 verified that ΔHf,ox,atom is also an effective descriptor for predicting metal adsorption energies on HCa2Nb3O10 perovskite surfaces. Although successful in these cases, identifying physical descriptors purely from chemical intuition is not a simple task. Complex charge transfer between the metals, the oxide supports, and surface modifiers (i.e., adsorbates and dopants) is challenging to describe with closed physical forms motivated from chemical intuition alone, as multiple phenomena are occurring simultaneously. As such, in this work we apply SL to identify useful physical descriptors that capture charge transfer at complex, multicomponent interfaces. We choose MgO for our initial study to demonstrate the effectiveness of the SL approach, since charge transfer characteristics can be anticipated from the non-reducible characteristics of MgO. Although we do not expect all surface modifications of MgO studied here to be experimentally feasible because dopants/adsorbates can generate charge-compensating defects that would counteract charge transfer to the adsorbed metal45, the idealized nature of pristine MgO is a desirable testbed for evaluating the performance of various SL approaches and for identifying physical descriptors to predict how charge transfer affects metal adsorption in the idealized case. Once the strengths and weaknesses of the SL approaches are known, then we can pursue studies that use well-chosen SL approaches to understand more complex, but experimentally realizable, oxide systems containing defects.

Multiple feature selection (FS) methods have been introduced in the materials community to identify physical descriptors from extremely large sets of candidate descriptors. FS methods are preceded by gathering the chemical properties of the system’s components from various databases and using these properties to generate a large feature space of candidate descriptors. This step, referred to as feature engineering, is achieved by applying a series of mathematical operations on all descriptor pairs to enumerate millions (or even billions) of possible descriptor combinations and functional forms. FS methods are then applied to identify the features from this pool that are the strongest predictors of the property of interest (Fig. 1). There is a rich menu of FS methods in SL, ranging from traditional principle analysis component (PCA)48 or kernel ridge regression (KRR)49,50 to large scale methods, such as least absolute shrinkage and selection operator with l0 norm regression (LASSO + l0)51,52 or sure independence screening and sparsifying operator (SISSO)53,54. The success of these FS approaches in the materials community is demonstrated by many examples55,56, including the work of Ghiringhelli et al.51, who used LASSO + l0 regression to identify physical descriptors that can predict crystal structures based on material composition. Similarly, Andersen et al.54 used SISSO to derive descriptor subsets that can predict molecular adsorption energies on metal alloys. Alternative to these frequentist approaches, Bayesian FS has emerged in SL as a primary tool that provides a coherent and principled framework to quantify uncertainties57,58,59,60. In addition to inheriting the advantage of Bayesian methods to conveniently incorporate domain knowledge via prior distributions, state-of-the-art Bayesian FS is able to adapt to unknown sparsity levels in the feature space61,62 and to achieve automatic multiplicity correction63,64. In this work, we will explore the performance of both LASSO-based FS methods and state-of-the-art Bayesian FS methods23,65 for finding descriptors that can predict changes in metal binding energy caused by MgO surface modification.

Fig. 1: Statistical learning workflow.
figure 1

The predictive model is built from statistical learning methods using fundamental chemical properties as the descriptors and the DFT energy as the training set.

In the present work, DFT binding energies of single metal atoms on modified MgO surfaces are used to train our SL models. Training and validation sets are built from data collected for MgO(100) surfaces that are modified by either adsorbates or dopants. MgO(100) is an irreducible oxide that, in its pristine form, binds TMs weakly compared to other supports52. It therefore is ideal for investigating the effects of support modifications, as changes in its binding interaction with the supported metal can be attributed solely to the effects of the surface modification. Indeed, MgO has been used as a template for many investigations of metal/support interactions, such as demonstrating how the oxidation state of gold varies upon surface modification39,66 and how linear scaling relationships can be derived for a wide variety of adsorbates at Au/oxide interfaces67. We modified the MgO substrate with surface dopants and adsorbates to induce electron-poor and electron-rich conditions, where we find that both lead to an enhancement of the metal binding energies due to an increase in charge transfer between the metal and the support, as expected from the irreducible nature of pristine MgO.

We identify descriptors for predicting the effects of surface modifications by using multiple FS methods: LASSO68, Dirichlet–Laplace prior23, and Horseshoe prior65 (Fig. 1). The methods are applied to identify physical descriptors for predicting the enhancement of metal binding energies on modified MgO based on readily available chemical properties of the system’s components (i.e., the adsorbed metal, the MgO surface, and the adsorbate or dopant). The selected feature spaces are then refined with l0 norm51 regression to build predictive models that describe single metal atom binding on MgO surfaces. We find that Dirichlet–Laplace prior shows outstanding FS properties, as it mitigates many disadvantages evident in the performance of LASSO and Horseshoe prior. We evaluate the transferability of the identified features of each method by applying them to predict changes in metal adsorption over CaO(100), BaO(100), and ZnO(100) surfaces. We find that features identified by Dirichlet–Laplace prior using the MgO training data are effective descriptors for predicting metal binding on these irreducible surfaces as well, which again demonstrates the robust nature of Dirichlet–Laplace prior because no CaO, BaO, and ZnO data were included in the FS procedure. It also demonstrates the transferability of SL-derived features within oxide families, which widens model applicability to systems that were not used explicitly to generate the training set.

Results

Effects of dopants and adsorbates on metal adsorption energy

Our objective is to build a model that captures the relationship between metal adsorption energy and readily available chemical properties of the adsorbed metals, MgO, dopants, and adsorbates. A range of TMs (Ag, Au, Cd, Co, Cr, Cu, Fe, Ir, Mn, Mo, Nb, Ni, Pd, Pt, Rh, Ru, V, W, and Zn) were adsorbed on MgO(100) surfaces to generate our DFT training sets. The metal adsorption energy (ΔEads) is calculated using Eq. 1, where EMetal/Surface is the total DFT energy of the metal atom adsorbed on the MgO(100) surface at its most stable site, EMetal is the total DFT energy of the isolated metal atom, and ESurface is the total DFT energy of the clean MgO(100) surface.

$${\mathrm{\Delta }}E_{\rm{ads}}\,=\,E_{\rm{Meta/Surface}}\,-\,E_{\rm{Metal}}\,-\,E_{\rm{Surface}}.$$
(1)

There are four unique adsorption sites on the MgO(100) surface: an atop site on O, an atop site on Mg, a bridge site between O and Mg, and a hollow site. Adsorption at all four sites was tested to find the most favorable adsorption site for each metal. The atop site above the oxygen anion was found to be the most stable adsorption site for every metal, in agreement with previous reports52,69,70. A representative structure of Au adsorbed on MgO(100) is shown in Fig. 2; similar geometries were obtained for all other metals.

Fig. 2: Metal atom adsorption on MgO surfaces.
figure 2

Top/side view geometries of Au adsorption on a/d pristine, b/e OH-modified, and c/f Al-doped MgO(100) surfaces.

We modified the MgO(100) surface by introducing dopants and adsorbates to enhance the reducibility of the support, which we find generally increases metal adsorption strengths. We first considered H and OH adsorbates, which are commonly encountered species under catalytic reaction conditions. Adsorption of H and OH on four unique surface sites was tested to identify the most stable site, which we found was the atop site on surface O for H and the surface hollow site for OH. The metal atoms were then placed on the optimized structures at the atop O site and then the entire system was optimized to calculate the metal binding energies. A representative structure of Au adsorbed on the OH-MgO(100) surface is shown in Fig. 2; other adsorbate-modified surfaces are shown in Supplementary Fig. 1. F and NO2 were also introduced to expand the DFT training data, where F has a higher electron affinity (EA) (3.40 eV) than OH (1.83 eV) and thus is expected to yield to a stronger metal binding enhancement compared to OH. NO2 was included because it can readily act as a neutral, positive, or negative species based on the chemical environment, thus diversifying the training set. Each metal atom was placed as far from the adsorbate as possible within the cell to prevent direct interaction between the metal and the adsorbate. This isolates the adsorbate’s impact on the properties of the support from effects related to direct metal–adsorbate interactions. Ag, Nb, and Ru formed direct bonds to NO2 during all optimization attempts, so their binding energies were excluded from the data set so that only surface-mediated effects were considered. Doping the oxide surface alters the electronic structure of the oxide in a manner similar to the adsorbates37,39,44,45,46. Li, Na, B, and Al were chosen as dopants in this study, as they contain either one fewer or one additional valance electron compared to Mg. The dopants replaced one Mg atom in the first layer of the oxide (Fig. 2c, f).

All modifications of the MgO(100) surface enhanced charge transfer between the surface and the supported metal, as shown by the charge distributions in Fig. 3. When H is adsorbed on Au/MgO(100), one electron is transferred from H to the metal adatom (Fig. 3b). Conversely, OH is electron withdrawing causing charge to be drawn away from Au (Fig. 3c). For surface dopants, Al has one more valence electron than Mg and donates charge to the adsorbed Au atom (Fig. 3d). Na reverses this trend, causing Au to donate electrons to balance the charge depletion in the surface (Fig. 3e). There is a clear similarity in the effects of surface dopants and adsorbates, and therefore we broadly classify surfaces with additional electrons compared to pristine MgO as electron-rich and surfaces with fewer electrons as electron-poor.

Fig. 3: Electron density distribution of Au on MgO surfaces.
figure 3

Isostructural charge density difference plots of a Au/MgO(100), b Au/H-MgO(100), c Au/OH-MgO(100), d Au/Al-MgO(100), and e Au/Na-MgO(100). Blue represents depletion of electron density and green represents accumulation. The black arrows depict the flow of electrons. The isosurface level is ±0.003 e Bohr−3.

Metal adatom oxide formation enthalpy (ΔHf.ox,atom) is a widely used descriptor for predicting metal adsorption on different oxides27,30,47,52. However, we found that this descriptor was not sufficient for predicting changes in adsorption energy on the modified MgO(100) surfaces shown in Fig. 3, as it only captures how easily the metal can be oxidized by donating its electrons to the surface and fails to represent situations where electrons are donated to the metal (i.e., as seen in Fig. 3b, d for Au on electron-rich surfaces). The enhancement in metal adsorption energy in such situations is largely controlled by the ability of the adsorbed metal to accept excess charge from the modified surface, where ΔHf.ox,atom is no longer a suitable descriptor because it only captures the ability of the metal to donate electrons. Therefore, we propose that the ionization potential (i.e., IE1) and EA can be used as broad descriptors for predicting the change in metal adsorption energy (Δ(ΔEads), as defined by Eq. 2) caused by bidirectional charge transfer.

$${\mathrm{\Delta }}\left( {{\mathrm{\Delta }}E_{\rm{ads}}} \right) = {\mathrm{\Delta }}E_{\rm{ads}}\left( {{\rm Modified}\,{\rm MgO}} \right) - {\mathrm{\Delta }}E_{\rm{ads}}({{\rm Clean}\,{\rm MgO}}).$$
(2)

These descriptors capture the atom’s propensity for both electron donation (versus IE1) and acceptance (versus EA).

We plot the EA and IE1 against Δ(ΔEads) for electron-rich and electron-poor surfaces in Fig. 4. The correlation is separated into two categories: a negative correlation for electron-rich surfaces (Fig. 4a) and a positive correlation for electron-poor surfaces (Fig. 4b). On the electron-rich surfaces, Au adsorption has been enhanced significantly because Au has the highest EA and readily accepts the extra electronic charge. The adsorption of V is increased the most on the electron-poor surfaces because V has the lowest IE1 and can readily donate charge to the surface. The direction of charge transfer in these systems is further confirmed by the charge density difference analyses for Au and Mn shown in Supplementary Fig. 2 and the corresponding Bader charge analysis in Supplementary Table 1. Moreover, the density of states (DOS) analysis for Au and Mn on doped MgO(100) are provided in Supplementary Fig. 3. The shift in the DOS at the Fermi level of the electron-poor MgO(100) surface clearly demonstrates a transfer of electrons to the surface from the adsorbed metal (and vice versa for the electron-rich surface).

Fig. 4: Metal binding correlation to electron affinity and ionization potential.
figure 4

The a electron affinity (EA) and b ionization potential (IE1) are used as descriptors for predicting the difference in metal adsorption energy between clean MgO versus electron-rich MgO and electron-poor MgO, respectively.

SL for identifying physical descriptors

Although useful for predicting broad trends, EA and IE1 are limited in their ability to quantitatively describe metal binding on modified MgO. This is evident in the scatter seen in Fig. 4. Quantitative descriptors that can unravel the effects of multiple charge transfer phenomena simultaneously will likely require more complex functional forms than what is captured by the simple electronic properties. In this section, we identify such descriptors for MgO using FS via LASSO, Horseshoe prior, and Dirichlet–Laplace prior. The feature spaces were also tested for transferability to CaO(100), BaO(100), and ZnO(100) in the next section, where DFT data for the additional oxide surfaces were not included in any training data used for feature selection. We find that Dirichlet–Laplace prior yields the best models, both in terms of lowest error and highest transferability.

We show the performance of the selected descriptor sets in Figs 5 and 6 using 50 trials of randomly separated training/validation data sets for dopant- and adsorbate-modified MgO, respectively. We use the l0 norm method to refine each model by systematically decreasing the number of descriptors in the model (i.e., the model dimension corresponds to the number of descriptors in the model, nD). The root mean square error (RMSE) is calculated by Eq. 3 to quantify the prediction accuracy, where yi is the ith predicted value, \(\widehat {y_i}\) is the ith estimated value, and n is the number of predicted values.

$${\rm{RMSE}}\,=\,\sqrt {\frac{{\mathop {\sum }\limits_{i = 1}^n \left( {y_i\,-\,\widehat {y_i}} \right)^2}}{n}}.$$
(3)
Fig. 5: Descriptor performance for doped MgO surfaces.
figure 5

Box plots represent the overall training and validation RMSE, computed using the l0 norm method to refine the total descriptor set identified with each method to a model containing the number of descriptors that yields the lowest validation error, of the predictive models obtained from feature selection methods, i.e., a LASSO, b Horseshoe prior, and c Dirichlet–Laplace prior. Box plots reflect data from 50 trials of randomly split training and validation sets. Parity plots of models generated with d LASSO, e Horseshoe prior, and f Dirichlet–Laplace prior, where training and validation data are presented as solid and hollow points, respectively. The training and validation RMSE for df are indicated in ac as orange diamonds. 15D, 3D, and 10D denote the number of descriptors in each model (i.e., the number of free coefficients used to fit the linear regression). The center line, upper bound, lower bound, upper whisker, and lower whisker in box plots represent median, 75th percentile, 25th percentile, maximum, and minimum, respectively.

Fig. 6: Descriptor performance for adsorbate-modified MgO surfaces.
figure 6

Box plots represent the overall training and validation RMSE, computed using the l0 norm method to refine the total descriptor set identified with each method to a model containing the number of descriptors that yields the lowest validation error, of the predictive models obtained from feature selection methods, i.e., a LASSO, b Horseshoe prior, and c Dirichlet–Laplace prior. Box plots reflect data from 50 trials of randomly split training and validation sets. Parity plots of models generated with d LASSO, e Horseshoe prior, and f Dirichlet–Laplace prior, where training and validation data are presented as solid and hollow points, respectively. The training and validation RMSE for df are indicated in ac as orange diamonds. 13D, 3D, and 5D denote the number of descriptors in each model (i.e., the number of free coefficients used to fit the linear regression). The center line, upper bound, lower bound, upper whisker, and lower whisker in box plots represent median, 75th percentile, 25th percentile, maximum, and minimum, respectively.

The model with trained coefficients is then applied to the validation set, and the dimension that provides the lowest validation RMSE is used to determine the final predictive model. RMSE for the final models derived from 50 randomly chosen training set points is shown in Fig. 5a–c and Fig. 6a–c for dopant- and adsorbate-modified MgO, respectively. Although LASSO tends to select the most descriptors, which will be shown in the “Methods” section, the derived models are less accurate than the ones built from the Bayesian methods. In addition, the total data set derived from Horseshoe prior contains less than five descriptors on average using either dopant-modified or adsorbate-modified MgO data, which is much smaller in size than the total data sets derived by the other two methods. Although Horseshoe prior constructs more accurate models than LASSO, it is highly aggressive in its selection and generates a very small pool of candidate features that are available for refining the model. Dirichlet–Laplace prior exhibits the best performance, as it is a balance between the stringent selection tendencies of Horseshoe prior and the permissive selection tendencies of LASSO. Although Horseshoe usually performs similarly to Dirichlet–Laplace (Supplementary Fig. 4) at a given model dimension, the model cannot be further improved because we rapidly exhaust the pool of candidate descriptors. Conversely, Dirichlet–Laplace prior can be systematically improved by adding more descriptors until the validation error of the test set increases, thus allowing us to balance model accuracy with model simplicity. The predictive performance of these models is shown by the parity plots in Figs 5d–f and 6d–f, where each plot was generated with a representative training set that yields a training RMSE close to the median of each method (i.e., the same training set was used for all three methods to ensure comparability). The corresponding training and validation RMSE of these plots is labeled as an orange diamond in Figs 5a–c and 6a–c. The performance of the models is improved compared to our previous work on clean oxides using LASSO + l052, where in that work we achieved an RMSE of 0.41 eV on the training set using five descriptors to predict 92 data points (i.e., compared to errors of 0.26 and 0.25 eV on training sets with five descriptors for the dopant and adsorbate data in Supplementary Fig. 4, respectively).

Tables 1 and 2 present the descriptors and relevant coefficients of the best models for dopant-modified and adsorbate-modified MgO surfaces that were shown in Figs 5f and 6f. The physical descriptors for doped MgO are composed of ENP, ENMB, IE1, IE2, EA, NVal, Z, and wrp, and for adsorbate-modified MgO are composed of ENP, ENMB, IE1, IE2, and NVal. The formula of each descriptor is more complex than those identified in our previous study of clean oxides52, reflecting the increased complexity of the problem once surface modifiers are introduced. The importance of charge transfer phenomena is evident in the character of the descriptors. For instance, \(\left| {\frac{{\rm{IE}}_1^{\rm{m}} - {\rm{IE}}_2^{\rm{s}}}{{\rm{IE}}_2^{\rm{m}} - {\rm{IE}}_2^{\rm{d}}}} \right|\,\times\,\left| {\frac{{\rm{IE}}_1^{\rm{m}} - {\rm{IE}}_2^{{\rm{dn}}}}{{\rm{IE}}_2^{\rm{m}} - {\rm{IE}}_1^{{\rm{dn}}}}} \right|\) in Table 1 plays a major role in predicting the enhanced binding of Au adsorption on Al-MgO compared to Na-MgO. The appearance of a term capturing the difference between the ionization energies of the metal and the dopant suggests that the favorability of charge transfer between these two components plays a major role in determining the overall metal binding energy. For adsorbate-modified surfaces, \(\left( {\frac{{\rm{EN}}_{\rm{P}}^{\rm{m}} - {\rm{EN}}_{\rm{P}}^{\rm{s}}}{{\rm{EN}}_{\rm{P}}^{\rm{s}} - {\rm{EN}}_{\rm{P}}^{\rm{o}}}} \right)^2\,\times\,\left| {\frac{{\rm{IE}}_2^{\rm{m}} - {\rm{IE}}^{\rm{a}}}{{\rm{IE}}_1^{\rm{o}} - {\rm{IE}}^{\rm{a}}}} \right|\) is the most effective descriptor for Au adsorption. \(\left| {\frac{{\rm{IE}}_2^{\rm{m}}\,-\,{\rm{IE}}^{\rm{a}}}{{\rm{IE}}_1^{\rm{o}}\,-\,{\rm{IE}}^{\rm{a}}}} \right|\) qualifies the difference in the ability of the parent metal, oxygen, and the adsorbate to donate charge. Although some terms in these models may be interpreted, this is in general a difficult task given the fact that interrelated phenomena controlling charge transfer cannot be described by a simple physical framework71.

Table 1 Descriptors, model coefficients, and responding values determined by Dirichlet–Laplace for dopant-modified MgO.
Table 2 Descriptors, model coefficients, and responding values determined by Dirichlet–Laplace for adsorbate-modified MgO.

Robustness and transferability of the selected descriptors

The descriptors identified in the previous section were applied to predict Δ(ΔEads) on various surfaces to test feature transferability and robustness. The metals were placed on CaO(100), BaO(100), and ZnO(100) doped with the same dopants as used for MgO (Al, B, Li, and Na). We used the SL-derived descriptor sets from the MgO training data to predict the behavior on the abovementioned modified surfaces, where we use l0 norm regression to refine the model dimensionality (Fig. 7). Note that this entails refitting the coefficients in front of each descriptor, so here we are only testing the transferability of the descriptors and not the transferability of the entire model. The performance of features derived from 50 trials of randomly separated MgO training sets are used to verify feature transferability, as shown by the box plots in Fig. 7. Since Horseshoe typically selects less than five descriptors, we restrict all of the predictive models to have at most five descriptors for comparison in Fig. 7a–c, as indicated by a dashed line in Fig. 7d–f. Comparing RMSE in Fig. 7a–c, the transferability of selected feature spaces is ranked as CaO ≈ BaO > ZnO. MgO and CaO have many similar properties and are commonly compared in the literature45,46,72, and therefore it is expected that descriptors that are important for MgO should also be applicable to CaO. Since Ba is in the same group as Mg and Ca in the periodic table, it is intuitive that BaO data also correlates with the MgO-derived features. We find that the features are less transferable in the case of ZnO, as expected given the fact ZnO is in a different group. In the comparison between the SL methods, Horseshoe performs the worst. Although it can pick meaningful descriptors, it will only select the ones that are dominant for MgO and will aggressively throw out ones that, although important for other oxides, are less important for MgO. For a more detailed comparison, we summarize the difference in RMSE for LASSO and Dirichlet–Laplace (ΔRMSE) in Supplementary Fig. 5, which shows that LASSO has higher average prediction errors than Dirichlet–Laplace when comparing models derived with the same training set and the same model dimension. Thus, Dirichlet–Laplace shows the highest transferability among the SL methods. In summary, we find that descriptors identified using Dirichlet–Laplace prior for one oxide are applicable to other related oxides, demonstrating the robust nature of the selected features and their transferability outside the training set used for feature selection. We note that the RMSE for all of the testing oxides with the same feature space is generally similar to the MgO models.

Fig. 7: Feature transferability.
figure 7

Box plots illustrate RMSE for a CaO, b BaO, and c ZnO applying the MgO-derived features. We note that RMSE is obtained from models with five descriptors even if the method selects more than five descriptors to have a fair comparison between the SL methods because Horseshoe prior usually selects less than five features (Fig. 10). Comparison of models for describing d CaO, e BaO, and f ZnO data built from features selected by LASSO, Horseshoe prior, and Dirichlet–Laplace prior using the MgO training data in Fig. 5. RMSE is computed using the l0 norm method to refine the total descriptor set identified with each method to a model containing the number of descriptors shown on the x-axis. The center line, upper bound, lower bound, upper whisker, and lower whisker in box plots represent median, 75th percentile, 25th percentile, maximum, and minimum, respectively.

Stability of modified MgO(100) surfaces

In the previous sections we evaluated the performance of various SL approaches for predicting the behavior of idealized MgO surfaces, where it is assumed that the dopants or adsorbates do not induce accompanying defects to compensate the additional charge and therefore charge transfer to the adsorbed metal atom was forced to occur. In real systems there inevitably will be a tendency for charge-compensating defects to compete for this charge transfer, which will be examined in this section. Numerous experiments demonstrate that the presence of the adsorbed metal can in certain cases suppresses the tendency to form charge-compensating defects in these irreducible oxides73,74,75. Shao et al.44 showed that the charge transfer from Mo dopants in CaO enhances Au binding strength because excess electrons from Mo are transferred to the electronegative Au atoms. This effect was also present, but less pronounced, on Cr-doped MgO(100), where Stavale et al.45 found that the enhancement of Au binding was inhibited by charge-compensating Mg vacancies. In addition to MgO, Tran et al.76 induced Fe as dopants in ZnO to modulate the Pt/ZnO interaction. The X ray photoelectron spectroscopy analysis of Pt 4f core levels revealed the change in the oxidation state of Pt, which is direct evidence of the charge transfer between Fe and Pt. The enhanced metal–support interaction in this system stabilizes Pt nanoparticles and suppresses metal sintering during CO oxidation. For instance, the average Pt particle size increased from 2.2 to 5.7 nm on the bare ZnO during CO oxidation but only increased from 2.4 to 3.2 nm on the Fe-doped ZnO. Furthermore, the turnover frequency was raised from 0.60 to 5.37 s−1 with the assistance of the Fe dopants. Thus, the predictive model developed in this work is relevant for screening system compositions where one can expect significant charge transfer to induce an enhancement of the metal binding energy. Once identified, the stability of the required surface modification can be assessed, by computing free energies of formation as demonstrated below, to determine if the candidate system is a viable target for experimental synthesis attempts.

Here we compute phase diagrams for the formation of charge-compensating O or Mg vacancy defects in electron-poor (Na-doped and OH-modified) and electron-rich (Al-doped and H-modified) surfaces, respectively (Fig. 8). We expanded the simulation cell to twice the size of the original unit cell to accommodate multiple dopants in the surface (Supplementary Fig. 6). The O and Mg vacancy formation free energies are calculated by Eqs. 4 and 5, respectively, where ΔGO-vac is the O vacancy formation free energy, ΔGMg-vac is the Mg vacancy formation free energy, EO-vac is the total DFT energy of the modified surface with one O vacancy, EMg-vac is the total DFT energy of the modified surface with one Mg vacancy, E2MgO is the total DFT energy difference between the geometry with two MgO units on the surface (Supplementary Fig. 7) and the pristine MgO(100) surface, Esurface is the total DFT energy of the modified surface without any charge-compensating defect, T is the temperature, and PO2 is the partial pressure of oxygen molecule. µO2 is the chemical potential of an oxygen molecule in the gas phase, which is computed by the total DFT energy of the O2 molecule with enthalpy and entropy corrections provided by the NIST webbook77. We used the energy of a small supported MgO cluster with two MgO formula units as the reference for Mg vacancy formation in Eq. 5 because the formation of this small cluster will be the first nucleation step once Mg atoms leave the MgO lattice. The energy of bulk MgO can also be used as the reference, which will systematically shift all Mg vacancy formation energies to be more negative by 2.59 eV. This will not impact the relative difference in vacancy formation energy when comparing the clean surfaces to the surfaces with adsorbed metal atoms.

$${\mathrm{\Delta }}G_{\rm{O}\,-\,\rm{vac}}\left( {T,P_{\rm{O}_{2}}} \right)\,=\,E_{\rm{O}\,-\,\rm{vac}}\,+\,\frac{1}{2}\mu _{\rm{O}_{2}}\left( {T,P_{\rm{O}_{2}}} \right)\,-\,E_{\rm{surface}},$$
(4)
$${\mathrm{\Delta }}G_{\rm{Mg}\,-\,\rm{vac}}\left( {T,P_{\rm{O}_{2}}} \right)\,=\,E_{\rm{Mg}\,-\,\rm{vac}}\,+\,\frac{1}{2}E_{\rm{2MgO}}\,-\,\frac{1}{2}\mu _{\rm{O}_{2}}\left( {\rm{T,P}}_{\rm{O}_{2}} \right)\,-\,E_{\rm{surface}}.$$
(5)
Fig. 8: Phase diagram of charge-compensating defect formation.
figure 8

The curve represents the boundary where the defect, e.g., O or Mg vacancies, formation free energy equals zero on modified MgO(100) surfaces as a function of temperature and oxygen partial pressure. For electron-poor surfaces, i.e., a Na-MgO and b OH-MgO, the O vacancy will form in the bottom-right region. For electron-rich surfaces, i.e., c Al-MgO and d H-MgO, the Mg vacancy will form in the top-left region.

We use the above equations to compute the boundary in (PO2, T) space where the defect formation energy is zero, which is used to generate the phase diagrams shown in Fig. 8. In each case, we compute the defect formation boundary on the clean surface without any adsorbed metal, and then we compare it to the boundary on surfaces with the adsorbed metals. We find that the metal atom on modified MgO generally inhibits defect formation because the metal atom acts as an electron source or sink to alleviate the excess surface charge caused by the dopant or adsorbate. The extent of the boundary shift correlates with IE1 and EA for electron-poor and electron-rich surfaces, respectively, which indicates that it is controlled by the ability of the adsorbed metal to donate or accept excess charge (Supplementary Fig. 8). Comparing vacancy formation energy of modified MgO(100) surfaces (the dashed lines in Supplementary Fig. 8) to the pristine surfaces (Supplementary Table 2), it is evident that the surface modification will lead to compensating defects, as discussed above. However, the favorability of compensating defect formation will be suppressed by the adsorbed metals. The suppression of the area of the phase diagram in which a compensating defect would occur, as seen in Fig. 8, suggests that it may thermodynamically be feasible to prepare some of the adsorbed metal/modified MgO surfaces considered in this work if the surface modification and metal adsorbate are introduced simultaneously before charge-compensating defects can form.

Discussion

We have shown the capability of SL methods for constructing predictive models for computing Δ(ΔEads) in oxide-supported SAC systems. The SL methods yield useful relationships between the fundamental chemical properties of dopants/adsorbates and overall metal/support interactions, which are often used to control surface morphology but in many cases are not well-understood31,36,37,38,41,44,45,46. The resulting models provide an estimation tool for quantifying how metal binding energy can be tuned through surface modification. This in turn can provide guidelines for designing and controlling the particle size of the TMs (e.g., by finding modifications that stabilize single adsorbed metal atoms relative to the bulk metal). Indeed, we predict that Mn may form stable SACs on Li-doped and Na-doped ZnO surfaces because the formation energy of single Mn atoms on these surfaces is negative relative to the formation of bulk Mn (Supplementary Section 1). We also introduced two state-of-the-art Bayesian methods that have several advantages over other common FS methods (e.g., PCA, KRR, LASSO, SISSO, etc.), such as flexible prior distributions, providing automatic error estimation on selected features, adaptivity to arbitrary sparsity in the feature space, and the ability to utilize the full data set for inference without requiring cross validation.

The robustness of our methodology was verified by applying the final predictive models to validation data sets, as well as by applying the selected features to CaO, BaO, and ZnO surfaces. The transferability of descriptors reveals the power of the Dirichlet–Laplace prior method, since it can use training data on one particular oxide to identify features that are applicable to similar oxides. We are currently exploring combined FS approaches based on training data derived from multiple oxides to build models that are fully general within oxide families. This will include extensions to more reducible oxides, where it is expected that the parent metal of the oxide will play a more direct role as a center of charge transfer. For example, our previous study52 showed that metal–metal and redox interactions can occur on highly reducible oxides, such as CeO2. The parent metals in the support are expected to interact with dopants and adsorbates as well, which will further complicate the nature of charge transfer. Finding suitable descriptors from chemical intuition alone will be nearly impossible for such systems, but likely will be feasible using the SL tools developed in this study.

Transition metal adsorption on MgO is enhanced by surface doping and adsorbate modification, which can be predicted using physical descriptors identified by SL (namely, LASSO, Horseshoe prior, and Dirichlet–Laplace prior methods). In particular, MgO can be transformed by dopants or adsorbates to exhibit either electron-rich or electron-poor characteristics. The extent of charge transfer between metal and modified MgO surfaces correlates with the ionization potential and the EA of the metal, but scatter in these correlations preclude quantitative prediction. Therefore, we used SL to derive predictive models that go beyond these simple descriptors, which were built from the fundamental chemical properties of the adsorbed metals, Mg, O, dopants, and adsorbates. Multiple FS tools, including state-of-the-art Bayesian methods, were applied together with l0 norm regression to derive simple and effective models for predicting metal adsorption on modified MgO surfaces. Of the tested SL methods, we found that Dirichlet–Laplace prior yields the most accurate and transferable models. The descriptors identified for MgO(100) by Dirichlet–Laplace prior were also shown to be effective for estimating metal binding on CaO(100), BaO(100), and ZnO(100) surfaces with comparable accuracy, which demonstrates the robust nature of this FS approach and the transferability of our models to related oxides.

Methods

DFT calculations

Metal adsorption energies were calculated with DFT using the Vienna ab initio simulation package (VASP 5.4.4)78. The Perdew–Burke–Ernzerhof exchange-correlation functional79 was applied with spin polarization and the projector augmented-wave method80 was used to treat core electrons with VASP default potentials81. The valence electrons treated self-consistently for each atom type are listed in Supplementary Table 3. Planewave basis sets were expanded to a kinetic energy cutoff of 600 eV. Gaussian smearing was employed with a smearing width of 0.05 eV. A Monkhorst–Pack (MP)82 k-point mesh was used with 4 × 4 × 4 sampling on the bulk MgO structure and 3 × 3 × 1 sampling on the (2 × 2) MgO(100) surface models. The Grimme empirical dispersion correction was used to treat van der Waals dispersion83. Geometries were optimized to a convergence criterion of 0.05 eV Å−1. Metal binding energies computed with a force convergence criterion of 0.05 eV Å−1 were found to be nearly identical to those computed with a tighter criterion of 0.01 eV Å−1 (Supplementary Table 4).

Surface models contained four layers of MgO(100) with all layers relaxed to avoid spurious surface dipoles that can arise from frozen layers in oxide surfaces84. The vacuum distance between layers in the direction perpendicular to the surface was at least 15 Å to avoid interactions between MgO slabs, and a dipole correction85 was applied perpendicular to the MgO(100) surface. Calculations on CaO(100), BaO(100), and ZnO(100) surfaces followed the same settings as those applied for the MgO(100) surfaces, except the MP k-points sampling is 5 × 5 × 1 for ZnO(100). The ground state energy of single metal atoms is computed in a 15 × 16 × 17 Å3 unit cell, which was our reference for computing binding energies. The magnetization states and energies of the isolated metal atom are listed in Supplementary Table 5. Multiple magnetization states were tested when considering metal adsorption on the oxide surfaces, examining all probable spin configurations as listed in Supplementary Table 6, and the ground state with the lowest energy was selected for use in the adsorption energy training set. Metal binding energies on MgO(100), CaO(100), BaO(100), and ZnO(100) are listed in Supplementary Tables 710, and the corresponding system magnetizations are reported in Supplementary Tables 1114. The Bader method86,87 was used to calculate partial charges on atoms, which were only used for qualitatively assessing the direction of charge transfer (Supplementary Table 1). The defect formation energy of the pristine MgO(100) surface is listed in Supplementary Table 2. Coordinates of all energy-minimized structures are provided in the Supplementary Information.

Statistical learning

Our previous study52 applied LASSO + l051 to derive descriptors that can predict metal adsorption energies across a wide range of metal/oxide pairs. While successful in that work, in the present study we found that LASSO + l0 was limited in its ability to handle correlated features when we introduced properties of the surface modifiers in the feature space (vide infra). As such, here we apply two additional state-of-the-art Bayesian methods for feature selection, Horseshoe prior65 and Dirichlet–Laplace prior23, that are implemented alongside the LASSO68 approach for comparison. Bayesian methods are particularly well-suited for FS because they allow users to incorporate domain knowledge when applicable, they offer uncertainty quantification, and, most notably, they provide a coherent framework to infer model probabilities. Modern Bayesian FS methods can adapt to unknown sparsity levels in the feature space and perform well when handling correlated features88. Bayesian methods operate by first specifying an initial distribution of values for each descriptor coefficient (i.e., a prior distribution) and then updating this distribution based on available training data through Bayes’ theorem (i.e., solving for the posterior distribution). These methods offer flexibility through the choice of the prior distribution, as different prior distributions will yield different FS characteristics. The DFT training data for FS in this work consisted of metal atom binding energies on MgO surfaces that were modified by either dopants or adsorbates, where FS was separate for “dopant” and “adsorbate” data sets. Details regarding the implementation of each FS method are provided in the Supplementary Information.

Primary descriptors

Our feature space was built from a primary descriptor set that contained chemical properties of the adsorbed metals, the parent atoms in the oxide surface (i.e., Ba, Ca, Mg, O, and Zn), and dopants that were available in the CRC Handbook of Chemistry and Physics89. These atomic properties include the atomic number (Z), electronegativity in Pauling and Martynov–Batsanov90 scales (ENP and ENMB), first and second ionization energy (IE1 and IE2), EA, standard sublimation enthalpy (ΔHsub), standard molar enthalpy of oxide formation of the metal adatoms (ΔHf,ox,bulk), standard molar enthalpy of formation of the adsorbed metal’s most stable oxide referenced to the isolated metal atom (ΔHf,ox,atom), Zunger and Cohen orbital radii of s and p orbitals (zrs and zrp)91, Waber and Cromer orbital radii of s and p orbitals (wrs and wrp)92, number of valence electrons (NVal), Miedema metal alloy formation parameters (η1/3and ϕ)93, and absolute electronegativity (AEN)94. The following data were not available in CRC Handbook of Chemistry and Physics and instead were taken from the provided references: IE2 of Ir95, EA of Cd, Mg, Mn, and Zn96, and ΔHf,ox,bulk of Au and Pt47. Following our previous notation scheme52, superscripts m, s, o, d, and a are used to indicate adsorbed metals, the parent metal in the oxide surface (which is Mg, Ca, Ba, or Zn in this study), oxygen, dopants, and adsorbates, respectively. The IE1, IE2, and EA of dopants include separate entries for the neutral (n) dopant (IE1dn, IE2dn, and EAdn) and for the dopant in its most stable oxidation state (IE1d, IE2d, and EAd). For adsorbate-modified systems, we also included ENP, ENMB, IE, EA, NVal, coordination number between adsorbates and the MgO(100) surface (CN), and bond dissociation energy (BD). We define BD as the binding energy between the atom in the adsorbate and the atom in the oxide surface to which the adsorbate binds (e.g., O–H for H*, Mg–O for OH* and NO2*, and Mg–F for F*, where * indicates a species adsorbed on the surface). The EN of adsorbates is taken as the EN of the atom in the adsorbate that is attached to the support. We also considered polarization inside the adsorbed molecules by including electronegativity differences (ΔENP and ΔENMB) between the different atoms in multi-atom molecules (e.g., |ENN – ENO| for NO2 and |ENH – ENO| for OH). All data described above are provided in the Supplementary Information.

Feature selection

We applied LASSO68, Horseshoe prior65, and Dirichlet–Laplace prior23 methods, implemented in R version 3.6.0 on Linux97 to identify correlations between the chemical properties of the system’s components and the enhancement in metal binding energy that results from surface modification. There are a total of 76 and 73 calculated binding energies for dopant-modified and adsorbate-modified MgO(100), respectively. A total of 52 data points are randomly selected from this set to build a training set used to develop the predictive model, while the remaining data points are isolated from all SL procedures to use as a validation set. The feature engineering procedures used to generate the feature space of candidate descriptors are summarized in Fig. 9. First, we expanded our feature space of candidate descriptors by applying a series of mathematical operators (described in detail in the Supplementary Information) on the set of primary descriptors described in the previous section. This procedure introduces secondary descriptors that can capture nonlinear correlations between the fundamental properties of each component in the system and the metal binding energy. All categorical descriptors (i.e., descriptors for which all elements in the feature vector fall in a common category; see Supplementary Information for detailed explanation and definition of categorical descriptors) in the secondary descriptor set were converted into numerical descriptors using dummy variables so that they can be treated with linear methods98. The numerical descriptors from the secondary descriptor set were then cross-multiplied to mix different properties, which builds an engineered feature space containing ~106 descriptors. We prescreened the feature space by ranking the Pearson coefficient of correlation to identify the top 1,000 descriptors. We found that this data preprocessing step stabilizes the subsequent analysis, as well as improves calculation speed. We formed the final ternary feature space by adding the transformed categorical descriptors back into the secondary feature space, where all descriptor sets were normalized. All feature engineering procedures and the resulting feature space subsets are provided in the Supplementary Information. The algorithms of Dirichlet–Laplace prior and Horseshoe prior are summarized in Supplementary Tables 16 and 17, respectively.

Fig. 9: Feature engineering and feature preselection procedures.
figure 9

Descriptor sets are indicated in orange boxes, feature engineering steps are indicated in green boxes, feature preselection steps are indicated in blue boxes, and the size of the feature space at each step is indicated in black text. There were seven and ten categorical descriptors for doped and adsorbate-modified MgO surfaces, respectively.

The FS process yields a reduced feature space with an order of ~10 descriptors. To build the final model, we systematically test the performance of all combinations of features in this reduced space, where the size of the model (i.e., number of features in the combination) is predetermined, and select the combination that yields the lowest RMSE. We then systematically decrease the number of features in the final model until we reach a model with only one descriptor. This strategy, called l0 norm regression, helps to systematically test the performance of the model with respect to the number of tunable parameters. Given the selected descriptors and the corresponding coefficients, we use the validation set to calculate the validation error to determine whether the predictive model is overfit, which occurs when the error of the validation set begins to rise as more descriptors are added to the model. We choose the final model size by selecting the model that yields the lowest error on the validation data set, thus ensuring that the model is not overfit. To test the FS processes, we applied the l0 norm regression approach to evaluate feature spaces selected from MgO data on data derived from CaO, BaO, or ZnO modified with the same dopants (Al, B, Li, and Na). Each of these sets contained 52 data points for each oxide.

Evaluation of FS methods using synthetic data

It is crucial to understand the properties of each SL method by evaluating its performance characteristics on simulated data sets, where the ground truth is known a priori. This was done by populating a simulated data set with 100 observations and a feature space of 1,000 candidate descriptors, out of which ten descriptors truly correlate with the response. We repeated the simulation 100 times, each generating a random data set following the same model. We evaluated each method by reporting the average number of true positives (i.e., descriptors that truly correlate with the training set and are selected by the method), true negatives (i.e., descriptors that do not correlate with the training set and are not selected by the method), false positives (i.e., descriptors that do not correlate with the training set but are selected by the method), and false negatives (i.e., descriptors that correlate with the training set but are not selected by the method). This gives us insight into the characteristics of each SL approach.

Figure 10a shows the simulation result for a candidate feature space that has an average feature correlation of ρ = 0.5 (i.e., a test case with a high rate of correlation among the candidate features). As seen in the figure, LASSO suffers from a high rate of false positives when candidate features are highly correlated. This issue is largely mitigated by the Bayesian methods, which motivates their use in this work. Although Horseshoe prior shows a low rate of false positive selection, it also misses many true descriptors evident from its high rate of false negatives. Simulation tests demonstrate that the high rate of false positives in LASSO and false negatives in Horseshoe is consistent across various correlation parameters (see Supplementary Information for simulations with ρ = 0, ρ = 0.5, and ρ = 0.9 in Supplementary Tables 18–20), which suggests that this behavior will also be present given the expected range of correlation in our feature space. This is verified here by showing the number of features selected by each method on our MgO data set (Fig. 10b). Horseshoe selects the fewest descriptors in both dopant-modified and adsorbate-modified data sets, which suggests that Horseshoe is the most aggressive FS methods. We note that LASSO does not always select the most descriptors, but its performance is mostly worse than Dirichlet–Laplace. The feature spaces selected based on the “real” MgO data are analyzed in the “Results” section.

Fig. 10: Behavior and performance of feature selection methods.
figure 10

a Simulated average number of descriptors selected by each statistical learning method over 100 trials with 100 observations, 1000 candidate descriptors, and a correlation of ρ = 0.5 between the descriptors. b The average number of descriptors selected by each method using 50 randomly separated training set of metal binding energies on dopant-modified or adsorbate-modified MgO surfaces. The error bars represent the range of one standard deviation.