Using statistical learning to predict interactions between single metal atoms and modified MgO(100) supports

Liu, Chun-Yen; Zhang, Shijia; Martinez, Daniel; Li, Meng; Senftle, Thomas P.

doi:10.1038/s41524-020-00371-x

Article
Open access
Published: 21 July 2020

npj Computational Materials volume 6, Article number: 102 (2020)

Abstract

Metal/oxide interactions mediated by charge transfer influence reactivity and stability in numerous heterogeneous catalysts. In this work, we use density functional theory (DFT) and statistical learning (SL) to derive models for predicting how the adsorption strength of metal atoms on MgO(100) surfaces can be enhanced by modifications of the support. MgO(100) in its pristine form is relatively unreactive, and thus is ideal for examining ways in which its electronic interactions with metals can be enhanced, tuned, and controlled. We find that the charge transfer characteristics of MgO are readily modified either by adsorbates on the surface (e.g., H, OH, F, and NO₂) or dopants in the oxide lattice (e.g., Li, Na, B, and Al). We use SL methods (i.e., LASSO, Horseshoe prior, and Dirichlet–Laplace prior) that are trained against DFT data to identify physical descriptors for predicting how the adsorption energy of metal atoms will change in response to support modification. These SL-derived feature selection tools are used to screen through more than one million candidate descriptors that are generated from simple chemical properties of the adsorbed metals, MgO, dopants, and adsorbates. Among the tested SL tools, we demonstrate that Dirichlet–Laplace prior predicts metal adsorption energies on MgO most accurately, while also identifying descriptors that are most transferable to chemically similar oxides, such as CaO, BaO, and ZnO.

Introduction

Transition metals (TMs) supported on oxide surfaces are ubiquitous heterogeneous catalysts in chemical processes that produce fuel and value-added chemicals, as well as in processes central to renewable energy and environmental protection technologies. Among supported TM catalysts, single-atom catalysts (SACs) have attracted attention due to their enhanced performance in many applications¹, including CO oxidation^{2,3,4,5,6,7,8,9,10}, water–gas shift^11,12,13, selective hydrogenation^14,15,16,17, dehydrogenation^18,19,20,21, and photocatalytic²² reactions. SACs with uniform TM dispersion expose every TM atom to the reaction environment, thus maximizing utilization of the expensive TM component. Key to achieving such uniform dispersion is the ability to control metal cluster sizes, which in turn is largely dictated by the TM’s strength of interaction with the support. Predicting the strength of metal/support interactions is a challenging task because there are multiple factors at play, such as the reducibility of the oxide, the electronegativity of the metal, and the structure of the interface. Herein, we use density functional theory (DFT) and statistical learning (SL) to identify physical descriptors that build a predictive model for describing metal atom binding on MgO(100) surfaces. Focus is placed on understanding and predicting how modifications of the unreactive MgO(100) surface, through the introduction of surface adsorbates or dopants, can enhance metal binding energies. We also focus on comparing the performance of various SL approaches, where we show that the physical descriptors identified with Dirichlet–Laplace prior²³ are strong descriptors for predicting metal binding on MgO(100). 
Furthermore, we show that descriptors identified with MgO training sets also can be used to describe TM binding on modified CaO(100), BaO(100), and ZnO(100) surfaces—demonstrating the transferability of SL-derived descriptors beyond the MgO system they were trained to describe, albeit for very closely related systems.

Metal/support interactions impact both the morphology of the catalyst surface and oxidation states at the active site, which both influence catalytic performance^24,25. The size of metal clusters on oxide supports is controlled by thermodynamic driving forces that usually cause small clusters to agglomerate (e.g., through Ostwald ripening)²⁶. Metal/support interactions alter the chemical potential of each metal atom in the cluster, and thus influence thermodynamic stability with respect to cluster size^27,28. These interactions also influence the kinetics of cluster formation, where an increase in the metal adsorption energy on the oxide surface generally reduces sintering rates leading to smaller particle sizes^29,30. Metal/support interactions affect not only particle size distributions, but also alter the electronic state of the adsorbed metal^31,32,33,34. This phenomenon, known as electronic metal–support interaction³⁵, is caused by charge transfer between the metal and the support, which can alter reaction rates by affecting how strongly intermediates adsorb at the active site.

Adsorbates from the reaction environment, or dopants present in the oxide structure, can enhance charge transfer at the metal/support interface^{31,36,37,38,39,40}. Addou et al.⁴¹ demonstrated that adsorbed hydrogen atoms on the rutile TiO₂(011)-(2 × 1) surface alter the binding energy of Pd atoms to the support, which stabilizes small Pd_n clusters (i.e., n ~ 1–3 atoms). Babucci et al.⁴² used Fourier-transform infrared spectroscopy to show that the electronic state of Ir(CO)₂ can be altered on oxide supports that are modified with different ligands. They demonstrated that Ir complexes exhibit different reactivity toward 1,3-butadiene hydrogenation if the surface modifications shift the Ir oxidation state at the active site. Kumar et al.⁴³ used Hammett reactivity studies, together with DFT, to show that the electronic state of Au during benzyl alcohol oxidation is influenced by electron donation from the supporting oxide. Similar effects have been observed on surfaces that are modified by introducing dopants in the lattice of the oxide. For example, Shao et al.⁴⁴ reported that the electronic structure of CaO surfaces can be controlled by introducing Mo dopants that replace Ca atoms in the surface lattice. The additional valence electrons supplied by Mo migrate toward adsorbed Au clusters, thus enhancing Au binding on the modified CaO surfaces. This effect was shown to be less pronounced over Cr-doped MgO surfaces^45,46, which demonstrates that the dopant effect varies between different types of dopants and oxide supports.

Given the impact that metal/support interactions can have on catalyst morphology and activity, it is clear that identifying physical descriptors for predicting interaction trends will be of high value. Toward this end, Campbell and Sellers⁴⁷ proposed that the adsorbed metal’s enthalpy of oxide formation computed relative to the isolated metal atom (ΔH_f,ox,atom) should be an effective descriptor for predicting metal adsorption energies on oxide surfaces; ΔH_f,ox,atom captures the bonding strength between the metal atom and oxygen, and it should therefore correlate with the metal’s binding strength to oxide surfaces. They demonstrated a linear correlation between ΔH_f,ox,atom and metal adhesion energies measured by adsorption calorimetry, thus deriving a simple model for predicting metal adsorption energies based on readily available reference data²⁷. Using both isothermal titration calorimetry and DFT, Strayer et al.³⁰ verified that ΔH_f,ox,atom is also an effective descriptor for predicting metal adsorption energies on HCa₂Nb₃O₁₀ perovskite surfaces. Although successful in these cases, identifying physical descriptors purely from chemical intuition is not a simple task. Complex charge transfer between the metals, the oxide supports, and surface modifiers (i.e., adsorbates and dopants) is challenging to describe with closed physical forms motivated from chemical intuition alone, as multiple phenomena are occurring simultaneously. As such, in this work we apply SL to identify useful physical descriptors that capture charge transfer at complex, multicomponent interfaces. We choose MgO for our initial study to demonstrate the effectiveness of the SL approach, since charge transfer characteristics can be anticipated from the non-reducible characteristics of MgO. 
Although we do not expect all surface modifications of MgO studied here to be experimentally feasible because dopants/adsorbates can generate charge-compensating defects that would counteract charge transfer to the adsorbed metal⁴⁵, the idealized nature of pristine MgO is a desirable testbed for evaluating the performance of various SL approaches and for identifying physical descriptors to predict how charge transfer affects metal adsorption in the idealized case. Once the strengths and weaknesses of the SL approaches are known, then we can pursue studies that use well-chosen SL approaches to understand more complex, but experimentally realizable, oxide systems containing defects.

Multiple feature selection (FS) methods have been introduced in the materials community to identify physical descriptors from extremely large sets of candidate descriptors. FS methods are preceded by gathering the chemical properties of the system’s components from various databases and using these properties to generate a large feature space of candidate descriptors. This step, referred to as feature engineering, is achieved by applying a series of mathematical operations on all descriptor pairs to enumerate millions (or even billions) of possible descriptor combinations and functional forms. FS methods are then applied to identify the features from this pool that are the strongest predictors of the property of interest (Fig. 1). There is a rich menu of FS methods in SL, ranging from traditional principal component analysis (PCA)⁴⁸ or kernel ridge regression (KRR)^49,50 to large scale methods, such as least absolute shrinkage and selection operator with l₀ norm regression (LASSO + l₀)^51,52 or sure independence screening and sparsifying operator (SISSO)^53,54. The success of these FS approaches in the materials community is demonstrated by many examples^55,56, including the work of Ghiringhelli et al.⁵¹, who used LASSO + l₀ regression to identify physical descriptors that can predict crystal structures based on material composition. Similarly, Andersen et al.⁵⁴ used SISSO to derive descriptor subsets that can predict molecular adsorption energies on metal alloys. As an alternative to these frequentist approaches, Bayesian FS has emerged in SL as a primary tool that provides a coherent and principled framework to quantify uncertainties^57,58,59,60. In addition to inheriting the advantage of Bayesian methods to conveniently incorporate domain knowledge via prior distributions, state-of-the-art Bayesian FS is able to adapt to unknown sparsity levels in the feature space^61,62 and to achieve automatic multiplicity correction^63,64. 
In this work, we will explore the performance of both LASSO-based FS methods and state-of-the-art Bayesian FS methods^23,65 for finding descriptors that can predict changes in metal binding energy caused by MgO surface modification.
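The feature-engineering step described above can be illustrated with a small Python sketch. It combines a few primary properties pairwise to enumerate candidate descriptors. The property names, the numerical values, and the particular set of operators are assumptions made for illustration only; the actual operator set and property list used to reach more than one million candidates are not reproduced here.

```python
from itertools import combinations


def engineer_features(primary):
    """Build candidate descriptors from pairs of primary properties.

    `primary` maps a property name to its value for one metal/support system.
    Returns a dict mapping a human-readable formula to its value.
    """
    features = dict(primary)  # keep the primary properties themselves
    for (name_a, a), (name_b, b) in combinations(primary.items(), 2):
        features[f"({name_a} + {name_b})"] = a + b
        features[f"({name_a} - {name_b})"] = a - b
        features[f"|{name_a} - {name_b}|"] = abs(a - b)
        features[f"({name_a} * {name_b})"] = a * b
        if b != 0:
            features[f"({name_a} / {name_b})"] = a / b
        if a != 0:
            features[f"({name_b} / {name_a})"] = b / a
    return features


# Placeholder numbers for illustration only (not reference data).
example = {"IE1_m": 9.0, "EA_m": 2.0, "EN_d": 1.5}
candidates = engineer_features(example)
print(len(candidates))  # 3 primary properties + 3 pairs x 6 operations = 21
```

Applying the same function again to its own output would grow the pool combinatorially, which is one plausible way such feature spaces reach millions of entries.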

**Fig. 1: Statistical learning workflow.**

In the present work, DFT binding energies of single metal atoms on modified MgO surfaces are used to train our SL models. Training and validation sets are built from data collected for MgO(100) surfaces that are modified by either adsorbates or dopants. MgO(100) is an irreducible oxide that, in its pristine form, binds TMs weakly compared to other supports⁵². It therefore is ideal for investigating the effects of support modifications, as changes in its binding interaction with the supported metal can be attributed solely to the effects of the surface modification. Indeed, MgO has been used as a template for many investigations of metal/support interactions, such as demonstrating how the oxidation state of gold varies upon surface modification^39,66 and how linear scaling relationships can be derived for a wide variety of adsorbates at Au/oxide interfaces⁶⁷. We modified the MgO substrate with surface dopants and adsorbates to induce electron-poor and electron-rich conditions, where we find that both lead to an enhancement of the metal binding energies due to an increase in charge transfer between the metal and the support, as expected from the irreducible nature of pristine MgO.

We identify descriptors for predicting the effects of surface modifications by using multiple FS methods: LASSO⁶⁸, Dirichlet–Laplace prior²³, and Horseshoe prior⁶⁵ (Fig. 1). The methods are applied to identify physical descriptors for predicting the enhancement of metal binding energies on modified MgO based on readily available chemical properties of the system’s components (i.e., the adsorbed metal, the MgO surface, and the adsorbate or dopant). The selected feature spaces are then refined with l₀ norm⁵¹ regression to build predictive models that describe single metal atom binding on MgO surfaces. We find that Dirichlet–Laplace prior shows outstanding FS properties, as it mitigates many disadvantages evident in the performance of LASSO and Horseshoe prior. We evaluate the transferability of the identified features of each method by applying them to predict changes in metal adsorption over CaO(100), BaO(100), and ZnO(100) surfaces. We find that features identified by Dirichlet–Laplace prior using the MgO training data are effective descriptors for predicting metal binding on these irreducible surfaces as well, which again demonstrates the robust nature of Dirichlet–Laplace prior because no CaO, BaO, and ZnO data were included in the FS procedure. It also demonstrates the transferability of SL-derived features within oxide families, which widens model applicability to systems that were not used explicitly to generate the training set.
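To make the LASSO selection step more concrete, the following is a minimal Python sketch of LASSO solved by cyclic coordinate descent on a toy data set. It is a generic textbook formulation, not the implementation used in this work; the function names, the penalty value `alpha`, and the toy data are all assumptions. The Horseshoe and Dirichlet–Laplace priors require posterior sampling and are not sketched here.

```python
def soft_threshold(rho, alpha):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_sweeps=100):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent.

    X is a list of rows (one per data point) and y a list of targets.
    Columns are assumed to be centred and scaled beforehand.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0
            z = 0.0
            for i in range(n):
                # Prediction that excludes the contribution of feature j.
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    return w


# Toy data (hypothetical): y depends only on the first of two orthogonal features.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]
weights = lasso_coordinate_descent(X, y, alpha=0.5)
selected = [j for j, w_j in enumerate(weights) if w_j != 0.0]
print(weights, selected)  # [1.5, 0.0] [0]
```

The irrelevant feature receives an exactly zero coefficient, which is the property that lets LASSO act as a feature selector. The surviving coefficient is shrunk from 2.0 to 1.5 by the penalty, which is one reason a subsequent refit of the selected features is useful.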

Results

Effects of dopants and adsorbates on metal adsorption energy

Our objective is to build a model that captures the relationship between metal adsorption energy and readily available chemical properties of the adsorbed metals, MgO, dopants, and adsorbates. A range of TMs (Ag, Au, Cd, Co, Cr, Cu, Fe, Ir, Mn, Mo, Nb, Ni, Pd, Pt, Rh, Ru, V, W, and Zn) were adsorbed on MgO(100) surfaces to generate our DFT training sets. The metal adsorption energy (ΔE_ads) is calculated using Eq. 1, where E_{Metal/Surface} is the total DFT energy of the metal atom adsorbed on the MgO(100) surface at its most stable site, E_Metal is the total DFT energy of the isolated metal atom, and E_Surface is the total DFT energy of the clean MgO(100) surface.

$${\mathrm{\Delta }}E_{\rm{ads}}\,=\,E_{\rm{Metal/Surface}}\,-\,E_{\rm{Metal}}\,-\,E_{\rm{Surface}}.$$

(1)
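As a small worked example of Eq. 1, the Python sketch below evaluates the adsorption energy from three total energies. The function name and the numerical inputs are hypothetical placeholders, not DFT results.

```python
def adsorption_energy(e_metal_on_surface, e_metal, e_surface):
    """Eq. 1: dE_ads = E_Metal/Surface - E_Metal - E_Surface (all in eV)."""
    return e_metal_on_surface - e_metal - e_surface


# Placeholder total energies in eV, for illustration only (not DFT results).
dE_ads = adsorption_energy(e_metal_on_surface=-101.5, e_metal=-0.5, e_surface=-100.0)
print(dE_ads)  # -1.0
```

With this sign convention, a more negative value corresponds to stronger binding of the metal atom to the surface.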

There are four unique adsorption sites on the MgO(100) surface: an atop site on O, an atop site on Mg, a bridge site between O and Mg, and a hollow site. Adsorption at all four sites was tested to find the most favorable adsorption site for each metal. The atop site above the oxygen anion was found to be the most stable adsorption site for every metal, in agreement with previous reports^52,69,70. A representative structure of Au adsorbed on MgO(100) is shown in Fig. 2; similar geometries were obtained for all other metals.

**Fig. 2: Metal atom adsorption on MgO surfaces.**

We modified the MgO(100) surface by introducing dopants and adsorbates to enhance the reducibility of the support, which we find generally increases metal adsorption strengths. We first considered H and OH adsorbates, which are commonly encountered species under catalytic reaction conditions. Adsorption of H and OH on the four unique surface sites was tested to identify the most stable site, which we found was the atop site on surface O for H and the surface hollow site for OH. The metal atoms were then placed on the optimized structures at the atop O site and then the entire system was optimized to calculate the metal binding energies. A representative structure of Au adsorbed on the OH-MgO(100) surface is shown in Fig. 2; other adsorbate-modified surfaces are shown in Supplementary Fig. 1. F and NO₂ were also introduced to expand the DFT training data, where F has a higher electron affinity (EA) (3.40 eV) than OH (1.83 eV) and thus is expected to yield a stronger metal binding enhancement compared to OH. NO₂ was included because it can readily act as a neutral, positive, or negative species based on the chemical environment, thus diversifying the training set. Each metal atom was placed as far from the adsorbate as possible within the cell to prevent direct interaction between the metal and the adsorbate. This isolates the adsorbate’s impact on the properties of the support from effects related to direct metal–adsorbate interactions. Ag, Nb, and Ru formed direct bonds to NO₂ during all optimization attempts, so their binding energies were excluded from the data set so that only surface-mediated effects were considered. Doping the oxide surface alters the electronic structure of the oxide in a manner similar to the adsorbates^{37,39,44,45,46}. Li, Na, B, and Al were chosen as dopants in this study, as they contain either one fewer or one additional valence electron compared to Mg. The dopants replaced one Mg atom in the first layer of the oxide (Fig. 2c, f).

All modifications of the MgO(100) surface enhanced charge transfer between the surface and the supported metal, as shown by the charge distributions in Fig. 3. When H is adsorbed on Au/MgO(100), one electron is transferred from H to the metal adatom (Fig. 3b). Conversely, OH is electron withdrawing, causing charge to be drawn away from Au (Fig. 3c). For surface dopants, Al has one more valence electron than Mg and donates charge to the adsorbed Au atom (Fig. 3d). Na reverses this trend, causing Au to donate electrons to balance the charge depletion in the surface (Fig. 3e). There is a clear similarity in the effects of surface dopants and adsorbates, and therefore we broadly classify surfaces with additional electrons compared to pristine MgO as electron-rich and surfaces with fewer electrons as electron-poor.

**Fig. 3: Electron density distribution of Au on MgO surfaces.**

Metal adatom oxide formation enthalpy (ΔH_f,ox,atom) is a widely used descriptor for predicting metal adsorption on different oxides^27,30,47,52. However, we found that this descriptor was not sufficient for predicting changes in adsorption energy on the modified MgO(100) surfaces shown in Fig. 3, as it only captures how easily the metal can be oxidized by donating its electrons to the surface and fails to represent situations where electrons are donated to the metal (i.e., as seen in Fig. 3b, d for Au on electron-rich surfaces). The enhancement in metal adsorption energy in such situations is largely controlled by the ability of the adsorbed metal to accept excess charge from the modified surface, where ΔH_f,ox,atom is no longer a suitable descriptor because it only captures the ability of the metal to donate electrons. Therefore, we propose that the ionization potential (i.e., IE₁) and EA can be used as broad descriptors for predicting the change in metal adsorption energy (Δ(ΔE_ads), as defined by Eq. 2) caused by bidirectional charge transfer.

$${\mathrm{\Delta }}\left( {{\mathrm{\Delta }}E_{\rm{ads}}} \right) = {\mathrm{\Delta }}E_{\rm{ads}}\left( {{\rm Modified}\,{\rm MgO}} \right) - {\mathrm{\Delta }}E_{\rm{ads}}({{\rm Clean}\,{\rm MgO}}).$$

(2)

These descriptors capture the atom’s propensity for both electron donation (via IE₁) and electron acceptance (via EA).
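Eq. 2 can be evaluated in the same spirit with a short Python sketch. The function name and the input values are hypothetical placeholders rather than values from this study.

```python
def binding_enhancement(dE_ads_modified, dE_ads_clean):
    """Eq. 2: change in adsorption energy caused by the surface modification (eV)."""
    return dE_ads_modified - dE_ads_clean


# Placeholder adsorption energies in eV (not values from the paper).
change = binding_enhancement(dE_ads_modified=-2.5, dE_ads_clean=-1.0)
print(change)  # -1.5
```

A negative Δ(ΔE_ads) indicates that the metal binds more strongly on the modified surface than on clean MgO.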

We plot the EA and IE₁ against Δ(ΔE_ads) for electron-rich and electron-poor surfaces in Fig. 4. The correlation is separated into two categories: a negative correlation for electron-rich surfaces (Fig. 4a) and a positive correlation for electron-poor surfaces (Fig. 4b). On the electron-rich surfaces, Au adsorption is enhanced significantly because Au has the highest EA and readily accepts the extra electronic charge. The adsorption of V is increased the most on the electron-poor surfaces because V has the lowest IE₁ and can readily donate charge to the surface. The direction of charge transfer in these systems is further confirmed by the charge density difference analyses for Au and Mn shown in Supplementary Fig. 2 and the corresponding Bader charge analysis in Supplementary Table 1. Moreover, the density of states (DOS) analyses for Au and Mn on doped MgO(100) are provided in Supplementary Fig. 3. The shift in the DOS at the Fermi level of the electron-poor MgO(100) surface clearly demonstrates a transfer of electrons to the surface from the adsorbed metal (and vice versa for the electron-rich surface).

**Fig. 4: Metal binding correlation to electron affinity and ionization potential.**

SL for identifying physical descriptors

Although useful for predicting broad trends, EA and IE₁ are limited in their ability to quantitatively describe metal binding on modified MgO. This is evident in the scatter seen in Fig. 4. Quantitative descriptors that can unravel the effects of multiple charge transfer phenomena simultaneously will likely require more complex functional forms than what is captured by the simple electronic properties. In this section, we identify such descriptors for MgO using FS via LASSO, Horseshoe prior, and Dirichlet–Laplace prior. The feature spaces are also tested for transferability to CaO(100), BaO(100), and ZnO(100) in the next section, where DFT data for the additional oxide surfaces were not included in any training data used for feature selection. We find that Dirichlet–Laplace prior yields the best models, both in terms of lowest error and highest transferability.

We show the performance of the selected descriptor sets in Figs 5 and 6 using 50 trials of randomly separated training/validation data sets for dopant- and adsorbate-modified MgO, respectively. We use the l₀ norm method to refine each model by systematically decreasing the number of descriptors in the model (i.e., the model dimension corresponds to the number of descriptors in the model, nD). The root mean square error (RMSE) is calculated by Eq. 3 to quantify the prediction accuracy, where y_i is the ith reference (DFT) value, $\widehat {y_i}$ is the corresponding value estimated by the model, and n is the number of values compared.

$${\rm{RMSE}}\,=\,\sqrt {\frac{{\mathop {\sum }\limits_{i = 1}^n \left( {y_i\,-\,\widehat {y_i}} \right)^2}}{n}}.$$

(3)
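The RMSE of Eq. 3 and the repeated random training/validation splits can be sketched in Python as follows. The helper names, the number of data points, and the 80/20 split fraction are assumptions chosen for illustration; the split ratio used in this work is not stated in this section.

```python
import math
import random


def rmse(y, y_hat):
    """Eq. 3: root mean square error between two equally long sequences."""
    n = len(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / n)


def random_split(n_points, train_fraction, rng):
    """Return (train_indices, validation_indices) for one random trial."""
    indices = list(range(n_points))
    rng.shuffle(indices)
    n_train = int(round(train_fraction * n_points))
    return indices[:n_train], indices[n_train:]


rng = random.Random(0)  # fixed seed so the trials are reproducible
# 50 trials, mirroring the number of random separations mentioned in the text.
splits = [random_split(n_points=20, train_fraction=0.8, rng=rng) for _ in range(50)]

# Only the third pair differs (by 2), so the result is sqrt(4 / 3).
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Each entry of `splits` holds disjoint index lists that could be used to slice a descriptor matrix and the corresponding Δ(ΔE_ads) values for one trial.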

**Fig. 5: Descriptor performance for doped MgO surfaces.**

**Fig. 6: Descriptor performance for adsorbate-modified MgO surfaces.**

The model with trained coefficients is then applied to the validation set, and the dimension that provides the lowest validation RMSE is used to determine the final predictive model. RMSE for the final models derived from 50 randomly chosen training set points is shown in Fig. 5a–c and Fig. 6a–c for dopant- and adsorbate-modified MgO, respectively. Although LASSO tends to select the most descriptors, which will be shown in the “Methods” section, the derived models are less accurate than the ones built from the Bayesian methods. In addition, the total descriptor set derived from Horseshoe prior contains fewer than five descriptors on average using either dopant-modified or adsorbate-modified MgO data, which is much smaller than the total descriptor sets derived by the other two methods. Although Horseshoe prior constructs more accurate models than LASSO, it is highly aggressive in its selection and generates a very small pool of candidate features that are available for refining the model. Dirichlet–Laplace prior exhibits the best performance, as it is a balance between the stringent selection tendencies of Horseshoe prior and the permissive selection tendencies of LASSO. Although Horseshoe usually performs similarly to Dirichlet–Laplace (Supplementary Fig. 4) at a given model dimension, the model cannot be further improved because we rapidly exhaust the pool of candidate descriptors. Conversely, Dirichlet–Laplace prior can be systematically improved by adding more descriptors until the validation error increases, thus allowing us to balance model accuracy with model simplicity. The predictive performance of these models is shown by the parity plots in Figs 5d–f and 6d–f, where each plot was generated with a representative training set that yields a training RMSE close to the median of each method (i.e., the same training set was used for all three methods to ensure comparability). 
The corresponding training and validation RMSE of these plots is labeled as an orange diamond in Figs 5a–c and 6a–c. The performance of the models is improved compared to our previous work on clean oxides using LASSO + l₀⁵², where in that work we achieved an RMSE of 0.41 eV on the training set using five descriptors to predict 92 data points (i.e., compared to errors of 0.26 and 0.25 eV on training sets with five descriptors for the dopant and adsorbate data in Supplementary Fig. 4, respectively).
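The refinement procedure described above (fit every descriptor subset of a given dimension, then choose the dimension by validation RMSE) can be sketched in plain Python. This is a simplified illustration under stated assumptions: the function names are hypothetical, ordinary least squares is solved through the normal equations, the selected columns are assumed not to be collinear, a small tolerance `tol` is added so that ties favor the lower dimension, and the toy data are not from this study.

```python
import math
from itertools import combinations


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def fit(X, y, cols):
    """Least-squares coefficients (intercept first) using only columns `cols`."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(cols) + 1
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Aty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear(AtA, Aty)


def predict(X, cols, coef):
    return [coef[0] + sum(c * row[j] for c, j in zip(coef[1:], cols)) for row in X]


def rmse(y, y_hat):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def l0_refine(X_train, y_train, X_val, y_val, max_dim, tol=1e-9):
    """For each dimension nD keep the subset with the lowest training RMSE,
    then return (validation RMSE, columns, coefficients) for the best nD."""
    best = None
    for n_d in range(1, max_dim + 1):
        fits = []
        for cols in combinations(range(len(X_train[0])), n_d):
            coef = fit(X_train, y_train, cols)
            fits.append((rmse(y_train, predict(X_train, cols, coef)), cols, coef))
        _, cols, coef = min(fits, key=lambda f: f[0])
        val_error = rmse(y_val, predict(X_val, cols, coef))
        # Accept a larger model only if it improves validation error beyond tol.
        if best is None or val_error < best[0] - tol:
            best = (val_error, cols, coef)
    return best


# Toy data (hypothetical): y = 1 + 2 * x0, while x1 and x2 are irrelevant.
X_train = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [2.0, 1.0, 1.0],
           [3.0, 0.0, 0.0], [4.0, 1.0, 0.0]]
y_train = [1.0, 3.0, 5.0, 7.0, 9.0]
X_val = [[5.0, 0.0, 1.0], [6.0, 1.0, 0.0], [7.0, 1.0, 1.0]]
y_val = [11.0, 13.0, 15.0]

val_error, cols, coef = l0_refine(X_train, y_train, X_val, y_val, max_dim=2)
print(cols, coef)
```

On this toy problem the one-descriptor model using column 0 is retained, because adding a second descriptor cannot lower the validation error further. The exhaustive subset search scales combinatorially, which is why a pre-selection step is needed before this kind of refinement.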

Tables 1 and 2 present the descriptors and relevant coefficients of the best models for dopant-modified and adsorbate-modified MgO surfaces that were shown in Figs 5f and 6f. The physical descriptors for doped MgO are composed of EN_P, EN_MB, IE₁, IE₂, EA, NVal, Z, and wr_p, and for adsorbate-modified MgO are composed of EN_P, EN_MB, IE₁, IE₂, and NVal. The formula of each descriptor is more complex than those identified in our previous study of clean oxides⁵², reflecting the increased complexity of the problem once surface modifiers are introduced. The importance of charge transfer phenomena is evident in the character of the descriptors. For instance, $\left| {\frac{{\rm{IE}}_1^{\rm{m}} - {\rm{IE}}_2^{\rm{s}}}{{\rm{IE}}_2^{\rm{m}} - {\rm{IE}}_2^{\rm{d}}}} \right|\,\times\,\left| {\frac{{\rm{IE}}_1^{\rm{m}} - {\rm{IE}}_2^{{\rm{dn}}}}{{\rm{IE}}_2^{\rm{m}} - {\rm{IE}}_1^{{\rm{dn}}}}} \right|$ in Table 1 plays a major role in predicting the enhanced binding of Au on Al-MgO compared to Na-MgO. The appearance of a term capturing the difference between the ionization energies of the metal and the dopant suggests that the favorability of charge transfer between these two components plays a major role in determining the overall metal binding energy. For adsorbate-modified surfaces, $\left( {\frac{{\rm{EN}}_{\rm{P}}^{\rm{m}} - {\rm{EN}}_{\rm{P}}^{\rm{s}}}{{\rm{EN}}_{\rm{P}}^{\rm{s}} - {\rm{EN}}_{\rm{P}}^{\rm{o}}}} \right)^2\,\times\,\left| {\frac{{\rm{IE}}_2^{\rm{m}} - {\rm{IE}}^{\rm{a}}}{{\rm{IE}}_1^{\rm{o}} - {\rm{IE}}^{\rm{a}}}} \right|$ is the most effective descriptor for Au adsorption. The factor $\left| {\frac{{\rm{IE}}_2^{\rm{m}}\,-\,{\rm{IE}}^{\rm{a}}}{{\rm{IE}}_1^{\rm{o}}\,-\,{\rm{IE}}^{\rm{a}}}} \right|$ quantifies the difference in the ability of the parent metal, oxygen, and the adsorbate to donate charge. 
Although some terms in these models may be interpreted, this is in general a difficult task given the fact that interrelated phenomena controlling charge transfer cannot be described by a simple physical framework⁷¹.

**Table 1: Descriptors, model coefficients, and responding values determined by Dirichlet–Laplace for dopant-modified MgO.**

**Table 2: Descriptors, model coefficients, and responding values determined by Dirichlet–Laplace for adsorbate-modified MgO.**

Robustness and transferability of the selected descriptors

The descriptors identified in the previous section were applied to predict Δ(ΔE_ads) on various surfaces to test feature transferability and robustness. The metals were placed on CaO(100), BaO(100), and ZnO(100) doped with the same dopants as used for MgO (Al, B, Li, and Na). We used the SL-derived descriptor sets from the MgO training data to predict the behavior on these modified surfaces, where we use l₀ norm regression to refine the model dimensionality (Fig. 7). Note that this entails refitting the coefficients in front of each descriptor, so here we are only testing the transferability of the descriptors and not the transferability of the entire model. The performance of features derived from 50 trials of randomly separated MgO training sets is used to verify feature transferability, as shown by the box plots in Fig. 7. Since Horseshoe typically selects fewer than five descriptors, we restrict all of the predictive models to have at most five descriptors for comparison in Fig. 7a–c, as indicated by a dashed line in Fig. 7d–f. Comparing RMSE in Fig. 7a–c, the transferability of selected feature spaces is ranked as CaO ≈ BaO > ZnO. MgO and CaO have many similar properties and are commonly compared in the literature^45,46,72, and therefore it is expected that descriptors that are important for MgO should also be applicable to CaO. Since Ba is in the same group as Mg and Ca in the periodic table, it is intuitive that BaO data also correlate with the MgO-derived features. We find that the features are less transferable in the case of ZnO, as expected given that Zn is in a different group. In the comparison between the SL methods, Horseshoe performs the worst. Although it can pick meaningful descriptors, it will only select the ones that are dominant for MgO and will aggressively throw out ones that, although important for other oxides, are less important for MgO. 
For a more detailed comparison, we summarize the difference in RMSE for LASSO and Dirichlet–Laplace (ΔRMSE) in Supplementary Fig. 5, which shows that LASSO has higher average prediction errors than Dirichlet–Laplace when comparing models derived with the same training set and the same model dimension. Thus, Dirichlet–Laplace shows the highest transferability among the SL methods. In summary, we find that descriptors identified using Dirichlet–Laplace prior for one oxide are applicable to other related oxides, demonstrating the robust nature of the selected features and their transferability outside the training set used for feature selection. We note that the RMSE for all of the testing oxides with the same feature space is generally similar to the MgO models.

Stability of modified MgO(100) surfaces

In the previous sections we evaluated the performance of various SL approaches for predicting the behavior of idealized MgO surfaces, where it is assumed that the dopants or adsorbates do not induce accompanying defects to compensate the additional charge and therefore charge transfer to the adsorbed metal atom was forced to occur. In real systems there inevitably will be a tendency for charge-compensating defects to compete for this charge transfer, which will be examined in this section. Numerous experiments demonstrate that the presence of the adsorbed metal can in certain cases suppress the tendency to form charge-compensating defects in these irreducible oxides^73,74,75. Shao et al.⁴⁴ showed that the charge transfer from Mo dopants in CaO enhances Au binding strength because excess electrons from Mo are transferred to the electronegative Au atoms. This effect was also present, but less pronounced, on Cr-doped MgO(100), where Stavale et al.⁴⁵ found that the enhancement of Au binding was inhibited by charge-compensating Mg vacancies. Beyond MgO, Tran et al.⁷⁶ introduced Fe dopants into ZnO to modulate the Pt/ZnO interaction. The X-ray photoelectron spectroscopy analysis of Pt 4f core levels revealed a change in the oxidation state of Pt, which is direct evidence of the charge transfer between Fe and Pt. The enhanced metal–support interaction in this system stabilizes Pt nanoparticles and suppresses metal sintering during CO oxidation. For instance, the average Pt particle size increased from 2.2 to 5.7 nm on the bare ZnO during CO oxidation but only increased from 2.4 to 3.2 nm on the Fe-doped ZnO. Furthermore, the turnover frequency was raised from 0.60 to 5.37 s⁻¹ with the assistance of the Fe dopants. Thus, the predictive model developed in this work is relevant for screening system compositions where one can expect significant charge transfer to induce an enhancement of the metal binding energy. 
Once identified, the stability of the required surface modification can be assessed, by computing free energies of formation as demonstrated below, to determine if the candidate system is a viable target for experimental synthesis attempts.

Here we compute phase diagrams for the formation of charge-compensating O or Mg vacancy defects in electron-poor (Na-doped and OH-modified) and electron-rich (Al-doped and H-modified) surfaces, respectively (Fig. 8). We expanded the simulation cell to twice the size of the original unit cell to accommodate multiple dopants in the surface (Supplementary Fig. 6). The O and Mg vacancy formation free energies are calculated with Eqs. 4 and 5, respectively, where ΔG_O-vac is the O vacancy formation free energy, ΔG_Mg-vac is the Mg vacancy formation free energy, E_O-vac is the total DFT energy of the modified surface with one O vacancy, E_Mg-vac is the total DFT energy of the modified surface with one Mg vacancy, E_2MgO is the difference in total DFT energy between the geometry with two MgO units on the surface (Supplementary Fig. 7) and the pristine MgO(100) surface, E_surface is the total DFT energy of the modified surface without any charge-compensating defect, T is the temperature, and P_O2 is the partial pressure of O₂. µ_O2 is the chemical potential of an O₂ molecule in the gas phase, computed from the total DFT energy of the O₂ molecule with enthalpy and entropy corrections taken from the NIST webbook⁷⁷. We used the energy of a small supported MgO cluster with two MgO formula units as the reference for Mg vacancy formation in Eq. 5 because the formation of this small cluster will be the first nucleation step once Mg atoms leave the MgO lattice. The energy of bulk MgO can also be used as the reference, which systematically shifts all Mg vacancy formation energies to be more negative by 2.59 eV. This does not affect the relative difference in vacancy formation energy when comparing the clean surfaces to the surfaces with adsorbed metal atoms.

$$\Delta G_{\mathrm{O\text{-}vac}}\left(T, P_{\mathrm{O_2}}\right) = E_{\mathrm{O\text{-}vac}} + \frac{1}{2}\mu_{\mathrm{O_2}}\left(T, P_{\mathrm{O_2}}\right) - E_{\mathrm{surface}}, \tag{4}$$

$$\Delta G_{\mathrm{Mg\text{-}vac}}\left(T, P_{\mathrm{O_2}}\right) = E_{\mathrm{Mg\text{-}vac}} + \frac{1}{2}E_{\mathrm{2MgO}} - \frac{1}{2}\mu_{\mathrm{O_2}}\left(T, P_{\mathrm{O_2}}\right) - E_{\mathrm{surface}}. \tag{5}$$

**Fig. 8: Phase diagram of charge-compensating defect formation.**

We use the above equations to compute the boundary in (P_O2, T) space where the defect formation free energy is zero, which is used to generate the phase diagrams shown in Fig. 8. In each case, we compute the defect formation boundary on the clean surface without any adsorbed metal, and then we compare it to the boundary on surfaces with the adsorbed metals. We find that the metal atom on modified MgO generally inhibits defect formation because the metal atom acts as an electron source or sink to alleviate the excess surface charge caused by the dopant or adsorbate. The extent of the boundary shift correlates with IE₁ and EA for electron-poor and electron-rich surfaces, respectively, which indicates that it is controlled by the ability of the adsorbed metal to donate or accept excess charge (Supplementary Fig. 8). A comparison of the vacancy formation energies of the modified MgO(100) surfaces (the dashed lines in Supplementary Fig. 8) with those of the pristine surface (Supplementary Table 2) shows that surface modification will lead to compensating defects, as discussed above. However, the favorability of compensating defect formation is suppressed by the adsorbed metals. The reduction of the area of the phase diagram in which a compensating defect would occur, as seen in Fig. 8, suggests that it may be thermodynamically feasible to prepare some of the adsorbed metal/modified MgO surfaces considered in this work if the surface modification and metal adsorbate are introduced simultaneously, before charge-compensating defects can form.
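The boundary construction described above can be sketched numerically. The snippet below is only an illustration under stated assumptions: it treats O₂ as an ideal gas, so that µ_O2(T, P) = E_O2 + Δµ°(T) + k_B T ln(P/P°), where Δµ°(T) stands for the tabulated enthalpy and entropy correction at the reference pressure P°. All function names and all numerical inputs are hypothetical placeholders, not values from this work.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def mu_o2(e_o2, dmu0_t, temperature, p_o2, p_ref=1.0):
    """Chemical potential of gas-phase O2 (eV), ideal-gas assumption.

    e_o2    : DFT total energy of O2 (eV)
    dmu0_t  : tabulated enthalpy/entropy correction at T and p_ref (eV)
    p_o2    : O2 partial pressure, in the same units as p_ref
    """
    return e_o2 + dmu0_t + K_B * temperature * math.log(p_o2 / p_ref)


def dg_o_vac(e_o_vac, e_surface, mu):
    """Eq. 4: O vacancy formation free energy."""
    return e_o_vac + 0.5 * mu - e_surface


def dg_mg_vac(e_mg_vac, e_2mgo, e_surface, mu):
    """Eq. 5: Mg vacancy formation free energy."""
    return e_mg_vac + 0.5 * e_2mgo - 0.5 * mu - e_surface


def boundary_pressure(mu_at_boundary, e_o2, dmu0_t, temperature, p_ref=1.0):
    """Invert mu_o2 for the pressure that yields mu_at_boundary."""
    exponent = (mu_at_boundary - e_o2 - dmu0_t) / (K_B * temperature)
    return p_ref * math.exp(exponent)


def o_vac_boundary(e_o_vac, e_surface, e_o2, dmu0_t, temperature):
    # Setting Eq. 4 to zero gives mu = -2 (E_O-vac - E_surface).
    mu_zero = -2.0 * (e_o_vac - e_surface)
    return boundary_pressure(mu_zero, e_o2, dmu0_t, temperature)


def mg_vac_boundary(e_mg_vac, e_2mgo, e_surface, e_o2, dmu0_t, temperature):
    # Setting Eq. 5 to zero gives mu = 2 (E_Mg-vac + E_2MgO / 2 - E_surface).
    mu_zero = 2.0 * (e_mg_vac + 0.5 * e_2mgo - e_surface)
    return boundary_pressure(mu_zero, e_o2, dmu0_t, temperature)


# Placeholder inputs only (not values from this work).
E_SURF, E_OVAC, E_O2, DMU0, T = -500.0, -494.0, -9.86, -1.0, 1000.0
p_star = o_vac_boundary(E_OVAC, E_SURF, E_O2, DMU0, T)
residual = dg_o_vac(E_OVAC, E_SURF, mu_o2(E_O2, DMU0, T, p_star))
```

Repeating the boundary calculation over a range of temperatures, once for the clean modified surface and once with the adsorbed metal present, would trace the two curves that are compared in each panel of a diagram like Fig. 8.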

Discussion

We have shown that SL methods can be used to construct predictive models for computing Δ(ΔE_ads) in oxide-supported SAC systems. The SL methods yield useful relationships between the fundamental chemical properties of dopants/adsorbates and overall metal/support interactions, which are often used to control surface morphology but in many cases are not well understood^{31,36,37,38,41,44,45,46}. The resulting models provide an estimation tool for quantifying how metal binding energy can be tuned through surface modification. This in turn can provide guidelines for designing and controlling the particle size of the TMs (e.g., by finding modifications that stabilize single adsorbed metal atoms relative to the bulk metal). Indeed, we predict that Mn may form stable SACs on Li-doped and Na-doped ZnO surfaces because the formation energy of single Mn atoms on these surfaces is negative relative to the formation of bulk Mn (Supplementary Section 1). We also introduced two state-of-the-art Bayesian methods that have several advantages over other common FS methods (e.g., PCA, KRR, LASSO, and SISSO), such as flexible prior distributions, automatic error estimation on selected features, adaptivity to arbitrary sparsity in the feature space, and the ability to use the full data set for inference without requiring cross validation.

The robustness of our methodology was verified by applying the final predictive models to validation data sets, as well as by applying the selected features to CaO, BaO, and ZnO surfaces. The transferability of descriptors reveals the power of the Dirichlet–Laplace prior method, since it can use training data on one particular oxide to identify features that are applicable to similar oxides. We are currently exploring combined FS approaches based on training data derived from multiple oxides to build models that are fully general within oxide families. This will include extensions to more reducible oxides, where it is expected that the parent metal of the oxide will play a more direct role as a center of charge transfer. For example, our previous study⁵² showed that metal–metal and redox interactions can occur on highly reducible oxides, such as CeO₂. The parent metals in the support are expected to interact with dopants and adsorbates as well, which will further complicate the nature of charge transfer. Finding suitable descriptors from chemical intuition alone will be nearly impossible for such systems, but likely will be feasible using the SL tools developed in this study.

Transition metal adsorption on MgO is enhanced by surface doping and adsorbate modification, and this enhancement can be predicted using physical descriptors identified by SL (namely, the LASSO, Horseshoe prior, and Dirichlet–Laplace prior methods). In particular, MgO can be transformed by dopants or adsorbates to exhibit either electron-rich or electron-poor characteristics. The extent of charge transfer between metal and modified MgO surfaces correlates with the ionization potential and the EA of the metal, but scatter in these correlations precludes quantitative prediction. Therefore, we used SL to derive predictive models that go beyond these simple descriptors, which were built from the fundamental chemical properties of the adsorbed metals, Mg, O, dopants, and adsorbates. Multiple FS tools, including state-of-the-art Bayesian methods, were applied together with l₀ norm regression to derive simple and effective models for predicting metal adsorption on modified MgO surfaces. Of the tested SL methods, we found that the Dirichlet–Laplace prior yields the most accurate and transferable models. The descriptors identified for MgO(100) by the Dirichlet–Laplace prior were also shown to be effective for estimating metal binding on CaO(100), BaO(100), and ZnO(100) surfaces with comparable accuracy, which demonstrates the robust nature of this FS approach and the transferability of our models to related oxides.

Methods

DFT calculations

Metal adsorption energies were calculated with DFT using the Vienna ab initio simulation package (VASP 5.4.4)⁷⁸. The Perdew–Burke–Ernzerhof exchange-correlation functional⁷⁹ was applied with spin polarization, and the projector augmented-wave method⁸⁰ was used to treat core electrons with VASP default potentials⁸¹. The valence electrons treated self-consistently for each atom type are listed in Supplementary Table 3. Plane-wave basis sets were expanded to a kinetic energy cutoff of 600 eV. Gaussian smearing was employed with a smearing width of 0.05 eV. A Monkhorst–Pack (MP)⁸² k-point mesh was used with 4 × 4 × 4 sampling on the bulk MgO structure and 3 × 3 × 1 sampling on the (2 × 2) MgO(100) surface models. The Grimme empirical dispersion correction was used to treat van der Waals dispersion⁸³. Geometries were optimized to a force convergence criterion of 0.05 eV Å⁻¹. Metal binding energies computed with this criterion were found to be nearly identical to those computed with a tighter criterion of 0.01 eV Å⁻¹ (Supplementary Table 4).

Surface models contained four layers of MgO(100) with all layers relaxed to avoid spurious surface dipoles that can arise from frozen layers in oxide surfaces⁸⁴. The vacuum distance between slabs in the direction perpendicular to the surface was at least 15 Å to avoid interactions between MgO slabs, and a dipole correction⁸⁵ was applied perpendicular to the MgO(100) surface. Calculations on CaO(100), BaO(100), and ZnO(100) surfaces followed the same settings as those applied for the MgO(100) surfaces, except that the MP k-point sampling was 5 × 5 × 1 for ZnO(100). The ground state energy of single metal atoms was computed in a 15 × 16 × 17 Å³ unit cell and served as our reference for computing binding energies. The magnetization states and energies of the isolated metal atoms are listed in Supplementary Table 5. Multiple magnetization states were tested when considering metal adsorption on the oxide surfaces, examining all probable spin configurations as listed in Supplementary Table 6, and the ground state with the lowest energy was selected for use in the adsorption energy training set. Metal binding energies on MgO(100), CaO(100), BaO(100), and ZnO(100) are listed in Supplementary Tables 7–10, and the corresponding system magnetizations are reported in Supplementary Tables 11–14. The Bader method⁸⁶,⁸⁷ was used to calculate partial charges on atoms, which were only used for qualitatively assessing the direction of charge transfer (Supplementary Table 1). The defect formation energy of the pristine MgO(100) surface is listed in Supplementary Table 2. Coordinates of all energy-minimized structures are provided in the Supplementary Information.

Statistical learning

Our previous study⁵² applied LASSO + l₀⁵¹ to derive descriptors that can predict metal adsorption energies across a wide range of metal/oxide pairs. While successful in that work, in the present study we found that LASSO + l₀ was limited in its ability to handle correlated features when we introduced properties of the surface modifiers in the feature space (vide infra). As such, here we apply two additional state-of-the-art Bayesian methods for feature selection, Horseshoe prior⁶⁵ and Dirichlet–Laplace prior²³, that are implemented alongside the LASSO⁶⁸ approach for comparison. Bayesian methods are particularly well-suited for FS because they allow users to incorporate domain knowledge when applicable, they offer uncertainty quantification, and, most notably, they provide a coherent framework to infer model probabilities. Modern Bayesian FS methods can adapt to unknown sparsity levels in the feature space and perform well when handling correlated features⁸⁸. Bayesian methods operate by first specifying an initial distribution of values for each descriptor coefficient (i.e., a prior distribution) and then updating this distribution based on available training data through Bayes’ theorem (i.e., solving for the posterior distribution). These methods offer flexibility through the choice of the prior distribution, as different prior distributions will yield different FS characteristics. The DFT training data for FS in this work consisted of metal atom binding energies on MgO surfaces that were modified by either dopants or adsorbates, where FS was separate for “dopant” and “adsorbate” data sets. Details regarding the implementation of each FS method are provided in the Supplementary Information.
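For orientation, the prior-to-posterior update described above can be written compactly. The notation is an assumption of this sketch (β for the vector of descriptor coefficients, y for the DFT training data, p for the number of candidate descriptors, λ_j and φ_j for local scale parameters, τ for a global scale, and a for the Dirichlet concentration). The hierarchical forms shown are the standard ones given in the cited Horseshoe⁶⁵ and Dirichlet–Laplace²³ papers, where C⁺ denotes a half-Cauchy distribution and DE a double-exponential (Laplace) distribution; the exact parameterization used in this work is the one documented in the Supplementary Information.

$$p(\boldsymbol{\beta} \mid \mathbf{y}) \propto p(\mathbf{y} \mid \boldsymbol{\beta})\, p(\boldsymbol{\beta})$$

$$\text{Horseshoe:}\quad \beta_j \mid \lambda_j, \tau \sim \mathcal{N}\left(0, \lambda_j^2 \tau^2\right), \qquad \lambda_j \sim \mathrm{C}^{+}(0, 1)$$

$$\text{Dirichlet--Laplace:}\quad \beta_j \mid \phi_j, \tau \sim \mathrm{DE}\left(\phi_j \tau\right), \qquad (\phi_1, \ldots, \phi_p) \sim \mathrm{Dir}(a, \ldots, a)$$

In both cases the local scales let individual coefficients escape shrinkage while the global scale pulls the remaining coefficients toward zero, which is what makes these priors usable as FS tools.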

Primary descriptors

Our feature space was built from a primary descriptor set containing chemical properties of the adsorbed metals, the parent atoms in the oxide surface (i.e., Ba, Ca, Mg, O, and Zn), and the dopants, taken from the CRC Handbook of Chemistry and Physics⁸⁹ where available. These atomic properties include the atomic number (Z), electronegativity in Pauling and Martynov–Batsanov⁹⁰ scales (EN_P and EN_MB), first and second ionization energy (IE₁ and IE₂), EA, standard sublimation enthalpy (ΔH_sub), standard molar enthalpy of oxide formation of the metal adatoms (ΔH_f,ox,bulk), standard molar enthalpy of formation of the adsorbed metal’s most stable oxide referenced to the isolated metal atom (ΔH_f,ox,atom), Zunger and Cohen orbital radii of s and p orbitals (zr_s and zr_p)⁹¹, Waber and Cromer orbital radii of s and p orbitals (wr_s and wr_p)⁹², number of valence electrons (NVal), Miedema metal alloy formation parameters (η^1/3 and ϕ)⁹³, and absolute electronegativity (AEN)⁹⁴. The following data were not available in the CRC Handbook of Chemistry and Physics and instead were taken from the provided references: IE₂ of Ir⁹⁵, EA of Cd, Mg, Mn, and Zn⁹⁶, and ΔH_f,ox,bulk of Au and Pt⁴⁷. Following our previous notation scheme⁵², superscripts m, s, o, d, and a are used to indicate adsorbed metals, the parent metal in the oxide surface (which is Mg, Ca, Ba, or Zn in this study), oxygen, dopants, and adsorbates, respectively. The IE₁, IE₂, and EA of dopants include separate entries for the neutral (n) dopant (IE₁^dn, IE₂^dn, and EA^dn) and for the dopant in its most stable oxidation state (IE₁^d, IE₂^d, and EA^d). For adsorbate-modified systems, we also included EN_P, EN_MB, IE, EA, NVal, coordination number between adsorbates and the MgO(100) surface (CN), and bond dissociation energy (BD). We define BD as the binding energy between the atom in the adsorbate and the atom in the oxide surface to which the adsorbate binds (e.g., O–H for H*, Mg–O for OH* and NO₂*, and Mg–F for F*, where * indicates a species adsorbed on the surface). The EN of adsorbates is taken as the EN of the atom in the adsorbate that is attached to the support. We also considered polarization inside the adsorbed molecules by including electronegativity differences (ΔEN_P and ΔEN_MB) between the different atoms in multi-atom molecules (e.g., |EN_N – EN_O| for NO₂ and |EN_H – EN_O| for OH). All data described above are provided in the Supplementary Information.

Feature selection

We applied LASSO⁶⁸, Horseshoe prior⁶⁵, and Dirichlet–Laplace prior²³ methods, implemented in R version 3.6.0 on Linux⁹⁷, to identify correlations between the chemical properties of the system’s components and the enhancement in metal binding energy that results from surface modification. There are a total of 76 and 73 calculated binding energies for dopant-modified and adsorbate-modified MgO(100), respectively. A total of 52 data points were randomly selected from this set to build a training set used to develop the predictive model, while the remaining data points were isolated from all SL procedures for use as a validation set. The feature engineering procedures used to generate the feature space of candidate descriptors are summarized in Fig. 9. First, we expanded our feature space of candidate descriptors by applying a series of mathematical operators (described in detail in the Supplementary Information) to the set of primary descriptors described in the previous section. This procedure introduces secondary descriptors that can capture nonlinear correlations between the fundamental properties of each component in the system and the metal binding energy. All categorical descriptors (i.e., descriptors for which all elements in the feature vector fall in a common category; see Supplementary Information for detailed explanation and definition of categorical descriptors) in the secondary descriptor set were converted into numerical descriptors using dummy variables so that they can be treated with linear methods⁹⁸. The numerical descriptors from the secondary descriptor set were then cross-multiplied to mix different properties, which builds an engineered feature space containing ~10⁶ descriptors. We prescreened the feature space by ranking the descriptors by their Pearson correlation coefficient to identify the top 1,000 descriptors. We found that this data preprocessing step stabilizes the subsequent analysis and improves calculation speed. We formed the final ternary feature space by adding the transformed categorical descriptors back into the secondary feature space, where all descriptor sets were normalized. All feature engineering procedures and the resulting feature space subsets are provided in the Supplementary Information. The algorithms of the Dirichlet–Laplace prior and Horseshoe prior are summarized in Supplementary Tables 16 and 17, respectively.
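The cross-multiplication and Pearson prescreening steps can be illustrated with a short NumPy sketch. This is not the R implementation used in this work. The function names, the use of simple pairwise products, ranking by the absolute value of the correlation coefficient, z-score normalization, and the random toy data are all assumptions made for illustration.

```python
import numpy as np


def cross_multiply(features, names):
    """Pairwise products of feature columns (one way to mix properties)."""
    n_feat = features.shape[1]
    cols, labels = [], []
    for i in range(n_feat):
        for j in range(i + 1, n_feat):
            cols.append(features[:, i] * features[:, j])
            labels.append(f"{names[i]}*{names[j]}")
    return np.column_stack(cols), labels


def pearson_prescreen(features, target, top_k=1000):
    """Return indices of the top_k columns ranked by |Pearson r| with target."""
    x_centered = features - features.mean(axis=0)
    y_centered = target - target.mean()
    denom = np.linalg.norm(x_centered, axis=0) * np.linalg.norm(y_centered)
    # Constant columns have zero variance; assign r = 0 instead of NaN.
    safe = np.where(denom > 0, denom, 1.0)
    r = np.where(denom > 0, (x_centered.T @ y_centered) / safe, 0.0)
    order = np.argsort(-np.abs(r))
    return order[:top_k], r


def normalize(features):
    """Z-score each column (the normalization scheme is an assumption)."""
    std = features.std(axis=0)
    return (features - features.mean(axis=0)) / np.where(std > 0, std, 1.0)


# Toy data: 52 training points, 6 hypothetical primary descriptors.
rng = np.random.default_rng(0)
primary = rng.normal(size=(52, 6))
names = [f"f{i}" for i in range(6)]
target = 3.0 * primary[:, 0] * primary[:, 1] + 0.1 * rng.normal(size=52)

mixed, labels = cross_multiply(primary, names)
keep, r = pearson_prescreen(mixed, target, top_k=5)
screened = normalize(mixed[:, keep])
```

In this toy case the product descriptor that generated the target ranks first, which is the behavior the prescreening step relies on before the more expensive FS methods are applied.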

**Fig. 9: Feature engineering and feature preselection procedures.**

The FS process yields a reduced feature space containing on the order of 10 descriptors. To build the final model, we systematically test the performance of all combinations of features in this reduced space, where the size of the model (i.e., the number of features in the combination) is predetermined, and select the combination that yields the lowest RMSE. We then systematically decrease the number of features in the final model until we reach a model with only one descriptor. This strategy, called l₀ norm regression, allows us to test the performance of the model with respect to the number of tunable parameters. Given the selected descriptors and the corresponding coefficients, we use the validation set to calculate the validation error and determine whether the predictive model is overfit, which occurs when the error of the validation set begins to rise as more descriptors are added to the model. We choose the final model size by selecting the model that yields the lowest error on the validation data set, thus ensuring that the model is not overfit. To test the FS processes, we applied the l₀ norm regression approach to evaluate feature spaces selected from MgO data on data derived from CaO, BaO, or ZnO modified with the same dopants (Al, B, Li, and Na). The data set for each of these oxides contained 52 data points.
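The exhaustive search described above can be sketched as follows. This is a minimal illustration, not the code used in this work: it assumes ordinary least squares with an intercept, and the function names and the synthetic data are hypothetical.

```python
from itertools import combinations

import numpy as np


def fit_rmse(x_train, y_train, x_eval, y_eval):
    """Least-squares fit (with intercept) on train; RMSE evaluated on eval."""
    a_train = np.column_stack([np.ones(len(x_train)), x_train])
    coef, *_ = np.linalg.lstsq(a_train, y_train, rcond=None)
    a_eval = np.column_stack([np.ones(len(x_eval)), x_eval])
    resid = a_eval @ coef - y_eval
    return float(np.sqrt(np.mean(resid**2))), coef


def l0_best_subset(x_train, y_train, size):
    """Test every feature combination of a fixed size; keep the lowest RMSE."""
    best = (np.inf, None)
    for combo in combinations(range(x_train.shape[1]), size):
        cols = list(combo)
        rmse, _ = fit_rmse(x_train[:, cols], y_train, x_train[:, cols], y_train)
        if rmse < best[0]:
            best = (rmse, combo)
    return best


def choose_model_size(x_train, y_train, x_val, y_val, max_size):
    """Shrink the model size step by step; pick the lowest validation RMSE."""
    results = []
    for size in range(max_size, 0, -1):
        train_rmse, combo = l0_best_subset(x_train, y_train, size)
        cols = list(combo)
        val_rmse, _ = fit_rmse(x_train[:, cols], y_train, x_val[:, cols], y_val)
        results.append((size, combo, train_rmse, val_rmse))
    return min(results, key=lambda row: row[3]), results


# Synthetic stand-in for a reduced feature space (8 candidates, 76 points).
rng = np.random.default_rng(1)
x_all = rng.normal(size=(76, 8))
y_all = 2.0 * x_all[:, 1] - 1.5 * x_all[:, 4] + 0.05 * rng.normal(size=76)
x_tr, y_tr, x_va, y_va = x_all[:52], y_all[:52], x_all[52:], y_all[52:]

best, table = choose_model_size(x_tr, y_tr, x_va, y_va, max_size=4)
```

Because the number of combinations grows combinatorially with the size of the candidate pool, this brute-force step is only practical after FS has reduced the feature space to roughly ten descriptors.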

Evaluation of FS methods using synthetic data

It is crucial to understand the properties of each SL method by evaluating its performance characteristics on simulated data sets, where the ground truth is known a priori. This was done by populating a simulated data set with 100 observations and a feature space of 1,000 candidate descriptors, out of which ten descriptors truly correlate with the response. We repeated the simulation 100 times, each generating a random data set following the same model. We evaluated each method by reporting the average number of true positives (i.e., descriptors that truly correlate with the training set and are selected by the method), true negatives (i.e., descriptors that do not correlate with the training set and are not selected by the method), false positives (i.e., descriptors that do not correlate with the training set but are selected by the method), and false negatives (i.e., descriptors that correlate with the training set but are not selected by the method). This gives us insight into the characteristics of each SL approach.
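The bookkeeping behind such a simulation study can be sketched as below. The data generator is an assumption of this illustration: it uses an equicorrelated design (every pair of features has correlation ρ), a fixed coefficient magnitude for the true descriptors, and Gaussian noise, which may differ from the simulation model documented in the Supplementary Information. The function names are hypothetical, and the FS method under test would supply the `selected` indices.

```python
import numpy as np


def simulate(n_obs=100, n_feat=1000, n_true=10, rho=0.5, noise=1.0, seed=0):
    """Generate one synthetic data set with a known set of true descriptors."""
    rng = np.random.default_rng(seed)
    shared = rng.normal(size=(n_obs, 1))
    own = rng.normal(size=(n_obs, n_feat))
    # Equicorrelated design: corr(x_i, x_j) = rho for i != j.
    x = np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * own
    true_idx = rng.choice(n_feat, size=n_true, replace=False)
    beta = np.zeros(n_feat)
    beta[true_idx] = 3.0 * rng.choice([-1.0, 1.0], size=n_true)
    y = x @ beta + noise * rng.normal(size=n_obs)
    return x, y, set(true_idx.tolist())


def selection_counts(selected, true_set, n_feat):
    """Count true/false positives and negatives for one selection result."""
    selected = set(selected)
    tp = len(selected & true_set)
    fp = len(selected - true_set)
    fn = len(true_set - selected)
    tn = n_feat - tp - fp - fn
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


x_sim, y_sim, truth = simulate()
# Hand-made example of the counting step on a 10-feature toy case.
toy_counts = selection_counts(selected=[0, 1, 2, 3], true_set={2, 3, 4}, n_feat=10)
```

Averaging these four counts over repeated calls to `simulate` with different seeds gives the per-method summary that is reported for each value of ρ.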

Figure 10a shows the simulation result for a candidate feature space that has an average feature correlation of ρ = 0.5 (i.e., a test case with a high rate of correlation among the candidate features). As seen in the figure, LASSO suffers from a high rate of false positives when candidate features are highly correlated. This issue is largely mitigated by the Bayesian methods, which motivates their use in this work. Although Horseshoe prior shows a low rate of false positive selection, it also misses many true descriptors, as is evident from its high rate of false negatives. Simulation tests demonstrate that the high rate of false positives in LASSO and false negatives in Horseshoe is consistent across various correlation parameters (see Supplementary Tables 18–20 for simulations with ρ = 0, ρ = 0.5, and ρ = 0.9), which suggests that this behavior will also be present given the expected range of correlation in our feature space. This is verified here by showing the number of features selected by each method on our MgO data set (Fig. 10b). Horseshoe selects the fewest descriptors in both dopant-modified and adsorbate-modified data sets, which suggests that Horseshoe is the most aggressive FS method. We note that LASSO does not always select the most descriptors, but its performance is mostly worse than that of Dirichlet–Laplace. The feature spaces selected based on the “real” MgO data are analyzed in the “Results” section.

**Fig. 10: Behavior and performance of feature selection methods.**

Data availability

Coordinates of all energy-minimized DFT structures and all SL training data are provided in the following online repository: https://github.com/tsenftle/MgO_SL/.

Code availability

All SL codes are provided in the following online repository: https://github.com/tsenftle/MgO_SL/Scripts.

References

Liu, L. & Corma, A. Metal catalysts for heterogeneous catalysis: from single atoms to nanoclusters and nanoparticles. Chem. Rev. 118, 4981–5079 (2018).
Qiao, B. et al. Highly efficient catalysis of preferential oxidation of CO in H₂-rich stream by gold single-atom catalysts. ACS Catal. 5, 6249–6254 (2015).
Qiao, B. et al. Single-atom catalysis of CO oxidation using Pt₁/FeO_x. Nat. Chem. 3, 634–641 (2011).
Moses-DeBusk, M. et al. CO oxidation on supported single Pt atoms: experimental and ab initio density functional studies of CO interaction with Pt atom on θ-Al₂O₃(010) surface. J. Am. Chem. Soc. 135, 12634–12645 (2013).
DeRita, L. et al. Catalyst architecture for stable single atom dispersion enables site-specific spectroscopic and reactivity measurements of CO adsorbed to Pt atoms, oxidized Pt clusters, and metallic Pt clusters on TiO₂. J. Am. Chem. Soc. 139, 14150–14165 (2017).
Jones, J. et al. Thermally stable single-atom platinum-on-ceria catalysts via atom trapping. Science 353, 150–154 (2016).
Zhang, Z. et al. Thermally stable single atom Pt/m-Al₂O₃ for selective hydrogenation and CO oxidation. Nat. Commun. 8, 16100 (2017).
Abbet, S., Heiz, U., Häkkinen, H. & Landman, U. CO oxidation on a single Pd atom supported on magnesia. Phys. Rev. Lett. 86, 5950–5953 (2001).
Liang, J.-X. et al. Theoretical and experimental investigations on single-atom catalysis: Ir₁/FeO_x for CO oxidation. J. Phys. Chem. C. 118, 21945–21951 (2014).
Spezzati, G. et al. Atomically dispersed Pd–O species on CeO₂(111) as highly active sites for low-temperature CO oxidation. ACS Catal. 7, 6887–6891 (2017).
Yang, M. et al. A common single-site Pt(II)–O(OH)_x—species stabilized by sodium on “active” and “inert” supports catalyzes the water-gas shift reaction. J. Am. Chem. Soc. 137, 3470–3473 (2015).
Lin, J. et al. Remarkable performance of Ir₁/FeO_x single-atom catalyst in water gas shift reaction. J. Am. Chem. Soc. 135, 15314–15317 (2013).
Yang, M. et al. Catalytically active Au-O(OH)_x—species stabilized by alkali ions on zeolites and mesoporous oxides. Science 346, 1498–1501 (2014).
Wei, H. et al. FeO_x-supported platinum single-atom and pseudo-single-atom catalysts for chemoselective hydrogenation of functionalized nitroarenes. Nat. Commun. 5, 5634 (2014).
Kwak, J. H., Kovarik, L. & Szanyi, J. CO₂ reduction on supported Ru/Al₂O₃ catalysts: cluster size dependence of product selectivity. ACS Catal. 3, 2449–2455 (2013).
Liu, P. et al. Photochemical route for synthesizing atomically dispersed palladium catalysts. Science 352, 797–800 (2016).
Guzman, J. & Gates, B. C. Structure and reactivity of a mononuclear gold-complex catalyst supported on magnesium oxide. Angew. Chem. Int. Ed. 115, 714–717 (2003).
Wang, C. et al. Low-temperature dehydrogenation of ethanol on atomically dispersed gold supported on ZnZrO_x. ACS Catal. 6, 210–218 (2016).
Guo, X. et al. Direct, Nonoxidative conversion of methane to ethylene, aromatics, and hydrogen. Science 344, 616–619 (2014).
Gu, X.-K. et al. Supported single Pt₁/Au₁ atoms for methanol steam reforming. ACS Catal. 4, 3886–3890 (2014).
Hu, B. et al. Isolated Fe^II on silica as a selective propane dehydrogenation catalyst. ACS Catal. 5, 3494–3503 (2015).
Li, Y. H., Xing, J., Yang, X. H. & Yang, H. G. Cluster size effects of platinum oxide as active sites in hydrogen evolution reactions. Chem. Eur. J. 20, 12377–12380 (2014).
Bhattacharya, A., Pati, D., Pillai, N. S. & Dunson, D. B. Dirichlet–Laplace priors for optimal shrinkage. J. Am. Stat. Assoc. 110, 1479–1490 (2015).
Tauster, S. J., Fung, S. C. & Garten, R. L. Strong metal-support interactions. Group 8 noble metals supported on titanium dioxide. J. Am. Chem. Soc. 100, 170–175 (1978).
Chandler, B. D. An extra layer of complexity: strong metal-support interactions. Nat. Chem. 9, 108–109 (2017).
Dai, Y., Lu, P., Cao, Z., Campbell, C. T. & Xia, Y. The physical chemistry and materials science behind sinter-resistant catalysts. Chem. Soc. Rev. 47, 4314–4331 (2018).
Hemmingson, S. L. & Campbell, C. T. Trends in adhesion energies of metal nanoparticles on oxide surfaces: understanding support effects in catalysis and nanotechnology. ACS Nano 11, 1196–1203 (2017).
Campbell, C. T. & Mao, Z. Chemical potential of metal atoms in supported nanoparticles: dependence upon particle size and support. ACS Catal. 7, 8460–8466 (2017).
Campbell, C. T. The energetics of supported metal nanoparticles: relationships to sintering rates and catalytic activity. Acc. Chem. Res. 46, 1712–1719 (2013).
Strayer, M. E. et al. Charge transfer stabilization of late transition metal oxide nanoparticles on a layered niobate support. J. Am. Chem. Soc. 137, 16216–16224 (2015).
Chen, G. et al. Interfacial electronic effects control the reaction selectivity of platinum catalysts. Nat. Mater. 15, 564–569 (2016).
Wang, Y.-G., Yoon, Y., Glezakou, V.-A., Li, J. & Rousseau, R. The role of reducible oxide–metal cluster charge transfer in catalytic processes: new insights on the catalytic mechanism of CO oxidation on Au/TiO₂ from ab initio molecular dynamics. J. Am. Chem. Soc. 135, 10673–10683 (2013).
Matsubu, J. C. et al. Adsorbate-mediated strong metal–support interactions in oxide-supported Rh catalysts. Nat. Chem. 9, 120–127 (2017).
Hu, P. et al. Electronic metal–support interactions in single-atom catalysts. Angew. Chem. Int. Ed. 53, 3418–3421 (2014).
Campbell, C. T. Catalyst–support interactions: electronic perturbations. Nat. Chem. 4, 597–598 (2012).
Pacchioni, G. Electronic interactions and charge transfers of metal atoms and clusters on oxide surfaces. Phys. Chem. Chem. Phys. 15, 1737 (2013).
Schlexer, P., Puigdollers, A. R. & Pacchioni, G. Tuning the charge state of Ag and Au atoms and clusters deposited on oxide surfaces by doping: a DFT study of the adsorption properties of nitrogen- and niobium-doped TiO₂ and ZrO₂. Phys. Chem. Chem. Phys. 17, 22342–22360 (2015).
Hu, C. H. et al. Modulation of catalyst particle structure upon support hydroxylation: ab initio insights into Pd₁₃ and Pt₁₃/γ-Al₂O₃. J. Catal. 274, 99–110 (2010).
Ghosh, S., Mammen, N. & Narasimhan, S. Descriptor for the efficacy of aliovalent doping of oxides and its application for the charging of supported Au clusters. J. Phys. Chem. C. 123, 19794–19805 (2019).
Rahmani Didar, B. & Balbuena, P. B. Reactivity of Cu and Co nanoparticles supported on Mo-doped MgO. Ind. Eng. Chem. Res. 58, 18213–18222 (2019).
Addou, R. et al. Influence of hydroxyls on Pd atom mobility and clustering on rutile TiO₂(011)-2×1. ACS Nano 8, 6321–6333 (2014).
Babucci, M. et al. Controlling catalytic activity and selectivity for partial hydrogenation by tuning the environment around active sites in iridium complexes bonded to supports. Chem. Sci. 10, 2623–2632 (2019).
Kumar, G. et al. Evaluating differences in the active-site electronics of supported Au nanoparticle catalysts using Hammett and DFT studies. Nat. Chem. 10, 268–274 (2018).
Shao, X. et al. Tailoring the shape of metal Ad-particles by doping the oxide support. Angew. Chem. Int. Ed. 50, 11525–11527 (2011).
Stavale, F. et al. Donor characteristics of transition-metal-doped oxides: Cr-doped MgO versus Mo-doped CaO. J. Am. Chem. Soc. 134, 11380–11383 (2012).
Prada, S., Giordano, L. & Pacchioni, G. Charging of gold atoms on doped MgO and CaO: identifying the key parameters by DFT calculations. J. Phys. Chem. C. 117, 9943–9951 (2013).
Campbell, C. T. & Sellers, J. R. V. Anchored metal nanoparticles: effects of support and size on their energy, sintering resistance and reactivity. Faraday Discuss. 162, 9–30 (2013).
Curtarolo, S., Morgan, D., Persson, K., Rodgers, J. & Ceder, G. Predicting crystal structures with data mining of quantum calculations. Phys. Rev. Lett. 91, 135503 (2003).
Schütt, K. T. et al. How to represent crystal structures for machine learning: towards fast prediction of electronic properties. Phys. Rev. B 89, 205118 (2014).
Rupp, M., Tkatchenko, A., Müller, K.-R. & von Lilienfeld, O. A. Fast and accurate modeling of molecular atomization energies with machine learning. Phys. Rev. Lett. 108, 058301 (2012).
Ghiringhelli, L. M. et al. Learning physical descriptors for materials science by compressed sensing. N. J. Phys. 19, 023017 (2017).
O’Connor, N. J., Jonayat, A. S. M., Janik, M. J. & Senftle, T. P. Interaction trends between single metal atoms and oxide supports identified with density functional theory and statistical learning. Nat. Catal. 1, 531–539 (2018).
Ouyang, R., Curtarolo, S., Ahmetcik, E., Scheffler, M. & Ghiringhelli, L. M. SISSO: a compressed-sensing method for systematically identifying efficient physical models of materials properties. https://arxiv.org/abs/1710.03319 (2017).
Andersen, M., Levchenko, S. V., Scheffler, M. & Reuter, K. Beyond scaling relations for the description of catalytic materials. ACS Catal. 9, 2752–2759 (2019).
Goldsmith, B. R., Esterhuizen, J., Liu, J.-X., Bartel, C. J. & Sutton, C. Machine learning for heterogeneous catalyst design and discovery. AIChE J. 64, 2311–2323 (2018).
Schmidt, J., Marques, M. R. G., Botti, S. & Marques, M. A. L. Recent advances and applications of machine learning in solid-state materials science. npj Comput. Mater. 5, 83 (2019).
Raftery, A. E. Bayesian model selection in social research. Sociol. Methodol. 25, 111–163 (1995).
Casella, G. & Moreno, E. Objective Bayesian variable selection. J. Am. Stat. Assoc. 101, 157–167 (2006).
Park, T. & Casella, G. The Bayesian Lasso. J. Am. Stat. Assoc. 103, 681–686 (2008).
Claeskens, G. & Hjort, N. L. Model Selection and Model Averaging (Cambridge University Press, 2008).
Castillo, I., Schmidt-Hieber, J. & van der Vaart, A. Bayesian linear regression with sparse priors. Ann. Stat. 43, 1986–2018 (2015).
Zhang, Y. & Bondell, H. D. Variable selection via penalized credible regions with Dirichlet–Laplace global-local shrinkage priors. Bayesian Anal. 13, 823–844 (2018).
Scott, J. G. & Berger, J. O. Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem. Ann. Stat. 38, 2587–2619 (2010).
Li, M. & Dunson, D. B. Comparing and weighting imperfect models using D-probabilities. J. Am. Stat. Assoc. 1–26, https://doi.org/10.1080/01621459.2019.1611140 (2019).
Carvalho, C. M., Polson, N. G. & Scott, J. G. The horseshoe estimator for sparse signals. Biometrika 97, 465–480 (2010).
Brown, M. A. et al. Oxidation of Au by surface OH: nucleation and electronic structure of gold on hydroxylated MgO(001). J. Am. Chem. Soc. 133, 10668–10676 (2011).
Choksi, T., Majumdar, P. & Greeley, J. P. Electrostatic origins of linear scaling relationships at bifunctional metal/oxide interfaces: a case study of Au nanoparticles on doped MgO substrates. Angew. Chem. Int. Ed. 57, 1–6 (2018).
Tibshirani, R. Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. B 58, 267–288 (1996).
Yudanov, I., Pacchioni, G., Neyman, K. & Rösch, N. Systematic density functional study of the adsorption of transition metal atoms on the MgO(001) surface. J. Phys. Chem. B 101, 2786–2792 (1997).
Risse, T., Shaikhutdinov, S., Nilius, N., Sterrer, M. & Freund, H.-J. Gold supported on thin oxide films: from single atoms to nanoparticles. Acc. Chem. Res. 41, 949–956 (2008).
Lipton, Z. C. The mythos of model interpretability. https://arxiv.org/abs/1606.03490 (2016).
Cui, Y., Stiehler, C., Nilius, N. & Freund, H.-J. Probing the electronic properties and charge state of gold nanoparticles on ultrathin MgO versus thick doped CaO films. Phys. Rev. B 92, 075444 (2015).
Lin, X. et al. Charge-mediated adsorption behavior of CO on MgO-supported Au clusters. J. Am. Chem. Soc. 132, 7745–7749 (2010).
Pacchioni, G. & Freund, H. Electron transfer at oxide surfaces. The MgO paradigm: from defects to ultrathin films. Chem. Rev. 113, 4035–4072 (2013).
Pacchioni, G. & Freund, H.-J. Controlling the charge state of supported nanoparticles in catalysis: lessons from model systems. Chem. Soc. Rev. 47, 8474–8502 (2018).
Tran, S. B. T., Choi, H. S., Oh, S. Y., Moon, S. Y. & Park, J. Y. Iron-doped ZnO as a support for Pt-based catalysts to improve activity and stability: enhancement of metal–support interaction by the doping effect. RSC Adv. 8, 21528–21533 (2018).
Linstrom, P. J. & Mallard, W. G. NIST Chemistry WebBook, NIST Standard Reference Database Number 69 (National Institute of Standards and Technology, 2020).
Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169–11186 (1996).
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 50, 17953–17979 (1994).
Kresse, G. & Joubert, D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys. Rev. B 59, 1758–1775 (1999).
Monkhorst, H. J. & Pack, J. D. Special points for Brillouin-zone integrations. Phys. Rev. B 13, 5188–5192 (1976).
Grimme, S., Antony, J., Ehrlich, S. & Krieg, H. A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu. J. Chem. Phys. 132, 154104 (2010).
Hinnemann, B. & Carter, E. A. Adsorption of Al, O, Hf, Y, Pt, and S atoms on α-Al₂O₃(0001). J. Phys. Chem. C. 111, 7105–7126 (2007).
Neugebauer, J. & Scheffler, M. Adsorbate-substrate and adsorbate-adsorbate interactions of Na and K adlayers on Al(111). Phys. Rev. B 46, 16067–16080 (1992).
Bader, R. F. W. Atoms in molecules. Acc. Chem. Res. 18, 7 (1985).
Henkelman, G., Arnaldsson, A. & Jónsson, H. A fast and robust algorithm for Bader decomposition of charge density. Comput. Mater. Sci. 36, 354–360 (2006).
Ishwaran, H. & Rao, J. S. Spike and slab variable selection: frequentist and Bayesian strategies. Ann. Stat. 33, 730–773 (2005).
Rumble, J. R. CRC Handbook of Chemistry and Physics, 99th (Internet Version 2018) (CRC Press/Taylor & Francis, Boca Raton, FL.).
Villars, P. A three-dimensional structural stability diagram for 998 binary AB intermetallic compounds. J. Less Common Met. 92, 215–238 (1983).
Zunger, A. Systematization of the stable crystal structure of all AB-type binary compounds: a pseudopotential orbital-radii approach. Phys. Rev. B 22, 5839–5872 (1980).
Waber, J. T. & Cromer, D. T. Orbital radii of atoms and ions. J. Chem. Phys. 42, 4116–4123 (1965).
Miedema, A. R., de Châtel, P. F. & de Boer, F. R. Cohesion in alloys—fundamentals of a semi-empirical model. Phys. B 100, 1–28 (1980).
Pearson, R. G. Absolute electronegativity and absolute hardness of Lewis acids and bases. J. Am. Chem. Soc. 107, 6801–6806 (1985).
Finkelnburg, W. & Humbach, W. Ionisierungsenergien von Atomen und Atomionen. Naturwissenschaften 42, 35–37 (1955).
Bratsch, S. G. & Lagowski, J. J. Predicted stabilities of monatomic anions in water and liquid ammonia at 298.15 K. Polyhedron 5, 1763–1770 (1986).
R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2019).
Suits, D. B. Use of dummy variables in regression equations. J. Am. Stat. Assoc. 52, 548–551 (1957).

Acknowledgements

The authors acknowledge the Texas Advanced Computing Center (TACC) at The University of Texas at Austin for providing high performance computing (HPC) resources that have contributed to the research results reported within this paper. C.-Y.L. and T.P.S. would like to acknowledge startup funding provided by Rice University.

Author information

These authors contributed equally: Chun-Yen Liu, Shijia Zhang.

Authors and Affiliations

Department of Chemical and Biomolecular Engineering, Rice University, Houston, TX, 77005, USA
Chun-Yen Liu, Daniel Martinez & Thomas P. Senftle
Department of Computer Science and Engineering, School of Electrical Engineering and Computer Science, Pennsylvania State University, University Park, PA, 16802, USA
Shijia Zhang
Department of Statistics, Rice University, Houston, TX, 77005, USA
Meng Li

Contributions

T.P.S. developed the concept of this project and supervised it together with M.L. C.-Y.L. and D.M. computed the DFT energies, conducted the electronic analysis, and engineered the primary features. S.Z. implemented the Horseshoe prior and Dirichlet–Laplace prior in R scripts for feature selection. All authors discussed and revised the paper together. C.-Y.L. and S.Z. contributed equally to this work.

Corresponding authors

Correspondence to Meng Li or Thomas P. Senftle.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Liu, CY., Zhang, S., Martinez, D. et al. Using statistical learning to predict interactions between single metal atoms and modified MgO(100) supports. npj Comput Mater 6, 102 (2020). https://doi.org/10.1038/s41524-020-00371-x

Received: 25 October 2019
Accepted: 02 July 2020
Published: 21 July 2020
DOI: https://doi.org/10.1038/s41524-020-00371-x
