Abstract
Low-cost, efficient high-throughput screening of catalysts is crucial for future renewable energy technology. Interpretable machine learning is a powerful method for accelerating catalyst design by extracting physical meaning, but it still faces major challenges. This paper describes an interpretable descriptor model that unifies activity and selectivity prediction for multiple electrocatalytic reactions (i.e., O2/CO2/N2 reduction and O2 evolution reactions), utilizing only easily accessible intrinsic properties. This descriptor, named ARSC, successfully decouples the atomic property (A), reactant (R), synergistic (S), and coordination (C) effects on the d-band shape of dual-atom sites, and is built upon our physically meaningful feature engineering and feature selection/sparsification (PFESS) method. Driven by this descriptor, we can rapidly locate optimal catalysts for various products without performing more than 50,000 density functional theory calculations. The model’s universality has been validated by numerous reported works and subsequent experiments, in which Co-Co/Ir-Qv3 are identified as optimal bifunctional oxygen reduction and evolution electrocatalysts. This work opens an avenue for intelligent catalyst design in high-dimensional systems linked with physical insights.
Introduction
Small-molecule activation by electrocatalytic reactions has emerged as a promising avenue in the pursuit of eco-friendly carbon-neutral cycles1, such as O2, CO2, and N2 reduction reactions (ORR2,3, CRR4,5,6, and NRR7,8) as well as oxygen evolution reaction (OER)9,10. Dual-atom catalysts (DACs), an extension of single-atom catalysts (SACs), particularly benefit electrocatalysis due to their intricate and flexible active sites11. However, a significant challenge in catalyst design lies in the development of universal descriptor models that can accurately capture the intricate interplay of geometric and electronic structures12,13. Nowadays, numerous notable descriptors have been reported, effectively revealing the structure-performance relationship on SACs or DACs14,15,16,17,18. However, when considering their reliability, practical utility, and universality comprehensively, the optimal low-cost descriptor capable of unifying multiple reactions while addressing both experimental and theoretical results remains elusive. The challenge arises from the abundant variables within high-dimensional systems. As an example, a high-throughput screening of highly active catalysts corresponding to various reactions and products among 840 DACs requires over 50,000 adsorption calculations via density functional theory (DFT). Instead, the most ideal approach is to rapidly assess the activity and selectivity of dual-atoms sites using a universal descriptor based on easily accessible features19,20,21. Correspondingly, all atomic property, synergistic, and coordination effects, as well as the inherent properties of reactants, must be collectively considered. However, it is hugely challenging to correlate them with an interpretable analytic expression, which aids in a fundamental understanding of structure-performance relationships and is more likely to be applicable to other unexplored systems.
Interpretable machine learning (ML) provides the potential to distill universal descriptor models in high-dimensional systems22. The symbolic regression algorithm (e.g., genetic programming23 and SISSO24), as the most applied interpretable ML in catalysis17,25,26,27, utilizes mathematical operators to search for concise functional forms predicting target properties based on input features. In general, constructing a vast feature space is programmatically done, often involving random or exhaustive methods, without the incorporation of physical insights. This approach may lead to a multitude of unphysical analytical forms, obscuring the truly significant formulas within. Consequently, the key to identifying universal descriptors lies in a physical feature space that has been streamlined based on physical insight. The d-band shape is a consequence of the interplay between electronic and geometric structures, which has been a basis for theoretically understanding trends in adsorption at surfaces of transition metals28,29. Some classic descriptors based on the d-band model, such as d-band center30 or upper edge31,32, still exhibit deviations in predicting certain specific systems. Moreover, distinguished from metallic catalysts, many previous works have reported that the frontier orbitals (FO) play an essential role in determining adsorption capacity for SACs due to their molecular complex characteristics in d-orbitals33,34,35. As a result, accurately quantifying the impact of its distinct d-orbital on activity is of utmost importance for DACs design.
Herein, we propose a far-reaching methodology to accurately capture the atomic property (A), reactant (R), synergistic (S), and coordination (C) effects through easily obtainable properties of dual-atom sites (Fig. 1). A universal and interpretable descriptor model, called ARSC, is developed via the physically meaningful feature engineering and feature selection/sparsification (PFESS) method. The physical meaning of PFESS is based on our combination of d-band theory and the frontier orbitals. The model unifies activity and selectivity prediction for multiple electrocatalytic reactions (ORR, OER, CRR, and NRR), and illustrates the significant role of the d-orbital overlap degree in reactions at dual-atom sites. Based on this model, which is built upon fewer than 4500 DFT calculations, highly active dual-atom sites for various reactions and products can be quickly predicted without more than 50,000 high-throughput calculations. To verify the volcano plot predicted by ARSC, we summarize experimental data from 28 reported DAC works, including more than 17 dual-atom sites. Additionally, our synthesized Co-Co/Ir-Qv3, predicted as the optimal bifunctional ORR/OER catalysts, experimentally show not only a remarkable half-wave potential of 0.941/0.937 V for ORR but also a small overpotential of 330/340 mV at 10 mA cm−2 for OER. This universal model, ARSC, and our proposed PFESS method have been verified to be extendable to other materials and applications. Our findings pave the way for quick, high-throughput screening of catalysts with glass-box models.
Results
Primitive descriptor for atomic property effects
We construct 840 homonuclear (X-X) and heteronuclear (X-Y) DACs with 3-5d transition metals and four different coordination structures (Qv1, Qv2, Qv3, and Qv4, Fig. 2a). The coordination structures are categorized based on whether the two metal atoms are directly bonded (bonded or non-bonded DACs)36. The microenvironment of dual-atom sites is jointly influenced by the atomic properties of the metal, the synergy of the dual atoms, and the coordination environment. Here, we start with the effect of atomic properties on activity, using homonuclear dual-atom sites as model systems. Because precisely synthesizing different coordination environments is challenging in experiments, it is reasonable to set their influence aside at this stage. Therefore, the adsorption free energies on these four geometries are averaged to mitigate the coordination effects (for example, the ∆G(*OH) on X-X sites represents the average value of ∆G(*OH) on X-X-Qv1, X-X-Qv2, X-X-Qv3, and X-X-Qv4). We introduce a new descriptor referred to as a “primitive descriptor” in this work. The primitive descriptor provides a common basis for relating the atomic properties of dual-atom sites to different catalytic activities. We calculate the adsorption behaviors of OHz (z = 0, 1), OOH, COOH, CO, CHzO (z = 1, 2, 3), COH, CHOH, N2, N2H, NHz (z = 2, 3), and H on DACs to evaluate the activity of multiple electrocatalytic reactions (ORR, OER, CRR, and NRR)13,20,21. Our calculations validate that the adsorption free energies of *OH and *COOH (∆G(*OH) and ∆G(*COOH)) and the reaction free energy of *N2 → *N2H (∆G(*N2 → *N2H)) all present volcano-like correlations with the limiting potentials (UL) of ORR/OER, CRR, and NRR, respectively (Supplementary Figs. 1–3).
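The coordination averaging described above can be sketched in a few lines. This is only an illustration: the function name `coordination_averaged` and the ∆G(*OH) values are hypothetical placeholders, not data from this work.

```python
from statistics import mean

# Hypothetical ∆G(*OH) values (eV) for one X-X site on the four coordination
# structures. These numbers are placeholders, not results from this work.
dg_oh_by_structure = {"Qv1": 0.62, "Qv2": 1.10, "Qv3": 0.95, "Qv4": 0.89}

REQUIRED_STRUCTURES = ("Qv1", "Qv2", "Qv3", "Qv4")


def coordination_averaged(dg_by_structure):
    """Return the mean adsorption free energy over Qv1, Qv2, Qv3, and Qv4."""
    missing = [s for s in REQUIRED_STRUCTURES if s not in dg_by_structure]
    if missing:
        raise ValueError(f"missing structures: {missing}")
    return mean(dg_by_structure[s] for s in REQUIRED_STRUCTURES)


avg_dg_oh = coordination_averaged(dg_oh_by_structure)
```

Requiring all four structures keeps the average comparable between sites; a site with a missing geometry raises an error instead of being averaged over fewer values.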
Previous studies have shown that the d-band model of metals, such as the d-band center, plays a critical role in governing these energy descriptors28. However, given that the d-orbitals of DACs lie between discrete energy levels and energy bands (Supplementary Figs. 4–8), relying solely on the d-band center no longer effectively describes adsorption capability (Supplementary Fig. 9). Alternatively, we find that dxz, dyz, and \(d_{z^2}\) are the FO for most DACs, and using the energy levels of the FO further enhances the accuracy of describing adsorption energies (Supplementary Fig. 10a, b). However, because DACs retain some energy-band character, neglecting the d-band shape results in insufficient precision. While the upper d-band edge has been widely confirmed as an effective descriptor for capturing the influence of the d-band shape32, it is worth noting that under different d-band filling (fd) and width (Wd), the same upper d-band edge can correspond to a different position of the adsorbate-metal antibonding orbitals (Supplementary Fig. 11). As illustrated in Supplementary Fig. 12, at the same fd, wider bands correspond to a higher antibonding position. Conversely, at similar Wd, the antibonding position is inversely related to the filling level. Therefore, the relative antibonding position (fd/Wd^0.5) is introduced as an electronic descriptor for understanding the effect of the d-band shape on adsorption properties on dual-atom sites (fd/Wd^0.5 ∝ ∆G, Fig. 2b). Our calculations further reveal that the shape of the FO (fFO/WFO^0.5) is strongly correlated with the d-band shape, indicating that both the FO and the full set of d-orbitals can effectively describe the structure-activity relationship of DACs with comparable accuracy (Supplementary Fig. 10c, d).
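As a rough illustration of how fd/Wd^0.5 could be evaluated from a projected density of states, the sketch below makes two assumptions that the text does not specify: the filling is taken as the fraction of d-states below the Fermi level, and the width as the square root of the second central moment. The function names and the flat model DOS are hypothetical.

```python
from math import sqrt


def integrate(y, x):
    """Trapezoidal integral of samples y over grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))


def d_band_shape(energies, dos, fermi=0.0):
    """Return (filling, width, filling / width**0.5) for a projected d-DOS.

    Assumed definitions (not taken from the paper):
    filling = fraction of d-states at or below the Fermi level,
    width   = square root of the second central moment of the DOS.
    """
    total = integrate(dos, energies)
    occupied = integrate(
        [d if e <= fermi else 0.0 for e, d in zip(energies, dos)], energies
    )
    center = integrate([e * d for e, d in zip(energies, dos)], energies) / total
    variance = (
        integrate([(e - center) ** 2 * d for e, d in zip(energies, dos)], energies)
        / total
    )
    filling = occupied / total
    width = sqrt(variance)
    return filling, width, filling / sqrt(width)


# Model input: a flat d-band between -4 eV and 2 eV relative to the Fermi level.
energies = [-4.0 + 0.01 * i for i in range(601)]
dos = [1.0] * len(energies)
filling, width, descriptor = d_band_shape(energies, dos)
```

Other width conventions (for example, a multiple of the second-moment value) would rescale the descriptor without changing the trend it captures.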
We then delve into the relationship between the atomic properties of dual atoms and the d-band shape. For fd, it is a positive function of the valence electron number (n) of metals (fd ∝ n) (Fig. 2d). Wd is more complex, which is primarily determined by the metal-metal bond distance (dM-M) and the size of the atomic d-orbitals. Unlike in pristine metals, where dM-M and d-orbital size mutually influence each other, for dual-atom sites, dM-M remains constant as both metal atoms are anchored in the substrate (Fig. 2c). Therefore, Wd is solely influenced by the d-orbital size, presenting as a negative function of atomic radius (R) and the number of electron shells (S) (Wd ∝ 1/(R × S)) (Fig. 2c). Based on the above findings, we propose a primitive descriptor to quantify atomic property effects on activity for dual-atom sites (X-X DACs):
where β and γ are non-negative indices determined by the significance of R and S on adsorption (Supplementary Table 1). As shown in Fig. 2g, when β = 1 and γ = 0, ϕxx is strongly linear in fd/Wd^0.5 with R^2 reaching 0.994, suggesting that R is more crucial for Wd on homonuclear dual-atom sites.
The d-band shape and its impact on adsorbate-metal antibonding can be well described by simple atomic properties: a smaller ϕxx value indicates a wider d-band and a higher location of the d-band upper edge (Fig. 2f), leading to a higher antibonding position and therefore a stronger adsorption energy of various key intermediates (ϕxx ∝ fd/Wd^0.5 ∝ ∆G), and vice versa (Fig. 2h). This interpretation is further supported by the linear relationship between ϕxx and the integrated crystal orbital Hamilton population (ICOHP) for the metal-adsorbate bond (Fig. 2g and Supplementary Fig. 13). We also find that different coordination structures have a relatively minor impact on the catalytic activity of homonuclear sites (Fig. 2e and Supplementary Fig. 14).
Unification of multiple small-molecule activation reactions
Multiple electrocatalytic reactions, including ORR (2e− and 4e− process), OER, CRR (CO/CH4/CH3OH production), and NRR, are successfully unified by plotting the primitive descriptor ϕxx vs. ∆G(z) − ∆Gopt(z) (Fig. 3a). ∆G(z) can be regarded as energy descriptors for activity, where z represents different key intermediates for the entire reaction, such as *OH for ORR/OER, *COOH for CRR, and *N2 → *N2H for NRR. ∆Gopt(z) is the optimal value of ∆G(z), which is located near the top of the UL-∆G(z) volcano plot, corresponding to the lowest overpotential for a specific reaction and product (Supplementary Figs. 1–3). According to UL-∆G(z) volcano plots, ∆Gopt(*OH) is found to be ~0.98 eV for ORR (4e−)/OER and ~1.23 eV for ORR (2e−), ∆Gopt(*COOH) is ~0 eV for CRR, and ∆Gopt(*N2 → *N2H) is ~0.3 eV for NRR. Therefore, the different ϕxx-values at ∆G(z) − ∆Gopt(z) = 0 (ϕopt) signify diverse favorable reactions and products for homonuclear dual-atom sites. It seems that Re-Re is better suited for NRR, while Mn-Mn exhibits superior activity for CO production, and Co-Co presents a lower overpotential for ORR (4e−) and OER.
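One way to locate ϕopt numerically is to fit a straight line to ∆G(z) − ∆Gopt(z) against ϕxx and solve for the zero crossing. The sketch below assumes a linear trend; the helper names and the data points are hypothetical, and only the value ∆Gopt(*OH) ≈ 0.98 eV is taken from the text.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def find_phi_opt(phi_values, dg_values, dg_opt):
    """Return the phi at which the fitted line of (dG - dG_opt) crosses zero."""
    shifted = [dg - dg_opt for dg in dg_values]
    slope, intercept = fit_line(phi_values, shifted)
    if slope == 0:
        raise ValueError("flat trend: no unique zero crossing")
    return -intercept / slope


# Hypothetical descriptor values and ∆G(*OH) data (eV), for illustration only.
phi_values = [1.0, 2.0, 3.0, 4.0]
dg_oh = [0.48, 0.98, 1.48, 1.98]
phi_opt_orr = find_phi_opt(phi_values, dg_oh, dg_opt=0.98)
```

Repeating the call with the ∆Gopt(z) of another reaction gives a different zero crossing, which is how one descriptor axis can carry several reaction-specific optima.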
Further designing heteronuclear dual-metal sites with better activity is another important application of ϕopt. For a typical heteronuclear X-Y site, X is defined as the metal that binds the adsorbate at the top site. Our calculations find that ϕxx is smaller than most of ϕyy, indicating that the metal that binds the adsorbate can be rapidly identified via ϕxx values (Supplementary Fig. 15 and Supplementary Table 2). For a specific reaction and product, X-type and Y-type metals can be identified through ϕopt. A metal with ϕxx <ϕopt is more conducive to the activation of adsorbate species, making it more suitable as an X-type metal. Conversely, a metal with ϕxx > ϕopt is more favorable for product desorption, which can be designed as Y-type metals. Therefore, the combination of X- and Y-type metals can further enhance catalytic activity. Based on this principle, the number of potential high-activity heteronuclear structures will be reduced to one-fourth, effectively improving the screening efficiency of DACs while maintaining physical insights.
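The pairing rule can be sketched as a simple partition of metals around ϕopt. The metal labels and ϕxx values below are hypothetical, and the strict inequalities are a simplification: the X- and Y-metal lists used later for ORR/OER both contain Co, which sits close to the optimum.

```python
def split_by_phi_opt(phi_xx, phi_opt):
    """Partition metals into X-type (phi < phi_opt) and Y-type (phi > phi_opt)."""
    x_type = sorted(m for m, phi in phi_xx.items() if phi < phi_opt)
    y_type = sorted(m for m, phi in phi_xx.items() if phi > phi_opt)
    return x_type, y_type


def heteronuclear_candidates(x_type, y_type):
    """Enumerate X-Y pairs, with X binding the adsorbate and Y tuning desorption."""
    return [(x, y) for x in x_type for y in y_type]


# Hypothetical primitive-descriptor values for four unnamed metals.
phi_xx = {"M1": 0.8, "M2": 1.4, "M3": 2.1, "M4": 2.9}
x_type, y_type = split_by_phi_opt(phi_xx, phi_opt=2.0)
pairs = heteronuclear_candidates(x_type, y_type)
```

In this toy case the rule keeps 4 ordered pairs out of the 12 possible ordered heteronuclear combinations of four metals.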
A fundamental understanding of why ϕopt locates at the different values for diverse reactions and products is also proposed. The ϕxx value represents the adsorption ability of dual-atom sites as a negative function of the activation degree of reactants. Due to the strong bond energy of the N≡N bond, a metal with a smaller ϕxx is required to activate N2 molecules effectively. From the N≡N bond to the C=O and O=O bonds, as the bond energy of adsorbates decreases, the metal with a larger ϕxx is more desirable to avoid over-activation of reactants or strong adsorption of products. This interpretation is confirmed by the linear relationship between the bond energies of reactants and the ϕopt values for corresponding reactions (Fig. 3b). As a result, the activity and selectivity of multiple reactions are totally unified by this simple primitive descriptor.
PFESS approach for synergistic effects between X- and Y-metals
While the structure-activity relationship of homonuclear DACs has been fully described by ϕxx, the synergistic effects between X- and Y-metals have not been considered yet, leaving our understanding of X-Y sites incomplete. To further introduce synergistic effects into ϕxx, we therefore performed the PFESS algorithm to identify a simple and universal mathematical formula unifying multiple reactions, as illustrated in Fig. 4a. Our feature space is constructed based on the analytical form of ϕxx, where only three units of important atomic properties are used and combined (nx, ny, Rx, Ry, Sx, Sy), with R and S exhibiting an inverse mathematical relationship with n. Only simple mathematical/functional operators are utilized to ensure the final mathematical forms are as simple and interpretable as possible25.
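A minimal sketch of how such a restricted feature space could be enumerated is given below. The operator set (division, multiplication, absolute difference), the exponent set (0.5, 1, 2, borrowed from the indices the text allows for nx and ny), and all names are assumptions for illustration, not the actual PFESS feature space.

```python
from itertools import combinations, permutations, product

PRIMARY = ("nx", "ny", "Rx", "Ry", "Sx", "Sy")
EXPONENTS = (0.5, 1, 2)  # assumed exponent set


def build_feature_space():
    """Map a readable expression string to a function of a property dict."""
    features = {}
    for p, q in product(EXPONENTS, repeat=2):
        # Division is order-sensitive, so every ordered pair is generated.
        for a, b in permutations(PRIMARY, 2):
            features[f"{a}^{p}/{b}^{q}"] = (
                lambda v, a=a, b=b, p=p, q=q: v[a] ** p / v[b] ** q
            )
        # Product and absolute difference are symmetric: unordered pairs suffice.
        for a, b in combinations(PRIMARY, 2):
            features[f"{a}^{p}*{b}^{q}"] = (
                lambda v, a=a, b=b, p=p, q=q: v[a] ** p * v[b] ** q
            )
            features[f"|{a}^{p}-{b}^{q}|"] = (
                lambda v, a=a, b=b, p=p, q=q: abs(v[a] ** p - v[b] ** q)
            )
    return features


feature_space = build_feature_space()

# Hypothetical property values for one X-Y site, for illustration only.
site = {"nx": 9, "ny": 3, "Rx": 1.5, "Ry": 1.4, "Sx": 4, "Sy": 4}
example_value = feature_space["nx^1/ny^2"](site)
```

Even this two-variable space holds several hundred expressions, which shows why a selection or sparsification step is needed before any formula can be interpreted.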
The optimal descriptors (1D ~ 6D) are finally identified through a modification of the least absolute shrinkage and selection operator (LASSO) method, referred to as LASSO + ℓ0 (refs. 37,38). However, this approach is computationally infeasible when managing an ultra-high-dimensional feature space or when features are correlated24. To address the huge size of this problem, feature selection is conducted using a combination of random forest regression (RF) and recursive feature elimination (RFE), where RF provides importance scores for the features in a model, and RFE iteratively trains the model and removes the features that contribute least to model performance, thereby reducing the feature dimensionality.
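For the one-dimensional case, the ℓ0 step reduces to an exhaustive search: fit each candidate feature alone and keep the one with the smallest error. The sketch below shows only that final ranking step, not the RF/RFE pre-screening or the LASSO stage. The function names, feature names, and data are hypothetical.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx  # a constant feature column would make sxx zero
    return slope, mean_y - slope * mean_x


def rmse_of_fit(x, y):
    """RMSE of the univariate linear fit of y on x."""
    slope, intercept = fit_line(x, y)
    residuals = [(slope * xi + intercept - yi) ** 2 for xi, yi in zip(x, y)]
    return sqrt(sum(residuals) / len(x))


def best_1d_descriptors(feature_columns, target, top=5):
    """Rank candidate features by the RMSE of their single-feature linear fit."""
    scored = sorted(
        (rmse_of_fit(column, target), name) for name, column in feature_columns.items()
    )
    return [(name, error) for error, name in scored[:top]]


# Hypothetical feature columns and target values, for illustration only.
target = [0.2, 0.4, 0.6, 0.8]
feature_columns = {
    "f_good": [1.0, 2.0, 3.0, 4.0],
    "f_noisy": [1.0, 3.0, 2.0, 4.0],
    "f_poor": [4.0, 1.0, 3.0, 2.0],
}
ranking = best_1d_descriptors(feature_columns, target)
```

For descriptors of dimension two or higher, the same idea requires searching over feature subsets, which is where the cost grows combinatorially and pre-screening becomes necessary.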
Potentially highly active X-Y sites for the different reactions are selected from the 760 heteronuclear DACs according to the design principle described above. X-metals and Y-metals are chosen based on ϕopt values. For ORR/OER, X is selected from Ru, Os, Fe, and Co, and Y from Co, Ni, Rh, Pd, Ir, and Pt; for CRR, we use W, Mo, V, Re, Cr, Mn, Os, and Ru as X-sites, with Fe, Co, Ir, Rh, Pd, Pt, and Ni as Y-sites; for NRR, the X-metals are Ta, Nb, Mo, W, and Re, and the Y-metals are Fe, Co, Ni, Ru, Rh, Pd, Os, Ir, and Pt. Taking these conditions into account, the catalytic activities of a total of 123 atomic combinations (492 DACs) are studied with comprehensive calculations of the reaction pathways. The 1D ~ 6D descriptors are trained with ∆G(*OH), UL(CRR), and UL(NRR), averaged over the 4 different coordination structures (Qv1, Qv2, Qv3, Qv4), as targets. The root mean square errors (RMSE) are calculated to evaluate the fitting performance of these ML-trained descriptors. The mean RMSE in 10-fold cross-validation (CV10) demonstrates the absence of discernible overfitting in these 1D ~ 6D descriptors (Fig. 4b and Supplementary Table 3). The simplest, most interpretable, and most generalizable 1D descriptors are the focal point of our investigation. The top 5 1D descriptors for ORR/OER, CRR, and NRR, each formed by atomic properties of both X and Y, are listed in Table 1. Even for the least accurate 1D descriptor, the RMSE is only 0.10 V, demonstrating remarkably strong predictive precision for ORR/OER/CRR/NRR activity. Other top-ranked 1D descriptors are listed in Supplementary Tables 4–6. In agreement with our physical insights, nx/(Rx × Sx) or (Rx × Sx)/nx with different non-negative indices always appears in the 1D descriptors, consistent with the analytical expression of the primitive descriptor ϕxx.
In addition, ny is the most frequently occurring atomic property of Y-metals within the 1D descriptors and is consistently divided by or subtracted from nx, which suggests competition for electron transfer between the X- and Y-metals. The non-negative exponent of nx is consistently greater than that of ny, further emphasizing the importance of X-metals and the auxiliary role of Y-metals. Another reason is that the value of nx is always less than that of ny, so increasing the index of nx ensures that nx^a − ny^b remains positive. Furthermore, a universal mathematical formula model for multiple reactions with a balance of simplicity, generalization, and accuracy is defined as:
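The cross-validated RMSE used to check for overfitting can be sketched as follows for a 1D linear descriptor. The interleaved fold assignment, the function names, and the synthetic data are assumptions for illustration; the text does not state how the folds were drawn.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def kfold_rmse(x, y, k=10):
    """Mean held-out RMSE over k deterministic, interleaved folds."""
    n = len(x)
    if n < k:
        raise ValueError("need at least k samples")
    fold_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th sample is held out
        train_x = [x[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        slope, intercept = fit_line(train_x, train_y)
        squared = [(slope * x[i] + intercept - y[i]) ** 2 for i in sorted(test_idx)]
        fold_errors.append(sqrt(sum(squared) / len(squared)))
    return sum(fold_errors) / k


# Synthetic descriptor/target pairs with a small alternating perturbation.
descriptor = [0.1 * i for i in range(20)]
target = [0.5 * d + 0.1 + (0.01 if i % 2 == 0 else -0.01) for i, d in enumerate(descriptor)]
cv10 = kfold_rmse(descriptor, target, k=10)
```

A CV10 value close to the training RMSE is the signature of a model that is not overfitted, which is the comparison the text draws.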
where δ is set to 1 or −1, which is solely introduced to enhance the fitting accuracy and holds no physical significance; β and γ are also non-negative indices determined by the significance of Rx and Sx on adsorption; f(nx, ny) presents the absolute differences or divisions on nx and ny, where the indexes for both nx and ny can only be 0.5, 1, or 2. Thus, for ORR/OER, the descriptor is defined as:
For CRR, the descriptor is defined as:
For NRR, the descriptor is defined as:
The descriptors trained through the PFESS method demonstrate significant potential in universality for multiple reactions and in interpretability when compared to other symbolic regression algorithms (Supplementary Fig. 16 and Supplementary Tables 7–9).
As shown in Fig. 4c–e, ϕxy presents a linear function of ∆G(*OH), UL(CRR), and UL(NRR), exhibiting outstanding predictive performance for activity. It is found that CRR and NRR activities remain the same for different coordination structures, while ∆G(*OH) is highly sensitive to the coordination effects. Therefore, for ORR, we need to consider the coordination effects in subsequent analyses. For CRR and NRR, our calculations show that the functions between adsorption energies and activities on our selected X-Y sites are concentrated on one side of the Sabatier principle, where active sites can effectively activate the reactants and the desorption of products is always the rate-limiting step. This phenomenon can be well explained. Our selected X-metals inherently possess high activation capability for reactants. Consequently, for active sites containing X-metals, molecular activation is not the problem. However, X-metals may face issues related to inhibiting the subsequent desorption of products, which is why we introduce Y-metals for electronic regulation. Hence, a relatively weaker adsorption capability is more likely to become an X-Y site with ideal activity. In addition, the synergistic effects can also be well quantified by ϕxy (Supplementary Figs. 17 and 18). Based on this mathematical form, several highly active X-Y sites are identified, such as Cr-Rh towards CH4 with an averaged UL(CRR) of −0.51 V, as well as Re-Fe and Mo-Fe with an averaged UL(NRR) of −0.36 V.
Quantification of coordination effects
Due to the sensitivity of ∆G(*OH) to the coordination environments of X-Y sites, it is crucial to quantify the coordination effects on ORR activity. The plot of ∆G(*OH) vs. ϕxx illustrates that these linear functions exhibit a certain offset along the x-axis for different coordination structures on X-Y sites (Fig. 5a). The intercept in the x-direction, to some extent, reflects the coordination effects. One can see that Qv1 shows the highest *OH adsorption capacity, followed by Qv3 and Qv4, with Qv2 exhibiting the weakest binding. Our calculations show that different coordination structures primarily determine the varying adsorption capabilities of species by regulating the Wd of X-Y sites (Supplementary Figs. 19 and 20).
This phenomenon is closely associated with the N coordination number of each metal atom (CN), the ratio of N atoms simultaneously bonded to two metals (RN), and the formation of direct metal-metal bonds. Herein, we define a characteristic parameter α to quantify coordination effects:
where λ is an index governed by the coordination type of X-Y sites: λ = 1 for the bonded type and λ = −1 for the non-bonded type, suggesting that the N atoms shared by two metals enhance the adsorption capacity of non-bonded DACs, whereas shared N atoms hinder the adsorption of species on the bonded type. Nm is the maximum number of N atoms that can coordinate to the entire X-Y site for a specific CN (2 × CN), and Na is the corresponding actual number. CN is 3 for Qv1 and 4 for Qv2, Qv3, and Qv4. Na is 6 for Qv1 and Qv2, 7 for Qv3, and 8 for Qv4. As shown in Supplementary Fig. 19, α is linear in the x-intercept of the ∆G(*OH)-ϕxx lines as well as in \(\bar{W}_d\), indicating that the intrinsic correlation between the electronic structures and the coordination environments is captured and quantified by the coordination factor α.
By introducing α into ϕxy, the ORR activity descriptor with coordination effect corrections is proposed, which precisely describes the catalytic performance of dual-atom sites under the influence of atomic property effects, reactant, synergistic, and coordination effects, called ARSC. ARSC is defined as follows:
where k represents the coefficient for α, which physically represents the weight of the coordination effect within ARSC. It is obtained by numerical fitting and is equal to 1 here. The volcano-like correlation between UL(ORR) and Φ is plotted in Fig. 5b. Four ORR mechanisms are considered, including the conventional association and new dissociation pathways (Path-I~IV, Supplementary Fig. 21). Only 12 X-Y sites are found to favor the dissociation pathways (Supplementary Table 10), and they are concentrated near the volcano summit. The non-bonded Co-Y sites, such as Co-Ir-Qv3, Co-Rh-Qv3, and Co-Co-Qv3, are all located at the top with UL(ORR) of 0.98 V, 0.98 V, and 0.95 V, respectively. To further investigate whether these highly active DACs will be blocked by severe kinetic issues, we calculate the kinetic barriers on Co-Y-Qv3 (Y = Co, Ir, Ni, Rh, Pd, Pt), and find that all the Co-Y-Qv3 sites can overcome the kinetic obstacles with barriers below 0.4 eV, exhibiting favorable kinetic behavior (Supplementary Figs. 22–24).
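For readers unfamiliar with how a limiting potential follows from adsorption free energies, the sketch below uses the widely adopted computational hydrogen electrode picture for the associative four-electron ORR pathway. Whether this matches the exact settings behind the UL(ORR) values above is an assumption, and the input free energies are illustrative placeholders.

```python
O2_TO_WATER_FREE_ENERGY = 4.92  # eV, 4 x 1.23 eV for O2 + 4(H+ + e-) -> 2 H2O


def orr_limiting_potential(dg_ooh, dg_o, dg_oh):
    """Limiting potential (V) of the associative 4e- ORR pathway.

    Inputs are adsorption free energies (eV) of *OOH, *O, and *OH at U = 0,
    referenced to H2O and H2 (assumed convention).
    """
    step_free_energies = [
        dg_ooh - O2_TO_WATER_FREE_ENERGY,  # * + O2 + (H+ + e-) -> *OOH
        dg_o - dg_ooh,                     # *OOH + (H+ + e-) -> *O + H2O
        dg_oh - dg_o,                      # *O + (H+ + e-) -> *OH
        -dg_oh,                            # *OH + (H+ + e-) -> * + H2O
    ]
    # The least exergonic step sets the highest potential at which
    # every step is still downhill.
    return -max(step_free_energies)


# Ideal catalyst: all four steps release 1.23 eV.
ul_ideal = orr_limiting_potential(dg_ooh=3.69, dg_o=2.46, dg_oh=1.23)

# Placeholder values in which the first and last steps are equally limiting.
ul_example = orr_limiting_potential(dg_ooh=3.94, dg_o=2.30, dg_oh=0.98)
```

Sites that bind *OH too strongly are limited by the last step and sites that bind too weakly by the first, which is the origin of the two legs of a volcano plot.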
ARSC can directly quantify the effect of the d-band/FO shape of DACs on activities, that is, the position of the metal-adsorbate antibonding orbitals. A larger ARSC value corresponds to a lower antibonding position and weaker adsorption of adsorbates. As shown in Supplementary Fig. 25, the antibonding state of the Co-OH bond is positioned at a moderate level on Co-Co/Ir-Qv3, while that of the V-OH bond is the highest and that of Ni-OH is the lowest. In addition, the Co-OH antibonding position for Qv3 is higher than that for Qv4, with Qv2 being the lowest. These rankings are consistent with the relative ARSC values, indicating that the promising performance of Co-Co/Ir-Qv3 is owing to its moderate d-band width and filling, which results in a moderate antibonding position.
The Pourbaix diagrams further reveal that the active sites of DACs are easily oxidized during OER, in line with previous reports39, while the pristine dual-atom site is more favorable under ORR conditions (Supplementary Figs. 26 and 27 and Supplementary Table 11). OER prefers to occur along Path-II on *O-covered DACs rather than Path-I on pristine DACs (Supplementary Figs. 28 and 29). A similar volcano-like correlation between the descriptor ARSC and OER activity is obtained for DACs with *O-covered active sites when k = 0.05 (Supplementary Fig. 30), suggesting the high universality of ARSC for OER. The smaller k also illustrates that OER is less sensitive to coordination effects than ORR, as the introduction of the same O-coordination weakens the overall coordination effect governed by the differences among the N-coordination structures in the substrate. Some candidates with promising OER activity are identified, such as Co-Co-Qv3, Co-Ir-Qv3, and Co-Ni-Qv1 with UL(OER) of 1.52 V, 1.52 V, and 1.53 V, respectively. These two volcano plots together show that non-bonded DACs, especially Qv3-based DACs such as Co-Co/Ir-Qv3, can serve as promising bifunctional oxygen electrocatalysts due to their moderate value of ARSC. Compared with other high-throughput studies covering Co-Co dual-atom sites as effective bifunctional catalysts, our work reproduces multiple previous screening results, and more new DACs exhibiting higher bifunctional activity can be further identified by ARSC (Supplementary Table 12). More importantly, our descriptor can quickly screen out multiple optimal DACs for various reactions and products simultaneously, which other Co-Co-related high-throughput studies have not yet achieved.
As another crucial criterion for catalyst screening, we comprehensively evaluate the stability of the 840 DACs by calculating the aggregation energy (Eagg) and dissolution potential (Udiss). We use Eagg < 0 eV and Udiss > 0 V as the standard for the unaggregatable and indissoluble system40. Supplementary Figs. 31–33 illustrate that the majority of X-Y-Qv2 and X-Y-Qv3 sites exhibit remarkable stability, while the metal atoms tend to aggregate and dissolve in the electrolyte in some X-Y-Qv4 and X-Y-Qv1 sites. It is worth noting that Fe, Co, Ni, Ru, Rh, Os, and Ir-based DACs have great potential to remain stable in a harsh electrochemical environment, which are identified as the potentially highly active elements for ORR/OER. Especially, as the optimal bifunctional oxygen catalysts screened by this work, Co-Co/Ir-Qv3 display superior durability with a very negative Eagg value and a very positive Udiss value.
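The two stability criteria translate directly into a filter. In the sketch below only the thresholds (Eagg < 0 eV and Udiss > 0 V) come from the text; the site labels and energies are hypothetical placeholders.

```python
def is_stable(e_agg, u_diss):
    """Apply the criteria Eagg < 0 eV (no aggregation) and Udiss > 0 V (no dissolution)."""
    return e_agg < 0.0 and u_diss > 0.0


# Hypothetical (Eagg in eV, Udiss in V) pairs for unnamed candidate sites.
candidates = {
    "A-B-Qv3": (-1.2, 0.4),   # passes both criteria
    "A-B-Qv4": (0.3, 0.1),    # tends to aggregate
    "C-D-Qv1": (-0.5, -0.2),  # tends to dissolve
}
stable_sites = [name for name, (e_agg, u_diss) in candidates.items() if is_stable(e_agg, u_diss)]
```

Both conditions must hold at once, so a site that resists aggregation but dissolves under potential is still rejected.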
Experimental verification and universality expansion of ARSC
To verify the screening results, non-bonded Co-Co and Co-Ir, characterized by identical Φ values and demonstrating remarkable activity for both ORR and OER, are synthesized (denoted as Co-Co/Ir-NC). The rhombic dodecahedral morphology and absence of crystal metal species are displayed by scanning electron microscopy (SEM), transmission electron microscopy (TEM), and X-ray diffraction patterns (XRD, Supplementary Figs. 34–36). Energy-dispersive X-ray spectroscopy (EDS) elemental mappings show that Co and Ir are homogeneously distributed (Supplementary Fig. 35c, d). Aberration-corrected high-angle annular dark-field scanning TEM (AC-HAADF-STEM) images illustrate a uniform dispersion of numerous atomic pairs with distances around 3.6 Å, closely resembling the metal-metal bond length in Qv3 of 3.59 Å (Fig. 5c and Supplementary Fig. 37).
The results of X-ray photoelectron spectroscopy (XPS) show the metal-N coordination in Co-Co/Ir-NC samples (Supplementary Figs. 38 and 39). The chemical states and local coordination environments are further investigated by X-ray absorption near-edge structure (XANES) and extended X-ray absorption fine structure (EXAFS) analysis. The Co K-edge XANES spectra suggest that the absorption edge of Co-Co/Ir-NC is located between those of Co foil and Co3O4 (Fig. 5d). As for XANES at the Ir L3-edge, the white-line intensity of Co-Ir-NC is situated between those of Ir foil and IrO2 (Supplementary Fig. 40a). These XANES spectra indicate the oxidized valence states of the metals. Prominent peaks at ca. 1.56 Å and ca. 1.48 Å are detected in the Fourier-transformed (FT) k3-weighted EXAFS spectra of Co-Co/Ir-NC samples, respectively, corresponding to the metal-N first-shell coordination (Fig. 5e and Supplementary Fig. 40b). It is worth noting that fitting the first shell alone is not effective in distinguishing between single atoms and metallic/oxidized clusters41. The EXAFS fitting for 3 scattering paths and comparisons between multiple structural models imply the absence of clusters and the non-bonded local structure of the metal pairs in Co-Co/Ir-NC (Supplementary Fig. 41 and Supplementary Tables 13 and 14). Moreover, the EXAFS fitting uncovers the metal-N4 configuration (Supplementary Table 14). As a result, we infer that our as-synthesized Co-Co/Ir-NC samples have the specific local structures of Co-Co/Ir-Qv3, corresponding to CoCo@N4A7 and CoIr@N4A7, respectively.
The ORR activities of the as-designed DACs are assessed. According to Fig. 5f and Supplementary Fig. 42, Co-Co-NC and Co-Ir-NC show half-wave potentials (E1/2) of 0.941 V vs. reversible hydrogen electrode (RHE) and 0.937 V, respectively, higher than that of commercial Pt/C (0.894 V). Their superior activities in terms of ORR kinetics are further verified by the smaller Tafel slopes compared to commercial Pt/C (Supplementary Fig. 43). Co-Co-NC and Co-Ir-NC display comparable and remarkable ORR activity, attributed to the identical and moderate Φ values close to the optimal value of Co-Co/Ir-Qv3. This can be verified by the volcano-like relationship between Φ and experimental ORR activity among a series of previously reported DACs (Fig. 5h and Supplementary Table 15). In total, 28 published papers and more than 17 dual-atom sites are involved, highlighting the reliability of the activity predictions of ARSC. Slightly negative shifts of E1/2 are observed after 10,000 cycles for Co-Co/Ir-NC, suggesting reasonably high catalytic stability compared with other DACs (Fig. 5i, Supplementary Fig. 44 and Supplementary Table 16).
To investigate the feasibility of bifunctional oxygen electrocatalysis, the OER performance is also evaluated. As shown in Fig. 5g and Supplementary Fig. 45, Co-Co-NC and Co-Ir-NC can deliver a current density of 10 mA cm−2 at a low overpotential (η10) of 330 mV and 340 mV, respectively, lower than that of commercial IrO2 (350 mV). The corresponding Tafel slopes are also small, suggesting their high OER activities in alkaline media (Supplementary Fig. 46). More importantly, their bifunctional catalytic activities, evaluated by the difference between the OER potential at 10 mA cm−2 (1.23 V + η10) and the ORR E1/2, surpass those of the reported DACs (0.619 and 0.633 V for Co-Co-NC and Co-Ir-NC, respectively, Supplementary Table 17). The performance of Co-Co/Ir-NC is well maintained at 10 mA cm−2 for 200 h with minimal degradation, demonstrating superior stability compared to other carbon-based DACs (Fig. 5j, Supplementary Figs. 47 and 48 and Supplementary Table 18).
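The reported bifunctional gaps can be reproduced from the quoted E1/2 and η10 values, assuming the OER potential at 10 mA cm−2 is the 1.23 V equilibrium potential plus η10. The function name is hypothetical; the input numbers are those given in the text.

```python
OER_EQUILIBRIUM_POTENTIAL = 1.23  # V vs. RHE


def bifunctional_gap(half_wave_potential, overpotential_mv):
    """Return E(OER at 10 mA cm^-2) minus E1/2(ORR), in volts."""
    oer_potential = OER_EQUILIBRIUM_POTENTIAL + overpotential_mv / 1000.0
    return oer_potential - half_wave_potential


gap_co_co = bifunctional_gap(half_wave_potential=0.941, overpotential_mv=330)
gap_co_ir = bifunctional_gap(half_wave_potential=0.937, overpotential_mv=340)
```

Rounded to three decimals, the two gaps come out as 0.619 V and 0.633 V, matching the values quoted for Co-Co-NC and Co-Ir-NC; a smaller gap means less total voltage lost across the two oxygen reactions.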
The mathematical formulation of ARSC comprises several components: nx, Sx, and Sy for atomic property effects (denoted as ϕxx), ny for synergistic effects, and α for coordination effects. The quantification of atomic and synergistic effects that we developed should be applicable to any dual-atom site, not only to graphene systems with N coordination. Supporting this hypothesis, similar linear relationships between ∆G(*OH) and ϕxx are found on X-X-C and X-X-NO sites, suggesting that the ARSC descriptor holds significant potential for systems coordinated by various elements (Supplementary Figs. 49 and 50). In addition, using reported data, a linear relationship between UL(NRR) and ϕxy can be plotted for X-Y sites supported on 2D expanded phthalocyanine42, where ideal DACs, such as MoV, can be quickly identified by our descriptor (Supplementary Fig. 51). However, we also notice individual outliers, which point to a limitation of the descriptor: the impact of substrates other than graphene on the microenvironment of dual-atom sites is not considered in this work. It should be noted that, although ARSC has been shown to unveil structure-activity relationships across various dual-atom systems and reactions, future high-throughput screening of DACs on other substrates would benefit from further optimizing the descriptor according to the characteristics of each substrate to improve the overall prediction accuracy.
Discussion
In summary, we develop a simple but universal mathematical formula, called ARSC, to unify various electrocatalytic reactions while simultaneously elucidating the atomic property, reactant, synergistic, and coordination effects of dual-atom sites. Our descriptor exhibits superior performance in six dimensions: practical utility, universality, physical meaning, developing method, experimental performance, and literature verification (Supplementary Table 19 and Supplementary Figs. 52 and 53). The physical-chemical basis of this descriptor relies on the PFESS method, which combines a large physically meaningful feature space reflecting the d-band characteristics with ML-based feature selection/sparsification algorithms. The specific d-band shape of dual-atom sites is reflected in the valence electron number, atomic radius, number of electron layers, and coordination of active sites, where the degree of orbital overlap is found to be the key to determining the activity. According to our findings, a broader and more vacant d-band at dual-atom sites is more conducive to activating adsorbed species and is therefore better suited to reactants with higher bonding energies in electrocatalytic reactions. Based on ARSC, established from ~4326 DFT-generated data points, desirable catalysts for different products can be quickly identified without more than 50,000 high-throughput calculations. Co-Co/Ir-Qv3, which has a moderate value of ARSC, is predicted to be the most promising bifunctional oxygen electrocatalyst, as supported by 28 reported studies and our experiments. The ARSC descriptor and the corresponding PFESS method have the potential to be extended to other reactions and materials. This work overcomes the limitation of traditional descriptors, which describe individual reactions, and provides a new research paradigm for fast screening of highly active catalysts in high-dimensional systems with glass-box models.
Methods
DFT settings for screening and analysis
The DFT calculations utilized the Vienna Ab Initio Simulation Package43 with the projector augmented wave method44. Exchange-correlation energy was modeled through the generalized gradient approximation in the form of the Perdew-Burke-Ernzerhof functional45. Van der Waals interactions were considered using the DFT-D3 correction method46. Spin polarization was applied throughout. A 400 eV cutoff energy for the plane-wave basis was set for each slab model, and atoms were fully relaxed until the force on each atom was <0.02 eV Å⁻¹. To model monolayer graphene, we utilized a (3 × 3) periodic supercell including a 15 Å vacuum layer in the z-direction to prevent inter-slab interactions. Geometry optimization and frequency calculations employed a (5 × 5 × 1) k-point mesh, followed by a denser (7 × 7 × 1) k-point grid for electronic property calculations. The high-throughput calculations and analysis were based on various Python and shell software packages, including Atomic Simulation Environment (ASE)47, Python Materials Genomics (pymatgen)48, and VASPKIT49. Atomic charges were determined by Bader charge analysis50. Crystal orbital Hamilton population (COHP) analysis was carried out using the LOBSTER 3.2.0 package51. The climbing image nudged elastic band (CI-NEB) method was employed to search for transition states (TS)52.
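For readers setting up comparable calculations, the settings above can be collected into VASP input text. The following is a minimal sketch rather than the authors' actual input files: the tags ENCUT, ISPIN, EDIFFG and GGA follow the values stated in the text, whereas the D3 damping variant (IVDW), the ionic optimizer (IBRION), the step limit (NSW) and the Gamma-centered mesh are assumptions, since the text does not specify them.

```python
# INCAR tags for geometry relaxation. Tags marked ASSUMPTION are not stated in the text.
RELAX_INCAR = {
    "GGA": "PE",      # PBE exchange-correlation functional
    "ENCUT": 400,     # plane-wave cutoff energy in eV
    "ISPIN": 2,       # spin-polarized calculation
    "EDIFFG": -0.02,  # negative value: converge forces below 0.02 eV/Angstrom
    "IVDW": 11,       # ASSUMPTION: DFT-D3 with zero damping (12 selects Becke-Johnson damping)
    "IBRION": 2,      # ASSUMPTION: conjugate-gradient ionic relaxation
    "NSW": 200,       # ASSUMPTION: maximum number of ionic steps
}


def render_incar(tags):
    """Format a tag dictionary as INCAR text, one 'KEY = value' pair per line."""
    return "\n".join(f"{key} = {value}" for key, value in tags.items()) + "\n"


def render_kpoints(mesh):
    """Format an automatic k-point mesh as KPOINTS text (Gamma centering is an ASSUMPTION)."""
    nx, ny, nz = mesh
    return f"Automatic mesh\n0\nGamma\n{nx} {ny} {nz}\n0 0 0\n"


print(render_incar(RELAX_INCAR))
print(render_kpoints((5, 5, 1)))  # mesh for geometry optimization and frequencies
print(render_kpoints((7, 7, 1)))  # denser mesh for electronic properties
```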
Electronic state correction
To correct the magnetization, we tested different MAGMOM, SIGMA, AMIX, and AMIX_MAG parameters in VASP. The results reveal that these parameters have little influence on the converged magnetic moments. We then scanned the total magnetization from 0 to 6 μB for Fe-, Co-, and Ni-based DACs and found that the X-Y sites calculated without a constrained total magnetization are the most stable (Supplementary Table 20). Fe possesses the highest spin state (2 μB, triplet state), followed by Co (1 μB, doublet state), Ni (0 μB, singlet state), and other elements (0 μB, singlet state) (Supplementary Table 21). The magnetic properties change upon adsorption of *O, *OH and *OOH species, in line with previous work40 (Supplementary Table 22).
Reaction free energy calculation
For each step in the electrocatalytic reaction process, we computed the reaction free energy (∆G) using the computational hydrogen electrode (CHE) model proposed by Nørskov and colleagues53. This model equates the free energy of an electron-proton pair (H+ + e−) to half the chemical potential of gaseous H2 at equilibrium (0 V vs. the standard hydrogen electrode). The ∆G was determined by the formula: ΔG = ΔE + ΔEZPE + δH0 − TΔS + eU + ΔGpH. In this equation, ΔE represents the reaction energy from DFT calculations, ΔEZPE and ΔS are the changes in zero-point energy and entropy, respectively, T stands for the reaction temperature (298.15 K), and δH0 is the integrated heat capacity. The terms e and U indicate the number of electrons transferred and the electrode potential, respectively, with U usually set to 0 V. For the pH correction, ΔGpH, we used the formula: ΔGpH = kBT × ln10 × pH, where kB is the Boltzmann constant, and pH was assumed to be 0 in this study. To assess the electrocatalytic activities, we calculated the limiting potential (UL) as UL = −ΔGmax/e, where ΔGmax is the ∆G of the potential-determining step (PDS).
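The free-energy bookkeeping described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names are hypothetical, each step is assumed to transfer one electron, and the step energies in the example are made-up placeholders chosen only so that four steps sum to −4.92 eV; they are not results from this work.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K
T = 298.15            # reaction temperature in K


def step_free_energy(dE, dZPE, dH, TdS, n_e=1, U=0.0, pH=0.0):
    """dG = dE + dE_ZPE + dH0 - T*dS + e*U + dG_pH, all energies in eV."""
    dG_pH = K_B * T * math.log(10) * pH
    return dE + dZPE + dH - TdS + n_e * U + dG_pH


def limiting_potential(step_dGs):
    """U_L = -dG_max / e for one-electron steps (dG in eV gives U_L in V)."""
    return -max(step_dGs)


# PLACEHOLDER step free energies (eV) at U = 0 V and pH = 0; not data from the paper.
example_steps = [-1.50, -1.80, -0.90, -0.72]
print(f"U_L = {limiting_potential(example_steps):.2f} V")
```

In this placeholder example the least exergonic step (−0.72 eV) is the potential-determining step, so UL = 0.72 V.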
Solvation effect
The solvation effect is taken into account via the implicit model (Poisson-Boltzmann model), which is implemented in VASPsol with a dielectric constant of 80 (ref. 54). For NRR, our previous work has verified that the solvation effect is minor, introducing a deviation of only about 0.1 eV, in line with multiple reports13. For ORR/OER, our calculations reveal that the solvation correction has a negligible effect on the activity trends between different metals, since the slope of the linear fit is close to 1, especially for the late transition metals, confirming that solvent effects are unlikely to alter our catalyst screening results for ORR/OER (Supplementary Fig. 54a). This is further verified by Supplementary Fig. 54b, where most of the selected potentially highly active X-Y sites (X = Ru, Os, Fe, Co, and Y = Co, Ni, Rh, Pd, Ir, Pt) deviate by no more than 0.1 V from the UL(ORR) values obtained without the implicit model. In addition to *O- and *N-based adsorption, we found the same result when UL(CRR) was calculated with the implicit model on the potentially highly active X-Y sites (X = W, Mo, V, Re, Cr, Mn, Os, Ru, and Y = Fe, Co, Ir, Rh, Pd, Pt, Ni) (Supplementary Fig. 54c). All these results reveal that solvation effects have little influence on the screening results for promising electrocatalyst candidates identified by our proposed descriptor.
Atomic property selection
The atomic radii R used in this work are calculated using self-consistent-field (SCF) functions55. We also considered and compared ionic radii, covalent radii, and metallic radii as R values. The electron localization function (ELF) results reveal strong covalent bond characteristics between the metal centers and the coordinating N atoms on DACs (Supplementary Fig. 55), suggesting that covalent radii are more suitable than ionic and metallic radii for quantifying the atomic property effects of the metal center on the catalytic performance. The descriptor based on covalent radii (taken from the CRC Handbook of Chemistry and Physics) demonstrates a predictive accuracy comparable to that based on the SCF function-calculated atomic radii (Supplementary Fig. 56), indicating that covalent radii can also be used as R in this work. In contrast, ionic radii and metallic radii may be unsuitable, as they suffer from high cost and low accuracy, respectively.
Evaluation of linear scaling relations
In addition to the R² mentioned in the figure, we listed multiple regression results for the linear scaling relations (LSRs) in Figs. 2b–e, g and 4c–e and Supplementary Fig. 14, including the mean absolute error (MAE), maximum positive/negative error (MPE/MNE) and root mean square errors (RMSE), as shown in Supplementary Table 23.
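The error metrics named above can be computed as in the following sketch. It assumes residuals are defined as predicted minus reference values, so that MPE and MNE are the largest positive and most negative residuals; the paper does not state its sign convention, and the function name is illustrative.

```python
import math


def regression_errors(predicted, reference):
    """Return MAE, RMSE, MPE and MNE for paired predictions and reference values."""
    # ASSUMPTION: residual = predicted - reference.
    residuals = [p - r for p, r in zip(predicted, reference)]
    n = len(residuals)
    return {
        "MAE": sum(abs(x) for x in residuals) / n,
        "RMSE": math.sqrt(sum(x * x for x in residuals) / n),
        "MPE": max(residuals),  # maximum positive error
        "MNE": min(residuals),  # maximum negative error
    }


# Toy numbers purely for illustration; not data from the paper.
errors = regression_errors([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
print(errors)
```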
Stability analysis
The stability against metal aggregation is evaluated by the aggregation energy (Eagg), which is defined as: Eagg = (E(X-Y-NC) − E(NC) − μ(X) − μ(Y))/2, where E(X-Y-NC) and E(NC) represent the total energies of the DAC and the defective nitrogenated graphene surface, respectively; μ(X) and μ(Y) are the energies of X-metal and Y-metal atoms, which are taken from the bulk metals. A negative Eagg value indicates that embedding into graphene is preferred over metal clustering. The dissolution potential (Udiss) was used to determine the electrochemical stability of X-Y sites against dissolution56, which is defined as: Udiss(X) = \(U_{\mathrm{diss}}^{\circ}\)(X) − (E(X-Y-NC) − E(Y-NC) − μ(X))/(e·ne); Udiss(Y) = \(U_{\mathrm{diss}}^{\circ}\)(Y) − (E(X-Y-NC) − E(X-NC) − μ(Y))/(e·ne); Udiss = min[Udiss(X), Udiss(Y)]. In these equations, \(U_{\mathrm{diss}}^{\circ}\)(X) and \(U_{\mathrm{diss}}^{\circ}\)(Y) are the standard dissolution potentials of bulk X-metal and Y-metal; ne represents the number of electrons transferred in the dissolution; e is the electron charge; E(Y-NC) and E(X-NC) are the total energies of DACs with a vacancy at the X-metal and Y-metal position, respectively. A positive value of Udiss indicates that the dissolution of metal atoms can be better avoided under electrochemical conditions.
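The two stability criteria can be written as small helper functions. The sketch below is a direct transcription of the formulas above with hypothetical function and argument names; energies are in eV, so dividing by the electron count gives potentials in V. All numerical inputs in the example are made-up placeholders, not values from this work.

```python
def aggregation_energy(e_xy_nc, e_nc, mu_x, mu_y):
    """E_agg = (E(X-Y-NC) - E(NC) - mu(X) - mu(Y)) / 2, in eV per metal atom."""
    return (e_xy_nc - e_nc - mu_x - mu_y) / 2.0


def dissolution_potential(u0_x, u0_y, e_xy_nc, e_y_nc, e_x_nc, mu_x, mu_y, n_x, n_y):
    """U_diss = min over both metals of U0 - (E(X-Y-NC) - E(single-vacancy DAC) - mu) / n_e."""
    u_x = u0_x - (e_xy_nc - e_y_nc - mu_x) / n_x
    u_y = u0_y - (e_xy_nc - e_x_nc - mu_y) / n_y
    return min(u_x, u_y)


# PLACEHOLDER energies (eV), standard potentials (V) and electron counts; not paper data.
e_agg = aggregation_energy(e_xy_nc=-10.0, e_nc=-2.0, mu_x=-3.0, mu_y=-3.0)
u_diss = dissolution_potential(
    u0_x=-0.28, u0_y=1.0,
    e_xy_nc=-10.0, e_y_nc=-6.0, e_x_nc=-5.0,
    mu_x=-3.0, mu_y=-3.0, n_x=2, n_y=3,
)
print(f"E_agg = {e_agg:.2f} eV, U_diss = {u_diss:.2f} V")
```

With these placeholder inputs, Eagg is negative and Udiss is positive, which is the combination the text identifies as stable against both aggregation and dissolution.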
Interpretable machine learning approach
The interpretable machine learning approach was implemented based on DFT-calculated datasets for rapid and high-throughput screening of DACs. RF and LASSO + ℓ0 in the PFESS method were performed using the scikit-learn package57, and we used the gplearn package58 to carry out the GPSR method. In this work, to ensure that the final trained descriptors are as simple and interpretable as possible, only simple algebraic/functional operators were utilized for all symbolic regression methods (PFESS, GPSR26, and SISSO24), namely +, −, ×, ÷, square, and square root. n, S, and R were used as primary atomic features. UL and ΔG were utilized as targets for training descriptors with different dimensions or complexities. In PFESS, the construction of the physically meaningful feature space is inherently rooted in the primitive descriptors ϕ. The first feature space incorporates the original features, their square roots, and their squares. The second feature space is populated by performing summations, absolute differences, multiplications, and divisions on features from the first feature space that share the same units. Features in the first feature space with differing units are only allowed to be multiplied or divided, which creates the third feature space. Unphysical analytical forms are avoided by this approach25. In the final stage, features with different units in the second and third feature spaces are combined through multiplication and division, giving 163,659 features. Our code for constructing the feature spaces follows the method proposed by Senftle et al.25.
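The unit-aware construction of the first three feature spaces can be illustrated with a toy sketch. This is not the PFESS implementation (which is available in the authors' repository); it is a simplified illustration in which features are tracked as expression strings with hypothetical unit labels, only four primary features are used, and only one ordering of each quotient is generated.

```python
from itertools import combinations

# ASSUMPTION: hypothetical unit labels for four primary features of atoms X and Y.
PRIMARY = {"n_x": "count", "n_y": "count", "R_x": "length", "R_y": "length"}


def first_space(primary):
    """Original features, their square roots, and their squares, each with a unit label."""
    feats = []
    for name, unit in primary.items():
        feats.append((name, unit))
        feats.append((f"sqrt({name})", f"sqrt({unit})"))
        feats.append((f"({name})^2", f"({unit})^2"))
    return feats


def second_space(feats):
    """Same-unit pairs: sum, absolute difference, product, and quotient."""
    out = []
    for (a, ua), (b, ub) in combinations(feats, 2):
        if ua == ub:
            out += [f"({a} + {b})", f"|{a} - {b}|", f"({a} * {b})", f"({a} / {b})"]
    return out


def third_space(feats):
    """Different-unit pairs: product and quotient only, so no unphysical sums arise."""
    out = []
    for (a, ua), (b, ub) in combinations(feats, 2):
        if ua != ub:
            out += [f"({a} * {b})", f"({a} / {b})"]
    return out


space1 = first_space(PRIMARY)
space2 = second_space(space1)
space3 = third_space(space1)
print(len(space1), len(space2), len(space3))
```

For these four primary features the sketch yields 12, 24, and 120 candidates in the three spaces. The rapid growth with each combination stage is what makes the subsequent selection/sparsification step necessary.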
By combining RF, RFE, and LASSO + ℓ0, this method enables rapid dimensionality reduction in complex systems containing 10⁵ ~ 10⁶ physically meaningful features. Nonlinear features can also be included so that potential nonlinear relationships, such as the volcano relationship13, are not missed later, although such nonlinear correlations between descriptors and targets fall outside the scope of this paper. In GPSR, in addition to n, S, and R, the electronegativity (X), d-electron count (d), and first ionization energy (IE) were also considered as primary features. A grid search was employed for the parsimony coefficient, crossover probability (pc), subtree mutation probability (ps), and number of generations26. Three parsimony coefficients (0.0005, 0.0010, 0.0015), 18 pc values ranging from 0.5 to 0.95 (step = 0.025), 8 ps values ranging from (1 − pc)/3 to (0.92 − pc)/3 (step = 0.1), and 20 generations were considered. The population size was set at 5000. In SISSO, the ℓ0 constraint for sparsification is applied to a reduced feature subspace, which is determined through the sure independence screening (SIS) method24. The size of this subspace is set to a user-defined value multiplied by the dimension of the descriptor. Given that 1D descriptors offer the highest interpretability and the associated feature space is not overly complex, they stand out as the most ideal descriptors. Therefore, 1D descriptors with a higher level of complexity (phi = 2 and 3) were trained and studied in this work.
Materials
2-methylimidazole (2-MI, solid, 98%, Macklin), zinc acetate dihydrate (Zn(OAc)2·2H2O, solid, 99.99%, Aladdin), cobalt acetate tetrahydrate (Co(OAc)2·4H2O, solid, 99.9%, Aladdin), phytic acid (liquid, 50% in H2O, Aladdin), cobalt nitrate hexahydrate (Co(NO3)2·6H2O, solid, 99.99%, Aladdin), Sodium hexachloroiridate(III) hydrate (Na3IrCl6·xH2O, solid, Ir 35% ~ 40%, Aladdin), ethanol (C2H5OH, liquid, HPLC, ≥99.8%, Macklin), methanol (CH3OH, liquid, HPLC, ≥99.9%, Aladdin), potassium hydroxide (KOH, solid, 95%, Macklin), Nafion perfluorinated resin solution (liquid, polymer content 5.0 ~ 5.4 wt %, Aladdin), commercial Pt/C (solid, 20 wt % Pt, Johnson Matthey), and commercial IrO2 (solid, 99.9%, Meryer). All reagents and materials were used as received without further purification.
Catalyst preparation
Co-Co-NC and Co-Ir-NC were synthesized based on a previously reported method59. In a typical synthesis, a solution of 2.2331 g of 2-MI in 10 mL of deionized water was prepared. The mixture was then stirred for 5 min at room temperature. Subsequently, a vigorously stirred solution containing 0.596 g of Zn(OAc)2·2H2O and 0.068 g of Co(OAc)2·4H2O in 10 mL of aqueous solution was added. After an additional 1-min stir at room temperature, the combined solution was left undisturbed and aged for 5 h. The resulting product was then collected by centrifugation and washed three times with methanol to obtain Co-ZIF. After that, 50 mg of the as-synthesized Co-ZIF was dispersed in 50 mL of deionized water through sonication, forming a homogeneous colloid. Subsequently, a 1 mL aqueous solution containing 5 mg of Co(NO3)2·6H2O was added dropwise. After a 10-min stir at room temperature, 65 mg of phytic acid was introduced into the mixture under vigorous stirring. Following a 30-min stir at room temperature, the product was collected by centrifugation, washed three times with deionized water and ethanol, and then dried at 50 °C. The resulting powder was heated to 600 °C under an N2 atmosphere with a ramp rate of 1 °C min⁻¹, kept for 120 min, and then further heated to 700 °C with a ramp rate of 2 °C min⁻¹. Finally, the black powder was naturally cooled down to room temperature to obtain Co-Co-NC. Co-Ir-NC was synthesized via a similar procedure, except that 8.5 mg of Na3IrCl6·xH2O was used as the raw material instead of 5 mg of Co(NO3)2·6H2O.
Characterization
Field emission scanning electron microscopy (FE-SEM, Hitachi Regulus 8100), transmission electron microscopy (TEM), high-resolution TEM (HRTEM, JEOL JEM-F200 and JEM-2100plus), and aberration-corrected high-angle annular dark-field scanning TEM (AC-HAADF-STEM, JEOL JEM-ARM200F) were employed for observing the micromorphology and microstructure of samples. The element distribution was analyzed by EDS, which was attached to the TEM. Inductively coupled plasma optical emission spectroscopy (ICP-OES, Varian VISTA-MPX) was performed to identify the element content of samples. To identify the crystal phase of samples, X-ray Diffractometer (XRD, Bruker D8 Focus) with copper Kα radiation (λ = 1.54056 Å) was utilized. The valence state was determined by X-ray photoelectron spectrometer (XPS, Thermo K-Alpha+), using an Al Kα X-ray source (hν = 1486.6 eV). To investigate the chemical state and local coordination structure of samples, the X-ray absorption fine structure (XAFS) measurements at Co K-edge and Ir L3-edge were obtained using the BL14W1 beamline at the Shanghai Synchrotron Radiation Facility (SSRF). The XAS raw data and Fourier-transformed fitting were processed by Athena and Artemis software packages, respectively60.
Work electrode preparation
The homogeneous catalyst ink was prepared by dispersing 5 mg of the as-synthesized catalyst and 1 mg of carbon black in 700 μL of ethanol and 300 μL of deionized water. After sonication for 2 h, the mixture was supplemented with 30 µL of a 5 wt% Nafion solution and further sonicated for 1 h to ensure uniform dispersion. Subsequently, 10 µL (for ORR) or 5 µL (for OER) of the resulting catalyst ink was evenly dropped onto a finely polished glassy carbon (GC) electrode with a 5 mm diameter to afford a mass loading of about 0.25 mg cm⁻² or 0.125 mg cm⁻², respectively. The Pt/C (20 wt%) and IrO2 electrodes were prepared using the same procedure.
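The quoted mass loadings follow from the ink recipe and the electrode geometry. The check below is a rough sketch under two stated assumptions: the loading counts only the 5 mg of catalyst (not the carbon black or Nafion), and the ink volume is the simple sum of the three liquid volumes (1030 µL). The function name is illustrative.

```python
import math


def mass_loading(catalyst_mg, ink_volume_ul, drop_volume_ul, electrode_diameter_mm):
    """Return the catalyst loading on a disk electrode in mg cm^-2."""
    radius_cm = electrode_diameter_mm / 20.0  # mm diameter -> cm radius
    area_cm2 = math.pi * radius_cm ** 2
    deposited_mg = catalyst_mg * drop_volume_ul / ink_volume_ul
    return deposited_mg / area_cm2


INK_UL = 700 + 300 + 30  # ethanol + water + Nafion solution (ASSUMPTION: volumes are additive)
orr_loading = mass_loading(5.0, INK_UL, 10.0, 5.0)
oer_loading = mass_loading(5.0, INK_UL, 5.0, 5.0)
print(f"ORR: {orr_loading:.3f} mg cm^-2, OER: {oer_loading:.3f} mg cm^-2")
```

Under these assumptions the loadings come out at roughly 0.247 and 0.124 mg cm⁻², consistent with the stated values of about 0.25 and 0.125 mg cm⁻².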
Electrochemical measurement
The electrochemical performance was assessed using a three-electrode electrochemical cell at room temperature (25 °C), which included a carbon rod counter electrode, a saturated calomel reference electrode (SCE), and a catalyst-loaded GC working electrode. All potentials reported in this study were calibrated to the RHE reference scale using the equation: E (vs. RHE) = E (vs. SCE) + 0.2438 V + 0.059 × pH. The electrochemical data were collected using an electrochemical workstation (IVIUM, CompactStat.e20250) coupled with a rotating system. The electrolytes used for ORR and OER were 0.1 M KOH (pH = 12.9 ± 0.1) and 1.0 M KOH (pH = 13.9 ± 0.1), respectively. The electrolyte was freshly prepared prior to each electrochemical test by weighing a quantitative amount of KOH into a volumetric flask and diluting it to 0.1 M or 1 M. The ORR activity was evaluated through linear sweep voltammetry in O2-saturated KOH electrolyte at a scan rate of 5 mV s⁻¹. The catalyst ink was drop-cast onto a GC electrode, which was subsequently mounted on a rotating disk electrode (RDE) with a rotation speed of 1600 rpm. The OER performance was characterized using cyclic voltammetry at a 5 mV s⁻¹ scan rate. An iR-correction (85%) was implemented to compensate for the voltage drop between the reference and working electrodes. Electrochemical impedance spectroscopy (EIS) measurements were conducted in a frequency range of 10⁵ to 0.01 Hz with a sinusoidal perturbation of 10 mV at 0.94 V vs. RHE and 1.56 V vs. RHE for ORR and OER, respectively. The resistance values of the catalysts represent the average electron transfer resistance obtained from multiple EIS tests. The overpotential values were obtained from triplicate reproducibility tests.
The ORR stability was evaluated by the accelerated stress test (AST) in the potential range of 0.6 ~ 1.0 V vs. RHE with the scan rate at 50 mV s−1 for continuous 10,000 cycles in oxygen-saturated 0.1 M KOH solution. The OER stability was investigated using chronoamperometry measurements at a constant current density of 10 mA cm−2.
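The potential calibration and iR-correction described in this section can be expressed as two small helper functions. This is a minimal sketch with hypothetical function names: the RHE conversion uses the equation given above, while the iR-compensation formula E − f·i·R, with anodic current taken as positive, is a common convention assumed here rather than stated in the text.

```python
def sce_to_rhe(e_sce_v, ph):
    """Convert a potential from the SCE scale to the RHE scale (volts)."""
    return e_sce_v + 0.2438 + 0.059 * ph


def ir_compensate(e_v, current_a, resistance_ohm, fraction=0.85):
    """Subtract a fraction of the ohmic drop; ASSUMPTION: anodic current is positive."""
    return e_v - fraction * current_a * resistance_ohm


# 0 V vs. SCE in the two electrolytes used for ORR (pH 12.9) and OER (pH 13.9).
print(f"{sce_to_rhe(0.0, 12.9):.4f} V vs. RHE")
print(f"{sce_to_rhe(0.0, 13.9):.4f} V vs. RHE")

# PLACEHOLDER current and resistance values for illustration only.
print(f"{ir_compensate(1.60, 0.002, 10.0):.3f} V after 85% iR-compensation")
```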
Data availability
The data that support the findings of this study are available within the paper and its Supplementary Information. The atomic coordinates of the optimized model for electronic structure calculations are provided as a separate Supplementary Data 1. Source data are provided with this paper.
Code availability
The Python code written to perform PFESS method and train universal descriptors has been open-sourced under the Apache 2.0 license and is accessible via our GitHub repository at https://github.com/TJU-ECAT-AI/PFESS. All code was archived on Zenodo (https://doi.org/10.5281/zenodo.13169808)61.
References
Seh, Z. W. et al. Combining theory and experiment in electrocatalysis: insights into materials design. Science 355, eaad4998 (2017).
Greeley, J. et al. Alloys of platinum and early transition metals as oxygen reduction electrocatalysts. Nat. Chem. 1, 552–556 (2009).
Karmodak, N. & Nørskov, J. K. Activity and stability of single- and di-atom catalysts for the O2 reduction reaction. Angew. Chem. Int. Ed. 62, e202311113 (2023).
Zhong, M. et al. Accelerated discovery of CO2 electrocatalysts using active machine learning. Nature 581, 178–183 (2020).
Tran, K. & Ulissi, Z. W. Active learning across intermetallics to guide discovery of electrocatalysts for CO2 reduction and H2 evolution. Nat. Catal. 1, 696–703 (2018).
Ji, Y., Du, J. & Chen, A. Review on heteroatom doping carbonaceous materials toward electrocatalytic carbon dioxide reduction. Trans. Tianjin Univ. 28, 292–306 (2022).
Singh, A. R. et al. Electrochemical ammonia synthesis—the selectivity challenge. ACS Catal. 7, 706–709 (2016).
Liu, X., Jiao, Y., Zheng, Y., Jaroniec, M. & Qiao, S. Z. Building up a picture of the electrocatalytic nitrogen reduction activity of transition metal single-atom catalysts. J. Am. Chem. Soc. 141, 9664–9672 (2019).
Sun, Y. et al. Covalency competition dominates the water oxidation structure–activity relationship on spinel oxides. Nat. Catal. 3, 554–563 (2020).
Liu, X. et al. Recent advances in the comprehension and regulation of lattice oxygen oxidation mechanism in oxygen evolution reaction. Trans. Tianjin Univ. 29, 247–253 (2023).
Li, R. & Wang, D. Superiority of dual‐atom catalysts in electrocatalysis: one step further than single‐atom catalysts. Adv. Energy Mater. 12, 2103564 (2022).
Chang, X. et al. Designing single-site alloy catalysts using a degree-of-isolation descriptor. Nat. Nanotechnol. 18, 611–616 (2023).
Lin, X. et al. High-throughput screening of electrocatalysts for nitrogen reduction reactions accelerated by interpretable intrinsic descriptor. Angew. Chem. Int. Ed. 62, e202300122 (2023).
Fang, C. et al. Synergy of dual-atom catalysts deviated from the scaling relationship for oxygen evolution reaction. Nat. Commun. 14, 4449 (2023).
Kaiser, S. K. et al. Performance descriptors of nanostructured metal catalysts for acetylene hydrochlorination. Nat. Nanotechnol. 17, 606–612 (2022).
Li, D., Xu, H., Zhu, J. & Cao, D. Fast identification of the stability of atomically dispersed bi-atom catalysts using a structure descriptor-based model. J. Mater. Chem. A 10, 1451–1462 (2022).
Han, Z. K. et al. Single-atom alloy catalysts designed by first-principles calculations and artificial intelligence. Nat. Commun. 12, 1833 (2021).
Yuan, H., Li, Z., Zeng, X. C. & Yang, J. Descriptor-based design principle for two-dimensional single-atom catalysts: carbon dioxide electroreduction. J. Phys. Chem. Lett. 11, 3481–3487 (2020).
Gao, W. et al. Determining the adsorption energies of small molecules with the intrinsic properties of adsorbates and substrates. Nat. Commun. 11, 1196 (2020).
Xu, H., Cheng, D., Cao, D. & Zeng, X. C. A universal principle for a rational design of single-atom electrocatalysts. Nat. Catal. 1, 339–348 (2018).
Ren, C. et al. A universal descriptor for complicated interfacial effects on electrochemical reduction reactions. J. Am. Chem. Soc. 144, 12874–12883 (2022).
Esterhuizen, J. A., Goldsmith, B. R. & Linic, S. Interpretable machine learning for knowledge generation in heterogeneous catalysis. Nat. Catal. 5, 175–184 (2022).
Wang, Y., Wagner, N. & Rondinelli, J. M. Symbolic regression in materials science. MRS Commun. 9, 793–805 (2019).
Ouyang, R., Curtarolo, S., Ahmetcik, E., Scheffler, M. & Ghiringhelli, L. M. SISSO: a compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates. Phys. Rev. Mater. 2, 083802 (2018).
O’Connor, N. J., Jonayat, A. S. M., Janik, M. J. & Senftle, T. P. Interaction trends between single metal atoms and oxide supports identified with density functional theory and statistical learning. Nat. Catal. 1, 531–539 (2018).
Weng, B. et al. Simple descriptor derived from symbolic regression accelerating the discovery of new perovskite catalysts. Nat. Commun. 11, 3513 (2020).
Andersen, M., Levchenko, S. V., Scheffler, M. & Reuter, K. Beyond scaling relations for the description of catalytic materials. ACS Catal. 9, 2752–2759 (2019).
Kitchin, J. R., Nørskov, J. K., Barteau, M. A. & Chen, J. G. Role of strain and ligand effects in the modification of the electronic and chemical properties of bimetallic surfaces. Phys. Rev. Lett. 93, 156801 (2004).
Abild-Pedersen, F. et al. Scaling properties of adsorption energies for hydrogen-containing molecules on transition-metal surfaces. Phys. Rev. Lett. 99, 016105 (2007).
Jiao, S., Fu, X. & Huang, H. Descriptors for the evaluation of electrocatalytic reactions: d‐band theory and beyond. Adv. Funct. Mater. 32, 2107651 (2021).
Vojvodic, A., Nørskov, J. K. & Abild-Pedersen, F. Electronic structure effects in transition metal surface chemistry. Top. Catal. 57, 25–32 (2013).
Xin, H., Vojvodic, A., Voss, J., Nørskov, J. K. & Abild-Pedersen, F. Effects of d-band shape on the surface reactivity of transition-metal alloys. Phys. Rev. B 89, 115114 (2014).
Fu, Z., Yang, B. & Wu, R. Understanding the activity of single-atom catalysis from frontier orbitals. Phys. Rev. Lett. 125, 156001 (2020).
Li, Q., Yan, G. & Vlachos, D. G. Theoretical insights into H2 activation over anatase TiO2 supported metal adatoms. ACS Catal. 14, 886–896 (2024).
Yang, P., Li, J., Vlachos, D. G. & Caratzoulas, S. Tuning active site flexibility by defect engineering of graphene ribbon edge-hosted Fe-N3 sites. Angew. Chem. Int. Ed. 63, e202311174 (2024).
Yan, L. et al. Atomically precise electrocatalysts for oxygen reduction reaction. Chem 9, 280–342 (2023).
Ghiringhelli, L. M., Vybiral, J., Levchenko, S. V., Draxl, C. & Scheffler, M. Big data of materials science: critical role of the descriptor. Phys. Rev. Lett. 114, 105503 (2015).
Ghiringhelli, L. M. et al. Learning physical descriptors for materials science by compressed sensing. N. J. Phys. 19, 023017 (2017).
Wygant, B. R., Kawashima, K. & Mullins, C. B. Catalyst or precatalyst? The effect of oxidation on transition metal carbide, pnictide, and chalcogenide oxygen evolution catalysts. ACS Energy Lett. 3, 2956–2966 (2018).
Ha, M. et al. Tuning metal single atoms embedded in NxCy moieties toward high-performance electrocatalysis. Energy Environ. Sci. 14, 3455–3468 (2021).
Finzel, J. et al. Limits of detection for EXAFS characterization of heterogeneous single-atom catalysts. ACS Catal. 13, 6462–6473 (2023).
Guo, X. et al. Tackling the activity and selectivity challenges of electrocatalysts toward the nitrogen reduction reaction via atomically dispersed biatom catalysts. J. Am. Chem. Soc. 142, 5709–5721 (2020).
Kresse, G. & Furthmüller, J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comp. Mater. Sci. 6, 15–50 (1996).
Kresse, G. & Joubert, D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys. Rev. B 59, 1758–1775 (1999).
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
Grimme, S. A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu. J. Chem. Phys. 132, 154104 (2010).
Hjorth Larsen, A. et al. The atomic simulation environment-a Python library for working with atoms. J. Phys. Condens. Matter 29, 273002 (2017).
Ong, S. P. et al. Python Materials Genomics (pymatgen): a robust, open-source python library for materials analysis. Comp. Mater. Sci. 68, 314–319 (2013).
Wang, V., Xu, N., Liu, J.-C., Tang, G. & Geng, W.-T. VASPKIT: a user-friendly interface facilitating high-throughput computing and analysis using VASP code. Comput. Phys. Commun. 267, 108033 (2021).
Bader, R. F. W. A quantum theory of molecular structure and its applications. Chem. Rev. 91, 893–928 (1991).
Maintz, S., Deringer, V. L., Tchougreeff, A. L. & Dronskowski, R. LOBSTER: a tool to extract chemical bonding from plane-wave based DFT. J. Comput. Chem. 37, 1030–1035 (2016).
Henkelman, G., Uberuaga, B. P. & Jónsson, H. A climbing image nudged elastic band method for finding saddle points and minimum energy paths. J. Chem. Phys. 113, 9901–9904 (2000).
Nørskov, J. K. et al. Origin of the overpotential for oxygen reduction at a fuel-cell cathode. J. Phys. Chem. B 108, 17886–17892 (2004).
Mathew, K., Kolluru, V. S. C., Mula, S., Steinmann, S. N. & Hennig, R. G. Implicit self-consistent electrolyte model in plane-wave density-functional theory. J. Chem. Phys. 151, 234101 (2019).
Clementi, E. & Raimondi, D. L. Atomic screening constants from SCF functions. J. Chem. Phys. 38, 2686–2689 (1963).
Greeley, J. & Nørskov, J. K. Electrochemical dissolution of surface alloys in acids: thermodynamic trends from first-principles calculations. Electrochim. Acta 52, 5829–5836 (2007).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Stephens, T. gplearn. https://gplearn.readthedocs.io/en/latest/intro.html (2016).
Zhao, X., Fang, R., Wang, F., Kong, X. & Li, Y. Dual-metal single atoms with dual coordination for the domino synthesis of natural flavones. JACS Au 3, 185–194 (2023).
Ravel, B. & Newville, M. ATHENA, ARTEMIS, HEPHAESTUS: data analysis for X-ray absorption spectroscopy using IFEFFIT. J. Synchrotron Rad. 12, 537–541 (2005).
Lin, X. et al. Machine learning-assisted dual-atom sites design with interpretable descriptors unifying electrocatalytic reactions. Zenodo. https://doi.org/10.5281/zenodo.13169808 (2024).
Acknowledgements
The authors acknowledge the financial support from the National Key R&D Program of China (2021YFA1500704), the National Natural Science Foundation of China (Nos. 22121004, 22250008, U22A20409), the Haihe Laboratory of Sustainable Chemical Transformations, the Program of Introducing Talents of Discipline to Universities (BP0618007) and the XPLORER PRIZE. The authors also appreciate the generous computing resources provided by the High Performance Computing Center of Tianjin University and Shanghai Synchrotron Radiation Facility (SSRF) for their assistance in conducting XAS measurements.
Author information
Contributions
J.G. and Z.-J.Z. coordinated and supervised the project. J.G., Z.-J.Z., P.Z. and X.L. conceived the project idea. X.L. completed the DFT calculations, developed the feature space and made the symbolic regression analysis. S.W. and S.Z. contributed to the DFT calculations. X.L., X.D. and P.Z. synthesized catalyst samples, carried out characterizations and electrochemical measurements. W.L. completed the XAS measurements. C.P. participated in the discussion of the characterizations and catalytic behavior. All authors wrote and revised this manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Qiang Li, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Lin, X., Du, X., Wu, S. et al. Machine learning-assisted dual-atom sites design with interpretable descriptors unifying electrocatalytic reactions. Nat Commun 15, 8169 (2024). https://doi.org/10.1038/s41467-024-52519-8
DOI: https://doi.org/10.1038/s41467-024-52519-8