Active learning across intermetallics to guide discovery of electrocatalysts for CO2 reduction and H2 evolution


The electrochemical reduction of CO2 and H2 evolution from water can be used to store renewable energy that is produced intermittently. Scale-up of these reactions requires the discovery of effective electrocatalysts, but the electrocatalyst search space is too large to explore exhaustively. Here we present a theoretical, fully automated screening method that uses a combination of machine learning and optimization to guide density functional theory calculations, which are then used to predict electrocatalyst performance. We demonstrate the feasibility of this method by screening various alloys of 31 different elements, and thereby perform a screening that encompasses 50% of the d-block elements and 33% of the p-block elements. This method has thus far identified 131 candidate surfaces across 54 alloys for CO2 reduction and 258 surfaces across 102 alloys for H2 evolution. We use qualitative analyses to prioritize the top candidates for experimental validation.


Global energy demands have increased over time and are likely to continue increasing1,2. Meeting these demands using only fossil fuels may not be possible because doing so may negatively impact the world’s environments, climate and biodiversity3,4,5,6. Alternative methods for energy production and storage include solar fuels, which are synthetic fuels created with photovoltaic energy or with photoelectrochemical cells7,8. Examples include H2 created by the electrochemical reduction of water, which can be combusted or used in hydrogen fuel cells, and synthetic hydrocarbons created by reducing CO2. Solar fuel production is currently inhibited by a lack of sufficiently active, efficient, selective, stable and low-cost catalysts9.

High-performing catalysts can be discovered using ab initio methods such as density functional theory (DFT) to predict catalyst properties. A common approach is to use DFT to predict thermodynamic energy descriptors that correlate with detailed microkinetic model results or experimental measurements of catalyst activity and selectivity. For the CO2 reduction reaction (CO2RR), the CO adsorption energy is a common descriptor to predict the activity towards hydrocarbon production10. This method of using adsorption energies to predict performance is general and has been applied to many reaction systems11, which include the hydrogen evolution reaction (HER)12. Of course, single descriptors cannot completely describe the electrocatalytic performance of an intermetallic, which requires analyses of surface stability, high coverage thermodynamics and electrochemical kinetics. These analyses take considerable resources, and full theoretical studies cannot keep pace with the accelerating experimental studies of intermetallics. Thus, a screening method to prioritize experimental and theoretical studies is valuable.

Screening large search spaces is non-trivial. Some methods address this by focusing on limited search spaces, such as bimetallic A3B crystals13 or simple cubic lattice intermetallics14. Other approaches include creating databases of electronic structure calculations15,16. Data sets of such a wide scope are necessary to perform robust screening studies, but creation of these databases has required collaboration between dozens of DFT experts. A consequence of this approach is that researchers spend a substantial portion of their time configuring, managing and waiting for DFT calculations, which are inherently time consuming. The personnel cost of configuring and managing DFT calculations has been addressed by creating computer-science-based solutions. Generalized frameworks exist that are able to enumerate surfaces and adsorption sites on arbitrary intermetallic bulk structures17,18. Software exists that manages computationally intensive calculations across multiple computing clusters19. These solutions are vital to perform high-throughput material screenings.

The computational cost of theoretical materials screenings has been partially addressed by the development of machine learning methods to accelerate DFT calculations. Machine learning regressions on DFT data can yield formation energy predictions at a fraction of the computational cost20,21 or accelerate screenings within individual bimetallics22. Neural networks can accelerate nudged elastic band studies of reaction kinetics23 or enable the study of disorder on intermetallic configurations24. One of the shortfalls of many of these approaches is that they require experts to manually prioritize new materials to screen and study. Many also rely on DFT-computed surface features, such as the d-band centre, or predict activity only at the surface level without distinguishing between surface sites14.

Some methods used to accelerate computational screenings may not have been applied to electrocatalyst discovery yet. Surrogate-based optimization25 is a method in which a surrogate model is built to replace a more computationally expensive model, and then an objective function is optimized on the cheaper surrogate model. Another accelerative method is active machine learning—also known as the optimal design of experiments26. With this method, a surrogate model is created from a given data set, and then the model is used to select which data should be obtained next. The selected data are added to the original data set and then used to create an updated surrogate model. The process is repeated iteratively such that the surrogate model is improved continuously. This method of iterative surrogate model screening has already been used in other fields, such as the discovery of light emitting diodes27, drug discovery28 or molecular property prediction29.

We created a workflow that borrows ideas from both surrogate-based optimization and active machine learning. We used this workflow to screen a search space of 1,499 intermetallics for potentially selective catalysts for CO2RR and HER. The workflow identified 54 intermetallics that have surfaces with near-optimal descriptors for CO2RR and 102 intermetallics for HER, and include both already-discovered and undiscovered catalysts. We then used a qualitative, heuristic method to down-select these compounds further, which resulted in a shortened list of 10 intermetallics for CO2RR and 14 intermetallics for HER. A holistic view of the intermetallic design space also allows trends and design rules to be identified that are difficult to see from small case studies. This workflow can be extended easily to other reaction chemistries for which ideal thermodynamic descriptors are known.


Framework construction

Our workflow uses machine learning models to search an arbitrarily large design space of intermetallic crystals and surfaces for near-optimal activity (Fig. 1). Surfaces are searched for ideal CO and H adsorption energies, which are indicative of catalyst performance for CO2 reduction10 and H2 evolution12, respectively. The workflow verifies the adsorption energies of these sites by performing DFT calculations automatically. DFT results are stored in a database, which is used to retrain the machine learning models. This yields a closed feedback loop of machine learning screening, DFT verification, and machine learning retraining that produces a database of DFT results that grows continuously, systematically, and without the need for user interaction. It is important to note that this workflow does not use machine learning to accelerate calculations of user-supplied systems. Instead, it uses machine learning to guide full-accuracy DFT screenings. Thus, we use DFT to perform a surrogate-based optimization, and we use an active learning feedback loop as an optimization guide.
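The closed feedback loop described above can be sketched in a few lines. This is a minimal illustration and not the study's implementation: `expensive_calculation` is a hypothetical stand-in for a DFT run, the nearest-neighbour surrogate is a toy replacement for the regression models, and the purely greedy ranking omits the stochastic selection that the workflow uses.

```python
import random

def expensive_calculation(x):
    """Hypothetical stand-in for a DFT adsorption-energy calculation (eV)."""
    return -1.5 + 2.0 * x  # toy linear landscape, not real chemistry

def train_surrogate(database):
    """Toy surrogate: return the energy of the nearest already-computed point."""
    known = list(database.items())
    def predict(x):
        _, nearest_energy = min(known, key=lambda item: abs(item[0] - x))
        return nearest_energy
    return predict

def active_learning_loop(candidates, target, n_rounds, batch_size, seed=0):
    rng = random.Random(seed)
    # Seed the database with two randomly chosen calculations.
    database = {x: expensive_calculation(x) for x in rng.sample(candidates, 2)}
    for _ in range(n_rounds):
        predict = train_surrogate(database)  # retrain on all data so far
        pending = [x for x in candidates if x not in database]
        if not pending:
            break
        # Rank unexplored candidates by predicted distance from the target.
        pending.sort(key=lambda x: abs(predict(x) - target))
        for x in pending[:batch_size]:
            database[x] = expensive_calculation(x)  # verify, then store
    return database

candidates = [i / 50 for i in range(51)]
database = active_learning_loop(candidates, target=-0.67, n_rounds=5, batch_size=3)
```

Each round retrains on the whole database before choosing the next batch, so the stored results grow without user input.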

Fig. 1: Workflow for automating theoretical materials discovery.

a,b, The experimental workflow for finding catalysts (a) is accelerated by a DFT workflow for screening the catalysts ab initio (b). c, Conventional workflows (blue) require scientific intuition to select candidates for DFT screenings. d, Our workflow (red) uses machine learning (ML) to select candidates systematically and automatically. The red text outlines the framework that this study used to perform active machine-learning and surrogate-based optimization.

Enumeration of search space

To generate a search space of adsorption sites, we obtained 1,499 different intermetallic combinations from the Materials Project30 across 31 different elements (Supplementary Fig. 1). We enumerated the surfaces on each intermetallic using pymatgen17 and then used Delaunay triangulation18 to enumerate all the adsorption sites on each surface (Methods section gives additional details). This search space encompassed 50% of the d-block elements and 33% of the p-block elements. Every possible adsorption site on each surface was considered for potential activity by our machine learning models.

Active learning optimization of catalyst descriptors

We enabled the machine learning of catalyst descriptors by developing a fingerprinting method to represent an intermetallic adsorption site numerically (Fig. 2). For each site, the types of elements coordinated with the adsorbate were tabulated. Each element type was described with a vector of four numbers: the atomic number of the element (Z), the Pauling electronegativity of the element (χ), the number of atoms of the element coordinated with the adsorbate (CN), as determined by Voronoi tessellation performed by pymatgen17, and the median adsorption energy between the adsorbate and the pure element (∆E). ∆E values were calculated from our own database of adsorption energies, and χ values were obtained from the Mendeleev database31. We repeated this vector creation process on the second shell of atoms bonded to the coordination atoms. One issue with this method is that it yields a variable number of features. We addressed this issue using a method found in literature32 (Methods gives additional details). Note also that the illustration in Fig. 2 is a simplification. The real fingerprint vector has four items per element and four elements per shell, which yields a total of 32 items per fingerprint vector.
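As a rough sketch of the tabulation step, the snippet below turns a list of coordinating element symbols into one (Z, χ, CN, ∆E) vector per element type. The `ELEMENT_DATA` table and the function name are hypothetical: the atomic numbers and electronegativities are standard values, but the median adsorption energies are placeholders and not results from this study. Sorting by element symbol is an arbitrary ordering chosen only for this illustration.

```python
from collections import Counter

# Hypothetical lookup: symbol -> (Z, Pauling electronegativity, median dE in eV).
# The energies are placeholders for illustration only.
ELEMENT_DATA = {
    "Cu": (29, 1.90, -0.5),
    "Al": (13, 1.61, -0.3),
}

def shell_vectors(neighbour_symbols):
    """Return one (Z, chi, CN, median dE) tuple per element type in a shell."""
    counts = Counter(neighbour_symbols)
    vectors = []
    for symbol in sorted(counts):  # arbitrary but deterministic ordering
        z, chi, median_energy = ELEMENT_DATA[symbol]
        vectors.append((z, chi, counts[symbol], median_energy))
    return vectors

bridge_cu_al = shell_vectors(["Cu", "Al"])  # two element types -> two vectors
bridge_cu_cu = shell_vectors(["Cu", "Cu"])  # one element type -> one vector
```

The two bridge sites produce different numbers of vectors, which is the variable-length issue noted in the text.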

Fig. 2: Fingerprint of coordination site.

Adsorption sites are reduced to numerical representations, or fingerprints, and these fingerprints are used as model features by TPOT35 to predict ∆ECO.

These fingerprints were chosen using a combination of intuition, trial and error and success in other surrogate modelling studies. To account for bulk steric effects, atomic radii have been used as features32. Atomic radii may change depending on the local environment though, so elemental periods and groups may be appropriate substitutes for atomic radius. However, initial heuristic investigations showed a negligible difference in performance between using the period or group and using the atomic number, so we used the atomic number because of its relatively small dimensionality. To account for electronic affinity effects, Pauling electronegativity has been shown to be a successful feature14. To account for both sterics and environmental electronic effects, the coordination number has been shown to be a successful feature33. To improve the predictive capability, crude estimates of properties have been shown to be successful34. In this setting, the crude estimate of adsorption energy on a specific site is ∆E.

An automated machine learning package, TPOT35, was then used to select a machine learning regression method to predict adsorption energies from the site fingerprints. TPOT’s recommended modelling pipeline changed regularly due to the stochastic nature of TPOT and the constantly changing training data set. To aid prediction, we used a preprocessing pipeline to shift and scale each feature across all data points so that the averages and variances for each feature were zero and one, respectively. We also performed a principal component analysis on the fingerprints to orthogonalize the feature space. Supplementary Note 1 outlines the TPOT settings used, and Supplementary Methods outlines other regression techniques and feature representations we tested during development.
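The shift-and-scale step can be illustrated with the standard library alone. This is a sketch under stated assumptions: it uses the population standard deviation, leaves constant features at zero, and omits the principal component analysis, which in practice would come from a numerical library. The function name is illustrative and is not taken from the study's code.

```python
from statistics import fmean, pstdev

def standardize(rows):
    """Shift and scale each feature (column) to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [fmean(column) for column in columns]
    # A constant column has zero spread; divide by 1.0 so it maps to zeros.
    spreads = [pstdev(column) or 1.0 for column in columns]
    return [
        [(value - mean) / spread for value, mean, spread in zip(row, means, spreads)]
        for row in rows
    ]

scaled = standardize([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
```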

The regression methods illustrated in Fig. 2 were combined with all the available DFT data to train and update surrogate models daily. These models were used to aid in selecting adsorption sites for DFT calculation. First, the trained models were used to estimate ∆ECO and ∆EH for all the adsorption sites that we enumerated. These estimates were pooled with the explicit DFT results stored in the database, and machine-learning-estimated adsorption energies were removed if the DFT result existed for that exact site and adsorbate, to leave only one prediction or estimate per adsorption site. Then we defined the strongest binding energy on each surface as the low-coverage adsorption energy of that surface. The surfaces with low-coverage adsorption energies predicted to be near optimal (as established in Methods) were selected for DFT calculation with a Gaussian probability defined by the distance of the predicted site from the optimal values with a standard deviation of 0.2 eV.
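One possible reading of this selection rule is sketched below. The low-coverage energy of a surface is taken as its most negative site energy, and each surface is given an unnormalized Gaussian weight with a standard deviation of 0.2 eV around the optimum. The function names, the example surfaces and their energies are hypothetical, and sampling with replacement through `random.choices` is a simplification of this sketch.

```python
import math
import random

def low_coverage_energy(site_energies):
    """Strongest (most negative) binding energy among a surface's sites, in eV."""
    return min(site_energies)

def gaussian_weight(energy, optimum, sigma=0.2):
    """Unnormalized Gaussian weight centred on the optimal descriptor value."""
    return math.exp(-0.5 * ((energy - optimum) / sigma) ** 2)

def select_surfaces(surfaces, optimum, n_select, sigma=0.2, seed=None):
    """Draw surfaces with probability proportional to their Gaussian weight.

    Sampling is with replacement; weights underflow to zero for surfaces that
    lie many standard deviations away from the optimum.
    """
    rng = random.Random(seed)
    names = list(surfaces)
    weights = [
        gaussian_weight(low_coverage_energy(surfaces[name]), optimum, sigma)
        for name in names
    ]
    return rng.choices(names, weights=weights, k=n_select)

# Hypothetical predicted site energies (eV) for three surfaces.
surfaces = {"A(111)": [-0.70, -0.40], "B(100)": [-1.40, -0.90], "C(211)": [-0.20]}
picked = select_surfaces(surfaces, optimum=-0.67, n_select=5, seed=1)
```

Surfaces near the optimum dominate the draw, while the Gaussian tails leave some probability for surfaces that the surrogate model would otherwise not suggest.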

The targeting of specific adsorption energies with near-optimal values allows us to exploit the knowledge that the surrogate models have learned thus far, and the addition of Gaussian noise gives a heuristic method that allows us to explore search spaces that the surrogate model normally would not suggest. Approximately 80% of the calculations were dedicated to this descriptor-optimization goal. The remaining 20% of our resources were dedicated to simulating all of the sites on the surfaces for which the low-coverage adsorption energies were closest to the top of the volcano—that is, nearest to a ∆ECO of –0.67 eV and a ∆EH of –0.27 eV for CO2RR and HER, respectively. This mitigated the chances of finding a false minimum adsorption energy on a surface. The regression and surrogate model prediction was performed once per day, and the Gaussian selection of DFT calculations was performed four times per day. In total, 42,785 DFT calculations of adsorption energies were completed by this study at a rate of approximately 200–300 calculations per day. Methods gives details regarding these DFT calculations.

Performance of active learning optimization

Due to the iterative nature of the surrogate modelling, we calculated prediction errors via evaluation on a rolling forecasting origin36. Specifically, we retrospectively trained a surrogate model on the first 200 data points that we obtained and then calculated the prediction errors between the next 200 data points and this first model’s predictions of these points. We then trained a second model on the first 400 points and calculated the prediction errors between the next 200 data points and this second model’s predictions of these points. We performed this iteratively until we obtained a single prediction error for every data point, excluding the first 200 points. All the prediction errors are plotted against time in Fig. 3a and Supplementary Fig. 2 for ∆ECO and ∆EH, respectively, along with a record of the number of near-optimal surfaces identified over time. The root-mean-squared error, mean absolute error (MAE) and median absolute deviation across all of the time-dependent ∆ECO predictions were 0.46, 0.29 and 0.17 eV, respectively. The root-mean-squared error, MAE and median absolute deviation of the ∆EH predictions were 0.41, 0.24 and 0.16 eV, respectively. Note that we chose 200 as the step size because our framework was able to perform at least 200 calculations per day, and so a step size of 200 served as a proxy for surrogate model updates.
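The rolling forecasting origin evaluation can be sketched as follows. The mean-value model `fit_mean` is a toy stand-in for the surrogate model, and the median absolute deviation is computed here as the median of the absolute deviations from the median error, which is an assumption about the definition used.

```python
from statistics import fmean, median

def fit_mean(xs, ys):
    """Toy model: always predict the mean of the training targets."""
    mean_y = fmean(ys)
    return lambda x: mean_y

def rolling_origin_errors(xs, ys, fit, step=200):
    """Train on the first k*step points, then predict the next `step` points."""
    errors = []
    for origin in range(step, len(xs), step):
        predict = fit(xs[:origin], ys[:origin])
        window = zip(xs[origin:origin + step], ys[origin:origin + step])
        errors.extend(predict(x) - y for x, y in window)
    return errors

def summarize(errors):
    """Return (RMSE, MAE, median absolute deviation) of the signed errors."""
    rmse = fmean(e * e for e in errors) ** 0.5
    mae = fmean(abs(e) for e in errors)
    centre = median(errors)
    mad = median(abs(e - centre) for e in errors)
    return rmse, mae, mad

# Tiny chronological data set with a step of 2 instead of 200.
errors = rolling_origin_errors(list(range(6)), [0.0, 0.0, 1.0, 1.0, 2.0, 2.0], fit_mean, step=2)
rmse, mae, mad = summarize(errors)
```

Every point after the first `step` points receives exactly one error, and that error always comes from a model that never saw the point.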

Fig. 3: Identification of surfaces with near-optimal ∆ECO values for CO2RR.

a, The number of near-optimal surfaces identified (top) and violin plots of the absolute error in predicting ∆ECO (bottom) as a function of time. The outer shells of the violins bound all data, narrow vertical lines bound 95% of the data, thick vertical lines bound 50% of the data and white dots represent medians. Apparently missing months are not shown because no data were collected during those months. b, The normalized distribution of the low-coverage, DFT-calculated CO adsorption energies of all of the DFT-analysed surfaces in this study. The subdistribution for copper is also illustrated (orange). Dashed lines indicate the 0.1 eV range around the optimal ∆ECO value of –0.67 eV. c, Surfaces (131) for which low-coverage CO adsorption energies were calculated and verified with DFT. d, Surfaces (844) for which low-coverage CO adsorption energies were calculated only by the machine learning models.

The profile of the prediction errors over time provides us with practical insights into our framework. Between November 2017 and January 2018, fellow users of our computing clusters reduced their usage during the holiday season. This allowed our automated framework to effectively consume their unused capacity, and thereby increase our calculation throughput. This temporary increase in throughput is the probable cause of the relatively high rate of surface identification, and the improvement in prediction errors at this time may have been caused by an improved sampling of the search space. Then, in February 2018, we expanded the number of elements in our search space from ~20 elements to the 31 elements that we are searching currently. We seeded this new search space by manually queueing calculations with the new elements. This expansion in search space and subsequent seeding may have caused the decline in both predictive performance and identification rate. In May 2018, we refined our zero-point energy, entropic and solvation correction calculations to what is now shown in Supplementary Methods. This refinement may have caused the increase in the number of surfaces identified during that month. Thus, the trends in the prediction errors and identification rates are confounded with both the methods we used and various managerial events, such as changes in throughput capacity, changes in search space or changes in optimization targets.

Performance metrics and plots that are typically used to judge static surrogate models, such as train/test errors, parity plots and learning curves, are shown in Supplementary Note 2. Notably, the errors calculated via the rolling forecasting origin are generally larger than the errors calculated from the classical train/test split and learning curve methods. This is because classical train/test splitting methods allow the training sets to share the same sampling space as the test sets. Our workflow often searches unexplored sampling spaces though, which are more difficult to predict. Thus, the rolling forecasting origin method to evaluate error is more representative of our use scenario because it restricts the models from seeing data that they would not normally see in practice. We hypothesize that the rolling forecasting origin errors and the train/test errors would converge if we sampled the search space sufficiently.

Discovering potential intermetallic catalysts for CO2 reduction

This framework discovered 131 different intermetallic facets with near-optimal ∆ECO as confirmed by DFT (Fig. 3 and Supplementary Table 1). These surfaces correspond to 54 different intermetallic combinations and are recommended for experimental verification of activity. Some of these intermetallics have already been investigated. For example, Cu/Sn blends were shown to reduce CO2 to either CO or formate at high Faradaic efficiencies37, and Ni/Ga intermetallics were shown to be active for CO2 reduction22,38. Pd/Au bimetallics were shown to be active for CO2 reduction to C1–C5 products39, and the machine learning results in this study suggest that single Pd atoms surrounded by Au atoms may be the most likely bimetallic active site, with a ∆ECO of about –0.8 eV, in contrast to the hypothesized Pd-rich Pd/Au site. Cu/Al bimetallics, which have not been studied previously, also show promising experimental results in ongoing work, the results of which will be published in due course.

Discovering trends in CO2 reduction

In addition to simply finding potential active surfaces, the data that this study generated can be used to gain insight into the chemistry of CO adsorption. Figure 4 illustrates the fraction of enumerated surfaces that have near-optimal values for ∆ECO for various bimetallic combinations. Elements in Fig. 4 are rank ordered by the average pure-element ∆ECO values as calculated by this study with DFT. Forming bimetallics with two elements that both have a stronger binding than Ge generally led to inactive materials, but silicon broke this trend, which suggests that it has destabilizing properties when alloyed into an intermetallic.

Fig. 4: CO2 reduction activity map for bimetallics.

Visualization of two-component intermetallics for which the surfaces have low-coverage CO adsorption energy (∆ECO) values inside the range of (–0.77, –0.57) eV. White shading indicates an absence of any enumerated surfaces; grey shading indicates that all the ∆ECO values are outside the range of (–0.77, –0.57) eV and coloured shading indicates possible activity. The ∆ECO values used to create the upper half of this figure were calculated by DFT, and the values used to create the bottom half were calculated by the surrogate machine learning model. Copper-containing intermetallics are outlined in red because copper is the element for which the monometallic adsorption energy is nearest to the optimal value of –0.67 eV.

Other trends can also be found when different elemental pairings are analysed. A number of strong–weak elemental pairings yield possibly active surfaces. These include combinations of strong-binding elements, such as Pd, Pt, Ni or Os, with weak-binding elements, such as Al, Sn, Ga or Sb, which shows that the strong/weak Ni/Ga motif found in previous work is more general than previously known22. Interestingly, combining two weak-binding elements can also lead to possibly active surfaces. For example, the strongest-binding Ga surfaces and Au surfaces have ∆ECO values of approximately –0.44 eV and –0.53 eV, respectively, based on a combination of DFT and machine learning predictions. However, a Ga–Ga bridge site on AuGa2(100) leads to a near-ideal binding energy of –0.57 eV.

Although the volume of data generated by this study is arguably intractable to study in detail, the size of the data enables certain methods of data analysis. For example, we are now able to assess potential intermetallic performance based solely on the number and distribution of potential active sites instead of on the activity of one particular surface at one particular alloying ratio. Figure 5 illustrates this point by showing all 19,644 sites for which we performed DFT calculations of ∆ECO. The x and y axes in this figure are a reduced two-dimensional feature space40 (Supplementary Methods gives details). Clusters of points in this reduced space share similarities in site coordination and elemental combinations. Sites are coloured by their ∆ECO so that regions with different binding strengths can be identified.

Fig. 5: Active site motif analysis.

Latent space visualization using t-SNE40 on all the adsorption sites simulated with DFT. Proximity in this reduced space indicates a similarity in the structures of the adsorption sites. Black circles, sites that bind too strongly; dark purple circles, sites with optimal binding; purple circles, sites that bind too weakly. Adsorption energy values are in units of eV. Stronger binding sites are overlaid on top of weaker binding sites to indicate that the stronger sites have a greater influence on activity than the weaker sites due to their greater thermodynamic stability. We labelled dark purple clusters/materials because we expect them to be better candidates for further investigation and experimentation.

Labelled clusters that are nearly uniformly dark purple in Fig. 5 are robust combinations. These are not the only possible active alloys; instead, they represent combinations that are most likely to yield a higher fraction of active adsorption sites than other alloy combinations investigated thus far. This is especially important when matching theory with polycrystalline experiments in which the precise active surface may not be known a priori or in which there is little control over the surfaces created. Clusters that contain weak binding sites alongside active binding sites may still be active as the CO will prefer the stronger binding, more active sites. The presence of strong binding sites is more likely to hide an active site on a surface and should be avoided. Within a cluster, the embedding shows how active site coordination or alloying ratios may affect the activity. For example, the bottom-right panel in Fig. 5 shows that Si sites in CuSi alloys tend to bind too weakly, and thereby suggests that higher ratios of Cu to Si may improve the activity.

The machine learning model in this work also provides activity estimates of surfaces without explicit DFT calculations. The model predicts that approximately 81% of the surfaces have non-ideal ∆ECO values, defined as outside of the range (–0.8 – MAE, –0.5 + MAE) eV. This considerably narrows the potential experimental search space. Likewise, the search space for bimetallic combinations can be reduced. If at least one surface must be predicted to be near optimal for the CO2RR, the search space can be reduced by 72%. If at least 10% of surfaces must be active (similar to the robust determination from Fig. 4), then the search space is reduced by 93%.

Discovering potential catalysts for H2 evolution

The same types of analyses as performed for CO2RR can also be performed for HER. Figure 6a illustrates the t-distributed stochastic neighbour embedding (t-SNE) representation of the 23,141 adsorption sites for which we used DFT to calculate ∆EH. Figure 6b shows the distribution of DFT-calculated ∆EH values, where we found 258 different surfaces with low-coverage ∆EH values within 0.1 eV of the optimal value of –0.27 eV. All of these surfaces are listed in Supplementary Table 2, and the bimetallic map of HER performance is shown in Supplementary Fig. 3. Similar to our analysis for CO2RR, a number of the intermetallics that our screening study identified as having surfaces with near-optimal ∆EH values have already been verified by various literature studies41,42,43.

Fig. 6: Analysis of results for the HER performance.

a, t-SNE40 visualization of all the adsorption sites simulated with DFT. Adsorption energy values are in units of eV. Similar to Fig. 5, stronger binding sites are overlaid on top of weaker binding sites, and dark purple clusters/materials are labelled because we expect them to be better candidates for further investigation and experimentation. b, Normalized distribution of low-coverage ∆EH values calculated by our DFT workflow. Dashed lines indicate the 0.1 eV range around the optimal ∆EH value of –0.27 eV.

Supplementary Fig. 3 shows that, in addition to Pt, there is a band of elements with comparable monometallic adsorption energies that tend to yield intermetallic surfaces with near-optimal ∆EH values: As, Al, Si, Sb, Rh and Pd. Many of these elements also appear in the t-SNE diagram for HER (Fig. 6a), which suggests that these elements warrant further study and experimentation.


We created a framework that produces and stores DFT data continuously and without the need for user intervention. This framework combines task and calculation management software with active machine-learning and surrogate-based optimization to enable the automated, systematic selection and execution of DFT calculations. The framework produced 42,785 adsorption-energy calculations to identify 131 candidate surfaces across 54 intermetallics with a potentially high CO2 reduction activity and 258 candidate surfaces across 102 intermetallics for hydrogen evolution. A number of the candidate surfaces found here have already been validated by literature experiments37,38,39,41,42,43, which suggests that the unstudied candidates found in this screening warrant further investigation. The full list of potential surfaces is shown in Supplementary Tables 1 and 2, and shortened lists of candidate intermetallics are illustrated in Figs. 5 and 6.

Our workflow to generate DFT data offers a combination of benefits that we have not yet seen in other literature frameworks. Our task and calculation management systems reduce the amount of time required to configure and process DFT calculations, our database of DFT results enables holistic analyses across numerous adsorption sites, surfaces and material spaces, and our active machine-learning and/or surrogate-based optimization workflow guides the discovery of candidate catalysts without the need for expert intuition. The flexibility of the framework also allows for expert-assisted guidance, which allows us to use the high-throughput DFT workflow to study specific sites, surfaces or systems if needed. The combination of flexibility, automation and machine learning guidance accelerates the theoretical discovery and study of catalysts for CO2 reduction, H2 evolution or any other chemistry with a descriptor performance scaling relationship.

A shortfall of our workflow is its heavy reliance on descriptor performance relationships, which are used to guide the active learning algorithms. For example, this method will have issues with predicting CO2 reduction activity for materials and surfaces that yield reaction mechanisms in which the activity is independent of ∆ECO. Additionally, this method does not address other important aspects of catalyst performance, such as surface stability or catalyst cost. These issues are acceptable because this framework is used primarily as a tool to screen for candidate catalysts from a relatively large search space and to supplement experts’ intuitions with machine-derived suggestions. Our framework does not replace robust theoretical and experimental studies; it accelerates them by reducing search spaces to more tractable sizes and focusing expensive studies on systems that are more likely to yield interesting results. Future work could still be done to address the issues of diverse reaction mechanisms or multiple aspects of catalyst performance.


Enumerating search space

For each of the 1,499 intermetallic crystals we obtained from the Materials Project30, we used pymatgen17 to enumerate symmetrically distinct facets with Miller indices between –2 and 2. Many intermetallic facets contained asymmetric top and bottom surfaces, and in those cases both surfaces were analysed as well as distinct surfaces that arise from the absolute position of the surface cut. In total, 1,499 crystal structures were considered, which resulted in 17,507 unique surfaces and 1,684,908 unique adsorption sites. Surfaces were enumerated using ideal structures from the Materials Project instead of relaxed structures. This can cause differences in the number of enumerated facets, but it allows the enumeration to be completed without DFT relaxations for every bulk structure.
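Before any symmetry reduction, the candidate Miller indices between –2 and 2 can be listed directly. The sketch below only removes the null index and integer multiples such as (2, 2, 2). Identifying which of the remaining indices are symmetrically distinct requires the crystal's symmetry operations, which is the part handled by pymatgen in the study and is not reproduced here. The function name is illustrative.

```python
from functools import reduce
from itertools import product
from math import gcd

def candidate_miller_indices(max_index=2):
    """List primitive (h, k, l) triples with every index in [-max_index, max_index]."""
    unique = set()
    for hkl in product(range(-max_index, max_index + 1), repeat=3):
        if hkl == (0, 0, 0):
            continue  # (0, 0, 0) does not define a plane
        divisor = reduce(gcd, (abs(index) for index in hkl))
        unique.add(tuple(index // divisor for index in hkl))
    return sorted(unique)

indices = candidate_miller_indices(2)
```

For a highly symmetric crystal most of these triples are equivalent, so the number of distinct facets per crystal is much smaller than the length of this list.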

Addressing models with a variable number of features

One issue with our fingerprinting method is that it yields a variable number of features. For example, two vectors are needed to represent the first shell of a Cu–Al bridge site, but only one vector is needed to represent a Cu–Cu bridge. This issue can be addressed with zero padding, but can be better modelled using a literature method32 to make dummy features to replace features that are not populated naturally. To continue with the previous example, the first shell of the Cu–Al site would be represented by a vector of four numbers for the Cu element and four more numbers for the Al element, but a Cu–Cu site would be represented by four numbers for the Cu element and four dummy features. These dummy features are the average Z, the average χ, the average median adsorption energy of all the elements we studied and a CN value of zero. Using averages of valid feature values reduces the bias induced from these dummy features and the CN value of zero ensures that no valid configuration can be confounded with the dummy features.
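The padding scheme can be sketched as below, assuming four element slots per shell and the (Z, χ, CN, ∆E) ordering used in the main text. The `ELEMENT_DATA` table is hypothetical: it lists only two elements, and its median adsorption energies are placeholders, whereas the study averages over all of the elements it considered.

```python
from statistics import fmean

# Hypothetical lookup: symbol -> (Z, Pauling electronegativity, median dE in eV).
ELEMENT_DATA = {
    "Cu": (29, 1.90, -0.5),
    "Al": (13, 1.61, -0.3),
}

def dummy_vector(element_data):
    """Average Z, average chi, CN of zero and average median energy."""
    zs, chis, energies = zip(*element_data.values())
    return (fmean(zs), fmean(chis), 0, fmean(energies))

def pad_shell(vectors, element_data, slots=4):
    """Pad a shell's per-element vectors to `slots` entries and flatten them."""
    if len(vectors) > slots:
        raise ValueError("more element types than available slots")
    padded = list(vectors) + [dummy_vector(element_data)] * (slots - len(vectors))
    return [value for vector in padded for value in vector]

# A Cu-Cu bridge has one real vector; the other three slots are dummy features.
cu_cu_shell = pad_shell([(29, 1.90, 2, -0.5)], ELEMENT_DATA)
```

Two shells padded this way give the 32 numbers per fingerprint quoted in the main text.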

Calculating optimal adsorption energies

A descriptor/activity relationship10 was used to predict the catalyst activity and selectivity for CO2RR given a free energy change, ∆GCO. This relationship shows that a ∆GCO of –0.17 eV yields an optimal activity and selectivity, which corresponds to a ∆ECO of –0.67 eV (Supplementary Methods gives more details). Similarly, a literature relation was adopted to predict HER performance12. This relation predicted an optimal ∆GH of –0.03 eV, which corresponds to a target ∆EH of –0.27 eV (ref. 44).
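One way to use such a target is to rank sites by how far their adsorption energy lies from the optimum. The sketch below does this for CO with the values quoted above. The site names and energies are made up for illustration, and treating the difference between ∆GCO and ∆ECO as a single constant is an assumption of this sketch. The same ranking would apply to H with its own target.

```python
OPTIMAL_DG_CO = -0.17  # eV, optimal free energy of CO adsorption quoted in the text
OPTIMAL_DE_CO = -0.67  # eV, corresponding electronic adsorption energy quoted in the text

# Offset implied by the two numbers above (assumed constant in this sketch).
CORRECTION_CO = OPTIMAL_DG_CO - OPTIMAL_DE_CO


def distance_from_optimum(delta_e, target=OPTIMAL_DE_CO):
    """Absolute distance (eV) between an adsorption energy and the target value."""
    return abs(delta_e - target)


# Hypothetical DFT or model-predicted CO adsorption energies (eV).
sites = {"site_a": -0.70, "site_b": -1.20, "site_c": -0.30}

# Sites closest to the target come first.
ranked = sorted(sites, key=lambda name: distance_from_optimum(sites[name]))
```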

Calculating adsorption energies with DFT

The adsorption energy calculation workflow used in this study mimicked typical computational chemistry methods to calculate adsorption energies for sites of interest in the catalogue11. Crystal structures from the Materials Project were relaxed using DFT. Relaxed crystal structures were used to generate facets of interest, which were then relaxed with free surface atoms and fixed subsurface atoms. Finally, the adsorbate was placed on the surface at the relevant site and a final relaxation was completed. Final relaxed structures and their energies were comparable to traditional expert-made structures, as shown in Supplementary Note 3.

We performed all of the DFT calculations using the Vienna Ab initio Simulation Package45,46,47,48 (VASP) as implemented in the Atomic Simulation Environment (ASE)49, the revised Perdew–Burke–Ernzerhof functionals50, k-point grids of 4 × 4 × 1, an energy cutoff of 350 eV and the default pseudopotentials supplied by VASP version 5.4. Bulk relaxations were performed with a 10 × 10 × 10 k-point grid and a 500 eV cutoff, and only isotropic relaxations were allowed. Surfaces were replicated in the x/y directions so that each cell vector was at least 4.5 Å. No spin magnetism or dispersion corrections were included. Slabs were replicated in the z direction to a minimum thickness of 7 Å and at least 20 Å of vacuum was included in between the slabs. For some facets this led to slabs with a large depth due to constraints in how the facet could be formed. Generally, the bottom layers were fixed and defined as those atoms more than 3 Å from the top of the surface in the scaled z direction. Adsorption energies were calculated relative to gas-phase CO(g) for CO, and relative to gas-phase ½H2(g) for H.
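Two of the conventions above reduce to simple arithmetic, sketched below. The function names and all numerical energies are hypothetical; only the 4.5 Å minimum cell vector length and the choice of gas-phase references come from the text.

```python
from math import ceil

MIN_XY_LENGTH = 4.5  # Å, minimum length of each in-plane cell vector


def xy_repeats(cell_vector_length):
    """Number of repeats needed so the replicated cell vector is at least 4.5 Å long."""
    return max(1, ceil(MIN_XY_LENGTH / cell_vector_length))


def hydrogen_reference(e_h2_gas):
    """Reference energy for adsorbed H: half the energy of gas-phase H2."""
    return 0.5 * e_h2_gas


def adsorption_energy(e_slab_with_adsorbate, e_bare_slab, e_gas_reference):
    """Adsorption energy relative to the bare slab and a gas-phase reference (eV)."""
    return e_slab_with_adsorbate - e_bare_slab - e_gas_reference


# Made-up total energies (eV), for illustration only.
dE_H = adsorption_energy(-101.5, -100.0, hydrogen_reference(-2.0))
```

For example, a surface cell vector of 2.6 Å would be repeated twice, whereas a vector of 5.0 Å would be left as is.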

Automating DFT calculations

DFT calculations and other computational tasks were coordinated in parallel and in an automated, high-throughput fashion. Each type of calculation and task was encoded as an interdependent task, and then dependency management software (Luigi51) was used to manage the tasks in parallel. For example, an adsorption energy calculation depends on a single surface relaxation, which in turn depends on a single bulk relaxation. Requesting an adsorption energy calculation automatically triggers the prerequisite bulk and slab relaxations and then adds the results to a database. When a new adsorption energy calculation is triggered that requires the same surface, the prerequisite bulk and slab results are read from the database instead of being regenerated. This differs from a fixed pipeline approach52, because intermediate tasks, such as slab relaxations, can be shared across multiple pipelines and at different times. DFT tasks were managed by a central FireWorks19 database that distributed DFT relaxation tasks across multiple computing clusters. This combined Luigi and FireWorks framework enabled high-throughput DFT calculations, because adsorption energies could be queried for any of the 1,684,908 enumerated sites without the need for human management of the intermediate tasks. All of the DFT relaxations were stored in a Mongo database that contains the DFT calculation settings, the identity of the original crystal structure, the Miller indices of the slab, the exact Cartesian location of the adsorption site, chemical information about the adsorption site, such as local coordination, and the adsorption energy.
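The dependency logic described above can be illustrated with a small plain-Python sketch. It does not use the Luigi or FireWorks APIs. An in-memory dictionary stands in for the results database, the "calculations" are placeholders that return dictionaries, and all names are hypothetical.

```python
database = {}   # stand-in for the results database
executed = []   # log of tasks that actually had to be run


def run_or_fetch(key, compute):
    """Return a stored result if one exists; otherwise compute and store it."""
    if key not in database:
        result = compute()
        executed.append(key)
        database[key] = result
    return database[key]


def bulk_relaxation(bulk_id):
    return run_or_fetch(("bulk", bulk_id), lambda: {"bulk_id": bulk_id})


def slab_relaxation(bulk_id, miller):
    def compute():
        bulk = bulk_relaxation(bulk_id)  # prerequisite is triggered automatically
        return {"bulk": bulk, "miller": miller}
    return run_or_fetch(("slab", bulk_id, miller), compute)


def adsorption_calculation(bulk_id, miller, site, adsorbate):
    def compute():
        slab = slab_relaxation(bulk_id, miller)  # prerequisite is triggered automatically
        return {"slab": slab, "site": site, "adsorbate": adsorbate}
    return run_or_fetch(("ads", bulk_id, miller, site, adsorbate), compute)


# Two sites on the same surface: the bulk and slab relaxations run only once.
adsorption_calculation("bulk-A", (1, 1, 1), "site-1", "CO")
adsorption_calculation("bulk-A", (1, 1, 1), "site-2", "CO")
```

After both requests, the log contains one bulk task, one slab task and two adsorption tasks, which is the sharing of intermediate results that distinguishes this design from a fixed pipeline.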

Our simulations occasionally yielded abnormal relaxations that arise from desorptions, dissociations, surface reconstructions or DFT non-convergence. These abnormalities were omitted from our regressions by excluding data from any simulation that met at least one of the following criteria: the final maximum equilibrium force between any two atoms exceeded 0.5 eV Å–1; the absolute value of the adsorption energy exceeded 4 eV; any atom moved more than 0.5 Å during the bare slab relaxation; the adsorbate moved more than 1.5 Å during the adsorption relaxation; or any slab atom moved more than 1.5 Å during the adsorption relaxation. These exclusion criteria were used as heuristics to reduce the outliers. This approach may induce bias in the data set if systematic portions of the search space are missing because they often fail for these reasons. Some DFT errors may be treated by automatically tuning the DFT calculation settings17, but these approaches are not robust across the full range of calculation errors in adsorption simulations.
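The exclusion criteria above amount to a simple filter, sketched below. The thresholds are those stated in the text. The record layout, field names and example values are hypothetical and do not reflect the actual database schema.

```python
from dataclasses import dataclass


@dataclass
class Relaxation:
    max_force: float                      # eV/Å, final maximum force
    adsorption_energy: float              # eV
    max_bare_slab_move: float             # Å, largest atom movement in the bare slab relaxation
    adsorbate_move: float                 # Å, adsorbate movement in the adsorption relaxation
    max_slab_move_with_adsorbate: float   # Å, largest slab atom movement in the adsorption relaxation


def is_abnormal(r):
    """True if the relaxation meets any one of the exclusion criteria."""
    return (
        r.max_force > 0.5
        or abs(r.adsorption_energy) > 4.0
        or r.max_bare_slab_move > 0.5
        or r.adsorbate_move > 1.5
        or r.max_slab_move_with_adsorbate > 1.5
    )


# Made-up records, for illustration only.
records = [
    Relaxation(0.03, -0.65, 0.10, 0.40, 0.20),  # passes every check
    Relaxation(0.03, -0.65, 0.10, 2.30, 0.20),  # adsorbate moved too far
    Relaxation(0.03, -5.10, 0.10, 0.40, 0.20),  # adsorption energy too large in magnitude
]
kept = [r for r in records if not is_abnormal(r)]
```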

The adsorption energy database, which was required to train machine learning models, was initially seeded with ∆ECO and ∆EH calculations for every unique adsorption site on a variety of surfaces, which included the (100), (111) and (211) facets of the most stable crystal form of each element in Supplementary Fig. 1. We also added all of the unique coordination types (on-top Ni, on-top Fe, bridge Ni–Fe and so on) up to a coordination number of two and then selected the corresponding surfaces with the smallest number of atoms. This ensured that the original machine learning model contained at least some data to begin with.

Code availability

The code used to perform this work is available at

Data availability

The code and data used to produce the figures in this article are available in the GASpy manuscript repository at A snapshot of our adsorption energy data are included with this article in JSON format. A ‘’ and a ‘how_to_read_gasdb_json.ipynb’ Jupyter notebook are also included to illustrate how to convert the JSON data into atoms objects as per the ASE49. Up-to-date versions of the JSON-formatted data are also available from the corresponding author on reasonable request. An up-to-date visualization of the data can also be viewed at


1. World Energy Outlook 2017 Technical Report (International Energy Agency, 2017).
2. Annual Energy Outlook 2017 with Projections to 2050 Technical Report (US Energy Information Administration, 2017).
3. Mackay, D. J. C. Sustainable Energy—without the Hot Air Vol. 2 (UIT Cambridge Ltd, Cambridge, 2009).
4. Edenhofer, O., Madruga, R. P. & Sokona, Y. Renewable Energy Sources and Climate Change Mitigation (Cambridge Univ. Press, 2012).
5. Rockström, J. et al. A safe operating space for humanity. Nature 461, 472–475 (2009).
6. IPCC Climate Change 2014: Synthesis Report (eds Core Writing Team, Pachauri, R. K. & Meyer L. A.) (IPCC, 2015).
7. Lewis, N. S. & Nocera, D. G. Powering the planet: chemical challenges in solar energy utilization. Proc. Natl Acad. Sci. USA 104, 15729–15735 (2007).
8. Seh, Z. W. et al. Combining theory and experiment in electrocatalysis: insights into materials design. Science 355, eaad4998 (2017).
9. Montoya, J. H. et al. Materials for solar fuels and chemicals. Nat. Mater. 16, 70–81 (2016).
10. Liu, X. et al. Understanding trends in electrochemical carbon dioxide reduction rates. Nat. Commun. 8, 15438 (2017).
11. Nørskov, J. K., Studt, F., Abild-Pedersen, F. & Bligaard, T. Fundamental Concepts in Heterogeneous Catalysis (John Wiley & Sons, Inc., Hoboken, 2015).
12. Greeley, J., Jaramillo, T. F., Bonde, J., Chorkendorff, I. & Nørskov, J. K. Computational high-throughput screening of electrocatalytic materials for hydrogen evolution. Nat. Mater. 5, 909–913 (2006).
13. Hansen, H. A., Shi, C., Lausche, A. C., Peterson, A. A. & Nørskov, J. K. Bifunctional alloys for the electroreduction of CO2 and CO. Phys. Chem. Chem. Phys. 18, 9194–9201 (2016).
14. Li, Z., Wang, S., Chin, W. S., Achenie, L. E. & Xin, H. High-throughput screening of bimetallic catalysts enabled by machine learning. J. Mater. Chem. A 5, 24131–24138 (2017).
15. Hummelshøj, J. S., Abild-Pedersen, F., Studt, F., Bligaard, T. & Nørskov, J. K. CatApp: a web application for surface chemistry and heterogeneous catalysis. Angew. Chem. Int. Ed. 51, 272–274 (2012).
16. Scheffler, M. & Draxl, C. The NOMAD Repository (Computer Center of the Max-Planck Society, Garching, 2014).
17. Ong, S. P. et al. Python Materials Genomics (pymatgen): a robust, open-source python library for materials analysis. Comp. Mater. Sci. 68, 314–319 (2013).
18. Montoya, J. H. & Persson, K. A. A high-throughput framework for determining adsorption energies on solid surfaces. Comp. Mater. 3, 14 (2017).
19. Jain, A. et al. FireWorks: a dynamic workflow system designed for high-throughput applications. Concurr. Comp. Pract. E. 22, 685–701 (2010).
20. Meredig, B. et al. Combinatorial screening for new materials in unconstrained composition space with machine learning. Phys. Rev. B 89, 1–7 (2014).
21. Ward, L. et al. Including crystal structure attributes in machine learning models of formation energies via Voronoi tessellations. Phys. Rev. B 96, 1–12 (2017).
22. Ulissi, Z. W. et al. Machine-learning methods enable exhaustive searches for active bimetallic facets and reveal active site motifs for CO2 reduction. ACS Catal. 7, 6600–6608 (2017).
23. Peterson, A. A. Acceleration of saddle-point searches with machine learning. J. Chem. Phys. 145, 074106 (2016).
24. Boes, J. R. & Kitchin, J. R. Modeling segregation on AuPd(111) surfaces with density functional theory and Monte Carlo simulations. J. Phys. Chem. C 121, 3479–3487 (2017).
25. Han, Z. H. & Zhang, K. S. in Real-World Applications of Genetic Algorithms (ed. Roeva, O.) Ch. 17 (InTech, London, 2012).
26. Settles, B. Active Learning (Williston, Morgan & Claypool, 2012).
27. Gómez-Bombarelli, R. et al. Design of efficient molecular organic light-emitting diodes by a high-throughput virtual screening and experimental approach. Nat. Mater. 15, 1120–1127 (2016).
28. Warmuth, M. K. et al. Active learning with support vector machines in the drug discovery process. J. Chem. Inf. Comput. Sci. 43, 667–673 (2003).
29. Gubaev, K., Podryabinkin, E. V. & Shapeev, A. V. Machine learning of molecular properties: Locality and active learning. J. Chem. Phys. 148, 1–9 (2018).
30. Jain, A. et al. The materials project: A materials genome approach to accelerating materials innovation. APL Mater. 1, 1–11 (2013).
31. Lukasz, M. Mendeleev—a Python resource for properties of chemical elements, ions and isotopes (2014).
32. Davie, S. J., Di Pasquale, N. & Popelier, P. L. Kriging atomic properties with a variable number of inputs. J. Chem. Phys. 145, 1–11 (2016).
33. Calle-Vallejo, F., Loffreda, D., Koper, M. T. M. & Sautet, P. Introducing structural sensitivity into adsorption-energy scaling relations by means of coordination numbers. Nat. Chem. 7, 403–410 (2015).
34. Zhang, Y. & Ling, C. A strategy to apply machine learning to small datasets in materials science. Comp. Mater. 25, 28–33 (2018).
35. Olson, R. S. et al. in Applications of Evolutionary Computation (eds Squillero, G. & Burelli, P.) 123–137 (Lecture Notes in Computer Science, Vol. 9597, Springer International Publishing, Porto, 2016).
36. Hyndman, R. J. & Athanasopoulos, G. Forecasting: Principles and Practice (2014).
37. Morimoto, M. et al. Electrodeposited Cu–Sn alloy for electrochemical CO2 reduction to CO/HCOO. Electrocatalysis 9, 323–332 (2018).
38. Torelli, D. A. et al. Nickel–gallium-catalyzed electrochemical reduction of CO2 to highly reduced products at low overpotentials. ACS Catal. 6, 2100–2104 (2016).
39. Kortlever, R. et al. Palladium–gold catalyst for the electrochemical reduction of CO2 to C2–C5 hydrocarbons. Chem. Commun. 52, 10229–10232 (2016).
40. Maaten, L. V. D. Accelerating t-SNE using tree-based algorithms. J. Mach. Learn. Res. 15, 1–21 (2014).
41. Cherepanov, P. V., Ashokkumar, M. & Andreeva, D. V. Ultrasound assisted formation of Al–Ni electrocatalyst for hydrogen evolution. Ultrason. Sonochem. 23, 142–147 (2015).
42. Yamauchi, M., Abe, R., Tsukuda, T., Kato, K. & Takata, M. Highly selective ammonia synthesis from nitrate with photocatalytically generated hydrogen on CuPd/TiO2. J. Am. Chem. Soc. 133, 1150–1152 (2011).
43. Liao, H. et al. A multisite strategy for enhancing the hydrogen evolution reaction on a nano-Pd surface in alkaline media. Adv. Energ. Mater. 7, 1–7 (2017).
44. Nørskov, J. K. et al. Trends in the exchange current for hydrogen evolution. J. Electrochem. Soc. 152, J23 (2005).
45. Kresse, G. & Hafner, J. Ab initio molecular dynamics for liquid metals. Phys. Rev. B 47, 558–561 (1993).
46. Kresse, G. & Hafner, J. Ab initio molecular-dynamics simulation of the liquid-metal–amorphous-semiconductor transition in germanium. Phys. Rev. B 49, 14251–14269 (1994).
47. Kresse, G. & Furthmüller, J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comp. Mater. Sci. 6, 15–50 (1996).
48. Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169–11186 (1996).
49. Hjorth Larsen, A. et al. The atomic simulation environment—a Python library for working with atoms. J. Phys. Condens. Mat. 29, 273002 (2017).
50. Hammer, B., Hansen, L. B. & Nørskov, J. Improved adsorption energetics within density-functional theory using revised Perdew–Burke–Ernzerhof functionals. Phys. Rev. B 59, 7413–7421 (1999).
51. Bernhardsson, E., Freider, E. & Rouhani, A. Luigi, a Python package that builds complex pipelines of batch jobs (GitHub, 2012).
52. Mathew, K. et al. Atomate: a high-level interface to generate, execute, and analyze computational materials science workflows. Comp. Mater. Sci. 139, 140–152 (2017).



This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the US Department of Energy under contract no. DE-AC02-05CH11231. We thank K. Chan for helpful discussions about descriptor targets, as well as P. de Luna and E. T. Sargent for helpful discussions about analysis.

Author information

K.T. and Z.W.U. contributed to the scientific workflow software and DFT calculations. K.T. and Z.W.U. made the regression models and analysis. K.T. performed the clustering analysis. K.T. and Z.W.U. wrote the manuscript. Z.W.U. conceived the idea.

Correspondence to Zachary W. Ulissi.

Ethics declarations

Competing interests

The authors declare no competing interests.


Supplementary information

Supplementary Figures 1–7, Supplementary Tables 1 & 2, Supplementary Notes 1–3, Supplementary Methods & Supplementary References


Cite this article

Tran, K., Ulissi, Z.W. Active learning across intermetallics to guide discovery of electrocatalysts for CO2 reduction and H2 evolution. Nat Catal 1, 696–703 (2018).
