Significance of the oxygen evolution reaction

Sustainable, climate-friendly alternatives to fossil resources are needed to meet the needs of the energy and chemical sectors. Precursor feeds of non-potable water1 and, in some cases, aqueous nitrogen or carbon dioxide could be reacted electrochemically, using renewable sources in devices such as electrolyzers or photoelectrochemical cells, to sustainably produce many key fuels and valuable chemicals (Fig. 1). The water oxidation reaction (WOR) or oxygen evolution reaction (OER) at the anode plays a pivotal role in this approach because it provides the ions needed for the reduction of precursors at the cathode to the desired fuel or chemical. Four electrons and ions need to be transferred to make O2 from 2H2O (or 4OH- in alkaline media), which typically requires a large overpotential at the anode. Additionally, the high potentials needed to drive the OER may degrade various electrode components2. Therefore, the identification of stable and active electrocatalysts for the OER has received considerable interest in recent decades. While stability and activity are both important and usually linked, this Comment focuses on reporting the activity of OER electrocatalysts in the context of the most mature field, water electrolysis, and in particular on the issue of comparability among reported activities in current benchmarking studies.
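For reference, the four-electron stoichiometry mentioned above corresponds to the standard textbook half-reactions of the OER in acidic and alkaline media. They are written out here as a reading aid and are not reproduced from the cited works:

```latex
\begin{align}
\text{acid:} \quad & 2\,\mathrm{H_2O} \rightarrow \mathrm{O_2} + 4\,\mathrm{H^+} + 4\,e^- \\
\text{alkaline:} \quad & 4\,\mathrm{OH^-} \rightarrow \mathrm{O_2} + 2\,\mathrm{H_2O} + 4\,e^-
\end{align}
```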

Fig. 1: Versatile uses of the oxygen evolution reaction (OER) for the production of fuels and chemicals.

The anodic OER provides the ions needed for the cathodic reduction of precursors such as water, nitrogen, or carbon dioxide to the desired fuel or chemical.

Benchmarking protocols

Bligaard et al.3 define benchmarking in catalysis as a “community-based and (preferably) community-driven activity involving consensus-based decisions on how to make reproducible, fair, and relevant assessments […] between new and standard catalysts”. Building on this definition and a previous implementation4, this author proposes that a complete benchmarking protocol should contain (1) a definition of all relevant test input parameters and environmental conditions; (2) a protocol of the test procedure, i.e., the sequence of measurements to be performed; (3) a concise definition of the test output parameters and their evaluation criteria; and (4) a well-defined and readily accessible gold standard. The current state toward a complete protocol is depicted in Fig. 2 for materials-centered and device-centered research on the OER.

Fig. 2: Current state of benchmarking protocols and standardization.

Green (left) indicates sufficient definition, yellow (middle) indicates further work needed, and red (right) indicates a lack of the criterion.

Several protocols have been published for materials-centered investigations of the OER. The protocol reported by McCrory et al.5 is the most widely used, yet it is still followed by only a small fraction of all OER publications. Other protocols were reported by Spanos et al.6, Burke-Stevens et al.7, Peugeot et al.8, and Creel et al.9, and recommendations were given by Wei et al.10 (Table 1). The published protocols and the majority of materials-centered OER studies use potential sweeps (cyclic or linear sweep voltammetry) for conditioning, surface area determination (note the pitfalls11), and activity determination. For activity determination, the protocols either add current and/or potential steps to the sweeps or use such steps in place of the sweeps.

Table 1 Overview of benchmarking protocols.

For device-centered investigations of the OER, Malkow et al.4 published protocols for testing low-temperature water electrolyzers that employ galvanostatic sweeps or a list of current density setpoints (Table 1). A similar protocol was used in a round-robin study12. Higher current densities are included, but the range of current densities overlaps with that of the materials-centered investigations. Additional definitions and experimental parameters are published in a series of reports by Tsotridis and Pilenga13,14 that are partly based on definitions of the International Electrotechnical Commission (IEC), such as standard IEC TS 60050-485:2020.

Assessment of the protocols

The test input parameters are defined in protocols for both materials and device testing with sufficient detail for reproduction. More environmental conditions are controlled for device testing. Materials testing would be improved by mandating control of environmental conditions such as temperature, which is readily achieved with jacketed electrochemical cells. The materials-centered protocols do not specify how the electrochemical data are sampled, i.e., whether the current/potential is read at the end of the sampling interval or integrated over it. This choice can drastically affect the contribution of capacitive currents in sweep measurements or short pulses on (desirable) high surface area materials, which would lead to an overestimation of activity metrics based on electrochemical current.
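The effect of the sampling mode can be illustrated with a minimal sketch. It assumes an idealized series RC model of the double layer (solution resistance `r_s`, double-layer capacitance `c_dl`) responding to a small potential step on top of a constant Faradaic current. The model, the function name `sampled_current`, and all numerical values are illustrative assumptions and are not taken from any of the cited protocols.

```python
import numpy as np

def sampled_current(t_end, i_faradaic, delta_e, r_s, c_dl, n=10001):
    """Compare two sampling modes for one interval after a potential step.

    Assumed model: i(t) = i_faradaic + (delta_e / r_s) * exp(-t / (r_s * c_dl)).
    Returns (end_point, interval_average) in amperes.
    """
    t = np.linspace(0.0, t_end, n)
    i_cap = (delta_e / r_s) * np.exp(-t / (r_s * c_dl))  # capacitive decay
    i_total = i_faradaic + i_cap
    end_point = i_total[-1]  # reading taken at the end of the interval
    # Trapezoidal integration over the interval, divided by its length
    integral = np.sum(0.5 * (i_total[1:] + i_total[:-1]) * np.diff(t))
    return end_point, integral / t_end

# Hypothetical values: 1 mV step, 10 ohm, 10 mF (high surface area), 0.1 s interval
end_point, average = sampled_current(
    t_end=0.1, i_faradaic=1e-4, delta_e=1e-3, r_s=10.0, c_dl=1e-2
)
print(f"end-point reading: {end_point:.3e} A")
print(f"interval average:  {average:.3e} A")
```

In this toy case both readings exceed the Faradaic current because the sampling interval equals the RC time constant, and the integrated reading exceeds it by more. Lengthening the interval relative to the time constant shrinks both errors, which is one reason to state the sampling mode and interval explicitly.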

The procedures vary for materials-centered testing, where potential sweeps as well as potential and current holds are performed in various combinations. The author expects this to be the main issue that reduces comparability among the protocols, because different surfaces can be formed by sweeps and by potential/current holds15 and because the range of potential sweeps may affect the measured currents16. Wei et al.10 and Malkow et al.4 recommend either sweeps or holds. This author recommends several current holds, i.e., a Tafel plot (note the pitfalls17), with increasing current density until 2 V is reached and with sufficient duration to ensure a steady state of the double layer and the electrocatalyst microstructure (electronic structure, phase, and morphology). Using current holds for both materials- and device-centered investigations to determine the activity metric(s) can be a small step towards bridging the gap between fundamental and applied research.
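A Tafel analysis of such current holds can be sketched as follows. The example assumes that each hold yields one steady-state, iR-corrected overpotential and that the data lie in a single linear Tafel region. The function names `tafel_fit` and `eta_at` and the data points are hypothetical and synthetic, chosen only to show the calculation.

```python
import numpy as np

def tafel_fit(eta_v, j_a_cm2):
    """Least-squares fit of eta = a + b * log10(j).

    eta_v: steady-state overpotentials in V (one per current hold)
    j_a_cm2: applied current densities in A/cm^2
    Returns (b, a): Tafel slope in V/decade and intercept in V.
    """
    log_j = np.log10(np.asarray(j_a_cm2, dtype=float))
    b, a = np.polyfit(log_j, np.asarray(eta_v, dtype=float), 1)
    return b, a

def eta_at(j_a_cm2, b, a):
    """Overpotential predicted by the fitted Tafel line at current density j."""
    return a + b * np.log10(j_a_cm2)

# Synthetic holds, one decade apart, lying on a 60 mV/decade line
j_holds = [1e-4, 1e-3, 1e-2, 1e-1]      # A/cm^2
eta_holds = [0.24, 0.30, 0.36, 0.42]    # V
b, a = tafel_fit(eta_holds, j_holds)
print(f"Tafel slope: {1000 * b:.1f} mV/decade")
print(f"eta at 10 mA/cm^2: {eta_at(1e-2, b, a):.3f} V")
```

Real data usually deviate from a single straight line, so the fitted range should be reported together with the slope.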

Output parameters and evaluation criteria are clearly defined in the previous reports for both materials- and device-centered investigations. Common activity metrics are (over)potentials at fixed current (density) or currents (current densities) at fixed (over)potential, where the current is normalized by a property of the electrocatalyst material (e.g., electrocatalyst mass or surface area) or a property of the electrode (e.g., electrode area). In addition to electrochemical data, this author urges researchers to also report a measure of the evolved oxygen or the Faradaic efficiency5,6. The activity metrics emphasized and the reporting recommendations differ in detail, but several protocols include Tafel plots as recommended above, from which a desired metric could be calculated, most readily if the electrochemical data were published openly and FAIR (findable, accessible, interoperable, reusable)18 in a data repository.
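The metrics named above reduce to a few simple calculations, sketched here under stated assumptions. The equilibrium potential of 1.229 V vs. RHE applies to standard conditions at 25 °C, and the Faradaic efficiency assumes four electrons per O2 molecule. The function names and example numbers are hypothetical and are not part of any cited protocol.

```python
FARADAY = 96485.33212  # Faraday constant in C/mol

def overpotential(e_rhe_v, e_eq_v=1.229):
    """OER overpotential in V from an iR-corrected potential vs. RHE.

    e_eq_v defaults to the standard equilibrium potential at 25 degC (assumption).
    """
    return e_rhe_v - e_eq_v

def normalized_current(i_a, norm):
    """Current divided by a normalization quantity.

    norm may be electrode area (cm^2), catalyst mass (g), or catalyst
    surface area (cm^2); the choice must be reported with the metric.
    """
    return i_a / norm

def faradaic_efficiency_o2(n_o2_mol, charge_c):
    """Fraction of the passed charge that produced O2 (4 electrons per O2)."""
    return 4.0 * FARADAY * n_o2_mol / charge_c

# Hypothetical measurement: 2 mA on a 0.2 cm^2 electrode at 1.529 V vs. RHE
print(f"eta = {overpotential(1.529):.3f} V")
print(f"j   = {1000 * normalized_current(2e-3, 0.2):.1f} mA/cm^2")
# Hypothetical O2 quantification: 1.0 umol detected after passing 0.40 C
print(f"FE  = {100 * faradaic_efficiency_o2(1.0e-6, 0.40):.1f} %")
```

Reporting the normalization quantity alongside the raw current lets readers convert between mass-, surface-area-, and electrode-area-based metrics.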

Importance of standards

To date, gold standard materials for the OER are Ni-Fe oxides in alkaline media and RuO2 as well as IrO2 in acid. Unfortunately, the outcome of their test evaluation criteria depends strongly on details of synthesis, possible non-electrochemical post-treatment steps, and electrochemical conditioning steps. For Ni-Fe oxide, a simple synthesis has been reported7. Powders of these oxides and membrane electrode assemblies (MEA) based on iridium-ruthenium oxide are also available commercially. Issues with preparation aside, there is no standard electrocatalyst or electrode consistently used in all reported protocols. Ideally, the field would have a benchmark akin to the international prototype of the kilogram and its exact copies, which would enable comparison of the reported protocols. In the field of photovoltaics, testing centers such as the European Solar Test Installation (ESTI) have been established where one sends samples for standardized tests, thus eliminating the considerable variation observed in round-robin tests (on electrolyzers)12. A clearly defined gold standard and standardized testing, especially in specialized facilities, would significantly improve the reliability of reported OER activity metrics to benchmark electrocatalysts and electrodes, identify structure-property relationships, and harness big data analysis in electrocatalysis.

In summary, the standardization of materials-centered investigations of the OER is less advanced than that of device-centered investigations, yet neither community has formalized international standards such as those that exist in corrosion science, e.g., ASTM G150-18 or DIN EN ISO 17864:2008-07. For materials testing, there are additionally no harmonized protocols and no round-robin studies on gold standards. Furthermore, most reports of highly active materials unfortunately do not follow any of the reported protocols to obtain their activity metric(s).

Outlook

As pointed out by Bligaard et al.3, benchmarking must be a community-driven effort. This raises the questions of which communities are relevant and whether we should strive to identify a universal protocol. This author believes that the conditioning part of the protocol should be defined by sub-communities, e.g., catalyst ink investigations, epitaxial thin films, alkaline electrolyzers, etc. As recommended above, a current step protocol could better connect Tafel plots in materials- and device-centered OER investigations. Additional measurements could be performed after the Tafel plot or on separate samples, e.g., measurements of the Faradaic efficiency5. These recommendations (Table 1) should be seen as a seed for the needed discussion in the community rather than as competition with previous protocols. Implementing a harmonized base protocol and gold standard would take comparatively little effort and bring a large gain for the community toward truly benchmarking the OER, which is important in many contexts beyond water electrolysis (Fig. 1).