Introduction

Green energy and a circular economy are some of the key paradigms that our human society needs to realize in the next few decades. This implies that we need to give up on the combustion of fossil fuels. A key element to achieve this paradigm shift is the use of electrochemistry, be it for batteries and fuel cells, to convert electrical energy to hydrogen or other valuable chemicals, or to convert hydrogen back to energy without direct combustion in air.

The redox potential of electron transfer (ET), Ox + ne → Red in liquids, is an essential property for a variety of electrochemical energy conversion devices, such as batteries, fuel cells, and electrochemical fuel synthesis. It determines the alignment of redox levels relative to the Fermi level of a metal, or valence band maximum (VBM) and conduction band minimum (CBM) of semiconductor and insulator electrodes. It also determines the stability windows of ions and molecules in solutions, that is the range of voltages within which a specific ion or molecule can undergo electrochemical reactions. This information is vital to designing redox species and solvent molecules, such as redox couples for redox-flow batteries1, solvents and additives for Li-ion batteries2,3,4, radical scavengers for fuel cells5 and electrocatalysts for fuel synthesis6,7.

Unfortunately, accurate first-principles (FP) predictions of this crucial property remain challenging to date, with typical prediction errors of around 0.5 V. Sprik and co-workers developed a thermodynamic integration (TI) method utilizing the computational standard hydrogen electrode (CSHE)8,9 and applied this method to several redox reactions in aqueous solutions10,11. They discovered that the use of a semi-local functional leads to errors exceeding 0.5 V. This discrepancy arises because the functional places the valence band edge too shallow and the conduction band edge too deep, resulting in incorrect hybridization with the redox levels. Errors of similar magnitude have also been observed in other FP calculations that employ semi-local approximations12,13. As a result, Sprik and co-workers opted for a hybrid functional. Nonetheless, they observed a significant spread of values for two metal ion couples, with the Cu2+/Cu+ couple ranging from −1.13 to −0.20 V (experimental value 0.16 V) and the Ag2+/Ag+ couple ranging from 0.90 to 1.72 V (experimental value 1.98 V)11,14. These variations were attributed to differences in the pseudopotential and the computational code base (CPMD versus CP2K). While the "best" values obtained using the hybrid functional and highly accurate pseudopotentials are relatively close to experimental values (−0.20 V for Cu, and 1.72 V for Ag), the agreement is still far from being quantitative. Due to the high computational cost of hybrid functionals, most calculations have been performed using approximate methods, such as continuum solvation models15,16,17,18 and QM/MM models19,20. Although these models can reproduce the experimental redox potentials of ions and molecules with reasonable accuracy, the computational results rely heavily on many approximations, making it unclear which predictions are strictly correct.
Here, we briefly note that these FP and approximate methods have been extended to electrochemistry at liquid-solid interfaces21,22,23,24,25,26,27,28,29. These methods have become indispensable for elucidating electrochemical interfacial phenomena and designing advanced materials30,31,32,33,34,35. However, even in the calculation of redox reactions at interfaces, most applications rely on approximations, such as representing the motion of atomic nuclei with simple statistical models like the harmonic oscillator model21,22,30,31,33,36,37, modeling solvents with the reference interaction site model based on integral equation theory26, or modeling them as continuum media23,24,25,27,29. A rigorous FP method that eliminates these approximations is also desired in the field of interfacial electrochemistry.

The main goals of the present work are three-fold. First, we want to accurately calculate the redox potential of metal ions in water for three prototypical cases: Ag, Cu, and Fe. Ag2+ ions are among the most aggressive oxidants with a large redox potential, whereas the redox potential of Cu2+ ions is fairly shallow, and the Fe3+/Fe2+ reaction lies in between. The first two redox reactions involve large changes in the ion-water coordination, which makes the calculation challenging, whereas the redox reaction of Fe is a so-called simple outer-sphere ET reaction and has been the subject of numerous experimental and theoretical studies38. Fe ions are nevertheless considered particularly challenging for density functional theory. Second, we want to establish a computationally feasible pathway that yields statistically accurate results. Last but not least, we want to systematically explore different density functionals to set a guideline for future studies.

Results

Free energy change of electron transfer reaction

We begin with an overview of the theory and modeling used. Further details can be found in the Methods section. The reactions evaluated in this study are electron transfer reactions in water: Fe3+ + e ↔ Fe2+, Cu2+ + e ↔ Cu+, and Ag2+ + e ↔ Ag+. As in the previous study11, we assume that no side reactions occur and that only the valency of the redox species changes in the reaction. The redox potential Uredox is determined by the free energy difference ΔA between the reduced and oxidized states as

$${U}_{{{{\rm{redox}}}}}=-\frac{\Delta A}{en},$$
(1)

where e is the elementary charge and n is the number of electrons involved in the reaction. Here, we also assume that the change in volume during the one-electron-transfer half reaction is negligible, as in previous studies8,11. The Gibbs free energy can then be replaced by the Helmholtz free energy (A). The free energy difference ΔA can be exactly determined by thermodynamic integration (TI)39,40:

$$\Delta A=\int\nolimits_{0}^{1}{\left\langle \frac{\partial H}{\partial \lambda }\right\rangle }_{\lambda }d\lambda .$$
(2)

Here, \({\left\langle X\right\rangle }_{\lambda }\) denotes the expectation value of X for an ensemble created by the Hamiltonian at coupling λ. The integral seamlessly connects the oxidized state (λ = 0) to the reduced state (λ = 1) along a coupling path41,42. The potential energy surface upon which atoms move is described by the grand potential Ω of the system opened for electrons43. Consequently, the Hamiltonian of the system is described as follows:

$$H=\mathop{\sum }\limits_{i=1}^{{N}_{{{{\rm{a}}}}}}\frac{{\left\vert {{{{\bf{p}}}}}_{i}\right\vert }^{2}}{2{m}_{i}}+\Omega ,$$
(3)
$$\Omega =U-\mu N,$$
(4)

where Na is the number of atoms, mi and pi are the mass and momentum vector of the i-th atom, and μ and N are the chemical potential and the number of electrons. The chemical potential μ is fixed at the reservoir level, whereas N varies by n along the coupling path. U represents the potential energy surface at λ, equating to the sum of the Helmholtz free energy of the electronic subsystem and the electrostatic interactions among nuclei. Following previous studies41,42, U can be described as

$$U=\lambda {U}_{1}+\left(1-\lambda \right){U}_{0},$$
(5)

where U0 and U1 are the potential energies of the oxidized and reduced states, respectively. Hence, the free energy difference ΔA is written as

$$\Delta A=\int\nolimits_{0}^{1}{\left\langle {U}_{1}-{U}_{0}\right\rangle }_{\lambda }d\lambda -\mu n.$$
(6)

If the structural changes from the oxidized to the reduced species are significant (recall that this is the case for Ag and Cu), many integration steps are required to accurately determine the energy difference. The application of this approach entails two difficulties. (i) It implies a huge computational cost if applied directly to hybrid functionals: if 100,000 time steps using a complete plane wave basis set are required to obtain good statistical accuracy, several tens of millions of core hours are necessary. (ii) During the reaction, one electron needs to be transferred from the reservoir that defines the chemical potential of the electrons. The vacuum level is the best-suited reference chemical potential because it allows one to align the redox levels and band edges of the electrode on the absolute potential scale. However, in FP calculations of bulk systems under periodic boundary conditions, the vacuum level is a quantity that cannot be directly accessed during simulations.
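The linear coupling path of Eqs. (5) and (6) can be illustrated with a minimal sketch. The one-dimensional harmonic model below is a toy system chosen for illustration and is not taken from this work; the function name `ti_harmonic_toy`, the spring constant, the displacements, and the offset `delta_e` are all hypothetical. Because both wells share the same curvature, the exact free energy difference equals `delta_e`, and the integrand is linear in λ.

```python
import numpy as np

K_B_T = 0.0257  # eV, approximately k_B * T at 298 K

def ti_harmonic_toy(k=2.0, x0=0.0, x1=0.5, delta_e=-0.3,
                    n_lambda=5, n_samples=100_000, seed=0):
    """Thermodynamic integration along U = lam*U1 + (1-lam)*U0.

    Toy model (an assumption for illustration only):
      U0(x) = k/2 (x - x0)^2            "oxidized" state
      U1(x) = k/2 (x - x1)^2 + delta_e  "reduced" state
    The exact free energy difference is delta_e.
    """
    rng = np.random.default_rng(seed)
    lambdas = np.linspace(0.0, 1.0, n_lambda)
    sigma = np.sqrt(K_B_T / k)  # thermal width of the mixed harmonic well
    integrand = []
    for lam in lambdas:
        # The mixed potential is harmonic around the interpolated minimum,
        # so its Boltzmann distribution can be sampled directly.
        x_min = (1.0 - lam) * x0 + lam * x1
        x = rng.normal(x_min, sigma, n_samples)
        u0 = 0.5 * k * (x - x0) ** 2
        u1 = 0.5 * k * (x - x1) ** 2 + delta_e
        integrand.append(np.mean(u1 - u0))  # <U1 - U0>_lambda
    integrand = np.array(integrand)
    # Trapezoidal rule over the lambda grid
    delta_a = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(lambdas))
    return lambdas, integrand, delta_a

lambdas, integrand, delta_a = ti_harmonic_toy()
print(delta_a)  # close to delta_e, up to sampling noise
```

In a real solution the ensemble at each λ has to be generated by molecular dynamics rather than by direct sampling, which is the origin of the cost discussed in point (i).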

Chemical potential of electrons

We will address the second point (ii) first. Jiao and co-workers44 suggested using the average electrostatic potential as a suitable reference point, and Leung45 calculated the position of the average electrostatic potential with respect to the vacuum level in a second independent calculation involving a water slab. We refine this approach in a conceptually simple way that simultaneously reduces finite-size errors. As a reference, instead of using the vacuum level, we employ the O 1s level of water, which is fixed relative to the vacuum level and can be conveniently calculated with the FP code used in this study. Our approach is schematically illustrated in Fig. 1. In FP calculations of a solution system under a periodic boundary condition, the energy contribution \({\left\langle {U}_{1}-{U}_{0}\right\rangle }_{\lambda }\) in Eq. (6) is equal to the negative electron affinity of the oxidized species referenced to the average local potential of the system. The same calculation can also determine the O 1s level \({\left\langle {\epsilon }_{{{{\rm{1s,bulk}}}}}\right\rangle }_{\lambda }\) of water, sufficiently far from the redox species and unaffected by the reactant, referenced to the average local potential. Therefore, measuring the redox level using the O 1s level as a reference results in \({\left\langle {U}_{1}-{U}_{0}\right\rangle }_{\lambda }/n-\left\langle {\epsilon }_{{{{\rm{1s,bulk}}}}}\right\rangle\), as highlighted in orange letters in Fig. 1c. In practice, \({\left\langle {\epsilon }_{{{{\rm{1s,bulk}}}}}\right\rangle }_{\lambda }\) may vary slightly along the coupling path due to finite size effects (refer to Supplementary Table 4). By aligning the potentials between the 'defect' and the 'host' within the same supercell in this manner, the finite size effects can be mitigated46,47. The vacuum level referenced to the O 1s level can be calculated using a slab model. As depicted in Fig. 1b, when referencing the O 1s level of water molecules located in the middle layer of the water slab, the vacuum level can be expressed as \(\mu -\left\langle {\epsilon }_{{{{\rm{1s,slab}}}}}\right\rangle\), as indicated in blue letters in Fig. 1c. The difference between the redox level and the vacuum level, both referenced to the O 1s level, gives the redox level referenced to the vacuum level, as shown in red letters in Fig. 1c. Consequently, the free energy difference ΔA on an absolute scale is written as

$$\Delta A=\int\nolimits_{0}^{1}{\left\langle {U}_{1}-{U}_{0}\right\rangle }_{\lambda }d\lambda -ne\Delta \bar{\phi },$$
(7)
$$e\Delta \bar{\phi }=\int\nolimits_{0}^{1}{\left\langle {\epsilon }_{{{{\rm{1s,bulk}}}}}\right\rangle }_{\lambda }d\lambda -\left\langle {\epsilon }_{{{{\rm{1s,slab}}}}}\right\rangle ,$$
(8)

where the vacuum level μ is set to zero. As illustrated by the green letters in Fig. 1c, \(\Delta \bar{\phi }\) accounts for the difference between the local potential at the vacuum in the slab model and the one in the bulk solution model.
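The bookkeeping of Eqs. (7) and (8) can be sketched as follows. The function names and all numbers are hypothetical placeholders, not computed values. The sketch assumes energies in eV, O 1s levels in the bulk model referenced to the average local potential, O 1s levels in the slab model referenced to the vacuum level (set to zero), and a two-point average over the oxidized and reduced end states for the λ integral in Eq. (8).

```python
import numpy as np

def potential_shift(o1s_bulk_ox, o1s_bulk_red, o1s_slab):
    """e * Delta(phi_bar) of Eq. (8), in eV.

    The lambda integral of the bulk O 1s level is approximated by the
    average of the two end states (oxidized: lambda = 0, reduced: lambda = 1).
    """
    bulk_term = 0.5 * (np.mean(o1s_bulk_ox) + np.mean(o1s_bulk_red))
    return bulk_term - np.mean(o1s_slab)

def absolute_delta_a(gap_integral, e_delta_phi, n=1):
    """Eq. (7): integral of <U1 - U0>_lambda minus n * e * Delta(phi_bar)."""
    return gap_integral - n * e_delta_phi

# Placeholder inputs (synthetic, for illustration only)
shift = potential_shift(o1s_bulk_ox=[-1.0, -1.2],
                        o1s_bulk_red=[-0.8, -1.0],
                        o1s_slab=[-5.0, -5.2])
delta_a = absolute_delta_a(gap_integral=3.0, e_delta_phi=shift, n=1)
u_redox = -delta_a / 1  # Eq. (1); with energies in eV the result is in V
print(shift, delta_a, u_redox)
```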

Fig. 1: Aligning energy levels based on the O 1s level of water molecules.

a Aligning the redox level based on the O 1s level of water molecules far from the redox species in the bulk solution model under a periodic boundary condition, (b) aligning the O 1s level of water molecules at the middle of the slab based on the local potential at the middle of the vacuum layer in the slab model, and (c) schematic of the alignment. The figure inset in (b) shows the snapshot of the water slab and computed local potential profile across the water slab. The graphics showing bulk and interfacial models are made by VESTA105.

Equation (7) is similar to the approach used in the CSHE method described in previous studies8,9. In these studies, the electrostatic potential of water was employed for alignment instead of the O 1s level. As shown by the gray dashed line in Fig. 1c, using the electrostatic potential, referred to here as the local potential, away from the redox species yields a result that is consistent with the use of the O 1s level within the statistical error bar (refer to Supplementary Note 3 for the case of pure water). However, our method differs in two key ways from the previous approach. First, we calculate the absolute vacuum reference rather than using the SHE as a reference, which allows for the assessment of absolute potentials in half-cell reactions. Second, we use machine-learned (ML) force fields (FFs) to create many statistically independent configurations for the water slab. We do this by learning an H2O force field on the fly, first for the bulk and then for the surface, and finally performing extensive million-step (1.5 ns in total) ML molecular dynamics for the surface. From this simulation, we draw 3000 statistically independent snapshots. FP calculations are performed only for these snapshots to determine the average O 1s level with respect to the vacuum level. This reduces the required computational time from 1 million core hours for brute-force runs using the semi-local functional to only 2200 core hours for the FP calculations on 3000 structures, including the ML simulations and training runs, while retaining statistical accuracy, as demonstrated by the local potential profile shown in the inset of Fig. 1b (see details of the estimation of compute time in Supplementary Note 2).

Thermodynamic integration

To address the problem of computing the free energy difference, i.e. point (i), we propose the ML-aided scheme depicted in Fig. 2. Here, we use the abbreviations FPnl(Ox/Red), FPsl(Ox/Red), and ML(Ox/Red) to denote calculations using a non-local hybrid functional, a semi-local functional and a machine-learned force field for the oxidized and reduced cases, respectively. Naively, one could just perform the required TI using ML surrogate models. As we will show later in this article, this yields only moderate accuracy. Errors in ΔA resulting from inaccuracies in the trajectory and the energy predictions by the ML potential can be corrected by performing TI from the ML potential to the FP potential for both the oxidized and reduced states. We adopt this strategy for the FPsl method. It involves two calculations: TI from the oxidized to the reduced species using ML surrogate models via Eq. (11) in the Methods section, ML(Ox) → ML(Red), and then, for each oxidation state, TI from the MLFF to the FPsl Hamiltonian via Eq. (13), ML(Ox) → FPsl(Ox) and ML(Red) → FPsl(Red). This two-step integration has three advantages, as summarized below:

  • The integration ML (Ox) → ML (Red) using the MLFF takes into account most of the non-linear components of the integrand in the TI (see Supplementary Figs. 8, 9). Excellent statistical accuracy can be reached for this step.

  • The MLFFs also provide well-equilibrated initial structures required for the other computational steps.

  • The integrands in ML(Ox) → FPsl(Ox) and ML(Red) → FPsl (Red) are small and almost linear in the coupling parameter (see Supplementary Fig. 10) owing to the accurate reproduction of the FPsl structures by the MLFF (see Overview of results). Hence, these demanding integrals (evaluation of FPsl calculation in every MD step) converge using a few tens of picosecond MD simulations.

Fig. 2: Schematic of ML-aided TI and TPT to compute the free energy difference ΔA.

The notations ML, FPsl and FPnl mean the machine-learning force field, FP method with semi-local functional and FP method with non-local hybrid functional, respectively. See details of equations in the Methods section.

There is one final obstacle, though: performing TI to a potential calculated by a hybrid functional that includes non-local exchange (FPnl) is still exceedingly demanding. In this specific case, as depicted in Fig. 2, we therefore apply Δ-machine learning (Δ-ML)48,49,50,51,52,53,54,55, which learns the difference ΔU between the FPsl potential and the FPnl potential. Because the energy difference between the FPsl functional and the FPnl functional is very smooth, it is possible to learn an extremely accurate ML representation of ΔU with just a few tens of FPnl calculations, with errors at least an order of magnitude smaller than those associated with MLFF models (see details in Supplementary Figs. 2 to 4 and Figs. 16 and 17). In the current implementation, the TI integration has been replaced with thermodynamic perturbation theory (TPT),

$$\Delta A={A}_{1}-{A}_{0}=-\frac{1}{\beta }{{{\rm{\ln }}}}{\left\langle {e}^{-\beta \Delta U}\right\rangle }_{0}=-\frac{1}{\beta }{{{\rm{\ln }}}}{\left\langle {e}^{\beta \Delta U}\right\rangle }_{1},$$
(9)

where β is the inverse temperature, and the symbol ΔU denotes the potential energy difference between the two end points. Although Eq. (9) is in principle exact, the potential energy difference might need to be evaluated for thousands or even tens of thousands of configurations if the ensembles generated by the two potentials are too distinct. This would imply a prohibitive number of expensive FPnl calculations. The Δ-ML scheme circumvents this issue and reduces the number of required FPnl calculations to merely tens. Thanks to the remarkable accuracy of the Δ-ML models, it is possible to obtain exceedingly accurate free energy differences between different FP methods without further correction (see Supplementary Fig. 12). This is one of the key advances of the present work. The computational cost is ultimately limited only by generating sufficient configurations using the FPsl method. Thus, the required compute time is reduced from the 20 million core hours needed for direct TI using the FPnl method to 16,800 core hours for the FPMD simulations that generate configurations using the FPsl method (see details of the estimation in Supplementary Note 2).
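The forward form of Eq. (9) can be evaluated numerically as in the sketch below. The function name `tpt_forward` and the synthetic Gaussian samples are assumptions made for illustration. In practice ΔU would be the Δ-ML prediction evaluated on configurations sampled with the reference potential. A log-sum-exp shift is used so that the exponential average stays numerically stable.

```python
import numpy as np

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

def tpt_forward(delta_u, temperature=298.0):
    """Forward TPT: -1/beta * ln < exp(-beta * dU) >_0, with dU in eV.

    delta_u holds U_target - U_reference evaluated on configurations
    sampled from the reference ensemble (state 0).
    """
    delta_u = np.asarray(delta_u, dtype=float)
    beta = 1.0 / (K_B * temperature)
    x = -beta * delta_u
    x_max = np.max(x)  # shift by the maximum to avoid overflow
    log_mean = x_max + np.log(np.mean(np.exp(x - x_max)))
    return -log_mean / beta

# Synthetic Gaussian energy differences (illustration only)
rng = np.random.default_rng(1)
samples = rng.normal(loc=0.5, scale=0.02, size=200_000)
print(tpt_forward(samples))
```

For Gaussian-distributed ΔU the result approaches the mean minus β times half the variance, which is the second-order cumulant expression used later in Eq. (15).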

Overview of results

We now detail our results and show that the adopted procedure yields statistically highly accurate results. The calculations were performed using VASP56,57 and the projector-augmented wave (PAW) method58,59. For the ML force fields (MLFFs), the implementation detailed in previous publications is used60,61,62. Similar to the pioneering ML approaches50,63,64, the potential energy in our MLFF method is approximated as a summation of local energies [see Eq. (20)]. The local energy is approximated as a weighted sum of kernel basis functions [see Eq. (21)]. A Bayesian formulation allows efficient prediction of energies, forces and stress tensor components as well as their uncertainties. The predicted uncertainty enables the reliable sampling of the reference structures on the fly during the FPMD simulation. Details of the equations, parameters and training conditions are summarized in the Methods section and Supplementary Note 1. As in the previous studies60,65,66,67, the MLFFs trained on a semi-local functional with dispersion corrections achieve root mean square errors (RMSEs) of 1–5 meV atom−1 and 0.04–0.11 eV Å−1 for energies and forces, respectively (see error distributions in Supplementary Figs. 1 to 4). The three ET reactions are examined in water by using a semi-local functional68 with a dispersion correction69,70 (RPBE+D3) and hybrid functionals71,72 with and without a dispersion correction (PBE0 and PBE0+D3). Systematic comparisons of different functionals help us to study the effects of the exact exchange as well as dispersion corrections. As shown in Table 1 [see lines of PBE0 (0.25) and PBE0+D3 (0.25)], good agreement with experiment is achieved using the hybrid functional with 1/4 exact exchange, regardless of whether dispersion corrections are used or not.

Table 1 Redox potentials Uredox of three reported in ref RPBE+D3, PBE0 (0.25), PBE0 (0.50), PBE0+D3 (0.25) and PBE0+D3 (0.50) using MLFF and Δ-ML

Water surface calculations

For RPBE+D3, the present MLFF provides a surface tension of 79 ± 5 mN m−1 for the 128-molecule system and 84 ± 5 mN m−1 for the 1024-molecule system at 298 K. Here, the surface tension was computed as73

$$\gamma =\frac{{L}_{z}}{2}\left({p}_{zz}-\frac{{p}_{xx}+{p}_{yy}}{2}\right),$$
(10)

where x and y define the directions parallel to the macroscopic interface, z defines the direction normal to the interface, Lz is the length of the unit cell in the z-direction, and pij is the pressure tensor. The results are slightly larger than the value of 68 ± 2 mN m−1 calculated by a neural network potential73 and the experimental value of 72 mN m−174, while they are within the range (50–90 mN m−1) of previous MD results obtained by FP75 and classical force fields76,77. Distributions of interfacial water dipole moments for both the 128- and 1024-molecule systems are shown in Supplementary Fig. 5. They consistently indicate that the orientation of interfacial water molecules is bimodal, as reported in previous MD studies employing the classical SPC/E force field76. The distributions are also consistent with the results of sum frequency generation (SFG) analyses78.
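As a small worked example of Eq. (10), the sketch below converts time-averaged pressure tensor components into a surface tension. The choice of input units (kbar and Å) and the function name are assumptions, and the numbers are placeholders rather than simulation output.

```python
# 1 kbar * 1 Angstrom = 1e8 Pa * 1e-10 m = 1e-2 N/m = 10 mN/m
KBAR_ANGSTROM_TO_MN_PER_M = 10.0

def surface_tension(p_xx, p_yy, p_zz, l_z):
    """Eq. (10) for a slab with two interfaces.

    p_xx, p_yy, p_zz: time-averaged diagonal pressure tensor components (kbar)
    l_z: cell length normal to the interface (Angstrom)
    Returns the surface tension in mN/m.
    """
    gamma = 0.5 * l_z * (p_zz - 0.5 * (p_xx + p_yy))  # kbar * Angstrom
    return gamma * KBAR_ANGSTROM_TO_MN_PER_M

# Placeholder values: a negative tangential pressure gives a positive tension
print(surface_tension(p_xx=-0.288, p_yy=-0.288, p_zz=0.0, l_z=50.0))
```

The factor 1/2 accounts for the two liquid-vacuum interfaces of the slab.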

Metal water coordination

Figure 3 shows metal-oxygen radial distribution functions (RDFs) and running integration numbers (RINs) at the reduced and oxidized states calculated by the MLFF and FPsl methods. The MLFFs reproduce the RDFs and RINs of the FPsl method well. Both methods show that the coordination number of Fe ions is 6 independent of the charge state. In contrast, the value for Cu changes from 5–6 in the oxidized state (Cu2+) to 2–3 in the reduced state (Cu+). The coordination number of Ag also changes from 5–6 in the oxidized state (Ag2+) to 4–5 in the reduced state (Ag+). These hydration structures agree with the ones reported in previous MD studies using FPMD methods79,80,81 and empirical force fields79. Although there are slight deviations in the Fe-O distance and in the shoulders for Cu-O and Ag-O in the RDFs, likely related to the short FPMD simulation time and errors in the MLFFs, overall our MLFFs reproduce the first-principles energies and structures of the hydrated metal cations with good accuracy.
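A metal-oxygen RDF and its running integration number can be obtained from a trajectory roughly as sketched below. This is a generic illustration that assumes a cubic periodic cell and a single metal ion; the function name and array layout are hypothetical and do not describe the analysis code used in this work.

```python
import numpy as np

def metal_oxygen_rdf(metal_xyz, oxygen_xyz, box_length, r_max, n_bins):
    """RDF g(r) and running integration number n(r) around one metal ion.

    metal_xyz:  (n_frames, 3) metal positions
    oxygen_xyz: (n_frames, n_oxygen, 3) oxygen positions
    box_length: edge of the cubic periodic cell; r_max should not exceed
                box_length / 2 for the minimum-image convention to hold
    """
    metal_xyz = np.asarray(metal_xyz, dtype=float)
    oxygen_xyz = np.asarray(oxygen_xyz, dtype=float)
    n_frames, n_oxygen, _ = oxygen_xyz.shape

    delta = oxygen_xyz - metal_xyz[:, None, :]
    delta -= box_length * np.round(delta / box_length)  # minimum image
    dist = np.linalg.norm(delta, axis=-1).ravel()

    edges = np.linspace(0.0, r_max, n_bins + 1)
    counts, _ = np.histogram(dist, bins=edges)
    counts_per_frame = counts / n_frames

    shell_volume = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    density = n_oxygen / box_length ** 3
    g = counts_per_frame / (shell_volume * density)

    # n_run[i] is the average number of oxygens within the upper edge of bin i
    n_run = np.cumsum(counts_per_frame)
    r_mid = 0.5 * (edges[1:] + edges[:-1])
    return r_mid, g, n_run
```

The coordination number is then read off from the plateau of `n_run` after the first peak of `g`.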

Fig. 3: Metal-oxygen radial distribution functions (gX-O) and running integration numbers (nX-O) provided by 100 ps MLFF-MD and 10 ps FPMD simulations.

Gray and black lines are for the reduced and oxidized states, respectively. Solid and dashed lines are results obtained by the MLFF and FPsl, respectively. Graphics in the insets show the first solvation structures at the reduced and oxidized states.

Redox potentials

After verifying the size effect on the redox potentials obtained at the FPsl level using unit cells containing 32, 64 and 96 water molecules (see Uredox in Supplementary Fig. 13), calculations were conducted on the bulk solutions containing 64 water molecules in the unit cell. The computed redox potentials are compared with the experimental ones in Fig. 4. All relevant data (\({\left\langle {U}_{1}-{U}_{0}\right\rangle }_{\lambda }\) and \(\Delta \bar{\phi }\)), as well as results of other functionals with error bars, are summarized in Supplementary Notes 3, 4 and 5. The MLFFs trained on FPsl (RPBE+D3) (see ML in Fig. 4) lead to non-negligible deviations of 30-250 mV from the values of full FPsl calculations without any MLFF (see FPsl w/o ML) depending on the training data size (see Supplementary Note 9). The deviations can be corrected by two TI integrations [ML(Ox) → FPsl(Ox) and ML(Red) → FPsl(Red) in Fig. 2] as shown by FPsl w/ ML. However, the semi-local functional results in fairly large and non-systematic errors. For Ag, the redox potential is underestimated, whereas for Cu it is significantly overestimated compared to experiment.

Fig. 4: Computed and experimental redox potentials. ML means the results obtained by the MLFF trained on FPsl (RPBE+D3) without any correction.

The small letters 'w/ ML' under FPsl mean that the ML result was corrected by the scheme shown in Fig. 2. The letters 'w/o ML' mean the result calculated by FPsl using Eqs. (18) and (19) without MLFF. Here, FPnl means the PBE0+D3 (0.25) result obtained by the scheme shown in Fig. 2. Experimental values are taken from ref. 14.

The errors can be significantly decreased to 0.11 V on average using hybrid functionals with one-quarter exact exchange. As tabulated in Table 1, Uredox for the Cu2+/Cu+ couple decreases with increasing fractions of the exact exchange, whereas the redox potential for the Ag2+/Ag+ couple increases with increasing fractions. For Fe3+/Fe2+, the trend is not so obvious (first increase then slight decrease). Overall the present trends agree with the results obtained using semi-local and hybrid functionals as reported by Liu and co-workers11. Finally, the effects of Grimme’s dispersion correction are small for all redox couples. This implies that changes in the electronic properties (such as water valence band maximum and conduction band minimum) are most relevant, whereas all the functionals give a similar and good account of the solvation structure. It remains unclear, however, why one-quarter of exact exchange results in balanced accuracy. The functional form of PBE0 was rationalized by the adiabatic connection from the uncorrelated exact exchange to the fully interacting energy, which is approximated by the PBE functional71,82. Nonetheless, the ratio of exact exchange continues to be a parameter. One-quarter of exact exchange is known to achieve balanced accuracy for the geometries, thermochemistry, and spectroscopic properties of molecules. However, as reported in previous studies83, this functional underestimates the band gap of liquid water, even though it provides a more accurate prediction than the HSE06 functional. The mechanism behind this remains an open question.

Another key observation in this study lies in the relationship between the error of the ML surrogate model and the error in the redox potential. Our MLFF models achieve an RMSE of a few meV atom−1 for energy and tens of meV Å−1 for force. These accuracies can be considered standard level compared to ML models generated in past research50,60,63,64,67,84,85,86, yet they yield non-negligible deviations in the redox potential from the FP method. In comparison, Δ-ML models, which attained an RMSE substantially lower by more than an order of magnitude, markedly diminished the deviation in the redox potential to below 10 mV (refer to Supplementary Fig. 12). These results suggest that in aiming for an accuracy of 10 mV in reproducing the redox potential of the FP method, an RMSE at least an order of magnitude smaller than that shown by standard MLFFs is required. Achieving this level of accuracy is highly challenging for MLFFs, even if they are trained on larger training datasets, as demonstrated in the previous study on liquid water87. While the accuracy of emerging MLFFs continues to improve88,89, there is always a risk that machine learning models may produce errors concerning the structure of extrapolation regions outside the training data. Even in the future where machine learning models have further advanced, our ML correction schemes will serve as a powerful method for quantifying errors and providing results from accurate FP calculations.

In summary, our approach enables efficient statistical sampling that is indispensable for accurate computations of the free energies of aqueous systems. The TI and TPT calculations allow us to improve the accuracy from the ML model to the semi-local functional and from the semi-local functional to the hybrid functional step-by-step. Combining TPT and Δ-machine learning is particularly promising since this allows us to obtain statistically highly accurate results even for expensive functionals in very little compute time. For instance, it is well conceivable that one could also use methods beyond density functional theory for the final step. Our final results reproduce the redox potentials of the three transition metal cations with excellent accuracy using a standard hybrid functional. The integration pathways chosen here are generalizable to a wide variety of electron transfer reactions. We believe that the scheme will pave the way to first-principles electrochemistry predicting the key properties of redox reactions in energy conversion devices.

Methods

TI and TPT

The TI and TPT shown in Fig. 2 in the main text are conducted by using the equations listed below.

  • ML(Ox) → ML(Red)

$$\Delta {A}^{{{{\rm{ML}}}}}=\int\nolimits_{0}^{1}{\left\langle \frac{\partial {H}^{{{{\rm{ML}}}}}}{\partial \lambda }\right\rangle }_{\lambda }d\lambda ,$$
(11)
$${H}^{{{{\rm{ML}}}}}=\mathop{\sum }\limits_{i=1}^{{N}_{{{{\rm{a}}}}}}\frac{{\left\vert {{{{\bf{p}}}}}_{i}\right\vert }^{2}}{2{m}_{i}}+\lambda {U}_{1}^{{{{\rm{ML}}}}}+\left(1-\lambda \right){U}_{0}^{{{{\rm{ML}}}}}-Ne\Delta \bar{\phi }.$$
(12)
  • ML(Ox) → FPsl(Ox) and ML(Red) → FPsl(Red)

$$\Delta {A}_{\kappa }^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}-{{{\rm{ML}}}}}=\int\nolimits_{0}^{1}{\left\langle \frac{\partial {H}_{\kappa }^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}-{{{\rm{ML}}}}}}{\partial \eta }\right\rangle }_{\eta }d\eta ,$$
(13)
$${H}_{\kappa }^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}-{{{\rm{ML}}}}}=\mathop{\sum }\limits_{i=1}^{{N}_{{{{\rm{a}}}}}}\frac{{\left\vert {{{{\bf{p}}}}}_{i}\right\vert }^{2}}{2{m}_{i}}+\eta {U}_{\kappa }^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}+\left(1-\eta \right){U}_{\kappa }^{{{{\rm{ML}}}}}.$$
(14)
  • FPsl(Ox) → FPnl(Ox) and FPsl(Red) → FPnl(Red)

$$\Delta {A}_{\kappa }^{{{{{\rm{FP}}}}}_{{{{\rm{nl}}}}}-{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}\simeq {\left\langle \Delta {U}_{\kappa }^{\Delta {{{\rm{ML}}}}}\right\rangle }_{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}-\frac{\beta }{2}{\left\langle {\left(\Delta {U}_{\kappa }^{\Delta {{{\rm{ML}}}}}-{\left\langle \Delta {U}_{\kappa }^{\Delta {{{\rm{ML}}}}}\right\rangle }_{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}\right)}^{2}\right\rangle }_{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}.$$
(15)

The symbols \({U}_{\kappa }^{{{{{\rm{FP}}}}}_{{{{\rm{nl}}}}}}\), \({U}_{\kappa }^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}\) and \({U}_{\kappa }^{{{{\rm{ML}}}}}\) are the potential energies for the oxidized (κ = 0) and reduced (κ = 1) states calculated by the non-local functional, semi-local functional and MLFF trained on the semi-local functional, respectively. The symbol \(\Delta {U}_{\kappa }^{\Delta {{{\rm{ML}}}}}\) denotes the potential energy difference calculated by the Δ-ML model trained on the potential energy difference \({U}_{\kappa }^{{{{{\rm{FP}}}}}_{{{{\rm{nl}}}}}}-{U}_{\kappa }^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}\) between the non-local and semi-local functionals. In Eq. (15), the second-order cumulant expansion is employed. The expansion is exact if the probability distribution of \(\Delta {U}_{\kappa }^{\Delta {{{\rm{ML}}}}}\) is Gaussian (see the derivation in Supplementary Note 8). The condition is reasonably satisfied as shown in Supplementary Fig. 11. Preliminary TI and TPT simulations using the MLFFs also indicate that the TPT calculation reproduces TI results as shown in Supplementary Note 6.
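The second-order cumulant estimate of Eq. (15) and a simple check of the Gaussian assumption can be sketched as follows. The skewness and excess kurtosis diagnostics are a generic choice made here for illustration and are not claimed to be the test used in the Supplementary material; the function names are hypothetical.

```python
import numpy as np

def second_order_cumulant(delta_u, beta):
    """Eq. (15): <dU> - beta/2 * <(dU - <dU>)^2>, sampled on the FPsl ensemble."""
    delta_u = np.asarray(delta_u, dtype=float)
    mean = delta_u.mean()
    variance = np.mean((delta_u - mean) ** 2)
    return mean - 0.5 * beta * variance

def gaussian_diagnostics(delta_u):
    """Skewness and excess kurtosis; both vanish for a Gaussian distribution."""
    delta_u = np.asarray(delta_u, dtype=float)
    centered = delta_u - delta_u.mean()
    m2 = np.mean(centered ** 2)
    skewness = np.mean(centered ** 3) / m2 ** 1.5
    excess_kurtosis = np.mean(centered ** 4) / m2 ** 2 - 3.0
    return skewness, excess_kurtosis

# Tiny deterministic example: mean 0.1, variance 0.01
print(second_order_cumulant([0.0, 0.2], beta=2.0))
```

Large values of either diagnostic would suggest that higher-order cumulants matter and that the full exponential average of Eq. (9) should be compared against the truncated expansion.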

The free energy differences of the FPsl and FPnl methods are obtained as

$$\Delta {A}^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}=\Delta {A}^{{{{\rm{ML}}}}}+\Delta {A}_{1}^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}-{{{\rm{ML}}}}}-\Delta {A}_{0}^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}-{{{\rm{ML}}}}},$$
(16)
$$\Delta {A}^{{{{{\rm{FP}}}}}_{{{{\rm{nl}}}}}}=\Delta {A}^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}+\Delta {A}_{1}^{{{{{\rm{FP}}}}}_{{{{\rm{nl}}}}}-{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}-\Delta {A}_{0}^{{{{{\rm{FP}}}}}_{{{{\rm{nl}}}}}-{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}.$$
(17)
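The sign convention of Eqs. (16) and (17) can be made explicit with a short worked example. The helper name and all numerical values below are hypothetical and serve only to show how the corrections for the reduced (κ = 1) and oxidized (κ = 0) states enter.

```python
def corrected_free_energy(da_low, corr_reduced, corr_oxidized):
    """Add the correction at kappa = 1 and subtract the one at kappa = 0."""
    return da_low + corr_reduced - corr_oxidized


# Hypothetical values in eV, not results from this work.
da_ml = -1.000

# Eq. (16): MLFF -> semi-local FP
da_fp_sl = corrected_free_energy(da_ml, corr_reduced=0.020, corr_oxidized=0.005)

# Eq. (17): semi-local FP -> non-local FP
da_fp_nl = corrected_free_energy(da_fp_sl, corr_reduced=0.150, corr_oxidized=-0.100)
```

Each step closes a thermodynamic cycle: the lower-level free energy difference is shifted by the difference between the end-state corrections.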

To validate the MLFF-aided computations of the free energy difference \(\Delta {A}^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}\), the same quantity was also computed by TI without using the ML method:

$$\Delta {A}^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}=\int\nolimits_{0}^{1}{\left\langle \frac{\partial {H}^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}}{\partial \lambda }\right\rangle }_{\lambda }d\lambda ,$$
(18)
$${H}^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}=\mathop{\sum }\limits_{i=1}^{{N}_{{{{\rm{a}}}}}}\frac{{\left\vert {{{{\bf{p}}}}}_{i}\right\vert }^{2}}{2{m}_{i}}+\lambda {U}_{1}^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}+\left(1-\lambda \right){U}_{0}^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}-Ne\Delta \bar{\phi }.$$
(19)

The TI calculation in Eq. (2) can be decomposed into the two terms on the right-hand side of Eq. (7). The integrand in the first term varies nonlinearly along the coupling path (see Supplementary Figs. 8 and 9), while the integrand in Eq. (8), which is relevant to the second term in Eq. (7), varies only slightly (see Supplementary Table 4). To integrate the first term in Eq. (7), Simpson’s rule with five equidistant points was used, following the previous FP study by Blumberger and co-workers41. For the integration in Eq. (8), the average of the O 1s levels in the fully reduced and oxidized states was used, based on the trapezoidal rule. For each point, the ensemble average was taken over an 80-ps NVT-ensemble MD simulation at 298 K after a 20 ps equilibration.

As in the MLFF calculations, Simpson’s rule with five equidistant points was used for the TI calculation in Eq. (18). For each grid point, the ensemble average was taken over a 20-ps MD simulation starting from the final structure of the TI calculation using the MLFF at the same grid point. Each initial structure of the MD simulations was prepared in two stages: the system was first annealed from 1000 K to 400 K by a 1-ns NVT-ensemble MD simulation using the polymer consistent force field (PCFF)90 implemented in an in-house MD program91, and then annealed from 400 K to 298 K by a 100-ps NVT-ensemble MD simulation using the MLFF.

Supplementary Figs. 8 and 9 show the integrands of Eqs. (11) and (18), respectively, as functions of the coupling parameter λ. In the same figures, probability distributions of \(\Delta {U}^{{{{\rm{ML}}}}}={U}_{1}^{{{{\rm{ML}}}}}-{U}_{0}^{{{{\rm{ML}}}}}\) and \(\Delta {U}^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}={U}_{1}^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}-{U}_{0}^{{{{{\rm{FP}}}}}_{{{{\rm{sl}}}}}}\) at each λ are also shown. For all redox couples, the variance of the distribution changes with λ, and thus the integrand is non-linear with respect to λ [see Supplementary Eq. (14)]. Hence, the second-order cumulant expansion in Supplementary Eq. (12) is not applicable to the whole integration from the oxidized state to the reduced state.
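The five-point Simpson integration over the coupling parameter can be sketched as follows. The function name is an assumption, and the integrand values are generated from an arbitrary cubic polynomial that stands in for the ensemble averages at each λ; they are not simulation data.

```python
def simpson_equidistant(values, lower=0.0, upper=1.0):
    """Composite Simpson's rule on an odd number of equidistant points."""
    n = len(values)
    if n < 3 or n % 2 == 0:
        raise ValueError("Simpson's rule needs an odd number of points (>= 3)")
    h = (upper - lower) / (n - 1)
    total = values[0] + values[-1]
    total += 4.0 * sum(values[1:-1:2])  # odd-indexed interior points
    total += 2.0 * sum(values[2:-1:2])  # even-indexed interior points
    return total * h / 3.0


# Five equidistant coupling-parameter values between 0 and 1.
lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]

# Stand-in for the ensemble-averaged integrand at each lambda (eV);
# a cubic is used because Simpson's rule integrates it exactly.
integrand = [1.0 - 2.0 * lam + 0.5 * lam ** 3 for lam in lambdas]

delta_a = simpson_equidistant(integrand)
```

In an actual TI calculation, each entry of `integrand` would be replaced by the ensemble average obtained from the MD simulation at the corresponding grid point.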

The TI calculations in Eq. (13) were conducted using the trapezoidal rule with three equidistant points. At each point, a 10-ps NVT-ensemble MD simulation at 298 K was performed. The integrands shown in Supplementary Fig. 10 are smaller than those shown in Supplementary Figs. 8 and 9, and they are nearly proportional to the coupling parameter η.

In the TPT calculations using the Δ-ML models, the ensemble average in Eq. (15) was taken over 1400 configurations selected randomly from 70-ps NVT-ensemble FPMD simulations using the FPsl method. Although these FPMD simulations are expensive, the overall computational time is still much shorter than that of full FP simulations. To verify the applicability of the second-order cumulant expansion, we show the probability distributions of the energy difference \(\Delta {U}_{\kappa }^{\Delta \rm{ML}}\) in Supplementary Fig. 11. The distributions are well fitted by Gaussian functions, indicating that Eq. (15) is a reasonable approximation.
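A simple numerical complement to fitting a Gaussian is to inspect the sample skewness and excess kurtosis, both of which vanish for a Gaussian distribution. The sketch below is an illustrative alternative check, not the procedure used in this work; the function name and the synthetic samples are assumptions.

```python
import random
import statistics


def skewness_and_excess_kurtosis(samples):
    """Return the sample skewness and excess kurtosis (both 0 for a Gaussian)."""
    mean = statistics.fmean(samples)
    centered = [s - mean for s in samples]
    m2 = statistics.fmean(c ** 2 for c in centered)
    m3 = statistics.fmean(c ** 3 for c in centered)
    m4 = statistics.fmean(c ** 4 for c in centered)
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


rng = random.Random(1)

# 1400 synthetic Gaussian samples (eV), mimicking the sample size used above.
gaussian_like = [rng.gauss(0.30, 0.02) for _ in range(1400)]
skew_g, kurt_g = skewness_and_excess_kurtosis(gaussian_like)

# A strongly skewed distribution for contrast (exponential distribution).
skewed = [rng.expovariate(1.0) for _ in range(1400)]
skew_e, kurt_e = skewness_and_excess_kurtosis(skewed)
```

The skewness is the relevant diagnostic because the leading term neglected in the second-order expansion is proportional to the third cumulant of the distribution.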

The MD simulations were performed using a Langevin thermostat92. For efficient sampling, the hydrogen mass and the time step were set to 2 amu and 1 fs, respectively.

MLFF and Δ-ML

Similar to previous machine-learning approaches63,64, the potential energy \(U\) of a structure with \({N}_{{{{\rm{a}}}}}\) atoms in our MLFF method is approximated as a summation of local energies \({U}_{i}\), written as

$$U=\mathop{\sum }\limits_{i=1}^{{N}_{{{{\rm{a}}}}}}{U}_{i}.$$
(20)

Following the Gaussian approximation potential pioneered by Bartók and co-workers64, the local energy \({U}_{i}\) is approximated as a weighted sum of kernel functions \(K({{{{\bf{x}}}}}_{i},{{{{\bf{x}}}}}_{{i}_{{{{\rm{B}}}}}})\) centered at reference points \(\{{{{{\bf{x}}}}}_{{i}_{{{{\rm{B}}}}}}| {i}_{{{{\rm{B}}}}}=1,...,{N}_{{{{\rm{B}}}}}\}\):

$${U}_{i}=\mathop{\sum }\limits_{{i}_{{{{\rm{B}}}}}=1}^{{N}_{{{{\rm{B}}}}}}{w}_{{i}_{{{{\rm{B}}}}}}K\left({{{{\bf{x}}}}}_{i},{{{{\bf{x}}}}}_{{i}_{{{{\rm{B}}}}}}\right).$$
(21)

The coefficients \(\{{w}_{{i}_{{{{\rm{B}}}}}}| {i}_{{{{\rm{B}}}}}=1,...,{N}_{{{{\rm{B}}}}}\}\) are optimized to best reproduce the FP energies, forces, and stress tensor components obtained by the FPMD simulations. The descriptor \({{{{\bf{x}}}}}_{i}\) used in this study is a vector containing two- and three-body contributions67:

$${{{{\bf{x}}}}}_{i}^{{{{\rm{T}}}}}\to \left(\sqrt{{\beta }^{(2)}}{{{{\bf{x}}}}}_{i}^{(2){{{\rm{T}}}}},\sqrt{{\beta }^{(3)}}{{{{\bf{x}}}}}_{i}^{(3){{{\rm{T}}}}}\right),$$
(22)

Here, \({\beta }^{(2)}\) and \({\beta }^{(3)}\) \((=1-{\beta }^{(2)})\) are the weights on the two- and three-body descriptors, \({{{{\bf{x}}}}}_{i}^{(2)}\) and \({{{{\bf{x}}}}}_{i}^{(3)}\), respectively. The vectors \({{{{\bf{x}}}}}_{i}^{(2)}\) and \({{{{\bf{x}}}}}_{i}^{(3)}\) collect the expansion coefficients of the two- and three-body distribution functions with respect to the orthonormal radial and angular basis sets60,67:

$${\rho }_{i}^{(2)}\left(r\right)=\frac{1}{\sqrt{4\pi }}\mathop{\sum }\limits_{n=1}^{{N}_{{{{\rm{R}}}}}^{0}}{c}_{n}^{i}{\chi }_{n0}\left(r\right)$$
(23)
$${\rho }_{i}^{(3)}\left(r,s,\theta \right)=\mathop{\sum }\limits_{l=0}^{{L}_{\max }}\mathop{\sum }\limits_{n=1}^{{N}_{{{{\rm{R}}}}}^{l}}\mathop{\sum }\limits_{\nu =1}^{{N}_{{{{\rm{R}}}}}^{l}}\sqrt{\frac{2l+1}{2}}{p}_{n\nu l}^{i}{\chi }_{nl}\left(r\right){\chi }_{\nu l}\left(s\right){P}_{l}\left(\cos \theta \right).$$
(24)

The two- and three-body distribution functions \({\rho }_{i}^{(2)}\) and \({\rho }_{i}^{(3)}\) are defined as:

$${\rho }_{i}^{(2)}\left(r\right)=\frac{1}{4\pi }\int\,{\rho }_{i}\left(r\hat{{{{\bf{r}}}}}\right)d\hat{{{{\bf{r}}}}},$$
(25)
$${\rho }_{i}^{(3)}\left(r,s,\theta \right)=\iint \,d\hat{{{{\bf{r}}}}}d\hat{{{{\bf{s}}}}}\ \delta \left(\hat{{{{\bf{r}}}}}\cdot \hat{{{{\bf{s}}}}}-\cos \theta \right){\rho }_{i}\left(r\hat{{{{\bf{r}}}}}\right){\rho }_{i}\left(s\hat{{{{\bf{s}}}}}\right),$$
(26)
$${\rho }_{i}\left({{{\bf{r}}}}\right)=\mathop{\sum }\limits_{j=1}^{{N}_{{{{\rm{a}}}}}}{f}_{{{{\rm{cut}}}}}\left(\left\vert {{{{\bf{r}}}}}_{j}-{{{{\bf{r}}}}}_{i}\right\vert \right)g\left({{{\bf{r}}}}-\left({{{{\bf{r}}}}}_{j}-{{{{\bf{r}}}}}_{i}\right)\right)$$
(27)

The function \(g\) is a smoothed δ-function, and \({f}_{{{{\rm{cut}}}}}\) is a cutoff function that smoothly eliminates the contribution from atoms outside a given cutoff radius \({R}_{{{{\rm{cut}}}}}\). For \({\chi }_{nl}\) and \({P}_{l}\), normalized spherical Bessel functions \({\chi }_{nl}={j}_{l}({q}_{n}r)\) and Legendre polynomials of order \(l\) are used in this work, respectively. For the kernel basis functions, the smooth overlap of atomic positions (SOAP) kernel50 is employed:

$$K\left({{{{\bf{x}}}}}_{i},{{{{\bf{x}}}}}_{{i}_{{{{\rm{B}}}}}}\right)={\left({\hat{{{{\bf{x}}}}}}_{i}\cdot {\hat{{{{\bf{x}}}}}}_{{i}_{{{{\rm{B}}}}}}\right)}^{\zeta }.$$
(28)

The hat symbol \({\hat{{{{\bf{x}}}}}}_{i}\) denotes the normalized vector of \({{{{\bf{x}}}}}_{i}\). The normalization and exponentiation in Eq. (28) introduce non-linear terms that mix two- and three-body contributions.
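The descriptor weighting in Eq. (22), the kernel in Eq. (28), and the local-energy expansion in Eq. (21) can be combined in a minimal sketch. The function names, the toy descriptor vectors, the weights, and the values of \({\beta }^{(2)}\) and ζ are all hypothetical; the actual parameters are those listed in Supplementary Table 1.

```python
import math


def combined_descriptor(x2, x3, beta2):
    """Eq. (22): concatenate two- and three-body parts with sqrt weights."""
    beta3 = 1.0 - beta2
    return ([math.sqrt(beta2) * v for v in x2]
            + [math.sqrt(beta3) * v for v in x3])


def normalize(x):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]


def kernel(x_i, x_b, zeta):
    """Eq. (28): dot product of normalized descriptors raised to zeta."""
    xi_hat, xb_hat = normalize(x_i), normalize(x_b)
    return sum(a * b for a, b in zip(xi_hat, xb_hat)) ** zeta


def local_energy(x_i, reference_points, weights, zeta):
    """Eq. (21): weighted sum of kernels over the reference points."""
    return sum(w * kernel(x_i, x_b, zeta)
               for w, x_b in zip(weights, reference_points))


# Toy descriptors and regression weights (hypothetical values).
x_a = combined_descriptor([1.0, 0.5], [0.2, 0.1, 0.4], beta2=0.5)
x_b = combined_descriptor([0.9, 0.6], [0.1, 0.2, 0.4], beta2=0.5)
u_a = local_energy(x_a, [x_a, x_b], weights=[0.3, -0.1], zeta=4)
```

Because the two- and three-body parts share a single normalization and the dot product is raised to the power ζ, the kernel contains cross terms between them, which is the mixing referred to above.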

The same formulation is used for the Δ-ML method. In the Δ-ML method, differences in potential energies and forces between two FP methods, semi-local and non-local functionals in this study, are used as the training data.
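How the Δ-ML training targets are assembled, and how a Δ-ML prediction is later used as a correction, can be sketched as follows. The function names and the toy energies and forces are assumptions for illustration only.

```python
def delta_targets(energy_nl, forces_nl, energy_sl, forces_sl):
    """Training targets: non-local minus semi-local energies and forces."""
    d_energy = energy_nl - energy_sl
    d_forces = [
        [f_nl - f_sl for f_nl, f_sl in zip(atom_nl, atom_sl)]
        for atom_nl, atom_sl in zip(forces_nl, forces_sl)
    ]
    return d_energy, d_forces


def delta_corrected_energy(energy_sl, delta_prediction):
    """Semi-local energy plus the Delta-ML prediction of the difference."""
    return energy_sl + delta_prediction


# Toy single-structure data for two atoms (hypothetical values, eV and eV/A).
e_nl, e_sl = -10.50, -10.00
f_nl = [[0.10, 0.00, -0.20], [-0.10, 0.00, 0.20]]
f_sl = [[0.08, 0.01, -0.15], [-0.08, -0.01, 0.15]]

d_e, d_f = delta_targets(e_nl, f_nl, e_sl, f_sl)
e_corrected = delta_corrected_energy(e_sl, d_e)
```

Since the model only has to learn the difference between the two functionals, far fewer training structures are needed than for a force field fitted to the total energy.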

Parameter sets of the descriptors and kernel basis functions used in previous publications were employed in this study60,62,67. The parameters are tabulated in Supplementary Table 1.

Bulk solutions containing the redox species were modeled by the systems shown in Fig. 1. Three model sizes, containing 32, 64, and 96 water molecules, were examined to clarify the system size effect. The sizes of the unit cells were set to obtain a water density of 0.99 g cm−3. The size of the unit cell for the 32 water molecules is the same as the one used in previous FPMD studies10,11,41,80. For each of the reduced and oxidized states, MLFF and Δ-ML models were constructed. All MLFF models were generated on the fly during a 100-ps NVT-ensemble MD simulation at 400 K by using the active-learning algorithm developed in our previous study60. The temperature for the training runs was set higher than the target temperature of 298 K for the production runs, so that the training data and kernel basis functions were sampled from a wider region of phase space. A Langevin thermostat92 was used to control the temperature. Exchange-correlation interactions between electrons were described by the semi-local RPBE functional68 with Grimme’s D3 dispersion corrections69,70. Probability distributions of the errors of the constructed MLFFs for energies and forces on test data are shown in Supplementary Figs. 1 to 4. The RMSEs are similar to those of MLFFs used in previous studies60,65,66,67.
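The relation between the number of water molecules and the cell size at a density of 0.99 g cm−3 can be illustrated by a short calculation. The sketch assumes a cubic cell and neglects the mass and volume of the redox species, neither of which is stated above; the function name is likewise hypothetical.

```python
AVOGADRO = 6.02214076e23   # 1/mol
WATER_MOLAR_MASS = 18.015  # g/mol


def cubic_cell_length(n_water, density_g_cm3=0.99):
    """Edge length (in angstrom) of a cubic cell holding n_water molecules."""
    mass_g = n_water * WATER_MOLAR_MASS / AVOGADRO
    volume_a3 = mass_g / density_g_cm3 * 1.0e24  # 1 cm^3 = 1e24 cubic angstrom
    return volume_a3 ** (1.0 / 3.0)


# Edge lengths for the three model sizes considered here.
lengths = {n: cubic_cell_length(n) for n in (32, 64, 96)}
```

Under these assumptions, the 32-molecule cell has an edge length of roughly 9.9 Å, and the edge length grows with the cube root of the number of molecules.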

After examining the system size effect using the semi-local functional (see results in Supplementary Fig. 13), Δ-ML models were constructed on systems containing 64 water molecules. Each Δ-ML model was trained on FP energies and forces of 40 structures selected randomly from a trajectory of a 20-ps NVT-ensemble FPMD simulation at 298 K. The FPMD simulation was performed using the RPBE+D3 functional. Differences in energies and forces between the non-local and semi-local functionals for these 40 structures were used as training data. The PBE0 functional72, with and without Grimme’s D3 dispersion correction69,70, was employed as the non-local functional because it is known to accurately predict the properties of water93,94,95,96,97,98,99,100,101. The fraction of exact exchange was set to 0.25 and 0.50 to determine how it influences the redox potentials. Error distributions of the Δ-ML models on test structures are shown in Supplementary Figs. 2 to 4. The RMSEs are one to two orders of magnitude smaller than the errors of the RPBE+D3 MLFFs.

The vacuum-water interface for the production run was modeled by a pure water slab, without the redox species, composed of 128 water molecules per unit cell. Following the previous study45, a rectangular cell of 12.5 × 12.5 × 50 Å3 was employed. As for the bulk solution systems containing the redox species, the MLFF for the interface was generated by using the active-learning scheme. The systems used for the training were a pure water bulk composed of 64 water molecules in a 12.4 × 12.4 × 12.4 Å3 cubic cell and a pure water slab composed of 64 water molecules in an 8.8 × 8.8 × 40.8 Å3 rectangular cell. Training simulations for both the bulk and the slab were performed by NVT-ensemble MD simulations at 300, 400, 600 and 800 K. As shown in Supplementary Fig. 1, the constructed MLFF yields small errors on test data taken from a 100-ps MD simulation of a water slab composed of 128 water molecules at 298 K.

The annealing procedure used for the production runs, explained in the previous subsection, was also used to prepare the initial structures for the training runs. All FP calculations were performed using VASP56,57. A 2 × 2 × 2 k-point mesh was used for the bulk systems containing 32 water molecules. For the other systems, only the Γ-point was used. The plane-wave cutoff energy was set to 520 eV. The PAW potentials58,59 distributed with VASP 5.4 were used in all FP calculations. The PAW atomic reference configuration was 1s1 for H, 2s22p4 for O, 3d74s1 for Fe, and 4d105s1 for Ag. For Cu, two atomic configurations, 3d104p1 and 3p63d104p1, were compared to examine the impact of semi-core electron relaxation on the redox potential. After verifying that this effect is minimal within the PAW framework in VASP, as detailed in Supplementary Note 7, we employed the less computationally demanding 3d104p1 configuration. The parameters for the MD simulations are the same as those described in the previous subsection.