Introduction

A potential energy surface (PES), which yields the potential energy of a system of atoms for given atomic coordinates, is the fundamental enabler for atomistic simulation methods. In principle, ab initio or first principles methods that solve the Schrödinger equation, typically with some approximation within the Kohn-Sham density functional theory (DFT) framework,1,2 can be applied to directly calculate the PES. While such methods are highly accurate and transferable across diverse chemistries and bonding types, their high computational cost limits their application in molecular dynamics (MD) simulations to relatively small and simple systems containing up to a few hundred atoms and to sub-nanosecond time scales. Empirical interatomic potentials, on the other hand, are a much cheaper alternative. The functional form of these potentials is drastically simplified, with only a few fitting parameters chosen to satisfy physical considerations.3,4 However, the accuracy of empirical potentials is necessarily limited by the approximations made in selecting the functional form, and such potentials are generally not transferable to another system with different bonding types.

In recent years, an alternative approach has gained popularity in constructing interatomic potentials with improved transferability.5,6,7,8,9,10 In this approach, the atomic coordinates are featurized using local environment descriptors that are invariant to translations, rotations and permutations of homo-nuclear atoms, and are differentiable and unique.8,11 A machine learning model is then trained to map the structural features to data (energies, forces, etc.) from first principles calculations. Such potentials have been demonstrated to achieve accuracy close to first principles methods at much lower computational costs.5,7,9,10

The coefficients of the bispectrum of local atomic density were first applied in the Gaussian approximation potential by Bartók et al.7 Thompson et al. later showed that a linear model of bispectrum coefficients from the lowest order—the so-called spectral neighbor analysis potential (SNAP)—can accurately reproduce DFT energies and forces as well as a variety of calculated properties (e.g., elastic constants and migration barrier for screw dislocations) in bcc Ta and W.9,12 More recently, the current authors have extended the SNAP formalism to bcc Mo, fcc Ni, and Cu, and the binary fcc Ni-bcc Mo alloy systems and showed that it outperforms traditional embedded atom method (EAM) and modified EAM potentials across a wide range of properties.13,14 Thus far, SNAP models have mainly been developed for metallic systems.

For ionic systems, a common strategy in constructing interatomic potentials is to incorporate long-ranged electrostatic interactions (e.g., through the use of the Ewald summation) on top of the short-range energy model. This has been done for both traditional empirical models15,16 and modern local atomic environment descriptor-based potentials (e.g., GAP for the mixed ionic-covalent GaN7 and a neural network potential for ZnO6). In this work, we develop a highly accurate electrostatic SNAP (eSNAP) model for ionic α-Li3N (see Fig. 1a). α-Li3N is one of the earliest lithium superionic conductors ever reported,17 and remains a promising solid electrolyte/anode coating candidate today due to its stability against Li metal.18,19 A highly accurate potential model for α-Li3N would enable large-scale, long-time-scale diffusion studies of this highly important prototypical lithium conductor, as well as serve as a platform on which to develop similar potentials for more complex systems.

Fig. 1

a Unit cell of α-Li3N (space group: P6/mmm). Green: Li; gray: N. b Intra-planar and c inter-planar Frenkel defect configurations. The vacancy (white dashed circle X) forms at a Li2 site in both cases, while the interstitial (red) forms at a Li2 site in the intra-planar configuration and at a Li1 site in the inter-planar configuration

Results

Optimized model parameters

In our proposed electrostatic SNAP (eSNAP) model, we write the total potential energy Ep as the sum of the electrostatic contributions and the local energy (SNAP) due to the variations in atomic local environments, as follows:

$$E_{\mathrm{p}} = \gamma E_{{\mathrm{el}}} + E_{{\mathrm{SNAP}}}$$
(1)
$${\mathbf{F}}_j = - \nabla _jE_{\mathrm{p}} = - \gamma \nabla _jE_{{\mathrm{el}}} + {\mathbf{F}}_{j,{\mathrm{SNAP}}}$$
(2)

where Eel and ESNAP are the electrostatic energy computed using the Ewald summation approach20 and the energy from SNAP, respectively, and γ is an effective screening prefactor for electrostatic interactions. An iterative procedure was developed to fit all model parameters using total energies and forces from DFT calculations until the training and test errors are converged (see Methods section for details).

For Li3N, we calculated the electrostatic energy by assigning formal charges of +1 and −3 to Li and N, respectively. For highly ionic α-Li3N, we find that assigning formal charges, with screening accounted for via a fitted parameter (γ in Eq. (1)), results in a simpler, more stable potential model than variable charge models such as the charge equilibration (QEq)21 method. The narrow charge distribution of Li atoms from Bader analysis (see Fig. S1) also supports the use of fixed charges. The final hyperparameters and coefficients for the optimized eSNAP model are given in Table 1. The optimized effective screening parameter γ is 0.057.

Table 1 Final hyperparameters and coefficients of SNAP

Energy and force prediction

Figure 2a, b compares the DFT-calculated and eSNAP-predicted energies and forces on both the training and test datasets in the final iteration. Both energy and force predictions agree well with those from DFT calculations, indicating that the eSNAP model has successfully captured the fundamental relationship between atomic environment and potential energy/atomic forces. The mean absolute errors (MAEs) on energies and forces reached convergence after only two iterations, as shown in Fig. 2c, d. In comparison, the MAEs between DFT and the Coulomb–Buckingham potential of ref. 22 on the initial training configuration pool are substantially higher for both energies (22 meV/atom) and forces (0.48 eV/Å).

Fig. 2

Energy and force prediction errors for eSNAP. Comparisons between DFT and eSNAP predictions for a energies and b forces on both training and test dataset in the final iteration. Convergence of the test and training MAEs for c energies and d forces with iteration number

Structural properties

Table 2 compares the physical properties of α-Li3N computed with different potential energy surfaces. The lattice constants calculated from eSNAP agree with those from DFT and experiments.23 The calculated elastic constants from eSNAP also match reasonably well with DFT-calculated and experimental values.24 This agreement on structural properties is expected, because the energies of unit cells with various distortions were fed to the model with a large sample weight. In comparison, the lattice constants and elastic constants from the Coulomb–Buckingham potential match poorly with both DFT and experimental values, despite the fact that these physical properties were used to determine the potential parameters.22

Table 2 Calculated structural properties from different potentials and lattice constants23 and elastic constants24 from experimental measurements

We have also calculated the formation energy of Li Frenkel defects and the migration barrier of these defects. We considered two Frenkel configurations in which a vacancy is introduced on a Li2 site and the interstitial Li is located at either a Li2 site (intra-planar, Fig. 1b) or a Li1 site (inter-planar, Fig. 1c). All defect configurations were fully relaxed within each potential. The eSNAP model yields an intra-planar defect formation energy reasonably close to the DFT value, but overestimates the inter-planar defect formation energy by 0.12 eV. On the other hand, the Coulomb–Buckingham potential underestimates the defect formation energies, likely due to the use of unsatisfactory lattice constants in building the defect configurations. Using the nudged elastic band (NEB) method,25 we calculated the migration barrier of two types of hops, namely intra-planar (Li2 to Li2) vacancy migration and inter-planar (Li1 to Li2) Li interstitial migration. As shown in Table 2, the eSNAP barrier for intra-planar vacancy migration is in good agreement with the DFT barrier, while the eSNAP barrier for inter-planar interstitial migration underestimates the DFT barrier by 0.13 eV. We note that the eSNAP defect formation energies and migration barriers for the dominant intra-planar diffusion direction are reasonably close to the DFT values, while the overestimation of the inter-planar defect formation energy by eSNAP is compensated by the underestimation of the inter-planar interstitial migration barrier. Similar errors and error compensations have been reported in prior non-electrostatic SNAP models on metals such as Mo and Ni.13,14 On the other hand, we were unable to converge the NEB barriers using the Coulomb–Buckingham potential due to its inability to model the transition states.

Finally, Fig. 3 compares the calculated phonon dispersion curves of α-Li3N from eSNAP with those from DFT calculations. The phonon dispersion curves were calculated using the finite displacement approach on a 3 × 3 × 3 supercell as implemented in the phonopy package.26 We find that the phonon dispersion curves calculated from eSNAP are in good agreement with those from DFT. The only discrepancy is the imaginary phonon mode at the Γ point observed in the DFT phonon dispersion. According to Wu et al.,27 this lattice instability is associated with the vibration of Li2 sites along the c axis, resulting in a more stable phase that is only 0.3 meV/atom lower in energy after displacing the Li2 site by ~0.1 Å. This energy difference is well within the energy prediction error of the eSNAP model. We also note that the experimentally measured phonon dispersion curves at room temperature do not exhibit this lattice instability.24 In contrast, the phonon dispersion curves calculated from the Coulomb–Buckingham potential show severely overestimated frequencies (Fig. S2) due to its unsatisfactory force prediction.

Fig. 3

Phonon dispersion curves of α-Li3N calculated from DFT and eSNAP

Bulk diffusion

MD simulations were performed using the optimized eSNAP to investigate Li diffusion in bulk α-Li3N. The simulation box is a 5 × 5 × 5 supercell of bulk α-Li3N containing 500 atoms, built from the unit cell at its equilibrium volume. MD simulations were carried out at elevated temperatures from 600 to 1200 K in an NVT ensemble for 1 ns.

We first validated the eSNAP by comparing the mean square displacement (MSD) and diffusivities obtained from eSNAP MD simulations with those obtained from ab initio molecular dynamics (AIMD) simulations at high temperatures (1000 and 1200 K). Runs at lower temperatures were not chosen because the diffusivity converges poorly within the limited AIMD simulation length (40 ps). It should be noted that even though 1200 K is above the melting point of Li3N, the lattice did not melt in either AIMD or eSNAP MD during the short period of simulations. As shown in Fig. S3, the generally high Li mobility and anisotropic diffusion in α-Li3N are successfully reproduced with eSNAP MD simulations. The tracer diffusivities (given by the slope of the MSD with respect to time) from eSNAP MD (1.48 × 10−4 cm2/s at 1000 K, 2.35 × 10−4 cm2/s at 1200 K) are in generally good agreement with those from AIMD (1.28 × 10−4 cm2/s at 1000 K, 2.16 × 10−4 cm2/s at 1200 K), showing a slight overestimation of about 16% and 9% at 1000 K and 1200 K, respectively.

Beyond tracer diffusivities, the orders of magnitude lower computational cost of the eSNAP relative to DFT affords us the capability to compute the charge diffusivity Dσ. For each temperature, 100 independent simulations were performed starting from different initial velocities. Diffusivities were obtained by averaging square displacements over all simulations at a particular temperature. Figure 4 plots the predicted Haven ratio and Arrhenius plot for Li3N from eSNAP MD simulations. The activation energies, extrapolated room temperature conductivities and average Haven ratio across all temperatures are tabulated in Table 3. The anisotropic diffusion in α-Li3N observed experimentally28,29 is reproduced in many aspects, including the magnitude of diffusivity, activation energy, and Haven ratio. The higher diffusivities and lower activation energy in the direction perpendicular to the c axis are consistent with the lower Haven ratio found. The activation energy perpendicular to the c axis is close to the one from single crystal measurements, though the value parallel to the c axis is much lower than in experiments.28 The lower activation energies lead to much higher extrapolated room temperature ionic conductivity for both directions. The Haven ratios obtained from eSNAP MD are reasonably close to the NMR measured values. We note that the activation energies obtained from MD simulations are lower than the sum of defect formation and migration energies. This is a result of the concerted motion of ions lowering the energy barriers, which is confirmed by the low Haven ratio. In comparison, we also performed a similar series of MD simulations with the Coulomb–Buckingham potential,22 and the results significantly underestimate the fast ionic conduction in α-Li3N and significantly overestimate the Haven ratio. In particular, the Haven ratio for the direction parallel to the c axis is computed to be >1 using the Coulomb–Buckingham potential.
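The activation energies and room temperature extrapolations discussed above follow from an Arrhenius fit of diffusivity against inverse temperature. The text does not spell out the fitting procedure, so the following is only a minimal sketch assuming an ordinary least-squares fit of ln D versus 1/T. The function names and the synthetic data (generated from an assumed activation energy of 0.20 eV and prefactor of 10−3 cm2/s) are hypothetical and are not results from this work.

```python
import numpy as np

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_fit(temperatures, diffusivities):
    """Fit ln(D) = ln(D0) - Ea / (kB * T) and return (Ea in eV, D0)."""
    inv_t = 1.0 / np.asarray(temperatures, dtype=float)
    ln_d = np.log(np.asarray(diffusivities, dtype=float))
    slope, intercept = np.polyfit(inv_t, ln_d, 1)
    return -slope * K_B, float(np.exp(intercept))


def extrapolate(temperature, ea, d0):
    """Evaluate the fitted Arrhenius law at another temperature."""
    return d0 * np.exp(-ea / (K_B * temperature))


# Synthetic placeholder data (NOT from the paper): Ea = 0.20 eV, D0 = 1e-3 cm^2/s
temps = np.array([600.0, 800.0, 1000.0, 1200.0])
d_synth = 1e-3 * np.exp(-0.20 / (K_B * temps))

ea, d0 = arrhenius_fit(temps, d_synth)
d_300 = extrapolate(300.0, ea, d0)
print(f"Ea = {ea:.3f} eV, D(300 K) = {d_300:.3e} cm^2/s")
```

Because the extrapolation depends exponentially on the fitted activation energy, small errors in Ea translate into large changes in the extrapolated room temperature values.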

Fig. 4

a Haven ratio and b Arrhenius plot for Li charge diffusivity in bulk α-Li3N obtained from eSNAP MD simulations

Table 3 Bulk diffusion results from MD simulations using eSNAP and Coulomb–Buckingham potential and single crystal dc conductivity28 and NMR29 measurements

Grain boundary diffusion

To investigate grain boundary (GB) diffusion, we first computed the GB energies of two low Σ twist GB configurations—Σ4 [1000] and Σ7 [0001]. Both configurations are fully relaxed using DFT and eSNAP. The eSNAP-calculated GB energies for twist Σ4 [1000] and Σ7 [0001] GBs are 1.41 and 0.85 J m−2, respectively, in good agreement with the DFT values of 1.64 and 0.86 J m−2, respectively.

The lower-energy twist Σ7 [0001] GB is then used in large-scale diffusion studies, as shown in Fig. 5. The simulation box (Fig. 5a) contains 5040 atoms in total. Because of the periodic boundary conditions, two GBs separated by 10 lattice vectors c are present in the box. NVT MD simulations were carried out at 300 K, with thermalization lasting for 30 ps followed by a production simulation of 1 ns. We find that the MSD of Li atoms within the GB plane is much higher than that in the bulk region, as shown in Fig. 5b, c, and that few migration events occur between the GB layer and the bulk layers. From the MSD, we estimate the 2D Li self-diffusivity within the twist GB to be 7.09 × 10−8 cm2/s, about three times the extrapolated total value (2.24 × 10−8 cm2/s in 3D) in the bulk at 300 K. These results indicate that grain boundaries may provide a rapid pathway for Li diffusion in α-Li3N.

Fig. 5

a Constructed simulation box with twist Σ7 [0001] GBs. b Trajectories for selected Li ions in the box with twist GBs in 0.5 ns. Li ions on the left lie in the bulk region, and the ones on the right are close to one of the GBs. c MSD (by components) vs. time plot for Li ions located at the twist GBs only (bulk Li ions are excluded). The z direction is perpendicular to the GBs. Diffusivity is computed within the 2D GB plane

Discussion

In this work, we demonstrate that modern potentials based on local environment descriptors such as the SNAP can be adapted for ionic systems by incorporating long-range electrostatics.

The introduction of γ as a hyperparameter offers more flexibility to the potential model in order to achieve higher predictive power. Physically, γ can be interpreted as the inverse of the dielectric constant. Indeed, the optimized value of γ is 0.057, which implies an effective dielectric constant of 17.5, reasonably close to the experimental dielectric constant of 14 for α-Li3N.30 We note that while the experimentally measured dielectric constant could have been provided as an input to model development, the goal of this effort is to develop a general approach to training eSNAP models for materials, some of which may not have measured dielectric constants. We have also attempted to fit a regular SNAP model for Li3N without the use of electrostatic interactions, but using a larger cutoff radius of 8 Å to allow the model to learn screened electrostatic interactions. The resulting SNAP model has significantly higher MAEs in energies and forces of 2.3 meV/atom and 0.15 eV Å−1, respectively.
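The arithmetic behind the dielectric interpretation can be checked in a few lines. This is a minimal sketch that assumes the simple reading γ = 1/ε stated above; the variable names are illustrative only.

```python
gamma = 0.057          # optimized screening prefactor reported in the text
eps_exp = 14.0         # experimental dielectric constant quoted in the text

eps_eff = 1.0 / gamma  # effective dielectric constant under the assumed gamma = 1/epsilon reading
rel_diff = (eps_eff - eps_exp) / eps_exp

print(f"effective dielectric constant: {eps_eff:.1f}")
print(f"relative difference from experiment: {rel_diff:.0%}")
```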

Unlike earlier works where the sample weights are treated as hyperparameters optimized toward structural properties (lattice constants, elastic constants, etc.),13,14 we used fixed sample weights in linear regression as the different scales between energies and forces are unified by using standardized z-scores as targets. Sample weight assignment then effectively becomes an exercise in assigning importance of matching various computed properties from DFT. Note that reproducing energetic calculations where atoms are relaxed remains a challenge, as eSNAP could not distinguish the difference of defect formation energies in different Frenkel defect configurations.

It should be noted that the focus of the current eSNAP model is on reproducing the energies and forces on solid-phase α-Li3N for the purposes of scaling MD simulations beyond the limited simulation cells and time scales in AIMD for diffusion studies. As such, the training structures were selected mainly for this purpose and no attempt was made to include a broad diversity of training structures from different polymorphs of Li3N, liquid configurations, etc. in the training pool.

Our choice of the SNAP approach is motivated by its simple linear form and its efficient implementation in the widely available open-source LAMMPS Molecular Dynamics Simulator.31 Though the MAEs of linear SNAP may not be as low as those achieved using other regression models and descriptors,8,32 its efficiency and low training data requirements are the decisive factors for our choice. In terms of scaling performance, we tested the running time of a 1000-step MD simulation with various system sizes (500–500,000 atoms). Despite the O(N^{3/2}) time complexity of Ewald summation, eSNAP generally shows linear scaling performance (Fig. S4), presumably because the running time is dominated by the time-consuming bispectrum coefficient calculations.

Finally, we applied the eSNAP model to conduct long-time-scale (~1 ns) simulations of complex models (500–5000 atoms) of α-Li3N. We report the Haven ratio of α-Li3N by directly calculating charge diffusivity and show that grain boundaries may provide faster diffusion pathways (relative to bulk). The calculation of charge diffusivity, which is difficult to converge in AIMD simulations, enables us to compute much more reliable estimates of the anisotropic diffusivities of α-Li3N. Interestingly, though we find that conductivity along the crystallographic c direction is in general lower than in the ab plane, the value is only one order of magnitude lower, contrary to single crystal measurements.28 Li et al.19 have recently grown pinhole-free Li3N nanofilms as a protective layer on Li metal anodes by flowing nitrogen gas. A critical design requirement is that the conductivity of Li in the [001] direction is sufficiently high. Li et al.19 measured conductivities of up to 0.5 mS/cm, which is in good agreement with our predictions and in disagreement with prior experiments and simulations with the Coulomb–Buckingham potential. It should be emphasized that the conductivity of ~0.01 mS/cm in the c direction reported in previous work28 would lead to a highly resistive, low-performing coating. We hope that further careful experiments in the near future may shed further light on these discrepancies in anisotropic diffusivities between different experiments and computational simulations on this highly important lithium conductor.

Methods

Electrostatic SNAP (eSNAP) model

The atomic environment around atom i at coordinates r can be described by its atomic neighbor density ρi(r) with the following equation:7,9

$$\rho _i({\mathbf{r}}) = \delta ({\mathbf{r}}) + \sum\limits_{r_{ii^{\prime}} < R_{ii^{\prime}}} {f_c} (r_{ii^{\prime}})w_{i^{\prime}}\delta ({\mathbf{r}} - {\mathbf{r}}_{{\mathbf{ii}}^{\prime}}),$$
(3)

where rii′ is the vector joining the coordinates of central atom i and its neighbor atom i′, the cutoff function fc ensures that the neighbor atomic density decays smoothly to zero at the cutoff radius Rii′, and the dimensionless neighbor weights wi′ distinguish atoms of different types. This density function can be expanded as a generalized Fourier series in the 4D hyper-spherical harmonics \(U_{m,m\prime }^j(\theta ,\phi ,\theta _0)\) as follows:

$$\rho _i({\mathbf{r}}) = \mathop {\sum}\limits_{j = 0,\frac{1}{2},...}^\infty {\mathop {\sum}\limits_{m = - j}^j {\mathop {\sum}\limits_{m^{\prime} = - j}^j {u_{m,m^{\prime}}^j} } } U_{m,m^{\prime}}^j(\theta ,\phi ,\theta _0),$$
(4)

where the coefficients \(u_{m,m^{\prime}}^j\) are given by the inner product \(\langle U_{m,m^{\prime}}^j|\rho \rangle\). The bispectrum coefficients are then given as:

$$B_{j_1,j_2,j} = \mathop {\sum }\limits_{m_1,m_1^\prime = - j_1}^{j_1} \mathop {\sum }\limits_{m_2,m_2^\prime = - j_2}^{j_2} \mathop {\sum }\limits_{m,m^\prime = - j}^j \left( {u_{m,m^{\prime}}^j} \right)^ \ast H\begin{array}{*{20}{c}} {jmm^{\prime}} \\ {j_1m_1m_1^{\prime}} \\ {j_2m_2m_2^{\prime}} \end{array}u_{m_1,m_1^\prime }^{j_1}u_{m_2,m_2^\prime }^{j_2},$$
(5)

where the constants \(H\begin{array}{*{20}{c}} {jmm^{\prime}} \\ {j_1m_1m_1^\prime} \\ {j_2m_2m_2^\prime} \end{array}\) are coupling coefficients.

In the original formulation of the non-ionic SNAP model,9 the energy and forces are expressed as a linear function of the bispectrum coefficients, as follows:

$$E_{{\mathrm{SNAP}}} = \mathop {\sum}\limits_\alpha {\left( {\beta _{\alpha ,0}N_\alpha + \mathop {\sum}\limits_{k = \{ j_1,j_2,j\} } {\beta _{\alpha ,k}} \mathop {\sum}\limits_{i = 1}^{N_\alpha } {B_{k,i}} } \right)}$$
(6)
$${\mathbf{F}}_{j,{\mathrm{SNAP}}} = - \mathop {\sum}\limits_\alpha {\mathop {\sum}\limits_{k = \{ j_1,j_2,j\} } {\beta _{\alpha ,k}} } \mathop {\sum}\limits_{i = 1}^{N_\alpha } {\frac{{\partial B_{k,i}}}{{\partial {\mathbf{r}}_j}}} .$$
(7)

where α is the chemical identity of atoms, Nα is the total number of α atoms in the system, and βα,k are the coefficients in the linear SNAP model for type α atoms.
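To make the linear form of Eq. (6) concrete, the following minimal sketch evaluates the SNAP energy from per-atom bispectrum components grouped by atom type. In practice the bispectrum components come from LAMMPS; here the arrays, coefficients, and the function name `snap_energy` are hypothetical toy values chosen only to illustrate the bookkeeping.

```python
import numpy as np


def snap_energy(bispectrum_by_type, beta0_by_type, beta_by_type):
    """Evaluate Eq. (6): sum over types of beta_0 * N_alpha + beta . sum_i B_i."""
    energy = 0.0
    for alpha, b in bispectrum_by_type.items():
        b = np.asarray(b, dtype=float)      # shape (N_alpha, n_components)
        n_alpha = b.shape[0]
        energy += beta0_by_type[alpha] * n_alpha
        energy += float(beta_by_type[alpha] @ b.sum(axis=0))
    return energy


# Toy inputs (hypothetical): 3 Li atoms and 1 N atom, 2 bispectrum components each
bispectrum = {
    "Li": np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]),
    "N": np.array([[0.5, 1.5]]),
}
beta0 = {"Li": -1.0, "N": -2.0}
beta = {"Li": np.array([0.1, 0.2]), "N": np.array([1.0, -1.0])}

e_snap = snap_energy(bispectrum, beta0, beta)
print(e_snap)
```

Because the model is linear in the coefficients, the same summed bispectrum components serve directly as columns of the design matrix in a least-squares fit.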

For ionic systems, electrostatic interactions spanning the entire range of interatomic distances are indispensable in the construction of the energy model because of their long-range tail beyond the cutoff distance used for the local environment description (see Fig. 6). In our proposed electrostatic SNAP (eSNAP) model, we write the total potential energy as the sum of the electrostatic contributions and the local energy (SNAP) due to the variations in atomic local environments, as follows:

$$E_{\mathrm{p}} = \gamma E_{{\mathrm{el}}} + E_{{\mathrm{SNAP}}}$$
(8)
$${\mathbf{F}}_j = - \nabla _jE_{\mathrm{p}} = - \gamma \nabla _jE_{{\mathrm{el}}} + {\mathbf{F}}_{j,{\mathrm{SNAP}}}$$
(9)

where Eel is the electrostatic energy computed using the Ewald summation approach20 and γ is an effective screening prefactor for electrostatic interactions. The coefficients (γ and β) can be solved by fitting the linear model to total energies and forces from DFT calculations.

Fig. 6

Schematic of energy contributions vs. interatomic distances in ionic systems. Rii denotes the cutoff radius in considering contributions from local environment

In addition, nuclear repulsions emerge at extremely short interatomic distances. In this work, the Ziegler-Biersack-Littmark (ZBL) potential is used to account for short-ranged nuclear repulsions.33 To ensure that the fitting process captures the relevant relationship between the bispectrum coefficients and the DFT energies and forces, the cutoff distances of ZBL were chosen to be short enough (Ri = 1.0 Å, Ro = 1.5 Å) that the ZBL potential makes a negligible contribution to energies or forces among the initial training configurations, where extremely close interatomic distances were not sampled. More details about the ZBL settings used in this work can be found in the Supplementary Information.

Training data generation

Figure 1a shows the hexagonal P6/mmm unit cell of α-Li3N, where Li2 sites form Li2N layers with N sites in the ab plane and Li1 sites connect N sites in neighboring Li2N layers along the c axis. To sample a diverse set of configurations, the initial training set includes two major components:

  1. Starting from the relaxed α-Li3N unit cell, we first generated two series of unit cells with lattice distortions. One series samples different lattice constants a and c, and the other samples unit cells with different levels of strains (−1% to 1% at 0.2% intervals) applied in six different modes as described in de Jong et al.34

  2. Snapshots were extracted from AIMD simulations at temperatures from 400 to 1200 K at 200 K intervals under an NVT ensemble. Starting from a 3 × 3 × 3 supercell with equilibrium volume, for each temperature, 200 snapshots were taken from a 40 ps AIMD simulation.
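The strained-cell series in the first item can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the six modes are assumed here to be the six independent strain components in Voigt order (the exact mode definitions are those of de Jong et al.34), the zero-strain point is skipped, and the hexagonal lattice constants are placeholders.

```python
import numpy as np

# -1% to +1% in 0.2% steps gives 11 strain magnitudes, including zero
strain_levels = np.round(np.linspace(-0.01, 0.01, 11), 4)

# Assumption: one mode per independent strain component (xx, yy, zz, yz, xz, xy)
VOIGT_INDICES = [(0, 0), (1, 1), (2, 2), (1, 2), (0, 2), (0, 1)]


def strain_tensor(mode, magnitude):
    """Symmetric 3x3 strain tensor with a single nonzero Voigt component."""
    eps = np.zeros((3, 3))
    i, j = VOIGT_INDICES[mode]
    eps[i, j] = eps[j, i] = magnitude
    return eps


def deform_lattice(lattice, eps):
    """Apply (I + eps) to a lattice whose rows are the lattice vectors."""
    return lattice @ (np.eye(3) + eps)


# Placeholder hexagonal cell; a and c are hypothetical values in angstrom
a, c = 3.65, 3.88
lattice = np.array([
    [a, 0.0, 0.0],
    [-a / 2.0, a * np.sqrt(3.0) / 2.0, 0.0],
    [0.0, 0.0, c],
])

deformed = [
    deform_lattice(lattice, strain_tensor(mode, s))
    for mode in range(6)
    for s in strain_levels
    if s != 0.0
]
print(len(deformed))
```

Each deformed lattice would then be paired with the fractional atomic coordinates of the relaxed cell and passed to a static DFT calculation.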

To ensure accurate energies and forces, static DFT calculations were performed on all configurations (including snapshots from AIMD).

DFT calculations

All DFT calculations were performed using the Vienna Ab initio Simulation Package (VASP)35 within the projector augmented wave approach.36 The Perdew-Burke-Ernzerhof (PBE) generalized gradient approximation was adopted as the exchange-correlation functional.37 To ensure the convergence of energy and atomic force, a plane-wave energy cutoff of 520 eV and Γ-centered k-point meshes with a density of at least 30 Å were employed for all static DFT calculations. For AIMD simulations, a single Γ k-point and a much lower-energy cutoff of 300 eV were used for rapid propagation of trajectories.

Model training and test

Table 4 shows the weights applied on the different sets of training configurations during model training. As the initial training dataset contains many more configurations from AIMD snapshots, each with a larger number of atoms, a much larger weight was applied on the energies of the distorted unit cells relative to those from the AIMD snapshots. A zero weight was applied on the negligibly small forces for the distorted unit cells.

Table 4 Data distribution and applied weights on different types of data points in the initial training dataset

As shown in Fig. 7a, the energies and forces differ greatly in magnitude and distribution due to differences in the scales and units. In the original SNAP training approach, the effect of this difference in magnitude and distribution is partially accounted for by treating the data weights as hyperparameters to be optimized.13,14 In this work, we use the standardized z-scores of energies and forces (plotted in Fig. 7b) as the targets in model training to avoid incorporating the effect of the distribution in the data weights, which are therefore fixed at the values in Table 4. The “standardized” eSNAP model in the fitting process is then given by the following:

$$\left[ {\begin{array}{*{20}{c}} {\frac{{e - \bar e}}{{\sigma _e}}} \\ \vdots \end{array}} \right] = \frac{1}{{N\sigma _e}}\left[ {\begin{array}{*{20}{c}} {E_{{\mathrm{el}}}} & {N_\alpha } & {\mathop {\sum}\limits_{i = 1}^{N_\alpha } {B_{1,i}} } & \ldots & {\mathop {\sum}\limits_{i = 1}^{N_\alpha } {B_{k,i}} } & \ldots \\ \vdots & \vdots & \vdots & \ldots & \vdots & \ldots \end{array}} \right]{\boldsymbol{\beta }}^{\mathrm{T}},$$
(10)
$$\left[ {\begin{array}{*{20}{l}} {\frac{{{\mathbf{F}}_j}}{{\sigma _F}}} \hfill \\ \vdots \hfill \end{array}} \right] = \frac{1}{{\sigma _F}}\left[ {\begin{array}{*{20}{c}} { - \frac{{\partial E_{{\mathrm{el}}}}}{{\partial {\mathbf{r}}_j}}} & 0 & { - \mathop {\sum}\limits_{i = 1}^{N_\alpha } {\frac{{\partial B_{1,i}}}{{\partial {\mathbf{r}}_j}}} } & \ldots & { - \mathop {\sum}\limits_{i = 1}^{N_\alpha } {\frac{{\partial B_{k,i}}}{{\partial {\mathbf{r}}_j}}} } & \ldots \\ \vdots & \vdots & \vdots & \ldots & \vdots & \ldots \end{array}} \right]{\boldsymbol{\beta }}^{\mathrm{T}},$$
(11)

where e is the energy per atom, \(\bar e\) is the mean of e, and σe and σF are the standard deviations of e and F, respectively. The mean of forces is omitted since it is close to zero. The coefficient vector βT to be solved can be written as:

$${\boldsymbol{\beta }}^{\mathrm{T}} = \left[ {\begin{array}{*{20}{l}} \gamma \hfill & {\beta _{\alpha ,0} - \bar e} \hfill & {\beta _{\alpha ,1}} \hfill & \ldots \hfill & {\beta _{\alpha ,k}} \hfill & \ldots \hfill \end{array}} \right]^{\mathrm{T}}.$$
(12)
Fig. 7

Distribution of a original atomic energies and forces and b normalized z-score of atomic energies and forces
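The standardized fit in Eqs. (10)–(12) amounts to a weighted linear least-squares problem. The sketch below is one way to set it up with NumPy, assuming a single atom type so that the constant-term column is unique; the design matrices, weights, and coefficient values are synthetic placeholders (the actual weights are those in Table 4), and `fit_standardized` is a hypothetical helper name.

```python
import numpy as np


def fit_standardized(x_e, e, x_f, f, w_e, w_f, const_cols):
    """Weighted least squares on z-scored energies and scaled forces.

    x_e: energy design rows, already divided by the number of atoms N.
    x_f: force design rows (zero in the constant-term columns).
    const_cols: indices of the N_alpha / N columns.
    """
    e_mean, sig_e, sig_f = e.mean(), e.std(), f.std()
    # Energy rows follow Eq. (10), force rows follow Eq. (11)
    a = np.vstack([x_e / sig_e, x_f / sig_f])
    y = np.concatenate([(e - e_mean) / sig_e, f / sig_f])
    # Sample weights enter by scaling each row with sqrt(weight)
    sw = np.sqrt(np.concatenate([w_e, w_f]))
    coef, *_ = np.linalg.lstsq(a * sw[:, None], y * sw, rcond=None)
    # Eq. (12): the fitted constant terms are beta_0 - mean(e); shift them back
    coef = coef.copy()
    coef[const_cols] += e_mean
    return coef


# Synthetic, noise-free toy data; columns: [E_el / N, N_alpha / N, B_1 / N, B_2 / N]
rng = np.random.default_rng(0)
true_coef = np.array([0.057, -3.0, 0.2, -0.1])
x_e = rng.normal(size=(20, 4))
x_e[:, 1] = 1.0
x_f = rng.normal(size=(60, 4))
x_f[:, 1] = 0.0
e, f = x_e @ true_coef, x_f @ true_coef

coef = fit_standardized(x_e, e, x_f, f,
                        np.full(20, 100.0), np.full(60, 1.0), const_cols=[1])
print(coef)
```

With noise-free synthetic data the fit recovers the generating coefficients, which is a convenient sanity check before applying the same setup to real DFT data.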

For bispectrum coefficient calculations, we used the implementation available in LAMMPS.9 The two hyperparameters (cutoff distance Rα and atomic weight wα) for each element (Li and N in the case of Li3N) were determined using a two-step grid search, first over the atomic weights and then over the cutoff distances. The MAE of forces from a linear model trained on the initial training set was chosen as the metric. For the atomic weights, it should be noted that the atomic density in ionic systems is generally higher than that in metallic systems; hence the search of atomic weights was performed in the range where |wα| < 1. Similarly, the search space for the cutoff radius was limited to the range where Rα < 4 Å. The results from the grid search (Fig. S5) are available in the Supplementary Information, and the final hyperparameters can be found in Table 1.

Figure 8 shows the flow chart of the iterative procedure used for training the eSNAP model in this work. A preliminary eSNAP model was first trained using the initial training set. Using this fitted eSNAP model, MD simulations were then carried out using a 3 × 3 × 3 supercell at equilibrium volume at temperatures ranging from 300 K to 1200 K at 100 K intervals under an NVT ensemble for 40 ps. Ten snapshots were sampled from each MD simulation to form a new set of test configurations. Static DFT calculations were performed on these test configurations. If the test MAEs for either energies or forces were significantly larger than the corresponding training MAEs, the test set was merged into the training set to form a new extended training set. The entire eSNAP fitting, simulation and testing procedure was repeated until there was no significant over-fitting in either energies or forces. In this work, we use 150% of the training MAE as the threshold to achieve a balance between the benefit gained by adding more training instances and the associated costs of performing more DFT calculations. It should be noted that this strategy is designed to bias the eSNAP model toward improving the predictions of energies and forces in MD simulations, which is the target application of interest in this work.

Fig. 8

Flowchart of iterative procedure for eSNAP model training and test
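The control flow of this iterative procedure can be sketched as a short loop. The four callables (`fit`, `sample_md`, `label_dft`, `mae`) are hypothetical stand-ins for the model fitting, eSNAP MD sampling, static DFT labelling, and error evaluation steps; the toy versions below only mimic a test error that shrinks as the training set grows and do not reflect real data.

```python
def iterative_training(train_set, fit, sample_md, label_dft, mae,
                       threshold=1.5, max_iter=10):
    """Fit, sample, label and test until test MAE <= threshold * training MAE."""
    for iteration in range(1, max_iter + 1):
        model = fit(train_set)
        test_set = label_dft(sample_md(model))
        converged = all(
            mae(model, test_set, key) <= threshold * mae(model, train_set, key)
            for key in ("energy", "force")
        )
        if converged:
            return model, train_set, iteration
        # Otherwise merge the test set into the training set and repeat
        train_set = train_set + test_set
    raise RuntimeError("test and training errors did not converge")


# --- Toy stand-ins (hypothetical) so that the loop can be executed ---
def toy_fit(data):
    return {"n_train": len(data)}


def toy_sample_md(model):
    return [None] * 10  # ten new snapshots per round


def toy_label_dft(snapshots):
    return list(snapshots)


def toy_mae(model, data, key):
    # Toy heuristic: a set whose size matches the training size is the training set
    is_train = len(data) == model["n_train"]
    return 1.0 if is_train else 1.0 + 20.0 / model["n_train"]


model, final_train, n_iter = iterative_training(
    [None] * 25, toy_fit, toy_sample_md, toy_label_dft, toy_mae
)
print(n_iter, len(final_train))
```

The default `threshold=1.5` corresponds to the 150% criterion stated in the text; the `max_iter` guard is an added safety measure and not part of the described procedure.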

Diffusivity calculations

The tracer diffusivity of Li D* is calculated from the MSD of all diffusing Li ions as described by the Einstein relation:

$$D^ \ast = \frac{1}{{2dt}}\frac{1}{N}\mathop {\sum}\limits_{i = 1}^N {\left\langle {\left[ {{\mathrm{\Delta }}{\mathbf{r}}_i(t)} \right]^2} \right\rangle } ,$$
(13)

where d is the number of dimensions in which diffusion occurs, N is the total number of diffusing Li ions, and Δri(t) is the displacement of the ith Li ion at time t.

The charge diffusivity of Li Dσ is calculated from the square net displacement of all diffusing Li ions, as described below:

$$D_\sigma = \frac{1}{{2dt}}\frac{1}{N}\left\langle {\left[ {\mathop {\sum}\limits_{i = 1}^N \Delta {\mathbf{r}}_i(t)} \right]^2} \right\rangle$$
(14)

The Li conductivity at temperature T (unit: K) can be calculated from the charge diffusivity Dσ using the Nernst-Einstein equation:

$$\sigma = \frac{{\rho z^2F^2}}{{RT}}D_\sigma ,$$
(15)

where ρ is the molar density of Li, z is the charge of Li (+1), F is the Faraday constant, and R is the gas constant.
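A unit-aware evaluation of Eq. (15) is easy to get wrong, so a minimal sketch is given here. With ρ in mol/cm3 and Dσ in cm2/s, the result comes out in S/cm. The cell volume and charge diffusivity used in the example are hypothetical placeholders, not values from this work, and the helper names are illustrative.

```python
FARADAY = 96485.33212        # C/mol
GAS_CONSTANT = 8.314462618   # J/(mol K)
AVOGADRO = 6.02214076e23     # 1/mol


def li_molar_density(n_li, cell_volume_A3):
    """Molar density of Li in mol/cm^3 for n_li ions in a cell of given volume (A^3)."""
    return n_li / (cell_volume_A3 * 1e-24 * AVOGADRO)


def conductivity_mS_cm(d_sigma_cm2_s, rho_mol_cm3, temperature_K, z=1):
    """Nernst-Einstein conductivity, Eq. (15), converted from S/cm to mS/cm."""
    sigma_S_cm = (rho_mol_cm3 * z ** 2 * FARADAY ** 2
                  / (GAS_CONSTANT * temperature_K) * d_sigma_cm2_s)
    return 1e3 * sigma_S_cm


# Hypothetical inputs: 3 Li per 45 A^3 cell, D_sigma = 1e-8 cm^2/s at 300 K
rho = li_molar_density(n_li=3, cell_volume_A3=45.0)
sigma = conductivity_mS_cm(1.0e-8, rho, 300.0)
print(f"rho = {rho:.4f} mol/cm^3, sigma = {sigma:.2f} mS/cm")
```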

In addition, the ratio between the tracer and charge diffusivities is referred to as the Haven ratio HR = D*/Dσ.
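Equations (13) and (14) and the Haven ratio can be evaluated from unwrapped Li trajectories as sketched below. This is a minimal illustration assuming displacements are measured from the first frame of each run, averaged over independent runs, and converted to diffusivities through a linear fit against time; the function name and the synthetic trajectory are hypothetical. In the toy example all ions move identically, a fully concerted limit in which the Haven ratio equals 1/N.

```python
import numpy as np


def _slope(y, dt):
    """Slope of a straight-line fit of y against time."""
    t = np.arange(len(y)) * dt
    return np.polyfit(t, y, 1)[0]


def diffusivities(positions, dt, dims=(0, 1, 2)):
    """Return (D_tracer, D_charge) following Eqs. (13) and (14).

    positions: array (n_runs, n_frames, n_atoms, 3) of unwrapped Li coordinates.
    dims: Cartesian components included, e.g. (0, 1) for in-plane diffusion.
    """
    pos = positions[..., list(dims)]
    disp = pos - pos[:, :1]                      # displacement from first frame
    n_atoms, d = disp.shape[2], disp.shape[3]
    # Eq. (13): squared displacement averaged over ions and runs
    msd = (disp ** 2).sum(axis=3).mean(axis=(0, 2))
    # Eq. (14): squared net displacement of all ions, averaged over runs
    net = disp.sum(axis=2)
    net_sq = (net ** 2).sum(axis=2).mean(axis=0) / n_atoms
    return _slope(msd, dt) / (2 * d), _slope(net_sq, dt) / (2 * d)


# Synthetic example: 8 ions sharing one random walk per run (fully concerted)
rng = np.random.default_rng(1)
walk = np.cumsum(rng.normal(size=(50, 100, 1, 3)), axis=1)
positions = np.repeat(walk, 8, axis=2)

d_tracer, d_charge = diffusivities(positions, dt=1.0)
haven_ratio = d_tracer / d_charge
print(haven_ratio)
```

For uncorrelated ion motion the two diffusivities coincide on average and the ratio approaches 1, so values well below 1 indicate correlated, concerted motion.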

All the simulations with the eSNAP were performed using LAMMPS.31 All the structure manipulations and interfacing with VASP and LAMMPS were handled by the Python Materials Genomics (pymatgen) library.38