Introduction

As a powerful modeling technique, ab initio molecular dynamics (AIMD) simulation has been recently applied to a wide range of research topics in chemistry and materials science.1,2,3,4 AIMD simulations are carried out using the potential energy surface and atomistic forces calculated from ab initio methods, such as density functional theory (DFT), and model the dynamics of atomistic systems with ab initio level of accuracy and chemical versatility. This accuracy is lacking in classical molecular dynamics (MD) simulations, which are based on interatomic potentials (a.k.a. force fields). In addition, the limited availability of reliable interatomic potentials for studying new materials greatly limits the applicability and chemical versatility of classical MD simulations. Therefore, AIMD simulation is the method of choice for studying the dynamics of atoms with complex chemistry changes and simulating materials that cannot be described by available interatomic potentials. The wide applicability of AIMD simulations has been successfully demonstrated in studies of diffusional properties,5,6,7,8,9 reaction processes,10,11 vibrational frequency,12,13,14 amorphous materials,15,16,17 phase transition,18 etc.

Mo et al.5 pioneered the application of AIMD simulations to the study of ionic diffusion in lithium ionic conductor materials, such as Li10GeP2S12 (LGPS). Since then, AIMD simulation has been widely adopted as a standard technique for studying fast ionic conductor materials with a wide range of mobile species (e.g., Li+, Na+, Mg2+, O2−).6,7,19,20,21,22,23 Both ab initio and classical MD simulations model the real-time dynamics of atoms with a femtosecond-scale time resolution, which is difficult to directly obtain from experimental characterizations or from other non-MD computational methods. Due to their ab initio level of accuracy and chemical versatility, AIMD simulations provide more accurate quantification of diffusion properties than classical MD and have a unique ability to predict new diffusion mechanisms. AIMD simulation technique has achieved great successes in identifying diffusion mechanism at the atomistic scale and in quantifying the diffusional properties.

Major advantages of AIMD simulations (as well as classical MD simulations) are that the diffusion events, i.e., atom/ion hops, in the materials are directly observed from the dynamics and trajectory of atoms and that specific diffusion mechanism is not presumed as calculation input. In contrast, the widely used nudged-elastic-band (NEB) calculations only obtain the energy barrier of a specific static migration pathway and require the pathway as input. That NEB input pathway is often derived from the guess of the researchers. The prior guess of diffusion mechanism may be difficult for complex materials such as super-ionic conductors, where the mobile-ion sublattice is highly disordered. For example, recent AIMD simulations identified that the diffusion mechanism in lithium super-ionic conductors, such as thio-LISICON9,24 and lithium garnet,9,25,26 is a concerted migration of multiple ions rather than isolated ion hopping. AIMD simulations played unique roles in uncovering and identifying these concerted migration mechanisms, whereas isolated ion hopping was often assumed for previous NEB calculations. In addition, AIMD simulations sample the statistical contributions of all diffusional events and obtain diffusional properties such as diffusivity and ionic conductivity from the collective contributions of all different diffusion modes. In addition, performing AIMD simulations at different temperatures obtains the Arrhenius relation of diffusivity and ionic conductivity, including the pre-factor and overall activation energy, which is similar to experimental measurements.

However, the high computational expenses of ab initio calculations limit the accessible length scale and timescale of AIMD simulations. The small system size and short physical time scale are major shortcomings of AIMD simulations when compared to classical MD simulations.27,28 Even on state-of-the-art supercomputing facilities, typical AIMD simulations are limited to a system size of a few hundred atoms and a total physical time duration of tens to thousands of picoseconds. As a result, a limited number of diffusion events, i.e., ion hops, are observed during an AIMD simulation. The limited sampling of diffusion events may lead to significant statistical variances in calculated diffusional properties or even, if only a few ion hops are sampled, to erroneous values.

As AIMD simulation has become a widely used technique in studying diffusion mechanisms and in quantifying diffusional properties of materials, the procedures for applying AIMD simulations to extract diffusional properties need to be systematically established. First, the fitting procedure of AIMD simulation results requires a number of steps and parameters. Currently, every study chooses their own tested parameters, but the choice of these parameters has not been discussed. As we show in this study, the choice may significantly impact obtained diffusional properties. Second, most previous studies correlated the error bar and statistical uncertainty of diffusivity to the goodness of linear fitting to the Einstein relation. However, as we show in this study, the goodness of linear fitting to the Einstein relation does not necessarily capture the true statistical variance of fitted diffusivity. Instead, the statistical variance and uncertainty of fitted diffusivity should be a direct consequence of the limited observation of diffusion events (i.e., ion hops) during the MD simulation. In the literature, several studies estimated the errors in the fitted diffusivity from single-particle tracking,29,30,31,32 classical MD simulations33 and kinetic Monte Carlo simulations.34 However, these studies, which are developed and tested for systems and measurements with a significantly larger number of diffusion events than typical AIMD simulations, are not ideal for analyzing errors from AIMD simulations, where the total number of diffusion events are small. Third, a significant number of diffusion events should be captured to quantify the diffusion with statistical validity. However, in many studies, the AIMD simulations only capture and sample ion diffusion from a few ion hops with a total mean-squared displacement (MSD) of a few Angstrom.2 The diffusion properties from such poor sampling of ion hops may have low statistical significance. The limitation of AIMD simulations should be quantified and charted out. In summary, the following three problems of using AIMD simulations to study diffusional properties need to be addressed:

(1) How shall one properly extract diffusional properties from the dynamics of atoms with minimal error?

(2) How shall one quantify the statistical variances of diffusional properties extracted from AIMD simulations?

(3) What are the accessible ranges of materials properties and physical conditions for typical AIMD simulations in studying diffusion?

The aim of this paper is to resolve the aforementioned problems and to establish a systematic procedure for quantifying diffusional properties and their statistical variances from AIMD simulations. In the following sections, we examine and establish the proper procedure to extract diffusivity D from the atomic trajectory of AIMD simulations. We illustrate the origin and the functional dependence of the statistical variance of the diffusivity D. We estimate the proper temperature range for AIMD simulations to obtain D with reasonable accuracy. We estimate the errors of activation energy and ionic conductivity from the Arrhenius relation for typical super-ionic conductor materials.

Results

Quantifying diffusivity, ionic conductivity, and activation energy from AIMD simulations

The diffusional properties are calculated from the trajectory of ions (or atoms) \({\boldsymbol{r}}_i(t)\) from AIMD simulations. The displacement \(\Delta {\boldsymbol{r}}_i\) of ion i from time t1 to t2 can be calculated as

$$\Delta {\boldsymbol{r}}_i\left( {\Delta t} \right) = {\boldsymbol{r}}_i\left( {t_2} \right) - {\boldsymbol{r}}_i\left( {t_1} \right),\,{\mathrm{where}}\,\Delta t = t_2 - t_1.$$
(1)

The total squared displacement sums up the squared displacement of all mobile ions, \(\mathop {\sum}\nolimits_{i = 1}^N {\left( {\left| {\Delta r_i\left( {\Delta t} \right)} \right|^2} \right)}\), and describes the movement of all N mobile ions over a time interval Δt. During AIMD simulations over a total time duration ttot, there are many (NΔt) time intervals with the same duration Δtt<ttot) but with different starting time t. Since the displacement of ions over Δt reflects the mobility of ions, the total mean squared displacement (TMSD) for all diffusional ions over time interval Δt is calculated as:

$${\mathrm {TMSD}}\left( {\Delta t} \right) = \mathop {\sum}\nolimits_{i = 1}^N {\left\langle {\left| {{\boldsymbol{r}}_i\left( {\Delta t} \right) - {\boldsymbol{r}}_i(0)} \right|^2} \right\rangle } = \mathop {\sum}\nolimits_{i = 1}^N {\frac{1}{{N_{\Delta t}}}} \mathop {\sum}\nolimits_{t = 0}^{t_{tot} - \Delta t} {\left| {{\boldsymbol{r}}_i\left( {t + \Delta t} \right) - {\boldsymbol{r}}_i(t)} \right|^2},$$
(2)

by averaging over a total of NΔt time intervals with the same duration Δt. This averaging over different NΔt time intervals provides essential ensemble sampling to obtain more accurate diffusional properties. To estimate the diffusivity of the mobile-ion species, the MSD over time interval Δt is calculated as the TMSD per mobile ion:

$${\mathrm {MSD}}\left( {\Delta t} \right) = \frac{1}{N}{\mathrm {TMSD}}\left( {\Delta t} \right).$$
(3)

N is the number of ions that are assumed to be the mobile carriers contributing to diffusion.

In general, the dependence of MSD over time interval Δt follows a linear relationship if a large amount of diffusional displacement is captured during the MD simulation. The diffusivity of these ions is calculated as the slope of the MSD over time interval Δt according to the Einstein relation:

$$D = \frac{{{\mathrm {MSD}}\left( {\Delta t} \right)}}{{2d\Delta t}} + D_{{\mathrm{offset}}},$$
(4)

where d = 3 is the dimension of the system, and the offset Doffset of this linear dependence is discussed in later sections. This calculated diffusivity D is the tracer diffusivity of the mobile-ion species and is an intrinsic property of the material at the given condition.

From the diffusivity D, the ionic conductivity is calculated based on the Nernst–Einstein relation:

$$\sigma = \frac{{Nq^2}}{{VkT}}D,$$
(5)

where V is the total volume of the model system, q is the charge of the mobile-ion species, T is the temperature, and k is the Boltzmann constant. By combining Eqs. (35), the ionic conductivity is directly determined by TMSD(Δt)/Δt:

$$\sigma = \frac{{q^2}}{{VkT}}\frac{{{\mathrm {TMSD}}\left( {\Delta t} \right)}}{{2d\Delta t}}.$$
(6)

The ionic conductivity and TMSD calculated in Eq. (6) are independent of the specific choice of the diffusion carrier, such as vacancy or interstitial. In comparison, the diffusivity D can be calculated for specific carriers, e.g., Li+ or vacancy, by normalizing the specific number of carriers in Eq. (3). The choice and counting of mobile carriers do not affect the calculated ionic conductivity from Eq. (6). Therefore, the use of conductivity and TMSD is more straightforward in describing the overall diffusion that occurred during the AIMD simulations.

In addition, the Nernst–Einstein relation (Eq. (5)) assumes dilute, non-interacting mobile ions in the materials systems. However, recent AIMD simulation studies have shown the strong correlation of the ionic diffusion in fast ionic conductors.9,24,25,26,35,36,37 The Haven ratio is often used to describe the correlation factor, and can be quantified by the ratio of the jump diffusion coefficient DJ against the tracer diffusivity D calculated from Eq. (4).38,39,40 The jump diffusion coefficient DJ, which describes the collective migration of all ions, is estimated from the MSD of the mass center of all mobile ions. However, our analysis found that tracking the displacement of the single center point exhibits higher statistical variance. In this study, the scheme for estimating statistical variance is developed for tracer diffusivity D in Eqs. (5, 6), but a similar scheme can be applied and developed for analyzing the variance of DJ.

As in the experiments,41,42,43,44 AIMD simulations can be performed at multiple temperatures to obtain the Arrhenius relation of D as a function of T:

$$D = D_0{\mathrm{exp}}\left( { - \frac{{E_{\mathrm a}}}{{kT}}} \right).$$
(7)

The activation energy Ea of ion diffusion can be obtained through fitting the data of log D vs. 1/T to the Arrhenius relationship (Eq. (7)). The fitted Arrhenius relationship can be used to extrapolate the diffusivity D and conductivity σ to other temperatures. It should be noted that, by extrapolating this Arrhenius relation to other temperatures, the identical diffusion mechanism is assumed at those extrapolated temperatures.

Regions of MSD–Δt dependence

Figure 1 shows a typical MSD–Δt curve from AIMD simulations of the Li-ion superionic conductor Li1.33Ti1.67Al0.33(PO4)3 (LATP). The linear MSD–Δt dependence as described in the Einstein relationship (Eq. (4)) only holds within a certain range of time intervals Δt, and a notable fraction of this dependence is not linear. The MSD–Δt curve at short time interval Δt< 0.1 ps follows \({\mathrm {MSD}} \propto \Delta t^{1.42}\), which is consistent with the local harmonic vibration motion model as shown in the Supplementary Note 1 and Supplementary Figure 1. This portion of the MSD–Δt curve is named the ballistic region, corresponding to the ballistic and vibrational motion of Li ions around their local equilibrium sites rather than Li-ion hopping to new sites.33,45 More details of this ballistic region are discussed in the next section.

Fig. 1
figure 1

MSD of Li+ as a function of Δt in LATP from an AIMD simulation over 200 ps at 1200 K. The linear \({\mathrm {MSD}} \propto \Delta t\) dependence (red dash line) corresponds to diffusional displacement. The \({\mathrm {MSD}} \propto \Delta t^{1.42}\) dependence (blue dash line) shows the behavior for the local vibrational motion of ions. Inset: The MSD–Δt curve at <0.1 ps in linear-scale axes

In Δt range of 10–100 ps, the MSD–Δt curve follows \({\mathrm {MSD}} \propto \Delta t\), the linear Einstein relationship for diffusional displacement. However, when Δt reaches a large fraction of ttot (>100 ps in the case of Fig. 1), the MSDΔt curve shows notable variance and deviation from the linear relationship. This deviation is a result of poor statistics for large Δt values, as values that are comparable to ttot result in averaging a smaller number of time intervals NΔt in Eq. (2) than would be averaged for small Δt values that are a small fraction of ttot. The effect of this deviation is quantified in the next section.

Given that the linear MSD–Δt dependence only holds at certain Δt range, one should not utilize the entire MSD–Δt curve to fit the Einstein relation (Eq. (4)) and to deduce the diffusivity. The ballistic region at short times intervals and the poor statistical region at large time intervals should be excluded from fitting diffusivity. The linear fitting of MSD–Δt curve should be performed in a range [Δt low , Δt up ], where Δt low and Δt up are the lower and upper bound, respectively. In last sections, we establish the procedures to determine the bounds Δt low and Δt up for the linear fitting of MSD–Δt in order to minimize the errors caused by improper fitting to the Einstein relation.

Lower bound of linear diffusion region

In Fig. 2, the local derivative dMSD/dΔt calculated using a finite difference method shows the transition from the ballistic region to the linear region and is used to determine the lower fitting bound Δtlow. The derivative dMSD/dΔt has a large value at small Δt values (<0.2 ps), and decreases significantly as Δt increases. The MSD–Δt curve becomes linear at Δt ~ 1 ps, and the value of dMSD/dΔt reaches a plateau value of ~3 Å2/ps. In the case of Fig. 2, the cut-off of the ballistic region (black dotted line in Fig. 2) is at an MSD value of ~5 Å2. An MSD of 5 Å2 is approximately 0.5a2, where a is the distance between two neighboring Li sites and is 3.2 Å in LATP, which is typical of Li ionic conductors. The ballistic region has the MSD cut-off at a fraction of a2 because the local vibration of Li ions on their equilibrium sites is confined within the potential well between two nearby sites. This local vibrational displacement at small Δt< Δtlow does not represent the ionic diffusion from site to site, and should be excluded from the linear fitting for D.

Fig. 2
figure 2

MSD (black) of Li+ and dMSD/dΔt (red) as a function of Δt from the AIMD simulation of LATP in Fig. 1. The black dotted line illustrates the cut-off of the ballistic region

One may identify the specific range of the ballistic region and the precise lower fitting bound Δtlow for each individual AIMD simulation of different materials systems at different temperatures using the same procedure shown in Figs. 1 and 2. The cut-off values of Δt (i.e., Δtlow) and MSD for the ballistic region are dependent on the temperatures and the materials, which yield different potential energy surfaces near the equilibrium sites. In general, the displacement of local vibration is within a fraction of site distance a, so the ballistic region corresponds to the MSD ranging from 0 to a fraction of a2 (Supplementary Note 1 and Supplementary Figure 1). Therefore, we propose to quantify the cut-off values of Δt, i.e., Δtlow, by defining the cut-off in MSD using the value of a fraction of a2 (e.g., 0.5 a2). As a safer measure for lower fitting bound (i.e., Δtlow), the entire region with MSD less than this cut-off based on a2 may be excluded.

The linear fitting to Einstein relation should only be performed on the linear region corresponding to diffusional displacement. Otherwise, a part of ballistic motion would be included into the fitting for D, leading to an over-estimation of D. This over-estimation of D would be pronounced for the MSD–Δt curves with small values of the maximum MSD (~ a few Å2), due to a significant fraction of ballistic motion mixed into the MSD–Δt curve. Therefore, AIMD simulation should be long enough to have the MSD per mobile ion larger than a few a2, so that the ballistic region (and Δtlow) can be distinguished and separated from the linear fitting for D.

Upper bound of linear diffusion region

The upper fitting bound Δtup is determined by the transition from the linear diffusion region to the region at large Δt with large variance and deviation. Ten different MSD–Δt curves from AIMD simulations of the same LATP structure model at 1200 K over 50 ps (Fig. 3a) were obtained by dividing a total AIMD simulation over 500 ps into ten non-overlapping parts. The significant deviations from the linear dependence of these ten MSD–Δt curves are typically observed at large values of Δt, i.e., >25 ps in Fig. 3 or 50% of ttot. At large Δt, a smaller number of time intervals NΔt is averaged in Eq. (2) and many of these Δt intervals overlap other intervals that contain physically identical trajectories of ions, thus leading to larger deviation from the linear MSD–Δt relation. Therefore, the linear fitting for D should be performed below the upper bound Δtup of the linear diffusion region.

Fig. 3
figure 3

Variances of MSD–Δt curves. a MSD–Δt curves from ten different AIMD simulations of the same LATP structure model over 50 ps at 1200 K. Each curve represents an independent AIMD simulation over 50 ps (ttot = 50 ps). b Goodness of linear fit R2 of MSD–Δt curves using different upper fitting bound. The values and error bars of R2 are the average and the standard deviation, respectively, from ten AIMD simulations

To determine the upper fitting bound Δtup, we fitted these MSD–Δt curves to the Einstein relation (Eq. (4)) with different Δtup values ranging from 10% to 100% of ttot. For the fitting of each curve, the value of R2 in the linear regression was calculated to evaluate the goodness of fitting. The average and standard deviation of R2 values over these ten curves are shown in Fig. 3b. At Δtup ≤ 0.3ttot, all values of R2 are very close to 1, showing good linearity of MSD–Δt curves at small Δt. At Δtup > 0.7ttot, R2 values decrease significantly from 1, indicating poor linearity of MSD–Δt curves at large Δt, and the standard deviation of R2 also increases. Therefore, one should only fit the linear region of MSD–Δt below an upper fitting bound of Δtup < 0.7ttot. By performing the same test on other fast ion conductor materials, such as LGPS and Li7La3Zr2O12 (LLZO) at a few different temperatures, we found this Δtup < 0.7ttot to be generally applicable (Supplementary Figure 3). As the optimal Δtup values may depend on the material (the mobile ions) and temperatures, for unique systems one may use this scheme to determine the specific Δtup values.

Within properly determined lower and upper fitting bounds, the goodness of fitting to the MSD–Δt curve would always be good. Therefore, the goodness of fitting itself does not reflect the statistical variance in the fitted D, the slope of the MSD–Δt curve, and is different among different AIMD simulations for the same materials model (Fig. 3a). The changes in the slopes of MSD–Δt curves below Δtup reflect the statistical variances in the fitted diffusivity D from different runs of AIMD simulations, which are quantified and analyzed in the next section.

Statistical variance of diffusivity and conductivity

In Fig. 3a, the diffusivities, i.e., the slopes of different MSD–Δt curves, exhibit significant variance among different AIMD simulations of the exact same material model. The variance among the fitted diffusivities D are a result of the stochastic nature of the diffusion process, which causes different numbers of ion hops during AIMD simulations with different initial conditions.

In this section, we quantify the statistical variance of the fitted diffusivity D from AIMD simulations. The statistics were calculated based on a set of MD simulations that were created by dividing a long MD simulation into several non-overlapping shorter MD simulations, each of which was treated as an individual MD simulation. Following our established fitting procedure, the value of D was extracted from each MD simulation. The standard deviation of D, s D , was calculated from the set of fitted D values (Supplementary Figure 2). The relative standard deviation (RSD) of D was calculated as s D /Dtrue, where the true value of the diffusivity Dtrue is calculated from the longest available MD simulation. Since more ion hops improve the sampling of the diffusional property, we found that the RSD of D decreases with the total effective ion hops Neff as

$$\frac{{S_D}}{{D_{{\mathrm{true}}}}} = \frac{A}{{\sqrt {N_{{\mathrm{eff}}}} }} + B.$$
(8)

Neff is calculated as

$$N_{{\mathrm{eff}}} = \frac{{\max _{\Delta t}\left[ {TMSD\left( {\Delta t} \right)} \right]}}{{a^2}},$$
(9)

where \({\max_{\Delta t}\left[ {{\mathrm{TMSD}}\left( {\Delta t} \right)} \right]},\) is the maximum value of TMSD over the entire MSD–∆t curve. Neff can be considered as the effective number of ion hops that contributed to the TMSD of all mobile ions in the entire duration of the MD simulation. The value of a is 2.4, 3.2, 2.8, 3.4 Å for LLZO, LATP, LGPS, RbAg4I5, respectively, as averaged distance between neighboring mobile-ion sites. For all materials in Fig. 4, the values of A and B are fitted as 3.43 and 0.04, respectively. As shown in Fig. 4, Eq. (8) and its parameters are general for ionic conductor materials and also hold for a classical MD simulation of LLZO over much longer time duration with a large number of ion hops (Fig. 4b).

Fig. 4
figure 4

Relative standard deviation (RSD) of D, s D /Dtrue, as a function of total effective hops Neff of the mobile ion (Li+ or Ag+). Red line is the fitted relationship between RSD of D and the total effective ion hops Neff

Equation (8) can be used to estimate the statistical variance of D calculated from any given AIMD simulation, where the number of total effective ion hops Neff can be first estimated from the maximum TMSD of all mobile ions using Eq. (9). For example, in a material with site distance a = 3 Å (assumed for all following estimations), an AIMD simulation that reaches a maximum TMSD of 1800 Å2, corresponding to Neff = ~ 200, results in an RSD of D of ~28%. In addition, since the diffusivity D and ionic conductivity σ are linearly dependent (Eq. (5)), the RSD of σ is equivalent to the RSD of D. Thus, Eq. (8) is also valid for estimating the RSD of ionic conductivity σ from AIMD simulations.

This result suggests that in order to obtain more accurate diffusivity from AIMD simulations, one should run longer MD simulations that sample more diffusion events, i.e., have larger Neff. For AIMD simulations with an estimated Neff of 50, 100, and 150, corresponding to max (TMSD) of 450, 900, and 1350 Å2, respectively, RSD of D is 52%, 38%, and 32%, respectively. These levels of RSD are reasonable, but still allow for a notable statistical error of the fitted D. To obtain D with <20% RSD, the AIMD simulation should observe more than max (TMSD) of ~4150 Å2 (Neff = ~ 460). The RSD is reduced to ~ 10% for Neff = 3200 and max (TMSD) = ~ 30,000 Å2. It should be noted that Neff and TMSD are from all mobile ions, while only MSD per ion is often presented in the literature. A max(TMSD) of a few thousand Å2 is quite significant for typical AIMD simulations within 1 ns and can only be reached in fast ion conductors at relatively high temperatures. In addition to running longer MD simulations, running MD simulations on larger systems with more mobile ions, for example as is done in classical MD simulations, can achieve more effective ion hops Neff and hence less statistical variance in D and σ.

Accessible range of diffusivity, activation energy, and temperature

Given the short physical time duration of 100 ps to 1 ns in typical AIMD simulations, AIMD simulations of materials with large activation energy Ea or at low temperature T may not observe enough number of ion hops. In order to achieve a reasonable accuracy of the fitted D, AIMD simulations may only be applicable to materials with relatively high ionic conductivity. Given the high computational costs of AIMD simulations, it is often desired to pre-estimate the range of temperatures that can be performed for a given material (with an estimated Ea). The accessible ranges of T at a given Ea can be estimated as follows. The estimation here assumes an AIMD simulation over a total physical time duration of 1 ns and a supercell with a volume V of 1000 Å3 (i.e., 10 Å × 10 Å × 10 Å) with N = 20 mobile ions and a site distance a = 3 Å. In order to achieve <50% RSD of D and σ, Neff should be >55 according to Eq. (8). For this RSD limit, the ionic conductivity should be >0.025 S/cm at 600 K to achieve Neff > 55 over 1 ns (Eqs. (4), (5)). This ionic conductivity corresponds to a Li+ diffusivity of ~4.2 × 10–7 cm2/s (Eq. (5)) for the assumed supercell model. For the materials supercell with different size V or different number of mobile carriers N, the accessible range of diffusivity can be estimated in the same way using Eqs. (58). In general, a minimum diffusivity of ~10−7 cm2/s is accessible by the AIMD simulations with typical size (~1 nm3) and time duration (~1 ns), in order to have a reasonable number of effective ion hops (Neff > 50).

Figure 5 shows a plot of D as a function of Ea and T estimated from the transition state theory

$$D = a^2v^ \ast \exp \left( { - \frac{{E_a}}{{kT}}} \right),$$
(10)

where \(v^ \ast\) is the attempting jump frequency and is chosen to be 1012 Hz, a is the neighboring-site distance, and k is the Boltzmann constant. The geometric factor and correlation factor, which also affect the value of D, were neglected in this back-of-the-envelope estimation. This plot can be used to estimate the range of Ea and T accessible to typical AIMD simulations. In the same example model above, a minimum diffusivity of ~5 × 10–7 cm2/s is needed to achieve an RSD of ~50% within 1 ns. The highest Ea and lowest temperatures to have the desired D value can be read from this plot (Fig. 5). For materials with an Ea of 0.6 eV, as in some cathode materials, AIMD simulations below 900 K may not observe enough ion hops and the fitted D would have a large error. For super ionic conductors with an Ea of 0.20 eV or lower, the temperature accessible by AIMD simulations may go as low as 300 K.

Fig. 5
figure 5

Diffusivity D as a function of activation energy Ea and temperature T for a typical ionic conductor with assumed site distance a = 3 Å and an attempting frequency ν = 1012 Hz

In addition, this plot can be used to determine the appropriate temperatures for AIMD simulations to obtain D with reasonable accuracy. For example, in order to achieve a desired RSD of ~20%, a maximum TMSD of 4150 Å2 over 1 ns is needed (Eqs. (78)), and this maximum TMSD corresponds to a minimum diffusivity of ~3 × 10–6 cm2/s (Eq. (5)) in the same example material model. To achieve that diffusivity, AIMD simulations of a material with Ea of 0.2 eV need to be at above 450 K (Fig. 5). An ionic conductor with Ea of 0.3 eV should have AIMD simulations above 700 K to achieve similar accuracy (RSD = ~20%). An AIMD simulation at >1150 K is needed for an ionic conductor with Ea of 0.5 eV. Therefore, for materials with slow diffusion, long AIMD simulations at high temperatures are essential to obtain D with low statistical uncertainty. In general, AIMD simulations containing a significant number of ion hops are crucial for achieving small error bounds and a high confidence level in the fitted D.

Estimating errors of diffusional properties from Arrhenius relation

Given the statistical uncertainty of fitted D, it is crucial to include the statistical uncertainty of every D data point into the fitting of the Arrhenius relation. In particular, the data points from the simulations at lower temperatures or over shorter durations may have a significantly smaller number of ion hops and thus a higher uncertainty, and this uncertainty should be taken into account during the fitting. The data points in Fig. 6 indeed show significantly larger variance at lower temperatures. Therefore, these lower-temperature data points should have less weight in the fitting. As a common practice in linear regression, the inverse square of the standard deviation of log(D) is factored into the fitting as the weight of each data point to account for the statistical uncertainty.46 Given that the fitting of the Arrhenius relationship is usually performed as a linear fitting of log(D) and 1/T, the derived standard deviation of log(D) shall be used as the weight on each point (as shown as error bars in Fig. 6) for the linear fitting (Supplementary Note 2). Therefore, the error bounds and statistical intervals of Ea and D0 in the Arrhenius relation (Eq. (7)) can be obtained using standard error analysis for linear regression, and so are also the errors for diffusivity and conductivity when extrapolated to other temperatures.

Fig. 6
figure 6

Arrhenius plot of Li+ diffusivity D as a function of temperature T in Li1.33Ti1.67Al0.33(PO4)3 (LATP), Li10GeP2S12 (LGPS) and Li7La3Zr2O12 (LLZO) from AIMD simulations. The error bar is the statistical uncertainty of each diffusivity data point estimated from Eqs. (79)

Table 1 shows the error bounds of the activation energy Ea and ionic conductivity σ extrapolated to 300 K for LATP, LGPS and LLZO Li-ion conductors. Ea and extrapolated σ at 300 K agree well with experimental values, showing the capability of AIMD simulations for quantifying diffusional properties.41,42,47,48 For all three materials, the Ea has a standard deviation of ~0.02 eV, and the extrapolated σ at 300 K has the error bound within 1–2 orders of magnitude. It should be noted that extrapolated conductivity (i.e., room temperature conductivity) assumes that there is no change in the slope of the Arrhenius relation due to a phase change. While excellent agreement of AIMD simulations and experiments is widely reported, one should note that the statistical uncertainty of extrapolated σ at 300 K can be as large as an order of magnitude for typical AIMD simulations. The errors in Ea and σ may be larger for AIMD simulations over shorter duration or on slower ion conductors, which sample fewer diffusion events. As shown in Supplementary Note 3 and Supplementary Figure 4, using short AIMD simulations, especially at low temperatures, would lead to significant overestimation of diffusivity and, in particular, the RT conductivity.

Table 1 Activation energy Ea and extrapolated Li+ conductivity σ at 300 K from fitted Arrhenius relation

In addition, the errors of Ea and extrapolated σ also depend on the number of data points and the linearity of the relationship according to the error analyses of linear regression.46 To minimize the error of Ea and extrapolated σ, more data points are helpful. For example, the error of Ea and σ may increase significantly if some data points are omitted for the LGPS data in Fig. 6, leading to more deviations from the experimental values (Supplementary Figure 5 and Supplementary Table 1). Therefore, having more data points of D (from sufficiently long AIMD simulations) would lead to smaller error and less statistical uncertainty for fitted diffusional properties.

Discussion

In this work, we systematically test and establish the procedures to obtain the diffusional properties and their statistical variances from AIMD simulations. Only the linear region of the MSD–Δt curve within the lower and upper fitting bound Δtlow and Δtup, which corresponds to diffusional displacements, should be used to fit the diffusivity. The cut-off of the ballistic region, Δtlow, is at the MSD per mobile ion of less than a fraction of a2 (a is the distance between neighboring mobile-ion sites), since the local ballistic vibration is limited within the potential well of the equilibrium site. The AIMD simulation should be long enough to have the MSD per mobile ion larger than a few times a2 so that the distinction between the linear diffusion region and the ballistic region can be clearly observed. Otherwise, a part of ballistic motion would be mixed into the fitting, leading to the over-estimation of D. Moreover, the upper fitting bound Δtup should be set as <30–70% of ttot to exclude the region with poor linearity at large Δt. An adequately long AIMD simulation with properly determined lower and upper fitting bounds is crucial to have a linear MSD–Δt curve for fitting D.

If the proper procedure is followed to determine the linear diffusion region of the MSD–Δt curve, the goodness of fitting to the MSD–Δt curve would be nearly perfect and would not reflect the statistical variance of the fitted diffusivity. The statistical variance of the fitted diffusivity and conductivity is instead a direct result of the total number of ion hops sampled during the MD simulations. The statistical variances of D from fitting Einstein relation should be quantified by the total displacement TMSD using the empirically determined relation in Eq. (8).

On the basis of these results, AIMD simulations should be adequately long to collect a sufficient number of ion hopping events. As shown in typical alkali-ion conductor materials, a simple rule of thumb is that a maximum TMSD of a few hundred Å2 is necessary to obtain a reasonable estimation of D (with ~50% RSD), and a maximum TMSD of a few thousand Å2 is necessary to obtain a more accurate D (~20–30% RSD). Given the high uncertainty of D from typical AIMD simulations, the error bounds of diffusivity and conductivity should be reported.

Given the need to collect a large number of ion hops, AIMD simulation is more suitable for studying fast-ion-conducting materials. Likewise, it is necessary to perform AIMD simulations at high temperature for most materials. A minimum diffusivity of ~10−7 cm2/s is necessary for a material to be accessible by AIMD simulations and a diffusivity of ~10−5 cm2/s may allow for the calculation of a more accurate D. On the basis of these estimated bounds of minimum diffusivity, the approximate accessible temperature ranges of AIMD simulations may be estimated using the plot in Fig. 5 or using a back-of-the-envelope estimation based on the Arrhenius relation. For materials with a high activation energy, running AIMD simulations at sufficiently high temperatures is crucial for simulating an adequate number of ion hops.

In addition, it is also crucial to include the statistical error bounds of each diffusivity data point when fitting to the Arrhenius relation (Eq. (7)). The error bounds of D should be accounted for as a weight into the linear fitting of log(D) over 1/T. The error of Ea and extrapolated σ can be established through standard regression error analysis. Such error analyses indicate that the typical statistical error bounds of extrapolated σ at 300 K may be as large as an order of magnitude, even for fast ionic conductors with many data points at different temperatures and with long AIMD simulations. One should consider the statistical variances when interpreting the diffusional properties calculated from AIMD simulations.

While the results of this study are largely based on AIMD simulations, our analyses, schemes, and conclusions are applicable for diffusional studies using classical MD simulations. While AIMD simulations provide ab initio potential energy surfaces, leading to more accurate diffusion properties, classical MD simulations can be performed on significantly larger model systems with more mobile ions and over longer time scales, so more diffusional events can be observed. When the TMSD and total effective ion hops Neff from classical MD simulations are large, the statistical variances of diffusional properties are expected to be small. However, in classical MD simulations in which TMSD or Neff is small, which is common for systems with slow diffusion or at low temperatures, our scheme and analyses should be carried out to properly quantify diffusional properties and their statistical variances.

In summary, the major conclusions for calculating diffusional properties from AIMD simulations are as follows. These conclusions also serve as specific guidelines for future practices.

  1. 1.

    In the calculation of diffusivity, fitting to the Einstein relation (Eq. (4)) should be limited to the linear diffusion region of the MSD–Δt curve. This linear diffusion region excludes the ballistic region at MSD a2 and the poor linearity region at large Δt > Δtuptup < 0.7ttot).

  2. 2.

    The AIMD simulations should be long enough and should only be performed on materials with reasonably fast diffusion so that a large number of diffusion events can be captured to obtain accurate diffusivity. In addition, the MSD per mobile ion should be larger than a few times a2 to distinguish the ballistic region (i.e., Δtlow).

  3. 3.

    The statistical variance of the diffusivity should be derived from the total diffusional displacements (such as maximum TMSD using Eqs. (89)) rather than the goodness of fitting to the Einstein relation. The statistical variance of fitted diffusivity values from AIMD simulations should be reported.

  4. 4.

    The statistical uncertainties of each data point of diffusivity should be included when fitting to the Arrhenius relation. The error bounds of activation energy and extrapolated ionic conductivity are non-negligible and should be evaluated using standard regression analysis.

Our study establishes the proper calculation procedures and statistical error analyses for the correct application of AIMD simulations in estimating diffusional properties. The obtained knowledge and established procedures are the basis for quantifying diffusional properties, drawing proper conclusions from the AIMD simulation results, and further developing AIMD simulations.

Methods

In this study, all DFT calculations were performed using the Vienna ab initio Simulation Package49 within the projector augmented-wave approach with the Perdew−Burke−Ernzerhof generalized-gradient approximation.50 The initial structures for AIMD simulations are statically relaxed in DFT using the standard parameters from Materials Project.51,52,53 In AIMD simulations, the DFT-based force evaluations were non-spin-polarized with a single Γ-center k-point grid. The AIMD simulations performed in this study were based on the Born–Oppenheimer approximation with a time step of 2 fs. Classical MD simulations of garnet were performed using Large-scale Atomic/Molecular Massively Parallel Simulator54 in a supercell model of 8 formula units of Li7La3Zr2O12 and the interatomic potential from ref. 55. All AIMD and classical MD simulations used a time step of 2 fs, an NVT ensemble using a Nose–Hoover thermostat, and a velocity–Verlet algorithm for integrating the equation of motion.

The crystal structures of LGPS, cubic phase LLZO, and RbAg4I5 investigated were obtained from Inorganic Crystal Structure Database (ICSD)56 and Materials Project.51,52,53 We used the Li super-ionic conductor LATP based on the NASICON structure to establish the fitting scheme, and then tested the same scheme in other Li super-ionic conductors. The structure of LATP was derived from the LiTi2(PO4)3 structure by partially substituting Ti with Al and by inserting extra Li atoms. The structures with disordered site occupancy were ordered using the same method in the previous studies.5,7

Data availability

The computation data to support the findings of this study is available from the corresponding author on reasonable request. The code to analyze AIMD simulation is available at: https://github.com/mogroupumd/aimd