Introduction

Our microscopic understanding of heat and mass transport in extended systems is rooted in the Green-Kubo (GK) theory of linear response1,2, as applied to the Navier-Stokes equations for the densities of the conserved extensive variables3,4, which include energy, momentum, and the particle numbers for each molecular species. Transport coefficients are determined by the equilibrium fluctuations of the relevant currents and are in fact proportional to their autocorrelation times. Notwithstanding its beauty, rigor, and broad scope, the practical implementation of the GK theory is known to require very long molecular dynamics (MD) simulations, much longer in fact than the current autocorrelation times one is required to evaluate5,6,7,8,9. Even though a number of expedients have been devised to cope with this problem7,9,10, none of them is fully satisfactory in that in some cases they fail to provide rigorous criteria to estimate the accuracy resulting from a given MD trajectory, and in all cases the length of the required simulations is unaffordable with accurate but expensive quantum simulation techniques11.

It has long been believed that the GK theory of heat transport could not be combined with modern simulation methods based on electronic-structure theory because energy densities and currents are inherently ill-defined at the quantum-mechanical level12. This misconception has been recently challenged by a couple of papers showing that, while energy densities and currents are actually ill-defined classically no less than they are quantum-mechanically, the heat conductivity resulting from the GK formula is indeed well defined by virtue of a kind of gauge invariance stemming from energy extensivity and conservation13,14. This finding spurred a renewed interest in the quantum simulation of thermal conduction11,15, and made it urgent to devise a data-analysis technique to make these simulations affordable, thus paving the way to the ab initio simulation of heat transport.

In this paper we propose a data-analysis protocol leading to an asymptotically unbiased and consistent estimate of transport coefficients (i.e. the bias and the statistical error can be made both arbitrarily small in the limit of long simulation times) and requiring shorter simulations than used so far. This protocol avoids any ad hoc fitting procedure and naturally provides an accurate estimate of the statistical error, thus lending itself to an easy implementation and automated use. While motivated by heat-transport applications, our approach naturally applies to any other transport properties that can be expressed, in a GK framework, in terms of time integrals of suitable autocorrelation functions, such as, e.g., ionic conductivities, viscosities, and tracer diffusivity, to name but a few. We test our procedure in the specific case of heat transport and we find that simulation times of one to a few hundred picoseconds are sufficient to achieve an accuracy of the order of 10% on the estimated thermal conductivities in a wide range of systems at typical physical conditions. In the Theory section we present the general theory of our approach; in the Applications and Benchmarks section we validate it with extensive benchmarks performed on the calculation of thermal conductivities in the paradigmatic cases of elemental and molecular fluids (liquid Ar and H2O) and of crystalline and glassy solids (MgO and a-SiO2). The Conclusions and Perspectives section finally contains our conclusions.

Theory

Heat transport in insulators is determined by the dynamics of atoms, the electrons following adiabatically in their ground state, and the GK theory of linear response1,2 allows one to derive the thermal conductivity from an analysis of equilibrium, possibly ab initio, MD trajectories. According to this theory, the thermal conductivity of a macroscopic and isotropic system is given by:

$$\kappa =\frac{1}{V{k}_{B}{T}^{2}}{\int }_{0}^{{\rm{\infty }}}\langle \hat{J}(t)\hat{J}(0)\rangle \,dt,$$
(1)

where \(V\) and \(T\) are the system volume and temperature, \(\hat{J}\) is one Cartesian component of the macroscopic heat flux14,16, \({k}_{B}\) is the Boltzmann constant, and the brackets \(\langle \cdot \rangle \) indicate ensemble averages over initial conditions. Here and in the following a caret, as in “\(\hat{J}\)”, denotes a realization of a random variable, whereas properties of the underlying stochastic process, such as time correlation functions, are indicated without carets. According to Eq. (1), the heat conductivity is proportional to the zero-frequency component of the power spectrum (aka the power spectral density) of the heat flux: \(\kappa =\frac{1}{2V{k}_{B}{T}^{2}}S(f=\mathrm{0)}\),

$$S(f)={\int }_{-\infty }^{\infty }{{\rm{e}}}^{-i2\pi ft}G(t)\,dt,$$
(2)

where \(G(t)=\langle \hat{J}(t)\hat{J}\mathrm{(0)}\rangle \) is the equilibrium time-correlation function of the heat-flux process.

In practice, MD gives access to a discrete sample of the process, \({\hat{J}}_{n}=\hat{J}(n\varepsilon )\), where \(0\le n\le N-1\), \(\varepsilon \) is the sampling period of the current, and \(N\) the length of the time series, that we assume to be even. The Wiener-Khintchine theorem17,18 states that, for large \(N\), the power spectrum of the heat-flux process evaluated at \({f}_{k}=\frac{k}{N\varepsilon }\) is proportional to the expected value of the squared modulus of the discrete Fourier transform of the corresponding time series:

$$S({f}_{k})=\langle {\hat{S}}_{k}\rangle ,$$
(3)
$${\hat{S}}_{k}=\frac{\varepsilon }{N}{|{\hat{F}}_{k}|}^{2},$$
(4)
$${\hat{F}}_{k}=\sum _{n\mathrm{=0}}^{N-1}{{\rm{e}}}^{-2\pi i\frac{kn}{N}}{\hat{J}}_{n},$$
(5)

for \(0\le k\le N-1\). The sample spectrum \({\hat{S}}_{k}\) is thus an estimator of the power spectrum of the process; in the statistics literature it is usually referred to as the periodogram. The reality of the \(\hat{J}\)’s implies that \({\hat{F}}_{k}={\hat{F}}_{N-k}^{\ast }\) and \({\hat{S}}_{k}={\hat{S}}_{N-k}\), so that periodograms are usually reported for \(0\le k\le \frac{N}{2}\) and their Fourier transforms evaluated as discrete cosine transforms.

The heat flux is an extensive quantity whose density autocorrelations are usually short-ranged. In the thermodynamic limit it can therefore be seen as the sum of (almost) independent identically distributed stochastic variables, so that, according to the central-limit theorem, its equilibrium distribution is Gaussian. A slight generalization of this argument allows us to conclude that the heat-flux process is Gaussian as well. The heat-flux time series is in fact a multivariate stochastic variable that, in the thermodynamic limit, results from the sum of (almost) independent variables, thus tending to a multivariate normal deviate. This implies that at equilibrium the real and imaginary parts of the \({\hat{F}}_{k}\)’s defined in Eq. (5) are zero-mean normal deviates that, in the large-\(N\) limit, are uncorrelated among themselves and have variances proportional to the power spectrum evaluated at \({f}_{k}\). For \(k\mathrm{=0}\) or \(k=\frac{N}{2}\), \({\hat{F}}_{k}\) is real and \({\rm{var}}({\hat{F}}_{k})=\frac{N}{\varepsilon }S({f}_{k})\); for \(k\notin \{\mathrm{0,}\,\frac{N}{2}\}\), \({\rm{v}}{\rm{a}}{\rm{r}}({\rm{\Re }}({\hat{F}}_{k}))={\rm{v}}{\rm{a}}{\rm{r}}({\rm{\Im }}({\hat{F}}_{k}))=\frac{N}{2\varepsilon }S({f}_{k})\). We conclude that in the large-\(N\) limit the sample spectrum of the heat-flux time series reads:

$${\hat{S}}_{k}=S({f}_{k}){\hat{\xi }}_{k},$$
(6)

where the \(\hat{\xi }\)’s are independent random variables distributed as a \({\chi }_{1}^{2}\) variate for \(k=0\) or \(k=\frac{N}{2}\) and as one half a \({\chi }_{2}^{2}\) variate, otherwise. For the sake of simplicity, we make as though all the \({\hat{\xi }}_{k}\)’s were identically distributed as one half a \({\chi }_{2}^{2}\) variate for all values of \(k\), thus making an error of order \({\mathscr{O}}(1/N)\), which vanishes in the long-time limit that is being assumed in most of the arguments presented in this paper. In many cases of practical interest, multiple time series are available to estimate the power spectrum of a process. For instance, in equilibrium MD, a same trajectory delivers one independent time series per Cartesian component of the heat flux, all of which are obviously equivalent in isotropic systems. In these cases it is expedient to define a mean sample spectrum by averaging over \(\ell \) different realizations:

$${}^{\ell }\hat{S}_{k}=\frac{1}{\ell }\sum _{i\mathrm{=1}}^{\ell }{\hat{S}}_{k}^{i},$$
(7)

where the index “\(i\)” enumerates the realizations. For \(k\notin \{\mathrm{0,}\,\frac{N}{2}\}\) the ordinates of the mean sample spectrum are therefore distributed as in Eq. (6), with the \(\hat{\xi }\)’s being \({\chi }_{2\ell }^{2}\) variates, divided by the number of degrees of freedom:

$${}^{\ell }\hat{\xi }_{k}\sim \frac{1}{2\ell }{\chi }_{2\ell }^{2}\mathrm{.}$$
(8)

Eq. (6) brings both good and bad news. The good news is, the expectation of the periodogram of the time series is the power spectrum of the process, i.e. the former is an unbiased estimator of the latter. The bad news is, this estimator is not consistent, i.e. its variance does not vanish in the large-\(N\) limit. This is so because a longer time series increases the number of discrete frequencies at which the power spectrum is sampled, rather than its accuracy at any one of them. A consistent estimate of the zero-frequency value of the power spectrum (or the value at any other frequency, for that matter) can be obtained by segmenting a time series into several blocks of equal length and then averaging over the sample spectra computed for each of them. When the length of the trajectory grows large, so does the number of blocks, thus making the variance of the average arbitrarily small. In practice, the determination of the optimal block size is a unwieldy process that leads to an inefficient determination of the length of the trajectory needed to achieve a given overall accuracy. Here we adopt a different approach that allows us to obtain a consistent estimate of the zero-frequency value of the power spectrum from the statistical analysis of a single trajectory sample (i.e. no block analysis is needed) and such that the estimate of the trajectory length necessary to achieve a given accuracy is optimal.

Spectral density estimation from finite empirical time series is the subject of a vast literature in the statistical sciences, embracing both parametric and non-parametric methods19. In the following we propose a semi-parametric method to estimate the power spectrum of a stochastic process, based on a Fourier representation of the logarithm of its power spectrum (the “log-spectrum”). The advantage of dealing with the log-spectrum, instead of with the power spectrum itself, is twofold. First and foremost, the noise affecting the former is additive, instead of multiplicative, thus making it simple and expedient to apply linear filters: limiting the number of components of the Fourier representation of the log-spectrum acts as a low-pass filter that systematically reduces the power of the noise and yields a consistent estimator of the log-spectrum at any given frequency. Second, as a bonus, the logarithm is usually smoother than its argument. Therefore, the Fourier representation of the logarithm of the power spectrum is more parsimonious than that of the spectrum itself.

Let \({}^{\ell }{\hat{L}}_{k}=\,\mathrm{log}({}^{\ell }{\hat{S}}_{k})\) be the “sample log-spectrum” of our time series. By taking the logarithm of Eq. (6), \({}^{\ell }\hat{L}_{k}\) can be expressed as:

$${}^{\ell }{\hat{L}}_{k}=\,\mathrm{log}(S({f}_{k}))+\,\mathrm{log}({}^{\ell }{\hat{\xi }}_{k})=\,\mathrm{log}(S({f}_{k}))+{\lambda }_{\ell }+{}^{\ell }{\hat{\lambda }}_{k},$$
(9)

where

$${\lambda }_{\ell }={\int }_{0}^{{\rm{\infty }}}\,\mathrm{log}(\frac{\xi }{2\ell }){P}_{{\chi }_{2\ell }^{2}}(\xi )\,d\xi =\psi (\ell )-\,\mathrm{log}(\ell )$$
(10)

is the expected value of the logarithm of the \({}^{\ell }\hat{\xi }\) stochastic variables defined in Eq. (8), \({P}_{{\chi }_{2\ell }^{2}}\) is the probability density of a \({\chi }_{2\ell }^{2}\) variate, \({}^{\ell }{\hat{\lambda }}_{k}=\,\mathrm{log}({}^{\ell }{\hat{\xi }}_{k})-{\lambda }_{\ell }\) are zero-mean identically distributed independent stochastic variables, and \(\psi (z)\) and is the digamma function20. The variance of the \({}^{\ell }\hat{\lambda }\) variables is:

$${\sigma }_{\ell }^{2}={\int }_{0}^{{\rm{\infty }}}\,\mathrm{log}\,{(\frac{\xi }{2\ell })}^{2}{P}_{{\chi }_{2\ell }^{2}}(\xi )\,d\xi -{\lambda }_{\ell }^{2}={\psi }^{^{\prime} }(\ell ),$$
(11)

where \(\psi ^{\prime} (z)\) is the tri-gamma function20.

Eq. (9) explicitly shows that the sample log-spectrum of a time series is equal to the logarithm of the power spectrum one wishes to evaluate (modulo a constant), plus a (non-Gaussian) white noise. In order to eliminate the high-frequency components of the noise, and thus reduce its total power, we define the “cepstrum” of the time series as the inverse Fourier transform of its sample log-spectrum:21

$${}^{\ell }{\hat{C}}_{n}=\frac{1}{N}\sum _{k=0}^{N-1}\,{}^{\ell }{\hat{L}}_{k}{{\rm{e}}}^{2\pi i\frac{kn}{N}},$$
(12)

and its coefficients as the cepstral coefficients. According to a generalized central-limit theorem for Fourier transforms of stationary time series22, in the large-\(N\) limit, the (inverse) Fourier transform of the \({}^{\ell }\hat{\lambda }\)’s, appearing in Eq. (9) and implicitly in Eq. (12), is a set of independent (almost) identically distributed zero-mean normal deviates. It follows that:

$${}^{\ell }{\hat{C}}_{n}={\lambda }_{\ell }{\delta }_{n0}+{C}_{n}{+}^{\ell }{\hat{\mu }}_{n},$$
(13)
$${C}_{n}=\frac{1}{N}\sum _{k=0}^{N-1}\,\mathrm{log}(S({f}_{k})){{\rm{e}}}^{2\pi i\frac{kn}{N}},$$
(14)

where \({}^{\ell }\hat{\mu }_{n}\) are independent zero-mean normal deviates with variances \(\langle {}^{\ell }\hat{\mu }_{n}^{2}\rangle =\frac{1}{N}{\sigma }_{\ell }^{2}\) for \(n\notin \{\mathrm{0,}\,\frac{N}{2}\}\) and \(\langle {}^{\ell }\hat{\mu }_{n}^{2}\rangle =\frac{2}{N}{\sigma }_{\ell }^{2}\) otherwise. This result can be easily checked explicitly by using the definition of the discrete Fourier transform; the non-trivial extra information provided by the central-limit theorem is the asymptotic independence and normality of the \({}^{\ell }\hat{\mu }\)’s. Similarly to the sample power spectrum, the cepstral coefficients are real, periodic, and even: \({\hat{C}}_{n}={\hat{C}}_{N-n}\). If the log-spectrum, \(\mathrm{log}(S({f}_{k}))\), is smooth enough, the number of non-negligible \({C}_{n}\) coefficients in Eq. (14) is much smaller than \(N\). Let us indicate with \({P}^{\ast }\) the smallest integer such that \({C}_{n}\approx 0\) for \({P}^{\ast }\le n\le N-{P}^{\ast }\), and let us restrict the discrete Fourier transform of the cepstrum defined in Eq. (12) to \(n < {P}^{\ast }\) or \(n > N-{P}^{\ast }\). The \(k\mathrm{=0}\) component of this restricted Fourier transform reads:

$${}^{\ell }{\hat{L}}_{0}^{\ast }={}^{\ell }{\hat{C}}_{0}+2\sum _{n=1}^{{P}^{\ast }-1}\,{}^{\ell }{\hat{C}}_{n}$$
(15)
$$=\,{\lambda }_{\ell }+\,\mathrm{log}({S}_{0})+{}^{\ell }{\hat{\mu }}_{0}+2\sum _{n=1}^{{P}^{\ast }-1}\,{}^{\ell }{\hat{\mu }}_{n}.$$
(16)

We conclude that \({}^{\ell }\hat{L}_{0}^{\ast }\) is a normal deviate with expectation and variance

$$\langle {}^{\ell }{\hat{L}}_{0}^{\ast }\rangle \equiv {L}_{\ell }^{\ast }=\,\mathrm{log}({S}_{0})+{\lambda }_{\ell },$$
(17)
$${\sigma }_{\ell }^{\ast }{({P}^{\ast },N)}^{2}={\sigma }_{\ell }^{2}\frac{4{P}^{\ast }-2}{N}\mathrm{.}$$
(18)

Eqs. (17) and (18) are the main results of our method: they show how the logarithm of the transport coefficient can be estimated from the (inverse) Fourier coefficients of the sample log-spectrum, Eq. (17), and how the resulting statistical error only depends on the number of Fourier coefficients retained and the total simulation length, Eq. (18).

The value of \({P}^{\ast }\) is a property of the stochastic process underlying the time series, and is therefore independent of \(N\): for any given value of \({P}^{\ast }\) the variance \({\sigma }_{\ell }^{\mathrm{\ast 2}}\) tends to zero in the large-\(N\) limit, and \({}^{\ell }{\hat{L}}_{0}^{\ast }-{\lambda }_{\ell }\) is thus a consistent estimator of \(\mathrm{log}({S}_{0})\). Notice that the absolute error on \({L}_{\ell }^{\ast }\) directly and nicely yields the relative error on \({S}_{0}\), which is proportional to the transport coefficient we are after. In general, all the cepstral coefficients are different from zero and assuming that many of them actually vanish introduces a bias. The optimal value of \({P}^{\ast }\) is the one for which the bias, which is a decreasing function of it, is of the order of the statistical error, which instead increases with \({P}^{\ast }\). Its choice is the subject of model selection theory, another vast chapter in the statistical sciences23. Among the several tests that have been devised to perform this task, we choose the optimization of the Akaike’s information criterion (AIC)23,24, as described below.

Given a model depending on \(P\) parameters, \(\theta =\{{\theta }_{1},{\theta }_{2},\cdots {\theta }_{P}\}\), the AIC is a sample statistic defined as

$${\rm{A}}{\rm{I}}{\rm{C}}(P)=-2\mathop{max}\limits_{\theta }\mathrm{log}\,{\mathscr{L}}(\theta ,P)+2P,$$
(19)

where \( {\mathcal L} (\theta ,P)\) is the likelihood of the parameters. The optimal number of parameters is then determined as the argument of the AIC minimum:

$${P}_{A}^{\ast }\equiv \arg \mathop{min}\limits_{P}{\rm{A}}{\rm{I}}{\rm{C}}\,(P).$$
(20)

In the present case the parameters of the model are the \(P\) coefficients \(C=\{{C}_{0},{C}_{1},\cdots \,{C}_{P-1}\}\) as defined in Eq. (14), and the log-likelihood reads, up to additive terms independent of \(P\) and \(C\):

$$2\,\mathrm{log}\,{\mathscr{L}}(C,P)=-\frac{N}{2{\sigma }_{\ell }^{2}}{({C}_{0}+{\lambda }_{\ell }-{\hat{C}}_{0})}^{2}-\frac{N}{{\sigma }_{\ell }^{2}}\sum _{n=1}^{P-1}{({C}_{n}-{\hat{C}}_{n})}^{2}-\frac{N}{{\sigma }_{\ell }^{2}}\sum _{n=P}^{\frac{N}{2}}{\hat{C}}_{n}^{2}.$$
(21)

Evidently, the above expression is maximized, for given \(P\), by \({C}_{n}={\hat{C}}_{n}-{\delta }_{n0}{\lambda }_{\ell }\) for \(n\,=\,\mathrm{0,}\,\mathrm{1,}\,\cdots \,P-1\), and the corresponding value of the maximum is: \(2\,{max}_{C}\,\mathrm{log}\,{\mathscr{L}}(C,P)=-\frac{N}{{\sigma }_{\ell }^{2}}{\sum }_{n=P}^{\frac{N}{2}}{\hat{C}}_{n}^{2}\). We conclude that the value of the AIC is:

$${\rm{AIC}}(P)=\frac{N}{{\sigma }_{\ell }^{2}}\sum _{n=P}^{\frac{N}{2}}{\hat{C}}_{n}^{2}+2P\mathrm{.}$$
(22)

The value of \(P\) that minimizes this expression is the optimal number of parameters in the Akaike sense, \({P}_{A}^{\ast }\), as defined in Eq. (20).

The maximum frequency available for spectral/cepstral analysis is the Nyqvist frequency25, determined by the sampling period ε as \({f}_{{\rm{Ny}}}=\frac{1}{2\varepsilon }\). Transport coefficients only depend on the low-frequency behavior of the spectrum, which is independent of \(\varepsilon \), as long as the latter is small enough as to avoid aliasing effects. For this reason it may prove convenient to eliminate the high-frequency portion of the spectrum (\(f > {f}^{\ast }\)) by applying a low-pass filter to the time series (e.g. a moving average26) and then resample the latter with a sampling period \({\varepsilon }^{\ast }=\frac{1}{2{f}^{\ast }}\), thus resulting in a time series of \({N}^{\ast }=N\frac{{f}^{\ast }}{{f}_{{\rm{Ny}}}}\) time steps. The optimal number of cepstral coefficients resulting from Eqs. (20) and (22), as well as the error in the estimate of the transport coefficients resulting from Eq. (18), depends in general on the choice of the cutoff frequency, \({f}^{\ast }\). The smaller \({f}^{\ast }\), the smaller will presumably be the number of cepstral coefficients necessary to describe the log-spectrum to any given accuracy over such a shorter frequency range. However, the shorter length of the filtered time series, \({N}^{\ast }\), results in an increased variance of the estimator \({}^{\ell }\hat{L}_{0}^{\ast }\) defined in Eq. (15), according to Eq. (18). Numerical experiments performed on the MD data reported in the Applications and Benchmarks section show that both the estimated value of \({}^{\ell }\hat{L}_{0}^{\ast }\) and its variance are actually fairly insensitive to the value chosen for the cutoff frequency, \({f}^{\ast }\), provided that the power spectrum of the re-sampled time series faithfully features the first band of the original spectrum (i.e. the first prominent feature) and that this band is not too peaked at the origin.

Data analysis based on the theory presented above is straightforward. Given a current time series from a MD trajectory, the sample spectrum \({}^{\ell }\hat{S}_{k}\) is first determined from Eqs. (4), (5), and (7) and smoothed out using e.g. a moving average26 performed over a narrow frequency window, so as to reduce the statistical noise to a level where the shape of the power spectrum of the underlying process can be appreciated. A cutoff frequency, \({f}^{\ast }\), is then determined so as to encompass the first band of the smoothed power spectrum. The cepstrum \({}^{\ell }\hat{C}_{n}\) is computed by Fourier analyzing the logarithm of the sample spectrum up the cutoff frequency thus estimated, as in Eq. (12), and the optimal number of cepstral coefficients is determined using Eqs. (20) and (22). The logarithm of the zero-frequency value of the power spectrum is finally estimated from Eqs. (15) and (17) as \({}^{\ell }\hat{L}_{0}^{\ast }-{{\lambda }}_{\ell }\), and its variance from Eq. (18).

Mind the difference between the moving average performed in the frequency domain to smooth out the power spectrum and that performed in the time domain, as suggested before, and acting as a low-pass filter. Spectral smoothing using a moving average in the frequency domain is common practice in the analysis of time series, and it actually provides a consistent estimator of the power spectrum. In fact, the number of frequencies falling within a window of given width increases linearly with the length of the series, so that the variance of the average decreases as the inverse of the product of the length of the series times the width of the window. The resulting spectral estimate is however biased by the variation of the signal within the window, thus strongly reducing the width of the windows that can be afforded. Moreover, both the bias and the statistical error are difficult to estimate due to the multiplicative nature of the noise; therefore neither the plain periodogram nor a running average thereof are adequate for a quantitative estimate of the zero-frequency value of the power spectrum, which is proportional to the transport coefficient we are after.

Applications and Benchmarks

In order to benchmark the methodology described above, we have applied it to the calculation of the thermal conductivity in four systems representative of different classes of materials, spanning from elemental and molecular fluids (Ar and H20) to crystalline and glassy solids (MgO and a-SiO2). Classical MD simulations were run using the LAMMPS package27 with the setup described below. For liquid Ar we used a Lennard Jones potential as described in Ref.28 at a temperature \(T\approx 220\,{\rm{K}}\) and density \(\rho =1.55\,{{\rm{g}}/{\rm{c}}{\rm{m}}}^{3}\), in a cubic supercell containing 864 atoms, with a time step \(\Delta t=4\,{\rm{fs}}\). For liquid H2O we used a flexible model as in Ref.29 at a temperature \(T\approx 300\,{\rm{K}}\) and density \(\rho =1.0\,{\rm{g}}/{{\rm{cm}}}^{{\rm{3}}}\), in a cubic supercell containing 180 molecules, with a time step \(\Delta t=0.5\,{\rm{fs}}\). For crystalline (FCC) MgO we used a Buckingham-plus-Coulomb potential as in Ref.30 at a temperature \(T\approx 1000\,{\rm{K}}\) and density \(\rho =3.61\,{{\rm{g}}/{\rm{c}}{\rm{m}}}^{3}\), in a 4×4×4 simple cubic conventional supercell with 512 atoms, and a time step \(\Delta t=0.3\,{\rm{fs}}\). For a-SiO2 we used the potential proposed by van Beest, Kramer, and van Santen31 as implemented by Shcheblanov et al.32, at a temperature \(T\approx 1000\,{\rm{K}}\) and density \(\rho =2.29\,{{\rm{g}}/{\rm{c}}{\rm{m}}}^{3}\), in a supercell containing 216 atoms with a time step \(\Delta t=1\,{\rm{fs}}\). The glass model was obtained from a quench from the melt (\(T\approx 6500\,{\rm{K}}\)) at a constant quenching rate of \(5.5\times {10}^{12}\,{\rm{K}}/{\rm{s}}\). Each system was equilibrated in the NVT ensemble at the target temperature for several hundred picoseconds; data were then collected in the NVE ensemble for trajectories whose length was chosen so as to represent realistic simulation runs that could be afforded using ab initio MD (\({\mathscr{T}}=100\,{\rm{ps}}\) for Ar, H2O, and a-SiO2, and \({\mathscr{T}}=500\,{\rm{ps}}\) for MgO). In order to compare our estimates of the transport coefficients and their statistical errors with reliable and statistically significant reference data, in all cases we ran much longer (\(\approx 50\,{\rm{ns}}\)) simulations. This allowed us to compare our predicted conductivities with accurate values estimated from the direct integration of the time auto-correlation function in Eq. (1), as obtained from a block average33 performed over the long trajectory. In addition, we could collect abundant statistics of our estimator for the transport coefficients, Eq. (15), and validate its normal distribution specified by Eqs. (17) and (18).

In Fig. 1 we report the sample periodograms of the heat fluxes computed for the four systems considered in this paper and averaged over the three Cartesian components, as described above. The solid lines in color indicate a further moving average26 computed over a narrow frequency window, as explained at the end of the Theory section. The values of the cutoff frequencies used for cepstral analysis, \({f}^{\ast }\), are chosen so as to encompass the first prominent feature of the (smoothed) power spectrum. For instance, in H2O and a-SiO2 we choose \({f}^{\ast }\approx 29\,{\rm{THz}}\) and \({f}^{\ast }\approx 28\,{\rm{THz}}\), respectively. In MgO we assume \({f}^{\ast }\approx 0.6\,{\rm{THz}}\), just at the upper edge of the first narrow peak, whereas in Ar there is just one band, corresponding to a purely diffusive behavior of a simple fluid, and we assume \({f}^{\ast }\approx 7\,{\rm{THz}}\), where the spectrum has exhausted most of the available power, but its value is not yet too small (see Fig. 1). The corresponding average numbers of cepstral coefficients given by the optimization of the AIC are: \({P}_{A}^{\ast }=5\) (Ar), \(7\) (H2O), \(4\) (MgO), and \(31\) (a-SiO2). Later we will will display the dependence of the number of optimal cepstral coefficients and of the resulting estimate of the thermal conductivity on the choice of \({f}^{\ast }\), and show that this choice is not critical.

Figure 1
figure 1

Sample power spectra of the heat flux computed from MD trajectories for Ar, H2O, a-SiO2 (\(100\,{\rm{ps}}\)) and MgO (\(500\,{\rm{ps}}\)) (gray line, see text). The solid lines in color correspond to a moving average performed over a narrow frequency window. The vertical arrows indicate the cutoff frequencies, \({f}^{\ast }\), used for the subsequent cepstral analysis (see text). The inset in the MgO panel is a magnification of the low-frequency region of the spectrum.

In order to validate our data-analysis protocol, we first computed the heat conductivities from a direct integration of the current autocorrelation function, Eq. (1), combined with standard block analysis over the \(50\,{\rm{ns}}\) long trajectory, that will be taken as a reference, obtaining: \({\kappa }_{{\rm{ref}}}\,=\,0.1965\pm 0.0015\), \(0.970\pm 0.009\), \(19.2\pm 0.4\), and \(2.115\pm 0.025\,{\rm{W}}/{\rm{m}}{\rm{K}}\), for Ar, H2O, MgO, and a-SiO2, respectively. Although our simulations were meant for benchmarking purposes only, and no particular care was paid to exactly match the simulation conditions of previous work, these data are in fair agreement with the foregoing theoretical results: \(\approx \,0.19\,{\rm{W}}/{\rm{m}}{\rm{K}}\) (Ar)28, \(\approx \,0.85\,{\rm{W}}/{\rm{m}}{\rm{K}}\) (H2O)34, \(\approx \,12\,{\rm{W}}/{\rm{m}}{\rm{K}}\) (MgO)30, and \(\approx \,2.1\,{\rm{W}}/{\rm{m}}{\rm{K}}\) (a-SiO2)35.

In Fig. 2 we display the distributions of the values of \(\mathrm{log}(\kappa /{\kappa }_{{\rm{r}}{\rm{e}}{\rm{f}}})\) estimated by applying our protocol to multiple MD segments of \(100\,{\rm{ps}}\) (for Ar, H2O, a-SiO2) and \(500\,{\rm{ps}}\) (for MgO), extracted from the \(50\,{\rm{ns}}\) long trajectory. The optimal numbers of cepstral coefficients, \({P}^{\ast }\), have been redetermined for each segment independently, while the values of the cutoff frequency, \({f}^{\ast }\), which only depends on the qualitative features of the spectrum, have been determined once for all for one of them. The distribution of the resulting number of cepstral coefficients is reported in Fig. 3. The observed distributions of \(\mathrm{log}(\kappa )\) successfully pass the Shapiro-Wilk normality test36 (i.e. they do not fail it) and the observed sample standard deviations closely match the theoretical values estimated from Eq. (18) (sample/theory): \(\mathrm{0.104/0.079}\) (Ar), \(\mathrm{0.053/0.045}\) (H2O), \(\mathrm{0.17/0.11}\) (MgO), and \(\mathrm{0.114/0.095}\) (a-SiO2). Remember that the error on \(\mathrm{log}(\kappa )\) is the relative error on \(\kappa \): the corresponding absolute errors achievable with a short trajectory of \(100\) (Ar, H2O, and a-SiO2) or \(500\) (MgO) ps, not to be confused with the long trajectory used to establish the reference data above, are therefore: \({\sigma }_{\kappa }\approx 0.015\) (Ar), \(0.045\) (H2O), \(2\) (MgO), and \(0.2\) (a-SiO2) \({\rm{W}}/{\rm{m}}{\rm{K}}\). This indicates that a single and short sample trajectory, such as one that is affordable with ab initio MD, is sufficient to achieve and accurately estimate a very decent relative error on the computed transport coefficient. In an attempt to evaluate the thermal conductivity from the direct computation of the GK integral, Eq. (1), and standard block analysis using similarly short MD trajectories, our best estimate of the resulting statistical error was 2–3 times larger than using our protocol (meaning 5–10× longer trajectories to achieve a comparable accuracy) in all cases but liquid Ar, where only a marginal improvement is achieved using our methodology. Much more than this, the standard analysis of MD data depends on a number of hidden parameters, such as the upper limit of the GK integral, Eq. (1), or the width of the blocks for error analysis, that are hard to determine and keep under control. Our method, instead, only depends on a single parameter, the number of cepstral coefficients, whose optimal value can be easily determined from the Akaike’s information criterion, or other more sophisticated model-selection methods23,37, as appropriate.

Figure 2
figure 2

Distributions of the logarithm of the thermal conductivities, \(\mathrm{log}(\kappa )\), estimated over multiple MD segments (\(100\,{\rm{ps}}\) for Ar, H2O, and a-SiO2, and \(500\,{\rm{ps}}\) for MgO) extracted from a \(50\,{\rm{ns}}\) long trajectory. The reported data are referred to \({\kappa }_{{\rm{ref}}}\), which is the value obtained from the direct integration of the current autocorrelation function in Eq. (1), combined with standard block analysis over the \(50\,{\rm{ns}}\) trajectory, and represented by the vertical gray bands. The Gaussian curves represent the distributions predicted by the theory, centered at the sample mean. Remember that the absolute error on \(\mathrm{log}(\kappa )\) is the relative error on \(\kappa \).

Figure 3
figure 3

Distribution of the optimal numbers of cepstral coefficients, \({P}_{A}^{\ast }\), as determined by optimizing the Akaike’s information criterion, Eqs. (20) and (22), for each segment of the \(50\,{\rm{ns}}\) long MD trajectory, as described in the text. The vertical dashed lines indicate the average value of \({P}_{A}^{\ast }\).

In order to estimate the bias introduced by limiting the number of cepstral coefficients, we examined the sample mean computed over the distributions displayed in Fig. 2, \(\langle \kappa \rangle \), obtaining: \(\langle \kappa \rangle =0.1878\pm 0.0007\), \(0.969\pm 0.002\), \(16.7\pm 0.2\), and \(2.131\pm 0.009\,{\rm{W}}/{\rm{m}}{\rm{K}}\), for Ar, H2O, MgO, and a-SiO2, respectively. Comparing these data with the reference data obtained from the direct evaluation of the GK integral, we see that the bias is negligible for H2O and a-SiO2, very small for Ar, and small but not negligible for MgO. In Fig. 4 we display the dependence of \(\mathrm{log}(\kappa /{\kappa }_{{\rm{r}}{\rm{e}}{\rm{f}}})\) on the number of cepstral coefficients, \({P}^{\ast }\), as estimated from Eq. (15). We observe that when the number of cepstral coefficients, \({P}^{\ast }\), is larger than the optimal value determined from the AIC, \({P}_{A}^{\ast }\), the estimated value of \(\kappa \) seems not to depend on \({P}^{\ast }\) for all systems but MgO, for which a slight bias seems to persist, and to a much lesser extent for Ar. Also, Eq. (18) seems to slightly underestimate the sample variance for small \({P}^{\ast }\) in these cases. In the case of MgO this behavior is likely due to the difficulty of the AIC to cope with the sharp low-frequency peak in the power spectrum, due to the highly harmonic character and slow decay of the vibrational heat carriers in periodic crystals11, thus requiring longer simulation times. In the case of Ar the very small bias observed for \({P}^{\ast }={P}_{A}^{\ast }\) may be due to the difficulty of choosing a suitable cutoff frequency when only a single diffusive band is present in the spectrum, and to the divergence of the log-spectrum at high frequency. In all cases, use of the Aikake’s information criterion results in a bias that is smaller than the statistical error estimated from an individual short sample trajectory and that can be systematically removed by increasing the value of \({P}^{\ast }\), at the price of increasing the statistical error, if and when needed.

Figure 4
figure 4

Dependence of \(\mathrm{log}(\kappa )\), as estimated from Eq. (15), on the number of cepstral coefficients \({P}^{\ast }\). \({P}_{A}^{\ast }\) is the optimal number of coefficients estimated from the AIC using Eqs. (20) and (22). The black dots represent the mean values of \(\mathrm{log}(\kappa )\) computed over multiple MD segments (\(100\,{\rm{ps}}\) for Ar, H2O, and a-SiO2, and \(500\,{\rm{ps}}\) for MgO) extracted from a \(50\,{\rm{ns}}\) long trajectory; the colored bands and dashed lines represent one standard deviation as estimated from the empirical statistics and from Eq. (18), respectively. The reported data are referred to \({\kappa }_{{\rm{ref}}}\), which is the value of thermal conductivity obtained from direct integration of the current autocorrelation function in Eq. (1), combined with standard block analysis over the \(50\,{\rm{ns}}\) trajectory, and represented by the horizontal gray bands. Remember that the absolute error on \(\mathrm{log}(\kappa )\) is the relative error on \(\kappa \).

Finally, in Fig. 5 we report the dependence of the optimal number of cepstral coefficients, \({P}_{A}^{\ast }\), as a function of the cutoff frequency, \({f}^{\ast }\), along with the dependence of the resulting estimate of \(\mathrm{log}(\kappa /{\kappa }_{{\rm{r}}{\rm{e}}{\rm{f}}})\). \({P}_{A}^{\ast }\) increases with \({f}^{\ast }\). Notwithstanding, the estimated value of the heat conductivity, as well as its variance, is fairly insensitive on the precise value of \({f}^{\ast }\) as long as the latter is large enough as to encompass the lowermost prominent feature of the spectrum. Some more comments are in order for MgO. In this case the high thermal conductivity, due to the strong harmonic character of slowly decaying phonon modes, manifests in the form of a narrow peak centered at \(f=0\), followed by a broad plateau that carries little spectral weight. This feature determines a more pronounced increase in the number of significant cepstral coefficients as \({f}^{\ast }\) increases and a corresponding increase of the bias when keeping \({P}^{\ast }\) at the value given by the AIC. The AIC is a less reliable indicator of the number of cepstral coefficients necessary to keep the bias low, in this case. By increasing this number by a factor of two or more, the bias decreases, as indicated by the results reported in lighter colors in Fig. 5, and eventually vanishes, as shown in Fig. 4.

Figure 5
figure 5

Triangles: average optimal number of cepstral coefficients, \({P}_{A}^{\ast }\), as determined by the AIC,Eqs. (20) and (22), as a function of the cutoff frequency used for cepstral analysis, \({f}^{\ast }\) (see discussion just after Eq. (22)). Squares: \(\mathrm{log}(\kappa )\) resulting from a given choice of \({f}^{\ast }\) and of the corresponding value of \({P}_{A}^{\ast }\). All the values are averages performed over multiple \(100\,{\rm{ps}}\) long segments (\(500\,{\rm{ps}}\) for MgO) extracted from a \(50\,{\rm{ns}}\) long MD trajectory, as discussed in the text. The colored bands indicate the sample standard deviation and the dashed lines that resulting from our theoretical analysis (see Eq. (18)). The vertical arrows indicate the cutoff frequencies, \({f}^{\ast }\), used for the cepstral analysis in this paper (see Fig. 1 and text). In the case of MgO, the data indicated with lighter colors are obtained using a number of cepstral coefficients twice as large as that provided by the AIC, \({P}^{\ast }=2{P}_{A}^{\ast }\). The data are referred to \({\kappa }_{{\rm{ref}}}\), which is the value of thermal conductivity obtained from direct integration of the current autocorrelation function in Eq. (1), combined with standard block analysis over the \(50\,{\rm{ns}}\) trajectory, and represented by the horizontal gray bands. Remember that the absolute error on \(\mathrm{log}(\kappa )\) is the relative error on \(\kappa \).

Conclusions and Perspectives

We believe that the methodology introduced in this paper will finally open the way to the quantum simulation of heat transport, which has been considered out of its scope until very recently, particularly for strongly anharmonic and/or disordered systems. Its implementation is straightforward and its use robust, as the only parameter to be determined is the optimal number of cepstral coefficients, using e.g. the Akaike’s information criterion. Our methodology performs best when a low conductivity results from large cancellations in the integral of a highly oscillatory time auto-correlation function, such as in molecular liquids or amorphous solids, where simulation times of the order of \(100\,{\rm{ps}}\) seem sufficient to obtain accuracies of the order of 10% in the estimated thermal conductivities. The performance is less spectacular in periodic crystals, where slowly-decaying strongly-harmonic phonon modes require longer simulation times and the ensuing sharp peak in the low-frequency region of the power spectrum requires a larger number of cepstral coefficients than predicted by the optimization of the AIC. Even so, simulation times of the order of a few hundred picoseconds seem sufficient to achieve a comparable accuracy. In the latter case, it is possible that a combination of the methodology introduced here with specialized techniques based on normal-mode analysis, such as that presented in Ref.11, will result in further improvements. Replacing the optimization of the AIC with more sophisticated and possibly more efficient approaches, such as e.g. weighted multi-model inference techniques23,37, may also assist in this and other difficult cases. Finally, we expect that our methodology will impact on the simulation of any transport phenomena to which the Green-Kubo theory applies, such as ionic conduction, viscosity, and many others.