Shear viscosity is one of the most important transport properties governing the macroscopic flow of liquids. As such, it plays a fundamental role in various fields of science and technology, such as, e.g., chemical and mechanical engineering or earth and planetary sciences, to name but a few. For instance, the viscosity of a solvent crucially affects the dynamics of solutes and the reactions rates, of fundamental importance in the study of biological processes and chemical reactions1,2,3. The value of the viscosity of liquid iron, abundant in Earth’s outer core, is key in the prediction of the magnetic field of rocky planets4,5. An accurate determination of the temperature and pressure profile of the viscosity is also essential for the correct modeling of tidal interactions in the planets’ interior, in particular in the presence of icy layers6,7.

In this work, we focus on water, an ubiquitous molecular liquid with extraordinary and complex properties8,9,10,11,12,13. In spite of the great importance of this system and the large number of studies based on density-functional theory (DFT) and ab initio molecular dynamics (AIMD) devoted to it14,15,16,17,18,19,20,21,22, all of these efforts have, until very recently23, dodged its viscous properties, because an accurate computation of the viscosity of water would require exceedingly long first-principles simulations20. A number of studies based on classical force fields exists24,25,26,27, but the poor transferability of these models sets a limit to their predictive power. An attempt to estimate the viscosity of water from first principles was made with an indirect approach relying on the Stokes–Einstein relation22, which, however, does not hold over all the phase diagram for liquid water, particularly in the supercooled regime28,29,30.

A rigorous microscopic description of the shear viscosity of liquids, η, is provided by the Green–Kubo (GK) theory of linear response31,32,33,34, according to which its value is proportional to the integral of the time auto-correlation function (tACF) of the off-diagonal matrix elements of the stress tensor. This integral can be estimated from the time series of the stress, generated by an equilibrium molecular-dynamics simulation of the system of interest. A number of different procedures have been developed to cope with the evaluation of the GK integral35,36. Here, we adopt a spectral approach, recently proposed by Ercole et al.37,38,39,40, which allows one to compute transport coefficients, along with the statistical errors affecting them, from shorter trajectories than previously thought to be necessary. This progress notwithstanding, the estimate of transport coefficients from AIMD may require generating trajectories of a few hundred picoseconds for systems as large as a few hundred atoms. It is evident that, although technically quite possible, AIMD simulations of this size do not lend themselves to an easy estimate of the statistical accuracy of the results, let alone a systematic exploration of a broad region of the phase diagram of a material.

The last decade has seen the rise of machine-trained potentials, as represented by either deep-neural networks41,42,43,44 or by Gaussian processes45, as powerful tools for atomistic simulations. These potentials are able to deliver a nearly quantum mechanical accuracy at a cost that is only marginally higher than that of classical force fields. This opens the way to extend the scope of AIMD simulations to the size range necessary for the computation of reliable transport coefficient such as the viscosity. In the present work, we adopt the recently developed Deep Potential framework43,46,47 to study the shear viscosity of liquid water. Deep potential molecular dynamics (DPMD) simulations have already been proved to successfully predict bulk thermodynamic properties beyond the reach of direct DFT calculations13,48,49,50,51,52, as well as dynamic properties like mass diffusion in solid state electrolytes53,54, their interactions with defects55, thermal transport properties in silicon56, infrared spectra of water and ice57, Raman spectra of water58 and very recently also the thermal conductivity of liquids such as liquid water59.

So far, a combination of AIMD, advanced data analysis, and neural-network techniques has only been applied to thermal and charge transport39,40,55,59,60,61. In this work, we attempt to apply them to the computation of viscosity. In this study, we report on calculations, from both direct DFT and DPMD simulations, of the shear viscosity of water. We show that η can be obtained with trajectories of ≈400 ps, which are still quite demanding for an extensive ab initio study over a broad portion of the phase diagram. We thus take advantage of the DPMD technique and perform extensive simulation employing a deep-neural-network potential (NNP) trained on extensive DFT data. Our methodology proceeds in two steps. In the first, we train a NNP on Perdew–Burke–Ernzerhof (PBE)62 data and validate our procedure against results from a rather long (400-ps) AIMD trajectory. We then adopt the strongly constrained and appropriately normed (SCAN) meta-GGA exchange-correlation (XC) functional63,64, which provides a much more accurate description of the H-bond network in water19, to perform extensive simulations of the viscous properties of water just above melting. Close to melting, the viscosity depends very sensitively on temperature. Once the error resulting from the imperfect prediction of the melting line is offset by referring the simulated temperature to the theoretical melting one, our SCAN predictions of the shear viscosity of water in a temperature range extending above the melting line are in very good agreement with experiment.

Our paper is organized as follows. “Results” contains all the discussion of the results: in “Ab initio molecular dynamics” we present the results of our direct PBE-AIMD simulations and draw some conclusions on the simulation time and length scales necessary to achieve an acceptable statistical accuracy; in “PBE NNP” we benchmark our NNP against ab initio MD simulations of liquid water at the PBE level of theory; in “Statistical analysis and finite-size scaling“ we expand our analysis on the statistical properties of our estimator of the shear viscosity and briefly discuss its size-dependency. Once our methodology is set up and validated, in “SCAN NNP” we report on an extensive set of simulations performed with a NNP model trained on SCAN meta-GGA DFT data and we compare their results with available experimental data and our PBE-NNP results. We show that SCAN meta-GGA reduces the deviation from experiments of the predicted shear viscosity. “Discussion” contains our final discussion with some interesting perspectives and further applications of our work. In “Methods”, we recall the main theoretical and numerical methods used throughout the work: the main aspects of the GK theory of transport; its application to viscosity; the main data-analysis technique; and briefly describe the neural-network model.


Ab initio molecular dynamics

We performed AIMD simulations of liquid water at near-ambient conditions using the PBE62 XC functional, the plane-wave pseudopotential method, Hamann–Schluter–Chiang–Vanderbilt norm-conserving pseudopotentials65, and a kinetic-energy cutoff of 85 Ry. The simulated system was made of 64 molecules at the standard density of 1 gr cm−3, corresponding to a cubic box of edge l = 12.43 Å. All the simulations were carried out with the Car-Parrinello extended-Langrangian method66 using the cp.x component of the QUANTUM ESPRESSO™ distribution67,68,69 and setting the fictitious electronic mass to 25 physical masses and the timestep to dt = 0.073 fs. We performed two simulations aiming at thermodynamic conditions near ambient temperature and somewhat above it. As PBE is known to enhance the short-range structure of water and to overestimate the melting temperature by ≈ 140 K70,71, we set the target temperatures of the two simulations to 450 and 600 K, respectively. Both trajectories were first equilibrated in the NVT ensemble using a Nosè-Hoover thermostat72 at the target temperature, followed by long production NVE runs of 400-ps. Finally, the shear viscosity was obtained from the cepstral analysis of the power spectrum of the off-diagonal elements of the stress, using the SporTran73 code.

In Fig. 1, we display the (moving averages74 of the) power spectra of the stress-tensor time series resulting from our two simulations. While showing similar features at high frequency, the two spectra differ substantially approaching ω = 0. In particular, lower temperatures see the appearance of sharp peaks near ω = 0, which requires a greater care in the cepstral analysis of the data, which is based on a low-pass filter of the (logarithm of) the power spectra. In the inset, we display the low-frequency region of the spectra together with the results carried out by the cepstral analysis, i.e., by applying a low-pass filter to the logarithm of the raw spectra. The filtered spectra are represented by thick solid lines whose zero-frequency value is a fair and accurate estimate of the shear viscosity we are after:

$$\eta =\left\{\begin{array}{l}0.383\pm 0.023\,{{{\rm{cP}}}}\,{{{\rm{at}}}}\,454\,{{{\rm{K}}}},\\ 0.178\pm 0.005\,{{{\rm{cP}}}}\,{{{\rm{at}}}}\,600\,{{{\rm{K}}}}.\end{array}\right.\,\,({{{\rm{PBE}}}})$$

where the unit cP stays for centipoise, 1 cP = 10−3 Pas. It is often assumed that the predictions of ab initio simulations should be compared with experiments upon shifting the simulated temperature by the offset between the theoretical and experimental melting temperatures, which, in the case of PBE, amounts to Tm(PBE) − Tm(expt) ≈ 140 K71. We thus compare our value predicted by PBE at T = 454 K with the experimental value measured at T = 313 ≈ 454−140 K, ηexpt(T = 313 K) = 0.653 cP. The agreement is fair, on account of both the uncertainties related to the empirical temperature shift and the very sensitive dependence of the viscosity upon temperature near melting. More on the meaning of the residual disagreement will be discussed in “SCAN NNP”.

Fig. 1: Ab initio power spectra.
figure 1

Power spectra of the off-diagonal elements of the stress in water at 454 K (blue) and 600 K (orange), obtained from AIMD simulations (see text). The spectra are filtered by a moving average with a window of 0.05 THz. The thick solid lines in the inset represent the cepstral-filtered spectra whose zero-frequency value gives an estimate of the shear viscosity.

In Fig. 2, we display how the prediction of the shear viscosity in water depends on the length of the simulation. In order to highlight the impact of possibly long relaxation times on the estimate of the transport coefficient, we have split our 400-ps trajectories into segments of 100, 200, and 300-ps (in the latter case the two segments were overlapping). The estimates from different segments coincide with the statistical errors evaluated within each of them at 600 K, but not quite so at 454 K. This can be ascribed to the emergence of a narrow peak in the stress power spectrum at ω = 0 (see Fig. 1), related to an increase of the stress correlation time occurring as the freezing temperature is approached. A similar behavior had been already observed by Ercole et al.37 in the case of heat transport in strongly harmonic crystals. All these considerations suggest that near freezing the computation of the shear viscosity requires longer simulation runs, and even longer runs would be required for a fair evaluation of the statistical uncertainties, indicating that AIMD may not be the most efficient approach to explore a broad range of thermodynamic conditions. In the following we show that neural-network models of inter-atomic interactions trained on ab initio data provide a valid alternative to direct AIMD simulations, yielding results of similar quality at a much lower computational cost.

Fig. 2: η vs. length of the simulation.
figure 2

Dependency of the shear viscosity η on the length of the simulation, estimated by AIMD a at 454 K and b 600 K. Different colors refer to different simulation times. Error bars represent standard deviations.


In order to appraise the ability of NNP to accurately predict shear viscosity, we have generated one such model, by training it on a set of PBE-DFT data. The training dataset is prepared via a recently proposed “on-the-fly” learning procedure called Deep Potential Generator (DP-GEN)75,76 and it consists of the energies and atomic forces of 4000 configurations of water generated by the DP-GEN from NPT MD trajectories at different temperatures in the [300–700 K] range and for pressures up to 50 kbar. The PBE-NNP is then constructed and trained with the DeePMD-kit. The cutoff radius is set to 6 Å. The size of the embedding and fitting nets is (50, 50, 50) and (250, 250, 250), respectively. The model was trained by minimizing the standard loss function, \({{{\mathcal{L}}}}\), presented in Eq. (8) of “Methods” with 2 million steps of Adam stochastic gradient descent77. We tried to include the values of the virial in the definition of the loss function, but we found no improvement with respect to the standard definition of Eq. (8), and thus decided not to modify it.

Figure 3 shows a scatter plot of the NNP predictions for atomic forces and stress vs. PBE-DFT data, evaluated over a set of 10,000 configurations, not included in the training dataset. The average error on the forces and on the off-diagonal elements of the virial are σF = 40 meV Å−1 and σΞ = 1.4 meV/atom, respectively, corresponding to correlation coefficients78 of 0.998 and 0.995, respectively.

Fig. 3: Neural network accuracy.
figure 3

Scatter plot of the NNP forces (a) and off-diagonal elements of the virial per atom (b) vs. DFT data, for a dataset of 10,000 configurations. The corresponding correlation coefficients are \({R}_{a}^{2}=0.998\) and \({R}_{b}^{2}=0.995\).

In order to validate our neural-network methodology for the prediction of the shear viscosity, we performed DPMD simulations in the NVE ensemble for the same model of liquid water described above. Simulations of 20-ns were run at two different temperatures, 454 K and 600 K, using our NNP trained to PBE water. All simulations were carried out using the LAMMPS code79 interfaced with DeepMD-kit. In Fig. 4 we display the results obtained by analyzing independently each one of the about 50 400-ps segments in which we have partitioned the whole 20-ns trajectory. The shear viscosity of each segment is obtained again by cepstral analysis using the SporTran code and is represented by solid dots together with its estimated statistical error. The blue and orange regions represent respectively the estimate of the shear viscosity given in “Ab initio molecular dynamics” from ab initio MD simulations at 454 K and 600 K. We observe a very good agreement between the two approaches and conclude therefore that our NNP is capable of predicting correctly the shear viscosity of water at the given pT conditions. Also, notice the close agreement between the standard deviation of the viscosity estimated by cepstral analysis on individual 400-ps trajectory segments and the value computed over a sample of about 50 segments. More on the statistical analysis and significance of our data in “Statistical analysis and finite-size scaling”.

Fig. 4: Comparison between the shear viscosity predicted by DFT AIMD simulations and by DPMD simulations.
figure 4

The results are obtained from 400-ps long DFT AIMD simulations (horizontal blue and orange bands, 454 and 600 K, respectively) and by DPMD simulations of the same length (solid dots). The width of the bands and the vertical bars across the dots indicate the standard deviation of the data they refer to, as estimated by cepstral analysis (see Section “Data analysis”). The red and green bars on the right of the box indicate the sample averages and standard deviations of the DPMD data. Error bars represent standard deviations.

In Table 1 we report our results for the viscosity of water computed at two different temperatures with DPMD and NNP trained on PBE-DFT data, obtained from very long (20-ns) trajectories, and compare them with the AIMD data of “Ab initio molecular dynamics”.

Table 1 Viscosity of water [cP] computed from AIMD or DPMD performed at two different temperatures using the PBE XC functional.

Statistical analysis and finite-size scaling

We are now ready to investigate the statistical behavior of the shear viscosity for different simulation lengths. To this end, we sliced our 20-ns simulations in segments of smaller lengths (100-, 200-, and 400-ps) and analyzed them independently. Before proceeding, we remind the pivotal tenet of cepstral analysis: if a sample of a stationary stochastic process is longer than all the relevant time scales of the process, then the sample spectrum (i.e., the squared modulus of the Fourier transform of the series) equals the theoretical power spectrum of the process, times a set of identically distributed χ2 stochastic variables that are independent of each other for different frequencies. This implies that the low-pass-filtered logarithm of the sample spectrum is normally distributed at any (sufficiently low) frequency37 and that the estimator of the transport coefficient—which is proportional to the ω = 0 value of the filtered spectrum—is, therefore, a log-normal variate. In order to check the reliability of the cepstral estimate of the viscosity from trajectories of different lengths, in Fig. 5 we display the distributions of the logarithm of these estimates from trajectory segments of different length (100-, 200-, and 400-ps) and report the p-values of the Shapiro-Wilk (SW) normality test80 for each distribution. We observe that: i) at T ≈ 450 K the WS test is failed for segments shorter than 400-ps, indicating the subsistence of slow stress fluctuations that adversely affect our data analysis technique ; ii) at T = 600 K the WS is never failed with respect to a standard significance level α = 0.05; even for the shortest segment length (100 ps), for which we compute a p value of 0.07 over a sample of 200 segments \(\log (\eta )\); iii) the width of the distributions of the viscosity estimated at different lengths is slightly larger than the standard deviation estimated within each segment by cepstral analysis; iv) this difference decreases as the length of the segments increases, until it roughly vanishes at 400-ps; v) this difference also decreases by increasing the temperature. This observation is made more quantitative in Fig. 6, which shows the correlation between the standard deviations of the cepstral estimates of the viscosity from trajectories of different lengths and temperatures, vs. the spread of the distribution of their values resulting from different trajectories. The former quantity is itself affected by a statistical uncertainty because cepstral analysis returns different standard deviations for different trajectories of the same length. Figure 6 indicates that as the system approaches freezing from above and the viscosity increases, the low-frequency components of the virial fluctuations become increasingly important, and simulations of increasing length become necessary to cope with them. This is confirmed in Fig. 7 that displays the low-frequency portion of the power spectrum of the off-diagonal elements of the stress in water at different temperatures, and shows that as the system approaches freezing from above, a narrow peak develops at ω = 0, as a consequence of the onset of long-lived relaxation modes. In the present case, it appears that at 450 K trajectories as long as 400-ps are needed to get a reliable estimate of the statistical error affecting the estimate of the PBE-DFT viscosity. More generally, it seems that the flexibility offered by NNP and the long simulations they can afford are instrumental not only in exploring broad regions of the phase diagram of a material, but also in providing a reliable estimate of the statistical accuracy of individual simulations.

Fig. 5: Normalized distributions of the logarithm of the shear viscosities, log(η).
figure 5

The results are estimated over multiple MD segments (blue: 100-ps; orange: 200-ps; green: 400-ps) extracted from a 20-ns trajectory at 454 K (left) and 600 K (right). The reported data are referred to the average, \(\bar{\eta }\). We remind that the absolute error on log(η) is the relative error on η. The shaded area denote the average standard deviation of the shear viscosities, as estimated by cepstral analysis within each individual segment.

Fig. 6: Estimated standard deviations of η vs. spread of the distribution.
figure 6

Correlation between the cepstral estimates of the standard deviations of the viscosity of water from trajectories of different lengths and temperatures, σcep, vs. the spread of the distribution of their values resulting from different trajectories, σref (see text). Error bars represent standard deviations.

Fig. 7: Moving average of the low-frequency region of the power spectrum of the off-diagonal elements of the stress tensor in water at different temperatures, as obtained from DPMD simulations trained on PBE DFT data.
figure 7

An averaging window of 0.05 THz was used. Simulations were run at the fixed density of 1 gr cm−3.

Finite-size effects may affect the transport properties calculated in numerical simulations81,82. In order to quantify these effects in the present case, we run up to 5-ns long NVE simulations at 454 K and 600 K of PBE-NNP water at fixed density and increasingly larger cells (with up to 4096 molecules). The results, reported in Table 2, indicate that η shows no evident size dependence within the error bars of our simulations.

Table 2 Shear viscosity [cP] computed for water at different temperatures with PBE-NNP force field and using simulation boxes of different sizes.


The SCAN meta-GGA XC functional has demonstrated the ability to predict well several properties of water over a broad range of thermodynamic conditions, whose exploration was made possible by NNP techniques13,19,20,48,83. A combination of AIMD and NNP techniques, based on the SCAN XC functional, has recently been successfully applied to the prediction of the heat transport properties of liquid water59. In the following, we report on our extension of this effort to the computation of the shear viscosity.

Accurate DPMD simulations were performed using NNP force fields trained on both PBE and SCAN DFT data59 and the same software setup as in “PBE NNP”. Our simulated systems consist of 512 water molecules. With systems of this size, temperature fluctuations are smaller than 1 K. We first perform NVT simulations at the target temperature, followed by NVE production runs, up to 5-ns long. The volume was fixed to the value corresponding to the equilibrium densities evaluated in ref. 83 via enhanced-sampling simulations for SCAN, while for PBE it is computed from direct DPMD NPT simulations at ambient pressure, whose results are in agreement with previous calculations18,84.

In Fig. 8 we compare our SCAN-NNP and PBE-NNP results with each other and with experimental data85,86. Results below the melting temperature, Tm, refer to the undercooled fluid, which becomes increasingly viscous as the temperature decreases. Remarkably, when temperatures are referred to the theoretical melting one, the SCAN predictions for the viscosity are in close agreement with the experiment at melting (and above, as we will discuss shortly). This is not so for PBE. One could argue that PBE yields too low a viscosity as a consequence of the too low equilibrium density (0.77 vs ≈1 gr cm−3 at melting). This is not the case, however, because repeating the simulations at the density of 1 gr cm−3 (dashed lines) results in only a marginal increase in the predicted viscosity. We conclude that the common wisdom according to which the properties of PBE water would match those of real water at a simulation temperature 100 K above the experimental one is likely too simplistic: PBE water not only freezes at too high temperature, but its dynamics is way too fast at melting, as confirmed by the too-high self-diffusivity predicted by PBE, with respect to SCAN and experiment, when all the simulations are performed at the same temperature offset from Tm as in the experiment. For instance, the self-diffusivity of water predicted by PBE at a temperature T = 430 K, which is ≈ 20 K higher than the PBE melting temperature, Tm(PBE) ≈ 410 K, is 0.45 Å2 ps−159. This is to be compared with a value of 0.19 Å2 ps−1 predicted by SCAN at 20 K above its own melting temperature (i.e., at 330 ≈ 312 + 20 K,19 and practically the same value measured at T = 20 C, 0.2 Å2 ps−187). In a model where the dependence of the self-diffusivity on temperature was Arrhenius-like, this behavior would be consistent with a too-small pre-exponential factor predicted by PBE relative to SCAN and experiment. Further insight into the dynamics of the water hydrogen-bond network at melting would deserve further investigation.

Fig. 8: Comparison between the shear viscosities of water computed via DPMD simulations using NNP force fields trained to different DFT datasets and with experiments85,86.
figure 8

When not visible, the error bars are smaller than the dots. Continuous lines refer to simulations performed at the equilibrium density corresponding to each temperature. PBE data marked with a dashed line are obtained at the density of 1 gr cm−3. The thin vertical and horizontal lines mark the melting temperature and the corresponding viscosities. Error bars represent standard deviations.

In Fig. 9 we compare with the experiment the SCAN-NNP predictions for the viscosity of water, on a temperature scale that has been offset by the difference between the predicted melting temperature for the model and the one observed in the experiment, ΔT = 312−273 = 39 K. One observes that, while the agreement between theory and experiment is excellent above the melting temperature, SCAN consistently overestimates the viscosity in the undercooled regime. This indicates that the tendency toward dynamical arrest upon undercooling is occurring faster in the model than in the experiment. Interestingly, a crossover between the predicted and observed densities occurs at temperatures near melting: SCAN slightly overestimates the density of water for T > Tm, while it underestimates it in the undercooled regime. We hypothesize that the too large SCAN predictions for the viscosity below freezing may be related to a propensity of SCAN to overestimate the strength of the hydrogen bonds. In turn, this would lead to overestimate low-density (LD) over high-density (HD) fluctuations upon cooling, corresponding to configurations that underlie the structure of amorphous ices and water. At very deep undercooling they may lead to phase separation between an LD and a HD liquid13,88,89. The stronger local structure of LD water with respect to HD water seems compatible with a more marked solid-like behavior90,91,92 and, hence, with a larger viscosity.

Fig. 9: Comparison between the SCAN predictions and the experimental values for the shear viscosity of water as a function of the temperature.
figure 9

The temperature scale for SCAN data has been offset by the difference between the theoretical and experimental melting temperatures, Tm (see text). Error bars represent standard deviations.


We conclude with a summary of our results and some interesting perspectives and further applications of our work. In this article, we have performed a systematic ab initio study of the viscosity of liquid water, made possible by a combination of quantum-mechanical first-principles and deep-neural-network techniques. Our study confirms the ability of the SCAN exchange-correlation density functional to predict a broad array of properties of water over a wide range of thermodynamic conditions. Minor shortcomings observed in the undercooled regime are possibly related to the subtle balance between the high- and low-density fluctuations that become more prominent upon undercooling, as one approaches the hypothesized metastable liquid–liquid critical point. These shortcomings might be attenuated by training a neural network on more accurate quantum mechanical data, such as obtained from hybrid functionals18,93, or by using density-corrected DFT94, which adopts a more accurate electron density obtained at the Hartree–Fock level of theory. One of the most successful in describing the property of water is the recently developed DC-SCAN95, which produces remarkably accurate molecular dynamics for liquid water, and a highly realistic self-diffusion coefficient as a function of temperature. Finally, as a technical, but important, side product of our study we have highlighted that a careful analysis of the statistical properties of the stress time series, from which the viscosity can be evaluated through the Green–Kubo theory of linear response, is necessary, and we have provided a detailed report on some mathematical and computational tools that can be deployed to ease this task.


The GK theory of linear response31,32 provides a rigorous and elegant framework to compute transport coefficients in extended systems, such as the viscosity η, in terms of the stationary time series of a macroscopic flux (a flux, J, is defined as the macroscopic average of a current density, j(r): \({{{\boldsymbol{J}}}}=\frac{1}{V}{\int}_{V}{{{\boldsymbol{j}}}}({{{\boldsymbol{r}}}})d{{{\boldsymbol{r}}}}\), where V is the system’s volume) evaluated at thermal equilibrium with MD. For an isotropic system of N interacting particles, the shear viscosity η is related to the fluctuations of the off-diagonal elements of the stress tensor:

$$\eta =\frac{V}{{k}_{{{{\rm{B}}}}}T}\int\nolimits_{0}^{\infty }\langle \,{\sigma }_{s}\left({{{{\boldsymbol{\Gamma }}}}}_{t}\right){\sigma }_{s}\left({{{{\boldsymbol{\Gamma }}}}}_{0}\right)\,\rangle \,dt,$$

where V is the volume of the system, T is its temperature, kB is the Boltzmann’s constant, σs is any of the three independent off-diagonal elements of the stress tensor, (\({\sigma }_{s}\in \left\{{\sigma }_{xy},{\sigma }_{xz},{\sigma }_{yz}\right\}\)), and Γt indicates the time evolution of a point in phase space from the initial condition Γ0. In practice, the value of the integral in Eq. (1) is averaged over the three pairs of Cartesian indices.

Expression of stress tensor

The thermodynamic stress tensor is the equilibrium average, \(\bar{\sigma }\), of a microscopic estimator, σ, defined as:

$${\sigma }_{\alpha \beta }=\frac{1}{V}\left[\mathop{\sum }\limits_{i=1}^{N}\frac{{p}_{i\alpha }{p}_{i\beta }}{{m}_{i}}+{{{\Xi }}}_{\alpha \beta }\right],$$

where α, β represent Cartesian coordinates, piα is the α component of the momentum of the i-th atom, mi is its mass, while Ξ is the virial term, defined as the derivative of the system’s potential energy, E, with respect to an uniform scale transformation of the system (rα → rα + ∑βϵαβrβ, ϵ being the strain tensor):

$${{{\Xi }}}_{\alpha \beta }=-\frac{1}{V}\frac{\partial E}{\partial {\epsilon }_{\alpha \beta }}.$$

The expression of the virial term depends on the approach one adopts to perform the simulations: explicit formulas in the classical case are given, e.g., in ref. 96, for pair-wise potentials, and in ref. 97, for general many-body potentials, while the quantum-mechanical case is thoroughly covered within DFT in refs. 98,99. The expression of the virial stress using Deep Potential models relies on the decomposition of the total energy into individual atomic contributions, as it is the case for the heat current97, and will be presented in some detail in “Neural-network potentials”.

Data analysis

The MD evaluation of the GK formula starts with the computation of the stress time auto-correlation function. This can be done by exploiting the ergodic hypothesis and turning the ensemble average into a time average. The following step is to integrate the tACF, as stated in Eq. (1). Despite the apparent simplicity of this process, the straight evaluation of any transport coefficient through the GK formula is jeopardized by the fact that, while ideally the tACF goes to zero for large times, in practice it is very noisy. Indeed, as the tACF approaches zero, Eq. (1) starts accumulating noise and the integral behaves like the distance traveled by a random walk, whose variance grows linearly with the upper integration limit, making it very difficult to estimate both the bias due to the truncation of the integral and the statistical error.

A better approach is to focus on the power spectrum S(ω) of the stress time series σs(t), which, according to the Wiener–Khintchine theorem100,101, is the Fourier transform of the tACF of time series:

$$\begin{array}{ll}S(\omega )\doteq \mathop{\lim }\limits_{t\to \infty }\frac{1}{t}\left\langle {\left|\int\nolimits_{0}^{t}{\sigma }_{s}(t^{\prime} ){e}^{{{{\rm{i}}}}\omega t^{\prime} }dt^{\prime} \right|}^{2}\right\rangle \\\qquad\,\, =\int\nolimits_{-\infty }^{+\infty }\langle {\sigma }_{s}\left(t\right){\sigma }_{s}\left(0\right)\rangle {e}^{{{{\rm{i}}}}\omega t}dt.\end{array}$$

According to Eq. (4), the shear viscosity we are after, Eq. (1), is proportional to the ω = 0 value of the stress power spectrum,

$$\eta =\frac{V}{2{k}_{{{{\rm{B}}}}}T}S\left(0\right),$$

and any method able to accurately estimate the latter can be leveraged for the former. Cepstral analysis102 is one such method37,39,40, and we will rely on it in the present case, as previously done for the thermal and electrical conductivities37,39,40,60,103. A full and user-friendly implementation of cepstral analysis for the estimate of transport coefficients is available in the SporTran73 open-source code.

Neural-network potentials

The Deep Potential scheme has already been fully explained in the literature43,46, so in this section, we limit ourselves to a brief overview of its main features.

Let us consider a system of N atoms and let us indicate by R the set of its atomic coordinates: \({{{\bf{R}}}}=\{{{{{\bf{r}}}}}_{1},...,{{{{\bf{r}}}}}_{N}\}\in {{\mathbb{R}}}^{3N}\). The potential energy surface of the system \(E({{{\bf{R}}}})=E\left({{{{\bf{r}}}}}_{1},...,{{{{\bf{r}}}}}_{N}\right)\) is a function of the 3N atomic coordinates and of the species of each atom. Assuming that interatomic interactions are local, we make the ansatz that E(R) can be decomposed into the sum of atomic contributions, \({{{{\mathcal{E}}}}}_{i}\), which only depend on the coordinates of the atoms that are close enough to the one they are associated with. In order to establish a convenient notation, let us define by \({{{{\mathcal{R}}}}}_{i}\) the set of coordinates of the atoms whose distance from the i-th atom is smaller that a certain cut-off radius, Rc, referred to the position of the i-th atom itself (let Ni be the number of them):

$${{{{\mathcal{R}}}}}_{i}=\left(\begin{array}{l}{{{{\bf{r}}}}}_{1i}\\ {{{{\bf{r}}}}}_{2i}\\ \vdots \\ {{{{\bf{r}}}}}_{{N}_{i}i}\end{array}\right)=\left(\begin{array}{lll}{x}_{1i}&{y}_{1i}&{z}_{1i}\\ {x}_{2i}&{y}_{2i}&{z}_{2i}\\ \vdots &\vdots &\vdots \\ {x}_{{N}_{i}i}&{y}_{{N}_{i}i}&{z}_{{N}_{i}i}\end{array}\right)\in {{\mathbb{R}}}^{3{N}_{i}},$$

where \({{{{\bf{r}}}}}_{ij}={{{{\bf{r}}}}}_{i}-{{{{\bf{r}}}}}_{j}=({x}_{ij},{y}_{ij},{z}_{ij})\). Using these ingredients, the symmetry-preserving descriptors of the local atomic environments, \({{{{\mathcal{D}}}}}_{i}\), are defined and fed to a neural network, which returns the local atomic energies \({{{{\mathcal{E}}}}}_{s(i)}({{{{\mathcal{D}}}}}_{i})\), depending on the chemical species of the i-th atom, s(i), and on its environment, as described by \({{{{\mathcal{D}}}}}_{i}\) (extensive details in ref. 46). The total potential energy of the system is recovered as the sum of all the atomic contributions, thus ensuring extensivity:


The neural network is trained to return the local energy contribution corresponding to any given local environment. The training is performed by minimizing the so-called loss function, \({{{\mathcal{L}}}}\) with respect to the parameters ω of the deep-neural network:

$${{{\mathcal{L}}}}={p}_{E}{{\Delta }}{E}^{2}+\frac{{p}_{F}}{3N}\mathop{\sum}\limits_{i}{{\Delta }}{{{{\bf{F}}}}}_{i}^{2},$$

where ΔE2 and \({{\Delta }}{{{{\bf{F}}}}}_{i}^{2}\) are the squared deviations of the potential energy and atomic forces respectively, between the reference DFT model and the NNP predictions. The two prefactors, pE and pF are needed to optimize the training efficiency and to account for the difference in the physical dimensions of energies and forces.

The force acting on the i-th atom is given by:

$$\begin{array}{ll}{{{{\bf{F}}}}}_{i}=-\frac{\partial E}{\partial {{{{\bf{r}}}}}_{i}}&=-\frac{\partial }{\partial {{{{\bf{r}}}}}_{i}}\mathop{\sum}\limits_{j}{{{{\mathcal{E}}}}}_{s(j)}\left({{{{\mathcal{D}}}}}_{j}\right)\\ &=-\mathop{\sum}\limits_{j}\frac{\partial {{{{\mathcal{E}}}}}_{s(j)}}{\partial {{{{\mathcal{D}}}}}_{j}}\frac{\partial {{{{\mathcal{D}}}}}_{j}}{\partial {{{{\bf{r}}}}}_{i}}\end{array}$$

where we applied Eq. (7) and the chain rule. Thus the computation of the atomic forces can be split in two different contributions: the first is the derivative of the atomic energy \({{{{\mathcal{E}}}}}_{s(j)}\) with respect to each element \({{{{\mathcal{D}}}}}_{j}\) of the descriptor and can be easily evaluated through TensorFlow104, while the second term is given by the gradient of the descriptor with respect to the position of the atoms105.

Beside energies and forces, the NNP predicts also the virial of the system defined as in Eq. (3). Using Eq. (7) one can write105:

$${{{\Xi }}}_{\alpha \beta }=\mathop{\sum}\limits_{i}{r}_{i\alpha }{F}_{i\beta }=-\mathop{\sum}\limits_{i\ne j}{r}_{ij\alpha }\frac{\partial {{{{\mathcal{E}}}}}_{s(i)}}{\partial {r}_{ij\beta }},$$

where the second term can be further split in two contributions as previously shown for the forces. We remark that the resulting formula is well-defined in PBC and enters directly in the calculation of the stress given by Eq. (2), serving our purpose of computing the shear viscosity through the GK formula Eq. (1).