Thermal transport and phase transitions of zirconia by on-the-fly machine-learned interatomic potentials

Machine-learned interatomic potentials enable realistic finite temperature calculations of complex materials properties with first-principles accuracy. It is not yet clear, however, how accurately they describe anharmonic properties, which are crucial for predicting the lattice thermal conductivity and phase transitions in solids and, thus, shape their technological applications. Here we employ a recently developed on-the-fly learning technique based on molecular dynamics and Bayesian inference in order to generate an interatomic potential capable to describe the thermodynamic properties of zirconia, an important transition metal oxide. This machine-learned potential accurately captures the temperature-induced phase transitions below the melting point. We further showcase the predictive power of the potential by calculating the heat transport on the basis of Green-Kubo theory, which allows to account for anharmonic effects to all orders. This study indicates that machine-learned potentials trained on the fly offer a routine solution for accurate and efficient simulations of the thermodynamic properties of a vast class of anharmonic materials.


I. INTRODUCTION
The atomistic modelling of thermodynamic properties of materials at finite temperature and realistic conditions is of fundamental importance for a multitude of technological applications, from photovoltaics to optoelectronic devices, as well as for advancing our understanding of the physical properties of matter. This represents a huge challenge for modern atomistic simulations. Molecular dynamics (MD) and Monte Carlo techniques offer powerful tools that, when used in conjunction with ab initio (AI) theories, allow us to achieve the desired accuracy and predictive power [1]. First-principles based methods, however, suffer from serious limitations. For instance, the description of many thermodynamic properties, even in simple crystals, requires accessing time and size scales that are far beyond the capabilities of state-of-the-art methods.
Two of the most challenging, yet central, properties are temperature-induced structural phase transitions in solids and the transport of heat, which in semiconductors and insulators mainly stems from the lattice vibrations. Both can be extracted directly from MD simulations [2]. In the case of heat transport, this can be obtained from equilibrium MD calculations on the basis of the Green-Kubo (GK) theory [3]. This method, based on the fluctuation-dissipation theorem, is exact at sufficiently high temperatures where nuclear quantum effects are negligible and is thus superior to lattice dynamics approaches based on the Boltzmann transport equation (BTE) in crystalline systems [4]. AI calculations of thermal transport based on GK theory, however, have only recently become possible, after a fundamental issue related to the ill-definition of the microscopic heat flux was resolved [5].
Recently, two new theories of thermal transport have also been developed. These are capable of treating crystalline and amorphous systems on the same footing and reduce to the BTE in the limit of periodic solids [6,7]. Yet, the GK approach carries the advantage that it accounts exactly for anharmonicity to all orders, while intrinsically lending itself to a unified description of ordered and disordered solids, as well as liquids. Nevertheless, the long simulation times and large supercell sizes required in order to achieve converged results are still prohibitive for AI calculations. While some approaches have been devised to overcome this problem [8,9], they are not sufficient to straightforwardly enable calculations of heat transport from first-principles in any type of solid.
MD simulations based on (semi)empirical interatomic potentials are not affected by these limitations, as the computing time for each MD step is generally more than five orders of magnitude lower than with first-principles methods. Interatomic potentials, however, are lengthy and cumbersome to develop, have very limited flexibility and transferability, and generally do not offer sufficient accuracy [10] even in the case of simple systems [11].
Machine-learned force fields (MLFFs) have recently emerged as an ideal solution to drastically augment the length and time scales accessible to MD simulations while still retaining first-principles accuracy [12,13]. Machine-learning models based on mapping the local environment around each atom in the system onto a set of descriptors allow to train force fields that reproduce first-principles energies, forces, and stresses with very small errors. A key issue is the construction of the training sets. To avoid expensive and somewhat manual procedures for building a training database, so-called active learning schemes are becoming increasingly popular [14][15][16][17]. In particular, a number of algorithms have been put forward in order to generate potentials on the fly, relying on query strategies to allow the machine to judge whether new structures need to be added to the training dataset. An active learning strategy where structures are generated on the fly during MD simulations, combined with Bayesian inference to estimate the uncertainty of the machine-learning model, has been shown to be especially flexible and efficient, and it can be seamlessly integrated into existing first-principles codes [16,18].
The success of machine-learned potentials used in combination with MD simulations has been demonstrated in a variety of applications, from the study of phase transitions in halide perovskites [19] and lithium manganese oxide spinels [20] to thermal conductivities in solid and amorphous silicon [21] and the CoSb 3 skutterudite [22]. For any MLFF, however, it is extremely challenging to simultaneously describe vibrational properties, dynamical changes of lattice parameters during phase transitions, and anharmonic effects relevant for transition temperatures as well as thermal conductivity. In fact, none of Refs. [19][20][21][22] show the ability of MLFFs to describe all these different thermodynamic properties simultaneously.
Harmonic vibrational properties are already a stringent test of the quality of ML potentials, especially when considering materials with soft phonon modes and different polymorphs [23].
However, the description of phonon dispersion relations requires the acquisition of only a two-body term, the harmonic force constants. Anharmonicity, on the other hand, requires the development of a surrogate model for tiny, subtle energy differences on the highly corrugated harmonic energy surface. The MLFF must capture at least third-order anharmonic force constants, the coupling between phonons and lattice distortions, and, especially at high temperature, even higher-order anharmonic force constants. For example, calculating just the third-order terms easily requires more than a thousand first-principles calculations.
Whether machine-learned potentials can describe all these effects with satisfactory accuracy has been insufficiently investigated.
In this work we address this gap by focusing on one paradigmatic example of anharmonic systems, zirconia (ZrO 2 ), and we present a procedure to accurately study its thermodynamic properties using MLFFs trained on the fly. This allows for a semi-automated creation of a reference database. ZrO 2 is a transition metal oxide of tremendous importance in a wide variety of applications, including thermal barrier coatings, solid oxide fuel cells, and biomedical implants [24,25]. Pure ZrO 2 exhibits low thermal conductivity due to strong anharmonic phonon scattering [26], thus representing an ideal testbed for the investigation of thermal transport via GK theory and a challenging system for building accurate MLFFs.
At ambient pressure, ZrO 2 is known to have three structural phases [27,28] shown in Fig. 1. At high temperature the cubic fluorite structure (space group F m3m) is thermodynamically stable and it transforms to a tetragonal phase (space group P 4 2 /nmc) at about 2570 K, with the oxygen sublattice shifting up or down along the c direction ( Fig. 1b) accompanied by a slight tetragonal distortion. Around 1400 K the structure undergoes a tetragonal to monoclinic (space group P 2 1 /c) phase transition. This transition is of martensitic type, involving a complex cooperative motion of both the Zr and O sublattices accompanied by a lattice shear in the ac plane and a volume increase. The phase diagram of zirconia is extremely important for its technological applications, as for instance the volume change associated to the tetragonal to monoclinic transition can lead to mechanical degradation.
Moreover, the tetragonal and cubic structures can be stabilized at lower temperatures by introducing dopants such as yttria, ceria, and other oxides. Stabilized zirconia is one of the most important ceramic materials due to its combination of high strength and toughness at room temperature [29,30]. The very high temperature of the cubic to tetragonal transition makes it challenging for experiments to understand its nature and dynamics [31], while computational studies report conflicting results [32][33][34]. On the opposite, the firstorder tetragonal to monoclinic transition has been extensively characterized, but theoretical calculations are limited to approximate treatments such as the quasi-harmonic approximation [35,36]. Here we use a recently developed on-the-fly training scheme [16] for building a machinelearned interatomic potential that describes all three polymorphs of zirconia with near firstprinciples accuracy. In contrast, no empirical interatomic potential is capable of seamlessly describing all phases, nor to reach such accuracy [33]. We validate the MLFF against the results from first-principles density-functional theory (DFT) calculations and use it to determine the phase transitions and heat transport in ZrO 2 at ambient pressure, accounting for anharmonic effects to all orders. The agreement with experiments demonstrates that machine-learned interatomic potentials constructed on the fly offer an excellent, highly automated solution for accurate and realistic predictions of anharmonic properties in complex materials.

A. Machine-learned potential
In order to machine learn an interatomic potential for zirconia we used the kernel-based machine-learning model introduced in Refs. [16,37]. Details of the model are presented in the Methods section, including the on-the-fly force field generation scheme that we adopted.
As we aim to compute the thermodynamic properties of a strongly anharmonic material up to very high temperatures, the configurational phase space should be sampled as broadly and as efficiently as possible. Our on-the-fly training strategy, described in the Methods, is well suited for this, allowing us to seamlessly sample an ample configuration space including large thermal fluctuations up to 2800 K during MD simulations. In fact, training on the fly enables us to skip most ab initio steps and perform ab initio calculations only when required, that is, when the estimated Bayesian error of the MLFF is large. As a result, only 592 structures consisting of 96-atom supercells were automatically selected for performing ab initio calculations from a trajectory over 300 ps long, and were included in the final training dataset. Nevertheless, the resulting MLFFs are very accurate.
The root-mean-square errors (RMSE) in the energies, forces, and stress tensors predicted by our MLFFs for the training dataset are reported in Table 1. In particular, we used two distinct approaches to obtain the regression coefficients typical of kernel-based methods, namely Bayesian linear regression (BLR) and singular value decomposition (SVD, see Methods). While BLR is formally similar to the standard ridge regression methods, SVD does not introduce any regularization parameters. Here we compare the accuracy of two different MLFFs obtained using BLR and SVD and find that SVD yields consistently lower errors than BLR. Overall, we observe that our MLFFs achieve lower errors than the neuralnetwork potential developed in Ref. [41], which has an RMSE in the energy of 7.9 meV per atom, and whose training required a more manual setup of many more DFT calculations.
Although the training dataset of Ref. [41] included additional structures containing oxygen Table 1. RMSE in the energies, forces, and stress tensors predicted by the MLFF for the training and test dataset (the latter in parenthesis). The RMSE in the predicted phonon energies are also reported (in meV). All results are shown for the MLFFs trained both using BLR and SVD.  vacancies, here we considered temperatures up to 2800 K, which induce strong fluctuations that are considerably challenging to reproduce.
To validate the MLFFs, energies, forces and stress tensors were calculated for a test set consisting of 200 structures and compared to the DFT ones, as shown in Fig. 2. The structures were selected randomly from an MD simulation where monoclinic ZrO 2 was heated between 300 K and 2800 K. The largest errors correspond to the high-energy structures, that is the cubic phase at elevated temperatures. In this regime, the MLFF is less accurate due to the increased thermal vibrations. The RMSE for this test set, reported in Table 1, are very close to the ones obtained for the training set. This shows that the MLFFs were not overfitted, even in the SVD case where no regularization was used.

B. Structural and vibrational properties
The calculated zero-temperature lattice parameters for the conventional 12-atom unit cells are summarized in Table 2 and compared to the experimental data [28,42,43] (the cubic and tetragonal experimental lattice parameters are extrapolated to zero temperature).
The energy differences between the three phases are also shown in the table. AI calculations based on the PBEsol exchange-correlation functional [44] describe very accurately the relaxed structures compared to experiment, almost on par with benchmark many-electron calculations [45]. The MLFF results are in excellent agreement with the AI ones, the differences being less than 0.2% for the lattice parameters and less than 1 meV per atom for the energies. Note that here BLR and SVD perform equally well.
The MLFFs were then used to calculate the interatomic force constants and dynamical matrices according to the finite displacement method in a supercell containing 96 atoms.
To account for the long-range dipole-dipole interactions, the nonanalytic contribution to the dynamical matrix was treated using the standard method of Ref. [46]. As illustrated in properties. Some discrepancies in the dispersion of the highest-energy optical phonon modes can be attributed to finite-size effects. Note that the MLFF trained using BLR is slightly less accurate than using SVD, as seen in particular for the highest optical phonon branches in the cubic phase. The RMSE of the phonon energies reported in Table 1 confirms this trend. It is not surprising that the largest errors appear in the cubic phase: since it is unstable at low temperature, the training database contains hardly any configuration corresponding to this unstable, ideal structure. It is thus harder for the MLFF to reproduce the 0 K properties of the ideal undistorted cubic phase. Remarkably, both MLFFs accurately reproduce the imaginary soft phonon modes in cubic ZrO 2 , which is essential in order to quantitatively capture phase transitions. In fact, such modes are generally associated with lattice instabilities, leading to structural phase transformations in the material. In ZrO 2 , the soft phonon at the X point in the cubic phase, which involves a tetragonal distortion of the oxygen sublattice, is linked to the cubic to tetragonal phase transition [47].

C. Temperature-induced phase transitions
We now move to studying the transitions between the three different structural phases of ZrO 2 at ambient pressure using MD simulations with our MLFFs. The results shown are from the MLFF obtained using the SVD. The phase transitions can be directly observed by progressively heating a relatively large supercell containing 324 atoms. Specifically, we used a heating rate of 0.5 K ps −1 and a time step of 1.5 fs in the isothermal-isobaric (NPT) ensemble.
The evolution with temperature of the unit-cell volume resulting from this simulation is illustrated in red in Fig. 4a. As in the experimental data, reported as well, the first order transition between the monoclinic and tetragonal phase corresponds to a sharp change in the cell volume around 1750 K, while there is no volume discontinuity in the tetragonal to cubic transformation around 2400 K, but only a small change in the slope of the thermal expansion. An analogous cooling simulation showed that the transformation between the cubic and tetragonal phase is reversible and presents no thermal hysteresis, whereas the tetragonal to monoclinic transition was not observed upon cooling. In Fig. 4b-c the lattice parameters and monoclinic angle β are reported, respectively. These results were extracted from NPT simulations carried out at fixed temperature for 300 ps, taking the equilibrium structure at each temperature as the starting configuration for the next (higher) temperature point. Near the phase transitions, the total time was increased to 750 ps due to the large fluctuations in the case of the tetragonal to cubic transformation, and to the time required to observe the first-order monoclinic to tetragonal transition. The comparison with available experimental data shows excellent agreement, including the anisotropy of the lattice thermal expansion in the monoclinic phase, however, further investigation on the phase transitions is warranted.
To accurately determine the transition temperature we used thermodynamic integration, as explained in this paragraph. As seen in Fig. 4a-c, the martensitic transformation between the monoclinic and tetragonal phases proceeds abruptly, and because of hysteresis it is not possible to extract a reliable transition temperature from direct MD simulations. For example, when heating the system at the rate of 0.5 K ps −1 the transition appears around Expt. [31] Expt. [28] Calc. 1750 K, but when performing MD simulations at fixed temperatures we could observe the transition already at 1700 K after 700 ps. Moreover, this transition was not reversible when cooling the system. To estimate the theoretical transition temperature, we therefore calculated the fully anharmonic free energies of the monoclinic and tetragonal phase as a function of temperature. From the thermodynamic relations in the NPT ensemble, the Gibbs free energy G of a system at constant pressure can be written as: where H = U + P V is the enthalpy (U is the internal energy of the system), and G 0 is the Gibbs free energy at T = T 0 . Since the integrand in the first term on the r.h.s. diverges like 1/T for classical statistics, direct integration from T 0 = 0 is not possible. Instead, we here calculated G 0 at a low temperature T 0 = 25 K using the quasi-harmonic approximation (QHA). Within the QHA, the Helmholtz free energy F of the crystal is computed as a function of temperature T and volume V via the standard harmonic approximation, while the Gibbs free energy is found by minimizing F (V, T ) + P V for a given temperature and pressure [49]. Thus, the anharmonic effects are taken into account only via the volume dependence of the vibrational frequencies. To be compatible with the classical MD simulations, we computed the harmonic free energies using classical Maxwell-Boltzmann statistics.
We evaluated the integral in Eq. (1) over a path of constant (ambient) pressure for both monoclinic and tetragonal ZrO 2 by performing additional MD simulations at temperatures ranging from 25 K to 1700 K, sampling each trajectory for 300 ps. The integration paths are continuous, since the tetragonal phase is trapped in a local minimum and does not transform into the monoclinic one during our MD simulations (compare Fig. 4a), while the monoclinic phase remains stable for sufficient time up to 200 K above the transition temperature.
The free energy difference ∆G t-m between the tetragonal and monoclinic phase as a function of temperature is illustrated in Fig. 4d. The results obtained from Eq. (1) are shown in blue and compared to the ones calculated using the QHA either in the classical approximation (red line) or using quantum statistics (green line). In the QHA, both statistics yield identical transition temperatures T c , with the differences between both statistics being entirely negligible above 250 K. This suggests that the neglect of quantum statistics is justified for free energy calculations of ZrO 2 . According to our fully anharmonic free energy calculations, the transition temperature is 1493 K, in good agreement with the experimental value of 1400 K. Above this temperature, the tetragonal phase is thermodynamically stable due to the vibrational entropy. Instead, the QHA underestimates the transition temperature by more than 200 K (T c = 1178 K), highlighting the need to describe anharmonicity beyond the QHA. We note that the fully anharmonic calculations performed with the MLFF trained using BLR yield T c = 1530 K, in line with the slightly larger predicted energy difference between the tetragonal and monoclinic phase at 0 K (see Table 2). The accuracy of our MLFFs in describing anharmonic effects was further confirmed by comparing the QHA results obtained with the MLFF and from first principles, yielding a difference of 20 K in the transition temperature.
The tetragonal to cubic phase transition can be more easily described using direct MD calculations; however, the nature of this transition is not clear experimentally as cubic zir-conia is observed only at temperatures roughly above 2570 K, making it difficult to study.
From our MD data we observe a continuous transformation with no thermal hysteresis, moreover the system undergoes frequent fluctuations between the two phases near the transition temperature. Our results therefore suggest that the transition is largely second order, at variance with a similar calculation based on a semiempirical interatomic potential [33]. We note that due to the strongly anharmonic energy surfaces, the lattice parameters change rather quickly (but continuously) near the cubic to tetragonal transition temperature. As shown in Ref. [32] on the basis of Landau theory, this phase transition is highly sensitive to the coupling between the elastic strains and the distortion of the oxygen sublattice. Hence, an interatomic potential must be able to describe the system quantitatively with high accuracy. This is not the case even for the 'best' model interatomic potential for ZrO 2 [33], whereas are in reasonable agreement with the experimental data for pure zirconia [26,53], available only for the monoclinic phase, underestimating them by no more than 20 %. As discussed in Ref. [26], it is unclear why the thermal conductivity measured in Ref. [54] decreases less rapidly with temperature, but these data are also reported in Fig. 6 for completeness.
Note that our calculated κ does not show any discernible discontinuity across the phase transition temperature. Moreover, we found no significant anisotropy, in agreement with experimental observations [53]. Previous calculations based on GK theory and using an empirical force field [52] are unable to produce the correct temperature dependence and underestimate the room-temperature thermal conductivity by 40%. In contrast, the firstprinciples GK calculations from Ref. [8] overestimate both our results and the experiment, and show a less smooth temperature dependence which could indicate limited convergence or limited statistical sampling. We note that in standard MD simulations the dynamics of the nuclei is purely classical, which may yield higher phonon scattering rates and thus lower thermal conductivities [11,55], although this effect should be negligible at sufficiently high temperatures. calculations (AI-GK, empty blue circles) [8] and an empirical force field (FF-GK, empty orange circles) [52]. Experimental data from Refs. [26,53,54] are shown as square symbols. The red line is a fit to the present data.
In this work we machine-learned an interatomic potential for zirconia using an on-the-fly training scheme. We demonstrated that the MLFF allows to accurately describe the harmonic lattice dynamics and the anharmonic higher order contributions, with the anharmonic phonon-phonon interactions driving the temperature-dependent behavior, up to very high temperatures near the melting point. In particular, the solid-solid phase transformations that are especially hard to study using AI methods were correctly captured, and the transition temperatures were predicted with high accuracy. Conversely, conventional force fields are not sufficiently flexible to achieve a quantitative description of the harmonic (quadratic) and higher-order energy terms. Likewise, modelling thermal conductivity is a stringent test of an accurate description of the anharmonic, high-order interatomic force constants terms.
The same MLFF is indeed capable of predicting the lattice thermal conductivity using GK theory, the gold standard for computing thermal transport in strongly anharmonic crystalline and amorphous solids. This work demonstrates that machine-learned potentials have unlocked the ability to routinely perform GK calculations with near first-principles accuracy, as these calculations are generally prohibitive for AI methods and are beyond the accuracy of classical force fields.
It is noteworthy that although MLFFs cannot extrapolate well beyond the phase space of the training data, as is well known, we showed that a MLFF that is highly accurate up to almost 3000 K can be trained on a dataset consisting of as little as a few hundred structures using the present on-the-fly method. This method provides a general strategy for constructing an optimal training database via a semi-automated process. In the future, this can also be exploited for high-throughput predictions of vibrational and thermodynamic properties. Interestingly, we demonstrated that the routinely used regularization is not necessary in general in order to prevent overfitting when training a MLFF, and an unregularized regression based on SVD systematically improves the accuracy of the MLFF. As showcased in this work, on-the-fly machine-learned interatomic potentials are an enabling tool for accurate and efficient simulations of complex thermodynamic properties of a wide variety of technologically relevant materials.

A. On-the-fly machine-learned force fields
To build a machine-learned interatomic potential, first of all, a set of reference configurations with their quantum-mechanical energies, atomic forces and, ideally, stress tensors is required. Active learning schemes provide an extremely efficient solution for constructing such a database, whereby automated query strategies allow the machine to judge whether new structures should be added to the training dataset or not [56]. Extensive details of the algorithm as well as of the machine-learning model can be found in Refs. [16,37], while here only the most relevant features are outlined. The core of our training strategy is that the MLFF is constructed on the fly during MD simulations, and at every MD step it is used to judge whether a first-principles calculation should be executed and a new structure should be added to the dataset. The decision relies on an estimation of the errors in the predicted atomic forces for each structure on the basis of Bayesian inference, as it will be explained later. When no new first-principles calculations are carried out, the atomic positions and velocities are updated using the MLFF predictions.
Machine-learning models for building interatomic potentials rely on a local mapping of structural features onto a set of descriptors. A robust and versatile choice that allows for an accurate description of the many-body interactions is to map the local environment around each atom i in a system of N a atoms onto two-and three-body atomic distribution functions [57]. In this work we relied on so-called separable descriptors introduced in Ref. [37], with the two-and three-body distribution functions (ρ (2) i and ρ i , respectively) defined as: In these expressions, ρ i (r) (r = rr) is the three-dimensional distribution function around the atom i, ρ i (r) = Na j =iρ ij (r), andρ ij (r) is the likelihood to find atom j at position r from atom i within a certain cutoff radius. Essentially, ρ (2) i (r) yields the probability to find an atom j (j = i) at a distance r from atom i, and ρ (3) i (r, s, θ) the probability to find two atoms j, k at a distance r, s, respectively, from atom i (j = i, k = j, i) and spanning the angle θ between them. In practice, these distribution functions are expressed as a combination of radial and angular basis functions, with the coefficients forming a set of descriptors that we simply denote as x (2) i and x (3) i . Following [37], we define x i = (x (2) i , x (3) i ). The quantum-mechanical energies, atomic forces and stress tensors can then be expressed as a nonlinear function of the descriptors. In the present method, the potential energy U of a system of N a atoms is partitioned into a sum of local energies U i associated to each atom. Following the Gaussian approximation potential (GAP) approach [58], each term U i is written as a linear combination of kernel functions K (x i , x b ) weighted by coefficients w b : In practice, the kernel function measures the similarity between a local configuration around atom i, x i , and a local reference configuration around atom b, x b . The N b atoms are chosen from the set of reference structures generated for the MLFF training. Here K is the dot- The normalization and exponentiation by ζ introduce non-linear terms that mix the twoand three-body descriptors and generate many-body terms necessary for capturing higher order atomic interactions.
By the same token, the forces on each atom and the stress tensor components can also be written in terms of the kernel functions. Then, the coefficients w = {w b } are optimized to best reproduce the first-principles energies per atom, forces and stress components in the training dataset, in other words by solving the problem: in a least-squares sense, where y is a vector collecting all the first-principles data, and Φ is the so-called design matrix containing the K(x i , x b ) components and the derivatives with respect to atomic coordinates and lattice vectors. When fitting simultaneously energies, forces and stress tensors, as done here, it is expedient to rescale these quantities by their respective standard deviations [16]. To increase the accuracy of the MLFF, it is also possible to tune the relative importance of energies, forces and stresses during the fitting.
The solution of the linear regression problem in Eq. (5) is a crucial step. The standard methods used for training MLFFs are based on ridge regression, where a regularization parameter σ 2 v /σ 2 w is introduced [59]. In the probabilistic framework of Bayesian linear regression (BLR) adopted here, the coefficients are formally determined in the same way: however, here the regularization parameters σ v and σ w have a probabilistic meaning: σ 2 v models the uncertainty caused by noise in the training data, while σ 2 w is the variance of the prior distribution of the regression coefficients. These values are optimized using the evidence approximation [16,59] and are generally invoked to prevent overfitting. Moreover, the probabilistic framework of BLR is advantageous for the on-the-fly training. In fact, under the assumption that the prior distribution of the target data (that is, the ab initio energies, forces and stress tensors) follows a multivariate Gaussian distribution, from Bayesian inference it derives that the posterior distribution of the predicted data is a Gaussian distribution too.
The BLR framework then allows us to define a 'prediction variance' that measures the uncertainty of the predictions. For any given configuration, this is estimated as the variance of the posterior distribution: where ϕ is a matrix containing the kernel components and their partial derivatives with respect to atomic coordinates and lattice vectors for the given configuration. This is the uncertainty estimate that guides the selection of the training dataset during on-the-fly active learning.
Alternatively, in this work the regression coefficients were also determined as: where Φ + denotes the Moore-Penrose pseudoinverse of the design matrix Φ, which can be found by performing the singular-value decomposition (SVD) of Φ. Even though SVD carries a larger computational cost, a clear advantage of this method is that the condition number of the problem is smaller, since the design matrix is not squared. As in Ref. [60], we found that while the squared problem in Eq. (6) is generally ill-conditioned and requires regularization, Eq. (8) can be solved numerically without regularizing the singular values of the matrix Φ.
As our tests show, this does not lead to overfitting, demonstrating that regularization is not necessary in general. As this is a deterministic method, SVD is performed only at the end of the on-the-fly training process, and the different MLFFs obtained by using BLR and SVD can then be compared.

B. Thermal conductivity via Green-Kubo theory
On the basis of GK theory, the thermal conductivity κ is related to the equilibrium average of the heat-flux autocorrelation function [3]: where k B is the Boltzmann constant, T the temperature, V the volume of the system and j α (t) the α-th Cartesian component of the macroscopic heat flux. The symbol · denotes an ensemble average over many time origins and over independent MD trajectories.
The heat flux for a system of N a atoms is given by [61]: where E i = m i v 2 i /2 + U i is the total (kinetic and potential) energy associated to atom i with mass m i , while r i and v i are its position and velocity, respectively. For classical many-body potentials, this definition does not pose any problem, since the total energy of the system is naturally partitioned into on-site contributions E i . This holds for machine-learned potentials too, as the potential energy U is expressed in terms of the potential energies associated to each atom U i [cfr. Eq. (4)]. Eq. (10) is readily rewritten in a form applicable to periodic systems [62]: where the summation over j is carried out over all N (i) atoms within the cutoff radius representing the local environment of atom i. The first term in Eq. (11) is the convective heat flux, which gives no contribution to the thermal conductivity in solids, while the second one is the so-called virial or conductive heat flux.
Combining Eqs. (9) and (11) it is thus possible to evaluate the thermal conductivity from the atomic energies provided by MLFFs in a similar manner as when using classical force fields. The advantage here is that the AI description of the energy landscape is retained.

C. First-principles calculations and MLFF training
All calculations were performed using VASP. First-principles DFT calculations were carried out using the PBEsol [44] exchange-correlation functional. The plane-wave cutoff was set to 600 eV. Phonon calculations were performed using the Phonopy package [49]. The primitive cells of zirconia contain one, two, and four ZrO 2 formula units for the cubic, tetragonal and monoclinic structures, respectively. A unit cell of 12 atoms can therefore be used to describe all three phases, and a 4 × 4 × 4 Monkhorst-Pack k-mesh ensures energy convergence within less than 0.2 meV per atom. The structures were optimized until the forces were smaller than 10 −4 eVÅ −1 .
Our MLFF for solid ZrO 2 was trained on the fly during MD simulations in the NPT ensemble at ambient pressure, with the temperature and pressure controlled by a Langevin thermostat [63] combined with the Parrinello-Raman method [64,65]. The training was performed on a supercell containing 96 atoms using a time step of 2.5 fs. First, the system was gradually heated from 1600 K to 2800 K for 180 ps, then it was heated again starting from 500 K to 1600 K for another 180 ps. The initial structures for these MD simulations were obtained by equilibrating the tetragonal and monoclinic structures at 1600 K and 500 K, respectively, for 50 ps using the on-the-fly scheme and discarding the MLFFs thus generated.
Finally, to sample the low-temperature, unstable cubic, and tetragonal structures as well, we performed 10 additional MD steps at 100 K starting from the DFT-relaxed structures.
In total, only 592 first-principles calculations were executed, out of almost 145,000 MD steps. These constitute the reference structures dataset, from which N b = 831 and 1090 local reference configurations were selected for Zr and O, respectively.
The main hyperparameters used for constructing the descriptors and kernel functions [16,37] are as follows. The cutoff radius used to represent the local environment of each atom was set to 6Å, while the width of the Gaussian functions used to broaden the atomic distributions was 0.4Å. These parameters were the same for the two-and three-body descriptors. To expand the atomic distributions, up to 15 spherical Bessel functions (for the angular quantum number l = 0) and Legendre polynomials of order l up to l max = 4 were employed, resulting in 2048 descriptors for the Zr and O atoms. The hyperparameter for the dot-product kernel was set to the standard value ζ = 4. The automatic sparsification of the three-body descriptors based on the CUR algorithm was applied [37], allowing us to halve the number of descriptors needed without affecting the accuracy of the MLFF. As described previously, the MLFF can be further tuned by reweighing the energies, forces and stresses differently when setting up the linear regression problem. Here we found that weighing the energy equations 10 times more strongly reduces the errors in the energy by about 0.5 meV per atom on average, while increasing the errors in the predicted forces and stresses only marginally. Finally, both BLR and SVD were adopted to determine the regression coefficients.

DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.

CODE AVAILABILITY
The VASP code is distributed by the VASP Software GmbH. The machine learning modules will be included in the release of vasp.6.3. Prerelease versions are available from G. K. upon reasonable request.

ACKNOWLEDGMENTS
The computational results presented here have been mainly achieved using the Vienna Scientific Cluster (VSC). This work was supported by the Austrian Science Fund FWF (SFB TACO). P.L. gratefully acknowledges the support of the Advanced Materials Simulation Engineering Tool (AMSET) project, sponsored by the US Naval Nuclear Laboratory (NNL) and directed by Materials Design, Inc.