Introduction

Thermodynamic properties such as the heat capacity, the expansion coefficient, and the bulk modulus are key benchmarks in materials design. They provide insight into phase stability, phase transformations, microstructural stability and strength, thereby giving guidance to synthesis and application. The heat capacity is linked to thermodynamic potentials such as the Gibbs energy, and thus facilitates the construction of phase diagrams. By virtue of being experimentally measurable via calorimetry, the heat capacity remains—ever since the seminal work of Einstein1—among the most fundamental properties in basic research and industrial applications2. Knowledge of the thermal expansion coefficient is crucial, for example, to optimize the Invar behavior of alloys used in instrumentation applications3. The bulk modulus is vital in modeling strength and ductility up to high temperatures4.

Coordinated surveys of experimental data on thermodynamic properties have been carried out in multiple works, resulting in well-known series of books, e.g., Touloukian et al.5 and Landolt-Börnstein6. Although these books have served well as the basis for materials design, the ever-increasing demand for new and optimized materials creates an urgent need to extend the databases efficiently. However, applying experimental techniques alone is time-consuming and thus inefficient.

To complement experiments and to rapidly obtain material properties, ab initio databases have recently been intensively pursued7,8,9,10,11. These databases allow quick online access to a wide range of material properties. As yet, these databases are to a large extent based on T = 0 K (−273.15 °C) data and on low-temperature approximations to account for the effect of temperature. For example, in one such database7, out of a total of 12,000,000 entries on material properties, only 1810 thermal properties are available that are derived from harmonic or quasiharmonic approximations. Even though these approximations can provide rapid results (though not for all systems), they do not consider any explicit finite-temperature vibrations and coupling effects, rendering them unsuitable at elevated temperatures12.

To predict reliable high-temperature thermodynamic properties, an accurate representation of the system’s free energy is needed, including in particular explicit anharmonicity, i.e., phonon–phonon interactions and interactions of phonons with other excitation mechanisms (electronic, magnetic). Approximation schemes have been developed to account for phonon–phonon interactions. One class of approaches utilizes effective, renormalized harmonic Hamiltonians13,14,15,16. Another approximate route involves the use of a low-temperature perturbative expansion17. Though often efficient, such approximations can be inaccurate at high temperatures and for systems with strong anharmonicity12,18.

Explicit anharmonicity can be included up to all orders using thermodynamic integration (TI), wherein one computes the free-energy difference between a reference state and density-functional-theory (DFT)19. Over the last decade, several improvements have been made to TI-based methods20,21,22. Notably, the upsampled thermodynamic integration using Langevin dynamics (UP-TILD) (TI to low-accuracy DFT followed by “upsampling”)12,23 and the two-stage upsampled thermodynamic integration using Langevin dynamics (TU-TILD) (UP-TILD split into two stages with an intermediate potential)24 have increased computational performance. Recent developments have been directed towards exploiting advancements from machine learning25,26,27. The so-called moment tensor potentials (MTPs)28 have proved to be one of the most efficient machine-learning potentials29,30,31. The application of MTPs within the TU-TILD formalism has further improved the efficiency of free-energy calculations18,32.

Despite the advancements, free-energy calculations, including the impact of anharmonicity, have remained challenging until now, even for supposedly ‘simple’ elementary systems. A critical aspect is the accurate determination of thermodynamic properties, for which numerically converged first and second derivatives of the free energy are indispensable. The situation is illustrated in Table 1 for four prototype systems covering the fundamental crystallographic structures of metals, i.e., bcc Nb, fcc Ni and Al, and hcp Mg. Although the first thermodynamic calculations including anharmonicity were done as early as 2002 for fcc Al20, more recent studies have either neglected anharmonicity (empty fields in the table) or have utilized approximate approaches. Only a few of the studies included explicit anharmonicity to DFT accuracy (marked in bold). The situation is worse for the other elements, for which almost no information is available on thermodynamic properties, including anharmonicity at the full DFT level. A low-temperature expansion as performed in ref. 17 is particularly uncertain for high-melting elements with strong anharmonicity, such as Nb. This explains why the data was analyzed only below 1000 K17. At higher temperatures, Nb experiences large anharmonicity which strongly impacts the thermodynamic properties. To account for such features, explicit anharmonicity needs to be captured to all orders and the derivatives of the free-energy surface need to be stabilized by higher-order parameterizations on dense volume–temperature (V, T) grids of explicitly computed free energies.

Table 1 Ab initio studies of the isobaric heat capacity (Cp), the thermal expansion coefficient (α), and the bulk modulus (B) for bcc Nb, fcc Ni, fcc Al, and hcp Mg.

Summarizing the current state-of-the-art in finite-temperature ab initio simulations, it has to be concluded that a holistic computational methodology to readily obtain accurate thermodynamic properties is not yet at hand. Consequently, ab initio thermodynamic properties are only very scarcely and inconsistently available in the literature, even for elementary systems. This clearly hampers the development of ab initio thermodynamic databases.

In the current paper, we present such a holistic computational procedure with which affordable, fully anharmonic free-energy calculations become possible on a sufficiently large number of (V, T) points (Table 1). As shown in this work, a dense sampling and appropriately chosen parametrizations of the free energy are mandatory and facilitate the computation of thermodynamic properties to highest DFT accuracy. The proposed procedure encompasses key insights and techniques distilled from several of the above-mentioned studies12,18,24,33. We lay down a detailed, complete, and pedagogical description of all the steps of the procedure (“Methods” section below and Supplementary Information) and discuss all relevant numerical and performance aspects.

An important ingredient is the direct-upsampling technique34, a modification to TU-TILD in which the upsampling is performed directly on MTP configurations. The upsampling establishes DFT-level accuracy and accounts for the impact of vibrations on electronic and magnetic free energies. Direct upsampling was introduced previously on a preliminary, purely theoretical example (multi-component alloy)34. However, the convergence, robustness, and optimization of the upsampling were not addressed and have remained unclear. In the present work, we therefore perform a systematic and rigorous application and analysis of direct upsampling. Importantly, our analysis provides a measure of the number of configurations needed for convergence of the upsampled free energy, which is crucial in keeping the number of highly expensive DFT calculations to a minimum. Another key ingredient to our procedure is the fitting approach and the resulting robustness of the MTP interatomic potential. Here, we propose a novel two-step approach of fitting the MTP to DFT data with additional stabilization measures (by “harmonic” configurations), which further reduces the number of expensive DFT calculations and ensures stable simulations.

We apply the procedure to the above-mentioned, experimentally well-assessed four prototypical systems (bcc Nb, magnetic fcc Ni, fcc Al, hcp Mg) and provide a database-like collection of highly converged equilibrium thermodynamic properties. Beyond the main thermodynamic properties of interest (heat capacity, expansion coefficient, bulk modulus) additional quantities are tabulated in the Supplementary Information, and all properties are provided online. Among the investigated systems, Nb offers the most numerical challenges and insights into the methodology, owing to its large anharmonicity, large impact of vibrations on electrons, and long-range interactions14. In fact, the inherent complexity even challenges the MTPs in reproducing ab initio data for Nb. Calculations for Ni and Al are performed on much denser (V, T) grids in comparison to previous results12,23 leading to an improved and systematically assessed convergence of the thermodynamic properties. Some of the thermodynamic properties of Ni (including the impact of magnetism) and Al are reported here for the first time at the full DFT level of accuracy (bulk modulus for Al and Ni, expansion coefficient for Ni). For both Nb and Mg, the current work is the first instance of evaluation of thermodynamic properties, including explicit anharmonicity up to the melting point with DFT accuracy.

Results

General overview of the methodology

The most integral part of the procedure is an accurate representation of the free-energy surface up to the melting point over the relevant volume range, from which thermodynamic properties can be derived at various pressures and temperatures. This calls not only for highly stable absolute free-energy values, but also for stable first and second derivatives of the free-energy surface. Essential to achieving this stability is a dense and optimally distributed set of (V, T) points on which explicit free-energy calculations are performed, such as to obtain a smooth and converged parametrization of the surface. This is illustrated in Fig. 1f, wherein the blue dots depict a representative and converged (V, T) grid and the arrows the derivatives along different directions. To this end, a fast and reliable method is needed to obtain the free energy for every single (V, T) point, and on that account, we utilize the direct-upsampling method aided by the outstanding performance of MTPs (Fig. 1a–e).

Fig. 1: Schematic description of the entire workflow using fcc Ni as an example.

The upper box illustrates different stages of direct upsampling: a TILD at different temperatures from effective QH to MTP (whose force RMSEs are shown in (b, c)), and d, e the upsampling to high-DFT. The center box is the crux of the workflow. f is a representation of the free-energy surface, with the (V, T) mesh on which free-energy calculations are performed represented by blue dots, volumes at Tmelt chosen for MTP training set represented by green dots, volumes for the low-temperature effective QH fitting represented by green crosses, the 0 K EV curve in purple and different derivatives represented by black arrows. g is the free energy as a function of volume at the melting point calculated while including different excitations. The lower box shows the numerically computed (h) isobaric heat capacity Cp, i linear thermal expansion coefficient α and j adiabatic bulk modulus BS along with a comparison to experimental values41,51,52. (Veq = Veq(T) denotes the equilibrium volume at T and a given pressure, V0K = Veq(0 K) and Vmelt = Veq(Tmelt), where Tmelt is the experimental melting temperature at ambient pressure; S is the entropy).

The relevant excitation mechanisms can be captured by starting with the adiabatic decomposition of the total free energy F(V, T) according to the free-energy Born–Oppenheimer approximation35, as given by

$$F(V,T)=E_{\rm{0K}}(V)+F^{\rm{el}}(V,T)+F^{\rm{vib}}(V,T)+F^{\rm{mag}}(V,T). \tag{1}$$

Here, E0K denotes the 0 K total energy of the static lattice; Fel is the electronic free energy, including coupling from atomic vibrations; Fvib is the vibrational free energy which, in our framework, is further decomposed into an effective quasiharmonic (QH) and an anharmonic (AH) part Fqh and Fah, respectively; Fmag is the magnetic free energy including the coupling from electronic and atomic vibrations. The various free-energy contributions and their impact on the thermodynamic properties are exemplified for Ni in Fig. 1g–j.

Contrary to the other excitation mechanisms, there is a lack of standard DFT methods to self-consistently calculate the magnetic excitations. The reasons for this are the complex coupled magnetic degrees of freedom36 and the relevance of longitudinal spin fluctuations (LSFs)37. So far, one has to resort to effective Heisenberg models fitted to the experimental Curie temperature38, DFT-informed semi-empirical heat capacity models, or, more recently, magnetic MTPs that facilitate constrained magnetic calculations and thereby allow sampling of the LSFs with MD39. Here, for Ni, we use a thoroughly tested DFT-informed semi-empirical model33.

A key element in our methodology is the preparation of a high-accuracy MTP (high-MTP) in an efficient two-step approach. First, a lower-quality MTP (low-MTP) is fitted to inexpensive low-converged DFT data. This low-MTP is used to rapidly generate MD snapshots at the melting point at various volumes. High-converged DFT (high-DFT) calculations are then performed for the snapshots and the data is used to fit the high-MTP. The high-MTP is stabilized by introducing a “harmonic” configuration containing small interatomic distances into the fitting procedure. This harmonic configuration is obtained by sampling the phase space with an effective QH reference, which is fitted to low-temperature high-DFT forces. This approach ensures that the magnitude of the forces remains physical during the following TI, especially when atoms come close to each other for small λ coupling parameter values due to the softness of the harmonic reference. The resulting high-MTP is thus accurate and stable over the relevant part of the phase space.

To obtain the free energy, we perform a λ-based TI from the effective QH reference to the high-MTP. The TI is performed on (V, T) points on a dense grid in large-size supercells and over long time scales to achieve statistical convergence. To corroborate the free-energy differences calculated using the TI, we also perform a temperature integration, which serves as a cross-check for the TI and as a substitute in cases where the TI becomes unstable (e.g., when there is diffusion during the TI and a fixed-lattice reference becomes inadequate). In order to achieve high-accuracy free energies, we propose a modification, specifically a quantum correction, to the conventional temperature integration, to yield the same free-energy differences as the TI. A detailed description of the quantum-corrected temperature integration is provided in Supplementary Discussion 1E.
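
The idea behind a λ-based TI can be illustrated on a toy problem. The following minimal sketch is not the workflow used in this work: it assumes two one-dimensional classical harmonic oscillators as stand-ins for the reference and the target potential, and it replaces Langevin dynamics by direct Gaussian sampling of the Boltzmann distribution. The function name and all parameters are hypothetical. The free-energy difference is obtained by integrating the ensemble average of the energy difference over the coupling parameter λ, and can be compared with the analytic result (kT/2) ln(k_target/k_ref).

```python
import numpy as np


def ti_free_energy_difference(k_ref, k_target, kT, n_lambda=11,
                              n_samples=200_000, seed=0):
    """Toy thermodynamic integration between two 1D classical harmonic wells.

    Assumption: the mixed potential U_lam = (1 - lam) * U_ref + lam * U_target
    is again harmonic, so its Boltzmann distribution can be sampled exactly
    with a Gaussian instead of running molecular dynamics.
    """
    rng = np.random.default_rng(seed)
    lambdas = np.linspace(0.0, 1.0, n_lambda)
    averages = []
    for lam in lambdas:
        k_lam = (1.0 - lam) * k_ref + lam * k_target
        # Positions distributed as exp(-k_lam * x^2 / (2 kT)).
        x = rng.normal(0.0, np.sqrt(kT / k_lam), size=n_samples)
        # Ensemble average of dU/dlam = U_target - U_ref at this lambda.
        averages.append(np.mean(0.5 * (k_target - k_ref) * x**2))
    averages = np.array(averages)
    # Trapezoidal integration of <dU/dlam> over lambda from 0 to 1.
    return float(np.sum(0.5 * (averages[1:] + averages[:-1]) * np.diff(lambdas)))
```

In a realistic setting each λ value requires its own dynamics run, and the integrand can become strongly nonlinear near the end points, which is why the stability of the potential at small λ matters.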

Direct upsampling on high-MTP snapshots is then utilized to efficiently obtain DFT accuracy, including the electronic and magnetic contributions. These contributions implicitly include the coupling effect from thermal vibrations. Once the total free energies are calculated across all the (V, T) grid points, they are parametrized to obtain highly converged free-energy surfaces. A Legendre transformation on the free-energy surface gives the Gibbs energy G(p, T) = F(V, T) + pV, from which thermodynamic properties at a given pressure are obtained. The Methods section contains further information on the involved steps, and the Supplementary Information all relevant details.
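
The Legendre transformation at a single temperature can be sketched as follows. This is only an illustration under stated assumptions: F(V) is parametrized by a simple polynomial in V (the order is a free, hypothetical parameter, not the parametrization used in this work), and the Gibbs energy is taken as the minimum of F(V) + pV inside the sampled volume range. All quantities must be given in consistent units.

```python
import numpy as np


def gibbs_energy(volumes, free_energies, pressure, order=3):
    """Return (G, V_eq) from F(V) samples at one temperature.

    Assumption: F(V) is well described by a polynomial of the given order;
    G(p) is evaluated as the minimum of F(V) + p * V over the sampled range.
    """
    v = np.asarray(volumes, dtype=float)
    f = np.asarray(free_energies, dtype=float)
    coeffs = np.polyfit(v, f, order)      # coefficients, highest power first
    coeffs[-2] += pressure                # add the p * V term to the linear part
    stationary = np.roots(np.polyder(coeffs))
    real = stationary[np.abs(stationary.imag) < 1e-10].real
    inside = real[(real >= v.min()) & (real <= v.max())]
    # Keep only stationary points with positive curvature (minima).
    curvature = np.polyval(np.polyder(coeffs, 2), inside)
    minima = inside[curvature > 0.0]
    if minima.size == 0:
        raise ValueError("no minimum of F + pV inside the sampled volume range")
    v_eq = minima[np.argmin(np.polyval(coeffs, minima))]
    return float(np.polyval(coeffs, v_eq)), float(v_eq)
```

Repeating this for every temperature yields G(p, T) and Veq(T), from which temperature derivatives can be taken.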

Accuracy of the MTPs

An accurate MTP that can reproduce ab initio data is crucial for the efficiency of our proposed methodology. The smaller the root-mean-square error (RMSE) of the MTP, the fewer snapshots are needed for converging the direct upsampling, as will be discussed below. In this regard, Fig. 2 shows the RMSE in the energies and forces of the high-MTP on a high-DFT test set at the respective melting point for the four systems. Although the MTP for Nb shows a comparatively large RMSE, the values are still sufficiently small for an efficient evaluation of thermodynamic properties, given our detailed analysis and understanding of direct upsampling. The relatively large RMSE of Nb probably arises from the large anharmonicity and a relatively complex atomic distribution, as also observed for other highly anharmonic systems in prior works18,34,40. In Supplementary Methods 2C3, we assess the performance of MTPs with increasing levels.

Fig. 2: Performance of the MTPs.

The test set root-mean-square errors (RMSEs) of the high-MTPs (level 20) in (a) energy (per atom) and (b) forces in medium-size supercells.

Optimization of direct upsampling

The most expensive stage during the free-energy calculations is by far the upsampling from high-MTP snapshots to high-DFT (Table 2). The efficiency of the methodology is thus significantly improved by minimizing the number of high-DFT calculations during direct upsampling, while still maintaining highest accuracy in the final free energies. The number of high-DFT calculations needed to achieve a certain accuracy is correlated to the RMSE of the high-MTPs.

Table 2 CPU-time estimate (in core hours) for different stages of the free-energy calculation for Nb (PBE, 11 valence electrons) using the current framework and using the previous TU-TILD methodology.

Figure 3 shows such a relation for Nb for different convergence criteria (symbols connected with lines), featuring a decreasing number of snapshots for a decreasing energy RMSE of the MTPs. A model MTP is used here as the target system in the upsampling (further details in Supplementary Discussion 1A). The trend can be analytically formulated irrespective of the system solely by using the energy RMSE of the MTP and the target accuracy. By approximating the standard deviation of the upsampling as the MTP energy RMSE, assuming a normal distribution, and using the standard error within a 95% confidence interval, the number of required snapshots is estimated as

$$n=\left(\frac{2\,\mathrm{RMSE}}{c}\right)^{2}, \tag{2}$$

where ± c is the target accuracy. For instance, for Nb with an MTP RMSE of 2 meV/atom, around 40 snapshots are needed to achieve a target accuracy of 0.6 meV/atom. This is indicated with the purple star in the figure. Besides Nb, the other systems in this work have high-MTPs fitted even to within 0.5 meV/atom and 0.05 eV/Å accuracy in energies and forces, respectively (Fig. 2), requiring much fewer snapshots for the convergence of the direct upsampling. For instance, Mg with an MTP RMSE of 0.16 meV/atom would require only around 10 snapshots to reach 0.1 meV/atom accuracy, as indicated by the red star in Fig. 3. The RMSEs here are evaluated at the melting temperature, and provide the largest estimates of the standard deviation. At lower temperatures, the standard deviation decreases and fewer snapshots are needed.
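
The estimate in Eq. (2) follows from the standard error of the mean: with a standard deviation σ ≈ RMSE and n independent snapshots, the half-width of the 95% confidence interval is approximately 2σ/√n, and setting this equal to c gives n = (2 RMSE/c)². A minimal sketch of this arithmetic is given below; the function name is hypothetical, and the result is returned unrounded so that the choice of rounding is left to the user.

```python
def required_snapshots(rmse, target_accuracy):
    """Estimate the number of upsampling snapshots from Eq. (2).

    Both arguments must share the same unit (e.g., meV/atom). The factor of 2
    approximates the 1.96 of a 95% confidence interval for a normal distribution.
    """
    if rmse < 0.0 or target_accuracy <= 0.0:
        raise ValueError("rmse must be >= 0 and target_accuracy must be > 0")
    return (2.0 * rmse / target_accuracy) ** 2
```

For the two cases quoted above, the function gives about 44 snapshots for Nb (RMSE of 2 meV/atom, target of 0.6 meV/atom) and about 10 for Mg (RMSE of 0.16 meV/atom, target of 0.1 meV/atom).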

Fig. 3: Number of snapshots required for convergence of the direct upsampling as a function of the energy RMSE of the MTP.

The colors represent different target accuracy as indicated in the legend. The symbols connected by lines denote upsampling to an Nb MTP model system. The purple and the red star show upsampling from MTP to DFT at Tmelt for Nb within ±0.6 meV/atom and for Mg within ±0.1 meV/atom, respectively. The error bars denote a 95% confidence interval derived from independent sets of calculations.

Significance of the (V, T) grid density

The proposed formalism enables affordable full free-energy calculations on a larger number of (V, T) points than in previous works. The importance of the (V, T) grid density is illustrated in Fig. 4a, b, which show the anharmonic free energy for Nb at Tmelt and the resulting bulk modulus calculated using different grid densities. The anharmonic free energies in Fig. 4a are given with respect to the anharmonic free energy calculated using the highest-density grid (11V × 13T). As the grid density increases, the anharmonic free energy begins to converge, as seen from the closer proximity of the dashed red curve to the solid black line in comparison to the dotted gray curve. Even a small difference in the anharmonic free energy (about 0.5 meV/atom at Tmelt) can lead to a considerable change in the high-temperature bulk modulus (an 18% change at Tmelt in the decrease of the bulk modulus relative to its 0 K value). Moreover, calculations on a coarse 4V × 4T grid are not sufficient to capture the qualitative (more quadratic) behavior of the bulk modulus at higher temperature (Fig. 4b). This reveals the importance of a highly converged free-energy surface with respect to the grid density, especially at higher temperatures and higher volumes. For further improvement, we additionally choose grid points above Tmelt and Vmelt to obtain converged thermodynamic properties also at the melting point. Such a study is made feasible by the speed of our methodology.

Fig. 4: Convergence of thermodynamic properties.

a Anharmonic free energy as a function of volume for Nb at Tmelt calculated by using different grid densities. The free energy is plotted with respect to the values calculated using the largest grid (11V × 13T). b The corresponding adiabatic bulk modulus for the three grids. c Anharmonic free energy for Nb at Tmelt calculated using a 11V × 13T grid but parametrized with polynomials of a different order. The free energy is plotted with respect to the values calculated using the 4th-order parametrization. d Corresponding thermal expansion coefficient for the three parametrizations.

Once free-energy calculations are performed on a sufficiently dense (V, T) grid, it is crucial to parametrize the surface, in particular the anharmonic free energy, with a sufficiently high-order polynomial basis. This is illustrated in Fig. 4c, d which show the anharmonic free energy (with respect to the 4th-order parametrization) calculated using different orders and the resulting thermal expansion coefficient. Although a second-order polynomial fit (gray curve) of Fah(V, T) differs from the fourth order by less than 1 meV/atom, it amounts to a 20% difference in the expansion coefficient at the melting point. As the order of the polynomial increases further to three and four, the free energy differs by less than 0.2 meV/atom, leading to converged thermodynamic properties (red dashed and black solid curves fall on each other).
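
How the polynomial order enters such a parametrization can be sketched with a generic least-squares fit. The example below is an illustration only: it assumes a total-degree polynomial basis in rescaled volume and temperature, which is not necessarily the basis used in this work, and the function names are hypothetical. Comparing fits of different order on the same grid gives a direct convergence check of the kind shown in Fig. 4c.

```python
import numpy as np


def _scale(x, lo, hi):
    """Map x from [lo, hi] to [-1, 1] to keep the fit well conditioned."""
    return 2.0 * (x - lo) / (hi - lo) - 1.0


def fit_free_energy_surface(volumes, temperatures, values, order):
    """Least-squares fit of F(V, T) with all terms v^i * t^j, i + j <= order.

    Inputs are flat arrays of equal length (one entry per grid point).
    Returns a callable that evaluates the fitted surface.
    """
    v = np.asarray(volumes, dtype=float)
    t = np.asarray(temperatures, dtype=float)
    f = np.asarray(values, dtype=float)
    bounds = (v.min(), v.max(), t.min(), t.max())
    powers = [(i, j) for i in range(order + 1) for j in range(order + 1 - i)]

    def design(vv, tt):
        x = _scale(np.asarray(vv, dtype=float), bounds[0], bounds[1])
        y = _scale(np.asarray(tt, dtype=float), bounds[2], bounds[3])
        return np.stack([x**i * y**j for i, j in powers], axis=-1)

    coeffs, *_ = np.linalg.lstsq(design(v, t), f, rcond=None)
    return lambda vv, tt: design(vv, tt) @ coeffs
```

A fit of too low an order may reproduce the free energies to within a fraction of a meV/atom and still distort the derivatives, so the comparison should be made on the derived properties as well as on the surface itself.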

Benchmarks of the methodology

Taking both an optimized direct upsampling and a converged (V, T) grid into account, Table 2 shows the total computational cost of obtaining thermodynamic properties to the desired accuracy using the current framework for Nb. The values are compared to the state-of-the-art TU-TILD+MTP scheme, in which MTPs were fitted directly to AIMD energies and forces, and a second TI was performed from the MTP to DFT, prior to upsampling. A 4.5-times speed-up is achieved during high-MTP fitting by using the two-stage training procedure (see the top half of Table 2). Through direct upsampling from precise high-MTPs and an optimized number of snapshots, we completely do away with the second stage of TU-TILD (TI from MTP to DFT), thereby achieving a 4.8-times gain in speed during free-energy calculations using a 11 × 13 (V, T) grid, shown in the bottom half of the table. Although the speed-up coming from direct upsampling in comparison to TU-TILD was mentioned in the work of Zhou et al.34, here we optimize both the number of snapshots for a single (V, T) point and the total number of grid points needed for the evaluation of thermodynamic properties.

Free energies for the prototype systems

In Fig. 5, we highlight key insights from the calculated free-energy surfaces, from which the target thermodynamic properties are derived. The first column provides the Gibbs energy G(T) at ambient pressure for the four elements using the GGA-PBE approximation to the exchange-correlation (XC) functional. Gibbs energies are the fundamental input to the calculation of phase diagrams (CALPHAD41). On the total scale, the full DFT Gibbs energy curves (solid lines) are closely tracing the CALPHAD values. Differences can be noticed at high temperatures when the DFT curves are referenced with respect to the CALPHAD data as shown in the insets. The discrepancy is not a shortcoming of the present methodology, but instead due to the inherent limitation of the local nature of the standard XC functionals. For Al and Mg, we also demonstrate the results using the LDA XC functional. Results from both functionals can act as an “ab initio confidence interval”, as was shown previously42.

Fig. 5: Ab initio calculated thermodynamic potentials for Nb, Ni, Al, and Mg.

Gibbs energy G(T) at 100 kPa (first column), free energy F(V) at Tmelt (second column), and anharmonic free energy Fah(V, T) (third column), using the PBE XC functional. The G(T) values are referenced to the minimum energy of the static lattice at 0 K. Calculations using the CALPHAD method41 (aligned to the ab initio values at room temperature) are shown in blue dots for comparison. The insets contain the full ab initio Gibbs energy at high temperatures with respect to the CALPHAD values, with the results for LDA added for Al and Mg. For G(T) and F(V, Tmelt), curves including different excitation mechanisms are shown. The melting temperatures Tmelt correspond to experimental values. The error bars denote a 95% confidence interval.

In the second column in Fig. 5, the free energy at the melting point is plotted as a function of volume. The FV curves, including all excitation mechanisms (solid lines), are analogous to the conventional EV curves at 0 K, but correspond to Tmelt. They contain, for example, information on the equilibrium volume and the (isothermal) bulk modulus at the melting point. In contrast to the EV curves at 0 K, the FV curves are dominated by thermal excitations. The anharmonic contribution (calculated with the effective QH as reference) is large for Nb (≈+50 meV/atom, see also the third column). This strong anharmonic behavior can be rationalized by the open bcc structure of Nb, which favors vibrational entropy, as compared to the close-packed fcc and hcp structures of the other elements. The electronic free energies are large for Nb and Ni (≈ −100 and −50 meV/atom), which can be corroborated with the Fermi-level contribution of a smeared-out electronic density of states43. For Ni, the magnetic contribution is as large as the electronic contribution, with its strength determined by the local magnetic moments on the Ni atoms. It needs to be stressed that the impact of thermal vibrations on the electronic and magnetic free energy is important, as the vibrations break the symmetric arrangement of the atoms and thereby significantly smooth the electronic density of states.

As noted above, the thermodynamic properties depend on first and second derivatives of the free energy, so reaching the desired accuracy requires control over sub-meV differences in the free energies. In particular, it is crucial to faithfully describe the physically relevant variations of the free energy with volume and temperature (cf. the wavy dependence for Ni’s Fah(V) in Fig. 5), while at the same time avoiding any overfitting. The sufficiently dense sets of explicitly computed F(V, T) points are seen in the right column in Fig. 5, guaranteeing convergence with respect to the number of basis functions in the expansion of the free-energy surface.

Thermodynamic properties for the prototype systems

Figure 6 shows the target thermodynamic properties—the isobaric heat capacity Cp(T), the linear thermal expansion coefficient α(T) and the adiabatic bulk modulus BS(T)—calculated up to the melting point using the current ab initio framework, for Nb, Ni, Al, and Mg, including the different excitation mechanisms (provided in the legend). Experimental values are shown as blue circles for comparison, and our calculations, including all excitation mechanisms (solid lines), show excellent agreement.

Fig. 6: Ab initio calculated Cp(T), α(T), and BS(T) up to the melting point for Nb, Ni, Al, and Mg.

Calculations are compared to experimental results shown in blue circles. Results considering different excitation mechanisms (effective QH (qh), anharmonic (ah), electronic (el), and magnetic (mag)) are shown. Nb and Ni results are for PBE, while for Al and Mg LDA results are additionally shown. The following experimental values are used for comparison: Nb41,53,54,55, Ni41,51,52, Al5,41,56, and Mg5,41,57.

The unprecedented accuracy achievable with the current framework is apparent in the results for Nb (first row in Fig. 6). We have shown in the previous section that Nb has the largest electronic and anharmonic contribution of the four systems studied. Nb also possesses long-range interactions at 0 K that gradually disappear as temperature increases14. The disappearance is validated by explicit TI calculations on large-size cells from high-MTP to DFT (see Supplementary Discussion 1D). In addition, for Nb, the energies predicted by the high-MTP (fitted at Tmelt) on low-temperature configurations with de-coupled phonons become less accurate (see Supplementary Discussion 1D3). However, the loss in accuracy is fully compensated for in the directly upsampled free energy. Despite all these challenges posed by Nb, we still achieve remarkable agreement with experiment. In particular, the calculations are able to reproduce the strong temperature dependence and the curvature of the expansion coefficient all the way to the melting point. This is made possible by the dense (V, T) grid on which the free-energy calculations are performed, leading to highly converged numerical first and second derivatives.

The results for Nb also showcase how the different thermodynamic properties probe distinct features of the free-energy surface. The strong anharmonic free energy discussed above affects significantly the expansion coefficient (yellow dotted vs red dashed line) and contributes to its curvature. In contrast, the heat capacity is much less affected by the anharmonic free energy. The situation is opposite for the electronic thermal excitations which strongly increase the computed heat capacity of Nb bringing it close to the experiment, while the expansion coefficient is less affected. It is the dependence on derivatives along distinct directions on the free-energy surface that brings about the different behavior of the thermodynamic properties.
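
The different sensitivities can be made explicit with the standard thermodynamic relations below. They are textbook identities rather than expressions quoted from the Methods section, and writing the linear expansion coefficient as one third of the volumetric one assumes isotropic (or direction-averaged) expansion. The derivatives of F are evaluated at the equilibrium volume Veq(T):

$$C_p=-T\left(\frac{\partial^2 G}{\partial T^2}\right)_p,\qquad \alpha=\frac{1}{3V}\left(\frac{\partial V}{\partial T}\right)_p=-\frac{1}{3V}\,\frac{\partial^2 F/\partial V\,\partial T}{\partial^2 F/\partial V^2},\qquad B_T=V\left(\frac{\partial^2 F}{\partial V^2}\right)_T,\qquad B_S=B_T\,\frac{C_p}{C_V}.$$

The heat capacity thus probes the curvature of the surface along temperature, the isothermal bulk modulus the curvature along volume, and the expansion coefficient the mixed volume-temperature derivative, so a contribution with a strong volume dependence can change α markedly while leaving Cp almost unchanged.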

In the second row in Fig. 6, we present the results for Ni. By virtue of its construction, the semi-empirical magnetic model predicts well the peak in the heat capacity at the Curie temperature arising from the second-order magnetic phase transition. The validity of the current approach for calculating magnetic free energies and the negligibility of contributions beyond it (e.g., LSFs) have been thoroughly demonstrated in earlier works33,44. However, the magnetic model utilized for Ni cannot capture the small peak in the experimental expansion coefficient originating from the magnetic phase transition. Other, more elaborate magnetic models could be incorporated into the present framework to further improve the magnetic description.

The results for Al and Mg (last two rows in Fig. 6) are provided for both the LDA and PBE XC functionals. The difference coming from the XC functionals is evident in the calculated bulk modulus, where the PBE and LDA results form a lower and an upper bound to the experimental bulk modulus, providing an ‘ab initio confidence interval’42 similar to that for the Gibbs energies mentioned above. Although this has been documented for some systems and properties in the literature23,42, we report it for the first time for the bulk moduli at full DFT accuracy.

Discussion

The procedure described in this work presents a complete and very efficient methodology to predict highly accurate ab initio free-energy surfaces and thermodynamic properties up to the melting point. The procedure has been developed and streamlined by considering key insights and findings from ab initio studies over the past decade, and by utilizing direct upsampling and advanced machine-learning potentials (i.e., MTPs). The current proposition makes calculations on a very dense (V, T) grid affordable. It also takes the relevant finite-temperature excitations into account—the electronic free energy, the magnetic contribution, anharmonicity, and coupling effects. Consequently, even sub-meV differences that affect high-temperature thermodynamics are factored in.

The proposed procedure can be combined with any ab initio electronic structure approach, and with advanced exchange-correlation functionals, e.g., hybrid functionals45, meta-generalized gradient approximations46, or even with the random-phase-approximation and the adiabatic connection fluctuation–dissipation theorem47. The more computationally expensive functionals can be employed either for the EV curve or during upsampling in order to reach higher ab initio accuracy. The application and efficiency of the procedure rely primarily on accurately fitted MTPs. Hence, the technique can also be applied to more complex and possibly disordered systems such as multi-component alloys, for some of which MTPs within 3 meV/atom energy RMSE (similar to that for Nb) exist in the literature40. The optimization and analysis derived here (in particular for the direct upsampling, where the RMSE of the MTP determines the efficiency) can be applied to efficiently obtain well-converged free energies. Free energies of arbitrary phases can be computed, as long as it is possible to perform sufficient dynamics within the considered phase to obtain statistically converged quantities. In addition to equilibrium phases, properties of metastable and dynamically stabilized phases are accessible. Once bulk free energies are available, the procedure can be extended to systems with various kinds of defects (e.g., vacancies, surfaces, interfaces, grain boundaries). With the calculated free energies of the ideal bulk and the defective structure, one can evaluate the formation Gibbs energy of the respective defect. In the case of thermal vacancies, their presence at elevated temperatures will mildly contribute to the thermodynamic properties of the system, providing an even more realistic comparison to experimental data. For example, in Al, thermal vacancies are known to add 0.07 kB to the heat capacity at the melting point23.

Predictions of highly accurate solid free-energy surfaces are also required as a reference for liquid phase calculations, e.g., in the TOR-TILD methodology48. From the free-energy surface, other thermodynamic properties such as the enthalpy, entropy, and Grüneisen parameter can be likewise derived.

Data sets for the studied properties (Gibbs energy, enthalpy, entropy, isobaric and isochoric heat capacity, thermal expansion coefficient, and isothermal and adiabatic bulk moduli) of the prototype systems investigated here are tabulated in Supplementary Discussion 4 and provided online (see “Data availability” and “Code availability”). With the robust and efficient methodology introduced here, we are well-positioned to extend this work to other systems and develop an entirely ab initio thermodynamic database.

Methods

An overview of the steps involved in our framework to calculate the relevant free-energy contributions and eventually thermodynamic properties is provided here, with more details in the Supplementary Information.

Energy–volume curve and Debye–Grüneisen model (Supplementary Methods 2A)

We start with a conventional 0 K energy–volume (EV) curve calculation with very well-converged DFT parameters (2 × ENMAX, i.e., twice the maximum recommended energy cut-off, and more than 60,000 k-points (kp) × atoms) in a reasonable volume range (typically −8% to +12%) around the 0 K equilibrium volume on a mesh of at least 11 volumes. The DFT computed values are used to fit the Vinet equation of state49 to obtain E0K(V) for Eq. (1). The EV curve is also used to obtain the free energy within the Debye–Grüneisen model50, from which we estimate the relevant volume range for the free energy calculations. For the melting temperature, we use experimental values.
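As a minimal sketch of this step, the snippet below evaluates the standard Vinet energy expression on an 11-point mesh spanning −8% to +12% around the equilibrium volume and checks that the curvature at V0 recovers the bulk modulus. The parameter values and the helper names (`vinet_energy`, `volume_mesh`) are placeholders chosen for illustration, not fitted values or code from this work. In practice the four parameters would be obtained by a least-squares fit to the DFT energies.

```python
import math

def vinet_energy(v, e0, v0, b0, b0p):
    """Vinet E(V); b0 in energy/volume units, b0p = pressure derivative of B at v0."""
    x = (v / v0) ** (1.0 / 3.0)
    a = b0p - 1.0
    bracket = 5.0 + 3.0 * b0p * (x - 1.0) - 3.0 * x
    return e0 + 2.0 * b0 * v0 / a**2 * (2.0 - bracket * math.exp(-1.5 * a * (x - 1.0)))

def volume_mesh(v0, lo=-0.08, hi=0.12, n=11):
    """n equally spaced volumes between (1 + lo) * v0 and (1 + hi) * v0."""
    return [v0 * (1.0 + lo + (hi - lo) * i / (n - 1)) for i in range(n)]

# Placeholder parameters (eV, A^3, eV/A^3); NOT fitted values from the paper.
E0, V0, B0, B0P = -3.75, 16.5, 0.48, 4.6

mesh = volume_mesh(V0)
energies = [vinet_energy(v, E0, V0, B0, B0P) for v in mesh]

# Consistency check: V * d2E/dV2 at V0 should recover B0.
h = 1e-3 * V0
curvature = (vinet_energy(V0 + h, E0, V0, B0, B0P)
             - 2.0 * vinet_energy(V0, E0, V0, B0, B0P)
             + vinet_energy(V0 - h, E0, V0, B0, B0P)) / h**2
b_numeric = V0 * curvature
```

The curvature check is a quick way to confirm that a hand-coded equation of state and its parameters are mutually consistent before the curve is used as E0K(V).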

Low-quality MTP (Supplementary Methods 2B)

To efficiently generate a highly accurate MTP (next point), a lower-quality MTP is first obtained to rapidly sample the vibrational phase space. For that purpose, AIMD is performed using low-accuracy DFT (low-DFT) parameters (1 × ENMAX energy cut-off, 288–432 kp × atoms and without electronic temperature) on a coarse set of four volumes from the relevant volume range, at the melting point, on small-size supercells of 32–54 atoms with a timestep of 5 fs for 1000 steps. Uncorrelated snapshots are chosen to train a lower-quality MTP (low-MTP) with few basis functions. Specifically, levels of \({{\rm{lev}}}_{\max }=6\)–14 and radial basis sizes of NQ = 8 are used, resulting in 25–88 fitting parameters. The minimum distances (\({R}_{\min }\)) are 1.33–2.00 Å and the cutoff radii (Rcut) are 4.96–6.24 Å.

High-quality MTP (Supplementary Methods 2C)

The low-MTP is used to perform NVT MD simulations in a medium-size supercell (96–128 atoms) for 8–10 volumes in the relevant volume range, at the melting point, using a timestep of 1 fs for 9000 steps. From the trajectories, 30 snapshots are chosen from each volume, and DFT calculations with highly converged (high-DFT) parameters (1.5 × ENMAX and 6100–8200 kp × atoms) are performed to serve as the training set for a higher-accuracy MTP (high-MTP). The high-MTP has significantly more basis functions than the low-MTP, i.e., a level \({{\rm{lev}}}_{\max }=20\) and NQ = 8 radial basis functions, resulting in 332 fitting parameters. A ‘harmonic’ snapshot generated with the effective QH reference (next point) is included in the fitting database to stabilize the high-MTP for small interatomic distances. The resulting \({R}_{\min}\) are 1.36–2.00 Å and the Rcut are 4.97–6.10 Å. The electronic contribution is not included in the DFT database for the high-MTP, because the current MTPs do not entail electronic degrees-of-freedom. Note that within our approach, no expensive high-DFT AIMD is required.
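The text does not specify how the 30 snapshots per volume are picked from the 9000-step trajectory. One simple possibility, shown here purely as an assumption, is to discard an initial equilibration window and take evenly spaced frames from the remainder. The function name `pick_snapshots` and the 1000-step discard are hypothetical choices.

```python
def pick_snapshots(n_steps, n_pick, n_discard=0):
    """Return n_pick step indices, evenly spaced over the steps after n_discard."""
    usable = n_steps - n_discard
    if n_pick > usable:
        raise ValueError("not enough MD steps for the requested snapshots")
    stride = usable // n_pick
    # Take the last step of each stride-long window.
    return [n_discard + (i + 1) * stride - 1 for i in range(n_pick)]

# 9000 MD steps of 1 fs, 30 snapshots; the 1000-step discard is an assumed value.
indices = pick_snapshots(n_steps=9000, n_pick=30, n_discard=1000)
```

With these numbers the selected frames are 266 fs apart. Whether that spacing is long enough for the snapshots to be uncorrelated depends on the system and would need to be checked against the energy autocorrelation.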

Effective QH reference (Supplementary Methods 2D)

We use an effective QH model as a reference for the TI to the high-MTP. To obtain it, low-temperature MD (e.g., at 20 K) is run using the high-MTP on medium-size supercells at several volumes in the relevant range for 10,000 steps of 1 fs each. From this, we choose 30 snapshots for each volume and calculate high-DFT forces. An effective dynamical matrix is then fitted to these forces and extended to larger system sizes. Each of the force constants is parametrized using a second-order polynomial in V. For each V, Fqh(T) is calculated on a 30 × 30 × 30 q-point mesh in reciprocal space.

An effective QH reference is preferred to a 0 K QH due to its wider applicability and stability (even for 0 K unstable systems that become stable at elevated temperatures), efficiency for low-symmetry systems, and a fitting dataset where errors in atomic forces are averaged out.
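For orientation, the free energy of a set of independent harmonic modes follows the textbook expression F = Σ [ħω/2 + kBT ln(1 − exp(−ħω/kBT))]. The sketch below evaluates it for a short list of hypothetical phonon energies; in an actual calculation the sum would run over all branches and q-points of the mesh with appropriate normalization per atom. The function name and the mode energies are illustrative assumptions.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

def harmonic_free_energy(mode_energies, temperature):
    """Sum of hw/2 + kT*ln(1 - exp(-hw/kT)) over modes (energies in eV, T > 0 in K)."""
    kt = K_B * temperature
    total = 0.0
    for hw in mode_energies:
        total += 0.5 * hw + kt * math.log1p(-math.exp(-hw / kt))
    return total

# Hypothetical phonon energies (eV) standing in for a q-point-sampled spectrum.
modes = [0.010, 0.015, 0.020, 0.025]

f_low = harmonic_free_energy(modes, 1.0)      # close to the zero-point energy
f_high = harmonic_free_energy(modes, 2000.0)  # dominated by the entropic term
```

At low temperature the result reduces to the zero-point energy, and at high temperature it approaches the classical limit kBT Σ ln(ħω/kBT) from above.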

TI from effective QH to high-MTP (Supplementary Methods 2E)

Next, we perform TI using Langevin dynamics (TILD) to obtain the free-energy difference between the effective QH reference and the high-MTP. The high-MTP is used as an intermediate potential to minimize high-DFT calculations for the final free energy. During TILD, the free-energy difference is given by

$$\Delta{F}^{{{{\rm{qh\to MTP}}}}}=\int\nolimits_{0}^{1}d\lambda {\left\langle {E}^{{{{\rm{MTP}}}}}-{E}^{{{{\rm{qh}}}}}\right\rangle }_{\lambda },$$
(3)

where λ (running from 0 to 1) defines the coupled system with energy \({E}_{\lambda }=(1-\lambda ){E}^{{\rm{qh}}}+\lambda {E}^{{\rm{MTP}}}\). TILD is performed on large-size supercells (432–500 atoms) with a timestep of 1 fs for 50,000 steps. For each (V, T), a very dense set of around 20 λ values is used. We then integrate over λ with an analytical fit based on a tangential function to obtain \(\Delta{F}^{\rm{qh\to MTP}}\).
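The structure of Eq. (3) can be illustrated with a toy problem whose answer is known in closed form: two one-dimensional classical springs with different force constants stand in for the QH reference and the MTP. The sketch below is an assumption-laden illustration rather than the procedure used in this work. The thermal averages are analytic instead of coming from Langevin MD, and the λ integration uses a plain trapezoidal rule in place of the tangential-function fit. All names and parameter values are hypothetical.

```python
import math

def trapezoid(xs, ys):
    """Composite trapezoidal rule on a possibly non-uniform grid."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))

def mean_energy_difference(lam, k_ref, k_target, kt):
    """<E_target - E_ref>_lambda for 1D classical springs E = k * x^2 / 2.

    The coupled system has spring constant (1 - lam) * k_ref + lam * k_target,
    so <x^2> = kt / k_lam by equipartition.
    """
    k_lam = (1.0 - lam) * k_ref + lam * k_target
    return 0.5 * (k_target - k_ref) * kt / k_lam

KT = 0.1                    # thermal energy, arbitrary units (toy value)
K_REF, K_TARGET = 1.0, 2.0  # toy spring constants for "qh" and "MTP"

lambdas = [i / 20.0 for i in range(21)]  # 21 lambda values from 0 to 1
integrand = [mean_energy_difference(lam, K_REF, K_TARGET, KT) for lam in lambdas]

delta_f = trapezoid(lambdas, integrand)
exact = 0.5 * KT * math.log(K_TARGET / K_REF)  # analytic free-energy difference
```

On this smooth integrand the 21-point trapezoidal estimate agrees with the analytic value to better than 1e-4 in the units of KT used here. For real systems the integrand can vary steeply near the end points, which is one reason a dense λ set and a suitable analytical fit are attractive.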

In certain situations, the λ-based TILD calculations cannot be straightforwardly performed. For example, in systems that feature diffusion of atoms and exchange of sites during the TI, a fixed-lattice reference such as the effective QH becomes inadequate. Then, it is possible to utilize an alternative method to calculate free-energy differences, i.e., temperature integration. In the present study, temperature integration has been used to corroborate the TILD calculations. Details and special considerations about this method can be found in Supplementary Discussion 1E.

One should keep in mind that the present step of the procedure involves no DFT calculations. Hence, we can afford to perform highly converged free-energy calculations on large-size supercells which also include contributions from vibrations with long wavelengths. Moreover, finite-size effects (e.g., stacking fault formation in small-size Mg hcp cells, see Supplementary Discussion 1F) are avoided by utilizing large-size supercells.

Figure 1a–c encapsulates the three stages just discussed. Here, \(\left\langle {E}^{{{{\rm{MTP}}}}}-{E}^{{{{\rm{qh}}}}}\right\rangle\) is plotted against λ for a single volume and a set of temperatures for a 500-atom Ni supercell, where λ = 0 corresponds to the effective QH reference and λ = 1 corresponds to the high-MTP. The high density of λ values seen in the figure achieves good convergence and is affordable due to the inexpensive nature of this step.

Direct upsampling from high-MTP to high-DFT (Supplementary Methods 2F)

In the spirit of the direct-upsampling approach, we perform high-DFT runs on high-MTP-generated snapshots (illustrated by red dots in Fig. 1d). In addition to reaching DFT accuracy for the vibrational free energy, we also include the electronic contribution. Upsampling from the MTP relies on its high accuracy, owing to which highly converged upsampled energies can be achieved within a few tens of snapshots. As discussed, the speed of convergence depends on the accuracy of the MTP; for the full analysis, see Supplementary Discussion 1A.

Since this step involves computationally demanding calculations (Table 2), they are restricted to medium-size supercells. The upsampling is performed in two parts. First, the free-energy difference between high-MTP and high-DFT is calculated using the free-energy perturbation expression, as given by

$$\Delta {F}^{{{{\rm{up}}}}}=-{k}_{B}T\ln {\left\langle \exp \left(-\frac{{E}^{{{{\rm{DFT}}}}}-{E}^{{{{\rm{MTP}}}}}}{{k}_{B}T}\right)\right\rangle }_{{{{\rm{MTP}}}}},$$
(4)

where EDFT and EMTP are high-DFT (without electronic temperature) and high-MTP energies. The averaging is performed on uncorrelated high-MTP snapshots. Equation (4) corresponds to the full free-energy perturbation formula. We note that at least the second-order approximation of the perturbation equation is vital to capture the full upsampled free-energy difference (see Supplementary Discussion 1A1). Adding the upsampled free energy to \(\Delta{F}^{\rm{qh\to MTP}}\) from the previous stage gives the anharmonic vibrational contribution:

$${F}^{{{{\rm{ah}}}}}={{\Delta }}{F}^{{{{\rm{qh\to MTP}}}}}+{{\Delta }}{F}^{{{{\rm{up}}}}}.$$
(5)
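A minimal sketch of the estimator in Eq. (4), together with its first- and second-order cumulant truncations, is given below. The energy differences are made-up numbers and the temperature is an assumed value; only the form of the estimators follows the text. The exponential average is evaluated after factoring out the smallest energy difference, a standard trick to avoid overflow or underflow.

```python
import math

def fep_full(delta_e, kt):
    """Full free-energy perturbation: -kT * ln < exp(-dE / kT) >."""
    shift = min(delta_e)  # factor out the smallest dE for numerical stability
    avg = sum(math.exp(-(d - shift) / kt) for d in delta_e) / len(delta_e)
    return shift - kt * math.log(avg)

def fep_second_order(delta_e, kt):
    """Cumulant expansion truncated at second order: <dE> - var(dE) / (2 kT)."""
    n = len(delta_e)
    mean = sum(delta_e) / n
    var = sum((d - mean) ** 2 for d in delta_e) / n
    return mean - var / (2.0 * kt)

# Made-up E_DFT - E_MTP differences (eV per supercell) for a few snapshots.
d_e = [0.012, 0.018, 0.009, 0.015, 0.021, 0.011]
KT = 8.617333262e-5 * 2000.0  # k_B * T in eV at an assumed 2000 K

first = sum(d_e) / len(d_e)        # first-order estimate, <dE>
second = fep_second_order(d_e, KT)
full = fep_full(d_e, KT)
```

For energy differences that are small compared with kBT, the second-order value is very close to the full expression, whereas the plain average is an upper bound. The same estimator form applies to Eq. (6) when the energy difference is taken between high-DFT runs with and without electronic temperature.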

In the second part, we calculate the electronic free energy Fel using the same snapshots, as given by

$${F}^{{{{\rm{el}}}}}=-{k}_{B}T\ln {\left\langle \exp \left(-\frac{{E}_{{{{\rm{el}}}}}^{{{{\rm{DFT}}}}}-{E}^{{{{\rm{DFT}}}}}}{{k}_{B}T}\right)\right\rangle }_{{{{\rm{MTP}}}}},$$
(6)

where \({E}_{{{{\rm{el}}}}}^{{{{\rm{DFT}}}}}\) is the high-DFT energy, including electronic temperature. Since it is performed on MD snapshots, the upsampling also accounts for the effect of atomic vibrations on the electronic free energy.

For magnetic systems (Ni, in this case), we extract average magnetic moments from the high-DFT runs (including electronic temperature). Along with the experimental Curie temperature, they are used as model parameters for an empirical heat capacity formula as a function of temperature, and for the corresponding numerically integrated magnetic free energy Fmag(V, T).

Parametrization and thermodynamic properties (Supplementary Methods 2G)

A sufficiently dense (V, T) grid is necessary to fit a smooth free-energy surface and numerically calculate converged second derivatives in the evaluation of Cp(T) and BS(T) all the way to Tmelt. For this purpose, it is also recommended to extend the grid further (by one or two points in V and T) beyond the corresponding melting temperature and volume, i.e., to T > Tmelt and V > Vmelt.

Smooth surfaces in (V, T) are fit to each of the free-energy contributions. For a dense temperature mesh (steps of 1 kelvin), Fqh(T) is parametrized with a third-order polynomial in V to obtain the effective QH free-energy surface Fqh(V, T). The anharmonic free energies Fah at every (V, T) grid point are used to fit a continuous anharmonic free-energy surface using renormalized effective anharmonic frequencies23,33. Here, an adequate polynomial basis is a requisite for the frequencies since the derived thermodynamic quantities are particularly sensitive to them. A fourth-order polynomial in V and T is found to be a conservative and well-converged basis set. The total vibrational free energy can be obtained by summing up the effective QH and anharmonic surfaces: Fvib(V, T) = Fqh(V, T) + Fah(V, T). The electronic free energies Fel at every (V, T) grid point are used to fit a polynomial in (V, T)43 to obtain a continuous electronic free-energy surface Fel(V, T). The average magnetic moments m at every (V, T) grid point are first parametrized with a polynomial in T, and later with a polynomial in V for every 1-kelvin step, to obtain a continuous Fmag(V, T) surface.

All contributions are discretized in 1-kelvin steps, and the total free energy is obtained by adding the contributions to the EV curve (Eq. (1)). Numerical first and second derivatives along different directions are then evaluated to obtain Cp(T), α(T), and BS(T).
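How the derivatives map onto the properties can be sketched on a toy free-energy surface with a known analytic answer. The example below assumes zero pressure, uses central finite differences with a 1 K temperature step, and evaluates Cp = −T ∂²G/∂T², α = (1/V) dV/dT and, for simplicity, the isothermal bulk modulus BT = V ∂²F/∂V² rather than BS. The model surface, its parameters and all function names are hypothetical and in arbitrary units.

```python
def d1(f, x, h):
    """Central first derivative."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def d2(f, x, h):
    """Central second derivative."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

# Toy free-energy surface with hypothetical parameters (arbitrary units).
K, V0, A, B = 2.0, 10.0, 1.0e-4, 1.0e-6

def free_energy(v, t):
    return 0.5 * K * (v - V0) ** 2 - 0.5 * A * t**2 - B * t**2 * (v - V0)

def equilibrium_volume(t, hv=1e-3):
    """Zero-pressure volume from Newton steps on dF/dV = 0."""
    def f_of_v(x):
        return free_energy(x, t)
    v = V0
    for _ in range(50):
        step = d1(f_of_v, v, hv) / d2(f_of_v, v, hv)
        v -= step
        if abs(step) < 1e-10:
            break
    return v

def gibbs(t):
    """G(T) at P = 0, i.e. F evaluated at the equilibrium volume."""
    return free_energy(equilibrium_volume(t), t)

def properties(t, ht=1.0, hv=1e-3):
    v = equilibrium_volume(t)
    cp = -t * d2(gibbs, t, ht)                       # C_p = -T d2G/dT2
    alpha = d1(equilibrium_volume, t, ht) / v        # alpha = (1/V) dV/dT
    bt = v * d2(lambda x: free_energy(x, t), v, hv)  # B_T = V d2F/dV2
    return v, cp, alpha, bt

v_eq, cp, alpha, b_t = properties(1000.0)
```

For this model the analytic results at T = 1000 are V = V0 + B·T²/K, Cp = A·T + 6·B²·T³/K, α = 2·B·T/(K·V) and BT = K·V, which the finite-difference values reproduce closely. On a real surface the second derivatives amplify noise, which is why a smooth parametrization on a dense (V, T) grid is needed.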