Machine-learned Interatomic Potentials for Alloys and Alloy Phase Diagrams

We introduce machine-learned potentials for Ag-Pd to describe the energy of alloy configurations over a wide range of compositions. We compare two different approaches. Moment tensor potentials (MTP) are polynomial-like functions of interatomic distances and angles. The Gaussian Approximation Potential (GAP) framework uses kernel regression, and we use the Smooth Overlap of Atomic Positions (SOAP) representation of atomic neighbourhoods that consists of a complete set of rotational and permutational invariants provided by the power spectrum of the spherical Fourier transform of the neighbour density. Both types of potentials give excellent accuracy for a wide range of compositions and rival the accuracy of cluster expansion, a benchmark for this system. While both models are able to describe small deformations away from the lattice positions, SOAP-GAP excels at transferability as shown by sensible transformation paths between configurations, and MTP allows, due to its lower computational cost, the calculation of compositional phase diagrams. Given the fact that both methods perform as well as cluster expansion would but yield off-lattice models, we expect them to open new avenues in computational materials modeling for alloys.


INTRODUCTION
The technology frontier relies on the exceptional performance of next-generation materials. First-principles calculations of material properties provide one way to discover new materials or optimize existing ones. However, calculating properties from a firstprinciples approach is resource-intensive, restricting its applicability. For example, molecular dynamics simulations using density functional theory (DFT) are currently capable of handling fewer than a thousand atoms at picosecond time scales. However, many interesting materials science problems and technologically important processes can only be described with millions of atoms on micro-to millisecond time scales.
Conventional interatomic potentials (IPs), such as Lennard-Jones, embedded atom method (EAM), modified EAM, Tersoff, Stillinger-Weber, and so on, typically provide six to eight orders of magnitude speed-up compared to DFT calculations, and due to their simple, physically motivated forms, they are somewhat robust in the sense that their predictions for low energy structures are plausible. However, their quantitative accuracy is typically quite poor compared to DFT, especially in reproducing macroscopic properties. Machine-learned IPs tend to be much more accurate, but they are typically three to four orders of magnitude slower than conventional IPs. Importantly, the range of their applicability may be quite restricted. In typical parlance, their transferability can be limited. This transferability problem requires researchers to take care of constructing, applying, and validating IPs, and in particular makes it a rather tenuous proposition to use them to discover and predict new structures and novel properties.
In 2010, the Gaussian approximation potential (GAP) 1 was introduced as an approach to create IPs with ab initio accuracy, using kernel regression and invariant many-body representations of the atomic neighborhood. Since their introduction, they have been effective at modeling potential energy surfaces 2-4 and reactivity 5 of molecules 6 and solids 7,8 , defects 9 , dislocations 10 , and grain boundary systems 11 . Recently Bartók et al. 12 showed that a GAP model using a smooth overlap of atomic position (SOAP) kernel can be systematically improved to reproduce even complex quantum-mechanical effects 13 . SOAP-GAP has thus become a standard by which to judge the effectiveness of numerical approximations to ab initio data. There are a number of other machine-learned potentials that also perform well and have overlapping applications with SOAP-GAP, although applications of several of these methods to alloys are still nascent [14][15][16][17][18][19][20][21][22][23][24][25][26][27][28][29][30][31] . While recent work 32 compares the performance of multiple methods for alloy systems, it does not address dynamical quantities such as phonon dispersion or temperature-composition dependence in phase diagrams. Note that although the GAP framework can be used with arbitrary kernels, for simplicity we will use the GAP abbreviation to mean SOAP-GAP exclusively in the rest of this paper.
The moment tensor potential (MTP) 33 is another approach to learning quantum-mechanical potential energy surfaces. Due to the efficiency of its polynomial basis of interatomic distances and angles, MTP is significantly faster than GAP and has already been shown to be capable of reaching equivalent accuracy for modeling chemical reactions 34 , single-element systems 35,36 , single-phase binary systems 37 , or ground states of multicomponent systems 25 . In this work, we demonstrate that both GAP and MTP are capable of fitting the potential energy function of a binary metallic system, the Ag-Pd alloy system, with DFT accuracy across the full space of configuration and composition for solid and liquid systems. In addition to reproducing energies, forces, and stress tensor components with near-DFT accuracy, we show that these potentials can also approximate phononic band structure quite well and can be used to model compositional phase diagrams. These new capabilities of quantum-accurate IPs for alloys would pave the way to accelerated materials discovery and optimization.
The Ag-Pd system provides a stringent test for a machinelearned IP that shows whether it can compete with the cluster expansion (CE) method despite the much simpler "lattice gas" formalism of the latter. The chemical similarity of silver and palladium and their similar atomic sizes (leading to small atomic mismatch) 38 make it an ideal system for CE and a challenging test for competing methods. Ultimately, the challenging aspect for competing methods is handling both the structural and compositional degrees of freedom. For a system with matched atom sizes, the structural degrees are less of an issue for CE and it shines because of its effectively compositional basis functions. Since methods that use continuous basis functions, like the ones we test, typically perform better for structural rather than compositional variation, this system is a challenging test.
Phonon band structures directly describe phase stability at moderate temperatures via the quasi-harmonic approximation. We first show that both SOAP-GAP and MTP potentials can accurately reproduce DFT-calculated phonon band structures for alloy configurations that are not in the training set. As a demonstration of speed and transferability, we use the MTP potential to calculate melting lines and transition temperatures for the Ag-Pd phase diagram using the nested sampling (NS) method [39][40][41] . We then compare the performance of GAP and MTP across a low energy transition pathway between two stable configurations to demonstrate the importance of regularization and active learning.

METHODS Datasets
In this section, we describe the datasets used to fit and validate the potentials. Both the MTP and GAP potentials were fitted to the same active-learned dataset, while a liquid dataset provided validation for energies, forces, and "virials" (volume-weighted coefficients of the stress tensor). Although only the active-learned dataset was used for building the models, there was some overlap between the seed configurations in the active-learned dataset and the configurations for which phonons are predicted (discussed later), both having their origin in enumerated 42 supercells.

Active-learned dataset
We use the MTP potential and its associated tools to create a database via active learning 25,43 as described below. We start with a catalog of small fccand body-centered cubic (bcc)-based derivative superstructures. The energies, forces, and virials of these structures are computed by DFT and are then used to fit an MTP potential. This potential is then used to perform structural relaxation for all structures in the database. Active learning 25,43 is based on a geometric criterion determining whether extrapolation was attempted while predicting energies, forces, and virials. If, during the relaxation of a particular structure, the estimated degree of extrapolation of the potential is too large, then that (partially relaxed) structure is computed with DFT and added to the training set. When the potential can reliably relax all structures in the enumerated database, the database is expanded to include larger unit cells, and the process is repeated.
For this work, an initial catalog of 58 enumerated structures 42 with bcc and fcc derivative superstructures containing four atoms or less were calculated. We iterated the active-learning process until the MTP was able to successfully relax all enumerated structures with cell sizes up to 12 (a total of 10,850 structures). This final active-learning dataset has 774 configurations.
All the DFT data for these potentials were calculated with the Vienna Ab initio Simulation Package (VASP) [44][45][46][47][48] using the PBE functional 49 . The kpoints were selected using either Monkhorst-Pack 50 or WMM 51 schemes as described below. PREC = Accurate and EDIFF = 1e − 4 were used for all calculations unless otherwise specified.
During active learning, we used a k-point density setting of MINDI-STANCE = TRUE and an energy convergence target of EDIFF = 1e -4 for the self-consistent loop. However, for the final fit, we found it necessary to recompute the DFT for this dataset with higher k-point density and a tighter EDIFF setting in order to get good convergence of phonon dispersions. In our experience, a linear k-point density of 0.015 k-points per Å −1 is a reliable density for alloy fits.
The final dataset for training the GAP and MTP potentials used the original 774 configurations discovered through active learning but computed with MINDISTANCE = 65 in Mueller's scheme and EDIFF = 1e − 8.

Liquid dataset
We built a dataset of liquid-like configurations by performing MD simulations using VASP at a high temperature. These calculations were performed at compositions of 25, 50, and 75 at.% Ag in cells with 32 atoms. The temperature for each simulation was set around the theoretical melting point (linearly interpolated from atomic melting points). Thus, 2766, 3063, and 3360 K were set as target temperatures for the MD runs and the thermostat parameters were SMASS = 3 and POTIM = 1.0. The simulation ran for 100,000 fs with snapshots taken every 50 fs. NELMIN = 4 ensured sufficient electronic steps were taken at each MD step. For this MD data, only the k-point at Γ was used. After the MD runs, each independent snapshot was evaluated again with VASP, but using a 4 × 4 × 4 MP kpoint grid.

Potential fitting
The GAP model was fitted to the active-learned dataset using the QUIP package available from https://github.com/libAtoms/QUIP, using a sum of a two-body term with Gaussian kernels of pairwise distances and a manybody term with a SOAP kernel, a combination that has produced successful fits of materials in the past [52][53][54][55] . Parameters for the two-and many-body parts of the GAP model are summarized in Tables 1 and 2, and are broadly in line with what were used in the previous works. The σ values control regularization in the GAP model, and can be broadly thought of as target accuracies; they were set to 10 −3 eV for energies (per atom), 10 −3 eV/Å for forces, and 0.02 eV for virials (per atom). Their relative magnitudes also control the trade-off between the fit accuracies in energies, forces, and virials.
The MTP model with polynomial degree up to 16 25 with 188 adjustable fitting parameters was trained on the same dataset as GAP. Table 3 summarizes the parameters needed to recreate the MTP model. The fitting weights (roughly corresponding to σ parameters of GAP) were 10 −3 eV, 1.4 × 10 −2 eV/Å, and 0.04 eV for energy, force, and stress, respectively. This is somewhat different from the parameters used for GAP; however, as we verified, this does not significantly affect the results.

NS: dataset augmentation
As described below, NS simulates atoms at extremely high temperatures that are well outside of the typical active-learned dataset described above. The MTP potential used for NS had to be trained using a slightly augmented dataset, to avoid the formation of dimers in the gaseous phase.
As the first step to constructing the augmented dataset, we identified 67 structures that are within 5 meV/atom from the convex hull of stable Ag-Pd structures. These structures were periodically repeated to form supercells with 32-64 atoms. These structures were used as initial

NS: methodology
The constant pressure NS method 40,41 was used to calculate phase diagrams by sampling the entire potential energy surface with corresponding configuration space volumes to calculate the isobaric-isothermal partition function. The specific heat, which is the second derivative with respect to the temperature of the partition function, shows peaks at phase transitions, and we use temperatures of specific heat maxima as estimates of the corresponding transition temperatures. While the NS method has previously been applied to multicomponent systems, those simulations assumed constant composition. However, it is possible for phase separation to occur in temperature-composition space, which would be neglected by this constraint. Here we have extended the constant pressure NS method to a semi-grand-canonical (sGC) version 56 , where the total number of atoms is constant, but the numbers of the individual species are allowed to vary. This is implemented by carrying out the NS procedure on a free energy F defined as where N i and μ i are the number of atoms and chemical potential of species i, respectively, and the sum is carried out over all species. To explore these degrees of freedom, we also added Monte-Carlo steps that propose the changing of the species of a randomly selected atom. Note that since the procedure is invariant to shifts in the total energy, the total number of particles is conserved, and only two species are present; the simulation is entirely characterized by the difference in chemical potentials Δμ.
To calculate the phase diagram in temperature-composition space, as it is usually plotted, we carry out sGC NS runs at a range of values of Δμ. In the sGC framework, the composition is an output of the simulation, and its value can be calculated as a function of temperature using the same ensemble average (with NS phase space volumes and Boltzmann weights) as any other quantity in the NS approach. For phase transitions that cross phase-separated regions, we would in principle expect discontinuous changes in composition (analogous to discontinuous changes in structure, internal energy, etc.) across the transition, but these will be broadened by finite-size effects. We also compare these results to constant composition NS runs, where a single transition temperature is identified with the peak of the C p (T) curve. For one composition, 50%, we continue one of those NS runs to a sufficiently low temperature to search for solid-state phase transitions. The parameters used for both types of NS runs are listed in Table 4 in the notation of ref. 41 , and in all NS simulations one configuration per NS iteration was removed (K r = 1).

Energy, force, and virial predictions
We now compare the performance of GAP and MTP models for the Ag-Pd system. Both the MTP and GAP models were validated against the liquid dataset for energy, force, and virial predictions. Table 5 summarizes the root mean square error (RMSE) in each of these properties for both GAP and MTP. No liquid data were included in the original active-learned dataset. Thus, these predictions represent severe extrapolation. The fact that both machine-learned IPs perform so well in this dataset is evidence of their transferability. The relatively simple approach to building the training set (iterative fitting and relaxing of enumerated superstructures) resulted in IPs that would be reliable in most solid or liquid simulations.
In Fig. 1, a cumulative probability distribution of errors for energy, force, and virial predictions are plotted for GAP and MTP, where errors are calculated relative to DFT. For energy, MTP has a lower cumulative probability of error overall. This difference is less pronounced for the force errors where MTP is only marginally better. Interestingly, the probability of error for virial predictions in MTP deviates significantly from GAP at larger errors. These error statistics are consistent with the ratio of energy, force, and virial weights (σ parameters) for the two models: on the one hand, MTP has >10× higher weights (lower σ values) for energies relative to    Root mean square error of energy, force, and virial predictions for GAP and MTP interatomic potentials validated against the Ag-Pd dataset of~6000 liquid configurations subsampled from ab initio molecular dynamics using VASP. The models were both trained using the same active-learning dataset.
C.W. Rosenbrock et al. forces, whereas GAP uses the same; meanwhile, GAP uses twice the weights (half σ) for the virials.

Phonon predictions
The phonon eigenvalues computed from a force-constants matrix for a crystal structure describe the energy required to excite a specific vibrational mode within the crystal. For a potential to closely reproduce a phonon spectrum, it must accurately approximate the curvature of the potential energy surface for small deformations from its relaxed geometry. Thus, while energy and force validation provides useful insights into the accuracy of a potential for specific points and their slopes on the potential energy surface, validating against phonons gives insight into the second derivatives of the energy surface. Historically, potentials have successfully reproduced phonon spectra for certain compositions and configurations, but not generally for an entire alloy system. To demonstrate the ability of our potentials to produce accurate phonon band structures, we compare DFT-and IPcalculated band structures for face-centered cubic (fcc)-type derivative superstructures 42 of cell sizes from 2 to 6; there are 65 of these cells.
DFT phonon dispersion curves First, with DFT, we relaxed each configuration twice using IBRION = 2 and ISIF = 3, which allows both cell shape and volume to change during relaxation. We then used phonopy to generate frozen phonon displacements. For selecting the supercell, we enumerated the list of all possible hermite normal form (HNF) matrices for each structure and selected the HNF in each case that maximized the distance between periodic images with a supercell size of >32 atoms. When two HNFs were equivalent for both size and distance metric, we selected the one with the larger point group. This procedure allowed us to choose the smallest possible supercell with the highest symmetry subject to the constraint of the maximal distance between periodic images. See the discussion in the appendix of ref. 57 for the utility of using HNF matrices in this context.

Machine-learned phonon dispersion curves
Using both GAP and MTP, we demonstrate here that a single, machine-learned potential can simultaneously approximate phonons across the full compositional space for many configurations, and with good accuracy. In Supplementary Figs. 1 -11, we include additional phonon plots that cover a broad structurecomposition range. In this section, we have chosen two that are interesting for discussion purposes.
In Fig. 2, we plot a typical phonon spectrum for a 50 at.% Ag configuration with four atoms. For this structure, both GAP and MTP approximate the eigenvalues along the special path well. The RMSE, reported within parentheses, is the integrated error across all eigenvalues in the Brillouin zone (BZ), sampled on a 13 × 13 × 13 grid. Figure 3 shows a structure with a dynamic instability (i.e., negative phonons), as reported by DFT. For this structure, both GAP and MTP learn and reproduce this instability, albeit with different accuracies. Both figures demonstrate that the ability of the IPs predicts dynamic changes due to small perturbations. In Supplementary Figs. 1-11, we similarly plot 65 phonon spectra. Table 6 provides statistics for the integrated error across all 65 structures for GAP and MTP. Both GAP and MTP predictions for integrated error are close to 0.1 THz across the full validation set.  Table 5. Similar training and prediction errors indicate a good bias/ variance trade-off (not over-fitting). Importantly, these aggregated results show that across a broad range of alloy compositions and structures for Ag-Pd, machine-learned IPs are in good agreement with DFT in the harmonic approximation for vibrational modes.

Transition pathway
As discussed in the "Introduction," reduced transferability is the price we pay for approximating quantum mechanics with high accuracy. In general, a machine-learned IP is only valid within the subspace in which it was trained. Although it is possible to apply an IP outside of that space, the results will not be trustworthy. We demonstrate this by computing the energy along a smooth, but high energy, transition pathway between two structures. Figure 4 shows two Ag-Pd structures that are connected by a smooth transition (essentially these two structures are identical, except that the upper two atoms switch places). Although the cell must enlarge slightly and distort, the atoms have a clear path to transition from the starting configuration (index 1) to the final configuration (index 11) without colliding. The total energy along the path is also shown in Fig. 4. Note that in the figure, the y-axis is the total energy, not the energy difference between distorted and undistorted cases. Also note that y-axis scale is linear between −1 and +1 eV and logarithmic elsewhere in the upper plot. In the starting configuration, the total energy is~−16 eV. At its highest point on the transition path, the energy is~9.5 eV, a total difference of~25 eV. Such a high energy structure is not problematic for DFT, but it is a big ask to expect a machinelearned IP to accurately extrapolate to this kind of a structure if similar structures were not included in the training dataset. Nevertheless, the GAP does quite well. Although the absolute error of its prediction for the top of the barrier is several eVs, the qualitative behavior is correct. Due to its more local basis functions and built-in regularization, the GAP model provides reasonable physical behavior for the transition between the starting and final configurations. MTP, with its global polynomial basis functions, relies on active learning to ensure that its predictions fall within the interpolation regime. As part of its framework, MTP (like GAP) provides the extrapolation grade 25,43 to distinguish between configurations that can be evaluated reliably and configurations that should be added to the training set to avoid large extrapolation errors. In our test, MTP correctly detects extrapolation, but has a much poorer extrapolation behavior compared to GAP, and this is the price one pays for using unregularised global basis functions. The active-learning approach is general and could be applied to GAP too (using the predicted variance of the underlying Gaussian process as a proxy Fig. 3 Phonon dispersion curves for a 6-atom structure with 16 at. % Ag. Both the RMS errors within parentheses represents the integrated error across all eigenvalues sampled on a 13 × 13 × 13 grid in the Brillouin zone. According to DFT, this structure is not dynamically stable. Both GAP and MTP recover this dynamic instability, although GAP is more accurate. Note that this seed configuration was included in the active-learned dataset that the potentials were fitted to.  for extrapolation 52 ), which would be expected to make its predictions better too. This demonstration should be seen as a warning in the application of machine-learned IPs generally. Using such models safely requires understanding the properties of the basis functions, how the training set was built, which parts of the configuration space were included, and on. In the case of MTP, the extrapolation grade should be used to check against representative samples from the configurations that are expected to be encountered before embarking on large-scale molecular dynamics simulations.

Phase diagram results
The success of the models in learning basic properties and phonons motivates examining the temperature-concentration phase diagram for the alloy system. We used NS with the MTP model to find the liquidus-solidus line, calculate order-disorder transition temperatures, and so on. The MTP model is significantly faster than the current implementation of GAP, and this makes the exploration of the phase space more practical. With respect to speed, the final GAP potential uses 1075 basis functions vs. 569 for the MTP potential. Note that MTP basis functions are polynomial, whereas in our current GAP-SOAP, basis functions require the calculation of an overlap integral. For example, investigation of a single temperature slice of the phase space (for fixed composition and pressure) requires >2 billion evaluations of the potential. This cost is presently prohibitive for GAP but reasonable with MTP.

Liquid-solid transition
Inasmuch as NS cools down from a high-temperature gas phase, we first reproduce the liquid transition behavior as a function of temperature. Each solid line in Fig. 5 shows a trace of the ensemble-averaged composition as a function of temperature that results from a NS run at fixed Δμ = μ Ag − μ Pd . In the hightemperature liquid and low-temperature solid regions, the composition varies smoothly with temperature. The solid-liquid transition is indicated by a sharper horizontal (along the composition x-axis) jump, which we expect would become discontinuous in the large system size limit. The width of the approximate discontinuity indicates the width of the phaseseparated region. As is clear from Fig. 5, the melting behavior of our MTP potential qualitatively matches the experimental line [58][59][60][61][62] . However, although the entire line has roughly the same shape, it is shifted from the experimental results by~200 K. The liquidus-solidus gaps are also in reasonable agreement with experimental data when the same shift of 200 K is included (added in Fig. 5 to facilitate comparison). This shift in temperature is expected for DFT with a Perdew-Burke-Ernzerhof (PBE) functional and has been discussed in the context of other ab initio studies [63][64][65] . Since the shift does not appear to be composition-dependent, the trends are still reliable.
To further verify our NS results, we have also performed a coexistence simulation of an equimolar Ag-Pd system with 16,384 atoms (8 × 8 × 64 four-atom cubic fcc cells). We ran a combined molecular dynamics and Monte-Carlo simulation to model the disordered phase. The coexistence simulations predicted the melting point of~1315 K as compared to~1380 K as predicted by our NS simulations. This is consistent with previous work showing that for NS simulations the finite-size errors in the melting temperature estimated using the full-width at half-maximum of the heat capacity peak was~200 K for 64 particles and 100 K for 128 particles (see the Supplementary information of ref. 40 ).

Solid-solid transition
Another stringent test of the potential is whether it can recover the transition from a disordered solid solution to an ordered phase. Experimentally, it appears that ordered phases are kinetically inhibited by low transition temperatures and would be unlikely to appear in experimental constitutional phase diagrams. All reported phase diagrams 59,66-68 typically show just solidus and liquidus lines and indicate a solid solution below the solidus (see Fig. 5), although one proposed phase diagram guesses two solid-solid transitions, based on some reports of ostensibly ordered phases 58 , reliable evidence for first-order transformations in the solid state is lacking 69 . A careful review of relevant experimental literature [70][71][72][73][74][75][76] since 1961 (after ref. 58 ) suggests that, when Ag-Pd samples are annealed in the air or otherwise exposed to oxygen, the formation of oxide phases can be misinterpreted as the effects of ordering. The experimental literature does not agree on the stoichiometry of these oxide phases, such phases have not been reported in samples not exposed to oxygen, and no structural information for these phases (e.g., from X-ray diffraction (XRD) studies) have been reported. It seems reasonable that there is no formation of intermetallic phases in the temperature ranges reported in the phase diagrams.
Since the melting transition was underestimated, we expected that any disorder-order transition would also take place at a reduced temperature, hence for the 1:1 composition system we continued the sampling well below the melting temperature. As shown in Fig. 5, which includes a region of the phase diagram outside the experimental data, the order-disorder transition does exist at 125 K for this system. Other computational works find the transition temperatures to be similar to what we report here [77][78][79][80] . In Fig. 6, we show the atomic ordering before and after this transition. While the atomic views in panels (a) and (b) clearly show ordering in the stacking planes, we also show simulated XRD in panel (c) where the ordering is clearly discernible.  [66][67][68] . The slight difference between the NS results and coexistence results can be explained by the system size: 64 atoms for NS and 16,384 atoms for coexistence. Error bars for fixed-composition NS are the full-width half-maximum obtained from the heat capacity peaks.

Comparison to CE
As a reference, we include a comparison of the GAP and MTP models to a state-of-the-art CE of the same Ag-Pd system 81 . Table  7 compares the training and validation errors of the GAP, MTP, and CE models on the validation dataset used while building the CE model. Note, however, that the training error reported for each model is with respect to the training set used for that model. Since CE does not train on off-lattice data, its training dataset is different. We also include the number of basis functions in each model. Because the CE basis is well suited to the configurational degrees of freedom, relatively few basis functions are needed compared to GAP and MTP. However, it is worth remembering that the computational cost in evaluating a basis function in each method is different so that these values are only somewhat representative of computational cost.

DISCUSSION
CE has been a go-to tool for computing energies across configuration space for alloys. Because of its speed and applicability over the full range of compositions, it is useful for performing ground-state searches, and even for temperaturedependent phase mapping in certain systems. However, it cannot address dynamic processes that involve structural perturbations, which often limits its usefulness.
This work demonstrates that machine-learned IPs are nearly as good as CE for on-lattice computation of energies 81 . Nevertheless, for on-lattice systems where the atomic displacements are small, CE is still the best choice. Ref. 38 discusses metrics to determine when CE performs well as a function of atomic displacement. But unlike CE, the machine-learned IPs can compute forces, virials, and hessians across the compositional space as well. These additional derivatives of the potential energy surface are sufficiently accurate to approximate dynamic properties like phonon dispersion curves, as well as map out the temperature-composition phase diagram for an alloy. Software for creating datasets and fitting potentials is readily available and easy to use. These potentials, therefore, offer a viable alternative to CE models, and arguably represent the future direction of first-principles computational alloy design.
Although the fitting errors are a factor of 2-3 larger for the GAP and MTP on-lattice predictions, this could be due to lower k-point errors in the DFT data for the CE work, where an "equivalent kpoint" scheme was used to provide fortuitous cancellation for some of this error (but this is only possible in systems with small lattice mismatch 38 ). The predictions in all three methods are nearly at DFT accuracies and sufficient for many practical alloy questions.

CODE AVAILABILITY
Code to produce GAP models is available at libatoms Download. The code implementing MTPs can be obtained for academic purposes free of charge by sending a request to A. Shapeev.  CE cluster expansion. Details on building the dataset and model are given in ref. 81 . While the CE model is slightly better, both GAP and MTP perform well in that region of configuration space where the CE model is valid. We also include the number of basis functions used by each models as a surrogate for performance. Because the CE basis is well-suited to on-lattice modeling, comparatively few basis functions are needed for its predictions. Note that the reported training error is with respect to the dataset used to train the model; since CE does not train on off-lattice data, its training dataset is different from GAP and MTP.