Abstract
First-principles based cluster expansion models are the dominant approach in ab initio thermodynamics of crystalline mixtures, enabling the prediction of phase diagrams and novel ground states. However, despite recent advances, the construction of accurate models still requires a careful and time-consuming manual parameter tuning process for ground-state preservation, since this property is not guaranteed by default. In this paper, we present a systematic and mathematically sound method to obtain cluster expansion models that are guaranteed to preserve the ground states of their reference data. The method builds on the recently introduced compressive sensing paradigm for cluster expansion and employs quadratic programming to impose constraints on the model parameters. The robustness of our methodology is illustrated for two lithium transition metal oxides with relevance for Li-ion battery cathodes, i.e., Li_{2x }Fe_{2(1−x)}O_{2} and Li_{2x }Ti_{2(1−x)}O_{2}, for which the construction of cluster expansion models with compressive sensing alone has proven to be challenging. We demonstrate that our method not only guarantees ground-state preservation on the set of reference structures used for the model construction, but also that out-of-sample ground-state preservation up to relatively large supercell size is achievable through a rapidly converging iterative refinement. This method provides a general tool for building robust, compressed and constrained physical models with predictive power.
Introduction
First-principles density functional theory (DFT) calculations have established themselves as a routine and reliable tool in computational materials science research^{1,2,3,4} and have enabled important advancements in materials discovery.^{1, 2, 5} Although implementations with increasing numerical efficiency and growing computational power have made it possible to simulate ever larger structures with DFT, the method’s intrinsic scaling with the number of electrons prevents applications that require large structures (thousands of atoms) and intensive sampling (millions of configurations). Approximate energy models fitted to DFT reference data, such as cluster expansion (CE) lattice models^{6,7,8,9} or machine learning regression,^{10, 11} can overcome these limitations by constructing computationally more efficient models with accuracies that are close to DFT for a chosen structural and chemical space. One prototypical application for approximate energy models is the prediction of ordered ground states based on an underlying lattice topology.^{12, 13}
The concept of CEs goes back to the Ising model,^{14} which describes the magnetic phases of an atomic lattice in terms of pair interactions. A CE model, or generalized Ising model, is the discrete sum representation of materials properties in terms of lattice site topologies and site interactions, such as site pairs, triplets, quadruplets, and so on. CEs have broad applications in different fields of science,^{6, 15} such as magnetism^{15} and alloy thermodynamics.^{6}
The key challenge in constructing CE models is to determine the expansion coefficients, the effective cluster interactions (ECIs), in a robust fashion through a fit to reference configurations.^{1, 16, 17} Conventional ECI fitting procedures^{1, 16, 17} focus on minimizing the overall difference between the CE fit and the input configurations with respect to the expanded quantity, such as the energy. In many cases, that input quantity may be determined by an accurate ab initio method such as DFT. One essential requirement that each CE fit must meet for practical applications is ground-state preservation, i.e., a physically accurate CE model must reproduce the ground states of the input if only the input configurations are considered. This requirement is important as the ground states usually govern the material properties at relevant temperatures,^{18} such as finite-temperature voltage profiles^{18} and phase diagrams.^{18} In this article, we revisit the ECI fitting problem with a focus on constraints that guarantee ground-state preservation. We propose a robust and efficient scheme to construct ground-state preserving CE models based on compressive sensing^{17, 19} and quadratic programming.^{20}
The manuscript is organized as follows: First, we briefly review the CE formalism and define the ECI fitting problem in a rigorous mathematical manner. We then derive ground-state preserving constraints that can be used in conjunction with a quadratic programming solver. Afterwards, we consider the phase diagrams of two prototypical oxides of practical relevance as benchmark cases. Finally, we compare the strengths and weaknesses of our approach with established methods.
Results and Discussion
Compressed sensing and cluster expansion
For a rigorous mathematical introduction to cluster expansions and their formal relationship to the partition function of crystalline solids, we refer the reader to refs. ^{6, 9, 21} Here, we only illustrate the key features of cluster expansions that are of relevance to the present work.^{17}
The general expression of a cluster expansion Hamiltonian is
where σ is the spin representation of an atomic configuration in which each component σ _{ i } (a spin variable) denotes the occupancy of site i. Following the Ising model convention, σ _{ i } takes on values of ±1 in a binary system, encoding the atomic species on site i. Each product of spin variables, σ _{ i } σ _{ j }... , (spin product) corresponds to a cluster of lattice sites, and the cluster expansion energy E _{CE} is a polynomial of the spin variables weighted by the expansion coefficients J, the ECIs. For brevity, we denote the set of interacting clusters as C. For any cluster c∈C, J _{ c } is the corresponding ECI and σ _{ c } is the corresponding spin product. Note that typically multiple clusters of the same type exist (e.g., the point term for each equivalent site or the cluster corresponding to the nearest-neighbor pair interaction), and symmetry requires that the coefficients of equivalent clusters be identical.^{22} The summation in Eq. (1) is therefore actually over cluster types, and the individual spin products can be replaced by their average over all equivalent clusters, the cluster correlations.
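The display of Eq. (1) is not reproduced in this text. A form consistent with the definitions above is sketched below; it is an assumption rather than a quotation, and in particular the constant term J _{0} (the empty cluster) and the explicit ordering of the sums are illustrative choices.

```latex
E_{\mathrm{CE}}(\boldsymbol{\sigma})
  = J_0 + \sum_{i} J_i\,\sigma_i
        + \sum_{i,j} J_{ij}\,\sigma_i\sigma_j
        + \sum_{i,j,k} J_{ijk}\,\sigma_i\sigma_j\sigma_k + \dots
  = \sum_{c \in C} J_c\,\sigma_c
```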
From Eq. (1) it is obvious that the CE energy is linearly dependent on the ECIs, J, when the configuration σ is fixed. We can thus write
where Π(σ) is the row vector of cluster correlations (with multiplicity incorporated) corresponding to configuration σ. Given a set of input atomic configurations S and their DFT energies E _{DFT,S}, the problem of determining the ECIs can then be naïvely expressed as minimization of the L _{2} norm
where the rows of the feature matrix Π_{ S } are the cluster correlations of the configurations in S. Note that the L _{2} norm is the conventional Euclidean norm, and the general L _{ p } norm \({\left\| {\bf{u}} \right\|_p}\) is defined as:
Simply minimizing the L_{2} norm in Eq. (3) essentially means that the ECIs are fitted such that the average squared difference between the DFT energies and the CE-predicted energies of all structures is minimized. However, such a direct minimization of the error function leads to overfitting when the number of ECIs (the model parameters) exceeds or becomes close to the number of reference configurations (the fitting data), i.e., when the system of linear equations Eq. (3) is underdetermined. Overfitting means that the ECIs accurately reproduce the energies of the reference structures (in-sample data) but deliver poor generalization, i.e., the CE model does not reliably predict the energy of other unseen structures (out-of-sample data). A standard method to avoid overfitting is regularization,^{23} i.e., the simultaneous minimization of the sum of the error function and the magnitude of the model parameters. Compressive sensing^{17, 19} implements L _{1} norm regularization, which has been shown to be a nearly optimal and robust way to reconstruct signals from a small number of data points.^{24} The compressive sensing formulation of the cluster expansion problem is:
where μ is a parameter controlling the sparseness of the fit. A higher value of μ shifts the weight towards minimizing the L _{1} norm, whereas for small μ the minimization of the L _{2} error dominates. The L _{1} norm of a vector is a measure of the vector’s sparseness,^{24} thus larger μ values result in fewer non-zero ECIs and thereby reduce overfitting. An optimal μ value can be determined by minimizing the error of the CE model on unseen data.^{17}
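To make the role of μ concrete, the following minimal sketch fits a toy problem, assuming that Eq. (5) has the form min_J ‖E − ΠJ‖_2² + μ‖J‖_1. The solver used here is iterative soft thresholding (ISTA), chosen only because it fits in a few lines; it is not the solver used in this work. The correlation matrix `Pi`, the energies `E` and all function names are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry towards zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def l1_fit(Pi, E, mu, n_iter=2000):
    """Minimize ||E - Pi @ J||_2^2 + mu * ||J||_1 by proximal gradient (ISTA)."""
    # Step size 1/L, where L = 2 * sigma_max(Pi)^2 is the Lipschitz constant
    # of the gradient of the quadratic term.
    step = 1.0 / (2.0 * np.linalg.norm(Pi, 2) ** 2)
    J = np.zeros(Pi.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * Pi.T @ (Pi @ J - E)
        J = soft_threshold(J - step * grad, step * mu)
    return J

rng = np.random.default_rng(0)
Pi = rng.uniform(-1.0, 1.0, size=(30, 60))   # toy data: 30 structures, 60 clusters
J_true = np.zeros(60)
J_true[[0, 3, 7]] = [1.0, -0.5, 0.25]        # only three non-zero "ECIs"
E = Pi @ J_true                              # noise-free toy energies

J_fit = l1_fit(Pi, E, mu=0.1)
n_nonzero = int(np.sum(J_fit != 0.0))        # number of ECIs kept by the fit
print("non-zero ECIs:", n_nonzero)
```

In this formulation the fit collapses to J = 0 once μ exceeds 2·max|Π^T E|, which illustrates the underfitting limit of very strong regularization.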
Constrained cluster expansion models
For practical applications it is often desirable that a CE model preserves some invariants on the input data. For example, predicting the qualitative features of a phase diagram may require that the energetic order of all structures is exactly preserved while quantitative errors in the structural energies might be tolerable.^{25} This is because the set of ground states and the ranking of excited states close in energy determine the topology of a phase diagram more than the actual energies themselves.^{26} As the energy difference between competing structures is typically small, minimization of the average error in reproducing the DFT energy does not by itself enforce the structural energy order one wants to preserve. As a result, even very small energy errors in the CE can qualitatively change a phase diagram when they lead to new ground states.^{26} We have found in practice that trying to preserve the structural ordering and ground states by increasing the relative weights of these input data rapidly leads to overfitting in the CE. In the following, we will develop a methodology that allows including constraints in the ECI optimization problem in a systematic, unbiased fashion and without overfitting.
In recent years, mathematical programming has been a rapidly growing field that enables the highly efficient, systematic and rigorous solution of problems in different standard forms.^{27} One particularly active area is quadratic programming (QP),^{28} for which robust solvers exist,^{20} and a variety of different approaches have been researched and implemented, such as the interior point method, the active set method and the augmented Lagrangian method.^{28} In essence, quadratic programming is a mathematical optimization technique for problems of the following specific form:
where Q is a positive semidefinite matrix, A and C are real matrices, and b, c and d are real vectors. Note that a matrix Q is positive semidefinite if and only if x ^{T} Qx ≥ 0 for all real vectors x. The semidefinite property is essential so that the optimization problem is convex. Also note that when Q = 0 Eq. (6) reduces to a standard linear programming problem, which was introduced to CE optimization in ref. ^{25}
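The display of Eq. (6) is not reproduced in this text. The conventional QP standard form that matches the symbols named above is sketched below. The factor 1/2 and the assignment of (A, b) to the inequality constraints and (C, d) to the equality constraints are assumptions based on common convention.

```latex
\min_{\mathbf{x}} \;\; \tfrac{1}{2}\,\mathbf{x}^{T} Q\,\mathbf{x} + \mathbf{c}^{T}\mathbf{x}
\qquad \text{subject to} \qquad
A\,\mathbf{x} \le \mathbf{b}, \qquad C\,\mathbf{x} = \mathbf{d}
```

Setting Q = 0 leaves a linear objective with linear constraints, i.e., a linear program, in line with the remark above.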
Our key strategy for CE fitting is to cast the compressive sensing problem Eq. (5) into a quadratic programming problem Eq. (6)^{29} and to add constraints that guarantee groundstate preservation. Explicitly, Eq. (5) can be rewritten as:
In the conversion step in Eq. (7) auxiliary variables z _{ c }, corresponding to constraints on J, have been introduced to remove the L _{1} norm of Eq. (5). The equivalence in Eq. (7) holds because every z _{ c } can be independently minimized while it is constrained to be larger than ± J _{ c }. Note that the QP formulation in Eq. (6) does not allow absolute value operations, so that two separate linear constraints are required in Eq. (7), z _{ c } ≥ J _{ c } and z _{ c } ≥ −J _{ c }, even though in combination they essentially express the absolute value constraint z _{ c } ≥ |J _{ c }|. The conversion step in Eq. (8) is a direct expansion of the L _{2} norm into vector multiplication. Note that \({\Pi _{\rm{S}}}^T{\Pi _{\rm{S}}}\) is always positive semidefinite, since for every vector x, \({{\bf{x}}^T}{\Pi _{\rm{S}}}^T{\Pi _{\rm{S}}}{\bf{x}} = {\left( {{\Pi _{\rm{S}}}{\bf{x}}} \right)^T}{\Pi _{\rm{S}}}{\bf{x}} \ge 0\). Hence, we have arrived at a formulation of the compressive sensing ECI problem Eq. (5) in terms of a QP problem.
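The conversion described above can be checked numerically. The sketch below assembles QP matrices for the stacked variable x = [J, z], assuming the compressive sensing objective ‖E − ΠJ‖_2² + μ‖J‖_1, and verifies that the QP objective reproduces it when z = |J|. The function `build_qp`, the omission of the conventional factor 1/2 and the random toy data are illustrative assumptions.

```python
import numpy as np

def build_qp(Pi, E, mu):
    """Return (Q, c, A, b, const) such that, for x = [J, z],
    x @ Q @ x + c @ x + const == ||E - Pi @ J||_2^2 + mu * sum(z),
    and A @ x <= b encodes z_c >= J_c and z_c >= -J_c."""
    n = Pi.shape[1]
    Q = np.zeros((2 * n, 2 * n))
    Q[:n, :n] = Pi.T @ Pi                      # quadratic part acts on J only
    c = np.concatenate([-2.0 * Pi.T @ E, mu * np.ones(n)])
    I = np.eye(n)
    A = np.block([[I, -I],                     #  J - z <= 0
                  [-I, -I]])                   # -J - z <= 0
    b = np.zeros(2 * n)
    return Q, c, A, b, float(E @ E)

rng = np.random.default_rng(1)
Pi = rng.uniform(-1.0, 1.0, size=(8, 5))       # toy correlation matrix
E = rng.normal(size=8)                         # toy energies
J = rng.normal(size=5)                         # arbitrary trial ECIs
mu = 0.3

Q, c, A, b, const = build_qp(Pi, E, mu)
x = np.concatenate([J, np.abs(J)])             # z = |J| is the tightest feasible z
qp_value = x @ Q @ x + c @ x + const
cs_value = np.sum((E - Pi @ J) ** 2) + mu * np.sum(np.abs(J))
print(np.isclose(qp_value, cs_value))
```

The matrices can be handed to any QP solver that accepts linear inequality constraints; the ground-state constraints introduced below would simply add further rows to `A` and `b`.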
The second key step of our methodology is to include suitable constraints for ground-state preservation in the QP formulation. Ground states, i.e., those configurations that are thermodynamically stable at zero temperature (0 K), can be identified by constructing the lower convex hull of the formation energies.^{30} When the energy of a configuration is above the ground-state hull it is thermodynamically unstable with respect to decomposition into neighboring ground states.
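As an illustration of the hull construction, the following minimal sketch computes the lower convex hull of (composition, formation energy) points for a binary system with a monotone-chain scan. The function name and the numerical values are hypothetical.

```python
def lower_hull(points):
    """Vertices of the lower convex hull of (x, energy) points, sorted by x."""
    best = {}
    for x, e in points:                        # keep the lowest energy per composition
        best[x] = min(e, best.get(x, e))
    hull = []
    for p in sorted(best.items()):
        # Drop the last vertex while it lies on or above the segment hull[-2] -> p.
        while len(hull) >= 2:
            (x1, e1), (x2, e2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - e1) - (e2 - e1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

# Hypothetical formation energies (x, E_f); the end members define E_f = 0.
data = [(0.0, 0.0), (0.25, -0.10), (0.5, -0.30), (0.5, -0.05),
        (0.75, -0.12), (1.0, 0.0)]
ground_states = lower_hull(data)
print(ground_states)   # [(0.0, 0.0), (0.5, -0.3), (1.0, 0.0)]
```

In this toy set the structures at x = 0.25 and x = 0.75 lie above the tie lines to x = 0.5 and are therefore unstable with respect to decomposition.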
Note that there are two different scenarios that lead to inconsistent ground states from an ECI fit: The first type of ground-state inconsistency occurs when the energy of some non-ground-state configuration is underestimated so much that it erroneously becomes a ground state of the CE model. This problem is illustrated in Fig. 1 (labeled with P_{1}), where the energy of configuration s_{1} is predicted to be below its decomposition line in the input data, i.e., below the convex combination of configurations h_{2} and h_{3} (shown as the line connecting the points). To constrain the QP system such that no inconsistency of type 1 occurs, we add the first constraint:
(C1) for each configuration that is not on the ground-state hull (i.e., a configuration that would thermodynamically decompose into ground states), we require that its CE configuration energy is greater than its CE decomposition line. To express this condition formally, we denote the ith ground-state configuration (i.e., the ith configuration on the lower convex hull) by σ _{ h,i } with i ∈ H. With this notation, the decomposition of an unstable configuration σ _{ s } into the stable ground states can be expressed as Dec _{ H }(σ _{ s }) = {(x _{ i }(σ _{ s }),σ _{ h,i })_{ i∈H }} where x _{ i }(σ _{ s }) is the fraction of σ _{ h,i } in the decomposition products. The constraint to remove ground-state inconsistencies of type 1 becomes
where ε is some small number used as numerical tolerance.
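The display of Eq. (9) is not reproduced in this text. One form consistent with the verbal statement of (C1) and with the linear relation E _{CE}(σ) = Π(σ)J of Eq. (2) is sketched below; the exact normalization of the energies is an assumption.

```latex
E_{\mathrm{CE}}(\boldsymbol{\sigma}_s)
  - \sum_{i \in H} x_i(\boldsymbol{\sigma}_s)\, E_{\mathrm{CE}}(\boldsymbol{\sigma}_{h,i})
  \;\ge\; \varepsilon
\quad \Longleftrightarrow \quad
\Big[\, \Pi(\boldsymbol{\sigma}_s)
  - \sum_{i \in H} x_i(\boldsymbol{\sigma}_s)\, \Pi(\boldsymbol{\sigma}_{h,i}) \Big]\, \mathbf{J}
  \;\ge\; \varepsilon
```

Written in the second way, each unstable configuration contributes one linear inequality in J, which is the kind of constraint a QP solver accepts.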
Introducing constraint (C1) in Eq. (9) to the QP problem in Eq. (8) guarantees that all ground states of the CE model are also ground states of the DFT input data. However, the converse is not necessarily true, i.e., a DFT ground-state configuration might not be a ground state of the CE model. This scenario is shown in Fig. 1 (P_{2}), where configuration h_{2} has a greater CE energy than its convex hull decomposition line defined by h_{1} and h_{3}. To remove this second type of ground-state inconsistency, we introduce a second constraint:
(C2) for each ground-state configuration σ _{ h } (i.e., for each configuration σ _{ h } on the lower convex hull), we require that its energy is smaller than the energy of a modified hull that results from removing σ _{ h } from the set of input ground states. Formally, given a ground-state configuration σ _{ h } on the input hull, we consider its decomposition into a modified ground-state hull as Dec _{ H\h }(σ _{ h }) = {(x _{ i,H\h }(σ _{ h }),σ _{ h,i })_{ i∈H\h }} where H\h is the index set of all input hull configurations not including σ _{ h }, and x _{ i,H\h }(σ _{ h }) is the fraction of decomposition product σ _{ h,i }. The constraint to remove ground-state inconsistencies of type 2 thus becomes
Constraint (C2) in Eq. (10) guarantees that all ground-state configurations in the (DFT) input data are also ground states of the CE model. Consequently, by combining (C1) and (C2), a configuration is a ground state of the resulting CE model if and only if it is a ground state of the input data. The full quadratic programming formulation for ground-state preserving CE fitting is
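The displays of Eqs. (10) and (11) are not reproduced in this text. The sketch below gives forms that are consistent with the verbal definitions of (C2) and with the objective described for Eq. (8); whether the tolerance ε also appears in (C2), and the precise arrangement of terms, are assumptions.

```latex
% (C2), assumed form of Eq. (10), for every ground state \sigma_h on the input hull
\sum_{i \in H \setminus h} x_{i,H\setminus h}(\boldsymbol{\sigma}_h)\,
    E_{\mathrm{CE}}(\boldsymbol{\sigma}_{h,i})
  - E_{\mathrm{CE}}(\boldsymbol{\sigma}_h) \;\ge\; \varepsilon

% Assumed form of Eq. (11): objective of Eq. (8) with (C1) and (C2) added
\min_{\mathbf{J},\,\mathbf{z}} \;\;
  \mathbf{J}^{T}\Pi_S^{T}\Pi_S\,\mathbf{J}
  - 2\,\mathbf{E}_{\mathrm{DFT},S}^{T}\,\Pi_S\,\mathbf{J}
  + \mu \sum_{c \in C} z_c
\qquad \text{s.t.} \qquad
  z_c \ge J_c,\;\; z_c \ge -J_c,\;\; \text{(C1)},\;\; \text{(C2)}
```

Since E _{CE} is linear in J, both (C1) and (C2) remain linear inequalities, so the problem stays within the convex QP class of Eq. (6).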
Cation ordering in the rocksalt-type lithium transition metal oxide systems Li_{ x }Fe_{(1−x)}O and Li_{ x }Ti_{(1−x)}O
We demonstrate the effectiveness of the QP approach to cation ordering in two oxide systems. Rocksalt-type lithium transition metal oxides, LiMO_{2} (M = one or more transition metal species), are the most important class of cathode materials for lithium-ion batteries in consumer electronics.^{31} During the last decade, materials with lithium excess compositions, Li_{(1+x)}M_{(1−x)}O_{2}, have attracted much interest owing to their high lithium storage capacities.^{32, 33} One criterion for the suitability of Li_{(1+x)}M_{(1−x)}O_{2} as cathode material is whether the material is a sufficiently good conductor for Li ions, which critically depends on the cation (Li, M) ordering in the structure.^{34, 35} While conventional oxide-based cathode materials form in ordered crystal structures (such as layered LiCoO_{2} ^{36}), several cation-disordered lithium-excess materials with high practical capacities have recently been discovered.^{35, 37} Some of these new compositions contain Ti^{38} and Fe,^{37} which makes them attractive for technological applications because of the metals’ high abundance and non-toxicity. However, LiTiO_{2} ^{39, 40} and LiFeO_{2} ^{41, 42} are the only LiMO_{2} with single transition metal species that form in cation-disordered structures in solid-state synthesis, and consequently their configurational phase diagrams are challenging to investigate experimentally.
In the following we employ the ground-state preserving QP methodology developed above to investigate the phase diagrams of Li_{ x }Fe_{(1−x)}O and Li_{ x }Ti_{(1−x)}O^{39,40,41,42} to obtain a better understanding of the relevant atomic configurations. The input consists of 863 and 602 reference configurations for Li_{ x }Fe_{(1−x)}O and Li_{ x }Ti_{(1−x)}O, respectively. DFT calculations for Li_{ x }Fe_{(1−x)}O configurations were performed within the Hubbard-U corrected Generalized Gradient Approximation (GGA+U), using the PBE exchange-correlation functional.^{43, 44} The U values are taken from the work of Jain et al.^{45} DFT calculations for Li_{ x }Ti_{(1−x)}O configurations did not employ a Hubbard-U correction. For both systems, an initial set of configurations at x = 0.5 with supercell sizes up to 8 sites was generated using the enumeration algorithms by Hart et al.,^{46} and the reference sets were subsequently refined by including ground-state configurations of preliminary cluster expansions determined using a recently published ground-state search algorithm for lattice models.^{47} The corresponding ground-state input hulls are shown in Fig. 2 as black dots and lines. We note that both systems, Li_{ x }Fe_{(1−x)}O and Li_{ x }Ti_{(1−x)}O, cannot easily be fitted using the conventional (unconstrained) compressive sensing technique, as the approach gives rise to a number of spurious ground states as shown in Fig. 2 (with optimal μ parameter as will be discussed below). Specifically, some Li_{ x }Fe_{(1−x)}O configurations with x = 1/3, 5/8 and 9/16 and Li_{ x }Ti_{(1−x)}O configurations with x = 1/8, 1/6, 1/4, 5/9, 3/5, 5/8 are erroneously predicted to be ground states (i.e., inconsistencies of type 1 as defined above). These over-stabilized configurations are marked with arrows in Fig. 2. In addition, the actual Li_{ x }Ti_{(1−x)}O ground-state configurations with x = 1/10, 1/5, 8/15 become unstable in the compressive sensing CE model (inconsistencies of type 2).
These examples demonstrate that ground-state preservation is not an automatic feature inherent to the compressive sensing approach, and the problem needs to be addressed before predictive simulations of materials systems are possible.
As seen in Fig. 2, the QP fitting scheme achieves ground-state preservation for both materials and yields CE hulls that are spanned by the same configurations as the input DFT hulls. However, it is worth noting that the sparseness parameter, μ, of Eq. (11) needs to be carefully selected to arrive at this result. In the following section we will show that μ should be chosen such that the cross-validation error is minimized. The discussion of the cross-validation error is essential in that it provides a standard measure of predictive power of our fitting scheme, which sets the method apart from other approaches for ground-state preservation, such as the adjustment of configuration weights, which will also be shown below.
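The two kinds of inconsistency can be detected by comparing the hull vertices of the reference energies with those of the model energies on the same set of structures. The sketch below does this for a hypothetical binary toy set; the function names and all energies are illustrative and unrelated to the Li–Fe–O and Li–Ti–O data.

```python
def hull_vertices(x, energy):
    """Indices of the structures that are vertices of the lower convex hull."""
    best = {}
    for idx, (xi, ei) in enumerate(zip(x, energy)):
        if xi not in best or ei < energy[best[xi]]:
            best[xi] = idx                      # lowest-energy structure per composition
    hull = []
    for idx in sorted(best.values(), key=lambda k: x[k]):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (x[b] - x[a]) * (energy[idx] - energy[a]) \
                  - (energy[b] - energy[a]) * (x[idx] - x[a])
            if cross <= 0:                      # b lies on or above the segment a -> idx
                hull.pop()
            else:
                break
        hull.append(idx)
    return set(hull)

def ground_state_inconsistencies(x, e_dft, e_ce):
    """Compare the hulls of reference and model energies on the same structures."""
    dft, ce = hull_vertices(x, e_dft), hull_vertices(x, e_ce)
    return {"type_1": sorted(ce - dft),         # spurious ground states of the model
            "type_2": sorted(dft - ce)}         # reference ground states that are lost

# Hypothetical toy data: five structures at compositions x.
x     = [0.0, 0.25, 0.5, 0.75, 1.0]
e_dft = [0.0, -0.10, -0.30, -0.20, 0.0]
e_ce  = [0.0, -0.18, -0.30, -0.14, 0.0]
report = ground_state_inconsistencies(x, e_dft, e_ce)
print(report)   # {'type_1': [1], 'type_2': [3]}
```

Here structure 1 is over-stabilized by the model (type 1), while structure 3 is a reference ground state that the model places above its tie line (type 2).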
Cross-validation of the choice of the sparseness parameter
Cross-validation is the standard way to decide the optimal sparseness of a numerical model, which is generally referred to as the bias-variance tradeoff in statistical inference.^{48} To determine the sparseness parameter μ by means of cross-validation we randomly split the DFT data, D, into N = 10 equal parts. For each part D _{ i }, we define its complement \({\overline D _i}\) as all the DFT data points except those belonging to D _{ i } (formally, \({\overline D _i} \equiv D \setminus {D_i}\)). Next, the QP scheme of Eq. (11) is applied to the complement set \({\overline D _i}\) to obtain a CE fit without the information in part D _{ i }, so that an out-of-sample validation can be performed by calculating the root mean square error (RMSE) of the unseen data D _{ i }. We denote the resulting out-of-sample RMSE as e _{ i,μ }. The cross-validation (cv) score cv _{ μ } given a sparseness parameter μ is then defined as the root mean square of the out-of-sample RMSE over all N data parts, i.e., formally \(c{v_\mu } = \sqrt {\mathop {\sum}\limits_{i = 1...N} {e_{i,\mu }^2{\rm{/}}N} } \). Using this definition, the optimal μ resulting in the model with greatest predictive power can be determined by plotting cv _{ μ } against μ and selecting the value of μ that minimizes the cv score.
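The cv score defined above can be sketched in a few lines. In the example below, plain least squares is used as a stand-in for the constrained QP fit, which is an assumption made to keep the block self-contained; `cv_score`, `least_squares_fit` and the synthetic data are hypothetical.

```python
import numpy as np

def cv_score(Pi, E, fit, n_parts=10, seed=0):
    """Root mean square of the out-of-sample RMSEs over n_parts random parts."""
    order = np.random.default_rng(seed).permutation(len(E))
    parts = np.array_split(order, n_parts)      # near-equal parts if not divisible
    sq_rmse = []
    for held_out in parts:
        train = np.setdiff1d(order, held_out)   # complement of the held-out part
        J = fit(Pi[train], E[train])
        residual = E[held_out] - Pi[held_out] @ J
        sq_rmse.append(np.mean(residual ** 2))  # e_i^2 for this part
    return float(np.sqrt(np.mean(sq_rmse)))

def least_squares_fit(Pi, E):
    """Stand-in for the QP fit: unconstrained, unregularized least squares."""
    return np.linalg.lstsq(Pi, E, rcond=None)[0]

rng = np.random.default_rng(0)
Pi = rng.uniform(-1.0, 1.0, size=(50, 4))       # toy correlations
J_true = np.array([0.5, -0.2, 0.1, 0.0])
score_exact = cv_score(Pi, Pi @ J_true, least_squares_fit)
score_noisy = cv_score(Pi, Pi @ J_true + 0.05 * rng.normal(size=50),
                       least_squares_fit)
print(score_exact, score_noisy)
```

To select μ, one would evaluate `cv_score` with a μ-dependent fit on a grid of μ values and keep the value with the smallest score.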
The cross-validation score cv _{ μ } of the Li_{ x }Fe_{(1−x)}O system as function of μ is shown in Fig. 3a for different numbers of input clusters. Note that the number of input clusters to draw from is determined by a maximum interaction order (e.g., triplets) and a radial cutoff. Across all five curves, the cv score initially decreases and then increases with increasing μ. We consider the concept of the bias-variance tradeoff^{48} to understand this behavior: The input DFT energies may conceptually be understood as the sum of an ideal cluster expansion and a certain degree of noise ε, i.e., E _{DFT} = Π(σ)J + ε. Here, the noise could originate from numerical errors in the DFT energies. For small values of μ, the CE fit uses all available degrees of freedom (i.e., all ECIs) to incorporate the noise ε into the CE model, resulting in severe overfitting. As the value of μ increases, the number of non-zero ECIs decreases and the effect of noise, i.e., the variance in fitting, becomes less severe. However, when μ becomes too large, the bias that ECIs should tend to 0 dominates over the data itself, resulting in severe underfitting and thus increasing cv scores. As a consequence, the cv score has a pronounced minimum, which determines the optimal μ corresponding to the best tradeoff between the variance and bias during fitting.
As seen in Fig. 3a, for small values of μ where \(\log \left( \mu \right) < -2\) the cv score increases dramatically with the number of input clusters, indicating overfitting as the result of insufficient regularization. As the sparseness parameter increases beyond \(\log \left( \mu \right) > -2\), the cv score becomes less sensitive with respect to the number of input clusters, indicating that the regularization is effective and that most non-essential ECIs are fitted to zero regardless of the number of input clusters. The optimal cv scores are found for \(-1 \le \log \left( \mu \right) \le 0\) and are plotted in Fig. 3b for different numbers of input clusters (labeled “QP methodology”). As seen in the figure, the optimal cv score decreases from 0.0345 eV/f.u. (formula unit) for 54 input clusters to 0.0261 eV/f.u. for 625 input clusters. The cv score stabilizes at 625 input clusters and barely changes for 1184 input clusters (0.0260 eV/f.u.). Hence, we conclude that 625 input clusters and a sparseness parameter μ = 0.144 result in a CE model with optimal predictive power for the Li_{ x }Fe_{(1−x)}O system. The corresponding analysis for the Li_{ x }Ti_{(1−x)}O system is shown in Fig. 3c, and the optimal parameters, μ = 0.144 and 411 input clusters, yield a cv score of 0.0331 eV/f.u.
In-sample ground-state preservation and comparison with conventional weight adjustment
By construction, the QP form Eq. (11) guarantees that the CE fit preserves the ground states of the reference data set. Conventionally, such in-sample ground-state preservation is often achieved by assigning weights to the reference configurations to manually bias the fit. In the following, we compare the performance of the QP methodology with the conventional weight adjustment technique to further assess the utility of our approach. Before we detail the weight adjustment method, we briefly consider how configuration weights can be included in the QP approach in practice. For this purpose, we define a diagonal weight matrix W whose diagonal entries w _{ i,i } correspond to the weight of the i^{th} input configuration. With this definition, W can be incorporated into Eq. (5) so that the in-sample fitting errors are multiplied by the weights:
Note that a large w _{ i,i } strongly biases the fitting error of the i^{th} input configuration towards 0. The concrete weight adjustment procedure that we employed in this work is as follows:

(1) Initialize all weights to be 1.
(2) Perform QP to construct a CE model.
(3) Check if the CE model preserves in-sample ground states. If it does, the ground-state preserving fit is completed. If it does not, we define T as the set of all DFT and CE ground-state configurations.
Further, we define the maximum CE hull error as \(er{r_{{\rm{hull}}}} \equiv \mathop {{\max }}\limits_{j \in {\bf{T}}} \left| {{{\bf{E}}_{{\rm{CE}},\, j}} - {{\bf{E}}_{{\rm{DFT}},\, j}}} \right|\) and introduce a weight-increment set T ′ ≡ {i∈T:E_{CE, i }−E_{DFT, i }>0.5err _{hull}}. For each configuration i∈T ′, w _{ i,i } is increased by a factor of \(\sqrt[4]{2}\) ≈ 1.19. The procedure is continued with step (2).
This weight adjustment scheme guarantees that in-sample ground states are preserved, since it iteratively converges the CE hull to the DFT hull and corrects spurious ground-state configurations.
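The weight update of step (3) can be sketched as follows. The selection criterion is taken literally from the text, i.e., it uses the signed difference E _{CE} − E _{DFT}; if the original display uses an absolute value, the comparison would change accordingly. The function `update_weights` and the toy energies are hypothetical, and the QP refit of step (2) is not included.

```python
import numpy as np

def update_weights(weights, e_ce, e_dft, T, factor=2 ** 0.25):
    """One pass of step (3): raise the weights of poorly fitted hull structures.

    T is the index set of all DFT and CE ground-state configurations."""
    T = np.asarray(sorted(T))
    diff = e_ce[T] - e_dft[T]
    err_hull = np.max(np.abs(diff))             # largest CE error on the hulls
    selected = T[diff > 0.5 * err_hull]         # weight-increment set T'
    new_weights = weights.copy()
    new_weights[selected] *= factor             # 2**(1/4) ~ 1.19
    return new_weights, err_hull

# Hypothetical energies for five structures; T holds the hull structures.
e_dft = np.array([0.00, -0.10, -0.30, -0.20, 0.00])
e_ce  = np.array([0.01, -0.18, -0.30, -0.14, 0.00])
weights, err_hull = update_weights(np.ones(5), e_ce, e_dft, T={1, 2, 3})
print(weights, err_hull)
```

In the full procedure this update alternates with a refit until the in-sample hulls agree, so weights on persistently misfitted structures grow geometrically.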
A comparison of the optimal cv scores obtained for different numbers of input clusters using both methods is shown in Figs. 3b, c. The cv score is, once again, used as a standard measure for the predictive power of the fits. In the case of the Li_{ x }Fe_{(1−x)}O system, the predictive power of the QP fit is consistently better than that of the fit obtained using the weight adjustment technique, and the improvement of the cv score is generally found to be around 2 meV/f.u. or 10%. For the Li_{ x }Ti_{(1−x)}O system, the cv score of the QP CE fit is also better, by about 3 meV/f.u. or 10%, than that of the CE fit from weight adjustment, except for 625 input clusters, for which both methods give equivalent results. However, considering all numbers of input clusters, the overall best cv score for the QP method is 1.5 meV/f.u. or 5% better. While absolute energy errors on the order of a few meV/f.u. are close to the inherent error of density functional theory, similar errors in the relative energies of different configurations may add up and thereby give rise to qualitatively different phase diagrams.
Generally, we observed that the weight adjustment method biases some configurations by more than a factor of one thousand (w _{ i,i } > 10^{3}), resulting in overfitting of those particular configurations, whereas the QP scheme shows no evidence of such a partial overfitting.
In summary, we conclude that the QP methodology of this work has significant advantages over conventional weight adjustment for the preservation of in-sample ground states. However, as will be demonstrated in the following section, the superiority of the QP approach becomes truly evident when out-of-sample configurations are considered.
Out-of-sample ground-state preservation
Suppose that we are certain that the set of reference configurations available for the construction of the CE model comprises all physical ground states of the system. This situation could occur after an extensive exploration of the configurational space or when the DFT data agrees exceptionally well with experiment. With such confidence in the reference data set, we would like to guarantee that the fitted CE model not only reproduces the ground states of the reference data, but also does not possess any additional ground states that are not already present in the reference data. We call this property out-of-sample ground-state preservation. In the following, we describe an iterative procedure for constructing CE models that guarantee out-of-sample ground-state preservation up to a given number of periodic sites. We will further show that this procedure is generally a useful strategy to construct CE models even when, initially, it is not known whether ground states outside of the reference set exist.
The QP formulation established in Eq. (11) provides ground-state preservation within the set of input data. However, out-of-sample ground-state preservation is not guaranteed. In principle, if the true configuration polytope,^{49} P, is known for a set of possible ECIs, i.e., σ∈P can be added and solved within a QP, one could add the following constraint:
to Eq. (11) and the corresponding optimization problem will result in a globally ground-state preserving CE fit. In practice, however, solving the configurational polytope for an arbitrary CE is an undecidable problem.^{50} Although this does not necessarily mean that finding a ground-state preserving fit is globally undecidable as well, this fact hints at the intrinsic difficulty of the out-of-sample ground-state preservation problem.
Instead of determining a priori constraints that guarantee out-of-sample preservation, we first examine a CE fit with in-sample ground-state preservation obtained from the QP methodology with optimal parameters (sparseness, number of input clusters) and determine all ground states of the CE model up to a defined system size using the methodology of ref. ^{47} The ground-state hull defined by the input configurations is denoted as the in-sample hull, whereas we refer to the hull that is based on all identified ground states as the out-of-sample hull. A comparison of the in-sample and out-of-sample hulls for Li_{ x }Fe_{(1−x)}O and Li_{ x }Ti_{(1−x)}O for supercell sizes with up to 16 sites is shown in Fig. 4. For Li_{ x }Fe_{(1−x)}O, one extra ground state at x = 5/8 is identified that is predicted to be 6 meV below the in-sample hull. Even though the distance between the in-sample and out-of-sample hulls is small (6 meV), this CE would produce a qualitatively wrong phase diagram due to the spurious ground state at x = 5/8. For Li_{ x }Ti_{(1−x)}O, the discrepancy between the in-sample CE hull and the out-of-sample CE hull is even more severe, as shown in Fig. 4b.
In the following we would like to arrive at a scheme to construct a CE that does not lead to additional ground states, i.e., that achieves out-of-sample ground-state preservation. Such a scheme is useful for the efficient determination of new ground-state configurations and for self-consistent CE construction. Instead of determining the true configurational polytope of Eq. (14), we arrive at a CE model with out-of-sample ground-state preservation iteratively by determining the global ground states of preliminary CE models (as above) to identify those configurations σ∈P for which Eq. (14) is not satisfied. Afterwards, without additional DFT calculations, the constraints corresponding to these configurations σ are added to the QP form as in Eq. (14). By iteratively calculating the ground-state hulls and adding further constraints, global ground-state preservation up to a large supercell size can be achieved. The procedure is illustrated in Fig. 5a.
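The control flow of this iterative refinement can be sketched as a short loop. The two callables stand in for the constrained QP fit and the ground-state search, which are not implemented here; all names and the toy stand-ins are hypothetical.

```python
def refine(fit, find_spurious, max_iter=20):
    """Iterative refinement: fit, search for ground states, add constraints, repeat.

    fit(extra)           -> model fitted with the accumulated extra constraints
    find_spurious(model) -> configurations that fall below the in-sample hull
    Both callables are placeholders for the QP fit and the ground-state search."""
    extra = []                                   # constrained configurations so far
    for iteration in range(1, max_iter + 1):
        model = fit(extra)
        spurious = find_spurious(model)
        if not spurious:                         # in-sample and out-of-sample hulls agree
            return model, extra, iteration
        extra.extend(s for s in spurious if s not in extra)
    raise RuntimeError("no ground-state preserving fit within max_iter iterations")

# Toy stand-ins: the "model" is just the set of constrained configurations, and
# the search reports one still-unconstrained configuration per iteration.
candidates = ["A", "B", "C"]
toy_fit = lambda extra: set(extra)
toy_search = lambda model: [c for c in candidates if c not in model][:1]

model, constraints, n_iter = refine(toy_fit, toy_search)
print(constraints, n_iter)   # ['A', 'B', 'C'] 4
```

No new reference energies enter the loop: each pass only adds inequality constraints for the configurations found by the search, in line with the statement that no additional DFT calculations are needed.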
To demonstrate the convergence of this iterative refinement, we applied the procedure to the two model systems for supercell size of up to 16 sites. The weight adjustment procedure described above is used for comparison. Small initial weights, 10^{−4}, and energies of about 1 meV above the hull are assigned to the predicted new ground states. The results are shown in Fig. 6a for Li_{ x }Fe_{(1−x)}O and Fig. 6b for Li_{ x }Ti_{(1−x)}O. For reference, the figure also shows the results of an iterative refinement using the weight adjustment method. The maximum distance between the insample hull and the outofsample hull is plotted in the upper panel as a measure of the difference between the two hulls as the iteration progresses. The corresponding cv score is plotted in the lower panel as a measure of the predictive power of the CE fit.
As seen in Fig. 6, for both systems, Li_{x}Fe_{(1−x)}O and Li_{x}Ti_{(1−x)}O, the maximum distance between the out-of-sample and in-sample hulls (defined as the energy difference at the same composition x) decreases monotonically to 0 with the QP methodology. The iterative weight adjustment also converges for Li_{x}Fe_{(1−x)}O, though the distance between the hulls fluctuates and does not decay monotonically. For Li_{x}Ti_{(1−x)}O, the weight adjustment method does not converge. More importantly, the cv scores of the QP fits are nearly constant throughout the iterations, whereas the cv score increases continuously for the weight adjustment algorithm. This means that, with the QP methodology, out-of-sample ground-state preservation can be achieved without sacrificing the predictive power of the CE fit. The weight adjustment technique that is often used for CE construction, on the other hand, is not guaranteed to converge and tends to achieve ground-state preservation at the cost of predictive power (increasing cv score). We therefore conclude that the QP methodology developed in the present work allows for the systematic construction of CE models with in-sample and out-of-sample ground-state preservation.
The results above are based on an exact ground-state search for system sizes of up to 16 sites. However, phase diagram calculations via Monte Carlo simulations may require much larger supercell sizes. To construct CE models that are in practice ground-state preserving even for sufficiently large system sizes, the exact ground-state search may be replaced by simulated annealing, which can identify plausible ground states for larger supercell sizes (but cannot prove that all ground states have been identified; see ref. ^{47} for a more detailed discussion). We repeated the iterative procedure of Fig. 5a for Li_{x}Ti_{(1−x)}O for supercell sizes with up to 512 sites using simulated annealing, and the results are depicted in Fig. 6c.
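As an illustration of how simulated annealing proposes low-energy configurations, the following minimal sketch anneals a toy one-dimensional ring with a single nearest-neighbour interaction. It is not the search used in this work: the model, the geometric cooling schedule, and all parameter values are assumptions, and the composition is left unconstrained, whereas a search for hull vertices would be carried out per composition or with a chemical potential term.

```python
import math
import random
from typing import List, Tuple


def ring_energy(spins: List[int], j_nn: float) -> float:
    """Energy of a 1D ring: j_nn times the sum of nearest-neighbour spin products."""
    n = len(spins)
    return j_nn * sum(spins[i] * spins[(i + 1) % n] for i in range(n))


def anneal(n_sites: int = 16, j_nn: float = 1.0, t_start: float = 2.0,
           t_end: float = 0.05, n_steps: int = 20000,
           seed: int = 0) -> Tuple[List[int], float]:
    """Metropolis single-spin-flip annealing; returns the best state visited."""
    rng = random.Random(seed)
    spins = [rng.choice((-1, 1)) for _ in range(n_sites)]
    energy = ring_energy(spins, j_nn)
    best_spins, best_energy = spins[:], energy
    for step in range(n_steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / (n_steps - 1))
        i = rng.randrange(n_sites)
        # Flipping spin i changes only its two bonds (spins[i - 1] wraps at i = 0).
        delta = -2.0 * j_nn * spins[i] * (spins[i - 1] + spins[(i + 1) % n_sites])
        if delta <= 0.0 or rng.random() < math.exp(-delta / temp):
            spins[i] = -spins[i]
            energy += delta
            if energy < best_energy:
                best_spins, best_energy = spins[:], energy
    return best_spins, best_energy
```

The returned configuration is only a candidate ground state. For this toy model the exact minimum (the alternating pattern with energy −n_sites · j_nn for positive j_nn and even n_sites) is known, but in general the annealing result carries no proof of optimality, which is the limitation noted above.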
As for the smaller cell sizes (Fig. 6a), with the QP methodology the distance between the in-sample and out-of-sample CE hulls decreases monotonically to 0 within 7 iterations, and the cv score remains nearly constant. As before, the iterative weight adjustment algorithm does not achieve complete convergence even after 18 iterations and gives rise to a dramatic increase of the cv score. This final example demonstrates again that the QP methodology is a robust scheme for obtaining ground-state preserving CE fits, even for large system sizes that are suitable for realistic Monte Carlo simulations.
Finally, we point out that the iterative procedure for out-of-sample ground-state preservation is useful not only when the ground states of the system are known a priori. The procedure may also serve as a means of sampling the configurational space to generate additional reference data. For this purpose, the configurations that were identified as “spurious” ground states may be evaluated with the reference method (i.e., DFT) to determine whether any previously unknown ground state has been discovered. By construction, this approach also provides a good stopping criterion for the cluster expansion fit: the fit is complete when no additional ground states are identified. This procedure is illustrated in Fig. 5b. If DFT calculations are carried out for all prospective new ground states and none of them turns out to be an actual ground state, the assumption underlying the out-of-sample ground-state preserving fit is correct and the resulting CE fit is valid, with consistently low cv errors. No further iteration is necessary, and the CE fit is finalized. If, on the other hand, additional DFT ground states are found within the proposed set, the out-of-sample ground-state preserving fit has to be restarted with the new ground states included.
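The decision step of this workflow can be written down compactly. The sketch below is only an illustration of the logic described above: `reference_energy` stands in for a DFT calculation and `hull_energy` for the reference-data hull at the candidate's composition. Both callables, the function names, and the tolerance are hypothetical.

```python
from typing import Callable, Hashable, List, Sequence, Tuple


def screen_candidates(
    candidates: Sequence[Hashable],
    reference_energy: Callable[[Hashable], float],
    hull_energy: Callable[[Hashable], float],
    tol: float = 1e-6,
) -> Tuple[List[Hashable], List[Hashable]]:
    """Split CE-predicted ground states into confirmed and spurious ones.

    A candidate is confirmed if the reference method places it below the
    hull of the existing reference data, and spurious otherwise.
    """
    confirmed: List[Hashable] = []
    spurious: List[Hashable] = []
    for config in candidates:
        if reference_energy(config) < hull_energy(config) - tol:
            confirmed.append(config)
        else:
            spurious.append(config)
    return confirmed, spurious


def next_action(confirmed: Sequence[Hashable]) -> str:
    """Stopping criterion: finalize only if no new ground state was confirmed."""
    return "refit" if confirmed else "finalize"
```

When `next_action` returns `"refit"`, the confirmed configurations would be added to the reference set as new ground states before the constrained fit is repeated. When it returns `"finalize"`, the spurious candidates only contribute constraints, and no further reference calculations are needed.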
To summarize, we have presented a robust and efficient procedure for obtaining ground-state preserving cluster expansion models. The method is formulated in terms of quadratic programming and compressive sensing and is mathematically rigorous. We demonstrated the robustness of the approach by applying it to the phase diagrams of Li_{x}Fe_{(1−x)}O and Li_{x}Ti_{(1−x)}O, which are challenging to describe with conventional cluster expansion techniques. We further showed that out-of-sample ground-state preservation can be achieved up to large supercell sizes. These properties make the presented quadratic programming approach an attractive tool for fitting general constrained lattice models and point the way towards the fully automated construction of cluster expansion models for materials simulations.
Data availability statement
The data that support the findings of this study have been deposited in the Open Science Framework under the identifier DOI: 10.17605/OSF.IO/6DEHY.^{51}
References
 1.
Fischer, C. C., Tibbetts, K. J., Morgan, D. & Ceder, G. Predicting crystal structure by merging data mining with quantum mechanics. Nat. Mater. 5, 641–646 (2006).
 2.
Yang, K., Setyawan, W., Wang, S., Buongiorno Nardelli, M. & Curtarolo, S. A search model for topological insulators with high-throughput robustness descriptors. Nat. Mater. 11, 614–619 (2012).
 3.
Rong, Z. et al. Materials design rules for multivalent ion mobility in intercalation structures. Chem. Mater. 27, 6016–6021 (2015).
 4.
Liu, M. et al. Spinel compounds as multivalent battery cathodes: a systematic evaluation based on ab initio calculations. Energy Environ. Sci. 8, 964–974 (2015).
 5.
Jain, A., Shin, Y. & Persson, K. A. Computational predictions of energy materials using density functional theory. Nat. Rev. Mater. 1, 15004 (2016).
 6.
Sanchez, J. M., Ducastelle, F. & Gratias, D. Generalized cluster description of multicomponent systems. Phys. A Stat. Mechan. Appl. 128, 334–350 (1984).
 7.
van de Walle, A. & Ceder, G. Automating first-principles phase diagram calculations. J. Phase Equilib. 23, 348–359 (2002).
 8.
De Fontaine, D. Configurational thermodynamics of solid solutions. Solid State Phys. 34, 73–274 (1979).
 9.
De Fontaine, D. Cluster approach to order-disorder transformations in alloys. Solid State Phys. 47, 33–176 (1994).
 10.
Artrith, N. & Urban, A. An implementation of artificial neural-network potentials for atomistic materials simulations: performance for TiO2. Comput. Mater. Sci. 114, 135–150 (2016).
 11.
Artrith, N., Hiller, B. & Behler, J. Neural network potentials for metals and oxides–First applications to copper clusters at zinc oxide. Phys. Status Solid. 250, 1191–1203 (2013).
 12.
Nahas, S., Ghosh, B., Bhowmick, S. & Agarwal, A. First-principles cluster expansion study of functionalization of black phosphorene via fluorination and oxidation. Phys. Rev. B 93, 165413 (2016).
 13.
Predith, A., Ceder, G., Wolverton, C., Persson, K. & Mueller, T. Ab initio prediction of ordered ground-state structures in ZrO2-Y2O3. Phys. Rev. B 77, 144104 (2008).
 14.
Ising, E. Beitrag zur Theorie des Ferromagnetismus. Z. Physik. 31, 253–258 (1925).
 15.
Casola, F. et al. Direct observation of impurity-induced magnetism in a spin-(1/2) antiferromagnetic Heisenberg two-leg spin ladder. Phys. Rev. Lett. 105, 067203 (2010).
 16.
Herder, L. M., Bray, J. M. & Schneider, W. F. Comparison of cluster expansion fitting algorithms for interactions at surfaces. Surf. Sci. 640, 104–111 (2015).
 17.
Nelson, L. J., Hart, G. L. W., Zhou, F. & Ozoliņš, V. Compressive sensing as a paradigm for building physics models. Phys. Rev. B 87, 035125 (2013).
 18.
Ceder, G. & Van der Ven, A. Phase diagrams of lithium transition metal oxides: investigations from first principles. Electrochim. Acta 45, 131–150 (1999).
 19.
Candès, E. J. & Wakin, M. B. An introduction to compressive sampling. IEEE Signal Process. Mag. 25, 21–30 (2008).
 20.
Andersen, E. D., Roos, C. & Terlaky, T. On implementing a primal-dual interior-point method for conic quadratic optimization. Math. Program. 95, 249–277 (2003).
 21.
Ceder, G. A derivation of the Ising model for the computation of phase diagrams. Comput. Mater. Sci. 1, 144–150 (1993).
 22.
Ceder, G. Alloy Theory and its Applications to Long-Period Superstructure Ordering in Metallic Alloys and High-Temperature Superconductors (California Univ., 1991).
 23.
Hawkins, D. M. The problem of overfitting. J. Chem. Inf. Comput. Sci. 44, 1–12 (2004).
 24.
Candès, E. J., Romberg, J. & Tao, T. Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inform. Theory 52, 489–509 (2006).
 25.
Garbulsky, G. D. & Ceder, G. Linear-programming method for obtaining effective cluster interactions in alloys from total-energy calculations: application to the fcc Pd-V system. Phys. Rev. B Condens. Matter. 51, 67–72 (1995).
 26.
Kohan, A., Tepesch, P., Ceder, G. & Wolverton, C. Computation of alloy phase diagrams at low temperatures. Comput. Mater. Sci. 9, 389–396 (1998).
 27.
Winston, W. L., Venkataramanan, M. & Goldberg, J. B. Introduction to Mathematical Programming Vol. 1 (Thomson/Brooks/Cole Duxbury; Pacific Grove, 2003).
 28.
Gill, P. E. & Wong, E. Methods for convex and general quadratic programming. Math Program Comput 7, 71–112 (2014).
 29.
Kim, S.-J., Koh, K., Lustig, M. & Boyd, S. in 2007 IEEE International Conference on Image Processing III-117–III-120 (IEEE, 2007).
 30.
Urban, A., Seo, D.-H. & Ceder, G. Computational understanding of Li-ion batteries. npj Comput. Mater. 2, 16002 (2016).
 31.
Goodenough, J. B. & Park, K. S. The Li-ion rechargeable battery: a perspective. J. Am. Chem. Soc. 135, 1167–1176 (2013).
 32.
Hong, J., Gwon, H., Jung, S.K., Ku, K. & Kang, K. Review—lithiumexcess layered cathodes for lithium rechargeable batteries. J. Electrochem. Soc. 162, A2447–A2467 (2015).
 33.
Rozier, P. & Tarascon, J. M. Review—LiRich layered oxide cathodes for nextgeneration LiIon batteries: chances and challenges. J. Electrochem. Soc. 162, A2490–A2499 (2015).
 34.
Urban, A., Lee, J. & Ceder, G. The configurational space of rocksalt-type oxides for high-capacity lithium battery electrodes. Adv. Energy Mater. 4, 1400478. doi:10.1002/aenm.201400478 (2014).
 35.
Lee, J. et al. Unlocking the potential of cation-disordered oxides for rechargeable lithium batteries. Science 343, 519–522 (2014).
 36.
Mizushima, K., Jones, P., Wiseman, P. & Goodenough, J. LixCoO2 (0< x<1): a new cathode material for batteries of high energy density. Mater. Res. Bull. 15, 783–789 (1980).
 37.
Glazier, S. L., Li, J., Zhou, J., Bond, T. & Dahn, J. R. Characterization of disordered Li(1+x)Ti2xFe(1−3x)O2 as positive electrode materials in Li-ion batteries using percolation theory. Chem. Mater. 27, 7751–7756 (2015).
 38.
Lee, J. et al. A new class of high capacity cation-disordered oxides for rechargeable lithium batteries: Li–Ni–Ti–Mo oxides. Energy Environ. Sci. 8, 3255–3265 (2015).
 39.
Bongers, P. Structure and magnetic properties of several complex oxides of the transition elements. (University of Leiden thesis, 1957).
 40.
Lecerf, A. in ANNALES DE CHIMIE FRANCE. 513&.
 41.
Hoffmann, A. Crystal chemistry of lithium Ferrite. Naturwissenschaften 26, 431 (1938).
 42.
Posnjak, E. & Barth, T. F. A new type of crystal fine-structure: lithium ferrite (Li2O·Fe2O3). Phys. Rev. 38, 2234 (1931).
 43.
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865 (1996).
 44.
Anisimov, V. V., Zaanen, J. & Andersen, O. K. Band theory and Mott insulators: Hubbard U instead of Stoner I. Phys. Rev. B Condens. Matter. 44, 943–954 (1991).
 45.
Jain, A. et al. A high-throughput infrastructure for density functional theory calculations. Comput. Mater. Sci. 50, 2295–2310 (2011).
 46.
Hart, G. L. & Forcade, R. W. Algorithm for generating derivative structures. Phys. Rev. B 77, 224115 (2008).
 47.
Huang, W. et al. Finding and proving the exact ground state of a generalized Ising model by convex optimization and MAX-SAT. Phys. Rev. B 94, 134424 (2016).
 48.
Bishop, C. M. Pattern recognition. Mach. Learn. 128, 147–152 (2006).
 49.
Ducastelle, F. Order and Phase Stability in Alloys (NorthHolland, 1991).
 50.
Huang, W. et al. Constructing and proving the ground state of a generalized Ising model by the cluster tree optimization algorithm. arXiv:1606.07429 (2016).
 51.
Huang, W. et al. Data for Construction of GroundState Preserving Sparse Lattice Models for Predictive Materials Simulations. doi:10.17605/OSF.IO/6DEHY (2017).
Acknowledgements
This work was supported primarily by the US Department of Energy (DOE) under Contract No. DE-FG02-96ER45571.
Contributions
W.H. and G.C. jointly proposed this project. W.H. led the project. W.H., A.U., Z.R., Z.D. and G.C. contributed to the theoretical and mathematical development of the quadratic programming method. W.H. and A.U. carried out the DFT calculations. W.H., A.U., Z.R., Z.D. and C.L. contributed to the implementation of the method. All authors contributed to the manuscript writing.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Huang, W., Urban, A., Rong, Z. et al. Construction of ground-state preserving sparse lattice models for predictive materials simulations. npj Comput Mater 3, 30 (2017). https://doi.org/10.1038/s41524-017-0032-0