Discovery of high-entropy ceramics via machine learning

Although high-entropy materials are attracting considerable interest due to a combination of useful properties and promising applications, predicting their formation remains a hindrance for rational discovery of new systems. Experimental approaches are based on physical intuition and/or expensive trial-and-error strategies. Most computational methods rely on the availability of sufficient experimental data and computational power. Machine learning (ML) applied to materials science can accelerate development and reduce costs. In this study, we propose an ML method, leveraging thermodynamic and compositional attributes of a given material for predicting the synthesizability (i.e., entropy-forming ability) of disordered metal carbides. The relative importance of the thermodynamic and compositional features for the predictions is then explored. The approach's suitability is demonstrated by comparing values calculated with density functional theory to ML predictions. Finally, the model is employed to predict the entropy-forming ability of 70 new compositions; several predictions are validated by additional density functional theory calculations and experimental synthesis, corroborating the effectiveness in exploring vast compositional spaces in a high-throughput manner. Importantly, seven compositions are selected specifically because they contain all three of the Group VI elements (Cr, Mo, and W), which do not form room temperature-stable rock-salt monocarbides. Incorporating the Group VI elements into the rock-salt structure provides further opportunity for tuning the electronic structure and potentially material performance.


INTRODUCTION
Traditional alloys have been developed utilizing one principal element with minor additions of other alloying elements as a means of achieving a desired combination of properties and/or microstructures. Recently, research efforts have been directed toward the study of materials with significant atomic fractions of multiple elements, thus opening a richer composition space [1][2][3]. This class of materials typically contains four or more elements that do not necessarily result in a single phase (multi-principal element alloys) and often more than five elements to maximize the configurational entropy and improve the stability of the single-phase solid solution (high-entropy alloys) 4. High entropy offers increased solubility of components, drawing new attention to unexplored center regions of phase diagrams. Novel high-entropy materials that exist as a single, highly disordered, crystalline phase have been of particular research interest [5][6][7][8][9][10]. As this field has continued to evolve, a number of fascinating combinations of material properties have begun to emerge [11][12][13][14][15][16].
Finding these materials is often challenging, though, owing to the sheer size of these unexplored regions away from the corners of phase diagrams. The search for effective scientific strategies and models has thus far required time- and cost-intensive experimental evaluations of many candidate single-phase high-entropy materials. The disordered configuration presents a challenge for most computational approaches 17 and there is not always sufficient experimental data for validation of positive and negative calculated results. Phase diagram calculations, often combined with other rules and models, have been applied successfully 6,[18][19][20] but the underlying databases lack significant experimental underpinnings. High-throughput computational materials design combines thermodynamic and electronic-structure methods with data-mining capabilities to more quickly evaluate material compositions for novel properties [21][22][23]. These ab-initio computing efforts have recently yielded a descriptor known as entropy-forming ability (EFA), which has shown considerable promise for predicting the ease of synthesizability and homogeneity of such materials 5,12. A high EFA value for a specific composition signifies a small energy penalty to incorporate disorder, i.e., this descriptor can be a sorting parameter for likely single-phase, disordered, high-entropy materials. This descriptor was previously calculated for 56 high-entropy carbide (HEC) compositions and the single-phase cutoff was experimentally validated to exist between EFA values of 45 and 50 5,12. The highest EFA materials have been demonstrated, via extended X-ray absorption fine structure (EXAFS), to exhibit minimal short-range chemical order 12, a concern in the high-entropy materials community 24,25. Although this method is high-throughput in comparison with other ab-initio efforts, calculation of EFA values remains a computationally intensive, time-consuming task.
Thoroughly searching this new composition space, conservatively estimated to comprise hundreds of billions of new alloys 1 , is simply not feasible with this approach alone. Herein, we propose applying data science tools, specifically machine learning (ML), to guide more expensive computational and experimental search strategies toward promising candidate materials and therefore accelerate materials discovery.
Recently, the materials science field has embraced the big data revolution, as large databases become cost-effective and data generation rates continue to accelerate [26][27][28][29][30]. This has resulted in the development of a number of powerful data science tools to assist materials scientists [31][32][33][34][35][36]. In the realm of materials discovery, data science tools have aided in the accelerated discovery or identification of new compositions for bulk metallic glasses 37, shape memory alloys 38, Heusler compounds 39,40, and photocatalysts for CO2 reduction 41. Other work has focused on the development of ML methods to establish structure-property linkages 35,42, or predict the crystal stability of new materials 43,44. In 2016, Ward et al. 45 proposed a chemically diverse list of attributes, primarily data-mined from the periodic table, as a general set of features for broad material property prediction. These data-driven models can be fit to existing experimental data and continuously refined as new data are collected 46. This inherent flexibility of ML-based decision-making frameworks provides an advantage given the dynamic nature of phase formation and stability. Moreover, when compared with density functional theory (DFT), the state-of-the-art toolbox for quantum mechanical modeling, ML models can perform well with reduced computational cost and without the need for atomic structure information 44,47. This provides an opportunity to search materials space in an unconstrained manner without concern for the combinatorial explosion of higher-order compositions (ternaries, quaternaries, quinaries, etc.) 48.
Through this work, we aim to accelerate materials innovation by developing a rapid predictor of the stability of high-entropy materials and demonstrating the model's capability to predict single- or multi-phase results. With regard to speed, our ML model can evaluate the EFA of a single composition in under a millisecond, compared with hundreds of hours per composition with DFT, even using efficient automatic frameworks such as Automatic-Flow (AFLOW) 49. The robustness of the model is investigated by focusing on locating successful five-component compositions containing all three of the Group VI metals (Cr, Mo, and W) as 60% of the cation sublattice. The interest in using the ML model to locate single-phase compositions containing all three Group VI metals stems from the relationship between the electronic structure and mechanical/physical properties of transition metal carbides 50,51. Prior studies have revealed that the transition metal carbides can be tuned more effectively through enhanced metallic bonding, owing to valence filling, than through conventional microstructural engineering principles 50,52. For example, the Group IV and V monocarbides readily form the rock-salt structure and demonstrate improved mechanical properties, such as fracture toughness, with changing directionality of the bonding as more valence electrons become available in Group V 50,53. Computationally, the trend in increasing toughness is expected to continue to the right on the periodic table; however, the Group VI metals do not form a room temperature-stable rock-salt phase [54][55][56]. By employing high-entropy effects (i.e., increased solubility), we propose that the three Group VI metals can be incorporated into a room temperature-stable rock-salt structure, resulting in an increased number of available electrons, and a novel group of materials with the potential to overturn previous material engineering limitations.
In this work, several single-phase, rock-salt-structured, five-metal-cation carbides are evaluated, for which three of the precursors have different structures and different anion-to-cation stoichiometric ratios from the resultant face-centered cubic (FCC) high-entropy material. The available precursors for the Group VI metals are hexagonal Mo2C, hexagonal WC, orthorhombic W2C, and orthorhombic Cr3C2. Rock-salt MoC and WC are only stable at temperatures above 1940°C and 2500°C, respectively. The only FCC system in the Cr-C phase diagram is Cr23C6. See Supplementary Figs. 1-3 for the binary phase diagrams. To date, the authors are unaware of any previously explored high-entropy carbides containing Cr or of any prior DFT calculation of the EFA value for a Cr-containing compound. The formation of a rock-salt structured monocarbide, wherein 60% of the cation species (Cr, Mo, and W) do not form this structure as their stable room temperature phase, is neither obvious nor readily predictable based on current theories.
These design goals are accomplished by supplementing the set of chemical descriptors of each composition with information from the calculated phase diagrams and utilizing an ML framework to rapidly predict the EFA of seventy previously unstudied high-entropy carbides containing Cr, an element not considered in the original composition space 5 . Complete information on the construction, training, and implementation of the ML model is included in the Methods section. Based on the validation against previously reported high-entropy metal carbides, comparison with DFT calculations for several new compositions, and the ability to locate and synthesize several otherwise unintuitive materials, we find that this screening strategy is aptly designed to identify promising high-entropy systems. The successful outcome demonstrates the synergy between thermodynamics, chemical descriptors, and ML methods for rapidly evaluating new materials based on prior experiments and computation.

Model performance
The search for new high-entropy ceramics begins with fitting a random forest 57, a type of ML model, on 56 previously reported EFA values 5. This data set includes nine synthesized compositions: six single-phase and three multi-phase. The previous study utilized only eight carbide-forming metal elements (Hf, Nb, Ta, Ti, Mo, V, W, and Zr). As will be demonstrated, even this sparse data set with relatively few compositions with high entropic contributions is very useful in guiding subsequent experiments toward the best candidates and away from the multi-phase materials.
As our goal is to select the best model hyperparameters for predicting new compositions outside our training set, we evaluate the ML model's performance using fivefold cross-validation and a grid search across selected hyperparameters (see the Methods section for further details). The final model hyperparameters selected for both models, with and without CALPHAD (CALculation of PHAse Diagrams) data, are ten predictor trees and mean absolute error (MAE) for scoring. The best hyperparameters were used to fit models to the labeled data. Supplementary Fig. 4 shows an example predictor tree from the model with CALPHAD data and demonstrates the complex relationships between the predictor variables. Figure 1 compares the performance of the ML model fit with only chemical attributes (Fig. 1a) and the model fit with chemical attributes and information from CALPHAD (Fig. 1b). The DFT-calculated and ML-predicted values for each model are listed in Supplementary Table 1. Although the MAEs for both models are equivalent (3.8 (eV/atom)⁻¹), the coefficient of determination (R²) suggests the observed outcomes are better replicated by the ML model with access to the CALPHAD data. However, both models have a systematic error in which the compositions with known EFA < 50 are overestimated and, more noticeably, compositions with an EFA above 80 are underestimated. The small number of samples above 80 (6 total) coupled with bootstrapping (~66% of the data is used per tree) results in a low probability for them to be included in the construction of each decision tree. Further, the average EFA of the materials in each tree (~58 depending on the tree) is in line with the average for the data set. With only one sample above 100 in the data set, and the high-EFA samples having a low probability of being used in tree construction, the averaging process in the random forest pulls down the predicted values for the highest-EFA materials.
It will be demonstrated that the improved R² performance of the model in fitting the starting data set provides improved extrapolation on the Cr-containing systems in the search for high-entropy ceramics containing all three Group VI precursors.

Feature importance
The permutation importance of the chemical attribute and CALPHAD features is studied to provide interpretability to the ML model. Details for each chemical attribute can be found in the Supplementary Information. The rationale for selecting permutation importance is the following: randomly permuting the value of predictor variable X_i and computing the EFA together with the unpermuted predictor variables will result in significantly reduced prediction accuracy if the original variable X_i was significantly associated with the output value. Compared with univariate screening methods, permutation importance also has the advantage of assessing the impact of each predictor variable both individually and together with the other unpermuted predictor variables 58. Table 1 shows the top ten features and their importance rank for fitting each random forest model to the EFA values calculated from DFT. As evidenced in the model performance and feature importance, the eight additional CALPHAD features provide valuable information about the EFA of a given composition, particularly the liquidus temperature (ranked second). However, CALPHAD diagrams alone would also be insufficient for determining the ability to fabricate a single-phase material. Supplementary Fig. 5 demonstrates this by comparing diagrams computed with the ThermoCalc SSOL6 database for compositions known to form single- or multi-phase carbides. For each of these compositions, CALPHAD alone would predict rock-salt to be the primary structure to evolve from the liquid, which would be stable down to nearly 1500 K before forming a secondary metal carbide. In reality, only MoNbTaVWC5 (Supplementary Fig. 5a) readily forms a single phase experimentally, whereas the other three compositions (Supplementary Fig. 5b-d) have been demonstrated previously to be multi-phase materials 5.
However, including some CALPHAD data as features improves the ML model via this thermodynamic-based preview of what is likely to occur and improves its extrapolation capabilities beyond that of the chemical attributes alone.
Important features from the chemical attributes are the average ionic character between each of the atomic species, the maximum and fraction-weighted covalent radius, and a few features representing the valence electrons or unfilled orbitals. These chemical attributes quantify the expected bonding nature and local environment each atom will experience if single phase (i.e., homogeneously disordered). Along the same lines, these metrics also assist the ML model to determine what atomic environments are unfavorable, resulting in multi-phase materials. Further analysis of the relationship between the EFA of a composition and the top ranked predictors reveals there is noticeable correlation (Fig. 2). A plot of average ionic character vs. EFA reveals that increasing the average ionic character between the pairs of atoms is more likely to result in a multi-phase material (Fig. 2a). This property has been previously suggested to play a role in determining single or multi-phase outcomes, but has not yet been extensively studied and its contribution is not well understood 10,59.
Table 1 The top ten features for the ML model with only the chemical attributes are on the left. The top ten features for the ML model including CALPHAD features are on the right. Both models rely on similar features regarding electronegativity, ionic character, and electron orbitals for making the best predictions. The avg(x) and avg. dev(x) denote the composition-weighted average and average deviation, respectively, calculated over the vector of elemental values for each compound. The min(x), max(x), fwm(x), and range(x) correspond to the minimum, maximum, fraction-weighted mean, and range of an attribute for each compound. Features marked with an * are computed from CALPHAD.
Fig. 1 Evaluation of the ML models fit to available data. a The ML-predicted EFA using a random forest fit with 108 chemical attributes evaluated against the labels of the data set from DFT. b The ML-predicted EFA values for a random forest fit with 108 chemical attributes plus 8 features from CALPHAD evaluated against the known EFA from DFT. The line y = x is plotted to show the deviation from perfect predictions.
A parameter not previously studied in the high-entropy literature, the liquidus temperature derived from CALPHAD, also provides insight into the magnitude of the expected EFA for a given composition (Fig. 2b) Table 2 have ML-predicted EFA values that suggest they will readily form a single-phase high-entropy carbide, despite containing the three Group VI refractory metal elements. If successfully synthesized into a single phase, these novel materials would contain three carbides that do not exist as room temperature-stable rock-salt monocarbides (refer to the binary phase diagrams in Supplementary Figs. 1-3). Several fundamentally interesting compositions are those where one of the rock-salt stable precursors (i.e., NbC, TaC, or VC) in MoNbTaVWC5 (DFT-calculated EFA of 125 (eV/atom)⁻¹) 5 is replaced with an orthorhombic (Cr3C2 or W2C) or hexagonal (Mo2C or WC) precursor that does not form a stable rock-salt structure.
As the first step in validating the ML model's extrapolation into the Cr-containing chemical space, the EFA values of the seven selected compositions were subsequently computed by DFT. The ab-initio EFA values are listed in Table 2, and plots comparing the ML model with chemical attributes (Fig. 3a) and the ML model including CALPHAD data (Fig. 3b) illustrate the improved regression performance of the model after inclusion of the CALPHAD features. The red circles in Fig. 3 are the predicted EFA values for the seven Cr-containing compositions compared with their DFT-calculated values. Although the ML models were not refit with the new DFT-computed EFA values, the R² and MAE of each ML model can be re-evaluated after including the extrapolated data. In comparison with the chemical attributes alone, the R² value remains the same and the MAE increases only slightly.
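The re-evaluation step described above (scoring an already-fitted model on the original labels plus newly computed DFT values, without refitting) can be sketched as follows. This is only an illustration: the arrays hold placeholder numbers, not values from the study, and the helper name `evaluate` is hypothetical.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score

# Placeholder numbers for illustration only (NOT values from the study).
efa_dft_train = np.array([40.0, 55.0, 62.0, 80.0, 100.0])  # DFT labels used for fitting
efa_ml_train = np.array([44.0, 54.0, 60.0, 77.0, 93.0])    # model predictions on them
efa_dft_new = np.array([50.0, 65.0])                       # newly computed DFT values
efa_ml_new = np.array([53.0, 61.0])                        # extrapolated predictions


def evaluate(y_true, y_pred):
    """Return R^2 and MAE for a set of predictions (hypothetical helper)."""
    return {
        "r2": r2_score(y_true, y_pred),
        "mae": mean_absolute_error(y_true, y_pred),
    }


# Metrics on the original data only.
before = evaluate(efa_dft_train, efa_ml_train)

# Metrics after appending the extrapolated points; the model is not refit.
after = evaluate(
    np.concatenate([efa_dft_train, efa_dft_new]),
    np.concatenate([efa_ml_train, efa_ml_new]),
)

print(before, after)
```

Comparing `before` and `after` shows how much the extrapolated points shift each metric.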
As a secondary method of validating the ML model, the seven selected materials were fabricated following conventional fabrication processes described in detail in the Methods section. Successful fabrication of the rock-salt structure after full densification was verified via X-ray diffraction (XRD) (Fig. 4). Results of XRD analysis for each sample following spark plasma sintering (SPS) demonstrate that compositions CrMoNbVWC5, CrMoNbTaWC5, CrMoTaVWC5, and CrMoTiVWC5 (the top four) only exhibit a single set of FCC peaks of the desired rock-salt high-entropy phase. Conversely, XRD of CrHfTaWZrC5, CrMoTiWZrC5, and CrHfMoTiWC5 (bottom three) reveals the presence of multiple structures. In the event there are multiple FCC structures present, the majority FCC phase is indexed. In CrMoTiWZrC5 and CrHfMoTiWC5, the secondary phase is also FCC. The CrHfTaWZrC5 system contains a secondary hexagonal phase. The XRD patterns for CrHfMoTiWC5 and CrHfTaWZrC5 also contain a small amount (<5%) of HfO2 that remains due to processing. This is determined not to significantly alter the composition of the carbide phase.
Microstructure analysis and energy dispersive X-ray spectroscopy (EDS) were then utilized to determine the homogeneity of the sintered pellets as shown in Fig. 5. Coupling the results of both techniques verified that the as-processed samples were either single-phase and chemically homogeneous or underwent chemical segregation. For example, in the CrMoNbVWC5 microstructure, only grain contrast is present, and no notable indication of clustering or segregation is visible in the elemental maps. On the contrary, the CrHfTaWZrC5 sample has observable chemical contrast in the microstructure, and the chemical maps demonstrate that the secondary phase present in XRD is rich in Cr and W. The CrMoTiWZrC5 and CrHfTaWZrC5 samples displayed were sintered at 1600°C to prevent the loss of Cr. When sintered at 1800°C, EDS revealed the Cr content in these samples was as low as 2 at% and the chromium carbide was found to have reacted with the graphite tooling. The medium-entropy composition, CrMoTiVWC5, resulted in a single FCC rock-salt structure after initial sintering, but required annealing as described in the Methods section to reach chemical homogeneity. Subsequently, electron backscatter diffraction (EBSD) was utilized to study the resulting microstructure of the samples. The single-phase, homogeneous samples are observed to contain large, nearly equiaxed grains with some deviation owing to the remaining pores. This furthers the assertion that these compositions are single phase, as they allow for the kinetics of grain growth. In stark contrast, the multi-phase materials have a significantly reduced grain size, owing to the competing phases preventing further grain growth during sintering.

DISCUSSION
A powerful data-driven approach to estimating the synthesizability of high-entropy materials, based on data from previous DFT calculations and experimental results, is detailed and demonstrated on 70 new chromium-containing compositions. The ML framework is found to be improved by the inclusion of data from CALPHAD and robust toward extrapolating outside the starting chemical space. The ML model enhancement achieved by combining general features and thermodynamic data from CALPHAD is explored via assessing the impact of each predictor variable individually as well as with the other predictor variables (permutation importance) and evaluating compositions outside the original chemical space. The predictive capability of this method is validated by ab-initio calculations and experimental fabrication of several previously unreported compositions, including four single-phase rock-salt materials that would not be obvious candidates given the stable precursors and binary phase diagrams of the Group VI transition metals. These novel materials, of which 60% of the cation lattice contains Group VI metals, represent a step forward in electronic structure engineering of transition metal carbides: prior modeling of the bonding nature with increased valence electrons [54][55][56] suggests that future material property studies are likely to yield useful combinations for practical engineering applications. Furthermore, the experimentally studied compositions result in single or multi-phase materials in agreement with their predicted EFA values. The remaining predicted materials include diverse chemistries and present ample opportunity for materials discovery. Moreover, the methodology designed opens the door to locating other high-entropy materials, not just ceramics, in a similar manner.
Table 2 Results for both ML models are provided for each composition. For the selected compositions, a DFT-computed EFA value is listed in the next column. In the experimental result, "S" and "M" stand for single- and multi-phase, respectively. Units: EFA in (eV/atom)⁻¹.

Machine-learning architecture
Random forests are ensembles of decision trees that individually make predictions on each input, with the overall prediction determined by aggregating the individual outputs (a majority vote for classification, an average for regression) 57,60. Random forest was selected for its utility and performance on diverse problems when compared with other supervised learning models 60. The random forest regressor is implemented with Scikit-learn 61. Model hyperparameters are selected via an exhaustive fivefold cross-validated grid search using the following parameters: number of tree predictors in the range 10-110 in steps of 10, mean-squared error and MAE as criterion, and the number of features to consider when looking for the best split from one to the total number of features available. Each fold is scored using the MAE between the labels from DFT and the predicted values. To obtain deterministic behavior during model fitting, the random state is seeded. The best performing hyperparameters are selected to fit a model using the entire training set, with bootstrapping, to maximize the amount of information available for making future predictions.

From chemistry to features
Each composition is converted to a set of features with the goal of creating a quantitative representation that relates to the essential chemistry, physics, and thermodynamics of each material in a data set. The attributes utilized in this work should not be considered an exhaustive list, but instead a step toward creating a synergistic set of attributes that capture the knowledge of chemistry and experimentally robust thermodynamics. The 108 compositional attributes, defined in the Supplementary Information, are a subset of the general ML framework demonstrated previously to perform well on diverse material problems 45. The elemental data used to compute the compositional features are sourced from Magpie 45,62. These chemical attributes are augmented with select data about the number of phases and phase fractions calculated in 100 K steps, as well as the liquidus and solidus temperatures from the ThermoCalc Software SSOL6 database version 6.1 63. The ~800 CALPHAD features are reduced to 8 predictor variables (1% of those available) using the SelectFromModel method in Scikit-learn 61 to avoid the "curse of dimensionality" and find the most relevant subset [64][65][66]. SelectFromModel was chosen in this study for its rapid reduction of features in one step in comparison to other multi-step methods such as recursive feature elimination. The maximum number of features was set to 8 for this study, to target 1% of the available data. We do not intend for this feature list to be exhaustive or definitive. The selected features are defined in the Supplementary Information. The data for the predictor variables for the training data and new compositions are contained in the GitHub repository.
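As a rough illustration of how a composition becomes a handful of numerical attributes, the sketch below computes the fraction-weighted average, average deviation, minimum, maximum, and range of one elemental property. The function name `composition_stats` is hypothetical, and the Pauling electronegativities are typed in by hand for illustration; the study itself sources its elemental data from Magpie.

```python
# Illustrative Pauling electronegativities (hand-entered; the study uses Magpie data).
ELECTRONEGATIVITY = {
    "Mo": 2.16, "Nb": 1.60, "Ta": 1.50, "V": 1.63, "W": 2.36, "C": 2.55,
}


def composition_stats(amounts, values):
    """Summarize one elemental property over a composition (hypothetical helper).

    amounts: element -> number of atoms in the formula unit
    values:  element -> elemental property value
    """
    total = sum(amounts.values())
    fractions = {el: n / total for el, n in amounts.items()}

    # Fraction-weighted mean of the property.
    mean = sum(fractions[el] * values[el] for el in fractions)
    # Fraction-weighted average absolute deviation from that mean.
    avg_dev = sum(fractions[el] * abs(values[el] - mean) for el in fractions)

    present = [values[el] for el in fractions]
    return {
        "avg": mean,
        "avg_dev": avg_dev,
        "min": min(present),
        "max": max(present),
        "range": max(present) - min(present),
    }


# MoNbTaVWC5: five metals on the cation sublattice and five carbon atoms.
monbtavwc5 = {"Mo": 1, "Nb": 1, "Ta": 1, "V": 1, "W": 1, "C": 5}
stats = composition_stats(monbtavwc5, ELECTRONEGATIVITY)
print(stats)
```

Repeating this over many elemental properties yields a fixed-length feature vector per composition that requires no crystal-structure input.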

Interpreting the random forest algorithm
The random forest model is analyzed to clarify how the ML model evaluated these materials. The variable importance is extracted for the fit model using the "rfpimp" package in Python (available at https://github.com/parrt/random-forest-importances, last access: 15 August 2019). The predictor variables are ranked by permutation importance, which directly measures importance by observing the effect on model accuracy of randomly permuting the values of each predictor variable 67. That is to say, the permutation importance measures the impact on the output EFA of swapping the value of a selected feature from one composition with the value from a different composition. This method has recently been introduced as an improvement to the mean decrease in impurity metric 58.
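The idea behind permutation importance can be sketched in a few lines. The version below is a hand-written illustration on synthetic data rather than the "rfpimp" implementation used in the study; the function name `permutation_importance_mae`, the number of repeats, and the choice of MAE as the accuracy measure are assumptions made for this example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error


def permutation_importance_mae(model, X, y, n_repeats=20, seed=0):
    """Mean increase in MAE when each column is shuffled (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    baseline = mean_absolute_error(y, model.predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffle one feature across samples; all other columns stay untouched.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            permuted = mean_absolute_error(y, model.predict(X_perm))
            increases.append(permuted - baseline)
        importances[j] = np.mean(increases)
    return importances


# Synthetic data (assumption): feature 0 dominates the target, features 2-4 are noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = 10 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
importances = permutation_importance_mae(model, X, y)
ranking = np.argsort(importances)[::-1]  # most important feature first
print(ranking, importances.round(3))
```

A feature whose shuffling barely changes the error contributes little to the predictions, whereas a large increase in error marks a feature the model relies on.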

Sample preparation
All samples were prepared using the same methods and tools utilized in the previous EFA and HEC studies 5,12. For each composition, initial powders of the five binary precursor carbides (selected from NbC, HfC, TiC, ZrC, VC, TaC, Mo2C, W2C, WC, and Cr3C2) are obtained in >99% purity and −325 mesh (<44 μm) particle size (Alfa Aesar). Each sample is weighed out in 12 g batches and mixed to achieve the desired five-metal carbide composition.

Sample analysis
Microstructural and elemental analysis is performed using a Thermo Fisher (formerly FEI) Apreo field emission scanning electron microscope equipped with an Oxford X-Max N EDS detector and an Oxford Symmetry EBSD detector. A combination of secondary and back-scattered electron detectors is utilized for imaging. EDS scans are conducted at magnifications of 500× and 1000× to verify multi-length scale homogeneity in the resulting microstructure. EDS quantification confirmed that the resulting ratios of metal ions are nearly equiatomic. Crystal structure analysis is performed using a Rigaku Miniflex X-ray Diffractometer with a 1D detector using a step size of 0.02° and a scan rate of 5° per minute, using Cu Kα radiation (wavelength λ = 1.54059 Å) for all measurements. The lattice parameter is calculated utilizing a combinatorial method from both MDI Jade and Match! phase identification software. This is subsequently utilized to model and create a theoretical diffraction profile to be utilized in EBSD.

Calculation of the entropy-forming ability
The EFA is calculated using the AFLOW-POCC module 17 implemented in the AFLOW Framework for Materials Discovery 68 . For each disordered composition, a set of representative ordered supercells is resolved. First, AFLOW-POCC determines the smallest supercell size accommodating the stoichiometry exactly (for the five-metal rock-salt carbides, the value is 5).
The unique superlattices of this size are then constructed based on the Hermite Normal Form matrices. The lattices are decorated to generate all viable configurations. To identify unique configurations and their degeneracies rapidly, the Universal Force Field method is employed. The energies of the unique configurations are then calculated using DFT having input parameters/settings in accordance with the AFLOW Standard 29. K-point meshes are generated using the Monkhorst-Pack scheme (Gamma-centered for all materials belonging to the hP and hR Bravais lattices) having at least 6000 k-points per reciprocal atom. Projector augmented wave potentials are constructed according to the Perdew-Burke-Ernzerhof exchange-correlation functional as implemented in the Vienna Ab initio Simulation Package (VASP). The plane-wave basis has a kinetic energy cutoff 1.4 times larger than that recommended for each species. Spin polarization is considered. The electronic and ionic convergence criteria are 10⁻³ and 10⁻² eV, respectively. The EFA is defined as the inverse of the spread of these energies 5.
Fig. 5 Microstructural analysis of the synthesized materials. The first column is an electron micrograph for each of the synthesized compositions. Columns 2-6 are selected EDS chemistry maps for each of the five metal cations present in each system. Column 7 is an EBSD map of the grain structure, revealing the effect on grain size in multi-phase compared with single-phase compositions. Compositions are listed from largest to smallest ML-predicted EFA. Scale bar 100 µm.
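The definition of the EFA as the inverse of the spread of the configuration energies can be illustrated with a short calculation. The sketch below assumes that the spread is the degeneracy-weighted standard deviation of the energies about their degeneracy-weighted mean; the text here states only "the inverse of the spread", so the exact normalization is an assumption, and the energies and degeneracies are invented for illustration.

```python
import numpy as np


def entropy_forming_ability(energies, degeneracies):
    """Inverse of the degeneracy-weighted standard deviation of the energies.

    energies:     energy of each unique configuration, in eV/atom
    degeneracies: number of equivalent configurations each one represents
    Assumed form: sigma^2 = sum(g * (H - H_mix)^2) / (sum(g) - 1)
    """
    H = np.asarray(energies, dtype=float)
    g = np.asarray(degeneracies, dtype=float)
    H_mix = np.sum(g * H) / np.sum(g)  # degeneracy-weighted mean energy
    sigma = np.sqrt(np.sum(g * (H - H_mix) ** 2) / (np.sum(g) - 1.0))
    return 1.0 / sigma  # units: (eV/atom)^-1


# Invented spectra (illustration only): a narrow and a wide energy distribution.
degeneracies = [10, 20, 10, 5, 4]
narrow = [0.000, 0.004, 0.008, 0.010, 0.012]
wide = [0.000, 0.020, 0.040, 0.050, 0.060]

efa_narrow = entropy_forming_ability(narrow, degeneracies)
efa_wide = entropy_forming_ability(wide, degeneracies)

# Two equally weighted configurations separated by 0.02 eV/atom.
efa_pair = entropy_forming_ability([0.00, 0.02], [1, 1])
print(efa_narrow, efa_wide, efa_pair)
```

A narrow energy spectrum gives a large EFA, which corresponds to a small energy penalty for introducing disorder.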

DATA AVAILABILITY
All data analyzed during the current study are available at GitHub address https://github.com/krkaufma/ML-EFA or from the corresponding author upon reasonable request.

CODE AVAILABILITY
All code and models generated, developed, and/or utilized are available at GitHub address https://github.com/krkaufma/ML-EFA, along with the trained weights and sample code demonstrating how to use the model for material discovery. We intend to continually refine the model by training on larger data sets and expanding the composition range as they become available.