Although high-entropy materials are attracting considerable interest due to a combination of useful properties and promising applications, predicting their formation remains a hindrance for rational discovery of new systems. Experimental approaches are based on physical intuition and/or expensive trial and error strategies. Most computational methods rely on the availability of sufficient experimental data and computational power. Machine learning (ML) applied to materials science can accelerate development and reduce costs. In this study, we propose an ML method, leveraging thermodynamic and compositional attributes of a given material for predicting the synthesizability (i.e., entropy-forming ability) of disordered metal carbides. The relative importance of the thermodynamic and compositional features for the predictions are then explored. The approach’s suitability is demonstrated by comparing values calculated with density functional theory to ML predictions. Finally, the model is employed to predict the entropy-forming ability of 70 new compositions; several predictions are validated by additional density functional theory calculations and experimental synthesis, corroborating the effectiveness in exploring vast compositional spaces in a high-throughput manner. Importantly, seven compositions are selected specifically, because they contain all three of the Group VI elements (Cr, Mo, and W), which do not form room temperature-stable rock-salt monocarbides. Incorporating the Group VI elements into the rock-salt structure provides further opportunity for tuning the electronic structure and potentially material performance.
Traditional alloys have been developed utilizing one principal element with minor additions of other alloying elements as a means of achieving a desired combination of properties and/or microstructures. Recently, research efforts have been directed toward the study of materials with significant atomic fractions of multiple elements, thus opening a richer composition space1,2,3. This class of materials typically contains four or more elements that do not necessarily result in a single phase (multi-principle element alloys) and often greater than five elements to maximize the configurational entropy and improve the stability of the single-phase solid solution (high-entropy alloys)4. High-entropy offers increased solubility of components, drawing new attention to unexplored center regions of phase diagrams. Novel high-entropy materials that exist as a single, highly disordered, crystalline phase have been of particular research interest5,6,7,8,9,10. As this field has continued to evolve, a number of fascinating combinations of material properties have begun to emerge11,12,13,14,15,16.
Finding these materials is often challenging though, owing to the sheer size of these unexplored regions away from the corners of phase diagrams. The search for effective scientific strategies and models has thus far required time- and cost-intensive experimental evaluations of many candidate single-phase high-entropy materials. The disordered configuration presents a challenge for most computational approaches17 and there is not always sufficient experimental data for validation of positive and negative calculated results. Phase diagram calculations, often combined with other rules and models, have been applied successfully6,18,19,20 but the underlying databases lack significant experimental underpinnings. High-throughput computational materials design combines thermodynamic and electronic-structure methods with data-mining capabilities to more quickly evaluate material compositions for novel properties21,22,23. These ab-initio computing efforts have recently yielded a descriptor known as entropy-forming ability (EFA), which has shown considerable promise for predicting the ease of synthesizability and homogeneity of such materials5,12. A high EFA value for a specific composition signifies a small energy penalty to incorporate disorder, i.e., this descriptor can be a sorting parameter for likely single-phase, disordered, high-entropy materials. This descriptor was previously calculated for 56 high-entropy carbide (HEC) compositions and the single-phase cutoff was experimentally validated to exist between an EFA value of 45 and 505,12. The highest EFA materials have been demonstrated, via extended X-ray absorption fine structure (EXAFS), to exhibit minimal short-range chemical order12, a concern in the high-entropy materials community24,25. Although this method is high-throughput in comparison with other ab-initio efforts, calculation of EFA values remains a computationally intensive, time-consuming task. Thoroughly searching this new composition space, conservatively estimated to comprise hundreds of billions of new alloys1, is simply not feasible with this approach alone. Herein, we propose applying data science tools, specifically machine learning (ML), to guide more expensive computational and experimental search strategies toward promising candidate materials and therefore accelerate materials discovery.
Recently, the materials science field has embraced the big data revolution, as large databases become cost-effective and data generation rates continue to accelerate26,27,28,29,30. This has resulted in the development of a number of powerful data science tools to assist material scientists31,32,33,34,35,36. In the realm of materials discovery, data science tools have aided in the accelerated discovery or identification of new compositions for bulk metallic glasses37, shape memory alloys38, Heusler compounds39,40, and photocatalysts for CO2 reduction41. Other work has focused on the development of ML methods to establish structure–property linkages35,42, or predict the crystal stability of new materials43,44. In 2016, Ward et al.45 proposed a chemically diverse list of attributes, primarily data-mined from the periodic table, as a general set of features for broad material property prediction. These data-driven models can be fit to existing experimental data and continuously refined as new data are collected46. This inherent flexibility of ML-based decision-making frameworks provides an advantage given the dynamic nature of phase formation and stability. Moreover, when compared with density functional theory (DFT), the state-of-the-art toolbox for quantum mechanical modeling, ML models can perform well with reduced computational cost and without the need for atomic structure information44,47. This provides an opportunity to search materials space in an unconstrained manner without concern for the combinatorial explosion of higher-order compositions (ternaries, quaternaries, quinaries, etc.)48.
Through this work, we aim to accelerate materials innovation by developing a rapid predictor of the stability of high-entropy materials and demonstrating the model’s capability to predict single- or multi-phase results. With regard to speed, our ML model can evaluate the EFA of a single composition in under a millisecond, compared with hundreds of hours per composition with DFT, even using efficient automatic frameworks such as Automatic-Flow (AFLOW)49. The robustness of the model is investigated by focusing on locating successful five component compositions containing all three of the Group VI metals (Cr, Mo, and W) as 60% of the cation sublattice. The interest in using the ML model to locate single-phase compositions containing all three Group VI metals stems from the relationship between the electronic structure and mechanical/physical properties of transition metal carbides50,51. Prior studies have revealed that the transition metal carbides can be more effectively tuned by the enhanced metallic bonding, owing to valence filling instead of conventional microstructural engineering principles50,52. For example, the Group IV and V monocarbides readily form the rock-salt structure and demonstrate improved mechanical properties, such as fracture toughness, with changing directionality of the bonding as more valence electrons become available in Group V50,53. Computationally, the trend in increasing toughness is expected to continue to the right on the periodic table; however, the Group VI metals do not form a room temperature-stable rock-salt phase54,55,56. By employing high-entropy effects (i.e., increased solubility), we proposed that the three Group VI metals can be incorporated into a room temperature-stable rock-salt structure, resulting in an increased number of available electrons, and a novel group of materials with the potential to overturn previous material engineering limitations.
In this work, several single-phase, rock-salt crystal structure, five-metal cation carbides—for which three of the precursors have different structures and stoichiometric ratios of anions to cations from the resultant face centered cubic high-entropy material—are evaluated. The available precursors for the Group VI metals are hexagonal Mo2C, hexagonal WC, orthorhombic W2C, and orthorhombic Cr3C2. Rock-salt MoC and WC are only stable at temperatures above 1940 °C and 2500 °C, respectively. The only face-centered cubic (FCC) system in the Cr-C phase diagram is Cr23C6. See Supplementary Figs. 1–3 for the binary phase diagrams. To date, the authors are unaware of any previously explored high-entropy carbides containing Cr or the prior calculation of the EFA value by DFT for any Cr-containing compounds. The formation of a rock-salt structured monocarbide, wherein 60% of the cation species (Cr, Mo, and W) do not form this structure as their stable room temperature phase, is neither obvious nor readily predictable based on current theories.
These design goals are accomplished by supplementing the set of chemical descriptors of each composition with information from the calculated phase diagrams and utilizing an ML framework to rapidly predict the EFA of seventy previously unstudied high-entropy carbides containing Cr, an element not considered in the original composition space5. Complete information on the construction, training, and implementation of the ML model is included in the Methods section. Based on the validation against previously reported high-entropy metal carbides, comparison with DFT calculations for several new compositions, and the ability to locate and synthesize several otherwise unintuitive materials, we find that this screening strategy is aptly designed to identify promising high-entropy systems. The successful outcome demonstrates the synergy between thermodynamics, chemical descriptors, and ML methods for rapidly evaluating new materials based on prior experiments and computation.
The search for new high-entropy ceramics begins with fitting a random forest57, a type of ML model, on 56 previously reported EFA values5. This data set includes nine synthesized compositions, six single phase, and three multi-phase. The previous study only utilizes eight carbide forming metal elements (Hf, Nb, Ta, Ti, Mo, V, W, and Zr). As will be demonstrated, even this sparse data set with relatively few compositions with high entropic contributions is very useful in guiding subsequent experiments toward the best candidates and away from the multi-phase materials.
As our goal is to select the best model hyperparameters for predicting new compositions outside our training set, we evaluate the ML model’s performance using fivefold cross-validation and a grid search across selected hyperparameters (see the Methods section for further details). The final model hyperparameters selected for both models, with and without CALPHAD data, are ten predictor trees and mean absolute error (MAE) for scoring. The best hyperparameters were used to fit models to the labeled data. Supplementary Fig. 4 shows an example predictor tree from the model with CALPHAD data and demonstrates the complex relationships between the predictor variables. Figure 1 compares the performance of the ML model fit with only chemical attributes (Fig. 1a) and the model fit with chemical attributes and information from CALPHAD (Fig. 1b). The DFT-calculated and ML-predicted values for each model are listed in Supplementary Table 1. Although the MAEs for all models are equivalent (3.8 (eV/atom)−1), the coefficient of determination (R2) suggests the observed outcomes are better replicated by the ML model with access to the CALPHAD data. However, both models have a systematic error in which the compositions with known EFA < 50 are overestimated and, more noticeably, compositions with an EFA above 80 are underestimated. The small number of samples above 80 (6 total) coupled with bootstrapping (~66% of the data is used per tree) results in a low probability for them to be included in the construction of each decision tree. Further, the average EFA of the materials in each tree (∼58 depending on the tree) is in line with the average for the data set. With only 1 sample above 100 in the data set, and these samples having a low probability of being used in tree construction, the averaging process in random forest pulls down the predicted values for the highest EFA materials. It will be demonstrated that the improved R2 performance of the model toward fitting the starting data set will provide improved extrapolation on the Cr-containing systems in the search for high-entropy ceramics containing all three Group VI precursors.
The permutation importance of the chemical attribute and CALPHAD features is studied to provide interpretability to the ML model. Details for each chemical attribute can be found in the Supplementary Information. The rationale for selecting permutation importance is the following: randomly permuting the value of predictor variable Xi and computing the EFA together with the unpermuted predictor variables, will result in significantly reduced prediction accuracy if the original variable Xi was significantly associated with the output value. Permutation importance also has the advantage, compared with univariate screening methods, in that it assesses the impact of each predictor variable individually and with the other unpermuted predictor variables58. Table 1 shows the top ten features and their importance rank for fitting each random forest model to the EFA values calculated from DFT. As evidenced in the model performance and feature importance, the eight additional CALPHAD features provide valuable information about the EFA of a given composition, particularly the liquidus temperature (ranked second). However, CALPHAD diagrams alone would also be insufficient for determining the ability to fabricate a single-phase material. Supplementary Fig. 5 demonstrates this by comparing ThermoCalc SSOL6 database computed diagrams of compositions known to form single or multi-phase carbides. For each of these compositions, CALPHAD alone would predict rock-salt to be the primary structure to evolve from the liquid, which would be stable down to nearly 1500 K before forming a secondary metal carbide. In reality, only MoNbTaVWC5 (Supplementary Fig. 5a) readily forms a single phase experimentally, whereas the other three compositions (Supplementary Fig. 5b–d) have been demonstrated previously to be multi-phase materials5. However, including some CALPHAD data as features improves the ML model via this thermodynamic-based preview of what is likely to occur and improves its extrapolation capabilities beyond that of the chemical attributes alone.
Important features from the chemical attributes are the average ionic character between each of the atomic species, the maximum and fraction-weighted covalent radius, and a few features representing the valence electrons or unfilled orbitals. These chemical attributes quantify the expected bonding nature and local environment each atom will experience if single phase (i.e., homogeneously disordered). Along the same lines, these metrics also assist the ML model to determine what atomic environments are unfavorable, resulting in multi-phase materials. Further analysis of the relationship between the EFA of a composition and the top ranked predictors reveals there is noticeable correlation (Fig. 2). A plot of average ionic character vs. EFA reveals that increasing the average ionic character between the pairs of atoms is more likely to result in a multi-phase material (Fig. 2a). This property has been previously suggested to play a role in determining single or multi-phase outcomes, but has not yet been extensively studied and its contribution not well understood10,59. A parameter not previously studied in the high-entropy literature, the liquidus temperature derived from CALPHAD also provides insight into the magnitude of the expected EFA for a given composition (Fig. 2b). Intuitively, the compositions with the highest EFA values lie furthest away from the trendlines, highlighting the need for multi-variable approaches, like those offered in ML, to locate the best compositions.
Experimental and computational validation
New experimental compositions were chosen utilizing all nine of the Group IV, V, and VI refractory metals (Cr, Hf, Nb, Mo, Ta, Ti, V, W, and Zr) in equiatomic amounts plus carbon occupying the anion lattice. The ML model fitted to the 56 compositions with DFT-derived EFA values is used to rapidly screen the Cr-containing compositions for high and low expected EFA values. The ML calculated EFA values for the full set of 70 new five-metal compositions are provided in Table 2. Seven candidates are selected from this list for analysis by DFT and experimental synthesis: (i) three candidates with predicted EFA values over 100 (eV/atom)−1 and each containing all of the Group VI metals (CrMoNbVWC5, CrMoTaVWC5, and CrMoNbTaWC5), (ii) three candidates with a predicted EFA ≤ 55 (eV/atom)−1 (CrHfMoTiWC5, CrMoTiWZrC5, and CrHfTaWZrC5), and (iii) one composition with an intermediate EFA (CrMoTiVWC5) also containing the three Group VI metals. Computing the EFA from DFT and fabricating the selected compositions with a low predicted EFA serves two purposes: (i) to demonstrate the model performs well at finding both the best and worst candidates and (ii) to establish that not every system containing all three Group VI metals will form a single phase.
Several of the compositions in Table 2 have ML predicted EFA values that suggest they will readily form a single-phase high-entropy carbide, despite containing the three Group VI refractory metal elements. If successfully synthesized into a single phase, these novel materials would contain three carbides that do not exist as room temperature-stable rock-salt monocarbides (refer to the binary phase diagrams in Supplementary Figs. 1–3). Several fundamentally interesting compositions are those where one of the rock-salt stable precursors (i.e., NbC, TaC, or VC) is substituted from MoNbTaVWC5 (EFADFT of 125 (eV/atom)−1)5 with an orthorhombic (Cr3C2 or W2C) or hexagonal (Mo2C or WC) precursor that do not form stable rock-salt structures.
As the first step in validating the ML model’s extrapolation into the Cr-containing chemical space, the EFA of the seven selected compositions were subsequently computed by DFT. The ab-initio EFA values are located in Table 2 and plots comparing the ML model with chemical attributes (Fig. 3a) and the ML model including CALPHAD data (Fig. 3b) illustrate the improved regression performance of the model after inclusion of the CALPHAD features. The red circles in Fig. 3 are the predicted EFA values for the seven Cr-containing compositions compared with their DFT-calculated value. Although the ML models were not refit with the new DFT-computed EFA values, the R2 and MAE of each ML model can be re-evaluated after including the extrapolated data. In comparison with the chemical attributes alone, the R2 value remains the same and the MAE increases only slightly.
As a secondary method of validating the ML model, the seven selected materials were fabricated following conventional fabrication processes described in detail in the Methods section. Successful fabrication of the rock-salt structure after full densification was verified via X-ray diffraction (XRD) (Fig. 4). Results of XRD analysis for each sample following spark plasma sintering (SPS) demonstrate that compositions CrMoNbVWC5, CrMoNbTaWC5, CrMoTaVWC5, and CrMoTiVWC5 (the top four) only exhibit a single set of FCC peaks of the desired rock-salt high-entropy phase. Conversely, XRD of CrHfTaWZrC5 CrMoTiWZrC5, and CrHfMoTiWC5 (bottom three) reveal the presence of multiple structures. In the event there are multiple FCC structures present, the majority FCC phase is indexed. In CrMoTiWZrC5 and CrHfMoTiWC5, the secondary phase is also FCC. The CrHfTaWZrC5 system contains a secondary hexagonal phase. The XRD pattern for CrHfMoTiWC5 and CrHfTaWZrC5 also contain a small amount (<5%) of HfO2 that remains due to processing. This is determined not to significantly alter the composition of the carbide phase.
Microstructure analysis and energy dispersive X-ray spectroscopy were then utilized to determine the homogeneity of the sintered pellets as shown in Fig. 5. Coupling the results of both techniques verified that the as-processed samples were either single-phase and chemically homogenous or underwent chemical segregation. For example, in the CrMoNbVWC5 microstructure, only grain contrast is present, and no notable indication of clustering or segregation is visible in the elemental maps. On the contrary, the CrHfTaWZrC5 sample has observable chemical contrast in the microstructure, and the chemical maps demonstrate that the secondary phase present in XRD is rich in Cr and W. The CrMoTiWZrC5 and CrHfTaWZrC5 samples displayed were sintered at 1600 °C to prevent the loss of Cr. When sintered at 1800 °C, EDS revealed the Cr content in these samples was as low as 2 at% and the chrome carbide was found to have reacted with the graphite tooling. The medium entropy composition, CrMoTiVWC5, resulted in a single FCC rock-salt structure after initial sintering, but required annealing as described in the Methods section to reach chemical homogeneity. Subsequently, electron backscatter diffraction (EBSD) was utilized to study the resulting microstructure of the samples. The single-phase, homogenous samples are observed to contain large, nearly equiaxed grains with some deviation owing to the remaining pores. This furthers the assertion these compositions are single phase, as they allow for the kinetics of grain growth. In stark contrast, the multi-phase materials have a significantly reduced grain size, owing to the competing phases preventing further grain growth during sintering.
A powerful data-driven approach to estimating the synthesizability of high-entropy materials, based on data from previous DFT calculations and experimental results, is detailed and demonstrated on 70 new chromium containing compositions. The ML framework is found to be improved by the inclusion of data from CALPHAD and robust toward extrapolating outside the starting chemical space. The ML model enhancement achieved by combining general features and thermodynamic data from CALPHAD is explored via assessing the impact of each predictor variable individually as well as with the other predictor variables (permutation importance) and evaluating compositions outside the original chemical space. The predictive capability of this method is validated by ab-initio calculations and experimental fabrication of several previously unreported compositions, including four single-phase rock-salt materials that would not be obvious candidates given the stable precursors and binary phase diagrams of the Group VI transition metals. These novel materials, of which 60% of the cation lattice contains Group VI metals, represent a step forward in electronic structure engineering of transition metal carbides: prior modeling of the bonding nature with increased valence electrons54,55,56 suggests that future material property studies are likely to yield useful combinations for practical engineering applications. Furthermore, the experimentally studied compositions result in single or multi-phase materials in agreement with their predicted EFA values. The remaining predicted materials include diverse chemistries and present ample opportunity for materials discovery. Moreover, the methodology designed opens the door to locating other high-entropy materials, not just ceramics, in a similar manner.
Random forests are a combination of decision trees that individually make predictions on each input and the overall prediction determined by a majority voting process57,60. Random forest was selected for its utility and performance on diverse problems when compared with other supervised learning models60. The random forest regressor is implemented with Scikit-learn61. Model hyperparameters are selected via an exhaustive fivefold cross-validated grid search using the following parameters: number of tree predictors in range 10–110 in steps of 10, mean-squared error and MAE as criterion, and the number of features to consider when looking for the best split from one to the total number of features available. Each fold is scored using the MAE between the labels from DFT and the predicted values. To obtain a deterministic behavior during model fitting, the random state is seeded. The best performing hyperparameters are selected to fit a model using the entire training set, with bootstrapping, to maximize the amount of information available for making future predictions.
From chemistry to features
Each composition is converted to a set of features with the goal of creating a quantitative representation that relates to the essential chemistry, physics, and thermodynamics of each material in a data set. The attributes utilized in this work should not be considered an exhaustive list, but instead a step toward creating a synergistic set of attributes that capture the knowledge of chemistry and experimentally robust thermodynamics. The 108 compositional attributes, defined in the Supplementary Information, are a subset of the general ML framework demonstrated previously to perform well on diverse material problems45. The elemental data used to compute the compositional features is sourced from Magpie45,62. These chemical attributes are augmented with select data about the number of phases and phase fractions calculated in 100 K steps, as well as the liquidus and solidus temperature from ThermoCalc Software SSOL6 database version 6.163. The ~800 CALPHAD features are reduced to 8 predictor variables (1% of those available) using the Select From Model method in Scikit-learn61 to avoid the “curse-of-dimensionality” and find the most relevant subset64,65,66. Select From Model was chosen in this study for its rapid reduction of features in one step in comparison to other multi-step methods such as recursive feature elimination. The max number of features was set to 8 for this study, to target 1% of the available data. We do not intend for this feature list to be exhaustive or concrete. The selected features are defined in the Supplementary Information. The data for the predictor variables for the training data and new compositions are contained in the GitHub repository.
Interpreting the random forest algorithm
The random forest model is analyzed to provide clarity to how the ML model evaluated these materials. The variable importance is extracted for the fit model using the “rfpimp” package in Python (available at https://github.com/parrt/random-forest-importances, last access: 15 August 2019). The predictor variable importance is ranked on the permutation importance, which directly measures importance by observing the effect on model accuracy by randomly permuting the values of each predictor variable67. That is to say, the permutation importance is measuring the impact on output EFA of swapping the value of a selected feature from one composition with the value from a different composition. This method has recently been introduced as an improvement to the mean decrease in impurity metric58.
All samples were prepared using the same methods and tools utilized in the previous EFA and HEC studies5,12. Initial powders of each of the five binary precursor carbides (NbC, HfC, TiC, ZrC, VC, TaC, Mo2C, W2C, WC, and Cr3C2) are obtained in >99% purity and −325 mesh (<44 μm) particle size (Alfa Aesar). The sample is weighed out in 12 g batches and mixed to achieve the desired five-metal carbide compositions. To ensure adequate mixing, each sample is high energy ball milled under argon in a shaker pot mill for a total of 2 h in individual 30 min intervals intersected by 15 min rest times to avoid heating and consequent oxide formation. All milling is done in tungsten carbide-lined stainless-steel milling jars with tungsten carbide grinding media. Bulk sample pellets are synthesized via solid-state processing routes. The field-assisted sintering technique, also called SPS, is employed to simultaneously densify and react the compositions into single-phase materials. Sintering of each composition is performed at 1800 °C with a heating rate of 100 °C/min, 60 MPa uniaxial pressure applied at temperature, with a 10 min dwell at temperature. Subsequent samples of CrMoTiWZrC5 and CrHfTaWZrC5 were necessarily sintered at 1600 °C instead to prevent the loss of Cr. The composition with a medium EFA value, CrMoTiVWC5, is annealed at 1800 °C for 3 h followed by 2000 °C for 3 h to attain chemical homogeneity. All samples are heated in a vacuum environment of <20 mtorr with additional holds throughout for adequate off-gassing of the powder materials. Sintering is done in 20 mm graphite die and plunger sets surrounded by carbon-based heat shielding. In addition, graphite foil surrounds the samples on all sides to prevent reaction with the die. The compositions listed are nominal since actual synthesized compositions can very due to carbon vacancies in the anion sublattice.
Microstructural and elemental analysis is performed using a Thermo Fischer (formerly FEI) Apreo field emission scanning electron microscope equipped with an Oxford X-MaxN EDS detector and an Oxford Symmetry EBSD detector. A combination of secondary and back-scattered electron detectors are utilized for imaging. EDS scans are conducted at length scales of 500× and 1000× to verify multi-length scale homogeneity in the resulting microstructure. EDS quantification confirmed the resulting ratio of metal ions are nearly equiatomic. Crystal structure analysis is implemented using a Rigaku Miniflex X-ray Diffractometer with a 1D detector using a step size of 0.02° and 5° per minute scan rate, using Cu Kα radiation (wavelength λ = 1.54059 Å) for all measurements. The lattice parameter is calculated utilizing a combinatorial method from both MDI Jade and Match! Phase identification software. This is subsequently utilized to model and create a theoretical diffraction profile to be utilized in EBSD.
Calculation of the entropy-forming ability
The EFA is calculated using the AFLOW-POCC module17 implemented in the AFLOW Framework for Materials Discovery68. For each disordered composition, a set of representative ordered supercells is resolved. First, AFLOW-POCC determines the smallest supercell size accommodating the stoichiometry exactly (for the five-metal rock-salt carbides, the value is 5). The unique superlattices of this size are then constructed based on the Hermite Normal Form matrices. The lattices are decorated to generate all viable configurations. To identify unique configurations and their degeneracies rapidly, the Universal Force Field method is employed. The energies of the unique configurations are then calculated using DFT having input parameters/settings in accordance with the AFLOW Standard29. K-point meshes are generated using the Monkhorst–Pack scheme (Gamma-centered for all materials belonging to the hP and hR Bravais lattice) having at least 6000 k-points per reciprocal atom. Project-Augmented Wavefunction potentials are constructed according to the Perdew–Berke–Ernzerhof exchange-correlation functional as implemented in the Vienna Ab initio Simulation Package (VASP). The plane-wave basis has a kinetic energy cutoff 1.4 times larger than that recommended for each species. Spin polarization is considered. The electronic and ionic convergence criteria are 10−3 and 10−2 eV, respectively. The EFA is defined as the inverse of the spread of these energies5.
All data analyzed during the current study are available at GitHub address https://github.com/krkaufma/ML-EFA or from the corresponding author upon reasonable request.
All code and models generated, developed, and/or utilized are available at GitHub address https://github.com/krkaufma/ML-EFA with the trained weights and sample code to demonstrate how to use the model for material discovery. We intend to continually refine the model by training on larger datasets, and expanding the composition range, as they become available.
Miracle, D. B. High entropy alloys as a bold step forward in alloy development. Nat. Commun. 10, 1805 (2019).
Oses, C., Toher, C. & Curtarolo, S. High-entropy ceramics. Nat. Rev. Mater. 5, 285–309. https://doi.org/10.1038/s41578-019-0170-8 (2020).
George, E. P., Raabe, D. & Ritchie, R. O. High-entropy alloys. Nat. Rev. Mater. 4, 515–534 (2019).
Toher, C., Oses, C., Hicks, D. & Curtarolo, S. Unavoidable disorder and entropy in multi-component systems. npj Comput. Mater. 5, 69 (2019).
Sarker, P. et al. High-entropy high-hardness metal carbides discovered by entropy descriptors. Nat. Commun. 9, 4980 (2018).
Senkov, O. N., Miller, J. D., Miracle, D. B. & Woodward, C. Accelerated exploration of multi-principal element alloys with solid solution phases. Nat. Commun. 6, 6529 (2015).
Cantor, B., Chang, I. T. H., Knight, P. & Vincent, A. J. B. Microstructural development in equiatomic multicomponent alloys. Mater. Sci. Eng. A 375–377, 213–218 (2004).
Yeh, J.-W. et al. Nanostructured high-entropy alloys with multiple principal elements: novel alloy design concepts and outcomes. Adv. Eng. Mater. 6, 299–303 (2004).
Gild, J. et al. A high-entropy silicide: (Mo0.2Nb0.2Ta0.2Ti0.2W0.2)Si2. J. Mater. https://doi.org/10.1016/j.jmat.2019.03.002 (2019).
Rost, C. M. et al. Entropy-stabilized oxides. Nat. Commun. 6, 8485 (2015).
Gludovatz, B. et al. A fracture-resistant high-entropy alloy for cryogenic applications. Science 345, 1153–1158 (2014).
Harrington, T. J. et al. Phase stability and mechanical properties of novel high entropy transition metal carbides. Acta Mater. 166, 271–280 (2019).
Lim, X. Mixed-up metals make for stronger, tougher, stretchier alloys. Nature 533, 306–307 (2016).
Li, Z., Tasan, C. C., Springer, H., Gault, B. & Raabe, D. Interstitial atoms enable joint twinning and transformation induced plasticity in strong and ductile high-entropy alloys. Sci. Rep. 7, 40704 (2017).
Tsao, T.-K. et al. The high temperature tensile and creep behaviors of high entropy superalloy. Sci. Rep. 7, 12658 (2017).
Senkov, O. N., Wilks, G. B., Scott, J. M. & Miracle, D. B. Mechanical properties of Nb25Mo25Ta25W25 and V20Nb20Mo20Ta20W20 refractory high entropy alloys. Intermetallics 19, 698–706 (2011).
Yang, K., Oses, C. & Curtarolo, S. Modeling off-stoichiometry materials with a high-throughput ab-initio approach. Chem. Mater. 28, 6484–6492 (2016).
Gao, M. & Alman, D. Searching for next single-phase high-entropy alloy compositions. Entropy 15, 4504–4519 (2013).
Zhang, F. et al. An understanding of high entropy alloys from phase diagram calculations. Calphad 45, 1–10 (2014).
Lederer, Y., Toher, C., Vecchio, K. S. & Curtarolo, S. The search for high entropy alloys: a high-throughput ab-initio approach. Acta Mater. 159, 364–383 (2018).
Curtarolo, S. et al. The high-throughput highway to computational materials design. Nat. Mater. 12, 191–201 (2013).
Rohrer, G. S. et al. Challenges in ceramic science: a report from the workshop on emerging research areas in ceramic science. J. Am. Ceram. Soc. 95, 3699–3712 (2012).
Levy, O., Hart, G. L. W. & Curtarolo, S. Structure maps for hcp metals from first-principles calculations. Phys. Rev. B Condens. Matter Mater. Phys. 81, 174106 (2010).
Singh, P., Smirnov, A. V. & Johnson, D. D. Atomic short-range order and incipient long-range order in high-entropy alloys. Phys. Rev. B Condens. Matter Mater. Phys. 91, 224204 (2015).
Yin, S., Ding, J., Asta, M. & Ritchie, R. O. Ab initio modeling of the role of local chemical short-range order on the Peierls potential of screw dislocations in body-centered cubic high-entropy alloys. Preprint at arXiv: 1912.10506 (2019).
White, A. A. Big data are shaping the future of materials science. MRS Bull. 38, 594–595 (2013).
Saal, J. E., Kirklin, S., Aykol, M., Meredig, B. & Wolverton, C. Materials design and discovery with high-throughput density functional theory: the Open Quantum Materials Database (OQMD). JOM 65, 1501–1509 (2013).
Curtarolo, S. et al. AFLOW: an automatic framework for high-throughput materials discovery. Comput. Mater. Sci. 58, 218–226 (2012).
Calderon, C. E. et al. The AFLOW standard for high-throughput materials science calculations. Comput. Mater. Sci. 108, 233–238 (2015).
Jain, A. et al. Commentary: the materials project: a materials genome approach to accelerating materials innovation. APL Mater. 1, 011002 (2013).
DeCost, B. L. & Holm, E. A. A computer vision approach for automated analysis and classification of microstructural image data. Comput. Mater. Sci. 110, 126–133 (2015).
DeCost, B. L., Lei, B., Francis, T. & Holm, E. A. High throughput quantitative metallography for complex microstructures using deep learning: A case study in ultrahigh carbon steel. Microsc. Microanal. 25, 21–29 (2019).
Zhu, C., Wang, H., Kaufmann, K. & Vecchio, K. S. A computer vision approach to study surface deformation of materials. Meas. Sci. Technol. 31, 055602 (2020).
Kaufmann, K. et al. Crystal symmetry determination in electron diffraction using machine learning. Science 367, 564–568 (2020).
Park, W. B. et al. Classification of crystal structure using a convolutional neural network. IUCrJ 4, 486–494 (2017).
Tshitoyan, V. et al. Unsupervised word embeddings capture latent knowledge from materials science literature. Nature 571, 95–98 (2019).
Ren, F. et al. Accelerated discovery of metallic glasses through iteration of machine learning and high-throughput experiments. Sci. Adv. 4, eaaq1566 (2018).
Xue, D. et al. Accelerated search for materials with targeted properties by adaptive design. Nat. Commun. 7, 11241 (2016).
Sanvito, S. et al. Accelerated discovery of new magnets in the Heusler alloy family. Sci. Adv. 3, e1602241 (2017).
Carrete, J., Mingo, N., Wang, S. & Curtarolo, S. Nanograined half-Heusler semiconductors as advanced thermoelectrics: an ab initio high-throughput statistical study. Adv. Funct. Mater. 24, 7427–7432 (2014).
Singh, A. K., Montoya, J. H., Gregoire, J. M. & Persson, K. A. Robust and synthesizable photocatalysts for CO2 reduction: a data-driven materials discovery. Nat. Commun. 10, 443 (2019).
Isayev, O. et al. Universal fragment descriptors for predicting properties of inorganic crystals. Nat. Commun. 8, 15679 (2017).
Ye, W., Chen, C., Wang, Z., Chu, I.-H. & Ong, S. P. Deep neural networks for accurate predictions of crystal stability. Nat. Commun. 9, 3800 (2018).
Legrain, F., Carrete, J., van Roekeghem, A., Curtarolo, S. & Mingo, N. How chemical composition alone can predict vibrational free energies and entropies of solids. Chem. Mater. 29, 6220–6227 (2017).
Ward, L., Agrawal, A., Choudhary, A. & Wolverton, C. A general-purpose machine learning framework for predicting properties of inorganic materials. npj Comput. Mater. 2, 16028 (2016).
Ling, J., Hutchinson, M., Antono, E., Paradiso, S. & Meredig, B. High-dimensional materials and process optimization using data-driven experimental design with well-calibrated uncertainty estimates. Integr. Mater. Manuf. Innov. 6, 207–217 (2017).
Stanev, V. et al. Machine learning modeling of superconducting critical temperature. npj Comput. Mater. 4, 29 (2018).
Meredig, B. et al. Combinatorial screening for new materials in unconstrained composition space with machine learning. Phys. Rev. B Condens. Matter Mater. Phys. 89, 1–7 (2014).
Oses, C., Toher, C. & Curtarolo, S. Data-driven design of inorganic materials with the Automatic Flow Framework for Materials Discovery. MRS Bull. 43, 670–675 (2018).
De Leon, N., Yu, X. X., Yu, H., Weinberger, C. R. & Thompson, G. B. Bonding effects on the slip differences in the B1 monocarbides. Phys. Rev. Lett. 114, 165502 (2015).
Wuchina, E., Opila, E., Opeka, M., Fahrenholtz, W. & Talmy, I. UHTCs: ultra-high temperature ceramics for extreme environment applications. Electrochem. Soc. Interface 16, 30–36 (2007).
Jhi, S. H., Ihm, J., Loule, S. G. & Cohen, M. L. Electronic mechanism of hardness enhancement in transition-metal carbonitrides. Nature 399, 132–134 (1999).
Toth, L. E. Transition Metal Carbides and Nitrides (Academic Press, 1971).
Kavitha, M., Sudha Priyanga, G., Rajeswarapalanichamy, R. & Iyakutti, K. Structural stability, electronic, mechanical and superconducting properties of CrC and MoC. Mater. Chem. Phys. 169, 71–81 (2016).
Balasubramanian, K., Khare, S. V. & Gall, D. Valence electron concentration as an indicator for mechanical properties in rocksalt structure nitrides, carbides and carbonitrides. Acta Mater. 152, 175–185 (2018).
Das, T., Deb, S. & Mookerjee, A. Study of electronic structure and elastic properties of transition metal and actinide carbides. Phys. B Condens. Matter 367, 6–18 (2005).
Breiman, L. Random forests. Mach. Learn 45, 5–32 (2001).
Strobl, C., Boulesteix, A. L., Zeileis, A. & Hothorn, T. Bias in random forest variable importance measures: Illustrations, sources and a solution. BMC Bioinformatics 8, 25 (2007).
Huang, E. W. et al. A study of lattice elasticity from low entropy metals to medium and high entropy alloys. Scr. Mater. 101, 32–35 (2015).
Caruana, R. & Niculescu-Mizil, A. An empirical comparison of supervised learning algorithms. ACM Int. Conf. Proc. Ser. 148, 161–168 (2006).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
wolverton/magpie — Bitbucket. Available at: https://bitbucket.org/wolverton/magpie/src/master/ (Accessed 21 November 2019).
Andersson, J. O., Helander, T., Höglund, L., Shi, P. & Sundman, B. Thermo-Calc & DICTRA, computational tools for materials science. Calphad Comput. Coupling Phase Diagr. Thermochem 26, 273–312 (2002).
Ramírez, J. et al. Computer aided diagnosis system for the Alzheimer’s disease based on partial least squares and random forest SPECT image classification. Neurosci. Lett. 472, 99–103 (2010).
Qi, Y. in Random Forest for Bioinformatics BT - Ensemble Machine Learning: Methods and Applications (eds Zhang, C. & Ma, Y.) 307–323. https://doi.org/10.1007/978-1-4419-9326-7_11 (Springer US, 2012).
Ouyang, R., Curtarolo, S., Ahmetcik, E., Scheffler, M. & Ghiringhelli, L. M. SISSO: A compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates. Phys. Rev. Mater. 2, 083802 (2018).
Altmann, A., Toloşi, L., Sander, O. & Lengauer, T. Permutation importance: a corrected feature importance measure. Bioinformatics 26, 1340–1347 (2010).
Toher, C. et al. in Handbook of Materials Modeling 1–28. https://doi.org/10.1007/978-3-319-42913-7_63-1 (Springer International Publishing, 2018).
We acknowledge support through the Office of Naval Research ONR-MURI (grant number N00014-15-1-2863). K.K. acknowledges support by the Department of Defense (DoD) through the National Defense Science and Engineering Graduate Fellowship (NDSEG) Program. K.K. also acknowledges the financial support of the ARCS Foundation, San Diego Chapter. K.S.V. acknowledges the financial generosity of the Oerlikon Group in support of his research group.
The authors declare no competing interests.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Kaufmann, K., Maryanovsky, D., Mellor, W.M. et al. Discovery of high-entropy ceramics via machine learning. npj Comput Mater 6, 42 (2020). https://doi.org/10.1038/s41524-020-0317-6
Machine learning assisted screening of non-rare-earth elements for Mg alloys with low stacking fault energy
Computational Materials Science (2021)
Journal of Energy Chemistry (2021)
Temperature-dependent elastic properties of binary and multicomponent high-entropy refractory carbides
Materials & Design (2021)
Machine Learning and Big Data Provide Crucial Insight for Future Biomaterials Discovery and Research
Acta Biomaterialia (2021)
Chemical Engineering Journal (2021)