Introduction

The conventional alloying method almost always starts with one or two principal metallic elements and proceeds by adding further alloying elements to engineer the desired mechanical and chemical properties1,2,3. The mechanical and chemical properties of the resulting alloy therefore remain controlled by the principal elements. For instance, Fe is the principal element in steels, Cu/Zn in brass, Ni/Co in superalloys and Ti in titanium alloys4,5,6. About 15 years ago, Yeh and Cantor7,8 introduced a novel alloy concept known as high entropy alloys (HEAs), which consist of multiple principal elements (N = 5 or more elements) in near-equiatomic percentages. The increased complexity introduces higher configurational entropy (growing as \(k_{B}T\,N\ln N\), where T is the temperature) compared to conventional alloys. As the number of elements N increases, the number of pairs grows as \(\sim N^{2}\) and raises the probability of favorable pair-driven formation enthalpy, which introduces a complex-chemistry effect (often referred to as a “cocktail effect”). The mixing of multi-principal elements generally introduces four core effects: high mixing entropy, lattice distortions, slow diffusion, and a “cocktail” effect, which result in a simple microstructure and excellent mechanical properties9,10,11,12,13. Further study revealed that in several HEAs, such as the Mo0.5AlNbTa0.5TiZr system, the comparatively low configurational entropy did not overcome the enthalpic contributions, so secondary phases formed instead of only solid solution phases. A more general name and definition for such alloy systems, CCAs14,15, has therefore emerged, and it is the naming convention used throughout this paper.

The number of possible elemental compositions is much higher in CCAs than in traditional metallic alloys because CCAs comprise multiple principal elements16. This broader compositional space provides an opportunity to improve mechanical properties, such as Young’s modulus, yield strength, and hardness. However, it is extremely challenging to select the appropriate composition by trial-and-error experiment or intuition17. Atomistic modeling, such as molecular dynamics (MD) and density functional theory (DFT), and thermodynamic modeling have been used to study phase stabilization, solidification, and crystallization kinetics of CCAs18,19,20,21,22,23,24,25. These techniques are computationally expensive, time consuming, and challenging to apply to large polycrystalline samples, and hence cannot be used on a large scale to narrow down the search space. Moreover, the variety of microstructures makes the calculations more complex and expensive than for traditional alloys, and hence it is challenging to predict the chemistries and compositions for a target property.

Nowadays, data-driven research, and more specifically ML, which is widely used in self-driving cars26, image classification27, web searches28, and fraud detection29, is also employed to solve different challenges in materials science30. For instance, Zhang et al.19 found that atomic size difference (δ), mixing entropy (\(\Delta S_{mix}\)) and enthalpy (\(\Delta H_{mix}\)) are the most important features in phase selection of HEAs. Singh et al.31,32,33 used high-throughput DFT to predict properties across chemical ranges and revealed correlations with valence electron concentration (VEC), size-difference (bandwidth) and vacancies. Roy et al.34 proposed that the average melting temperature (Tm) is the most important feature to predict the Young’s modulus of low, medium and high entropy alloys. Recent efforts utilizing ML35 considered two additional features, the Pauling electronegativity difference and the difference in VEC, and used a neural network (NN) to predict the phases that form in these CCAs. Thus, different features control each property of the alloy, and the importance of features varies from property to property.

Here, we have employed tree-based ensemble ML models, linear regression ML models, and kernel-based ML models to predict the Young’s modulus of CCAs consisting of refractory elements. This work identified VEC, average melting temperature and difference in atomic radii as the most important physical properties that control the Young’s modulus of CCAs. The study compared the relative merits of different ML models for a training set of refractory alloy data gathered from published literature. The model predictions were then validated against the Young’s modulus measured for 32 new alloys synthesized and tested as part of this work. The findings offer considerable promise for alloy down-selection based on ML models validated against high-quality experimental data of known provenance.

Methodology

Training data collection and feature selection

Data on Young's modulus for CCAs were collected from existing literature34,36,37,38. Two different data sets were used for model training. The first data set contains 154 alloys with a mixture of refractory and non-refractory alloys. The second data set contains 96 refractory alloys of Mo, Nb, Ta, W, mixed with some other elements like Al, Cr and Ni. Both datasets are presented in Tables 1 and 2 in the supplementary section. The goal of using two different data sets (one with a mixture of refractory and non-refractory alloys and the other with only refractory alloys) was to examine the effect of the elemental composition of training data on the reliability of the prediction with respect to experimentally synthesized validation data.

To train the ML models, we calculated 11 features for each alloy. These features are listed in Table 1. Past studies have shown that all of these features have a direct effect on the Young’s modulus of an alloy. To obtain these features, we first collected elemental data identified from domain knowledge, such as Pauling electronegativity, VEC, lattice constant, melting temperature, mixing enthalpy and atomic radii. We then used Python scripts to calculate the features listed in Table 1.
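As a rough illustration of this kind of feature calculation, the sketch below computes three composition-weighted descriptors (mean VEC, mean melting temperature, and atomic size difference δ) for one alloy. The formulas are the ones commonly used in the HEA literature and are assumed here, not taken from Table 1; the elemental values and the names `ELEMENTS`, `weighted_mean` and `atomic_size_difference` are illustrative placeholders rather than the scripts used in this work.

```python
import math

# Illustrative elemental data (assumed values; use a vetted database in practice).
# vec: valence electron count, tm: melting point in K, r: atomic radius in pm.
ELEMENTS = {
    "Mo": {"vec": 6, "tm": 2896.0, "r": 139.0},
    "Nb": {"vec": 5, "tm": 2750.0, "r": 146.0},
    "Ta": {"vec": 5, "tm": 3290.0, "r": 146.0},
    "W": {"vec": 6, "tm": 3695.0, "r": 139.0},
}


def normalize(composition):
    """Convert arbitrary molar amounts into atomic fractions that sum to 1."""
    total = sum(composition.values())
    return {element: amount / total for element, amount in composition.items()}


def weighted_mean(composition, key):
    """Composition-weighted average of one elemental property."""
    fractions = normalize(composition)
    return sum(c * ELEMENTS[element][key] for element, c in fractions.items())


def atomic_size_difference(composition):
    """delta = sqrt(sum_i c_i * (1 - r_i / r_mean)^2), returned as a fraction."""
    fractions = normalize(composition)
    r_mean = weighted_mean(composition, "r")
    return math.sqrt(
        sum(c * (1.0 - ELEMENTS[element]["r"] / r_mean) ** 2
            for element, c in fractions.items())
    )


# Hypothetical equiatomic refractory alloy.
alloy = {"Mo": 1, "Nb": 1, "Ta": 1, "W": 1}
features = {
    "vec": weighted_mean(alloy, "vec"),
    "tm": weighted_mean(alloy, "tm"),
    "delta": atomic_size_difference(alloy),
}
print(features)
```

The same pattern extends to the other descriptors once their elemental inputs are available.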

Table 1 Features of alloys considered in this analysis.

To examine the association between the features, we calculated the Pearson correlation coefficients (PCC). Figure 1 shows the PCC for the mixed alloys data set and for the refractory alloys data set. In the PCC “heatmap”, P = +1 indicates a strong positive correlation and P = −1 indicates a strong negative correlation. Figure 1 indicates the absence of any significant correlation between any pair of features except \(\Delta a\) and \(a_m\) in Fig. 1a. However, the ML models we considered here can deal with multicollinearity, and hence this correlation will not have any significant impact on the predictions. Therefore, we considered all the features in the model.
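The pairwise coefficients behind such a heatmap can be sketched in a few lines. The example below is a minimal, dependency-free version; the feature columns are hypothetical numbers chosen only to exercise the function, not values from the data sets.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def correlation_matrix(columns):
    """Nested dict of PCC values for every pair of named feature columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical feature columns (one entry per alloy).
columns = {
    "VEC": [4.4, 4.8, 5.0, 5.5, 5.8],
    "Tm": [2300.0, 2500.0, 2650.0, 3150.0, 3000.0],
    "delta": [0.060, 0.052, 0.047, 0.025, 0.031],
}
pcc = correlation_matrix(columns)
for name, row in pcc.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

Pairs with |P| close to 1 are the ones to inspect for multicollinearity.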

Figure 1

(a) PCC for data with both refractory and non-refractory alloys, (b) PCC for data with only refractory alloys. A value close to 1 or − 1 indicates positive or negative correlation, respectively.

Validation data preparation and Young’s modulus measurement

An experimental data set was used to validate the final model predictions. The validation set consisted of 32 alloys in the Mo-based family of refractory CCAs, containing Mo, Ta, W, Ti, Zr, Al and Cr. The validation alloys used in the study were prepared at Ames Lab Materials Preparation Center in the form of thin metal plates/foils. The alloys (1.5 g each) with selected compositions were synthesized by arc melting using a 32-cavity arc melting system (MTI corp, SP-MAM32). The actual compositions of the alloys after arc melting were quantified by energy dispersive spectroscopy (EDS). The densities of the samples were measured by the Archimedes method. The arc-melted buttons were then sliced by electrical-discharge machining into near-cylinder shapes (two parallel sides) with thicknesses of ~ 3 mm. The elastic modulus values were measured on the cylinders by the ultrasonic pulse-echo technique using a digital ultrasonic thickness gauge (Olympus, 38DL PLUS).
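For readers unfamiliar with pulse-echo measurements, the sketch below shows how elastic constants are commonly derived from sound velocities in an isotropic solid. It assumes that both longitudinal and shear wave velocities are available, which the text does not state explicitly; the function names and the sample numbers are hypothetical.

```python
def velocity_from_echo(thickness, round_trip_time):
    """Sound velocity (m/s) from sample thickness (m) and echo round-trip time (s)."""
    return 2.0 * thickness / round_trip_time


def elastic_constants(density, v_long, v_shear):
    """Isotropic elastic constants from density (kg/m^3) and wave velocities (m/s).

    Returns (Young's modulus, shear modulus, Poisson's ratio), moduli in Pa.
    """
    shear_modulus = density * v_shear ** 2
    poisson = (v_long ** 2 - 2.0 * v_shear ** 2) / (2.0 * (v_long ** 2 - v_shear ** 2))
    youngs_modulus = 2.0 * shear_modulus * (1.0 + poisson)
    return youngs_modulus, shear_modulus, poisson


# Hypothetical measurement on a 3 mm thick sample.
density = 10200.0  # kg/m^3, assumed Archimedes result
v_long = 6250.0    # m/s, assumed longitudinal velocity
v_shear = 3350.0   # m/s, assumed shear velocity
E, G, nu = elastic_constants(density, v_long, v_shear)
print(f"E = {E / 1e9:.1f} GPa, G = {G / 1e9:.1f} GPa, nu = {nu:.3f}")
```

The density enters linearly, so the accuracy of the Archimedes measurement carries directly into the modulus.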

Machine learning models construction

To predict the Young's modulus, we used eight models: four tree-based ensemble methods (Gradient Boosting, AdaBoost, Extreme Gradient Boosting (XGBoost) and Random Forest (RF)), two linear models (LASSO regression and Ridge regression) and two kernel-based methods (Gaussian Process Regression and Support Vector Machine (SVM)). Once the data were collected and the feature values were calculated, the 8 ML models were trained on each of the two data sets separately. We thus obtained 16 models: 8 for the larger data set with both refractory and non-refractory alloys and 8 for the smaller data set with only refractory alloys. Five-fold cross-validation was used to determine the errors. Cross-validation gives a more robust estimate of the errors than a single train-test split. Many metrics exist to quantify the predictive strength of a model, such as the root-mean-squared (RMS) error, mean-squared error, mean-absolute error (MAE), and the coefficient of determination R2. We chose the MAE as our metric because it most closely matches the form in which errors are reported in most experimental measurements. Additionally, we reported the R2 values for the optimized models.
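The cross-validation procedure can be sketched without any ML library. In the example below, a one-feature least-squares line stands in for the actual regressors (an assumption made to keep the sketch self-contained), and the data are synthetic; only the fold logic and the MAE and R2 definitions are the point.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least-squares line; a stand-in for the real regressors."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def cross_validate(xs, ys, k=5):
    """Return the held-out MAE of each of the k folds."""
    fold_maes = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        predictions = [model(xs[i]) for i in held_out]
        fold_maes.append(mean_absolute_error([ys[i] for i in held_out], predictions))
    return fold_maes


# Synthetic data: a linear trend with a deterministic +/-3 perturbation.
xs = [4.0 + 0.1 * i for i in range(20)]
ys = [50.0 * x - 100.0 + (3.0 if i % 2 == 0 else -3.0) for i, x in enumerate(xs)]
fold_maes = cross_validate(xs, ys, k=5)
mean_mae = sum(fold_maes) / len(fold_maes)
print(f"MAE per fold: {[round(m, 2) for m in fold_maes]}, mean: {mean_mae:.2f}")
```

The spread of the per-fold MAE values gives a simple measure of how robust a model is to the choice of training subset.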

The errors were minimized by performing hyper-parameter optimization using the grid-search algorithm. This algorithm determines the test error for all possible combinations of the supplied hyper-parameter values. Out of all combinations, the one with the least error was selected for our model. Each of the algorithms has a different set of hyper-parameters. Once the best hyper-parameters were selected, the optimized model using those hyper-parameters was used to make predictions for our validation set, whose Young’s modulus had been experimentally measured. Finally, the uncertainty of the predictions, i.e. the standard deviation, was calculated by the bootstrapping method with 100 resamples for each case. All of the above-mentioned tasks, such as cross-validation and grid search, were performed using the scikit-learn45 library in Python. All the ML models were likewise employed through the scikit-learn machine learning library for the Python language46, except for the XGBoost model, which was implemented through the library created by Tianqi Chen47.
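The exhaustive search described above can be sketched as follows. To stay self-contained, the example tunes a small nearest-neighbour regressor written inline rather than any of the eight models used in this work; the hyper-parameter names `n_neighbors` and `weighted`, the grid values and the data are all hypothetical.

```python
import itertools


def knn_predict(train_x, train_y, x, n_neighbors, weighted):
    """Predict y at x from the n nearest training points (toy stand-in model)."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - x))[:n_neighbors]
    if weighted:
        weights = [1.0 / (abs(px - x) + 1e-9) for px, _ in nearest]
    else:
        weights = [1.0] * len(nearest)
    return sum(w * py for w, (_, py) in zip(weights, nearest)) / sum(weights)


def cv_mae(xs, ys, params, k=5):
    """Mean absolute error over all held-out points of a k-fold split."""
    errors = []
    for fold in range(k):
        test_idx = list(range(fold, len(xs), k))
        held = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        for i in test_idx:
            errors.append(abs(ys[i] - knn_predict(train_x, train_y, xs[i], **params)))
    return sum(errors) / len(errors)


def grid_search(xs, ys, grid):
    """Score every combination in the grid and return (best, all_results)."""
    names = list(grid)
    results = []
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((cv_mae(xs, ys, params), params))
    best = min(results, key=lambda result: result[0])
    return best, results


# Synthetic data and a hypothetical hyper-parameter grid.
xs = [4.0 + 0.1 * i for i in range(25)]
ys = [40.0 * x + 5.0 * ((i % 3) - 1) for i, x in enumerate(xs)]
grid = {"n_neighbors": [1, 3, 5], "weighted": [False, True]}
(best_score, best_params), results = grid_search(xs, ys, grid)
print(f"best MAE = {best_score:.2f} with {best_params}")
```

The cost grows with the product of the grid sizes, which is why the grid for each model is usually kept small.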

Results and discussion

Model optimization

The ML models were first trained on both data sets. The hyper-parameters were optimized, and the training and validation errors were then calculated using five-fold cross-validation. We used these hyper-parameters to construct our final optimized models. The optimized hyper-parameters are presented in Table 3 of the supplementary section. The models built with these hyper-parameters were used to predict the Young’s modulus for the unseen data, i.e. the experimentally synthesized validation data set. The cross-validated MAE and R2 values for all the models are presented in Table 2. From Table 2 it is clear that the performance of the Gradient Boosting model is superior to that of the other models in terms of both accuracy (i.e., the MAE is lower and R2 is higher than for any other model) and robustness (i.e., the standard deviation of cross-validation is lower). Because of this performance, we discuss the feature importance and the prediction of Young’s modulus generated by the Gradient Boosting model.

Table 2 Optimized hyperparameter and cross-validated MAE and R2 for both data sets.

In our data sets, tree-based ensemble models perform better than the other models in predicting Young’s modulus. Ensemble algorithms have also shown better performance in other studies predicting materials properties34,48,49. Ensemble methods are meta-algorithms that combine several base models to produce a better predictive model. A bagging ensemble method can be used to decrease variance, and a boosting ensemble method can be used to decrease bias. A boosting method converts weak learners into strong ones50,51,52. Usually, decision stumps are used as the base weak learners, but this is not always the case. Most boosting methods build models in a stage-wise fashion and generalize the model by optimizing an arbitrary differentiable loss function. Boosting methods also help prevent over-fitting to some extent. Additionally, boosting methods can capture non-linear relations between target properties and features and help to deal with collinearity among the features. Furthermore, most boosting methods provide the feature importance associated with the model, which is needed to determine which features influence Young’s modulus the most. However, boosting methods are affected by the presence of outliers. Hence, it is recommended to perform outlier analysis before training the model.
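The stage-wise idea can be made concrete with a toy implementation. The sketch below is a minimal gradient-boosting loop for squared-error loss with decision stumps on a single feature; it is an illustrative simplification under those assumptions, not the scikit-learn implementation used in this work, and the data are synthetic.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimizes squared error."""
    best = None
    ordered = sorted(set(xs))
    for low, high in zip(ordered, ordered[1:]):
        threshold = (low + high) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def stump_output(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def boost(xs, ys, n_stages=50, learning_rate=0.1):
    """Fit stumps stage by stage to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_stages):
        # For squared-error loss the negative gradient is the plain residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_output(stump, x)
                       for x, p in zip(xs, predictions)]
    return {"base": base, "stumps": stumps, "learning_rate": learning_rate}


def predict(model, x):
    return model["base"] + model["learning_rate"] * sum(
        stump_output(stump, x) for stump in model["stumps"])


def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Synthetic non-linear target.
xs = [0.25 * i for i in range(20)]
ys = [x ** 2 for x in xs]
for stages in (0, 5, 50):
    print(stages, round(mse(boost(xs, ys, n_stages=stages), xs, ys), 3))
```

Each stump is a weak learner on its own; the training error falls as stages accumulate, and the learning rate controls how aggressively each stage corrects the previous ones.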

Feature importance

After training the models on both data sets containing refractory and non-refractory alloys using the optimized hyper-parameters, we determined the feature importance associated with the Gradient Boosting model. Feature importance is simply the score assigned to the features based on how useful they are at predicting a target variable. The feature importance for the larger data set containing both the refractory and non-refractory alloys, and smaller data set only with refractory alloys are presented in Fig. 2a,b, respectively. From feature importance, it is clear that the sequence of the features is not identical for both data sets. However, the smaller training data set showed better prediction accuracy as indicated in Table 2. Hence, we selected the important features generated from the smaller data set presented in Fig. 2b. In the next paragraph, we are going to explain the physical significance of some of the important features for the Young’s modulus of CCAs.

Figure 2

Feature importance for (a) larger training set containing both refractory and non-refractory alloys, (b) smaller training set containing only refractory alloys.

We found that VEC was the most important feature and had importance higher than 0.7. While it is not shown here, it is important to mention that other ML models i.e. XGBoost and RF showed good prediction capabilities and identified the VEC as the most important feature with an importance of more than 0.7. In the elastic limit and at a constant value of Poisson’s ratio, the Young’s modulus is related to the bulk modulus (Eq. 1) and hence we will explain the physics of the Young’s modulus dependence on VEC by exploring the physical relationship between bulk modulus and VEC53,54.

$$K = \frac{E}{3(1 - 2v)}$$
(1)

Here, K, E and \(v\) are the bulk modulus, Young’s modulus and Poisson’s ratio, respectively. Gilman et al.53,54 reported that materials with higher valence electron density (VED) (valence electrons/unit volume) possess a higher bulk modulus. As the number of valence electrons increases, the bulk modulus increases, and it decreases as the atomic size increases. The bulk modulus is determined predominantly by the resistance of the valence electrons to compression. In a metallic system, electrons behave like a dense gas, or liquid, with only a very small amount of viscosity. Hence, the greater the electron density, the greater the resistance to compression, and the higher the bulk modulus and the Young’s modulus. For instance, osmium possesses a VED 17% higher than that of diamond and correspondingly exhibits a bulk modulus 4% greater53,54. Although we considered VEC instead of VED in this work, the Young’s modulus still follows an upward trend with VEC for both the training and validation data sets, as presented in Fig. 3a,b. Our calculated feature importance indicates that the melting point of alloys, which is an indirect metric of bond strength34,55, has an impact on Young’s modulus, which generally increases with increasing melting temperature as presented in Fig. 3c,d. The geometrical parameter λ, which is a function of mixing entropy (\(\Delta S_{mix}\)) and the difference in atomic radii (δ), has a significant impact on Young’s modulus. The δ parameter has an impact on cohesive energy, and Young’s modulus increases with increasing cohesive energy56,57. In our case, we have seen that a lower value of δ results in a higher Young’s modulus, as presented in Fig. 3e,f. The difference in atomic radius influences the distribution of alloying elements and the metallic bond energy. The electronegativity has an impact on the electron density of atoms, and a larger value of electronegativity results in a higher Young’s modulus of metallic alloys58. Additionally, larger electronegativity differences (\(\Delta \chi\)) and higher mixing enthalpy (\(\Delta H_{mix}\)) increase the probability of formation of brittle intermetallic phases, which have lower Young’s modulus. Therefore, these two parameters could play an important role in determining the Young’s modulus of CCAs34.
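To make the proportionality between the two moduli explicit, Eq. 1 can be rearranged for E. The numerical value of Poisson’s ratio below is an assumed, illustrative figure and not a measured property of the alloys studied here:

$$E = 3K(1 - 2v)$$

$$v = 0.3 \;\Rightarrow\; E = 3K(1 - 0.6) = 1.2\,K$$

At a fixed Poisson’s ratio, E therefore scales linearly with K, so any trend of the bulk modulus with electron density carries over to the Young’s modulus.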

Figure 3

Impact of some prominent features on Young’s modulus. (a) Relation between Young’s modulus and VEC for training set and (b) for experimental validation set. (c) Relation between melting temperature and Young’s modulus for training set and (d) for experimental validation set. (e) Relation between the difference of atomic radii and Young’s modulus for training set and (f) for experimental validation set.

It is important to mention that Roy et al.34 predicted Young’s modulus of low, medium and high entropy alloys composed of 5 elements by employing Gradient Boosting method and found that average melting temperature (Tm) was the most important feature without considering the impact of VEC. Corresponding MAE for their study was 23.59 GPa. In this study, we achieved significantly better performance (MAE = 6.15 GPa) by considering VEC in the feature sets. From the above discussion, we propose that VEC is the most important feature that determines the Young’s modulus of this refractory alloy system. Therefore, it is essential to include VEC as a key parameter in the design of new CCAs with tailored Young’s modulus.

Experimental validation

Finally, we used the trained Gradient Boosting model to predict the Young's modulus of unseen CCAs, namely the 32 experimentally synthesized CCAs mostly composed of Mo–Ta–Ti–W–Zr elements. As the experimental validation alloys are all refractory alloys, we examined how the type of training set affects the prediction of Young’s modulus. When we trained the Gradient Boosting model with the larger data set containing both refractory and non-refractory alloys, the predictions of the Young’s modulus were significantly off compared to the experimentally measured Young’s modulus, as presented in Fig. 4a. The predicted value consistently underestimated the experimental value. In contrast, we achieved excellent predictions when we used only the refractory alloys to train the Gradient Boosting model, as presented in Fig. 4b. Only 2 predictions (alloy numbers 6 and 8) out of 32 alloys are outside the 68.3% confidence interval (± σ, where σ is the standard deviation of each prediction). Table 3 presents the experimental Young’s modulus and the mean predicted Young’s modulus, with the percentage error and standard deviation, when the model was trained with refractory alloys. 26 of the alloys had errors ≤ 5%, and a few of the predictions are almost identical to the experimental values.
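The ± σ comparison relies on a bootstrap estimate of the prediction spread. A minimal sketch of that idea is given below, again with a one-feature least-squares line standing in for the Gradient Boosting model; the resampling of the training set, the helper names and all numbers are assumptions made for illustration.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares line; a stand-in for the trained regressor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return lambda x: slope * (x - mean_x) + mean_y


def bootstrap_prediction(xs, ys, x_new, n_resamples=100, seed=0):
    """Mean and standard deviation of predictions over resampled training sets."""
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    while len(predictions) < n_resamples:
        picks = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        sample_x = [xs[i] for i in picks]
        if len(set(sample_x)) < 2:
            continue  # a line cannot be fitted to a single distinct x value
        model = fit_line(sample_x, [ys[i] for i in picks])
        predictions.append(model(x_new))
    return statistics.mean(predictions), statistics.stdev(predictions)


def percent_error(predicted, measured):
    return abs(predicted - measured) / measured * 100.0


def within_one_sigma(mean, sigma, measured):
    """True if the measured value lies inside mean +/- sigma."""
    return abs(measured - mean) <= sigma


# Synthetic training data and one hypothetical validation alloy.
xs = [4.0 + 0.1 * i for i in range(20)]
ys = [50.0 * x - 100.0 + (3.0 if i % 2 == 0 else -3.0) for i, x in enumerate(xs)]
mean_pred, sigma = bootstrap_prediction(xs, ys, x_new=5.0)
measured = 151.0  # assumed experimental value in GPa
print(f"prediction = {mean_pred:.1f} +/- {sigma:.1f} GPa, "
      f"error = {percent_error(mean_pred, measured):.1f}%, "
      f"inside +/- sigma: {within_one_sigma(mean_pred, sigma, measured)}")
```

A prediction that falls outside ± σ is not necessarily wrong; for normally distributed errors roughly one in three would be expected to do so.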

Figure 4

Young's Modulus Prediction by Gradient Boosting model when trained (a) with data containing both refractory and non-refractory alloys and (b) with only refractory alloys.

Table 3 Predicted Young's modulus with percentage of error and standard deviation from Gradient Boosting model trained with data containing refractory alloys.

From Fig. 4 and Table 3 we conclude that the quality of the training data is very important to predict the target property accurately. We have a larger training set (154 alloys) with refractory and non-refractory alloys. On the other hand, we have a smaller training set (96 alloys) only with refractory alloys. Since the training set was more homogeneous for the smaller data set, we achieved better predictions. Moreover, the predicted Young’s modulus followed the trend with the experimental Young’s modulus with some exceptions as presented in Fig. 4b. Therefore, it is not only the size of the training data but also the quality and relevance of the training data that are important for better predictions.

Conclusion

We have presented an approach that uses ML with high throughput experimental synthesis and mechanical testing of alloys to predict the Young’s modulus of CCAs reliably. We conclude that among the eight ML models we used, Gradient Boosting had the best predictive strength. The prediction of Young’s modulus was influenced by the model chosen and by the composition of training data. Our experimental validation set was composed of refractory alloys, and when the models were trained with data containing only refractory alloys, the predictions were closer to the experimental values. This shows that when training ML models to predict characteristics of alloys, it is advantageous to include alloys of similar composition in the training data set. The valence electron concentration is the most important feature governing the Young’s modulus of refractory CCAs and can be used to rapidly screen alloys. Since feature importance also appears to be influenced by the choice of training data set, it is important to choose carefully the training data set based on the type of alloy being studied and validate against high-quality experimental data of known provenance. The integration of experimental synthesis and testing, machine learning, and physics-based interpretation demonstrated in this work holds considerable promise for alloy design and property prediction.