Coupling physics in machine learning to predict properties of high-temperature alloys


High-temperature alloy design requires concurrent consideration of multiple mechanisms at different length scales. We propose a workflow that couples highly relevant physics into machine learning (ML) to predict the properties of complex high-temperature alloys, using the yield strength of 9–12 wt% Cr steels as an example. We incorporated synthetic alloy features that capture microstructure and phase transformations into the dataset. The high-impact features affecting the yield strength of 9Cr steels, identified by correlation analysis, agree well with the generally accepted strengthening mechanisms. As part of the verification process, the consistency of sub-datasets was extensively evaluated with respect to temperature and then refined to set the boundary conditions of the trained ML models. The yield strength of 9Cr steels predicted by the ML models is in excellent agreement with experiments. The current approach introduces physically meaningful constraints for interrogating trained ML models to predict the properties of hypothetical alloys in data-driven materials design.


Material design assisted by data analytics is an emerging area of materials science and engineering that offers a reduction in cost, risk, and time over traditional material development approaches based solely on experimental investigations and/or physics-based simulations1,2,3,4,5. Due to their complexity (i.e., chemistry, melt process, thermo-mechanical process, heat treatment, and the resulting developed microstructure), the rational design of high-temperature alloys by machine learning (ML) requires a comprehensive dataset that can cover various aspects: multi-component, multi-phase, multi-physics, multi-scale, and multiple strengthening mechanisms, as well as the significant influence of processing conditions on the properties of the final products.

The majority of previous efforts applying ML to predict the properties of high-temperature alloys have used alloy compositions and simple processing conditions as features6,7,8,9,10,11,12,13. While these approaches can leverage experimental data accumulated over decades, extrapolating (and even interpolating) these models outside the range of the input data is risky due to the absence of physical constraints. There have been attempts to incorporate atomistic-level features, e.g., atomic radius/volume, electronegativities, cohesive energy, and local electronegativity mismatch, for predicting high-temperature alloy properties14,15,16,17, but features related to phenomena/mechanisms occurring at larger length scales (i.e., micro- and meso-scale) may have a greater impact on the properties of these alloys.

For high-temperature alloy design, physical information, such as microstructure, is essential for representing process–structure–property correlation18,19,20,21,22. Pioneering work by Zhao and Henry showed that the performance of a regression model for predicting the rupture time of Ni-based alloys could be significantly improved by incorporating the equilibrium volume fraction of the γ′ phase21. Recently, further advancement in this area was made by establishing a data analytics workflow by integrating microstructure-related synthetic features via the CALPHAD approach to predict the creep strength of alumina-forming austenitic stainless steels23 and high-strength stainless steel24.

However, for many material systems (high-temperature alloys in particular), microstructure-related synthetic features from CALPHAD are often not enough since the microstructure changes over time and the strengthening mechanisms evolve with applied stress and temperature. Consider the case of 9–12 wt% Cr martensitic–ferritic steels (hereafter referred to as 9Cr steel) as an example. This class of alloy consists of a tempered martensitic microstructure, where temperature plays a critical role with respect to strengthening mechanisms25. Fine prior austenite grain/packet/lath structures and the dislocation density largely control the room-temperature strength. With increasing temperature, up to around 600–650 °C, second-phase precipitates, i.e., M23C6 (M = Fe, Cr, and Mn), MX (M: mainly V, X: C and N), and even the Laves phase, within the sub-grain interior or along the sub-grain boundaries, play an important role in strengthening. Above 700 °C, microstructural instability, such as rapid precipitate coarsening, recovery, and/or recrystallization, leads to a significant loss in mechanical strength.

Thus, relevant features such as phase transformation temperatures, e.g., the A3 temperature (the temperature at which the transformation of ferrite to austenite is completed during heating) and the martensite start temperature (Ms), should be considered in addition to microstructure information. These phase transformation temperatures are directly correlated with martensitic microstructure evolution and other microstructural features (e.g., prior austenite grain size (PAGS), packet/lath sizes, and dislocation density in the as-normalized condition26,27), and consequently influence both the initial mechanical properties and the long-term microstructural stability.

Herein, we demonstrate a workflow of coupling highly relevant physics into ML models for predicting properties of multi-phase and multi-component high-temperature alloys. A yield strength dataset of 9Cr martensitic–ferritic steels is selected to elucidate this strategy. Figure 1 illustrates the structure of the yield strength dataset of 9Cr steel used in this study. The computed synthetic alloy features, along with raw experimental data, are listed in Table 1. The correlation between these features in the dataset and the 9Cr yield strength was quantitatively determined and compared with generally accepted mechanisms by the community. We evaluated the performance of representative ML models, i.e., linear regression (LR)28, Bayesian ridge (BR)29,30, k-nearest neighbor (NN)31, random forest (RF)32, and support vector machines (SVM)33. Additional work was also carried out to assess the performance of ML models on predicting the PAGS of the 9Cr steel since PAGS is an essential input for calculating Ms of 9Cr steels.

Fig. 1: Features in 9Cr dataset.

Alloy features considered in the 9Cr data analytics. Groups 1/2 are raw experimental data, and groups 3/4 are computed synthetic alloy features. This dataset covers data from room temperature to 800 °C, and the values in parentheses next to temperatures represent the number of data points at each temperature. Complete details of this dataset are reported in Table 1.

Table 1 List of alloy features considered in this work to predict 0.2% yield strength (MPa) of 9–12 wt% Cr steels.

Results and discussion

We started with the 9Cr dataset containing only raw experimental data (i.e., elemental alloy compositions, processing and testing conditions, and PAGS—groups 1 and 2 in Fig. 1) to train five different ML models. Figure 2 shows the average accuracy of these models and their standard deviation from ten training runs as a function of the number of top-ranking features from Pearson’s correlation coefficient (PCC)34 and maximal information coefficient (MIC)35 analyses. Overall, the RF, NN, and SVM models exhibit high accuracy (R2 > 0.9) regardless of the number of top-ranking features. More specifically, RF was the most accurate (always higher than 0.95), followed by SVM. Nevertheless, the applicability of these models for alloy design is questionable since PAGS is the only physically measured, microstructure-related feature involved in the ML training. Other relevant, physically meaningful features, such as the volume fractions of key phases and phase transformation temperatures, are required to properly represent the process–structure–property relationship and serve as physical constraints in ML.
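The comparison of the five regressor families on the top-ranked features can be sketched with scikit-learn. This is a minimal illustration, not the authors' code: the data are synthetic stand-ins for the 9Cr dataset, and features are ranked here by |PCC| as described in the text.

```python
# Sketch: rank features by |PCC|, then compare the five model families
# on the top-ranking features via cross-validated R^2.
# Synthetic stand-in data -- NOT the actual 9Cr dataset.
import numpy as np
from sklearn.linear_model import LinearRegression, BayesianRidge
from sklearn.neighbors import KNeighborsRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 15))          # 15 candidate features
y = 2 * X[:, 0] - X[:, 1] + 0.5 * X[:, 2] + 0.1 * rng.normal(size=200)

# Rank features by the absolute Pearson correlation with the target.
pcc = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
top_k = np.argsort(pcc)[::-1][:10]      # keep the 10 top-ranking features

models = {
    "LR": LinearRegression(),
    "BR": BayesianRidge(),
    "NN": KNeighborsRegressor(),
    "RF": RandomForestRegressor(random_state=0),
    "SVM": SVR(),
}
scores = {name: cross_val_score(m, X[:, top_k], y, cv=5, scoring="r2").mean()
          for name, m in models.items()}
print(scores)
```

On real data, the hyperparameter tuning and the ten repeated training runs described in the Methods would wrap around this loop.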

Fig. 2: Machine learning accuracy based on correlation analysis.

Accuracy of five trained ML models (BR Bayesian ridge regression; LR linear regression; NN nearest neighbor; RF random forest; SVM support vector machines) with raw experimental data (compositions, processing, and testing conditions, and PAGS) as a function of the number of top-ranking features in the entire 9Cr dataset. The hyperparameters of each model were tuned up to 150 iterations to obtain optimum parameters. Each model was trained ten times to determine the average accuracy and its standard deviation.

Analyses of temperature-based sub-datasets

Given the lack of physically measurable microstructure features in the 9Cr dataset, the raw experimental data were augmented with synthetically derived features, i.e., groups 3 and 4 in Fig. 1 (see Table 1), from high-throughput CALPHAD calculations. Since the primary strengthening mechanisms of 9Cr steel are temperature dependent, it was essential to carefully examine whether the present dataset is capable of representing the temperature-dependent strengthening mechanisms. Thus, we divided the 9Cr dataset into several sub-datasets based on testing temperature and performed correlation analysis on each sub-dataset.
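The per-temperature split and the per-subset correlation analysis can be sketched with pandas. The DataFrame, column names (`TTTemp`, `wNi`, `Ms`, `YS`), and the toy target below are illustrative assumptions, not the dataset's actual schema.

```python
# Sketch: split the dataset into temperature-based sub-datasets and run
# one correlation analysis per subset. Toy data with assumed column names.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 160
df = pd.DataFrame({
    "TTTemp": rng.choice([200, 550, 650, 750], size=n),  # test temperature
    "wNi": rng.uniform(0.1, 1.0, size=n),                # wt% Ni
    "Ms": rng.uniform(300, 400, size=n),                 # martensite start, C
})
# Toy target: strength rises with Ni, falls with Ms and test temperature.
df["YS"] = 1200 + 150 * df["wNi"] - 2 * df["Ms"] - 0.6 * df["TTTemp"] \
    + rng.normal(0, 5, size=n)

# One PCC analysis per temperature-based sub-dataset.
corr_by_T = {}
for temp, sub in df.groupby("TTTemp"):
    corr_by_T[temp] = sub.drop(columns=["TTTemp", "YS"]).corrwith(sub["YS"])
    print(temp, corr_by_T[temp].round(2).to_dict())
```

With the toy trends above, Ni correlates positively and Ms negatively with yield strength in every subset, mirroring the behavior discussed for the real data below 650 °C.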

The top 10 and bottom 10 features from the PCC analysis were evaluated at representative temperatures, i.e., 200 °C (low temperature), 550 and 650 °C (medium to high temperatures), and 750 °C (above service temperature). These results are presented in Fig. 3. Note that the closer the absolute value of the correlation coefficient is to 1, the stronger the correlation between the feature and the yield strength. The features identified with either a positive or negative correlation with yield strength at 200, 550, and 650 °C were consistent and mostly in good agreement with generally accepted strengthening factors/mechanisms in 9Cr steel. For example, Ni content exhibited a strong positive correlation with yield strength, i.e., the higher the Ni content, the higher the yield strength. This is in accordance with the practice of adding Ni to 9Cr steel to stabilize austenite at high temperatures and lower the martensitic transformation temperatures, which consequently increases the hardenability in the normalization process. These effects generally increase the yield strength of martensitic–ferritic steels, including the 9Cr family27. This result is also logically supported by the present correlation analysis, which shows a strong negative correlation between the Ms temperature and the yield strength.

Fig. 3: Correlation analysis of subset data at different temperatures.

Results of correlation analysis between all features (composition, processing and test conditions, microstructure, and phase transformation temperature, see Table 1) and yield strength at selected representative temperatures. The correlation coefficients of the top 10 and bottom 10 features out of 45 features from 200 °C (low temperature), 550 and 650 °C (medium to high temperatures), and 750 °C (above service temperature) are presented.

The M23C6 phase also plays an important role in strengthening the 9Cr steel from the precipitate strengthening perspective and stabilizes the tempered martensite microstructure, especially at elevated temperatures36. A higher volume fraction of M23C6 leads to higher yield strength. Thus, it is reasonable that the volume fraction of M23C6 has one of the strongest positive correlations with yield strength. The elements V and N facilitate the formation of strengthening MX precipitates during tempering, which also assists in increasing yield strength by impeding dislocation motion during deformation and stabilizing the sub-grain structure. Co is also an austenite stabilizer that suppresses δ-ferrite formation during the normalizing heat treatment step. Ms and microstructure-related features (e.g., volume fractions of M23C6, hcp, and fcc phases) from our high-throughput calculation are highly impactful features, critical to obtaining high-fidelity surrogate ML models. This finding is also applicable to the other sub-datasets up to 650 °C (see Supplementary Table 1).

For the sub-datasets above 650 °C (e.g., 750 °C in Fig. 3), the correlation coefficients are smaller than those at low temperatures, indicating a weaker response between alloy features and yield strength. In addition, the feature ranking at 750 °C is counterintuitive and very different from the trends below 650 °C. For instance, Ms has a negative correlation below 650 °C but shows a positive one at 750 °C. Features wC, wCr, wW, and PAGS, which should contribute positively to yield strength, are identified as having a negative impact at 750 °C. The MIC analysis shows a similar trend (see Supplementary Fig. 1).

The correlation between alloy features and yield strength at 750 °C is much weaker than at lower temperatures. Typical high-impact features, such as Temper 1, wV, wNb, wNi, wC, and T2_VPV_M23C6, are correctly identified at 200, 550, and 650 °C, while at 750 °C their rankings are counterintuitive. The present findings may be put into context by realizing that (1) the number of data points at >650 °C is insufficient for correctly representing the effects of certain features on yield strength, and (2) the microstructural changes during exposure at high temperatures are significant and may cause variations in yield strength attributable to factors not considered in the present dataset (e.g., the heating rate and/or the holding time before tensile testing at temperature).

We then trained five ML models (BR, LR, RF, NN, and SVM) on these sub-datasets at each temperature. Since these sub-datasets contain at most 44 data points, we limited the number of top-ranking features used in ML to 10 to avoid overfitting. The top 10 features of each sub-dataset from the correlation analysis are summarized in the Supplementary information (Supplementary Table 1). As an example, Fig. 4 shows the accuracy of the RF model trained with various numbers of top-ranking features as a function of the temperature-based sub-datasets. The results for the entire 9Cr dataset (“All”) are also included for comparison. As shown in Fig. 4a, the accuracy of ML models trained with sub-datasets is always lower than that of the model trained on the entire 9Cr dataset (i.e., “All”), which can be attributed to the smaller data volume of the sub-datasets. The performance of RF trained with top-ranking features from MIC does not improve with more features; the top 4 features already yield the maximal accuracy, showing that these features are sufficient to fit the RF model well. In contrast, the top 8 features from the PCC analysis are required to reach maximal accuracy (Fig. 4b). In both cases, the maximum accuracy is always >0.8 from room temperature (RT) to 600 °C regardless of whether the features are ranked by MIC or PCC. Above 600 °C, the accuracy decreases monotonically, in accordance with the decreasing data volume (see Fig. 1). Since the ranking of features at 650 °C is reasonable (see Fig. 3), the lower accuracy at 650 °C may be attributed to its slightly smaller data volume compared with the lower-temperature sub-datasets.
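The feature-count sweep on a small sub-dataset can be sketched as follows. The 44-row synthetic data mirror the maximum sub-dataset size mentioned above; the specific feature counts and coefficients are illustrative assumptions.

```python
# Sketch: cross-validated R^2 of a random forest as the number of
# top-ranked (|PCC|) features grows, on a small synthetic sub-dataset.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(44, 20))           # 44 rows, as in the largest subset
y = 3 * X[:, 0] + 2 * X[:, 1] + X[:, 2] + 0.2 * rng.normal(size=44)

# Rank features by |PCC| with the target.
ranking = np.argsort([abs(np.corrcoef(X[:, j], y)[0, 1])
                      for j in range(X.shape[1])])[::-1]

r2_vs_k = {}
for k in (2, 4, 8, 10):                 # capped at 10 to limit overfitting
    cols = ranking[:k]
    r2_vs_k[k] = cross_val_score(RandomForestRegressor(random_state=0),
                                 X[:, cols], y, cv=5, scoring="r2").mean()
print(r2_vs_k)
```

Plotting `r2_vs_k` for each temperature subset reproduces the kind of accuracy-versus-feature-count curves shown in Fig. 4.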

Fig. 4: ML performance of respective temperature sub-datasets.

Accuracy of trained ML models (random forest as an example) as a function of temperature and the number of top-ranking features from the a MIC and b PCC analyses. The models were trained ten times to determine the average accuracy and its standard deviation (error bar). The hyperparameters of each model were tuned up to 150 iterations to obtain optimum parameters. The vertical dashed line indicates where issues of insufficient data and missing relevant features start to occur.

For the sub-datasets above 650 °C, no matter how many top-ranking features are used in the ML models, the accuracy (R2) is always below 0.6. This observation again confirms that the data at >650 °C are insufficient and that the features in the present 9Cr dataset cannot represent the microstructural instability at high temperatures. Including the data at >650 °C could therefore mislead the training of the ML models and, consequently, result in incorrect predictions. For this reason, data above 650 °C were removed, resulting in the truncated (≤650 °C) 9Cr dataset used for the subsequent ML modeling.

Truncated (≤650 °C) dataset

Figure 5 and Table 2 summarize the results of the correlation analysis for the truncated dataset. Many of the physically meaningful features (i.e., volume fractions of phases and Ms) that we added to the raw 9Cr yield strength dataset have high correlation coefficients. These high-impact features from both the PCC and MIC analyses are in good agreement with the generally accepted strengthening mechanisms, indicating that the features collected in the truncated dataset capture the strengthening mechanisms of 9Cr steel well in the given temperature range. This dataset includes the tensile testing temperature (TTTemp), which allows the ML models to capture the temperature dependence of the yield strength. TTTemp possesses a strong negative correlation with yield strength, consistent with the experimental observation that the higher the test temperature, the lower the yield strength.

Fig. 5: Correlation analysis between all alloy features and yield strength in the truncated (≤650 °C) dataset.

a Top 20 ranking features from MIC and the corresponding features from |PCC|, and b top 10 and bottom 10 features from the PCC analysis. The ranking of each feature from both analyses is reported in Table 2.

Table 2 Top 20 features from the correlation analysis between alloy features (simple features plus synthetic features populated from the high-throughput calculation) and yield strength using the MIC and PCC methods for the truncated (≤650 °C) dataset.

There is a discrepancy between the results of the MIC and PCC analyses; for example, MIC ranked wCo 1st (9th in PCC), while PCC ranked T2_VPV_M23C6 2nd (14th in MIC). This is attributed to the different algorithms used to assign the strength of correlation: PCC only evaluates the strength of a linear relationship, whereas MIC has an advantage when there is a non-linear correlation between an input feature and the target property. A detailed comparison of MIC and PCC analyses for different data structures is available in ref. 37. It should be emphasized that the purpose of performing both MIC and PCC analyses in this study is not to rank one method over the other. Correlation analysis is a topic of its own, aiming to quantify the statistical relationship between two variables. It is also a category of feature selection that facilitates the choice of the most relevant input features for ML23. The intent here is also to demonstrate that correlation analysis is necessary to validate whether the underlying mechanisms have been efficiently captured by quantitatively evaluating the scores of the features considered. It can also be used to evaluate the consistency of a materials dataset. The results of different correlation analyses can be further analyzed to inspire alloy design experts to generate alloy hypotheses.
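The advantage of a non-linear measure over PCC can be illustrated on a toy quadratic relationship. Since MIC requires a dedicated package (e.g., minepy), scikit-learn's mutual information estimator is used below as a stand-in for MIC; the point is only that a linear coefficient can miss a real dependence that a non-linear measure captures.

```python
# Illustration: for y = x^2 on a symmetric interval, the Pearson
# coefficient is near zero while a non-linear dependence measure is large.
# mutual_info_regression stands in for MIC here (assumption, not the
# paper's actual MIC implementation).
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(3)
x = rng.uniform(-1, 1, size=500)
y = x ** 2 + 0.01 * rng.normal(size=500)   # strong, purely non-linear relation

pcc = np.corrcoef(x, y)[0, 1]                                 # near 0
mi = mutual_info_regression(x.reshape(-1, 1), y, random_state=0)[0]  # large
print(f"PCC = {pcc:.3f}, MI = {mi:.3f}")
```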

Five ML models (i.e., BR, LR, RF, NN, and SVM) were trained using the truncated dataset. The results are shown in Fig. 6. As before, the number of top-ranking features based on the MIC and PCC analyses was varied to train these models. The accuracies of the models using the top-ranking features from the MIC and PCC analyses show similar trends. For example, increasing the number of top-ranking features from 5 to 10 for PCC, and from 5 to 15 for MIC, increased the accuracy of these models significantly. Beyond these feature counts, the accuracy of the BR, LR, RF, and SVM models was almost constant, while the NN model showed a monotonic decrease in accuracy. For the models utilized, it was necessary to include at least the top 10 features for PCC and the top 15 features for MIC to obtain good accuracy.

Fig. 6: ML performance as a function of the number of top-ranking features.

Accuracy of five trained ML models (BR Bayesian ridge regression; LR linear regression; NN nearest neighbor; RF random forest; SVM support vector machines) in predicting yield strength. These models were trained with synthetic features populated from the high-throughput calculation as a function of the number of top-ranking features in the truncated (≤650 °C) dataset. The hyperparameters of each model were tuned up to 150 iterations to obtain optimum parameters. Each model was trained ten times to determine the average accuracy and its standard deviation (error bar).

Regardless of the type and number of features used from the PCC and MIC analyses, the accuracy of the trained models in predicting yield strength was, in order: RF > SVM > NN > BR ≈ LR. More specifically, RF, NN, and SVM exhibited very high accuracy (R2 > 0.9), while the maximum accuracy of the LR and BR models was ~0.85. As an example, Fig. 7 shows the yield strength predicted by the RF model, which is in excellent agreement with the experimentally determined yield strength. Although the accuracy of the ML models trained with the dataset augmented by synthetic features is similar to that of the models trained only with raw experimental data (see Fig. 2), the fidelity of the LR, BR, and SVM models is notably enhanced because the incorporated synthetic features proved to be highly correlated with the yield strength of 9Cr steel. Moreover, the ML models still achieve very high accuracy even though the truncated dataset contains ~10% less data than the initial 9Cr dataset, mainly because the inconsistent data above 650 °C were eliminated. As such, we believe that the trained ML models described in this section are more accurate and provide more realistic predictions.

Fig. 7: Experimental vs. predicted yield strength of the 9Cr steel with random forest (RF) with the top 10 features from MIC and PCC analyses.

MAE stands for mean absolute error.

The high-fidelity surrogate models obtained in this work allow prediction of the yield strength of hypothetical 9Cr alloys. However, additional work on predicting PAGS is required, as it was used as an input feature to predict the yield strength. Among the features in groups 1 and 2 (see Fig. 1), PAGS is unique: it depends on various details of the composition and processing conditions, yet it can only be obtained by physical inspection, i.e., metallography. The PAGS is also an essential input for predicting Ms38, which was previously identified as a highly relevant feature for yield strength and served as an important constraint in training high-fidelity surrogate models. Thus, following the workflow of the present study, surrogate models for PAGS were trained using the truncated dataset. The PAGS predicted by the NN, RF, and SVM models is in excellent agreement with the experimental data (see Supplementary Fig. 2 in the Supplementary materials). As an example, a comparison between the experimental and predicted PAGS of the 9Cr steel using the RF model is shown in Fig. 8. We attribute the outstanding performance of the trained ML models to the extremely high correlation between the input features and PAGS (see the correlation scores of high-ranking features in Supplementary Table 2). The average MIC score of the top 15 features is 0.933 ± 0.061, which is extremely high. The average PCC scores are not as high as those of MIC, but the average score of the top 10 is 0.660 ± 0.100, which can still be regarded as high. With this approach, the PAGS of any 9Cr steel alloy can be derived and used as an input to predict the yield strength via the data analytics approach demonstrated in the present study.

Fig. 8: Experimental vs. predicted PAGS of the 9Cr steel with random forest (RF) with the top 10 features from MIC and PCC analyses.

MAE stands for mean absolute error.

In summary, we have demonstrated a workflow that incorporates highly relevant physics into ML for predicting properties of complex heat-resistant alloys. Using a yield strength dataset of the 9–12 wt% Cr steels as an example, the approach has been described in detail. We augmented raw experimental data with key features that capture both the microstructure and the phase transformations of this class of alloy, i.e., the volume fractions of key phases and the A3 and martensite phase transformation temperatures. It is worth mentioning that the present features cannot capture the complex location- and size-specific microstructural details of the secondary phases that form in the 9Cr alloys. It would be ideal to incorporate such detailed microstructure-related information into the data analytics workflow; however, obtaining such a large volume of high-fidelity microstructural details for all the alloy chemistries and processing conditions would be extremely time- and cost-prohibitive.

We computed these synthetic features using high-fidelity thermodynamic models in a high-throughput manner. Critical evaluation of each temperature-based sub-dataset, including correlation analysis and ML training, showed that the data above 650 °C are insufficient for correctly capturing the significant factors related to the yield strength of 9Cr steel, owing to the relative lack of experimental data and relevant microstructure features. Thus, these data were removed from the 9Cr dataset, and correlation analysis of the truncated dataset showed that the high-ranking features were in good agreement with the generally accepted strengthening mechanisms.

We tested the performance of representative ML models, i.e., RF, SVM, NN, BR, and LR, as a function of the number of top-ranking features. From this exercise, the top 10 features from PCC and the top 15 features from MIC are necessary to obtain good accuracy for all models. Among the ML models tested, the RF and SVM ones exhibited very high accuracy (R2 > 0.95) for predicting 9Cr steel yield strength. In conclusion, this study demonstrated that high-fidelity surrogate models could be trained with highly relevant and physically meaningful features. Such physical constraints effectively prevent erroneously predicting properties of hypothetical candidate alloys when interrogating trained ML models in a data-driven materials design. We anticipate that the approach demonstrated in the present work can be further extended by integrating additional alloy physical/chemical features beyond what is achievable in this study.


Experimental dataset and synthetic alloy features via thermodynamic calculations

The raw experimental dataset was compiled by the National Energy Technology Laboratory8,9, USA, using the creep datasheet for high-Cr steel39 in the MatNavi materials database of the National Institute for Materials Science, Japan. The dataset consists of the compositions of 18 elements, processing and testing temperatures, and PAGS (converted from the austenite grain size number). The state-of-the-art steel and Fe-alloys database TCFE940 was used to compute the volume fractions of the phases and the A3 temperature for each steel composition by the CALPHAD approach41. A recently developed thermodynamic model38 (also implemented in the Thermo-Calc software package42,43) was adopted to calculate Ms temperatures. This analytical model, an extension of the models developed by Borgenstam and Hillert44 and Stormvinter et al.45, takes into account the thermodynamic driving force of the FCC–BCC phase transformation as the major contribution, as well as PAGS as a non-chemical contribution, to predict the Ms of a given 9Cr alloy. The raw experimental data were augmented with these synthetic features by high-throughput calculations using Thermo-Calc, resulting in a dataset with 451 instances/rows, 45 input features/columns, and one target (0.2% yield strength), spanning temperatures from RT to 800 °C.
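The augmentation step amounts to joining computed per-composition features onto the raw table. The sketch below uses a placeholder function in lieu of the actual Thermo-Calc/TCFE9 calculations; `fake_calphad_features`, its made-up trends, and the column names are purely illustrative assumptions.

```python
# Sketch of augmenting raw experimental rows with synthetic features.
# fake_calphad_features is a PLACEHOLDER standing in for CALPHAD
# calculations -- its numbers are not a thermodynamic model.
import pandas as pd

raw = pd.DataFrame({
    "alloy": ["A", "B"],
    "wCr": [9.0, 11.5],     # wt% Cr (illustrative)
    "wC": [0.10, 0.12],     # wt% C (illustrative)
    "PAGS": [25.0, 40.0],   # prior austenite grain size, um (illustrative)
})

def fake_calphad_features(row):
    # Placeholder trends only; real values would come from Thermo-Calc.
    return pd.Series({
        "T2_VPV_M23C6": 0.02 + 0.1 * row["wC"],  # volume-fraction proxy
        "A3": 850.0 + 2.0 * row["wCr"],          # deg C, proxy
        "Ms": 380.0 - 0.2 * row["PAGS"],         # deg C, proxy
    })

augmented = pd.concat([raw, raw.apply(fake_calphad_features, axis=1)], axis=1)
print(augmented.columns.tolist())
```

In the actual workflow, each of the 451 rows would receive its 45 input features this way before ML training.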

Correlation analysis

The necessity of correlation analysis in materials data analytics is threefold: (1) to validate whether high-ranking features are consistent with generally accepted mechanisms; (2) to provide a numerical/statistical basis for the selection of input features in the training of ML models; and (3) to facilitate the generation of alloy hypotheses by identifying overlooked/hidden features in previous work. The correlation between the input features and the target was represented by PCC34 and MIC35. While PCC only evaluates the strength of a linear relationship, MIC identifies the strength of both linear and non-linear relationships. The correlation coefficient of PCC lies between −1 and 1, where 1 indicates a total positive linear correlation, −1 indicates a total negative linear correlation, and 0 indicates no linear correlation. The closer the coefficient is to 1 or −1, the stronger the correlation between the two variables. The correlation coefficient of MIC ranges between 0 and 1; the closer the coefficient is to 1, the stronger the correlation.
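The PCC definition above translates directly to code; a minimal implementation, checked at the stated bounds:

```python
# Minimal Pearson correlation coefficient from its definition,
# verified on perfectly correlated and anti-correlated data.
import numpy as np

def pearson(x, y):
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return (xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc))

x = np.array([1.0, 2.0, 3.0, 4.0])
print(pearson(x, 2 * x + 1))   # 1.0: total positive linear correlation
print(pearson(x, -3 * x))      # -1.0: total negative linear correlation
```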

Machine learning

The performance of five representative ML models was studied: (1) LR28, (2) BR29,30, (3) k-NN31, (4) RF32, and (5) SVM33. Different numbers of top-ranking features, based on the rankings from MIC and |PCC| (i.e., the absolute value of the PCC correlation coefficient), were used to train the ML models and evaluate their performance. The hyperparameters of each model were tuned using up to 150 iterations to identify the optimum parameters. Each model was trained ten times for a given set of features to determine the average accuracy and its standard deviation. The ranking from the correlation analysis does not assign any hierarchical factor to the features, i.e., all features have the same weight in ML training regardless of their ranking. The coefficient of determination (R2) was adopted to represent the accuracy of the ML models. The correlation analysis and ML were performed using the open-source data analytics frontend Advanced data SCiEnce toolkit for Non-Data Scientists (ASCENDS)46,47, which is available via GitHub.

Depending on the flexibility of the ML models, overfitting or underfitting the data is possible. The k-fold approach48 with k = 5 was used for the ML training: four groups were used to train the ML model, and the one remaining group (i.e., unseen data) was withheld during training and later used as validation data to evaluate the accuracy of the models. We then trained the same ML model (i.e., the same feature set for a given ML algorithm) ten times to obtain statistics for uncertainty quantification. This procedure ensured that the fitting of the ML models to the data was balanced.
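The repeated 5-fold procedure can be sketched with scikit-learn as follows; the data and model settings are illustrative, not those of the paper.

```python
# Sketch of the uncertainty-quantification loop: 5-fold cross-validation
# repeated ten times with reshuffled splits, giving a mean R^2 and its
# standard deviation for one model/feature set. Synthetic data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(4)
X = rng.normal(size=(150, 8))
y = X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=150)

runs = []
for seed in range(10):                           # ten training runs
    cv = KFold(n_splits=5, shuffle=True, random_state=seed)
    model = RandomForestRegressor(random_state=seed)
    runs.append(cross_val_score(model, X, y, cv=cv, scoring="r2").mean())

print(f"R2 = {np.mean(runs):.3f} +/- {np.std(runs):.3f}")
```

The spread across the ten runs is what appears as the error bars in Figs. 2, 4, and 6.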

Data availability

The data that support the findings of this study are available from the corresponding authors upon reasonable request.


References

  1. Ramprasad, R., Batra, R., Pilania, G., Mannodi-Kanakkithodi, A. & Kim, C. Machine learning in materials informatics: recent applications and prospects. npj Comput. Mater. 3, 54 (2017).

  2. Ramakrishna, S. et al. Materials informatics. J. Intell. Manuf. 30, 2307–2326 (2019).

  3. Bock, F. E. et al. A review of the application of machine learning and data mining approaches in continuum materials mechanics. Front. Mater. 6, 110 (2019).

  4. Alberi, K. et al. The 2019 materials by design roadmap. J. Phys. D: Appl. Phys. 52, 013001 (2018).

  5. Rajan, K. Materials informatics. Mater. Today 8, 38–45 (2005).

  6. Sourmail, T., Bhadeshia, H. K. D. H. & MacKay, D. J. C. Neural network model of creep strength of austenitic stainless steels. Mater. Sci. Technol. 18, 655–663 (2002).

  7. Agrawal, A. et al. Exploration of data science techniques to predict fatigue strength of steel from composition and processing parameters. Integr. Mater. Manuf. Innov. 3, 90–108 (2014).

  8. Verma, A. K. et al. Mapping multivariate influence of alloying elements on creep behavior for design of new martensitic steels. Metall. Mater. Trans. A 50, 3106–3120 (2019).

  9. Verma, A. K. et al. Screening of heritage data for improving toughness of creep-resistant martensitic steels. Mater. Sci. Eng. A 763, 138142 (2019).

  10. Zhang, M. et al. High cycle fatigue life prediction of laser additive manufactured stainless steel: a machine learning approach. Int. J. Fatigue 128, 105194 (2019).

  11. Bhadeshia, H. K. D. H. & Sourmail, T. Design of creep-resistant steels: success & failure of models. Jpn. Soc. Promot. Sci. Comm. Heat.-Resist. Mater. Alloy. 44, 299–314 (2003).

  12. Dimitriu, R. C. & Bhadeshia, H. K. D. H. Hot strength of creep resistant ferritic steels and relationship to creep rupture data. Mater. Sci. Technol. 23, 1127–1131 (2007).

  13. Bhadeshia, H. K. D. H. Neural networks in materials science. ISIJ Int. 39, 966–979 (1999).

  14. Shin, D., Lee, S., Shyam, A. & Haynes, J. A. Petascale supercomputing to accelerate the design of high-temperature alloys. Sci. Technol. Adv. Mater. 18, 828–838 (2017).

  15. Wen, C. et al. Machine learning assisted design of high entropy alloys with desired property. Acta Mater. 170, 109–117 (2019).

  16. Huang, W., Martin, P. & Zhuang, H. L. Machine-learning phase prediction of high-entropy alloys. Acta Mater. 169, 225–236 (2019).

  17. Zhang, Y. et al. Phase prediction in high entropy alloys with a rational selection of materials descriptors and machine learning models. Acta Mater. 185, 528–539 (2020).

  18. Meredig, B. Five high-impact research areas in machine learning for materials science. Chem. Mater. 31, 9579–9581 (2019).

  19. Kalidindi, S. R. Data science and cyberinfrastructure: critical enablers for accelerated development of hierarchical materials. Int. Mater. Rev. 60, 150–168 (2015).

  20. Panchal, J. H., Kalidindi, S. R. & McDowell, D. L. Key computational modeling issues in integrated computational materials engineering. Comput. Aided Des. 45, 4–25 (2013).

  21. Zhao, J. C. & Henry, M. F. CALPHAD—is it ready for superalloy design? Adv. Eng. Mater. 4, 501–508 (2002).

  22. Kalidindi, S. R. & De Graef, M. Materials data science: current status and future outlook. Annu. Rev. Mater. Res. 45, 171–193 (2015).

  23. Shin, D., Yamamoto, Y., Brady, M. P., Lee, S. & Haynes, J. A. Modern data analytics approach to predict creep of high-temperature alloys. Acta Mater. 168, 321–330 (2019).

  24. Shen, C. et al. Physical metallurgy-guided machine learning and artificial intelligent design of ultrahigh-strength stainless steel. Acta Mater. 179, 201–214 (2019).

  25. Abe, F. in Proceedings of the Materials for Advanced Power Engineering, COST Conference, Liege, Belgium, September 18–20, 2020.

  26. Washko, S. & Aggen, G. ASM Handbook Volume 1, Properties and Selection: Irons, Steels, and High-Performance Alloys (ASM International, 1990).

  27. Dossett, J. L. & Totten, G. E. ASM Handbook, Volume 4D: Heat Treating of Irons and Steels, 382–396 (ASM International, 2014).

  28. Freedman, D. A. Statistical Models: Theory and Practice. 26 (Cambridge University Press, 2009).

  29. MacKay, D. J. Bayesian interpolation. Neural Comput. 4, 415–447 (1992).

  30. Tipping, M. E. Sparse Bayesian learning and the relevance vector machine. J. Mach. Learn. Res. 1, 211–244 (2001).

  31. Altman, N. S. An introduction to kernel and nearest-neighbor nonparametric regression. Am. Stat. 46, 175–185 (1992).

  32. Ho, T. K. The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Mach. Intell. 20, 832–844 (1998).

  33. Awad, M. & Khanna, R. Efficient Learning Machines: Theories, Concepts, and Applications for Engineers and System Designers (Apress, 2015).

  34. Sedgwick, P. Pearson’s correlation coefficient. BMJ 345, e4483 (2012).

  35. Reshef, D. N. et al. Detecting novel associations in large data sets. Science 334, 1518–1524 (2011).

  36. Abe, F. Strengthening mechanisms in steel for creep and creep rupture, in Creep-Resistant Steels (eds Kern, T. U., Abe, F. & Viswanathan, R.) 279–304 (Woodhead Publishing Series in Metals and Surface Engineering, 2008).

  37. Kortum, F., Karras, O., Klünder, J. & Schneider, K. in Proceedings of International Conference on Product-Focused Software Process Improvement, 725–740 (Springer, 2019).

  38. Hanumantharaju Gulapura, A. K. Thermodynamic Modelling of Martensite Start Temperature in Commercial Steels, Master thesis (KTH Royal Institute of Technology, 2018).

  39. National Research Institute for Metals, NIMS Materials Database (MatNavi), Creep Data Sheet, Category: High Cr Steels, Technical Reports 10B, 13B, 19B, 43A, 46A, 48B, 51A, 52A (NIMS, Japan, 1994–2018).

  40. Thermo-Calc Software AB, TCFE9: TCS Steel and Fe-alloys Database, 2019,

  41. Lukas, H. L., Fries, S. G. & Sundman, B. Computational Thermodynamics: The Calphad Method 131 (Cambridge University Press, Cambridge, 2007).

  42. Andersson, J.-O., Helander, T., Höglund, L., Shi, P. & Sundman, B. Thermo-Calc & DICTRA, computational tools for materials science. Calphad 26, 273–312 (2002).

  43. Sundman, B., Jansson, B. & Andersson, J.-O. The Thermo-Calc databank system. Calphad 9, 153–190 (1985).

  44. Borgenstam, A. & Hillert, M. Driving force for fcc → bcc martensites in Fe-X alloys. Acta Mater. 45, 2079–2091 (1997).

  45. Stormvinter, A., Borgenstam, A. & Ågren, J. Thermodynamically based prediction of the martensite start temperature for commercial steels. Metall. Mater. Trans. A 43, 3870–3879 (2012).

  46. Lee, S., Peng, J., Williams, A. & Shin, D. ASCENDS: advanced data SCiENce toolkit for non-data scientists. J. Open Source Softw. 5, 1656 (2020).

  47. Peng, J., Lee, S., Williams, A., Haynes, J. A. & Shin, D. Advanced data science toolkit for non-data scientists—a user guide. Calphad 68, 101733 (2020).

  48. James, G., Witten, D., Hastie, T. & Tibshirani, R. An Introduction to Statistical Learning. Vol. 112, 181 (Springer, 2013).



This research was sponsored by the US Department of Energy, Office of Fossil Energy, eXtreme environment MATerials (XMAT) consortium. This research used resources of the Compute and Data Environment for Science (CADES) at the Oak Ridge National Laboratory, which is supported by the Office of Science of the US Department of Energy under Contract No. DE-AC05-00OR22725. The authors thank YiYu Wang for valuable discussion and Chris Layton for his support on using CADES. This paper has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this paper, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (

Author information




D.S. conceived the study. J.A.H. provided the dataset. J.P. performed the correlation analysis and machine learning training. J.P., Y.Y., and D.S. analyzed the data. J.P. drafted the paper. Y.Y., J.A.H., E.L-C., and D.S. reviewed the paper.

Corresponding author

Correspondence to Dongwon Shin.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit


About this article


Cite this article

Peng, J., Yamamoto, Y., Hawk, J.A. et al. Coupling physics in machine learning to predict properties of high-temperatures alloys. npj Comput Mater 6, 141 (2020).


