Introduction

Thermal conductivity is a physical quantity that describes how readily heat is transferred from one side of a material to the other when thermal energy is applied. It is one of the most fundamental physical quantities. It is also important in applications, because it underpins the thermal management needed to ensure the performance, lifetime, and safety of thermoelectric energy conversion devices and spintronics technology.

Thermal conductivity can be divided into electron and lattice contributions. The electronic thermal conductivity can be determined from the electrical conductivity using the Wiedemann–Franz law. Half-Heusler compounds have a high lattice thermal conductivity due to their high crystal symmetry. Therefore, it is necessary to know the exact lattice thermal conductivity for thermal management in half-Heusler devices. Theoretical predictions of the lattice thermal conductivity of solids can be made using non-equilibrium molecular dynamics simulations1,2,3,4,5,6,7 or density functional theory (DFT) calculations8,9,10,11,12,13,14,15,16. In recent years, with the advances in machine learning (ML) algorithms, several studies on the heat transport properties of functional materials have been reported17,18,19,20,21. Predicting the thermal conductivity using non-equilibrium molecular dynamics requires an enormous amount of computational time, because the time evolution of a vast number of atomic movements must be calculated. On the other hand, because DFT calculations can accurately describe the interaction between atoms, the thermal conductivity can be predicted several hundred times faster. Even so, theoretical studies of the lattice thermal conductivity are presently limited to systems with a small number of atoms in the unit cell, such as simple pure metals8,11,16, binary materials9,13,14,15,16,22, and full and half-Heusler compounds12,23,24,25,26. It therefore remains difficult to perform comprehensive lattice thermal conductivity calculations for a large number of materials.

Cubic half-Heusler compounds have drawn significant attention in various fields as candidate materials for thermoelectric energy conversion27,28,29,30,31,32,33,34,35 and spintronics technology36,37,38,39,40. This is because they exhibit a variety of electronic structures, including those of semi-metals, semiconductors, and topological insulators. Carrete et al.41 and Liu et al.42 used ML to predict the lattice thermal conductivity of half-Heusler compounds obtained from DFT calculations. Carrete et al. reported that the lattice thermal conductivity of half-Heusler compounds can be predicted to within approximately 10% using the Young's modulus obtained from DFT calculations as the descriptor for ML. Liu et al. reported that the lattice thermal conductivity of half-Heusler compounds can be predicted to within approximately 10% using the atomic numbers, atomic masses, and atomic radii of the constituent atoms as the descriptors for ML.

The lattice thermal conductivity must be predicted with high accuracy in order to predict the performance of thermoelectric materials and to design the thermal management of electronic devices. Therefore, we have developed an ML algorithm that predicts the lattice parameter and lattice thermal conductivity of half-Heusler compounds from the atomic information of their constituent atoms (atomic radius and atomic mass) only. This algorithm can predict lattice thermal conductivity values with a deviation of less than 4% for many half-Heusler compounds. In this study, we found that the lattice thermal conductivity of half-Heusler compounds, which is difficult to predict comprehensively using DFT calculations, can be predicted with high accuracy by ML. This approach can contribute to the discovery of new materials for next-generation thermoelectric conversion, spintronics, and other technologies requiring thermal management.

Results and discussion

Figure 1 shows a flowchart of the ML procedure used to predict the lattice parameter and lattice thermal conductivity of half-Heusler compounds. The atomic radii, r1, r2, r3, and atomic masses, m1, m2, m3, of the elements at the 4c, 4a, and 4b sites in the C1b-type crystal structure were used as descriptors. The Python library pymatgen (Python Materials Genomics)43 was used to obtain the elemental information. Because the 4a and 4b sites are interchangeable, a second data set in which the elemental information of the 4a and 4b sites was swapped was also created. First, to build an ML model to predict the lattice parameters, the atomic radii and masses were used as descriptors. Second, to build an ML model to predict the calculated lattice thermal conductivity, parameters generated from various combinations of the atomic masses, atomic radii, and lattice parameters were used. The combinations of atomic radii and atomic masses used for the lattice thermal conductivity prediction are listed in Table S2 in the Supplementary materials section. Finally, to find the best combination of parameters, we used the wrapper method with backward feature elimination, which sequentially removes unimportant parameters and thereby builds an optimal ML model for predicting the lattice thermal conductivity. For both the lattice parameter and the lattice thermal conductivity, multiple linear regression and boosted decision tree regression were used as the ML models. The hyperparameters were optimized by random sweeps. Python 3.6 was used to implement the ML models. The data were split into training and test sets in an 80:20 ratio. We used fivefold cross validation on the data set, evaluated the coefficient of determination on the test data of each fold, and used the mean coefficient of determination, R2, to evaluate the ML model.
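The swapped data set described above can be illustrated with a minimal sketch. It assumes each sample is stored as a tuple (r1, r2, r3, m1, m2, m3, target), with index 1 for the 4c site, index 2 for the 4a site, and index 3 for the 4b site; the function name and the numerical values are hypothetical and are not taken from the study's data set or code.

```python
from typing import List, Tuple

# One sample: (r1, r2, r3, m1, m2, m3, target), where index 1 is the 4c site,
# index 2 the 4a site, and index 3 the 4b site, following the text.
Sample = Tuple[float, float, float, float, float, float, float]


def augment_with_site_swap(samples: List[Sample]) -> List[Sample]:
    """Return the original samples plus copies with the 4a and 4b sites swapped."""
    augmented = list(samples)
    for r1, r2, r3, m1, m2, m3, target in samples:
        # Swapping 4a and 4b exchanges (r2, r3) and (m2, m3); the target is unchanged.
        swapped = (r1, r3, r2, m1, m3, m2, target)
        # Skip exact duplicates of samples that are already present.
        if swapped not in augmented:
            augmented.append(swapped)
    return augmented


# Illustrative numbers only (assumed, not from the study's data set).
data = [(1.45, 1.40, 1.35, 118.71, 47.87, 58.69, 5.9)]
augmented = augment_with_site_swap(data)
```

If swapped copies of the same compound end up in both the training and the test folds, the cross-validated R2 may look better than it would for unseen compounds; splitting by compound before augmenting is one way to guard against this.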

Figure 1

Flowchart of the ML procedure used for (a) prediction of the lattice parameter and (b) prediction of the lattice thermal conductivity.

Figure 2a–c show the results of ML using multiple linear regression of the lattice parameters for various half-Heusler compounds obtained from the structural optimization by DFT calculations. Multiple linear regression predicts the lattice parameter with higher accuracy when both the atomic masses and the atomic radii are used as descriptors than when only the atomic radii are used. As described in Table S1, the stability of half-Heusler compounds is affected by the atomic radii of each site and the atomic masses. Therefore, besides the atomic radii, the atomic masses have an important influence on the lattice parameter. Multiple linear regression predicts the lattice parameter with an accuracy of approximately 5%, as shown in Fig. 2c. To determine the lattice parameter with better accuracy, we performed ML by boosted decision tree regression, as shown in Fig. 2d–f. Using the atomic radii and masses as descriptors, R2 improved to 0.979. As shown in Fig. 2f, the ML model reproduced the lattice parameter almost perfectly, with an accuracy of approximately ± 1%. Boosted decision tree regression using the atomic radii and masses of the atoms at the three sites of the C1b-type structure as descriptors was found to be the most suitable for predicting the lattice parameter by ML.

Figure 2

Comparison between the lattice parameters calculated by DFT and those predicted by: (a–c) the ML model of multiple linear regression and (d–f) the ML model of boosted decision tree regression. (a,d) and (b,e) are the results of ML using the atomic radii only (3 parameters) and the atomic masses and atomic radii (6 parameters) as descriptors, respectively. The regression equations determined by multiple linear regression are shown at the bottom of panels (a,b). (c,f) Frequency of deviations between the calculated and predicted lattice parameters.

Figure 3 shows the results of the predicted lattice thermal conductivity for various half-Heusler compounds using ML with multiple linear regression and boosted decision tree regression. As shown in Fig. 3a, the boosted decision tree regression model shows a higher coefficient of determination than the multiple linear regression model and reproduces the thermal conductivity of the half-Heusler compounds well. A simple ML model such as multiple linear regression is not suitable for the lattice thermal conductivity. This result suggests that the lattice thermal conductivity is determined by complex interactions among the descriptors, which the boosted decision tree regression can capture. Figure 3b shows the feature importance scores for the 55 parameters in ML with boosted decision tree regression. Among the 55 parameters, the top 4 make a significant contribution to the accuracy of ML. It is known that when as many as 55 parameters are used in ML, the prediction accuracy can be reduced by over-fitting. It is therefore necessary to find the best combination of parameters to improve the prediction accuracy of the lattice thermal conductivity. Figure 3c shows the results of the evaluation by the wrapper method using backward feature elimination. Backward feature elimination calculates the permutation feature importance of each parameter and sequentially removes the least important parameter to find the optimal combination of parameters. The R2 for the ML of thermal conductivity is highest, at 0.84, when the top four parameters are used, which is an improvement over the R2 obtained when all the parameters are considered. The best parameter combination for the prediction of the lattice thermal conductivity, in order of feature importance score, is as follows:

  • (Top 1) x55: lattice parameter.

  • (Top 2) x42: the difference between the mean atomic radius of the constituent elements and the atomic radius of the 4c site, (r1 + r2 + r3)/3 − r1.

  • (Top 3) x33: the difference between the mean atomic mass of the constituent elements and the atomic mass of the 4c site, (m1 + m2 + m3)/3 − m1.

  • (Top 4) x29: the sum of the atomic masses, m1 + m2 + m3.
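The four descriptors listed above follow directly from the atomic radii, atomic masses, and lattice parameter. The sketch below computes them from the definitions in the list; the function name, the argument order (4c, 4a, 4b), and the input values are assumptions made for illustration.

```python
from typing import Dict, Tuple


def top_four_descriptors(
    radii: Tuple[float, float, float],
    masses: Tuple[float, float, float],
    lattice_parameter: float,
) -> Dict[str, float]:
    """Compute x55, x42, x33, and x29 as defined in the list above.

    radii and masses are ordered (4c, 4a, 4b), i.e. (r1, r2, r3) and (m1, m2, m3).
    """
    r1, r2, r3 = radii
    m1, m2, m3 = masses
    return {
        "x55": lattice_parameter,          # lattice parameter
        "x42": (r1 + r2 + r3) / 3 - r1,    # mean radius minus 4c-site radius
        "x33": (m1 + m2 + m3) / 3 - m1,    # mean mass minus 4c-site mass
        "x29": m1 + m2 + m3,               # sum of the atomic masses
    }


# Illustrative inputs only (assumed values, not from the study's data set).
features = top_four_descriptors((1.5, 2.0, 2.5), (120.0, 40.0, 20.0), 0.7)
```

With these definitions, a 4c-site element heavier than the average of the three elements gives a negative x33 of large magnitude.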

Figure 3

(a) R2 for the ML models of multiple linear regression and boosted decision tree regression applied to the calculated lattice thermal conductivity. (b) Top 10 parameters ranked by permutation feature importance in the ML model of boosted decision tree regression. (c) Dependence of R2 on the number of features, evaluated by the wrapper method with backward feature elimination.
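The backward feature elimination evaluated in Fig. 3c can be sketched generically as follows. This is a minimal illustration, not the implementation used in the study: `score_fn` is assumed to return a cross-validated R2 for a model trained on a given feature subset, `importance_fn` is assumed to return the permutation importance of each feature in that subset, and the toy functions at the end are hypothetical stand-ins so that the sketch runs without an ML library.

```python
from typing import Callable, Dict, List, Sequence, Tuple

ScoreFn = Callable[[Sequence[str]], float]
ImportanceFn = Callable[[Sequence[str]], Dict[str, float]]


def backward_elimination(
    features: Sequence[str],
    score_fn: ScoreFn,
    importance_fn: ImportanceFn,
    min_features: int = 1,
) -> Tuple[List[str], float, List[Tuple[int, float]]]:
    """Drop the least important feature one at a time and keep the best subset."""
    current = list(features)
    best_subset, best_score = list(current), score_fn(current)
    history = [(len(current), best_score)]  # (number of features, score)
    while len(current) > min_features:
        importances = importance_fn(current)
        weakest = min(current, key=lambda name: importances[name])
        current.remove(weakest)
        score = score_fn(current)
        history.append((len(current), score))
        if score >= best_score:  # on ties, prefer the smaller subset
            best_subset, best_score = list(current), score
    return best_subset, best_score, history


# Toy stand-ins (assumed): "a" and "b" carry signal, noise features cost a penalty.
TOY_IMPORTANCE = {"a": 0.6, "b": 0.3, "noise1": 0.02, "noise2": 0.01}


def toy_importance(subset: Sequence[str]) -> Dict[str, float]:
    return {name: TOY_IMPORTANCE[name] for name in subset}


def toy_score(subset: Sequence[str]) -> float:
    signal = sum(TOY_IMPORTANCE[n] for n in subset if n in ("a", "b"))
    penalty = 0.05 * sum(1 for n in subset if n.startswith("noise"))
    return signal - penalty


subset, score, history = backward_elimination(
    list(TOY_IMPORTANCE), toy_score, toy_importance
)
```

The `history` list holds one (number of features, score) pair per step, which is the kind of curve plotted in Fig. 3c.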

Figure 4a,b show the lattice thermal conductivity predicted by the boosted decision tree regression model with 55 and 4 parameters, respectively. The predictions with 4 parameters are in better agreement with the DFT-calculated values than those with 55 parameters. Figure 4c shows the frequency distribution of the deviations between the predicted and calculated lattice thermal conductivities for the boosted decision tree regression model with 55 and 4 parameters. When 55 parameters were used, the lattice thermal conductivity was overestimated, with a deviation of approximately 8%. Conversely, when 4 parameters were used to describe the lattice thermal conductivity, the accuracy improved to approximately ± 4%. Carrete et al.41 and Liu et al.42 have previously reported that the accuracy of the lattice thermal conductivity of half-Heusler compounds predicted by ML is approximately ± 10%, as shown in Fig. 4d. In this study, using the lattice parameter as a descriptor and selecting an appropriate combination of atomic radii and atomic masses as further descriptors led to a significant improvement in the accuracy of the lattice thermal conductivity prediction.
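The deviation statistics discussed above can be reproduced in outline with the sketch below. It assumes that the deviation is the signed relative difference between the ML-predicted and DFT-calculated values, expressed in percent; the text does not give the exact definition, and the function names, bin width, and numbers are illustrative only.

```python
from collections import Counter
from typing import Dict, List


def percent_deviation(predicted: float, calculated: float) -> float:
    """Signed deviation of the ML prediction from the DFT value, in percent."""
    return 100.0 * (predicted - calculated) / calculated


def deviation_histogram(
    predicted: List[float], calculated: List[float], bin_width: float = 2.0
) -> Dict[float, int]:
    """Count compounds per deviation bin; keys are the lower bin edges in percent."""
    counts: Counter = Counter()
    for p, c in zip(predicted, calculated):
        lower_edge = (percent_deviation(p, c) // bin_width) * bin_width
        counts[lower_edge] += 1
    return dict(sorted(counts.items()))


# Illustrative values only (assumed, not results from the study).
calculated = [10.0, 20.0, 5.0, 8.0]
predicted = [10.3, 19.0, 5.35, 8.0]
histogram = deviation_histogram(predicted, calculated)
```

A positive deviation corresponds to an overestimate by the ML model, as reported for the model with 55 parameters.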

Figure 4

(a,b) Comparison between the lattice thermal conductivity predicted by the ML model of boosted decision tree regression using (a) 55 and (b) 4 parameters and the lattice thermal conductivity calculated by DFT. (c) Frequency of deviations between the predicted and calculated lattice thermal conductivity using 55 and 4 parameters. (d) Frequency of deviations between the predicted and calculated lattice thermal conductivity reported by Carrete et al.41 and Liu et al.42.

Figure 5 plots the lattice thermal conductivity against the three parameters that are important in determining it for half-Heusler compounds. Many compounds with large lattice parameters have a lattice thermal conductivity below 6 and can be regarded as low lattice thermal conductivity materials. The lattice thermal conductivity of a solid material is expressed by the relation κ ∼ C v l, where C, v, and l represent the specific heat, sound velocity, and phonon mean free path, respectively. It is essential to reduce the thermal conductivity to improve the performance of thermoelectric conversion materials. For this purpose, C, v, and l should be reduced. However, it is difficult to change C and l significantly within the same crystal system. The lattice thermal conductivity is lower in compounds with a large total atomic mass and a small lattice parameter, because such compounds have a high density and the sound velocity decreases with increasing density. Therefore, for compounds with small lattice parameters, the decrease in thermal conductivity can be qualitatively explained by the decrease in sound velocity. Among half-Heusler compounds with lattice parameters of 0.7 to 0.8, the thermal conductivity is below 4 for BaNaSb, KBaSb, CaCdSn, and KSrSb, which show large differences between the average atomic mass and the atomic mass of the 4c site. Even in half-Heusler compounds with large lattice parameters, small lattice thermal conductivities can be achieved by selecting heavy elements to occupy the 4c site.
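The density argument above can be made concrete with a short sketch. It assumes that the lattice parameter is given in nm, uses the fact that the conventional C1b cell contains four formula units, and takes the standard elastic scaling v ∼ sqrt(modulus/density) at a fixed modulus as a working assumption that is not stated in the text. The function names and input values are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23     # 1/mol (exact SI value)
FORMULA_UNITS_PER_CELL = 4   # the 4a, 4b, and 4c sites each hold four atoms


def mass_density(total_molar_mass_g_mol: float, lattice_parameter_nm: float) -> float:
    """Density in g/cm^3 of a cubic cell with edge a (assumed to be in nm)."""
    a_cm = lattice_parameter_nm * 1e-7  # 1 nm = 1e-7 cm
    cell_mass_g = FORMULA_UNITS_PER_CELL * total_molar_mass_g_mol / AVOGADRO
    return cell_mass_g / a_cm**3


def relative_sound_velocity(density: float, reference_density: float) -> float:
    """Ratio v / v_ref, assuming v ~ sqrt(modulus / density) at a fixed modulus."""
    return math.sqrt(reference_density / density)


# Illustrative inputs only: the same lattice parameter, doubled total mass m1 + m2 + m3.
light = mass_density(200.0, 0.6)
heavy = mass_density(400.0, 0.6)
ratio = relative_sound_velocity(heavy, light)
```

Under these assumptions, doubling the total atomic mass at a fixed lattice parameter doubles the density and lowers the sound velocity by a factor of about 0.71, which is the qualitative trend invoked above.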

Figure 5

3D plots of the predicted lattice thermal conductivity as a function of the three parameters (x33, x42, x55). The magnitude of the lattice thermal conductivity is indicated by the color scale.

The material design of half-Heusler compounds has focused on compounds with group 1 to 4 elements on the 4a site, group 12 to 16 elements on the 4b site, and group 8 to 11 elements on the 4c site. This study revealed that many half-Heusler compounds contain group 14 to 16 elements on the 4c site and group 1 to 2 elements on the 4a and 4b sites. When heavier elements, such as Sn and Sb, occupy the 4c site, the difference between the average atomic mass of the compound and the atomic mass of the 4c site is particularly large, and these compounds are predicted to show a significant decrease in thermal conductivity. In the future, the synthesis of these groups of materials may lead to new half-Heusler compounds with low thermal conductivity.

The computational cost is defined here as the computation time required to perform a calculation, taking into account the number of CPUs used. A computation time of only 0.5 h was required to develop the ML prediction model of thermal conductivity, whereas the corresponding DFT calculation for a single compound required approximately 72 h. Therefore, the computation time required to build the ML prediction model is far shorter than that required for the DFT calculation of even a single compound. This implies that ML-based prediction models are more efficient than the DFT method for thermal conductivity calculations. Thus, ML-based modeling is a highly effective tool for predicting the thermal conductivity of half-Heusler compounds.

Conclusion

In this study, we investigated whether the lattice thermal conductivity of half-Heusler compounds can be predicted by ML from the atomic radii and masses of the constituent elements. The results show that the lattice thermal conductivity of half-Heusler compounds can be predicted with an accuracy of ± 4% using the predicted lattice parameters and the atomic radii and masses. In addition to the conventional design guideline that a material with a small lattice parameter and high density has a low thermal conductivity, we found that even a material with a large lattice parameter can be designed to have a low thermal conductivity by selecting an element for the 4c site of the half-Heusler structure that has a larger atomic mass than those occupying the 4a and 4b sites. Using the trained ML model, the thermal conductivity of unknown half-Heusler compounds can be predicted almost instantly. We expect that this will make the search for half-Heusler compounds with a desired thermal conductivity easier and will help functional materials to be developed in a shorter time and at a lower cost.

Calculation methods

Evaluation of site selection of the half-Heusler structure and lattice parameter

Candidate half-Heusler compounds for lattice thermal conductivity calculations were sought from the Materials Project44. In the crystal structure of C1b-type half-Heusler compounds, there are three types of atomic sites: 4a (0, 0, 0), 4b (0.5, 0, 0), and 4c (0.25, 0.25, 0.25). The 4a and 4b sites are crystallographically interchangeable. Three structural models can therefore be considered, depending on which of the three constituent elements occupies the 4c site, with the other two elements occupying the 4a and 4b sites. The lattice parameters were optimized using DFT calculations for the three models. By comparing the total energies of the three models, the structural model with the lowest energy was determined to be the most stable structural model for the half-Heusler compound. The DFT calculations were performed using the Vienna ab initio Simulation Package (VASP)45,46,47. We adopted the projector augmented-wave (PAW) method48,49 with the generalized gradient approximation of Perdew, Burke, and Ernzerhof50 for the exchange–correlation interactions. The k-mesh, cut-off energy, and energy convergence criterion were set to 7 × 7 × 7, 600 eV, and 10⁻⁶ eV, respectively.
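The site selection described above amounts to enumerating the three possible 4c occupants and keeping the model with the lowest total energy. The sketch below illustrates only this bookkeeping; the function names are hypothetical, and the energies are placeholders in arbitrary units rather than DFT results.

```python
from typing import Dict, List, Tuple

SiteModel = Tuple[str, str, str]  # elements on (4c, 4a, 4b)


def candidate_site_models(elements: Tuple[str, str, str]) -> List[SiteModel]:
    """Return the three distinct models, one per choice of the 4c element.

    4a and 4b are interchangeable, so the two remaining elements are sorted
    to give each model a single canonical representation.
    """
    models = []
    for on_4c in elements:
        others = sorted(e for e in elements if e != on_4c)
        models.append((on_4c, others[0], others[1]))
    return models


def most_stable_model(energies: Dict[SiteModel, float]) -> SiteModel:
    """Pick the site assignment with the lowest total energy."""
    return min(energies, key=energies.get)


models = candidate_site_models(("Ni", "Ti", "Sn"))
# Placeholder energies in arbitrary units (assumed, not DFT results).
placeholder_energies = dict(zip(models, [-1.0, -0.5, -0.2]))
stable = most_stable_model(placeholder_energies)
```

In practice, each energy would come from a separate structural optimization of the corresponding model.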

Evaluation of lattice thermal conductivity of half-Heusler compounds

The lattice thermal conductivity of half-Heusler compounds was calculated using the Phono3py code, developed by Atsushi Togo et al.22. The interatomic force constants were determined from various atomic displacements in a supercell made of 2 × 2 × 2 primitive cells. Although the lattice thermal conductivity depends on temperature, here we discuss the lattice thermal conductivity at 300 K. The calculated lattice thermal conductivities and lattice parameters are summarized in Table S1 in the Supplementary materials section.