Predicting lattice thermal conductivity via machine learning: a mini review

Over the past few decades, molecular dynamics simulations and first-principles calculations have become the two major approaches for predicting the lattice thermal conductivity (κL); they are, however, limited by insufficient accuracy and high computational cost, respectively. To overcome these inherent disadvantages, machine learning (ML) has been successfully used to accurately predict κL in a high-throughput style. In this review, we introduce recent ML works on the direct and indirect prediction of κL, where the derivations and applications of data-driven models are discussed in detail. A brief summary of current works and future perspectives is given at the end.


INTRODUCTION
The lattice thermal conductivity (κL) is a key design parameter for various technological applications. For example, heat sinks in electronic devices require higher κL to dissipate excess thermal energy 1, while reducing κL is an effective approach to improve the efficiency of thermoelectric (TE) conversion 2. It is thus highly desirable to discover or design systems with a desired κL. On the theoretical side, the most reliable approach for predicting κL is the solution of the phonon Boltzmann transport equation (BTE) within the framework of density functional theory (DFT) 3,4. However, the required calculations of the interatomic force constants (IFCs) are time-consuming, especially for systems with large unit cells and low symmetry. As an alternative, classical molecular dynamics (MD) simulations can be utilized to predict the κL of systems with complex crystal structures 5. Nevertheless, the accuracy of MD depends strongly on the choice of interatomic potentials, which limits its wide application. In short, it remains challenging to accurately predict κL, especially in a high-throughput way.
As an important technique of artificial intelligence, machine learning (ML) can efficiently determine the underlying connections among enormous amounts of data at extremely low cost 6-9. During the past few decades, many efforts have been devoted to evaluating the κL of various systems, both theoretically and experimentally 10-14. Based on these available data, ML can establish a mapping between the target property (κL) and the input features (such as the atomic mass, the phonon frequency, and the volume of the unit cell 8,15). Compared with first-principles calculations and MD simulations, data-driven ML models enable high-throughput evaluation of κL and exhibit strong predictive power for systems both inside and beyond the training set 15,16. In addition to such direct prediction of κL, ML has been successfully used to build accurate interatomic potentials for MD simulations. Generally speaking, a machine learning potential (MLP) employs regression algorithms to determine the ab-initio potential energy surface (PES), with the atomic configurations usually adopted as input features 17,18. Recently, MLPs have been utilized to accurately predict the κL of systems with complex crystal structures and chemical compositions, such as alloys 19, heterostructures 20, and molten salts 21. On the other hand, as the derivatives of the total energy with respect to the atomic displacements, the IFCs can be obtained from a Taylor expansion of the PES 4,22. Since MLPs determine an accurate PES, they can deliver the IFCs at almost negligible computational cost, which enables an accelerated solution of the phonon BTE for the evaluation of κL 22,23. Collectively, ML can overcome the inherent disadvantages of MD simulations and first-principles calculations to accurately and readily predict κL.
The remainder of this review is organized as follows. In the section "Direct prediction", we give a brief introduction to dataset construction, feature selection, and training algorithms, which are then combined to obtain high-throughput models for predicting κL. In the section "Indirect approach", we focus on the construction of MLPs, and highlight their first-principles level accuracy as well as their advantages over conventional approaches in predicting κL. The review concludes with a summary of current works and future perspectives.

DIRECT PREDICTION

Dataset construction and related features
As a data-driven technique, ML requires a dataset containing the κL of a substantial number of systems to derive a reliable prediction model. In general, the κL values can be collected from first-principles calculations, MD simulations, and experimental measurements. In addition, one can also obtain κL from materials databases. For example, thousands of entries in the Automatic FLOW (AFLOW) database include κL values calculated by the so-called Automatic Gibbs Library (AGL) method 12,24. Here, the GIBBS quasiharmonic Debye model 25 is employed to evaluate the Debye temperature and the Grüneisen parameter based on the computationally feasible adiabatic bulk modulus, which are then inserted into the well-known Slack model for the determination of κL 26. We should emphasize that it is preferable to collect all κL values obtained with the same approach, for example, either first-principles or MD. However, if there are not enough data available for ML, one can also combine both, as long as the interatomic potentials adopted in MD are well-tested and the results exhibit sufficient accuracy. Figure 1 is a schematic illustration of ML for the high-throughput prediction of κL, where the dataset is usually divided into two subsets, the training and testing sets. To avoid random selection of training data, principal component analysis (PCA) can be used to identify systems that possess distinct features in the dataset. For example, Tranås et al. demonstrated that a model based on a semi-random pool of half-Heusler (HH) compounds (i.e. assumed "bad luck" in the training set) was unable to correctly predict the small κL values of the remaining systems in the testing set 27. As an alternative, they used active sample selection based on PCA, where three compounds with extremely low κL were included in the training process.
Such an approach significantly improves the model performance, in particular the ability to identify low-κL compounds in the testing set.
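The idea of PCA-guided sample selection can be illustrated with a short numpy sketch. Note that the two-component projection and the greedy farthest-point criterion below are illustrative assumptions, not the exact procedure of ref. 27:

```python
import numpy as np

def pca_2d(features):
    """Project the feature matrix onto its first two principal components."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

def select_distinct(features, n_select):
    """Pick training samples that are spread out in PCA space, so that
    'unlucky' random draws missing distinct systems (e.g. low-kappa_L
    outliers) are avoided.  Uses greedy farthest-point sampling."""
    z = pca_2d(features)
    # start from the sample farthest from the center of the cloud
    chosen = [int(np.argmax(np.linalg.norm(z - z.mean(axis=0), axis=1)))]
    while len(chosen) < n_select:
        # distance of every sample to its nearest already-chosen sample
        dist = np.min(np.linalg.norm(z[:, None, :] - z[chosen], axis=2), axis=1)
        chosen.append(int(np.argmax(dist)))
    return chosen
```

The returned index list can then define the training split, with the remaining systems forming the testing set.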
As mentioned in the introduction, ML establishes a mapping between κL and related input features, which usually contain information about: (1) structural properties, such as the lattice constant, the volume of the unit cell, the number of atoms, and the bond length; (2) elemental properties of the constituent atoms, including the atomic number, the atomic mass, and the Pauling electronegativity; (3) phonon properties, such as the phonon frequency, the group velocity, the heat capacity, and the Grüneisen parameter. To be compatible with most ML algorithms, systems of different sizes need to be represented by feature vectors of a fixed length. This problem can be solved by adopting statistical values of the elemental properties, such as the maximum, the minimum, the composition-weighted (CW) value, and the standard deviation 28. Obviously, it is quite important to screen out features that are closely related to the target property. For instance, Juneja et al. revealed high Pearson correlations between κL and several fundamental properties, including the maximum phonon frequency, the average atomic mass, the volume of the unit cell, and the integrated Grüneisen parameter up to 3 THz 15. Using these input features, they developed a Gaussian process regression-based ML model by training on the κL of 120 dynamically stable and nonmetallic compounds. To keep only those features that are highly related to κL, Chen et al. performed recursive feature elimination (RFE) on the initial feature vector, which significantly reduced its dimensionality from 63 to 29 29. It should also be noted that highly intercorrelated features can increase the computational cost and may affect the predictive power. It is thus quite necessary to check feature relevancy and redundancy before any training process.
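As a minimal sketch of how a fixed-length vector is built from a variable-size composition via such statistics (the property table below is illustrative; real models combine many elemental properties):

```python
import numpy as np

# Illustrative elemental property table (atomic masses in u).
ATOMIC_MASS = {"Si": 28.09, "Ge": 72.63, "Bi": 208.98, "Te": 127.60}

def element_statistics(composition, prop=ATOMIC_MASS):
    """Turn a composition such as {"Bi": 1, "Te": 2} into the fixed-length
    vector [max, min, composition-weighted (CW) value, standard deviation]."""
    values = np.array([prop[el] for el in composition])
    weights = np.array(list(composition.values()), dtype=float)
    weights /= weights.sum()
    cw = float(np.sum(weights * values))                 # CW mean
    std = float(np.sqrt(np.sum(weights * (values - cw) ** 2)))
    return [float(values.max()), float(values.min()), cw, std]
```

Concatenating such statistics for several elemental properties yields one feature vector per compound, regardless of how many atomic species it contains.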

Machine learning algorithms
With the rapid development of artificial intelligence, various ML algorithms have been proposed, such as Bayesian Optimization (BO) 30, eXtreme Gradient Boosting (XGBoost) 31, Neural Networks (NN) 32, Kernel Ridge Regression (KRR) 33, the Least Absolute Shrinkage and Selection Operator (LASSO) 34, the Sure Independence Screening and Sparsifying Operator (SISSO) 35, Generalized Linear Regression (GLR) 36, Random Forests (RF) 37, and Gaussian Process Regression (GPR) 38. In this section, we give a brief introduction to NN, SISSO, and RF, which are widely used to construct high-throughput ML models for the prediction of κL.
In the learning process, a NN first feeds the input layer with feature data, which is then manipulated by several hidden layers, and the output layer finally generates the target value. Each neuron is connected to all the neurons of the previous layer, and it processes the data according to a specific activation function. When data is transferred between neurons, its value is multiplied by a weight parameter. To generate the best model, one should optimize the hyperparameters (such as the activation function, the number of neurons, and the number of hidden layers) and minimize the loss function. It has been demonstrated that NNs can effectively handle non-linear and complex problems, although the obtained model is usually treated as a black box.
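A minimal NN regression sketch using scikit-learn is given below; the architecture, hyperparameters, and the synthetic log-scaled κL target are illustrative assumptions, not the setup of any specific work:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))              # stand-ins for input features
y = X @ np.array([0.8, -0.5, 0.3, 0.1])    # synthetic log10(kappa_L) target

# Two hidden layers; the lbfgs solver converges well for small datasets.
model = MLPRegressor(hidden_layer_sizes=(32, 16), activation="relu",
                     solver="lbfgs", max_iter=2000, random_state=0)
model.fit(X[:160], y[:160])                # simple train/test split
score = model.score(X[160:], y[160:])      # R^2 on the held-out data
```

Because κL spans several orders of magnitude across materials, training on a log-scaled target (as in the log-scaled κexp mentioned below) is a common choice.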
The SISSO algorithm relies on two key steps: feature-space construction and descriptor identification. Specifically, the input features are first combined by iteratively applying the algebraic operators {I, +, −, ×, /, exp, log, | |, √, ^−1, ^2, ^3}, which can construct a huge feature space. The sure independence screening (SIS) step then scores each new feature with a metric (correlation magnitude), and selects the subspace containing the descriptors most highly related to the training data. The sparsifying operator (SO) is finally utilized to find the optimal n-dimensional descriptor. Compared with many other ML algorithms, SISSO can identify descriptors that are explicit and analytic functions of key inputs, which is very beneficial for understanding the inherent physical mechanisms.
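The feature-space construction and SIS steps can be sketched with a toy operator subset (real SISSO iterates the expansion and uses the full operator set and a proper sparsifying step):

```python
import numpy as np

def expand_features(X, names):
    """One round of SISSO-style feature-space construction: apply a toy
    subset of the algebraic operators to the primary features and to
    pairs of them."""
    cols, labels = [], []
    for i, name in enumerate(names):
        c = X[:, i]
        cols += [c, c ** 2, c ** 3]
        labels += [name, f"({name})^2", f"({name})^3"]
        if np.all(c > 0):                      # log, sqrt, 1/x need positivity
            cols += [np.log(c), np.sqrt(c), 1.0 / c]
            labels += [f"log({name})", f"sqrt({name})", f"1/({name})"]
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            cols.append(X[:, i] * X[:, j])
            labels.append(f"{names[i]}*{names[j]}")
    return np.column_stack(cols), labels

def sis_screen(F, y, k):
    """Sure independence screening: keep the k candidate features with the
    largest |Pearson correlation| to the training target."""
    Fc = F - F.mean(axis=0)
    yc = y - y.mean()
    corr = Fc.T @ yc / (np.linalg.norm(Fc, axis=0) * np.linalg.norm(yc) + 1e-12)
    return np.argsort(-np.abs(corr))[:k]
```

Because each generated column carries an explicit label, the surviving descriptors remain analytic functions of the primary features, which is the key interpretability advantage noted above.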
As an ensemble learning algorithm, RF combines multiple decision trees (DT) 39 to avoid overfitting. Each DT in the "forest" is individually trained on a randomly selected subset of features. Here, the training data is divided into two or more categories at the root node based on the feature values, and each subsequent node receives a data subgroup and passes its separations on to the next nodes until all the generated groups are homogeneous. The output of a final node (called a "leaf") gives the mean value of the corresponding separated samples. In short, a trained DT model obtains the predicted value by determining the interval of the input features. Collectively, the RF model can rank the importance of each feature according to the order of the nodes, and outputs the average of the values predicted by all the DT ensemble members.
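A minimal RF regression and feature-ranking sketch with scikit-learn (the synthetic data, in which the target is dominated by the first feature, is purely illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X = rng.uniform(size=(300, 3))                    # e.g. [mass, volume, frequency]
y = 5.0 * X[:, 0] + 0.1 * rng.normal(size=300)    # target dominated by feature 0

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
ranking = np.argsort(-forest.feature_importances_)  # most important feature first
```

The `feature_importances_` attribute provides exactly the kind of feature ranking described above, and `forest.predict` averages the outputs of all trees.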

High-throughput prediction models
As an effective high-throughput method, ML has been widely used to predict the κL of various systems in recent years 15,16,27-29,40-57.

[Fig. 2 b Dependence of the predicted κL on specific elements for compounds in the ICSD, with the values shown by colors along with ΔHatomic and ρ; a and b reproduced with permission from ref. 44. c Schematics of the multilayer structures and the MD simulation setup. d Comparison of the real and predicted κL for 100 randomly generated RMLs and their corresponding 100 GMLs in the testing set; c and d reproduced with permission from ref. 45.]
One representative model is illustrated in Fig. 2a 44. Among 75 input features, the average atomization enthalpy (ΔHatomic) and the density (ρ) were found to be the most relevant to κL. Note that the training set contains 4937 κL values calculated by the above-mentioned AGL method 12, which were collected from the AFLOW database 24. The model was then applied to all entries in the Inorganic Crystal Structure Database (ICSD) 58, and it was found that compounds containing halogen elements or heavy atoms exhibit low κL (see Fig. 2b). Among them, potential TE materials (such as BiTe2Tl and Cl2CsI) were screened out, and the prediction accuracy was validated by first-principles calculations. Besides, the NN algorithm has been successfully applied to predict the κL of random multilayer (RML) and gradient multilayer (GML) structures composed of two types of conceptual atoms with different masses, as shown in Fig. 2c 45. To construct the training set, the κL of 1600 RMLs and their corresponding 1600 GMLs were calculated by MD simulations. In contrast to the generally used crystal and elemental properties, the input features include several key parameters quantifying the disorder in the layer thicknesses of the RMLs, as listed in Table 1. Figure 2d shows the predicted κL for 200 multilayer structures beyond the training set, which are in good agreement with the MD results. Unlike most ML models, which appear as black boxes, Liu et al. employed the SISSO method to establish a physically intuitive descriptor for predicting the κL of HH compounds 46. They found that the first term dominates the three-dimensional (3D) descriptor, where the κL decreases with the lattice constant a but increases with the electronegativity difference |χA − χB| between the atoms at sites A and B. This is consistent with the general belief that systems with larger unit cells usually have smaller κL, and that stronger chemical bonding leads to higher κL. Beyond the initial 86 training data points, the strong predictive power of the descriptor was confirmed on 75 HH compounds and 15 full-Heusler (FH) systems.
As typical input features of many ML models, accurate structural parameters are usually obtained from first-principles calculations 40,41,54,56 if experimental results are not available. Alternatively, Jaafreh et al. utilized the crystal features of a series of prototype structures to establish a RF-based model, which can be applied to related systems without the use of any DFT-relaxed structural parameters 16. It should be noted that the crystal features are generated using the area of each face of the Wigner-Seitz cell (see Fig. 3a) and the characteristics of neighboring atoms. The training set contains 2146 κL values of 119 compounds at a series of temperatures from 100 to 1000 K. As shown in Fig. 3b, the RF-based model exhibits strong predictive power for 4 systems in the testing set. Going further, the model was used to predict the room-temperature κL of 32,116 compounds in the ICSD, of which 273 have ultralow values and 4 are even below 0.1 Wm−1K−1, suggesting very promising applications in the field of energy harvesting.
In recent years, the Convolutional Neural Network (CNN) algorithm has been adopted to predict the κL of porous graphene 47, hybrid carbon-boron nitride honeycombs 50, aperiodic superlattices 57, etc. The input layer of a CNN is fed with particular arrays, which can be obtained by extracting characteristics of an image representing the system, instead of selecting features from various physical properties. In particular, the Crystal Graph Convolutional Neural Network (CGCNN) algorithm enables the prediction of target properties from a graph representing the connection of atoms in the crystal 59. As a major advance, Zhu et al. employed CGCNN to predict the κL of all known inorganic crystals directly from their atomic structures 51. As shown in Fig. 3c, they established a model based on a dataset 60 containing 2668 calculated lattice thermal conductivities (named κC in their work), where the crystal structure was converted to a graph with the nodes and edges respectively representing atoms and connections between neighboring atoms. The CNN was initialized with feature vectors that characterize each node and edge. It should be noted that these κC were calculated by a semi-empirical model and inevitably exhibit insufficient accuracy 51,61. To address this problem, they collected 132 experimentally measured lattice thermal conductivities (κexp). Due to the small size of this training set, the resulting model exhibits a large mean absolute error (MAE) of 0.51 (for log-scaled κexp). As correlated datasets share similar domain knowledge, they developed a transfer learning scheme (see Fig. 3c) in which all layers of the model trained on κC were transferred to initialize a second CGCNN, reducing the MAE to ~0.27. It should be noted that ML models can be further optimized via active learning, which is very useful for the inverse design of systems with a desired κL.
For example, it is very time-consuming to identify the optimal distribution of holes that minimizes the κL of two-dimensional (2D) materials, since the design space expands dramatically with increasing hole density. Taking porous graphene as a prototypical example, Wan et al. adopted a CNN-based inverse design approach to determine the structure with the lowest κL, which only needs to simulate ~10^3 systems by MD out of ~10^6 possible candidates 47. By performing MD simulations, Chowdhury et al. obtained the κL of 300 randomly generated Si/Ge RMLs, which were used as the initial dataset 57. They iteratively identified RMLs with locally enhanced phonon transport and included them as additional training data. Using the CNN model, RMLs with unexpectedly high κL were discovered, which can be attributed to the presence of closely spaced interfaces.
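The iterative scheme shared by these works can be sketched generically as follows; the surrogate (linear regression standing in for a CNN), the lowest-κL acquisition criterion, and all parameter values are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def active_learning(candidates, simulate, n_init=20, rounds=5, batch=5):
    """Simulate a small initial set, train a cheap surrogate, then repeatedly
    simulate only the candidates predicted to have the lowest kappa_L and
    retrain on the grown dataset."""
    rng = np.random.default_rng(0)
    labeled = {int(i): simulate(candidates[i])
               for i in rng.choice(len(candidates), n_init, replace=False)}
    for _ in range(rounds):
        idx = np.array(sorted(labeled))
        model = LinearRegression().fit(candidates[idx],
                                       np.array([labeled[i] for i in idx]))
        pred = model.predict(candidates)
        # pick the unlabeled candidates with the lowest predicted kappa_L
        new = [int(i) for i in np.argsort(pred) if int(i) not in labeled][:batch]
        for i in new:                      # only the selected few are simulated
            labeled[i] = simulate(candidates[i])
    return labeled
```

The expensive `simulate` call (MD in the works above) is thus invoked only n_init + rounds × batch times instead of once per candidate.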
Summarizing this section, considerable progress has been made in the high-throughput prediction of κL by leveraging various data-driven models. For a quick overview, Table 1 lists the training and testing sets, the input features, and the adopted algorithms of representative ML works of the past 5 years. With the rapid growth of big data and the accelerated development of artificial intelligence, it is expected that ML will become a major scientific paradigm for accurately predicting κL, and that more ML models or descriptors will emerge to give physical insights into the different mechanisms for manipulating κL, such as phonon coherence, weak coupling of phonons, and high-order phonon anharmonicity 62.

INDIRECT APPROACH
Due to the lack of available training data, the above-mentioned ML models are not directly applicable to many 2D materials, nanowires, alloys, ternary salts, etc. As an alternative, ML can also be utilized to construct accurate interatomic potentials or force constants, so that an efficient evaluation of κL becomes feasible using MD simulations or even first-principles calculations.
It is known that atomic-scale simulations need to determine the PES, which provides the potential energy as a function of the atomic positions. In principle, the most accurate PES can be obtained from quantum mechanical calculations, which are however very time-consuming and even prohibitive for large systems. Based on physical knowledge of the interatomic bonding, many specific analytic expressions have been proposed, known as empirical potentials 63. However, the PES is a multidimensional real-valued function that cannot be completely captured by these specific functional forms 64. Empirical interatomic potentials thus usually exhibit insufficient accuracy, and the involved parameters have to be carefully optimized for different systems. Taking bulk silicon as an example, the κL calculated with the original Stillinger-Weber potential (~244 Wm−1K−1 at room temperature) is much higher than the experimentally measured result (~148 Wm−1K−1) 65. It is thus quite necessary to develop alternative potentials for the accurate prediction of κL, and MLPs are a good choice.

[Fig. 3 a, b reproduced with permission from ref. 16. c Schematic of transfer learning based on CGCNN for the prediction of κL: the crystal structure is converted to a graph, where the nodes represent atoms and the edges connect neighboring nodes; reproduced with permission from ref. 51.]

Training data and input features
In principle, ML can be used to fit the correlation between atomic configurations and physical properties of given systems. Compared with empirical potentials, MLPs determine the PES in a data-driven manner to describe the interatomic interactions. To ensure that MLPs exhibit first-principles level accuracy, the dataset is usually constructed by performing ab-initio molecular dynamics (AIMD) simulations at a series of temperatures, where the energies, forces, and stresses of different atomic configurations are recorded 66-68. It should be noted that the atomic configurations are sampled from the AIMD trajectories such that they are uncorrelated with each other. Unlike many ML models for the high-throughput prediction of κL, establishing MLPs requires input features that represent the local environment around each atom, usually within a specific cutoff radius. The adopted features must be invariant to Euclidean transformations and to permutations of chemically equivalent atoms 69. A simple counterexample is the atomic Cartesian coordinates, which cannot be used for training: when the system is rotated or chemically equivalent atoms are exchanged, a new list of Cartesian coordinates is generated that nevertheless corresponds to the same atomic configuration. For the evaluation of κL, the widely used features are the moment tensor 70, the atom-centered symmetry functions (ACSFs) 71,72, and the smooth overlap of atomic positions (SOAP) 18. Taking the ACSFs as an example, G_i^2 and G_i^4 in the following expressions respectively describe the radial and angular environment of atom i:

G_i^2 = Σ_j exp[−η_s (R_ij − R_s)^2] f_c(R_ij),   (1)

G_i^4 = 2^(1−ζ) Σ_{j,k≠i} (1 + λ cos θ_ijk)^ζ exp[−η_a (R_ij^2 + R_ik^2 + R_jk^2)] f_c(R_ij) f_c(R_ik) f_c(R_jk).   (2)

Here R_ij is the distance between atoms i and j, θ_ijk is the angle centered at atom i, and f_c is the smooth cutoff function. η_s and R_s define the width and the center of the Gaussians, respectively. In Eq. (2), the angular resolution and distribution are determined by ζ and η_a, and λ takes the values +1 and −1. It should be emphasized that the construction of appropriate features is a very challenging task, and we refer the interested reader to a review article 73 that summarizes recent work on efficient representations of atomic and molecular structures.

Machine learning potentials

Table 2 summarizes several important MLPs used for the evaluation of κL, including the Moment Tensor Potentials (MTPs) 70, the Neural Network Potentials (NNPs) 71,74, and the Gaussian Approximation Potentials (GAPs) 75. In particular, the MTPs exhibit an excellent balance between accuracy and computational efficiency 76, and have been widely used to predict the κL of various systems, such as monolayers, alloys, and complex compounds 68,77-93. In principle, training an MTP amounts to minimizing the difference between the predicted and DFT-calculated energies (E), forces (f), and stresses (σ) for K atomic configurations 70,94,95:

Σ_{k=1}^{K} [ w_e (E_k^MTP − E_k^DFT)^2 + w_f Σ_i |f_{k,i}^MTP − f_{k,i}^DFT|^2 + w_s |σ_k^MTP − σ_k^DFT|^2 ] → min.

Here, w_e, w_f, and w_s are positive weights that express the relative importance of the energies, forces, and stresses in the training process. To improve the quality of MTPs, active learning is usually implemented, where an atomic configuration is included in the training set if its extrapolation grade (a quantity correlated with the prediction error 94) is above a threshold and below the allowed maximum. Figure 4 shows the widely used active learning scheme for training an MTP, which usually contains six stages labeled A to F. To validate the accuracy of a trained MLP, the energies, forces, and stresses of different atomic configurations are checked on the testing and predicting sets. For instance, Huang et al. proposed a single atom neural network potential (SANNP) for amorphous silicon based on a training set containing 800 atomic configurations from AIMD simulations 67. Figure 5a−c respectively show the predicted atomic energies, total energies, and atomic forces in the testing set, which agree well with those obtained from DFT calculations.
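The invariance requirement on features such as the ACSFs discussed above can be made concrete with a short numpy sketch of the radial symmetry function G2 (the parameter values are arbitrary):

```python
import numpy as np

def cutoff(r, r_c):
    """Smooth cosine cutoff f_c: decays to zero at the cutoff radius r_c."""
    return np.where(r <= r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)

def acsf_g2(positions, i, eta, r_s, r_c):
    """Radial ACSF for atom i: G2_i = sum_j exp(-eta*(R_ij - R_s)^2) f_c(R_ij)."""
    r_ij = np.linalg.norm(positions - positions[i], axis=1)
    r_ij = np.delete(r_ij, i)                    # exclude the central atom
    return float(np.sum(np.exp(-eta * (r_ij - r_s) ** 2) * cutoff(r_ij, r_c)))
```

Because G2 depends only on interatomic distances, it is unchanged by rotations, translations, and permutations of equivalent atoms, which is exactly the requirement that rules out raw Cartesian coordinates as input features.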

Application examples
As mentioned in the introduction, MLPs with first-principles level accuracy can be implemented in MD simulations or in the phonon BTE to indirectly predict the κL of given systems. Unlike conventional first-principles calculations or classical MD with empirical potentials, the evaluation of κL employing MLPs simultaneously exhibits strong reliability and high efficiency, as demonstrated for the various systems summarized in Table 2 20-23,66-68. For example, Korotaev et al. used the active learning algorithm to develop an MTP for the CoSb3 skutterudite and accurately predicted its κL at different temperatures 68. Indeed, we see from Fig. 6a that the κL indirectly obtained from the MTP almost coincides with the experimentally measured results. It should be emphasized that, compared with conventional first-principles calculations, the MTP can significantly accelerate the prediction process (the computational speed increased by more than four orders of magnitude).
Due to their compositional complexity, predicting the κL of high-entropy materials is usually a challenging task. Recently, Dai et al. established a deep learning potential for the thermally insulating material (Ti0.2Zr0.2Hf0.2Nb0.2Ta0.2)B2 120, which was then used in MD simulations to calculate its average lattice thermal conductivity along two directions (κp), as shown in Fig. 6b 107. At room temperature, κp is predicted to be 4.0 Wm−1K−1, which is close to the experimentally measured value of ~4.8 Wm−1K−1.
In addition, a GAP for crystalline Si with vacancies was adopted by Babaei et al. to determine the lattice thermal conductivity limited by phonon-vacancy scattering (κph-v) 116. As can be seen from Fig. 6c, the κph-v predicted by the GAP shows good agreement with the DFT-calculated results at different vacancy concentrations, while the predictions of empirical potentials exhibit much larger errors. A similar picture is found in Fig. 6d, where the effects of three-phonon and phonon-vacancy scattering are both included (κ3ph+ph-v). Note that the computational cost of the GAP is five orders of magnitude smaller than that of the DFT calculations in their work, indicating the high efficiency of this kind of MLP for the prediction of κL.
Last but not least, we note that several other MLPs have been proposed recently, such as the spectral neighbor analysis potential, the bond order potential, the force constant potential, and the spatial density neural network force fields, which have also been demonstrated to accurately evaluate the κL of many systems at low computational cost 19,121-127. With a deepening understanding of interatomic interactions, it is reasonable to expect that more reliable and universal MLPs will be developed in the future.

SUMMARY AND PERSPECTIVE
To conclude, we hope this mini review enables the interested reader to gain a preliminary understanding of predicting κL via ML, either directly or indirectly. In high-throughput ML models for the direct prediction of κL, the input features usually contain several fundamental physical properties of the investigated systems and their constituent elements, such as the lattice constants, the phonon frequencies, and the atomic masses. Such data-driven models can be utilized for the rapid screening and inverse design of materials with a desired κL, and their predictive power has been demonstrated on many systems both inside and beyond the training sets. In addition, MLPs can be readily implemented in MD simulations or in the phonon BTE, which offers an indirect but quite efficient prediction of κL for particular systems, including crystals with defects, high-entropy compounds, amorphous structures, and so on. Compared with conventional DFT and MD approaches, MLPs can significantly accelerate the evaluation of κL while retaining first-principles level accuracy.

[Fig. 5 Validation of the accuracy of machine learning potentials: linear correlations between the predicted a atomic energies, b total energies, c atomic forces and those calculated by DFT in the testing set for amorphous silicon; reproduced with permission from ref. 67.]
Although considerable advances have been made in the direct prediction of κL via ML, several challenges remain to be addressed. For example, it is still difficult to construct the large and reliable datasets required for training, which affects the predictive power and transferability of such data-driven methods. In particular, many ML models are severely limited to specific systems (e.g. HH compounds, zincblende and rocksalt structures), leaving a much larger materials space unexplored 40,41,43,46,52,54. Although one can find κL values for thousands of compounds in the AFLOW and TE Design Lab repositories, they are usually calculated using empirical models 12,61 and may exhibit insufficient accuracy compared with those obtained from first-principles calculations, MD simulations, or experimental measurements. On the other hand, substantial advances have been made in the high-throughput discovery of 2D materials, while their thermal transport properties remain less known 128-131. Due to the limited experimental and theoretical data, it is rather difficult to derive a reliable ML model to predict the κL of various 2D materials. It is believed that transfer learning can overcome the disadvantage of small data size by pre-training on a correlated dataset. However, this still requires accurate first-principles calculations to obtain the scattering phase space of numerous systems 53, which remains a tough task. To take better advantage of transfer learning, more effort should be devoted to identifying readily available physical properties that are highly correlated with κL.
For the development of efficient MLPs, it is usually necessary to calculate the energies, forces, and stresses of a substantial number of atomic configurations within the framework of DFT. Such a task is however very time-consuming for systems with large unit cells and complex chemical compositions, such as molten salts 21, skutterudites 68, and high-entropy compounds 107. Besides, the employed features usually describe the atomic environment within a certain cutoff radius, and the established MLPs thus ignore the long-range interactions that can be very important for thermal transport properties in some cases. It is expected that the efficiency and accuracy of MLPs can be further improved by careful selection and optimization of the input features and/or learning schemes.

DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author on reasonable request.
Received: 19 October 2022; Accepted: 2 January 2023

[Fig. 6 a Reproduced with permission from ref. 68. b The lattice thermal conductivities along two different directions (κa and κc) and their average (κp) of (Ti0.2Zr0.2Hf0.2Nb0.2Ta0.2)B2, plotted as a function of temperature; the inset shows the auto-correlation function; reproduced with permission from ref. 107. c κph-v and d κ3ph+ph-v of silicon as a function of vacancy concentration at 300 K, predicted by DFT, the GAP, and empirical potentials; reproduced with permission from ref. 116.]