Predicting interfacial thermal resistance by machine learning

Various factors affect the interfacial thermal resistance (ITR) between two materials, making ITR prediction a high-dimensional mathematical problem. Machine learning is a cost-effective method to address this. Here, we report ITR predictive models based on experimental data. The physical, chemical, and material properties of ITR are categorized into three sets of descriptors, and three algorithms are used for the models. Those descriptors assist the models in reducing the mismatch between predicted and experimental values and reaching high predictive performance of 96%. Over 80,000 material systems composed of 293 materials were inputs for predictions. Among the top-100 high-ITR predictions by the three different algorithms, 25 material systems are repeatedly predicted by at least two algorithms. One of the 25 material systems, Bi/Si achieved the ultra-low thermal conductivity in our previous work. We believe that the predicted high-ITR material systems are potential candidates for thermoelectric applications. This study proposed a strategy for material exploration for thermal management by means of machine learning.


INTRODUCTION
Thermal transport across the interfaces of two different materials is a crucial issue in micro/nanoscale electronic, photonic, and phononic devices. A temperature discontinuity exists between the interface of dissimilar materials; this discontinuity can be described as interfacial thermal resistance (ITR) in the equation R = q/ΔT, where q is the heat flux and ΔT is the temperature difference at the interface. In nanostructured devices, in which the characteristic length scales are shorter than the phonon mean free paths, the transport mode is ballistic rather than diffusive, and ITR becomes the dominant factor of phonon transport as the length scale decreases. Practically, phonon transport in thin films is affected by a variety of interfacial properties, including roughness, binding energy, and the presence of impurities or intermediate layers of mixed atoms. Even when the interfaces are in perfect contact, phonon reflections occur across the boundary as a result of differences in the acoustic properties of adjacent materials. Thus, several characteristics contribute to ITR, making it difficult to describe or predict.
Methods such as acoustic mismatch model (AMM), diffuse mismatch model (DMM), and molecular dynamics (MD) are commonly used to predict ITR. In AMM and DMM, which were introduced by Khalatnikov in 1952 and Swartz and Pohl in 1989, respectively, phonons in the equilibrium state are modeled without accounting for the nonequilibrium distribution of phonons. 1,2 AMM assumes that incident phonons at an interface undergo specular reflection or transmission, however, highfrequency or high-temperature phonons are scattered diffusely because of the interface roughness, leading researchers to develop more accurate methods to predict ITR. Prasher proposed the scattering-mediated acoustic mismatch model (SMAMM) and modified traditional AMM for weakly bonded atoms at an interface. 3 DMM assumes that phonons are elastically scattered and lose their memory of transport modes at the interface. In addition, the transmission probability depends on the ratio of the phonon density of states (PDOS). Therefore, the assumption of elastically scattering will result in failure when inelastic phonons are present, as at the imperfect interfaces, where they create energy channels. In AMM and DMM models, properties including temperature, density, sound velocity (longitudinal and transverse), and unit cell volume, are used as descriptors. However, AMM and DMM result in large discrepancies between the predicted and experimental values, with correlation coefficients of 0.6 and 0.62, and with RMSE of 121.3 and 91.4 (10 −9 m 2 K/W), respectively. 4 Both AMM and DMM assume that the phonons are in equilibrium on each side of the interface; however, in systems where the layer thickness is smaller than the phonon mean free path (e.g., systems with multiple quantum wells and superlattices), the nonequilibrium distribution of phonons should be taken into account. Thus, AMM and DMM have important shortcomings that need to be addressed.
The effect of lattice mismatch at the interface on ITR can be evaluated by MD simulation. Equilibrium MD is more suitable for the analysis of transient response measurements, whereas nonequilibrium MD is applied for steady-state measurements. In classical MD, atomic motion is calculated from classical Newtonian mechanics rather than quantum theory, and the zero-point energy is assumed to be zero. In contrast, ab initio MD provides a higher accuracy than classical MD, however, the simulation time and particle numbers are restricted to allow the full quantum calculation of the electronic structure for every configuration of atoms. All trajectories of the system often need more information than that which is known.
Predicting the ITR for various material systems is a timeconsuming process. Generally, the physical explanation with the different prediction methods mentioned above can be used only in specific cases. It is difficult to consider every property that might affect ITR in a single equation, particularly for interfacial conditions. Machine learning is a cost-effective and timeefficient method to address this high-dimensional problem. The machine learning has been implemented for thermal transport properties in many reported works. Xin Qian et al. developed Gaussian approximation potential models for analyzing the phonon dispersion stability. 5 Shenghong Ju et al. designed the Si/Ge interfacial structure for controlling heat conduction through atomistic Green's function and Bayesian optimization. 6 Masaki Yamawaki et al. used Bayesian optimization for multifunctional structural design of graphene nanoribbons for thermoelectric materials. 7 Our previous work also shows promising improvements of ITR predictive performance compared with the commonused AMM and DMM models through machine learning. 4 And, the melting point, heat capacity, unit cell volume, density, and film thickness were proposed to be important descriptors for ITR prediction. 4 In general, the larger dissimilarities of phonon properties lead to high ITR corresponding to the temperature. However, the inelastic interfacial scattering processes, which would be influenced by the interfacial quality, interfacial bonding, and phonon transmission coefficient, become important at and above the room temperature. Therefore, the physical and chemical properties which affect the interfacial quality should be carefully considered. Based on the thermophysical properties selection of our previous work, 4 in this study, we will further discuss how we consider lots of important factors, especially the interfacial conditions, through machine learning. In the following, we will introduce how we evaluated the models, analyzed the predictions. The details of data collection, descriptor selection, and algorithms selection can be found in the Method section.

RESULTS AND DISCUSSION
Predictive performance The predictive performance by three different algorithms was estimated by R, R 2 and RMSE as shown in Table 1. The R values of Regression tree ensembles of LSBoost (simplified as LSBoost), support vector machines (SVMs), and Gaussian Regression Processes (GRPs) models trained with all the descriptors are 0.958, 0.938, and 0.957, whereas the RMSE values are 8.944, 10.897, and 9.073 (10 −9 m 2 K/W), respectively. The R and R 2 values of the LSBoost, SVMs, and GRPs models with all descriptor sets are all higher than those with only the thickness and property descriptors, while all the RMSE are further reduced. It is said that introducing the compound and process descriptors, which provides chemical characteristics and material compositions information, can enhance the predictive performance of all three models. Take LSBoost model for example, the experimental ITR values against predicted ITR values are shown in Fig. 1. The navy circles are the results predicted by the SVMs model with all descriptors while the orange squares are the results predicted with thickness and property descriptors. The orange circles are more scattered and far away from the black diagonal line. In other words, by including all the descriptor sets, the mismatch between the predicted and experimental values becomes smaller. To further investigate the effect of descriptors on the ITR prediction, three material systems with the largest improvement were taken as examples in larger scale. Figure 2 shows the ITR experimental values against predicted values of Pb/diamond (circle), Au/GaN (square), and Au/Ti/GaN (triangle), respectively. The predictive performance of all of these material systems are improved by the models with all descriptors, even for the material system with interlayer such as Au/ Ti/GaN. For the Pb/diamond, the predicted ITR value get closer to the experimental value by 16.5 (10 −9 m 2 K/W).  Generally, the database is not sufficiently large (e.g., lack of experimental results for some material systems; more data for binary systems than for ternary or quaternary systems), therefore, the known physical and chemical knowledge should be included as descriptors to assist the machine in learning from the data.

ITR prediction
After achieving high predictive performance, we input 80,282 material systems (metal/metal is excluded) to predict by the three models of LSBoost, GPRs, and SVMs. The 80,282 material systems are composed of 293 materials that the similarities between materials can be found in Fig. 3. These 293 materials are elements or binary compounds. The materials included in the training dataset are defined as red stars and the other materials are defined as navy spheres in Fig. 3. The similarities are evaluated by Euclidean distance of all descriptors (except for thickness, interlayer) and then projected in the two-dimensional plot through metric Multidimensional scaling (MDS), 8 which uses a similarity matrix to plot the graph between a series of n objects to the coordinates of the same objects in an m-dimensional space. The m is usually fixed at two or three so as to be visualized easily. That is to say, the metric MDS seeks a low-dimensional representation of the data in which the distances between output two points are as close as possible to the similarity in the original high-dimensional space. The distance between two points is shorter, the similarities of their descriptors (properties) are higher. For examples, the metals which are located in the left bottom have different EN and IP properties compared with other inorganic compounds, Pr 6 O 11 and Fe 7 Se 8 which are far away from the other materials have large unit cell volumes, and LiH which is on the top alone has high heat capacity and low density than the averages.
From the ranking of ITR predictions, we compared the top-100 high-ITR material systems predicted by models of LSBoost, GPR, and SVM. Among the top-100 high-ITR material systems, there are 25 material systems repeatedly predicted by at least two models as shown in Fig. 4a. The predicted top-100 high-ITR material systems are illustrated by circles in Fig. 4a, and the intersections show the repeatedly-predicted material systems. There are 13 and 15 material systems predicted by SVMs and LSBoost, and SVMs and GPRs, respectively as shown in Fig. 4a. Besides, there are three material systems, Bi/graphite, Bi/diamond, and Bi/B, predicted by all three models corresponding to the orange points in Fig. 4b. The similarities of the 25 material systems are plotted by twodimensional MDS in Fig. 4b. These 25 material systems in the MDS plot are mainly separated into two groups: Bi/oxides and AsI 3 / Tellurides or Iodides (such as Sb 2 Te 3 , and CdI 2 ).
The linear correlation by Pearson heatmap between the descriptors and ITR by the three models can be found in Fig. 5. If the color is lighter yellow or darker navy, it shows stronger positive or negative linear correlation with ITR, respectively. From the prediction of 80,282 various material systems, there is no obvious correlation except melting point and mass in Fig. 5a-c. On the other hand, from the prediction of the top-100 high-ITR material systems, the stronger linear correlation of compound descriptors, such as atomic coordinates, binding energy and ionic potential, can be found in Fig. 5d-f. Besides, some descriptors of Fig. 3 The two-dimensional MDS plot of 293 materials by their properties including heat capacity, density, melting point, unit cell volume, composition ratio, atomic coordinate (AC), electronegativity (EN), ionization potential (IP), mass, and binding energy. The red stars are the materials included in the training dataset, and other materials are navy spheres. The similarities of properties are higher, the distance between two points is closer Fig. 4 a The circle regions represent the top-100 high-ITR material systems predicted by the three models of LSBoost, GPR, and SVM. The blue and orange intersection regions express the 25 material systems repeatedly predicted at least by two models. b The two-dimensional MDS plot of the 25 material systems according to the blue and orange regions in a. The three orange points are predicted by all three models. The similarities of properties are higher, the distance between two points is closer the two materials in Fig. 5d-f, (e.g., f_melt and s_melt, f_density and s_density, f_AC1y and s_AC1y), show opposite linear trends. It is said that if the differences of several descriptors between the two materials are larger, the ITR might be increased. However, the effect of descriptors on the target (ITR) cannot be considered individually, but a combination effect based on various algorithms. Although all three models have similar improved trend on predictive performance after introducing all descriptors, the improvement of various material systems might differ among three models. The predicted ITR of the 25 material systems by the three models can be found in Fig. 6. Some material systems, such as Bi/BeO, Bi/Al 2 O 3 , and Bi/Si, have very close predicted values, while some material systems, such as AsI 3 / PtTe 2 and AsI 3 /Bi 2 Te 3 , have larger differences. It is normal that the predicted values differ among various algorithms even if the predictive performance are all higher than 93 %. It is also implied that the correlation between the descriptors and the target (ITR) might have various regulations and be fitted by several algorithms. In order to prevent the partial and inadequate analysis, the intersection region among the three models with high accuracy would be a good way to seek the candidates from the prediction.
We measured the ITR of Bi/Si interface by frequency-domain thermoreflectance 9 as 51.8 ± 4.5 (10 −9 m 2 K/W), which is in the range of the predicted ITR of 50.68-61.13 (10 −9 m 2 K/W) as shown in Fig. 6. The Bi/Si material systems, the red points in Fig. 4b and Fig. 6, achieved the ultralow thermal conductivity of 0.16 Wm −1 K −1 . 10 This thermal insulating Bi/Si nanocomposite thin film, which was proposed for the first time, was composed of crystallized-Bi in an amorphous-Si matrix by a laboratory-built combinatorial sputtering system. The low thermal conductivity can be attributed to the high ITR, high ratio of interfacial surface area to volume and high atomic ratio of Si/Bi. The details of interfacial design and nanostructure analysis can been found in our previous paper. 10 It is proved that the high-ITR predictions are potential candidates for thermal insulating applications. Through the ITR prediction by machine learning, the material systems exploration for thermal management can be accelerated.
A precise ITR prediction through machine learning with high correlation coefficient R of 0.96 was achieved by further considering the interfacial conditions based on chemical, physical, and material properties. The descriptors are categorized into three descriptor sets: property descriptors, compound descriptors, and process descriptors, respectively, as shown in Fig. 8. From the top-100 high-ITR prediction among 80,282 kinds of material systems, 25 material systems were repeatedly predicted by at least two of the models of LSBoost, GPRs, and SVMs. There are two main groups of the 25 material systems as shown in MDS plot: Bi/oxides and AsI 3 / Tellurides or Iodides. One of the 25 material systems, Bi/ Si, accomplished the ultralow thermal conductivity of 0.16 Wm −1 K −1 . The high-ITR prediction is proved to be the potential candidates for thermal insulating or thermoelectric applications. The ITR predictive model can also be extended for more specific thermal needs by limiting the material searching space, such as high melting point for high temperature

Data of ITR
The ITR database in this study contained 1317 data entries describing 456 interface samples composed of 54 materials, including metals, semiconductors and insulators. The materials are elements or binary compounds. The 456 interfaces are defined by the film, interlayer, substrate materials, measurement temperatures and experimental conditions. The 1317 data entries contained in the database are experimental data collected from 85 published papers as shown in Fig. 7. 9,11-93 Generally, the ITR decreases with the temperature increasing as shown in Fig. 7. The main contribution for heat transfer of metals is mobile electrons, however the heat transfer of non-metals is lattice vibrations. Therefore, the metal/metal interfaces are not considered together in the training dataset. Besides, the interfaces including 2D materials (e.g., graphene, MoS 2 ), compounds with uncertain compositions (e.g., TiO x ), polymers, or under specific treatment such as plasma bombardment are not considered this time.

Descriptor selection
Lots of factors, such as interface conditions, thermal properties of materials adjacent to the interface and temperature, can affect the ITR; these properties determine the transfer mode and efficiency of the carriers. In general, constructing the machine learning model in material science suffers from the lack of data or the inconsistency from the references, especially some material properties such as thermal conductivity, sound speed, and Debye temperature. If the data has missing properties, then it cannot be used for training or further predicting and the size of the dataset will shrink a lot. From our previous paper, the thermal related properties of heat capacity, density, melting point, and unit cell volume, as well as the material property of thickness were selected as good descriptors with high data-consistency among the references and high data-availability, and achieved good predictive performance. 4 However, when two materials meet at an interface, the new binding will form aside materials and tend to reduce the total energy of the system. The mode of phonon transfer changes based on the interfacial conditions, and the PDOSs of the two materials forming the interface shift to reduce mismatch between the materials, thereby improving the thermal conductance. 94,95 Thus, the interfacial conditions which become important especially at or above the room temperature should be further considered in the model. In order to enhance the machine predictive power based on the known knowledge, we further considered the interfacial properties and categorized these properties into three descriptor sets (property descriptors, compound descriptors, and process descriptors), as shown in Fig. 7. Property descriptors include the physical thermal properties, which is based on the previous results, 4 of the materials on both sides of the interface, whereas compound descriptors include the properties that can describe the chemical characteristics of the interfaces (e.g., binding energy, electronegativity, and ion potential). Process descriptors include the properties that can be tuned during the experimental procedure, such as film thickness and interlayer (e.g., Al/SiO 2 /Si: with interlayer and Al/Si: without interlayer). These descriptors were collected from various sources, including Atom Work Adv in NIMS, 96 TPRC data series, 97 and published papers. 9, The compound descriptors were atomic ratio (R), atomic coordinate (AC), electronegativity (EN), ionization potential (IP), mass, and binding energy (Eb) as shown in Fig. 8. The atomic ratio of the compounds for the first and second elements was defined as R 1 and R 2 , respectively; take Al 2 O 3 for example, R 1 and R 2 are 2 and 3. AC represents atomic coordinates defined from the periodic table: the group as the x coordinate and the period as the y coordinate as (AC i x, AC i y), where i represents the order of the elements of the compound. For a binary compound, the coordinates of each element of the compound are introduced. As an example, the coordinates of (AC 1 x, AC 1 y) and (AC 2 x, AC 2 y) for AlN are (13,3) and (15,2), respectively. These coordinates also provide the information of atomic radius and electronic configuration. In the periodic table, the atomic radius increases from right to left of group and from up to bottom of period. Groups 2 and 15 of the periodic table are more stable as a result of the completely filled and half-filled electronic configurations, respectively. Element in groups 2 or 15 are thought to need more energy to remove electrons from the outside orbitals. EN reflects the ability of an atom to attract an electron and can be used to predict the formation of ionic or covalent bonds. EN ranges from 0 to 4, with a higher EN indicating greater attraction to electron. IP is the energy required for an isolated atom to discharge an electron and become a cation. The degree of mismatch in mass and bond energy also affects the ITR of two dissimilar materials. Choi et al. reported that differences in mass and bond energy result in mismatches between phonon dispersion and limit high frequency phonon transport at the interface. 98 The process descriptors include film thickness and interlayer. Film thickness is proved to be an important descriptor and affect the predictive performance a lot in our previous paper. 4 The change in phonon transmission probability at the interface corresponds to the change in thickness. 60,94 The interlayer reflects whether an interlayer is present between the materials at the interface; it is assigned a value of either 1 (if an interlayer is present) or 0 (if no interlayer is present). In many cases, the adhesion layer, a naturally or thermally formed oxidation layer, and surface plasma treatment will form interlayers or a mixed region between the materials instead of a clear interface. Hopkins et al. reported that increased atomic mixing and disorder at the interface have detrimental effects on ITR. 52 Deng et al. proposed that in the weak interfacial coupling, the detailed interfacial nanostructure and thickness of the heterojunction significantly affect the ITR; whereas the structural effects are not obvious in cases of strong interfacial coupling. 99 In other words, the interfacial serves as a coupling layer to tune the phonon transport mode and DOS, particularly when the interfacial film has mediating vibrational properties. 94 Moreover, the interlayers or materials with mediating properties along the heat flux may possibly enhance the channels for thermal conductance. 95 After selecting the descriptors, the three descriptor sets were used to train and generate the predictive models.

Machine learning
The entire dataset was randomly separated into ten folds: one testing fold and nine training folds. This process was repeated ten times while keeping the testing folds orthogonal to each other so that the testing data were not duplicated. During each iteration of this process, embedded ten-fold cross validation was conducted to generate appropriate hyperparameters for optimizing the model. After the ten times iterations, the correlation coefficient R, R 2 and the root-mean-squared error (RMSE) of the ten test folds from each process were used to estimate the models.
According to the previous paper, the linear repressor which has worse predictive performance and the auto-descriptors-selected regressors (e.g., Least-absolute shrinkage and selection operator regularization (LASSO-GLR)) which neglects the necessary descriptors (e.g., measurement temperature) are not suitable for this dataset. 4 Therefore the support vector machines (SVMs), Gaussian Regression Processes (GRPs) and Regression tree ensembles of LSBoost (simplified as LSBoost) 100,101 were used in terms of the dataset size and non-linear based regressors. SVMs and GPRs are kernel-based algorithms, the front constructs an optimal hyperplane as a decision surface to separate and train the observations, and the latter is a nonparametric method which finds a distribution over possible functions f(x) consistent with the observations. The kernel functions used for SVMs and GPRs in this study are both Radial Basis Function. LSBoost performs least-squares boosting, which fits regression ensembles to minimize mean-squared error. The ensemble fits a new learner using the difference between the observed response and the predictions of all previous learners. The algorithms were run using MATLAB statistical software. 101 The acquisition function for hyperparameters optimization was expected-improvement-plus for all algorithms. The initial algorithm settings and additional algorithms details can be found in our previous paper and the Matlab Statistics and Machine Learning Toolbox. 4,101

DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.