Unexpected thermal conductivity enhancement in aperiodic superlattices discovered using active machine learning

While machine learning (ML) has shown increasing effectiveness in optimizing materials properties under known physics, its application in discovering new physics remains challenging due to its interpolative nature. In this work, we demonstrate a general-purpose adaptive ML-accelerated search process that can discover unexpected lattice thermal conductivity (κl) enhancement in aperiodic superlattices (SLs) as compared to periodic superlattices, with implications for thermal management of multilayer-based electronic devices. We use molecular dynamics simulations for high-fidelity calculations of κl, along with a convolutional neural network (CNN) which can rapidly predict κl for a large number of structures. To ensure accurate prediction for the target unknown SLs, we iteratively identify aperiodic SLs with structural features leading to locally enhanced thermal transport and include them as additional training data for the CNN. The identified structures exhibit increased coherent phonon transport owing to the presence of closely spaced interfaces.


INTRODUCTION
The demand for efficient energy systems and high-performance electronic devices has created the challenging requirement to rapidly identify new materials and design nanostructures with extreme transport properties. As the limitations of traditional intuition-driven trial-and-error search methods become more prominent, machine learning (ML) and data informatics have emerged as powerful tools for solving these design and optimization problems. In thermal transport, ML methods have found success in predicting material properties and accelerating design of nanostructures with target thermal transport [1][2][3][4][5][6][7][8][9] . However, the applications of ML to solve thermal engineering problems till date have been limited to finding solutions that show optimization of a well understood effect, such as maximization of disorder-induced Anderson phonon localization. In contrast, ML has not been used to explore and discover exceptional solutions, which exhibit unexpected or unknown phonon transport physics. This can be attributed to the 'interpolative' nature of traditional ML algorithms which allows for accurate prediction and exploration within the subspace spanned by known data points (and, therefore, known physics), but fails for excursions outside the training dataset. Consequently, suitable adaptations are needed to use ML methods in the identification of materials or nanostructures showing exceptional physical properties.
In this work, we demonstrate the potential of an adaptive machine learning approach to identify unexpected thermal transport behavior in aperiodic superlattices. Binary superlattices (SLs), composed of periodically alternating layers of two materials, have received widespread attention in the recent decades due to their lower lattice thermal conductivity (κ l ) compared to the constituent materials [10][11][12][13] , which makes them greatly attractive for applications such as thermoelectric devices [14][15][16] . Recent studies have shown that randomizing the constituent layer thicknesses in periodic SLs further reduces κ l , even below the random alloy limit 7,[17][18][19][20][21][22][23] . In the resulting aperiodic superlattices or random multilayers (RMLs), destructive interference of coherent phonons due to reflections at the randomly spaced interfaces can cause Anderson localization, thereby limiting thermal transport by these long wavelength phonon modes 18,24 . Additionally, ML methods such as Bayesian optimization 3 and genetic algorithms (GA) 7 have allowed efficient identification of RML structures with ultra-low thermal conductivities at a fraction of the computational cost associated with exhaustively searching the prohibitively large set of candidate structures. However, it has not yet been elucidated whether certain random distributions of SL layer thicknesses can actually lead to higher κ l than the periodic SLs. Interestingly, in a recent study, Wei et al. 25 used a GA-based search process to identify two-dimensional graphene nanomeshes with disordered pore configurations showing enhanced κ l than nanomeshes with uniformly spaced pores. Their results challenged the previous understanding that randomness in pore spacings leads to lower κ l in these systems 26,27 , and motivate the search to find exceptions for other well understood systems such as 1-D SLs and RMLs. Such rare examples of multilayered phononic structures showing enhanced thermal transport can also benefit applications such as thermal management of quantum cascade lasers 28,29 . However, the heuristic GA search method used in Wei et al.'s work 25 , although considered to be 'extrapolative' due to the ability to explore the design space outside the initial known dataset, is still computationally expensive for the identification of very lowprobability-of-occurence solutions as a result of the probabilistic nature of the GA evolutionary operators of selection, crossover, and mutation. Therefore, we look to find a systematic approach that can utilize the advantages of ML while enabling an extrapolative approach for the efficient identification of exceptional solutions.
Here, we identify RML structures with unexpectedly higher κ l than corresponding SLs with same total length and average period. To accelerate the search over the prohibitively large design space, a convolutional neural network (CNN)-based prediction method is used for obtaining the κ l of the candidate structures. An iterative approach is employed for generating a representative training dataset that enables the CNN to accurately predict the high κ l of the target RML structures that are absent from the initial dataset. Finally, the identified non-intuitive RML structures are used to gain insight into the heat transport mechanisms leading to higher κ l and its correlation with RML structural features.

RESULTS AND DISCUSSIONS Manual search for higher thermal conductivity RMLs
We perform our calculations on the model Si/Ge system using non-equilibrium molecular dynamics simulations to search for high κ l RML structures. This system has been extensively investigated in literature, given the wide application of these semiconductor materials as multilayer systems [30][31][32][33][34][35][36] and the simplicity of performing molecular dynamics simulations using interatomic potential parameters. The SLs and RMLs are constructed by stacking the diamond cubic unit cells (UCs) of each material along the [100] direction. Two different lengths of SL and RML structures are studied in this work: a shorter 20 UC (11 nm) system and a longer 40 UC (22 nm) system. Periodic boundary conditions are maintained in the other two directions, so that our system results in a superlattice thin film. The smallest layer thickness allowed along the cross-plane heat transport direction is set to be 1 UC, and only RMLs with equal number of Si and Ge layers are studied to ensure meaningful comparison of κ l among all structures. Additionally, the first and last UCs along the RML length are constrained to be Si and Ge respectively, to prevent extra interfaces with the heat reservoirs. With these constraints imposed, the number of possible RML structures is found to be 48620 for the 20 UC system and 35345263800 for the 40 UC system. Figure 1a shows schematic images of representative periodic SL and RML structures. Details about the molecular dynamics simulation methodology and κ l calculation can be found in the Methods section.
First, we search for 20 UC (10 nm) RMLs showing enhanced κ l from the corresponding SL structures. The thermal conductivities of the 20 UC N − N SL system are calculated, where N is the number of unit cells of Si or Ge in one period of the SL. To ensure an integral number of periods within the fixed total length of 20 UCs, N can take values of 1, 2, 5, and 10 only. The thermal conductivities obtained using NEMD simulations are shown in Fig.   1b, where a minimum of 2 W/mK is obtained at an SL period of 2.2 nm. This characteristic variation of κ l with SL period has been predicted theoretically 11,13,37,38 and recently observed experimentally [39][40][41][42] , and is commonly understood to be the result of the transition from coherent phonon to incoherent phonon dominated transport regimes. Phonons traveling along the cross-plane direction of SLs with large periods can exhibit particle-like behavior when anharmonic phonon-phonon scattering causes them to lose phase information before encountering an interface. On the other hand, multiple phase-preserved reflections at closely spaced periodic interfaces can lead to the formation of coherent phonon modes showing wave-like phonon transport characteristics. At periods greater than 2.2 nm, the interface density is small enough to ensure a low coherent phonon contribution. As a result, the reduction in incoherent phonon scattering by the SL interfaces leads to a greater thermal conductivity at higher periods. In contrast, when the SL period is below 2.2 nm, a significant portion of the thermal transport is contributed by the coherent phonon modes, which are no longer scattered by the closely spaced interfaces. In this regime, the increase of thermal conductivity at lower periods has been attributed to effects such as weaker band flattening and increased group velocities.
We then attempt the traditional intuition-guided search process to identify possible RMLs showing κ l enhancement due to aperiodicity. Due to the absence of any previous evidence supporting the existence of enhanced κ l RML structures, no guidance is available to narrow down the search to a computationally tractable subset of the design space. In this case, a random search can be considered as a viable search method. To perform the manual search, we randomly choose 300 candidate RML structures from the design space and calculate the thermal conductivities using NEMD simulations. The results of these calculations are compared with the SL thermal conductivities in Fig. 1b. We find, as expected, that all of the 300 randomly generated RMLs have significantly lower thermal conductivities than the corresponding SLs. We also calculated the histogram of thermal conductivity values for the 300 randomly generated RML structures as shown in Fig. 1c. It can be observed that the majority of RMLs have low thermal conductivities compared to the SLs. This shows the evident need for an alternative systematic and efficient way to perform the search and motivates the use of machine learning for such tasks.

ML accelerated search for higher thermal conductivity RMLs
While NEMD simulations can provide accurate values of thermal conductivity of the RML structures, they are computationally expensive when more than hundreds of simulations need to be performed for a particular system. As a result, exhaustive searches using MD simulations over design spaces as large as the current problem become impractical. In order to accelerate the calculation of thermal conductivity of RMLs and perform a rapid screening of a large number of candidate solutions, we use a neural network-based rapid thermal conductivity prediction tool that can replace the time consuming NEMD simulations. Neural networks (NNs) have emerged as a powerful tool for regression and classification problems due to their ability to fit complex multifunctional datasets without the need for encoded sets of rules which may introduce human bias. Recently, convolutional neural networks (CNNs) have been successfully used in predicting material properties including κ l from input structure data 5,6 , particularly due to their feature detection and translational invariance characteristics. As a result, we employ a CNN which can predict the thermal conductivity of an RML by detecting relevant spatial features in the input RML structure. Since the time taken by the trained CNN to predict the κ l of each RML structure is extremely small, the entire design space can be evaluated within a few minutes. As a result, an exhaustive search can be performed over the entire RML design space using the κ l values predicted by the CNN. The architecture of the CNN used in this work and the details of the training process can be found in the Methods section. A well-known characteristic of neural networks is their 'interpolative' nature, i.e., they cannot generally be expected to extrapolate to unknown points outside the region spanned by the training dataset. This is problematic for our current objective, where the CNN is required to accurately predict thermal conductivities of high κ l exceptional RMLs which are absent from the initial training dataset. To resolve this, we utilize the ability of CNNs to extract spatial features contributing to locally enhanced thermal transport. Although the training dataset is composed of RML structures with low to moderate κ l , many of these structures contain spatial features that lead to locally enhanced thermal transport, such as large bulk regions or short regions of periodic interfaces. By forming featureproperty maps from these structural features, the CNN is able to assimilate them and accurately predict the high κ l of RMLs containing combinations of these favorable features. On the other hand, randomly sampling the design space does not automatically ensure inclusion of RML structures showing enhanced local thermal transport characteristics within the dataset. This can be seen from the probability distribution of thermal conductivities of the 300 randomly generated RML structures (Fig. 1c), where the majority of RMLs have low thermal conductivities compared to the N − N SLs. In order to overcome this challenge, we adopt an iterative approach to generate our training dataset comprising RMLs with moderate to high thermal conductivities while performing the accelerated search. In the initial step, the CNN is trained on a dataset of the 300 randomly generated RMLs. The trained network is then used to predict the thermal conductivities (κ CNN ) of all structures in the search space. Next, we select 100 RML structures predicted by the CNN to have the highest thermal conductivities and perform NEMD calculations of thermal conductivity (κ NEMD ) to validate the CNN predicted values. If any of these 100 RML structures identified in the search show a higher κ NEMD than the corresponding SL, the search is stopped with successful identification of the exceptional RML structures. Otherwise, these RML structures are included in the training data with their κ NEMD values, and the CNN is retrained to fit  Fig. 2 Schematic of the iterative search algorithm used to discover unexpected thermal conductivity (κ l ) enhancement in aperiodic superlattice systems. the augmented data set. Subsequently, the thermal conductivities of all structures are again predicted (with potentially higher accuracy) and the algorithm is progressed in this manner. Figure 2 shows the complete work flow of the search algorithm followed in our work. In the initial iteration, the κ CNN values are not expected to be accurate over the entire search space, given the relatively small size of the training dataset and the absence of representative features. However, the accuracy of the prediction improves as the size of the training dataset increases with each successive iteration, and RMLs with high κ l constitute a greater fraction of the training data. We first evaluate the performance of the CNN in predicting κ l of the 20 UC RML system. Figure 3a shows the variation of the mean absolute percentage error (MAPE) and the root mean square error (RMSE) with each iteration of the search process, when evaluated on a testing set of unknown RML structures not introduced to the CNN during training. We observe that the CNN is able to predict thermal conductivities with a very low MAPE varying from 4.6 to 6.4%, or an average RMSE of 0.09 W/mK. The MAPE generally decreases with each progressing iteration of the search due to the addition of more RML structures to the training dataset which increases the representative set of features. The comparison between the predicted (κ CNN ) and 'true' values (κ NEMD ) is shown by the parity plot in Fig. 3b after training the CNN on data from 600 RML structures. It is seen that the CNN can provide accurate predictions over a wide range of thermal conductivities from 1 to 2.5 W/mK, thus demonstrating the capability of the CNN to extract suitable spatial features governing low and high κ l . The progress of the ML-enabled search for 20 UC RMLs with enhanced κ is shown in Fig. 3c, in comparison to a manually performed random search. We also compared our ML accelerated search process with a Genetic Algorithm (GA) based search, which is a heuristic method that uses probability-based genetic operators (selection, crossover, and mutation) to explore the design space efficiently and find optimal solutions to a given loss function. The implementation of the GA search process is similar to our previous work 7 , with the only difference being the objective function which is maximization of thermal conductivity for the current problem.
We only compare RML structures against the corresponding SL having the same average period. As a result, the contribution of interface scattering of incoherent phonons to the thermal transport is the same in the compared multilayer structures, and any difference in κ l is purely the result of coherent phonon transport. We find that our ML-based search process is able to identify RML structures with higher κ than the corresponding SL within two iterations of the search utilizing 200 CPU hours. In contrast, the manual random search returns far lower κ l than periodic SLs even after double the simulation hours spent. Although the GA search is able to identify structures with reasonably high thermal conductivity, none of the identified RMLs can exceed the reference SL thermal conductivity. The thermal conductivities of the RMLs scanned by the ML search process are plotted with respect to average period in Fig. 1b. By searching through RML structures with different average periods, the κ l of RML structures are found to exceed the superlattice κ l at a relatively higher average period of 5.4 nm, corresponding to the 5 − 5 SL. The identified RML κ of 2.36 W/mK is found to be higher than the SL κ of 2.28 W/mK by 3.5%, which is above the statistical uncertainty as confirmed by averaging these values over multiple independent runs. The error bounds for the 5 − 5 SL and the identified RML are shown in the figure and are calculated from the standard deviation of the independent runs for each structure. The structures of the 5 − 5 SL and the RML showing enhanced κ l are shown in Fig. 3d.
We also perform a similar ML accelerated search for a larger RML system with a total length of 40 UCs. Since the number of possible RML structures for this system is several orders of magnitude larger than the 20 UC system, we limit our search to a tractable subset of the design space by using the knowledge gained from the results of the search on the 20 UC RML system. In particular, only RMLs with the relatively larger average period of 5.4 nm, corresponding to perturbations of the 5 − 5 SL, are sampled. With this constraint, the reduced design space consists of 938961 RML structures which can be efficiently handled by our ML search framework. Similar to the previous search process, the CNN accelerated search method can successfully identify an RML structure with higher κ than the corresponding SL within the validation of 612 RMLs which constitute less than 0.1% of the design space. The identified structure, shown in Fig. 3d, has a κ l exceeding that of the SL by 5.5% which is also confirmed by averaging over multiple runs. Interestingly, the 40 UC RML structure identified by our search is found to be a composite SL which can be created by the combination of the single interface structure and the shorter period 2 − 2 SL. As a result, the structure has the features of a local periodicity which enhances thermal transport despite having a globally random layer thickness distribution.
Contribution of interfacial resistance to κ l enhancement Finally, the identified exceptional RML structures shown in Fig. 3d are studied to understand the underlying phonon transport characteristics leading to the disorder-induced enhancement of κ l . We observe the presence of small layer thicknesses due to closely spaced interfaces in these structures, which we attribute as the cause for the increased thermal transport. At an SL period of 5.4 nm, the relatively large layer thicknesses are above the coherence length of most phonons, as a result of which the contribution of coherent phonon transport to the SL κ l is quite low. However, the reduced thicknesses of some layers in the identified RMLs lead to an increased coherent phonon contribution, whereby the apparent thermal resistance of the interfaces are lowered. To verify our hypothesis, we calculated the total resistance across the RML as well as the contribution of the apparent interface resistances for three different 40 UC structures: (i) the RML with κ l higher than the 5 − 5 SL identified through our search process, (ii) the 5 − 5 SL and (iii) a RML with low κ l identified by the random search. The apparent interfacial thermal resistances at each of the interfaces in the RML structure are shown in Fig. 4a-c, superimposed on the visual representation of the RML structure.
We can see that compared to the 5 − 5 SL (Fig. 4b), the apparent interfacial resistances are visibly reduced in the high κ RML (Fig.  4a), which is the effect of a higher amount of coherent phonon transport. As a result, the RML shows a lower total interfacial thermal resistance and total thermal resistance than the SL, as seen in Fig. 4d. Finally, the localization of coherent phonon modes due to sufficient layer thickness randomization in the RML structure shown in Fig. 4c and the absence of many closely spaced interfaces leads to a higher interfacial resistance and lower κ l , which is in accordance to the previously accepted hypothesis.
Our results indicate that randomness of layer thicknesses in SLs can be engineered to have dual effects via tuning the contribution of coherent phonons, which can either decrease or enhance thermal conductivities. Generally, in short period SLs, randomness can cause localization of coherent phonons and reduce κ l . On the other hand, certain forms of aperiodicity in large period SLs can enable stronger coherent phonon transport that is not localized, thus enhancing κ l . In summary, we demonstrate an iterative machine learning approach for discovering exceptional thermal transport physics. Although it is generally accepted that randomization of layer thicknesses of a binary periodic superlattice lowers its cross-plane κ l , we aim to find structures showing the opposite trend, i.e., an enhancement of κ l due to disorder. We employ a convolutional neural network to rapidly predict the thermal conductivities of all RMLs in the design space. The training dataset is generated in an iterative method in order to help the CNN dynamically learn the spatial features leading to locally enhanced phonon transmission. Our CNN accelerated search is able to identify RML structures with higher κ l than the superlattice at an average period of 5.4 nm, which is attributed to an increase in coherent phonon contribution and decrease in apparent interfacial thermal resistance at closely spaced RML interfaces as compared to the SL. Our results demonstrate the ability of machine learning-based methods to help discover exceptions and low-probability-of-occurrence solutions in a large search space.
An important aspect we wish to highlight here is that the main computational expense in our iterative search method is the NEMD evaluation of RML thermal conductivity and training of the CNN. The exhaustive scanning of the entire design space did not add any significant computational expense over and above this, and as a result, we were able to use the exhaustive search without requiring a more sophisticated optimization algorithm. However, we note that for a larger number of design variables, the exponentially increasing size of the design space will make an exhaustive search impracticable. In such cases, the use of a more sophisticated optimization algorithm, such as a GA, in conjunction with the CNN predictor is an attractive approach that can be studied in future work.

Non-equilibrium molecular dynamics simulations
Thermal conductivity calculations for the multilayered nanostructures are performed using non-equilibrium molecular dynamics simulations with the LAMMPS package 43 . The interatomic interactions are described using the three-body Tersoff potential 44,45 , which is commonly used to study vibrational properties of the Si/Ge system. The unequal equilibrium lattice constants of Si and Ge in these potential descriptions lead to a symmetric cross-sectional strain in the system, which can cause large oscillations at the interface regions 33 . To eliminate this strain, the lattice constant of Ge is artificially set to be equal to that of Si within the interatomic Tersoff potential parameters. A 6 × 6 UC cross-section is used, which is sufficient to provide converged κ l values. The thermal conductivity of the nanostructures is calculated at a temperature of 300 K. A timestep of 0.5 fs is used to integrate the equations of motion, which is sufficient to resolve the highest frequency of phonon vibrations in either material.
A schematic of the NEMD simulation domain for the direct calculation of thermal conductivity is shown in Fig. 5. Two bulk material regions consisting of 20 UCs of Si and Ge are attached to either side of the SL or RML to act as thermal reservoirs. Initially, the entire system is relaxed for 500 ps at 300 K, under a constant pressure and temperature ensemble (NPT) with periodic boundary conditions applied to all three directions. Following this, another 250 ps of equilibration under fixed volume and energy (NVE) is performed. Subsequently, non-equilibrium conditions are applied by thermostatting the Si and Ge bulk regions on either side at 330 and 270 K respectively, using Langevin thermostats. Two UCs of atoms at each end of the system are also kept fixed to mimic fixed boundary conditions along the heat transport direction. The system is allowed to reach steady state under this imposed temperature gradient over a period of 500 ps. Following this, the temperatures at equal intervals along the cross-plane direction are obtained by from the velocities of atoms in onedimensional bins. The temperature and heat flux data are collected and averaged over a period of 4 ns. The cross-plane lattice thermal conductivity (κ l ) is then calculated as Here, q″ is the steady state heat flux and L is the length of the SL or RML along the heat transport direction. The thermal boundary resistance at each interface of the system (R i ) can also be calculated from the temperature drop across the interface (ΔT i ) as Convolutional neural network-based prediction of thermal conductivity The architecture of the CNN used in this work is shown in Fig. 6. The input layer to the CNN is an N-bit array, corresponding to the number of UCs in the RML structure (20 or 40). Each bit can take a value of 1 or 2 depending on whether the corresponding UC at that location along the superlattice length consists of Si or Ge atoms respectively. This is followed by one or more one-dimensional convolutional layers, each of which consists of several kernels or filters to extract the relevant features from the input array by striding over the length of the input. Here, we use convolutional layers consisting of 44-50 filters with filter lengths of 5-9 bits, a stride length of 1, and no-padding. A max pooling layer is used after every two convolutional layers, which causes down-sampling of the identified features and incorporates translational invariance in the feature maps. After multiple convolutional layers, we use a flatten layer to reduce the dimensionality of the features. Finally, a fully connected or dense layer consisting of 100 nodes is used to combine the identified features into a single output thermal conductivity value. Non-linearity is accounted for within the CNN by using a Rectified Linear Unit (ReLU) as the activation function throughout the network. For the 20 UC RML system, we use a CNN consisting of two convolutional layers, 1 max-pooling layer, and 1 fully connected layer. On the other hand, for the 40 UC RML system where the number of input parameters is much larger, we switch to a CNN architecture consisting of 4 convolutional layers with 1 max-pooling layer after every 2 convolutional layers, and 1 fully connected layer as before.  Fig. 5 Schematic of the NEMD simulation setup. The multilayer nanostructure (SL or RML) is sandwiched between two thermal baths. A layer of atoms is fixed at each end to impose fixed boundary conditions. The corresponding temperature profile is also shown.
The weights of the different layers are initiated randomly and need to be fit to the training data provided to the network. This is done by calculating a loss function over the entire training set and backpropagating the errors over the various layers of the network to minimize the loss. The loss function used to train our CNN is chosen to be the MAPE, given by Here, N is the number of training data points provided to the network, y i is the predicted output and y i is the target output. Apart from the loss function, the RMSE is also used a metric to evaluate the performance of the network. We note that these metrics are most commonly associated with regression problems, instead of others such as accuracy which are convenient for classification tasks. The training of the network by backpropagation of errors is performed using the Adamax algorithm 46 and the fitting is performed over 300−500 epochs within which sufficient convergence of the loss function is observed. Overfitting of the data by the CNN, which is common occurence in neural network training, is avoided using early stoppage of the fitting process if the testing loss is found to become constant or increase.

DATA AVAILABILITY
Data generated from the molecular dynamics simulations and the datasets and code used for training the machine learning algorithm are available from the corresponding author upon reasonable request.

Dense layer
Output layer RML structure Kernel . . . Fig. 6 Schematic of the convolutional neural network architecture. The SL or RML structure is encoded as a binary array and used as the input layer, while a single output node provides the predicted thermal conductivity.