Introduction

The dynamics of phonons, the quanta of lattice vibrations, play a critical role in technologies ranging from heat dissipation in modern semiconductors1 to thermal barrier coatings in turbine blades2. In general, applications involving heat transfer require either extremely low or extremely high phonon conduction, which is often the performance-limiting property. Another application of phonons is quantum communication, which can be realized with microwave-frequency phonons in devices such as acoustic resonators3 and is of significant interest for the generation of remote entanglement and the secure transmission of information. In recent years, materials scientists have sought new materials with excellent phonon properties through a combination of laboratory synthesis and computational prediction using density functional theory (DFT). Although the former requires decades of trial and error guided by the intuition of experienced chemists4, the latter has accelerated the discovery of new materials and the understanding of microscopic phonon transport owing to the availability of high-performance computers. Indeed, a tremendous number of previous studies in thermal science have been dedicated to bulk, interfacial, low-dimensional, and layered materials5,6,7,8,9,10. Many investigations of thermal transport in materials have uncovered structure-thermal property relationships, including those arising from bond strength, structure, and chemistry11,12,13,14,15,16,17. Additionally, unique phonon interactions that promote anharmonicity and lower the lattice thermal conductivity (κL) may arise in certain crystals, including rattler atoms17,18,19, ferroelectric instability20, and electron-phonon interactions21,22. Notably, some studies also advance the theoretical description of phonon transport, e.g., the addition of fourth-order anharmonicity23 and the temperature-dependent effective potential for capturing temperature effects on the interatomic force constants (IFCs)12,24,25.

Despite the robustness of DFT in predicting thermal transport properties, the explicit treatment of electronic degrees of freedom entails significant computational cost when potentially thousands of candidates must be screened for target phonon properties. Most previous studies are limited to one or a few materials, for which results may be obtained within a few days depending on the available hardware. However, with the advent of materials genome databases such as the Materials Project4 and the Open Quantum Materials Database (OQMD)26, the DFT evaluation of phase stability and properties of interest for thousands to millions of previously unexplored materials delays the synthesis of novel materials by several years. This is especially pronounced for phonon properties. For instance, κL, one of the most important phonon properties, is computationally demanding with DFT because it requires calculations on a large number of supercells with different atomic displacements, which are then processed to give the IFCs for the Boltzmann transport equation (BTE) simulation27.

In response, data-driven techniques such as machine learning (ML) have emerged in materials science to address the demanding cost of DFT, effectively trading some accuracy for a significant speed-up. The basic assumption of ML-based models for predicting DFT-level properties is the introduction of a finite cutoff, whereby atomic interactions beyond the cutoff are neglected. This allows the computational cost to scale linearly with the number of atoms, as opposed to the cubic scaling of DFT. With ML, prediction of target properties requires a physically informative set of inputs as descriptors. For a sufficiently accurate model, descriptors should satisfy several requirements, including (a) a distinguishable representation for each system, (b) the ability to describe the similarities and discrepancies between systems, (c) completeness sufficient to differentiate systems, and (d) simplicity, to limit calculation time28. Given these requirements, several methods have been proposed over the past few years to tackle computationally costly phonon properties. For example, progress has recently been made in the ML prediction of κL, either directly or indirectly. Direct prediction refers to an ML model, or several models in sequence, with κL as the final output. The majority of studies fall under this category. Several ML models have been trained on 110 half-Heusler compounds by compiling elemental, compound, and compound-elemental descriptors, obtaining high validation accuracy and revealing the bond distance as the most important descriptor for κL29. Diamond-like materials were explored for ultra-high κL using a small training dataset through transfer learning of the three-phonon scattering channel volume, or P3 for short30. On the other hand, the indirect prediction approach first predicts lower-level properties that eventually lead to κL, including the atomic forces and the IFCs. 
To date, little research has been reported in this subarea. Notably, providing lower-level physics offers several advantages over the direct method. First, with the atomic forces and/or IFCs, one can compute the full set of phonon properties, e.g., phonon dispersions, temperature-dependent κL, and scattering rates. This allows in-depth study of phonon properties at the high-throughput level without relying strictly on DFT. The temperature-dependent κL is especially desirable for high-temperature applications, whereas most ML studies focus only on room temperature. Second, because κL is not predicted directly, variables involved in the BTE calculation may be adjusted, such as the inclusion of anharmonicity beyond third order and of the off-diagonal contribution. Third, atomic forces and IFCs are much more abundant than the single κL value associated with each structure, allowing for improved training and potential transferability of information among diverse structures.

Here, we demonstrate a bottom-up approach by applying the Elemental Spatial Density Neural Network Force Field (Elemental-SDNNFF) to high-quality phonon property prediction for large materials databases. Previously, we applied the model to a smaller set of 11,866 half, quaternary, and full Heusler structures spanning 55 elements31. In this work, we expand the model to a more complex set of 77,091 cubic structures containing 16 structural prototypes and 63 chemical species. This is made possible by the model's flexibility in distinguishing many unique atomic environments for the high-throughput calculation of full phonon properties. Because the forces are provided, the dynamical stability of these structures can be determined from their predicted phonon dispersions, which is not possible with direct methods. The model is initially trained on a small subset of 3107 structures and is iteratively improved on the larger dataset of 77,091 structures with active learning. Data augmentation is incorporated, whereby equivalent atomic environments are rotated to provide a ~3× boost to the total number of atomic forces for training. The final model is then deployed to predict the complete phonon properties of the remaining stable structures, at a speed three orders of magnitude faster than full DFT calculations for systems containing more than 10² atoms, as shown in Supplementary Fig. S1. We then focus on high-quality phonon property prediction, addressing several major challenges for high-throughput material property prediction. Specifically, by predicting the atomic forces, a single neural network model gives simultaneous access to harmonic properties, such as phonon dispersions, specific heat, and the third-order scattering channel volume, and to anharmonic properties, such as the lattice thermal conductivity, while providing deep physical insight and agreement with DFT. 
In addition, previous models for Heusler structures could be trained efficiently because only three structural prototypes were provided to the model. Here, although the structural diversity is significantly greater, the model still predicts meaningful atomic forces and corresponding phonon properties. Overall, our model is capable of growing and adapting to new datasets for the exploration of previously unseen materials.
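The rotational data augmentation mentioned above relies on the fact that a force is a vector that rotates together with its atomic environment. The following is a minimal Python sketch of that idea, assuming a single 90° rotation about the z axis (one of the proper rotations of a cubic lattice). The helper names `rotate` and `augment` and the toy values are hypothetical, and the augmentation actually used with Elemental-SDNNFF may differ.

```python
# Illustrative sketch only: `rotate` and `augment` are hypothetical helpers,
# not part of the Elemental-SDNNFF code base.

# Proper rotation by 90 degrees about the z axis (a symmetry of a cubic lattice).
ROT_Z_90 = [
    [0.0, -1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
]


def rotate(matrix, vector):
    """Apply a 3x3 rotation matrix to a 3-component vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(3)) for i in range(3)]


def augment(neighbors, force, rotation):
    """Rotate an atomic environment and its force label together.

    neighbors: relative neighbor positions around the central atom (Angstrom)
    force:     force vector on the central atom (eV/Angstrom)
    """
    rotated_neighbors = [rotate(rotation, r) for r in neighbors]
    rotated_force = rotate(rotation, force)
    return rotated_neighbors, rotated_force


# Toy environment (values are made up for illustration).
neighbors = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
force = [0.3, -0.1, 0.0]
new_neighbors, new_force = augment(neighbors, force, ROT_Z_90)
```

Each rotated copy is a new, physically valid training sample obtained without any additional DFT calculation, which is why augmentation increases the number of force labels at negligible cost.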

Results

Training dataset from active learning rounds

To develop an ML model for evaluating atomic forces, it is crucial to prepare a sufficiently large and diverse dataset of atomic environments. In principle, there are limitless combinations of atomic environments, considering the structural symmetries, chemistries, and displacements that one may encounter during the evaluation of atomic forces. This is especially true for random or stochastic displacement methods for generating IFCs, such as compressive sensing lattice dynamics (CSLD)32. As ML methods, specifically supervised learning, are interpolative by nature, selecting training or reference data that cover seemingly infinite possible atomic environments requires a human-free, self-informative approach for efficient model training from costly DFT calculations. The crystalline structures analyzed here are taken from the OQMD database26,33 and are categorized into 16 prototypes spanning four cubic space groups (Table 1). Some structures are split from the pool for initial training and evaluated with DFT for their atomic forces. From DFT, the IFCs and subsequent phonon properties are also gathered for comparison. The remaining structures without DFT data are marked as “unexplored” and left for the model evaluation stage.

Table 1 Distribution of all cubic structures used for training and screening.

Figure 1 shows the overall workflow for generating the dataset in Table 1 and training the Elemental-SDNNFF model for phonon property prediction. First, structures from materials databases are gathered and filtered for low formation energy and energy above hull to increase the probability of thermodynamic stability. Then, supercells are perturbed by small atomic displacements and the dataset is split into training and active learning structures. Thereafter, the training set is evaluated by DFT and the corresponding forces are used to train a set, or committee, of models. Here, we exploit the poor extrapolation capability of neural networks by evaluating untrained structures and comparing the forces predicted by the committee members. Unseen structures with high force variance across the committee indicate poor representation of the local atomic environments in the supercell and are proposed for the next round of training. These structures are evaluated by DFT and used to retrain the models, forming a closed loop. After several rounds, the models are deployed for force evaluation and phonon calculations on large materials databases. For more details about structure generation and the active learning procedure, refer to the “Methods” section.
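The committee-based selection step of this loop can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the disagreement measure (maximum per-component variance across committee members), the function names `committee_force_variance` and `select_for_dft`, and the toy force values are all hypothetical; the actual selection criterion is described in the “Methods” section.

```python
# Illustrative sketch only: the disagreement measure and function names are assumptions.
from statistics import pvariance


def committee_force_variance(predictions):
    """Largest variance across committee members over all force components.

    predictions: list over models; each entry is a flat list of predicted
    force components (eV/Angstrom) for the same displaced supercell.
    """
    n_components = len(predictions[0])
    return max(
        pvariance([model[i] for model in predictions])
        for i in range(n_components)
    )


def select_for_dft(candidates, n_select):
    """Return the names of the n_select structures with the largest disagreement."""
    ranked = sorted(
        candidates,
        key=lambda name: committee_force_variance(candidates[name]),
        reverse=True,
    )
    return ranked[:n_select]


# Toy committee of three models evaluated on two unseen structures.
candidates = {
    "structure_A": [[0.10, -0.20], [0.11, -0.19], [0.09, -0.21]],  # models agree
    "structure_B": [[0.10, -0.20], [0.40, -0.60], [-0.20, 0.10]],  # models disagree
}
picked = select_for_dft(candidates, n_select=1)
```

Structures returned by `select_for_dft` would then be sent to DFT and added to the training set, closing the loop described above.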

Fig. 1: Workflow for training Elemental-SDNNFF models.
Arrow and box colors represent different regimes of the workflow with blue, orange, red, and green representing the structure generation phase, initial model training phase, iterative model training or active learning phase, and the application or deployment phase, respectively. In the application phase, the final ML model is applied to evaluation of atomic forces, based on which the interatomic force constants are fitted and phonon properties are subsequently predicted.

Prediction of phonon properties

To benchmark the performance of the trained model for phonon property prediction, we first examine the atomic force accuracy on a small subset of 400 untrained structures. We also compare the performance with CHGNet, which was recently proposed as a universal potential energy surface model34. As shown in Supplementary Fig. S3, we found force root mean square errors (RMSEs) of 29.3 and 121 meV/Å for the Elemental-SDNNFF and CHGNet models, respectively, showcasing the competitive performance of our model and its consistency, as the error remains close to the training RMSE of 18.6 meV/Å. The errors relative to DFT for the phonon dispersion and the corresponding κL at 300 K are shown in Fig. 2. In Fig. 2a, the RMSE of the frequency is divided by the frequency range of the corresponding dispersion, which normalizes the data so that all structures can be merged into a single histogram, and is shown as a percentage. The average error is 1.88%, which is excellent, as illustrated by the insets of sample dispersions in Fig. 2a. In Fig. 2b, comparing the log values of the DFT and predicted κL yields an R² of 0.89 with a mean absolute error (MAE) of 0.254 log(W m−1 K−1), meaning that the predicted κL is on average within a factor of 1.795 of the DFT value (10^0.254 ≈ 1.795), as shown by the structures within the dashed lines, which mark a factor of two from perfect agreement. The prediction capability is competitive with the MAE of 0.12 and R² of 0.87 reported for a random forest model trained on ~10³ materials35. Additionally, 103 untrained structures with κL from DFT are evaluated by our model for validation and are compared in the inset of Fig. 2b. Out of these structures, 67 were predicted to have κL < 1 W m−1 K−1 and 36 remain within the same range according to the DFT values. Notably, at the lower end of the κL range, the model increasingly underpredicts κL as the predicted value approaches the ultralow range of 0.1 W m−1 K−1. 
This is due to the highly sensitive nature of phonon transport toward the extrema of κL, specifically its sensitivity to the quality of the atomic forces in displaced supercells and, in turn, of the third-order IFCs36. Nonetheless, materials with predicted κL below 0.1 W m−1 K−1 are likely to remain below 1 W m−1 K−1, and our model is effective at filtering candidates with ultralow κL.
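The two error measures used in this section can be written out explicitly. The Python sketch below is illustrative only: the function names are hypothetical, normalizing by the DFT frequency range is an assumption about which range is meant, and the toy numbers are not taken from the dataset. It also shows how a log-scale MAE of 0.254 translates into the multiplicative factor of about 1.795 quoted above.

```python
import math


def normalized_rmse_percent(freq_dft, freq_pred):
    """RMSE of phonon frequencies divided by the DFT frequency range, in percent.

    Assumption: the normalizing range is taken from the DFT dispersion.
    """
    n = len(freq_dft)
    rmse = math.sqrt(sum((p - d) ** 2 for d, p in zip(freq_dft, freq_pred)) / n)
    return 100.0 * rmse / (max(freq_dft) - min(freq_dft))


def log10_mae(kappa_dft, kappa_pred):
    """Mean absolute error between log10 of DFT and predicted conductivities."""
    errors = [abs(math.log10(p) - math.log10(d)) for d, p in zip(kappa_dft, kappa_pred)]
    return sum(errors) / len(errors)


def typical_factor(mae_log10):
    """Multiplicative factor that corresponds to an error on the log10 scale."""
    return 10.0 ** mae_log10


# The MAE quoted in the text (0.254 in log10 units) as a multiplicative factor.
factor = typical_factor(0.254)

# Toy values, not taken from the paper (frequencies in THz, kappa in W m^-1 K^-1).
toy_rmse = normalized_rmse_percent([0.0, 5.0, 10.0], [0.0, 5.3, 9.6])
toy_mae = log10_mae([1.0, 10.0], [2.0, 5.0])
```

Working on the log scale treats a prediction that is twice too high and one that is twice too low as equally wrong, which suits a property spanning several orders of magnitude.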

Fig. 2: Summary of phonon dispersion and lattice thermal conductivity predictions.
a Comparison of the RMSE of phonon frequency normalized by the structure’s specific frequency range. (Insets) Phonon dispersions linked to the relative error containing DFT (black lines) and prediction (red lines) for visualization. b κL at 300 K between DFT and the developed single neural network model for 3107 stable structures predicted by DFT. (Inset) The comparison between the predicted and DFT κL of 64 untrained structures on the same scale.

The advantage of our bottom-up ML approach for phonons stems from the wealth of information that standard phonon calculation packages provide when given a set of predicted atomic forces. By default, the phonon frequencies and scattering matrix elements required for the iterative BTE solution are computed in advance. From the phonon frequencies, quantities such as the speed of sound, constant-volume heat capacity, and three-phonon scattering phase space may be readily computed. Here, we compare these properties from our neural network model with those from DFT calculations to better understand the underlying contributions to the predicted κL. The constant-volume heat capacity (cv) is directly involved in computing κL, along with the phonon group velocities and scattering rates. The speed of sound (Vs) is a partial representative of the group velocities for long-wavelength acoustic modes in crystals20. The three-phonon scattering phase space (P3) is a quantitative measure of the number of three-phonon scattering channels. Unlike cv and Vs, a higher P3 indicates larger scattering rates and thus lower κL30. However, like cv and Vs, P3 requires only the second-order IFCs and is therefore computationally inexpensive once the model is trained. Finally, the mean square displacement (MSD) of vibrating atoms, usually observed in finite-temperature molecular dynamics, may also be computed from the phonon frequencies and eigenmodes9. To describe each structure with a single value, only the maximum MSD among all atoms in the primitive cell is used.
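As an example of how a harmonic property follows directly from the phonon frequencies, the sketch below evaluates the standard harmonic-oscillator heat capacity per mode, c = kB x² eˣ/(eˣ − 1)² with x = ħω/(kB T), and sums it over a list of modes. It is a minimal illustration: the function names are hypothetical, frequencies are assumed to be angular frequencies in rad/s, and the Brillouin-zone weighting and volume normalization used by phonon packages are omitted.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant (J s)
KB = 1.380649e-23       # Boltzmann constant (J/K)


def mode_heat_capacity(omega, temperature):
    """Harmonic heat capacity of a single phonon mode in J/K.

    omega: angular frequency in rad/s (assumed positive)
    """
    x = HBAR * omega / (KB * temperature)
    return KB * x * x * math.exp(x) / math.expm1(x) ** 2


def heat_capacity(omegas, temperature):
    """Sum of mode heat capacities, skipping non-positive frequencies."""
    return sum(mode_heat_capacity(w, temperature) for w in omegas if w > 0.0)


# Toy modes: a soft 1 GHz mode and a stiff 50 THz mode at 300 K.
soft = mode_heat_capacity(2.0 * math.pi * 1.0e9, 300.0)
stiff = mode_heat_capacity(2.0 * math.pi * 50.0e12, 300.0)
total = heat_capacity([2.0 * math.pi * 1.0e9, 2.0 * math.pi * 50.0e12], 300.0)
```

A soft mode contributes nearly the classical value kB, whereas a stiff mode is largely frozen out at 300 K; because every mode enters through a smooth function of its frequency, small frequency errors have limited impact on the summed cv.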

In Fig. 3, the comparison of cv, Vs, P3, and maximum MSD between the neural network model and DFT is shown with the corresponding R² and MAE values. Exceptional agreement is found for cv, followed by Vs, P3, and log(max MSD). Because the error in the phonon frequencies is small and cv, a mode-summed global property, takes the mode-dependent phonon frequency as a direct input and is summed over a dense sampling of the Brillouin zone, it expectedly shows the best agreement with DFT37. The maximum MSD is constructed similarly and thus also agrees well with DFT. Although Vs is also computed directly from the dispersions, it requires the gradient of the phonon frequency with respect to the wave vector and is consequently more sensitive to the predicted atomic forces than cv. Finally, P3 also uses the phonon frequencies directly but involves counting three-phonon collisions that satisfy energy and momentum conservation. In other words, the error propagated from the atomic forces into the phonon frequencies is compounded, resulting in a larger error than for the other three harmonic phonon properties. Interestingly, while the errors in Fig. 3a–c are for the most part evenly spread, the scatter plot for the maximum MSD in Fig. 3d shows increasing disagreement at higher maximum MSD. This is because a higher maximum MSD, corresponding to softer phonon modes, usually indicates lower κL, and the second-order IFCs are then more sensitive to the atomic displacements and corresponding forces. In such cases, we anticipate that increasing the atomic displacements can better capture the anharmonicity and hence the potential energy surface.

Fig. 3: Prediction of harmonic properties compared with density functional theory.
Comparison of the (a) constant-volume heat capacity (cv) at 300 K, (b) speed of sound (Vs), (c) third-order scattering channel volume (P3), and (d) mean square displacement (MSD) at 300 K between DFT and the neural network model for 3107 DFT-predicted stable structures. The MAE of the MSD on the linear scale is 1.163 × 10⁻² Å². The black solid line denotes perfect agreement and is a guide for the eye.

Quantification of ultralow κL with predicted properties

Out of the 77,091 cubic materials set aside for evaluation, 27,059 structures are predicted by our trained Elemental-SDNNFF model to have no imaginary phonon frequencies and are thus potentially dynamically stable. These structures are then evaluated for their full phonon properties, including cv, Vs, P3, maximum MSD, and κL. In the previous section, we showed how the trained model reproduces DFT-level IFCs while achieving a speed-up on the order of ~10³. This was made possible by the initial training set and the subsequent active learning iterations, which generated millions of data points to fine-tune the model for handling many structures and chemistries. Even with DFT replaced by a machine-learned model, computing the anharmonic IFCs and then iteratively solving the BTE is still time-consuming for several tens of thousands of hypothetical structures. Therefore, given the large amount of phonon property data evaluated by our model, it is desirable to quantify trends of κL with structural and harmonic properties in order to search among materials with known stability. Previous studies suggest that κL is strongly correlated with several physical parameters, including the volume of the unit cell Vcell35, specific heat capacity cv38, sound velocity Vs15, three-phonon scattering phase space volume P330, and thermal MSD39. Accordingly, we experimented with linear combinations of cv, Vs, P3, max MSD, and Vcell to correlate with κL. We found that the max MSD by itself performs best as a descriptor for the κL of crystals. This is most likely because the thermal MSD reflects the major contribution of harmonic phonons to κL more strongly than the other parameters listed above. Additionally, the MSD may be computed as a function of temperature and is therefore more useful for observing temperature-dependent trends. A generally inverse-linear relationship is observed between log κL and log(max MSD) (Fig. 4). 
Note that log(max MSD) is normalized here to the range [0, 1] based on the values found in the DFT set for ease of comparison. Figure 4a provides evidence of linearity through comparison of κL and the max MSD. The fitted red line shows a decreasing trend of κL with increasing max MSD. Structures with extremely high maximum MSD are indicative of rattling atoms, for which strong phonon-phonon scattering and ultralow κL are prevalent39. Given the agreement of the max MSD between DFT and predictions in Fig. 3d, Fig. 4b shows the predicted κL and maximum MSD for 25,901 unexplored structures out of the pool of 27,059 stable structures; the remainder failed in the BTE calculations and their κL is not plotted. Again, κL decreases with increasing values of the descriptor. We do notice that the newly fitted blue line has a steeper slope than the red line fitted to the DFT data (also shown in the same plot for comparison), and the gap between the two lines widens with increasing maximum MSD. The most probable cause is the underprediction of κL at the lower extreme (Fig. 2b) and the increased error in the MSD at the higher extreme (Fig. 3d). Although the ultralow κL may be underestimated on the log scale, these predictions remain highly beneficial for quickly marking structures with potential as thermal insulators. To quickly filter ultralow κL materials, a maximum MSD threshold is set such that the value of the fitted line is 1.795 W m−1 K−1. This value is chosen deliberately, knowing that predicted values of κL are within a factor of 1.795 of the DFT value, which aids the later filtering of structures with κL below 1 W m−1 K−1. As such, the maximum MSD filter is set to 0.076 Å², or 0.464 on the normalized log(max MSD) scale. Specifically, we found 9306 structures with normalized log(max MSD) higher than 0.464. Out of these structures, 8873 (95.4%) have κL less than 1 W m−1 K−1. 
For normalized log(max MSD) less than 0.464, 4590 out of 16,596 structures have κL less than 1 W m−1 K−1. This means that the filter captures 66% (8873 out of 13,461) of the structures with κL less than 1 W m−1 K−1. Thus, the maximum MSD is a reliable descriptor for identifying highly unusual structures with out-of-trend atomic displacements and correspondingly ultralow κL. The critical maximum MSD value of 0.076 Å² may be used in future work to identify potential candidates for thermal insulators among cubic structures.
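The two percentages quoted for the MSD filter correspond to what is commonly called precision (the fraction of structures above the threshold that have κL below 1 W m−1 K−1) and recall (the fraction of all such structures that the filter captures). A minimal Python sketch of this bookkeeping is given below. The function name `filter_stats` and the toy data are hypothetical, and only the thresholds 0.464 and 1 W m−1 K−1 come from the text.

```python
def filter_stats(kappa, msd_norm, msd_cut=0.464, kappa_cut=1.0):
    """Precision and recall of a max-MSD filter for low thermal conductivity.

    kappa:    lattice thermal conductivities (W m^-1 K^-1)
    msd_norm: normalized log(max MSD) values, one per structure
    """
    selected = [k for k, m in zip(kappa, msd_norm) if m > msd_cut]
    hits = sum(1 for k in selected if k < kappa_cut)
    total_low = sum(1 for k in kappa if k < kappa_cut)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / total_low if total_low else 0.0
    return precision, recall


# Toy data for five structures (made up for illustration).
kappa = [0.2, 0.5, 3.0, 0.8, 12.0]
msd_norm = [0.7, 0.6, 0.5, 0.3, 0.1]
precision, recall = filter_stats(kappa, msd_norm)
```

A high precision with a moderate recall, as reported above, means the filter rarely flags a poor candidate but leaves some ultralow-κL structures below the threshold.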

Fig. 4: Results of the thermal mean squared displacements against the lattice thermal conductivity.
Plots of (a) the DFT κL against the DFT-computed normalized log(max MSD) for the 3107 training structures and (b) the predicted κL against the predicted normalized log(max MSD) for the pool of 25,901 unseen structures. The solid red line is fitted to the DFT data, whereas the solid blue line is fitted to the predicted data. The dashed vertical line marks the normalized log(max MSD) value of 0.464, corresponding to an MSD of 0.076 Å² and a κL of 1.795 W m−1 K−1 on the fitted DFT line. All values of κL and maximum MSD are at 300 K.

The rattling effect has been shown to induce large MSD in many systems. To identify elements with a high probability of acting as rattler atoms, we plot the average MSD of each element across the pool of 25,901 structures in Supplementary Fig. S4. Hydrogen and all alkali metal elements, including Li, Na, K, Rb, and Cs, have the highest average MSD among all 63 elements covered here. Some alkaline earth metal elements, including Mg, Ca, Sr, and Ba, have medium MSD. Interestingly, halogen elements including Cl, Br, and I also possess high MSD. N stands out among the nonmetallic elements, with the next largest being C, P, and Si. Tl also stands out as a semimetallic element, next to Pb and Sn, which have significantly lower MSD. Metallic elements including Hg, Cd, Ag, and Au are the highest in their category, although their MSD is much lower than that of the alkali metals.

To visualize the spread of predicted κL in the dataset, Supplementary Fig. S5 displays the t-distributed stochastic neighbor embedding (t-SNE) of the Elemental-SDNNFF structure input vectors. For simplicity, only one point per structure is used, corresponding to a single Elemental-SDNNFF vector centered at the unit cell rather than one vector per atom. In Supplementary Fig. S5a, the distribution of space groups shows an overlap of structures with space group numbers 216 and 225, whereas the majority of structures with space group numbers 221 and 227 form visible clusters. This is sensible given that space group 216 differs from space group 225 by just a vacant lattice site (ABC vs. ABC2) or a different element on the same lattice site (ABCD vs. ABC2). Supplementary Fig. S5b focuses on the predicted κL, where regions of thermally insulating materials (blue) contrast strongly with thermally conductive materials (red). Mainly, the upper left of the figure contains a mixture of space group 216 and 225 structures with ultralow κL, with some additional blue regions along the bottom and top outer edges mostly corresponding to space groups 221 and 225. This is further manifested in Supplementary Fig. S5c, where regions are instead colored by the predicted normalized log(max MSD). To highlight the relationship between κL and max MSD, we subtract the normalized value from unity so that the two properties share the same color mapping. As seen by comparing Supplementary Fig. S5b and c, both panels form identical structure mappings of the predicted κL and normalized log(max MSD) values, supporting their strong correlation. Additionally, the clustering of certain out-of-trend structures with extremely high or low κL indicates a correlation between the Elemental-SDNNFF input vector and phonon conductors and insulators, suggesting a strong structure-property relationship for phonon transport. 
Overall, the t-SNE plots encompass the wide range of unique structures and physics manifested when applying machine learned atomic force fields such as the Elemental-SDNNFF.

Insight from bonding and anti-bonding analysis

In the previous section, we related thermal displacements to κL but did not discuss the effect of chemical bonding. Here, we further analyze our predicted structures with the Crystal Orbital Hamilton Population (COHP)40 to quantify the contributions of bonding and antibonding states. To assign single bonding and antibonding values to each structure, we integrate the COHP curves for each atomic pair, as evaluated by the LOBSTER code, and take the average41. Figure 5 displays the resulting bonding and antibonding values, colored by the log of the predicted κL to reveal trends. Notably, at low bonding values (e.g., <200) and high antibonding values (e.g., >1), only κL < 3 W m−1 K−1 exists. This region corresponds to low interatomic bonding strength and high phonon anharmonicity, resulting in ultralow κL. The observed association of high antibonding with strong phonon anharmonicity is consistent with recent full DFT studies on other systems16,42,43,44. On the other hand, the high bonding (>200) region contains all ranges of κL. This is understandable from a physical point of view: κL is governed by two major mechanisms, interatomic bonding strength and phonon anharmonicity, and thus the bonding and antibonding contributions may compete. It is also worth pointing out that COHP is a low-cost indicator of κL, requiring only DFT calculations on primitive cells after structure optimization, which is very promising for filtering structures with anomalously low κL.
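One way to reduce a COHP curve to separate bonding and antibonding numbers is sketched below in Python. It is illustrative only and rests on several assumptions that are not specified in the text: energies are given relative to the Fermi level in ascending order, only occupied states (E ≤ 0) are integrated, the curve is supplied as −COHP so that positive values are bonding, and each trapezoid is assigned to bonding or antibonding according to the sign of its area. The function name `integrate_bonding` and the toy curve are hypothetical, and the averaging over atomic pairs is omitted.

```python
def integrate_bonding(energies, minus_cohp, e_fermi=0.0):
    """Split the integral of -COHP over occupied states into two parts.

    energies:   ascending energy grid (eV), relative to the Fermi level
    minus_cohp: -COHP values on that grid (positive = bonding)
    Returns (bonding, antibonding), both as non-negative numbers.
    """
    bonding = 0.0
    antibonding = 0.0
    for i in range(len(energies) - 1):
        e0, e1 = energies[i], energies[i + 1]
        if e1 > e_fermi:
            break  # stop at the Fermi level: only occupied states are counted
        # Trapezoidal area of this energy interval.
        area = 0.5 * (minus_cohp[i] + minus_cohp[i + 1]) * (e1 - e0)
        if area >= 0.0:
            bonding += area
        else:
            antibonding += -area
    return bonding, antibonding


# Toy curve for one atomic pair (made up for illustration).
energies = [-3.0, -2.0, -1.0, 0.0, 1.0]
minus_cohp = [0.0, 2.0, 0.0, -1.0, -5.0]
bonding, antibonding = integrate_bonding(energies, minus_cohp)
```

Occupied antibonding states just below the Fermi level, as in the last occupied interval of the toy curve, are the signature associated above with weakened bonds and strong anharmonicity.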

Fig. 5: Bonding-antibonding map as related to lattice thermal conductivity.
Crystal orbital Hamilton population (COHP) analysis for 13,718 stable structures from the prediction pool. Color represents the log values of the predicted lattice thermal conductivity.

Off-diagonal thermal transport analysis

Recently, two different mechanisms for phonon transport in solids have been discussed45,46. In crystalline materials, heat carriers propagate and scatter in a particle-like manner, as described by the Peierls-Boltzmann transport picture of phonon wave-packet dynamics. Such populations have a well-defined energy (frequency) and can therefore be interpreted as particle-like excitations with a well-defined wave vector (q) and mode index (s); the corresponding lattice thermal conductivity is denoted κP. In contrast, in glasses, heat carriers behave in a wave-like manner, hopping via Zener-like tunneling between quasi-degenerate vibrational eigenstates, as described by the Allen-Feldman theory. Such coherences do not have an absolute energy and cannot be related to a single eigenstate. Rather, they describe oscillations between pairs of eigenstates and correspond to an evolution that does not preserve the nature of the single-particle excitation. Very recently, the contribution of such coherences to the overall lattice thermal conductivity, which describes the wave-like interband tunneling of phonons and is dubbed two-channel thermal transport or the off-diagonal contribution of the heat-flux operator, has been theoretically formulated and experimentally validated in ultralow κP materials47,48,49. With the phonon properties of a large set of crystals predicted herein, it is natural to explore the two-channel thermal transport behavior of these materials.

The contribution of the off-diagonal components (\({\kappa }_{c}^{\alpha \beta }\)) to the total thermal conductivity is obtained by48,50:

$$\begin{aligned}\kappa_{c}^{\alpha\beta} = {} & \frac{\hbar^{2}}{k_{B} T^{2} N \Omega} \sum_{\boldsymbol{q}} \sum_{j \ne j'} \frac{\omega_{j}(\boldsymbol{q}) + \omega_{j'}(\boldsymbol{q})}{2} V_{j,j'}^{\alpha}(\boldsymbol{q}) V_{j,j'}^{\beta}(\boldsymbol{q}) \\ & \times \frac{\omega_{j}(\boldsymbol{q}) n_{j}(\boldsymbol{q}) \left(n_{j}(\boldsymbol{q}) + 1\right) + \omega_{j'}(\boldsymbol{q}) n_{j'}(\boldsymbol{q}) \left(n_{j'}(\boldsymbol{q}) + 1\right)}{4 \left(\omega_{j}(\boldsymbol{q}) - \omega_{j'}(\boldsymbol{q})\right)^{2} + \left(\Gamma_{j}(\boldsymbol{q}) + \Gamma_{j'}(\boldsymbol{q})\right)^{2}} \times \left(\Gamma_{j}(\boldsymbol{q}) + \Gamma_{j'}(\boldsymbol{q})\right)\end{aligned}$$
(1)

where ħ, kB, T, N, and Ω are the reduced Planck constant, Boltzmann constant, absolute temperature, number of q-points sampled in reciprocal space, and volume of the unit cell, respectively. ωj(q), Γj(q), and nj(q) are the frequency, linewidth, and equilibrium Bose-Einstein distribution function of the phonon with wave vector q and branch j, respectively. \(V_{j,j'}^{\alpha}(\boldsymbol{q})\) is the off-diagonal element (j ≠ j′) of the velocity matrix along direction α and can be calculated by:

$$V_{j,j'}^{\alpha}(\boldsymbol{q}) = \frac{1}{\omega_{j}(\boldsymbol{q}) + \omega_{j'}(\boldsymbol{q})} \left\langle e_{j}(\boldsymbol{q}) \right| \nabla_{q}^{\alpha} \boldsymbol{D}(\boldsymbol{q}) \left| e_{j'}(\boldsymbol{q}) \right\rangle$$
(2)

where ej(q) and D are the eigenvector and dynamical matrix, respectively. In our work, we modified the original ShengBTE code51 to output the off-diagonal elements of the velocity matrix and then calculated the off-diagonal thermal conductivity based on Eq. (1).
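To make the structure of Eq. (1) concrete, the following Python sketch evaluates the contribution of a single q-point in SI units. It is a minimal illustration and not the modified ShengBTE implementation: the function names are hypothetical, the velocity-matrix elements are treated as real numbers for simplicity, and the input values in the example are made up.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant (J s)
KB = 1.380649e-23       # Boltzmann constant (J/K)


def bose_einstein(omega, temperature):
    """Equilibrium Bose-Einstein occupation for angular frequency omega (rad/s)."""
    return 1.0 / math.expm1(HBAR * omega / (KB * temperature))


def kappa_c_qpoint(omegas, gammas, v_alpha, v_beta, temperature, volume, n_q):
    """Contribution of one q-point to kappa_c^{alpha beta}, following Eq. (1).

    omegas, gammas:   per-branch frequencies and linewidths (rad/s)
    v_alpha, v_beta:  velocity-matrix elements V[j][j'] (m/s), assumed real here
    volume:           unit-cell volume (m^3)
    n_q:              number of sampled q-points
    Returns a value in W m^-1 K^-1.
    """
    prefactor = HBAR ** 2 / (KB * temperature ** 2 * n_q * volume)
    n = [bose_einstein(w, temperature) for w in omegas]
    total = 0.0
    for j in range(len(omegas)):
        for jp in range(len(omegas)):
            if j == jp:
                continue  # only off-diagonal (j != j') pairs contribute
            gamma_sum = gammas[j] + gammas[jp]
            occupation = (omegas[j] * n[j] * (n[j] + 1.0)
                          + omegas[jp] * n[jp] * (n[jp] + 1.0))
            lorentzian = gamma_sum / (
                4.0 * (omegas[j] - omegas[jp]) ** 2 + gamma_sum ** 2
            )
            total += (0.5 * (omegas[j] + omegas[jp])
                      * v_alpha[j][jp] * v_beta[j][jp]
                      * occupation * lorentzian)
    return prefactor * total


# Toy two-branch example (values are made up for illustration).
two_pi = 2.0 * math.pi
omegas = [two_pi * 1.0e12, two_pi * 1.2e12]
gammas = [1.0e11, 2.0e11]
v_offdiag = [[0.0, 500.0], [500.0, 0.0]]
kappa_c = kappa_c_qpoint(omegas, gammas, v_offdiag, v_offdiag, 300.0, 1.0e-28, 1)
```

The Lorentzian factor shows why κc matters most when branches are closely spaced relative to their linewidths: the term grows as the frequency difference between two branches becomes comparable to or smaller than the sum of their linewidths.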

To calculate the lattice thermal conductivity contributed by phonon coherences (κc), we randomly chose 1000 structures from each pool of materials with low (<1 W m−1 K−1), medium (1–10 W m−1 K−1), and high (>10 W m−1 K−1) κP, as obtained from the traditional BTE solution, and used the same parameters and force constants as in the model prediction step. Of these structures, 869 low, 995 medium, and 999 high κP BTE calculations converged successfully, with average off-diagonal ratios κc/κP of 2.59, 5.02 × 10−2, and 4.35 × 10−3, respectively. The κc/κP ratios from DFT and from our Elemental-SDNNFF model agree well (Supplementary Fig. S6), which again demonstrates the accuracy of our model. In Fig. 6, we plot the percentage of the off-diagonal contribution, defined as κc/κtotal, where κtotal = κc + κP. Good agreement is observed between our Elemental-SDNNFF predictions on unseen data and the DFT results from the training data. A strong, approximately linear correlation is found between the percentage of κc and κtotal. Generally speaking, the lower κP is, the higher the percentage contributed by κc can be. For materials with extremely low κP, e.g., κP ~0.1 W m−1 K−1, κc can contribute as much as 50% or even 70% of κtotal, showing the dominant role of the coherences even at room temperature. A similar phenomenon has been found in perovskite CsPbBr3 at room temperature and in La2Zr2O7 at high temperature48,49. We also noticed that, for the same κtotal, the κc contribution can differ considerably among structures, leading to a very wide spread of the log-scale percentage contribution in Fig. 6. This implies that the precise off-diagonal contribution percentage depends on the detailed phonon band structure and mode-level phonon anharmonicity of each material, rather than on the absolute value of κP alone.
Figure 6 also shows that a considerable number of structures exhibit dominant wave-like heat conduction rather than the particle-like conduction predicted by the traditional Peierls BTE. Our model clearly identifies the crossover from particle-like to wave-like heat conduction (dashed line in Fig. 6).
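
The ratio κc/κP and the percentage κc/κtotal are related by simple arithmetic: with r = κc/κP, the fraction is r/(1 + r), so the 50% crossover corresponds to r = 1. The short Python sketch below makes this mapping explicit; the function names are hypothetical. Note that the three ratios used as inputs are the pool averages quoted above, so the printed percentages illustrate the conversion for a structure with that ratio and are not the average percentages of the pools.

```python
def fraction_from_ratio(ratio):
    """Convert r = kappa_c / kappa_P into kappa_c / kappa_total = r / (1 + r)."""
    return ratio / (1.0 + ratio)


def is_wave_like(ratio):
    """True above the 50% crossover, i.e. when kappa_c exceeds kappa_P."""
    return fraction_from_ratio(ratio) > 0.5


# Average kappa_c / kappa_P ratios quoted for the low, medium, and high pools.
for r in (2.59, 5.02e-2, 4.35e-3):
    share = 100.0 * fraction_from_ratio(r)
    print(f"kappa_c/kappa_P = {r:g} -> {share:.2f}% of kappa_total, "
          f"wave-like dominant: {is_wave_like(r)}")
```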

Fig. 6: The off-diagonal contribution against the total lattice thermal conductivity at room temperature from two sources of data.

The percentage of the off-diagonal contribution is defined as κc/κtotal, where κtotal = κc + κP. The agreement between our Elemental-SDNNFF predictions for 2863 selected unseen data points and the DFT results for 2397 training data points is indicated by their overlap and matching linear trend. The dashed black line marks a 50% off-diagonal contribution, above which wave-like heat conduction dominates.

To examine the mode-level contributions to κP and κc, three low-κP materials (NaKAs, KIrCs2Cl6, and CuPdSr2, with space group nos. 216, 225, and 216, respectively) with varying coherence contributions are chosen from the untrained pool for comparison. The frequency-dependent relaxation time is plotted in Fig. 7. The materials are listed in order of increasing κP, with (predicted, DFT validation) values of (0.037, 0.075), (0.101, 0.127), and (0.68, 0.935) W m−1 K−1, respectively. The corresponding (predicted, DFT) κc values are (0.222, 0.205), (0.252, 0.226), and (0.07289, 0.08282) W m−1 K−1, respectively. Comparing the predicted (left column in Fig. 7) and DFT (right column in Fig. 7) results shows that the general trends of relaxation time against frequency are captured, again confirming the accuracy of our ML model. Nevertheless, the κP contributions in the low-frequency range are visibly more pronounced in circle size and density in the DFT results (Fig. 7b, d, f) than in the predictions (Fig. 7a, c, e). This agrees with the previously observed underprediction of κP for structures with κP < 1 W m−1 K−1. In addition, NaKAs and KIrCs2Cl6 have similar κc values, which stem from high mode-level κc contributions across the entire frequency range. For CuPdSr2, the vast majority of modes have a low coherence contribution, which reduces κc. Here, the κc contributions are overshadowed by the dominant mode-level κP throughout the entire frequency range, leading to a lower κc/κtotal ratio, as previously seen in Fig. 6.
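
For reference, the coherence share κc/κtotal implied by the values quoted above can be obtained with a few lines of Python. The dictionary layout and the function name are hypothetical; the numbers are copied from the text.

```python
# (kappa_P, kappa_c) in W m^-1 K^-1, copied from the values quoted in the text.
materials = {
    "NaKAs":     {"predicted": (0.037, 0.222),  "dft": (0.075, 0.205)},
    "KIrCs2Cl6": {"predicted": (0.101, 0.252),  "dft": (0.127, 0.226)},
    "CuPdSr2":   {"predicted": (0.68, 0.07289), "dft": (0.935, 0.08282)},
}


def coherence_share(kappa_p, kappa_c):
    """Fraction of the total lattice thermal conductivity carried by coherences."""
    return kappa_c / (kappa_p + kappa_c)


shares = {
    name: {source: coherence_share(*values) for source, values in entry.items()}
    for name, entry in materials.items()
}

for name, entry in shares.items():
    print(f"{name}: predicted {100 * entry['predicted']:.1f}%, "
          f"DFT {100 * entry['dft']:.1f}%")
```

With the quoted DFT values this gives roughly 73%, 64%, and 8% for NaKAs, KIrCs2Cl6, and CuPdSr2, respectively, so the first two lie above the 50% crossover and the third lies well below it.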

Fig. 7: The frequency-dependent relaxation time for NaKAs, KIrCs2Cl6, and CuPdSr2.

Rows (a, b), (c, d), and (e, f) correspond to the materials in the order listed, whereas columns (a, c, e) and (b, d, f) show our ML model predictions and the DFT results, respectively. The size of the circles represents the magnitude of κP, whereas the color represents the ratio κc/κtotal, i.e., the off-diagonal contribution to the overall thermal transport. The circle diameters are scaled up equally in the predicted and DFT plots for ease of viewing.

Conclusion

We demonstrate a bottom-up machine learning approach for accurately and efficiently predicting comprehensive phonon properties of ~80,000 crystalline materials, realized by evaluating the atomic forces with sufficient DFT representation. The query-by-committee active learning scheme allows iterative improvement of the models through simultaneous prediction of atomic forces in the unseen pool of displaced structures. The final model is deployed to construct IFCs for both observed and unseen structures. Given the good accuracy for phonon properties such as vibrational frequencies and κL, we exploit the abundant pool of 25,901 structures predicted to be thermodynamically stable by our model and quantify thermal insulators with simple descriptors. MSD and bonding/antibonding states are two computationally efficient descriptors for screening ultralow κL. Moreover, given its high accuracy for harmonic-dominated properties such as the maximum MSD, we propose our model as a means of generating high-quality data for the direct prediction of phonon properties by other machine learning models, for example to predict materials with extremely high or low lattice thermal conductivity. Our model also provides physical insight into the off-diagonal contribution to overall phonon transport, demonstrating the general trend of high coherence contributions to the total thermal conductivity for low-κL structures and the crossover from particle-like to wave-like heat conduction in diverse structures. The precise coherence contribution percentage depends on the detailed phonon band structure and mode-level phonon anharmonicity of each material, rather than on the absolute value of κL alone. Our algorithm can grow and be adapted to even larger pools of unseen materials and is promising for accelerating the discovery of crystals for emerging phonon-mediated applications.

Methods

Training dataset generation

We first filter structures by elements, formation energy, and energy above hull, and then perform structure optimization (blue path in Fig. 1). The purpose of filtration is to reduce the structure count and increase the likelihood of stability in the subsequent DFT calculations of phonon dispersions. After structure optimization, we replicate the primitive cells into supercells and displace all atoms by a fixed 0.03 Å in random directions, which creates diversity in the atomic environments, facilitates training, and is also standard for IFC calculations. Additionally, for IFC fitting with CSLD, these random displacements help mitigate the inherently poor energy conservation of direct force field models, as demonstrated in previous works36,52. Specifically, because the forces are predicted directly, IFC fitting with methods such as the finite difference method (FDM) does not guarantee zero or near-zero forces on atoms at equilibrium. Such atoms are abundant in FDM, in which only one or two atoms are displaced for IFCs up to third order, causing significant disagreement in the force-sensitive lattice thermal conductivity. Stochastic methods such as CSLD mitigate the energy conservation issue by displacing all atoms, which produces a noise-canceling effect in the IFCs fitted to the predicted forces. Afterwards, we randomly select a small fraction of the displaced supercell structures to serve as the initial training data for the models. These are passed to DFT calculations, and the resulting ground-truth atomic forces and local atomic environments are passed to five initialized models with similar architecture but different weights. Here, we used a small subset of existing DFT data previously calculated for phonon properties. Although we do not introduce techniques for choosing the initial training structures, we recommend approaches such as principal component analysis (PCA), which has been shown to improve model representation, especially for out-of-trend structures53.
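
The displacement step described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, Cartesian coordinates in Å are assumed, and drawing each direction uniformly on the unit sphere (by normalizing a Gaussian sample) is an assumption, since the text only specifies a fixed 0.03 Å amplitude and random directions.

```python
import math
import random


def random_unit_vector(rng):
    """Uniformly distributed direction, from a normalized 3D Gaussian sample."""
    while True:
        vec = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in vec))
        if norm > 1e-12:  # guard against a (very unlikely) zero vector
            return [c / norm for c in vec]


def displace_all_atoms(positions, amplitude=0.03, seed=None):
    """Shift every atom by a fixed amplitude (in Angstrom) along its own random direction."""
    rng = random.Random(seed)
    displaced = []
    for pos in positions:
        direction = random_unit_vector(rng)
        displaced.append([p + amplitude * d for p, d in zip(pos, direction)])
    return displaced


# Toy two-atom cell in Cartesian coordinates (Angstrom); values are made up.
cell = [[0.0, 0.0, 0.0], [1.5, 1.5, 1.5]]
moved = displace_all_atoms(cell, amplitude=0.03, seed=0)
for before, after in zip(cell, moved):
    print(before, "->", [round(x, 4) for x in after])
```

Because every atom is displaced, each supercell yields a force label for every atom, in contrast to FDM-style configurations in which most atoms remain at equilibrium.
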
After the DFT step, we perform data augmentation in which atomic environments are rotated according to nearest-neighbor rules, yielding a ~2–3× increase in the existing dataset (see schematic in Supplementary Fig. S2). This enhances the diversity of the dataset for model training beyond the already abundant N × D dataset, where N is the number of atoms per simulation (or per supercell) and D is the number of DFT simulations. As an aside, future work is planned to introduce rotational equivariance into the existing Elemental-SDNNFF model so that rotated atomic environments are handled automatically, without the need for the nearest-neighbor rules and rotation matrices54. This should drastically improve the training efficiency by reducing the training set size while maintaining the current rotational covariance of the force field.
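
The principle behind this augmentation is that a force is a vector, so rotating an atomic environment by a matrix R requires rotating its force label by the same R. The Python sketch below illustrates this covariance with an arbitrary rotation about the z axis; the nearest-neighbor rules that select the actual rotations are not reproduced here, and all function names are hypothetical.

```python
import math


def rotation_about_z(angle):
    """3x3 rotation matrix for a rotation by `angle` (radians) about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotate(matrix, vector):
    """Matrix-vector product for a 3x3 matrix and a 3-vector."""
    return [sum(matrix[r][k] * vector[k] for k in range(3)) for r in range(3)]


def augment(neighbor_vectors, force, rotation):
    """Rotate an atomic environment and its force label with the same matrix."""
    return [rotate(rotation, v) for v in neighbor_vectors], rotate(rotation, force)


# Toy environment: vectors from the central atom to two neighbors, and its force.
neighbors = [[2.0, 0.0, 0.0], [0.0, 2.0, 1.0]]
force = [0.1, -0.2, 0.05]
new_neighbors, new_force = augment(neighbors, force, rotation_about_z(math.pi / 2))
print(new_neighbors, new_force)
```

Interatomic distances and the projection of the force onto each neighbor direction are unchanged by the rotation, so the augmented sample is physically equivalent to the original but presents the network with a differently oriented input.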

Active learning details

Taking advantage of the interpolative nature of neural networks, these models serve as a committee that judges the atomic environments of the remaining untrained (active learning) structures. Structures yielding high uncertainty in the forces indicate poor representation of the corresponding atomic environments and DFT forces. The uncertainty is evaluated by55:

$$\varepsilon_{i}=\sqrt{\left\langle \left\| f_{i}^{m}-\bar{f}_{i}\right\|^{2}\right\rangle},\quad \bar{f}_{i}=\left\langle f_{i}^{m}\right\rangle$$
(3)

where εi is the uncertainty indicator for atom i, \(f_{i}^{m}\) is the force predicted by model m, and \(\bar{f}_{i}\) is the average force across all models in the committee. We take max(εi) for each displaced supercell and, to promote diversity in atomic environments for subsequent DFT evaluation, choose the one supercell with the highest max(εi) out of all supercells associated with a unique structure. We set the uncertainty threshold to εi > 50 meV/Å, well above the force error of the model, to ensure that the selected structures are poorly represented by the committee. Structures with uncertainty above the threshold are then passed to further DFT calculations and retrained into the model, forming a closed loop with iterative self-improvement (red circled arrows in Fig. 1). Once the number of recommended structures converges to a near-constant value, we deploy the model to evaluate atomic forces and subsequent phonon properties for all unseen structures (green path in Fig. 1). For more details about the model and the active learning procedure, we refer the reader to our previous work31. After seven rounds of active learning combined with data augmentation, 29.4 million atomic environments are successfully trained into the network (Supplementary Fig. S2).
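
The selection rule built on Eq. (3) can be sketched as follows. This is a minimal Python illustration rather than the actual pipeline: the function names and the nested-list data layout are hypothetical, forces are assumed to be in eV/Å (so the 50 meV/Å threshold is 0.050), and the angle brackets in Eq. (3) are interpreted as a plain average over the committee members.

```python
import math


def atom_uncertainty(forces_by_model):
    """Eq. (3): RMS deviation of the committee forces on one atom from their mean."""
    n_models = len(forces_by_model)
    mean = [sum(f[k] for f in forces_by_model) / n_models for k in range(3)]
    mean_sq = sum(
        sum((f[k] - mean[k]) ** 2 for k in range(3)) for f in forces_by_model
    ) / n_models
    return math.sqrt(mean_sq)


def supercell_uncertainty(committee_forces):
    """max(eps_i) over atoms; committee_forces[m][i] is the force on atom i by model m."""
    n_atoms = len(committee_forces[0])
    return max(
        atom_uncertainty([model[i] for model in committee_forces])
        for i in range(n_atoms)
    )


def select_for_dft(supercells_by_structure, threshold=0.050):
    """Per structure, keep the most uncertain supercell if it exceeds the threshold.

    Returns {structure_id: (supercell_index, uncertainty)}.
    """
    selected = {}
    for structure_id, supercells in supercells_by_structure.items():
        scores = [supercell_uncertainty(cf) for cf in supercells]
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] > threshold:
            selected[structure_id] = (best, scores[best])
    return selected


# Toy committee of two models and one atom per supercell; values are made up.
agree = [[[0.1, 0.0, 0.0]], [[0.1, 0.0, 0.0]]]
disagree = [[[0.1, 0.0, 0.0]], [[-0.1, 0.0, 0.0]]]
picked = select_for_dft({"A": [agree, disagree], "B": [agree]})
print(picked)
```

In this toy case only structure "A" is recommended, through its second supercell, because the two committee members disagree on the force by more than the threshold; structure "B" is considered well represented and is not sent to DFT.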