Prediction of thermoelectric performance for layered IV-V-VI semiconductors by high-throughput ab initio calculations and machine learning

Layered IV-V-VI semiconductors have immense potential for thermoelectric (TE) applications due to their intrinsically ultralow lattice thermal conductivity. However, it is extremely difficult to assess their TE performance via experimental trial-and-error methods. Here, we present a machine-learning-based approach to accelerate the discovery of promising thermoelectric candidates in this chalcogenide family. Based on a dataset generated from high-throughput ab initio calculations, we develop two highly accurate-and-efficient neural network models to predict the maximum ZT (ZTmax) and corresponding doping type, respectively. The top candidate, n-type Pb2Sb2S5, is successfully identified, with the ZTmax over 1.0 at 650 K, owing to its ultralow thermal conductivity and decent power factor. Besides, we find that n-type Te-based compounds exhibit a combination of high Seebeck coefficient and electrical conductivity, thereby leading to better TE performance under electron doping than hole doping. Whereas p-type TE performance of Se-based semiconductors is superior to n-type, resulting from large Seebeck coefficient induced by high density-of-states near valence band edges.


INTRODUCTION
Due to the depletion of fossil fuels and ever-increasing environmental concerns, it is urgent to explore sustainable and clean energies, which has become a global consensus 1 . The thermoelectric (TE) conversion technique creates a very promising opportunity to directly convert waste heat into electricity 2,3 . The energy conversion efficiency is quantified by the TE materials' dimensionless figure of merit ZT: ZT = S 2 σT/(κ l + κ e ), where S, σ, κ l and κ e are the Seebeck coefficient, electrical conductivity, lattice thermal conductivity, and electronic thermal conductivity, respectively. Undoubtedly, for a material to achieve high TE performance, it should maintain the high power factor (i.e., S 2 σ) and low thermal conductivity (i.e., κ l + κ e ) simultaneously. To date, several methods have been employed to improve the ZT value, including band structure engineering [4][5][6] , nanostructure engineering 7-9 , carrier modulation [10][11][12] , and enhancement of phonon scattering through extrinsic strategies, such as the introduction of all-scale hierarchical architectures 13,14 . Besides, searching crystalline solids with intrinsically low κ l is a significant avenue [15][16][17] , and it is also beneficial to optimize the electrical transport properties independently 18 .
Very recently, we identified a large family of stable layered IV-V-VI (IV (A) = Si, Ge Sn, Pb; V (B) = As, Sb, Bi; VI (C) = S, Se, Te) semiconductors (70 compounds in total) 19 . It is noted that the chemical composition of the current advanced TE materials is mainly composed of main group IV to VI elements, such as PbTe 6 , Bi/Sb−Te-based alloys 20,21 , and recently reported Sb 2 Si 2 Te 6 22 , most of which also adopt layered crystal structures. More importantly, all these IV-V-VI compounds exhibit ultralow lattice thermal conductivity (0.28−2.02 W m −1 K −1 at 300 K) and suitable electronic bandgaps (>0.1 eV) 19 within the range of good TE materials 23 . The above inherently favorable characteristics indicate the immense potential of this ternary chalcogenide family as TE materials. Thus it is of great importance to explore their TE performance.
It is well known that the traditional trial-and-error approach is a cost-inefficient way to develop unknown materials with target properties, which may even end up with fruitlessness. In contrast, the high-throughput (HT) density functional theory (DFT) calculations have emerged as a powerful and useful tool for accelerating the discovery of various types of functional materials in recent years 24,25 . However, the full HT calculations will be computationally extremely expensive when identifying suitable materials from numerous candidates. Nowadays, driven by the advances of artificial intelligence technology, the machine learning (ML) approach has become a much attractive technique in materials science [26][27][28] , which has been successfully utilized to predict diverse materials properties [29][30][31] . By combing HT calculations and ML methodology, therefore, one can not only efficiently search the huge space with low computational cost but also gain quantitative insights into the structure-property relationship [32][33][34] .
In this work, using our developed HT intelligent computational platform (ALKEMIE) 35 , we first generate a database comprising of a series of descriptors (features) and TE labels (characteristics) of 40 compounds to compose an input data matrix for the ML models training and validating. Then, based on extensive examinations of hyperparameters of deep neural networks, two highly accurate ML models are successfully developed and used to predict the maximum ZT (ZT max ) and corresponding doping type at different temperatures for the remaining 30 semiconductors, respectively. Among this IV-V-VI family, we identify several compounds that exhibit good TE performance with ZT max values above 0.8 at 650 K, especially the n-type Pb 2 Sb 2 S 5 achieving a high ZT max of 1.2. Meanwhile, it is also observed that Se-based compounds have higher ZT max under p-type doing compared to n-type doping while Te-based compounds show better n-type TE performance 1 than p-type. Finally, with the intention of understanding this behavior and unraveling the origin of the excellent TE performance of Pb 2 Sb 2 S 5 , we comprehensively investigate the electronic structures, electrical and phonon thermal transport properties.

Dataset for machine learning
The magnitude of ZT directly describes the performance of a TE material: the larger the ZT value, the higher the TE energy conversion efficiency. Meanwhile, the optimal doping type (i.e., n-type or p-type) at which the maximum ZT value is achieved for a given temperature provides significant insights for experimental research. Therefore, the ZT max and corresponding optimal doping type are used as the two TE labels for ML prediction. It is known that the input dataset is one of the key components in developing efficient ML models. Here, we created a database of altogether 480 instances comprising of 40 compounds, as each compound is investigated at 12 different temperature levels (from 100 to 650 K with an interval of 50 K). These 40 semiconductors are selected randomly from the 70 compounds to ensure the full coverage of all data (see "Methods" section for details). Each instance is described by 31 initial descriptors: the number of atoms, valence electronic configurations, features related to the periodic table (row numbers and atomic numbers), atomic and covalent radii, electronegativities, average atomic mass, lattice constants, interatomic bond lengths, and temperature (see Supplementary Table 1 of Supplementary Information for details). Figure 1a shows the correlation coefficients between all 31 features. Clearly, many different features exhibit extremely high correlation coefficients closely approached to 1, indicating that they are strongly coupled. In order to avoid the extremely complex ML models, it is very important to select a suitable number of descriptors that could perfectly reflect the TE performance. Hence, we have performed a feature reduction by removing those redundant descriptors with correlation coefficients above the absolute 0.90 ( Supplementary  Fig. 1), leaving 11 optimal features (Fig. 1b), which are very easy to obtain without any complicated and expensive calculations. Finally, an input data matrix used to build ML models is obtained, in which the total number of data points is 5280 (i.e., 480 × 11). We divided this dataset into 75% training (3960 data points comprising of 30 compounds) and 25% validation sets (1320 data points containing 10 compounds). In addition, for preventing overfitting, 70% of the training data is adopted for training ML models, while the remaining 30% is used for real-time testing.

Machine learning models training
Owing to the rapid development of computing power and the acceleration of graphics processing unit (GPU), here we directly employed the deep neural networks as the machine learning model.
To avoid overfitting or underfitting models and ensure high accuracy simultaneously, we adopted three methods of adding batch norm layer, dropout layer, and cross-validation. In addition, a set of appropriate hyperparameters is critical for obtaining effective neural network models, which generally includes the activation function, the optimization algorithm, the number of hidden layers, the number of nodes in each hidden layer, and the initialization of weights and bias. We have extensively tested the effects of different hyperparameters on the convergence and accuracy of the ML models, as shown in Fig. 2.
Firstly, we examined three popular types of activation functions, namely, Sigmod, Tanh 36 , and rectified linear unit (Relu) 37 . As seen in Fig. 2a, the mean square error (MSE) of the training set using the Sigmod activation function is 0.00531, which is highly comparable to that of Tanh (0.00576) (Fig. 2b) and is relatively lower than the Relu (0.01037) (Fig. 2c), certifying the higher training accuracy of the Sigmod and Tanh functions. However, the root mean square error (RMSE) of Sigmod (0.16156) and Tanh (0.27545) is significantly larger than the value of Relu (0.06520), indicating the much lower prediction accuracy of the former two functions, as clearly evidenced in Fig. 2a-c (Sigmod: 83.844%, Tanh: 72.455%, and Relu: 93.480%). Also, Sigmod and Tanh functions appear serious underfitting problems, as the MSE of their testing dataset is substantially larger than that of the training dataset. In contrast, the Relu function exhibits a better fitting behavior because both training and testing datasets have similar learning curves and comparable MSE values, in spite of minor deviations. Meanwhile, among these three activation functions, Relu also possesses the largest coefficient of determination value 38 (R 2 = 0.95014), substantiating its superior globally fitting performance compared to the other two functions, and thus the Relu function is selected. Then, based on the Relu activation function, we tested different backpropagation algorithms 39 . By comparing with the Adaptive Moment Estimation (Adam) (Fig. 2c) 40 , the convergence processes during training and testing of stochastic gradient descent (SGD) 41 are almost the same, and the corresponding MSE difference could be ignored (Fig. 2d), although SGD has a slower convergence speed and moderately smaller prediction accuracy (91.765%). Hence, the algorithm of SGD is adopted as the optimizer to further eliminate the underfitting problem that existed in Adam. In addition, using the Relu activation function and the SGD optimizer, we also demonstrated that the enhancement of hidden layers ( Fig. 2e) and nodes ( Fig. 2f) has little impact on the fitting performance but increases the training cost. Therefore, the combination of Relu activation function, SGD optimization algorithm, and three hidden layers with corresponding nodes of 100, 50, and 20 were selected as the favorable ML model (defined as mode-I here) for predicting ZT max .
Notably, the prediction of ZT max is a kind of continuous problem, while the estimation of optimal doping type is the classification problem, which has only two outputs (i.e., n-type and p-type) and thus is more prone to overfitting or underfitting. Hence, we trained a different ML model through transfer learning from model-I to predict optimal doping type at different temperatures. Supplementary Fig. 2 shows the testing process for various hyperparameters. In order to avoid overfitting or underfitting behaviors and to reduce the training error simultaneously, the combination of Tanh activation function, SGD algorithm and four hidden layers with corresponding nodes of 100, 100, 50, and 20 have been adopted (model-II) to estimate the optimal doping type ( Supplementary Fig. 2e). This is because the cross-entropy difference between the training (0.20965) and testing datasets (0.20990) in model-II is negligible, suggesting the non-existence of overfitting and underfitting situations. Besides, it also shows the lowest training error compared to the other two non-overfitting and non-underfitting ML models

Prediction of TE performance
We further used the validation dataset to verify the reliability of model-I and model-II. The comparison between the DFTcalculated and ML-predicted ZT max values is shown in Fig. 3a, where the MSE is~0.008, R 2 reaches~0.952 and the prediction accuracy exceeds 90%, substantiating the validity of model-I in predicting the ZT max at various temperatures. Figure 3b indicates that out of the total 120 verification data points, the prediction accuracy of model-II is as high as 91.67%, confirming its robust reliability. Subsequently, we applied these two ML models to estimate the ZT max and optimal doping type of the remaining 30 compounds. As obviously seen from Fig. 3c, the ZT max values of all compounds increase with increasing temperature, which is also observed in the initial 40 input compounds ( Supplementary Fig.  3). Among this IV-V-VI material family, several promising TE semiconductors have been identified, which exhibit the ZT max above 0.8 at 650 K, especially the top compound Pb 2 Sb 2 S 5 having ZT max of 1.2 under n-type doping. Moreover, for a given temperature, it is interesting to observe that Te-based compounds achieve the ZT max under n-type doping while the Se-based semiconductors show p-type ZT max . Such behavior also appears in almost all input Se/Te-based compounds used to train and validate ML models ( Supplementary Fig. 4). In order to explore the superior n-type TE performance of Pb 2 Sb 2 S 5 and understand the doping type difference in Se-and Te-based compounds, we systematically investigated the electronic structures and transport properties, using three Pb 2 Sb 2 VI 5 (VI = S, Se, Te) compounds as examples in the following. Figure 4 shows the calculated electronic band structures of Pb 2 Sb 2 S 5 , Pb 2 Sb 2 Se 5 , and Pb 2 Sb 2 Te 5 using the generalized gradient approximations (GGA) of Perdew−Burke−Ernzerhof (PBE) exchange-correlation functional 42 . Note that we have demonstrated the band dispersion features determined by the Heyd−Scuseria−Ernzerhof screened hybrid functional (HSE06) 43 are quite similar to those obtained from the GGA-PBE method ( Supplementary Fig. 5), indicating that the HSE06 approach has little effect on the shape of the band structures. Therefore, the efficient GGA-PBE band structures instead of the computational expensive HSE band structures were used in this work. It is clearly seen that all the three compounds are direct-gap semiconductors, as their conduction band minimum (CBM) and valence band maximum (VBM) are both located at the Γ point (Fig. 4a-c). The resulting bandgaps are 0.53, 0.12 and 0.34 eV for Pb 2 Sb 2 S 5 , Pb 2 Sb 2 Se 5 , and Pb 2 Sb 2 Te 5 , respectively, which are appreciably lower than those determined from the HSE06 functional (0.91 eV for Pb 2 Sb 2 S 5 , 0.47 eV for Pb 2 Sb 2 Se 5 , and 0.60 eV for Pb 2 Sb 2 Te 5 ) 19 . To eliminate the bipolar conduction effect 44 , therefore, the more accurate HSE bandgaps were used in our calculations of electrical transport properties (see below). In addition, these three compounds exhibit similar projected band structures (Fig. 4d-f). Specifically, the valence bands are basically dominated by the p electrons of VI atoms, and the conduction bands are mainly contributed from Pb 6p and Sb 5p states. Since the Pb/Sb p orbitals have slight contributions to the valence band edge, it is possible to modulate the hole carrier concentration via Pb/Sb defect engineering without considerably influencing the valence band structures, thereby enhancing the TE performance. For example, the isostructural compound Ge 2 Sb 2 Te 5 gives the maximal ZT value of 0.78 at 700 K by substituting the Ge sites with In,~2fold improvement compared with the undoped system because the introduction of indium as a potentially donor-like dopant lowers the hole carriers density 45 . From band structures, it is also found that the highest valence band (HVB) of Pb 2 Sb 2 S 5 is much flatter than that of Pb 2 Sb 2 Se 5 and Pb 2 Sb 2 Te 5 , indicating the larger density of states (DOS) near the VBM of the former. Meanwhile, Pb 2 Sb 2 Te 5 is expected to possess the highest DOS near the CBM due to its flattest lowest conduction band (LCB). We will further discuss this later.

Electronic structures
To improve the understanding of electronic structures, the projected density of states (PDOS) of Pb 2 Sb 2 S 5 , Pb 2 Sb 2 Se 5 , and Pb 2 Sb 2 Te 5 are plotted in Fig. 5a-c. It clearly reveals that the top of valence bands is predominated by the VI p electrons, and the electronic states near the CBM primarily come from the p states of Pb and Sb and partly from the VI p orbital. Pb 6p and Sb 5p electrons are widely distributed between −6 and 0 eV, which strongly overlap with VI p valence electrons in the same range, suggesting the covalent bonding characters of Pb-VI and Sb-VI bonds. Noticeably, there exists small Pb 6s and Sb 5s peaks close to the VBM, although they are mainly localized in the deep energy Fig. 3 ML models validation and thermoelectric performance prediction. ML-predicted and DFT-calculated a ZT max and b corresponding doping type of 10 compounds at different temperatures (120 data points). c ML-predicted ZT max and optimal doping type at 12 temperatures (from 100 to 650 K with an interval of 50 K) for the remaining 30 semiconductors.
region below the Fermi level E F (−10 to −6 eV). In order to gain further knowledge of the electronic interactions between different atoms, we calculated the projected crystal orbital Hamilton populations (pCOHP) 46 , as shown in Fig. 5d-f. In principle, a negative value of -pCOHP indicates the antibonding interaction while the bonding state exhibits a positive one. Obviously, these three compounds are considered to be electronic stable because no antibonding states are located at the E f . In the energy range slightly below the E F , the interactions between Pb/Sb s and VI p electrons have negative -pCOHP values, indicating the antibonding states of Pb 6s-IV 5p and Sb 5s-IV 5p, which would play an important role in the hole carrier transport. It is also found that the Pb/Sb p-VI p exhibit positive -pCOHP values in the range from −6 to 0 eV, forming bonding states and hence stabilizing the electronic structure. Figure 6 illustrates the total DOS of valence and conduction band edges of Pb 2 Sb 2 S 5 , Pb 2 Sb 2 Se 5 , and Pb 2 Sb 2 Te 5 . In general, a rapid change in the DOS with energy is favorable for obtaining a  large Seebeck coefficient according to the relations given by 6 where n is the carrier concentration and m Ã d represents the DOS effective mass. It turns out that the valence DOS near the VBM of Pb 2 Sb 2 S 5 is indeed much larger than that of Pb 2 Sb 2 Se 5 and Pb 2 Sb 2 Te 5 (Fig. 6a) while the increase of DOS close to the CBM is considerably faster in Pb 2 Sb 2 Te 5 (Fig. 6b), which agrees well with our above analysis of band structures. Hence, among these three compounds, we can expect that Pb 2 Sb 2 S 5 gives the highest Seebeck coefficient under p-type doping and Pb 2 Sb 2 Te 5 should have the largest value in n-type doping. Besides, both Pb 2 Sb 2 S 5 and Pb 2 Sb 2 Se 5 would exhibit higher p-type S values than the n-type due to the steeper DOS of valence band edges compared with conduction band edges. Similarly, for Pb 2 Sb 2 Te 5 , the n-type doped Seebeck coefficient would be better than that of the p-type. The band structures and total DOS of some highperformance Se/Te-based compounds have been shown in Supplementary Figs. 6-9.

Electrical transport properties
The calculated Seebeck coefficients of n-type and p-type Pb 2 Sb 2 VI 5 compounds as a function of carrier concentration (n) at 650 K are displayed in Fig. 7a, e. Note that the estimated temperature-dependent Seebeck coefficients of Ge 1 Sb 2 Te 4 and Ge 2 Sb 2 Te 5 are in good agreement with previous theoretical simulation results (Supplementary Fig. 10) 47 , verifying the reliability of our predictions. Generally, the absolute S (|S|) value decreases with increasing carrier concentration, as in the case of Pb 2 Sb 2 S 5 , Pb 2 Sb 2 Te 5 , and Pb 2 Sb 2 Se 5 . At the same p-type doping level, the S value of Pb 2 Sb 2 S 5 is much larger than that of Pb 2 Sb 2 Se 5 and Pb 2 Sb 2 Te 5 ; meanwhile, Pb 2 Sb 2 Te 5 has the largest n-type |S| among these three compounds. By comparison, it is also observed that the p-type performance of Pb 2 Sb 2 S 5 and Pb 2 Sb 2 Se 5 is superior to the n-type, whereas Pb 2 Sb 2 Te 5 is much better under n-type doping. These results are well consistent with our above analysis of the total DOS around the Fermi energy.
Notably, the electronic relaxation time (τ) is a critical parameter in the calculation of electrical conductivity. Here, by comparing the σ/τ with available experimental data of σ 48 , the τ values of four GST compounds were obtained, namely, 10 fs for Ge 1 Sb 2 Te 4 and Ge 2 Sb 2 Te 5 , 12 fs for Ge 1 Sb 4 Te 7 and Ge 3 Sb 2 Te 6 , which were adopted for compounds with the same stoichiometry as GST compounds. Figure 7b, f shows the electrical conductivity of Fig. 7 Electronic transport properties analysis. a-d P-type and e-h n-type electronic transport properties of three Pb 2 Sb 2 VI 5 compounds as a function of carrier concentration (n) at 650 K: a, e Seebeck coefficient S, b, f electrical conductivity σ, c, g electronic thermal conductivity κ e , and d, h power factor PF. Pb 2 Sb 2 VI 5 compounds. Normally, a system with larger S would give lower σ 49 . Indeed, this happens for all three compounds under p-type doping. Specifically, at the same hole carrier density, the σ of Pb 2 Sb 2 Te 5 is slightly larger than that of Pb 2 Sb 2 Se 5 , while the Pb 2 Sb 2 S 5 has the smallest value; the Seebeck coefficient decrease in the order of Pb 2 Sb 2 S 5 , Pb 2 Sb 2 Se 5 , and Pb 2 Sb 2 Te 5 . For n-type doping, however, the fact is that Pb 2 Sb 2 Te 5 , which has the largest |S|, also possesses the highest σ among Pb 2 Sb 2 VI 5 compounds. Such a behavior can be attributed to the existence of multiple conduction valleys in Pb 2 Sb 2 Te 5 , which contribute significantly to electrical conduction when the Fermi level across the conduction band edge ( Supplementary Fig. 11). By substituting the constant relaxation time into the ratio of κ e /τ given by BoltzTraP, the electronic thermal conductivity as a function of carrier concentration was determined, as shown in Fig. 7c, g. Clearly, for both p-type and n-type dopings, the κ e of Pb 2 Sb 2 S 5 , Pb 2 Sb 2 Se 5 , and Pb 2 Sb 2 Te 5 all increases with increasing the carrier density, exhibiting similar trends to the σ−n relation (Fig. 7b, f). It is important to note that Pb 2 Sb 2 S 5 has very small n-type κ e , even at a relatively high carrier concentration of 1 × 10 20 cm −3 , which plays a significant role in achieving the high n-type TE performance. The power factor (PF) was further extracted from the Seebeck coefficient and electrical conductivity (Fig. 7d, h). The results demonstrate that the p-type doped PF of Pb 2 Sb 2 S 5 is moderately larger than that of Pb 2 Sb 2 Se 5 and Pb 2 Sb 2 Te 5 in the whole studied concentration range, owing to the higher S of the former. For n-type doping, the combination of high S and σ results in the much larger PF of Pb 2 Sb 2 Te 5 compared with Pb 2 Sb 2 S 5 and Pb 2 Sb 2 Se 5 . In addition, the p-type PF of Pb 2 Sb 2 Se 5 is larger than that of the n-type, which is opposite in Pb 2 Sb 2 Te 5 , verifying the fact that the TE performance of Pb 2 Sb 2 Te 5 in n-type doping is superior to p-type doping, while Pb 2 Sb 2 Se 5 is better when p-type doped. Supplementary Figs. 12-15 illustrates the electrical transport properties of some Se-based and Te-based compounds, which exhibit similar electrical transport behaviors to Pb 2 Sb 2 Se 5 and Pb 2 Sb 2 Te 5 , respectively.

Phonon thermal transport properties
The lattice thermal conductivity is one of the key quantities that determine the TE performance of materials. Based on the linearized phonon Boltzmann transport equation (BTE), the temperature dependence of κ l for the Pb 2 Sb 2 VI 5 system was estimated, as shown in Fig. 8a. Evidently, the κ l values of all compounds gradually decrease with the increase of temperature and exhibit a typical T −1 behavior in the temperature range of 100 −650 K, manifesting that the phonon transport is dominated by the phonon−phonon Umklapp scattering. Among them, for a given temperature, Pb 2 Sb 2 S 5 possesses the lowest lattice thermal conductivity while Pb 2 Sb 2 Se 5 gives the highest value. At 300 K, the κ l value is 0.55 W m −1 K −1 for Pb 2 Sb 2 S 5 , 0.71 W m −1 K −1 for Pb 2 Sb 2 Te 5 , and 1.83 W m −1 K −1 for Pb 2 Sb 2 Se 5 , which decreases to 0.25, 0.33, and 0.85 W m −1 K −1 at 650 K, respectively. To unravel the low-κ l mechanism in Pb 2 Sb 2 S 5 , we calculated the phonon group velocity of each mode derived from the phonon dispersions. In general, the slow phonon transport in the crystal lattice suppresses the phonon thermal conduction, thereby leading to small lattice thermal conductivity 50,51 . As seen from Fig. 8b, the group velocity basically decreases in the order of Pb 2 Sb 2 S 5 , Pb 2 Sb 2 Se 5 , and Pb 2 Sb 2 Te 5 , which is good agreement with the trend of mean sound speed estimated from the elastic moduli (Pb 2 Sb 2 S 5 (2392 m s −1 ) > Pb 2 Sb 2 Se 5 (2195 m s −1 ) > Pb 2 Sb 2 Te 5 (1919 m s −1 )) 19 . However, the fact is that the lattice thermal conductivity of Pb 2 Sb 2 Se 5 is smaller than that of Pb 2 Sb 2 Se 5 and Pb 2 Sb 2 Te 5 , indicating that the factor of sound speed fails to explain the lower κ l of Pb 2 Sb 2 S 5 .
Further, we estimated the Grüneisen parameter (γ) to measure the strength of phonon anharmonicity. A larger |γ| usually indicates the higher anharmonicity of lattice vibrations and stronger anharmonic phonon−phonon scattering, which would result in a shorter phonon lifetime and the lower thermal conductivity. Figure 8c indicates that Pb 2 Sb 2 S 5 has relatively larger |γ| values compared to those of Pb 2 Sb 2 Se 5 and Pb 2 Sb 2 Te 5 , especially in the low-frequency range (<75 cm −1 ), indicating the stronger anharmonic phonon interactions in Pb 2 Sb 2 S 5 . For better comparison, the total Grüneisen parameter (γ t ) obtained as the weighted sum of the mode contributions at different temperatures is illustrated in the inset in Fig. 8c. Clearly, the γ t of Pb 2 Sb 2 S 5 is much larger than that of Pb 2 Sb 2 Se 5 and Pb 2 Sb 2 Te 5 in the whole studied temperature range, causing shorter phonon lifetime (Fig.  8d) and thus lower lattice thermal conductivity.

TE performance
Using the derived electrical transport coefficients and lattice thermal conductivity, we calculated the carrier concentrationdependent ZT at 650 K for the three Pb 2 Sb 2 VI 5 compounds, as given in Fig. 9. Their ZT max values at different temperatures and corresponded optimal carrier concentrations can be found in Supplementary Fig. 16. For p-type doping, Pb 2 Sb 2 S 5 achieves the peak ZT value of 0.80 at the optimal carrier concentration of 7.5 × 10 19 cm −3 , moderately higher than that of Pb 2 Sb 2 Se 5 (0.52) and Pb 2 Sb 2 Te 5 (0.60) at their respective optimal carrier doping levels (1.7 × 10 20 cm −3 for Pb 2 Sb 2 Se 5 and 4.3 × 10 19 cm −3 for Pb 2 Sb 2 Te 5 ), which can be attributed to the larger p-type PF and lower κ l of the former. Importantly, the ZT max of n-type Pb 2 Sb 2 S 5 increases to 1.21 (n = 4.9 × 10 19 cm −3 ), comparable to and even slightly larger than that of the n-type Bismuth−Telluride-based system (1.1-1.2) 20,52 , suggesting its great promise of being a good mid-temperature TE material. The optimal ZT value of Pb 2 Sb 2 Te 5 in n-type doping (ZT max = 0.73 at the concentration of 3.6 × 10 19 cm −3 ) is relatively larger than that of p-type due to the combination of high n-type Seebeck coefficient and electrical conductivity; whereas the n-type TE performance of Pb 2 Sb 2 Se 5 is inferior to p-type (ZT max = 0.38 at the concentration of 1.68 × 10 20 cm −3 ) arising from the lower n-type S. This substantiates the fact that Pb 2 Sb 2 Te 5 is a better n-type TE material while Pb 2 Sb 2 Se 5 does better under p-type doping, which can also be applied to other Se/Te-based compounds.

DISCUSSION
In summary, by combing high-throughput DFT calculations and deep neural networks, we have trained two different ML models to predict the maximum ZT value and corresponding optimal doping type at different temperatures for the IV-V-VI family of layered semiconductors. These two models exhibit outstanding global fitting performance and have no overfitting and underfitting problems. Moreover, using the validation dataset, the comparison between DFT-calculated and ML-predicted TE properties further substantiates their robust reliability due to the exceptionally low MSE (~0.008), large R 2 (~0.95) and high prediction accuracy (>90%). Among this IV-V-VI crystal system, several compounds with high ZT max (>0.8 at 650 K), in particular Pb 2 Sb 2 S 5 under n-type doping (ZT max = 1.21), were successfully identified. It is demonstrated that the ultralow thermal conductivity is mainly responsible for the excellent TE performance of n-type Pb 2 Sb 2 S 5 . The significant lattice anharmonicity and short phonon lifetime suppress the phonon thermal conduction in Pb 2 Sb 2 S 5 , thereby leading to the exceptionally low-κ l (0.55 W m −1 K −1 at 300 K, and 0.25 W m −1 K −1 at 650 K). In addition, we also found that the Te-based semiconducting systems exhibit better n-type TE performance than that of p-type, whereas the Se-based compounds show a completely opposite situation. This is due to that n-type Te-based compounds possess a combination of high S and σ, while Se-based semiconductors have large Seebeck coefficients for p-type doping arising from the steep DOS near the VBM. This work not only provides several promising TE semiconductors for further studies and applications but also highlights the method to overcome the obstacles in the traditional trial-and-error ways, which is applicable to the discovery of various types of functional materials.

Density functional theory
Within the framework of the DFT, all calculations were carried out in the Vienna ab initio simulation package (VASP) 53 . The GGA-PBE functional was employed and the valence electronic configurations are s 2 p 2 , s 2 p 3 , and s 2 p 4 for group IV, V, and VI elements, respectively. We used the semiempirical DFT-D2 method 54 to better describe the van der Waals forces between IV-IV atomic layers. The force and energy convergence criteria were set to 1 × 10 −2 eV Å −1 and 1 × 10 −6 eV, respectively, and a kinetic energy cutoff of 400 eV was adopted for the plane-wave basis set. We employed the LOBSTER code to calculate pCOHP.

Machine learning
Among all the 70 compounds, the number of A 1 B 2 C 4 , A 1 B 4 C 7 , A 2 B 2 C 5 A 3 B 2 C 6 (A = Si, Ge, Sn, Pb; B = As, Sb, Bi; C = S, Se, Te) are 19, 18, 20, 13, respectively. By using the random module in the Numpy package of Python, we randomly selected 40 compounds as the input dataset for machine learning. Meanwhile, a fixed random factor of 31 was set to ensure the repeatability of the random process. Finally, we obtained 13 A 1 B 2 C 4 , 10 A 1 B 4 C 7 , 9 A 2 B 2 C 5 , and 8 A 3 B 2 C 6 input compounds. The code used to randomly generate the 40 input compounds is available online at https://github.com/AlienMarkWong/ztml/blob/main/ztml/tools.py. In this work, the machine learning was carried out using the package of ztml, which was developed by Python program language and PyTorch machine learning framework. All dataset and origin codes are available online at https://github.com/AlienMarkWong/ztml."

Transport properties
On the basis of the constant relaxation time and approximation and rigid band approach, the electrical transport properties were calculated by solving the semi-classic BTE as implemented in the BoltzTraP software 55 . The HSE06 electronic bandgaps obtained from our previous work 19 were used to obtain the more realistic TE properties. The temperature- Fig. 9 ZT calculation. ZT values of Pb 2 Sb 2 S 5 , Pb 2 Sb 2 Se 5 , and Pb 2 Sb 2 Te 5 as a function of a hole (p-type) and b electron (n-type) carrier concentrations at 650 K. The dotted lines indicate the optimal carrier densities that achieve the ZT max of each compound. dependent lattice thermal conductivity was determined from the singlemode relaxation time approximation using the ShengBTE package 56 , and the calculation details can be found in our recent research 19 . The detailed setting parameters in this study have been summarized in Supplementary Table 2. Note that the hexagonal conventional unit cell was employed to calculate the electrical and phonon transport properties for all compounds. For better reflecting the electronic structures, the primitive rhombohedral cell was adopted to calculate the density of states and band structures for A 1 B 2 C 4 and A 3 B 2 C 6 compounds.

DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author, and the dataset used to generate the ML models is available online at https:// github.com/AlienMarkWong/ztml.