High dielectric ternary oxides from crystal structure prediction and high-throughput screening

The development of new high dielectric materials is essential for advancement in modern electronics. Oxides are generally regarded as the most promising class of high dielectric materials for industrial applications as they possess both high dielectric constants and large band gaps. Most previous researches on high dielectrics were limited to already known materials. In this study, we conducted an extensive search for high dielectrics over a set of ternary oxides by combining crystal structure prediction and density functional perturbation theory calculations. From this search, we adopted multiple stage screening to identify 441 new low-energy high dielectric materials. Among these materials, 33 were identified as potential high dielectrics favorable for modern device applications. Our research has opened an avenue to explore novel high dielectric materials by combining crystal structure prediction and high throughput screening.


Background & Summary
A dielectric is an insulator that becomes polarized under the influence of an applied electric field. Materials exhibiting a high dielectric constant () are promising for energy storage applications. As an example, a parallel-plate capacitor's energy storage capability is approximately expressed as  U E 1 2 0 2 = , where  0 and  are the permittivity of vacuum and the material dielectric constant respectively, and E is the applied electric field. Materials with high dielectric constants, compared to materials with lower dielectrics, have the potential to store more charge per unit volume, which is critical to high-performance device fabrication as well as miniaturization. Many high dielectric materials, such as ZrO 2 1 , HfO 2 2 , Al 2 O 3 3,4 , Y 2 O 3 5 , SrTiO 3 6 and BaTiO 3 7 , have been extensively applied in microelectronic technologies. However, currently used dielectrics are inhibiting the development of cheaper and more efficient devices. For example, BaTiO 3 applied to multi-layer ceramic capacitors (MLCC) encounters certain limitations concerning the continuing miniaturization of circuit components, which require thinner dielectric layers while retaining the reliability of current advanced capacitors. Decreasing the particle size of BaTiO 3 introduces a so-called "size effect" wherein the ferroelectricity reduces with decreasing particle size and then vanishes below a specific critical size 8 . There also exists a technical challenge when implementing SiO 2 in complementary metaloxide-semiconductor (CMOS) and dynamic random-access memory (DRAM) devices. When the thickness of SiO 2 is reduced to a few nanometers, leakage current increases greatly because of the quantum tunneling 9 . The bandgap (E g ) is also a key property affecting device performance. In flash memory, a large band gap is necessary to satisfy the stringent leakage current specification (0~10 −9 A · cm −1 ) 10 . However, there exists a general inverse correlation between  and E g . Thus, the exploration of new high dielectrics requires a careful balance between the bandgap and dielectric constant.
To date, there are only a few hundred known materials with measured dielectric constants; this includes both organic and inorganic materials. The dielectric constants of the vast majority known inorganic compounds (~30,000) are currently unknown. Modern quantum mechanical calculations provide complementary approaches to experiments with far lower costs. In exploring new materials with high dielectric constants, high-throughput screenings of candidate materials based on density functional theory (DFT) calculations have become popular recently. Utilizing a high-throughput setting, Yim et al. 10 calculated properties for more than 1800 structures of binary and ternary oxides from the Inorganic Crystal Structure Database (ICSD) and generated a total property map of band gap versus dielectric constant. Petousis et al. 11,12 developed a computational infrastructure to perform high-throughput screening of dielectric materials based on Density Functional Perturbation Theory (DFPT) and constructed a database of dielectric tensors consisting of 1,056 inorganic ordered compounds. However, these studies focused on only already known materials. Recent successes in crystal structure prediction (CSP) has shown that it is possible to predict new materials prior to synthesis 13 . Sharma et al. 14 designed organic polymer dielectrics using a strategy of hierarchical modeling, and their efforts led to the successful synthesis of several new high- polymers. Zeng et al. 15,16 applied CSP methodology to explore high dielectrics in particular systems: hafnia-based oxides and Zr x Si 1−x O. These results inspired us to explore potential high dielectrics in an expanded chemical space.
In this study, we chose to focus on finding new ternary oxides possessing both high  and E g , given that binary oxides have already been well studied 17,18 . In this work, we combined high throughput calculations with crystal structure prediction methods to screen for target materials in a broad chemical space. Specifically, we chose chemical systems based on the combination of two types of metal oxides between group IIA/IIIA/IVA and group IVB, namely Ca(Be, Mg, Sr, Ba)O-Ti(Hf, Zr)O 2 , Al(Ga, In) 2 O 3 -Ti(Hf, Zr)O 2 , and Si(Ge)O 2 -Ti(Hf, Zr)O 2 . For each system of A m O n − BO 2 (A: IIA/IIIA/IVA; B: IVB), we performed variable composition CSP calculations to search for low energy structures which are likely to be (meta)stable if they can be synthesized. The low energy structures were then extracted and fed to our newly developed computational pipeline to screen their dielectric and band gap properties. As a result, we have provided a list of hypothetical materials which are favorable for high dielectric applications.

Methods
Theory and definitions. There are two mechanisms that contribute to a materials dielectric tensor: the ionic contribution which is a consequence of atomic displacement, and the electronic contribution which is a consequence of electron cloud distortion. As a result, the dielectric tensor can be represented as a sum of these two , where α and β denote the directions of the applied electric field and the resulting polarization in the Cartesian coordinate system. In most cases, the electronic contribution  αβ ∞ is much smaller than the ionic part and can be somewhat disregarded. The ionic dielectric tensor component  αβ 0 , due to the atomic displacements in the crystalline unit cell, is much more pronounced. Following ref. 19 , we can obtain the  αβ 0 value by summing the contribution from each vibrational phonon mode, where 0 Ω is the volume of the primitive cell, ω m denotes the frequency of each vibration mode. S m,αβ is the mode-oscillator strength tensor and can be obtained by  where κ M is the mass of the ion κ. It is obvious  is dependent on various fundamental quantities which include: Born effective charge, ionic mass, phonon frequencies, and cell volume. A detailed analysis of the influence of these quantities on both E g and  is provided in Data records section.
For the predicted stable and meta-stable structures, when they are applied to electronic devices, both dielectric constant and bandgap need to be considered. To evaluate the performance of materials, we used a fitness model proposed in a recent work 15 : where α is 1 for an insulator and 3 for semiconductors, E g is the bandgap and E gc is 4 eV, which represents the critical value to distinguish semiconductors and insulators. Using this descriptor, one can estimate the level of energy storage for any given material.

Workflow.
We performed the screening work by following the flowchart as summarized in Fig. 1 We then calculated the phonon spectrum based on DFPT. Those materials with imaginary frequency at Γ point (greater than 1 meV) were also discarded. Only the structures satisfying all criteria were stored for further dielectric calculation. Last, we used the fitness model mentioned in Eq. 4 to evaluate the level of energy storage of these materials.
www.nature.com/scientificdata www.nature.com/scientificdata/ Crystal structure prediction. We utilized the USPEX code 21 to search for new materials within the chemical space described above. For each system (e.g., CaO-TiO 2 ), we performed a CSP calculation with 50 individuals in each generation for 18 generations in total. We initiate the first generation of structures with structures of random composition and space group. The structures in the subsequent generations were generated according to the following variation operators heredity (30%), random (30%), mutation (20%) and permutation (20%), respectively. The maximum number of atoms in the unit cell is constrained to 24.
Structure optimization in DFT. For each structure generated from CSP, DFT calculation was performed with the Vienna ab intio software package (VASP) 22 , using the all-electron projector wave (PAW) method 23,24 . The exchange-correlation energy is treated with the generalized gradient approximation (GGA) within the Perdew-Burke-Ernzerhof (PBE) framework 25 . Preparation of input parameters and data extraction were done by the Python Materials Genomics (Pymatgen) package 26 . The structures were fully relaxed until the stress tensor is less than 3 kbar. The fitness function defined in our structure prediction is the negative of the ab initio free energy of the locally optimized structure. The formation energy for each structure is determined with respect to the most stable structure of the pure elemental solids. The formation energy versus chemical composition forms convex hull. In Materials Project, a phase diagram is constructed by comparing the relative thermodynamic stability of phases belonging to the system using an appropriate free energy model (with the corrections for gases, liquids and elements 27 ), the convex hull is taken to construct the phase diagram 28,29 . In our work, we combined our results and structures from Materials Project to compute the convex hull via Pymatgen, this is reasonable since we used the same functional as in Materials Project. It was found that observed metastable phases are usually not more than 0.1-0.2 eV/atom higher in energy than the ground-state structure 30 . Metastable structures may exist at ambient conditions by means of specific techniques such as doping 9 or in nano crystalline states 31 . Herein structures with fitness values higher than 0.1 eV were discarded. All of the obtained structures were fully relaxed until the interatomic forces and the total energies (per atom) are smaller than 0.01 eV/Å and 10 −6 eV respectively. The plane-wave basis sets have a kinetic energy cutoff of 600 eV. The grid density of k-mesh by reciprocal volume was set to be 200 Å −3 in Pymatgen.
Dielectric constant and bandgap calculation. The DFPT methodology, as implemented in the VASP code 32 was applied to predict dielectric constants. This method has been widely used to calculate dielectric constants 11,12 and refractive indices 33 . The dielectric tensor is composed of two parts: electronic and ionic contributions. Since the dielectric response calculated by DFPT corresponds only a monocrystalline material, for simplicity, we adopted the same approximation method proposed in Petousis' work 11 , in which the polycrystalline dielectric constant was estimated by averaging the eigenvalues of a monocrystalline dielectric tensor. A large bandgap is another important indicator when selecting industrial high dielectrics. The reported material bandgaps were calculated through DFT using the generalized gradient approximation (GGA). It is worth mentioning that a K-point mesh of high density is required for calculation of , so the K-point reciprocal density is set to be 300 Å −3 in Pymatgen, which is sufficient to reach the required computation accuracy. It is also worth nothing that GGA and LDA underestimate the band gap of oxides 34,35 , usually a GGA + U method is supposed to be implemented to reduce the error. The determination of U parameter is critical and usually achieved by an empirical way. In Pymatgen, U parameters are only available for a few oxides and fluorides. Here we use default setting of Pymatgen, the U parameters are not applied to our studied systems. www.nature.com/scientificdata www.nature.com/scientificdata/ Data records File format. The input files containing important parameters and calculation results are stored in a JSON file, which has been uploaded to figshare 36 and Materials Cloud 37 . For each material, one can check the properties by accessing values through keys, such as "e_poly", "e_total" and "e_electronic", corresponding to polycrystalline dielectric, total dielectric tensor and electronic contribution tensor, respectively (referred to Table 1). Other parameters can be found by accessing the "meta" key as listed in Table 2. effect of cations on E g and . Though there exist many factors which may affect a material's dielectric properties, we have inferred a general trend by analyzing the data from our simulation. In finding this trend, we chose a few groups of structures belonging to the same prototype with different chemical substitutions of metal cations. In a ternary oxide A m O n − BO 2 (A: IA/IIA/IIIA; B: IVB) system, we considered the roles of A and B sites separately.
For the A site, we chose two groups of materials, Ama2 IIA-TiO 3 and P2 1 /m IIA-ZrO 3 . As seen in Fig. 2a,b there is not any obvious correlation between the band gap and the dielectric constant for these two groups of materials. In Fig. 2a, IIA-TiO 3 (A = Mg, Ca, Sr, Ba) the reported structures are orthorhombic. The bandgap increases as the cation atomic number increases from Mg to Sr and then drops down for the Ba cation. A similar trend of bandgap can also be found in Fig. 2b. The dielectric constants in both groups follow completely different trends. This indicates that the factors affecting dielectric tensor are complicated. A more in-depth discussion of this complexity will follow.
For the B site, the dielectric constant and band gap of Ti/Zr/Hf-based oxides share a similar trend regardless of the difference of the cation. In Fig. 2c, three materials -GeTiO 4 , GeZrO 4 and GeHfO 4 share the same prototypical structure in orthorhombic Cmmm symmetry. Their bandgaps increase as the atomic radius increases; however, the dielectric constants follow an inverse trend. This phenomenon was also verified by the comparison of three orthorhombic Amm2 MgBO 3 (B = Ti, Zr, Hf) structures in Fig. 2d. It is worth noting that dielectric constant of GeTiO 4 (209.11) differ significantly from GeZrO 4 (26.63) and GeHfO 4 (21.80) in spite of their structural similarity, demonstrating that the composition of materials can easily influence polarization.
To better understand the atomic mechanism, we analyzed the primary contribution due to ionic vibration for all three materials in GeBO 4 . From Eq. 2, we know the ionic dielectric contribution is not only related to mode frequency but also the Born effective charges (Z * ), which are listed in Table 3 for GeBO 4 compounds. The Z * values for Ge in all compounds are larger than their nominal charges (+4 a.u.), indicating the mixed covalent and ionic character of Ge atoms 38 . The decrease of Z * values of Ge as the increase of the atomic number of B   To illustrate the components of the dielectric tensor at an atomic level, we chose Cmmm-GeBO 4 as an example to analyze the dielectric contribution of each vibration mode in Table 4. Cmmm-GeBO 4 has 6 atoms in the primitive cell, which results in 18 phonon branches, of which only 8 infrared (IR) active modes have non-zero mode-oscillator strength and contribute to the ionic dielectric constant. In GeTiO 4 , the acoustic modes are dominated by the vibration of the heaviest atom, Ge with a mass (M) of 73. The large dielectric constant of GeTiO 4 is mainly due to the lowest-frequency IR mode at 55 cm −1 , which is caused by low-frequency vibration of Ti atoms along c axis (as seen in Fig. 3a). However, this mode is absent in GeZrO 4    www.nature.com/scientificdata www.nature.com/scientificdata/ have much higher frequencies, see Fig. 3b,c as examples, which only contribute slightly to dielectric constants. Combined with the large value of Z * 33 of Ti, the factors including light weight, large displacement, and large Born effective charges of the Ti atom, collectively distinguish GeTiO 4 from other materials in the prototype. Rignanese et al. 39 also found that the variation in Ti oxides differs distinctively from Hf and Zr oxides and can be attributed to the difference in interatomic force constants. Similar information has been listed for all structural entries reported in this work. We believe such comprehensive analysis can yield a better understanding of the structure-dielectric property relation.  www.nature.com/scientificdata www.nature.com/scientificdata/ High dielectrics. Our primary goal is to find a material with E g and  larger than BaTiO 3 , the leading material used in MLCC industry. BaTiO 3 transforms from the high-temperature paraelectric cubic phase (Pm-m 3 ) to the low-temperature ferroelectric tetragonal phase (P mm 4 ) at 406 K 7 . The tetragonal polymorph has better ferroelectric, piezoelectric, and thermoelectric properties; hence, it is the most widely used polymorph in industry. Though the dielectric tensor is strongly dependent on temperature, we limited our study to materials at 0 K for simplicity. Most of the structures have both larger bandgaps and higher dielectric constants than tetragonal BaTiO 3 (E g = 1.72 eV,  = 17.76). Among them, I mmm 4/ -Sr 3 Hf 2 O 7 achieves the highest fitness value ( = 522.12 and E g = 3.68 eV). This hypothetical structure is also available in the open Materials Project (MP) database (mp − 779517) without reported values on dielectric constants. However, after we checked the phonon, this structure is evidenced to be dynamically unstable due to the existence of imaginary phonon at X (1/2, 1/2, 0) and P (1/2, 1/2, 1/2) points. Similar conclusions were also mentioned in a recent study 40 . Further check the Sr-Hf-O system in MP database, two other structures, P4mm-SrHfO 3 and I4/mmm-Sr 2 HfO 4 also stand out, with the dielectric constants of 246.36 and 159.05, respectively. Their crystal structures are very similar, as shown in Fig. 4. They both crystallize into tetragonal and have Hf centered octahedra. Their bonding length (2.06 Å, 2.07 Å and 2.06 Å for Hf-O) and bandgap are almost the same. This result raises the possibility of looking for high dielectrics in these kinds of structures. In addition, P2 1 /m-MgZrO 3 also shows high dielectric constant equals to 313.05, with bandgap of 3.64 eV. Given that DFT calculation systematically underestimates the fundamental bandgap 41 by 10-40% 15 , it is likely that P2 1 /m-MgZrO 3 's band gap is larger than 4 eV, which can meet the requirement of application of CPU (4 eV) and other CMOS devices.
In the CPU and DRAM industry, dielectrics require  > 30 for further device scaling 10 . With the additional limitation of E g > 3 eV for DRAM and E g > 4 eV for CPU, the qualified materials from our database are listed in

technical Validation
In our calculation, 441 structures of ternary oxides were screened out, as shown in Fig. 5. Comparing with materials data in the Materials Project, our newly generated dataset has both larger bandgap and dielectric constant. Most of the ternary oxides are above  * = E 40 To validate the reliability of our study, we compared the results with Petousis's work 11 , as shown in Fig. 6, in which they checked the computational error with experiments, and validated the dielectric constants of most materials deviate less than ±25% from experiments at room temperature. Here our study reached the same accuracy. In Fig. 6, most of the dielectric constants in our study (black) and Petousis' study (red) overlap. We also did a comprehensive comparison by computing the mean absolute error (MAE), root Mean Square Error (RMSE), mean absolute relative deviation (MARD) 12 , Pearson correlation and Spearman rank-order correlation coefficient. The results are: MAE = 2.03, MARD = 19.50%, RMSE = 2.77, Pearson correlation coefficient = 0.90 and Spearman rank-order correlation coefficient is 0.84 with p-value of 2.4 × 10 −14 . Pearson correlation coefficient assesses the linear correlation between two variables and Spearman coefficient is a measure of rank correlation between two sequences of ranking features. Both of these two coefficients are close to 1, indicating that the general consistency between calculation and experiment. We believe our method should be as reliable as the previous work 11 . Given that many factors (e.g., temperature, pressure, impurities, vacancies, interfaces, and surface charges) may cause the theoretical values deviate from experiments 11 . It is reasonable that the theoretical values are not exactly the same with experiment. However, the calculation would still provide an inspiration for further research.

Usage Notes
In this paper, we employed a first-principles crystal structure prediction method to perform a systematic structural study on a series of ternary oxides systems. For the low energy structures discovered from our prediction, we developed an automated computational screening scheme to evaluate their dielectric and electronic properties. This work generated a library of hypothetical materials which are promising for high dielectric applications. Among them, P2 1 /m-MgZrO 3 achieves the best theoretical performance as it has both large bandgap (3.64 eV) and large dielectric constant (313.05). The rest 32 structures (such as Pnma-CaHfO 3 , Cm-SrHfO 3 and R3 -Zr 6 BeO 13 ) with bandgap above 3 eV and dielectric constant above 30, may be useful in CPU or DRAM devices. Their structural, dielectric, and thermodynamic properties are archieved in a supplementary JSON file. In addition, we investigated the factors affecting the dielectric properties, pointing out that the dielectric properties are affected by multi-factors including vibration of atoms, Born effective charge, and atomic mass. Among these newly discovered structures, many of them were predicted from the first principles crystal structure prediction for the first time, suggesting that the crystal structure prediction methods as a complementary approach to the current high throughput screening based on data mining. The computational scheme developed here is entirely general to be used to search for other functional materials as well. www.nature.com/scientificdata www.nature.com/scientificdata/ code availability All reported crystal structure prediction calculations were performed using the USPEX code, which is based on evolutionary algorithms to predict structures with only elemental information 21,42 . Relaxation of structures and DFPT method were carried out by the VASP code 22 .  11 (red). Outliers with a deviation larger than 50% relative to experimental value were marked.