Computational screening of high-performance optoelectronic materials using OptB88vdW and TB-mBJ formalisms

We perform high-throughput density functional theory (DFT) calculations for optoelectronic properties (electronic bandgap and frequency dependent dielectric function) using the OptB88vdW functional (OPT) and the Tran-Blaha modified Becke Johnson potential (MBJ). This data is distributed publicly through JARVIS-DFT database. We used this data to evaluate the differences between these two formalisms and quantify their accuracy, comparing to experimental data whenever applicable. At present, we have 17,805 OPT and 7,358 MBJ bandgaps and dielectric functions. MBJ is found to predict better bandgaps and dielectric functions than OPT, so it can be used to improve the well-known bandgap problem of DFT in a relatively inexpensive way. The peak positions in dielectric functions obtained with OPT and MBJ are in comparable agreement with experiments. The data is available on our websites http://www.ctcms.nist.gov/~knc6/JVASP.html and https://jarvis.nist.gov.


Background & Summary:
Optoelectronic properties, such as fundamental electronic bandgaps and dielectric functions, provide important material information in designing optoelectronic devices for a variety of applications, such as photovoltaic cells 1 , light emitting diodes 2 , transparent electronics 3 , dynamic random access memory 4 , astronomical devices 5 , and smaller and faster devices 6 .For industrial advancement in these industries, there is a great need to synthesize cheaper, more efficient, and tunable devices.Designing these new materials requires knowledge of already available ones, which can then be tailored for a particular application.Databases dedicated to optoelectronic materials meet this need.However, such user-friendly and easy-accessible public databases are still in the development phase.Computationally, it is much easier to provide properties for thousands of materials in a systematic way than to do so through experiments.Density functional theory (DFT) is the tool of choice to compute these properties in a high-throughput manner.
It is important to note that the term 'bandgap' generally refers to the fundamental gap and not the optical gap.The difference between these quantities could be small in semiconductors but significant in insulators 7 .Materials Genome Initiative based projects such as the Materials Project (MP) 8 , the open quantum materials database (OQMD) 9 , and AFLOW 10 have successfully enumerated bandgaps of hundreds of thousands of materials using the generalized-gradientapproximation Perdew-Burke-Ernzerhof functional (GGA-PBE) 11 and +U corrections.MP has also calibrated the static dielectric constant of 1056 materials using density functional perturbation theory (DFPT) 12 , but frequency-dependent dielectric functional data is missing.Although PBE provides great insights in distinguishing non-metallic materials, the bandgaps of materials are generally underestimated typically by 30 % to 100 % 13,14 , hindering its practical application in the fields of semiconductors, photovoltaic materials, and thermoelectric devices.Other systematic databases of optoelectronic materials include Zunger et al. 15 work for photovoltaic materials using Green function screened coulomb (GW) calculations, and Castelli et al 16 work on energyharvesting materials using the Gritsenko-Leeuwen-Lenthe-Baerends (GLLB-SC) functional.GW is much more reliable than PBE in computing optoelectronic properties.However, its high computational cost severely limits its application in high-throughput screening.Catelli's work is also limited, containing information for only about 2400 materials.
Various techniques have been used to improve bandgap prediction at a moderate computational cost, including Chan and Ceder (delta-sol) 14 , modified Becke-Johnson potential [17][18][19] , and empirical fits by Setyawan et al. 20 .Recently, the modified Becke-Johnson (MBJ) potential introduced by Tran and Blaha [17][18][19] has been proven to improve the bandgap description in a computationally efficient way.This potential has been successfully used in characterizing electronic properties of nonmagnetic transition-metal oxides and sulfides, metals, (anti) ferromagnetic insulators, dielectric and topological insulators 19,[21][22][23][24] In this work, we have identified a sweet spot between the computational expense and accuracy for describing optoelectronic properties by using MBJ potential in a high-throughput approach.At present, we have 7358 MBJ bandgap and frequency-dependent dielectric function entries, and the database is still growing.Additionally, we computed 17805 bandgaps and frequency-dependent dielectric functions using OptB88vdW (OPT) for comparison purposes.OPT is a Van der Waal-dispersion functional (vdW-DF) with non-local correction, which can predict crystal-structure geometry, and is essential to the calculation of optoelectronic properties, especially for anisotropic materials.The OPT functional has not only been proven to reduce error in lattice constants, but its combination with MBJ functional is known to predict bandgaps of materials 25 successfully.In addition, the error in lattice constants can significantly impact the error in optoelectronic properties such as refractive indices, and hence birefringence of non-cubic class materials.

Methods:
The methodology supporting the current work consisted of several steps, including density functional theory calculations and experimental validation of a few data points.The overall processes are shown in Fig. 1 and each step is explained in detail below.

Density functional theory setup:
The DFT calculations are performed using the Vienna Ab-initio Simulation Package (VASP) 27,28 and the projector-augmented wave (PAW) method 29 .Please note commercial software is identified to specify procedures.Such identification does not imply recommendation by the National Institute of Standards and Technology.The crystal structures were obtained from the Materials Project (MP) DFT database.More specifically, we obtained all the crystal structures with less than 30 atoms per unit cell from MP, and the potential candidates for low dimensional materials using lattice-constant criteria 30 and data-mining approaches 31 .We convert the crystal cells into its primitive cell representation before a DFT calculation.If the primitive cell and corresponding conventional cell of a crystal-structure have the same number of atoms, then we prefer conventional cell as the DFT input structure.
As the error in lattice constants can significantly impact the error in optoelectronic properties, such as refractive indices and birefringence of non-cubic class materials, we re-optimized MP geometric structures using the OPT functional 26,32 .PBE is known to report good lattice constants for materials, but its applicability to vdW-bonded materials is questionable.Recently, around 5000 materials have been proposed to be vdW-bonded using lattice-constant criteria 30 and data-mining approaches 31 , signifying that a correct treatment of the vdW interactions is more important than previously thought.OPT is part of vdW-DF functional, which is a non-local correlation functional that approximately accounts for dispersion interactions.OPT has been recently determined to perform well for bulk solids as well as vdW bonded structures 26 .In a recent work by Tawfik et al. 33 , OPT was proven to be one of the most accurate functionals to capture vdW interactions among several other methods.We performed plane-wave energy cut-off and k-point convergences with 0.001 eV tolerance on energy.We assumed that satisfactory energy convergence would extrapolate to reasonably converged optical property calculations as well.The structure relaxation with OPT functional was obtained with 10 -8 eV energy tolerance and 0.001 eV/Å forceconvergence criteria.
Next, we computed bandgap and optical properties with both OPT and MBJ in subsequent DFT calculations.In the MBJ calculations, we started from OPT-relaxed structures because the MBJ functional is a potential-only functional, which implies that we cannot compute Hellmann-Feynman forces with MBJ, hence ionic relaxations were not performed using MBJ.The OPT functional has not only been proven to reduce error in lattice constants, but its combination with MBJ functional is known to predict correct bandgaps 25 as shown for few vdW bonded materials.
The MBJ potential is given by: where c is a system-dependent parameter, with c = 1 corresponding to the Becke-Roussel (BR) potential , which was originally proposed to mimic the Slater potential, the Coulomb potential corresponding to the exact exchange hole 34 .For bulk crystalline materials, Tran and Blaha proposed to determine c by the following empirical relation: Vcell is the volume of the unit cell.The c-parameter was automatically determined in VASP through a self-consistent run.
To obtain the optical properties of the materials, we calculated the imaginary part of the dielectric function from the Bloch wavefunctions and eigenvalues 35,36 (neglecting local field effects).We introduced three times as many empty conduction bands as valance bands.This treatment is necessary to facilitate proper electronic transitions.We choose 5000 frequency grid points to have a sufficiently high resolution in dielectric function spectra.The imaginary part is calculated as: where e is electron charge, is the cell volume, is the Fermi-weight of each k-point, are unit vectors along the three Cartesian directions, is the cell-periodic part of the pseudopotential wavefunction for band n and k-point k, q stands for the Bloch vector of an incident wave, c and v stand for conduction and valence bands,  stands for eigenvalues of the corresponding bands respectively.The matrix elements on the right side of Eq. ( 3) capture the transitions allowed by symmetry and selection rules 37 .The real part of the dielectric tensor 1  is obtained by the usual Kramers-Kronig transformation 35 where P denotes the principle value, and η is the complex shift parameter taken as 0.1.
It is to be noted that in conventional DFT, excited states are not optimized, hence many-body interactions are missing.To get the excited state optical properties, a high-level calculation such as the Bethe-Salpeter equation (BSE) 38 is needed, however, the conventional DFT data remains useful for qualitative comparison.
Other experimental data were taken from Aspnes et al. 39 for validation.1T-SnSe2 (40 nm thickness) was grown on a GaAs (111) substrate by molecular beam epitaxy (MBE) 40
First, a user selects the desired element/elements in the periodic table provided at the website and clicks on the 'Search' button (as shown in Fig. 2).This procedure generates a data table on the webpage consisting of the calculation-identifier, the formula of the structure, the functional used in the calculation, bandgap, mechanical property, space group of crystal and energetics of the system.Next, the user clicks on the calculation identifier for a formula, space group and functional and property data for detailed information.The detailed page is provided in the format such as https://www.ctcms.nist.gov/~knc6/jsmol/JVASP-1174.htmlwhere '1174' denotes an identifier and can assume any JARVIS-ID.The particular webpage consists first of an interactive crystal visualization, then geometric properties such as computational XRD, bandstructure and the optical properties consisting of dielectric function and refractive index.We also provide a classification of materials based on their OPT and MBJ based bandgaps, and static refractive index data as shown in Fig. 3. Clicking on one of the options in Fig. 3 results in materials with classified properties.
For example, clicking on 'Classification of 3D-bulk materials based on TB-MBJ-bandgap' produces a table with materials that have a bandgap in rage from 0 to 1, 1 to 2, 3 to 4 eV and so on.Each material is hyperlinked to its specific webpage.

Code availability:
The code used in this work is provided at https://github.com/usnistgov/jarvis .There are two main scripts in this folder-1) joptb88vdw.py and 2) master.py.The joptb88vdw.py script heavily utilizes the Pymatgen 8 and ASE 42

Technical Validation:
As discussed in the method section, the crystal structures were obtained from the Materials Project, which uses PBE for structure optimization.We re-optimize the MP crystal structures with the OPT functional.Most of the MP crystal-structures have Inorganic Crystal Structure Database (ICSD) IDs, which can be used to obtain experimental lattice parameter information.Hence, we compute PBE and OPT based mean absolute error (MAE) and root-mean-squared error (RMSE) of all the available structures in our database.There are presently 10,052 structures with ICSD IDs in our database.We further classify these structures into predicted vdW and predicted non-vdW structures.We use the lattice-constant criteria 30 and data-mining approaches 31 to identify vdW structures.All the remaining structures are treated as non-vdW bonded.The predicted vdW bonded materials can have vdW bonding in one, two or three crystallographic directions.It is to be noted that exfoliation energy is calculated to predict vdW bonded in materials 30 , but the two heuristic methods mentioned above can act as pre-screening criteria for determining vdW bonded structures.
Out of 10,052 structures, 2,241 were predicted to be vdW bonded.In addition to the overall MAE and RMSE, we also calculate the same for these two classes of materials as shown in for MBJ gaps versus experimental ones were found by Tran and Blaha et al. 18 , validating our methodology.We calculate two MAEs for the data: 1) MAE computed with respect to experiment using all available data for each method, 2) MAE computed with respect to experiment using only data for materials that have results available in all three DFT methods.Both of these values are shown in Table .3. Both of the MAEs are found to show similar results.It is to be noted that our geometric optimization was performed with OPT, which is different from the one used by Tran-Blaha et al. 18 This explains small differences in MBJ gaps found between our work and by them.
Due to the inadequacy of experimental data for all the materials, it is intractable to calculate the error for the whole database.Also, some of the experimental bandgaps were averages of multiple experiments.
The MBJ potential is found to be more suitable for large bandgap insulators and can change the energetics of bands in metallic systems also.We found that some of the materials predicted as metallic using PBE are semiconductors using MBJ, such as Ge and GaAs.To better understand the source of error in the bandgap evaluation, we followed the Materials Project (MP) approach (https://www.materialsproject.org/docs/calculations#Accuracy_of_Band_Structures) and determined a "shifted" MAE for our bandgap evaluations.This treatment allows removing the effect of the DFT systematic underestimation of the gap.To do this, we first fitted a linear equation for the OPT and MBJ data with respect to experiment.The slope was found to be 1.17 and 1.44 for MBJ and OPT, respectively.The slope was then used as a scaling parameter to scale-up the OPT and MBJ data.After the data have been shifted, the MAE with respect to experiment was found to be 0.42 for MBJ, 0.69 for OPT, to compare with the MP result of 0.6.We also calculated the Spearman's coefficient (SC), to measure monotonicity in the bandgap data from different methods compared to experiment.High value for SC suggests that the trends are similar to those in the experimental data.The highest value was obtained for HSE06 (0.97), followed by MBJ (0.94) and AFLOW (0.94).Additionally, we compare the computational time taken during HSE06, MBJ and OPT calculations for a few cases.We find that the MBJ takes about an order of magnitude more computational time than OPT, while HSE06 takes an order of magnitude more computational time than MBJ.A comparison table for computational time for calculations is given in supplementary information (Table.S1).
Next, to understand the trends in the whole database, we compared the bandgaps obtained from the OPT and MBJ as shown in Fig. 4a.It is to be noted that many of our calculations for OPT and MBJ are still running; we compare data which are common in both OPT and MBJ only.The blue circles show the MBJ bandgaps while the green ones represent the OPT bandgaps.We also plot the experimental results (red dots) for a small subset (from Table .3) in the Figure 4a.More specifically, we plotted the three types of data (MBJ, OPT and experiment) against the MBJ results.
As the MBJ data are plotted against themselves, they produce a straight line along the diagonal of the plot.For a perfect agreement between OPT and MBJ, all the OPT data should lie on the same straight line.However, most of the OPT data is below the straight line, representing an underestimation of the bandgap.Compared to experiments, the MBJ results describe bandgaps much better than the OPT results.This is shown by the fact that up to about 6 eV most of the experimental data lie on the figure diagonal, while the OPT results lie systematically under it.
The relative difference in OPT and MBJ in bandgap is shown in Fig. S1a.The percentage difference in values for OPT and MBJ are calculated as: To avoid division by very low or zero values, we calculated percentage differences for materials with OPT gap more than 1 eV.The upper bound of the relative changes in bandgap can range from 30 % up to more than 100 %.
Similar to the bandgap data, the static refractive index in x, y and z-directions are also compared for OPT and MBJ.The static refractive index is related to static dielectric function data as . The static refractive index in x, y and z directions are shown in Fig. 4 b-d.Like the MBJ bandgaps, the MBJ refractive indices are plotted against itself to give a straight line, which can be used for comparison.A subset of OPT and MBJ static dielectric constant data is shown in Table 3 and compared to experiments.The MAE values of OPT and MBJ static dielectric constant in the x-direction are 3.2 and 2.6 respectively, showing the overall superiority of MBJ compared to OPT.It is to be noted that only interband transitions and not intraband are accounted for in our calculations, hence Drude-like transitions are not taken into account 37 .It implies that our dielectric function data should be more accurate for high bandgap materials 18 .Also, in cases where OPT predicts metallic behavior while MBJ predicts semiconductor/insulating, the dielectric function and therefore the static refractive index would be different between OPT and MBJ, because Drude like transitions are not captured in present work.As MBJ bandgaps are more reliable than OPT, the MBJ optical data can be considered more accurate than OPT, especially for low bandgap materials.A very high difference (more than 100 %) in OPT and MBJ refractive index was observed for materials such as ZnCoF4 (as clearly seen in Fig. S1b-S1d) because of the very different bandgaps obtained using OPT and MBJ.We also find that the relative differences between OPT and MBJ refractive indexes are much smaller compared to those for bandgaps.
Interestingly, while OPT underestimates the bandgaps compared to experiments, the predicted dielectric functions are relatively close to the experimental measurement, especially for highbandgap materials.It is because our methodology describes inter-band transitions well but is not suitable for intra-band transitions.Lastly, we also observe that the MBJ static refractive index data are generally lower than the OPT data, as noted in Table 4.  unphysical transitions at low energies.However, overall spectrum patterns are similar for OPT and MBJ at higher energies.As observed in Fig. 5, the DFT intensity differs from experiment for some peaks, which can be explained based on 1) the difference in temperature between the experimental setup (generally at room temperature) and the DFT simulation (always at zero Kelvin), and 2) the surface roughness of the sample, which is not included in the calculation.Such differences in peak intensities compared to experiments are also observed for other high-level DFT based methods 48 .
In a nutshell, our dielectric function data can be used to complement experimental spectra for instance to allowing to distinguish various peaks.In addition to the peak positions, the DFT data can be used to characterize the orbital nature of the associated electronic transitions, which can provide physical insight into a phenomena 49 .A detailed investigation of all the optical transitions for all the materials will be pursued in future.Other quantities such as refractive index, absorption coefficient, electron-energy-loss spectra (EELS), optical conductivity can be calculated with the dielectric function data.As the dielectric function for materials can be anisotropic, we also provide the dielectric function data in xx, yy, zz, xy, yz, and zx directions, which can be used to calculate frequency dependent birefringence of materials.

Usage notes:
The database presented here represents the largest collection of consistently calculated optoelectronic properties of materials using density functional theory assembled to date.We anticipate that this dataset, and the methods provided for accessing, it will provide a useful tool in fundamental and application-related studies of materials.Our actual experimental verification provides insight into understanding the applicability and limitation of our DFT data.
. The GaAs substrate was deoxidized in-situ under ultra-high vacuum (4 x 10 -8 Pa) at 690 ºC for 3 min and annealed under a flux of Se for 20 min, which provides a smoother growth surface.After the substrate was cooled down and held at the growth temperature of 200 ºC for 40 min, sixty-three layers (≈ 40 nm) of 1T-SnSe2 were grown by a simultaneous incidence of Sn and Se at a rate of 1/38 layer per second based on Reflection High-Energy Electron Diffraction (RHEED) oscillations.The beam equivalent pressures (BEPs) for Sn and Se, supplied by using Knudsen cells, are 2.67 x 10 -6 Pa (2 x 10 -8 Torr) and 2.67 x 10 -4 Pa (2 x 10 -6 Torr), respectively.The single phase and high crystallinity of SnSe2 were confirmed by X-ray diffraction (XRD).Bulk MoS2 was commercially purchased from SPI Supplies41 .Please note the commercial product is identified to specify procedures.Such identification does not imply recommendation by the National Institute of Standards and Technology.The dielectric functions were obtained from spectroscopic ellipsometry (SE).The SE measurements were performed in a nitrogen gas-filled chamber at room temperature on a vacuum ultraviolet (UV) spectroscopic ellipsometer with a light photon energy from (0.7 eV to 8.0) eV in steps of 0.02 eV for SnSe2 and from (1.0 eV to 9.0) eV in steps of 0.01 eV for MoS2, at an angle of incidence of 70°.

Fig. 1
Fig. 1 Flowchart for calculating bandgap and dielectric function of materials using density

mbj_imagxx, mbj_imagyy, mbj_imagzz, mbj_imagxy, mbj_imagyz, mbj_imagzx Energy
The master.pytakes the argument of the identifier of the database or the structure in 'VASP's 'POSCAR' format.The master script can tackle both PBS and SLURM formalism used in HPC architecture.All data computed in this work can be found at https://www.ctcms.nist.gov/~knc6/JVASP.htmlandhttps://jarvis.nist.gov/.A JSON file is also available in a Figshare repository (Data Citation 1).Key variables for the JSON file are shown in Table1.They include identifiers, structure, bandgaps and dielectric function information with OPT and MBJ methods.The dielectric function data in xx, yy, zz, xy, yz, and zx directions can be used for analyzing the anisotropic nature of the dielectric function.The opt_gap and mbj_gap data can be used to analyze the effect of DFT methodologies codes for file and data management.The joptb88vdw.pygenerates a series of folders and JSON files starting with keyword 'ENCUT' and 'KPOINT' denoting the convergence test.An example of an actual calculation is also provided in the folder.After the convergence, the script carries out main geometric relaxation, band structure, optical property with OPT and optical property with MBJ calculations.onbandgap of a material, where available.The 'jid','mpid'and 'cif' mentioned in Table.1 belong to string-type, while 'opt_gap' and 'mbj_gap' belong to float-type data.The 'mpid' facilitates easy linking to the Materials-project database.Other values such as 'opt_en', 'mbj_en', 'opt_realxx','opt_imagxx', 'mbj_realxx' and 'mbj_imagxx' are arrays with float-type values.The 'real' part in these keys corresponds to real part of dielectric function while 'imag' corresponds to imaginary part of dielectric function in the respective directions.The Pymatgen code can be used to process the 'cif' string-type data.The key 'opt_en' has the same array-size as that of dielectric function data with OPT such as 'opt_realxx', 'opt_imagxx', while 'mbj_en' has the same arraysize as that of dielectric function data with MBJ such as 'mbj_realxx' and 'mbj_imagxx'.Packages such as Matplotlib and Gnuplot can be used to plot these arrays and visualize the data.We provide -dependent imaginary part of dielectric function in xx, yy, zz, xy, yz and zx directions using MBJ

Table 2
. As evident from Table.2, the OPT seems to improve lattice constants in a, b, c crystallographic directions compared to PBE.Significant improvement in lattice parameters is observed for predicted vdW materials, especially in c-directions.For predicted non-vdW materials, the errors are similar for OPT and PBE, suggesting that OPT can improved lattice constant predictions for vdW materials without much affecting the predictions for non-vdW bonded materials.Our PBE MAE value for all the materials (0.13 Å) are similar to that obtained by Jianmin et al43 (0.135 Å)for a smaller set of materials.

Table 2 .
Mean absolute error (MAE) and root-mean-squared error (RMSE) in a, b and c 44,45computed for all materials, only for predicted vdW bonded materials and only for predicted non-vdW bonded materials, using Material's project PBE and JARVIS-DFT OPT functional.available.Table3displays such a comparison for 54 materials and shows the corresponding results from MP, OQMD, and AFLOW (PBE/PBE+U based data).We also provide identifiers across different databases to facilitate comparison.In general, the values of our OPT and MBJ bandgap data are higher than MP's PBE data, with MBJ data being closer to experiments44,45.The mean absolute error (MAE) of MBJ with respect to experimental data is 0.51 eV, while that of OPT is 1.33.The OPT has MAE similar to MP, OQMD, and AFLOW because all of them are primarily PBE based calculations.However, significant improvement is shown with MBJ.Similar results

Table 3 :
Comparison of bandgaps obtained from OPT functional and MBJ potential schemes compared with experimental results and DFT data available in different databases.Materials, in MBJ and experimental spectrum.This is likely because when the bandgap is severely underestimated (such as for OPT), the theory predicts inter-band transitions (e.g., valence to conduction band) that simply don't exist because the gap is too high in reality.Such peaks are absent in MBJ based spectrums.It suggests that for low bandgap materials OPT can give