## Background & Summary

Optoelectronic properties, such as fundamental electronic bandgaps and dielectric functions, provide important material information in designing optoelectronic devices for a variety of applications, such as photovoltaic cells1, light emitting diodes2, transparent electronics3, dynamic random access memory4, astronomical devices5, and smaller and faster devices6. For industrial advancement in these industries, there is a great need to synthesize cheaper, more efficient, and tunable devices. Designing these new materials requires knowledge of already available ones, which can then be tailored for a particular application. Databases dedicated to optoelectronic materials meet this need. However, such user-friendly and easy-accessible public databases are still in the development phase. Computationally, it is much easier to provide properties for thousands of materials in a systematic way than to do so through experiments. Density functional theory (DFT) is the tool of choice to compute these properties in a high-throughput manner.

It is important to note that the term 'bandgap' generally refers to the fundamental gap and not the optical gap. The difference between these quantities could be small in semiconductors but significant in insulators7. Materials Genome Initiative based projects such as the Materials Project (MP)8, the open quantum materials database (OQMD)9, and AFLOW10 have successfully enumerated bandgaps of hundreds of thousands of materials using the generalized-gradient-approximation Perdew-Burke-Ernzerhof functional (GGA-PBE)11 and +U corrections. MP has also calibrated the static dielectric constant of 1056 materials using density functional perturbation theory (DFPT)12, but frequency-dependent dielectric functional data is missing. Although PBE provides great insights in distinguishing non-metallic materials, the bandgaps of materials are generally underestimated typically by 30% to 100% (refs 13,14), hindering its practical application in the fields of semiconductors, photovoltaic materials, and thermoelectric devices. Other systematic databases of optoelectronic materials include Zunger et al.15 work for photovoltaic materials using Green function screened coulomb (GW) calculations, and Castelli et al.16 work on energy-harvesting materials using the Gritsenko- Leeuwen-Lenthe-Baerends (GLLB-SC) functional. GW is much more reliable than PBE in computing optoelectronic properties. However, its high computational cost severely limits its application in high-throughput screening. Catelli’s work is also limited, containing information for only about 2400 materials.

Various techniques have been used to improve bandgap prediction at a moderate computational cost, including Chan and Ceder (delta-sol)14, modified Becke-Johnson potential1719, and empirical fits by Setyawan et al.20. Recently, the modified Becke-Johnson (MBJ) potential introduced by Tran and Blaha1719 has been proven to improve the bandgap description in a computationally efficient way. This potential has been successfully used in characterizing electronic properties of nonmagnetic transition-metal oxides and sulfides, metals, (anti) ferromagnetic insulators, dielectric and topological insulators19,2124.

In this work, we have identified a sweet spot between the computational expense and accuracy for describing optoelectronic properties by using MBJ potential in a high-throughput approach. At present, we have 7358 MBJ bandgap and frequency-dependent dielectric function entries, and the database is still growing. Additionally, we computed 17805 bandgaps and frequency-dependent dielectric functions using OptB88vdW (OPT) for comparison purposes. OPT is a Van der Waal-dispersion functional (vdW-DF) with non-local correction, which can predict crystal-structure geometry, and is essential to the calculation of optoelectronic properties, especially for anisotropic materials. The OPT functional has not only been proven to reduce error in lattice constants, but its combination with MBJ functional is known to predict bandgaps of materials25 successfully. In addition, the error in lattice constants can significantly impact the error in optoelectronic properties such as refractive indices, and hence birefringence of non-cubic class materials. Thus, for a better description of lattice constant and bandgaps of materials, it is necessary to first optimize geometries with vdW functional such OPT. OPT is also known to predict reasonable geometrical structures for non-vdW bonded structures26.

We validate our computational results in a few cases through comparison with experimental values. We create a public JARVIS database of our results available at https://www.ctcms.nist.gov/~knc6/JVASP.html. The data is also available in REST-API format at https://jarvis.nist.gov/ and Cloud of Reproducible Records (CoRR) at NIST (https://mgi.nist.gov/cloud-reproducible-records). We provide the code used in this work at github page: https://github.com/usnistgov/jarvis.

## Methods

The methodology supporting the current work consisted of several steps, including density functional theory calculations and experimental validation of a few data points. The overall processes are shown in Fig. 1 and each step is explained in detail below.

### Density functional theory setup

The DFT calculations are performed using the Vienna Ab-initio Simulation Package (VASP)27,28 and the projector-augmented wave (PAW) method29. Please note commercial software is identified to specify procedures. Such identification does not imply recommendation by the National Institute of Standards and Technology. The crystal structures were obtained from the Materials Project (MP) DFT database. More specifically, we obtained all the crystal structures with less than 30 atoms per unit cell from MP, and the potential candidates for low dimensional materials using lattice-constant criteria30 and data-mining approaches31. We convert the crystal cells into its primitive cell representation before a DFT calculation. If the primitive cell and corresponding conventional cell of a crystal-structure have the same number of atoms, then we prefer conventional cell as the DFT input structure.

As the error in lattice constants can significantly impact the error in optoelectronic properties, such as refractive indices and birefringence of non-cubic class materials, we re-optimized MP geometric structures using the OPT functional26,32. PBE is known to report good lattice constants for materials, but its applicability to vdW-bonded materials is questionable. Recently, around 5000 materials have been proposed to be vdW-bonded using lattice-constant criteria30 and data-mining approaches31, signifying that a correct treatment of the vdW interactions is more important than previously thought. OPT is part of vdW-DF functional, which is a non-local correlation functional that approximately accounts for dispersion interactions. OPT has been recently determined to perform well for bulk solids as well as vdW bonded structures26. In a recent work by Tawfik et al.33, OPT was proven to be one of the most accurate functionals to capture vdW interactions among several other methods. We performed plane-wave energy cut-off and k-point convergences with 0.001 eV tolerance on energy. We assumed that satisfactory energy convergence would extrapolate to reasonably converged optical property calculations as well. The structure relaxation with OPT functional was obtained with 10−8 eV energy tolerance and 0.001 eV/Å force-convergence criteria.

Next, we computed bandgap and optical properties with both OPT and MBJ in subsequent DFT calculations. In the MBJ calculations, we started from OPT-relaxed structures because the MBJ functional is a potential-only functional, which implies that we cannot compute Hellmann-Feynman forces with MBJ, hence ionic relaxations were not performed using MBJ. The OPT functional has not only been proven to reduce error in lattice constants, but its combination with MBJ functional is known to predict correct bandgaps25 as shown for few vdW bonded materials. The MBJ potential is given by:

$\begin{array}{}\text{(1)}& {{v}^{mBJ}}_{x}\left(r\right)=c{v}_{x}^{BR}\left(r\right)+\left(3c-2\right)\frac{1}{\pi }\sqrt{\frac{5}{12}}\sqrt{\frac{2t\left(r\right)}{\rho \left(r\right)}}\end{array}$

where c is a system-dependent parameter, with c = 1 corresponding to the Becke-Roussel (BR) potential ${v}_{x}^{BR}\left(r\right)$, which was originally proposed to mimic the Slater potential, the Coulomb potential corresponding to the exact exchange hole34. For bulk crystalline materials, Tran and Blaha proposed to determine c by the following empirical relation:

$\begin{array}{}\text{(2)}& c=\alpha +\beta {\left(\frac{1}{{V}_{cell}}{\int }_{{V}_{cell}}\frac{|\nabla \rho \left(r\right)|}{\rho \left(r\right)}dr\right)}^{1/2}\end{array}$

With $\alpha =-0.012$, $\beta =0.541$ Å1/2 and Vcell is the volume of the unit cell. The c-parameter was automatically determined in VASP through a self-consistent run.

To obtain the optical properties of the materials, we calculated the imaginary part of the dielectric function from the Bloch wavefunctions and eigenvalues35,36 (neglecting local field effects). We introduced three times as many empty conduction bands as valance bands. This treatment is necessary to facilitate proper electronic transitions. We choose 5000 frequency grid points to have a sufficiently high resolution in dielectric function spectra. The imaginary part is calculated as:

$\begin{array}{}\text{(3)}& {{\epsilon }^{2}}_{\alpha \beta }\left(\omega \right)=\frac{4{\pi }^{2}{e}^{2}}{\Omega }\underset{q\to 0}{\mathrm{lim}}\frac{1}{{q}^{2}}\sum _{c,v,k}2{w}_{\stackrel{\to }{k}}\delta \left({\xi }_{ck}-{\xi }_{vk}-\Omega \right)〈{\Psi }_{c\stackrel{\to }{k}+{\stackrel{\to }{e}}_{\alpha }q}|{\Psi }_{v\stackrel{\to }{k}}〉{〈{\Psi }_{c\stackrel{\to }{k}+{\stackrel{\to }{e}}_{\beta }}|{\Psi }_{v\stackrel{\to }{k}}〉}^{*}\end{array}$

where e is electron charge, Ω is the cell volume, ${w}_{\stackrel{\to }{k}}$ is the Fermi-weight of each k-point, eα are unit vectors along the three Cartesian directions, $|{\psi }_{n\stackrel{\to }{k}}〉$ is the cell-periodic part of the pseudopotential wavefunction for band n and k-point k, q stands for the Bloch vector of an incident wave, c and v stand for conduction and valence bands, ξ stands for eigenvalues of the corresponding bands respectively. The matrix elements on the right side of Equation (3) capture the transitions allowed by symmetry and selection rules37. The real part of the dielectric tensor ε1 is obtained by the usual Kramers-Kronig transformation35.

$\begin{array}{}\text{(4)}& {{\epsilon }^{1}}_{\alpha \beta }\left(\omega \right)=1+\frac{2}{\pi }P{\int }_{0}^{\infty }\frac{{\epsilon }_{\alpha \beta }^{2}\left(\omega \prime \right)\omega \prime }{{\omega \prime }^{2}-{\omega }^{2}+i\eta }d\omega \prime \end{array}$

where P denotes the principle value, and η is the complex shift parameter taken as 0.1.

It is to be noted that in conventional DFT, excited states are not optimized, hence many-body interactions are missing. To get the excited state optical properties, a high-level calculation such as the Bethe-Salpeter equation (BSE)38 is needed, however, the conventional DFT data remains useful for qualitative comparison.

### Experimental details

We validated our DFT dielectric function data for 2H-MoS2, 1T-SnSe2, Si, Ge, GaAs and InP comparing to experiments. We perform our experimental measurements for 2H-MoS2, 1T-SnSe2. Other experimental data were taken from Aspnes et al.39 for validation. 1T-SnSe2 (40 nm thickness) was grown on a GaAs (111) substrate by molecular beam epitaxy (MBE)40. The GaAs substrate was deoxidized in-situ under ultra-high vacuum (4×10−8 Pa) at 690 °C for 3 min and annealed under a flux of Se for 20 min, which provides a smoother growth surface. After the substrate was cooled down and held at the growth temperature of 200 °C for 40 min, sixty-three layers (≈40 nm) of 1T-SnSe2 were grown by a simultaneous incidence of Sn and Se at a rate of 1/38 layer per second based on Reflection High-Energy Electron Diffraction (RHEED) oscillations. The beam equivalent pressures (BEPs) for Sn and Se, supplied by using Knudsen cells, are 2.67×10−6 Pa (2×10−8 Torr) and 2.67×10−4 Pa (2×10−6 Torr), respectively. The single phase and high crystallinity of SnSe2 were confirmed by X-ray diffraction (XRD). Bulk MoS2 was commercially purchased from SPI Supplies41. Please note the commercial product is identified to specify procedures. Such identification does not imply recommendation by the National Institute of Standards and Technology. The dielectric functions were obtained from spectroscopic ellipsometry (SE). The SE measurements were performed in a nitrogen gas-filled chamber at room temperature on a vacuum ultraviolet (UV) spectroscopic ellipsometer with a light photon energy from (0.7 eV to 8.0) eV in steps of 0.02 eV for SnSe2 and from (1.0 eV to 9.0) eV in steps of 0.01 eV for MoS2, at an angle of incidence of 70°.

### User-interface

The data is presented in a webpage format (https://www.ctcms.nist.gov/~knc6/JVASP.html). First, a user selects the desired element/elements in the periodic table provided at the website and clicks on the ‘Search’ button (as shown in Fig. 2). This procedure generates a data table on the webpage consisting of the calculation-identifier, the formula of the structure, the functional used in the calculation, bandgap, mechanical property, space group of crystal and energetics of the system. Next, the user clicks on the calculation identifier for a formula, space group and functional and property data for detailed information. The detailed page is provided in the format such as https://www.ctcms.nist.gov/~knc6/jsmol/JVASP-1174.html where ‘1174’ denotes an identifier and can assume any JARVIS-ID. The particular webpage consists first of an interactive crystal visualization, then geometric properties such as computational XRD, bandstructure and the optical properties consisting of dielectric function and refractive index. We also provide a classification of materials based on their OPT and MBJ based bandgaps, and static refractive index data as shown in Fig. 3. Clicking on one of the options in Fig. 3 results in materials with classified properties. For example, clicking on ‘Classification of 3D-bulk materials based on TB-MBJ-bandgap’ produces a table with materials that have a bandgap in rage from 0 to 1, 1 to 2, 3 to 4 eV and so on. Each material is hyperlinked to its specific webpage.

### Code availability

The code used in this work is provided at https://github.com/usnistgov/jarvis. There are two main scripts in this folder- 1) joptb88vdw.py and 2) master.py. The joptb88vdw.py script heavily utilizes the Pymatgen8 and ASE42 codes for file and data management. The joptb88vdw.py generates a series of folders and JSON files starting with keyword ‘ENCUT’ and ‘KPOINT’ denoting the convergence test. An example of an actual calculation is also provided in the folder. After the convergence, the script carries out main geometric relaxation, band structure, optical property with OPT and optical property with MBJ calculations. The master.py takes the argument of the identifier of the database or the structure in ‘VASP’s ‘POSCAR’ format. The master script can tackle both PBS and SLURM formalism used in HPC architecture.

## Data Records

All data computed in this work can be found at https://www.ctcms.nist.gov/~knc6/JVASP.html and https://jarvis.nist.gov/. A JSON file is also available in a Figshare repository (Data Citation 1). Key variables for the JSON file are shown in Table 1. They include identifiers, structure, bandgaps and dielectric function information with OPT and MBJ methods. The dielectric function data in xx, yy, zz, xy, yz, and zx directions can be used for analyzing the anisotropic nature of the dielectric function. The opt_gap and mbj_gap data can be used to analyze the effect of DFT methodologies on bandgap of a material, where available. The ‘jid’,’mpid’and ‘cif’ mentioned in Table 1 belong to string-type, while ‘opt_gap’ and ‘mbj_gap’ belong to float-type data. The ‘mpid’ facilitates easy linking to the Materials-project database. Other values such as ‘opt_en’, ‘mbj_en’, ‘opt_realxx’,’opt_imagxx’, ‘mbj_realxx’ and ‘mbj_imagxx’ are arrays with float-type values. The ‘real’ part in these keys corresponds to real part of dielectric function while ‘imag’ corresponds to imaginary part of dielectric function in the respective directions. The Pymatgen code can be used to process the ‘cif’ string-type data. The key ‘opt_en’ has the same array-size as that of dielectric function data with OPT such as ‘opt_realxx’, ‘opt_imagxx’, while ‘mbj_en’ has the same array-size as that of dielectric function data with MBJ such as ‘mbj_realxx’ and ‘mbj_imagxx’. Packages such as Matplotlib and Gnuplot can be used to plot these arrays and visualize the data. We provide a few examples to explore the JSON files at the github page https://github.com/usnistgov/jarvis/tree/master/jarvis/db/static.

## Technical Validation

As discussed in the method section, the crystal structures were obtained from the Materials Project, which uses PBE for structure optimization. We re-optimize the MP crystal structures with the OPT functional. Most of the MP crystal-structures have Inorganic Crystal Structure Database (ICSD) IDs, which can be used to obtain experimental lattice parameter information. Hence, we compute PBE and OPT based mean absolute error (MAE) and root-mean-squared error (RMSE) of all the available structures in our database. There are presently 10,052 structures with ICSD IDs in our database. We further classify these structures into predicted vdW and predicted non-vdW structures. We use the lattice-constant criteria30 and data-mining approaches31 to identify vdW structures. All the remaining structures are treated as non-vdW bonded. The predicted vdW bonded materials can have vdW bonding in one, two or three crystallographic directions. It is to be noted that exfoliation energy is calculated to predict vdW bonded in materials30, but the two heuristic methods mentioned above can act as pre-screening criteria for determining vdW bonded structures. Out of 10,052 structures, 2,241 were predicted to be vdW bonded. In addition to the overall MAE and RMSE, we also calculate the same for these two classes of materials as shown in Table 2. As evident from Table 2, the OPT seems to improve lattice constants in a, b, c crystallographic directions compared to PBE. Significant improvement in lattice parameters is observed for predicted vdW materials, especially in c-directions. For predicted non-vdW materials, the errors are similar for OPT and PBE, suggesting that OPT can improved lattice constant predictions for vdW materials without much affecting the predictions for non-vdW bonded materials. Our PBE MAE value for all the materials (0.13 Å) are similar to that obtained by Jianmin et al.43 (0.135 Å) for a smaller set of materials.

As a first validation, we compared the MBJ and OPT bandgaps to experimental values, whenever available. Table 3 (available online only) displays such a comparison for 54 materials and shows the corresponding results from MP, OQMD, and AFLOW (PBE/PBE+U based data). We also provide identifiers across different databases to facilitate comparison. In general, the values of our OPT and MBJ bandgap data are higher than MP’s PBE data, with MBJ data being closer to experiments44,45. The mean absolute error (MAE) of MBJ with respect to experimental data is 0.51 eV, while that of OPT is 1.33. The OPT has MAE similar to MP, OQMD, and AFLOW because all of them are primarily PBE based calculations. However, significant improvement is shown with MBJ. Similar results for MBJ gaps versus experimental ones were found by Tran and Blaha et al.18, validating our methodology. We calculate two MAEs for the data: 1) MAE computed with respect to experiment using all available data for each method, 2) MAE computed with respect to experiment using only data for materials that have results available in all three DFT methods. Both of these values are shown in Table 3 (available online only). Both of the MAEs are found to show similar results. It is to be noted that our geometric optimization was performed with OPT, which is different from the one used by Tran-Blaha et al.18 This explains small differences in MBJ gaps found between our work and by them. Due to the inadequacy of experimental data for all the materials, it is intractable to calculate the error for the whole database. Also, some of the experimental bandgaps were averages of multiple experiments.

The MBJ potential is found to be more suitable for large bandgap insulators and can change the energetics of bands in metallic systems also. We found that some of the materials predicted as metallic using PBE are semiconductors using MBJ, such as Ge and GaAs. To better understand the source of error in the bandgap evaluation, we followed the Materials Project (MP) approach (https://www.materialsproject.org/docs/calculations#Accuracy_of_Band_Structures) and determined a “shifted” MAE for our bandgap evaluations. This treatment allows removing the effect of the DFT systematic underestimation of the gap. To do this, we first fitted a linear equation for the OPT and MBJ data with respect to experiment. The slope was found to be 1.17 and 1.44 for MBJ and OPT, respectively. The slope was then used as a scaling parameter to scale-up the OPT and MBJ data. After the data have been shifted, the MAE with respect to experiment was found to be 0.42 for MBJ, 0.69 for OPT, to compare with the MP result of 0.6. We also calculated the Spearman’s coefficient (SC), to measure monotonicity in the bandgap data from different methods compared to experiment. High value for SC suggests that the trends are similar to those in the experimental data. The highest value was obtained for HSE06 (0.97), followed by MBJ (0.94) and AFLOW (0.94). Additionally, we compare the computational time taken during HSE06, MBJ and OPT calculations for a few cases. We find that the MBJ takes about an order of magnitude more computational time than OPT, while HSE06 takes an order of magnitude more computational time than MBJ. A comparison table for computational time for calculations is given in supplementary information (Supplementary Table. S1).

Next, to understand the trends in the whole database, we compared the bandgaps obtained from the OPT and MBJ as shown in Fig. 4a. It is to be noted that many of our calculations for OPT and MBJ are still running; we compare data which are common in both OPT and MBJ only. The blue circles show the MBJ bandgaps while the green ones represent the OPT bandgaps. We also plot the experimental results (red dots) for a small subset (from Table 3 (available online only)) in the Fig. 4a. More specifically, we plotted the three types of data (MBJ, OPT and experiment) against the MBJ results. As the MBJ data are plotted against themselves, they produce a straight line along the diagonal of the plot. For a perfect agreement between OPT and MBJ, all the OPT data should lie on the same straight line. However, most of the OPT data is below the straight line, representing an underestimation of the bandgap. Compared to experiments, the MBJ results describe bandgaps much better than the OPT results. This is shown by the fact that up to about 6 eV most of the experimental data lie on the figure diagonal, while the OPT results lie systematically under it.

The relative difference in OPT and MBJ in bandgap is shown in Supplementary Fig. S1a. The percentage difference in values for OPT and MBJ are calculated as:

$\begin{array}{}\text{(5)}& \Delta =\frac{|{y}_{MBJ}-{y}_{OPT}|}{{y}_{MBJ}}×100%\end{array}$

To avoid division by very low or zero values, we calculated percentage differences for materials with OPT gap more than 1 eV. The upper bound of the relative changes in bandgap can range from 30% up to more than 100 %.

Similar to the bandgap data, the static refractive index in x, y and z-directions are also compared for OPT and MBJ. The static refractive index is related to static dielectric function data as $n\left(0\right)=\sqrt{{\epsilon }_{1}\left(0\right)}$. The static refractive index in x, y and z directions are shown in Fig. 4b–d. Like the MBJ bandgaps, the MBJ refractive indices are plotted against itself to give a straight line, which can be used for comparison. A subset of OPT and MBJ static dielectric constant data is shown in Table 3 (available online only) and compared to experiments. The MAE values of OPT and MBJ static dielectric constant in the x-direction are 3.2 and 2.6 respectively, showing the overall superiority of MBJ compared to OPT. It is to be noted that only interband transitions and not intraband are accounted for in our calculations, hence Drude-like transitions are not taken into account37. It implies that our dielectric function data should be more accurate for high bandgap materials18. Also, in cases where OPT predicts metallic behavior while MBJ predicts semiconductor/insulating, the dielectric function and therefore the static refractive index would be different between OPT and MBJ, because Drude like transitions are not captured in present work. As MBJ bandgaps are more reliable than OPT, the MBJ optical data can be considered more accurate than OPT, especially for low bandgap materials. A very high difference (more than 100%) in OPT and MBJ refractive index was observed for materials such as ZnCoF4 (as clearly seen in Supplementary Fig. S1b-S1d) because of the very different bandgaps obtained using OPT and MBJ. We also find that the relative differences between OPT and MBJ refractive indexes are much smaller compared to those for bandgaps. Interestingly, while OPT underestimates the bandgaps compared to experiments, the predicted dielectric functions are relatively close to the experimental measurement, especially for high-bandgap materials. It is because our methodology describes inter-band transitions well but is not suitable for intra-band transitions. Lastly, we also observe that the MBJ static refractive index data are generally lower than the OPT data, as noted in Table 4.

Next, in Fig. 5 we compare the OPT, MBJ and experimental imaginary part of dielectric function in the x-direction for 5a) 1T-SnSe2 ($P\overline{3}m1$), 5b) 2H-MoS2 (P63/mmc), 5c) Si ($Fd\overline{3}m$), 5d) Ge ($Fd\overline{3}m$), 5e) GaAs ($F\overline{4}3m\right)$ and 5f) InP $\left(F\overline{4}3m\right)$ . We carried out our experiments for dielectric functional data for 1T-SnSe2 ($P\overline{3}m1$) and 2H-MoS2 (P63/mmc), while other experimental data were obtained from previous experiments by Aspnes et al.39. It is clear from Fig. 5 that for MBJ, in general, performs better than OPT peak positions compared to experiments. For SnSe2 and MoS2, both the methodologies give similar result compared to experiments. For 1T-SnSe2, the peaks after 4 eV are more pronounced in DFT than the experiment, which can be attributed to the resolution power of the experiments. In Fig. 5b, the peaks around 2 eV and 4 eV are captured well both in OPT and MBJ for MoS2; however, there is a slight shift in the spectrum due to difference bandgap description between the two functionals at low energy range. We are still investigating the small shift at higher energies, especially for SnSe2. We observe similar spectrum shift due to bandgap underestimation for the cases 5c, 5d, 5e, and 5f. Moreover, peaks at low energy levels using OPT which are absent in MBJ and experimental spectrum. This is likely because when the bandgap is severely underestimated (such as for OPT), the theory predicts inter-band transitions (e.g., valence to conduction band) that simply don't exist because the gap is too high in reality. Such peaks are absent in MBJ based spectrums. It suggests that for low bandgap materials OPT can give unphysical transitions at low energies. However, overall spectrum patterns are similar for OPT and MBJ at higher energies. As observed in Fig. 5, the DFT intensity differs from experiment for some peaks, which can be explained based on 1) the difference in temperature between the experimental setup (generally at room temperature) and the DFT simulation (always at zero Kelvin), and 2) the surface roughness of the sample, which is not included in the calculation. Such differences in peak intensities compared to experiments are also observed for other high-level DFT based methods46. In a nutshell, our dielectric function data can be used to complement experimental spectra for instance to allowing to distinguish various peaks. In addition to the peak positions, the DFT data can be used to characterize the orbital nature of the associated electronic transitions, which can provide physical insight into a phenomena47. A detailed investigation of all the optical transitions for all the materials will be pursued in future. Other quantities such as refractive index, absorption coefficient, electron-energy-loss spectra (EELS), optical conductivity can be calculated with the dielectric function data. As the dielectric function for materials can be anisotropic, we also provide the dielectric function data in xx, yy, zz, xy, yz, and zx directions, which can be used to calculate frequency dependent birefringence of materials.

## Usage Notes

The database presented here represents the largest collection of consistently calculated optoelectronic properties of materials using density functional theory assembled to date. We anticipate that this dataset, and the methods provided for accessing, it will provide a useful tool in fundamental and application-related studies of materials. Our actual experimental verification provides insight into understanding the applicability and limitation of our DFT data. Based on the list of data, the user will be able to choose particular materials for specific applications. Data mining, data analytics, and artificial-intelligence tools then can be added to guide screening of materials.