Abstract
We perform high-throughput density functional theory (DFT) calculations for optoelectronic properties (electronic bandgap and frequency-dependent dielectric function) using the OptB88vdW functional (OPT) and the Tran-Blaha modified Becke-Johnson potential (MBJ). This data is distributed publicly through the JARVIS-DFT database. We used this data to evaluate the differences between these two formalisms and quantify their accuracy, comparing to experimental data whenever applicable. At present, we have 17,805 OPT and 7,358 MBJ bandgaps and dielectric functions. MBJ is found to predict better bandgaps and dielectric functions than OPT, so it can be used to mitigate the well-known bandgap problem of DFT in a relatively inexpensive way. The peak positions in dielectric functions obtained with OPT and MBJ are in comparable agreement with experiments. The data is available on our websites http://www.ctcms.nist.gov/~knc6/JVASP.html and https://jarvis.nist.gov.
Design Type(s)  protocol optimization design • cross over design 
Measurement Type(s)  bandgap • dielectric function 
Technology Type(s)  computational modeling technique 
Factor Type(s)  compound by chemical composition • computational method 
Machine-accessible metadata file describing the reported data (ISA-Tab format)
Background & Summary
Optoelectronic properties, such as fundamental electronic bandgaps and dielectric functions, provide important material information in designing optoelectronic devices for a variety of applications, such as photovoltaic cells^{1}, light emitting diodes^{2}, transparent electronics^{3}, dynamic random access memory^{4}, astronomical devices^{5}, and smaller and faster devices^{6}. For advancement in these industries, there is a great need for cheaper, more efficient, and tunable devices. Designing new materials for them requires knowledge of already available ones, which can then be tailored for a particular application. Databases dedicated to optoelectronic materials meet this need. However, user-friendly and easily accessible public databases of this kind are still in the development phase. Computationally, it is much easier to provide properties for thousands of materials in a systematic way than to do so through experiments. Density functional theory (DFT) is the tool of choice to compute these properties in a high-throughput manner.
It is important to note that the term 'bandgap' generally refers to the fundamental gap and not the optical gap. The difference between these quantities can be small in semiconductors but significant in insulators^{7}. Materials Genome Initiative based projects such as the Materials Project (MP)^{8}, the open quantum materials database (OQMD)^{9}, and AFLOW^{10} have successfully enumerated bandgaps of hundreds of thousands of materials using the generalized-gradient-approximation Perdew-Burke-Ernzerhof functional (GGA-PBE)^{11} and +U corrections. MP has also calculated the static dielectric constant of 1056 materials using density functional perturbation theory (DFPT)^{12}, but frequency-dependent dielectric function data are missing. Although PBE provides great insight in distinguishing nonmetallic materials, it generally underestimates bandgaps, typically by 30% to 100% (refs 13,14), hindering its practical application in the fields of semiconductors, photovoltaic materials, and thermoelectric devices. Other systematic databases of optoelectronic materials include the work of Zunger et al.^{15} on photovoltaic materials using Green's-function screened-Coulomb (GW) calculations, and the work of Castelli et al.^{16} on energy-harvesting materials using the Gritsenko-Leeuwen-Lenthe-Baerends (GLLB-SC) functional. GW is much more reliable than PBE in computing optoelectronic properties. However, its high computational cost severely limits its application in high-throughput screening. Castelli's work is also limited, containing information for only about 2400 materials.
Various techniques have been used to improve bandgap prediction at a moderate computational cost, including the delta-sol method of Chan and Ceder^{14}, the modified Becke-Johnson potential^{17–19}, and empirical fits by Setyawan et al.^{20}. Recently, the modified Becke-Johnson (MBJ) potential introduced by Tran and Blaha^{17–19} has been shown to improve the bandgap description in a computationally efficient way. This potential has been successfully used in characterizing electronic properties of nonmagnetic transition-metal oxides and sulfides, metals, (anti)ferromagnetic insulators, dielectric and topological insulators^{19,21–24}.
In this work, we have identified a sweet spot between computational expense and accuracy for describing optoelectronic properties by using the MBJ potential in a high-throughput approach. At present, we have 7358 MBJ bandgap and frequency-dependent dielectric function entries, and the database is still growing. Additionally, we computed 17805 bandgaps and frequency-dependent dielectric functions using OptB88vdW (OPT) for comparison purposes. OPT is a van der Waals dispersion functional (vdW-DF) with a nonlocal correction, which predicts crystal-structure geometry well; accurate geometry is essential to the calculation of optoelectronic properties, especially for anisotropic materials. The OPT functional has not only been shown to reduce error in lattice constants, but its combination with the MBJ functional is also known to predict bandgaps of materials^{25} successfully. In addition, the error in lattice constants can significantly impact the error in optoelectronic properties such as refractive indices, and hence the birefringence of non-cubic materials. Thus, for a better description of lattice constants and bandgaps of materials, it is necessary to first optimize geometries with a vdW functional such as OPT. OPT is also known to predict reasonable geometrical structures for non-vdW bonded structures^{26}.
We validate our computational results in a few cases through comparison with experimental values. We make our results available in a public JARVIS database at https://www.ctcms.nist.gov/~knc6/JVASP.html. The data is also available in REST-API format at https://jarvis.nist.gov/ and through the Cloud of Reproducible Records (CoRR) at NIST (https://mgi.nist.gov/cloudreproduciblerecords). We provide the code used in this work on GitHub: https://github.com/usnistgov/jarvis.
Methods
The methodology supporting the current work consisted of several steps, including density functional theory calculations and experimental validation of a few data points. The overall processes are shown in Fig. 1 and each step is explained in detail below.
Density functional theory setup
The DFT calculations are performed using the Vienna Ab-initio Simulation Package (VASP)^{27,28} and the projector-augmented wave (PAW) method^{29}. Please note commercial software is identified to specify procedures. Such identification does not imply recommendation by the National Institute of Standards and Technology. The crystal structures were obtained from the Materials Project (MP) DFT database. More specifically, we obtained from MP all the crystal structures with fewer than 30 atoms per unit cell, together with the potential candidates for low-dimensional materials identified using lattice-constant criteria^{30} and data-mining approaches^{31}. We convert each crystal cell into its primitive-cell representation before a DFT calculation. If the primitive cell and the corresponding conventional cell of a crystal structure have the same number of atoms, then we prefer the conventional cell as the DFT input structure.
As the error in lattice constants can significantly impact the error in optoelectronic properties, such as refractive indices and birefringence of non-cubic materials, we reoptimized the MP geometric structures using the OPT functional^{26,32}. PBE is known to give good lattice constants for materials, but its applicability to vdW-bonded materials is questionable. Recently, around 5000 materials have been proposed to be vdW-bonded using lattice-constant criteria^{30} and data-mining approaches^{31}, signifying that a correct treatment of the vdW interactions is more important than previously thought. OPT belongs to the vdW-DF family, which uses a nonlocal correlation functional that approximately accounts for dispersion interactions. OPT has recently been found to perform well for bulk solids as well as vdW-bonded structures^{26}. In a recent work by Tawfik et al.^{33}, OPT was shown to be one of the most accurate functionals for capturing vdW interactions among several methods. We performed plane-wave energy cutoff and k-point convergence tests with a 0.001 eV tolerance on energy. We assumed that satisfactory energy convergence would extrapolate to reasonably converged optical property calculations as well. The structure relaxation with the OPT functional was carried out with a 10^{−8} eV energy tolerance and a 0.001 eV/Å force-convergence criterion.
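The convergence test described above can be illustrated with a minimal sketch. It assumes, which the text does not state, that convergence is declared at the first setting whose total energy differs from that of the previous setting by less than the tolerance. The function name and the sample energies are hypothetical and are not taken from the JARVIS scripts.

```python
def first_converged(settings, energies, tol=0.001):
    """Return the first setting whose energy is within tol (eV) of the previous one.

    Assumed criterion: |E_i - E_(i-1)| < tol. Returns None if never satisfied.
    """
    for prev_energy, setting, energy in zip(energies, settings[1:], energies[1:]):
        if abs(energy - prev_energy) < tol:
            return setting
    return None


# Hypothetical plane-wave cutoffs (eV) and total energies (eV), for illustration only
cutoffs = [400, 450, 500, 550, 600]
total_energies = [-10.2500, -10.2610, -10.2650, -10.2656, -10.2657]
converged_cutoff = first_converged(cutoffs, total_energies)
```

The same helper could be applied to a list of k-point meshes in place of the cutoffs.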
Next, we computed bandgaps and optical properties with both OPT and MBJ in subsequent DFT calculations. In the MBJ calculations, we started from OPT-relaxed structures because MBJ is a potential-only functional, which implies that we cannot compute Hellmann-Feynman forces with MBJ; hence ionic relaxations were not performed using MBJ. The OPT functional has not only been shown to reduce error in lattice constants, but its combination with the MBJ functional is also known to predict correct bandgaps^{25}, as shown for a few vdW-bonded materials. The MBJ potential, written in terms of the electron density $\rho(r)$ and the kinetic-energy density $t(r)$, is given by:

$$v_{x}^{MBJ}(r) = c\,v_{x}^{BR}(r) + (3c-2)\frac{1}{\pi}\sqrt{\frac{5}{12}}\sqrt{\frac{2t(r)}{\rho(r)}} \qquad (1)$$
where c is a system-dependent parameter, with c = 1 corresponding to the Becke-Roussel (BR) potential ${v}_{x}^{BR}\left(r\right)$, which was originally proposed to mimic the Slater potential, the Coulomb potential corresponding to the exact exchange hole^{34}. For bulk crystalline materials, Tran and Blaha proposed to determine c from the electron density $\rho$ by the following empirical relation:

$$c = \alpha + \beta\left(\frac{1}{V_{cell}}\int_{cell}\frac{|\nabla\rho(r')|}{\rho(r')}\,d^{3}r'\right)^{1/2} \qquad (2)$$
Here $\alpha =0.012$ and $\beta =0.541$ Å^{1/2}, and V_{cell} is the volume of the unit cell. The c parameter was determined automatically in VASP through a self-consistent run.
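To obtain the optical properties of the materials, we calculated the imaginary part of the dielectric function from the Bloch wavefunctions and eigenvalues^{35,36} (neglecting local field effects). We introduced three times as many empty conduction bands as valence bands. This treatment is necessary to facilitate proper electronic transitions. We chose 5000 frequency grid points to have a sufficiently high resolution in the dielectric function spectra. The imaginary part is calculated as:

$$\epsilon_{\alpha\beta}^{(2)}(\omega) = \frac{4\pi^{2}e^{2}}{\Omega}\lim_{q\to 0}\frac{1}{q^{2}}\sum_{c,v,\overrightarrow{k}} 2w_{\overrightarrow{k}}\,\delta\left(\xi_{c\overrightarrow{k}}-\xi_{v\overrightarrow{k}}-\omega\right)\langle\psi_{c\overrightarrow{k}+e_{\alpha}q}|\psi_{v\overrightarrow{k}}\rangle\langle\psi_{c\overrightarrow{k}+e_{\beta}q}|\psi_{v\overrightarrow{k}}\rangle^{*} \qquad (3)$$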
where e is the electron charge, Ω is the cell volume, ${w}_{\overrightarrow{k}}$ is the Fermi weight of each k-point, e_{α} are unit vectors along the three Cartesian directions, $|{\psi}_{n\overrightarrow{k}}\rangle$ is the cell-periodic part of the pseudopotential wavefunction for band n and k-point k, q stands for the Bloch vector of an incident wave, c and v stand for conduction and valence bands, and ξ stands for the eigenvalues of the corresponding bands. The matrix elements on the right side of Equation (3) capture the transitions allowed by symmetry and selection rules^{37}. The real part of the dielectric tensor, $\epsilon^{(1)}$, is obtained from the imaginary part, $\epsilon^{(2)}$, by the usual Kramers-Kronig transformation^{35}:

$$\epsilon_{\alpha\beta}^{(1)}(\omega) = 1 + \frac{2}{\pi}P\int_{0}^{\infty}\frac{\epsilon_{\alpha\beta}^{(2)}(\omega')\,\omega'}{\omega'^{2}-\omega^{2}+i\eta}\,d\omega' \qquad (4)$$
where P denotes the principal value, and η is the complex shift parameter taken as 0.1.
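As an illustration of the Kramers-Kronig step, the following is a minimal numerical sketch rather than the VASP implementation. It assumes a trapezoidal-rule integral on a finite energy grid (so the upper integration limit is truncated) and a default complex shift of 0.1 in the same energy units as the grid. The function name and the synthetic Gaussian peak are hypothetical.

```python
import numpy as np


def kramers_kronig_real(energies, eps_imag, eta=0.1):
    """Real part of the dielectric function from its imaginary part.

    energies : 1D increasing grid of photon energies (eV)
    eps_imag : imaginary part of the dielectric function on that grid
    eta      : complex shift that smooths the principal-value integral
    """
    w = np.asarray(energies, dtype=float)
    e2 = np.asarray(eps_imag, dtype=float)
    dw = np.diff(w)
    eps_real = np.empty_like(w)
    for i, w0 in enumerate(w):
        # Real part of e2(w') * w' / (w'^2 - w0^2 + i*eta)
        integrand = (e2 * w / (w**2 - w0**2 + 1j * eta)).real
        # Trapezoidal rule on the finite grid
        integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * dw)
        eps_real[i] = 1.0 + (2.0 / np.pi) * integral
    return eps_real


# Synthetic single absorption peak at 5 eV (illustrative, not database data)
grid = np.linspace(0.0, 20.0, 2001)
peak = np.exp(-0.5 * ((grid - 5.0) / 0.2) ** 2) / (0.2 * np.sqrt(2.0 * np.pi))
eps1 = kramers_kronig_real(grid, peak)
```

For this single peak the real part is larger than 1 below the absorption energy and smaller than 1 well above it, which is the expected qualitative behaviour.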
It is to be noted that in conventional DFT, excited states are not optimized, hence many-body interactions are missing. To get the excited-state optical properties, a high-level calculation such as the Bethe-Salpeter equation (BSE)^{38} is needed; however, the conventional DFT data remains useful for qualitative comparison.
Experimental details
We validated our DFT dielectric function data for 2H-MoS_{2}, 1T-SnSe_{2}, Si, Ge, GaAs and InP by comparing to experiments. We performed our own experimental measurements for 2H-MoS_{2} and 1T-SnSe_{2}; the other experimental data were taken from Aspnes et al.^{39} for validation. 1T-SnSe_{2} (40 nm thickness) was grown on a GaAs (111) substrate by molecular beam epitaxy (MBE)^{40}. The GaAs substrate was deoxidized in situ under ultrahigh vacuum (4×10^{−8} Pa) at 690 °C for 3 min and annealed under a flux of Se for 20 min, which provides a smoother growth surface. After the substrate was cooled down and held at the growth temperature of 200 °C for 40 min, sixty-three layers (≈40 nm) of 1T-SnSe_{2} were grown by simultaneous incidence of Sn and Se at a rate of 1/38 layer per second, based on Reflection High-Energy Electron Diffraction (RHEED) oscillations. The beam equivalent pressures (BEPs) for Sn and Se, supplied using Knudsen cells, were 2.67×10^{−6} Pa (2×10^{−8} Torr) and 2.67×10^{−4} Pa (2×10^{−6} Torr), respectively. The single phase and high crystallinity of SnSe_{2} were confirmed by X-ray diffraction (XRD). Bulk MoS_{2} was purchased commercially from SPI Supplies^{41}. Please note the commercial product is identified to specify procedures. Such identification does not imply recommendation by the National Institute of Standards and Technology. The dielectric functions were obtained from spectroscopic ellipsometry (SE). The SE measurements were performed in a nitrogen-gas-filled chamber at room temperature on a vacuum ultraviolet (UV) spectroscopic ellipsometer with photon energies from 0.7 eV to 8.0 eV in steps of 0.02 eV for SnSe_{2} and from 1.0 eV to 9.0 eV in steps of 0.01 eV for MoS_{2}, at an angle of incidence of 70°.
User interface
The data is presented in a webpage format (https://www.ctcms.nist.gov/~knc6/JVASP.html). First, a user selects the desired element or elements in the periodic table provided at the website and clicks on the ‘Search’ button (as shown in Fig. 2). This procedure generates a data table on the webpage consisting of the calculation identifier, the formula of the structure, the functional used in the calculation, bandgap, mechanical property, space group of the crystal and energetics of the system. Next, the user clicks on the calculation identifier of a given formula, space group and functional to obtain detailed property data. The detailed page is provided in a format such as https://www.ctcms.nist.gov/~knc6/jsmol/JVASP1174.html where ‘1174’ denotes an identifier and can assume any JARVIS-ID. This webpage consists first of an interactive crystal visualization, then geometric properties such as computational XRD, the band structure, and the optical properties consisting of dielectric function and refractive index. We also provide a classification of materials based on their OPT and MBJ based bandgaps, and static refractive index data, as shown in Fig. 3. Clicking on one of the options in Fig. 3 lists the materials with the corresponding classified properties. For example, clicking on ‘Classification of 3Dbulk materials based on TBMBJbandgap’ produces a table with materials that have a bandgap in the range from 0 to 1, 1 to 2, 3 to 4 eV and so on. Each material is hyperlinked to its specific webpage.
Code availability
The code used in this work is provided at https://github.com/usnistgov/jarvis. There are two main scripts in this folder: 1) joptb88vdw.py and 2) master.py. The joptb88vdw.py script heavily utilizes the Pymatgen^{8} and ASE^{42} codes for file and data management. It generates a series of folders and JSON files starting with the keywords ‘ENCUT’ and ‘KPOINT’, denoting the convergence tests. An example of an actual calculation is also provided in the folder. After the convergence, the script carries out the main geometric relaxation, band structure, optical property with OPT, and optical property with MBJ calculations. The master.py script takes as its argument either the identifier of the database entry or a structure in VASP's ‘POSCAR’ format. The master script can handle both the PBS and SLURM job schedulers used on HPC systems.
Data Records
All data computed in this work can be found at https://www.ctcms.nist.gov/~knc6/JVASP.html and https://jarvis.nist.gov/. A JSON file is also available in a Figshare repository (Data Citation 1). Key variables for the JSON file are shown in Table 1. They include identifiers, structure, bandgaps and dielectric function information with the OPT and MBJ methods. The dielectric function data in the xx, yy, zz, xy, yz, and zx directions can be used for analyzing the anisotropic nature of the dielectric function. The opt_gap and mbj_gap data can be used to analyze the effect of DFT methodologies on the bandgap of a material, where available. The ‘jid’, ‘mpid’ and ‘cif’ keys mentioned in Table 1 hold string-type data, while ‘opt_gap’ and ‘mbj_gap’ hold float-type data. The ‘mpid’ facilitates easy linking to the Materials Project database. Other values such as ‘opt_en’, ‘mbj_en’, ‘opt_realxx’, ‘opt_imagxx’, ‘mbj_realxx’ and ‘mbj_imagxx’ are arrays of float-type values. The ‘real’ part in these keys corresponds to the real part of the dielectric function, while ‘imag’ corresponds to the imaginary part, in the respective directions. The Pymatgen code can be used to process the ‘cif’ string-type data. The key ‘opt_en’ has the same array size as the OPT dielectric function data such as ‘opt_realxx’ and ‘opt_imagxx’, while ‘mbj_en’ has the same array size as the MBJ dielectric function data such as ‘mbj_realxx’ and ‘mbj_imagxx’. Packages such as Matplotlib and Gnuplot can be used to plot these arrays and visualize the data. We provide a few examples to explore the JSON files at the GitHub page https://github.com/usnistgov/jarvis/tree/master/jarvis/db/static.
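A minimal sketch of reading such a file with Python's standard json module is given below. It assumes that the JSON file is a list of per-material dictionaries using the key names of Table 1, and that a missing gap may be stored as a non-numeric placeholder. Both are assumptions about the layout, and the two records (including their identifiers) are made up for illustration.

```python
import json


def load_records(json_text):
    """Parse JSON text into a list of per-material dictionaries."""
    return json.loads(json_text)


def gap_pairs(records):
    """Collect (jid, opt_gap, mbj_gap) for entries where both gaps are numeric."""
    pairs = []
    for rec in records:
        opt, mbj = rec.get("opt_gap"), rec.get("mbj_gap")
        if isinstance(opt, (int, float)) and isinstance(mbj, (int, float)):
            pairs.append((rec["jid"], float(opt), float(mbj)))
    return pairs


# Two made-up records that only mimic the key names of Table 1
sample = json.dumps([
    {"jid": "JVASP-X1", "mpid": "mp-0", "opt_gap": 0.7, "mbj_gap": 1.2,
     "opt_en": [0.0, 1.0], "opt_imagxx": [0.0, 0.5]},
    {"jid": "JVASP-X2", "mpid": "mp-1", "opt_gap": 1.5, "mbj_gap": "na"},
])
records = load_records(sample)
pairs = gap_pairs(records)
```

For the downloaded file, the text passed to `load_records` would come from reading the file from disk; the arrays such as ‘opt_en’ and ‘opt_imagxx’ can then be plotted directly against each other.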
Technical Validation
As discussed in the Methods section, the crystal structures were obtained from the Materials Project, which uses PBE for structure optimization. We reoptimize the MP crystal structures with the OPT functional. Most of the MP crystal structures have Inorganic Crystal Structure Database (ICSD) IDs, which can be used to obtain experimental lattice parameter information. Hence, we compute the PBE and OPT based mean absolute error (MAE) and root-mean-squared error (RMSE) for all the available structures in our database. There are presently 10,052 structures with ICSD IDs in our database. We further classify these structures into predicted vdW and predicted non-vdW structures. We use the lattice-constant criteria^{30} and data-mining approaches^{31} to identify vdW structures. All the remaining structures are treated as non-vdW bonded. The predicted vdW-bonded materials can have vdW bonding in one, two or three crystallographic directions. It is to be noted that the exfoliation energy is calculated to predict vdW bonding in materials^{30}, but the two heuristic methods mentioned above can act as pre-screening criteria for determining vdW-bonded structures. Out of 10,052 structures, 2,241 were predicted to be vdW bonded. In addition to the overall MAE and RMSE, we also calculate the same for these two classes of materials, as shown in Table 2. As evident from Table 2, OPT seems to improve lattice constants in the a, b, and c crystallographic directions compared to PBE. Significant improvement in lattice parameters is observed for predicted vdW materials, especially in the c-direction. For predicted non-vdW materials, the errors are similar for OPT and PBE, suggesting that OPT can improve lattice constant predictions for vdW materials without much affecting the predictions for non-vdW bonded materials. Our PBE MAE value for all the materials (0.13 Å) is similar to that obtained by Jianmin et al.^{43} (0.135 Å) for a smaller set of materials.
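For reference, the two error measures used in this comparison can be sketched as follows. The lattice constants in the example are hypothetical values chosen for illustration and are not entries from Table 2.

```python
import math


def mae(predicted, reference):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


def rmse(predicted, reference):
    """Root-mean-squared error between two equally long sequences."""
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(reference))


# Hypothetical lattice constants in Å (illustration only)
experiment = [3.16, 5.43, 12.30]
calculated = [3.19, 5.47, 12.90]
lattice_mae = mae(calculated, experiment)
lattice_rmse = rmse(calculated, experiment)
```

Because the RMSE weights large deviations more strongly than the MAE, a single poorly described lattice direction (here the third value) raises the RMSE noticeably above the MAE.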
As a first validation, we compared the MBJ and OPT bandgaps to experimental values, whenever available. Table 3 (available online only) displays such a comparison for 54 materials and shows the corresponding results from MP, OQMD, and AFLOW (PBE/PBE+U based data). We also provide identifiers across different databases to facilitate comparison. In general, the values of our OPT and MBJ bandgap data are higher than MP’s PBE data, with MBJ data being closer to experiments^{44,45}. The mean absolute error (MAE) of MBJ with respect to experimental data is 0.51 eV, while that of OPT is 1.33 eV. OPT has an MAE similar to MP, OQMD, and AFLOW because all of them are primarily PBE based calculations. However, significant improvement is shown with MBJ. Similar results for MBJ gaps versus experimental ones were found by Tran and Blaha^{18}, validating our methodology. We calculate two MAEs for the data: 1) the MAE computed with respect to experiment using all available data for each method, and 2) the MAE computed with respect to experiment using only data for materials that have results available in all three DFT methods. Both of these values are shown in Table 3 (available online only), and they show similar results. It is to be noted that our geometric optimization was performed with OPT, which is different from the functional used by Tran and Blaha^{18}. This explains the small differences between the MBJ gaps found in our work and in theirs. Because experimental data are not available for all the materials, it is intractable to calculate the error for the whole database. Also, some of the experimental bandgaps were averages of multiple experiments.
The MBJ potential is found to be more suitable for large-bandgap insulators and can also change the energetics of bands in metallic systems. We found that some of the materials predicted as metallic using PBE, such as Ge and GaAs, are semiconductors using MBJ. To better understand the source of error in the bandgap evaluation, we followed the Materials Project (MP) approach (https://www.materialsproject.org/docs/calculations#Accuracy_of_Band_Structures) and determined a “shifted” MAE for our bandgap evaluations. This treatment removes the effect of the systematic DFT underestimation of the gap. To do this, we first fitted a linear equation for the OPT and MBJ data with respect to experiment. The slope was found to be 1.17 and 1.44 for MBJ and OPT, respectively. The slope was then used as a scaling parameter to scale up the OPT and MBJ data. After the data were shifted, the MAE with respect to experiment was found to be 0.42 eV for MBJ and 0.69 eV for OPT, compared with the MP result of 0.6 eV. We also calculated Spearman’s coefficient (SC) to measure monotonicity in the bandgap data from different methods compared to experiment. A high value of SC suggests that the trends are similar to those in the experimental data. The highest value was obtained for HSE06 (0.97), followed by MBJ (0.94) and AFLOW (0.94). Additionally, we compare the computational time taken by HSE06, MBJ and OPT calculations for a few cases. We find that MBJ takes about an order of magnitude more computational time than OPT, while HSE06 takes an order of magnitude more computational time than MBJ. A comparison table of computational times is given in the supplementary information (Supplementary Table S1).
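The scaling and rank-correlation steps can be sketched as below. This is only an illustration under stated assumptions: the linear fit is taken as a least-squares line through the origin (experiment ≈ slope × calculation), which the text does not specify, and the rank correlation ignores ties. The function names and the gap values are hypothetical.

```python
import numpy as np


def scaled_mae(calc, expt):
    """Fit expt ~ slope * calc through the origin, rescale, and return (slope, MAE)."""
    calc = np.asarray(calc, dtype=float)
    expt = np.asarray(expt, dtype=float)
    slope = np.sum(calc * expt) / np.sum(calc * calc)
    return float(slope), float(np.mean(np.abs(slope * calc - expt)))


def spearman(x, y):
    """Spearman rank correlation for data without ties."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return float(np.corrcoef(rank_x, rank_y)[0, 1])


# Made-up gaps (eV) in which the calculation underestimates experiment by exactly 20%
expt_gaps = np.array([1.0, 2.0, 3.0, 5.0])
calc_gaps = 0.8 * expt_gaps
slope, shifted = scaled_mae(calc_gaps, expt_gaps)
rho = spearman(calc_gaps, expt_gaps)
```

In this idealized case a purely multiplicative underestimation is removed completely by the scaling, so the remaining error after scaling measures only the non-systematic scatter.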
Next, to understand the trends in the whole database, we compared the bandgaps obtained from OPT and MBJ, as shown in Fig. 4a. It is to be noted that many of our OPT and MBJ calculations are still running; we compare only the data that are available for both OPT and MBJ. The blue circles show the MBJ bandgaps while the green ones represent the OPT bandgaps. We also plot the experimental results (red dots) for a small subset (from Table 3 (available online only)) in Fig. 4a. More specifically, we plotted the three types of data (MBJ, OPT and experiment) against the MBJ results. As the MBJ data are plotted against themselves, they produce a straight line along the diagonal of the plot. For perfect agreement between OPT and MBJ, all the OPT data should lie on the same straight line. However, most of the OPT data are below the straight line, representing an underestimation of the bandgap. Compared to experiments, the MBJ results describe bandgaps much better than the OPT results. This is shown by the fact that up to about 6 eV most of the experimental data lie on the figure diagonal, while the OPT results lie systematically under it.
The relative difference between the OPT and MBJ bandgaps is shown in Supplementary Fig. S1a. The percentage difference Δ between the OPT and MBJ values of a quantity x is calculated as:

$$\Delta = \frac{x_{MBJ}-x_{OPT}}{x_{OPT}}\times 100\% \qquad (5)$$
To avoid division by very low or zero values, we calculated percentage differences only for materials with an OPT gap of more than 1 eV. The upper bound of the relative changes in bandgap can range from 30% up to more than 100%.
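The filtering described above can be sketched as follows. The sketch assumes that the percentage difference is the signed difference (MBJ minus OPT) divided by the OPT value, which is consistent with the division-by-zero remark but whose sign convention is an assumption. The function names and the gap values are hypothetical.

```python
def percent_difference(opt_value, mbj_value):
    """Signed percentage difference of MBJ relative to OPT (assumed convention)."""
    return 100.0 * (mbj_value - opt_value) / opt_value


def gap_percent_differences(gap_pairs, min_opt_gap=1.0):
    """Percentage differences for (opt_gap, mbj_gap) pairs with opt_gap above the cutoff."""
    return [percent_difference(opt, mbj) for opt, mbj in gap_pairs if opt > min_opt_gap]


# Made-up (OPT, MBJ) gaps in eV; the first pair is skipped because its OPT gap is zero
example_pairs = [(0.0, 0.5), (1.2, 1.8), (2.0, 3.0)]
differences = gap_percent_differences(example_pairs)
```

Materials that OPT predicts to be metallic are excluded by the cutoff, so the statistic describes only cases where both methods open a sizeable gap.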
Similar to the bandgap data, the static refractive indices in the x, y and z directions are also compared for OPT and MBJ. The static refractive index is related to the static dielectric function as $n\left(0\right)=\sqrt{{\epsilon}_{1}\left(0\right)}$. The static refractive indices in the x, y and z directions are shown in Fig. 4b–d. Like the MBJ bandgaps, the MBJ refractive indices are plotted against themselves to give a straight line, which can be used for comparison. A subset of OPT and MBJ static dielectric constant data is shown in Table 3 (available online only) and compared to experiments. The MAE values of the OPT and MBJ static dielectric constants in the x-direction are 3.2 and 2.6 respectively, showing the overall superiority of MBJ compared to OPT. It is to be noted that only interband transitions, and not intraband transitions, are accounted for in our calculations; hence Drude-like transitions are not taken into account^{37}. This implies that our dielectric function data should be more accurate for high-bandgap materials^{18}. Also, in cases where OPT predicts metallic behavior while MBJ predicts semiconducting or insulating behavior, the dielectric function, and therefore the static refractive index, would differ between OPT and MBJ, because Drude-like transitions are not captured in the present work. As MBJ bandgaps are more reliable than OPT bandgaps, the MBJ optical data can be considered more accurate than the OPT data, especially for low-bandgap materials. A very high difference (more than 100%) between the OPT and MBJ refractive indices was observed for materials such as ZnCoF_{4} (as clearly seen in Supplementary Fig. S1b–S1d) because of the very different bandgaps obtained using OPT and MBJ. We also find that the relative differences between OPT and MBJ refractive indices are much smaller than those for bandgaps.
Interestingly, while OPT underestimates the bandgaps compared to experiments, the predicted dielectric functions are relatively close to the experimental measurements, especially for high-bandgap materials. This is because our methodology describes interband transitions well but is not suitable for intraband transitions. Lastly, we also observe that the MBJ static refractive index data are generally lower than the OPT data, as noted in Table 4.
Next, in Fig. 5 we compare the OPT, MBJ and experimental imaginary parts of the dielectric function in the x-direction for 5a) 1T-SnSe_{2} ($P\overline{3}m1$), 5b) 2H-MoS_{2} (P6_{3}/mmc), 5c) Si ($Fd\overline{3}m$), 5d) Ge ($Fd\overline{3}m$), 5e) GaAs ($F\overline{4}3m$) and 5f) InP ($F\overline{4}3m$). We carried out our own experiments to obtain dielectric function data for 1T-SnSe_{2} ($P\overline{3}m1$) and 2H-MoS_{2} (P6_{3}/mmc), while the other experimental data were obtained from previous experiments by Aspnes et al.^{39}. It is clear from Fig. 5 that MBJ, in general, reproduces the experimental peak positions better than OPT. For SnSe_{2} and MoS_{2}, both methodologies give similar results compared to experiments. For 1T-SnSe_{2}, the peaks after 4 eV are more pronounced in DFT than in the experiment, which can be attributed to the resolving power of the experiments. In Fig. 5b, the peaks around 2 eV and 4 eV are captured well by both OPT and MBJ for MoS_{2}; however, there is a slight shift in the spectrum in the low-energy range due to the different bandgap descriptions of the two functionals. We are still investigating the small shift at higher energies, especially for SnSe_{2}. We observe a similar spectrum shift due to bandgap underestimation for the cases 5c, 5d, 5e, and 5f. Moreover, OPT produces peaks at low energies that are absent in the MBJ and experimental spectra. This is likely because when the bandgap is severely underestimated (as for OPT), the theory predicts interband transitions (e.g., valence to conduction band) that simply do not exist because the real gap is larger. Such peaks are absent in the MBJ based spectra. This suggests that for low-bandgap materials OPT can give unphysical transitions at low energies. However, the overall spectrum patterns are similar for OPT and MBJ at higher energies.
As observed in Fig. 5, the DFT intensity differs from experiment for some peaks, which can be explained by 1) the difference in temperature between the experimental setup (generally at room temperature) and the DFT simulation (always at zero Kelvin), and 2) the surface roughness of the sample, which is not included in the calculation. Such differences in peak intensities compared to experiments are also observed for other high-level DFT based methods^{46}. In a nutshell, our dielectric function data can be used to complement experimental spectra, for instance to help distinguish various peaks. In addition to the peak positions, the DFT data can be used to characterize the orbital nature of the associated electronic transitions, which can provide physical insight into a phenomenon^{47}. A detailed investigation of all the optical transitions for all the materials will be pursued in the future. Other quantities such as the refractive index, absorption coefficient, electron-energy-loss spectra (EELS), and optical conductivity can be calculated from the dielectric function data. As the dielectric function of materials can be anisotropic, we also provide the dielectric function data in the xx, yy, zz, xy, yz, and zx directions, which can be used to calculate the frequency-dependent birefringence of materials.
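As a sketch of how such derived quantities follow from the real and imaginary parts of the dielectric function, the example below uses the standard textbook relations for the refractive index n, extinction coefficient k, absorption coefficient and energy-loss function. It is not code from the JARVIS repository; the function names are hypothetical, energies are assumed to be in eV, and the birefringence is taken here simply as the difference between the refractive indices along two directions.

```python
import numpy as np

HBAR_C_EV_CM = 1.973269804e-5  # hbar * c in eV*cm


def optical_constants(energy_ev, eps_real, eps_imag):
    """Return (n, k, absorption in 1/cm, energy-loss function) on the energy grid."""
    energy = np.asarray(energy_ev, dtype=float)
    e1 = np.asarray(eps_real, dtype=float)
    e2 = np.asarray(eps_imag, dtype=float)
    modulus = np.hypot(e1, e2)  # |eps| = sqrt(e1^2 + e2^2)
    # maximum(..., 0) guards against tiny negative values from rounding
    n = np.sqrt(np.maximum(modulus + e1, 0.0) / 2.0)
    k = np.sqrt(np.maximum(modulus - e1, 0.0) / 2.0)
    absorption = 2.0 * energy * k / HBAR_C_EV_CM
    loss = e2 / (e1**2 + e2**2)
    return n, k, absorption, loss


def birefringence(n_first, n_second):
    """Difference between refractive indices along two directions."""
    return np.asarray(n_second, dtype=float) - np.asarray(n_first, dtype=float)


# Illustrative values: a transparent point (eps = 4) and an absorbing point (eps = 2i)
n_vals, k_vals, alpha_vals, loss_vals = optical_constants(
    [1.0, 2.0], [4.0, 0.0], [0.0, 2.0]
)
delta_n = birefringence([2.0], [3.0])
```

Applying `optical_constants` separately to, for example, the xx and zz components and passing the two resulting index arrays to `birefringence` gives a frequency-dependent estimate for an anisotropic material.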
Usage Notes
The database presented here represents the largest collection of consistently calculated optoelectronic properties of materials using density functional theory assembled to date. We anticipate that this dataset, and the methods provided for accessing it, will provide a useful tool in fundamental and application-related studies of materials. Our experimental verification provides insight into the applicability and limitations of our DFT data. Based on the list of data, the user will be able to choose particular materials for specific applications. Data mining, data analytics, and artificial-intelligence tools can then be added to guide the screening of materials.
Additional information
How to cite this article: Choudhary K. et al. Computational screening of high-performance optoelectronic materials using OptB88vdW and TB-mBJ formalisms. Sci. Data 5:180082 doi: 10.1038/sdata.2018.82 (2018).
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
 1
Polman, A. & Atwater, H. A. Photonic design principles for ultrahighefficiency photovoltaics. Nature materials 11, 174–177 (2012).
 2
Xiao, Z. et al. Efficient perovskite lightemitting diodes featuring nanometresized crystallites. Nature Photonics 11, 108–115 (2017).
 3
Kawazoe, H., Yasukawa, M., Hyodo, H. & Kurita, M. ptype electrical conduction in transparent thin films of CuAIO2. Nature 389, 939 (1997).
 4
Traversa, F. L., Bonani, F., Pershin, Y. V. & Di Ventra, M. Dynamic computing random access memory. Nanotechnology 25, 285201 (2014).
 5
Henning, T., Il'In, V., Krivova, N., Michel, B. & Voshchinnikov, N. WWW database of optical constants for astronomy. Astronomy and Astrophysics Supplement Series 136, 405–406 (1999).
 6
Forst, C. J., Ashman, C. R., Schwarz, K. & Blochl, P. E. The interface between silicon and a highk oxide. Nature 427, 53–56 (2004).
 7
Brothers, E. N., Izmaylov, A. F., Normand, J. O., Barone, V. & Scuseria, G. E. (AIP, 2008).
 8
Ong, S. P. et al. Python Materials Genomics (pymatgen): A robust, opensource python library for materials analysis. Computational Materials Science 68, 314–319 (2013).
 9
Saal, J. E., Kirklin, S., Aykol, M., Meredig, B. & Wolverton, C. Materials design and discovery with highthroughput density functional theory: the open quantum materials database (OQMD). Jom 65, 1501–1509 (2013).
 10
Curtarolo, S. et al. AFLOW: an automatic framework for high-throughput materials discovery. Computational Materials Science 58, 218–226 (2012).
 11
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Physical Review Letters 77, 3865 (1996).
 12
Petousis, I. et al. High-throughput screening of inorganic compounds for the discovery of novel dielectric and optical materials. Scientific Data 4, 160134 (2017).
 13
Wang, C. S. & Pickett, W. E. Density-Functional Theory of Excitation Spectra of Semiconductors: Application to Si. Physical Review Letters 51, 597–600 (1983).
 14
Chan, M. & Ceder, G. Efficient band gap prediction for solids. Physical Review Letters 105, 196403 (2010).
 15
Yu, L. & Zunger, A. Identification of potential photovoltaic absorbers based on first-principles spectroscopic screening of materials. Physical Review Letters 108, 068701 (2012).
 16
Castelli, I. E. et al. New Light‐Harvesting Materials Using Accurate and Efficient Bandgap Calculations. Advanced Energy Materials 5, 1400915 (1–7) (2015).
 17
Tran, F. & Blaha, P. Accurate band gaps of semiconductors and insulators with a semilocal exchange-correlation potential. Physical Review Letters 102, 226401 (2009).
 18
Tran, F. & Blaha, P. Importance of the Kinetic Energy Density for Band Gap Calculations in Solids with Density Functional Theory. J. Phys. Chem. A 121, 3318–3325 (2017).
 19
Rai, D., Ghimire, M. & Thapa, R. A DFT study of BeX (X= S, Se, Te) semiconductor: modified Becke Johnson (mBJ) potential. Semiconductors 48, 1411–1422 (2014).
 20
Setyawan, W., Gaume, R. M., Lam, S., Feigelson, R. S. & Curtarolo, S. High-throughput combinatorial database of electronic band structures for inorganic scintillator materials. ACS Combinatorial Science 13, 382–390 (2011).
 21
Koller, D., Tran, F. & Blaha, P. Merits and limits of the modified Becke-Johnson exchange potential. Physical Review B 83, 195134 (2011).
 22
Boujnah, M., Dakir, O., Zaari, H., Benyoussef, A. & El Kenz, A. Optoelectronic response of spinels CdX2O4 with X=(Al, Ga, In) through the modified Becke–Johnson functional. Journal of Applied Physics 116, 123703 (2014).
 23
Singh, D. J. Electronic structure calculations with the Tran-Blaha modified Becke-Johnson density functional. Physical Review B 82, 205102 (2010).
 24
Feng, W., Xiao, D., Zhang, Y. & Yao, Y. Half-Heusler topological insulators: A first-principles study with the Tran-Blaha modified Becke-Johnson density functional. Physical Review B 82, 235121 (2010).
 25
Qiao, J., Kong, X., Hu, Z.-X., Yang, F. & Ji, W. High-mobility transport anisotropy and linear dichroism in few-layer black phosphorus. Nature Communications 5 (2014).
 26
Klimeš, J., Bowler, D. R. & Michaelides, A. Van der Waals density functionals applied to solids. Physical Review B 83, 195131 (2011).
 27
Kresse, G. & Furthmüller, J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Computational Materials Science 6, 15–50 (1996).
 28
Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Physical Review B 54, 11169–11186 (1996).
 29
Blöchl, P. E. Projector augmented-wave method. Physical Review B 50, 17953 (1994).
 30
Choudhary, K., Kalish, I., Beams, R. & Tavazza, F. High-throughput Identification and Characterization of Two-dimensional Materials using Density functional theory. Scientific Reports 7 (2017).
 31
Cheon, G. et al. Data mining for new two- and one-dimensional weakly bonded solids and lattice-commensurate heterostructures. Nano Letters 17, 1915–1923 (2017).
 32
Thonhauser, T. et al. Van der Waals density functional: Self-consistent potential and the nature of the van der Waals bond. Physical Review B 76, 125112 (2007).
 33
Tawfik, S. A., Gould, T., Stamp, C. & Ford, M. J. Dispersion forces in heterostructures: problem solved? arXiv preprint arXiv:1712.08327 (2017).
 34
Becke, A. & Roussel, M. Exchange holes in inhomogeneous systems: A coordinate-space model. Physical Review A 39, 3761 (1989).
 35
Gajdoš, M., Hummer, K., Kresse, G., Furthmüller, J. & Bechstedt, F. Linear optical properties in the projector-augmented wave methodology. Physical Review B 73, 045112 (2006).
 36
Moseley, L. & Lukes, T. A simplified derivation of the Kubo‐Greenwood formula. American Journal of Physics 46, 676–677 (1978).
 37
Wooten, F. Optical properties of solids (Academic Press, 2013).
 38
Burke, K. The ABC of DFT (Department of Chemistry, University of California, 2007).
 39
Aspnes, D. E. & Studna, A. Dielectric functions and optical parameters of Si, Ge, GaP, GaAs, GaSb, InP, InAs, and InSb from 1.5 to 6.0 eV. Physical Review B 27, 985 (1983).
 40
Vishwanath, S. et al. Controllable growth of layered selenide and telluride heterostructures and superlattices using molecular beam epitaxy. Journal of Materials Research 31, 900–910 (2016).
 41
Li, W. et al. Broadband optical properties of large-area monolayer CVD molybdenum disulfide. Physical Review B 90, 195434 (2014).
 42
Larsen, A. et al. The Atomic Simulation Environment—A Python library for working with atoms. Journal of Physics: Condensed Matter 29, 273002 (2017).
 43
Tao, J., Zheng, F., Gebhardt, J., Perdew, J. P. & Rappe, A. M. Screened van der Waals correction to density functional theory for solids. Physical Review Materials 1, 020802 (2017).
 44
Nwigboji, I. H. et al. Ab-initio computations of electronic and transport properties of wurtzite aluminum nitride (w-AlN). Materials Chemistry and Physics 157, 80–86 (2015).
 45
Araujo, R. B., De Almeida, J. & Ferreira Da Silva, A. Electronic properties of III-nitride semiconductors: A first-principles investigation using the Tran-Blaha modified Becke-Johnson potential. Journal of Applied Physics 114, 183702 (2013).
 46
Botti, S. et al. Long-range contribution to the exchange-correlation kernel of time-dependent density functional theory. Physical Review B 69, 155112 (2004).
 47
Choudhary, K. et al. Computational discovery of lanthanide doped and Co-doped Y3Al5O12 for optoelectronic applications. Applied Physics Letters 107, 112109 (2015).
 48
Camargo-Martínez, J. & Baquero, R. Performance of the modified Becke-Johnson potential for semiconductors. Physical Review B 86, 195106 (2012).
 49
Berger, L. I. Semiconductor materials (CRC Press, 1996).
 50
Kumar, A. & Ahluwalia, P. Tunable dielectric response of transition metals dichalcogenides MX2 (M= Mo, W; X= S, Se, Te): Effect of quantum confinement. Physica B: Condensed Matter 407, 4627–4634 (2012).
 51
Holm, B., Ahuja, R. & Yourdshahyan, Y. Elastic and Optical Properties of α-Al2O3 and κ-Al2O3. Physical Review B 777–712 (1999).
 52
Yan, J., Jacobsen, K. W. & Thygesen, K. S. Optical properties of bulk semiconductors and graphene/boron nitride: The Bethe-Salpeter equation with derivative discontinuity-corrected density functional energies. Physical Review B 86, 045208 (2012).
Data Citations
 1
Choudhary, K. figshare https://doi.org/10.6084/m9.figshare.5825994.v1 (2018)
Acknowledgements
We thank Carelyn Campbell, Kevin Garrity, John Vinson and Jason Hattrick-Simpers at NIST for helpful discussions.
Author information
Contributions
K.C. performed DFT calculations, developed the Python framework code and webpage, and worked on data analysis and verification. Q.Z. and N.V.N. performed ellipsometry experiments. S.C. performed some of the DFT calculations. Z.T. and M.W.N. helped set up the MDCS MongoDB database for the DFT calculations. F.Y.C. helped put the DFT data in CoRR. A.R. helped with webpage deployment. F.T. assisted in developing the database, designing the convergence criterion, preparing analysis plots, and writing the manuscript.
Corresponding author
Correspondence to Kamal Choudhary.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files made available in this article.
About this article
Cite this article
Choudhary, K., Zhang, Q., Reid, A. et al. Computational screening of high-performance optoelectronic materials using OptB88vdW and TB-mBJ formalisms. Sci Data 5, 180082 (2018). https://doi.org/10.1038/sdata.2018.82