Density functional theory-based electric field gradient database

The deviation of the electron density around the nuclei from spherical symmetry determines the electric field gradient (EFG), which can be measured by various types of spectroscopy. Nuclear Quadrupole Resonance (NQR) is particularly sensitive to the EFG. The EFGs, and by implication NQR frequencies, vary dramatically across materials. Consequently, searching for NQR spectral lines in previously uninvestigated materials represents a major challenge. Calculated EFGs can significantly aid at the search’s inception. To facilitate this task, we have applied high-throughput density functional theory calculations to predict EFGs for 15187 materials in the JARVIS-DFT database. This database, which will include EFG as a standard entry, is continuously increasing. Given the large scope of the database, it is impractical to verify each calculation. However, we assess accuracy by singling out cases for which reliable experimental information is readily available and compare them to the calculations. We further present a statistical analysis of the results. The database and tools associated with our work are made publicly available by JARVIS-DFT (https://www.ctcms.nist.gov/~knc6/JVASP.html) and NIST-JARVIS API (http://jarvis.nist.gov/). Measurement(s) electric field gradient Technology Type(s) computational modeling technique Factor Type(s) material studied Measurement(s) electric field gradient Technology Type(s) computational modeling technique Factor Type(s) material studied Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.13027220

The main goal of this work is to provide the calculated values of EFGs for different compounds in one of the most comprehensive materials databases. The paper is organized as follows: first, we present, for consistency, the general theory and methodology. Next, we shall compare the computational results for cases where reliable experimental data exist, with the experiment, and discuss several individual cases where unusually large deviations have been found. In doing that, we will point out potential factors that may render EFGs particularly sensitive to computational details. Lastly, we shall discuss the general statistical distribution of the EFG parameters.

Methods
General theory. The key parameters used to define NQR spectral lines are the quadrupole coupling constant ν Q = eQV zz /h and the asymmetry parameter where e is electric charge, h is Planck's constant, and Q is the nuclear quadrupole moment; V ii are the principal components of the diagonalized EFG tensor, defined as the second derivative in Cartesian coordinates of the Coulomb potential at the nucleus position. By construction, the EFG, V ii is a traceless tensor. The coordinate system, in accordance with the convention used by experimentalists is chosen so that xx , which forces 0 ≤ η1. Note that if the point group of the site in question is cubic, then by symmetry all components are zero; if it is tetragonal or hexagonal, then η = 0, but V zz ≠ 0. Density functional theory 17 calculations can be used to predict the EFGs, however, a systematic database of such quantities for materials is still missing. Computationally, it is much easier to provide such properties for thousands of materials in a systematic way than to do so through experiments. Note that the nuclear quadruple moment Q is needed for the comparison with experiment. However, it cannot be calculated and it is often set by comparing calculations to experiment. More details can be found in Pyykkö 18 .
DFT appears to be a good tool, with sufficient predictive power, to point the experiment on a new compound in the right direction. Indeed, ideally, DFT, by construction, provides the exact charge density and therefore exact EFGs. In practice, of course, various approximations are used, but, as a rule of thumb, DFT is much better suited for calculating the total energy and the total charge density than for calculating, say, electron excitation spectra.
Initially, DFT was applied to EFGs in its all-electron formulation. One of the first full-potential, all-electron codes to implement this calculation was the WIEN2k package developed by Schwarz, Blaha et al. 41 implementing the full-potential linearized augmented-plane-wave (FLAPW) method. In ref. 41 it was first applied to calculating EFGs and since then has been used for several classes of materials. It is essential for this task to have no additional approximations for the charge density and potential shape, as, for instance, in earlier muffin-tin versions of these codes. On the other hand, pseudopotential methods do not have any restrictions on the shape of the potential, being just plane wave expansions, and are very fast. The problem with such methods is that inside the atomic core they used pseudo-wavefunctions and pseudo-density, rather than the actual electronic charge. This problem was resolved with the invention by P. Blochl in 1994 of the projector augmented waves (PAW) method 42 , which allows rigorous extraction of the true electronic density from PAW pseudopotential calculations. The formalism was tested by Petrilli et al. 43 and found to be consistent with all-electron calculations. Later PAW pseudopotentials were implemented in the popular software package VASP (Vienna ab-initio simulation package) and the formalism of ref. 43 was implemented as well. This package will be used throughout this paper. A more detailed review of predicting nuclear quantities using density functional theory based methods can be found elsewhere 44,45 .
In this work, we apply the PAW formalism to develop a computational database of EFGs for ~15000 materials and make the database publicly available through the NIST-JARVIS platform. We compare a few of the predicted data with experiments to estimate the uncertainty in predictions and carry out correlation and trend analysis to reveal the underpinning physics. The NIST-JARVIS 38 (https://jarvis.nist.gov/) has several components, such as JARVIS-FF, JARVIS-DFT, JARVIS-ML, JARVIS-STM, JARVIS-Heterostructure and hosts material-properties such as lattice parameters 27 , formation energies 29 , 2D exfoliation energies 26 , bandgaps 30 , elastic constants 27 , heterostructure 46 , dielectric constants 30,47 , infrared intensities 47 , piezoelectric constants 47 , thermoelectric properties 32 , optoelectronic properties, solar-cell efficiencies 31,34 , topological materials 29,30 , and computational STM images 33 . The JARVIS-DFT (https://www.ctcms.nist.gov/~knc6/JVASP.html) currently hosts ~40000 3D, ~1000 2D materials with millions of calculated material-properties. We believe the EFG database along with other property-data will be a useful resource for material-design and can serve as precursors for artificial intelligence and data-mining methods.
Density functional theory calculations. The DFT calculations were carried out using the Vienna Ab-initio simulation package (VASP) [48][49][50][51] . The entire study was managed, monitored, and analyzed using the modular workflow, which we have made available on the JARVIS-Tools GitHub page (https://github.com/ www.nature.com/scientificdata www.nature.com/scientificdata/ usnistgov/jarvis). Please note that commercial software is identified to specify procedures. Such identification does not imply endorsement by the National Institute of Standards and Technology. We use the projected augmented wave method 42,52 and OptB88vdW functional 53 , which gives accurate lattice parameters for both van der Waals (vdW) and non-vdW solids 26,27 . Both the internal atomic positions and the lattice constants are allowed to relax in spin-unrestricted calculations until the maximal residual Hellmann-Feynman forces on atoms are smaller than 0.001 eV Å −1 and energy-tolerance of 10 −7 eV. We do not consider spin-orbit interactions or magnetic orderings besides ferromagnetic, because of a high computational cost. We note that nuclear spins are not explicitly considered during the DFT calculations. The list of pseudopotentials used in this work is given on the GitHub page. The k-point mesh and plane-wave cut-off were converged for each material using the automated procedure described in ref. 37 . After optimization of the structures, we calculate the EFGs at the positions of atomic nuclei following Petrilli et al. 43 . We add 30% extra plane-wave cut-off on top of that obtained from automatic convergence script and impose a tighter electronic convergence threshold of 10 −8 eV for the EFG calculations.
Starting from ~40000 3D materials the JARVIS-DFT database, we screen out materials with point group 23, m 43 , m3, 432, m 3m in which atomic sites have cubic symmetry, leaving 25931 candidates. Next, we choose materials with 20 or lesser atoms in a unit cell for computational cost reasons leading to 18672 materials. Out of these candidates we have calculated EFGs for 15187 materials and EFG data for other materials would also be available soon.

Data Records
After the calculations, the metadata is stored in the Javascript Object Notation Files (JSON) format which can be easily integrated with databases such as MongoDB. The dataset would be made publicly available through JARVIS-DFT (https://www.ctcms.nist.gov/~knc6/JVASP.html) website and NIST-JARVIS REST-API 38 (https:// jarvis.nist.gov/). We have also made a Comma Separated Values (CSV) format dataset publicly available through the Figshare repository 54 . Note in the CSV file, η is listed as zero when V zz is zero, as is standard for VASP output. The CSV file contains the following entries as shown in Table 1: In addition to the CSV file, we provide a JSON file which contains EFG components and atomic structure information. The JSON file contains an extra entry called "atoms" with atomic structure information along with above mentioned keys. An example script is also provided to parse the JSON file. The script can be used for various post-processing analysis.

Technical Validation
As mentioned previously, the exact DFT would give the exact theoretical EFG, as opposed to, for instance, the excitation gap. In reality, however, there are several sources of inaccuracy, as discussed below.
The first limitation is that in all calculations, for the purpose of uniformity, we have used the calculated crystal structure, even for those cases where an experimental structure was available. Normally that does not have a large effect. Recall that our goal is not to predict the EFGs with maximal accuracy, but to give experimentalists a reference point for each material, in the vicinity of which they should search for spectral lines. Another important factor could be the temperature effects, which is not directly captured in 0 K DFT calculations.
Notable exceptions occur when the local environment is very close to cubic, but not exactly. Consider an example of a hexagonal closed packed (hcp) material. If the ratio c/a is equal to its ideal value, ≈ . 8/3 1 633, the nearest neighbors around each site form an ideal dodecahedron and the EFGs from the six in-plane neighbors and six nearest-plane neighbors cancel out exactly. In a simple point charge model, neglecting the second and farther neighbors (whose contribution is at least 4.5 times smaller), the net contribution to V zz is proportional to ; that is to say V zz is linear with the deviation of c/a from the ideal value, with the zero intercept very close to the latter. Indeed, we observe this linear behavior for α-Sc (Fig. 1). Note that experimentally, c/a for Sc is 1.6, less than 2% away from the ideal ratio, so a 2% error in determining c/a can lead to a 100% error in EFG. Other materials' sites close to a local cubic environment, where a particular caution needs to be exercised, include hcp and its derivative, including wurtzite or lonsdaleite, or structures derived from cubic, such as distorted perovskites.
The second caveat is that, due to the nature of our massive calculations, we could not manually introduce experimental magnetic order in each case (and, in many cases, it is simply not known). To this end, we did blanket calculations for each material starting from a ferromagnetic (FM) configuration of the electron spin. In most, albeit not all cases, such procedure converges to an FM solution even for experimentally antiferromagnetic (AF) materials and captures the effect of the local magnetization on the EFGs at least semi-quantitatively. Notable exceptions occur when AF order triggers specific orbital ordering, such as Kugel-Khomskii effect 55 . Another dramatic example is provided by Fe-based superconductors, where the observed magnetic ordering breaks the tetragonal symmetry. Calculations show 56 that the FM and the Neel AF state (which does not break the tetragonal symmetry) have a relatively minor (10% -30%) effect on the EFGs. For these same materials, but with proper stripe-like (symmetry-breaking) magnetic ordering imposed, the EFG changes by up to an order of magnitude. This caveat should be kept in mind when using the database.
Finally, for heavy elements, like uranium, one may expect that including spin-orbit coupling may alter the results. However, so far our experience has been that even in those cases the effect of spin-orbit 57 is at best moderate, so this does not appear to be a serious limitation.
With this in mind, let us proceed to an experimental validation for selected cases where experimental benchmarks were available. As there is no systematic chemistry and spacegroup based experimental database for electric field gradients, we manually search for experimental data  and also extract values from the NQRS database from JAICI 39,40 . Since neither NQRS, nor the original references, necessarily give information on the crystal symmetry, we ensure that the proper polymorphs are being compared in cases of uncertainty by choosing materials that have only one known polymorph. In addition, we verify that the number of ν Q's matched the number of expected inequivalent sites. Furthermore, we only extract experimental values determined at room temperature or colder; in most cases below 100 K.
After extracting the experimental values, we compared them to our DFT predictions. The comparisons are shown in Online-only Table 1. Details of each material are available in the database with its corresponding webpage, for example, webpage (https://www.ctcms.nist.gov/~knc6/jsmol/JVASP-4149) for JARVIS-ID: JVASP-4149. The list of materials for experimental comparison consists of several chemical classes such as 17 unary, 16 binary, 5 ternary compounds including oxides, sulfides, nitrides, halides, and gallides. In addition to the experimental data, we compare with previously computed data. The mean absolute deviation of the experimental comparison dataset is 1.17 × 10 21 Vm −2 , while the mean absolute percentage difference is 28.91%. The absolute deviation varies from a low value of 0.03 ×10 21 Vm −2 for titanium to a maximum value of 5.3 × 10 21 Vm −2 for UAs 2 .
Our DFT and experimental data show close agreements (Pearson's coefficient 0.999) as shown in Fig. 2a. Our computational data compares well with the previously reported values (Pearson's coefficient 0.999) as also shown in Fig. 2b. The previously reported data could be either from FLAPW or PAW based calculations and different pseudopotentials suggesting that different computational methods have only a moderate impact on the predicted values. As discussed, materials close to a local cubic environment, such as Sc, exhibit high sensitivity to the crystal   Fig. 1 This plot illustrates the sensitivity of the V zz in the situation when the nearest neighbor environment in nearly cubic, and V zz is nonzero but numerically small for Scandium case (JVASP-996). The leftmost dotted vertical line corresponds to the JARVIS-DFT-OptB88vdW calculated c/a and the rightmost vertical line corresponds to the experimentally determined c/a. It also illustrates the effect of choosing different energy functionals and band structure methods. PBE is the standard gradient-corrected density functional (GGA), OptB88vdW denote the calculations performed as describe in the paper for the entire database (in the rest of the paper, no attempt was made to correct the c/a ratio), SCAN is a recently proposed meta-GGA functional, which improves the results for strongly correlated systems (which Sc is not), HSE06 is a hybrid nonlocal functional (see VASP manual for details) and WIEN2k is an all-electron LAPW code, also utilizing PBE. Gaussian integration was used throughout the paper, except this figure, where we used tetrahedrons to reduce noise and elucidate a clean linear dependence.
Scientific Data | (2020) 7:362 | https://doi.org/10.1038/s41597-020-00707-8 www.nature.com/scientificdata www.nature.com/scientificdata/ structure, in those cases, to c/a. Using the experimental c/a, rather than the calculated value, greatly improves the agreement with the experiment; see Fig. 1. Even without that, however, the overall computational predictions compare well with experiments, as seen in Fig. 2.   Fig. 2 Comparison of current-work, experimental, and previous calculation data from Online-only Table 1. (a) experimental data vs DFT data from current work, (b) previously computed data vs current work data. The lines are the ideal case with a slope of unity. www.nature.com/scientificdata www.nature.com/scientificdata/ Next, we present some statistical analysis, which utilizes the unprecedently large scope of this study. In Fig. 3 we present the histograms for the principal components of EFG for 15187 materials. As shown in Fig. 3, it is interesting to observe that our initial weeding out of the locally cubic sites was incomplete, because ~30% of sites have a zero EFG. It is partially due to the fact that in many compounds some, but not all sites are locally cubic.
The EFG component distributions exponentially decay as we approach high values suggesting that there are relatively few materials with high EFG. Similar behavior is observed for all three EFG components. The V zz distribution range (Fig. 3c) is higher than the other two components because by convention V zz is the highest EFG value. We find that some of the high-V Interestingly, all of these very high-V zz materials are vdW-bonded which could be responsible for high electron cloud asymmetry. The asymmetry parameter, η is calculated using Eq. 1 and we observe that majority of the atomic sites have zero asymmetry parameters by symmetry, as shown in Fig. 3d.
In order to find the highest η sites, we screen for materials with η≈1 (note that no site symmetry can lead to η being exactly 1, so such materials simply happen to have one of the principal components of the EFG tensor numerically small), which leads to candidates such as Te in Rb 2 Te 5 (JVASP-4149), and O in ZnTiO 3 (JVASP-11167). Some more examples are shown in the Table 2. The list consists of compounds with diverse chemistry. In addition to the EFG and materials information, the table contains important information, such as OptB88vdW bandgaps and energy above the convex hull that might be important from the experimental perspectives. The complete dataset of the EFG tensor and the asymmetry parameter is provided in the data-records section, from which several similar candidate materials can be easily identified.
Next, we identify correlations in the EFG parameters in Figs. 4 and 5. In Fig. 4a we plot the numerator and denominator of the asymmetry parameter equation. Clearly in Fig. 4a, if the numerator is zero, the asymmetry parameter is zero, while atomic sites lying on the x = y line represents the highest asymmetric parameter (η = 1) sites; the later may be of interest from an experimental perspective. Obviously, the highest η is obtained when V xx is zero but V zz is not. In Fig. 4b, we plot the asymmetry parameter against the V zz component which shows a bell-shape feature. Upon initial appearance, Fig. 4b suggests that high asymmetry behavior is preferentially

Materials
Atom JID E g (eV) E hull (eV) V ZZ V YY  Table 2. Some example compounds with high asymmetry parameter (η ≈ 1). Chemical formula, JARVIS-ID, bandgap (E g )(eV), energy above convex hull (E hull ) (eV), electric field gradients (V ZZ , V YY ) (10 21 Vm −2 ) and the asymmetry parameter (η) information is provided. www.nature.com/scientificdata www.nature.com/scientificdata/ observed in low V zz values. However, the 3D histogram (shown in Fig. 5), which is plotted on a logarithmic scale, reveals a similar distribution of η values for all V zz ; there are simply much fewer materials with high V zz .
In Fig. 6, we show the periodic table trends for the elements with their highest V zz value among all the materials under investigation, containing this element. Interestingly, we observe that halides such as Br and I, transition elements such as Au and Pt, actinides such as U, pnictides such as Bi and Sb, and inert gases such as Xe attain high V zz values. Low atomic weight elements such as H, Be and Li have low V zz values. These results can be important for experiments because it represents the overall trends during NMR/NQR frequencies. We note that these trends are not for individual elemental systems but elemental distribution in the multicomponent systems.

Usage Notes
The database presented here represents the largest collection of consistently calculated electric field gradient properties of materials using density functional theory assembled to date. We anticipate that this dataset, and the methods provided to access it, will provide a useful tool in fundamental and application-related studies of materials. Our actual experimental verification provides insight into understanding the applicability and limitation of our DFT data. Based on the list of data, the user will be able to choose particular materials for specific applications. Data mining, data analytics, and artificial-intelligence tools then can be added to guide the design and optimization of materials. The logarithmic z-scale allows for the strong preponderance of small V ZZ and small η values to be clearly seen, while at the same time revealing the bell-shaped distribution in V ZZ for all η values. Values where V zz = 0, and therefore η is ill-defined, have been excluded.