Abstract
We develop an open-source Python workflow package, pyGWBSE, to perform automated first-principles calculations within the GW-BSE (Bethe-Salpeter) framework. GW-BSE is a many-body perturbation theory based approach to explore the quasiparticle (QP) and excitonic properties of materials. The GW approximation accurately predicts bandgaps of materials by overcoming the bandgap underestimation issue of the more widely used density functional theory (DFT). The BSE formalism produces absorption spectra directly comparable with experimental observations. The pyGWBSE package achieves complete automation of the entire multi-step GW-BSE computation, including the convergence tests of several parameters that are crucial for the accuracy of these calculations. pyGWBSE is integrated with Wannier90 to generate QP bandstructures interpolated using maximally-localized Wannier functions. pyGWBSE also enables the automated creation of databases of metadata and data, including QP and excitonic properties, which can be extremely useful for future material discovery studies in the field of ultra-wide bandgap semiconductors, electronics, photovoltaics, and photocatalysis.
Introduction
Obtaining materials with properties that are optimized for a particular application traditionally relies on time-consuming and expensive experimental work. However, in recent years an alternative paradigm in the field of material discovery has emerged through the availability of modern massive supercomputing resources, development of first-principles methodologies, and ingenious computational algorithms. These advancements have pushed the boundaries of materials simulations, making them faster, more cost-effective, efficient and accurate. Consequently, high-throughput materials simulations have emerged as a tool for creating large databases and screening materials from these databases to identify candidate materials for applications in photocatalysis1,2, energy storage3,4, piezoelectrics5, electrocatalysis6 etc.
However, applying similar approaches to applications related to optical and transport properties of materials has been hindered by a few technical obstacles. Density functional theory (DFT), the most widely used tool in computational high-throughput materials discovery studies, is designed to explore ground state properties of a system and has been remarkably successful in predicting structural, mechanical, electronic, and thermal properties7. However, studying the excited state properties of a system, such as optoelectronic or transport properties, using DFT requires the interpretation of Kohn-Sham (KS) eigenvalues as energies involved in adding an electron to a many-electron system or subtracting one from it (QP energies). Following such a procedure, one often encounters the infamous bandgap underestimation problem due to the derivative discontinuity of the exchange-correlation energy8.
A more rigorous approach to computing QP energies and accurate bandgaps is to apply many-body perturbation theory (MBPT) within the GW approximation9. Using this formalism, one computes the QP energies by calculating the first-order perturbative correction to the KS eigenvalues, approximating the self-energy as a product of the one-particle Green’s function (G) and the screened Coulomb interaction (W). It has been shown that MBPT within the GW approximation is particularly useful in computing QP properties of a wide variety of semiconductors and insulators10 without requiring any ad-hoc introduction of mixing parameters like those needed for hybrid functionals used in DFT11,12.
Additionally, the study of QP properties using GW formalism enables us to compute several transport properties of materials that are inaccessible from a DFT calculation. For example, QP lifetimes calculated from GW calculations can be directly used to estimate impact ionization rates, a very useful parameter in the study of high-field transport of wide bandgap materials13. In the case of low-field transport, in addition to the obvious importance14 of including GW corrections to KS eigenvalues and carrier mobilities, it has been shown that one needs to include the effects of GW correction on the orbital character of the relevant KS wavefunctions to obtain accurate transport properties of molecular junctions15.
The optical and transport properties of semiconductors are, to a large extent, defined by the presence of intentional dopants or unintentional defects. The GW formalism has emerged as a powerful approach that complements experiments and has become reliable enough to serve as a predictive tool for crucial point defect properties such as charge transition levels and F-center photoluminescence spectra in semiconductors16,17. Moreover, calculations based on MBPT using the GW approximation have been successfully applied to estimate non-radiative recombination rates, such as Auger recombination rates18,19, which are very useful for optical applications. The Auger recombination mechanism has been shown to cause significant efficiency loss in InGaN-based light-emitting diodes (LEDs) when operating at high injected carrier densities.
The necessity of the BSE methodology lies in the fact that, even after including GW corrections, the optical spectrum calculated within the independent-particle picture shows significant deviations from experimental results: not only can the absorption energies be wrong, but the oscillator strengths of the peaks can often deviate from experiment by a factor of 2 or more. Moreover, it cannot describe bound exciton states, which are particularly important in systems of reduced dimensions20. The reason is that an independent-particle picture cannot include electron-hole interactions (excitonic effects), which require an effective two-body approach21. This can be achieved by evaluating the two-body Green’s function G2 and formulating an equation of motion for G2, known as the Bethe-Salpeter equation (BSE)21.
Despite its obvious indispensability, high-throughput computational material discovery studies for light-matter interaction related applications have rarely incorporated the QP or excitonic properties of materials using the GW-BSE formalism22,23,24,25, mostly due to the unavailability of an automated workflow implementation that can perform such calculations. Two main challenges for such an implementation are the efficient convergence of multiple parameters and the tractability of the huge computational cost associated with the multi-step GW-BSE formalism. GW-BSE calculations are extremely sensitive to multiple interdependent convergence parameters, such as the number of bands included in the GW self-energy calculation or the number of k-points used to sample the Brillouin zone (BZ) in the BSE calculation.
In this article, we introduce the open-source Python package, pyGWBSE, which automates the entire GW-BSE calculation using the first-principles simulation software Vienna Ab initio Simulation Package (VASP)26. This package enables automated input file generation, submission to supercomputing platforms, analysis of post-simulation data, and storage of metadata and data in a MongoDB database. Moreover, pyGWBSE is capable of handling multiple convergence parameters associated with the GW-BSE formalism. Using this package, high-throughput computation of various electronic and optical properties is possible in a systematic and efficient manner. For example, the QP energies, bandstructures, and density of states can be computed using both the one-shot G0W0 and partially self-consistent GW0 level of the GW formalism. The package enables automated BSE computations yielding the real and imaginary part of the dielectric function (incorporating electron-hole interaction), the exciton energies, and their corresponding oscillator strengths. DFT bandstructures, the orbital resolved density of states (DOS), electron/hole effective masses, band-edges, real and imaginary parts (absorption spectra) of the dielectric function, and static dielectric tensors can also be computed using pyGWBSE.
The package is being continuously developed and the latest version can be obtained from the GitHub repository at https://github.com/cmdlab/pyGWBSE. pyGWBSE is built upon existing open-source Python packages such as pymatgen27, Fireworks28, and atomate29. To obtain the QP bandstructure, we use Wannier9030, a program for calculating maximally-localized Wannier functions, to perform the interpolation required to obtain the QP bandstructure at reduced computational cost.
pyGWBSE enables high-throughput simulations of highly reliable and efficient ab-initio approaches thus enabling future materials screening studies, the creation of large databases of high-quality computed properties of materials, and in turn machine learning model development. Hence, pyGWBSE could serve as a catapult to the next generation of technological advances in the field of power electronics, optoelectronics, photovoltaics, photocatalysis, etc.
In the following sections, we present an overview of the underlying methodology, describe the workflow architecture, discuss the algorithms that were developed to perform the multi-step convergences, and benchmark the results obtained from the pyGWBSE workflow against experimental data in the literature.
Results and discussion
pyGWBSE: methodology and implementation
The first-principles calculations in the pyGWBSE package are performed using one of the most well-known packages, VASP. VASP is a first principles computer program for atomic scale materials modeling. It is capable of using the projector-augmented wave (PAW) method31 and it comes with a rigorously tested pseudopotential library. VASP provides the accuracy of the full-potential linearized augmented plane wave (FLAPW)32 method but is computationally less expensive than most of the traditional plane wave-based methods26. Currently, in addition to DFT, GW, and BSE, it supports various other post-DFT methods such as TD-DFT, ACFDT, 2nd-order Møller-Plesset perturbation theory and is under constant development, which opens up the possibility of implementing these methodologies in the pyGWBSE workflow in the future. Additionally, VASP is very efficiently parallelized and can utilize the potential of modern computers of both CPU and GPU-based architectures.
The pyGWBSE package is capable of computing several material properties using the DFT, GW, and BSE methodologies. Using pyGWBSE, properties such as bandstructures, the orbital resolved density of states (DOS), electron/hole effective masses, band-edges, real and imaginary parts (absorption spectra) of the dielectric function (ϵ(ω)), and static dielectric tensors can be computed using the DFT methodology. In the pyGWBSE package, the GW formalism can be used to compute QP energies at both the one-shot G0W0 and the partially self-consistent GW0 level of accuracy. The package uses maximally localized Wannier functions (MLWF) to compute the electronic structure at the QP level of accuracy but at a significantly reduced computational cost. Using this package, the BSE methodology can be used to compute the real and imaginary parts of the dielectric function (incorporating electron-hole interaction), the exciton energies, and their corresponding oscillator strengths.
In the following sections, we describe the pyGWBSE package’s computational framework with a particular emphasis on the crucial computational parameters that need to be converged to obtain accurate results. Figure 1 provides a condensed diagrammatic representation of the GW-BSE framework described in the following two subsections, showing the interconnection between the key equations and the various physical quantities. Detailed discussions about the GW-BSE methodology can be found in several review articles10,33,34,35. Additionally, we present the rationale behind the strategies that we adopt to reduce the computational cost of the convergence calculations and thus make the computations more efficient.
GW methodology: converging QP properties
The QP energies are the energies for adding an electron to a many-electron system or subtracting one from it. Within the MBPT they can be calculated by solving the following equation9,
where, T is the kinetic energy operator, Vn-e is the operator to account for the nuclear(ion)-electronic interaction, VH is the Hartree potential, Σ is the self-energy operator and r is the position vector of the electron. \({E}_{n{{{\bf{k}}}}}^{{{{\rm{QP}}}}}\) and \({\psi }_{n{{{\bf{k}}}}}^{{{{\rm{QP}}}}}\)s are the QP energies and wavefunctions for nth band with wavevector k.
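For reference, the QP equation described above is conventionally written in the form below. This is a reconstruction from the symbol definitions in the text and the standard MBPT literature, not a verbatim copy of the displayed Eq. (1):

```latex
\begin{equation}
\left(T + V_{n-e} + V_{H}\right)\psi_{n\mathbf{k}}^{\mathrm{QP}}(\mathbf{r})
+ \int d\mathbf{r}'\,
  \Sigma\!\left(\mathbf{r},\mathbf{r}',E_{n\mathbf{k}}^{\mathrm{QP}}\right)
  \psi_{n\mathbf{k}}^{\mathrm{QP}}(\mathbf{r}')
= E_{n\mathbf{k}}^{\mathrm{QP}}\,\psi_{n\mathbf{k}}^{\mathrm{QP}}(\mathbf{r})
\end{equation}
```

The self-energy is non-local and energy-dependent, which is why the QP energy appears on both sides of the equation.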
GW approximation provides a practical route to compute the self-energy operator, \(\Sigma ({{{\bf{r}}}},{{{{\bf{r}}}}}^{{\prime} },\omega )\), from KS wavefunctions, ψnk, and energies, ϵnk, through one particle Green’s function, \(G({{{\bf{r}}}},{{{{\bf{r}}}}}^{{\prime} },\omega )\)9,36,37. Within the GW approximation Σ can be written as,
where, W is the screened Coulomb interaction, ω is the frequency and δ is the positive infinitesimal.
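For reference, the GW self-energy is commonly written as the frequency convolution below. This is a reconstruction in one standard convention, using the symbols defined in the text. Prefactor and sign conventions differ between references, so it should be read as a sketch rather than the exact displayed Eq. (2):

```latex
\begin{equation}
\Sigma(\mathbf{r},\mathbf{r}',\omega)
= \frac{i}{2\pi}\int_{-\infty}^{\infty} d\omega'\,
  e^{i\omega'\delta}\,
  G(\mathbf{r},\mathbf{r}',\omega+\omega')\,
  W(\mathbf{r},\mathbf{r}',\omega')
\end{equation}
```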
W can be computed from ψnk and ϵnk through independent particle polarizability and frequency-dependent dielectric matrix utilizing the following three equations and taking advantage of the Random Phase Approximation (RPA)38.
The dielectric matrix, \({\epsilon }_{{{{\bf{q}}}}}({{{\bf{G}}}},{{{{\bf{G}}}}}^{{\prime} },\omega )\) is related to χ as38,
Within RPA, independent particle polarizability, \({\chi }_{{{{\bf{q}}}}}^{0}({{{\bf{G}}}},{{{{\bf{G}}}}}^{{\prime} },\omega )\), is calculated as38,
where wk is the k-point weight, fn,k is the one-electron occupancy of the corresponding state, q is the Bloch wave vector, G is a reciprocal lattice vector, and η is an infinitesimal complex shift.
Once the screened Coulomb interaction, W, is computed, the diagonal matrix elements of self-energy operator, Σ(ω)nk,nk, can be obtained using36,
where μ is the Fermi energy. In the non-self-consistent GW calculation (also known as G0W0) G0 and W0 are calculated using KS eigenvalues and eigenfunctions. The wavefunction of QP Hamiltonian (Eqn. (1)) is approximated as the DFT wavefunction and the QP energies are computed to first order as,
Since Eq. (7) requires the values of \({E}_{n{{{\bf{k}}}}}^{{{{\rm{QP}}}}}\), the equation must be solved by iteration. Using the usual Newton-Raphson method for root finding, one can obtain the following update equation36,
where Znk is the renormalization factor and can be calculated as,
The iteration starts from DFT eigenvalues and if one stops after the first iteration the QP energies are obtained within G0W0 approximation. One can also continue to obtain QP energies that are self-consistently converged. This scenario is referred to as self-consistent GW approximation (scGW)39.
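The iteration described above can be sketched in a few lines of Python. This is a minimal toy, not the VASP or pyGWBSE implementation: the function `qp_newton`, the model self-energy `sigma`, and all numerical values are hypothetical. It assumes the update has the standard linearized form E_{i+1} = E_i + Z_i Re[⟨T + V_{n-e} + V_H + Σ(E_i)⟩ − E_i] with Z_i = 1/(1 − ∂Σ/∂ω), and it keeps Σ fixed, whereas in an actual GW0 calculation G is rebuilt at every iteration.

```python
def qp_newton(eps_ks, vxc, sigma, dsigma, n_iter=1):
    """Newton-Raphson update of a single QP energy (toy model).

    eps_ks : KS eigenvalue (eV), used as the starting guess
    vxc    : expectation value of the DFT exchange-correlation potential (eV)
    sigma  : callable returning Re<psi|Sigma(w)|psi> at energy w
    dsigma : callable returning the derivative of sigma with respect to w
    n_iter : 1 gives a G0W0-like single-shot result
    """
    # <T + V_n-e + V_H> equals the KS eigenvalue minus <V_xc>
    h0 = eps_ks - vxc
    e_qp = eps_ks
    for _ in range(n_iter):
        z = 1.0 / (1.0 - dsigma(e_qp))  # renormalization factor Z
        e_qp = e_qp + z * (h0 + sigma(e_qp) - e_qp)
    return e_qp


# Hypothetical linear model self-energy around the KS eigenvalue
EPS_KS, VXC = 1.0, -10.0
sigma = lambda w: -9.5 - 0.25 * (w - EPS_KS)
dsigma = lambda w: -0.25

e_g0w0 = qp_newton(EPS_KS, VXC, sigma, dsigma, n_iter=1)
e_iter = qp_newton(EPS_KS, VXC, sigma, dsigma, n_iter=5)
print(e_g0w0, e_iter)
```

Because the model self-energy is linear in energy, a single Newton step already lands on the root, so the one-shot and iterated values coincide here. With a realistic frequency-dependent Σ the two would generally differ.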
In the pyGWBSE workflow, both the G0W0 and scGW are implemented. In the partially self-consistent scGW approximation the G is updated self-consistently until convergence is reached but the W is kept unchanged, thus the scGW is also referred to as GW0. A full update of the G and W is seldom adopted and thus is not included in the pyGWBSE package. In fact, it has been shown by several studies40,41,42 pertaining to free-electron gas, metals, and semiconductors that fully self-consistent GW calculations, without vertex corrections, lead to an overestimation of bandgaps.
There are three crucial parameters in a GW calculation that need to be converged to obtain accurate results, namely:
- Number of plane waves used to expand the screened Coulomb operator, \({W}_{q}({{{\bf{G}}}},{{{{\bf{G}}}}}^{{\prime} },\omega )\). This parameter can be specified by using ENCUTGW in the VASP implementation.
- Number of frequency grid points used in Eq. (6) for the frequency integration. This parameter can be specified by using NOMEGA in the VASP implementation.
- Number of bands used in Eqs. (5) and (6) for the summation. This parameter can be specified by using NBANDS in the VASP implementation.
In principle, the QP energies are to be converged w.r.t all three of the parameters mentioned above. Therefore, in our workflow, we have implemented the convergence tests for QP energies w.r.t. these parameters namely, ENCUTGW, NOMEGA, and NBANDS.
Tables 1 and 2 show the QP gaps of 9 materials computed with different values of ENCUTGW and NOMEGA. Table 1 shows the QP gaps computed with ENCUTGW values of 100, 150, and 200 eV, whereas Table 2 shows the QP gaps computed with NOMEGA values of 50, 65, and 80. From Table 1 we can see that an ENCUTGW value of 150 eV is sufficient for obtaining a QP gap value that is converged within ~0.1 eV for all the materials except diamond (C), for which we need an ENCUTGW value of 200 eV. Table 2 shows that we need a NOMEGA of 80 to converge the QP gap within 0.1 eV for all the selected materials. Thus, as recommended by the VASP manual, an ENCUTGW value of 2/3 × ENCUT and a NOMEGA value of 50–100 would be sufficient to obtain accurate QP energies for a wide range of materials. Furthermore, given the variety in the chemical compositions and crystal structures of the 9 materials in Tables 1 and 2, we surmise that a value of 200 eV for ENCUTGW and 80 for NOMEGA is likely to be sufficient to converge the QP gaps of a variety of other materials within 0.1 eV.
The third parameter, NBANDS, however, needs to be converged for every material. Despite many efforts to eliminate or reduce the need to include a large number of empty orbitals in the GW calculation, it still is one of the major computational costs of a GW calculation. Several methods have been proposed to reduce the total number of empty orbitals in a GW calculation, such as replacing actual KS orbitals with approximate orbitals generated using a reduced basis set, truncation of the sum over empty orbitals to a reduced number, and adding the contribution of the remaining orbitals within the static (COHSEX) approximation and modified static remainder approach43; however, these methods are not currently implemented in the VASP package. If they are implemented in the future, they can be easily incorporated into the pyGWBSE package and would reduce the computational costs of GW-BSE calculations significantly. Meanwhile, we strongly suggest performing convergence tests w.r.t. NBANDS using pyGWBSE to obtain accurate results. It has been observed that the convergence of the three aforementioned parameters can be strongly interdependent and hence they should be considered together44.
However, the computational cost of treating multiple convergence parameters together scales quite rapidly. To address this issue we suggest that one should check the convergence of NBANDS first and once convergence is reached, increase ENCUTGW and NOMEGA by ~25% and check the convergence of the quasiparticle gap once again. For concurrent convergence of all the three parameters, the convparam tag in the input file can be specified as a list of [Δ NBANDS, Δ ENCUTGW, Δ NOMEGA], where Δ NBANDS, Δ ENCUTGW, and Δ NOMEGA are the increase in respective parameters in each convergence iteration.
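The convergence strategy above can be illustrated with a short loop. This is a sketch under stated assumptions, not pyGWBSE's internal code: `converge_gap` and `mock_run_gw` are hypothetical names, the mock gap model is invented so the example runs without VASP, and the 0.1 eV tolerance mirrors the threshold used in the discussion of Tables 1 and 2. Only the idea of a `convparam`-style list of increments [Δ NBANDS, Δ ENCUTGW, Δ NOMEGA] comes from the text.

```python
def converge_gap(run_gw, start, convparam, tol=0.1, max_iter=10):
    """Increase the parameters stepwise until the QP gap changes by < tol (eV).

    run_gw    : callable taking a parameter dict and returning the QP gap (eV)
    start     : dict with initial NBANDS, ENCUTGW, NOMEGA
    convparam : [dNBANDS, dENCUTGW, dNOMEGA] added in each iteration
    """
    deltas = dict(zip(("NBANDS", "ENCUTGW", "NOMEGA"), convparam))
    params = dict(start)
    prev_gap = run_gw(params)
    history = [(dict(params), prev_gap)]
    for _ in range(max_iter):
        params = {key: params[key] + deltas[key] for key in params}
        gap = run_gw(params)
        history.append((dict(params), gap))
        if abs(gap - prev_gap) < tol:
            return params, gap, history
        prev_gap = gap
    raise RuntimeError("QP gap not converged within max_iter iterations")


def mock_run_gw(params):
    # Hypothetical stand-in for a GW run: the gap approaches 3.0 eV as 1/NBANDS
    return 3.0 - 50.0 / params["NBANDS"]


start = {"NBANDS": 100, "ENCUTGW": 100, "NOMEGA": 50}
final_params, final_gap, history = converge_gap(mock_run_gw, start, [100, 25, 10])
print(final_params, round(final_gap, 3), len(history))
```

Setting two of the three increments to zero reproduces the cheaper sequential strategy, in which NBANDS is converged first and the other two parameters are checked afterwards.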
BSE methodology: converging excitonic properties
Studying optical electron-hole excitations is an effective two-body problem. In most cases, a single-particle picture of individual quasi-electron and quasi-hole excitations is not enough, and electron-hole interactions need to be included as well. We can work with the two-body Green’s function G2 on the basis of the one-body Green’s function G1, which can be described by the GW approximation. We can use the QP electron and hole states of G1 and their QP energies to estimate the electron-hole interactions. The equation of motion for G2 is known as the Bethe-Salpeter equation45 and is very useful in the study of correlated electron-hole excitation states, also known as excitons.
Following Strinati45, Rohlfing and Louie21, the BSE can be written as a generalized eigenvalue problem and the electron-hole excitation states can be calculated through the solution of BSE. For each exciton state S, within the Tamm-Dancoff approximation the BSE can be written as,
where Avck is the exciton wavefunction, ΩS is the excitation energy, and \({\epsilon }_{c{{{\bf{k}}}}}^{{{{\rm{QP}}}}}\) and \({\epsilon }_{v{{{\bf{k}}}}}^{{{{\rm{QP}}}}}\) are the QP energies of the conduction (\(\left\vert c{{{\bf{k}}}}\right\rangle\)) and valence states (\(\left\vert v{{{\bf{k}}}}\right\rangle\)), which are computed using the GW methodology discussed in the previous section. The electron-hole interaction kernel, Keh, can be separated in two terms, Keh = Kd + Kx, where Kd is the screened direct interaction term and Kx is the bare exchange interaction term. Within the GW approximation for Σ, in the basis of the single-particle orbitals in real space (ϕc/v(x), the KS orbitals obtained from DFT calculations), they are defined in the following way,
Once we have the solutions of the BSE Hamiltonian, we can construct ϵ2(ω) which incorporates excitonic effects from the solutions of the modified BSE,
where \(\hat{\lambda }\) is the polarization vector, and v is the velocity operator along the direction of the polarization of light, \(\hat{\lambda }\). The real part of the dielectric function ϵ1(ω), can be obtained by integration of ϵ2(ω) over all frequencies via Kramers-Kronig relations.
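The structure of the Tamm-Dancoff BSE eigenvalue problem can be illustrated with a toy model. This is a minimal sketch, assuming a real symmetric kernel and made-up numbers. The names `solve_bse_tda` and `oscillator_strengths` are hypothetical, the physical prefactors of ϵ2(ω) are omitted, and nothing here reflects how VASP builds or diagonalizes the actual BSE Hamiltonian.

```python
import numpy as np


def solve_bse_tda(e_c, e_v, kernel):
    """Diagonalize H = diag(e_c - e_v) + K_eh over a list of (v, c, k) transitions.

    Returns exciton energies Omega_S (ascending) and the eigenvectors A^S
    stored as columns.
    """
    de = np.asarray(e_c, dtype=float) - np.asarray(e_v, dtype=float)
    hamiltonian = np.diag(de) + np.asarray(kernel, dtype=float)
    return np.linalg.eigh(hamiltonian)


def oscillator_strengths(a, dipoles):
    """|sum_vck A^S_vck d_vck|^2 for each exciton S (prefactors omitted)."""
    amplitudes = a.conj().T @ np.asarray(dipoles, dtype=float)
    return np.abs(amplitudes) ** 2


# Two degenerate QP transitions at 2.0 eV coupled by an attractive toy kernel (eV)
e_c, e_v = [1.0, 1.0], [-1.0, -1.0]
kernel = [[-0.3, -0.1], [-0.1, -0.3]]
omega_s, a = solve_bse_tda(e_c, e_v, kernel)
strengths = oscillator_strengths(a, [1.0, 1.0])
print(omega_s, strengths)
```

In this toy the lowest exciton lies below the 2.0 eV QP transition energy and collects all of the oscillator strength. The same mechanism, mixing of independent-particle transitions by the kernel, is what redistributes peak heights in the BSE spectra.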
Converging the BSE absorption spectra with the number of k-points used to sample the BZ is one of the most computationally demanding tasks in a GW-BSE calculation. In this study, we propose a strategy to achieve this convergence with a significant reduction in the computational cost involved. We propose to obtain a convergent ϵ2(ω) within an independent particle picture (RPA) and use the same k-mesh to perform the BSE calculation. This strategy is expected to be useful because of the following reasons.
There are two aspects of the convergence of ϵ2(ω) w.r.t. the reciprocal density of the k-mesh, ρk. Firstly, due to band dispersion, one needs to include all the occupied-unoccupied transitions throughout the BZ to obtain absorption spectra that are converged. As a result, one needs to use a very dense k-grid for materials that have a stronger dispersion of bands near the gap. Secondly, the electron-hole interaction kernel is also dependent on the k-grid density, as the integrations described in Eqs. (11) and (12) are evaluated by a plane-wave summation in reciprocal space with the help of a Fourier transform. Typically, the band dispersion doesn’t change significantly from the DFT to the GW bandstructure. Thus one should be able to estimate the k-grid density required to converge a BSE absorption spectrum only by observing the change in the RPA absorption spectrum. For the convergence of the electron-hole interaction kernel, we note that in the literature46 it has been suggested that Keh varies little w.r.t. the k-points, as the single-particle wave functions are quite robust w.r.t. k.
Therefore, one can assume that the k-mesh required to achieve the convergence solely from a change in band dispersion (which can be estimated from RPA calculations) is likely to be sufficient to converge Keh and also the BSE absorption spectra. In the following paragraph, we will discuss the results from our calculation, which supports the aforementioned hypothesis. Moreover, we want to emphasize that by convergent absorption spectra we mean that not only the positions of the absorption peaks are converged but the oscillator strengths of these peaks are converged as well so that we obtain a ϵ2(ω) that doesn’t change with the finer sampling of the BZ.
To achieve this, we propose a similarity coefficient, SC, that is a measure of the convergence of the absorption spectra. We define the similarity coefficient as follows,
where ΔA(Δρk, ρk) is the area between the two ϵ(ω) curves computed with reciprocal densities ρk and ρk - Δρk, and A(ρk) is the total area under the ϵ(ω) curve computed with reciprocal density ρk. Note that ΔA in Eq. (14) is not simply the difference in area between the two curves but quantifies the similarity between them by summing up the areas where the two curves differ from each other47. This is shown as the shaded areas in Fig. 2.
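A numerical version of this measure can be sketched as follows. The normalization SC = 1 − ΔA/A is an assumption inferred from the description above (identical curves give SC = 1), and the function names and sample spectra are hypothetical. ΔA is evaluated as the integral of the absolute difference between the two curves, so regions where the curves cross do not cancel.

```python
import numpy as np


def trapezoid_area(y, x):
    """Trapezoidal-rule integral of y over the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def similarity_coefficient(omega, eps_coarse, eps_fine):
    """SC between spectra computed at rho_k - delta_rho_k and at rho_k.

    omega      : common frequency grid
    eps_coarse : spectrum computed with the coarser k-mesh
    eps_fine   : spectrum computed with the finer k-mesh (reciprocal density rho_k)
    """
    omega = np.asarray(omega, dtype=float)
    eps_coarse = np.asarray(eps_coarse, dtype=float)
    eps_fine = np.asarray(eps_fine, dtype=float)
    delta_a = trapezoid_area(np.abs(eps_fine - eps_coarse), omega)
    total_a = trapezoid_area(np.abs(eps_fine), omega)
    return 1.0 - delta_a / total_a


# Hypothetical spectra: one Lorentzian peak that shifts slightly with the k-mesh
omega = np.linspace(0.0, 10.0, 2001)
lorentzian = lambda w0: 0.2 / ((omega - w0) ** 2 + 0.2 ** 2)
sc_same = similarity_coefficient(omega, lorentzian(5.0), lorentzian(5.0))
sc_shifted = similarity_coefficient(omega, lorentzian(4.8), lorentzian(5.0))
print(sc_same, sc_shifted)
```

Both a shift in peak position and a change in peak height lower the value, which is the property the text relies on when using SC as a single convergence indicator.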
Figure 2 shows the convergence of the absorption spectra of Silicon, Diamond, AlSb, and NaCl calculated using different reciprocal densities, from both RPA (bottom panel) and BSE (top panel) calculations. Notably, the absorption spectra from RPA and BSE look drastically different. However, this is expected as the inclusion of GW corrections increases the bandgaps significantly resulting in a shift of the entire spectra toward higher frequency. Whereas, the inclusion of excitonic effects through BSE results in a change in the oscillator strengths (peak heights) in ϵ(ω). In the case of Si, Diamond, and AlSb the consequences of including excitonic effects are relatively low as the relative heights of the low energy absorption peaks don’t change significantly (Fig. 2a–c and e–g), whereas, in the case of NaCl it shows a very prominent low energy excitonic peak, almost absent in the RPA spectra (Fig. 2d and h).
Figure 3a shows the SC for the spectra of Fig. 2 computed with a Δρk of 50 Å3. The SC captures both the shifts in peak positions and oscillator strengths in the absorption spectra resulting from the change in reciprocal density used in BSE calculations. As one can see from the absorption spectra of Si and AlSb (Fig. 2a and c), the peak positions and their oscillator strengths don’t change much when ρk is >100 Å3. This is reflected by a larger value of SC, >0.75, for them (Fig. 3a). In contrast, a lower SC value is obtained for diamond and especially for NaCl even for ρk > 100 Å3. This is mostly due to a large change in peak positions for diamond and a change in oscillator strengths for NaCl (Fig. 2b and d). Nevertheless, the convergence of SC looks quite similar for the BSE and RPA absorption spectra (Fig. 3a). Thus the less resource- and time-intensive RPA can be employed to estimate the k-mesh required to converge the BSE calculations (see Supplementary Figure 1 for the computational cost comparison). In Fig. 3b we show the convergence of the optical gap, Eopg, with the reciprocal density, ρk, used in the BSE calculation. To compare the convergence for these materials, which have very different optical gaps, we subtracted the converged value of the optical gap, \({E}_{opg}^{CONV}\), for each of them to show the variation on the same scale. From Fig. 3b it is clear that the k-mesh required to obtain an SC > 0.75 is sufficient to converge the optical gaps within 50 meV for most of the materials. Additionally, Fig. 3a indicates that the convergence of the SC ensures that the entire absorption spectrum (in the desired frequency range) is also converged along with the optical gap. However, one should keep in mind that for a few materials the QP correction can change the band dispersion significantly. In that case, one should not use the RPA calculation to assess the convergence of the BSE absorption spectra.
To select an appropriate value of Δρk, we suggest that code users plot the reciprocal density vs. the number of k-points along the largest dimension of the BZ. We have shown such a plot for an AlN unit cell and a 2 × 2 × 2 supercell in Supplementary Fig. 4.
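The relation between a reciprocal density and the resulting k-mesh can be explored with a small helper. This is a sketch under explicit assumptions: the total number of k-points is taken to be ρk times the reciprocal-cell volume (with the 2π factor included), and the subdivisions are taken proportional to the lengths of the reciprocal lattice vectors. The function `kmesh_from_density` is hypothetical; pyGWBSE builds on pymatgen, whose conventions for generating a mesh from a density may differ in detail.

```python
import numpy as np


def kmesh_from_density(lattice, rho_k):
    """Estimate k-point subdivisions (n1, n2, n3) for a reciprocal density rho_k.

    lattice : 3x3 array whose rows are the real-space lattice vectors (Angstrom)
    rho_k   : number of k-points per unit reciprocal-cell volume (assumed definition)
    """
    a = np.asarray(lattice, dtype=float)
    b = 2.0 * np.pi * np.linalg.inv(a).T   # rows are reciprocal lattice vectors
    v_bz = abs(np.linalg.det(b))           # reciprocal-cell volume
    n_total = rho_k * v_bz                 # target total number of k-points
    lengths = np.linalg.norm(b, axis=1)
    scale = (n_total / np.prod(lengths)) ** (1.0 / 3.0)
    return tuple(max(1, int(round(float(scale * length)))) for length in lengths)


# Hypothetical cells chosen so that the reciprocal vectors have simple lengths
cubic = 2.0 * np.pi * np.eye(3)
elongated = np.diag([2.0 * np.pi, 2.0 * np.pi, 4.0 * np.pi])
mesh_cubic = kmesh_from_density(cubic, 64)
mesh_elongated = kmesh_from_density(elongated, 64)
print(mesh_cubic, mesh_elongated)
```

Scanning `rho_k` over a range and recording the largest subdivision shows at which densities the mesh actually changes, which is the information needed to choose a Δρk that is neither too small (no new mesh) nor too large (skipped meshes).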
As the current implementation of VASP only supports a uniform k-grid for BSE calculations, all the GW-BSE calculations reported in this study have been performed with a uniform k-grid. A non-uniform k-grid with more k-points along the direction of higher dispersion is useful for converging BSE calculations and can be emulated by performing BSE calculations on multiple k-grids and averaging the results48. Such a feature in future versions of the pyGWBSE workflow could enable even faster convergence of BSE spectra.
pyGWBSE workflow architecture
Figure 4 shows the workflow architecture of pyGWBSE. The workflow consists of seven fireworks (FW). FWs are a set of tasks, called firetasks (FTs), that together accomplish a specific objective such as DFT structural relaxation. For example, FW1 named SCF is composed of four FTs (see Supplementary Fig. 3), which enables a DFT simulation by automatically generating the input files, running a simulation on a supercomputing resource, analyzing the simulation output, and storing it in a MongoDB database. Some FWs are optional in the workflow (shown as orange rectangles in Fig. 4) while others are essential (shown as purple rectangles in the same figure). Once each of the FWs is completed, the results along with the input parameters necessary to reproduce the results are saved in a MongoDB database.
FWs and FTs were first introduced in the Fireworks28 open-source package. They allow us to break down and organize a workflow into a group of tasks with the correct order of execution for each task and suitable transfer of information between the tasks. For example, FTs can be simple tasks such as writing files or copying files from a previous directory, or more complex tasks such as starting and monitoring a VASP calculation or parsing specific information from VASP output files and saving it in a MongoDB database.
pyGWBSE creates and stores information about FWs, FTs, and their interdependencies in MongoDB database collections as JSON objects. These collections are shown as green cylinders in Fig. 4. At the time of workflow execution on a supercomputer, the FTs of individual FWs are executed in the appropriate order using the JSON objects stored in the MongoDB database. The pyGWBSE workflow needs a file named ‘input.yaml’ to initiate the workflow. In the Supplementary Methods section of the Supplementary Information, we show an example ‘input.yaml’ file with a detailed description of all the input tags one needs to specify to create a workflow that demonstrates the capabilities of the pyGWBSE package.
pyGWBSE fireworks
As mentioned earlier, we have developed seven FWs in the pyGWBSE package. They are shown as rectangular boxes in Fig. 4. Each FW is named as shown in the boxes along with an ‘FW’ suffix. The FWs are numbered in order of their execution in the workflow. Three software packages, namely VASP, Sumo, and Wannier90, are used by pyGWBSE. The software used for each FW is denoted by the left-sided bar on the FW box in Fig. 4.
The first FW, ScfFW, is used to obtain the self-consistent charge density by solving the KS equation. It computes the KS wavefunctions and eigenvalues (\({\Phi }_{n{{{\bf{k}}}}}^{KS}\) and \({\epsilon }_{n{{{\bf{k}}}}}^{KS}\)), which are required by all other FWs. The second FW, EmcFW, performs the effective mass calculation (EMC) via the Sumo code49 using the DFT bandstructure obtained by FW1. The third FW, Wannier_checkFW, is designed for checking the accuracy of the Wannier interpolation. The fourth FW, ConvFW, performs convergence tests for NBANDS, ENCUTGW, and NOMEGA as discussed in the GW methodology section (Eqs. (3)–(9)) earlier. Following the same methodology, the fifth FW, GwFW, performs the GW calculation to obtain the QP energies (\({E}_{n{{{\bf{k}}}}}^{{{{\rm{QP}}}}}\)). The sixth FW, WannierFW, is designed to perform Wannier interpolation to obtain QP energies along the high-symmetry k-path to produce the GW bandstructure. Lastly, the seventh FW, BseFW, solves the Bethe-Salpeter equation to obtain ϵ2(ω) as described in the BSE methodology section (Eqs. (10)–(13)) earlier.
The optional FWs are triggered by the input tags described in the pink diamonds. For example, only when skip_emc tag is set to False, the workflow executes EmcFW (FW2) to compute the effective masses. Note that ConvFW (FW4) is labeled as both an essential and optional FW. In the Supplementary Discussion section of the Supplementary Information we explain which parts of the FW4 are optional.
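The effect of such input tags on the list of executed FWs can be sketched in plain Python. This is an illustration only: the FW names and the `skip_emc` tag come from the text, while `planned_fireworks`, the `SKIP_TAGS` mapping, and the rule that a missing tag means "skip" are assumptions. The sketch does not model the other optional FWs of Fig. 4 or the optional parts of FW4, and it does not use the Fireworks API.

```python
# FW names in execution order, as listed in the text
FW_ORDER = [
    "ScfFW",
    "EmcFW",
    "Wannier_checkFW",
    "ConvFW",
    "GwFW",
    "WannierFW",
    "BseFW",
]

# Only the tag named in the text is modeled; others would be added the same way
SKIP_TAGS = {"EmcFW": "skip_emc"}


def planned_fireworks(tags):
    """Return the FW names that would run for a given dict of input tags."""
    plan = []
    for name in FW_ORDER:
        tag = SKIP_TAGS.get(name)
        # An optional FW runs only when its skip tag is explicitly False
        if tag is not None and tags.get(tag, True):
            continue
        plan.append(name)
    return plan


with_emc = planned_fireworks({"skip_emc": False})
without_emc = planned_fireworks({"skip_emc": True})
print(with_emc)
print(without_emc)
```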
Supplementary Fig. 3 in the Supplementary Information shows the breakdown of each of the 7 FWs of the pyGWBSE workflow into its constituent FTs. There are 4 categories of FTs depending on their functionality. The first category can be considered as file handling FTs (shown as green boxes in Fig. S3). These FTs are used to create or copy files. For example, the WriteVaspFromIOSet FT in FW1, FW2, FW3, and FW4 is used to write the INCAR, KPOINTS, POSCAR, and POTCAR input files for the VASP simulations. The second category is that of the simulation FTs (yellow boxes in Fig. S3). These FTs launch an executable to run specific simulations on a supercomputer. For example, Run_Vasp is used to run the VASP software. The third category of FTs, communication FTs (shown as light purple boxes in Fig. S3), enables the communication between different FTs. For example, PasscalcLocs in FW1 and FW5 is used to pass the address of the directory where a parent FW was executed to its child FWs. The last category of FTs, the transfer-to-database FTs, transfers information to the database. They are shown as gray boxes. For example, Eps2db in FW1 is used to read the dielectric tensor from the VASP output file and save it to the database.
We use separate MongoDB collections to store the data and metadata, including inputs and outputs, associated with the FWs and FTs. Figure 4 shows these various collections. The group of all these collections is called the GWBSE_DB. Some of the key quantities stored in the collections are the dielectric tensors, the KS eigenvalues, the projections of the KS wavefunctions onto atomic orbitals (for computing the projected DOS), the RPA dielectric functions from the DFT calculation, the effective masses, the Wannier-interpolated bandstructures at both the DFT and GW levels, all the QP energies including those obtained during convergence, and the frequency-dependent dielectric functions for different light polarization axes from the BSE calculations.
A more elaborate description of the code’s features and the functionality implementation, especially the convergence of the GW and BSE-related parameters can be found in the Supplementary Discussion and Supplementary Notes section of the Supplementary Information. Moreover, the Supplementary Information (Supplementary Methods) includes an example Jupyter Notebook that shows a step-by-step setup process to create a workflow and analyze the results obtained from the workflow to determine QP properties and the BSE absorption spectra of the wurtzite phase of AlN.
Benchmarking pyGWBSE for wurtzite AlN
Recently, we employed pyGWBSE to compute the excitonic effects in absorption spectra of ~50 photocatalysts using the Bethe-Salpeter formalism50. In that study, we compared the QP gaps computed using pyGWBSE for 10 materials of very different chemical compositions with the experimental values for the purpose of benchmarking and found good agreement. However, that study did not utilize all the functionalities of pyGWBSE. Therefore, here we demonstrate all the functionalities of the pyGWBSE workflow by applying it to a test case of wurtzite-AlN. In this section, we compare the various quantities obtained from the workflow simulations with the experimentally measured values reported in the literature.
We begin by evaluating the quantities obtained from the DFT calculations. To that end, the dielectric constants of AlN are found to be \({\epsilon }_{\infty }^{\perp }\) = 4.61, \({\epsilon }_{\infty }^{\parallel }\) = 4.82, and \({\epsilon }_{\infty }^{avg}\) = 4.68, with the \({\epsilon }_{\infty }^{avg}\) being in exact agreement with the experimentally measured value of 4.6851.
The electron effective masses are found to be \({m}_{e}^{\parallel }\) = 0.28m0 and \({m}_{e}^{\perp }\) = 0.3m0, which fall in the experimentally obtained range of ~0.29–0.45m052. In the case of the hole effective masses we find a large anisotropy, \({m}_{h}^{\parallel }\) = 0.24m0 and \({m}_{h}^{\perp }\) = 4.32m0. The average hole effective mass in AlN was recently estimated to be ~2.7m0, based on experimental measurements of the Mg acceptor binding energy in Mg-doped AlN epilayers53, which is very similar to our average computed value of \({m}_{h}^{avg}\) = 2.96m0.
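The averaged values quoted above are numerically consistent with the usual average for a uniaxial crystal, in which the two equivalent in-plane (⊥) components and the single out-of-plane (∥) component are weighted 2:1. The sketch below checks this arithmetic. The averaging formula is an assumption inferred from the reported numbers, not a statement of how pyGWBSE computes them, and the function name is hypothetical.

```python
def uniaxial_average(perp, par):
    """Average of a diagonal tensor with two equal in-plane components.

    Assumed convention: (2 * perpendicular + parallel) / 3.
    """
    return (2.0 * perp + par) / 3.0


# Values reported in the text for wurtzite AlN.
eps_avg = uniaxial_average(perp=4.61, par=4.82)  # high-frequency dielectric constant
mh_avg = uniaxial_average(perp=4.32, par=0.24)   # hole effective mass in units of m0

if __name__ == "__main__":
    print(f"eps_inf(avg) = {eps_avg:.2f}")
    print(f"m_h(avg)     = {mh_avg:.2f} m0")
```

With the reported components this gives 4.68 for the dielectric constant and 2.96m0 for the hole mass, matching the averages stated in the text.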
Figure 5a shows the bandstructure of AlN computed from DFT using VASP directly (red solid) and through the use of Wannier interpolation (black dashed). The Wannier interpolation is very accurate, and the two bandstructures overlap with each other throughout most of the BZ. Figure 5a also shows that AlN is a direct gap semiconductor with a DFT gap of 4.05 eV, which is, as expected, underestimated compared to the experimental value of 6.2 eV54 but very close to the value obtained from DFT calculations in earlier studies (3.9 eV)54.
Once we perform the one-shot GW calculation, the direct gap increases to 5.59 eV. Although this is closer to the experimental value, it is still not quite accurate. Previous studies using an LDA functional as a starting point found a QP gap of 5.8 eV, which also does not agree with the experimental gap. However, after we perform partial self-consistent GW (scGW), the QP gap becomes 6.28 eV, resulting in good agreement with the experimental value of 6.2 eV54. The QP bandstructure with scGW is shown by the blue curve in Fig. 5a.
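To make the comparison between the levels of theory concrete, the short sketch below tabulates the deviation of each computed gap from the experimental value of 6.2 eV. All numbers are taken from the text; the dictionary layout and the function name are only illustrative.

```python
EXPERIMENTAL_GAP_EV = 6.2  # experimental gap of wurtzite AlN quoted in the text

# Direct gaps (eV) reported in the text at each level of theory.
computed_gaps_ev = {
    "DFT": 4.05,
    "one-shot GW": 5.59,
    "partial scGW": 6.28,
}


def gap_errors(gaps, reference):
    """Return {method: (absolute error in eV, relative error in percent)}."""
    return {
        method: (gap - reference, 100.0 * (gap - reference) / reference)
        for method, gap in gaps.items()
    }


errors = gap_errors(computed_gaps_ev, EXPERIMENTAL_GAP_EV)

if __name__ == "__main__":
    for method, (abs_err, rel_err) in errors.items():
        print(f"{method:>12s}: {abs_err:+.2f} eV ({rel_err:+.1f}%)")
```

The deviation shrinks from about −2.15 eV at the DFT level to about −0.61 eV for one-shot GW and about +0.08 eV for partial scGW.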
We use the QP energies and the projections of the KS wavefunctions onto atomic orbitals to compute the orbital-resolved DOS with QP corrections, under the assumption that the KS wavefunction is a good approximation for the QP wavefunction. Figure 5b shows the orbital-resolved DOS of wurtzite-AlN, with Al(p) and N(p) states shown with dashed and dotted lines, respectively. We show the orbital-resolved DOS obtained from both DFT and GW calculations for comparison. Our calculation suggests that the valence band edge of wurtzite-AlN mostly consists of N(p) states, whereas the conduction band edge results from strong hybridization between Al(p) and N(p) states, which is consistent with the findings of previous studies55.
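The construction described above can be sketched as a weighted sum over states: each state contributes a broadened peak at its QP energy, scaled by the k-point weight and by the projection of the corresponding KS wavefunction onto the chosen orbital. The sketch below is a toy illustration, not the pyGWBSE routine. The Gaussian broadening, the function name, and the two-state input data are all assumptions made for the example.

```python
import math


def projected_dos(energy_grid, qp_energies, projections, k_weights, sigma=0.1):
    """Orbital-projected DOS using QP energies and KS orbital projections.

    qp_energies[k][n] : QP energy (eV) of band n at k-point k
    projections[k][n] : weight of the chosen orbital in KS state (n, k)
    k_weights[k]      : k-point weight
    sigma             : Gaussian broadening in eV (an assumed choice)
    """
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    dos = [0.0] * len(energy_grid)
    for w_k, eps_k, proj_k in zip(k_weights, qp_energies, projections):
        for eps, proj in zip(eps_k, proj_k):
            for i, energy in enumerate(energy_grid):
                x = (energy - eps) / sigma
                dos[i] += w_k * proj * norm * math.exp(-0.5 * x * x)
    return dos


# Toy input: one k-point, a valence state at 0 eV and a conduction state at 6.28 eV.
step = 0.01
grid = [-2.0 + step * i for i in range(1001)]  # -2 eV to 8 eV
dos = projected_dos(grid, [[0.0, 6.28]], [[0.8, 0.3]], [1.0])

# The integrated projected DOS recovers the sum of weight * projection.
integrated = sum(dos) * step

if __name__ == "__main__":
    print(f"integrated projected DOS = {integrated:.3f}")
```

Because only the energies are replaced by their QP values while the projections remain those of the KS states, the sketch encodes the same approximation stated in the text.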
To show pyGWBSE’s ability to perform BSE calculations and obtain absorption spectra (ϵ2(ω)) that include electron-hole interactions, we perform the GW-BSE calculation for wurtzite-AlN. Figure 5c compares the absorption spectrum (ϵ2(ω)) obtained from the BSE calculation, the spectrum calculated without electron-hole interaction, and the experimental spectrum from the literature56. The light polarization is set to be perpendicular to the c-axis. As Fig. 5c shows, the absorption spectrum calculated without taking the electron-hole (e-h) interaction into account completely misses the features in the 6–10 eV range that are visible in the experimental absorption spectrum (shown with ‘ + ’ symbols in Fig. 5c). Only when we include the e-h interaction through the BSE calculation are those excitonic features retrieved. Although the absorption spectrum obtained from BSE very closely resembles the experimental one, we find that the sharp absorption edge at 6.2 eV and the two prominent absorption peaks at 7.85 and 8.95 eV are shifted (by ~0.15 eV for the peaks) to lower frequencies. Supplementary Fig. 5 and Supplementary Table 1 in the SI show that an SC of ~0.8 already yields a reasonable description of the spectral peaks and their positions; larger SCs lead to better accuracy, albeit at a higher computational cost and computing time. Consistent with this trend, previous GW-BSE calculations with a finer sampling of the BZ, using 1000 randomly distributed k-points, led to even better agreement with the experimental spectra57.
To conclude, we have developed a Python toolkit, pyGWBSE, which enables high-throughput GW-BSE calculations. In this article, we present the underlying theory, the workflow architecture, the algorithmic implementation, and benchmark simulations for the pyGWBSE code. This open-source code (available at https://github.com/cmdlab/pyGWBSE) enables automated input file generation, submission to supercomputing platforms, analysis of post-simulation data, and storage of metadata and data in a MongoDB database. Moreover, pyGWBSE is capable of handling multiple convergence parameters associated with the GW-BSE formalism. To reduce the computational cost associated with obtaining a converged absorption spectrum from BSE calculations, we present a strategy for computing the similarity coefficient from RPA spectra. We have shown that this approach ensures convergence of not only the optical gap or exciton binding energy but also the entire absorption spectrum in the desired frequency range. Our openly available code will help to include QP properties and excitonic effects in future computational material design and discovery studies in a variety of fields such as power electronics, photovoltaics, and photocatalysis. pyGWBSE will facilitate high-throughput GW-BSE simulations, enabling the application of large data methods to further our understanding of materials as well as of first-principles methods that are designed for computing excited state properties.
Methods
A detailed discussion of the convergence parameters for the DFT ground state calculations can be found in the existing literature58,59. In these studies, the plane wave energy cutoff was set to 1.3 times the maximum energy cutoff specified in the pseudopotentials, and the k-grid was set to 500/n points, distributed as uniformly as possible in k-space, where n represents the number of atoms in the unit cell (see Supplementary Fig. 2 for convergence results). This resulted in a total energy convergence of 15 meV per atom for 96% of 182 chemically diverse materials58. The choice of exchange-correlation functional used in the pyGWBSE workflow is determined by the settings of VASP via the pymatgen27 installation and can be changed by following the installation instructions of the pymatgen package. Our convergence calculations performed on several materials (Supplementary Fig. 2) show that the same choice for the plane wave energy cutoff, but a higher k-grid density, is required to converge the GW-BSE calculations.
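The two rules quoted above for the DFT ground state (cutoff of 1.3 times the largest pseudopotential cutoff, and 500/n k-points) amount to simple arithmetic, sketched below. The function name and the example pseudopotential cutoffs are illustrative assumptions rather than values or code taken from this work.

```python
def dft_ground_state_defaults(pseudopotential_cutoffs_ev, n_atoms,
                              cutoff_factor=1.3, kpoints_times_atoms=500):
    """Return (plane wave cutoff in eV, target number of k-points).

    pseudopotential_cutoffs_ev : recommended cutoff of each pseudopotential
    n_atoms                    : number of atoms in the unit cell
    """
    encut = cutoff_factor * max(pseudopotential_cutoffs_ev)
    n_kpoints = kpoints_times_atoms / n_atoms
    return encut, n_kpoints


# Illustrative input: a 4-atom cell with two species whose pseudopotential
# cutoffs are assumed to be 240.3 eV and 400.0 eV.
encut, n_kpoints = dft_ground_state_defaults([240.3, 400.0], n_atoms=4)

if __name__ == "__main__":
    print(f"plane wave cutoff ~ {encut:.0f} eV, target k-points ~ {n_kpoints:.0f}")
```

For this assumed input the rules give a cutoff of about 520 eV and about 125 k-points. As noted above, the GW-BSE calculations keep the cutoff rule but need a denser k-grid than this.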
A broadening of 0.1 eV is used for calculating all the BSE as well as RPA absorption spectra reported in Fig. 2. Moreover, the numbers of valence (v) and conduction (c) bands included in the BSE calculation are selected to ensure that all the vertical v → c transitions of energy <3 eV are included. One can change this energy window in the pyGWBSE workflow by using the enwinbse tag in the input file. In the benchmarking calculations for w-AlN we have used a reciprocal density, ρk, value of 200 Å3 (12 × 12 × 7 k-grid) with a broadening (CSHIFT)48 of 0.2 eV, which produces an SC value of 0.91 (see Supplementary Fig. 5 and Supplementary Table 1 in the Supplementary Information for detailed convergence results and associated computational cost requirements).
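The link between the reciprocal density ρk and the resulting k-grid can be illustrated with a short sketch. It assumes one common convention, in which the target number of k-points is ρk times the reciprocal-cell volume (including the 2π factors) and the subdivisions are distributed in inverse proportion to the real-space lattice lengths. Whether this matches the exact routine used by the workflow is an assumption, as are the function name and the approximate wurtzite AlN lattice constants (a ≈ 3.11 Å, c ≈ 4.98 Å).

```python
import math


def kgrid_from_reciprocal_density(a, b, c, cell_volume, rho_k):
    """Estimate k-grid subdivisions from a reciprocal density.

    a, b, c     : real-space lattice vector lengths (Angstrom)
    cell_volume : real-space unit cell volume (Angstrom^3)
    rho_k       : target number of k-points per unit reciprocal volume
    """
    recip_volume = (2.0 * math.pi) ** 3 / cell_volume
    n_target = rho_k * recip_volume
    # Shorter real-space vectors get proportionally more subdivisions.
    mult = (n_target * a * b * c) ** (1.0 / 3.0)
    return tuple(max(1, math.floor(mult / length)) for length in (a, b, c))


# Assumed lattice constants for wurtzite AlN (hexagonal cell).
a_len, c_len = 3.11, 4.98
volume = math.sqrt(3.0) / 2.0 * a_len * a_len * c_len

grid = kgrid_from_reciprocal_density(a_len, a_len, c_len, volume, rho_k=200)

if __name__ == "__main__":
    print("k-grid:", " x ".join(str(n) for n in grid))
```

Under these assumptions a reciprocal density of 200 yields a 12 × 12 × 7 grid, in line with the k-grid quoted above.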
Data availability
All data supporting the findings of this work are available in the paper and its Supplementary Information. Extra data and machine-readable data are available upon reasonable request to the authors.
Code availability
The pyGWBSE code is being continuously developed and the latest version can be obtained from the GitHub repository at https://github.com/cmdlab/pyGWBSE.
References
Singh, A. K., Montoya, J. H., Gregoire, J. M. & Persson, K. A. Robust and synthesizable photocatalysts for CO2 reduction: a data-driven materials discovery. Nat. Commun. 10, 1–9 (2019).
Wu, Y., Lazic, P., Hautier, G., Persson, K. & Ceder, G. First principles high throughput screening of oxynitrides for water-splitting photocatalysts. Energy Environ. Sci. 6, 157–168 (2013).
Kirklin, S., Meredig, B. & Wolverton, C. High-throughput computational screening of new Li-ion battery anode materials. Adv. Energy Mater. 3, 252–262 (2013).
Hautier, G. et al. Phosphates as lithium-ion battery cathodes: an evaluation based on high-throughput ab initio calculations. Chem. Mater. 23, 3495–3508 (2011).
Choudhary, K. et al. High-throughput density functional perturbation theory and machine learning predictions of infrared, piezoelectric, and dielectric responses. npj Comput. Mater. 6, 1–13 (2020).
Greeley, J., Jaramillo, T. F., Bonde, J., Chorkendorff, I. & Nørskov, J. K. Computational high-throughput screening of electrocatalytic materials for hydrogen evolution. Nat. Mater. 5, 909–913 (2006).
Jones, R. O. Density functional theory: Its origins, rise to prominence, and future. Rev. Mod. Phys. 87, 897 (2015).
Perdew, J. P. Density functional theory and the band gap problem. Int. J. Quant. Chem. 28, 497–523 (1985).
Hedin, L. New method for calculating the one-particle Green’s function with application to the electron-gas problem. Phys. Rev. 139, 796 (1965).
Onida, G., Reining, L. & Rubio, A. Electronic excitations: density-functional versus many-body green’s-function approaches. Rev. Mod. Phys. 74, 601 (2002).
Muscat, J., Wander, A. & Harrison, N. On the prediction of band gaps from hybrid functional theory. Chem. Phys. Lett. 342, 397–401 (2001).
Vines, F., Lamiel-García, O., Chul Ko, K., Yong Lee, J. & Illas, F. Systematic study of the effect of hse functional internal parameters on the electronic structure and band gap of a representative set of metal oxides. J. Comput. Chem. 38, 781–789 (2017).
Kotani, T. & Van Schilfgaarde, M. Impact ionization rates for Si, GaAs, InAs, ZnS, and GaN in the GW approximation. Phys. Rev. B 81, 125201 (2010).
Darancet, P., Ferretti, A., Mayou, D. & Olevano, V. Ab initio GW electron-electron interaction effects in quantum transport. Phys. Rev. B 75, 075102 (2007).
Rangel, T., Ferretti, A., Trevisanutto, P., Olevano, V. & Rignanese, G.-M. Transport properties of molecular junctions from many-body perturbation theory. Phys. Rev. B 84, 045426 (2011).
Biswas, T. & Jain, M. Electronic structure and optical properties of F-centers in α-alumina. Phys. Rev. B 99, 144102 (2019).
Freysoldt, C. et al. First-principles calculations for point defects in solids. Rev. Mod. Phys. 86, 253 (2014).
Kioupakis, E., Rinke, P., Delaney, K. T. & Van de Walle, C. G. Indirect auger recombination as a cause of efficiency droop in nitride light-emitting diodes. Appl. Phys. Lett. 98, 161107 (2011).
McAllister, A., Åberg, D., Schleife, A. & Kioupakis, E. Auger recombination in sodium-iodide scintillators from first principles. Appl. Phys. Lett. 106, 141901 (2015).
Ugeda, M. M. et al. Giant bandgap renormalization and excitonic effects in a monolayer transition metal dichalcogenide semiconductor. Nat. Mater. 13, 1091–1095 (2014).
Rohlfing, M. & Louie, S. G. Electron-hole excitations and optical spectra from first principles. Phys. Rev. B 62, 4927 (2000).
Haastrup, S. et al. The computational 2D materials database: high-throughput modeling and discovery of atomically thin crystals. 2D Mater. 5, 042002 (2018).
Rajan, A. C. et al. Machine-learning-assisted accurate band gap predictions of functionalized mxene. Chem. Mater. 30, 4031–4038 (2018).
Liang, J. & Zhu, X. Phillips-inspired machine learning for band gap and exciton binding energy prediction. J. Phys. Chem. Lett. 10, 5640–5646 (2019).
Liu, X. et al. Finding predictive models for singlet fission by machine learning. npj Comput. Mater. 8, 1–10 (2022).
Hafner, J. Ab-initio simulations of materials using vasp: Density-functional theory and beyond. J. Comput. Chem. 29, 2044–2078 (2008).
Ong, S. P. et al. Python materials genomics (pymatgen): A robust, open-source python library for materials analysis. Comput. Mater. Sci. 68, 314–319 (2013).
Jain, A. et al. Fireworks: A dynamic workflow system designed for high-throughput applications. Concurr. Comput. Pract. Exp. 27, 5037–5059 (2015).
Mathew, K. et al. Atomate: A high-level interface to generate, execute, and analyze computational materials science workflows. Comput. Mater. Sci. 139, 140–152 (2017).
Mostofi, A. A. et al. WANNIER90: A tool for obtaining maximally-localised wannier functions. Comput. Phys. Commun. 178, 685–699 (2008).
Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 50, 17953 (1994).
Singh, D. Planes Waves, Pseudopotentials and the LAPW Method. (Kluwer Academic, 1994).
Leng, X., Jin, F., Wei, M. & Ma, Y. GW method and Bethe–salpeter equation for calculating electronic excitations. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 6, 532–550 (2016).
Faber, C., Boulanger, P., Attaccalite, C., Duchemin, I. & Blase, X. Excited states properties of organic molecules: From density functional theory to the GW and Bethe–Salpeter Green’s function formalisms. Philos. Trans. R. Soc. 372, 20130271 (2014).
Blase, X., Duchemin, I. & Jacquemin, D. The Bethe–Salpeter equation in chemistry: relations with TD-DFT, applications and challenges. Chem. Soc. Rev. 47, 1022–1043 (2018).
Hybertsen, M. S. & Louie, S. G. Electron correlation in semiconductors and insulators: Band gaps and quasiparticle energies. Phys. Rev. B 34, 5390 (1986).
Shishkin, M. & Kresse, G. Implementation and performance of the frequency-dependent GW method within the PAW framework. Phys. Rev. B 74, 035101 (2006).
Hybertsen, M. S. & Louie, S. G. Ab initio static dielectric matrices from the density-functional approach. I. formulation and application to semiconductors and insulators. Phys. Rev. B 35, 5585 (1987).
Shishkin, M. & Kresse, G. Self-consistent GW calculations for semiconductors and insulators. Phys. Rev. B 75, 235102 (2007).
Delaney, K., García-González, P., Rubio, A., Rinke, P. & Godby, R. W. Comment on “band-gap problem in semiconductors revisited: effects of core states and many-body self-consistency”. Phys. Rev. Lett. 93, 249701 (2004).
Schöne, W.-D. & Eguiluz, A. G. Self-consistent calculations of quasiparticle states in metals and semiconductors. Phys. Rev. Lett. 81, 1662 (1998).
Tiago, M. L., Ismail-Beigi, S. & Louie, S. G. Effect of semicore orbitals on the electronic band gaps of Si, Ge, and GaAs within the GW approximation. Phys. Rev. B 69, 125212 (2004).
Deslippe, J., Samsonidze, G., Jain, M., Cohen, M. L. & Louie, S. G. Coulomb-hole summations and energies for GW calculations with limited number of empty orbitals: A modified static remainder approach. Phys. Rev. B 87, 165124 (2013).
Filip, M. R., Qiu, D. Y., Del Ben, M. & Neaton, J. B. Screening of excitons by organic cations in quasi-two-dimensional organic-inorganic lead-halide perovskites. Nano Lett. 22, 4870–4878 (2022).
Strinati, G. Application of the Green’s functions method to the study of the optical properties of semiconductors. Riv. Nuovo Cim. (1978–1999) 11, 1–86 (1988).
Kammerlander, D., Botti, S., Marques, M. A., Marini, A. & Attaccalite, C. Speeding up the solution of the Bethe-Salpeter equation by a double-grid method and wannier interpolation. Phys. Rev. B 86, 125203 (2012).
Jekel, C. F., Venter, G., Venter, M. P., Stander, N. & Haftka, R. T. Similarity measures for identifying material parameters from hysteresis loops using inverse analysis. Int. J. Mater. Form. 12, 355–378 (2019).
The VASP Manual. https://www.vasp.at/wiki/index.php/The_VASP_Manual.
Ganose, A., Jackson, A. & Scanlon, D. sumo: Command-line tools for plotting and analysis of periodic ab initio calculations. J. Open Source Softw. 3, 717 (2018).
Biswas, T. & Singh, A. K. Excitonic effects in absorption spectra of carbon dioxide reduction photocatalysts. npj Comput. Mater. 7, 1–10 (2021).
Akasaki, I. & Hashimoto, M. Infrared lattice vibration of vapour-grown AlN. Solid State Commun. 5, 851–853 (1967).
Dreyer, C., Janotti, A. & Van de Walle, C. Effects of strain on the electron effective mass in GaN and AlN. Appl. Phys. Lett. 102, 142105 (2013).
Nam, K., Nakarmi, M., Li, J., Lin, J. & Jiang, H. Mg acceptor level in AlN probed by deep ultraviolet photoluminescence. Appl. Phys. Lett. 83, 878–880 (2003).
Rubio, A., Corkill, J. L., Cohen, M. L., Shirley, E. L. & Louie, S. G. Quasiparticle band structure of AlN and GaN. Phys. Rev. B 48, 11810 (1993).
Jiao, Z.-Y., Ma, S.-H. & Yang, J.-F. A comparison of the electronic and optical properties of zinc-blende, rocksalt and wurtzite AlN: A DFT study. Solid State Sci. 13, 331–336 (2011).
Wethkamp, T. et al. Dielectric function of hexagonal AlN films determined by spectroscopic ellipsometry in the vacuum-uv spectral range. Phys. Rev. B 59, 1845 (1999).
Bechstedt, F., Seino, K., Hahn, P. & Schmidt, W. Quasiparticle bands and optical spectra of highly ionic crystals: AlN and NaCl. Phys. Rev. B 72, 245114 (2005).
Jain, A. et al. A high-throughput infrastructure for density functional theory calculations. Comput. Mater. Sci. 50, 2295–2310 (2011).
Kresse, G. VASP the GUIDE (2001) http://cms.mpi.univie.ac.at/vasp/.
Acknowledgements
This work was supported by ULTRA, an Energy Frontier Research Center funded by the U.S. Department of Energy (DOE), Office of Science, Basic Energy Sciences (BES), under Award # DE-SC0021230. In addition, Singh acknowledges support by the Arizona State University start-up funds. The authors acknowledge the San Diego Supercomputer Center under the NSF-XSEDE Award No. DMR150006 and the Research Computing at Arizona State University for providing HPC resources. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. The authors also thank Tara M. Boland, Adway Gupta, Akash Patel, and Cody Milne for testing the code and for helpful discussions.
Author information
Contributions
T.B. developed the pyGWBSE code and performed the simulations and calculations. Both authors contributed to designing the research methods, interpreting the data, and writing the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Biswas, T., Singh, A.K. pyGWBSE: a high throughput workflow package for GW-BSE calculations. npj Comput Mater 9, 22 (2023). https://doi.org/10.1038/s41524-023-00976-y