Towards Fully Converged GW Calculations for Large Systems

Although the GW approximation is recognized as one of the most accurate theories for predicting materials excited states properties, scaling up conventional GW calculations for large systems remains a major challenge. We present a powerful and simple-to-implement method that can drastically accelerate fully converged GW calculations for large systems. We demonstrate the performance of this new method by calculating the quasiparticle band gap of MgO supercells. A speed-up factor of nearly two orders of magnitude is achieved for a system contaning 256 atoms (1024 velence electrons) with a negligibly small numerical error of $\pm 0.03$ eV.

Accurate predictions of excited states properties are critical for computational screening and design of materials for energy and electronics applications. While density functional theory (DFT) [1] within the local density approximation (LDA) [2] or the generalized gradient approximation (GGA) [3] has been very successful in describing materials ground-state properties, the Kohn-Sham (KS) eigenstates in principle cannot be compared directly with quasiparticle (QP) excitations. The GW approximation [4][5][6], systematically derived from the perturbative expansion of electron self-energy, is widely recognized as one of the most accurate methods for predicting excited states properties of materials.
Unfortunately, despite the tremendous success, accurate and efficient predictions of excited-states properties of complex solid systems remain a major challenge due to complication of the convergence issue [7][8][9][10][11] and the unfavorable scaling of the computational cost (both in terms of the computational time and the storage and memory requirements) with respect to the system size. This is particularly true for systems containing substantially localized electronic states and/or with large unit cells (e.g., nanostructures, complex multinary compounds, surfaces and interfaces, and defects in solids). Although zinc oxide (ZnO) has probably received the most research attention, [9,[11][12][13] the convergence problem of GW calculations is not just limited to this single material. In fact, GW calculations for many systems, including MgO, CuCl, AgBr, and CdO (to name a few), suffer similar convergence problems.
The GW approximation for the electron self-energy Σ(r, r ′ ; E) is where δ = 0 + , G is the one-particle Green's function, and W is the screened Coulomb interaction, which is related to the dielectric function ǫ and the bare Coulomb interaction v as W = ǫ −1 v. In conventional first-principles GW methods [5,6], both the Green's function and the dielectric functions are constructed using the KS eigenstates, and, in principle, all states in the Hilbert space of the KS Hamiltonian should be included in the summations.
Within the random phase approximation (RPA), the dielectric function is related to the electron polarizability where M vc (k, q, G)= v, k + q|e i(q+G) |c, k , and v and c index valence and conduction bands. The calculation of the COH self-energy also involves a summation over conduction bands: [5,14] nk|Σ where ǫ r (ǫ a ) is the retarded (advanced) dielectric function. The band summations in the calculations of dielectric function and COH self-energy should in principle include all conduction (empty) states in the Hilbert space of the KS Hamiltonian. In practical calculations, truncations are almost always applied. Since the convergence behavior of GW calculations could be very different from materials to materials, it is often difficult to predict a priori proper truncation parameters to ensure fully converged GW results. For large systems, it is extremely difficult (if possible at all) to perform fully converged GW calculations within this scheme. In this letter, we present an effective method which promises to speed up fully converged GW calculations by up to two orders of magnitude for large systems containing hundreds of atoms and is also easy to implement within available GW packages.
Computational details.-We use MgO as an example to illustrate the difficulty of scaling up conventional GW  calculations for large systems and demonstrate the performance of the newly developed method. All GW results presented in this letter are calculated within the so-called G 0 W 0 approach using a modified version of the BerkeleyGW [14] package. The Hybersten-Louie generalized plasmon-pole (HL-GPP) model [5] is used to extend the static dielectric function to finite frequencies. The new method presented in this letter, however, can be applied to other GW implementations including various self-consistent GW approaches. The DFT calculations within the LDA are carried out using the PARATEC [15] code. Norm-conserving Troullier-Martin pseudopotentials [16] are used. The kinetic energy cutoff for the plane-wave expansion of the KS wave functions is set at 80 Ry. Figure 1 shows the convergence behavior the calculated quasiparticle band gap of MgO (two-atom primitive cell) as a function of the number of conduction bands included in the COH self energy summation and the kinetic energy cutoff for the dielectric matrices. Depending on various cutoff parameters such as the number of bands included in the Coulomb-hole self-energy summation and the kinetic cutoff for the dielectric matrices, the calculated QP band gap varies from 7.25 to 7.90 eV. Once fully converged, we obtain a QP band gap of about 7.90 eV, in good agreement with the experimental value of 7.78 eV. [17] As it is shown in Fig 1, a high kinetic cutoff (|G cut | 2 /2 ∼ 40 Ry) for the dielectric matrices ǫ G,G ′ (q, ω) and a large number of conduction bands (N c ∼ 800) in the COH summation are required. If a small kinetic energy cutoff for the dielectric matrices is used, the band gap converges quickly but prematurely with respect to the number of band included in the summation.
With today's massively parallel computers, fully converged GW calculations for small systems such as 2-atom MgO can be done easily. The problem, however, quickly becomes intractable when one attempts to scale up the calculations for large systems containing hundreds of atoms because the number of conduction bands required in the COH and dielectric function calculations scales linearly with the system size (i.e., number of atoms). To put this problem in perspectives, suppose now we would like to carry out GW calculations for a 64-atom MgO supercell containing a defect (e.g., an F-center), the required number of bands would increase by 32 (64/2) times. Not only it is difficult to generate so many wave functions, storing these wave functions would also pose a serious challenge, not to mention performing the band summations in the GW calculations. Our approach is inspired by the observation that the density of states (DOS) of most materials can be divided into two regions: a low energy region in which the behavior of electronic states is strongly affected by the crystal potential and a high energy region in which kinetic energy dominates and the DOS scales as E − V xc (0) as that for free electron systems, where V xc (0) is the averaged exchange-correlation potential of the material. Figure 2 compares the DOS of MgO calculated within the LDA and that of the free-electron gas scaled with the volume of the unit cell of MgOl. The DOS below certain energy (E 0 shown in Fig.2) is what distinguishes one material from another. Above E 0 , the DOS can be well-approximated by that of free-electron gas.
This observation leads us to propose that the band summation in the GW calculations be divided into two regions: a low energy region (shaded area in Fig.2) in which the band summation is carried out explicitly and a high-energy region in which the band summation is replaced (approximated) by an energy integration. For example, the static polarizability χ 0 is rearranged in the following form: where g(E) = Ω π 2 2(E − V xc (0)) is the approximated DOS for the system with a cell volume Ω, E 0 corresponds to the energy of band N 0 , and M vE (k, q, G) are the matrix elements between the valance band v and the conduction band at (or near) the energy E. In practical calculations, we find that the number of bands N 0 can be limited to less than 10 per atom for most systems. The auxiliary integral part can be carried out using simple numerical integration techniques, for example, where N E is the number of sampling points used in the numerical integration as shown schematically in Fig.2 with large blue dots. The same approach is applicable to the calculation of the COH self-energy (Eq.3).. The integration can be carried on a uniform or nonuniform energy grid, and a typical energy step ∆E can range from 1 to 5 eV. Higher order numerical integration techniques may also be applied. We now examine the stability and performance of this new method using a reasonably small system, an MgO supercell containing 16 atoms, as an example. For this system, we choose N 0 to be 128. In other words, 8 conduction bands per atom are included in the explicit summations in the calculation of the dielectric matrices and the COH self-energy. The auxiliary integral part is calculated on a uniform energy grid with a step ∆E ranging from 1.5 to 4.0 eV, and the integration is carried out up to an energy which is equivalent to including 8,000 (or 1,000 per 2-atom cell) conduction bands. The kinetic energy cutoff for the dielectric matrix is set at 40 Ry.
For such a small system, we can compare directly the results calculated using the new method and the conventional band-by-band summation approach, as shown in Table I. As the energy integration step varies from 1.5 to 4 eV (while keeping N 0 fixed at 128), the number of conduction bands plus the number of the energy integration points decreases from 895 to 320, to be compared with 8,000 required in conventional calculations to achieve the same level of convergence, representing a speed-up factor of 9 to 25. It is rather encouraging that the results calculated using this new method are insensitive to the energy integration step ∆E, showing the stability of the this method. The GW band gap calculated with the conventional method is 7.86 eV, and that obtained from the new method only deviates by a negligible small amount of ±0.02 eV.
Our method reduces the computational time of fully converged GW calculations for a 16-atom MgO supercell from about 5 hours to 15 minutes using 64 computing cores (8 Intel Xeon E5-2650 processors). We would like to mention that within this new approach, one only needs to calculate (and store) the wave functions of the N 0 low energy states plus selected high-energy conduction states at or near predefined energy grid points. The foldedspectrum method developed by Wang and Zunger [18] and an alternative approach proposed by Tackett and Ventra [19] are ideal for this purpose. Using these methods, one can efficiently calculate KS eigenfunctions at or near specified energy grid points without the need to calculate all KS states. Therefore, the speed-up factor shown in Table I also indicates the reduction in the required memory and disk space associated with storing the wave functions in GW calculations.
The most interesting and important aspect of our method is that the speed-up factor actually increases with increasing system size, enabling fully converged GW calculations for large systems containing hundreds of atoms. This is because with a given integration step size ∆E i , the weight of a single integration point, g(E i )∆E i in Eq.5, which measures effectively the number of states contributing to the summation represented by a single state at E i , scales linearly with the system size. To demonstrate that our method can indeed be scaled up for large systems, we have carried out GW calculations for MgO systems containing 2 to 256 atoms. Table II shows the performance of our method, both in terms of accuracy and the speed-up factor. A speed up factor of over 80 times is achieved for the largest system containing 256 atoms (1,024 valence electrons), and the numerical integration error, measured by the calculated band gap, is only ±0.03 eV for all systems. Thus the greater speed-up for larger systems does not compromise the accuracy of this method, and there is still room for improvement (for example, with improved numerical integration techniques). We would like to mention that such fully converged GW calculations for large ox- ide systems containing over 200 atoms would be nearly impossible using the conventional approach due to the vast amount of computational resources required (both in terms of the computational time and the memory and storage requirements). Table III compares the calculated macroscopic dielectric constant ǫ ∞ = lim q→0 [1/ǫ −1 0,0 (q, 0)] using the conventional and the new approach. Only the result for the 16-atom system is calculated using the conventional GW approach, which includes 8,000 conduction bands in the summation. As it is shown in the Tables II and III, our method can not only reproduce accurately the GW band gap but also other quantities such as the dielectric constant at a fraction of the computational cost compared with the conventional approach. Considering the importance of predicting materials excited states properties, it is not surprising that there has been much work on improving the the efficiency of the GW formalism for large systems beyond simple crystals. One direction is to introduce better (optimized) bases for representing the dielectric matrix and/or the self-energy operator beyond plane waves and KS eigen states. [20][21][22]. The second direction is to reduce the number of conduction bands N c by replacing the summation over high-energy states with computationally tractable terms [7,8,11,23,24], or completely removing the necessity of summing over unoccupied states [25][26][27].
These efforts underline the urgency of developing efficient GW methods to meet the challenge of fast and accurate predictions of the excited states properties of large systems such as nanostructures, defects in solids, and surfaces and interfaces. Some of these approaches, however, may require substantial modifications (or completely rewriting) existing GW codes, and the study of the convergence behavior of these methods is still in the early stage. Our approach is conceptually simple and is very easy to be implemented in well-developed GW packages. More importantly, we have demonstrated affordable and fully converged GW calculations for large and hard-to-converge systems such as MgO.
In summary, we present an efficient integration method that drastically speeds up fully converged GW quasiparticle calculations for large systems. Our method takes advantage of the fact that the DOS of most materials resembles that of the free electron gas at high energies, thus the summation over high-energy conduction bands in both the dielectric function and self-energy calculations within the GW approximation can be well approximated by a numerical integration on a sparse energy grid. Using this new method, we can now carry out fully converged GW calculations for large systems without the need to perform tedious and wasteful convergence tests. We have demonstrated a nearly two orders of magnitude speed-up of GW calculations for a 256-atom MgO model system while achieving a numerical accuracy of ±0.03 eV for the predicted band gap. Although we have presented results calculated within the so-called G 0 W 0 approximation and using the HL-GPP model for the dielectric function, our approach can be applied to various levels of self-consistent GW calculations and/or GW calculations without the use of plasmon-pole models.