An improved symmetry-based approach to reciprocal space path selection in band structure calculations

Band structures for electrons, phonons, and other quasiparticles are often an important aspect of describing the physical properties of periodic solids. Most commonly, energy bands are computed along a one-dimensional path of high-symmetry points and line segments in reciprocal space (the “k-path”), which are assumed to pass through important features of the dispersion landscape. However, existing methods for choosing this path rely on tabulated lists of high-symmetry points and line segments in the first Brillouin zone, determined using different symmetry criteria and unit cell conventions. Here we present a new “on-the-fly” symmetry-based approach to obtaining paths in reciprocal space that attempts to address the previous limitations of these conventions. Given a unit cell of a magnetic or nonmagnetic periodic solid, the site symmetry groups of points and line segments in the irreducible Brillouin zone are obtained from the total space group. The elements in these groups are used alongside general and maximally inclusive high-symmetry criteria to choose segments for the final k-path. A smooth path connecting each segment is obtained using graph theory. This new framework not only allows for increased flexibility and user convenience but also identifies notable overlooked features in certain electronic band structures. In addition, a more intelligent and efficient method for analyzing magnetic materials is also enabled through proper accommodation of magnetic symmetry.


INTRODUCTION
Band structures in crystalline solids are of great interest to both experimental and theoretical researchers for their utility in describing a variety of material properties, including well-known phenomena such as optical absorption and thermal and electronic transport, in addition to those of more exotic phases of matter such as topological insulators [1][2][3][4] . In nearly all applications, energy bands are computed along high-symmetry line segments in the irreducible Brillouin zone, i.e., the symmetrically unique portion of the minimal unit cell required to describe the crystal in reciprocal space. However, the first-principles calculation of energies at every point on its boundary is generally too computationally expensive to be feasible. Therefore, in general, the inherent symmetry of the structure is exploited to construct a one-dimensional path in reciprocal space (a "k-path"), which is presumed to capture important features of the entire energy dispersion landscape.
In 2010, Setyawan and Curtarolo (SC) were the first to propose a standard set of k-paths for each Bravais lattice type with the goal of enabling easy high-throughput calculations of electronic band structures with density functional theory (DFT) 5 . Following this work, in 2016 Hinuma et al. proposed to alter and expand the scheme to account for the fact that crystals of the same Bravais lattice type could still possess different symmetry in reciprocal space 6 . While these works laid an important foundation for k-path selection, and certainly allow for greater consistency across studies, they both identify k-paths in part based on different arbitrary symmetry criteria and unit cell conventions. Furthermore, the k-path data are contained in hard-coded lookup tables for different Bravais lattices and lattice vector lengths for the former 5 , and symmorphic space group symmetries and lattice vector lengths for the latter 6 . This grants less flexibility to users and can lead to broken code when arbitrary changes are made to the conventions 7 .
In order to address these concerns, we have developed an "onthe-fly" algorithm and accompanying software to select k-paths based on the symmetry of the input structure alone. We hereon refer to this framework for clarity as the Latimer-Munro (LM) method. This new approach does not rely on any a priori cell conventions or hard-coded look-ups. Rather, it identifies and selects high-symmetry points and line segments in the Brillouin zone using space group data, in conjunction with very general symmetry criteria. Since the definition of "high-symmetry" line segments and points is inherently arbitrary, we do not claim that this method derives paths that are objectively correct. However, given a set of initial symmetry criteria (which can in principle be modified by the user), the subsequent execution of the algorithm can be performed to produce internally consistent results not reliant on any single set of conventions. For the analyses carried out in this paper, we have adopted a maximally inclusive definition for what is considered high symmetry, which has the ability to produce k-paths whose resulting band structures reveal details that would otherwise have been missed using existing conventions.
Another notable benefit of the presented approach is the ability to take into consideration the magnetic symmetry of a given structure. With previous conventions relying on tabulated data, broken time-reversal symmetry could only be accommodated through a doubling of the irreducible Brillouin zone (and sampled k-points) by inversion about the Γ-point 6 . In this work, operations in the magnetic symmetry group of a structure can be obtained directly from the magnetic moments of the atoms and used to construct the correct reciprocal point group. We thereby offer a more accurate treatment of magnetic materials in studies involving band structure calculations.

RESULTS AND DISCUSSION
High-symmetry criteria Stationary points or cusps in a band structure can often be found at and along high-symmetry points and line segments in the Brillouin zone due to avoided crossings in the energy spectrum. Of course, the system-specific structure of the crystal Hamiltonian may exhibit stationary points at generic (low-symmetry) locations in the Brillouin zone, which cannot be predicted by symmetry arguments alone. Despite this fact, constructing k-paths using a symmetry-based analysis provides a computationally tractable and useful guide for exploring the dispersion landscape.
The criteria used to define the special high-symmetry points and line segments in reciprocal space vary throughout the literature. One of the first discussions on this topic is by Cracknell 8,9 , who outlined restrictions to the local symmetry group of a point or line segment that is able to host a stationary point or cusp in an electronic band structure. In contrast, the more recent conventions laid out by SC and Hinuma et al. define special points and line segments using more generic symmetry criteria. More specifically, SC only select points and line segments that possess a higher degree of symmetry (i.e. additional unique elements in the site-symmetry group) than those in the neighboring area. In the work by Hinuma et al., points and line segments are obtained through combining the work of SC with the listings of distinctive k-vectors and their orbits by Aroyo et al. as part of the Bilbao Crystallographic Server [10][11][12][13] .
In addition to the systematic criteria described above, both previous conventions also contain additional rules and constraints imposed by the authors. For example, in the work by SC, symmetry lines not on the edges of the irreducible Brillouin zone are only included if their site-symmetry groups contain at least one new symmetry operation with respect to their endpoints. Similarly, in the work by Hinuma et al., special k-points at Brillouin zone edge centers that are the terminus for only one special connecting line will attempt to be connected to a vertex or face center point with an additional path segment. This work addresses the arbitrary nature of these additional rules by attempting to define a simple and maximally inclusive symmetry criterion that allows for the automated selection of high-symmetry points and line segments. Specifically, this framework first requires the identification of "key" points and line segments that are defined to be at vertices, edge centers, and face centers on the boundary of the first Brillouin zone. The k-path will then include all such points or line segments whose site-symmetry group contains at least one symmetry operation in addition to identity or timereversal. For points that are initially unconnected by key line segments but fulfill this criterion by having symmetry operations in their site-symmetry groups that allow for a stationary point or cusp to be present, additional segments are added between them and the Γ-point. The steps of the algorithm are visually summarized in Supplementary Fig. 2 and are described in detail in the following section.
Algorithm outline (1) Compute reciprocal symmetry group: We take as input the realspace unit cell of a periodic solid with symmetry group G. A symmetry operation O 2 G has a rotation-reflection component R as well as a translation component v and commutes with the Hamiltonian of the system. Therefore, the energy eigenvalue E n,k of a Bloch wavefunction ϕ n,k (r) is preserved under the action of O 9 : O Á ϕ n;k ðrÞ ¼ ϕ n;Rk ðr À vÞ ) E n;k ¼ E n;Rk : Given that our interest lies in the symmetry of the energy spectrum (and not the states themselves), we define the "reciprocal space group" e G as the symmorphic space group e Φ e T, where × is the direct (Cartesian) group product and e T is the group of reciprocal lattice translations. We obtain the augmented point group e Φ by taking the direct product of the point group Φ of G with f1; 1g, where 1 is the identity and 1 is spatial inversion (In practice, we work with explicit matrix representations of elements in e Φ expressed in the reciprocal lattice basis. However, this does not alter the underlying group structure.). This is because time reversal (T ) reverses momenta: T Á ϕ n;k ðrÞ ¼ ϕ n;Àk ðr À vÞ ) E n;k ¼ E n;Àk : ( For magnetic materials, the situation is slightly more delicate, since the crystal breaks the bare time-reversal symmetry (T acts on nonzero magnetic moments as inversion: T Á m ¼ Àm). However, it may still preserve operations O that are a combination of T , a rotation-reflection R, and a translation v 8 : Our algorithm computes the full magnetic space group of the input structure given its user-specified vector magnetic moments, thus allowing us to obtain the correct augmented point group e Φ as a semi-direct product of Φ and f1; 1g.
(2) Collect k-orbits and identify site-symmetry groups: Given e G computed in step 1, we identify points and line segments in the Brillouin zone whose site-symmetry groups (i.e., subgroups of e G leaving those points and line segments fixed) are nontrivial. These can be grouped into symmetry-equivalent orbits. Two points k and k 0 are in the same orbit if and only if for some R 2 e Φ and some t 2 e T, Equivalence of two line segments is a straightforward extension of Eq. (4). Numerically, we carry out this step by simple brute-force enumeration of all possible symmetry transformations acting on the Brillouin zone, checking for each one whether the reciprocal lattice is preserved, up to translation by a reciprocal lattice vector. Finally, orbits are then chosen to be included in the final k-path using the high-symmetry criteria. As previously mentioned, in the case that any selected point is not connected to a line segment, it is made to connect to one passing through the Γ-point.
(3) Post-processing for consistency and readability: Given the set of orbits chosen for the k-path, we select one representative for each such that the entire path lies within the irreducible Brillouin zone (i.e., the point symmetry-reduced Brillouin zone). The algorithm returns a list of these points and line segments for use in numerical calculations. Once all calculations are completed, there remains the task of plotting the resulting band structure for the end user. This can either be done directly or by first employing an optional post-processing step that leverages graph theory to remove path discontinuities with the goal of improving plot readability. More specifically, this first consists of converting the final k-path into an undirected graph, where nodes represent unique k-points, and edges represent the line segments connecting them. Next, edges are added between nodes of odd degree to enforce that the graph contains an Eulerian cycle 14,15 . This ensures the existence of a path which passes through each k-point at least once. While this generally results in paths with repeated segments, the final plots (see Supplementary Fig. 1) will be completely free of disjointed segments and more visually consistent with one another.
Because each k-path is generated on-the-fly, no hard-coded labels are assigned unlike the previous conventions. Instead, once the Γ-point is identified, all other symmetrically unique points are given labels consisting of lowercase letters. These are chosen based on the distance between a point from a specific crystallographic axis in a tabulated list. No individual letter is used twice for unique points, and symmetrically equivalent points are indicated by different numeric subscripts on the same letter. It is important to note that, with this scheme, label significance can only be defined within the context of a particular plot, and care must be taken when comparing results from different studies.
The outlined algorithm and associated tools have been implemented into the existing portion of the pymatgen software package 16 that handles high-symmetry k-path generation. This enables popular tools that rely on pymatgen, such as plotting and analysis packages like sumo 17 , to utilize the new framework automatically. More specifically, in addition to the ability to generate LM paths given an arbitrary input structure, new methods have been written to perform the previously described optional post-processing step, as well as generate equivalent labels between all three conventions to allow for easier comparisons between studies. In addition, it should be noted that space group symmetry elements of an input structure are obtained in pymatgen using spglib 18 . This is the same package used by the two previous conventions for symmetry detection, and results from it are dependent on user defined tolerance parameters for distances between atomic coordinates (symprec) and angles between lattice vectors (angle_tolerance). Consequently, to ensure correct symmetry identification and k-path generation as with previous tools, users should test the sensitivity of the results of the algorithm for a given input structure to variations in tolerance values.

Application
To compare the LM approach with those of SC and Hinuma et al., all three methods were used to generate k-paths for 42,425 materials from the Materials Project database 19 , representing a wide range of structure types and space group symmetries. In order to directly compare the k-paths across methods, each structure was converted to the primitive standard setting defined by SC prior to obtaining a k-path 5 , and the optional postprocessing step described in the previous section to remove path discontinuities was not employed. Furthermore, the average difference in the overall number of path segments between the new and old approaches for each space group was used as a rough metric for the relative performance of the new criteria and algorithm (see Fig. 1). We find that, for the majority of space groups, our generalized symmetry criteria results in the inclusion of a larger number of k-path segments, with the highest average path length differences found in materials with monoclinic and orthorhombic symmetries. This increase in path length suggests that the new approach naturally allows for a more comprehensive sampling of the energy dispersion landscape compared to previous conventions. However, it should be noted that, for a small number of low-symmetry space groups in the tetragonal and trigonal crystal systems, using symmetry arguments alone results on average in shorter k-paths. This indicates that these systems benefit from the additional points and segments from previous conventions to effectively sample the landscape. In response to this, the ability to generate k-paths using points from all three conventions has been included as part of the implementation in pymatgen.
We have also calculated electronic band structures using the k-paths for all 42,425 materials and incorporated them into the Materials Project electronic structure data set. Since any magnetic data for materials in the Materials Project database is from collinear DFT calculations, no explicit magnetic symmetry was considered, and each spin channel was treated separately within a given band structure calculation. Band gap values and characters (i.e., direct or indirect) derived from these band structures were compared between the three conventions, and results are shown in Fig. 2. Different histograms are shown for space groups that demonstrate a larger or smaller average total k-path length relative to the old methods using the LM algorithm. We find that, for the vast majority of band structure calculations run using the longer k-paths, the band gap is either unchanged or lowered, with 540 specifically showing lower gaps compared to the SC convention, and 487 showing the same for the convention by Hinuma et al. It should be noted that a band gap being lower or higher was defined using a tolerance of ±10 meV, which is reflected in the bin size of the histogram plots. In addition, out of the entire data set, 106 calculations show a switch in the direct or indirect character of the band gap compared to when both previous conventions are in agreement with one other. In total, there are 362 calculations that show a switch when comparing to either the convention by SC or the one by Hinuma et al. Figure 3 contains example band structure plots for the material GePtS 20 from k-paths generated with the LM approach and the convention by Hinuma et al., including highlights of the path segment differences and similarities. This example exhibits a reduction in Fig. 1 Average increase in the number of segments in the k-path generated using the LM approach compared to the previous conventions by SC and Hinuma et al. The paths were calculated using 42,425 materials from the Materials Project database, with the data plotted for each space group number represented in the set. It should be noted that the paths used have not been put through the postprocessing step described as part of the algorithm to remove disjointed segments. For the majority of space groups, our algorithm selects a higher number of path segments on average, with monoclinic and orthorhombic materials showing the largest jump in overall k-path length. This illustrates the effectiveness of the maximally inclusive symmetry criteria in increasing the amount of the dispersion landscape that is sampled. The new approach only shows a decrease in average path length for a small set of low-symmetry tetragonal and trigonal groups, as indicated by the blue and green colors. For these systems, only considering the symmetry of points and line segments in the Brillouin zone may not be sufficient, and paths generated with the new method may need to be supplemented with segments from previous conventions. the band gap value by approximately 0.27 eV when the new framework is used compared to both previous conventions. Overall, the presented results indicate that the new k-paths effectively sample the dispersion landscape for the majority of space groups.
Thus far, we have shown data for systems respecting timereversal symmetry, either by imposing zero magnetic moments for all of the atoms in the system or through scalar magnetic moments, which are a result of collinear DFT calculations. To illustrate the utility of the LM algorithm for analyzing systems with magnetic orderings, we computed k-paths for two magnetic configurations of LaMnO 3 21 . Without any magnetic moments, this polymorph of LaMnO 3 has a space group of Pnma, with non-trivial Pn 0 ma 0 magnetic symmetry originating from an antiferromagnetic ground state below T = 139. 5 K 22 . Specifically, in the primitive standard setting defined by SC 5 for the paramagnetic Pnma structure, this corresponds to including a non-zero magnetic moment of 3.87 μB along [010] for the Mn atom in the 4b site. The resulting k-path is shown in Fig. 4a and is equivalent to the path generated for the paramagnetic system. However, if the magnetic moments in this system are instead rotated such that they point along [110] and ½110, the symmetry of the structure is lowered to P2 0 1 =m 0 . More precisely, this is the case for moments that are aligned along the two axes in the same setting for the Mn atoms in the 2b and 2d, and 2a and 2c sites, respectively. Our algorithm produces a new k-path commensurate with the reduced symmetry group, as shown in Fig. 4b.
We remark that the SC conventions make no provision for magnetic structures. As previously mentioned, Hinuma et al. derives the k-path for a magnetic structure by setting e Φ ¼ Φ (in the notation of the "Algorithm outline" section), thus doubling the size of the irreducible Brillouin zone if inversion was not already an element of Φ. However, this treatment generally selects points and line segments in reciprocal space that are symmetrically equivalent, leading to unnecessary computational expense. This is illustrated in Fig. 4, where doubling the non-magnetic path in Fig. 3 Electronic band structure plots for GePtS 20 . The data were obtained from calculations using k-paths generated from the a Hinuma et al. convention and the b LM approach. Equivalent segments between the two calculations are indicated by the red and yellow highlights. New symmetrically unique segments introduced by the symmetry-based approach can be seen in b and contain bands that result in a band gap that is approximately 0.27 eV lower compared to both previous conventions. Equivalent labels in the Hinuma et al. convention are shown at the top of the plot. In addition, the path from the new method can be seen to be completely continuous compared to the disjointed path in a. This is a result of applying the optional post-processing step to ensure continuous path connectivity before plotting. Fig. 2 Histograms of the decrease in band gap energy from electronic band structures calculated using the LM approach compared to the previous conventions by SC and Hinuma et al. The histograms labeled "shorter paths" correspond to space groups with a negative value of the average increase in the number of path segments in Fig. 1. For the majority of band structures calculated with longer k-paths from the new method, the band gap is shown to either be unchanged or decreased in value. This confirms that in most cases the increased sampling of reciprocal space via the maximally inclusive high-symmetry criteria provides a better description of the electronic band structure. The smaller band gaps shown in the histogram can be seen to primarily result from calculations with shorter paths, further indicating the necessary inclusion of additional segments from previous conventions to the newly generated paths for tetragonal and trigonal systems with low symmetry. In summary, we have presented a new symmetry-based algorithm and associated software for automated reciprocal space path selection in band structure calculations. This approach addresses limitations of previous conventions by constructing k-paths on-the-fly using only the reciprocal space symmetry and maximally inclusive criteria for high-symmetry points and line segments. The resulting paths are effective at sampling the dispersion landscape for most space group symmetries, and result in electronic band structures with notable features previously overlooked by existing conventions. In addition, the non-trivial magnetic symmetry of a structure can be properly accommodated, enabling the appropriate treatment of magnetic materials in high-throughput studies involving band structure calculations. Overall, we envision the new framework becoming standard use in studies involving the calculation of band structures.

METHODS
All DFT calculations were completed using the band structure workflow within the atomate software package 23 . For each step, the projector augmented wave (PAW) method as implemented in the plane-wave Vienna Ab Initio Simulation Package (VASP) software [24][25][26][27] was used alongside the Perdew-Burke-Ernzerhof generalized-gradient approximation functional 28 . The input parameters were chosen according to those established by the Materials Project to provide well-converged results. These include an energy cutoff of 520 eV, an energy error threshold of 5 × 10 −5 eV/atom, and a reciprocal k-point density of 100/Å −3 used for both the initial self-consistent and final non-self-consistent calculations. The only difference made was to make sure VASP included aspherical contributions in the gradient corrections inside of the PAW spheres. It should also be noted that materials containing oxygen and fluorine were modeled using a Hubbard U correction 29 . Specifically, elements Co, Cr, Fe, Mn, Mo, Ni, V, and W used values of 3.32, 3.7, 5.3, 3.9, 4.38, 6.2, 3.25, and 6.2 eV, respectively.

DATA AVAILABILITY
The data that support the findings of this study will be made available through the Materials Project website and API. Fig. 4 k-paths generated with the LM approach for two different magnetic configurations of LaMnO 3 21 . a Moments oriented along [010] result in a magnetic symmetry group of Pn 0 ma 0 and a path equivalent to the paramagnetic phase with Pnma symmetry. b Moments oriented along [110] and ½110 instead result in a magnetic symmetry group of P2 0 1 =m 0 , which alters the reciprocal space symmetry and the final k-path. By correctly considering magnetic symmetry in the P2 0 1 =m 0 system, a minimally sufficient kpath of 14 total segments is generated. This can be compared to the much longer path with 36 segments generated from the doubling of the irreducible Brillouin zone using the previous convention by Hinuma et al.