## Abstract

Maximally localized Wannier functions (MLWFs) are widely used in electronic-structure calculations. We have recently developed automated approaches to generate MLWFs that represent natural tight-binding sets of atomic-like orbitals; these describe accurately both the occupied states and the complementary unoccupied ones. Many applications, however, require MLWFs that instead describe specific target groups of bands: the valence or the conduction bands, or correlated manifolds. Here, we start from these tight-binding sets of MLWFs, and mix them using a combination of parallel transport and maximal localization to construct manifold-remixed Wannier functions (MRWFs): these are orthogonal sets of MLWFs that fully and only span desired target submanifolds. The algorithm is simple and robust, and is showcased here in reference applications (silicon, MoS_{2}, and SrVO_{3}) and in a mid-throughput study of 77 insulators.

## Introduction

Maximally localized Wannier functions (MLWFs)^{1,2,3,4} are accurate reduced-order models^{5} for the electronic structures of periodic crystals. The generation of MLWFs from Bloch wavefunctions typically requires a choice of initial guesses, which are often conjectured from chemical intuition with trial and error. For metals, or when considering both the valence and conduction bands (VCB) of insulators and semiconductors, one typically deals with bands that overlap with higher-energy bands, i.e., the so-called entangled bands in the literature^{2}. In such cases, since the low-energy electronic structure can often be well described by a tight-binding model of atomic-like orbitals, the initial guesses are usually chosen from the hydrogenic s,p,d,f orbitals. However, for the valence bands (VB) alone, or especially for the conduction bands (CB), which are mixed with higher-energy bands, it can become difficult to identify good initial guesses. Indeed, the VB/CB often consist of bonding/anti-bonding orbitals, or combinations of atomic orbitals, which are more challenging to guess or to describe unless the crystal offers a very simple chemical picture. Achieving separate Wannierization of target manifolds is also advantageous for many applications. Some physical properties (such as the electric polarization) depend only on the Wannier functions (WFs) of the occupied manifold (through the sum of the Wannier centers of all the valence WFs). Moreover, using dedicated MLWFs means that one can obtain smaller tight-binding models that are thus more efficient when computing, e.g., transport properties of large systems. Koopmans spectral functionals also require separate occupied and unoccupied manifolds^{6}. Finally, low-energy models, such as those used in correlated-electrons calculations^{7,8,9,10}, require a description of the correlated manifold.

Several approaches have been developed in the past few years to simplify the construction of MLWFs. The selected columns of the density matrix (SCDM) algorithm^{11} uses QR decomposition with column pivoting on the density matrix to automatically generate initial projection orbitals, and a sensible choice of the density matrix can be obtained from the projectability of Bloch states onto pseudo-atomic orbitals from pseudopotentials^{12}. The optimal projection functions method^{13} starts with a larger manifold and generates the MLWFs of the valence manifold by a single rotation matrix, which is computed as a product of a series of Givens rotations. The dually localized Wannier functions method^{14} adds a term to the spread functional to localize the WFs in both space and energy, achieving a separation of VB and CB.

Here, we propose a different approach to automatically mix optimal MLWFs spanning valence and conduction into several submanifolds, provided that these submanifolds are gapped in their energy spectrum. This naturally applies to the case of separate Wannierizations of valence and conduction manifolds, but more generally extends to arbitrary groups of bands separated in energy. We start from the Wannierization of a larger manifold (e.g., the VCB manifold), which we do not discuss here since robust methods already exist: in addition to hydrogenic s,p,d,f initial projections, the partly occupied WF method^{15,16}, the fully automated SCDM method^{11,12}, or the projectability-disentangled Wannier function (PDWF) method that we recently introduced are available (in particular, the latter appears to be a very general and remarkably robust approach, allowing, e.g., the construction of ~1.3 million PDWFs for ~22 thousand materials^{17}). Once these MLWFs are obtained, we diagonalize the Wannier Hamiltonian at every *k*-point and partition the states into submanifolds (e.g., valence, conduction): they are grouped together if they fall inside the desired energy interval. Next, we fix the gauge randomness of the submanifolds using parallel transport^{18}. Finally, the MLWFs for each submanifold are generated by independently minimizing their spread functionals. Since the submanifolds are already isolated in energy (i.e., disentanglement^{2} is not needed), and parallel transport provides a continuous gauge, the final maximal localization converges effortlessly. In the case of separating VB and CB, the final two groups of MLWFs span the fully occupied valence and the fully unoccupied conduction manifolds, and their shapes closely resemble bonding and anti-bonding orbitals, respectively. Compared with SCDM, the present method works fully in reciprocal space, reducing the computer memory requirements and also being computationally faster.
Compared with the optimal projection functions method^{13} or the dually localized Wannier functions method^{14}, we do not change the spread functional but use the original one of Ref. ^{1}; thus the resulting WFs are maximally localized in their original definition. Moreover, the parallel transport step is non-iterative and always quickly provides a good starting point for the final maximal-localization step, avoiding potential convergence issues that might occur in an iterative method.

In the following, we first discuss and validate the present method, which we name manifold-remixed Wannier function (MRWF), on the VCB of a 3D material (silicon), the VCB of a 2D material (MoS_{2}), the top VB of MoS_{2}, and the 3d manifold of SrVO_{3}. We also discuss the bonding/anti-bonding character of the resulting MLWFs, as well as band interpolation accuracy. To analyze statistics of band interpolation quality and demonstrate the robustness of the present approach, we Wannierize the VB and CB of a diverse set of 77 insulators, with the number of atoms between 1 and 45.

## Results

### The manifold separation algorithm

While obtaining the starting WFs is not the focus of this paper, we remind here that the standard Wannierization algorithm^{1,2} requires initial projection orbitals \(\left\vert {g}_{n}\right\rangle\) to guide the spread minimization and find the most meaningful minimum and the related unitary transformation matrices *U*_{k} at each *k*-point **k**. The initial guesses (also called initial projection orbitals) \(\left\vert {g}_{n}\right\rangle\) are used to rotate the original Bloch wavefunctions \(\left\vert {\psi }_{m{{{\bf{k}}}}}\right\rangle\) into

\[\left\vert {\tilde{\psi }}_{n{\bf{k}}}\right\rangle =\sum _{m=1}^{M}\left\vert {\psi }_{m{\bf{k}}}\right\rangle \left\langle {\psi }_{m{\bf{k}}}| {g}_{n}\right\rangle ,\tag{1}\]

where *n* and *m* are the indices for WFs and Bloch bands, respectively; *M* is the total number of Bloch bands; and **k** is the Bloch quasi-momentum. Note that \(\left\vert \tilde{\psi }\right\rangle\) are independent of any arbitrary rotation gauge for the \(\left\vert {\psi }_{n{{{\bf{k}}}}}\right\rangle\). For metals or for VCB of insulators, one typically starts with hydrogenic s,p,d,f orbitals^{1} as the initial guesses for all the corresponding valence electrons. Then, the MLWFs can be generated using either the standard disentanglement^{2} and maximal localization algorithms^{1} or by minimizing the total spread directly, as in the partly occupied WF method^{15} or a variational formulation^{19}. Instead of hydrogenic orbitals, one can use SCDM^{11} or the recently introduced projectability disentanglement^{17} for a fully automated Wannierization. Irrespective of the approach taken to obtain MLWFs describing the VCB, these MLWFs are the starting point of the present algorithm, the next step being to separate, e.g., the VB and CB manifolds from the disentangled MLWFs that span both simultaneously. Note that, while in the following we use the separation of VB and CB as an example to illustrate the method for clarity and simplicity, the present approach is not limited to the case of two submanifolds, but can be applied to any groups of bands separated in energy.

Since the disentanglement procedure aims at obtaining the lowest-possible spreads, it typically achieves this goal by mixing states originating from all the submanifolds (e.g., VB and CB) of interest. To decompose the manifold into two orthogonal submanifolds, we diagonalize the Wannier-gauge Hamiltonian \({H}_{{{{\bf{k}}}}}^{W}\) (the superscript *W* indicates the Wannier gauge),

\[{H}_{{\bf{k}}}^{W}={V}_{{\bf{k}}}\,{\mathcal{E}}_{{\bf{k}}}\,{V}_{{\bf{k}}}^{* },\tag{2}\]

where \({{{{\mathcal{E}}}}}_{{{{\bf{k}}}}}\) and *V*_{k} are the eigenvalues and the eigenvectors, respectively; * denotes the conjugate transpose (in physics, † is more common, but * is more widely used in the mathematical literature on complex differentials; for consistency with the notation in Supplementary Sections 1.1 and 1.3, we use * in the main text as well). The eigenvalues and eigenvectors returned by linear algebra routines are usually already sorted in ascending order of eigenvalues; if not, we sort them in ascending order, so that the matrices are partitioned into two blocks,

\[{V}_{{\bf{k}}}=\left[\begin{array}{cc}{V}_{{\bf{k}}}^{1}&{V}_{{\bf{k}}}^{2}\end{array}\right],\qquad {\mathcal{E}}_{{\bf{k}}}=\left[\begin{array}{cc}{\mathcal{E}}_{{\bf{k}}}^{1}&0\\ 0&{\mathcal{E}}_{{\bf{k}}}^{2}\end{array}\right],\tag{3}\]

where \({V}_{{{{\bf{k}}}}}^{1}\in {{\mathbb{C}}}^{N\times P}\) (\({V}_{{{{\bf{k}}}}}^{2}\in {{\mathbb{C}}}^{N\times Q}\)) corresponds to states whose eigenvalues \({{{{\mathcal{E}}}}}_{{{{\bf{k}}}}}^{1}\) (\({{{{\mathcal{E}}}}}_{{{{\bf{k}}}}}^{2}\)) are below (above) the band gap, and 0 represents a zero matrix. Here, *N* is the number of WFs of the VCB manifold, and *P* and *Q* are the numbers of WFs in the valence (below band gap) and the conduction (above band gap) submanifolds, respectively, such that *N* = *P* + *Q*. Next, all the Wannier-gauge operators are rotated according to \({V}_{{{{\bf{k}}}}}^{1}\) for the valence submanifold: for instance, the overlap matrices \({M}_{{{{\bf{k}}}},{{{\bf{b}}}}}^{W}\) (for computing the spread functional) are rotated by

\[{M}_{{\bf{k}},{\bf{b}}}^{W,1}={V}_{{\bf{k}}}^{1* }\,{M}_{{\bf{k}},{\bf{b}}}^{W}\,{V}_{{\bf{k}}+{\bf{b}}}^{1},\tag{4}\]

where

\[{M}_{{\bf{k}},{\bf{b}}}^{W}={U}_{{\bf{k}}}^{* }\,{M}_{{\bf{k}},{\bf{b}}}\,{U}_{{\bf{k}}+{\bf{b}}},\qquad {\left[{M}_{{\bf{k}},{\bf{b}}}\right]}_{mn}=\left\langle {u}_{m,{\bf{k}}}| {u}_{n,{\bf{k}}+{\bf{b}}}\right\rangle ,\tag{5}\]

*U*_{k} are the unitary transformations from the VCB manifold Wannierization, and \(\left\vert {u}_{m,{{{\bf{k}}}}}\right\rangle\) is the periodic part of the Bloch wavefunction \(\left\vert {\psi }_{m,{{{\bf{k}}}}}\right\rangle =\exp (i{{{\bf{k}}}}\cdot {{{\bf{r}}}})\left\vert {u}_{m,{{{\bf{k}}}}}\right\rangle\). For more details on the notations of *M*_{k,b} and **b**-vectors, see Ref. ^{1}. Consistently, \({{{{\mathcal{E}}}}}_{{{{\bf{k}}}}}^{1}\) are used as the new eigenvalues. The problem is now reformulated as the Wannierization of an isolated submanifold with *P* WFs for the VB. Similarly, the conduction manifold operators are rotated by \({V}_{{{{\bf{k}}}}}^{2}\), leading to an analogous Wannierization of an isolated submanifold with *Q* WFs. Indeed, the first-step Wannierization of the VCB has already disentangled the MLWFs from all the remaining higher-energy bands, so that these MLWFs span all the target submanifolds we are interested in. To achieve our goal, we are left with the Wannierizations of two isolated submanifolds, and thus the subsequent steps do not need any disentanglement. Such a two-step procedure makes the whole algorithm quite robust, especially when Wannierizing the CB, for which it is difficult to provide good initial projections of the corresponding anti-bonding orbitals.
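
As an illustration of the diagonalization, partition, and rotation steps described above, the following NumPy sketch splits a set of Wannier-gauge Hamiltonians and overlap matrices into a lower and an upper submanifold. The function name `split_submanifolds`, the array layouts, and the neighbor table `kpb` (the index of **k** + **b** for each *k*-point and **b**-vector) are assumptions made for this example and are not taken from any specific Wannier code. For simplicity the sketch also assumes that the gap lies above the same number of states `n_val` at every *k*-point, rather than selecting states by an energy interval.

```python
import numpy as np

def split_submanifolds(H_w, M_w, kpb, n_val):
    """Split a Wannier-gauge manifold into two gapped submanifolds.

    H_w   : (Nk, N, N) Hermitian Wannier-gauge Hamiltonians H^W_k
    M_w   : (Nk, Nb, N, N) Wannier-gauge overlaps M^W_{k,b}
    kpb   : (Nk, Nb) integer index of the k-point k+b for each (k, b)
    n_val : number of states below the gap (P), assumed equal at every k
    """
    # eigh works on the whole batch of k-points and returns eigenvalues
    # in ascending order, so no additional sorting is needed here.
    eigvals, V = np.linalg.eigh(H_w)

    def block(cols):
        Vs = V[:, :, cols]                          # (Nk, N, K)
        # Rotated overlaps: Vs_k^* M^W_{k,b} Vs_{k+b}
        Ms = np.einsum("kji,kbjl,kblm->kbim", Vs.conj(), M_w, Vs[kpb])
        return eigvals[:, cols], Vs, Ms

    lower = block(slice(0, n_val))      # e.g. valence submanifold
    upper = block(slice(n_val, None))   # e.g. conduction submanifold
    return lower, upper
```

Each returned tuple holds the eigenvalues, the eigenvector block, and the rotated overlaps of one submanifold. The eigenvector blocks still carry the arbitrary gauge of the independent diagonalizations, so the rotated overlaps are only a starting point for the gauge-fixing steps.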

The remaining difficulty in the Wannierization of the two isolated submanifolds is caused by the diagonalization in Eq. (2). Indeed, since the Hamiltonians \({H}_{{{{\bf{k}}}}}^{W}\) are independently diagonalized at each **k**, the resulting eigenvectors will have different gauges at different *k*-points, requiring additional Wannierizations in each submanifold. Since these Wannierizations are carried out on submanifolds that have isolated bands, the minimization algorithm is typically more robust to the choice of initial projections compared to the case of disentanglement. One could simply resort to random Gaussian initial projections followed by maximal localization to reach the MLWFs for each of the two submanifolds, or even to a brute-force maximal localization starting from the random gauge obtained after the Wannier Hamiltonian diagonalization. However, a direct maximal localization starting from a random gauge is not robust: we observe that, in many cases, the maximal localization fails due to zeros on the diagonal of the overlap matrices *M*_{k,b}; and, even if it converges, it displays the same issues as random Gaussian projections, namely a large number of iterations and an oscillatory evolution of the spread and of the sum of MLWF centers during the minimization process (see Supplementary Fig. 1 for a discussion of the convergence of these choices). Moreover, when the number of *k*-points *N*_{k} is large, the maximal localization is much harder to converge. A better solution is to find good starting gauges for the two submanifolds in an automated fashion.

To tackle this challenge, we adopt the parallel transport algorithm^{18} to construct smooth gauges for the two submanifolds. For an isolated manifold, the existence of a smooth gauge is determined by its topological obstructions, which are characterized by the Chern numbers (one in 2D and three in 3D). If the Chern numbers are 0 (as is the case for systems with time-reversal symmetry), then it can be proven^{20} that it is possible to construct a continuous gauge explicitly by the following procedure^{18}: (a) Suppose *k*_{i} ∈ [0, 1] (in fractional coordinates) for *i* = *x*, *y*, *z*: propagate (using singular value decomposition of overlap matrices \({M}_{{{{\bf{k}}}},{{{\bf{b}}}}}^{W}\) to maximally align the gauge between neighboring *k*-points) the Bloch wavefunctions \(\left\vert {u}_{n{{{\bf{0}}}}}\right\rangle\) at Γ along *k*_{x} from *k*-point **0** = (0, 0, 0) to **1** = (1, 0, 0), to construct a continuous gauge across these *k*-points. The new gauge is not necessarily quasi-periodic, i.e., it need not satisfy the condition imposed by the Bloch theorem on \(\left\vert {u}_{n{{{\bf{k}}}}}\right\rangle\), namely \(\left\vert {u}_{n{{{\bf{k+K}}}}}\right\rangle ={\tau }_{{{{\bf{K}}}}}\left\vert {u}_{n{{{\bf{k}}}}}\right\rangle\), where \({\tau }_{{{{\bf{K}}}}}=\exp (-i{{{\bf{K}}}}\cdot {{{\bf{r}}}})\) is the translation operator in reciprocal space, and **K** is a reciprocal lattice vector. In general, instead, the two states are related by \(\left\vert {u}_{n{{{\bf{1}}}}}\right\rangle =({\tau }_{{{{\bf{1}}}}}\left\vert {u}_{n{{{\bf{0}}}}}\right\rangle ){V}_{{{{\rm{obs}}}}}\), where this expression defines the obstruction matrix *V*_{obs} quantifying the misalignment between the propagated gauge and the gauge required by the Bloch theorem at (1, 0, 0).
To fulfill the quasi-periodic boundary condition, we can therefore multiply each \(\left\vert {u}_{n{{{\bf{k}}}}}\right\rangle\) by \(\exp (-{k}_{x}L)\) (note *k*_{x} ∈ [0, 1]), where \({V}_{{{{\rm{obs}}}}}=\exp (L)\): in this way, we obtain a continuous gauge that also satisfies Bloch theorem, i.e., the obstruction matrix in this modified gauge becomes the identity matrix. (b) For each *k*_{x}, propagate along *k*_{y} from (*k*_{x}, 0, 0) to (*k*_{x}, 1, 0). Now we obtain a series of obstruction matrices *V*_{obs}(*k*_{x}) along *k*_{x}. If the winding number^{18,21,22} of the determinants of *V*_{obs}(*k*_{x}) vanishes (i.e., the Chern number is 0), then there is a continuous function that maps *V*_{obs}(*k*_{x}) to identity^{18}. We then multiply the gauge by this mapping, so that the new gauge satisfies the quasi-periodic boundary condition in the *k*_{x} − *k*_{y} plane. Ref. ^{18} explicitly constructs the continuous mapping by their column interpolation method for the Kane-Mele model, which is a 2D fermionic time-reversal-symmetric model (i.e., having a vanishing Chern number) but can present a non-zero \({{\mathbb{Z}}}_{2}\) number; as a comparison, previous methods had difficulties in handling \({{\mathbb{Z}}}_{2}\) systems^{18,23,24}, sometimes requiring model-specific information^{25,26}. (c) For each (*k*_{x}, *k*_{y}), propagate along *k*_{z} from (*k*_{x}, *k*_{y}, 0) to (*k*_{x}, *k*_{y}, 1). Now the obstruction matrices *V*_{obs}(*k*_{x}, *k*_{y}) depend on both *k*_{x} and *k*_{y}. Similar to point (b), if the two winding numbers of the determinants of *V*_{obs}(*k*_{x}, 0) and *V*_{obs}(0, *k*_{y}) vanish, then there is a continuous function that maps *V*_{obs}(*k*_{x}, *k*_{y}) to identity. We then multiply the gauge with this mapping and obtain the final gauge satisfying the quasi-periodic boundary condition in 3D. Ref. ^{18} demonstrates this constructive algorithm to obtain a continuous gauge for a 3D system (silicon). 
The results also show that the continuous gauge can be further smoothened by the standard maximal localization procedure^{1} to construct MLWFs. We stress that the algorithm is non-iterative and fast, thus solving the problem of finding good initial WFs for isolated manifolds in an efficient and robust way.
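
A one-dimensional version of step (a) can be sketched as follows, assuming the overlap matrices along a closed line of *k*-points are already available. The function names `parallel_transport_1d` and `unitary_power`, the array layout, and the choice of the principal branch of the matrix logarithm are assumptions made for this illustration, which is not the implementation of Ref. ^{18}. The gauge is propagated with the unitary factor from the singular value decomposition of each overlap matrix; the accumulated mismatch at the end of the line plays the role of *V*_{obs} and is then distributed evenly along the line.

```python
import numpy as np

def unitary_power(V, t):
    """V**t for a unitary matrix V, using the principal branch of its phases."""
    w, Z = np.linalg.eig(V)
    return (Z * np.exp(1j * t * np.angle(w))) @ np.linalg.inv(Z)

def parallel_transport_1d(M):
    """Gauge matrices U[i] for a closed line of n k-points.

    M[i] is the (P, P) overlap between k_i and k_{i+1}; the last entry
    connects k_{n-1} to the periodic image of k_0.
    """
    n, P, _ = M.shape
    U = np.empty((n + 1, P, P), dtype=complex)
    U[0] = np.eye(P)
    for i in range(n):
        # Overlap between the new gauge at k_i and the old gauge at k_{i+1}
        W, _, Xh = np.linalg.svd(U[i].conj().T @ M[i])
        # This choice makes the transformed overlap Hermitian, positive semidefinite
        U[i + 1] = (W @ Xh).conj().T
    V_obs = U[n].copy()                 # mismatch with the periodic image of k_0
    for i in range(n + 1):
        # Multiply by exp(-t L) with V_obs = exp(L) and t = i / n
        U[i] = U[i] @ unitary_power(V_obs, -i / n)
    return U[:n]                        # U[n] is now the identity, like U[0]
```

The transformed overlaps `U[i].conj().T @ M[i] @ U[(i + 1) % n]` should then be close to the identity for a sufficiently dense line of *k*-points, which is the property that makes this gauge a good starting point for maximal localization.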

As shown in Supplementary Fig. 1, parallel transport generates a much better starting point than random Gaussian projections or random gauges: the convergence of maximal localization is much faster, and the spread and the sum of MLWF centers evolve smoothly during minimization. We note that, since the propagation of the gauge requires overlap matrices between a particular set of nearest-neighbor *k*-points, in Supplementary Section 1.2 we present a procedure that allows parallel transport to be applied to an arbitrary crystal structure.

In summary, the sequential parallel transports move the obstructions to the Brillouin zone edges, and the column interpolation method fixes the quasi-periodicity. Our tests on a set of 77 insulators (see discussion later in Section *Results on 77 insulators*) show that this algorithm is able to construct a good initial gauge, and maximal localization is able to construct MLWFs without issue.

We note that, since we propagate the gauge starting from the first *k*-point (0, 0, 0), there is still one gauge arbitrariness at this Γ point. Here, we suggest inserting an additional step that first minimizes the spread functional w.r.t. a single rotation matrix *W* for the first *k*-point, before performing the standard maximal localization w.r.t. all *k*-points to obtain MLWFs. Indeed, thanks to the small size of *W*, this preliminary step is computationally efficient, and can help to further improve the overall robustness of the full algorithm that we are presenting here. To achieve this, we derive the expression of the gradient of the spread Ω w.r.t. the rotation matrix *W* in Supplementary Section 1.1. We then use this gradient with a manifold optimization algorithm^{27} to minimize Ω w.r.t. *W*, where *W* is constrained to the unitary matrix manifold \(\left\{W\in {{\mathbb{C}}}^{K\times K}| {W}^{* }W=I\right\}\), with *K* = *P* for the valence manifold, or *K* = *Q* for the conduction manifold. This minimization provides us with a single rotation matrix *W* that further improves the localization while still preserving the parallel transport gauge: we stress that, in addition to increasing the robustness of the algorithm as mentioned earlier, this additional step can thus be beneficial for cases where the parallel transport gauge is implicitly assumed during the derivation of equations (for instance, Wannier interpolation of Berry curvature^{28}, or Wannier interpolation of nonlinear optical responses^{29,30}).
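
To make the objective of this single-rotation step concrete, the sketch below evaluates the spread Ω from the overlap matrices using the finite-difference expressions of Ref. ^{1}, and applies one and the same unitary *W* at every *k*-point. The function names `spread` and `rotate_all_k` and the array layouts are assumptions made for this example; the gradient of Supplementary Section 1.1 and the manifold optimizer are not reproduced here, so the code only provides the quantity that such an optimizer would minimize.

```python
import numpy as np

def spread(M, bvecs, weights):
    """Total spread and WF centers from overlaps M of shape (Nk, Nb, K, K).

    bvecs   : (Nb, 3) b-vectors
    weights : (Nb,) weights w_b of the finite-difference formula
    """
    n_k = M.shape[0]
    m_nn = np.einsum("kbnn->kbn", M)      # diagonal elements M_nn^{(k,b)}
    phase = np.angle(m_nn)                # Im ln M_nn
    # r_n = -(1/Nk) sum_{k,b} w_b b Im ln M_nn
    centers = -np.einsum("b,bd,kbn->nd", weights, bvecs, phase) / n_k
    # <r^2>_n = (1/Nk) sum_{k,b} w_b [1 - |M_nn|^2 + (Im ln M_nn)^2]
    r2 = np.einsum("b,kbn->n", weights, 1 - np.abs(m_nn) ** 2 + phase ** 2) / n_k
    omega = float(np.sum(r2 - np.sum(centers ** 2, axis=1)))
    return omega, centers

def rotate_all_k(M, W):
    """Apply the same unitary W at every k-point: M_{k,b} -> W^* M_{k,b} W."""
    return np.einsum("ji,kbjl,lm->kbim", W.conj(), M, W)
```

Because *W* is the same at every *k*-point, it does not alter the relative gauge between neighboring *k*-points; it only changes how the states are mixed within the submanifold, which is why the parallel transport gauge is preserved.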

After the parallel transport and the single rotation, the resulting WFs are close to the ideal MLWFs. However, since parallel transport only generates a continuous quasi-periodic gauge, it typically does not provide the smallest possible spread. It is therefore helpful to perform a final smoothing of the gauge^{18} by running a final maximal localization step (see examples in Sections *Silicon* and *Valence and conduction bands of MoS*_{2}). This can be achieved using either the original Marzari-Vanderbilt localization^{1} or a matrix manifold optimization w.r.t. gauge matrices at all the *k*-points, i.e., optimization on a product manifold of a series of unitary matrices \({\prod }_{{{{\bf{k}}}}}\left\{{U}_{{{{\bf{k}}}}}\in {{\mathbb{C}}}^{K\times K}| {U}_{{{{\bf{k}}}}}^{* }{U}_{{{{\bf{k}}}}}=I\right\}\), where *K* = *P* for valence and *K* = *Q* for conduction manifolds. As already mentioned, the multi-step procedure that we propose here aims at making the whole algorithm more robust, since every step produces a better starting point for the final iterative localization algorithm.

In summary, we start from an initial manifold that has already been singled out from the remaining high-energy states using a standard procedure such as disentanglement and maximal localization (e.g., very accurately using projectability disentanglement^{17} to extract as much as possible the bonding and anti-bonding characters from all the bands). The subsequent diagonalizations of the Wannier-gauge Hamiltonians separate the manifold into (two) orthogonal submanifolds (for VB & CB, respectively). The (two) parallel-transport steps (for the relevant submanifolds) construct continuous gauges, fixing the randomness caused by the independent Hamiltonian diagonalization at each *k*-point. The rotation w.r.t. a single unitary matrix removes the gauge arbitrariness of parallel transport at the first *k*-point. The final maximal localizations ultimately smooth the gauge, leading to two sets of MLWFs, each of which spans the submanifold for VB or CB. In Supplementary Section 1.3, we prove that the final gauge transformation has a block-diagonal structure, i.e., the MRWFs are transformed according to

\[\left[\begin{array}{cc}{U}_{{\rm{VB}}}({\bf{k}})&0\\ 0&{U}_{{\rm{CB}}}({\bf{k}})\end{array}\right],\tag{6}\]

where *U*_{VB}(**k**) is a unitary matrix for the VB and *U*_{CB}(**k**) is a semi-unitary matrix for the CB.

### Silicon

To test the validity of the present method, we first disentangle and maximally localize the VCB of silicon into 8 WFs, using the standard hydrogenic s and p projections with energy window disentanglement (we use hydrogenic initial guesses and energy window disentanglement here to demonstrate that the present approach works well as long as the entire VCB are accurately described; one can also use PDWF to construct MLWFs spanning the entire VCB). The resulting WFs, two of which have s-character and six of which have p-character, are shown in the VCB column of Fig. 1a.

For the valence manifold, after running the Hamiltonian diagonalization and parallel transport, we obtain four WFs with spreads around 4 Å^{2}, but their shapes do not have a clear physical meaning, since the gauge of the first *k*-point is still arbitrary (see PT column inside the VB column of Fig. 1a); after running the single rotation (SR) of the *W* matrix, their spreads are further minimized to around 3 Å^{2}, and the shapes now resemble the bonding orbitals between neighboring silicon atoms (PT+SR column inside the VB column of Fig. 1a); a final maximal localization (ML) further reduces the spreads to around 2 Å^{2} and the four spreads end up becoming identical (PT(+SR)+ML column inside the VB column of Fig. 1a), thus respecting the symmetry of the full system. For the conduction manifold (CB columns of Fig. 1a), again the shapes of the WFs after parallel transport have no clear meaning. However, even after the single rotation, the shapes of the WFs still do not resemble the expected anti-bonding orbitals, and only after the final maximal localization is the anti-bonding shape recovered. Note that, in this simple case of silicon, for both valence and conduction manifolds we reach the same set of WFs whether we run a maximal localization directly after parallel transport, or a maximal localization after parallel transport + single rotation, so the two cases are merged into one column in Fig. 1a under the header PT(+SR)+ML. The total spreads for the VB and the CB manifolds after parallel transport + maximal localization are 8.584 Å^{2} and 31.899 Å^{2}, respectively.
As expected, their sum is larger than the value for the VCB manifold (29.619 Å^{2}) after disentanglement and maximal localization, since in the VCB case there is additional freedom to further minimize the spread by remixing bonding and anti-bonding WFs into pure s and p orbitals (we highlight that using atom-centered s,p projections does not lead to the most localized orbitals for VCB in silicon; with a choice of atom-centered sp^{3} projections, the total spreads can further decrease to 26.761 Å^{2}, where four WFs have spreads 3.522 Å^{2} and another four 3.168 Å^{2}). In addition, we note that since the Hamiltonian diagonalization returns a random gauge, the spreads for parallel transport and parallel transport + single rotation are different in each run, but the spreads of PT(+SR)+ML should always be the same, since the algorithm should always manage to find the maximally-localized gauge in this simple case. To quantify how our multi-step procedure increases the overall robustness of the algorithm while at the same time reducing its computational cost, we show in Supplementary Fig. 1 the evolution of WF spreads and centers during maximal localizations. Starting from the random gauge directly after Hamiltonian diagonalization (Supplementary Fig. 1a), it takes 28,430 iterations to converge; using random Gaussians as initial guesses (Supplementary Fig. 1b), the number of iterations decreases significantly to 812; with the parallel transport gauge (Supplementary Fig. 1c), the number of iterations further decreases to 228, and the evolution of spreads and centers is much smoother; the best starting gauge is the one after single rotation (Supplementary Fig. 1d), which only takes 40 iterations to converge, without any oscillations in the evolution of spreads and centers. 
Note that the spreads of the valence MRWFs from PT(+SR)+ML are the same as those of the MLWFs obtained from a direct Wannierization of the valence bands, i.e., the valence MRWFs after separation span the original DFT valence manifold, and thus the initial VCB Wannierization does not cause any delocalization of the valence MRWFs (see Supplementary Section 1.3 for a proof).

We now discuss the quality of the band interpolation. The WFs for VB & CB are constructed from the initial VCB manifold obtained from a preliminary disentanglement and maximal localization. Therefore, if VB/CB are properly Wannierized, their band interpolation quality should be similar to that of the VCB MLWFs. Thus, in the following paragraphs, we compare the band interpolations of VB/CB MLWFs, VCB MLWFs, and DFT bands. Provided that the starting VCB manifold is properly disentangled and reproduces the DFT band structure well, as shown in Fig. 1b, the WFs after parallel transport + maximal localization for the VB and the CB manifolds reproduce the corresponding part of the VCB Wannier-interpolated bands with high accuracy, being visually indistinguishable. To quantitatively evaluate the band interpolation quality, we compute the average band distance, *η*_{isolated}, between the VCB and VB/CB bands^{12,17,31}. The *η*_{isolated} is defined as

\[{\eta }_{{\rm{isolated}}}=\sqrt{\frac{1}{{N}_{b}{N}_{{\bf{k}}}}\sum _{n{\bf{k}}}{\left({\varepsilon }_{n{\bf{k}}}^{{\rm{A}}}-{\varepsilon }_{n{\bf{k}}}^{{\rm{B}}}\right)}^{2}},\tag{7}\]

where *ε*_{nk} are the eigenvalues of a band structure, and the superscript A or B labels the two band structures being compared, which can be DFT bands, or Wannier-interpolated bands of VCB, VB, or CB; *N*_{b} and *N*_{k} are the numbers of bands and *k*-points, respectively. For silicon, we obtain: \({\eta }_{{{{\rm{isolated}}}}}^{{{{\rm{VCB-VB}}}}}=\) 6.6 meV and \({\eta }_{{{{\rm{isolated}}}}}^{{{{\rm{VCB-CB}}}}}=\) 15.5 meV. In general, the VB interpolation is more accurate than the CB one, since the VB MLWFs usually have smaller spreads. To improve the CB interpolation quality, one might need to increase the *k*-point sampling, as we discuss in Section *Single top valence band of MoS*_{2}.
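
For reference, the band distance can be computed with a few lines of NumPy. This is a minimal sketch that assumes *η*_{isolated} is the root-mean-square difference of the two sets of eigenvalues over all bands and *k*-points, consistent with the normalization by *N*_{b} and *N*_{k} mentioned above; the function name `band_distance` and the (*N*_{k}, *N*_{b}) array layout are choices made for this example.

```python
import numpy as np

def band_distance(eps_a, eps_b):
    """Root-mean-square distance between two band structures.

    eps_a, eps_b : arrays of shape (Nk, Nb) with the eigenvalues of the two
    band structures on the same k-points and for the same bands.
    The result has the same units as the input eigenvalues.
    """
    eps_a = np.asarray(eps_a, dtype=float)
    eps_b = np.asarray(eps_b, dtype=float)
    if eps_a.shape != eps_b.shape:
        raise ValueError("the two band structures must have the same shape")
    # Mean over all Nk * Nb entries, then square root
    return float(np.sqrt(np.mean((eps_a - eps_b) ** 2)))
```

When comparing a submanifold (e.g., VB) with the VCB bands, only the matching subset of VCB bands should be passed, so that both arrays refer to the same states.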

### Valence and conduction bands of MoS_{2}

Next, we test the method on a two-dimensional (2D) MoS_{2} monolayer. For VCB Wannierization, we use the standard hydrogenic Mo d and S s,p projections (the semicore states are excluded, so in total 9 VBs and the lowest 4 CBs are Wannierized). Since the VB and the lowest four CBs of MoS_{2} are isolated, 13 WFs are maximally localized from 13 bands without disentanglement. The 13 MLWFs can be well characterized into 4 groups by their angular momentum: as shown in the VCB column of Fig. 2b, from top to bottom, 3 resemble \({d}_{{z}^{2}}\), 2 resemble d_{xy}, and the remaining 8 resemble sp^{3} hybridized orbitals.

For the valence manifold, both after PT and after PT+SR, the WFs still do not have a clear resemblance to bonding orbitals; after PT+ML or PT+SR+ML, the WFs can be well grouped into six hybrids of Mo \({d}_{{z}^{2}}\) + S p, two s-like WFs near sulfur atoms, and one WF floating inside the hexagonal cage and having *C*_{3h} symmetry, originating from the hybridization of three properly oriented Mo \({d}_{{z}^{2}}\) orbitals from the three nearest Mo atoms. For the conduction manifold, the WFs after PT are already close to the anti-bonding hybrid orbitals, and the further SR or ML steps help to slightly reduce the spreads and result in more symmetric WF shapes. We notice that, contrary to intuition, in this case it takes more iterations to converge starting from the PT+SR gauge than directly from the PT gauge, as shown in Supplementary Fig. 2. Although the PT+SR cases start from a smaller total spread, the maximal localizations remain stuck longer in plateaus in the final stages, leading to more iterations. However, in all cases, the evolutions are smooth and converge in <500 iterations, since both valence and conduction are themselves isolated manifolds, and PT is able to construct good starting gauges.

In terms of band interpolation, again the Wannier interpolated VB and CB overlap essentially exactly with the DFT bands as well as the Wannier interpolated VCB, as shown in Fig. 2b, and demonstrated quantitatively by the excellent band-distance values: \({\eta }_{{{{\rm{isolated}}}}}^{{{{\rm{VCB-VB}}}}}=\) 0.19 meV and \({\eta }_{{{{\rm{isolated}}}}}^{{{{\rm{VCB-CB}}}}}=\) 0.51 meV.

### Single top valence band of MoS_{2}

In practical applications, the highest valence and lowest conduction bands are of high interest since they are critical for electronic transport properties. However, the Wannierization of a single band remains elusive, since such a band results from a complete hybridization of many different atomic orbitals and it is therefore difficult to write down a proper initial projection. In the MoS_{2} case, since the top valence band is isolated with respect to all other bands, we can use our algorithm to construct a smooth gauge for that single band, demonstrating the more general applicability of our method, beyond the separation of VB & CB. Note that if one is only interested in Wannierizing an isolated group of bands, the parallel transport itself is sufficient to construct a good gauge as the starting point, followed by maximal localization to construct well-localized MLWFs. However, when one is interested in separating two submanifolds, or Wannierizing a submanifold that is entangled with other bands (e.g., the conduction bands of an insulator), the VCB Wannierization is necessary to construct a good starting manifold for the subsequent separation.

Figure 3a shows the band interpolation of this single-band WF, and the inset shows the shape of this highly symmetric WF in real space. As usual, since the separate Wannierizations in each submanifold have fewer degrees of freedom compared with the Wannierization of the initial manifold, the WF spreads for separate Wannierization are usually larger. Indeed, the single WF has a relatively large spread (9.288 Å^{2}). For such a large spread, artificial interactions between periodic copies of the same WF in different supercells (where the supercell size is determined by the *k*-point sampling) may start to become non-negligible. Indeed, we observe some small oscillations at the minimum of the band along Γ to M and along M to K, whose zoom-ins are shown in Fig. 3b and c. By increasing the *k*-point sampling from 12 × 12 × 1 (0.2 Å^{−1} density, same as the VCB Wannierization) to 18 × 18 × 1, the interpolation quality improves significantly (see Fig. 3b and c). This means that the band interpolation error is not caused by our separation method, but by the insufficient *k*-point sampling. Therefore, if one targets a very high band interpolation quality, the *k*-point sampling might need to be increased when considering a submanifold only.

### SrVO_{3}

Here we test the present method on the metallic perovskite SrVO_{3}, a correlated material with a t_{2g} Hubbard manifold. At the DFT level, the 3d manifold overlaps in energy with other conduction bands, but at each *k*-point it is still separated from the other bands by a finite gap: the minimum gap (37.8 meV) occurs at the R point and separates the 3d manifold from the bands above, thus satisfying the isolated-manifold requirement of the present method. Starting from 24 MLWFs for VCB, we generate two sets of MLWFs: 5 for 3d and 19 for the remaining manifold. For brevity, we only show the comparison of DFT and Wannier-interpolated bands in Fig. 4a, while the respective WF shapes and spreads are shown in Supplementary Figs. 3 and 4. For the 3d manifold, the PT gauge is already quite close to the maximally localized gauge: maximal localization only slightly decreases the total spread from 9.815 Å^{2} (PT) to 9.629 Å^{2}, and symmetrizes the shapes of the WFs (the 3d columns of Supplementary Fig. 3). For the remaining manifold, convergence is much harder: maximal localization converges to real-valued, spatially symmetrized MLWFs (Supplementary Fig. 3) only after 13,027 iterations (Supplementary Fig. 4). In this case, the single rotation greatly improves the convergence: only 1544 iterations are needed to converge to the same MLWFs starting from the PT+SR gauge, and the oscillations in spreads and centers during maximal localization are also removed (Supplementary Fig. 4). For the band interpolation, the respective manifolds are again accurately reproduced, as demonstrated by the bands in Fig. 4 and the band distances: \({\eta }_{{{{\rm{isolated}}}}}^{{{{\rm{VCB-3d}}}}}=\) 6.74 meV and \({\eta }_{{{{\rm{isolated}}}}}^{{{{\rm{VCB-others}}}}}=\) 3.00 meV. Furthermore, since the t_{2g} and e_{g} manifolds are gapped in energy, we can also separate them into two submanifolds. As shown in Fig. 4b, the t_{2g} and e_{g} bands are again reproduced very well.
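The isolation requirement used above (a finite gap at every *k*-point, even when the two groups of bands overlap in energy overall) can be checked with a few lines of code. The following is a minimal sketch, not part of the published workflow: the function name `min_direct_gap`, the toy eigenvalues, and the tolerance are all hypothetical choices made for illustration.

```python
def min_direct_gap(bands, n_lower):
    """Return (gap, k_index) of the smallest direct gap above the lowest n_lower bands.

    bands[k][n] is the n-th eigenvalue (eV, ascending in n) at the k-th k-point.
    """
    gaps = [(e[n_lower] - e[n_lower - 1], ik) for ik, e in enumerate(bands)]
    return min(gaps)


# Hypothetical eigenvalues (eV) at three k-points, sorted in ascending order.
toy_bands = [
    [-1.0, 0.2, 1.5, 2.0],
    [-0.8, 0.9, 1.1, 2.4],
    [-0.5, 1.3, 1.4, 1.6],
]

# The lower group (bands 0-1) reaches 1.3 eV while the upper group (bands 2-3)
# goes down to 1.1 eV, so the groups overlap in energy; nevertheless, at each
# k-point there is a finite gap between band 1 and band 2.
gap, ik = min_direct_gap(toy_bands, n_lower=2)
is_isolated = gap > 1e-3  # illustrative tolerance, not a recommended value
```

In this toy data the smallest direct gap is about 0.1 eV at the last *k*-point, so the lower group would qualify as an isolated manifold in the sense used here.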

### Results on 77 insulators

Finally, we test our method on a set of 77 insulators with the number of atoms in the unit cell ranging from 1 to 45. This is the same as the insulator set of Ref. ^{12}, except that 4 (He, Ne, Ar_{2}, Kr_{2}) of the 81 materials are excluded since they consist of closed-shell noble-gas atoms, where the valence shells are fully occupied (and there is thus no need for separate Wannierizations). This comprehensive test set not only validates the correctness of the present method, but also helps improve its generality to cover edge cases (e.g., the additional treatment of **b**-vectors in Supplementary Section 1.2) that would be difficult to discover with only a few test cases. The separate Wannierization is implemented as a fully automated AiiDA^{32,33,34} workflow, which first runs the Wannierization of VCB using PDWF^{17}, then splits the VCB manifold with the method discussed here (see Section *CODE AVAILABILITY* for the Wannier.jl code implementation), and then runs two separate maximal localizations using Wannier90 for the VB and the CB manifolds, respectively.

All the Wannierizations finish successfully and have excellent band interpolation quality, which we measure by the band distance^{12,17,31} *η*_{isolated} for comparisons between isolated bands (VB of separate Wannierization w.r.t. VB of DFT, VB of separate Wannierization w.r.t. VB of VCB Wannierization, CB of separate Wannierization w.r.t. CB of VCB Wannierization), and *η*_{2} for comparisons involving CB of DFT (VCB Wannierization w.r.t. VCB of DFT, CB of separate Wannierization w.r.t. CB of DFT),

where \({\tilde{f}}_{n{{{\bf{k}}}}}=\sqrt{{f}_{n{{{\bf{k}}}}}^{{{{\rm{A}}}}}({E}_{{{{\rm{F}}}}}+2,\sigma ){f}_{n{{{\bf{k}}}}}^{{{{\rm{B}}}}}({E}_{{{{\rm{F}}}}}+2,\sigma )}\) and *f*(*E*_{F} + 2, *σ*) is the Fermi-Dirac distribution with the Fermi energy set to 2 eV above the real Fermi energy *E*_{F}, so that part of the conduction bands is also compared; the smearing width is set to *σ* = 0.1 eV. The statistics of *η* are shown in Fig. 5. It is worth noting that the VB interpolation (\({\eta }_{{{{\rm{isolated}}}}}^{{{{\rm{VB-DFT}}}}}\) = 0.859 meV) is even more accurate than the VCB interpolation (\({\eta }_{2}^{{{{\rm{VCB-DFT}}}}}\) = 2.609 meV). This is partly because \({\eta }_{2}^{{{{\rm{VCB-DFT}}}}}\) is averaged over all bands, including the larger error of the CB interpolation. To exclude the effect of averaging, we also compute the band distance of the VB of the VCB Wannierization w.r.t. the VB of DFT, \({\eta }_{{{{\rm{isolated}}}}}^{{{{\rm{VB(VCB)-DFT}}}}}\) = 1.721 meV, which is larger than \({\eta }_{{{{\rm{isolated}}}}}^{{{{\rm{VB-DFT}}}}}\) = 0.859 meV, showing that the VB interpolation is indeed more accurate than the VCB interpolation. This can be explained by two facts: (1) the valence MRWFs are constructed by unitary transformations of Bloch states, thus the valence manifold is exactly preserved (see Supplementary Section 1.3 for a proof); (2) the valence MRWFs are more localized than the VCB MLWFs (as discussed in the next paragraph), leading to a better Fourier interpolation quality. The CB interpolation (\({\eta }_{2}^{{{{\rm{CB-DFT}}}}}\) = 7.619 meV) is slightly worse than the CB of the VCB Wannierization (\({\eta }_{2}^{{{{\rm{CB(VCB)-DFT}}}}}\) = 6.616 meV), since the CB MLWFs are more delocalized than the VCB MLWFs (as discussed in the next paragraph); moreover, it appears much larger than \({\eta }_{2}^{{{{\rm{VCB-DFT}}}}}\) = 2.609 meV since *η* is defined as an average over all bands, and the (accurate) VB interpolations are not taken into account in the computation of \({\eta }_{2}^{{{{\rm{CB-DFT}}}}}\). In addition, and most importantly, the VB/CB to VCB distances are \({\eta }_{{{{\rm{isolated}}}}}^{{{{\rm{VB-VCB}}}}}\) = 2.219 meV and \({\eta }_{{{{\rm{isolated}}}}}^{{{{\rm{CB-VCB}}}}}\) = 3.835 meV, showing that the submanifolds are well separated with little loss of interpolation accuracy compared with the starting-point VCB Wannierization. For completeness, we also show the statistics of the maximum band distance, which is a stricter measure of band interpolation quality, in Supplementary Fig. 5.
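To make the band-distance comparison concrete, the sketch below computes a weighted root-mean-square difference between two band structures sampled on the same *k*-points. The functional form is an assumption based on the band distance of Refs. ^{12,17}: \(\eta =\sqrt{\sum_{n{\bf{k}}}{\tilde{f}}_{n{\bf{k}}}{({\epsilon }_{n{\bf{k}}}^{{\rm{A}}}-{\epsilon }_{n{\bf{k}}}^{{\rm{B}}})}^{2}/\sum_{n{\bf{k}}}{\tilde{f}}_{n{\bf{k}}}}\), with all weights set to 1 for the isolated-band variant. The function names and the toy eigenvalues are hypothetical.

```python
import math


def fermi_dirac(energy, mu, sigma):
    """Fermi-Dirac occupation for chemical potential mu and smearing sigma (eV)."""
    x = (energy - mu) / sigma
    if x > 500:  # avoid overflow in exp() far above the chemical potential
        return 0.0
    return 1.0 / (math.exp(x) + 1.0)


def band_distance(bands_a, bands_b, fermi_energy=None, shift=2.0, sigma=0.1):
    """Weighted RMS difference (eV) between two sets of bands, bands_x[k][n].

    fermi_energy=None: unit weights (assumed form of eta_isolated).
    Otherwise the weight is sqrt(f_A * f_B) at fermi_energy + shift
    (assumed form of eta_2 when shift = 2 eV).
    """
    num = 0.0
    den = 0.0
    for ek_a, ek_b in zip(bands_a, bands_b):
        for ea, eb in zip(ek_a, ek_b):
            if fermi_energy is None:
                w = 1.0
            else:
                mu = fermi_energy + shift
                w = math.sqrt(fermi_dirac(ea, mu, sigma) * fermi_dirac(eb, mu, sigma))
            num += w * (ea - eb) ** 2
            den += w
    return math.sqrt(num / den)


# Hypothetical eigenvalues (eV) at two k-points; the highest band is poorly interpolated.
dft = [[-3.0, 0.0, 1.0, 6.0], [-2.5, 0.5, 1.5, 7.0]]
wan = [[-3.0, 0.0, 1.002, 6.5], [-2.5, 0.5, 1.498, 7.4]]

eta_2 = band_distance(dft, wan, fermi_energy=0.0)  # ignores bands far above E_F + 2 eV
eta_unweighted = band_distance(dft, wan)           # every band counts equally
```

With these toy numbers the smeared weights suppress the badly interpolated high-energy band, so `eta_2` is at the meV scale while `eta_unweighted` is dominated by that band. This illustrates why the choice of averaging matters when comparing VB, CB, and VCB distances.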

Now we discuss the localization of MLWFs by comparing the average spreads over the 77 materials. For the VCB Wannierization, the average spread is Ω^{VCB} = 1.178 Å^{2}; after separation (followed by maximal localization of VB and CB, respectively), the VB MLWFs are on average slightly more localized (Ω^{VB} = 1.079 Å^{2}), while the CB MLWFs are more delocalized (Ω^{CB} = 2.919 Å^{2}). This is consistent with the intuition that the VB MLWFs are the more localized bonding orbitals whereas the CB MLWFs are the more delocalized anti-bonding orbitals. Finally, as discussed in the previous section, the separated MLWFs have fewer degrees of freedom than the VCB MLWFs, thus the sum of the spreads of VB and CB (Ω^{VB} + Ω^{CB}) is in general larger than that of VCB (Ω^{VCB}). Figure 6b shows the percentage increase of Ω^{VB} + Ω^{CB} over Ω^{VCB}. On average, the spread increases by 52.9%. Note, however, that in three cases (Na_{2}Se, Ca_{4}O_{14}V_{4}, and HK) the spreads decrease by 39.2%, 56.0%, and 5.1%, respectively. For the first two cases, there are a few large-spread VCB MLWFs, probably because the VCB Wannierizations are trapped in local minima; during the separate Wannierizations, the parallel transport algorithm is able to find a smoother gauge, thus reaching more localized MLWFs for both VB and CB, leading to a smaller Ω^{VB} + Ω^{CB} than Ω^{VCB}.

For completeness, we show the band-structure comparisons, the band distances, the evolution of spreads, and the evolution of the sum of WF centers during maximal localization in Supplementary Section 3.1, for each of the 77 materials. The smooth evolution of spreads and of the sum of WF centers during maximal localization for CB and VB demonstrates that parallel transport is able to construct a continuous gauge, so that maximal localization has no difficulty in further smoothing the gauge.

## Discussion

We introduce an automated method, manifold-remixed Wannier functions (MRWFs), to separate band manifolds by constructing MLWFs for the respective submanifolds that have finite energy gaps (at each *k*-point) between them. The method naturally applies to the case of valence and conduction manifolds, but also to any other case of isolated groups of bands. First, we start with a properly Wannierized valence plus conduction manifold, obtained using any manual or automated method (the recently introduced projectability-disentangled WFs^{17} are particularly suitable for this application, since they robustly and reliably yield MLWFs that preserve the anti-bonding characters as much as possible). Then, we split the manifold into submanifolds, one for each target energy range, by diagonalizing the Wannier-gauge Hamiltonian. Next, using parallel transport, we construct smooth gauges for each submanifold to fix the randomness caused by the independent Hamiltonian diagonalization at every *k*-point. Finally, we maximally localize the parallel-transport gauge to obtain smooth MLWFs for the desired manifolds. Before the final maximal localization, we can optionally run a preliminary rotation w.r.t. a single unitary matrix to fix the remaining gauge randomness intrinsic to parallel transport. Often the final maximal localization is able to find the maximally localized gauge directly; however, the single rotation step helps to improve the robustness of the final maximal localization, and has the additional benefit of improving localization while still preserving the parallel-transport gauge, which might be relevant in some applications.
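The role of parallel transport in removing the random phases left by independent diagonalizations can be illustrated on a toy model. The sketch below is a single-band simplification under stated assumptions, not the Wannier.jl implementation: the two-orbital 1D Hamiltonian and all names are hypothetical, overlaps are taken as plain inner products of eigenvectors, and the mismatch at the Brillouin-zone boundary is not treated. For a multi-band submanifold, the phase factor would be replaced by the unitary polar factor of the overlap matrix.

```python
import cmath
import math
import random


def lower_band_vector(k, t1=1.0, t2=0.6, delta=0.3):
    """Lower-band eigenvector of a hypothetical two-orbital 1D Hamiltonian.

    H(k) = [[delta, h], [conj(h), -delta]] with h = t1 + t2 * exp(-ik).
    """
    h = t1 + t2 * cmath.exp(-1j * k)
    e = math.sqrt(delta**2 + abs(h) ** 2)  # eigenvalues are -e and +e
    v = (-h, delta + e)  # solves (H + e) v = 0, i.e. the eigenvalue -e
    norm = math.sqrt(abs(v[0]) ** 2 + abs(v[1]) ** 2)
    return (v[0] / norm, v[1] / norm)


def overlap(u, v):
    """Inner product <u|v>."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def parallel_transport(vectors):
    """Rephase each vector so that its overlap with the previous one is real and positive."""
    out = [vectors[0]]
    for v in vectors[1:]:
        s = overlap(out[-1], v)
        phase = s / abs(s)
        out.append(tuple(c / phase for c in v))
    return out


random.seed(0)
nk = 24
kpoints = [2 * math.pi * i / nk for i in range(nk)]

# Independent diagonalization at each k leaves an arbitrary phase; mimic it explicitly.
raw = []
for k in kpoints:
    ph = cmath.exp(1j * random.uniform(0.0, 2 * math.pi))
    raw.append(tuple(c * ph for c in lower_band_vector(k)))

smooth = parallel_transport(raw)
raw_overlaps = [overlap(raw[i], raw[i + 1]) for i in range(nk - 1)]
smooth_overlaps = [overlap(smooth[i], smooth[i + 1]) for i in range(nk - 1)]
```

After transport, every neighbouring overlap is real and positive while its magnitude is unchanged, since only the gauge has been modified. The random phases in `raw_overlaps` are what would otherwise hinder the subsequent maximal localization.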

Results on silicon and MoS_{2} show that the final valence (conduction) MLWFs faithfully restore the chemical intuition of bonding/anti-bonding orbitals, and accurately reproduce the valence/conduction part of the band structure of the valence plus conduction manifold. Moreover, we demonstrate that the method is not limited to the separation of valence and conduction manifolds, but is also applicable to any system with band groups separated by a finite gap: for instance, the single top valence band of MoS_{2}, or the 3d, t_{2g}, and e_{g} manifolds of SrVO_{3}. Furthermore, we implement fully automated AiiDA^{32,33,34} workflows to carry out the whole separate Wannierization process, and test the present method on a set of 77 insulators. Statistics show that the band interpolation achieves excellent accuracy at the meV scale, and that on average the sum of VB and CB MLWF spreads increases by around 50% w.r.t. the VCB MLWF spreads. Thus, we highlight that to ensure accurate band interpolation quality, the *k*-point sampling density for separate Wannierization might need to be increased, as demonstrated in the Wannierization of the MoS_{2} top valence band.

As an outlook, we envision several applications for which the present approach may be useful: the analysis of bonding/anti-bonding orbitals based on MLWFs; material properties that rely solely on the occupied manifold, such as the electric polarization; spectral theories that require separate sets of localized orbitals for both occupied and unoccupied states (for instance, the Koopmans functionals to predict accurately the electronic band gap^{6}); and the dynamical mean-field theory for correlated electrons.

## Methods

### Calculation parameters

The DFT calculations are carried out with Quantum ESPRESSO^{35}, using the SSSP efficiency (version 1.1, PBE functional) library^{31} for pseudopotentials and its recommended energy cutoffs. The high-throughput calculations for the 77 insulators are managed by the AiiDA^{32,33,34} infrastructure, which submits Quantum ESPRESSO and Wannier90^{4} calculations to remote clusters, parses and stores the results in a database, and orchestrates all sequences of simulations and workflows. The automated AiiDA workflows are open-source and hosted on GitHub^{36}. Semicore states from pseudopotentials are excluded from the Wannierizations, except for a few cases where the semicore states overlap with valence states; in such cases, all the semicore states are Wannierized. A regular *k*-point mesh is used for the Wannier calculations, with a *k*-point spacing of 0.2 Å^{−1}, as selected by the protocol in Ref. ^{12}. Figures are generated with matplotlib^{37}.
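As an illustration of how a *k*-point spacing translates into a mesh, the sketch below assumes the common convention that the number of points along each reciprocal lattice vector is the ceiling of its length (including the factor 2π) divided by the spacing, with a single point along non-periodic directions. This convention, the function name, and the lattice constant are assumptions made for illustration; the protocol of Ref. ^{12} may differ in detail.

```python
import math


def mesh_from_spacing(recip_lengths, spacing, periodic=(True, True, True)):
    """k-points per direction: ceil(|b_i| / spacing), and 1 along non-periodic axes."""
    return [
        max(1, math.ceil(round(b / spacing, 5))) if p else 1
        for b, p in zip(recip_lengths, periodic)
    ]


# Hypothetical 2D hexagonal cell with vacuum along the third axis.
a = 3.19  # in-plane lattice constant in Å; assumed value, not taken from the text
b_len = 4 * math.pi / (math.sqrt(3) * a)  # |b1| = |b2| of a hexagonal lattice, in Å^-1

mesh = mesh_from_spacing([b_len, b_len, 0.0], spacing=0.2, periodic=(True, True, False))
```

Under these assumptions the result is a 12 × 12 × 1 mesh, the same size as the one quoted for the VCB Wannierization of MoS_{2}; a denser target spacing enlarges the mesh accordingly.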

## Data availability

All data generated for this work can be obtained from the Materials Cloud Archive (https://doi.org/10.24435/materialscloud:2f-hs).

## Code availability

All codes used for this work are open-source; the latest stable versions can be downloaded at http://www.wannier.org/ for Wannier90, https://www.quantum-espresso.org/ for Quantum ESPRESSO, https://www.aiida.net/ for AiiDA, and https://github.com/aiidateam/aiida-wannier90-workflows for aiida-wannier90-workflows. The MRWF method is implemented in an open-source Julia^{38} package named Wannier.jl, which is available at https://github.com/qiaojunfeng/Wannier.jl, with accompanying documentation and tutorials at https://www.wannierjl.org/.

## References

1. Marzari, N. & Vanderbilt, D. Maximally localized generalized Wannier functions for composite energy bands. *Phys. Rev. B* **56**, 12847–12865 (1997).
2. Souza, I., Marzari, N. & Vanderbilt, D. Maximally localized Wannier functions for entangled energy bands. *Phys. Rev. B* **65**, 035109 (2001).
3. Marzari, N., Mostofi, A. A., Yates, J. R., Souza, I. & Vanderbilt, D. Maximally localized Wannier functions: Theory and applications. *Rev. Mod. Phys.* **84**, 1419–1475 (2012).
4. Pizzi, G. et al. Wannier90 as a community code: new features and applications. *J. Phys.: Condens. Matter* **32**, 165902 (2020).
5. Pau, G. S. H. *Reduced basis method for quantum models of crystalline solids*. Thesis. https://dspace.mit.edu/handle/1721.1/40376.
6. De Gennaro, R., Colonna, N., Linscott, E. & Marzari, N. Bloch’s theorem in orbital-density-dependent functionals: Band structures from Koopmans spectral functionals. *Phys. Rev. B* **106**, 035106 (2022).
7. Georges, A., Kotliar, G., Krauth, W. & Rozenberg, M. J. Dynamical mean-field theory of strongly correlated fermion systems and the limit of infinite dimensions. *Rev. Mod. Phys.* **68**, 13–125 (1996).
8. Kotliar, G. et al. Electronic structure calculations with dynamical mean-field theory. *Rev. Mod. Phys.* **78**, 865–951 (2006).
9. Maier, T., Jarrell, M., Pruschke, T. & Hettler, M. H. Quantum cluster theories. *Rev. Mod. Phys.* **77**, 1027–1080 (2005).
10. Gull, E. et al. Continuous-time Monte Carlo methods for quantum impurity models. *Rev. Mod. Phys.* **83**, 349–404 (2011).
11. Damle, A., Lin, L. & Ying, L. Compressed Representation of Kohn–Sham Orbitals via Selected Columns of the Density Matrix. *J. Chem. Theory Comput.* **11**, 1463–1469 (2015).
12. Vitale, V. et al. Automated high-throughput Wannierisation. *npj Comput. Mater.* **6**, 66 (2020).
13. Mustafa, J. I., Coh, S., Cohen, M. L. & Louie, S. G. Automated construction of maximally localized Wannier functions: Optimized projection functions method. *Phys. Rev. B* **92**, 165134 (2015).
14. Mahler, A., Williams, J. Z., Su, N. Q. & Yang, W. Wannier Functions Dually Localized in Space and Energy. Preprint at https://doi.org/10.48550/arXiv.2201.07751 (2022).
15. Thygesen, K. S., Hansen, L. B. & Jacobsen, K. W. Partly Occupied Wannier Functions. *Phys. Rev. Lett.* **94**, 026405 (2005).
16. Thygesen, K. S., Hansen, L. B. & Jacobsen, K. W. Partly occupied Wannier functions: Construction and applications. *Phys. Rev. B* **72**, 125119 (2005).
17. Qiao, J., Pizzi, G. & Marzari, N. Projectability disentanglement for accurate and automated electronic-structure Hamiltonians. *npj Comput. Mater.* https://doi.org/10.1038/s41524-023-01146-w (2023).
18. Gontier, D., Levitt, A. & Siraj-dine, S. Numerical construction of Wannier functions through homotopy. *J. Math. Phys.* **60**, 031901 (2019).
19. Damle, A., Levitt, A. & Lin, L. Variational Formulation for Wannier Functions with Entangled Band Structure. *Multiscale Model. Simul.* **17**, 167–191 (2019).
20. Brouder, C., Panati, G., Calandra, M., Mourougane, C. & Marzari, N. Exponential Localization of Wannier Functions in Insulators. *Phys. Rev. Lett.* **98**, 046402 (2007).
21. Cancès, É., Levitt, A., Panati, G. & Stoltz, G. Robust determination of maximally localized Wannier functions. *Phys. Rev. B* **95**, 075114 (2017).
22. Cornean, H. D., Monaco, D. & Teufel, S. Wannier functions and \({{\mathbb{Z}}}_{2}\) invariants in time-reversal symmetric topological insulators. *Rev. Math. Phys.* **29**, 1730001 (2017).
23. Damle, A., Lin, L. & Ying, L. SCDM-k: Localized orbitals for solids via selected columns of the density matrix. *J. Comput. Phys.* **334**, 1–15 (2017).
24. Cornean, H. D., Herbst, I. & Nenciu, G. On the Construction of Composite Wannier Functions. *Ann. Henri Poincaré* **17**, 3361–3398 (2016).
25. Mustafa, J. I., Coh, S., Cohen, M. L. & Louie, S. G. Automated construction of maximally localized Wannier functions for bands with nontrivial topology. *Phys. Rev. B* **94**, 125151 (2016).
26. Winkler, G. W., Soluyanov, A. A. & Troyer, M. Smooth gauge and Wannier functions for topological band structures in arbitrary dimensions. *Phys. Rev. B* **93**, 035453 (2016).
27. Mogensen, P. K. & Riseth, A. N. Optim: A mathematical optimization package for Julia. *J. Open Source Softw.* **3**, 615 (2018).
28. Wang, X., Yates, J. R., Souza, I. & Vanderbilt, D. Ab initio calculation of the anomalous Hall conductivity by Wannier interpolation. *Phys. Rev. B* **74**, 195118 (2006).
29. Wang, C. et al. First-principles calculation of nonlinear optical responses by Wannier interpolation. *Phys. Rev. B* **96**, 115147 (2017).
30. Ibañez-Azpiroz, J., Tsirkin, S. S. & Souza, I. Ab initio calculation of the shift photocurrent by Wannier interpolation. *Phys. Rev. B* **97**, 245143 (2018).
31. Prandini, G., Marrazzo, A., Castelli, I. E., Mounet, N. & Marzari, N. Precision and efficiency in solid-state pseudopotential calculations. *npj Comput. Mater.* **4**, 72 (2018).
32. Pizzi, G., Cepellotti, A., Sabatini, R., Marzari, N. & Kozinsky, B. AiiDA: automated interactive infrastructure and database for computational science. *Comput. Mater. Sci.* **111**, 218–230 (2016).
33. Huber, S. P. et al. AiiDA 1.0, a scalable computational infrastructure for automated reproducible workflows and data provenance. *Sci. Data* **7**, 300 (2020).
34. Uhrin, M., Huber, S. P., Yu, J., Marzari, N. & Pizzi, G. Workflows in AiiDA: Engineering a high-throughput, event-based engine for robust and modular computational workflows. *Comput. Mater. Sci.* **187**, 110086 (2021).
35. Giannozzi, P. et al. Quantum ESPRESSO toward the exascale. *J. Chem. Phys.* **152**, 154105 (2020).
36. aiida-wannier90-workflows: A collection of advanced automated workflows to compute Wannier functions using AiiDA and the Wannier90 code. https://github.com/aiidateam/aiida-wannier90-workflows. [Online; accessed 2023-05-24].
37. Hunter, J. D. Matplotlib: A 2D Graphics Environment. *Comput. Sci. Eng.* **9**, 90–95 (2007).
38. Bezanson, J., Edelman, A., Karpinski, S. & Shah, V. B. Julia: A Fresh Approach to Numerical Computing. *SIAM Rev.* **59**, 65–98 (2017).

## Acknowledgements

We thank Antoine Levitt and Michael F. Herbst for helpful discussions and feedback on the Wannier.jl implementation. We acknowledge financial support from the NCCR MARVEL (a National Centre of Competence in Research, funded by the Swiss National Science Foundation, grant No. 205602), the Swiss National Science Foundation (SNSF) Project Funding (grant 200021E_206190 “FISH4DIET”). The work is also supported by a pilot access grant from the Swiss National Supercomputing Centre (CSCS) on the Swiss share of the LUMI system under project ID “PILOT MC EPFL-NM 01”, a CHRONOS grant from the CSCS on the Swiss share of the LUMI system under project ID “REGULAR MC EPFL-NM 02”, and a grant from the CSCS under project ID s0178.

## Author information

### Contributions

J.Q. implemented and tested the method. G.P. and N.M. supervised the project. All authors analyzed the results and contributed to writing the paper.

## Ethics declarations

### Competing interests

The authors declare no competing interests.

## Additional information

**Publisher’s note** Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Rights and permissions

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

## About this article

### Cite this article

Qiao, J., Pizzi, G. & Marzari, N. Automated mixing of maximally localized Wannier functions into target manifolds.
*npj Comput Mater* **9**, 206 (2023). https://doi.org/10.1038/s41524-023-01147-9

DOI: https://doi.org/10.1038/s41524-023-01147-9