The AiiDA-KKR plugin and its application to high-throughput impurity embedding into a topological insulator

The ever increasing availability of supercomputing resources led computer-based materials science into a new era of high-throughput calculations. Recently, Pizzi et al. [Comp. Mat. Sci. 111, 218 (2016)] introduced the AiiDA framework that provides a way to automate calculations while allowing to store the full provenance of complex workflows in a database. We present the development of the AiiDA-KKR plugin that allows to perform a large number of ab initio impurity embedding calculations based on the relativistic full-potential Korringa-Kohn-Rostoker Green function method. The capabilities of the AiiDA-KKR plugin are demonstrated with the calculation of several thousand impurities embedded into the prototypical topological insulator Sb2Te3. The results are collected in the JuDiT database which we use to investigate chemical trends as well as Fermi level and layer dependence of physical properties of impurities. This includes the study of spin moments, the impurity's tendency to form in-gap states or its effect on the charge doping of the host-crystal. These properties depend on the detailed electronic structure of the impurity embedded into the host crystal which highlights the need for ab initio calculations in order to get accurate predictions.

The ever increasing availability of supercomputing resources led computer-based materials science into a new era of high-throughput calculations. Recently, Pizzi et al. [Comp. Mat. Sci. 111, 218 (2016)] introduced the AiiDA framework that provides a way to automate calculations while allowing to store the full provenance of complex workflows in a database. We present the development of the AiiDA-KKR plugin that allows to perform a large number of ab initio impurity embedding calculations based on the relativistic full-potential Korringa-Kohn-Rostoker Green function method. The capabilities of the AiiDA-KKR plugin are demonstrated with the calculation of several thousand impurities embedded into the prototypical topological insulator Sb2Te3. The results are collected in the JuDiT database which we use to investigate chemical trends as well as Fermi level and layer dependence of physical properties of impurities. This includes the study of spin moments, the impurity's tendency to form in-gap states or its effect on the charge doping of the host-crystal. These properties depend on the detailed electronic structure of the impurity embedded into the host crystal which highlights the need for ab initio calculations in order to get accurate predictions.

I. MOTIVATION
In recent years computer-driven materials design has become increasingly important in the field of materials science. The ever increasing availability of supercomputing resources opened up new possibilities towards data-driven condensed matter research. Apart from large collections of crystal structure information 1-4 , fully integrated frameworks of tools and databases have arisen that allow for high-throughput investigations using a huge amount of, mainly, densityfunctional-theory-based calculations [5][6][7][8] . Here, we present the AiiDA-KKR plugin 9 which connects our full-potential relativistic Korringa-Kohn-Rostoker Green function (KKR) method 10 to the AiiDA (Automated Interactive Infrastructure and Database for Computational Science) framework 8,11 .
The AiiDA infrastructure implements the FAIR principle 12 of findable, accessible, interoperable and reusable data sharing which provides a flexible plugin-based python environment. Through a common interface, different densityfunctional theory codes 13 can even be used in combination to exploit the strengths of different implementations and realise multi-code workflows within the same framework. The KKR method 14,15 is an all-electron implementation of density functional theory. Its Green function formulation allows, for instance, the efficient treatment of defective systems (i.e. systems that contain defects and impurities) which can be very expensive to treat with wavefunction-based methods that often require very large supercells for this task. Our newly developed AiiDA-KKR plugin is used to perform a large number of impurity embedding calculations into the prototypical topological insulator Sb 2 Te 3 .
Topological insulators (TIs) have been the center of attention in solid state research since their extraordinary physical properties, that lead to topologically protected surface states, have been discovered 16 . In the past decade the field around topological materials has evolved steadily and now aims at functionalizing TI materials by interfacing them with other states of matter. For instance, realizing the quantum anomalous Hall (QAH) 17,18 insulator state or Majorana zero modes 19 , that might lead to topological qubits, is pursued. The former requires to combine the topological band structures of TIs with magnetism while the latter needs interfacing topological materials with superconductors. Controlling the interface and understanding the effect defects and imperfections have remains a major challenge in this field to this day.
Apart from the development of the AiiDA-KKR plugin, the outcome of this study is the JuDiT database (Jülich Database of impurities embedded into a Topological insulator) of physical properties of impurities embedded into the surface of Sb 2 Te 3 . We study their tendency for charge doping (i.e. to introduce por n-doping), their impurity magnetic moments and their density of states (DOS). This collection of impurity properties allows to uncover chemical trends and can help to optimize the next generation of TI-based materials in the future. In particular, we investigate the layer and Fermi level dependence of the spin moment and find that Mo Sb defects show a high spin moment while introducing only a small charge doping, which is more than 5 times smaller than for magnetic 3d impurities. Furthermore, we find the Mo defect to be a good candidate for future applications since no impurity resonance appears in the bulk band gap region. An impurity resonance would otherwise lead to higher scattering rates of topological surface state electrons off this defect 20 and a higher probability to produce impurity bands in the gap.
The paper is structured as follows. First, the theoretical setting of the impurity embedding problem within the KKR method is introduced in section II. Then the AiiDA-KKR package is presented (section III) where the calculation and workflow plugins, that are implemented in AiiDA-KKR, are discussed. Afterwards the developments are showcased at the example of high-throughput impurity embedding into the topological insulator material Sb 2 Te 3 in section IV. Finally, section V concludes with a summary.

II. AB INITIO IMPURITY EMBEDDING
One of the advantages that arises from the Green function formulation of the KKR method lies in its ability to include impurities efficiently into crystalline solids 14 . This is achieved making use of the Dyson equation where G host is the Green function of the crystalline host system, ∆V = V imp − V host is the difference in the potential introduced due to the presence of the impurity and G imp is the Green function that describes the impurity embedded into the periodic host crystal. It is important to mention that the change in the potential ∆V occurs only in a small region around the impurity which is why the Dyson equation can be solved in a small real space region around the impurity site. This impurity cluster contains a few neighboring shells of host atoms that are necessary to properly treat the charge screening of the impurity by the neighboring host atoms. It is worthwhile noting that G imp contains the complete information on physical properties like the density of states which is computed as ρ(r; E) = − 1 π ImTrG imp (r, r; E) (the trace is implied over spin-, atom-and orbital momentum degrees of freedom of the Green function).
An impurity embedding calculation in the KKR formalism therefore consists of (i) calculating G host (from a converged host calculation), (ii) creating ∆V , (iii) performing a selfconsistent field cycle (scf) to converge G imp .

III. THE AiiDA-KKR PLUGIN
The AiiDA framework 8,11 is a python package that allows to provide a code agnostic interface for different ab initio codes which enables the automation of calculations. Simultaneously, inputs and outputs of complex sequences of calculations are automatically saved in a database 21 . This ensures the reproducibility of all results due to the stored provenance, which consist of nodes with directed connections, in the database. AiiDA implements the ADES model 8 which provides a common layer of data structures that are used by the plugins of different codes 13 . This enables workflows that use multiple codes, which allows interoperability and exploitation of the individual strengths of different implementations.
To enable high-throughput KKR calculations with AiiDA we developed the open source AiiDA-KKR python plugin 9 that provides a set of calculations and workflows (i.e. complex sequences of calculations) and some accompanying tools. A detailed and up-to-date description of the input and output structure of the individual calculations, workflows and tools of AiiDA-KKR is included in its online documentation 22 where additionally examples for the usage of the plugin are given. In the following only a brief overview of the features implemented in the AiiDA-KKR plugin is given.

A. Calculation plugins
Each calculation plugin comes with the functionality to create code-specific input files from AiiDA objects (e.g. StructureData objects that contain the structural information of a system) and a parser that is able to parse the retrieved output files. The conversion of AiiDA Dict objects to the input file needed by the JuKKR code is facilitated with a python class called kkrparams 23 that also contains methods to verify the consistency of input parameter and write the input file. This class also knows about KKR-specific features like dealing with alloys within the coherent potential approximation (CPA) 14 . Running a KKR calculation through AiiDA creates an acyclic directed graph in the database which is shown schematically for a KkrCalculation in Fig. 1(a). It can be seen that a calculation requires a few input nodes (input parameter Dict node, a Code node, and a ParentCalculation node), which are all AiiDA objects stored in the AiiDA database. The output files are retrieved once the calculation finished and parsed to extract output parameters, that are then stored as a Dict node in the database. It it worthwhile noting that the AiiDA daemon takes care of automatically storing the resulting nodes with their directed connections to the AiiDA database. This eventually results in a complex graph as it is shown exemplary in Fig. 1(b) for a complete impurity-embedding sequence including all steps outlined in section II. The full provenance of a complex procedure of calculations is stored in the database which allows to reproduce all inputs and the intermediate steps that have been performed to arrive at the final result. In the following the different calculations provided in the AiiDA-KKR plugin 9 are briefly discussed.
• The VoronoiCalculation plugin allows to use the voronoi code of the JuKKR package 10 which constructs the shape functions 24,25 needed for the fullpotential treatment and generates starting potentials. The graph of a VoronoiCalculation looks similar to the one shown in Fig. 1(a) except that the ParentCalculation input node is replaced by an AiiDA StructureData node that contains all structural information on the crystal (e.g. lattice constant, atom positions and kinds) 11 .
• The KkrCalculation plugin provides an interface to the KKRhost code of JuKKR 10 which allows to perform self-consistency (scf), density of states (DOS), bandstructure and additional postprocessing calculations (e.g. calculation of Heisenberg exchange interaction parameters 26 ).
• The KkrimpCalculation plugin connects AiiDA to the KKRimp code of the JuKKR package that solves the Dyson equation for impurity embedding (Eq. (1)). This calculation needs in addition to the usual inputs (input parameter (Dict) or ParentCalculation node) the host Green function in the impurity cluster region (G host ) which is written out with a special postprocessing run-mode of the KKRhost program using  a KkrCalculation. The KkrimpCalculation can then be used to perform electronic structure calculations for the impurity problem.

B. Workflow plugins
In addition to the calculation plugins, the AiiDA-KKR package provides some workflows that automate complex sequences of voronoi, KKRhost and KKRimp calculations. The workflows contained in AiiDA-KKR have been developed in a modular way and build upon each other. Internally these are AiiDA WorkChains which is indicated by the wc ending in the names of the workflows. For an in-depth discussion of the input and output structure and their usage we refer to the online documentation 22 . Here we restrict our discussion to a short overview of the workflows of AiiDA-KKR.
• The kkr dos wc workflow conveniently wraps around a KkrCalculation and provides the necessary inputs to perform a DOS calculation. Additionally the output is parsed and the output DOS data is stored as an array in the database which allows easy access and plotting of the output DOS.
• The voro start wc workflow wraps the VoronoiCalculation and performs some additional verification of input structure and KKR-specific parameters in order to make sure the chosen starting setting is reasonable. One of the checks performed automatically within voro start wc makes use of the kkr dos wc workflow.
• The kkr scf wc workflow builds upon the voro start wc workflow and a sophisticated series of KkrCalculations which is intended to reach convergence of a given host system reliably. This makes sure the starting setup is reasonable before the potential is pre-converged until finally convergence with higher accuracy settings is pursued.
• The gf writeout wc workflow takes care of setting the necessary options for a KkrCalculation in order to write out G host in preparation of the impurity embedding step. For a given impurity position and screening cluster size the host Green function can be reused for several impurities that respect this embedding geometry.
• The kkr imp sub wc workflow performs the self consistency cycle of KKRimp calculations similar to the logic implemented in the kkr scf wc workflow. It includes features that deal with possible convergence problems automatically.
• The kkr imp wc workflow combines the voro start wc, gf writeout wc and kkr imp sub wc steps for the impurity problem. This allows to conveniently start with a converged host calculation and the information on the impurity (e.g. its position in the host crystal, the size of the screening cluster) which defines the problem given by Eq. (1) completely.
• The kkr imp dos wc workflow gives an easier access to calculate the DOS of an impurity embedded into a host crystal.

C. Tools
Apart from tools that are used internally within the calculation and workflow plugins (e.g. to prepare the real space screening cluster), the AiiDA-KKR plugin contains a plotting tool called plot kkr. This tool takes a node identifier (the AiiDA node instance, it's pk or uuid) or even a list of nodes and plots a standard, yet fully customisable, plot of the respective data. For example, a typical self-consistency workflow for an impurity calculation (kkr imp wc node) as input to plot kkr will produce a plot of the convergence behavior with the scf iteration number or an instance of a kkr imp dos wc workflow will produce the orbitalresolved plot of the total DOS in the impurity cluster. Examples of such plots are given in Fig. 2. The tool conveniently abstracts away the need to extract the relevant data from the nodes in the database and provides a straightforward way to create commonly used plots.

IV. impDAT:TI -A DATABASE FOR IMPURITIES EMBEDDED INTO A TI
We apply the AiiDA-KKR plugin to embed a large number of impurities into the topological insulator Sb 2 Te 3 . The resulting JuDiT database 41 comes with a webinterface 42 for convenient access to the included data. The following analysis shows some of the physical insights obtained through this study. Our data analysis does not aim at being comprehensive but is intended to showcase the usefulness of our application. A future data-driven study might give additional insights but is beyond the scope of this work.

A. Sb2Te3 host system and computational details
For the JuDiT database we considered substitutional defects (denoted by X Y for impurity X replacing host atom Y ) in the first 3 quintuple layers (QL) of a 6 QL thick film of Sb 2 Te 3 . This allows to study the influence of the topological surface state, that is mainly located in the 1st QL, on impurity properties. The band structure and density of states of the clean host system are shown in Fig. 3. In order to take into consideration doping of the host material, we investigated three possible positions of the Fermi level (E F located in the valence band (VB) and in the conduction band (CB), as well as E F in the bulk band gap). The shifted positions of the Fermi level are highlighted in Fig. 3 with red (E F in VB) and green (E F in CB) lines. For the impurity embedding we neglected structural relaxations of the atoms around the impurities but included the first 3-4 shells of neighbors (containing 21-27 atoms and empty cells in the impurity cluster) within a radius of 4.8Å around the impurities. We used the exact description of the atomic cells 24,25 and the local spin density approximation 27 (LSDA) for the exchange correlation functional. A cutoff for the angular expansion of max = 3 was chosen and corrections for the truncation error using Lloyd's formula have been applied 28 . Relativistic corrections arising within the scalar-relativistic approximations as well as spin-orbit coupling have been taken into account fully selfconsistently for both host and impurity calculations.

B. Physical properties of impurities embedded into Sb2Te3
We start our analysis with an overview of the contents of the JuDiT database. In total more than 2100 impurities have been embedded self-consistently into the Sb 2 Te 3 host system. For each impurity we computed physical properties like the spin and orbital moments, the impurity's DOS as well as the tendency to show impurity resonance in the region of the bulk band gap. Furthermore, we analyzed the charge doping introduced by the defect which we define as where n imp (n host ) are the electron densities for the impurity (host) atom embedded into the surrounding host crystal integrated in the Voronoi cell of the atom. Here, Z imp (Z host ) is the nuclear charge of the impurity (host) atom. The expression in brackets in the rhs of Eq.
(2) therefore contain the information how much charge is transferred to/from the impurity. The results can conveniently be visualized with the JuDiT webinterface 42 . The plots from Fig. 3 as well as the ones of Fig. 4 have been created using the tools available there. Figure 4(a) displays the, over the different impurity sites averaged, charge doping given in electrons per impurity. The observed chemical trends partly fit the behavior of the Pauling electronegativity 29 . The details of the bonding mechanism are however more subtle and can be quantified using ab initio data. This reflects the intricate physics of the chemical bonding which was recently classified to be metavalent for the Sb 2 Te 3 class of materials 30 .
Each impurity in JuDiT also has a detail page where the complete output dictionary of the converged calculation is given and from where it's provenance can be browsed. It also features a plot of the impurity DOS which is shown exemplary in Fig. 4(b) for a Tc Sb impurity located in the fourth Sb layer from the surface. The position of the impurity in the Sb 2 Te 3 host crystal is visualized in Fig. 4(c). It can be seen that the d-states of the Tc atom are exchange-and crystal-field-split which results in a magnetic moment of the impurity and a resonance in the impurity DOS around the Fermi level consequently in the bulk band gap region (blue area in Fig. 4(b), see also Fig. 3). This particular defect is therefore expected to lead to strongly scattering of the topological surface state electrons 20 which can induce significant back-scattering since the Tc defect is a resonant scatterer and is magnetic with a spin moment of 1.86 µ B .
In order to quantify the gap-filling nature of all considered defects, we define the number of states introduced by the defect in the bulk band gap region as where E min and E max are the edges of the bulk band gap region (blue area in Fig. 3), Ω imp is the Voronoi cell around the impurity atom and ρ(r; E) denotes the charge density around the atom in the Voronoi cell. A high gap-filling value consequently signals that scattering off that particular impurity will be increased which could be detrimental to the desired transport properties of TI materials. This is especially the case if the impurity is magnetic and the k → −k backscattering channel reopens due to broken time reversal symmetry.

C. Magnetic impurities
We now focus our attention to magnetic defects which are found for some transition metal impurities. These systems are interesting in the context of realizing a robust quantum anomalous Hall phase. We start by investigating the layer and Fermi level dependence of the spin moment of 3d transition metal impurities shown in Fig. 5(a). We see a gradual increase of the magnetic moment when going from V over Cr up to Mn dopants (blue orange and green symbols, respectively) before a subsequent decrease with Fe and Co impurities (red, violet) is observed. This behavior is expected from Hund's rule and reflects the, from V to Co, increasing filling of the d-shell (see also Fig. A.1 in the appendix).
The details of the impurity's electronic structure are largely determined by two factors: (i) the atomic nature of the impurity atom determining its number of electrons in the atomic configuration and (ii) the interaction with the surrounding atoms of the host crystal that affect the hybridization of the atomic states of the impurity with the host's band structure. This effect is seen in the layer dependence of the size of the spin-moment which is highlighted for V Sb and V Te defects with the blue dashed lines in Fig. 5(a). We observe a higher spin-moment for V Te compared to V Sb which can be attributed to a larger charge transfer to the impurity in the Sb substitutional site compared to the Te substitutional site. The larger charge transfer to the impurity results in a higher filling of the V d-shell and therefore a higher spin moment which can also be seen for the Cr impurity. The same mechanism leads to a decrease in the spin moment that is found for Co Sb defects compared to Co Te .
Investigating the dependence of the impurity spin moment on the Fermi level (different symbols in Fig. 5(a)) reveals that the details of the hybridization with the electronic band structure of the host material can be controlled via the position of the Fermi level in the host material. A shift in the host's Fermi level can experimentally be achieved by appropriate doping with Bi Sb impurities 31 . This change in the impurity moment has been seen previously both theoretical 32 as well as experimental 33 and can be attributed to the competition between the impurity seeking charge neutrality and the strong change in the hybridization with the host's electronic structure with varying position of the Fermi level due to the presence of the bulk band gap in TI materials. This leads, for instance, to decreasing (increasing) spin moments for V Sb (V Te ) with increasing position of the Fermi level.
Overall we observe that for V impurities the spread in the spin moment with the impurity's surrounding (i.e. its layer dependence) is twice as large as with varying Fermi level. On the contrary, for Mn defects the change in the spin moment in different layers and with varying Fermi level is always rather small. This results from the half-filling of the Mn d orbital that make the spin moment relatively insensitive to small changes in the hybridization with the host's electronic structure. Nevertheless, the magnetic interactions among multiple magnetic impurity atoms can, even at small changes in the impurity hybridization, be strongly affected 32,33 . Figure 5(b) summarizes the charge doping ∆n imp for all defects included in JuDiT. The zigzag behavior with the impurity's core charge reflects the structure of the periodic table with its isoelectronic groups. Each data point is colored by the impurity's spin moment which shows that the maximum of the spin-moment is found for Mn and Fe impurities (yellow points). It can however be seen that the charge doping introduced by these 3d defects is fairly large which reflects the significant difference in electronegativity compared to the Sb and Te host atoms 29 .
Applying magnetic doping to achieve a robust quantum anomalous Hall phase needs to fulfill some boundary condi-tions in order to be feasible in experiments. In order to not tune the Fermi level out of the bulk band gap by magnetic doping, the induced charge doping should be as small as possible. At the same time magnetism is the key ingredient which calls for a sizable spin moment of the impurity. Furthermore, the magnetic impurity should not show a high DOS in the bulk band gap region to reduce the appearance of unwanted impurity bands with increasing magnetic doping. If we apply these conditions of a gap filling of n imp gap < 0.02 e, a charge doping of ∆n imp < 0.1 e and a spin-moment of m s > 1.5 µ B we find that Mo Sb defects meet all these criteria. Compared to the Te substitutional site, which also shows a small charge doping and high spin moment, the Sb substitution have a gap filling which is an order of magnitude smaller and could therefore be desirable. To be able to use Mo-dopants for the realization of the QAH state the magnetic exchange coupling between Mo atoms needs to be investigated further in the future. This is, however, beyond the scope of this work. The study of the exchange interactions will be especially interesting since Mo-doping of Bi 2 Se 3 showed signatures of antiferromagnetic coupling 34 which could be possibly overcome by appropriate band structure and defect engineering as it was seen for Mn and Co doping of Bi 2 Te 3 32 . Here, additional co-doping with other defects could open another way to design TI-based materials for future applications 35 .

V. CONCLUSIONS
In conclusion, we have developed the AiiDA-KKR plugin which is an open source python package that connects the JuKKR code family to the AiiDA framework. This allows to perform Korringa-Kohn-Rostoker Green function calculations in an automated high-throughput manner. We concentrated on the ability to perform ab initio impurity embedding into the topological insulator Sb 2 Te 3 .
We considered several thousand different impurities embedded into the different layers of the Sb 2 Te 3 host crystal. This procedure allowed us to study the layer and Fermi level dependence of physical properties of defects. Specifically, we studied the chemical trends in terms of the impurity's charge doping, their tendency to create resonances in the bulk band gap and their magnetic properties. The results have been collected in the JuDiT database which is openly available and comes with online tools for data visualization and export. Throughout our analysis we have seen that the details of the electronic structure of an impurity embedded into a host crystal is very important. The hybridization of the impurity states with its surrounding plays a crucial role for its physical properties. This highlights the relevance and the need for our ab initio calculations which provide predictive power.
In the future the AiiDA-KKR plugin in general and the resulting data of this study in particular can be used in broader high-throughput studies for quantum materials. The capabilities of AiiDA-KKR could be extended to further include the automated calculation of scattering [36][37][38] and transport properties 39,40 or to investigate magnetic exchange interactions 26 . Especially the vast space of combinations co- doping with other impurities introduces will be of interest in order to find ways to tune physical properties and engineer the behavior of TI-based materials.