A charge-density-based general cation insertion algorithm for generating new Li-ion cathode materials

Future lithium (Li) energy storage technologies, in particular solid-state configurations with a Li metal anode, opens up the possibility of using cathode materials that do not necessarily contain Li in its as-made state. To accelerate the discovery and design of such materials, we develop a general, chemically, and structurally agnostic methodology for identifying the optimal Li sites in any crystalline material. For a given crystal structure, we attempt multiple Li insertions at symmetrically in-equivalent positions by analyzing the electronic charge density obtained from first-principles density functional theory. In this report, we demonstrate the effectiveness of this procedure in successfully identifying the positions of the Li ion in well-known cathode materials using only the empty host (charged) material as guidance. Furthermore, applying the algorithm to over 2000 candidate cathode empty host materials we obtain statistics of Li site preferences to guide future developments of novel Li-ion cathode materials, particularly for solid-state applications.


INTRODUCTION
The need for rechargeable energy storage with higher energy density and longer operating life-time is a technical challenge for a future, carbon-neutral society. Although many battery chemistries have been commercialized in different usage scenarios, Li-ion batteries have proven to be the most successful, showing tremendous growth since its first implementation in 1991 1 . While current Li-ion technology is ubiquitous in portable electronics, there are numerous barriers that limit the wide-spread adoption in other sectors, such as transportation and the electric grid. A critical concern in current Li-ion technology is safety, due to the use of a highly oxidized cathode together with a combustible organic electrolyte 2 . Another issue is cost and element criticality-cobaltcontaining cathodes already account for a significant portion of the current cobalt market, and it is highly uncertain whether projected market growth can be accommodated 3,4 .
All-solid-state batteries replace the flammable organic electrolyte with an inorganic solid electrolyte, and are predicted to improve on both safety as well as chemical diversity 5 . In particular, the solid-state electrolyte enables the use of a metal anode, which eliminates the need for a cathode as the source of the working ion and hence opens up the playing field to a broader chemical and structural space, also considering different application specifics, beyond those traditionally tailored for portable electronics. To accelerate the discovery and design of novel cathode materials, particularly focusing on materials where the working ion is not present in the as-made material, we here develop a software and data analysis infrastructure to identify possible intercalating structures within the framework of the Materials Project 6 .
As of Feburary 2020, the Materials Project includes 125,134 material entries, 21,584 of which contain at least one transitionmetal redox element in addition to either oxygen or sulfur but none of the following possible working ions-Li + , Ca 2+ , Mg 2+ , and Na + . These candidate intercalation host materials cover 6862 distinct combinations of atomic species, which we call chemical systems. Computational investigation of any material for cathode applications require first-principles calculations of the structure at different states of charge, e.g., different working ion compositions. A thorough study of the full set of these chemical systems has not been possible until now due to several challenges, the primary one being: automated identification the preferred positions of the working ion in the host cathode structure. Naively, one may hypothesize that the best position to place a working ion is the most "empty" site in the structure, but this concept is not welldefined. In this work, we present an algorithmic framework that suggests possible insertion sites in any crystal structure using the charge density of the host material. While the method is inherently agnostic to the working ion, e.g., can be applied to Na + , Mg 2+ , Ca 2+ , etc., we will hereon focus on Li-ion insertion as the most immediately applicable technology. We validate the approach by predicting viable Li sites based on the charge density of the delithiated phases of well-known cathode materials and compare the results to the original discharged compound. Going further, we then attempt Li insertions into 2271 candidate cathode materials available within the Materials Project, which allows us to systematically investigate the effect of inserting Li into different chemical systems as well as the local coordination preference of Li, across a large diverse set of chemistries and structures. Statistics of Li insertion site preferences and chemical environments are presented.

Validation of insertion algorithm
A charge-density-based cation insertion algorithm is described in the "Methods" section. To demonstrate the insertion algorithm, we apply it to two well-known cathode materials: lithium manganese oxide (LMO) (Li x MnO 2 ) in the λ phase and lithium iron phosphate (Li x FePO 4 , or LFP) in the olivine phase, which provides two very different testing materials. The Wyckoff symbols for each site are obtained from a structure with all candidate (e.g., Li) sites occupied using pymatgen's interface to spglib library 7 . LMO presents a 3D percolating network of tetrahedral (8b) and octahedral (16d) Li sites, where typically only the 8b sites are occupied in the as-made material. In contrast, the olivine structure exhibits 1D channels of Li sites at the 4a Wyckoff position. Since both LMO and LFP are synthesized in their lithiated state, we first remove the Li ions to obtain the corresponding "empty" host structures, on which we demonstrate our insertion algorithm. The resulting relaxed atomic structures of spinel MnO 2 and olivine FePO 4 are shown in Fig. 1 with green/blue/yellow spheres indicating each local minima in the charge density. The positions of their local minima (in crystal coordinates), their average charge density, and the symmetry-equivalent group each local minima belongs to are shown in Supplementary Tables 1 and 2. For the 12-atom unit cell of spinel MnO 2 , we observe 8 local minima in the charge density, providing candidate sites for Li insertion. These sites form the three symmetry-equivalent groups as shown in Fig. 1a. The site labels correspond to Wyckoff symbols in the unit cell A ↦ 4d, B ↦ 2b and C ↦ 2a. Within the conventional (48 atom) cubic cell used in literature 8 , the Wyckoff symbols are: A ↦ 16d, B ↦ 8b and C ↦ 8a. The two lowest charge density sites are the A (4d, octahedral) site and the B (2b, tetrahedral) site. Within a sphere of radius r 0 = 0.4 Å, each A-site exhibits an average charge density (ρ) of 8.6 me/Å 3 and each Bsite has an ρ ¼ 12:9 me/Å 3 , respectively. There is a third group of symmetrically equivalent local minima, C sites, which are enclosed by MnO 6 octahedra, and exhibit a much higher average charge density of ρ ¼ 161:1 me/Å 3 .
The candidate Li insertion sites of olivine FePO 4 can also be organized into three groups with their corresponding Wyckoff symbols A ↦ 4a, B ↦ 4b and C ↦ 4c. The A and B-sites fall within two copies of the same low charge density regions in FePO 4 unit cell as shown in Fig. 1b, and exhibits average charge densities of 6.8 me/Å 3 and 7.8 me/Å 3 respectively. Similar to the previous case, the C-type local minima in FePO 4 exhibits much higher charge density-58.2 me/Å 3 .
For each identified, symmetry-unique Li site, we perform a single structure optimization with a Li ion inserted into that site. The resulting energies are then used to determine the most stable insertion site in LMO and LFP. It should be acknowledged that since we are often inserting into the primitive cells of the delithiated materials, the unit cells tend to be smaller (the smallest lattice vector is 5.83 Å for MnO 2 and 4.89 Å for FePO 4 ) which may incur finite-size effects. Also, we emphasize that any effects of Li arrangement and orderings in larger simulation cells are not considered explicitly.
To analyze the energy differences for the different candidate sites, using LMO as an example, we define the relative stability of the inserted structures, indexed by α, by the binding energy: where the notation (Li:α)MnO 2 denotes the relaxed structure after Li was inserted at site α ∈ {A, B, C} and E[Li metal ] is the energy per atom of bulk Li metal. The relative binding energy ΔE B of a particular structure is defined as which is the difference in energy between a given inserted structure and the most stable inserted structure for that material. This relative binding energy ΔE B is used henceforth as a measure of the Li site preference Li in a given host structure. The relaxed atomic configurations and binding energies of inserting Li onto different sites in LMO and LFO are shown in Fig. 2. For the A and B sites in each material, the insertion of Li: is defined as a topotactic reaction if the host crystal structure is unchanged after a single Li atom was inserted and the new structure has been optimized. The definition of "unchanged"  inherently depends on a predefined metric. Here, we utilize the structure matching capabilities in pymatgen 9 with the default tolerances parameters for the fractional length (ltol = 0.2), site position (stol = 0.3) and relative angles (angletol = 5 ∘ ). Using this definition, in MnO 2 , only the A-site and B-site insertions were classified as topotactic reactions, in agreement with the known insertion sites in the LMO spinel. On the other hand, the C-site insertion occurs at a site with much higher charge density, indicating severe repulsive interactions with the surrounding ions in LMO. As a result, when the structure with Li at the Csite is relaxed, the Li atom disrupts the MnO 2 lattice, and displaces the surrounding atoms to create space as shown in Fig. 2. The resulting configuration is more than 4 eV/atom higher in energy than the A and B-site inserted structures.
In FePO 4 , all three insertions were found to be topotactic. The Li atoms inserted into the A and B positions did not move significantly or cause large distortions in the lattice. The C-site insertion resulted in an atomic configuration that was symmetrically equivalent to the A-site insertion, indicating that the initial C-site position was too far from a local minimum in the potential energy surface. The A-site in FePO 4 was found to be the most stable, in agreement with known Li insertion into the olivine phase of LiFePO 4 10 , while the B-site is a meta-stable 4b site that connects the 4a sites within the 1D percolating channels 11,12 .
From the binding energies reported in Fig. 2, we observe that the charge-density provides a useful descriptor for site finding and correctly generates new structures with the stable and metastable Li sites for MnO 2 and FePO 4 . Moreover, the stable/ metastable Li sites all result from Li insertion at positions with low average charge density, hence providing a good descriptor for Li insertion sites at dilute Li concentrations. Incidentally, we note that, for both of these materials, migration between symmetryequivalent A and B-sites creates a percolating network between neighboring sites of similar low charge densities, indicating a possibility of favorable migration.
While the site finder algorithm performed well for the MnO 2 spinel and FePO 4 olivine cathode materials, a more systematic way to gauge its performance across different chemistries and structures is needed to confidently apply it to a broader compound space such as e.g., the Materials Project. To further test the accuracy of the algorithm, we selected eight cathode material from the Materials Project with experimentally verified structures, ionic positions and ICSD entries representing a variety of chemical and structural motifs. To further test the limitations of our insertion process, and to highlight its effectiveness in closepacked structures, we also included two Li-ion conductor materials with a varying packing efficiency. For the loose-packed example, we use a new super-ionic conductor material recently realized by Di Stefano et al. 13 LiTi 2 (PS 4 ) 3 (LTPS), which presents TiS 6 octahedra connected by edge-sharing thiophosphate (PS 4 ) groups resulting in a porous network of open cavities, where the Li ions are delocalized at room temperature. For the close-packed example, we include a hypothetical structural variant on the same chemistry, LiTi 2 (PS 4 ) 3 , in the NASICON structure where the lowenergy Li sites are known. These materials are listed in Table 1, and for each material, we removed the Li from the structure and performed a structural relaxation on the host lattice, obtaining the charge density of the "empty" host material. We insert a Li atom into each symmetry-unique site using the insertion algorithm described above. For each material in Table 1, we perform stepwise Li insertion: (i) relax the atomic structure of the host material, (ii) analyze the charge density of the host material to identify a list of symmetry-unique sites, (iii) create new structures by inserting one Li atom into each symmetry-unique site, and (iv) relax each new structure again, finally (v) we utilize the new structure with the lowest energy and repeat from step (ii) until we reach the Li content of the original material.
For each material we examine the charge density minima (fs i g) of the relaxed delithiated structure to carefully evaluate the performance of the insertion algorithm. The Wyckoff symbols of the first three candidate sites with the lowest average charge densities are given in Table 1. For seven out of the ten materials, the local minima with the lowest average charge density (labeled s 1 ) coincided with a Li site of the original structure. In all cases, except the novel LTPS from ref. 13 , repeated Li insertions into the most stable candidate site was enough to recover the original, fully lithiated, structure. In the case of LTPS, Li does not exhibit typical crystallographic sites (e.g, octahedral or tetrahedral)instead showing delocalized occupancy over broader "pocket" regions, which provides a highly challenging test case.
For the spinel phases LiTi 2 O 4 and LiMn3 2 Ni1 2 O 4 , the positions ofs 1 are symmetrically distinct from the Li atoms in the original crystal. The atomic structures of these two materials are shown Fig. 3 along with the three groups of candidate sites with the lowest ρ. For LiTi 2 O 4 , the host structure and fractional coordinates of the candidate sites are identical to those of λ-MnO 2 . Again, the two lowest charge density sites exhibit Wyckoff symbols 4d and 2b, respectively. In both materials, the ΔE B for these sites were very similar, with the 4d site being favored in MnO 2 and the 2b site being favored in TiO 2 . For LiMn 3/2 Ni 1/2 O 8 , the lithium positions coincide with the eights 3 (8c) local minima in the charge density. We note that in such cases, where the energy difference between sites are small, it is likely that Li-Li interactions, which are not taken into account here, affect the final site preference.
The novel ionic conductor (LTPS) reported by ref. 13 exhibits two candidate site regions with the lowest average charge densities. These regions correspond to the centers of the large (2d) and small (4c) cavities of the structure. As expected in such large empty cavities, during relaxation, the Li ion moves towards the wall of the cavity, mimicking absorption. The third lowest charge density site is located in a crevice of the pore linkers. Interestingly, when relaxed, the large cavity (2d) sites 1 ends up the least stable (2.1 eV higher in the energy), whereas thes 2 ands 3 sites are close in energy, 78 meV apart, with thes 3 site corresponding to one of the regions of high Li concentration observed in the ref. 13 . We speculate that, given the high degree of Li ion delocalization observed in this structure, there is a collection of sites exhibiting a flat energy landscape which enables fast ion migration. Among our three candidate sites, our framework was able to successfully identify two relevant regions of Li occupancy, however, we emphasize that further exploration and validation may be necessary for broad applications to porous structural families.
To demonstrate that the method can be applied to other complex, covalent network structures, we also attempted Li insertion on a hypothetical NASICON 14 configuration of Ti 2 (PS 4 ) 3 , e.g., chemically identical to LTPS. For this hypothetical material, the typical NASICON working-ion sites coincide with the local minima in the charge density of the relaxed Ti 2 (PS 4 ) 3 structure with 6e and 2b Wyckoff symbols, which exhibit the two lowest charge densities.
Since the effect of explicit Li insertion in most cases is accompanied with lattice relaxation and changes in the local environment, we do not expect the charge density be a reliable indicator of the ranking of stable Li sites. However, for each material in Table 1, the original Li positions were always among the three lowest-density sites identified by the charge density method. Additionally, the insertion site with the lowest binding energy (the underlined entries in Table 1) always coincided with a Li position in the original structure. This lends the confidence to our site-finding algorithm and indicates that the optimal site can be found by systematic insertion of the active cation into a handful of candidate individual sites, while recording the resulting energy. Structures with higher lithium content can be obtained by repeated insertions on the lowest energy structure, until one of the following criteria are met: (i) the new structure is no longer topotactically related to the original one, (ii) the new structure is too far above the convex-hull or (iii) the minimum redox state has been achieved. For the materials in Table 1 this type of repeated insertion was sufficient to produced the original, known lithiated structure. Although these results are highly promising, it is important to note that finite-size effects such as Li-Li interactions can influence the insertion sequence, especially for candidate sites with very similar energies.
Statistical analysis of topotactic insertion Using our benchmarked algorithm to explore possible Li intercalation into any structure, we are poised to investigate the success rate of topotactic insertions into the broad range of structurally distinct transition metal oxides and sulfides available in the Materials Project. However, 21,584 materials present a formidable number of possible Li insertions, and hence we limit the number of possible compounds that exhibit a computed energy above hull E hull < 0.1 eV 15 and limited cell size (less or equal to 20 atoms per unit cell). This selection process results in a candidate set of reasonably stable, computationally tractable 15,484 transition-metal compounds. Attempting Li insertion into all of the symmetry-unique sites in all these structures is still intractable given current computational resources; however, we can elucidate the site preference of Li across diverse transition metal systems by attempting insertions on a smaller, randomly selected subset of these structures. Hence, we applied Li insertion framework developed above on this set (which we will call S inserted ) of 2271 materials.
To understand the chemical diversity of the materials in S inserted , we plot the occurrences of different transition metals in Fig. 4. Interestingly, while the Materials Project does not enforce uniform representation of different chemical systems, similar numbers of entries are found on the diagonal of Fig. 4, representing materials that only contain one type of TM. That is, the random selection fortuitously resulted in similar numbers of single TM-containing compounds. Materials that contain more than one kind of TM atom manifest as off-diagonal cells in Fig. 4, and they are less common in S inserted .
For each material in S inserted , we attempt a single Li insertion into each candidate site identified by the charge density. A material is deemed to exhibit a successful topotactic insertion if any one of the attempted Li insertions was topotactic. For each combination of TM atoms, the percentage of materials which exhibit successful topotactic insertions is given in Fig. 5. Even though multiple TM-containing compounds are less common, they still represent a significant portion of the materials in our dataset. Although there is some variation in the topotactic Li insertion success rate across different combinations of TMs, the overall success rate for the 2271 TM containing host materials is 88% (or 2059 materials).  In addition to the chemical diversity, the set of materials S inserted also exhibit significant diversity in their crystal structure, which were analysed using the Robocrystallographer software 16 . Of the 2271 structures in S inserted , less than half (971) matched any well-known prototype mineral structure. For the ones that did, 50 distinct mineral types can be identified within the set of 971 materials. The mineral types with more than ten matching structures in S inserted are tabulated in Table 2 along with the number of matching structures. For each mineral type, the percentage of structures with successful topotactic Li insertions is also shown. From Table 2, we can see that well-known cathode structures like 3-D spinel structure and the 2-D molybdenite structure have high rates of successful topotacitic insertion. However, more tightly packed structures like the cubic and orthorhombic perovskite structures exhibit much lower rates of successful insertion.
As a first observation, regardless of structure or chemical composition, we find that the thermodynamic prerequisites for topotatic insertion of Li is a common phenomenon. We emphasize that in order for Li to actually enter and leave the structure, good kinetics is also required. However, sites and structure-tolerance to intercalation is a non-negotiable to cathode performance. Our results presents a promising, as well as daunting, prospect as almost all of the transition metal chemical space is open for a broader search of new Li-ion cathode materials.
Local environment preference of Li The coordination preference of Li in compounds of various structure and chemistry has been a long-standing question in cathode research 17,18 . The Li site insertion methodology presented here can address this question and investigate the coordination preference of Li in a systematic way. For all structures with a single topotactically inserted Li atom, we compute the coordination number of that Li atom using the ChemEnv 19,20 local environment descriptor. For any given structure and target ion, ChemEnv provides a continuous descriptor which attributes a fraction of commonly observed local-environment motifs to the chemical environment of the target ion, where the sum of all the fractions are enforce to equal unity. The assignment of local-environment fractions is sigmoidal such that small deviations from the ideal geometry still produce similar results. As an example, a perfectly octahedral environment would exhibit a fraction of 1.0 for octahedral environment and zero for all others. If a Jahn-Tellar distortion is applied where the z-direction bonds have been lengthened by 10% and the xy-plane bonds have been reduced by 10% the local environment still have an octahedral fraction of >0.99 with a small addition of square planar. However, if a distortion of 20% is applied, the local environment will be only 0.17 octahedral and 0.83 square planar.
We first examine the dataset comprised of all compounds that exhibited a successful topotactic insertion, per the procedure described above. We emphasize that this set may include several varieties of the same compound, at the same Li composition, as metastable (but relaxed) sites would be included. The distribution of the coordination number for this set of all topotactically inserted Li is shown by the blue bar histogram in Fig. 6. Not surprisingly, from Fig. 6, we find that topotactitically inserted Li has a general preference for coordination numbers 4 (CN4) and 6 (CN6). Final relaxed positions with CN4 accounts for 36% of all the topotactically inserted Li atoms, while sites with CN6 accounts for 33%. Together, these two coordination environments account for a majority of topotactic Li insertions in our dataset.
One may question whether the coordination preference changes if we consider the dataset comprising only the most stable sites (lowest ΔE B ) as a function of Li concentration, for each material. This analysis is provided as a comparative yellow bar histogram in Fig. 6. Interestingly, we find that the site preference distribution of the stable sites is quite similar to the full set of all topotactic sites, indicating that the obtained Li coordination  Mineral prototypes with more than ten occurrences in the set of insertionattempted materials. For each prototype, the number of occurrences in the set and the percentage of the materials within each prototype group with successful topotactic insertions R top . preference is robust, even when metastable (but relaxed) sites are considered as well. It further supports that the insertion site algorithm together with subsequent relaxation is able to locate suitable sites, as no coordination preference is included in this process.
Within the subsets of Li positions with CN4 and CN6, we compute the ChemEnv coordination parameters for further analysis. For a given coordination number, there are a number of well-known coordination environments recommended by the IUPAC 21 . For CN4 the possible motifs are tetrahedral, square coplanar, and see-saw; for CN6 the motifs are octahedral, pentagonal pyramidal, and trigonal prism. A given site is identified as having a particular local environment if the fraction of a given motif is above the cutoff criterion of 0.85. For the 10,956 Li insertion positions among the 2271 materials, 8535 exhibits topotactic Li sites of which 4549 are identifiable with one of the CN4 or CN6 motifs, according to the cutoff criterion. If we restrict ourselves to topotactic sites that represent the stable (lowest energy) Li insertion in a given material, 1120 out of the 2010 stable topotactic sites were identifiable. The distributions of these motifs for the CN4 and CN6 sites are shown in Fig. 7.
The results presented in Fig. 7 indicate that the octahedral local environment is the most common motif, but tetrahedral topotactic insertions are also common. If we restrict ourselves to materials with stable topotactic insertion sites for Li, we find that 80% of these sites are four or six coordinated, with tetrahedral and octahedral sites accounting for the majority of identifiable structure motifs. While these structural motifs are clearly preferred, out of the 2010 stable sites only about half (1120) were identifiable with any CN4 or CN6 motif, while 76% where identifiable with any motif at all using a ChemEnv fraction cutoff of >0.85. Since the set of stable topotactic insertion site show such great diversity in their local environment, the precise site preference should always be explicitly investigated using our insertion algorithm.

DISCUSSION
In summary we have developed a generic atom insertion algorithm for suggesting atomic insertion sites by analysing the electronic charge density and demonstrated how this algorithm performs for Li insertions in candidate cathode, transition metalcontaining material systems. We showed that the local minima of the charge density provide good initial guesses for the final lithium positions, such that the lowest energy sites can be easily recovered using a small number of calculations on these candidate sites. In fact, the most stable site is often found among the local minima with the lowest surrounding charge density. By iteratively inserting Li atoms and optimizing the new structures to identify the most stable new Li position, the fully intercalated structures of known Li cathodes can be recovered from the completely delithiated structures. Using this method, we attempted Li insertions on a set of 2271 randomly selected transition metal compounds and showed that Li can be inserted topotactically on an overwhelming majority of these structures. Finally, we demonstrated that the long-held belief that Li has a strong preference for 4 and 6 coordinated sites is valid across a large set of transition metal-containing compounds, with the octahedral coordination being the most common local environment for topotactically inserted Li atoms.

METHODS
The total energy of each atomic arrangement was calculated from first principles using density functional theory with the generalized-gradient approximation (GGA) functional of Perdew-Burke-Ernzerhof (PBE), as implemented in the Vienna Ab initio Simulation Package (VASP). To obtain accurate and directly comparable energies across chemical systems, we use GGA when appropriate, GGA+U otherwise, and mix energies from the two calculation methodologies using the procedure described by Jain et al. 22 . Structural relaxation was performed until the maximum force on each atom was less than 0.05 eV/Å.
Our insertion algorithm relies on analysis of the charge density-ρ(r) of the original host compound, without a priori information of the preferred Li-ion intercalation sites. For a given host structure, ρ can either be obtained from the Materials Project or computed and stored in a database using the atomate 23 framework. A set of local minima s i of ρ(r), as well as the average charge density in the local neighborhood of radius r 0 around each local minima, are identified. For the remaining discussions, we assume that r 0 = 0.4 Å whenever the radius is not specified (i.e., ρðsÞ ¼ ρ 0:4 ðsÞ). We note that the charge density of the host structure will often exhibit high symmetry, which will be reflected by the charge density. Hence, many of the local minima are symmetrically equivalent, and the set {s i } can be reduced to a (often much) smaller subset of symmetrically unique sites fs i g. Starting with a host structure, we perform an independent structure relaxation calculation with a single Li inserted at eachs i site, in order to obtain and identify new, suitable lithiated structures.

DATA AVAILABILITY
The data that support the findings of this study are available on the Materials Project website (materialsproject.org). The material ids of the 2271 materials in the set S inserted of materials with attempted Li insertions are listed in the Supplementary  Table 3.

CODE AVAILABILITY
Code for workflow generation and supporting analysis is available in the atomate code repository (atomate.org) and in the pymatgen code repository (pymatgen.org) respectively. Fig. 7 Distribution of local environment motifs. Occurrences of identifiable local environment motifs for the CN4 and CN6 topotactic Li insertion sites for the entire set of topotactic insertions (blue) and for the most stable topotactic insertion (yellow).