Enhancing ReaxFF for molecular dynamics simulations of lithium-ion batteries: an interactive reparameterization protocol

Lithium-ion batteries (LIBs) have become an essential technology for the green economy transition, as they are widely used in portable electronics, electric vehicles, and renewable energy systems. The solid-electrolyte interphase (SEI) is a key component for the correct operation, performance, and safety of LIBs. The SEI arises from the initial thermal metastability of the anode-electrolyte interface, and the resulting electrolyte reduction products stabilize the interface by forming an electrochemical buffer window. This article aims to make a first—but important—step towards enhancing the parametrization of a widely-used reactive force field (ReaxFF) for accurate molecular dynamics (MD) simulations of SEI components in LIBs. To this end, we focus on Lithium Fluoride (LiF), an inorganic salt of great interest due to its beneficial properties in the passivation layer. The protocol relies heavily on various Python libraries designed to work with atomistic simulations allowing robust automation of all the reparameterization steps. The proposed set of configurations, and the resulting dataset, allow the new ReaxFF to recover the solid nature of the inorganic salt and improve the mass transport properties prediction from MD simulation. The optimized ReaxFF surpasses the previously available force field by accurately adjusting the diffusivity of lithium in the solid lattice, resulting in a two-order-of-magnitude improvement in its prediction at room temperature. However, our comprehensive investigation of the simulation shows the strong sensitivity of the ReaxFF to the training set, making its ability to interpolate the potential energy surface challenging. Consequently, the current formulation of ReaxFF can be effectively employed to model specific and well-defined phenomena by utilizing the proposed interactive reparameterization protocol to construct the dataset. Overall, this work represents a significant initial step towards refining ReaxFF for precise reactive MD simulations, shedding light on the challenges and limitations of ReaxFF force field parametrization. The demonstrated limitations emphasize the potential for developing more versatile and advanced force fields to upscale ab initio simulation through our interactive reparameterization protocol, enabling more accurate and comprehensive MD simulations in the future.


Introduction
Lithium-ion batteries (LIBs) are and will continue to play an important role in the energy transition for energy storage and the fossil fuel replacement [1][2][3][4]. In the last decades, the growing popularity of electric vehicles, and energy storage systems [5] has fuelled the demand for batteries with greater capacity, more consistent performance over time, and improved safety [6]. While the materials used as anodes and cathodes largely determine battery capacity, the degradation phenomena that occur within the device govern battery stability and safety. One of the most important and yet poorly understood of these phenomena is the formation of a thin passivization film at the electrode-electrolyte interface, known as the solid electrolyte interphase (SEI) [7,8]. This layer is formed by the irreversible reaction of lithium ions with the electrolyte due to the initial thermodynamic instability between the anode and electrolyte [9]. The ions used to create the SEI are subtracted from the battery's capacity. Indeed the increase in the SEI over time is one of the causes of capacity fade observed during battery charging cycles [10]. Additionally, the uncontrolled formation and growth of the SEI layer can limit the performance and life of current LIBs, and its degradation can lead to severe battery damage and uncontrolled exothermic reactions such as the battery thermal runaway [11]. Therefore, it is essential to model the SEI to understand and predict the behaviour of lithium-ion batteries and improve their performance and safety [12]. Despite the importance of SEI, it remains a conundrum due to its high reactivity and multiscale nature, which make challenging its study in operando and in silico, respectively [3,13,14]. The theoretical understanding of the SEI may allow control of its final characteristics and it can enable the engineering of such layer, thus possibly leading to enhanced properties including better electronic insulation, prevention or alleviation of graphite anode exfoliation [15] and dendrite formation in lithium metal batteries [16]. In addition, the modelling and subsequent theoretical understanding of SEI can contribute to significantly accelerate the discovery of new materials and electrolytes for future lithium-ion batteries [3,14].
In the past two decades, various atomistic techniques have been utilized to study the kinetics of electrolyte dissociation in lithium-ion batteries [12]. The density functional theory (DFT) method has proven particularly useful in predicting the effects of lithium salt and additives on the final components of the solid electrolyte interphase [17][18][19][20][21][22]. Additionally, DFT has been used to investigate how the surface of the anode affects the electrolyte dissociation [23][24][25][26] and to calculate various properties of SEI products, including ionic conductivity [27][28][29], electron-transfer properties [26,30,31], and elastic modulus [32][33][34]. Although such simulations produce highly accurate results, they are computationally expensive. The cost of DFT simulations increases with the number of atoms in the system, typically scaling with O(N 2 ) where N is the number of atoms [35], limiting the size of the system that can be simulated to a few thousand atoms and the time interval to a few picoseconds. To observe the evolution of the SEI over extended periods, one solution is to use approximate functions V (r i ), known as force fields (FF), to calculate the system's energy. Indeed, the physical intuition-based functions employed in many force fields enable the replacement of the expensive Kohn-Sham functional E [n] [36], based on electronic density n, with an approximate functional E [V (r i )] that depends on the atomic configuration of the system r i neglecting the degree of freedom of the electrons and thus being computationally less demanding [36]. Given the reactive nature of lithium-ion batteries, traditional FF methods, which rely on harmonic laws to approximate bonded interactions, are inadequate for their study. To overcome this limitation, the ReaxFF method [37], which utilizes the bond-order (BO) and electron equilibration method (EEM) [38] to describe the connectivity and charges of atoms, respectively, has been suggested and utilized. This enabled the use of molecular dynamics (MD) simulations to observe the electrolyte decomposition and the formation of SEI components in LIBs. For instance, Kim et al. [39] used ReaxFF optimized for C-H-O-Li species by Han at al. [40] to study the formation of the SEI on lithium metal surfaces. They observed the production and layering of organic and inorganic components in the SEI on lithium metal surface, resulting from the decomposition of an EC and Dimethyl carbonate (DMC) electrolyte mixture. Using the same force field, Guk et al. [41] demonstrated how the layer of SEI that forms on the surface of graphite prevents the percolation of the electrolyte and the anode exfoliation. After this initial work, Yun et al. [42] expanded the FF to include C-H-O-Li-Si-Li-F atoms, enabling the study of SEI formation in high-capacity batteries where graphite is replaced with silicon, which has a higher theoretical capacity (372 mA h g −1 for LiC 6 and 4212 mA h g −1 for Li 4.4 Si) [43]. This parameterization was then further enhanced by Wang et al. [44] Current available ReaxFFs have demonstrated success in reproducing dissociation energies and reaction kinetics. However, as discussed below in this work, they fall short in accurately describing the solid phase of the SEI components.
In this study, we propose a protocol for reparameterizing the Yun et al. ReaxFF for C-H-O-Li-Si-Li-F atoms. Our approach focuses on correcting parameters related to Li and F atoms to better capture the properties of lithium fluoride (LiF), which is one of the possible inorganic salts resulting from the formation of SEI. LiF stems mainly from fluoroethylene carbonate (FEC) decomposition: LiF-rich SEI is known to exhibit improved cycling stability in batteries [45]. Furthermore, LiF has exceptional properties such as high ion transport, electronic insulation, and mechanical properties [46], making it a crucial element in the engineering of the solid electrolyte interface, as demonstrated by Tan et al. [47] The advancements in powerful python libraries for the management of atomistic systems, as outlined in the Table 1, have enabled the automation and orchestration of atomistic simulations using Jupyter notebooks and libraries such as ASE [48,49], PyMatgen [50,51], and ParAMS [52,53]. These tools have facilitated the parametrization process and made it more efficient. By sharing the Jupyter notebooks [54,55] and the database used in this parametrization work, Authors hope to address a current gap in the field of ReaxFF parameterization, i.e., the lack of access to raw data and datasets, which can limit the reproducibility and validity of the results presented.
This article is structured as follows: firstly, we present the reparameterization process and results of the new ReaxFF. Afterwards, we demonstrate how the enhanced force field improves the representation of the crystal structure of the LiF inorganic salt. We then proceed to test the ability of the enhanced force field to accurately describe the lithium mobility in LiF and highlight how it provides a more realistic diffusivity value for the Li atoms. Then we present a deep investigation of the results from the new ReaxFF, discussing the challenging aspects of this kind of potential. In the subsequent section, conclusion are drawn and a possible future outlook provided. Finally, the methods used are outlined in the final section. Table 1: List of the most commonly used and useful Python tools and libraries in computational materials science. These libraries can be used in one or more phases of material in silico study: PreP = pre-processing, Run = simulation running, and PostP = post-processing.

Description
PreP Run PostP ASE [48,49] Atomic Simulation Environment (ASE) is a Python library that provides a versatile framework for configuring, running, visualizing, and analyse atomistic simulations. Thanks to the object Calculator ASE provides a powerful interface to different codes.

✓ ✓ ✓
Pymatgen [50,51] Python Materials Genomics (Pymatgen) is a library for materials analysis, with a particular focus on solidstate studies and extensively used to produce and collect the data for the Materials Project (MP) [56] database. It has functions for reading and manipulating structural, thermodynamic, and electronic properties of materials. It can also handle inputs and outputs from various DFT codes and easily access the MP crystallography database via its integrated API.

✓ ✓ ✓
MDAnalysis [59,60] It is a versatile Python library mainly focused on allowing the manipulation and analysis of MD trajectories.
With support for various trajectory and system configuration formats, it simplifies the process of translating data into n-dimensional NumPy [61] arrays for further analysis.
It is a library and web application framework for handling, manipulating, and analysing crystal structures. It is widely used in the MP database as an interactively visualizing web tool.
✓ Quippy [63,64] It is the Python high-level interface to QUantum mechanics and Interatomic Potentials (QUIP) [65]. QUIP is a set of software tools for performing MD simulations that can be used as a plugin for other programs such as LAMMPS, CP2K [66], and ASE.

✓ ✓
PyLammps [67,68] Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) is a powerful classical molecular dynamics simulation code. The python package, which can be installed after code compilation, allows to manage the LAMMPS simulations through the lammps module, a wrapper for the code's C-library API.

ReaxFF reparameterization
In the currently available ReaxFF force fields, all the energy contributions, including the Li-F interactions, were parameterized using a database of ab initio simulations specifically designed to capture the dissociation energy for potential reactions in the system. As a result, both the force field of Yun et al. [42] and that of Wang et al. [44] can predict the reaction products in Si-Li batteries with acceptable accuracy, but the lack of data on the crystal properties of the inorganic salt limits the description of LiF aggregation and solid-phase transitions. To incorporate this information into the ReaxFF, a new database was created following the procedure proposed by LaBrosse et al. [73] specifically designed to build a force field for solid materials, in their case, pure cobalt crystal. The new database [74] was created using stable and metastable crystalline unit cells from the Material Project database [56] (Figure 1). Several initial configurations were created using the Pymatgen and ASE libraries. Over 300 DFT simulations were performed using those systems, and all relevant quantities, such as energies, forces, and partial charges, were extracted for the objective function.
In accordance with FAIR (Findable, Accessible, Interoperable, and Reusable) principles [75], we stored all system and simulation data in an SQLite3 database using the ASE library. This database, along with metadata and clear usage instruction, has been shared on Zenodo (https://doi.org/10.5281/zenodo.7959121) and Github (https://github.com/paolodeangelis/Enhancing_ReaxFF_DFT_database) for easy accessibility and usability. The ReaxFF coefficients were optimized using the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) [76] genetic algorithm. Generally speaking, this evolution algorithm searches for the optimal solution by sampling the population from a multivariate normal distribution (MVN). The generated samples are ranked based using the chosen loss function, and the best points, representing lower values for the loss function, are utilized to update the covariance and mean of the MVN. Then the updated MVN is then used to obtain the next generation of samples, and the process continues iteratively until convergence or other predefined stopping criteria are met. In our case, the loss function is the sum of square errors (SSE) between values obtained from density functional theory calculations and molecular dynamics simulations. The algorithm converged after 1.3 × 10 4 iterations as shown in Figure 2, thanks to the use of the step-wise optimization technique already employed by Wang et al. [77]. Indeed, similar to the study conducted by Wang et al. [77], we capitalized on the physicalinspired structure of the ReaxFF force field, where parameters are categorized based on interaction and atom type, as outlined in Table 2. The approach consisted in gradually optimizing a selection of parameters relevant to particular types of interactions, instead than letting the algorithm operate in a very high-dimensional search space. Hence, we proceeded by optimizing initially the values associated with the bond interactions, then the values related to the dispersion energy, as evidenced by the leap in the loss function in Figure 2. However, after these first two subsets, further optimization of angle and torsional interactions was not feasible because all the initial guess values for their coefficients produced similar loss functions, which prevented the calculation Table 2 Listing of ReaxFF coefficients for each section in the force field file, organized by interaction type. Please note that the numbers in the second column represent the number of coefficients per single entry (e.g., for a single atom type or a single bond). However, the total number of parameters required in the force field depends on the total number of atom types (N A ), as indicated in the third column.
for each possible angle in the for each possible dihedral angle in the system  (Figure 3b). The first plot was obtained by applying a shear strain ε xy to a 2 × 2 × 2 supercell of the stable LiF crystal. While the EOS was computed by expanding and contracting the system using a three-dimensional volumetric strain, since the two are connected by the following relation ∆V /V = ε xx + ε yy + ε zz . At a glance, it is clear that the previous parameterizations were unable to predict the condensed state of LiF accurately. Indeed, neither the force field developed by Yun et al. [42], nor that of Wang et al. [44] showed the presence of a minimum configuration in both the elastic strain and EOS cases. The shear deformation case was particularly problematic because the inverted concavity of the curves led to a nonphysical negative value for the elastic tensor component c 12 , where the stress tensor is defined as c ij = σ ij ε −1 ij with ε ij representing the strain tensor and σ ij the stress tensor. The concavity of the energystrain plot determines this component since it is the function of the second partial derivative of the energy with respect to strain, i.e., c 12 = V −1 0 · ∂ 2 E/∂ε 1 ∂ε 2 . On the other side, thanks to the newly designed database, the reoptimized ReaxFF force field accurately predicts equilibrium configurations for both EOS and elastic strain curves. Specifically, the prediction of the shear strain closely matches ab initio results in the region close to the equilibrium value, but deviates substantially for high shear strain values, indicating its limitations in predicting material stiffness accurately. Similarly, the ReaxFF prediction of the equation of state agrees well with the reference values, even for significant volume changes. This remarkable improvement of the proposed reparameterization over previous ones is also demonstrated for the one-dimensional deformation of the crystal and for the metastable crystals (as shown in supplementary Figure S18). It is important to note, however, that the new force field predictions are less accurate for metastable cases with hexagonal and body-centered cubic lattice (Figure 1), suggesting that the functionals used to describe various interactions may have limitations in generalizing the energies from the DFT simulations.
Theor. To validate the new parameterization, as first test case, we conducted reactive MD simulations using all the discussed ReaxFFs for systems with Li and F atoms to simulate bulk LiF at ambient conditions (300 K, 1 bar). The simulation was performed on a relatively big system with respect to the ones used for the training. Indeed, a supercell of the LiF crystal created by replicating the conventional unit cell five times in all directions was placed in a simulation box of size 20.4 × 20.4 × 20.4; Å 3 , which is analogous to the system in Figure 1d. After the initial relaxation of the system, including box size adjustment, the pure LiF crystal was simulated for 50 ps under an NVT ensemble (constant number of atoms, volume, and temperature) set at 300 K and with a time-step of 0.25 fs. In Figure 4 we report the total energy, temperature, and radial distribution function (RDF) between the Li and F atoms, g Li,F (r), evaluated during the reactive MD simulations with different potentials. Comparing the results from the simulations using the Yun et al. [42] and Wang et al. [44] force fields ( Figure 4a and Figure 4b), we observed similar behavior, indicating minimal differences in the parameters for Li and F used in both ReaxFF. Furthermore, we can observe that both force fields failed to accurately describe the simulated system, as the thermal agitation overpowered the binding energy, resulting in the amorphization of the inorganic salt when it reached the external heat bath temperature. This resulted in a sudden drop in the total energy at 8 ps and the increase in temperature upto 750 K, as a consequence of the rapid conversion of potential energy into kinetic energy that results into an artificial phase change. Indeed, since that the melting temperature of LiF is 1121.35 K (848.2 • C) [78], the system should ideally exist in a solid phase in the simulated virtual conditions. This remarkable and unexpected result is even more evident looking at the two radial distribution functions in Figure 4a and Figure 4b, where the initial peaks due to the ordered structure of the system and immobility of the atoms disappear, and the curves become liquid-like RDF as the temperature rises. Moreover, we can notice a shift in the initial RDF peaks with respect to the theoretical position represented by the vertical dotted line, indicating an anomaly in the LiF solid state description of the two FFs. In contrast, our new parameterization performed well in simulating a simple bulk LiF system. Indeed, Figure 4c shows that the radial distribution function has initial peaks that align with theoretical values and remain consistent even at high temperatures. In this case, the thermal agitation of the atoms resulted only in the smoothening of the main peaks, and the emergence of secondary peaks due to the oscillation of some atoms between the lattice and interstitial positions. This important improvement is also evident by visualizing the temperature growth, which gradually reaches 300 K and is held constant as the system converges to an equilibrium state, as visible from the total energy (Figure 4c). In designing the training set, we aimed at enhancing the accuracy of predicting both the mechanical and thermophysical properties of the inorganic material through ReaxFF simulation. To achieve this, we included configurations with point defects like vacancies, interstitials, and substitutions and incorporated data from ab initio simulations conducted at different temperatures. Therefore, the second test case for the new ReaxFF was to compute the lithium transport properties with a campaign of MD simulations. We evaluated the mass diffusivity of Li atoms in three systems illustrated on the right side of Figure 1, specifically: one with 10 % interstitial lithium atoms (Li 1.1 F), one with 10 % lithium vacancies (Li 0.9 F), and one with no defects (LiF). These systems were simulated for 500 ps at three different NVT temperatures (300 K, 400 K, and 500 K). The lithium diffusion coefficients were extracted by measuring the mean square displacement (MSD) during the simulation and by using the Einstein-Smoluchowski diffusion equation [79][80][81]:

Diffusion of Li in LiF prediction from ReaxFF
where the limit argument is the MSD definition, t is time, D is the diffusion coefficient, and c d is a constant indicating dimensionality, with values of 2 for one-dimensional, 4 for two-dimensional, and 6 for three-dimensional diffusion. We obtained the diffusion coefficient values by determining the slope of the straight line that best fits the MSD evolution for each simulation (as seen in the supplementary Figure S19-S21). In Figure 5, we plotted them on a logarithmic scale and used the inverse of the temperature as abscissa. This enabled us to extract the activation energy that appears in the Arrhenius equation: Here, D is the lithium diffusion coefficient from the MD simulations, D 0 is the maximum diffusivity value (the limit value at infinite temperatures), k B is the Boltzmann constant, and E a is the activation energy. We repeated this process for all three ReaxFFs discussed, and the resulting values of D 0 and E a are reported in Table 3.
The new force field provides an improved description of bonding forces, which results in a significant reduction in the mobility of lithium atoms within the crystal lattice, as expected. For example, in the case study of defect-free LiF supercell at 300 K, the diffusion coefficient predicted by both the Yun et al. [42] and Wang et al. [44] force fields is 3.56 × 10 −5 cm 2 /s, which is significantly closer to the diffusivity of lithium ions in the electrolyte (0.5 × 10 −5 − 1.4 × 10 −5 cm 2 /s [82]) rather than in the solid. In contrast, the new parameterization predicts a diffusivity of 3.44 × 10 −8 cm 2 /s, which is a reduction of 3 orders of magnitude. Similar reductions in diffusivity are seen in cases involving interstitials and vacancies. We compared the obtained values with ab initio values calculated by Zheng et al. [29]. They studied lithium diffusion in various organic components of the SEI by exploring the energy surface using the surface energy Climbing Image Nudged Elastic Band (CI-NEB) method [83]. The exploration of the energy surface by Zheng et al. confirmed that the diffusion of lithium in lithium fluoride occurs via three distinct possible mechanisms. These mechanisms include: vacancy movement, where lithium Table 3 The values of the activation energy, Ea, and the maximum value of the lithium self-diffusion coefficient, D 0 , for the three simulated systems and for each ReaxFF used. The errors were estimated based on the inference for the linear regression model coefficients (least squares method), and assuming a 95 % confidence interval.
Yun et al. atoms jump to the adjacent empty lattice site (vacancy); direct-hopping, where lithium moves directly from one interstitial site to another; and the knock-off mechanism, in which an interstitial lithium atom replaces a lithium atom in the crystal lattice, thus causing the displaced lithium atom to move into another interstitial site. Figure 6 shows the Arrhenius curves for the three diffusion mechanisms calculated using the values of activation energy and maximum diffusion obtained from the ab initio study, as well as the curves obtained from the reactive molecular dynamics simulations.
Regarding the maximum diffusion coefficient D 0 = 3α 2 v, Zheng et al. approximate its value using the estimated phonon frequency in the crystal v = 1 × 10 13 s −1 and the migration distance for the lithium ions α observed during the CI-NEB simulations [29]. Molecular dynamics simulations can exhibit various diffusion mechanisms, leading to the calculated curves for the new force field and each studied system falling within the region defined by the three diffusion mechanisms. For the calculated temperature values, the diffusion coefficients turn out to be intermediate values between pure knockoff diffusion and pure direct-hopping diffusion. However, looking at the slope of the Arrhenius curves predicted by molecular dynamics simulations in Figure 6 it is clear that even the new ReaxFF underestimated the activation energy. Indeed, even for the most favorable diffusion mechanism (knock-off), the activation energy obtained from the ab initio energy profile is 24.1 kJ mol −1 , which is more than double the value calculated with the new force field for the defect-free case (11.0 kJ mol −1 ). The discrepancies are even more significant for all other cases listed in Table 3.
To address this discrepancy, we conducted further analysis of the ReaxFF by reproducing the energy curves for lithium diffusion by vacancies and by direct-hopping using the CI-NEB method. Initial and final configurations were made using a 2 × 2 × 2 LiF supercell and inserting the lithium atoms into two adjacent sites identified with the Voronoi analysis from the pymatgen python library [50]. While for the vacancy diffusion case, one lithium was removed from the crystal lattice from two adjacent unit cells. The energy profile was then obtained using the CI-NEB algorithm with the same parameters used in the DFT simulation for training the new ReaxFF. To ensure convergence, we used 23 images for direct-hopping diffusion and 17 images for vacancy diffusion. Our simulations yielded activation energies of 64.3 kJ mol −1 for  vacancy diffusion and 87.5 kJ mol −1 for direct-hopping. The value for vacancy diffusion was comparable to previous work (63.7 kJ mol −1 ). Still, the value for direct-hopping is higher than that calculated by Zheng et al. (52.1 kJ mol −1 ) due to the use of a smaller system to reduce the computational cost, resulting in a higher defect density in our case. Subsequently, from the result obtained from the DFT simulations, we proceeded to calculate each image's energy using the various ReaxFF discussed. In Figure 7 we show the comparison of the energy profile predicted by the various methods. The additional investigation of ReaxFF presented in Figure 7 reveals significant discrepancies between previous parameterizations, DFT prediction, and the reoptimized force field. Specifically, the energy barrier value prediction was found to be incorrect for all the ReaxFF studied, and the new force field exhibited unrealistic behavior in the case of interstitial diffusion. Indeed, the maximum energy values corresponded to the initial and final configurations, while the minimum value corresponded to the transition configuration. This behavior contradicts the physical nature of the phenomena, and the predictions made by the DFT calculation and even by previous parameterizations by Yun et al. [42] and Wang et al. [44]. In order to be able to explain this numerical artifact, it is important to consider the general structure of ReaxFF, which The markers indicate the exact energy value of each distinct image obtained from the convergence of the CI-NEB algorithm and were used to calculate the energy using three different methods: ReaxFFF by Yun et al. [42], Wang et al. [44], and a new reparameterization, represented by blue, green, and red colored markers and line, respectively.
is described by the provided equation: V reaxF F represents the total reactive potential of the particle i interacting with neighboring atoms, which can be divided into three contributions: bonded, non-bonded, and "system specific". The first describes the bonded interaction between atoms and is formed by the bond energy V bond , the angle energy V angle and the dihedral (or torsional) energy V dihedral and the corrective terms over-coordination V over , and undercoordination V under energies. The non-bonded energy, which is analogous to classical molecular dynamics, is given by the sum of the van der Waals potential V vdW and Coulomb potential V q . The last terms, on the other hand, are a set of corrective energies that accounts for specific phenomena in certain specific systems. For additional details about the mathematical and physical implications of each term, the reader can refer to the works by Van Duin et al. [37] and Chenoweth et al. [84]. From Eq. (3) and the previous studies, we can attribute the unrealistic behavior observed with the new force field to the increased energy contribution from bond energy, which improves the description of system connectivity. However, this results in an increase in the effect of correction energy terms, specifically over-coordination energy V over , and under-coordination energy, V under , in Eq. (3), which are functions of the difference between the total number of bonds of an atom and its valence number [37]. Indeed, in the initial and final conditions, the lithium atom has the same number of neighboring atoms, which is higher than in the transition state, where space is created by moving the lattice atoms, and the increase of the lithium distance respects the atoms at the extremes of the unit cell. This unexpected artifact highlights the critical importance of carefully considering energy contributions in force fields to ensure accurate predictions, as observed in the case study using ReaxFF. Consequently, it is likely that additional configurations are needed to be included in the training set. However, it is important to note that this alone does not guarantee the resolution of all the aforementioned issues. In fact, due to the reliance on a fixed functional shape in the force field, a more deep and sophisticated alteration in the mathematical expression of the energy terms might be required. Implementing such changes would involve modifying the code, which exceeds the scope of this work. This also highlights that a critical challenge here is the limited transferability of the parameterization beyond the specific database, emphasizing the necessity of a comprehensive database of ab initio simulations that is specifically tailored and adequately representative for studying the targeted system at an atomistic scale.

Conclusion
In conclusion, the proposed partial reparameterization methodology for the ReaxFF has significantly improved the description of lithium fluoride in the solid state, leading to better predictions of its solid phase properties and lithium mobility in LiF crystal using reactive MD simulations. Indeed, implementing the new parameters of the ReaxFF has demonstrated its ability to predict the stable unit cell under mechanical deformation accurately, exhibit typical solid-state RDF, and notably reduce lithium diffusivity in LiF by at least two orders of magnitude. The automation and interactivity of the protocol, achieved by leveraging Python libraries for atomistic simulations, made it possible to construct and simulate various configurations of the LiF crystal needed to build a database for correcting bond and van der Waals interactions of the ReaxFF. The new force field obtained from the reoptimization not only improved the behavior of the crystal in the solid phase but also partially corrected the description of lithium transport phenomena in LiF.
However, the in-depth investigation carried out revealed that the diffusion activation energies predicted by the new force field are still underestimated. This limitation may be due to the method used to construct the database, which did not directly sample lithium transport, but focused more on local or global deformation of the crystal lattice. This study highlights the strong dependence of the ReaxFF on the configurations included in the database and the challenges in adequately interpolating the energy surface in unexplored regions. The need to correct the ReaxFF for each specific case makes it challenging to adopt it for the up-scaling electronic simulations (here DFT) up to the molecular level. Based on our experience, the need for ever-larger databases poses serious questions on the cost-benefit ratio of this method. Although the ReaxFF MD simulations are much faster than ab initio simulations, given all the functionals to be computed and the need to update the charges at each step reduces by at least a factor of 100, the speed of MD simulations compared to simulations with Lennard-Jones fluid [85].
In addition to the sensitivity of the ReaxFF to the training set used, the difficulty in accurately describing the system studied in this work may also be attributed to an intrinsic bias of the force field. Indeed, it is worth noting that the original formulation of the ReaxFF potentials proposed by Van Duin in 2001 was validated primarily on organic systems [37], and various new functionals have been added to improve the accuracy and applicability of ReaxFF in other systems over the past few decades [86][87][88]. These include the "lone-pair" energy term (V lp in Eq. (3)) for hydrocarbon combustion [84,86,89], a three-body functional (V trip in Eq. (3)) term for −NO 2 group chemistry [86,90,91], and an energy term (V H−bond in Eq. (3)) to account for hydrogen bonds in aqueous systems [84,92], among others. Therefore, it may be necessary to introduce new corrective energy terms to improve the accuracy of ReaxFF in strongly inorganic systems such as LiF.
On the other hand, the rapid development of neural network-based force fields [93] (NNFF) may provide alternative and accurate approaches to the ReaxFF. The database constructed using the proposed protocol could be used for training these new force fields, which can provide superior performance at the cost of compromising on the physical insight into the parameters obtained from the training. In summary, the proposed methodology can be extended to the parameterization of other potentials, and by increasing the number of initial configurations, it may also be possible to proceed with the parameterization of neural network potentials. Moreover, the guided reparametrization method could incorporate other frequently encountered inorganic compounds in the SEI within future ReaxFF or NNFF. This expansion offers an exciting opportunity to explore increasingly intricate and realistic systems resembling mosaic structures [94] that emulate the Peled model [7].

Interactive reparameterization protocol
The protocol for calculating the new ReaxFF parameters is presented in a flowchart shown in Figure 8. The procedure is carried out using four Jupyter Notebooks (JNBs) [54,55] that facilitate the automatic construction, visualization, and simulation of the necessary configurations for database construction and optimization of the reactive potential. Python libraries ASE [48,49], pymatgen [50,51], PLAMS [69,70] and ParAMS [52,53], designed for atomistic systems manipulation and simulation are utilized throughout the entire process, allowing for streamlined and efficient handling of the various steps.
The first JNB initiates the protocol by defining the chemical formula of the component to study. Then, using the API from the Materials Project database, the available crystalline units are downloaded. In the case of LiF, there are three configurations available, corresponding to different lattices (Figure 1.a-c), one stable (Fm3m) and two metastable (P6 3 mc, Pm3m). These crystals are imported into the interactive work environment as virtual objects from the ASE (Atoms) and pymatgen (Structure) libraries. Within this virtual environment, the crystals were manipulated to generate Did the loss function improve respect the optimization of the previous interaction subset?
Is the new ReaxFF suitable for your application?

Start
Using the crystal's chemical formula, querying the Material Project database via its API for all stable and metastable unit structures.
Generation of initial dataset configurations using ASE and pymatgen libraries.
Obtain energy and force values by averaging DFT simulations of Single Point and Geometry Optimization.
Select a subset of ReaxFF interaction parameters: Bond, van der Waals, Angle, Torsion, etc.
Using a gradient-free optimizer, get the parameters that minimize the loss function.
Build the dataset for the training by extracting the needed quantities from the DFT simulations, and select a loss function. initial configurations for the DFT simulations. Specifically, we build different defected or deformed crystals following the procedure by LaBrosse et al. [73], namely: Supercells: To capture the effect of the atoms' coordination in the training set, several supercells of size 2 × 1 × 1, 2 × 2 × 1, 2 × 2 × 2, 3 × 2 × 2, 3 × 3 × 2, and 3 × 3 × 3 were constructed. Vacancies: We randomly removed a lithium or fluorine atom from the 3 × 2 × 2 supercell, up to a maximum of five vacancies. Strain: We applied different types of strain to each LiF crystal, including a normal strain ε 11 , a shear strain ε 12 = ε 21 , and a homogeneous deformation in all directions ε 11 = ε 22 = ε 33 (needed for computing the equation of state), resulting in 13 configurations with strain values ranging from −12.5% to 23.5%. Substitution: We randomly substituted a lithium or fluorine atom with the opposite species, with a maximum of five possible defects for each crystal. Interstitial: We inserted an interstitial atom at the center of Voronoi volumes obtained from the 3 × 2 × 2 supercell for each LiF crystal. We repeated this procedure five times to create structures with 1, 2, ..., and 5 interstitial atoms. Slab: To include surface energy in the ReaxFF, we generated crystal slabs for the surfaces (100), (110), (111), and (210). We repeated each slab 2, 3, or 4 times to account for different thicknesses and we introduced sufficient empty space along the normal direction to avoid numerical artifacts due to the periodic boundary conditions (PBCs). Bulk at 300 K and 500 K: We included frames of ab initio MD simulations at 300 K and 500 K for the 3 × 2 × 2 supercell to predict possible energy fluctuations during ReaxFF MD. Amorphous LiF: To account for the high-energy state of the crystal, we included possible amorphous LiF with an ab initio MD simulation at T = 2500 K.
In the second JNB, previously constructed configurations are used to perform ab initio simulations to obtain the numerical values needed for the databases. The ASE and pymatgen objects of the various systems were used as input to the PLAMS library of the AMS [71] commercial code, enabling the running, control, and processing of DFT simulations. For LiF, over 300 DFT simulations were carried out using the BAND plane-wave DFT code available in AMS, as listed in the supplementary Table S1. Single-point (SP) simulations were used to calculate the energy and force values of each atom in each configuration, by solving the Kohn-Sham equations while considering the atom cores as fixed. For systems with defects, geometry optimization (GO) simulations were initially performed to determine the equilibrium configuration. Ab initio MD simulations were performed using Grimme's extended density functional based on tight-binding (DFTB) [95] due to their low numerical cost. From the resulting trajectories, 10 frames were selected, and their force and energy values were refined using SP-DFT simulations.
The third and fourth JNBs heavily rely on the AMS ParAMS [52,53] library, which is specifically designed for potential optimization. In the third JNB, all quantities, such as energy, forces, and charges needed for the database, are extracted, resulting in more than 3000 entries. Regarding the potential energy, because it is a state function, the database does not include the absolute values obtained from the DFT simulations but the relative values obtained with respect to the defect-free configuration of the LiF crystal. This database is then used to optimize the ReaxFF in the fourth and final JNB. The optimization process begins by selecting the group of parameters that will form the search space, starting with those that significantly influence the behavior of the ReaxFF, such as the bond energy, followed by the van der Waals energy and angular energy terms. Various gradient-free optimization algorithms are available in the ParAMS library, and for the ReaxFF optimization for LiF, we chose the genetic algorithm Covariance Matrix Adaptation Evolution Strategy algorithm (CMA-ES) [76] to minimize the objective function represented by the sum of squared errors (SSE) between the DFT values and the ReaxFF values.
Each group or subset of parameters is optimized sequentially, and the ReaxFF optimization process is considered complete when further improvements in the loss function become negligible.

Density functional theory calculations
The dataset values (energies, forces, charges, etc.) for the ReaxFF optimization were obtained from DFT simulations performed with the commercial BAND [96,97] code available in the AMS suite. We numerically solved the Kohn-Sham equations using the Perdew-Burke-Ernzerhof (PBE) [98] functional and the polarized double zeta (DZ) numerical atomic orbitals (NAOs) basis set for the calculation of the s, p, and d orbitals. The software automatically chose the values of k-point and frozen electrons depending on the desired accuracy, and we selected a high accuracy value that guaranteed an error of less than 0.01 eV per atom, and by comparing the formation energy values obtained for each crystalline unit of LiF studied with the values reported on the Material Project online database (see Figure 1a-c). For very inhomogeneous systems, such as those with a large number of interstitial atoms or surfaces, we calculated forces without frozen atoms and using a single zeta (SZ) type basis set to speed up the calculation of the equilibrium configuration. We then refined the resulting configuration using the settings described above.
For detailed instructions on installing and utilizing the protocol and database repository, please refer to the supplementary material provided.

Molecular Dynamics calculations
All the reactive molecular dynamics simulations were performed using the openaccess code LAMMPS [67] with the ReaxFF package [99]. The initial configuration for the diffusion of Li in bulk LiF was obtained by starting from the primitive unit cell of the stable crystal obtained from the Material Project database. The unit cell was then converted into the conventional unit cell using the pymatgen routine ConventionalCellTransformation [50] and replicated six times along all directions to obtain a 6 × 6 × 6 supercell with final dimensions of 24.5 Å × 24.5 Å × 24.5 Å. To create the system with vacancies (i.e. Li 0.9 F), 86 randomly selected lithium atoms were removed. While to create the system with interstitial atoms (i.e. Li 1.1 F), 86 lithium atoms were placed inside the LiF supercell using the PACKMOL code [100]. After the initial energy minimization, the system was simulated for 0.5 ns using an NVT ensemble at three different temperatures (300 K, 400 K, and 500 K) with a Nose-Hoover thermostat [101] and a relaxation time of τ T = 25 fs. The integration time step was set to δt = 0.25 fs. Thermodynamic properties, including the instantaneous mean square displacement for all lithium atoms, were sampled every 50 simulation steps to compute the diffusivity as discussed later.

Diffusion energy barrier calculations (DFT)
To determine the energy barrier of Li ion diffusion, we used the climbing image-nudged elastic band (CI-NEB) method [83]. To employ this method, we built a 2 × 2 × 2 supercell of the primitive cell of the stable LiF crystal and placed a Li atom in the interstitial site found with the Voronoi analysis [102] in two adjacent cells to study the diffusion by interstitials. While, for the vacancy case, we removed a Li atom from two adjacent cells of the supercell. After geometry optimization of the initial and final states, we performed the CI-NEB calculation using 23 images for the direct-hopping diffusion case and 17 images for the vacancy case. We set the maximum perpendicular force component for the transitional state to be 2.5 eV Å −1 as the climbing convergence criterion. Finally, we obtained the activation energy by averaging the energy differences between the initial and transitional state and the final and transitional state.

Diffusion energy barrier calculations (MD)
To compute the diffusion coefficient D for each combination of temperature, system, and potential, we employed the Einstein-Smoluchowski diffusion equation [79][80][81], Eq. (1), that requires the time derivative of the MSD obtained from ReaxFF MD simulations. We obtained the numerical MSD from the MD trajectories and fitted it to a linear model of the form ⟨r 2 ⟩ = α 0 + α 1 · t + ε, where ⟨r 2 ⟩ represents the mean-square displacement, ε is the statistical error, and t denotes the elapsed simulation time ( Figure S19-S21). To determine the coefficients α 0 and α 1 , we used the least squares method (LSM) [103]. Consequently, the diffusivity can be evaluated as D = α 1 /6. An analogous procedure was followed to compute the activation energy from the Arrhenius law, Eq. (2). To linearize the equation, we took the logarithm of both sides and applied the variable substitution x = T −1 . This manipulation resulted in the equation taking the form ln(D) = ln(D 0 ) − E a /k B · x ( Figure 5). Using the diffusion coefficients obtained from the ReaxFF MD simulations, we fitted a linear model of the form ln(D) = β 0 + β 1 · x + ε to obtain the activation energy as function slope of the line, i.e., E a = −β 1 · k B .
To obtain the reported 95% confidence intervals for the diffusivity and activation energy in Table 3, we assumed that the statistical error ε of the linear model follows a Student's t-distribution T ν , where ν represents the degrees of freedom of the distribution [103]. In our case, ν equals the number of sampled data n minus the constraints of our model, which are the intercept and slope of the model (ν = n − 2). Under these reasonable hypotheses, the uncertainty for the diffusivity, δD, and activation energy, δE a , is estimated as follows [103]: (4) t n−2, 0.025 is the cuts probability 0.025 from the upper tail of Student's t distribution T n−2 , with n − 2 degrees of freedom. α 0 , α 1 , β 0 , and β 1 are the coefficients of the two linear models determined from the linear regression. ⟨r 2 ⟩ i represents the MSD at the i-th time t i of the simulation, k B is the Boltzmann constant, and D i and T i are the diffusivity and corresponding simulation temperature values, respectively. We use the notation ⟨ · ⟩ to indicate the mean value of the independent sampled variable of each model, i.e., the time, t, for the first model, and the reciprocal of the temperature, T −1 , for the second. The confidence interval for the entire linear model depicted in Figure 6 was instead obtained with the Eq. (5), which provides the uncertainty for the logarithm of the diffusivity, ln(D), as a function of the reciprocal of the temperature, T −1 , which is the independent variable of the linear model [103]:

Acknowledgements
The project leading to this application has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 957189 (BIG-MAP project). The authors also acknowledge that the results of this research have been achieved using the DECI resource ARCHER2 based in UK at EPCC with support from the PRACE aisbl.

Author contributions statement
The research project was conceptualized and designed collaboratively by P.D.A., M.F., P.A., and E.C. In particular, they collectively discussed the feasibility and suitability of using ReaxFF for investigating the SEI formation. E.C. and P.D.A. subsequently decided to focus on the LiF compound. P.D.A. developed, tested, and refined the protocol for reoptimizing the ReaxFF, and conducted the simulations and postprocessing of the data with the assistance of R.C. P.D.A., R.C. and M.F. worked on analyzing and interpreting the results, and P.D.A. wrote the manuscript with input from R.C., M.F., E.C. and P.A. The project was supervised by E.C. and P.A. All the authors have read and approved the final manuscript.

Additional information
Together with the material presented in this article, there is a supporting information document that contains further plots and data about the training data for the reparameterization and the mew ReaxFF file, as well as an additional plot of Li diffusion analysis. This supporting information document can be found at https://TBD.

Accession codes
In this study, we employed several computational tools to carry out our simulations.
We used the open-source LAMMPS code for the ReaxFF MD simulations, while the commercial codes BAND and DFTB were utilized for the DFT and ab initio MD simulations, respectively. These last two codes are all available in the Amsterdam Modeling Suite by SCM. In addition, we employed several Python libraries, including ASE, Pymatgen, and Packmol, to arrange the initial configurations, as well as the PLAMS library and an in-house Python code for automatically managing the simulation campaign. Finally, we utilized the ParAMS library to perform the ReaxFF parametrization.