INDEEDopt: a deep learning-based ReaxFF parameterization framework

Empirical interatomic potentials require optimization of force field parameters to tune interatomic interactions to mimic ones obtained by quantum chemistry-based methods. The optimization of the parameters is complex and requires the development of new techniques. Here, we propose an INitial-DEsign Enhanced Deep learning-based OPTimization (INDEEDopt) framework to accelerate and improve the quality of the ReaxFF parameterization. The procedure starts with a Latin Hypercube Design (LHD) algorithm that is used to explore the parameter landscape extensively. The LHD passes the information about explored regions to a deep learning model, which finds the minimum discrepancy regions and eliminates unfeasible regions, and constructs a more comprehensive understanding of physically meaningful parameter space. We demonstrate the procedure here for the parameterization of a nickel–chromium binary force field and a tungsten–sulfide–carbon–oxygen–hydrogen quinary force field. We show that INDEEDopt produces improved accuracies in shorter development time compared to the conventional optimization method.


INTRODUCTION
Atomistic-scale insights have been critical to understanding the dynamical evolution of chemically reactive systems and therefore have created a demand for the development of computational chemistry techniques. Quantum mechanics (QM)-based atomistic simulation methods have been commonly used to study the molecular dynamics (MD) because these methods provide accurate energies, charges, and reaction pathways. However, simulation times and sizes are highly constrained by computational cost, and these limitations are some of the motivations behind the development of empirical potentials. The empirical potentials provide fast access to forces and, in turn, the dynamical evolution of much larger chemical systems for longer simulation times in various conditions (e.g., temperature, pressure). However, due to fixed connectivity between atoms, traditional empirical potentials have limited capability of modeling the evolution of systems during reactive events. In order to bridge the gap between QM and empirical potential-based methods, several reactive force field-based atomistic simulation techniques have been developed [1][2][3][4] . ReaxFF is one of the widely used reactive interatomic potential in this category due to its reliability and transferability between chemical systems. Initially, ReaxFF interatomic potential has been formulized to model hydrocarbons, and then extended through silica, nitramine-based materials to several aqueous and combustion systems 5 . Currently, the ReaxFF method covers over 50 elements, and is applicable to a wide range of chemical systems that are of interest to materials science community (i.e., 2D materials [6][7][8][9][10][11][12][13] , electrochemistry 14,15 , thin films 16,17 , nanotubes 18,19 , catalysts 20,21 ).
ReaxFF uses bond-order and charge-dependent functionals to define intra-and interatomic interactions. The simplest form of the total energy equation used in ReaxFF can be given as below, and a detailed and most up-to-date form can be found in ref. 22 : (1) where E bond , E angle , and E tors are energy terms related to bond formation, three-body valence angle strain, and four-body torsional angle strain, respectively. The energy term, E over, is used to prevent the over coordination of atoms. The energy terms, E Coulomb and E vdW , are associated with electrostatic and dispersive contributions, respectively. Each separate energy term in Eq. (1) contributes to the total energy of the system, which in turn predicts the dynamical evolution. Each term is calculated by an equation involving several parameters to be optimized to adapt the calculation to a form specific to a chemical system of interest. A typical ReaxFF force field consists of approximately 100 of these parameters per element type depending on the element. Because the intra-and interatomic interactions are tuned by these parameters, they are optimized to reproduce reference values with reasonable accuracy before moving to production simulations. These reference values form a force field training (FFtraining) set, which is composed of, but not limited to, molecular properties (e.g. bond lengths, bond angles, charges, and energies), and/or chemical reactions from as simple as bond breaking/ formation to more complex such as vacancy dynamics, ion diffusion of reference systems. The reference values, which hereafter will be referred to as "reference properties", are obtained using QM-based methods if experimental values are not present. The quality of the production simulations is strictly affected by the quality of the developed force field; therefore, the force field parametrization is one of the most critical parts of the MD studies. ReaxFF uses fixed functions for all chemical systems; hence, application to different chemical systems and/or different applications of similar chemical systems require adaptation of ReaxFF through optimization of the force field parameters.
The optimization process requires a comprehensive exploration of the force field parameter landscape. However, optimization of such a large number of parameters, especially for multi-component chemical systems, is challenging to achieve. A typical force field for a multi-component system contains several hundreds of parameters, a significant part of which can be transferred from previously optimized force fields; however, depending on the complexity of the molecular system, a few tens of parameters should be optimized to reproduce reference properties in a given FF-training set. A parameter space that is composed of tens of parameters can be considered as high-dimensional and can have a complex geometry with many local minima.
Each local minimum in ReaxFF parameter space yields acceptable molecular property values and therefore can be reliably used by researchers for the application of the developed force field in the production stage. However, being stuck in a local minimum may be troublesome from the parametrization perspective. For example, it is a very common practice to update FFtraining sets to extend the applications, and this update requires reparameterization of the force field because the previously found local minimum corresponding to an ideal parameter set may no longer be appropriate for newly added reference properties. Another issue that may arise during force field development is due to diversity in the FF-training set. A typical ReaxFF training set is composed of molecular and condensed phase parts, which are also separated into groups such as the equation of states, the heat of formations, etc. It is very likely for a force field parameter set to get stuck in a local minimum that precisely reproduce reference properties corresponding to a specific part of the force field while poorly fitting to the remaining.
The ability to have the information of several local minima in parameter space and so making transitions between these minima is significantly beneficial to detect low discrepancy regions that are capable of precisely fitting to the whole FF-training set, including all sections. The conventional optimization method that is widely used by the force field developers is a sequential one parameter parabolic extrapolation method 23,24 , which is not capable of switching between local minima to detect lowest error regions in parameter space. In contrast, this method is susceptible to being stuck in a local minimum due to its sequentiality, which also prevents parallelization of the optimization algorithm. Due to these limitations in the conventional method, considerable effort has been directed toward finding a solution to this optimization problem [25][26][27][28][29][30] . Iype et al. 25 have used a Monte Carlo algorithm combined with simulated annealing (SA) procedure to find global optimum values for ReaxFF parameters of magnesium-salt hydrates. The application of SA method varies depending on the system; therefore, it cannot offer a common solution to the force field optimization problem. Guo et al. 31 developed a method to improve ReaxFF parameterization using machine learning and fit separate ReaxFF energy terms (Eq. 1) to reproduce density functional theory (DFT) MD energy terms with high accuracy; however, the applications of this method were suitable for very small three-component chemical systems trained over a few single phases. Several studies have used single and multiobjective global optimization methods such as genetic algorithm (GA) [26][27][28][29][30] or enhanced particle swarm optimization method 32 to optimize ReaxFF force field parameters. All these studies have implemented existing global optimization methods to parameterize ReaxFF force fields and despite how useful these methods have become; the optimization problem has not yet been completely solved. However, we believe that the ReaxFF optimization system should not be viewed as "just another" global optimization problem, because there may be several parameter combinations that may be "almost equally good" or" equally desirable".
Here we present an INitial-DEsign Enhanced Deep learningbased OPTimization (INDEEDopt) framework that will not only accelerate the force field development for simple chemical systems (e.g., ternary, quaternary, or quinary component), but also enable the development for multi-component ones (e.g., senary or larger), which have become crucial with the advances in materials discovery (e.g., high entropy alloys, inorganic-organic interfaces). The framework uses machine learning (ML) due to its capability of modeling highly correlated high-dimensional data and adapts an initial design algorithm to explore the high-dimensional parameter space comprehensively. Thus, INDEEDopt is capable of finding several local minima in parameter space and produces optimized force fields to be used in the production stage of simulations. INDEEDopt differs from other interesting ReaxFF parameterization methodologies in several aspects. An important feature of INDEEDopt is that it uses a Latin Hyper Design (LHD) algorithm to efficiently generate a data set and explore whole parameter space instead of being stuck around initially assigned regions. Most of the optimization methods developed to date require initial parameter value guesses; however, there is no certain rule to assign these values. INDEEDopt does not require the initial values; in fact, it can be used as a method to generate these values that can be further optimized by combining with other optimization methods. Another critical distinction is that INDEEDopt is a datadriven method and generates a significant amount of data, which can be used for further investigations. The procedure that INDEEDopt framework uses is fully parallelizable and can be used with any ReaxFF-MD software without modifications to the code. INDEEDopt as a concept can be adapted to the parametrization of classical and other reactive force fields in the future if desired.

INDEEDopt force field optimization framework
INDEEDopt framework involves three stages, and a schematic representation of the framework can be seen in Fig. 1. In the first stage, the high-dimensional parameter space is sampled by means of an initial design algorithm called Orthogonal-maximin Latin Hypercube Design (OMLHD). The OMLHD algorithm can generate parameter combinations within ranges specific to each parameter that are multidimensionally uniformly distributed by reducing the pairwise correlation and maximizing the distance between parameters 33 . The parameter combinations are given to the ReaxFF-MD code for energy minimization calculations, which returns target property values correspond to each reference property in FF-training set. The parameter combinations and corresponding target property values produce a training set for the second stage of the framework, which is the ML model training stage, and hereafter as this training set will be referred to as ML-training set.
The strong correlations of ReaxFF force field parameters create very complex parameter-property relationships because these correlations do not only come from analytical expressions but are also correlated due to the many-body character of atomic interactions specific to the system. In order to handle these complex correlative relationships, we adapted deep learning (DL) as the ML method in the framework, because DL is a promising approach in extracting high-level features from less featured input data 34 . The DL involves a type of artificial neural network with multiple hierarchical hidden layers. Using these multi-layered network structures, DL models are capable of nonlinearly mapping the relationships between a given set of inputs and corresponding outputs. This capability makes DL an appropriate method to obtain features from a large data set. In addition, DL models can be easily parallelized on CPU or GPU architectures, which is one of the goals of this study since this provides significant acceleration in the FF development. In the DL model, the ReaxFF parameters (N) are given as an input to a deep neural network (DNN), which is trained to return target molecular property values (P) (Fig. 1b). The DNN used in this study is a fully connected feed-forward type, which transfers data from the input to the output layer through hidden layers. Each layer is composed of several nodes that are connected to each of the other nodes in the previous and next layers. Nonlinear mapping is implemented into neural networks using activation functions. For all layers except the last layer, rectified linear unit (ReLu), which has been proven to accelerate the training process 35 , is employed as an activation function, and we use a linear activation function for the last layer. The shape of a neural network used in DL model affects prediction accuracy 36 ; therefore, to understand the complexity of the neural network used in INDEEDopt, we analyzed the effect of the network shape to the final prediction. The accuracy of prediction was measured according to the mean absolute error (MAE) between the model prediction and test data. According to our analysis, MAE decreases until the number of layers is increased over ten. After ten, the model tends to overfit to the data, thus results in a reduction in accuracy (Fig. 2). We used two hidden layers in our model because the accuracy difference between ten and two layers is not significant enough to negatively affect the optimization procedure. The effect of the number of nodes per hidden layer was analyzed by changing the node amount in each layer by multiples of the number of reference property values in the FF-training set. As can be seen from Fig. 2, the increase in the number of nodes results in a decrease in MAE, and the network with 900 nodes per layer fits best to the test data. However, the improvement in accuracy is not significant, while the increase in node number is inefficient in terms of computational time. In the light of these analyses, two hidden layers containing a constant number of nodes, which was selected as the size of the FF-training set, is considered as the most appropriate neural network configuration to be used in INDEEDopt framework. For a more complex FFtraining set that is composed of thousands of reference property values, there may be a need to assign more hidden layers and more nodes per layer; however, our configuration produced sufficient accuracy for both of our parametrization test cases and should be sufficient for most of the parametrizations conducted in Fig. 1 Schematic representation of the optimization framework. Schematic representation of the DL-based optimization procedure. a Representation of the sampling using the initial design algorithm. b Deep neural network structure, red circles are input nodes, yellow circles are output nodes, and grey circles are hidden layer nodes. c Representation of the parameter-discrepancy correlation in parameter space. current ReaxFF literature. The INDEEDopt code is capable of detecting GPU architecture in the computer system and automatically run on it, so it is advisable to use a larger number of nodes and layers if GPU is accessible to force field developer.
The DL training loss function is optimized using Adaptive Moment Estimation (Adam) method, which is computationally efficient and suitable for high dimensionality due to its adaptive learning rate 37 . Once the DL model is trained, the accuracy of the model is evaluated using a test set, which is obtained by randomly selecting a part of the ML-training set as a test set with a split ratio of 0.2. The DL model has an early stopping algorithm, which ensures that the overfitting is avoided by stopping the training at the point where the prediction accuracy is maximum. The maximum fail parameter is set to 100 epochs, while the total number of epochs is selected as 10,000. The quality of the DL model predictions is calculated using MAE. The first two stages of INDEEDopt are iterative and repeat until the DL model converges to an acceptable accuracy. The accuracy is calculated by the MAE between DL prediction and ReaxFF calculation. The third stage of the framework, which is the optimization stage, starts once the accuracy is converged. The convergence criterion is satisfied when the accuracy change between two consecutive iterations is less than 2%. The accuracy convergence plots for two different applications of INDEEDopt can be seen in Supplementary Fig. 1. The computational flow diagram of INDEEDopt framework is depicted in Fig. 3. We should note that the convergence of DL model is critical, and the third stage does not start until the convergence in second stage is achieved. A well converged DL model is capable of reproducing ReaxFF calculations with high accuracy as can be seen in Fig. 4, which shows how close are the predictions and the true values. The percent difference between the INDEEDopt predictions and ReaxFF calculations can also be seen in Supplementary Table 1. There may be several reasons for a failure/delay in convergence. One of them is the limited sampling of the parameter space, which is very likely a result of a too wide parameter range selection. Another reason may be high uncertainty in parameter-property correlation, which is likely due to the limited representation of the effects of these parameters in FF-training set. The increase in uncertainty is caused by inconsistent response in target property values to alteration of parameters in the first stage. This issue can be addressed by expanding FF-training set by adding reference property values that are correlated with the parameters of interest and less sensitive to change in parameter values. This expansion stabilizes the response to parameter alterations and reduces the uncertainty in target properties in ML-training set. Such high uncertainty is not only a problem for INDEEDopt, it may also affect the force field quality during production, because when the force field is trained specifically for very unstable reference properties, transferability of the force field to new environments that can be faced during production simulations is limited.
The third stage of INDEEDopt is when the parameter optimization is performed by minimizing the Eq. (2).
where x i,ML and x i,REF are the target property obtained by trained DL model, and reference property values in FF-training set, respectively, and w i is the weight factor, which can be used to adjust the importance of property values during parametrization. Production of target property values using ReaxFF-MD code takes a significant amount of time during optimization, especially when the reference data set is extensive. This time issue limits the optimization quality as most of the developed algorithms tend to limit their parameter space sampling. To address this issue,   Table 1.
INDEEDopt framework calculates the target properties using the trained DL model predictions instead of calling ReaxFF-MD code in the optimization stage. A well-trained DL model is capable of predicting target properties with reasonable accuracy in 3 ms, which is at least three orders of magnitude faster than using ReaxFF-MD code. Therefore, INDEEDopt is capable of exploring a wider parameter space more thoroughly. In addition, the MLtraining set, which is generated to train DL model, has information about whole parameter space (in other words, it has the information for several local minima), and therefore can be used to train several parameters sets in parallel by targeting the best local minima in a smaller parameter range similar to ref. 28 . We performed optimization using two different approaches, namely brute-force method and Minimum Energy Design (MED) method, and compared the performances in the "Results" section. Additionally, because the optimization stage of INDEEDopt is separated from other stages, any other optimization algorithm can be integrated into the workflow. The brute-force optimization assigns randomly generated values to all parameters at once and calculates the discrepancy (y) between the target and reference property values. This process repeats until stopped by the force field developer when the discrepancy value is satisfactory. The criterion used to stop the optimization in this work was to produce a lower discrepancy than the one obtained by the conventional optimization method. However, if not assigned by the user, INDEEDopt uses the lowest discrepancy producing parameter set from the ML-training set. Even though the brute-force optimization is advantageous in terms of transitioning between local minima, it is not computationally efficient to be used in high-dimensional problems; therefore, we implemented the MED method, which can detect local minima faster as also demonstrated in ref. 38 .
The MED is a smart active learning strategy that can be used to efficiently and adaptively explore the parameter space and identify combinations with low discrepancy between the target and reference property values. Proposed and developed by Joseph et al. 39 , it is typically used to explore large, complex spaces, quickly carving out "bad" (with large total discrepancies in our case), and producing more and more "good" combinations. A completely automated version of the algorithm based on ref. 40 is implemented in the R package mined. It requires the user to select a set of initial parameter combinations and a computer code (in our case the DL model) that calculates the total discrepancy between the target and reference property values for any given parameter combination. A more detailed description of the implementation of MED algorithm can be found in ref. 38 .

Applications of INDEEDopt
The INDEEDopt framework was tested for two different force field development cases, namely, a Ni-Cr binary system for alloy applications and a W-S-C-O-H quinary system for 2D materials applications. The Ni-Cr system is relatively simple, due to the absence of angular and dihedral terms, which in turn requires optimization of a small number of parameters. The W-S-C-O-H system is more complex due to several reasons: (1) it has angular and dihedral interactions involved in the training set; (2) the number of parameters to be optimized and the size of the FFtraining set is larger than the Ni-Cr one; (3) the force field training set includes both condensed and molecular phases together; therefore, W-S-C-O-H training was selected as a secondary test case to evaluate the performance of INDEEDopt framework. The same force fields were also parametrized using the conventional method, which is the most commonly used method in the ReaxFF literature 24 . All the settings related to energy minimization, molecular geometries in the FF-training set, and weights factors for discrepancy calculations were kept the same for the optimization using INDEEDopt and the conventional force field development method.
Ni-Cr force field optimization The ReaxFF training set of the Ni-Cr alloy system is composed of 90 different reference property values, including equations of state and heat of formation energies for different Ni-Cr compositions in fcc and bcc phases. A total of 16 parameters were optimized, which involve bond-order terms, bond energy terms, over coordination terms, charge polarization terms, and Coulombic interaction terms. A detailed full list of parameters is given in Supplementary Table 2.
Using OMLHD algorithm, a total of 79,635 samples were created to be used as the ML-training set. Each parameter has a corresponding anticipated value range, mostly coming from the chemical nature of the element; however, some parameter ranges are determined by experience. The ranges used for each parameter (in other words, upper and lower limits of parameter search) are given in Supplementary Table 2. The determination of the ranges is important because wider ranges require more sampling, which is computationally inefficient, may also delay the optimization depending on the complexity of the problem. While exploring the parameter space, some parameter combinations produce meaningless inter-or intra-atomic interactions, and if the ranges are large, a significant amount of the sampling is conducted unnecessarily. For example, the Ni-Cr training set uses a very wide range for parameter-1, which is the bond energy term between Ni and Cr. Our analysis of the correlation between parameter-1 and remaining parameters ( Supplementary Fig. 2) shows that a significant part (~80%) of the parameter range produces a large discrepancy meaning that the interatomic interactions are not usable for production simulations. INDEEDopt labels these large discrepancy regions as "infeasible" and remaining is referred to as "feasible". After a first complete scan of the parameter space, INDEEDopt focuses on feasible regions and avoid sampling the infeasible parts of the parameter space, which avoids spending time to learn unnecessary parts of the parameter space.
A comparison of total discrepancy obtained by INDEEDopt and conventional method for different sections of the training set is given in Fig. 5. A detailed comparison of each FF-training set element can be seen in Supplementary Fig. 3. Most of the target properties in the training set are improved using INDEEDopt except some equation of states values Ni 3 Cr and Ni 2 Cr in FCC and BCC phases, respectively. A significant improvement is observed in the heat of formation energetics, and the lowest total error sum obtained using the conventional method was calculated as 5614.3, which was reduced to 2804.0 using INDEEDopt with brute force and 2316.2 using INDEEDopt with MED algorithm. During the optimization using the conventional method, the heat of formation and equations of state parts of the force field were inversely correlated; therefore, each part was optimized by compromising the accuracy of the other, and both parts were not able to be optimized at the same time. The reason why INDEEDopt succeeded is due to finding another local minimum in parameter space that can satisfy both parts of the force field. The significant differences between some of the final parameter values obtained by INDEEDopt and conventional method also show that the optimum parameter sets are detected in different regions of the parameter space (Fig. 6). The final parameter values obtained by both methods are found in Supplementary Table 2. Some of the parameters were optimized to similar values by both optimization methods, which may be due to a narrow parameter search range for these specific parameters assigned by the force field developer. INDEEDopt outperformed the conventional method in terms of timings by producing an optimized parameter set in around 2 days compared to 2 weeks by an experienced force field developer. However, we should note that the conventional method was run on a single core while INDEEDopt was run on a hundred cores.
The overall improvement (i.e., reduction in total error) of the Ni-Cr force field translates into physical properties. The lattice expansion for Ni and Cr phases with the increase in temperature is calculated by INDEEDopt optimized force field, conventionally optimized force field, and compared with experimental values (Fig. 7). According to our analysis, the INDEEDopt optimized force field produces a superior result for Cr (BCC) phase compared to the conventionally optimized one, and it behaves similarly for Ni (FCC) phase. Additionally, energy-volume curves for each Ni n Cr m phase can be seen in Supplementary Fig. 4.

W-S-C-O-H force field optimization
The parametrization of the W-S-C-O-H quinary system is more complex than the binary alloy case because the FF-training set involves both molecular and condensed phase parts. The presence of a condensed phase brings extra challenges to the optimization problem. One of the challenges is that the molecular and condensed phase may be best optimized separately in different local minima in the parameter space, meaning that a well-optimized molecular part does not necessarily produce low discrepancy for condensed phase part as well. In addition, the condensed phase part requires the usage of large molecular configurations to represent a crystalline phase. For example, the W-S-C-O-H training set has several geometries that are composed of hundreds of atoms, which makes these properties more sensitive to alteration in parameter values, and therefore brings considerable uncertainty to the parameter-property correlations. An additional complexity to the optimization problem comes from the size of the FF-training set. There are 289 reference properties in FF-training set, which is significantly higher than most data sets used in literature to test newly developed optimization procedures. The FF-training set includes heat of formation values for both molecular and condensed phase, defect and ad-atom energetics, some application-specific reaction energetics, strain energetics, and bond and angle energies. The total number of parameters to be optimized is 68, which include bond, angle, and off-diagonal terms. A detailed list of optimized parameters is given in Supplementary Table 3.
INDEEDopt produced a total of 44,700 samples until the DL model is trained to mimic ReaxFF-MD with reasonable accuracy. The total discrepancy sum for different parts in the FF-training set is given in Fig. 8 and compared with the ones calculated using the conventional method. A more detailed comparison is also given in Supplementary Fig. 5. A detailed explanation for each FF-training set part is given in Supplementary Note 2. The difference between the final optimized parameter values obtained by both methods can be seen in Fig. 9. As can be seen from the bar graphs, INDEEDopt produced a lower total error than the conventional method for angle scan, bond scan, and defect/ad-atom parts of the force field. The lowest total error sum obtained using the conventional method was calculated as 5314.4, which was reduced to as low as 5250.3 using INDEEDopt with brute force and 5705.5 using INDEEDopt with MED algorithm. The amount of time spent on optimization was approximately 2 months with the conventional method and was 5 days with INDEEDopt with bruteforce algorithm. In addition, the usage of MED instead of the brute-force algorithm reduced this time to 4 days. Even though Fig. 6 The comparison of Ni-Cr force field parameters optimized by different methods. The stacked bar graph representation of the difference between the parameter values in the nickel-chromium training set optimized using INDEEDopt with brute force and conventional method. The red represents the conventional, and yellow represents INDEEDopt version of the same parameter. The difference between the percentage values corresponds to the difference between predicted values by each method. The value 50% corresponds to a situation when both methods predict the same values. If the predicted parameter values are negative, then the percentage value is given as negative. Fig. 7 Influence of temepature to lattice parameters for Cr (BCC) and Ni (FCC) phases. The lattice expansion with respect to temperature change for Cr (BCC) phase (a) and Ni (FCC) phase (b). The experimentally obtained results are compared with the ones obtained using conventionally optimized force field (blue) and INDEEDopt optimized force field (red). the MED algorithm was not able to produce the lowest error, the produced value is acceptable to be used in production simulations. We should note that the brute-force optimization criterion in this study was to do better than the conventional method; therefore, the optimization was stopped after the discrepancy value was below the one produced by the conventional method. However, the parameter set was very well developed by experienced FF developers using the conventional method; therefore, it was a challenging test. INDEEDopt spent approximately one day, around 30 million calculations, at the optimization stage to predict a lower discrepancy point than the conventionally obtained one. Therefore, it may not be guaranteed to predict a better point by running the optimization longer because: (1) INDEEDopt predicted a local minimum that is very different from the conventionally obtained one, which can be seen in Fig. 8, and there may not be a better point accessible by our optimization method in this specific low discrepancy region; (2) the DL model may not be able to make predictions as sensitive as ReaxFF-MD, which prevents revealing a better point. It may be possible to further improve the parametrization by assigning INDEEDoptpredicted parameters as a starting point to the conventional method by assigning a narrow parameter search range.

DISCUSSION
An initial design enhanced DL-based optimization framework has been developed to parametrize ReaxFF force fields. The INDEEDopt framework was tested on two different training sets, and the performance was examined comparing with the conventional method. The tests were conducted on Ni-Cr and W-S-C-O-H training sets, of which lateral is the most challenging due to a large number of properties and diversity in the training set. The framework could produce a lower total discrepancy between ReaxFF values and reference property values compared to the conventional method. The time required for optimization calculations is also lower using INDEEDopt due to its ability to run on multiple processors. Another advantage of INDEEDopt framework is its ability to be used as an initial parameter generator that can be combined with other optimization methods, which can be valuable in avoiding local minima stucking problem in some other optimization methods and also to fine-tune force fields. Additionally, INDEEDopt can be used by any software that can run ReaxFF-MD energy minimization simulations, which makes it open to a wider user population.
This version of INDEEDopt is fully capable of providing a timeefficient tool for optimizing ReaxFF force fields with sufficient quality. However, there are some challenges that can be addressed to improve the framework further in future versions. The sampling scheme (first stage of the framework) used in INDEEDopt is computationally the most demanding part due to the high dimensionality of the problem, which can be addressed by developing problem-specific initial design methods. With the implementation of design of experiments-based optimization procedure for neural network hyperparameter tuning, desired accuracies from the DL model can be achieved in shorter periods of time. The future versions of INDEEDopt will also include subroutines that conduct statistical analysis to give feedback about the FF-training set, which will be very valuable to further improve FF quality. We believe that our framework provides major progress towards the process of ReaxFF parameterization. A Python 3 implementation of INDEEDopt framework is available at https://github.com/mertyigit/INDEEDopt for public use.

Computational details
INDEEDopt requires calculation of each reaction in ReaxFF training set to produce a data set, which is used to train a DL model. These calculations were performed as energy minimization and conducted at 0 K. The number of iterations were decided depending on the ReaxFF training set. The W-S-C-O-H training set involves condensed phase calculations, Fig. 9 The comparison of W-S-C-O-H force field parameters optimized by different methods. The stacked bar graph representation of the difference between the parameter values in the tungsten-sulfide-carbon-hydrogen training set optimized using INDEEDopt with brute force and conventional method. The red represents the conventional and yellow represents INDEEDopt version of the same parameter. The difference between the percentage values corresponds to the difference between predicted values by each method. The value 50% corresponds to a situation when both methods predict the same values. If the predicted parameter values are negative, then the percentage value is given as negative. Fig. 8 The comparison of error values calculated by INDEEDopt and conventional method. The total error values corresponding to different parts of the tungsten-sulfide-carbon-hydrogen FFtraining set calculated by INDEEDopt with brute force and conventional methods. An explanation of the naming for different parts of FF-training set can be seen in Supplementary Note 2, and also same structures are used in ref. 6 .