Bond order redefinition needed to reduce inherent noise in molecular dynamics simulations

In this work, we present the bond order redefinition needed to reduce the inherent noise in order to enhance the accuracy of molecular dynamics simulations. We propose defining the bond order as a fraction of energy distribution. It happens due to the character of the material in nature, which tries to maintain its environment. To show the necessity, we developed a factory empirical interatomic potential (FEIP) for carbon that implements the redefinition with a short-range interaction approach. FEIP has been shown to enhance the accuracy of the calculation of lattice constants, cohesive energy, elastic properties, and phonons compared to experimental data, and can even be compared to other potentials with the long-range interaction approach. The enhancements due to FEIP can reduce the inherent noise, then provide a better prediction of the energy based on the behaviour of the atomic environment. FEIP can also transform simple two-body interactions into many-body interactions, which is useful for enhancing accuracy. Due to implementing the bond order redefinition, FEIP offers faster calculations than other complex interatomic potentials.


Scientific Reports
| (2021) 11:3674 | https://doi.org/10.1038/s41598-020-80217-0 www.nature.com/scientificreports/ selection. In addition, the atomic environment structure database references are accumulated via separate abinitio computations and nudged elastic bands (NEB) [22][23][24] . A database training process is then adopted to establish the mapping between fingerprints and atomic energies, and the output is the interatomic potential. Therefore, the ML method still has a computationally high cost to simulate large numbers of atoms. Furthermore, Pauling showed experimentally that the bond order has a close relation to the bond length 25 . The latest technique is density derived electrostatic and chemical (DDEC6), which has been reported as the best method for calculating the bond order 26 . This method applies to non-magnetic, collinear magnetic and noncollinear magnetic materials with localised or delocalised bonding electrons. Even so, DDEC6 is not designed for electrodes, highly time-dependent states, some extremely high-energy excited states, and nuclear reactions. However, no absolute definition for the bond order, but we can indicate its quantity via bond length, coordination number, bond angle, dihedral, and charge populations. Regarding MD simulation, though the ML method offers QM accuracy, in general, the simulation calculation depends on the underlying potential, especially the definition of bond order. In particular, there is inherent noise in the MD simulation, which affects the results.
In this study, we show the need for bond order redefinition to reduce inherent noise in MD simulations. Through this redefinition, the beginning of a factory-based empirical interatomic potential (FEIP) is developed in this study. Thus, for a simple potential such as Lennard-Jones, the accuracy can increase when it is transformed into a many-body potential. We use the carbon-carbon interaction for graphene phonons dispersion, mechanical properties and the cohesive energy of graphite and diamond, to show how this redefinition affects the accuracy and the speed of the simulation, which is useful for future studies of atomic mechanisms.

Results
Bond order redefinition. We begin with the principles of the redefinition: if there only one atom existed, there would be no bonds. However, when a second atom comes close to the first atom, the repulsive-attractive interactions happen to maintain its environmental state and the bond order at the highest value. Next, the energy of this system is distributed when the third atom comes to join. Now, the bond order decreases, causing the bonds between the first and second atoms to become weak. The same condition applies to the next new atoms. The distributed energy eventually forms a new atomic environment. From this principle, the bond order in this study is considered as a fraction of the energy distribution. It is a result of the character of the material maintaining its environment. This distribution depends on the amount of attractive energy that eventually becomes the bond.
FEIP development. FEIP potential implements the protocol of bond order redefinition. With that protocol, this potential will translate the bond-order appropriate based on the definition of repulsive and attractive energy. Thus, there is no strict formula for FEIP as the meaning of "factory" suggests. In this study, we restricted our works to short-range interactions. Thus, we started with the short-range binding energy, the regular structure, and the strongly localised approximation of Abell's work 27 , where Z is the coordination number, q is the net distribution of the electron, h is the bond order, G is the topology that represents the state of the atomic environment, r is the interatomic separation, and V A and V R are the attractive and repulsive energy, respectively. As noted by Abell 27 , q is nearly exactly equal to one in short-range approximations and bond order can thus be defined via the variational principle with respect to r in the following form where V A' and V R' denote the first derivative with respect to r of the attractive and repulsive energy, respectively. Meanwhile, r e is the interatomic separation at equilibrium. Previously, the bond order was a direct function of topology G, as formulated by Tersoff or Brenner, where k is a constant. In our redefinition, however, bond order should be a function of the fraction of the energy. Meanwhile, Eqs. (2) and (3) should be equal numerically. To solve this problem, we use the Taylor expansion of the natural logarithm, so that the relation between G in the right-hand side (RHS) of Eq. (3) and the energy in RHS of Eq. (2) is as follows: By substituting Eq. (4) into Eq. (3), the bond order for short-range interactions based on the redefinition can be written as: In this equation it is important is that each binding energy parameter, e.g. cohesive energy and interatomic separation equilibrium, should be a function of G. In practice, FEIP will use Eq. (5) to determine the bond order

FEIP for carbon.
In this study, we use the Morse potential to describe the universal repulsive and attractive interactions for carbon-carbon. The FEIP for carbon has the following form: with where A, B, r e , α and λ are a function of the topology G. In this work, we chose the topology as follows: This formula is used in the Tersoff potential. There is no standard formula for G. Equation (10) was chosen as the topology in order to reduce the dependency of the database, though it can reduce the accuracy of the bond order calculation due to ignoring the π-bonding contribution 15 . However, the most important reason to use this expression is to make comparisons with the Tersoff potential and the LCBOP as representative of short-range and long-range interactions, respectively. Thus, we can provide a clear explanation of the importance of redefining the bond order in atomic simulations.
The number of constant parameters required depends on the type of trendline functions of A(G), B(G), r e (G), α(G), and λ(G). Meanwhile, the number of databases used greatly influences these types of functions. It is not difficult to assemble the desired database from experimental or theoretical data for A(G), B(G), and r e (G). However, data for α(G) and λ(G) dominantly comes from theory. These functions represent the width of the well potential. We can determine this information with DFT or another interatomic potential which has a similar physical meaning. Several studies reveal the relationship of the exponential trend of potential energy and bond order to the coordination number 27,28 . Therefore, this study uses an exponential form for the trendline functions: Uncertainty and the lack of data is another limitation in this concept. Thus, the choice of the trendline function requires physical intuition. In completing our work, this study focuses on graphite, graphene and diamond materials as the carbon allotropes for which there is abundant data. The parameters in G topology of Eq. (10) have a close relationship to the mechanical properties, while the others are related to the cohesive energy and lattice constant of each allotropic material type. Except for R 1 and R 2 , we determine these parameters based on the farthest and shortest distance of the first and second nearest neighbours of the interest allotropic material, www.nature.com/scientificreports/ respectively. Here, we chose R 1 equal to 1.8 Å (slightly over 1.55 Å, the farthest distance of the diamond first nearest neighbour), and R 2 we set at 2.1 Å (below 2.46 Å, the shortest distance of the graphite second nearest neighbour), based on the experiment 29,30 . Then the standard fitting process is repeatedly carried out until the calculation results are closer to the experiment. After passing the standard fitting procedure following Tersoff 's work 11 , Table 1 shows the constant parameters suggestion for Eqs. (9)-(11).

Test of FEIP for carbon.
In this section, we present the test results of FEIP for carbon, including the lattice constant, cohesive energy, and elastic constants. Our results are compared with experimental data and other simulations for diamond and graphene, as shown in Table 2.
When compared with other simulations, FEIP gives close results to the experimental data for the lattice constant and the cohesive energy. For the elastic constants, however, FEIP gives a prediction of C 12 , which is far from the experimental data. Nevertheless, the results of FEIP for diamond and graphite case might be still better than other methods, even compared with the complex calculations of the best reactive force field (ReaxFF). The prediction errors resulting from FEIP for diamond and graphite are 159% and 92%, respectively. Meanwhile, ReaxFF has errors of 356% and 263%, respectively, almost twice the error of FEIP. For the elastic constants, ReaxFF has 56% error, but FEIP is 66%, except for C 44 , shear displacement over the basal plane of graphite.
The next test is the phonon dispersion of graphene. This test is needed to verify the representation of some physical properties such as thermal and electrical conductivity [39][40][41][42] . Phonon vibration modes control the distance www.nature.com/scientificreports/ between atoms that affects the bond order. Therefore, the accuracy of the phonon calculation is a test of the redefinition of the bond order implemented by FEIP. For that reason, this study compares phonon calculations with LCBOP as representative of long-range interactions and the Tersoff optimisation for short-range interactions. Figure 1 shows the result. At the Γ-point, the experimental data shows two optic mode branches, in-plane transverse (TO) and longitudinal (LO) modes at the frequency 298 THz. Meanwhile, the out-of-plane optical (ZO) mode is at 163 THz. The simulation result using the LCBOP predicts the ZO at 155 THz, while the simulation using the Tersoff potential gives a result 70 THz higher than the experimental value. Meanwhile, for the TO mode prediction, the LCBOP and Tersoff potential give results of 265 THz and 289 THz, respectively. The FEIP predicts 168 THz and 299 THz for ZO and TO mode, respectively (see Fig. 1), which is the best compared with the other potentials. Phonon path from Γ to K-point, the results using the FEIP are closer to the experimental values than the others. One notes here, at the Γ-point as the starting path, the Tersoff potential gives a non-degenerate result for the TO and LO branches, which is a contradiction with the experimental data. Next for out-of-plane accoustic (ZA) mode prediction, the results of the calculation using the FEIP for the crosses frequency of ZA/ZO due to the consequences of the point-group symmetry of graphene are 90 THz at the K-point, 12 THz lower than the experimental value. Meanwhile, the prediction based on the LCBOP is 16 THz lower. However, the Tersoff potential-based prediction is about 20 THz lower than the others.
In general, the transverse acoustic (TA) mode of the FEIP calculation is similar to that of the Tersoff potential calculations, but the LCBOP gives the closet result to the experimental data. Nevertheless, the predictions for ZO and ZA mode of the FEIP are the best. FEIP also makes a better prediction for the LO and TO modes. Especially from Γ-to K-point, the FEIP calculation results are closer to the experimental data than the others. However, the over bending mode character is not visible for all potentials.
The next test is the stability and accuracy of the defects system with FEIP potential. Here, we show the relaxation defect structure spot of Stone-Wales (SW), bare single vacancy (V 1 ), pentagon dislocation reconstruction due to dangling bonds (5-db) V1, bare divacancy (V 2 ), a divacancy fully sp 2 reconstruction made of two pentagons and a central octagon (5-8-5) V 2 and a divacancy with two V 1 of graphene in Fig. 2. Meanwhile, the formation energy (E f ) of vacancies and dislocation is calculated with the following formula where E d , μ and E 0 are defect energy, the chemical potential of carbon and initial energy before defect, respectively, and n gives the number of carbon atoms that were added (n positive) or removed (n negative). We then compare their formation energy with the results of other studies presented in Table 3. As noted in the table, based on the tight-binding and DFT calculation, FEIP predicts the lowest formation energy compared to Tersoff and LCBOP for the bare vacancy. In general, Tersoff optimized and FEIP must underestimate the result for the V 1 and SW based on Airebo and DFT calculation. Meanwhile, LCBOP, which represents long-range interaction, shows the lowest prediction for  www.nature.com/scientificreports/ SW, but FEIP has a better result. However, LCBOP is closer to the DFT calculation for bare V 1 than FEIP. For the number types of divacancy and pure dislocation SW, FEIP shows better results compared to the Tersoff optimized and LCBOP, respectively. This means that FEIP of this version gives a better prediction for the case in which the number of vacancies increases and dislocation is lower. The inaccuracy result of FEIP for V1 and dislocation of graphene is due to our approaches that ignore π-bonding and short-range approximation. The last test for FEIP in this study is the stability and accuracy of the interstitial case in graphite. Thus, we insert one atom between the top two layers of graphite for the bridge and spiro interstitial. In that same position of the two layers, two atoms are also inserted to produce the di-interstitial of two spiro isolated. Figure 3 shows the meaning of these interstitials. Table 4 shows the interstitial types of formation energy. From the simulation, it is known that Tersoff optimized and LCBOP failed to produce all stable interstitial structures (not shown on the figure), except for the two spiro isolated of LCBOP. Meanwhile, calculations using FEIP show the opposite result.
FEIP results in a close prediction to DFT calculation for spiro. Meanwhile, for bridge and two spiro isolated interstitials show a difference of almost 3.0 eV. These happen because FEIP in this version ignores the π-bonding; as a result, the carbon interstitial gets even radial distribution energy from the top and bottom layers. This www.nature.com/scientificreports/ explains why the carbon for the bridge interstitial in Fig. 3 binds to the two closest atoms of the two layers that flank it, which contradicts the DFT result 50 .

Discussions
Extended Lennard-Jones potential. The Lennard-Jones potential is generally used for rare gas interactions and simple fluids [52][53][54][55] In general, particles interacting following this potential are stable in the hexagonal close-packed (hcp) structure at zero temperature and pressure [55][56][57] , but some rare gases such as Ne and Ar form face-centred cubic (fcc) structures 58,59 . To anticipate this rare gas solid (RGS) problem, many researchers modify this potential to predict the onset of crystallisation in supercooling, surface tension, critical nucleus size and nucleation ratse 60,61 . Schwerdtfeger et al. 62 extended the Lennard-Jones potential by using two-body potentials to improve structure calculations of rare gas clusters and the solid state. They argue that the use of the manybody expansion does not change the preference for hcp over fcc due to zero-point vibrational effects. However, that statement disagrees with the Lennard-Jones Embedded-Atom (LJEA) potential from the work of Baskes 63 . This potential embeds the energy of each atom into the background electron density to allow investigation of many-body effects. The LJEA potential is used to calculate properties of an fcc material such as elastic constants, Bain transformation and defect properties as a function of many-body parameters. The result of the calculations sthat hows the ground state structure includes all phases; meanwhile, the melting point of fcc structures decreases while the many-body interactions increase. We argue that the hcp preference of the Lennard-Jones-based calculations is due to the inability of this potential to adjust the energy requirements to the state of the atomic environment. In the LJEA potential, each atom is embedded in the background electron density provided by neighbouring atoms. Thus, the LJEA potential takes the energy as a summation of the pair interactions and the embedding energy as a function of the local background electron density. However, the LJEA potential assumes that the electron cloud around each atom is spherical, which makes this potential good for the fcc structure. The FEIP differs in that there is no pair interaction summation since every attractive and repulsive energy is a function of topology. FEIP will predict the Lennard-Jones energy requirement associated with the current environment topology. In this way, the FEIP offers a calculation which adapts to every possible crystal structure. However, here we focus on presenting the Lennard-Jones extension as a FEIP feature. To show the bond order requirements via FEIP, we chose the carbon-carbon interaction to test the accuracy because it is more complex than the rare gas case.

FEIP for carbon: bond order redefinition reason.
Although FEIP is more accurate in predicting lattice constants and cohesive energy, the need for bond order redefinition begins with the accuracy of the elastic constants. Changing the position of each atom due to strain and tension will cause changes to the bond order. Thus, the accuracy of these properties is evidence in favour of redefining the bond order. The previous study corroborates this statement that the elastic properties come from many-body interactions that are counted by the bond order 43 .
ReaxFF uses its own definition for a complex bond order calculation summed from σ, π, and π-π bonding, however, the results in Table 2 show that calculations of C 12 and C 44 fall too far from the experiment. Although this current FEIP uses short-range and σ-bonding, with redefinition, this work shows better results. Especially for C 66 , ReaxFF has underestimated the value compared with FEIP. In addition to our ignoring the π-bonding approach in the derivative FEIP formula, another cause of overprediction for these elastic constants is the lack of carbon allotroph data so that the absolute values of the A 2 and B 2 parameters are relatively large; this means a small change (0.33%) in G topology will result in A(G) and B(G) shifting up to 7 percent. Thus, the FEIP result confirms the accuracy of the elastic constant due to the concept of bond order. Thus, comparing the elastic constants is our chance to prove the needs redefinition.
We analysed the calculation of phonon dispersion to better understand the need for a redefinition of the bond order. The accuracy of phonon modes presented by FEIP, especially at low frequency, is due to the different character of this potential compared with LCBOP and the Tersoff potential. For the longitudinal accoustic (LA) mode, however, FEIP gives a result similar to a calculation using a force constant and valence force field (VFF) methods in which the fourth nearest-neighbour interactions are considered 64,65 , except there is no tangent frequency with LO mode at the K-point. Thus, the weakness of FEIP in predicting phonon modes at high frequencies compared to experimental data is due to the short-range interaction approximation, not the bond order redefinition. This means that the LO and TO modes are the causes of the long-range interactions.
To see how the bond order redefinition affects the accuracy of the simulation, we studied the carbon atoms during the phonon calculations. For that reason, a plot of root mean square deviation (RMSD) with respect to the total energy is shown in Fig. 4a. This study takes one particular atom of graphene juxtaposed with its nine nearest neighbour atoms to calculate the RMSD. From Fig. 4a, it is clear that the Tersoff potential gives a lower energy prediction than the others because the carbon atoms cannot move freely and are localised. However, LCBOP and FEIP show different conditions. Both potentials give an energy higher than the Tersoff potential because the carbon atoms move freely and provide more available phonon modes. FEIP provides energy predictions which almost coincide with the LCBOP calculations, where the LCBOP prediction is slightly higher. This highest-energy state of the LCBOP causes the atoms to move too freely, making the predictions for high frequency phonon modes inaccurate. The FEIP applies a short-range interaction approach while the LCBOP uses a long-range interaction approach. However, with the concept of bond order redefinition, the FEIP can adjust the energy requirements to the state of its atomic environment. Figure 4b shows the power spectral density (PSD) of energy, which reveals the cause of the inability of the Tersoff potential to adjust its energy requirements to the environment. We can see that the magnitude and amplitude of the white noise from the Tersoff are higher than in the FEIP and the LCBOP. Outside of the range Scientific Reports | (2021) 11:3674 | https://doi.org/10.1038/s41598-020-80217-0 www.nature.com/scientificreports/ of frequency from − 20 to 20 GHz, the level of noise fluctuation becomes more significant in the Tersoff potential, meaning that up to 500 ps (1/20 GHz) is not sufficient to reduce the noise. This situation ultimately results in a bond that is too strong compared to other potentials. The FEIP shows a different result during the phonon calculation process, in that the magnitude and amplitude of white noise are similar to the results from the LCBOP. Thus, the bond order redefinition described in this study can reduce simulation noise so that convergence is easy to achieve even though it does not apply the long-range interaction approach. Another advance of implementing this redefinition is the improvement in calculation speed. Although the Tersoff potential is faster than FEIP, Table 5 reveals that to give accuracy similar to the long-range interactions of LCBOP, FEIP needs 63.28 min, which is nearly three times faster than LCBOP.

Methods
In this study, all simulations were run using the open-source code Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) 66 . Meanwhile, the open-source code Octave (GNU) was used to analyse the spectrum 67 . To redefine the bond order, we focused on carbon-carbon interactions in which graphene and diamond were selected as materials. The standard fitting procedure was used to determine the parameters of the FEIP. Meanwhile, for testing the mechanical properties, this study applies the solid-continuum methodology 68 . This study also used a graphene single-layer and graphite four-layer system with 2508 and 4032 carbon atoms, respectively, for defects testing. We then deleted one or more carbons to represent the vacancy and dislocation. Meanwhile, one and two carbons were added between layers of the graphite system to test the carbon interstitial. After that, we compared the FEIP result testing to the Tersoff and LCBOP potential for accuracy and stability of the defects system with relaxation method.
For phonon testing, we use Kong's methodology, which is included in LAMMPS 69,70 . A hexagonal graphene system with 200 atoms was selected for this study. The system was allowed to remain in equilibrium at 300 K for six million iterations with an NVT ensemble. During the equilibrium process, the simulation ran with a two-femtosecond timestep. The phonon distribution was calculated directly after one million iterations. The FEIP was compared to the Tersoff potential and the LCBOP as a representation of short-range and long-range interactions, respectively.

Conclusions
We have constructed the bond order redefinition, treating it as a fraction of the energy distribution. This distribution depends on the amount of attractive energy present, which eventually becomes the bond. We developed the factory empirical interatomic potential (FEIP) that implements the bond order redefinition and uses a shortrange interaction approach. From the calculation for the elastic constants, the lattice constant and the cohesive energy of graphite and diamond, the FEIP gives a close result to the experimental data, better than the complex reactive ReaxFF potential for these carbon allotropes. This study also conducts the stability and accuracy tests on defects for graphene and graphite. As a result, FEIP can provide good predictions for large vacancies and dislocations. In the case of carbon interstitial in graphite, FEIP can produce stable defects but not with Tersoff and LCBOP. Except for the two spiro isolated, LCBOP and FEIP provide the same energy predictions and are close to the DFT calculations. However, because the FEIP for this version ignores π-bonding, the defect structure for the bridge is different from the DFT view, where the FEIP results show that the interstitial carbon binds to the two nearest neighbour atoms from the graphite layer flanking it. We compared the calculations for graphene phonon dispersion using FEIP, the Tersoff potential to represent short-range interactions, and the LCBOP to represent long-range interactions. The results show that even the FEIP uses short-range interactions for carbon, but the accuracy is similar to LCBOP due to the inherent ability of FEIP to reduce the noise. Moreover, FEIP performs the calculation almost twice as fast as the LCBOP. However, the need for abundant data for different types of allotropes will influence the accuracy of the calculation, especially for elastic constants. Another advantage of implementing the redefinition is that FEIP can transform the simple two-body potential into a many-body interaction that is useful to enhance the accuracy of potentials such as the Lennard-Jones. Thus, we need the bond order redefinition for better accuracy and faster simulations, which is useful in the future study of atomic mechanisms.

Data availability
No datasets were generated or analysed during the current study.