Physically informed artificial neural networks for atomistic modeling of materials

Large-scale atomistic computer simulations of materials heavily rely on interatomic potentials predicting the energy and Newtonian forces on atoms. Traditional interatomic potentials are based on physical intuition but contain few adjustable parameters and are usually not accurate. The emerging machine-learning (ML) potentials achieve highly accurate interpolation within a large DFT database but, being purely mathematical constructions, suffer from poor transferability to unknown structures. We propose a new approach that can drastically improve the transferability of ML potentials by informing them of the physical nature of interatomic bonding. This is achieved by combining a rather general physics-based model (analytical bond-order potential) with a neural-network regression. This approach, called the physically informed neural network (PINN) potential, is demonstrated by developing a general-purpose PINN potential for Al. We suggest that the development of physics-based ML potentials is the most effective way forward in the field of atomistic simulations.


I. INTRODUCTION
Large-scale molecular dynamics (MD) and Monte Carlo (MC) simulations of materials are traditionally implemented using classical interatomic potentials predicting the potential energy and Newtonian forces acting on atoms.Computations with such potentials are very fast and afford access to systems with millions of atoms and MD simulation times up to hundreds of nanoseconds.Such simulations span a wide range of time and length scales and constitute a critical component of the multiscale approach in materials modeling and arXiv:1808.01696v4[cond-mat.mtrl-sci]22 Jan 2019 computational design.
Several functional forms of interatomic potentials have been developed over the years, including the embedded-atom method (EAM) [1][2][3], the modified EAM (MEAM) [4], the angular-dependent potentials [5], the charge-optimized many-body potentials [6], reactive bond-order potentials [7][8][9], and reactive force fields [10] to name a few.These potentials address particular classes of materials or particular types of applications.Their functional forms depend on the physical and chemical models chosen to describe interatomic bonding in the respective class of materials.
A common feature of all traditional potentials is that they express the potential energy surface (PES) of the system, E = E(r 1 , ..., r N , p), as a relatively simple function of atomic coordinates (r 1 , ..., r N ), N being the number of atoms (Fig. 1a).Knowing the PES, the forces acting on the atoms can be computed by differentiation and used in MD simulations.The potential functions depend on a relatively small number of fitting parameters p = (p 1 , ..., p m ) (typically, m = 10 − 20) and are optimized (trained) on a relatively small database of experimental data and first-principles density functional theory (DFT) calculations.The traditional potentials are, of course, much less accurate than DFT calculations.Nevertheless, many of them demonstrate a reasonably good transferability to atomic configurations lying well outside the training dataset.This important feature owes its origin to the incorporation of at least some basic physics in the potential form.As long as the nature of chemical bonding remains the same as assumed during the potential development, the potential can predict the system energy adequately even for new configurations not seen during the training process.Unfortunately, the construction of good quality potentials is a long and painful process requiring personal experience and intuition and is more art than science [8,11].In addition, the traditional potentials are specific to a particular class of materials and cannot be easily extended to other materials or improved in a systematic manner.
The training process can be implemented on-the-fly by running ab initio MD simulations [26].
A major weakness of ML potentials is their poor transferability.Being purely mathematical constructions devoid of any physical meaning, they can accurately interpolate the energy between the training configurations but are generally incapable of properly extrapolating the energy to unknown atomic environments.As a result, the performance of ML potentials outside the training domain can be very poor.There is no reason why a purely mathematical extrapolation scheme would deliver physically meaningful results outside the training database.This explains why the existing ML potentials are usually (with rare exceptions [40]) narrowly focused on, and only tested for, a particular type of physical properties.This distinguishes them from the traditional potentials which, although less accurate, are designed for a much wider range of applications and diverse properties.
In this work we propose a new approach that can drastically improve the transferability of ML potentials by informing them of the physical nature of interatomic bonding.We focus on NN potentials as an example, but the approach is general and can be readily extended to other methods of nonlinear regression.Like all ML potentials, the proposed physically-informed NN (PINN) potentials are trained using a large DFT dataset.However, by contrast to the existing, mathematical NN potentials, the PINN potentials incorporate the basic physics and chemistry of atomic interactions leveraged by the extraordinary adaptivity and trainability of NNs.The PINN potentials thus strike a golden compromise between the two "extremes" represented by the traditional, physics-guided interatomic potentials, and the mathematical NN potentials.
The general idea of combining traditional interatomic potentials with NNs was previously discussed by Malshe et al. [41], who constructed an adjustable Tersoff potential [42][43][44] for a Si 5 cluster.Other authors have also applied machine-learning methods to parameterize physics-based models of molecular interactions, primarily in the context of broad exploration of the compositional space of molecular (mostly organic) matter [45][46][47].Glielmo et al. [48] recently proposed to construct n-body Gaussian process kernels to capture the n-body nature of atomic interactions in physical systems.The PINN potentials proposed in this paper are inspired by such approaches but extend them to (1) more advanced physical models with a broad applicability, and (2) large-scale systems by introducing local energies E i linked to local structural parameters G l i .The focus is placed on the exploration of the configurational space of defected solids and liquids in singlecomponent and, in the future, binary or multicomponent systems.The main goal is to improve the transferability of interatomic potentials to unknown atomic environments while keeping the same high accuracy of training as normally achieved with mathematical machine-learning potentials.

II. PHYSICALLY-INFORMED NEURAL NETWORK POTENTIALS
The currently existing, mathematical NN potentials [14][15][16][17][18][32][33][34][35][36][37] partition the total energy E into a sum of atomic energies, E = i E i .A single NN is constructed to express each atomic energy E i as a function of a set of local fingerprint parameters (also called symmetry parameters [32]) (G 1 i , G 2 i , ..., G k i ).These parameters encode the local environments of the atoms.The network is trained by minimizing the error between the energies predicted by the NN and the respective DFT total energies for a large set of atomic configurations.The flowchart of the method is depicted in Fig. 1b.
The proposed PINN model is based on the following considerations.A traditional, physics-based potential can always be trained to reproduce the energy of any given atomic configuration with any desired accuracy.Of course, this potential will not work well for other configurations.Imagine, however, that the potential parameters have been trained for a large set of reference structures, one structure at a time, each time producing a different parameter set p. Suppose then that, during the subsequent simulations, we have a way of identifying, on the fly, a reference structure closest to any current atomic configuration.Then the accuracy of the simulation can be drastically improved by dynamically choosing the best set of potential parameters for every atomic configuration accoutered during the simulation.Now, since the atomic energy E i only depends on the local environment of atom i, the best parameter set for computing E i can be chosen by only examining the local environment of this atom.The energies of different atoms are then computed by using different, environment-dependent, parameter sets while keeping the same, physics-motivated functional form of the potential.
Instead of generating and storing a large set of discrete reference structures, we can construct a continuous NN-based function mapping the local environment of every atom on a parameter set of the interatomic potential optimized for that particular environment.Specifically, the local structural parameters (fingerprints) G l i (l = 1, ..., k) of every atom i are fed into the network, which then maps them on the optimized parameter set p i appropriate for atom i. Mathematically, the local energy takes the functional form where (r i1 , ..., r in ) are atomic positions in the vicinity of atom i.
In comparison with the direct mapping G l i → E i implemented by the mathematical NN potentials, we have added an intermediate step: The first step is executed by the NN and the second by a physics-based interatomic potential.A flowchart of the two-step mapping is shown in Fig. 1c.It is important to emphasize that this intermediate step does not degrade the accuracy relative to the direct mapping, because a feedforward NN can always be trained to execute any real-valued function [49,50].Thus, for any functional form of the potential, the NN can always adjust its architecture, weights and biases to achieve the same mapping as in the direct method.However, since the chosen potential form captures the essential physics of atomic interactions, the proposed PINN potential will display a better transferability to new atomic environments.Even if the potential parameters predicted by the NN for an unknown environment are not very accurate, the physics-motivated functional form will ensure that the results remain at least physically meaningful.This physics-guided extrapolation is likely to be more reliable than the purely mathematical extrapolation inherent in the existing NN potentials.Obviously, the same reasoning applies to the interpolation process as well, which can also be more accurate.
The functional form of the PINN potential must be general enough to be applicable across different classes of materials.In this paper we chose a simple analytical bondorder potential (BOP) [51][52][53] that must work equally well for both covalent and metallic materials.For a single-component system, the BOP functions are specified in the Methods section.They capture the physical and chemical effects such as the pairwise repulsion between atoms, the angular dependence of the chemical bond strength, the bond-order effect (the more neighbors, the weaker the bond), and the screening of chemical bonds by surrounding atoms.In addition to being appropriate for covalent bonding, the proposed BOP form reduces to the EAM formalism in the limit of metallic bonding.

III. EXAMPLE: PINN POTENTIAL FOR AL
To demonstrate the PINN method, we have constructed a general-purpose potential for aluminum.The training and validation datasets were randomly selected from a preexisting DFT database [20,21].Some additional DFT calculations have also been performed using the same methodology as in [20,21].The selected DFT supercells represent 7 crystal structures for a large set of atomic volumes under isotropic tension and compression, several slabs with different surface orientations, including surfaces with adatoms, a supercell with a single vacancy, five different symmetrical tilt grain boundaries, and an unrelaxed intrinsic stacking fault on the (111) plane with different translational states along the [211] direction.The database also includes several isolated clusters with the number of atoms ranging from 2 (dimer) to 79.The ground-state face centered cubic (FCC) structure was additionally subject to uniaxial tension and compression in the [100] and [111] directions at 0 K temperature.Most of the atomic configurations were snapshots of DFT MD simulations in the microcanonical (NVE) or canonical (NVT or NPT) ensembles for several atomic volumes at several temperatures.Some of the high-temperature configurations were part-liquid, part crystalline.In total, the database contains 3649 (127592 atoms).More detailed information about the database can be found in the Supplementary Tables S1 and S2.To avoid overfitting or selection bias, the 10-fold cross-validation method was during the training.The database was randomly partitioned in 10 subsets.One of them was set aside for validation and the remaining data was used for training.The process repeated 10 times form different choices of the validation subset.
The local structural parameters G l i chosen for Al are specified in the Methods section.The NN contained two hidden layers with the same number of nodes in each.This number was increased until the training process produced a PINN potential with the root-meansquare error (RMSE) of training and validation close to 3 to 4 meV/atom, which was set as our goal.This is the level of accuracy of the DFT energies included in the database.For comparison, a mathematical NN potential was constructed using the same methodology.The number of hidden nodes of the NN was adjusted to give about the same number of fitted parameters and to achieve approximately the same RMSE of training and validation as for the PINN potential.Table I summarizes the training and validation errors averaged over the 10 cross-validation runs.One PINN and one NN potential were selected for a more detailed examination reported below.
Figures 2 and S1 demonstrate excellent correlation between the predicted and DFT energies over a 7 eV/atom wide energy range for both potentials.The error distribution has a near-Gaussian shape centered at zero.Examination of errors in individual groups of structures (Fig. S2) shows that the largest errors originate from the crystal structures (especially FCC, HCP and simple hexagonal) subjected to large expansion.
Table II summarizes some of the physical properties of Al predicted by the potentials in comparison with DFT data from the literature.There was no direct fit to any of these properties, although atomic configurations most relevant to some of the properties were represented in the training dataset.While both potentials agree with the DFT data well, the PINN potential tends to be more accurate for most properties.For the [110] self-interstitial dumbbell, the NN potential predicts an unstable configuration that spontaneously rotate to the [100] orientation, whereas the PINN correctly predicts such configurations to be metastable.Figure 5 shows the linear thermal expansion factor as a function of temperature predicted by the potentials in comparison with experimental data.The PINN potential displays good agreement with experiment without direct fit, whereas the NN potentials overestimates the thermal expansion at high temperatures.
(The discrepancies at low temperatures are due to the quantum effects that are not captured by classical simulations.)As another test, the radial distribution function and the bond angle distribution in liquid Al were computed at several temperatures for which experimental and/or DFT data is available (Figs. S4 and S5).In this case, both potentials were found to perform equally well.Any small deviations from the published DFT calculations are within the uncertainty of the different DFT flavors (exchange-correlation functionals).
For testing purposes, we computed the energies of the remaining groups of structures that were part of the original DFT database [20,21] but were not used here for training or validation.The full information about the testing dataset (26425 supercells containing a total of 2376388 atoms) can be found in the Supplementary Table S3.For example, Fig. 7 compares the energies predicted by the potentials with DFT energies from hightemperature MD simulations for a supercell containing an edge dislocation or HCP Al.In both cases, the PINN potential is obviously more accurate.The remaining testing cases are presented in the Supplementary Information file (Supplementary Figures S6-S10).Although there are cases where both potentials perform equally well, in most cases the PINN potential predicts the energies of unknown atomic configurations more accurately than the NN potential.
For further testing, the energies of the crystal structures of Al were computed for atomic volumes both within and beyond the training interval.Both potentials accurately reproduce the DFT energy-volume relations for all volumes spanned by the DFT database (Figs. 3 and S3).However, extrapolation to larger or smaller volumes reveals significant differences.For example, the PINN potential correctly predicts that the crystal energy continues to rapidly increase under strong compression (repulsive interaction mode).In fact, the extrapolated PINN energy goes exactly through the new DFT points that were not included in the training or validation datasets, see examples in Fig. 4. By contrast, the energy predicted by the NN model immediately develops wiggles and strongly deviates from the physically meaningful repulsive behavior.Such artifacts were found for other structures as well.
Furthermore, while the atomic forces were not used for either training or validation, they were compared with the DFT forces once the training was complete.For the validation dataset, this comparison probes the accuracy of interpolation, whereas for the testing dataset the accuracy of extrapolation.As expected, for the validation dataset the PINN forces are in better agreement with DFT calculations than the NN forces (RMSE ≈ 0.1 eV/ Å versus ≈ 0.2 eV/ Å) as illustrated in Fig. 8a,b.For the testing dataset, the advantage of the PINN model in force predictions is even more significant.For example, for the dislocation and HCP cases discussed above, the PINN potential provides more accurate predictions (RMSE ≈ 0.1 eV/ Å) than the NN potential (RMSE ≈ 0.4 eV/ Å for the dislocation and 0.6 eV/ Å for the HCP case) (Fig. 8c-f).This advantage persists for all other groups of the structures from the testing database.
It was also interesting to compare PINN potential with traditional, parameter-based potentials for Al.One of them was the widely accepted EAM Al potential [54] that had been fitted to a mix of experimental and DFT data.The other was a BOP potential of the same functional form as in the PINN model.Its parameters were fitted in this work using the DFT database as for the PINN/NN potentials and then fixed once and for all.Fig. 6 compares the DFT energies with the energies predicted by the EAM and BOP models across the entire set of reference configurations.The PINN predictions are shown for comparison.The plots demonstrate that the traditional, fixed-parameter models generally follow the correct trend but become increasingly less accurate as the structures deviate from the equilibrium, low-energy atomic configurations.The adaptivity to the local atomic environments built into the PINN potential greatly improves the accuracy.

IV. DISCUSSION AND CONCLUSIONS
The proposed PINN potential model is capable of achieving the same high accuracy in interpolating between DFT energies on the PES as the currently existing mathematical NN potentials.The construction of PINN potentials requires the same type of DFT database, is equally straightforward, and does not heavily rely on human intuition.However, extrapolation outside the domain of atomic configurations represented in the training database is now based on a physical model of interatomic bonding.As a result, the extrapolation becomes more reliable, or at least more failure-proof, than the purely mathematical extrapolation.The accuracy of interpolation can also be improved for the same reason.As an example, the PINN Al potential constructed in this paper demonstrates better accuracy of interpolation and significantly improved transferability than a regular NN potential with about the same number of parameters.The advantage of the PINN potential is especially strong for atomic forces, which are important for molecular dynamics.The potential could be used for accurate simulations of mechanical behavior and other processes in Al.Construction of general-purpose PINN potentials for Si and Ge is currently in progress.
We believe that the development of physics-based ML potentials is the best way forward in this field.Such potentials need not be limited to NNs or the particular BOP model adopted in this paper.Other regression methods can be employed and the interatomic bonding model can be made more sophisticated, or the other way round, simpler in the interest of speed.
Other modifications are envisioned in the future.For example, not all potential parameters are equally sensitive to local environments.To improve the computational efficiency, the parameters can be divided in two subsets [41]: local parameters a i = (a i1 , ..., a iλ ) adjustable according to the local environments as discussed above, and global parameters b = (b 1 , ..., b µ ) that are fixed after the optimization and used for all environments (as in the traditional potentials).The potential format now becomes During the training process, the global parameters b and the network weights and biases are optimized simultaneously, as shown in Fig. 1d.Extension of PINN potentials to binary and multicomponent systems is another major task for the future.All ML potentials are orders of magnitude faster than straight DFT calculations but inevitably much slower than the traditional potentials.Preliminary tests indicate that PINN potentials are about a factor of two slower than regular NN potentials for the same number of parameters, the extra overhead being due to the BOP calculation.All computations reported in this paper utilized in-house software parallelized with MPI for training and with OpenMP for MD and MC simulations (see example in Fig. S14).Collaborative work is underway to develop highly scalable HPC software packages for physically-informed ML potential training and MD/MC simulations using multiple CPUs or GPUs, or both.

METHODS
The main ingredients of the proposed PINN method are the local structural parameters G l i , the BOP potential, and the NN.There are many possible ways of choosing local structural parameters [14-18, 32, 35, 37].After trying several options, the following set of G l i 's was selected.For an atom i, we define g where r ij and r ik are distances to atoms j and k, respectively, and θ ijk is the angle between the bonds ij and ik.In Eq.( 3), P m (x) is the Legendre polynomial of order m and is a truncated Gaussian of width σ centered at point r 0 .The truncation function f c (r) is defined by This function and its derivatives up to the third go to zero at a cutoff distance r c .The parameter d controls the truncation range.For example, P 0 (x) = 1 and g (0) i characterizes the local atomic density near atom i.Likewise, P 1 (x) = x and g (1) i can be interpreted as the dipole moment of a set of unit charges placed at the atomic positions j and k.As such, this parameter measures the degree of local deviation from spherical symmetry in the environment (g (1) i = 0 for spherical symmetry).For m = 2, we have P 2 (x) = (3x 2 − 1)/2 and g (2) i is related to the quadrupole moment of a set of unit charges placed at the atomic positions around atom i.We found that polynomials up to degree m = 6 should be included to accurately represent the diverse atomic environment.Each g (l) i is computed for several values of σ and r 0 spanning a range of interatomic distances.For each atom, the set of n g In this work we chose σ = 1.0 and used polynomials with m = 0, 1, 2, 4, 6 for 12 r 0 values, giving a total of k = 60 G l i 's.In the BOP model adopted in this work, the energy of an atom i is postulated in the form where r ij is the distance between atoms i and j and the summation is over all atom j other than i within the cutoff radius r c .The bond-order parameter b ij is taken in the form where represents the number of chemical bonds (other than ij) formed by atom i. Larger z ij values (more bonds) lead to a smaller b ij and thus weaker ij bond.
The screening factor S ij reduces the strength of bonds by surrounding atoms.For example, when counting the bonds in ( 8), we screen them by S ik , so that strongly screened bonds contribute less to z ij .The screening factor S ij is given by where the partial screening factor S ijk represents the contribution of a neighboring atom k (different from i and j) to the screening of the bond ij.S ijk is given by It has the same value for all atoms k located on the surface of an imaginary spheroid whose poles coincide with the atoms i and j.For all atoms k outside this "cutoff spheroid", on which r ik +r jk −r ij = r c , we have S ijk = 1 -such atoms are too far away to screen the bond.
If an atom k is placed on the line between the atoms i and j, we have r ik + r jk − r ij = 0 and S ijk is small -the bond ij is strongly screened (almost broken) by the atom k.This behavior reasonably reflects the nature of chemical bonding.
Finally, the promotion energy is taken in the form For a covalent material, E (p) i accounts for the energy cost of changing the electronic structure of a free atoms before it forms chemical bonds.For example, for group IV elements, this is the cost of the s 2 p 2 → sp 3 hybridization.On the other hand, E (p) i can be interpreted as the embedding energy appearing in the EAM formalism [1,2].Here, the host electron density on atom i is given by ρi = j =i S ij b ij f c (r ij ).Due to this feature, this BOP model can be applied to both covalent and metallic systems.
The BOP functions depend on 8 parameters A i , B i , α i , β i , a i , h i , σ i and λ i , which constitute the parameter set (p 1 , ..., p m ) with m = 8.The cutoff parameters were fixed at r c = 6 Å and d = 1.5 Å.
The feedforward NN contained two hidden layers and had the 60 × 15 × 15 × 8 architecture for the PINN potential and 60 × 16 × 16 × 1 for the NN potential.The number of nodes in the hidden layers was chosen to reach the target accuracy of about 4 meV/atom without overfitting.
The training/validation database consisted of DFT total energies for a set of supercells.The DFT calculations were performed using projector-augmented wave (PAW) pseudopotentials as implemented in the electronic structure Vienna Ab initio Simulation Package (VASP) [55,56].The generalized gradient approximation (GGA) was used in conjunction with the Perdew, Burke, and Ernzerhof (PBE) density functional [57,58].The plane-wave basis functions up to a kinetic energy cutoff of 520 eV were used, with the k-point density chosen to achieve convergence to a few meV/atom level.Further details of the DFT calculations can be found in [20,21].The energy of a given supercell s, E s = i E s i , predicted by the potential was compared with the DFT energy E s DFT .Note that the original E s DFT values were not corrected to remove the energy of a free atom.To facilitate comparison with literature data, prior to the training all DFT energies were uniformly shifted by 0.38446 eV/atom to match the experimental cohesive energy of Al, 3.36 eV/atom [59].
The NN was trained by adjusting its weights w κ and biases b κ to minimize the objective function The second term was added to avoid overfitting by controlling the magnitudes of the weights and biases.The parameter τ controls the degree of regularization.The third term ensures the variations of the PINN parameters relative to their database-averaged values p η remain small.The minimization of E was implemented by the Davidson-Fletcher-Powell algorithm of unconstrained optimization.The optimization was repeated several times starting from different random states and the solution with the smallest E was selected as final.The PINN and NN forces were computed by analytical calculations using chain-rule differentiation.S1 and S2 for the group numbers).The red line marks the RMSE (Table I).

FIG. 1 :
FIG. 1: Flowcharts of the development of atomistic potentials: (a) Traditional interatomic potential, (b) Mathematical NN potential, (c) Physically-informed NN (PINN) potential with all-local parameters, (d) PINN potential with parameters divided into local and global.The dashed rectangle outlines the objects requiring parameter optimization.PES is the potential energy surface of the material.
FIG. S1: (a) Energies of atomic configurations in the training dataset computed with the mathematical NN potentials versus DFT energies.The straight line represents the perfect fit.(b) Error distribution in the training dataset.
FIG.S5: Bond angle distribution, g(R min , θ), in liquid aluminum at 1000 K in comparison with DFT calculations[73].The calculation included the neighbors within the first minimum R min of the radial distribution function (cf.Fig.S4).
FIG. S13: Atomic forces in HCP Al during NVT MD simulations at 300 K, 600 K, 1000 K, 1500 K, 2000 K and 4000 K predicted by the PINN and NN potentials in comparison with DFT calculations.The straight lines represent the perfect fit.
FIG. S14: Demonstration of MD simulations for liquid Al with the PINN potential.The simulation was conducted in the zero-pressure NPT ensemble at the temperature of 1250 K using a beta-version of the ParaGrandMC code (https://software.nasa.gov/software/LAR-18773-1).The system contains 10,976 atoms.(a) Typical snapshot of the system.(b) Energy and pressure as a function of time during initial stages of the simulation.

TABLE I :
Fitting and validation errors and related neural network information for the straight NN and PINN models.

TABLE S3 :
[20,21]database used for testing.The data was extracted from the database generated by Botu et al.[20,21].The structures are divided into datasets and further into groups according to the structure type and physical conditions (temperature, deformation).For NVE simulations, the table indicates the temperature of initial thermalization with ideal atomic positions.