Abstract
Large-scale atomistic computer simulations of materials heavily rely on interatomic potentials predicting the energy and Newtonian forces on atoms. Traditional interatomic potentials are based on physical intuition but contain few adjustable parameters and are usually not accurate. The emerging machine-learning (ML) potentials achieve highly accurate interpolation within a large DFT database but, being purely mathematical constructions, suffer from poor transferability to unknown structures. We propose a new approach that can drastically improve the transferability of ML potentials by informing them of the physical nature of interatomic bonding. This is achieved by combining a rather general physics-based model (analytical bond-order potential) with a neural-network regression. This approach, called the physically informed neural network (PINN) potential, is demonstrated by developing a general-purpose PINN potential for Al. We suggest that the development of physics-based ML potentials is the most effective way forward in the field of atomistic simulations.
Introduction
Large-scale molecular dynamics (MD) and Monte Carlo (MC) simulations of materials are traditionally implemented using classical interatomic potentials predicting the potential energy and Newtonian forces acting on atoms. Computations with such potentials are very fast and afford access to systems with millions of atoms and MD simulation times up to hundreds of nanoseconds. Such simulations span a wide range of time and length scales and constitute a critical component of the multiscale approach in materials modeling and computational design.
Several functional forms of interatomic potentials have been developed over the years, including the embedded-atom method (EAM)^{1,2,3}, the modified EAM (MEAM)^{4}, the angular-dependent potentials^{5}, the charge-optimized many-body potentials^{6}, reactive bond-order potentials^{7,8,9}, and reactive force fields^{10}, to name a few. These potentials address particular classes of materials or particular types of applications. Their functional forms depend on the physical and chemical models chosen to describe interatomic bonding in the respective class of materials.
A common feature of all traditional potentials is that they express the potential energy surface (PES) of the system, E = E(r_{1}, ..., r_{N}, p), as a relatively simple function of atomic coordinates (r_{1}, ..., r_{N}), N being the number of atoms (Fig. 1a). Knowing the PES, the forces acting on the atoms can be computed by differentiation and used in MD simulations. The potential functions depend on a relatively small number of fitting parameters p = (p_{1}, ..., p_{m}) (typically, m = 10–20) and are optimized (trained) on a relatively small database of experimental data and first-principles density functional theory (DFT) calculations. The traditional potentials are, of course, much less accurate than DFT calculations. Nevertheless, many of them demonstrate a reasonably good transferability to atomic configurations lying well outside the training dataset. This important feature owes its origin to the incorporation of at least some basic physics in the potential form. As long as the nature of chemical bonding remains the same as assumed during the potential development, the potential can predict the system energy adequately even for new configurations not seen during the training process. Unfortunately, the construction of good-quality potentials is a long and painful process requiring personal experience and intuition and is more art than science^{8,11}. In addition, the traditional potentials are specific to a particular class of materials and cannot be easily extended to other materials or improved in a systematic manner.
During the past decade, a new direction has emerged wherein interatomic potentials are developed by employing machine-learning (ML) methods^{12,13,14,15,16,17,18,19,20,21,22}. The idea was originally conceived in the chemistry community in the 1990s in the effort to improve the accuracy of intermolecular force fields^{23,24}, an approach that was later adopted by the physics and materials science communities. The general idea is to forego the physical insights and reproduce the PES by interpolating between DFT data points using high-dimensional nonlinear regression methods such as the Gaussian process regression^{19,25,26,27}, interpolating moving least squares^{28}, kernel ridge regression^{12,20,21}, compressed sensing^{29,30}, gradient-domain machine-learning model^{31}, or the artificial neural network (NN) approach^{13,14,15,16,17,18,32,33,34,35,36,37,38}. If properly trained, an ML potential can predict the system energy with nearly DFT accuracy (a few meV/atom). ML potentials are not specific to a particular class of materials or type of chemical bonding. They can be improved systematically if weaknesses are discovered or new DFT data become available. The training process can be implemented on-the-fly by running ab initio MD simulations^{26}.
A major weakness of ML potentials is their poor transferability. Being purely mathematical constructions devoid of any physical meaning, they can accurately interpolate the energy between the training configurations but are generally incapable of properly extrapolating the energy to unknown atomic environments. As a result, the performance of ML potentials outside the training domain can be very poor. There is no reason why a purely mathematical extrapolation scheme would deliver physically meaningful results outside the training database. This explains why the existing ML potentials are usually (with rare exceptions^{39}) narrowly focused on, and only tested for, a particular type of physical properties. This distinguishes them from the traditional potentials which, although less accurate, are designed for a much wider range of applications and diverse properties.
In this work we propose a new approach that can drastically improve the transferability of ML potentials by informing them of the physical nature of interatomic bonding. We focus on NN potentials as an example, but the approach is general and can be readily extended to other methods of nonlinear regression. Like all ML potentials, the proposed physically informed NN (PINN) potentials are trained using a large DFT dataset. However, in contrast to the existing, mathematical NN potentials, the PINN potentials incorporate the basic physics and chemistry of atomic interactions leveraged by the extraordinary adaptivity and trainability of NNs. The PINN potentials thus strike a golden compromise between the two extremes represented by the traditional, physics-guided interatomic potentials, and the mathematical NN potentials.
The general idea of combining traditional interatomic potentials with NNs was previously discussed by Malshe et al.^{40}, who constructed an adjustable Tersoff potential^{41,42,43} for a Si_{5} cluster. Other authors have also applied machine-learning methods to parameterize physics-based models of molecular interactions, primarily in the context of broad exploration of the compositional space of molecular (mostly organic) matter^{44,45,46}. Glielmo et al.^{47} recently proposed to construct n-body Gaussian process kernels to capture the n-body nature of atomic interactions in physical systems. The PINN potentials proposed in this paper are inspired by such approaches but extend them to (1) more advanced physical models with a broad applicability, and (2) large-scale systems by introducing local energies E_{i} linked to local structural parameters \(G_i^l\). The focus is placed on the exploration of the configurational space of defected solids and liquids in single-component and, in the future, binary or multicomponent systems. The main goal is to improve the transferability of interatomic potentials to unknown atomic environments while keeping the same level of accuracy of training as normally achieved with mathematical machine-learning potentials.
Results
Physically informed neural network potentials
The currently existing, mathematical NN potentials^{13,14,15,16,17,18,32,33,34,35,36} partition the total energy E into a sum of atomic energies, \(E = \mathop {\sum}\nolimits_i {E_i}\). A single NN is constructed to express each atomic energy E_{i} as a function of a set of local fingerprint parameters (also called symmetry parameters^{13}) \((G_i^1,G_i^2,...,G_i^k)\). These parameters encode the local environments of the atoms. The network is trained by minimizing the error between the energies predicted by the NN and the respective DFT total energies for a large set of atomic configurations. The flowchart of the method is depicted in Fig. 1b.
The proposed PINN model is based on the following considerations. A traditional, physics-based potential can always be trained to reproduce the energy of any given atomic configuration with any desired accuracy. Of course, this potential will not work well for other configurations. Imagine, however, that the potential parameters have been trained for a large set of reference structures, one structure at a time, each time producing a different parameter set p. Suppose that, during the subsequent simulations, we have a way of identifying, on the fly, a reference structure closest to any current atomic configuration. Then the accuracy of the simulation can be drastically improved by dynamically choosing the best set of potential parameters for every atomic configuration encountered during the simulation. Now, since the atomic energy E_{i} only depends on the local environment of atom i, the best parameter set for computing E_{i} can be chosen by only examining the local environment of this atom. The energies of different atoms are then computed by using different, environment-dependent, parameter sets while keeping the same, physics-motivated functional form of the potential.
Instead of generating and storing a large set of discrete reference structures, we can construct a continuous NN-based function mapping the local environment of every atom onto a parameter set of the interatomic potential optimized for that particular environment. Specifically, the local structural parameters (fingerprints) \(G_i^l\) (l = 1, ..., k) of every atom i are fed into the network, which then maps them to the optimized parameter set p_{i} appropriate for atom i. Mathematically, the local energy takes the functional form
\(E_i = E_i({\mathbf{r}}_{i1},...,{\mathbf{r}}_{in},{\mathbf{p}}_i(G_i^1,...,G_i^k)),\) (1)
where (r_{i1}, ..., r_{in}) are atomic positions in the vicinity of atom i.
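The two-step mapping \(G_i^l \mapsto {\mathbf{p}}_i \mapsto E_i\) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `toy_physical_energy` is a simple exponential pair form standing in for the actual BOP functions, the weights are random rather than trained, and `p_base` is a hypothetical offset chosen so that a zero network output reduces to a fixed-parameter potential.

```python
import numpy as np

def init_network(sizes, rng):
    """Random weights and zero biases for a fully connected network (untrained)."""
    return [(rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def network_parameters(G, layers):
    """Step 1: map the fingerprint vector G of one atom to a parameter vector."""
    x = np.asarray(G, dtype=float)
    for W, b in layers[:-1]:
        x = np.tanh(x @ W + b)      # hidden layers (tanh is an assumed activation)
    W, b = layers[-1]
    return x @ W + b                # linear output layer

def toy_physical_energy(distances, p):
    """Step 2 stand-in (NOT the BOP): repulsive minus attractive exponentials."""
    A, alpha, B, beta = p[:4]       # only four of the parameters are used here
    r = np.asarray(distances, dtype=float)
    return 0.5 * np.sum(np.exp(A - alpha * r) - np.exp(B - beta * r))

def pinn_atomic_energy(G, distances, layers, p_base):
    """E_i = E_i(r_i1..r_in, p_i(G_i)): environment-dependent parameters."""
    p_i = p_base + network_parameters(G, layers)
    return toy_physical_energy(distances, p_i)

rng = np.random.default_rng(0)
layers = init_network((60, 15, 15, 8), rng)      # 60 fingerprints -> 8 parameters
p_base = np.array([5.0, 2.0, 4.0, 1.0, 0.0, 0.0, 0.0, 0.0])  # hypothetical values
neighbor_distances = np.full(12, 2.86)           # e.g. 12 nearest neighbors
energy = pinn_atomic_energy(rng.normal(size=60), neighbor_distances, layers, p_base)
```

The point of the design is that the network never outputs an energy directly: whatever parameters it predicts, the energy is still evaluated through the physical functional form.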
In comparison with the direct mapping \(G_i^l \mapsto E_i\) implemented by the mathematical NN potentials, we have added an intermediate step: \(G_i^l \mapsto {\mathbf{p}}_i \mapsto E_i\). The first step is executed by the NN and the second by a physics-based interatomic potential. A flowchart of the two-step mapping is shown in Fig. 1c. It is important to emphasize that this intermediate step does not degrade the accuracy relative to the direct mapping, because a feedforward NN can always be trained to execute any real-valued function^{48,49}. Thus, for any functional form of the potential, the NN can always adjust its architecture, weights and biases to achieve the same mapping as in the direct method. However, since the chosen potential form captures the essential physics of atomic interactions, the proposed PINN potential will display a better transferability to new atomic environments. Even if the potential parameters predicted by the NN for an unknown environment are not very accurate, the physics-motivated functional form will ensure that the results remain at least physically meaningful. This physics-guided extrapolation is likely to be more reliable than the purely mathematical extrapolation inherent in the existing NN potentials. Obviously, the same reasoning applies to the interpolation process as well, which can also be more accurate.
The functional form of the PINN potential must be general enough to be applicable across different classes of materials. In this paper we chose a simple analytical bond-order potential (BOP)^{50,51,52} that must work equally well for both covalent and metallic materials. For a single-component system, the BOP functions are specified in the Methods section. They capture the physical and chemical effects such as the pairwise repulsion between atoms, the angular dependence of the chemical bond strength, the bond-order effect (the more neighbors, the weaker the bond), and the screening of chemical bonds by surrounding atoms. In addition to being appropriate for covalent bonding, the proposed BOP form reduces to the EAM formalism in the limit of metallic bonding.
Example: PINN potential for Al
To demonstrate the PINN method, we have constructed a general-purpose potential for aluminum. The training and validation datasets were randomly selected from a pre-existing DFT database^{20,21}. Some additional DFT calculations have also been performed using the same methodology as in refs. ^{20,21}. The selected DFT supercells represent seven crystal structures for a large set of atomic volumes under isotropic tension and compression, several slabs with different surface orientations, including surfaces with adatoms, a supercell with a single vacancy, five different symmetrical tilt grain boundaries, and an unrelaxed intrinsic stacking fault on the (111) plane with different translational states along the [211] direction. The database also includes several isolated clusters with the number of atoms ranging from 2 (dimer) to 79. The ground-state face-centered cubic (FCC) structure was additionally subject to uniaxial tension and compression in the [100] and [111] directions at 0 K temperature. Most of the atomic configurations were snapshots of DFT MD simulations in the microcanonical (NVE) or canonical (NVT or NPT) ensembles for several atomic volumes at several temperatures. Some of the high-temperature configurations were part-liquid, part-crystalline. In total, the database contains 3649 supercells (127592 atoms). More detailed information about the database can be found in the Supplementary Tables 1 and 2. To avoid overfitting or selection bias, the 10-fold cross-validation method was used during the training. The database was randomly partitioned into 10 subsets. One of them was set aside for validation and the remaining data were used for training. The process was repeated 10 times for different choices of the validation subset.
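The 10-fold partitioning described above can be sketched as follows. This is a generic illustration, not the in-house implementation; the function name `k_fold_indices` and the random seed are hypothetical, and only the supercell count (3649) and the number of folds (10) come from the text.

```python
import numpy as np

def k_fold_indices(n_items, n_folds, seed=0):
    """Yield (train, validation) index arrays for each of the n_folds rounds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_items), n_folds)  # random partition
    for i in range(n_folds):
        validation = folds[i]                                  # one subset set aside
        train = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        yield train, validation

# One round per choice of validation subset; errors would be averaged over rounds.
splits = list(k_fold_indices(n_items=3649, n_folds=10))
```

Each supercell appears in exactly one validation subset, so averaging the validation error over the 10 rounds uses every configuration once.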
The local structural parameters \(G_i^l\) chosen for Al are specified in the Methods section. The NN contained two hidden layers with the same number of nodes in each. This number was increased until the training process produced a PINN potential with the root-mean-square error (RMSE) of training and validation close to 3–4 meV per atom, which was set as our goal. This is the level of accuracy of the DFT energies included in the database. For comparison, a mathematical NN potential was constructed using the same methodology. The number of hidden nodes of the NN was adjusted to give about the same number of fitted parameters and to achieve approximately the same RMSE of training and validation as for the PINN potential. Table 1 summarizes the training and validation errors averaged over the 10 cross-validation runs. One PINN and one NN potential were selected for a more detailed examination reported below.
Figure 2 and Supplementary Fig. 1 demonstrate excellent correlation between the predicted and DFT energies over a 7 eV per atom wide energy range for both potentials. The error distribution has a near-Gaussian shape centered at zero. Examination of errors in individual groups of structures (Supplementary Fig. 2) shows that the largest errors originate from the crystal structures (especially FCC, HCP, and simple hexagonal) subjected to large expansion.
Table 2 summarizes some of the physical properties of Al predicted by the potentials in comparison with DFT data from the literature. There was no direct fit to any of these properties, although atomic configurations most relevant to some of the properties were represented in the training dataset. While both potentials agree with the DFT data well, the PINN potential tends to be more accurate for most properties. For the [110] self-interstitial dumbbell, the NN potential predicts an unstable configuration that spontaneously rotates to the [100] orientation, whereas the PINN potential correctly predicts such configurations to be metastable. Figure 3 shows the linear thermal expansion factor as a function of temperature predicted by the potentials in comparison with experimental data. The PINN potential displays good agreement with experiment without direct fit, whereas the NN potential overestimates the thermal expansion at high temperatures. (The discrepancies at low temperatures are due to the quantum effects that are not captured by classical simulations.) As another test, the radial distribution function and the bond angle distribution in liquid Al were computed at several temperatures for which experimental and/or DFT data are available (Supplementary Figs 4 and 5). In this case, both potentials were found to perform equally well. Any small deviations from the published DFT calculations are within the uncertainty of the different DFT flavors (exchange-correlation functionals).
For testing purposes, we computed the energies of the remaining groups of structures that were part of the original DFT database^{20,21} but were not used here for training or validation. The full information about the testing dataset (26,425 supercells containing a total of 2,376,388 atoms) can be found in the Supplementary Table 3. For example, Fig. 4 compares the energies predicted by the potentials with DFT energies from high-temperature MD simulations for a supercell containing an edge dislocation or HCP Al. In both cases, the PINN potential is obviously more accurate. The remaining testing cases are presented in the Supplementary Figs. 6–10. Although there are cases where both potentials perform equally well, in most cases the PINN potential predicts the energies of unknown atomic configurations more accurately than the NN potential.
For further testing, the energies of the crystal structures of Al were computed for atomic volumes both within and beyond the training interval. Both potentials accurately reproduce the DFT energy–volume relations for all volumes spanned by the DFT database (Fig. 5 and Supplementary Fig. 3). However, extrapolation to larger or smaller volumes reveals significant differences. For example, the PINN potential correctly predicts that the crystal energy continues to rapidly increase under strong compression (repulsive interaction mode). In fact, the extrapolated PINN energy goes exactly through the new DFT points that were not included in the training or validation datasets, see examples in Fig. 6. By contrast, the energy predicted by the NN model immediately develops wiggles and strongly deviates from the physically meaningful repulsive behavior. Such artifacts were found for other structures as well.
To demonstrate that the unphysical behavior exhibited by the NN potential is not a specific feature of our structural parameters \(G_i^l\) or the training method, we constructed another NN potential using a third-party NN-training package PROPhet^{53}. This potential, which we refer to as NN′, uses the Behler–Parrinello symmetry functions^{13}, which are different from our structural descriptor \(G_i^l\). The NN-training algorithm is also different. A 47 × 18 × 18 × 1 network containing 1225 fitting parameters was trained on exactly the same DFT database to about the same accuracy as the NN and PINN potentials (Table 1). Figure 6 shows that the NN′ potential behaves in a similar manner to our NN potential, closely following the DFT energies within the training/validation domain and becoming unphysical as soon as we step outside this domain.
While the atomic forces were not used for either training or validation, they were compared with the DFT forces once the training was complete. For the validation dataset, this comparison probes the accuracy of interpolation, whereas for the testing dataset the accuracy of extrapolation. As expected, for the validation dataset the PINN forces are in better agreement with DFT calculations than the NN forces (RMSE ≈ 0.1 eV Å^{−1} versus ≈0.2 eV Å^{−1}) as illustrated in Fig. 7a, b. For the testing dataset, the advantage of the PINN model in force predictions is even more significant. For example, for the dislocation and HCP cases discussed above, the PINN potential provides more accurate predictions (RMSE ≈ 0.1 eV Å^{−1}) than the NN potential (RMSE ≈ 0.4 eV Å^{−1} for the dislocation and 0.6 eV Å^{−1} for the HCP case) (Fig. 7c, f). This advantage persists for all other groups of structures from the testing database.
It was also interesting to compare the PINN potential with traditional, parameter-based potentials for Al. One of them was the widely accepted EAM Al potential^{54} that had been fitted to a mix of experimental and DFT data. The other was a BOP potential of the same functional form as in the PINN model. Its parameters were fitted in this work using the same DFT database as for the PINN/NN potentials and then fixed once and for all. Figure 8 compares the DFT energies with the energies predicted by the EAM and BOP models across the entire set of reference configurations. The PINN predictions are shown for comparison. The plots demonstrate that the traditional, fixed-parameter models generally follow the correct trend but become increasingly less accurate as the structures deviate from the equilibrium, low-energy atomic configurations. The adaptivity to the local atomic environments built into the PINN potential greatly improves the accuracy.
Discussion
The proposed PINN potential model is capable of achieving the same high accuracy in interpolating between DFT energies on the PES as the currently existing mathematical NN potentials. The construction of PINN potentials requires the same type of DFT database, is equally straightforward, and does not heavily rely on human intuition. However, extrapolation outside the domain of atomic configurations represented in the training database is now based on a physical model of interatomic bonding. As a result, the extrapolation becomes more reliable, or at least more failure-proof, than the purely mathematical extrapolation. The accuracy of interpolation can also be improved for the same reason. As an example, the PINN Al potential constructed in this paper demonstrates better accuracy of interpolation and significantly better transferability than a regular NN potential with about the same number of parameters. The advantage of the PINN potential is especially strong for atomic forces, which are important for molecular dynamics. The potential could be used for accurate simulations of mechanical behavior and other processes in Al. Construction of general-purpose PINN potentials for Si and Ge is currently in progress.
We believe that the development of physics-based ML potentials is the best way forward in this field. Such potentials need not be limited to NNs or the particular BOP model adopted in this paper. Other regression methods can be employed and the interatomic bonding model can be made more sophisticated, or the other way round, simpler in the interest of speed.
Other modifications are envisioned in the future. For example, not all potential parameters are equally sensitive to local environments. To improve the computational efficiency, the parameters can be divided into two subsets^{40}: local parameters a_{i} = (a_{i1}, ..., a_{iλ}) adjustable according to the local environments as discussed above, and global parameters b = (b_{1}, ..., b_{μ}) that are fixed after the optimization and used for all environments (as in the traditional potentials). The potential format now becomes
\(E_i = E_i({\mathbf{r}}_{i1},...,{\mathbf{r}}_{in},{\mathbf{a}}_i(G_i^1,...,G_i^k),{\mathbf{b}}).\) (2)
During the training process, the global parameters b and the network weights and biases are optimized simultaneously, as shown in Fig. 1d. Extension of PINN potentials to binary and multicomponent systems is another major task for the future.
All ML potentials are orders of magnitude faster than straight DFT calculations but inevitably much slower than the traditional potentials. Preliminary tests indicate that PINN potentials are about 25% slower than the regular NN potentials for the same number of parameters, the extra overhead being due to the BOP calculation. However, the computational efficiency depends on the parallelization method and computer architecture. All computations reported in this paper utilized in-house software parallelized with MPI for training and with OpenMP for MD and MC simulations (see example in Supplementary Fig. 14). Collaborative work is underway to develop highly scalable HPC software packages for physically informed ML potential training and MD/MC simulations using multiple CPUs or GPUs, or both. The results will be reported in a forthcoming paper.
Methods
Local structural parameters
There are many possible ways of choosing local structural parameters^{13,14,15,16,17,18,34,36}. After trying several options, the following set of \(G_i^l\)’s was selected. For an atom i, we define
where r_{ij} and r_{ik} are distances to atoms j and k, respectively, and θ_{ijk} is the angle between the bonds ij and ik. In Eq. (3), P_{m}(x) is the Legendre polynomial of order m and
is a truncated Gaussian of width σ centered at point r_{0}. The truncation function f_{c}(r) is defined by
This function and its derivatives up to the third go to zero at a cutoff distance r_{c}. The parameter d controls the truncation range.
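For illustration, one truncation function with the stated property (the function and its first three derivatives vanish at r_{c}) is sketched below. The specific quartic form used in `cutoff` is an assumption made here, since the defining expression is not reproduced in this text; the default values of `r_c` and `d` are the ones quoted later in the Methods.

```python
import numpy as np

def cutoff(r, r_c=6.0, d=1.5):
    """Assumed smooth truncation: x^4 / (d^4 + x^4) with x = r - r_c, zero for r >= r_c.

    Because the numerator is quartic in (r - r_c), the function and its
    derivatives up to the third go to zero at r = r_c.
    """
    r = np.asarray(r, dtype=float)
    x4 = (r - r_c) ** 4
    return np.where(r < r_c, x4 / (d ** 4 + x4), 0.0)

# The function is close to 1 at short range and decays over a range set by d.
values = cutoff(np.array([2.0, 4.5, 5.9, 6.0, 7.0]))
```

With this form, the function equals exactly one half at r = r_{c} − d, which gives d a direct interpretation as the width of the truncation region.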
For example, P_{0}(x) = 1 and \(g_i^{(0)}\) characterizes the local atomic density near atom i. Likewise, P_{1}(x) = x and \(g_i^{(1)}\) can be interpreted as the dipole moment of a set of unit charges placed at the atomic positions j and k. As such, this parameter measures the degree of local deviation from spherical symmetry in the environment (\(g_i^{(1)} = 0\) for spherical symmetry). For m = 2, we have P_{2}(x) = (3x^{2} − 1)/2 and \(g_i^{(2)}\) is related to the quadrupole moment of a set of unit charges placed at the atomic positions around atom i. We found that polynomials up to degree m = 6 should be included to accurately represent diverse atomic environments. Each \(g_i^{(m)}\) is computed for several values of σ and r_{0} spanning a range of interatomic distances. For each atom, the set of k \(g_i^{(m)}\)’s obtained is arranged in a one-dimensional array \((G_i^1,G_i^2,...,G_i^k)\). In this work we chose σ = 1.0 and used polynomials with m = 0, 1, 2, 4, 6 for 12 r_{0} values, giving a total of k = 60 \(G_i^l\)’s.
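The construction of the k = 5 × 12 = 60 fingerprints can be sketched as below. Several ingredients are assumptions made for illustration: the double sum \(\sum_{j,k} P_m(\cos \theta_{ijk}) f(r_{ij}) f(r_{ik})\) over all neighbor pairs (including j = k), the radial function (a Gaussian divided by r_{0} and multiplied by an assumed quartic cutoff), and the grid of r_{0} values, which is hypothetical because the text does not list it.

```python
import numpy as np
from numpy.polynomial.legendre import legval

def cutoff(r, r_c, d):
    """Assumed smooth truncation function (see lead-in)."""
    x4 = (r - r_c) ** 4
    return np.where(r < r_c, x4 / (d ** 4 + x4), 0.0)

def radial(r, r0, sigma, r_c, d):
    """Assumed truncated Gaussian of width sigma centered at r0."""
    return np.exp(-((r - r0) / sigma) ** 2) / r0 * cutoff(r, r_c, d)

def fingerprints(neighbors, orders=(0, 1, 2, 4, 6), r0_values=None,
                 sigma=1.0, r_c=6.0, d=1.5):
    """Return the array (G^1, ..., G^k) for one atom.

    neighbors: (n, 3) array of neighbor positions relative to atom i.
    """
    if r0_values is None:
        r0_values = np.linspace(2.0, 5.0, 12)      # hypothetical grid of 12 centers
    r = np.linalg.norm(neighbors, axis=1)
    unit = neighbors / r[:, None]
    cos_theta = np.clip(unit @ unit.T, -1.0, 1.0)  # cos(theta_ijk) for all pairs j, k
    G = []
    for r0 in r0_values:
        f = radial(r, r0, sigma, r_c, d)
        weights = np.outer(f, f)                   # f(r_ij) * f(r_ik)
        for m in orders:
            coeffs = np.zeros(m + 1)
            coeffs[m] = 1.0                        # selects the Legendre polynomial P_m
            G.append(np.sum(legval(cos_theta, coeffs) * weights))
    return np.array(G)

# Six neighbors of a simple cubic site: an inversion-symmetric environment.
axes = np.eye(3)
G = fingerprints(2.86 * np.vstack([axes, -axes]))
```

For this symmetric environment the m = 1 entries vanish, consistent with the dipole interpretation given above, while the m = 0 entries reduce to the square of the summed radial weights.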
The BOP potential
In the BOP model adopted in this work, the energy of an atom i is postulated in the form
where r_{ij} is the distance between atoms i and j and the summation is over all atoms j other than i within the cutoff radius r_{c}. The bond-order parameter b_{ij} is taken in the form
where
represents the number of chemical bonds (other than ij) formed by atom i. Larger z_{ij} values (more bonds) lead to a smaller b_{ij} and thus a weaker ij bond.
The screening factor S_{ij} reduces the strength of bonds by surrounding atoms. For example, when counting the bonds in Eq. (8), we screen them by S_{ik}, so that strongly screened bonds contribute less to z_{ij}. The screening factor S_{ij} is given by
where the partial screening factor S_{ijk} represents the contribution of a neighboring atom k (different from i and j) to the screening of the bond ij. S_{ijk} is given by
It has the same value for all atoms k located on the surface of an imaginary spheroid whose poles coincide with the atoms i and j. For all atoms k outside this cutoff spheroid, on which r_{ik} + r_{jk} − r_{ij} = r_{c}, we have S_{ijk} = 1 — such atoms are too far away to screen the bond. If an atom k is placed on the line between the atoms i and j, we have r_{ik} + r_{jk} − r_{ij} = 0 and S_{ijk} is small — the bond ij is strongly screened (almost broken) by the atom k. This behavior reasonably reflects the nature of chemical bonding.
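The limiting behavior described above can be checked numerically with a small sketch. The functional forms are assumptions chosen only to reproduce the stated limits (S_{ijk} = 1 outside the cutoff spheroid, S_{ijk} small for an atom on the bond line): a partial factor 1 − f_c(x) exp(−λ²x) with x = r_{ik} + r_{jk} − r_{ij}, an assumed quartic cutoff, and a product over neighbors for S_{ij}. The value of `lam` is hypothetical.

```python
import numpy as np

def cutoff(r, r_c=6.0, d=1.5):
    """Assumed smooth truncation function, zero for r >= r_c."""
    r = np.asarray(r, dtype=float)
    x4 = (r - r_c) ** 4
    return np.where(r < r_c, x4 / (d ** 4 + x4), 0.0)

def partial_screening(r_ik, r_jk, r_ij, lam, r_c=6.0, d=1.5):
    """Assumed S_ijk: contribution of atom k to the screening of bond ij."""
    x = r_ik + r_jk - r_ij          # constant on spheroids with foci at i and j
    return float(1.0 - cutoff(x, r_c, d) * np.exp(-lam ** 2 * x))

def screening_factor(pos_i, pos_j, others, lam, r_c=6.0, d=1.5):
    """Assumed S_ij: product of partial factors over atoms k other than i, j."""
    r_ij = np.linalg.norm(pos_j - pos_i)
    s = 1.0
    for pos_k in others:
        s *= partial_screening(np.linalg.norm(pos_k - pos_i),
                               np.linalg.norm(pos_k - pos_j), r_ij, lam, r_c, d)
    return s

pos_i, pos_j = np.zeros(3), np.array([2.86, 0.0, 0.0])
on_bond = screening_factor(pos_i, pos_j, [np.array([1.43, 0.0, 0.0])], lam=1.0)
far_away = screening_factor(pos_i, pos_j, [np.array([1.43, 6.0, 0.0])], lam=1.0)
```

Under these assumptions an atom at the bond midpoint nearly breaks the bond, whereas an atom 6 Å off the bond axis lies outside the cutoff spheroid and leaves it unscreened.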
Finally, the promotion energy \(E_i^{(p)}\) is taken in the form
For a covalent material, \(E_i^{(p)}\) accounts for the energy cost of changing the electronic structure of a free atom before it forms chemical bonds. For example, for group IV elements, this is the cost of the s^{2}p^{2} → sp^{3} hybridization. On the other hand, \(E_i^{(p)}\) can be interpreted as the embedding energy
appearing in the EAM formalism^{1,2}. Here, the host electron density on atom i is given by \(\bar \rho _i = \mathop {\sum}\nolimits_{j \ne i} {S_{ij}} b_{ij}\, f_c(r_{ij})\). Due to this feature, this BOP model can be applied to both covalent and metallic systems.
The BOP functions depend on eight parameters A_{i}, B_{i}, α_{i}, β_{i}, a_{i}, h_{i}, σ_{i}, and λ_{i}, which constitute the parameter set (p_{1}, ..., p_{m}) with m = 8. The cutoff parameters were fixed at r_{c} = 6 Å and d = 1.5 Å.
The neural network and training procedures
The feedforward NN contained two hidden layers and had the 60 × 15 × 15 × 8 architecture for the PINN potential and 60 × 16 × 16 × 1 for the NN potential. The number of nodes in the hidden layers was chosen to reach the target accuracy of about 3–4 meV/atom without overfitting.
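The number of fitting parameters implied by these architectures can be checked with a short calculation, assuming fully connected layers with one bias per non-input node (an assumption that reproduces the 1225 parameters quoted for the 47 × 18 × 18 × 1 NN′ network). The function name `count_parameters` is introduced here for illustration.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected feedforward network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

pinn_count = count_parameters((60, 15, 15, 8))     # PINN: 8 outputs (BOP parameters)
nn_count = count_parameters((60, 16, 16, 1))       # NN: 1 output (atomic energy)
nn_prime_count = count_parameters((47, 18, 18, 1)) # NN' trained with PROPhet
```

Under this counting the PINN and NN networks have 1283 and 1265 adjustable weights and biases, respectively, which is consistent with the statement that the two potentials have about the same number of fitted parameters.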
The training/validation database consisted of DFT total energies for a set of supercells. The DFT calculations were performed using projector-augmented wave (PAW) pseudopotentials as implemented in the electronic structure Vienna Ab initio Simulation Package (VASP)^{55,56}. The generalized gradient approximation (GGA) was used in conjunction with the Perdew, Burke, and Ernzerhof (PBE) density functional^{57,58}. The plane-wave basis functions up to a kinetic energy cutoff of 520 eV were used, with the k-point density chosen to achieve convergence to a few meV per atom level. Further details of the DFT calculations can be found in refs. ^{20,21}. The energy of a given supercell s, \(E^s = \mathop {\sum}\nolimits_i {E_i^s}\), predicted by the potential was compared with the DFT energy \(E_{{\mathrm{DFT}}}^s\). Note that the original \(E_{{\mathrm{DFT}}}^s\) values were not corrected to remove the energy of a free atom. To facilitate comparison with literature data, prior to the training all DFT energies were uniformly shifted by 0.38446 eV per atom to match the experimental cohesive energy of Al, 3.36 eV per atom^{59}. The NN was trained by adjusting its weights w_{εκ} and biases b_{κ} to minimize the objective function
The second term was added to avoid overfitting by controlling the magnitudes of the weights and biases. The parameter τ controls the degree of regularization. The third term ensures that the variations of the PINN parameters relative to their database-averaged values \(\bar p_\eta\) remain small. The minimization of \({\cal{E}}\) was implemented by the Davidson–Fletcher–Powell algorithm of unconstrained optimization. The optimization was repeated several times starting from different random states and the solution with the smallest \({\cal{E}}\) was selected as final. The PINN and NN forces were computed by the finite-difference method.
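As an illustration of how forces can be obtained from an energy function by finite differences, a minimal sketch follows. The central-difference scheme and the step size `h` are assumptions (the text does not specify the scheme), and the harmonic dimer `dimer_energy` is a hypothetical test function rather than the PINN or NN potential.

```python
import numpy as np

def finite_difference_forces(energy_fn, positions, h=1e-4):
    """Forces F = -dE/dr by central differences; positions has shape (N, 3)."""
    positions = np.asarray(positions, dtype=float)
    forces = np.zeros_like(positions)
    for idx in np.ndindex(positions.shape):
        shifted = positions.copy()
        shifted[idx] += h
        e_plus = energy_fn(shifted)          # E(r + h e)
        shifted[idx] -= 2.0 * h
        e_minus = energy_fn(shifted)         # E(r - h e)
        forces[idx] = -(e_plus - e_minus) / (2.0 * h)
    return forces

def dimer_energy(positions, k=2.0, r_eq=1.0):
    """Hypothetical test energy: harmonic bond between two atoms."""
    r = np.linalg.norm(positions[1] - positions[0])
    return 0.5 * k * (r - r_eq) ** 2

# A stretched dimer: the atoms are pulled toward each other.
forces = finite_difference_forces(dimer_energy, [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
```

Each force component requires two energy evaluations, so this approach costs 6N energy calls for N atoms; it is simple and independent of the functional form of the potential.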
Data availability
All data that support the findings of this study are available in the Supplementary Information file or from the corresponding author upon reasonable request.
References
1. Daw, M. S. & Baskes, M. I. Embedded-atom method: derivation and application to impurities, surfaces, and other defects in metals. Phys. Rev. B 29, 6443–6453 (1984).
2. Daw, M. S. & Baskes, M. I. Semiempirical, quantum mechanical calculation of hydrogen embrittlement in metals. Phys. Rev. Lett. 50, 1285–1288 (1983).
3. Mishin, Y. in Handbook of Materials Modeling (ed. Yip, S.), Ch. 2.2, 459–478 (Springer, Dordrecht, 2005).
4. Baskes, M. I. Application of the embedded-atom method to covalent materials: a semiempirical potential for silicon. Phys. Rev. Lett. 59, 2666–2669 (1987).
5. Mishin, Y., Mehl, M. J. & Papaconstantopoulos, D. A. Phase stability in the Fe–Ni system: investigation by first-principles calculations and atomistic simulations. Acta Mater. 53, 4029–4041 (2005).
6. Liang, T., Devine, B., Phillpot, S. R. & Sinnott, S. B. Variable charge reactive potential for hydrocarbons to simulate organic-copper interactions. J. Phys. Chem. A 116, 7976–7991 (2012).
7. Brenner, D. W. Empirical potential for hydrocarbons for use in simulating the chemical vapor deposition of diamond films. Phys. Rev. B 42, 9458–9471 (1990).
8. Brenner, D. W. The art and science of an analytical potential. Phys. Stat. Solidi (b) 217, 23–40 (2000).
9. Stuart, S. J., Tutein, A. B. & Harrison, J. A. A reactive potential for hydrocarbons with intermolecular interactions. J. Chem. Phys. 112, 6472–6486 (2000).
10. van Duin, A. C. T., Dasgupta, S., Lorant, F. & Goddard, W. A. ReaxFF: a reactive force field for hydrocarbons. J. Phys. Chem. A 105, 9396–9409 (2001).
11. Mishin, Y., Asta, M. & Li, J. Atomistic modeling of interfaces and their impact on microstructure and properties. Acta Mater. 58, 1117–1151 (2010).
12. Mueller, T., Kusne, A. G. & Ramprasad, R. in Reviews in Computational Chemistry (eds Parrill, A. L. & Lipkowitz, K. B.), Vol. 29, Ch. 4, 186–273 (Wiley, 2016).
13. Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).
14. Behler, J., Martonak, R., Donadio, D. & Parrinello, M. Metadynamics simulations of the high-pressure phases of silicon employing a high-dimensional neural network potential. Phys. Rev. Lett. 100, 185501 (2008).
15. Behler, J. Neural network potential-energy surfaces in chemistry: a tool for large-scale simulations. Phys. Chem. Chem. Phys. 13, 17930–17955 (2011).
16. Behler, J. Atom-centered symmetry functions for constructing high-dimensional neural network potentials. J. Chem. Phys. 134, 074106 (2011).
17. Behler, J. Constructing high-dimensional neural network potentials: a tutorial review. Int. J. Quant. Chem. 115, 1032–1050 (2015).
18. Behler, J. Perspective: machine learning potentials for atomistic simulations. J. Chem. Phys. 145, 170901 (2016).
19. Bartok, A., Payne, M. C., Kondor, R. & Csanyi, G. Gaussian approximation potentials: the accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett. 104, 136403 (2010).
20. Botu, V. & Ramprasad, R. Adaptive machine learning framework to accelerate ab initio molecular dynamics. Int. J. Quant. Chem. 115, 1074–1083 (2015).
21. Botu, V. & Ramprasad, R. Learning scheme to predict atomic forces and accelerate materials simulations. Phys. Rev. B 92, 094306 (2015).
22. Wood, M. A. & Thompson, A. P. Extending the accuracy of the SNAP interatomic potential form. J. Chem. Phys. 148, 241721 (2018).
23. Raff, L. M., Komanduri, R., Hagan, M. & Bukkapatnam, S. T. S. Neural Networks in Chemical Reaction Dynamics. (Oxford University Press, New York, 2012).
24. Blank, T. B., Brown, S. D., Calhoun, A. W. & Doren, D. J. Neural network models of potential energy surfaces. J. Chem. Phys. 103, 4129–4137 (1995).
25. Payne, M., Csanyi, G. & de Vita, A. in Handbook of Materials Modeling (ed. Yip, S.), 2763–2770 (Springer, Dordrecht, 2005).
26. Li, Z., Kermode, J. R. & De Vita, A. Molecular dynamics with on-the-fly machine learning of quantum-mechanical forces. Phys. Rev. Lett. 114, 096405 (2015).
27. Glielmo, A., Sollich, P. & de Vita, A. Accurate interatomic force fields via machine learning with covariant kernels. Phys. Rev. B 95, 214302 (2017).
28. Dawes, R., Thompson, D. L., Wagner, A. F. & Minkoff, M. Interpolating moving least-squares methods for fitting potential energy surfaces: a strategy for efficient automatic data point placement in high dimensions. J. Chem. Phys. 128, 084107 (2008).
29. Seko, A., Takahashi, A. & Tanaka, I. First-principles interatomic potentials for ten elemental metals via compressed sensing. Phys. Rev. B 92, 054113 (2015).
30. Mizukami, W., Habershon, S. & Tew, D. P. A compact and accurate semi-global potential energy surface for malonaldehyde from constrained least squares regression. J. Chem. Phys. 141, 144310 (2015).
31. Chmiela, S., Sauceda, H. E., Muller, K. R. & Tkatchenko, A. Towards exact molecular dynamics simulations with machine-learned force fields. Nat. Commun. 9, 3887 (2018).
32. Bholoa, A., Kenny, S. D. & Smith, R. A new approach to potential fitting using neural networks. Nucl. Instrum. Methods Phys. Res. 255, 1–7 (2007).
33. Sanville, E., Bholoa, A., Smith, R. & Kenny, S. D. Silicon potentials investigated using density functional theory fitted neural networks. J. Phys. Condens. Matter 20, 285219 (2008).
34. Eshet, H., Khaliullin, R. Z., Kuhle, T. D., Behler, J. & Parrinello, M. Ab initio quality neural-network potential for sodium. Phys. Rev. B 81, 184107 (2010).
35. Handley, C. M. & Popelier, P. L. A. Potential energy surfaces fitted by artificial neural networks. J. Phys. Chem. A 114, 3371–3383 (2010).
36. Sosso, G. C., Miceli, G., Caravati, S., Behler, J. & Bernasconi, M. Neural network interatomic potential for the phase change material GeTe. Phys. Rev. B 85, 174103 (2012).
37. Schutt, K. T., Sauceda, H. E., Kindermans, P. J., Tkatchenko, A. & Muller, K. R. SchNet – a deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).
38. Imbalzano, G. et al. Automatic selection of atomic fingerprints and reference configurations for machine-learning potentials. J. Chem. Phys. 148, 241730 (2018).
39. Bartok, A. P., Kermode, J., Bernstein, N. & Csanyi, G. Machine learning a general purpose interatomic potential for silicon. Phys. Rev. X 8, 041048 (2018).
40. Malshe, M. et al. Parametrization of analytic interatomic potential functions using neural networks. J. Chem. Phys. 129, 044111 (2008).
41. Tersoff, J. New empirical approach for the structure and energy of covalent systems. Phys. Rev. B 37, 6991–7000 (1988).
42. Tersoff, J. Empirical interatomic potential for silicon with improved elastic properties. Phys. Rev. B 38, 9902–9905 (1988).
43. Tersoff, J. Modeling solid-state chemistry: interatomic potentials for multicomponent systems. Phys. Rev. B 39, 5566–5568 (1989).
44. Bereau, T., Andrienko, D. & von Lilienfeld, O. A. Transferable atomic multipole machine learning models for small organic molecules. J. Chem. Theor. Comput. 11, 3225–3233 (2015).
45. Bereau, T., DiStasio, R. A., Tkatchenko, A. & von Lilienfeld, O. A. Non-covalent interactions across organic and biological subsets of chemical space: physics-based potentials parametrized from machine learning. J. Chem. Phys. 148, 241706 (2018).
46. Kranz, J. J., Kubillus, M., Ramakrishnan, R. & von Lilienfeld, O. A. Generalized density-functional tight-binding repulsive potentials from unsupervised machine learning. J. Chem. Theor. Comput. 14, 2341–2352 (2018).
47. Glielmo, A., Zeni, C. & de Vita, A. Efficient nonparametric n-body force fields from machine learning. Phys. Rev. B 97, 184307 (2018).
48. Hornik, K., Stinchcombe, M. & White, H. Multilayer feedforward networks are universal approximators. Neural Netw. 2, 359–366 (1989).
49. Pinkus, A. Approximation theory of the MLP model in neural networks. Acta Numer. 8, 143–195 (1999).
50. Oloriegbe, S. Y. Hybrid Bond-Order Potential for Silicon. Ph.D. thesis (Clemson University, Clemson, 2008).
51. Gillespie, B. A. et al. Bond-order potential for silicon. Phys. Rev. B 75, 155207 (2007).
52. Drautz, R. et al. Analytic bond-order potentials for modelling the growth of semiconductor thin films. Prog. Mater. Sci. 52, 196–229 (2007).
53. Kolb, B., Lentz, L. C. & Kolpak, A. M. Discovering charge density functionals and structure-property relationships with PROPhet: a general framework for coupling machine learning and first-principles methods. Sci. Rep. 7, 1192 (2017).
54. Mishin, Y., Farkas, D., Mehl, M. J. & Papaconstantopoulos, D. A. Interatomic potentials for monoatomic metals from experimental data and ab initio calculations. Phys. Rev. B 59, 3393–3407 (1999).
55. Kresse, G. & Furthmüller, J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comput. Mat. Sci. 6, 15 (1996).
56. Kresse, G. & Joubert, D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys. Rev. B 59, 1758 (1999).
57. Perdew, J. P. et al. Atoms, molecules, solids, and surfaces: applications of the generalized gradient approximation for exchange and correlation. Phys. Rev. B 46, 6671–6687 (1992).
58. Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
59. Kittel, C. Introduction to Solid State Physics. (Wiley-Interscience, New York, 1986).
60. Touloukian, Y. S., Kirby, R. K., Taylor, R. E. & Desai, P. D. (eds.) Thermal Expansion: Metallic Elements and Alloys, Vol. 12 (Plenum, New York, 1975).
61. de Jong, M. et al. Charting the complete elastic properties of inorganic crystalline compounds. Sci. Data 2, 150009 (2015).
62. Tran, R. et al. Surface energies of elemental crystals. Sci. Data 3, 160080 (2016).
63. Qiu, R. et al. Energetics of intrinsic point defects in aluminium via orbital-free density functional theory. Philos. Mag. 97, 2164–2181 (2017).
64. Zhuang, H., Chen, M. & Carter, E. A. Elastic and thermodynamic properties of complex Mg–Al intermetallic compounds via orbital-free density functional theory. Phys. Rev. Appl. 5, 064021 (2016).
65. Iyer, M., Gavini, V. & Pollock, T. M. Energetics and nucleation of point defects in aluminum under extreme tensile hydrostatic stresses. Phys. Rev. B 89, 014108 (2014).
66. Sjostrom, T., Crockett, S. & Rudin, S. Multiphase aluminum equations of state via density functional theory. Phys. Rev. B 94, 144101 (2016).
67. Devlin, J. F. Stacking fault energies of Be, Mg, Al, Cu, Ag, and Au. J. Phys. F: Met. Phys. 4, 1865 (1974).
68. Ogata, S., Li, J. & Yip, S. Ideal pure shear strength of aluminum and copper. Science 298, 807–811 (2002).
69. Jahnatek, M., Hafner, J. & Krajci, M. Shear deformation, ideal strength, and stacking fault formation of fcc metals: a density-functional study of Al and Cu. Phys. Rev. B 79, 224103 (2009).
70. Kibey, S., Liu, J. B., Johnson, D. D. & Sehitoglu, H. Predicting twinning stress in fcc metals: linking twin-energy pathways to twin nucleation. Acta Mater. 55, 6843–6851 (2007).
Acknowledgements
We are grateful to Dr. James Hickman for performing some of the additional Al DFT calculations used in this work. We are also grateful to Dr. Vesselin Yamakov for numerous helpful discussions, the development of a software package for PINN-based simulations, and for benchmarking the computational speed of the method. The authors acknowledge support of the Office of Naval Research under Awards No. N00014-18-1-2612 (G.P.P.P. and Y.M.) and N00014-17-1-2148 (R.B. and R.R.). This work was also supported in part by a grant of computer time from the DoD High Performance Computing Modernization Program at ARL DSRC, ERDC DSRC and Navy DSRC.
Author information
Contributions
Y.M. developed the PINN theory and initiated this research project. G.P.P.P. wrote the computer software for the NN and PINN potential training, validation and testing under Y.M.’s direction and supervision. He also created the Al NN and PINN potentials reported in this paper and tested their properties. R.B. generated much of the DFT data for Al used in this work under R.R.’s advice and supervision. Y.M. wrote the initial draft of the manuscript. All coauthors were engaged in discussions, contributed ideas at all stages of the work, participated in the manuscript editing, and approved its final version.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Journal peer review information: Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Pun, G.P.P., Batra, R., Ramprasad, R. et al. Physically informed artificial neural networks for atomistic modeling of materials. Nat Commun 10, 2339 (2019). https://doi.org/10.1038/s41467-019-10343-5