Physically informed artificial neural networks for atomistic modeling of materials

Abstract

Large-scale atomistic computer simulations of materials heavily rely on interatomic potentials predicting the energy and Newtonian forces on atoms. Traditional interatomic potentials are based on physical intuition but contain few adjustable parameters and are usually not accurate. The emerging machine-learning (ML) potentials achieve highly accurate interpolation within a large DFT database but, being purely mathematical constructions, suffer from poor transferability to unknown structures. We propose a new approach that can drastically improve the transferability of ML potentials by informing them of the physical nature of interatomic bonding. This is achieved by combining a rather general physics-based model (analytical bond-order potential) with a neural-network regression. This approach, called the physically informed neural network (PINN) potential, is demonstrated by developing a general-purpose PINN potential for Al. We suggest that the development of physics-based ML potentials is the most effective way forward in the field of atomistic simulations.

Introduction

Large-scale molecular dynamics (MD) and Monte Carlo (MC) simulations of materials are traditionally implemented using classical interatomic potentials predicting the potential energy and Newtonian forces acting on atoms. Computations with such potentials are very fast and afford access to systems with millions of atoms and MD simulation times up to hundreds of nanoseconds. Such simulations span a wide range of time and length scales and constitute a critical component of the multiscale approach in materials modeling and computational design.

Several functional forms of interatomic potentials have been developed over the years, including the embedded-atom method (EAM)1,2,3, the modified EAM (MEAM)4, the angular-dependent potentials5, the charge-optimized many-body potentials6, reactive bond-order potentials7,8,9, and reactive force fields10 to name a few. These potentials address particular classes of materials or particular types of applications. Their functional forms depend on the physical and chemical models chosen to describe interatomic bonding in the respective class of materials.

A common feature of all traditional potentials is that they express the potential energy surface (PES) of the system, E = E(r1, ..., rN, p), as a relatively simple function of atomic coordinates (r1, ..., rN), N being the number of atoms (Fig. 1a). Knowing the PES, the forces acting on the atoms can be computed by differentiation and used in MD simulations. The potential functions depend on a relatively small number of fitting parameters p = (p1, ..., pm) (typically, m = 10–20) and are optimized (trained) on a relatively small database of experimental data and first-principles density functional theory (DFT) calculations. The traditional potentials are, of course, much less accurate than DFT calculations. Nevertheless, many of them demonstrate a reasonably good transferability to atomic configurations lying well outside the training dataset. This important feature owes its origin to the incorporation of at least some basic physics in the potential form. As long as the nature of chemical bonding remains the same as assumed during the potential development, the potential can predict the system energy adequately even for new configurations not seen during the training process. Unfortunately, the construction of good quality potentials is a long and painful process requiring personal experience and intuition and is more art than science8,11. In addition, the traditional potentials are specific to a particular class of materials and cannot be easily extended to other materials or improved in a systematic manner.

During the past decade, a new direction has emerged wherein interatomic potentials are developed by employing machine-learning (ML) methods12,13,14,15,16,17,18,19,20,21,22. The idea was originally conceived in the chemistry community in the 1990s in the effort to improve the accuracy of inter-molecular force fields23,24, an approach that was later adopted by the physics and materials science communities. The general idea is to forego the physical insights and reproduce the PES by interpolating between DFT data points using high-dimensional nonlinear regression methods such as the Gaussian process regression19,25,26,27, interpolating moving least squares28, kernel ridge regression12,20,21, compressed sensing29,30, gradient-domain machine-learning model31, or the artificial neural network (NN) approach13,14,15,16,17,18,32,33,34,35,36,37,38. If properly trained, an ML potential can predict the system energy with near-DFT accuracy (a few meV/atom). ML potentials are not specific to a particular class of materials or type of chemical bonding. They can be improved systematically if weaknesses are discovered or new DFT data become available. The training process can be implemented on-the-fly by running ab initio MD simulations26.

A major weakness of ML potentials is their poor transferability. Being purely mathematical constructions devoid of any physical meaning, they can accurately interpolate the energy between the training configurations but are generally incapable of properly extrapolating the energy to unknown atomic environments. As a result, the performance of ML potentials outside the training domain can be very poor. There is no reason why a purely mathematical extrapolation scheme would deliver physically meaningful results outside the training database. This explains why the existing ML potentials are usually (with rare exceptions39) narrowly focused on, and only tested for, a particular type of physical properties. This distinguishes them from the traditional potentials which, although less accurate, are designed for a much wider range of applications and diverse properties.

In this work we propose a new approach that can drastically improve the transferability of ML potentials by informing them of the physical nature of interatomic bonding. We focus on NN potentials as an example, but the approach is general and can be readily extended to other methods of nonlinear regression. Like all ML potentials, the proposed physically informed NN (PINN) potentials are trained using a large DFT dataset. However, by contrast to the existing, mathematical NN potentials, the PINN potentials incorporate the basic physics and chemistry of atomic interactions leveraged by the extraordinary adaptivity and trainability of NNs. The PINN potentials thus strike a golden compromise between the two extremes represented by the traditional, physics-guided interatomic potentials, and the mathematical NN potentials.

The general idea of combining traditional interatomic potentials with NNs was previously discussed by Malshe et al.40, who constructed an adjustable Tersoff potential41,42,43 for a Si5 cluster. Other authors have also applied machine-learning methods to parameterize physics-based models of molecular interactions, primarily in the context of broad exploration of the compositional space of molecular (mostly organic) matter44,45,46. Glielmo et al.47 recently proposed to construct n-body Gaussian process kernels to capture the n-body nature of atomic interactions in physical systems. The PINN potentials proposed in this paper are inspired by such approaches but extend them to (1) more advanced physical models with a broad applicability, and (2) large-scale systems by introducing local energies Ei linked to local structural parameters $$G_i^l$$. The focus is placed on the exploration of the configurational space of defected solids and liquids in single-component and, in the future, binary or multicomponent systems. The main goal is to improve the transferability of interatomic potentials to unknown atomic environments while keeping the same level of accuracy of training as normally achieved with mathematical machine-learning potentials.

Results

Physically informed neural network potentials

The currently existing, mathematical NN potentials13,14,15,16,17,18,32,33,34,35,36 partition the total energy E into a sum of atomic energies, $$E = \mathop {\sum}\nolimits_i {E_i}$$. A single NN is constructed to express each atomic energy Ei as a function of a set of local fingerprint parameters (also called symmetry parameters13) $$(G_i^1,G_i^2,...,G_i^k)$$. These parameters encode the local environments of the atoms. The network is trained by minimizing the error between the energies predicted by the NN and the respective DFT total energies for a large set of atomic configurations. The flowchart of the method is depicted in Fig. 1b.

The proposed PINN model is based on the following considerations. A traditional, physics-based potential can always be trained to reproduce the energy of any given atomic configuration with any desired accuracy. Of course, this potential will not work well for other configurations. Imagine, however, that the potential parameters have been trained for a large set of reference structures, one structure at a time, each time producing a different parameter set p. Suppose that, during the subsequent simulations, we have a way of identifying, on the fly, a reference structure closest to any current atomic configuration. Then the accuracy of the simulation can be drastically improved by dynamically choosing the best set of potential parameters for every atomic configuration encountered during the simulation. Now, since the atomic energy Ei only depends on the local environment of atom i, the best parameter set for computing Ei can be chosen by only examining the local environment of this atom. The energies of different atoms are then computed by using different, environment-dependent, parameter sets while keeping the same, physics-motivated functional form of the potential.

Instead of generating and storing a large set of discrete reference structures, we can construct a continuous NN-based function mapping the local environment of every atom on a parameter set of the interatomic potential optimized for that particular environment. Specifically, the local structural parameters (fingerprints) $$G_i^l$$ (l = 1, ..., k) of every atom i are fed into the network, which then maps them to the optimized parameter set pi appropriate for atom i. Mathematically, the local energy takes the functional form

$$E_i = E_i\left( {{\mathbf{r}}_{i1},...,{\mathbf{r}}_{in},{\mathbf{p}}_i\left( {G_i^l({\mathbf{r}}_{i1},...,{\mathbf{r}}_{in})} \right)} \right),$$
(1)

where (ri1, ..., rin) are atomic positions in the vicinity of atom i.

In comparison with the direct mapping $$G_i^l \mapsto E_i$$ implemented by the mathematical NN potentials, we have added an intermediate step: $$G_i^l \mapsto {\mathbf{p}}_i \mapsto E_i$$. The first step is executed by the NN and the second by a physics-based interatomic potential. A flowchart of the two-step mapping is shown in Fig. 1c. It is important to emphasize that this intermediate step does not degrade the accuracy relative to the direct mapping, because a feedforward NN can always be trained to execute any real-valued function48,49. Thus, for any functional form of the potential, the NN can always adjust its architecture, weights and biases to achieve the same mapping as in the direct method. However, since the chosen potential form captures the essential physics of atomic interactions, the proposed PINN potential will display a better transferability to new atomic environments. Even if the potential parameters predicted by the NN for an unknown environment are not very accurate, the physics-motivated functional form will ensure that the results remain at least physically meaningful. This physics-guided extrapolation is likely to be more reliable than the purely mathematical extrapolation inherent in the existing NN potentials. Obviously, the same reasoning applies to the interpolation process as well, which can also be more accurate.
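The two-step mapping can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the single linear layer standing in for the NN, the function names, and the bare pair-energy form (only the pairwise part of a BOP-type energy, without screening or bond order) are simplifying assumptions chosen to show how environment-dependent parameters enter a fixed physical functional form.

```python
import math

def predict_parameters(G, weights, biases):
    """Hypothetical stand-in for the NN: one linear layer mapping the
    fingerprints G of atom i to its parameter set p_i."""
    return [b + sum(w * g for w, g in zip(row, G))
            for row, b in zip(weights, biases)]

def pair_energy(distances, p):
    """Toy physics model (assumption): repulsive and attractive exponentials
    summed over the neighbor distances of atom i."""
    A, alpha, B, beta = p
    return 0.5 * sum(math.exp(A - alpha * r) - math.exp(B - beta * r)
                     for r in distances)

def pinn_atomic_energy(distances, G, weights, biases):
    """Two-step mapping: G -> p_i (regression), then p_i -> E_i (physics)."""
    p_i = predict_parameters(G, weights, biases)
    return pair_energy(distances, p_i)
```

With all weights set to zero the network output no longer depends on the environment, and the sketch reduces to a traditional fixed-parameter potential whose parameters are the biases.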

The functional form of the PINN potential must be general enough to be applicable across different classes of materials. In this paper we chose a simple analytical bond-order potential (BOP)50,51,52 that must work equally well for both covalent and metallic materials. For a single-component system, the BOP functions are specified in the Methods section. They capture the physical and chemical effects such as the pairwise repulsion between atoms, the angular dependence of the chemical bond strength, the bond-order effect (the more neighbors, the weaker the bond), and the screening of chemical bonds by surrounding atoms. In addition to being appropriate for covalent bonding, the proposed BOP form reduces to the EAM formalism in the limit of metallic bonding.

Example: PINN potential for Al

To demonstrate the PINN method, we have constructed a general-purpose potential for aluminum. The training and validation datasets were randomly selected from a pre-existing DFT database20,21. Some additional DFT calculations have also been performed using the same methodology as in refs. 20,21. The selected DFT supercells represent seven crystal structures for a large set of atomic volumes under isotropic tension and compression, several slabs with different surface orientations, including surfaces with adatoms, a supercell with a single vacancy, five different symmetrical tilt grain boundaries, and an unrelaxed intrinsic stacking fault on the (111) plane with different translational states along the [211] direction. The database also includes several isolated clusters with the number of atoms ranging from 2 (dimer) to 79. The ground-state face centered cubic (FCC) structure was additionally subject to uniaxial tension and compression in the [100] and [111] directions at 0 K temperature. Most of the atomic configurations were snapshots of DFT MD simulations in the microcanonical (NVE) or canonical (NVT or NPT) ensembles for several atomic volumes at several temperatures. Some of the high-temperature configurations were part-liquid, part crystalline. In total, the database contains 3649 supercells (127,592 atoms). More detailed information about the database can be found in the Supplementary Tables 1 and 2. To avoid overfitting or selection bias, the 10-fold cross-validation method was used during the training. The database was randomly partitioned into 10 subsets. One of them was set aside for validation and the remaining data was used for training. The process was repeated 10 times for different choices of the validation subset.
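The 10-fold cross-validation partition described above can be sketched as follows. This is an illustrative Python fragment, not the in-house software: the function name, the fixed seed, and the strided assignment of shuffled indices to folds are assumptions.

```python
import random

def k_fold_splits(n_items, k=10, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation.

    The items are shuffled once and dealt into k folds; each fold serves
    as the validation subset exactly once.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i
                    for idx in fold]
        yield training, validation
```

For the 3649 supercells of the database, each of the 10 runs would then validate on 364 or 365 supercells and train on the rest.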

The local structural parameters $$G_i^l$$ chosen for Al are specified in the Methods section. The NN contained two hidden layers with the same number of nodes in each. This number was increased until the training process produced a PINN potential with the root-mean-square error (RMSE) of training and validation close to 3–4 meV per atom, which was set as our goal. This is the level of accuracy of the DFT energies included in the database. For comparison, a mathematical NN potential was constructed using the same methodology. The number of hidden nodes of the NN was adjusted to give about the same number of fitted parameters and to achieve approximately the same RMSE of training and validation as for the PINN potential. Table 1 summarizes the training and validation errors averaged over the 10 cross-validation runs. One PINN and one NN potential were selected for a more detailed examination reported below.

Figure 2 and Supplementary Fig. 1 demonstrate excellent correlation between the predicted and DFT energies over a 7 eV per atom wide energy range for both potentials. The error distribution has a near-Gaussian shape centered at zero. Examination of errors in individual groups of structures (Supplementary Fig. 2) shows that the largest errors originate from the crystal structures (especially FCC, HCP, and simple hexagonal) subjected to large expansion.

Table 2 summarizes some of the physical properties of Al predicted by the potentials in comparison with DFT data from the literature. There was no direct fit to any of these properties, although atomic configurations most relevant to some of the properties were represented in the training dataset. While both potentials agree with the DFT data well, the PINN potential tends to be more accurate for most properties. For the [110] self-interstitial dumbbell, the NN potential predicts an unstable configuration that spontaneously rotates to the [100] orientation, whereas the PINN potential correctly predicts such configurations to be metastable. Figure 3 shows the linear thermal expansion factor as a function of temperature predicted by the potentials in comparison with experimental data. The PINN potential displays good agreement with experiment without direct fit, whereas the NN potential overestimates the thermal expansion at high temperatures. (The discrepancies at low temperatures are due to the quantum effects that are not captured by classical simulations.) As another test, the radial distribution function and the bond angle distribution in liquid Al were computed at several temperatures for which experimental and/or DFT data are available (Supplementary Figs 4 and 5). In this case, both potentials were found to perform equally well. Any small deviations from the published DFT calculations are within the uncertainty of the different DFT flavors (exchange-correlation functionals).

For testing purposes, we computed the energies of the remaining groups of structures that were part of the original DFT database20,21 but were not used here for training or validation. The full information about the testing dataset (26,425 supercells containing a total of 2,376,388 atoms) can be found in the Supplementary Table 3. For example, Fig. 4 compares the energies predicted by the potentials with DFT energies from high-temperature MD simulations for a supercell containing an edge dislocation or HCP Al. In both cases, the PINN potential is clearly more accurate. The remaining testing cases are presented in the Supplementary Figs. 6–10. Although there are cases where both potentials perform equally well, in most cases the PINN potential predicts the energies of unknown atomic configurations more accurately than the NN potential.

For further testing, the energies of the crystal structures of Al were computed for atomic volumes both within and beyond the training interval. Both potentials accurately reproduce the DFT energy–volume relations for all volumes spanned by the DFT database (Fig. 5 and Supplementary Fig. 3). However, extrapolation to larger or smaller volumes reveals significant differences. For example, the PINN potential correctly predicts that the crystal energy continues to rapidly increase under strong compression (repulsive interaction mode). In fact, the extrapolated PINN energy goes exactly through the new DFT points that were not included in the training or validation datasets, see examples in Fig. 6. By contrast, the energy predicted by the NN model immediately develops wiggles and strongly deviates from the physically meaningful repulsive behavior. Such artifacts were found for other structures as well.

To demonstrate that the unphysical behavior exhibited by the NN potential is not a specific feature of our structural parameters $$G_i^l$$ or the training method, we constructed another NN potential using a third-party NN-training package PROPhet53. This potential, which we refer to as NN′, uses the Behler-Parrinello symmetry functions13, which are different from our structural descriptor $$G_i^l$$. The NN-training algorithm is also different. A 47 × 18 × 18 × 1 network containing 1225 fitting parameters was trained on exactly the same DFT database to about the same accuracy as the NN and PINN potentials (Table 1). Figure 6 shows that the NN′ potential behaves in a similar manner as our NN potential, closely following the DFT energies within the training/validation domain and becoming unphysical as soon as we step outside this domain.

While the atomic forces were not used for either training or validation, they were compared with the DFT forces once the training was complete. For the validation dataset, this comparison probes the accuracy of interpolation, whereas for the testing dataset the accuracy of extrapolation. As expected, for the validation dataset the PINN forces are in better agreement with DFT calculations than the NN forces (RMSE ≈ 0.1 eV Å−1 versus ≈0.2 eV Å−1) as illustrated in Fig. 7a, b. For the testing dataset, the advantage of the PINN model in force predictions is even more significant. For example, for the dislocation and HCP cases discussed above, the PINN potential provides more accurate predictions (RMSE ≈ 0.1 eV Å−1) than the NN potential (RMSE ≈ 0.4 eV Å−1 for the dislocation and 0.6 eV Å−1 for the HCP case) (Fig. 7c, f). This advantage persists for all other groups of structures from the testing database.

It was also interesting to compare the PINN potential with traditional, parameter-based potentials for Al. One of them was the widely accepted EAM Al potential54 that had been fitted to a mix of experimental and DFT data. The other was a BOP potential of the same functional form as in the PINN model. Its parameters were fitted in this work using the same DFT database as for the PINN/NN potentials and then fixed once and for all. Figure 8 compares the DFT energies with the energies predicted by the EAM and BOP models across the entire set of reference configurations. The PINN predictions are shown for comparison. The plots demonstrate that the traditional, fixed-parameter models generally follow the correct trend but become increasingly less accurate as the structures deviate from the equilibrium, low-energy atomic configurations. The adaptivity to the local atomic environments built into the PINN potential greatly improves the accuracy.

Discussion

The proposed PINN potential model is capable of achieving the same high accuracy in interpolating between DFT energies on the PES as the currently existing mathematical NN potentials. The construction of PINN potentials requires the same type of DFT database, is equally straightforward, and does not heavily rely on human intuition. However, extrapolation outside the domain of atomic configurations represented in the training database is now based on a physical model of interatomic bonding. As a result, the extrapolation becomes more reliable, or at least more failure-proof, than the purely mathematical extrapolation. The accuracy of interpolation can also be improved for the same reason. As an example, the PINN Al potential constructed in this paper demonstrates better accuracy of interpolation and significantly better transferability than a regular NN potential with about the same number of parameters. The advantage of the PINN potential is especially strong for atomic forces, which are important for molecular dynamics. The potential could be used for accurate simulations of mechanical behavior and other processes in Al. Construction of general-purpose PINN potentials for Si and Ge is currently in progress.

We believe that the development of physics-based ML potentials is the best way forward in this field. Such potentials need not be limited to NNs or the particular BOP model adopted in this paper. Other regression methods can be employed and the interatomic bonding model can be made more sophisticated, or the other way round, simpler in the interest of speed.

Other modifications are envisioned in the future. For example, not all potential parameters are equally sensitive to local environments. To improve the computational efficiency, the parameters can be divided into two subsets40: local parameters ai = (ai1, ai2, ...) adjustable according to the local environments as discussed above, and global parameters b = (b1, ..., bμ) that are fixed after the optimization and used for all environments (as in the traditional potentials). The potential format now becomes

$$E_i = E_i\left( {{\mathbf{r}}_{i1},...,{\mathbf{r}}_{in},{\mathbf{a}}_i\left( {G_i^l({\mathbf{r}}_{i1},...,{\mathbf{r}}_{in})} \right),{\mathbf{b}}} \right).$$
(2)

During the training process, the global parameters b and the network weights and biases are optimized simultaneously, as shown in Fig. 1d. Extension of PINN potentials to binary and multicomponent systems is another major task for the future.

All ML potentials are orders of magnitude faster than straight DFT calculations but inevitably much slower than the traditional potentials. Preliminary tests indicate that PINN potentials are about 25% slower than the regular NN potentials for the same number of parameters, the extra overhead being due to the BOP calculation. However, the computational efficiency depends on the parallelization method and computer architecture. All computations reported in this paper utilized in-house software parallelized with MPI for training and with OpenMP for MD and MC simulations (see example in Supplementary Fig. 14). Collaborative work is underway to develop highly scalable HPC software packages for physically informed ML potential training and MD/MC simulations using multiple CPUs or GPUs, or both. The results will be reported in a forthcoming paper.

Methods

Local structural parameters

There are many possible ways of choosing local structural parameters13,14,15,16,17,18,34,36. After trying several options, the following set of $$G_i^l$$’s was selected. For an atom i, we define

$$g_i^{(m)} = \mathop {\sum}\limits_{j,k} {P_m} \left( {{\mathrm{cos}}\,\theta _{ijk}} \right)f(r_{ij})f(r_{ik}),m = 0,1,2,...,$$
(3)

where rij and rik are distances to atoms j and k, respectively, and θijk is the angle between the bonds ij and ik. In Eq. (3), Pm(x) is the Legendre polynomial of order m and

$$f(r) = \frac{1}{{\sigma ^3}}e^{ - (r - r_0)^2/\sigma ^2}f_c(r)$$
(4)

is a truncated Gaussian of width σ centered at point r0. The truncation function fc(r) is defined by

$$f_c(r) = \left\{ {\begin{array}{*{20}{l}} {\frac{{(r - r_c)^4}}{{d^4 + (r - r_c)^4}}} \hfill & {r \le r_c} \hfill \\ {0,} \hfill & {r \ge r_c.} \hfill \end{array}} \right.$$
(5)

This function and its derivatives up to the third go to zero at a cutoff distance rc. The parameter d controls the truncation range.
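As a quick numerical illustration of Eq. (5), the truncation function can be written in a few lines of Python. The function name is an assumption; the default values rc = 6 Å and d = 1.5 Å are the ones quoted later in the Methods.

```python
def cutoff(r, rc=6.0, d=1.5):
    """Truncation function f_c(r) of Eq. (5); distances in angstroms."""
    if r >= rc:
        return 0.0
    x4 = (r - rc) ** 4
    return x4 / (d ** 4 + x4)
```

The function equals 1/2 at a distance d inside the cutoff (r = rc − d), which is one way to read the statement that d controls the truncation range.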

For example, P0(x) = 1 and $$g_i^{(0)}$$ characterizes the local atomic density near atom i. Likewise, P1(x) = x and $$g_i^{(1)}$$ can be interpreted as the dipole moment of a set of unit charges placed at the atomic positions j and k. As such, this parameter measures the degree of local deviation from spherical symmetry in the environment ($$g_i^{(1)} = 0$$ for spherical symmetry). For m = 2, we have P2(x) = (3x² − 1)/2 and $$g_i^{(2)}$$ is related to the quadrupole moment of a set of unit charges placed at the atomic positions around atom i. We found that polynomials up to degree m = 6 should be included to accurately represent the diverse atomic environments. Each $$g_i^{(m)}$$ is computed for several values of σ and r0 spanning a range of interatomic distances. For each atom, the set of k $$g_i^{(m)}$$’s obtained is arranged in a one-dimensional array $$(G_i^1,G_i^2,...,G_i^k)$$. In this work we chose σ = 1.0 and used polynomials with m = 0, 1, 2, 4, 6 for 12 r0 values, giving a total of k = 60 $$G_i^l$$’s.
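Equations (3)–(5) can be evaluated with a short Python sketch. The function names and the brute-force double loop are illustrative assumptions, and the sum is taken over all pairs (j, k), including j = k, which is the reading under which $$g_i^{(1)}$$ vanishes for a symmetric environment. Neighbor positions are given relative to atom i, in Å.

```python
import math

def legendre(m, x):
    """Legendre polynomial P_m(x) from the Bonnet recurrence."""
    if m == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, m):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def cutoff(r, rc=6.0, d=1.5):
    """Truncation function f_c(r) of Eq. (5)."""
    if r >= rc:
        return 0.0
    x4 = (r - rc) ** 4
    return x4 / (d ** 4 + x4)

def radial(r, r0, sigma=1.0, rc=6.0, d=1.5):
    """Truncated Gaussian f(r) of Eq. (4)."""
    return math.exp(-((r - r0) / sigma) ** 2) / sigma ** 3 * cutoff(r, rc, d)

def structural_parameter(neighbors, m, r0, sigma=1.0):
    """g_i^(m) of Eq. (3) for neighbor vectors given relative to atom i."""
    data = []
    for v in neighbors:
        r = math.sqrt(sum(c * c for c in v))
        data.append((v, r, radial(r, r0, sigma)))
    total = 0.0
    for vj, rj, fj in data:
        for vk, rk, fk in data:
            cos_theta = sum(a * b for a, b in zip(vj, vk)) / (rj * rk)
            total += legendre(m, cos_theta) * fj * fk
    return total
```

Evaluating `structural_parameter` for m = 0, 1, 2, 4, 6 and 12 values of r0 would produce the 60 fingerprints per atom described above; the specific r0 grid is not reproduced here.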

The BOP potential

In the BOP model adopted in this work, the energy of an atom i is postulated in the form

$$E_i = \frac{1}{2}\mathop {\sum}\limits_{j \ne i} {\left[ {e^{A_i - \alpha _ir_{ij}} - S_{ij}b_{ij}e^{B_i - \beta _ir_{ij}}} \right]} f_c(r_{ij}) + E_i^{(p)},$$
(6)

where rij is the distance between atoms i and j and the summation is over all atoms j other than i within the cutoff radius rc. The bond-order parameter bij is taken in the form

$$b_{ij} = (1 + z_{ij})^{ - 1/2},$$
(7)

where

$$z_{ij} = a_i^2\mathop {\sum}\limits_{k \ne i,j} {S_{ik}} ({\mathrm{cos}}\theta _{ijk} + h_i)^2f_c(r_{ik})$$
(8)

represents the number of chemical bonds (other than ij) formed by atom i. Larger zij values (more bonds) lead to a smaller bij and thus weaker ij bond.

The screening factor Sij reduces the strength of bonds by surrounding atoms. For example, when counting the bonds in Eq. (8), we screen them by Sik, so that strongly screened bonds contribute less to zij. The screening factor Sij is given by

$$S_{ij} = \mathop {\prod}\limits_{k \ne i,j} S_{ijk},$$
(9)

where the partial screening factor Sijk represents the contribution of a neighboring atom k (different from i and j) to the screening of the bond ij. Sijk is given by

$$S_{ijk} = 1 - f_c(r_{ik} + r_{jk} - r_{ij})e^{ - \lambda _i^2(r_{ik} + r_{jk} - r_{ij})}.$$
(10)

It has the same value for all atoms k located on the surface of an imaginary spheroid whose poles coincide with the atoms i and j. For all atoms k outside this cutoff spheroid, on which rik + rjk − rij = rc, we have Sijk = 1 — such atoms are too far away to screen the bond. If an atom k is placed on the line between the atoms i and j, we have rik + rjk − rij = 0 and Sijk is small — the bond ij is strongly screened (almost broken) by the atom k. This behavior reasonably reflects the nature of chemical bonding.
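The two limiting cases just described can be checked numerically with a small Python sketch of Eqs. (9) and (10). The function names and the value λi = 1 used in the usage note are assumptions for illustration, not fitted quantities.

```python
import math

def cutoff(r, rc=6.0, d=1.5):
    """Truncation function f_c(r) of Eq. (5)."""
    if r >= rc:
        return 0.0
    x4 = (r - rc) ** 4
    return x4 / (d ** 4 + x4)

def partial_screening(r_ik, r_jk, r_ij, lam, rc=6.0, d=1.5):
    """Partial screening factor S_ijk of Eq. (10)."""
    x = r_ik + r_jk - r_ij
    return 1.0 - cutoff(x, rc, d) * math.exp(-lam ** 2 * x)

def screening_factor(r_ij, neighbor_distances, lam, rc=6.0, d=1.5):
    """S_ij of Eq. (9): product of S_ijk over atoms k other than i and j.

    neighbor_distances is a list of (r_ik, r_jk) pairs, one per atom k.
    """
    s = 1.0
    for r_ik, r_jk in neighbor_distances:
        s *= partial_screening(r_ik, r_jk, r_ij, lam, rc, d)
    return s
```

For a bond of length 2.8 Å with an atom k at its midpoint, `partial_screening(1.4, 1.4, 2.8, lam=1.0)` is below 0.01 (the bond is almost broken), whereas any atom with rik + rjk − rij ≥ rc returns exactly 1.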

Finally, the promotion energy $$E_i^{(p)}$$ is taken in the form

$$E_i^{(p)} = - \sigma _i\left( {\mathop {\sum}\limits_{j \ne i} {S_{ij}} b_{ij}f_c(r_{ij})} \right)^{1/2}.$$
(11)

For a covalent material, $$E_i^{(p)}$$ accounts for the energy cost of changing the electronic structure of a free atom before it forms chemical bonds. For example, for group IV elements, this is the cost of the s²p² → sp³ hybridization. On the other hand, $$E_i^{(p)}$$ can be interpreted as the embedding energy

$$F(\bar \rho _i) = - \sigma _i\left( {\bar \rho _i} \right)^{1/2}$$
(12)

appearing in the EAM formalism1,2. Here, the host electron density on atom i is given by $$\bar \rho _i = \mathop {\sum}\nolimits_{j \ne i} {S_{ij}} b_{ij}\, f_c(r_{ij})$$. Due to this feature, this BOP model can be applied to both covalent and metallic systems.

The BOP functions depend on eight parameters Ai, Bi, αi, βi, ai, hi, σi, and λi, which constitute the parameter set (p1, ..., pm) with m = 8. The cutoff parameters were fixed at rc = 6 Å and d = 1.5 Å.
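Equations (6)–(11) can be assembled into a single routine for the energy of one atom. The following Python sketch is a direct, unoptimized transcription under stated assumptions: the function names and dictionary keys are hypothetical, neighbor positions are given relative to atom i in Å, and any parameter values passed to it are placeholders rather than the fitted Al values.

```python
import math

def cutoff(r, rc=6.0, d=1.5):
    """Truncation function f_c(r) of Eq. (5)."""
    if r >= rc:
        return 0.0
    x4 = (r - rc) ** 4
    return x4 / (d ** 4 + x4)

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def bop_atomic_energy(neighbors, p, rc=6.0, d=1.5):
    """E_i of Eq. (6) for atom i at the origin.

    p holds the eight parameters under the (assumed) keys
    A, alpha, B, beta, a, h, sigma, lam.
    """
    n = len(neighbors)
    r = [norm(v) for v in neighbors]

    # Screening factors S_ij, Eqs. (9) and (10).
    S = []
    for j in range(n):
        s = 1.0
        for k in range(n):
            if k == j:
                continue
            r_jk = norm([a - b for a, b in zip(neighbors[j], neighbors[k])])
            x = r[k] + r_jk - r[j]
            s *= 1.0 - cutoff(x, rc, d) * math.exp(-p["lam"] ** 2 * x)
        S.append(s)

    pair_sum = 0.0
    density = 0.0
    for j in range(n):
        # Screened bond count z_ij, Eq. (8), and bond order b_ij, Eq. (7).
        z = 0.0
        for k in range(n):
            if k == j:
                continue
            cos_t = sum(a * b for a, b in zip(neighbors[j], neighbors[k]))
            cos_t /= r[j] * r[k]
            z += S[k] * (cos_t + p["h"]) ** 2 * cutoff(r[k], rc, d)
        b = (1.0 + p["a"] ** 2 * z) ** -0.5
        fc = cutoff(r[j], rc, d)
        repulsive = math.exp(p["A"] - p["alpha"] * r[j])
        attractive = S[j] * b * math.exp(p["B"] - p["beta"] * r[j])
        pair_sum += (repulsive - attractive) * fc
        density += S[j] * b * fc

    # Promotion (embedding) energy, Eq. (11).
    return 0.5 * pair_sum - p["sigma"] * math.sqrt(density)
```

In a PINN potential the dictionary `p` would be produced per atom by the network, whereas in the fixed-parameter BOP model used for comparison it would be the same for every atom.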

The neural network and training procedures

The feedforward NN contained two hidden layers and had the 60 × 15 × 15 × 8 architecture for the PINN potential and 60 × 16 × 16 × 1 for the NN potential. The number of nodes in the hidden layers was chosen to reach the target accuracy of about 3–4 meV per atom without overfitting.
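The number of fitting parameters implied by these architectures can be checked with a few lines of Python, assuming fully connected layers with one bias per non-input node. This counting convention is an assumption, but it reproduces the 1225 parameters quoted in the Results for the 47 × 18 × 18 × 1 NN′ network.

```python
def count_parameters(layers):
    """Weights plus biases of a fully connected feedforward network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layers, layers[1:]))

pinn_params = count_parameters([60, 15, 15, 8])       # 1283
nn_params = count_parameters([60, 16, 16, 1])         # 1265
nn_prime_params = count_parameters([47, 18, 18, 1])   # 1225
```

Under this convention the PINN and NN networks differ by fewer than 20 parameters, consistent with the statement that they have about the same number of fitted parameters.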

The training/validation database consisted of DFT total energies for a set of supercells. The DFT calculations were performed using projector-augmented wave (PAW) pseudopotentials as implemented in the electronic structure Vienna Ab initio Simulation Package (VASP)55,56. The generalized gradient approximation (GGA) was used in conjunction with the Perdew, Burke, and Ernzerhof (PBE) density functional57,58. The plane-wave basis functions up to a kinetic energy cutoff of 520 eV were used, with the k-point density chosen to achieve convergence to a few meV per atom level. Further details of the DFT calculations can be found in refs. 20,21. The energy of a given supercell s, $$E^s = \mathop {\sum}\nolimits_i {E_i^s}$$, predicted by the potential was compared with the DFT energy $$E_{{\mathrm{DFT}}}^s$$. Note that the original $$E_{{\mathrm{DFT}}}^s$$ values were not corrected to remove the energy of a free atom. To facilitate comparison with literature data, prior to the training all DFT energies were uniformly shifted by 0.38446 eV per atom to match the experimental cohesive energy of Al, 3.36 eV per atom59. The NN was trained by adjusting its weights wεκ and biases bκ to minimize the objective function

$${\cal{E}} = \mathop {\sum}\limits_s {\left( {E^s - E_{{\mathrm{DFT}}}^s} \right)^2} + \tau \left( {\mathop {\sum}\limits_{\epsilon \kappa } {\left| {w_{\epsilon \kappa }} \right|^2} + \mathop {\sum}\limits_\kappa {\left| {b_\kappa } \right|^2} } \right) + \gamma \left( {\mathop {\sum}\limits_\eta {\left| {p_\eta - \bar p_\eta } \right|^2} } \right).$$
(13)
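Equation (13) can be sketched numerically as follows. The function name, argument layout, and toy values are hypothetical; the three terms are the squared energy misfit, the τ-weighted sum of squared weights and biases, and the γ-weighted squared deviation of the parameters from their reference values.

```python
import numpy as np

def objective(E_pred, E_dft, weights, biases, p, p_bar, tau, gamma):
    """Evaluate Eq. (13) for given supercell energies and network parameters."""
    # First term: squared energy errors summed over supercells s.
    misfit = np.sum((np.asarray(E_pred) - np.asarray(E_dft)) ** 2)
    # Second term: squared magnitudes of all weights and biases.
    l2 = (sum(np.sum(np.abs(W) ** 2) for W in weights)
          + sum(np.sum(np.abs(b) ** 2) for b in biases))
    # Third term: squared deviation of parameters p from reference values p_bar.
    drift = np.sum(np.abs(np.asarray(p) - np.asarray(p_bar)) ** 2)
    return misfit + tau * l2 + gamma * drift

# Toy numbers chosen only to exercise the function (not from the paper).
value = objective(
    E_pred=[-3.30, -3.40],
    E_dft=[-3.36, -3.36],
    weights=[np.array([[1.0, -2.0]])],
    biases=[np.array([0.5])],
    p=[1.0, 2.0],
    p_bar=[1.5, 2.0],
    tau=0.01,
    gamma=0.1,
)
```

For a pure NN potential without physics-based parameters, the same function applies with γ = 0.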

The second term was added to avoid overfitting by controlling the magnitudes of the weights and biases; the parameter τ sets the degree of regularization. The third term, weighted by γ, ensures that the variations of the PINN parameters relative to their database-averaged values $$\bar p_\eta$$ remain small. The minimization of $${\cal{E}}$$ was implemented with the Davidon–Fletcher–Powell algorithm of unconstrained optimization. The optimization was repeated several times starting from different random states, and the solution with the smallest $${\cal{E}}$$ was selected as final. The PINN and NN forces were computed by the finite-difference method.
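A finite-difference force evaluation can be sketched as below. The text does not say which difference scheme was used, so the central difference, the step size h, and the harmonic toy energy are all assumptions chosen for illustration; any function that maps atomic positions to a total energy could be passed in place of `toy_energy`.

```python
import numpy as np

def finite_difference_forces(energy_fn, positions, h=1e-4):
    """Approximate F = -dE/dr for every atomic coordinate by central differences."""
    positions = np.asarray(positions, dtype=float)
    forces = np.zeros_like(positions)
    for idx in np.ndindex(positions.shape):
        plus = positions.copy()
        minus = positions.copy()
        plus[idx] += h   # displace one coordinate forward
        minus[idx] -= h  # and backward
        forces[idx] = -(energy_fn(plus) - energy_fn(minus)) / (2.0 * h)
    return forces

def toy_energy(positions, k=2.0, r0=1.0):
    """Hypothetical harmonic pair energy, used only to exercise the routine."""
    n = len(positions)
    energy = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            r = np.linalg.norm(positions[i] - positions[j])
            energy += 0.5 * k * (r - r0) ** 2
    return energy

# Two atoms stretched beyond r0 attract each other along x.
pos = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
forces = finite_difference_forces(toy_energy, pos)
```

Each force component costs two energy evaluations in this scheme, so the approach is simple to implement but scales with the number of displaced coordinates.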

Data availability

All data that support the findings of this study are available in the Supplementary Information file or from the corresponding author upon reasonable request.

References

1. Daw, M. S. & Baskes, M. I. Embedded-atom method: derivation and application to impurities, surfaces, and other defects in metals. Phys. Rev. B 29, 6443–6453 (1984).
2. Daw, M. S. & Baskes, M. I. Semiempirical, quantum mechanical calculation of hydrogen embrittlement in metals. Phys. Rev. Lett. 50, 1285–1288 (1983).
3. Mishin, Y. in Handbook of Materials Modeling (ed. Yip, S.), Ch. 2.2, 459–478 (Springer, Dordrecht, 2005).
4. Baskes, M. I. Application of the embedded-atom method to covalent materials: a semi-empirical potential for silicon. Phys. Rev. Lett. 59, 2666–2669 (1987).
5. Mishin, Y., Mehl, M. J. & Papaconstantopoulos, D. A. Phase stability in the Fe-Ni system: investigation by first-principles calculations and atomistic simulations. Acta Mater. 53, 4029–4041 (2005).
6. Liang, T., Devine, B., Phillpot, S. R. & Sinnott, S. B. Variable charge reactive potential for hydrocarbons to simulate organic-copper interactions. J. Phys. Chem. A 116, 7976–7991 (2012).
7. Brenner, D. W. Empirical potential for hydrocarbons for use in simulating the chemical vapor deposition of diamond films. Phys. Rev. B 42, 9458–9471 (1990).
8. Brenner, D. W. The art and science of an analytical potential. Phys. Stat. Solidi (b) 217, 23–40 (2000).
9. Stuart, S. J., Tutein, A. B. & Harrison, J. A. A reactive potential for hydrocarbons with intermolecular interactions. J. Chem. Phys. 112, 6472–6486 (2000).
10. van Duin, A. C. T., Dasgupta, S., Lorant, F. & Goddard, W. A. ReaxFF: a reactive force field for hydrocarbons. J. Phys. Chem. A 105, 9396–9409 (2001).
11. Mishin, Y., Asta, M. & Li, J. Atomistic modeling of interfaces and their impact on microstructure and properties. Acta Mater. 58, 1117–1151 (2010).
12. Mueller, T., Kusne, A. G. & Ramprasad, R. in Reviews in Computational Chemistry (eds Parrill, A. L. & Lipkowitz, K. B.), Vol. 29, Ch. 4, 186–273 (Wiley, 2016).
13. Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).
14. Behler, J., Martonak, R., Donadio, D. & Parrinello, M. Metadynamics simulations of the high-pressure phases of silicon employing a high-dimensional neural network potential. Phys. Rev. Lett. 100, 185501 (2008).
15. Behler, J. Neural network potential-energy surfaces in chemistry: a tool for large-scale simulations. Phys. Chem. Chem. Phys. 13, 17930–17955 (2011).
16. Behler, J. Atom-centered symmetry functions for constructing high-dimensional neural network potentials. J. Chem. Phys. 134, 074106 (2011).
17. Behler, J. Constructing high-dimensional neural network potentials: a tutorial review. Int. J. Quant. Chem. 115, 1032–1050 (2015).
18. Behler, J. Perspective: machine learning potentials for atomistic simulations. J. Chem. Phys. 145, 170901 (2016).
19. Bartok, A., Payne, M. C., Kondor, R. & Csanyi, G. Gaussian approximation potentials: the accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett. 104, 136403 (2010).
20. Botu, V. & Ramprasad, R. Adaptive machine learning framework to accelerate ab initio molecular dynamics. Int. J. Quant. Chem. 115, 1074–1083 (2015).
21. Botu, V. & Ramprasad, R. Learning scheme to predict atomic forces and accelerate materials simulations. Phys. Rev. B 92, 094306 (2015).
22. Wood, M. A. & Thompson, A. P. Extending the accuracy of the SNAP interatomic potential form. J. Chem. Phys. 148, 241721 (2018).
23. Raff, L. M., Komanduri, R., Hagan, M. & Bukkapatnam, S. T. S. Neural Networks in Chemical Reaction Dynamics. (Oxford University Press, New York, 2012).
24. Blank, T. B., Brown, S. D., Calhoun, A. W. & Doren, D. J. Neural network models of potential energy surfaces. J. Chem. Phys. 103, 4129–4137 (1995).
25. Payne, M., Csanyi, G. & de Vita, A. in Handbook of Materials Modeling (ed. Yip, S.), 2763–2770 (Springer, Dordrecht, 2005).
26. Li, Z., Kermode, J. R. & De Vita, A. Molecular dynamics with on-the-fly machine learning of quantum-mechanical forces. Phys. Rev. Lett. 114, 096405 (2015).
27. Glielmo, A., Sollich, P. & de Vita, A. Accurate interatomic force fields via machine learning with covariant kernels. Phys. Rev. B 95, 214302 (2017).
28. Dawes, R., Thompson, D. L., Wagner, A. F. & Minkoff, M. Interpolating moving least-squares methods for fitting potential energy surfaces: a strategy for efficient automatic data point placement in high dimensions. J. Chem. Phys. 128, 084107 (2008).
29. Seko, A., Takahashi, A. & Tanaka, I. First-principles interatomic potentials for ten elemental metals via compressed sensing. Phys. Rev. B 92, 054113 (2015).
30. Mizukami, W., Habershon, S. & Tew, D. P. A compact and accurate semi-global potential energy surface for malonaldehyde from constrained least squares regression. J. Chem. Phys. 141, 144310 (2015).
31. Chmiela, S., Sauceda, H. E., Muller, K. R. & Tkatchenko, A. Towards exact molecular dynamics simulations with machine-learned force fields. Nat. Commun. 9, 3887 (2018).
32. Bholoa, A., Kenny, S. D. & Smith, R. A new approach to potential fitting using neural networks. Nucl. Instrum. Methods Phys. Res. 255, 1–7 (2007).
33. Sanville, E., Bholoa, A., Smith, R. & Kenny, S. D. Silicon potentials investigated using density functional theory fitted neural networks. J. Phys. Condens. Matter 20, 285219 (2008).
34. Eshet, H., Khaliullin, R. Z., Kuhle, T. D., Behler, J. & Parrinello, M. Ab initio quality neural-network potential for sodium. Phys. Rev. B 81, 184107 (2010).
35. Handley, C. M. & Popelier, P. L. A. Potential energy surfaces fitted by artificial neural networks. J. Phys. Chem. A 114, 3371–3383 (2010).
36. Sosso, G. C., Miceli, G., Caravati, S., Behler, J. & Bernasconi, M. Neural network interatomic potential for the phase change material GeTe. Phys. Rev. B 85, 174103 (2012).
37. Schutt, K. T., Sauceda, H. E., Kindermans, P. J., Tkatchenko, A. & Muller, K. R. SchNet: a deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).
38. Imbalzano, G. et al. Automatic selection of atomic fingerprints and reference configurations for machine-learning potentials. J. Chem. Phys. 148, 241730 (2018).
39. Bartok, A. P., Kermode, J., Bernstein, N. & Csanyi, G. Machine learning a general purpose interatomic potential for silicon. Phys. Rev. X 8, 041048 (2018).
40. Malshe, M. et al. Parametrization of analytic interatomic potential functions using neural networks. J. Chem. Phys. 129, 044111 (2008).
41. Tersoff, J. New empirical approach for the structure and energy of covalent systems. Phys. Rev. B 37, 6991–7000 (1988).
42. Tersoff, J. Empirical interatomic potential for silicon with improved elastic properties. Phys. Rev. B 38, 9902–9905 (1988).
43. Tersoff, J. Modeling solid-state chemistry: interatomic potentials for multicomponent systems. Phys. Rev. B 39, 5566–5568 (1989).
44. Bereau, T., Andrienko, D. & von Lilienfeld, O. A. Transferable atomic multipole machine learning models for small organic molecules. J. Chem. Theor. Comput. 11, 3225–3233 (2015).
45. Bereau, T., DiStasio, R. A., Tkatchenko, A. & von Lilienfeld, O. A. Non-covalent interactions across organic and biological subsets of chemical space: physics-based potentials parametrized from machine learning. J. Chem. Phys. 148, 241706 (2018).
46. Kranz, J. J., Kubillus, M., Ramakrishnan, R. & von Lilienfeld, O. A. Generalized density-functional tight-binding repulsive potentials from unsupervised machine learning. J. Chem. Theor. Comput. 14, 2341–2352 (2018).
47. Glielmo, A., Zeni, C. & de Vita, A. Efficient nonparametric n-body force fields from machine learning. Phys. Rev. B 97, 184307 (2018).
48. Hornik, K., Stinchcombe, M. & White, H. Multilayer feedforward networks are universal approximators. Neural Netw. 2, 359–366 (1989).
49. Pinkus, A. Approximation theory of the MLP model in neural networks. Acta Numer. 8, 143–195 (1999).
50. Oloriegbe, S. Y. Hybrid Bond-Order Potential for Silicon. Ph.D. thesis (Clemson University, Clemson, 2008).
51. Gillespie, B. A. et al. Bond-order potential for silicon. Phys. Rev. B 75, 155207 (2007).
52. Drautz, R. et al. Analytic bond-order potentials for modelling the growth of semiconductor thin films. Prog. Mater. Sci. 52, 196–229 (2007).
53. Kolb, B., Lentz, L. C. & Kolpak, A. M. Discovering charge density functionals and structure-property relationships with PROPhet: a general framework for coupling machine learning and first-principles methods. Sci. Rep. 7, 1192 (2017).
54. Mishin, Y., Farkas, D., Mehl, M. J. & Papaconstantopoulos, D. A. Interatomic potentials for monoatomic metals from experimental data and ab initio calculations. Phys. Rev. B 59, 3393–3407 (1999).
55. Kresse, G. & Furthmüller, J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comput. Mat. Sci. 6, 15 (1996).
56. Kresse, G. & Joubert, D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys. Rev. B 59, 1758 (1999).
57. Perdew, J. P. et al. Atoms, molecules, solids, and surfaces: applications of the generalized gradient approximation for exchange and correlation. Phys. Rev. B 46, 6671–6687 (1992).
58. Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
59. Kittel, C. Introduction to Solid State Physics. (Wiley-Interscience, New York, 1986).
60. Touloukian, Y. S., Kirby, R. K., Taylor, R. E. & Desai, P. D. (eds.) Thermal Expansion: Metallic Elements and Alloys, Vol. 12 (Plenum, New York, 1975).
61. de Jong, M. et al. Charting the complete elastic properties of inorganic crystalline compounds. Sci. Data 2, 150009 (2015).
62. Tran, R. et al. Surface energies of elemental crystals. Sci. Data 3, 160080 (2016).
63. Qiu, R. et al. Energetics of intrinsic point defects in aluminium via orbital-free density functional theory. Philos. Mag. 97, 2164–2181 (2017).
64. Zhuang, H., Chen, M. & Carter, E. A. Elastic and thermodynamic properties of complex Mg-Al intermetallic compounds via orbital-free density functional theory. Phys. Rev. Appl. 5, 064021 (2016).
65. Iyer, M., Gavini, V. & Pollock, T. M. Energetics and nucleation of point defects in aluminum under extreme tensile hydrostatic stresses. Phys. Rev. B 89, 014108 (2014).
66. Sjostrom, T., Crockett, S. & Rudin, S. Multiphase aluminum equations of state via density functional theory. Phys. Rev. B 94, 144101 (2016).
67. Devlin, J. F. Stacking fault energies of Be, Mg, Al, Cu, Ag, and Au. J. Phys. F: Met. Phys. 4, 1865 (1974).
68. Ogata, S., Li, J. & Yip, S. Ideal pure shear strength of aluminum and copper. Science 298, 807–811 (2002).
69. Jahnatek, M., Hafner, J. & Krajci, M. Shear deformation, ideal strength, and stacking fault formation of fcc metals: a density-functional study of Al and Cu. Phys. Rev. B 79, 224103 (2009).
70. Kibey, S., Liu, J. B., Johnson, D. D. & Sehitoglu, H. Predicting twinning stress in fcc metals: linking twin-energy pathways to twin nucleation. Acta Mater. 55, 6843–6851 (2007).

Acknowledgements

We are grateful to Dr. James Hickman for performing some of the additional Al DFT calculations used in this work. We are also grateful to Dr. Vesselin Yamakov for numerous helpful discussions, the development of a software package for PINN-based simulations, and for benchmarking the computational speed of the method. The authors acknowledge support of the Office of Naval Research under Awards No. N00014-18-1-2612 (G.P.P.P. and Y.M.) and N00014-17-1-2148 (R.B. and R.R.). This work was also supported in part by a grant of computer time from the DoD High Performance Computing Modernization Program at ARL DSRC, ERDC DSRC and Navy DSRC.

Author information

Contributions

Y.M. developed the PINN theory and initiated this research project. G.P.P.P. wrote the computer software for the NN and PINN potential training, validation, and testing under Y.M.’s direction and supervision. He also created the Al NN and PINN potentials reported in this paper and tested their properties. R.B. generated much of the DFT data for Al used in this work under R.R.’s advice and supervision. Y.M. wrote the initial draft of the manuscript. All co-authors were engaged in discussions, contributed ideas at all stages of the work, participated in the manuscript editing, and approved its final version.

Corresponding author

Correspondence to Y. Mishin.

Ethics declarations

Competing interests

The authors declare no competing interests.

Journal peer review information: Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Pun, G.P.P., Batra, R., Ramprasad, R. et al. Physically informed artificial neural networks for atomistic modeling of materials. Nat Commun 10, 2339 (2019). https://doi.org/10.1038/s41467-019-10343-5
