Abstract
Machine learning potentials have become an important tool for atomistic simulations in many fields, from chemistry via molecular biology to materials science. Most of the established methods, however, rely on local properties and are thus unable to take global changes in the electronic structure into account, which result from long-range charge transfer or different charge states. In this work we overcome this limitation by introducing a fourth-generation high-dimensional neural network potential that combines a charge equilibration scheme employing environment-dependent atomic electronegativities with accurate atomic energies. The method, which is able to correctly describe global charge distributions in arbitrary systems, yields much improved energies and substantially extends the applicability of modern machine learning potentials. This is demonstrated for a series of systems representing typical scenarios in chemistry and materials science that are incorrectly described by current methods, while the fourth-generation neural network potential is in excellent agreement with electronic structure calculations.
Introduction
Computer simulations nowadays have become an important tool in many fields of science like chemistry, molecular biology, physics, and materials science. The quality, and thus the predictive power, of the results obtained in these simulations crucially depends on the accurate description of the atomic interactions. While electronic structure methods like density functional theory (DFT) provide a reliable description of many types of systems, the high computational costs of DFT restrict its application in molecular dynamics (MD)^{1} and Monte Carlo^{2} simulations to a few hundred atoms, preventing the investigation of many interesting phenomena. Larger systems can be studied by more efficient atomistic potentials, which avoid solving the electronic structure problem on-the-fly but instead provide a direct functional relation between the atomic positions and the potential energy. Atomistic potential energy surfaces (PESs) have been developed for many types of systems, and most of these potentials are based on physical approximations, which necessarily limit the accuracy of the obtained results.
With the advent of machine learning (ML) potentials^{3,4,5,6,7} in recent years an alternative approach to the construction of PESs has emerged, which makes it possible to combine the accuracy of quantum mechanical electronic structure calculations with the efficiency of simple empirical potentials. Many types of ML potentials have been proposed to date, like neural network potentials^{8,9,10,11,12}, Gaussian approximation potentials (GAPs)^{13}, moment tensor potentials (MTPs)^{14}, spectral neighbor analysis potentials (SNAPs)^{15}, and many others^{16,17}.
ML potentials can be classified into four different generations. Starting with the work of Doren and coworkers published in 1995^{8}, the first generation (1G) of ML potentials^{18,19} has been applicable to low-dimensional systems depending on the positions of a few atoms only. This restriction has been overcome in high-dimensional neural network potentials (HDNNPs) proposed by Behler and Parrinello in 2007^{9}, which represented the first ML potential of the second generation (2G). In this generation, which employs the concept of nearsightedness^{20}, the total energy of the system is constructed as a sum of atomic energies, which depend on the local chemical environment up to a cutoff radius and, in the case of HDNNPs, are computed by individual atomic neural networks. Most modern ML potentials making use of different ML algorithms, like HDNNPs, GAPs, MTPs, and SNAPs, belong to this second generation, and as standard methods for atomistic simulations they have been successfully applied to a wide range of systems.
A limitation of 2G ML potentials, which are applicable to tens of thousands of atoms, is the neglect of long-range interactions beyond the cutoff radius: not only electrostatics but also dispersion interactions, which can accumulate substantially in condensed systems, are often truncated. This possible source of error, in particular for ionic systems, has been recognized early, and electrostatic corrections based on fixed charges have been proposed^{13,21}. In more flexible third-generation (3G) ML potentials, long-range electrostatic interactions are included by constructing environment-dependent atomic charges, which in the case of 3G-HDNNPs are expressed by a second set of atomic neural networks^{22,23}. These charges can then be used in standard algorithms like the Ewald sum to compute the full long-range electrostatic energy. Owing to the additional effort in constructing and using 3G ML potentials, most applications have been reported for molecular systems^{12,24,25}, while in simulations of condensed systems they are rarely used, as long-range electrostatic interactions are often efficiently screened.
A remaining limitation of 3G ML potentials is their inability to describe long-range charge transfer and different charge states of a system, since the atomic partial charges are expressed as a function of the local chemical environment only. Neglecting non-local charge transfer and changes in the global charge distribution, which can be important in many systems^{26,27}, can result in qualitative failures, as illustrated in Fig. 1 for the molecular model system XC_{7}H_{7}O displayed in panel a. Depending on the choice of the functional group X in b, like an amino group NH_{2} or its protonated form NH\({\,}_{3}^{+}\), different partial charges, which we use in this work as a qualitative fingerprint of the electronic structure, are obtained, as shown in the plots of the DFT Hirshfeld charges on the right-hand side. In particular, the charge of the right oxygen atom depends on the choice of X, although X is far outside its local atomic environment, displayed as a dashed circle. As a consequence, ML potentials relying on a local description, like 2G- and 3G-HDNNPs, cannot distinguish these systems, and the same charge is assigned to the right oxygen in both molecules, which is chemically incorrect. A second case is illustrated in Fig. 1c. Here, the OH group on the left is deprotonated, resulting in a negative ion with two oxygen atoms almost equally sharing the negative charge. This charge is very different from the charge at the carbonyl oxygen of the neutral molecule. Still, the local environment of the carbonyl oxygen atom is identical in both cases, which is why 2G and 3G ML potentials cannot be applied to multiple charge states.
This limitation of local atomistic potentials in the description of long-range charge transfer and of systems in different charge states has been recognized for some time, and for simple empirical force fields different solutions have been proposed^{28,29,30,31}. In the context of ML potentials, the first method proposed to address this problem is the charge equilibration via neural network technique (CENT)^{32,33,34}. In this method, a charge equilibration^{28} scheme is applied, which allows for a global redistribution of the charge over the full system to minimize a charge-dependent total energy expression. The charges are based on atomic electronegativities, which are determined as a function of the local chemical environment and expressed by atomic neural networks, similar to the charges in 3G-HDNNPs. This method enabled the inclusion of long-range charge transfer in a ML framework for the first time, but due to the employed energy expression it is primarily applicable to ionic systems^{35,36,37}, and its overall accuracy is still lower than that of other state-of-the-art ML potentials. Recently, another promising method has been proposed by Xie, Persson, and Small^{38}, aiming for a correct description of systems with different charge states. In this method, atomic neural networks are used that depend not only on the local structure but also on atomic populations, which are determined in a self-consistent process. The training data for different populations have been generated using constrained DFT calculations, and a first application to Li_{n}H_{n} clusters has been reported. Furthermore, an extension of the AIMNet method has been proposed^{39}, which can be used to predict energies and atomic charges for systems with non-zero total charge. Here, the interaction range between atoms is increased through iterative updates during which information is passed between nearby atoms. Although the resulting charges are not used to calculate explicit Coulomb interactions, many related quantities, such as electronegativities, ionization potentials, or condensed Fukui functions, can be derived.
In the present work, we propose a general solution to the limitations of current ML potentials by introducing a fourth-generation (4G) HDNNP, which is applicable to long-range charge transfer and multiple charge states. It consists of highly accurate short-range atomic energies similar to those used in 2G-HDNNPs and charges determined from a charge equilibration method relying on electronegativities in the spirit of the CENT approach. Both the short-range atomic energies and the electronegativities are expressed by atomic neural networks as a function of the chemical environments. The capabilities of the method are illustrated for a series of model systems showcasing typical scenarios in chemistry and materials science that cannot be correctly described by conventional ML potentials. For all these systems we demonstrate that 4G-HDNNPs trained to DFT data are able to provide reliable energies, forces, and charges in excellent agreement with electronic structure calculations. At the beginning of the following section the methodology of 4G-HDNNPs is introduced and its relation to other generations of HDNNPs and to the CENT method is discussed. After that, the results for a series of periodic and non-periodic benchmark systems are presented, including a detailed comparison to the performance of 2G- and 3G-HDNNPs. We show that previous generations of HDNNPs, which are unable to take distant structural changes into account, yield inaccurate energies and forces, and that even distinct local minima of the PES can be missed, which are correctly resolved by the 4G-HDNNP. These results are general and apply equally to other types of 2G ML potentials.
Results
4G-HDNNP
The overall structure of the 4G-HDNNP is shown schematically in Fig. 2 for an arbitrary binary system. As in 3G-HDNNPs, the total energy consists of a short-range part, which, as we will see below, requires additional non-local information, and an electrostatic long-range part, which is not truncated,

\({E}_{{\rm{tot}}}={E}_{{\rm{short}}}+{E}_{{\rm{elec}}}.\)
The electrostatic part E_{elec}(R, Q) depends on a set of atomic charges \({\bf{Q}}=\left\{{Q}_{i}\right\}\), which are trained to reference charges obtained in DFT calculations, and on the positions of the atoms \({\bf{R}}=\left\{{{\bf{R}}}_{i}\right\}\). An important difference to 3G-HDNNPs is that these charges are not directly expressed by atomic neural networks as a function of the local atomic environments; instead, they are obtained indirectly from a charge equilibration scheme based on atomic electronegativities {χ_{i}} that are adjusted to yield charges in agreement with the DFT reference charges. Here we choose Hirshfeld charges^{40} as the reference, but many other choices are in principle possible.
As in the CENT approach, the atomic electronegativities are local properties defined as a function of the atomic environments using atomic neural networks. As in 2G- and 3G-HDNNPs, there is one type of atomic neural network with a fixed architecture per element in the system, making all atoms of the same type chemically equivalent, while the specific values of the electronegativities depend on the positions of all neighboring atoms inside a cutoff sphere of radius R_{c}. The positions of the neighboring atoms inside this sphere are specified by a vector G_{i} of atom-centered symmetry functions^{41}, which ensures the translational, rotational, and permutational invariance of the electronegativities.
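The invariance properties of such a descriptor can be illustrated with a minimal radial symmetry function. The following is a generic sketch of a Behler-type radial function with a cosine cutoff; the parameter values (η, r_s) are purely illustrative, not those used in this work:

```python
import numpy as np

def cutoff(r, r_c):
    """Cosine cutoff f_c(r): decays smoothly to zero at r_c and is
    exactly zero beyond it, so distant atoms cannot contribute."""
    return np.where(r < r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)

def radial_sf(r_ij, eta, r_s, r_c):
    """Radial atom-centered symmetry function G_i for one central atom:
    a sum of Gaussians over all neighbor distances r_ij, damped by f_c.
    Summing over neighbors makes the value permutationally invariant."""
    r_ij = np.asarray(r_ij, dtype=float)
    return float(np.sum(np.exp(-eta * (r_ij - r_s) ** 2) * cutoff(r_ij, r_c)))
```

Because only interatomic distances enter, translational and rotational invariance are automatic, and the sum over neighbors removes any dependence on atom ordering.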
To predict the atomic charges, which are represented by Gaussian charge densities of width σ_{i} taken from the covalent radii of the respective elements, a charge equilibration scheme^{42} is used. In this scheme, the charge is distributed among the atoms such that the energy expression

\({E}_{{\rm{Qeq}}}=\mathop{\sum }\nolimits_{i}\left[{\chi }_{i}{Q}_{i}+\frac{1}{2}{J}_{i}{Q}_{i}^{2}\right]+{E}_{{\rm{elec}}}\)

is minimized,
with E_{elec} being the electrostatic energy of the Gaussian charges and J_{i} the element-specific hardness. The J_{i} do not depend on the chemical environment and are constant for each element. While they are chosen manually in the CENT method, we optimize them during training; they are hence treated as free parameters like the weights and biases of the neural networks. For the electrostatic energy we then obtain

\({E}_{{\rm{elec}}}=\mathop{\sum }\nolimits_{i<j}\frac{{\rm{erf}}\left({r}_{ij}/\sqrt{2}{\gamma }_{ij}\right)}{{r}_{ij}}{Q}_{i}{Q}_{j}+\mathop{\sum }\nolimits_{i}\frac{{Q}_{i}^{2}}{2{\sigma }_{i}\sqrt{\pi }}\)
with \({\gamma }_{ij}=\sqrt{{\sigma }_{i}^{2}+{\sigma }_{j}^{2}}.\)
To solve this minimization problem, the derivatives of E_{Qeq} with respect to the charges Q_{i} are calculated and set to zero,

\(\frac{\partial {E}_{{\rm{Qeq}}}}{\partial {Q}_{i}}={\chi }_{i}+\mathop{\sum }\nolimits_{j}{A}_{ij}{Q}_{j}=0,\)
where the elements of the matrix A are given by

\({A}_{ij}=\left\{\begin{array}{ll}{J}_{i}+\frac{1}{{\sigma }_{i}\sqrt{\pi }}&{\rm{for}}\ i=j\\ \frac{{\rm{erf}}\left({r}_{ij}/\sqrt{2}{\gamma }_{ij}\right)}{{r}_{ij}}&{\rm{for}}\ i\ne j.\end{array}\right.\)
Considering the constraint that the sum of all charges must be equal to the total charge Q_{tot} of the system, which is included via a Lagrange multiplier λ, the following set of linear equations is solved:

\(\left(\begin{array}{cc}{\bf{A}}&{\bf{1}}\\ {{\bf{1}}}^{T}&0\end{array}\right)\left(\begin{array}{c}{\bf{Q}}\\ \lambda \end{array}\right)=\left(\begin{array}{c}-{\boldsymbol{\chi }}\\ {Q}_{{\rm{tot}}}\end{array}\right).\)
Highly optimized algorithms are available for such systems of linear equations, which can be solved for small and medium-sized systems containing up to about ten thousand atoms in a few seconds on modern hardware. For larger systems, the cubic scaling of the standard algorithms can become a bottleneck. In that case one can resort to iterative solvers, for which the most expensive part of each iteration is a matrix-vector multiplication involving the matrix A. This corresponds to the evaluation of the electrostatic potential at each atom's position, for which numerous low-complexity algorithms, such as fast multipole methods, are known. In this way the effort can be reduced from cubic to nearly linear scaling, providing access to very large systems.
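For a small free-boundary system, the constrained charge equilibration described above can be sketched with a dense solver. This is a toy implementation of the augmented linear system (no periodic Ewald summation, atomic units implied); all parameter values are illustrative only:

```python
import math
import numpy as np

def qeq_charges(pos, chi, J, sigma, q_tot=0.0):
    """Solve the charge-equilibration linear system for Gaussian charges,
    with a Lagrange multiplier enforcing sum(Q) = q_tot.
    pos: (N,3) atomic positions; chi: electronegativities;
    J: atomic hardness values; sigma: Gaussian charge widths."""
    n = len(chi)
    A = np.zeros((n + 1, n + 1))
    for i in range(n):
        # diagonal: hardness plus Gaussian self-interaction 1/(sigma*sqrt(pi))
        A[i, i] = J[i] + 1.0 / (sigma[i] * math.sqrt(math.pi))
        for j in range(i + 1, n):
            r = np.linalg.norm(pos[i] - pos[j])
            gamma = math.sqrt(sigma[i] ** 2 + sigma[j] ** 2)
            # off-diagonal: error-function-screened Coulomb interaction
            A[i, j] = A[j, i] = math.erf(r / (math.sqrt(2.0) * gamma)) / r
    A[:n, n] = A[n, :n] = 1.0            # constraint row and column
    b = np.concatenate([-np.asarray(chi, dtype=float), [q_tot]])
    q = np.linalg.solve(A, b)
    return q[:n]                          # last entry is the Lagrange multiplier
```

For two identical atoms the solution splits the total charge evenly, while a difference in electronegativity drives charge from the less to the more electronegative atom, which is exactly the global redistribution mechanism the method relies on.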
Overall, this process is similar to CENT^{32}, but the main difference lies in the training. In CENT, only the error with respect to the DFT energies is minimized, and the atomic charges obtained during the charge equilibration process serve merely as intermediate quantities without a strict physical meaning. In the 4G-HDNNP proposed in this work, the charges are trained directly to reproduce reference charges from DFT and are therefore qualitatively meaningful, although one should be aware that atomic partial charges are not physical observables and different partitioning schemes can yield different numerical values^{43}.
Once the atomic electronegativities have been learned, a functional relation between the atomic structure and the atomic partial charges is available. The intermediate global charge equilibration step ensures that these charges depend on the atomic positions, chemical composition, and total charge of the entire system; thus, in contrast to 3G-HDNNPs, non-local charge transfer is naturally included.
In a second step, the local atomic energy contributions yielding the short-range energy according to

\({E}_{{\rm{short}}}=\mathop{\sum }\nolimits_{i}{E}_{i}\)
have to be determined. As in 2G-HDNNPs, the short-range atomic energies are provided by individual atomic neural networks based on information about the chemical environments. An important difference to 2G-HDNNPs is that the atomic energies additionally depend on non-local information, which is provided to the short-range atomic neural networks by using not only the atom-centered symmetry function values describing the positions of the neighboring atoms inside the cutoff spheres, but also the atomic partial charges determined in the first step (see Fig. 2). This information is required to take into account changes in the local electronic structure resulting from possible long-range charge transfer, which has an immediate effect on the local many-body interactions.
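The role of the charge as an additional input can be illustrated with a toy single-hidden-layer atomic network. The architecture, dimensions, and random weights below are hypothetical stand-ins for the actual atomic neural networks; the point is only that appending Q_i to the descriptor lets atoms with identical local environments receive different energies:

```python
import numpy as np

rng = np.random.default_rng(0)

def atomic_energy_nn(G_i, Q_i, weights):
    """Toy short-range atomic NN: input = symmetry-function vector G_i
    extended by the equilibrated atomic charge Q_i (the 4G ingredient).
    weights = (W1, b1, W2, b2) of a single-hidden-layer network."""
    x = np.append(G_i, Q_i)            # charge enters as one extra input node
    W1, b1, W2, b2 = weights
    h = np.tanh(W1 @ x + b1)           # hidden layer
    return float(W2 @ h + b2)          # scalar atomic energy E_i

# hypothetical dimensions: 4 symmetry functions + 1 charge input, 5 hidden nodes
n_in, n_hidden = 5, 5
weights = (rng.normal(size=(n_hidden, n_in)), np.zeros(n_hidden),
           rng.normal(size=n_hidden), 0.0)
```

With this construction, two atoms whose symmetry-function vectors G_i are identical but whose equilibrated charges differ (e.g., the distant carbon atoms of the neutral and protonated molecules discussed below) are no longer forced to have identical atomic energies.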
The short-range atomic neural networks are then trained to express the remaining part of the total energy E_{ref} according to

\({E}_{{\rm{short}}}={E}_{{\rm{ref}}}-{E}_{{\rm{elec}}},\)
where the electrostatic energy is determined from the partial charges resulting from the fitted atomic electronegativities. Thus, by construction the goal of the short-range part is to represent all energy contributions that are not covered by the electrostatic energy, such that double counting is avoided. In addition to the energies, the forces are also used for determining the parameters of the short-range atomic neural networks. We note that since the short-range energy depends on the atomic charges, which in turn are functions of all atomic coordinates, the derivatives ∂E_{short}/∂Q_{i} as well as ∂Q_{i}/∂R have to be considered in the computation of the forces. Details on how these contributions can be computed efficiently, as well as many other details of the 4G-HDNNP method, can be found in the supplementary methods.
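The chain-rule structure of the forces mentioned above can be written out explicitly. The following sketch is consistent with the energy decomposition used in the text, with α labeling one Cartesian coordinate of any atom:

```latex
% total force: direct term plus chain-rule terms through the charges
F_\alpha = -\frac{\partial E_{\mathrm{tot}}}{\partial R_\alpha}
         = -\frac{\partial E_{\mathrm{short}}}{\partial R_\alpha}
           - \sum_i \frac{\partial E_{\mathrm{short}}}{\partial Q_i}\,
                    \frac{\partial Q_i}{\partial R_\alpha}
           - \frac{\partial E_{\mathrm{elec}}}{\partial R_\alpha}
% the charge derivatives follow from differentiating the
% charge-equilibration condition  A Q = -\chi  with respect to R_\alpha:
\mathbf{A}\,\frac{\partial \mathbf{Q}}{\partial R_\alpha}
  = -\frac{\partial \boldsymbol{\chi}}{\partial R_\alpha}
    - \frac{\partial \mathbf{A}}{\partial R_\alpha}\,\mathbf{Q}
```

The second relation shows why moving any atom changes all charges: the linear system couples every Q_i to every electronegativity and every interatomic distance.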
In summary, in contrast to the CENT method, the short-range interactions are not described through the charges resulting from the charge equilibration process but by separate short-range neural networks, which enables a more accurate description of the total energy.
Overview of test systems
In the following subsections we demonstrate the limitations of ML potentials based on local properties only and show how they can be overcome by the 4G-HDNNP. For this purpose we use a set of non-periodic and periodic systems, which cover a wide range of typical situations in chemistry and materials science. The non-periodic systems consist of a covalent organic molecule, a small metal cluster, and a cluster of an ionic material, covering very different types of atomic interactions. These examples demonstrate the simultaneous applicability of a single 4G-HDNNP to systems with different total charges and the correct description of long-range charge transfer and the associated electrostatic energy. As a periodic system we have chosen a small gold cluster adsorbed on a MgO(001) slab, which is a prototypical example for heterogeneous catalysis. We show that, in contrast to established ML potentials, the 4G-HDNNP is able to reproduce the change in adsorption geometry of the cluster if dopant atoms are introduced in the slab far away from the cluster. In all cases, the 4G-HDNNP PES is very close to the results obtained from DFT.
While in these examples we do not explicitly investigate the transferability of the potentials to different systems, we expect that the 4G-HDNNP in general provides improved transferability compared to 2G and 3G ML potentials due to the underlying physical description of the global charge distribution and the resulting electrostatic energy. This expectation is supported by the fact that even traditional charge equilibration schemes with constant electronegativities are known to work well across different systems^{44}. Furthermore, for the related CENT approach a broad transferability has already been demonstrated for different atomic environments^{33}.
A benchmark for organic molecules
The first model system we study is a linear organic molecule consisting of a chain of ten sp-hybridized carbon atoms terminated by two hydrogen atoms, as shown in Fig. 3a. Molecules of this type have been studied before in electronic structure calculations^{45,46,47}. For this molecule we will now demonstrate the applicability of 4G-HDNNPs to systems with long-range charge transfer induced by protonation, which changes the total charge and the local structure in one part of the system. Since the majority of existing machine learning potentials rely on local structural information only, without explicit information about the global charge distribution and total charge, they are not simultaneously applicable to both neutral and charged systems.
This is different for 4G-HDNNPs, which naturally include the correct long-range electrostatic energy for any global charge present in the training set. Because of the protonation of the terminal carbon atom, its hybridization state changes to sp^{2} and the electronic structure of the resulting C_{10}H\({\,}_{3}^{+}\) cation is modified even at very large distances along the whole molecule. This is reflected in the differences between the DFT charges of the two molecules in Fig. 3b, both of which have been structurally optimized by DFT. The geometries of both molecules are given in the supplementary tables.
Using a data set containing both molecules, we have constructed 2G-, 3G-, and 4G-HDNNPs using a cutoff radius R_{c} = 4.23 Å, as illustrated by the circle in Fig. 3a for the example of the left carbon atom. In Fig. 3c we show the atomic partial charges obtained with the 3G-HDNNP in two forms: first as unscaled charges directly obtained from the atomic neural network fits without any constraint on the correct total charge of the system, and second rescaled to ensure total charges of zero or one, respectively. It can be seen that the scaling process does not significantly improve the 3G-HDNNP charges.
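For reference, such a rescaling of directly predicted charges can be done in several ways. A minimal sketch assuming the residual charge is distributed evenly over all atoms (the actual scheme used for the 3G-HDNNP comparison may differ) is:

```python
import numpy as np

def rescale_charges(q_raw, q_tot):
    """Enforce the correct total charge on directly predicted (3G-style)
    atomic charges by shifting every atom by the same residual per atom.
    Note: this fixes only the sum; it cannot repair charges that are
    qualitatively wrong, as discussed in the text."""
    q_raw = np.asarray(q_raw, dtype=float)
    return q_raw + (q_tot - q_raw.sum()) / len(q_raw)
```

Because the correction is uniform, the relative pattern of the predicted charges is unchanged, which is why rescaling alone cannot recover the correct charge distribution when the underlying local model averages over contradictory environments.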
The atoms in the left half of the molecule are far from the added proton, such that their atomic environments differ only slightly due to the DFT geometry optimization. In addition, for these atoms the training set contains many essentially identical environments with different atomic charges, which results in high fitting errors due to the contradictory information. As a consequence, the neural networks assign averaged charges to these atoms, which differ qualitatively from the DFT reference charges of both systems. For instance, the 3G-HDNNP partial charges on atom 2, i.e., the left carbon atom, are almost identical in both molecules although they are very different in DFT. Note that the predicted charges of atoms 1–6 in C_{10}H_{2} and C_{10}H\({\,}_{3}^{+}\) would even be exactly identical had the latter molecule not been relaxed after protonation. The charges obtained with the 4G-HDNNP shown in Fig. 3d, on the other hand, match the DFT charges very accurately for both molecules, as the two systems can be distinguished in this method.
The inaccurate charges obtained with the 3G-HDNNP lead to a poor quality of the potential energy surface, and the same is observed for the short-range-only 2G-HDNNP. In Table 1 we compare the errors of the total energies as well as the mean errors of the atomic charges and forces of all HDNNP generations for the DFT-optimized structures. It can be seen that the errors of all quantities obtained with the 4G-HDNNP are much lower than for the 2G- and 3G-HDNNPs. Further, we note that in several cases the energies obtained by the 3G-HDNNP are even worse than those of the 2G-HDNNP, as the unphysical charge distribution to some extent prevents an accurate representation of the energy.
To investigate the forces in more detail, in Fig. 4 we plot the individual atomic forces in both molecules using the 2G-HDNNP and the 4G-HDNNP for the DFT-optimized structures. For all atoms in both molecules the 4G-HDNNP yields very low force errors, with an average error of only 0.037 eV/Å, underlining the quality of this PES. For the 2G-HDNNP, however, the force errors for the atoms in the left half of C_{10}H\({\,}_{3}^{+}\) and for all atoms in C_{10}H_{2} are significantly larger. The reason is again that the 2G-HDNNP cannot distinguish both molecules for these atoms; the force errors are only low close to the extra proton in C_{10}H\({\,}_{3}^{+}\), which can be recognized as a distinct local structural feature in the atomic environments of the right half of this molecule.
Interestingly, the relatively high errors of the 2G-HDNNP forces are not matched by high energy errors, which instead are surprisingly low and smaller than 1 meV/atom for both molecules. This suggests that the total energy predicted by 2G-HDNNPs may benefit from error compensation, in that the atomic energies in the right half of C_{10}H\({\,}_{3}^{+}\) are adjusted to compensate the deficiencies of the atomic energies in the left half of the molecule.
Metal clusters: Ag_{3}
In this example, we investigate a small metal cluster, Ag_{3}, in two different charge states. The potential energy surface of small clusters is strongly influenced by the ionization state, and the ground-state structure can differ as a function of the total charge of the cluster^{48,49,50,51}. Owing to the small system size there are no long-range effects, and the full system is included in each atomic environment. Therefore, in principle, 2G-HDNNPs should be perfectly suited to describe the PES of Ag_{3}, but this is only true as long as the total charge of the system does not change, since for a combination of data with different total charges, like Ag\({\,}_{3}^{+}\) and Ag\({\,}_{3}^{-}\), in the training set the unique relation between atomic positions and the energy is lost. The minimum-energy structures of both cluster ions obtained from DFT are shown in Fig. 5a along with the atomic partial charges. After training a 2G-HDNNP and a 4G-HDNNP to data containing both types of clusters, we have reoptimized the geometries with the respective HDNNP generation. As expected, the minima obtained with the 2G-HDNNP (Fig. 5b) are identical for both charge states, but do not agree with either of the DFT structures. The 4G-HDNNP, on the other hand, which in addition to the structural information also takes the total charge and the resulting partial charges into account, is able to predict the minima and also the atomic partial charges of both systems with very high accuracy (Fig. 5c). In this case, the inability of the 2G-HDNNP to distinguish between the clusters is also apparent from the energy errors with respect to DFT. While the energy errors for Ag\({\,}_{3}^{-}\) and Ag\({\,}_{3}^{+}\) obtained from the 4G-HDNNP are only about 1.166 meV/atom and 0.320 meV/atom, respectively, the errors of the 2G-HDNNP are 0.605 and 2.017 eV/atom and thus several orders of magnitude larger. The 3G-HDNNP using scaled charges performs even worse, with errors of 0.713 and 5.721 eV/atom, due to the non-physical electrostatic contribution calculated from the incorrectly predicted charges.
NaCl cluster ions
As the last non-periodic example we select a system with mainly ionic bonding, a positively charged Na_{9}Cl\({\,}_{8}^{+}\) cluster, and we analyze the changes of the PES when a neutral sodium atom is removed. The initial structure of the cluster ion has been obtained from a DFT geometry optimization and is shown in Fig. 6. The sodium atoms are shown in purple, blue, and brown, while the chlorine atoms are displayed in gray. We then construct a second system by removing the brown sodium atom from the cluster while keeping the positions of the remaining atoms fixed. Since the overall positive charge of the cluster is maintained, the charge is redistributed throughout the new Na_{8}Cl\({\,}_{8}^{+}\) cluster ion.
To investigate the consequences of this change in the electronic structure on the PES, we compute and compare the energies and forces when moving the blue sodium atom along a one-dimensional path, indicated by the arrow in Fig. 6, for both cluster ions. The distance to the closest neighboring sodium atom, highlighted as a dashed line, is used to define the structure.
Figure 7 shows the energies for both systems obtained with DFT as well as with the 2G-, 3G-, and 4G-HDNNPs. All energies are given relative to the minimum DFT energy of the respective cluster ion and refer to the full systems. First, we note that the positions of the DFT minima differ by more than 0.1 Å, i.e., depending on the presence of the very distant brown atom the blue atom adopts different equilibrium positions. The 2G-HDNNP, however, is unable to distinguish these minima, and instead the same local minimum Na–Na distance is found for both systems, which is approximately the average of the two DFT minima. We note that the 2G-HDNNP energy curves of the two systems are not identical but exhibit an energy offset, as some of the atomic environments in the right part of the systems differ, yielding different atomic energies. Since these environments do not change when moving the blue atom, this offset is constant. For the 3G-HDNNP the same qualitative behavior is observed, and two very similar but not identical minima are found for both systems. Still, in case of the 3G-HDNNP the energy offset between both systems is no longer merely a constant, as the long-range electrostatic interactions between the blue and the brown atom in Na_{9}Cl\({\,}_{8}^{+}\) are position-dependent. We note that in spite of these qualitative differences with respect to DFT, the 2G- and 3G-HDNNP curves deviate by only about 1 meV per atom from the DFT curves. This is very small and of the typical order of magnitude of state-of-the-art ML potentials; in the present case, this apparently high accuracy hides the qualitatively wrong minima. Finally, the 4G-HDNNP energies for both systems are very accurate and the energy curves match the corresponding DFT curves very closely. Both distinct local minima are correctly identified and at the right positions.
Next, we turn to the forces shown in Fig. 7b. The results are fully consistent with our discussion of the energy curves. The DFT forces acting on the displaced atom are different for the two cluster ions and well reproduced by the 4G-HDNNP. The 2G-HDNNP forces of both systems are exactly identical due to the constant offset between both energy curves (Fig. 7a), while the 3G-HDNNP forces of both systems are slightly different due to the additionally included long-range electrostatics.
Au_{2} cluster on MgO(001)
As an example for a periodic system we choose a diatomic gold cluster supported on the MgO(001) surface. Similar systems have attracted attention because of their catalytic properties for reactions like carbon monoxide oxidation, epoxidation of propylene, water-gas-shift reactions, and the hydrogenation of unsaturated hydrocarbons^{52}. Theoretical^{53,54} as well as experimental studies^{55} have shown that the geometry of these clusters can be modified by the introduction of dopant atoms into the oxide substrate. This ability to control the cluster morphology is of great interest, as it can enhance the catalytic activity of the system^{54}. 2G-HDNNPs have been used before to study the properties of supported metal clusters^{56,57,58}, but systems as complex as doped substrates have to date remained inaccessible, since long-range charge transfer between the dopant and the gold atoms is crucial for a physically correct description of these systems.
For Au_{2} at MgO(001) there are two main adsorption geometries: an upright “non-wetting” orientation of the dimer attached to a surface oxygen atom, and a “wetting” configuration parallel to the surface, in which the two Au atoms reside on top of two Mg atoms. DFT optimizations of the positions of the gold atoms with fixed substrate for the doped and undoped surfaces reveal that the presence of the dopant atoms changes the relative stability of both structures. On the pure MgO support (Fig. 8a) the minimum-energy structure is non-wetting, while the flat wetting geometry is more stable if the MgO is doped with three aluminum atoms (Fig. 8b), corresponding to 2.86% of the slab. The Al dopant atoms were introduced into the 5th layer, resulting in a distance of >10 Å from the gold atoms. Despite this large separation, we find that doping reduces the charge on the Au_{2} cluster (i.e., makes it more negative) by about 0.2 e compared to the same geometry on the undoped surface. This change in the electronic structure not only leads to a switching of the energetic order of the geometries but also to a change of the bond length between the gold atoms and the substrate.
The energy differences (E_{wetting} − E_{non-wetting}) between the wetting and non-wetting configurations are −2.7 meV for DFT, 375 meV for the 2G-HDNNP, and −41 meV for the 4G-HDNNP on the doped substrate, and 929 meV for DFT, 375 meV for the 2G-HDNNP, and 975 meV for the 4G-HDNNP on the undoped substrate. These numbers were obtained after the positions of the gold atoms had been optimized. In case of the 2G-HDNNP, both optimizations yield the same structure, and the energy differences for the doped and undoped systems are exactly the same, as the dopant atoms are outside the local chemical environments of the gold atoms. Thus, the 2G-HDNNP cannot take the change of the PES upon doping into account. The DFT and 4G-HDNNP results agree in that there is a slight preference for the wetting configuration on the doped surface, while in the undoped case the non-wetting configuration is clearly more stable.
An analysis of the PES for the non-wetting geometry on the doped and undoped slabs is given in Fig. 9, which shows the energies relative to the minimum DFT energies of the respective systems as a function of the distance between the bottom Au atom and its neighboring oxygen atom for DFT, the 2G-HDNNP and the 4G-HDNNP. The energy curves of the 4G-HDNNP and DFT are very similar, and both resolve the different equilibrium bond lengths for the doped (4G-HDNNP: 2.342 Å; DFT: 2.332 Å) and undoped (4G-HDNNP: 2.177 Å; DFT: 2.190 Å) substrates. The 2G-HDNNP yields the same adsorption geometry with a bond length of 2.256 Å in both cases, while its energies differ substantially from the DFT values, the main effect of the dopant being a constant energy shift between both substrates, similar to what we observed in the presence or absence of the additional sodium atom in the NaCl cluster.
Discussion
In this work, we developed a fourth-generation high-dimensional neural network potential with accurate long-range electrostatic interactions, which is able to take long-range charge transfer as well as multiple charge states of a system into account. The new method is thus applicable to chemical problems that are incorrectly described by current machine learning potentials relying on a local description of the atomic environments only.
The 4G-HDNNP combines the advantages of the CENT approach and of conventional second- and third-generation high-dimensional neural network potentials by being generally applicable to all types of systems while providing a very high accuracy. Employing environment-dependent atomic electronegativities, which are expressed by atomic neural networks, a charge-equilibration method is used to determine the global charge distribution in the system. The resulting charges are then used to compute the long-range electrostatic energy, as well as to include information about the global electronic structure into the short-range atomic energy contributions, which are represented by a second set of atomic neural networks.
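The charge-equilibration step can be sketched in a few lines (a minimal illustration with hypothetical names, not the RuNNer implementation): given environment-dependent electronegativities χ_i predicted by the first set of networks and an interaction matrix A with the atomic hardness on the diagonal and Coulomb terms off the diagonal, the charges minimize E(q) = Σ_i χ_i q_i + ½ Σ_ij A_ij q_i q_j under the constraint that the charges sum to the total system charge, which reduces to a linear system with one Lagrange multiplier.

```python
import numpy as np

def equilibrate_charges(chi, A, total_charge=0.0):
    """Minimize E(q) = chi . q + 1/2 q . A . q  subject to  sum(q) = total_charge
    by solving the Lagrange-multiplier linear system."""
    n = len(chi)
    M = np.zeros((n + 1, n + 1))
    M[:n, :n] = A
    M[:n, n] = 1.0   # constraint column (Lagrange multiplier)
    M[n, :n] = 1.0   # constraint row: charges sum to total_charge
    b = np.concatenate([-chi, [total_charge]])
    sol = np.linalg.solve(M, b)
    return sol[:n]   # atomic charges; sol[n] is the multiplier

# Toy two-atom example with illustrative numbers: the more electronegative
# atom (chi = 2.0) acquires a negative charge.
chi = np.array([1.0, 2.0])               # environment-dependent electronegativities
A = np.array([[3.0, 0.5], [0.5, 3.0]])   # hardness on diagonal, Coulomb off-diagonal
q = equilibrate_charges(chi, A)
```

Because this is a single global solve, charge can be redistributed between atoms at arbitrary distances, which is exactly what purely local potentials cannot capture.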
The superiority of the 4G-HDNNP potential energy surface with respect to established 2G- and 3G-HDNNPs has been demonstrated for a series of systems for which conventional methods give qualitatively wrong results. Beyond the qualitatively correct description, we also obtained a clearly improved quantitative agreement of energies, forces and atomic charges with the underlying DFT data, and we could demonstrate that local minimum structures that are missed by the previous generations of HDNNPs are correctly identified by the new method.
The results obtained in this work are general and equally valid for other types of machine learning potentials relying on environment-dependent atomic energies only. Thus, the 4G-HDNNP is a vital step in the further development of next-generation ML potentials providing a correct description of the PES based on a global charge distribution.
Methods
Neural network potentials
The HDNNPs reported in this work have been constructed using the program RuNNer^{59,60,61}. Atom-centered symmetry functions^{41} have been used for the description of the atomic environments within a spatial cutoff radius set to 8–10 Bohr depending on the system. For a given system, the same symmetry-function parameters and the same atomic neural network architectures have been used for the different generations of HDNNPs being compared; the parameters and cutoff radii for all systems can be found in the Supplementary Tables. The functional forms of the symmetry functions are given in ref. ^{41}. In all examples, the atomic neural networks consist of an input layer with the number of symmetry functions ranging from 12 to 54 depending on the specific element and system, two hidden layers with 15 neurons each, and an output layer with one neuron providing either the atomic short-range energy or the electronegativity. Forces have been obtained as analytic energy derivatives. The activation functions in the hidden layers and the output layer were the hyperbolic tangent and the linear function, respectively.
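As an illustration, a radial symmetry function of the type introduced in ref. ^{41} can be sketched as follows (a simplified stand-alone version, not the RuNNer implementation; η and r_s are the usual parameter names):

```python
import numpy as np

def cutoff_fn(r, r_c):
    """Cosine cutoff function: decays smoothly to zero at the cutoff radius r_c."""
    fc = 0.5 * (np.cos(np.pi * r / r_c) + 1.0)
    return np.where(r < r_c, fc, 0.0)

def radial_sf(r_ij, eta, r_s, r_c):
    """Radial symmetry function G_i = sum_j exp(-eta (r_ij - r_s)^2) f_c(r_ij)
    over the distances r_ij from atom i to its neighbors j."""
    r_ij = np.asarray(r_ij, dtype=float)
    return np.sum(np.exp(-eta * (r_ij - r_s) ** 2) * cutoff_fn(r_ij, r_c))

# Example: three neighbors at 2.0, 3.5 and 9.0 Bohr with an 8 Bohr cutoff;
# the neighbor beyond the cutoff contributes nothing.
g = radial_sf([2.0, 3.5, 9.0], eta=0.1, r_s=0.0, r_c=8.0)
```

A set of such functions with different η and r_s values, plus analogous angular terms, forms the input vector of each atomic neural network.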
In all cases, 90% of the available reference data was used for training the HDNNPs, while the remaining 10% of the data points served as an independent test set to confirm the reliability of the PESs and to detect possible overfitting. Both energies and forces were used for training the short-range atomic neural networks.
Moreover, a screening of the short-range part of the Coulomb electrostatic interaction was applied in order to facilitate the fitting of the short-range energies and forces obtained from Eq. (9)^{23}. The inner cutoff radius for the screening of the electrostatic interaction has been set to 1.69–2.54 Å depending on the system, while the outer cutoff radius equals the cutoff of the symmetry functions. The widths of the Gaussian charge densities in Eq. (4) have been set to the covalent radii of the elements. All details of the training process and the validation strategies for HDNNPs in general can be found in recent reviews^{60,61}.
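The idea of the screening can be sketched as follows (an illustration assuming a cosine-type switching function; the precise functional form used in ref. ^{23} may differ): the Coulomb interaction of two Gaussian charge distributions is multiplied by a function that vanishes below the inner cutoff and reaches one at the outer cutoff, so that the electrostatics at small separations is absorbed into the short-range atomic energies instead.

```python
import math

def screening(r, r_inner, r_outer):
    """Smooth switching function: 0 for r <= r_inner, 1 for r >= r_outer,
    with a cosine interpolation in between (one possible choice)."""
    if r <= r_inner:
        return 0.0
    if r >= r_outer:
        return 1.0
    x = (r - r_inner) / (r_outer - r_inner)
    return 0.5 * (1.0 - math.cos(math.pi * x))

def screened_coulomb(qi, qj, r, gamma, r_inner, r_outer):
    """Electrostatic energy of two Gaussian charges of width gamma
    (erf-damped Coulomb), multiplied by the screening function."""
    return screening(r, r_inner, r_outer) * qi * qj * math.erf(r / gamma) / r

# Inside the inner cutoff the screened interaction vanishes; at long range
# it approaches the plain Coulomb limit qi*qj/r (atomic-style units).
e_short = screened_coulomb(1.0, -1.0, 1.5, 1.0, 2.0, 8.0)
e_long = screened_coulomb(1.0, -1.0, 20.0, 1.0, 2.0, 8.0)
```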
The HDNNP-based geometry optimizations were performed using a simple gradient-descent algorithm, and the numerical threshold for the forces was set to 10^{−4} Ha/Bohr ≈ 0.005 eV/Å, which is the same convergence criterion used in the DFT calculations employed for validating the HDNNP results.
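Such a gradient-descent optimization with a force threshold can be sketched as follows (a toy illustration on a harmonic PES; `forces_fn` and the fixed step size are assumptions, not the actual settings used here):

```python
import numpy as np

def optimize(positions, forces_fn, step=0.05, fmax=1e-4, max_iter=10000):
    """Plain gradient descent on a PES: move the coordinates along the
    forces (negative energy gradient) until the largest force component
    falls below the threshold fmax."""
    x = np.array(positions, dtype=float)
    for _ in range(max_iter):
        f = forces_fn(x)
        if np.max(np.abs(f)) < fmax:
            break
        x += step * f
    return x

# Example on a harmonic toy PES E = |x|^2 with its minimum at the origin:
forces = lambda x: -2.0 * x   # F = -dE/dx
x_opt = optimize([1.0, -0.5, 0.25], forces)
```

In practice the forces would come from the analytic derivatives of the HDNNP energy rather than from a toy expression.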
DFT calculations
The DFT reference data has been generated using the all-electron code FHI-aims^{62} employing the Perdew–Burke–Ernzerhof^{63} (PBE) exchange-correlation functional with light settings. The total energy, sum of eigenvalues, and charge density for all systems except Au_{2}@MgO were converged to 10^{−5} eV, 10^{−2} eV, and 10^{−4} e, respectively. For the Au_{2}@MgO systems, stricter settings have been applied by multiplying each criterion by a factor of 0.1, in combination with a 3 × 3 × 1 k-point grid. Spin-polarized calculations have been carried out for the Au_{2}@MgO, NaCl and Ag_{3} systems. Reference atomic charges were calculated using the Hirshfeld population analysis^{40}. In principle, any other charge partitioning scheme could be used in the same way.
The data sets of the C_{10}H_{2}/C_{10}H\({\,}_{3}^{+}\) molecules and the Ag_{3} clusters have been constructed by performing Born–Oppenheimer molecular dynamics^{64} simulations for each system at 300 K with 5000 steps and a time step of 0.5 fs. A Nosé–Hoover thermostat^{65} was applied to run the simulations in the canonical (NVT) ensemble, and the effective mass was set to 1700 cm^{−1}. In addition, the trajectories of the geometry relaxations, converged to a numerical force threshold of 0.001 eV/Å, were added to the data set to ensure sufficient sampling close to the equilibrium structures. The geometry optimization of the Ag_{3} system has been terminated when reaching forces below 0.0015 eV/Å.
In case of the NaCl cluster and the Au_{2} cluster at the MgO surface, the reference data set consists of two structurally different types of systems, and half of the data set was dedicated to each of the two cases. We performed a random sampling along the trajectories depicted in Figs. 7 and 9 and added further Gaussian-distributed displacements to ensure sufficient sampling of the PES in the vicinity of the structures of interest. For the NaCl cluster we used Gaussian displacements with a standard deviation of 0.05 Å. Since for the Au_{2}@MgO system we only investigated the change in geometry of the Au_{2} cluster, while the MgO substrate remained fixed during all geometry relaxations, we used a smaller magnitude of the Gaussian displacements for the substrate than for the cluster: a standard deviation of 0.02 Å was used for the substrate and 0.1 Å for the gold cluster. Half of the data set consists of structures with an undoped substrate, while the other half includes a doped substrate. Half of the samples for each substrate configuration were generated with the Au_{2} cluster in its wetting configuration, and the other half with the cluster in its non-wetting configuration. The total number of reference data points for the NaCl cluster and the Au_{2}@MgO slab is 5000, while for the Ag_{3} clusters and the organic molecule it is 10,019 and 11,013, respectively.
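The Gaussian displacement sampling can be sketched as follows (a minimal illustration; the function name and the zero reference coordinates are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def displace(positions, sigma, rng=rng):
    """Add Gaussian-distributed random displacements with standard
    deviation sigma (in Angstrom) to a set of atomic positions."""
    positions = np.asarray(positions, dtype=float)
    return positions + rng.normal(0.0, sigma, size=positions.shape)

# Different displacement magnitudes for substrate and cluster atoms,
# mirroring the 0.02 / 0.1 Angstrom values used for the Au2/MgO structures
# (placeholder zero coordinates stand in for the reference geometry):
substrate = displace(np.zeros((4, 3)), sigma=0.02)
cluster = displace(np.zeros((2, 3)), sigma=0.1)
```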
Data availability
The datasets used to train the NNPs presented in this paper have been published online^{68}. All data that support the findings of this study are available in the Supplementary information file or from the corresponding author upon reasonable request.
Code availability
All DFT calculations were performed using FHI-aims (version 171221_1). The HDNNPs have been constructed using the program RuNNer, which is freely available under the GPL-3 license at https://www.uni-goettingen.de/de/software/616512.html.
References
1. McCammon, J. A., Gelin, B. R. & Karplus, M. Dynamics of folded proteins. Nature 267, 585–590 (1977).
2. Jorgensen, W. L. & Ravimohan, C. Monte Carlo simulation of differences in free energies of hydration. J. Chem. Phys. 83, 3050–3054 (1985).
3. Behler, J. Perspective: machine learning potentials for atomistic simulations. J. Chem. Phys. 145, 170901 (2016).
4. Botu, V., Batra, R., Chapman, J. & Ramprasad, R. Machine learning force fields: construction, validation, and outlook. J. Phys. Chem. C 121, 511–522 (2017).
5. Deringer, V. L., Caro, M. A. & Csányi, G. Machine learning interatomic potentials as emerging tools for materials science. Adv. Mater. 31, 1902765 (2019).
6. Brockherde, F. et al. Bypassing the Kohn-Sham equations with machine learning. Nat. Commun. 8, 872 (2017).
7. Noé, F., Tkatchenko, A., Müller, K.-R. & Clementi, C. Machine learning for molecular simulation. Annu. Rev. Phys. Chem. 71, 361–390 (2020).
8. Blank, T. B., Brown, S. D., Calhoun, A. W. & Doren, D. J. Neural network models of potential energy surfaces. J. Chem. Phys. 103, 4129–4137 (1995).
9. Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).
10. Schütt, K. T., Sauceda, H. E., Kindermans, P.-J., Tkatchenko, A. & Müller, K.-R. SchNet - a deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).
11. Unke, O. T. & Meuwly, M. PhysNet: a neural network for predicting energies, forces, dipole moments, and partial charges. J. Chem. Theory Comput. 15, 3678–3693 (2019).
12. Smith, J. S., Isayev, O. & Roitberg, A. E. ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem. Sci. 8, 3192–3203 (2017).
13. Bartók, A. P., Payne, M. C., Kondor, R. & Csányi, G. Gaussian approximation potentials: the accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett. 104, 136403 (2010).
14. Shapeev, A. V. Moment tensor potentials: a class of systematically improvable interatomic potentials. Multiscale Model. Simul. 14, 1153–1173 (2016).
15. Thompson, A. P., Swiler, L. P., Trott, C. R., Foiles, S. M. & Tucker, G. J. Spectral neighbor analysis method for automated generation of quantum-accurate interatomic potentials. J. Comput. Phys. 285, 316–330 (2015).
16. Drautz, R. Atomic cluster expansion for accurate and transferable interatomic potentials. Phys. Rev. B 99, 014104 (2019).
17. Balabin, R. M. & Lomakina, E. I. Support vector machine regression (LS-SVM) - an alternative to artificial neural networks (ANNs) for the analysis of quantum chemistry data? Phys. Chem. Chem. Phys. 13, 11710 (2011).
18. Behler, J. Neural network potential-energy surfaces in chemistry: a tool for large-scale simulations. Phys. Chem. Chem. Phys. 13, 17930–17955 (2011).
19. Handley, C. M. & Popelier, P. L. Potential energy surfaces fitted by artificial neural networks. J. Phys. Chem. A 114, 3371–3383 (2010).
20. Prodan, E. & Kohn, W. Nearsightedness of electronic matter. Proc. Natl. Acad. Sci. 102, 11635–11638 (2005).
21. Deng, Z., Chen, C., Li, X.-G. & Ong, S. P. An electrostatic spectral neighbor analysis potential for lithium nitride. npj Comput. Mater. 5, 75 (2019).
22. Artrith, N., Morawietz, T. & Behler, J. High-dimensional neural-network potentials for multicomponent systems: applications to zinc oxide. Phys. Rev. B 83, 153101 (2011).
23. Morawietz, T., Sharma, V. & Behler, J. A neural network potential-energy surface for the water dimer based on environment-dependent atomic energies and charges. J. Chem. Phys. 136, 064103 (2012).
24. Yao, K., Herr, J. E., Toth, D. W., Mckintyre, R. & Parkhill, J. The TensorMol-0.1 model chemistry: a neural network augmented with long-range physics. Chem. Sci. 9, 2261–2269 (2018).
25. Bereau, T., Andrienko, D. & von Lilienfeld, O. A. Transferable atomic multipole machine learning models for small organic molecules. J. Chem. Theory Comput. 11, 3225–3233 (2015).
26. Hoshino, T. et al. First-principles calculations for vacancy formation energies in Cu and Al; non-local effect beyond the LSDA and lattice distortion. Comp. Mat. Sci. 14, 56 (1999).
27. Parsaeifard, B., Finkler, J. A. & Goedecker, S. Detecting non-local effects in the electronic structure of a simple covalent system with machine learning methods. Preprint at arXiv:2008.11277 (2020).
28. Rappe, A. K. & Goddard, W. A. Charge equilibration for molecular dynamics simulations. J. Phys. Chem. 95, 3358 (1991).
29. van Duin, A. C. T., Dasgupta, S., Lorant, F. & Goddard, W. A. ReaxFF: a reactive force field for hydrocarbons. J. Phys. Chem. A 105, 9396–9409 (2001).
30. Zhou, X. W. & Wadley, H. N. G. A charge transfer ionic-embedded atom method potential for the O–Al–Ni–Co–Fe system. J. Phys.: Condens. Matter 17, 3619 (2005).
31. Gasteiger, J. & Marsili, M. Iterative partial equalization of orbital electronegativity - a rapid access to atomic charges. Tetrahedron 36, 3219–3228 (1980).
32. Ghasemi, S. A., Hofstetter, A., Saha, S. & Goedecker, S. Interatomic potentials for ionic systems with density functional accuracy based on charge densities obtained by a neural network. Phys. Rev. B 92, 045131 (2015).
33. Faraji, S. et al. High accuracy and transferability of a neural network potential through charge equilibration for calcium fluoride. Phys. Rev. B 95, 104105 (2017).
34. Amsler, M. et al. FLAME: a library of atomistic modeling environments. Comput. Phys. Commun. 256, 107415 (2020).
35. Hafizi, R., Ghasemi, S. A., Hashemifar, S. J. & Akbarzadeh, H. A neural-network potential through charge equilibration for WS_{2}: from clusters to sheets. J. Chem. Phys. 147, 234306 (2017).
36. Faraji, S., Ghasemi, S. A., Parsaeifard, B. & Goedecker, S. Surface reconstructions and premelting of the (100) CaF_{2} surface. Phys. Chem. Chem. Phys. 21, 16270–16281 (2019).
37. Rasoulkhani, R. et al. Energy landscape of ZnO clusters and low-density polymorphs. Phys. Rev. B 96, 064108 (2017).
38. Xie, X., Persson, K. A. & Small, D. W. Incorporating electronic information into machine learning potential energy surfaces via approaching the ground-state electronic energy as a function of atom-based electronic populations. J. Chem. Theory Comput. 16, 4256–4270 (2020).
39. Zubatyuk, R., Smith, J., Nebgen, B. T., Tretiak, S. & Isayev, O. Teaching a neural network to attach and detach electrons from molecules. Preprint at ChemRxiv 12725276.v1 (2020).
40. Hirshfeld, F. L. Bonded-atom fragments for describing molecular charge densities. Theor. Chim. Acta 44, 129–138 (1977).
41. Behler, J. Atom-centered symmetry functions for constructing high-dimensional neural network potentials. J. Chem. Phys. 134, 074106 (2011).
42. Rappe, A. K. & Goddard III, W. A. Charge equilibration for molecular dynamics simulations. J. Phys. Chem. 95, 3358–3363 (1991).
43. Sifain, A. E. et al. Discovering a transferable charge assignment model using machine learning. J. Phys. Chem. Lett. 9, 4495–4501 (2018).
44. Ma, Y., Lockwood, G. K. & Garofalini, S. H. Development of a transferable variable charge potential for the study of energy conversion materials FeF_{2} and FeF_{3}. J. Phys. Chem. C 115, 24198–24205 (2011).
45. Fan, Q. & Pfeiffer, G. V. Theoretical study of linear C_{n} (n = 6–10) and HC_{n}H (n = 2–10) molecules. Chem. Phys. Lett. 162, 472–478 (1989).
46. Horný, L., Petraco, N. D. K. & Schaefer, H. F. Odd carbon long linear chains HC_{2n+1}H (n = 4–11): properties of the neutrals and radical anions. J. Am. Chem. Soc. 124, 14716–14720 (2002).
47. Pan, L., Rao, B. K., Gupta, A. K., Das, G. P. & Ayyub, P. H-substituted anionic carbon clusters C_{n}H^{−} (n ≤ 10): density functional studies and experimental observations. J. Chem. Phys. 119, 7705–7713 (2003).
48. Duanmu, K. et al. Geometries, binding energies, ionization potentials, and electron affinities of metal clusters: Mg\({\,}_{n}^{0,\pm 1}\), n = 1–7. J. Phys. Chem. C 120, 13275–13286 (2016).
49. Goel, N., Gautam, S. & Dharamvir, K. Density functional studies of Li_{N} and Li\({\,}_{N}^{+}\) (N = 2–30) clusters: structure, binding and charge distribution. Int. J. Quant. Chem. 112, 575–586 (2012).
50. Fournier, R. Trends in energies and geometric structures of neutral and charged aluminum clusters. J. Chem. Theory Comput. 3, 921–929 (2007).
51. De, S. et al. The effect of ionization on the global minima of small and medium sized silicon and magnesium clusters. J. Chem. Phys. 134, 124302 (2011).
52. Haruta, M. & Daté, M. Advances in the catalysis of Au nanoparticles. Appl. Catal. A 222, 427–437 (2001).
53. Mammen, N., Narasimhan, S. & de Gironcoli, S. Tuning the morphology of gold clusters by substrate doping. J. Am. Chem. Soc. 133, 2801–2803 (2011).
54. Mammen, N. & Narasimhan, S. Inducing wetting morphologies and increased reactivities of small Au clusters on doped oxide supports. J. Chem. Phys. 149, 174701 (2018).
55. Shao, X. et al. Tailoring the shape of metal ad-particles by doping the oxide support. Angew. Chem. Int. Ed. 50, 11525–11527 (2011).
56. Artrith, N., Hiller, B. & Behler, J. Neural network potentials for metals and oxides - first applications to copper clusters at zinc oxide. Phys. Status Solidi B 250, 1191–1203 (2013).
57. Elias, J. S. et al. Elucidating the nature of the active phase in copper/ceria catalysts for CO oxidation. ACS Catal. 6, 1675–1679 (2016).
58. Paleico, M. L. & Behler, J. Global optimization of copper clusters at the ZnO(\(10\bar{1}0\)) surface using a DFT-based neural network potential and genetic algorithms. J. Chem. Phys. 153, 054704 (2020).
59. Behler, J. RuNNer - A Program for Constructing High-Dimensional Neural Network Potentials (Universität Göttingen, 2020).
60. Behler, J. Constructing high-dimensional neural network potentials: a tutorial review. Int. J. Quant. Chem. 115, 1032–1050 (2015).
61. Behler, J. First principles neural network potentials for reactive simulations of large molecular and condensed systems. Angew. Chem. Int. Ed. 56, 12828–12840 (2017).
62. Blum, V. et al. Ab initio molecular simulations with numeric atom-centered orbitals. Comput. Phys. Commun. 180, 2175–2196 (2009).
63. Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865 (1996).
64. Barnett, R. N. & Landman, U. Born-Oppenheimer molecular-dynamics simulations of finite systems: structure and dynamics of (H_{2}O)_{2}. Phys. Rev. B 48, 2081 (1993).
65. Nosé, S. A unified formulation of the constant temperature molecular dynamics methods. J. Chem. Phys. 81, 511–519 (1984).
66. Stukowski, A. Visualization and analysis of atomistic simulation data with OVITO - the Open Visualization Tool. Modell. Simul. Mater. Sci. Eng. 18, 015012 (2010).
67. Momma, K. & Izumi, F. VESTA 3 for three-dimensional visualization of crystal, volumetric and morphology data. J. Appl. Crystallogr. 44, 1272–1276 (2011).
68. Ko, T.-W., Finkler, J. A., Goedecker, S. & Behler, J. A fourth-generation high-dimensional neural network potential with accurate electrostatics including non-local charge transfer. Materials Cloud Archive 2020.X, https://doi.org/10.24435/materialscloud:f3-yh (2020).
Acknowledgements
We are grateful for the financial support from the Deutsche Forschungsgemeinschaft (DFG) (BE3264/131, project number 411538199) and the Swiss National Science Foundation (SNF) (project number 182877 and NCCR MARVEL). Calculations were performed in Göttingen (DFG INST186/12941 FUGG, project number 405832858), at sciCORE (http://scicore.unibas.ch/) scientific computing center at University of Basel and the Swiss National Supercomputer (CSCS) under project s963D/C03N05.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Author information
Contributions
Both research groups contributed equally to this project. J.B. and S.G. conceived the 4GHDNNP approach and initiated the research project. T.W.K. and J.A.F. worked out the practical algorithms for the approach and implemented it in the RuNNer software written by J.B. All calculations were performed by T.W.K. and J.A.F. All authors contributed ideas to the project and jointly analyzed the results. T.W.K. and J.A.F. wrote the initial version of the manuscript and prepared the figures, all authors jointly edited the manuscript. T.W.K. and J.A.F. contributed equally to this paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ko, T.-W., Finkler, J. A., Goedecker, S. et al. A fourth-generation high-dimensional neural network potential with accurate electrostatics including non-local charge transfer. Nat. Commun. 12, 398 (2021). https://doi.org/10.1038/s41467-020-20427-2