Abstract
With many frameworks based on message passing neural networks proposed to predict molecular and bulk properties, machine learning methods have tremendously shifted the paradigms of the computational sciences underpinning physics, materials science, chemistry, and biology. While existing machine learning models have achieved superior performance on many occasions, most of them model and process molecular systems as homogeneous graphs, which severely limits their expressive power for representing diverse interactions. In practice, graph data with multiple node and edge types are ubiquitous and more appropriate for molecular systems. Thus, we propose the heterogeneous relational message passing network (HermNet), an end-to-end heterogeneous graph neural network, to efficiently express multiple interactions in a single model with ab initio accuracy. HermNet performs impressively against many top-performing models on both molecular and extended systems. Specifically, HermNet outperforms other tested models on nearly 75%, 83% and 69% of the tasks from the revised Molecular Dynamics 17 (rMD17), Quantum Machines 9 (QM9) and extended-systems datasets, respectively. In addition, molecular dynamics simulations and material property calculations were performed with HermNet to demonstrate its performance. Finally, we elucidate how the design of HermNet is compatible with quantum mechanics from the perspective of density functional theory. Moreover, HermNet is a universal framework, whose sub-networks can be replaced by other advanced models.
Introduction
In the realms of physics, chemistry, materials science, and biology, multi-scale modeling1,2 helps us understand the properties of materials on multiple scales of time and space. Molecular dynamics (MD) simulation is an essential tool for modeling the dynamical evolution of a many-body system. The trajectories of the interacting particles are determined by solving Newton's equations of motion involving complex interatomic potentials. There are two mainstream approaches to performing MD simulations, i.e., classical MD3 and ab initio molecular dynamics (AIMD)4. The potential energy surface in classical MD is given by parameterized force fields of a presumed functional form, which facilitates large-scale calculations but possesses poor transferability across tasks. On the other hand, AIMD computes the total energy of a system using quantum mechanical methods, such as density functional theory (DFT)5, which guarantees applicability and accuracy under a wide variety of conditions. However, due to the cost of rigorously treating the electronic degrees of freedom, AIMD modeling is currently limited to physical and chemical systems of modest scale. With the rapid development of technologies for chemical and material synthesis, the need to construct force fields for large-scale calculations with accuracy comparable to that of first-principles methods has become ever more urgent.
One recent development addressing the above issue is to use machine learning methods6,7 to facilitate MD simulations. The most important tool in machine learning is the neural network. The first neural-network framework for MD simulations was proposed by Behler and Parrinello8 and is based on fully connected neural networks. Considerable success has been achieved along this route. In particular, Deep Potential (DeePMD)9,10 has been developed into a comprehensive software suite and has been used in simulations of crystal nucleation11,12 and the construction of phase diagrams13. Traditional neural networks, for example fully connected and convolutional neural networks, are most useful when the input data are Euclidean. However, atoms are intrinsically indistinguishable and cannot be ordered. As a result, heavy data preprocessing has to be performed in the above-mentioned frameworks. To alleviate this preprocessing burden, graph neural networks (GNNs)14,15 were introduced. The power of the graph formalism lies in its focus on relationships among entities (or nodes) rather than on the properties of individual nodes. In particular, message passing neural networks (MPNNs)16 summarized a general formula for spatial-domain GNNs. With atoms represented as nodes and the interactions or bonds between them represented as edges, molecules or crystals can be transformed into molecular or crystal graphs naturally. GNN-based frameworks for MD simulations, including DTNN17, SchNet18,19, DimeNet20,21, PAINN22, and MDGNN23, have accurately predicted the potential energy surfaces of small molecules and crystals. Current GNN-based MD simulations mostly use homogeneous graphs, where the message passing network is the same regardless of the types of the atoms. On the other hand, it is now common practice to use the hybrid pair style in MD simulations, which utilizes different force fields for atom pairs of different types.
The hybrid pair style is very useful for complex material systems, such as polymers on metal surfaces, polymers with nanoparticles, and solid-solid interfaces between two different materials. This motivates us to explore the possibility of improved performance by using heterogeneous graphs in GNN-based MD simulations.
In this work, we propose a framework to model diverse interactions in a single MD simulation, termed the heterogeneous relational message passing network (HermNet). The model shares a similar idea with the hybrid pair style in the Large-scale Atomic/Molecular Massively Parallel Simulator software24. HermNet splits the molecular or crystal graph into several subgraphs and uses a different message passing network for each subgraph. Within each subgraph, we choose a modified version of the polarizable atom interaction neural network (PAINN)22 as the sub-network. Experiments on molecular and extended systems were performed, and the results were satisfactory. HermNet provides a general method for designing heterogeneous GNNs for MD simulations.
Results
Preliminary
In graph theory25, a graph is a data structure composed of a set of vertices and a set of edges. Graphs can be classified as either undirected graphs or digraphs according to whether the orientations of the edges are explicitly designated. Since an undirected edge can be interpreted as a bidirectional link between a pair of nodes, an undirected graph can be regarded as a digraph.
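This bidirectional-link interpretation can be made concrete with a short sketch (plain Python with illustrative names, not taken from any graph library): each undirected edge is expanded into the two directed edges that represent it.

```python
def to_directed(undirected_edges):
    """Expand each undirected edge (u, v) into the two
    directed edges (u, v) and (v, u)."""
    directed = []
    for u, v in undirected_edges:
        directed.append((u, v))
        directed.append((v, u))
    return directed
```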
Graphs can be further classified as either homogeneous or heterogeneous, according to the types of their nodes and edges. A homogeneous graph is a special case of a heterogeneous graph. MPNN16, a universal spatial-domain GNN framework, was proposed for homogeneous graphs. With \(h_v\) and \(e_{vw}\) denoting, respectively, node features and edge features in a graph, MPNN is summarized as

$$m_v^{t+1} = \sum_{w \in \mathcal{N}(v)} M_t\left(h_v^t, h_w^t, e_{vw}\right), \qquad (1)$$

$$h_v^{t+1} = U_t\left(h_v^t, m_v^{t+1}\right), \qquad (2)$$

where the forward propagation is decomposed into two phases, a message passing phase and a readout phase. \(M_t\) and \(U_t\) are a message function and an update function, respectively. The hidden states \(h_w\) of all the neighbors \(\mathcal{N}(v)\) of vertex \(v\) are aggregated and then used to update the hidden state of vertex \(v\) in the next step. A heterogeneous graph supports sophisticated multi-type relations and inherently enables richer semantic relations. The relational graph convolutional network (R-GCN)26 is an extension of MPNN. \(\mathcal{G}=(\mathcal{V},\mathcal{E},\mathcal{R})\) denotes a heterogeneous graph with nodes (entities) \(v_i\in\mathcal{V}\) and labeled edges (relations) \((v_i,r,v_j)\in\mathcal{E}\), where \(r\in\mathcal{R}\) is a relation type that covers both canonical and inverse directions. A generalized forward process of an entity \(v_i\) in a relational graph takes the form

$$h_i^{t+1} = \sigma\left(\sum_{r\in\mathcal{R}}\sum_{j\in\mathcal{N}_i^r}\frac{1}{c_{i,r}}\,W_r^{t}\,h_j^{t} + W_0^{t}\,h_i^{t}\right), \qquad (3)$$

in which \(W_r^{t}\) is a relation-specific weight matrix, \(W_0^{t}\) acts on the self-connection, \(c_{i,r}\) is a normalization constant, and \(\sigma\) is an activation function,
where \(\mathcal{N}_i^r\) denotes the set of neighbor indices of vertex \(i\) under relation \(r\). Eq. (3) implies that a heterogeneous graph can be decomposed into several homogeneous graphs of distinct relations in \(\mathcal{R}\). Typically, each such homogeneous graph is a directed graph. In other words, an R-GCN layer is made up of multiple MPNN layers, each of which is associated with the homogeneous graph of a relation \(r\).
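As an illustrative sketch of this relational aggregation in Eq. (3) (scalar node features and scalar stand-ins for the weight matrices \(W_r\); all names are hypothetical rather than taken from the R-GCN or HermNet implementations):

```python
def rgcn_step(h, edges_by_relation, weights):
    """One relational message passing step (cf. R-GCN):
    for each node i, sum messages w_r * h_j over neighbors j
    of every relation r, normalized by the neighbor count.

    h: dict node -> scalar feature
    edges_by_relation: dict relation -> list of (src, dst) edges
    weights: dict relation -> scalar stand-in for W_r
    """
    new_h = dict(h)  # self-connection with identity weight
    for r, edges in edges_by_relation.items():
        # gather source neighbors per destination for this relation
        nbrs = {}
        for src, dst in edges:
            nbrs.setdefault(dst, []).append(src)
        for dst, srcs in nbrs.items():
            msg = sum(weights[r] * h[s] for s in srcs) / len(srcs)
            new_h[dst] += msg
    return new_h
```

Each relation contributes its own parameterized message function, which is precisely the decomposition into per-relation MPNN layers described above.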
Architecture
Diverse force-field forms are needed to capture intricate interactions, especially in systems with multiple elements. GNNs for homogeneous graphs model the interactions of different atomic pairs with shared parameters, which limits the expressive power of neural-network-based force fields. For example, as shown in Fig. 1(a), there are three kinds of particles, i.e., A-, B- and C-type atoms. The graph is constructed by linking central nodes with their adjacent nodes within a cutoff radius. In a classical MD simulation of this system, six different force fields can be allocated, for A–A pairs, A–B pairs, A–C pairs, etc., provided only two-body interactions are considered. If a homogeneous GNN is employed to model these different interactions by fitting a single function, it can be expected to produce only an averaged force field. On the other hand, equipped with multiple types of nodes and edges, a heterogeneous GNN is a natural choice to model these interactions at a finer resolution.
a The original graph, constructed via a certain method, with multiple node types, specifically the three types A, B, and C here. b Subgraphs extracted from the original graph according to triadic relations. The number of subgraphs is of the order of \(N_e^3\), where \(N_e\) is the number of node types. c Subgraphs extracted from the original graph according to the type of the central node. In this case, the number of subgraphs increases linearly with \(N_e\).
As shown in the following, we develop a universal framework, HermNet, to model diverse many-body interactions simultaneously by extracting appropriate subgraphs, which are subsequently processed by heterogeneous GNNs. An overview of the entire architecture of HermNet is displayed in Fig. 2(a); it takes the atomic numbers Z (and a vector of zeros) as the nodes' scalar (and vectorial) features. HermNet is composed of several message passing layers, termed HermConv layers, which model interactions hierarchically. The RMConv modules (Fig. 2b) for the different relations constitute the HermConv module (Fig. 2c, d). We introduce three variants of HermNet: the heterogeneous pair network (HPNet), the heterogeneous triadic network (HTNet), and the heterogeneous vertex network (HVNet). An HPNet layer for central nodes of A-type is displayed in Fig. 2c, where all the sub-networks with an A-type destination contribute to the local environment of the A-type node. A HermNet layer for HVNet and HTNet is displayed in Fig. 2d. If the parameters of its sub-networks [RMConv, see Fig. 3] are shared for the same kind of central node, the framework is referred to as HVNet; when the parameters are not shared, it is an HTNet. We only test and report HVNet's performance in the following sections, as the other two models (HPNet and HTNet) have higher complexity and would require more training time and data points for a proper assessment.
a The entire architecture diagram of HermNet, where {Z} is the set of atomic numbers, which will be passed through an embedding layer. Initial vectorial node features are all-zero vectors of fixed dimension. Each layer is expected to receive a scalar node feature {s}, a vectorial node feature, and a vectorial edge feature, i.e., the relative position vector \(\vec{r}_{ij}\), and then output updated scalar and vectorial node features as the inputs of the next layer. The final scalar node features are passed to a global pooling layer to form the feature of the graph. With the graph-level feature passed through a sequence of fully connected layers, the target property is predicted. b Sub-network for processing the related subgraphs, i.e., homogeneous digraphs. The layer is composed of message passing layers arranged hierarchically, such as a radial message passing layer for two-body interactions, an angular message passing layer for three-body interactions, and so on. These message passing layers are truncated according to the level of interactions to be modeled. The features and/or message passing layers drawn with dotted lines should be introduced in accordance with requirements. Several sub-networks modeling different relations compose a single heterogeneous relational message passing layer. When the interactions are truncated to two-body interactions, the entire framework is termed HPNet. c and d are the schematic diagrams of the HermConv module of HPNet and HTNet (HVNet), respectively. RMConv modules for different relations constitute the HermConv module. c Sub-network in HPNet for A-type nodes when the system contains only two kinds of elements, specifically A- and B-type. d The hidden states of an A-type vertex derive from a sub-network that is truncated to three-body interactions for the corresponding relations. The colors of the networks for the different three-body interactions represent the parameters.
If these colors are the same, meaning the parameters are shared across all three networks, the HermNet is termed HVNet. If not, the HermNet is termed HTNet.
a is the architecture of the sub-network, termed the relational message passing convolutional (RMConv) layer. Such an RMConv layer is a simplified and modified PAINN22 invoked for a specific type of interaction, and is constituted by the (b) radial message layer, c radial update layer, d angular message layer, and e or f angular update layer. {s} is the set of scalar node features, initially set to the atomic numbers, which will be passed through an embedding layer. The initial vectorial node feature \(\vec{v}^{(0)}\) is an all-zero vector of fixed dimension. \(\sin\left(\frac{n\pi}{r_{\mathrm{cut}}}\|\vec{r}_{ij}\|\right)/\|\vec{r}_{ij}\|\) with 1 ≤ n ≤ 30 are selected as the radial basis functions (RBF)20, and a cosine cutoff \(f_{\mathrm{cut}}\)59 is also adopted in the filter. The original message layer in PAINN22 (i.e., an MPNN layer) is decomposed into (b) and (c) (i.e., the radial message layer and radial update layer). A modified and simplified update layer is decomposed into (d) and (e) for HVNet or (f) for HTNet. These layers model three-body interactions via expressing angular information explicitly.
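The radial filter described above can be sketched in plain Python; the cosine cutoff shown is the common Behler-type form, which we assume corresponds to the cutoff cited as ref. 59:

```python
import math

def radial_basis(r, r_cut, n_max=30):
    """RBF from the caption: sin(n*pi*r/r_cut)/r for n = 1..n_max."""
    return [math.sin(n * math.pi * r / r_cut) / r
            for n in range(1, n_max + 1)]

def cosine_cutoff(r, r_cut):
    """A common cosine cutoff (assumed form): 0.5*(cos(pi*r/r_cut)+1)
    inside the cutoff radius, and exactly zero outside it."""
    if r >= r_cut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)
```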
Most machine learning frameworks for MD simulations take only the interatomic distances into account in feature engineering, ignoring the bond-angle information, which is an important characteristic of both molecules and crystals. In principle, bond angles can be deduced from interatomic distances. However, it is advantageous to include bond-angle information explicitly in feature engineering to achieve better performance. The directional message passing network (DimeNet)20,21 innovatively introduced three-body interactions explicitly by combining radial and angular information from the edges of the original graph and of the corresponding line graph, respectively. PAINN22 is a rotationally equivariant MPNN framework in which the complexity of calculating angular information is reduced. In this work, we incorporate angular information by choosing PAINN as the sub-network in HermNet. This specific message passing setup can be directly implemented in HVNet, while slight modifications are required in HTNet to distinguish the types of the source nodes. We note that HPNet cannot incorporate all angular information explicitly. For example, the bond angle A → B ← B is lost in HPNet because A → B and B ← B are processed by different sub-networks.
As discussed above, a heterogeneous graph can be decomposed into several homogeneous subgraphs. To describe the method of extracting these subgraphs, we use \(\mathcal{G}\), \(\hat{\mathcal{Q}}_s\), and \(\hat{\mathcal{Q}}_d\) to denote the input heterogeneous graph, the operator that returns the subgraph with specific source nodes, and the operator that returns the subgraph with specific destination nodes, respectively. As indicated in Fig. 1b, c, the directed subgraphs for HVNet can be extracted by selecting the inbound edges of a given A-type destination node, i.e., \(\hat{\mathcal{Q}}_d^{A}\mathcal{G}\), while those for HTNet are extracted by first selecting the inbound edges of a given B-type destination node and then keeping, among those, the edges whose source nodes are of A- or C-type, i.e., \(\hat{\mathcal{Q}}_s^{A\cup C}\hat{\mathcal{Q}}_d^{B}\mathcal{G}\) for the triadic relation A → B ← C. We note that if two destination-node extractions were instead applied sequentially for HTNet, the result would generally be an empty graph.
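The operators \(\hat{\mathcal{Q}}_s\) and \(\hat{\mathcal{Q}}_d\) can be sketched on a plain typed edge list (illustrative code, not the actual DGL-based implementation):

```python
def q_dst(edges, node_type, dst_type):
    """Q_d operator: keep edges whose destination node has dst_type."""
    return [(u, v) for u, v in edges if node_type[v] == dst_type]

def q_src(edges, node_type, src_types):
    """Q_s operator: keep edges whose source node type is in src_types."""
    return [(u, v) for u, v in edges if node_type[u] in src_types]
```

Composing them reproduces the extractions above: `q_dst(edges, t, 'A')` yields the HVNet subgraph for A-type destinations, while `q_src(q_dst(edges, t, 'B'), t, {'A', 'C'})` yields the HTNet subgraph for the triadic relation A → B ← C.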
In the following, we report tests of HVNet against prior frameworks on three well-established benchmark datasets. As detailed below, HVNet convincingly outperforms most of the prior methods.
Benchmarks on MD17 dataset
The MD17 dataset17,27,28 provides non-equilibrium structures sampled (at a time resolution of 0.5 fs) from AIMD trajectories of eight small molecules at a background temperature of 500 K. The potential energy and force labels were computed with the PBE + vdW−TS method. Christensen and von Lilienfeld29 found that the energies in the original MD17 dataset are contaminated with substantial numerical noise and published a revised version of the dataset. Distinct HVNet models were trained on this revised dataset, with a 1000-frame training set and a 1000-frame validation set randomly selected. The learning rate was initially set to 3 × 10−4 and adaptively reduced when the loss on the validation set reached a plateau. The cutoff radius was set to 5 Å for the construction of the molecular graphs. Additional details can be found in the Supplementary Methods. Table 1 presents the comparison of the mean absolute errors (MAEs) of three benchmarked models and HVNet. It should be noted that PAINN and HVNet were trained on the revised MD17 dataset, while SchNet and DimeNet were trained on the original MD17 dataset. HVNet outperforms the other models by a comfortable margin on three-quarters of the predictive tasks, and its results on the remaining tasks are comparable to the best results among all four frameworks. We also attempted to train an HTNet on the MD17 dataset; however, the parameter space of HTNet is simply too large, and obvious overfitting was observed after just a few training epochs. We then trained the HTNet model on the HfO2 dataset, which was proposed for fitting Gaussian approximation potential models30,31,32,33,34,35, and found that when more than 1500 data points were used for training, no obvious overfitting occurred (a detailed discussion of training HTNet is provided in the Supplementary Notes 2). This indicates that HTNet may become expressive once more data points are provided.
Benchmarks on QM9 dataset
The QM9 dataset36,37 consists of computed geometric, energetic, electronic, and thermodynamic properties for 134k stable small organic molecules made up of carbon, hydrogen, oxygen, nitrogen, and fluorine. All properties were calculated at the B3LYP/6-31G(2df,p) level of quantum chemistry. This dataset provides quantum chemical insight into the relevant chemical space of small organic molecules and has been widely adopted as a benchmark to calibrate, analyze, and evaluate new methods in this field. HVNet was trained on 110k molecules and validated on another 10k molecules. The properties of the 134k molecules include the dipole moment (μ), isotropic polarizability (α), energy of the highest occupied molecular orbital (εHOMO), energy of the lowest unoccupied molecular orbital (εLUMO), band gap (Δε), electronic spatial extent (R2), zero-point vibrational energy (ZPVE), internal energy at 0 K (U0), internal energy at 298.15 K (U), enthalpy at 298.15 K (H), free energy at 298.15 K (G), and heat capacity at 298.15 K (cv). It must be emphasized that HVNet was trained on atomization energies rather than the original internal energies, enthalpies, and free energies, i.e., on the original energies minus the atomic reference energies, which is the protocol advocated in the DimeNet work of Klicpera et al.20. These adjusted targets are more reasonable because absolute energies are generally meaningless and relative energies essentially convey all the physical implications. Table 2 reports the MAEs of HVNet for the 12 tasks in the QM9 dataset in comparison with eight other models. HVNet outperforms all baselines on 10 out of the 12 tasks. For the other two tasks, R2 and ZPVE, the MAEs of HVNet are on par with some of the baselines. Details of additional settings and the definitions of the physical quantities with respect to the models and datasets are provided in the Supplementary Methods and Supplementary Discussion 1.
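The atomization-energy protocol amounts to subtracting isolated-atom reference energies from each molecular energy; a minimal sketch (with hypothetical reference values in the usage below) is:

```python
def atomization_energy(total_energy, formula, atom_ref):
    """Subtract single-atom reference energies from the total energy,
    following the DimeNet-style target protocol.

    formula: dict element -> count, e.g. {'H': 2, 'O': 1}
    atom_ref: dict element -> isolated-atom reference energy
    """
    return total_energy - sum(n * atom_ref[el] for el, n in formula.items())
```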
Benchmark on extended systems
Predicting the properties of extended systems is a more ambitious task because of their intricate chemical environments. Since HermNet is capable of handling extended systems, we conducted this more challenging benchmark on the extended-system datasets provided in ref. 10. The datasets contain the properties of 8 different systems, among which bulk C5H5N, bulk TiO2, the system consisting of MoS2 and Pt, and the high-entropy alloy (HEA) are the four most difficult tasks: the bulk C5H5N and bulk TiO2 datasets include multiple phases, and the MoS2 and Pt system includes five different datasets. Unfortunately, training on the MoS2 + Pt dataset required too much computational time, so we chose not to further pursue this benchmark after some preliminary tuning (and no corresponding results are shown). The HEA dataset is explicitly divided into two parts: the model is trained on the first part, which includes 40 kinds of equimolar five-element CoCrFeMnNi HEA with random occupations, and then tested on the test set of the first part and on the entire second part, which includes another 16 kinds of HEA with random occupations. Table 3 shows the comparison of root mean square errors (RMSEs) between DeepPot-SE/DeePMD10 and HVNet. Since the potential energy is an extensive quantity, the RMSEs of the energies were normalized by the system size, consistent with how the DeepPot-SE and DeePMD results10 were presented. As shown in Table 3, HVNet achieved lower RMSEs than DeepPot-SE on all tasks except the MoS2 and Pt dataset, which, as noted above, was not pursued because of the excessive training time required. Details of additional settings and specific discussions are provided in the Supplementary Methods and Supplementary Discussion 2. In addition, we calculated the vacancy formation energy of bulk Cu with the trained model. An arbitrary Cu atom was removed, and the configuration was relaxed with DFT and HVNet, respectively.
The chemical potential of Cu was calculated from DFT and the vacancy formation energies from DFT and HVNet are 1.03 eV and 1.07 eV, respectively, which are also consistent with previous computational and experimental results (1.14 eV and 1.17–1.28 eV, respectively)38.
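The system-size normalization of the energy RMSE used in Table 3 can be sketched as follows (illustrative code):

```python
import math

def per_atom_energy_rmse(pred, true, n_atoms):
    """RMSE of predicted total energies, with each error first
    normalized by the number of atoms in that configuration."""
    sq = [((p - t) / n) ** 2 for p, t, n in zip(pred, true, n_atoms)]
    return math.sqrt(sum(sq) / len(sq))
```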
Molecular dynamics simulation and phonon dispersion
To demonstrate the performance of HermNet, an MD simulation of a MoSe2 monolayer was performed. The dataset was generated with the Vienna ab initio Simulation Package (VASP)39 using projector-augmented wave40,41 pseudopotentials. The Perdew-Burke-Ernzerhof exchange-correlation functional42 was used. The plane-wave cutoff was 260 eV, and a 2 × 2 × 1 gamma-centered k-point mesh was adopted to sample the Brillouin zone of the 6 × 6 × 1 supercell. The simulation was carried out in the canonical ensemble with the temperature increasing from 100 K to 1500 K, and 5000 frames were obtained with a time step of 1 fs. The dataset was randomly shuffled and split into training, validation, and test sets in the ratio 8:1:1. The MAEs of the energy and forces on the test set were 0.09 meV per atom and 2.93 meV Å−1, respectively. The comparison of the radial distribution functions at 300 K from AIMD and from i-PI43, a classical MD simulation software, with HVNet as the force field, is shown in Fig. 4a. Furthermore, the phonon dispersion was calculated by interfacing HermNet with phonopy44. The acoustic sum rule was enforced to ensure that the three acoustic modes vanish at the Γ point. As shown in Fig. 4b, the performance of HermNet on the phonon dispersion demonstrates that even the second-order derivative of the potential energy reaches high precision.
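The acoustic sum rule constrains the force constants so that a uniform translation of the crystal costs no energy; a minimal scalar sketch (the actual calculation acts on 3 × 3 force-constant blocks inside phonopy) is:

```python
def enforce_acoustic_sum_rule(fc):
    """Adjust the diagonal (self) force constants so that each row
    sums to zero: Phi_ii = -sum_{j != i} Phi_ij.  With this
    constraint a uniform translation produces no restoring force,
    so the acoustic branches vanish at the Gamma point.

    fc: square list-of-lists of scalar force constants.
    """
    n = len(fc)
    out = [row[:] for row in fc]
    for i in range(n):
        out[i][i] = -sum(out[i][j] for j in range(n) if j != i)
    return out
```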
Discussion
The complexity of a sub-network generally scales as \(\mathcal{O}(|\mathcal{N}|)\), where \(|\mathcal{N}|\) is typically the number of neighbors captured within the cutoff radius. The numbers of sub-networks for HVNet, HPNet, and HTNet are \(\mathcal{O}(N_e)\), \(\mathcal{O}(N_e^2)\) and \(\mathcal{O}(N_e^3)\), respectively. Here, \(N_e\) is the number of element types present in the system. Therefore, HVNet is most useful when the number of distinct elements is large. Further discussion of the complexity analysis is deferred to the Supplementary Notes 2.
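These sub-network counts can be checked directly (illustrative sketch; HTNet is counted here over ordered triads, matching the \(\mathcal{O}(N_e^3)\) estimate):

```python
from itertools import product

def n_subnetworks(elements):
    """Count the sub-networks each HermNet variant needs:
    HVNet - one per destination element type       -> O(Ne)
    HPNet - one per (source, destination) pair     -> O(Ne^2)
    HTNet - one per ordered src-dst-src triad      -> O(Ne^3)
    """
    ne = len(elements)
    hv = ne
    hp = len(list(product(elements, repeat=2)))
    ht = len(list(product(elements, repeat=3)))
    return hv, hp, ht
```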
To construct an accurate force field for classical MD simulations, the potential energy surface needs to be reproduced up to first-principles precision. The potential energy has a hierarchical structure and can be decomposed into several terms as follows,

$$E = \sum_{i} E^{(1)}(\vec{r}_i) + \sum_{i<j} E^{(2)}(\vec{r}_i, \vec{r}_j) + \sum_{i<j<k} E^{(3)}(\vec{r}_i, \vec{r}_j, \vec{r}_k) + \cdots, \qquad (4)$$
where the first term represents the energy of a single atom and the second term is the summation of all the pairwise interactions, such as the energy contributed by bonds. The third term denotes the three-body interactions, which typically entail angular specifications. Higher-order many-body interactions can be further included in order to build a more accurate potential energy surface. The layers shown in Fig. 3b, c, which are equivalent to the message layer in the original PAINN proposal22, can be viewed as a single MPNN layer that models two-body interactions, since they merely process radial information. The inner products of the positional vectors in the modules of Fig. 3d, e or f are responsible for modeling three-body interactions. Thus the sub-network, i.e., the concatenation of these layers as shown in Figs. 2b and 3a, exactly conforms to the hierarchical rule of Eq. (4).
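A toy evaluation of the hierarchy of Eq. (4), truncated at three-body terms, with caller-supplied (hypothetical) one-, two-, and three-body potentials:

```python
def total_energy(atoms, e1, v2, v3):
    """Evaluate the hierarchical decomposition, truncated at
    three-body terms:
      E = sum_i e1(i) + sum_{i<j} v2(i, j) + sum_{i<j<k} v3(i, j, k)
    e1, v2, v3 are caller-supplied (hypothetical) potentials.
    """
    n = len(atoms)
    e = sum(e1(atoms[i]) for i in range(n))
    e += sum(v2(atoms[i], atoms[j])
             for i in range(n) for j in range(i + 1, n))
    e += sum(v3(atoms[i], atoms[j], atoms[k])
             for i in range(n) for j in range(i + 1, n)
             for k in range(j + 1, n))
    return e
```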
On the other hand, the graphs are constructed with a specific cutoff radius, and only the information of 1-hop neighbors is aggregated in a single MPNN layer. The final energy prediction is obtained with a global pooling operation over all local environments. This suggests that locality is an essential property that facilitates the learning of potential energies. The DFT total energy can be expressed as a summation of the eigenvalues of the electronic Hamiltonian and the interaction of the nuclei, with a correction to avoid double counting45. To take advantage of a localized basis, as in a graph, we discuss the total energy within the tight-binding framework, which provides more physical insight. When the density is expressed as a superposition of spherical atomic densities46, the total energy in the tight-binding representation is written as

$$E_{\mathrm{total}} = \sum_{m,m'} \rho_{m,m'} H_{m,m'} + \frac{1}{2}\sum_{I}\sum_{J\neq I} U\left(\left|\vec{R}_I - \vec{R}_J\right|\right), \qquad (5)$$
where \(\rho_{m,m'}\) is the density matrix and \(H_{m,m'}\) is the matrix element of the Hamiltonian between states \(m\) and \(m'\), with \(m = 1, \cdots, N_{\mathrm{basis}}\) labeling the states in the basis. \(\vec{R}_I\) is the position of atom \(I\), and \(J\) is a neighboring site of \(I\); the pairwise term collects the internuclear interaction corrected for double counting. The formula demonstrates that the total energy can be decomposed into pairwise contributions, which is consistent with the layer made up of the radial message passing layer in Fig. 2(a). Generally, the terms in Eq. (5) are both short-range interactions47,48,49 and can be extended to higher-order interactions. The total energy can then be expressed as \(E_{\mathrm{total}} = \sum_{I=1}^{N} \varepsilon_I^{\prime}\), a summation of local contributions from central particles. This indicates the locality of a system's overall energy, consistent with the idea underlying the seminal work of ref. 8, which is widely adopted in many follow-up works in this field.
In principle, the parameters of the sub-networks in DeePMD10 are not shared across element types, which is similar to heterogeneous GNNs. Thus the outperformance on extended systems results from the ability of the sub-networks used in this work. There are also other heterogeneous GNN frameworks designed for MD simulations, but their design principles are very different. MXMNet50 utilized multiplex graphs, which can be viewed as heterogeneous graphs with a single node type and two edge types, to capture global and local geometric information from multiplex graphs constructed with different cutoff radii. Heterogeneous molecular GNNs51 introduced heterogeneous graphs for molecules by grouping the original graph and a line graph into a single heterogeneous graph with two kinds of nodes, processing the node information of the original graph and the line graph with two different GNNs, respectively. The heterogeneity in these two works amounts to distinguishing original graphs from line graphs, and still treats the original graph itself as homogeneous.
In conclusion, we have developed HermNet, a framework based on heterogeneous GNNs, to learn multiple kinds of force fields in a single MD simulation by extracting the required subgraphs. Different from previous works, HermNet introduces heterogeneous graphs to describe the different interactions among element types rather than to distinguish the hierarchy of the interactions. Among the three variants of HermNet, we tested HVNet on a variety of systems, covering both molecular and extended systems, and obtained satisfactory results. Discussions based on quantum mechanics and DFT have been provided to justify our design choices. Although we primarily focused on experiments with HVNet, in principle HTNet is capable of modeling sophisticated interactions once enough data are provided. HVNet outperforms the state-of-the-art benchmark models on most of the tasks for small molecules. In the experiments on extended systems, HVNet also outperforms DeePMD10. These results demonstrate the powerful representation ability and promising application potential of HVNet for diverse and intricate systems such as HEAs. Finally, we emphasize that HermNet is a universal framework whose sub-networks can be replaced by other advanced or specialized models. For example, the unitary N-body tensor equivariant neural network (UNiTE)52, another remarkable framework based on elegant group theory, was proposed recently and has performed impressively on molecular datasets. We believe that HermNet can deliver improved results by replacing the current sub-networks with UNiTE52. Moreover, the many-body interactions in the sub-networks of HermNet could be truncated at higher order, e.g., by including dihedral-angle information53. HermNet can also be extended to model interactions from higher-order contributions by extracting higher-order subgraphs and invoking frameworks that model such contributions properly. More information can be found in Supplementary Discussion 3.
Methods
Architecture implementation
HermNet is implemented with PyTorch54 and the Deep Graph Library55 Python library. Neighbors of the central particle are found with the Scikit-learn56 library, and node features are extracted with the Atomic Simulation Environment57 and Pymatgen58 libraries. In our work, a simplified PAINN22 is implemented as the sub-network in both HVNet and HTNet. The angular formula in HVNet is the same as that in PAINN22, while that in HTNet is slightly different. A proof that angular information can be introduced naturally in HVNet and HTNet with PAINN is provided in Supplementary Notes 1.
Data availability
The raw data of revised MD17, QM9, and bulk systems are available at https://figshare.com/articles/dataset/Revised_MD17_dataset_rMD17_/12672038, https://deepchemdata.s3-us-west-1.amazonaws.com/datasets/molnet_publish/qm9.zip, and http://www.deepmd.org/database/deeppot-se-data/, respectively.
Code availability
The implementations of HermNet described in the paper are available at https://github.com/sakuraiiiii/HermNet.
References
Weinan, E. & Engquist, B. Multiscale modeling and computation. Not. Am. Math. Soc. 50, 1062–1070 (2003).
Horstemeyer, M. F. Multiscale modeling: a review, in Practical Aspects of Computational Chemistry: Methods, Concepts and Applications 87–135 (Springer, 2010).
Alder, B. J. & Wainwright, T. E. Studies in molecular dynamics. I. General method. J. Chem. Phys. 31, 459–466 (1959).
Car, R. & Parrinello, M. Unified approach for molecular dynamics and density-functional theory. Phys. Rev. Lett. 55, 2471 (1985).
Kohn, W. & Sham, L. J. Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133 (1965).
Jordan, M. I. & Mitchell, T. M. Machine learning: trends, perspectives, and prospects. Science 349, 255–260 (2015).
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).
Zhang, L., Han, J., Wang, H., Car, R. & Weinan, E. Deep potential molecular dynamics: a scalable model with the accuracy of quantum mechanics. Phys. Rev. Lett. 120, 143001 (2018).
Zhang, L. et al. End-to-end symmetry preserving inter-atomic potential energy model for finite and extended systems, in Advances in Neural Information Processing Systems (vol. 31, Curran Associates, Inc., 2018).
Bonati, L. & Parrinello, M. Silicon liquid structure and crystal nucleation from ab initio deep metadynamics. Phys. Rev. Lett. 121, 265701 (2018).
Niu, H., Bonati, L., Piaggi, P. M. & Parrinello, M. Ab initio phase diagram and nucleation of gallium. Nat. Commun. 11, 1–9 (2020).
Zhang, L., Wang, H., Car, R. & E, W. Phase diagram of a deep potential water model. Phys. Rev. Lett. 126, 236001 (2021).
Zhou, J. et al. Graph neural networks: a review of methods and applications. AI Open 1, 57–81 (2020).
Wu, Z. et al. A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 32, 4–24 (2020).
Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O. & Dahl, G. E. Neural message passing for quantum chemistry. In International Conference on Machine Learning, 1263–1272 (PMLR, 2017).
Schütt, K. T., Arbabzadah, F., Chmiela, S., Müller, K. R. & Tkatchenko, A. Quantum-chemical insights from deep tensor neural networks. Nat. Commun. 8, 1–8 (2017).
Schütt, K. et al. SchNet: a continuous-filter convolutional neural network for modeling quantum interactions. In Proc. 31st International Conference on Neural Information Processing Systems, 992–1002 (Curran Associates Inc., 2017).
Schütt, K. T., Sauceda, H. E., Kindermans, P.-J., Tkatchenko, A. & Müller, K.-R. SchNet–a deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).
Klicpera, J., Groß, J. & Günnemann, S. Directional message passing for molecular graphs. In International Conference on Learning Representations (ICLR, 2019).
Klicpera, J., Giri, S., Margraf, J. T. & Günnemann, S. Fast and uncertainty-aware directional message passing for non-equilibrium molecules. Preprint at http://arxiv.org/abs/2011.14115 (2020).
Schütt, K., Unke, O. & Gastegger, M. Equivariant message passing for the prediction of tensorial properties and molecular spectra. In Proc. 38th International Conference on Machine Learning, 9377–9388 (vol. 64, PMLR, 2021).
Wang, Z. et al. Symmetry-adapted graph neural networks for constructing molecular dynamics force fields. Sci. China.: Phys., Mech. Astron. 64, 1–9 (2021).
Plimpton, S. Fast parallel algorithms for short-range molecular dynamics. J. Comput. Phys. 117, 1–19 (1995).
Bondy, J. A. et al. Graph Theory with Applications (vol. 290, Macmillan London, 1976).
Schlichtkrull, M. et al. Modeling relational data with graph convolutional networks, in European Semantic Web Conference, 593–607 (Springer, 2018).
Chmiela, S. et al. Machine learning of accurate energy-conserving molecular force fields. Sci. Adv. 3, e1603015 (2017).
Chmiela, S., Sauceda, H. E., Müller, K.-R. & Tkatchenko, A. Towards exact molecular dynamics simulations with machine-learned force fields. Nat. Commun. 9, 1–10 (2018).
Christensen, A. S. & von Lilienfeld, O. A. On the role of gradients for machine learning of molecular energies and forces. Mach. Learn. Sci. Technol. 1, 045018 (2020).
Bartók, A. P., Payne, M. C., Kondor, R. & Csányi, G. Gaussian approximation potentials: the accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett. 104, 136403 (2010).
Fujikake, S. et al. Gaussian approximation potential modeling of lithium intercalation in carbon nanostructures. J. Chem. Phys. 148, 241714 (2018).
Deringer, V. L. & Csányi, G. Machine learning based interatomic potential for amorphous carbon. Phys. Rev. B 95, 094203 (2017).
Bartók, A. P., Kermode, J., Bernstein, N. & Csányi, G. Machine learning a general-purpose interatomic potential for silicon. Phys. Rev. X 8, 041048 (2018).
Mocanu, F. C. et al. Modeling the phase-change memory material, Ge2Sb2Te5, with a machine-learned interatomic potential. J. Phys. Chem. B 122, 8998–9006 (2018).
Sivaraman, G. et al. Experimentally driven automated machine-learned interatomic potential for a refractory oxide. Phys. Rev. Lett. 126, 156002 (2021).
Ruddigkeit, L., Van Deursen, R., Blum, L. C. & Reymond, J.-L. Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. J. Chem. Inf. Model. 52, 2864–2875 (2012).
Ramakrishnan, R., Dral, P. O., Rupp, M. & Von Lilienfeld, O. A. Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data 1, 1–7 (2014).
Hoshino, T. et al. First-principles calculations for vacancy formation energies in Cu and Al; non-local effect beyond the lsda and lattice distortion. Comp. Mater. Sci. 14, 56–61 (1999).
Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169 (1996).
Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 50, 17953 (1994).
Kresse, G. & Joubert, D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys. Rev. B 59, 1758 (1999).
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865 (1996).
Kapil, V. et al. i-PI 2.0: A universal force engine for advanced molecular simulations. Comput. Phys. Commun. 236, 214–223 (2019).
Togo, A. & Tanaka, I. First principles phonon calculations in materials science. Scr. Mater. 108, 1–5 (2015).
Martin, R. M. Electronic Structure: Basic Theory and Practical Methods (Cambridge university press, 2020).
Foulkes, W. M. C. & Haydock, R. Tight-binding models and density-functional theory. Phys. Rev. B 39, 12520 (1989).
Kohn, W. Density functional and density matrix method scaling linearly with the number of atoms. Phys. Rev. Lett. 76, 3168 (1996).
Prodan, E. & Kohn, W. Nearsightedness of electronic matter. Proc. Natl Acad. Sci. 102, 11635–11638 (2005).
Li, H. et al. Deep neural network representation of density functional theory hamiltonian. Preprint at http://arxiv.org/abs/2104.03786 (2021).
Zhang, S., Liu, Y. & Xie, L. Molecular mechanics-driven graph neural network with multiplex graph for molecular structures. Preprint at http://arxiv.org/abs/2011.07457 (2020).
Shui, Z. & Karypis, G. Heterogeneous molecular graph neural networks for predicting molecule properties. In 2020 IEEE International Conference on Data Mining (ICDM), 492–500 (IEEE, 2020).
Qiao, Z. et al. UNiTE: unitary N-body tensor equivariant network with applications to quantum chemistry. Preprint at http://arxiv.org/abs/2105.14655 (2021).
Klicpera, J., Becker, F. & Günnemann, S. GemNet: universal directional graph neural networks for molecules, in Advances in Neural Information Processing Systems (vol. 34, Curran Associates, Inc., 2021).
Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library, in Advances in Neural Information Processing Systems, 8024–8035 (vol. 32, Curran Associates, Inc., 2019).
Wang, M. et al. Deep graph library: a graph-centric, highly-performant package for graph neural networks. Preprint at http://arxiv.org/abs/1909.01315 (2019).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Larsen, A. H. et al. The atomic simulation environment—a python library for working with atoms. J. Phys. Condens. Matter 29, 273002 (2017).
Ong, S. P. et al. Python materials genomics (pymatgen): a robust, open-source python library for materials analysis. Comput. Mater. Sci. 68, 314–319 (2013).
Behler, J. Atom-centered symmetry functions for constructing high-dimensional neural network potentials. J. Chem. Phys. 134, 074106 (2011).
Anderson, B., Hy, T. S. & Kondor, R. Cormorant: covariant molecular neural networks, in Advances in Neural Information Processing Systems, 14537–14546 (vol. 32, Curran Associates, Inc., 2019).
Liu, Z. et al. Transferable multilevel attention neural network for accurate prediction of quantum chemistry properties via multitask learning. J. Chem. Inf. Model. 61, 1066–1082 (2021).
Acknowledgements
This work was supported by the Basic Science Center Project of NSFC (Grant No. 51788104), the Ministry of Science and Technology of China (Grants Nos. 2018YFA0307100, and 2018YFA0305603), the National Science Fund for Distinguished Young Scholars (Grant No. 12025405), the National Natural Science Foundation of China (Grant No. 11874035), Tsinghua University Initiative Scientific Research Program, and the Beijing Advanced Innovation Center for Future Chip (ICFC). The authors thank Tencent Quantum Lab for providing computational resources via Tencent Elastic First-principle Simulations (TEFS).
Author information
Contributions
Z.W., C.W. and W.D. conceived the idea and designed the research. Z.W. implemented the codes and S.Z. further optimized these codes. W.D., C.H., Y.X., S.H., and B.G. supervised the work. All authors discussed the results and were involved in the writing of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Wang, Z., Wang, C., Zhao, S. et al. Heterogeneous relational message passing networks for molecular dynamics simulations. npj Comput Mater 8, 53 (2022). https://doi.org/10.1038/s41524-022-00739-1