Abstract
With many frameworks based on message passing neural networks proposed to predict molecular and bulk properties, machine learning methods have tremendously shifted the paradigms of the computational sciences underpinning physics, materials science, chemistry, and biology. While existing machine learning models have achieved superior performance on many occasions, most of them model and process molecular systems as homogeneous graphs, which severely limits their expressive power for representing diverse interactions. In practice, graph data with multiple node and edge types are ubiquitous and more appropriate for molecular systems. Thus, we propose the heterogeneous relational message passing network (HermNet), an end-to-end heterogeneous graph neural network, to efficiently express multiple interactions in a single model with ab initio accuracy. HermNet performs impressively against many top-performing models on both molecular and extended systems. Specifically, HermNet outperforms other tested models on nearly 75%, 83% and 69% of the tasks from the revised Molecular Dynamics 17 (rMD17), Quantum Machines 9 (QM9) and extended-systems datasets, respectively. In addition, molecular dynamics simulations and material property calculations were performed with HermNet to demonstrate its performance. Finally, we elucidate how the design of HermNet is compatible with quantum mechanics from the perspective of density functional theory. Moreover, HermNet is a universal framework, whose sub-networks can be replaced by other advanced models.
Introduction
In the realms of physics, chemistry, materials science, and biology, multi-scale modeling1,2 helps us understand the properties of materials on multiple scales of time and space. Molecular dynamics (MD) simulation is an essential tool for modeling the dynamical evolution of a many-body system. The trajectories of the interacting particles are determined by solving Newton's equations of motion involving complex interatomic potentials. There are two mainstream approaches to performing MD simulations, i.e., classical MD3 and ab initio molecular dynamics (AIMD)4. The potential energy surface in classical MD is given by parameterized force fields of a presumed functional form, which facilitates large-scale calculations but possesses poor transferability across tasks. On the other hand, AIMD computes the total energy of a system using quantum mechanical methods, such as density functional theory (DFT)5, which guarantees applicability and accuracy under a wide variety of conditions. However, due to the cost of rigorously treating the electronic degrees of freedom, AIMD modeling is currently limited to physical and chemical systems of modest scale. With the rapid development of technologies for chemical and material synthesis, the need to construct force fields for large-scale calculations with accuracy comparable to that of first-principles methods has become ever more urgent.
One recent development addressing the above issue is to use machine learning methods6,7 to facilitate MD simulations. The most important tool in machine learning is the neural network. The first neural-network framework for MD simulations was proposed by Behler and Parrinello8 and is based on fully connected neural networks. Considerable success has been achieved along this route. In particular, Deep Potential (DeePMD)9,10 has been developed into a comprehensive software suite and has been used in simulations of crystal nucleation11,12 and the construction of phase diagrams13. Traditional neural networks, for example fully connected and convolutional neural networks, are most useful when the input data are Euclidean. However, atoms are intrinsically indistinguishable and cannot be ordered. As a result, heavy data preprocessing has to be performed in the above-mentioned frameworks. To alleviate this preprocessing burden, graph neural networks (GNNs)14,15 were introduced. The power of the graph formalism lies in its focus on relationships among entities (or nodes) rather than on the properties of individual nodes. In particular, message passing neural networks (MPNNs)16 summarized a general formula for spatial-domain GNNs. With atoms represented as nodes and the interactions or bonds between them represented as edges, molecules or crystals can be transformed into molecular or crystal graphs naturally. GNN-based frameworks for MD simulations, including DTNN17, SchNet18,19, DimeNet20,21, PAINN22, and MDGNN23, have accurately predicted the potential energy surfaces of small molecules and crystals. Current GNN-based MD simulations mostly use homogeneous graphs, where the message passing network is the same regardless of the types of the atoms. On the other hand, it is now common practice to use the hybrid pair style in MD simulations, which utilizes different force fields for atom pairs of different types.
The hybrid pair style is very useful for complex material systems, such as polymers on metal surfaces, polymers with nanoparticles, and solid-solid interfaces between two different materials. This motivates us to explore the possibility of improved performance by using heterogeneous graphs in GNN-based MD simulations.
In this work, we propose a framework to model diverse interactions in a single MD simulation, termed the heterogeneous relational message passing network (HermNet). The model shares a similar idea with the hybrid pair style in the Large-scale Atomic/Molecular Massively Parallel Simulator software24. HermNet splits the molecular or crystal graph into several subgraphs and uses a different message passing network for each subgraph. Within each subgraph, we choose a modified version of the polarizable atom interaction neural network (PAINN)22 as the sub-network. Experiments on molecular and extended systems were performed, and the results were satisfactory. HermNet provides a general method for designing heterogeneous GNNs for MD simulations.
Results
Preliminary
In graph theory25, a graph is a data structure composed of a set of vertices and a set of edges. Graphs can be classified as either undirected graphs or digraphs according to whether the orientations of the edges are explicitly designated. Since an undirected edge can be interpreted as a bidirectional link between a pair of nodes, an undirected graph can be regarded as a digraph.
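This bidirectional-link interpretation can be made concrete with a short sketch (plain Python with illustrative names, not taken from any graph library): each undirected edge is expanded into the two directed edges that represent it.

```python
def to_directed(undirected_edges):
    """Expand each undirected edge (u, v) into the two
    directed edges (u, v) and (v, u)."""
    directed = []
    for u, v in undirected_edges:
        directed.append((u, v))
        directed.append((v, u))
    return directed
```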
Graphs can be further classified as either homogeneous or heterogeneous, according to the types of their nodes and edges. A homogeneous graph is a special case of a heterogeneous graph. MPNN16, a universal spatial-domain GNN framework, was proposed for homogeneous graphs. With \(h_v\) and \(e_{vw}\) denoting, respectively, node features and edge features in a graph, MPNN is summarized as

$$m_v^{t+1} = \sum_{w \in \mathcal{N}(v)} M_t\left(h_v^t, h_w^t, e_{vw}\right), \qquad (1)$$

$$h_v^{t+1} = U_t\left(h_v^t, m_v^{t+1}\right), \qquad (2)$$

where the forward propagation is decomposed into two phases, a message passing phase and a readout phase. \(M_t\) and \(U_t\) are a message function and an update function, respectively. The hidden states \(h_w\) of all the neighbors \(\mathcal{N}(v)\) of vertex \(v\) are aggregated and then used to update the hidden state of vertex \(v\) in the next step. A heterogeneous graph supports sophisticated multi-type relations and inherently enables richer semantic relations. The relational graph convolutional network (R-GCN)26 is an extension of MPNN. \(\mathcal{G}=(\mathcal{V},\mathcal{E},\mathcal{R})\) denotes a heterogeneous graph with nodes (entities) \(v_i\in\mathcal{V}\) and labeled edges (relations) \((v_i,r,v_j)\in\mathcal{E}\), where \(r\in\mathcal{R}\) is a relation type that covers both canonical and inverse directions. A generalized forward process of an entity \(v_i\) in a relational graph takes the form

$$h_i^{t+1} = \sigma\left(\sum_{r\in\mathcal{R}}\sum_{j\in\mathcal{N}_i^r}\frac{1}{c_{i,r}}\,W_r^{t}\,h_j^{t} + W_0^{t}\,h_i^{t}\right), \qquad (3)$$

in which \(W_r^{t}\) is a relation-specific weight matrix, \(W_0^{t}\) acts on the self-connection, \(c_{i,r}\) is a normalization constant, and \(\sigma\) is an activation function,
where \(\mathcal{N}_i^r\) denotes the set of neighbor indices of vertex \(i\) under relation \(r\). Eq. (3) implies that a heterogeneous graph can be decomposed into several homogeneous graphs of distinct relations in \(\mathcal{R}\). Typically, each such homogeneous graph is a directed graph. In other words, an R-GCN layer is made up of multiple MPNN layers, each of which is associated with the homogeneous graph of a relation \(r\).
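As an illustrative sketch of this relational aggregation in Eq. (3) (scalar node features and scalar stand-ins for the weight matrices \(W_r\); all names are hypothetical rather than taken from the R-GCN or HermNet implementations):

```python
def rgcn_step(h, edges_by_relation, weights):
    """One relational message passing step (cf. R-GCN):
    for each node i, sum messages w_r * h_j over neighbors j
    of every relation r, normalized by the neighbor count.

    h: dict node -> scalar feature
    edges_by_relation: dict relation -> list of (src, dst) edges
    weights: dict relation -> scalar stand-in for W_r
    """
    new_h = dict(h)  # self-connection with identity weight
    for r, edges in edges_by_relation.items():
        # gather source neighbors per destination for this relation
        nbrs = {}
        for src, dst in edges:
            nbrs.setdefault(dst, []).append(src)
        for dst, srcs in nbrs.items():
            msg = sum(weights[r] * h[s] for s in srcs) / len(srcs)
            new_h[dst] += msg
    return new_h
```

Each relation contributes its own parameterized message function, which is precisely the decomposition into per-relation MPNN layers described above.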
Architecture
Diverse force-field forms are needed to capture intricate interactions, especially in systems with multiple elements. GNNs for homogeneous graphs model the interactions of different atomic pairs with shared parameters, which limits the expressive power of neural-network-based force fields. For example, as shown in Fig. 1(a), there are three kinds of particles, i.e., A-, B- and C-type atoms. The graph is constructed by linking central nodes with their adjacent nodes within a cutoff radius. In a classical MD simulation of this system, six different force fields can be allocated, for A–A pairs, A–B pairs, A–C pairs, etc., provided only two-body interactions are considered. If a homogeneous GNN is employed to model these different interactions by fitting a single function, it can be expected to produce only an averaged force field. On the other hand, equipped with multiple types of nodes and edges, a heterogeneous GNN is a natural choice to model these interactions at a finer resolution.
a The original graph, constructed via a certain method, with multiple node types, specifically the three types A, B, and C here. b Subgraphs extracted from the original graph according to triadic relations. The number of subgraphs is of the order of \(N_e^3\), where \(N_e\) is the number of node types. c Subgraphs extracted from the original graph according to the type of the central node. In this case, the number of subgraphs increases linearly with \(N_e\).
As shown in the following, we develop a universal framework, HermNet, to model diverse many-body interactions simultaneously by extracting appropriate subgraphs, which are subsequently processed by heterogeneous GNNs. An overview of the entire architecture of HermNet is displayed in Fig. 2(a); it takes the atomic numbers Z (and a vector of zeros) as the nodes' scalar (and vectorial) features. HermNet is composed of several message passing layers, termed HermConv layers, which model interactions hierarchically. The RMConv modules (Fig. 2b) for the different relations constitute the HermConv module (Fig. 2c, d). We introduce three variants of HermNet: the heterogeneous pair network (HPNet), the heterogeneous triadic network (HTNet), and the heterogeneous vertex network (HVNet). An HPNet layer for central nodes of A-type is displayed in Fig. 2c, where all the sub-networks with an A-type destination contribute to the local environment of the A-type node. A HermNet layer for HVNet and HTNet is displayed in Fig. 2d. If the parameters of its sub-networks [RMConv, see Fig. 3] are shared for the same kind of central node, the framework is referred to as HVNet; when the parameters are not shared, it is an HTNet. We only test and report HVNet's performance in the following sections, as the other two models (HPNet and HTNet) have higher complexity and would require more training time and data points for a proper assessment.
a The entire architecture diagram of HermNet, where {Z} is the set of atomic numbers, which will be passed through an embedding layer. Initial vectorial node features are all-zero vectors of fixed dimension. Each layer is expected to receive a scalar node feature {s}, a vectorial node feature, and a vectorial edge feature, i.e., the relative position vector \(\vec{r}_{ij}\), and then output updated scalar and vectorial node features as the inputs of the next layer. The final scalar node features are passed to a global pooling layer to form the feature of the graph. With the graph-level feature passed through a sequence of fully connected layers, the target property is predicted. b Sub-network for processing the related subgraphs, i.e., homogeneous digraphs. The layer is composed of message passing layers arranged hierarchically, such as a radial message passing layer for two-body interactions, an angular message passing layer for three-body interactions, and so on. These message passing layers are truncated according to the level of interactions to be modeled. The features and/or message passing layers drawn with dotted lines should be introduced in accordance with requirements. Several sub-networks modeling different relations compose a single heterogeneous relational message passing layer. When the interactions are truncated to two-body interactions, the entire framework is termed HPNet. c and d are the schematic diagrams of the HermConv module of HPNet and HTNet (HVNet), respectively. RMConv modules for different relations constitute the HermConv module. c Sub-network in HPNet for A-type nodes when the system contains only two kinds of elements, specifically A- and B-type. d The hidden states of an A-type vertex derive from a sub-network that is truncated to three-body interactions for the corresponding relations. The colors of the networks for the different three-body interactions represent the parameters.
If these colors are the same, meaning the parameters are shared across all three networks, the HermNet is termed HVNet. If not, the HermNet is termed HTNet.
a is the architecture of the sub-network, termed the relational message passing convolutional (RMConv) layer. Such an RMConv layer is a simplified and modified PAINN22 invoked for a specific type of interaction, and is constituted by the (b) radial message layer, c radial update layer, d angular message layer, and e or f angular update layer. {s} is the set of scalar node features, initially set to the atomic numbers, which will be passed through an embedding layer. The initial vectorial node feature \(\vec{v}^{(0)}\) is an all-zero vector of fixed dimension. \(\sin\left(\frac{n\pi}{r_{\mathrm{cut}}}\|\vec{r}_{ij}\|\right)/\|\vec{r}_{ij}\|\) with 1 ≤ n ≤ 30 are selected as the radial basis functions (RBF)20, and a cosine cutoff \(f_{\mathrm{cut}}\)59 is also adopted in the filter. The original message layer in PAINN22 (i.e., an MPNN layer) is decomposed into (b) and (c) (i.e., the radial message layer and radial update layer). A modified and simplified update layer is decomposed into (d) and (e) for HVNet or (f) for HTNet. These layers model three-body interactions via expressing angular information explicitly.
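The radial filter described above can be sketched in plain Python; the cosine cutoff shown is the common Behler-type form, which we assume corresponds to the cutoff cited as ref. 59:

```python
import math

def radial_basis(r, r_cut, n_max=30):
    """RBF from the caption: sin(n*pi*r/r_cut)/r for n = 1..n_max."""
    return [math.sin(n * math.pi * r / r_cut) / r
            for n in range(1, n_max + 1)]

def cosine_cutoff(r, r_cut):
    """A common cosine cutoff (assumed form): 0.5*(cos(pi*r/r_cut)+1)
    inside the cutoff radius, and exactly zero outside it."""
    if r >= r_cut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)
```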
Most machine learning frameworks for MD simulations take only the interatomic distances into account in feature engineering, ignoring the bond-angle information, which is an important characteristic of both molecules and crystals. In principle, bond angles can be deduced from interatomic distances. However, it is advantageous to include bond-angle information explicitly in feature engineering to achieve better performance. The directional message passing network (DimeNet)20,21 innovatively introduced three-body interactions explicitly by combining radial and angular information from the edges of the original graph and of the corresponding line graph, respectively. PAINN22 is a rotationally equivariant MPNN framework in which the complexity of calculating angular information is reduced. In this work, we incorporate angular information by choosing PAINN as the sub-network in HermNet. This specific message passing setup can be directly implemented in HVNet, while slight modifications are required in HTNet to distinguish the types of the source nodes. We note that HPNet cannot incorporate all angular information explicitly. For example, the bond angle A → B ← B is lost in HPNet because A → B and B ← B are processed by different sub-networks.
As discussed above, a heterogeneous graph can be decomposed into several homogeneous subgraphs. To describe the method of extracting these subgraphs, we use \(\mathcal{G}\), \(\hat{\mathcal{Q}}_s\), and \(\hat{\mathcal{Q}}_d\) to denote the input heterogeneous graph, the operator that returns the subgraph with specific source nodes, and the operator that returns the subgraph with specific destination nodes, respectively. As indicated in Fig. 1b, c, the directed subgraphs for HVNet can be extracted by selecting the inbound edges of a given A-type destination node, i.e., \(\hat{\mathcal{Q}}_d^{A}\mathcal{G}\), while those for HTNet are extracted by first selecting the inbound edges of a given B-type destination node and then keeping, among those, the edges whose source nodes are of A- or C-type, i.e., \(\hat{\mathcal{Q}}_s^{A\cup C}\hat{\mathcal{Q}}_d^{B}\mathcal{G}\) for the triadic relation A → B ← C. We note that if two destination-node extractions were instead applied sequentially for HTNet, the result would generally be an empty graph.
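The operators \(\hat{\mathcal{Q}}_s\) and \(\hat{\mathcal{Q}}_d\) can be sketched on a plain typed edge list (illustrative code, not the actual DGL-based implementation):

```python
def q_dst(edges, node_type, dst_type):
    """Q_d operator: keep edges whose destination node has dst_type."""
    return [(u, v) for u, v in edges if node_type[v] == dst_type]

def q_src(edges, node_type, src_types):
    """Q_s operator: keep edges whose source node type is in src_types."""
    return [(u, v) for u, v in edges if node_type[u] in src_types]
```

Composing them reproduces the extractions above: `q_dst(edges, t, 'A')` yields the HVNet subgraph for A-type destinations, while `q_src(q_dst(edges, t, 'B'), t, {'A', 'C'})` yields the HTNet subgraph for the triadic relation A → B ← C.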
In the following, we report tests of HVNet against prior frameworks on three well-established benchmark datasets. As detailed below, HVNet convincingly outperforms most of the prior methods.
Benchmarks on MD17 dataset
The MD17 dataset17,27,28 provides non-equilibrium structures sampled (at a time resolution of 0.5 fs) from AIMD trajectories of eight small molecules at a background temperature of 500 K. The potential energy and force labels were computed with the PBE + vdW−TS method. Christensen and von Lilienfeld29 found that the energies in the original MD17 dataset are contaminated with substantial numerical noise and published a revised version of the dataset. Distinct HVNet models were trained on this revised dataset, with a 1000-frame training set and a 1000-frame validation set randomly selected. The learning rate was initially set to 3 × 10−4 and adaptively reduced when the loss on the validation set reached a plateau. The cutoff radius was set to 5 Å for the construction of the molecular graphs. Additional details can be found in the Supplementary Methods. Table 1 presents the comparison of the mean absolute errors (MAEs) of three benchmarked models and HVNet. It should be noted that PAINN and HVNet were trained on the revised MD17 dataset, while SchNet and DimeNet were trained on the original MD17 dataset. HVNet outperforms the other models by a comfortable margin on three-quarters of the predictive tasks, and its results on the remaining tasks are comparable to the best results among all four frameworks. We also attempted to train an HTNet on the MD17 dataset; however, the parameter space of HTNet is simply too large, and obvious overfitting was observed after just a few training epochs. We then trained the HTNet model on the HfO2 dataset, which was proposed for fitting Gaussian approximation potential models30,31,32,33,34,35, and found that when more than 1500 data points were used for training, no obvious overfitting occurred (a detailed discussion of training HTNet is provided in the Supplementary Notes 2). This indicates that HTNet may become expressive once more data points are provided.
Benchmarks on QM9 dataset
The QM9 dataset36,37 consists of computed geometric, energetic, electronic, and thermodynamic properties for 134k stable small organic molecules made up of carbon, hydrogen, oxygen, nitrogen, and fluorine. All properties were calculated at the B3LYP/6-31G(2df,p) level of quantum chemistry. This dataset provides quantum chemical insight into the relevant chemical space of small organic molecules and has been widely adopted as a benchmark to calibrate, analyze, and evaluate new methods in this field. HVNet was trained on 110k molecules and validated on another 10k molecules. The properties of the 134k molecules include the dipole moment (μ), isotropic polarizability (α), energy of the highest occupied molecular orbital (εHOMO), energy of the lowest unoccupied molecular orbital (εLUMO), band gap (Δε), electronic spatial extent (R2), zero-point vibrational energy (ZPVE), internal energy at 0 K (U0), internal energy at 298.15 K (U), enthalpy at 298.15 K (H), free energy at 298.15 K (G), and heat capacity at 298.15 K (cv). It must be emphasized that HVNet was trained on atomization energies rather than the original internal energies, enthalpies, and free energies, i.e., on the original energies minus the atomic reference energies, which is the protocol advocated in the DimeNet work of Klicpera et al.20. These adjusted targets are more reasonable because absolute energies are generally meaningless and relative energies essentially convey all the physical implications. Table 2 reports the MAEs of HVNet for the 12 tasks in the QM9 dataset in comparison with eight other models. HVNet outperforms all baselines on 10 out of the 12 tasks. For the other two tasks, R2 and ZPVE, the MAEs of HVNet are on par with some of the baselines. Details of additional settings and the definitions of the physical quantities with respect to the models and datasets are provided in the Supplementary Methods and Supplementary Discussion 1.
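The atomization-energy protocol amounts to subtracting isolated-atom reference energies from each molecular energy; a minimal sketch (with hypothetical reference values in the usage below) is:

```python
def atomization_energy(total_energy, formula, atom_ref):
    """Subtract single-atom reference energies from the total energy,
    following the DimeNet-style target protocol.

    formula: dict element -> count, e.g. {'H': 2, 'O': 1}
    atom_ref: dict element -> isolated-atom reference energy
    """
    return total_energy - sum(n * atom_ref[el] for el, n in formula.items())
```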
Benchmark on extended systems
Predicting the properties of extended systems is a more ambitious task because of their intricate chemical environments. Since HermNet is capable of handling extended systems, we conducted this more challenging benchmark on the extended-system datasets provided in ref. 10. The datasets contain the properties of 8 different systems, among which bulk C5H5N, bulk TiO2, the system consisting of MoS2 and Pt, and the high-entropy alloy (HEA) are the four most difficult tasks: the bulk C5H5N and bulk TiO2 datasets include multiple phases, and the MoS2 and Pt system includes five different datasets. Unfortunately, training on the MoS2 + Pt dataset required too much computational time, so we chose not to further pursue this benchmark after some preliminary tuning (and no corresponding results are shown). The HEA dataset is explicitly divided into two parts: the model is trained on the first part, which includes 40 kinds of equimolar five-element CoCrFeMnNi HEA with random occupations, and then tested on the test set of the first part and on the entire second part, which includes another 16 kinds of HEA with random occupations. Table 3 shows the comparison of root mean square errors (RMSEs) between DeepPot-SE/DeePMD10 and HVNet. Since the potential energy is an extensive quantity, the RMSEs of the energies were normalized by the system size, consistent with how the DeepPot-SE and DeePMD results10 were presented. As shown in Table 3, HVNet achieved lower RMSEs than DeepPot-SE on all tasks except the MoS2 and Pt dataset, which, as noted above, was not pursued because of the excessive training time required. Details of additional settings and specific discussions are provided in the Supplementary Methods and Supplementary Discussion 2. In addition, we calculated the vacancy formation energy of bulk Cu with the trained model. An arbitrary Cu atom was removed, and the configuration was relaxed with DFT and HVNet, respectively.
The chemical potential of Cu was calculated from DFT and the vacancy formation energies from DFT and HVNet are 1.03 eV and 1.07 eV, respectively, which are also consistent with previous computational and experimental results (1.14 eV and 1.17–1.28 eV, respectively)38.
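The system-size normalization of the energy RMSE used in Table 3 can be sketched as follows (illustrative code):

```python
import math

def per_atom_energy_rmse(pred, true, n_atoms):
    """RMSE of predicted total energies, with each error first
    normalized by the number of atoms in that configuration."""
    sq = [((p - t) / n) ** 2 for p, t, n in zip(pred, true, n_atoms)]
    return math.sqrt(sum(sq) / len(sq))
```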
Molecular dynamics simulation and phonon dispersion
To demonstrate the performance of HermNet, an MD simulation of a MoSe2 monolayer was performed. The dataset was generated with the Vienna ab initio Simulation Package (VASP)39 using projector-augmented wave40,41 pseudopotentials. The Perdew-Burke-Ernzerhof exchange-correlation functional42 was used. The plane-wave cutoff was 260 eV, and a 2 × 2 × 1 gamma-centered k-point mesh was adopted to sample the Brillouin zone of the 6 × 6 × 1 supercell. The simulation was carried out in the canonical ensemble with the temperature increasing from 100 K to 1500 K, and 5000 frames were obtained with a time step of 1 fs. The dataset was randomly shuffled and split into training, validation, and test sets in the ratio 8:1:1. The MAEs of the energy and forces on the test set were 0.09 meV per atom and 2.93 meV Å−1, respectively. The comparison of the radial distribution functions at 300 K from AIMD and from i-PI43, a classical MD simulation software, with HVNet as the force field, is shown in Fig. 4a. Furthermore, the phonon dispersion was calculated by interfacing HermNet with phonopy44. The acoustic sum rule was enforced to ensure that the three acoustic modes vanish at the Γ point. As shown in Fig. 4b, the performance of HermNet on the phonon dispersion demonstrates that even the second-order derivative of the potential energy reaches high precision.
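The acoustic sum rule constrains the force constants so that a uniform translation of the crystal costs no energy; a minimal scalar sketch (the actual calculation acts on 3 × 3 force-constant blocks inside phonopy) is:

```python
def enforce_acoustic_sum_rule(fc):
    """Adjust the diagonal (self) force constants so that each row
    sums to zero: Phi_ii = -sum_{j != i} Phi_ij.  With this
    constraint a uniform translation produces no restoring force,
    so the acoustic branches vanish at the Gamma point.

    fc: square list-of-lists of scalar force constants.
    """
    n = len(fc)
    out = [row[:] for row in fc]
    for i in range(n):
        out[i][i] = -sum(out[i][j] for j in range(n) if j != i)
    return out
```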
Discussion
The complexity of a sub-network generally scales as \(\mathcal{O}(|\mathcal{N}|)\), where \(|\mathcal{N}|\) is typically the number of neighbors captured within the cutoff radius. The numbers of sub-networks for HVNet, HPNet, and HTNet are \(\mathcal{O}(N_e)\), \(\mathcal{O}(N_e^2)\) and \(\mathcal{O}(N_e^3)\), respectively. Here, \(N_e\) is the number of element types present in the system. Therefore, HVNet is most useful when the number of distinct elements is large. Further discussion of the complexity analysis is deferred to the Supplementary Notes 2.
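These sub-network counts can be checked directly (illustrative sketch; HTNet is counted here over ordered triads, matching the \(\mathcal{O}(N_e^3)\) estimate):

```python
from itertools import product

def n_subnetworks(elements):
    """Count the sub-networks each HermNet variant needs:
    HVNet - one per destination element type       -> O(Ne)
    HPNet - one per (source, destination) pair     -> O(Ne^2)
    HTNet - one per ordered src-dst-src triad      -> O(Ne^3)
    """
    ne = len(elements)
    hv = ne
    hp = len(list(product(elements, repeat=2)))
    ht = len(list(product(elements, repeat=3)))
    return hv, hp, ht
```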
To construct an accurate force field for classical MD simulations, the potential energy surface needs to be reproduced up to first-principles precision. The potential energy has a hierarchical structure and can be decomposed into several terms as follows,

$$E = \sum_{i} E^{(1)}(\vec{r}_i) + \sum_{i<j} E^{(2)}(\vec{r}_i, \vec{r}_j) + \sum_{i<j<k} E^{(3)}(\vec{r}_i, \vec{r}_j, \vec{r}_k) + \cdots, \qquad (4)$$
where the first term represents the energy of a single atom and the second term is the summation of all the pairwise interactions, such as the energy contributed by bonds. The third term denotes the three-body interactions, which typically entail angular specifications. Higher-order many-body interactions can be further included in order to build a more accurate potential energy surface. The layers shown in Fig. 3b, c, which are equivalent to the message layer in the original PAINN proposal22, can be viewed as a single MPNN layer that models two-body interactions, since they merely process radial information. The inner products of the positional vectors in the modules of Fig. 3d, e or f are responsible for modeling three-body interactions. Thus the sub-network, i.e., the concatenation of these layers as shown in Figs. 2b and 3a, exactly conforms to the hierarchical rule of Eq. (4).
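A toy evaluation of the hierarchy of Eq. (4), truncated at three-body terms, with caller-supplied (hypothetical) one-, two-, and three-body potentials:

```python
def total_energy(atoms, e1, v2, v3):
    """Evaluate the hierarchical decomposition, truncated at
    three-body terms:
      E = sum_i e1(i) + sum_{i<j} v2(i, j) + sum_{i<j<k} v3(i, j, k)
    e1, v2, v3 are caller-supplied (hypothetical) potentials.
    """
    n = len(atoms)
    e = sum(e1(atoms[i]) for i in range(n))
    e += sum(v2(atoms[i], atoms[j])
             for i in range(n) for j in range(i + 1, n))
    e += sum(v3(atoms[i], atoms[j], atoms[k])
             for i in range(n) for j in range(i + 1, n)
             for k in range(j + 1, n))
    return e
```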
On the other hand, the graphs are constructed with a specific cutoff radius, and only the information of 1-hop neighbors is aggregated in a single MPNN layer. The final energy prediction is obtained with a global pooling operation over all local environments. This suggests that locality is an essential property that facilitates the learning of potential energies. The DFT total energy can be expressed as a summation of the eigenvalues of the electronic Hamiltonian and the interaction of the nuclei, with a correction to avoid double counting45. To take advantage of a localized basis, as in a graph, we discuss the total energy within the tight-binding framework, which provides more physical insight. When the density is expressed as a superposition of spherical atomic densities46, the total energy in the tight-binding representation is written as

$$E_{\mathrm{total}} = \sum_{m,m'} \rho_{m,m'} H_{m,m'} + \frac{1}{2}\sum_{I}\sum_{J\neq I} U\left(\left|\vec{R}_I - \vec{R}_J\right|\right), \qquad (5)$$
where \(\rho_{m,m'}\) is the density matrix and \(H_{m,m'}\) is the matrix element of the Hamiltonian between states \(m\) and \(m'\), with \(m = 1, \cdots, N_{\mathrm{basis}}\) labeling the states in the basis. \(\vec{R}_I\) is the position of atom \(I\), and \(J\) is a neighboring site of \(I\); the pairwise term collects the internuclear interaction corrected for double counting. The formula demonstrates that the total energy can be decomposed into pairwise contributions, which is consistent with the layer made up of the radial message passing layer in Fig. 2(a). Generally, the terms in Eq. (5) are both short-range interactions47,48,49 and can be extended to higher-order interactions. The total energy can then be expressed as \(E_{\mathrm{total}} = \sum_{I=1}^{N} \varepsilon_I^{\prime}\), a summation of local contributions from central particles. This indicates the locality of a system's overall energy, consistent with the idea underlying the seminal work of ref. 8, which is widely adopted in many follow-up works in this field.
In principle, the parameters of the sub-networks in DeePMD10 are not shared across element types, which is similar to heterogeneous GNNs. Thus the outperformance on extended systems results from the ability of the sub-networks used in this work. There are also other heterogeneous GNN frameworks designed for MD simulations, but their design principles are very different. MXMNet50 utilized multiplex graphs, which can be viewed as heterogeneous graphs with a single node type and two edge types, to capture global and local geometric information from multiplex graphs constructed with different cutoff radii. Heterogeneous molecular GNNs51 introduced heterogeneous graphs for molecules by grouping the original graph and a line graph into a single heterogeneous graph with two kinds of nodes, processing the node information of the original graph and the line graph with two different GNNs, respectively. The heterogeneity in these two works amounts to distinguishing original graphs from line graphs, and still treats the original graph itself as homogeneous.
In conclusion, we have developed HermNet, a framework based on heterogeneous GNNs, to learn multiple kinds of force fields in a single MD simulation by extracting the required subgraphs. Different from previous works, HermNet introduces heterogeneous graphs to describe the different interactions among element types rather than to distinguish the hierarchy of the interactions. Among the three variants of HermNet, we tested HVNet on a variety of systems, covering both molecular and extended systems, and obtained satisfactory results. Discussions based on quantum mechanics and DFT have been provided to justify our design choices. Although we primarily focused on experiments with HVNet, in principle HTNet is capable of modeling sophisticated interactions once enough data are provided. HVNet outperforms the state-of-the-art benchmark models on most of the tasks for small molecules. In the experiments on extended systems, HVNet also outperforms DeePMD10. These results demonstrate the powerful representation ability and promising application potential of HVNet for diverse and intricate systems such as HEAs. Finally, we emphasize that HermNet is a universal framework whose sub-networks can be replaced by other advanced or specialized models. For example, the unitary N-body tensor equivariant neural network (UNiTE)52, another remarkable framework based on elegant group theory, was proposed recently and has performed impressively on molecular datasets. We believe that HermNet can deliver improved results by replacing the current sub-networks with UNiTE52. Moreover, the many-body interactions in the sub-networks of HermNet could be truncated at higher order, e.g., by including dihedral-angle information53. HermNet can also be extended to model interactions from higher-order contributions by extracting higher-order subgraphs and invoking frameworks that model such contributions properly. More information can be found in Supplementary Discussion 3.
Methods
Architecture implementation
HermNet is implemented with PyTorch54 and the Deep Graph Library55 Python library. Neighbors of the central particle are found with the Scikit-learn56 library, and node features are extracted with the Atomic Simulation Environment57 and Pymatgen58 libraries. In our work, a simplified PAINN22 is implemented as the sub-network in both HVNet and HTNet. The angular formula in HVNet is the same as that in PAINN22, while that in HTNet is slightly different. A proof that angular information can be introduced naturally in HVNet and HTNet with PAINN is provided in Supplementary Notes 1.
Data availability
The raw data of revised MD17, QM9, and bulk systems are available at https://figshare.com/articles/dataset/Revised_MD17_dataset_rMD17_/12672038, https://deepchemdata.s3-us-west-1.amazonaws.com/datasets/molnet_publish/qm9.zip, and http://www.deepmd.org/database/deeppot-se-data/, respectively.
Code availability
The implementations of HermNet described in the paper are available at https://github.com/sakuraiiiii/HermNet.
References
Weinan, E. & Engquist, B. Multiscale modeling and computation. Not. Am. Math. Soc. 50, 1062–1070 (2003).
Horstemeyer, M. F. Multiscale modeling: a review, in Practical Aspects of Computational Chemistry: Methods, Concepts and Applications 87–135 (Springer, 2010).
Alder, B. J. & Wainwright, T. E. Studies in molecular dynamics. I. General method. J. Chem. Phys. 31, 459–466 (1959).
Car, R. & Parrinello, M. Unified approach for molecular dynamics and density-functional theory. Phys. Rev. Lett. 55, 2471 (1985).
Kohn, W. & Sham, L. J. Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133 (1965).
Jordan, M. I. & Mitchell, T. M. Machine learning: trends, perspectives, and prospects. Science 349, 255–260 (2015).
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).
Zhang, L., Han, J., Wang, H., Car, R. & Weinan, E. Deep potential molecular dynamics: a scalable model with the accuracy of quantum mechanics. Phys. Rev. Lett. 120, 143001 (2018).
Zhang, L. et al. End-to-end symmetry preserving inter-atomic potential energy model for finite and extended systems, in Advances in Neural Information Processing Systems (vol. 31, Curran Associates, Inc., 2018).
Bonati, L. & Parrinello, M. Silicon liquid structure and crystal nucleation from ab initio deep metadynamics. Phys. Rev. Lett. 121, 265701 (2018).
Niu, H., Bonati, L., Piaggi, P. M. & Parrinello, M. Ab initio phase diagram and nucleation of gallium. Nat. Commun. 11, 1–9 (2020).
Zhang, L., Wang, H., Car, R. & E, W. Phase diagram of a deep potential water model. Phys. Rev. Lett. 126, 236001 (2021).
Zhou, J. et al. Graph neural networks: a review of methods and applications. AI Open 1, 57–81 (2020).
Wu, Z. et al. A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 32, 4–24 (2020).
Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O. & Dahl, G. E. Neural message passing for quantum chemistry. In International Conference on Machine Learning, 1263–1272 (PMLR, 2017).
Schütt, K. T., Arbabzadah, F., Chmiela, S., Müller, K. R. & Tkatchenko, A. Quantum-chemical insights from deep tensor neural networks. Nat. Commun. 8, 1–8 (2017).
Schütt, K. et al. SchNet: a continuous-filter convolutional neural network for modeling quantum interactions. In Proc. 31st International Conference on Neural Information Processing Systems, 992–1002 (Curran Associates Inc., 2017).
Schütt, K. T., Sauceda, H. E., Kindermans, P.-J., Tkatchenko, A. & Müller, K.-R. SchNet–a deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).
Klicpera, J., Groß, J. & Günnemann, S. Directional message passing for molecular graphs. In International Conference on Learning Representations (ICLR, 2019).
Klicpera, J., Giri, S., Margraf, J. T. & Günnemann, S. Fast and uncertainty-aware directional message passing for non-equilibrium molecules. Preprint at http://arxiv.org/abs/2011.14115 (2020).
Schütt, K., Unke, O. & Gastegger, M. Equivariant message passing for the prediction of tensorial properties and molecular spectra. In Proc. 38th International Conference on Machine Learning, 9377–9388 (vol. 64, PMLR, 2021).
Wang, Z. et al. Symmetry-adapted graph neural networks for constructing molecular dynamics force fields. Sci. China.: Phys., Mech. Astron. 64, 1–9 (2021).
Plimpton, S. Fast parallel algorithms for short-range molecular dynamics. J. Comput. Phys. 117, 1–19 (1995).
Bondy, J. A. et al. Graph Theory with Applications (vol. 290, Macmillan London, 1976).
Schlichtkrull, M. et al. Modeling relational data with graph convolutional networks, in European Semantic Web Conference, 593–607 (Springer, 2018).
Chmiela, S. et al. Machine learning of accurate energy-conserving molecular force fields. Sci. Adv. 3, e1603015 (2017).
Chmiela, S., Sauceda, H. E., Müller, K.-R. & Tkatchenko, A. Towards exact molecular dynamics simulations with machine-learned force fields. Nat. Commun. 9, 1–10 (2018).
Christensen, A. S. & von Lilienfeld, O. A. On the role of gradients for machine learning of molecular energies and forces. Mach. Learn. Sci. Technol. 1, 045018 (2020).
Bartók, A. P., Payne, M. C., Kondor, R. & Csányi, G. Gaussian approximation potentials: the accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett. 104, 136403 (2010).
Fujikake, S. et al. Gaussian approximation potential modeling of lithium intercalation in carbon nanostructures. J. Chem. Phys. 148, 241714 (2018).
Deringer, V. L. & Csányi, G. Machine learning based interatomic potential for amorphous carbon. Phys. Rev. B 95, 094203 (2017).
Bartók, A. P., Kermode, J., Bernstein, N. & Csányi, G. Machine learning a general-purpose interatomic potential for silicon. Phys. Rev. X 8, 041048 (2018).
Mocanu, F. C. et al. Modeling the phase-change memory material, Ge2Sb2Te5, with a machine-learned interatomic potential. J. Phys. Chem. B 122, 8998–9006 (2018).
Sivaraman, G. et al. Experimentally driven automated machine-learned interatomic potential for a refractory oxide. Phys. Rev. Lett. 126, 156002 (2021).
Ruddigkeit, L., Van Deursen, R., Blum, L. C. & Reymond, J.-L. Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. J. Chem. Inf. Model. 52, 2864–2875 (2012).
Ramakrishnan, R., Dral, P. O., Rupp, M. & Von Lilienfeld, O. A. Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data 1, 1–7 (2014).
Hoshino, T. et al. First-principles calculations for vacancy formation energies in Cu and Al; non-local effect beyond the lsda and lattice distortion. Comp. Mater. Sci. 14, 56–61 (1999).
Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169 (1996).
Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 50, 17953 (1994).
Kresse, G. & Joubert, D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys. Rev. B 59, 1758 (1999).
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865 (1996).
Kapil, V. et al. i-PI 2.0: A universal force engine for advanced molecular simulations. Comput. Phys. Commun. 236, 214–223 (2019).
Togo, A. & Tanaka, I. First principles phonon calculations in materials science. Scr. Mater. 108, 1–5 (2015).
Martin, R. M. Electronic Structure: Basic Theory and Practical Methods (Cambridge university press, 2020).
Foulkes, W. M. C. & Haydock, R. Tight-binding models and density-functional theory. Phys. Rev. B 39, 12520 (1989).
Kohn, W. Density functional and density matrix method scaling linearly with the number of atoms. Phys. Rev. Lett. 76, 3168 (1996).
Prodan, E. & Kohn, W. Nearsightedness of electronic matter. Proc. Natl Acad. Sci. 102, 11635–11638 (2005).
Li, H. et al. Deep neural network representation of density functional theory hamiltonian. Preprint at http://arxiv.org/abs/2104.03786 (2021).
Zhang, S., Liu, Y. & Xie, L. Molecular mechanics-driven graph neural network with multiplex graph for molecular structures. Preprint at http://arxiv.org/abs/2011.07457 (2020).
Shui, Z. & Karypis, G. Heterogeneous molecular graph neural networks for predicting molecule properties. In 2020 IEEE International Conference on Data Mining (ICDM), 492–500 (IEEE, 2020).
Qiao, Z. et al. UNiTE: unitary N-body tensor equivariant network with applications to quantum chemistry. Preprint at http://arxiv.org/abs/2105.14655 (2021).
Klicpera, J., Becker, F. & Günnemann, S. GemNet: universal directional graph neural networks for molecules, in Advances in Neural Information Processing Systems (vol. 34, Curran Associates, Inc., 2021).
Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library, in Advances in Neural Information Processing Systems, 8024–8035 (vol. 32, Curran Associates, Inc., 2019).
Wang, M. et al. Deep graph library: a graph-centric, highly-performant package for graph neural networks. Preprint at http://arxiv.org/abs/1909.01315 (2019).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Larsen, A. H. et al. The atomic simulation environment—a python library for working with atoms. J. Phys. Condens. Matter 29, 273002 (2017).
Ong, S. P. et al. Python materials genomics (pymatgen): a robust, open-source python library for materials analysis. Comput. Mater. Sci. 68, 314–319 (2013).
Behler, J. Atom-centered symmetry functions for constructing high-dimensional neural network potentials. J. Chem. Phys. 134, 074106 (2011).
Anderson, B., Hy, T. S. & Kondor, R. Cormorant: covariant molecular neural networks, in Advances in Neural Information Processing Systems, 14537–14546 (vol. 32, Curran Associates, Inc., 2019).
Liu, Z. et al. Transferable multilevel attention neural network for accurate prediction of quantum chemistry properties via multitask learning. J. Chem. Inf. Model. 61, 1066–1082 (2021).
Acknowledgements
This work was supported by the Basic Science Center Project of NSFC (Grant No. 51788104), the Ministry of Science and Technology of China (Grants Nos. 2018YFA0307100, and 2018YFA0305603), the National Science Fund for Distinguished Young Scholars (Grant No. 12025405), the National Natural Science Foundation of China (Grant No. 11874035), Tsinghua University Initiative Scientific Research Program, and the Beijing Advanced Innovation Center for Future Chip (ICFC). The authors thank Tencent Quantum Lab for providing computational resources via Tencent Elastic First-principle Simulations (TEFS).
Author information
Contributions
Z.W., C.W. and W.D. conceived the idea and designed the research. Z.W. implemented the codes and S.Z. further optimized these codes. W.D., C.H., Y.X., S.H., and B.G. supervised the work. All authors discussed the results and were involved in the writing of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Wang, Z., Wang, C., Zhao, S. et al. Heterogeneous relational message passing networks for molecular dynamics simulations. npj Comput Mater 8, 53 (2022). https://doi.org/10.1038/s41524-022-00739-1