Abstract
Diffusion Monte Carlo (DMC) based on the fixed-node approximation has enjoyed significant developments in the past decades and has become one of the go-to methods when an accurate ground state energy of molecules and materials is needed. However, the inaccurate nodal structure hinders the application of DMC to more challenging electronic correlation problems. In this work, we apply a neural-network-based trial wavefunction in fixed-node DMC, which allows accurate calculations of a broad range of atomic and molecular systems of different electronic characteristics. Our method is superior in both accuracy and efficiency compared to state-of-the-art neural network methods using variational Monte Carlo (VMC). We also introduce an extrapolation scheme based on the empirical linearity between VMC and DMC energies, and significantly improve our binding energy calculation. Overall, this computational framework provides a benchmark for accurate solutions of correlated electronic wavefunctions and also sheds light on the chemical understanding of molecules.
Introduction
Since the establishment of quantum wavefunction theory by Erwin Schrödinger, ab initio electronic structure calculation has become one of the holy grails in chemistry^{1,2}. Molecules generally consist of a set of nuclei bonded together via electrons through electrostatic interactions. Therefore, the ground state electronic structure, i.e., the many-body electronic wavefunction, is the most fundamental property, on which we build our basic understanding of molecules. On top of the ground state wavefunction solution, one may further study electronic excitation, calculate nuclear forces and vibrations, optimize molecular structures, model dynamics and reactions, etc.^{3}. Approximate methods, such as density functional theory and post-Hartree-Fock methods, have been widely employed for these purposes, but challenges still exist when high accuracy is needed^{4,5}. For instance, sub-chemical accuracy is often desired to predict the adsorption of molecules on surfaces, the packing order of organic chemicals, and the hydrogen bonding of water and biological molecules^{6,7}. Therefore, pushing the limit towards the exact ground state wavefunction of molecules is of both fundamental importance and practical relevance.
Stochastic approaches, i.e., quantum Monte Carlo (QMC) methods, have been a competitive rival of the deterministic methods in chasing the ground truth of the many-body electronic wavefunction of molecules^{8,9,10,11}. In particular, diffusion Monte Carlo (DMC), an approach based on ground state projection, is capable of treating dynamic correlations and reaching sub-chemical accuracy for molecules^{12,13}. However, effective DMC algorithms usually work together with the so-called fixed-node approximation^{14,15}, and the accuracy is only assured when a good trial wavefunction containing the correct nodal structure is provided in advance^{16}. Although much progress has been made in improving the trial wavefunction, e.g., by using physically more meaningful ansatzes or by combining with multi-determinant post-Hartree-Fock wavefunctions^{13,17,18}, the fixed-node approximation remains the Achilles’ heel of DMC.
Recently, it has been shown that machine learning techniques such as neural networks can lend strong support to describing the electronic structure of molecular systems and provide a powerful way to reconstruct the many-body wavefunction^{19,20,21,22,23,24,25,26}. FermiNet is one of the notable examples, which has already shown promising results for small molecules typically consisting of fewer than 30 electrons^{20,21,27}. In these neural network wavefunction methods, variational Monte Carlo (VMC) is often employed to train the network on the fly. Despite its effectiveness on small molecules, it remains challenging to apply neural-network-based VMC to larger systems, due to the large computational resources and long convergence time required.
In this work, we integrate the FermiNet neural network wavefunction into DMC. This approach takes advantage of the accurate trial wavefunction of FermiNet and the efficient ground state projection of DMC, which allows calculations of a range of systems to unprecedented accuracy. We refer to the vanilla FermiNet approach as FermiNet-VMC, and to our FermiNet-based DMC approach as FermiNet-DMC. Compared to FermiNet-VMC, FermiNet-DMC is able to achieve a lower variational ground state energy at reduced computational cost. We carry out tests on atoms as well as molecules including N_{2}, cyclobutadiene, water dimer, benzene and benzene dimer. We also present the empirical linear relation between VMC and DMC energies in our calculations and introduce an extrapolation scheme accordingly. Insights into the electronic structure of these systems obtained from our calculations are also discussed.
Results
Computational framework
As illustrated in Fig. 1a, in the traditional electronic structure approach, diffusion Monte Carlo is often used after optimization of the trial wavefunction using VMC, which approaches the limit of a given wavefunction ansatz. DMC further purifies the true ground state out of other contaminating eigenstates, and it often allows breaking through the ansatz limit. However, to overcome the notorious sign problem, the nodal points where the wavefunction is zero have to be fixed in DMC, and walkers are only allowed to evolve within each fixed nodal pocket. Here, the idea is to use the recently developed neural network as an accurate wavefunction ansatz (Fig. 1b). On one hand, the wavefunction learned by the neural network automatically reproduces an accurate representation of the mysterious nodal structure of the many electrons in molecules. The accurate nodal structure ensures that the subsequent DMC simulation with fixed nodes does not bias the ground state. On the other hand, compared with neural-network-based VMC, our scheme only requires the information of the nodal structure instead of the full wavefunction. It is reasonable to expect the nodal structure to be simpler to characterize than the full wavefunction.
Our multi-walker DMC algorithm is implemented in a fully parallel manner, in which each walker independently simulates the stochastic dynamics of electrons (Fig. 1c). The three key steps in our DMC algorithm are diffusion, branching, and merging (Fig. 1d), and they ensure that the equilibrium is reached for each walker after simulation in terms of the probability distribution of different electronic configurations. The diffusion step changes the configuration of electrons from one to another, while the cross-node movement is forbidden. Branching and merging control the total population of walkers during the simulation. In this work, we have implemented a GPU and neural network friendly DMC algorithm, which can be easily scaled out to multiple computing nodes. The runtime for one step in FermiNet-DMC is almost identical to that in FermiNet-VMC. Therefore, to compare the efficiency or total runtime between FermiNet-DMC and FermiNet-VMC, we only need to compare the number of steps in those processes. More methodological and technical details are provided in the “Methods” section and the Supplementary Notes 1–5.
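The three steps named above can be made concrete with a minimal, CPU-only sketch on a toy problem. Everything in it is an assumption made for illustration rather than the FermiNet-DMC implementation: a single particle in a one-dimensional harmonic well, a trial function \(x\,{e}^{-A{x}^{2}/2}\) with its node at x = 0, the weight thresholds 0.5 and 2.0, and the simple population-control rule. Because the node of the toy trial function is exact, the projected energy should approach 1.5 even though the trial function itself is not exact.

```python
import math
import random

A = 0.6  # hypothetical trial exponent; A = 1.0 would make psi_T exact


def drift(x):
    """d/dx ln|psi_T| for the toy trial function psi_T(x) = x * exp(-A x^2 / 2)."""
    return 1.0 / x - A * x


def local_energy(x):
    """(H psi_T) / psi_T for H = -1/2 d^2/dx^2 + x^2 / 2."""
    return 1.5 * A + 0.5 * (1.0 - A * A) * x * x


def dmc_step(walkers, e_ref, tau, rng):
    """Advance all (position, weight) walkers by one time step tau."""
    moved = []
    for x, w in walkers:
        # Diffusion: drifted Gaussian move; a move that crosses the node
        # at x = 0 is rejected so the walker stays in its nodal pocket.
        x_new = x + tau * drift(x) + math.sqrt(tau) * rng.gauss(0.0, 1.0)
        if x_new <= 0.0:
            x_new = x
        e_avg = 0.5 * (local_energy(x) + local_energy(x_new))
        moved.append((x_new, w * math.exp(-tau * (e_avg - e_ref))))

    kept, light = [], []
    for x, w in moved:
        if w > 2.0:    # branching: split a heavy walker into two halves
            kept += [(x, 0.5 * w), (x, 0.5 * w)]
        elif w < 0.5:  # light walkers are candidates for merging
            light.append((x, w))
        else:
            kept.append((x, w))
    # Merging: combine light walkers pairwise, keeping one position with
    # probability proportional to its weight and summing the two weights.
    while len(light) >= 2:
        (x1, w1), (x2, w2) = light.pop(), light.pop()
        x_keep = x1 if rng.random() < w1 / (w1 + w2) else x2
        kept.append((x_keep, w1 + w2))
    return kept + light


def run_dmc(n_walkers=100, n_steps=3000, n_burn=1000, tau=0.01, seed=0):
    """Return the weighted mixed-estimator energy and the final walkers."""
    rng = random.Random(seed)
    walkers = [(rng.uniform(0.5, 1.5), 1.0) for _ in range(n_walkers)]
    e_ref = sum(local_energy(x) for x, _ in walkers) / n_walkers
    num = den = 0.0
    for step in range(n_steps):
        walkers = dmc_step(walkers, e_ref, tau, rng)
        total_w = sum(w for _, w in walkers)
        e_mix = sum(w * local_energy(x) for x, w in walkers) / total_w
        # Population control: steer the total weight back towards n_walkers.
        e_ref = e_mix - math.log(total_w / n_walkers)
        if step >= n_burn:
            num += total_w * e_mix
            den += total_w
    return num / den, walkers
```

Calling `run_dmc()` returns the energy estimate together with the final walker list. In this toy setting the variational energy of the trial function is 1.7, so the gap to the projected value illustrates how DMC can go below the ansatz limit when the node is good.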
Single atoms
Neural network models are faced with a trade-off between model expressiveness and computational intensiveness. For powerful models like FermiNet, it may take hundreds of thousands of iterations to converge in the training process even for small benchmark systems with just a few electrons. Figure 2 shows calculations for single atoms with a shallow and narrow FermiNet ansatz with only 2 layers, each with a rather small number of neurons (see Supplementary Table S3 for details). The network is deliberately restricted so that we can study FermiNet’s performance when it is not expressive enough for the considered systems. This situation is of practical importance especially when we are interested in applying neural-network-based QMC methods to large systems of one hundred electrons or more. As shown in Fig. 2a, a common pattern of FermiNet’s training progress is that the energy curve drops to a fairly low level in a short amount of time and then slowly converges to its limit. Figure 2a is a calculation on the Be atom with the mentioned small network, and after 5 × 10^{5} steps of training, which ensures complete convergence, the systematic error still cannot be reduced to within chemical accuracy. In addition, the computational cost could scale up quickly for larger systems even on the most advanced modern computation platforms such as NVIDIA’s Tesla A100 GPU. This issue prevents accurate calculations for more than 30 correlated electrons^{20,21}.
The combination of the FermiNet neural network wavefunction ansatz and DMC achieves a substantial improvement in both accuracy and efficiency. For the Be atom and the same simple neural network, the FermiNet-DMC energy drops to within 1 mHa of the reference value of the total energy. The DMC data are obtained with 10^{5} steps of simulation, and the variance of DMC is also significantly reduced. It is also encouraging to see that even when we start from the trial wavefunction after 10^{4} steps of training, the DMC energy obtained subsequently is also converged to within 1 mHa of the exact value. At the 10^{5} step, when the training has not yet completely converged, the DMC energy is already consistent with the result obtained at the 5 × 10^{5} step. The good performance of DMC based on undertrained trial wavefunctions suggests that the nodal structure is well characterized before the wavefunction is fully trained in the neural network ansatz. In Fig. 2b–d, we present three-dimensional cuts of the full 11-dimensional (11D) nodal structure of the FermiNet wavefunction at the initial, 10^{4}, and 10^{5} steps. The nodal structure at the 10^{5} step is very well converged to the correct one obtained from CI calculations^{28}, and the nodal structure at the 10^{4} step is qualitatively the same, explaining the high accuracy obtained subsequently using DMC. For comparison, the nodal structure of the initial wavefunction is also shown. Because only the nodal structure determines the accuracy of DMC, the training process of neural network functions can be significantly shortened. Overall, to reach chemical accuracy for the Be atom, the cost of FermiNet-DMC is only a fraction of the cost of FermiNet-VMC.
Figure 2e further shows the energy of FermiNet-based VMC and DMC for different atoms in order of the number of electrons under the same 2-layer network. We try different learning rates and train for enough iterations (10^{6} for S, Cl, Ar and 5 × 10^{5} for the other atoms) to ensure that we make full use of the expressive power of the network. As expected, the error of VMC increases when the number of electrons in the system increases and the complexity of the system gradually exceeds the expressive limits of the neural network. With DMC the errors are reduced by more than half. The dashed lines are linear fits of the VMC and DMC energies. The deviation of the data points from the fitted lines indicates that there is a correlation between the DMC and the VMC energy: when the VMC energy is comparatively better, the DMC error is also smaller. The linear rise of the DMC error shows that the training of the nodal structure also becomes increasingly difficult when the system size increases, and the correlation between the VMC and the DMC error indicates that the information of the nodal structure is closely entangled with the full wavefunction. Note that we use a 2-layer network here in order to examine the behavior of FermiNet VMC and DMC in the regime where the network ansatz is relatively restricted for the considered systems, while FermiNet-VMC can be made more expressive, with more layers and neurons, to achieve high accuracy for those atoms, as shown in Pfau et al.^{20}.
Moreover, the improvement of DMC suggests that it may take a smaller and hence more efficient network to represent the nodal surface, without affecting the DMC accuracy. In Fig. 2f, we present a set of such tests on the Ne atom, where the complexity of the neural network is labeled as (L,D,W) to indicate the number of layers, the number of determinants, and the width of each layer in the network, respectively. Overall, when the expressiveness of the network is reduced, both VMC and DMC are affected in terms of their accuracy. Therefore, all the calculations suggest that the VMC energy is a good indicator of not only how well the wavefunction is optimized but also the quality of its nodal structure. The same behavior is expected for other neural network wavefunction ansatzes. Combined with the typical first-steep-then-flat optimization curve in neural networks, we can automate the switching-on of DMC and minimize the total cost of calculations at a targeted accuracy.
Building upon the successful treatment of atoms with FermiNet-DMC, we now extend the approach to larger molecules.
Nitrogen molecule
The first example is the dissociation curve of the N_{2} molecule. At equilibrium N_{2} forms a strong triple covalent bond at 2.1 a.u., and the dissociation is accompanied by a severe bond breaking process, which is strongly correlated in nature. Therefore, the dissociation curve of N_{2} is often used to benchmark how well electronic structure methods describe strong correlation. In DMC, this is also highly relevant because the nodal structure is directly affected by electron correlation. Figure 3a plots the relative energy of N_{2} with respect to the experimental reference^{29} as a function of the bond length. The results from FermiNet-VMC and r12-MR-ACPF, a state-of-the-art traditional multi-reference approach^{30}, are also shown. We can see that our DMC calculations are consistently better than those references, with an error of less than 1 mHa in a wide range of bond lengths. The largest error comes, not surprisingly, around the dissociation point near 4 a.u., and yet the error is only 3 mHa. In fact, our results can be considered the most accurate ab initio N_{2} dissociation curve reported so far. It is worth noting that the FermiNet-VMC results here are already remarkably accurate, with a deviation from the experimental curve within 2 mHa near equilibrium and 4 mHa in the dissociation region. Yet our FermiNet-DMC results improve on them by about 1 mHa on average. For comparison, the CCSD(T) calculation (not plotted), which is known as the “gold standard” in quantum chemistry, has an error of 25 mHa around 4 a.u.^{20}. In terms of relative energy, the non-parallelity error (NPE) of FermiNet-DMC (3.28 mHa) is only slightly better than that of FermiNet-VMC (3.53 mHa), consistent with the mild improvement on small systems reported in Wilson et al.^{31}, and both are comparable to the state-of-the-art r12-MR-ACPF result (2.14 mHa).
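For reference, the NPE quoted above is conventionally taken as the spread of the signed error along the curve, that is, the largest deviation from the reference minus the smallest. The sketch below assumes that conventional definition; the function name and the sample numbers in the usage note are hypothetical and are not data from this work.

```python
def non_parallelity_error(computed, reference):
    """Spread of the signed error (computed - reference) along a curve.

    Both inputs are energies at the same sequence of bond lengths and in
    the same unit; the result is max(error) - min(error).
    """
    if len(computed) != len(reference) or not computed:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [c - r for c, r in zip(computed, reference)]
    return max(errors) - min(errors)
```

For instance, signed errors of 1.0, 3.0 and 0.5 mHa at three bond lengths give an NPE of 2.5 mHa. A constant offset between the computed and the reference curve leaves the NPE at zero, which is why it measures the quality of relative rather than absolute energies.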
The remaining error source of DMC is the nodal structure error produced in the training of the neural network using VMC, which is fully reflected in the shape of the FermiNet VMC and DMC curves. The results of FermiNet-DMC are within 1 mHa of the experimental fitting curve outside the dissociation region and cannot go any lower due to the variational property. So, when combined with a more expressive or better trained neural network that can handle the dissociation region, it is very likely that the full dissociation curve of N_{2} can be reproduced by DMC within an error of 1 mHa, meaning that DMC can also solve strongly correlated systems within chemical accuracy.
Cyclobutadiene
A similar example is the structural transition of cyclobutadiene, which is also well known for its multi-referential nature. The neural-network-based VMC models^{21,22} have already shown promising results on cyclobutadiene. FermiNet-DMC can handle this system with higher accuracy and reduced computational cost.
In our experiments, the VMC process takes around 3 × 10^{5} steps to converge, though the converged result is still around 7 mHa higher than the reported value in Spencer et al.^{21}, which converges in 2 × 10^{5} steps. This is probably because we use different training hyperparameters, or simply because our optimization process gets trapped in a bad local minimum. However, our final DMC result is around 4 mHa lower than the reference data^{21}. This demonstrates the effectiveness of our DMC implementation as a seamless extension to VMC. Namely, even if the optimization in VMC does not work well, the following DMC process can still bring the energy calculation to a highly accurate level. This is especially important for neural-network-based VMC, because its optimization is significantly trickier to tune and requires a longer time to completely converge, compared to conventional VMC. Here, the DMC finite timestep error is negligible, as illustrated in Supplementary Fig. 1, which guarantees the variational property of our FermiNet-DMC results.
With 10^{5} VMC and 10^{5} DMC steps, the FermiNet-DMC energy is 2 mHa lower than the reference data in Spencer et al.^{21}, produced from a training phase with 2 × 10^{5} VMC steps. Note that in this case our total number of QMC steps is still slightly smaller than that in ref. ^{21} due to the required inference phase in FermiNet-VMC. Therefore, FermiNet-DMC should be preferred for its lower variational energy at the same or lower computational cost.
The automerization energy difference of cyclobutadiene is shown in the inset panel of Fig. 3c. Neural-network-based VMC gives an accurate automerization energy difference of cyclobutadiene^{21,22}, consistent with the high end of the experimental data. The results of FermiNet-DMC are also in the same region. See Supplementary Note 9 for more details, including the training curve for the transition configuration and the DMC energy data for both equilibrium and transition configurations.
Water dimer
In addition to strong covalent bonding, where static correlation is more essential, molecular systems with weaker hydrogen bonding and noncovalent interactions can also be challenging because of dynamic correlations. To this end, we have carried out FermiNet-DMC calculations on the 10 Smith stationary points of the water dimer^{32}. The 10 structures, as illustrated in Fig. 3d and Supplementary Fig. 4, have different hydrogen bonding configurations and their relative energies are used to benchmark the performance of electronic structure methods and force field models on hydrogen bonding systems^{33}. With 10 total energy results (plotted in Supplementary Fig. 5) and 9 relative energy results (plotted in Supplementary Fig. 6), we can thus credibly investigate the error cancellation performance of FermiNet-VMC and FermiNet-DMC. We compare the energy results of FermiNet-DMC with an undertrained network and a well-trained network as trial wavefunctions, respectively. The undertrained network is trained by VMC for 10^{5} steps, while the well-trained network is trained by VMC for 3 × 10^{5} steps. CCSD(T) results^{34} are displayed as benchmarks because of their high accuracy for this type of system.
As shown in Fig. 3d, the undertrained FermiNet-VMC performs badly on SP3 and SP5, and so does the well-trained FermiNet-VMC on SP4, though some of the FermiNet-VMC results are quite close to the benchmark results (e.g., SP7 and SP8 in Supplementary Fig. 6). On the other hand, FermiNet-DMC performs consistently well no matter which network is used as the trial wavefunction, undertrained or well-trained. The mean absolute deviations from the benchmark CCSD(T) results are also given in Fig. 3d, from which we can clearly see the improvement of FermiNet-DMC in relative energy calculations. For comparison, we also show DMC results with the traditional Slater-Jastrow wavefunction ansatz^{35}, whose accuracy is at the same level as FermiNet-DMC, as the difference is negligible compared to the statistical error. The inferior performance of FermiNet-VMC may be due to the different degree of convergence in different systems, while FermiNet-DMC provides a more efficient and practical solution than fully converged FermiNet-VMC.
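The bookkeeping behind this comparison (10 total energies giving 9 relative energies, summarized by a mean absolute deviation from the benchmark) can be sketched as follows. It is assumed here, for illustration only, that the relative energies are taken with respect to the first stationary point; the function names are hypothetical.

```python
def relative_energies(total_energies, ref_index=0):
    """Energies of all structures relative to the one at ref_index.

    The reference structure itself is dropped, so n total energies
    yield n - 1 relative energies.
    """
    e_ref = total_energies[ref_index]
    return [e - e_ref for i, e in enumerate(total_energies) if i != ref_index]


def mean_absolute_deviation(values, benchmark):
    """Average of |value - benchmark| over paired entries."""
    if len(values) != len(benchmark) or not values:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(v - b) for v, b in zip(values, benchmark)) / len(values)
```

Any systematic error shared by all 10 total energies cancels in `relative_energies`, so the mean absolute deviation only reflects how uniformly a method treats the different hydrogen bonding configurations.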
Benzene
To further illustrate the power of our approach, we have examined the benzene molecule and a benzene dimer. Benzene is one of the most fundamental organic molecules, with a hexagonal ring of C–H units (Fig. 4a). There have been challenges in understanding its electronic configuration and bonding order and in obtaining its ground state energy. To understand the electronic structure of the benzene molecule, we performed FermiNet-based VMC and DMC simulations with 3-layer and 4-layer networks separately. Our best FermiNet-DMC result, calculated with the 4-layer network, coincides with the CCSD(T) result extrapolated to the complete-basis-set (CBS) limit. The comparison is shown in Fig. 4d. The CCSD(T) calculation is carried out with Psi4^{36} and the CBS result is extrapolated using the cc-pCVXZ (X=3,4,5) basis sets, which are much larger than the ones reported in Johnson III^{37} and those used by others with state-of-the-art electronic structure methods in Eriksen et al.^{38}. The energy from our CCSD(T)/CBS calculation is also much lower than those references. See Supplementary Note 14 for more details on the CCSD(T) calculation and CBS extrapolation.
The 3-layer FermiNet here is much smaller than the 4-layer one. Besides being one layer shallower, it also has significantly fewer neurons in each layer. See Supplementary Tables 8–10 for the related hyperparameters. Figure 4d shows that the 3-layer FermiNet-DMC energy is lower than the 4-layer VMC result by around 10 mHa, which demonstrates one of the main benefits of FermiNet-DMC: it can achieve better accuracy with a smaller network. This is especially important when we are dealing with large systems.
In our calculations, FermiNet-DMC is able to achieve lower variational energy results with an order of magnitude better efficiency. With a total of 4 × 10^{5} QMC steps (2 × 10^{5} VMC training steps and 2 × 10^{5} DMC steps), the 3-layer FermiNet-DMC energy (–232.225 Ha) is slightly better than the 4-layer FermiNet-VMC energy (–232.223 Ha) at 10^{6} VMC training steps. Moreover, the runtime of a single VMC step for the 4-layer network is approximately 4 times that of a single VMC or DMC step for the 3-layer network under the same computational resources. Therefore, in this case, the 3-layer FermiNet-DMC can achieve a better energy result at only a tenth of the total computational cost of the 4-layer FermiNet-VMC. Similarly, compared to the 3-layer FermiNet-VMC at 2 × 10^{6} VMC training steps, the 3-layer FermiNet-DMC with 4 × 10^{5} QMC steps can achieve an energy result more than 10 mHa better at only a fifth of the total computational cost.
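The two cost ratios quoted above follow directly from the step counts and per-step runtimes given in the text. The short calculation below spells this out, taking one 3-layer VMC or DMC step as the unit of cost; the variable names are illustrative only.

```python
# Per-step cost in units of one 3-layer VMC or DMC step (the text states that
# VMC and DMC steps cost about the same, and that a 4-layer step is ~4x).
COST_3_LAYER_STEP = 1.0
COST_4_LAYER_STEP = 4.0

# Step counts quoted in the text.
steps_3_layer_dmc = 2e5 + 2e5   # VMC training steps + DMC steps
steps_4_layer_vmc = 1e6
steps_3_layer_vmc = 2e6

cost_3_layer_dmc = steps_3_layer_dmc * COST_3_LAYER_STEP
cost_4_layer_vmc = steps_4_layer_vmc * COST_4_LAYER_STEP
cost_3_layer_vmc = steps_3_layer_vmc * COST_3_LAYER_STEP

ratio_vs_4_layer_vmc = cost_3_layer_dmc / cost_4_layer_vmc  # one tenth
ratio_vs_3_layer_vmc = cost_3_layer_dmc / cost_3_layer_vmc  # one fifth
```

This estimate ignores any fixed overhead such as pretraining or the inference phase, so it should be read as an approximate comparison of training and projection steps only.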
Furthermore, the energy difference between the FermiNet-DMC results in Fig. 4d is only around 3 to 4 mHa, suggesting that the nodal structures of the two trial wavefunctions are close. To confirm this statement, we visualized two-dimensional slices of those trial wavefunctions in Fig. 4b, c. The slices are generated by moving a single spin-up electron inside a two-dimensional box while fixing all other electrons at representative positions suggested by Liu et al.^{39} and illustrated in Fig. 4a. See section “Nodal structure and wavefunction visualization” and Supplementary Note 11 for more visualization details. Comparing Fig. 4b (4-layer FermiNet VMC) and Fig. 4c (3-layer FermiNet), we find that the nodes, represented by the dark pixels, do share the same pattern. Moreover, the parts of the nodal surface in lighter areas, namely those with larger wavefunction values, are very close to each other in Fig. 4b, c, and they are the most important parts of the nodal surface in the DMC process since walkers are more likely to visit their neighborhood. The closeness of those parts is consistent with the fact that the FermiNet-DMC energies are close.
To track how the nodal surface evolves along the training process, we propose a divergence D(S, T) measuring the difference between two nodal surfaces S and T. The definition and algorithmic details are described in section “Divergence measuring nodal surface difference” and Supplementary Note 15, and the definition is also related to the intuition mentioned above that nodes in the neighborhood with larger wavefunction values are more important in the QMC calculation. For the 3-layer FermiNet, we calculated
\[D({S}_{{\rm{final}}},{S}_{k}),\]
where S_{final} and S_{k} are the nodal surfaces corresponding to the final VMC training step and the intermediate training step k, respectively. The result is shown in Fig. 4e together with the VMC and DMC energies, where the trend of the divergence correlates well with the energies. As a matter of fact, there is a linear relation between the divergence and the DMC energy, as shown in Fig. 4f, indicating that the proposed divergence successfully captures the essential information of the difference between nodal surfaces. Here the divergence converges to around 0.005 instead of 0 because of the large learning rate used when training the 3-layer FermiNet for benzene.
We have also trained a neural network for a benzene dimer, which is a prototypical system to further test noncovalent interactions. The benzene dimer, which has 84 electrons in total, is a much larger system than the ones considered in previous neural-network-based VMC works^{19,20,21,22,23,24,25,26}. We elaborate on the challenges and tricks of dealing with large systems using FermiNet-based QMC methods in Supplementary Note 5. We consider a T-shaped structure with an edge-to-face arrangement, as illustrated in Fig. 5a, specifically the equilibrium configuration with a center-to-center distance of 4.95 Å^{40}. Figure 5a also shows the VMC and DMC energies as functions of the VMC training step, which are both over 200 mHa lower than the CCSD(T) result with the cc-pCVTZ basis. The converged FermiNet-DMC energy is over 50 mHa lower than both the FermiNet-VMC result and the CCSD(T) result with the cc-pCVQZ basis. This echoes the statements made in the sections above that FermiNet-DMC can achieve significantly higher accuracy for larger systems or in cases where the neural network ansatz is not powerful enough to characterize the ground state wavefunction well. For comparison, the FermiNet-VMC energy has not fully converged even after four million training steps. Schätzle et al.^{41} show that neural-network-based VMC, in particular PauliNet, can achieve variational energies at the fixed-node limit in certain circumstances, while in our calculations one can clearly see that this is not the case for FermiNet, especially when its expressive power is limited compared to the size of the system. On the other hand, our DMC result is 15 mHa higher than the CCSD(T)/CBS result. Note that CCSD(T) is not a variational method, hence the relatively lower CCSD(T)/CBS result may indicate similar accuracy compared to our DMC result. To achieve more accurate FermiNet-DMC results, we can use a better neural network trial wavefunction with a larger network or a better network architecture.
In addition to the total energy at the equilibrium configuration, the binding energy is also of great interest when studying a benzene dimer^{40,42,43,44,45}, and classical methods, such as CCSD(T) and MP2, can produce results that agree well with experimental data. However, for neural-network-based QMC, the binding energy calculation is more subtle and challenging due to the lack of systematic error cancellation. Using the same network structure to handle both monomer and dimer would introduce additional size-inconsistency-like bias because the expressiveness limitation is more severe for the benzene dimer than for the monomer. For the benzene dimer, we find that such an estimate would predict severe underbinding with both VMC and DMC. Another way to estimate the binding energy is to take the difference between a separated dimer configuration (10 Å)^{43} and the equilibrium configuration, shown in Fig. 5b, which turns out to be systematically overbinding. With an empirical linear relation between VMC and DMC energies in the training process, we developed a simple VMC-DMC hybrid extrapolation scheme, which leads to an accurate estimate of the binding energy, in good agreement with the experimental measurements^{46}, also displayed in Fig. 5b. We elaborate on this extrapolation scheme in section “Linear relation between VMC-DMC energy”. In order to systematically improve the binding energy calculation, the most straightforward way is to adopt a better neural network ansatz as the trial wavefunction for better accuracy. Adding a regularization mechanism to the optimization process is another possible option, so that the model variance can be reduced for better error cancellation. Note that in the case of DMC with pseudopotentials, the binding energy calculation can also be improved with a certain deterministic approximation^{47}. We leave applying those ideas to improve the binding energy calculation as a future study.
Linear relation between VMC-DMC energy
Quite consistently, we find a linear relation between VMC and DMC energies in our calculations. We have encountered two types of linear relation. One concerns the intermediate energies calculated along the training process for a given system, while the other concerns the converged energies from different systems. We take advantage of the first type of linearity and develop a simple but effective extrapolation scheme accordingly.
We find that, for molecular systems such as cyclobutadiene, the benzene monomer and the benzene dimer, there is a linear trend between the VMC and DMC energies calculated at different steps along the VMC training process. Equivalently, there is a linear relation between the quantities
\[{E}_{{\rm{VMC}}}^{(k)}-{E}_{{\rm{final}}}\quad {\rm{and}}\quad {E}_{{\rm{DMC}}}^{(k)}-{E}_{{\rm{final}}},\]
where \({E}_{{\rm{VMC}}}^{(k)}\) and \({E}_{{\rm{DMC}}}^{(k)}\) represent the VMC and DMC energy calculated at VMC training step k, and E_{final} is the DMC energy at the final VMC training step, namely a constant for one training process. Such relation is shown in Fig. 5c. Based on this empirical linear relation, we propose an extrapolation scheme
where E_{ex} is the extrapolated energy, and w and b are two parameters to be determined. Here the slope w can be fitted using \({E}_{{\rm{VMC}}}^{(k)}\) and \({E}_{{\rm{DMC}}}^{(k)}\) along the training process, but the intercept b cannot be inferred from those data. Therefore, it is difficult to use this scheme to extrapolate the absolute energy unless we have extra information on the intercept b. On the other hand, when calculating a relative energy, we may simply assume that the intercept b is the same for different configurations so that it cancels out in the calculation. Namely for relative energy, we have
Note that the calculation of relative energy is especially troublesome for neural-network-based QMC methods, due to the strong dependence on the number of training steps and the long convergence period. See Supplementary Fig. 8b for how the binding energies calculated with FermiNet VMC and DMC change along the optimization process. With our scheme, the binding energy results calculated from different VMC training steps would be the same, up to the fitting error of the linear relation, which means we can circumvent the dependence of the binding energy result on the number of training steps. In practice, the extrapolated binding energies form a well-concentrated distribution, and averaging over different VMC training steps reduces the linear fitting error and provides an accurate estimate. Moreover, it also suggests that we can calculate the extrapolated binding energy with data collected in the early phase of the training process, avoiding the long convergence period of VMC optimization.
Applying this scheme to the binding energy calculation of the benzene dimer, the result is significantly improved, and the distribution fitted from the energy differences at different VMC training steps is concentrated around the experimental range, as shown in Fig. 5b. The extrapolated binding energy estimated by averaging the energy differences is 3.60 mHa, within the experimental range. See Supplementary Note 13 for more extrapolation-related details for the benzene dimer.
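To make the cancellation of the intercept concrete, the sketch below works through one functional form that is consistent with the description above. The form itself is an assumption made here for illustration, not a statement of the scheme as used in this work: it takes \({E}_{{\rm{DMC}}}^{(k)}-{E}_{{\rm{ex}}}=w({E}_{{\rm{VMC}}}^{(k)}-{E}_{{\rm{ex}}})+b\), so that \({E}_{{\rm{ex}}}=({E}_{{\rm{DMC}}}^{(k)}-w{E}_{{\rm{VMC}}}^{(k)}-b)/(1-w)\), and for two configurations sharing the same w and b the difference becomes \(({{\Delta }}{E}_{{\rm{DMC}}}^{(k)}-w{{\Delta }}{E}_{{\rm{VMC}}}^{(k)})/(1-w)\). The function names, and the choice of averaging the two fitted slopes, are likewise hypothetical.

```python
def fit_slope(e_vmc, e_dmc):
    """Least-squares slope of DMC energy against VMC energy.

    Subtracting a constant such as E_final from both series shifts the
    data but leaves this slope unchanged.
    """
    n = len(e_vmc)
    mean_x = sum(e_vmc) / n
    mean_y = sum(e_dmc) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(e_vmc, e_dmc))
    sxx = sum((x - mean_x) ** 2 for x in e_vmc)
    return sxy / sxx


def extrapolated_difference(vmc_a, dmc_a, vmc_b, dmc_b):
    """Average extrapolated energy difference (A minus B) over training steps.

    ASSUMPTION: both configurations obey
        E_DMC - E_ex = w * (E_VMC - E_ex) + b
    with a common slope w and a common intercept b, so that b cancels.
    Entry k of each list is the energy at the same VMC training step.
    """
    w = 0.5 * (fit_slope(vmc_a, dmc_a) + fit_slope(vmc_b, dmc_b))
    diffs = [
        ((da - db) - w * (va - vb)) / (1.0 - w)
        for va, da, vb, db in zip(vmc_a, dmc_a, vmc_b, dmc_b)
    ]
    return sum(diffs) / len(diffs), w
```

On synthetic data generated exactly from the assumed relation, every training step yields the same difference, which mirrors the statement above that the extrapolated result no longer depends on the number of training steps. With real data the per-step values scatter, and the average serves as the estimate.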
We have discussed the relation between VMC and DMC energies for elements in the second and third rows in section “Single atoms”. For each atom, we have a reference energy E_{exact} to be compared with the converged VMC energy (E_{VMC}) and DMC energy (E_{DMC}). As shown in Fig. 2e, both E_{DMC} − E_{exact} and E_{VMC} − E_{exact} grow linearly as the atomic number increases, though the slope changes when switching from the second-row elements to the third row. However, if we instead compare
then we have a single linear relation across all elements on both second and third rows, as shown in Fig. 5d.
Interestingly, the slopes of the fitted lines in Fig. 5c and d are quite close. We leave further study of these two types of linearity as future work.
Discussion
FermiNet-DMC is able to achieve accurate ab initio calculations for various systems, obtaining the ground states of 16 atoms, N_{2} along the bonding curve, 2 cyclobutadiene configurations, 10 hydrogen-bonded water dimers, and the benzene monomer and dimer. These systems include bond-breaking structures where strong static correlation exists and weakly bonded dimers where dynamic correlation dominates, and FermiNet-DMC performs consistently well. FermiNet-DMC leverages the expressive power of neural networks to provide well-behaved trial wavefunctions. Neural network-based VMC has claimed success in small systems when the network can be sufficiently trained. However, it is not able to provide a satisfactory ground state wavefunction and energy when the expressiveness of the neural network is limited. Compared to VMC, the combination of neural networks with DMC provides a powerful solution, in the sense that it can achieve more accurate results with a simpler network and better efficiency. To reach the same accuracy level as FermiNet-VMC, the efficiency improvement of FermiNet-DMC can be up to 1 or 2 orders of magnitude in the large systems tested, which can become increasingly important when dealing with even larger molecules.
There is an interesting linear relation between VMC and DMC energies observed during the training process as well as across different systems. We develop an extrapolation scheme accordingly, which greatly improves the accuracy of relative energy calculations, as shown in the benzene dimer case, and overcomes the issue that the calculated relative energy depends strongly on the number of training steps in the QMC process. We also design a divergence measuring the difference between the nodal surfaces of two wavefunctions, which correlates well with the corresponding DMC energies in numerical experiments. In other words, the proposed divergence captures the essence of nodal surface differences.
It is worth pointing out that an idea similar to this work was proposed in a preprint by Wilson et al., who performed preliminary tests on the second-row elements^{31}. However, only minor improvements in accuracy were observed, accompanied by the increased cost of DMC, since the FermiNet used there was powerful enough to achieve high accuracy for the small systems tested, leaving little room for further improvement. By comparison, our approach, being more sophisticated and efficient, achieves a significant accuracy boost when dealing with more challenging molecular systems, which FermiNet alone cannot handle well. We have also shown that even for small systems, FermiNet-DMC should still be preferred, because it can achieve comparable or even better accuracy with a smaller network and far fewer computational resources than FermiNet-VMC. Our work, therefore, removes the concerns about going from VMC to DMC with a neural network wavefunction ansatz. Moreover, the DMC method can be further integrated with other powerful molecular neural networks^{22,25}, periodic neural networks for solids^{48}, and neural networks with effective core potentials^{49}, which has the potential to catalyze a paradigm shift in the application of stochastic electronic structure methods.
Methods
Basic theory
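To study a many-body system from first principles, we always consider solving the well-known Schrödinger equation for electrons and nuclei. When we work in the Born-Oppenheimer approximation^{50} and further consider a fixed set of nuclear positions, the problem is simplified to solving for the ground state many-electron wavefunction of the electronic Schrödinger equation (in atomic units),
\[ \hat{H}\,\psi(x_{1},\ldots,x_{N}) = E\,\psi(x_{1},\ldots,x_{N}), \qquad \hat{H} = -\frac{1}{2}\sum_{i}\nabla_{i}^{2} + \sum_{i>j}\frac{1}{|r_{i}-r_{j}|} - \sum_{i,I}\frac{Z_{I}}{|r_{i}-R_{I}|} + \sum_{I>J}\frac{Z_{I}Z_{J}}{|R_{I}-R_{J}|}, \]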
where x_{i} = (r_{i}, σ_{i}) denotes the spatial and spin coordinates of electron i, and R_{I} and Z_{I} denote, respectively, the spatial coordinates and the charge of nucleus I. The wavefunction of electrons obeys Fermi-Dirac statistics and thus must be antisymmetric with respect to the interchange of both the spatial coordinates and the spins of any two electrons, namely the following equality should hold: ψ( ⋯ , x_{i}, ⋯ , x_{j}, ⋯ ) = − ψ( ⋯ , x_{j}, ⋯ , x_{i}, ⋯ ).
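Unlike most methods that use the variational principle to approach the ground state wavefunction, DMC is a stochastic projection method. A given antisymmetric wavefunction ψ_{T} can always be represented as a linear combination of a set of eigenfunctions ψ_{k} of the corresponding Hamiltonian operator,
\[ \psi_{T} = \sum_{k} c_{k}\,\psi_{k}. \]
When an imaginary-time evolution operator acts on ψ_{T},
\[ e^{-\tau(\hat{H}-E_{T})}\,\psi_{T} = \sum_{k} c_{k}\,e^{-\tau(E_{k}-E_{T})}\,\psi_{k}, \]
where E_{T} is the trial energy as an offset, each expansion term acquires a decay factor whose rate is set by the state energy E_{k} relative to E_{T}. After a long enough imaginary-time evolution, ψ_{T} approaches the ground state ψ_{0}, whereas the contributions from all other eigenfunctions vanish. If we define a time-dependent wavefunction \(\phi(X,\tau) = e^{-\tau(\hat{H}-E_{T})}\psi_{T}(X)\), where X collects all electron coordinates, it satisfies the imaginary-time Schrödinger equation:
\[ -\frac{\partial \phi(X,\tau)}{\partial \tau} = (\hat{H}-E_{T})\,\phi(X,\tau). \]
Without the potential energy terms and the constant offset E_{T}, it resembles a standard diffusion equation,
\[ \frac{\partial \phi(X,\tau)}{\partial \tau} = \frac{1}{2}\sum_{i}\nabla_{i}^{2}\,\phi(X,\tau). \]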
The diffusion equation is the master equation of a stochastic process, hence we can solve the diffusion equation of the wavefunction by simulating that stochastic process^{51}. With the potential terms, additional processes, such as reweighting or branching of the walkers, have to be combined with the diffusion process in the simulation (see, e.g., refs. ^{16,52,53} for more details).
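The projection idea described in this section can be checked numerically on a small matrix. The sketch below is an illustration only: the 4×4 symmetric matrix stands in for a Hamiltonian, and the function name `imaginary_time_projection` is hypothetical. It applies the imaginary-time evolution operator exactly through the eigen-decomposition, whereas DMC samples it stochastically.

```python
import numpy as np

def imaginary_time_projection(hamiltonian, psi_trial, tau, e_trial):
    """Apply exp(-tau * (H - E_T)) to psi_trial via the eigen-decomposition of H."""
    energies, states = np.linalg.eigh(hamiltonian)
    coeffs = states.T @ psi_trial                      # expansion coefficients c_k
    decayed = coeffs * np.exp(-tau * (energies - e_trial))
    psi = states @ decayed
    return psi / np.linalg.norm(psi)                   # normalise for comparison

# Hypothetical 4x4 symmetric matrix used purely for illustration.
H = np.array([[0.0, 0.3, 0.0, 0.0],
              [0.3, 1.0, 0.3, 0.0],
              [0.0, 0.3, 2.0, 0.3],
              [0.0, 0.0, 0.3, 3.0]])
energies, states = np.linalg.eigh(H)
ground = states[:, 0]
psi_trial = np.ones(4) / 2.0        # trial vector with nonzero ground-state overlap

overlaps = []
for tau in (0.0, 1.0, 5.0, 20.0):
    psi = imaginary_time_projection(H, psi_trial, tau, e_trial=energies[0])
    overlaps.append(abs(ground @ psi))
    print(f"tau = {tau:5.1f}   |<psi_0|psi(tau)>| = {overlaps[-1]:.8f}")
```

The overlap with the lowest eigenvector increases with τ, because every excited component is damped by a factor exp(−τ(E_k − E_0)).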
Trial wavefunction
In this work, we use the FermiNet neural network ansatz as our trial wavefunction. Due to the huge number of parameters, it is challenging to converge the training process of FermiNet unless the system is small enough. After many tests, we identified a common training pattern of FermiNet, which consists of two stages: a relatively short sharp-adjustment stage and a lengthy fine-tuning one. We propose to use the FermiNet wavefunction right after the sharp-adjustment stage as the trial wavefunction in DMC, which maximizes the efficiency of the entire simulation protocol. In this way, we can also achieve more accurate results than a better converged FermiNet model after the lengthy fine-tuning stage. Compared to the gain, the cost of performing DMC on the long-trained FermiNet is rather minor in most of the systems tested.
DMC implementation
We have developed GPU-friendly DMC software in JAX^{54}, which can be seamlessly integrated with FermiNet^{27}, developed in the same programming framework. Our DMC software can also be integrated with other trial wavefunctions implemented in JAX, and it has been open-sourced in order to accelerate further combination of QMC methods with neural networks. See Algorithm 1 for a brief workflow of one DMC iteration, on top of which various modifications are implemented, including those proposed by Umrigar et al. to reduce the time-step error^{52} and by Zen et al. to keep size consistency^{55}.
The branching and merging of random walkers change the total number of walkers, which causes efficiency issues for JAX programs and is also unfriendly to distributed computing, especially when load balancing is involved. We devised a new branching-merging strategy to overcome these issues. Whenever we need to branch a random walker because of its overly large weight, we also merge the two walkers with the smallest weights on the same computing node. No merging is executed if no branching happens. In this way, we keep the number of walkers on each computing node unchanged. We verified this strategy thoroughly in numerical tests and found that the introduced bias is negligible.
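A fixed-population branch/merge pass of this kind can be sketched as follows. The details are assumptions made for illustration and are not taken from the released software: the threshold `max_weight`, the half-weight split, and the merge rule (the survivor is chosen with probability proportional to weight and carries the summed weight) are a common textbook choice.

```python
import numpy as np

def branch_and_merge(positions, weights, max_weight, rng):
    """One branch/merge pass that keeps the number of walkers on a node fixed.

    Assumed rule: a walker heavier than `max_weight` is split into two copies of
    half weight each; the slot for the copy is freed by merging the two lightest
    other walkers into one that carries their summed weight.
    """
    positions = positions.copy()
    weights = weights.copy()
    for i in np.flatnonzero(weights > max_weight):
        others = np.delete(np.arange(len(weights)), i)
        a, b = others[np.argsort(weights[others])[:2]]   # two lightest, excluding i
        total = weights[a] + weights[b]
        # Merge into slot a: survivor position picked with probability ~ weight.
        keep = a if rng.random() < weights[a] / total else b
        positions[a] = positions[keep]
        weights[a] = total
        # Branch: slot b becomes a copy of walker i, each copy with half the weight.
        positions[b] = positions[i]
        weights[i] *= 0.5
        weights[b] = weights[i]
    return positions, weights

rng = np.random.default_rng(1)
pos = rng.normal(size=(8, 3))                      # 8 hypothetical walkers in 3D
w = np.array([1.0, 0.9, 2.6, 0.2, 1.1, 0.3, 1.0, 0.9])
new_pos, new_w = branch_and_merge(pos, w, max_weight=2.0, rng=rng)
print("walkers:", len(new_w), " total weight:", new_w.sum())
```

Because the array shapes never change, a pass like this avoids dynamic resizing, which is the property that matters for compiled array programs and for load balancing.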
The most time-consuming module in our DMC implementation is the calculation of the local energy. In our optimized program, the computational cost of each local energy estimation is almost the same as that of a VMC inference step of the original FermiNet. Therefore, the total cost is determined by the number of iterations performed in DMC and VMC.
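As a reminder of what one local energy evaluation involves, the following toy sketch computes E_L = (Ĥψ)/ψ for a single electron in the field of one nucleus, using the identity ∇²ψ/ψ = ∇² ln ψ + |∇ ln ψ|². It is an illustration under simplifying assumptions: derivatives are taken by central finite differences rather than automatic differentiation, and the names `local_energy`, `log_psi_exact` and `log_psi_approx` are hypothetical.

```python
import numpy as np

def local_energy(log_psi, r, nuclear_charge=1.0, h=1e-3):
    """E_L = -1/2 (lap ln psi + |grad ln psi|^2) - Z/|r| for one electron (atomic units)."""
    f0 = log_psi(r)
    grad = np.zeros(3)
    lap = 0.0
    for d in range(3):
        step = np.zeros(3)
        step[d] = h
        fp, fm = log_psi(r + step), log_psi(r - step)
        grad[d] = (fp - fm) / (2.0 * h)          # central first derivative
        lap += (fp - 2.0 * f0 + fm) / h**2       # central second derivative
    kinetic = -0.5 * (lap + grad @ grad)
    potential = -nuclear_charge / np.linalg.norm(r)
    return kinetic + potential

def log_psi_exact(r):
    return -np.linalg.norm(r)            # hydrogen 1s state, exact eigenfunction

def log_psi_approx(r):
    return -0.8 * np.linalg.norm(r)      # deliberately imperfect exponent

points = [np.array([0.7, -0.2, 0.4]), np.array([1.0, 1.0, 1.0])]
for p in points:
    print(local_energy(log_psi_exact, p), local_energy(log_psi_approx, p))
```

For the exact eigenfunction the local energy is the same at every point (−0.5 Ha here), while for the imperfect one it varies with position; this variation is what drives the statistical error of QMC estimates.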
Energy calculation
For FermiNet-VMC, we always perform a separate inference simulation for the energy estimate, where we fix all the parameters of FermiNet after training and run a number of Markov chain Monte Carlo (MCMC) steps to sample batches of random walkers accordingly. We calculate the average local energy for each batch, and use reblock analysis to determine the mean value of the set of averaged energies as well as the standard deviation. For FermiNet-DMC, we use the mixed estimator of the energy^{52}, treat the first 10% of MC steps as the equilibration phase, and only use the steps afterwards for energy production. See Supplementary Tables 6–12 for the hyperparameters of all our calculations. We also use reblock analysis to determine the mean of the averaged energy and its standard error. In our plots, error bars represent one standard error for energy estimates, unless otherwise specified.
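The reblock analysis mentioned above can be sketched generically as follows. This is a standard blocking procedure written for illustration, not the exact routine used in this work; the function name `reblock`, the `min_blocks` cutoff and the synthetic autocorrelated series are assumptions.

```python
import numpy as np

def reblock(series, min_blocks=16):
    """Return (block_size, mean, standard_error) for successive block-size doublings."""
    data = np.asarray(series, dtype=float)
    results = []
    block_size = 1
    while len(data) >= min_blocks:
        stderr = data.std(ddof=1) / np.sqrt(len(data))
        results.append((block_size, data.mean(), stderr))
        n = len(data) // 2 * 2                       # drop a trailing odd element
        data = 0.5 * (data[0:n:2] + data[1:n:2])     # average neighbouring pairs
        block_size *= 2
    return results

# Synthetic autocorrelated "energy" series (AR(1) process); not data from the paper.
rng = np.random.default_rng(0)
n_steps, phi = 40000, 0.9
noise = rng.normal(size=n_steps)
series = np.empty(n_steps)
series[0] = noise[0]
for t in range(1, n_steps):
    series[t] = phi * series[t - 1] + noise[t]
series += -0.5

production = series[n_steps // 10:]      # discard the first 10% as equilibration
results = reblock(production)
for block_size, mean, stderr in results:
    print(f"block size {block_size:5d}   mean {mean:+.4f}   std. err. {stderr:.4f}")
```

For correlated samples the naive standard error at block size 1 is too small; the estimate grows with block size and levels off once blocks are longer than the autocorrelation time, and the plateau value is the one to report.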
Algorithm 1
Simplified Diffusion Monte Carlo algorithm pseudocode.
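To make the structure of a DMC iteration concrete, the following toy sketch runs importance-sampled DMC for a one-dimensional harmonic oscillator with trial function ψ_T(x) = exp(−A x²). It is a simplified illustration and not Algorithm 1 itself: the ground state has no nodes, so the fixed-node constraint is not exercised, and the accept/reject step and the branching and merging of walkers are omitted. All names and parameter values are hypothetical.

```python
import numpy as np

A = 0.4  # exponent of the toy trial wavefunction psi_T(x) = exp(-A x^2); A = 0.5 is exact

def drift(x):
    return -2.0 * A * x                       # gradient of ln psi_T

def local_energy(x):
    return A + (0.5 - 2.0 * A**2) * x**2      # (H psi_T)/psi_T for H = -1/2 d2/dx2 + x^2/2

def dmc_iteration(x, weights, e_trial, tau, rng):
    """Drift-diffusion move followed by a weight update and a mixed energy estimate."""
    e_old = local_energy(x)
    x_new = x + tau * drift(x) + np.sqrt(tau) * rng.normal(size=x.shape)
    e_new = local_energy(x_new)
    weights = weights * np.exp(-tau * (0.5 * (e_old + e_new) - e_trial))
    e_mixed = np.sum(weights * e_new) / np.sum(weights)
    return x_new, weights, e_mixed

def run_dmc(n_walkers=2000, n_steps=2000, tau=0.01, seed=0):
    rng = np.random.default_rng(seed)
    # Start from samples of psi_T^2, a Gaussian with variance 1/(4A).
    x = rng.normal(scale=np.sqrt(1.0 / (4.0 * A)), size=n_walkers)
    weights = np.ones(n_walkers)
    e_trial = local_energy(x).mean()          # VMC-like starting value
    history = []
    for _ in range(n_steps):
        x, weights, e_mixed = dmc_iteration(x, weights, e_trial, tau, rng)
        e_trial = e_mixed                     # simple feedback keeping weights of order one
        history.append(e_mixed)
    return np.array(history)

energies = run_dmc()
production = energies[len(energies) // 10:]   # drop the first 10% as equilibration
print(f"mixed-estimator energy: {production.mean():.4f} (exact ground state: 0.5)")
```

With A = 0.4 the variational energy of ψ_T is 0.5125, so an estimate close to 0.5 reflects the projection towards the ground state rather than the quality of the trial function.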
Note that walkers in DMC are more autocorrelated than those in the VMC inference phase, especially when the time step used in DMC is set to be small to avoid bias. Therefore, more batches of random walkers are needed in DMC than in VMC to reduce the statistical error to a given level. In practice, however, we found that the number of extra batches of walkers required in DMC is usually much smaller than the number of steps needed in the VMC training phase for full convergence.
Nodal structure and wavefunction visualization
The three-dimensional cuts of the full 11D nodal structure of Be in Fig. 2b–d are plotted according to the rules of Bressanini et al.^{28}. The four electrons’ spherical coordinates are respectively
fixing all the degrees of freedom except r_{1}, θ_{1} and r_{3}. The green surfaces in the plots show the nodal surfaces, i.e., the places where the value of wavefunction is zero.
To visualize the nodal surface of benzene, we calculated the wavefunction value on 2-dimensional slices of the 126-dimensional space. We first fixed a 126-dimensional electron configuration at the representative position of the benzene electronic structure from Liu et al.^{39}, and perturbed it slightly for visualization purposes. To construct one slice of the 126-dimensional space, we move a single spin-up electron in a 2-dimensional square with all other 41 electrons fixed. Then we apply FermiNet to the points on each slice and display the log-scaled magnitude of the evaluated wavefunction value, where the points with small values mark the nodes on each slice. Since the FermiNet output is unnormalized, diagrams for different FermiNet models may have drastically different ranges of displayed values.
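The slicing procedure can be illustrated on a toy wavefunction. The sketch below uses a hypothetical antisymmetric two-electron determinant in two dimensions instead of FermiNet and benzene; one electron is fixed, the other is scanned over a square grid, and the log-scaled magnitude is recorded. The orbitals, grid size and the small offset inside the logarithm are assumptions made for illustration.

```python
import numpy as np

def orbitals(r):
    """Two hypothetical 2D orbitals evaluated at positions r with shape (..., 2)."""
    dist = np.linalg.norm(r, axis=-1)
    return np.exp(-dist), r[..., 0] * np.exp(-0.5 * dist)

def toy_wavefunction(r1, r2):
    """Antisymmetric determinant phi_a(r1) phi_b(r2) - phi_a(r2) phi_b(r1)."""
    a1, b1 = orbitals(r1)
    a2, b2 = orbitals(r2)
    return a1 * b2 - a2 * b1

def log_magnitude_slice(r_fixed, half_width=2.0, n_points=81):
    """Scan electron 1 over a square grid with electron 2 held at r_fixed."""
    axis = np.linspace(-half_width, half_width, n_points)
    gx, gy = np.meshgrid(axis, axis, indexing="ij")
    r_moving = np.stack([gx, gy], axis=-1)             # shape (n_points, n_points, 2)
    psi = toy_wavefunction(r_moving, r_fixed)
    return axis, np.log(np.abs(psi) + 1e-300)          # offset avoids log(0) on a node

r_fixed = np.array([0.5, 0.0])
axis, log_mag = log_magnitude_slice(r_fixed)
ix = np.argmin(np.abs(axis - r_fixed[0]))
iy = np.argmin(np.abs(axis - r_fixed[1]))
print("log|psi| where the electrons coincide:", log_mag[ix, iy])
print("median log|psi| on the slice:", np.median(log_mag))
```

The array `log_mag` can be passed to any 2D plotting routine; strongly negative values trace the nodal line on the slice, which necessarily passes through the position of the fixed same-spin electron.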
Divergence measuring nodal surface difference
We define a divergence measuring the difference between two sets S_{1} and S_{2} in any metric space as follows
where P_{1} is a probability measure on S_{1}, and \(\{Y_{i}\}_{i=1,\ldots,K}\) are sampled from P_{1}. The distance d(Y, S) between a single point Y and a set S is defined as the smallest distance between Y and any point in S, namely
\[ d(Y, S) = \min_{X \in S} d(Y, X). \]
For a nodal surface S corresponding to an unnormalized wavefunction Ψ, we would like to define a measure on S such that a small area on S is assigned larger weight if its neighborhood has larger Ψ^{2} value, namely larger probability to be visited by walkers in DMC. Therefore, we consider a neighborhood
around S and a mapping
then “push forward” the probability density \(m_{\Psi^{2}}\) (corresponding to Ψ^{2}) from S_{ϵ} to S via ϕ, namely
Intuitively, for any point y in S_{ϵ} we may simply choose ϕ(y) to be the point on S that is closest to y.
However, it is quite difficult to determine both S_{ϵ} and ϕ mentioned above algorithmically, and thus, in practice, we use approximate alternatives that are much easier to compute. See Supplementary Note 15 for the algorithmic details.
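The point-to-set construction can be sketched numerically. The code below rests on two loudly stated assumptions that are not taken from this text: the divergence is estimated as the plain average of d(Y_i, S_2) over the samples Y_i, and P_1 is replaced by uniform sampling on S_1 instead of the Ψ²-based measure. Both sets are hypothetical circles represented by finite point clouds.

```python
import numpy as np

def point_to_set_distance(points, target_cloud):
    """d(Y, S) for each row Y of `points`, with S approximated by a finite point cloud."""
    diff = points[:, None, :] - target_cloud[None, :, :]
    return np.sqrt((diff**2).sum(axis=-1)).min(axis=1)

def divergence(samples_from_s1, cloud_s2):
    """Assumed estimator: average of d(Y_i, S_2) over samples Y_i drawn on S_1."""
    return point_to_set_distance(samples_from_s1, cloud_s2).mean()

def circle(radius, angles):
    return radius * np.stack([np.cos(angles), np.sin(angles)], axis=-1)

rng = np.random.default_rng(0)
samples_s1 = circle(1.0, rng.uniform(0.0, 2.0 * np.pi, size=200))              # Y_i on S_1
cloud_s2 = circle(1.5, np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False))    # dense S_2

div_12 = divergence(samples_s1, cloud_s2)
div_11 = divergence(samples_s1, circle(1.0, np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)))
print(f"D(S1 || S2) ~ {div_12:.4f},  D(S1 || S1) ~ {div_11:.4f}")
```

For two concentric circles the estimate reduces to the difference of the radii, and it drops to nearly zero when the two sets coincide, which is the behaviour one expects from a measure of nodal surface difference.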
Data availability
All data supporting the findings of this study are provided in Supplementary Information.
Code availability
We have released our DMC software at https://github.com/bytedance/jaqmc.
References
Pople, J. A. Nobel lecture: quantum chemical models. Rev. Mod. Phys. 71, 1267 (1999).
Kohn, W. Nobel lecture: electronic structure of matter—wave functions and density functionals. Rev. Mod. Phys. 71, 1253 (1999).
Helgaker, T. et al. Recent advances in wave function-based methods of molecular-property calculations. Chem. Rev. 112, 543 (2012).
Cao, Y. et al. Quantum chemistry in the age of quantum computing. Chem. Rev. 119, 10856 (2019).
Kirkpatrick, J. et al. Pushing the frontiers of density functionals by solving the fractional electron problem. Science 374, 1385 (2021).
Brandenburg, J. G., Zen, A., Alfè, D. & Michaelides, A. Interaction between water and carbon nanostructures: how good are current density functional approximations? J. Chem. Phys. 151, 164702 (2019).
Al-Hamdani, Y. S. et al. Interactions between large molecules pose a puzzle for reference quantum mechanical methods. Nat. Commun. 12, 1 (2021).
Eriksen, J. J. et al. The ground state electronic energy of benzene. J. Phys. Chem. Lett. 11, 8922 (2020).
Williams, K. T. et al. Direct comparison of many-body methods for realistic electronic Hamiltonians. Phys. Rev. X 10, 011041 (2020).
Booth, G. H., Thom, A. J. & Alavi, A. Fermion Monte Carlo without fixed nodes: a game of life, death, and annihilation in Slater determinant space. J. Chem. Phys. 131, 054106 (2009).
Umrigar, C. J., Toulouse, J., Filippi, C., Sorella, S. & Hennig, R. G. Alleviation of the fermion-sign problem by optimization of many-body wave functions. Phys. Rev. Lett. 98, 110201 (2007).
Kent, P. R. C. et al. QMCPACK: advances in the development, efficiency, and application of auxiliary field and real-space variational and diffusion quantum Monte Carlo. J. Chem. Phys. 152, 174105 (2020).
Needs, R. J., Towler, M. D., Drummond, N. D., López Ríos, P. & Trail, J. R. Variational and diffusion quantum Monte Carlo calculations with the CASINO code. J. Chem. Phys. 152, 154106 (2020).
Anderson, J. B. A random-walk simulation of the Schrödinger equation: \(\mathrm{H}_{3}^{+}\). J. Chem. Phys. 63, 1499 (1975).
Anderson, J. B. Quantum chemistry by random walk. \(\mathrm{H}\,{}^{2}P,\ \mathrm{H}_{3}^{+}\,D_{3h}\,{}^{1}A_{1}^{\prime},\ \mathrm{H}_{2}\,{}^{3}\Sigma_{u}^{+},\ \mathrm{H}_{4}\,{}^{1}\Sigma_{g}^{+},\ \mathrm{Be}\,{}^{1}S\). J. Chem. Phys. 65, 4121 (1976).
Foulkes, W. M. C., Mitas, L., Needs, R. J. & Rajagopal, G. Quantum Monte Carlo simulations of solids. Rev. Mod. Phys. 73, 33 (2001).
Bajdich, M., Mitas, L., Wagner, L. K. & Schmidt, K. E. Pfaffian pairing and backflow wavefunctions for electronic structure quantum Monte Carlo methods. Phys. Rev. B 77, 115112 (2008).
López Ríos, P., Ma, A., Drummond, N. D., Towler, M. D. & Needs, R. J. Inhomogeneous backflow transformations in quantum Monte Carlo calculations. Phys. Rev. E 74, 066701 (2006).
Han, J., Zhang, L. & E, W. Solving many-electron Schrödinger equation using deep neural networks. J. Comput. Phys. 399, 108929 (2019).
Pfau, D., Spencer, J. S., Matthews, A. G. & Foulkes, W. M. C. Ab initio solution of the many-electron Schrödinger equation with deep neural networks. Phys. Rev. Res. 2, 033429 (2020).
Spencer, J. S., Pfau, D., Botev, A. & Foulkes, W. M. C. Better, faster fermionic neural networks. https://arxiv.org/abs/2011.07125 (2020).
Hermann, J., Schätzle, Z. & Noé, F. Deep-neural-network solution of the electronic Schrödinger equation. Nat. Chem. 12, 891 (2020).
Entwistle, M., Schätzle, Z., Erdman, P. A., Hermann, J. & Noé, F. Electronic excited states in deep variational Monte Carlo. Nat. Commun. 14, 274 (2023).
Lin, J., Goldshlager, G. & Lin, L. Explicitly antisymmetrized neural network layers for variational Monte Carlo simulation. J. Comput. Phys. 474, 111765 (2023).
Scherbela, M., Reisenhofer, R., Gerard, L., Marquetand, P. & Grohs, P. Solving the electronic Schrödinger equation for multiple nuclear geometries with weight-sharing deep neural networks. Nat. Comput. Sci. https://doi.org/10.1038/s4358802200228x (2022).
Gao, N. & Günnemann, S. Ab-initio potential energy surfaces by pairing GNNs with neural wave functions. https://arxiv.org/abs/2110.05064 (2021).
Spencer, J. S., Pfau, D. & FermiNet Contributors. FermiNet, http://github.com/deepmind/ferminet (2020).
Bressanini, D. Implications of the two nodal domains conjecture for ground state fermionic wave functions. Phys. Rev. B 86, 115120 (2012).
Le Roy, R. J., Huang, Y. & Jary, C. An accurate analytic potential function for ground-state N2 from a direct-potential-fit analysis of spectroscopic data. J. Chem. Phys. 125, 164310 (2006).
Gdanitz, R. J. Accurately solving the electronic Schrödinger equation of atoms and molecules using explicitly correlated (r12-)MR-CI: the ground state potential energy curve of N2. Chem. Phys. Lett. 283, 253 (1998).
Wilson, M., Gao, N., Wudarski, F., Rieffel, E. & Tubman, N. M. Simulations of state-of-the-art fermionic neural network wave functions with diffusion Monte Carlo. https://arxiv.org/abs/2103.12570 (2021).
Smith, B. J., Swanton, D. J., Pople, J. A., Schaefer III, H. F. & Radom, L. Transition structures for the interchange of hydrogen atoms within the water dimer. J. Chem. Phys. 92, 1240 (1990).
Gillan, M. J., Alfè, D. & Michaelides, A. Perspective: how good is DFT for water? J. Chem. Phys. 144, 130901 (2016).
Tschumper, G. S. et al. Anchoring the water dimer potential energy surface with explicitly correlated computations and focal point analyses. J. Chem. Phys. 116, 690 (2002).
Gillan, M., Manby, F., Towler, M. & Alfè, D. Assessing the accuracy of quantum Monte Carlo and density functional theory for energetics of small water clusters. J. Chem. Phys. 136, 244105 (2012).
Smith, D. G. A. et al. Psi4 1.4: open-source software for high-throughput quantum chemistry. J. Chem. Phys. 152, 184108 (2020).
Johnson III, R. D. (ed.) NIST Computational Chemistry Comparison and Benchmark Database. NIST Standard Reference Database Number 101 (2021).
Eriksen, J. J. et al. The ground state electronic energy of benzene. J. Phys. Chem. Lett. 11, 8922 (2020).
Liu, Y., Kilby, P., Frankcombe, T. J. & Schmidt, T. W. The electronic structure of benzene from a tiling of the correlated 126-dimensional wavefunction. Nat. Commun. 11, 1210 (2020).
Pitonak, M. et al. Benzene dimer: high-level wave function and density functional theory calculations. J. Chem. Theory Comput. 4, 1829 (2008).
Schätzle, Z., Hermann, J. & Noé, F. Convergence to the fixed-node limit in deep variational Monte Carlo. J. Chem. Phys. 154, 124108 (2021).
Al-Hamdani, Y. S. et al. Interactions between large molecules pose a puzzle for reference quantum mechanical methods. Nat. Commun. 12, 3927 (2021).
Azadi, S. & Cohen, R. Chemical accuracy from quantum Monte Carlo for the benzene dimer. J. Chem. Phys. 143, 104301 (2015).
Tsuzuki, S., Honda, K., Uchimaru, T., Mikami, M. & Tanabe, K. Origin of attraction and directionality of the π/π interaction: model chemistry calculations of benzene dimer interaction. J. Am. Chem. Soc. 124, 104 (2002).
Sinnokrot, M. O., Valeev, E. F. & Sherrill, C. D. Estimates of the ab initio limit for π-π interactions: the benzene dimer. J. Am. Chem. Soc. 124, 10887 (2002).
Grover, J. R., Walters, E. A. & Hui, E. T. Dissociation energies of the benzene dimer and dimer cation. J. Phys. Chem. 91, 3233 (1987).
Zen, A., Brandenburg, J. G., Michaelides, A. & Alfè, D. A new scheme for fixed node diffusion quantum Monte Carlo with pseudopotentials: improving reproducibility and reducing the trial-wavefunction bias. J. Chem. Phys. 151, 134105 (2019).
Li, X., Li, Z. & Chen, J. Ab initio calculation of real solids via neural network ansatz. Nat. Commun. 13, 7895 (2022).
Li, X., Fan, C., Ren, W. & Chen, J. Fermionic neural network with effective core potential. Phys. Rev. Res. 4, 013021 (2022).
Born, M. & Oppenheimer, R. Zur Quantentheorie der Molekeln. Annalen der Physik 389, 457 (1927).
Karlin, S. & Taylor, H. E. A Second Course in Stochastic Processes (Elsevier, 1981).
Umrigar, C., Nightingale, M. & Runge, K. A diffusion Monte Carlo algorithm with very small time-step errors. J. Chem. Phys. 99, 2865 (1993).
Reynolds, P. J., Tobochnik, J. & Gould, H. Diffusion quantum Monte Carlo. Comput. Phys. 4, 662 (1990).
Bradbury, J. et al. JAX: Composable Transformations of Python+NumPy Programs. https://news.ycombinator.com/from?site=github.com/google (2018).
Zen, A., Sorella, S., Gillan, M. J., Michaelides, A. & Alfè, D. Boosting the accuracy and speed of quantum Monte Carlo: size consistency and time step. Phys. Rev. B 93, 241118 (2016).
Caffarel, M. Quantum Monte Carlo for Chemistry Toulouse. http://qmcchem.upstlse.fr/index.php/Quantum_Monte_Carlo_for_Chemistry_@_Toulouse/ (2009).
Chakravorty, S. J., Gwaltney, S. R., Davidson, E. R., Parpia, F. A. & Fischer, C. F. Ground-state correlation energies for atomic ions with 3 to 18 electrons. Phys. Rev. A 47, 3649 (1993).
Lyakh, D. I., Musiał, M., Lotrich, V. F. & Bartlett, R. J. Multireference nature of chemistry: the coupled-cluster view. Chem. Rev. 112, 182 (2012).
Acknowledgements
This work is directed and supported by Hang Li and ByteDance Research. We thank Yubing Qian for providing preliminary settings for water dimer simulations. We thank Weinan E, Xiang Li, Kai Zheng, Ke Liao, and Yu Liu for fruitful discussions. We thank Mike Entwistle and James Spencer for sharing data and results. We thank Michel Caffarel for allowing us to use an adapted version of the figure on his website. We thank Shaochen Shi, Xiaoying Jia, Xin Liu, and Chenlin Chai for engineering improvement on our DMC software. We thank the rest of ByteDance Research team for inspiration and encouragement. J.C. is supported by the National Key R&D Program of China under Grant No. 2021YFA1400500, the National Natural Science Foundation of China under Grant No. 92165101 and No. 11974024, and the Strategic Priority Research Program of Chinese Academy of Sciences under Grant No. XDB33000000.
Author information
Contributions
W.R. and J.C. conceived the study; W.R. implemented the main code with important contributions from W.F.; W.R. and W.F. performed simulations, data analyses, and figure designing. X.W. performed CCSD(T) related calculation. J.C. supervised the project. W.R., W.F., X.W., and J.C. wrote the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ren, W., Fu, W., Wu, X. et al. Towards the ground state of molecules via diffusion Monte Carlo on neural networks. Nat Commun 14, 1860 (2023). https://doi.org/10.1038/s41467023376093
DOI: https://doi.org/10.1038/s41467023376093