A fast and accurate computational method for the linear-combination-based isotropic periodic sum

An isotropic periodic sum (IPS) is a powerful technique to reasonably calculate intermolecular interactions for wide range of molecular systems under periodic boundary conditions. A linear-combination-based IPS (LIPS) has been developed to attain computational accuracy close to an exact lattice sum, such as the Ewald sum. The algorithm of the original LIPS method has a high computational cost because it needs long-range interaction calculations in real space. This becomes a performance bottleneck for long-time molecular simulations. In this work, the combination of an LIPS and fast Fourier transform (FFT) was developed, and evaluated on homogeneous and heterogeneous molecular systems. This combinational approach of LIPS/FFT attained computational efficiency close to that of a smooth particle mesh Ewald while maintaining the same high accuracy as the original LIPS. We concluded that LIPS/FFT has great potential to extend the capability of IPS techniques for the fast and accurate computation of many types of molecular systems.

Molecular dynamics (MD) simulations continue to evolve. Recent advances in computer power and algorithmic developments have made it possible to simulate a wide range of molecular systems with reasonable time and length scales for scientific and industrial applications [1][2][3][4][5] .
The computational accuracy of MD has been supported by established molecular models and intermolecular interaction calculation methods [6][7][8][9][10][11][12][13][14] . Importantly, the accuracy of long-range interaction calculations strongly affects to the accuracy of MD. Under the periodic boundary conditions (PBCs), long-range interactions are ordinally calculated using truncation or lattice sum methods. The simplicity and calculation efficiency are merits of truncation methods, but truncation of long-range interactions can be serious defects on simulating Lennard-Jones (LJ) liquids [15][16][17] , aqueous [18][19][20][21][22][23][24][25][26][27][28] , and macromolecular systems [29][30][31][32][33][34][35][36][37][38][39][40][41][42][43] . The reaction field method 44,45 , the Wolf method and its modifications 46,47 , and the smoothing and/or shifting cutoff methods are kinds of truncation methods that have serious defects for various systems (see above citations). The Ewald sum 48 is used as the most accurate lattice sum method. It incorporates PBCs using the discrete Fourier transform (DFT), but the reciprocal term is computationally expensive. The particle mesh Ewald (PME) and its modifications 49,50 are particle-particle/particle-mesh approaches that use fast Fourier transforms (FFT) for the Ewald sum, and the smooth PME (SPME) 50 is now the de facto standard lattice sum method. The SPME perform well for intermediate-sized systems ( ∼ − N 10 10 3 5 , where N is the number of particles) 51 . However, the SPME does not have a high performance for large systems ( ∼ N 10 6 or larger), because it is difficult for FFT to attain strong scaling on massively parallel machines 52,53 . The Barnes-Hut tree-code 54 and the fast multipole method 55 are lattice sum methods that use hierarchical tree structures. These tree-based method can attain stronger scaling than FFT because it does not contain reciprocal space calculations that require all-to-all communications on massively parallel machines 53 . The tree-based method has been applied for long-range interaction calculations of molecular simulations 52,[56][57][58][59][60][61][62][63][64][65][66][67][68][69][70] . The Gaussian split Ewald sum 71 is a combination between the Ewald sum and Poisson equation solver, and can avoid all-to-all communications on massively parallel machines. Importantly, all the lattice sum methods are potentially subject to the symmetry effect that comes from the lattice-like repetition of the unit cell for PBC. This effect is especially troublesome when simulating macromolecular systems, where conformational distributions are sensitive to image interactions 38,40,[72][73][74][75][76][77][78][79][80] . More research into the lattice sum method is required, and the symmetry effect should be carefully considered. To avoid the symmetry effect, the isotropic periodic sum (IPS) technique is a promising approach. It was first developed by Wu and Brooks 81 . The IPS has become a powerful technique to reasonably calculate intermolecular  interactions for a wide range of molecular systems, including net-charge systems under PBCs. It has been applied  to solids 82 , liquids 81,83-86 , solid-liquid 87,88 and liquid-vapor interfaces 89,90 , liquid crystals 91 , proteins 81,92 , lipids 89,93 ,  combined quantum mechanics/molecular mechanics methods 94 , and constant pH MD simulations [95][96][97] . Improved methods have been developed for large-scale systems that exploit the possibility of parallel computing 98,99 .
An extended IPS technique, that is, the linear-combination-based IPS (LIPS) was developed to improve the accuracy of the IPS for both homogeneous and heterogeneous systems 7,100-103 . The LIPS provides periodic reaction fields that can design pseudo pair potentials in the range of extended IPS theory. This pseudo pair potential has the high accuracy that achieves computational results close to an exact lattice sum. For example, in the phase transition of liquid crystal systems, the LIPS and SPME with fine grid spacing are the only techniques that can reasonably estimate the solid-liquid-crystalline phase transition temperature 104 . The LIPS has great potential as one of the best possible approaches to contribute to a further substantial advance in IPS techniques. However, the algorithm of the original LIPS has a high computational cost because it requires long-range interaction calculations in real space. This becomes a performance bottleneck for long-time molecular simulations using the LIPS. Several possible approaches to raise the computational efficiency of the LIPS were suggested in the original paper 100 , but have not been applied.
In the present work, the combination of the LIPS and fast Fourier transform (FFT), that is, LIPS/FFT, is developed as a substantial advance in the computational efficiency of the LIPS for intermediate-sized systems -∼ N ( 10 10 ) 3 5 . The performance of LIPS/FFT is evaluated on homogeneous and heterogeneous polar molecular systems. LIPS/FFT attains computational efficiency close to the SPME while maintaining the high accuracy of the LIPS. We conclude that LIPS/FFT has great potential to extend the capability of IPS techniques for the fast and accurate computation of many types of molecular systems.

Methodology
LIPS/FFT method. In the LIPS method, pseudo pair potentials are designed using extended IPS theory, which provides periodic reaction fields 100,101 . Several types of pseudo pair potentials have been developed 100,101 , and these can be expressed as the following equation generally: LIPS c c where u LIPS is the LIPS pseudo pair potential; r is the interaction distance; R c is the cutoff radius of the LIPS, which is closely related to the production of periodic reaction fields; u is an original pair potential, such as Coulomb interaction; and φ is the effective potential from the periodic reaction field. The LIPS pseudo pair potential has high accuracy. The previous paper demonstrated that one of the pseudo pair potentials, LIPS-SW, had the same accuracy as the SPME with fine grid spacing (less than 0.1 nm), for the estimation of the solid-liquid-crystalline phase transition temperature 104 . To attain high accuracy, the LIPS potentials require a large R c condition R c = L/2, where L is the length of the longest side of the simulation box. This means long-range interaction calculations in real space, and becomes a performance bottleneck for MD simulations using the LIPS. To avoid the aforementioned difficulty and attain advanced computational efficiency, the combination of the LIPS and FFT is developed as follows: (i) The original LIPS pair potential is divided into short-range, long-range and boundary pair potentials u S , u L and u B , respectively. For the exact implementation within CHARMM 105-107 (version c40b2), the following expressions are used: where r c (<R c ) is the short-range cutoff radius introduced for convenience. (ii) LIPS potential energy U LIPS is calculated as the summation of the short-range, long-range, and boundary potential energy, U S , U L and U B , respectively.
(iii) The short-range potential energy U S is simply calculated in real space: where r ij is the interaction distance between particles i and j. (iv) The long-range potential energy is calculated in reciprocal space using FFT. The details are very similar to those of the SPME 50 . For any simulation box, the charge distribution q and potential energy distribution Φ can be exactly defined using matrices that contain positional information for point charges. For the advanced calculation efficiency of FFT, the point charges have to be redistributed to predefined grid points. To do this, the following charge array Q is introduced: i N n n n i n i n  i  n  i  1 2  3  , ,   1  1  1 1  2  2  2 2  3  3  3 3   1 2 3 where k 1 , k 2 , and k 3 are integers that correspond to predefined grid points; n 1 , n 2 , and n 3 are integers that correspond to the real space summation; M n is the cardinal b-spline function of the n-th order; w 1i , w 2i , and w 3i are real numbers; and K 1 , K 2 , and K 3 are positive integers that correspond to the total number of grid points. Thus, the Fourier transform of q is approximated by Q as follows: where F is the operator of the discrete Fourier transform (DFT); m 1 , m 2 , and m 3 are integers that correspond to the reciprocal space summation; and The term Φ for long-range pair potential at the grid points also has to be defined. Thus, the following energy array Φ * is introduced: n n n 1 2 3 , , The convolution between Q and Φ * , * Φ ⁎ Q , is expressed as follows: where j 1 , j 2 , and j 3 are integers. With the fine grid spacing conditions, * Φ ⁎ Q can be used as the approximation of qΦ. For advanced computational efficiency, the convolution calculation using FFT is essential. * Φ ⁎ Q can be simply calculated using DFT and inverse DFT (IDFT): where F −1 is the operator of IDFT. Therefore, long-range potential energy U L can be calculated as follows: where B = |b 1 | 2 ⋅ |b 2 | 2 ⋅ |b 3 | 2 . The reciprocal long-range force is calculated using the differential of Eq. 13 with respect to position vector of charges r i : where r αi is the α (=x, y, z) component of the coordinate of molecule i, r i . The term ∂Q/∂r αi can be calculated based on Eq. 7 from the property of the b-spline functions: where w is a real number. (v) The boundary potential energy U B is simply calculated in real space: The potential energy and force of LIPS/FFT is calculated from the aforementioned five steps. Note that a similar approach was performed by Wu and Brooks for the combination of the IPS and FFT (IPS/DFFT) 92  Simulation conditions. To evaluate the capability of LIPS/FFT on homogeneous and heterogeneous polar molecular systems, MD simulations for bulk water and water-vapor interfacial systems were performed. For a careful evaluation of the effect of IPS techniques on long-range interaction calculations, all the following three simulation conditions were met. (i) LIPS/FFT was applied only for electrostatic interactions. LIPS-SW 101 and LIPS-5th 100 were used for the pseudo pair potential of LIPS/FFT. For comparison, IPS/DFFT and the SPME were also applied only for electrostatic interactions. The IPS method for non-polar systems (IPSn) 81 and polar systems (IPSp) 84 were used for IPS/DFFT. For all the aforementioned methods, the short-range cutoff radius for real space interaction calculations and the order for b-spline interpolations were set to 1.0 nm and 8, respectively. The factor α for the SPME 48-50 was set to 3.71692 nm −1 , which corresponds to the short-range cutoff radius. For the treatment of Lennard-Jones (LJ) interactions, the shifting/switching functions to the LJ forces were used [105][106][107] , with the non-bonded cutoff and inner-cutoff equal to 1.0 nm and 0.99 nm, respectively. (ii) The large cutoff radii R c less than or equal to L/2 were mainly used under cubic simulation boxes as a result of considering previous reports for the symmetry effect 79,80 . Even though IPS techniques are conceptually different from the lattice sum, LIPS/FFT and IPS/DFFT at infinite R c is almost equal to the lattice sum. Moreover, these two methods at R c > L/2 contain at least the nearest copy boxes of PBCs without any processing by IPS techniques. These facts imply the concern that the combination of IPS techniques and FFT cannot avoid the effect from PBCs when requiring large cutoff radii R c > L/2 for accuracy. For a secure application, therefore, LIPS/FFT and IPS/DFFT should be used at R c ≤ L/2. (iii) Fine grid spacing less than or equal to 0.1 nm was used. as a result of considering the previous study for liquid-crystalline phase transition phenomena 104 , which demonstrated that even using the SPME, fine grid spacing conditions (less than 0.1 nm) were required for the accurate estimation of the phase transition temperature of macromolecular systems. LIPS/FFT and IPS/DFFT under fine grid spacing conditions should be evaluated for future applications to complex macromolecular systems.
Bulk water systems were simulated with a constant molecular number, volume, and temperature condition [108][109][110] . The number of water molecules was 6,192, the size of the simulation box was 5.71 nm × 5.71 nm × 5.71 nm, and the temperature was 298.15 K. The extended simple point charge (SPC/E) model 111 was used for water molecules. The atoms in each water molecule were constrained by the SHAKE algorithm 112 . The Verlet leapfrog integrator 113 was used with a time step of 2 fs. All simulation systems were equilibrated prior to data acquisition, and the elapsed time after the equilibration was 1 ns.
The water-vapor interfacial systems were simulated with the constant molecular number, volume, and temperature condition. The number of water molecules was 13,500, the size of simulation box was 10.8 nm × 10.8 nm × 10.8 nm, and the temperature was 298.15 K. The SPC/E model was used for water molecules. The atoms in each water molecule were constrained by the SHAKE algorithm. The Verlet leapfrog integrator was used with a time step of 2 fs. All simulation systems were equilibrated prior to data acquisition, and the elapsed time after the equilibration was 15 ns.
All the aforementioned simulations were performed using CHARMM 105-107 (version c40b2) modified for LIPS/FFT implementation.

Results and Discussion
To directly evaluate the accuracy of the electrostatic forces calculated by LIPS/FFT, the following two steps were performed: (i) The instantaneous value of the electrostatic forces was calculated for each method using exactly the same coordinates of molecular systems that had been equilibrated using the SPME. Note that the coordinates were not equilibrated using any IPS techniques but using the SPME. (ii) The instantaneous value was compared with that of the SPME using the root-mean-squared deviation of the forces, δ f , and the largest error of the forces, e f,max , defined as follows: where f α,i and f α,i,SPME are instantaneous forces calculated by LIPS/FFT and the SPME, respectively. For the calculation of δ f , grid spacing Δ for the SPME was set to 0.05 nm. δ f and e f,max for IPS/DFFT were also calculated for comparison. Figure 1(a) shows the R c dependences of δ f (top) and e f,max (bottom) for bulk water systems for two grid spacing conditions: Δ = 0.1 nm and 0.05 nm. Overall, the R c dependences of δ f and e f,max were similar to each other method, with few exceptions. However, the results show that LIPS/FFT was in better agreement with the SPME than IPS/DFFT. With increasing R c , δ f and e f,max decreased, except for the case using IPSn/DFFT with Δ = 0.1 nm. This indicates that only IPSn/DFFT with Δ = 0.1 nm barely improved its accuracy with increasing R c . This very slow decay of δ f and e f,max was caused by the cutoff boundary condition of the IPSn pseudo pair potential. This boundary condition was too simple to accurately estimate polar molecular systems 100 . Using finer grid spacings can improved the accuracy of IPSn/DFFT to a level close to the other methods. With decreasing Δ, δ f and e f,max decreased, except for the case that used IPSp/DFFT. This indicates that only IPSp/DFFT could not improve its accuracy by choosing fine grid spacing less than 0.1 nm. The IPSp pseudo pair potential included a screening effect from the countercharge modeled on the typical structure of polar molecular systems. This screening effect smeared information for the charge distribution embedded in the fine grids, and then inhibited the accuracy improvement of IPSp/DFFT. Figure 1(b) shows the R c dependences of δ f (top) and e f,max (bottom) for water-vapor interfacial systems for two grid spacing conditions: Δ = 0.1 nm and 0.05 nm. It is obvious that LIPS/ FFT was in much better agreement with the SPME than IPS/DFFT. δ f and e f,max started to drastically decrease when R c exceeded the thickness of the water slab (∼4 nm), with a few exceptions. This fast decay was because the sphere-like cutoff territory with large R c became able to contain the major part of the characteristic structure of molecular systems. For LIPS/FFT, δ f at R c > 4 nm decayed in proportion to − . R c 3 4 . For IPSp/DFFT, δ f decay was clearly slower than it was for LIPS/FFT; δ f at R c > 4 nm decayed in proportion to − . R c 1 9 . For IPSn/DFFT, δ f decay was very slow. Using finer grid spacings could improve the accuracy of IPSn/DFFT; however, the slow decay tendency would still remain. Even using Δ = 0.05 nm, δ f at R c > 4 nm barely decay. With decreasing Δ, δ f and e f,max decreased, except for the case using IPSp/DFFT. Similar to the results of bulk water systems, only IPSp/DFFT could not improve its accuracy by choosing fine grid spacing less than 0.1 nm. The screening effect of IPSp inhibited accurate calculation using IPS/DFFT regardless of the homogeneity/heterogeneity of the molecular structure of systems.
To evaluate the accuracy of LIPS/FFT for estimating homogeneous/heterogeneous polar molecular systems, some physical properties for bulk water and water-vapor systems were calculated. The potential energy U is one of the fundamental thermodynamic properties. Figure 2(a) shows U for bulk water systems at Δ = 0.1 nm. In comparison with the results of the SPME, the accuracy for estimating U is in the following order: LIPS-5th/FFT (=SPME) = LIPS-SW/FFT = IPSp/DFFT > IPSn/DFFT. This order follows the results of δ f and e f,max . The radial distribution function g(r) is a critical property showing local configuration of molecular systems. The conventional expression give, where n i (r) is the number of molecules in the region between r and r + Δr from molecule i. Figure 2(b) shows the oxygen-oxygen g(r) for bulk water systems at R c = 2.8 nm and Δ = 0.1 nm. In comparison with the results of the SPME, the accuracy for estimating g(r) is in the following order: LIPS-5th/FFT (=SPME) = LIPS-SW/ FFT = IPSp/DFFT > IPSn/DFFT. This order follows the results of δ f and e f,max . The velocity auto-correlation function C(t) explains microscopic motion of molecular systems 114 . The conventional expression give, where v i is the velocity of molecule i. Figure 2(c) shows C(t) for bulk water systems at R c = 2.8 nm and Δ = 0.1 nm. All the results were almost same as that of the SPME. The self-diffusion coefficient D is one of the representative dynamic properties. D can be determined either by the Einstein relation or Green-Kubo formula, which are basically equivalent. Here we used the Einstein relation, where t is time. The slope of the mean-squared displacement of a diffusing particle in the long-time limit was calculated for the diffusion coefficient. Figure 2(d) shows D for bulk water systems at Δ = 0.1 nm. All the results were almost same as that of the SPME. The mass density profile ρ(z) is one of the fundamental thermodynamic properties showing the mass distribution of heterogeneous systems. Figure 2(e) shows ρ(z) for water-vapor interfacial systems at R c = L/2 and Δ = 0.1 nm. All the results were almost same as that of the SPME. The electrostatic potential profile ψ(z) is well known as one of the properties sensitive to the cutoff radius 84,90,92,[100][101][102][103] , and thus should be calculated to evaluate the accuracy of the truncation methods. ψ(z) was calculated using the double integration of the Poisson equation: One is the influence of pseudo pair potentials at a relatively short interaction distance. For the pseudo pair potentials, the deviation from the Coulomb potential simply increased with an increase of the interaction distance. In a relatively short interaction distance (r < R c /2), the deviation was in the following order: LIPS-SW < LIPS-5th = IPSn < IPSp. This fact corresponds to the results of ψ(z), but not to that of δ f and e f,max because the deviation at a short interaction distance was concealed in δ f and e f,max , which reflected  the total deviation. The other reason is the influence of the difference between the instantaneous and equilibrated values. δ f and e f,max are the instantaneous values calculated using coordinates equilibrated by the SPME, whereas ψ(z) is the physical property calculated from the molecular structure equilibrated by each method. This difference may affect the evaluation of accuracy. Note that the results of ψ(z) at R c = L/2 and Δ = 0.05 nm are almost the same as those plotted in Fig. 2(f), and consistent with the accuracy of δ f and e f,max at the same R c and Δ conditions (data not shown).
To evaluate the computational efficiency of LIPS/FFT, CPU time t CPU was measured for water-vapor interfacial systems with different N. An Intel(R) Xeon(R) CPU E5-2690 v2 (10 cores/20 threads, 3.00 GHz) was used for the measurement. L when N = 12000, 40500, and 96000 was 7.2 nm, 10.8 nm, and 14.4 nm, respectively. R c and Δ were set to L/2 and 0.1 nm, respectively. Figure 3 shows the N dependence of t CPU . The results clearly demonstrate that LIPS/FFT attained almost the same computational efficiency as the SPME. t CPU for every method satisfied O(N logN) scaling, which means ideal computational efficiency using the FFT algorithm.

Conclusion
We developed a combination of the LIPS and FFT, LIPS/FFT, for a substantial advance in the computational efficiency of the LIPS for intermediate-sized systems ∼ -N ( 10 10 ) 3 5 . The performance of LIPS/FFT was evaluated on homogeneous/heterogeneous polar molecular systems. LIPS/FFT attained computational efficiency almost the same as that of the SPME while maintaining the advanced accuracy of the original LIPS. Furthermore, static and dynamic properties calculated by LIPS-SW/FFT was almost the same as that by the SPME. This indicates that LIPS-SW/FFT accurately estimated not only instantaneous values but also equilibrated values. This high accuracy was observed within R c ≤ L/2. It clearly demonstrated that LIPS/FFT could overcome the concern that the combination of IPS techniques and FFT could not avoid the symmetry effect when requiring large cutoff radii R c > L/2 for accuracy. One of the potential advantages of IPS techniques is the capability to simulate complex ionic systems that are difficult to accurately simulate using conventional lattice sum methods 79,80,115 . LIPS/FFT can enhance this advantage because it has both high accuracy and computational efficiency, while avoiding the symmetry effect. We conclude that the developed LIPS/FFT has great potential to extend the capability of IPS techniques for the fast and accurate computation of many types of molecular systems that involve highly complex ionic structure and dynamics, such as counterion condensation, ion conduction, and electrochemical migration. In further studies, the capability of LIPS/FFT for complex ionic systems will be intensively evaluated.