Preconditioners for the geometry optimisation and saddle point search of molecular systems

Mones, Letif; Ortner, Christoph; Csányi, Gábor

doi:10.1038/s41598-018-32105-x

Article
Open access
Published: 18 September 2018

Letif Mones^1,2,
Christoph Ortner¹ &
Gábor Csányi²

Scientific Reports volume 8, Article number: 13991 (2018)

Abstract

A class of preconditioners is introduced to enhance geometry optimisation and transition state search of molecular systems. We start from the Hessian of molecular mechanical terms, decompose it and retain only its positive definite part to construct a sparse preconditioner matrix. The construction requires only the computation of the gradient of the corresponding molecular mechanical terms that are already available in popular force field software packages. For molecular crystals, the preconditioner can be combined straightforwardly with the exponential preconditioner recently introduced for periodic systems. The efficiency is demonstrated on several systems using empirical, semiempirical and ab initio potential energy surfaces.

Introduction

Geometry optimisation and transition state search are fundamental procedures to identify important stationary points of molecules, molecular crystals and material systems in computational chemistry. Since the evaluation of chemically accurate ab initio potential energies and gradients is computationally demanding, several techniques have been developed over the last three decades to enhance the convergence of optimisation methods.

Among the most widely used are quasi-Newton methods, in particular BFGS or its limited memory version¹, which start with a (scaled) identity as their guess for the Hessian and update it at each iteration based on the gradient information collected from previous steps. Initialising with a Hessian guess that includes more molecule specific geometrical information can improve the convergence. For instance, just introducing some connectivity information about a molecule can lead to surprisingly good results^2,3.

Utilising internal coordinates in the construction of the Hessian has also been intensively investigated^4,5,6,7,8,9. These approaches include (1) methods that build the Hessian matrix in the space of internal coordinates at the beginning of the optimisation and then transform it to Cartesian coordinates, (2) methods that recompute and transform the internal Hessian at every step and (3) methods that carry out even the optimisation step in the space of internal coordinates. However, this last scheme requires both the projection of Cartesian gradients onto internal coordinate gradients and the transformation of the internal coordinates back to Cartesian coordinates. Neither of these steps is straightforward, and the gradient conversion can be accomplished only by an iterative solution due to the curvilinear nature of the transformation⁸.

More sophisticated methods estimate the initial Hessian guess from a surrogate potential (e.g. force field or semiempirical potential) whose second derivative can be obtained at low computational cost. The update of the approximate Hessian can then be achieved either by using quasi-Newton methods or other techniques such as DIIS^10,11. Such strategies significantly improve the speed of convergence in either Cartesian or internal coordinates⁶.

If the model Hessian is cheap to calculate (e.g., as obtained from a surrogate model) and provides a reasonable approximation of the quantum Hessian, it may be advantageous to recompute it at every optimisation step. Such a scheme was introduced by Lindh et al.¹², where a model potential is constructed consisting of quadratic terms for all distances, angles and dihedrals in the molecule. At each geometry optimisation step the force field is constructed such that the current conformation is its local minimum, and its “Hessian” is computed, neglecting the dependence of the force field parameters on the geometry. With this construction, the model “Hessian” is therefore not the Hessian of any potential. Nevertheless, this approach yields excellent performance, which led to its wide implementation in quantum chemistry program packages (e.g. in MOLPRO¹³, ORCA¹⁴, DALTON¹⁵, CRYSTAL¹⁶).

The model “Hessian” of Lindh¹² can be considered as a preconditioner or metric that effects a transformation to a new coordinate system where the optimisation problem is better conditioned, hence algorithms converge more rapidly and tend to be more robust. Geometrically, the shape of the energy landscape becomes more isotropic. In general it is desirable that both the construction and inversion of the preconditioner matrix are inexpensive (at least compared to the computation of the energy and gradient). This can be achieved by building a sparse preconditioner from simple analytical functions. Since the preconditioner defines a metric in configuration space it needs to be positive definite. This requirement is automatically fulfilled in the Lindh approach¹² by the use of quadratic molecular mechanical terms with equilibrium points corresponding to the actual geometry at each step.

We recently introduced a general and effective preconditioner for geometry optimisation and saddle point search for material systems³. This preconditioner is determined by the local connectivity structure of atoms making both the construction and inversion computationally inexpensive. Especially for larger systems we observe an order of magnitude or larger reduction of the number of optimisation steps for the preconditioned LBFGS method compared to the same without preconditioning. Here we expand our previous work to molecules and molecular crystals by combining it with a force field based preconditioner inspired by the approach of Lindh et al.¹².

Methods

Enhancing geometry optimisation by using preconditioners

We briefly review the methodology for preconditioning geometry optimisation and the dimer saddle point search method for material systems³. For a system with N particles let ${x}_{k}\in {{\mathbb{R}}}^{3N}$ denote the configuration at the k th iterate of an optimisation algorithm. The corresponding energy, gradient and preconditioner are denoted by f_k = f(x_k), g_k = ∇f(x_k) and ${P}_{k}\in {{\mathbb{R}}}^{3N\times 3N}\approx {\nabla }^{2}f({x}_{k})$, respectively. A preconditioned steepest-descent step is then given by

$${x}_{k+1}={x}_{k}-{\alpha }_{k}{P}_{k}^{-1}{g}_{k}$$

(1)

where α_k is the step size obtained from some line search procedure at the k th iteration. If P_k = I, then (1) becomes the standard steepest descent scheme and if P_k = ∇²f(x_k), then it becomes a quadratically convergent Newton scheme. In general, different choices of P_k may “interpolate” between these extremes.

From an alternative point of view preconditioning can be thought of as a coordinate transformation, where a new set of coordinates is defined as ${y}_{k}:={P}_{k}^{\mathrm{1/2}}{x}_{k}$ and the energy expressed in the new coordinates is $F({y}_{k}):=f({P}_{k}^{-\mathrm{1/2}}{y}_{k})$. The advantage of this framework is that once an appropriate preconditioner matrix is available, any optimisation algorithm can be modified by applying the original algorithm to the transformed coordinates. To obtain the final form of the modified algorithm we need to transform the variables back to the original coordinate system. As a simple example, applying the coordinate transformation to the gradient descent equation immediately leads to the preconditioned scheme of Eq. 1:

$$\begin{array}{rcl}{y}_{k+1} & = & {y}_{k}-{\alpha }_{k}{\nabla }_{y}F({y}_{k})\\ {P}_{k}^{\mathrm{1/2}}{x}_{k+1} & = & {P}_{k}^{\mathrm{1/2}}{x}_{k}-{\alpha }_{k}{P}_{k}^{-\mathrm{1/2}}{\nabla }_{x}\,f({x}_{k})\\ {x}_{k+1} & = & {x}_{k}-{\alpha }_{k}{P}_{k}^{-1}{\nabla }_{x}\,f({x}_{k})\end{array}$$

(2)

where we used the fact that ${\nabla }_{y}F({y}_{k})={P}_{k}^{-\mathrm{1/2}}{\nabla }_{x}\,f({x}_{k})$. Preconditioning popular optimisation methods like LBFGS, conjugate gradients, FIRE¹⁷ etc. is similarly possible.
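The effect of the choice of P_k in Eq. 1 can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: the function name `preconditioned_descent`, the diagonal test Hessian and the exact line search (valid only for a quadratic energy) are assumptions made for illustration.

```python
import numpy as np

def preconditioned_descent(H, x0, P, tol=1e-8, max_iter=10000):
    """Minimise f(x) = 0.5 x^T H x with x <- x - alpha P^{-1} g (Eq. 1).

    alpha is the exact line-search minimiser for a quadratic, an
    illustrative choice rather than the line search used in the paper.
    Returns the final point and the number of iterations taken.
    """
    x = np.array(x0, dtype=float)
    for k in range(max_iter):
        g = H @ x                          # gradient of the quadratic
        if np.linalg.norm(g, np.inf) < tol:
            return x, k
        d = np.linalg.solve(P, g)          # preconditioned gradient P^{-1} g
        alpha = (g @ d) / (d @ H @ d)      # exact minimiser along -d
        x = x - alpha * d
    return x, max_iter

# Hypothetical toy Hessian with condition number 100.
H = np.diag([1.0, 10.0, 100.0])
x0 = np.ones(3)

_, n_identity = preconditioned_descent(H, x0, np.eye(3))                 # P = I
_, n_approx = preconditioned_descent(H, x0, np.diag([1.0, 10.0, 90.0]))  # P close to H
_, n_newton = preconditioned_descent(H, x0, H)                           # P = H
```

With P = H the first step lands on the minimum, while an approximate P already removes most of the ill-conditioning; comparing `n_identity`, `n_approx` and `n_newton` shows the "interpolation" between the two extremes.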

A simple preconditioner that is effective for a wide range of materials systems is based on the following N × N matrix³

$${L}_{ij}=\left\{\begin{array}{ll}-\mu \exp \left(-A\left(\frac{{r}_{ij}}{{r}_{{\rm{nn}}}}-1\right)\right), & i\ne j\ \text{and}\ |{r}_{ij}| < {r}_{{\rm{cut}}}\\ 0, & i\ne j\ \text{and}\ |{r}_{ij}|\ge {r}_{{\rm{cut}}}\\ -\sum _{i^{\prime} \ne j}{L}_{i^{\prime} j}, & i=j\end{array}\right.$$

(3)

where i and j denote atomic indices and μ, A, r_cut and r_nn are parameters that can be user-specified or estimated numerically. We note that (3) is a generalisation of the Laplacian matrix used to represent undirected graphs. Given a specific connectivity defined by r_cut and setting A = 0 and μ = 1, L_ij reduces exactly to the Laplacian matrix. The actual 3N × 3N preconditioner is simply obtained from the corresponding L_ij element in an isotropic manner:

$${[{P}_{{\rm{Exp}}}]}_{k+3(i-1),\,l+3(j-1)}=\left\{\begin{array}{ll}{L}_{ij}, & k=l\\ 0, & k\ne l\end{array}\right.$$

(4)

where the k and l indices denote Cartesian components. Despite its simplicity in capturing only geometric connectivity but no specific material information, P_Exp provides a good model for the local curvature of the potential energy landscape, which effectively controls ill-conditioning in large systems. The application of P_Exp resulted in a significant reduction of the number of optimisation steps required for several material systems compared to the unpreconditioned LBFGS³.
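Eqs. 3 and 4 translate almost directly into array operations. The sketch below is a dense, non-periodic illustration only; the function name `exp_preconditioner` and the four-atom chain are hypothetical, and a production code would use sparse storage and periodic neighbour lists.

```python
import numpy as np

def exp_preconditioner(positions, r_cut, r_nn, mu=1.0, A=3.0):
    """Return the N x N matrix L of Eq. 3 and the 3N x 3N matrix P_Exp of Eq. 4."""
    pos = np.asarray(positions, dtype=float)
    # Pairwise distances r_ij.
    r = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    L = -mu * np.exp(-A * (r / r_nn - 1.0))
    L[r >= r_cut] = 0.0                      # no coupling at or beyond the cutoff
    np.fill_diagonal(L, 0.0)                 # clear the diagonal before summing
    np.fill_diagonal(L, -L.sum(axis=0))      # L_jj = -sum_{i != j} L_ij
    # Eq. 4: every L_ij multiplies a 3 x 3 identity block.
    P = np.kron(L, np.eye(3))
    return L, P

# Hypothetical linear chain of four atoms with unit spacing.
chain = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [3.0, 0.0, 0.0]]

# With A = 0 and mu = 1 the matrix L is the graph Laplacian of the chain.
L_lap, P_lap = exp_preconditioner(chain, r_cut=1.5, r_nn=1.0, mu=1.0, A=0.0)
```

Setting A = 0 and μ = 1 reproduces the graph Laplacian noted in the text, which gives a quick sanity check for any implementation.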

FF-based preconditioners

Preliminary tests showed that for molecular systems such as molecules in gas phase or molecular crystals using P_Exp still leads to a speed-up, but a much more modest one than for material systems. The explanation for this is that molecular systems contain a wide range of different interactions (e.g., pair, angle, dihedral, electrostatic, dispersive) of vastly varying stiffness which in addition are more loosely coupled, and this creates a second source of ill-conditioning distinct from ill-conditioning due to large system size. Inspired by the use of internal coordinates in molecular optimisation techniques⁴ and the model Hessian of Lindh et al.¹² we therefore propose a generalisation of P_Exp that is effective also for molecular systems.

The construction of our FF-based preconditioner begins with a surrogate potential energy function, given by a sum over internal coordinates each describing a short-range bond in the system (distance, angle, or dihedral),

$${V}_{{\rm{FF}}}=\sum _{\alpha }{V}_{\alpha }({\xi }_{\alpha }(x)).$$

(5)

The individual potential energy terms are in general simple functions of the internal coordinates. Some examples of most typical forms are the quadratic, Morse or torsional potentials, respectively given by

$${V}_{{\rm{Quadratic}}}(q)=\frac{1}{2}k{(q-{q}_{0})}^{2},$$

(6)

$${V}_{{\rm{Morse}}}(d)={D}_{0}{(1-\exp (-\alpha (d-{d}_{0})))}^{2},$$

(7)

and

$${V}_{{\rm{Torsion}}}(\varphi )=\frac{1}{2}{k}_{\varphi }(1+\,\cos (n\varphi -{\varphi }_{0}))$$

(8)

where the corresponding parameters can be taken from standard force field libraries.
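The second derivatives of Eqs. 6 to 8 with respect to the internal coordinate have simple closed forms. The following sketch writes them out; the function names and the parameter values in the example are hypothetical and are not taken from any force field library.

```python
import math

def quadratic(q, k, q0):
    return 0.5 * k * (q - q0) ** 2

def quadratic_curvature(q, k, q0):
    # Constant curvature: d^2 V / dq^2 = k.
    return k

def morse(d, D0, alpha, d0):
    return D0 * (1.0 - math.exp(-alpha * (d - d0))) ** 2

def morse_curvature(d, D0, alpha, d0):
    # d^2 V / dd^2 = 2 D0 alpha^2 e (2e - 1), with e = exp(-alpha (d - d0)).
    e = math.exp(-alpha * (d - d0))
    return 2.0 * D0 * alpha ** 2 * e * (2.0 * e - 1.0)

def torsion(phi, k_phi, n, phi0):
    return 0.5 * k_phi * (1.0 + math.cos(n * phi - phi0))

def torsion_curvature(phi, k_phi, n, phi0):
    # d^2 V / dphi^2 = -0.5 k_phi n^2 cos(n phi - phi0).
    return -0.5 * k_phi * n ** 2 * math.cos(n * phi - phi0)

# Hypothetical Morse parameters: the curvature is positive at the
# equilibrium distance but changes sign once the bond is stretched
# beyond d0 + ln(2) / alpha.
curv_equilibrium = morse_curvature(1.0, D0=4.0, alpha=2.0, d0=1.0)
curv_stretched = morse_curvature(2.0, D0=4.0, alpha=2.0, d0=1.0)
```

The sign change of the Morse and torsional curvatures is the reason a Hessian assembled from such terms can be indefinite.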

Due to its simple functional form ${H}_{{\rm{FF}}}:={\nabla }^{2}{V}_{{\rm{FF}}}$ is cheap to compute. If only short-range bonds are taken into account then it is also sparse, hence it is cheap to store and invert. Moreover, we expect that V_FF gives a good qualitative approximation to the quantum potential energy landscape, hence H_FF is a good qualitative approximation to the quantum Hessian ∇²f. Therefore, H_FF satisfies all the conditions required for a preconditioner except that it will in general be indefinite. A conceptually straightforward but computationally expensive approach to overcome this limitation is to enforce positivity by replacing all eigenvalues of H_FF with their absolute values. Instead, we propose to analytically modify the local Hessian contributions to ensure overall positivity, resulting in a further reduction in computational cost.

The Hessian contribution from V_α is given by

$${H}_{\alpha }=\frac{{\partial }^{2}{V}_{\alpha }}{\partial {x}^{2}}=\mathop{\underbrace{\frac{\partial {\xi }_{\alpha }}{\partial x}\otimes \frac{\partial {\xi }_{\alpha }}{\partial x}\frac{{\partial }^{2}{V}_{\alpha }}{\partial {\xi }_{\alpha }^{2}}}}\limits_{{H}_{\alpha }^{(1)}}+\mathop{\underbrace{\frac{{\partial }^{2}{\xi }_{\alpha }}{\partial {x}^{2}}\frac{\partial {V}_{\alpha }}{\partial {\xi }_{\alpha }}}}\limits_{{H}_{\alpha }^{(2)}},$$

(9)

where we have decomposed H_α into two terms, ${H}_{\alpha }^{\mathrm{(1)}}$ and ${H}_{\alpha }^{\mathrm{(2)}}$. If V_α is quadratic then $\frac{{\partial }^{2}{V}_{{\rm{Quadratic}}}}{\partial {\xi }^{2}}=k > 0$, hence ${H}_{\alpha }^{\mathrm{(1)}}$ is positive semi-definite, while the sign of ${H}_{\alpha }^{\mathrm{(2)}}$ is ambiguous. Note however, that if the system is at equilibrium of V_α with respect to ξ_α, i.e., if $\frac{\partial {V}_{\alpha }}{\partial {\xi }_{\alpha }}=0$, then ${H}_{\alpha }^{\mathrm{(2)}}=0$. This in fact is the case in the Lindh approach¹². Instead of adjusting V_α at every step such that the geometry corresponds to its equilibrium, here we simply drop ${H}_{\alpha }^{\mathrm{(2)}}$ and only use ${H}_{\alpha }^{\mathrm{(1)}}$ to construct the preconditioner, thus ensuring that it always stays positive semi-definite (strict positive definiteness is then obtained by the regularisation described under Implementation details).

For non-quadratic contributions we expect that $\frac{{\partial }^{2}{V}_{\alpha }}{\partial {\xi }_{\alpha }^{2}} > 0$ for most but not all bonds, hence we enforce positivity by replacing it with its absolute value. This leads to the following general preconditioner for molecular systems:

$${P}_{{\rm{FF}}}=\sum _{\alpha }{\tilde{H}}_{\alpha }^{(1)}=\sum _{\alpha }\frac{\partial {\xi }_{\alpha }}{\partial x}\otimes \frac{\partial {\xi }_{\alpha }}{\partial x}\left|\frac{{\partial }^{2}{V}_{\alpha }}{\partial {\xi }_{\alpha }^{2}}\right|$$

(10)

It is worth noting that $\frac{\partial {\xi }_{\alpha }}{\partial x}$ is already computed by molecular mechanics force field based MD programs since it is required for the assembly of ∇V_α. Thus, the only new quantity that must be computed is $\frac{{\partial }^{2}{V}_{\alpha }}{\partial {\xi }_{\alpha }^{2}}$, which represents a negligible additional computational cost.
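To make Eq. 10 concrete, the sketch below assembles P_FF for bond-stretch terms only, using a Morse curvature. It is an illustration under stated assumptions, not the ASE implementation: the function names, the Morse parameters and the three-atom geometry are hypothetical, and angle and dihedral terms are omitted.

```python
import numpy as np

def bond_gradient(pos, i, j):
    """Derivative of the bond length r_ij with respect to all 3N coordinates."""
    u = (pos[i] - pos[j]) / np.linalg.norm(pos[i] - pos[j])
    grad = np.zeros(pos.size)
    grad[3 * i:3 * i + 3] = u
    grad[3 * j:3 * j + 3] = -u
    return grad

def morse_curvature(d, D0=4.0, alpha=2.0, d0=1.0):
    """Second derivative of the Morse potential (hypothetical parameters)."""
    e = np.exp(-alpha * (d - d0))
    return 2.0 * D0 * alpha**2 * e * (2.0 * e - 1.0)

def ff_preconditioner(pos, bonds, curvature):
    """P_FF = sum over bonds of (d xi/dx) (d xi/dx)^T |V''(xi)|, as in Eq. 10."""
    P = np.zeros((pos.size, pos.size))
    for i, j in bonds:
        r = np.linalg.norm(pos[i] - pos[j])
        g = bond_gradient(pos, i, j)
        P += abs(curvature(r)) * np.outer(g, g)   # rank-one, positive semi-definite
    return P

# Hypothetical triatomic with two strongly stretched bonds, for which the
# raw Morse curvature is negative.
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.3, 1.6, 0.0]])
bonds = [(0, 1), (1, 2), (0, 2)]
P_ff = ff_preconditioner(pos, bonds, morse_curvature)
eigenvalues = np.linalg.eigvalsh(P_ff)
```

Each term is a rank-one outer product scaled by a non-negative number, so the sum cannot acquire negative eigenvalues even where the underlying curvature is negative.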

We note that the final functional form (10) of our preconditioner is very similar to that of Lindh et al.¹²; however, we arrived at it from a fundamentally different perspective, which has several advantages. The method of Lindh et al. was introduced for quadratic terms only, hence the force constants have to be recomputed after each optimisation step (as the equilibrium bond lengths, angles and dihedrals are set to the actual ones to obtain a positive semidefinite matrix). Our method can be considered as a generalisation of their approach, allowing arbitrary functional forms of internal coordinate dependent terms. In particular this means that the FF parameters need not be adjusted to achieve a positive semidefinite matrix, and incorporating different parameter sets is straightforward. Moreover, as will be discussed below, our perspective makes it easy to extend the preconditioner construction to new situations. Finally, we mention that since the FF-based preconditioner is a sparse matrix, both its construction and inversion are computationally inexpensive, which makes it possible to use it even for large systems.

Combining FF and Exp preconditioners

For molecular crystals, intermolecular interactions also play an important role. This consideration led us to combine the molecular mechanics based FF preconditioner (describing bonded interactions) with a modified Exp preconditioner (4) (tuned to describe only non-bonded interactions, i.e. interactions already treated by the FF preconditioner are omitted):

$${P}_{\mathrm{Exp}+{\rm{FF}}}={P}_{\mathrm{Exp}}^{{\rm{nb}}}+{P}_{{\rm{FF}}}$$

(11)

P_FF is fully specified by the chosen force field V_FF. To specify ${P}_{\mathrm{Exp}}^{{\rm{nb}}}$ we first manually choose the parameters A and r_cut in (3) to account for the interaction between molecules³. The remaining parameters are computed in a similar automatic manner as described in ref.³, and we keep only those matrix elements of P_Exp for which the corresponding matrix element in P_FF is zero. We note that correct scaling between P_FF and P_Exp is implicitly ensured via the μ parameter in Eq. 3.

Implementation details

We tested the FF and FF + Exp preconditioners on a range of optimisation and saddle point search tasks. For geometry optimisations the form of the preconditioned LBFGS method was identical to the one we describe in ref.³: at each iterate the search direction is given by

$$\begin{array}{c}{\bf{input}}\,q=\nabla f({x}_{k})\\ {s}_{k}={x}_{k}-{x}_{k-1}\\ {y}_{k}=\nabla f({x}_{k})-\nabla f({x}_{k-1})\\ {\rho }_{k}=1/{y}_{k}^{T}{s}_{k}\\ {\rm{for}}\,i=k,\ldots ,k-m\\ \,\,{\alpha }_{i}={\rho }_{i}{s}_{i}^{T}q\\ \,\,q=q-{\alpha }_{i}{y}_{i}\\ \boxed{z={P}_{k}^{-1}q}\\ {\rm{for}}\,i=k-m,\ldots ,k\\ \,\,{\beta }_{i}={\rho }_{i}{y}_{i}^{T}z\\ \,\,z=z+({\alpha }_{i}-{\beta }_{i}){s}_{i}\\ {\bf{output}}\,{p}_{k}=z\end{array}$$

(12)

with initial search direction $z={P}_{0}^{-1}\nabla f({x}_{0})$. The box in Eq. 12 indicates the single modification of the standard algorithm to obtain preconditioning. Variable m is the maximum number of metric corrections. The step length selection is obtained by a backtracking line-search enforcing only the Armijo condition³. We note that although there exist several techniques (discussed in the Introduction) that use a quasi-Newton method in combination with some approximated Hessian information, their applicability is rather limited for large systems due to the excessive computational cost. Therefore our baseline was the unpreconditioned LBFGS method and where it was necessary we also compared our method to other preconditioning based techniques.
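The two-loop recursion of Eq. 12 can be sketched as follows. This is an illustrative transcription rather than the ASE code: the function name `lbfgs_direction`, the callable `apply_Pinv` and the example vectors are assumptions, and the sign convention of the resulting step is left to the caller.

```python
import numpy as np

def lbfgs_direction(grad, s_hist, y_hist, apply_Pinv):
    """Preconditioned L-BFGS two-loop recursion (Eq. 12).

    s_hist, y_hist: stored displacement and gradient-difference vectors,
                    oldest first (at most m pairs).
    apply_Pinv:     callable returning P_k^{-1} q, i.e. the boxed step.
    Returns z, an approximation to (inverse Hessian) times grad.
    """
    q = np.array(grad, dtype=float)
    rhos = [1.0 / (y @ s) for s, y in zip(s_hist, y_hist)]
    alphas = []
    # First loop: newest pair to oldest pair.
    for s, y, rho in zip(reversed(s_hist), reversed(y_hist), reversed(rhos)):
        a = rho * (s @ q)
        alphas.append(a)
        q = q - a * y
    z = apply_Pinv(q)                     # preconditioner replaces the scaled identity
    # Second loop: oldest pair to newest pair.
    for s, y, rho, a in zip(s_hist, y_hist, rhos, reversed(alphas)):
        b = rho * (y @ z)
        z = z + (a - b) * s
    return z

# Hypothetical example data.
P = np.diag([2.0, 4.0, 8.0])
apply_Pinv = lambda q: np.linalg.solve(P, q)
g = np.array([1.0, 2.0, 3.0])
s = np.array([0.1, -0.2, 0.05])
y = np.array([0.3, -0.1, 0.2])

z_first = lbfgs_direction(g, [], [], apply_Pinv)     # no history: z = P^{-1} g
z_secant = lbfgs_direction(y, [s], [y], apply_Pinv)  # secant condition: maps y to s
```

With an empty history the routine returns P⁻¹∇f, matching the initial search direction stated above; with one stored pair it reproduces the secant condition, which is a convenient correctness check.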

For saddle point search tasks we slightly modified the superlinearly converging dimer method¹⁸. Dimer methods use two copies of the system with coordinates x⁽¹⁾ and x⁽²⁾ and a fixed separation length $l=\parallel {x}^{\mathrm{(1)}}-{x}^{\mathrm{(2)}}\parallel $ between them. The algorithm is usually split into two alternating steps¹⁹: (1) in the rotation step we fix the midpoint and rotate the endpoints to approximately align them with the lowest (negative) eigenmode of the Hessian (v_k); (2) in the translation step we shift the dimer to maximise the energy along the dimer direction while minimising energy in all directions perpendicular to it (p_k).

In principle, both the rotation and translation steps can be preconditioned, however, we found that in many systems preconditioning the rotation step results in a smaller spectral gap and hence slower convergence. Therefore, we chose to precondition only the translation step. Our implementation employs the conjugate gradient method using the Polak-Ribière formula:

$$\begin{array}{c}{\bf{input}}\,{q}_{k}=-\,(I-2{v}_{k}\otimes {v}_{k})\nabla f({x}_{k})\\ \boxed{\beta =\frac{{q}_{k}^{T}{P}_{k}^{-1}({q}_{k}-{q}_{k-1})}{{q}_{k-1}^{T}{P}_{k}^{-1}{q}_{k-1}}}\\ {s}_{k}={q}_{k}+\beta {s}_{k-1}\\ {\bf{output}}\,\boxed{{p}_{k}=\frac{{P}_{k}^{-1}{s}_{k}}{{s}_{k}^{T}{P}_{k}^{-1}{s}_{k}}}\end{array}$$

(13)

and using the initial iterate s₀ = −(I − 2v₀ ⊗ v₀)∇f(x₀). Again, boxed steps are the modifications needed to the original method to achieve preconditioning. For computing the step length we used the trust region radius approach suggested by Kästner and Sherwood¹⁸, with acceptance criterion based on the projection of the gradient of the actual step.
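A literal transcription of the translation step of Eq. 13 might look as follows. The function names are hypothetical, and dense solves are used only for brevity (an assumption; a sparse factorisation of the preconditioner would be the natural choice in practice).

```python
import numpy as np

def reflected_force(grad, v):
    """q = -(I - 2 v v^T) grad: the force with its component along v reversed."""
    v = v / np.linalg.norm(v)
    return -(grad - 2.0 * v * (v @ grad))

def translation_direction(q, q_prev, s_prev, P):
    """Preconditioned Polak-Ribiere update following Eq. 13.

    Returns the conjugate direction s_k and the output direction
    P^{-1} s_k / (s_k^T P^{-1} s_k).
    """
    beta = (q @ np.linalg.solve(P, q - q_prev)) / (q_prev @ np.linalg.solve(P, q_prev))
    s = q + beta * s_prev
    Pinv_s = np.linalg.solve(P, s)
    p = Pinv_s / (s @ Pinv_s)
    return s, p

# Hypothetical example: gradient, dimer orientation and preconditioner.
grad = np.array([1.0, 2.0, 3.0])
v = np.array([1.0, 0.0, 0.0])
P = 2.0 * np.eye(3)

q0 = reflected_force(grad, v)       # initial iterate s_0 = q_0
s1, p1 = translation_direction(q0, q0, q0, P)
```

Reversing the force component along v is what drives the dimer uphill along the lowest mode while it relaxes in the perpendicular directions.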

For molecules in gas phase V_FF is invariant under rotations and translations, hence P_FF will be at least six fold degenerate with zero eigenvalues for any configuration of the molecule corresponding to the three translational and three rotational degrees of freedom. While these degrees of freedom could in principle be fixed we found that a straightforward solution is to simply regularise the preconditioner by replacing it with $P\to P+cI$. We found that good generic values for c are 0.1 and 1.0 eV/Å² for geometry optimisations and saddle point search, respectively.
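The six zero modes and the effect of the shift P → P + cI can be checked numerically on a small example. In this sketch the three-atom geometry and the uniform bond curvature of 10 eV/Å² are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical non-linear triatomic with all three pairs bonded.
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.3, 0.9, 0.0]])
bonds = [(0, 1), (1, 2), (0, 2)]

# Bond-stretch-only preconditioner: sum of rank-one terms k * g g^T.
P = np.zeros((9, 9))
for i, j in bonds:
    u = (pos[i] - pos[j]) / np.linalg.norm(pos[i] - pos[j])
    g = np.zeros(9)
    g[3 * i:3 * i + 3], g[3 * j:3 * j + 3] = u, -u
    P += 10.0 * np.outer(g, g)

# Translations and rotations leave all bond lengths unchanged,
# so they appear as zero eigenvalues of P.
n_zero = int(np.sum(np.linalg.eigvalsh(P) < 1e-10))

# Regularisation P -> P + c I shifts every eigenvalue up by c.
c = 0.1
P_reg = P + c * np.eye(9)
lam_min = np.linalg.eigvalsh(P_reg).min()
```

After the shift the smallest eigenvalue equals c, so the linear solve with P is well defined without explicitly projecting out the rigid-body motions.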

Model systems and potentials

Organic molecules in gas phase

Three potential energy surfaces were investigated for geometry optimisations: semiempirical PM6²⁰, DFT^21,22 and MP2²³. For the DFT potential we used the BLYP exchange-correlation functional^24,25 with the DZVP-MOLOPT basis set²⁶ and a plane wave cutoff of 480 Ry within the Gaussian and plane waves (GPW) approach²⁷ and Goedecker-Teter-Hutter (GTH) pseudopotentials²⁸. For the MP2 calculations the 6–31G** basis set was applied.

In the case of geometry optimisations on the PM6 surface we compared three different force fields from which we constructed the preconditioners: the force field of Lindh et al. (LFF)¹², the universal force field (UFF)²⁹ and the generalised Amber force field (GAFF)³⁰.

Initial configurations were taken from 0.5 ns long molecular dynamics (MD) simulations at 300 K.

For transition state search we selected 7 examples (with their initial configurations) from the benchmark of Baker and Chan³¹ and three additional systems whose initial configurations were taken from 0.5 ns long MD simulations at 300 K. The computations were performed on the semiempirical PM6 surface²⁰.

Molecular crystals

We compared four different optimisation schemes (unpreconditioned, only FF-based, only Exp-based and Exp + FF-based preconditioners) on five organic molecular crystals (systems XVIII to XXII) whose initial geometries were taken from the Organic Crystal Structure Prediction competition of the Cambridge Crystallographic Data Centre^32,33. We used a DFT potential energy surface with the PBE exchange-correlation functional³⁴ with a plane wave basis set using a cutoff energy of 800 eV and ultrasoft pseudopotentials³⁵.

Material systems

We also tested how the force field based preconditioner works on two material systems compared to the exponential preconditioner. We examined the unpreconditioned and preconditioned geometry optimisation for bulk silicon and vacancy with varying system size. The potential energy surface was the screened Tersoff potential^36,37 and we used the universal force field (UFF)²⁹ for building the FF-based preconditioner matrix. The bulk systems were built by using the bulk function of ASE³⁸ with default lattice constants while for the corresponding vacancy systems a silicon atom was removed. Initial configurations were obtained by applying a random displacement on the atomic positions using normal distribution with standard deviation of 0.05 Å.

Next we considered bulk tungsten and a single interstitial site in bulk tungsten, also using different system sizes. The potential energy surface was a machine learning based Gaussian Approximation Potential (GAP) reproducing the quality of DFT (with PBE functional)³⁹. The preconditioner in this case was based on a simple Embedded Atom Method (EAM) potential^40,41. Initial configurations were obtained in a similar way as the bulk silicon ones but this time a standard deviation of 0.15 Å was applied.

Software

For the semiempirical PM6 method AmberTools16⁴² was used. The MP2 potential surface was generated using MOLPRO^13,43. The screened Tersoff potential was provided by Atomistica⁴⁴. The DFT potentials with the BLYP and PBE functionals were provided by CP2K⁴⁵ and CASTEP⁴⁶, respectively, using the QUIP interface⁴⁷. The GAP model was called via QUIP⁴⁷.

In all cases the geometry optimisation was performed within ASE³⁸. The other software packages were only used to compute the energy and gradient of the configurations. A Python implementation of the FF and Exp + FF preconditioners with several potential forms of nonbonded terms is available within ASE^38,48 (https://gitlab.com/molet/ase).

Initial structures of molecules and molecular crystals as well as the generating Python codes for the starting geometry of material systems are provided as Supplementary Data.

Accession codes

Python implementation of the preconditioners can be found here: [https://gitlab.com/molet/ase].

Results

Organic molecules in gas phase

We investigated geometry optimisations of several organic molecules with (FF) and without (ID) our new FF-based preconditioner on three potential energy surfaces (PM6, DFT and MP2), using the GAFF force field for building the preconditioner. The results are shown in Table 1 and Fig. 1. The convergence criterion of the geometry optimisations was $\parallel \nabla E{\parallel }_{\infty }={10}^{-3}$ eV Å⁻¹ for DFT and MP2 surfaces while for the relatively inexpensive PM6 potential we applied a tighter threshold of $\parallel \nabla E{\parallel }_{\infty }={10}^{-4}$ eV Å⁻¹. Depending on the system and underlying potential we observe a 4–10 fold decrease in the required number of optimisation steps using our preconditioner.

Table 1 Total number of function/gradient calls of geometry optimisation for organic molecules in gas phase using conventional (ID) and FF-based preconditioned (FF) LBFGS method on three different quantum chemistry surfaces. Convergence threshold was $\parallel \nabla E{\parallel }_{\infty }={10}^{-4}$ eV Å⁻¹ for PM6 and $\parallel \nabla E{\parallel }_{\infty }={10}^{-3}$ eV Å⁻¹ for DFT and MP2 potentials, respectively.

For PM6 only, to highlight the correlation between performance gain and ill-conditioning, we also computed the ratio between the condition numbers for the unpreconditioned and preconditioned Hessians, κ_I/κ_P, at the minima, where

$${\kappa }_{P}=\frac{{\lambda }_{P}^{{\rm{max}}}}{{\lambda }_{P}^{{\rm{min}}}}=\frac{\mathop{\max }\limits_{{u}^{T}Pu=1}{u}^{T}Hu}{\mathop{\min }\limits_{{u}^{T}Pu=1}{u}^{T}\tilde{H}u}$$

(14)

where $\tilde{H}$ is a modified Hessian from which the zero eigenvalues due to symmetries have been removed. In Fig. 2 we observe that the computational saving is more strongly correlated with the condition number ratio than with the system size.
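The constrained extrema in Eq. 14 are the extreme eigenvalues of the generalised problem Hu = λPu, which suggests the following sketch for computing κ_P. The function name, the relative tolerance used to discard zero modes and the diagonal test matrices are assumptions; the Hessian is assumed to be positive semi-definite, as at a minimum.

```python
import numpy as np

def preconditioned_condition_number(H, P, zero_tol=1e-8):
    """kappa_P = lambda_max / lambda_min of H u = lambda P u (Eq. 14).

    Eigenvalues that vanish relative to the largest one (symmetry modes)
    are discarded before the ratio is taken.
    """
    C = np.linalg.cholesky(P)             # P = C C^T, requires P positive definite
    C_inv = np.linalg.inv(C)
    M = C_inv @ H @ C_inv.T               # same spectrum as the generalised problem
    lam = np.linalg.eigvalsh(0.5 * (M + M.T))
    lam = lam[np.abs(lam) > zero_tol * np.abs(lam).max()]
    return lam.max() / lam.min()

# Hypothetical toy Hessian and an approximate preconditioner.
H = np.diag([1.0, 10.0, 100.0])
P = np.diag([1.0, 10.0, 50.0])

kappa_I = preconditioned_condition_number(H, np.eye(3))
kappa_P = preconditioned_condition_number(H, P)
ratio = kappa_I / kappa_P
```

In this toy case the preconditioner reduces the condition number from 100 to 2, so the ratio κ_I/κ_P is 50.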

For the three smallest systems and the PM6 surface only we also compared our FF-based preconditioner against using the exact Hessian of the model potential (Hessian/GAFF) and a finite-difference Hessian of PM6 (Hessian/PM6) as preconditioners for LBFGS method. Zero eigenvalues of the Hessian matrices were shifted to a moderate positive number to avoid numerical instabilities. The results are collected in Table 2 and Fig. 3. Our FF-based preconditioner clearly outperforms both of these variants.

Table 2 Total number of function/gradient calls for the geometry optimisation of small organic molecules on the PM6 surface using different optimisation algorithms: unpreconditioned LBFGS (ID), FF-based preconditioned LBFGS (FF/GAFF), FF-Hessian based preconditioned LBFGS (Hessian/GAFF) and PM6-Hessian based preconditioned LBFGS (Hessian/PM6). Convergence threshold was $\parallel \nabla E{\parallel }_{\infty }={10}^{-4}$ eV Å⁻¹ for all cases.

We also examined the effect of using different force fields from which to construct the FF-based preconditioner. Beside the GAFF force field, we investigated two general force fields: the universal force field (UFF)²⁹ and a general force field introduced by Lindh et al. (LFF)¹². The results shown in Table 3 and Fig. 4 indicate that there is no significant difference between the three force fields. We only mention that the LFF force-field includes all possible 2, 3 and 4–body interactions¹², resulting in a dense preconditioner matrix, which for larger systems and an efficient potential energy surface could become a performance bottleneck. By contrast, the preconditioners based on GAFF or UFF are sparse, hence their cost scales linearly with system size.

Table 3 Comparison of the effect of different force field based preconditioners for the geometry optimisation of organic molecules in gas phase (total number function/gradient calls). Convergence threshold was $\parallel \nabla E{\parallel }_{\infty }={10}^{-4}$ eV Å⁻¹ for all cases.

Finally, we tested how the different preconditioners perform when applied to transition state search, comparing again against ID (no preconditioning) and against the Exp preconditioner (with default parameter set and μ = 1; unlike LBFGS, CG is invariant under rescaling of μ). The results are collected in Table 4 and Fig. 5. Overall the Exp preconditioner does not improve significantly over ID. We experimented with different parameters, e.g., adding connectivity information up to the 4-body interaction, but observed no improvements. Both FF-based preconditioners are again comparable and yield a much improved convergence even for these relatively small systems. For instance, the gain is already 2–3-fold for dimethyl-phosphate and tyrosine hydrolyses.

Table 4 Number of steps of translations (and total number of function and gradient calls in parentheses) of saddle searches using different preconditioned variants of superlinearly converging dimer method. Convergence threshold was $\parallel \nabla E{\parallel }_{\infty }={10}^{-4}$ eV Å⁻¹ for all cases.

Molecular crystals

We compared geometry optimisation with fixed unit cells using LBFGS, preconditioned with ID (unpreconditioned), Exp³ and FF (GAFF force field). For Exp the nearest neighbour distance (r_nn) in Eq. 3 was calculated from the initial structure, and we specified r_cut = 2r_nn and A = 3.0. In addition, we also employed the Exp + FF preconditioner as defined in (11). The results for different molecular crystals are shown in Table 5 and Fig. 6. As expected, Exp reduces the number of optimisation steps compared to ID, although the improvement is significantly smaller for molecular crystals than for material systems³. Interestingly, FF alone already leads to a significant speed-up over both ID and Exp even though the inter-molecular interaction is not captured well. This indicates that for molecular crystal optimisations preconditioning based on specific intramolecular information is crucial. The most successful method was the Exp + FF combination, which leads to a 3–7 fold speed-up even for these relatively small test systems.

Table 5 Total number of function/gradient calls of geometry optimisation of molecular crystals using different preconditioning strategies. Convergence criterion was $\parallel \nabla E{\parallel }_{\infty }={10}^{-3}$ eV Å⁻¹.

Material systems

Finally, it is also interesting to investigate how an FF-based preconditioner compares against the Exp preconditioner for material systems, where Exp performs very well³. We tested geometry optimisation of bulk silicon and a vacancy in bulk silicon with perturbed initial conditions, with increasing system size. The screened Tersoff potential was used as the potential energy, while the FF preconditioner was constructed from UFF. The results are shown in Table 6. For both systems, the FF-based preconditioner yields a clear further speed-up over Exp.

Table 6 Total number of function/gradient calls for geometry optimisation of bulk silicon and a bulk silicon vacancy using different preconditioning strategies. Convergence criterion was $\parallel \nabla E{\parallel }_{\infty }={10}^{-3}$ eV Å⁻¹.

The other test systems were bulk tungsten and a single interstitial site in bulk tungsten, with perturbed initial conditions and different system sizes. The potential energy surface was provided by a GAP model that was trained on DFT data. The preconditioner was based on a simple EAM potential:

$$V_{\mathrm{EAM}} = \sum_{i} \left[ \frac{1}{2} \sum_{j \ne i} \Phi(r_{ij}) + F\left( \sum_{j \ne i} \rho(r_{ij}) \right) \right] \tag{15}$$

where Φ(r_ij) is a pair potential, F is the embedding function and ρ(r_ij) is the electron charge density contribution from atom j to atom i. Based on Eq. 10, our FF-based preconditioner was defined as $P_{\mathrm{FF}} = \sum_{\alpha} \frac{\partial r_{\alpha}}{\partial x} \otimes \frac{\partial r_{\alpha}}{\partial x} \left| \frac{\partial^{2} V_{\mathrm{EAM}}}{\partial r_{\alpha}^{2}} \right|$, where α runs over all ij pairs. In the actual implementation the functions Φ, F and ρ are represented by splines, so computing the corresponding curvature is fairly straightforward.
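The structure of this sum of weighted outer products can be illustrated with a short sketch. The function below assembles a dense matrix from a list of atom pairs and a user-supplied curvature function. The names `pair_preconditioner` and `lj_curvature` are hypothetical, and the Lennard-Jones second derivative merely stands in for the spline-based EAM curvature, which is not reproduced here.

```python
import numpy as np

def pair_preconditioner(positions, pairs, curvature):
    """Sum over pairs of (dr/dx) outer (dr/dx), weighted by |curvature(r)| (sketch)."""
    pos = np.asarray(positions, dtype=float)
    n = len(pos)
    P = np.zeros((3 * n, 3 * n))
    for i, j in pairs:
        d = pos[i] - pos[j]
        r = np.linalg.norm(d)
        # Gradient of the distance r_ij with respect to all Cartesian coordinates.
        g = np.zeros(3 * n)
        g[3 * i:3 * i + 3] = d / r
        g[3 * j:3 * j + 3] = -d / r
        # The absolute value keeps every term positive semidefinite.
        P += abs(curvature(r)) * np.outer(g, g)
    return P

def lj_curvature(r):
    """Hypothetical stand-in: second derivative of 4 (r^-12 - r^-6)."""
    return 4.0 * (156.0 * r**-14 - 42.0 * r**-8)

# Hypothetical usage: three atoms, all pairs included.
positions = [[0.0, 0.0, 0.0], [1.1, 0.0, 0.0], [0.0, 1.3, 0.0]]
P = pair_preconditioner(positions, [(0, 1), (0, 2), (1, 2)], lj_curvature)
```

Only first derivatives of the distances appear, so the matrix is positive semidefinite by construction even where the model curvature is negative. A dense array is used here for clarity; for large systems a sparse representation would be the natural choice.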

The results are presented in Table 7. For both the bulk and interstitial systems the number of function/gradient calls of the unpreconditioned optimisation increases with system size while the preconditioned optimisations require almost the same number of optimisation steps to achieve the same convergence criterion.
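The size dependence described above can be reproduced qualitatively on a toy problem. The sketch below is an illustration under simplifying assumptions, not the test from this work: it replaces the GAP energy with a quadratic model of a fixed-end harmonic chain with random spring constants, uses conjugate gradients instead of LBFGS, and takes the uniform-spring chain as the preconditioner. All function names are hypothetical.

```python
import numpy as np

def chain_hessian(k):
    """Hessian of a fixed-end harmonic chain; k holds the n + 1 spring constants."""
    n = len(k) - 1
    H = np.zeros((n, n))
    for b, kb in enumerate(k):     # spring b joins sites b - 1 and b (walls at -1 and n)
        d = np.zeros(n)
        if b >= 1:
            d[b - 1] = -1.0
        if b < n:
            d[b] = 1.0
        H += kb * np.outer(d, d)
    return H

def pcg_iterations(H, b, P=None, tol=1e-8, max_iter=10000):
    """Count (preconditioned) CG iterations until the max-norm residual drops below tol."""
    x = np.zeros_like(b)
    r = b - H @ x
    z = r if P is None else np.linalg.solve(P, r)
    p = z.copy()
    rz = r @ z
    for it in range(1, max_iter + 1):
        Hp = H @ p
        alpha = rz / (p @ Hp)
        x += alpha * p
        r -= alpha * Hp
        if np.max(np.abs(r)) < tol:
            return it
        z = r if P is None else np.linalg.solve(P, r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return max_iter

def compare(n, seed=0):
    """Return (unpreconditioned, preconditioned) iteration counts for n sites."""
    rng = np.random.default_rng(seed)
    H = chain_hessian(rng.uniform(1.0, 2.0, size=n + 1))
    P = chain_hessian(np.ones(n + 1))   # model Hessian with uniform springs
    b = rng.standard_normal(n)
    return pcg_iterations(H, b), pcg_iterations(H, b, P)

counts = {n: compare(n) for n in (32, 128)}
```

Because the model matrix bounds the true Hessian from below and within a factor of two from above, the preconditioned condition number does not grow with the chain length, which is the mechanism behind a nearly size-independent iteration count.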

Table 7 Total number of function/gradient calls for geometry optimisation of bulk tungsten and an interstitial defect in bulk tungsten using different preconditioning strategies. The convergence criterion was $\|\nabla E\|_{\infty} = 1.0 \times 10^{-3}$ eV Å⁻¹.

Conclusion

We introduced a flexible preconditioner for molecular simulation based on empirical potentials that are widely implemented in popular molecular mechanical program packages. Our method, which can be considered a generalisation of Lindh et al.¹², decomposes the analytic Hessian of the empirical potential and modifies individual components to ensure their positivity. An advantage of this procedure is that it avoids the computation of second derivatives of the collective variables (or internal coordinates). The preconditioner yields significant improvements (at least a 2-fold, and typically a 5-fold, decrease in function/gradient calls compared to unpreconditioned techniques), demonstrated on a wide range of systems including molecules in gas phase, molecular crystals and materials, using different target potential energy surfaces (empirical, semiempirical and ab initio) as well as different optimisation tasks (geometry optimisations and saddle point searches).

References

1. Liu, D. C. & Nocedal, J. On the limited memory BFGS method for large scale optimization. Mathematical Programming 45, 503–528 (1989).
2. Bakken, V. & Helgaker, T. The efficient optimization of molecular geometries using redundant internal coordinates. The Journal of Chemical Physics 117, 9160–9174 (2002).
3. Packwood, D. et al. A universal preconditioner for simulating condensed phase materials. The Journal of Chemical Physics 144, 164109 (2016).
4. Fogarasi, G., Zhou, X., Taylor, P. W. & Pulay, P. The calculation of ab initio molecular geometries: efficient optimization by natural internal coordinates and empirical correction by offset forces. Journal of the American Chemical Society 114, 8191–8201 (1992).
5. Pulay, P. & Fogarasi, G. Geometry optimization in redundant internal coordinates. The Journal of Chemical Physics 96, 2856–2860 (1992).
6. Baker, J. Techniques for geometry optimization: A comparison of Cartesian and natural internal coordinates. Journal of Computational Chemistry 14, 1085–1100 (1993).
7. Baker, J., Kessi, A. & Delley, B. The generation and use of delocalized internal coordinates in geometry optimization. The Journal of Chemical Physics 105, 192–212, https://doi.org/10.1063/1.471864 (1996).
8. Peng, C., Ayala, P. Y., Schlegel, H. B. & Frisch, M. J. Using redundant internal coordinates to optimize equilibrium geometries and transition states. Journal of Computational Chemistry 17, 49–56 (1996).
9. Eckert, F., Pulay, P. & Werner, H.-J. Ab initio geometry optimization for large molecules. Journal of Computational Chemistry 18, 1473–1483 (1997).
10. Császár, P. & Pulay, P. Geometry optimization by direct inversion in the iterative subspace. Journal of Molecular Structure 114, 31–34 (1984).
11. Vogel, S., Fischer, T. H., Hutter, J. & Lüthi, H. P. Third-order methods for molecular geometry optimizations. International Journal of Quantum Chemistry 45, 679–688 (1993).
12. Lindh, R., Bernhardsson, A., Karlström, G. & Malmqvist, P.-A. On the use of a Hessian model function in molecular geometry optimizations. Chemical Physics Letters 241, 423–428 (1995).
13. Werner, H.-J., Knowles, P. J., Knizia, G., Manby, F. R. & Schütz, M. Molpro: a general purpose quantum chemistry program package. WIREs Comput Mol Sci 2, 242–253 (2012).
14. Neese, F. The ORCA program system. Wiley Interdisciplinary Reviews: Computational Molecular Science 2, 73–78 (2012).
15. Aidas, K. et al. The Dalton quantum chemistry program system. Wiley Interdisciplinary Reviews: Computational Molecular Science 4, 269–284 (2014).
16. Dovesi, R. et al. Quantum-mechanical condensed matter simulations with CRYSTAL. Wiley Interdisciplinary Reviews: Computational Molecular Science 0, e1360 (2018).
17. Bitzek, E., Koskinen, P., Gähler, F., Moseler, M. & Gumbsch, P. Structural relaxation made simple. Phys. Rev. Lett. 97, 170201 (2006).
18. Kästner, J. & Sherwood, P. Superlinearly converging dimer method for transition state search. The Journal of Chemical Physics 128, 014106 (2008).
19. Henkelman, G. & Jónsson, H. A dimer method for finding saddle points on high dimensional potential surfaces using only first derivatives. The Journal of Chemical Physics 111, 7010–7022 (1999).
20. Stewart, J. J. P. Optimization of parameters for semiempirical methods V: Modification of NDDO approximations and application to 70 elements. Journal of Molecular Modeling 13, 1173–1213 (2007).
21. Hohenberg, P. & Kohn, W. Inhomogeneous electron gas. Phys. Rev. 136, B864–B871 (1964).
22. Kohn, W. & Sham, L. J. Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133–A1138 (1965).
23. Møller, C. & Plesset, M. S. Note on an approximation treatment for many-electron systems. Phys. Rev. 46, 618–622 (1934).
24. Becke, A. D. Density-functional exchange-energy approximation with correct asymptotic behavior. Phys. Rev. A 38, 3098–3100 (1988).
25. Lee, C., Yang, W. & Parr, R. G. Development of the Colle-Salvetti correlation-energy formula into a functional of the electron density. Phys. Rev. B 37, 785–789 (1988).
26. VandeVondele, J. & Hutter, J. Gaussian basis sets for accurate calculations on molecular systems in gas and condensed phases. The Journal of Chemical Physics 127, 114105 (2007).
27. Lippert, G., Hutter, J. & Parrinello, M. A hybrid Gaussian and plane wave density functional scheme. Molecular Physics 92, 477–488 (1997).
28. Goedecker, S., Teter, M. & Hutter, J. Separable dual-space Gaussian pseudopotentials. Phys. Rev. B 54, 1703–1710 (1996).
29. Rappe, A. K., Casewit, C. J., Colwell, K. S., Goddard, W. A. & Skiff, W. M. UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations. Journal of the American Chemical Society 114, 10024–10035 (1992).
30. Wang, J., Wolf, R. M., Caldwell, J. W., Kollman, P. A. & Case, D. A. Development and testing of a general Amber force field. Journal of Computational Chemistry 25, 1157–1174 (2004).
31. Baker, J. & Chan, F. The location of transition states: A comparison of Cartesian, Z-matrix, and natural internal coordinates. Journal of Computational Chemistry 17, 888–904 (1996).
32. Bardwell, D. A. et al. Towards crystal structure prediction of complex organic compounds–a report on the fifth blind test. Acta Crystallographica Section B 67, 535–551 (2011).
33. Reilly, A. M. et al. Report on the sixth blind test of organic crystal structure prediction methods. Acta Crystallographica Section B 72, 439–459 (2016).
34. Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
35. Vanderbilt, D. Soft self-consistent pseudopotentials in a generalized eigenvalue formalism. Phys. Rev. B 41, 7892–7895 (1990).
36. Tersoff, J. New empirical model for the structural properties of silicon. Phys. Rev. Lett. 56, 632–635 (1986).
37. Pastewka, L., Klemenz, A., Gumbsch, P. & Moseler, M. Screened empirical bond-order potentials for Si-C. Phys. Rev. B 87, 205410 (2013).
38. Bahn, S. R. & Jacobsen, K. W. An object-oriented scripting interface to a legacy electronic structure code. Computing in Science & Engineering 4, 56–66 (2002).
39. Szlachta, W. J., Bartók, A. P. & Csányi, G. Accuracy and transferability of Gaussian approximation potential models for tungsten. Phys. Rev. B 90, 104108 (2014).
40. Daw, M. S. & Baskes, M. I. Embedded-atom method: Derivation and application to impurities, surfaces, and other defects in metals. Phys. Rev. B 29, 6443–6453 (1984).
41. Zhou, X. W., Johnson, R. A. & Wadley, H. N. G. Misfit-energy-increasing dislocations in vapor-deposited CoFe/NiFe multilayers. Phys. Rev. B 69, 144113 (2004).
42. Case, D. A. et al. Amber 16. Tech. Rep., University of California, San Francisco, http://www.ambermd.org (2016).
43. Werner, H.-J. et al. Molpro, version 2015.1, a package of ab initio programs, https://www.molpro.net (2015).
44. Pastewka, L. Atomistica: interatomic potentials library, http://www.atomistica.org.
45. VandeVondele, J. et al. Quickstep: Fast and accurate density functional calculations using a mixed Gaussian and plane waves approach. Computer Physics Communications 167, 103–128, http://www.cp2k.org (2005).
46. Segall, M. D. et al. First-principles simulation: ideas, illustrations and the CASTEP code. Journal of Physics: Condensed Matter 14, 2717, http://www.castep.org (2002).
47. Csányi, G. et al. Expressive programming for computational physics in Fortran 95+. IoP Comput. Phys. Newsletter Spring, http://www.libatoms.org (2007).
48. Larsen, A. H. et al. The atomic simulation environment—a Python library for working with atoms. Journal of Physics: Condensed Matter 29, 273002, https://wiki.fysik.dtu.dk/ase (2017).

Acknowledgements

The authors thank Prof. Chris J Pickard for stimulating discussions. The work was supported by the EPSRC grant EP/J022012/1. L.M. and C.O. were supported by ERC Starting Grant 335120. Computer time was in part provided by the Centre for Scientific Computing of the University of Warwick and Archer under the UKCP Consortium EPSRC grant EP/P022596/1.

Author information

Authors and Affiliations

Mathematics Institute, University of Warwick, Zeeman Building, Coventry, CV4 7AL, United Kingdom
Letif Mones & Christoph Ortner
Engineering Laboratory, University of Cambridge, Trumpington Street, Cambridge, CB2 1PZ, United Kingdom
Letif Mones & Gábor Csányi

Authors

Letif Mones
Christoph Ortner
Gábor Csányi

Contributions

L.M. wrote the software and conducted the simulations. L.M., C.O. and G.C. designed the research. All authors edited the manuscript.

Corresponding author

Correspondence to Letif Mones.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Initial geometries

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Mones, L., Ortner, C. & Csányi, G. Preconditioners for the geometry optimisation and saddle point search of molecular systems. Sci Rep 8, 13991 (2018). https://doi.org/10.1038/s41598-018-32105-x

Received: 08 March 2018
Accepted: 14 August 2018
Published: 18 September 2018
DOI: https://doi.org/10.1038/s41598-018-32105-x
