A fast and adaptable method for high accuracy integration of the time-dependent Schrödinger equation

We present an adaptable, fast, and robust method for integrating the time-dependent Schrödinger equation. We apply the method to calculations of High Harmonic (HHG) and Above Threshold Ionisation (ATI) spectra for a single atomic electron in an intense laser field. Our approach implements the stabilized bi-conjugate gradient method (BiCG-STAB) for solving a sparse linear system to evolve the electronic wavefunction in time. The use of this established method makes the propagation scheme less restrictive compared to other schemes which may have particular requirements for the form of the equation, such as use of a three-point finite-difference approximation for spatial derivatives. Our method produces converged solutions significantly faster than existing methods, particularly if high accuracy is required. We demonstrate that this approach is suitable for a range of different parameters and show that in many circumstances significant gains can be made with the use of a fourth-order time propagator as opposed to the more common second-order Crank-Nicolson (CN) method.


A fast and adaptable method for high accuracy integration of the time-dependent Schrödinger equation Daniel Wells & Harry Quiney
We present an adaptable, fast, and robust method for integrating the time-dependent Schrödinger equation. We apply the method to calculations of High Harmonic (HHG) and Above Threshold Ionisation (ATI) spectra for a single atomic electron in an intense laser field. Our approach implements the stabilized bi-conjugate gradient method (BiCG-STAB) for solving a sparse linear system to evolve the electronic wavefunction in time. The use of this established method makes the propagation scheme less restrictive compared to other schemes which may have particular requirements for the form of the equation, such as use of a three-point finite-difference approximation for spatial derivatives. Our method produces converged solutions significantly faster than existing methods, particularly if high accuracy is required. We demonstrate that this approach is suitable for a range of different parameters and show that in many circumstances significant gains can be made with the use of a fourth-order time propagator as opposed to the more common second-order Crank-Nicolson (CN) method.
There is substantial interest in using the interaction of atoms in intense laser fields as a source of coherent short-wavelength light [1][2][3][4][5] . Such light sources are attractive for their laboratory scale and are the foundation of attosecond science, which exploits the sub-cycle dynamics of electron behaviour to probe matter directly on its natural timescale [6][7][8][9] .
Numerical integration of the Time-Dependent Schrödinger Equation (TDSE) has been an essential tool in understanding intense laser-atom interactions since practical applications of this field of research became apparent with the discoveries of Above Threshold Ionization (ATI) 10 and High Harmonic Generation (HHG) 11 . Direct solution of the TDSE provides an ab initio treatment of the interaction capable of dealing exactly with its non-linear nature, which has historically been exploited to obtain insights that would not be possible with analytic or approximate methods 12 .
Despite the centrality of the TDSE to this field, high accuracy computations are rarely performed. This is due in part to the substantial computational requirements of such a task, but also due to the difficulty in directly comparing the results of numerical simulation with experimental observables. This is a result of both uncertainty in the precise form of the laser field and the necessity of making approximations in treating many electron atoms.
Interest in accurate numerical data, however, has increased recently largely due to developments in laser science that allow for the creation of tailored laser fields 13 as well as experiments using atomic hydrogen as the interaction medium -a system for which the single particle Hamiltonian can be expressed exactly -and the use of such experiments for laser calibration [14][15][16] .
Although established techniques for integrating the TDSE [17][18][19][20][21][22][23] can, in principle, work to arbitrary accuracy, they are not generally amenable to high accuracy simulations since computation time scales poorly with increased demand for accuracy. The commonly used split-step operator method 19,20 separates the Hamiltonian into a field-free (atomic) term and an interaction (laser) term and uses a second-order decomposition of the time propagator, For an initial state at time t, integration of this equation gives a state at time t + Δt, where  is the time-ordering operator. Taking a short time approximation to this equation, in which the Hamiltonian is considered constant over a time step, and applying the Crank-Nicolson formula yields the familiar propagation equation which is unitary, accurate to second-order in the time step and unconditionally stable. Even for modest demands of accuracy, this equation necessitates a very small time step and so we also consider a higher-order expansion of equation 3, obtained from the fourth-order Lobatto IIIA which is an implicit Runge-Kutta method involving three function evaluations 30 . The dependence on the middle function evaluation can be removed by substitution to obtain: As an indication of the potential advantage of the higher-order method, consider the propagation of a plane wave with frequency ω such that i t

The wave function
and , ) 12 6 12 6 ( , ) The leading order error terms are (ωΔt) 3 /12 and (ωΔt) 5 /12000 respectively. Restricting this error to one part in one thousand gives ωΔt < 0.23 for the second-order method and ωΔt < 1.64 for the fourth-order method: the allowed time step is greater than a factor of seven larger. For a desired accuracy ε this factor of advantage of the higher-order method scales as ε −2/15 .
The fourth-order method has an apparent shortcoming in that it sacrifices the strict unitarity of the second-order method. This is evident from the evaluation of the Hamiltonian at different times on each side of equation 5. However, the norm of the wavefunction never deviates significantly from unity so this formal lack of unitarity is not a problem in practice. Despite this, some may prefer to take a unitary version of this propagator by evaluating the Hamiltonian at only one point during each propagation step, for example at the midpoint of the interval.
Although this approach technically reduces the propagation from fourth-order to second-order, it is still significantly more accurate than the standard CN method. This is because the time variation of the Hamiltonian only occurs through the interaction term and is significantly slower than the time variation of the wave function, which is determined by the highest energy eigenstate occupied by the electron.
One can readily compute an upper-bound for the error arising from the unitary version of the propagator by considering plausible upper limits for the electron momentum and energy. For a Hamiltonian with a time-dependent vector potential, A(t), the difference in the wave function computed using the unitary and non-unitary expansions has a leading term in Δt of 3 where A′ and A′′ are the first and second derivatives of the vector potential amplitude and p θ is the angular part of the momentum operator (from the diagonal part of H). For a slowly varying vector potential (an infrared laser field for example), this error can be small enough for this scheme to be advantageous over the CN method. There is little to be gained, however, by enforcing unitarity of the time evolution in this way. It turns out to almost always be a better option to take the unaltered fourth-order propagator, and ensure that the observable of interest is converged with respect to size of the time step. We demonstrate that the use of this fourth order propagation scheme has advantages over the more common CN method, particularly for simulations that require high degrees of accuracy.
The stabilized bi-conjugate gradient method. We represent the wave function Ψ(r) in terms of some set of basis states φ i (r), are the expansion coefficients. Propagation by either equations 4 or 5 amounts to a matrix equation of the form where b can be calculated directly by projection of the RHS of equations 4 or 5 onto the the basis states while the LHS can be similarly cast as a known matrix, A, multiplied by the coefficients of the wavefunction at t + Δt. A will typically be a sparse matrix, coupling each basis element to only a handful of other elements. We solve this equation using the preconditioned BiCG-STAB 29 , a Krylov subspace method that offers smoother and faster convergence than its predecessors, the bi-conjugate gradient (BiCG) and conjugate gradient squared (CG-S) methods.
The routine converges well only if the matrix is close to the identity. To achieve this a preconditioning matrix, ∼ A, is uesd that is some approximation to A. The preconditioned equation is then 1 Each iteration of the algorithm requires two inversions of the preconditioning matrix, ∼ A, two multiplications by A, and four vector inner products. The algorithm exits when the norm of the residual is less than a predefined tolerance.
Application to atoms in intense laser fields. The electron wave function is decomposed as a partial wave expansion such that The radial functions, ψ l , are discretized on a linear grid. A dynamically expanding grid is used for computations of photo-electron spectra to ensure the wavefunction at the grid boundary is zero. For simulations of HHG for which ionised electrons play no role, the maximum grid size is set such that maximum electron excursion distances are entirely within the grid and a masking function is used to prevent reflections from the grid boundary.
The interaction with the laser field is performed in velocity gauge. Velocity and length gauge representations differ by a phase factor in the wavefunction. It has been shown previously that this phase factor results vastly more terms in expansion of equation 14 when length gauge is used, making length gauge computations impractical 21,28 .
In this basis, the non-discretized Hamiltonian becomes tri-diagonal in the angular quantum number, l, such that the laser interaction couples each ψ l only to ψ l±1 : The term c l is the coupling coefficient l and A(t) is the magnitude of the vector potential of the laser field. Discritization is performed using a five-point finite difference approximation for the the spatial derivatives. The approximations at the ith grid element are given by when acting on a function f with a grid spacing h.
For a radial grid there is an inherent complication in these formulas at the first grid point r = h since the above expressions require a value at r = −h (a value at r = 0 is implicitly included as a result of the boundary condition at the origin). Neglecting this complication results in an error in the first grid point independent of step size.
This problem could be remedied by altering a few elements in the top left of the discritized C l and B l matrices, but unfortunately there is no general way of accomplishing this without destroying the hermiticity of the Hamiltonian operator.
A solution is to use an analytic expression for the radial wave functions near the origin to extrapolate to a point at r = −h. This approach is similar to that of Salomonson and Öster's for a logarithmic grid spacing 31 . The analytic form of the radial functions is given by where Z is the atomic number of the atom. Close to the origin the form of the wavefunction becomes independent of energy due to the unbounded magnitude of the nuclear potential and so this approximation will hold regardless of the state of the electron. Equation 21 indicates that radial functions are O(r l+1 ) as → r 0 and the extrapolated point will be well approximated by zero for l ≥ 2. It is therefore sufficient to only consider the l = 0 and l = 1 radial functions. Extrapolated values at r = −h can be computed in terms of the value at r = h: Incorporating the values results in modifications to the discritized first and second derivative operators. For the second derivative, where a is chosen to suit the angular number of the relevant partial wave, with a = −29 + 2Zh/(1 − Zh) for C 0 and a = −31 − 2Zh/(2 − hZ) for C 1 .
The operators B and B † in equation 15 act on radial functions of different angular number so the explicit dependence of equation 21 on l is problematic for the hermiticity of the Hamiltonian. The second term in equation 23 is not required to obtain a sensible value for the first derivative and can be safely modified to restore hermiticity by setting ψ 1 where the plus is used for the unconjugated B l matrices and the minus is used for the hermitian conjugates † B l (this is hermitian due to the i that pre-multiplies the derivative in equation 17). For B 0 , we obtain b = 1 + 2hZ/ (1 − Zh) and for B 1 No such modifications are required at the large r grid boundary since the wavefunction is guaranteed to be zero by the use of either a masking function or a dynamically expanding grid.
Preconditioning. To work effectively, BiCG-STAB requires a pre-conditioning matrix, ∼ A, that can be easily inverted and functions to enhance the diagonal dominance of the matrix fed to the conjugate gradient algorithm. A suitable choice for this purpose is the part of the matrix that is diagonal in l. For the CN propagator, this is just at while for the fourth-order propagator this is where H at. is the atomic part of the Hamiltonian: H as defined in equation 15 without the B l operators. Since the function of this matrix is simply to act as an approximation to the full matrix, we can employ a three-point finite difference representation for the spatial derivative. This ensures that the preconditioning matrix is tridiagonal in the case of the CN propagator and is easily invertible. The fourth-order preconditioner (equation 27) can be factorized into two tridiagonal matrices, which can be inverted sequentially. This particular form of preconditioning mirrors the splitting of the propagation matrix used for the MI method, for which the part diagonal in l is inverted directly and the 'off diagonal' part is handled by the iterative routine.
Adaptive time stepping. The numerical integration becomes much more efficient when an adaptive time step is used. At the beginning or end of the laser pulse for example, relatively large time steps can be taken compared to when the laser vector potential is large or is changing rapidly.
Both the CN and the fourth-order methods allow for embedded error checking by constructing lower-order estimates of the wave function at the next time step and using the difference between this and the higher-order value as an error estimate. This can then be used to track the error and adjust the time step accordingly. For the CN method a first-order approximation to Ψ n+1 is given by The error estimate vector is then The error estimate for the fourth-order method is slightly more complicated to derive but ultimately reduces to n n n n n n (4) 1 1 1 Only the last term in this expression has not already been calculated in performing the propagation so computation of the error is relatively low cost. Using this information to choose an appropriate next step size requires some trial and error. A successful method is to estimate the total error in a time step by where the summation ignores terms for which |Ψ| 2 is less than a cut-off, set to 10 −5 , to ensure computational effort is not being spent where the wave function is negligible. The following time step is set so the error is less than a desired tolerance, η, by The exponent μ is set according to the order of error of the time step -0.2 for the fourth-order method and 0.33 for Crank-Nicolson.
While this adaptive stepping scheme produces converged results for both the CN and fourth-order propagators, we find that it works best for the fourth-order method. This is likely because the estimate of the error is closer to the true error in this case.

Convergence of the CG algorithm.
To test the performance of the BiCG-STAB routine we compare directly with the MI method. As remarked earlier the splitting of the workload of the two schemes is very similar with both involving direct inversion of the part of the matrix that is diagonal in the orbital quantum number lin the BiCG-STAB scheme this is done through the preconditioner -while parts off-diagonal in l are handled by the iterative scheme. We can therefore meaningfully compare the two schemes iteration-by-iteration.
The calculations simulate a hydrogen atom initially in its ground state. The matrix equation we use as a test case corresponds to a single time step of 0.1 atomic units in the presence of a vector potential. The magnitude of the vector potential is set to values of 1.0, 2.0 and 3.0 atomic units, which are typical of intense laser pulses. For the purposes of this comparison, a single iteration of the CG algorithm is counted as two iterations as it involves two updates of the solution vector and two inversions of the preconditioning matrix, meaning a single 'iteration' involves similar computational work for each of the methods. To facilitate a direct comparison the three-point difference formula is used in for the CG method rather than the five-point formula as defined in equations 19 and 20. When compared iteration by iteration in this fashion, the CG method is somewhat slower than MI. This is to be expected due to the bookkeeping overheads and the need to compute a search direction at each step. The convergence as a function of iteration is shown in Fig. 1.
It is noteworthy that the convergence of the two methods is almost identical. This reflects the diagonal dominance of the matrix to be inverted and the similarities in the treatment of the diagonal part of the matrix. Similar calculations were performed with the wave function in an evolved state after partial ionization by a laser pulse, but the similarities in the convergence are maintained.
These results demonstrate no immediate advantage in using the more complex CG scheme over the MI scheme if the CN propagator and three-point methods are used. We find, however, that the convergence of the conjugate gradient routine remains strong when higher-order methods are used for the spatial and temporal derivatives. The cost of these higher-order approximations can be measured by the number of iterations required for the algorithm to converge and comparing this convergence using the lower-order methods. We expect the lower-order methods to converge more quickly due to the simplicity of the equations. Employing a five-point formula results in an insignificant additional convergence cost. The fourth-order time propagator, on the other hand, always requires more iterations to converge than its second-order counterpart, but this cost is counterbalanced by the significant reduction in required number of time steps. An example of the number of iterations required for convergence as a function of the time step size, Δt, is shown in Fig. 2.

Comparison of 3 point and 5 point difference formulas.
A five-point difference method has a clear advantage over the three-point difference method. Figure 3 demonstrates how calculations of the total ionization probability depend on grid spacing for the respective methods. The simulations are of a hydrogen atom in a 5 fs laser pulse with peak intensity 10 15 W/cm 2 . An upper limit is imposed on the maximum grid spacing by the uncertainty principle and it is important to ensure this limit is not exceeded. Specifically, for a maximum electron momentum p max , this requirement is Δr < 2π/p max . p max can be estimated by considering the maximum momentum an electron could attain in the evolving laser field from a stationary start. In the simulations presented in Fig. 3 this limit is approximately 1 atomic unit, safely exceeding the maximum grid spacing used in the simulations.
For the five-point method, even with a grid spacing of 0.2 a.u., the error in the calculated ionization probability is less than 0.1% (percentage relative to the converged probability, 0.958) of the converged value. To achieve the same accuracy with the three-point method requires four times as many grid points. The corresponding benefit to computational time is not quite a factor of four, since slightly more work is done using a five-point routine (multiplication by the 5-point Hamiltonian is roughly a factor of 1.67 more expensive -15 non-zero diagonal bands instead of 9), but the decrease in computational time is significantly better than a factor of two. As shown in Fig. 2, the number of iterations required for a time step to converge is only slightly greater, or the same, for when the five-point difference method is used, making this option clearly superior.  Unitarity of fourth-order propagator. The asymmetry of the fourth-order propagation equation (equation 5) has the potential to result in non-unitary time evolution. This is a direct result of the explicit time variation of the Hamiltonian and so complete unitarity of the propagation will be restored in the limit Δ → t 0 when the laser field becomes zero. To illustrate the deviation of the norm of the wave function, Fig. 4 shows the norm plotted as a function of pulse time for a range of step sizes (which are held constant throughout the calculation). For this simulation the peak pulse intensity is set to 5 × 10 14 W/cm 2 with central wavelength of 800 nm and four cycles duration.
The figure shows the most rapid variation in the norm when the vector potential of the laser field is changing most rapidly, corresponding to the times when the Hamiltonian is also changing most rapidly. For Δt = 0.1 a.u., this variation is negligible; significantly less than one part in ten thousand. For Δt = 0.5 a.u. the maximum variation is around 0.2 percent. The clear dependence of the deviation from unit norm on the rate of change of the vector potential suggests a varying time-step would be beneficial. While it is certainly possible to use the variation of the norm of the wavefunction directly as a control parameter for the step size, the methods of adaptive step sizing described above are preferable as they more directly relate step size to the error in the propagation equations.

Comparison of 2nd order and 4th order propagators.
To compare the performance of the second-order and fourth-order propagators we examined high harmonic spectra computed using fixed time steps. We find there is a significant reduction in the number of time steps required to achieve a converged solution when the higher-order method is used -we obtain a well converged spectrum for the fourth-order method using a time step of 0.2 atomic units. To achieve similar convergence with the Crank-Nicolson method the required time step is 0.02 atomic units.
We define a metric to determine the error in a given spectrum based on a normalized difference of the spectrum to the converged spectrum (obtained using the highest accuracy simulation): is a weighted average of the spectrum at ω determined by is a windowing function with a window size 2δ defined by 3 3 3 We set δ to be 4 times the laser frequency.
In Fig. 5, we show spectra produced using the two methods. The fully converged spectrum in not shown in this plot as it is indistinguishable from the fourth-order result. The second-order method, despite having a time step five times shorter than the fourth-order method, is clearly different from the converged spectrum and is unsatisfactory for quantitative purposes. The right-hand plots of Fig. 5 show the integrated error of the harmonic spectra as defined in equation 33. This shows markedly faster convergence for the higher-order method. Importantly, both the second-and fourth-order methods converge to the same result as the time step is decreased. This indicates that the small deviation from unit norm that arises in the fourth-order method has no appreciable effect on the calculation of observables. The cost of the higher-order method is its increased complexity, which results in more iterations per time step (see Fig. 2) and more computation at each iteration. To determine the overall performance of the higher-order method relative to the second-order method we count the total number of iterations throughout a simulation. Table 1 demonstrates how a specific harmonic intensity converges with respect to the size of time step used in the calculation. Harmonic intensity is calculated by integrating the spectrum, I(ω), over the harmonic frequency.
Once the number of required iterations exceeds 20, there appears to be no computational advantage in increasing the time step regardless of numerical accuracy, and a smaller step size results in a faster computation. To allow for this, for these simulations, whenever the number of iterations required is 20 or greater, the time step is reduced so that this limit is not exceeded. The time step is otherwise kept constant at the specified value. The only simulations for which the number of iterations per step approaches 20 at any point are the fourth-order simulations with step sizes of 0.20, 1.0 and 2.0.
The total number of iterations aggregated across the simulation are shown as an alternative measure of the time taken for each method, that is (relatively) independent of factors such as programming language and style.
Computationally, the difference between the second-and fourth-order methods stems from the fact that the number of Hamiltonian multiplications increases from two to six each iteration, while the number of inversions of the field free Hamiltonian increases from two to four. The number of dot products remains the same, although the number of vector additions increases slightly. Overall, a single iteration of the fourth-order scheme consistently takes about 1.5 times the time taken by the second-order method. Thus, the fourth-order scheme will be more efficient if the simulation can be completed with fewer than two thirds of the number for the second-order scheme. The table shows that, in calculating the 15th harmonic to within 5% accuracy, the second-order method takes 300 × 10 3 iterations compared to approximately 105 × 10 3 iterations for the fourth-order method. The overall time saving is around a factor of two.
In general it will be desirable to use a varying time step rather than a fixed time step, so this factor of two does not necessarily reflect the actual reduction in computation of the higher order method. The fourth order method is most advantageous when the magnitude of the vector potential is large, and high accuracy is required. Note  that when the magnitude is large the variation is slow so this is also the point at which the deviation from unit norm will be smallest. In many cases, the most efficient method may be a hybrid of the second-and fourth-order schemes where the second-order method can skip through the 'easy' periods where the vector potential is small, taking large steps, and the fourth-order method can step in to cover the rockier terrain when the magnitude of the vector potential is large.

Conclusion
We have presented a method of integrating the time-dependent Schrödinger equation for a single particle with a time-dependent Hamiltonian that is based around use of the stabilized bi-conjugate gradient method (BiCG-STAB) to solve a matrix equation for the propagation of a single time step. We have applied our method to the problem of an atom in an intense few-cycle infra-red laser field and demonstrated that it performs better than existing methods. Specifically, we have demonstrated that this method can incorporate higher-order approximations to spatial and temporal derivatives than other more rigid methods, and this allows for the use of significantly fewer spatial and temporal grid points without compromising on accuracy. We have shown that this reduction in grid points means a significant decrease in the amount of time required to perform simulations.