Imaginary time is an unphysical, yet powerful, mathematical concept. It has been utilised in numerous physical domains, including quantum mechanics, statistical mechanics and cosmology. Often referred to as performing a ‘Wick rotation’,1 replacing real time with imaginary time connects Euclidean and Minkowski space,2 quantum and statistical mechanics3 and static problems to problems of dynamics.4 In quantum mechanics, propagating a wavefunction in imaginary time enables: the study of finite temperature properties,5,6,7 finding the ground-state wavefunction and energy (such as in density matrix renormalisation group),8,9,10,11 and simulating real-time dynamics (such as time-dependent Hartree).12,13 For a system with Hamiltonian, H, evolving in real time, t, the propagator is given by eiHt. The corresponding propagator in imaginary time, τ = it, is given by e; a non-unitary operator.

Using a classical computer, we can simulate imaginary time evolution by evaluating the propagator and applying it to the system wavefunction. There also exist various related classical methods, such as quantum Monte Carlo14,15 and density matrix renormalisation group16,17 for solving different problems. However, because the dimension of the wavefunction grows exponentially with the number of particles, classical simulation of many-body quantum systems is generally hard.18 While efficient variational trial states have been developed for a number of applications,19 powerful trial wavefunctions typically require classical computational resources which scale exponentially with the system size.11

Quantum computing can naturally and efficiently store many-body quantum states, and hence is suitable for simulating quantum systems.20 We can map the system Hamiltonian to a qubit Hamiltonian, and simulate real-time evolution (as described by the Schrödinger equation) by realising the corresponding unitary evolution with a quantum circuit.21 Using Trotterization,22 the real-time propagator can be decomposed into a sequence of single- and two-qubit gates.23 The ability to represent the real-time propagator with a sequence of gates stems from its unitarity. In contrast, because the imaginary time operator is non-unitary, it is not straightforward to decompose it into a sequence of unitary gates using Trotterization, and thus directly realise it with a quantum circuit. As a result, alternative methods are required to implement imaginary time evolution using a quantum computer.

Classically, we can simulate real (imaginary) time evolution of parametrised trial states by repeatedly solving the (Wick rotated) Schrödinger equation over a small timestep, and updating the parameters for the next timestep.8,9,11,24,25,26,27 This method has recently been extended to quantum computing, where it was used to simulate real-time dynamics.28 Closely related are the variational quantum eigensolver (VQE)29,30,31,32,33,34,35 and the quantum approximate optimisation algorithm (QAOA),36 which update the parameters using a classical optimisation routine, to find the minimum energy eigenvalue of a given Hamiltonian. As ‘hybrid quantum-classical methods’, these algorithms use a small quantum computer to carry out a classically intractable subroutine, and a classical computer to solve the higher-level problem. The quantum subroutine may only require a small number of qubits and a low-depth circuit, presenting a potential use for noisy intermediate-scale quantum hardware.37

In this paper, we propose a method to simulate imaginary time evolution on a quantum computer, using a hybrid quantum-classical variational algorithm. The proposed method thus combines the power of quantum computers to efficiently represent many-body quantum states, with classical computers’ ability to simulate arbitrary (including unphysical) processes. We discuss using this method to find the ground-state energy of many-body quantum systems, and to solve optimisation problems. We then numerically test the performance of our algorithm at finding the ground-state energy of both the hydrogen molecule (H2) and lithium hydride (LiH). We compare our results for LiH to those obtained using the VQE with gradient descent. As our algorithm only requires a low-depth circuit, it can be realised with current and near-term quantum processors.


Variational imaginary time evolution

We focus on many-body systems that are described by Hamiltonians \(H = \mathop {\sum}\nolimits_i {\lambda _i} h_i\), with real coefficients, λi, and observables, hi, that are tensor products of Pauli matrices. We assume that the number of terms in this Hamiltonian scales polynomially with the system size, which is true for many physical systems, such as molecules or the Fermi-Hubbard model. Given an initial state |ψ〉, the normalised imaginary time evolution is defined by

$$\left| {\psi (\tau )} \right\rangle = A(\tau )e^{ - H\tau }\left| {\psi (0)} \right\rangle ,$$

where \(A(\tau ) = 1/\sqrt {\left\langle {\psi (0)} \right|e^{ - 2H\tau }\left| {\psi (0)} \right\rangle }\) is a normalisation factor. In the instance that the initial state is a maximally mixed state, the state at time τ is a thermal or Gibbs state ρT=1 = e−Hτ/Tr[e−Hτ], with temperature T = 1. When the initial state has a non-zero overlap with the ground state, the state at τ → ∞ is the ground state of H. Equivalently, the Wick rotated Schrödinger equation is,

$$\frac{{\partial \left| {\psi (\tau )} \right\rangle }}{{\partial \tau }} = - (H - E_\tau )\left| {\psi (\tau )} \right\rangle ,$$

where the term Eτ = 〈ψ(τ)|H|ψ(τ)〉 results from enforcing normalisation. Even if |ψ(τ)〉 can be represented by a quantum computer, the non-unitary imaginary time evolution cannot be naively mapped to a quantum circuit.

In our variational method, instead of directly encoding the quantum state |ψ(τ)〉 at time τ, we approximate it using a parametrised trial state \(| {\phi (\vec \theta (\tau ))} \rangle\), with \(\vec \theta (\tau ) = (\theta _1(\tau ),\theta _2(\tau ), \ldots ,\theta _N(\tau ))\). This stems from the intuition that the physically relevant states are contained in a small subspace of the full Hilbert space.38 The trial state is referred to as the ansatz. In condensed matter physics and computational chemistry, a wide variety of ansätze have been proposed for both classical and quantum variational methods.11,20,39,40

Using a quantum circuit, we prepare the trial state, \(| {\phi (\vec \theta )} \rangle\), by applying a sequence of parametrised unitary gates, \(V(\vec \theta ) = U_N(\theta _N) \ldots U_k(\theta _k) \ldots U_1(\theta _1)\) to our initial state, \(\left| {\bar 0} \right\rangle\). We express this as \(| {\phi (\vec \theta )} \rangle = V(\vec \theta )| {\bar 0} \rangle\) and remark that \(V(\vec \theta )\) is also referred to as the ansatz. We refer to all possible states that could be created by the circuit V as the ‘ansatz space’. Here, Uk(θk) is the kth unitary gate, controlled by parameter θk, and the gate can be regarded as a single- or two-qubit gate.

To simulate the imaginary time evolution of the trial state, we use McLachlan’s variational principle,41,42

$$\delta \left\| {(\partial /\partial \tau + H - E_\tau )\left| {\psi (\tau )} \right\rangle } \right\| = 0,$$

where \(\parallel \rho \parallel = {\mathrm{Tr}}[\sqrt {\rho \rho ^\dagger } ]\) denotes the trace norm of a state. By replacing |ψ(τ)〉 with \(| {\phi (\tau )} \rangle = | {\phi (\vec \theta (\tau ))} \rangle\), we effectively project the desired imaginary time evolution onto the manifold of the ansatz space. The evolution of the parameters is obtained from the resulting differential equation

$$\mathop {\sum}\limits_j {A_{ij}} \dot \theta _j = C_i,$$


$$\begin{array}{l}A_{ij} = \Re \left( {\frac{{\partial \left\langle {\phi (\tau )} \right|}}{{\partial \theta _i}}\frac{{\partial \left| {\phi (\tau )} \right\rangle }}{{\partial \theta _j}}} \right)\\ C_i = \Re \left( { - \mathop {\sum}\limits_\alpha {\lambda _\alpha } \frac{{\partial \left\langle {\phi (\tau )} \right|}}{{\partial \theta _i}}h_\alpha \left| {\phi (\tau )} \right\rangle } \right),\end{array}$$

and hα and λα are the Pauli terms and coefficients of the Hamiltonian, as described above. The derivation of Eq. (4) can be found in the Supplementary Materials. As both Aij and Ci are real, the derivative \(\dot \theta _j\) is also real, as required for parametrising a quantum circuit. Interestingly, although the average energy term Eτ appears in Eq. (2), it does not appear in Eq. (4). This is because the ansatz applied maintains normalisation, as it is composed of unitary operators.

Imaginary time evolution with quantum circuits

By following a similar method to that introduced in ref., 28 we can efficiently measure Aij and Ci using a quantum computer. We assume that the derivative of a unitary gate Ui(θi) can be expressed as \(\partial U_i(\theta _i)/\partial \theta _i = \mathop {\sum}{_k {f_{k,i}}} U_i(\theta _i)\sigma _{k,i}\), with unitary operator σk,i. The derivative of the trial state is given by \(\partial \left| {\phi (\tau )} \right\rangle /\partial \theta _i = \mathop {\sum} {_k {f_{k,i}}} \tilde V_{k,i}\left| {\bar 0} \right\rangle\), with \(\tilde V_{k,i} = U_N(\theta _N) \ldots U_{i + 1}(\theta _{i + 1})U_i(\theta _i)\sigma _{k,i} \ldots U_1(\theta _1)\). There are typically only one or two terms resulting from each derivative. As an example, when Ui(θi) is a single-qubit rotation \(R_z(\theta _i) = e^{ - i\theta _i\sigma _z/2}\), the derivative \(\partial U_i(\theta _i)/\partial \theta _i = - i/2 \times \sigma _ze^{ - i\theta _i\sigma _z/2}\). The coefficients Aij and Ci are given by

$$\begin{array}{l}A_{ij} = \Re \left( {\mathop {\sum}\limits_{k,l} {f_{k,i}^ \ast } f_{l,j}\left\langle {\bar 0} \right|\tilde V_{k,i}^\dagger \tilde V_{l,j}\left| {\bar 0} \right\rangle } \right),\\ C_i = \Re \left( {\mathop {\sum}\limits_{k,\alpha } {f_{k,i}^ \ast } \lambda _\alpha \left\langle {\bar 0} \right|\tilde V_{k,i}^\dagger h_\alpha V\left| {\bar 0} \right\rangle } \right).\end{array}$$

All of these terms are of the form \(a\Re \left( {e^{i\theta }\left\langle {\bar 0} \right|U\left| {\bar 0} \right\rangle } \right)\) and can be evaluated using the circuits shown in the Supplementary Materials.

With A(τ) and \(\vec C(\tau )\) at time τ, the imaginary time evolution over a small interval δτ can be simulated by evaluating \(\vec{\dot{\theta}}(\tau ) = A^{ - 1}(\tau )\cdot \vec C(\tau )\), and using a suitable update rule, such as the Euler method,

$$\vec \theta (\tau + \delta \tau ) \simeq \vec \theta (\tau ) + \dot{\vec\theta}(\tau )\delta \tau = \vec \theta (\tau ) + A^{ - 1}(\tau )\cdot \vec C(\tau )\delta \tau .$$

By repeating this process NT = τtotal/δτ times, we can simulate imaginary time evolution over a duration τtotal. Often, the satisfying parameter evolution is not unique and Eq. (4) is underdetermined. In that case, we can employ truncated singular value decomposition to approximately invert A, or Tikhonov regularisation to additionally constrain the parameters to vary smoothly. We elaborate upon these strategies in the Supplementary Materials.

A limitation of our variational method is that the ansatz may not be able to faithfully describe all states on the desired trajectory, much like its real-time counterpart.28 Even though such states lie in a small subspace of the full Hilbert space,38 it is difficult to prove that they can be generated by a given ansatz, despite promising numerical results.28 However, our numerical results are similarly promising for imaginary time, and demonstrate it to be a robust routine for energy minimisation. Moreover, we believe that when tasked with finding the ground state using imaginary time evolution, a small deviation from the true evolution is less problematic than when trying to simulate real-time evolution. This is because imaginary time evolution always drives a state towards the ground state (or one of the lowest eigenstates), whereas the real-time evolution of two closely separated states may be very different. Consequently, as long as errors due to an imperfect ansatz do not cause the simulation to become trapped in local minima, we do not mind if the evolution deviates from the path of true imaginary time evolution, as ultimately, it will still be driven towards the ground state. Nevertheless, designing ansätze that are well suited to imaginary time evolution is an interesting open problem.

Ground-state energy via imaginary time evolution

We apply our method to the problem of finding the ground-state energy of a many-body Hamiltonian, H. As with the VQE, our goal is to find the values of the parameters, \(\vec \theta\), which minimise the expectation value of the Hamiltonian

$$E_{\min } = \min _{\vec \theta }{\mathrm{ }}\langle {\phi (\vec \theta )} |H| {\phi (\vec \theta )} \rangle ,$$

where \(| {\phi (\vec \theta )} \rangle = V(\vec \theta )| {\bar 0} \rangle\) is our variational trial state. The VQE solves this problem by using a quantum computer to construct a good ansatz and measure the expectation value of the Hamiltonian, and a classical optimisation routine to obtain new values of the parameters. In order to preserve the exponential speedup of the VQE over classical methods, the trial state is constructed using a number of parameters that scales polynomially with the system size. However, because we may need to consider many possible values for each parameter, the total size of the parameter space still scales exponentially with the system size. Moreover, many optimisation algorithms, such as gradient descent, are liable to becoming trapped in local minima. This combination can make the classical optimisation step of the VQE very difficult.43

As described above, if the initial state has a non-zero overlap with the ground state, true propagation in imaginary time will evolve the system into the ground state, in the limit that τ → ∞. Classically, this has been leveraged as a powerful tool to find the ground-state energy of quantum systems.8,9,11 Using our method, we can efficiently simulate ansatz-based imaginary time evolution to find the ground state, using a quantum computer. In the numerical simulations described below, we use the Euler method to solve differential equations, which corresponds to the update rule for the parameters shown in Eq. (7). We prove in the Supplementary Materials that when δτ is sufficiently small, the average energy of the trial state, \(E(\tau)=\langle\phi(\tau)\vert {H}\vert {\phi}(\tau)\rangle\), always decreases when following the Euler update rule: E(τ + δτ) ≤ E(τ).

In this work, we consider gradient descent, a canonical classical optimisation method

$$\vec \theta (\tau + \delta \tau ) = \vec \theta (\tau ) + \vec G(\tau )\delta \tau = \vec \theta (\tau ) + \vec C(\tau )\delta \tau ,$$

where \(\vec G(\tau ) = - \nabla E(\tau )\) is the gradient of E(τ) and \(\vec C(\tau ) \equiv - \nabla E(\tau )\) is the same vector in Eq. (4). Classical optimisation methods only consider information about the average energy, and not about the ansatz itself, which is encoded in the matrix A, used only in variational imaginary time evolution.

Toy example

Here, we present two simple toy examples which highlight the difference between variational imaginary time evolution and gradient descent for finding the ground-state energy of Hamiltonians. Consider the following Hamiltonians

$$H_A = \left( {\begin{array}{*{20}{c}} 1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 0 \end{array}} \right),\;\;H_B = \left( {\begin{array}{*{20}{c}} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 \end{array}} \right)$$

with ansätze

$$\left| {\psi _A(\theta _1,\theta _2,\theta _3)} \right\rangle = e^{i\theta _3}CR_Y^{0,1}(\theta _2)R_X^0(\theta _1)\left| {00} \right\rangle ,$$
$$\left| {\psi _B(\theta _1,\theta _2,\theta _3)} \right\rangle = e^{i\theta _3}CR_Y^{0,1}(\theta _2)R_X^0(\theta _1)R_X^1(\theta _1)\left| {01} \right\rangle ,$$

prepared by circuits ((A) and (B), respectively).Here, \(CR_Y^{0,1}\) is a controlled Y rotation with control qubit 0 and target qubit 1, \(R_X^q(\theta _1)\) is a rotation of qubit q around the x-axis, and the rotation about the j-axis is \(R_{\sigma _j}(\theta ) = e^{ - i\theta \sigma _j/2}\) with Pauli matrices σj. Note that θ3 is a fictitious parameter which corresponds to the global phase. This is present only so that the evolution of the other parameters are not constrained to produce an oscillating global phase in time, as recently studied in ref. 44.

We study the ability of variational imaginary time and gradient descent to navigate the energy landscapes of the toy systems A and B, and present the results in Fig. 1. Figure 1a shows imaginary time robustly discovering the global minima of the energy landscape of system A, while gradient descent becomes trapped in local minima. Figure 1b shows imaginary time performing comparably with gradient descent for system B, despite it being only slightly more complicated than system A. This shows that it is still possible for imaginary time evolution to become trapped in local minima, when the ansatz is not sufficiently powerful for certain Hamiltonians.

Fig. 1
figure 1

Comparison of variational imaginary time (top plot in each panel) and gradient descent (bottom plot in each panel) discovering the ground state in toy systems A (top panel) and B (bottom panel). The background colour indicates the energy \(\langle\psi(\theta _1,\;\theta _2,\;\theta _3)\vert {H}\vert\psi(\theta _1,\;\theta _2,\;\theta _3)\rangle\) with red and blue corresponding to the global maximum and ground-state energies, respectively. The arrows indicate the trajectories of the methods, and are coloured green if they converge to the true ground state, and red otherwise

Simulation of H2 and LiH

We use our method to find the ground-state energy of the H2 and LiH molecules in their minimal spin–orbital basis sets. We map the molecular fermionic Hamiltonians to qubit Hamiltonians using the procedure described in the Supplementary Materials. The H2 Hamiltonian acts on two qubits, and considers the space of two electrons in four spin–orbitals. The LiH Hamiltonian acts on eight qubits, and considers an active space of two electrons in eight spin–orbitals. There are numerous possible choices for the ansatz circuit; we use a universal ansatz for H229 and an ansatz inspired by the low-depth circuit ansatz45 for LiH, as shown in the Supplementary Materials. The simulation results for H2 are shown in Fig. 2. We have used a universal ansatz, which is capable of representing all states along the imaginary time trajectory to confirm that our method can recover true imaginary time evolution, when the ansatz is sufficiently powerful. We attribute deviation from the true evolution to the use of an Euler update rule, and finite step size. Our simulations were able to converge to the ground state in all trials.

Fig. 2
figure 2

Simulations of H2 with random initial parameters and timestep δτ = 0.01. The red line is the exact ground-state energy. The dashed black line is the exact imaginary time evolution. The blue line is the variational imaginary time evolution. The inset plot shows the fidelity of variational imaginary time to true imaginary time evolution. Here we consider an internuclear distance of R = 0.75 Å. The inset plot and main plot share the same x-axis label

We compare the LiH results to those obtained using the VQE, with gradient descent as the classical optimisation routine. We use the low-depth circuit ansatz shown in the Supplementary Materials for our simulation, with 137 parameters. This is approximately a quarter of the number needed in a universal ansatz. We consider starting from a good initial state (the Hartree–Fock state for LiH), and also random initial states. We believe that the latter simulations provide a more thorough test of both methods.

We use the maximum stable step size δτ for each method such that energy monotonically decreases in the first 200 iterations. The stable timestep for imaginary time was 0.225, and for gradient descent it was 0.886. Figure 3 shows the imaginary time method outperforming gradient descent. It is able to locate the ground state more quickly, and accurately. This advantage is most noticeable for the case of random start states, where the obtained convergence rate is significantly higher than gradient descent. This may be most relevant for solving optimisation problems using the QAOA algorithm, where it is often harder to motivate a good initial starting state.

Fig. 3
figure 3

Noise-free simulations of LiH at an internuclear distance of R = 1.45 Å. Simulations in the top plot begin in a small random perturbation (of at most, Δθj = π/50) from the Hartree–Fock state. Simulations in the bottom plot begin with uniformly random parameters. The solid lines (against the left axis) indicate the fraction of 1280 simulations which, by the given iteration, have converged to within 1 mHartree of the true ground state. The dashed lines (against the right axis) indicate the average proximity to the true ground state of only the so-far converged simulations. Imaginary time and gradient descent use their maximum stable timesteps

It is natural to question whether the resource requirements of variational imaginary time evolution are comparable with those of gradient descent. We assess this by performing a simple resource estimation, and by examining its sensitivity to shot noise and to gate errors within the quantum computer. At each iteration, populating the gradient vector requires \({\cal{O}}(N_CN_HN_p)\) measurements, where NC is the number of measurements required to ascertain a Hamiltonian term to the required precision, NH is the number of terms in the Hamiltonian, and Np is the number of parameters used in the ansatz. For imaginary time, the total cost is \({\cal{O}}(N_CN_HN_p + N_p^2N_A)\), where NA is the number of measurements required to ascertain an element of the A matrix to the required precision. Sensible ansätze typically have fewer parameters Np than there are Hamiltonian terms NH (in our LiH simulations, Np = 137 and NH = 181). If the number of Hamiltonian terms is considerably larger than the number of parameters used, then the additional cost \({\cal{O}}(N_p^2N_A)\) of imaginary time can be dominated by the cost of calculating the gradient vector. While this is not true for our LiH simulations, we find that it is possible to further reduce the cost of imaginary time by using fewer measurements for each term in the A matrix than for each element of the gradient vector (NANC), while still maintaining imaginary time’s superior performance. We demonstrate this in Fig. 4, where we vary the number of measurements used to populate A, and simulate the methods under the effect of decoherence. The results show that imaginary time can perform significantly better than gradient descent under the presence of noise, even when significantly fewer measurements are made. However, if the gradient is not known to sufficient accuracy (NC < 2 × 104), the reliability of imaginary time evolution cannot be improved by increasing NA, and can even perform less effectively than gradient descent. Combined with imaginary time’s faster convergence and the tendency of gradient-based methods to become trapped in local minima, we expect finding the ground state to require substantially fewer measurements using imaginary time than gradient descent.

Fig. 4
figure 4

Simulations of LiH in the presence of a 10−4 error rate per gate and varying amounts of shot noise. Each point indicates the fraction of 100 trials which, after 2000 iterations from uniformly random initial parameter states, finished within 1 mHartree of the true ground state of LiH. For imaginary time (the blue lines), the horizontal axis indicates the number of shots NA used in sampling each element of the coefficient matrix A, every iteration. The number of shots NC used in measuring each Hamiltonian term in each element of the gradient vector, as employed by imaginary time evolution, is varied between the blue lines. For gradient descent, the horizontal axis is NC


In this work, we have proposed a method to efficiently simulate imaginary time evolution using hybrid quantum-classical computing. We have applied our method to finding the ground-state energy of quantum systems, and have tested its performance on H2 and LiH. As imaginary time evolution outperformed gradient descent at this task, we believe our method provides a competitive alternative to conventional classical optimisation routines. We will examine this further in future work. We expect that our method would also be suitable for solving general optimisation problems, in conjunction with the QAOA, especially given its performance with randomly chosen initial states.

Our method can also be used to prepare a thermal (Gibbs) state, ρT = eH/T/Tr[eH/T] of Hamiltonian H at temperature T. Sampling from a Gibbs distribution is an important aspect of many machine-learning algorithms, and so we believe that our method is applicable to problems in quantum machine learning. Moreover, while previous methods to prepare the Gibbs state46,47 require long gate sequences (and hence, fault tolerance), our method can be implemented using a shallow circuit. Our algorithm can also be combined with recently proposed error mitigation techniques,28,48,49,50 and so is suitable for current quantum hardware.

Although exact imaginary time evolution deterministically propagates a good initial state to the ground state in the limit that τ → ∞, our variational method may still converge to higher-energy states, if the chosen ansatz is not sufficiently powerful. In future work, we will investigate how our method may be optimally applied to a variety of tasks in chemistry, optimisation and machine learning. This will include developing suitable ansätze for a range of problems.