Introduction

Quantum computing is beginning to show promising proof-of-principle calculations, especially in quantum chemistry. Calculations of the binding energies for molecules such as H2 (ref. 1) and BeH2 (ref. 2) have been done using small, noisy quantum computers. Applications in machine learning have also been shown on quantum hardware.3 Quantum computing is entering the noisy intermediate-scale quantum era.4 Full fault-tolerant error correction is still many years away; near-term quantum computers will have a limited number of qubits, each qubit being noisy. Methods that reduce noise and correct errors without doing full error correction on every qubit will help extend the range of interesting problems that can be solved in the near term. Each qubit is a very valuable resource for near-term quantum computers; using them efficiently, and reusing them as often as possible, is imperative in enabling useful calculations on near-term devices.

Here we describe and demonstrate a simple scheme for reducing the effects of a variety of noise sources by reducing each source separately and summing the resulting corrections. In practice, the reduction could be accomplished via quantum error correction5 or by active engineering to reduce a noise rate, e.g., autonomous error correction6,7 or dynamical decoupling.8 Simple quantum error correction schemes have already been shown in superconducting circuits9,10 and such systems are approaching more complicated error correction schemes.11 In trapped ion quantum computers, a seven qubit color code has already been demonstrated.12 Quantum computing architectures are nearing the quality and size where a single qubit could be error corrected, but we are far from every qubit being corrected. The scheme we present could make use of this limited error correction, by sweeping through each qubit, correcting one qubit at a time, but is not dependent on error correction methods per se. We validate it with the variational quantum eigensolver (VQE),13,14 simulating the calculation of the ground state energies of H2 and LiH. We assume that a single qubit has its error rate reduced by some means, while the other qubits retain all of their error. Our results show that the scheme reduces the needed quality of each qubit drastically; for ‘chemical accuracy’, error rates can be up to two orders of magnitude larger. We apply this scheme to multiple noise sources, including amplitude decay, dephasing, thermal noise, and correlated noise. We stress that the scheme can be used to reduce the environmental error from any measured observable, not just those used in VQE. Furthermore, this scheme is not restricted to algorithms on quantum computers; it could potentially be used in quantum sensing. Quantum error correction has been proposed as a method to reduce environmental noise in quantum metrology,7,15 allowing a probe to reach the Heisenberg limit.16 Our scheme could reduce the number of error correcting qubits necessary for such a method.

Results

Noise sources and removal

We simulate four types of noise sources, represented by Lindblad operators: amplitude damping (\(L^1\)), dephasing (\(L^2\)), thermal (\(L^{th}\)), and correlated noise (\(L^c\)):

$$\begin{array}{l}L^1(\rho ) = \gamma _1{\cal D}[\sigma ](\rho ),\\ L^2(\rho ) = \gamma _2{\cal D}\left[ {\sigma ^\dagger \sigma } \right](\rho ),\\ L^{th}(\rho ) = \gamma _{th}\left( {n_{th} + 1} \right){\cal D}[\sigma ](\rho ) + \gamma _{th}n_{th}{\cal D}\left[ {\sigma ^\dagger } \right](\rho ),\\ L^c(\rho ) = \gamma _c{\cal D}\left[ {\sigma _1^\dagger \sigma _2} \right](\rho ) + \gamma _c{\cal D}\left[ {\sigma _1\sigma _2^\dagger } \right](\rho ),\end{array}$$
(1)

where σ is an annihilation operator and \({\cal D}[C](\rho ) = C\rho C^\dagger - \frac{1}{2}(C^\dagger C\rho + \rho C^\dagger C)\). These Lindblad terms are applied to each qubit or to various combinations of qubits. The parameters in Eq. (1) are \(\gamma _1 = \frac{1}{{T_1}}\), the amplitude damping rate; \(\gamma _2 = \frac{2}{{T_2^ \ast }}\), twice the dephasing rate; \(\gamma _{th}\), the thermalization rate; \(n_{th}\), the thermal occupation (taken to be 0.5); and \(\gamma _c\), the correlated noise rate. We have used this formalism in previous work.17,18,19 For our calculations we assume that reduction of an error source corresponds to reducing the relevant parameter, γ, by some known amount.
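As an illustration, the dissipator \({\cal D}[C](\rho )\) and the single-qubit terms of Eq. (1) can be written in a few lines of NumPy. This is a minimal sketch, not the QuaC implementation used for the results; the function names, the basis ordering (|0〉, |1〉 with σ = |0〉〈1|), and the example state are assumptions made for illustration.

```python
import numpy as np

# Lowering (annihilation) operator sigma = |0><1|; basis ordering (|0>, |1>) is assumed.
SIGMA = np.array([[0.0, 1.0], [0.0, 0.0]], dtype=complex)


def dissipator(c, rho):
    """D[C](rho) = C rho C^dag - (1/2)(C^dag C rho + rho C^dag C)."""
    cd = c.conj().T
    cdc = cd @ c
    return c @ rho @ cd - 0.5 * (cdc @ rho + rho @ cdc)


def amplitude_damping(rho, gamma1):
    """L^1(rho) = gamma_1 D[sigma](rho)."""
    return gamma1 * dissipator(SIGMA, rho)


def dephasing(rho, gamma2):
    """L^2(rho) = gamma_2 D[sigma^dag sigma](rho)."""
    return gamma2 * dissipator(SIGMA.conj().T @ SIGMA, rho)


def thermal(rho, gamma_th, n_th=0.5):
    """L^th(rho) with thermal occupation n_th (0.5 in the text)."""
    down = gamma_th * (n_th + 1.0) * dissipator(SIGMA, rho)
    up = gamma_th * n_th * dissipator(SIGMA.conj().T, rho)
    return down + up


# Illustrative state |+><+|: every Lindblad term should be traceless,
# so the evolution preserves Tr(rho) = 1.
plus = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2.0)
rho_plus = np.outer(plus, plus.conj())
for name, term in [("L1", amplitude_damping(rho_plus, 0.3)),
                   ("L2", dephasing(rho_plus, 0.2)),
                   ("Lth", thermal(rho_plus, 0.1))]:
    print(name, "trace:", abs(np.trace(term)))
```

With these conventions, dephasing leaves the populations untouched and damps the coherences at rate γ2/2, which is consistent with γ2 being twice the dephasing rate.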

It is important to understand the exact process of the reduction of the relevant error parameters and what effect that has on the relevant error properties of the underlying qubits. For instance, if some active engineering process reduces the error only fractionally, the fractional reduction must be known to make full use of our method; if it is only known approximately, there will still be a correction, but it will not be as drastic. Furthermore, if the error reduction strategy decreases the relevant source of error, but increases a different source of error, the method will not directly be able to account for the additional error. For instance, quantum error correction could be used to reduce the error by correcting each qubit, one at a time. Quantum error correction, however, naturally introduces new overhead in number of qubits (due to the encoding) and in number of gates (due to the increased complexity of logical operations, especially two-qubit operations, on encoded qubits), which could introduce new types of errors into the circuit evaluation. This additional complexity would make the original, noisy evaluation differ by more than just a reduction of the error on each qubit by a partial amount. Furthermore, the quantum error correction algorithms likely to be implemented in the near-term will only partially decrease the error, not eliminate it, even without all these complications. In this work, we assume gates are separated by one time unit and the error rates are given in inverse time units.

Variational quantum eigensolver

Here, we briefly overview the variational quantum eigensolver (VQE). VQE solves for an approximate, variational ground state energy of a parameterized wavefunction ansatz, |ψ(θ)〉. The variational principle ensures that E0, the true ground state energy of the Hamiltonian H, is always less than or equal to E, the energy of a parameterized wavefunction ansatz. E is evaluated on the quantum computer; the parameters θ are optimized using a classical computer. Classical computing methods such as variational quantum Monte Carlo20 also make use of the variational principle. The hope of a quantum realization is that quantum computers can efficiently prepare non-trivial states which would be more difficult to prepare on a classical computer. While methods like quantum phase estimation21 can give generally more accurate energies, VQE requires shorter circuits and has a natural robustness to noise.2,13,22 VQE is not limited to quantum chemistry; it has also been used to study problems in nuclear physics.23 When using VQE for quantum chemistry, the second quantized quantum chemistry Hamiltonian is transformed into a qubit Hamiltonian using a transformation such as Jordan–Wigner.24
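The variational principle underlying VQE can be illustrated classically on a toy problem. The sketch below assumes a hypothetical single-qubit Hamiltonian H = Z + 0.5 X and a one-parameter ansatz |ψ(θ)〉 = Ry(θ)|0〉; neither is taken from the calculations in this paper, and the classical grid scan merely stands in for the quantum energy evaluation and the classical optimizer.

```python
import numpy as np

# Hypothetical toy Hamiltonian (not one of the molecular Hamiltonians in the text).
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
X = np.array([[0.0, 1.0], [1.0, 0.0]])
H = Z + 0.5 * X


def ansatz(theta):
    """|psi(theta)> = Ry(theta)|0> = (cos(theta/2), sin(theta/2))."""
    return np.array([np.cos(theta / 2.0), np.sin(theta / 2.0)])


def energy(theta):
    """E(theta) = <psi(theta)|H|psi(theta)>."""
    psi = ansatz(theta)
    return float(psi @ H @ psi)


# Classical "optimizer": a plain grid scan over the single parameter.
thetas = np.linspace(0.0, 2.0 * np.pi, 2001)
energies = np.array([energy(t) for t in thetas])

e0 = np.linalg.eigvalsh(H)[0]   # exact ground state energy
e_best = energies.min()         # best variational energy on the grid

print("exact E0:", e0)
print("best variational E:", e_best)
print("E >= E0 everywhere:", bool(np.all(energies >= e0 - 1e-12)))
```

For this real-valued Hamiltonian the ansatz can reach the exact ground state, so the best variational energy approaches E0 from above; for a restricted ansatz it would stay strictly above E0.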

Correction scheme

First consider the Lindblad master equation,

$$\frac{{{\mathrm{d}}\rho }}{{{\mathrm{d}}t}} = L(\rho ) = \mathop {\sum}\limits_{i = 1}^m {\kern 1pt} L_i(\rho ).$$
(2)

A formal solution for a density matrix evolving from time t to t + τ and satisfying this equation is

$$\rho (t + \tau ) = V_\tau (\rho (t)),$$
(3)

where

$$V_\tau () = {\mathrm{exp}}[\tau L()].$$
(4)

We use () above to indicate that L and Vτ are superoperators that take in an operator within the brackets to generate a new one. To first order in τ and taking L() to be the sum over Lindblad operators of Eq. (2),

$$V_\tau () \approx 1 + \tau \mathop {\sum}\limits_{i = 1}^m {\kern 1pt} L_i().$$
(5)

Applying gate 1 via application of unitary operator U1, evolving under Vτ, applying gate 2 (U2), etc., up to gate G (UG) corresponds exactly to a final density matrix given by

$$\rho (T) = U_GV_\tau \left( {U_{G - 1} \cdots V_\tau \left( {U_2V_\tau \left( {U_1\rho (0)U_1^\dagger } \right)U_2^\dagger } \right) \cdots U_{G - 1}^\dagger } \right)U_G^\dagger ,$$
(6)

where T = (G − 1)τ is the time after the last gate has been applied. Notice that Eq. (6) is not symmetric, since Vτ is a superoperator.

Equation (6) is ρ(T) for the case of all m error sources present. The corresponding ρa(T) with no error terms present is simply:

$$\rho _a(T) = U_GU_{G - 1} \cdots U_2U_1\rho (0)U_1^\dagger U_2^\dagger \cdots U_{G - 1}^\dagger U_G^\dagger .$$
(7)

The density matrices resulting from reducing error sources i = 1, 2, …, m separately are

$$\rho _i(T) = U_GV_\tau ^i(U_{G - 1} \cdots V_\tau ^i(U_2V_\tau ^i(U_1\rho (0)U_1^\dagger )U_2^\dagger ) \cdots U_{G - 1}^\dagger )U_G^\dagger .$$
(8)

\(V_\tau ^i\) is the corresponding Lindblad evolution operator that contains a suitably scaled \(L_i\) but has all other terms the same, e.g., to first order in τ,

$$V_\tau ^i() \approx 1 + \tau \mathop {\sum}\limits_{j \ne i}^m {\kern 1pt} L_j() + \tau (1 - f_i)L_i(),$$
(9)

where \(f_i\) is the fraction of noise removed from source i.

Now we consider the new density matrix defined as

$$\tilde \rho (T) = \rho (T) - \mathop {\sum}\limits_{i = 1}^m \frac{1}{{f_i}}(\rho (T) - \rho _i(T)).$$
(10)

Inserting Eqs. (5) and (9) into Eq. (10) leads, as detailed in the Supplementary Material, to

$$\tilde \rho (T) = \rho _a(T) + {\cal O}(\tau ^2)$$
(11)

being correct to first order in τ, i.e., all first order error terms exactly cancel, with remaining error terms on the order of \(\tau ^2\) and higher. In contrast, the uncorrected density matrix ρ(T) contains first order error terms.

We should note that Eq. (10) could also be obtained via a linear expansion of the density matrix about the noise rates and a finite difference approximation to the first derivatives. However, we prefer our propagator-based derivation since it provides a more physical context as well as a means for rigorous error analysis in terms of the underlying propagation and Lindblad noise terms. The general philosophy of error mitigation via additional measurements and linear response can also be found in somewhat more sophisticated approaches such as quantum subspace expansion.25
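The finite-difference route mentioned above can be sketched as follows. This is an illustrative outline under the assumption that each noise source i is controlled by a single rate \(\gamma _i\) and that ρ(T) is differentiable in these rates; it is not the derivation given in the Supplementary Material. Expanding about zero noise,

$$\rho (T) \approx \rho _a(T) + \mathop {\sum}\limits_{i = 1}^m \gamma _i\frac{{\partial \rho (T)}}{{\partial \gamma _i}},$$

while reducing source i alone, \(\gamma _i \to (1 - f_i)\gamma _i\), gives

$$\rho _i(T) \approx \rho (T) - f_i\gamma _i\frac{{\partial \rho (T)}}{{\partial \gamma _i}}\quad \Rightarrow \quad \frac{1}{{f_i}}\left( {\rho (T) - \rho _i(T)} \right) \approx \gamma _i\frac{{\partial \rho (T)}}{{\partial \gamma _i}}.$$

Subtracting these finite-difference estimates of the first-order terms from ρ(T) reproduces the combination in Eq. (10), with a remainder that is second order in the noise rates:

$$\rho (T) - \mathop {\sum}\limits_{i = 1}^m \frac{1}{{f_i}}\left( {\rho (T) - \rho _i(T)} \right) \approx \rho _a(T) + {\cal O}(\gamma ^2).$$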

Since the density matrix itself is corrected by this procedure, all observables are also accurate to first order in τ. Equation (10) immediately leads to the observable correction formula, traced with the observable, A:

$$\begin{array}{*{20}{l}} {\tilde A} \hfill & = \hfill & {{\mathrm{Tr}}(\tilde \rho (T)A)} \hfill \\ {} \hfill & = \hfill & {{\mathrm{Tr}}(\rho (T)A) - \mathop {\sum}\limits_{i = 1}^m \frac{1}{{f_i}}[{\mathrm{Tr}}(\rho (T)A) - {\mathrm{Tr}}(\rho _i(T)A)]} \hfill \\ {} \hfill & = \hfill & {\left\langle A \right\rangle - \mathop {\sum}\limits_{i = 1}^m \frac{1}{{f_i}}\left( {\left\langle A \right\rangle - \left\langle {A_i} \right\rangle } \right),} \hfill \end{array}$$
(12)

which is also accurate to first order in τ, whereas the uncorrected observable, 〈A〉, has first order error terms.
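A small numerical check of Eq. (12) can be sketched with NumPy. The example below is a toy, not one of the VQE calculations reported here: it assumes a single qubit with two noise sources (amplitude damping and dephasing, so m = 2), a hypothetical three-gate circuit of Hadamards, the observable A = X, equal rates γ1 = γ2 = 1, and f = 0.1 for both sources. Noise between gates is applied with the first-order propagator of Eq. (5), following the gate ordering of Eq. (6).

```python
import numpy as np

SIGMA = np.array([[0, 1], [0, 0]], dtype=complex)   # lowering operator |0><1|
X = np.array([[0, 1], [1, 0]], dtype=complex)
HAD = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2.0)

# Two noise sources on one qubit: amplitude damping and dephasing (m = 2).
JUMPS = [SIGMA, SIGMA.conj().T @ SIGMA]
GATES = [HAD, HAD, HAD]   # hypothetical toy circuit; ideal output state is |+>


def dissipator(c, rho):
    cd = c.conj().T
    return c @ rho @ cd - 0.5 * (cd @ c @ rho + rho @ cd @ c)


def expectation(rates, tau, observable=X):
    """Tr(rho(T) A) from Eq. (6), using the first-order V_tau of Eq. (5)."""
    rho = np.array([[1, 0], [0, 0]], dtype=complex)   # start in |0><0|
    for k, u in enumerate(GATES):
        if k > 0:   # noisy idle period of length tau between consecutive gates
            drho = sum(g * dissipator(c, rho) for g, c in zip(rates, JUMPS))
            rho = rho + tau * drho
        rho = u @ rho @ u.conj().T
    return float(np.real(np.trace(rho @ observable)))


def corrected_expectation(rates, tau, fractions):
    """Eq. (12): subtract the scaled difference for each separately reduced source."""
    full = expectation(rates, tau)
    result = full
    for i, f in enumerate(fractions):
        reduced = list(rates)
        reduced[i] = (1.0 - f) * rates[i]
        result -= (full - expectation(reduced, tau)) / f
    return full, result


ideal = expectation([0.0, 0.0], tau=0.0)
for tau in (0.02, 0.01, 0.005):
    full, corr = corrected_expectation([1.0, 1.0], tau, [0.1, 0.1])
    print(f"tau={tau}: uncorrected error={abs(full - ideal):.2e}, "
          f"corrected error={abs(corr - ideal):.2e}")
```

Working the algebra by hand for this toy circuit with equal rates γ, the uncorrected error is −γτ + γ²τ² while the corrected error is −(1 − f/2)γ²τ², so halving τ should roughly halve the former and quarter the latter, in line with the slopes of 1 and 2 discussed for the figures.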

Suppose, for example, that m = n, where n is the number of qubits, and each qubit is noisy. The strength of Eq. (12) is that, for near-term quantum computing without the possibility of perfect error correction of all n qubits, only \({\cal O}(n)\) computations involving reducing the error on just one qubit (to yield the 〈Ai〉) are required, along with the original calculation with no error reduction to yield 〈A〉. Intuitively, the difference 〈A〉 − 〈Ai〉 represents a fraction fi of the first order noise contribution from noise source i; scaling it by \(\frac{1}{{f_i}}\) recovers the full first order contribution, which is then subtracted from the noisy expectation value, leaving a result with less noise. This cancellation relies on the expectation value being built up from many measurements. Each expectation value (〈A〉 and 〈Ai〉) contains contributions from measurements with no error, as well as measurements with errors. Our correction scheme cancels out some of the measurements with error, while leaving the result with no error, leading to a better calculation of the observable. Our simulations of the application of this scheme to VQE indicate that Eq. (12) does indeed yield a substantial improvement for real algorithms. Our method lends itself naturally to use on a quantum computer, where calculations can be repeated and techniques exist for the reduction of error, but it is not restricted to just that. Any quantum system in which an observable can be repeatedly measured and each noise source can be reduced separately can make use of the scheme to obtain a more accurate result; quantum error correction assisted quantum metrology is a prime example.7,15

In VQE, the measured observable in question is the energy, E, of the wavefunction ansatz. We first optimize the parameters of the wavefunction; this can be done either with no error reduction or potentially reducing the error on a single qubit. Once a set of optimal parameters is found, the expectation value of the energy is evaluated on the quantum computer with no error reduction, and then reducing the error on each qubit separately. For an n qubit problem, this involves only an additional \({\cal O}(n)\) evaluations of the energy on the quantum computer with error reduction on one of the qubits each time. Once all of the energies are measured, Eq. (12) is used to obtain \(\tilde E\).
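The final post-processing step described above amounts to a few lines of classical arithmetic. The sketch below assumes the n + 1 energies have already been measured; the function name and all numerical values are hypothetical and chosen only so that the toy data are exactly linear in the noise, in which case Eq. (12) recovers the noise-free value.

```python
def corrected_energy(e_noisy, e_reduced, fractions):
    """Apply Eq. (12) to a noisy energy and the energies with one source reduced.

    e_noisy   : energy measured with no error reduction
    e_reduced : energies measured with the error reduced on one qubit at a time
    fractions : fraction f_i of noise removed in each of those runs
    """
    if len(e_reduced) != len(fractions):
        raise ValueError("need one removed-noise fraction per reduced-noise energy")
    correction = sum((e_noisy - e_i) / f_i for e_i, f_i in zip(e_reduced, fractions))
    return e_noisy - correction


# Hypothetical four-qubit data (made-up numbers, in Ha), purely first order in the noise.
e_ideal = -1.137                              # made-up noise-free energy
first_order = [0.010, 0.020, 0.005, 0.015]    # made-up per-qubit error contributions
f = 0.1                                       # 10% of the noise removed on one qubit per run

e_noisy = e_ideal + sum(first_order)
e_reduced = [e_noisy - f * c for c in first_order]

e_tilde = corrected_energy(e_noisy, e_reduced, [f] * len(first_order))
print("uncorrected:", e_noisy)
print("corrected:  ", e_tilde)
```

On real data the second and higher order terms do not vanish, so the corrected energy would differ from the noise-free value by a residual of order τ2.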

H2

We first consider the hydrogen molecule, H2, at equilibrium bond length 0.74 Å and an STO-3G basis, resulting in a four qubit circuit. We use the unitary coupled cluster singles doubles (UCCSD) ansatz (ref. 1; 166 gates) and note that each gate is applied sequentially with one time unit between each gate. We made no effort to apply gates in parallel. The parameters of the wavefunction ansatz were optimized with noise on every qubit. We then sweep through the qubits, reducing the noise from each qubit by 10% (fi = 0.1 for each noise source i). The final energy is then calculated by using our correction scheme, Eq. (12).

Results for typical amplitude damping (γ1) and dephasing (γ2) noises are shown in Fig. 1, representing three different environmental regimes. One regime corresponds to γ1 = γ2 and is similar to a superconducting qubit quantum computer,2 whereas the regime with just γ2 is consistent with spin26 and trapped ion27 quantum computers. Though we do not know of a quantum architecture where γ1 is the dominant noise source, we include it as a third regime for completeness. The x-axis in this figure (and all subsequent figures) shows the error rate multiplied by the number of gates in the circuit multiplied by the number of qubits, and roughly represents the expected number of errors for a given circuit evaluation. For the γ1 = γ2 case, however, the expected number of errors would be doubled, as there are two possible error channels. We see that chemical accuracy (1.6 mHa, the horizontal black line) with respect to the noise-free result can be obtained with error rates more than one order of magnitude higher than without our method; on average, the error rates can be 35× larger. To obtain chemical accuracy, corrections of 50 mHa are applied. On this log–log plot, the slopes of the two lines represent their scaling with error. The correction amount (which is approximately the error of the uncorrected energy) has a slope of 1, whereas the corrected energy has a slope of 2, representing the cancellation of the first order error terms.

Fig. 1

Error cancellation for ground state H2 with the UCCSD ansatz under amplitude damping (γ1), dephasing (γ2), and both amplitude damping and dephasing (γ1, γ2) noise sources, removing only 10% of the noise on each qubit separately. The horizontal line represents ‘chemical accuracy’, 1.6 mHa. Dashed lines with triangles are the amount of correction applied by our scheme. Solid lines with circles are the difference between the corrected energy and the energy evaluated with no noise

The Supplementary Material provides results for this same example removing 100% of the noise. This results in only a slight benefit, compared to removing only a small fraction (10%) of the noise, allowing for error rates to be, on average, 45× larger (30% larger than the fractional case). By removing all of the noise, there is no need for the linear extrapolation done by the factor \(\frac{1}{f}\); the value at zero-noise is directly measured for each qubit. Additionally, the determination of the fraction of removed noise, f, can be complicated. This could be done by measuring the relevant error properties before and after the error reduction strategy is applied, but errors in the process could give an unreliable estimation of f. Even a small deviation from the true value of f will eventually result in the slope of the corrected energy becoming 1 (instead of 2) at sufficiently small error rates. This is due to the difference in the corrected energy and the noise-free energy becoming smaller than the precision of the determined f. If an unknown f is assumed to be 1, for instance, the method will still remove a fraction of the error, giving results as if the quantum computer had error rates 1−f; there are still first order errors, but they are smaller due to the correction scheme. Because the difference between the result when a small fraction of the noise is removed and when all of the noise is removed is relatively marginal, we focus on the latter case for the remainder of this paper, where Eq. (12) takes a simpler form with f = 1. The Supplementary Material also provides results for a different wavefunction ansatz, one similar to that of ref. 2. The results are consistent when using this separate ansatz. Note that these figures make no reference to the true, full configuration interaction energy. 
The difference plotted is between the corrected energy and the energy evaluated with no noise; the quality of the wavefunction ansatz cannot be determined from these figures alone. The ordering of the results provides a limited sensitivity analysis for different quantum computing architectures. Similar to ref. 22, we note that VQE is more sensitive to amplitude damping noise than to dephasing noise.
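The sensitivity to a mis-estimated f discussed above can be made explicit with a short first-order argument. The symbols \(\epsilon _i\) (the first order error contribution of source i), \(A_a = {\mathrm{Tr}}(\rho _a(T)A)\), and \(f_i^\prime\) (the assumed, possibly incorrect, fraction) are introduced here for illustration only. Writing

$$\left\langle A \right\rangle = A_a + \mathop {\sum}\limits_{i = 1}^m \epsilon _i + {\cal O}(\tau ^2),\quad \left\langle A \right\rangle - \left\langle {A_i} \right\rangle = f_i\epsilon _i + {\cal O}(\tau ^2),$$

and applying Eq. (12) with \(f_i^\prime\) in place of the true \(f_i\) gives

$$\tilde A^\prime = \left\langle A \right\rangle - \mathop {\sum}\limits_{i = 1}^m \frac{1}{{f_i^\prime }}\left( {\left\langle A \right\rangle - \left\langle {A_i} \right\rangle } \right) = A_a + \mathop {\sum}\limits_{i = 1}^m \left( {1 - \frac{{f_i}}{{f_i^\prime }}} \right)\epsilon _i + {\cal O}(\tau ^2).$$

A residual first order term therefore survives whenever \(f_i^\prime \ne f_i\), which is why the slope of the corrected energy eventually returns to 1 at small error rates. In the special case \(f_i^\prime = 1\), the residual is \((1 - f_i)\epsilon _i\), consistent with the statement that the result behaves as if the error rates were scaled by 1 − f.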

We should note that the very largest gains in accuracy in Fig. 1 (and subsequent figures) occur at relatively low error rates. The initial, uncorrected algorithm that is run should be reasonably reliable, e.g., the error rate should be on the order of 1/(no. gates) or lower, in inverse time units with gates separated by one time unit. This error rate (e.g., 1/166 ≈ 0.006 for the case of Fig. 1) may be quite daunting to achieve in practice; partial error correction on all qubits could be used to achieve this threshold.

Even though the wavefunction parameters were optimized in the presence of noise, the final energies evaluated at the different parameter sets for the fully error corrected circuit differ very little. The parameters optimized at the largest error rates give an energy that differs by only 1.7 mHa from that of the parameters optimized without error, when both are evaluated with no noise. We therefore optimize the parameters once with no noise and use those parameters for evaluation at all noise rates in the following examples.

Our correction scheme is not limited to environmental noise sources, such as those modeled by γ1 and γ2. Any noise source describable by a Lindblad superoperator can benefit, as long as each noise source can be isolated and reduced independently of all other noise sources. To demonstrate this, we apply a thermal noise source with rate γth and a correlated noise source with rate γc to the H2 UCCSD example, Fig. 2, where we now assume that the noise has been completely removed (fi = 1 for all i). We see trends similar to those for amplitude damping and dephasing; corrections of ≈70 mHa bring the energy to within chemical accuracy at error rates almost 50 times larger than otherwise needed. While perhaps experimentally difficult, thermal noise could be reduced by selectively cooling each qubit. The correction scheme applied to our correlated noise term reveals some subtleties of the method. Our correlated noise Lindblad term, \(L^c\) of Eq. (1), naturally has terms from two qubits. When we sweep through the qubits, we now reduce fully all terms which involve a single qubit; this leads to the removal of each \(L^c\) term twice, once for each qubit in each \(L^c\). This double counting can be corrected by taking half of the calculated correction from each qubit. Our scheme relies on the fact that each term is reduced only once. As long as the noise sources of interest and their controlled reduction are well understood, the scheme can be applied. In our correlated noise term, \(L^c\), every term is removed exactly twice and the calculated correction can be halved. Though we focus on fractional noise reduction, the results will still hold if the noise is instead increased by a controlled, known amount (say, doubled), for each qubit separately. The ‘correction’ would be the difference between the inflated noise run and the normal noise run, scaled by the appropriate factor. This is similar in spirit to refs. 28,29,30, where the total noise of the system is artificially increased and the results are subsequently extrapolated to the zero noise limit.
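The double counting for correlated noise can be illustrated with a purely first-order toy model. The sketch below assumes four qubits, one correlated term per qubit pair, complete removal (f = 1) of all terms touching the selected qubit, and made-up error contributions; it only demonstrates the bookkeeping, not the Lindblad dynamics.

```python
from itertools import combinations

N_QUBITS = 4
E_IDEAL = -1.0   # made-up noise-free value
PAIRS = list(combinations(range(N_QUBITS), 2))
# Made-up first-order error contribution of the correlated term on each qubit pair.
PAIR_ERROR = {pair: 0.001 * (1 + pair[0] + 2 * pair[1]) for pair in PAIRS}


def energy(removed_qubit=None):
    """Toy first-order energy; every pair term touching `removed_qubit` is removed (f = 1)."""
    return E_IDEAL + sum(err for pair, err in PAIR_ERROR.items()
                         if removed_qubit not in pair)


e_noisy = energy()

# Sweeping over qubits removes each pair term twice (once per qubit in the pair).
sweep = sum(e_noisy - energy(q) for q in range(N_QUBITS))

e_naive = e_noisy - sweep          # overcorrects: every term subtracted twice
e_halved = e_noisy - 0.5 * sweep   # halved correction: every term subtracted once

print("noisy :", e_noisy)
print("naive :", e_naive)
print("halved:", e_halved)
```

In this linear toy the halved correction recovers the noise-free value exactly, while the naive sum overshoots by the full first-order error.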

Fig. 2

Error cancellation for ground state H2 with the UCCSD ansatz under thermal noise (γth) and correlated noise (γc) with total error removal on each qubit separately. See caption of Fig. 1 for explanation of linetypes and symbols

LiH

We also study LiH in the STO-3G basis at bond length 1.74 Å, using 12 qubits, with over 12,000 gates. Results for the UCCSD ansatz are shown in Fig. 3. The correction for LiH is even more dramatic than for H2. Corrections of 100–200 mHa bring the answer to within chemical accuracy, and error rates can be roughly two orders of magnitude higher, ranging from 68× for only γ1 noise to 128× with only γ2 noise. This example provides confidence that the correction scheme will work for larger circuits. For LiH the procedure works even better than for the smaller circuits of H2. This will likely be true for ever larger circuits: the number of first order errors increases with increasing number of qubits, and these are all approximately corrected. While the number of second order errors also increases, second order errors go as \(\gamma ^2\) and \(\tau ^2\), and will be small.

Fig. 3

Error cancellation for the ground state of LiH with the UCCSD ansatz under amplitude damping (γ1), dephasing (γ2), and both amplitude damping and dephasing (γ1, γ2) noise sources with total error removal on each qubit separately. See caption of Fig. 1 for explanation of linetypes and symbols

Our method can reduce the error in any measured observable, and so has application to a wide range of quantum algorithms. Further study of other algorithms, e.g., quantum phase estimation21 and the quantum approximate optimization algorithm,31 is needed to understand the broader impact of this scheme. Applications to quantum metrology warrant further study: by reducing the overhead for quantum error correction assisted metrology,7,15 the noise floor of a quantum sensor could be lowered at a substantially reduced cost in number of quantum systems. The magnitude of the correction can also be used as a metric for measuring how close to the true answer one is, without knowledge of the true answer. The method relies on the error characteristics with no reduction being the same as the error characteristics after reduction; in quantum error correction, this may not always be true. Further study into practical methods of reducing the error and ensuring that the error characteristics before and after are similar is ongoing.

Discussion

We presented a simple scheme to reduce error in quantum algorithms, applying it to simulations of the variational quantum eigensolver. Reducing the error on each qubit (e.g., through error correction or some active or engineering process) one at a time and summing the scaled difference from the result with no error reduction can provide a significant correction to the observable (e.g., energy). This scheme reduces the coherence requirements to obtain chemical accuracy; error rates can be up to two orders of magnitude larger. The overhead is relatively low: an additional \({{\cal O}\left( {n} \right)}\) evaluations with single qubit error reduction.

The general philosophy used in this work for accounting for errors, replacing all-qubit error correction with additional measurements and making use of an error model, has also been exploited in interesting recent work28,29,30 involving increasing a single global noise parameter, along with extrapolation to zero error assuming a polynomial expansion for the error or via a quasi-probability formalism. Our approach focuses on changing the error associated with each individual qubit or error source separately. Furthermore, our method is developed with focus on mitigating the environmental noise of each qubit, whereas these other interesting works focus on the noise from gate applications. In the near term, with a very limited number of qubits and gates, it is likely that the approach of refs 28,29,30 is more feasible. However, our method, presented in this paper, could be used for quantum memories and other quantum devices, where the dominant noise source is not the gates, but the environmental interactions. Our approach offers an alternative that is particularly relevant when increasing a global noise parameter (such as the noise on the gates) is infeasible. Other, more recent work focuses on using a stabilizer formalism, where a specific, relevant quantity (such as a global symmetry of the system) is used to detect errors occurring in a quantum circuit.32,33 If the symmetry is violated, then an error occurred and the run in question should be excluded. Such approaches cannot eliminate all errors and could be used in tandem with the method we propose in this paper. As quantum devices with larger numbers of qubits come online, utilization of many forms of error mitigation will be necessary to make full use of the power of these small, noisy devices.

Methods

Generation of VQE circuits

We use the open source package OpenFermion34 to generate the qubit Hamiltonian, starting from quantum chemistry integrals generated via Psi4.35 We use the unitary coupled cluster singles doubles (UCCSD) ansatz,13 with OpenFermion34 and ProjectQ36 to generate the circuits. We optimize the parameters of the wavefunctions using both Nelder-Mead simplex37 and COBYLA.38

Time evolution

Consider a system of n qubits characterized by a time-dependent density matrix ρ(t) subjected to a sequence of k = 1, 2, …, G gate operations, each being a unitary transformation Uk on ρ: \(\rho \to U_k\rho U_k^\dagger\). We assume a time τ elapses between consecutive gate operations. During these times ρ evolves under a Lindblad master equation, Eq. (2). The dynamics is simulated with the high-performance density matrix evolution program QuaC,39 using the different noise sources noted in the main text.