Quantifying the effect of gate errors on variational quantum eigensolvers for quantum chemistry

Variational quantum eigensolvers (VQEs) are leading candidates to demonstrate near-term quantum advantage. Here, we conduct density-matrix simulations of leading gate-based VQEs for a range of molecules. We numerically quantify their level of tolerable depolarizing gate-errors. We find that: (i) The best-performing VQEs require gate-error probabilities between $10^{-6}$ and $10^{-4}$ ($10^{-4}$ and $10^{-2}$ with error mitigation) to predict, within chemical accuracy, ground-state energies of small molecules with 4 to 14 orbitals. (ii) ADAPT-VQEs that construct ansatz circuits iteratively outperform fixed-circuit VQEs. (iii) ADAPT-VQEs perform better with circuits constructed from gate-efficient rather than physically-motivated elements. (iv) The maximally-allowed gate-error probability, $p_c$, for any VQE to achieve chemical accuracy decreases with the number $N_{\mathrm{II}}$ of noisy two-qubit gates as $p_c \underset{\sim}{\propto} N_{\mathrm{II}}^{-1}$. Additionally, $p_c$ decreases with system size, even with error mitigation, implying that larger molecules require even lower gate-errors. Thus, quantum advantage via gate-based VQEs is unlikely unless gate-error probabilities are decreased by orders of magnitude.


I Introduction
Calculating the ground-state energy of a molecular Hamiltonian is an important but hard task in computational chemistry [1]. For strongly correlated systems, exact classical approaches quickly become infeasible as system sizes exceed 100 spin-orbitals. Other, approximate, methods often lack accuracy [1-5]. This makes quantum computers an attractive alternative. A potential route to quantum-chemistry simulations relies on the quantum-phase-estimation algorithm (QPEA) [1,6]. However, the QPEA requires executing millions of gates on error-corrected hardware [7]. Realizing such hardware requires significant resource overheads and gate-error probabilities below a minimal threshold [8]. For example, the surface code [9,10] requires thousands of physical qubits to implement a single logical qubit at a gate-error probability of $10^{-4}$ [11]. In view of these requirements, the QPEA is not yet feasible.
To reduce the qubit number and gate-error requirements, the variational quantum eigensolver (VQE) was proposed [12]. The VQE is a hybrid quantum-classical algorithm that uses a classical optimizer and a parameterized quantum circuit, the 'ansatz', to estimate ground-state energies. Combined with significant hardware developments [13], VQEs have facilitated successful demonstrations of quantum computational chemistry for small systems [12,14-18]. These demonstrations have been aided by VQE algorithms' abilities to correct for certain errors [15,19,20]. Despite these achievements, there are still significant hurdles to overcome for VQEs to become useful. First, the short, gate-efficient ansätze used in small-scale experimental demonstrations [12,14-18] face optimization difficulties for larger systems. This is related to the emergence of barren plateaus (vanishing gradients), which are more likely when the ansatz is unrelated to the Hamiltonian [21,22]. Current research, for example on growing the ansatz circuit iteratively (the ADAPT-VQE algorithm [23]), is aimed at avoiding or mitigating the issue of barren plateaus [21]. Another significant hurdle comes from gate-error rates in hardware. Although current noisy intermediate-scale quantum devices [24-26] have sufficiently many qubits to run VQEs for molecules with more than 100 spin-orbitals [13], their gate-error rates are too high.
At present, efforts to ameliorate the gate-error issue aim to either reduce ansatz circuit depths [23,27-30] or implement elaborate error-mitigation schemes [31-33]. However, VQEs are often benchmarked in the absence of gate errors, with circuit depths and CNOT counts used as proxies of their noise resilience [30]. It has been argued that the maximum viable circuit depth for a VQE ansatz circuit is given by the reciprocal of the gate-error probability, $1/p$ [1]. More rigorously, given a gate-error probability $p$, the maximum VQE circuit depth which cannot be simulated classically scales as $O(p^{-1})$ [34,35]. A research question, which remains under-explored, is to quantify the gate-error probabilities that VQEs can tolerate. Specifically, considering the analogy of a surface code, which has a well-defined fault-tolerance threshold [11], we aim to find the maximally allowed gate-error probability, below which a certain VQE estimates a certain molecule's energies within chemical accuracy. Quantifying the maximally allowed gate-error probability allows the noise resilience of leading VQEs to be ranked, and provides useful goals for the hardware community.
In this article, we numerically quantify how high a gate-error probability VQEs can tolerate while still operating successfully. More specifically, using density-matrix simulations, we simulate the ground-state search of leading, gate-based VQEs for a range of molecules. In the presence of depolarizing noise, we show that: (i) Even the best-performing VQEs require gate-error probabilities $p_c$ on the order of $10^{-6}$ to $10^{-4}$ (without error mitigation) in order to predict molecular ground-state energies within chemical accuracy of $1.6 \times 10^{-3}$ Hartree. This is significantly below the fault-tolerance threshold of the surface code [11]. For small systems, error mitigation can be employed such that the required $p_c$ values can be improved to $10^{-4}$ to $10^{-2}$. (ii) ADAPT-VQEs tend to tolerate higher gate-error probabilities than VQEs that use fixed ansätze, such as UCCSD and k-UpCCGSD. (iii) ADAPT-VQEs tolerate higher gate-error probabilities when circuits are synthesized from gate-efficient [27-29,36], rather than physically-motivated [23], elements. We support these claims by estimating, in the presence of depolarizing noise, the scaling relation between the maximally tolerable gate-error probability $p_c$ and the number $N_{\mathrm{II}}$ of noisy (two-qubit) gates. Our results indicate that $p_c \underset{\sim}{\propto} N_{\mathrm{II}}^{-1}$ for any gate-based VQE. (iv) We find that the maximally allowed gate-error probability, $p_c$, decreases with system size, with and without error mitigation. This shows that larger molecules would likely require even lower gate-errors. We conclude that substantial quantum advantage in VQE-based quantum chemistry is unlikely, unless gate-errors are significantly reduced, or error-corrected hardware is realized, or error-mitigation protocols are improved and made scalable.

II.I ADAPT-VQEs
In this work, we investigate several classes of VQE algorithms. Our study prioritizes VQEs with short ansatz circuits, as these are expected to be more noise resilient [29,30]. Specifically, we consider ADAPT-VQEs, which have comparatively short ansatz circuits [23,29] and the ability to mitigate rough parameter landscapes [21]. We further consider UCCSD [37] and k-UpCCGSD [38] as prototypes of fixed-ansatz VQEs, the latter for its comparatively shallow ansatz circuits [30]. Before we outline the results of our noise-resilience investigation, we describe the workings of the (ADAPT-)VQE.
The main idea of VQEs is to use shallow ansatz circuits, defined by a set of parameters $\theta$, to generate entangled trial states $\rho(\theta)$. A classical optimizer is then used to vary $\theta$ and minimize the energy-expectation value $E(\theta)$ of the molecular Hamiltonian $H$. Provided that the ansatz is sufficiently expressive, the Rayleigh-Ritz variational principle allows $\min_\theta E(\theta)$ to approach the molecular ground-state energy $E_0$ [1]. ADAPT-VQEs use a classical optimizer in two ways [23]: to conduct the Rayleigh-Ritz minimization with respect to a parameterized quantum state; and to iteratively construct the ansatz that generates the parameterized state itself. A quantum computer is used to calculate the energy-expectation value of the parameterized state. Consider the state $\rho_n(\theta_1, \ldots, \theta_n)$ generated by the ansatz $U_n$, which is parameterized by $n$ parameters. In ADAPT-VQE, the classical optimizer and the quantum computer work to find a minimum-energy expectation value:
$$E_n = \min_{\theta_1, \ldots, \theta_n} \mathrm{Tr}\left[H \rho_n(\theta_1, \ldots, \theta_n)\right]. \tag{3}$$
An ADAPT-VQE iteratively adds parameterized elements to its ansatz to construct $\rho_n(\theta_1, \ldots, \theta_n)$ such that $E_1 > \ldots > E_n$ and $E_n$ approaches $E_0$.
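The iterative ansatz construction proceeds as follows. First, the ADAPT-VQE algorithm initializes a state $\rho_0$, usually the Hartree-Fock state [2]. Then, the algorithm generates a sequence of trial states by successively adding elements of the form
$$A_\alpha(\theta) = e^{\theta T_\alpha},$$
picked from a finite pool $P$ of operators (see below). Here, $T_\alpha$, for $\alpha \in [1, \ldots, |P|]$, are anti-Hermitian operators. Thus, the unitary ansatz grows as
$$U_n(\theta_1, \ldots, \theta_n) = A_n(\theta_n)\, U_{n-1}(\theta_1, \ldots, \theta_{n-1}).$$
The ansatz element $A_n(\theta_n) \in P$ is typically picked to yield the steepest energy gradient. For each value of $\alpha$, a quantum computer evaluates the energy expectation value after adding $A_\alpha(\theta_n)$ in the $n$th step:
$$E_n^{(\alpha)}(\theta_n) = \mathrm{Tr}\left[H A_\alpha(\theta_n)\, \rho_{n-1}\, A_\alpha^\dagger(\theta_n)\right].$$
The ADAPT-VQE algorithm then picks the element $A_n \equiv A_{\alpha=\alpha_n}$ with
$$\alpha_n = \underset{\alpha}{\operatorname{argmax}} \left| \frac{\partial E_n^{(\alpha)}(\theta_n)}{\partial \theta_n} \right|_{\theta_n = 0}. \tag{8}$$
Alternatively, one may define a sub-pool $S \subset P$ of operators with the largest gradients and let the algorithm pick the element with the largest energy difference:
$$\alpha_n = \underset{\alpha \in S}{\operatorname{argmax}} \left[ E_{n-1} - \min_{\theta_n} E_n^{(\alpha)}(\theta_n) \right]. \tag{9}$$
After choosing the $n$th ansatz element $A_n(\theta_n)$, a classical computer optimizes and updates the parameters $\theta_1, \ldots, \theta_n$ to minimize the energy expectation value $E_n$ in Eq. (3). Provided that the energy reduction satisfies $E_{n-1} - E_n > \epsilon$, for some energy precision $\epsilon$, the iterative algorithm continues. When $E_{n-1} - E_n \le \epsilon$, the algorithm halts at some final length $n = N$, and outputs $E_N \equiv E_n$ as the estimate of $E_0$.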
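The grow-select-reoptimize loop described above can be sketched on a toy problem. This is only an illustrative sketch under stated assumptions, not the simulator used in this work: the two-qubit Hamiltonian `H`, the two-element pool `POOL`, the reference state `REF`, and the function names are all hypothetical, and the selection uses the gradient-based rule with a noiseless state vector. It relies on `scipy.optimize.minimize` with the BFGS method for the parameter update.

```python
import numpy as np
from scipy.optimize import minimize

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Hypothetical two-qubit Hamiltonian (an assumption; NOT a molecular Hamiltonian).
H = (0.2 * np.kron(Z, Z)
     + 0.1 * (np.kron(X, X) + np.kron(Y, Y))
     - 0.4 * (np.kron(Z, I2) - np.kron(I2, Z)))

# Hypothetical pool of anti-Hermitian generators T = iP (P a Pauli string), so T @ T = -I.
POOL = [1j * np.kron(X, Y), 1j * np.kron(Y, X)]

REF = np.array([0, 1, 0, 0], dtype=complex)  # reference state |01>


def element(T, theta):
    """Return exp(theta * T), using T @ T = -I."""
    return np.cos(theta) * np.eye(T.shape[0]) + np.sin(theta) * T


def state(ansatz, thetas):
    """Apply the ansatz elements, first element first, to the reference state."""
    psi = REF.copy()
    for T, theta in zip(ansatz, thetas):
        psi = element(T, theta) @ psi
    return psi


def energy(ansatz, thetas):
    psi = state(ansatz, thetas)
    return float(np.real(np.vdot(psi, H @ psi)))


def adapt_vqe(eps=1e-8, max_iter=10):
    """Grow the ansatz until the energy reduction per step is at most eps."""
    ansatz, thetas = [], []
    e_prev = energy(ansatz, thetas)
    for _ in range(max_iter):
        psi = state(ansatz, thetas)
        # Gradient at theta = 0 of appending exp(theta T) is <psi|[H, T]|psi>.
        grads = [abs(np.real(np.vdot(psi, (H @ T - T @ H) @ psi))) for T in POOL]
        trial = ansatz + [POOL[int(np.argmax(grads))]]
        # Re-optimize all parameters, starting the new one at zero.
        result = minimize(lambda t: energy(trial, t), np.append(thetas, 0.0), method="BFGS")
        if e_prev - result.fun <= eps:
            break  # the new element does not lower the energy enough: halt
        ansatz, thetas, e_prev = trial, list(result.x), float(result.fun)
    return e_prev, ansatz, thetas
```

Calling `adapt_vqe()` returns the final energy together with the chosen generators and their optimized parameters; for this toy Hamiltonian the result can be compared against `np.linalg.eigvalsh(H)[0]`.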
In this work, we focus on the three main types of ADAPT-VQEs: fermionic-ADAPT-VQE, QEB-ADAPT-VQE and qubit-ADAPT-VQE. (Efficient gate-representations for their relevant ansatz elements can be found in [28,29].) These algorithms differ in their ansatz-element pools $P$.
First, we consider the fermionic-ADAPT-VQE [23]. As the name suggests, this algorithm uses a pool of operators that closely simulate the physics of fermionic excitations. The pool is formed from
$$T_{ik} = a_i^\dagger a_k - a_k^\dagger a_i \quad \text{and} \quad T_{ijkl} = a_i^\dagger a_j^\dagger a_k a_l - a_k^\dagger a_l^\dagger a_i a_j.$$
Here, $a_i^\dagger$ and $a_i$ are fermionic creation and annihilation operators acting on the $i$th orbital. Throughout this work, we represent these operators using the Jordan-Wigner transformation [39], which expresses them in terms of the Pauli operators $X_i$, $Y_i$, $Z_i$ acting on the $i$th qubit. The fermionic-ADAPT-VQE leads to shallower and more gate-efficient circuits than UCCSD [23]. Further, choosing fermionic excitations along the gradient of minimum energy produces $H$-tailored circuits, which can potentially avoid barren plateaus [21].

Second, we consider the QEB-ADAPT-VQE [29,36]. This algorithm uses a pool of operators that nearly (up to a $\pm$ sign) simulate the physics of fermionic excitations. The pool is formed from
$$\tilde{T}_{ik} = Q_i^\dagger Q_k - Q_k^\dagger Q_i \quad \text{and} \quad \tilde{T}_{ijkl} = Q_i^\dagger Q_j^\dagger Q_k Q_l - Q_k^\dagger Q_l^\dagger Q_i Q_j,$$
where $Q_i^\dagger$ and $Q_i$ are known as qubit creation and annihilation operators, respectively. Due to the CNOT efficiency of its pool, the QEB-ADAPT-VQE can find ground-state and excited-state energies with fewer CNOT gates than the fermionic-ADAPT-VQE [28,29,36].
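To make the difference between fermionic and qubit excitations concrete, the following sketch builds both kinds of ladder operators as matrices and compares single excitations. It assumes one common Jordan-Wigner convention, $a_i = \left(\prod_{r<i} Z_r\right) \tfrac{1}{2}(X_i + \mathrm{i}Y_i)$ with $|1\rangle$ as the occupied state, and takes the qubit operator $Q_i$ to be the same expression without the $Z$ string; the convention and all function names are assumptions made for illustration.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
LOWER = 0.5 * (X + 1j * Y)  # |0><1| under the assumed convention


def kron_all(ops):
    return reduce(np.kron, ops)


def qubit_annihilation(i, n):
    """Q_i: lowering operator on qubit i, identity elsewhere."""
    return kron_all([LOWER if k == i else I2 for k in range(n)])


def fermion_annihilation(i, n):
    """a_i: lowering operator on qubit i with a Z string on qubits r < i."""
    return kron_all([Z if k < i else (LOWER if k == i else I2) for k in range(n)])


def single_excitation(annihilation, i, k, n):
    """Anti-Hermitian single excitation b_i^dag b_k - b_k^dag b_i."""
    b_i, b_k = annihilation(i, n), annihilation(k, n)
    return b_i.conj().T @ b_k - b_k.conj().T @ b_i


def anticommutator(a, b):
    return a @ b + b @ a


def commutator(a, b):
    return a @ b - b @ a
```

With three qubits, `single_excitation(fermion_annihilation, 0, 1, 3)` and `single_excitation(qubit_annihilation, 0, 1, 3)` coincide, whereas the excitations between qubits 0 and 2 differ by the intermediate $Z$ operator. That extra parity string is what the qubit-excitation pool omits to save CNOT gates.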
Finally, we consider the qubit-ADAPT-VQE [27]. This algorithm uses a pool of gate-efficient elements without physical motivation. The pool is formed from segments of Pauli-operator strings, that is, products of Pauli operators $\sigma_i \in \{X_i, Y_i, Z_i\}$ acting on the $i$th qubit. In previous works, this pool has been found to generate the most shallow and CNOT-efficient circuits for ADAPT-VQEs [27]. In our simulations, we use a pool formed from $XY$-Pauli strings of length two and four with an odd number of $Y$'s. It is possible to use reduced pools [27], but at the expense of reduced circuit efficiency of the final ansatz [29].
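As an illustration of the pool just described, the sketch below enumerates all $XY$-Pauli strings of length two and four with an odd number of $Y$'s on a given number of qubits. The function name and the representation of a string as a tuple of `(letter, qubit)` pairs are assumptions chosen for this example; the enumeration does not apply any of the pool reductions mentioned above.

```python
from itertools import combinations, product


def xy_pauli_pool(n_qubits, lengths=(2, 4)):
    """Enumerate XY-Pauli strings with an odd number of Y's.

    Each string is a tuple of (letter, qubit) pairs, e.g. (("X", 0), ("Y", 1)).
    """
    pool = []
    for length in lengths:
        for qubits in combinations(range(n_qubits), length):
            for letters in product("XY", repeat=length):
                if letters.count("Y") % 2 == 1:
                    pool.append(tuple(zip(letters, qubits)))
    return pool
```

For each choice of two qubits there are 2 admissible strings (XY and YX), and for each choice of four qubits there are 8, so the unreduced pool grows as $2\binom{n}{2} + 8\binom{n}{4}$ with the number of qubits $n$.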
Typically, the fermionic-ADAPT-VQE and the qubit-ADAPT-VQE use the gradient-based decision rule expressed in Eq. (8). On the other hand, the original QEB-ADAPT-VQE uses the energy-based decision rule, shown in Eq. (9). These algorithms are summarized in the flow chart of Fig. 1, and in the pseudocode of Supplementary Note 1.
To demonstrate the benefits of iteratively-grown ansätze, we compare them to a typical fixed-ansatz-VQE method: the UCCSD-VQE [17,37]. In Supplementary Note 4, we extend this comparison to the k-UpCCGSD algorithm [38]. Owing to its linear scaling of circuit depth with qubit number, this algorithm was recently put forward as the leading fixed-ansatz VQE [30]. We simulate the workings of the fixed-ansatz methods using the aforementioned fermionic and QEB elements.
Given the breadth of work on VQEs [30], it is not possible to perform an exhaustive analysis of all existing algorithms. Nevertheless, the analytical results in Sec. II.V, and the low circuit depths provided by ADAPT-VQE [30], suggest that our results provide a lower bound on the requirements for gate-based VQE algorithms to operate successfully. However, there exist algorithms that differ greatly from typical VQEs, and could deserve future attention. We discuss some of these, and the reasons for our exclusion of them, below. We will not consider iterative qubit coupled cluster (iQCC) [40] and ClusterVQE [41] algorithms. We do not anticipate these algorithms to be feasible options to study strongly correlated systems, whose simulation using quantum algorithms provides the most benefit over classical algorithms. We also omit the DISCO-VQE [42]. Due to its large jumps in Hilbert space during the discrete optimizations of the ansatz, we expect DISCO-VQE to lack tolerance to barren plateaus. These problems may be overcome in future improvements of these VQE algorithms. We leave the design of improved algorithms, and the noise-evaluation of them, to future articles. Finally, we omit the ctrl-VQE algorithm [43]. Although highly interesting, this Hamiltonian algorithm operates with device-tailored pulses, rather than quantum gates, and thus lies outside the scope of this work.

II.II Density-matrix simulations
To investigate the effect of noise on gate-based VQE, we constructed a VQE-tailored density-matrix simulator, expanding the state-vector circuit simulator of Ref. [29]. We represent molecular orbitals in the Slater-type-orbital-3-Gaussians (STO-3G) spin-orbital basis set [44], with the option of frozen orbitals. The OpenFermion-Psi4 package [45,46] is used to generate the second-quantized Hamiltonian and to perform the Jordan-Wigner transformation [39]. Ansatz parameters are optimized using Nelder-Mead [47] or gradient-based (BFGS) [48] methods in SciPy [49].
We note that, due to the wide array of quantum-computing platforms and their contrasting qubit-control implementations, no noise model can be simultaneously realistic and platform-agnostic. In this work, we model noise by applying single-qubit depolarizing noise to the target qubit $i$ after the application of each two-qubit CNOT gate. Our noise channel, Eq. (19), is thus a single-qubit depolarizing channel acting on qubit $i$, parameterized by the gate-error probability $p \in [0, 1]$.
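A minimal density-matrix sketch of this noise model is given below. It assumes the Pauli-sum form of the depolarizing channel, $\rho \mapsto (1-p)\rho + \tfrac{p}{3}(X\rho X + Y\rho Y + Z\rho Z)$, which is one common convention and is adopted here as an assumption; the helper names and the two-qubit example are hypothetical and are not taken from the simulator of this work.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
# CNOT with qubit 0 as control and qubit 1 as target.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)


def embed(op, qubit, n_qubits):
    """Place a single-qubit operator on `qubit` of an n-qubit register."""
    out = np.array([[1.0 + 0j]])
    for k in range(n_qubits):
        out = np.kron(out, op if k == qubit else I2)
    return out


def depolarize(rho, qubit, p, n_qubits):
    """Single-qubit depolarizing channel (assumed Pauli-sum convention)."""
    out = (1 - p) * rho
    for pauli in (X, Y, Z):
        s = embed(pauli, qubit, n_qubits)
        out = out + (p / 3) * (s @ rho @ s.conj().T)
    return out


def noisy_cnot(rho, p):
    """Ideal CNOT followed by depolarizing noise on the target qubit."""
    rho = CNOT @ rho @ CNOT.conj().T
    return depolarize(rho, qubit=1, p=p, n_qubits=2)


# Example: prepare |+0>, then apply one noisy CNOT to obtain a noisy Bell state.
plus_zero = np.kron(np.array([1, 1]) / np.sqrt(2), np.array([1, 0])).astype(complex)
rho_bell = noisy_cnot(np.outer(plus_zero, plus_zero.conj()), p=0.01)
purity = float(np.real(np.trace(rho_bell @ rho_bell)))
```

Because each Pauli error on the target maps the Bell state to an orthogonal Bell state, the purity of `rho_bell` equals $(1-p)^2 + p^2/3$ under this convention, which provides a quick sanity check of the channel.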
In real devices, noise from two-qubit gates completely dominates the noise from single-qubit gates [25,50-52]. Thus, we ignore the latter. Additionally, we exclude state preparation and measurement errors, which are often lower in magnitude than the accumulated two-qubit gate errors [50], and can be mitigated efficiently in experiments [53-58]. (We note that ADAPT-VQE algorithms have high measurement requirements, such that measurement errors may prevent the algorithm from reaching the global minimum energy. This topic requires further investigation.) Depolarizing noise is commonly used to represent local and Markovian gate errors when assessing both NISQ [59-61] and quantum-error-correction [11,62-64] algorithms. More realistic models can include thermal-relaxation noise (dephasing and amplitude damping) [65] and device-specific gate errors derived from gate-set-tomography data [66]. When $T_1 \approx T_2$, thermal-relaxation noise can be approximated using our depolarizing noise model [62]. This is a reasonable model for superconducting hardware [50]. On the other hand, when $T_2 \ll T_1$, dephasing noise dominates and our depolarizing noise model is less accurate. This is common in trapped-ion devices [51,52] and spin qubits [67] and has recently been investigated in Ref. [68]. Moreover, existing VQE algorithms require unrealistically low error rates to give chemically accurate energies. Any attempt to scale down the error rates in realistic noise models to these low levels must be theoretically justified. This is challenging for a complex, multi-parameter model. Hence, we exclude noise models based on gate-set tomography. Finally, we do not consider coherent errors, since their effect can be suppressed by randomized compiling [69] and dynamical decoupling [70,71]. Randomized compiling [72] can also be used to convert coherent errors to stochastic errors. Additionally, VQE algorithms are somewhat resilient to coherent errors [12,73]. Thus, in this work we focus on incoherent errors. Note that, if VQE algorithms were studied with a coherent noise model, their perceived performance may be greater.
When simulating the smallest molecules (H$_2$ and H$_4$) we apply our noise channel [Eq. (19)] after each application of a CNOT gate. This gate-by-gate method is computationally expensive. To facilitate feasible simulations of molecules larger than H$_4$, we approximate each noisy ansatz element by a corresponding noiseless ansatz evolution and a noise-inducing evolution. The noise-inducing evolution corresponds to depolarizing noise applied to each qubit in accordance with the number of times that qubit was a CNOT target in the ansatz element. We observe that this lower-bounds the effect of noise. For example, for H$_4$, applying noise after each CNOT with gate-error probability $p = p'$ gives approximately the same energy accuracy as applying total noise after each element with $p \approx 1.3p'$. Consequently, our simulations of larger molecules should not be compared directly with those for H$_2$ and H$_4$. A detailed illustration of our noise approximation is given in Supplementary Note 3.
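One way to picture the element-level approximation is that a qubit which was a CNOT target $m$ times within an element receives $m$ depolarizing channels in a row once the noiseless element has been applied. The sketch below is illustrative only: it assumes the Pauli-sum convention $\rho \mapsto (1-p)\rho + \tfrac{p}{3}\sum_\sigma \sigma\rho\sigma$ and assumes that the accumulated noise is implemented as repeated application of the single-qubit channel (the exact construction is the one in Supplementary Note 3). Under these assumptions, $m$ channels compose into a single depolarizing channel with effective probability $p_{\mathrm{eff}} = \tfrac{3}{4}\left[1 - (1 - \tfrac{4p}{3})^m\right]$.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def depolarize(rho, p):
    """Single-qubit depolarizing channel (assumed Pauli-sum convention)."""
    return (1 - p) * rho + (p / 3) * (X @ rho @ X + Y @ rho @ Y + Z @ rho @ Z)


def effective_probability(p, m):
    """Probability of the single channel equivalent to m channels of strength p."""
    return 0.75 * (1 - (1 - 4 * p / 3) ** m)


def depolarize_repeated(rho, p, m):
    """Apply the channel m times, e.g. once per CNOT that targeted this qubit."""
    for _ in range(m):
        rho = depolarize(rho, p)
    return rho


# Example single-qubit pure state (hypothetical values).
psi = np.array([np.cos(0.3), np.exp(0.5j) * np.sin(0.3)])
rho_in = np.outer(psi, psi.conj())
rho_stepwise = depolarize_repeated(rho_in, p=0.01, m=5)
rho_lumped = depolarize(rho_in, effective_probability(0.01, 5))
```

For small $p$, `effective_probability(p, m)` is approximately `m * p`, so the accumulated noise on a qubit grows roughly linearly with the number of CNOT gates that targeted it.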
Energy accuracy is the key metric of VQE performance. It is defined as the difference $\Delta E(p, n)$ between $E_n(p)$ and $E_{\mathrm{FCI}}$. Here, $E_n(p)$ is the VQE-calculated energy with gate-error probability $p$ in the $n$th iteration and $E_{\mathrm{FCI}}$ is the energy given by the full-configuration-interaction method [5]. To make larger-molecule simulations tractable, we estimate $\Delta E$ as follows. We first grow the ansatz circuit $C_n$ in noiseless, unitary simulations until the $n$th iteration, for which the energy accuracy $\Delta E(0, n)$ first drops below a cut-off energy precision $\epsilon_t$: $\Delta E(0, n) < \epsilon_t$. Then, we approximate $\Delta E(p, n)$ by simulating the implementation of $C_n$ with noise on our density-matrix simulator. Thus, $\Delta E(p, n)$ may depend on the iteration $n$. As demonstrated in Supplementary Note 2, ansatz growth and optimization in the presence of noise has little effect on the noise probability required for chemical accuracy.
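Given a routine that returns $\Delta E(p)$ for a fixed circuit, the maximally allowed gate-error probability can be located numerically. The sketch below is one possible approach rather than the procedure used in this work: it assumes that $\Delta E(p)$ increases monotonically with $p$ and bisects on a logarithmic grid. The function names, the search bounds, and the linear toy model `toy_delta_e` are all hypothetical stand-ins for a noisy simulation.

```python
import math

CHEMICAL_ACCURACY = 1.6e-3  # Hartree


def critical_probability(delta_e, p_low=1e-9, p_high=1e-1,
                         target=CHEMICAL_ACCURACY, iterations=200):
    """Largest p with delta_e(p) < target, assuming delta_e is increasing in p."""
    if delta_e(p_low) >= target:
        raise ValueError("chemical accuracy is not reached even at p_low")
    if delta_e(p_high) < target:
        raise ValueError("chemical accuracy still holds at p_high; raise p_high")
    for _ in range(iterations):
        p_mid = math.sqrt(p_low * p_high)  # midpoint on a logarithmic scale
        if delta_e(p_mid) < target:
            p_low = p_mid
        else:
            p_high = p_mid
    return p_low


def toy_delta_e(p):
    """Hypothetical accuracy model: noiseless offset plus a linear noise response."""
    return 1e-4 + 20.0 * p


p_c_toy = critical_probability(toy_delta_e)
```

For the toy model, the threshold sits where $10^{-4} + 20p = 1.6 \times 10^{-3}$, that is at $p = 7.5 \times 10^{-5}$, which the bisection reproduces.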

II.III Comparison between ADAPT-VQEs with noise
In this section, we benchmark the noise resilience of ADAPT-VQEs using our density-matrix simulator. We study H$_2$, H$_4$, LiH, HF and BeH$_2$. Our simulations were conducted using a parameter optimization cut-off of $\nabla_{\vec{\theta}} E \le \epsilon_O = 10^{-6}$ Hartree and an ansatz-growth cut-off of $E_{n-1} - E_n \le \epsilon = 10^{-12}$ Hartree. In our simulations of the larger molecules, we used an ansatz-truncation cut-off of $\Delta E(0, n) < \epsilon_t = 10^{-4}$ Hartree. Below, we use $\Delta E(p) = \Delta E(p, n_{\mathrm{final}})$ to refer to the energy accuracy at the final ansatz length $n = n_{\mathrm{final}}$. Because of the significant skepticism towards error-mitigation strategies [35,74,75], we omit such strategies from the analyses presented in this section and investigate error mitigation separately in Sec. II.VI.
The inset of Fig. 2(a) shows how $\Delta E(p)$ varies with $p$ for H$_2$. The values of $p \in [0, 0.02]$ include the well-known surface-code fault-tolerance threshold [8,11] as well as the gate-error probability of currently available quantum hardware [25,50-52]. All tested VQE algorithms require extremely small gate-error probabilities if they are to improve on the Hartree-Fock energy approximation, even for the simple H$_2$ molecule. The region of chemical accuracy is too small to show in the inset. In real implementations of ADAPT-VQE algorithms, energies exceeding the Hartree-Fock approximation would not be achieved, since, in this case, adding elements to the ansatz does not improve the initial energy accuracy. Here, we add noise to a noiselessly-grown ansatz such that these energies are shown. These observations motivated us to reduce significantly the range of $p$ used in the rest of this study.
The rest of Fig. 2 shows our calculations of $\Delta E(p)$ as a function of $p$ for all considered molecules. The region of chemical accuracy is highlighted by yellow shading. We emphasize six general trends supported by our data. First, the maximally-allowed gate-error probabilities $p_c$ for computing ground-state energies within chemical accuracy are extremely small. For all molecules investigated in this study, the value of $p_c$ is on the order of $10^{-6}$ to $10^{-4}$ (see Table 1 for details). These values are significantly below the fault-tolerance thresholds of leading error-correction protocols. Second, our simulations of H$_2$ and H$_4$ (1 Å) suggest that ADAPT-VQEs outperform fixed-ansatz methods. For a given pool of ansatz elements, the corresponding ADAPT-VQE algorithm leads to better energy accuracies than the corresponding fixed-ansatz VQE algorithm. Third, the efficient representation of fermionic excitations [28] improves the performance of the fermionic-ADAPT-VQE significantly. This representation reduces CNOT depth, but its scaling of CNOT depth with molecule size is still worse than the scaling of QEB and Pauli-string elements. The second and third observations support the claim [29] that the CNOT count is a useful estimator of a VQE's noise vulnerability. Fourth, the more gate-efficient (Pauli-string and QEB) pools outperform the most physically-motivated (fermionic) pool. The fermionic-ADAPT-VQE is consistently outperformed by either the qubit-ADAPT-VQE or the QEB-ADAPT-VQE. Fifth, sometimes the QEB-ADAPT-VQE outperforms the qubit-ADAPT-VQE and vice versa. For H$_2$, H$_4$ (1 Å) and BeH$_2$, the qubit-ADAPT-VQE outperforms the QEB-ADAPT-VQE. On the other hand, for HF, the QEB-ADAPT-VQE (energy-based decision rule) outperforms the qubit-ADAPT-VQE. For LiH, the QEB-ADAPT-VQE and the qubit-ADAPT-VQE perform similarly. Notably, for H$_4$ (3 Å), the qubit-ADAPT-VQE fails to add more than two elements to the ansatz. Hence, it never reaches chemical accuracy. This gives some indication that the qubit-ADAPT-VQE is worse than the QEB-ADAPT-VQE at simulating strongly correlated molecules. Sixth, different decision rules for QEB-ADAPT-VQEs yield different performances. For HF, LiH and BeH$_2$, the energy-reduction decision rule gives a better energy accuracy than the maximum-gradient rule. Conversely, for H$_4$ (1 Å and 3 Å) the gradient-based decision rule performs better.

Fig. 3 Colour plots representing the energy accuracy (from FCI) at different gate-error probabilities and ansatz lengths, for three different molecules. Three iterative-growth methods are included: the fermionic-ADAPT-VQE with efficient elements, the QEB-ADAPT-VQE with the energy decision rule and the qubit-ADAPT-VQE (the plots for the remaining two methods are given in Supplementary Note 2). The yellow lines on the colour plot highlight the ansatz lengths that minimize the energy accuracies for each gate-error probability. The top-column figures are extracted from the colour plots by plotting the energy accuracy along these curves.
are presented in Table 1 and Fig. 6. After this optimal-truncation analysis, our overall conclusion remains unchanged: Even for the best-performing ADAPT-VQEs, the maximally allowed gate-error probability is on the order of $10^{-6}$ to $10^{-4}$.

II.V Analytical noise-susceptibility analysis
To analytically support our numerical results, we study the linear response of energy accuracy $\Delta E(p)$ to noisy perturbations of the unitary ansatz circuits. Then, we use our results to show that $p_c$ is roughly inversely proportional to the number $N_{\mathrm{II}}$ of noisy (two-qubit) gates.
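Noise susceptibility: From Fig. 2 we see that $\Delta E(p) \approx \chi' p$, for some constant $\chi'$. Inspired by this observation, we define the noise-susceptibility parameter:
$$\chi := \left. \frac{\partial \Delta E(p)}{\partial p} \right|_{p=0}.$$
Now, we show that $\chi \propto N_{\mathrm{II}}$ (details are given in Supplementary Note 6). If $p = 0$, an ansatz circuit $C$ can be expressed as a product of $R$ unitary gates:
$$U = G_R G_{R-1} \cdots G_1.$$
We use $R_{CX}$ to denote the set of indices for which $G_r$ is a noisy (CNOT) gate, and we use $i_r$ to denote the qubit which noise acts on. Further, we define a perturbed version of the target unitary $U$ as
$$U_p(\sigma, r, i_r) = G_R \cdots G_{r+1}\, \sigma_{i_r}\, G_r \cdots G_1,$$
where the Pauli gate $\sigma$ acts on qubit $i_r$ after the $r$th gate. The corresponding energy expectation values are
$$E_U = \mathrm{Tr}\left[H U \rho_0 U^\dagger\right] \quad \text{and} \quad E_{U_p}(\sigma, r, i_r) = \mathrm{Tr}\left[H\, U_p(\sigma, r, i_r)\, \rho_0\, U_p^\dagger(\sigma, r, i_r)\right].$$
Usually, $E_U$ is close to $E_0$. Thus, we interpret $E_{U_p}(\sigma, r, i_r)$ as a noise-induced excitation.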
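We call $E_{U_p}(\sigma, r, i_r) - E_U$ the noise-induced fluctuation. The average noise-induced fluctuation of the ansatz is
$$\delta E = \frac{1}{3 N_{\mathrm{II}}} \sum_{r \in R_{CX}} \sum_{\sigma \in \{X, Y, Z\}} \left[ E_{U_p}(\sigma, r, i_r) - E_U \right].$$
In Supplementary Note 6, we show that $\chi$ is proportional to the product $N_{\mathrm{II}}\, \delta E$ [Eq. (26)]. Below, we analyze this expression for the noise-susceptibility parameter.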
Simplified computations: The energy expectation values underlying $\chi$ can be simulated with unitary operations on a state vector. Such simulations are significantly simpler to perform than density-matrix simulations. Thus, we can more easily estimate the energy accuracy for small values of $p$:
$$\Delta E(p) \approx \chi p. \tag{27}$$
To test our method we compare Eq. (27) (black dotted lines) with some curves in Fig. 2. Equation (27) estimates the simulated data remarkably well. Next, we use our method to estimate $p_c$ for molecules too large to study with our density-matrix simulator. The estimates of $p_c$ for H$_2$O and NH$_2^-$ are listed in the right-most section of Tab. 1. We stress that Eq. (27) is an excellent predictor of $\Delta E(p)$ for the gate-error probabilities $p \in [0, p_c]$ which allow for chemically-accurate simulations.
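The state-vector shortcut can be checked on a toy circuit. The sketch below is a sanity check under stated assumptions, not the calculation of Supplementary Note 6: it assumes the Pauli-sum depolarizing convention $\rho \mapsto (1-p)\rho + \tfrac{p}{3}\sum_\sigma \sigma\rho\sigma$, for which the linear response is exactly $\chi = N_{\mathrm{II}}\,\delta E$, and it uses a hypothetical two-qubit circuit `GATES` and Hamiltonian `H`. The susceptibility obtained from Pauli insertions on a state vector is compared with a finite-difference derivative of the density-matrix energy.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)


def ry(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)


def embed(op, qubit):
    return np.kron(op, I2) if qubit == 0 else np.kron(I2, op)


# Hypothetical circuit: (unitary, noisy target qubit or None for a noiseless gate).
GATES = [(embed(ry(0.7), 0), None), (CNOT, 1), (embed(ry(-0.4), 1), None), (CNOT, 1)]
# Hypothetical Hamiltonian (an assumption, not a molecular Hamiltonian).
H = 0.5 * np.kron(Z, Z) + 0.3 * np.kron(X, I2) + 0.2 * np.kron(I2, Z)
PSI0 = np.array([1, 0, 0, 0], dtype=complex)


def energy_statevector(insert_at=None, pauli=None):
    """Energy of the circuit, optionally with one Pauli inserted after gate insert_at."""
    psi = PSI0.copy()
    for r, (gate, target) in enumerate(GATES):
        psi = gate @ psi
        if r == insert_at:
            psi = embed(pauli, target) @ psi
    return float(np.real(np.vdot(psi, H @ psi)))


def susceptibility():
    """Return (chi, delta_e) with chi = N_II * delta_e under the assumed convention."""
    e_u = energy_statevector()
    noisy = [r for r, (_, target) in enumerate(GATES) if target is not None]
    fluctuations = [energy_statevector(r, s) - e_u for r in noisy for s in (X, Y, Z)]
    delta_e = float(np.mean(fluctuations))
    return len(noisy) * delta_e, delta_e


def energy_density_matrix(p):
    """Energy with depolarizing noise on the target after every noisy gate."""
    rho = np.outer(PSI0, PSI0.conj())
    for gate, target in GATES:
        rho = gate @ rho @ gate.conj().T
        if target is not None:
            kicked = sum(embed(s, target) @ rho @ embed(s, target) for s in (X, Y, Z))
            rho = (1 - p) * rho + (p / 3) * kicked
    return float(np.real(np.trace(H @ rho)))


chi, delta_e = susceptibility()
p_small = 1e-6
finite_difference = (energy_density_matrix(p_small) - energy_density_matrix(0.0)) / p_small
```

The state-vector route needs only $3 N_{\mathrm{II}} + 1$ pure-state energy evaluations, whereas the density-matrix route stores and evolves a matrix whose dimension is squared relative to the state vector.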
Scaling: The energy fluctuations are bounded by the spectral range of $H$: $\delta E \le E_{\max} - E_0$. Thus, Eq. (26) suggests that noise susceptibility grows linearly with $N_{\mathrm{II}}$, as $\delta E$ is constant. Fig. 4 supports this claim. The curves indicate that $\chi \underset{\sim}{\propto} N_{\mathrm{II}}$ and $\delta E \approx O(1)$, for a variety of molecules, ADAPT-VQE algorithms and circuit depths. Combining these observations with Eq. (27), we estimate that
$$p_c \approx \frac{\Delta E_C}{\chi} \underset{\sim}{\propto} N_{\mathrm{II}}^{-1}, \tag{28}$$
where $\Delta E_C = 1.6 \times 10^{-3}$ Hartree (chemical accuracy). This result is supported by recent results in condensed matter systems [76]. The inverse proportionality between $p_c$ and $N_{\mathrm{II}}$ suggests that gate-error probabilities will have to reach extremely small values for useful chemistry calculations with VQE algorithms to be viable. Alternatively, we require improved VQE algorithms with shallower circuits and fewer noisy (two-qubit) gates.
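As a worked arithmetic example of this scaling, the sketch below evaluates $p_c \approx \Delta E_C / \chi$ for a few gate counts. It assumes $\chi = N_{\mathrm{II}}\,\delta E$ exactly (true for the Pauli-sum depolarizing convention, otherwise only up to a constant factor), and the value $\delta E = 0.1$ Hartree is purely hypothetical, chosen to show the trend rather than to describe any molecule in this study.

```python
CHEMICAL_ACCURACY = 1.6e-3  # Hartree


def estimate_pc(n_two_qubit_gates, delta_e):
    """Linear-response estimate p_c = Delta E_C / (N_II * delta E)."""
    chi = n_two_qubit_gates * delta_e
    return CHEMICAL_ACCURACY / chi


# Hypothetical average fluctuation of 0.1 Hartree per noisy gate.
for n_gates in (100, 1_000, 10_000):
    print(n_gates, estimate_pc(n_gates, delta_e=0.1))
```

Every tenfold increase in the number of noisy two-qubit gates lowers the tolerable gate-error probability tenfold, which is the inverse proportionality stated above.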

II.VI Quantum Error Mitigation
In the absence of error-corrected hardware, several strategies to mitigate the effect of noise have been suggested [31-33,77]. Quantum error mitigation is a family of strategies which generally rely on knowledge of a circuit, noise model, or both to generate a set of modified circuits. Sampling from these circuits can generate a better estimate of the noiseless circuit's output [77]. While these strategies have been demonstrated in simple VQE implementations [14,78,79], they suffer, in general, from exponential scaling of sample requirements with qubit number [35,74,75], potentially preventing their viability in useful NISQ VQE implementations. Indeed, leading reviews on quantum computational chemistry [1] state that 'it seems unlikely that error-mitigation methods alone would enable more than a small multiplicative increase in the circuit depth.' This unfavourable scaling has also been observed experimentally, where it has prevented the use of all but the most simple mitigation strategies [26].
The main goal of this work is to assess the required error rates for useful VQE implementations of molecules with more than 100 spin-orbitals. Due to the uncertainty around their scalability, as well as the unclear performance in the presence of time-dependent noise (particularly two-level-system defects [80] which drift in frequency), a study of this type should not include quantum error mitigation in its current form. Despite this, we believe it is relevant to extend our study to ascertain the maximally allowed gate-error probability $p_c$ to calculate molecular energies within chemical accuracy for an error-mitigation protocol with polynomial sampling overhead. To partially address this question, we repeat our numerical simulations using linear zero-noise extrapolation [31,32] with a noise multiplication factor of 3. Despite being biased and heuristic [81], we choose linear zero-noise extrapolation for its modest sampling overhead and numerical stability, which proved useful in recent large-scale demonstrations of error mitigation [26].
The results for H$_2$, H$_4$ and LiH are depicted in Fig. 5. Compared to their counterparts in Fig. 3(a), we note the following: (i) The maximally allowed error probability increases by one or two orders of magnitude. This demonstrates the utility of error mitigation to make VQE more viable, especially for smaller molecules. (ii) The resulting energy error displays a roughly parabolic behaviour. This is expected from the series expansion of the depolarizing noise and indicates that further improvement (at an increased sampling overhead) may be possible by using higher-order extrapolations. (iii) We note that while all VQE algorithms display an increased noise resilience from error mitigation, their relative $p_c$-ranking does not change. This suggests that a VQE algorithm with higher noise resilience in the absence of error mitigation would remain more noise resilient when error mitigation is applied. Finally, we put the improved gate-error probabilities $p_c$ into context by plotting them as crosses in Fig. 6. Given the sharp decrease of $p_c$ with the problem size $N$, in the presence and absence of error mitigation, it is unlikely that error mitigation will improve $p_c$ sufficiently for useful system sizes, $N > 100$. Ultimately, it remains an open question whether the unfavourable scaling of error mitigation prevents its use in realistic quantum-chemistry applications.
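The parabolic residual noted in (ii) can be reproduced with a few lines of arithmetic. The sketch below implements linear zero-noise extrapolation from energies at noise levels $p$ and $3p$, and applies it to a hypothetical quadratic energy model $E(p) = E_0 + a p + b p^2$; the function names and the coefficients are assumptions used only to illustrate the extrapolation formula, not values from our simulations.

```python
def linear_zne(energy_at, p, scale=3.0):
    """Linear extrapolation to p = 0 from energies at noise levels p and scale * p."""
    e_base = energy_at(p)
    e_scaled = energy_at(scale * p)
    return e_base - (e_scaled - e_base) / (scale - 1.0)


# Hypothetical noisy-energy model (Hartree): exact energy plus linear and quadratic terms.
E0, A, B = -1.0, 20.0, -150.0


def toy_energy(p):
    return E0 + A * p + B * p * p


p = 1e-3
unmitigated_error = toy_energy(p) - E0
mitigated_error = linear_zne(toy_energy, p) - E0
```

With a multiplication factor of 3, the extrapolated value is $(3E(p) - E(3p))/2$. The linear term cancels exactly and the residual is $-3bp^2$, so the mitigated error grows quadratically in $p$ rather than linearly.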

III Discussion
Any quantum algorithm aimed at near-term NISQ devices must be designed to tolerate some level of noise. In this work, we numerically quantify the maximally allowed depolarizing gate-error probabilities, $p_c$, required by leading gate-based VQEs to achieve chemically accurate energy estimates. Based on numerical simulations, we reach five conclusions. First, even the best-performing VQE algorithms require gate-error probabilities between $10^{-6}$ and $10^{-4}$, for the small molecules we assess. Such errors are at least an order of magnitude below state-of-the-art experiments [25,50] and the surface-code threshold [8,11]. If error mitigation is viable, the $p_c$ values can be improved to $10^{-4}$ to $10^{-2}$ with linear zero-noise extrapolation. Second, larger molecules tend to require longer ansatz circuits and thus lower gate-error probabilities, see Fig. 6. This is the case both with and without error mitigation. Third, in the presence of noise, ADAPT-VQEs can tolerate approximately an order of magnitude greater gate-errors $p_c$ than equivalent fixed-ansatz VQEs, including those with the shortest ansatz circuits [30,38]. Fourth, the more gate-efficient the ADAPT-VQE ansatz pool, the more noise resilient the algorithm. From a noise-resilience perspective, qubit excitations and Pauli-string excitations outperform fermionic excitations. Fifth, the maximum gate-error probability allowed to reach chemical accuracy is roughly inversely proportional to the number of CNOT gates: $p_c \underset{\sim}{\propto} N_{\mathrm{II}}^{-1}$. We now conclude this work with a couple of comments.
Fig. 6 Plot representing the noise probability required to reach chemical accuracy, $p_c$, for different ansätze and molecule sizes (number of orbitals). For molecules with the same number of orbitals, the mean probability is taken. The crosses and circles represent the noise probabilities required to reach chemical accuracy with and without error mitigation, respectively. The data without error mitigation is taken from Table 1, and the data with error mitigation is taken from the crosses in Fig. 5. Additionally, a recent state-of-the-art two-qubit gate error rate with superconducting qubits is shown in purple [88].
As opposed to a fault-tolerance threshold in error correction, the maximally allowed gate-error probability $p_c$ crucially depends on the size of the input problem, see Fig. 6. More specifically, $p_c$ tends to shrink as the number of spin orbitals $N$ increases. A key question for future research is to elucidate how fast $p_c$ decreases with $N$. Our numerical data in Fig. 6 suggests an exponential scaling, both with and without error mitigation. Meanwhile, assuming VQEs achieve molecular ground-state energies with polynomially shallow circuits ($N_{\mathrm{II}} = \mathrm{poly}(N)$), Eq. (28) suggests a polynomial scaling. Having analytical expressions for the decrease of $p_c$ with the number of spin orbitals $N$ would inform us whether quantum advantage is at all feasible for input problems beyond 100 spin orbitals.
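The polynomial-scaling argument can be spelled out in one line by combining the approximate inverse proportionality between $p_c$ and the CNOT count with the assumption of polynomially shallow circuits. In the sketch below, the prefactor $a$, the constant $c$ and the exponent $k$ are hypothetical placeholders and are not values obtained in this work.

```latex
\begin{equation*}
  p_c \approx \frac{a}{N_{\mathrm{II}}}
  \quad\text{and}\quad
  N_{\mathrm{II}} \le c\,N^{k}
  \quad\Longrightarrow\quad
  p_c \gtrsim \frac{a}{c}\,N^{-k} .
\end{equation*}
```

Under these assumptions $p_c$ would fall off no faster than a power law in $N$, in contrast to an exponential decay of the form $p_c \sim e^{-\gamma N}$ with some hypothetical rate $\gamma > 0$.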
While this study is entirely focused on gate-errors, other sources of noise may also be relevant. These include errors from state preparation and measurement as well as statistical noise due to sampling of expectation values from a limited number of shots. As mentioned when justifying the noise model, errors due to state preparation and measurement tend to be smaller than the accumulated gate errors, and there are widely-implemented methods to compensate for them [53][54][55][56][57][58]. However, in principle, measurement errors may lead to sub-optimal parameter values or operator choices during ansatz growth of ADAPT-VQE, which may prevent the algorithm from reaching the global minimum energy. A detailed analysis of such effects is left for future work.
While it is possible to sample any expectation value to accuracy $\epsilon$ in polynomially few shots [1,30], the scaling prefactor may lead to prohibitively large run-times [30,89,90,91]. This issue is particularly acute for ADAPT-VQE algorithms, where each growth step requires shots for both parameter optimization and element selection. In this case, the number of necessary gradient measurements for each ansatz growth step is greater than that for VQE parameter optimization, by a factor which scales linearly in the number of qubits [92]. Holistic studies of VQE run-times [30,89,90,91] provide predictions which vary greatly depending on the estimation methodology. The estimated run-times are often intractable without significant parallelization. The number of necessary measurements for parameter optimization can potentially be reduced via alternate groupings of Pauli operators [83][84][85], or tensor contraction of the Hamiltonian (such as by double factorization [86,87,93]). Despite this progress, run-time scaling remains a significant obstacle to overcome before ADAPT-VQEs can perform useful computations on real hardware. A balance must be found between run-time and the acceptable level of statistical noise. This is complicated by the combination of gate errors, measurement errors and statistical errors, which may affect VQEs adversely in a non-trivial way. We leave this as an open problem for the community as, in this work, we focus on the noise resilience of ADAPT-VQEs.
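The shot-count argument can be made concrete with the standard sampling estimate: the standard error of a mean over $M$ shots is $\sigma/\sqrt{M}$, so reaching accuracy $\epsilon$ takes roughly $\sigma^2/\epsilon^2$ shots. The Python sketch below applies this to a Hamiltonian written as a weighted sum of Pauli strings that are measured one at a time, using the upper bound of unit variance per string and shots split in proportion to the coefficient magnitudes. The coefficients are hypothetical placeholders rather than a molecular Hamiltonian, and measurement grouping is ignored.

```python
import math


def shots_for_single_pauli(expectation, epsilon):
    """Shots needed to estimate one Pauli expectation value to standard
    error epsilon. Outcomes are +1 or -1, so the variance is 1 - <P>^2."""
    variance = 1.0 - expectation ** 2
    return math.ceil(variance / epsilon ** 2)


def shots_upper_bound(coefficients, epsilon):
    """Upper bound on total shots for H = sum_i c_i P_i, measured term by term.

    Assumes unit variance for every term and shots allocated in proportion
    to |c_i|, which gives (sum_i |c_i|)^2 / epsilon^2.
    """
    one_norm = sum(abs(c) for c in coefficients)
    return math.ceil(one_norm ** 2 / epsilon ** 2)


# Hypothetical Pauli coefficients in Hartree (illustrative only).
example_coefficients = [0.5, -0.3, 0.2, 0.1]
chemical_accuracy = 1.6e-3  # Hartree

print(shots_for_single_pauli(0.0, chemical_accuracy))
print(shots_upper_bound(example_coefficients, chemical_accuracy))
```

Halving $\epsilon$ quadruples the shot count in this estimate, which is why the prefactor matters when the target is chemical accuracy.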
This work numerically investigated the maximally allowed gate-error probability $p_c$ required to achieve chemically accurate predictions as a core metric of VQE performance. Similar to a fault-tolerance threshold in error correction, $p_c$ should provide a transparent metric to compare the noise resilience of VQEs as well as provide useful guidance for the experimental community. Having demonstrated that $p_c$ is between $10^{-6}$ and $10^{-4}$ for very small molecules (and worse for larger molecules), we conclude that quantum advantage in VQE-based quantum chemistry requires: (i) substantially improved error mitigation, (ii) error correction, and/or (iii) significantly improved hardware in which gate errors are reduced by orders of magnitude.
Data Availability Data generated during the study is available upon request (E-mail: kd437@cantab.ac.uk or ckl45@cam.ac.uk).

Fig. 2 Energy accuracy as a function of gate-error probability for H$_2$, H$_4$ (at 1 Å and 3 Å interatomic separation), LiH, HF and BeH$_2$. Ansätze using fermionic, qubit, and Pauli string elements are plotted in red, blue and green, respectively. All curves labelled as "fixed" VQE ansätze use the UCCSD ansatz [17,37]. Energy accuracies lower than chemical accuracy are highlighted by the yellow region. The purple line in the H$_2$ inset is the energy calculated using the Hartree-Fock [2] state. Extrapolated noise-susceptibility calculations are shown in black for the fermionic-ADAPT-VQE (d), the QEB-ADAPT-VQE (e) and the qubit-ADAPT-VQE (f).

Fig. 4 Noise susceptibility (top) and average energy fluctuation (bottom) as functions of the number $N_{\mathrm{II}}$ of CNOT gates for all molecules and algorithms reported in Sec. II.III, at all circuit depths.
calculation of the true ground-state energy $E_0$. A key objective of our study is to find the maximally allowed gate-error probability $p_c$ for which $\Delta E(p, n) < 1.6$ milli-Hartree. Classical optimizers are used to tune $\theta_1, \ldots, \theta_n$. The parameters are optimized until the gradient norm satisfies $\lVert \nabla_{\vec{\theta}} E \rVert \leq \epsilon_O$, for some precision $\epsilon_O$. In our simulations of H$_2$ we calculate converged values of $E_n(p)$ using our density-matrix simulator. To keep