Given the recent progress in quantum computing hardware, it is natural to ask where the first demonstration of a quantum advantage for a practical problem will occur. Since the first experimental demonstration by Peruzzo et al.1, the variational quantum eigensolver (VQE) framework has offered a promising path towards utilizing small and noisy quantum devices for simulating quantum chemistry. The essence of the VQE approach is the use of the quantum device as a coprocessor, which prepares a parameterized quantum wavefunction and measures the expectation value of observables. In conjunction with a classical optimization algorithm, it is possible to then minimize the expectation value of the Hamiltonian as a function of the parameters, arriving at approximations for the wavefunction, energy, and other properties of the ground state1,2,3,4,5,6,7,8. A growing body of work attempting to understand and ameliorate the challenges associated with using VQE to target nontrivial systems has emerged in recent years9,10,11,12,13,14,15,16,17,18,19,20,21,22. In this article, we address the challenge posed by the large number of circuit repetitions needed to perform accurate measurements and propose a scheme that dramatically reduces this cost. In addition, we explain how our approach to measurement has reduced sensitivity to readout errors and also enables a powerful form of error mitigation.

Within VQE, expectation values are typically estimated by Hamiltonian averaging. Under this approach, the Hamiltonian is decomposed into a sum of operators that are tensor products of single-qubit Pauli operators, commonly referred to as Pauli words. The expectation values of the Pauli words are determined independently by repeated measurement. When measurements are distributed optimally between the Pauli words \({P}_{\ell }\), the total number of measurements M is upper bounded by

$$M\le {\left(\frac{{\sum }_{\ell }\left|{\omega }_{\ell }\right|}{\epsilon }\right)}^{2},\quad {\rm{where}}\quad H=\mathop{\sum}\limits_{\ell }{\omega }_{\ell }{P}_{\ell }$$

is the Hamiltonian whose expectation value we estimate as \({\sum }_{\ell }{\omega }_{\ell }\langle {P}_{\ell }\rangle\), the \({\omega }_{\ell }\) are scalars, and ϵ is the target precision3,23. Prior work assessing the viability of VQE has used bounds of this form and concluded that chemistry applications require “a number of measurements which is astronomically large” (quoting from ref. 3).
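As a rough illustration of how quickly the bound of Eq. (1) grows, the following minimal sketch evaluates it for a list of Pauli-word coefficients. The function name and the example coefficients are hypothetical and are not taken from any of the molecules studied here.

```python
def measurement_upper_bound(coefficients, epsilon):
    """Evaluate the bound of Eq. (1): (sum_l |w_l| / epsilon) ** 2."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # One-norm of the Pauli-word coefficients.
    one_norm = sum(abs(w) for w in coefficients)
    return (one_norm / epsilon) ** 2


# Hypothetical coefficients (in hartree), chosen only for illustration.
example_coefficients = [0.5, -0.25, 0.125, 0.125]
# Target precision of one millihartree.
bound = measurement_upper_bound(example_coefficients, epsilon=1e-3)
```

Because the bound scales with the square of the coefficient one-norm, even a Hamiltonian with a one-norm of a single hartree already calls for on the order of a million repetitions at millihartree precision.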

Several recent proposals attempt to address this obstacle by developing more sophisticated strategies for partitioning the Hamiltonian into sets of simultaneously measurable operators16,17,18,19,20,21. We summarize their key findings in Table 1. This work has a similar aim, but we take an approach rooted in a decomposition of the two-electron integral tensor rather than focusing on properties of Pauli words. We quantify the performance of our proposal by numerically simulating the variances of our term groupings to more accurately determine the number of circuit repetitions required for measurement of the ground state energy. This contrasts with the analysis in other recent papers that have instead focused on using the number of separate terms which must be measured as a proxy for this quantity. By that metric, our approach requires a number of term groupings that is linear in the number of qubits—a quartic improvement over the naive strategy and a cubic improvement relative to these recent papers. However, we argue that the number of distinct term groupings alone is not generally predictive of the total number of circuit repetitions required, because it does not consider how the covariances of the different terms in these groupings can collude to either reduce or increase the overall variance. We will show below that our approach benefits from having these covariances conspire in our favor; for the systems considered here, our approach gives up to three orders of magnitude reduction in the total number of measurements, while also providing an empirically observed asymptotic improvement.

Table 1 A history of ideas reducing the measurements required for estimating the energy of arbitrary basis chemistry Hamiltonians with the variational quantum eigensolver.

Although there are a variety of approaches to simulating indistinguishable fermions with distinguishable qubits24,25,26, the Jordan–Wigner transformation is the most widely used. This is due to its simplicity and to the fact that it allows for the explicit construction of a number of useful circuit primitives not available under more sophisticated encodings. These include the Givens rotation network that exactly implements a change of single-particle basis16,27,28,29. A disadvantage of using the Jordan–Wigner transformation is the fact that it maps operators acting on a constant number of fermionic modes to qubit operators with support on up to all N qubits. In the context of measurement, the impact of this nonlocality can be seen by considering a simple model of readout error such as a symmetric bitflip channel. Under this model, a Pauli word with support on N qubits has N opportunities for an error that reverses the sign of the measured value, leading to estimates of expectation values that are exponentially suppressed in N (see section “Error mitigation”). It has recently been shown that techniques based on fermionic swap networks can avoid the overheads and disadvantages imposed by the nonlocality of the Jordan–Wigner encoding in a variety of contexts, including during measurement16,28,29. Our work will likewise avoid this challenge without leaving the Jordan–Wigner framework, allowing estimation of single- and two-particle fermionic operator expectation values by the measurement of only one- and two-local qubit operators, respectively.

In addition to this reduction in the support of the operators that we measure, our work offers another opportunity for mitigating errors. It has been observed that when one is interested in states with a definite eigenvalue of a symmetry operator, such as the total particle number, η, or the z-component of spin, \({S}_{z}\), it can be useful to have a method that removes the components of some experimentally prepared state with support on the wrong symmetry manifold8,9,10,11. Two basic strategies to accomplish this have been proposed. The first of these strategies is to directly and nondestructively measure the symmetry operator and discard those outcomes where the undesired eigenvalue is observed, projecting into the proper symmetry sector by postselection. In order to construct efficient measurement schemes, prior work in this direction has focused on measuring the parities of η and \({S}_{z}\), rather than the full symmetry operators9,11. These proposals involve nonlocal operations that usually require O(N) depth, which may induce further errors during their implementation. The second class of strategies builds upon the foundation of ref. 14 and uses additional measurements together with classical postprocessing to calculate expectation values of the projected state without requiring additional circuit depth8,9,10, a procedure that can be efficiently applied to the parity of the number operator in each spin sector. In this work, we show how our proposal for measurement naturally leads to the ability to postselect directly on the proper eigenvalues of the operators η and \({S}_{z}\), rather than on their parities.


Using Hamiltonian factorization for measurements

The crux of our strategy for improving the efficiency and error resilience of Hamiltonian averaging is the application of tensor factorization techniques to the measurement problem. Using a representation discussed in the context of quantum computing in refs. 29,30,31, we begin with the factorized form of the electronic structure Hamiltonian in second quantization:

$$H={U}_{0}\left(\mathop{\sum}\limits _{p}{g}_{p}{n}_{p}\right){U}_{0}^{\dagger }+\mathop{\sum }\limits_{\ell =1}^{L}{U}_{\ell }\left(\mathop{\sum}\limits _{pq}{g}_{pq}^{(\ell )}{n}_{p}{n}_{q}\right){U}_{\ell }^{\dagger },$$

where the values \({g}_{p}\) and \({g}_{pq}^{(\ell )}\) are scalars, \({n}_{p}={a}_{p}^{\dagger }{a}_{p}\), and the \({U}_{\ell }\) are unitary operators that implement a single-particle change of orbital basis. Specifically,

$$U=\exp \left(\mathop{\sum}\limits _{pq}{\kappa }_{pq}{a}_{p}^{\dagger }{a}_{q}\right),\quad U{a}_{p}^{\dagger }{U}^{\dagger }=\mathop{\sum}\limits _{q}{\left[{e}^{\kappa }\right]}_{pq}{a}_{q}^{\dagger },$$

where \({[{e}^{\kappa }]}_{pq}\) is the p,q entry of the matrix exponential of the anti-Hermitian matrix κ that characterizes U.

Numerous approaches that accomplish this goal exist, including the density fitting approximation32,33, and a double factorization that begins with a Cholesky decomposition or eigendecomposition of the two-electron integral tensor29,33,34,35,36,37,38,39. In this work, we use such an eigendecomposition and refer readers to the Supplementary Note III and to refs. 29,31 for further details. The eigendecomposition step permits discarding small eigenvalues to yield a controllable approximation to the original Hamiltonian. While such low-rank truncations are not central to our approach and would not significantly reduce the number of measurements, doing so would asymptotically reduce L (and thus ultimately, the number of distinct measurement term groupings). Such decompositions have been explored extensively in the context of electronic structure on classical computers on a far wider range of systems than those considered here34,36,39,40,41,42. It has been found that L = O(N) is sufficient for the case of arbitrary basis quantum chemistry, both in the large system and large basis set limits34. Furthermore, specific basis sets exist where L = 1, such as the plane wave basis or dual basis of ref. 27.

Our measurement strategy, which we shall refer to as Basis Rotation Grouping, is to apply the \({U}_{\ell }\) circuit directly to the quantum state prior to measurement. This allows us to simultaneously sample all of the \(\langle {n}_{p}\rangle\) and \(\langle {n}_{p}{n}_{q}\rangle\) expectation values in the rotated basis. We can then estimate the energy as

$$\langle H\rangle =\mathop{\sum}\limits_{p}{g}_{p}{\langle {n}_{p}\rangle }_{0}+\mathop{\sum }\limits_{\ell =1}^{L}\sum _{pq}{g}_{pq}^{(\ell )}{\langle {n}_{p}{n}_{q}\rangle }_{\ell },$$

where the subscript ℓ on the expectation values denotes that they are sampled after applying the basis transformation \({U}_{\ell }\). The \({\langle {n}_{p}\rangle }_{\ell }\) and \({\langle {n}_{p}{n}_{q}\rangle }_{\ell }\) expectation values can be sampled simultaneously because, under the Jordan–Wigner transformation, \({n}_{p}=(1+{Z}_{p})/2\), which is a diagonal qubit operator. In practice, we assume a standard measurement in the computational basis, giving us access to measurement outcomes for all diagonal qubit operators simultaneously. Thus, our approach is able to sample all terms in the Hamiltonian with only L + 1 = O(N) distinct term groups.
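The estimator of Eq. (4) can be sketched in a few lines of classical postprocessing. The sketch below is illustrative only: it assumes that each shot is recorded as a tuple of 0/1 values in which bit p directly gives the occupation \({n}_{p}\) in the rotated basis, and that `samples[l]` holds the shots taken after applying \({U}_{\ell }\). The function name, the data layout, and the numbers are all hypothetical.

```python
def estimate_energy(one_body, two_body, samples):
    """Estimate <H> as in Eq. (4) from computational-basis shots.

    one_body: list of g_p (length N), used with the shots of group 0.
    two_body: list of L matrices g^(l)_{pq}, each an N x N nested list.
    samples:  list of L + 1 shot lists; samples[l] was taken after U_l.
    """
    energy = 0.0
    # Group 0: one-body term, built from the sample means of n_p.
    shots = samples[0]
    for p, g_p in enumerate(one_body):
        mean_n_p = sum(bits[p] for bits in shots) / len(shots)
        energy += g_p * mean_n_p
    # Groups 1..L: two-body terms, built from the sample means of n_p n_q.
    for ell, g in enumerate(two_body, start=1):
        shots = samples[ell]
        for p in range(len(g)):
            for q in range(len(g)):
                mean_n_p_n_q = sum(b[p] * b[q] for b in shots) / len(shots)
                energy += g[p][q] * mean_n_p_n_q
    return energy


# Toy data for N = 2 orbitals and L = 1 two-body group (made-up numbers).
toy_one_body = [-1.0, -0.5]
toy_two_body = [[[0.0, 0.25], [0.25, 0.0]]]
toy_samples = [[(1, 0), (1, 1)], [(1, 1), (0, 1)]]
toy_energy = estimate_energy(toy_one_body, toy_two_body, toy_samples)
```

Every shot within a group contributes to all of that group's terms at once, which is why the number of distinct circuits is L + 1 rather than the number of Hamiltonian terms.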

Fortunately, the \({U}_{\ell }\) are exceptionally efficient to implement, even on hardware with minimal connectivity. Following the strategy described in ref. 28, and assuming that the system is an eigenstate of the total spin operator, any change of single-particle basis can be performed using \({N}^{2}/4-N/2\) two-qubit gates and gate depth of exactly N, even with the connectivity of only a linear array of qubits28. This gate depth can actually be improved to N/2 by further parallelizing the approach of ref. 28, making use of ideas that are explained in the context of multiport interferometry in ref. 43. In fact, a further optimization is possible by performing the second matrix factorization discussed in ref. 29. This would result in only \(O({\mathrm{log}\,}^{2}N)\) distinct values of the \({g}_{pq}^{(\ell )}\) and a gate complexity for implementing the \({U}_{\ell }\), which is reduced to \(O(N\mathrm{log}\,N)\); however, we note that this scaling is only realized in fairly large systems when N is growing towards the thermodynamic (large system) rather than continuum (large basis) limit.
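For orientation, the two-qubit gate count quoted above can be tabulated directly. The helper below is a hypothetical convenience function, not part of any library. It also checks that the count equals twice the number of orbital pairs within one spin sector of N/2 orbitals, which is what one would expect if, as we assume here, the rotation acts independently on each spin sector with one two-qubit rotation per orbital pair.

```python
from math import comb


def basis_rotation_gate_count(num_qubits):
    """Two-qubit gate count N^2/4 - N/2 for a spin-conserving basis change."""
    if num_qubits % 2 != 0:
        raise ValueError("expected an even number of spin orbitals")
    return num_qubits ** 2 // 4 - num_qubits // 2


# Gate counts for a few system sizes (number of qubits = spin orbitals).
gate_counts = {n: basis_rotation_gate_count(n) for n in (4, 8, 12, 16)}

# Consistency check: two spin sectors, each with (N/2 choose 2) orbital pairs.
pair_counts = {n: 2 * comb(n // 2, 2) for n in (4, 8, 12, 16)}
```

For a 12-qubit system this gives 30 two-qubit gates per basis rotation.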

The primary objective of our measurement strategy is to reduce the time required to measure the energy to within a fixed accuracy. Because different hardware platforms have different repetition rates, we focus on quantifying the time required in terms of the number of circuit repetitions. We shall present data for electronic ground states that demonstrate the effectiveness of our Basis Rotation Grouping approach in comparison to three other measurement strategies and the upper bound of Eq. (1). All calculations were performed using the open-source software packages OpenFermion and Psi4 (refs. 44,45). Specifically, we used exact calculations of the variance of expectation values with respect to the full configuration interaction ground state to determine the number of circuit repetitions required. The calculations presented here are performed for symmetrically stretched hydrogen chains with various bond lengths and numbers of atoms, for a symmetrically stretched water molecule, and for a stretched nitrogen dimer, all in multiple basis sets. We justify our focus on the electronic ground states here by noting that most variational algorithms for chemistry attempt to optimize ansätze that are already initialized near the ground state. For reference, we provide analogous data calculated with respect to the Hartree–Fock state in Supplementary Table II.

In order to calculate the variance of the estimator of the expectation value of the energy, it is necessary to determine the distribution of measurements between the different term groupings. References 3,23 provide a prescription for the optimal choice. They demand that (in the notation of Eq. (1)) each term \({H}_{\ell }\) is measured a fraction of the time \({f}_{\ell }\) equal to

$${f}_{\ell }=\frac{\left|{\omega }_{\ell }\right|\sqrt{1-{\langle {H}_{\ell }\rangle }^{2}}}{{\sum }_{j}\left|{\omega }_{j}\right|\sqrt{1-{\langle {H}_{j}\rangle }^{2}}}.$$

In practice, the expectation values in the above expression are not known ahead of time and so the optimal measurement fractions \({f}_{\ell }\) cannot be efficiently and exactly determined a priori. For the purposes of this paper, we approximated the ideal distribution of measurements by first performing a classically tractable configuration interaction singles and doubles (CISD) calculation of the quantities in Eq. (5). We shall show that this approximation introduces a negligible overhead in measurement time for all systems considered in this work. One could also envisage using an adaptive measurement scheme that makes additional measurements based on the observed sample variance, in order to approximate the ideal partitioning of measurement time, such as the one described in ref. 46.
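The allocation rule of Eq. (5) is simple to evaluate once approximate expectation values are available, for example from a CISD calculation. The following minimal sketch assumes that each term has eigenvalues ±1, so that its variance is \(1-{\langle {H}_{\ell }\rangle }^{2}\). The function name and the input numbers are hypothetical.

```python
import math


def measurement_fractions(coefficients, expectations):
    """Fractions f_l of Eq. (5), proportional to |w_l| * sqrt(1 - <H_l>^2)."""
    weights = []
    for w, e in zip(coefficients, expectations):
        # Clip at zero to guard against small negative values from rounding.
        variance = max(0.0, 1.0 - e * e)
        weights.append(abs(w) * math.sqrt(variance))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all terms have zero variance; nothing to allocate")
    return [x / total for x in weights]


# Made-up coefficients and approximate expectation values for two terms.
fractions = measurement_fractions([1.0, 2.0], [0.6, 0.0])
```

Terms whose expectation values are close to ±1 have little variance and therefore receive few repetitions, even when their coefficients are large.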

Circuit repetitions required for energy measurement

In Fig. 1 we plot the number of circuit repetitions for our proposed Basis Rotation Grouping measurement approach (black circles), together with three other measurement strategies and the upper bound based on Eq. (1) for the systems listed in Table 2. The first and most basic alternative strategy is simply to apply no term groupings and measure each Pauli word independently, a strategy we refer to as Separate Measurements (lime green circles). A more sophisticated approach, similar to the one described in ref. 17, is to partition the Pauli words into groups of terms that can be measured simultaneously. In the context of a near-term device, we consider two Pauli words \({P}_{j}\) and \({P}_{k}\) simultaneously measurable if and only if they act with the same Pauli operator on all qubits on which they both act nontrivially. Pauli words that satisfy this condition can be simultaneously measured using only single-qubit rotations and measurement. In order to efficiently partition the Pauli words into groups, we choose to take all of the terms that only contain Z operators as one partition and then account for the remaining Pauli words heuristically by adding them at random to a group until no more valid choices remain before beginning a new group. We refer to this approach as Pauli Word Grouping (teal circles). The final strategy that we compare with preprocesses the Hamiltonian by applying the techniques based on the fermionic marginal (RDM) constraints described in ref. 23, before applying the Jordan–Wigner transformation and using the same heuristic grouping strategy to group simultaneously measurable Pauli words together. We call this latter strategy Pauli Word Grouping, RDM Constraints (dark blue circles).
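The Pauli Word Grouping heuristic described above can be sketched as follows. This is one possible reading of the procedure rather than the exact implementation used for Fig. 1: Pauli words are represented as dictionaries mapping qubit index to 'X', 'Y', or 'Z' (a hypothetical encoding), the Z-only words form the first group, and the remaining words are shuffled and swept greedily into groups.

```python
import random


def compatible(word_a, word_b):
    """True if both words act with the same Pauli on every shared qubit."""
    return all(word_b[q] == op for q, op in word_a.items() if q in word_b)


def group_pauli_words(words, seed=0):
    """Greedy random grouping into sets of simultaneously measurable words."""
    rng = random.Random(seed)
    z_only = [w for w in words if all(op == "Z" for op in w.values())]
    rest = [w for w in words if not all(op == "Z" for op in w.values())]
    rng.shuffle(rest)
    groups = [z_only] if z_only else []
    while rest:
        # Start a new group, then add every remaining word that is
        # compatible with all current members of the group.
        group = [rest.pop()]
        leftover = []
        for word in rest:
            if all(compatible(word, member) for member in group):
                group.append(word)
            else:
                leftover.append(word)
        rest = leftover
        groups.append(group)
    return groups


# Toy example: Z0, Z0 Z1, X0 X1, Y0 Y1, X0, Y1.
example_words = [
    {0: "Z"}, {0: "Z", 1: "Z"},
    {0: "X", 1: "X"}, {0: "Y", 1: "Y"}, {0: "X"}, {1: "Y"},
]
groups = group_pauli_words(example_words, seed=1)
```

The resulting partition depends on the random order, so different seeds can produce different numbers of groups for the same Hamiltonian.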

Fig. 1: The number of circuit repetitions required to estimate the ground state energy of various systems using five different measurement strategies.

The number of circuit repetitions required to estimate the ground state energy of various systems. From left to right: A hydrogen chains of varying lengths in varying basis sets, B a water molecule in varying basis sets, C a nitrogen dimer in varying basis sets. The specific systems considered are enumerated in Table 2. A target precision corresponding to a 2σ error bar of 1.0 millihartree is assumed. Calculations performed on systems that require the same number of qubits (spin orbitals) are plotted together in columns. The cost of our proposed measurement strategy appears to have a lower asymptotic scaling than any other method we consider and obtains a speedup of more than an order of magnitude compared to the next best approach for a number of systems.

Table 2 List of the molecular systems considered in this work, displayed in order of increasing number of qubits, for each type of system.

We refer to the bound of Eq. (1) as being based on the Hamiltonian coefficients and calculate it from the Jordan–Wigner transformed Hamiltonian (meaning that the \({\omega }_{\ell }\) in Eq. (1) are the coefficients of Pauli words). This bound is indicated by salmon-colored circles in Fig. 1. We note that attempting to calculate a similar bound directly from the fermionic Hamiltonian (meaning that the \({\omega }_{\ell }\) in Eq. (1) would be the coefficients of the terms \({a}_{p}^{\dagger }{a}_{q}\) or \({a}_{p}^{\dagger }{a}_{q}^{\dagger }{a}_{r}{a}_{s}\)) leads to different bounds. These are derived in Supplementary Note I, where they are shown to be substantially looser for the systems we consider in this work. While one would not measure the fermion operators directly, it is surprising that these bounds would be significantly different. We refer the interested reader to the supplementary information for an analysis and discussion of this phenomenon.

Considering first the hydrogen chain systems in Fig. 1 (panel A), we note that our Basis Rotation Grouping approach consistently outperforms the other strategies for simulations with more than four fermionic modes, requiring significantly fewer measurements. Interestingly, while the bounds from the qubit Hamiltonian and the other three methods appear to have relative performances that are stable across a variety of system sizes, the Basis Rotation Grouping method appears to have a different asymptotic scaling, at least for hydrogen chains of increasing length and basis set size. This is likely due to large-scale effects that only manifest when approaching a system’s thermodynamic limit (which one approaches particularly quickly for hydrogen chains)47. In Table 3 we quantify this asymptotic scaling by assuming that the dependence of the variance on the number of qubits N in the hydrogen chain’s Hamiltonian can be modeled by the functional form \(a{N}^{b}\) for some constants a and b, which we fit using a Bayesian analysis described in the table footnote48. By contrast, the data from the minimal basis water molecule (panel B in Fig. 1) shows no benefit in measurement time from our method compared to the heuristic grouping strategies. However, the advantage of our approach becomes significant for that system in larger basis sets, a trend that is also apparent to a lesser extent for the nitrogen dimer (panel C in Fig. 1).

Table 3 Bounds and uncertainties resulting from Bayesian inference using a Monte-Carlo approximation with \({10}^{6}\) particles for all hydrogen full configuration interaction data48.

We find that applying the RDM Constraints of ref. 23 to our Pauli Word Grouping strategy (the combination is plotted with dark blue circles in Fig. 1) does not significantly reduce the observed variance, despite the fact that the use of the RDM Constraints has previously been shown to dramatically reduce the bounds on the number of circuit repetitions required23. In Supplementary Note II, we explore the possibility that this is due to the fact that these constraints were applied to minimize a bound of the same form as Eq. (1) that is, however, formulated using the fermionic representation of the Hamiltonian. We present evidence in Supplementary Note I that, in the context of such bounds, the use of the Jordan–Wigner transformed operators leads to surprisingly different results. However, as we show there, we find that the actual variance with respect to the ground state is not substantially changed by applying the same constraints and performing the minimization using the qubit representation of the Hamiltonian.

Earlier we explained that the data presented in Fig. 1 were calculated by distributing the measurements between different term groupings according to Eq. (5) using the variance of each term calculated with a classically efficient CISD approximation to the ground state. Any deviation from the ideal allocation of measurement cycles (obtained by evaluating Eq. (5) with respect to the true ground state) must increase the time required for measurement. In Fig. 2 we present the ratio between the time required with the approximate distribution and the time required under the optimal one for each of the systems treated in the work. We find that the impact of this approximation is negligible, with the largest observed increase in measurement time being below 3%. For systems where CISD no longer provides a qualitatively good approximation to the ground state, it would also be possible to calculate the required quantities with a more sophisticated method, such as the density matrix renormalization group algorithm49.

Fig. 2: The overhead in measurement time incurred by using a sub-optimal distribution of measurement effort.

Specifically, the increase in the time (or the number of circuit repetitions) required to measure the ground state energy to a fixed precision when the measurements are distributed between groups using the variances calculated with the configuration interaction singles and doubles (CISD) approximation rather than the true ground state. For each of the systems and measurement techniques considered in this work, we present the ratio of the time required when using this approximate distribution of measurement repetitions compared with the time required using the optimal distribution, both calculated using Eq. (5) and then applied to the measurement of the actual ground state of the system. We find that using a classically tractable CISD calculation to determine the distribution of measurements between groups results in only a small increase in total measurement time.

Overall, Fig. 1 speaks for itself in showing that in most cases there is a very significant reduction in the number of measurements required when using our strategy—sometimes by up to three orders of magnitude for even modestly sized systems. Furthermore, these improvements become more significant as system size grows.

Error mitigation

Beyond the reduction in measurement time, our approach also provides two distinct forms of error mitigation. First, it reduces the susceptibility to readout errors by replacing the measurement of O(N) qubit operators with one- and two-qubit operators. Second, it allows us to perform postselection based on the eigenvalues of the particle number operators in each spin sector. Both properties stem from measuring the Hamiltonian only in terms of density operators in different basis sets.

The first benefit, the reduction in readout errors, is a consequence of only needing to measure expectation values of operators that have support on one or two qubits. Direct measurement of the Jordan–Wigner transformed Hamiltonian using only single-qubit rotations and measurement involves measuring operators with support on O(N) qubits. To demonstrate how reducing the support of the operators helps to mitigate errors, we consider a simple model of measurement error: the independent, single-qubit symmetric bitflip channel. When estimating the expectation value of a Pauli word \({P}_{\ell }\) acting on K qubits with a single-qubit bitflip error rate p, a simple Kraus operator analysis shows that \(\langle {P}_{\ell }\rangle\) is modified to

$${\langle {P}_{\ell }\rangle }_{{\rm{bitflip}}}={(1-2p)}^{K}{\langle {P}_{\ell }\rangle }_{{\rm{true}}},$$

which means that the noise channel will bias the estimator of the expectation value towards zero by a factor exponential in K. Thus, the determination of expectation values is highly sensitive to the extent of the locality of the \({P}_{\ell }\), a behavior that we expect to persist under more realistic models of readout errors.
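The attenuation factor in Eq. (6) can be checked without any quantum simulation by summing over all patterns of independent bitflips on the K measured qubits: each flip reverses the sign of the measured Pauli word, so a pattern with m flips contributes a sign of \({(-1)}^{m}\). The sketch below compares this brute-force sum with the closed form. The function names are illustrative.

```python
from itertools import product


def attenuation_closed_form(p, k):
    """Closed-form attenuation (1 - 2p)^K from Eq. (6)."""
    return (1.0 - 2.0 * p) ** k


def attenuation_by_enumeration(p, k):
    """Average sign of a K-qubit Pauli word under independent bitflips."""
    total = 0.0
    for flips in product((0, 1), repeat=k):
        n_flips = sum(flips)
        # Probability of this exact pattern of flipped readout bits.
        probability = p ** n_flips * (1.0 - p) ** (k - n_flips)
        # Each flip reverses the sign of the measured Pauli word.
        total += probability * (-1) ** n_flips
    return total


# Compare the two for a 1% bitflip rate and supports of 2 and 12 qubits.
local_factor = attenuation_by_enumeration(0.01, 2)
nonlocal_factor = attenuation_by_enumeration(0.01, 12)
```

Comparing the two factors shows the benefit of measuring only one- and two-local operators: the bias grows geometrically with the support K.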

One could also accomplish the reduction in the support of the operators that our method achieves by other means. For example, one could measure each of the \(O({N}^{4})\) terms separately, localizing each one to a single-qubit operator by applying O(N) two-qubit gates. Other schemes have been proposed that allow generic two-electron terms to be measured using O(1) qubits each while simultaneously accomplishing the parallel measurement of O(N) terms at a time, at the cost of using \(O({N}^{2})\) or \(O({N}^{2}\mathrm{log}\,N)\) two-qubit gates16,20,22. One advantage of our approach is that we achieve this reduction in operator support at the same time as the large reduction in the number of measurement repetitions presented in section “Circuit repetitions required for energy measurement” above.

Our approach also enables a second form of error mitigation. Each measurement we prescribe is also simultaneously a measurement of the total particle number operator, η, and of the z-component of spin, \({S}_{z}\). We can therefore reduce the impact of circuit and measurement errors by performing postselection conditioned on a desired combination of quantum numbers for each of these operators. Let P denote the projector onto the corresponding subspace and let ρ denote the density matrix of our state. We obtain access to the projected expectation value,

$${\langle H\rangle }_{{\rm{proj}}}=\frac{{\rm{Tr}}\left(P\rho H\right)}{{\rm{Tr}}\left(P\rho \right)},$$

directly from the experimental measurement record by discarding those data points that fall outside the desired subspace. The remaining data points are used to evaluate the expectation values of the desired Pauli words.

This postselection is efficient in the sense that it requires no additional machinery beyond what we have already proposed. The only cost is a factor of \(\approx \!1/{\rm{Tr}}(P\rho )\) additional measurements. This factor is approximate because discarding measurements with the wrong particle number is likely to lead to a lower observed variance. Specifically, by removing measurements in the wrong particle number sector, we avoid having to average over large fluctuations caused by the energetic effects of adding or removing particles. This, therefore, presents an additional route by which our Basis Rotation Grouping scheme will reduce the number of measurements in practice.
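A minimal sketch of this postselection step is shown below. It assumes, purely for illustration, that even-indexed qubits hold spin-up orbitals and odd-indexed qubits hold spin-down orbitals, and that each shot is a tuple of 0/1 occupations. The actual ordering depends on the encoding used in a given experiment. Fixing the number of particles in each spin sector fixes both η and \({S}_{z}\).

```python
def postselect(shots, n_up, n_down):
    """Keep only shots with the requested particle number in each spin sector.

    Assumes (hypothetically) that even-indexed bits are spin-up orbitals
    and odd-indexed bits are spin-down orbitals.
    """
    kept = []
    for bits in shots:
        if sum(bits[0::2]) == n_up and sum(bits[1::2]) == n_down:
            kept.append(bits)
    return kept


# Made-up measurement record for four spin orbitals.
raw_shots = [(1, 1, 0, 0), (1, 0, 0, 0), (0, 1, 1, 0), (1, 1, 1, 0)]
kept_shots = postselect(raw_shots, n_up=1, n_down=1)
# Fraction of shots retained, an empirical estimate of Tr(P rho).
acceptance = len(kept_shots) / len(raw_shots)
```

The retained shots can then be passed unchanged to whatever routine estimates the \(\langle {n}_{p}\rangle\) and \(\langle {n}_{p}{n}_{q}\rangle\) averages, and the acceptance fraction indicates the overhead in repetitions.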

Several recent works have proposed error mitigation strategies that allow for the targeting of specific symmetry sectors. We make a brief comparative review of these here in order to place our work in context. One class of strategies focuses on nondestructively measuring one or more symmetry operators9,11. After performing the measurements and conditioning on the desired eigenvalues, the postmeasurement state becomes \(P\rho P/{\rm{Tr}}(P\rho )\) and the usual Hamiltonian averaging can be performed. These approaches share some features with our strategy in that they also require an additional number of measurements that scale as \(1/{\rm{Tr}}(P\rho )\) and an increased circuit depth. However, they also have some drawbacks that we avoid. Because they separate the measurement of the symmetry operator from the measurement of the Hamiltonian, they require the implementation of relatively complicated nondestructive measurements. As a consequence, existing proposals focus on measuring only the parity of the η and \({S}_{z}\) operators, leading to a strictly less powerful form of error mitigation than the approach we propose. In addition, most errors that occur during or after the symmetry operator measurement are undetectable, including errors incurred during readout.

A different class of approaches avoids the need for additional circuit depth at the expense of requiring more measurements8,9,10. To understand this, let Π denote the fermionic parity operator and P = (1 + Π)/2 the projector onto the +1 parity subspace. Then,

$${\langle H\rangle }_{{\rm{proj}}}=\frac{{\rm{Tr}}\left(P\rho H\right)}{{\rm{Tr}}\left(P\rho \right)}=\frac{{\rm{Tr}}\left(\rho H\right)+{\rm{Tr}}\left(\rho {{\Pi }}H\right)}{1+{\rm{Tr}}\left(\rho {{\Pi }}\right)}.$$

To construct the projected energy it then suffices to measure the expectation values of the Hamiltonian, the parity operator, and the product of the Hamiltonian and parity operators. A stochastic sampling scheme and a careful analysis of the cost of such an approach reveals that it is possible to use postprocessing to estimate the projection onto the subspace with the correct particle number parity in each spin sector at a cost of roughly \(1/{\rm{Tr}}{({P}_{\uparrow }{P}_{\downarrow }\rho )}^{2}\) (where \({P}_{\uparrow }\) and \({P}_{\downarrow }\) are the parity projectors for the two spin sectors)10. Unlike our approach, this class of error mitigation techniques does not easily allow for the projection onto the correct eigenvalues of η and \({S}_{z}\), owing to the large number of terms required to construct these projection operators. Furthermore, the scaling in the number of additional measurements we described above, already more costly than our approach, is also too generous. This is because the product of the parity operators and the Hamiltonian will contain a larger number of nonsimultaneously measurable terms than the same Hamiltonian on its own. Maximum efficiency may require grouping schemes that consider this larger number of term groupings.
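The postprocessing form of the parity projection in Eq. (8) reduces to a one-line combination of three measured expectation values. The sketch below uses hypothetical names and a made-up mixture to illustrate the arithmetic: for a state with weight w in the even-parity sector (energy \({E}_{{\rm{even}}}\)) and weight 1 − w in the odd-parity sector (energy \({E}_{{\rm{odd}}}\)), and a Hamiltonian that commutes with Π, the formula should return \({E}_{{\rm{even}}}\).

```python
def parity_projected_energy(exp_h, exp_parity_h, exp_parity):
    """Evaluate Eq. (8): (<H> + <Pi H>) / (1 + <Pi>)."""
    denominator = 1.0 + exp_parity
    if denominator <= 0.0:
        raise ValueError("state has no weight in the +1 parity sector")
    return (exp_h + exp_parity_h) / denominator


# Made-up mixture: 90% even-parity weight, with sector energies in hartree.
w, e_even, e_odd = 0.9, -1.0, 0.3
exp_h = w * e_even + (1.0 - w) * e_odd         # <H>
exp_parity_h = w * e_even - (1.0 - w) * e_odd  # <Pi H>
exp_parity = w - (1.0 - w)                     # <Pi>
projected = parity_projected_energy(exp_h, exp_parity_h, exp_parity)
```

In an experiment each of the three inputs carries its own statistical error, and the division by \(1+\langle {{\Pi }}\rangle\) amplifies that error when the weight in the target sector is small.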

The most significant drawback of our method in the context of error mitigation is that the additional time and gates required for the basis transformation circuit lead to additional opportunities for errors. We believe that the reduction in circuit repetitions we have shown makes our method the most attractive choice when it is feasible to use an additional \(O({N}^{2})\) two-qubit gates during the measurement process. We therefore focus on comparing the performance of our strategy with a strategy that requires no additional gates and uses a quantum subspace error mitigation approach that effectively projects onto the correct parity of the number operator on each spin sector9,10. In order to do so, we use the open-source software package Cirq50 to simulate the performance of both strategies for measuring the ground state energy of a chain of six hydrogen atoms symmetrically stretched to 1.3 Å in an STO-3G basis. We take an error model consisting of (i) applying a single-qubit depolarizing channel with some probability to both qubits following each two-qubit gate, and (ii) applying a bitflip channel during the measurement process with some other probability. We report results for a wide range of gate and readout noise levels inspired by the capabilities of state-of-the-art superconducting and ion trap quantum computers51,52,53,54. Specifically, we consider single-qubit depolarizing noise with probabilities ranging from \(2.5\times {10}^{-4}\) to \(8\times {10}^{-3}\) and single-qubit bitflip error probabilities between \(6.25\times {10}^{-4}\) and \(1\times {10}^{-2}\). Here, we do not consider the effect of a finite number of measurements and instead report the expectation values from the final density matrix.

Figure 3 shows the error in the measurement of the ground state energy for the error-mitigated Basis Rotation Grouping (far right panel) and Pauli Word Grouping (second panel from right) approaches together with the expectation values for both measurement strategies without error mitigation (two left panels). In these calculations, we assumed that the ground state wavefunction under the Jordan–Wigner transformation is prepared without error. Circuit level noise is considered only during the execution of the Givens rotation required for our Basis Rotation Grouping approach. In order to include the impact of our proposed error mitigation strategy on state preparation as well as measurement, we have also carried out calculations including circuit noise during state preparation. The results of these calculations are presented in Fig. 4. Here, we have approximated a realistic state preparation circuit by applying three random basis rotations that compose to the identity to the ground state wavefunction. These state preparation circuits are simulated with the same gate noise as the measurement circuits. This choice is motivated by the assumption that low-depth circuits will be required for the successful application of VQE and the expectation that 90 two-qubit gates represent a reasonable lower bound to the size of the circuit for a strongly correlated problem on 12 qubits.

Fig. 3: The error in the ground state energy of a hydrogen chain using various measurement strategies.

We report the error in millihartrees for measurements of the ground state energy of a stretched chain of six hydrogen atoms under an error model composed of single-qubit depolarizing noise applied after every two-qubit gate together with a symmetric bitflip channel during readout. We consider single-qubit depolarizing noise with probabilities ranging from \(2.5\times {10}^{-4}\) to \(8\times {10}^{-3}\), corresponding to two-qubit gate error rates of \(\approx 5\times {10}^{-4}\) to \(1.6\times {10}^{-2}\). For the measurement noise, we take the single-qubit bitflip error probabilities to be between \(6.25\times {10}^{-4}\) and \(1\times {10}^{-2}\). From left to right: (A) The error incurred by a Pauli Word Grouping measurement strategy that simultaneously measures compatible Pauli words in the usual molecular orbital basis. (B) The error when using our Basis Rotation Grouping scheme, which performs a change of single-particle basis before measurement. (C) The error using the same Pauli Word Grouping strategy together with additional measurements and postprocessing, which effectively project the measured state onto a manifold with the correct parities of the total particle number and Sz operators. (D) The error found when using our basis rotation strategy and postselecting on outcomes where the correct particle number and Sz were observed. In all panels, we consider the measurement of the exact ground state without any error during state preparation.

Fig. 4: The error in the ground state energy of a hydrogen chain using various measurement strategies and a noisy state preparation step.

We report the error in millihartrees for measurements of the ground state energy of a stretched chain of six hydrogen atoms under an error model composed of single-qubit depolarizing noise applied after every two-qubit gate together with a symmetric bitflip channel during readout. We consider single-qubit depolarizing noise with probabilities ranging from \(2.5\times {10}^{-4}\) to \(8\times {10}^{-3}\), corresponding to two-qubit gate error rates of \(\approx 5\times {10}^{-4}\) to \(1.6\times {10}^{-2}\). For the measurement noise, we take the single-qubit bitflip error probabilities to be between \(6.25\times {10}^{-4}\) and \(1\times {10}^{-2}\). From left to right: (A) The error incurred by a Pauli Word Grouping measurement strategy that simultaneously measures compatible Pauli words in the usual molecular orbital basis. (B) The error when using our Basis Rotation Grouping scheme, which performs a change of single-particle basis before measurement. (C) The error using the same Pauli Word Grouping strategy together with additional measurements and postprocessing, which effectively project the measured state onto a manifold with the correct parities of the total particle number and Sz operators. (D) The error found when using our basis rotation strategy and postselecting on outcomes where the correct particle number and Sz were observed. In all panels, for the purpose of approximating a realistic ansatz circuit, three random Givens rotation networks that compose to the identity were simulated acting on the ground state prior to measurement.

Figures 3 and 4 show that the Pauli Word Grouping and Basis Rotation Grouping approaches to measurement benefit significantly from their respective error mitigation strategies. Despite the fact that our proposed Basis Rotation Grouping technique requires 30 additional two-qubit gates compared to the Pauli Word Grouping approach, we see that the errors remaining after mitigation are comparable in some regimes and are lower for our strategy when noise during measurement is the dominant error channel (compare the bottom right corners of the two rightmost panels in both figures). Focusing first on Fig. 3, we can see that this is true even when the errors during state preparation are not taken into account. Examining the left two panels of both figures, we can see that even without applying postselection, the locality of our Jordan–Wigner transformed operators leads to a considerable benefit in suppressing the impact of readout errors. In the low-noise regime, we expect the quantity 1 − tr(Pρ) to scale linearly with the number of errors coupling the different symmetry sectors. For an error model dominated by two-qubit gate errors, this quantity should itself scale linearly with the number of two-qubit gates. For all of the simulations presented in this work, we find that \(1\le \frac{1}{{\mathrm{tr}}(P\rho )}\le 3\). This implies that the postselection (or postprocessing) can be performed at a reasonable cost, as discussed above.
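
The benefit of locality under readout noise can be illustrated with a small sketch. An independent symmetric bitflip with probability q on each measured qubit multiplies the expectation value of a product of w Pauli Z operators by (1 − 2q)^w, so low-weight operators are attenuated far less than long strings. The code below checks this closed form by brute-force enumeration on a fixed bitstring. The function names are hypothetical, and the weights 2 and 12 are illustrative choices (a product of two number operators versus a string spanning a 12-qubit register) rather than values taken from our simulations.

```python
from itertools import product
from typing import Sequence


def noisy_z_string_expectation(bits: Sequence[int], q: float) -> float:
    """Exact expectation of a product of Z operators measured on a fixed
    bitstring when every readout bit is flipped independently with
    probability q (symmetric bitflip channel)."""
    total = 0.0
    for flips in product((0, 1), repeat=len(bits)):
        prob = 1.0
        parity = 0
        for bit, flip in zip(bits, flips):
            prob *= q if flip else 1.0 - q
            parity ^= bit ^ flip
        # The Z-string eigenvalue is +1 for even parity and -1 for odd parity.
        total += prob * (-1.0 if parity else 1.0)
    return total


def attenuation(weight: int, q: float) -> float:
    """Closed-form damping factor for a weight-`weight` Z string."""
    return (1.0 - 2.0 * q) ** weight


if __name__ == "__main__":
    q = 1e-2  # largest readout error probability considered in the text
    for weight in (2, 12):
        exact = noisy_z_string_expectation([0] * weight, q)
        print(f"weight {weight}: enumerated {exact:.6f}, "
              f"closed form {attenuation(weight, q):.6f}")
```

The enumeration is exponential in the string weight and is meant only as a check of the closed form on small examples.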

We note that the absolute errors we find when including noise during state preparation (Fig. 4), even at the lowest noise levels considered here, are larger than the usual target of chemical accuracy (~1 mHa). In practice, an experimental implementation of VQE on nontrivial systems will require the combination of multiple forms of error mitigation. Prior work has shown that error mitigation by symmetry projection combines favorably with proposals to extrapolate expectation values to the zero noise limit11. We expect that such an extrapolation procedure could significantly improve the numbers we present here. Other avenues for potential improvements are also available. For example, one could rely on the error mitigation and efficiency provided by our measurement strategy during the outer loop optimization procedure, before utilizing a richer quantum subspace expansion in an attempt to reduce errors in the ground state energy after determining the optimal ansatz parameters.


We have presented an improved strategy for measuring the expectation value of the quantum chemical Hamiltonian on near-term quantum computers. Our approach makes use of well-studied factorizations of the two-electron integral tensor in order to rewrite the Hamiltonian in a form that is especially convenient for measuring under the Jordan–Wigner transformation. By doing so, we obtain \(O(N)\) distinct sets of terms that must be measured separately, instead of the \(O({N}^{4})\) required by a naive term-counting approach. Application to specific molecular systems shows that in practice, we require a much smaller number of repetitions to measure the ground state energy to within a fixed accuracy target. For example, assuming an experimental repetition rate of 10 kHz (consistent with the capabilities of commercial superconducting qubit platforms), a commonly referenced bound based on the Hamiltonian coefficients suggests that approximately 55 days are required to estimate the ground state energy of a symmetrically stretched chain of six hydrogen atoms encoded as a wavefunction on 24 qubits to within chemical accuracy, while our approach requires only 44 min. Our proposed measurement approach also removes the susceptibility to readout error caused by long Jordan–Wigner strings and allows for postselection by simultaneously measuring the total particle number and Sz operators with each measurement shot.
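
The postselection step can be sketched in a few lines. Because every shot yields a full occupation-number bitstring, the number of spin-up and spin-down electrons (and hence the total particle number and Sz) can be checked shot by shot, and shots in the wrong sector can be discarded. The sketch below is a toy illustration rather than our implementation: the interleaved spin ordering (even indices spin-up, odd indices spin-down), the function names, and the example operator are all assumptions made for this example.

```python
from typing import Callable, Sequence, Tuple

Bits = Tuple[int, ...]


def spin_counts(bits: Bits) -> Tuple[int, int]:
    """Return (N_up, N_down). Assumption: even indices are spin-up
    orbitals and odd indices are spin-down orbitals."""
    return sum(bits[0::2]), sum(bits[1::2])


def postselected_mean(
    shots: Sequence[Bits],
    diagonal_value: Callable[[Bits], float],
    n_up: int,
    n_down: int,
) -> Tuple[float, float]:
    """Average a diagonal observable over shots with the target
    (N_up, N_down). Also returns the kept fraction, which is a
    finite-sample estimate of tr(P rho)."""
    kept = [b for b in shots if spin_counts(b) == (n_up, n_down)]
    if not kept:
        raise ValueError("no shots survived postselection")
    mean = sum(diagonal_value(b) for b in kept) / len(kept)
    return mean, len(kept) / len(shots)


if __name__ == "__main__":
    # Four hypothetical shots on four spin orbitals; the last two have
    # the wrong number of spin-up electrons and are discarded.
    shots = [(1, 1, 0, 0), (0, 1, 1, 0), (1, 1, 1, 0), (0, 1, 0, 0)]
    # Hypothetical diagonal operator: n_0 n_1 + 0.5 n_2.
    value = lambda b: b[0] * b[1] + 0.5 * b[2]
    mean, kept_fraction = postselected_mean(shots, value, n_up=1, n_down=1)
    print(mean, kept_fraction)
```

Fixing both spin counts fixes the total particle number (their sum) and Sz (half their difference), so a single filter enforces both symmetries.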

The tensor factorization that we used to realize our measurement strategy is only one of a family of such factorizations. Future work might explore the use of different factorizations, or even tailor the choice of single-particle bases for measurement to a particular system, by choosing them with some knowledge of the variances and covariances between terms in the Hamiltonian. As a more concrete direction for future work, the data we show in Supplementary Note I, regarding the difference between the bounds when calculated directly from the fermionic operators and the same approach applied to the Jordan–Wigner transformed operators, suggests that the cost estimates for error-corrected quantum algorithms should be recalculated using the qubit Hamiltonian.

For the largest systems we consider in this work, the 24-qubit hydrogen chain and water simulations, and the 20-qubit nitrogen calculations, our numerical results indicate that using our approach results in a speedup of more than an order of magnitude when compared to recent state-of-the-art measurement strategies. Furthermore, we observe a speedup of more than three orders of magnitude compared to the bounds commonly used to perform estimates in the literature. We also present strong evidence for an asymptotic improvement in our data on hydrogen chains of various sizes. We performed detailed circuit simulations that show that the reduction in readout errors combined with the error mitigation enabled by our work largely balances out the requirement for deeper circuits, even when compared against a moderately expensive error mitigation strategy based on the quantum subspace expansion9. We expect that the balance of reduced measurement time and efficient error mitigation provided by our approach will be useful in the application of variational quantum algorithms to more complex molecular systems.
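
As a rough consistency check, the wall-clock times quoted earlier in this discussion for the six-atom hydrogen chain (approximately 55 days versus 44 min at a 10 kHz repetition rate) can be converted into circuit repetitions. The sketch below does only this unit conversion; the helper name is hypothetical, and the inputs are the rounded figures from the text rather than the underlying data.

```python
REPETITION_RATE_HZ = 10_000  # 10 kHz, as assumed in the text


def repetitions(duration_seconds: int) -> int:
    """Number of circuit repetitions that fit into the given duration."""
    return duration_seconds * REPETITION_RATE_HZ


if __name__ == "__main__":
    bound_reps = repetitions(55 * 24 * 3600)  # approximately 55 days
    ours_reps = repetitions(44 * 60)          # 44 min
    print(f"bound-based estimate: {bound_reps:.3e} repetitions")
    print(f"basis rotation grouping: {ours_reps:.3e} repetitions")
    print(f"ratio: {bound_reps / ours_reps:.0f}")
```

With these rounded inputs the ratio is 1800, consistent with the stated speedup of more than three orders of magnitude relative to the commonly used bounds.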

Finally, we note that these techniques will generally be useful for the quantum simulation of any fermionic system, even those for which the tensor factorization cannot be truncated, such as the Sachdev-Ye-Kitaev model of many-body chaotic dynamics55,56. In that case, L will attain its maximal value of \({N}^{2}\), and our scheme will require \({N}^{2}+1\) partitions. Likewise, if the goal is to use the basis rotation grouping technique to estimate the fermionic two-particle reduced density matrix rather than just the energy, one would need to measure in all \(O({N}^{2})\) bases.

In the process of preparing this manuscript, we have become aware of several recent works that enhance the measurement process by employing more sophisticated strategies for grouping Pauli words together or by employing a different family of unitary transformations than those we consider17,18,19,20. It would be an interesting subject of future work to calculate and compare the number of circuit repetitions required by these approaches.