Unbiasing fermionic quantum Monte Carlo with a quantum computer

Interacting many-electron problems pose some of the greatest computational challenges in science, with essential applications across many fields. The solutions to these problems will offer accurate predictions of chemical reactivity and kinetics, and other properties of quantum systems1–4. Fermionic quantum Monte Carlo (QMC) methods5,6, which use a statistical sampling of the ground state, are among the most powerful approaches to these problems. Controlling the fermionic sign problem with constraints ensures the efficiency of QMC at the expense of potentially significant biases owing to the limited flexibility of classical computation. Here we propose an approach that combines constrained QMC with quantum computation to reduce such biases. We implement our scheme experimentally using up to 16 qubits to unbias constrained QMC calculations performed on chemical systems with as many as 120 orbitals. These experiments represent the largest chemistry simulations performed with the help of quantum computers, while achieving accuracy that is competitive with state-of-the-art classical methods without burdensome error mitigation. Compared with the popular variational quantum eigensolver7,8, our hybrid quantum-classical computational model offers an alternative path towards achieving a practical quantum advantage for the electronic structure problem without demanding exceedingly accurate preparation and measurement of the ground-state wavefunction.

Introduction.2][3][4] The complexity of this equation seemingly grows exponentially with the number of electrons in the system.This fact has greatly hindered progress towards an efficient means of accurately calculating ground state quantum mechanical properties of complex systems.Over the last century, a substantial research effort has been devoted to the development of new algorithms for the solution of the many-electron problem.Currently, all available general-purpose methods can be grouped into two categories: (1) methods which scale exponentially with system size while yielding numerically exact answers and (2) methods whose cost scales polynomially with system size but which are approximate by construction.Approaches of the second category are currently the only methods that can feasibly be applied to large systems.The accuracy of solutions obtained by these methods may be unsatisfactory and is nearly always difficult to assess.
Quantum computing has arisen as an alternative paradigm for the calculation of quantum properties that may complement and potentially surpass classical methods in terms of efficiency. 5,6While the ultimate ambition of this field is to construct a universal fault-tolerant quantum computer, 7 the experimental devices of today are limited to Noisy Intermediate-Scale Quantum (NISQ) computers. 8NISQ algorithms for the computation of ground states have largely centered around the variational quantum eigensolver (VQE) framework, 9,10 which necessitates coping with optimization difficulties, measurement overhead, and circuit noise.As an alternative, algorithms based on imaginary time evolution have been put forward that, in principle, avoid the optimization problem. 11,12However, due to the non-unitary nature of imaginary time evolution, one must resort to optimization heuristics in order to achieve reasonable scaling with system size.New computational strategies which avoid these limiting factors may help to enable the first practical quantum advantage in fermionic simulations.In this work, we propose and experimentally demonstrate a new class of quantum-classical hybrid algorithms that offers a different route to addressing these challenges.We do not attempt to represent the ground state wavefunction using our quantum processor, choosing instead to use it to guide a quantum Monte Carlo calculation performed on a classical coprocessor.][15] Theory and algorithms.Quantum Monte Carlo (QMC) approaches 16,17 target the exact ground state |Ψ 0 of a many-body Hamiltonian, Ĥ, via imaginary time evolution of an initial state |Φ 0 with a non-zero overlap with |Ψ 0 : where τ is imaginary time and |Ψ(τ ) denotes the timeevolved wavefunction from |Φ 0 by τ (see Fig. 1(a)).In QMC, the imaginary-time evolution in Eq. ( 1) is implemented stochastically, which can enable a polynomialscaling algorithm to sample an estimate for the exact ground state energy by avoiding the explicit storage of high dimensional objects such as Ĥ and |Ψ 0 .The ground state energy, E ground = E(τ = ∞), is estimated from averaging a time series of { E(τ ) }, given by a weighted average over M statistical samples, where E (i) (τ ) is the i-th statistical sample for the energy and w i (τ ) is the corresponding normalized weight for that sample at imaginary time τ .While formally exact, such a stochastic imaginary time evolution algorithm will generically run into the notorious fermionic sign problem, 18 which manifests due to alternating signs in the weights of each statistical sample used in Eq. ( 2).In the worst case, the fermionic sign problem causes the estimator of the energy in Eq. (2) to have exponentially large variance (see Fig. 1(b) top), necessitating that one averages exponentially many samples to obtain a fixed precision estimate of observables such as the ground state energy.Accordingly, exact, unbiased QMC approaches are only applicable to small systems 19,20 or those lacking a signproblem. 21e sign problem can be controlled to give an estimator of the ground state energy with polynomially bounded variance by imposing constraints on the imaginary time evolution of each statistical sample represented by a wavefunction, |φ i (τ ) .These constraints (which include prominent examples such as the fixed node 17,22 and phaseless approximations 23,24 ) are imposed by the use of trial wavefunctions (|Ψ T ), and the accuracy of constrained QMC is wholly determined by the choice of the trial wavefunction (see Fig. 1(b) bottom).Such constraints necessarily introduce a potentially significant bias in the final ground state energy estimate which can be removed in the limit that the trial wavefunction approaches the exact ground state.
Classically, computationally tractable options for trial wavefunctions are limited to states such as a single mean-field determinant (e.g. a Hartree-Fock state), a linear combination of mean-field states, a simple form of the electron-electron pair (two-body) correlator (usually called a Jastrow factor) applied to mean-field states, or some other physically motivated transformations applied to mean-field states such as backflow approaches. 25n the other hand, any wavefunction preparable with a quantum circuit is a candidate for a trial wavefunction on a quantum computer, including more general two-body correlators.These trial wavefunctions will be referred to as "quantum" trial wavefunctions.
To be more concrete, there is currently no efficient classical algorithm to estimate (to additive error) the overlap between |φ i (τ ) and certain complex quantum trial wavefunctions |Ψ T such as unitary coupled-cluster with singles and doubles 26 or the multiscale entanglement renormalization ansatz, 27 even when |φ i (τ ) is simply a computational basis state or a Slater determinant.
Since quantum computers can efficiently approximate Ψ T |φ i (τ ) , there is a potential quantum advantage in this task as well as its particular use in QMC.This offers a different route towards quantum advantage in groundstate fermion simulations as compared to VQE, which instead seeks an advantage in the variational energy evaluation.We expand on this discussion of quantum advantage in Appendix F. We also note that VQE may be used to generate a sophisticated trial wavefunction which alone would not be sufficient to achieve high accuracy, but might offer quantitative accuracy and even quantum advantage when used as a trial wavefunction in our approach.
Our quantum-classical hybrid QMC algorithm (QC-QMC) utilizes quantum trial wavefunctions while performing the majority of imaginary time evolution on a classical computer, and is summarized in Fig. 1(c).In essence, on a classical computer one performs imaginary time evolution for each wavefunction statistical sample, |φ i (τ ) , and collects observables such as the ground state energy estimate, E (i) (τ ).During this procedure, a constraint associated with the quantum trial wavefunction is imposed to control the sign problem.To perform the constrained time evolution, the only quantity that needs to be calculated on the quantum computer is the overlap between the trial wavefunction, |Ψ T , and the statistical sample of the wavefunction at imaginary time τ , |φ i (τ ) .
In this work, we estimate the overlap between the trial wavefunction and the statistical samples using a technique known as shadow tomography. 29,30Experimentally, this entails performing randomly chosen measurements of a reference state related to |Ψ T prior to beginning the QMC calculation.In this formulation of QC-QMC, we emphasize that there is no need for the QMC calculation to iteratively query the quantum processor, despite the fact that the details of the statistical samples are not determined ahead of time.By disentangling the interaction between the quantum and classical computer we avoid feedback latency, an appealing feature on NISQ platforms that comes at the cost of requiring potentially expensive classical post-processing (see Appendix D 3 for more details).Furthermore, our algorithm naturally achieves some degree of noise robustness explained in Appendix D 6 because the quantity that is directly is the ratio between overlap values, which is inherently resilient to the overlaps being rescaled by certain error channels.
While our approach applies generally to any form of constrained QMC, here we discuss an experimental demonstration of the algorithm that uses an implementation of QMC known as auxiliary-field QMC (AFQMC), which will be referred to as QC-AFQMC.In AFQMC, the phaseless constraint 24 is imposed to control the sign problem, and |φ i (τ ) takes the form of a single Slater determinant in an arbitrary single-particle basis.AFQMC has been shown to be accurate in a number of cases even with classically available trial wavefunctions; 31,32 however, the bias incurred from the phaseless constraint cannot be overlooked, as we discuss in detail below.Since a single determinant mean-field wavefunction is the most widely used classical form of the trial function for AFQMC, here we will use "AFQMC" to denote the use of AFQMC with mean-field trial wavefunction.Results and discussion.The experiments in this work were carried out on Google's 54-qubit quantum processor, known as Sycamore. 28The circuits were compiled using hardware-native CZ gates with typical error rates of ≈ 0.5%. 33As the first example, in Fig. 2, we illustrate the quantum primitive used to perform shadow tomography on the H 4 molecule in an 8-qubit experiment.Our eight spin-orbital quantum trial wavefunction consists of a valence bond wavefunction known as a perfect pairing state 34,35 and a hardware-efficient quantum circuit 13 with an offline single-particle rotation applied to this, which would be classically difficult to use as a trial wavefunction for AFQMC.The state preparation circuit in Fig. 2(a) shows how this trial wavefunction can be efficiently prepared on a quantum computer.Similar state preparation circuits are used for the other chemical examples in this work.
In this 8-qubit experiment, we consider H 4 in a square geometry with side lengths of 1.23 Å and its dissociation into four hydrogen atoms.This system is often used as a testbed for electron correlation methods in quantum chemistry. 37,38We perform our calculations using two Gaussian basis sets: the minimal (STO-3G) basis 39 and the correlation consistent quadruple-zeta (cc-pVQZ) basis. 36The latter basis set is of a size and accuracy required to make a direct comparison with laboratory experiments.When describing the ground state of this system, there are two equally important, degenerate meanfield states.This makes AFQMC with a single meanfield trial wavefunction highly unreliable.In addition, a method often referred to as a "gold standard" classical approach (coupled-cluster with singles, doubles, and perturbative triples, CCSD(T) 40 ) also performs poorly for this system.
In Table I, the difficulties of AFQMC and CCSD(T)  36 ; 60-orbital).For clarity, the relative energies are shifted to zero at 2.25 Å. Inset shows the error in total energy relative to the exact results in kcal/mol.The dash dotted line in the inset provides bounds for chemical accuracy (1 kcal/mol).Note that the variational energy of the quantum trial used here is outside the plotted energy scale.The statistical error bars of AFQMC and QC-AFQMC are not visible on this scale.(b, top) Circuit layout showing spin-up and spin-down qubits for the 16-qubit experiment.(b, bottom) Error in total energy as a function of lattice constant of diamond in a double zeta basis (DZVP-GTH; 26 orbitals).The dash dotted line shows the bounds for chemical accuracy.Our quantum trial results are not visible on this energy scale.For high values of the lattice constant none of these methods achieve chemical accuracy but the use of the quantum trial still improves the AFQMC result.Inset shows a supercell structure of diamond where two highlighted atoms form the minimal unit cell.
are well illustrated by comparing their atomization energies with exact values in two different basis sets.Both approaches show errors that are significantly larger than "chemical accuracy" (1 kcal/mol).The variational energy of the quantum trial reconstructed from experiment has a bias that can be as large as 33 kcal/mol.The noise on our quantum device makes the quality of our quantum trial far from that of the ideal (i.e., noiseless) ansatz as shown in Fig. 2(b) and (c), resulting in an error as large as 10 kcal/mol in the atomization energy.Nonetheless, QC-AFQMC reduces this error significantly, and achieves chemical accuracy in both bases.
As shown in Appendix C 3, for the larger basis set, we obtain a residual "virtual" correlation energy by using the quantum resources on a smaller number of orbitals to unbias an AFQMC calculation on a larger number of orbitals, with no additional overhead to the quantum computer.This capability makes our implementation competitive with state-of-the-art classical approaches.Similar virtual correlation energy strategies have been previously discussed within the framework of VQE, 41 but unlike our approach, those strategies come with a significant measurement overhead.To unravel the QC-AFQMC results on H 4 further, we illustrate in Fig. 2(b) and (c) the evolution of trial and QC-AFQMC energies as a function of the number of measurements made on the device.Despite the presence of significant noise within approximately 10 5 measurements, QC-AFQMC achieves chemical accuracy while coping with a sizeable residual bias in the underlying quantum trial.
Next, we move to a larger example, N 2 , which requires a total of 12 qubits in our quantum experiment.Here, a simpler quantum trial is used for QC-AFQMC by taking just the valence bond part of the wavefunction depicted in Fig. 2(a).We examine the potential energy surface of N 2 from compressed to elongated geometries, which is another common benchmark problem for classical quantum chemistry methods. 38,42In Fig. 3 (a), the QC-AFQMC result is shown for the calculations performed in a triple zeta basis (cc-pVTZ), 36 which corresponds to a 60-orbital or 120-qubit Hilbert space.All examined methods, CCSD(T), AFQMC, and QC-AFQMC perform quite well near the equilibrium geometry, but CCSD(T) and AFQMC deviate from the exact results significantly as one stretches the bond distance.As a result, the error of "gold-standard" CCSD(T) can be as large as 14 kcal/mol and the error of AFQMC with a classical trial wavefunction can be as large as -8 kcal/mol.The error in the QC-AFQMC computation ranges from -2 kcal/mol to 1 kcal/mol depending on the bond distance.Thus, while we do not achieve chemical accuracy with QC-AFQMC, we note that even with a very simple quantum trial wavefunction, we produce energies that are competitive with state-of-the-art classical approaches.
Lastly, we present a 16-qubit experiment result on the ground state simulation of a minimal unit cell (2-atom) model of periodic solid diamond in a double-zeta basis set (DZVP-GTH 43 ; 26 orbitals).While at this level of theory the model exhibits significant finite-size effects and does not predict the correct experimental lattice constant, we aim to illustrate the utility of our algorithm in materials science applications.We emphasize that this is the largest quantum simulation of chemistry on a quantum processor to date.Previously, the largest correlated quantum simulations of chemistry involved half a dozen qubits or less 13 with more than an order of magnitude fewer two-qubit gates than is used here, while the largest mean-field calculation performed on a quantum computer involved a dozen qubits with fewer than half as many two-qubit gates. 15We again use the simple perfect pairing state as our quantum trial wavefunction and demonstrate the improvement over a range of lattice parameters compared with classical AFQMC and CCSD(T) in Fig. 3 (b).There is a substantial improvement in the error going from AFQMC to QC-AFQMC showing the increased accuracy due to better trial wavefunctions.Our accuracy is limited by the simple form of our quantum trial and yet we achieve accuracy nearly on par with the classical gold standard method, CCSD(T).
Conclusion.In summary, we proposed a scalable, noise-resilient quantum-classical hybrid algorithm that seamlessly embeds a special-purpose quantum primitive into an accurate quantum computational manybody method, namely QMC.Our work offers an alternative computational strategy that effectively unbiases fermionic QMC approaches by leveraging state-ofthe-art quantum information tools.We have realized this algorithm for a specific QMC algorithm known as AFQMC, and experimentally demonstrated its performance in experiments as large as 16-qubit on a NISQ processor, producing electronic energies that are competitive with state-of-the-art classical quantum chemistry methods.Our algorithm also allows for incorporating the electron correlation energy outside the space that is handled by the quantum computer without increasing quantum resources or measurement overheads.In Appendix F, we discuss issues related to asymptotic scaling and the potential for quantum advantage in our algorithm, including the challenge of measuring wavefunction overlaps precisely.While we have yet to achieve practical quantum advantage over available classical algorithms, the flexibility and scalability of our proposed approach in the construction of quantum trial functions, and its inherent noise resilience, promise a new path forward for the simulation of chemistry in the NISQ era and beyond.
Acknowledgements.The authors thank members of the Google Quantum AI theory team and Fionn Malone for helpful discussions.BO is supported by a NASA Space Technology Research Fellowship.The quantum hardware used for this experiment was developed by the Google Quantum AI hardware team, under the direction of Anthony Megrant, Julian Kelly and Yu Chen.Theoretical foundations for device calibrations were provided by the physics team lead by Vadim Smelyanskiy.Initial data collection was enabled by cloud access to these devices as part of Google Quantum AI's Quantum Computing Service Early Access Program.Pedram Roushan and Charles Neill from the Google team helped to execute the experiment on hardware and design figures.
Note.The code and data for this study are available from the corresponding authors upon request and will be made available publicly in the future.After this work was nearly complete, a theory paper by Yang et al. appeared on arXiv, 44 describing a quantum algorithm for assisting real time dynamics with unconstrained QMC.
Author contributions.JL conceived of the quantum-classical hybrid QMC algorithm, performed QMC calculations, and with contribution from others drafted the manuscript.WJH proposed the use of shadow tomography and designed the experiment with contributions from others.BO helped with theoretical analysis and the com-pilation of circuits.NCR helped with the presentation of figures.JL and RB managed the scientific collaboration.All authors participated in discussions, the writing of the manuscript, and the analysis of the data.
in the final ground state energy estimate.The accuracy of QMC simulations is, therefore, wholly determined by the quality of the trial wavefunction.In cases where strong electron correlation is not present, using a simple single Slater determinant trial wavefunction obtained from a mean-field (MF) approach leads to accurate approximate ground state energies from QMC.However, for cases where MF wavefunctions are qualitatively wrong, one must resort to other alternatives.The form of wavefunction must be simple enough to evaluate the projection onto a working QMC basis in an efficient manner.The QMC basis takes the form of real-space points in DMC, occupation vectors in GFMC, and non-orthogonal Slater determinants in AFQMC.The projection onto the QMC basis often scales exponentially with system size for coupled-cluster states and tensor-product states such as matrix product states.Trial wavefunctions consisting of a linear combination of determinants have been widely used due to the simple evaluation of the projection in this case.However, obtaining an accurate linear combination of determinants scales poorly because the number of important determinants generically scales exponentially with system size.Given these facts, there is a need for a new paradigm that allows for more flexible choices of trial wavefunctions which can lead to more accurate QMC algorithms without losing their scalability.
In this work, we have proposed harnessing the power of quantum computers in performing a hybrid quantumclassical QMC simulation, which we refer to as the QC-QMC algorithm.The key observation that we exploit is that it is possible to perform the QMC basis projection for a wide range of wavefunctions in a potentially more efficient manner on quantum computers than on classical computers.This suggests that one may isolate the specific task of the projection from the QMC algorithm and use quantum computers to perform this task and separately communicate this information to a classical computer to continue the QMC calculation.In principle the required quantity is straightforward to approximate using the Hadamard test. 45However, because the QMC basis projection needs to be performed thousands of times for a single QMC calculation, for Noisy Intermediate-Scale Quantum (NISQ) devices we propose using shadow tomography to characterize the trial wavefunction and evaluate the projection such that the on-line interaction between the quantum and classical device no longer exists.This enables the exploration of the utility of quantum trial wavefunctions without concern for the challenges of tightly coupling high performance classical computing resources with a NISQ device.We demonstrate the usefulness and noise resilience of this approach by producing accurate experiments through Google's Sycamore processor on prototypical strongly correlated chemical systems such as H 4 in a minimal basis and a quadruple-zeta basis, as well as bond-breaking of N 2 in a triple-zeta basis.We also studied a minimal unit cell model of diamond within a double zeta basis.
Appendix B: Review of Projector Quantum Monte Carlo QMC methods are among the most accurate approximate electronic structure approaches, and they can be systematically improved with the use of increasingly sophisticated trial functions.Here, we summarize the essence of the algorithm and discuss a specific QMC method which works in second-quantized space, namely auxiliary-field quantum Monte Carlo (AFQMC).While we focus on developing a strategy tailored for AFQMC in this work, the general discussion is not limited to AFQMC and should be applicable to QMC in general.

Projector quantum Monte Carlo
The essence of any projector QMC methods is that one computes the ground state energy and properties via an imaginary-time propagation where τ is the imaginary time, |Ψ 0 is the exact ground state and |Φ 0 is an initial starting wavefunction satisfying Φ 0 |Ψ 0 = 0. Without any further modification, this is an exact approach to the computation of the ground state wavefunction.In practice, a deterministic implementation of Eq. (B1) scales exponentially with system size and therefore one resorts to a stochastic realization of Eq. (B1) for scalable simulations.Such a stochastic realization is typically referred to as projector QMC.
Unfortunately, a direct implementation of Eq. (B1) via QMC suffers from the infamous fermionic sign problem. 18n first quantized QMC methods such as DMC, fermionic antisymmetry is not imposed explicitly.Such approaches require the imposition of the fermionic nodal structure using trial wavefunctions to compute the fermionic ground state.The use of an approximate nodal structure introduces a bias.In second quantized QMC methods the sign problem manifests in a different way.The statistical estimates from a second quantizated QMC method exhibit variances that grow exponentially with system size.Therefore for simulations of large systems no meaningful statistical estimates can be obtained.It is then necessary to impose a constraint in the imaginary-time propagation to deal with the the sign problem and to obtain statistical efficiency.An example of such a constraint is the "phaseless" constraint in AFQMC (see below).While such constraints introduce biases in the final estimates, rendering QMC approaches inherently approximate in practice, different constrained approaches will have relative strengths and weaknesses with respect to accuracy and flexibility.

Auxiliary-field quantum Monte Carlo
Auxiliary-field quantum Monte Carlo (AFQMC) is a projector QMC method that works in second-quantized space. 46herefore, the sign problem in AFQMC manifests in growing variance in statistical estimates.To impose a constraint in the imaginary-time propagation, it is natural to introduce a trial wavefunction that can be used in the importance sampling as well as the constraint.This results in a wavefunction at imaginary time τ expressed as where |φ i (τ ) is the wavefunction of the i-th walker, w i (τ ) is the weight of the i-th walker, and |Ψ T is some a priori chosen trial wavefunction.From Eq. (B2), it is evident that the importance sampling is imposed based on the overlap between the walker wavefunction and the trial wavefunction.Walker wavefunctions in Eq. (B2) are almost always chosen to be single Slater determinants and the action of the imaginary propagation, exp −∆τ Ĥ , for a small time step ∆τ in Eq. (B1) transforms the walkers in such a way that they stay within the single Slater determinant manifold via the Hubbard-Stratonovich transformation.This property is essential if the computational cost is to grow only polynomially with system size, and is at the core of the AFQMC algorithm as well as that of another commonly used unconstrained (and therefore unbiased) projector QMC approach called the determinant QMC method. 19hile repeatedly applying the imaginary time propagator to the wavefunction, the AFQMC algorithm prescribes a particular way to update the walker weight w i (τ ) in Eq. (B2).In essence, it is necessary that all weights stay real and positive so that the final energy estimator, has a small variance.Here, E (i) (τ ) is so-called the local energy, which is defined as We note that Eq. (B3) is not a variational energy expression and is commonly referred to as the "mixed" energy estimator in QMC.The essence of the constraint is that one updates the i-th walker weight from τ to τ + ∆τ using where and θ i (τ ) is the argument of S i (τ ).This is in a stark contrast with a typical importance sampling strategy which updates the walker weights using S i (τ ), which does not guarantee the positivity and reality of the walker weights.If |Ψ T is exact, this constraint does not introduce any bias, but simply imposes a specific boundary condition on the imaginary propagation which can be viewed as a "gauge-fixing" of the wavefunction.In practice, one does not have access to the exact |Ψ T and therefore can only compute an approximate energy whose accuracy wholly depends on the choice of |Ψ T .Such a constraint is usually referred to as the "phaseless approximation" in the AFQMC literature.Currently, classically tractable trial wavefunctions that are commonly used are either single determinant trials or take the form of a linear combination of determinants. 47,48The former is very scalable (up to 500 electrons or so) but can be often inaccurate, especially for strongly correlated systems, while the latter is limited to a small number of electrons (16 or so) but can produce results that are very accurate even for strongly correlated systems.The choice of the trial wavefunction renders AFQMC limited by the evaluation of Eq. (B3) and Eq.(B6).If the computation of either one of these quantities scales exponentially with system size, the resulting AFQMC calculation will be exponentially expensive.
In the main text, we presented the general philosophy of the QC-QMC algorithm and here we wish to provide more QC-AFQMC-specific details tailored to the experiments presented in this work.
From the perspective of QMC simulations, the main benefit of using a quantum computer is to expand the range of available trial wavefunctions beyond what is efficient classically.Namely, we seek a class of trial wavefunctions that are inherently more accurate than a single determinant trial while bypassing the difficulty of variational optimization on the quantum computer.Among the set of possible trial functions, we are interested in using wavefunctions for which no known polynomial-scaling classical algorithm exists for the exact evaluation of Eq. (B3) and Eq.(B6).The core idea in the QC-AFQMC algorithm is that one can approximately measure Eq. (B3) and Eq.(B6) on the quantum computer and implement the majority of the imaginary-time evolution classically.Our goal is provide a roadmap for quantum computers to apply polynomial-scaling algorithms for the evaluation of Eq. (B3) and Eq.(B6) up to additive errors and thus ultimately to observe quantum advantage in some systems.This clearly separates subroutines into those that need to be run on quantum computers and those on classical computers.

Quantum trial wavefunctions
The specific trial functions of interest in this work are simple variants of so-called coupled-cluster (CC) wavefunctions.In quantum chemistry, CC wavefunctions are among the most accurate many-body wavefunctions. 49They are defined by an exponential parametrization, where |ψ 0 is a single determinant reference wavefunction and the cluster operator T is defined as We use {i, j, k, • • • } to denote occupied orbitals and {a, b, c, • • • } for unoccupied orbitals.T can be extended to include single excitations (S), double excitations (D), triple excitations (T) and so on.The resulting CC wavefunction is then systematically improvable by including higher-order excitations.The most widely used version involves up to doubles and is referred to as CC with singles and doubles (CCSD).There is no efficient algorithm for variationally determining the CC amplitudes, t; however, there is an efficient projective way to determine these amplitudes and the energy, although the resulting energy determined by this procedure is not variational.Such non-variationality manifests as a breakdown of conventional CC, although it has been suggested that the underlying wavefunction is still qualitatively correct and the projective energy evaluation is partially responsible for this issue. 50mploying CCSD (or higher-order CC wavefunctions) within the AFQMC framework is difficult because the overlap between a CCSD wavefunction and an arbitary Slater determinant cannot be calculated efficiently without approximations.This is true for nearly all non-trivial variants of coupled cluster.Notably, there is currently no known efficient classical algorithm for precisely calculating wavefunction overlaps even for the cases of coupled cluster wevefunctions with a limited set of amplitudes, such as generalized valence bond perfect-pairing (PP). 34,35In QC-AFQMC, we can efficiently approximate the required overlaps of such wavefunctions by using a quantum computer to prepare a unitary version of CC wavefunctions or approximations to them.By using CC wavefunctions that we can obtain circuit parameters classically, we are able to avoid a costly variational optimization procedure on the quantum device.
The simplified CC wavefunction ansatz that we utilize in this work is the generalized valence bond PP ansatz.This ansatz is defined as where the orbital rotation operator is defined as and the PP cluster operator is TPP = In this equation, each i is an occupied orbital and each i * is the corresponding virtual orbital that is paired with the occupied orbital i.We map the spin-orbitals of this wavefunction to qubits using the Jordan-Wigner transformation.We note that the pair basis in t i is defined in the rotated orbital basis defined by the orbital rotation operator.
Due to its natural connection with valence bond theory which often provides a more intuitive chemical picture than does molecular orbital theory, the PP wavefunction has played an important role in understanding chemical processes. 34Despite its exponential scaling when implemented exactly on a classical computer, PP in conjunction with AFQMC has been discussed previously; see Ref. 51.We will explore the scaling of the PP-based approach in classical AFQMC and QC-AFQMC in more in detail below because this wavefunction is used in all of our experimental examples (see Appendix F).
The PP wavefunction is known to provide insufficient accuracy for the ground state energy in many important examples.This is best illustrated in systems where inter-pair correlation becomes important, such as multiple bond breaking processes. 52While there exist ways to incorporate inter-pair correlation classically, [53][54][55] in this work we focus on adding multiple layers of hardware-efficient operators to the PP ansatz.There are two kinds of these additional layers that we have explored: 1.The first class of layers includes only density-density product terms of the form e J ij ni nj . (C6) 2. The second class includes only "nearest-neighbor" hopping terms between same spin (σ) pairs In both cases, the i and j orbitals are physically neighboring in the hardware layout.We alternate multiple layers of each kind and apply these layers to the PP ansatz to improve the overall accuracy.The efficacy of these layers varies with their ordering with the choice of the i,j pairs.Lastly, we also employ a full single particle rotation at the end of the hardware-efficient layers.This last orbital rotation can be applied to 1-body and 2-body Hamiltonian matrix elements classically, so we do not have to implement this part on the quantum computer.We refer this orbital rotation as "offline orbital rotation" as noted in Fig. 2. H 4 was the only example where we went beyond the PP wavefunction.When this type of hardware-efficient layers is used, we no longer have an efficient classical algorithm to optimize the wavefunction parameters.In such cases, one can resort to the variational quantum eigensolver to obtain these parameters.Nevertheless, in the case of H 4 , the Hilbert space is small enough (4-orbital) that we still could optimize everything classically.

Overlap and Local energy evaluation
As mentioned above, the overlap and local energy evaluations are the key subroutines that involve the quantum trial wavefunctions.One approach to the overlap evaluation is to use the Hadamard test. 45Using modern methods, one could do this without requiring the state preparation circuit to be controlled by an ancilla qubit. 56-58However, this approach would require a separate evaluation for each walker at every time step.To avoid a steep prefactor in quantum device run time, we propose the use of the technique known as shadow tomography as discussed in Appendix D. For now, we will assume that one can make a query to the quantum processor to obtain the overlap between a quantum trial state and an arbitrary Slater determinant efficiently up to additive error of the overlap.
With the ability to measure the overlap between |Ψ T and an arbitrary single Slater determinant, |φ i (τ ) we can easily estimate the local energy in Eq. (B4).The evaluation of the denominator is just an overlap quantity and an efficient estimation of the denominator is possible via where |φ r p and |φ rs pq denote single and double excitations from |φ i (τ ) , respectively.We only need up to double excitations because our Hamiltonian has up to two-body terms.It is then evident that the ability to estimate Ψ T |φ r p and Ψ T |φ rs pq efficiently is sufficient to evaluate the entire local energy because the rest of the terms in Eq. (C8) follow from the simple application of the Slater-Condon rule. 59The number of overlap queries made to the quantum processor scales as O(N 4 ) with N being the system size in this algorithm.Other "mixed" local observables can be computed via similar algorithms.

Virtual correlation energy
Obtaining the correlation energy outside the "active" space, where the actual quantum resource is spent, is critical for converging our simulation results to the basis set limit (or the continuum limit).The correlation energy outside the active space will be referred to as "virtual correlation energy".We are limited in terms of the number of qubits on NISQ devices, so a procedure to incorporate correlation energy outside the relatively small active space is essential.To this end, a virtual correlation energy strategy has been proposed within the framework of VQE, 41 but this approach comes with a significant measurement overhead due to the requirement of three-and four-body reduced density matrices within the active space.
In this section, our goal is to show that a similar technique for QC-AFQMC exists where we can obtain the virtual correlation energy without any additional qubits or any measurement overhead.We write our trial wavefunction as where |ψ T is the quantum trial wavefunction within the active space, |ψ c is a Slater determinant composed of occupied orbitals outside the active space (i.e.frozen core orbitals), and |0 v is a vacuum state in the space of unoccupied orbitals outside the active space (i.e., frozen virtual orbitals).We want to compute the overlap between |Ψ T and a single Slater determinant |φ where the constant can be efficiently evaluated classically by contracting matchgate states and the evaluation of φ ψ T can now be performed on the quantum computer with only N a qubits.
For the local energy evaluation in Eq. (B4), we leverage the same technique that we used in Eq. (C8).The numerator of the local energy expression is and we only need to focus on the computing the following term: Then y∈{0,1} N c φ rs pq (x, y, 0 v )ψ c (y) is the tensor corresponding to a matchgate state itself (with N a open indices) and thus can be computed efficiently classically.Since an equation of the form Eq. (C13) also holds for |φ rs pq , the local energy evaluation can be performed on the quantum computer with only N a qubits.

Appendix D: Experimental Implementation via Shadow Tomography
The basic goal of shadow tomography is to estimate properties of a quantum state without resorting to full state tomography.4][65][66][67][68][69][70] In the experiments performed in this work, we make use of these tools to approximate the quantities required to perform AFQMC, Eq. (B3) and Eq.(B6).We focus here on the proposal put forward by Huang et al. in Ref. 30.This version of shadow tomography is experimentally simple to implement and compatible with today's quantum hardware.
As we shall explain, the use of shadow tomography makes our experiment particularly efficient in terms of the number of repetitions required to evaluate the required wavefunction overlaps.This allows us to avoid performing a separate set of experiments (e.g. using the Hadamard test) for each timestep and walker.However, this efficiency comes at a cost; the way in which we extract these overlaps from the experimental measurement record requires an exponentially scaling post-processing step.We note that this difficulty is specific to the particular choice we made to demonstrate QC-QMC using AFQMC rather than some other QMC method.For example, if we were using a quantum computer to provide the constraint for a Green's function Monte Carlo calculation, the walker wavefunctions would be computational basis states and we could make use of shadow tomography without this issue.It is an open question whether a more sophisticated measurement strategy could be equally efficient in terms of the number of measurements required while also avoiding this additional bottleneck for QC-AFQMC.Exploring the use of shadow tomography with random fermionic gaussian circuits, as in Ref. 66, seems like a promising direction to explore for this purpose.
In Appendix D 1, we review the general formalism of shadow tomography as proposed in Ref. 30.We continue in Appendix D 2 by showing how we can use shadow tomography to approximate the wavefunction overlaps required to perform QC-QMC and discussing the scaling in terms of the number of measurement repetitions performed on the quantum device.We explain the challenges associated with the classical post-processing of the experimental record for QC-AFQMC in Appendix D 3. In Appendix D 4 and Appendix D 5, we describe two strategies we adopt for reducing the number of quantum gates required for our experimental implementation.Appendix D 4 deals with compiling the measurements, while Appendix D 5 explains how we make a tradeoff between the number of gates and the number of measurements.Finally, in Appendix D 6, we show that the quantities we ultimately estimate using the quantum device are resilient to noise, particularly noise during the shadow tomography measurement procedure.

Review of Shadow Tomography
Let ρ denote some unknown quantum state.We assume that we have access to N copies of ρ.Let {O i } denote a collection of M observables.Our task is to estimate the quantities tr(ρO i ) up to some additive error for each O i .The key insight of Ref. 30 is that we can accomplish this efficiently in certain circumstances by randomly choosing measurement operators from a tomographically complete set.
To specify a protocol, we choose an ensemble of unitaries U. We then proceed by randomly sampling U k ∈ U and measuring the state U k ρU † k in the computational basis to obtain the basis state We require that M be invertible, which is true if and only if the collection of measurement operators defined by drawing U ∈ U and measuring in the computational basis is tomographically complete.Assuming that this is true, we can apply M −1 to both sides of Eq. (D1), yielding We call the collection Many choices for the ensemble U are possible. 30,66,69,70Formally, the condition that the measurement channel is invertible is sufficient.In practice, it is also desirable to impose the constraint that the classical post-processing involved in making use of the shadow can be done efficiently.In this work, we utilize randomly selected N -qubit Clifford circuits, as well as tensor products of randomly selected Clifford circuits on fewer qubits.

Approximating Wavefunction Overlaps with Shadow Tomography
Let |Ψ T denote our trial wavefunction.We restrict ourselves to considering |Ψ T that represent fermionic wavefunctions with a definite number of particles η > 0. We focus on states encoded with the Jordan-Wigner transformation, so that the qubit wavefunction for |Ψ T is a superposition of computational basis states with Hamming weight η.Let |φ denote our walker wavefunction, which is also a superposition of computational basis states with Hamming weight η.In this section, we explain how to approximate the wavefunction overlap φ|Ψ T using shadow tomography.
Our protocol begins by preparing the state |τ τ | on the quantum computer, where |τ = (|0 + |Ψ T )/ √ 2, with |0 denoting the all-zero (vacuum) state.The wavefunction overlap of interest is therefore equal to where we used the fact that Ψ T | 0 = φ | 0 = 0.If we define the observables Let us assume for now that U is the Clifford group on N qubits.Therefore, we can use the expression for the inverse channel from Ref. 30, where X is a placeholder variable.In particular, we have The full expression for φ|Ψ T then becomes Furthermore, because we are expressing φ|Ψ T in terms of the expectation values of the two operators P ± with Tr P2 ± = O(1), Theorem 1 of Ref. 30 allows us to bound the number of measurement repetitions we require for a target precision.Specifically, when we take the ensemble of random unitaries to be the Clifford group on all N qubits, as we do in this section, this bound scales with the Hilbert-Schmidt norm of the operators of interest.Consider the case where we would like to estimate the overlap of |Ψ T with a collection of M different wavefunctions {φ i }.Let ci denote our estimate of φ i |Ψ T .We specify a desired accuracy in terms of two parameters, and δ, by demanding that with probability at least 1 − δ.Theorem 1 of Ref. 30 implies that shadow tomography using the N -qubit Clifford group allows us to achieve this accuracy using 3. Classical Post-processing for Wavefunction Overlaps In the previous section, we described how we can use shadow tomography to estimate overlaps of the form φ | Ψ T by evaluating the expression in Eq. (D11), 2(2 , where the U k are Clifford circuits and b k are computational basis states.We explained how these estimates can be made using a modest number of experimental repetitions, even for a large collection of different |φ i .However, we have not yet described the classical post-processing required to perform this estimation.This section addresses this aspect of our experiment and explains how the approach we took for our implementation of QC-AFQMC in practice involves an exponentially scaling step.We will utilize the fact that overlap between stabilizer states (including basis states) can be efficiently computed classically using the Gottesman-Knill theorem. 71,72For instance, the terms b k |U k |0 can be efficiently calculated for any Clifford circuit U k .Therefore, we can just focus on computing φ|U † k |b k to evaluate the expression in Eq. (D11).In special cases, this can be computed efficiently.For example, if |φ = α c α |φ α can be written as a linear combination of a polynomial number of stabilizer states {|φ α } α , then we can efficiently compute φ α |U † k |b k for each α and sum them together.QMC methods such as Green's function Monte Carlo where the walker wavefunctions are computational basis states are a special case that trivially satisfies this requirement.Even when |φ is not exactly sparse, it may be approximately sparse in the computational basis (in the sense of being close to an exactly sparse state).In such a case, provided that we can sample from |φ efficiently (which is possible for a Slater determinant), we could construct a sparse approximation to |φ (see, e.g., Ref. 73) and use this state to approximate the overlap.In our QC-AFQMC experiments, we expanded |φ in this way, except that we performed a sum over all of the computational basis states with the correct symmetries, incurring an exponential overhead.We emphasize, however, that the cost of this post-processing has no effect on the number of quantum samples needed to produce the classical shadow.
For a general wavefunction |φ , computing φ U † k b k may be classically intractable.Specifically, when |φ is a Slater determinant, as our walkers are, there is no known way to efficiently compute the desired overlap classically.Existing strategies for approximating the overlap between two states can allow us to bypass this exponential scaling if an additive error is acceptable.In general, it is possible to approximate the overlap between two states up to some additive error provided that one can sample from one of the states in the computational basis and query each of them for the amplitudes of particular bitstrings.Techniques of this sort are used in variational Monte Carlo 25 and have also been studied in the context of dequantizing quantum algorithms.In particular, Ref. 74 showed that for normalized states |ψ , |φ , the random variable φ | x ψ | x with probability | x | ψ | 2 has mean ψ | φ and constant variance: This implies an algorithm to calculate ψ | φ to within additive error with failure probability at most δ using O( 1 2 log 1 δ ) samples from |ψ and queries to the amplitudes of |ψ and |φ .Unfortunately, the prefactor of 2(2 N + 1) in Eq. (D11) seems to preclude benefiting from a strategy that estimates φ|U † k |b k up to an additive error.This is why we chose to compute the overlap using the exponential scaling enumeration of basis states in our QC-AFQMC experiments.

Global Stabilizer Measurements
In this section, we outline a strategy for reducing the size of the circuits required to perform shadow tomography.This strategy leverages the fact that we measure in the computational basis immediately after performing a randomly sampled Clifford.Therefore, any permutation of the computational basis states that occurs immediately prior to measurement is unnecessary.
In general, applying a unitary U and then measuring in the computational basis |x : x ∈ {0, 1} N , as shadow tomography was originally presented, is equivalent to measuring in the rotated basis U † |x : x ∈ {0, 1} N .For a set of unitaries U, choosing a unitary therefrom uniformly at random and then measuring in the computational basis is equivalent to measuring the positive operator-valued measure (POVM) that the |U| 2 N measurement operators need not be distinct (e.g., if the unitaries in U only permute the computational basis states).In particular, when U is the set of N -qubit Clifford unitaries C N , each measurement operator U † |x x| U is a stabilizer state, and the POVM is where stab N is the set of N -qubit stabilizer states.That the weight of the measurement operators is uniform follows from the symmetry of U (appending any Clifford to each U ∈ U leaves the distribution unchanged); that the uniform weight is 2 N / |stab N | will be explained later.There are 75 and only 2 N N i=1 (2 i + 1) 2 N |C N | stabilizer states. 72This suggests that sampling a uniformly random Clifford is unnecessary.We will now construct a smaller set of 2 −n |stab N | unitaries CN such that the corresponding POVM is equivalent to Let F N be the "H-free" (Hadamard-free) group on N qubits, i.e. the group generated by X, CNOT, CZ.The action of any H-free operator can be written as 75 where Γ is symmetric Boolean matrix; γ, δ ∈ {0, 1} N ; and ∆ is an invertible Boolean matrix.(A Boolean matrix is one whose entries are 0 or 1.)The action of an H-free operator thus is to simply permute the basis states and add some phase.If we are measuring in the computational basis, the phase does not affect the outcome probabilities and the affine change x → ∆x + δ is invertible.Therefore measuring a state in the computational basis and applying the transformation y → ∆ −1 (y + δ) to the outcome y is equivalent to applying F † and then measuring in the computational basis (i.e., measuring in the basis F |x : x ∈ {0, 1} N ).As shown by Bravyi and Maslov, 75 any Clifford operator can be written in the form F • H • F , where F, F ∈ F N and H is a layer of single-qubit Hadamards.
In shadow tomography, we apply a Clifford F • H • F and measure in the computational basis.As explained above, however, the second H-free operator F need not actually be applied; its effect can be implemented entirely in classical post-processing.In general, F and F are not unique.However, Bravyi and Maslov give a canonical form for Clifford operators (by constraining the H-free operators F, F ) that allows for uniform sampling.If we start with their canonical form and "push" as much of F through the Hadamard layer into F , yielding a new form F • H • F = F • H • F , and neglect the new final H-free operator F , we are left with an operator of the form where I ⊂ [N ] is a subset of qubit indices, Γ is a Boolean upper-triangular matrix with support only on I, and ∆ is Boolean.Applying a Clifford operator and measuring in the computational basis can thus be replaced by applying an operator of the form in Eq. (D17) and measuring in the computational basis.A priori, we would also need to do post-processing to account for the affine transformation effected by the neglected H-free operator, but in fact this is not needed.

Partitioned Shadow Tomography
As we discussed in Appendix D 2, shadow tomography using the N -qubit Clifford group can be used to simultaneously estimate M wavefunction overlaps using a number of samples that scales logarithmically in M .However, performing these measurements on a NISQ devices can be challenging because of the required circuit depth.Alternative choices of the ensemble of random unitaries, U, can alleviate this difficulty.In Ref. 30, Huang et al. consider a second choice of U where the unitaries U ∈ U are instead chosen to be tensor products of single-qubit Clifford operators.This choice leads to especially simple circuits.In the worst case, however, it requires a number of measurements scaling exponentially with the locality of the operators to be estimated.
In the experiments performed in this work, we found it useful to interpolate between these two extremes.Specifically, we use an ensemble of random circuits U consisting of tensor products of random Clifford circuits on N/2 qubits.In this section, we explain how the the techniques for overlap estimation we presented in Appendix D 2 can be generalized to this case.Ref. 30 explains how each choice of U has an associated norm which can be used to bound the variance of the estimators derived from the classical shadow.We do not work out the norm or the associated bounds on the number of measurements for our partitioned shadow tomography here.Instead, we merely note that it performed well in practice and leave this elaboration for a future work.
Recalling and simplifying the expression in Eq. (D9), we have We can use an expression like the one from Eq. (D8) to apply the inverse channel, but first we need to specify some notation.We take a partitioning of the N qubits into P parts.Let N 1 , N 2 , ...N P be the number of qubits in each part of the partition.We consider a shadow tomography protocol that applies a randomly selected N p -qubit Clifford to each part, p ∈ {1, 2, ...P }.Thus, we have The inverse of the shadow tomography measurement channel is simply where, as in Eq. (D8), where X is a placeholder variable.Now we specialize to the case where |φ is a computational basis state, which we denote by |β .We could instead take |φ to be any state which is separable between the parts of the partition (or a sum of such states), but specializing to computational basis states is sufficient for our purposes.Let β p denote the component of |β associated with the p-th part of the partition.Using this notation, we can evaluate Eq. (D18) to yield In carrying out our experiments, we specifically chose to use a partition with two parts, one for each of the spin sectors.All of our walker wavefunctions |φ were superpositions of basis states with a Hamming weight η overall and a nonzero number of electrons in each spin sector.Therefore, when we used shadow tomography to evaluate the overlap of our walker wavefunctions |φ with |Ψ T as described in Appendix D 2 and Appendix D 3, β p 0 p = 0 for the calculations we performed.Because of this, we were able to evaluate the wavefunction overlaps using the expression where the c i 's are the amplitudes of |φ in the computational basis, {|β i }.

Noise Resilience
We show in this section that, in certain circumstances, noise has a negligible impact on the measurement of overlap ratios such as where |Ψ T is some fixed trial wavefunction and |φ 1 , |φ 2 are two arbitrary determinants.Recall that the overlap As a warm up, consider a simple noise model: a global depolarizing channel 76 ρ → ρ = (1 − p)ρ + pI (D25) applied right before measurement.Then, neglecting the error in estimating the overlaps due to measurement, our estimate of the overlap becomes where we used the fact that φ i | 0 = 0. Thus the depolarizing channel has no effect on our estimate.Now suppose we were to apply the robust shadow tomography procedure of Ref. 63 to determine the overlap ratio in Eq. (D24).We will assume for now that the state ρ is prepared without error and that we have some unknown noise process occurring during the shadow tomography procedure.We focus first on the case where our ensemble of random unitaries (U) is the Clifford group on all N qubits, which we refer to as the global case.First, we would estimate a noise parameter f .Then we would calculate the classical shadow using the inverse channel where X is a placeholder variable.Note that, in the absence of noise, we have f −1 = 2 N + 1 and we recover Eq. (D8).This yields a single-round estimate of the overlap, As above, the factor of f −1 drops out when taking ratios.Therefore, when doing shadow tomography (using global Cliffords) to calculate ratios as above, we get robustness for free.That is, we can use the true value in the noiseless as in vanilla shadow tomography and the estimates for the ratios are exactly the same as if we had done robust shadow tomography, without actually doing robust shadow tomography (i.e., estimating f and using that estimate to obtain the corrected inverse channel).This is true whenever the assumptions of robust shadow tomography hold, i.e., that the noise is gate-independent, time-stationary and Markovian.
For partitioned shadow tomography with two partitions (as described in Appendix D 5), the same conclusion holds.Ref. 63 describes in detail how robust shadow tomography applies to to a random ensemble consisting of a tensor product of single-qubit Clifford operators.We can apply the same logic to the case when we have a tensor product of random N 2 -qubit Cliffords.This yields an inverse channel, where f 0,0 , f 0,1 , f 1,0 , f 1,1 are four four parameters which characterize the impact of the noise.These parameters could be learned from calibration experiments, but, as we will see, this is unnecessary for our purposes.In our particular case, the two partitions correspond to two spin sectors.We will assume that |ψ i has no overlap with any state of the form |0 ⊗ |ψ or |ψ ⊗ |0 ; in other words, that the state always has at least one particle of each spin.Now again consider a single-round estimate of the overlap 2 The corresponding variational energies are shown in Table II and Table III for a minimal basis set (STO-3G) varying the number of Clifford circuits.Using these trial wavefunctions we computed the phaseless AFQMC energies (i.e., QC-AFQMC energies) as shown in Table VI and Table VII.There is significant variation in the variational energy depending on the number of Cliffords and whether one uses partitioned shadow tomography or not.Nonetheless, the subsequent AFQMC energy is nearly converged with respect to the number of Cliffords at 15000 and run-torun variation is negligible.We observe essentially the same qualitative results in the case of cc-pVQZ as shown in  For N 2 , we performed only one set of partitioned shadow tomography experiments with a total of 15000 Cliffords because we observed that our final AFQMC energy varies very slightly run-to-run in the case of H 4 .We used a correlation-consistent triple-zeta basis, cc-pVTZ. 36The classical AFQMC calculations done with UHF trial wavefunctions and the spin-projection technique did not change the results discussed here.Similarly, we used UHF reference states for CCSD(T) calculations.Here, we provide the raw data which was used in Fig. 3 (a).Our exact results are obtained from HCI where the second-order perturbation correction was found to be smaller than 0.002 a.u.We believe that these "exact" results are converged with enough precision that these numbers can be used as a benchmark for this system.it is sufficient to prepare where and N is the number of spin orbitals.We do this by creating a state using a single-qubit Hadamard and a ladder of CNOT and SWAP gates.Then for each set of 4 qubits corresponding to a pair of spatial orbitals we prepare where the CNOTs and iSWAP gates leave the zero part of the state unchanged.See the portion of Figure 2 (a) labelled "perfect pairing" for a circuit diagram illustrating this step.Figure 2 (a) also shows a circuit diagram of the aforementioned additional number preserving gates used in the eight qubit experiment.The perfect pairing states, as well as the number preserving gates, are discussed from a quantum chemical perspective in Appendix C 1. Now we discuss how to implement the measurement operators.As discussed in Sec.D 4, the measurement operators have the form Let Γ = Γ + ∆.We can rewrite G as i.e., a CZ layer sandwiched by two layers of single-qubit gates.Maslov and Roetteler 88 showed that a CZ layer followed by complete reversal of the qubits can be implemented using a circuit of 2n + 2 CNOT layers (plus intervening layers of single qubit powers of P).Because the CZ layer in the circuit for G is followed only by single-qubit gates and measurement in the computational basis, the reversal of qubits can be easily undone in post-processing.Thus the shadow tomography circuits have a 2-qubit gate depth of at most 2n + 2. This is a significant improvement over using the full Clifford group for shadow tomography; the best known circuit for a general Clifford has 2-qubit depth 9n. 75urthermore, the CZ circuits have the additional properties that they contain only four unique CNOT layers and that they act only along a line, which are advantageous for calibration and qubit mapping, respectively.X. Raw data for N 2 potential energy surface for seven bond distances (R).Note that the energy of our quantum trial here is obtained from a single set of experiment which may vary significantly run-to-run.Table XI.Raw data for the diamond cold curve for five lattice constants (R).Note that the energy of our quantum trial here is obtained from a single set of experiment which may vary significantly run-to-run.Note that these energies include the Madelung constant.

Appendix F: Outlook on Potential Quantum Advantage
In the typical electronic structure context, quantum advantage is focused on the approximation of the ground state energy.In this outlook, we consider the potential for quantum advantage in this general sense, as well as for the specific quantum subroutine used in our QC-AFQMC algorithm, namely the overlap evaluation.We explain our understanding here of the current computational scaling and limits of our proposed approach for the overlap evaluation and the path towards the first "practical" quantum advantage.
System size scaling.In general, we expect the overlap between Ψ T |φ to approach zero exponentially quickly as the system size increases.For example, the typical overlap value of the walker wavefunction with a simple trial wavefunction can be as small as 10 −5 for 16 atoms, 10 −16 for 54 atoms, and 10 −38 for 128 atoms under periodic boundary conditions. 89These examples suggest that the system size scaling consideration is not just an asymptotic consideration but is practically relevant for system sizes that one may wish to study in the near future.Performing AFQMC requires evaluating these overlaps to a fixed relative precision.Therefore, as the system size increases towards the thermodynamic limit, we would expect that QC-AFQMC formally requires exponentially more measurements to maintain the relative precision.
In order to address the challenges due to this scaling, QC-AFQMC might need to be developed beyond the formulation used in our experiment.For example, using more sophisticated wavefunction forms for |φ than a single Slater determinant could allows one to maintain good overlap between |Ψ T and |φ .Again, as long as |φ can be prepared efficiently on a quantum computer, one can efficiently estimate the overlap Ψ T |φ to fixed additive error using the Hadamard test.In some cases, these overlaps might still be too small as a consequence of the QMA-Hardness of the electronic structure problem. 90However, the onset of this sort of exponential scaling would also render intractable other quantum computing algorithms such as energy estimation via quantum phase estimation. 91The reason for this is because such approaches have a cost that is inversely proportional to the overlap between the target eigenstate of interest and the initial state.Thus, if one can make quantum phase estimation efficient by preparing a suitable initial state, we are optimistic that one can use parameterized versions of those states as the initial |φ in QC-QMC in order to avoid the problem of vanishing overlaps.We note that VQE is also expected to face similar difficulties in the worst case since no polynomial scaling circuit ansatz is able to prepare ground states of the most challenging instances of Quantum advantage in the ground state energy computation.When the number of electrons that we consider is not too large, it is possible to assume that the measurement overhead due to the vanishing overlap may not be a practical concern.With our virtual correlation technique, we can maintain a good overlap value within the active space while producing accurate energies overall.Given this, we are optimistic about routes to achieve quantum advantage in fermionic ground state simulation through the QC-AFQMC algorithm.The aforementioned complex quantum states such as hardware efficient ansatze, UCC and 2D-MERA can be good candidates for trial wavefunctions although the relevance of 2D-MERA for chemistry simulations is yet to be seen.An important consideration here is how one actually obtains wavefunction parameters of those complex quantum states.One may optimize them using the variational quantum algorithm or one may take states that can be efficiently optimized classically.For the latter case, it seems likely that approximating the overlap between these states and an arbitrary Slater determinant up to additive error is difficult despite the fact that some of them can be optimized efficiently using classical algorithms.
We hope to observe quantum advantage either in the overlap estimation or in the ground state energy computation using QC-AFQMC and believe that continued advancement along this direction will lead us to one of the first realizations of practical quantum advantage in NISQ fermionic simulations.

Figure 1 .
Figure 1.(a) Depiction of the imaginary time evolution which shows an exponential convergence to the ground state as a function of imaginary time τ .(b) Illustration of the fermionic sign problem.(b, top) Exact deterministic imaginary time evolution and an unconstrained QMC calculation which is exact on average but the signal-to-noise ratio in the average energy E(τ ) diverges in τ due to the sign problem.(b, bottom) Constrained QMC calculations with classical and quantum constraints.The use of quantum constraint can help to reduce the bias that is non-negligible when using the classical constraint.(c) Overview of the quantum-classical hybrid QMC (QC-QMC) algorithm.Stochastic wavefunction samples are represented as {|φ i } τ (depicted as a matrix manageable to store classically) which are evolved in time along with associated weights {w i } τ .Throughout the time evolution, queries to the quantum processor about the overlap value between the quantum trial wavefunction |Ψ T and a stochastic wavefunction sample {|φ i } τ are made while updating the gate parameters to describe {|φ i } τ .Our quantum processor uses N qubits to efficiently estimate the overlap which is then used to evolve w i and to discard stochastic wavefunction samples with w i < 0. Finally, observables such as E(τ ) are computed on the classical computer by only making overlap queries to the quantum processor (see Appendix C 2).

Figure 2 .
Figure 2. (a) Experimental circuit used for the 8-qubit H 4 experiment over a 2x4 qubit grid (from Q 1,1 to Q 2,1 ) on the Sycamore quantum processor. 28In the circuit diagram, H denotes the Hadamard gate, G denotes a Givens rotation gate (generated by XX + YY), P denotes a Pauli gate, and |Ψ T denotes the quantum trial wavefunction.Note that the "offline" orbital rotation is not present in the actual quantum circuit because they can be efficiently handled via classical post-processing as discussed in Appendix C 1. (b) and (c): Convergence of the atomization energy of H 4 as a function of the number of measurements.(b) a minimal basis set (STO-3G) with four orbitals total from four independent experiments with different sets of random measurements and (c) a quadruple-zeta basis set (cc-pVQZ) with 120 orbitals total from two independent experiments.The different symbols in (b) and (c) show independent experimental results.Note that the ideal (i.e., noiseless) atomization energy of quantum trial in (b) is exactly on top of the exact one.Further note that the quantum resource used in (c) is 8-qubit, but as shown in Appendix C 3, our algorithm allows for adding "virtual" electron correlation in a much larger Hilbert space.Top panels of (b) and (c) magnifies the energy range near the exact answer.

Figure 3 .
Figure 3. (a, top) Circuit layout showing spin-up and spin-down qubits for the 12qubit experiment.(a, bottom) Potential energy surface of N 2 in a triple zeta basis set (cc-pVTZ36 ; 60-orbital).For clarity, the relative energies are shifted to zero at 2.25 Å. Inset shows the error in total energy relative to the exact results in kcal/mol.The dash dotted line in the inset provides bounds for chemical accuracy (1 kcal/mol).Note that the variational energy of the quantum trial used here is outside the plotted energy scale.The statistical error bars of AFQMC and QC-AFQMC are not visible on this scale.(b, top) Circuit layout showing spin-up and spin-down qubits for the 16-qubit experiment.(b, bottom) Error in total energy as a function of lattice constant of diamond in a double zeta basis (DZVP-GTH; 26 orbitals).The dash dotted line shows the bounds for chemical accuracy.Our quantum trial results are not visible on this energy scale.For high values of the lattice constant none of these methods achieve chemical accuracy but the use of the quantum trial still improves the AFQMC result.Inset shows a supercell structure of diamond where two highlighted atoms form the minimal unit cell.
where φ(x, y, z) = x, y, z | φ , ψ T (x) = x | ψ T , ψ c (y) = y|ψ c N a is the number of active spin orbitals, and N c and N v are the number of occupied and unoccupied spin orbitals outside of the active space, respectively.We are using x, y, z to denote bit strings in the space composed of single particle orbitals used to construct |Ψ T .Because the tensor φ * (x, y, z) represents a Slater determinant, it is a special case of what is known as a matchgate tensor with N a + N c + N v open indices.This is also the case for ψ c (y) and δ z,0 v (with N c and N v open indices respectively).Thus, their contraction y∈{0,1} N c φ * (x, y, 0 v )ψ c (y) is also a matchgate with N a open indices and support on states of a fixed Hamming weight (i.e. an unnormalized Slater determinant), and can be formed efficiently by contracting overN c + N v legs with |ψ c ⊗ |0 v .[60][61][62]Let φ(x) denote the resulting matchgate tensor after normalization and φ the associated state.Then φ is a normalized Slater determinant in the same Hilbert space as |ψ T .Thus, we have D4) then we have Re( φ|Ψ T ) = Tr [|τ τ | P + ] , (D5) Im( φ|Ψ T ) = Tr [|τ τ | P − ] , (D6) where z = Re(z) + i Im(z) for z ∈ C. Note that Tr [P ± ] = 0 and Tr P 2 ± = Tr [|φ φ| + |0 0|] = 2. (D7) assuming |φ is a normalized wavefunction.

Table III .
Same asTable II but for the unpartitioned shadow tomography experiments.

Table VIII and
Table IX.

Table VI .
AFQMC energy using |Ψ T from four independent repeated partitioned shadow tomography experiments with a different set of random Cliffords for H 4 , STO-3G (minimal basis).The exact ground state energy is -1.969512.The numbers in parentheses indicate the statistical error of the AFQMC energy.

Table VII .
Same asTable VI but for the unpartitioned shadow tomography experiments.

Table VIII .
AFQMC energy using |Ψ T from four independent repeated partitioned shadow tomography experiments with a different set of random Cliffords for H 4 , cc-pVQZ (a quadruple-zeta basis).The exact ground state energy is -2.11216599.The numbers in parentheses indicate the statistical error of the AFQMC energy.

Table IX .
Same asTable VIII but for the unpartitioned shadow tomography experiments.

Table XII .
Resource counts for the QC-AFQMC experiments realized in this work.