Overlap-ADAPT-VQE: Practical Quantum Chemistry on Quantum Computers via Overlap-Guided Compact Ansätze

ADAPT-VQE is a robust algorithm for hybrid quantum-classical simulations of quantum chemical systems on near-term quantum computers. While its iterative process systematically reaches the ground state energy, practical implementations of ADAPT-VQE are sensitive to local energy minima, leading to over-parameterized ansätze. We introduce the Overlap-ADAPT-VQE to grow wave-functions by maximizing their overlap with any intermediate target wave-function that already captures some electronic correlation. By avoiding building the ansatz in the energy landscape strewn with local minima, the Overlap-ADAPT-VQE produces ultra-compact ansätze suitable for high-accuracy initialization of a new ADAPT procedure. Spectacular advantages over ADAPT-VQE are observed for strongly correlated systems including massive savings in circuit depth. Since this compression strategy can also be initialized with accurate Selected-Configuration Interaction (SCI) classical target wave-functions, it paves the way for chemically accurate simulations of larger systems, and strengthens the promise of decisively surpassing classical quantum chemistry through the power of quantum computing.


Introduction
The computational cost of approximating the ground state energy of an n-electron molecular system on classical computing architectures typically grows exponentially in n. Quantum computers allow for the encoding of the exponentially scaling underlying Hilbert space using only O(n) qubits, and are therefore likely to outperform classical devices on a range of chemical simulations [1-3]. The Variational Quantum Eigensolver (VQE) is a hybrid quantum-classical algorithm that is considered a very promising candidate for chemical calculations on Noisy Intermediate Scale Quantum (NISQ) devices [4,5]. In this approach, a parameterized wave-function is generated and variationally tuned to minimize the expectation value of the molecular electronic Hamiltonian. A variety of different parameterized wave-functions have been proposed, including the Trotterised Unitary Coupled Cluster (tUCC) ansatz [6,7], which consists of a sequence of exponential, unitary operators acting on a judiciously chosen reference state. While the tUCC approach includes electronic correlation and has, in principle, a rather simple quantum circuit structure, the excessive depth of these quantum circuits makes them ill-suited for applications in the NISQ regime. This issue has led to the proposal that ansatz wave-functions be constructed through the action of a selective subset of possible unitary operators, i.e., only those operators whose inclusion in the ansatz can potentially lead to the largest decrease in the expectation value of the molecular electronic Hamiltonian. In this context, the Adaptive Derivative-Assembled Pseudo-Trotter VQE (ADAPT-VQE) [8] has emerged as the gold standard for generating highly accurate and compact ansatz wave-functions. In ADAPT-VQE, the ansatz is grown iteratively by appending a sequence of unitary operators to the reference Hartree-Fock state. At each iteration, the unitary operator to be applied is chosen according to a simple criterion based on the gradient of the expectation value of the Hamiltonian (see the section on technical background and methods for details).
Assuming that the number of spin-orbitals N being considered is proportional to the number of electrons n in the system, the pool of potential unitary operators in tUCC-based VQEs scales as $O(N^\ell)$ for $\ell \geq 4$.¹ Consequently, conventional VQEs based on the tUCC ansatz require the representation of a product of $O(N^\ell)$ unitary operators on quantum circuitry and the optimization of an $O(N^\ell)$-dimensional cost-function, both of which are practically impossible using the current generation of NISQ devices. The ADAPT-VQE algorithm attempts to alleviate these problems by avoiding the inclusion of unitary operators in the ansatz wave-function that are not expected to lead to a lowering of the resulting energy. Numerical evidence suggests that ADAPT-VQE is indeed resource-saving and that the energy-gradient criterion employed by ADAPT-VQE leads to much more accurate wave-functions than conventional VQE algorithms while preserving moderate circuit depth [8-10]. Thus, while the state-of-the-art k-UpCCGSD algorithm [11], which the review article [12] considers the most promising fixed-ansatz VQE, is shown to obtain an accuracy of about $10^{-6}$ Hartree for the BeH$_2$ molecule at equilibrium distance at a cost of more than 7000 CNOT gates [13, Table 1], ADAPT-VQE achieves a higher accuracy of about $2 \times 10^{-8}$ Hartree for the same system using only about 2400 CNOT gates [9]. In spite of this comparative advantage, such an energy-gradient-guided procedure has a tendency to fall into local minima of the energy landscape. Exiting from such minima comes at the expense of adding and optimizing operators through multiple ADAPT iterations [14] and leads to over-parameterized wave-functions. In practice, this is associated with an unnecessary increase of the quantum circuit depth required for the representation of the ansatz wave-function, coupled to an increasingly difficult classical optimization. This is dramatically revealed in [9, Supplementary information], wherein the basic QEB variant of ADAPT-VQE is applied to the strongly correlated stretched H$_6$ linear chain, and it is shown that more than a thousand CNOT gates are required to construct a chemically accurate ansatz. Given that the current state-of-the-art simulations on physical quantum computers typically involve a maximal circuit depth of less than 100 CNOT gates (see, e.g., the very recent study [15]), it seems unrealistic in the very short term to expect a chemically accurate QPU implementation of ADAPT-VQE for strongly correlated molecules. Let us remark here that while the focus of this article is on hybrid quantum-classical adaptive algorithms in the tradition of ADAPT-VQE, quantum imaginary time evolution approaches have also been recently proposed and have shown improved optimization in the high-dimensional non-convex energy landscape [16].
Our proposed approach for overcoming the challenges of energy plateaus requires modifying the manner in which the ansatz wave-function is constructed. Indeed, rather than constructing an ansatz wave-function through an energy minimisation procedure and potentially encountering local minima, we grow the ansatz wave-function through a process that maximizes its overlap with a (potentially intermediate) target wave-function that already captures some electronic correlation of the system. We thus use such a target wave-function as a guide to help us build our ansatz in the right direction so as to catch the bulk of electronic correlation. The workflow of this routine is depicted in Figure 2. The resulting overlap-guided ansatz is subsequently used as a high accuracy initialization for an ADAPT-VQE procedure, an algorithm that we refer to as Overlap-ADAPT-VQE. We benchmark and compare the ansatz wave-functions obtained with the Overlap-ADAPT-VQE method to standard ADAPT-VQE on a range of small chemical systems with varying levels of correlation.

Qubit Representation of the Molecular Hamiltonian
The molecular electronic Hamiltonian with one-body and two-body interactions can be expressed in second-quantization notation as
$$H = \sum_{p,q} h_{pq}\, a_p^\dagger a_q + \frac{1}{2} \sum_{p,q,r,s} h_{pqrs}\, a_p^\dagger a_q^\dagger a_r a_s.$$
Here, p, q, r, and s are indices that label the spin-orbitals used to discretize the system, $a_p$ and $a_p^\dagger$ are the p-th fermionic annihilation and creation operators that satisfy the anti-commutation relations $\{a_p, a_q^\dagger\} := a_p a_q^\dagger + a_q^\dagger a_p = \delta_{pq}$ and $\{a_p, a_q\} := a_p a_q + a_q a_p = 0$, with $\delta_{pq}$ denoting the Kronecker symbol (understood as a multiple of the identity operator), and $h_{pq}$ and $h_{pqrs}$ are one-electron and two-electron integrals that can be computed on classical hardware from the spin-orbitals $\Psi_p, \Psi_q, \Psi_r, \Psi_s$ labeled by the indices p, q, r, and s respectively. In order to represent the second-quantized Hamiltonian H on a quantum computer, we use the Jordan-Wigner transform [17,18] to map the creation and annihilation operators to tensor products involving unitary matrices. To this end, we denote by $|0\rangle_p$ and $|1\rangle_p$ the states corresponding to an empty and an occupied spin-orbital p respectively. Using this formalism, the reference Hartree-Fock state for a system having n electrons in N spin-orbitals can be expressed as $|\Psi_{\mathrm{HF}}\rangle := |1_0 \ldots 1_n 0_{n+1} \ldots 0_N\rangle$, and the corresponding fermionic creation and annihilation operators are given by
$$a_p^\dagger = \Big(\bigotimes_{k=0}^{p-1} Z_k\Big) \otimes Q_p^\dagger, \qquad a_p = \Big(\bigotimes_{k=0}^{p-1} Z_k\Big) \otimes Q_p, \qquad Q_p^\dagger := \tfrac{1}{2}(X_p - iY_p), \quad Q_p := \tfrac{1}{2}(X_p + iY_p), \tag{4}$$
where $X_p$, $Y_p$, $Z_p$ are single qubit Pauli gates applied to qubit p [19]. Note that in Equation (4), we have introduced the so-called qubit excitation and de-excitation operators $Q_p^\dagger$ and $Q_p$ respectively, which switch the occupancy of the spin-orbital. These operators will be the subject of further discussion in the sequel. Let us also remark here that the Jordan-Wigner-transformed excitation and de-excitation operators (4) respect the anti-commutation relations (1). This is simply a consequence of including the tensor product of Z-Pauli gates in Equation (4) [17].
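The role of the Z-Pauli string can be checked numerically. The following is a minimal sketch, assuming the convention |0⟩ = empty, |1⟩ = occupied and qubits indexed from 0; the helper names (`kron_all`, `annihilation`, `relations_hold`) are hypothetical and are not part of OpenFermion or of the code used for the results in this article.

```python
import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
# Convention of the text: |0> = empty, |1> = occupied, so Q = |0><1| empties a qubit.
Q = np.array([[0.0, 1.0], [0.0, 0.0]])

def kron_all(mats):
    """Tensor product of a list of one-qubit matrices (qubit 0 leftmost)."""
    out = np.array([[1.0]])
    for m in mats:
        out = np.kron(out, m)
    return out

def annihilation(p, n_qubits):
    """Jordan-Wigner image of a_p: Z on qubits 0..p-1, Q on qubit p."""
    return kron_all([Z] * p + [Q] + [I2] * (n_qubits - p - 1))

def anticommutator(a, b):
    return a @ b + b @ a

def relations_hold(n_qubits):
    """Check {a_p, a_q^dagger} = delta_pq and {a_p, a_q} = 0 numerically."""
    ops = [annihilation(p, n_qubits) for p in range(n_qubits)]
    identity = np.eye(2 ** n_qubits)
    for p, a_p in enumerate(ops):
        for q, a_q in enumerate(ops):
            expected = identity if p == q else 0.0 * identity
            # The matrices are real, so the adjoint is the transpose.
            if not np.allclose(anticommutator(a_p, a_q.T), expected):
                return False
            if not np.allclose(anticommutator(a_p, a_q), 0.0):
                return False
    return True

print(relations_hold(3))
```

Dropping the Z factors leaves operators on different qubits that commute rather than anti-commute, which is exactly the difference between fermionic and qubit excitations.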

The Variational Quantum Eigensolver
Equipped with the single-qubit Pauli gate representation of the molecular Hamiltonian H, we are now interested in approximating its ground-state eigenvalue. The Variational Quantum Eigensolver (VQE) is a hybrid quantum-classical algorithm that couples a classical optimization loop to a subroutine that computes, on a quantum computer, the expectation value of the Hamiltonian with respect to a proposed ansatz wave-function. This quantum subroutine involves two fundamental steps:
1. The preparation of a trial quantum state (the ansatz wave-function) $|\Psi(\vec{\theta})\rangle$. A variety of different functional forms for the ansatz wave-function have been proposed [7,20-22], including the aforementioned tUCC ansatz, which consists of a sequence of parameterized, exponential fermionic excitation and de-excitation operators acting on a reference state (see below for explicit expressions of these operators).
2. The measurement of the expectation value $\langle\Psi(\vec{\theta})|H|\Psi(\vec{\theta})\rangle$.
The output of the quantum subroutine is fed into a classical optimization algorithm which calculates the optimal set of parameters $\vec{\theta}_{\mathrm{opt}}$ that minimizes the expectation value of the Hamiltonian H. The variational principle ensures that the resulting optimized energy is always an upper bound for the exact ground-state energy $E_0$ of H, i.e., $E_0 \leq \langle\Psi(\vec{\theta}_{\mathrm{opt}})|H|\Psi(\vec{\theta}_{\mathrm{opt}})\rangle$. The fundamental challenge in implementing the VQE methodology on NISQ devices is thus to construct an ansatz wave-function that can capture the most important contributions to the electronic correlation energy and, at the same time, is capable of being represented on rather shallow quantum circuits. A necessary condition to achieve the latter is that the chosen ansatz wave-function be parameterized with a relatively small number of optimization parameters. Thus, the major computational shortcoming of the popular tUCCSD method (which otherwise possesses an attractive functional form [7]) is that its actual implementation on quantum computers requires extremely deep circuits which generate far too much noise on the current generation of NISQ devices [23]. Indeed, implementing the tUCCSD algorithm on quantum architectures through the Jordan-Wigner mapping (4) requires $O(N^3 n^2)$ quantum gates [7] (recall that N is the number of spin-orbitals being considered and n is the number of electrons in the system, so that if N is proportional to n, then the number of quantum gates required will be of the order of $O(N^5)$). This problem is further exacerbated by the ubiquitous usage of CNOT gates in the construction of quantum circuits for fermionic excitation and de-excitation operators. tUCCSD has recently been extended to triple excitations (tUCCSDT) [24] and combined with both spin and orbital symmetries to reduce the operator count, but this count remains too high for implementation on real-life QPUs despite a significant gain in accuracy over tUCCSD.
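The variational upper bound can be illustrated on a toy problem. The sketch below assumes a hypothetical random 2-qubit Hamiltonian (not a molecular one) and a one-parameter ansatz that rotates |01⟩ into |10⟩; a plain grid scan stands in for the classical optimizer, and exact linear algebra stands in for the quantum measurement.

```python
import numpy as np

# Hypothetical real symmetric 2-qubit Hamiltonian, used only for illustration.
rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))
H = (M + M.T) / 2

def ansatz(theta):
    """One-parameter trial state cos(theta)|01> + sin(theta)|10>."""
    psi = np.zeros(4)
    psi[1] = np.cos(theta)  # amplitude on |01>
    psi[2] = np.sin(theta)  # amplitude on |10>
    return psi

def energy(theta):
    """Expectation value <psi(theta)|H|psi(theta)> of the normalized trial state."""
    psi = ansatz(theta)
    return psi @ H @ psi

# Grid scan over the parameter as a stand-in for the classical optimization loop.
thetas = np.linspace(0.0, np.pi, 2001)
energies = np.array([energy(t) for t in thetas])
theta_opt = thetas[np.argmin(energies)]
e_vqe = energies.min()

# Exact ground-state energy for comparison.
e_exact = np.linalg.eigvalsh(H)[0]
print(e_vqe - e_exact)  # non-negative by the variational principle
```

Because the ansatz only explores a two-dimensional subspace, the optimized energy generally stays strictly above the exact ground-state energy, which is the situation a richer ansatz is meant to improve.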

The ADAPT-VQE Ansatz
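The adaptive derivative-assembled pseudo-Trotter variational quantum eigensolver (ADAPT-VQE) [8] was designed to overcome the computational shortcomings of the traditional tUCCSD method by proposing an ansatz function that is adaptively grown through an iterative process. ADAPT-VQE is based on the fact [25] that the full-CI quantum state can be represented by the action of a potentially infinitely long product of only one-body and two-body operators on the reference Hartree-Fock determinant, i.e.,
$$|\Psi_{\mathrm{FCI}}\rangle = \prod_{k=1}^{\infty} \prod_{p,q} \Big( e^{\theta_p^q(k)\, \hat{A}_p^q} \prod_{r,s} e^{\theta_{pq}^{rs}(k)\, \hat{A}_{pq}^{rs}} \Big) |\Psi_{\mathrm{HF}}\rangle,$$
where $\hat{A}_p^q$ and $\hat{A}_{pq}^{rs}$ denote one-body and two-body operators, and $\theta_p^q(k)$ (resp. $\theta_{pq}^{rs}(k)$) is the expansion coefficient of the k-th repetition of the operator $\hat{A}_p^q$ (resp. $\hat{A}_{pq}^{rs}$). The general workflow of the ADAPT-VQE algorithm is as follows:
1. On classical hardware, compute one-electron and two-electron integrals, and map the molecular Hamiltonian into a qubit representation. On quantum hardware, initialize the qubits to the state $|\Psi_0\rangle = |\Psi_{\mathrm{HF}}\rangle$.
2. Define a pool of parameterized unitary operators that will be used to construct the ansatz.
3. On quantum hardware, at the m-th iteration, identify the parameterized unitary operator $\hat{U}_m(\theta_m)$ whose action on the current ansatz $|\Psi_{m-1}\rangle$ will produce a new wave-function with the largest drop in energy. This identification is done by computing suitable gradients at $\theta_m = 0$, the gradients being expressed in terms of commutators involving the molecular Hamiltonian acting on the current ansatz wave-function:
$$\hat{U}_m = \underset{\hat{U} \in \mathrm{Pool}}{\arg\max} \left| \frac{\partial}{\partial\theta} \langle\Psi_{m-1}|\hat{U}(\theta)^\dagger H\, \hat{U}(\theta)|\Psi_{m-1}\rangle \Big|_{\theta=0} \right|.$$
4. Exit the iterative process if the gradient norm is smaller than some threshold ε. Otherwise, append the selected operator to the left of the current ansatz wave-function, i.e., define $|\tilde{\Psi}_m\rangle := \hat{U}_m(\theta_m)|\Psi_{m-1}\rangle = \hat{U}_m(\theta_m)\hat{U}_{m-1}(\theta'_{m-1}) \cdots \hat{U}_1(\theta'_1)|\Psi_0\rangle$.
5. Hybrid Quantum-Classical VQE: Optimize all parameters $\theta_m, \theta_{m-1}, \ldots, \theta_1$ in the new ansatz wave-function so as to minimize the expectation value of the molecular Hamiltonian, i.e., solve the optimization problem
$$(\theta'_1, \ldots, \theta'_m) := \underset{\theta_1, \ldots, \theta_m}{\arg\min}\ \langle\Psi_0|\hat{U}_1(\theta_1)^\dagger \cdots \hat{U}_m(\theta_m)^\dagger\, H\, \hat{U}_m(\theta_m) \cdots \hat{U}_1(\theta_1)|\Psi_0\rangle,$$
and define the new ansatz wave-function $|\Psi_m\rangle := \hat{U}_m(\theta'_m) \cdots \hat{U}_1(\theta'_1)|\Psi_0\rangle$ using the newly optimized parameters. Let us emphasize that although we also denote the newly optimized parameters at the current m-th iteration by $\theta'_1, \ldots, \theta'_m$, these optimized values are not necessarily the same as those used to define $|\Psi_{m-1}\rangle$ and referenced in Step 4 above.
6. Return to Step 3 with the updated ansatz $|\Psi_m\rangle$.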
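The workflow above can be sketched as a short state-vector simulation. This is an illustrative sketch only, not the in-house code used for the results: the Hamiltonian is a hypothetical particle-conserving toy model on 4 qubits rather than a molecular Hamiltonian, the pool contains qubit excitation generators, all names (`embed`, `evolve`, `adapt_vqe`, ...) are our own, and expectation values are computed by exact linear algebra instead of measurements.

```python
import itertools
import numpy as np
from scipy.optimize import minimize

N_QUBITS, N_PART = 4, 2
I2 = np.eye(2)
Q_DAG = np.array([[0.0, 0.0], [1.0, 0.0]])  # |1><0|, with |1> = occupied

def embed(op, p):
    """Place a one-qubit operator on qubit p of the register (qubit 0 leftmost)."""
    out = np.array([[1.0]])
    for k in range(N_QUBITS):
        out = np.kron(out, op if k == p else I2)
    return out

QD = [embed(Q_DAG, p) for p in range(N_QUBITS)]   # qubit creation operators
QA = [m.T for m in QD]                            # qubit annihilation operators
NUM = [QD[p] @ QA[p] for p in range(N_QUBITS)]    # occupation numbers

# Hypothetical toy Hamiltonian: random on-site, hopping and pair-interaction terms.
rng = np.random.default_rng(1)
H = np.zeros((2 ** N_QUBITS,) * 2)
for p in range(N_QUBITS):
    H += rng.normal() * NUM[p]
for p, q in itertools.combinations(range(N_QUBITS), 2):
    hop = QD[p] @ QA[q]
    H += rng.normal() * (hop + hop.T) + rng.normal() * NUM[p] @ NUM[q]

# Pool of anti-symmetric generators of single- and double-qubit excitations.
POOL = []
for p, q in itertools.combinations(range(N_QUBITS), 2):
    t = QD[p] @ QA[q]
    POOL.append(t - t.T)
for (p, q), (r, s) in [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]:
    t = QD[p] @ QD[q] @ QA[r] @ QA[s]
    POOL.append(t - t.T)

def evolve(gen, theta, psi):
    """Apply exp(theta * gen), using gen^3 = -gen for these generators."""
    g_psi = gen @ psi
    return psi + np.sin(theta) * g_psi + (1 - np.cos(theta)) * (gen @ g_psi)

def state(ops, thetas, psi0):
    psi = psi0
    for k, theta in zip(ops, thetas):  # the most recent operator acts last
        psi = evolve(POOL[k], theta, psi)
    return psi

def energy(thetas, ops, psi0):
    psi = state(ops, thetas, psi0)
    return psi @ H @ psi

def adapt_vqe(psi0, max_ops=12, eps=1e-6):
    ops, thetas, history = [], [], [psi0 @ H @ psi0]
    for _ in range(max_ops):
        psi = state(ops, thetas, psi0)
        # dE/dtheta at theta = 0 equals <psi|[H, G]|psi> = 2 <psi|H G|psi> (real case).
        grads = np.array([2 * psi @ H @ (g @ psi) for g in POOL])
        if np.linalg.norm(grads) < eps:
            break
        ops.append(int(np.argmax(np.abs(grads))))
        x0 = np.array(thetas + [0.0])  # warm start, new parameter at zero
        res = minimize(energy, x0, args=(ops, psi0), method="BFGS")
        thetas = list(res.x)
        history.append(res.fun)
    return ops, thetas, history

hf = np.zeros(2 ** N_QUBITS)
hf[0b1100] = 1.0  # qubits 0 and 1 occupied
ops, thetas, history = adapt_vqe(hf)
sector = [i for i in range(2 ** N_QUBITS) if bin(i).count("1") == N_PART]
e_exact = np.linalg.eigvalsh(H[np.ix_(sector, sector)])[0]
print(len(ops), history[-1] - e_exact)
```

Warm-starting each optimization from the previous parameters with the new parameter set to zero guarantees that the energy cannot increase from one iteration to the next; it does not, however, protect the procedure from the plateaus discussed in this article.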
There are essentially three types of operator pools that are used to construct the ADAPT-VQE ansatz.
• Fermionic-ADAPT-VQE [8] uses a pool of spin-complemented pairs of single and double fermionic excitation operators. The quantum circuits performing these unitary operations are of the staircase shape (see Figure 1, equation (11)).
• Qubit-ADAPT-VQE [10] divides the fermionic-ADAPT operators after the Jordan-Wigner mapping and takes the individual Pauli strings as operators of the pool. The quantum circuit for an operator is a single layer of fermionic excitation "CNOT-staircase" circuits, similar to the circuit displayed in Figure 1, equation (12).
• Qubit-Excitation-Based-ADAPT-VQE (QEB-ADAPT-VQE) [9] uses a pool of qubit excitation operators. Exponential single-qubit and double-qubit excitation evolutions, denoted $U^{(\mathrm{sq})}_{pq}(\theta)$ and $U^{(\mathrm{dq})}_{pqrs}(\theta)$, are exponentials of anti-Hermitian combinations of the qubit creation and annihilation operators $Q_p^\dagger$ and $Q_p$ defined through Equation (4); after the Jordan-Wigner encoding, they become exponentials of sums of Pauli strings. Here p, q, r, and s denote, as usual, indices for the spin-orbitals, and we have written (sq) and (dq) as abbreviations for single qubit and double qubit excitation evolutions respectively. The quantum circuits corresponding to the single-qubit and double-qubit excitation operators [26] are then given in Figure 1, equation (13).
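As a hedged illustration of the single-qubit excitation evolution, the sketch below assumes the generator $Q_p^\dagger Q_q - Q_q^\dagger Q_p$ on a 2-qubit register (the index ordering and sign convention are our assumption and may differ from those of [9,26]) and checks numerically that it coincides with the Pauli form $\tfrac{i}{2}(X_p Y_q - Y_p X_q)$.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Q_DAG = (X - 1j * Y) / 2  # |1><0|: fills an empty qubit
Q = (X + 1j * Y) / 2      # |0><1|: empties an occupied qubit

def two_qubit(op_p, op_q):
    """Operator op_p on qubit p and op_q on qubit q of a 2-qubit register."""
    return np.kron(op_p, op_q)

# Anti-Hermitian generator of a single-qubit excitation between qubits p and q.
G = two_qubit(Q_DAG, Q) - two_qubit(Q, Q_DAG)
# The same generator written with Pauli matrices.
pauli_form = 0.5j * (two_qubit(X, Y) - two_qubit(Y, X))

def single_qubit_excitation(theta):
    """exp(theta * G) in closed form, valid because G^3 = -G."""
    return np.eye(4) + np.sin(theta) * G + (1 - np.cos(theta)) * (G @ G)

U = single_qubit_excitation(0.3)
print(np.allclose(G, pauli_form))
```

With this convention the evolution acts as a plane rotation between |01⟩ and |10⟩ and leaves |00⟩ and |11⟩ untouched, which is why no Z string (and hence no CNOT staircase) is needed.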
Figure 1. On top, quantum circuit applying the operator $e^{\theta (a_i^\dagger a_k)}$ as part of a single fermionic excitation. Double fermionic excitations are carried out using an analogous circuit. In the middle, quantum circuit (12) performing a generic single-qubit evolution $U^{(\mathrm{sq})}_{pq}(\theta)$ and, at the bottom, a quantum circuit (13) performing a generic double-qubit evolution $U^{(\mathrm{dq})}_{pqrs}(\theta)$. Note that the terms single-qubit and double-qubit excitations refer to the fact that these operators perform rotations on one pair and two pairs of qubits respectively, not one and two individual qubits.
Extensive comparisons between these pools of operators have been carried out by Yordanov et al. [9], and numerical evidence suggests that QEB-ADAPT-VQE generates the most computationally tractable ansatz wave-functions. This is primarily due to the fact that qubit excitation circuits can be constructed using far fewer quantum gates than fermionic excitation circuits [26], in combination with the observation that qubit excitation evolutions approximate molecular electronic wave-functions with almost the same level of accuracy as fermionic excitation evolutions. For the purpose of this article therefore, we will restrict our attention to operator pools involving qubit excitation evolutions and work in the framework of QEB-ADAPT-VQE.

The Overlap-Guided Adaptive Algorithm (Overlap-ADAPT)
The numerical evidence presented in the articles [8,9,14] demonstrates that the ADAPT-VQE algorithm is capable of approximating the ground state full-CI energy to a very high accuracy. Unfortunately, achieving a suitably accurate approximation to the sought-after energy may require a large number of ADAPT iterations, which results both in deep quantum circuits that cannot be implemented on the current generation of NISQ devices and in an increasingly computationally expensive optimization procedure. This problem is particularly apparent in strongly correlated systems for which the ADAPT algorithm frequently encounters energy plateaus² prior to achieving the classical chemical accuracy threshold of $10^{-3}$ Hartree. Since quantum chemists are primarily interested in numerical results in the regime $10^{-3}$ to $10^{-4}$ Hartree, i.e., slightly more accurate than the chemical accuracy threshold, it is natural to ask if the ADAPT-VQE procedure could be modified so as to avoid these initial energy plateau slowdowns and achieve the required accuracy using an ansatz compact enough to be implementable on current NISQ devices.
To make these ideas more precise, let us first introduce, for any natural number p, the set of all wave-functions that can be represented by the product of exactly p exponential, one-body and two-body qubit excitation evolution operators acting on the Hartree-Fock reference state:
$$W_p := \Big\{ \hat{A}_p(\theta_p) \cdots \hat{A}_1(\theta_1)|\Psi_{\mathrm{HF}}\rangle \;:\; \hat{A}_k \text{ a one-body or two-body qubit excitation evolution operator},\ \theta_k \in \mathbb{R} \Big\}.$$
Given now an arbitrary electronic wave-function $|\Psi_{\mathrm{ref}}\rangle$, we can define the best approximation of $|\Psi_{\mathrm{ref}}\rangle$ in the set $W_p$ as
$$|\Psi_p^*\rangle := \underset{|\Psi\rangle \in W_p}{\arg\min}\ \big\| |\Psi_{\mathrm{ref}}\rangle - |\Psi\rangle \big\|, \tag{15}$$
where $\|\cdot\|$ denotes a suitable norm such as the usual $L^2$ or $H^1$ norms on the space of all electronic wave-functions. The $L^2$-norm and the $H^1$-norm can both be computed on either classical computers or on quantum devices, depending on whether the underlying wave-functions are represented classically or on quantum circuitry. The computation of the $L^2$-norm, however, is more direct and we will therefore adopt this choice of norm for the subsequent numerical simulations considered in this study.
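The link between the norm-based best approximation and overlap maximization can be made explicit. As a short worked step, assuming both wave-functions are normalized (an assumption we add here for the purpose of illustration), expanding the squared $L^2$ distance gives:

```latex
\begin{align*}
\big\| \Psi_{\mathrm{ref}} - \Psi \big\|^2
  &= \langle \Psi_{\mathrm{ref}} | \Psi_{\mathrm{ref}} \rangle
   + \langle \Psi | \Psi \rangle
   - 2\,\mathrm{Re}\,\langle \Psi_{\mathrm{ref}} | \Psi \rangle \\
  &= 2 - 2\,\mathrm{Re}\,\langle \Psi_{\mathrm{ref}} | \Psi \rangle .
\end{align*}
```

Minimizing the $L^2$ distance over a set of normalized wave-functions therefore amounts to maximizing the real part of the overlap with the target (or its modulus, if wave-functions differing by a global phase are identified).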
Returning now to Equation (15), we see that $|\Psi_p^*\rangle$ is the best approximation of an arbitrary target wave-function $|\Psi_{\mathrm{ref}}\rangle$ using a product of exactly p exponential qubit excitation evolution operators acting on the Hartree-Fock reference state. The question we are now interested in answering is the following: if we take the full-CI wave-function $|\Psi_{\mathrm{FCI}}\rangle$ as the target, does the corresponding best approximation $|\Psi_p^{\mathrm{FCI}}\rangle$ defined according to (15) provide a chemically accurate wave-function for small choices of p? More precisely, we wish to explore whether, for small choices of maximal operator count p, the energy of $|\Psi_p^{\mathrm{FCI}}\rangle$ lies within chemical accuracy of the full-CI energy. The answer to this question will be a strong indication as to whether there exists an ansatz wave-function that is simultaneously more compact than the ADAPT-VQE ansatz and which can also capture the bulk of the electronic correlation in the system. Let us emphasise that we are specifically interested in understanding whether we can obtain a more compact ansatz wave-function than that produced by ADAPT-VQE at chemical accuracy and not at the level of full-CI accuracy.
Unfortunately, answering this question by solving the optimization problem (15) for an arbitrary target wave-function exactly is not computationally feasible since the size of the set $W_p$ grows exponentially in p. Nevertheless, an adaptive, iterative procedure that generates an approximate solution to the optimization problem (15) can be defined as follows (see also Figure 2). Given a target wave-function $|\Psi_{\mathrm{ref}}\rangle$ and a maximal operator count p:
1. Set the initialisation to the Hartree-Fock reference state, i.e., set $|\Psi_0\rangle = |\Psi_{\mathrm{HF}}\rangle$.
2. At the m-th iteration, m ≤ p, identify the parametrised exponential qubit excitation evolution operator $\hat{A}_m(\theta_m)$ whose action on the current ansatz $|\Psi_{m-1}\rangle$ will produce a new wave-function with the largest overlap with respect to the target wave-function. This identification is done by computing, at $\theta_m = 0$, the gradient with respect to $\theta_m$ of the overlap between $\hat{A}_m(\theta_m)|\Psi_{m-1}\rangle$ and the target $|\Psi_{\mathrm{ref}}\rangle$ (Equation (17)). A detailed description of how to compute the gradients given in Equation (17) can be found in the appendix.
3. Append the selected operator to the left of the current ansatz wave-function, i.e., define $|\tilde{\psi}_m\rangle := \hat{A}_m(\theta_m)|\Psi_{m-1}\rangle$.
4. Optimize all parameters $\theta_m, \theta_{m-1}, \ldots, \theta_1$ in the new ansatz wave-function $|\tilde{\psi}_m\rangle$ so as to maximize its overlap with the target wave-function, i.e., solve the optimization problem
$$(\theta'_1, \ldots, \theta'_m) := \underset{\theta_1, \ldots, \theta_m}{\arg\max}\ \big| \langle\Psi_{\mathrm{ref}}|\hat{A}_m(\theta_m) \cdots \hat{A}_1(\theta_1)|\Psi_0\rangle \big|,$$
and define the new ansatz wave-function $|\Psi_m\rangle$ using the newly optimized parameters $\theta'_1, \ldots, \theta'_m$, i.e., define $|\Psi_m\rangle := \hat{A}_m(\theta'_m) \cdots \hat{A}_1(\theta'_1)|\Psi_0\rangle$. Let us emphasize that although we also denote the newly optimized parameters at the current m-th iteration by $\theta'_1, \ldots, \theta'_m$, these optimized values are not necessarily the same as those used to define $|\Psi_{m-1}\rangle$ and referenced in Step 3 above.
We refer to this adaptive procedure as the Overlap-ADAPT-VQE. Let us emphasise here that rather than fixing a maximal operator count, we may employ some other convergence criteria such as the magnitude of the overlap or the magnitude of the gradient vectors as in the original ADAPT-VQE. Moreover, depending on whether the target wave-function is in a quantum or a classical representation, the gradient screening and the overlap measurements can be performed using either a quantum or a classical device. In particular, if the targeted wave-function is classically computed, then no additional quantum resources or measurements are required to compute the overlaps.
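The overlap-guided procedure can be sketched in the same spirit as a small classical state-vector simulation. The sketch below is illustrative only: the target is a hypothetical random normalized vector in the 2-particle sector of a 4-qubit register (standing in for a full-CI, SCI or ADAPT target), the cost function is chosen to be the squared overlap, and the names `overlap_adapt`, `neg_fidelity`, etc. are our own.

```python
import itertools
import numpy as np
from scipy.optimize import minimize

N_QUBITS, N_PART = 4, 2
I2 = np.eye(2)
Q_DAG = np.array([[0.0, 0.0], [1.0, 0.0]])  # |1><0|, with |1> = occupied

def embed(op, p):
    """Place a one-qubit operator on qubit p of the register (qubit 0 leftmost)."""
    out = np.array([[1.0]])
    for k in range(N_QUBITS):
        out = np.kron(out, op if k == p else I2)
    return out

QD = [embed(Q_DAG, p) for p in range(N_QUBITS)]
QA = [m.T for m in QD]

# Pool of anti-symmetric generators of single- and double-qubit excitations.
POOL = []
for p, q in itertools.combinations(range(N_QUBITS), 2):
    t = QD[p] @ QA[q]
    POOL.append(t - t.T)
for (p, q), (r, s) in [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]:
    t = QD[p] @ QD[q] @ QA[r] @ QA[s]
    POOL.append(t - t.T)

def evolve(gen, theta, psi):
    """Apply exp(theta * gen), using gen^3 = -gen for these generators."""
    g_psi = gen @ psi
    return psi + np.sin(theta) * g_psi + (1 - np.cos(theta)) * (gen @ g_psi)

def state(ops, thetas, psi0):
    psi = psi0
    for k, theta in zip(ops, thetas):  # the most recent operator acts last
        psi = evolve(POOL[k], theta, psi)
    return psi

def neg_fidelity(thetas, ops, psi0, target):
    """Minus the squared overlap with the target (to be minimized)."""
    return -(target @ state(ops, thetas, psi0)) ** 2

def overlap_adapt(target, psi0, max_ops):
    ops, thetas = [], []
    overlaps = [abs(target @ psi0)]
    for _ in range(max_ops):
        psi = state(ops, thetas, psi0)
        # d/dtheta <target|exp(theta G)|psi> at theta = 0 equals <target|G|psi>.
        grads = np.array([target @ (g @ psi) for g in POOL])
        ops.append(int(np.argmax(np.abs(grads))))
        x0 = np.array(thetas + [0.0])  # warm start, new parameter at zero
        res = minimize(neg_fidelity, x0, args=(ops, psi0, target), method="BFGS")
        thetas = list(res.x)
        overlaps.append(float(np.sqrt(-res.fun)))
    return ops, thetas, overlaps

dim = 2 ** N_QUBITS
hf = np.zeros(dim)
hf[0b1100] = 1.0  # qubits 0 and 1 occupied

# Hypothetical target wave-function with a sizeable Hartree-Fock component.
rng = np.random.default_rng(2)
sector = [i for i in range(dim) if bin(i).count("1") == N_PART]
target = np.zeros(dim)
target[sector] = rng.normal(size=len(sector))
target[0b1100] = 2.0
target /= np.linalg.norm(target)

ops, thetas, overlaps = overlap_adapt(target, hf, max_ops=6)
print(overlaps[0], overlaps[-1])
```

No Hamiltonian appears anywhere in this loop: only overlaps with the target are needed, which is why a classically computed target requires no additional quantum measurements. Selecting by the gradient of the overlap or by the gradient of its square gives the same operator as long as the current overlap is nonzero, since the two differ by a common factor.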
We are now interested in applying the Overlap-ADAPT procedure to the reference full-CI wave-functions of some simple, yet strongly correlated molecular systems in an effort to understand the compactness of the wave-function generated by QEB-ADAPT-VQE in the chemical accuracy regime. To do so, we will compute the energy of the Overlap-ADAPT approximation of the target full-CI wave-functions of a stretched BeH$_2$ molecule and a stretched linear H$_6$ chain in a minimal basis set as a function of the number of optimisation parameters, and plot this energy in comparison to the energy obtained using QEB-ADAPT-VQE.
The resulting energy plots, which are displayed in Figure 3, clearly show that the overlap-guided adaptive procedure is able to avoid the initial energy plateaus afflicting the ADAPT procedure that prevent the attainment of chemical accuracy in a small number of iterative steps. These results strongly suggest the potential for creating a more condensed ansatz wave-function than that generated by ADAPT-VQE which can sidestep the issue of early energy plateaus. Before proceeding, let us point out that a key metric for evaluating the efficiency of the Overlap-ADAPT algorithm is the overlap between the ansatz wave-function and the full-CI wave-function over the course of several algorithm iterations. Consequently, for the stretched BeH$_2$ molecule and the stretched linear H$_6$ chain considered above, we plot the overlap convergence with respect to the full-CI wave-function in Figure 4. It is readily seen that the Overlap-ADAPT procedure targeted at the full-CI wave-function far outperforms the original ADAPT-VQE, achieving a notably higher overlap with the full-CI wave-function for both a stretched BeH$_2$ molecule and a stretched linear H$_6$ chain. In particular, for the H$_6$ system, while ADAPT-VQE reaches a plateau and stalls, the Overlap-ADAPT procedure advances smoothly without interruption.
Of course, the Overlap-ADAPT-VQE targeted at a full-CI wave-function does not define a practical VQE since the full-CI ground state energy is precisely the quantity we wish to approximate. A practical VQE based on orbital overlap optimization can, however, be developed by replacing the targeted full-CI wave-function with a tractable high accuracy approximation thereof and using the resulting overlap-guided ansatz wave-function as a high accuracy initialisation for a new ADAPT-VQE procedure. The targeted "computable" wave-function in this situation can be completely general, i.e., it can be the output of any existing numerical algorithm, whether classical or quantum.
The goal of the subsequent sections is to showcase the efficacy of this Overlap-ADAPT algorithm at obtaining chemically accurate results using a minimal number of optimisation parameters. Such findings are important for practical uses of quantum computing for quantum chemistry since, as we have already stated, real-life chemists are interested in reaching convergence in energies corresponding to the so-called chemical accuracy, i.e., $10^{-3}$ to $10^{-4}$ Hartree. Our results can therefore introduce a practical route for compactifying the ADAPT-VQE operator counts using the Overlap-ADAPT-VQE within this accuracy regime.

Setting of Numerical Simulations
The classical numerical simulations reported in this section have been carried out with an in-house code, using the OpenFermion-PySCF module [27] for integral computations and OpenFermion [28] for second quantization and the Jordan-Wigner mapping. All calculations are performed within the minimal STO-3G basis set [29] without considering frozen orbitals unless otherwise specified. Note that the number of qubits that a simulation requires is equal to the number of spin-orbitals of a system, which therefore limits the quality of the single-particle basis and the size of the system that can be simulated. All optimization routines use the BFGS algorithm implemented in the SciPy Python module [30]. We use a pool of non spin-complemented restricted single- and double-qubit excitation evolutions. By 'restricted', we mean that we consider only excitations from occupied orbitals to virtual orbitals with respect to the Hartree-Fock determinant. Using fewer operators in the pool makes the gradient screening process faster and easier to handle from a computational point of view [9]. To ensure a fair comparison, this same operator pool is used for both the overlap-guided ansatz and ADAPT-VQE.
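To give a feeling for the size of such a restricted pool, the following sketch enumerates occupied-to-virtual index tuples. It assumes that the Hartree-Fock determinant occupies the first n spin-orbitals and applies no spin or spatial-symmetry filtering, so the counts are upper bounds on what a symmetry-adapted pool would contain; the function `restricted_pool` is hypothetical and not part of the in-house code.

```python
from itertools import combinations

def restricted_pool(n_spin_orbitals, n_electrons):
    """Index tuples of occupied-to-virtual single and double excitations."""
    occupied = range(n_electrons)
    virtual = range(n_electrons, n_spin_orbitals)
    # Single excitations: one occupied index i, one virtual index a.
    singles = [(i, a) for i in occupied for a in virtual]
    # Double excitations: a pair of occupied indices and a pair of virtual indices.
    doubles = [(i, j, a, b)
               for i, j in combinations(occupied, 2)
               for a, b in combinations(virtual, 2)]
    return singles, doubles

# BeH2 in the STO-3G basis: 14 spin-orbitals and 6 electrons.
singles, doubles = restricted_pool(14, 6)
print(len(singles), len(doubles))
```

For comparison, the linear H$_6$ chain in STO-3G has 12 spin-orbitals and 6 electrons, for which the same call returns a smaller pool.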
To anticipate applications of such adaptive algorithms on noisy quantum machines, there are essentially two constraints to respect:
• The circuit depth should be kept as shallow as possible so as to reduce the effect of decoherence in NISQ devices. In the current context, the circuit depth corresponds to the number of gates used to construct our wave-function ansatz.
• The number of measurements a NISQ device can undertake is very limited. On the other hand, the ADAPT-VQE algorithm requires a large number of measurements both in the form of gradient evaluations at the beginning of each iteration and during the VQE optimization step of the ansatz wave-function. The optimization step in particular often requires an excessive number of measurements since the cost function is both high dimensional and noisy. Consequently, the optimization of the ansatz wave-function is simply intractable with a limited number of evaluations, thus preventing practical application of ADAPT-VQE on current quantum devices.
In order to implement such adaptive algorithms on the current generation of NISQ devices therefore, we must minimise both the circuit depth and the number of evaluations. Indeed, as the depth of a circuit increases, the noise level also increases, which results in a greater number of samples being required for accurate measurement of the Hamiltonian expectation values. In ADAPT-VQE, each operator added to the ansatz corresponds to an additional layer of quantum gates in the circuit and an additional parameter in the ansatz. Consequently, to address both the circuit depth and the number of evaluations constraints, we will evaluate the energy convergence as a function of the number of operators present in the ansatz.

Application of Overlap-ADAPT-VQE for Compactification of ADAPT-VQE Ansätze
As a first test of its effectiveness, we apply the overlap-guided adaptive algorithm to a target wave-function provided by an existing QEB-ADAPT-VQE procedure and then use the result as a high-accuracy initialisation for a new QEB-ADAPT-VQE procedure. Essentially, this first set of numerical experiments is meant to model the situation where we have a strong constraint on the circuit depth (represented by the number of optimisation parameters in the ansatz wave-function), and we wish to see if it is possible to use the Overlap-ADAPT-VQE procedure to compactify the ADAPT-VQE ansatz, thereby obtaining a higher accuracy wave-function that respects the constraint on the circuit depth.
We compute the ground state energy of the benchmark Beryllium Hydride (BeH$_2$) molecule considered in the original ADAPT-VQE articles [8]. We consider the BeH$_2$ molecule both at its equilibrium geometry (bond length of 1.3264 Angstrom) as well as at a stretched geometry (bond length of 3.0 Angstrom), which is meant to model a more strongly correlated system. Our results are depicted in Figure 5.
The numerical results indicate that the Overlap-ADAPT-VQE can indeed compactify the QEB-ADAPT-VQE ansatz wave-function, and that using the output as an initialization for a new QEB-ADAPT-VQE yields a much more accurate wave-function. Under the constraint of a maximal operator count of 50, the overlap-guided procedure improves the final accuracy of the computed BeH$_2$ ground state energy at equilibrium and stretched geometries by a factor of 3 and 10 respectively. Note that the improvement in accuracy is much higher in the case of the stretched BeH$_2$ molecule, which exhibits strong correlation, and this suggests that the comparative advantage of the overlap-guided adaptive algorithm over a pure ADAPT-VQE procedure will be more conspicuous for strongly correlated molecules, i.e., systems for which the ADAPT-VQE algorithm struggles to compute the ground state energy. Thus, in the case of the BeH$_2$ molecule for instance, we are able to achieve chemical accuracy using only a 34-operator ansatz wave-function whereas the QEB-ADAPT-VQE algorithm requires more than 50. Numerical simulations for stretched BeH$_2$ using lower maximal operator counts of 40 and 45 are displayed in Figure 6 and show similar improvements in the final accuracy of the ansatz wave-function, although the advantage decreases as the maximal operator count decreases.
A further test of the Overlap-ADAPT-VQE applied to a target QEB-ADAPT-VQE wave-function is carried out for the diatomic Nitrogen (N$_2$) molecule at equilibrium and stretched geometries. Although the minimal basis set for N$_2$ is quite large, a tractable computation can be carried out using an active space approach where the eight core electrons of the N$_2$ molecule are frozen and the ground state energy of the system is computed using the resulting frozen core effective Hamiltonian, an approach commonly referred to as CAS(6,6). As shown in Figure 7, the Overlap-ADAPT procedure does not further compactify the QEB-ADAPT-VQE wave-function at equilibrium, the final accuracy of the Overlap-QEB-ADAPT-VQE being only slightly higher than that of the standard QEB-ADAPT-VQE procedure. Nevertheless, applying the Overlap-ADAPT-VQE procedure twice, i.e., taking a QEB-ADAPT-VQE wave-function as the first target, performing an Overlap-ADAPT-VQE procedure, and then taking the resulting wave-function as the target for an additional Overlap-ADAPT-VQE procedure, yields a huge gain.
Let us remark that, as a rule of thumb, for all these simulations the Overlap-ADAPT algorithm is used to construct an approximate wave-function using a number of operators equal to about 40%-50% of the maximal operator count. If the maximal operator count is more flexible, then as a general rule we observe that the ADAPT-VQE ansatz taken immediately after the ADAPT process has exited an energy plateau serves as an effective choice of target wave-function for an overlap-guided adaptive procedure, i.e., the Overlap-ADAPT-VQE can produce a more compact wave-function with comparable energy to that of the target ADAPT wave-function. On the other hand, taking an ADAPT-VQE ansatz wave-function from the middle of an energy plateau as the overlap-guided target seems to be a less effective strategy.

Application of Overlap-ADAPT-VQE to Classically Computed Wave-Functions
The stretched linear H6 chain is a molecular system that exhibits a high degree of electronic correlation. The complex electronic structure creates a rough energy landscape with many local minima, which makes the global energy minimum difficult to find. This system has already been extensively studied 9, and it was shown that achieving chemical accuracy with the ADAPT-VQE method required constructing an ansatz wave-function with more than 150 operators from a pool of either generalized fermionic or generalized qubit excitations. Clearly, resources of this kind are far from being accessible on current NISQ devices, and it is therefore necessary to develop adaptive methods for simulating systems using a much smaller operator count. Until now, the most extensive VQE experiments have typically encompassed around 10 operators while accumulating an error of at least 0.1 Hartree 23,31. Unfortunately, since the ADAPT-VQE ansatz wave-function presumably does not contain a satisfactory choice of qubit excitation evolution operators until an unreachable number of iterations has been performed, it cannot be used as the target of the overlap-guided adaptive algorithm as in the previous subsection. Instead, we propose the use of an intermediate, classically computed, multi-configuration wave-function as the overlap-guided target. This approach has the significant advantage of not costing additional quantum resources. Particularly well-suited choices that fit in the framework of adaptive methods are provided by the so-called Selected-CI (SCI) methods.
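To make the notion of an overlap-guided target more concrete, the following is a minimal toy sketch of overlap-guided ansatz growth. It is not the implementation used in this work. The state is a small real vector, the operator pool consists of plane rotations between pairs of basis states (standing in for qubit excitation evolutions), the infidelity is taken to be one minus the squared overlap, and the target is assumed to have a positive overlap with the reference state. As a further simplification, only the newly appended parameter is optimized at each step, whereas ADAPT-type procedures typically re-optimize all parameters. All function names are hypothetical.

```python
import math
from itertools import combinations


def overlap(target, state):
    """Real inner product <target|state>."""
    return sum(t * s for t, s in zip(target, state))


def overlap_guided_growth(target, n_ops):
    """Greedily append plane rotations that increase the overlap with `target`.

    Toy assumption: the generator A_ij = |i><j| - |j><i| gives the rotation
    exp(theta * A_ij) = cos(theta) * I + sin(theta) * A_ij in the (i, j) plane.
    """
    dim = len(target)
    state = [1.0] + [0.0] * (dim - 1)  # reference basis state (toy "Hartree-Fock")
    ansatz = []                        # list of ((i, j), theta)
    infidelities = [1.0 - overlap(target, state) ** 2]
    for _ in range(n_ops):
        # Gradient of <target|exp(theta * A_ij)|state> at theta = 0 for each pool operator.
        grads = {
            (i, j): target[i] * state[j] - target[j] * state[i]
            for i, j in combinations(range(dim), 2)
        }
        (i, j), grad = max(grads.items(), key=lambda item: abs(item[1]))
        if abs(grad) < 1e-12:
            break  # state is already parallel to the target
        # Within the (i, j) plane the overlap is a*cos(theta) + grad*sin(theta) + const,
        # which is maximal at theta = atan2(grad, a).
        a = target[i] * state[i] + target[j] * state[j]
        theta = math.atan2(grad, a)
        s_i, s_j = state[i], state[j]
        state[i] = math.cos(theta) * s_i + math.sin(theta) * s_j
        state[j] = math.cos(theta) * s_j - math.sin(theta) * s_i
        ansatz.append(((i, j), theta))
        infidelities.append(1.0 - overlap(target, state) ** 2)
    return ansatz, state, infidelities


# Normalized toy target with positive overlap on the reference state.
target = [0.8, 0.4, 0.4, 0.2]
ansatz, state, infidelities = overlap_guided_growth(target, n_ops=6)
for step, value in enumerate(infidelities):
    print(f"{step} operators: infidelity = {value:.3e}")
```

In this toy setting the selection criterion never evaluates an energy, only overlaps with the target, which is the feature that lets the growth of the ansatz bypass plateaus of the energy landscape.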

Combining Classical Selected-CI Approaches and Quantum Computing
The key idea of SCI methods is to build a compact representation of the reference wave-function by selecting on-the-fly the most relevant Slater determinants using an importance criterion based on perturbation theory (PT). Thanks to this selection of the Slater determinants, the variational energy of the reference wave-function converges rapidly towards the full-CI energy. Although the recent revival of SCI approaches [32][33][34][35][36][37][38][39][40] has significantly pushed back the size limit of systems for which near full-CI quality energies can be obtained (typically a few tens of correlated electrons in about two hundred orbitals 41,42), the scaling of SCI methods is intrinsically exponential in the number of correlated electrons and orbitals.
The reason for this exponential scaling is directly linked to the linear parametrization of the sought-after wave-function in terms of Slater determinants, which implies that the intrinsic exponential structure of the wave-function must be built explicitly by adding more and more determinants to the reference wave-function. This necessarily leads to size consistency errors, which manifest themselves through an underestimation of the coefficients of the reference and perturbative wave-functions and therefore of the correlation energy. Because the size consistency errors grow with the total (absolute) value of the correlation energy, SCI methods struggle more and more as the number of correlated electrons and/or the strength of correlation increases. Recently, attempts to cure this problem have been proposed with a selection of the individual excitation operators 43,44 in a single-reference CC approach.
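The difference between the two parametrizations can be written out schematically. The notation below is an assumption made for illustration: $c_I$ are the CI coefficients on the reference space $R$, $A_k$ are anti-Hermitian excitation generators with parameters $\theta_k$, and $|\mathrm{HF}\rangle$ is the Hartree-Fock reference state.

$$|\Psi_{\mathrm{SCI}}\rangle = \sum_{I \in R} c_I |I\rangle, \qquad |\Psi(\theta)\rangle = \prod_{k=1}^{N} e^{\theta_k A_k} |\mathrm{HF}\rangle, \qquad A_k^\dagger = -A_k .$$

For two commuting generators, expanding the exponentials gives

$$e^{\theta_1 A_1} e^{\theta_2 A_2} |\mathrm{HF}\rangle = \left(1 + \theta_1 A_1 + \theta_2 A_2 + \theta_1 \theta_2 A_1 A_2 + \dots\right) |\mathrm{HF}\rangle ,$$

so the product excitation $A_1 A_2 |\mathrm{HF}\rangle$ appears with amplitude $\theta_1 \theta_2$ without requiring a parameter of its own, whereas in the linear expansion the corresponding determinant has to be added explicitly with an independent coefficient.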
To overcome these limitations of SCI approaches, an alternative idea is to combine the robust, linear parametrization of SCI with the intrinsically exponential parametrization of the ansatz used in QC computation, so as to take advantage of both worlds:

1. While reaching chemical accuracy with SCI methods is a struggle in the strong correlation regime, obtaining a compact and robust representation of the bulk of correlation effects is an easy task thanks to the smart selection of Slater determinants and the simplicity of the linear parametrization;

For the purpose of this study, we choose to employ the so-called Configuration Interaction using a Perturbative Selection made Iteratively (CIPSI) algorithm implemented in QP2 39 to generate the required SCI wave-function. Before proceeding to the application of this algorithm to the linear H6 chain, we provide a brief recap of the CIPSI methodology.

The CIPSI algorithm in a nutshell
The CIPSI algorithm, which was originally introduced in the late seventies 45,46, is the archetype of SCI approaches: it approximates the FCI wave-function through an iterative selected CI procedure, and the FCI energy through a second-order multi-reference perturbation theory (in this case, with an Epstein-Nesbet 47,48 partition).
The CIPSI energy is defined as

$$E_{\mathrm{CIPSI}} = E_v + E^{(2)}.$$

Here, $E_v$ is the variational energy given by

$$E_v = \frac{\langle \Psi^{(0)} | \hat{H} | \Psi^{(0)} \rangle}{\langle \Psi^{(0)} | \Psi^{(0)} \rangle},$$

where the reference wave-function $|\Psi^{(0)}\rangle = \sum_{I \in R} c_I |I\rangle$ is expanded in Slater determinants $|I\rangle$ within the CI reference space $R$, and $E^{(2)}$ is the second-order energy correction defined as

$$E^{(2)} = \sum_{\kappa} \frac{|\langle \Psi^{(0)} | \hat{H} | \kappa \rangle|^2}{E_v - \langle \kappa | \hat{H} | \kappa \rangle} = \sum_{\kappa} e^{(2)}_{\kappa},$$

where $\kappa$ denotes a determinant outside the reference space $R$.
The CIPSI energy is systematically refined by doubling the size of the CI reference space at each iteration, selecting the determinants $\kappa$ with the largest $|e^{(2)}_{\kappa}|$. The calculations are stopped when a target value of $E^{(2)}$ is reached.
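The selection loop described above can be illustrated with a small, self-contained sketch. This is a toy model under stated assumptions and not the Quantum Package implementation: the Hamiltonian is a hypothetical 8 x 8 real symmetric matrix in a fictitious determinant basis, determinant 0 plays the role of the Hartree-Fock determinant, and the lowest eigenpair is obtained with a simple shifted power iteration. All function names are illustrative.

```python
import math


def lowest_eigenpair(h, iters=2000):
    """Lowest eigenpair of a small real symmetric matrix by shifted power iteration.

    The uniform start vector is assumed to overlap with the ground state, which
    holds for the toy matrix below because its off-diagonal elements are negative.
    """
    n = len(h)
    # Gershgorin bound plus one: shift * I - h is then positive definite.
    shift = 1.0 + max(
        h[i][i] + sum(abs(h[i][j]) for j in range(n) if j != i) for i in range(n)
    )
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [shift * v[i] - sum(h[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    energy = sum(v[i] * h[i][j] * v[j] for i in range(n) for j in range(n))
    return energy, v


def cipsi(h, e2_target=1e-6):
    """Toy CIPSI loop: returns a list of (reference size, E_v, E^(2)) per iteration."""
    n = len(h)
    ref = [0]  # start from the single reference determinant
    history = []
    while True:
        sub = [[h[i][j] for j in ref] for i in ref]
        e_var, coeffs = lowest_eigenpair(sub)
        # Epstein-Nesbet contribution e_k of every determinant outside the reference space.
        contrib = {}
        for k in range(n):
            if k in ref:
                continue
            coupling = sum(c * h[k][i] for c, i in zip(coeffs, ref))
            contrib[k] = coupling ** 2 / (e_var - h[k][k])
        e_pt2 = sum(contrib.values())
        history.append((len(ref), e_var, e_pt2))
        if not contrib or abs(e_pt2) < e2_target:
            return history
        # Double the reference space with the largest |e_k|.
        ranked = sorted(contrib, key=lambda k: abs(contrib[k]), reverse=True)
        ref = ref + ranked[: len(ref)]


def toy_hamiltonian(n=8):
    """Hypothetical model matrix: rising diagonal, weak negative couplings."""
    return [
        [0.5 * i if i == j else -0.1 / (1 + abs(i - j)) for j in range(n)]
        for i in range(n)
    ]


history = cipsi(toy_hamiltonian())
for size, e_var, e_pt2 in history:
    print(f"|R| = {size}: E_v = {e_var:.6f}, E(2) = {e_pt2:.6f}")
```

In this toy model the variational energy decreases as the reference space grows, and the perturbative correction vanishes once every determinant has been included. For real systems the loop would instead stop at the chosen target value of the second-order correction.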

CIPSI-Overlap-ADAPT Numerical results
We performed CIPSI calculations for the different molecular systems using the open-source quantum chemistry environment Quantum Package 39. As mentioned previously, the CIPSI wave-function is used as a target for the overlap-guided adaptive algorithm and is therefore not required to be very accurate. In particular, all CIPSI wave-functions employed in this study have errors much larger than $10^{-3}$ Hartree, i.e., they are not chemically accurate. In the remainder of this section, we compare the energy convergence of two protocols: the QEB-ADAPT-VQE algorithm started from an intermediate wave-function obtained by applying the overlap-guided algorithm to a CIPSI wave-function, and the traditional QEB-ADAPT-VQE procedure initialized from a simple Hartree-Fock ansatz. As a rule of thumb, for all these simulations, the Overlap-ADAPT-VQE is used to construct an approximate wave-function with an energy comparable to that of the targeted CIPSI wave-function before initiating the subsequent QEB-ADAPT-VQE procedure. Figure 8 shows the energy convergence plot of the two ADAPT-VQE protocols on the stretched linear H6 system. We observe a significant difference in the results: chemical accuracy is achieved using only 40 parameters when the QEB-ADAPT-VQE procedure is initialized with the overlap-guided CIPSI intermediate wave-function, whereas the standard ADAPT-VQE ansatz is nearly 15 times less accurate despite using 50 parameters. Additional calculations revealed that the standard QEB-ADAPT-VQE protocol requires more than 150 parameters to achieve chemical accuracy 9. This massive performance gap demonstrates that the CIPSI wave-function initialization guides the ansatz construction in a manner that avoids the initial massive energy plateau which impedes the progress of standard QEB-ADAPT-VQE.
Let us emphasize here that the initial CIPSI wave-function was composed of only 50 determinants and had an error larger than $10^{-2}$ Hartree, which suggests that even a low-accuracy, classically computed target wave-function for the overlap-guided algorithm is enough to improve the convergence of the subsequent QEB-ADAPT-VQE procedure. This observation is particularly important since it highlights the potential of applying this CIPSI-Overlap-ADAPT procedure to much larger systems with strong correlation, where CIPSI approaches are not effective and are simply unable to achieve chemical accuracy. For such systems, we can envision computing a CIPSI wave-function at the limit of classical computational resources, using this non-chemically accurate CIPSI wave-function as a target for the overlap-guided adaptive algorithm, and initialising a subsequent QEB-ADAPT-VQE procedure on a quantum computer in order to obtain a final result with chemical accuracy.
To further test the effectiveness of this CIPSI-Overlap-ADAPT approach, we return to the stretched BeH2 molecule considered in the previous subsection. We employ two different CIPSI wave-functions as targets for the overlap-guided adaptive algorithm and use the approximate wave-functions obtained as high-accuracy initializations for QEB-ADAPT-VQE procedures. Our results are displayed in Figure 9 and demonstrate that the CIPSI-Overlap-ADAPT produces a significantly more compact ansatz than the standard QEB-ADAPT-VQE procedure for both choices of CIPSI wave-function. In both cases, the final accuracy of the wave-function with a maximal operator count of 50 is nearly an order of magnitude higher than that of QEB-ADAPT-VQE. Furthermore, as noted in the case of the H6 molecule, the choice of a low-accuracy CIPSI wave-function as the initial target for the Overlap-ADAPT-VQE does not meaningfully degrade the final accuracy. Let us also remark here that the CIPSI-Overlap-ADAPT-VQE wave-function obtained at the end of the iterative process can in turn be used as a target for an additional Overlap-ADAPT-VQE procedure, thereby further increasing the accuracy of the ansatz wave-function. In the case of the stretched BeH2 molecule, this results in further minor improvements to the final energy that is achievable using a maximal operator count of 50, as displayed in Figure 9.

Discussion
In this study, we have explored the possibility of creating ansatz wave-functions for the variational quantum eigensolver that are more compact than those of the popular ADAPT-VQE at the chemical accuracy level for some small molecular systems. Since the overparametrization phenomenon observed in the ADAPT algorithm can be attributed to the algorithm's natural propensity to encounter local energy minima, we have proposed a new overlap-guided adaptive algorithm called Overlap-ADAPT-VQE, wherein the ansatz wave-function is grown by maximizing its overlap with an intermediate target wave-function that already captures some electronic correlation. We then use this overlap-guided ansatz as a high-accuracy initialization for a standard ADAPT-VQE procedure.
As a first test of our proposed approach, we used an existing ADAPT-VQE ansatz wave-function as a target for the overlap-guided adaptive algorithm. The resulting ansatz wave-function was shown to achieve chemical accuracy using significantly fewer operators than the standard ADAPT-VQE ansatz. We have also shown that this compression process can be carried out more than once and leads to an even more compact ansatz. For strongly correlated systems, the overlap-guided ansatz is noticeably steered by the target wave-function away from the majority of local traps that are typically encountered in standard ADAPT-VQE when starting from the Hartree-Fock state. While it appears that the ADAPT ansatz is already quite compact for systems with weak electronic correlation, the Overlap-ADAPT approach remains able to offer slight improvements.
Motivated next by the inability of ADAPT-VQE to process highly correlated systems such as the stretched linear H6 chain using a reasonably compact ansatz, we combined classical selected-CI approaches and quantum computing by taking a CIPSI wave-function as a target for our overlap-guided adaptive algorithm. The resulting CIPSI-Overlap-ADAPT-VQE procedure produced a massive improvement over standard ADAPT-VQE, allowing us to reach chemical accuracy using an ansatz with only 40 operators compared to more than 150 for the standard ADAPT-VQE method.
Previous studies have already investigated the use of additional classical computation to enhance the UCCSD or ADAPT-VQE methods and have demonstrated promising improvements 7,[49][50][51][52]. Our work builds upon and contributes to this line of research. It is worth noting that the overlap-guided ansatz can also be interpreted as a state preparation algorithm for Hamiltonian simulation [53][54][55], as it generates a state with a high overlap with the ground state (see Figure 4).
Moreover, within our new framework, the hybrid selected-CI-Overlap algorithm has the potential to bring a quantum advantage over classical quantum chemistry methods by following this procedure: pushing the classical computation of a complex molecular system to its limits, then generating the corresponding ansatz on a quantum computer using the overlap-guided adaptive algorithm, and further improving this ansatz through ADAPT-VQE and potentially additional overlap-guided compression steps. We are also testing the possibility of a final perturbative (PT2) calculation in the spirit of modern classical selected-CI approaches.
Finally, let us emphasise that Overlap-ADAPT-VQE is, by design, able to integrate seamlessly with the recent improvements made to ADAPT-VQE 56,57, since it shares the same structure and adaptive property while still leveraging its own approach to operator selection. Many combinations with ADAPT variants can now be proposed and studied. In addition, convergence of the overlap can be achieved more quickly by incorporating a wider range of operators, such as generalized excitations or symmetry-breaking operators, into the operator pool. This would lead to immediate improvements in the performance of the Overlap-ADAPT-VQE algorithm. To explore further the capabilities of the various Overlap-ADAPT approaches and their potential practical advantage over classical methods, we are currently working towards larger scale simulations on extended implementations encompassing larger qubit counts on present NISQ machines and new-generation advanced simulators.

Figure 3. Comparison of the Full-CI Overlap Guided ADAPT-VQE and ADAPT-VQE for the ground state energy of a stretched BeH2 molecule and a stretched linear H6 chain with an interatomic distance of 3 Angstrom for both. The plots represent the energy convergence as a function of the number of parameters in the ansatz. The pink area indicates chemical accuracy at $10^{-3}$ Hartree.

Figure 4. Comparison of the Full-CI Overlap Guided ADAPT-VQE and ADAPT-VQE for maximising the overlap with the full-CI wave-function of a stretched BeH2 molecule and a stretched linear H6 chain with an interatomic distance of 3 Angstrom for both. The plots represent the infidelity between the ansatz and the full-CI wave-function, calculated as one minus the overlap, as a function of the number of parameters in the ansatz.

Figure 5. Comparison of the Overlap-ADAPT-VQE and ADAPT-VQE for the ground state energy of a BeH2 molecule at equilibrium and stretched geometries. The plots represent the energy convergence as a function of the number of parameters in the ansatz. The left-pointing triangles denote the target wave-functions used for subsequent Overlap-ADAPT procedures. For simplicity, we do not plot the entire Overlap-ADAPT curve, but only the portion corresponding to the energy minimisation using a standard ADAPT-VQE procedure. Thus, in the case of the left figure, the overlap maximisation portion of Overlap Guided QEB-ADAPT-VQE lasts until parameter 40, at which point the energy minimisation portion is initiated. The green dotted line corresponds to an FCI-Overlap-ADAPT-VQE procedure, which is plotted as a reference. Note that at equilibrium distance (the left figure), the QEB-ADAPT-VQE curve and the FCI-Overlap-ADAPT-VQE curve nearly coincide, whereas for the stretched molecule (the right figure) the FCI-Overlap-ADAPT-VQE curve is noticeably lower. The pink area indicates chemical accuracy at $10^{-3}$ Hartree.

Figure 6. Comparison of the Overlap-ADAPT-VQE and ADAPT-VQE for the ground state energy of a stretched BeH2 molecule with maximal operator counts of 40 and 45. The plots represent the energy convergence as a function of the number of parameters in the ansatz. The left-pointing triangles denote the target wave-functions used for a subsequent Overlap-ADAPT procedure. For simplicity, we do not plot the entire Overlap-ADAPT curve, but only the portion corresponding to the energy minimisation using a standard ADAPT-VQE procedure. The green dotted line corresponds to an FCI-Overlap-ADAPT-VQE procedure, which is plotted as a reference. The pink area indicates chemical accuracy at $10^{-3}$ Hartree.

Figure 7. Comparison of the Overlap-ADAPT-VQE and ADAPT-VQE for the ground state energy of an N2 molecule at equilibrium and stretched geometries. The plots represent the energy convergence as a function of the number of parameters in the ansatz. The right-pointing triangles denote the start of an ADAPT-VQE procedure. The left-pointing triangles denote the target wave-functions used for a subsequent Overlap-ADAPT procedure. For simplicity, we do not plot the entire Overlap-ADAPT curve, but only the portion corresponding to the energy minimisation using a standard ADAPT-VQE procedure. The green dotted line corresponds to an FCI-Overlap-ADAPT-VQE procedure, which is plotted simply as a reference. The pink area indicates chemical accuracy at $10^{-3}$ Hartree.

2. Use this compact SCI wave-function as the target of the overlap-guided adaptive algorithm so as to obtain an intermediate wave-function represented in terms of qubit excitation evolution operators acting on the Hartree-Fock reference state;
3. Use the intermediate wave-function as a high-accuracy initialization of a new QEB-ADAPT-VQE procedure.

Figure 8. Comparison of the CIPSI-Overlap-ADAPT-VQE and ADAPT-VQE for the ground state energy of a linear H6 chain with an interatomic distance of 3 Angstrom. The plot represents the energy convergence as a function of the number of parameters in the ansatz. The CIPSI-Overlap ansatz is grown up to 20 parameters and then used as the initial state for an ADAPT-VQE process. This transition from Overlap-ADAPT-VQE to standard ADAPT-VQE is denoted by the top-pointing triangle. The horizontal black dotted line corresponds to the energy error of the initial CIPSI target wave-function. The light blue dotted line corresponds to the energy of the tUCCSD method 9, which consists of an ansatz wave-function composed of 118 generalised excitation evolutions acting on a reference Hartree-Fock state. The green dotted line corresponds to an FCI-Overlap-ADAPT-VQE procedure, which is plotted simply as a reference. The pink area indicates chemical accuracy at $10^{-3}$ Hartree.

Figure 9. Comparison of the CIPSI-Overlap-ADAPT-VQE and ADAPT-VQE for the ground state energy of a BeH2 molecule with an interatomic distance of 3 Angstrom. The plots represent the energy convergence as a function of the number of parameters in the ansatz for two initial CIPSI wave-functions. The CIPSI-Overlap ansatz is grown up to 12 parameters (resp. 25 parameters) for the less accurate (resp. more accurate) initial CIPSI wave-function and then used as the initial state for an ADAPT-VQE process. This transition from Overlap-ADAPT-VQE to standard ADAPT-VQE is denoted by the top-pointing triangle. The horizontal dotted lines correspond to the energy errors of the initial CIPSI target wave-functions. The green dotted line corresponds to an FCI-Overlap-ADAPT-VQE procedure, which is plotted simply as a reference. The pink area indicates chemical accuracy at $10^{-3}$ Hartree.