Gate-free state preparation for fast variational quantum eigensolver simulations

The variational quantum eigensolver is currently the flagship algorithm for solving electronic structure problems on near-term quantum computers. The algorithm involves implementing a sequence of parameterized gates on quantum hardware to generate a target quantum state, and then measuring the molecular energy. Due to finite coherence times and gate errors, the number of gates that can be implemented remains limited. In this work, we propose an alternative algorithm where device-level pulse shapes are variationally optimized for the state preparation rather than using an abstract-level quantum circuit. In doing so, the coherence time required for the state preparation is drastically reduced. We numerically demonstrate this by directly optimizing pulse shapes which accurately model the dissociation of H2 and HeH+, and we compute the ground state energy for LiH with four transmons where we see reductions in state preparation times of roughly three orders of magnitude compared to gate-based strategies.


INTRODUCTION
Molecular modeling stands in the juncture of key advances in many important fields including and not limited to energy storage, material designs, and drug discovery.For more than half a century, many seminal works have been reported on the development of theories and methods to enable molecular modeling with high accuracies.Approximate numerical methods which are built on a single Slater determinant reference state, such as density functional theory (DFT), perturbation theory, or coupled cluster, perform well when the amount of electron correlation, ranges from minimal to moderate.However, for systems which are qualitatively governed by electron correlation effects (strongly correlated systems), such approximate methods fail to be sufficiently accurate.While alternative strategies exist, such as density matrix renormalization group (DMRG) [1][2][3] or selected configurational interaction (SCI) methods [4][5][6][7][8][9][10][11][12][13] , which can handle strong correlation, these approaches assume that the correlations are either low-dimensional, or limited in scale.Currently, no polynomially scaling classical algorithm exists which can solve for arbitrary molecular ground states.
High dimensionality of the wavefunction is ultimately the core reason for the exponential cost of modeling electronic structure.Even before optimization, simply storing the wavefunction on classical hardware quickly becomes a bottleneck.This is because one typically represents the wavefunction in a basis of "classical" states, or basis states which have a direct product structure (Slater determinants, occupation number vectors, or even tensor product states 12 ).In this classical basis, the exact (and generally entangled) state of the system is represented as an exponentially (or factorially, if a basis of Slater determinants is used to span a target projected spin subspace) large vector of coefficients weighting the corresponding classical basis states.
Quantum computing offers a radical departure from this computational strategy.As quantum systems themselves, the state of a quantum processing unit (QPU) is also a vector in a Hilbert space of identical dimension to the molecular problem.This ability to perform a one-to-one mapping between vectors in the Hilbert space containing the molecule's electronic wavefunction and those in the state space accessible to a QPU means that with enough control over the QPU, it should be possible to take the vector corresponding to the molecular wavefunction and realize it on the QPU, avoiding altogether the requirement to work with an exponentially large vector of coefficients.Once the QPU is prepared into the state corresponding to the target molecule, any molecular observable (energy, dipole moment, etc) can be obtained by measuring the corresponding operator on the QPU.
In order to turn this strategy into an algorithm, one needs a procedure for realizing the target molecular wavefunction on the QPU.As the leading quantum algorithm for molecular simulation on near-term devices, the variational quantum eigensolver (VQE) 14 provides an efficient procedure for this purpose.In VQE, one defines a parameterized quantum circuit comprised of tunable gates, and then optimizes these gates using the variational principle, minimizing the energy of the molecular Hamiltonian.This parameterized quantum circuit (referred to as an ansatz) defines the variational flexibility (and thus the subspace reachable on the QPU) of the algorithm.
State-preparation circuits with more parameters generally have more variational flexibility, but come with the cost of having deeper quantum circuits and more difficult optimization.This cost can be significant.Current and near-term quantum computers are classified as noisy intermediate scale quantum (NISQ) devices due to the presence of short coherence times, system noise, and frequent gate errors 15 .Because each gate has limited fidelity, the success probability for a sequence of gates decreases exponentially with circuit depth.Even if gates could reach unit fidelity, the finite coherence times of a NISQ device still limits the number of gates that one can apply in a circuit which, in turn, limits the accuracy of the molecular VQE simulation.Although VQE is relatively robust in the presence of noise and errors in certain cases 16 , the critical limitation preventing larger scale experiments is the accurate implementation of deep circuits.The goal of finding parameterized circuits which minimize the circuit depth and maximize the accuracy has led to a number of approaches such as hardware efficient ansätze 17 , physically motivated fixed ansätze [18][19][20][21][22] , and adaptive ansätze [23][24][25] .
Each of these mentioned approaches essentially aims to find the shortest quantum circuit for preparing a state to within an acceptable error.However, even if one succeeds at finding the "best" circuit, it is likely that its execution time will still exceed the coherence time of a NISQ device.In order to make meaningful progress toward larger problems, one can imagine three strategies: (i) device improved error mitigation techniques, (ii) significantly increase the ratio of coherence to gate time, or (iii) simply abandon gates and directly optimize the electronic controls.Of these, the latter strategy appears to be the most feasible on near-term NISQ devices.
In this paper, we explore the possibility of performing gate-free VQE simulations by replacing the parameterized quantum circuit with a direct optimization of the laboratory-frame analog control settings.In the following sections, we argue that quantum control techniques are likely to be better suited for fast VQE state preparation than the more conventional circuit-based approaches on NISQ devices.We first provide a detailed overview of circuitbased VQE, then introduce our proposed strategy, ctrl-VQE, then discuss initial results along with a strategy to avoid overparameterization, and finally compare to gate-based ansätze.Several technical aspects, numerical results, and additional discussion are provided in the Supplementary Methods and Supplementary Discussion.

RESULTS AND DISCUSSION Dissociation of diatomic molecules
In this section, we explore the ability of ctrl-VQE to reproduce full configuration interaction (FCI) (exact diagonalization) bond dissociation energy curves for two small example diatomic molecules.Here, we demonstrate the performance of our approach by computing the ground state molecular electronic energy along the bond dissociation of the H 2 molecule and the HeH + molecular ion.Although small, these molecules have been used as benchmarks for quantum algorithms in recent years 17,20,21,[26][27][28][29][30][31][32][33][34][35] .These two molecular examples are chosen because they exhibit different behaviors during bond dissocation in terms of their electronic structure.As two-orbital, two-electron problems, these would naturally be modeled on four qubits following the commonly used Jordan-Wigner transformation 36 .However, to make the pulse simulations more computationally efficient, we have used the parity mapping instead in which two qubits are diagonal and can be removed from consideration.Details are given in Bravyi et al. 37 , and we use the implementation for mapping and pruning in Qiskit software 38 .
As a prototypical example of a homolytic dissociation, as H 2 dissociates, the ground state moves from being dominated by a single Slater determinant to increasingly multiconfigurational, due to the shrinking HOMO-LUMO gap.As a result, the accuracy of a mean-field treatment (e.g., HF) diminishes as the bond length is stretched.
In stark contrast, HeH + becomes easier to model as the bond distance increases.The reason is that, being the strongest acid and, interestingly, the first molecule formed in the universe 39 , dissociation is a heterolytic deprotonation, such that both the reactants and the products are both closed shell and well represented by a single Slater determinant.If fact, in a minimal basis set (i.e., the STO-3G basis set used in this paper), the products have exactly zero electron correlation energy because the H + ion has no electrons, while the He atom has no empty virtual orbitals into which electrons can be promoted.As a result, HF becomes exact at dissociation.Of course, in larger basis sets, the He atom would have some dynamic correlation, hence the * in Eq. (2).The dissociation curves of the H 2 and HeH + molecular systems produced with ctrl-VQE using the simple square pulses are presented in Figs. 1 and 2 respectively.Consistent with the physical descriptions above, the HF state moves quickly away from the FCI ground state with bond dissociation for H 2 , while the opposite is true for HeH + , where the HF state gradually converges to the exact FCI ground state with increasing bond distance.ctrl-VQE reproduces the FCI bond dissociation curve of H 2 and HeH + with high accuracies.The maximum difference from the FCI energy along the dissociation curve is 0.03 mHa for both H 2 and HeH + , and the average error is 0.002 mH in both the cases.More detailed information including the pulse parameters, characteristics, and molecular energies along the dissociation curves are provided in the SI.The resulting states have a good overlap with the exact FCI ground states (99% in all cases).For H 2 , it is possible for the overlap to deviate at long bond distances due to degenerate singlet and triplet states.As the bond distance increases, the singlet ground state of H 2 becomes degenerate with the lowest triplet state, making it possible to generate a superposition of a singlet and triplet.It is possible to converge to the exact ground singlet state by supplementing the objective function with a penalty term proportional to the total spin operator (see SI for details).

Effect of electron correlation on pulse duration
In any VQE, one is typically interested in finding a useful balance between accuracy (needing deep circuits) and noise (needing shallow circuits).As such, molecular simulations of strongly correlated molecules are intrinsically more difficult as deeper circuits are required, which increases problems from noise and gate errors.An analogous balance is targeted in ctrl-VQE, where one would hope to obtain sufficiently accurate results with as short of a pulse time as possible.Because entanglement cannot be created instantly 40 , we expect molecules with strong correlation to require longer pulses than simpler molecules.
In order to examine this relationship, in Fig. 3 we plot the duration of the shortest pulses we were able to obtain at each molecular geometry, with H 2 (HeH + ) shown in red (purple).Referring back to Eqs. ( 1) and ( 2), one would expect that as the bond distance is increased, H 2 , needing more entanglement, would in turn require increasing pulse durations, whereas pulses for HeH + would get increasingly fast as the bond is stretched.This trend is indeed observed.
The total pulse duration significantly decreases for the dissociation of the HeH + molecular ion, whereas it significantly increases for the H 2 molecule.Note that the initial state and the final target state for HeH + become degenerate with increasing bond distance (above 2.0 Å).Thus, the Hartree Fock state ( 01 j i) is a good approximation to the exact FCI ground state, and only a slight modification to the initial state is required to well approximate the ground state.With ctrl-VQE, the total pulse duration at the far bond distances of HeH + are only 1.0 ns.This directly reflects the efficiency of the method presented in this work.In a gate-based compilation method, generally one would still construct an ansatz comprised of costly two-qubit gates, even though the target state is very nearly approximated by the initial state.
The total pulse duration at the dissociation limit for H 2 is significantly longer than near the equilibrium bond distance.Although the HF molecular energy increases monotonically beyond the equilibrium point, the same is not observed for the pulse duration.This is suggestive of the pulse durations reflecting the different dynamics along the bond dissociation of the two molecular systems (see SI for a more detailed analysis).

Leakage to higher energy states
In Figs. 4 and 5, we show the energy and leakage as a function of evolution time for various choices of the hyper-parameter, T, (the total evolution time).In each of the two figures, the top panel displays the behavior of the molecular energy as the state evolves from the reference Hartree-Fock state to the converged trial state, and the bottom panel displays the total population outside of the 0 j i and 1 j i computational states for each qubit (leakage).We see that for each value of T, the molecular energy of both H 2 and HeH + never decreases toward the exact energy monotonically, but rather increases initially, and then either decreases or oscillates before rapidly converging to the exact energy.While each of the optimized pulses do indeed generate suitable transmon dynamics which accurately produce the molecular ground state at the specified time T, the different pulses are not each equally favorable.From the bottom panels of Figs. 4 and 5, we see that some pulses create much more leakage than other pulses.High levels of leakage will likely require a larger number of shots (pulse executions) to get precise determinations of the expectation values, since a higher portion of the shots end up in excited states which are either discarded, or collected for normalizing the results.However, with leakage of only 10%, we anticipate only needing a commensurate increase in the shot count to compensate for post-selection, and so we do not expect this to be a fundamental limitation of the approach.Alternatively, one can use an unnormalized energy in the objective function.This approach naturally constrains the solutions to simultaneously minimize leakage, as any leakage necessarily penalizes the associated pulse.Results using an unnormalized energy in the objective function are comparable to those shown here, and are given in the SI.

Noise analysis: imprecise control
All the analysis thus far has assumed perfect control over the driving parameters used to generate approximate ground states of chemical Hamiltonians.However, there are always uncertainties Fig. 3 Total pulse duration.Shown as a function of bond distance of H 2 and HeH + molecular systems.Square pulses were used to generate the ctrl-VQE molecular energies in Figs. 1 and 2. The shortest pulse duration at each geometry is plotted here.when attempting to implement the optimal driving parameters.In Fig. 6 we simulate the effect of imprecisely implementing these parameters.For each set of optimal drive parameters found by the ctrl-VQE protocol, we produce 100 samples of noisy parameters taken from Gaussian distributions that are centered about the optimal parameter values: where θ n is a noisy version of the optimal parameter θ o .These in turn are used to produce 100 noisy energy samples.We then average these 100 samples and find the resulting energy difference relative to the target ground state energy.With this applied noise model, we can see that even imprecise settings can still be used to achieve accuracies below a 10 −4 energy error.As shown in the red and green curves in Fig. 6, only when the noise becomes higher than σ = 0.01 (where errors reach a magnitude of 10 MHz or 1 ns) do the energy differences become significant enough to affect the performance of ctrl-VQE.Since these errors are of the same order as the parameters themselves, they do not hinder the realistic implementation of ctrl-VQE.We do not include any explicit noise due to finite decoherence (T 1 ) or dephasing (T Ã 2 ) since the pulses we find are many orders of magnitude shorter than the typical time scales for these effects 41 .To confirm this, we ran simulations using a Lindblad master equation that includes these effects.Our objective function did not change significantly due to our short time scales.

Comparison with circuit compilation techniques
Although several ansätze have been proposed to achieve shorter circuits, even the most compact approaches involve too many gates to implement on current hardware for all but the smallest molecules.In order to reduce the time spent during the statepreparation stage, and thus the coherence time demands on the hardware, circuit compilation techniques have been designed to take a given quantum circuit and execute it either with fewer gates or by reoptimizing groups of gates for faster execution.
To execute the gate-based VQE (described in Section III A) experimentally, the gates in a circuit are compiled to sequences of analog control pulses using a look-up table that maps each elementary gate to the associated predefined analog pulse.The sequence of control pulses corresponding to the elementary gates in the quantum circuit are then simply collected and appropriately scheduled to be executed on the hardware.As such the compilation is essentially instantaneous, making the gate-based compilation technique well suited for VQE algorithms where numerous iterations are performed.
From a compactness perspective however, this gate-based compilation is far from ideal, resulting in total pulse durations which are much longer than what might be obtained with optimized compilation techniques.As is obvious from the one-toone translation of gates to pulses, the overall circuit structure is typically not considered in gate-based compilations.Thus one may naturally be inclined to seek an optimal pulse sequence for the entire circuit.This has motivated compilation algorithms where control pulses are optimized for the target circuit, using numerical optimal control techniques such as gradient ascent pulse engineering (GRAPE) 42 for partially optimal compilation 43 .However, because the GRAPE algorithm itself is highly non-trivial, the compilation latency for each iteration is of critical concern.Two of the GRAPE-based techniques are briefly described below; see Gokhale et al. 43 for a detailed discussion.
The GRAPE compilation technique employs an optimal control routine which compiles to machine-level sequences of analog pulses for a target quantum circuit.This is achieved by manipulating the sets of time-discrete control fields that are executed on the quantum system.The control fields in the optimal routine are updated using gradients (see Boutin et al. 44 for use of analytical gradients) of the cost function with respect to the control fields.GRAPE compilation achieves up to 5-fold speedups compared to standard gate-based compilation.However, such Fig. 6 Effect of imprecise control parameters.Energy error for the dissociation of H 2 is shown.Instead of using the exact, optimized parameters, we use parameters which have added Gaussian noise uncertainty.For each optimal parameter, we randomly sample 100 points each from a Gaussian distribution centered on its optimal setting and with a standard deviation of σ = {10 −4 , 10 −3 , 10 −2 , 10 −1 } which correspond to the blue, orange, red and green curves respectively.The resulting energies for these 100 trials are then averaged.Error bars are given for each point but are smaller than some markers.Fig. 5 Error in energy and leakage for HeH + .Energy difference shown between ctrl-VQE and FCI (top) and leakage (bottom) along the time evolution steps with optimal pulses of different durations for HeH + with a bond distance of 0.90 Å. Optimized pulse parameters are provided in the SI.
speedup comes with a substantial computational cost.This amounts to long compilation latency, making this approach impractical for VQE algorithms in which multiple iterations of circuit-parameter optimizations are performed.The GRAPE-based compilation also suffers from limitations in the circuit sizes it can handle [45][46][47] .
On the other hand, partial compilation techniques achieve significant pulse speedups by leveraging the fast compilation of standard gate-based techniques together with the pulse speedups of GRAPE-based compilations.Two flavors of such an approach were reported in Gokhale et al. 43 .Both divide the whole circuit into blocks of subcircuits.In the so-called strict partial compilation, the structure of quantum circuits used in quantum variational algorithms are exploited to only perform GRAPE-based compilation on fixed subcircuits that are independent of the circuit parametrization.The optimal pulses using the GRAPE compilation techniques for the fixed blocks are pre-computed and simply concatenated with the control pulses from the gate-based compilations for the remaining blocks of the circuit.Thus, the compilation speed is comparable to the gate-based compilations in each iteration, but this method also takes advantage of pulse speedups from GRAPE-based compilations.As one may expect, the pulse speedups heavily depend on the circuit depth of the fixed blocks.
In the other flavor, flexible partial compilation, circuits are blocked into subcircuits, each with a single variational parameter, ensuring the subcircuit blocks have low depth.Hyperparameter optimizations are performed for each subcircuit, and they are utilized to find optimal pulses for the circuit blocks.It is noteworthy to mention that in the flexible partial compilation, compilation latency is reduced significantly (~80x, see Gokhale et al. 43 ) by tuning the hyperparameters of the circuit blocks to speed up the convergence of optimal control in GRAPE.Flexible partial compilations achieve significant pulse speedups as compared to strict partial compilations.However, in spite of the high pulse speedups, the flexible partial compilation technique still diverges from the extremely fast compilation time of the standard gate-based methods.Thus, the algorithm still suffers from compilation latency, albeit to a lesser degree compared to GRAPE-based compilation.
Although both of these GRAPE-based compilation techniques and ctrl-VQE share a direct pulse-shaping step, they fundamentally differ because ctrl-VQE is not a compiler.In contrast to full or partial compiling, we make no reference to a circuit or to implementing any specific unitary.Of course, the control pulses we find do implement some equivalent circuit or unitary (and we analyze these in Sec.II F), but we have no need to ever characterize the unitary.In fact, because ctrl-VQE only targets a single state, a large family of unitaries (defined by the behavior in the spaces orthogonal to the initial and final states) exist which minimize our objective function.As such, many possible solutions exist, with no immediate preference given to any one.

Decompiled control pulses
In the previous sections, the efficiency of ctrl-VQE using optimized control pulses at the device level was demonstrated.The short pulse durations imply that applications to larger molecules might have even more significant speedup since the number of CNOTs in most VQE ansätze increases quickly.
Although ctrl-VQE is performed using no state preparation circuit, the unitary created by the time dynamics of the applied pulse can be decomposed into gates, allowing one to analyze the circuit.This is essentially running a compiler in reverse, or "decompiling" the pulse to yield a state preparation circuit.With this decompiled circuit at hand, we can evaluate the time it would take to execute the optimized pulse as a traditional circuit.By comparing this time to that of the pulse duration, one has a clean benchmark for quantifying the overhead associated with gatebased state preparation.This decompilation can be done in two ways.In the first approach, we simply compute the matrix representation of the evolution operator generated by the optimized ctrl-VQE pulse.For the two-qubit case in focus here, a quantum circuit is then constructed using the KAK decomposition technique.For a detailed description of the technique, we refer the reader to Williams 48 .
In the second approach, an arbitrary circuit corresponding to the state vector from ctrl-VQE is constructed and then transpiled to obtain a shorter circuit depth.In this way we obtained gate durations on IBMQ hardware (mock Johannesburg device available in Qiskit software).The KAK decomposition, arbitrary circuit construction, transpilation, and the mock circuit compilations were performed using Qiskit software 38 .The quantum circuits are illustrated in Fig. 7.The corresponding pulse durations are 1202 ns for the circuit obtained using the KAK decomposition and 825 ns for the circuit obtained from transpilation.The state vector used in this study was for the H 2 molecule at 0.75 Å, and the corresponding pulse duration was 9 ns with ctrl-VQE.This clearly demonstrates that state preparation circuits can be much deeper than is necessary.

Adaptive update of pulse parameterization
The manner in which a pulse is parameterized (in particular, the number of variational parameters) will significantly impact experimental performance.An over-parameterization of the pulse leads to difficulties in experimental optimization.In response, we seek a means to limit the number of parameters.In this section, we describe a scheme to avoid overparameterizations and arrive at the minimal number of pulse parameters to achieve a target accuracy.
Unlike in the previous sections where we chose a fixed number of time segments, in this section we propose an adaptive algorithm which slowly grows the number of segments (and thus parameters).We begin with a single time-segment square pulse (constant amplitude throughout the time evolution), the pulse is then iteratively sliced at random intervals such that the number of time segments systematically increases.This adaptive scheme is outlined below: 1. Initialize a parameterized square pulse with n = 1 time segment and perform the variational pulse optimization.2. Divide the pulse obtained from the previous step at a randomly choosen time.The square pulse now has n = 2 time segments.3. Perform pulse optimization using the pulse shape from the previous step as an initial guess.4. Divide the largest time segment in the square pulse obtained from the previous step into two at a randomly chosen time.This increases the number of segments from n to n + 1. 5. Perform pulse optimization using the pulse shape from the previous step as an initial guess.6. Repeat Steps 4 and 5 until the energy obtained is sufficiently converged.
Note that drive frequencies for the pulses are also optimized in the above procedure.Irrespective of the pulse shape used, only a single drive frequency is used for each qubit.The switching times for the square pulses with more than one time segment are not optimized and remain fixed in Steps 3 and 5.For N transmons with n pulse time segments each, the total number of parameters to optimize is N(n + 1).In the following simulations, we constrain the amplitudes and drive frequencies within the ranges ±40 MHz and ω k ± 3π GHz, respectively.
Following the above strategy of adaptively increasing the number of time segments in the square pulses, we obtained optimal pulses for H 2 , HeH + and LiH with total pulse durations of 9, 9, and 40 ns respectively.Figure 8 plots the energy difference relative to FCI with increasing number of time segments in the optimal pulse.For H 2 and HeH + , chemical accuracy is obtained with a single-time-segment square pulse on each qubit, while for LiH, three time segments are required to achieve chemical accuracy.The molecular energy converges for square pulses with two time segments for H 2 and HeH + and with five time segments for LiH molecules.The error in ground state energy of LiH molecule (Li-H R = 1.5 Å) computed with ctrl-VQE along with pulse duration, leakage, and overlap with the exact FCI state is tabulated in Table 1.To illustrate the sequential pulse refinement from iteration to iteration, the square pulse shapes at each adaptive step for LiH are shown in Fig. 9.
In the case of LiH, we see that the accuracy is not of the same order as in the other two diatomic molecules.To further examine this, we perform over 100 pulse optimizations on the square pulse with 10 time segments obtained from the adaptive strategy by retaining only the switching times and starting with random amplitudes and drive frequencies.We use a large number of initial guesses to reduce the likelihood of getting trapped in a local minimum.We find that the highest accuracy is achieved when we use the pulse obtained from the adaptive strategy above.This suggests that we are nearing the accuracy limit of LiH using a 40 ns pulse.
While the above adaptive strategy aims at practical experimental implementation, the cost of classical simulation remains the same irrespective of the number of time-segments when analytical gradients for the pulse amplitudes are used.The cost of evaluating an analytical gradient is only about 2.5 times the cost of an energy evaluation.Here, the analytical gradient is obtained at each trotter step of time evolution and adapted for the pulse shaped considered.Details along with the derivation are provided in the SI.

Comparison to gate-based ansätze
Now we directly compare the results of our pulse-based technique with gate-based variational ansätze.We use calibration data from an IBMQ device (mock Johannesburg device available in Qiskit Terra version 0.14.2) to compute the duration of the circuits used to prepare trial wavefunctions.We consider the RY and UCCSD ansätze, which are capable of producing the exact ground state.
For both H 2 and HeH + , the RY ansatz requires 1 CNOT, and the UCCSD requires 2 CNOTs.The RY ansatz has four parameters, and the UCCSD ansatz has three parameters.The total pulse time of the RY ansatz is 519 ns, and the UCCSD ansatz is 825 ns.For the LiH molecule, the RY and UCCSD ansätz require 9 and 245 CNOTs respectively.The RY ansatz has 16 parameters whereas the UCCSD ansatz has 8 parameters.The total pulse duration of the RY ansatz is 3485 ns, and that of the UCCSD ansatz is 82,169 ns for LiH.In each case, the time to execute the circuit is significantly longer than the time required to apply the pulses using ctrl-VQE, which is 9 ns for H 2 and HeH + , and 40 ns for LiH.The T 1 and T 2 times of the IBMQ device used range from 45,150 to 115,217 ns (average 71,262 ns) and from 48,173 to 107,025 ns (average 74,490 ns) respectively.Fig. 8 Convergence of pulse parameterization.Energy difference of ctrl-VQE relative to FCI with increasing number of time-segments in the square pulse used.
Table 1.Performance of ctrl-VQE in computing the ground state energy of the LiH molecule (Li-H R = 1.5 Å).The energy error relative to FCI, the leakage to higher energy states, the total pulse duration, and the overlap with the exact FCI state are shown.The data correspond to the pulse with five time segments from Fig. 8.This means that the pulse duration for LiH using the UCCSD ansatz is already reaching the limit set by decoherence times of the device, while the pulse duration obtained with ctrl-VQE is well within the decoherence time.

Method
We note here that this comparison needs to be approached with a bit of caution, as our simulated device has slightly different parameters than the IBM-Q devices used for the circuit timings.However, we do not expect that our results would change significantly if we had access to the full set of exact parameters of the device (including anharmonicities, bus frequencies, and couplings), because previous works using the present device parameter regime 49,50 have demonstrated universal gate sets with single and two-qubit operations commensurate with those of the IBM-Q devices.As such, we expect the ctrl-VQE pulse times reported in this study to be close to what one would get by directly running the calculations on IBM-Q devices.

Summary and future outlook
In summary, we have presented an alternative new variational quantum algorithm which is fundamentally different from the existing quantum algorithms for molecular simulation.The quantum circuit used for state preparation in standard variational algorithms is entirely replaced by a hardware-level control routine with optimized pulse shapes to directly drive the qubit system.As such, we demonstrate that VQE applications do not need to be confined to two level systems (qubits), which are required for general quantum computing.This opens a possibility of faster ansatz preparation within a time window defined by coherence times of NISQ devices.
The efficacy of the presented method is numerically demonstrated by modeling the bond dissociation of two diatomic molecules, H 2 and HeH + .The maximum error from the exact FCI energy is well within chemical accuracy (0.02 kcal/mol) for both molecular systems throughout the bond dissociation.ctrl-VQE captures the important electron correlation effects involved in the bond dissociation of the two molecular systems, as reflected in the pulse durations along the bond dissociation.
To demonstrate the application of ctrl-VQE to a larger molecular system, we have also computed the energy of the LiH molecule at a single bond distance.An adaptive scheme which slowly increases the variational flexibility of the pulse parameterization was used to determine the minimal number of pulse parameters required to acheive a target accuracy, ultimately simplifying experimental implementation and avoiding over-parameterization.The results demonstrate that pulse shapes with relatively modest numbers of parameters achieve convergence in the energy even as the problem size increased.The approach yields significant state-preparation speedups for VQE as compared to standard gate-based ansätze.The short pulse durations minimize losses due to decoherence and dephasing, which is a step toward enabling more accurate VQE simulations on larger, strongly correlated systems.ctrl-VQE can be viewed as a lowest-level quantum variational algorithm.
Because ctrl-VQE operates directly on the hardware, classical simulations of the algorithm are even more computationally expensive than typical VQE emulations, as one needs to not only solve the time-dependent Schrödinger equation, but also optimize the driving Hamiltonian.As such, the systems we have been able to study so far are very small.In future work, we will develop an improved implementation to study the behavior of larger systems and with more sophisticated constraints on the pulse shape (Ω n (t)).Improved software will also make it possible to numerically study the impact that device controllability limitations 51 have on the amount or type of electron correlation able to be described.

METHOD Variational quantum eigensolver
The VQE algorithm aims to leverage classical resources to reduce the circuit depth required for molecular simulation.The algorithm finds optimal rotation angles for a parameterized quantum circuit of fixed depth by variationally minimizing the energy of a target molecule, which is obtained by repeated state preparation and measurement cycles.In order to account for the distinguishability of qubits, we start by transforming the second quantized electronic Hamiltonian into an equivalent form involving non-local strings of Pauli spin operators, ôi : where h i is a sum of molecular one or two-electron integrals, and p operators are are fermionic annihilation operators.Several such transformations exist, such as the Jordan-Wigner 36 , Bravyi-Kitaev 52 , or parity transformation 53 .The main steps in VQE are defined as follows: 1. Precompute all h i values, and transform terms in the Hamiltonian operator into the desired qubit representation.2. Choose an ansatz for the problem which defines the variables ( θ ! ) to optimize.Assuming one starts from the Hartree-Fock (HF) reference state, this involves predefining a parameterized circuit which provides enough variational flexibility to describe the molecule's electron correlation effects.Many ansätze have been proposed, several of which are variants of the original proposal using the unitary coupled-cluster (UCC) ansatz 54,55 .3. Choose an initial set of parameter values, θ !¼ θ !0 .These can be initialized to zero, chosen randomly, or if appropriate, chosen based on some classical precomputation such as using MP2 parameters to start a UCCSD optimization.4. Using current parameters, θ !, repeatedly execute the circuit, each time performing an individual measurement of one of the operators, ôi , in Ĥ or a set of mutually commuting operators.After a sufficient number of circuit executions (shots), the averages of the resulting data converge to the expectation values of the operators such that the average molecular energy can be obtained by multiplication with the one and two-electron integrals, 5. Check convergence.If the average energy has decreased by a small enough value determined to be converged, exit.If the energy is not yet converged, update θ !, and return to step 4.

Control variational quantum eigensolver
In this section, we present an alternative to the gate-based VQE algorithm, replacing the parameterized state-preparation circuit with a parameterized laboratory-frame pulse representation, which is optimized in an analogous manner, but with the benefit of a much faster state preparation, opening up the possibility of more accurate simulations on NISQ devices.All other aspects of VQE (i.e., measurement protocols) are essentially the same.Using the molecular energy as the objective function to minimize, the pulse parameters are optimized using the variational principle.This general strategy, which we refer to as control variational quantum eigensolver (ctrl-VQE), is outlined as follows: 1.As done in any regular VQE, compute the one-and two-electron integrals and transform the molecular Hamiltonian into a qubit representation, for example using Jordan-Wigner, Parity or Bravyi-Kitaev mappings.This defines the objective function to minimize, h Ĥmolecule i. 2. Define a fixed pulse representation (e.g.square pulses, sum of Gaussian pulses, etc.).Parametrize the chosen pulse representation, and initialize parameters.3. Choose an initial state for the qubit(s) system.HF is a good choice for the molecular problems studied here.Controls are assumed to be in the form of direct drives on each qubit.4. Measure the expectation value of the objective function h Ĥmolecule i on the quantum device.5. Using a classical optimization routine, determine the pulse parameters for the next measurement on the quantum device.6. Repeat until a target convergence threshold is met.If the chosen parameterized pulse can fully specify the target Hilbert-space, then the optimal pulse finds the minimum energy state.
We note here that the optimization used in this work excludes the total pulse duration.This is fixed throughout the optimization routine, and only the pulse parameters such as the amplitudes or the frequencies are directly optimized.As such, the total pulse time enters the algorithm as a "hyperparameter", which can be optimized in an outerloop if desired.In the square pulses considered in this work, the time segments are also optimized.Unless stated otherwise, optimal pulses correspond to pulse shapes that are optimal with a given fixed total pulse duration.
Unlike universal quantum computing algorithms, ctrl-VQE occurs at the hardware-level, and any simulation must refer to a specific physical platform.For this work, we choose a well-established transmon platform with the following 1D device Hamiltonian 41 : where 〈kl〉 indicates that the summation is over pairs of coupled qubits.
Here, we assume that the qubits are arranged in a closed linear chain and have always-on direct couplings, where âk is the bosonic annihilation operator for the kth transmon, and ω k , δ k , and g are the resonant frequency, anharmonicity, and constant coupling rate, respectively.Furthermore, each transmon in Eq. ( 6) formally supports an infinite number of states.However, in our simulations we necessarily approximate this system by a finite number of levels (three unless otherwise stated) per transmon.We tested the accuracy of this by adding more levels and found that the results did not change significantly.The parameters used in this work are chosen to be typical of values found in current superconducting transmons, and are provided in Table 2.We find that our results below do not qualitatively depend on the frequency difference between the qubits, and in the SI we provide a comparison of this current device to one with a larger detuning between the transmons.In order to drive the device, we apply a time-dependent field, with separate controls on each qubit such that within the rotating wave approximation, the control Hamiltonian is expressed as: where Ω k (t) is the real-valued, time-dependent amplitude of the drive, and ν k is the frequency of the field.The system therefore evolves under the total Hamiltonian: By moving to the interacting frame, the final ansatz has the following form: dt ĤC ðt;ΩnðtÞ;νnÞ ψ 0 j i: where ψ 0 j i is the VQE reference state (e.g., the HF state), {Ω n (t), ν n } are the variational parameters for the ansatz, T is the total pulse time, and T is the time-ordering operator.Note that although the control Hamiltonian above only has single-qubit terms, the device itself has inter-qubit couplings (Eq.( 6) with strength g), which create an entangling control Hamiltonian in the interacting frame: As such, the coupling strength g is responsible for describing electron correlation in the target molecule.Using this ansatz in Eq. ( 9), the energy to be minimized in the ctrl-VQE objective function is, where ψ trial 0 E is just the ψ trial E state above, projected onto the computational basis and normalized.Note that an unnormalized state can also be used; this yields similar results, as shown in the SI.
The ansatz above is completely (and solely) determined by the device and controls, granting enormous flexibility to the ansatz.In fact, any digital quantum circuit ansatz can be compiled into the form above.As such, the ansatz in Eq. ( 9) does not intrinsically possess any limitation on its potential accuracy beyond the fundamental limitations imposed by quantum speed limits 40 .However, this additional flexibility can make optimization more difficult.
In this work, we have chosen to impose simple constraints on the form of Ω n (t) to minimize the number of optimization parameters and simplify experimental implementation.We have considered two examples: (i) piecewise-constant pulses, and (ii) sum of Gaussian pulses.Because these two examples yield similar results, we present only the square pulse data in the main text, and provide the Gaussian pulse data in the SI.An example of a pair of two-time-segment square pulses for a two transmon system is shown in Fig. 10.With a single time segment square pulse, the amplitude is taken as constant throughout the pulse duration.For n time segments, the parameterized square pulse is given by, Ω k ðtÞ ¼ where c i are amplitudes of the pulse, t i are the switching times, and T is the total pulse duration.The amplitudes are constrained to ±20 MHz unless otherwise mentioned; this is a typical amplitude for RF control signals in superconducting qubit systems 64 .Each transmon drive term also has a frequency modulation of the form exp i ν k t ð Þ; with a driving frequency ν k constrained to (ω k − 2π GHz) ≤ ν k ≤ (ω k + 2π GHz), where ω k , ν k are in units of 2π GHz.Therefore, the pulse parameters that will be optimized include c j , t j , and ν k .With N transmons and n square pulses on each transmon, we then have 2Nn parameters to optimize.The pulse parameter optimizations are performed using l-BFGS-b.
We note that a very recent preprint uses similar quantum control considerations to improve VQE 65 .However, in that work, these considerations are used only to define an improved gate-based ansatz, unlike the direct variational pulse shaping described in this work.Our work is also somewhat related to the paper from Chamon and coworkers 66 in which they find that the optimal pulse for preparing the ground state of a Hamiltonian is of a bang-bang form.Although both ctrl-VQE and the work of Yang et al. 66 both combine variational quantum algorithms with quantum control, our effort aims to find ground states of non-diagonal Hamiltonians (such as the molecular Hamiltonian in Eq. ( 4)), preventing the efficient implementation of the time-evolution operator needed for realizing the bang-bang protocol.Because our control Hamiltonian is not determined by our problem Hamiltonian (Eq.( 4)), we should not expect such bang-bang protocols to be optimal.We also note that a preprint was posted soon after this paper which suggests that optimal control would be helpful in VQE applications 67 .While pulse level control presents a challenge for front-end development which has focused on circuits, IBM has already begun making direct pulse access available 68 , and we hope this work will help encourage other platforms to do the same.

Computational details
The numerical results presented in this work were generated with a locally developed program: CtrlQ.The source code for the program is available on Github, codebase: https://github.com/oimeitei/ctrlqunder a Apache License 2.0.Functionalities from Qiskit 38 and Qutip 69 were used to check and compare the results presented in this work.Molecular integrals were generated using PySCF 70 , and an STO-3G basis set was used throughout.Below, we consider simulations that involve either two-transmon or fourtransmon devices depending on the molecule.The parameters ω, δ and g used in the simulations are explicitly given in Table 2. To demonstrate qualitative insensitivity to the device parameters, we provide some results using a device with a larger detuning in the SI.
Although the simulations presented in the following sections use highly restricted forms of control pulses, our in-house code (CtrlQ) supports various forms of pulses.In fact, we have implemented efficient analytic gradients of the molecular energy with respect to pulse amplitudes (see SI), making it possible to optimize pulses with arbitrary shapes.As mentioned in the previous section, here we focus primarily on simple pulse waveforms with few parameters to simplify experimental implementations.

Fig. 1
Fig. 1 Bond dissociation curve of H 2 molecule.ctrl-VQE energies are computed using square pulses with two time segments.

Fig. 2
Fig. 2 Bond dissociation curve of HeH + molecule.ctrl-VQE energies are computed using square pulses with two time segments.

Fig. 4
Fig.4Error in energy and leakage for H 2 .Energy difference shown between ctrl-VQE and FCI (top) and leakage (bottom) along the time evolution steps with optimal pulses of different durations for H 2 with a bond distance of 0.75 Å.The red and purple lines with T = 29 ns are distinct solutions to the same optimization.Optimized pulse parameters are provided in the SI.

Fig. 7
Fig. 7 from decompiled control pulses.Illustration of circuits constructed using the unitary and state vectors from ctrl-VQE.a Circuit corresponding to unitary obtained from a KAK decomposition.b An arbitary circuit corresponding to the state vector.c Transpiled version of circuit (b).

9
Adaptive sequences for LiH molecule.Square pulse shapes obtained at each adaptive step, starting from one time window and adaptively dividing the time window up to five-segments.The accuracy increases with each division of the pulse.Pulse shapes are shown for the pulses on a qubit 1, b qubit 2, c qubit 3, d qubit 4.

Fig. 10
Fig.10Illustration of square pulses for a two transmon system.Simple square pulse with two time segments.

Table 2 .
Device parameters appearing in Eq. (6) for a four-transmon + , which only involve two transmons, device parameters corresponding to transmons 1 and 2 were used.All units are in 2π GHz.