Quantum Computation of Frequency-Domain Molecular Response Properties Using a Three-Qubit iToffoli Gate

The quantum computation of molecular response properties on near-term quantum hardware is a topic of significant interest. While computing time-domain response properties is in principle straightforward due to the natural ability of quantum computers to simulate unitary time evolution, circuit depth limitations restrict the maximum time that can be simulated and hence the extraction of frequency-domain properties. Computing properties directly in the frequency domain is therefore desirable, but the circuits require large depth when the typical hardware gate set consisting of single- and two-qubit gates is used. Here, we report the experimental quantum computation of the response properties of diatomic molecules directly in the frequency domain using a three-qubit iToffoli gate, enabling a reduction in circuit depth by a factor of two. We show that the molecular properties obtained with the iToffoli gate exhibit comparable or better agreement with theory than those obtained with the native CZ gates. Our work is among the first demonstrations of the practical usage of a native multi-qubit gate in quantum simulation, with diverse potential applications to the simulation of quantum many-body systems on near-term digital quantum computers.


INTRODUCTION
A primary goal of emerging quantum computing technologies is to enable the simulation of quantum manybody systems that are challenging for classical computers [1][2][3].Early experimental demonstrations of quantum simulation algorithms have focused on computing ground-and excited-state energies of small molecules [4][5][6][7] or few-site spin [6] and fermionic models [8].More recently, the scale of quantum simulation experiments has increased in terms of numbers of qubits, diversity of gate sets, and complexity of algorithms, as manifested in simulation of models based on real molecules and materials [9][10][11], various phases of matter such as thermal [12,13], topological [14,15] and many-body localized states [16,17], as well as holographic quantum simulation using quantum tensor networks [18][19][20].As quantum advantages in random sampling have been established on quantum hardware [21,22], focus has turned to the experimental demonstration of quantum advantages in problems of physical significance [23].
For applications in chemistry and physics, the calculation of the response properties of molecules and materials is of substantial interest [24][25][26].Investigating response properties in the electronic structure theory framework involves calculating quantities such as the one-particle Green's function [27] and density-density response functions [28], which provide insight into interpreting experimental spectroscopic measurements [29].Response properties of molecules and materials can be determined either in time domain or in frequency domain.Due to the natural ability of quantum computers to simulate time evolution [1,2], near-term algorithms to compute time-domain response properties have been carried out on quantum hardware [30][31][32].However, computing the frequency-domain response from the time-domain response using the typical gate set available on hardware requires a time duration that exceeds the circuit depth limitations of near-term quantum computers.
An alternative approach to determine these response properties is by computing them directly in the frequency domain.Frequency-domain algorithms generally involve obtaining the ground-and excited-state energies as well as the transition amplitudes between the ground state and the excited states.Although there are established methods to obtain ground-and excited-state energies on quantum computers [33][34][35][36][37], calculating transition amplitudes is less straightforward.Various schemes including variational quantum simulation [38,39], quantum subspace expansion [40] and quantum linear algebra [41] to determine frequency-domain response properties have been proposed.However, the accuracy of variational methods depends on the quality of the ansatz, quantum subspace expansion is susceptible to numerical instabilities from basis linear dependence, and quantum linear algebra is out of reach for near-term quantum hard-FIG.1: Schematic of the diatomic molecules and LCU circuits for computing transition amplitudes.(A) Schematic of the diatomic molecules NaH and KH.The active space consists of only the highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO).(B) The circuits to calculate diagonal transition amplitudes, where a0 is the ancilla qubit and s0 and s1 are the system qubits.For the spectral functions the target unitaries are Xpσ and i Ỹpσ, while for the response function the target unitaries are I and Zpσ.(C) The circuit to calculate off-diagonal transition amplitudes in the response functions, where a0 and a1 are the ancilla qubits, and s0 and s1 are the system qubits.The double-controlled-Z gates are decomposed with either iToffoli gates or CZ gates.The double-controlled identity gates that would complete the standard two-ancilla LCU circuit is not shown.
ware.Recently, a non-variational scheme amenable to near-term hardware implementation has been proposed [42,43].This scheme constructs the electron-added and electron-removed states simultaneously by exploiting the probabilistic nature of the linear combination of unitaries (LCU) algorithm [44].Nevertheless, this LCU-based algorithm has not yet been demonstrated on quantum hardware due to the lack of efficient implementations of the multi-qubit gates.
In this work, we experimentally demonstrate the calculation of frequency-domain response properties of diatomic molecules using a recently reported high-fidelity three-qubit iToffoli gate [45] to implement the LCU circuits on a superconducting quantum processor.The multi-controlled gates present in the original LCU algorithm are decomposed either with iToffoli gates or with CZ gates.We calculate transition amplitudes between the ground state and the N -electron or (N ± 1)electron states of NaH and KH molecules restricted to the highest occupied and lowest unoccupied molecular orbitals (HOMO and LUMO).The transition amplitudes are then used to construct spectral functions and densitydensity response functions of the diatomic molecules.We apply error mitigation techniques including randomized compiling (RC) [46,47] during circuit construction, and McWeeny purification [48] during postprocessing, both of which result in marked improvement of the experimental observables.The observables obtained from the reduced-depth circuits with iToffoli decomposition show comparable or better agreement with theory compared to the observables from circuits with CZ decomposition, despite incomplete Pauli twirling in the randomized compiling procedure applied to the iToffoli gate.Our results pave the way for the application of multi-qubit gates for quantum chemistry and related quantum simulation ap-plications on near-term quantum hardware.

Quantum Algorithm for Transition Amplitudes of Diatomic Molecules
We consider the HOMO-LUMO models of the diatomic molecules NaH and KH as shown in Fig. 1a (see Methods).Such molecular models with reduced active space have been used in benchmarking quantum chemistry methods on quantum computers [49].The HOMO-LUMO model generates two spatial orbitals or equivalently four spin orbitals, which correspond to four qubits after Jordan-Wigner transformation [50].To reduce quantum resources, we exploit the number symmetry in each spin sector to reduce the number of qubits from four to two using a qubit-tapering technique [51] (details given in Supplementary Sec.I).
The observables we aim to determine are the spectral function and density-density response function.Suppose that the molecular Hamiltonian with reduced active space has ground state |Ψ 0 with energy E 0 , and (N ±1)electron eigenstates Ψ N ±1 λ with energies E N ±1 λ .Let â † pσ and âpσ be the creation and annihilation operators on orbital p with spin σ, respectively.The one-particle Green's function has the expression [27]: where ω is the frequency and η is a small broadening factor.The spectral function is related to the Green's function by A(ω) = −π −1 Im Tr G(ω).
For the density-density response function, we consider the charge-neutral N -electron excited states Ψ N λ with energies E N λ and the number operator npσ on the orbital p with spin σ.The density-density response function has the expression [28]: The operators âpσ , â † pσ and npσ are not unitary, but they can be written as linear combination of unitary operators as where I is the identity operator, Z pσ is the Pauli Z operator on orbital p with spin σ, and Xpσ and Ȳpσ are the Jordan-Wigner transformed Pauli X and Y operators on orbital p with spin σ with a string of Z operators included to account for the anticommutation relation [50].The Pauli strings Xpσ , Ȳpσ and Z pσ undergo the same transformation and qubit tapering process as the Hamiltonian (details given in Supplementary Sec.I).Except for the identity operator which does not change under the transformation, we label the transformed Xpσ , Ȳpσ , Z pσ as Xpσ , Ỹpσ and Zpσ .
The LCU circuits to calculate diagonal and offdiagonal transition amplitudes are given in Figs.1b and  1c.Each circuit has two system qubits s 0 and s 1 , and one ancilla qubit a 0 or two ancilla qubits a 0 and a 1 .The unitary U 0 prepares the ground state |Ψ 0 on the system qubits from the all-zero initial state.The circuit diagrams reflect that the spectral function involving operators Xpσ and Ỹpσ only requires diagonal transition amplitudes, while the density-density response function involving operators I and Zpσ requires both the diagonal and off-diagonal transition amplitudes.Although the original algorithm [42] proposed performing quantum phase estimation on the system qubits, due to quantum resource constraints we instead apply quantum state tomography [52] to the system qubits while measuring the ancilla qubits in the Z basis.
In the diagonal circuits, we obtain the (unnormalized) system-qubit states 1  2 ( Xpσ ± i Ỹpσ ) |Ψ 0 or 1 2 (I ± Zpσ ) |Ψ 0 with probabilities p ± , where the probabilities are specified by the ancilla measurement outcome as p + = p a0=0 and p − = p a0=1 ; in the off-diagonal circuits, we obtain the (unnormalized) system-qubit states .We take the overlap of the tomographed system-qubit states with the exact eigenstates, which are then postprocessed according to Eq. 18 in Ref. [42] or Eq. 25 in Ref. [43] to yield the transition amplitudes (see Supplementary Sec.II).The transition amplitudes are then used to construct the spectral function and density-density response function according to Eqs. 1 and 2.
In the following sections, for simplicity, we will denote the diagonal circuit that applies the operator â( †) pσ or npσ as the pσ-circuit, and the off-diagonal circuit that applies the operators npσ and nqσ as the (pσ, qσ )-circuit.

iToffoli versus CZ Decomposition in LCU Circuits
The transformed and tapered operators are two-qubit Pauli strings with multiplicative factors of ±1 or ±i.To apply the single-or double-controlled gates, we follow the standard multi-qubit Pauli gate decomposition [53] with the base gate as CZ or CCZ and use CNOT gates consisting of native CZ gates dressed by Hadamard gates to extend the weights of the Pauli strings.The multiplica- tive factor −1 or ±i can be applied as a single-qubit phase gate on the ancilla in the diagonal circuits, or as the native CS, CS † or CZ on the two ancillae in the off-diagonal circuits.Additionally, X gates are wrapped around the ancilla qubits controlled on the 0 state.Figure 2a shows how a double-controlled gate with ancilla a 0 controlled on 1, ancilla a 1 controlled on 0, and a target operator −ZZ is applied on the device.
We decompose the CCZ gate either with the threequbit iToffoli gate as shown in Fig. 2b or with the native CZ gates.The iToffoli decomposition starts with a double-controlled iZ component, followed by a long-range CS † gate to cancel the phase factor i. The SWAP gates in the long-range CS † part of the circuit are further simpli- fied in the transpilation stage or decomposed into three CZ gates and additional single-qubit gates according to a recent work on the same quantum device [54].For the CZ decomposition of CCZ, we use a topology-aware quantum circuit synthesis package [55] to obtain the optimal decomposition as eight CZs under linear qubit connectivity, as opposed to the six-CZ decomposition that requires allto-all qubit connectivity [56].

Spectral Function and Response Function on
Quantum Hardware The spectral function of NaH and KH are shown in Fig. 3.The density matrices are obtained from quantum state tomography and postprocessed with McWeeny purification.Randomized compiling is not employed for these results.The experimental spectral functions show very good agreement with the exact ones, with maximum peak height deviation of 10.6%.
We next turn to the density-density response functions, which are more challenging to determine than the spectral functions because these observables require the deeper off-diagonal circuits containing three-qubit iToffoli gates.We begin by considering a specific off-diagonal circuit needed for the density-density response function, the (0 ↑, 0 ↓)-circuit.To understand the influence of the iToffoli gate on the accuracy of the executed circuit, we compute the fidelity of the whole qubit register obtained by quantum state tomography versus circuit depth.The same quantity was computed for a circuit using only CZ gates to decompose the double-controlled gates.The results are shown in Fig. 4.Although the iToffoli decomposition shows a steeper decrease in fidelity compared to the CZ decomposition, the fidelity at the end of the circuit is higher due to lower circuit depth.The noisy simulation in the inset of Fig. 4 shows a similar trend.The iToffoli gate reported in Ref. [45] does not consider spec-FIG.5: System-qubit state fidelities in the response function calculation of NaH.(A to B) Fidelities between the raw experimental and exact system-qubit density matrices without (A) and with RC (B).The diagonal elements correspond to system-qubit density matrices in the diagonal circuits after taking the ancilla state a0 = 1, and the off-diagonal elements correspond to the system-qubit density matrices in the off-diagonal circuits after taking the ancilla states either as (a0, a1) = (1, 0) (upper diagonal) or as (a0, a1) = (1, 1) (lower diagonal).(C to D) Fidelities between the purified experimental and exact system-qubit density matrices without (C) and with RC (D).Layout of the tiles are the same as in panels (A) and (B).Without RC, purification raises the average off-diagonal fidelity from 45.2% to 67.4%, but with both RC and purification the average off-diagonal fidelity increases to 96.0%.
tator errors on neighboring qubits, which are cancelled out in the gate calibration in this work (details given in Supplementary Sec.III).The cycle benchmarking fidelity of the iToffoli gate accounting for the spectator qubit is 96.1%, lower than the single-qubit gate fidelities which are above 99.5% and the two-qubit gate fidelities which are between 97.6% and 98.8%, which may explain the steeper decay in fidelity with circuit depth in the iToffoli circuit compared to the CZ circuit.
Next, we examine the fidelity of the final state in each iToffoli-decomposed circuit used in the calculation of response functions.Figure 5 shows the system-qubit state fidelities on each response function circuit for NaH, where McWeeny purification is applied to the system-qubit density matrix after restricting the full density matrix to each ancilla bitstring sector.Comparing the values in Fig. 5a with those in Fig. 5b, we can see that RC itself only results in a moderate improvement in the fidelities, with the average diagonal fidelities changing from 84.6% to 85.5% and average off-diagonal fidelities changing from 45.2% to 54.8%.However, the results between Fig. 5b and Fig. 5d show that RC combined with purification yield an average diagonal fidelity of 99.9% and an average off-diagonal fidelity of 96.0%, even though purification without RC only leads to a limited improvement in the average diagonal fidelity from 85.6% to 95.7%, and in the average off-diagonal fidelity from 45.2% to 67.4% in Figs.5a and 5c.
Overall, the iToffoli decomposition yields comparable or better results compared to the CZ decomposition.Without RC, the iToffoli decomposition results in visibly better agreement with the exact response functions in Figs.6a and 6c, where the root-mean-squared (RMS) errors between the experimental and exact response functions are 0.0086 eV −1 and 0.0372 eV −1 for the iToffoli decomposition and 0.0387 eV −1 and 0.0278 eV −1 for the CZ decomposition for χ 00 and χ 01 , respectively.With RC, the two decompositions exhibit comparable RMS errors, which are 0.0081 eV −1 and 0.0143 eV −1 for the iToffoli decomposition, and are 0.0073 eV −1 and 0.0061 eV −1 for the CZ decomposition for χ 00 and χ 01 , respectively, as shown in Figs.6b and 6d.
In both χ 00 and χ 01 , RC can lead to substantial improvements in the experimental data for both decompositions, which is reflected in the heights of the symmetric peaks at ±1.4 eV and ±24.0 eV.For χ 00 without RC and using the CZ decomposition, the peak at ±1.4 eV is twice the height of the exact peak and the peak at ±24 eV is not present in the experimental observable in Fig. 6a.With RC, both features are present, with the deviations of the peak heights being 34.8% and 4.7%, respectively, as in Fig. 6b.For χ 00 with the iToffoli decomposition, the peak height deviations change from 6.1% and 26.6% without RC in Fig. 6a to 11.8% and 24.0% with RC in Fig. 6b.In this case, the results before and after RC are comparable since the experimental observable without RC is already in good agreement with the exact observable.For χ 01 , we see a more marked improvement due to RC compared to the corresponding results for χ 00 .Without RC, both iToffoli and CZ decompositions produce the wrong sign on the peak at ±24 eV in Fig. 6c, but with RC the signs are correctly predicted.The peak height deviations can then be computed as 39.2% and 32.2% for the CZ decompositions and 5.7% and 28.2% for the iToffoli decomposition, as indicated in Fig. 6d.The corresponding results of system-qubit state fidelities and density-density response functions for KH are given in Supplementary Sec.V, which follow a similar trend as NaH.
Since the iToffoli gate is non-Clifford, our implementation of RC results in incomplete Pauli twirling compared to applying RC to the CZ-decomposed circuits (see Supplementary Sec.IV).The incompleteness of RC on the iToffoli-decomposed circuits may explain why the two decompositions have similar RMS errors with RC despite the initial advantage for the iToffoli decomposition without RC due to its lower circuit depth.

DISCUSSION
We have carried out an LCU-based algorithm to compute the spectral functions and density-density response functions of diatomic molecules from the transition amplitudes determined on a superconducting quantum processor.Using a native high-fidelity iToffoli gate [45] has enabled the required circuit depth to be reduced by around a factor of two.These resulting circuits produced better agreement with the exact results compared to the circuits constructed only from CZ, CS, and CS † gates when RC is not employed.We also developed an RC protocol for the non-Clifford iToffoli gate, and have shown that without complete Pauli twirling on the iToffoli gate, the circuits constructed from iToffoli gates gave comparable results as the circuits constructed only from CZ, CS, and CS † gates with RC.
The quality of the computed quantities was greatly improved by the use of several error mitigation techniques.Specifically, our results highlight the significance of RC [46,47] combined with McWeeny purification [48] for quantum simulation.McWeeny purification has been widely used in quantum chemistry [57] and started to be exploited in quantum computing for constraining the purity of the output state [9,58].Our results have showed that RC or McWeeny purification individually only improves the experimental results to a limited extent, as observed in the change of the average off-diagonal fidelities from 45.8% to 54.2% with only RC, and to 67.4% with only purification in Fig. 5.However, the combination of RC and purification results in a significant improvement in the quality of the results with the systemqubit state fidelities being 96.0% on average.Moreover, previous works applied purification to the whole qubit register, but we have showed here that the purification scheme can be applied when there is purity constraint on a subset of qubits.Additionally, our work is the first to apply RC to the non-Clifford iToffoli gate.As more native non-Clifford two-qubit and multi-qubit gates become available, our findings may guide future application of RC to non-Clifford gates.
Our work is also among the first to demonstrate the practical use of a native multi-qubit gate in quantum simulation via an LCU-based algorithm.LCU as a general algorithmic framework is not limited to determining transition amplitudes in frequency-domain response properties, but has broader applications in areas such as solving linear systems [59], simulating non-Hermitian dynamics [60], and preparing quantum Gibbs states [61].
Besides the LCU algorithm, quantum algorithms such as Shor's algorithm [62] and Grover's search algorithm [63] can benefit considerably from native three-qubit gates with reduction in circuit depths and gate counts.Quantum algorithm design and implementation thus far have been mostly restricted to single-and two-qubit gates due to their ease of implementation and demonstrated high fidelity.Meanwhile, early implementations of three-qubit gates [64][65][66] were generally slower and more prone to leakage and decoherence compared to the iToffoli gate employed here due to populating higher levels outside the qubit computational space.However, more recent implementations of three-qubit gates [45,67,68] have begun to address these challenges yielding fidelities approaching those achieved with two-qubit gates.Further, they have been carried out on quantum devices with tens of qubits, suggesting their utility for larger-scale quantum devices.As such native multi-qubit gates become more prevalent, our work paves the way for using them as native gate components in future quantum algorithm design and implementation.

Quantum Circuit Construction
The molecular orbitals used in this work are in STO-3G basis with molecular integrals determined from PySCF [69].We use OpenFermion [70] to map the secondquantized Hamiltonian to qubit operators.The groundstate preparation gate on the system qubits are determined classically by constructing a unitary that maps the all-zero initial states to the ground state and then decomposed into three CZ gates and single-qubit gates using the KAK decomposition [71].The LCU circuits are then constructed by applying the gates shown in Figs.1b and 1c, where the SWAP gates are decomposed according to the scheme in Ref. [54] and the circuits are transpiled by the functions MergeInteractions, MergeSingleQubitGates and DropEmptyMoments in Cirq [72].The transition amplitudes are then combined with the classically determined ground-and excited-state energies to calculate the spectral functions and response functions (See Supplementary Sec.II).

Quantum Device
The quantum device used in this work is a superconducting quantum processor with eight transmon qubits.The algorithm is performed on a four-qubit subset of the device with linear connectivity.Single-qubit gates are performed with resonant microwave pulses.Multiplexed dispersive readout allows for simultaneous state discrimination on all four qubits.CZ gates between all nearest neighbors are performed according to the method in Ref. [73].The same method allows for a native CS gate on one pair and a native CS † gate on a different pair, according to the requirements of the algorithm.While single-qubit gates are applied simultaneously, microwave crosstalk requires that all two-and three-qubit gates are applied in separate cycles from each other as well as from any single-qubit gates.TrueQ [74] is used for circuit manipulations in the implementation of RC as well as gate benchmarking.Internal software is used to map the circuits to hardware pulses for implementing the native gate set.This work employs the recently developed C-iX-C iToffoli gate [45].In this section we outline the procedure for eliminating spectator errors during the gate application.Since the gate acts on a three qubit subset of the full four qubit subsystem we need to understand and correct the spectator error on the fourth qubit.For concreteness we label qubits Q i for i = 0, . . ., 3 with the iToffoli on Q 0 , Q 1 , and Q 2 with Q 3 as spectator and denote states in the form |Q 0 Q 1 Q 2 Q 3 .We run a simple circuit where we prepare the system in either |100+ or |101+ (where

III. CALIBRATION OF THE ITOFFOLI GATE
2) and apply the iToffoli gate in a Ramsey-like sequence to determine the Z rotation on Q 3 conditional on the state of its nearest neighbor, Q 2 (see Fig. S1a for circuit diagram).We observe an unwanted conditional phase interaction between Q 2 and Q 3 with conditional phase, ϕ = 0.844 (see Fig. S1b).This interaction results from the conditional Stark shift between Q 2 and Q 3 when a strong off resonant drive is applied to Q 2 at the frequency of Q 1 to implement the iToffoli gate [73].We can use the same effect to undo the conditional phase by applying simultaneous off-resonant drives to Q 2 and Q 3 for a 120 ns period following the iToffoli drive sequence.The full pulse sequence and characterization of the residual conditional rotation with the cancellation applied are shown in Fig. S1c-S1d.We benchmark the resulting gate implementation using Cycle Benchmarking [75].With this correction, we measure a gate fidelity of 97.8% when isolated to the cycle involving qubits that participate in the iToffoli gate, and a small reduction to 96.6% when including the idling spectator qubit.

IV. RANDOMIZED COMPILING FOR NON-CLIFFORD GATES
In this section we outline a modified version of randomized compiling (RC) [46,47] that is applied to the circuits used to compute the observables in the main text.RC is expected to mitigate errors and improve algorithm performance.A broad native entangling gate set is used, consisting of both Clifford gates (CZ) and non-Clifford gates (CS † , CS, iToffoli).RC is typically used with hard cycles of n-qubit Cliffords in which case the twirling group T is chosen to be the group of tensor products of n single qubit Paulis.By the definition of Clifford gates, for any Clifford C and twirling gate T ∈ T there is some T c ∈ T such that C = T CT c .RC applied to hard cycles of Clifford gates, C k then proceeds by choosing some T k for each hard cycle k, letting k and compiling the single qubit Paulis T k and T c k into the easy cycles of single qubit gates before and after, respectively, the Clifford cycle C k .In order to generalize the method to the non-Clifford gates employed in this work we first find the subsets T X ⊂ T for X = CS, CS † , iToffoli where for all T ∈ T X there is some T c ∈ T X such that X = T XT c .Then RC proceeds in the same way as above except the twirling gates for hard cycles consisting of gate X are simply chosen from the subset T X of Pauli strings that stabilize gate X.Both the CS and CS † are stabilized by 4 of the possible 16 two-qubit Pauli strings and the iToffoli is stabilized by 8 of the possible 64 three-qubit Pauli strings.For all these non-Clifford gates the twirling and inversion gate are the same, T = T c .Results "with RC" in the main text involve averaging the experimental bitstring output distributions over 100 equivalent circuit randomizations generated according to the process outlined here, with each circuit measured for 500 shots.These are compared to results "without RC" in which the bare circuit is measured for 50000 shots (such that the total number of shots is maintained between the two implementations).
All error processes can be described by a superoperator E acting on the system density matrix ρ.Written in the n-qubit Pauli basis this error process matrix is referred to as a Pauli Transfer Matrix (PTM) with diagonal elements giving Pauli fidelities and off-diagonal elements characterizing the unitary (coherent) and non-unitary (incoherent) errors.As discussed in [46,47], applying RC tailors coherent errors into stochastic Pauli noise, which suppresses the off-diagonal elements of the PTM.This holds for the PTM describing errors during the CZ cycles, since these undergo perfect Pauli twirling (in the limit of infinite randomizations).However, in the case of the non-Clifford gate cycles, the twirling is imperfect (since we only twirl over a subset of the n-qubit Pauli strings).As a result, some, but not all, of the off-diagonal elements in the corresponding PTMs are suppressed.In other words, not all coherent errors are tailored to stochastic Pauli noise.This imperfect noise tailoring is the main limitation of our approach to generalizing RC to non-Clifford gates.
We observe a small improvement in the raw state fidelities when using RC but a much larger improvement in the purified state fidelities with respect to the results without RC.The small improvement in the raw fidelities can be explained by the suppression of off-diagonal components of the error process matrix when using RC which lowers the overall error rate slightly.The larger improvement with purification is explained by the fact that RC tailors coherent errors into Stochastic Pauli errors.If the rates of various stochastic Pauli errors are similar, the errors are largely depolarizing and can be corrected by the purification procedure, yielding the high fidelities in Fig. 5d.The deviation of the noise from purely depolarizing is responsible (along with the finite number of randomizations) for the remaining infidelity after RC and purification are applied.Conversely, without RC a larger fraction of the error is a coherent over/under rotation of the two-qubit Bloch vector which cannot be corrected by purification.This section presents the system-qubit state fidelities and density-density response functions of the KH molecule.Figure S2 shows the system-qubit state fidelities in the response function calculations of KH.In the case of RC without purification, the average off-diagonal fidelity improved from 49.2% in Fig. S2a to 61.3% in Fig. S2b, whereas the average diagonal fidelity slightly decreased from 91.5% in Fig. S2a to 90.5% in Fig. S2b.Purification without RC shows a unified improvement across both diagonal and off-diagonal fidelities, which change to 97.1% and 66.9% from 49.2% and 91.5% respectively from Fig. S2a to Fig. S2c.The most significant improvement again comes from applying both purification and RC, with the average diagonal fidelity 99.8% and average off-diagonal fidelity 95.9% in Fig. S2d.
Figure S3 shows the density-density response functions of KH with all data postprocessed with McWeeny purification.As compared to the case for NaH, the results without RC only fail to reproduce the peaks or produce peaks with the wrong signs in Fig. S3c, but reproduce all the peaks in Figs.S3b and S3d.As for comparison between the iToffoli decomposition and CZ decomposition, the root-mean-squared deviations without RC are 0.0108 eV −1 for iToffoli and 0.0316 eV −1 for CZ in Fig. S3a, while the deviations are 0.1007 eV −1 for iToffoli and 0.0824 eV −1 for CZ in Fig. S3c.With RC, the root-mean-squared deviations are 0.0148 eV −1 for iToffoli and 0.0225 eV −1 for CZ in Fig. S3b, and 0.0386 eV −1 for iToffoli and 0.0064 eV −1 for CZ in Fig. S3d.In all cases with or without RC, the two decompositions show comparable results in the response functions.

FIG. 2 :
FIG. 2: Decomposition of the double-controlled composite gates in the LCU circuits.(A) Example of the decomposition of a double-controlled −ZZ gate into CCZ (blue) along with other single-and two-qubit gates.The X gates (green) are used to adjust the control states; the CZ gate on a0 and a1 (purple) is used to adjust the overall multiplicative factor of the Pauli string, which is −1 in this case; the equivalent CNOT gates (orange) are used to extend the Pauli string as in Ref. [53].(B) Decomposition of the CCZ gates with the iToffoli gate, which includes a CC-iZ part (light blue) and an equivalent long-range CS dagger part (yellow).The SWAP gates are simplified in the transpilation stage or further decomposed with CZ gates according to Ref. [54].

FIG. 3 :
FIG. 3: Spectral function of diatomic molecules.Spectral function of (A) NaH, (B) KH.The circuits to obtain the spectral function are shallow three-qubit circuits that do not require the iToffoli gates.The experimental spectral functions are in quantitative agreement with the exact ones, with maximum peak height deviation of 10.6%.

FIG. 4 :
FIG. 4: Fidelity versus circuit depth of the (0 ↑, 0 ↓)-circuit for NaH.Fidelity for the iToffoli decomposition (blue) and the CZ decomposition (yellow).The locations of the iToffoli gates are marked by red crosses.The CZ decomposition results in lower overall fidelity compared to iToffoli decomposition due to higher circuit depth.The inset is the corresponding data from noisy simulation and shows a similar trend.All results in this figure are raw experimental or simulated data without any error mitigation.

FIG. 6 :
FIG. 6: Density-density response function of NaH.(A) Im χ00 without RC.(B) Im χ00 with RC. (C) Im χ01 without RC.(D) Im χ01 with RC.All experimental results are postprocessed with McWeeny purification on the system-qubit states after restricting to the ancilla bitstring subspace.Without RC, the iToffoli decomposition yields qualitatively improved results compared to the CZ decomposition, as quantified by the root-mean-squared (RMS) error of 0.0086 eV −1 (iToffoli) compared to 0.0372 eV −1 (CZ) in Im χ00; with RC, the two decompositions exhibit comparable RMS error values of 0.0081 eV −1 (iToffoli) vs 0.0073 eV −1 (CZ) in Im χ00.

FIG. S1 :
FIG.S1: Cancellation of spectator error during iToffoli gate.(A) Ramsey protocol for detecting spurious ZZ error between Q2 and spectator Q3 during the application of the iToffoli gate.(B) |0 population for Q3 after application of the Ramsey sequence in (A) conditional on the state of Q2.The relative phase shift between the sinusoidal curves gives the unwanted conditional phase, φ, which must be corrected.(C) A pure iToffoli gate on Q0-Q2 is achieved by applying the iToffoli drive from Ref.[45] following by a CZ φ gate.(D) Same as (B) except now the CZ φ correction gate is applied immediately after the iToffoli drive, correcting the unwanted ZZ error.
FIG.S2: System-qubit state fidelities for the response function calculation of KH. (A to B) Fidelities between the raw experimental and exact system-qubit density matrices without (A) and with RC (B).(C to D) Fidelities between the purified experimental and exact system-qubit density matrices without (C) and with RC (D).Layout of the tiles in each panel is the same as in Fig.5in the main text.Similar to NaH, RC combined with McWeeny purification yields the highest system-qubit state fidelities.

FIG. S3 :
FIG. S3: Density-density response function of KH. (A) Im χ00 without RC.(B) Im χ00 with RC. (C) Im χ01 without RC.(D) Im χ01 with RC.All experimental results are processed with McWeeny purification.Similar to NaH, with RC and McWeeny purification, the results from iToffoli decompositions and CZ decompositions are comparable in reproducing the exact response function values.

TABLE I :
Pauli strings and qubit state bitstrings under Z2 transformations and truncations.Each Pauli string is characterized as up-spin or down-spin depending on whether it originates from a creation or annihilation operator on an up-spin or a down-spin orbital, and is characterized as spin-balanced if it originates from a number operator.The qubit state bitstrings are characterized as up-spin, down-spin or spin-balanced by the type of operator that yields the state after applying on the mean-field ground state |1100 , which are consistent with the classification based on the expectation values of the Z2 symmetry operators ZIZI and IZIZ given in the text.