Realization of quantum signal processing on a noisy quantum computer

Quantum signal processing (QSP) is a powerful toolbox for the design of quantum algorithms and can lead to asymptotically optimal computational costs. Its realization on noisy quantum computers without fault tolerance, however, is challenging because it requires a deep quantum circuit in general. We propose a strategy to run an entire QSP protocol on noisy quantum hardware by carefully reducing overhead costs at each step. To illustrate the approach, we consider the application of Hamiltonian simulation for which QSP implements a polynomial approximation of the time evolution operator. We test the protocol by running the algorithm on the Quantinuum H1-1 trapped-ion quantum computer powered by Honeywell. In particular, we compute the time dependence of bipartite entanglement entropies for Ising spin chains and find good agreements with exact numerical simulations. To make the best use of the device, we determine optimal experimental parameters by using a simplified error model for the hardware and numerically studying the trade-off between Hamiltonian simulation time, polynomial degree, and total accuracy. Our results are the first step in the experimental realization of QSP-based quantum algorithms.


INTRODUCTION
Several quantum algorithms are known to outperform their classical counterparts by computational costs that asymptotically scale better, e.g., Shor's prime factoring algorithm [1], Hamiltonian simulation [2,3] and Grover search [4,5].Their realization on actual quantum computers, however, requires additional qubits and gates to correct errors that naturally occur in real physical devices.Currently available noisy quantum computers are not capable yet of running such quantum algorithms for large problem sizes.
In the context of noisy quantum circuits, there are two regimes in which the classical computational requirements for simulating a quantum computer remain tractable.First, shallow circuits typically generate small amounts of entanglement making them amenable to classical simulation.Second, deep circuits quickly accumulate errors causing decoherence towards a regime which can also be treated efficiently on classical computers [6,7].Between these two extremes, there is an optimal working point at which maximum non-trivial quantum correlation is attained and where accurate simulation may become challenging for a classical computer [8].In light of this, a promising route towards achieving a genuine quantum advantage without fault tolerance is to realize the aforementioned algorithms while operating the computer at its optimal working point.In order to design such an algorithm, it is therefore essential to account for the influence of noise on the circuits which implement it.
In this work, we propose to heuristically optimize the depth of quantum circuits and operate where we can make the most out of our noisy quantum computer.With this heuristic approach, we provide the first realization of quantum signal processing (QSP) on a trapped-ion quantum computer.QSP was proposed in [9] and is now recognized as one of the most powerful frameworks for developing quantum algorithms.It gives a unifying perspective on seemingly distinct algorithms such as amplitude amplification and the quantum linear systems algorithm and improves on their computational resources [10,11].Such flexibility stems from the fact that QSP allows one to apply almost any polynomial transformation to an input scalar or matrix.In the literature, QSP often refers to a polynomial transformation applied to an input scalar, and its generalizations apply a polynomial transformation to eigenvalues (QET) or singular values (QSVT) of an input matrix.Throughout this article, we do not make such a distinction and refer to all these protocols as QSP.Hamiltonian simulation is an example where QSP provides an improved asymptotic scaling over other algorithms.Since Feynman's seminal proposal [12], Hamiltonian simulation has been a fundamental problem of quantum computing.An efficient Hamiltonian simulation algorithm allows us to simulate the real-time dynamics of a quantum system described by a Hamiltonian H with computational resources scaling at most polynomially in evolution time t, system size n, and inverse of required accuracy 1/ϵ.Extensive studies have been devoted to exploring efficient algorithms for Hamiltonian simulation, which include product formulas [2,[13][14][15], quantum walks [16], the truncated Taylor-series expansion [17], randomized protocols [18][19][20][21][22], and making use of classical optimization techniques [23][24][25].Nowadays, the QSP-based algorithm is known to exhibit nearly optimal asymptotic scaling [10,26,27] (see also [28] for a comparative survey).
In [29], the authors demonstrate the QSP protocol us-arXiv:2303.05533v3[quant-ph] 27 Sep 2023 1. observables 2. entanglement 3. etc. heuristic: find optimal degree for error rate find QSP angles rescale parameters FIG. 1.The proposed protocol for the realization of QSP on a noisy quantum computer.We choose Hamiltonian simulation as the application.We start with a necessary preprocessing step (A) that maps the input parameters to an effective Hamiltonian H and an effective simulation time t.In step (B), H is embedded in a unitary operator.By classically optimizing/compiling a circuit W this step produces a compressed version of a block-encoding circuit.Next, in the operator-function design (C), we approximate the real-time evolution function, e −ixt , by a polynomial f (x) of degree d.While increasing the degree leads to a more accurate polynomial approximation, the computation suffers from larger noise effects.This is due to the growing depth of the QSP circuit, consisting of O(d) primitive gates.By accounting for the error rate pTQ of two-qubit gates, we heuristically estimate the optimal degree yielding the smallest combined error.The processing step (D) finally realizes QSP using the compressed block-encoding circuit W and the designed polynomial f (x).Upon postselection on the ancilla's measurement outcomes, we obtain an approximation to the desired real-time evolution e −iHt .An error mitigation scheme based on the error rate pTQ further reduces the effect of noise on the output.
ing random Hamiltonians on a superconducting device for the purpose of benchmarking.The present work takes a step forward by realizing QSP on the Quantinuum H1-1 trapped-ion quantum computer and performing the Hamiltonian simulation of physically relevant quantum systems.After the release of the present manuscript, another group demonstrated QSP for the task of quantum channel discrimination [30].

Review of Hamiltonian simulation by quantum signal processing
The Hamiltonian simulation algorithm solves the realtime dynamics of a quantum system by applying a realtime evolution operator e −iHt to some initial state |ψ 0 ⟩, where the Hamiltonian H is given by a Hermitian operator in this work.We employ QSP in order to find an approximate real-time evolution operator that can be efficiently implemented on a quantum computer.QSP outputs a degree-d polynomial f ∈ C[x] using a sequence of unitary operators [9,10,26,27], where * stands for an unspecified entry.Here, we follow the convention of Corollary 8 in [10] (preprint version), where W (x) takes the form of a reflection operator.For a polynomial f (x) that satisfies certain conditions [10,11] there always exists a set of QSP angles {ϕ k }.The conditions are: (i) One way to find f is to consider the polynomial approximation to the exponential function given by the Jacobi-Anger expansion [26], where J i (t) is a Bessel function of order i, and T i (x) is a Chebyshev polynomial of order i.Tolerating an error ϵ poly , the polynomial can be truncated at degree which is almost linear in t and logarithmic in 1/ϵ poly .
Here, we use the big-Θ notation, i.e., for functions f and g we write f (x) = Θ(g(x)) if there exist constants c 1 , c 2 , and The goal is to apply this polynomial transformation to the eigenvalues of the Hamiltonian H.This is achieved by block encoding H, i.e., embedding H in a unitary operator W(H) acting on a larger Hilbert space.A number of block-encoding methods have been proposed in the literature [10,27,[31][32][33] and their applicability depends on the form of the Hamiltonian.For instance, one can employ the linear-combination-of-unitary (LCU) method when H is given as a weighted sum of unitary operators [34].Then, by identifying a subspace analogous to a one-qubit space, the block-encoding unitary W(H) and a generalized rotation operator S(ϕ) behave like the single-qubit operations W (x) and S(ϕ) in Eq. (1).
Our aim is to run a small-scale QSP-based Hamiltonian simulation on a quantum computer with no faulttolerance mechanism.This is challenging because noise limits the maximum depth of our circuits.We present a practical protocol to run the Hamiltonian simulation by QSP, while taking hardware noise into account.

Preprocessing
Recall that QSP applies a polynomial transformation to the eigenvalues of the Hamiltonian.The eigenvalues need to be rescaled in a suitable interval so that the Hamiltonian can be encoded as a sub-block of a unitary operator.By unitarity, the largest possible interval in Eq. ( 4) is [−1, 1].However, the protocol is made more efficient if we further narrow the interval down to [0,1] and approximate e −ixt by an even function of x [35].A general preprocessing method to rescale the spectrum of where λ + and λ − are upper and lower bounds on the eigenvalues, respectively (see Fig. 1A).To recover the desired time evolution, we counterbalance with a time rescaling This yields the desired real-time evolution operator up to an irrelevant global phase: e −i t H = e −iϕ e −itH , where The exact minimum λ min and maximum λ max eigenvalues are unknown and finding them is computationally intractable in general [36][37][38].
That is why we resort to bounds.Equation (8) shows that the effective evolution time t increases as the QSP interval [a, b] gets smaller, and as the eigenvalue bounds get looser.For example, suppose λ ± are taken such that (λ , the bounds λ +/− are 100r% off from λ max/min .From Eq. ( 8) we obtain The first term is the smallest effective time achievable, while the second term is extra overhead.Note that t determines the polynomial degree d (e.g., Eq. ( 6) for the truncated Jacobi-Anger expansion), and thus the circuit depth.
When the Hamiltonian is provided as a weighted sum where ∥•∥ is the spectral norm.Tighter bounds can be obtained by relaxing the ground-state constraints [39,40] and/or exploiting some structure in the Hamiltonian.For translationinvariant systems, the Anderson bound [41], and a particular semi-definite program relaxation, can provide a lower bound with an error that is independent of system size [42].Furthermore, for a large class of local Hamiltonians, one can formulate a hierarchy of semi-definite programming constraints with increasing complexity that can be solved numerically with tensor network and renormalization group techniques [43].

Compressed block-encoding
The second key step of the protocol (Fig. 1B) is to input the Hamiltonian to the quantum computer so that it can be processed.For ϵ BE ≥ 0, a block-encoding W of H is defined by where ∥•∥ F is the Frobenius norm and the integer a is the number of ancillary qubits.Note that (⟨0 a |⊗I)•(|0 a ⟩⊗I) projects onto the subspace where the ancillary qubits are in the all-zero state.The accuracy of the block encoding is specified by the parameter ϵ BE .Depending on the form of H, there exist different block-encoding methods [10,27,[31][32][33][34].While such generic methods are scalable in principle, the required number of ancillary qubits and the circuit depth may preclude an implementation on current noisy quantum devices.Here, we propose two ways to overcome this by compressing the block-encoding circuit.
First, we use a parameterized quantum circuit W = W(θ) as ansatz and minimize Eq. ( 10) with respect to the parameters θ.The possible presence of barren plateaus in the optimization landscape could prohibit quantumclassical hybrid methods from being efficient at larger system sizes [44][45][46].In this case, a fully classical approach is preferable [47].We thus suggest to use tensor network ansätze that can be efficiently optimized on a classical computer.
Second, we make use of multiplexor circuit compilation to compress the LCU block-encoding circuit [48,49].The multiplexor compilation reduces the number of elementary gates required to implement sequential multicontrolled unitary operations which are heavily used in the LCU circuit.Since the compilation adopted here does not introduce approximation error, it provides an exact block-encoding, i.e., ϵ BE = 0.
In the Methods section we discuss both approaches in more detail.

Operator-function design
The depth of a QSP circuit is proportional to the degree d of the polynomial.When using noisy devices, we must fix d so that the final circuit has a reasonable fidelity.Later on, we provide a heuristic to choose d as a function of t and hardware noise.For now, let us assume that d is fixed and proceed to the function design (Fig. 1C).Instead of using the Jacobi-Anger expansion, we numerically optimize the QSP angles {ϕ k }.The preprocessing step has rescaled the eigenvalues of H in [a, b] ⊆ [0, 1], so we restrict the optimization to that interval.Furthermore, we can utilize polynomials of even parity, i.e., QSP polynomials of even degree d.The resulting accuracy is Figure 2(a) shows the accuracy for different values of degree and evolution time.For each value of d, we find the QSP angle sequence using a dedicated python package called pyqsp [50].As expected, the error decreases as the degree gets larger for a given evolution time.It is also observed that the error increases as the evolution time gets longer for a fixed degree.
The error stemming from both block-encoding (10) and operator-function design (11) propagates to the accuracy of the whole algorithm.This is found by expanding the error as [10,31], where we have defined In the third line, we use inequality ∥e Lemma 50 in [31], preprint version) and the fact that the spectral norm is upper bounded by the Frobenius norm.
Let us now incorporate the effect of hardware noise via a simple noise model.This allows us to develop a heuristic for estimating the optimal polynomial degree, given the evolution time and the noise rate of our quantum device.Letting |ψ 0 ⟩ be a n-qubit initial state and |0 a ⟩ be the a-qubit ancillary state, the quantum computation is described by where U QSP represents the unitary implementing the QSP protocol, which will be defined later in Eq. ( 17).We model the noise effect of the hardware with the depolarizing channel D p acting on the entire system.It alters the state to where we set p = 1 − (1 − p TQ ) NTQ with the two-qubit gate infidelity p TQ and the number of two-qubit gates N TQ in the U QSP circuit.The fidelity between this state and the ideal target state |ψ t⟩ := e −i Ht |ψ 0 ⟩ quantifies the error,  11), using pyqsp [50].(b) Upper bound to the infidelity, Eq. ( 16), as a function of degree d and evolution time Jt.(c) For each evolution time, the optimal degree dopt is the degree that minimizes the total error ϵ total in (b).
Thus, the corresponding infidelity is bounded as Figure 2(b) shows the upper bound in Eq. ( 16) as a function of degree and evolution time, where the algorithmic error ϵ QSP [Eq.(12)] is obtained for the Hamiltonian given in Eq. ( 23).The two-qubit gate error rate is set to p TQ = 2.577 × 10 −3 (see Methods for details) and the circuits of degree d ∈ {2, 4, 6, 8, 10, 12, 14} contain N TQ ∈ {52, 98, 144, 190, 236, 282, 328} two-qubit gates, respectively.In contrast to the operator-function design error in Fig. 2(a), the total error in Fig. 2(b) has a sweet spot for each value of Jt.Intuitively, the increase of the degree reduces the algorithmic error ϵ QSP while making the noise effect more prominent due to the larger circuit depth.This motivates the following heuristic: for a given evolution time, pick the degree that minimizes the upper bound on the total error (16) (see [51,52], where a similar approach has been applied to Grover's algorithm).Importantly, this step of the protocol does not require the use of a quantum computer.The optimal degree for Eq. ( 16) is found numerically using classical computation.Additionally, the sweet spot may coincide with the hardware's optimal working point where we expect a classical simulation of the corresponding noisy quantum circuit to be most challenging [6,8], further justifying our heuristic choice.
Figure 2(c) shows that the optimal degree d opt is approximately linear in the evolution time t.The estimated degrees are corroborated by the complementary numerical study that we carried out and presented in the Methods section.It is important to emphasize that our ap-proximately linear scaling in time is different from the one expected by noiseless QSP.Our heuristic is designed to run the noisy quantum computer to its full potential, but may still produce large errors.This happens when the simulation parameters {H, t, p TQ } are not compatible in the first place.For instance, at a fixed error rate p TQ and large simulation time t, it is reasonable to expect a large infidelity.In contrast, Hamiltonian simulation by noiseless QSP achieves linear scaling in time while providing full control over the total error.For example, one can use a perfect block-encoding, ϵ BE = 0, along with the desired approximation error ϵ poly in Eq. ( 6).

Processing
In this last step of the protocol, we apply the polynomial f found in Eq. ( 11) to the block-encoded Hamiltonian W (Fig. 1D).For an even integer d, the QSP unitary takes the form [10,11], where the direct sum is taken over the eigenstates {|λ W ⟩} of W and the upper-left block of the matrices represents the |0 a ⟩ ⟨0 a | component of the corresponding operators.Thus, starting from the initial ancillary state |0 a ⟩, and post-selecting on the ancillary state |0 a ⟩ at the end, we obtain which approximates the desired real-time evolution operator e −iHt .Let us now discuss how to post-process the measurement results and mitigate the noise effects on observables.We let the noisy quantum state simulated on the hardware before any measurement be η, which is generally different from the state affected only by the depolarizing channel given by Eq. ( 14).For simplicity, we consider the expectation value, Tr[ P η], of P := |0 a ⟩ ⟨0 a | ⊗ P , where P is a Pauli operator acting on the system register.The variance is Var η, P = Tr[ Īη] − Tr[ P η] 2 .We mitigate the noise effects by modelling it with the depolarizing channel [53][54][55].In particular, we use the same noise model that we previously employed when estimating the optimal polynomial degree.The expectation value of P with respect to the state in Eq. ( 14) is where p = 1 − (1 − p TQ ) NTQ .We infer the noiseless expectation value from the noisy expectation value as This is understood as mitigating the depolarizing noise, at the cost of a larger variance, Var mitig η, This implies that the number of samples needed to achieve a fixed sampling error increases exponentially in N TQ .Therefore, reducing the depth of the circuit is extremely important even though the noise effect on the expectation value ⟨ P ⟩ mitig η is mitigated.

Hardware experiment
In order to demonstrate the protocol, we perform the QSP-based Hamiltonian simulation experiments on the Quantinuum H1-1 trapped-ion quantum computer.We simulate the real-time dynamics of the quantum system described by the one-dimensional Ising spin Hamiltonian We quantify entanglement growth by bi-partitioning the system into subsystems A and Ā and then computing the time dependence of the von Neumann entropy FIG. 3. Sketch of the setup for the five-qubit experiment.(a) The system consists of the two-qubit ancillary register (orange ions) and the three-qubit system register.The latter is further partitioned into the one-qubit subsystem A (a red ion) and its complement Ā (blue ions).(b) The H1-1 quantum computer operates by manipulating the ions representing the qubits.Each quantum operation (initialization, gate application, measurement) is performed using lasers after the target ions are transported to one of the isolated interaction zones.
In the experiments we use five out of the 20 qubits available, and apply up to 328 two-qubit gates.
and the degree-2 Rényi entropy on the n A -qubit subsystem A, where We perform state tomography by measuring the Pauli expectation values via for an operator P ∈ Pauli A := {I, X, Y, Z} ⊗n A \{I ⊗n A } on A (see Methods), which leads to an estimator of the density matrix, Since the denominator of Eq. ( 26) would be one in the absence of algorithmic error and noise effects, the quantity in Eq. ( 26) approximates the expectation value of the Pauli operator P as is further discussed in the Methods section.We note that the computation of von Neumann entropy is not scalable in general.However, the current procedure can be straightforwardly applied to the computation of degree-2 Rényi entropy using the swap trick [56][57][58][59][60][61] or randomized measurement protocols [62][63][64][65][66][67].The H1-1 system operates by controlling the S 1/2 hyperfine clock states of trapped 171 Yb + ions, which play the role of qubits [68,69]; there are a total of 20 qubits in the system at the time the experiments are conducted (see [70] for details on the H1-1 system).In addition to single-qubit rotations, a two-qubit native gate exp(−iθZ ⊗ Z/2) with θ ∈ R can be applied to an arbitrary pair of qubits giving the system all-to-all connectivity.This is enabled by the ability of the H1-1 system to move any pair of ions to one of five isolated interaction zones where quantum operations (initialization, gate application, measurement) are executed in a manner that suppresses the rate of crosstalk and allows for high-fidelity two-qubit gates.
In the first experiment, we consider the n = 3 Ising spin chain with h i /J = −1.05for all i and m/J = 0.5 in Eq. (23).The system is known to display rapid growth of entanglement [71,72].We preprocess the Hamiltonian H given in Eq. ( 23) to find H via Eq.( 7) with a = 0, b = 1, and λ ± = ±(2J + 3h + 3m).We obtain a compressed block-encoding circuit by variational optimization using two ancillary qubits and L = 3 layers obtaining an error ϵ BE = 1.8 × 10 −2 (see Methods for details).The subsystem A is taken to be the zeroth site of the system register (see Fig. 3 for a schematic of this five-qubit experiment).
Figures 4(a) and (b) show the growth of entanglement entropies with time for our system.The exact time evolution data (dashed line) is obtained from the exact application of the operator e −iHt to the initial state |ψ 0 ⟩.The experimental data obtained from H1-1 is reported with error mitigation (orange circles) as well as without error mitigation (green squares).The noiseless QSP simulation data (blue diamonds) is obtained by classically simulating the algorithm without the noise effects.Error bars represent one standard deviation due to sampling error.
The error-mitigated experimental data agree well with Step Description of the error Possible improvements Preprocessing The spectrum of the Hamiltonian is rescaled using crude upper and lower bounds.This leads to a longer effective evolution time t in Eq. ( 9).

Compressed block-encoding
Imperfect block-encoding if variational optimization of circuit parameters is used.This leads to an error ϵBE in Eq. (12).
A more expressive circuit ansatz and a higher performance classical optimizer/compiler, e.g., using the methods in [24].

Operatorfunction design
The real-time evolution operator is approximated by a polynomial of fixed degree.This leads to an error ϵ poly in Eq. (12).

Processing
Each two-qubit gate fails with some probability pTQ.This leads to a reduced fidelity in Eq. ( 15).
TABLE I. Summary of the main sources of error in our QSP protocol and how they can be improved upon.Note, improving upon some errors affects the other errors in non-trivial ways.
the exact values and with the noiseless data up to Jt = 0.6, while there is a discrepancy between the unmitigated data and the rest from as early as Jt = 0.1.We also observe that the error-mitigated data show larger sampling errors (error bars) than the unmitigated data as expected from Eq. ( 22).The experimentally obtained entanglement entropies generally yield larger values than the exact ones due to algorithmic error and noise effects, which induce the interaction among the system register, ancillary register, and environment surrounding the device.Thus, the von Neumann and Rényi entropies computed on the subsystem A measure the entanglement not only with the system Ā but also with the ancillary register and environment.Nevertheless, our protocol mitigates these erroneous impacts well.In particular, the agreement between the mitigated experimental data and exact values indicates that our protocol brings both QSP algorithmic error and noise effects under good control for the range of parameters that we assessed.
In the second experiment, we simulate the real-time evolution of the n = 4 Ising spin chain with h 1 /J = 1 and h i /J = m/J = 0 for i ̸ = 1 in Eq. (23).We begin by constructing the exact LCU block-encoding circuit (ϵ BE = 0) which uses a = 3 ancillary qubits and 125 two-qubit gates.We compress this circuit using multiplexor compilation and obtain an equivalent circuit with only 44 two-qubit gates.This is a reduction of 64.8% of the original LCU circuit size (see Methods for details).We evolve the initial state |ψ 0 ⟩ = |+⟩ ⊗4 on the system register and make 1000 measurements to compute each Pauli expectation value [Eq.( 26)] at each time Jt ∈ {0.1, 0.4, 0.7}.We again follow the heuristic in Fig. 1C to find d opt ∈ {2, 4, 8} for each evolution time Jt.However, we use a different two-qubit gate infidelity, p TQ = 2.185 × 10 −3 , following an update to the H1-1 device after our first experiment.The resulting number of two-qubit gates in each circuit is N TQ ∈ {102, 204, 408}.
We choose the zeroth and first sites of the system register to represent subsystem A. The calculated entanglement entropies are shown in Figs.4(c) and (d).The discrepancy between the noiseless data (blue diamonds) and exact data (dashed line) is due to the degrees d opt being smaller than those found in the first experiment.Indeed, the heuristic has taken into account the increased number of qubits and two-qubit gates for this second experiment.The degrees found by our heuristic lead to a good agreement between the noiseless data and errormitigated experimental data (orange circles), except for Jt = 0.7.Note that this parameter setting (Jt = 0.7) yields our largest quantum circuit with as many as 408 two-qubit gates.This experiment exemplifies the importance of finding the optimal working point to balance the algorithmic error, hardware noise, and parameter setting.

DISCUSSION
We propose a detailed protocol to perform QSP-based Hamiltonian simulation tailored to noisy quantum hardware.Each process is carefully studied to clarify the sources of error in the estimate of target observables, as summarized in Tab.I.In particular, the polynomial approximation is designed such that the combined error caused by the QSP protocol and noise effect is minimized.The block-encoding circuit is compressed to further reduce the circuit depth for experimental purposes.An error mitigation scheme is used to increase accuracy in the estimate of target expectation values.
We execute the protocol on the Quantinuum H1-1 quantum computer.As an illustration, the time evolution of von Neumann and degree-2 Rényi entanglement entropies are computed.The results from the hardware experiments agree not only with those from noiseless simulations but with exactly obtained values, which implies the algorithmic error and noise effects are well controlled in the range of parameters that we chose.
An important question is whether the approach can scale to larger demonstrations.Both our heuristic and error mitigation schemes are derived under a simple noise model for the hardware at hand.A sophisticated error model may be required to obtain more accurate outputs for larger instances.Beyond that, one can use quantum error detection codes (see, e.g., [76] for the code tailored for the Quantinuum H1 system) to generate more reliable results at the cost of discarding a portion of the circuit runs, or apply algorithm-level error correction [77] for noisy QSP.Finally, it is noted that there exist block-encoding schemes with asymptotically efficient scaling [10,27,[31][32][33].Their required quantum resources are, however, still beyond the capability of currently available quantum devices.The techniques employed in this article to compress block-encoding circuits are potentially useful to perform larger-scale QSP realizations.
While further theoretical improvements are still required to scale up the protocol, the present study has taken the first step in the experimental realization of QSP-based algorithms and applications.

Compressed block-encoding by variational optimization
Here we elaborate on the block-encoding techniques used in this work.The goal is to optimize a parameterized quantum circuit, W(θ), to minimize the blockencoding error, with θ referring to the collection of all the parameters in the circuit.This is equivalent to minimizing the cost function, where we used that H is a Hermitian operator.Provided that the Hamiltonian is expanded as H = ℓ c ℓ P ℓ with n-qubit Pauli operators {P ℓ }, the error ϵ BE is obtained from F (θ) by We consider a particular structure for the parameterized quantum circuit which satisfies the reflection condition W(θ) 2 = I ⊗n .This condition is not crucial to the construction of QSP.However, we empirically found that the constraint makes optimization of block encoding easier.One ansatz satisfying the reflection condition is shown in Fig. 5 and given by where V (θ) is a unitary operator specified by the right circuit of Fig. 5, and CZ stands for the sequential application of controlled-Z gates that is shown in the middle of the upper circuit.
The parameterized quantum circuit W(θ) shown in Fig. 5 is composed of the following gates: where each gate has an independent variational parameter θ.Importantly, these gates are part of the native gate set of the Quantinuum H1-1 quantum computer.
In the present work, the optimization of the blockencoding circuit is performed by minimizing the cost function given in Eq. ( 30) using a classical state-vector simulation and the quasi-Newton BFGS method [78].The optimization is stopped when the gradient norm of the cost function falls below the threshold value 1×10 −5 .The accuracies of the optimized block encoding circuits for the 3-site and 4-site Ising spin Hamiltonian are shown in Fig. 6.In the experiment of the 3-site Ising spin chain, we use the circuit with a = 2 and L = 3, which requires (a+n−1)(2L+1) = 28 R ZZ gates.The optimized circuit has block-encoding error ϵ BE = 1.8 × 10 −2 .
We briefly discuss a classical method based on tensor network techniques.By expressing the cost function [Eq.( 30)] as a tensor network contraction and using a classical optimizer to find the parameters θ, a blockencoding circuit W(θ) which minimizes ϵ BE can be found.The terms in the cost function Eq. (30), Tr( W † W) and Tr( H W), can be evaluated using tensor network contractions as illustrated in Fig. 7.
The cost function in Eq. ( 30) can be variationally optimized using a classical optimizer, for instance, we can employ a gradient-based method as follows.At each iteration i, we require the gradient vector G (i) of the objective function F (θ) at θ = θ (i) : The partial derivatives in each gradient are straightforward to compute via the first of the variational gates given in Eq. (33).We then iterate with some learning parameter γ > 0 to update the parameters.The iteration is repeated until the norm of the vector of gradients falls below a predefined convergence threshold.
One could improve the convergence rate by additionally computing the Hessian matrix H (i) at the cost of The circuit inside the dashed box is repeated L times with new variational parameters added for each layer.The single-and two-qubit gates used in the circuit are R(θ) = exp(−iθ (3) X/2) exp(−iθ (2) Z/2) exp(−iθ (1) X/2) and RZZ (θ) = exp(−iθZ ⊗ Z/2).
In our five-qubit experiment, we use the bottom n(= 3) qubits as the system register and the top a(= 2) qubits as the ancillary register.more evaluations of operator expectation values: Then, the parameter update in Eq. ( 37) is replaced with, For the computation of the inverse of the Hessian matrix, we use the fact that this matrix is Hermitian and since our goal is to minimize the objective function in Eq. ( 30), we are only interested in its positive eigenvalues.Therefore we compute the pseudo-inverse via the eigendecomposition of the Hessian matrix and set all eigenvalues µ k smaller than some small cutoff ϵ to zero, e.g., ϵ = 1 × 10 −5 .More specifically, the pseudo-inverse is computed by replacing µ k by 1/µ k in the diagonal matrix of the eigendecomposition using only the positive eigenvalues µ k ≥ ϵ (all other eigenvalues are set to zero).

Compressed block-encoding by multiplexor compilation
As an alternative approach to compressing a blockencoding circuit, we employ the linear-combination-ofunitaries (LCU) method [34] with the help of an efficient compilation of multi-controlled unitary gates (multiplexors).LCU provides a way to block encode H when it is expressed as a weighted sum of unitary operators, {P ℓ } K ℓ=1 , H =

With these,
gives an exact block encoding of H, i.e., ϵ BE = 0.The bottleneck of this construction is the implementation of B, which contains a sequential application of multi-controlled-P ℓ gates.We make use of the compilation technique of multiplexor, which is developed in [49] based on [79,80], to reduce the gate complexity without introducing extra ancillary qubits.In the block-encoding of H, we use A = Had ⊗3 with the Hadamard gate, Had, and apply the multiplexor compilation to B shown in the right panel of Fig. 8.This results in 44 R ZZ gates for the block-encoding circuit W. Indeed, the number of R ZZ gates is significantly reduced relative to the circuit obtained without the compilation, which uses 125 R ZZ gates.

Heuristic estimation of the optimal degree
One key aspect of this work is the estimation of the optimal degree for the QSP polynomial given a certain noise rate.Our heuristic uses the upper bound ϵ total on the infidelity between the noisy and target states under a simplified noise model.Here we discuss the noise model and provide further numerical results.
For our numerical study, we replace all the two-qubit gates, R ZZ (θ) = exp(−iθZ⊗Z/2) for θ ∈ R, by two-qubit depolarizing channels: where σ is some quantum state and we use the error parameter p 2 = 2.416 × 10 −3 .This value is the two-qubit fault probability reported in the System Model H1 Emulator Product Data Sheet [70].In particular, in the System Model H1-1 Emulator, the probability p 2 is chosen such that the faulty R ZZ (π/2) modeled by the following two-qubit depolarizing channel D (2) combined with the other noise channels emulates the noise of Quantinuum H1-1 quantum computer: where Tr (2) indicates the trace over the two-dimensional subspace which the channel D (2) acts on.We remark that, in the H1-1 Emulator, the faulty R ZZ (θ) is modeled by the channel D (2) with θ-dependent fault probability p 2 (θ) (see [70] for more details).In the present work, we simplify the noise model by using p 2 = 2.416 × 10 −3 for all the two-qubit gates, R ZZ (θ), independent of the angle θ as given by Eq. (42).To clarify the relation between this parameter and the error parameter p TQ used throughout our protocol (see Figure 1), we note that the same channel D (2) is expressed as Therefore, the new error parameter is identified with p TQ = (16/15)p 2 = (16/15)2.416× 10 −3 = 2.577 × 10 −3 .This is the error parameter used in our infidelity bound.
To strengthen our argument, we verify the infidelity bound using exact density matrix emulations of noisy quantum circuits.We let the density matrix numerically obtained by the QSP protocol with the noise channel (42) be η sim .Figure 9(a) shows the infidelity bound, while (b) shows the exact infidelity.It is seen that the locations of minima in Figs.9(a) and (b) are close to each other for each evolution time Jt.This observation supports that the degree d minimizing ϵ total is likely to lead to the smallest possible error on noisy hardware.We emphasize that our heuristic does not require the use of a quantum computer beforehand.The optimal degree is found numerically using classical computation.

Processing with depolarizing error mitigation
In our hardware experiment we employed state tomography to compute the entanglement entropies.To this end, we estimated the expectation value of a Pauli operator P on the system register by This is understood as taking the expectation of P with the normalized post-selected state.Given an initial quantum state |ψ 0 ⟩ on the system register, we wish to approximate the time-evolved state e −iHt |ψ 0 ⟩ ⟨ψ 0 | e iHt by applying the QSP unitary followed by the post-selection.We simulate the protocol on the quantum hardware.Let η be the experimentally obtained state on the system and ancillary registers before any measurements, and let η be the state that is Then, the expectation value of a Pauli operator P with respect to η is This can be estimated with n shots circuit executions with the variance Var η,P = ⟨P ⟩ 2 η Var η, where the variances inside the parenthesis are given by Var η, P = (⟨ Ī⟩ η − ⟨ P ⟩ 2 η )/(n shots − 1) and Var η, Ī = (⟨ Ī⟩ η − ⟨ Ī⟩ 2 η )/(n shots − 1).
To mitigate noise effects, we model them by a depolarizing channel D p [55] applied to the entire system.Upon application of D p , the state σ becomes Assuming that the dominant source of error in the experimentally obtained state η is depolarizing noise, we infer the noiseless expectation value as, This is Eq. ( 46) and is understood as mitigating the depolarizing noise, at the cost of a larger variance, where the approximate equality is due to the QSP algorithmic error and other types of noise effects.This implies that the variance, and hence the required number of samples, increases exponentially in N TQ to achieve some fixed sampling error.

DATA AVAILABILITY
The data that support the findings of this study are available at Zenodo [81].

CODE AVAILABILITY
The code used to create the figures in this paper is available from the authors upon reasonable request.
is implemented by computing such angles {ϕ k }, and is encoded in the expectation ⟨0| U QSP |0⟩.It is evident from Eq. (1) that the circuit depth is proportional to the degree d.Finding an efficient Hamiltonian simulation algorithm with QSP starts by approximating the function e −ixt with a fixed-degree polynomial on an interval I ⊆ [−1, 1].Given time t > 0 and accuracy ϵ poly , we find a polyno-mial f such that max x∈I |f (x) − e −ixt | ≤ ϵ poly .

FIG. 2 .
FIG.2.Heuristic search of optimal parameters for the five-qubit hardware experiment.(a) Accuracy of QSP angle optimization, Eq. (11), using pyqsp[50].(b) Upper bound to the infidelity, Eq. (16), as a function of degree d and evolution time Jt.(c) For each evolution time, the optimal degree dopt is the degree that minimizes the total error ϵ total in (b).

FIG. 4 .
FIG. 4. Experimental results.(a) The von Neumann entanglement entropy and (b) the degree-2 Rényi entanglement entropy of the five-qubit experiment on the H1-1 quantum computer.(c) The von Neumann entanglement entropy and (d) the degree-2 Rényi entanglement entropy of the seven-qubit experiment.Error bars represent one standard deviation due to sampling error.

FIG. 6 .
FIG.6.Error ϵBE of the block encoding circuit as a function of the number of layers L and for each number of ancillary qubits a.We use the Ising spin Hamiltonian with hi/J = −1.05for all i and m/J = 0.5.The system size n is three in (a) and four in (b).

K
FIG. 7. Tensor network contractions for the evaluation of the cost function.(a) Contraction of Tr( W † W) for W of Fig. 5(a).(b) Contraction of Tr( W † H) for W of Fig.5(a) and H represented by a matrix product operator.Note that the terms in the gradient(37) and Hessian (38) can be evaluated using similar tensor network contractions.

FIG. 8 .
FIG.8.Quantum circuit diagrams for compressed block-encoding by multiplexor compilation.(Left) Structure of the LCUbased block encoding W given by Eq. (41).The top three and bottom four qubits represent the ancillary and system registers, respectively.(Right) The sub-circuit B used for block-encoding the n = 4 Ising spin Hamiltonian with h1/J = 1 and hi/J = m/J = 0 for i ̸ = 1, before the multiplexor compilation is applied.

FIG. 9 .
FIG. 9. Numerical verification of the infidelity bound used in this work.(a) The upper bound of the infidelity between the target and simulated states.(b) The infidelity between the target state and simulated state with the noise model in Eq. (42).The locations of minima in (a) and (b) are close to each other for each time Jt.