Fusing the single-excitation subspace with $\mathbb{C}^{2^n}$

There is tremendous interest in developing practical applications for noisy intermediate-scale quantum processors without the overhead required by full error correction. Near-term quantum information processing is especially challenging within the standard gate model, as algorithms quickly lose fidelity as the problem size and circuit depth grow. This has led to a number of non-gate-model approaches such as analog quantum simulation and quantum annealing. These come with specific hardware requirements that differ from those of a universal gate-based quantum computer. We have previously proposed an approach called the single-excitation subspace (SES) method, which uses a complete graph of superconducting qubits with tunable coupling. Without error correction the SES method is not scalable, but it offers several algorithmic components with constant depth, which is highly desirable for near-term use. The challenge of the SES method is that it requires a physical qubit for every basis state in the computer's Hilbert space. This imposes exponentially large resource costs for algorithms using registers of ancillary qubits, as each ancilla would double the required graph size. Here we show how to circumvent this doubling by leaving the SES and fusing it with a multi-ancilla Hilbert space. Specifically, we implement the tensor product of an SES register holding "data" with one or more ancilla qubits, which are able to independently control arbitrary $n \times n$ unitary operations on the data in a constant number of steps.
This enables a hybrid form of quantum computation where fast SES operations are performed on the data, traditional logic gates and measurements are performed on the ancillas, and controlled-unitaries act between. As example applications, we give ancilla-assisted SES implementations of quantum phase estimation and the quantum linear system solver of Harrow, Hassidim, and Lloyd.

www.nature.com/scientificreports/

[…] to the data. Crucially, the number of steps required to perform a set of $n'$ controlled-unitaries is independent of $n$ and only linear in $n'$ (as the controlled-unitaries are performed serially).
To better understand the tensor product structure, consider adding a single superconducting qubit to an existing SES array, resulting in a complete graph of $n+1$ qubits; see Fig. 1a. There are two distinct ways of doing this, which we call direct sum and tensor product. The direct sum means that the number of excitations remains unity and the dimension of the computational subspace is increased by one, resulting in an $(n+1)$-qubit SES register. Adding $n'$ qubits in this way increases the size of the register to $n+n'$ qubits. Equivalently, we have added an $n'$-qubit SES register to the original $n$-qubit register. If we denote the computational subspace of an $n$-qubit SES register by $\mathbb{C}^n$, the direct sum implements

$$\mathbb{C}^n \oplus \mathbb{C}^{n'} = \mathbb{C}^{n+n'}. \tag{1}$$

The tensor product option comes from the standard gate model of quantum computation, where each physical qubit contributes a two-dimensional complex Hilbert space $\mathbb{C}^2$, and the computational Hilbert space of $n$ qubits is the tensor product $\mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \cdots \otimes \mathbb{C}^2 = \mathbb{C}^{2^n}$. In this paper we implement a tensor product of the form

$$\mathbb{C}^n \otimes \mathbb{C}^2, \tag{2}$$

which adds an excitation and is equivalent to $\mathbb{C}^{2n}$. The added qubit can be used as an ancilla to control the application of arbitrary unitaries to the data $|\psi\rangle$ stored in $\mathbb{C}^n$, enabling transformations from product states to arbitrary states of the form

[…]

Here $|0\rangle_{n+1}$ and $|1\rangle_{n+1}$ are states of qubit $n+1$, the ancilla (not a register of $n+1$ qubits). Adding $n'$ ancillas in this manner implements

$$\mathbb{C}^n \otimes \underbrace{\mathbb{C}^2 \otimes \cdots \otimes \mathbb{C}^2}_{n'} = \mathbb{C}^n \otimes \mathbb{C}^{2^{n'}}. \tag{5}$$

An example is given in Fig. 1b. Because the computational subspace (5) is exponential in $n'$, larger problem sizes become possible. We propose this extension of the SES method as a possible route to practical quantum computation with NISQ technology.
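The resource argument can be made concrete with a short counting sketch. The Python below only restates the arithmetic in the text: a pure SES register needs one physical qubit per basis state, so emulating $n'$ ancillas inside the SES doubles the graph for each ancilla, whereas the fused register needs $n + n'$ qubits. The function names are hypothetical and chosen for illustration.

```python
def ses_only_qubits(n: int, n_anc: int) -> int:
    """Pure SES: one physical qubit per basis state of C^n (x) C^(2^n_anc)."""
    return n * 2 ** n_anc


def fused_qubits(n: int, n_anc: int) -> int:
    """Fused register: n SES data qubits plus one physical qubit per ancilla."""
    return n + n_anc


def fused_dimension(n: int, n_anc: int) -> int:
    """Dimension of the computational space C^n (x) C^(2^n_anc)."""
    return n * 2 ** n_anc


if __name__ == "__main__":
    n = 4  # data register dimension (number of SES qubits)
    for n_anc in range(5):
        print(
            f"n'={n_anc}: dimension={fused_dimension(n, n_anc)}, "
            f"SES-only qubits={ses_only_qubits(n, n_anc)}, "
            f"fused qubits={fused_qubits(n, n_anc)}"
        )
```

For a single ancilla this reproduces the comparison of $2n$ qubits for the SES-only approach against $n + 1$ for the fused register.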

Controlled-unitary protocol
SES computer chip. The hardware required for ancilla-assisted SES computation is identical to that described in Geller et al. 6, i.e., a complete graph of superconducting transmon 8 or Xmon 9 qubits with tunable frequencies and tunable $\sigma^x \otimes \sigma^x$ couplings 10. The device Hamiltonian is

[…] (6)

with $\epsilon_i$ and $g_{ii'}$ tunable. Here $g_{ii'}$ is a real, symmetric matrix with vanishing diagonal elements. A possible chip layout is shown in Fig. 2.

SES method basics.
In the SES method without ancillas 6,7, computations are performed in the $n$-dimensional subspace spanned by the basis states

$$|i) := |0 \cdots 1_i \cdots 0\rangle, \quad i \in \{1, 2, \ldots, n\}. \tag{7}$$

A pure state in the SES has the form

$$|\psi\rangle = \sum_{i=1}^{n} a_i \, |i), \qquad \sum_{i=1}^{n} |a_i|^2 = 1. \tag{8}$$

The advantage of working in the SES is that the matrix elements of (6) can be directly controlled. We can therefore directly program the Hamiltonian of the computer chip.

The protocol for implementing a specific operation depends on the functionality (available ranges of the $\epsilon_i$ and $g_{ii'}$) of the SES chip. In this paper we assume that the experimentally controlled SES Hamiltonian can be written, apart from an additive constant, in the standard form

$$H = g_{\max} K. \tag{10}$$

Here $g_{\max}$ is the maximum interaction strength provided by the coupler circuits. A reasonable value for $g_{\max}/h$ is 10-50 MHz.

Figure 2. Possible layout for an SES chip. On the left, the circles represent qubits, the horizontal lines are wires, and the colored vertical lines are couplers. This is shown in more detail on the right for $n = 5$. Each horizontal circuit is a superconducting qubit with capacitance $C$, tunable junction inductance $L_j$, and $n-1$ additional coils (each with self-inductance $L_0$ and mutual inductance $m$) for coupling to other qubits. Brown dotted lines indicate dc and microwave control lines for each qubit, as well as readout circuits. Each colored coupler wire contains a Josephson junction with inductance $L_c$ tuned by a magnetic flux. Control lines for some off-diagonal SES Hamiltonian matrix elements are also indicated in brown.

The basic single-step operation in SES quantum computing is the application of a symmetric unitary of the form $e^{-iA}$ to the data, where $A$ is a given real symmetric matrix. If only the unitary $e^{-iA}$ is known, the classical overhead for obtaining $A$ from $e^{-iA}$ is to be included in the quantum runtime. (Note that the generator $A$ is not unique because the matrix logarithm is not unique.) Define

$$\theta_A := \max_{i,i'} \left| A_{ii'} - c\,\delta_{ii'} \right|, \tag{11}$$

where $c = (\min_i A_{ii} + \max_i A_{ii})/2$. The optimal SES Hamiltonian $H$ to implement $e^{-iA}$ up to a phase is given by the standard form (10), with

$$K := \frac{A - cI}{\theta_A}. \tag{12}$$

Here $I$ is the $n \times n$ identity, and the matrix elements of (12) satisfy $|K_{ii'}| \le 1$. The associated evolution time is

$$t_A = \frac{\hbar\,\theta_A}{g_{\max}}. \tag{13}$$

Additional discussion of these results is provided in Geller et al. 6 and Katabarwa and Geller 7.
In an experimental implementation, then, the operation $e^{-iA}$ results from evolution under the Hamiltonian $H_{\rm qc}$ with $\epsilon_i = \epsilon_0 + g_{\max} K_{ii}$, where $\epsilon_0$ is a fixed qubit idling frequency, and $g_{ii'} = g_{\max} K_{ii'}$ ($i \neq i'$), for a time duration $t_A$. It is not even necessary for $H$ to be abruptly switched on and off: any SES Hamiltonian of the form $H = g(t)K$ such that $\int (g/\hbar)\, dt = \theta_A$ may be used.
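The programming step can be illustrated numerically. The following Python sketch computes a shift $c$, an angle $\theta_A$, and a matrix $K$ for a random real symmetric $A$, then checks that evolving under $H = g_{\max} K$ for a time $\hbar\theta_A/g_{\max}$ reproduces $e^{-iA}$ up to the phase $e^{ic}$. It assumes that $\theta_A$ is the largest-magnitude entry of $A - cI$ (the smallest value that keeps $|K_{ii'}| \le 1$), works in units with $\hbar = 1$, and uses hypothetical function names; it is a sketch, not the calibration procedure of an actual device.

```python
import numpy as np

HBAR = 1.0  # assumption: units with hbar = 1


def ses_program(A, g_max):
    """Return (K, c, theta, t) such that exp(-i g_max K t / hbar) = e^{ic} exp(-iA)."""
    A = np.asarray(A, dtype=float)
    if not np.allclose(A, A.T):
        raise ValueError("A must be real symmetric")
    diag = np.diag(A)
    c = 0.5 * (diag.min() + diag.max())   # centers the diagonal around zero
    shifted = A - c * np.eye(A.shape[0])
    theta = np.abs(shifted).max()          # assumed definition of theta_A
    K = shifted / theta                    # all entries satisfy |K| <= 1
    t = HBAR * theta / g_max               # evolution time at full coupling
    return K, c, theta, t


def exp_minus_i(M):
    """exp(-i M) for a real symmetric matrix M, via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.exp(-1j * w)) @ V.T


rng = np.random.default_rng(0)
B = rng.normal(size=(4, 4))
A = (B + B.T) / 2                          # random real symmetric generator
g_max = 2.0                                # arbitrary illustrative value

K, c, theta, t = ses_program(A, g_max)
U_chip = exp_minus_i(g_max * K * t / HBAR)  # evolution under H = g_max K
U_target = exp_minus_i(A)

print("max |K| =", np.abs(K).max())
print("matches up to phase:", np.allclose(U_chip, np.exp(1j * c) * U_target))
```

Subtracting $cI$ only changes the global phase, which is why the shift can be chosen freely to reduce $\theta_A$ and hence the evolution time.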
Single-hole states. The idea underlying the controlled-unitary protocol is to use the non-SES states

$$|\bar{i}) := |1 \cdots 0_i \cdots 1\rangle, \quad i \in \{1, 2, \ldots, n\}, \tag{14}$$

which have $n-1$ excitations and which are particle-hole dual to the SES basis states. The dual state $|\bar{i})$ has a single hole (absence of excitation) in qubit $i$. In a graph with $g_{ii'} = 0$, the basis state $|i)$ is an eigenstate with energy $\epsilon_i$, whereas $|\bar{i})$ has energy $E_n - \epsilon_i$, where

$$E_n := \sum_{i=1}^{n} \epsilon_i \tag{15}$$

is the energy of the filled "band" $|11 \cdots 1\rangle$ of $n$ excitations. Therefore, apart from a shift $E_n$, the dual states have negative energies; the resulting minus sign is the key to the protocol.
Description of the protocol. First we discuss the use of a single ancilla. The objective is to implement the controlled-unitary

$$U \otimes |0\rangle\langle 0| + I \otimes |1\rangle\langle 1|, \tag{16}$$

where $U$ is an arbitrary $n \times n$ unitary matrix acting on the SES register, and $I$ is the $n \times n$ identity. (This definition differs from the usual one by NOT gates on the ancilla; we assume that the additional NOT gates are included in the complete protocol.) Partition an $(n+1)$-qubit complete graph into an $n$-qubit SES register and one ancilla. The initial state is of the form

[…]

or

[…] (18)

Next write the unitary in (16) in spectral form as $V e^{-iD} V^\dagger$, or equivalently

$$U = V e^{-iD/2}\, e^{-iD/2}\, V^\dagger, \tag{19}$$

where $V$ is unitary and $D$ is a real diagonal matrix. We will make $U$ conditional by implementing

$$V e^{-iD/2}\, e^{+iD/2}\, V^\dagger \tag{20}$$

instead of (19), where the plus sign comes from the negative energy of the single-hole states and results in an application of the identity.
The first three steps of the protocol are to implement the $V^\dagger$ operation in (20) on the data stored in the SES register. The KAK decomposition 7 is used to write $V$ and $V^\dagger$ as

[…]

where $A$ and $B$ are real symmetric matrices. The procedure for computing $A$ and $B$ is given in Katabarwa and Geller 7. Each operator produced by the KAK decomposition is a symmetric unitary and can be implemented in a single step (see "SES method basics"). The first operation in the protocol, $e^{iA}$, results from evolution under $H_{\rm qc}$ with $\epsilon_i = \epsilon_0 + g_{\max} K_{ii}$ ($\epsilon_0$ is a fixed qubit idling frequency) and

$$g_{ii'} = g_{\max} K_{ii'} \quad (i \neq i').$$

The indices $i$ and $i'$ in these expressions include the SES partition $\{1, \ldots, n\}$ only, and during this operation all couplings to the ancilla are turned off ($g_{i,n+1} = 0$ for all $i \in \{1, \ldots, n\}$). These settings program the SES Hamiltonian $H = g_{\max} K$ into the chip. The ancilla qubit frequency $\epsilon_{n+1}$ is set to $\epsilon_0$. The protocol implements the symmetric unitary $e^{iA}$ up to a phase, with that phase chosen to minimize the operation time.

After these steps (18) becomes

[…]

where, for any unitary $W$ acting on the SES, we write $\sum_{i'} (i|W|i')\, a_{i'}$ as $(Wa)_i$. Note that the ancilla acquired a relative phase after these operations; we assume that such phases are removed by applying $z$ rotations to the ancilla or by working in a rotating frame. The next steps apply $e^{\pm iD/2}$ conditioned on the ancilla, the sign change resulting from the negative energies of the dual states. After CNOT gates between the ancilla (control) and each of the $n$ SES qubits (targets), we have

[…] (23)

In "Multi-target CNOT" we show how to implement these $n$ CNOT gates simultaneously. Then follow the protocol as if to implement the diagonal operator $e^{-iD/2}$ in the SES: apply $H_{\rm qc}$ with $g_{ii'} = 0$ and $\epsilon_i = \epsilon_0 + g_{\max} K_{ii}$ for a time $t_D = \hbar\theta_D / 2 g_{\max}$. Here $K := (D - cI)/\theta_D$, $\theta_D := \max_i |D_{ii} - c|$, and $c = (\min_i D_{ii} + \max_i D_{ii})/2$. Set the ancilla frequency to $\epsilon_0$.
Following this operation and another ancilla $z$ rotation, (23) becomes

[…]

After a second set of CNOT gates and subsequent application of $e^{-iD/2}$ and $V$ to the SES, and a final ancilla $z$ rotation, we obtain

[…]

or

[…]

as required. We represent this operation, including implicit NOT gates on the ancilla before and after transformation (16), by the circuit diagram of Fig. 3.
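The conditional part of the protocol can be checked with a small state-vector simulation. The Python sketch below takes $n = 3$ data qubits plus one ancilla and sets $V = I$, so that only the steps CNOT, diagonal evolution, ancilla $z$ rotation, CNOT, diagonal evolution are exercised. Several simplifications are assumptions of this sketch and not part of the protocol as stated: the diagonal step is modeled as evolution under $\sum_i (d_i/2)\, n_i$ with $n_i$ the excitation number of data qubit $i$, the idling frequency $\epsilon_0$ and the shift $c$ are dropped because they only contribute phases removed by ancilla $z$ rotations, and all helper names are hypothetical.

```python
import numpy as np

n = 3                              # data qubits 0..n-1; the ancilla is the last qubit
dim = 2 ** (n + 1)
DATA_MASK = ((1 << n) - 1) << 1    # bits 1..n hold the data qubits, bit 0 the ancilla


def ses_index(i, anc):
    """Basis index of |i) (x) |anc>: a single excitation in data qubit i."""
    return (1 << (n - i)) | anc


def multi_cnot(state):
    """Flip every data qubit on the components where the ancilla is 1."""
    out = np.empty_like(state)
    for idx in range(dim):
        out[idx ^ DATA_MASK if idx & 1 else idx] = state[idx]
    return out


def diagonal_step(state, d):
    """Apply exp(-i sum_i (d_i / 2) n_i) on the data qubits."""
    out = state.copy()
    for idx in range(dim):
        energy = sum(d[i] for i in range(n) if idx & (1 << (n - i)))
        out[idx] *= np.exp(-0.5j * energy)
    return out


def ancilla_z(state, phi):
    """Multiply the ancilla-|1> components by exp(i phi)."""
    out = state.copy()
    out[1::2] *= np.exp(1j * phi)
    return out


rng = np.random.default_rng(1)
a = rng.normal(size=n) + 1j * rng.normal(size=n)
a /= np.linalg.norm(a)                     # SES amplitudes a_i
alpha, beta = 0.6, 0.8                     # ancilla amplitudes
d = rng.uniform(-np.pi, np.pi, size=n)     # diagonal of D

state = np.zeros(dim, dtype=complex)
for i in range(n):
    state[ses_index(i, 0)] = alpha * a[i]
    state[ses_index(i, 1)] = beta * a[i]

state = multi_cnot(state)                  # ancilla-1 branch -> single-hole states
state = diagonal_step(state, d)            # e^{-iD/2} on SES, e^{+iD/2} on holes (up to a phase)
state = ancilla_z(state, d.sum() / 2)      # remove the filled-band phase e^{-i sum(d)/2}
state = multi_cnot(state)                  # back to the SES
state = diagonal_step(state, d)            # unconditional e^{-iD/2}

expected = np.zeros(dim, dtype=complex)
for i in range(n):
    expected[ses_index(i, 0)] = alpha * np.exp(-1j * d[i]) * a[i]  # e^{-iD} applied
    expected[ses_index(i, 1)] = beta * a[i]                        # identity applied

print("controlled e^{-iD} reproduced:", np.allclose(state, expected))
```

The hole states pick up the phase $e^{-i(\sum_j d_j - d_i)/2}$, so after the ancilla $z$ rotation the ancilla-$|1\rangle$ branch carries $e^{+i d_i/2}$, which the final unconditional step cancels.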
We have described the use of a single ancilla qubit. The total number of steps required to implement the controlled unitary (about 10) is independent of $n$. Additional ancillas can be included by increasing the graph size by one for each new ancilla. Each ancilla independently controls unitaries acting on the shared data register. These unitaries, however, cannot be performed simultaneously, so in most applications the runtime will scale linearly with the number of ancillas $n'$.
As an example, in Fig. 4 we consider a quantum circuit that implements $m$ controlled unitaries $U_1, U_2, \ldots, U_m$ in succession, each controlled by a single ancilla. This circuit applies the operator

[…]

to the computational Hilbert space, where $x_j$ is the $j$th bit in the binary representation for $x$,

[…]

Multi-target CNOT. The protocol of the last section requires two rounds of CNOT gates applied between the ancilla (control) and the $n$ qubits in the SES partition (targets). For small $n$ these can be done serially using the high-fidelity entangling gates developed for standard gate-based superconducting quantum computation. The CNOT gates commute, however, and in principle can be performed simultaneously. Multi-target CNOT gate protocols [11-18] have been developed for ion trap, cavity QED, and circuit QED architectures, where many qubits can be coupled to a common cavity or other bosonic mode. These protocols can be applied to the complete graph architecture as well, but at the expense of supplementing each ancilla with an additional resonator or qubit, which must also be fully connected. The desired tensor product would then require $n + 2n'$ qubits. We avoid this overhead by designing a fast multi-target CNOT gate specifically for the complete graph.

It is well known that the entangling gate

$$e^{-i\frac{\pi}{4}\sigma^x \otimes \sigma^x} \tag{29}$$

is locally equivalent to a two-qubit CNOT gate, meaning that it is a CNOT apart from single-qubit rotations. To see this, let the second qubit be the control, and apply Hadamards to obtain $e^{-i\frac{\pi}{4}\sigma^x \otimes \sigma^z}$. This operator acts with $e^{-i\frac{\pi}{4}\sigma^x}$ on the target when the control is $|0\rangle$, and with $e^{i\frac{\pi}{4}\sigma^x}$ when the control is $|1\rangle$, from which it is straightforward to construct a CNOT.
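The local equivalence can be verified directly. The Python sketch below builds the two-qubit entangler $e^{-i\frac{\pi}{4}\sigma^x\otimes\sigma^x}$ with the target as the first tensor factor and the control as the second, conjugates it with Hadamards on the control, and then appends one particular choice of single-qubit corrections (an $e^{+i\frac{\pi}{4}\sigma^x}$ rotation on the target and an $S^\dagger$ phase gate on the control) to recover a CNOT exactly. This choice of corrections is one possibility worked out for illustration and is not claimed to be the sequence used in the paper.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.diag([1, -1]).astype(complex)
H = (X + Z) / np.sqrt(2)                      # Hadamard gate


def exp_pauli(theta, P):
    """exp(-i theta P) for any operator P with P @ P = identity."""
    return np.cos(theta) * np.eye(P.shape[0]) - 1j * np.sin(theta) * P


# Tensor order throughout: target (first factor) (x) control (second factor).
entangler = exp_pauli(np.pi / 4, np.kron(X, X))

# Hadamards on the control turn sigma_x (x) sigma_x into sigma_x (x) sigma_z.
had_control = np.kron(I2, H)
xz_gate = had_control @ entangler @ had_control

# Control |0>: e^{-i pi/4 X} on the target; control |1>: e^{+i pi/4 X}.
undo_target = np.kron(exp_pauli(-np.pi / 4, X), I2)   # e^{+i pi/4 X} on the target
s_dagger_control = np.kron(I2, np.diag([1, -1j]))     # removes the factor i on control |1>
cnot_built = s_dagger_control @ undo_target @ xz_gate

# Reference CNOT with the second qubit as control.
P0, P1 = np.diag([1, 0]), np.diag([0, 1])
cnot = np.kron(I2, P0) + np.kron(X, P1)

print("Hadamard conjugation:", np.allclose(xz_gate, exp_pauli(np.pi / 4, np.kron(X, Z))))
print("CNOT recovered:", np.allclose(cnot_built, cnot))
```

On the control-$|1\rangle$ branch the two target rotations combine to $e^{i\frac{\pi}{2}\sigma^x} = i\sigma^x$, and the $S^\dagger$ gate cancels the factor $i$, so no global phase remains.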
It is not surprising that a simultaneous multi-target CNOT gate is possible in the complete graph architecture, and the Hamiltonian (6) already contains an interaction underlying such an operation: set the couplings between the ancilla and the $n$ qubits in the SES partition to a positive constant $g$ and all others to zero, as illustrated in Fig. 5. The interaction

[…] (30)

couples the ancilla qubit to a collective variable of the SES partition. Such an interaction, on its own, can be used to generate the desired multi-qubit entangling operation

[…] (32)

that generalizes (29). The multi-target CNOT gate […] The Hamiltonian (6), however, contains single-qubit terms that do not commute with (30). Therefore it will be necessary to follow a modified protocol to obtain the entangler (32): add a $\sigma^x$ microwave drive to the ancilla and transform to the usual rotating frame, where the $\sigma^x \otimes \sigma^x$ interaction becomes $\frac{1}{2}(\sigma^x \otimes \sigma^x + \sigma^y \otimes \sigma^y)$, and then transform to a second rotating frame where the interaction is $\frac{1}{2}\sigma^x \otimes \sigma^x$. This is discussed further in the "Supplementary Information".
Finally, we discuss the expected performance of this design when implemented in a transmon-based chip with inductive couplers. The main source of error is leakage into higher-lying $|2\rangle$ states neglected in (6) but present in a real device. Although the current design has not been optimized to minimize this leakage, the estimated performance is already satisfactory for initial demonstrations, as indicated in Table 1.

Applications
Phase estimation. Energy estimation via quantum phase estimation is a natural application of the SES method and has been discussed in detail in Geller et al. 6, where it was explained that the Hamiltonian simulation component can be done exactly without Trotter errors. This is because one can directly program the individual matrix elements of the Hamiltonian. But that implementation requires $2N$ qubits to simulate an $N \times N$ Hamiltonian. Using the controlled-unitary protocol introduced here, however, allows the same algorithm to be implemented with a complete graph of $N + 1$ qubits. The main ingredients of the implementation, including adiabatic state preparation, have already been discussed 6 and will not be repeated here.

Matrix inversion.
As a second application of SES fusion we give an ancilla-assisted implementation of the quantum linear system solver of Harrow, Hassidim, and Lloyd 19 . We do not expect an SES chip running this implementation to outperform a classical supercomputer. We choose this algorithm because it requires a large register of ancilla qubits, which is challenging, and because it has interesting generalizations and applications to machine learning.
The matrix inversion algorithm 19,20 solves the linear system $Ax = b$ for $x$, accepting $b$ in the form of a normalized pure state $|b\rangle$, and returning the solution in the form of a pure state $|x\rangle$. In the SES implementation these states are stored in a data register of Hilbert space dimension $n$. (Note that in our notation $A$ is $n \times n$, not $2^n \times 2^n$.) A second register of $m$ qubits is used for the phase estimation subroutine, and one more is used for postselection. The value of $m$ determines the accuracy of the solution. The SES implementation (for symmetric $A$) requires a complete graph of $n + m + 1$ qubits. The circuit for $m = 2$ is given in Fig. 6. The central (blue) subcircuit implements the controlled-rotation operation

[…] (35)

Here $|k\rangle$ is a computational basis state of the $m$-qubit ancilla register. The rotation angles $\theta_0, \ldots, \theta_3$ in Fig. 6 are determined by finding the net $y$ rotation applied to the last qubit in each of the cases $|k\rangle \in \{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}$, making use of the identity $\sigma^x R_y(\theta)\, \sigma^x = R_y(-\theta)$, and comparing the result with (35), rewritten as

[…]

This leads to

[…] (37)

with the $\gamma_k$ given in (35). The matrix in (37), after multiplication by $2^{-m/2}$, is orthogonal and hence immediately inverted, yielding the $\theta_k$.

Figure 5. Graph to implement the $n$-target CNOT gate.

Table 1. Performance of simultaneous $n$-target CNOT gate in complete graph of $n+1$ transmon or Xmon qubits, using realistic models for the qubits and couplers. Here $\eta$ is the qubit anharmonicity, $t_{\rm gate}$ is the gate time excluding the single-qubit rotations in (34), […] is the microwave Rabi frequency, and $g$ is the coupler strength. The reported gate error is $E_{\rm gate} = 1 - |\langle\psi| U^\dagger_{\rm ideal} U |\psi\rangle|^2$, where $U_{\rm ideal}$ is the ideal entangler (32), and $U$ is the realized evolution operator computed in the absence of decoherence. The error is averaged over initial states $|\psi\rangle$. The qubit frequencies are $\epsilon_0/h = 5.5$ GHz.
Figure 6. […] Here $H$ is the Hadamard gate, $U = e^{iAt_0/2^m}$, the vertical line connecting crosses is a SWAP gate, $r = |0\rangle\langle 0| + i|1\rangle\langle 1|$ is a $z$ rotation, and $R_y = e^{-i(\theta/2)\sigma^y}$ is a $y$ rotation. The small (yellow) subcircuits are Fourier transforms and the central (blue) subcircuit implements the controlled ancilla rotation (35).

The $m = 2$ phase estimation is not sufficiently accurate for matrix inversion, typically leading to 5-15% algorithm errors for matrix sizes $2 \le n \le 4$. By algorithm error we mean

[…]

where $\rho_{\rm data}$ is the final state of the data register, traced over the $m + 1$ ancillas, and $|x_{\rm ideal}\rangle$ is the pure state corresponding to the exact solution of the given linear system. The circuit of Fig. 6 can be easily extended to larger $m$, however, and the performance for $m = 3$ is already sufficient for an initial demonstration. The controlled-rotation subcircuit for $m > 2$ can be obtained from the "uniformly controlled" rotation operator construction of Möttönen et al. 21, which requires $2^m$ CNOT gates (and is therefore not useful for large $m$). Simulating the $m = 3$ circuit we find that real symmetric matrices up to dimension 10 can be inverted with algorithm errors less than 5%, as shown in Fig. 7, a considerable increase in problem size over the existing gate-based realizations [22-26].

Conclusions
In this paper we have introduced a non-gate-model form of NISQ computation that extends the reach of the standard SES method 6,7 by allowing the use of ancillas as control qubits, without doubling the graph size for each ancilla. This should make it possible to apply the SES method to larger problem sizes. We propose this method as a route to practical quantum computation with NISQ technology.
While it is interesting to discuss the SES method and other NISQ approaches in terms of their runtime complexity, this can be misleading because they are not scalable. For example, the SES method allows one to implement arbitrary $n \times n$ unitaries in constant time 7, which even a fault-tolerant quantum computer cannot do efficiently. However, this ignores the fact that decoherence and other errors will limit the largest problem sizes that can be reliably implemented. Still, the constant runtime does suggest that the method could be a good starting point for developing near-term applications with a quantum advantage.