Synthesizing multi-phonon quantum superposition states using flux-mediated three-body interactions with superconducting qubits

Massive mechanical resonators operating at the quantum scale can enable a large variety of applications in quantum technologies, as well as fundamental tests of quantum theory. Of crucial importance in that direction, is both their integrability into state-of-the-art quantum platforms as well as the ability to prepare them in generic quantum states using well-controlled high-fidelity operations. Here, we propose a scheme for controlling a radio-frequency mechanical resonator at the quantum scale using two superconducting transmon qubits that can be integrated on the same chip. Specifically, we consider two qubits coupled via a capacitor in parallel to a superconducting quantum interference device (SQUID), which has a suspended mechanical beam embedded in one of its arms. Following a theoretical analysis of the quantum system, we find that this configuration, in combination with an in-plane magnetic field, can give rise to a tuneable three-body interaction in the single-photon strong-coupling regime, while enabling suppression of the stray qubit-qubit coupling. Using state-of-the-art parameters and qubit operations at single-excitation levels, we numerically demonstrate the possibility of ground-state cooling as well as high-fidelity preparation of mechanical quantum states and qubit-phonon entanglement, i.e. states having negative Wigner functions and obeying non-classical correlations. Our work significantly extends the quantum control toolbox of radio-frequency mechanical resonators and may serve as a promising architecture for integrating such mechanical elements with transmon-based quantum processors.


INTRODUCTION
The ability to control massive mechanical objects at the quantum level constitutes a very interesting task for many technological applications, ranging from microwave-to-optical conversion to quantum memories, as well as fundamental studies regarding the quantumclassical divide [1][2][3][4][5][6][7][8][9][10]. The rapid development of cavity optomechanics over the last decade has enabled the exploration of mechanical resonators in regimes where quantum effects become prominent. One approach relies on resonant coupling of acoustic phonons to microwave excitations via piezoelectric materials [11,12], however, the amplitude of the lattice vibrations in these systems is very small, limiting their applications. On the other hand, in typical opto-and electromechanical setups, lowfrequency mechanical resonators are controlled via parametric coupling to an optical or microwave cavity [13][14][15][16][17]. The mechanical elements in these systems are usually realised with metal drumheads or beams, which are characterised by large quality factors and large mass, making them also prime candidates for experimental tests of gravity-induced wavefunction collapse theories [1][2][3][4].
Following the pioneering work on acoustic resonators [11], recently, quantum superpositions of the ground and first excited state were for the first time generated in a parametrically coupled mechanical resonator [18]. In this approach, a superconducting qubit is used to create the excitation, which decays into a microwave resonator on a different chip and subsequently transferred to the mechanical element via an effective linearised interaction. It is recognised, however, that this method has severe limitations as it relies on strong driving, which is challenging with qubits, and suffers from unavoidable losses during the state transfer between different chips, limiting the fidelity of the prepared state [19]. A different scheme, implemented in the optical domain, uses entanglement and post-selective measurements to generate single-photon states [20], although the non-deterministic nature of the protocol in combination with low count rates limits the types of states that can be prepared.
A promising route towards high-fidelity mechanical quantum control is the ability to operate in the singlephoton strong-coupling regime, where interaction times are faster than dissipation processes, which however still remains an experimental challenge for far-detuned parametrically coupled mechanical resonators [19,21]. Operating in this regime is predicted to give rise to non-classical photon correlations [22] and non-Gaussian states [23,24], as well as macroscopic mechanical superpositions [25]. Moreover, using the radiation-pressure coupling to qubits can lead to the creation of mechanical Schrödinger cat states [26,27]. Generating Fock states in this regime could also be enabled using an additional microwave resonator to create an effective tripartite coupling, as predicted in Ref. [27], although this approach is limited to low state preparation fidelities mainly due to the limitations of capacitive coupling and stray qubitresonator interaction. In spite of the experimental and theoretical advances in the field, high-fidelity quantum state preparation of mechanical systems appears to be limited to a small class of engineerable states which are, to a large degree, architecture-dependent.
Here, we analyse a new scheme for synthesizing generic mechanical states by employing tuneable three-body interactions between two superconducting qubits and a mechanical resonator in the single-photon strong-coupling regime. The coupling relies on embedding a suspended micrometer-long beam in one of the arms of a superconducting quantum interference device (SQUID) in combination with an externally applied magnetic field [28][29][30][31]. We find that, by connecting two superconducting transmon qubits [32] directly via this mechanical SQUID, a tuneable three-body interaction arises as the qubit-qubit flux-mediated coupling is modulated by the mechanical displacement. Importantly, the detrimental exchangetype interaction between the two qubits can be suppressed by adding a capacitor in parallel to the SQUID, as realised in Ref. [33]. Using state-of-the-art parameters, reported in recent experiments [31], we numerically demonstrate the possibility of high-fidelity coherent quantum control of the beam, from ground-state cooling to fast and high-fidelity preparation of mechanical Fock states, as well as maximally entangled states of qubits and phonons. Finally, we devise a protocol consisting of qubit flux-pulsing and post-selective measurements for synthesizing multi-phonon superposition states, extending the quantum control toolbox and the plurality of engineerable quantum states in radio-frequency mechanical resonators.

Motion-dependent qubit-qubit interaction
The proposed circuit, shown in Fig. 1a, comprises two transmon qubits coupled directly via a superconducting quantum interference device (SQUID) shunted by a capacitor, C c . This tuneable coupling scheme has recently been realised in circuit QED setups using transmons [33] and LC resonators [34]. The coupling is controlled by tuning the Josephson energy of the SQUID, is the magnetic flux quantum and E c J,Σ is the sum of the two junction Josephson energies in the SQUID. The mechanical part of the circuit consists of a beam of length l that is embedded in one of the arms of the SQUID loop and can oscillate out of plane. By applying an in-plane external magnetic field B, the loop can pick up an additional flux β 0 BlX due to the beam displacement X, resulting in a flux-tuneable and motion-dependent Josephson energy E c J = E c J,Σ | cos(πΦ b /Φ 0 + αX)|, where α = πβ 0 Bl/Φ 0 and β 0 is a geometric factor depending on the mode shape (for the first mechanical mode, considered here, β 0 1) [30,31,35].
Tripartite coupling architecture. a. Circuit diagram of the electromechanical system comprising two transmon qubits directly coupled via a SQUID coupler with an embedded beam that can oscillate out of plane. Tuning the coupler to its filter frequency, where linear currents through the capacitor and the SQUID cancel out, and applying an in-plane magnetic field B, results in a dominant tripartite coupling as the beam oscillations modulate the qubit-qubit interaction. b. Flux-mediated couplings as a function of flux bias Φ b for in-plane magnetic field B = 10 mT and circuit parameters used in this work. The red curve represents the tripartite coupling strength, while the solid/dashed blue curves correspond to the radiation-pressure coupling of the beam with each qubit.
The Hamiltonian describing the circuit in Fig. 1a is where {X, P } and {φ i , Q i } are conjugate variable pairs describing the mechanical and the electrical degrees of freedom at circuit node i, and φ 0 = Φ 0 /(2π) is the reduced flux quantum. The first four terms describe the uncoupled system of the mechanical resonator and two transmon qubits, where m, ω m denote the mass and frequency of the beam and E Ji , C i represent the Josephson energy and loaded capacitance of each transmon, respectively (see Supplementary Sec. S1). The last two terms describe the qubit-qubit interaction via their charge and flux degrees of freedom, Q i and φ i . The core of this proposal relies on the fact that the dynamical displacement of the beam results in a modulation of the superconducting current through the coupling SQUID and its Josephson energy, which mediates the qubit-qubit interaction. Taking into account the finite asymmetry of the SQUID loop, which is present in any realistic scenario, this results in a motion-dependent Josephson energy, πΦ b /Φ 0 αX and relies on the assumption that αX 1 (for a full derivation see Supplementary Sec. S1). For the parameters considered here, which are compatible with values reported in recent experiments using micrometer-long Al beams and sub-Tesla magnetic fields [31], we have αX ∼ 10 −5 − 10 −6 , therefore this is a valid assumption.

Electromechanical system dynamics
The Hamiltonian of the system can be expressed in second quantisation form, aŝ are bosonic operators describing the annihilation (creation) of phonons and qubit excitations, respectively (see Supplementary Sec. S2). The effective electromechanical frequencies are ω m and where E Ji is the modified transmon Josephson energy due to the coupler. The full quantum mechanical treatment of the circuit, including higher-order nonlinear interaction terms, is presented in the Supplementary Material. The first term in Eq. (5) describes a three-body interaction involving hopping of qubit excitations together with mechanical displacements of the beam. The coupling strength is given by where Z i = e 2 E Ci /2 E Ji denote the transmon impedances, and X ZPF = /(2mω m ) is the zero-point motion of the mechanical resonator. The next interaction term describes the radiation-pressure coupling of each qubit with the beam at a rate g 1(2) = g Z 1(2) /Z 2(1) . The electromechanical coupling strengths are plotted in Fig. 1b as a function of the out-of-plane flux bias Φ b . Close to a half-integer flux quantum, E c J (X) is maximally susceptible to mechanical motion which maximises the electromechanical coupling. Note that all couplings become zero exactly at a half-integer flux quantum due to the finite SQUID asymmetry a J . In our calculations an asymmetry of 0.01 is included, reflecting a 2% spread in junction fabrication targeting.
The last term in Eq. (5) describes the qubit-qubit interaction where J C , J L are exchange-type coupling strengths arising from the coupling capacitor and SQUID, respectively, and V is the cross-Kerr coupling strength, which is minimised at Φ b Φ 0 /2 (see Supplementary Eq. S(33)). A significant advantage of this architecture for realising tripartite interactions, compared to relying exclusively on capacitive coupling [27], is that any stray linear coupling between the qubits can be suppressed with the right choice of coupling capacitance C c [33,34]. This makes the three-body interaction dominant and ensures the ability to manipulate the state of each qubit individually by local driving, which is crucial for the state engineering protocols discussed below.

Ground-state cooling
Mechanical resonators realised using vibrating beams and drumheads lie in the radio-frequency regime (∼ 10 MHz), where thermal fluctuations are dominant even at millikelvin temperatures, achieved with conventional cryogenic techniques. An essential element of control is, therefore, the ability to cool these systems to their quantum ground state before manipulating them further. Typically, electromechanical experiments employ a "cold" microwave cavity (∼ 10 GHz), which is coupled to the mechanical element via an effective linearised interaction in the many-photon regime, and cooling is enabled by red-sideband driving [14]. However, in a system comprised of qubits, cooling the mechanical resonator via sideband driving can be a challenging task, requiring multiple tones and eventually limited by the critical number of photons in the Josephson junction [36,37].
Here, we show that it is possible to overcome the challenges of sideband cooling with qubits by employing a time-domain protocol to cool the mechanical resonator to its quantum ground state, using the three-body interaction. The scheme, depicted in Fig. 2a, consists of a sequence of single-qubit operations which, combined with the tripartite interaction, enable the transfer of the thermal phonons to the environment in a stroboscopic fashion, as described below. At first, we bring one qubit (q 1 ) to its excited state and then tune its frequency such that ω 1 = ω 2 −ω m . Since g ω m the interaction (ĉ 1bĉ † 2 +H.c.) is resonant at this condition, such that a phonon combined with the excitation in q 1 can be transferred to the other qubit (q 2 ) after variable time ∆t cool . The cycle is then completed by reinitialising both qubits using active reset protocols [38][39][40]. A similar scheme has also been realised recently for detecting microwave photons in a   ...
Ground-state cooling. a. Schematic of the time-domain protocol to cool the mechanical resonator to its ground state using the three-body interaction. In each cycle, qubit 1 is excited with a microwave pulse, then its frequency is tuned at ω1 = ω2 − ωm for a variable time ∆t cool followed by a reset on both qubits. b. Numerical results after ∼ 100 cycles demonstrating cooling to a 0.05 phonon occupancy for a 10 MHz resonator at T = 10 mK (n th 20) using the system parameters presented in Table I. superconducting resonator, using tripartite interactions with a transmon qubit and a dissipative mode [41].
In Fig. 2b we plot the average number of phonons and qubit excitations as a function of time after ∼ 100 cooling cycles, for a mechanical beam oscillating at ω m /(2π) = 10 MHz and a tripartite coupling strength g/(2π) 0.3 MHz. At the end of the protocol the mechanical resonator is cooled down to the ground state with a phonon occupancy of 0.05, assuming an environment temperature of T = 10 mK (n th 20). The qubit (ω 1 /(2π) = 7 GHz) is excited with a 200 ns Gaussian pulse, while the reset and cooling times are set to ∆t reset = ∆t cool = 200 ns. The cooling time would be best optimised by choosing a short time for large thermal occupation and then increase the time as the resonator cools, since phonons swap faster for higher occupations. For simplicity, here, we considered a fixed time and found that for these parameters, a 200 ns time is sufficient. Furthermore, we have assumed an in-plane magnetic field B = 10 mT which is well-below the critical field for thin Al beams [42] and does not compromise the qubit coherence [43]. All system parameters used in the simulations are listed in Table I.

Mechanical Fock states and qubit-phonon entanglement
Following ground-state preparation, we present a protocol which employs the tripartite coupling to deterministically generate mechanical Fock states and maximally entangled states. As schematically depicted in the inset of Fig. 3a, it consists of preparing q 2 in the excited state and tuning it to ω 2 = ω 1 + ω m , such that the interaction (ĉ † 1b †ĉ 2 + H.c.) is resonant. Ideally, assuming 30 µs Qm 10 6 unitary evolution U (t) under the tripartite interaction, the state would evolve as The evolution of the average number of phonons and qubit excitations, starting from an attainable mechanical state of 0.05 thermal phonons and including system dissipation (see Methods) is plotted in Fig. 3a. We consider the full interaction Hamiltonian presented in Eq. (5), including next-to-leading order non-linear correction (see Supplementary Sec. S2), for the simulation parameters shown in Table I. In Fig. 3b we plot the Wigner function of the final state in the mechanical resonator, after time t = π/(2g), revealing a single-phonon Fock state with 97% (99%) fidelity, starting from an attainable (ideal) ground state. Higher phonon states could also be prepared by resetting and repeating the protocol with modified transfer times t n = n −1/2 m π/(2g). The quantum state preparation scheme described above, can also be used to generate bipartite and tripartite maximally entangled states between the mechanical resonator and the qubits. In particular, in the middle of the above protocol, at t = π/(4g), the system is in a Greenberger-Horne-Zeilinger (GHZ) state with 98 % fidelity, with the corresponding density matrix shown in Fig. 3c. Such states are particularly interesting for applications in quantum information [44,45] and fundamental tests of quantum theory [46]. Using the same protocol for q 2 in a superposition state 1 √ 2 (|0 2 + |1 2 ), the Bell state is generated after time π/(2g) with 96% fidelity (98% for ideal ground state), as depicted in Fig. 3d. The prepared Protocol for Fock-state preparation and maximally entangled states of phonons and qubit excitations. a. Time evolution after exciting qubit 2 and tuning into the operating point ω2 = ω1 + ωm (schematically presented in the inset) such that the interaction (ĉ1bĉ † 2 +H.c) is resonant. Starting from a phonon occupancy of 0.05, a mechanical Fock state is generated with 97% fidelity after time t = π/(2g). b. Wigner function of the resulting mechanical state. c. Density matrix of a Greenberger-Horne-Zeilinger state that is generated in the middle of the protocol, at t = π/(4g). The notation (010m12) is used in labelling and only matrix elements with a magnitude larger than 0.005 are shown. d. Resulting density matrix after repeating the same protocol with qubit 2 initialised in the superposition state 1 √ 2 (|02 + |12 ), leading to a Bell state with 96% fidelity.
state is a maximally entangled pair of a phonon and a qubit excitation, which could be utilised as a testbed for checking the validity of quantum mechanics at macroscopic scales without requiring tomography of the mechanical state [47,48]. Such states might also be suitable for integrating transmon qubits into other platforms such as spins, cold atoms, or even optical photons, via the mechanical resonator [49][50][51][52]. They could also provide possibilities for entangling the mechanical resonator with other physical systems via the transmon.

Multi-phonon quantum superpositions
We now extend the protocol described above to create multi-phonon quantum superposition states in the mechanical resonator, simply by flux-pulsing the qubits. In the protocols discussed previously, the qubit frequencies are tuned at ω 2 = ω 1 + ω m such that the states |0 1 n m 1 2 and |1 1 (n + 1) m 0 2 are coupled. However, when tuned at ω 2 = ω 1 −ω m the interaction term (ĉ 1b †ĉ † 2 +H.c.) becomes resonant, which couples |0 1 (n + 1) m 1 2 and |1 1 n m 0 2 . Therefore, by interchanging the qubit frequencies with flux-tuning pulses during each cycle it could be possible to create higher phonon Fock states and multi-phonon quantum superposition states, as depicted schematically in Fig. 4a.
As a proof-of-concept, using the same simulation parameters as above (Table I), we demonstrate the creation of superposition states |ψ m = 1 2 (|0 + √ 2|2 + |4 ) and |ψ m = 1 √ 2 (|0 + |4 ), after exciting qubit 2 and applying three flux pulses that interchange the qubit frequencies at variable times t 1 , t 2 and t 3 . Figs. 4b, d show the evolution of the qubit and resonator occupancy, starting from an attainable mechanical state of 0.05 phonons. The dashed lines indicate the times that a flux-tuning pulse is applied. The corresponding Wigner functions at the end of each protocol, following post-selection on |0 1 1 2 , are shown in Figs. 4c, e, with preparation fidelities 98 % and 97 %, respectively. After preparation, the states evolve naturally as U (t)|ψ m = (α|0 + βe −inωmt |n + γe −imωmt |m ) including dissipation, which is however not a limiting factor because of the long lifetimes of these mechanical resonators [31]. Readout of the prepared states including Wigner tomography could be performed using similar techniques to the ones developed in Ref. [53].
The scheme described above enables the generation of interesting classes of multi-phonon superposition states, such as the ones shown in Fig. 4, requiring only fluxtuning pulses and a projective measurement at the end. As we show in Supplementary Sec. S3, by including a projective measurement after each step of the protocol, it is possible to generate states with arbitrary phonon number probability distributions, although constrained in the relative phases of the superpositions. Furthermore, we find that quantum superpositions with arbitrary complex coefficients can also be generated with this platform by additionally employing the qubit-qubit interaction in a controllable fashion to perform exchange-type and C-Phase gates between the two qubits (see Supplementary Sec. S3). This would enable the creation of truly arbitrary states, similar to those produced in resonantly coupled qubit-resonator systems [53, 54], with the trade-off of increased complexity in the protocol. Alternatively, using the radiation-pressure coupling with one qubit in combination with a sequence of driving pulses, could enable the creation of mechanical Schrödinger cat states as discussed in Ref. [27].

DISCUSSION
A reconciliation of quantum mechanics and general relativity remains elusive at a theoretical level, however, there exist several proposals for testing the quantumclassical boundary with mechanical resonators offering an ideal testbed. More specifically, it has been theorised . . that a massive object in quantum superposition results in two coexisting space-time geometries, leading to issues with the unitary evolution, which eventually causes it to collapse [1, 3]. Importantly, this relies on the zeropoint motion X ZPF being much larger than the approximate size of the nucleus (∼ 1 fm), which is the case in our system (X ZPF = 33 fm). The collapse timescale t G is inversely proportional to the mass of the object, resulting in t G ∼ 1 − 10 s for the parameters considered here (m ∼ 1 pg), therefore the resonator coherence time should be larger than that. Recent advances in strain engineering techniques can enable the enhancement of beam quality factors up to Q m ∼ 10 9 [55], leading to relaxation times of hundreds of seconds, which would be sufficient for observing gravitational effects. Moreover, the ability to prepare a large variety of superposition states could offer an additional tool in testing such theories. The proposed architecture provides a very versatile platform in this regard, enabling not only generic quantum state preparation, but also with high fidelity, which has so far been a very challenging task.
Our approach combines the best of both worlds of two very versatile systems, namely the exquisite level of quantum control of qubits in circuit quantum electrodynamics [56], with the long lifetimes and flexibility of mechanical elements in coupling to electromagnetic radiation. The high-fidelity generation of hybrid entangled states of phonons and qubit excitations, which have no classical analogue, may provide alternative routes for testing the limits of quantum theory at macroscopic scales [46][47][48]. Additionally, such states are of particular importance in enabling quantum technologies with hybrid quantum systems, from quantum simulation to quantum computing and communication [57], and could also be used for coupling qubit excitations with other systems such as optical photons, cold atoms, or spin systems [49][50][51][52].
Furthermore, we have tested the robustness of our proposal against several imperfections that may occur in a realistic experimental scenario (see Supplementary Sec. S4). The most important limitation would be the presence of a considerable amount of flux noise, resulting in stray qubit-qubit coupling; for example, adding a fluctuation of δΦ b = 1 − 10 µΦ 0 results in 0.1 − 1 MHz added qubit-qubit coupling J, respectively. We find that for J < 1 MHz the fidelity of the cooling and quantum state preparation protocols is not compromised (Supplementary Fig. S4), therefore δΦ b < 10 µΦ 0 is required, which is compatible with observations in similar devices [58, 59]. We note that despite the steep slope of the tripartite coupling g versus flux bias for Φ b /Φ 0 > 0.495 (Fig. 1b), the tripartite coupling never changes by more than 1% for δΦ b < 10 µΦ 0 . Another possible experimental limitation is the deviation from the target qubit frequencies due to imperfect flux tuning pulses. We have studied the effect of this imperfection and find that targeting the qubit frequencies within 100 kHz is sufficient for highfidelity quantum state preparation (see Supplementary  Fig. S5). Additionally, we have studied the robustness of the protocol against qubit coherence and we find that high-fidelity quantum state preparation can be obtained for relaxation and dephasing times T 1, 2 10 µs, which are typical in the superconducting qubit community and compatible with 10 mT magnetic fields [43].
In conclusion, we have analysed a hybrid circuit architecture featuring strong and tuneable flux-mediated electromechanical interactions between a mechanical resonator and two superconducting transmon qubits. Using state-of-the-art parameters, we find that the coupled system can operate in the single-photon strong-coupling regime, which has been a long-standing goal in the field of optomechanics. We have proposed and numerically demonstrated several protocols for achieving groundstate cooling and preparing multi-phonon quantum superposition states as well as hybrid entanglement with high fidelities, which has been a tremendous challenge so far. Moreover, the proposed schemes for quantum manipulation are applicable to a wider range of tripartite quantum systems where a lower frequency mode, that is not directly accessible, is within the tuning range of the two other controllable modes. Our work significantly extends the quantum control toolbox of parametrically coupled radio-frequency mechanical resonators and provides a versatile on-chip interface with transmon-based processors, offering rich opportunities for technological applications as well as fundamental tests of quantum mechanics.

Numerical modelling
We model the dynamical evolution of the system, including environmental dissipation, with the Lindblad master equatioṅ which is numerically solved using QuTiP [60]. Here, L[ô]ρ . = (2ôρô † −ô †ô ρ − ρô †ô )/2 are superoperators describing each dissipation process, and n th = 1/[exp( ω m /(k B T )) − 1] is the thermal phonon number at temperature T . More specifically, we consider qubit decay and dephasing times T 1 = T 2 = 30 µs, which are consistent with measured values in a similar tuneable coupling transmon architecture [33]. The coupling of the mechanical mode to the environment is determined by γ m = ω m /Q, where the quality factor Q = 10 6 is chosen in agreement with experimental observations in recently fabricated SQUID-embedded beams [31]. For completeness, we additionally include O(φ 4 X) terms in the interaction Hamiltonian (see Supplementary Sec S2), which nevertheless cause insubstantial corrections to the system dynamics. We model the mechanical resonator using forty levels and each transmon as a three-level system (including an anharmonicity of E Ci /h 320 MHz). The same parameters, shown in Table I,

LAGRANGIAN-HAMILTONIAN DESCRIPTION OF THE CIRCUIT
The Lagrangian describing the electromechanical system in Fig.1(a) is where X, φ i are variables representing the beam displacement and the flux on circuit node i, respectively, and φ 0 = /2e is the reduced flux quantum. C i , C c and E Ji , E c J denote the capacitances and Josephson energies of each transmon and the coupler, respectively, and m, ω 0 are the mass and frequency of the beam. Following a Legendre transformation H = i φ i Q i − L we obtain the system Hamiltonian where P = ∂L ∂Ẋ is the mechanical conjugate momentum and Q i = ∂L ∂φi are the electrical conjugate momenta representing charges on each circuit node. C 2 = C 2 + C1Cc C1+Cc and C 1 = C 1 + C2Cc C2+Cc denote the modified transmon capacitances due to the coupling capacitance C c .

Motion-dependent flux-tuneable Josephson energy
If the two junctions of the SQUID are identical, the total Josephson energy is E c J,Σ is the sum of the Josephson energies of each junction. In the presence of an in-plane magnetic field, B, the SQUID loop can pick up an additional flux, β 0 BlX, due the beam displacement, where l is the length of the suspended arm and β 0 is a geometric constant associated with the mode shape of the beam (for the first mechanical mode β 0 ∼ 1) [31,35]. Therefore, the flux-and motion-dependent Josephson energy reads where α = πβ 0 Bl/Φ 0 and Φ 0 = h/2e is the magnetic flux quantum. Using basic trigonometry and expanding to lowest order in X, assuming πΦ b /Φ 0 αX, we find The above approximation holds for αX ZPF 1, which is valid for the beams considered in similar experiments and small magnetic fields (B < 1 T) [31].
In the above, the case of a symmetric SQUID E c J,1 = E c J,2 is considered, however in realistic devices a finite asymmetry needs to be taken into account. The Josephson energy is therefore more accurately described by where a J = |(E c J,1 − E c J,2 )/(E c J,1 + E c J,2 )| is the SQUID asymmetry [32]. For πΦ b /Φ 0 αX, we have Substituting into Eq. (S16), yields which can further be simplified, assuming αX In addition, the last term in Eq. (S13) leads to corrections in the bare subsystem Hamiltonians. More specifically, the inductive energy term results in an effective qubit Josephson energy while the potential energy term leads to a displaced rest position The latter does not affect the dynamics and can be absorbed in a redefinition of X → (X − X 0 ).

CIRCUIT QUANTISATION
We now switch to a quantum mechanical description of the circuit, promoting all canonical variables to quantum operatorsX Z i = e 2 E Ci /2 E Ji denotes the impedance of each transmon, where E Ci = e 2 2 Ci is its charging energy. Since the qubits are in the transmon regime [32], E Ji E Ci , the uncoupled electromechanical system is well-described bŷ with qubit frequencies

Tripartite coupling
Following Eq. (S22) we can now write the Hamiltonian operator describing the three-body interaction where, in the second step, we have made a rotating-wave approximation (RWA) to neglect fast rotating termsĉ 1ĉ2b ( †) , which do not contribute to the dynamics since ω 1 , ω 2 ω m , g. The tripartite coupling strength is given by

Radiation-pressure couplings
The radiation-pressure interaction between each qubit and the beam, in Eq. (S23), can be expressed in second quantisation form asĤ following a RWA. The radiation-pressure coupling strengths are given by

Qubit-qubit couplings
Following circuit quantisation and a RWA, the linear qubit-qubit interaction Hamiltonian in Eq. (S24) becomes The qubits also couple via their charge degrees of freedom (Q 1 Q 2 term in Eq. (S13)), which results in the same type of linear interaction with coupling strength Combining these together leads to an overall exchange-type interaction which can be suppressed at the desired operating point (Φ b ) with the right choice of coupling capacitor C c . The nonlinear interaction between the two transmons in Eq. (S25) is given bŷ The first term, following a RWA, results in a cross-Kerr interaction, Vĉ † 1ĉ 1ĉ † 2ĉ 2 , with coupling strength Additionally the same term yields a pair-hopping interaction V 4 (ĉ † 1ĉ † 1ĉ 2ĉ2 + H.c.), which however is not contributing to the dynamics for single-excitation levels. Finally, the other two terms in Eq. (S43) result in correlated hopping which contribute to the linear coupling in Eq. (S42) as J → J + J n1 + J n2 .

Higher-order tripartite interactions
The next-to-leading order electromechanical interactions are given bŷ with nonlinear coupling strengths and g φ1φ 3 The last two terms also lead to a small correction of the tripartite coupling strength g → g + 3g φ 3 1 φ2x + 3g φ1φ 3 2 x . These terms, although weaker, are not negligible (since they are of the same order or larger than dissipation rates) and they are include in simulating the system dynamics.
The O[φ 2 X 2 ] terms in Eq. (S27) can be written as, For the parameters used in this work (αX ZPF ∼ 10 −6 ) the coupling is negligible and does not affect the system dynamics (the same holds for the even weaker O[φ 4 X 2 ] terms). Interestingly, however, for i = j we find a dispersive (cross-Kerr) interactionĉ † iĉ ib †b of each qubit with the resonator, which is active even when the qubits are far detuned and could potentially be employed for phonon-sensitive measurements of the mechanical state [19].

States with arbitrary complex coefficients
In the main text we demonstrated a protocol for creating multi-phonon quantum superposition states by alternating the qubit frequencies such that different parts of the tripartite interaction become resonant. By controlling the interaction times and post-selecting on the qubit state at the end of the protocol, we showed the possibility of creating interesting classes of superposition states with high fidelity. However, this protocol alone is not sufficient for generating mechanical superposition states |ψ = N n=0 c n |n m with arbitrary complex coefficients c n , since there is no phase degree of freedom to control in the protocol. This is due to the fact that the coupling constant g is not an adjustable complex number and in practical implementations the interaction time is the only parameter that can be varied.
However, there is another tuning knob that can be employed by taking advantage of the qubit-qubit exchange-type interaction J (ĉ † 1ĉ 2 +ĉ 1ĉ † 2 ). Detuning the qubits by ∆ = |ω 1 − ω 2 | ω m , while tuning the coupling SQUID such that there is a finite exchange-type coupling strength J ≥ ∆, results in a resonant qubit-qubit interaction that couples |0 1 1 2 and |1 1 0 2 . For example, starting from |0 1 1 2 one would end up with state [cos(Jt)|0 1 1 2 − i sin(Jt)|1 1 0 2 ] after time t (at t = π/(2J) this realises a SWAP gate). Furthermore, by detuning the qubit frequencies such that ∆ > J it   is also possible to introduce a relative phase between the qubit states |0 1 1 2 and |1 1 0 2 (C-Phase gate). Combining this gate with the resonant qubit-qubit interaction, acting on |0 1 1 2 would result in We note that our system, is analogous to the one studied theoretically by Law and Eberly [54] and experimentally by Hofheinz et al. [53]. The system studied in these references considers a resonator that is controllably coupled to a qubit (with local qubit driving) via a resonant exchange-type interaction (Jaynes-Cummings). It is shown that an arbitrary resonator state can be generated by interleaving the Jaynes-Cummings evolution with qubit driving. The two-qubit states |0 1 1 2 and |1 1 0 2 in our system can be mapped to the single qubit basis |0 and |1 considered in Law and Eberly. Furthermore, the tripartite interaction can be mapped to the qubit-resonator Jaynes-Cummings interaction, where in our system we additionally have the possibility of realising the equivalent of a counter-rotating Jaynes-Cummings interaction by exchanging the qubit frequencies. The equivalent of qubit driving can then be performed by controlling the evolution under the exchange qubit-qubit interaction U J described in Eq. (S52).
We will consider the case where an arbitrary state |ψ m = N n=0 a n |n m is generated following post-selection from the arbitrary entangled state |ψ = N −1 n=0 (a n |0 1 n m 1 2 + b n |1 1 n m 0 2 ) + a N |0 1 N m 1 2 . That is, if we can create the above state |ψ with arbitrary complex coefficients, it is then straightforward to collapse it to |ψ m following postselection on the |0 1 1 2 qubit state. Inspired by Refs. [53,54], let us now consider the problem of generating the arbitrary state |ψ from the inverse point of view, i.e. by proving that it is always possible to empty it. Suppose we start from |ψ = N −1 n=0 (a n |0 1 n m 1 2 + b n |1 1 n m 0 2 ) + a N |0 1 N m 1 2 (Fig. S5a). Then by applying the (inverse of the) C-Phase gate and the tripartite interactionÛ † (t) = e i(ĉ † 1bĉ 2+H.c)t for phase θ A and time t A , the probability amplitude of state |0 1 N m 1 2 becomes By appropriately choosing θ A and t A , it is possible to make the above probability amplitude zero, such that all phonons occupying the state |0 1 N m 1 2 are transferred to |1 1 (N − 1) m 0 2 (see Fig. S5b). The step above also results in incomplete transfer of phonons between states |0 1 n m 1 2 and |1 1 (n − 1) m 0 2 (for n < N ), after which we are left with the state |ψ = N −1 n=0 (a n |0 1 n m 1 2 + b n |1 1 n m 0 2 ). Now by combining the (inverse of the) C-Phase gate and resonant qubit-qubit interaction U † J (t J , θ J )|ψ the probability amplitude of state which can be made zero by appropriately choosing θ J and t J such that all phonons occupying |1 1 (N − 1) m 0 2 are transferred to |0 1 (N − 1) m 1 2 (see Fig. S5b). By applying the above two steps N times it is possible to completely empty the initial state, leading the system to |0 1 0 m 1 2 (Fig. S5c). Therefore, reversing the problem, it is possible to generate any arbitrary mechanical state |ψ m = N n=0 a n |n m by creating the arbitrary entangled state (a n |0 1 n m 1 2 + b n |1 1 n m 0 2 ) + a N |0 1 N m 1 2 , and post-selecting on |0 1 1 2 .
States with arbitrary phonon number probability distributions The protocol presented in the previous section can enable the creation of any mechanical quantum state with arbitrary coefficients. Although this protocol is experimentally feasible, it can become complex as it requires a lot of additional tuning to realise the C-Phase and exchange-type gates between the qubits. Here we describe an alternative protocol that only employs the tripartite interactions, therefore requiring only alternating between the qubit frequencies, and post-selective measurements at each step. Reducing the complexity of tuning pulses comes at the cost of not being able to create states with arbitrary complex coefficients, although it is possible to generate states with arbitrary phonon number probability distributions as we see below.
The protocol relies on initially preparing the two qubits in an arbitrary entangled state (Fig.S6a). Assuming the resonator is in the ground state, the tripartite system is initially described by the following wavefunction where α, β are complex numbers that can be chosen arbitrarily and are related by |α| 2 + |β| 2 = 1. As described in the previous section, one can prepare this state by activating the exchange-type qubit-qubit interaction U J (t, θ). This can be done by detuning the qubits sufficiently such that their frequency difference is much greater than the mechanical frequency, and at the same time smaller than the direct qubit-qubit coupling, i.e. J ≥ |ω 1 − ω 2 | ω m , which can be adjusted by changing the flux Φ b on the coupling SQUID.
The addition of two more variables t 3 , t 4 enables the creation of states with arbitrary phonon number probability distribution up to |4 .
Suppose we start from an arbitrary mechanical state |ψ = N −2 n=0 c n |n m |0 1 1 2 , which can be created by applying the above protocol (N/2 − 1) times. Then, following another step we have 1 j=N/2PÛ where the new coefficients are determined by the previous ones according to the following relations: Therefore, applying this protocol N/2 times can lead to the creation of states with arbitrary phonon number probability distribution up to |N .  S7. Effect of qubit coherence on quantum state preparation. Evolution of the fidelity of the prepared state to the ideal state, cos(gt)|010m12 − i sin(gt)|111m02 , using the protocol presented in Fig. 3, for different values of relaxation and pure dephasing times (assuming T1 = T2). The dashed curves correspond to e −t/T 1 for each case.
Although in the simulations we have considered realistic parameters taken from recent experiments, we realise that in an experimental scenario system parameters such as the qubit coherence may significantly deviate from the ones considered in Table 1. Although bad qubit coherence may not affect the success of the cooling protocol, it can pose limits on the quantum state preparation protocols. We have therefore examined the dependence of the fidelity of the prepared quantum state in Fig. 3 to the ideal evolution |ψ ideal = U (t)|0 1 0 m 1 2 = cos(gt)|0 1 0 m 1 2 − i sin(gt)|1 1 1 m 0 2 on the qubit coherence. In Fig. S7 we plot the evolution of the fidelity of the prepared density matrix ρ to the ideal one ρ ideal = |ψ ideal ψ ideal |, for different values of T 1 (assuming T 2 = T 1 ). The dashed curves correspond to e −t/T1 . We find that high-fidelity state preparation (> 90 %) can be achieved with qubit coherence times of T 1 , T 2 ∼ 10 µs which are standard in the superconducting qubit community and achievable in the presence of 10 mT magnetic fields [43].
One effect that is not considered in the main text is that of fluctuations on the flux bias channel δΦ b . These can occur as a result of environmental magnetic field noise or noise of the current source used for biasing the flux line. Given the   stability of our current sources and assuming flux noise levels reported in similar devices [58, 59], we estimate such fluctuations to be around δΦ b /Φ 0 ∼ 10 −6 during the course of the preparation protocols. The most important effect of flux fluctuations would be to introduce a stray qubit-qubit exchange-type coupling as J L ∝ E c J,Σ cos(π(Φ b + δΦ b )/Φ 0 ). More specifically, adding a fluctuation of δΦ b /Φ 0 ∼ 10 −6 would result in J L → J L + 100 kHz, while δΦ b /Φ 0 ∼ 10 −5 and δΦ b /Φ 0 ∼ 10 −4 translate to additional coupling of 1 and 10 MHz, respectively. The latter would be detrimental for the state preparation protocols as it is of the order of the mechanical frequency. Note that the effect of flux noise on the qubit-qubit coupling is also amplified by our choice of a large coupling Josephson energy amplitude E c J,Σ /h = 200 GHz. In Fig. S8 we plot the effect of a finite qubit-qubit coupling on the cooling and quantum state preparation protocols. We find that for J < 1 MHz (therefore δΦ b < 10 µΦ 0 ) it is possible to maintain high-fidelity quantum state preparation. Note that despite the sharp dependence of the tripartite coupling on the flux bias (Fig. 1), this change is less than 0.1 % in the case of δΦ b < 10 µΦ 0 . Fidelity ± 100kHz ± 300kHz ± 500kHz ± 1MHz FIG. S9. Effect of imperfect flux pulsing on quantum state preparation. Evolution of the fidelity of the prepared state to the ideal state using the protocol presented in Fig.3, for different variations of the qubit-qubit detuning ∆. The black dashed curve corresponds to no variation, i.e. ∆ = ωm.
Additionally, in Fig. S9 we examine the effect of imperfect qubit flux pulsing on the fidelity of the prepared quantum state. We find that the fidelity of the protocol is sensitive to this parameter and that targeting the qubit frequencies within ∼ 100 kHz is required for high-fidelity quantum state preparation.