Introduction

Phonons, or mechanical vibrations, offer unique advantages for the transfer of quantum information between solid-state quantum systems. They couple to a wide variety of individual qubits1,2,3,4, and are envisioned as universal quantum transducers between different types of qubits5,6,7. Among these, several defect spins in diamond and silicon carbide are not only optically accessible, but also possess long coherence times allowing them to be used as long-lived quantum memories8,9,10. Although control of some of these defect spins with mechanical vibrations has been demonstrated3,11,12,13,14,15,16, obtaining strong spin-phonon coupling remains a challenge due to their low strain susceptibility. Recent experiments have shown the potential of the negatively charged silicon vacancy (SiV) center in diamond to act as a spin qubit that is highly sensitive to strain due to the orbital degeneracy in its ground state17,18, making it particularly suitable for coupling to phonons. Here, we utilize this high strain susceptibility to demonstrate coherent, low-power control and Ramsey interferometry of a single SiV spin using surface acoustic waves (Fig. 1a). Our results establish the electron-phonon coupling of the SiV spin qubit as a promising approach for transduction between single spins and phonons. This would enable the generation of quantum nonlinearities for single phonons, and realization of phonon-mediated hybrid quantum systems with spins.

Fig. 1: Layout of surface acoustic wave (SAW) devices, and structure of the silicon vacancy (SiV) center.

a Schematic of our diamond SAW device. A microwave signal applied to one of the interdigital transducers (IDTs) generates acoustic waves due to the piezoelectric response of aluminum nitride (AlN). SiVs in the diamond are probed using a focused laser beam. (Inset) A scanning electron microscope (SEM) image of a pair of transducers. The scale bar corresponds to \(20\ \mu{\text{m}}\). b Molecular structure of the SiV. c Electronic structure of the SiV under non-zero external magnetic field. The solid red arrow indicates the optical transition used for spin initialization and readout, and the dashed red arrows indicate other optical transitions. The blue arrow indicates the acoustic transition between the two levels of the spin qubit. d Optical fluorescence spectrum of the C-transition of the SiV under resonant optical excitation, showing fine structure. The four peaks correspond to the transitions C1–C4.

Results

Strain response of the SiV

The SiV is a point defect in diamond that has attracted attention due to its excellent optical properties19, long-lived electronic spin qubit10, and ability to maintain these properties inside nanoscale devices20,21. It is formed by a silicon atom located centrally between two adjacent vacant sites in the diamond lattice (Fig. 1b), and hence possesses \({D}_{3d}\) symmetry22. Its electronic structure is composed of a ground state (GS) and an excited state (ES), each split into two, principally by spin-orbit coupling, resulting in four optical transitions. The brightest of these is the transition between the lower branches of the ES and GS, labelled C, which we utilize in this work. In the presence of a magnetic field, the degeneracy of the \({\rm{S}}=1/2\) electronic spin of the SiV is lifted and each energy level further splits into two (Fig. 1c). In particular, the lower ground state splits into the levels \(|{e}_{g+}\downarrow \rangle\) and \(|{e}_{g-}\uparrow \rangle\), which are used as a long-lived spin qubit10. The lifting of the spin degeneracy causes each optical transition to split into four, as shown in Fig. 1d. In the case of transition C, this gives rise to two bright spin-conserving transitions, that we label C2 and C3, and two dim spin-flipping transitions, labelled C1 and C4.

The strain response of the SiV electronic states depends on the orientation of the strain with respect to the symmetry axis of the SiV18. Strain tensor components belonging to the \({A}_{1g}\) representation of the \({D}_{3d}\) symmetry group induce a global shift of the energy levels, while the \({E}_{g}\) representation components mix the orbitals. In the presence of an off-axis magnetic field, the mixing of the spin degree of freedom in the spin-orbit eigenstates allows \({E}_{g}\) strain to introduce a non-zero overlap term for the \(|{e}_{g+}\downarrow \rangle \leftrightarrow |{e}_{g-}\uparrow \rangle\) qubit transition. As a result, the qubit levels inherit the large strain susceptibility resulting from perturbations to the spatial charge distribution of orbitals (\(\sim\! 1\ {\rm{PHz}}/{\rm{strain}}\)). The resulting strain susceptibility for the spin qubit levels can be \(\sim\! 100\ {\rm{THz}}/{\rm{strain}}\) for qubit transition frequencies in the few \({\rm{GHz}}\) range. This is four orders of magnitude larger compared to other defect spins whose acoustic driving has been studied recently3,11,15,16. In these latter cases, the strain susceptibility arises from a perturbation to the much weaker spin-spin interaction within the same orbital or with a far-detuned excited state orbital14,23. The fundamentally different origin of spin-phonon coupling in the SiV enables efficient control of the SiV spin qubit with an oscillating strain field.
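The practical consequence of this difference in strain susceptibility can be illustrated with a rough linear-response estimate. The sketch below assumes that the coupling rate is simply the susceptibility multiplied by the strain amplitude, and that acoustic power scales with the square of the strain. The strain amplitude and the function names are hypothetical and chosen only for illustration; the susceptibilities are the order-of-magnitude figures quoted above.

```python
# Illustrative linear strain-coupling estimate. The strain amplitude used here
# is a hypothetical value picked for the comparison; it is not from the text.

SIV_SUSCEPTIBILITY_HZ = 100e12  # ~100 THz/strain, quoted for the SiV spin qubit
OTHER_SUSCEPTIBILITY_HZ = SIV_SUSCEPTIBILITY_HZ / 1e4  # four orders of magnitude smaller


def coupling_rate_hz(susceptibility_hz_per_strain, strain):
    """Linear-response estimate: coupling rate = susceptibility x strain amplitude."""
    return susceptibility_hz_per_strain * strain


def relative_power_for_equal_rate(chi_a, chi_b):
    """Power needed by defect A relative to defect B for the same coupling rate,
    assuming acoustic power scales as strain squared."""
    return (chi_b / chi_a) ** 2


strain = 1e-7  # hypothetical strain amplitude
print(coupling_rate_hz(SIV_SUSCEPTIBILITY_HZ, strain))    # of order 10 MHz
print(coupling_rate_hz(OTHER_SUSCEPTIBILITY_HZ, strain))  # of order 1 kHz
print(relative_power_for_equal_rate(SIV_SUSCEPTIBILITY_HZ, OTHER_SUSCEPTIBILITY_HZ))
```

Under these assumptions, reaching the same coupling rate with the SiV takes roughly eight orders of magnitude less acoustic power than with a defect whose susceptibility is four orders of magnitude smaller.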

Surface acoustic devices on diamond

Surface acoustic waves (SAWs) are travelling mechanical vibrations confined to solid surfaces, and can be thought of as bound states whose bandstructures lie below the continuum of bulk acoustic waves. They are particularly suitable for interfacing with near-surface atomic-scale defects in solids12,13,15, due to their inherent confinement of acoustic energy to a depth of about one acoustic wavelength (\(3\ \mu {\rm{m}}\) in our case). Our SAW devices (Fig. 1a) are fabricated on single-crystal diamond with the top surface normal to the \([100]\) crystal direction. We generate SiVs at a depth of \(100\pm 18\ {\rm{nm}}\) by implantation of silicon ions. The area density of SiVs is sparse enough that SiVs can be individually addressed optically. Due to the lack of piezoelectricity in diamond, which is required for electrical transduction of SAWs, we deposit a thin layer of piezoelectric aluminum nitride (AlN) on top of the diamond. We fabricate aluminum interdigital transducers (IDTs) on top of the AlN. These are patterns of interlocking metal fingers, with each set of fingers connected to a common terminal. When an alternating voltage is applied to the terminals of the IDT, the piezoelectric response of the AlN generates spatially and temporally periodic deformations on the surface, which propagate as SAWs. We design the shape of our transducers to generate Gaussian SAW beams in order to concentrate acoustic energy by focusing. Fabricating transducers in pairs allows us to electrically measure their total scattering parameters (S-parameters).

We evaluate the acoustic mode profile of the SAWs with finite element simulations (Fig. 2a), confirming that the SAW is localized to a depth of one acoustic wavelength. Electrical S-parameter measurements indicate a center frequency of \(3.37\ {\rm{GHz}}\), with a full width at half maximum (FWHM) amplitude bandwidth of \(126\ {\rm{MHz}}\) that is limited by the number of finger pairs in the transducers (Fig. 2b). The shortest acoustic pulse generated by these devices is \(13\ {\rm{ns}}\) in duration, as measured by time-domain characterization (Supplementary Note 2). We measure the surface amplitude profile of the SAWs using transmission-mode microwave impedance microscopy24. This technique uses a scanning probe to measure the surface electric potential on a piezoelectric material, which is proportional to the SAW amplitude at each point on the surface. The measured electric potential profile indicates that the SAW is focused to about one acoustic wavelength laterally (Fig. 2c), in addition to the localization in depth (Fig. 2a). In combination with the high strain susceptibility of the SiV, this acoustic confinement further reduces the acoustic power required to drive an individual SiV spin.
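As a quick consistency check on the numbers above, the center frequency and the approximate acoustic wavelength together imply a SAW phase velocity, and the bandwidth can be expressed as a fraction of the center frequency. The following is a minimal sketch using only the quoted values; the wavelength is stated only approximately in the text, so the velocity is an estimate, and the function names are illustrative.

```python
# Values quoted in the text; the wavelength is only approximate ("about 3 um").
F_CENTER_HZ = 3.37e9    # transducer center frequency
BANDWIDTH_HZ = 126e6    # FWHM amplitude bandwidth
WAVELENGTH_M = 3e-6     # approximate acoustic wavelength


def phase_velocity_m_per_s(frequency_hz, wavelength_m):
    """Phase velocity from v = f * lambda."""
    return frequency_hz * wavelength_m


def fractional_bandwidth(bandwidth_hz, center_hz):
    """Bandwidth expressed as a fraction of the center frequency."""
    return bandwidth_hz / center_hz


v = phase_velocity_m_per_s(F_CENTER_HZ, WAVELENGTH_M)
fb = fractional_bandwidth(BANDWIDTH_HZ, F_CENTER_HZ)
print(f"implied phase velocity: {v / 1e3:.1f} km/s")   # roughly 10 km/s
print(f"fractional bandwidth: {100 * fb:.1f} %")       # a few percent
```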

Fig. 2: Characterization of SAW transducers.

a Total displacement profile of the acoustic mode in a cross-section of the device, obtained from finite element simulation. The white horizontal line indicates the interface between AlN and diamond. SiVs are located \(100\ {\text{nm}}\) below this line. b Room temperature measurement of electrical S-parameters between the transmitter and receiver IDTs. The \({S}_{11}\) plot is magnified \(100\times\) along the vertical axis. c Microwave impedance microscopic image showing the surface electric potential at the focus of the transducers under continuous-wave excitation. The potential is proportional to the SAW amplitude and demonstrates focusing of the SAW on the surface.

Acoustic driving of the SiV spin

After characterizing our devices at room temperature, we proceed with low-temperature acoustic driving of SiV spins. These experiments are performed at a temperature of \(5.8\ {\rm{K}}\), as measured on the cold stage inside a closed-cycle liquid helium cryostat. A confocal microscope focuses light from a tunable \(737\ {\rm{nm}}\) laser on the sample, which is used to resonantly excite optical transitions of individual SiVs, with pulses generated using an electro-optic intensity modulator. We detect photons emitted in the phonon sideband (PSB) of the SiV emission spectrum. Our SiV spin qubit is defined by the levels \(|{e}_{g+}\downarrow \rangle\) and \(|{e}_{g-}\uparrow \rangle\) (Fig. 1c) (denoted \(\left|\downarrow \right\rangle\) and \(\left|\uparrow \right\rangle\) respectively for simplicity), and its transition frequency is adjusted to lie within the bandwidth of the acoustic transducers (Fig. 2b) by tuning an external magnetic field.

To demonstrate acoustic driving of the SiV spin, we use optically detected acoustic resonance (ODAR). For our experiments, we select an SiV located near the focus of the transducers. An arbitrary waveform generator (AWG) drives one of the transducers to generate acoustic pulses. The acoustic pulses affect the SiV levels via both energy shift (dispersive) and mixing interactions with different components of the applied strain18. We utilize the dispersive interaction to calibrate the timing of the optical and acoustic pulses, taking into account the acoustic velocity (Supplementary Note 3). On the other hand, the mixing strain response resonantly drives population between the qubit states.

The optical and acoustic pulse sequences for ODAR are shown in Fig. 3a. We use optical pulses that are resonant with the spin-flipping C1 optical transition (Fig. 1c) for both spin initialization and readout. A \(150\ {\rm{ns}}\) duration optical pulse initializes the SiV into \(\left|\uparrow \right\rangle\), from the initial thermal mixture of \(\left|\downarrow \right\rangle\) and \(\left|\uparrow \right\rangle\). The corresponding time-resolved histogram of PSB photon detection events shows an initial peak proportional to the thermal population in the \(\left|\downarrow \right\rangle\) state, followed by an exponential decay as population is transferred to \(\left|\uparrow \right\rangle\) via the spin-conserving C3 transition (Fig. 3b). After initialization, a \(20\ {\rm{ns}}\) duration acoustic pulse drives population back into \(\left|\downarrow \right\rangle\). Finally, a \(100\ {\rm{ns}}\) duration optical pulse measures the final population in \(\left|\downarrow \right\rangle\). We use the ratio of the readout signal to the initialization signal as a measure of the final population in the \(\left|\downarrow \right\rangle\) state, normalized to the thermal equilibrium population. By varying the frequency of the acoustic pulse and measuring the corresponding normalized population, we obtain the ODAR spectrum for the SiV spin qubit (Fig. 3c), demonstrating acoustic driving of the SiV spin. The maximum of this spectrum occurs at the transition frequency of the qubit, which is \(3.43\ {\rm{GHz}}\) for our experiment. The ODAR spectrum has an FWHM linewidth of \(48\ {\rm{MHz}}\), which is broadened in part due to the frequency spread of the short \(20\ {\rm{ns}}\) duration acoustic pulse.
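To get a feel for how much of the ODAR linewidth a short pulse alone can account for, the sketch below estimates the Fourier-limited FWHM of an idealized rectangular pulse, whose power spectrum is a squared sinc function. The rectangular shape is an assumption made for illustration; the actual acoustic pulse is further shaped by the transducer bandwidth, and the function names are hypothetical.

```python
import math


def sinc_squared(x):
    """Normalized power spectrum of a rectangular pulse, with x = frequency offset * duration."""
    if x == 0.0:
        return 1.0
    return (math.sin(math.pi * x) / (math.pi * x)) ** 2


def half_power_point(tol=1e-12):
    """Bisection for the x in (0, 1) where sinc_squared(x) = 0.5.
    sinc_squared falls monotonically from 1 to 0 on this interval."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sinc_squared(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rect_pulse_fwhm_hz(duration_s):
    """Full width at half maximum of the power spectrum of a rectangular pulse."""
    return 2.0 * half_power_point() / duration_s


print(f"{rect_pulse_fwhm_hz(20e-9) / 1e6:.1f} MHz")  # roughly 44 MHz for a 20 ns pulse
```

Under this idealization, a \(20\ {\rm{ns}}\) pulse has a spectral width comparable to the measured \(48\ {\rm{MHz}}\) linewidth, which is consistent with the pulse duration being a significant contribution to the broadening.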

Fig. 3: Optically detected acoustic resonance (ODAR) measurements of the SiV spin transition.

a Optical and acoustic pulse sequences used for ODAR measurement. The laser is resonant with the C1 transition and initializes the SiV into \(\left|\uparrow \right\rangle\). A \(20\ {\text{ns}}\) duration SAW pulse drives population between \(\left|\uparrow \right\rangle\) and \(\left|\downarrow \right\rangle\). b Time-resolved histogram of photon detection events corresponding to the pulse sequence in a. The height of the peaks at the beginning of the initialization and readout pulses is proportional to the population in \(\left|\downarrow \right\rangle\). Photon detections are integrated over \(10\ {\text{ns}}\) (indicated by the shaded region) to determine initialization and readout signals. c Normalized population in the \(\left|\downarrow \right\rangle\) state as the center frequency of the acoustic pulse is varied, calculated as the ratio between the readout and initialization signals. A maximum is obtained at \(3.43\ {\text{GHz}}\), indicating the resonance frequency of the SiV spin transition. The error bars represent the standard deviation of the normalized population.

Coherent acoustic control of the SiV spin

After matching the acoustic pulse frequency to the qubit transition frequency measured by ODAR, we vary the duration of the acoustic pulse from \(0\ {\rm{ns}}\) to \(100\ {\rm{ns}}\) (Fig. 4a). As the duration of the acoustic pulse increases, the normalized population in \(\left|\downarrow \right\rangle\) displays Rabi oscillations (Fig. 4b), indicating coherent transfer of population between the \(\left|\downarrow \right\rangle\) and \(\left|\uparrow \right\rangle\) states. We fit an exponentially damped sinusoid (Supplementary Note 4) to the data to estimate the Rabi frequency. Upon varying the peak input microwave power from \(-3\ {\rm{dBm}}\) (\(0.5\ {\rm{mW}}\)) to \(6\ {\rm{dBm}}\) (\(4\ {\rm{mW}}\)), we observe the expected linear increase of Rabi frequency against the square root of the power (Fig. 4b, inset), and infer a Rabi frequency of \(48\ {\rm{MHz}}\) with \(4\ {\rm{mW}}\) of input microwave power. We estimate the on-chip acoustic power from the measurement of the microwave reflection S-parameter \({S}_{11}\) and transmission parameter \({S}_{21}\) (Fig. 2b). At the acoustic frequency of \(3.43\ {\rm{GHz}}\), we measure \(| {S}_{11}{| }^{2}=-0.4\; {\rm{dB}}\) and \(| {S}_{21}{| }^{2}=-31\; {\rm{dB}}\), indicating that the peak on-chip acoustic power is between \(3\;\mu {\rm{W}}\) and \(350\;\mu {\rm{W}}\), which is orders of magnitude lower than previous demonstrations of acoustic control of defect spins15,16. This power is within the typical thermal load limits of dilution refrigerators, and is hence compatible with operation at millikelvin temperatures that enable longer (\(\sim\! 10\ {\rm{ms}}\)) SiV spin coherence times10.
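The power figures in this paragraph can be reproduced with a short calculation. The sketch below converts the quoted dBm values to milliwatts and derives two bounds on the on-chip acoustic power from the S-parameters. The interpretation of the bounds is an assumption made here for illustration: the upper bound supposes that all power not reflected at the input is converted into acoustic waves, and the lower bound takes the power that arrives at the receiving transducer port. The function names are hypothetical.

```python
def dbm_to_mw(power_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (power_dbm / 10)


def db_to_power_ratio(value_db):
    """Convert a power ratio in dB to a linear ratio."""
    return 10 ** (value_db / 10)


def on_chip_power_bounds_uw(input_mw, s11_sq_db, s21_sq_db):
    """Return (lower, upper) bounds on on-chip acoustic power in microwatts.

    Assumed interpretation (not taken verbatim from the text):
      upper = input power that is not reflected, input * (1 - |S11|^2)
      lower = power transmitted to the second transducer, input * |S21|^2
    """
    upper_uw = input_mw * (1 - db_to_power_ratio(s11_sq_db)) * 1e3
    lower_uw = input_mw * db_to_power_ratio(s21_sq_db) * 1e3
    return lower_uw, upper_uw


print(f"{dbm_to_mw(-3):.2f} mW, {dbm_to_mw(6):.2f} mW")  # close to 0.5 mW and 4 mW
lower, upper = on_chip_power_bounds_uw(4.0, -0.4, -31.0)
print(f"on-chip acoustic power between {lower:.1f} uW and {upper:.0f} uW")
```

With \(4\ {\rm{mW}}\) of input power, this reading gives bounds close to the \(3\;\mu {\rm{W}}\) and \(350\;\mu {\rm{W}}\) quoted above.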

Fig. 4: Coherent acoustic control of the SiV spin qubit.

a Pulse sequence used for Rabi oscillation measurements. An acoustic pulse of frequency \(3.43\ {\text{GHz}}\), which is resonant with the SiV spin transition and generated with \(2\ {\text{mW}}\) peak microwave input power, is used to coherently drive the qubit. b Normalized population in the \(\left|\downarrow \right\rangle\) state as the acoustic pulse duration is varied. A fit to a theoretical model is shown in red. (Inset) The dependence of Rabi frequency on the square root of peak microwave input power indicates the expected linear behavior. The errors in the Rabi frequencies are of the order of \(0.1\ {\text{MHz}}\). c Pulse sequence used for Ramsey interferometry measurements. Two \(\pi /2\) acoustic pulses, each detuned by \(50\ {\text{MHz}}\) and generated with \(4\ {\text{mW}}\) peak input power, are separated by a varying free precession time. d Normalized population in the \(\left|\downarrow \right\rangle\) state as the free precession time is varied. A fit to a theoretical model is shown in red. The time constant of the exponential decay of the oscillations gives a spin coherence time \({T}_{2}^{* }=33\pm 5\ {\text{ns}}\). The error bars in b and d represent the standard deviation of the normalized population.

Finally, we use this coherent acoustic control to perform Ramsey interferometry and measure the coherence time of the SiV spin qubit. We set the acoustic pulse detuning to \(50\ {\rm{MHz}}\) from the qubit transition frequency. After estimating the duration of a \(\pi /2\) acoustic pulse, we perform a Ramsey pulse sequence with two \(\pi /2\) acoustic pulses separated by a varying time delay (Fig. 4c). The time constant of the exponentially decaying Ramsey fringes gives a direct measurement of the coherence time of the spin qubit, which is \({T}_{2}^{* }=33\pm 5\ {\rm{ns}}\). At our operating temperature of \(5.8\ {\rm{K}}\), the spin coherence time is limited by decoherence due to \(50\ {\rm{GHz}}\) thermal phonons resonant with the orbital transition in the ground state, which follows an inverse dependence with temperature25,26. The measured spin coherence time agrees with previously reported results, when the temperature dependence is accounted for17,26.
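The shape of the Ramsey signal described here can be sketched as a cosine at the detuning frequency under an exponentially decaying envelope. The model below is a generic illustration and not the fit function used for the data; the contrast, offset, and phase parameters are hypothetical placeholders, while the detuning and \({T}_{2}^{* }\) values are those quoted in the text.

```python
import math


def fringe_period_s(detuning_hz):
    """Ramsey fringes oscillate at the detuning, so the period is 1 / detuning."""
    return 1.0 / detuning_hz


def ramsey_signal(tau_s, detuning_hz, t2_star_s, contrast=1.0, offset=0.0, phase=0.0):
    """Generic Ramsey model: an exponentially decaying cosine in the free precession time.

    contrast, offset and phase are hypothetical parameters for illustration only.
    """
    envelope = math.exp(-tau_s / t2_star_s)
    return offset + contrast * envelope * math.cos(2 * math.pi * detuning_hz * tau_s + phase)


DETUNING_HZ = 50e6   # detuning quoted in the text
T2_STAR_S = 33e-9    # coherence time quoted in the text

print(f"fringe period: {fringe_period_s(DETUNING_HZ) * 1e9:.0f} ns")
for tau_ns in (0, 10, 20, 30, 40):
    value = ramsey_signal(tau_ns * 1e-9, DETUNING_HZ, T2_STAR_S)
    print(f"tau = {tau_ns:2d} ns, signal = {value:+.3f}")
```

With a \(50\ {\rm{MHz}}\) detuning the fringe period is \(20\ {\rm{ns}}\), so only one to two fringes fit within a coherence time of \(33\ {\rm{ns}}\), which is why the choice of detuning matters for resolving the decay envelope.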

Discussion

In conclusion, we demonstrate coherent acoustic control of the SiV spin qubit in diamond using surface acoustic waves, and perform acoustically driven Ramsey interferometry on a single spin. The high strain susceptibility of the SiV spin qubit, arising from the ground-state orbital degeneracy18, allows this to be performed with low acoustic power, making it compatible with operation at millikelvin temperatures which enable long (\(\sim\! 10\ {\rm{ms}}\)) spin coherence times10. It also opens up the possibility of further improving the spin coherence time by using fast acoustic pulses for dynamical decoupling. Finally, this efficient acoustic interface could be utilized to achieve phonon-mediated coupling of the SiV spin with a wide variety of quantum systems. In particular, placing the SiV center within a high quality factor confined mechanical mode would enable strong coupling between the spin qubit and single phonons27. For a diamond mechanical resonator with frequency of a few GHz and mode volume on the order of one cubed acoustic wavelength, strong coupling would be achieved at millikelvin temperatures for a quality factor of \(1{0}^{3}\), or at \(4\ {\rm{K}}\) for a quality factor of \(1{0}^{5}\)18. By making use of cavity optomechanical schemes28, such a hybrid SiV-mechanical resonator system could be used to realize a high cooperativity interface between the SiV spin qubit and photons at telecommunications frequencies. Alternatively, piezoelectric schemes29,30 could be employed to establish an interface with microwave quantum circuits such as superconducting qubits31. Thus our work demonstrates a promising path towards hybrid quantum systems and networks.
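One way to see why the required quality factor depends so strongly on temperature is to compare the thermal phonon occupation of a few-GHz mechanical mode at different temperatures. The sketch below evaluates the Bose-Einstein occupation; the mode frequency is taken equal to the qubit frequency used in this work as a representative "few GHz" value, and the \(20\ {\rm{mK}}\) temperature is a hypothetical example of millikelvin operation. It is an illustration only and does not reproduce the strong-coupling analysis of ref. 18.

```python
import math

PLANCK_J_S = 6.62607015e-34      # Planck constant (exact SI value)
BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant (exact SI value)


def thermal_occupation(frequency_hz, temperature_k):
    """Bose-Einstein mean phonon number of a mode at the given temperature."""
    x = PLANCK_J_S * frequency_hz / (BOLTZMANN_J_PER_K * temperature_k)
    return 1.0 / math.expm1(x)


MODE_FREQUENCY_HZ = 3.43e9  # assumed representative value for a few-GHz mode

for temperature_k in (4.0, 0.02):  # 0.02 K is a hypothetical millikelvin setpoint
    n_th = thermal_occupation(MODE_FREQUENCY_HZ, temperature_k)
    print(f"T = {temperature_k} K: mean thermal phonon number = {n_th:.3g}")
```

At \(4\ {\rm{K}}\) the mode holds on the order of twenty thermal phonons, whereas at millikelvin temperatures it is essentially in its ground state, so thermally induced decoherence of the mechanical mode is far weaker.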

Methods

Device fabrication

We use \([100]\)-cut, electronic grade single-crystal diamond samples synthesized by chemical vapor deposition (CVD) from Element Six Corporation. Silicon ions (\(^{28}{\rm{Si}}^{+}\)) are implanted on the top surface of the diamond at an energy of \(150\; {\rm{keV}}\) and a density of \(10^{10}\; {{\rm{cm}}}^{-2}\), introducing Si atoms over the entire surface at a depth of \(100\pm 18\ {\rm{nm}}\) as determined by a SRIM simulation32. SiV centers are generated by a high-temperature (\(1100\ {}^{\circ }{\rm{C}}\)), high-vacuum annealing procedure followed by a tri-acid clean (1:1:1 sulfuric, perchloric, and nitric acids). A \(1.4\ \mu {\rm{m}}\) aluminum nitride (AlN) layer is deposited on top of the diamond by RF sputtering. SAW devices are fabricated using electron beam lithography followed by evaporation of \(100\ {\rm{nm}}\) of gold (Au) for bonding pads and \(75\ {\rm{nm}}\) of aluminum (Al) for IDTs.

Electrical characterization of SAW devices

These measurements are performed at room temperature. The IDTs are contacted with RF probes, which are connected to a vector network analyzer (Agilent E8364B) for S-parameter measurements.

Microwave impedance microscopy measurements

The IDT is driven by a continuous-wave microwave input signal to generate surface acoustic waves. An atomic force microscopy probe scans over the surface of the device and senses the electric potential at each point24. The measured signal is mixed with the drive signal to coherently detect the relative amplitude and phase of the electric potential of the SAWs.

Low temperature experiments

Low temperature experiments are performed in a closed-cycle liquid helium cryostat (Montana Instruments Cryostation). The sample is mounted on an XYZ piezoelectric nanopositioner stack (Attocube ANPx101 and ANPz101) with a custom-made holder. The holder contains another nanopositioner with a neodymium permanent magnet (K&J Magnetics) positioned behind the sample, which can be moved relative to the sample to adjust the magnetic field. The sample is clamped on top of the holder, using indium foil for good thermal contact. A temperature sensor in the holder estimates the sample temperature to be about \(5.8\ {\rm{K}}\). A pair of IDTs is wire bonded to a PCB on the holder for electrical excitation and readout of acoustic pulses. SiVs in the sample are addressed with a home-built confocal microscope that focuses light from a \(520\ {\rm{nm}}\) laser and a tunable \(737\ {\rm{nm}}\) laser (M-Squared SolSTiS) on the sample. The \(520\ {\rm{nm}}\) laser is periodically switched on to reset the charge state of the SiV, while the \(737\ {\rm{nm}}\) laser is used to resonantly excite optical transitions. The wavelength of the tunable laser is stabilized by using feedback from a wavemeter (HighFinesse WS7). Photons emitted by the SiV in the phonon sideband (PSB) \(> \ 740\ {\rm{nm}}\) are selected by an optical long-wavelength pass filter and sent to an avalanche photodiode (APD, Excelitas) for counting.

Pulsed measurements of SiVs

Laser pulses at \(737\ {\rm{nm}}\) are generated with an electro-optic intensity modulator (EOM, iXblue NIR-MX800). The EOM is biased at its half-wave voltage (\({V}_{\pi }\)) and stabilized against temporal drift by using feedback with a lock-in amplifier (SRS SR830). A delay generator (SRS DG645) is used to generate the voltage pulses sent to the EOM. Acoustic pulses are produced by exciting the SAW transducers with short microwave pulses, which are generated with an arbitrary waveform generator (AWG, Tektronix AWG70001A) and amplified with an RF amplifier (Pasternack PE15A3008). The AWG clock is synchronized to the delay generator to prevent clock skew. Optical and acoustic pulses are temporally aligned by adjusting the delay of the waveform on the AWG.