Selective addressing of solid-state spins at the nanoscale via magnetic resonance frequency encoding

The nitrogen vacancy centre in diamond is a leading platform for nanoscale sensing and imaging, as well as quantum information processing in the solid state. To date, individual control of two nitrogen vacancy electronic spins at the nanoscale has been demonstrated. However, a key challenge is to scale up such control to arrays of nitrogen vacancy spins. Here, we apply nanoscale magnetic resonance frequency encoding to realize site-selective addressing and coherent control of a four-site array of nitrogen vacancy spins. Sites in the array are separated by 100 nm, with each site containing multiple nitrogen vacancies separated by ~15 nm. Microcoils fabricated on the diamond chip provide electrically tuneable magnetic field gradients of ~0.1 G/nm. Tailored application of gradient fields and resonant microwaves allows site-selective nitrogen vacancy spin manipulation and sensing applications, including Rabi oscillations, imaging, and nuclear magnetic resonance spectroscopy with nanoscale resolution. Microcoil-based magnetic resonance of solid-state spins provides a practical platform for quantum-assisted sensing, quantum information processing, and the study of nanoscale spin networks.
Arrays of spins in solids are a promising modality for a wide range of quantum science applications, from sensing to information processing. A team led by Ronald Walsworth at Harvard University adapted methods from magnetic resonance imaging (MRI) to realize site-selective addressing and coherent control of small arrays of optically active electronic spins in diamond known as nitrogen vacancy (NV) colour centres. Microcoils fabricated on the diamond chip provide electrically tuneable magnetic field gradients that allow selective NV spin addressing with 30 nm resolution. The team experimentally demonstrated site-selective NV electron spin resonance spectroscopy, Rabi oscillations, Fourier magnetic imaging, and nuclear magnetic resonance (NMR) spectroscopy. The approach should be scalable to selective coherent control of large-scale arrays of strongly interacting NVs, with a broad spectrum of high-impact quantum science applications.


INTRODUCTION
In recent years, nitrogen vacancy (NV) colour centres in diamond have been successfully applied to a wide range of problems in quantum information, sensing, and metrology in both the physical and life sciences. 1 For example, single NV centres have been used for a loophole-free Bell test of quantum realism, 2 probing nanoscale phenomena in condensed matter systems, [3][4][5][6][7] and nuclear magnetic resonance (NMR) spectroscopy and imaging of nanoscale ensembles of nuclear spins [8][9][10] including single proteins 11 and individual proton spins. 12 Large ensembles of NV centres have provided magnetic imaging with combined micron-scale resolution and millimetre field-of-view, e.g., for mapping paleomagnetism in primitive meteorites 13 and ancient Earth rocks, 14 genetic studies of magnetotactic bacteria, 15,16 and identifying biomarkers in tumour cells. 17 However, it remains a challenge to realize the intermediate regime of mesoscopic arrays of NV spins with selective nanoscale addressing and coherent control of NVs at each site in the array. Such a capability could be a platform technology for applications such as high-spatial-dynamic-range magnetic imaging [18][19][20] and quantum-assisted sensing, [21][22][23] as well as scalable quantum information processing 24,25 and simulation. 26,27 In the present work, we experimentally demonstrate selective coherent manipulation of an array of four NV spin sites, equally spaced by ~100 nm, using a frequency encoding technique inspired by magnetic resonance imaging (MRI). In frequency encoding, the positions of spins at different locations in a sample are mapped onto their resonance frequency using tuneable magnetic field gradients, and then frequency-tailored pulse sequences address and control spins at specific target positions. This technique is widely employed in conventional biomedical MRI for image slice-selection with millimetre-scale resolution, 28 and was recently used in trapped-ion experiments 29 to coherently control ions separated by a few microns. Frequency encoding has also been demonstrated for individual NV spins using magnetic tips. 18,30 Creating strong and spatially homogeneous magnetic field gradients that can be switched rapidly compared to the spin coherence lifetime is the primary technical challenge for multi-site spin control. For our NV-diamond experiment, we achieve a tuneable gradient strength of ~0.1 G/nm over a 1.2 × 8 μm² area by use of a micrometre-scale electromagnetic coil (microcoil) fabricated onto the diamond by e-beam lithography (Fig. 1). This gradient strength is several orders of magnitude larger than in conventional MRI; see our prior work 20 for comparison details. The microcoil field gradient is spatially uniform to 5%, can be modulated at ~1 MHz, and its application requires no active cooling of the sample. Note that gradients of comparable magnitudes over a large area and with comparable switching rates are difficult to achieve using ferromagnets. 31 In Supplementary Figs. 1 and 2, we provide a detailed comparison between these two techniques with respect to figures of merit such as spatial dynamic range and gradient switching bandwidth. We apply these electrically tuneable magnetic field gradients to a series of demonstrations on the array of NV spins, including site-selective electron spin resonance (ESR) spectroscopy, Rabi oscillations, Fourier imaging, 20 and NMR spectroscopy, all with spatial resolution ≈30 nm.

Frequency encoding of NV spin sites in diamond
The frequency-encoding system for site-selective addressing of NV spins in diamond consists of fabricated arrays of NV centres and micron-scale coils, integrated into a home-built scanning confocal microscope as illustrated in Fig. 1a. NV centres are located at a depth of ≈20 nm from the surface of a [100]-cut diamond chip (4 × 4 × 0.5 mm³) and are arranged in a two-dimensional (2D) array (Supplementary Fig. 3). A uniform, static magnetic field of B₀ = 128 G, created by an external Helmholtz coil pair, is applied along the diamond [111] direction, which corresponds to one of the four NV crystallographic orientations (Fig. 1a, inset). A microwave antenna for coherent spin-state control and a magnetic field gradient microcoil are fabricated directly on the diamond surface. A strong field gradient, aligned nominally with B₀, is created by sending electric currents through the microcoil in an anti-Helmholtz configuration. In each optically addressable confocal volume, four NV sites spaced by 100 nm are created by mask implantation (Fig. 1b). 32 Figure 1c shows the energy-level diagram of the negatively charged NV centre. The ground-state spin triplet is optically initialized into spin state |0〉, coherently addressed with microwaves, and read out by spin-state-dependent fluorescence. On applying a magnetic field gradient, the NV spins in different sites acquire a position-dependent Zeeman splitting between the |±1〉 states with well-resolved resonance frequencies. NV spins in each site can thus be selectively manipulated by microwaves with distinct, site-specific frequencies.
As shown by the scanning electron microscope (SEM) image in Fig. 1d, the spacing between the gradient microcoil wires is 2.5 μm. In each NV site, there are approximately three NV centres with the same crystallographic orientation, as determined by the rate of fluorescence counts in a typical confocal volume (Fig. 1e), which is consistent with the estimated NV concentration of 5 × 10¹¹ cm⁻². See Methods and Supplementary Figs. 4 and 5 for more details of the mask implantation and microcoil fabrication. Super-resolution optical imaging with a stimulated emission depletion (STED) microscope confirms the formation of four NV sites with a spacing of about 100 nm within a confocal volume (Fig. 1e, inset). More details of the STED imaging apparatus are found in the "Methods" section.
Fig. 1 a The diamond sample hosts negatively charged NV colour centres implanted in a matrix of defined regions (pink circles) and integrated into a scanning confocal microscope. NV spin states in a given region are initialized and read out with a green (532 nm) laser, and manipulated with microwave fields produced by an antenna (orange bar). A uniform magnetic bias field of B₀ = 128 G, applied along one class of NV orientations, induces a Zeeman splitting between the |±1〉 NV spin states, which are separated at zero magnetic field by 2.87 GHz from the |0〉 state. An additional gradient magnetic field (rainbow-coloured arrows) of about 0.1 G nm⁻¹, created by electric current through a pair of gold wires (gradient microcoil), introduces a position-dependent Zeeman shift. b Each region (pink circle in a) contains a 1 × 4 array of NV sites with 60 nm diameter and 100 nm centre-to-centre spacing, which can be exposed to the strong magnetic field gradient. Each site typically contains multiple (≈3 ± 1) NVs of the selected orientation. c NV energy-level diagram. Between the ground (³A₂) and excited (³E) electronic states there is a nonradiative intersystem crossing channel. Due to the magnetic field gradient, the NV spin states |±1〉 in each site acquire a position-dependent Zeeman splitting. By tuning the microwave frequency to a site-specific resonance, NV centres in any one site can be selectively addressed. d SEM image of gradient microcoil fabricated on a diamond substrate. The microcoil, represented by yellow pseudo-colour, is 1 µm thick and 2 µm wide. (Inset) SEM image of PMMA e-beam resist apertures used for ion implantation mask to create a 1 × 4 array of NV sites. e Scanning confocal microscope image of microcoil and matrix of regions hosting NV centres. (Inset) STED image of 1 × 4 array of NV sites with 50 nm resolution. Colour table represents photon count rate in units of kilo-counts per second.

Demonstration of site-selective NV spin addressing and control
As a benchmark demonstration of site-selective NV addressing via frequency encoding, we performed optically detected ESR measurements with a DC electric current of 250 mA sent through the microcoil (Fig. 2a). Four ESR peaks, corresponding to the four NV sites in the array, are clearly observed (Fig. 2b). We fit the data to a sum of four Lorentzian curves, and determined the splitting between adjacent resonances to be Δf = 29(3) MHz. From the SEM and STED images we find the mean separation between NV sites to be Δx = 96(7) nm, and thus the field gradient to be dB/dx = 2π × Δf/(γΔx) = 0.11(1) G nm⁻¹. Note that the observed ~30% variation in ESR peak linewidth is consistent with a simple model of inhomogeneous line broadening in the presence of the applied magnetic field gradient due to multiple (typically three) NV centres being randomly distributed in position within each site. See also Supplementary Fig. 6 for detailed analysis. Next, we demonstrated site-selective coherent Rabi driving of NV spins via the pulse sequence illustrated in Fig. 2c. After initializing all NV spins into the |0〉 state with a 5 µs long green laser pulse, frequency encoding is instantiated with a DC field gradient, and a microwave pulse tuned to the ESR frequency of a target NV site is applied for a duration τ_MW. Finally, another 5 µs laser pulse is applied to read out the NV spin states via a fluorescence measurement. NVs in the target site exhibit Rabi oscillations as the duration of the microwave pulse is varied. As the microwave frequency is adjusted to match the ESR frequency at each of the four sites, we obtain the data shown in Fig. 2d. The fidelity of such site-selective control of the NV spins via frequency encoding is estimated to be >97.4% (see "Methods"). By fitting each Rabi oscillation data set with a sinusoid, we determine the Rabi frequencies at all four NV sites, with results that are consistent with a numerical simulation of the microwave field produced by the antenna (Fig. 2e, see also Supplementary Discussion 1).
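The gradient estimate quoted in this section, dB/dx = 2π × Δf/(γΔx), can be reproduced from the reported ESR splitting and site separation. The sketch below is one way to do the arithmetic; the uncertainty is propagated in quadrature under the assumption that the errors on Δf and Δx are uncorrelated, which the text does not state, and the function name is illustrative.

```python
import math

GAMMA_MHZ_PER_G = 2.8  # NV gyromagnetic ratio gamma/2pi in MHz per gauss


def gradient_from_splitting(df_mhz, df_err_mhz, dx_nm, dx_err_nm):
    """Field gradient (G/nm) and its uncertainty from an ESR splitting.

    Assumes uncorrelated errors on the splitting and the site separation
    (an assumption of this sketch, not a statement from the paper).
    """
    gradient = df_mhz / (GAMMA_MHZ_PER_G * dx_nm)
    relative_err = math.hypot(df_err_mhz / df_mhz, dx_err_nm / dx_nm)
    return gradient, gradient * relative_err


# Values reported in the text: splitting 29(3) MHz, separation 96(7) nm.
g, g_err = gradient_from_splitting(29.0, 3.0, 96.0, 7.0)
print(f"dB/dx = {g:.3f} +/- {g_err:.3f} G/nm")
```

With the reported inputs this gives roughly 0.108 G/nm with an uncertainty near 0.014 G/nm, which rounds to the quoted 0.11(1) G nm⁻¹.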
Fig. 2 b Measured NV ESR spectrum with four resonance peaks, corresponding to each NV site in a 1 × 4 array. Black solid line is a fit of the data (black dots) to a sum of four Lorentzian functions. The full-width-half-maxima of the peaks are ≈8-15 MHz, and the contrast is ≈1.0-1.5%. c Site-selective NV Rabi driving sequence. Green laser pulses (5 μs duration) initialize and read out the NV spin states. A fixed magnetic field gradient (0.1 G nm⁻¹) differentiates the ESR frequencies at the four NV sites via Zeeman shifts. A microwave pulse, with frequency tuned to any one of the four ESR resonances, drives Rabi oscillations at the corresponding NV site. Error bars indicate 68% confidence intervals in b and c. d Measured site-selective Rabi oscillation contrast as a function of microwave pulse duration τ_MW. Black solid lines are fits of the data (coloured dots) to a sinusoid. From top to bottom, NV sites are driven at microwave frequencies of 3.452, 3.480, 3.511, and 3.540 GHz, with Rabi frequencies determined from fits to be 4.4(2), 4.2(2), 3.3(2), and 2.7(1) MHz, respectively. The Rabi frequency variation between sites is attributed to microwave field inhomogeneity caused by the boundary conditions of the gradient microcoil. e Simulation with COMSOL of NV Rabi oscillation parameters provided by the gradient microcoil. The calculated Rabi frequency variation is consistent with the measurements in d.

Demonstration of site-selective NV imaging and NMR spectroscopy
To illustrate the utility of the frequency encoding technique, we performed site-selective one-dimensional (1D) imaging of the array of NV centres. The experimental protocol (Fig. 3a) augments our previously demonstrated NV Fourier imaging technique 20 with frequency encoding to resolve the array's sub-diffraction-limit spatial structure with site-selection capability. In analogy with conventional MRI, an alternating (AC) magnetic field gradient, synchronized with a Hahn echo NV pulse sequence, phase-encodes spatial information about the NV sites in wavenumber or "k-space" onto the NV spins' phase, while also isolating the NV spins from local magnetic field variations that induce dephasing. In particular, the array of NV spins with real-space positions $x_i$ ($i$ = A, …, D) is exposed to an AC gradient of magnitude $(dB/dx)_j$ and thus acquires a position-dependent phase $\varphi = 2\pi k_j x_i$, where $k_j = (2\pi)^{-1} \gamma \tau (dB/dx)_j$ defines the jth point in Fourier or k-space. Here, γ/2π = 2.8 MHz/G is the NV gyromagnetic ratio and τ is the total NV spin precession time in the Hahn echo sequence. The optically detected NV signal for a point in k-space is proportional to the sum across all NV sites of the cosine of the acquired NV spin phase at each site: $s(k_j) \sim \sum_i \cos(2\pi k_j x_i)$. By incrementally stepping through a range of field gradient amplitudes with τ fixed, one measures the NV signal as a function of k to yield a k-space image. In the following discussion we drop subscript j for simplicity. The real-space image is then reconstructed by a Fourier transformation of the k-space image: $S(x) = \mathcal{F}[s(k)]$, where $\mathrm{abs}[S(x)]$ gives the relative positions of the NV sites in the array. Note that the resolution of the real-space image is $(k_{\max})^{-1}$, where $k_{\max} = (2\pi)^{-1} \gamma \tau |dB/dx|_{\max}$ is the maximum k value used in the measurement. In the presence of an additional DC frequency-encoding field gradient and frequency-tuned microwave pulses, only NV centres at a specific target site in the array (e.g., $x_A$) are subject to the phase-encoding protocol and hence contribute to both the Fourier image (e.g., $s(k_j; x_A) \sim \cos(2\pi k_j x_A)$) and the real-space image (e.g., $S(x_A)$). We demonstrated this nanoscale NV imaging protocol using a phase-encoding magnetic field gradient of sinusoidal form $(dB/dx)_j = G_j \sin(2\pi t/\tau)$, where $G_j$ is the gradient magnitude for the jth point in k-space. Example 1D k-space and real-space NV images are shown in Fig. 3b, c, respectively, for 512 equally spaced k-space points and the NV spin free-precession time fixed at τ = 0.9 µs. The maximum k-space value is $k_{\max}$ = 0.021(1) nm⁻¹, which is induced by an AC gradient magnitude of $G_{\max}$ = 0.0068(3) G nm⁻¹ corresponding to a microcoil current of I = 25 mA. This $k_{\max}$ implies a 1D real-space resolution of $\delta x = (2k_{\max})^{-1}$ = 24(2) nm, which is much less than the 100 nm separation between sites in the array of NV spins, but is insufficient to resolve individual NVs within one site. Note that site-selective Hahn echo measurements using the DC field gradient, but without phase encoding, confirm that the coherence times (T₂) for NV centres in all array sites are ≥2 μs and thus decoherence is insignificant during the imaging protocol (see Fig. 4a, b and Supplementary Fig. 7).
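The Fourier imaging reconstruction described above can be illustrated numerically. The following is a minimal, noise-free sketch and not the authors' analysis code: it assumes k is sampled symmetrically from −k_max to just below +k_max with 512 points, ignores decoherence and site width, and places the four sites exactly on the reconstruction grid (spacing 4δx ≈ 95 nm) so that the discrete Fourier transform has no spectral leakage. The names `kspace_signal` and `real_space_image` are illustrative.

```python
import numpy as np

K_MAX = 0.021                 # nm^-1, maximum k value quoted in the text
N_K = 512                     # number of k-space points quoted in the text
DK = 2 * K_MAX / N_K          # assumed symmetric sampling of k-space
DX = 1.0 / (2 * K_MAX)        # real-space pixel size, (2 k_max)^-1 in nm

k = (np.arange(N_K) - N_K // 2) * DK   # k runs from -K_MAX to K_MAX - DK


def kspace_signal(k_values, x_sites_nm):
    """s(k) ~ sum_i cos(2 pi k x_i) for point-like sites (no noise, no decay)."""
    phases = 2 * np.pi * np.outer(k_values, x_sites_nm)
    return np.cos(phases).sum(axis=1)


def real_space_image(signal, dk):
    """abs[S(x)] on the non-negative half of the real-space axis."""
    magnitude = np.abs(np.fft.fft(signal))
    x = np.fft.fftfreq(len(signal), d=dk)
    keep = x >= 0
    return x[keep], magnitude[keep]


# Hypothetical on-grid site positions, about 95 nm apart.
x_sites = np.array([8, 12, 16, 20]) * DX

# Image ABCD: all four sites contribute to the k-space signal.
x_axis, image_all = real_space_image(kspace_signal(k, x_sites), DK)
recovered = np.sort(x_axis[np.argsort(image_all)[-4:]])

# Site-selected image A: only one site contributes.
_, image_one = real_space_image(kspace_signal(k, x_sites[:1]), DK)
selected = x_axis[np.argmax(image_one)]

print("pixel size (nm):", round(DX, 1))
print("recovered sites (nm):", np.round(recovered, 1))
print("site-selected peak (nm):", round(float(selected), 1))
```

In this idealized setting the four-site signal shows beats in k-space and four peaks in real space, while the site-selected signal is a single cosine that transforms to a single peak. Real data would additionally show finite peak widths set by the site diameter and noise.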
The top images in both Fig. 3b, c (labelled ABCD) have no DC frequency-encoding gradient applied, and hence contain contributions from all four NV sites, whereas the bottom four images in these figures (labelled A, B, C, or D) are acquired with site-selective frequency encoding using a DC magnetic field gradient of 0.1 G nm⁻¹. The four resolved signal peaks in image ABCD in Fig. 3c indicate the real-space locations of the NV sites in the array, with mean site separation = 93(1) nm and mean site diameter = 32(2) nm. These results are consistent with the NV array geometry determined using site-selective frequency encoding (images A, B, C, and D in Fig. 3c), which collectively yield mean site separation = 92(1) nm and mean site diameter = 33(2) nm. Consistent results are also found for the k-space images with and without site-selective frequency encoding. Specifically, site-selected k-space images A, B, C, and D in Fig. 3b exhibit coherent single-spatial-frequency oscillations, with the frequency at each site (determined from fits to the image data) varying linearly with location in the array, as expected for a uniform DC gradient field (Supplementary Fig. 8). k-space image ABCD, acquired without site-selection, displays spatial frequency beats in accord with a coherent sum of the k-space images A, B, C, D from the four NV sites.
As a final demonstration, we combined the frequency encoding technique with an AC magnetometry experimental protocol (Fig. 4a) to perform site-selective NMR spectroscopy of the ¹⁵N nuclear spins that are constituents of the NV centres at each site. With no applied DC field gradient, all four NV sites contribute to the measured Hahn echo signal, allowing determination of the NV ensemble T₂ ≥ 2 μs (Fig. 4b, upper panel and Supplementary Fig. 7). With the frequency encoding DC gradient applied and the microwave frequency tuned to address only one NV site at a time, the Hahn echo signal is modulated by the ¹⁵N NMR signal at a Larmor precession frequency given by the transverse component of the gradient field B⊥ at the selected site (Fig. 4b, lower panel and Supplementary Discussion 2). The measured ¹⁵N Larmor frequency as a function of the varying transverse field B⊥ across multiple sites yields a ¹⁵N nuclear gyromagnetic ratio of |γₙ/2π| = 0.46(4) kHz G⁻¹, which is consistent with the accepted value. 19
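The extraction of the ¹⁵N gyromagnetic ratio from Larmor frequency versus transverse field can be sketched as a straight-line fit. The example below is illustrative only: it assumes a fit through the origin (the text says only "a linear fit"), uses synthetic, noise-free frequencies generated from the commonly tabulated ¹⁵N value of about 0.4316 kHz/G rather than the measured data, and the B⊥ values are hypothetical.

```python
import numpy as np

ACCEPTED_15N_KHZ_PER_G = 0.4316  # |gamma_n/2pi| for 15N, commonly tabulated value


def fit_gyromagnetic_ratio(b_perp_g, f_larmor_khz):
    """Least-squares slope of f = |gamma_n/2pi| * B_perp, forced through the origin."""
    b = np.asarray(b_perp_g, dtype=float)
    f = np.asarray(f_larmor_khz, dtype=float)
    return float(np.sum(b * f) / np.sum(b * b))


# Hypothetical transverse fields (G) and synthetic, noise-free Larmor frequencies (kHz).
b_perp = np.array([20.0, 30.0, 40.0, 50.0])
f_larmor = ACCEPTED_15N_KHZ_PER_G * b_perp
ratio = fit_gyromagnetic_ratio(b_perp, f_larmor)

# Consistency check of the reported result 0.46(4) kHz/G against the tabulated value.
measured, sigma = 0.46, 0.04
within_one_sigma = abs(measured - ACCEPTED_15N_KHZ_PER_G) <= sigma

print(f"fitted ratio: {ratio:.4f} kHz/G")
print("reported value within one sigma of tabulated value:", within_one_sigma)
```

The reported 0.46(4) kHz G⁻¹ differs from the tabulated value by less than its stated uncertainty, which is the sense in which the two are consistent.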

DISCUSSION
Microcoil-based magnetic resonance frequency encoding provides a practical method for site-selective addressing and coherent control of a mesoscale array of solid-state spins with nanoscale resolution. In particular, the microcoil technique has advantages relative to the use of a scanned magnetic tip in terms of spatial dynamic range and gradient switching bandwidth. When combined with Fourier magnetic imaging, 20 frequency encoding enables site-selective addressability in both the space and time domains, with advantages over other high-spatial-resolution NV magnetic imaging/sensing methods; see our prior work 20 for a comparative discussion. Microcoil-based frequency encoding should be extendable to multiple spatial dimensions, larger and denser NV arrays, and smaller length scales (Supplementary Discussion 3), enabling applications such as targeted decoupling of proximal, dipolar-coupled NV spins. Key technical challenges include further increases in gradient strength (without significant additional heating), as well as improved SNR in the presence of applied gradients (e.g., by optimization of NV density and coherence time, enhanced NV fluorescence collection efficiency, and use of alternative NV spin-state readout schemes 20 ). For example, an increase in gradient strength to >1 G nm⁻¹ should be practical by increasing the electric current through the microcoil, optimally matching the microcoil's impedance to the current supply, and introducing an active temperature control system to better remove microcoil-induced heat from the diamond (Supplementary Discussion 4). This enhanced gradient field could allow selective addressing with >95% fidelity of a micron-scale array of dipolar-coupled NV centres, each spaced by ≈10 nm (Supplementary Discussions 4 and 5). Such a network of strongly interacting spins with high spatial dynamic range has many potential applications, including in quantum sensing and imaging, 33 quantum information processing, 34,35 studies of quantum spin transport, 36 and as quantum simulators for exotic quantum and topological phases (e.g., spin liquids and supersolids, 37 quantum spin Hall effect, 38 and topological insulators 39 ).
Fig. 4 Site-selective NV NMR spectroscopy. a Site-selective AC magnetometry sequence, consisting of laser initialization and readout pulses, a microwave Hahn echo (π/2-π-π/2) pulse sequence with free precession time τ, and a DC field gradient (0.1 G nm⁻¹). The microwave frequency is tuned to one of the site-specific NV ESR resonances to be sensitive only to AC magnetic fields at the target spin site and with frequency ≈1/τ. b Measured Hahn echo signal as a function of free precession time τ for no applied magnetic field gradient, which probes all four NV sites (upper panel), and an applied DC field gradient and the microwave frequency tuned to address only NV site C (lower panel). With no applied field gradient, NV spins in all sites experience only the uniform longitudinal bias field B₀, hence the NV Hahn echo signal provides a measure of the NV ensemble T₂ > 2.5 μs. With an applied gradient, the NV Hahn echo signal is modulated by the Larmor precession of NV-constituent ¹⁵N nuclear spins in the transverse component of the gradient field B⊥ at the selected NV site. In both cases, the data is well fit (black curves) by a model that includes NV ESR and decoherence, ¹⁵N NMR, and the measurement protocol (see Supplementary Discussion 2). Error bars indicate 68% confidence intervals. c Measured ¹⁵N NMR frequency vs. transverse field B⊥ at the four NV sites. B⊥ values are determined from COMSOL simulations of the magnetic field gradient, constrained by measured NV ESR frequencies at each NV site. A linear fit to the data yields a ¹⁵N nuclear gyromagnetic ratio of |γₙ/2π| = 0.46(4) kHz G⁻¹, which agrees with the accepted value. 19 Error bars indicate the fitting uncertainty.
We also emphasize the simplicity and flexibility of the gradient microcoil design, which facilitates integration with other systems such as microfluidics and microelectromechanical systems. Furthermore, the frequency-encoding technique should be compatible with a wide range of NV sensing protocols, including those for DC and AC magnetic fields, electric fields, and temperature. These advantages open new directions for applications, including wide-field NMR imaging of nanoscale nuclear spin diffusion and effusion in cellular or microfluidic environments via q-space detection, 40 and spatially selective nanoscale imaging of strongly correlated spins in condensed matter systems. 37

METHODS
Creation of 2D NV centre arrays with mask implantation
The diamond sample used in this experiment is an electronic-grade, single-crystal, 99.999% ¹²C high-purity chemical vapour deposition [100]-cut chip (4 × 4 × 0.5 mm³) obtained from Element 6 Corporation. Registration markers are fabricated on the diamond substrate by e-beam lithography and reactive ion etching. All subsequent fabrication steps use the same spatial coordinates defined by these markers. A polymethyl methacrylate (PMMA) ion implantation mask is used to spatially control NV centre formation in a three-level hierarchical structure of 2D NV arrays (Supplementary Fig. 3). ¹⁵N⁺ ions are implanted with a dose of 1 × 10¹³ cm⁻² at an implantation energy of 14 keV. The conversion efficiency from nitrogen ions to NV centres after high temperature vacuum annealing (1200 °C, 4 h) is approximately 6%, which is determined by comparing the measured NV fluorescence signal from a confocal spot with that from a single NV centre. From simulations using the Stopping and Range of Ions in Matter (SRIM) programme, 41 the NV centres are estimated to be 21(7) nm below the diamond surface. Typical NV spin coherence times are T₂* ≈ 580 ns and T₂ ≈ 4.5 µs, which are expected for the sample's high implantation dosage of ¹⁵N⁺ ions (1 × 10¹³ cm⁻²) and the resulting high density of paramagnetic P1 centres.
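As a rough cross-check of the implantation numbers above, the expected number of NV centres per site can be estimated from the dose, the conversion efficiency, and the aperture area. The sketch below assumes that the full dose reaches the diamond inside each 60 nm diameter aperture (the diameter quoted in the Fig. 1 caption), that NVs are distributed uniformly, and that the four crystallographic orientations are equally populated; none of these assumptions is stated explicitly in the text, and the function name is illustrative.

```python
import math


def nvs_per_site(dose_per_cm2, conversion, site_diameter_nm, n_orientations=4):
    """Estimate NV areal density and NV counts in one circular implantation site.

    Returns (NV density in cm^-2, total NVs per site, NVs per orientation).
    """
    nv_density = dose_per_cm2 * conversion
    radius_cm = 0.5 * site_diameter_nm * 1e-7      # 1 nm = 1e-7 cm
    area_cm2 = math.pi * radius_cm ** 2
    total = nv_density * area_cm2
    return nv_density, total, total / n_orientations


density, total_nv, per_orientation = nvs_per_site(1e13, 0.06, 60.0)
print(f"NV density: {density:.1e} cm^-2")
print(f"NVs per site: {total_nv:.1f}, per orientation: {per_orientation:.1f}")
```

This simple estimate gives an NV density of 6 × 10¹¹ cm⁻² and roughly four NVs per orientation per site, comparable to the ≈3 ± 1 NVs per site and the 5 × 10¹¹ cm⁻² concentration quoted in the main text given that the conversion efficiency is only approximate.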

Fabrication of gradient microcoil and microwave antenna
The magnetic field gradient microcoil and microwave antenna are fabricated on the diamond chip near the NV arrays ( Supplementary Fig. 4).
A double-layer PMMA process is employed to form an undercut, which improves the quality of copper lift-off. Here, the bottom layer is PMMA 495k C9 and the top layer is 950k C4. A thin layer of chromium (~10 nm) is deposited onto the PMMA stack to improve the surface conductance and reduce charging during e-beam lithography. An Elionix F-125 e-beam writing system is used for patterning with exposure dosage and beam energy set at 2600 µC cm⁻² and 125 kV, respectively. A 30 nm Ti layer and a 970 nm Au layer are then deposited in an e-beam evaporator, followed by lift-off in MicroChem Remover PG solution. Electrical resistance measurements, performed using a four-probe station, give a resistance of ~2 Ω for each microcoil. A heat sink is then attached to the back surface of the diamond coverslip to enhance heat dissipation (Supplementary Fig. 9).

Confocal scanning laser microscopy
Site-selective NV spin addressing and sensing experiments are performed using a custom-built confocal scanning laser microscope. A 400 mW diode-pumped solid-state laser (Changchun New Industries) operating at 532 nm is used for NV optical excitation. An acousto-optic modulator (Isomet Corporation) modulated at 80 MHz is used for pulsing the laser beam. Laser pulses are sent to a 100×, 1.3 NA oil-immersion objective lens (Nikon CFI Plan Fluor). A three-axis motorized stage (Micos GmbH) allows scanning of the NV-containing diamond sample in the focal plane of the objective. Red fluorescence from NV centres is separated from the 532 nm excitation light with a dichroic beam splitter (Semrock LM01-552-25), focused onto a single-mode fibre with a mode-field diameter of ~5 μm, and then collected by a silicon avalanche photodiode detector (Perkin Elmer SPCM-AQRH-12). Pulse signals from the detector are transmitted to a computer through a DAQ card (National Instruments PCI 6221).

STED microscopy
Super-resolution NV optical images (e.g., Fig. 1e) are recorded on a home-built CW-STED microscope. 42 STED microscopy is based on applying a strong, spatially structured optical depletion field to switch off peripheral fluorescent emitters through stimulated emission. In our system, the depletion field is a doughnut-shaped optical beam at 750 nm, which is applied to NV centres in the field of view at the same time as a 532 nm Gaussian-shaped excitation laser beam. The doughnut-shaped depletion beam rapidly drives to the ground electronic state all NV centres that are off the dark spot in the centre of the beam axis, while limiting unwanted NV re-excitation. Thus, only NVs on the beam axis produce NV fluorescence, which is imaged as the microscope is scanned. The doughnut beam is created by a vortex phase plate (RPC Photonics), which imprints a staircase 180° phase shift on the input 750 nm laser beam such that diametrically opposite rays are out of phase. This creates a dark spot in the centre of the depletion beam by destructive interference at the focal point of the objective (100× TIRF, Nikon). Aberrations in the interference pattern are minimized via an optimization procedure in which the doughnut beam is imaged by reflection from 80 nm diameter gold nanoparticles, and the position of the vortex phase plate with respect to the 750 nm laser beam is adjusted to minimize the intensity at the doughnut centre. 43 The STED point spread function is then imaged with ~1 µm-deep isolated single NV centres in the bulk diamond sample. An improvement in spatial resolution from about 250 nm (confocal) to 50 nm (STED) is routinely achieved by applying a 300 mW depletion doughnut beam while the excitation beam power is kept as low as 100 µW. With the same experimental parameters, a four-site array of NV centres can be STED imaged across a 600 × 250 nm field-of-view with 100 × 100 pixels, at a speed of 10 ms/pixel, realizing a per-pixel NV contrast-to-noise ratio of ~5. With such a STED imaging procedure, four individual NV sites, each separated by 96(7) nm, are distinguished. 2D Gaussian filtering (4 × 1.5 pixels) and thresholding are applied to improve the image rendering, as in Fig. 1e.

Characterization of the magnetic field gradient
To characterize the performance of the microcoil, the magnetic field gradient is measured as a function of electric current through the microcoil by optically detected NV ESR. An external Helmholtz coil pair is used to apply a uniform magnetic field of B₀ = 128 G along the [111] diamond crystallographic orientation (on-axis), which leaves the NV spins along the other three orientations (off-axis) degenerate. A gradient magnetic field B(x) is then introduced by sending an electric current I through the microcoil. This gradient field is aligned nominally with B₀ and the [111] diamond crystallographic orientation, although there is also a modest […] where ħ is the reduced Planck constant, γ = 2π × 2.8 MHz G⁻¹ is the NV gyromagnetic ratio, and S is the spin-1 operator. The change of NV fluorescence is recorded under continuous optical and microwave excitations. Supplementary Fig. 10a displays the results of the measurement, showing the change of NV fluorescence as a function of microwave carrier frequency f and current I. Of the four observed resonance bands, the outer two correspond to the ESR transitions of on-axis NV centres, while the inner two are the ESR transitions of three degenerate off-axis NV centres. The measurements are consistent with a simulation based on the above NV spin Hamiltonian (Supplementary Fig. 10b). As the current is increased, the resonance bands become broader because the four NV sites split. A higher resolution ESR scan of the highlighted region in Supplementary Fig. 10a quantifies the splitting of the resonance band for on-axis NV centres (Supplementary Fig. 10c). At around I = 200 mA, the band begins to clearly split into four peaks, corresponding to four proximal NV sites, and the ESR contrast decreases to ~2%. Consistency with simulation is again found (Supplementary Fig. 10d). By fitting the ESR spectrum at a given current value with a curve composed of four Lorentzian lineshapes, the resonance frequency fᵢ (i = 1,…,4) of each NV site is extracted. Since the separation between NV sites is known to be Δx = 96(7) nm from the SEM and STED images, the field gradient at each current value can be obtained using dB/dx = 2π × Δf/(γΔx), where Δf is the measured frequency splitting between adjacent resonance peaks. Repeating this analysis for all current values, the field gradient per unit current is found to be (dB/dx)/I = 0.45(2) G nm⁻¹ A⁻¹ (Supplementary Fig. 11). In particular, at I = 250 mA, the measured field gradient is dB/dx = 0.11(1) G nm⁻¹. A numerical simulation of the field gradient is performed in three steps. First, the magnetic field spatial distribution produced by the microcoil at a fixed current is simulated using COMSOL. Next, the magnetic field spatial distribution for the entire current range is determined under the assumption that the field is linearly proportional to the current. Finally, the ESR resonance peaks for all NV centres are calculated by diagonalizing the ground-state Hamiltonian with the obtained field distribution as an input. The resulting analysis yields a field gradient per unit current of (dB/dx)/I = 0.48 G nm⁻¹ A⁻¹, which is in reasonable agreement with the measured value (Supplementary Fig. 11).
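The final simulation step, calculating ESR resonances by diagonalizing the ground-state Hamiltonian, can be sketched as follows. Because the Hamiltonian itself is not reproduced in the text above, this example assumes the standard NV ground-state form H/h = D S_z² + (γ/2π)(B∥ S_z + B⊥ S_x) with D = 2.87 GHz, neglecting strain and hyperfine terms. The function name and the choice of the x-axis for the transverse field are illustrative.

```python
import numpy as np

D_MHZ = 2870.0           # zero-field splitting, MHz (2.87 GHz, as quoted in Fig. 1)
GAMMA_MHZ_PER_G = 2.8    # NV gyromagnetic ratio gamma/2pi, MHz per gauss

# Spin-1 operators in the basis (m_s = +1, 0, -1).
SX = np.array([[0.0, 1.0, 0.0],
               [1.0, 0.0, 1.0],
               [0.0, 1.0, 0.0]]) / np.sqrt(2.0)
SZ = np.diag([1.0, 0.0, -1.0])


def esr_frequencies(b_parallel_g, b_perp_g=0.0):
    """Return the two ESR transition frequencies (MHz) from the lowest level.

    Assumes the standard NV Hamiltonian (an assumption of this sketch) and
    fields small enough that the lowest level is the m_s = 0-like state.
    """
    hamiltonian = (D_MHZ * SZ @ SZ
                   + GAMMA_MHZ_PER_G * (b_parallel_g * SZ + b_perp_g * SX))
    energies = np.linalg.eigvalsh(hamiltonian)   # sorted ascending
    return energies[1] - energies[0], energies[2] - energies[0]


# Bias field only, aligned with the NV axis.
f_lower, f_upper = esr_frequencies(128.0)

# Adding the extra field seen by a neighbouring site 96 nm away at 0.11 G/nm.
_, f_upper_next = esr_frequencies(128.0 + 0.11 * 96.0)

print(f"B0 only: {f_lower:.1f} MHz and {f_upper:.1f} MHz")
print(f"upper-transition shift between adjacent sites: {f_upper_next - f_upper:.1f} MHz")
```

For a purely axial field the result reduces to D ± (γ/2π)B, and the shift between adjacent sites of about 30 MHz matches the measured splitting of 29(3) MHz. A nonzero `b_perp_g` shows how a transverse gradient component perturbs the resonances beyond the linear picture.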

Estimation of Rabi driving fidelity
From the site-selective Rabi driving measurements presented in Fig. 2, the fidelity of coherent NV spin driving in the presence of a DC gradient and with a Rabi frequency of Ω can be estimated by evaluating the total error: $p_{\mathrm{err}} \approx p_{\mathrm{dip}} + p_{T_1} + p_{T_2} + p_{\mathrm{off}}$. 44 The first term, $p_{\mathrm{dip}} \sim (\kappa/\Omega)^2 \sim 10^{-9}$, is the error induced by dipolar coupling (κ = 0.3 kHz) between nearest neighbour NV centres; the second term, $p_{T_1} \sim (\Omega T_1)^{-1} \sim 10^{-4}$, is the depolarization error due to the finite NV spin relaxation time (T₁ ~ 1 ms); and the third term, $p_{T_2} \sim (\Omega T_2)^{-3} \sim 10^{-6}$, is the decoherence error due to the finite decoherence time (T₂ = 100 µs). The dominant contribution comes from the last term, $p_{\mathrm{off}} = (\Omega_i/\Omega_R)^2 \sin^2(\Omega_R \tau/2)$, which is the off-resonant excitation (crosstalk) error induced by driving adjacent NV sites given finite NV ESR frequency detuning. Here, $\Omega_R = \sqrt{\Omega_i^2 + \Delta_{i,j}^2}$ is the generalized Rabi frequency and i = A, …, D is the NV site label, with $\Omega_i$ being the on-resonant Rabi frequency at site i and $\Delta_{i,j}$ being the ESR detuning between sites i and j. The maximum crosstalk error ($p_{\mathrm{err}}$) is observed to be 0.026(3) between NV sites A and B with the following parameters: $\Omega_A$ = 4.4(2) MHz, $\Omega_B$ = 4.2(2) MHz, and $\Delta_{AB}$ = 27(1) MHz. In this way, we infer that the fidelity of site-selective control is about 97%, which is essentially $1 - p_{\mathrm{err}}$. See also Supplementary Fig. 12.
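The dominant crosstalk term can be evaluated directly from the quoted parameters. The sketch below assumes the standard two-level expression for off-resonant driving, with generalized Rabi frequency Ω_R = sqrt(Ω² + Δ²), and treats the quoted Rabi frequency and detuning as ordinary frequencies in MHz (so that the time-dependent factor becomes sin²(π f_R τ)). The function name is illustrative.

```python
import math


def crosstalk_error(rabi_mhz, detuning_mhz, tau_us=None):
    """Off-resonant excitation probability of a neighbouring site.

    With tau_us=None the worst case over pulse duration is returned,
    i.e. the amplitude (Omega / Omega_R)^2 with the sin^2 factor set to 1.
    """
    generalized_rabi = math.hypot(rabi_mhz, detuning_mhz)
    amplitude = (rabi_mhz / generalized_rabi) ** 2
    if tau_us is None:
        return amplitude
    return amplitude * math.sin(math.pi * generalized_rabi * tau_us) ** 2


# Parameters quoted in the text for sites A and B.
p_off_max = crosstalk_error(4.4, 27.0)
fidelity = 1.0 - p_off_max

print(f"maximum crosstalk error: {p_off_max:.4f}")
print(f"inferred fidelity: {fidelity:.4f}")
```

With Ω = 4.4 MHz and Δ = 27 MHz the worst-case error is about 0.026, reproducing the quoted value and the >97.4% fidelity. The other three error terms are smaller by at least two orders of magnitude and do not change this figure at the quoted precision.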

Data availability
The authors declare that the main data supporting the findings of this study are available within the article and its Supplementary Information files.