Nanoscale Electric Field Imaging with an Ambient Scanning Quantum Sensor Microscope

The nitrogen-vacancy (NV) center in diamond is a promising quantum sensor with remarkably versatile sensing capabilities. While scanning NV magnetometry is well-established, NV electrometry has so far been limited to bulk diamonds. Here we demonstrate imaging of external alternating (AC) and direct (DC) electric fields with a single NV at the apex of a diamond scanning tip under ambient conditions. A strong electric field screening effect is observed at low frequencies due to charge noise on the surface. We quantitatively measure its frequency dependence, and overcome this screening by mechanically oscillating the tip for imaging DC fields. Our scanning NV electrometry achieves an AC E-field sensitivity of 26 mV µm^(-1) Hz^(-1/2), a DC E-field gradient sensitivity of 2 V µm^(-2) Hz^(-1/2), and sub-100 nm resolution limited by the NV-sample distance. Our work represents an important step toward building a scanning-probe-based multimodal quantum sensing platform.


INTRODUCTION
The recent decade has witnessed exciting developments in quantum science and technology, where controlled quantum systems are employed to carry out tasks that are challenging for existing classical techniques. Quantum sensing is a rapidly growing branch [1], which exploits the fragility of quantum states to detect small external signals with high sensitivity. It has many potential real-world applications, such as geophysical navigation [2,3], disease diagnostics [4,5] and discovery of new materials [6][7][8]. The nitrogen-vacancy (NV) center in diamond is one of the most promising quantum sensors [9]. Its electron spin has a long coherence time even under ambient conditions. It is sensitive to a variety of signals, such as magnetic fields [10][11][12][13], electric fields [14][15][16][17][18][19][20], temperature [21][22][23][24] and pressure [25][26][27]. Integrating this atomic-sized versatile quantum sensor into a scanning probe microscope further allows mapping external signals with nanoscale resolution [28][29][30]. This holds great potential for probing condensed matter physics. In particular, imaging both magnetic and electric fields can provide unique insight into strongly correlated matter [31] and multiferroic materials [32]. It also has interdisciplinary applications, such as imaging charge and spin phenomena in chemistry and biology [9][33][34][35][36].
Scanning NV magnetometry, as a well-established technique [28], has been utilized to image magnetic materials [6][37][38][39], hydrodynamic current flows [7,8], skyrmion structures [40] and vortices in superconductors [41]. Recently, scanning NV electrometry has also been achieved, where fixed NVs in bulk diamonds are used to image electric fields from a conductive scanning tip [19,20]. However, NV electrometry has not yet been demonstrated in a diamond scanning tip, which is desired for imaging an arbitrary sample. Major challenges arise from the qubit's weak coupling strength to electric fields, a relatively short coherence time in nanostructures compared to bulk diamonds, and electric field screening by the diamond surface [42][43][44]. In this work, we demonstrate the use of a shallow NV at the apex of a diamond nanopillar to image external AC and DC electric fields under ambient conditions. Dynamical decoupling sequences are used to extend the NV coherence times [45]. We achieve an AC electric field sensitivity of 26 mV/µm/√Hz and sub-100 nm spatial resolution limited by the NV-sample distance. A strong electric field screening effect is observed at low frequencies, likely caused by mobile charges on the diamond surface [42]. We quantitatively characterize its frequency dependence, which reveals a resistive-capacitive (RC) time constant of our diamond tip surface of ∼ 30 µs. To overcome this screening effect and image DC electric fields, we oscillate the diamond probe at a frequency of ∼ 190 kHz, and synchronize the quantum sensing pulse sequences with the mechanical motion. Hence, a local spatial gradient is upconverted to an AC signal, allowing for T_2-limited DC sensing. This motion-enabled imaging technique has been explored in various scanning measurements [46][47][48][49][50][51], and here we apply it to scanning NV electrometry. We achieve a DC electric field gradient sensitivity of 2 V/µm²/√Hz.
Our results pave the way for building a scanning-probe-based multi-modal quantum sensing platform.

Experimental Apparatus and NV Electrometry
Our home-built diamond-NV scanning setup combines a confocal microscope and an atomic force microscope (AFM) operating in ambient conditions, as sketched in Fig. 1a. The diamond probe and the sample can be independently scanned by piezoelectric nanopositioners. The probe is attached to a quartz crystal tuning fork, allowing for frequency modulation AFM (FM-AFM). The probe oscillates parallel to the sample surface, and the AFM operates in contact mode. To improve the scanning robustness, we make multiple nanopillars on a single diamond probe, and use different pillars for AFM contact and quantum sensing. The NV, at the apex of the sensing pillar, is kept at a fixed distance from the sample. The distance is controlled by the tilting angle of the probe. More details on AFM scanning control can be found in Supplementary Note 2E. To test the electrometry capabilities of our system, we fabricate a U-shaped gold structure on a quartz substrate and use the NV to map out its electric field distribution. An on-chip microwave (MW) stripline delivers MW fields to manipulate the NV spin states. A permanent magnet beneath the sample applies a bias magnetic field at the NV.
Our electrometry measurements were performed under a magnetic field oriented perpendicular to the NV axis, denoted by B⊥. The coordinate system is depicted in Fig. 1b, where ẑ is along the NV axis and the projection of one carbon atom onto the perpendicular plane defines the x̂ axis. Under B⊥, the NV electron spin eigenstates are |0⟩ and |±⟩ [14,52], where |±⟩ are superpositions of |m_S = ±1⟩ whose relative phase is set by φ_B, the angle between B⊥ and the x̂ axis (see Fig. 1c). Electric fields induce Stark shifts of the |±⟩ energy levels. The splitting between the |±⟩ states is approximately

$$\Delta \approx \frac{\gamma_B^2 B_\perp^2}{D_{gs}} - 2 d_\perp E_\perp \cos(2\varphi_B + \varphi_E),$$

where φ_B and φ_E are the angles of the magnetic and electric fields in the transverse plane as shown in Fig. 1c, D_gs ≈ 2.87 GHz is the zero-field splitting (ZFS), γ_B = 2.8 MHz/G is the electron spin gyromagnetic ratio and d⊥ = (0.17 ± 0.03) MHz/V/µm is the transverse electric field coupling strength [53]. We ignore the coupling to E_z since the longitudinal coupling strength d∥ = (3.5 ± 0.2) × 10⁻³ MHz/V/µm [53] is much smaller. Strain is measured to be negligible in our diamond probe for the scope of this work.

FIG. 1. Experimental setup and NV center. (a) Schematic of the experimental apparatus showing the confocal objective lens, a diamond probe, a U-shaped gold structure and microwave stripline fabricated on a quartz substrate, and an external magnet. The NV center is located at the apex of the diamond tip at a depth of ∼ 40 nm. The tip has a diameter of 300 nm and hovers above the sample. The NV-sample distance is typically ≲ 100 nm. The gap between the two electrodes is 500 nm, and the Au is (150 ± 5) nm thick. (b) NV center in the presence of a perpendicular magnetic field, denoted by B⊥, in the XY plane (yellow). The x̂ direction is defined by the projection of one of the carbon atoms. E⊥ is the electric field component in the XY plane, which causes Stark shifts of the NV spin states. φ_B and φ_E denote the azimuth angles of B⊥ and E⊥ relative to the x̂ axis. (c) Zoom-in of the XY plane. Grey lines represent the projections of carbon atoms. (d) NV electron spin energy levels under B⊥. The |±⟩ states are superpositions of |m_S = ±1⟩ (see main text) and are sensitive to E⊥. d⊥ = (0.17 ± 0.03) MHz/V/µm is the transverse electric field coupling constant and γ = 2.8 MHz/G is the electron spin gyromagnetic ratio. (e) Top panel: optical and microwave sequence for pulsed optically detected magnetic resonance (ODMR) measurements. Bottom panel: pulsed ODMR spectrum showing transitions between |0⟩ and |±⟩ under B⊥ ≈ 73 G. The driving efficiencies depend on the MW field polarization. The |0⟩-to-|+⟩ transition is used throughout this work for electric field sensing.
We choose to work with ¹⁵NV under B⊥ > 70 G [52]. The splitting between the |±⟩ states is therefore much larger than the hyperfine coupling of ≈ 3 MHz [54], and both nuclear spin sublevels (m_I = ±1/2) are sensitive to electric fields. Consequently, the full NV optical contrast contributes to the signal without the need to resolve hyperfine states. We use NVs with moderate dephasing times (T_2^* ∼ 1.5 µs, see Supplementary Note 2A) and apply relatively strong MW power to drive transitions between spin states (π pulse duration ∼ 100 ns). This is in contrast to NV electrometry performed under a weak magnetic field, where typically B⊥ < 20 G [14,15]. A detailed comparison between ¹⁵NV and ¹⁴NV electrometry under different magnetic field regimes can be found in Supplementary Note 1B. Fig. 1e shows a pulsed optically detected magnetic resonance (ODMR) spectrum under B⊥ ≈ 73 G. The transition efficiencies between |0⟩ and |±⟩ depend on the linear polarization of the MW fields (see Supplementary Note 1B). We use the |0⟩-to-|+⟩ transition throughout this work for electric field sensing.

Frequency Dependence of the Electric Field Screening
A strong electric field screening effect is observed at DC and low frequencies (see Supplementary Figure 10), likely caused by mobile charges on the diamond surface [42]. Possible sources include an adsorbed water layer under ambient conditions, electronic trapping states due to near-surface band-bending [43,56], and internal charge defects generated during diamond growth or NV creation [57]. We characterize this screening effect by measuring its frequency dependence. The NV is positioned within a metal gap. An AC reference voltage is applied across the gap, generating an electric field at the NV. "Lock-in" detection is employed to extract both the amplitude and phase of the NV signal relative to the reference signal.

FIG. 2. (c) Dynamical-decoupling-based lock-in detection at high frequencies. The XY-4 sequence is shown as an example. The NV phase signal φ_NV is maximized when the detected field is in-phase with XY-4 (lower left plot), and minimized when out-of-phase by π/2 (lower right plot). (d) NV phase signals φ_NV at 200 kHz, 800 kHz and 1200 kHz. No discernible frequency dependence is observed. For the pulse sequences, see Supplementary Figure 6. (e) Frequency dependence of signal attenuation (top panel) and phase delay (bottom panel), extracted from the lock-in measurements in (b) and (d). The signal is normalized with respect to the expected phase induced by the applied electric field in the absence of screening. The red solid traces represent an ideal high-pass filter with a cut-off frequency of 35.4 kHz. PDD stands for periodic dynamical decoupling [55], which refers to the XY pulse sequences used at high frequencies. (f) A high-pass filter circuit model. The NV is positioned within a metal gap. (g) A circuit model showing capacitive coupling between the diamond surface and a sample.
At low frequencies, we use a Ramsey-based lock-in sequence, where a train of equally spaced Ramsey measurements is synchronized with the reference signal, as shown in Fig. 2a. In each Ramsey measurement, the NV spin is prepared in a superposition of |0⟩ and |+⟩, and then accumulates a phase induced by the quasi-static electric field within the free evolution time τ = 800 ns. The accumulated phase φ_NV can be extracted from the NV fluorescence signal measured after the final π/2 pulse (see Supplementary Note 2C). This phase is proportional to the local electric field, φ_NV = d⊥ E⊥ cos(2φ_B + φ_E) τ, hence each Ramsey measurement samples the electric field strength at the corresponding time step. The first Ramsey measurement starts 4 µs after the reference trigger, and the spacing between neighboring Ramsey measurements is also 4 µs. To have a sufficient number of sampling points, we performed measurements at reference signal frequencies below 50 kHz. As shown in Fig. 2b, the NV Ramsey phase φ_NV oscillates in time, with the amplitude increasing with the reference frequency. Sinusoidal curve fitting extracts the amplitude and phase relative to the reference, which are represented by blue dots in Fig. 2e.
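The sampling-and-fitting step described above can be illustrated with a minimal sketch. It assumes noise-free synthetic data, a known reference frequency, and hypothetical amplitude and phase values; the function name `fit_sinusoid` is illustrative and is not taken from the authors' analysis code.

```python
import math

def fit_sinusoid(times_s, phases_rad, freq_hz):
    """Least-squares fit of phases to a*sin(wt) + b*cos(wt) at a known frequency.

    Returns (amplitude, phase_offset) such that
    phases ~ amplitude * sin(w*t + phase_offset).
    """
    w = 2.0 * math.pi * freq_hz
    s = [math.sin(w * t) for t in times_s]
    c = [math.cos(w * t) for t in times_s]
    # Normal equations of the 2x2 linear least-squares problem.
    ss = sum(x * x for x in s)
    cc = sum(x * x for x in c)
    sc = sum(x * y for x, y in zip(s, c))
    sy = sum(x * y for x, y in zip(s, phases_rad))
    cy = sum(x * y for x, y in zip(c, phases_rad))
    det = ss * cc - sc * sc
    a = (sy * cc - cy * sc) / det
    b = (cy * ss - sy * sc) / det
    return math.hypot(a, b), math.atan2(b, a)

# Ramsey measurements every 4 us, first one 4 us after the trigger (from the text).
freq = 25e3                                   # hypothetical reference frequency, Hz
times = [4e-6 * (k + 1) for k in range(10)]   # one full 40 us period
true_amp, true_phase = 0.3, 0.9               # hypothetical values, rad
data = [true_amp * math.sin(2 * math.pi * freq * t + true_phase) for t in times]

amp, phase = fit_sinusoid(times, data, freq)
print(f"amplitude = {amp:.3f} rad, phase = {phase:.3f} rad")
```

Because the reference frequency is known, the fit is linear in the two quadrature coefficients, so no iterative optimizer is needed in this simplified setting.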
At high frequencies, the lock-in detection is based on a dynamical-decoupling pulse sequence [55], such as XY-4 shown in Fig. 2c. The spacing between neighboring π pulses is set to half the period of the reference signal. More π pulses are inserted to match higher frequencies. Due to the finite coherence time T_2, these measurements were performed above 200 kHz (see Supplementary Note 2A). The NV spin, prepared in a superposition of |0⟩ and |+⟩, accumulates a phase induced by the AC electric field within the evolution time τ. The coherent phase signal φ_NV is maximized when the detected local field is 'in-phase' (ϕ = 0) with the XY-4 sequence, and minimized when it is '90° out-of-phase' (ϕ = π/2). To measure the amplitude and phase of the detected signal relative to the reference, we vary the initial phase offset by sweeping the delay between the reference trigger and the first π/2 pulse. As shown in Fig. 2d, φ_NV is maximized at zero initial phase offset, and minimized near π/2, which indicates very little relative phase between the detected and reference signals. In contrast to the low-frequency regime, here the oscillation amplitude shows no obvious dependence on the frequency. The amplitude and phase extracted by sinusoidal curve fitting are represented by orange dots in Fig. 2e.
Fig. 2e summarizes the results in both the low- and high-frequency regimes. The trend resembles the frequency response of a high-pass filter. The red solid curves represent an ideal high-pass filter with a cut-off frequency f_c of 35.4 kHz, corresponding to an RC time constant of ∼ 30 µs. A simplified circuit model in Fig. 2f illustrates a possible high-pass filter consisting of the resistive diamond surface and the capacitance between the tip and electrodes. For a general sample, Fig. 2g shows the capacitive coupling between the diamond surface and the sample. Screening is significantly reduced at high frequencies due to the finite mobility of charge carriers. The specific cut-off frequency can vary between different probes, since the NV location, the geometry of the tip surface and the diamond purity all affect the screening effect.
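As an illustration of the high-pass trend, the sketch below evaluates the gain and phase lead of an ideal first-order RC high-pass filter, taking only the quoted cut-off f_c = 35.4 kHz as input. The function name is illustrative. Note that 1/f_c ≈ 28 µs, which is the ∼ 30 µs scale quoted in the text, whereas the textbook relation RC = 1/(2πf_c) would give ≈ 4.5 µs; the sketch does not resolve which convention was intended.

```python
import math

F_C_HZ = 35.4e3  # cut-off frequency quoted in the text

def high_pass_response(freq_hz, f_cutoff_hz=F_C_HZ):
    """Gain and phase lead (rad) of an ideal first-order RC high-pass filter."""
    ratio = freq_hz / f_cutoff_hz
    gain = ratio / math.sqrt(1.0 + ratio ** 2)
    phase = math.atan2(1.0, ratio)  # equals atan(f_c / f) for f > 0
    return gain, phase

for f in (5e3, 35.4e3, 200e3, 1200e3):
    g, p = high_pass_response(f)
    print(f"{f / 1e3:7.1f} kHz: gain = {g:.3f}, phase lead = {math.degrees(p):5.1f} deg")
```

In this idealized model the signal is strongly attenuated well below f_c and passes almost unattenuated, with little phase shift, at 200 kHz and above, which mirrors the qualitative behaviour reported in Fig. 2e.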

Spatial Mapping of AC and DC Electric Field Distribution
We now demonstrate imaging of the AC electric field distribution. A single NV at the apex of a diamond nanopillar (300 nm in diameter) scans over a U-shaped gold structure (Fig. 3a). Our spatial resolution is limited by the NV-sample distance, which can be ≲ 100 nm. The diamond is attached to a piezoelectric tuning fork, which oscillates on resonance (∼ 32 kHz) and provides frequency feedback to regulate the distance to the sample. In our AC electric field imaging, the oscillation amplitude is kept small (estimated to be < 1 nm), and no reduction of the NV coherence time is observed. The probe motion is therefore represented by a flat line in Fig. 3a. An AC signal with V_pp = 0.96 V at 250 kHz is applied to the middle electrode and synchronized in-phase with the XY-4 pulse sequence with a free evolution time τ = 8 µs. The accumulated phase φ_NV is proportional to E_ζ, the electric field component along the maximum-coupling axis ζ̂. Fig. 3b plots φ_NV as a function of the AC input amplitude, measured by the NV at a fixed point within the gap. φ_NV grows proportionally, as expected. The cosine and sine values are measured by rotating the phase of the final π/2 pulse (see Supplementary Note 2C). Dashed traces in Fig. 3c show 1D line scans of φ_NV and the corresponding electric field E_ζ at different NV-sample distances (controlled by the probe tilting angle). They are in good agreement with the simulated E_ζ distribution shown by the solid traces. The azimuth and zenith angles of ζ̂ in the simulation are φ = 20° and θ = 45°, respectively (Supplementary Figure 13a). Based on an NV fluorescence rate > 100 kcps, an optical contrast > 20% and a phase accumulation time τ = 8 µs in our experiment, we have achieved an AC electric field sensitivity of 26 mV/µm/√Hz under ambient conditions (see Supplementary Note 2D). A 2D map of the AC electric field is shown in Fig. 3d, and a simulated field distribution at a distance of d = 90 nm is shown in Fig. 3e.
To map DC electric fields, we employ a motion-enabled imaging technique [12] that converts the DC signal to AC in order to overcome the screening effect. More concretely, the probe oscillates with a relatively large amplitude (> 10 nm) such that the NV experiences an AC local field proportional to the local spatial gradient ∂E_ζ/∂x (Fig. 4a). In addition, the bending of the tuning fork induces a rotational motion of the NV axis, giving rise to an AC modulation proportional to the local static field E_ζ. This latter effect cannot be ignored, as shown by the data below. The amplitude of the total motion-enabled AC signal has the form E_amp = (∂E_ζ/∂x) A + β E_ζ, where A is the probe oscillation amplitude and β is an empirical constant capturing the NV axis rotation. To operate at a sufficiently high frequency at which the screening diminishes, we drive the 'clang' mode of the tuning fork at ∼ 190 kHz [58] (Supplementary Figure 11). The sensing pulse sequence is synchronized with the mechanical motion, so the NV accumulates a coherent phase φ_NV = d⊥ E_amp τ. Fig. 4b shows the NV measurement at a fixed point within the gap, where φ_NV is proportional to the applied DC voltage V_dc. Based on our experimental parameters, we achieved a DC field gradient sensitivity of 2 V/µm²/√Hz (Supplementary Note 2D). Fig. 4c compares a 1D line scan of φ_NV along the x̂ axis at V_dc = 16 V with the expected signal deduced from the measurements in Fig. 3c and with the signal simulated by COMSOL. The top panel shows a clear discrepancy between the data and the expected or simulated signal when only the gradient term (∂E_ζ/∂x) A in E_amp is considered. The sign of the discrepancy coincides with the sign of E_ζ shown in Fig. 3c. Including the term β E_ζ in E_amp leads to an improved agreement between data and model (with A = 13 nm and β = -0.03), as shown in the middle panel. Since the probe oscillates at a large amplitude while performing contact-mode AFM scanning, the actual motion depends strongly on the details of the probe-sample engagement. This is challenging to simulate accurately and could account for the remaining discrepancy. A 2D map of φ_NV is shown in Fig. 4d, in reasonable agreement with the simulation result in Fig. 4e.
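The two-term signal model, a gradient term scaled by the oscillation amplitude A plus a rotation term scaled by β, can be sketched numerically as follows. The values A = 13 nm and β = -0.03 are taken from the text; the bell-shaped field profile, its width, and the function names are purely hypothetical stand-ins for the measured or simulated E_ζ(x).

```python
A_UM = 0.013   # probe oscillation amplitude from the text (13 nm), in um
BETA = -0.03   # empirical NV-axis rotation constant from the text

def field_profile(x_um, e0=1.0, width_um=0.25):
    """Hypothetical bell-shaped static field E_zeta(x) in V/um (assumption)."""
    return e0 / (1.0 + (x_um / width_um) ** 2)

def motion_signal(x_um, field=field_profile, amp_um=A_UM, beta=BETA, h=1e-4):
    """AC amplitude E_amp = (dE/dx) * A + beta * E at position x."""
    grad = (field(x_um + h) - field(x_um - h)) / (2.0 * h)  # central difference
    return grad * amp_um + beta * field(x_um)

for x in (-0.3, -0.15, 0.0, 0.15, 0.3):
    total = motion_signal(x)
    grad_only = motion_signal(x, beta=0.0)
    print(f"x = {x:+.2f} um: gradient only = {grad_only:+.4f}, with rotation = {total:+.4f} V/um")
```

For a symmetric profile the gradient-only signal is antisymmetric about the center, while the β term adds a contribution with the sign of the static field, which is the kind of offset the middle panel of Fig. 4c accounts for.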

DISCUSSION
In conclusion, we demonstrated electric field imaging with a single NV at the apex of a diamond scanning tip under ambient conditions. The diamond surface significantly screens external electric fields at low frequencies. We quantitatively measured its frequency dependence using 'lock-in' detection sequences, and therefore revealed the RC time constant of the tip surface (∼ 30 µs). To overcome screening, we introduced a motion-enabled technique and demonstrated spatial mapping of the DC electric field gradients.
Potential improvements to these demonstrations can be achieved as follows. First, AFM operating in tapping mode would avoid direct contact between the probe and sample and allow the probe to oscillate with a larger amplitude more stably. This will give a higher field gradient sensitivity in motion-enabled imaging. Second, given an unknown sample, the DC field distribution along the maximum coupling axis ζ can be reconstructed by carefully pre-calibrating the probe oscillation direction and amplitude using a well-defined sample as described in this paper. Third, using multiple NVs of different orientations or multiple pillars with NVs of different orientations, one can extract both the magnitude and the direction of external electric fields, i.e. vector electrometry [16]. Finally, a higher spatial resolution is achievable by using even shallower NVs and motorized goniometers for a better control of the probe tilting angle.
NV electrometry possesses a unique combination of properties compared to other existing techniques. In this work, we achieved an AC electric field sensitivity of 26 mV/µm/√Hz, a DC electric field gradient sensitivity of 2 V/µm²/√Hz, and sub-100 nm spatial resolution limited by the NV-sample distance, all under ambient conditions. Most other electrometry techniques are based on measuring potentials, and many require cryogenic conditions. For example, a scanning single-electron transistor (SET) [59][60][61][62] is capable of measuring microvolt local potentials; however, this remarkable sensitivity requires low-temperature operation and its spatial resolution is limited by the tip size (> 100 nm). Kelvin probe force microscopy (KPFM) [63,64] and electrostatic force microscopy (EFM) [65] can achieve sub-10 nm spatial resolution and operate under ambient conditions; however, they are not optimal for quantitative electric field measurements. The NV center is therefore a valuable addition as a unique electric field sensor with complementary advantages. We also highlight that NVs have unparalleled sensing versatility with a broad operating frequency range. Integrating electrometry and magnetometry into a single scanning probe will open up exciting opportunities in imaging nanoscale phenomena.

Diamond scanning probe and AFM control
The diamond probe was made from an electronic-grade CVD diamond purchased from Element Six, with a natural abundance (1.1%) of 13 C impurity spins. The probe is of ∼ 50 µm × 55 µm × 125 µm in dimension. Each probe has seven tips in a row with a spacing of 7 µm. Details of the multiple-pillar probe design and fabrication process are described in [8,66,67].
The probe is attached to one prong of a quartz tuning fork using optical adhesive (Thorlabs NOA63). Two manual goniometers (Edmund) control its tilting angle and the NV-sample distance. The AFM contact between probe and sample is controlled by an attocube SPM controller (ASC500). Piezoelectric nanopositioners (attocube ANPxyz101 and ANPxyz100) were used for sample scanning. More details can be found in Supplementary Note 2E.

Optical setup
NV experiments were performed on a home-built confocal laser scanning microscope. A 532 nm green laser (Cobolt Samba 100), focused by a 100×, NA=0.7 objective (Mitutoyo M Plan NIR HR), was used to initialize the NV spin to the |m S = 0 state and generate spin-dependent photoluminescence for optical readout. An avalanche photodiode (APD, Excelitas Technologies Photon Counting Module SPCM-AQRH-13) was used to measure the NV fluorescence rate.

Quantum control setup
The microwaves were generated from a signal generator (Rohde & Schwarz SGS100A 6GHz SGMA RF Source) and amplified by a MW amplifier (Amplifier Research 30S1G6). All quantum measurements were performed on the Quantum Orchestration Platform (Quantum Machines). An Operator-X (OPX) generated control pulse sequences, output AC input voltages (V pp = 0.96 V), and measured the photon counts. DC input (V dc = 16V) is provided by a voltage source (Yokogawa GS200).

Electrostatics simulation
The electrostatic simulations were performed with the finite-element calculation package COMSOL. To model the geometry, we use real device dimensions by importing the lithography design into the software. 2D simulations produce the plots in Fig. 3c and Fig. 4c. 3D simulations produce the plots in Fig. 3e and Fig. 4e.

The different nuclear spins carried by ¹⁵N and ¹⁴N cause subtle and important differences in implementing electric field sensing, as summarized in Table I. For ¹⁵N, the nuclear spin is I = 1/2, and the hyperfine coupling constants are A∥ = 3.65 MHz and A⊥ = 3.03 MHz. For ¹⁴N, the nuclear spin is I = 1, and the hyperfine coupling constants are A∥ = A⊥ = 2.2 MHz [3,4]. Both require a (nearly) perpendicular bias magnetic field for electric field sensing. The full NV Hamiltonian of ¹⁴NV has an additional term due to the quadrupole interaction, Q I_z², where Q ≈ -5.01 MHz. Under a perpendicular magnetic field, the energy splitting between |+⟩ and |−⟩ is roughly γ_B² B⊥² / D_gs. We discuss two magnetic field regimes.
When γ_B² B⊥² / D_gs < A⊥,∥, i.e. B⊥ < 33 G, the hyperfine splitting is greater than the |±⟩ splitting. At θ_B = 90°, where θ_B is the angle between B and the NV axis, only the m_I = 0 state is sensitive to electric fields, as shown in Figure 2a. Intuitively, for the m_I ≠ 0 states, the nuclear spin exerts a hyperfine field on the electron spin, so the total magnetic field seen by the electron spin is no longer perpendicular to the NV axis. The external bias magnetic field needs to be slightly off from θ_B = 90° in order for the m_I ≠ 0 states to be electric field sensitive, as shown in Figure 2a for ¹⁴NV and Figure 2b for ¹⁵NV. For electric field sensing to work in this weak magnetic field regime, the NV dephasing time T_2^* needs to be sufficiently long to resolve the nuclear spin sublevels. In addition, the NV fluorescence contrast, and hence the signal-to-noise ratio, is reduced by 2/3 for ¹⁴NV and by 1/2 for ¹⁵NV.
When γ_B² B⊥² / D_gs > A⊥,∥, i.e. B⊥ > 33 G, the |±⟩ splitting is greater than the hyperfine splitting, and all the m_I states are sensitive to electric fields. This is because the hyperfine field that the nuclear spin exerts on the electron spin is much smaller than the external bias field, so θ_B ≈ 90° holds for all nuclear spin states. As shown in Figure 2c and d, a 5 V/µm electric field causes energy shifts of all the nuclear spin states.
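The boundary between the two regimes follows from equating the approximate splitting γ_B² B⊥² / D_gs with the hyperfine coupling. A minimal numerical check, using only constants quoted in the text (the function names are illustrative):

```python
import math

D_GS_MHZ = 2870.0        # zero-field splitting, MHz (from the text)
GAMMA_MHZ_PER_G = 2.8    # electron gyromagnetic ratio, MHz/G (from the text)

def pm_splitting_mhz(b_perp_gauss):
    """Approximate |+>/|-> splitting, gamma^2 * B_perp^2 / D_gs, in MHz."""
    return (GAMMA_MHZ_PER_G * b_perp_gauss) ** 2 / D_GS_MHZ

def crossover_field_gauss(hyperfine_mhz):
    """Field at which the |+>/|-> splitting equals a given hyperfine coupling."""
    return math.sqrt(hyperfine_mhz * D_GS_MHZ) / GAMMA_MHZ_PER_G

b_cross = crossover_field_gauss(3.03)   # 15N, A_perp = 3.03 MHz
split_73 = pm_splitting_mhz(73.0)       # bias field used for the ODMR spectrum
print(f"crossover field: {b_cross:.1f} G")
print(f"splitting at 73 G: {split_73:.1f} MHz")
```

With these inputs the crossover lands near 33 G, and at 73 G the splitting is several times the ≈ 3 MHz hyperfine coupling, consistent with operating in the strong-field regime.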
However, in the strong-B regime, there is another subtle difference between ¹⁵NV and ¹⁴NV due to microwave (MW) driving. The MW field component parallel (perpendicular) to the bias magnetic field efficiently drives the transition between |0⟩ and |+⟩ (|−⟩). Normally, the MW field drives transitions between electron states while the nuclear spin state remains unchanged. However, under a nearly perpendicular magnetic field, the nuclear spin state becomes electron spin dependent [2]. Transitions between all pairs of states can become allowed to some degree, and the transition amplitude is highly angle dependent and nuclear spin dependent. For example, between |0⟩ and |+⟩ in ¹⁴NV, there are in total 3 × 3 = 9 possible transitions. Consequently, there is no well-defined π pulse duration. After applying a MW pulse, the electron spin can end up in different transverse planes of the Bloch sphere, which results in a short T_2^* when measuring Rabi oscillations. Comparing Figure 3 and Figure 4, this problem is more severe in ¹⁴NV than in ¹⁵NV. As shown in Figure 4c and d, when driving transitions between |0⟩ and |+⟩ at θ_B ≈ 90°, only one transition is allowed for each nuclear spin sublevel.

A. Sample and NV Characteristics
The U-shaped gold structure and the RF stripline were made of Au/Ti (140nm / 12nm in thickness) by thermal evaporation on a quartz substrate. The RF stripline is about 20 µm in width.
The diamond probe was made from an electronic-grade CVD diamond, with a natural abundance (1.1%) of 13 C impurity spins. NV centers are created by 15 N ion implantation followed by vacuum annealing. They are estimated to be <40 nm deep from the surface. Details of probe fabrication can be found in [5].
We characterized the NV under a bias magnetic field of ∼ 100 G oriented perpendicular to the NV axis. As shown in Figure 1e of the main text, due to the microwave field polarization, the |0⟩-to-|+⟩ transition is more efficiently driven than the |0⟩-to-|−⟩ transition. Figure 5a shows the Rabi oscillation between |0⟩ and |+⟩. Figure 5b shows the Ramsey fringes measured with a detuning of ∼ 1 MHz relative to the |0⟩ ↔ |+⟩ transition frequency. The extracted dephasing time T_2^* is about 1.5 µs. We also measured the coherence time of the |0⟩-|+⟩ superposition state using spin-echo and dynamical decoupling sequences with multiple equally spaced π pulses (Figure 6) [6]. As shown in Figure 5c, T_2 increases with the number of π pulses.
Lock-in detection based on dynamical decoupling sequences was used to characterize the screening effect at high frequencies (see Figure 2c-d of the main text). Due to the finite coherence time shown in Figure 5c, measurements were performed above 200 kHz. An XY-4 sequence with τ = 8 µs was used to produce the data shown in Figure 3 and Figure 4 of the main text.
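The phase selectivity of this dynamical-decoupling lock-in can be illustrated with a small numerical sketch. It assumes ideal, instantaneous π pulses spaced by half the signal period, with the first pulse a quarter period after the start of the evolution, and a sinusoidal field cos(2πt/T + ϕ); time is measured in units of the period T. The function name and these timing conventions are assumptions for illustration, not the authors' pulse program.

```python
import math

def lockin_weight(phase_offset_rad, n_pulses=4, steps_per_period=1000):
    """Normalized phase accumulated from cos(2*pi*t + offset) under n ideal
    pi pulses at t = 1/4, 3/4, 5/4, ... (time in units of the signal period)."""
    total_time = n_pulses / 2.0                 # e.g. 2 periods for XY-4
    n_steps = int(total_time * steps_per_period)
    dt = total_time / n_steps
    acc = 0.0
    for k in range(n_steps):
        t = (k + 0.5) * dt                      # midpoint rule
        # Each pi pulse flips the sign of the subsequently accumulated phase.
        n_flips = min(int(math.floor(2.0 * t + 0.5)), n_pulses)
        sign = -1.0 if n_flips % 2 else 1.0
        acc += sign * math.cos(2.0 * math.pi * t + phase_offset_rad) * dt
    return acc / total_time

for offset in (0.0, math.pi / 4, math.pi / 2):
    print(f"offset = {offset:.3f} rad -> weight = {lockin_weight(offset):+.4f}")
```

Under these assumptions the in-phase weight is 2/π and the quadrature weight vanishes. For a 250 kHz signal (period 4 µs), four pulses correspond to a total evolution time of 8 µs, matching the τ used for the XY-4 data.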

B. Experimental setup and control
Experiments were performed on a home-built confocal laser scanning microscope with AFM scanning capability [7]. The AFM contact between probe and sample is controlled by an attocube SPM controller (ASC500). Piezoelectric nanopositioners (attocube ANPxyz101 and ANPxyz100) were used for sample scanning. The quantum measurements were performed on the Quantum Orchestration Platform, a universal quantum control platform developed by Quantum Machines. An Operator-X (OPX) generated control pulse sequences and measured the photon counts. Microwaves are generated by a Rohde & Schwarz SGS100A source and amplified (Amplifier Research Model 30S1G6). NVs are optically addressed with an NA = 0.7 air objective from above, as depicted in Figure 1a of the main text. A 532 nm laser with a power of ∼ 1 mW, pulsed by an acousto-optic modulator (AOM), was used to initialize and read out the NV spin states. An avalanche photodiode (APD) collects the red photons from the NV fluorescence and sends the signal to the OPX.
where F_1, F_2, F_3 and F_4 are the NV fluorescence rates measured by the green laser pulse at the end of the four sequences in Figure 7, respectively.

D. Electric field sensitivity
We first analyze the AC electric field sensitivity η E detected by the sequence as shown in Figure 3a of the main text. The sensitivity is determined by the NV fluorescence count rate F (unit: counts per second), the optical contrast C between the states |0 and |+ , the total phase accumulation time τ , the APD readout window T r , initialization time t ini as well as the electric field coupling strength d ⊥ .
Under an electric field E_ζ, we expect an NV phase accumulation φ_NV = 2π d⊥ E_ζ τ (the factor 2π converts the angle unit to rad). After a π/2 pulse, this phase signal is converted to the probability of being in the |0⟩ state, denoted by P. With an initial (π/2)_x pulse and a (π/2)_y pulse at the end,

$$P = \frac{1}{2}\left(1 + \sin\varphi_{NV}\right).$$

For a small electric field δE along ζ̂, the total number of photons collected due to δE after N_avg averages is

$$\delta S = N_{avg}\, F\, T_r\, C\, \pi d_\perp \tau\, \delta E.$$

The photon shot noise is

$$\delta N = \sqrt{N_{avg}\, F\, T_r}.$$

Within 1 second of integration time, N_avg = 1/(t_ini + τ). Thus we obtain the signal-to-noise ratio (SNR) within 1 s of integration time:

$$\mathrm{SNR} = \frac{\pi d_\perp \tau\, C \sqrt{F\, T_r}}{\sqrt{t_{ini} + \tau}}\, \delta E.$$

The sensitivity η_E is defined as the minimal δE that can be detected with SNR = 1 within 1 s, hence

$$\eta_E = \frac{\sqrt{t_{ini} + \tau}}{\pi d_\perp \tau\, C \sqrt{F\, T_r}}. \tag{5}$$

In AC electric field imaging, we have F = 100 kcounts/s, τ = 8 µs, C ≈ 0.2, T_r = 200 ns, and t_ini = 2 µs. Plugging these experimental parameters and d⊥ = 0.17 MHz/V/µm into Equation (5), we obtain η_E = 26 mV/µm/√Hz = 260 V/cm/√Hz, which is comparable to the sensitivity obtained with a single NV in bulk diamond [1]. In our motion-enabled DC electric field imaging, the directly measured signal is dominated by the local spatial gradient, so we examine the electric field gradient sensitivity η_gradient = η_E / A, where A is the tip oscillation amplitude. In our experiments, A = 13 nm, therefore η_gradient = 2000 mV/µm²/√Hz = 2 V/µm²/√Hz.
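The quoted sensitivities can be checked numerically. The sketch below assumes the shot-noise-limited form η_E = √(t_ini + τ) / (π d⊥ τ C √(F T_r)), which follows from the steps described above (slope of P at small phase, photon shot noise, N_avg = 1/(t_ini + τ)); this form is a reconstruction, and the function name is illustrative. All parameter values are those listed in the text.

```python
import math

def ac_field_sensitivity(count_rate_hz, contrast, tau_s, readout_s, init_s, d_perp_hz_per_v_um):
    """Shot-noise-limited sensitivity in V/um/sqrt(Hz) (assumed functional form)."""
    slope = math.pi * d_perp_hz_per_v_um * tau_s * contrast   # dP*C per unit field
    photons_per_shot = count_rate_hz * readout_s
    return math.sqrt(init_s + tau_s) / (slope * math.sqrt(photons_per_shot))

eta_e = ac_field_sensitivity(
    count_rate_hz=100e3,        # F = 100 kcounts/s
    contrast=0.2,               # C
    tau_s=8e-6,                 # phase accumulation time
    readout_s=200e-9,           # T_r
    init_s=2e-6,                # t_ini
    d_perp_hz_per_v_um=0.17e6,  # d_perp = 0.17 MHz/(V/um)
)
eta_gradient = eta_e / 0.013    # oscillation amplitude A = 13 nm = 0.013 um

print(f"eta_E        = {eta_e * 1e3:.1f} mV/um/sqrt(Hz)")
print(f"eta_gradient = {eta_gradient:.2f} V/um^2/sqrt(Hz)")
```

With these inputs the result is close to 26 mV/µm/√Hz for the AC field and close to 2 V/µm²/√Hz for the gradient, in line with the values stated in the text.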

E. Multiple-pillar diamond probe and scanning details
Fabrication of the diamond probe follows the procedure described in [5]. To improve both the probability of finding a good single NV in a probe and the scanning stability, we fabricated multiple pillars on a single probe, as shown in Figure 8 and Figure 9. There are 7 pillars in a row. The outer two pillars have a diameter of ∼800 nm, and the middle five are 300 nm in diameter. During the scanning experiments, the probe is slightly tilted such that one of the outer thick pillars is in direct contact with the substrate and performs AFM scanning, while one of the middle thin pillars hovers above the sample of interest at a distance of 20-40 nm and performs the sensing task. The tilting angle θ is controlled by a manual goniometer (Edmund). This method also prevents the sensing pillar from picking up dirt or scratching the sample, and the thick AFM scanning pillar is less likely to break during harsh, hours-long scanning. Overall, this multi-pillar scanning method improves the lifetime of a diamond probe.
FIG. 8. (a) The probe is glued to the right leg of the tuning fork using a quartz rod. (b) Side view of the probe with multiple pillars. The outer two pillars are ∼800 nm in diameter. One of them is in direct contact with the substrate and performs AFM scanning. The middle five pillars are 300 nm in diameter. One of them is close to the sample of interest and performs NV quantum sensing. The NV-sample distance is hence controlled by the tilting angle θ, which is calibrated carefully.
The tilting angle θ is carefully calibrated in advance. We start with a large θ and perform a 1D scan over two metal gaps to measure the signal profile. Based on the spatial resolution and signal strength, we gradually decrease θ and repeat the 1D scan along the same line. As shown in Figure 3c of the main text, this process is repeated multiple times until the signal profile is consistent with the signal simulated by COMSOL at a distance of <100 nm. More details on the COMSOL simulation are included in Section III.
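
The dependence of the hover height on the tilting angle can be illustrated with a simple geometric sketch. This is not the calibration procedure used here, which relies on comparing scans with simulation. It assumes rigid pillars of equal height, a flat surface with sample topography ignored, and rotation about the contact point of the AFM pillar. The pillar spacing is not specified in the text, so the value below and both function names are hypothetical.

```python
import math

def hover_height_nm(pillar_spacing_nm, tilt_deg):
    """Height of the sensing pillar's apex above a flat surface.

    Assumes rigid pillars of equal height and rotation about the contact
    point of the AFM pillar. `pillar_spacing_nm` is the lateral distance
    between the contact pillar and the sensing pillar (an assumed input).
    """
    return pillar_spacing_nm * math.sin(math.radians(tilt_deg))

def tilt_for_height_deg(pillar_spacing_nm, target_height_nm):
    """Inverse of hover_height_nm: tilt angle that gives a target hover height."""
    return math.degrees(math.asin(target_height_nm / pillar_spacing_nm))

# Hypothetical pillar spacing; the text does not give this number.
spacing_nm = 2000.0
for target_nm in (20.0, 40.0):  # hover range quoted in the text
    print(target_nm, round(tilt_for_height_deg(spacing_nm, target_nm), 2))
```

Under these assumptions, a hover height of tens of nanometres corresponds to a tilt of order one degree for micrometre-scale pillar spacing, which is why θ is refined iteratively against the measured signal profile rather than set from geometry alone.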

F. Motion-enabled DC imaging
As mentioned in the main text, we observed a strong electric screening effect at DC and low frequencies. Figure 10 shows the pulsed-ESR spectra measured in the presence of DC voltages applied across the 500 nm gap; no discernible shifts were observed. To overcome the electric screening effect at low frequencies, we mechanically oscillate the diamond probe when imaging DC electric fields. The probe is attached to one of the legs of a quartz tuning fork (Figure 8 and Figure 9), and a resonant electric signal excites the motion of the tuning fork (Digi-Key 1123-ND) through the piezoelectric properties of quartz. The fundamental mode of the tuning fork is at about 32 kHz, and its clang mode is at about 190 kHz, roughly 6 times higher than the fundamental (Figure 11). As shown in Figure 2 of the main text, the electric screening effect is still severe at 32 kHz but significantly reduced at about 190 kHz; hence we excite the tuning fork at its clang mode.
FIG. 11. Two modes of a tuning fork. The fundamental mode is used in AC imaging, and the 'clang' mode is used in DC imaging.
The probe motion amplitude increases with the excitation voltage $V_{\mathrm{pp}}$. However, the quantitative relationship varies between tuning forks and needs to be recalibrated whenever the probe is changed. In our experiments, the maximum probe motion amplitude is <20 nm, as estimated by comparing the detected signal distribution with COMSOL simulation.
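
The conversion of a static field into an AC signal by probe motion can be illustrated numerically. The sketch below is not the analysis code used in this work. The field profile, the rest position, and the function names are illustrative assumptions. It extracts the first-harmonic amplitude of the field seen by a probe oscillating as x(t) = x0 + A cos(2πft) in a static field E(x), and compares it with the first-order estimate A dE/dx.

```python
import math

def first_harmonic_amplitude(field, x0, amplitude, n_samples=2000):
    """Amplitude of the cos(2*pi*f*t) component of field(x0 + A*cos(2*pi*f*t)).

    `field` is any callable E(x); `x0` and `amplitude` share one length unit.
    """
    acc = 0.0
    for k in range(n_samples):
        phase = 2.0 * math.pi * k / n_samples  # one full oscillation period
        acc += field(x0 + amplitude * math.cos(phase)) * math.cos(phase)
    return 2.0 * acc / n_samples

# Hypothetical static field profile (V/um) versus lateral position x (um).
def example_field(x):
    return 1.0 / (1.0 + (x / 0.25) ** 2)

def example_gradient(x):
    # Analytic derivative of example_field, in V/um^2.
    return -2.0 * x / 0.25 ** 2 / (1.0 + (x / 0.25) ** 2) ** 2

x0 = 0.15         # probe rest position in um (assumed)
amplitude = 0.02  # 20 nm, the upper bound on the motion amplitude quoted above
numeric = first_harmonic_amplitude(example_field, x0, amplitude)
linear = amplitude * example_gradient(x0)
print(numeric, linear)
```

For motion amplitudes that are small compared with the length scale over which the field varies, the two values agree closely, so the AC signal seen by the NV is set by the local field gradient along the oscillation direction multiplied by the motion amplitude.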

III. COMSOL SIMULATION
The electrostatic simulations are performed with the finite-element package COMSOL.

A. Dielectric screening
First, we considered dielectric screening due to the non-negligible relative permittivity of diamond, which is ∼5.7. Our diamond scanning tip has a circular flat top with a diameter of 300 nm, and the NV is located ∼40 nm from the surface. Figure 12 visualizes the simulated electric fields inside and outside the diamond. $E_{x,y}$ and $E_z$ at the location of the NV are both reduced to ∼41% of the field magnitudes in air. In the subsequent discussion, we take this reduction into account when comparing the simulation with the experimental data, without mentioning it explicitly.

B. Simulation of AC and DC electric field distributions
To simulate our device, we used the actual device dimensions to model the geometry. In the 3D simulations, the 2D lithography design was imported and extruded to a thickness of 150 nm to create the 3D structures. The NV electric field sensitivity is maximized along a particular direction, which depends on the NV orientation and the bias magnetic field direction. As shown in Figure 1 of the main text, the energy shift caused by the electric field is $\propto E_\perp \cos(2\phi_B + \phi_E)$, so the direction of maximum sensitivity is at $\phi_E = -2\phi_B$, denoted by $\zeta$. In the COMSOL simulation, we first calculated the 2D distributions of the three electric field components $E_x$, $E_y$ and $E_z$ in air at different distances $h$ from the sample surface, then calculated the $\zeta$ component: $E_\zeta = (E_x \cos\phi + E_y \sin\phi)\sin\theta + E_z \cos\theta$, where $\phi$ is the azimuth angle and $\theta$ is the zenith angle, as depicted in Figure 13. In imaging AC electric fields, the signal is modulated at ∼250 kHz. In imaging DC electric fields, both the probe and the signal oscillate at ∼190 kHz. $\phi$, $\theta$ and the distance $h$ are the three free parameters used to produce Figure 3e of the main text, where we used $\phi = 20^\circ$, $\theta = 45^\circ$, $h = 90$ nm. In DC imaging, an XY4-based AC sensing pulse sequence is synchronized with the motion of the diamond probe. Therefore, the coherent phase accumulation is proportional to the product of the electric field gradient along the probe oscillation direction and the oscillation amplitude. These