Online adaptive quantum characterization of a nuclear spin

The characterization of quantum systems is both a theoretical and technical challenge. Theoretical because of the exponentially increasing complexity with system size and the fragility of quantum states under observation. Technical because of the requirement to manipulate and read out individual atomic or photonic elements. Adaptive methods can help to overcome these challenges by optimizing the amount of information each measurement provides and reducing the necessary resources. Their implementation, however, requires fast-feedback and complex processing algorithms. Here, we implement online adaptive sensing with single spins and demonstrate close to photon shot noise limited performance with high repetition rate, including experimental overheads. We further use fast feedback to determine the hyperfine coupling of a nuclear spin to the nitrogen-vacancy sensor with a sensitivity of 445 nT ffiffiffiffiffi Hz p 1 . Our experiment is a proof of concept that online adaptive techniques can be a versatile tool to enable faster characterization of the spin environment.


INTRODUCTION
Quantum sensing uses the readout of quantum systems to obtain information about the physical world. The accuracy depends on the sensor response to the parameter of interest and the ability to collect and process classical information. The ultimate limits to the sensitivity of quantum sensors can be obtained from the maximum speed that a quantum state can be driven to an adjacent state in response to the signal 1,2 and the information content of probabilistic readout 3,4 .
In general, attaining the ultimate sensitivity limits requires prior knowledge about the parameter to be measured, for example, the range of values it can take. Protocols that utilize adaptive measurements can optimize the acquisition of supportive prior information 5 and thus improve the performance of quantum sensors 6 by operating them in their most sensitive regime. An improvement in nonadaptive (NA) techniques is obtained by updating each sensing sequence based on the results of previous measurements, in concert with classical processing algorithms. In particular, adaptiveness allows sensing to be performed in environments with changing conditions and features higher noise resilience [7][8][9][10] .
In the context of measuring optical phases in interferometry, experiments using NA 11 and adaptive 12 phase estimation algorithms (PEAs) have demonstrated nearly Heisenberg-limited sensitivity. Likewise, for magnetic field sensing, similar techniques have been transferred to solid-state spins. The negatively charged nitrogen-vacancy (NV) center in diamond 13 has raised particular attention as atomically sized sensor 14,15 with potential application as a quantum register element for information processing 16 . For the characterization of larger registers that incorporate nearby nuclear spins, it is key to efficiently exploit the available resources, for example, the measurement time, which can exceed days 17 . However, until now, the focus of optimized algorithms has been on improved estimation of the phase response of single spins to constant (direct current (DC)) magnetic fields 5,[18][19][20] . Here, we extend upon previous work to implement multiparameter estimation with adaptive measurements and use this methodology for fast characterization of the sensor's quantum environment. We measure the parallel hyperfine coupling of a single NV center to a single 13 C nucleus with 4.4 kHz uncertainty in 16.0 s. Recently, a one-dimensional (1D) parameter estimation technique has been used for characterization of nuclear spins 21 near to an NV center. However, this experiment required prior knowledge on the hyperfine coupling in order to perform conditional two-qubit gates. Our work is complementary as we require less precharacterization but only measure one of the hyperfine tensor components.
The experiments described here are performed with single NV centers in diamond. Optical readout of the NV spin state (eigenstates are denoted here as |0〉, |1〉) allows induced shifts in the spin population to be detected as a highly sensitive probe of magnetic fields 22 . For sensing of static (DC) magnetic fields, a Ramsey sequence is used to map the sensor-phase response to a magnetic field intensity B ¼ jBe Z j into the population, which is subsequently read out optically. The symmetry axis e Z of the NV determines the projection of the magnetic field vector. The accuracy of Ramsey interferometry for estimating B within a total measurement time T m is limited by the photon shot noise to where C ≤ 1 is a dimensionless parameter describing the readout efficiency, γ ¼ 2π 2:8 MHz G À1 is the NV's gyromagnetic ratio and T Ã 2 is the phase coherence time of the NV. However, this accuracy is only reached when using a Ramsey sequence with a phase evolution time in which the sensor interacts with the magnetic field for a duration of τ $ T Ã 2 =2 23 . By normalization to ffiffiffiffiffi ffi T m p and considering experimental overheads, the standard measurement sensitivity (SMS) limit η SMS ¼ e where t ov ¼ T m À τ is the time of all overhead associated with the experiment (details in "Methods" section). Due to phase wrapping of the sensor, measuring at a fixed, maximum τ creates an ambiguity because multiple magnetic fields result in the same sensor output. Only if the field is known before hand to fall within ½B À π 2γτ ; B þ π 2γτ can it be measured uniquely. This fixes the dynamic range of the measurement to . Without prior information on the magnetic field, the accuracy of an unambiguous Ramsey measurement with a requisite shortened interaction time decreases drastically, as can be seen from Methods Equation (2).
Recently, a class of sensing protocols based on quantum-phase estimation 24 has been suggested [18][19][20]24 to overcome the trade-off that standard approaches for sensing DC magnetic fields suffer: either high sensitivity but limited dynamic range or vice versa. So far, the best-known protocols adapt the measurement settings conditioned on earlier results of the sensing sequence. These require either cryogenic readout conditions 5,25 or, at room temperature, have used post-processing on offline experimental datasets 26 . We implement a recently proposed magnetic field learning (MFL) algorithm 26 that changes the Ramsey sensing sequence applied to a single NV center online and based on results of previous measurements (Fig. 1a). Each updated Ramsey pulse sequence is selected from a list of 256 unique sequences that are applied with an arbitrary waveform generator (AWG).
The algorithm is implemented as follows. An underlying Bayesian model encodes the probability of different magnetic field values in a prior distribution P(B) (Fig. 1a, top middle). In every epoch, a Ramsey experiment and sensor readout are performed multiple times before a subsequent Bayesian update. According to Bayes' rule, the likelihood function LðE ¼ f0 or 1gjB; τÞ describing the probability given the observed experimental outcome (E = |0〉 or E = |1〉, only the two NV eigenstates are considered) is multiplied to P(B) in order to obtain a posterior P(B|E). Thus, one of two possible likelihood functions is chosen determined by comparing the number of detected photons against a known threshold value (majority voting) 27 , as illustrated in Fig. 1a (top right). Due to inefficiencies in the optical readout, on average a photon is recorded only every~17 experiments (≙ average photon number in the NV superposition state per readout I~0.059). We repeat each Ramsey experiment with the same experimental parameters n phot /I times, obtaining on average n phot photons per epoch. The Ramsey sequence for the next epoch i + 1 is chosen by calculating the desired phase evolution time based on the heuristic rule τ iþ1 % 1=σ Pi to ensure that the period of the likelihood function is in the same order as the current uncertainty σ Pi of the posterior in epoch i. The selection, switching, and loading from one of the 256 sequences has no associated overhead (<100 ns duration); however, the feedback loop including calculation of the Bayesian update, performed on a computer, takes~1 ms. Importantly, MFL allows magnetic field determination over a large range without a trade-off in sensitivity. This behavior follows from a heuristic that selects evolution times that optimize the Fisher information extracted from a typical MFL run. The choice of a τ that maximizes the Fisher information of a single measurement depends on the actual magnetic field value to find the steepest slope of the Ramsey fringe 23 . Consequently, the choice of exactly τ ¼ T Ã 2 =2 might be unfavorable, if the slope is as low as illustrated in Fig. 1b (middle) or if environmental conditions change 28 . While our algorithm does not perform optimization for each individual magnetic field value, statistically our heuristic enables us to obtain a sensitivity close to the Crame-Rao bound averaged across all magnetic field values 28 .

RESULTS AND DISCUSSION DC magnetic field estimation
In Fig. 2 we show results of adaptive measurements to estimate the DC magnetic field as a single parameter. In Fig. 2a, we plot four single runs of the algorithm at different DC magnetic fields, illustrating the high dynamic range. In every epoch, a Bayesian update is iteratively applied. Hence, the uncertainty of the magnetic field decreases and convergence to the correct field is reached.
Qualitatively, the increased dynamic range of MFL allows unambiguous estimation of magnetic fields over a range of 1.8-169 μT, while maintaining a sensitivity close to the theoretical SMS. We quantify this behavior in Fig. 2b, where the final sensitivity reached for averaged runs at different magnetic fields is shown. As expected, the sensitivity is nearly constant over the whole accessible magnetic field range and we reach sensitivities close to the SMS over the whole dynamic range. In this measurement, we demonstrate a dynamic range of 110× (lowest field B = 1.43 μT), or at the maximum magnetic field of 153.6 μT, an improvement in sensitivity of 17× over an unambiguous Ramsey experiment with fixed phase evolution time and no prior knowledge of the magnetic field range. Details of the sensitivity calculations for these Ramsey experiments can be found in the "Methods" section. We confirm the dynamic range by simulations in Supplementary Fig. 1, showing only weak modulation of sensitivity with the magnetic field. In our experiment, the maximum detectable magnetic field is limited due to the restricted memory of the AWG as the number of storable experiments with distinct τ is finite. We store 256 Ramsey sequences with linearly spaced τ j , where τ j + 1 = τ j + Δτ. This implies a Shannon-Nyquist sampling limit of f max = 1/(2Δτ), also shown in Fig. 2b. When choosing magnetic fields close to or bigger than 2πf max /γ, the MFL fails to obtain correct estimates. Neglecting the memory constraints of the AWG, the limiting , which defines the standard measurement sensitivity (SMS, red dashed line). The average photon number is n phot~6 5 for all MFL runs. c Magnetic field uncertainty σB versus the accumulated phase evolution time t Φ for an on average number n phot~9 photons per epoch. A NA-PEA simulation (lilac, optimized parameters G = 10, F = 1, n phot = 180) with similar readout conditions is plotted (described in the main text). The black line represents the Heisenberg uncertainty limit for a single experiment with phase evolution t Φ and no decoherence with identical readout efficiency to other datasets, the orange dashed line is the theoretical uncertainty of an unambiguous Ramsey experiment repeated at τ = τ 0 realizing a dynamic range [−B max , B max ]. d The magnetic field sensitivity η versus the accumulated running time including all experimental overheads t tot for MFL runs with several selected average photon numbers. The effective sensitivity η eff`> η, which considers the noisy averaged readout, is optimized at an intermediate photon number. Dashed lines represent simulation data for MFL an NA-PEA, respectively. Deteriorating sensitivity for a longer running time in NA-PEA is due to phase evolution times well beyond T Ã 2 , which one would avoid in a sensor application. The MFL data in b-d represents averaged runs (~50 repetitions per trace or data point) with the same parameters and applied magnetic field. In a, c, d the allowed field range in the initial prior is 0; property of the dynamic range is the Rabi frequency Ω rabi of the π/2 pulses. If the magnetic field saturates γB max >Ω rabi , it is not possible to resonantly control the NV spin anymore. For reported Rabi frequencies of~100 MHz with NV sensors, B max~3 500 μT would be expected. Controlling the frequency and phase of the microwave pulses in addition to the Ramsey phase evolution time could be used to further increase the dynamic range. On the other hand, the smallest detectable signal for both classical Ramsey experiments and phase estimation-based protocols is given by Eq. (1), placing the maximum reachable dynamic range for our experimental conditions to $ 4:2 10 4 ffiffiffiffiffi ffi Hz p À1 . We note that measuring such small fields is frequently challenging due to stability requirements on the test field. For many applications, the sensor readout rate is an important device characteristic since this defines how quickly information is obtained from the device. While PEAs for magnetic field sensing 6,18-20 have combined a sensitivity close to the SMS with a high dynamic range, the experiments differ in how quickly the final sensitivity can be reached. To obtain a comparable metric, in Fig. 2c we plot the magnetic field uncertainty against the accumulated phase evolution time. Three different regimes can be observed: At the beginning, the MFL acquires coarse knowledge by choosing short phase evolution times in the Ramsey sequence. Thus, the magnetic field uncertainty decreases only slowly. Subsequently, after the field range has been narrowed, longer τ is chosen and the uncertainty decreases sharply. When the MFL starts to request phase evolution times close to the dephasing time of the NV, a scaling kink is observed. As τ cannot be increased anymore, the magnetic field uncertainty improves only with statistical / 1= ffiffiffiffi ffi t Φ p behavior. We define the time T rep of the scaling kink as the repetition time of the protocol and plot its dependence against the average number of photons n phot in Supplementary Fig. 2a. Clearly, fewer photons per epoch allow faster repetitions rates and we achieve a minimal T rep = 0.24 s for n phot = 6.
Reaching sensitivities close the SMS over a high dynamic range is a distinct benefit of the MFL. In Fig. 2d we observe that sensitivity improves with the accumulated running time (accumulated phase evolution time plus all overheads), as expected. The statistical / ffiffiffiffi ffi t Φ p scaling after the kink translates into a constant sensitivity reached at longer sensing times. For photon numbers on the order of n phot~1 0, we approach the SMS. The best observed sensitivity is η = 132 nT ffiffiffiffiffi ffi Hz p À1 after a total runtime of~1.5 s including all experimental overheads. This sensitivity is a factor 1.46 higher than the η SMS = 90.5 nT ffiffiffiffiffi ffi Hz p À1 for the employed NV. The discrepancy is mainly explained by the overhead of the feedback loop, which becomes considerable compared to the phase evolution time per epoch at low n phot . For instance, the computational overhead makes up for 45% of the total runtime for an MFL run with n phot~6 (see Supplementary Fig. 3). Including this additional dead time in the analysis, the relative difference between sensitivity and the theoretical limit decreases to only 8%. These results yield a figure of merit Bmax η ¼ 1803 ffiffiffiffiffi ffi Hz p , which is to the best of our knowledge a record for a single NV roomtemperature experiment. Not reaching the SMS at higher n phot is an expected behavior: In every epoch the shape of the likelihood function that is multiplied to the current prior distribution is fully determined by the phase evolution time and the binary outcome of the measurement. In the case of our averaged optical readout, higher photon numbers increase the (photon shot noise limited) signal-to-noise-ratio (SNR) and thus lead to less mistaken outcomes. If the error probability is already low, increasing the SNR only changes the outcomes in few epochs but adds uninformative measurement time to all epochs. Equivalently, the majority voting is discarding more and more information for higher average photon numbers. Hence, the overall sensitivity is decreased. In Supplementary Fig. 2b, we characterize the trade-off between reduced precision (p s ≤ 1, defined as the fraction of complete MFL runs that yield correct magnetic field results) and sensitivity. To this end, we define an effective sensitivity η eff ¼ η= ffiffiffiffi ffi p s p that accounts for the fact that wrong magnetic field attribution due to poor SNR needs to be balanced by additional MFL repetitions. For our experimental conditions, an optimal photon number n phot~4 0 exists, where η eff ¼ 204 nT ffiffiffiffiffi ffi Hz p À1 is minimized. This effective sensitivity is a factor of 2.25 higher than the SMS, even considering all experimental overheads. Arguably, this effective sensitivity is our most relevant result for application as a DC magnetic field sensor. A possible route to mitigate sensitivity loss at elevated photon numbers would be to increase the extracted information per epoch by considering the exact number of detected photons in the likelihood function or even their time of arrival 27 . Previously, it was shown that NA phase estimation protocols can perform nearly as well as adaptive algorithms. However, this requires careful parameter optimization depending on the readout efficiency of the experiment 5,29 (often referred to as G, F, and R in the literature). It is an advantageous property of the MFL that except for determination of the average photon number I, no further tuning is required, since adaption to the readout is implicitly performed by choosing phase evolution times according to the current width of the posterior distribution. In Fig. 2b, c we also plot simulations of a parameter optimized NA-PEA 5 with the same experimental parameters. An increased noise resilience of MFL is reflected by the fact that NA-PEA requires significantly higher average photon numbers of n phot~1 80 and delivers a worse overall sensitivity (η ¼ 259 nT ffiffiffiffiffi ffi Hz p À1 ) than MFL, even comparing with its effective sensitivity defined above. NA-PEA performs best if the predetermined phase evolution times optimize the extracted Fisher information, which can be hindered by fixed values of T Ã 2 and targeted dynamic range. In terms of repetition rate, one could expect an advantage for NA-PEA due to the nonexisting computational overhead as the MW sequence in the AWG is never altered after an optimized set of parameters has been found. We observe, however, that MFL operated with moderate photon numbers n phot = 62 can reach a sensitivity of η $ 260 nT ffiffiffiffiffi ffi Hz p À1 equally fast even considering our current overhead. Optimized hardware allowing for shorter computation and communication times would lead to a further advantage for MFL.
Nuclear spin |A II | characterization Next, we extend our adaptive protocol from sensing a classical DC magnetic field to multiparameter estimation of the quantum environment of the sensor. This capability is required for learning about complex environments. In particular, we use MFL to determine the parallel component of the sensor's hyperfine coupling to a 13 C nuclear spin in the diamond lattice. As evidenced by a splitting in the sensor's energy spectrum, coupling to the carbon spin increases the Hilbert space of the sensor/ environment interaction, thereby increasing the parameter space of our estimation. A Ramsey experiment with a single nuclear spin coupled to the NV yields a result as sketched in Fig. 3b.
The oscillation in the time domain now consists of two single frequencies, which correspond to the offset of the microwave frequency to the NV-13 C resonances that could be observed in a pulsed optically detected magnetic resonance experiment. The parallel hyperfine coupling is given by the difference of both frequencies jA jj j ¼ jf 1 À f 2 j. We apply the same methodology as in the one-dimensional (1D) case to estimate both frequencies of the Ramsey experiment. As the parameter space grows, the prior P(B 1 , B 2 ) and the posterior P(B 1 , B 2 |E) of the MFL become two dimensional (2D). The decoherence-free likelihood function is constructed from equally weighted frequency components. We show exemplary priors and likelihood functions in Fig. 3a. In complete analogy to the 1D case, the MFL picks short τ for early epochs where a flat prior represents poor knowledge about the frequencies f 1;2 γB 1;2 =2π. For the multiparameter estimation, the corresponding likelihood function shows multiple broad peaks that are symmetric in either dimension. As the prior becomes single-peaked during the evolution of the MFL, the choice of τ ensures that the likelihood function at later epochs is multiplying a single peak of the likelihood to non-zero parts of the prior. In Fig. 3c the result of a single 2D MFL run is depicted. In the inset a trace of the differences of the resulting frequencies of subsequent repetitions of the 2D MFL is plotted. A linear fit to these frequency differences directly yields the parallel component of the hyperfine coupling jA jj j ¼ 59:10 ± 0:36 ð Þ kHz, whereas the frequency average gives the background magnetic field strength. Although the perpendicular hyperfine coupling A ⊥ is unknown, it is possible to deduct the possible position of the nuclear spin as an isosurface around the NV from |A || | 30 . In Fig. 3a we plot this localization at the beginning and end of a single MFL run.
For 1D parameter estimation, Heisenberg-limited scaling behavior (η 2 ∝ T −1 ) has been observed 26 for MFL. In our experiments, the scaling in learning a 2D model is less favorable than learning a single parameter like the DC magnetic field. In Fig. 3d the slope in the middle part is reduced for approximately a quarter when comparing to a simulated 1D case with the same experimental parameters. Hence, we assume that the heuristic (see "Methods" section) for multiparameter estimation could be further optimized. Note that it is not possible to directly compare experimental data, since we would need to employ distinct NVs for both cases with different coherence times and fluorescence count rates.
Our experiment operates in a regime where, due relaxation of the carbon spin during readout, multiple incoherent nuclear spin flips occur during the course of a single MFL epoch that averages over multiple (n phot /I) readouts of the NV. Thus, the nuclear spin will be in one of its eigenstates approximately half of the time, motivating the choice of equal contribution of the frequency components in the likelihood function. As each component of the recorded Ramsey fringe has half the contrast compared to the 1D likelihood, a lower sensitivity bound of η SMS Ã ;2D ¼ 2η SMS;1D is theoretically expected. Generally, the SMS for 2D parameter estimation is η SMS;2D ¼ ffiffi ffi 2 p η SMS;1D , assuming that we measure two distinct Ramsey fringes depending on the state of the nuclear spin. The theoretical limit, assuming a perfect readout, for estimating M independent parameters is η SMS;MD ¼ ffiffiffiffi M p η SMS;1D . From the data in Fig. 3d, we calculate a finally obtained experimental phase acquisition sensitivity which is a factor of η 2D ¼ 2:09 η 1D higher than the 1D simulation, indicating that our experiment operates reasonably close to the optimum achievable at the given readout conditions. Our best experimental sensitivity for a 2D MFL is extracted to be η 2D ¼ 445 nT ffiffiffiffiffi ffi Hz p À1 (ΔB avg ¼ 3:1 kHz after 16.0 s including overheads, Fig. 3d). This is a factor of 2.4 higher than the 2D limit η SMS Ã ;2D , a fact that is due to the relatively high average photon numbers (n phot~9 0) in this experiment. As in the 1D simulation, which shows an offset from the theoretical limit of η 1D ¼ 1:9 η SMS;1D (see Fig. 3d), the use of T. Joas et al. majority voting when high numbers of photons are collected results in a reduced sensitivity. At high photon numbers, sensitivity and effective sensitivity coincides closely, η~η eff , which allows the factor of 2.4 to be compared to the 1D MFL case (factor 2.25 in terms of η eff ). We attribute the additional sensitivity impairment to the increased overhead of the computationally more intensive 2D MFL. Likewise to the 1D case, a route to improved sensitivity is the replacement of majority voting to yield a more informative likelihood function.
Our value can be translated to a spatial sensitivity if A ⊥ is known. For example, the spatial sensitivity yields 76 nm ffiffiffiffiffi ffi Hz p À1 , if we were to determine the location a 13 C spin in 5 nm distance (jA ? j ¼ 0:15 kHz, jA jj j ¼ 0:26 kHz for an 13 C with average A ⊥ ). We do not expect this sensitivity to improve compared to A || determination by dynamical decoupling (DD) techniques, since the photon shot noise limit for the latter is limited by T 2 . However, DD techniques typically need to sweep the phase evolution time axis first, before varying the number of decoupling pulses N π 31,32 . Thus, there is substantial overhead spent for pre-characterization in the total measurement. As a consequence, we find our technique to be a valuable tool for characterizing strongly coupled (>1=T Ã 2 ) external spins, not hidden in the spin bath. When measuring hyperfine couplings via the nuclear Lamor precession, measurement back action from the NV can introduce broadening of the nuclear linewidth 33,34 . For the estimation of A || that we perform, however, we do not expect a detrimental influence on our estimation sensitivity as our measurement does not rely on tracking the evolution of a coherent superposition of the nuclear spin state, but only probes the projection of the nuclear spin in the measurement basis.
As an important remark, we note that multidimensional parameter estimation can be combined with arbitrary sensing sequences, as long as an expression for the likelihood function can be given. Analytic expressions exist for the coherence in Hahn echo modulation and DD experiments 31,32,[35][36][37] , which both can be used for determination of the perpendicular hyperfine component A ⊥ . Since for this purpose two parameters (τ, N π ) need to be controlled, this requires more addressable AWG memory than available in our current experiment. We anticipate that recent methods for full characterization of a multi-qubit register 38 or spatial reconstruction of the spin environment 17,30 may benefit from online adaptive sensing. To this end, highdimensional likelihood functions encoding all involved coupling constants and the according computing power to evaluate the large parameter space will be required.
If a complete MFL run could determine γB to an accuracy <2πA || within the relaxation time of a coupled nuclear spin, effectively providing a single-shot measurement of the nuclear state, our method would be extendable to allow for quantum feedback techniques 39 conditioned on the nuclear spin state. This will require implementation of MFL on microcontrollers, field programmable gate arrays, or other devices with improved real-time processing capabilities.

CONCLUSION
We have demonstrated an online adaptive MFL algorithm in a roomtemperature quantum sensing experiment. Using fast-feedback conditioned on measurement of a single spin sensor, we reached a magnetic field sensitivity of 204 nT ffiffiffiffiffi ffi Hz p À1 , which is by a factor of 2.25 above the SMS, while simultaneously increasing the dynamic range by a factor of one hundred when compared to measuring at τ ¼ T Ã 2 =2. By extending the estimation to two parameters, we measured the parallel hyperfine coupling to a nearby 13 C nucleus, allowing estimation of the hyperfine coupling with an uncertainty of 4.4 kHz in 16.0 s without requiring any pre-characterization.
We anticipate that by combining with sensing protocols beyond Ramsey interferometry, adaptive measurements can become a powerful tool for imaging and chemical analysis on the nanoscale and the characterization of complex quantum systems. Since other sequences generally imply more complex likelihood functions, it remains as an open question how an optimal heuristic for choosing the next set of experimental parameters can be derived. For nanoscale magnetic resonance imaging, the slow data rate of measurements, obtained by the readout of a single bit sensor, limits the timescale/speed of experiments. Thus, such adaptive protocols will be of immense importance. Beyond improved sensing and characterization, we envision the techniques developed here to allow for quantum feedback and control methods not achievable with classical feedback methods to have applications in emerging quantum technologies.

METHODS Sample
All experiments were carried out with two single NVs (T Ã 2 ¼ 18:7 μs, no coupled 13 C and T Ã 2 ¼ 19:0 μs with coupled 13 C; data from classical Ramsey experiments), which are located~1 μm below the surface in an isotopically enriched (99.8% 12 C) diamond sample grown by chemical vapor deposition. An external bias magnetic field of B 0~5 00 G was applied along the NV axis by a permanent magnet to ensure polarization of the nitrogen nuclear spin inherent to any NV. For the sake of easier notation throughout the main text, we discuss applied magnetic fields defined as B := B abs − B 0 denoting the absolute magnetic field as B abs .

Setup
We control and read the NV state with a standard home-built confocal microscope. Initialization and readout laser pulses of 750 ns at a wavelength of 532 nm are generated by a continuous wave laser and an acousto-optical modulator. Photon counts are gated by a switch to only consider the beginning 600 ns of the laser pulse. This value is obtained by optimizing the SNR of the optical readout. All laser pulse shapes and microwave waveforms are sampled on an AWG (Tektronix AWG70000A) amplified and applied to the NV through a copper wire of 20 μm diameter placed on top of the diamond. Photoluminescence photons of the NV in the >650 nm band are collected through an oil objective lens (Olympus, ×60, NA 1.35), counted by an avalanche photodiode (APD Excelitas SPCM), and digitized by a counting card (National Instruments 6323). The experiment is controlled by a custom measurement software 40 .

Feedback loop
After the waveform for an epoch has been played, the AWG outputs a trigger signal that is received by the counting card. An interrupt handler implemented in Python is invoked, which creates a timestamp on leave of the handler function. The running time per epoch used for sensitivity calculation is defined as the difference between two of those timestamps. Before creating the timestamp, the photon counting register of the counting card is read out to obtain a binary measurement result by comparing it to a pre-characterized threshold photon number. Then, the Bayesian update is performed by multiplying the likelihood function according to the prior sampled at 1000 (1D MFL) or 2000 (2D MFL) positions. Depending on the width of the posterior distribution, a new phase evolution time is calculated. To reduce computational complexity, we use a particle guess heuristic that samples two positions {B 1 , B 2 } in parameter space and choses τ ¼ 1=ðγjB 2 À B 1 jÞ with |·| denoting the 1norm. For 2D estimation, the heuristic is simply translated by calculating the norm in the 2D space. Then, the memory address of the sequence with the closest τ is output to the AWG. As our addressable AWG memory is limited, it is possible that the MFL requests τ longer than present in memory. Choosing the closest τ would then lead to uninformative repetition of the same sequence. In order to mitigate, we randomly reshuffle those cases such that on average hτi ¼ 3 4 T Ã 2 . By oscilloscope measurements, we confirm that the computational overhead between the trigger signal and new address issued is~1 ms (1D MFL) per epoch. Due to the increased computational complexity, the feedback loop takes~3 ms for 2D estimation. Omitting all computationally intensive code in the interrupt handler and outputting a static address takes~0.2 ms, limited by the notification time, which the Windows operating system requires to invoke the handler on receipt of the AWG trigger signal.

Standard measurement limit
The photon shot noise limited SMS of a single NV sensor reads 23 Io p Þ 2 Þ À1=2 is the readout efficiency defined by the number of photons per readout pulse in |0〉 and |1〉 state of the NV and t ov = T m − τ is the time of all overhead associated with the experiment. For the SMS given in this work, we consider t ov = 1.75 μs, which is the length of the readout and laser pulse plus a waiting time for initialization thereafter.
In Fig. 2b we compare against the sensitivity of an unambiguous Ramsey experiment. Assuming prior knowledge on the bounds as {B ± B max } = fB À π 2γτ ; B þ π 2γτ g, around a bias point B = 0, the dynamic range of a Ramsey experiment is defined by [−B max , B max ]. Thus, we calculate the phase evolution time τ ¼ π 2γBmax that yields an unambiguous result. Inserting into Eq. (2) gives the according unambiguous sensitivity.

Data averaging
Since MFL is a probabilistic algorithm, the magnetic field uncertainty reached in a single run underlies statistical fluctuations. To this end, we give results averaged over~50 independent runs. Due to the irregular spacing of the data points in the time domain, we use a moving average filter that calculates median values over a filter window of 100 ms.

Likelihood functions
For 1D estimation of a DC magnetic field, the likelihood function is L ¼ e Àτ=T Ã 2 L 0 þ 0:5ð1 À e Àτ=T Ã 2 Þ with L0 ¼ cos γBτ 2 À Á 2 . The likelihood function . In our experiments, we found phenomenological agreement with e Àτ=T Ã 2 and adopted it, which also matches the theoretical work 28 proving the optimality of the 1D heuristic.