Machine learning assisted quantum super-resolution microscopy

One of the main characteristics of optical imaging systems is spatial resolution, which is restricted by the diffraction limit to approximately half the wavelength of the incident light. Along with the recently developed classical super-resolution techniques, which aim at breaking the diffraction limit in classical systems, there is a class of quantum super-resolution techniques which leverage the non-classical nature of the optical signals radiated by quantum emitters, the so-called antibunching super-resolution microscopy. This approach can ensure a factor of \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sqrt{n}$$\end{document}n improvement in the spatial resolution by measuring the n -th order autocorrelation function. The main bottleneck of the antibunching super-resolution microscopy is the time-consuming acquisition of multi-photon event histograms. We present a machine learning-assisted approach for the realization of rapid antibunching super-resolution imaging and demonstrate 12 times speed-up compared to conventional, fitting-based autocorrelation measurements. The developed framework paves the way to the practical realization of scalable quantum super-resolution imaging devices that can be compatible with various types of quantum emitters.


Introduction
Due to the wave nature of light, the spatial resolution of conventional far-field microscopes is fundamentally limited by the diffraction limit to approximately half the wavelength of the incident light, known as the Rayleigh criteria [1] or Abbe limit [2].Far-field super-resolution microscopy (SRM) techniques that aim at overcoming the diffraction limit could greatly impact the fields of biology, physics and chemistry, as well as device engineering, semiconductor industry, and could lead to novel applications [3][4][5][6][7][8].The developed SRM techniques typically break one or more of the underlying fundamental assumptions on the nature of light-matter interaction within the optical system, under which the diffraction limit is derived.Specifically, it is assumed that the illumination intensity is homogenous, the optical response of the stationary object is linear, and all the optical fields in the system are classical.Recently, a plethora of novel super-resolution techniques, including stimulated emission depletion [9], structured illumination microscopy [10], photoactivated localization microscopy [11], and stochastic optical reconstruction microscopy [12] has been developed.All the aforementioned techniques are realized within classical optical systems via breaking the homogeneity, linearity, or stationarity assumptions.Another promising route in the realization of SRM techniques is to take into account the quantum nature of light [13][14][15][16].Recently, several quantum schemes, utilizing multimode squeezed light [17] and generalized quantum states [18] have been proposed.These approaches use complex quantum states of light as an illumination source, which demand highly efficient, deterministic sources of such quantum photons or entangled photon pairs.In contrast, several SRMs have been developed by relying on the quantum nature of the light emitted by the object itself.This approach, originally proposed by Hell et al [19], is based on fact that some quantum sources of light produce emission with sub-Poissonian temporal photon statistics, which can be analyzed by measuring the autocorrelation function of the emission.By analyzing the -th order autocorrelation function at zero time delay  () ( = ) of nonclassical light emitted from a point source, it is possible to reduce the size of the effective point spread function by a factor of √ [20][21][22].The antibunching-based SRM can be coupled with a classical approach to further improve the resolution of the imaging system.By combining image scanning microscopy with the measurement of the second-order quantum photon correlation, a spatial resolution of four times beyond the diffraction limit was achieved [23] with only a modest hardware overhead compared to regular confocal scanning microscopy.This combination makes the antibunching-based SRM technique a very attractive platform for imaging quantum light sources, as these are typically analyzed using confocal scanning microscopy.The main bottleneck of this framework is the time required for the acquisition of the time-resolved photon statistics needed to accurately determine the values of the autocorrelation function at zero delay.This accuracy depends on the number of registered correlated photon detection events.The time requirement scales up exponentially with the increasing order of the autocorrelation function.Hence, in order to realize scalable and practical antibunching-based SRM, one needs to develop a fast and precise approach to determine g (n) (0).Recently, convolutional neural networks (CNNs) enabled the rapid classification of quantum emitters depending on whether g (2) (0) is above or below a given threshold value based on sparse autocorrelation function measurements [24].Leveraging on these results, we present a CNN-based regression model that allows an accurate estimation of the  () () value based on sparse data.Using the developed CNN model, we reduced the acquisition time in the antibunching-based scanning SRM technique by 12 times, thus marking an important step towards the practical realization of scalable quantum super-resolution imaging devices.

Machine learning assisted antibunching super-resolution microscopy
The antibunching SRM technique relies on the detection of quantum correlations in the signal radiated by quantum emitters, which allows for a gain in the spatial resolution of a factor of √ by measuring -th order autocorrelation function [21].This fact can be understood by conducting the Gedanken experiment first proposed by Hell et al [19].In the case of a hypothetical emitter that emits photons by pairs, an improvement in resolution can be theoretically obtained by sending each of the two photons to a separate camera.Since the two cameras will record two independent point-spread function (PSFs) estimates, the spatial resolution can be improved by a factor of √2 via simple multiplication.However, instead of requiring the emitter to emit pairs of photons, one can acquire the same amount of information by assessing an absence of the two-photon correlation in single photon emission by measuring the second-order autocorrelation function.Furthermore, one can achieve an arbitrarily high improvement in resolution by measuring higher-order correlations in the emission of a single photon emitter.In the most general form, the intensity distribution of the super-resolved image based on antibunching SRM  () (, ) can be obtained via retrieving spatial distributions of the -th order autocorrelation function at zero time delay  () (, ,  = 0) and the number of detected photons  ̃(, ) [21]: here 〈 ̃(, )〉 is the average number of detected photon from a given point (, ) of the sample;   is a function of the product  ( 1 ) (, , 0) ( 2 ) (, , 0) …  (  ) (, , 0), where   is the number of ordered combinations, fulfilling the condition ∑    =1 = .For example, for  = 2 case, Eq.( 2) takes the following simple form [21]: (2) (, , 0)). ( The most commonly used approach for retrieving the  (2) (0) value is a Hanbury-Brown-Twiss (HBT) interferometry measurement, composed of a beam-splitter directing the emitted light to two single-photon detectors connected to a correlation board (Figure 1a).The correlation board registers events consisting of pairs of detector clicks.It then arranges these events into a histogram as a function of the time delay  between the clicks, which can be used for the post-processing via Levenberg-Marquardt fitting: Here,   ,   ,  = 1,2 are the fitting parameters related to the internal dynamics of the emitters.
Figure 1b shows the main steps of the fitting-based approach for the realization of the antibunching SRM technique.The area of interest is divided into  ×  pixels, and autocorrelation histograms are acquired at each pixel.The autocorrelation measurement is performed for several minutes.The L-M fitting is done over all of the HBT histograms and the corresponding  (2) (, , 0) map is retrieved.Finally, the resolved image is calculated via Eq.( 2) (Figure 1d).
In our demonstration, we use single nitrogen-vacancy (NV) centers in nanodiamonds dispersed on a coverslip glass substrate as single photon emitters.These emitters typically yield between 10 4 and 10 5 counts per second on each of the single-photon detectors in the HBT setup (when in focus) and exhibit fluorescence lifetimes between 10 and 100 ns.During the scan, when the emitters are partially out of focus, the fluorescence counts drop significantly.Consequently, in order to assess  (2) (0) via Levenberg-Marquardt (L-M) fitting with an uncertainty varying between ±0.01 to ±0.05, autocorrelation histogram acquisition times of 1 min are required per pixel.In the pulsed excitation regime, the fitting is not required to retrieve g (2) (0) as long as the pump repetition period is much longer than the emitter's fluorescence lifetime.However, this requirement becomes somewhat impractical when the emitter lifetime is long as in the case of NV centers.The developed ML approach addresses the aforementioned problem by rapidly estimating the  (2) (, , 0) values based on sparse HBT measurement.The main framework of the developed approach is shown in Figure 1c.A CNN regression network is trained on a set of "sparse" autocorrelation data with short acquisition times.Once trained, the CNN network estimates the g (2) (0) values, requiring an acquisition time of less than 10 s.

Machine learning assisted autocorrelation function measurement
The main building block of our ML assisted antibunching SRM technique is the CNN based regression model, used for retrieving  () () values.In this section, we highlight the structure  (2) (0) (d) from 5s datasets.Dots show the average predicted g (2) (0) value, while error bars shows the standard deviation of the predicted value over all the 5s datasets acquired for a given emitter.
of the CNN, its training and testing, as well as compare its performance against convention L-M fitting.The training dataset for sparse second-order autocorrelation histograms consists of measurements performed on a set of 40 randomly dispersed nanodiamonds with NV centers on a coverslip glass substrate.Figure 2a shows the schematics of the HBT setup used for these measurements.Two avalanche detectors (D1, D2) with 30 ps jitter are connected to a pulse correlator with a 4 ps internal jitter.The co-detection events are recorded over a range of 500 ns and collected into 215 equally sized time bins.For each of the 40 emitters, hundreds of sparse autocorrelation histograms with 1s acquisition time are collected, until the total number of co-detection events in their sum allows a precise ground truth ( () ()) estimation via L-M fitting with fitting uncertainty varying between ±0.01 to ±0.05.The estimated ground truth value is then assigned as a label to the entire set of 1s histograms.We then formed all the possible combinations of 1 to 10 of these 1s histograms to obtain training data that emulated histograms with acquisition times from 1s to 10s.Such a data augmentation process assumed that the emission is a process with no memory over times exceeding 1s and allowed us to significantly extend the training dataset.More information on the training dataset collection process and data augmentation can be found in Supplementary materials.
Figure 2b shows the structure of the CNN used for  () () regression.The CNN consists of one input layer, three hidden convolutional layers, one max-pooling layer followed by dropout, three fully connected layers, and one output node containing the regression result.The input layer had 215 nodes corresponding to the number of bins in the input histogram.The feature learning part of the CNN is optimized to capture the salient features of the autocorrelation datasets, while the regression part is trained to predict  () () values based on these extracted features.All of the hidden layers comprised 260 filters.The third hidden layer's output is connected with the max-pooling layer, followed by the dropout layer.The kernel size of the filters ( 4) is chosen to be the same for each layer.Importantly, the CNN takes the total number of two-photon detection events Nevents in the histogram as an additional input.Nevents is concatenated to the output of the feature learning part and used as a regularization term during the training process.The 5s-10s histograms acquired on pixels where the contribution of the quantum emission to the total counts is negligible, feature Nevents<4, while the histograms on areas close to the quantum emitter locations feature Nevents=65 on average.To populate the "dark" pixels, the CNN regression network is implicitly biased to produce  () () = , on the datasets with Nevents<4 counts.Supervised training of the CNN regression model was performed using the augmented dataset of 5s-10s sparse HBT histograms and the corresponding ground truth labels.The training process is realized by performing adamax gradient descent optimization using the Keras library [25] for 100 epochs with mean absolute percentage error loss function.80% of the dataset is used for training, while the remaining 20% are used for validation and testing.
The performance of the trained CNN regression model is assessed via calculating the mean absolute percentage error (MAPE) and the coefficient of determination (r 2 ) on the 5s histogram datasets.Figure 2c shows the regression plot of the L-F fitting performed on 5s HBT histograms.
Markers show the average value of the prediction, while error bars show the standard deviation over the set of 5s histograms belonging to the same emitter.Due to the sparsity of the HBT measurement, the L-M fitting expectedly cannot ensure precise fitting of the data, which results in MAPE=32%, r 2 =70% and root mean square error (RMSE) of 0.215.
In contrast, the CNN regression model ensures very precise predictions of the  () () values based on 5s HBT histograms (Figure 2d).Due to the ability of the CNN network to learn hidden correlations between signature features of the sparse datasets and the ground truth labels, the CNN regression model shows excellent performance on the sparse dataset and ensures low MAPE (5%), a high coefficient of determination of 93% and RMSE of 0.0018.The CNN performance is also robust against the reduction of the acquisition time.We analyze the performance of both approaches on 5s, 6s and 7s HBT datasets.The performance of the direct fitting ensures 30% and 27% MAPE when applied to 6s and 7s HBT measurements, respectively.The CNN regression model ensures performance that is much more robust than L-M fitting.It ensures 3.92% MAPE on 6s HBT datasets and reaches up to 3.58% MAPE when applied to 7s datasets.

Experimental realization of machine learning assisted antibunching super-resolution microscopy
The benchmarking of the ML-assisted regression of autocorrelation data enables the experimental demonstration of the ML-assisted antibunching SRM.The experiment is realized on a sample of randomly dispersed nanodiamonds with NVs on a glass substrate.In this demonstration, the objective is scanned using a piezo-stage with sub-10 nm resolution over the 775x775 nm 2 region of interest, which is divided into 1024 (32x32) pixels and contains one nanodiamond with a single NV center.Autocorrelation measurements are performed on each pixel in 1s time increments with a 7s total acquisition time per pixel.Along with the autocorrelation data, the corresponding photoluminescence (PL) map is retrieved (Fig. 3a).
The cross-section of the diffraction-limited image, taken along the blue dashed line (Figure 3a), is shown in Figure 3b.Gaussian fitting of the intensity distribution yields a full width half maximum (FWHM= √   ) of 310 nm.By L-M fitting the 5s sparse histograms of each pixel, the  () (, , ) map is retrieved.Due to the sparsity of the HBT histograms, the L-M fitting expectedly leads to a noisy reconstruction of the  () (, , ) distribution (Figure 3c). Figure 3d shows the corresponding reconstructed image of  () (, ) (Eq.3).The cross-section of the obtained image and corresponding fitting with the same  value as of the original PL image are shown in Figure 3e.Here we can see that the  () (, , ) obtained via L-M fitting leads to a noisy, blurred image without any gain in spatial resolution, which is a direct consequence of the inaccurate retrieval of the  () (, , ).In contrast, the CNN-based antibunching SRM ensures the expected √ gain in resolution on sparse 7s HBT scan. Figure 3 (f-g) show  () (, , ) distribution retrieved via using the pre-trained CNN (f) and corresponding super-resolved image (g).Here, we can see that ML-based framework ensures precise reconstruction of the  () (, , ) map, and as a result achieves a √ gain in the spatial resolution of the reconstructed image.Gaussian fitting of the cross-section distribution of the resolved image shows that ML assisted approach ensures a FWHM of 219 nm, which corresponds to   = /√ (Figure 3h).
Up till now, we have considered an acquisition time of 7s per pixel.However, the robustness of the regression model indicates that the developed approach can be efficiently applied to more sparse datasets.Figure 4(a)-(c) shows the reconstructed images based on 5s, 6s and 7s HBT scans, respectively, and Figure 4(d) compares their cross-sections, which appear stable against the reduction of the acquisition time.It is worth noting that the fitting-based approach requires at least 1 min of HBT measurement per pixel for precise retrieval of the  () () values, as it has observed during dataset collection process (Section 2).This time requirement significantly depends on the properties of the single-photon emitters, e.g.quantum purity, lifetime, and emission rate, and can be significantly longer in the case of low emission rates of the emitter.Here, the developed ML-assisted anti-bunching approach ensures up to 12 times speed-up compared with the fitting-based approach.
The developed ML-assisted SRM is also capable of resolving closely spaced quantum emitters (Fig. 5). Figure 5(a)-(c) shows the PL distribution, CNN-based retrieved  () (, , ) map and the resolved image of the two NVs separated by ~600 nm distance.By comparing the original PL distribution and resolved image, the expected √ improvement in the spatial resolution is observed.By performing the Gaussian fitting of the cross-section (taken along the dashed line in Figure 5a), one can retrieve the FWHM values of each of the lobs, which are equal to ~465 nm (Figure 5d).By performing the same fitting on the resolved image, √ narrowing of the emission features (FWHM=330 nm) by the CNN based approach is confirmed.

Conclusion
The proposed ML assisted regression technique allows for a significant speed up of quantum SRM imaging.Specifically, the performance of the CNN-assisted SRM is demonstrated on nanodiamonds which contain single NV centers as quantum emitters.In the microscopy of quantum light sources, the developed ML-assisted super-resolution framework ensures a speed-up of 12 times compared to the conventional L-M fitting-based approach for retrieving the second-order autocorrelation value at zero delays,  () ().The proposed approach can be extended to rapid measurements of higher-order autocorrelation functions, which opens up the way to practical realization of scalable quantum super-resolution imaging systems.The developed method is compatible with the CW excitation regime, which reduces emitter photobleaching due to multi-photon absorption and does not impose restrictions on the fluorescence lifetime.Therefore, it can be extended and applied to a wide variety of quantum emitters used in biological labeling, quantum on-chip photonics and beyond.

Figure 1 .
Figure 1.General framework of the machine learning (ML) assisted antibunching SRM.Antibunching-based SRM image acquisition starts with an area of n by m pixels (a) and measures complete antibunching histograms via Hanbury-Brown-Twiss (HBT) interferometry at each pixel (b).The Levenberg-Marquardt (L-M) fit is done on each pixel's HBT histogram to retrieve  () (, , ) distribution.Finally, the super-resolved image is constructed using Eq. 2 (d).Alternatively, ML-assisted approach relies on pre-trained CNN regression model, which allows to accurately predict  () (, , ) maps utilizing sparse HBT measurement data (c).The developed approach ensures at least 12 times speed-up compared with the conventional L-M fitting based antibunching SRM.

Figure 2 .
Figure 2. Machine learning assisted measurement of g (2) (0).(a) Schematics of the HBT interferometer.Labels: DM -dichroic mirror; LPF -long-pass filter; BS -beam splitter; D1/D2 -detectors.(b) Schematics of the CNN regression network.The input layer takes in sparse HBT histograms.The total number of events, Nevents, of the histogram is concatenated to the output of the feature learning part and used as a regularization term.(c)-(d) Regression plot (predicted vs expected  () () values) for L-M fitting (c) and CNN regression of g(2) (0) (d) from 5s datasets.Dots show the average predicted g(2) (0) value, while error bars shows the standard deviation of the predicted value over all the 5s datasets acquired for a given emitter.

Figure 3 .
Figure 3. Machine learning assisted antibunching SRM of a single NV center (a).Photoluminescence (PL) distribution within the area of 32 by 32 pixels containing one NV center.(b) Cross section of the PL image (blue) along the dashed line in (a) and Gaussian fitting (dashed, black) with 310 nm FWHM.(c)-(e) Results of L-M based antibunching SRM based on 7s HBT measurement: distribution of retrieved  −  () (, , ) (c); reconstructed image (d) and cross-section of G(2)(x,y) distribution of the L-M fitting based image (blue) and Gaussian fitting (dashed, black) with 310 nm FWHM (e).(f)-(h) Results of ML assisted antibunching SRM:  −  () (, , ) map (f); reconstructed image (d); and corresponding cross section of intensity distribution of reconstructed image (blue) and the Gaussian fitting (dashed, black) with 219 nm FWHM.

Figure 4 .
Figure 4. Robustness of the machine learning assisted SRM against the reduction of acquisition time.(a)-(c) Resolved images obtained via applying ML assisted antibunching SRM on 5s (a), 6s (b) and 7s (c) sparse HBT scans.Blue, dashed line shows the cross-section line.(d) Comparison of intensity cross-sections for three cases.

Figure 5 .
Figure 5. Machine learning assisted antibunching SRM resolves two closely spaced emitters.(a)-(c) Results of ML assisted antibunching SRM done on 7s HBT measurement: PL image (a); distribution of retrieved  −  () (, , ) (b) and the reconstructed image (c).(d) crosssection of intensity distribution of the PL image(blue) and Gaussian fitting (dashed, black).Blue dashed line in (a) shows the cross-section line.(e) Cross-section of intensity distribution of the resolved image (blue) and Gaussian fitting (dashed, black).