## Introduction

The spatial resolution of optical imaging systems is established by the diffraction of photons and the noise associated with their quantum fluctuations1,2,3,4,5. For over a century, the Abbe-Rayleigh criterion has been used to assess the diffraction-limited resolution of optical instruments3,6. At a more fundamental level, the ultimate resolution of optical instruments is established by the laws of quantum physics through the Heisenberg uncertainty principle7,8,9. In classical optics, the Abbe-Rayleigh resolution criterion stipulates that an imaging system cannot resolve spatial features smaller than λ/(2NA), where λ is the wavelength of the illumination field and NA is the numerical aperture of the optical instrument1,2,3,10. Given the implications that overcoming the Abbe-Rayleigh resolution limit has for multiple applications, such as microscopy, remote sensing, and astronomy3,10,11,12, there has been enormous interest in improving the spatial resolution of optical systems13,14,15. So far, optical superresolution has been achieved through the decomposition of spatial modes into suitable transverse modes of light14,16,17. These conventional schemes rely on spatial projective measurements to pick up phase information that is used to boost the spatial resolution of optical instruments14,18,19,20,21,22.

For almost a century, the importance of phase over amplitude information has constituted established knowledge for optical engineers3,4,5. Recently, this idea has been extensively investigated in the context of quantum metrology5,23,24,25,26. More specifically, it has been demonstrated that phase information can be used to surpass the Abbe-Rayleigh resolution limit for the spatial identification of light sources13,18,19,20,27. For example, phase information can be obtained through mode decomposition by using projective measurements or demultiplexing of spatial modes14,17,18,19,20. Naturally, these approaches require a priori information regarding the coherence properties of the, in principle, ‘unknown’ light sources14,15,21,22. Furthermore, these techniques impose stringent requirements on the alignment and centering conditions of imaging systems14,15,17,18,19,20,21,22,28,29. Despite these limitations, most, if not all, current experimental protocols have relied on spatial projections and demultiplexing in the Hermite-Gaussian, Laguerre-Gaussian, and parity bases14,17,18,19,20,21,22.

The quantum statistical fluctuations of photons establish the nature of light sources30,31,32,33,34. As such, these fundamental properties are not affected by the spatial resolution of an optical instrument34. Here, we demonstrate that measurements of the quantum statistical properties of a light field enable imaging beyond the Abbe-Rayleigh resolution limit. This is performed by exploiting the self-learning features of artificial intelligence to identify the statistical fluctuations of photon mixtures33. More specifically, we demonstrate a smart quantum camera with the capability to identify photon statistics at each pixel. For this purpose, we introduce a general quantum model that describes the photon statistics produced by the scattering of an arbitrary number of light sources. This model is used to design and train artificial neural networks for the identification of light sources. Remarkably, our scheme enables us to overcome inherent limitations of existing superresolution protocols based on spatial mode projections and multiplexing14,17,18,19,20,21,22.

## Results

### Concept and theory

The conceptual schematic behind our experiment is depicted in Fig. 1a. Our camera utilizes an artificial neural network to identify the photon statistics of each point source that constitutes a target object. The description of the photon statistics produced by the scattering of an arbitrary number of light sources is achieved through a general model that relies on the quantum theory of optical coherence introduced by Sudarshan and Glauber34,35,36. We use this model to design and train a neural network capable of identifying light sources at each pixel of our camera. This unique feature is achieved by performing photon-number-resolving detection33. The sensitivity of this camera is limited by the photon fluctuations, as stipulated by the Heisenberg uncertainty principle, and not by the Abbe-Rayleigh resolution limit5,34.

In general, realistic imaging instruments deal with the detection of multiple light sources. These sources can be either distinguishable or indistinguishable3,34. The combination of indistinguishable sources can be represented by either coherent or incoherent superpositions of light sources characterized by Poissonian (coherent) or super-Poissonian (thermal) statistics34. In our model, we first consider the indistinguishable detection of N coherent and M thermal sources. For this purpose, we make use of the P-function $${P}_{{{{\rm{coh}}}}}(\gamma )={\delta }^{2}(\gamma -{\alpha }_{k})$$ to model the contribution from the kth coherent source with the corresponding complex amplitude $${\alpha }_{k}$$ (refs. 35,36). The total complex amplitude associated with the superposition of an arbitrary number of light sources is given by $${\alpha }_{{{{\rm{tot}}}}}=\mathop{\sum }\nolimits_{k = 1}^{N}{\alpha }_{k}$$. In addition, the P-function for the lth thermal source, with the corresponding mean photon number $${\bar{m}}_{l}$$, is defined as $${P}_{{{{\rm{th}}}}}(\gamma )={(\pi {\bar{m}}_{l})}^{-1}\exp (-| \gamma {| }^{2}/{\bar{m}}_{l})$$. The total mean photon number attributed to the M thermal sources is defined as $${m}_{{{{\rm{tot}}}}}=\mathop{\sum }\nolimits_{l = 1}^{M}{\bar{m}}_{l}$$. These quantities allow us to calculate the P-function for the multisource system as

$$\begin{array}{l}{P}_{{{{\rm{th-coh}}}}}(\gamma )=\int \cdots \int {P}_{N+M}(\gamma -{\gamma }_{N+M-1})\\\qquad\qquad\quad\;\; \times \left[\mathop{\prod }\limits_{i=2}^{N+M-1}{P}_{i}({\gamma }_{i}-{\gamma }_{i-1}){d}^{2}{\gamma }_{i}\right]{P}_{1}({\gamma }_{1}){d}^{2}{\gamma }_{1}.\end{array}$$
(1)

This approach enables the analytical description of the photon-number distribution $${p}_{{{{\rm{th-coh}}}}}(n)$$ associated with the detection of an arbitrary number of indistinguishable light sources. This is calculated as $${p}_{{{{\rm{th-coh}}}}}(n)=\left\langle n\right|{\hat{\rho }}_{{{{\rm{th-coh}}}}}\left|n\right\rangle$$, where $${\hat{\rho }}_{{{{\rm{th-coh}}}}}=\int {P}_{{{{\rm{th-coh}}}}}(\gamma )\left|\gamma \right\rangle \left\langle \gamma \right|{d}^{2}\gamma$$. After algebraic manipulation (see Supplementary Information), we obtain the following photon-number distribution

$$\begin{array}{l}{p}_{{{{\rm{th-coh}}}}}(n)=\frac{{\left({m}_{{{{\rm{tot}}}}}\right)}^{n}\exp \left(-{\left(| {\alpha }_{{{{\rm{tot}}}}}| \right)}^{2}/{m}_{{{{\rm{tot}}}}}\right)}{\pi {\left({m}_{{{{\rm{tot}}}}}+1\right)}^{n+1}}\\\qquad\qquad\quad\;\; \times\; \mathop{\sum }\limits_{k=0}^{n}\frac{1}{k!(n-k)!}{{\Gamma }}\left(\frac{1}{2}+n-k\right){{\Gamma }}\left(\frac{1}{2}+k\right)\\\qquad\qquad\quad\;\; \times\; {}_{1}{F}_{1}\left(\frac{1}{2}+n-k;\frac{1}{2};\frac{{({{\mathrm{Re}}}\,[{\alpha }_{{{{\rm{tot}}}}}])}^{2}}{{m}_{{{{\rm{tot}}}}}\left({m}_{{{{\rm{tot}}}}}+1\right)}\right)\\\qquad\qquad\quad\;\; \times\, {}_{1}{F}_{1}\left(\frac{1}{2}+k;\frac{1}{2};\frac{{({{\mathrm{Im}}}\,[{\alpha }_{{{{\rm{tot}}}}}])}^{2}}{{m}_{{{{\rm{tot}}}}}\left({m}_{{{{\rm{tot}}}}}+1\right)}\right),\end{array}$$
(2)

where Γ(z) and $${}_{1}{F}_{1}(a;b;z)$$ are the Euler gamma and the Kummer confluent hypergeometric functions, respectively. This probability function enables the general description of the photon statistics produced by any indistinguishable combination of light sources. Thus, the photon distribution produced by the distinguishable detection of N light sources can be simply obtained by performing a discrete convolution of Eq. (2) as

$$\begin{array}{l}{p}_{{{{\rm{tot}}}}}(n)=\mathop{\sum }\limits_{{m}_{1}=0}^{n}\mathop{\sum }\limits_{{m}_{2}=0}^{n-{m}_{1}}\cdots \mathop{\sum }\limits_{{m}_{N-1}=0}^{n-\mathop{\sum }\nolimits_{j=1}^{N-1}{m}_{j}}{p}_{1}({m}_{1}){p}_{2}({m}_{2})\cdots \\\qquad\qquad\; {p}_{N-1}({m}_{N-1}){p}_{N}(n-\mathop{\sum }\limits_{j=1}^{N-1}{m}_{j}).\end{array}$$
(3)
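
As an illustration, the nested sums in Eq. (3) can be evaluated as successive pairwise discrete convolutions. The following is a minimal sketch, not the authors' implementation; the helper names (`convolve`, `distinguishable_mixture`, `poisson`, `thermal`) are hypothetical, and single-mode Poissonian and thermal distributions are used as stand-ins for the per-mode distributions $${p}_{i}$$.

```python
import math

def convolve(p, q):
    """Discrete convolution of two photon-number distributions indexed by n."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out

def distinguishable_mixture(distributions, n_max):
    """Eq. (3) as repeated pairwise convolutions, keeping n = 0..n_max."""
    total = [1.0]  # zero photons with certainty before any mode is added
    for p in distributions:
        # Truncation is exact for n <= n_max when each input covers 0..n_max.
        total = convolve(total, p)[: n_max + 1]
    return total

def poisson(mean, n_max):
    """Coherent (Poissonian) photon-number distribution; illustrative helper."""
    return [math.exp(-mean) * mean**n / math.factorial(n) for n in range(n_max + 1)]

def thermal(mean, n_max):
    """Single-mode thermal (Bose-Einstein) distribution; illustrative helper."""
    return [mean**n / (1.0 + mean) ** (n + 1) for n in range(n_max + 1)]

# One coherent and two thermal modes detected distinguishably (assumed means).
p_tot = distinguishable_mixture(
    [poisson(0.7, 20), thermal(0.5, 20), thermal(0.3, 20)], n_max=20
)
```

A quick sanity check of this sketch is that two distinguishable Poissonian modes should again give a Poissonian distribution with the summed mean.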

The combination of Eqs. (2) and (3) allows the classification of photon-number distributions for any combination of light sources.
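
For readers who want to evaluate Eq. (2) numerically, a minimal standard-library sketch is given below. It is an illustration under stated assumptions rather than the code used in this work: the function names `hyp1f1` and `p_th_coh` are hypothetical, the Kummer function is summed directly from its power series (adequate for the modest non-negative arguments that appear here), and $${m}_{{{{\rm{tot}}}}} > 0$$ is assumed because Eq. (2) is singular in the purely coherent limit.

```python
import math

def hyp1f1(a, b, z, tol=1e-16, max_terms=10_000):
    """Kummer function 1F1(a; b; z) summed from its power series."""
    term, total = 1.0, 1.0
    for j in range(1, max_terms):
        term *= (a + j - 1) / (b + j - 1) * z / j
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

def p_th_coh(n, alpha_tot, m_tot):
    """Eq. (2): alpha_tot is the total complex amplitude, m_tot > 0 the thermal mean."""
    x = alpha_tot.real**2 / (m_tot * (m_tot + 1.0))
    y = alpha_tot.imag**2 / (m_tot * (m_tot + 1.0))
    prefactor = (
        m_tot**n
        * math.exp(-abs(alpha_tot) ** 2 / m_tot)
        / (math.pi * (m_tot + 1.0) ** (n + 1))
    )
    series = 0.0
    for k in range(n + 1):
        series += (
            math.gamma(0.5 + n - k)
            * math.gamma(0.5 + k)
            / (math.factorial(k) * math.factorial(n - k))
            * hyp1f1(0.5 + n - k, 0.5, x)
            * hyp1f1(0.5 + k, 0.5, y)
        )
    return prefactor * series

# Example values (assumed for illustration only).
alpha = complex(1.0, 0.5)
m = 0.8
distribution = [p_th_coh(n, alpha, m) for n in range(41)]
```

Two consistency checks follow from Eq. (2) itself: the distribution should sum to one, and its mean should equal $$| {\alpha }_{{{{\rm{tot}}}}}{| }^{2}+{m}_{{{{\rm{tot}}}}}$$.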

### Experiment

We demonstrate our proof-of-principle quantum camera using the experimental setup shown in Fig. 1b. For this purpose, we use a continuous-wave laser at 633 nm to produce either coherent or incoherent superpositions of distinguishable, indistinguishable, or partially distinguishable light sources. In this case, the combination of photon sources, with tunable statistical fluctuations, acts as our target object. Then, we image our target object onto a digital micro-mirror device (DMD) that is used to implement raster scanning. This is implemented by selectively turning groups of pixels in our DMD on and off. The light reflected off the DMD is measured by a single-photon detector that allows us to perform photon-number-resolving detection. This is implemented through the technique described in ref. 33.

The equations above allow us to implement a multi-layer feed-forward network for the identification of the quantum photon fluctuations of the point sources of a target object. The structure of the network consists of a group of interconnected neurons arranged in layers. Here, the information flows only in one direction, from input to output37,38. As indicated in Fig. 2a, our network comprises two layers, with ten sigmoid neurons in the hidden layer (green neurons) and five softmax neurons in the output layer (orange neurons). In this case, the input features represent the probabilities of detecting n photons at a specific pixel, p(n), whereas the neurons in the last layer correspond to the classes to be identified. The input vector is then defined by twenty-one features corresponding to n = 0, 1, ..., 20. In our experiment, we define five classes that we label as: coherent-thermal (CT), thermal-thermal (TT), coherent-thermal-thermal (CTT), coherent (C), and thermal (T). If the brightness of the experiment remains constant, these classes can be directly defined through the photon-number distribution described by Eqs. (2) and (3). However, if the brightness of the sources is modified, the classes can be defined through the degree of second-order coherence $${g}^{(2)}=1+(\langle {({{\Delta }}\hat{n})}^{2}\rangle -\langle \hat{n}\rangle )/{\langle \hat{n}\rangle }^{2}$$, which is intensity-independent30,33. The quantities entering $${g}^{(2)}$$ can also be calculated from Eqs. (2) and (3). It is important to mention that the output neurons provide a probability distribution over the predicted classes39,40. Moreover, note that during the training stage we need to define the output classes depending on the possible combinations of light sources to be identified at the detection plane. Since our method is based on the discrimination of photon statistics, any point in the detection plane will fall within the defined classes, regardless of the position of the sources. Therefore, the spatial distribution of the sources is not required in the training process. The training details of our neural networks can be found in the Methods section.
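
The intensity independence of the degree of second-order coherence can be checked with a short calculation. The sketch below is illustrative only; the function names are hypothetical, and ideal single-mode Poissonian and thermal distributions are assumed in place of measured data.

```python
import math

def g2(p):
    """Degree of second-order coherence from a photon-number distribution p[n]."""
    mean = sum(n * p_n for n, p_n in enumerate(p))
    second_moment = sum(n * n * p_n for n, p_n in enumerate(p))
    variance = second_moment - mean**2
    return 1.0 + (variance - mean) / mean**2

def poisson(mean, n_max):
    """Coherent (Poissonian) photon-number distribution; illustrative helper."""
    return [math.exp(-mean) * mean**n / math.factorial(n) for n in range(n_max + 1)]

def thermal(mean, n_max):
    """Single-mode thermal (Bose-Einstein) distribution; illustrative helper."""
    return [mean**n / (1.0 + mean) ** (n + 1) for n in range(n_max + 1)]

# Coherent light gives a value near 1 and single-mode thermal light near 2,
# for both brightness settings (the mean photon numbers are assumed values).
g2_coherent = [g2(poisson(mean, 60)) for mean in (0.3, 1.5)]
g2_thermal = [g2(thermal(mean, 400)) for mean in (0.3, 1.5)]
```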

We test the performance of our neural network through the classification of a complex mixture of photons produced by the combination of one coherent with two thermal light sources. The accuracy of our trained neural network is reported in Fig. 2b. In our setup, the three partially overlapping sources form five classes of light with different mean photon numbers and photon statistics. We exploit the functionality of our artificial neural network to identify the underlying quantum fluctuations that characterize each kind of light. We calculate the accuracy as the ratio of the sum of true positives and true negatives to the total number of input samples during the testing phase. Figure 2b shows the overall accuracy as a function of the number of data points used to build the probability distributions for the identification of the multiple light sources using a supervised neural network. The classification accuracy for the mixture of three light sources is 80% with 100 photon-number-resolving measurements. The performance of the neural networks increases to approximately 95% when we use 3500 data points to generate probability distributions.
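
To make the role of the number of data points concrete, the sketch below builds the twenty-one-feature input vector p(n), n = 0, ..., 20, from a list of photon counts. It is a hedged illustration: the function names are hypothetical, the counts are drawn from a simulated single-mode thermal source rather than from experimental data, and counts above 20 are simply discarded before normalization.

```python
import random
from collections import Counter

def empirical_distribution(counts, n_max=20):
    """Normalized histogram p(n), n = 0..n_max, from a list of photon counts."""
    hist = Counter(c for c in counts if 0 <= c <= n_max)
    total = sum(hist.values())
    return [hist[n] / total for n in range(n_max + 1)]

def sample_thermal(mean, size, rng):
    """Draw photon counts from a single-mode thermal (geometric) distribution."""
    q = mean / (1.0 + mean)  # p(n) = (1 - q) * q**n
    counts = []
    for _ in range(size):
        n = 0
        while rng.random() < q:
            n += 1
        counts.append(n)
    return counts

rng = random.Random(0)  # fixed seed so the sketch is reproducible
features_100 = empirical_distribution(sample_thermal(1.0, 100, rng))
features_3500 = empirical_distribution(sample_thermal(1.0, 3500, rng))
```

With more data points the histogram fluctuates less around the underlying distribution, which is consistent with the trend of accuracy against data points described above.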

The performance of our protocol for light identification can be understood through the distribution of light sources in the probability space shown in Fig. 3. Here we show the projection of the feature space onto the subspace defined by the probabilities p(0), p(1), and p(2) for different numbers of data points. Each point is obtained from an experimental probability distribution. As illustrated in Fig. 3a, the distributions associated with the multiple sources obtained for 10 data points are confined to a small region of the feature space. This condition makes the identification of light sources with 10 sets of measurements extremely hard. A similar situation can be observed for the distribution in Fig. 3b that was generated using 100 data points. As shown in Fig. 3c, the distributions produced with 1000 data points occupy different regions, although brown and black points remain closely intertwined. These conditions enable one to identify multiple light sources. Finally, the separated distributions obtained with 10,000 data points in Fig. 3d enable efficient identification of light sources. These probability space diagrams explain the performance reported in Fig. 2. An interesting feature of Fig. 3 is the fact that the distributions in the probability space are linearly separable.

As demonstrated in Fig. 4, the identification of the quantum photon fluctuations at each pixel of our camera enables us to demonstrate superresolving imaging. Our technique involves two main steps. First, we classify each pixel with the help of our neural network (see Fig. 2). Then, we use this information to perform a fitting procedure to find out the positions and sizes of each source (see Methods). In our experiment we prepared each source to have a mean photon number between 1 and 1.5 for the brightest pixel. The raster-scan image of a target object composed of multiple partially distinguishable sources in Fig. 4a illustrates the performance of conventional imaging protocols limited by diffraction4,6,7,8. In this case, it is practically impossible to identify the multiple sources that constitute the target object. Remarkably, as shown in Fig. 4b, our protocol provides a dramatic improvement of the spatial resolution of the imaging system. In this case, we utilize photon statistics of the complex mixture of light sources at each pixel rather than relying on the composite point spread function of the multiple sources. This allows us to surpass the diffraction limit and predict the location of the three point sources. We then use a genetic algorithm-based optimization to predict the actual centroids and diameters of the three point sources with Gaussian point spread functions. Finally, we use this information to reconstruct each of the sources. Then, we simply add all individual source profiles to produce a single intensity plot as described in the Methods section. Our results clearly show the presence of the three emitters that form the remote object. The estimation of separations among light sources is performed through a fit over the classified pixel-by-pixel image. Additional details can be found in the Methods section. In Fig. 
4c, d, we demonstrate the robustness of our protocol by performing superresolving imaging for a different configuration of light sources. In this case, two small sources are located inside the point-spread function of a third light source. As shown in Fig. 4c, the Abbe-Rayleigh limit forbids the identification of light sources. However, we demonstrate substantial improvement of spatial resolution in Fig. 4d. The plots in Fig. 4e, f correspond to the inferred spatial distributions based on the experimental pixel-by-pixel imaging used to produce Fig. 4b, d. The insets in Fig. 4e, f show photon-number probability distributions for three pixels. Sharing similarities with conventional schemes for optical superresolution14,18,19,20,21,22, our technique enables imaging beyond the Abbe-Rayleigh criterion even when the detected photons are emitted by light sources of the same kind. As shown in Fig. 4, this is possible even if two thermal sources are detected. The theoretical photon-number distributions in Fig. 4e, f are obtained through a least-squares regression procedure41. Here the least-squares difference between the measured and theoretical probability distributions was minimized for 0 ≤ n ≤ 6. The sources were assumed to be partially distinguishable, allowing the theoretical distribution to be defined by Eqs. (2) and (3). The mean photon numbers of the sources generated for the fit add up to the measured mean photon number (see Methods section). Our scheme enables the use of the photon-number distributions or their corresponding $${g}^{(2)}$$ to characterize light sources. This allows us to determine each pixel’s corresponding statistics, regardless of the mean photon numbers of the sources in the detected field30,33.

We now provide a quantitative characterization of our superresolving imaging scheme based on the identification of photon statistics. We demonstrate that our smart camera for superresolving imaging can capture small spatial features that surpass the resolution capabilities of conventional schemes for direct imaging1,2,3,4,5. Consequently, as shown in Fig. 5, our camera enables imaging beyond the Abbe-Rayleigh criterion. In this case, we performed multiple experiments in which a superposition of partially distinguishable sources was imaged. The superposition was prepared using one coherent and one thermal light source, with separations ranging from 0 to ~2.55 mm. In Fig. 5a, we plot the predicted transverse separation s normalized by the Gaussian beam waist radius w0 for both protocols. Here w0 = λ/(πNA) ≈ 1.2 mm; this parameter is obtained directly from our experiment. Furthermore, the transverse separation s is calculated following a similar approach to the one used to obtain Fig. 4 (see Methods). As demonstrated in Fig. 5a, our protocol enables one to resolve spatial features for sources with small separations even for diffraction-limited conditions. As expected for larger separation distances, the performance of our protocol matches the accuracy of intensity measurements. This is further demonstrated by the spatial profiles shown in Fig. 5b to d. The first row shows spatial profiles for three experimental points in Fig. 5a obtained through direct imaging, whereas the images in the second row were obtained using our scheme for superresolving imaging. The spatial profiles in Fig. 5b show that both imaging techniques lead to comparable resolutions and the correct identification of the centroids of the two sources. However, as shown in Fig. 5c, d, our camera outperforms direct imaging when the separations decrease. Here, the actual separation is smaller than w0/2 for both cases. 
It is worth noticing that in this case, direct imaging cannot resolve spatial features of the sources. Here, the predictions of direct imaging become unstable and erratic. Remarkably, our simulations show an excellent agreement with the experimental data obtained with our scheme for superresolving imaging (see Methods section).

## Discussions

It is worth noting that sources of light characterized by different quantum statistical properties are ubiquitous in realistic scenarios42. Indeed, this situation prevails even when the detected field results from a combination of light sources of the same kind. For example, the finite size of remote stars produces multimode thermal light that is characterized by a degree of second-order coherence $${g}^{(2)}$$ that deviates from 2 (ref. 43). Interestingly, our scheme can identify these conditions. Furthermore, smart quantum statistical imaging can have important implications for LIDAR applications44,45. The performance of rangefinder systems is limited by the ability to discriminate the photon statistics of coherent and thermal light44,45. Remarkably, our imaging protocol shows potential to overcome this problem. Finally, there has been interest in using optical microscopy to identify light emitters42. In this case, it is possible to use our technique to form superresolving images of the optical emitters. Consequently, our work has important implications for many imaging techniques.

In conclusion, we demonstrated a robust quantum camera that enables superresolving imaging beyond the Abbe-Rayleigh resolution limit. Our scheme for quantum statistical imaging exploits the self-learning features of artificial intelligence to identify the statistical fluctuations of truly unknown mixtures of light sources. This particular feature of our scheme relies on a general model based on the theory of quantum coherence to describe the photon statistics produced by the scattering of an arbitrary number of light sources. While, in terms of resolution, the performance of our camera is on par with conventional schemes for superresolution, we demonstrated that the measurement of the quantum statistical fluctuations of photons enables one to overcome inherent limitations of existing superresolution protocols based on spatial mode projections14,18,19,20,21,22. Specifically, our protocol does not require prior information about the coherence properties of the light field. In addition, it does not rely on information about the individual centroids of the sources. We believe that our work will establish a paradigm in the field of optical imaging with important implications for microscopy, remote sensing, and astronomy5,6,7,8,9,10,11.

## Methods

### Training of NN

For the sake of simplicity, we split the functionality of our neural network into two phases: the training and testing phases. In the first phase, the training data is fed to the network multiple times to optimize the synaptic weights through a scaled conjugate gradient back-propagation algorithm46. This optimization seeks to minimize the Kullback-Leibler divergence between the predicted and the real target classes47,48. Following a standardized ratio for statistical learning, we divide our data into training (70%), validation (15%), and testing (15%) sets49. The training is stopped if the algorithm performance stops improving on the validation set or if the loss function does not decrease within a given number of training epochs50. This method is called early stopping and effectively reduces overfitting51. Specifically, a limit of 1000 epochs was set for the examples shown in Figs. 4 and 5. In the test phase, we assess the performance of the algorithm by introducing a set of data that was not seen during the training process. The goal is to estimate the accuracy of the neural network for unknown data by exploiting the information unveiled during the training stage. For both phases, we prepare a data-set consisting of the same number of observations for each output class. The output classes are defined by the possible combinations of the number and type of light sources at the detection plane. For example, in our experiment presented in Fig. 4, we consider three sources, two thermal and one coherent, that lead to five classes: thermal, coherent, thermal-thermal, thermal-coherent, and thermal-thermal-coherent. For each of the output classes, we prepared one thousand experimental measurements of photon statistics, for both training and test stages. Note that we train multiple neural networks by considering different numbers of data points for producing the photon statistics (see Fig. 2b). In all cases, we keep the size of the training and test data-sets fixed. The networks were trained using the neural network toolbox in MATLAB, running on a computer with an Intel Core i7-4710MQ CPU (2.50 GHz) and 32 GB of RAM.
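
The network architecture described in the Results (twenty-one inputs, ten sigmoid hidden neurons, five softmax outputs) and the Kullback-Leibler loss can be sketched in plain Python as follows. This is an assumption-laden illustration and not the MATLAB toolbox code used here: the weights are random and untrained, the function names are hypothetical, and the scaled conjugate gradient optimizer and early stopping are not implemented.

```python
import math
import random

CLASSES = ["CT", "TT", "CTT", "C", "T"]

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (placeholder initialization, untrained)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def affine(weights, biases, x):
    """Weighted sum plus bias for every neuron in a layer."""
    return [sum(w * x_i for w, x_i in zip(row, x)) + b for row, b in zip(weights, biases)]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-z)) for z in v]

def softmax(v):
    top = max(v)  # subtract the maximum for numerical stability
    exps = [math.exp(z - top) for z in v]
    norm = sum(exps)
    return [e / norm for e in exps]

def forward(params, features):
    """Feed-forward pass: 21 features -> 10 sigmoid neurons -> 5 class probabilities."""
    (w1, b1), (w2, b2) = params
    hidden = sigmoid(affine(w1, b1, features))
    return softmax(affine(w2, b2, hidden))

def kl_divergence(target, predicted, eps=1e-12):
    """Kullback-Leibler divergence between target and predicted class distributions."""
    return sum(t * math.log(t / max(q, eps)) for t, q in zip(target, predicted) if t > 0.0)

rng = random.Random(0)
params = (init_layer(21, 10, rng), init_layer(10, 5, rng))
features = [1.0 / 21] * 21           # placeholder input p(n), n = 0..20
probabilities = forward(params, features)
target = [0.0, 0.0, 0.0, 1.0, 0.0]   # one-hot label for class "C"
loss = kl_divergence(target, probabilities)
```

For a one-hot target the Kullback-Leibler divergence reduces to the negative logarithm of the probability assigned to the correct class, which is the quantity a training loop would decrease.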

### Fittings

To determine the optimal fits for Fig. 4e, f, we design a search space based on Eqs. (2) and (3). To do so, we first find the mean photon number of the input pixel, which is later used to constrain the search space. From here we allow for the existence of up to three distinguishable modes, which are combined according to Eq. (3). Each of the modes contains an indistinguishable combination of up to one coherent and two thermal sources whose number distribution is given by Eq. (2). The total combination results in a partially distinguishable combination and provides the theoretical model for our experiment. From here, the quantity minimized over our search space is

$$\sqrt{\mathop{\sum}\limits_{n=0}{({p}_{\exp }(n)-{p}_{{{{\rm{th}}}}}(n| {\bar{n}}_{1,t},{\bar{n}}_{2,t},{\bar{n}}_{c}))}^{2}},$$
(4)

where $${\bar{n}}_{i,t}$$ and $${\bar{n}}_{c}$$ are the mean photon numbers of each thermal or coherent source, respectively, that contributes to each distinguishable mode. The mean photon numbers of the sources must add up to the experimental mean photon number, constraining the search. A linear search was then performed over the predicted mean photon numbers and the minimum was returned, providing the optimal fit. Finally, we note that the fitting procedure does not use more information than that provided through the classification. The fitting procedure solely uses the output from the classification algorithm.
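
The constrained linear search can be illustrated with a deliberately simplified model. The sketch below assumes only one coherent and one thermal source detected distinguishably, so the model is a Poissonian distribution convolved with a thermal one via Eq. (3); the full search described above allows up to three modes built from Eq. (2). All function names and the grid resolution are hypothetical choices, and the "measured" distribution is synthetic.

```python
import math

def poisson(mean, n_max):
    return [math.exp(-mean) * mean**n / math.factorial(n) for n in range(n_max + 1)]

def thermal(mean, n_max):
    return [mean**n / (1.0 + mean) ** (n + 1) for n in range(n_max + 1)]

def convolve(p, q, n_max):
    """Discrete convolution of two distributions, kept for n = 0..n_max."""
    out = [0.0] * (n_max + 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            if i + j <= n_max:
                out[i + j] += p_i * q_j
    return out

def cost(p_exp, p_model, n_fit=6):
    """Eq. (4) evaluated over 0 <= n <= n_fit."""
    return math.sqrt(sum((p_exp[n] - p_model[n]) ** 2 for n in range(n_fit + 1)))

def linear_search(p_exp, n_total, steps=240, n_fit=6):
    """Scan the coherent mean on a grid; the thermal mean follows from the constraint."""
    best = None
    for i in range(steps + 1):
        n_c = n_total * i / steps
        n_t = n_total - n_c  # means must add up to the measured mean photon number
        model = convolve(poisson(n_c, n_fit), thermal(n_t, n_fit), n_fit)
        value = cost(p_exp, model, n_fit)
        if best is None or value < best[0]:
            best = (value, n_c, n_t)
    return best

# Synthetic "measurement": coherent mean 0.7 and thermal mean 0.5 (assumed values).
p_exp = convolve(poisson(0.7, 20), thermal(0.5, 20), 20)
n_total = sum(n * p for n, p in enumerate(p_exp))
best_cost, fit_coherent, fit_thermal = linear_search(p_exp, n_total)
```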

### Simulation of the experiment and separation estimation protocol

To demonstrate a consistent improvement over traditional methods, we also simulated the experiment using two beams, one thermal and one coherent, with Gaussian point spread functions over a 128 × 128 grid of pixels. At each pixel, the mean photon number for each source is provided by the Gaussian point spread function, which is then used to create the appropriate distinguishable probability distribution as given in Eq. (3), creating a 128 × 128 grid of photon-number distributions. We use Gaussian fits because this mathematical function describes the most fundamental spatial mode of an optical system. The associated class data for these distributions are then fitted to a set of pre-labeled disks using a genetic algorithm. This recreates our method in the limit of perfect classification. Each of these distributions is then used to simulate photon-number-resolving detection. This data is then used to create a normalized intensity for the classical fit. We fit the image to a combination of Gaussian PSFs. The separation s is found by taking the centroid of each fit and calculating the distance between them. This value is then normalized by the beam radius w0 for the sake of clarity. This process is repeated ten times for each separation in order to average out fluctuations in the fitting. When combining the results of the intensity fits, we first divide them into two sets. In one set the majority of fits return a single Gaussian, while in the other the majority return two Gaussians. The set identified as containing only a single Gaussian is then set at the Abbe-Rayleigh diffraction limit, while the remaining data is used in a linear fit. This causes the sharp transition between the two sets of data.
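
A stripped-down version of the first steps of this simulation is sketched below. It is illustrative only and makes several assumptions that are not taken from the text: the Gaussian profile is written as exp(−2r²/w0²), the beam radius is 20 pixels, the peak mean photon number is 1.2, and a pixel is labeled with a source whenever that source contributes more than 0.05 photons on average. The genetic-algorithm fit to pre-labeled disks and the photon-number sampling are omitted, and all function names are hypothetical.

```python
import math

def gaussian_map(cx, cy, w0, peak, size=128):
    """Mean photon number per pixel for a source with a Gaussian profile."""
    return [
        [peak * math.exp(-2.0 * ((x - cx) ** 2 + (y - cy) ** 2) / w0**2)
         for x in range(size)]
        for y in range(size)
    ]

def centroid(image):
    """Intensity-weighted centroid (x, y) of a 2D map."""
    total = sum_x = sum_y = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            total += value
            sum_x += x * value
            sum_y += y * value
    return sum_x / total, sum_y / total

def ideal_classes(coherent, thermal, threshold=0.05):
    """Perfect-classification labels: '', 'C', 'T', or 'CT' for each pixel."""
    return [
        [("C" if c > threshold else "") + ("T" if t > threshold else "")
         for c, t in zip(row_c, row_t)]
        for row_c, row_t in zip(coherent, thermal)
    ]

w0 = 20.0  # beam radius in pixels (assumed)
coherent_map = gaussian_map(54.0, 64.0, w0, peak=1.2)
thermal_map = gaussian_map(74.0, 64.0, w0, peak=1.2)
labels = ideal_classes(coherent_map, thermal_map)

(cx1, cy1), (cx2, cy2) = centroid(coherent_map), centroid(thermal_map)
separation_over_w0 = math.hypot(cx2 - cx1, cy2 - cy1) / w0
```

In this configuration the two sources are separated by exactly one beam radius, so the normalized separation recovered from the centroids should be close to 1.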