Noise-resolution uncertainty principle in classical and quantum systems

We show that the width of an arbitrary function and the width of the distribution of its values cannot be made arbitrarily small simultaneously. In the case of ergodic stochastic processes, an ensuing uncertainty relationship is then demonstrated for the product of correlation length and variance. A closely related uncertainty principle is also established for the average degree of fourth-order coherence and the spatial width of modes of bosonic quantum fields. However, it is shown that, in the case of stochastic and quantum observables, certain non-classical states with sub-Poissonian statistics, such as photon-number squeezed states in quantum optics, can overcome the "classical" noise-resolution uncertainty limit. This uncertainty relationship, which is fundamentally different from the Heisenberg and related uncertainty principles, can define an upper limit for the information capacity of communication and imaging systems. It is expected to be useful in a variety of problems in classical and quantum optics and imaging.


Results
General relationship between the width and the variance of a function. Figure 1 illustrates, in an informal manner, the relationship between the width $w$ of a function $f$ and the width $v$ of its associated histogram $\lambda_f$. Here, for the purposes of illustration, we consider N = 100 imaging quanta (e.g. photons, neutrons or electrons) that are collected by a one-dimensional position-sensitive detector composed of ten adjacent pixels. Figure 1(a) shows the case where all imaging quanta are uniformly distributed amongst the pixels, giving an intensity function $f_a(x)$ of the pixel coordinate $x$ whose width $w_a$ fills the entire imaging domain; the associated histogram $\lambda_{f_a}$ has a narrow single-bin width $v_a$. In Fig. 1(b), the function is made narrower, to give $f_b(x)$ whose width $w_b$ is narrower than $w_a$, but whose histogram $\lambda_{f_b}$ has a width $v_b$ that is broader than $v_a$. A similar trend is seen in passing to Fig. 1(c). The key point here is that a reciprocal relationship exists between $v$ and $w$: decreasing the width $w$ of a function leads to an increase in the width $v$ of the associated histogram, and vice versa.
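The reciprocal trend described for Fig. 1 can be reproduced with a few lines of Python. This is only an illustrative sketch: the three pixel profiles below are hypothetical stand-ins for panels (a) to (c), and both width measures (the count-weighted standard deviation of pixel position for $w$, and the standard deviation of the pixel values for $v$) are assumptions made here rather than the definitions used in the figure.

```python
import numpy as np

def spatial_width(counts):
    """Count-weighted standard deviation of pixel position (assumed measure of w)."""
    counts = np.asarray(counts, dtype=float)
    x = np.arange(counts.size)
    mean_x = np.sum(x * counts) / np.sum(counts)
    return float(np.sqrt(np.sum((x - mean_x) ** 2 * counts) / np.sum(counts)))

def histogram_width(counts):
    """Standard deviation of the pixel values themselves (assumed measure of v)."""
    return float(np.std(np.asarray(counts, dtype=float)))

# Hypothetical profiles: 100 quanta over ten pixels, progressively narrower.
profiles = {
    "a": [10] * 10,
    "b": [0, 0, 0, 25, 25, 25, 25, 0, 0, 0],
    "c": [0, 0, 0, 0, 50, 50, 0, 0, 0, 0],
}

widths = {name: (spatial_width(p), histogram_width(p)) for name, p in profiles.items()}
for name, (w, v) in widths.items():
    print(f"profile {name}: w = {w:.2f}, v = {v:.2f}")
```

With these choices, $w$ shrinks from profile a to profile c while $v$ grows, which is the qualitative behaviour the figure is meant to convey.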
The above construction motivates the following more formal treatment. Consider an arbitrary measurable (in the sense of Radon) function $f(x) \ge 0$ of a real variable $x$, such that $f(x) = 0$ outside some finite interval $\Omega = [a, b]$. We study the one-dimensional case first for simplicity, but use notation, such as $|\Omega| \equiv b - a$ for the length of $\Omega$, that allows easy translation of the main results to higher dimensions later. The (normalized) histogram of $f(x)$ can be defined in a similar way to the probability density function (PDF). Firstly, consider a function $L_f(y)$ equal to the fraction (relative area) of all points $x$ in $\Omega$ at which $f(x) \le y$, i.e.
$$L_f(y) = \frac{1}{|\Omega|} \int_\Omega \theta(y - f(x))\, dx, \qquad (1)$$
where $\theta(x)$ is the Heaviside step function, which is equal to zero when $x < 0$ and equal to 1 when $x \ge 0$. The normalized histogram function is then defined as the (generalized) derivative of $L_f(y)$, that is
$$\lambda_f(y) = \frac{dL_f(y)}{dy} = \frac{1}{|\Omega|} \int_\Omega \delta(y - f(x))\, dx, \qquad (2)$$
where $\delta$ is the Dirac delta (Fig. 2). As $L_f(y)$ is a monotonically non-decreasing function of $y$, $\lambda_f(y)$ is non-negative everywhere and $\int \lambda_f(y)\, dy = 1$. Therefore, $\lambda_f(y)$ can be viewed as a PDF associated with the "random variable" $y = f(x)$. Such an interpretation can be literal, for example, if $f(x)$ is a sample of a spatially ergodic stochastic process $\{f(x)\}$. A useful result involving the histogram of a function is the following equation for mean values:
$$\int g(y)\, \lambda_f(y)\, dy = \frac{1}{|\Omega|} \int_\Omega g(f(x))\, dx. \qquad (3)$$
[…]
We would like to quantify and prove the following hypothesis, whose intuitive content is evident in Fig. 1 (see also the first paragraph of this section): when $f(x)$ becomes narrow, its histogram $\lambda_f(y)$ becomes broad, and vice versa. In other words, it is impossible to make a function and its histogram arbitrarily narrow simultaneously. In accordance with this hypothesis, one could try to show that the product of the widths of $f(x)$ and $\lambda_f(y)$ is always larger than some positive value. Unfortunately, this is clearly not true for finite intervals $\Omega = [a, b]$, because the width of the histogram is zero for constant functions $f(x) = c$, while the width of such functions is finite. However, it turns out that the following closely related inequality does indeed hold for any non-negative function $f$ for which the involved integrals are well defined:
[…] (7)
where $C'_1$ is a dimensionless constant which is precisely defined below. Note that Eq. (7) holds for constant functions $f(x) = c$, for any constant value $c > 0$. Moreover, Eq. (7) is exact; it becomes an equality for certain non-negative functions, such as the Epanechnikov function
$$f_E(x) = C\, [(x - a')(b' - x)]_+ ,$$
where $C$ is an arbitrary positive constant, $a \le a' < b' \le b$ and the subscript "+" denotes that the function is equal to zero at points where the expression inside the square brackets is negative. See Fig. 3(a) for a sketch of the functional form of the Epanechnikov distribution, together with the associated histogram in Fig. 3(b). For comparison, the truncated Gaussian distribution is shown in Fig. 3(c), together with its associated histogram in Fig. 3(d).
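The mean-value identity for histograms, namely that the average of $g(y)$ under the histogram of $f$ equals the spatial average of $g(f(x))$ over $\Omega$, lends itself to a quick numerical check. The sketch below is illustrative only: the test function (an Epanechnikov-type bump on $[-2, 2]$), the choice $g(y) = y^2$, the grid size and the number of histogram bins are all assumptions made here, and `numpy.histogram` is used as a discrete stand-in for the normalized histogram function.

```python
import numpy as np

# Hypothetical test function: an Epanechnikov-type bump on Omega = [-2, 2].
a, b = -2.0, 2.0
x = np.linspace(a, b, 400_001)
f = np.clip(1.0 - x**2, 0.0, None)

# Normalized histogram of the values of f (discrete estimate of lambda_f).
density, edges = np.histogram(f, bins=2000, density=True)
centres = 0.5 * (edges[:-1] + edges[1:])
bin_widths = np.diff(edges)

def g(y):
    """Test function for the identity; g(y) = y**2 is an arbitrary choice."""
    return y**2

# Left-hand side: average of g under the histogram.
histogram_average = float(np.sum(g(centres) * density * bin_widths))
# Right-hand side: spatial average of g(f(x)) over Omega.
spatial_average = float(np.mean(g(f)))

print(histogram_average, spatial_average)
```

The two averages agree up to the binning error, and both are close to the analytic value 4/15 for this particular test function.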
A general proof of Eq. (7) follows from the identity
$$\int y^2\, \lambda_f(y)\, dy = \frac{1}{|\Omega|} \int_\Omega f^2(x)\, dx,$$
that can be obtained from Eq. (3) with $g(y) = y^2$, and the one-dimensional ($d = 1$) version of the mathematical inequality:
[…] (9)
Before continuing, we make some interpretive remarks. The form of the noise-resolution uncertainty principle, given in Eq. (7), is sketched in Fig. 3(e). This shows the scaled width […] in Fig. 3(d), is non-optimal in the sense of corresponding to the non-lower-bound dotted line in Fig. 3(e). This illustrates a previously mentioned point: while Gaussian distributions minimise the position-momentum uncertainty product given by the Heisenberg uncertainty principle, such distributions are not optimal from a noise-resolution perspective. Conversely, Epanechnikov distributions are optimal from the perspective of the trade-off between noise and resolution, but are non-optimal with regard to the trade-off between position (width) and momentum. Because the functional […] is invariant with respect to rescaling of the width or the height of the function $f$ 9 , one […] for any given function type. This implies, in particular, that the curves $W(V)$ always asymptotically converge to zero when $V \to \infty$.
Consider now the "variance" of a function defined in the conventional way:
$$\frac{1}{|\Omega|} \int_\Omega [f(x) - \bar{y}]^2\, dx = \int (y - \bar{y})^2\, \lambda_f(y)\, dy, \qquad (12)$$
where $\bar{y}$ is the mean value of $f$ over $\Omega$ and we have used Eq. (3) with $g(y) = (y - \bar{y})^2$. Equation (12) implies, in particular, that the variance of a function is equal to the square of the width of its histogram: […] Hence, Eq. (7) can be re-written as […] (14) It is instructive to consider the NRU Eq. (14) for Dirac-delta type functions.
[…] not be formally evaluated, because the square of the delta function is not defined in the space of generalized functions. Consider, however, an area $\Omega_b \equiv [-b, b]$ and the set of Gaussians […], and hence in the left-hand side of Eq. (14) we have […] This value tends to the constant $(2\sqrt{\pi})^{-1} \cong 0.28 > C'_1$ when $\sigma \to 0$, and, in particular, the NRU does not decrease below this positive value when the Gaussians approach the Dirac delta function.
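The Gaussian limit of about 0.28 can be checked numerically. The sketch below assumes that the left-hand side of the NRU is equivalent to the product of $\int f^2 dx / (\int f dx)^2$ and the root-mean-square width of $f$; this form is inferred here from the histogram identities and is not quoted from the equations themselves. The function name `nru_functional`, the domain half-width $b = 1$ and the list of $\sigma$ values are likewise hypothetical choices.

```python
import numpy as np

def nru_functional(x, f):
    """Assumed NRU left-hand side: (int f^2 / (int f)^2) times the rms width of f."""
    dx = x[1] - x[0]
    mass = np.sum(f) * dx
    mean_x = np.sum(x * f) * dx / mass
    rms_width = np.sqrt(np.sum((x - mean_x) ** 2 * f) * dx / mass)
    return float(np.sum(f**2) * dx / mass**2 * rms_width)

b = 1.0  # half-width of the domain [-b, b] (arbitrary choice)
x = np.linspace(-b, b, 200_001)

# Gaussians of decreasing width, approaching a Dirac delta on [-b, b].
gaussian_values = {}
for sigma in (0.5, 0.2, 0.05, 0.01):
    gaussian_values[sigma] = nru_functional(x, np.exp(-x**2 / (2.0 * sigma**2)))
    print(f"sigma = {sigma}: {gaussian_values[sigma]:.4f}")

# Epanechnikov profile filling the same domain, for comparison.
epanechnikov_value = nru_functional(x, np.clip(1.0 - (x / b) ** 2, 0.0, None))
gaussian_limit = 1.0 / (2.0 * np.sqrt(np.pi))
print(f"Epanechnikov: {epanechnikov_value:.4f}, (2 sqrt(pi))^-1: {gaussian_limit:.4f}")
```

Under this assumed form, the narrow Gaussians settle at $(2\sqrt{\pi})^{-1} \approx 0.282$, while the Epanechnikov profile gives a slightly smaller value of about 0.268, consistent with Gaussians being close to, but not on, the lower bound.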
Noise-resolution uncertainty for stochastic processes. Let us consider an application of the above result to ergodic stochastic distributions (spatial stochastic processes); see Fig. 4(a). Here it is convenient to work with symmetric domains $\Omega_b = [-b, b]$. To keep the notation consistent with that used above, instead of considering the proper limits of various quantities at $b \to \infty$ as required in the theory of stochastic processes, we simply assume that $b$ is sufficiently large. For example, the autocorrelation function will be defined as […] implicitly assuming that $b$ is so large that the contribution from the "tails" of the integral at […] Because ergodic distributions $f(x)$ are always statistically stationary 26 , the notion of width does not make sense for them. However, the width of $\Gamma_f(x)$, which is the correlation length, is well defined. We consider here a frequently encountered case of "transmission" through a smearing information channel, where $f(x)$ is a convolution of an "input" distribution $f_{in}(x)$, as sketched in Fig. 4(a), with a deterministic point-spread function $P(x)$ (Fig. 4(b)), […] as, for example, in a typical case of a photon flux measured by a position-sensitive area detector (see Fig. 4(c)). For each distribution, we also consider the corresponding "noise" distribution $\tilde{f}$ […] the input distribution is essentially uncorrelated, its correlation length is very small and the correlation is determined by the width of $P(x)$ (cf. Fig. 4(a-c)). Under these conditions, we obtain an analogue of the NRU for the stochastic distribution $f = f_{in} * P$ by applying Eq. (9) to $P(x)$. Firstly, note that […] It is also easy to verify that […] is the autocorrelation of $P(x)$. Therefore, […] (21) can be close to zero, in which case Eq. (21) does not impose any substantial lower limit on the product of $1/\mathrm{SNR}^2$ and the width of $P(x)$. We see below that a similar situation can occur for quantum fields.
Returning to Eq. (22), this expression may be re-arranged to give the inequality for $\mathrm{SNR}/\mathrm{SNR}_{in}$ that is plotted in Fig. 4(d). The vertical axis of Fig. 4(d) may be viewed as a "signal-to-noise-ratio gain factor" which quantifies how much larger the SNR of the PSF-smoothed signal $f(x)$ is, compared to the unsmoothed signal $f_{in}(x)$. The greater the degree of smoothing, the further one moves along the horizontal axis of Fig. 4(d), and the greater the maximum degree of SNR improvement (at the expense of coarsened spatial resolution). This form of the noise-resolution inequality makes the intuitively reasonable statement that binning over larger effective pixels will reduce noise, and hence increase SNR, at the expense of coarsened spatial resolution.
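The trade-off described here, in which smoothing with a wider point-spread function buys a higher SNR, can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: the input is taken to be spatially uncorrelated Poisson noise with a hypothetical mean of 50 photons per pixel, the PSF is a unit-area box kernel, and SNR is estimated as the ratio of sample mean to sample standard deviation. None of these choices is taken from the figure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical input: uncorrelated Poisson counts with a flat mean.
n_in = 50.0
signal_in = rng.poisson(n_in, size=200_000).astype(float)

def snr(values):
    """Signal-to-noise ratio estimated as mean over standard deviation."""
    return float(np.mean(values) / np.std(values))

def box_smooth(values, width):
    """Convolve with a unit-area box PSF that is `width` pixels wide."""
    kernel = np.ones(width) / width
    return np.convolve(values, kernel, mode="valid")

snr_in = snr(signal_in)
gains = {width: snr(box_smooth(signal_in, width)) / snr_in for width in (1, 4, 16)}
for width, gain in gains.items():
    print(f"PSF width {width:2d} pixels: SNR gain = {gain:.2f}")
```

For uncorrelated input noise the gain grows roughly as the square root of the PSF width, so a sixteen-fold loss of resolution yields about a four-fold increase in SNR.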
In relation to imaging problems, note that the structure of Eq. (22) is reminiscent of the Detective Quantum Efficiency (DQE) at zero frequency, which describes the efficiency of a detector and is usually equal to the ratio of the squared SNR in output and input signals 23 . […] See Fig. 4(e). Equation (23) with $\gamma = 1$ expresses a typical form of NRU in imaging 6 , where, at a fixed incident photon fluence $F_{in} = n_{in}/h_{in}$, the ratio of the squared SNR to spatial resolution cannot be increased beyond a certain absolute limit. More generally, if $n_{in} > 1$ and $\gamma \gg 1$, $n_{in}^{\gamma}$ can in principle be very large and the SNR can be essentially decoupled from the correlation length of a stochastic distribution, at a fixed incident fluence (radiation dose) level. A similar phenomenon is also observed in the context of quantum field theory, as demonstrated below. This can be advantageous in imaging of radiation-sensitive samples 6,20 .
We note that the above results have direct implications for the information capacity of imaging systems 11,12,28 and possibly for quantum information capacity as well [17][18][19] . In particular, Eq. (22) provides an upper bound for the Shannon information capacity $C_S$ of an imaging system with $M_P$ independent channels (effective pixels) and Poissonian noise statistics, $\mathrm{SNR}_{in}^2 = n_{in}$: […] (24) where $N_{in} \equiv M_{in} n_{in}$ denotes the mean total number of photons used for imaging. Equation (24) implies that the Shannon information capacity per photon is limited from above by an absolute constant […]. This limit from above on the Shannon information capacity per photon ($C_S/N_{in}$) is shown as the dashed horizontal line in Fig. 4(f). The curves in this figure correspond to […], where $E$ is the ratio of (i) the number of independent channels (effective pixels) to (ii) the mean total number of photons. When the number of effective pixels per photon $E$ has the relatively small values of E = 0.05 or E = 0.2, which corresponds to a scenario in which there are many photons per effective pixel, the classic Shannon-information-per-photon curves are not constrained by the noise-resolution limit given by the dashed line, for the plotted range of SNR values from 0 to 20. Stated differently, noise does not affect resolution when there are many photons per effective pixel. However, for the same plotted range of SNR values, the classic Shannon-information-per-photon curves are constrained by the noise-resolution limit when the number of effective pixels per photon takes the values E = 0.5, 1, 5; this corresponds to the case where there are few photons per effective pixel, which is precisely the domain in which noise affects resolution: one cannot ignore the implications of noise-resolution uncertainty in such an imaging regime.
Noise-resolution uncertainty in quantum field theory. Thus far, all of our discussions have used a classical-optics formalism. However, in the dilute-illumination case where the mean number of photons per pixel, $n_{in}$, becomes small, the quantum nature of light will become important. Hence, for example, Eq. (23) will become less reliable in this regime where quantum effects become more significant (i.e. when $n_{in} < 1$; cf. Fig. 4(e)). In this regime, the inevitable contribution of detector and/or source statistics cannot be ignored, motivating a quantum field-theoretic treatment of noise-resolution uncertainty to which we now turn.
Consider an analogue of the NRU Eq. (14) in quantum electrodynamics (QED). For a single-mode electric field, […], $\hat{a}_k$ and $\hat{a}_k^\dagger$ are the photon annihilation and creation operators for the kth mode, respectively 27 . If $\hat{\rho}$ is a density operator and $\langle n_k \rangle$ is the mean number of photons in the mode, then […] where we have used the commutation relation $[\hat{a}_k, \hat{a}_k^\dagger] = 1$ 27 . Therefore, the coherence functions of the second and fourth order 27,29 , in the special case of single-point measurements, can be expressed as […] where $\bar{r}$ is the mean value of $r$ with respect to the PDF […]. This allows us to introduce the quantities […] and to consider the following analogue of the functional on the left-hand side of the NRU Eq. (14): […] It is easy to verify that […] (36) […] is the Mandel Q parameter 29,30 . For Poissonian statistics, corresponding to coherent states, we have Q = 0 and Eq. (36) transforms back into the "classical" form of the NRU. In non-classical quantum states with sub-Poissonian statistics, Q can become negative. Then Eq. (36) implies that it may be possible, in principle, to make a mode arbitrarily narrow, achieving arbitrarily fine spatial resolution in a corresponding experiment, and at the same time have an arbitrarily small "average variance", $G_2 - (G_1)^2$. It is instructive to look at the specific values that a two-dimensional analogue of the NRU, Eq. (36), implies for the lower limit of the product of the average degree of fourth-order coherence, $G_2/(G_1)^2$, and the width of the mode in the "classical" case (Poissonian statistics). For that, consider an experiment where measurements are performed by a two-dimensional position-sensitive detector located near the origin of coordinates in the plane $(\mathbf{r}_\perp, z = 0)$ and a mode has a Gaussian spatial distribution in the detector plane, […] (40) In this case, the factor $1 + Q/\langle n_k \rangle$, multiplying the constant $C'_3$ on the right-hand side of Eq. (36), is less than one, meaning that the "classical" lower limit of $C'_3$ can be overcome, albeit only by a small margin which decreases with the increasing mean number of photons in the mode.
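The role of the Mandel Q parameter can be made concrete with a short calculation from photon-number distributions, using the standard definition $Q = (\mathrm{Var}(n) - \langle n \rangle)/\langle n \rangle$. The sketch below is illustrative only: the helper names `mandel_q` and `poisson_pmf` are introduced here, and the Fock (photon-number) states are used as a convenient extreme example of sub-Poissonian statistics rather than as the specific state considered in the text.

```python
import numpy as np

def mandel_q(probabilities):
    """Mandel Q = (Var(n) - <n>) / <n> for a photon-number distribution p(n)."""
    p = np.asarray(probabilities, dtype=float)
    n = np.arange(p.size)
    mean_n = np.sum(n * p)
    var_n = np.sum((n - mean_n) ** 2 * p)
    return float((var_n - mean_n) / mean_n)

def poisson_pmf(mean_n, n_max):
    """Poisson distribution truncated at n_max (photon statistics of a coherent state)."""
    n = np.arange(n_max + 1)
    log_factorial = np.concatenate(([0.0], np.cumsum(np.log(np.arange(1, n_max + 1)))))
    return np.exp(n * np.log(mean_n) - mean_n - log_factorial)

# Coherent state with a hypothetical mean of 4 photons: Q is zero.
q_coherent = mandel_q(poisson_pmf(4.0, 80))

# Fock states |n>: zero photon-number variance, so Q = -1.
factors = {}
for photons in (4, 16, 64):
    fock = np.zeros(photons + 1)
    fock[photons] = 1.0
    factors[photons] = 1.0 + mandel_q(fock) / photons

print(f"Q (coherent) = {q_coherent:.2e}")
for photons, factor in factors.items():
    print(f"Fock state n = {photons}: 1 + Q/<n> = {factor:.4f}")
```

For these states the factor $1 + Q/\langle n_k \rangle$ equals $1 - 1/n$, so it stays below one but approaches one as the photon number grows, mirroring the statement that the classical limit can be beaten only by a margin that shrinks with increasing mean photon number.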
Finally, we note that, using our earlier results 9 , it can be shown that the NRU for the spatially-averaged variance of the electric energy operator, […], can be written in a form similar to Eq. (36), but with the right-hand side always limited from below by an absolute positive constant. This difference in the form of the NRU for the photon number and the electric energy operators is caused by the presence of vacuum fluctuations, which induce a shift of the absolute lower bound of the variance of the electric energy operator. Interestingly, vacuum fluctuations have also been demonstrated to induce a shift in the dispersion relations between physical observables in wave models of quantum mechanics, such as prequantum classical field theory 31 .

Discussion
A point of central importance has been only lightly touched upon in our discussions: this is the fact, reported by many workers over many decades, that the spatial resolution of an imaging system cannot be meaningfully defined in the absence of noise 11,12,14,18,[32][33][34][35][36][37][38][39] . All of the various forms of noise-resolution trade-off, as given in the present paper, may be viewed as a particular re-statement of this noise-dependent notion of spatial resolution. It would be interesting to compare (i) the noise-dependent notion of resolution as quantified by the noise-resolution uncertainty principle, with (ii) other noise-dependent methods for defining resolution, such as Fourier ring correlation 40 or approaches based on statistical parameter-estimation theory 36 .
The lower bounds, for which the two-dimensional forms of the inequalities in Eqs. (7), (9), (14), (21) become equalities, represent limits bounding the noise-resolution properties of arbitrary images. As has been previously mentioned, such limit cases correspond to the solid line bounding the shaded areas in Figs. 3(e) and 4(d). Now, in many cases, one will have a population of pixellated images, such as a database of images of handwritten characters, stars, galaxies, quasars, stellar spectra, faces, fingerprints, x-ray radiographs or x-ray diffraction patterns. Any one of these and similar image databases will define a context-specific cloud of points which lie outside the prohibited regions of two-dimensional forms of Figs. 3(e) and 4(d). For a given image database, it may be instructive to plot the corresponding cloud of points on top of the allowed regions in Figs. 3(e) and 4(d). Such plots could serve several purposes, such as (i) surveying the degree of optimality from the perspective of the noise-resolution trade-off, of a given image database; (ii) investigating strategies to improve noise or resolution in the presence of suitable constraints such as dose, acquisition time, source size and pixel size.
Another interesting avenue for future work would be to further explore the Epanechnikov states which optimise the trade-off between noise and resolution that is inherent in all of the listed forms for the noise-resolution uncertainty principle. Both motivation and context for this suggestion are given by the huge degree of applicability of the corresponding states of minimum uncertainty product from the perspective of the Heisenberg uncertainty principle: these are the Gaussian distributions and their generalisation as given by the coherent states 27 . Indeed, it would scarcely be hyperbole to claim that the states which minimise the Heisenberg uncertainty principle's uncertainty product, namely the coherent states of classical and quantum field theory, pervade much of the fabric of physics in general and optical physics in particular. This suggests that the Epanechnikov states may have a deeper significance that is worthy of further investigation in the context of noise-resolution uncertainty. Along similar lines, it would also be interesting to investigate states that interpolate between the Gaussian and Epanechnikov distributions, in the sense of jointly minimising some suitable combination of noise-resolution and position-momentum uncertainty.
It would also be worth exploring the implications of the present work for the formalism of partially coherent optical fields. Several concepts of direct relevance to partially coherent optical fields have appeared in the present paper, particularly in an earlier section's mention of ergodic stochastic processes, autocorrelation lengths etc. Given that ergodic stochastic processes underpin the modern theory of partially coherent optical fields, it would be interesting to further explore any additional connections that may exist between this theory and the noise-resolution concepts in the present paper.

Conclusions
We have demonstrated that the width of a function and the width of its histogram cannot be made arbitrarily small at the same time. This relationship can also be stated in terms of an uncertainty relationship between the spatial resolution and the signal-to-noise ratio of a distributed measurement. In the case of statistical quantities associated with stochastic or quantum observables, for example, correlation functions of bosonic fields, the NRU can in principle be made arbitrarily small for non-classical states with sub-Poissonian statistics. We have shown that photon number squeezed states in quantum optics present an example where the classical limit for the NRU can be overcome.