Abstract
Although deconvolution can improve the quality of images from any type of microscope, the high computational time required has so far limited its widespread adoption. Here we demonstrate the ability of the scaled-gradient-projection (SGP) method to provide accelerated versions of the algorithms most used in microscopy. To achieve further gains in efficiency, we also consider implementations on graphics processing units (GPUs). We test the proposed algorithms on both synthetic and real data from confocal and STED microscopy. By combining the SGP method with the GPU implementation we achieve speedup factors from about 25 to 690 (with respect to the conventional algorithm). The excellent results obtained on STED microscopy images demonstrate the synergy between super-resolution techniques and image deconvolution. Further, the real-time processing preserves one of the most important properties of STED microscopy, i.e., the ability to provide fast sub-diffraction resolution recordings.
Introduction
Image deconvolution is a computational technique that mitigates the distortions created by an optical system. Agard first applied image deconvolution to fluorescence microscopy in the early 1980s^{1}. In this seminal paper Agard proposed different algorithms for deconvolving images acquired as three-dimensional (3D) stacks using wide-field microscopy (WFM). In a nutshell, the focal plane of the objective lens moves along the thickness of the specimen and for each position the microscope generates a two-dimensional (2D) image. Due to diffraction, each 2D image, also called an optical section, includes considerable out-of-focus light originating from regions of the specimen above and below the focal plane. Image deconvolution uses information describing how the microscope produces the image (forward model) as the basis of a mathematical transformation that reassigns the out-of-focus light to its points of origin.
Later, many new optical methods have been proposed to remove out-of-focus light and to generate true optical sections directly. Without pretending to be exhaustive, we mention confocal laser scanning microscopy (CLSM)^{2,3}, two-photon excitation microscopy (TPEM)^{2,4} and selective plane illumination microscopy (SPIM)^{5,6}. All these methods remove out-of-focus light by rejecting such light before it reaches the detector or by precluding its generation. Further hybrid techniques, which remove out-of-focus light by combining optical and computational methods, are 4Pi microscopy^{7,8} and structured illumination microscopy (SIM)^{9,10}.
Since CLSM, TPEM and SPIM have a considerably smaller contribution of out-of-focus light, they are sometimes considered as pure alternatives to the combination of deconvolution and WFM. However, it has been shown that these techniques can also strongly benefit from image deconvolution^{11,12,13,14}. Although the out-of-focus background is reduced, the images produced by such systems are still blurred versions of the specimen's structures in the focal plane and are contaminated by noise, so deconvolution can improve their contrast and signal-to-noise ratio. Similarly, a single 2D image can also benefit from deconvolution, especially when obtained from a thin specimen, where the out-of-focus background vanishes.
More recently, new super-resolution fluorescence microscopy approaches (usually referred to as nanoscopy) have enlarged the portfolio of tools for investigating biological samples^{15}. The nanoscopy techniques have effectively broken the diffraction barrier and moved the spatial resolution of fluorescence microscopy down to the nanoscale^{16}. Importantly, in these cases too image deconvolution can help to improve image quality. This has been demonstrated both for stimulated emission depletion (STED) microscopy^{17}, which at the moment can be considered the method of choice among the targeted nanoscopy techniques, and, more recently, also for stochastic nanoscopy techniques^{18}. As a matter of fact, all microscopy techniques that include, directly or indirectly, a convolution in their image formation process can benefit from image deconvolution. It is also important to remember that any quantitative analysis of fluorescence images, e.g., colocalization analysis or volume/area estimation, is significantly improved if performed on deconvolved images^{19,20}.
In this scenario, one would expect any 2D or 3D image obtained from almost any fluorescence microscope to be deconvolved before being analyzed. Unfortunately, this is not the case. The main disadvantage that precludes the widespread adoption of deconvolution is its high computational demand, which leads to long waiting times before the result is produced. As a consequence, in many applications image deconvolution is skipped to avoid strong delays in the data-analysis pipeline. The situation becomes almost prohibitive in the case of large-scale images. For the above-mentioned reasons, several methods to increase the speed of the deconvolution process have been proposed.
Two main directions have been followed. The first relies on the implementation of the algorithms, i.e., parallelization of the computation and/or implementation on graphics processing units (GPUs)^{21,22,23,24}. A second approach, which attracted strong interest in the 90s, relies on the development of schemes to accelerate the deconvolution algorithms^{25,26}. Even if linear deconvolution, e.g., Wiener filtering, is extremely fast, its application to noisy images in general provides poor results; on the other hand, nonlinear deconvolution methods, and in particular iterative methods (with or without regularization), lead to excellent results, but their convergence is very slow, requiring hundreds or thousands of iterations. The major representative algorithms for nonlinear deconvolution in fluorescence microscopy are based on the maximum-likelihood (ML) approach and, in the regularized version, on the maximum a posteriori (MAP) approach^{27}. These algorithms can take advantage of prior information about the image formation process and the specimen, effectively reducing the ill-posedness of the problem. Most of these algorithms are iterative first-order methods, hence their implementation is easy (basically the computation of a matrix-vector multiplication at each iteration), but, as already mentioned above, their convergence is very slow. In this paper we present a deconvolution package that combines both strategies.
Recently, Bonettini et al.^{28} developed an optimization method, which they called the scaled-gradient-projection (SGP) method, able to fundamentally speed up the algorithms based on ML and MAP. In this paper we use the SGP method to derive a more efficient version of the Richardson-Lucy (RL) algorithm^{29,30}, which represents the best-known non-regularized algorithm for the deconvolution of microscopy images. Moreover, the SGP method is also used to derive an acceleration of another important and widely used regularized deconvolution algorithm, based on a quadratic regularization term^{31}. Finally, both algorithms can be complemented by a boundary effect correction, following the approach proposed by Bertero and Boccacci^{32}. This correction allows the application of these algorithms also to images of cropped structures.
We first implemented the algorithms for classical central-processing-unit (CPU)-based calculation, in order to quantify the effective speedup obtained by the proposed SGP method with respect to RL. We then implemented the algorithms for GPU-based calculation to further reduce the time of the deconvolution process. Importantly, the code for the CPU-based implementation will be freely distributed, as well as the executable files for the GPU-based implementation.
The purpose of this paper is not only to illustrate the features of the SGP method to the microscopy community, but also to provide quasi-real-time deconvolution algorithms able to drastically reduce the time of the image-analysis pipeline. We used both CLSM and STED microscopy images to demonstrate the speedup of the SGP-based algorithms. However, the very same algorithms can be applied to any other fluorescence microscopy technique by simply providing the relative point-spread-function or, more generally, the relative forward model.
Results
The maximum-likelihood and maximum a posteriori approaches reduce the deconvolution problem to the minimization of a suitable functional (equations (2) and (3)). This functional encodes most of the information about the image formation process and, when possible, information about the object to restore, so that its design represents an important step for the quality of the deconvolution results. On the other hand, the speed of a deconvolution algorithm strictly depends on the scheme used to minimize the functional. Since in this work we focused our attention mainly on the speed issue, we compared the performance of the algorithms when they minimize the very same functionals.
We first used two-dimensional (2D) CLSM and STED microscopy synthetic images to compare the well-known RL algorithm with the SGP-based algorithm (both minimizing the functional described in equation (2)). Realistic phantoms are crucial when the robustness of algorithms has to be evaluated. For this reason, we implemented a routine able to generate pseudo-random phantoms which mimic the microtubule cytoskeleton of a cell (see Methods). We simulated the images of the two microscopy modalities by using the same random microtubule network specimen (Fig. 1a) and the same imaging conditions (see Methods), but two different point-spread-functions (PSFs) (insets Fig. 1b) to mimic the different spatial resolutions. In particular, we assumed a Gaussian-shaped PSF (see Methods) with a full-width at half-maximum (FWHM) of 220 nm for CLSM (σ_{r} = 93 nm) and a Gaussian-Lorentzian-shaped PSF (see Methods) with a FWHM of 100 nm for STED microscopy (σ_{r} = 93 nm, ψ = 3.22·10^{−3} nm^{−1}, ς = 7). Figure 1e–l shows a side-by-side comparison of RL and SGP-based restorations. Clearly, the STED microscopy image reveals finer details than the CLSM image because of their difference in spatial resolution (Fig. 1b,c,d, Fig. S1). Importantly, we deconvolved the synthetic images by means of the very same PSFs used for their generation (inverse crime), so that the results were not fundamentally biased by the choice of the PSF. After deconvolution we obtained excellent contrast improvement and noise reduction, which help distinguish more structural details in the restored CLSM images (Fig. 1e,g,i,k), as well as in the restored STED microscopy images (Fig. 1f,h,j,l).
More interesting for the scope of this work is the comparison between the RL and SGP-based restorations. Both algorithms led to similar results (Fig. 1e,f,i,j). However, a closer look at the restorations (Fig. 1h,l,g,k) reveals slight differences. For example, the SGP-based images offer higher contrast than their RL-based counterparts. A quantitative analysis confirmed this improvement (Fig. S1). Even if the RL and SGP algorithms converge to the same minimum, they follow different approximation paths, and the restorations satisfying the stopping rule can present marginal differences.
Whereas the RL and SGP algorithms are similar in terms of restoration quality, they differ strongly in terms of speed. Figure 2 plots the time and the number of iterations required to obtain optimal (in terms of restoration accuracy, see Methods) restored images as a function of the image size (number of pixels). We confirmed the robustness of the algorithms against noise and object structure by running them with different noise realizations and different random tubulin network realizations. Moreover, we carefully kept the concentration of filaments constant for all image sizes, in order to remove any dependency of the optimal number of iterations on the image size. On the other hand, the optimal number of iterations changes between the two microscopy techniques. However, the reason is not connected to the microscopy technique itself, but to the different intensity dynamics of their images (color bars Fig. 1a) and the different sizes of their PSFs. As a rule of thumb, the optimal number of iterations increases with increasing intensity dynamics (number of photons collected per pixel) and blurring of the image (size of the PSF with respect to the pixel size).
More interesting for the comparison of the SGP and RL algorithms is that SGP reduces the optimal number of iterations (by ~87% for CLSM and ~51% for STED microscopy). This is in agreement with the main feature of the SGP method, i.e., the ability to find optimal directions toward the minimum of the functional and thereby reduce the number of iterations needed. However, a fair assessment of the speedup of the SGP algorithm has to take into account that a single SGP iteration requires more computation than a single RL iteration. Accounting for this, the overall time speedup obtained with the SGP algorithm is ~20% for STED microscopy and ~80% for CLSM. Similarly to the RL algorithm, the SGP algorithm also decreases the optimal number of iterations when the signal-to-noise ratio (SNR) decreases. Thus, in a regime of very low SNR the speedup of the SGP-based algorithm with respect to the RL algorithm can shrink (Fig. S2).
After estimating the speedup related to the SGP algorithm alone, we evaluated the further speedup obtained by implementing the SGP algorithm on a GPU (instead of a CPU). Figure 3 shows the time needed to obtain the optimal restoration as a function of the image size. The GPU-based algorithm works ~10 times faster for small images (126 × 126 pixels) and ~100 times faster for large images (4096 × 4096 pixels) when compared to the CPU-based algorithm. Notably, this speedup comes on top of the speedup provided by the SGP algorithm itself: for example, for large CLSM images (4096 × 4096 pixels) the GPU-based SGP algorithm needs ~8 s, which is ~690 times faster than the CPU-based RL algorithm. If we consider that, to achieve an adequate SNR, a modern CLSM needs a pixel dwell-time of at least 1 μs, in this example the deconvolution process is at least 2 times faster than the time needed to collect the image.
Next, motivated by the promising results on synthetic images, we applied the SGP algorithm to real images of the tubulin network (Fig. 4). In contrast to the results on synthetic images, results on real images strictly depend on the PSF. Thus, even if any method which estimates the PSF is fully compatible with the proposed algorithms, one has to pay particular attention to the choice of the PSF. A PSF may be empirical, i.e., measured^{33}, or theoretical, i.e., calculated^{34}. An empirical PSF is generally obtained by imaging sub-resolved structures under the same system conditions (i.e., optics and specimen environment) used to image the specimen, whereas a calculated PSF is generated by using analytical models which require parameters such as the wavelength configuration, objective lens details, refractive indexes of the immersion and mounting media, etc. Both methods present advantages and disadvantages. Briefly, an empirical PSF is contaminated by noise and has to be measured in exactly the same conditions that will be used to image the specimen; on the other hand, a measured PSF takes into account any kind of aberration that can arise in the whole system, including aberrations introduced by the specimen itself. A theoretical PSF is noise-free, but its computation requires complex models and much information that is not easy to know. A third option also exists, in which the PSF is estimated from the image together with the unknown object, i.e., blind deconvolution^{35}. In this paper we adopted a hybrid method (see Methods): we used a rather simple parametric PSF model whose parameters are directly extracted from the image of sub-resolved structures contained in the very same specimen (σ_{r} = 93 nm, ψ = 3.22·10^{−3} nm^{−1}, ς = 5.2). Importantly, in the case of deconvolution for STED microscopy it is extremely important to estimate the PSF directly from the image being deconvolved, since the PSF also strictly depends on the properties of the fluorescent marker.
For example, the use of fluorescent beads can result in a wrong estimation of the PSF, since in most cases the fluorescent marker used for the beads is different from the one used for labeling the specimen.
The superior resolution of STED microscopy clearly highlights filament intersections that cannot be resolved in the CLSM counterpart (Fig. 4a–d). By strongly improving the contrast and reducing the noise, the SGP algorithm is able to recover many structural details from the raw CLSM image, as well as from the raw STED microscopy images. These results fully confirm the importance of applying deconvolution also to super-resolution techniques, such as STED microscopy. Moreover, this example clarifies the benefits of using algorithms based on equation (2), like RL and SGP. It is well known that minimization of equation (2) leads to pointwise (sparse) restorations. For this reason, many regularization methods have been proposed by different groups in order to apply deconvolution also to the imaging of piecewise-constant structures. In this work we applied deconvolution to tubulin network images, which represent a rather sparse structure. The SGP algorithm offers superior results when applied to reconstruct single isolated tubulin filaments. There are almost no differences between the CLSM and STED microscopy restorations when comparing the intensity profile through a single isolated filament (Fig. 4j), i.e., deconvolution of CLSM images can, in these particular circumstances, substitute for STED microscopy. On the contrary, when more convoluted structures are imaged, the lower resolution offered by CLSM cannot be compensated by deconvolution. STED microscopy, especially when combined with deconvolution, easily resolves two close (<100 nm) tubulin filaments (Fig. 4i), whereas CLSM, even if combined with deconvolution, fails at the same task (Fig. 4i). The GPU-based SGP algorithm provided the restoration in ~0.07 s (21 iterations) and ~0.16 s (45 iterations) for CLSM and STED microscopy, respectively, i.e., ~37 (STED) and ~16 (CLSM) times faster than the time the microscope needs to produce the images. The advantages of using the GPU-based algorithm become plain for 3D data sets.
We tested the SGP algorithm on a 3D CLSM image of the entire cytoskeleton of a cell (Fig. 5a), which took 180 s to be collected. Despite the huge data set (1024 × 1024 × 33 voxels), the GPU-based implementation of the SGP algorithm obtained an excellent restoration (Fig. 5b) after 20 iterations, taking ~35 s, which is about a factor of ~5 and ~35 faster than the collection time and the time needed by the CPU-based implementation, respectively. Finally, we remark that we obtained all the results working in double precision, so a further reduction of the running time is expected when using single precision. For example, when working in single precision the entire 3D cytoskeleton restoration needed ~17 s, i.e., ~2 times faster. Importantly, we observe that, in the microscopy context, running the deconvolution algorithms in single and double precision yielded qualitatively similar results.
Discussion
Image deconvolution can potentially improve image quality for any fluorescence microscopy technique, including the newly emerging nanoscopy techniques. However, the amount of computational time required, which characterizes any high-performance algorithm, has so far limited the widespread adoption of image deconvolution. In this paper we describe a framework able to efficiently reduce the computational time for solving both the ML (unregularized) and the MAP (regularized) deconvolution problems. This framework uses the SGP method for solving the minimization problem associated with deconvolution. As an example, we use this framework to derive an efficient alternative to one of the most used deconvolution algorithms in fluorescence microscopy, the RL algorithm. Further, we compared CPU-based and GPU-based implementations of this algorithm. The synergy between the SGP method and the GPU-based implementation achieves an improvement which ranges from about a factor of 25 to 690 (when compared to a CPU-based implementation of the RL algorithm), without losing reconstruction quality.
The executable files for the GPU-based implementation can be freely downloaded (http://www.unife.it/prisma), as well as the code for the CPU-based implementation. Moreover, as an example of the SGP method applied to regularized deconvolution, the software provides a GPU-based algorithm which can efficiently substitute for the widely used regularized algorithms based on Tikhonov regularization. Last but not least, the software includes a boundary effect correction, which allows the application of the algorithms to images of cropped structures.
It is important, however, to point out the limitations of the SGP method which, in this paper, is mainly applied to the ML problem because we are focusing on possible real-time applications. As previously remarked, the SGP method can also be applied to the solution of regularized problems (and one example is provided in this paper), but only if the regularization function is differentiable. This is an important limitation because, in general, the SGP method cannot be applied to the important case of sparse reconstruction schemes, i.e., ℓ^{1}-norm regularization. More precisely, it can be applied to the case of edge-preserving regularization if a smoothed TV-norm is used^{36}, but not to the case of sparsity of the object with respect to a suitable wavelet transform, such as a dual-tree complex wavelet transform or a dictionary composed of curvelets and an undecimated wavelet transform^{37,38}, an approach already proposed for confocal microscopy.
In the case of a piecewise-constant object with sharp edges, regularization by early stopping of unregularized SGP or RL can produce a smoothing of the edges, and therefore edge-preserving regularization is required. This over-smoothing effect does not appear in the restoration of tubulin networks because, as already remarked, such a network is essentially a sparse object and the ML solutions are sparse in the pixel space.
In the case of regularized methods, an important point is the choice of the regularization parameter. Any selection criterion in general requires the solution of several minimization problems, so that real-time application is not possible. One way could be the approach proposed in^{38}, where the choice of the parameter is reduced to the solution of a unique constrained minimization problem with an additional constraint related to the selection criterion. We believe that this constrained minimization problem is also too time-consuming to enable real-time deconvolution with the available GPU technology. A more practical way could be to calibrate the regularization parameter offline for a given class of objects (for instance, tubulin networks) and a given value of the signal-to-noise ratio. The estimated value could then be used for real-time SGP-based deconvolution.
We conclude by highlighting the advantages of image deconvolution for STED microscopy imaging. The question of whether conventional microscopy combined with image deconvolution alone (without using prior information about the object to reconstruct) can recover object frequencies beyond the cut-off frequency of the system (i.e., achieve sub-diffraction resolution) is still controversial. In the case of STED microscopy the situation is rather different. In a STED microscope the response of the object's emission rate to the illumination is nonlinear (exponential). Roughly speaking, this property allows the STED microscope to transfer all the object's frequencies (no cut-off frequency exists), thus permitting theoretically unlimited resolution^{39,40}. However, the strength of the frequencies declines rapidly with increasing order, leaving the practical resolution finite due to signal-to-noise concerns. On the other hand, in a STED microscope the strength of the high frequencies can be enhanced by increasing the intensity of the illumination, which unfortunately can also introduce photodamage effects in the specimen. In this scenario, image deconvolution can efficiently help to recover high frequencies which are transmitted by the microscope system but hidden by the noise, thereby improving the practical resolution without increasing the intensity of the illumination.
Methods
Scaled-gradient-projection (SGP) algorithm
Let us assume that the detected image values y_{i} (here i is an index labeling the pixels or voxels of the image) are realizations of independent Poisson random variables with unknown expected values (Hx + b)_{i}, where x is the unknown object, H is the imaging matrix, given in terms of the known PSF h of the microscope by

(Hx)_{i} = Σ_{j} h_{i−j} x_{j},   (1)

and b is the known background emission. Then the maximum-likelihood (ML) approach to the image deconvolution problem is equivalent to minimizing the following generalized Kullback-Leibler (KL) divergence (or Csiszár I-divergence)^{27}:

f_{0}(x; y) = Σ_{i} { y_{i} ln[ y_{i} / (Hx + b)_{i} ] + (Hx + b)_{i} − y_{i} }.   (2)

As shown in^{41}, an iterative algorithm converging to nonnegative minimizers of the KL divergence is the well-known Richardson-Lucy (RL) algorithm^{29,30}, recalled in the Supplementary Information.
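As a concrete illustration, one RL iteration can be written in a few lines. The following is a minimal FFT-based 2D sketch, assuming periodic boundary conditions and a PSF normalized so that H^{T}1 = 1; the function name is ours and is not taken from the distributed package:

```python
import numpy as np

def rl_iterate(x, y, otf, b=0.0, eps=1e-12):
    """One Richardson-Lucy update: x <- x * H^T( y / (Hx + b) ).

    x, y : current estimate and detected image (2D arrays)
    otf  : FFT of the (normalized, origin-centered) PSF
    b    : constant background emission
    """
    # forward model Hx + b (circular convolution via FFT)
    hx = np.real(np.fft.ifft2(np.fft.fft2(x) * otf)) + b
    ratio = y / np.maximum(hx, eps)          # guard against division by zero
    # back-projection H^T(ratio); conj(otf) implements the adjoint
    correction = np.real(np.fft.ifft2(np.fft.fft2(ratio) * np.conj(otf)))
    return x * correction
```

Starting from a flat positive guess and iterating this update several hundred times reproduces the classical RL behavior; the SGP method discussed in this section replaces this fixed-point update with scaled, projected gradient steps.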
Since it is known that the nonnegative minimizers of the generalized KL divergence consist of a set of bright spots over a black background, the so-called night-sky solutions^{42}, the algorithm cannot be pushed to convergence, and early stopping of the iterations is required to obtain a sort of "regularization" effect. Recently a few stopping criteria have been proposed^{43,44,38}, but their practical utility has still to be tested.
Regularization can also be obtained in a Bayesian framework by assuming that the unknown object x is a realization of a random variable. If the probability density (prior) is of the Gibbs type, by taking the negative logarithm of the posterior probability one finds that the maximum a posteriori (MAP) estimates are the nonnegative minimizers of the function

f_{β}(x; y) = f_{0}(x; y) + β f_{1}(x),   (3)

where the second term is the negative log of the prior. In the following we will call f_{1}(x) the regularization function and β the regularization parameter. Examples of f_{1}(x) considered in microscopy are, for instance, the square of the norm of x^{31} or edge-preserving functions of x^{45,46}.
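For reference, the two functionals can be evaluated directly. This sketch assumes the Tikhonov convention f_{1}(x) = ‖x‖²/2 (the factor ½ is our convention for illustration), and `hx_plus_b` denotes the precomputed model image Hx + b:

```python
import numpy as np

def kl_divergence(y, hx_plus_b, eps=1e-12):
    """Generalized KL divergence f_0(x; y) between data y and model Hx + b."""
    m = np.maximum(hx_plus_b, eps)
    return float(np.sum(y * np.log(np.maximum(y, eps) / m) + m - y))

def map_functional(y, hx_plus_b, x, beta):
    """MAP functional f_beta = f_0 + beta * f_1 with Tikhonov f_1(x) = ||x||^2 / 2."""
    return kl_divergence(y, hx_plus_b) + beta * 0.5 * float(np.sum(x ** 2))
```

Note that f_{0} vanishes exactly when the model reproduces the data, which makes it a convenient discrepancy measure as well.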
In the case of a differentiable penalty function f_{1}(x), several iterative methods have been proposed for the minimization of the function f_{β}(x; y) defined in equations (2) and (3). For our purposes two methods are interesting: the one-step late (OSL) method proposed in^{47} and the split-gradient method (SGM) proposed in^{48}. The first is used, for instance, in^{45} for total variation (TV) regularization and the second in^{46} for Markov random field (MRF) regularization.
It is easy to show that both OSL and SGM are scaled gradient methods; however, only in the case of SGM is the scaling nonnegative for any regularization function f_{1}(x) and any value of the regularization parameter β (see Supplementary Information). Therefore our reference algorithms are RL for the maximum-likelihood approach and SGM for the Bayesian approach.
Motivated by the wide application of these algorithms in microscopy, we derived new algorithms based on the scaled-gradient-projection (SGP) method^{28} which use the scaling suggested by the RL (equation (S3)) and SGM (equation (S6)) algorithms. In the first case SGP provides a very efficient solution of the ML image deconvolution problem, hence an acceleration of the RL method; in the second case, an efficient solution of the MAP image deconvolution problem with quadratic regularization, hence an acceleration of the algorithm proposed in^{31}. For the purpose of this paper, we describe the SGP algorithm in the case of diagonal positive definite scaling matrices and a nonnegativity constraint. The general case of arbitrary convex constraints and/or nondiagonal positive definite scaling matrices is given in^{28}.
The considered ML and MAP problems are particular cases of the following general convex optimization problem:

min_{x ≥ 0} f(x),

where f is a continuously differentiable convex function. The SGP algorithm for solving this problem is stated in Table 1.
Here we denote by P_{+} the projection onto the nonnegative orthant, i.e., the operator setting to zero the negative components of a vector, and by 𝒟 the compact set of diagonal positive definite matrices whose diagonal entries have values between two positive constants L_{1} and L_{2}, 0 < L_{1} < L_{2}.
As a gradient projection algorithm, SGP involves two standard elements: the choice of a descent direction (Step 3) by means of the projection onto the feasible region, and a line-search along the descent direction (Step 5). For the latter, a classical monotone line-search technique is considered but, as described in^{28}, nonmonotone strategies could also be exploited. The main feature of SGP consists in the definition of the search direction, which is obtained by combining diagonally scaled gradient directions with special steplength selection rules, with the aim of accelerating the path toward the minimum without losing the simplicity and low computational cost of each iteration. In particular, the choice of the steplength α_{k} is usually inspired by quasi-Newton properties, but without the need of computing any second-order information. In our implementations we use an adaptive alternation strategy based on the two Barzilai-Borwein (BB) rules which, in the case of scaled gradient directions, read as follows^{28}:

α_{k}^{BB1} = ( s^{(k−1) T} D_{k}^{−1} D_{k}^{−1} s^{(k−1)} ) / ( s^{(k−1) T} D_{k}^{−1} w^{(k−1)} ),
α_{k}^{BB2} = ( s^{(k−1) T} D_{k} w^{(k−1)} ) / ( w^{(k−1) T} D_{k} D_{k} w^{(k−1)} ),

where s^{(k−1)} = x^{(k)} − x^{(k−1)} and w^{(k−1)} = ∇f(x^{(k)}) − ∇f(x^{(k−1)}). When D_{k} is equal to the identity matrix, one obtains the standard BB rules^{49}. More details on the adaptive steplength alternation rule used in SGP are given in the Supplementary Information, and we refer to^{50,51,52} for a discussion of the rationale behind the steplength alternation. Concerning the choice of the scaling matrix D_{k}, it takes into account the special form of the function f(x) being minimized and needs to be faced separately for each application considered in the paper.
In the case of the minimization of the KL divergence we use the scaling suggested by equation (S3), corrected with a threshold assuring that the scaling matrix belongs to 𝒟:

(D_{k})_{ii} = min{ L_{2}, max{ L_{1}, x_{i}^{(k)} } }.

Similarly, the analysis of SGM suggests the following scaling in the application of SGP to the minimization of f_{β}(x; y):

(D_{k})_{ii} = min{ L_{2}, max{ L_{1}, x_{i}^{(k)} / (1 + β V_{1}(x^{(k)})_{i}) } },

where V_{1}(x^{(k)}) is a nonnegative array/cube defined by an appropriate splitting of ∇f_{1}(x^{(k)}). In the case of quadratic (or Tikhonov) regularization, we recall that V_{1}(x^{(k)}) = x^{(k)} (see Supplementary Information). There it is also shown that, as far as the SGP algorithm is concerned, the boundary effect correction is incorporated in the scaling matrix, while all the other steps remain unchanged.
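A compact sketch of the SGP iteration in this diagonal, nonnegatively constrained setting follows: a projected scaled gradient direction, a monotone Armijo backtracking line-search, and a scaled BB1 steplength update. The BB2 rule and the adaptive alternation are omitted for brevity, and all function names and parameter values are ours, chosen for illustration:

```python
import numpy as np

def sgp(f, grad, scaling, x0, iters=50, alpha=1.0,
        alpha_min=1e-5, alpha_max=1e5, gamma=1e-4, theta=0.4):
    """Scaled gradient projection sketch: minimize f(x) subject to x >= 0.

    f, grad : objective and its gradient (callables on arrays)
    scaling : returns the diagonal of D_k as an array (positive entries)
    """
    x = x0.copy()
    g = grad(x)
    for _ in range(iters):
        d_scale = scaling(x)                               # diagonal of D_k
        # Step 3: projected scaled gradient direction
        d = np.maximum(x - alpha * d_scale * g, 0.0) - x
        # Step 5: monotone Armijo backtracking line-search
        lam, fx, gd = 1.0, f(x), float(np.sum(g * d))
        while f(x + lam * d) > fx + gamma * lam * gd:
            lam *= theta
        s = lam * d
        x = x + s
        g_new = grad(x)
        w = g_new - g
        g = g_new
        # scaled BB1 steplength: s^T D^-2 s / s^T D^-1 w, safeguarded
        sw = float(np.sum((s / d_scale) * w))
        if sw > 0:
            alpha = float(np.sum((s / d_scale) ** 2)) / sw
            alpha = min(alpha_max, max(alpha_min, alpha))
    return x
```

For the KL case, `scaling` would return the thresholded x^{(k)} discussed above, and `f`/`grad` the KL divergence and its gradient.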
We conclude by recalling that the global convergence of the algorithm is proved in^{28} for every choice of the steplength α_{k} in the closed interval [α_{min}, α_{max}] and of the scaling matrix D_{k} in the compact set 𝒟. Further useful implementation suggestions on the initialization of the variables, the parameter settings and the stopping rules can be found in the Supplementary Information.
Point spread function
It has been shown^{53} that a confocal PSF is well modeled by a radially symmetric Gaussian function:

h_{CLSM}(r, z) = exp( −r²/(2σ_{r}²) − z²/(2σ_{z}²) ),   (8)

where σ is related to the full-width at half-maximum (FWHM) by FWHM = 2(2 ln 2)^{1/2} σ. We estimated both σ_{r} and σ_{z} from the detected confocal image by fitting Gaussians to the intensity profiles of sub-resolved structures in the image, such as unspecifically bound single antibodies or nanosized subcellular compartments. Similarly, it has been shown that the PSF of a STED microscope operating with continuous-wave (CW) lasers (also called a CW-STED microscope) is well modeled by the Gaussian-Lorentzian function of equation (9)^{54,55}, where ψ is a constant that depends on the shape of the doughnut-like STED intensity distribution at the focus^{56}, and ς is the so-called saturation factor, defined as ς = I_{STED}/I_{s}, I_{STED} being the maximum value of the STED intensity distribution at the focus and I_{s} the effective saturation intensity, which can be defined as the intensity at which the probability of fluorescence emission is reduced by half. In the most general case, I_{s} is a function of the orientation distribution and rotational behavior of the fluorescent marker, as well as of the wavelength, temporal structure and polarization of the inhibition light^{56,57}. We estimated ψ by using scattering images of single isolated 80-nm gold beads. Importantly, ς = 0 and h_{CW−STED} = h_{CLSM} if the STED beam intensity is null. Thereby, taking advantage of having CW-STED and confocal images of the very same specimen, we estimated the σ values as described above. Next, given σ and ψ, the saturation factor ς was estimated using equation (9) and the intensity profiles through sub-resolved structures in the CW-STED image. Finally, we mention that the Gaussian-based models for the PSF can fail in the case of thick specimens.
In this case, images are affected by a depth-variant blur due to spherical aberrations induced by the refractive-index mismatch between the different media composing the system as well as the specimen^{58}. Nevertheless, in practice it is difficult to obtain such a PSF, in spite of the existence of theoretical models accounting for spherical aberrations, because these models depend on unknown acquisition parameters, such as the refractive index of the specimen. Therefore one needs blind or semi-blind restoration algorithms (see, for instance,^{59} where an alternating minimization scheme is used in conjunction with SGP as the minimization algorithm for depth-variant image deconvolution in confocal microscopy).
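As an illustration, the Gaussian confocal model and the CW-STED model discussed above can be sketched as follows; the exact analytic form of the CW-STED attenuation is an assumption of this sketch (a peak-preserving attenuation that reduces to the confocal PSF for ς = 0), and all function names are ours:

```python
import numpy as np

def confocal_psf(r, z, sigma_r, sigma_z):
    """Radially symmetric Gaussian confocal PSF model (peak normalized to 1)."""
    return np.exp(-r**2 / (2.0 * sigma_r**2) - z**2 / (2.0 * sigma_z**2))

def fwhm_from_sigma(sigma):
    """FWHM of a Gaussian with standard deviation sigma: 2*sqrt(2 ln 2)*sigma."""
    return 2.0 * np.sqrt(2.0 * np.log(2.0)) * sigma

def cw_sted_psf(r, z, sigma_r, sigma_z, psi, saturation):
    """CW-STED PSF sketch: the confocal PSF attenuated by the doughnut-driven
    fluorescence inhibition. For saturation = 0 (STED beam off) it reduces to
    the confocal PSF, and the peak value at r = 0 is left unchanged."""
    return confocal_psf(r, z, sigma_r, sigma_z) / (1.0 + psi * saturation * r**2)
```

Note that increasing the saturation factor narrows the radial profile without changing the peak value, the property exploited below when the PSFs are used in the convolution without normalization.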
Synthetic images
To generate pseudorandom phantoms which mimic the microtubule network of a cell, we randomly selected the starting positions of a given and fixed number of filaments and then used a stochastic process to choose iteratively the directions of growth. The growth was performed in a bidimensional or three-dimensional space to obtain 2D or 3D phantoms, respectively. We assumed filaments with a tubular structure of radius 30 nm and introduced heterogeneity of protein concentration between different filaments by associating to each filament a value in the range [0, 1]. Successively, to obtain the ideal image we convolved the phantom x̃ with the system PSF h, i.e., we computed h ⊗ x̃. Importantly, the PSF of the STED system becomes narrower with respect to the confocal counterpart as the saturation factor ς increases, but the intensity value at the peak stays constant. Therefore, in the convolution process, we used Equations (8) and (9) without any normalization to the sum of the pixels/voxels.
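A toy 2D version of this phantom-generation and convolution procedure might look as follows; the step count, the turning-angle statistics and the periodic boundary handling are illustrative assumptions, not the authors' exact generator:

```python
import numpy as np

rng = np.random.default_rng(0)

def filament_phantom(shape=(128, 128), n_filaments=10, steps=200):
    """Toy 2D microtubule-like phantom: random starting points, each filament
    grown by a correlated random walk; a per-filament intensity in [0, 1]
    models the heterogeneous protein concentration."""
    img = np.zeros(shape)
    for _ in range(n_filaments):
        y, x = rng.uniform(0, shape[0]), rng.uniform(0, shape[1])
        angle = rng.uniform(0, 2.0 * np.pi)
        value = rng.uniform(0.0, 1.0)        # heterogeneous protein concentration
        for _ in range(steps):
            angle += rng.normal(0.0, 0.2)    # stochastic choice of growth direction
            y += np.sin(angle)
            x += np.cos(angle)
            iy, ix = int(round(y)) % shape[0], int(round(x)) % shape[1]
            img[iy, ix] = max(img[iy, ix], value)
    return img

def convolve_fft(obj, psf):
    """Ideal (noise-free) image: circular convolution of the phantom with the
    (centered, unnormalized) PSF via FFT."""
    return np.real(np.fft.ifft2(np.fft.fft2(obj) *
                                np.fft.fft2(np.fft.ifftshift(psf))))
```

Because the PSF is deliberately left unnormalized, the total flux of the convolved image is the phantom flux multiplied by the PSF sum, which is how the STED/confocal peak-intensity convention above is preserved.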
To obtain the ideal image in terms of average detected photons, we multiplied the convolved object by a factor τ, which accounts for several multiplicative factors, such as the emission rate of the fluorophore, the collection efficiency of the system and the pixel dwell-time. Since we assumed that photon-counting noise represents the major source of noise for the detection process, and a constant background b can further degrade the image, we obtained the final image by corrupting every pixel/voxel i with a Poisson process with mean τ(h ⊗ x̃)_{i} + b. Thus, by increasing τ, the average number of detected photons increases and hence the signal-to-noise ratio (SNR) increases; for a Poisson process the SNR of a pixel/voxel grows as the square root of its mean value, so the SNR scales approximately as √τ. Since in simulation we know the object used to generate the simulated image, we can numerically evaluate the quality of the deconvolved images at each iteration k. In particular, we used the Kullback-Leibler (KL) divergence of the reconstructed image x^{(k)} from the known object (see Supplementary Information). Notably, we computed the KL divergence by using the phantom scaled with the effective photons emitted from each pixel/voxel. In the case of simulated data, we stopped the algorithms when they reached the minimum of the KL divergence (see Supplementary Information).
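A minimal sketch of this noise model and of the KL figure of merit, under the stated assumptions (Poisson counts with mean given by the scaled convolved object plus a constant background), could read:

```python
import numpy as np

rng = np.random.default_rng(1)

def poisson_image(blurred, tau=100.0, b=1.0):
    """Corrupt the ideal image: scale by the photon-budget factor tau, add a
    constant background b, then draw every pixel/voxel from a Poisson
    distribution with that mean."""
    return rng.poisson(tau * blurred + b).astype(float)

def kl_divergence(x, ref, eps=1e-12):
    """Generalized Kullback-Leibler divergence of the reconstruction x from
    the reference object (the phantom scaled to effective emitted photons)."""
    x = np.maximum(x, eps)
    ref = np.maximum(ref, eps)
    return float(np.sum(ref * np.log(ref / x) - ref + x))
```

The divergence is zero when the reconstruction matches the reference and positive otherwise, so in simulation the iteration attaining its minimum is a natural stopping point.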
Real images
To test the proposed algorithms we imaged the microtubule cytoskeleton of fixed PtK2 cells. We used two different protocols for labeling two different proteins of the filament systems. The first protocol localizes the β-tubulin protein; it involves a primary antibody (anti-β-tubulin mouse IgG, Sigma) and a secondary antibody (sheep anti-mouse IgG, Dianova) labeled with ATTO 647N (ATTO-TEC). The second protocol localizes the keratin protein and uses the Citrine yellow fluorescent protein. The samples were examined using a Leica TCS STED CW microscope (Leica Microsystems) equipped with a 100×/1.40 OIL STED orange objective (Leica Microsystems). The system is able to perform both confocal and STED imaging. We excited (with a regular Gaussian beam) the ATTO 647N and Citrine fluorophores at 635 nm and 488 nm, respectively, and collected the emitted light in the 650–750 nm and 495–580 nm spectral windows, respectively. For STED imaging of the Citrine-tagged sample, fluorescence was depleted with a doughnut-shaped beam at 592 nm. In the case of real data, we stopped the algorithms when they reached convergence with a tolerance of 10^{−3} (Eq. (S26)).
Computational features
Our test platform consists of a workstation equipped with 2 Intel Xeon Six-Core CPUs at 3.1 GHz, 188 GB of RAM and 4 Nvidia Fermi C2070 GPUs. It is managed by a CentOS Linux distribution. Each GPU is highly parallel: 14 streaming multiprocessors for a total of 448 64-bit computing cores, a high-speed RAM block shared among the 448 cores, and a cache. Additional hardware details are available in the last section of the Supplementary Information.
Two implementations of SGP are available: one in Matlab (CPU-based) and one in C/CUDA (GPU-based). The GPU implementation is developed in mixed C and CUDA languages, the latter being a manufacturer-provided framework for C-like GPU programming (see Supplementary Information). We used Matlab v. 7.11 and CUDA v. 4.3.
The main computational cost in both the RL and the SGP iterations is a pair of forward and backward FFTs for computing the image convolutions.
In the GPU framework, these operations are performed by means of the Nvidia CUFFT library, while Matlab exploits a multithreaded implementation of the FFTW library: a real-to-complex FFT is followed by a componentwise multiplication between the transformed iterate and the PSF transform and, at the end, a complex-to-real inverse FFT gives the convolution. A few additional componentwise operations are needed, which depend only on each single pixel/voxel (which suits the GPU implementation well), so that the complexity per iteration remains essentially unaffected. The SGP algorithm also requires a few scalar products between long vectors to update the steplengths: these are "reduction" operations, which involve communications in the GPU case. They are performed by calling dedicated and optimized library routines (see Supplementary Information).
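For illustration, a NumPy analogue of this per-iteration FFT pipeline, wrapped in a single Richardson-Lucy step, might look as follows; psf_hat and psf_hat_conj denote precomputed real-to-complex transforms of the PSF and of its adjoint, and all names are our illustrative choices:

```python
import numpy as np

def rl_iteration(x, y, psf_hat, psf_hat_conj, b=0.0):
    """One Richardson-Lucy step; the dominant per-iteration cost is the pair
    of FFT-based convolutions (forward model Hx and back-projection H^T)."""
    def conv(u, h_hat):
        # real-to-complex FFT, componentwise product with the precomputed
        # PSF transform, then a complex-to-real inverse FFT
        return np.fft.irfftn(np.fft.rfftn(u) * h_hat, s=u.shape)

    hx = conv(x, psf_hat) + b                          # blurred iterate plus background
    return x * conv(y / np.maximum(hx, 1e-12), psf_hat_conj)

# The PSF transforms are computed once, outside the iteration loop, e.g.:
# psf_hat = np.fft.rfftn(np.fft.ifftshift(psf))
# psf_hat_conj = np.conj(psf_hat)
```

The remaining operations in the update are purely componentwise, which is what makes the iteration map so well onto a GPU: only the FFTs and the steplength reductions involve communication across pixels.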
References
 1.
Agard, D. A. Optical sectioning microscopy: Cellular architecture in three dimensions. Annu. Rev. Biophys. Bioeng. 13, 191–219 (1984).
 2.
Diaspro A. (ed.) Confocal and TwoPhoton Microscopy: Foundations, Applications, and Advances (John Wiley & Sons, 2002).
 3.
Pawley J. B. (ed.) Handbook of Biological Confocal Microscopy (Springer, 2006).
 4.
Diaspro, A. et al. Multiphoton excitation microscopy. Biomed. Eng. Online 5, 36 (2006).
 5.
Huisken, J., Swoger, J., Del Bene, F., Wittbrodt, J. & Stelzer, E. H. K. Optical sectioning deep inside live embryos by selective plane illumination microscopy. Science 305, 1007–1009 (2004).
 6.
Huisken, J. & Stainier, D. Y. R. Selective plane illumination microscopy techniques in developmental biology. Development 136, 1963–1975 (2009).
 7.
Hell, S. W., Stelzer, E. H. K., Lindek, S. & Cremer, C. Confocal microscopy with an increased detection aperture: typeb 4pi confocal microscopy. Opt. Lett. 19, 222–224 (1994).
 8.
Vicidomini, G., Schmidt, R., Egner, A., Hell, S. & Schönle, A. Automatic deconvolution in 4pimicroscopy with variable phase. Opt. Express 18, 10154–10167 (2010).
 9.
Neil, M. A. A., Juskaitis, R. & Wilson, T. Method of obtaining optical sectioning by using structured light in a conventional microscope. Opt. Lett. 22, 1905–1907 (1997).
 10.
Gustafsson, M. G. L. Surpassing the lateral resolution limit by a factor of two using structured illumination microscopy. J. Microsc. 198, 82–87 (2000).
 11.
Bertero, M., Boccacci, P., Brakenhoff, G. J., Malfanti, F. & van der Voort, H. T. M. Threedimensional image restoration and superresolution in fluorescence confocal microscopy. J. Microsc. 157, 3–20 (1990).
 12.
Mondal, P., Vicidomini, G. & Diaspro, A. Markov random field aided bayesian approach for image reconstruction in confocal microscopy. J. Appl. Phys. 102 (2007).
 13.
Verveer, P. J. et al. Highresolution threedimensional imaging of large specimens with light sheetbased microscopy. Nat. Methods 4, 311–313 (2007).
 14.
Mondal, P., Vicidomini, G. & Diaspro, A. Image reconstruction for multiphoton fluorescence microscopy. Appl. Phys. Lett. 92 (2008).
 15.
Huang, B., Babcock, H. & Zhuang, X. Breaking the diffraction barrier: Superresolution imaging of cells. Cell 143, 1047–1058 (2010).
 16.
Hell, S. Microscopy and its focal switch. Nat. Methods 6, 24–32 (2009).
 17.
Donnert, G. et al. Twocolor farfield fluorescence nanoscopy. Biophys. J. 92, L67–L69 (2007).
 18.
Mukamel, E. A., Babcock, H. & Zhuang, X. Statistical deconvolution for superresolution fluorescence microscopy. Biophys. J. 102, 2391–2400 (2012).
 19.
Difato, F. et al. Improvement in volume estimation from confocal sections after image deconvolution. Microsc. Res. Tech. 64, 151–155 (2004).
 20.
Vicidomini, G. et al. A novel approach for correlative light electron microscopy analysis. Microsc. Res. Tech. 73, 215–224 (2010).
 21.
Lee, S. & Wright, S. J. Implementing algorithms for signal and image reconstruction on graphical processing units. Tech. Rep., Computer Sciences Department, University of WisconsinMadison (2008).
 22.
Ruggiero, V., Serafini, T., Zanella, R. & Zanni, L. Iterative regularization algorithms for constrained image deblurring on graphics processors. J. Global Optim. 1–13 (2010).
 23.
Serafini, T., Zanella, R. & Zanni, L. Gradient projection methods for image deblurring and denoising on graphics processors. In Chapman, B., Desprez, F., Joubert, G. R., Lichnewsky, A. & Peters, F. (eds.) Parallel Computing: From Multicores and GPU's to Petascale, vol. 19, 59–66 (IOS Press, 2010).
 24.
Bruce, M. A. & Butte, M. J. Realtime gpubased 3d deconvolution. Opt. Express 21, 4766–4773 (2013).
 25.
Biggs, D. S. C. & Andrews, M. Acceleration of iterative image restoration algorithms. Appl. Opt. 36, 1766–1775 (1997).
 26.
Llacer, J. & Nunez, J. Iterative maximum likelihood estimator and bayesian algorithms for image reconstruction in astronomy. In White R. L., & Allen R. J. (eds.) Restoration of HST Images and Spectra, 52–70 (Space Telescope Science Institute, Baltimore, MD, 1990).
 27.
Bertero, M., Boccacci, P., Desiderà, G. & Vicidomini, G. Image deblurring with poisson data: from cells to galaxies. Inverse Prob. 25, 123006 (2009).
 28.
Bonettini, S., Zanella, R. & Zanni, L. A scaled gradient projection method for constrained image deblurring. Inverse Prob. 25, 015002 (23 pp) (2009).
 29.
Richardson, W. H. Bayesian-based iterative method of image restoration. J. Opt. Soc. Am. 62, 55–59 (1972).
 30.
Lucy, L. B. An iterative technique for the rectification of observed distributions. Astronom. J. 79, 745–754 (1974).
 31.
Conchello, J. A. & McNally, J. G. Fast regularization technique for expectation maximization algorithm for optical sectioning microscopy. In Cogswell, C. J., Kino, G. S. & Wilson, T. (eds.) ThreeDimensional Microscopy: Image Acquisition and Processing III vol. 2655, 199–208 (SPIE, 1996).
 32.
Bertero, M. & Boccacci, P. A simple method for the reduction of boundary effects in the Richardson-Lucy approach to image deconvolution. Astron. Astrophys. 437, 369–374 (2005).
 33.
Sibarita, J.B. Deconvolution microscopy. In Rietdorf, J. & Gadella, T. W. J. (eds.) Microscopy Techniques, 201–243 (SpringerVerlag, 2005).
 34.
Frisken Gibson, S. & Lanni, F. Experimental test of an analytical model of aberration in an oilimmersion objective lens used in threedimensional light microscopy. J. Opt. Soc. Am. A 9, 154–166 (1992).
 35.
Holmes, T. J. Blind deconvolution of quantumlimited incoherent imagery: maximumlikelihood approach. J. Opt. Soc. Am. A 9, 1052–1061 (1992).
 36.
Zanella, R., Boccacci, P., Zanni, L. & Bertero, M. Efficient gradient projection methods for edgepreserving removal of Poisson noise. Inverse Prob. 25, 045010 (2009).
 37.
Dupé, F.-X., Fadili, J. & Starck, J.-L. A proximal iteration for deconvolving poisson noisy images using sparse representations. IEEE Trans. Image Process. 18, 310–321 (2009).
 38.
Carlavan, M. & BlancFéraud, L. Sparse poisson noisy image deblurring. IEEE Trans. Image Process. 21, 1834–1846 (2012).
 39.
Heintzmann, R. & Gustafsson, M. G. L. Subdiffraction resolution in continuous samples. Nature Photon. 3, 362–364 (2009).
 40.
Vicidomini, G. et al. Sted nanoscopy with timegated detection: Theoretical and experimental aspects. PLoS ONE 8, e54421 (2013).
 41.
Shepp, L. A. & Vardi, Y. Maximum likelihood reconstruction for emission tomography. IEEE Trans. Med. Imaging 1, 113–122 (1982).
 42.
Barrett, H. H. & Meyers, K. J. Foundations of Image Science (Wiley and Sons, 2003).
 43.
Bardsley, J. M. & Goldes, J. Regularization parameter selection methods for illposed poisson maximumlikelihood estimation. Inverse Prob 25, 095005 (2009).
 44.
Bertero, M., Boccacci, P., Talenti, G., Zanella, R. & Zanni, L. A discrepancy principle for poisson data. Inverse Prob. 26, 105004 (2010).
 45.
Dey, N. et al. Richardson-Lucy algorithm with total variation regularization for 3d confocal microscope deconvolution. Microsc. Res. Tech. 69, 260–266 (2006).
 46.
Vicidomini, G., Boccacci, P., Diaspro, A. & Bertero, M. Application of the splitgradient method to 3d image deconvolution in fluorescence microscopy. J. Microsc. 234, 47–61 (2009).
 47.
Green, P. J. Bayesian reconstructions from emission tomography data using a modified EM algorithm. IEEE Trans. Med. Imaging 9, 84–93 (1990).
 48.
Lantéri, H., Roche, M. & Aime, C. Penalized maximum likelihood image restoration with positivity constraints: multiplicative algorithms. Inverse Prob. 18, 1397–1419 (2002).
 49.
Barzilai, J. & Borwein, J. M. Twopoint step size gradient methods. IMA J. Num. Analysis 8, 141–148 (1988).
 50.
Dai, Y. & Fletcher, R. Projected barzilaiborwein methods for largescale boxconstrained quadratic programming. Numer. Math. 100, 21–47 (2005).
 51.
Zhou, B., Gao, L. & Dai, Y. Gradient methods with adaptive stepsizes. Comput. Optim. Appl. 35, 69–86 (2006).
 52.
Frassoldati, G., Zanghirati, G. & Zanni, L. New adaptive stepsize selections in gradient methods. J. Ind. Manag. Optim. 4, 299–312 (2008).
 53.
Zhang, B., Zerubia, J. & OlivoMarin, J.C. Gaussian approximations of fluorescence microscope pointspread function models. Appl. Opt. 46, 1819–1829 (2007).
 54.
Vicidomini, G. et al. Sharper lowpower sted nanoscopy by time gating. Nat. Methods 8, 571–575 (2011).
 55.
Vicidomini, G. et al. Gated cwsted microscopy: A versatile tool for biological nanometer scale investigation. Methods (2013) URL http://dx.doi.org/10.1016/j.ymeth.2013.06.029.
 56.
Harke, B. et al. Resolution scaling in sted microscopy. Opt. Express 16, 4154–4162 (2008).
 57.
Vicidomini, G., Moneron, G., Eggeling, C., Rittweger, E. & Hell, S. W. Sted with wavelengths closer to the emission maximum. Opt. Express 20, 5225–5236 (2012).
 58.
Preza, C. & Conchello, J.-A. Depth-variant maximum-likelihood restoration for three-dimensional fluorescence microscopy. J. Opt. Soc. Am. A 21, 1593–1601 (2004).
 59.
Hadj, S. B., BlancFéraud, L., Aubert, G. & Engler, G. Blind restoration of confocal microscopy images in presence of a depthvariant blur and poisson noise. In: Proc. ICASSP 2013, 915–919 (IEEE, 2013).
Acknowledgements
We thank Prof. Alberto Diaspro and Dr. Paolo Bianchini for useful discussions and for providing us with STED images. This work was supported in part by the Italian Ministry of Education, University and Research (MIUR) through the PRIN2008 project "PRISMA" (grant n. 2008T5KA4L) and by the University of Ferrara through the MultiNOPaC2010 and NOCSiMA2011 projects.
Author information
Affiliations
Laboratorio delle Tecnologie per Terapie Avanzate, Università di Ferrara, Ferrara, Italy
 R. Zanella
Dipartimento di Matematica e Informatica, Università di Ferrara, Ferrara, Italy
 G. Zanghirati
Dipartimento di Scienze Fisiche, Informatiche e Matematiche, Università di Modena e Reggio Emilia, Modena, Italy
 R. Cavicchioli
 & L. Zanni
Dipartimento di Informatica, Bioingegneria, Robotica e Ingegneria dei Sistemi, Università di Genova, Genoa, Italy
 P. Boccacci
 & M. Bertero
Nanophysics, Istituto Italiano di Tecnologia, Genoa, Italy
 G. Vicidomini
Contributions
R.Z., G.Z., P.B., M.B. and G.V. designed the experiments. R.Z., G.Z., R.C., L.Z. and P.B. designed the algorithm and executed the experiments. R.Z., G.Z., L.Z., P.B., M.B. and G.V. analyzed the data. G.Z., L.Z., M.B. and G.V. wrote the manuscript. G.Z. and G.V. prepared the figures. All authors read and approved the final manuscript.
Competing interests
The authors declare no competing financial interests.
Corresponding author
Correspondence to G. Vicidomini.
Supplementary information
PDF files
 1.
Supplementary Information
Supplementary Figures and Text
Rights and permissions
This work is licensed under a Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/