Smooth 2D manifold extraction from 3D image stack

Three-dimensional fluorescence microscopy followed by image processing is routinely used to study biological objects at various scales such as cells and tissue. However, maximum intensity projection, the most broadly used rendering tool, extracts a discontinuous layer of voxels, obliviously creating important artifacts and possibly misleading interpretation. Here we propose smooth manifold extraction, an algorithm that produces a continuous focused 2D extraction from a 3D volume, hence preserving local spatial relationships. We demonstrate the usefulness of our approach by applying it to various biological applications using confocal and wide-field microscopy 3D image stacks. We provide a parameter-free ImageJ/Fiji plugin that allows 2D visualization and interpretation of 3D image stacks with maximum accuracy.

SML computation. By using SML on stacks acquired with a widefield microscope we aim at enhancing the contrast between the foreground and the background while reducing the noise. Therefore a quantity that makes sense to maximize is the ratio of the difference between the maximum and the minimum value of the SML result image (that we call Range) over the standard deviation of the overall SML result image (that we call Std). By iteratively smoothing a stack obtain by a widefield microscope we can reach a maximum of the given criteria easily. This plot shows a typical evolution of Range/Std obtained by smoothing with a Gaussian kernel with σ from 0 to 10. An optimal cF minimizing sensitivity to clutter and maximizing sensitivity to the manifold curvature is computed the following way (A) Two Zmax distributions are computed : one from the foreground class values and one from the uncertain class values defined by the profile classification. An optimal cut-off is automatically obtained at the intersection of both distribution. This cut-off define a balanced ratio of true positive versus true negative foreground. (B) The same ratio is applied to the c distribution computed from the foreground to obtain cF . (C) The relative position of the means µF , µU and µB of the Zmax distributions for each class is then used to define a ratio to linearly determine the value of cU . It is clear here that if µU is closer to the background µB then the weight for this class will be lower and the minimization will favor the regularization term. In contrast, if µU is closer to µF the importance of the regularization term will then be lowered. Supplementary Figure 5: Setting of the stopping criteria . The stopping criteria is set such that additional iterations would not significantly reduce further the cost. The 3 plots presented here show the cost evolution during the minimization process for 3 different choice of for 3 different datasets. This parameter is robust for all data sets as it is automatically adapted to its size.  . From left to right, the 4 methods (MIP, EOG, SML and EDF) -as described in the Literature overview section, are compared to SME. We clearly see in the zoomed results (cyan arrows) that MIP tends to saturate the resulting image causing loss of important details, SML fails to adapt to complex manifold shape and misses essential details. EDF fails to maintain a local structural coherence and the index maps lacks smoothness. Results obtained by SME, the proposed method, are consistently better on those two aspects. Note that the recovered manifold by SME is the closest to the synthetic reference manifold shown in Supplementary The top cyan arrow in the zoomed section (bottom row) points to structured noise accumulated by all other methods except SME. The bottom purple arrow shows that cell membranes are render with a higher contrast and a lower amount of noise by the SME method compared to the others. The main reason explaining the higher contrast obtained with SME is that the background voxels selected by SME are not chosen randomly (as in MIP) but they are located close to the foreground, thus the background obtained is less corrupted by the PSF glow that is stronger in the z-direction Supplementary Figure 10: Results on NEURON2 image acquired by a confocal microscopy. Purkinje cell from a mixed cerebellar culture at day in vitro 7 immunostained using a calbindin antibody and acquired by a confocal microscope. Top row is the index map represented as a 2D image, middle row is the composite image and bottom row shows zoomed sections highlighting errors and artifacts. The jet colorbar (top row) indicates the stack depth (31 slices), the gray colorbar (middle row) indicates intensity values (16 bits) and the hot colorbar (bottom row) indicates intensity values for the zoomed section (16 bits). Scale bar is 10 µm. From left to right, the 4 methods (MIP, EOG, SML and EDF) are compared to SME. Arrows indicate that the other methods tend to create saturated images with possible important loss of texture details. This is also due to the aggregation of voxels originating both from the foreground and the background occasioning possible wrong interpretation. Rapid advances in imaging technology has enabled acquisition of large 3D image volumes. Biological objects that are thicker than the focus range of the microscope lenses are frequently imaged as 3D stacks of 2D images. Optical sectioning of a preparation captures the in-focus information at increasing depths simulating an extended depth of field [8,6,9]. A 3D image stack is thus a sequence of focused image slices, which can subsequently be processed to reconstruct a single all-in-focus image. In computational imaging, this technique resulting in an image with greater depth of field (DOF) than any of the individual source images is called focus stacking (alternatively called focal plane merging, z-stacking, focus blending and most commonly z-projection, the most widely adopted methodology for focal stacking). Ideally, a 2D approximation of a 3D volume aims at maximizing information content and object coherence while minimizing the distortion generated by the Point Spread Function (PSF) of the microscope, the noise and the imaging artifacts. Accurate 2D approximation can help subsequent processing and quantification such as segmentation [7] for cell counting or cell profiling [10]. Image deconvolution to restore PSF distortion are commonly used for widefield images and there are several preconditions on the image acquisition parameters such as the sampling density must be higher than the Nyquist rate of the point spread function (PSF). However, deconvolution is known to create artifacts as if the PSF is not properly modeled, it is then a transformation of the image that is obtained and not a restoration. For deconvolution of 3D image stacks the interested readers may refer to these papers [11,12]. In the proposed method, we extract a 2D representation based on a smooth 2D manifold embedded in the 3D observed volume on which both the foreground and background are in focus, thus minimizing the PSF effect and other artifacts. In this paper, we employ the term composite image for the recovered 2D all-in-focus image and the term index map for the 2D image of Z-values indicating the depth at which each pixels information are eventually retrieved from.

Supplementary
In z stacking, various factors such as the chosen focus operator, the depth map smoothing, the window size and the properties of the object imaged (such as the scale variation, the texture etc.) affect the quality of the computed composite image. Focus operators determine which sections of the field of view are in focus at different levels of the stack. In [13,14,15], the authors present comprehensive reviews of focus measures. A composite image is obtained by fusion or blending of in-focus pixel values from different slices of an image stack. Spatially coherent smoothing of the depth map has also been extensively studied [16,17,18]. In [19], Muhammad and colleagues present a study on the effect of the sampling step size in the z direction on the accuracy of the recovered z-map indices. While the most common methods are Gaussian smoothing and averaging, Bezier surface approximation [20], anisotropic diffusion favoring data guided smoothing [21], nearest neighbor interpolation [22] have also been examined in the context of manifold recovery from focus measure in optical microscopy volumes.
Region based compared to pixel based projection prove to be consistently more successful at overcoming artifacts coming from noise and structural incoherence due to mis-aligned or mis-registrated projections [23]. Traditionally, blocks of regular shapes were favored for the ease and simplicity they offered in handling image data. In [24], the authors propose an adaptive window mechanism for image smoothing. In [25], the authors demonstrate how variation in window size affect the accuracy of the depth map estimation and conclude that smaller window sizes lead to a higher accuracy. In [26], the authors reported that an increasing window size reduces the differences in depth map estimation by various focus measures. Overall, it was beneficial to make the projection results free from a given window shape, size and orientation in order to manage regions that are irregularly shaped but semantically significant in the biological context.
In the this section, we examine in detail a few of the existing methods against which we will compare the performance of our proposed method. First, we chose the pixel-wise maximum intensity projection (MIP), the most widely adopted method by biologists due to its availability as a easy to use one-click, parameter free Fiji Plugin. Secondly, we chose a couple of image blending methods with different focus operators as the sum of modified Laplacian (SML) and the energy of gradients (EOG). Among these methods, only MIP is truly pixel based whereas the other methods, EOG, SML and EDF are region based as the focus is obtained from a neighborhood. We observe in our experiments that the critical factor influencing the quality of the composite image is the index map blending technique chosen to produce a smooth and natural transition in order to preserve spatial coherence. The choice of the focus measure shows a much lower contribution to the overall result is more related to type of microscopy. The last method is the extended depth of field (EDF) also available as a Fiji plugin.
1. MIP: Intensity based methods such as maximum intensity projection (MIP) M IP = max z (I) are the simplest and most widely used focus criteria (see Supplementary Figure 1). A suite of intensity based Z projection methods are available as Fiji Plugin [27] and seem to be very popular with biologists. The intuition behind is that biologists make a deliberate staining of objects of interest and hence the maxima of intensity represents the location of the object. However, in practise, noise, artifacts and PSF halo often creates intensity peaks in between objects misguiding the index decision map. Intensity based methods, thus, tends to accumulate PSF and noise. Proposed variations of the max are the mean, median, standard deviation or the sum of the pixel values along the slices for each pixel.
Although, these create a seemingly less noisy and cleaner composite image, these methods fail to limit artifacts and sharp transitions in index maps.
2. EOG: Gradient methods use the first order derivative of image intensities [13]. The intuition driving these methods is that sharp or focused images present stronger gradient compared to blurred, out of focus images. Energy of Gradients (EOG) is computed as EoG = max z (|∇(I)|) and is the absolute (alternatively squared) magnitude of gradient. Additionally, to induce a spatial coherence, a median or Gaussian smoothing is performed. In our implementation, we chose a Sobel operator for gradient detection followed by a Gaussian smoothing with sigma equal to 1 to obtained the z index map.
3. SML: The Sum of Modified Laplacian (SML) as defined in [28] is based on the second derivative of the image intensity. Laplacian based methods were reported to be the most successful in predicting sharpness or focusness of image regions in natural images [13]. Here too, to induce a spatial coherence a Gaussian smoothing with sigma equal to 1 is used to obtained the z index map.

EDF:
In [29], the authors proposed a method which uses wavelet transform co-efficient as a measure of sharpness and then uses a image blending technique to obtain a smooth z index map. It is also available as a Fiji Plugin called the extended depth of field (EDF) and offers the flexibility of choosing among several focus operators (sobel, variance, real wavelets, complex wavelets) and control the amount of smoothing which comes at a trade-off on speed. Although this approach produces better results than the intensity based methods, it fails to maintain smoothness in the transition from background to foreground. Moreover, in case of multi-channel images, it treats channels independently and combine them afterwards. It does not offer the flexibility to treat each channel separately or to chose the channel containing the reference manifold which is most of the time the only relevant option for bioimage analysis.

Supplementary Methods
Parameter settings for the cost function The cost function we designed contains two parameters (one only in the case of confocal images). In the following we describe how each of those parameters are set automatically from the data to produce a parameter free cost function.
• Gaussian smoothing prior SML. A Gaussian smoothing is performed as a preprocessing step of SML for widefield imaging modality. The Supplementary Figure 3 describes how an optimal standard deviation of the Gaussian kernel G is obtained directly from the data.
• Weights c F and c U of the class map C. The purpose of the weighted class map C is to balance the two terms of the cost function deferentially depending on the z-profile type. The weight assigned a z-profile that belong to the background is 0 such that no data attachment term is considered in this case. On the other hand, Supplementary Figure 4 described the strategy we designed to obtain optimal values of c F and c U , the weights assigned respectively to the z-profiles that belong to the foreground and the background classes.

Parameter settings for the minimization process
The cost function to minimize is now parameter free. However, the minimization process itself requires some fairly common parameters: the termination criteria and the discretization factor ∆T which the convergence depends on. As we show in this section, the minimization process and the results obtained are robust to the choice of these parameters with some trade-off on speed. Moreover, we adapt those parameters such that they can be set automatically depending on the image size.
• The termination criteria . In Supplementary Figure 5, we evaluated the impact of different values on several example image datasets (described in Supplementary Table 1). As expected, the experiment shows that the convergence time increase with a smaller . However the result is not affected as the cost minimization always converge to the same value. In order to maintain a reasonable computing time whatever the data size, is set to W × H × D × 10 −6 .
• The discretization factor ∆T . In Supplementary Figure 6, we evaluated the impact of different ∆T values on several example image datasets (described in Supplementary Table 1). The step size is initialized to T = D/100, where D is the number of slices in the image stack. Multiplying it by ∆T at each iteration produces a geometric decay scheme that ensures a faster convergence [30]. Empirically, in all dataset case ∆T = 0.99 gave satisfactory results.

Computation of a rolling standard deviation
If a single value is modified among a set of n values, it is possible to obtain the updated mean (Equation 1) and standard deviation (Equation 2) efficiently from the previous mean and the previous standard deviation respectively. Let's denote {X 0 , ..., X n−1 }, the initial set 0, and {X 1 , ..., X n } the set 1 obtained from the set 0 by replacing the value X 0 by X n . The mean µ 1 of Set 1 can be obtained from the mean µ 0 of Set 0 through the difference: Similarly, the squared standard deviation σ 2 1 of set 1 can be obtained from the squared standard deviation σ 2 0 of the set 0 by using the following definition of the sample variance: and writing the difference This rolling standard deviation formula (Equation 2) is used in the algorithm to test each shift in Z(x, y) (in brackets below the argmin in Algorithm 1) for each position (x, y) inside the loop, without having to rescan the whole window around (x, y) to compute the standard deviation.