Edge-guided second-order total generalized variation for Gaussian noise removal from depth map

Total generalized variation models have recently demonstrated high-quality denoising capacity for single images. In this paper, we present an accurate denoising method for depth maps. Our method uses a weighted second-order total generalized variation model for Gaussian noise removal. By fusing an edge indicator function into the regularization term of the second-order total generalized variation model to guide the diffusion of gradients, our method adaptively uses the first or second derivative to adjust the intensity of the diffusion tensor. We use the first-order primal-dual algorithm to minimize the proposed energy function and achieve high-quality denoising and edge-preserving results for depth maps with high-intensity noise. Extensive quantitative and qualitative evaluations on benchmark datasets show that the proposed method provides significantly higher accuracy and visual improvements than many state-of-the-art denoising algorithms.

The ESTGV model adaptively uses the first or second derivative to update the diffusion tensor strength according to the value of the edge indicator function in edge areas and smooth areas. The minimum of the energy function is then approximated with the first-order primal-dual algorithm (FOPD) 25, which effectively preserves edges while denoising depth maps corrupted by high-density Gaussian noise. The experimental results show that ESTGV outperforms state-of-the-art denoising methods not only in the PSNR index, but also in visual comparison.

Related work
Total variation (TV) for image denoising. The total variation model (TV) is a classical image denoising algorithm proposed by Rudin et al. 22, also known as the ROF model. The minimum energy functional of the TV model can be expressed as:

E(u) = α ∫_Ω |∇u| dx dy + (λ/2) ∫_Ω (u − f)² dx dy, (1)

where u denotes the predicted image, f denotes the noisy image, and the regularization term ∫_Ω |∇u| dx dy denotes the prior constraint on the image u, which is used for image denoising; |∇u| denotes the modulus of the gradient of u. The term (λ/2) ∫_Ω (u − f)² dx dy is the data fidelity term, also known as the approximation term; it measures how closely u approximates f and constrains the total-variation-minimizing solution so that its deviation stays within a small range, thereby protecting the structural information of the image and reducing distortion. α and λ are Lagrange scale constants for adjusting the regularization term and the data fidelity term, and they require a proper balance between edge preservation and noise removal. The larger λ is, the closer u stays to f, but if λ is too large, the smoothing of local features is weakened and the denoising effect decreases; the smaller λ is, the stronger the denoising, but if λ is too small, the image becomes blurred. The Euler-Lagrange equation of the energy functional (1) can be expressed as:

−α ∇ · (∇u/|∇u|) + λ(u − f) = 0,

where the first term −∇ · (∇u/|∇u|) is the diffusion tensor and 1/|∇u| is the diffusion coefficient. In edge areas with large gradients, |∇u| is large, the diffusion coefficient is small, and the diffusion intensity along the edge direction is low, so edge detail features are well preserved. In smooth areas with small gradients, |∇u| is small, the diffusion coefficient is large, and the denoising intensity is high.
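To make this diffusion behaviour concrete, the following sketch minimizes the ROF energy by explicit gradient descent on the Euler-Lagrange equation above; the scheme, the parameter values, and the small eps that regularizes 1/|∇u| are illustrative choices, not the paper's implementation:

```python
import numpy as np

def tv_denoise_gd(f, lam=8.0, tau=0.01, eps=1e-6, n_iter=200):
    """Gradient-descent sketch of the ROF model: each step follows the
    Euler-Lagrange equation  -div(grad(u)/|grad(u)|) + lam*(u - f) = 0.
    Periodic boundaries via np.roll; all parameter values are examples."""
    u = f.copy()
    for _ in range(n_iter):
        # forward differences approximate the gradient of u
        ux = np.roll(u, -1, axis=1) - u
        uy = np.roll(u, -1, axis=0) - u
        mag = np.sqrt(ux**2 + uy**2 + eps)   # eps keeps 1/|grad u| finite
        px, py = ux / mag, uy / mag
        # backward differences give the matching discrete divergence
        div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
        u = u + tau * (div - lam * (u - f))  # descend the TV energy
    return u
```

On a noisy step image this removes noise in flat regions while keeping the jump, which is exactly the edge-preserving diffusion described above.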
The TV model distinguishes edges from noise by the gradient value, and diffusion is performed only in the direction orthogonal to the image gradient, so that edge structures can be effectively preserved. But in smooth areas of the image, the TV model can only approximate piecewise constant functions effectively, so it easily misjudges noise as edges to be preserved and eventually produces "staircasing artifacts" in piecewise areas.
Second-order total generalized variation. To solve the problem of "staircasing artifacts" in TV denoising, Bredies et al. 24 proposed a generalized mathematical model of the TV algorithm, the total generalized variation (TGV). Unlike the TV model, which can only approximate piecewise constant functions, the convex optimization model generated by TGV can approximate piecewise polynomial functions of any order, thus effectively avoiding "staircasing artifacts" during denoising.
The k-order TGV model can be defined as:

TGV_α^k(u) = sup { ∫_Ω u div^k v dx : v ∈ C_c^k(Ω, Sym^k(R^d)), ‖div^l v‖_∞ ≤ α_l, l = 0, …, k − 1 },

where Ω ⊆ R^d is an open set and u denotes the original image, u ∈ L¹_loc(Ω). TGV_α^k denotes the k-order total generalized bounded variation with weight coefficients α = (α_0, …, α_{k−1}), k ≥ 1, and it provides a way of automatically balancing between the first derivative and the k-order derivative. div^k v denotes the k-order symmetric divergence operator; v ∈ C_c^k(Ω, Sym^k(R^d)) is a k-order symmetric tensor field on Ω, where C_c^k is the space of compactly supported k-times continuously differentiable functions, Sym^k(R^d) is the space of k-order symmetric tensors on R^d, and each α_l is a fixed positive parameter.
The corresponding second-order denoising model is expressed as:

min_u TGV_α²(u) + (λ/2) ∫_Ω (u − f)² dx dy, with TGV_α²(u) = min_p { α_1 ∫_Ω |∇u − p| dx dy + α_0 ∫_Ω |ε(p)| dx dy },

where ε denotes the second-order symmetric gradient operator. The TGV model can effectively avoid the "staircasing artifacts" of piecewise constant areas and is superior to the TV model in denoising and edge preservation.
The first-order primal-dual algorithm (FOPD). The FOPD algorithm is a particularly effective iterative algorithm proposed by Chambolle et al. 25 in 2011 for solving convex optimization problems with a special proximal regularization parameter. The FOPD algorithm converts the original minimization problem into a dual maximization problem via dual variables, expresses the primal and dual problems as the corresponding saddle-point optimization problem, and then iterates the primal and dual variables alternately until the optimal solution of the original problem is approximated. The FOPD algorithm is easy to implement and iterates efficiently, so it can accelerate convergence.
The general convex optimization problem is defined as:

min_{x∈X} F(Ix) + G(x), (9)

and the corresponding dual problem can be expressed as:

max_{y∈Y} −(F*(y) + G*(−I*y)), (10)

where x is the primal variable and y is the dual variable. X and Y denote finite-dimensional real vector spaces of the image with inner product ⟨·, ·⟩ and norm ‖·‖ = √⟨·, ·⟩. I : X → Y is a continuous linear operator, F(Ix) and G(x) are lower semi-continuous convex functions, F* and G* are the conjugate convex functions topologically dual to F and G, respectively, and I* is the adjoint operator of I.
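As a minimal illustration of this splitting, the sketch below applies the primal-dual iteration to a 1-D instance of (9) with F = α‖·‖₁, G = ½‖· − b‖² and I the forward-difference operator; the problem choice and the step sizes are illustrative assumptions, with τσ‖I‖² < 1 for convergence:

```python
import numpy as np

def fopd_tv1d(b, alpha=0.5, tau=0.45, sigma=0.45, n_iter=300):
    """First-order primal-dual sketch for the 1-D problem
    min_x 0.5*||x - b||^2 + alpha*||Dx||_1, D = forward difference.
    tau*sigma*||D||^2 <= 0.45*0.45*4 < 1 ensures convergence."""
    n = len(b)
    x = b.copy(); x_bar = x.copy()
    y = np.zeros(n - 1)                    # dual variable lives on edges
    D = lambda v: v[1:] - v[:-1]           # the linear operator I
    def Dt(w):                             # its adjoint I*
        out = np.zeros(n)
        out[:-1] -= w; out[1:] += w
        return out
    for _ in range(n_iter):
        # dual ascent, then projection onto {|y| <= alpha} (prox of F*)
        y = np.clip(y + sigma * D(x_bar), -alpha, alpha)
        x_old = x
        # primal descent; prox of 0.5*||x-b||^2 averages with b
        x = (x - tau * Dt(y) + tau * b) / (1 + tau)
        x_bar = 2 * x - x_old              # extrapolation step
    return x
```

Run on a noisy step signal, the iteration recovers a near piecewise-constant solution with a strictly lower objective than the input.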
The saddle-point formulation of (9) and (10) is given by:

min_{x∈X} max_{y∈Y} ⟨Ix, y⟩ + G(x) − F*(y),

where, for F(g) = ‖g‖₁, the conjugate F*(y) is the indicator function δ_P(y) of the convex set P, i.e. δ_P(y) = 0 if y ∈ P and +∞ otherwise. The convex set P is expressed as:

P = {p : ‖p‖_∞ ≤ 1},

where ‖p‖_∞ denotes the discrete maximum norm defined as:

‖p‖_∞ = max_{i,j} |p_{i,j}|, |p_{i,j}| = √(p_{i,j,1}² + p_{i,j,2}²).

According to the projection approximation algorithm, the iteration formulas for the primal variable x and the dual variable y can be expressed as:

y^{k+1} = proj_P(y^k + σ I x̄^k),
x^{k+1} = prox_{τG}(x^k − τ I* y^{k+1}),
x̄^{k+1} = 2x^{k+1} − x^k,

where σ and τ are the dual and primal step sizes.

The proposed method
Edge-guided second-order total generalized variation for Gaussian noise removal. In order to solve the edge blurring issue when denoising depth maps with high-intensity Gaussian noise, we propose an edge-guided second-order total generalized variation model (ESTGV) for depth map Gaussian noise removal. The ESTGV model aims to preserve depth edges and detail features when denoising depth maps heavily polluted by Gaussian noise, by adding an edge indicator function into the second-order total generalized variation model. For the choice of the edge indicator function, we compared several edge detection algorithms and found that the Roberts detector usually misses some edges, the Sobel and Prewitt detectors often generate false edges, the Laplacian detector is over-sensitive to noise and not suitable for noisy images, and the LOG (Laplacian of Gaussian) detector usually loses sharp edges, while the Canny detector adopts the principle of optimal edge detection, has strong anti-noise ability, high completeness and continuity of detected edges, and adaptive threshold generation; its edge-preserving performance is clearly better than that of the other detectors.
By adding the Canny edge detection algorithm 28 into the regularization term of the second-order TGV, we propose a weighted second-order TGV minimization denoising model:

min_u Φ(u) + (λ/2) ∫_Ω (u − f)² dx dy, (19)

where (λ/2) ∫_Ω (u − f)² dx dy is the data fidelity term, u denotes the predicted (denoised) depth map, f denotes the noisy depth map, λ is the Lagrange constant factor balancing the regularization term and the data fidelity term, and Ω is the definition domain of pixels on the depth map, (x, y) ∈ Ω.
The regularization term Φ(u), a weighted second-order total generalized variation TGV_a²(u), can be expressed as:

Φ(u) = min_p { a_1 ∫_Ω T |∇u − p| dx dy + a_2 ∫_Ω |ε(p)| dx dy }, (20)

where p ∈ R^{mn} × R^{mn} is a bounded two-dimensional first-order vector field, p_{(i,j)} = [p_{i,j,1}, p_{i,j,2}]. a_1 and a_2 are two constants used to coordinate the first and second derivatives; ∇u is the gradient map of u, which is used to judge the scale of noise and edges. The gradient of pixel u_{x,y} on the depth map u is given by:

∇u_{x,y} = (∂u/∂x, ∂u/∂y),

where ∂u/∂x and ∂u/∂y denote the horizontal and vertical gradients in the n-th iteration, respectively, which can be calculated by central differences. T in (20) denotes the edge indicator function:

T = 1 / (1 + M |∇(G_σ * f)|²), (21)

where M ≥ 0 is the contrast factor, G_σ is the Gaussian filtering kernel with mean 0 and standard deviation σ, * denotes the convolution operation, and G_σ * f denotes the depth map preprocessed by Gaussian filtering. ε in (20) denotes the second-order symmetric gradient operator:

ε(p) = (∇p + ∇p^T) / 2.

Here σ is the weight of the Gaussian filter that adjusts the smoothing level. The smaller the σ value, the higher the positioning accuracy of the filter, but the lower the signal-to-noise ratio (SNR); the larger the σ value, the higher the SNR, but the lower the positioning accuracy of the filter.
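A minimal sketch of the edge indicator, assuming the common form T = 1/(1 + M|∇(G_σ * f)|²) that matches the behaviour described in the text, with a hand-rolled separable Gaussian blur and example values for M and σ:

```python
import numpy as np

def gaussian_blur(f, sigma):
    """Separable Gaussian filtering G_sigma * f (zero-padded borders)."""
    r = max(1, int(3 * sigma))
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    k /= k.sum()
    smooth = lambda v: np.convolve(v, k, mode='same')
    return np.apply_along_axis(smooth, 1, np.apply_along_axis(smooth, 0, f))

def edge_indicator(f, M=5.0, sigma=1.0):
    """Edge indicator T = 1 / (1 + M * |grad(G_sigma * f)|^2): close to 1
    in smooth areas, small on depth edges. M and sigma are examples."""
    g = gaussian_blur(f, sigma)
    gy, gx = np.gradient(g)               # central-difference image gradient
    return 1.0 / (1.0 + M * (gx**2 + gy**2))
```

On a synthetic depth step, T stays near 1 in flat regions and drops sharply on the edge, which is exactly the weighting that suppresses first-derivative diffusion there.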
According to the denoising model (19), edge detection on the depth map is performed via |∇(G_σ * f)|². If the value of |∇(G_σ * f)|² is large, the observed pixel lies on a depth edge, and the value of T decreases because of the larger denominator, so the first-derivative diffusion becomes relatively weak and the edge structure is preserved well. If the value of |∇(G_σ * f)|² is small, the observed pixel lies in a smooth area of the depth map, the value of T is close to 1, and the diffusion degree of the first derivative increases, so that the noise can be filtered effectively.
In conclusion, the edge indicator function can adaptively adjust the diffusion intensity in different frequency regions of depth images. For edge areas, the first derivative is used to preserve detailed features; for smooth areas, the higher-order derivative is used to filter out the noise, so as to effectively preserve edges while denoising depth maps with high-density Gaussian noise.

Numerical algorithms. In our denoising model, the regularization term is proper, convex and lower semi-continuous, and the data fidelity term is strictly convex. According to convex analysis theory 29, (19) is an unconstrained optimization problem with a unique minimum solution, that is, the global optimal solution 30.
We employ the FOPD method 25 as the numerical algorithm in order to solve the ESTGV model (19). The FOPD algorithm can solve optimization problems with efficient iterations.
According to the Legendre-Fenchel transformation 24, the dual form of (19) can be obtained as:

min_{u,p} max_{m∈M, n∈N} ⟨∇u − p, m⟩ + ⟨ε(p), n⟩ + (λ/2) ‖u − f‖², with M = {m : |m| ≤ a_1 T}, N = {n : |n| ≤ a_2}. (26)

Using the primal-dual algorithm 25 to solve (26), the iterative process is obtained:

m^{k+1} = proj_M(m^k + σ(∇ū^k − p̄^k)),
n^{k+1} = proj_N(n^k + σ ε(p̄^k)),
u^{k+1} = (u^k + τ(div m^{k+1} + λf)) / (1 + τλ),
p^{k+1} = p^k + τ(m^{k+1} + div_h n^{k+1}),
ū^{k+1} = 2u^{k+1} − u^k, p̄^{k+1} = 2p^{k+1} − p^k, (22)

where k denotes the number of iterations and σ, τ are the dual and primal step sizes. The divergence operator div is the dual operator of the gradient operator ∇, i.e. the negative conjugate of ∇, div = −∇*, whose discrete version is div(m_1, m_2) = ∂x⁻ m_1 + ∂y⁻ m_2; div_h denotes the negative conjugate of the second-order symmetric gradient operator ε, div_h = −ε*; ∂x⁻, ∂y⁻ are the first-order backward difference operators. The projections proj_M and proj_N in (22) can be easily obtained by applying the pointwise operations:

proj_M(m) = m / max(1, |m| / (a_1 T)), proj_N(n) = n / max(1, |n| / a_2). (24)

In conclusion, solving the denoising model (19) is an iterative process over the primal variables u, p and the dual variables m, n. The specific solving process of the primal-dual algorithm is as follows:

Step 1: Initialization: set k = 0, u⁰ = f, p⁰ = 0, m⁰ = 0, n⁰ = 0, ū⁰ = u⁰, p̄⁰ = p⁰, and choose step sizes τ, σ > 0.

Step 2: Use the iteration formulation (22) to compute m^{k+1}, n^{k+1}, u^{k+1}, p^{k+1}, ū^{k+1}, p̄^{k+1}.

Step 3: When u^{k+1}, p^{k+1} converge, terminate the algorithm; otherwise set k = k + 1 and continue iterating until the stopping criterion is satisfied.
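The steps above can be sketched end-to-end. This version assumes periodic image boundaries (so forward and backward differences built from np.roll are exactly adjoint) and illustrative parameter values; it is a sketch of the scheme, not the paper's exact implementation:

```python
import numpy as np

def estgv_denoise(f, T, lam=8.0, a1=1.0, a2=2.0, n_iter=300):
    """Primal-dual sketch of the weighted second-order TGV model: the edge
    indicator T scales the first-order dual constraint |m| <= a1*T.
    Periodic boundaries; parameter values are examples, not the paper's."""
    dxf = lambda v: np.roll(v, -1, axis=1) - v   # forward differences
    dyf = lambda v: np.roll(v, -1, axis=0) - v
    dxb = lambda v: v - np.roll(v, 1, axis=1)    # backward differences
    dyb = lambda v: v - np.roll(v, 1, axis=0)

    u = f.copy(); p1 = np.zeros_like(f); p2 = np.zeros_like(f)
    ub, p1b, p2b = u.copy(), p1.copy(), p2.copy()     # extrapolated copies
    m1 = np.zeros_like(f); m2 = np.zeros_like(f)      # dual of grad(u) - p
    n11 = np.zeros_like(f); n22 = np.zeros_like(f); n12 = np.zeros_like(f)
    tau = sigma = 1.0 / np.sqrt(12.0)     # tau*sigma*||K||^2 <= 1 for TGV2

    for _ in range(n_iter):
        # dual ascent + projection onto {|m| <= a1*T} (edge-weighted)
        m1 += sigma * (dxf(ub) - p1b); m2 += sigma * (dyf(ub) - p2b)
        s = np.maximum(1.0, np.sqrt(m1**2 + m2**2) / (a1 * T))
        m1 /= s; m2 /= s
        # dual ascent + projection onto {|n| <= a2}
        n11 += sigma * dxf(p1b); n22 += sigma * dyf(p2b)
        n12 += sigma * 0.5 * (dyf(p1b) + dxf(p2b))
        s = np.maximum(1.0, np.sqrt(n11**2 + n22**2 + 2 * n12**2) / a2)
        n11 /= s; n22 /= s; n12 /= s
        # primal descent: u gets the fidelity prox, p a plain gradient step
        u_old, p1_old, p2_old = u, p1, p2
        u = (u + tau * (dxb(m1) + dyb(m2)) + tau * lam * f) / (1 + tau * lam)
        p1 = p1 + tau * (m1 + dxb(n11) + dyb(n12))
        p2 = p2 + tau * (m2 + dxb(n12) + dyb(n22))
        ub = 2 * u - u_old; p1b = 2 * p1 - p1_old; p2b = 2 * p2 - p2_old
    return u
```

Because the second-order term penalizes ε(p) rather than ∇u directly, a noisy linear ramp is recovered without the staircasing that a first-order model would produce.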
The numerical implementation of the ESTGV model (19) requires discrete versions of the gradient operator and the divergence operator.
Assuming that the resolution of the depth map is s × s, the discrete first-order forward and backward difference operators are given by:

(∂x⁺ u)_{i,j} = u_{i,j+1} − u_{i,j} for j < s and 0 for j = s; (∂x⁻ u)_{i,j} = u_{i,j} − u_{i,j−1}, and analogously for ∂y⁺, ∂y⁻.

The discrete gradient operator is expressed as ∇ = (∂x⁺, ∂y⁺)^T, and the discrete second-order symmetric gradient operator ε is expressed as:

ε(p) = [ ∂x p_1, (∂y p_1 + ∂x p_2)/2 ; (∂y p_1 + ∂x p_2)/2, ∂y p_2 ].
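A numerical sketch of these discrete operators, assuming Neumann boundary conditions; the adjoint relation ⟨∇u, p⟩ = −⟨u, div p⟩ can then be verified directly:

```python
import numpy as np

def grad(u):
    """Forward-difference gradient with Neumann boundary (zero last row/col)."""
    ux = np.zeros_like(u); ux[:, :-1] = u[:, 1:] - u[:, :-1]
    uy = np.zeros_like(u); uy[:-1, :] = u[1:, :] - u[:-1, :]
    return ux, uy

def div(px, py):
    """Discrete divergence chosen as the negative adjoint of grad."""
    dx = np.zeros_like(px)
    dx[:, 0] = px[:, 0]
    dx[:, 1:-1] = px[:, 1:-1] - px[:, :-2]
    dx[:, -1] = -px[:, -2]
    dy = np.zeros_like(py)
    dy[0, :] = py[0, :]
    dy[1:-1, :] = py[1:-1, :] - py[:-2, :]
    dy[-1, :] = -py[-2, :]
    return dx + dy

def sym_grad(p1, p2):
    """Symmetric gradient epsilon(p): the distinct entries (e11, e22, e12)
    of the symmetric 2x2 tensor field."""
    p1x, p1y = grad(p1)
    p2x, p2y = grad(p2)
    return p1x, p2y, 0.5 * (p1y + p2x)
```

Checking the adjoint identity numerically on random fields is a quick way to confirm the boundary handling is consistent before plugging the operators into the iteration.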

Experimental results
In this section, we evaluate our proposed method both quantitatively and qualitatively against benchmark and state-of-the-art image denoising methods, including Bilateral Filtering (BF) 1, BM3D 2, WNNM 6, K-SVD 7, MLP 12, TNRD 13, CSF, Total Variation (TV) 22 and Total Generalized Variation (TGV) 24. The results of the comparison algorithms are either obtained from the original papers or from the source code provided by the authors.
Parameters Setting: The parameters of our ESTGV algorithm are set as follows: Lagrange multiplier λ = 10, Gaussian standard deviation σ = 0.2, contrast factor M = 5, filter kernel size 7 × 7, a_2 = 2, a_1 = 4, τ = 0.04. We also set parameters for each comparison method; some of these parameters are the defaults provided by the authors, while the others are selected by ourselves.
Quantitative results. We first evaluate our results on six benchmark depth maps (Art, Books, Dolls, Laundry, Moebius and Reindeer) from the Middlebury Stereo dataset 32. Note that we added zero-mean Gaussian noise with standard deviations 15, 20, 25 and 50 to all depth maps. We use the peak signal-to-noise ratio (PSNR) as the evaluation metric. Tables 1 and 2 show the comparison of our proposed method with respect to various methods.
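For reproducibility, the noise synthesis and the PSNR metric can be sketched as follows (the 8-bit peak value of 255 and the helper names are assumptions, not from the paper):

```python
import numpy as np

def psnr(ref, est, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference and an estimate."""
    mse = np.mean((ref.astype(np.float64) - est.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak**2 / mse)

def add_gaussian_noise(depth, sigma, rng=None):
    """Zero-mean Gaussian noise of standard deviation sigma (e.g. 15-50)."""
    rng = np.random.default_rng() if rng is None else rng
    return depth + rng.normal(0.0, sigma, depth.shape)
```

For example, noise of σ = 15 on an 8-bit map gives a PSNR near 10·log10(255²/15²) ≈ 24.6 dB before any denoising, which sets the baseline the tables improve upon.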
As shown in Tables 1 and 2, the PSNR results of each denoising algorithm differ across depth maps due to the structural variance of the six datasets, and the proposed ESTGV achieves the highest PSNR values in the vast majority of cases, a significant improvement over the other denoising methods. The second-best algorithm, MLP 12, outperformed the proposed ESTGV slightly on the Dolls dataset at σ = 25 and the Moebius dataset at σ = 50, by only 0.01 dB and 0.02 dB, respectively, but it underperformed the proposed ESTGV model on all datasets at the noise levels σ = 15 and σ = 20. Our results verify that the proposed Edge-guided Second-order Total Generalized Variation (ESTGV) provides superior performance for depth map denoising. They also indicate that MLP 12 requires training a separate denoising model for each individual noise level; if the noise level of the test dataset deviates from that of the training data, its performance is compromised, and training a separate MLP 12 model for every noise level is impractical. On average, the proposed ESTGV achieves 0.11 dB, 0.22 dB, 0.29 dB and 0.22 dB improvements over MLP 12 at the four noise levels (σ = 15, σ = 20, σ = 25 and σ = 50), respectively. Compared with the benchmark BF method 1, the proposed ESTGV improves PSNR by 3.29 dB, 3.17 dB, 3.11 dB and 4.22 dB, respectively. In summary, our results show that the proposed ESTGV provides robust performance at different noise levels in comparison to its counterparts.

Qualitative results. We evaluate the proposed ESTGV method visually in Figs. 3, 4, 5, 6, which show the visual comparison of different denoising methods on Art, Bowling, Aloe and Teddy from the Middlebury dataset, respectively. We have the following observations. MLP 12 over-smooths the textures, especially on the edges of the depth map.
TNRD 13 can reconstruct sharp edges and fine details, but it easily generates artifacts in smooth areas. BM3D 2 and CSF also generate blurred boundaries and block artifacts. Although WNNM 6 strikes a proper balance between noise removal and edge preservation, it can still over-smooth depth details and produce ringing phenomena. K-SVD 7 can preserve clear edges and rich details, but it also generates structural distortions such as jags and burrs in edge areas. TV 22 produces obvious "staircasing artifacts". Compared with TV 22, TGV 24 avoids "staircasing artifacts" and appears smoother and more natural visually, but it cannot preserve edges well because its over-smoothing reduces the sharpness of the depth map, causing some blocky effects in smooth areas. The result of BF 1 is the blurriest, with texture artifacts in various areas of the depth map. The proposed ESTGV clearly generates far fewer artifacts and preserves edge details much better than the other denoising algorithms. In summary, the proposed ESTGV algorithm is more robust to noise strength and produces much more pleasant visual outputs.

Conclusions
Aiming to extend research on improving the denoising capacity of generalized variational models on depth maps, we have presented an edge-guided second-order total generalized variation model for Gaussian noise removal (ESTGV) in this paper. ESTGV fuses an edge indicator function into the regularization term of the second-order total generalized variational model to guide the diffusion of gradients. It can adaptively use the first or second derivative to update the intensity of the diffusion tensor, and therefore effectively denoise and preserve edges at different noise intensities. Our quantitative and qualitative experimental results demonstrate that the proposed ESTGV method is more robust to noise strength than very recent state-of-the-art denoising algorithms. It not only yields visible PSNR improvements over state-of-the-art methods such as MLP, WNNM and TNRD, but also preserves depth structures much better and generates far fewer texture artifacts.