Introduction

Since the advent of the time-of-flight (ToF) sensor, research on denoising the depth maps it acquires has never stopped. As a major noise type in depth maps, Gaussian noise has received wide attention from researchers. Consequently, effective denoising methods have emerged, such as filtering-based methods1,2,3, partial differential equation (PDE)-based methods4,5,6, sparse representation-based dictionary learning methods7,8,9,10,11, deep learning-based methods12,13,14,15,16,17,18,19,20, and recent variation minimization-based methods21,22,23,24.

The filtering methods mainly include spatial-domain filtering and frequency-domain filtering. Regardless of which filter is used, some common problems arise, such as incomplete denoising, blurred edges, and inconsistency between the smoothed areas of the denoised image and the constant areas of the ground truth.

The PDE-based methods diffuse the smooth areas of an image quickly and the edge areas slowly by setting diffusion factors, and thereby achieve both denoising and edge preservation. However, for images polluted by high-density noise, the denoising effect of PDE-based methods is poor: the termination of the iterative process is hard to control and the convergence speed is slow, resulting in poor real-time performance.

For sparse representation-based methods, performance depends mainly on the image sparse domain. In practice, however, choosing the sparse domain is usually not easy because of the variety of images.

Deep learning-based methods have recently demonstrated high-quality denoising for various images. However, such methods also have drawbacks: optimizing the network structure requires extensive training, and the datasets required for training are often very large. In addition, specific models need to be trained for specific noise intensities, so their universality in application is still poor.

The variation minimization-based methods denoise by iterating a variational model composed of a regularization term and a data term. The most representative method of this type is the total generalized variation (TGV) model proposed by Bredies et al.24. The TGV model can not only remove noise effectively, but also preserve the detailed features of image edges well. However, when the image edges are heavily polluted by noise, the TGV model is prone to misjudging the noise as edges and is therefore unable to separate the noise from the edges.

The goal of our work is to denoise depth maps polluted by high-intensity Gaussian noise while effectively preserving the edges. In this work, we present an edge-guided second-order total generalized variation model for Gaussian noise removal from depth maps (ESTGV). In the ESTGV model, an edge indicator function based on the Canny algorithm is added to the regularization term of the second-order total generalized variation, which guides the diffusion of the gradients; the first or second derivative is then selected adaptively to update the diffusion tensor strength according to the value of the edge indicator function in the edge areas and the smooth areas. The minimum of the energy functional is approximated with the first-order primal–dual algorithm (FOPD)25, which effectively preserves edges while denoising depth maps corrupted by high-density Gaussian noise. The experimental results show that ESTGV outperforms state-of-the-art denoising methods not only in the PSNR index but also in visual comparison.

Related work

Total variation (TV) for image denoising

The total variation (TV) model is a classical image denoising algorithm proposed by Rudin et al.22, also known as the ROF model. The minimum energy functional of the TV model can be expressed as:

$$\mathop {\min }\limits_{u} \alpha \int\limits_{\Omega } {|\nabla u|} \,dxdy + \frac{\lambda }{2}\int\limits_{\Omega } {(u - f)^{2} } \,dxdy,$$
(1)

where \(u\) denotes the predicted image defined on \(\Omega \subset {\mathbb{R}}^{2}\), \(f\) denotes the noisy image, and the regularization term \(\int\limits_{\Omega } {|\nabla u|} dxdy\) denotes the prior constraint on the image \(u\), which is used for denoising. \(|\nabla u|\) denotes the magnitude of the gradient of \(u\). \(\frac{\lambda }{2}\int\limits_{\Omega } {(u - f)^{2} } dxdy\) is the data fidelity term, also known as the approximation term; it measures how closely \(u\) approximates \(f\) and constrains the total variation minimizing solution so that its deviation stays within a small range, thereby protecting the structural information of the image and reducing distortion. \(\alpha\) and \(\lambda\) are Lagrange scale constants that weight the regularization term and the data fidelity term, and they must strike a proper balance between edge preservation and noise removal. The larger \(\lambda\) is, the closer \(u\) approaches \(f\), but if \(\lambda\) is too large, the smoothing of local features is weakened and the denoising effect is reduced; the smaller \(\lambda\) is, the stronger the denoising, but if \(\lambda\) is too small, the image becomes blurred.

The Euler–Lagrange equation of energy functional (1) can be expressed as:

$$- \nabla \cdot \left( {\frac{{\nabla u}}{{\left| {\nabla u} \right|}}} \right) + \lambda (u - f) = 0,$$
(2)

where the first term \(- \nabla \cdot \left( {\frac{{\nabla u}}{{\left| {\nabla u} \right|}}} \right)\) is the diffusion term and \(\frac{1}{{\left| {\nabla u} \right|}}\) is the diffusion coefficient. In edge areas with large gradients, \(\left| {\nabla u} \right|\) is large, the diffusion coefficient is small, and the diffusion intensity along the edge direction is weak, so the edge detail features can be well preserved. In smooth areas with small gradients, \(\left| {\nabla u} \right|\) is small, the diffusion coefficient is large, and the denoising intensity is correspondingly higher.

The TV model distinguishes edges from noise by the gradient value, and diffusion is performed only in the direction orthogonal to the image gradient, so the edge structure can be effectively preserved. In the smooth areas of the image, however, the TV model can only approximate piecewise constant functions effectively, so it easily misjudges noise as edges to be preserved and eventually produces "staircasing artifacts" in piecewise-smooth areas.
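As a small worked example of Eq. (1), the following Python sketch evaluates the discrete ROF energy of a candidate image \(u\) against a noisy observation \(f\) using forward differences; the function name rof_energy and the parameter values are illustrative choices, not part of the original formulation.

```python
import numpy as np

def rof_energy(u, f, alpha=1.0, lam=10.0):
    """Discrete form of Eq. (1): alpha * sum |grad u| + (lam / 2) * sum (u - f)^2."""
    # Forward differences, zero at the last row/column (Neumann-type boundary).
    ux = np.zeros_like(u)
    uy = np.zeros_like(u)
    ux[:-1, :] = u[1:, :] - u[:-1, :]
    uy[:, :-1] = u[:, 1:] - u[:, :-1]
    tv_term = np.sum(np.sqrt(ux ** 2 + uy ** 2))       # regularization term
    fidelity_term = 0.5 * np.sum((u - f) ** 2)         # data fidelity term
    return alpha * tv_term + lam * fidelity_term

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.zeros((64, 64)); clean[:, 32:] = 1.0            # synthetic step edge
    noisy = clean + 0.1 * rng.standard_normal(clean.shape)     # Gaussian noise
    # For moderate lam, the clean image has lower energy than the noisy one.
    print(rof_energy(clean, noisy), rof_energy(noisy, noisy))
```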

Second-order total generalized variation

In order to solve the problem of "staircasing artifacts" in the denoising of the TV model, Bredies et al.24 proposed a generalized mathematical model of the TV algorithm, namely the total generalized variation (TGV). Unlike the TV model, which can only approximate piecewise constant functions, the convex optimization model generated by the TGV model can approximate piecewise polynomial functions of any order, thus effectively avoiding "staircasing artifacts" during denoising.

The k-order TGV model can be defined as:

$$TGV_{\alpha }^{k} (u) = \sup \left\{ {\int_{\Omega } {u{\text{ div}}^{k} v \, dx \, \left| { \, v \in c_{c}^{k} \left( {\Omega ,Sym^{k} \left( {{\mathbb{R}}^{d} } \right)} \right), \, \left\| {{\text{div}}^{l} v} \right\|_{\infty } \le \alpha_{l} ,l = 0, \cdots ,k - 1} \right.} } \right\}$$
(3)

where \(\Omega\) is an open set, \(\Omega \subset {\mathbb{R}}^{d}\), and \(u\) denotes the original image, \(u \in L_{loc}^{1} \left( \Omega \right)\). \(TGV_{\alpha }^{k}\) denotes the k-order total generalized bounded variation with weight coefficients \(\alpha\) (\(k \ge 1\), \(\alpha = \left( {\alpha _{0} , \cdots ,\alpha _{{k - 1}} } \right) > 0\)), and \(TGV_{\alpha }^{k}\) provides a way of automatically balancing between the first derivative and the k-order derivative. \({\text{div}}^{k} v\) denotes the k-order symmetric divergence operator. \(v \in c_{c}^{k} \left( {\Omega ,Sym^{k} \left( {{\mathbb{R}}^{d} } \right)} \right)\) denotes a k-order symmetric tensor field on \(\Omega\), where \(c_{c}^{k}\) is the space of compactly supported k-times continuously differentiable functions, \(Sym^{k} \left( {{\mathbb{R}}^{d} } \right)\) is the space of k-order symmetric tensors on \({\mathbb{R}}^{d}\), and \(\alpha_{l}\) are fixed positive parameters.

When \(k = 1\) and \(\alpha_{0} = 1\), Eq. (3) reduces to the dual form of the TV model:

$$TGV_{1}^{1} (u) = \sup \left\{ {\int_{\Omega } {u{\text{ div }}v{\mkern 1mu} dx{\mkern 1mu} \left| {\left\| v \right\|_{\infty } \le 1} \right.} } \right\} = TV\left( u \right)$$
(4)

According to Legendre–Fenchel convex conjugate transformation26, the original form of formula (3) can be obtained:

$$TGV_{\alpha }^{k} (u) = \mathop {\inf }\limits_{\begin{subarray}{l} u_{l} \in C_{c}^{k - l} \left( {\Omega ,Sym^{l} \left( {{\mathbb{R}}^{d} } \right)} \right) \\ l = 1, \cdots ,k - 1,u_{0} = u,u_{k} = 0 \end{subarray} } \sum\limits_{l = 1}^{k} {\alpha_{k - l} \left\| {\varepsilon (u_{l - 1} ) - u_{l} } \right\|_{1} } ,$$
(5)

where the operator \(\varepsilon (u_{l - 1} )\) denotes the symmetrized derivative, which can be expressed as:

$$\varepsilon (u_{l - 1} ) = \frac{1}{2}\left( {\nabla u_{l - 1} + \left( {\nabla u_{l - 1} } \right)^{T} } \right).$$
(6)

In particular, for \(k = 2\), Eq. (3) is the second-order TGV:

$$TGV_{\alpha }^{2} (u) = \mathop {\min }\limits_{\omega \in BD\left( \Omega \right)} \alpha_{1} \int_{\Omega } {\left| {\nabla u - \omega } \right|} \, dx + \alpha_{0} \int_{\Omega } {\left| {\varepsilon \left( \omega \right)} \right|} \, dx$$
(7)

where \(BD\left( \Omega \right)\) is the space of vector fields of bounded deformation27 and the symmetrized gradient is \(\varepsilon (\omega ) = \frac{1}{2}\left( {\nabla \omega + \left( {\nabla \omega } \right)^{T} } \right)\).

The corresponding second-order denoising model is expressed as:

$$\mathop {\min }\limits_{u} TGV_{\alpha }^{2} \left( u \right){ + }\frac{{\left\| {u - f} \right\|_{2}^{2} }}{2}$$
(8)

The TGV model can effectively avoid the "staircasing artifacts" of piecewise constant areas and is superior to the TV model in both denoising and edge preservation.
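To make Eqs. (7) and (8) concrete, the sketch below evaluates the discrete second-order TGV denoising energy for a given image \(u\) and auxiliary vector field \(\omega = (w1, w2)\); the helper names and the default weights are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def grad(u):
    """Forward-difference gradient with a zero last row/column; returns (u_x, u_y)."""
    ux = np.zeros_like(u); uy = np.zeros_like(u)
    ux[:-1, :] = u[1:, :] - u[:-1, :]
    uy[:, :-1] = u[:, 1:] - u[:, :-1]
    return ux, uy

def sym_grad(w1, w2):
    """Symmetrized gradient eps(w) of Eq. (6): returns (e11, e22, e12)."""
    w1x, w1y = grad(w1)
    w2x, w2y = grad(w2)
    return w1x, w2y, 0.5 * (w1y + w2x)

def tgv2_denoise_energy(u, w1, w2, f, a1=1.0, a0=2.0):
    """Discrete form of Eqs. (7)-(8): a1*||grad u - w||_1 + a0*||eps(w)||_1 + 0.5*||u - f||_2^2."""
    ux, uy = grad(u)
    e11, e22, e12 = sym_grad(w1, w2)
    first_order  = np.sum(np.sqrt((ux - w1) ** 2 + (uy - w2) ** 2))
    second_order = np.sum(np.sqrt(e11 ** 2 + e22 ** 2 + 2.0 * e12 ** 2))
    fidelity     = 0.5 * np.sum((u - f) ** 2)
    return a1 * first_order + a0 * second_order + fidelity
```

Minimizing this energy jointly over \(u\) and \(\omega\) lets smooth gradients be absorbed by \(\omega\), which is why TGV avoids the staircasing behaviour of TV.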

The first-order primal–dual algorithm (FOPD)

The FOPD algorithm is a particularly effective iterative algorithm proposed by Chambolle et al.25 in 2011 for solving convex optimization problems with a special proximal regularization parameter. The FOPD algorithm converts the original minimization problem into a dual maximization problem via dual variables, expresses the primal and dual problems as a corresponding saddle-point optimization problem, and then alternately iterates the primal and dual variables until the optimal solution of the original problem is approximated. The FOPD algorithm is easy to implement and alternates the iterations efficiently, so it converges quickly.

The general convex optimization problem is defined as:

$$\mathop {\min }\limits_{x \in X} F\left( {Ix} \right) + G\left( x \right),$$
(9)

the corresponding dual problem can be expressed as:

$$\mathop {\max }\limits_{y \in Y} - \left\{ {F^{ * } \left( y \right) + G^{ * } \left( { - I^{ * } y} \right)} \right\},$$
(10)

where \(x\) is the primal variable and \(y\) is the dual variable. \(X\) and \(Y\) denote finite-dimensional real vector spaces equipped with an inner product \(\langle \cdot , \cdot \rangle\) and norm \(\left\| \cdot \right\| = \sqrt {\langle \cdot , \cdot \rangle }\), \(I:X \to Y\) is a continuous linear operator, \(F\) and \(G\) are proper, convex, lower semi-continuous functions, \(F^{ * }\) and \(G^{ * }\) are the convex conjugates of \(F\) and \(G\), respectively, and \(I^{ * }\) is the adjoint operator of \(I\).

The saddle-point formulation of (9) and (10) is given by

$$\mathop {\min }\limits_{x \in X} \mathop {\max }\limits_{y \in Y} \langle Ix,y\rangle + G\left( x \right) - F^{ * } \left( y \right).$$
(11)

For \(F\left( {If} \right) = \left\| {\nabla f} \right\|_{1}\) and \({\text{G}}\left( f \right) = \frac{\lambda }{2}\left\| {f - g} \right\|_{2}^{2}\), (11) becomes

$$\mathop {\min }\limits_{f \in X} \mathop {\max }\limits_{y \in Y} \left\{ {\langle \nabla f,y\rangle_{Y} + \frac{\lambda }{2}\left\| {f - g} \right\|_{2}^{2} - \delta_{P} \left( y \right)} \right\},$$
(12)

where the function \(\delta_{P} \left( y \right)\) is the convex conjugate of \(F\left( g \right) = \left\| g \right\|_{1}\):

$$F^{*} \left( y \right) = \delta _{P} \left( y \right) = \left\{ {\begin{array}{*{20}l} 0 & {{\mkern 1mu} \left\| y \right\|_{\infty } \le 1} \\ \infty & {otherwise} \\ \end{array} } \right.$$
(13)

The convex set P is expressed as:

$$P = \left\{ {p \in Y:\left\| p \right\|_{\infty } \le 1} \right\}$$
(14)

\(\left\| p \right\|_{\infty }\) denotes the discrete maximum norm defined as:

$$\left\| p \right\|_{\infty } = \max \left| {p_{i,j} } \right|,\left| {p_{i,j} } \right| = \sqrt {\left( {p_{i,j}^{1} } \right)^{2} + \left( {p_{i,j}^{2} } \right)^{2} }$$
(15)

According to the projection approximation algorithm, the iteration formulas of the primal variable \(f\) and the dual variable \(y\) can be expressed as:

$$\begin{gathered} y^{n + 1} = (I + \sigma \partial F^{ * } )^{ - 1} (y^{n} + \sigma \nabla \overline{f}^{n} ) \hfill \\ \Leftrightarrow \, y^{n + 1} = \arg \, \mathop {\min}\limits_{y \in Y} \left\{ {\frac{1}{2\sigma } \cdot \left\| {y - (y^{n} + \sigma \nabla \overline{f}^{n} )} \right\|_{2}^{2} + \delta_{P} (y)} \right\} \hfill \\ \Leftrightarrow \, y_{i,j}^{n + 1} = \left( {y_{i,j}^{n} + \sigma \nabla \overline{f}_{i,j}^{n} } \right) \cdot \max \left( {1,\left| {y_{i,j}^{n} + \sigma \nabla \overline{f}_{i,j}^{n} } \right|} \right)^{ - 1} , \hfill \\ \end{gathered}$$
(16)

and

$$\begin{gathered} f^{n + 1} = (I + \tau \partial G)^{ - 1} (f^{n} + \tau {\text{div }}y^{n + 1} ) \hfill \\ \Leftrightarrow \, f^{n + 1} = \arg \, \mathop {\min}\limits_{f} \left\{ {\frac{1}{2\tau } \cdot \left\| {f - (f^{n} + \tau {\text{div }}y^{n + 1} )} \right\|_{2}^{2} + \frac{\lambda }{2}\left\| {f - g} \right\|_{2}^{2} } \right\} \hfill \\ \Leftrightarrow \, f_{i,j}^{n + 1} = \left( {\left( {f_{i,j}^{n} + \tau {\text{div }}y_{i,j}^{{^{n + 1} }} } \right) + \tau \lambda g_{i,j} } \right) \cdot \left( {1 + \tau \lambda } \right)^{ - 1} , \hfill \\ \end{gathered}$$
(17)

and

$$\overline{f}^{n + 1} = f^{n + 1} + \theta \left( {f^{n + 1} - f^{n} } \right) = 2f^{n + 1} - f^{n} .$$
(18)
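To illustrate how the updates (16)–(18) operate in practice, the following sketch applies them to the TV–ROF problem of Eq. (12); the discrete gradient and divergence follow the usual forward/backward difference pair, and the step sizes, iteration count, and function names are illustrative assumptions rather than settings taken from the paper.

```python
import numpy as np

def grad(u):
    """Forward differences (zero at the last row/column); returns stacked (u_x, u_y)."""
    gx = np.zeros_like(u); gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return np.stack((gx, gy))

def div(p):
    """Backward-difference divergence, the negative adjoint of grad."""
    px, py = p
    dx = np.zeros_like(px); dy = np.zeros_like(py)
    dx[0, :] = px[0, :]; dx[1:-1, :] = px[1:-1, :] - px[:-2, :]; dx[-1, :] = -px[-2, :]
    dy[:, 0] = py[:, 0]; dy[:, 1:-1] = py[:, 1:-1] - py[:, :-2]; dy[:, -1] = -py[:, -2]
    return dx + dy

def tv_denoise_fopd(g, lam=10.0, n_iter=300, tau=0.25, sigma=0.25):
    """First-order primal-dual iterations (16)-(18) for the saddle-point problem (12)."""
    f = g.copy()
    f_bar = g.copy()
    y = np.zeros((2,) + g.shape)
    for _ in range(n_iter):
        # Dual update (16): gradient ascent step, then projection onto ||y||_inf <= 1.
        y = y + sigma * grad(f_bar)
        magnitude = np.maximum(1.0, np.sqrt(y[0] ** 2 + y[1] ** 2))
        y = y / magnitude
        # Primal update (17): proximal step of the quadratic data term.
        f_new = (f + tau * div(y) + tau * lam * g) / (1.0 + tau * lam)
        # Over-relaxation (18) with theta = 1.
        f_bar = 2.0 * f_new - f
        f = f_new
    return f
```

With \(\tau \sigma \left\| \nabla \right\|^{2} \le 1\) (the discrete gradient satisfies \(\left\| \nabla \right\|^{2} \le 8\)), the scheme converges to the ROF minimizer; in our experience a few hundred iterations usually give a visually converged result on a 256 × 256 depth map.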

The proposed method

Edge-guided second-order total generalized variation for Gaussian noise removal

In order to solve the edge-blurring issue that arises when denoising depth maps with high-intensity Gaussian noise, we propose an edge-guided second-order total generalized variation model (ESTGV) for depth map Gaussian noise removal. By adding an edge indicator function into the second-order total generalized variation model, the ESTGV model aims to preserve the depth edges and detail features when denoising depth maps heavily polluted by Gaussian noise.

For the choice of the edge indicator function, we compared several other edge detection algorithms and found that the Roberts detector usually misses some edges, the Sobel and Prewitt detectors often generate false edges, the Laplacian detector is over-sensitive to noise and therefore not suitable for noisy images, and the LoG (Laplacian of Gaussian) detector usually loses sharp edges. The Canny detector, in contrast, adopts the principle of optimal edge detection; it has strong anti-noise ability, high completeness and continuity of the detected edges, and adaptive threshold generation, and its edge-preserving performance is clearly better than that of the other detectors.

By adding the Canny edge detection algorithm28 into the regularization term of the second-order TGV, we propose a weighted second-order TGV minimization denoising model:

$$\mathop {\min }\limits_{u} \Phi (u) + \frac{\lambda }{2}\int\limits_{\Omega } {(u - f)^{2} } dx$$
(19)

where \(\Phi (u)\) is the regularization term, \(\frac{\lambda }{2}\int\limits_{\Omega } {(u - f)^{2} } dx\) is the data fidelity term, \(u\) denotes the predicted (denoised) depth map, \(f\) denotes the noisy depth map, \(\lambda\) is the Lagrange constant factor balancing the regularization term and the data fidelity term, and \(\Omega\) is the domain of the pixels of the depth map, \((x,y) \in \Omega\).

The regularization term \(\Phi (u)\) is the weighted second-order total generalized variation \(TGV_{a}^{2} (u)\):

$$\Phi (u) = TGV_{a}^{2} (u) = \mathop {\min }\limits_{p} a_{1} \parallel \varepsilon (p)\parallel _{1} + a_{2} \parallel T(\nabla u - p)\parallel _{1}$$
(20)

where \(p \in R^{mn} \times R^{mn}\) is a bounded two-dimensional first-order vector field, \(p_{(i,j)} = [p_{i,j,1} ,p_{i,j,2} ]\), and \(a_{1}\) and \(a_{2}\) are two constants that balance the first and second derivatives. \(\nabla u\) is the gradient map of \(u\), which is used to distinguish noise from edges by its scale. The gradient of pixel \(u\left( {x,y} \right)\) on the depth map \(u\) is given by:

$$\left| {\nabla u(x,y)} \right|_{n} = \sqrt {\left( {\frac{{\partial u}}{{\partial x}}} \right)_{n}^{2} + \left( {\frac{{\partial u}}{{\partial y}}} \right)_{n}^{2} }$$
(21)

where \(\left( {\frac{\partial u}{{\partial x}}} \right)_{n}\) and \(\left( {\frac{\partial u}{{\partial y}}} \right)_{n}\) denote the horizontal and vertical gradients at the nth iteration, respectively, which can be calculated by central differences:

$$\left\{ \begin{gathered} \left( {\frac{\partial u}{{\partial x}}} \right)_{n} = \frac{1}{2}\left( {u\left( {x + 1,y} \right)_{n} - u\left( {x - 1,y} \right)_{n} } \right) \hfill \\ \left( {\frac{\partial u}{{\partial y}}} \right)_{n} = \frac{1}{2}\left( {u\left( {x,y + 1} \right)_{n} - u\left( {x,y - 1} \right)_{n} } \right) \hfill \\ \end{gathered} \right.$$
(22)

\(T\) in (20) denotes the edge indicator function:

$$T = \frac{1}{{1 + M|\nabla G_{\sigma } * f|^{2} }}$$
(23)

where M ≥ 0 is the contrast factor, \(G_{\sigma }\) is the Gaussian filtering kernel with a mean value of 0 and a standard deviation of σ, * denotes the convolution operation, and \(G_{\sigma } *f\) denotes the depth map pre-smoothed by the Gaussian kernel \(G_{\sigma }\).

\(\varepsilon (p)\) in (20) denotes the second-order symmetric gradient operator:

$$\varepsilon (p) = \frac{{\nabla p + (\nabla p)^{T} }}{2} = \left[ {\begin{array}{*{20}c} {\varepsilon (p)_{i,j,1} } & {\varepsilon (p)_{i,j,3} } \\ {\varepsilon (p)_{i,j,3} } & {\varepsilon (p)_{i,j,2} } \\ \end{array} } \right]$$
(24)

The Gaussian kernel \(G_{\sigma }\) used by the Canny edge detection algorithm is:

$$G_{\sigma } (x,y) = \frac{1}{{2\pi \sigma^{2} }}\exp \left[ { - \frac{{x^{2} + y^{2} }}{{2\sigma^{2} }}} \right]$$
(25)

where σ is the standard deviation of the Gaussian filter, which adjusts the smoothing level. The smaller the value of σ, the higher the localization accuracy of the filter, but the lower the signal-to-noise ratio (SNR); the larger the value of σ, the higher the SNR, but the lower the localization accuracy of the filter.

According to the denoising model (19), edge detection on the depth map is performed through \(|\nabla G_{\sigma } *f|^{2}\). If the value of \(|\nabla G_{\sigma } *f|^{2}\) is large, the observed pixel lies on a depth edge; the value of T decreases because the denominator becomes larger, so the first-derivative diffusion becomes relatively weak and the edge structure is well preserved. If the value of \(|\nabla G_{\sigma } *f|^{2}\) is small, the observed pixel lies in a smooth area of the depth map; the value of \(T\) is close to 1 and the diffusion of the first derivative increases, so the noise can be filtered out effectively.

In conclusion, the edge indicator function can adaptively adjust the diffusion intensity in regions of different frequency content in the depth image. In edge areas, the first derivative is used to preserve the detailed features; in smooth areas, the higher-order derivative is used to filter out the noise, so edges are effectively preserved while denoising depth maps with high-density Gaussian noise.
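The edge indicator of Eq. (23) is straightforward to compute; the sketch below pre-smooths the noisy depth map with a Gaussian kernel and takes central differences as in Eq. (22). The use of scipy.ndimage.gaussian_filter, the synthetic test image, and the parameter values are illustrative assumptions (only the contrast factor M = 5 matches the setting reported in the experiments).

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def edge_indicator(f, M=5.0, sigma=1.0):
    """Edge indicator T = 1 / (1 + M * |grad(G_sigma * f)|^2), Eq. (23).

    T stays close to 1 in smooth areas (strong diffusion) and drops towards 0
    on depth edges (weak diffusion), as described in the text.
    """
    smoothed = gaussian_filter(f, sigma=sigma)          # G_sigma * f
    # Central differences as in Eq. (22); border rows/columns are left at zero.
    gx = np.zeros_like(smoothed); gy = np.zeros_like(smoothed)
    gx[1:-1, :] = 0.5 * (smoothed[2:, :] - smoothed[:-2, :])
    gy[:, 1:-1] = 0.5 * (smoothed[:, 2:] - smoothed[:, :-2])
    return 1.0 / (1.0 + M * (gx ** 2 + gy ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    depth = np.zeros((128, 128)); depth[:, 64:] = 100.0        # synthetic depth edge
    noisy = depth + 5.0 * rng.standard_normal(depth.shape)
    T = edge_indicator(noisy, M=5.0, sigma=1.5)
    print(T[:, 63:66].mean(), T[:, :32].mean())   # T is much smaller across the edge than in the flat region
```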

Numerical algorithms

In our denoising model, the regularization term is proper, convex, and lower semi-continuous, and the data fidelity term is strictly convex. According to convex analysis theory29, (19) is an unconstrained optimization problem and has a unique minimum, i.e., the global optimal solution30.

We employ the FOPD method25 as the numerical algorithm in order to solve the ESTGV model (19). The FOPD algorithm can solve optimization problems with efficient iterations.

According to the Legendre–Fenchel transformation24, the primal–dual (saddle-point) form of (19) can be obtained:

$$\mathop {\min }\limits_{u,p} \mathop {\max }\limits_{m \in M,n \in N} \langle \nabla u - p,m\rangle + \langle \varepsilon (p),n\rangle + \frac{\lambda }{2}\left\| {u - f} \right\|_{2}^{2}$$
(26)

where

$$\begin{gathered} M = \left\{ {m = (m_{1} ,m_{2} )^{T} :\left| {m(x)} \right| \le Ta_{2} } \right\}, \hfill \\ N = \left\{ {n = \left( {\begin{array}{*{20}c} {n_{11} } & {n_{12} } \\ {n_{21} } & {n_{22} } \\ \end{array} } \right):\left\| n \right\|_{\infty } \le a_{1} } \right\} \hfill \\ \end{gathered}$$
(27)

Using the primal–dual algorithm25 to solve (26), the iterative process is obtained:

$$\left\{ \begin{gathered} m^{k + 1} = proj_{M} (m^{k} + a(\nabla \overline{u}^{k} - \overline{p}^{k} )) \hfill \\ n^{k + 1} = proj_{N} (n^{k} + a(\varepsilon (\overline{p}^{k} ))) \hfill \\ u^{k + 1} = (u^{k} + \tau ({\text{div }}m^{k + 1} + \lambda f)) \cdot (1 + \tau \lambda )^{ - 1} \hfill \\ p^{k + 1} = p^{k} + \tau ({\text{div}}^{h} \, n^{k + 1} + m^{k + 1} ) \hfill \\ \overline{u}^{k + 1} = u^{k + 1} + \eta (u^{k + 1} - u^{k} ) \hfill \\ \overline{p}^{k + 1} = p^{k + 1} + \eta (p^{k + 1} - p^{k} ) \hfill \\ \end{gathered} \right.$$
(28)

where \(k\) denotes the number of iterations. The divergence operator \(div\) is the dual operator of the gradient operator \(\nabla\) and equals its negative adjoint, i.e. \(div = - \nabla^{*}\); its discrete version is \(div(m_{1} ,m_{2} ) = \partial_{x}^{ - } m_{1} + \partial_{y}^{ - } m_{2}\). \(div^{h}\) denotes the negative adjoint of the second-order symmetric gradient operator \(\varepsilon\), \(div^{h} = - \varepsilon^{*}\). When

$$n = \left( {\begin{array}{*{20}c} {n_{11} } & {n_{12} } \\ {n_{21} } & {n_{22} } \\ \end{array} } \right),\;div^{h} (n) = \left( {\begin{array}{*{20}c} {\partial _{x}^{ - } n_{11} + \partial _{y}^{ - } n_{12} } \\ {\partial _{x}^{ - } n_{21} + \partial _{y}^{ - } n_{22} } \\ \end{array} } \right).$$

\(\partial_{x}^{ - }\) and \(\partial_{y}^{ - }\) are the first-order backward difference operators.

The projections in (28) can easily be obtained by applying the pointwise operations below:

$$\left\{ {\begin{array}{*{20}l} {proj_{M} (m) = m \cdot \max \left( {1,\frac{{\left| m \right|}}{{Ta_{2} }}} \right)^{ - 1} } \\ {proj_{N} (n) = n \cdot \max \left( {1,\frac{{\left| n \right|}}{{a_{1} }}} \right)^{ - 1} } \\ \end{array} } \right.$$
(29)
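A pointwise NumPy implementation of the projections in Eq. (29) could look as follows; here m is stored as a 2-vector field, n as a symmetric 2 × 2 field with components (n11, n22, n12), and the function names are illustrative.

```python
import numpy as np

def proj_M(m, T, a2):
    """Project the dual field m onto M = {|m(x)| <= T * a2} pointwise, Eq. (29)."""
    magnitude = np.sqrt(m[0] ** 2 + m[1] ** 2)
    return m / np.maximum(1.0, magnitude / (T * a2))

def proj_N(n, a1):
    """Project the symmetric dual field n = (n11, n22, n12) onto {||n||_inf <= a1}."""
    # Pointwise Frobenius-type magnitude; the off-diagonal component counts twice.
    magnitude = np.sqrt(n[0] ** 2 + n[1] ** 2 + 2.0 * n[2] ** 2)
    return n / np.maximum(1.0, magnitude / a1)
```

Note that the edge indicator \(T\) enters only the constraint set \(M\) of the first-order dual variable, consistent with Eq. (27).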

In conclusion, solving the denoising model (19) is an iterative process over the primal variables \(u,p\) and the dual variables \(m,n\). The specific solving process of the primal–dual algorithm is as follows:

Step 1: Initialization:

$$\begin{aligned} & u^{0} = \bar{u}^{0} = f,p^{0} = \bar{p}^{0} = 0,m^{0} = n^{0} = 0, \\ & \eta _{m} > 0,\eta _{n} > 0,\tau _{u} > 0,\tau _{p} > 0,\lambda > 0,k \ge 0. \\ \end{aligned}$$

Step 2: Use the iteration formulation (28) to compute \(m^{k + 1} ,n^{k + 1} ,u^{k + 1} ,p^{k + 1} ,\overline{u}^{k + 1} ,\overline{p}^{k + 1}\).

Step 3: When \(u^{k + 1} ,p^{k + 1}\) converge, the algorithm terminates; otherwise, set \(k = k + 1\) and continue to iterate until the stopping criterion is satisfied.
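One common way to realize the convergence test of Step 3 is a relative-change criterion on the primal variables, as sketched below; the tolerance value is an illustrative choice, not one prescribed by the paper.

```python
import numpy as np

def converged(u_new, u_old, p_new, p_old, tol=1e-4):
    """Step 3: stop when the relative change of the primal variables is small."""
    du = np.linalg.norm(u_new - u_old) / max(np.linalg.norm(u_old), 1e-12)
    dp = np.linalg.norm(p_new - p_old) / max(np.linalg.norm(p_old), 1e-12)
    return du < tol and dp < tol
```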

The numerical implementation of the ESTGV model (19) requires discrete versions of the gradient operator and the divergence operator.

Assuming that the resolution of the depth map is n × n, the discrete versions of the first-order forward and backward difference operators are given by

$$\begin{aligned} (\partial _{x}^{ + } u)_{{i,j}} & = \left\{ {\begin{array}{*{20}l} {u_{{i + 1,j}} - u_{{i,j}} ,} \hfill & {1 \le i < n} \hfill \\ {0,} \hfill & {i = n} \hfill \\ \end{array} } \right. \\ (\partial _{y}^{ + } u)_{{i,j}} & = \left\{ {\begin{array}{*{20}l} {u_{{i,j + 1}} - u_{{i,j}} ,} \hfill & {1 \le j < n} \hfill \\ {0,} \hfill & {j = n} \hfill \\ \end{array} } \right. \\ (\partial _{x}^{ - } u)_{{i,j}} & = \left\{ {\begin{array}{*{20}l} {u_{{i,j}} - u_{{i - 1,j}} ,} \hfill & {1 < i < n} \hfill \\ {u_{{1,j}} ,} \hfill & {i = 1} \hfill \\ { - u_{{n - 1,j}} ,} \hfill & {i = n} \hfill \\ \end{array} } \right. \\ (\partial _{y}^{ - } u)_{{i,j}} & = \left\{ {\begin{array}{*{20}l} {u_{{i,j}} - u_{{i,j - 1}} ,} \hfill & {1 < j < n} \hfill \\ {u_{{i,1}} ,} \hfill & {j = 1} \hfill \\ { - u_{{i,n - 1}} ,} \hfill & {j = n} \hfill \\ \end{array} } \right. \\ \end{aligned}$$
(30)

The discrete version of the gradient operator is expressed as \(\nabla = \left( {\partial _{x}^{ + } ,\partial _{y}^{ + } } \right)^{T}\), and the discrete second-order symmetric gradient operator \(\varepsilon\) is expressed as:

$$\varepsilon (p) = \frac{1}{2}(\nabla p + \nabla p^{T} ) = \left[ {\begin{array}{*{20}c} {\partial_{x}^{ + } p_{1} } & {\frac{1}{2}(\partial_{y}^{ + } p_{1} + \partial_{x}^{ + } p_{2} )} \\ {\frac{1}{2}(\partial_{x}^{ + } p_{2} + \partial_{y}^{ + } p_{1} )} & {\partial_{y}^{ + } p_{2} } \\ \end{array} } \right]$$
(31)
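The difference operators of Eq. (30) and the discrete symmetric gradient of Eq. (31) can be written compactly in NumPy, and the adjoint relation \(div = - \nabla^{*}\) that the primal–dual iterations rely on can be verified numerically; all function names below are illustrative.

```python
import numpy as np

def dxp(u):
    """Forward difference in x, zero in the last row (Eq. 30)."""
    d = np.zeros_like(u); d[:-1, :] = u[1:, :] - u[:-1, :]; return d

def dyp(u):
    """Forward difference in y, zero in the last column (Eq. 30)."""
    d = np.zeros_like(u); d[:, :-1] = u[:, 1:] - u[:, :-1]; return d

def dxm(v):
    """Backward difference in x with the boundary rules of Eq. (30)."""
    d = np.empty_like(v)
    d[0, :] = v[0, :]; d[1:-1, :] = v[1:-1, :] - v[:-2, :]; d[-1, :] = -v[-2, :]
    return d

def dym(v):
    """Backward difference in y with the boundary rules of Eq. (30)."""
    d = np.empty_like(v)
    d[:, 0] = v[:, 0]; d[:, 1:-1] = v[:, 1:-1] - v[:, :-2]; d[:, -1] = -v[:, -2]
    return d

def sym_grad(p1, p2):
    """Discrete symmetric gradient eps(p) of Eq. (31): returns (e11, e22, e12)."""
    return dxp(p1), dyp(p2), 0.5 * (dyp(p1) + dxp(p2))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    u = rng.standard_normal((32, 32))
    m1, m2 = rng.standard_normal((2, 32, 32))
    lhs = np.sum(dxp(u) * m1 + dyp(u) * m2)      # <grad u, m>
    rhs = -np.sum(u * (dxm(m1) + dym(m2)))       # -<u, div m>
    print(abs(lhs - rhs))                        # ~1e-12: div is the negative adjoint of grad
```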

Experimental results

In this section, we evaluate our proposed method both quantitatively and qualitatively with respect to other benchmark and state-of-the-art image denoising methods.

Experimental platform: The simulations are carried out in MATLAB R2018a on a laptop with a quad-core 2.2 GHz Intel(R) i7-4770HQ CPU, 16 GB RAM, and Intel(R) Iris(TM) Pro Graphics 5200.

Experimental Dataset: We evaluated our denoising method on a public dataset, the Middlebury stereo dataset31,32. The resolution of each depth map is 256 × 256.

Baseline Methods: We compare our results with the following five categories of methods. (1) Filtering-based methods: Bilateral Filter (BF)1 and Block-Matching and 3D Filtering (BM3D)2. (2) PDE-based method: Weighted Nuclear Norm Minimization (WNNM)6. (3) Sparse representation-based method: K-Singular Value Decomposition (K-SVD)7. (4) Deep learning-based methods: Multi-Layer Perceptron (MLP)12, Trainable Nonlinear Reaction Diffusion (TNRD)13 and A Cascade of Shrinkage Fields (CSF)14. (5) Variational methods: Total Variation (TV)22 and Total Generalized Variation (TGV)24. The results of the comparison algorithms are obtained either from the original papers or from the source code provided by the authors.

Parameters Setting: The parameters of our ESTGV algorithm are set as follows: Lagrange multiplier \(\lambda = 10\), Gaussian noise of standard deviation \(\sigma = 0.2\), contrast factor M = 5, filter kernel size \(7 \times 7\), \(a_{2} = 2\), \(a_{1} = 4\), and \(\tau = 0.04\). We also set parameters for each comparison method; at least one of these parameters comes from the defaults provided by the authors, while the others are selected by ourselves.

Quantitative results

We first evaluate our results on six benchmark maps (Art, Books, Dolls, Laundry, Moebius and Reindeer) from the Middlebury stereo dataset32. Note that we added zero-mean Gaussian noise with standard deviations of 15, 20, 25 and 50 to all depth maps. We use the peak signal-to-noise ratio (PSNR) as the evaluation metric. Tables 1 and 2 compare the proposed method with the other methods.
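For reference, the noise model and the PSNR metric used in the tables can be written as in the sketch below; the peak value of 255 assumes 8-bit depth maps and, like the helper names, is an assumption on our part.

```python
import numpy as np

def add_gaussian_noise(depth, sigma, seed=0):
    """Add zero-mean Gaussian noise with standard deviation sigma (e.g. 15, 20, 25 or 50)."""
    rng = np.random.default_rng(seed)
    return depth + sigma * rng.standard_normal(depth.shape)

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB of an estimate against the clean reference."""
    mse = np.mean((reference.astype(np.float64) - estimate.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```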

Table 1 PSNR (dB) evaluation of different algorithms under two noise levels (\(\sigma = 15\), \(\sigma = 20\)) on the Middlebury dataset.
Table 2 PSNR (dB) evaluation of different algorithms under two noise levels (\(\sigma = 25\), \(\sigma = 50\)) on the Middlebury dataset.

As shown in Tables 1 and 2, the PSNR results of the denoising algorithms differ across depth maps because of the internal structure variance of the six datasets, and the proposed ESTGV achieves the highest PSNR values on the different depth maps in the vast majority of cases, a significant improvement over the other denoising methods. The second-best denoising algorithm, MLP12, outperforms the proposed ESTGV slightly on the Dolls dataset at \(\sigma = 25\) and the Moebius dataset at \(\sigma = 50\), by only 0.01 dB and 0.02 dB, respectively, but it underperforms the proposed ESTGV model on all datasets at the noise levels \(\sigma = 15\) and \(\sigma = 20\). Our results verify that the proposed edge-guided second-order total generalized variation (ESTGV) provides superior performance for depth map denoising. They also indicate that MLP12 requires a separate denoising model to be trained for each individual noise level; if the noise level of the test dataset deviates from that of the training data, its performance is compromised, and training a separate MLP12 model for every noise level is impractical. On average, the proposed ESTGV achieves 0.11 dB, 0.22 dB, 0.29 dB and 0.22 dB improvements over MLP12 at the four individual noise levels (\(\sigma = 15\), \(\sigma = 20\), \(\sigma = 25\) and \(\sigma = 50\)), respectively. Compared with the benchmark BF method1, the proposed ESTGV improves by 3.29 dB, 3.17 dB, 3.11 dB and 4.22 dB, respectively. In summary, our results show that the proposed ESTGV provides robust performance at different noise levels in comparison to its counterparts.

Figures 1 and 2 intuitively show the results of removing the four levels of Gaussian noise from the Middlebury dataset32 by the 10 algorithms, in the form of a histogram and a line graph, respectively.

Figure 1

PSNR (dB) results of different denoising methods on Middlebury datasets (\(\sigma = 15,\;\sigma = 20\)).

Figure 2

PSNR (dB) results of different denoising methods on Middlebury datasets (\(\sigma = 15,\;\sigma = 20\)).

Qualitative results

We evaluate the proposed ESTGV method visually in Figs. 3, 4, 5, 6, which show the visual comparison of different denoising methods on Art, Bowling, Aloe and Teddy from the Middlebury dataset, respectively. We make the following observations. MLP12 over-smooths the textures, especially around the edges of the depth map. TNRD13 can reconstruct sharp edges and fine details, but it easily generates artifacts in smooth areas. BM3D2 and CSF14 also generate blurred boundaries and block artifacts. Although WNNM6 achieves a proper balance between noise removal and edge preservation, it can still over-smooth the depth details and produce ringing phenomena. K-SVD7 can preserve clear edges and rich details, but it also generates structural distortions such as jags and burrs in edge areas. TV22 produces obvious "staircasing artifacts". Compared with TV22, TGV24 avoids "staircasing artifacts" and appears smoother and more natural visually, but it cannot preserve the edges well because its over-smoothing reduces the sharpness of the depth map and causes some blocky effects in smooth areas. The result of BF1 is the blurriest, with texture artifacts in various areas of the depth map. It is obvious that the proposed ESTGV generates far fewer artifacts and preserves the edge details much better than the other denoising algorithms. In summary, the proposed ESTGV algorithm is more robust to noise strength and produces much more pleasant visual outputs.

Figure 3

Denoising results on depth map Art by different methods (noise level \(\sigma = 15\)).

Figure 4

Denoising results on depth map Bowling by different methods (noise level \(\sigma = 20\)).

Figure 5

Denoising results on depth map Aloe by different methods (noise level \(\sigma = 25\)).

Figure 6

Denoising results on depth map Teddy by different methods (noise level \(\sigma = 50\)).

Conclusions

Aiming to extend the research on improving the denoising capacity of generalized variational models on depth maps, we have presented an edge-guided second-order total generalized variation model for Gaussian noise removal (ESTGV) in this paper. ESTGV fuses an edge indicator function into the regularization term of the second-order total generalized variation model to guide the diffusion of gradients. It can adaptively use the first or second derivative to update the intensity of the diffusion tensor, and therefore effectively denoises and preserves edges under different noise intensity levels. Our quantitative and qualitative experimental results demonstrate that the proposed ESTGV method is more robust to noise strength than very recent state-of-the-art denoising algorithms. It not only yields visible PSNR improvements over state-of-the-art methods such as MLP, WNNM and TNRD, but also preserves the depth structures much better and generates far fewer texture artifacts.