Introduction

The technique of imaging objects out of the direct line of sight has attracted increasing attention in recent years1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26. A typical non-line-of-sight (NLOS) imaging scenario is looking around the corner with a relay surface, where the target is obscured from the vision of the observer. NLOS imaging aims to recover the albedo and surface normal of the hidden targets with the measured photon information. Potential applications of NLOS imaging include but are not limited to robotic vision, autonomous driving, rescue operations, remote sensing and medical imaging.

To achieve NLOS reconstruction, laser pulses of high temporal resolution are used to illuminate several points on the relay surface, where the first diffuse reflection occurs. After that, photons enter the NLOS domain and are bounced back to the visible surface by the unknown targets. The hidden targets can then be reconstructed from the time-resolved photon intensity measured at several detection points on the visible surface. Commonly used time-resolved detectors are single-photon avalanche diodes (SPADs)27. The imaging system is confocal if the illumination point coincides with the detection point for each spatial measurement, and non-confocal otherwise. Besides, we call the measurements regular if the illumination and detection points are uniformly distributed in a rectangular region.
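As a small illustration of this terminology, a scan can be checked for confocality as follows (a hypothetical helper written for this exposition; it is not part of any published NLOS code):

```python
def is_confocal(pairs, tol=1e-9):
    """Return True if every illumination point coincides with its detection
    point, i.e. the scan is confocal; otherwise the scan is non-confocal.

    pairs: iterable of ((xi, yi, zi), (xd, yd, zd)) measurement pairs."""
    return all(all(abs(i - d) <= tol for i, d in zip(illum, det))
               for illum, det in pairs)
```
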

According to the representation of the hidden surface, existing imaging algorithms are divided into three categories: point-cloud-based28, mesh-based29 and voxel-based methods1,8,9,30,31,32,33,34,35. Among these categories, voxel-based algorithms prove to be the most efficient, offering low time complexity32 and fine reconstruction results34. For voxel-based methods, the reconstruction domain is discretized with three-dimensional grid points and the albedo is represented as a grid function.

The first voxel-based NLOS reconstruction method is the back-projection algorithm proposed by Velten et al.1. The measured photon intensity is modeled as a linear operator applied to the albedo, and the targets are reconstructed by applying the adjoint operator to the measured data. Further improvements of the back-projection method include rendering approaches for fast implementation2,16 and filtering techniques33,36 for noise reduction. The light-cone transform30 (LCT) proposed by O’Toole et al. describes the physical process as a convolution of the light-cone kernel and the hidden target. In this way, the reconstruction is formulated as a deblurring problem and can be computed efficiently using the fast Fourier transform. The directional light-cone transform31 (D-LCT) generalizes this method and simultaneously reconstructs the albedo and surface normal of the hidden target. The frequency-wavenumber migration8 (F-K) method uses the wave equation to reconstruct the albedo and can also be implemented efficiently in the frequency domain. The LCT, D-LCT and F-K methods work directly only under confocal settings. Although it is possible to transform data collected in non-confocal setups into confocal form, the approximation error cannot be neglected34. To reconstruct the hidden object under non-confocal settings, the phasor field32 (PF) method formulates the NLOS detection process as one of diffractive wave propagation and provides a direct solution with low time complexity. Its recent extension with SPAD arrays reconstructs live low-latency videos of NLOS scenes37. The signal-object collaborative regularization34 (SOCR) method considers priors on both the reconstructed target and the measured signal, which leads to high-quality reconstruction with little background noise. For scenarios with non-planar relay surfaces, the F-K and back-projection type methods can be used directly. Algorithms designed only for planar relay settings can be applied using signal shifting techniques8,14.

Despite these breakthroughs, two major obstacles of existing methods toward practical applications are the need for a large relay surface and dense measurements. If the relay surface is irregular or small, these algorithms may fail due to the lack of data. Besides, dense measurement results in a long acquisition time, which poses a significant challenge for applications such as autonomous driving, where the observer may move at high speed. In recent works, it was reported that sparse measurements could be used to reconstruct the hidden scenes. Isogawa et al. showed that the target could be reconstructed with confocal and circular NLOS scans38. Sparse measurements from square grids scanned on the relay surface could also be used by incorporating the compressed sensing technique35. Besides, a single shot can be used to track a moving hidden target17, although the reconstruction fails when the target is stationary due to the ill-posedness of the inverse problem.

In this work, we propose a Bayesian framework for NLOS reconstruction that is applicable to any spatial pattern of the illumination and detection points. By introducing a virtual confocal signal at rectangular grid points, we design joint regularizations for the measured signal, the virtual confocal signal and the hidden target. We put forward a confocal complemented signal-object collaborative regularization (CC-SOCR) framework, which reconstructs both the albedo and surface normal of the hidden target. The proposed method allows regular and irregular measurement patterns in both confocal and non-confocal scenarios. Besides, our approach provides faithful reconstructions with negligible background noise, even in cases with very coarse and noisy measurements. Notably, the proposed method suggests a paradigm shift, freeing NLOS imaging research from its heavy reliance on the assumption of a large relay surface covering an entire region (wall, ground). Our method demonstrates high-quality NLOS reconstructions in various scenarios where the relay surface has discrete scattering regions, an arbitrary irregular shape, or a very limited size, enabling hidden object reconstruction with far more types of realistic relay surfaces such as window shutters, window frames, and fences, which significantly broadens the scope of NLOS imaging applications. As shown in Fig. 1, the illumination and detection patterns are irregular but ubiquitous in daily life. Reconstruction results of the bunny with synthetic confocal signals39, detected at the entire relay surface and in these four scenarios, are provided in Supplementary Figs. 1–5.

Fig. 1: Irregular illumination and detection patterns for NLOS imaging.
figure 1

a The relay is a fence. b The relay is a horizontal shutter. c The relay is an array of window edges. d The relay is a set of several sticks sparsely and randomly distributed.

Results

The NLOS physical model

The goal of NLOS imaging is to take a collection of measured transient data and find the target that comes closest to fitting these measured signals. In this work, we adopt the physical model proposed in SOCR34. Let \({x}_{i}^{{\prime} }\) and \({x}_{d}^{{\prime} }\) be the illumination and detection points on the visible surface, and we call \(({x}_{i}^{{\prime} },{x}_{d}^{{\prime} })\) an active measurement pair, or simply a pair in the following. The photon intensity measured at time t is given by

$$\tau ({x}_{i}^{{\prime} },{x}_{d}^{{\prime} },t)={\int }_{\varOmega }\frac{({x}_{d}^{{\prime} }-x)\cdot {{{{{\bf{n}}}}}}(x)}{{|{x}_{i}^{{\prime} }-x|}^{2}{|{x}_{d}^{{\prime} }-x|}^{3}}f(x)\delta (|{x}_{i}^{{\prime} }-x |+|{x}_{d}^{{\prime} }-x|-ct)dx$$
(1)

in which Ω is the three-dimensional reconstruction domain, f(x) denotes the albedo value of the point x, n(x) is the unit surface normal at x that points towards the visible surface. The unit vector n(x) can be arbitrarily chosen for points with zero albedo value. By denoting \({{{{{\bf{u}}}}}}=f{{{{{\bf{n}}}}}}\), Eq. (1) is written equivalently as

$$\tau ({x}_{i}^{{\prime} },{x}_{d}^{{\prime} },t)={\int }_{\varOmega }\frac{({x}_{d}^{{\prime} }-x)\cdot {{{{{\bf{u}}}}}}(x)}{{|{x}_{i}^{{\prime} }-x|}^{2}{|{x}_{d}^{{\prime} }-x|}^{3}}\delta (|{x}_{i}^{{\prime} }-x |+|{x}_{d}^{{\prime} }-x|-ct)dx$$
(2)

Noting that the intensity is linear in u, the physical model can be written in discrete form as \({{{{{\boldsymbol{\tau }}}}}}=A{{{{{\bf{u}}}}}}\). The albedo and surface normal can be obtained directly from u. Indeed, the albedo of a voxel x is given by the norm of the vector u(x), and the surface normal is obtained by normalizing u(x). The surface normal is not defined where the albedo is zero.
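For concreteness, the discrete forward model of Eq. (2) and the recovery of albedo and surface normal from u can be sketched as follows (a minimal illustration in Python, assuming a voxelized domain and a delta kernel binned into fixed-width time bins; the constants C and DT and the function names are ours, not the authors' released implementation):

```python
import numpy as np

C = 3e8      # speed of light (m/s)
DT = 32e-12  # time-bin width (s), matching the 32 ps resolution used later

def forward_pair(x_i, x_d, voxels, u, T):
    """Simulate Eq. (2) for one (illumination, detection) pair.

    voxels: (V, 3) voxel centers; u: (V, 3) directional albedo f*n.
    The delta kernel is binned into T time bins of width DT."""
    r_i = np.linalg.norm(voxels - x_i, axis=1)           # |x_i' - x|
    d_vec = x_d - voxels                                 # x_d' - x
    r_d = np.linalg.norm(d_vec, axis=1)                  # |x_d' - x|
    weight = np.einsum('vj,vj->v', d_vec, u) / (r_i**2 * r_d**3)
    bins = np.floor((r_i + r_d) / (C * DT)).astype(int)  # arrival time bin
    tau = np.zeros(T)
    valid = bins < T
    np.add.at(tau, bins[valid], weight[valid])           # accumulate photons
    return tau

def albedo_and_normal(u):
    """Albedo is |u(x)|; the normal is u(x)/|u(x)| where the albedo is nonzero."""
    f = np.linalg.norm(u, axis=1)
    n = np.divide(u, f[:, None], out=np.zeros_like(u), where=f[:, None] > 0)
    return f, n
```

Stacking `forward_pair` over all measurement pairs yields the discrete linear operator A applied to u.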

The measured signal

To reconstruct the hidden target, we consider a collection of M measurements. Let \({p}_{m}=({x}_{m}^{p},{y}_{m}^{p},{z}_{m}^{p})\) be the coordinates of the mth illumination point, in which \({x}_{m}^{p}\), \({y}_{m}^{p}\) and \({z}_{m}^{p}\) are the coordinates in the horizontal, vertical and depth directions. We denote by \({q}_{m}=({x}_{m}^{q},{y}_{m}^{q},{z}_{m}^{q})\) the coordinates of the mth detection point, and call \(({p}_{m},{q}_{m})\) a measurement pair. For each pair, the photon counts of the first T time bins are collected. The coordinates of all measurement pairs are written as \({C}_{meas}=\{({x}_{m}^{p},{y}_{m}^{p},{z}_{m}^{p},{x}_{m}^{q},{y}_{m}^{q},{z}_{m}^{q})|m\in [M]\}\), in which we denote by [M] the set \(\{1,2,\ldots,M\}\). Let \(\tilde{{{{{{\bf{b}}}}}}}\) be the noisy signal measured at Cmeas. In practice, various types of noise inevitably corrupt the measured signals and significantly degrade the quality of the reconstruction. To mitigate the effects of noise and improve the reconstruction quality, we introduce the estimated signal b as an approximation of the ideal signal considered at the measured locations. The variable b is treated as a random vector so that it can be determined under the Bayesian framework. Besides, we denote the simulated signal considered at the set Cmeas as \({A}_{{{{{{\bf{b}}}}}}}{{{{{\bf{u}}}}}}\), in which Ab is the discrete physical model defined in Eq. (2).

The virtual confocal signal

We discretize the reconstruction domain Ω with \(V= \{({x}_{i},{y}_{j},{z}_{k}) |i\in [I],j\in [J],k\in [K]\}\), in which xi, yj and zk are coordinates of the voxel in the horizontal, vertical and depth directions, respectively. When the number of measurement pairs is small, the solution to the least-squares reconstruction problem may not be unique due to the lack of data. To overcome the rank deficiency of the measurement matrix, we introduce the virtual confocal signal d considered at the regular focal points \(({x}_{i},{y}_{j},0)\), in which \(i\in [I]\) and \(j\in [J]\). The set of measurement pairs of the virtual confocal signal is denoted as \({C}_{virt}=\{({x}_{i},{y}_{j},0,{x}_{i},{y}_{j},0) |i\in [I],j\in [J]\}\). The simulated signal generated with Eq. (2) at the set Cvirt is denoted by \({A}_{{{{{{\bf{d}}}}}}}{{{{{\bf{u}}}}}}\). The variable d is treated as an optimization variable and obtained together with the reconstruction by solving the optimization problem introduced in the next subsection. Let \({C}_{common}={C}_{meas}\cap {C}_{virt}\); we denote by \({R}_{{{{{{\bf{b}}}}}}}({{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}})\) the subset of b that is spatially located at the set \({C}_{common}\). We also write \({R}_{{{{{{\bf{d}}}}}}}({{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}})\) for the subset of the signal d considered at the set \({C}_{common}\). When \({C}_{common}\) is empty, both \({R}_{{{{{{\bf{b}}}}}}}({{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}})\) and \({R}_{{{{{{\bf{d}}}}}}}({{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}})\) are empty sets.
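The restriction operators Rb and Rd amount to indexing b and d at the shared pairs, which can be sketched as follows (our illustration of the definition above, with pairs encoded as 6-tuples of coordinates as in Cmeas; not the authors' code):

```python
def common_pairs(C_meas, C_virt):
    """Indices into b and d for the pairs in C_common = C_meas ∩ C_virt.

    C_meas, C_virt: lists of 6-tuples (x_p, y_p, z_p, x_q, y_q, z_q).
    Returns (idx_b, idx_d) such that b[idx_b] plays the role of R_b(b, d)
    and d[idx_d] the role of R_d(b, d); both lists are empty when
    C_common is empty."""
    virt_index = {pair: k for k, pair in enumerate(C_virt)}
    idx_b, idx_d = [], []
    for m, pair in enumerate(C_meas):
        if pair in virt_index:
            idx_b.append(m)
            idx_d.append(virt_index[pair])
    return idx_b, idx_d
```
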

The Bayesian framework

We treat the reconstructed target u, the measured signal \(\tilde{{{{{{\bf{b}}}}}}}\), the approximated signal b, and the virtual confocal signal d as random vectors and formulate the imaging task as an optimization problem using Bayesian inference. The target and signals are obtained simultaneously by maximizing the joint posterior probability.

$$({{{{{{\bf{u}}}}}}}^{\ast },\,{{{{{{\bf{b}}}}}}}^{\ast },\,{{{{{{\bf{d}}}}}}}^{\ast })=\mathop{{{\arg }}\,\max }\limits_{{{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}}}{\mathbb{P}}({{{{{\bf{u}}}}}},\,{{{{{\bf{b}}}}}},\,{{{{{\bf{d}}}}}}|\tilde{{{{{{\bf{b}}}}}}})$$
(3)

Three assumptions are made to formulate this as a concrete optimization problem. Firstly, the conditional distribution of the measured signal \(\tilde{{{{{{\bf{b}}}}}}}\) given u, b and d is

$${\mathbb{P}}(\tilde{{{{{{\bf{b}}}}}}}|{{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}})={\mathbb{P}}(\tilde{{{{{{\bf{b}}}}}}}|{{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}})=\exp (-{|{{{{{\bf{b}}}}}}-\tilde{{{{{{\bf{b}}}}}}}|}^{2}-\varUpsilon ({{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}},\tilde{{{{{{\bf{b}}}}}}}))$$
(4)

in which ϒ is related to the joint prior distribution of u, b and \(\tilde{{{{{{\bf{b}}}}}}}\). With this assumption, d does not provide additional information to predict \(\tilde{{{{{{\bf{b}}}}}}}\) when b is known. Secondly, the joint prior distribution of u and b is

$${\mathbb{P}}({{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}})=\exp (-{|{A}_{{{{{{\bf{b}}}}}}}{{{{{\bf{u}}}}}}-{{{{{\bf{b}}}}}}|}^{2}-\varGamma ({{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}}))$$
(5)

in which \(\varGamma\) describes the prior distribution of u and b. The estimated signal b is less noisy than the measured data and is closer to the ideal signal of certain real-world targets, which helps to enhance the reconstruction quality. Thirdly, the conditional distribution of d given u and b is

$${\mathbb{P}}({{{{{\bf{d}}}}}}|{{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}})=\exp (-{|{R}_{{{{{{\bf{b}}}}}}}({{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}})-{R}_{{{{{{\bf{d}}}}}}}({{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}})|}^{2}-{|{A}_{{{{{{\bf{d}}}}}}}{{{{{\bf{u}}}}}}-{{{{{\bf{d}}}}}}|}^{2}-\varXi ({{{{{\bf{u}}}}}},{{{{{\bf{d}}}}}}))$$
(6)

in which Rb(b,d) and Rd(b,d) are the subsets of the signals b and d that share the same measurement pairs. \(\varXi ({{{{{\bf{u}}}}}},{{{{{\bf{d}}}}}})\) is related to the joint prior distribution of the target u and the virtual confocal signal d.

With these assumptions, we derive a concrete optimization problem using the Bayesian formula.

$$({{{{{{\bf{u}}}}}}}^{\ast },{{{{{{\bf{b}}}}}}}^{\ast },{{{{{{\bf{d}}}}}}}^{\ast }) =\mathop{{{\arg }}\,\max }\limits_{{{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}}}\,{\mathbb{P}}({{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}}|\tilde{{{{{{\bf{b}}}}}}})\\= \mathop{{{\arg }}\,\max }\limits_{{{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}}}\,{\mathbb{P}}(\tilde{{{{{{\bf{b}}}}}}}|{{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}}){\mathbb{P}}({{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}})\\= \mathop{{{\arg }}\,\max }\limits_{{{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}}}\,{\mathbb{P}}(\tilde{{{{{{\bf{b}}}}}}}|{{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}}){\mathbb{P}}({{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}})\\= \mathop{{{\arg }}\,\max }\limits_{{{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}}}\,{\mathbb{P}}(\tilde{{{{{{\bf{b}}}}}}}|{{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}}){\mathbb{P}}({{{{{\bf{d}}}}}}|{{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}}){\mathbb{P}}({{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}})\\= \mathop{{{\arg }}\,\min }\limits_{{{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}}}{|{{{{{\bf{b}}}}}}-\tilde{{{{{{\bf{b}}}}}}}|}^{2}+{|{R}_{{{{{{\bf{b}}}}}}}({{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}})-{R}_{{{{{{\bf{d}}}}}}}({{{{{\bf{b}}}}}},{{{{{\bf{d}}}}}})|}^{2}+{|{A}_{{{{{{\bf{d}}}}}}}{{{{{\bf{u}}}}}}-{{{{{\bf{d}}}}}}|}^{2}\,\\ +{|{A}_{{{{{{\bf{b}}}}}}}{{{{{\bf{u}}}}}}-{{{{{\bf{b}}}}}}|}^{2}+\varUpsilon ({{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}},\tilde{{{{{{\bf{b}}}}}}})+\varXi ({{{{{\bf{u}}}}}},{{{{{\bf{d}}}}}})+\varGamma ({{{{{\bf{u}}}}}},{{{{{\bf{b}}}}}})$$
(7)

in which the third equality follows from Eq. (4) and the last equality holds by Eqs. (4), (5) and (6). By designing appropriate regularization terms \(\varUpsilon\), \(\varXi\) and \(\varGamma\), we obtain high-quality reconstructions of the targets even in scenarios with highly incomplete measurements. The proposed framework and the designed collaborative regularizations are illustrated in Fig. 2a. Concrete expressions of the regularizations are provided in the Methods section. We term the proposed method confocal complemented signal-object collaborative regularization (CC-SOCR) due to the introduced virtual confocal signal d and the regularizations imposed on the signals and the target.
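The final minimization in Eq. (7) can be evaluated term by term as follows (an illustrative sketch: the forward operators, index sets and regularizers are passed in as callables and placeholders; the concrete regularizers are given in the Methods section):

```python
import numpy as np

def cc_socr_objective(u, b, d, b_tilde, A_b, A_d, idx_b, idx_d,
                      Upsilon, Xi, Gamma):
    """Evaluate the CC-SOCR objective of Eq. (7).

    A_b, A_d: callables implementing the discrete forward model at
    C_meas and C_virt; idx_b/idx_d index the shared pairs C_common;
    Upsilon, Xi, Gamma: regularization terms (placeholder callables)."""
    data_fit = np.sum((b - b_tilde) ** 2)           # |b - b~|^2
    common_fit = np.sum((b[idx_b] - d[idx_d]) ** 2) # |R_b - R_d|^2
    virt_fit = np.sum((A_d(u) - d) ** 2)            # |A_d u - d|^2
    meas_fit = np.sum((A_b(u) - b) ** 2)            # |A_b u - b|^2
    return (data_fit + common_fit + virt_fit + meas_fit
            + Upsilon(u, b, b_tilde) + Xi(u, d) + Gamma(u, b))
```
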

Fig. 2: The proposed CC-SOCR method.
figure 2

a The CC-SOCR framework. For high quality reconstructions, the measured signal, estimated signal and the virtual confocal signal are treated as random variables and solved simultaneously using the Bayesian inference method. b The measured signal, the estimated signal, the virtual confocal signal and the reconstructed target are shown from left to right. The confocal measured data for the instance of the statue is provided in the Stanford dataset8. The relay region consists of four letters N, L, O, and S.

In the following, we compare the reconstruction results of the proposed method with the Laplacian of Gaussian filtered back-projection33 (LOG-BP), F-K, LCT, PF and SOCR methods. For the LCT method, we adopt the D-LCT31 extension that reconstructs both the albedo and surface normal. For the PF method, we adopt the implementation with the back-projection (PF-BP) algorithm9 and the Rayleigh–Sommerfeld diffraction (PF-RSD) algorithm32. Performance comparisons of all these methods are shown in Table 1. To bring existing methods into comparison in scenarios with incomplete measurements, we interpolate the signal with the nearest neighbor method8,35, which generates better results than zero padding32 (see Supplementary Fig. 24).
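The nearest neighbor interpolation used to extend incomplete measurements onto a full grid can be sketched as follows (our illustration; each grid point simply copies the transient of its closest measured scan position):

```python
import numpy as np

def nn_interpolate(points, signals, grid_points):
    """Nearest-neighbor interpolation of transients onto a full grid.

    points: (M, 2) measured scan positions with transients signals (M, T);
    grid_points: (G, 2) positions of the full regular grid.
    Returns a (G, T) array where each grid point takes the transient
    of its nearest measured point."""
    d2 = ((grid_points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    nearest = d2.argmin(axis=1)   # index of the closest measurement
    return signals[nearest]
```
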

Table 1 Comparisons of eight NLOS reconstruction algorithms

Results on synthetic data

Instead of using an entire planar visible surface, we assume the relay to be a square box that simulates the four edges of a window. The hidden object is a regular quadrangular pyramid, whose base length and height are 1 m and 0.2 m, respectively. The central axis of the pyramid is perpendicular to the plane in which the relay square box lies, and the distance of the pyramid to this plane is 0.5 m. The albedo of the pyramid is assumed to be constant. As shown in Fig. 3a, we simulate the signal measured at 36 points with Eq. (1). The points are scanned exhaustively: one point is illuminated at a time while signals are detected at all points. The dataset contains signals measured at 36 confocal and 1260 non-confocal pairs. The time resolution is set to 32 ps. Note that the LCT, D-LCT, F-K, PF-RSD and SOCR methods do not work directly in this scenario. We compare the reconstruction result of the proposed method with LOG-BP. The maximum intensity projections are shown in Fig. 3c and Fig. 3d. The reconstructed albedo is normalized to the range [0,1]. Albedo values less than 0.25 are thresholded to zero. The LOG-BP method fails to locate the target correctly and contains misleading artifacts near the boundary of the reconstruction domain. The proposed method locates the target correctly and does not contain noise in the background. The maximum depth error of the CC-SOCR reconstruction is 0.02 m, which is much smaller than that of the LOG-BP reconstruction (0.12 m). The absolute depth errors are shown in Fig. 3f. Classification error, defined as the percentage of excessive and missing voxels of the reconstruction, is used to assess how well the methods locate the target. The classification error of the CC-SOCR reconstruction is 2.86%, nearly one order of magnitude smaller than that of the LOG-BP reconstruction (21.75%).
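The classification error metric can be computed as follows (an illustrative sketch; the normalization of the percentage, here by the number of occupied ground-truth voxels, is our assumption since the text does not specify the denominator):

```python
import numpy as np

def classification_error(recon, truth, threshold=0.25):
    """Percentage of excessive plus missing voxels after thresholding.

    recon: reconstructed albedo volume, normalized to [0, 1] and
    thresholded at `threshold`, as described in the text;
    truth: ground-truth albedo volume (nonzero = occupied).
    NOTE: normalizing by the occupied ground-truth voxel count is an
    assumption of this sketch."""
    r = recon / recon.max()
    occ_r = r >= threshold          # reconstructed occupancy
    occ_t = truth > 0               # ground-truth occupancy
    excessive = np.logical_and(occ_r, ~occ_t).sum()
    missing = np.logical_and(~occ_r, occ_t).sum()
    return 100.0 * (excessive + missing) / occ_t.sum()
```
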

Fig. 3: Reconstruction results of the pyramid (non-confocal, synthetic signal).
figure 3

a The illumination and detection points are shown in yellow. b Ground truth. The albedo as well as the depth, horizontal and vertical components of the directional albedo are shown from left to right. c The reconstructed albedo of the LOG-BP algorithm. d Reconstructed albedo and surface normal of the proposed CC-SOCR method. The albedo as well as the depth, horizontal and vertical components of the directional albedo are shown from left to right. e The depth of the LOG-BP, CC-SOCR reconstructions and the ground truth are shown from left to right. f The absolute depth error of the LOG-BP and CC-SOCR reconstructions are shown from left to right. The background is shown in black. Excessive voxels reconstructed are shown in white.

Results on measured data

For confocal experiments, we use the instance of the statue in the Stanford dataset8 to test the performance of the proposed method. The target is 1 m away from the visible planar surface. In the original dataset, 512 × 512 focal points are raster-scanned in a square region of size 2 × 2 m2. The time resolution is 32 ps and the total exposure time is 60 min. An evenly distributed 64 × 64 dataset is sub-sampled from the original dataset; measuring this sub-sampled signal would take an exposure time of 56.25 s. The oracles shown in Fig. 4a and Fig. 5a are generated with the SOCR method using this sub-sampled signal. To simulate the case where the relay surface is a horizontal shutter, we extract only the signals measured at 21 rows of the downsampled data, as shown in the yellow region of Fig. 4b. From bottom to top, the five equispaced regions contain 3, 5, 5, 5 and 3 rows of measurements, respectively. The dataset contains signals measured at 1344 focal points, which would take 18.46 s for data acquisition. Reconstruction results are shown in Fig. 4. The LOG-BP reconstruction is noisy. The reconstruction results of the F-K, D-LCT and SOCR algorithms are blurry and contain artifacts. The proposed method reconstructs the target faithfully.

Fig. 4: Reconstructions of the statue with the relay surface in the shape of a horizontal shutter (confocal, measured signal).
figure 4

a The oracle is generated with the SOCR method with 64 × 64 measurements34. b Confocal signals are measured in the yellow region. c Reconstructed albedo of the LOG-BP algorithm. d Reconstructed albedo of the F-K algorithm. e Reconstructed albedo and surface normal of the D-LCT method. The albedo as well as the depth, horizontal and vertical components of the directional albedo are shown from left to right. f Reconstructed albedo and surface normal of the SOCR method. The albedo as well as the depth, horizontal and vertical components of the directional albedo are shown from left to right. g Reconstructed albedo and surface normal of the CC-SOCR method. The albedo as well as the depth, horizontal and vertical components of the directional albedo are shown from left to right.

Fig. 5: Reconstructions of the statue with 10 × 10 confocal measurements (confocal, measured signal).
figure 5

a The oracle is generated with the SOCR method with 64 × 64 measurements34. b Confocal signals are measured at the yellow points. c Reconstructed albedo of the LOG-BP algorithm. d Reconstructed albedo of the F-K algorithm. e Reconstructed albedo and surface normal of the D-LCT method. The albedo as well as the depth, horizontal and vertical components of the directional albedo are shown from left to right. f Reconstructed albedo and surface normal of the SOCR method. The albedo as well as the depth, horizontal and vertical components of the directional albedo are shown from left to right. g Reconstructed albedo and surface normal of the CC-SOCR method. The albedo as well as the depth, horizontal and vertical components of the directional albedo are shown from left to right.

Figure 5 shows the reconstruction results of the statue with signals detected at 10 × 10 uniformly distributed focal points in a square region of size 2 × 2 m2, which would take 1.37 s for the measurements. The points scanned are shown in Fig. 5b. The LOG-BP reconstruction contains heavy background noise and the target cannot be clearly identified. The F-K and D-LCT reconstructions are blurry and also contain background noise. The SOCR reconstruction contains artifacts, indicating that the error of the signal introduced in the nearest neighbor interpolation process cannot be neglected. In contrast, the proposed method locates the target correctly and reconstructs more details than other methods. More reconstruction results with different numbers of uniformly distributed confocal measurements are compared in Supplementary Figs. 6–10.

Figure 6 shows the reconstruction results of the statue obtained with signals measured at different regions of the relay surface: a set of 200 randomly distributed focal points in an area of size 2 × 2 m2; a region consisting of 5 equispaced vertical bars with 1344 focal points; a region that consists of four letters N, L, O and S with 825 focal points; a region made up of several sticks sparsely and randomly distributed with 1229 focal points; and a heart-shaped region with 258 focal points. These results indicate the capability of the proposed method in reconstructing the hidden target under various relay settings. For the case of the heart-shaped relay, the CC-SOCR method locates the target correctly, while all other methods fail. The measured signal, approximated signal and virtual confocal signal of the scenario with measurements at the four letters N, L, O and S are shown in Fig. 2b. The virtual confocal signal plays an important role for high-quality reconstruction. The three views and surface normal of the reconstructions as well as more comparisons under different relay settings are provided in Supplementary Figs. 11–17.

Fig. 6: Reconstructions of the statue under representative cases with different relays (confocal, measured signal).
figure 6

The illumination regions are shown in yellow in the first column. The reconstructed albedo of F-K, D-LCT, SOCR and CC-SOCR methods are compared in the second to fifth columns.

For non-confocal experiments, we use the measured data of the instance of the figure 4 provided by the phasor field method32. The hidden object is 1 m away from the visible wall. The temporal resolution is 16 ps. We pick out the signal measured at 64 × 64 illumination points in a square region of size 1.27 × 1.27 m2. The detection point is 0.64 m to the left of and 0.55 m below the illumination region. In addition to the selected signal, we also use four subsets of the signal to reconstruct the target: signals measured at five equispaced vertical bars that contain 3, 5, 5, 5, and 3 columns of focal points from left to right; signals measured at five equispaced horizontal bars that contain 3, 5, 5, 5, and 3 rows of focal points from bottom to top; signals measured at 14 × 14 uniformly distributed focal points in an area of 1.27 × 1.27 m2; and signals measured at 200 randomly chosen focal points. To bring the PF-RSD and SOCR methods into comparison, the nearest neighbor interpolation technique is applied to extend the signal to 64 × 64 illuminations. As shown in Fig. 7, the LOG-BP and PF-BP reconstructions are noisy and contain artifacts. The PF-RSD reconstructions also contain artifacts. Both the SOCR and CC-SOCR methods reconstruct the target successfully. However, the SOCR reconstructions contain artifacts (the third row) or lose some details (the fourth and fifth rows). These results also indicate that the bias of the signal obtained from the nearest neighbor interpolation leads to non-negligible reconstruction error. The proposed CC-SOCR method provides faithful reconstructions in all cases. The three views and surface normal of the reconstructions are provided in Supplementary Figs. 18–22.

Fig. 7: Reconstruction results of the instance of the figure 4 (non-confocal, measured signal).
figure 7

The illumination regions are shown in yellow in the first column. Reconstructed albedo of the LOG-BP, PF-BP, PF-RSD, SOCR and CC-SOCR methods are shown in the second to sixth columns.

For scenarios with non-planar relay surfaces, we use the measured data in the Stanford dataset8 to test the proposed method. The original dataset contains confocal signals measured at 128 × 128 focal points and is sub-sampled to 64 × 64. The NLOS scene contains two retroreflective letters, which introduces a bias relative to the physical model used. We extract subsets of the sub-sampled dataset to construct confocal and non-planar signals with irregular measurement patterns, shown in opaque in the first column of Fig. 8. The proposed CC-SOCR method works directly under these settings and the results are shown in the last column. The LOG-BP method also works directly under these settings, but the reconstructions are of low quality and contain heavy background noise (see Supplementary Fig. 28). To bring the F-K, D-LCT and SOCR methods into comparison, we shift the signal in the temporal dimension with the technique provided by the code of the F-K method. The shifted signals are then interpolated to 64 × 64 in the spatial dimensions using the nearest neighbor method and serve as inputs to the conventional imaging methods. As shown in the last row of Fig. 8, the proposed method locates the targets correctly with the oval-shaped non-planar illumination region, while all other methods fail.

Fig. 8: Reconstructions of the letters N and T with irregular and non-planar relay settings (confocal, measured signal).
figure 8

The illumination regions are shown in opaque in the first column. The F-K, D-LCT, SOCR and CC-SOCR reconstructions are shown in the second to the fifth columns, respectively.

Discussion

We have proposed a framework for the general setting of NLOS imaging. In this section, we discuss its relationship with the original SOCR method, the complexity of the algorithm and possible directions for further improvements.

The SOCR method reconstructs the albedo and surface normal of the hidden targets under both confocal and non-confocal settings. However, the experimental setup is still quite limited. As demonstrated in the original paper34, it only deals with signals measured at regular grid points. This is due to the spatial correlation of the signals in the regularization term.

The proposed CC-SOCR method generalizes the SOCR method to the most general setup, where no limitations on the measurement pairs are required. CC-SOCR differs from SOCR in three aspects. Firstly, the introduced virtual confocal signal overcomes the rank deficiency of the measurement matrix, making the method capable of reconstructing targets under more general settings. Secondly, CC-SOCR does not include spatial correlations of the measured signal in the regularization term. As discussed in the Methods section, in CC-SOCR, the Wiener filter is applied only along the temporal dimension of the measured signal. Thirdly, the priors imposed on the target are related not only to the measured data but also to the introduced virtual confocal signal. Concrete expressions of these regularization terms are provided in the Methods section.

The proposed optimization problem can be solved efficiently using the alternating iteration method. In Supplementary Note 2, we decompose the problem into several sub-problems and discuss in detail the solution to each sub-problem. We also provide a guide for choosing parameters in Supplementary Note 3. Convergence of all sub-problems is guaranteed, as discussed in the work of the SOCR method34 and in Supplementary Note 2. However, global convergence is not guaranteed because the sub-problem of updating the reconstructed target is solved approximately. Nonetheless, extensive results in Supplementary Note 1 demonstrate the capability of the proposed method to provide high-quality reconstructions in various scenarios.
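The alternating scheme can be sketched as follows (a structural illustration only: the per-variable sub-problem solvers are passed in as callables, and both the update order and the function names are our assumptions; the actual solvers are derived in Supplementary Note 2):

```python
def cc_socr_iterate(u0, b_tilde, update_u, update_b, update_d, n_iters=10):
    """Skeleton of an alternating iteration for Eq. (7).

    Each sweep minimizes the objective over one variable with the
    others held fixed:
      update_d(u, b)          -> virtual confocal signal sub-problem
      update_b(u, d, b_tilde) -> estimated-signal sub-problem
      update_u(b, d)          -> target sub-problem (solved approximately)"""
    u = u0
    b = b_tilde.copy()
    d = None
    for _ in range(n_iters):
        d = update_d(u, b)
        b = update_b(u, d, b_tilde)
        u = update_u(b, d)
    return u, b, d
```
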

When the reconstruction domain is discretized with N×N×N voxels and the signal is detected at M measurement pairs, the memory complexity of the CC-SOCR algorithm is \({{{{{\rm{O}}}}}}(\max \{{N}^{3},MN\})\). The time complexity per iteration is \({{{{{\rm{O}}}}}}(\max \{{N}^{5},M{N}^{3}\})\), which is also the overall computational complexity. In Supplementary Note 4, we provide a detailed discussion of the complexity and report the running time for the instance of the statue with 200 randomly distributed confocal measurements. For the special case of N×N confocal measurements, the time and memory complexities are \({{{{{\rm{O}}}}}}({N}^{5})\) and \({{{{{\rm{O}}}}}}({N}^{3})\), respectively, the same as for the SOCR algorithm. To reduce the computational complexity, the virtual confocal signal can be defined on coarser grids. The time complexity reduces to \({{{{{\rm{O}}}}}}({N}^{4})\) in scenarios with \({{{{{\rm{O}}}}}}(N)\) measurement pairs if the virtual confocal signals are considered at \(\sqrt{N}\times \sqrt{N}\) focal points. In Supplementary Fig. 23, we compare the reconstruction results of the statue with virtual confocal signals of sizes 64 × 64, 32 × 32, 16 × 16 and 8 × 8 in an area of 2 × 2 m2, respectively. The execution times are provided in Supplementary Tables 3–6. Besides, the CC-SOCR algorithm can be implemented using the embarrassingly parallel paradigm, and the imaging process can be accelerated with GPU implementations of the code on large-scale parallel computing platforms. In the future, we would like to implement an octree representation of the reconstruction domain to reduce the complexity of the proposed method.

In CC-SOCR, virtual confocal signals observed at planar rectangular grid points complement the reconstruction process in the case of incomplete measurements. It is also possible to consider virtual non-confocal signals for stronger regularization, and virtual confocal signals at several planes may be introduced to exploit additional spatial correlation. However, the time and memory complexities increase accordingly.

With sufficient measurements, both the SOCR method and the CC-SOCR method provide high-quality reconstructions (see Supplementary Figs. 1, 11, 18). When the number of measurement pairs is small, however, the reconstruction problem is ill-posed. Although a complete signal can be obtained with interpolation techniques, existing methods still fail due to the bias introduced into the signal (Supplementary Fig. 27). The introduced virtual confocal signal benefits from the regularization guided by the simulated signal of the target and leads to faithful reconstructions. In the absence of the virtual confocal signal, the reconstructions may be blurry (Supplementary Fig. 25) or contain background artifacts (Supplementary Fig. 26). Besides, the CC-SOCR algorithm provides a robust way to convert measured non-confocal NLOS signals to their confocal counterparts; the generated confocal signal for the instance of Fig. 4 is provided in the supplementary code.

Methods

The joint regularizations

In Eq. (7), we formulate the CC-SOCR framework as an optimization problem. Here we show how the regularization terms \(\varGamma(\mathbf{u},\mathbf{b})\), \(\varXi(\mathbf{u},\mathbf{d})\) and \(\varUpsilon(\mathbf{u},\mathbf{b},\tilde{\mathbf{b}})\) are designed. To grasp the idea behind these terms, a basic understanding of the data-driven tight frame algorithm40, the block matching and 3D filtering (BM3D) algorithm41 and the SOCR method34 is helpful.

\(\varGamma(\mathbf{u},\mathbf{b})\) describes the prior distribution of the reconstructed target and of the approximated signal at the measurement pairs. For the reconstructed target, we consider sparsity and non-local self-similarity priors, and we use the zero norm to impose sparseness on the approximated signal b. We set

$$\varGamma(\mathbf{u},\mathbf{b})=s_u|\mathbf{L}|_1+\lambda_u\sum_i\left[|B_i(\mathbf{L})-D_sC_iD_n^T|^2+\lambda_{pu}|C_i|_0\right]+s_b|\mathbf{b}|_0$$
(8)

in which \(s_u\), \(\lambda_u\), \(\lambda_{pu}\) and \(s_b\) are fixed parameters. L is the albedo of u, and \(B_i\) is the block matching operator, with i the index of a reference block; the summation runs over all possible blocks. \(D_s\) and \(D_n\) are two orthogonal matrices that capture the local structure and the non-local correlations of the 3D albedo blocks. \(C_i\) is the matrix of transform coefficients of the ith block. \(|\cdot|_0\) denotes the number of nonzero entries of a tensor.
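A simplified evaluation of \(\varGamma\) may look as follows. This is a sketch under strong assumptions: true block matching groups mutually similar blocks, whereas here the matched blocks, the dictionaries \(D_s\), \(D_n\) and the coefficients \(C_i\) are simply passed in:

```python
import numpy as np

# Simplified evaluation of the Gamma regularizer in Eq. (8). The block
# matching operator B_i is replaced by precomputed `blocks`; Ds, Dn are the
# orthogonal dictionaries, Cs the per-block coefficient matrices.
def gamma_term(L, b, blocks, Ds, Dn, Cs, s_u, lam_u, lam_pu, s_b):
    val = s_u * np.sum(np.abs(L))                # l1 sparsity of the albedo
    for Bi, Ci in zip(blocks, Cs):
        # data fit in the transform domain plus l0 penalty on coefficients
        val += lam_u * (np.sum((Bi - Ds @ Ci @ Dn.T) ** 2)
                        + lam_pu * np.count_nonzero(Ci))
    val += s_b * np.count_nonzero(b)             # l0 sparsity of the signal
    return val
```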

For the term \(\varUpsilon(\mathbf{u},\mathbf{b},\tilde{\mathbf{b}})\), we set

$$\varUpsilon(\mathbf{u},\mathbf{b},\tilde{\mathbf{b}})=\sum_i|P_i(\tilde{\mathbf{b}})-DS_i|^2+\sum_{i,j}\left(\frac{\sigma_{\mathbf{b}}}{d_j^TP_i(A_{\mathbf{b}}\mathbf{u})}S_i(j)\right)^2+\lambda_{sb}\sum_i|P_i(\mathbf{b})-DS_i|^2$$
(9)

in which \(\lambda_{sb}\) is a fixed parameter and \(P_i\) is the patch extracting operator, with i the index of a local patch. Since the signals may not be measured at regular grid points, \(P_i\) acts only along the temporal direction of the signals. \(\tilde{\mathbf{b}}\) is the measured signal. D is the discrete cosine transform matrix, whose jth filter is denoted by \(d_j\). \(A_{\mathbf{b}}\) is the measurement matrix. \(S_i\) is the vector of Wiener coefficients of the ith patch, with jth element \(S_i(j)\). \(\sigma_{\mathbf{b}}\) is the noise level. The summations run over all possible patches and over the filters of the discrete cosine matrix.
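The term can be evaluated patch by patch. The sketch below is a simplified rendering of Eq. (9) for 1D temporal patches, with an explicitly constructed orthonormal DCT matrix; patch extraction is assumed to have been done beforehand, and we follow the paper's \(P_i-DS_i\) notation literally:

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis; row j acts as the filter d_j in Eq. (9).
    k = np.arange(n)
    D = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    D[0] /= np.sqrt(2)
    return D * np.sqrt(2 / n)

# Simplified evaluation of the Upsilon regularizer in Eq. (9): btilde/b/Abu
# patches come from the measured, approximated and simulated signals, and S
# holds the per-patch Wiener coefficient vectors S_i.
def upsilon_term(btilde_patches, b_patches, Abu_patches, S, sigma_b, lam_sb):
    D = dct_matrix(btilde_patches[0].size)
    val = 0.0
    for Pt, Pb, Pa, Si in zip(btilde_patches, b_patches, Abu_patches, S):
        val += np.sum((Pt - D @ Si) ** 2)              # fit measured patch
        val += np.sum((sigma_b * Si / (D @ Pa)) ** 2)  # noise-weighted term
        val += lam_sb * np.sum((Pb - D @ Si) ** 2)     # fit approx. patch
    return val
```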

For the regularization term \(\varXi(\mathbf{u},\mathbf{d})\), the prior of the virtual confocal signal d is constructed under the guidance of the target u and the physical model \(A_{\mathbf{d}}\). Since the confocal signal d is considered at rectangular grid points, both spatial and temporal correlations can be exploited. Letting \(P_i\) be the 3D patch extracting operator (2D in space and 1D in time), we seek a data-driven orthogonal dictionary \(\varPsi\) that sparsely represents the local patches of both the approximated signal d and the simulated signal \(A_{\mathbf{d}}\mathbf{u}\). For simplicity, we abuse the notation \(P_i\) to denote either a 1D temporal patch of the measured signal b or a 3D patch of the virtual confocal signal; the meaning is clear from the variable to which it applies. Letting \(Q_i\) be the matrix of transform coefficients of the ith patch, the regularization term is given by

$$\varXi(\mathbf{u},\mathbf{d})=\sum_i\left[|Q_i-\varPsi^TP_i(\mathbf{d})|^2+\lambda_{sd}|Q_i-\varPsi^TP_i(A_{\mathbf{d}}\mathbf{u})|^2+\lambda_{fd}|Q_i|_0\right]+s_d|\mathbf{d}|_0$$
(10)

in which \(\lambda_{sd}\) and \(\lambda_{fd}\) are two fixed parameters that control the weight of the simulated signal and the sparsity of the representation, respectively, and \(s_d\) controls the sparsity of the virtual confocal signal d.
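In a simplified matrix form (patches stacked as columns), \(\varXi\) can be evaluated directly, and the optimal orthogonal dictionary for fixed coefficients follows from an orthogonal Procrustes step, in the spirit of the data-driven tight frame construction40. The function names and the column-stacked layout are our own simplifications:

```python
import numpy as np

# Sketch of the Xi regularizer in Eq. (10). Patches of d and of A_d u are
# stacked as the columns of Pd and Pau; Q holds the transform coefficients
# and Psi is the orthogonal dictionary.
def xi_term(Q, Psi, Pd, Pau, d, lam_sd, lam_fd, s_d):
    return (np.sum((Q - Psi.T @ Pd) ** 2)
            + lam_sd * np.sum((Q - Psi.T @ Pau) ** 2)
            + lam_fd * np.count_nonzero(Q)
            + s_d * np.count_nonzero(d))

def update_dictionary(Q, Pd, Pau, lam_sd):
    # Orthogonal Procrustes step: for fixed Q, the orthogonal Psi minimizing
    # the two quadratic terms maximizes trace(Psi^T (Pd + lam_sd*Pau) Q^T),
    # solved by the SVD of (Pd + lam_sd*Pau) Q^T.
    P = Pd + lam_sd * Pau
    U, _, Vt = np.linalg.svd(P @ Q.T)
    return U @ Vt
```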

The CC-SOCR optimization problem

Substituting Eqs. (8), (9) and (10) into Eq. (7) and introducing weights, we obtain the concrete optimization problem of the proposed CC-SOCR framework:

$$\begin{array}{rl}\mathop{\min}\limits_{\mathbf{u},\mathbf{b},\mathbf{d},D_s,D_n,\mathbf{C},\mathbf{S},\varPsi,\mathbf{Q}}&|A_{\mathbf{b}}\mathbf{u}-\mathbf{b}|^2+s_u|\mathbf{L}|_1+s_b|\mathbf{b}|_0\\ &+\lambda_u\sum\limits_i\left[|B_i(\mathbf{L})-D_sC_iD_n^T|^2+\lambda_{pu}|C_i|_0\right]\\ &+\lambda_b|\mathbf{b}-\tilde{\mathbf{b}}|^2+\lambda_b\lambda_{pb}\sum\limits_i|P_i(\tilde{\mathbf{b}})-DS_i|^2\\ &+\lambda_b\lambda_{pb}\sum\limits_{i,j}\left[\frac{\sigma_{\mathbf{b}}}{d_j^TP_i(A_{\mathbf{b}}\mathbf{u})}S_i(j)\right]^2\\ &+\lambda_b\lambda_{pb}\lambda_{sb}\sum\limits_i|P_i(\mathbf{b})-DS_i|^2\\ &+\lambda_d|A_{\mathbf{d}}\mathbf{u}-\mathbf{d}|^2+s_d|\mathbf{d}|_0\\ &+\lambda_d\lambda_{pd}\sum\limits_i|Q_i-\varPsi^TP_i(\mathbf{d})|^2\\ &+\lambda_d\lambda_{pd}\lambda_{sd}\sum\limits_i|Q_i-\varPsi^TP_i(A_{\mathbf{d}}\mathbf{u})|^2\\ &+\lambda_d\lambda_{pd}\lambda_{fd}\sum\limits_i|Q_i|_0\\ &+\lambda_{bd}|R_{\mathbf{b}}(\mathbf{b},\mathbf{d})-R_{\mathbf{d}}(\mathbf{b},\mathbf{d})|^2\\ \mathrm{s.t.}\quad&\mathbf{L}=\mathrm{albedo}(\mathbf{u}),\\ &D_s^TD_s=I[p_xp_yp_z],\quad D_n^TD_n=I[r],\\ &\varPsi^T\varPsi=I[q_xq_yq_t]\end{array}$$
(11)

in which C, S and Q denote the collections of transform-domain coefficients \(\{C_i\}\), \(\{S_i\}\) and \(\{Q_i\}\), respectively, and \(I[n]\) is the identity matrix of order n. \(p_x\), \(p_y\) and \(p_z\) are the patch sizes of the albedo in the horizontal, vertical and depth directions; r is the number of neighboring blocks of each reference albedo block; \(q_x\), \(q_y\) and \(q_t\) are the patch sizes of the virtual confocal signal d in the horizontal, vertical and temporal directions. \(\sigma_{\mathbf{b}}\) is a parameter related to the noise level of the measured signal. The fixed parameters \(s_u\), \(s_b\), \(s_d\), \(\lambda_u\), \(\lambda_b\), \(\lambda_d\), \(\lambda_{pu}\), \(\lambda_{pb}\), \(\lambda_{pd}\), \(\lambda_{sb}\), \(\lambda_{sd}\), \(\lambda_{fd}\) and \(\lambda_{bd}\) balance the data-fitting terms and the regularization terms. The solution of the optimization problem is provided in Supplementary Note 2, and the supplementary software is attached to this article.