Abstract
Non-line-of-sight (NLoS) imaging is an important challenge in many fields ranging from autonomous vehicles and smart cities to defense applications. Several recent works in optics and acoustics tackle the challenge of imaging targets hidden from view (e.g. placed around a corner) by measuring time-of-flight information using active SONAR/LiDAR techniques, effectively mapping the Green functions (impulse responses) from several controlled sources to an array of detectors. Here, leveraging passive correlations-based imaging techniques (also termed ’acoustic daylight imaging’), we study the possibility of acoustic NLoS target localization around a corner without the use of controlled active sources. We demonstrate localization and tracking of a human subject hidden around a corner in a reverberating room using Green functions retrieved from correlations of broadband uncontrolled noise sources recorded by multiple detectors. Our results demonstrate that for NLoS localization controlled active sources can be replaced by passive detectors as long as a sufficiently broadband noise is present in the scene.
Introduction
Non-line-of-sight (NLoS) imaging techniques have important applications in the fields of autonomous vehicle navigation and remote sensing1. NLoS techniques aim to localize, track, and image targets hidden from view by recording ’multiply-bounced’ reflected waves, i.e. waves that reflect off a directly visible surface, such as a wall, towards the hidden target, and back from it to a detector array by another reflection. In the last decade, there have been great advancements in the field, enabling high-resolution NLoS imaging and tracking in real-time for a variety of applications using both light and sound1,2,3,4,5,6,7,8.
In the optical domain, time-of-flight (ToF) techniques achieve centimeter-scale lateral resolution by computational back-projection reconstruction3,4,5,6,7. However, reflections from most common surfaces are diffuse at optical wavelengths, because the surface roughness is large compared to the optical wavelength, and the quartic falloff of the multi-bounce diffuse reflections fundamentally limits the imaging range. In addition, many real-life applications, such as automotive sensing and indoor tracking of subjects, do not require the centimeter-scale resolution achievable via optical NLoS techniques, making acoustic-based NLoS techniques attractive.
When acoustic waves are considered8, optically rough surfaces, e.g. white-painted walls, become effectively flat reflective mirrors due to the considerably longer acoustic wavelength (\(\lambda \approx 1\) m–10 cm for acoustic frequencies of 300 Hz–3 kHz). The specular reflections of audible-frequency waves from most ordinary walls can then straightforwardly reveal the mirror image of the hidden targets by conventional beam-forming back-projection techniques8, similar to the ones used in ultrasound echography. Furthermore, in the acoustic domain, the direct measurement of the acoustic fields is performed using conventional off-the-shelf microphones and does not require the specialized ultrafast detectors or interferometric techniques used in the optical domain.
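The quoted wavelength range follows from \(\lambda = c_s/f\). The short sketch below reproduces it, assuming the speed of sound of 345 m/s that is used in the numerical simulations described in the Methods; the function name is illustrative only.

```python
def wavelength_m(frequency_hz, sound_speed_m_s=345.0):
    """Return the acoustic wavelength lambda = c_s / f in metres."""
    return sound_speed_m_s / frequency_hz

# Band edges quoted in the text: 300 Hz and 3 kHz.
for frequency_hz in (300.0, 3000.0):
    print(f"{frequency_hz:6.0f} Hz -> {wavelength_m(frequency_hz):.3f} m")
```

Both values are orders of magnitude larger than the roughness of a painted wall, which is why the wall acts as a mirror for sound.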
Acoustic NLoS localization of active sources, such as speakers, has long been demonstrated using either reflected waves9,10, or waves refracted by a cornered edge of an occluder11. Recently, Lindell et al. have demonstrated NLoS localization and imaging of passive reflectors in an anechoic chamber by applying a multi-bounce ToF approach, utilizing an array of microphones and speakers emitting strong chirped pulses8. Specifically, the pulsed emissions from each of the speakers and consecutive measurements of the reflected waves by the microphone array have allowed the retrieval of a set of speaker-microphone Green functions. These were then used to reconstruct the hidden scene by beam-forming back-projection.
Here, we study the possibility of retrieving the same set of temporal Green functions passively, i.e. without emitting controlled acoustic waveforms. To achieve this, we leverage the ideas of passive imaging12,13,14,15,16,17,18,19,20 to estimate the Green functions from cross-correlations of ambient broadband noise, using only an array of microphones. We demonstrate localization of a human subject around the corner in a reverberating concrete-walled room containing several uncontrolled broadband noise sources. In our experiments, temporal cross-correlations between pairs of microphones in the array turn the random diffuse signals into pulse-echo-like reflected signals, which are then used as estimates of the Green functions to faithfully estimate the hidden target’s position.
Our work is based on passive correlation imaging, also known as coda-interferometry in seismology13, which is also utilized in underwater acoustics for ocean tomography21,22,23. The working principle of coda-interferometry (or ’acoustic daylight imaging’ as termed in underwater acoustics23) is that by cross-correlating recordings of ambient noise one can reproduce the Green function, which contains the same ToF information measured in active pulse-echo experiments. The idea was first put to use in helioseismology for extracting the travel time of acoustic waves from temporal cross-correlations of the intensity fluctuations on the solar surface14. Lobkis and Weaver have shown that the autocorrelation function of ultrasound noise measurements reveals the same waveform as the one measured in a single-transducer pulse-echo experiment15 and that the cross-correlation between two recordings of the diffuse noise field at two arbitrary points in space can reveal the Green’s function between these points16. The approach was also put to use in geophysics17, in microwaves18, and in optical studies of complex media19. It is important to note that in underwater acoustics, the term acoustic daylight imaging is used to describe both a correlations-based coda-interferometry approach that retrieves the Green function between pairs of detectors21,22,23, and an approach that mimics optical incoherent imaging, without Green function retrieval24. Importantly, the Green function retrieval-based approach that we utilize in this work has the advantage of using the extracted ToF information for localization. As passive correlation allows one to acquire the same ToF information as obtained in active pulse-echo experiments, it could be used, in principle, to localize hidden targets in an NLoS scenario in the same fashion as conventional ToF measurements2,8.
Thus, one can utilize uncontrolled broadband noise sources for passive NLoS imaging of reflective targets, in a similar fashion to their use in direct passive imaging20. This is the goal we set out to demonstrate in this work.
Results
The principle of our approach and the setup for realizing it are depicted in Fig. 1a, accompanied by a numerically simulated sample result (Fig. 1b–h, see “Methods”). We consider a simplified scenario, where a hidden target is outside the line of sight for both a microphone array and a broadband uncontrolled noise source (Fig. 1a). A broadband acoustic noise field emitted by the noise source reaches the target, and is reflected off it, either by reflection from the relay wall (iii, depicted by a magenta dashed line in Fig. 1a) or by diffraction from the occluding wall edge (ii, depicted in cyan in Fig. 1a). A detector array composed of N microphones records these reflected fields, in addition to reflections from the walls in the scene (e.g. (i) depicted in green), and the directly arriving waves from the noise source.
The waveforms \(v_j(t)\), \(j=1\ldots N\), recorded at the different detectors are given in Fig. 1b. While the waveforms are seemingly random, the cross-correlation \(C_{ij}(\tau )\) between each pair i, j of the recorded waveforms reveals pulse-echo-like ToF information (Fig. 1c):
\(C_{ij}(\tau ) = \frac{1}{T_{avg}}\int _0^{T_{avg}} v_i(t)\, v_j(t+\tau )\, dt\)
where \(T_{avg}\) is the recording (averaging) time, and \(\tau\) is the lag time between the two waveforms. This simple post-processing provides an estimate of the Green function between the two detectors. The longer \(T_{avg}\) is, the better the estimate25. Since the cross-correlated data is approximately equivalent to a measurement of a pulsed source and detector pair16, it can be beam-formed back to form an image by conventional delay-and-sum beamforming26,27 (Fig. 1d), assuming that the reflecting ’relay wall’ is a flat mirror, which is a good approximation for most common indoor walls. The presence of multiple reflections that do not originate from the target results in strong reconstructed features that are not related to the target (Fig. 1d), but originate from the static walls in the scene. These contributions can be subtracted using an additional identical measurement performed without the target present in the scene (Fig. 1e,f), where only the contributions of the walls are present (a background measurement). Taking the difference between the cross-correlations of the measurements with and without a target leaves only the target-related signals (Fig. 1g). Beam-forming using these signals allows localizing the position of the target mirror-image (Fig. 1h). A reconstruction artefact originating from early-arriving signals appears in the beam-formed image (marked by a cyan arrow in Fig. 1h). This artefact originates from signals that diffract off the cornered edge of the barrier rather than reflect off the relay wall in either the detection or insonification paths (Fig. 1g (ii, cyan arrow)). A more detailed analysis of this diffraction artefact is given below (Fig. 3).
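The cross-correlation and background-subtraction steps described above can be illustrated with a minimal sketch. It is a toy model, not the processing code used in this work: one microphone is assumed to hear the other microphone's noise after a direct delay plus, when a target is present, a weaker echo. The delays, the echo gain, the record length and all function names are hypothetical, and only the Python standard library is used.

```python
import random

def cross_correlation(v_i, v_j, max_lag):
    """Estimate C_ij(tau) as the mean over t of v_i[t] * v_j[t + tau], tau = 0..max_lag samples."""
    n = len(v_i) - max_lag
    return [sum(v_i[t] * v_j[t + lag] for t in range(n)) / n
            for lag in range(max_lag + 1)]

def record_pair(n, direct_delay, echo_delay=None, echo_gain=0.3, seed=0):
    """Toy recordings: microphone j hears microphone i's noise after a direct
    delay, plus an optional weaker echo standing in for the hidden target."""
    rng = random.Random(seed)
    v_i = [rng.gauss(0.0, 1.0) for _ in range(n)]
    v_j = [0.0] * n
    for t in range(n):
        if t >= direct_delay:
            v_j[t] += v_i[t - direct_delay]
        if echo_delay is not None and t >= echo_delay:
            v_j[t] += echo_gain * v_i[t - echo_delay]
    return v_i, v_j

MAX_LAG = 60
# Two independent noise realizations: one with the target, one without (background).
with_target = cross_correlation(*record_pair(8000, 10, echo_delay=40, seed=1), MAX_LAG)
background = cross_correlation(*record_pair(8000, 10, seed=2), MAX_LAG)

# Subtracting the background correlation suppresses the static (direct) contribution.
difference = [a - b for a, b in zip(with_target, background)]
peak_lag = max(range(MAX_LAG + 1), key=lambda lag: abs(difference[lag]))
print("direct-path lag in the background trace:", background.index(max(background)))
print("target echo lag after background subtraction:", peak_lag)
```

Because the two recordings use different noise realizations, the static peak cancels only up to an estimation error that shrinks as the record length grows, in line with the dependence on \(T_{avg}\) noted above.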
Figure 2 presents experimental results of passive acoustic localization around the corner. A photo of the experimental setup is given in Fig. 2a: A human subject is hidden around the corner from a linear array of \(N=16\) microphones that record the acoustic fields from two uncontrolled broadband sources (Fig. 2c). The broadband spectrum of the raw measured signal of a single microphone is given in Fig. 2b (source - blue curve). We calculate the pair-wise cross-correlations between the measured signals after band-pass filtering the raw recorded signals with a Gaussian filter of central frequency \(f_0 = 5.3\) kHz and a full width at half maximum (FWHM) bandwidth of \(\Delta f_{FWHM} = 1.8\) kHz. Repeating the cross-correlation calculation for signals acquired with and without the subject present, and taking their difference, reveals pulse-echo-like ToF information with a peak at the expected delay time (Fig. 2c). Applying delay-and-sum beamforming to the \(N^2\) cross-correlation traces, and flipping the reconstructed (mirror) image vertically with respect to the relay wall, faithfully localizes the subject at several positions, each obtained by analyzing a different 80 s-long temporal segment of a single recording (Fig. 2d, true positions marked by cyan crosses). Using shorter recorded segments of \(T_{avg} = 2\) s still reveals the correct positions of the hidden target, with more artefacts present (Fig. 2e). Numerical simulation of the simplified experimental scene, without the presence of noise and additional reflections that are outside the shown field of view, shows good qualitative agreement with the experimental reconstructions (Fig. 2f). In order to study the effect of the locations of the uncontrolled noise sources on the reconstruction fidelity, we have performed several numerical simulations with various locations of uncorrelated sources. The results of these simulations are presented in Supplementary Fig. S1.
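The delay-and-sum step and the final mirror flip can be sketched as follows. This is a simplified illustration under stated assumptions, not the reconstruction code used for Fig. 2: the relay wall is taken as the line y = 0, each pairwise trace is treated as a pulse-echo measurement between microphones i and j, and the background-subtracted traces are replaced by synthetic Gaussian pulses. The target position, grid and pulse width are hypothetical; the array geometry, sampling rate and speed of sound follow the Methods.

```python
import math

FS = 40_000.0   # sampling rate in Hz (Methods)
C_S = 345.0     # speed of sound in m/s (value used in the simulations)

def synthetic_traces(mics, mirror_target, n_lags=600, width=2.0):
    """Hypothetical background-subtracted traces: one Gaussian pulse per pair,
    centred on the two-way delay via the mirror-image target (in samples)."""
    dist = [math.dist(mirror_target, m) for m in mics]
    return [[[math.exp(-0.5 * ((k - (d_i + d_j) / C_S * FS) / width) ** 2)
              for k in range(n_lags)] for d_j in dist] for d_i in dist]

def delay_and_sum(traces, mics, grid):
    """For every candidate point, sum each trace C_ij at the lag matching the
    path microphone i -> point -> microphone j."""
    image = []
    for point in grid:
        dist = [math.dist(point, m) for m in mics]
        total = 0.0
        for i, d_i in enumerate(dist):
            for j, d_j in enumerate(dist):
                k = round((d_i + d_j) / C_S * FS)
                if k < len(traces[i][j]):
                    total += traces[i][j][k]
        image.append(total)
    return image

# 16 microphones, 4 cm apart, 53 cm from the relay wall (y = 0).
mics = [(0.04 * k, 0.53) for k in range(16)]
mirror_target = (1.0, -1.0)   # hypothetical mirror image, behind the wall
grid = [(0.5 + 0.05 * ix, -1.5 + 0.05 * iy) for ix in range(21) for iy in range(21)]

image = delay_and_sum(synthetic_traces(mics, mirror_target), mics, grid)
x_img, y_img = grid[image.index(max(image))]
estimated_target = (x_img, -y_img)   # flip the mirror image about the relay wall
print("estimated target position (m):", estimated_target)
```

The sketch sums all \(N^2\) traces with equal weight and uses nearest-sample lookup; interpolation or apodization could be added without changing the principle.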
To provide a more in-depth analysis of the origins of the diffraction artefact present in Fig. 1g,h, we display in Fig. 3 four snapshots of a simulated impulse field propagating from one noise source. The simulated results have been obtained by a two-dimensional FDTD simulation (k-Wave28, see “Methods”): In Fig. 3a, the free-space propagation results in a perfect spherical wavefront. When the pulse front hits the walls (Fig. 3b) it is reflected from the relay wall (green arrow, i) and the occluding barrier. Shortly after (Fig. 3c) two phenomena can be observed: The first is the propagation of the reflected wave from the relay wall (green arrow, i), and the second is the weak, but non-negligible, ’knife-edge’ diffraction from the edge of the occluding barrier (cyan arrow, ii). Finally, at later times (Fig. 3d), while the wave reflected from the relay wall continues to propagate towards the target (magenta arrow, iii), the weak knife-edge diffracted wave has already arrived at the target (cyan arrow). The contributions of both of these signals will eventually be recorded by the detectors. While the diffracted peak arrives at an earlier time (cyan arrow in Fig. 1c,g) than the signal reflected from the relay wall (magenta arrow in Fig. 1c,g), only the latter will yield the correct position of the target when conventional beam-forming is used for reconstruction. Nonetheless, knowledge of the visible scene geometry can be used to take into account the contribution of such knife-edge diffraction signals to improve the reconstruction. Removing undesired artefacts and improving the SNR in the reconstructed image can be achieved by diffraction- and reflection-aware localization29.
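The earlier arrival of the diffracted signal can be checked with simple path-length geometry. The sketch below compares one leg only (noise source to target) for the two routes, assuming a relay wall along y = 0 and an occluder edge 45 cm from the wall as in the Methods; the source and target coordinates are hypothetical and chosen so that the direct line between them is blocked by the occluder.

```python
import math

C_S = 345.0  # speed of sound in m/s (value used in the simulations)

def diffracted_tof(source, edge, target):
    """Travel time source -> occluder edge -> target (knife-edge diffraction)."""
    return (math.dist(source, edge) + math.dist(edge, target)) / C_S

def reflected_tof(source, target):
    """Travel time source -> relay wall (y = 0) -> target, computed as the
    straight-line distance to the target's mirror image behind the wall."""
    mirror_image = (target[0], -target[1])
    return math.dist(source, mirror_image) / C_S

edge = (0.0, 0.45)      # occluder edge, 45 cm from the relay wall
source = (-1.0, 0.53)   # hypothetical noise-source position
target = (1.0, 1.0)     # hypothetical hidden-target position

t_diff = diffracted_tof(source, edge, target)
t_refl = reflected_tof(source, target)
print(f"diffracted path: {t_diff * 1e3:.3f} ms, reflected path: {t_refl * 1e3:.3f} ms")
```

For this geometry the path around the edge is shorter than the specular path, so its peak precedes the wall-reflected one; back-projecting it under the mirror assumption places it at the wrong location, which is the artefact discussed above.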
Discussion
To summarize, we have demonstrated an approach for localizing and tracking a person hidden around a corner using conventional off-the-shelf microphones and uncontrolled broadband noise sources. The presented NLoS acoustic imaging approach offers improved covertness over previous acoustic-based approaches8,30 through two important differences. The first is the use of broadband random emissions rather than pulsed emissions, similar to their use in chaotic-waveform SONAR31. The second, and most important, difference is that, unlike chaotic-waveform SONAR, our correlation-based approach does not require knowledge of the spatial positions and exact emitted waveforms of the sources. Our approach is in essence the utilization of correlation-based ’acoustic daylight imaging’21,22,23 for NLoS imaging. In this respect, it is important to note that the term acoustic daylight imaging is also used to refer to a passive imaging technique that does not rely on retrieval of the Green function from cross-correlations, but rather utilizes spatio-temporal correlations through interference in an acoustic analog to incoherent optical imaging24.
In our Green-function correlations-based approach, the spatial localization accuracy is dictated by the ToF temporal resolution, which is given by the temporal width of the cross-correlation peak. For a broadband source, this width is given by the source coherence time \(t_c \approx 1/\Delta f\), where \(\Delta f\) is the source spectral bandwidth. Each single ToF measurement from temporal cross-correlation between two detectors localizes the target on an ellipsoid surface (or a sphere in the case of the autocorrelation of a single detector) with an axial resolution of \(dr \approx c_s/2\Delta f\), where \(c_s\) is the speed of sound. Assuming a perfect retrieval of the Green functions, the final reconstruction resolution is the same as for active SONAR experiments8. In practice, the finite recording time will result in noisy cross-correlations and thus in reconstruction clutter artefacts (Fig. 2).
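As a worked example of these two expressions, the sketch below evaluates them for a bandwidth of 1.8 kHz. Taking the FWHM of the experimental band-pass filter as \(\Delta f\), and 345 m/s as \(c_s\), are assumptions made here for illustration.

```python
def coherence_time_s(bandwidth_hz):
    """t_c ~ 1 / delta_f: temporal width of the cross-correlation peak."""
    return 1.0 / bandwidth_hz

def axial_resolution_m(bandwidth_hz, sound_speed_m_s=345.0):
    """dr ~ c_s / (2 * delta_f): axial (range) resolution of a single ToF measurement."""
    return sound_speed_m_s / (2.0 * bandwidth_hz)

bandwidth_hz = 1.8e3  # assumed: FWHM of the Gaussian band-pass filter
print(f"t_c ~ {coherence_time_s(bandwidth_hz) * 1e3:.2f} ms")
print(f"dr  ~ {axial_resolution_m(bandwidth_hz) * 1e2:.1f} cm")
```

Under these assumptions the axial resolution is of the order of 10 cm, and it scales inversely with the available noise bandwidth.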
Our method is based on Green function retrieval from temporal cross-correlations of broadband noise. In most works the noise field is assumed to be diffuse and isotropic15, which may indeed be the case for strongly reverberant rooms. In the case of an anisotropic noise field, e.g. where the waves traveling in the medium arrive mainly from a one-sided half plane, the Green function retrieval would result in a one-sided projection of either \(G(x_i,x_j,t)\) or \(G(x_i,x_j,-t)\)32. In our experiments, the field is not entirely diffuse, and we have noticed differences in the reconstructions depending on the exact placement of the non-isotropic noise sources (see also Supplementary Fig. S1).
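The one-sided behaviour can be reproduced in a toy one-dimensional model, sketched below under simplifying assumptions: two microphones on a line, separated by a propagation delay of D samples, with independent white-noise sources placed either on one side only or on both sides. All names and values are hypothetical.

```python
import random

def cross_correlation(v_i, v_j, max_lag):
    """Return {tau: mean over t of v_i[t] * v_j[t + tau]} for tau = -max_lag..max_lag samples."""
    t_range = range(max_lag, len(v_i) - max_lag)
    return {lag: sum(v_i[t] * v_j[t + lag] for t in t_range) / len(t_range)
            for lag in range(-max_lag, max_lag + 1)}

def delayed(signal, delay):
    """Delay a signal by an integer number of samples, zero-padding the start."""
    return [0.0] * delay + signal[:len(signal) - delay]

def noise(n, seed):
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

N, D, MAX_LAG = 6000, 12, 30
left, right = noise(N, 1), noise(N, 2)   # independent sources on either side

# Source on the left only: the wave passes microphone i first, then j.
one_sided = cross_correlation(left, delayed(left, D), MAX_LAG)

# Sources on both sides: each microphone hears one source directly, the other delayed.
v_i = [a + b for a, b in zip(left, delayed(right, D))]
v_j = [a + b for a, b in zip(delayed(left, D), right)]
two_sided = cross_correlation(v_i, v_j, MAX_LAG)

print("one-sided:  C(+D) = %.2f, C(-D) = %.2f" % (one_sided[D], one_sided[-D]))
print("two-sided:  C(+D) = %.2f, C(-D) = %.2f" % (two_sided[D], two_sided[-D]))
```

With a source on one side only, the correlation peak appears at a single sign of the lag; adding a source on the other side restores the peak at the opposite lag.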
The two main challenges in making the presented approach useful in practical scenarios are the relatively narrow bandwidth of common ambient noise (Fig. 2b, red curve), which results in a lower reconstruction resolution, and the current requirement for a relatively long averaging time. The averaging time can be lowered by using a larger number of detectors and by adopting advanced reconstruction approaches. Development of more advanced reconstruction algorithms that take into account the contributions of diffracted waves using the (known or measured) room geometry is expected to significantly improve the reconstruction fidelity. Related data-driven approaches using neural networks have recently been put forward for optical NLoS reconstruction33,34, for NLoS classification of individuals35, and for suppressing interfering echoes in NLoS echolocation30. Alternatively, it was found in the microwave regime that reverberation creates an interferometric sensitivity enabling sub-wavelength resolution36.
Methods
Experimental setup
The experimental setup is presented in Fig. 2a. The occluder was realized by a pair of acoustic drywall plates with two layers of Suprema-Tecsound pallet sandwiched between them. This 3 cm thick occluder was placed perpendicularly to the wall at a distance of 45 cm. Noise was generated by playing two different Gaussian random white noise signals through two audio speakers (MIYAKO Ltd, SL-800). The microphone array consisted of 16 condenser microphones (BOYA, BY-M1) placed at a spacing of 4 cm, which were sampled simultaneously at 40 kHz with 16-bit depth using a multichannel DAQ device (National Instruments, PXIe-6363). The array was placed at a distance of 53 cm from the wall, parallel to it, with the rightmost microphone at a distance of 5 cm from the occluder. A human subject served as the target in all experiments. The figures were created using MATLAB V. R2022a (https://www.mathworks.com/) and INKSCAPE V. 1.2 (https://inkscape.org/).
Numerical simulations
Simulations were performed using ’k-Wave’, a 2D Finite-Difference Time-Domain (FDTD) simulation toolbox28. The simulations computed the propagation of a delta-like impulse pressure wave from each of the noise sources through the simulated scene to each of the microphones (Fig. 1a), yielding the Green functions from each source to each microphone. The full simulated scene was represented by \(400\times 400\) pixels, with a pixel size of 1 \(\mathrm {cm^2}\), representing a plane of \(\mathrm {4 \;m\times 4\;m}\). Free-space propagation through air was represented by a speed of sound of \(\mathrm {345\;m/s}\) and a density of \(\mathrm {1.225\;kg/m^3}\). The wall and occluder were represented by 1.47 m and 3 cm thick simulated regions, respectively, having a density of \(\mathrm {24.5\;kg/m^3}\) and a speed of sound of \(\mathrm {1500\;m/s}\), which yielded a high reflection coefficient and low transmission. The random noise sources were simulated by convolving the Green functions related to each source with a single random signal with a length of \(7501\times 10^3\) samples. The two random signals obtained for each microphone (one from each of the two noise sources) were then summed, cropped to a finite measurement time, and taken as the signal measured by this microphone. These ’measured’ signals were then processed in the same manner as the measured experimental signals (Fig. 2b).
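Two steps of this procedure lend themselves to a short sketch: the reflection coefficient implied by the chosen medium parameters, and the synthesis of a 'measured' signal from simulated Green functions. The code below is an illustration, not the simulation code itself. The reflection coefficient uses the textbook normal-incidence formula for an interface between two fluids, which is an assumption about how the quoted 'high value' may be estimated; the function names and the tiny example signals are hypothetical.

```python
def pressure_reflection_coefficient(rho_1, c_1, rho_2, c_2):
    """Normal-incidence pressure reflection coefficient between two fluids,
    R = (Z2 - Z1) / (Z2 + Z1) with acoustic impedance Z = rho * c."""
    z_1, z_2 = rho_1 * c_1, rho_2 * c_2
    return (z_2 - z_1) / (z_2 + z_1)

def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for k, h in enumerate(impulse_response):
        if h != 0.0:
            for t, s in enumerate(signal):
                out[t + k] += h * s
    return out

def synthesize_recording(source_signals, green_functions, n_keep):
    """Signal at one microphone: each source waveform convolved with the Green
    function from that source to the microphone, summed over sources and cropped."""
    total = [0.0] * n_keep
    for s, g in zip(source_signals, green_functions):
        y = convolve(s, g)
        for t in range(min(n_keep, len(y))):
            total[t] += y[t]
    return total

# Air (345 m/s, 1.225 kg/m^3) against the simulated wall (1500 m/s, 24.5 kg/m^3).
r_wall = pressure_reflection_coefficient(1.225, 345.0, 24.5, 1500.0)
print(f"wall pressure reflection coefficient ~ {r_wall:.3f}")

# Tiny hypothetical example: two sources, two Green functions, one microphone.
recording = synthesize_recording(
    source_signals=[[1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 2.0, 2.0, 2.0, 2.0]],
    green_functions=[[0.0, 0.0, 0.0, 1.0], [0.5]],
    n_keep=5,
)
print(recording)
```

To synthesize a full array, the same source waveform would be reused for every microphone, with only the Green function changing from one microphone to the next.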
Data availability
The data which support the findings of this study are available from the corresponding author upon reasonable request.
References
Faccio, D., Velten, A. & Wetzstein, G. Non-line-of-sight imaging. Nat. Rev. Phys. 2, 318–327 (2020).
Velten, A. et al. Recovering three-dimensional shape around a corner using ultrafast time-of-flight imaging. Nat. Commun. 3, 1–8 (2012).
O’Toole, M., Lindell, D. B. & Wetzstein, G. Confocal non-line-of-sight imaging based on the light-cone transform. Nature 555, 338–341 (2018).
Liu, X. et al. Non-line-of-sight imaging using phasor-field virtual wave optics. Nature 572, 620–623 (2019).
Lindell, D. B., Wetzstein, G. & O’Toole, M. Wave-based non-line-of-sight imaging using fast fk migration. ACM Trans. Gr. (TOG) 38, 1–13 (2019).
Boger-Lombard, J. & Katz, O. Passive optical time-of-flight for non line-of-sight localization. Nat. Commun. 10, 1–9 (2019).
Nam, J. H. et al. Low-latency time-of-flight non-line-of-sight imaging at 5 frames per second. Nat. Commun. 12, 1–10 (2021).
Lindell, D. B., Wetzstein, G. & Koltun, V. Acoustic non-line-of-sight imaging. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 6780–6789 (2019).
Mak, L. C. & Furukawa, T. Non-line-of-sight localization of a controlled sound source. In 2009 IEEE/ASME International Conference on Advanced Intelligent Mechatronics, 475–480 (IEEE, 2009).
Kitić, S., Bertin, N. & Gribonval, R. Hearing behind walls: Localizing sources in the room next door with cosparsity. In 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 3087–3091 (IEEE, 2014).
Singh, V., Knisely, K. E., Yönak, S. H., Grosh, K. & Dowling, D. R. Non-line-of-sight sound source localization using matched-field processing. J. Acoust. Soc. Am. 131, 292–302 (2012).
Snieder, R. & Wapenaar, K. Imaging with ambient noise. Phys. Today 63, 44–49 (2010).
Snieder, R. The theory of coda wave interferometry. Pure Appl. Geophys. 163, 455–473 (2006).
Duvall, T. L., Jefferies, S., Harvey, J. & Pomerantz, M. Time-distance helioseismology. Nature 362, 430–432 (1993).
Weaver, R. L. & Lobkis, O. I. Ultrasonics without a source: Thermal fluctuation correlations at MHz frequencies. Phys. Rev. Lett. 87, 134301 (2001).
Lobkis, O. I. & Weaver, R. L. On the emergence of the Green’s function in the correlations of a diffuse field. J. Acoust. Soc. Am. 110, 3011–3017 (2001).
Shapiro, N. M., Campillo, M., Stehly, L. & Ritzwoller, M. H. High-resolution surface-wave tomography from ambient seismic noise. Science 307, 1615–1618 (2005).
Davy, M., Fink, M. & De Rosny, J. Green’s function retrieval and passive imaging from correlations of wideband thermal radiations. Phys. Rev. Lett. 110, 203901 (2013).
Badon, A., Lerosey, G., Boccara, A. C., Fink, M. & Aubry, A. Retrieving time-dependent Green’s functions in optics with low-coherence interferometry. Phys. Rev. Lett. 114, 023901 (2015).
Garnier, J. & Papanicolaou, G. Passive Imaging with Ambient Noise (Cambridge University Press, 2016).
Godin, O. A., Zabotin, N. A. & Goncharov, V. V. Ocean tomography with acoustic daylight. Geophys. Res. Lett. https://doi.org/10.1029/2010GL043623 (2010).
Roux, P., Kuperman, W. & NPAL Group. Extracting coherent wave fronts from acoustic ambient noise in the ocean. J. Acoust. Soc. Am. 116, 1995–2003 (2004).
Rickett, J. & Claerbout, J. Acoustic daylight imaging via spectral factorization: Helioseismology and reservoir monitoring. Lead. Edge 18, 957–960 (1999).
Buckingham, M. J., Berkhout, B. V. & Glegg, S. A. Imaging the ocean with ambient noise. Nature 356, 327–329 (1992).
Seats, K. J., Lawrence, J. F. & Prieto, G. A. Improved ambient noise correlation functions using Welch’s method. Geophys. J. Int. 188, 513–523 (2012).
Friis, H. T. & Feldman, C. B. A multiple unit steerable antenna for short-wave reception. Proc. Inst. Radio Eng. 25, 841–917 (1937).
Perrot, V., Polichetti, M., Varray, F. & Garcia, D. So you think you can DAS? A viewpoint on delay-and-sum beamforming. Ultrasonics 111, 106309 (2021).
Treeby, B. E., Budisky, J., Wise, E. S., Jaros, J. & Cox, B. Rapid calculation of acoustic fields from arbitrary continuous-wave sources. J. Acoust. Soc. Am. 143, 529–537 (2018).
An, I., Lee, D., Choi, J.-w., Manocha, D. & Yoon, S.-e. Diffraction-aware sound localization for a non-line-of-sight source. In 2019 International Conference on Robotics and Automation (ICRA), 4061–4067 (IEEE, 2019).
Jang, S., Shin, U.-H. & Kim, K. Deep non-line-of-sight imaging using echolocation. Sensors 22, 8477 (2022).
Karimov, T. I., Druzhina, O. S., Kolev, G. Y., Andreev, V. S. & Butusov, D. N. Multiband and wideband chaotic waveforms for hydroacoustics. In 2020 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus), 1392–1395 (IEEE, 2020).
Lin, F.-C., Ritzwoller, M. H. & Snieder, R. Eikonal tomography: Surface wave tomography by phase front tracking across a regional broad-band seismic array. Geophys. J. Int. 177, 1091–1110 (2009).
Tancik, M., Swedish, T., Satat, G. & Raskar, R. Data-driven non-line-of-sight imaging with a traditional camera. In Imaging Systems and Applications, IW2B–6 (Optical Society of America, 2018).
Chen, W., Daneau, S., Mannan, F. & Heide, F. Steady-state non-line-of-sight imaging. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 6790–6799 (2019).
Caramazza, P. et al. Neural network identification of people hidden from view with a single-pixel, single-photon detector. Sci. Rep. 8, 1–6 (2018).
Del Hougne, M., Gigan, S. & Del Hougne, P. Deeply subwavelength localization with reverberation-coded aperture. Phys. Rev. Lett. 127, 043903 (2021).
Acknowledgements
This work has received funding from the European Research Council under the European Union’s Horizon 2020 Research and Innovation Program grant number 101002406, and the Israel Science Foundation (grant number 1361/18).
Author information
Contributions
O.K. conceived the project. J.B.L. and O.K. designed the experimental setup. J.B.L. performed measurements and data analysis under the supervision of O.K. J.B.L. and Y.S. performed numerical simulations under the supervision of O.K. J.B.L. and O.K. wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Supplementary Information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Boger-Lombard, J., Slobodkin, Y. & Katz, O. Towards passive non-line-of-sight acoustic localization around corners using uncontrolled random noise sources. Sci Rep 13, 4952 (2023). https://doi.org/10.1038/s41598-023-31490-2