Abstract
Autonomous driving has recently gained much attention owing to its disruptive potential and impact on the global economy; however, these high expectations are hindered by strict safety requirements for redundant sensing modalities that must each be able to independently perform complex tasks to ensure reliable operation. At the core of an autonomous driving algorithmic stack is road segmentation, which is the basis for numerous planning and decision-making algorithms. Radar-based methods fail in many driving scenarios, mainly because various common road delimiters barely reflect radar signals, coupled with a lack of analytical models for road delimiters and the inherent limitations of radar angular resolution. Our approach feeds radar data, in the form of a two-dimensional complex range-Doppler array, into a deep neural network (DNN) that is trained to semantically segment the drivable area using weak supervision from a camera. Furthermore, guided backpropagation was utilized to analyse the radar data and to design a novel perception filter. Our approach enables road segmentation in common driving scenarios based solely on radar data, and we propose this method as an enabler of redundant sensing modalities for autonomous driving.
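The guided backpropagation mentioned above can be illustrated with a minimal sketch of the per-layer ReLU rule (the generic formulation of Springenberg et al.; the paper's own network and implementation are not reproduced here):

```python
import numpy as np

def guided_backprop_relu(upstream_grad, pre_activation):
    """Guided-backpropagation rule for a single ReLU layer.

    Standard backpropagation passes the gradient wherever the forward
    pre-activation was positive; guided backpropagation additionally
    zeroes negative upstream gradients, keeping only evidence that
    increases the output. Applying this rule at every ReLU while
    backpropagating the segmentation output to the input yields a
    saliency map over the range-Doppler data.
    """
    return upstream_grad * (pre_activation > 0) * (upstream_grad > 0)
```

Chaining this rule through all layers is what produces the radar saliency maps used to analyse which range-Doppler cells drive the prediction.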
Data availability
The data generated to support the findings of this study are available from the corresponding author upon reasonable request and for non-commercial purposes only.
Code availability
The code that supports the findings of this study is available at https://doi.org/10.5281/zenodo.4318829
Change history
10 February 2021
A Correction to this paper has been published: https://doi.org/10.1038/s42256-021-00314-1
Acknowledgements
We thank H. Damari for assembling the dataset and H. Omer, Z. Iluz, Y. Avargel, L. Korkidi, M. Raifel, K. Twizer, P. Fidelman and N. Orr for their insights and advice.
Author information
Contributions
I.O. conceived the study and conducted training. All authors contributed to the design of the study, interpreting the results and writing the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature Machine Intelligence thanks Zdenka Babić and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Additional sample results from the validation dataset.
Blurry images were caused by rain droplets on the camera lens. The left column shows range-Doppler maps in dB. The middle column shows the radar-based DNN prediction overlaid on the camera image; values are confidence levels on a scale of (0, 1). The right column shows the corresponding camera pseudo-label generated by the camera-based DNN; values are confidence levels on a scale of (0, 1).
Extended Data Fig. 2 Camera label projection to the radar coordinate frame.
A sample urban scene from the validation dataset showing the camera pseudo-label projected onto Cartesian coordinates. (a) Camera image overlaid with the camera pseudo-label; values are confidence levels on a scale of (0, 1). (b) The associated radar data in Cartesian coordinates, with values in dB. (c) The same radar data in Cartesian coordinates, with values in dB, overlaid (in black) with the projected camera pseudo-label. This sample frame further illustrates the lack of distinguishable features associated with common road delimiters such as sidewalks and curbstones. Note that the projected label's minimum range is 4.5 m owing to the camera's ground clearance.
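The polar-to-Cartesian projection of the pseudo-label can be sketched as follows. The grid resolution, extents, and nearest-bin lookup are illustrative assumptions; only the 4.5 m minimum range comes from the caption above:

```python
import numpy as np

def label_to_cartesian(label_polar, ranges, azimuths, grid_res=0.5,
                       max_range=50.0, min_range=4.5):
    """Project a polar (range x azimuth) pseudo-label onto a Cartesian grid.

    label_polar: (num_ranges, num_azimuths) confidence values in (0, 1).
    ranges (m) and azimuths (rad) define the polar bins. min_range models
    the blind zone caused by the camera's ground clearance. Nearest-bin
    lookup is an assumption; the paper does not state the interpolation.
    """
    xs = np.arange(-max_range, max_range, grid_res)   # lateral axis (m)
    ys = np.arange(0.0, max_range, grid_res)          # forward axis (m)
    xg, yg = np.meshgrid(xs, ys)
    r = np.hypot(xg, yg)
    az = np.arctan2(xg, yg)                           # 0 rad = straight ahead
    # Nearest polar bin for every Cartesian cell (fancy indexing copies).
    ri = np.clip(np.searchsorted(ranges, r), 0, len(ranges) - 1)
    ai = np.clip(np.searchsorted(azimuths, az), 0, len(azimuths) - 1)
    out = label_polar[ri, ai]
    # Zero out cells outside the usable field of view and range.
    invalid = (r < min_range) | (r > ranges[-1]) \
        | (az < azimuths[0]) | (az > azimuths[-1])
    out[invalid] = 0.0
    return out
```

The masked near-range region reproduces the blind zone visible in the projected label.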
Extended Data Fig. 3 Filter correlation heatmap.
Comparison between a conventional CFAR filter and the suggested perception filter, averaged over the validation dataset. The y axis represents the conventional CFAR threshold in dB and the x axis represents the perception filter threshold on a scale of (0, 1). High correlation between the two filters would have produced a diagonal heatmap with IoU values close to 1. Instead, the results show low correlation, with low IoU values between the two methods, which further suggests that the perception filter eliminates data based on context as well as SNR.
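The comparison above can be sketched with a textbook cell-averaging CFAR mask and the IoU metric. The guard, training, and offset parameters are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def ca_cfar_mask(power_db, guard=2, train=8, offset_db=10.0):
    """1D cell-averaging CFAR along the range axis of a range-Doppler map.

    power_db: 2D array (doppler, range) in dB. For each cell, the noise
    level is the mean of 'train' cells on each side, excluding 'guard'
    cells adjacent to the cell under test; the cell is detected if it
    exceeds that estimate by offset_db.
    """
    n_d, n_r = power_db.shape
    mask = np.zeros_like(power_db, dtype=bool)
    half = guard + train
    for j in range(half, n_r - half):
        left = power_db[:, j - half:j - guard]
        right = power_db[:, j + guard + 1:j + half + 1]
        noise = np.concatenate([left, right], axis=1).mean(axis=1)
        mask[:, j] = power_db[:, j] > noise + offset_db
    return mask

def iou(mask_a, mask_b):
    """Intersection over union of two boolean masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union if union else 1.0
```

Evaluating `iou` between the CFAR mask and the thresholded perception-filter output, swept over both thresholds, is what populates a heatmap of the kind described in the caption.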
Extended Data Fig. 4 Training methodology.
A camera-based DNN is trained on a publicly available dataset and used to create pseudo-labels for a radar-based DNN. The radar and camera data are temporally synchronized and spatially overlapping. Radar pre-processing includes windowing and a 2D FFT over the sweeps and samples dimensions to create a complex 2D array of range-Doppler maps. The radar model is trained with a segmentation loss to identify the drivable area.
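The radar pre-processing step can be sketched as below. The Hann windows and exact FFT ordering are assumptions; the caption states only that windowing and a 2D FFT are applied over the sweeps and samples dimensions:

```python
import numpy as np

def range_doppler_map(adc_frame, range_window=None, doppler_window=None):
    """Compute a complex 2D range-Doppler map from one radar ADC frame.

    adc_frame: complex array of shape (num_sweeps, num_samples), i.e. one
    frame of chirps ("sweeps") by fast-time samples.
    """
    n_sweeps, n_samples = adc_frame.shape
    # Windowing suppresses spectral sidelobes; Hann is a common default.
    w_r = range_window if range_window is not None else np.hanning(n_samples)
    w_d = doppler_window if doppler_window is not None else np.hanning(n_sweeps)
    windowed = adc_frame * w_d[:, None] * w_r[None, :]
    # FFT over fast time (samples) resolves range; FFT over slow time
    # (sweeps) resolves Doppler. fftshift centres zero Doppler.
    rd = np.fft.fft(windowed, axis=1)                     # range FFT
    rd = np.fft.fftshift(np.fft.fft(rd, axis=0), axes=0)  # Doppler FFT
    return rd  # complex (num_sweeps, num_samples) range-Doppler array
```

The resulting complex array (rather than its magnitude alone) is what the caption describes as the input to the radar model.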
Extended Data Fig. 5 Model architecture.
Based on an encoder-decoder U-Net architecture with a channel attention mechanism to encourage learnable cross-channel correlations.
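A generic channel-attention block of the squeeze-and-excitation kind can be sketched as below; the caption states only that a channel attention mechanism is used, not this exact formulation:

```python
import numpy as np

def channel_attention(features, w1, w2):
    """Squeeze-and-excitation style channel attention over a feature map.

    features: (C, H, W) activations. w1: (C//r, C) and w2: (C, C//r) are
    learned bottleneck weights for reduction ratio r (hypothetical names).
    Each channel is rescaled by a gate computed from global statistics,
    letting the network learn cross-channel correlations.
    """
    squeeze = features.mean(axis=(1, 2))          # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)        # ReLU bottleneck
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid gate -> (C,)
    return features * scale[:, None, None]        # reweight channels
```

In a trained network the gate values differ per channel, suppressing uninformative range-Doppler feature channels and amplifying useful ones.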
Cite this article
Orr, I., Cohen, M. & Zalevsky, Z. High-resolution radar road segmentation using weakly supervised learning. Nat Mach Intell 3, 239–246 (2021). https://doi.org/10.1038/s42256-020-00288-6