Dense reinforcement learning for safety validation of autonomous vehicles

Feng, Shuo; Sun, Haowei; Yan, Xintao; Zhu, Haojie; Zou, Zhengxia; Shen, Shengyin; Liu, Henry X.

doi:10.1038/s41586-023-05732-2

Article
Published: 22 March 2023

Nature volume 615, pages 620–627 (2023)

Abstract

One critical bottleneck that impedes the development and deployment of autonomous vehicles is the prohibitively high economic and time costs required to validate their safety in a naturalistic driving environment, owing to the rarity of safety-critical events¹. Here we report the development of an intelligent testing environment, where artificial-intelligence-based background agents are trained to validate the safety performances of autonomous vehicles in an accelerated mode, without loss of unbiasedness. From naturalistic driving data, the background agents learn what adversarial manoeuvre to execute through a dense deep-reinforcement-learning (D2RL) approach, in which Markov decision processes are edited by removing non-safety-critical states and reconnecting critical ones so that the information in the training data is densified. D2RL enables neural networks to learn from densified information with safety-critical events and achieves tasks that are intractable for traditional deep-reinforcement-learning approaches. We demonstrate the effectiveness of our approach by testing a highly automated vehicle in both highway and urban test tracks with an augmented-reality environment, combining simulated background vehicles with physical road infrastructure and a real autonomous test vehicle. Our results show that the D2RL-trained agents can accelerate the evaluation process by multiple orders of magnitude (10³ to 10⁵ times faster). In addition, D2RL will enable accelerated testing and training with other safety-critical autonomous systems.
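
The claim of acceleration "without loss of unbiasedness" can be illustrated with a toy importance-sampling estimator. The sketch below is not the authors' D2RL implementation: the episode model, the function names and all parameters (`p_critical`, `p_adv`, `q_adv`, `horizon`) are hypothetical and chosen only so that the true crash rate has a closed form. The behaviour policy is changed only in safety-critical states, and each episode is reweighted by the likelihood ratio so that the estimate stays unbiased.

```python
import random


def true_crash_rate(p_critical, p_adv, horizon):
    """Closed-form crash probability of the toy model (assumed, not from the paper)."""
    return 1.0 - (1.0 - p_critical * p_adv) ** horizon


def run_episode(rng, p_critical, p_adv, q_adv, horizon):
    """Simulate one episode under the testing policy and return (crashed, weight).

    p_adv is the naturalistic probability of an adversarial manoeuvre in a
    critical state; q_adv is the inflated probability used during testing.
    """
    weight = 1.0
    for _ in range(horizon):
        if rng.random() >= p_critical:
            # Non-critical state: behaviour is left naturalistic, ratio is 1.
            continue
        if rng.random() < q_adv:
            # Adversarial manoeuvre in a critical state ends the episode in a crash.
            weight *= p_adv / q_adv
            return True, weight
        # Critical state survived: correct for having sampled "no manoeuvre".
        weight *= (1.0 - p_adv) / (1.0 - q_adv)
    return False, weight


def estimate_crash_rate(n_episodes, seed, p_critical, p_adv, q_adv, horizon):
    """Importance-sampling estimate of the naturalistic crash rate."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_episodes):
        crashed, weight = run_episode(rng, p_critical, p_adv, q_adv, horizon)
        if crashed:
            total += weight
    return total / n_episodes


if __name__ == "__main__":
    params = dict(p_critical=0.1, p_adv=1e-3, horizon=20)
    print("closed form :", true_crash_rate(**params))
    print("accelerated :", estimate_crash_rate(20000, 0, q_adv=0.1, **params))
    # Setting q_adv equal to p_adv recovers plain Monte Carlo sampling.
    print("naturalistic:", estimate_crash_rate(20000, 0, q_adv=1e-3, **params))
```

Setting `q_adv` equal to `p_adv` makes every weight equal to 1, so the same function doubles as the naturalistic baseline. With the inflated `q_adv`, crashes are observed far more often per episode, which is the source of the speed-up in this toy setting.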

**Fig. 1: Validating safety-critical AI with the dense-learning approach.**

**Fig. 2: Comparison of D2RL with DRL using the corner-case-generation examples.**

**Fig. 3: Performance evaluation of the D2RL-based intelligent testing environment.**

**Fig. 4: Testing experiments for a real-world AV at physical test tracks.**

Data availability

The raw datasets that we used for modelling the naturalistic driving environment come from the Safety Pilot Model Deployment (SPMD) programme⁴⁸ and the Integrated Vehicle-Based Safety System (IVBSS)⁴⁹ at the University of Michigan, Ann Arbor. The ShapeNet Dataset that includes the three-dimensional model assets for the image augmented-reality module can be found at https://github.com/mmatl/pyrender. The police crash reports used in Supplementary Video 7 are available at https://www.michigantrafficcrashfacts.org/. The processed data for constructing NDE models and the intelligent testing environment and the experiment results that support the findings of this study are available at https://github.com/michigan-traffic-lab/Dense-Deep-Reinforcement-Learning. Source data are provided with this paper.

Code availability

The simulation software SUMO, the automated driving system Autoware and the RLLib platform with the implemented PPO algorithm are publicly available, as described in the text and the relevant references²³,²⁵,⁵². The source codes for the naturalistic driving environment simulator, the driving behaviour models in the simulator, the D2RL-based intelligent testing environment and the simulation set-ups are available at https://github.com/michigan-traffic-lab/Dense-Deep-Reinforcement-Learning.

References

1. Kalra, N. & Paddock, S. M. Driving to safety: how many miles of driving would it take to demonstrate autonomous vehicle reliability? Transp. Res. A 94, 182–193 (2016).
2. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
3. 10 million self-driving cars will be on the road by 2020. Insider https://www.businessinsider.com/report-10-million-self-driving-cars-will-be-on-the-road-by-2020-2015-5-6 (2016).
4. Nissan promises self-driving cars by 2020. Wired https://www.wired.com/2013/08/nissan-autonomous-drive/ (2014).
5. Tesla’s self-driving vehicles are not far off. Insider https://www.businessinsider.com/elon-musk-on-teslas-autonomous-cars-2015-9 (2015).
6. Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles (Society of Automotive Engineers, 2021); https://www.sae.org/standards/content/j3016_202104/.
7. 2021 Disengagement Reports (California Department of Motor Vehicles, 2022); https://www.dmv.ca.gov/portal/vehicle-industry-services/autonomous-vehicles/disengagement-reports/.
8. Paz, D., Lai, P. J., Chan, N., Jiang, Y. & Christensen, H. I. Autonomous vehicle benchmarking using unbiased metrics. In IEEE International Conference on Intelligent Robots and Systems 6223–6228 (IEEE, 2020).
9. Favarò, F., Eurich, S. & Nader, N. Autonomous vehicles’ disengagements: trends, triggers, and regulatory limitations. Accid. Anal. Prev. 110, 136–148 (2018).
10. Riedmaier, S., Ponn, T., Ludwig, D., Schick, B. & Diermeyer, F. Survey on scenario-based safety assessment of automated vehicles. IEEE Access 8, 87456–87477 (2020).
11. Nalic, D. et al. Scenario based testing of automated driving systems: a literature survey. In Proc. of the FISITA Web Congress 1–10 (Fisita, 2020).
12. Feng, S., Feng, Y., Yu, C., Zhang, Y. & Liu, H. X. Testing scenario library generation for connected and automated vehicles, part I: methodology. IEEE Trans. Intell. Transp. Syst. 22, 1573–1582 (2020).
13. Feng, S. et al. Testing scenario library generation for connected and automated vehicles, part II: case studies. IEEE Trans. Intell. Transp. Syst. 22, 5635–5647 (2020).
14. Feng, S., Yan, X., Sun, H., Feng, Y. & Liu, H. X. Intelligent driving intelligence test for autonomous vehicles with naturalistic and adversarial environment. Nat. Commun. 12, 748 (2021).
15. Sinha, A., O’Kelly, M., Tedrake, R. & Duchi, J. C. Neural bridge sampling for evaluating safety-critical autonomous systems. Adv. Neural Inf. Process. Syst. 33, 6402–6416 (2020).
16. Li, L. et al. Parallel testing of vehicle intelligence via virtual-real interaction. Sci. Robot. 4, eaaw4106 (2019).
17. Zhao, D. et al. Accelerated evaluation of automated vehicles safety in lane-change scenarios based on importance sampling techniques. IEEE Trans. Intell. Transp. Syst. 18, 595–607 (2016).
18. Donoho, D. L. High-dimensional data analysis: the curses and blessings of dimensionality. AMS Math Challenges Lecture 1, 32 (2000).
19. Hinton, G. E. & Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science 313, 504–507 (2006).
20. Silver, D. et al. Mastering the game of Go without human knowledge. Nature 550, 354–359 (2017).
21. Mirhoseini, A. et al. A graph placement methodology for fast chip design. Nature 594, 207–212 (2021).
22. Cummings, M. L. Rethinking the maturity of artificial intelligence in safety-critical settings. AI Mag. 42, 6–15 (2021).
23. Kato, S. et al. Autoware on board: enabling autonomous vehicles with embedded systems. In 2018 ACM/IEEE 9th International Conference on Cyber-Physical Systems 287–296 (IEEE, 2018).
24. Feng, S. et al. Safety assessment of highly automated driving systems in test tracks: a new framework. Accid. Anal. Prev. 144, 105664 (2020).
25. Lopez, P. et al. Microscopic traffic simulation using SUMO. In International Conference on Intelligent Transportation Systems 2575–2582 (IEEE, 2018).
26. Arun, A., Haque, M. M., Bhaskar, A., Washington, S. & Sayed, T. A systematic mapping review of surrogate safety assessment using traffic conflict techniques. Accid. Anal. Prev. 153, 106016 (2021).
27. Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction (MIT Press, 2018).
28. Koren, M., Alsaif, S., Lee, R. & Kochenderfer, M. J. Adaptive stress testing for autonomous vehicles. In IEEE Intelligent Vehicles Symposium (IV) 1–7 (IEEE, 2018).
29. Sun, H., Feng, S., Yan, X. & Liu, H. X. Corner case generation and analysis for safety assessment of autonomous vehicles. Transport. Res. Rec. 2675, 587–600 (2021).
30. Schulman, J., Wolski, F., Dhariwal, P., Radford, A. & Klimov, O. Proximal policy optimization algorithms. Preprint at https://arxiv.org/abs/1707.06347 (2017).
31. Owen, A. B. Monte Carlo theory, methods and examples. Art Owen https://artowen.su.domains/mc/ (2013).
32. Krajewski, R., Moers, T., Bock, J., Vater, L. & Eckstein, L. The rounD dataset: a drone dataset of road user trajectories at roundabouts in Germany. In 2020 IEEE 23rd International Conference on Intelligent Transportation Systems 1–6 (IEEE, 2020).
33. Nowakowski, C., Shladover, S. E., Chan, C. Y. & Tan, H. S. Development of California regulations to govern testing and operation of automated driving systems. Transport. Res. Rec. 2489, 137–144 (2015).
34. Sauerbier, J., Bock, J., Weber, H. & Eckstein, L. Definition of scenarios for safety validation of automated driving functions. ATZ Worldwide 121, 42–45 (2019).
35. Pek, C., Manzinger, S., Koschi, M. & Althoff, M. Using online verification to prevent autonomous vehicles from causing accidents. Nat. Mach. Intell. 2, 518–528 (2020).
36. Seshia, S. A., Sadigh, D. & Sastry, S. S. Toward verified artificial intelligence. Commun. ACM 65, 46–55 (2022).
37. Wing, J. M. A specifier’s introduction to formal methods. IEEE Comput. 23, 8–24 (1990).
38. Li, A., Sun, L., Zhan, W., Tomizuka, M. & Chen, M. Prediction-based reachability for collision avoidance in autonomous driving. In 2021 IEEE International Conference on Robotics and Automation 7908–7914 (IEEE, 2021).
39. Automated Vehicle Safety Consortium AVSC Best Practice for Metrics and Methods for Assessing Safety Performance of Automated Driving Systems (ADS) (SAE Industry Technologies Consortia, 2021).
40. Au, S. K. & Beck, J. L. Important sampling in high dimensions. Struct. Saf. 25, 139–163 (2003).
41. Silver, D., Singh, S., Precup, D. & Sutton, R. S. Reward is enough. Artif. Intell. 299, 1–13 (2021).
42. Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015).
43. Weng, B., Rao, S. J., Deosthale, E., Schnelle, S. & Barickman, F. Model predictive instantaneous safety metric for evaluation of automated driving systems. In IEEE Intelligent Vehicles Symposium (IV) 1899–1906 (IEEE, 2020).
44. Junietz, P., Bonakdar, F., Klamann, B. & Winner, H. Criticality metric for the safety validation of automated driving using model predictive trajectory optimization. In International Conference on Intelligent Transportation Systems 60–65 (IEEE, 2018).
45. Huang, G., Liu, Z., Van Der Maaten, L. & Weinberger, K. Q. Densely connected convolutional networks. In IEEE Conference on Computer Vision and Pattern Recognition 4700–4708 (IEEE, 2017).
46. Bengio, Y., Louradour, J., Collobert, R. & Weston, J. Curriculum learning. In International Conference on Machine Learning 41–48 (ICML, 2009).
47. Yan, X., Feng, S., Sun, H. & Liu, H. X. Distributionally consistent simulation of naturalistic driving environment for autonomous vehicle testing. Preprint at https://arxiv.org/abs/2101.02828 (2021).
48. Bezzina, D. & Sayer, J. Safety Pilot Model Deployment: Test Conductor Team Report DOT HS 812 171 (National Highway Traffic Safety Administration, 2014).
49. Sayer, J. et al. Integrated Vehicle-based Safety Systems Field Operational Test: Final Program Report FHWA-JPO-11-150; UMTRI-2010-36 (Joint Program Office for Intelligent Transportation Systems, 2011).
50. Treiber, M., Hennecke, A. & Helbing, D. Congested traffic states in empirical observations and microscopic simulations. Phys. Rev. E 62, 1805 (2000).
51. Kesting, A., Treiber, M. & Helbing, D. General lane-changing model MOBIL for car-following models. Transp. Res. Rec. 1999, 86–94 (2007).
52. Liang, E. et al. RLlib: abstractions for distributed reinforcement learning. In International Conference on Machine Learning 3053–3062 (ICML, 2018).
53. Chang, A. X. et al. ShapeNet: an information-rich 3D model repository. Preprint at https://arxiv.org/abs/1512.03012 (2015).
54. Darweesh, H. et al. Open source integrated planner for autonomous navigation in highly dynamic environments. J. Robot. Mechatron. 29, 668–684 (2017).

Acknowledgements

This research was partially funded by the US Department of Transportation (USDOT) Region 5 University Transportation Center: Center for Connected and Automated Transportation (CCAT) of the University of Michigan (#69A3551747105) and the National Science Foundation (CMMI #2223517). We thank the American Center for Mobility (ACM) for providing access to their test track. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the official policy or position of the US government or the American Center for Mobility.

Author information

Shuo Feng
Present address: Department of Automation, Tsinghua University, Beijing, China
Zhengxia Zou
Present address: School of Astronautics, Beihang University, Beijing, China

Authors and Affiliations

Department of Civil and Environmental Engineering, University of Michigan, Ann Arbor, MI, USA
Shuo Feng, Haowei Sun, Xintao Yan, Haojie Zhu, Zhengxia Zou & Henry X. Liu
University of Michigan Transportation Research Institute, Ann Arbor, MI, USA
Shuo Feng, Shengyin Shen & Henry X. Liu
Mcity, University of Michigan, Ann Arbor, MI, USA
Henry X. Liu

Contributions

S.F. and H.X.L. conceived and led the research programme, developed the AI against AI concepts, developed the dense-learning approach, and wrote the paper. S.F. and H.S. developed the algorithms for the intelligent-testing-environment generation and designed the experiments. H.S. and H.Z. developed the simulation platform, implemented the algorithms, performed the simulation tests and prepared the simulation results. X.Y., H.Z. and S.S. implemented the Autoware system in the autonomous vehicle, performed the field tests and prepared the testing results. Z.Z. developed and performed the augmented image rendering. All authors provided feedback during the manuscript revision and results discussions. H.X.L. approved the submission and accepted responsibility for the overall integrity of the paper.

Corresponding author

Correspondence to Henry X. Liu.

Ethics declarations

Competing interests

The University of Michigan is in the process of applying for a patent (application #63/338,424) covering the dense reinforcement learning, intelligent testing environment generation, and augmented reality testing techniques, which lists H.X.L., S.F., H.S., X.Y., H.Z., Z.Z., and S.S. as inventors.

Peer review

Peer review information

Nature thanks Colin Paterson, Fredrik Warg and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

This file contains Supplementary Sections 1–4, including Supplementary text and equations, Figs. 1–19, Tables 1 and 2, and references; see Contents for details. It also includes links to Supplementary Videos 1–8 in Section 5, which are hosted externally via figshare.

Source data

Source Data Fig. 2

Source Data Fig. 3

Source Data Fig. 4

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Feng, S., Sun, H., Yan, X. et al. Dense reinforcement learning for safety validation of autonomous vehicles. Nature 615, 620–627 (2023). https://doi.org/10.1038/s41586-023-05732-2

Received: 01 March 2022
Accepted: 16 January 2023
Published: 22 March 2023
Issue Date: 23 March 2023
DOI: https://doi.org/10.1038/s41586-023-05732-2
