Dense reinforcement learning for safety validation of autonomous vehicles


Abstract

One critical bottleneck that impedes the development and deployment of autonomous vehicles is the prohibitively high economic and time costs required to validate their safety in a naturalistic driving environment, owing to the rarity of safety-critical events¹. Here we report the development of an intelligent testing environment, where artificial-intelligence-based background agents are trained to validate the safety performance of autonomous vehicles in an accelerated mode, without loss of unbiasedness. From naturalistic driving data, the background agents learn what adversarial manoeuvre to execute through a dense deep-reinforcement-learning (D2RL) approach, in which Markov decision processes are edited by removing non-safety-critical states and reconnecting critical ones so that the information in the training data is densified. D2RL enables neural networks to learn from densified information with safety-critical events and achieves tasks that are intractable for traditional deep-reinforcement-learning approaches. We demonstrate the effectiveness of our approach by testing a highly automated vehicle in both highway and urban test tracks with an augmented-reality environment, combining simulated background vehicles with physical road infrastructure and a real autonomous test vehicle. Our results show that the D2RL-trained agents can accelerate the evaluation process by multiple orders of magnitude (10³ to 10⁵ times faster). In addition, D2RL will enable accelerated testing and training with other safety-critical autonomous systems.
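The state-removal idea described in the abstract can be illustrated with a toy Python sketch. This is not the authors' implementation: the `Transition` record, the boolean `critical` flag and the `densify` helper are hypothetical names introduced here, and the sketch assumes that criticality has already been decided by some external measure and that dropped transitions carry no reward.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Transition:
    state: int
    action: int
    reward: float
    next_state: int
    critical: bool  # hypothetical flag: is this step safety-critical?


def densify(episode: List[Transition]) -> List[Transition]:
    """Drop non-critical transitions and reconnect the remaining ones.

    Each kept transition is re-linked so that its next_state is the state of
    the following kept transition, or the episode's final state for the last.
    Assumption: rewards on dropped transitions are zero and can be ignored.
    """
    kept = [t for t in episode if t.critical]
    if not kept:
        return []
    final_state = episode[-1].next_state
    dense = []
    for i, t in enumerate(kept):
        nxt = kept[i + 1].state if i + 1 < len(kept) else final_state
        dense.append(Transition(t.state, t.action, t.reward, nxt, True))
    return dense


# Toy episode 0 -> 1 -> 2 -> 3 -> 4 -> 5 with critical steps at states 1 and 3.
episode = [Transition(s, 0, 0.0, s + 1, s in (1, 3)) for s in range(5)]
dense = densify(episode)
print([(t.state, t.next_state) for t in dense])  # prints [(1, 3), (3, 5)]
```

In this toy setting a learner would see only the two reconnected critical transitions instead of all five, which is the sense in which the training data are "densified"; how criticality is measured and how rewards are handled in the actual method is described in the paper itself.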

Fig. 1: Validating safety-critical AI with the dense-learning approach.
Fig. 2: Comparison of D2RL with DRL using the corner-case-generation examples.
Fig. 3: Performance evaluation of the D2RL-based intelligent testing environment.
Fig. 4: Testing experiments for a real-world AV at physical test tracks.

Data availability

The raw datasets that we used for modelling the naturalistic driving environment come from the Safety Pilot Model Deployment (SPMD) programme⁴⁸ and the Integrated Vehicle-Based Safety System (IVBSS)⁴⁹ at the University of Michigan, Ann Arbor. The ShapeNet dataset⁵³, which includes the three-dimensional model assets for the image augmented-reality module, is available online. The police crash reports used in Supplementary Video 7 are also available online, as are the processed data for constructing NDE models and the intelligent testing environment and the experiment results that support the findings of this study. Source data are provided with this paper.

Code availability

The simulation software SUMO, the automated driving system Autoware and the RLlib platform with the implemented PPO algorithm are publicly available, as described in the text and the relevant references²³,²⁵,⁵². The source codes for the naturalistic driving environment simulator, the driving behaviour models in the simulator, the D2RL-based intelligent testing environment and the simulation set-ups are available online.


References

  1. Kalra, N. & Paddock, S. M. Driving to safety: how many miles of driving would it take to demonstrate autonomous vehicle reliability? Transp. Res. A 94, 182–193 (2016).

  2. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).

  3. 10 million self-driving cars will be on the road by 2020. Insider (2016).

  4. Nissan promises self-driving cars by 2020. Wired (2014).

  5. Tesla’s self-driving vehicles are not far off. Insider (2015).

  6. Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles (Society of Automotive Engineers, 2021).

  7. 2021 Disengagement Reports (California Department of Motor Vehicles, 2022).

  8. Paz, D., Lai, P. J., Chan, N., Jiang, Y. & Christensen, H. I. Autonomous vehicle benchmarking using unbiased metrics. In IEEE International Conference on Intelligent Robots and Systems 6223–6228 (IEEE, 2020).

  9. Favarò, F., Eurich, S. & Nader, N. Autonomous vehicles’ disengagements: trends, triggers, and regulatory limitations. Accid. Anal. Prev. 110, 136–148 (2018).

  10. Riedmaier, S., Ponn, T., Ludwig, D., Schick, B. & Diermeyer, F. Survey on scenario-based safety assessment of automated vehicles. IEEE Access 8, 87456–87477 (2020).

  11. Nalic, D. et al. Scenario based testing of automated driving systems: a literature survey. In Proc. of the FISITA Web Congress 1–10 (Fisita, 2020).

  12. Feng, S., Feng, Y., Yu, C., Zhang, Y. & Liu, H. X. Testing scenario library generation for connected and automated vehicles, part I: methodology. IEEE Trans. Intell. Transp. Syst. 22, 1573–1582 (2020).

  13. Feng, S. et al. Testing scenario library generation for connected and automated vehicles, part II: case studies. IEEE Trans. Intell. Transp. Syst. 22, 5635–5647 (2020).

  14. Feng, S., Yan, X., Sun, H., Feng, Y. & Liu, H. X. Intelligent driving intelligence test for autonomous vehicles with naturalistic and adversarial environment. Nat. Commun. 12, 748 (2021).

  15. Sinha, A., O’Kelly, M., Tedrake, R. & Duchi, J. C. Neural bridge sampling for evaluating safety-critical autonomous systems. Adv. Neural Inf. Process. Syst. 33, 6402–6416 (2020).

  16. Li, L. et al. Parallel testing of vehicle intelligence via virtual-real interaction. Sci. Robot. 4, eaaw4106 (2019).

  17. Zhao, D. et al. Accelerated evaluation of automated vehicles safety in lane-change scenarios based on importance sampling techniques. IEEE Trans. Intell. Transp. Syst. 18, 595–607 (2016).

  18. Donoho, D. L. High-dimensional data analysis: the curses and blessings of dimensionality. AMS Math Challenges Lecture 1, 32 (2000).

  19. Hinton, G. E. & Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science 313, 504–507 (2006).

  20. Silver, D. et al. Mastering the game of Go without human knowledge. Nature 550, 354–359 (2017).

  21. Mirhoseini, A. et al. A graph placement methodology for fast chip design. Nature 594, 207–212 (2021).

  22. Cummings, M. L. Rethinking the maturity of artificial intelligence in safety-critical settings. AI Mag. 42, 6–15 (2021).

  23. Kato, S. et al. Autoware on board: enabling autonomous vehicles with embedded systems. In 2018 ACM/IEEE 9th International Conference on Cyber-Physical Systems 287–296 (IEEE, 2018).

  24. Feng, S. et al. Safety assessment of highly automated driving systems in test tracks: a new framework. Accid. Anal. Prev. 144, 105664 (2020).

  25. Lopez, P. et al. Microscopic traffic simulation using SUMO. In International Conference on Intelligent Transportation Systems 2575–2582 (IEEE, 2018).

  26. Arun, A., Haque, M. M., Bhaskar, A., Washington, S. & Sayed, T. A systematic mapping review of surrogate safety assessment using traffic conflict techniques. Accid. Anal. Prev. 153, 106016 (2021).

  27. Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction (MIT Press, 2018).

  28. Koren, M., Alsaif, S., Lee, R. & Kochenderfer, M. J. Adaptive stress testing for autonomous vehicles. In IEEE Intelligent Vehicles Symposium (IV) 1–7 (IEEE, 2018).

  29. Sun, H., Feng, S., Yan, X. & Liu, H. X. Corner case generation and analysis for safety assessment of autonomous vehicles. Transport. Res. Rec. 2675, 587–600 (2021).

  30. Schulman, J., Wolski, F., Dhariwal, P., Radford, A. & Klimov, O. Proximal policy optimization algorithms. Preprint at (2017).

  31. Owen, A. B. Monte Carlo theory, methods and examples. Art Owen (2013).

  32. Krajewski, R., Moers, T., Bock, J., Vater, L. & Eckstein, L. The rounD dataset: a drone dataset of road user trajectories at roundabouts in Germany. In 2020 IEEE 23rd International Conference on Intelligent Transportation Systems 1–6 (IEEE, 2020).

  33. Nowakowski, C., Shladover, S. E., Chan, C. Y. & Tan, H. S. Development of California regulations to govern testing and operation of automated driving systems. Transport. Res. Rec. 2489, 137–144 (2015).

  34. Sauerbier, J., Bock, J., Weber, H. & Eckstein, L. Definition of scenarios for safety validation of automated driving functions. ATZ Worldwide 121, 42–45 (2019).

  35. Pek, C., Manzinger, S., Koschi, M. & Althoff, M. Using online verification to prevent autonomous vehicles from causing accidents. Nat. Mach. Intell. 2, 518–528 (2020).

  36. Seshia, S. A., Sadigh, D. & Sastry, S. S. Toward verified artificial intelligence. Commun. ACM 65, 46–55 (2022).

  37. Wing, J. M. A specifier’s introduction to formal methods. IEEE Comput. 23, 8–24 (1990).

  38. Li, A., Sun, L., Zhan, W., Tomizuka, M. & Chen, M. Prediction-based reachability for collision avoidance in autonomous driving. In 2021 IEEE International Conference on Robotics and Automation 7908–7914 (IEEE, 2021).

  39. Automated Vehicle Safety Consortium AVSC Best Practice for Metrics and Methods for Assessing Safety Performance of Automated Driving Systems (ADS) (SAE Industry Technologies Consortia, 2021).

  40. Au, S. K. & Beck, J. L. Important sampling in high dimensions. Struct. Saf. 25, 139–163 (2003).

  41. Silver, D., Singh, S., Precup, D. & Sutton, R. S. Reward is enough. Artif. Intell. 299, 1–13 (2021).

  42. Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015).

  43. Weng, B., Rao, S. J., Deosthale, E., Schnelle, S. & Barickman, F. Model predictive instantaneous safety metric for evaluation of automated driving systems. In IEEE Intelligent Vehicles Symposium (IV) 1899–1906 (IEEE, 2020).

  44. Junietz, P., Bonakdar, F., Klamann, B. & Winner, H. Criticality metric for the safety validation of automated driving using model predictive trajectory optimization. In International Conference on Intelligent Transportation Systems 60–65 (IEEE, 2018).

  45. Huang, G., Liu, Z., Van Der Maaten, L. & Weinberger, K. Q. Densely connected convolutional networks. In IEEE Conference on Computer Vision and Pattern Recognition 4700–4708 (IEEE, 2017).

  46. Bengio, Y., Louradour, J., Collobert, R. & Weston, J. Curriculum learning. In International Conference on Machine Learning 41–48 (ICML, 2009).

  47. Yan, X., Feng, S., Sun, H. & Liu, H. X. Distributionally consistent simulation of naturalistic driving environment for autonomous vehicle testing. Preprint at (2021).

  48. Bezzina, D. & Sayer, J. Safety Pilot Model Deployment: Test Conductor Team Report DOT HS 812 171 (National Highway Traffic Safety Administration, 2014).

  49. Sayer, J. et al. Integrated Vehicle-based Safety Systems Field Operational Test: Final Program Report FHWA-JPO-11-150; UMTRI-2010-36 (Joint Program Office for Intelligent Transportation Systems, 2011).

  50. Treiber, M., Hennecke, A. & Helbing, D. Congested traffic states in empirical observations and microscopic simulations. Phys. Rev. E 62, 1805 (2000).

  51. Kesting, A., Treiber, M. & Helbing, D. General lane-changing model MOBIL for car-following models. Transp. Res. Rec. 1999, 86–94 (2007).

  52. Liang, E. et al. RLlib: abstractions for distributed reinforcement learning. In International Conference on Machine Learning 3053–3062 (ICML, 2018).

  53. Chang, A. X. et al. ShapeNet: an information-rich 3D model repository. Preprint at (2015).

  54. Darweesh, H. et al. Open source integrated planner for autonomous navigation in highly dynamic environments. J. Robot. Mechatron. 29, 668–684 (2017).


Acknowledgements

This research was partially funded by the US Department of Transportation (USDOT) Region 5 University Transportation Center: Center for Connected and Automated Transportation (CCAT) of the University of Michigan (#69A3551747105) and the National Science Foundation (CMMI #2223517). We thank the American Center for Mobility (ACM) for providing access to their test track. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the official policy or position of the US government or the American Center for Mobility.

Author information

Contributions

S.F. and H.X.L. conceived and led the research programme, developed the AI against AI concepts, developed the dense-learning approach, and wrote the paper. S.F. and H.S. developed the algorithms for the intelligent-testing-environment generation and designed the experiments. H.S. and H.Z. developed the simulation platform, implemented the algorithms, performed the simulation tests and prepared the simulation results. X.Y., H.Z. and S.S. implemented the Autoware system in the autonomous vehicle, performed the field tests and prepared the testing results. Z.Z. developed and performed the augmented image rendering. All authors provided feedback during the manuscript revision and results discussions. H.X.L. approved the submission and accepted responsibility for the overall integrity of the paper.

Corresponding author

Correspondence to Henry X. Liu.

Ethics declarations

Competing interests

The University of Michigan is in the process of applying for a patent (application #63/338,424) covering the dense reinforcement learning, intelligent testing environment generation and augmented reality testing techniques. The application lists H.X.L., S.F., H.S., X.Y., H.Z., Z.Z. and S.S. as inventors.

Peer review

Peer review information

Nature thanks Colin Paterson, Fredrik Warg and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

This file contains Supplementary Sections 1–4, including Supplementary text and equations, Figs. 1–19, Tables 1 and 2, and references; see Contents for details. It also includes links to Supplementary Videos 1–8 in Section 5, which are hosted externally via figshare.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Feng, S., Sun, H., Yan, X. et al. Dense reinforcement learning for safety validation of autonomous vehicles. Nature 615, 620–627 (2023).
