

Neural circuit policies enabling auditable autonomy

Abstract

A central goal of artificial intelligence in high-stakes decision-making applications is to design a single algorithm that simultaneously expresses generalizability, by learning coherent representations of its world, and interpretability, by providing explanations of its dynamics. Here, we combine brain-inspired neural computation principles and scalable deep learning architectures to design compact neural controllers for task-specific compartments of a full-stack autonomous vehicle control system. We discover that a single algorithm with 19 control neurons, connecting 32 encapsulated input features to outputs by 253 synapses, learns to map high-dimensional inputs into steering commands. This system shows superior generalizability, interpretability and robustness compared with orders-of-magnitude larger black-box learning systems. The obtained neural agents enable high-fidelity autonomy for task-specific parts of a complex autonomous system.


Fig. 1: End-to-end driving.
Fig. 2: Recurrent network modules are essential for the lane-keeping tasks.
Fig. 3: Designing NCP networks with an LTC neural model.
Fig. 4: Robustness analysis.
Fig. 5: Global network dynamics.
Fig. 6: Intuitive comprehension of NCP’s cells activity while driving.
Fig. 7: NCPs as task-specific networks within a full-stack autonomous vehicle engine.

Data availability

A description of how to obtain the data and code used for this manuscript is available at the manuscript’s GitHub repository: https://github.com/mlech26l/keras-ncp/ (https://doi.org/10.5281/zenodo.3999484). The data generated by the active test runs are available for download from the repository, while the full dataset of 193 GB is available on request from M.L.

Code availability

An Apache-2.0 licensed reference implementation maintained by the authors is available at the GitHub repository: https://github.com/mlech26l/keras-ncp/ (https://doi.org/10.5281/zenodo.3999484).
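For readers who want a feel for the liquid time-constant (LTC) neural model on which NCPs are built (Fig. 3 and ref. 41), the following is a minimal, dependency-free sketch of a single-neuron update. The fused semi-implicit Euler step follows the LTC formulation, but the network wiring is omitted and all parameter values are illustrative, not taken from the paper; the authors' actual implementation is the keras-ncp repository above.

```python
import math

def sigmoid(z):
    """Bounded synaptic activation."""
    return 1.0 / (1.0 + math.exp(-z))

def ltc_step(x, i_in, dt, tau, A, w, b):
    """One fused semi-implicit Euler step of an LTC neuron (cf. ref. 41):

        dx/dt = -x / tau + f(I) * (A - x),   f = sigmoid(w * I + b)

    solved as:

        x_next = (x + dt * f * A) / (1 + dt * (1/tau + f))

    The denominator keeps the state bounded for any input magnitude.
    """
    f = sigmoid(w * i_in + b)
    return (x + dt * f * A) / (1.0 + dt * (1.0 / tau + f))

# Drive a single neuron with a constant input and watch the state relax
# monotonically towards its input-dependent equilibrium below A.
x, trace = 0.0, []
for _ in range(200):
    x = ltc_step(x, i_in=1.0, dt=0.1, tau=1.0, A=1.0, w=2.0, b=0.0)
    trace.append(x)
```

The bounded state and input-dependent effective time constant (the `1/tau + f` term) are the properties the paper leverages for robust, auditable dynamics.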

References

  1. LeCun, Y., Cosatto, E., Ben, J., Muller, U. & Flepp, B. DAVE: Autonomous Off-road Vehicle Control Using End-to-end Learning. Technical Report DARPA-IPTO Final Report (Courant Institute/CBLL, 2004); https://cs.nyu.edu/~yann/research/dave/

  2. Bojarski, M. et al. End to end learning for self-driving cars. Preprint at http://arXiv.org/abs/1604.07316 (2016).

  3. Kato, S. et al. Global brain dynamics embed the motor command sequence of Caenorhabditis elegans. Cell 163, 656–669 (2015).

  4. Stephens, G. J., Johnson-Kerner, B., Bialek, W. & Ryu, W. S. Dimensionality and dynamics in the behavior of C. elegans. PLoS Comput. Biol. 4, e1000028 (2008).

  5. Gray, J. M., Hill, J. J. & Bargmann, C. I. A circuit for navigation in Caenorhabditis elegans. Proc. Natl Acad. Sci. USA 102, 3184–3191 (2005).

  6. Yan, G. et al. Network control principles predict neuron function in the Caenorhabditis elegans connectome. Nature 550, 519–523 (2017).

  7. Cook, S. J. et al. Whole-animal connectomes of both Caenorhabditis elegans sexes. Nature 571, 63–71 (2019).

  8. Kaplan, H. S., Thula, O. S., Khoss, N. & Zimmer, M. Nested neuronal dynamics orchestrate a behavioral hierarchy across timescales. Neuron 105, 562–576 (2019).

  9. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).

  10. Hassabis, D., Kumaran, D., Summerfield, C. & Botvinick, M. Neuroscience-inspired artificial intelligence. Neuron 95, 245–258 (2017).

  11. Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1, 206–215 (2019).

  12. Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015).

  13. Silver, D. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016).

  14. Silver, D. et al. Mastering the game of Go without human knowledge. Nature 550, 354–359 (2017).

  15. Schrittwieser, J. et al. Mastering Atari, Go, chess and shogi by planning with a learned model. Preprint at http://arXiv.org/abs/1911.08265 (2019).

  16. Vinyals, O. et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature 575, 350–354 (2019).

  17. Bengio, Y., Courville, A. & Vincent, P. Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35, 1798–1828 (2013).

  18. Lipton, Z. C. The mythos of model interpretability. Queue 16, 31–57 (2018).

  19. Lechner, M., Hasani, R., Rus, D. & Grosu, R. Gershgorin loss stabilizes the recurrent neural network compartment of an end-to-end robot learning scheme. In Proc. 2020 International Conference on Robotics and Automation (ICRA) 5446–5452 (2020).

  20. Knight, J. C. Safety critical systems: challenges and directions. In Proc. 24th International Conference on Software Engineering 547–550 (2002).

  21. Pearl, J. Causality (Cambridge Univ. Press, 2009).

  22. Peters, J., Janzing, D. & Schölkopf, B. Elements of Causal Inference: Foundations and Learning Algorithms (MIT Press, 2017).

  23. Joseph, M., Kearns, M., Morgenstern, J. H. & Roth, A. Fairness in learning: classic and contextual bandits. In Proc. Advances in Neural Information Processing Systems (NeurIPS) 325–333 (2016).

  24. Fish, B., Kun, J. & Lelkes, Á. D. A confidence-based approach for balancing fairness and accuracy. In Proc. SIAM International Conference on Data Mining 144–152 (2016).

  25. Vaswani, A. et al. Attention is all you need. In Proc. Advances in Neural Information Processing Systems (NeurIPS) 5998–6008 (2017).

  26. Xu, H., Gao, Y., Yu, F. & Darrell, T. End-to-end learning of driving models from large-scale video datasets. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 2174–2182 (2017).

  27. Amini, A., Paull, L., Balch, T., Karaman, S. & Rus, D. Learning steering bounds for parallel autonomous systems. In IEEE International Conference on Robotics and Automation (ICRA) 1–8 (2018).

  28. Fridman, L. et al. MIT advanced vehicle technology study: large-scale naturalistic driving study of driver behavior and interaction with automation. IEEE Access 7, 102021–102038 (2019).

  29. LeCun, Y. et al. Handwritten digit recognition with a back-propagation network. In Proc. Advances in Neural Information Processing Systems (NeurIPS) 396–404 (1990).

  30. Amini, A., Rosman, G., Karaman, S. & Rus, D. Variational end-to-end navigation and localization. In Proc. 2019 International Conference on Robotics and Automation (ICRA) 8958–8964 (2019).

  31. Hochreiter, S. Untersuchungen zu dynamischen neuronalen Netzen. Diploma thesis, Technische Universität München (1991).

  32. Bengio, Y., Simard, P. & Frasconi, P. Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 5, 157–166 (1994).

  33. Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning representations by back-propagating errors. Nature 323, 533–536 (1986).

  34. Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997).

  35. Reimer, B., Mehler, B., Wang, Y. & Coughlin, J. F. A field study on the impact of variations in short-term memory demands on drivers’ visual attention and driving performance across three age groups. Hum. Factors 54, 454–468 (2012).

  36. Funahashi, K.-I. & Nakamura, Y. Approximation of dynamical systems by continuous time recurrent neural networks. Neural Netw. 6, 801–806 (1993).

  37. Chen, T. Q., Rubanova, Y., Bettencourt, J. & Duvenaud, D. K. Neural ordinary differential equations. In Proc. Advances in Neural Information Processing Systems (NeurIPS) 6571–6583 (2018).

  38. Lechner, M. & Hasani, R. Learning long-term dependencies in irregularly-sampled time series. Preprint at http://arXiv.org/abs/2006.04418 (2020).

  39. Sarma, G. P. et al. OpenWorm: overview and recent advances in integrative biological simulation of Caenorhabditis elegans. Phil. Trans. R. Soc. B 373, 20170382 (2018).

  40. Gleeson, P., Lung, D., Grosu, R., Hasani, R. & Larson, S. D. c302: a multiscale framework for modelling the nervous system of Caenorhabditis elegans. Phil. Trans. R. Soc. B 373, 20170379 (2018).

  41. Hasani, R., Lechner, M., Amini, A., Rus, D. & Grosu, R. Liquid time-constant networks. Preprint at http://arXiv.org/abs/2006.04439 (2020).

  42. LeCun, Y. et al. Backpropagation applied to handwritten zip code recognition. Neural Comput. 1, 541–551 (1989).

  43. Wicks, S. R., Roehrig, C. J. & Rankin, C. H. A dynamic network simulation of the nematode tap withdrawal circuit: predictions concerning synaptic function using behavioral criteria. J. Neurosci. 16, 4017–4031 (1996).

  44. Lechner, M., Hasani, R., Zimmer, M., Henzinger, T. A. & Grosu, R. Designing worm-inspired neural networks for interpretable robotic control. In International Conference on Robotics and Automation (ICRA) 87–94 (2019).

  45. Hasani, R., Lechner, M., Amini, A., Rus, D. & Grosu, R. The natural lottery ticket winner: reinforcement learning with ordinary neural circuits. In Proc. International Conference on Machine Learning (2020).

  46. Bengio, Y. & Grandvalet, Y. No unbiased estimator of the variance of k-fold cross-validation. J. Mach. Learn. Res. 5, 1089–1105 (2004).

  47. Molnar, C. Interpretable Machine Learning (Lulu.com, 2019).

  48. Hasani, R. Interpretable Recurrent Neural Networks in Continuous-time Control Environments. PhD dissertation, Technische Universität Wien (2020).

  49. Erhan, D., Bengio, Y., Courville, A. & Vincent, P. Visualizing Higher-layer Features of a Deep Network. Technical Report 1341 (Univ. Montreal, 2009).

  50. Zeiler, M. D. & Fergus, R. Visualizing and understanding convolutional networks. In European Conference on Computer Vision 818–833 (2014).

  51. Yosinski, J., Clune, J., Nguyen, A., Fuchs, T. & Lipson, H. Understanding neural networks through deep visualization. Preprint at http://arXiv.org/abs/1506.06579 (2015).

  52. Karpathy, A., Johnson, J. & Fei-Fei, L. Visualizing and understanding recurrent networks. Preprint at http://arXiv.org/abs/1506.02078 (2015).

  53. Strobelt, H., Gehrmann, S., Pfister, H. & Rush, A. M. LSTMVis: a tool for visual analysis of hidden state dynamics in recurrent neural networks. IEEE Trans. Vis. Comput. Graph. 24, 667–676 (2018).

  54. Bilal, A., Jourabloo, A., Ye, M., Liu, X. & Ren, L. Do convolutional neural networks learn class hierarchy? IEEE Trans. Vis. Comput. Graph. 24, 152–162 (2018).

  55. Olah, C. et al. The building blocks of interpretability. Distill 3, e10 (2018).

  56. Simonyan, K., Vedaldi, A. & Zisserman, A. Deep inside convolutional networks: visualising image classification models and saliency maps. Preprint at http://arXiv.org/abs/1312.6034 (2013).

  57. Fong, R. C. & Vedaldi, A. Interpretable explanations of black boxes by meaningful perturbation. In Proc. IEEE International Conference on Computer Vision 3449–3457 (IEEE, 2017).

  58. Kindermans, P.-J., Schütt, K. T., Alber, M., Müller, K.-R. & Dähne, S. Learning how to explain neural networks: PatternNet and PatternAttribution. In Proc. International Conference on Learning Representations (ICLR) (2018).

  59. Sundararajan, M., Taly, A. & Yan, Q. Axiomatic attribution for deep networks. In Proc. 34th International Conference on Machine Learning (ICML) (2017).

  60. Doshi-Velez, F. & Kim, B. Towards a rigorous science of interpretable machine learning. Preprint at http://arXiv.org/abs/1702.08608 (2017).

  61. Trask, A. et al. Neural arithmetic logic units. In Proc. Advances in Neural Information Processing Systems (NeurIPS) 8035–8044 (2018).

  62. Bojarski, M. et al. VisualBackProp: efficient visualization of CNNs for autonomous driving. In IEEE International Conference on Robotics and Automation (ICRA) 1–8 (2018).

  63. van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).

  64. Tesla Autopilot (Tesla, 2020); https://www.tesla.com/autopilot

  65. Karpathy, A. PyTorch at Tesla. In PyTorch DevCon 19 https://youtu.be/oBklltKXtDE (2019).

  66. Press, W. H., Teukolsky, S. A., Vetterling, W. T. & Flannery, B. P. Numerical Recipes: The Art of Scientific Computing 3rd edn (Cambridge Univ. Press, 2007).

  67. Naser, F. et al. A parallel autonomy research platform. In 2017 IEEE Intelligent Vehicles Symposium (IV) 933–940 (IEEE, 2017).

  68. Amini, A. et al. Learning robust control policies for end-to-end autonomous driving from data-driven simulation. IEEE Robot. Autom. Lett. 5, 1143–1150 (2020).

  69. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In Proc. 3rd International Conference for Learning Representations (ICLR) (2015).

  70. Wang, Z., Bovik, A. C., Sheikh, H. R. & Simoncelli, E. P. Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13, 600–612 (2004).

  71. Girosi, F., Jones, M. & Poggio, T. Regularization theory and neural networks architectures. Neural Comput. 7, 219–269 (1995).

  72. Smale, S. & Zhou, D.-X. Learning theory estimates via integral operators and their approximations. Constr. Approx. 26, 153–172 (2007).


Acknowledgements

We thank M. Zimmer and the Zimmer Group for constructive discussions. R.H. and R.G. are partially supported by Horizon-2020 ECSEL Project grant no. 783163 (iDev40), and the Austrian Research Promotion Agency (FFG), project no. 860424. M.L. and T.A.H. were supported in part by the Austrian Science Fund (FWF) under grant Z211-N23 (Wittgenstein Award). A.A. is supported by the National Science Foundation (NSF) Graduate Research Fellowship Program. A.A. and D.R. were partially sponsored by the United States Air Force Research Laboratory, and this work was accomplished under Cooperative Agreement no. FA8750-19-2-1000. R.H. and D.R. are partially supported by The Boeing Company. This research work is drawn from the PhD dissertation of R.H.

Author information


Contributions

R.H. and M.L. conceptualized, designed and performed research, and analysed data. A.A. contributed to data curation, research implementation and new analytical tools, and analysed data. R.G., T.A.H. and D.R. helped with the design and supervised the work. All authors wrote the paper.

Corresponding authors

Correspondence to Mathias Lechner or Ramin Hasani.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Conciseness of the hidden-state dynamics of LSTMs vs NCPs.

a, Hidden-state dynamics of 64 LSTM cells as a function of the network output. b, Hidden-state dynamics of 13 NCP cells as a function of the network output. c, PCA on LSTM cells plus output. d, PCA on LSTM cells only. e, PCA on NCP cells plus output. f, PCA on NCP cells only. The x axis represents the activity of the output; the y axis represents the dynamics of an individual neuron. Colour encodes the steering angle (yellower regions depict sharper turns to the left; bluer regions depict sharper turns to the right).
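The dimensionality comparison in panels c–f can be reproduced generically: run PCA over a (time × cells) matrix of hidden-state traces and check how much variance the leading components capture. The sketch below uses synthetic traces for a hypothetical compact 13-cell network versus a diffuse 64-cell one; it does not use the paper's recorded data, and the generated signals are illustrative only.

```python
import numpy as np

def pca_explained_variance(hidden_states, n_components=2):
    """PCA via SVD on a (time, cells) matrix of hidden-state traces.

    Returns the projected low-dimensional trajectory and the fraction
    of total variance captured by the leading components.
    """
    X = hidden_states - hidden_states.mean(axis=0)   # centre each cell
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    var_ratio = S**2 / np.sum(S**2)                  # per-component variance
    proj = X @ Vt[:n_components].T                   # projected trajectory
    return proj, var_ratio[:n_components].sum()

# Synthetic traces: 13 cells driven by a shared 2D latent signal (so
# their joint activity lies near a low-dimensional manifold), versus
# 64 cells of unstructured noise.
rng = np.random.default_rng(0)
t = np.linspace(0, 10, 500)
latent = np.stack([np.sin(t), np.cos(t)], axis=1)
compact = latent @ rng.normal(size=(2, 13)) + 0.05 * rng.normal(size=(500, 13))
diffuse = rng.normal(size=(500, 64))

_, ev_compact = pca_explained_variance(compact)   # close to 1
_, ev_diffuse = pca_explained_variance(diffuse)   # much smaller
```

With traces like these, two principal components explain nearly all of the compact network's variance but only a small fraction of the diffuse one's, which is the qualitative contrast the extended data figure draws between NCP and LSTM cells.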

Extended Data Fig. 2 Learning curves of the models tested in the active driving experiments.

Early stopping (refs 71,72) is deployed as a regularisation mechanism to obtain better generalisation. The terminating epoch for each experiment is reported in Extended Data Fig. 8.

Extended Data Fig. 3 Neural activity of all NCP neurons presented in Fig. 6.

The colour bar represents the output of each individual neuron in the NCP architecture.

Extended Data Fig. 4 Coupling sensitivity of all NCP neurons presented in Fig. 6.

The colour bar represents the time constant of each individual neuron in the NCP architecture.

Extended Data Fig. 5 Convolutional head.

Size of the convolutional kernels.

Extended Data Fig. 6 Layers of the feedforward CNN, adapted from ref. 2.

Conv2D refers to a convolutional layer; F, the number of filters; K, the kernel size; S, the strides; and U, the number of units in a fully connected layer. The values of the dropout rates δ1, δ2 and δ3 were optimised on the passive benchmark and are reported in Extended Data Fig. 7.

Extended Data Fig. 7 Models’ training hyperparameters.

The values of all hyperparameters were selected through empirical evaluation on the passive training dataset. We did not search the hyperparameter space exhaustively, owing to the computational cost. However, a systematic meta-learning algorithm over these parameter spaces could presumably yield better performance.
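As a concrete illustration of such a systematic search, the sketch below runs a random search over a hypothetical two-parameter space (learning rate and dropout rate). The `evaluate` function is a synthetic stand-in for "train the model and measure validation loss on the passive benchmark", and both the ranges and the surrogate's optimum are assumptions for illustration, not values from the paper.

```python
import math
import random

# Hypothetical search space; the actual ranges used in the paper
# are not specified here.
SPACE = {
    "learning_rate": (1e-4, 1e-2),  # sampled log-uniformly
    "dropout": (0.0, 0.5),          # sampled uniformly
}

def sample_config(rng):
    lo, hi = SPACE["learning_rate"]
    return {
        "learning_rate": 10 ** rng.uniform(math.log10(lo), math.log10(hi)),
        "dropout": rng.uniform(*SPACE["dropout"]),
    }

def evaluate(cfg):
    """Synthetic surrogate for training + passive-benchmark validation.

    Minimised near learning_rate = 1e-3, dropout = 0.3 (arbitrary
    choices so the script runs end to end without a real model).
    """
    return (math.log10(cfg["learning_rate"]) + 3) ** 2 + (cfg["dropout"] - 0.3) ** 2

def random_search(n_trials=50, seed=0):
    rng = random.Random(seed)
    best_cfg, best_loss = None, float("inf")
    for _ in range(n_trials):
        cfg = sample_config(rng)
        loss = evaluate(cfg)
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss

best_cfg, best_loss = random_search()
```

Random search is only the simplest such strategy; Bayesian optimisation or population-based training would replace `sample_config` with an informed proposal mechanism while keeping the same evaluate-and-keep-best loop.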

Extended Data Fig. 8 The learning termination epoch properties (shown in Extended Data Fig. 2).

Training and validation metrics of the models tested in the active driving experiment. As discussed thoroughly (Fig. 4), the LSTM model achieves the best performance in the passive test but fails to express proper driving behaviour under environmental disturbances.

Supplementary information

Supplementary Video 1

NCP—driving performance with no perturbation.

Supplementary Video 2

CNN—driving performance with no perturbation.

Supplementary Video 3

LSTM—driving performance with no perturbation.

Supplementary Video 4

CT-RNN—driving performance with no perturbation.

Supplementary Video 5

NCP—driving performance with perturbation variance = 0.1.

Supplementary Video 6

CNN—driving performance with perturbation variance = 0.1.

Supplementary Video 7

LSTM—driving performance with perturbation variance = 0.1.

Supplementary Video 8

CT-RNN—driving performance with perturbation variance = 0.1.

Supplementary Video 9

NCP—driving performance with perturbation variance = 0.2.

Supplementary Video 10

CNN—driving performance with perturbation variance = 0.2.

Supplementary Video 11

LSTM—driving performance with perturbation variance = 0.2.

Supplementary Video 12

CT-RNN—driving performance with perturbation variance = 0.2.

Supplementary Video 13

NCP—driving performance with perturbation variance = 0.3.

Supplementary Video 14

CNN—driving performance with perturbation variance = 0.3.

Supplementary Video 15

LSTM—driving performance with perturbations variance = 0.3.

Supplementary Video 16

CT-RNN—driving performance with perturbation variance = 0.3.


About this article


Cite this article

Lechner, M., Hasani, R., Amini, A. et al. Neural circuit policies enabling auditable autonomy. Nat Mach Intell 2, 642–652 (2020). https://doi.org/10.1038/s42256-020-00237-3
