Incorporating neuro-inspired adaptability for continual learning in artificial intelligence

Wang, Liyuan; Zhang, Xingxing; Li, Qian; Zhang, Mingtian; Su, Hang; Zhu, Jun; Zhong, Yi

doi:10.1038/s42256-023-00747-w

Article
Published: 16 November 2023

Incorporating neuro-inspired adaptability for continual learning in artificial intelligence

Nature Machine Intelligence volume 5, pages 1356–1368 (2023)Cite this article

4442 Accesses
1 Citations
20 Altmetric
Metrics details

Subjects

A preprint version of the article is available at arXiv.

Abstract

Continual learning aims to empower artificial intelligence with strong adaptability to the real world. For this purpose, a desirable solution should properly balance memory stability with learning plasticity, and acquire sufficient compatibility to capture the observed distributions. Existing advances mainly focus on preserving memory stability to overcome catastrophic forgetting, but it remains difficult to flexibly accommodate incremental changes as biological intelligence does. Here, by modelling a robust Drosophila learning system that actively regulates forgetting with multiple learning modules, we propose a generic approach that appropriately attenuates old memories in parameter distributions to improve learning plasticity, and accordingly coordinates a multi-learner architecture to ensure solution compatibility. Through extensive theoretical and empirical validation, our approach not only enhances the performance of continual learning, especially over synaptic regularization methods in task-incremental settings, but also potentially advances the understanding of neurological adaptive mechanisms.

Access through your institution

Buy or subscribe

This is a preview of subscription content, access via your institution

Access options

Access through your institution

Buy this article

Purchase on Springer Link
Instant access to full article PDF

Buy now

Prices may be subject to local taxes which are calculated during checkout

**Fig. 1: Continual learning with reference to a biological learning system.**

**Fig. 2: Implementation of active forgetting in a continual learning model.**

**Fig. 3: The γMB-like architecture of MCL with adaptive modulations.**

**Fig. 4: Exploration of underlying mechanisms that improve continual learning.**

**Fig. 5: Performance evaluation for visual classification tasks.**

**Fig. 6: Performance evaluation for Atari reinforcement tasks.**

Three types of incremental learning

Article Open access 05 December 2022

Sleep-like unsupervised replay reduces catastrophic forgetting in artificial neural networks

Article Open access 15 December 2022

Network of evolvable neural units can learn synaptic learning rules and spiking dynamics

Article 10 December 2020

Data availability

All benchmark datasets used in this paper are publicly available, including CIFAR-10/100⁴⁵ (https://www.cs.toronto.edu/~kriz/cifar.html), Omniglot⁵⁷ (https://www.omniglot.com), CUB-200-2011⁵⁸ (https://www.vision.caltech.edu/datasets/cub_200_2011/), Tiny-ImageNet¹⁷ (https://www.image-net.org/download.php), CORe50⁵⁹ (https://vlomonaco.github.io/core50/) and Atari games⁷³ (https://github.com/openai/baselines).

Code availability

The implementation code is available via Zenodo https://doi.org/10.5281/zenodo.8293564 ref. ⁷⁴.

References

Chen, Z. & Liu, B. Lifelong machine learning. (San Rafael: Morgan & Claypool Publishers, 2018).
Parisi, G. I., Kemker, R., Part, J. L., Kanan, C. & Wermter, S. Continual lifelong learning with neural networks: a review. Neural Netw. 113, 54–71 (2019).
Kudithipudi, D. et al. Biological underpinnings for lifelong learning machines. Nat. Mach. Intell. 4, 196–210 (2022).
Article Google Scholar
McCloskey, M. & Cohen, N. J. Catastrophic interference in connectionist networks: the sequential learning problem. Psychol. Learn. Motiv. 24, 109–165 (1989).
Article Google Scholar
McClelland, J. L., McNaughton, B. L. & O’Reilly, R. C. Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol. Rev. 102, 419 (1995).
Article Google Scholar
Wang, L., Zhang, X., Su, H. & Zhu, J. A comprehensive survey of continual learning: theory, method and application. Preprint at https://arxiv.org/abs/2302.00487 (2023).
Kirkpatrick, J. et al. Overcoming catastrophic forgetting in neural networks. Proc. Natl Acad. Sci. USA 114, 3521–3526 (2017).
Article MathSciNet Google Scholar
Aljundi, R., Babiloni, F., Elhoseiny, M., Rohrbach, M. & Tuytelaars, T. Memory aware synapses: learning what (not) to forget. In Proc. European Conference on Computer Vision 139–154 (Springer, 2018).
Zenke, F., Poole, B. & Ganguli, S. Continual learning through synaptic intelligence. In Proc. International Conference on Machine Learning 3987–3995 (PMLR, 2017).
Chaudhry, A., Dokania, P. K., Ajanthan, T. & Torr, P. H. Riemannian walk for incremental learning: understanding forgetting and intransigence. In Proc. European Conference on Computer Vision 532–547 (Springer, 2018).
Ritter, H., Botev, A. & Barber, D. Online structured laplace approximations for overcoming catastrophic forgetting. Adv. Neural Inf. Process. Syst. 31, 3742–3752 (2018).
Rebuffi, S.-A., Kolesnikov, A., Sperl, G. & Lampert, C. H. iCaRL: incremental classifier and representation learning. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 2001–2010 (IEEE, 2017).
Shin, H., Lee, J. K., Kim, J. & Kim, J. Continual learning with deep generative replay. Adv. Neural Inf. Process. Syst. 30, 2990–2999 (2017).
Wang, L. et al. Memory replay with data compression for continual learning. In International Conference on Learning Representations (2021).
Serra, J., Suris, D., Miron, M. & Karatzoglou, A. Overcoming catastrophic forgetting with hard attention to the task. In Proc. International Conference on Machine Learning 4548–4557 (PMLR, 2018).
Fernando, C. et al. PathNet: evolution channels gradient descent in super neural networks. Preprint at https://arxiv.org/abs/1701.08734 (2017).
Delange, M. et al. A continual learning survey: defying forgetting in classification tasks. IEEE Trans. Pattern Anal. Mach. Intell. (2021).
Hadsell, R., Rao, D., Rusu, A. A. & Pascanu, R. Embracing change: continual learning in deep neural networks. Trends Cogn. Sci. 24, 1028–1040 (2020).
Article Google Scholar
Shuai, Y. et al. Forgetting is regulated through Rac activity in Drosophila. Cell 140, 579–589 (2010).
Article Google Scholar
Cohn, R., Morantte, I. & Ruta, V. Coordinated and compartmentalized neuromodulation shapes sensory processing in Drosophila. Cell 163, 1742–1755 (2015).
Waddell, S. Neural plasticity: dopamine tunes the mushroom body output network. Curr. Biol. 26, R109–R112 (2016).
Article Google Scholar
Modi, M. N., Shuai, Y. & Turner, G. C. The Drosophila mushroom body: from architecture to algorithm in a learning circuit. Annu. Rev. Neurosci. 43, 465–484 (2020).
Article Google Scholar
Aso, Y. et al. Mushroom body output neurons encode valence and guide memory-based action selection in Drosophila. eLife 3, e04580 (2014).
Article Google Scholar
Aso, Y. & Rubin, G. M. Dopaminergic neurons write and update memories with cell-type-specific rules. eLife 5, e16135 (2016).
Article Google Scholar
Gao, Y. et al. Genetic dissection of active forgetting in labile and consolidated memories in Drosophila. Proc. Natl Acad. Sci. USA 116, 21191–21197 (2019).
Article Google Scholar
Zhao, J. et al. Genetic dissection of mutual interference between two consecutive learning tasks in Drosophila. eLife 12, e83516 (2023).
Richards, B. A. & Frankland, P. W. The persistence and transience of memory. Neuron 94, 1071–1084 (2017).
Article Google Scholar
Dong, T. et al. Inability to activate Rac1-dependent forgetting contributes to behavioral inflexibility in mutants of multiple autism-risk genes. Proc. Natl Acad. Sci. USA 113, 7644–7649 (2016).
Article Google Scholar
Zhang, X., Li, Q., Wang, L., Liu, Z.-J. & Zhong, Y. Active protection: learning-activated Raf/MAPK activity protects labile memory from Rac1-independent forgetting. Neuron 98, 142–155 (2018).
Article Google Scholar
Davis, R. L. & Zhong, Y. The biology of forgetting—a perspective. Neuron 95, 490–503 (2017).
Article Google Scholar
Mo, H. et al. Age-related memory vulnerability to interfering stimuli is caused by gradual loss of MAPK-dependent protection in Drosophila. Aging Cell 21, e13628 (2022).
Cervantes-Sandoval, I., Chakraborty, M., MacMullen, C. & Davis, R. L. Scribble scaffolds a signalosome for active forgetting. Neuron 90, 1230–1242 (2016).
Article Google Scholar
Noyes, N. C., Phan, A. & Davis, R. L. Memory suppressor genes: modulating acquisition, consolidation, and forgetting. Neuron 109, 3211–3227 (2021).
Article Google Scholar
Cognigni, P., Felsenberg, J. & Waddell, S. Do the right thing: neural network mechanisms of memory formation, expression and update in Drosophila. Curr. Opin. Neurobiol. 49, 51–58 (2018).
Article Google Scholar
Amin, H. & Lin, A. C. Neuronal mechanisms underlying innate and learned olfactory processing in Drosophila. Curr. Opin. Insect Sci. 36, 9–17 (2019).
Article Google Scholar
Handler, A. et al. Distinct dopamine receptor pathways underlie the temporal sensitivity of associative learning. Cell 178, 60–75 (2019).
Article Google Scholar
McCurdy, L. Y., Sareen, P., Davoudian, P. A. & Nitabach, M. N. Dopaminergic mechanism underlying reward-encoding of punishment omission during reversal learning in Drosophila. Nat. Commun. 12, 1115 (2021).
Berry, J. A., Cervantes-Sandoval, I., Nicholas, E. P. & Davis, R. L. Dopamine is required for learning and forgetting in Drosophila. Neuron 74, 530–542 (2012).
Article Google Scholar
Berry, J. A., Phan, A. & Davis, R. L. Dopamine neurons mediate learning and forgetting through bidirectional modulation of a memory trace. Cell Rep. 25, 651–662 (2018).
Article Google Scholar
Aitchison, L. et al. Synaptic plasticity as bayesian inference. Nat. Neurosci. 24, 565–571 (2021).
Article Google Scholar
Schug, S., Benzing, F. & Steger, A. Presynaptic stochasticity improves energy efficiency and helps alleviate the stability–plasticity dilemma. eLife 10, e69884 (2021).
Article Google Scholar
Wang, L. et al. AFEC: active forgetting of negative transfer in continual learning. Adv. Neural Inf. Process. Syst. 34, 22379–22391 (2021).
Benzing, F. Unifying importance based regularisation methods for continual learning. In Proc. International Conference on Artificial Intelligence and Statistics 2372–2396 (PMLR, 2022).
Bouton, M. E. Context, time, and memory retrieval in the interference paradigms of pavlovian learning. Psychol. Bull. 114, 80 (1993).
Article Google Scholar
Krizhevsky, A. et al. Learning multiple layers of features from tiny images. Technical Report, Citeseer (2009).
Shuai, Y. et al. Dissecting neural pathways for forgetting in Drosophila olfactory aversive memory. Proc. Natl Acad. Sci. USA 112, E6663–E6672 (2015).
Chen, L. et al. AI of brain and cognitive sciences: from the perspective of first principles. Preprint at https://arxiv.org/abs/2301.08382 (2023).
Caron, S. J., Ruta, V., Abbott, L. F. & Axel, R. Random convergence of olfactory inputs in the Drosophila mushroom body. Nature 497, 113–117 (2013).
Article Google Scholar
Endo, K., Tsuchimoto, Y. & Kazama, H. Synthesis of conserved odor object representations in a random, divergent-convergent network. Neuron 108, 367–381 (2020).
Article Google Scholar
Long, M., Cao, Y., Wang, J. & Jordan, M. Learning transferable features with deep adaptation networks. In Proc. International Conference on Machine Learning 97–105 (PMLR, 2015).
Wang, L., Zhang, X., Li, Q., Zhu, J. & Zhong, Y. CoSCL: cooperation of small continual learners is stronger than a big one. In Proc. European Conference on Computer Vision 254–271 (Springer, 2022).
van de Ven, G. M., Tuytelaars, T. & Tolias, A. S. Three types of incremental learning. Nat. Mach. Intell. 4, 1185–1197 (2022).
Riemer, M. et al. Learning to learn without forgetting by maximizing transfer and minimizing interference. In International Conference on Learning Representations (2018).
Schwarz, J. et al. Progress & compress: a scalable framework for continual learning. In Proc. International Conference on Machine Learning 4528–4537 (PMLR, 2018).
Jung, S., Ahn, H., Cha, S. & Moon, T. Continual learning with node-importance based adaptive group sparse regularization. Adv. Neural Inf. Process. Syst. 33, 3647–3658 (2020).
Cha, S., Hsu, H., Hwang, T., Calmon, F. & Moon, T. CPR: classifier-projection regularization for continual learning. In International Conference on Learning Representations (2020).
Lake, B. M., Salakhutdinov, R. & Tenenbaum, J. B. Human-level concept learning through probabilistic program induction. Science 350, 1332–1338 (2015).
Article MathSciNet Google Scholar
Wah, C., Branson, S., Welinder, P., Perona, P. & Belongie, S. The Caltech-UCSD birds-200-2011 dataset. (2011). http://www.vision.caltech.edu/datasets/
Lomonaco, V. & Maltoni, D. Core50: a new dataset and benchmark for continuous object recognition. In Conference on Robot Learning 17–26 (PMLR, 2017).
Ryan, T. J. & Frankland, P. W. Forgetting as a form of adaptive engram cell plasticity. Nat. Rev. Neurosci. 23, 173–186 (2022).
Article Google Scholar
Luo, L. et al. Differential effects of the Rac GTPase on Purkinje cell axons and dendritic trunks and spines. Nature 379, 837–840 (1996).
Tashiro, A., Minden, A. & Yuste, R. Regulation of dendritic spine morphology by the rho family of small gtpases: antagonistic roles of Rac and Rho. Cerebral Cortex 10, 927–938 (2000).
Article Google Scholar
Hayashi-Takagi, A. et al. Disrupted-in-Schizophrenia 1 (DISC1) regulates spines of the glutamate synapse via Rac1. Nat. Neurosci. 13, 327–332 (2010).
Article Google Scholar
Hayashi-Takagi, A. et al. Labelling and optical erasure of synaptic memory traces in the motor cortex. Nature 525, 333–338 (2015).
Article Google Scholar
Martens, J. & Grosse, R. Optimizing neural networks with kronecker-factored approximate curvature. In Proc. International Conference on Machine Learning 2408–2417 (PMLR, 2015).
Knoblauch, J., Husain, H. & Diethe, T. Optimal continual learning has perfect memory and is NP-hard. In Proc. International Conference on Machine Learning 5327–5337 (PMLR, 2020).
Deng, D., Chen, G., Hao, J., Wang, Q. & Heng, P.-A. Flattening sharpness for dynamic gradient projection memory benefits continual learning. Adv. Neural Inf. Process. Syst. 34, 18710–18721 (2021).
Mirzadeh, S. I., Farajtabar, M., Pascanu, R. & Ghasemzadeh, H. Understanding the role of training regimes in continual learning. Adv. Neural Inf. Process. Syst. 33, 7308–7320 (2020).
Google Scholar
McAllester, D. A. PAC-Bayesian model averaging. In Proc. Twelfth Annual Conference on Computational Learning Theory 164–170 (ACM, 1999).
Pham, Q., Liu, C., Sahoo, D. & Steven, H. Contextual transformation networks for online continual learning. In International Conference on Learning Representations (2021).
Schulman, J., Wolski, F., Dhariwal, P., Radford, A. & Klimov, O. Proximal policy optimization algorithms. Preprint at https://arxiv.org/abs/1707.06347 (2017).
Lopez-Paz, D. et al. Gradient episodic memory for continual learning. Adv. Neural Inf. Process. Syst. 30, 6467–6476 (2017).
Mnih, V. et al. Playing Atari with deep reinforcement learning. Preprint at https://arxiv.org/abs/1312.5602 (2013).
Wang, L. & Zhang, X. lywang3081/CAF: CAF paper. Zenodo https://doi.org/10.5281/zenodo.8293564 (2023).
Selvaraju, R. R. et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 618–626 (IEEE, 2017).

Download references

Acknowledgements

This work was supported by the National Key Research and Development Program of China (2020AAA0106302, to J.Z.), the STI2030-Major Projects (2022ZD0204900, to Y.Z.), the National Natural Science Foundation of China (nos 62061136001 and 92248303, to J.Z., 32021002, to Y.Z., U19A2081, to H.S.), the Tsinghua-Peking Center for Life Sciences, the Tsinghua Institute for Guo Qiang, and the High Performance Computing Center, Tsinghua University. L.W. was also supported by the Shuimu Tsinghua Scholar. J.Z. was also supported by the New Cornerstone Science Foundation through the XPLORER PRIZE.

Author information

These authors contributed equally: Liyuan Wang, Xingxing Zhang.

Authors and Affiliations

Department of Computer Science and Technology, Institute for AI, BNRist Center, Tsinghua-Bosch Joint ML Center, THBI Lab, Tsinghua University, Beijing, China
Liyuan Wang, Xingxing Zhang, Hang Su & Jun Zhu
School of Life Sciences, IDG/McGovern Institute for Brain Research, Tsinghua University, Beijing, China
Liyuan Wang, Qian Li & Yi Zhong
Tsinghua-Peking Center for Life Sciences, Beijing, China
Liyuan Wang, Qian Li & Yi Zhong
Centre for Artificial Intelligence, University College London, London, UK
Mingtian Zhang

Authors

Liyuan Wang
View author publications
You can also search for this author in PubMed Google Scholar
Xingxing Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Qian Li
View author publications
You can also search for this author in PubMed Google Scholar
Mingtian Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Hang Su
View author publications
You can also search for this author in PubMed Google Scholar
Jun Zhu
View author publications
You can also search for this author in PubMed Google Scholar
Yi Zhong
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

L.W., X.Z., J.Z. and Y.Z. conceived the project. L.W., X.Z., Q.L. and M.Z. designed the computational model. X.Z. performed the theoretical analysis, assisted by L.W. L.W. performed all experiments and analysed the data. L.W., X.Z. and Q.L. wrote the paper. L.W., X.Z., Q.L., M.Z., H.S., J.Z. and Y.Z. revised the paper. J.Z. and Y.Z. supervised the project.

Corresponding authors

Correspondence to Jun Zhu or Yi Zhong.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Gido van de Ven and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Text, Tables 1–6 and Figs. 1–7.

Reporting Summary

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Wang, L., Zhang, X., Li, Q. et al. Incorporating neuro-inspired adaptability for continual learning in artificial intelligence. Nat Mach Intell 5, 1356–1368 (2023). https://doi.org/10.1038/s42256-023-00747-w

Download citation

Received: 03 October 2022
Accepted: 26 September 2023
Published: 16 November 2023
Issue Date: December 2023
DOI: https://doi.org/10.1038/s42256-023-00747-w

This article is cited by

Continual learning, deep reinforcement learning, and microcircuits: a novel method for clever game playing
- Oscar Chang
- Leo Ramos
- Luis Zhinin-Vera
Multimedia Tools and Applications (2024)