Skip to main content

Thank you for visiting You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

  • Article
  • Published:

Incorporating neuro-inspired adaptability for continual learning in artificial intelligence

A preprint version of the article is available at arXiv.


Continual learning aims to empower artificial intelligence with strong adaptability to the real world. For this purpose, a desirable solution should properly balance memory stability with learning plasticity, and acquire sufficient compatibility to capture the observed distributions. Existing advances mainly focus on preserving memory stability to overcome catastrophic forgetting, but it remains difficult to flexibly accommodate incremental changes as biological intelligence does. Here, by modelling a robust Drosophila learning system that actively regulates forgetting with multiple learning modules, we propose a generic approach that appropriately attenuates old memories in parameter distributions to improve learning plasticity, and accordingly coordinates a multi-learner architecture to ensure solution compatibility. Through extensive theoretical and empirical validation, our approach not only enhances the performance of continual learning, especially over synaptic regularization methods in task-incremental settings, but also potentially advances the understanding of neurological adaptive mechanisms.

This is a preview of subscription content, access via your institution

Access options

Buy this article

Prices may be subject to local taxes which are calculated during checkout

Fig. 1: Continual learning with reference to a biological learning system.
Fig. 2: Implementation of active forgetting in a continual learning model.
Fig. 3: The γMB-like architecture of MCL with adaptive modulations.
Fig. 4: Exploration of underlying mechanisms that improve continual learning.
Fig. 5: Performance evaluation for visual classification tasks.
Fig. 6: Performance evaluation for Atari reinforcement tasks.

Similar content being viewed by others

Data availability

All benchmark datasets used in this paper are publicly available, including CIFAR-10/10045 (, Omniglot57 (, CUB-200-201158 (, Tiny-ImageNet17 (, CORe5059 ( and Atari games73 (

Code availability

The implementation code is available via Zenodo ref. 74.


  1. Chen, Z. & Liu, B. Lifelong machine learning. (San Rafael: Morgan & Claypool Publishers, 2018).

  2. Parisi, G. I., Kemker, R., Part, J. L., Kanan, C. & Wermter, S. Continual lifelong learning with neural networks: a review. Neural Netw. 113, 54–71 (2019).

  3. Kudithipudi, D. et al. Biological underpinnings for lifelong learning machines. Nat. Mach. Intell. 4, 196–210 (2022).

    Article  Google Scholar 

  4. McCloskey, M. & Cohen, N. J. Catastrophic interference in connectionist networks: the sequential learning problem. Psychol. Learn. Motiv. 24, 109–165 (1989).

    Article  Google Scholar 

  5. McClelland, J. L., McNaughton, B. L. & O’Reilly, R. C. Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol. Rev. 102, 419 (1995).

    Article  Google Scholar 

  6. Wang, L., Zhang, X., Su, H. & Zhu, J. A comprehensive survey of continual learning: theory, method and application. Preprint at (2023).

  7. Kirkpatrick, J. et al. Overcoming catastrophic forgetting in neural networks. Proc. Natl Acad. Sci. USA 114, 3521–3526 (2017).

    Article  MathSciNet  Google Scholar 

  8. Aljundi, R., Babiloni, F., Elhoseiny, M., Rohrbach, M. & Tuytelaars, T. Memory aware synapses: learning what (not) to forget. In Proc. European Conference on Computer Vision 139–154 (Springer, 2018).

  9. Zenke, F., Poole, B. & Ganguli, S. Continual learning through synaptic intelligence. In Proc. International Conference on Machine Learning 3987–3995 (PMLR, 2017).

  10. Chaudhry, A., Dokania, P. K., Ajanthan, T. & Torr, P. H. Riemannian walk for incremental learning: understanding forgetting and intransigence. In Proc. European Conference on Computer Vision 532–547 (Springer, 2018).

  11. Ritter, H., Botev, A. & Barber, D. Online structured laplace approximations for overcoming catastrophic forgetting. Adv. Neural Inf. Process. Syst. 31, 3742–3752 (2018).

  12. Rebuffi, S.-A., Kolesnikov, A., Sperl, G. & Lampert, C. H. iCaRL: incremental classifier and representation learning. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 2001–2010 (IEEE, 2017).

  13. Shin, H., Lee, J. K., Kim, J. & Kim, J. Continual learning with deep generative replay. Adv. Neural Inf. Process. Syst. 30, 2990–2999 (2017).

  14. Wang, L. et al. Memory replay with data compression for continual learning. In International Conference on Learning Representations (2021).

  15. Serra, J., Suris, D., Miron, M. & Karatzoglou, A. Overcoming catastrophic forgetting with hard attention to the task. In Proc. International Conference on Machine Learning 4548–4557 (PMLR, 2018).

  16. Fernando, C. et al. PathNet: evolution channels gradient descent in super neural networks. Preprint at (2017).

  17. Delange, M. et al. A continual learning survey: defying forgetting in classification tasks. IEEE Trans. Pattern Anal. Mach. Intell. (2021).

  18. Hadsell, R., Rao, D., Rusu, A. A. & Pascanu, R. Embracing change: continual learning in deep neural networks. Trends Cogn. Sci. 24, 1028–1040 (2020).

    Article  Google Scholar 

  19. Shuai, Y. et al. Forgetting is regulated through Rac activity in Drosophila. Cell 140, 579–589 (2010).

    Article  Google Scholar 

  20. Cohn, R., Morantte, I. & Ruta, V. Coordinated and compartmentalized neuromodulation shapes sensory processing in Drosophila. Cell 163, 1742–1755 (2015).

  21. Waddell, S. Neural plasticity: dopamine tunes the mushroom body output network. Curr. Biol. 26, R109–R112 (2016).

    Article  Google Scholar 

  22. Modi, M. N., Shuai, Y. & Turner, G. C. The Drosophila mushroom body: from architecture to algorithm in a learning circuit. Annu. Rev. Neurosci. 43, 465–484 (2020).

    Article  Google Scholar 

  23. Aso, Y. et al. Mushroom body output neurons encode valence and guide memory-based action selection in Drosophila. eLife 3, e04580 (2014).

    Article  Google Scholar 

  24. Aso, Y. & Rubin, G. M. Dopaminergic neurons write and update memories with cell-type-specific rules. eLife 5, e16135 (2016).

    Article  Google Scholar 

  25. Gao, Y. et al. Genetic dissection of active forgetting in labile and consolidated memories in Drosophila. Proc. Natl Acad. Sci. USA 116, 21191–21197 (2019).

    Article  Google Scholar 

  26. Zhao, J. et al. Genetic dissection of mutual interference between two consecutive learning tasks in Drosophila. eLife 12, e83516 (2023).

  27. Richards, B. A. & Frankland, P. W. The persistence and transience of memory. Neuron 94, 1071–1084 (2017).

    Article  Google Scholar 

  28. Dong, T. et al. Inability to activate Rac1-dependent forgetting contributes to behavioral inflexibility in mutants of multiple autism-risk genes. Proc. Natl Acad. Sci. USA 113, 7644–7649 (2016).

    Article  Google Scholar 

  29. Zhang, X., Li, Q., Wang, L., Liu, Z.-J. & Zhong, Y. Active protection: learning-activated Raf/MAPK activity protects labile memory from Rac1-independent forgetting. Neuron 98, 142–155 (2018).

    Article  Google Scholar 

  30. Davis, R. L. & Zhong, Y. The biology of forgetting—a perspective. Neuron 95, 490–503 (2017).

    Article  Google Scholar 

  31. Mo, H. et al. Age-related memory vulnerability to interfering stimuli is caused by gradual loss of MAPK-dependent protection in Drosophila. Aging Cell 21, e13628 (2022).

  32. Cervantes-Sandoval, I., Chakraborty, M., MacMullen, C. & Davis, R. L. Scribble scaffolds a signalosome for active forgetting. Neuron 90, 1230–1242 (2016).

    Article  Google Scholar 

  33. Noyes, N. C., Phan, A. & Davis, R. L. Memory suppressor genes: modulating acquisition, consolidation, and forgetting. Neuron 109, 3211–3227 (2021).

    Article  Google Scholar 

  34. Cognigni, P., Felsenberg, J. & Waddell, S. Do the right thing: neural network mechanisms of memory formation, expression and update in Drosophila. Curr. Opin. Neurobiol. 49, 51–58 (2018).

    Article  Google Scholar 

  35. Amin, H. & Lin, A. C. Neuronal mechanisms underlying innate and learned olfactory processing in Drosophila. Curr. Opin. Insect Sci. 36, 9–17 (2019).

    Article  Google Scholar 

  36. Handler, A. et al. Distinct dopamine receptor pathways underlie the temporal sensitivity of associative learning. Cell 178, 60–75 (2019).

    Article  Google Scholar 

  37. McCurdy, L. Y., Sareen, P., Davoudian, P. A. & Nitabach, M. N. Dopaminergic mechanism underlying reward-encoding of punishment omission during reversal learning in Drosophila. Nat. Commun. 12, 1115 (2021).

  38. Berry, J. A., Cervantes-Sandoval, I., Nicholas, E. P. & Davis, R. L. Dopamine is required for learning and forgetting in Drosophila. Neuron 74, 530–542 (2012).

    Article  Google Scholar 

  39. Berry, J. A., Phan, A. & Davis, R. L. Dopamine neurons mediate learning and forgetting through bidirectional modulation of a memory trace. Cell Rep. 25, 651–662 (2018).

    Article  Google Scholar 

  40. Aitchison, L. et al. Synaptic plasticity as bayesian inference. Nat. Neurosci. 24, 565–571 (2021).

    Article  Google Scholar 

  41. Schug, S., Benzing, F. & Steger, A. Presynaptic stochasticity improves energy efficiency and helps alleviate the stability–plasticity dilemma. eLife 10, e69884 (2021).

    Article  Google Scholar 

  42. Wang, L. et al. AFEC: active forgetting of negative transfer in continual learning. Adv. Neural Inf. Process. Syst. 34, 22379–22391 (2021).

  43. Benzing, F. Unifying importance based regularisation methods for continual learning. In Proc. International Conference on Artificial Intelligence and Statistics 2372–2396 (PMLR, 2022).

  44. Bouton, M. E. Context, time, and memory retrieval in the interference paradigms of pavlovian learning. Psychol. Bull. 114, 80 (1993).

    Article  Google Scholar 

  45. Krizhevsky, A. et al. Learning multiple layers of features from tiny images. Technical Report, Citeseer (2009).

  46. Shuai, Y. et al. Dissecting neural pathways for forgetting in Drosophila olfactory aversive memory. Proc. Natl Acad. Sci. USA 112, E6663–E6672 (2015).

  47. Chen, L. et al. AI of brain and cognitive sciences: from the perspective of first principles. Preprint at (2023).

  48. Caron, S. J., Ruta, V., Abbott, L. F. & Axel, R. Random convergence of olfactory inputs in the Drosophila mushroom body. Nature 497, 113–117 (2013).

    Article  Google Scholar 

  49. Endo, K., Tsuchimoto, Y. & Kazama, H. Synthesis of conserved odor object representations in a random, divergent-convergent network. Neuron 108, 367–381 (2020).

    Article  Google Scholar 

  50. Long, M., Cao, Y., Wang, J. & Jordan, M. Learning transferable features with deep adaptation networks. In Proc. International Conference on Machine Learning 97–105 (PMLR, 2015).

  51. Wang, L., Zhang, X., Li, Q., Zhu, J. & Zhong, Y. CoSCL: cooperation of small continual learners is stronger than a big one. In Proc. European Conference on Computer Vision 254–271 (Springer, 2022).

  52. van de Ven, G. M., Tuytelaars, T. & Tolias, A. S. Three types of incremental learning. Nat. Mach. Intell. 4, 1185–1197 (2022).

  53. Riemer, M. et al. Learning to learn without forgetting by maximizing transfer and minimizing interference. In International Conference on Learning Representations (2018).

  54. Schwarz, J. et al. Progress & compress: a scalable framework for continual learning. In Proc. International Conference on Machine Learning 4528–4537 (PMLR, 2018).

  55. Jung, S., Ahn, H., Cha, S. & Moon, T. Continual learning with node-importance based adaptive group sparse regularization. Adv. Neural Inf. Process. Syst. 33, 3647–3658 (2020).

  56. Cha, S., Hsu, H., Hwang, T., Calmon, F. & Moon, T. CPR: classifier-projection regularization for continual learning. In International Conference on Learning Representations (2020).

  57. Lake, B. M., Salakhutdinov, R. & Tenenbaum, J. B. Human-level concept learning through probabilistic program induction. Science 350, 1332–1338 (2015).

    Article  MathSciNet  Google Scholar 

  58. Wah, C., Branson, S., Welinder, P., Perona, P. & Belongie, S. The Caltech-UCSD birds-200-2011 dataset. (2011).

  59. Lomonaco, V. & Maltoni, D. Core50: a new dataset and benchmark for continuous object recognition. In Conference on Robot Learning 17–26 (PMLR, 2017).

  60. Ryan, T. J. & Frankland, P. W. Forgetting as a form of adaptive engram cell plasticity. Nat. Rev. Neurosci. 23, 173–186 (2022).

    Article  Google Scholar 

  61. Luo, L. et al. Differential effects of the Rac GTPase on Purkinje cell axons and dendritic trunks and spines. Nature 379, 837–840 (1996).

  62. Tashiro, A., Minden, A. & Yuste, R. Regulation of dendritic spine morphology by the rho family of small gtpases: antagonistic roles of Rac and Rho. Cerebral Cortex 10, 927–938 (2000).

    Article  Google Scholar 

  63. Hayashi-Takagi, A. et al. Disrupted-in-Schizophrenia 1 (DISC1) regulates spines of the glutamate synapse via Rac1. Nat. Neurosci. 13, 327–332 (2010).

    Article  Google Scholar 

  64. Hayashi-Takagi, A. et al. Labelling and optical erasure of synaptic memory traces in the motor cortex. Nature 525, 333–338 (2015).

    Article  Google Scholar 

  65. Martens, J. & Grosse, R. Optimizing neural networks with kronecker-factored approximate curvature. In Proc. International Conference on Machine Learning 2408–2417 (PMLR, 2015).

  66. Knoblauch, J., Husain, H. & Diethe, T. Optimal continual learning has perfect memory and is NP-hard. In Proc. International Conference on Machine Learning 5327–5337 (PMLR, 2020).

  67. Deng, D., Chen, G., Hao, J., Wang, Q. & Heng, P.-A. Flattening sharpness for dynamic gradient projection memory benefits continual learning. Adv. Neural Inf. Process. Syst. 34, 18710–18721 (2021).

  68. Mirzadeh, S. I., Farajtabar, M., Pascanu, R. & Ghasemzadeh, H. Understanding the role of training regimes in continual learning. Adv. Neural Inf. Process. Syst. 33, 7308–7320 (2020).

    Google Scholar 

  69. McAllester, D. A. PAC-Bayesian model averaging. In Proc. Twelfth Annual Conference on Computational Learning Theory 164–170 (ACM, 1999).

  70. Pham, Q., Liu, C., Sahoo, D. & Steven, H. Contextual transformation networks for online continual learning. In International Conference on Learning Representations (2021).

  71. Schulman, J., Wolski, F., Dhariwal, P., Radford, A. & Klimov, O. Proximal policy optimization algorithms. Preprint at (2017).

  72. Lopez-Paz, D. et al. Gradient episodic memory for continual learning. Adv. Neural Inf. Process. Syst. 30, 6467–6476 (2017).

  73. Mnih, V. et al. Playing Atari with deep reinforcement learning. Preprint at (2013).

  74. Wang, L. & Zhang, X. lywang3081/CAF: CAF paper. Zenodo (2023).

  75. Selvaraju, R. R. et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 618–626 (IEEE, 2017).

Download references


This work was supported by the National Key Research and Development Program of China (2020AAA0106302, to J.Z.), the STI2030-Major Projects (2022ZD0204900, to Y.Z.), the National Natural Science Foundation of China (nos 62061136001 and 92248303, to J.Z., 32021002, to Y.Z., U19A2081, to H.S.), the Tsinghua-Peking Center for Life Sciences, the Tsinghua Institute for Guo Qiang, and the High Performance Computing Center, Tsinghua University. L.W. was also supported by the Shuimu Tsinghua Scholar. J.Z. was also supported by the New Cornerstone Science Foundation through the XPLORER PRIZE.

Author information

Authors and Affiliations



L.W., X.Z., J.Z. and Y.Z. conceived the project. L.W., X.Z., Q.L. and M.Z. designed the computational model. X.Z. performed the theoretical analysis, assisted by L.W. L.W. performed all experiments and analysed the data. L.W., X.Z. and Q.L. wrote the paper. L.W., X.Z., Q.L., M.Z., H.S., J.Z. and Y.Z. revised the paper. J.Z. and Y.Z. supervised the project.

Corresponding authors

Correspondence to Jun Zhu or Yi Zhong.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Gido van de Ven and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Text, Tables 1–6 and Figs. 1–7.

Reporting Summary

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Wang, L., Zhang, X., Li, Q. et al. Incorporating neuro-inspired adaptability for continual learning in artificial intelligence. Nat Mach Intell 5, 1356–1368 (2023).

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI:

This article is cited by


Quick links

Nature Briefing

Sign up for the Nature Briefing newsletter — what matters in science, free to your inbox daily.

Get the most important science stories of the day, free in your inbox. Sign up for Nature Briefing