Abstract
Synaptic plasticity is believed to be a key physiological mechanism for learning. It is well established that it depends on pre- and postsynaptic activity. However, models that rely solely on pre- and postsynaptic activity for synaptic changes have, so far, not been able to account for learning complex tasks that demand credit assignment in hierarchical networks. Here we show that if synaptic plasticity is regulated by high-frequency bursts of spikes, then pyramidal neurons higher in a hierarchical circuit can coordinate the plasticity of lower-level connections. Using simulations and mathematical analyses, we demonstrate that, when paired with short-term synaptic dynamics, regenerative activity in the apical dendrites and synaptic plasticity in feedback pathways, a burst-dependent learning rule can solve challenging tasks that require deep network architectures. Our results demonstrate that well-known properties of dendrites, synapses and synaptic plasticity are sufficient to enable sophisticated learning in hierarchical circuits.
Data availability
The MNIST, CIFAR-10 (ref. 76) and ImageNet (ref. 77) datasets are publicly available from http://yann.lecun.com/exdb/mnist/, https://www.cs.toronto.edu/~kriz/cifar.html and http://www.image-net.org, respectively.
Code availability
The code used in this article is available at https://github.com/apayeur/spikingburstprop and https://github.com/jordan-g/Burstprop.
Change history
02 November 2021
A Correction to this paper has been published: https://doi.org/10.1038/s41593-021-00970-x
References
Hebb, D. O. The Organization of Behavior (Wiley, New York, 1949).
Artola, A., Bröcher, S. & Singer, W. Different voltage dependent thresholds for inducing long-term depression and long-term potentiation in slices of rat visual cortex. Nature 347, 69–72 (1990).
Markram, H., Lübke, J., Frotscher, M. & Sakmann, B. Regulation of synaptic efficacy by coincidence of postsynaptic APs and EPSPs. Science 275, 213–215 (1997).
Paulsen, O. & Sejnowski, T. J. Natural patterns of activity and long-term synaptic plasticity. Curr. Opin. Neurobiol. 10, 172–180 (2000).
Sjöström, P. J., Turrigiano, G. G. & Nelson, S. B. Rate, timing, and cooperativity jointly determine cortical synaptic plasticity. Neuron 32, 1149–1164 (2001).
Letzkus, J. J., Kampa, B. M. & Stuart, G. J. Learning rules for spike timing-dependent plasticity depend on dendritic synapse location. J. Neurosci. 26, 10420–10429 (2006).
Kampa, B., Letzkus, J. & Stuart, G. Requirement of dendritic calcium spikes for induction of spike-timing-dependent synaptic plasticity. J. Physiol. 574, 283–290 (2006).
Sjöström, P. J. & Häusser, M. A cooperative switch determines the sign of synaptic plasticity in distal dendrites of neocortical pyramidal neurons. Neuron 51, 227–238 (2006).
Gambino, F. et al. Sensory-evoked LTP driven by dendritic plateau potentials in vivo. Nature 515, 116–119 (2014).
Seol, G. H. et al. Neuromodulators control the polarity of spike-timing-dependent synaptic plasticity. Neuron 55, 919–929 (2007).
Gerstner, W., Lehmann, M., Liakoni, V., Corneil, D. & Brea, J. Eligibility traces and plasticity on behavioral time scales: experimental support of neoHebbian three-factor learning rules. Front. Neural Circuits 12, 53 (2018).
Roelfsema, P. R. & Holtmaat, A. Control of synaptic plasticity in deep cortical networks. Nat. Rev. Neurosci. 19, 166–180 (2018).
Williams, R. J. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8, 229–256 (1992).
Werfel, J., Xie, X. & Seung, H. S. Learning curves for stochastic gradient descent in linear feedforward networks. Neural Comput. 17, 2699–2718 (2005).
Lillicrap, T. P., Santoro, A., Marris, L., Akerman, C. J. & Hinton, G. Backpropagation and the brain. Nat. Rev. Neurosci. 21, 335–346 (2020).
Richards, B. A. et al. A deep learning framework for systems neuroscience. Nat. Neurosci. 22, 1761–1770 (2019).
Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning representations by back-propagating errors. Nature 323, 533–536 (1986).
Larkum, M. E., Zhu, J. & Sakmann, B. A new cellular mechanism for coupling inputs arriving at different cortical layers. Nature 398, 338–341 (1999).
Markram, H., Wang, Y. & Tsodyks, M. Differential signaling via the same axon of neocortical pyramidal neurons. Proc. Natl Acad. Sci. USA 95, 5323–5328 (1998).
Nevian, T. & Sakmann, B. Spine Ca2+ signaling in spike-timing-dependent plasticity. J. Neurosci. 26, 11001–11013 (2006).
Froemke, R. C., Tsay, I. A., Raad, M., Long, J. D. & Dan, Y. Contribution of individual spikes in burst-induced long-term synaptic modification. J. Neurophysiol. 95, 1620–1629 (2006).
Bell, C. C., Caputi, A., Grant, K. & Serrier, J. Storage of a sensory pattern by anti-Hebbian synaptic plasticity in an electric fish. Proc. Natl Acad. Sci. USA 90, 4650–4654 (1993).
Bol, K., Marsat, G., Harvey-Girard, E., Longtin, A. & Maler, L. Frequency-tuned cerebellar channels and burst-induced LTD lead to the cancellation of redundant sensory inputs. J. Neurosci. 31, 11028–11038 (2011).
Richards, B. A. & Lillicrap, T. P. Dendritic solutions to the credit assignment problem. Curr. Opin. Neurobiol. 54, 28–36 (2019).
Brandalise, F. & Gerber, U. Mossy fiber-evoked subthreshold responses induce timing-dependent plasticity at hippocampal CA3 recurrent synapses. Proc. Natl Acad. Sci. USA 111, 4303–4308 (2014).
Kayser, C., Montemurro, M. A., Logothetis, N. K. & Panzeri, S. Spike-phase coding boosts and stabilizes information carried by spatial and temporal spike patterns. Neuron 61, 597–608 (2009).
Herzfeld, D. J., Kojima, Y., Soetedjo, R. & Shadmehr, R. Encoding of action by the Purkinje cells of the cerebellum. Nature 526, 439–442 (2015).
Naud, R. & Sprekeler, H. Sparse bursts optimize information transmission in a multiplexed neural code. Proc. Natl Acad. Sci. USA 115, 6329–6338 (2018).
Burbank, K. S. Mirrored STDP implements autoencoder learning in a network of spiking neurons. PLoS Comput. Biol. 11, e1004566 (2015).
Akrout, M., Wilson, C., Humphreys, P. C., Lillicrap, T. & Tweed, D. Using weight mirrors to improve feedback alignment. Preprint at arXiv https://arxiv.org/abs/1904.05391 (2019).
Murayama, M. et al. Dendritic encoding of sensory stimuli controlled by deep cortical interneurons. Nature 457, 1137–1141 (2009).
Körding, K. P. & König, P. Supervised and unsupervised learning with two sites of synaptic integration. J. Comput. Neurosci. 11, 207–215 (2001).
Granseth, B., Ahlstrand, E. & Lindström, S. Paired pulse facilitation of corticogeniculate EPSCs in the dorsal lateral geniculate nucleus of the rat investigated in vitro. J. Physiol. 544, 477–486 (2002).
Sherman, S. M. Thalamocortical interactions. Curr. Opin. Neurobiol. 22, 575–579 (2012).
Meredith, R. M., Floyer-Lea, A. M. & Paulsen, O. Maturation of long-term potentiation induction rules in rodent hippocampus: role of GABAergic inhibition. J. Neurosci. 23, 11142–11146 (2003).
Inglebert, Y., Aljadeff, J., Brunel, N. & Debanne, D. Synaptic plasticity rules with physiological calcium levels. Proc. Natl Acad. Sci. USA 117, 33639–33648 (2020).
Kampa, B. M. & Stuart, G. J. Calcium spikes in basal dendrites of layer 5 pyramidal neurons during action potential bursts. J. Neurosci. 26, 7424–7432 (2006).
Doron, G. et al. Perirhinal input to neocortical layer 1 controls learning. Science 370, eaaz3136 (2020).
Mäki-Marttunen, T., Iannella, N., Edwards, A. G., Einevoll, G. & Blackwell, K. T. A unified computational model for cortical post-synaptic plasticity. eLife 9, e55714 (2020).
Bienenstock, E. L., Cooper, L. N. & Munro, P. W. Theory for the development of neuron selectivity: orientation specificity and binocular interaction in visual cortex. J. Neurosci. 2, 32–48 (1982).
Xu, N.-L. et al. Nonlinear dendritic integration of sensory and motor input during an active sensing task. Nature 492, 247–251 (2012).
Felleman, D. J. & van Essen, D. C. Distributed hierarchical processing in the primate cerebral cortex. Cereb. Cortex 1, 1–47 (1991).
Sacramento, J., Costa, R. P., Bengio, Y. & Senn, W. Dendritic cortical microcircuits approximate the backpropagation algorithm. Adv. Neural Inf. Process. Syst. 31, 8721–8732 (2018).
Bartunov, S. et al. Assessing the scalability of biologically-motivated deep learning algorithms and architectures. Adv. Neural Inf. Process. Syst. 31, 9368–9378 (2018).
Boerlin, M., Machens, C. K. & Denève, S. Predictive coding of dynamical variables in balanced spiking networks. PLoS Comput. Biol. 9, e1003258 (2013).
Petreanu, L., Mao, T., Sternson, S. M. & Svoboda, K. The subcellular organization of neocortical excitatory connections. Nature 457, 1142–1145 (2009).
Ren, S.-Q., Li, Z., Lin, S., Bergami, M. & Shi, S.-H. Precise long-range microcircuit-to-microcircuit communication connects the frontal and sensory cortices in the mammalian brain. Neuron 104, 385–401.e3 (2019).
Golding, N. L., Staff, N. P. & Spruston, N. Dendritic spikes as a mechanism for cooperative long-term potentiation. Nature 418, 326–331 (2002).
Wang, X. et al. Feedforward excitation and inhibition evoke dual modes of firing in the cat’s visual thalamus during naturalistic viewing. Neuron 55, 465–478 (2007).
Owen, S. F., Berke, J. D. & Kreitzer, A. C. Fast-spiking interneurons supply feedforward control of bursting, calcium, and plasticity for efficient learning. Cell 172, 683–695 (2018).
Zenke, F. & Gerstner, W. Limits to high-speed simulations of spiking neural networks using general-purpose computers. Front. Neuroinf. 8, 76 (2014).
Bittner, K. C., Milstein, A. D., Grienberger, C., Romani, S. & Magee, J. C. Behavioral time scale synaptic plasticity underlies CA1 place fields. Science 357, 1033–1036 (2017).
Tremblay, R., Lee, S. & Rudy, B. GABAergic interneurons in the neocortex: from cellular properties to circuits. Neuron 91, 260–292 (2016).
Nigro, M. J., Hashikawa-Yamasaki, Y. & Rudy, B. Diversity and connectivity of layer 5 somatostatin-expressing interneurons in the mouse barrel cortex. J. Neurosci. 38, 1622–1633 (2018).
Hilscher, M. M., Leão, R. N., Edwards, S. J., Leão, K. E. & Kullander, K. Chrna2-Martinotti cells synchronize layer 5 type A pyramidal cells via rebound excitation. PLoS Biol. 15, e2001392 (2017).
Naud, R., Marcille, N., Clopath, C. & Gerstner, W. Firing patterns in the adaptive exponential integrate-and-fire model. Biol. Cybern. 99, 335–347 (2008).
Packer, A. M. & Yuste, R. Dense, unspecific connectivity of neocortical parvalbumin-positive interneurons: a canonical microcircuit for inhibition? J. Neurosci. 31, 13260–13271 (2011).
De Kock, C. P. J. & Sakmann, B. High frequency action potential bursts (>100 Hz) in L2/3 and L5B thick tufted neurons in anaesthetized and awake rat primary somatosensory cortex. J. Physiol. 586, 3353–3364 (2008).
Womelsdorf, T., Ardid, S., Everling, S. & Valiante, T. A. Burst firing synchronizes prefrontal and anterior cingulate cortex during attentional control. Curr. Biol. 24, 2613–2621 (2014).
Costa, R. P., Sjöström, P. J. & Van Rossum, M. C. W. Probabilistic inference of short-term synaptic plasticity in neocortical microcircuits. Front. Comput. Neurosci. 7, 75 (2013).
Samadi, A., Lillicrap, T. P. & Tweed, D. B. Deep learning with dynamic spiking neurons and fixed feedback weights. Neural Comput. 29, 578–602 (2017).
Guerguiev, J., Lillicrap, T. P. & Richards, B. A. Towards deep learning with segregated dendrites. eLife 6, e22901 (2017).
Lee, D.-H., Zhang, S., Fischer, A. & Bengio, Y. Difference target propagation. In Proc. Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ed. Hutter, F. et al.) 498–515 (Springer, 2015).
Liao, Q., Leibo, J. Z. & Poggio, T. How important is weight symmetry in backpropagation? In Proc. Thirtieth AAAI Conference on Artificial Intelligence (ed. Schuurmans, D. et al.) 1837–1844 (AAAI, 2016).
Xiao, W., Chen, H., Liao, Q. & Poggio, T. Biologically-plausible learning algorithms can scale to large datasets. Preprint at arXiv https://arxiv.org/abs/1811.03567 (2018).
Lillicrap, T. P., Cownden, D., Tweed, D. B. & Akerman, C. J. Random synaptic feedback weights support error backpropagation for deep learning. Nat. Commun. 7, 13276 (2016).
Scellier, B. & Bengio, Y. Towards a biologically plausible backprop. Preprint at arXiv https://arxiv.org/abs/1602.05179v5 (2016).
Amit, Y. Deep learning with asymmetric connections and Hebbian updates. Front. Comput. Neurosci. https://doi.org/10.3389/fncom.2019.00018 (2019).
Whittington, J. C. R. & Bogacz, R. Theories of error back-propagation in the brain. Trends Cogn. Sci. 23, 235–250 (2019).
Mostafa, H., Ramesh, V. & Cauwenberghs, G. Deep supervised learning using local errors. Front. Neurosci. 12, 608 (2018).
Nokland, A. Direct feedback alignment provides learning in deep neural networks. Adv. Neural Inf. Process. Syst. 29, 1037–1045 (2016).
Lansdell, B. J., Prakash, P. R. & Kording, K. P. Learning to solve the credit assignment problem. Preprint at arXiv https://arxiv.org/abs/1906.00889v4 (2019).
Pozzi, I., Bohté, S. & Roelfsema, P. A biologically plausible learning rule for deep learning in the brain. Preprint at arXiv https://arxiv.org/abs/1811.01768 (2018).
Laborieux, A. et al. Scaling equilibrium propagation to deep convnets by drastically reducing its gradient estimator bias. Front. Neurosci. 15, 129 (2021).
Kolen, J. F. & Pollack, J. B. Backpropagation without weight transport. In Proc. 1994 IEEE International Conference on Neural Networks (ICNN’94) Vol. 3, 1375–1380 (IEEE, 1994).
Krizhevsky, A., Nair, V. & Hinton, G. CIFAR-10 (Canadian Institute for Advanced Research) Technical Report (Univ. Toronto, 2009).
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K. & Fei-Fei, L. ImageNet: a large-scale hierarchical image database. In Proc. CVPR09 248–255 (IEEE, 2009).
He, K., Zhang, X., Ren, S. & Sun, J. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In Proc. IEEE International Conference on Computer Vision 1026–1034 (IEEE, 2015).
Glorot, X. & Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. In Proc. Thirteenth International Conference on Artificial Intelligence and Statistics (ed. Whye Teh, Y. et al.) 249–256 (Society for Artificial Intelligence and Statistics, 2010).
Acknowledgements
We thank A. Santoro and L. Maler for comments on this manuscript. We also thank M. Hilscher and M.J. Nigro for sharing data about SOM+ neurons. In addition, we thank T. Mesnard for helping with the development of the rate-based model. This work was supported by two NSERC Discovery grants (to R.N., no. 06872 and to B.A.R., no. 04947), a CIHR Project grant (no. RN383647-418955), a Fellowship from the CIFAR Learning in Machines and Brains Program (to B.A.R.), an Ontario Early Researcher Award (to B.A.R., no. ER 17-13-242), a Healthy Brains, Healthy Lives New Investigator Start-up (to B.A.R., no. 2b-NISU-8) and the Novartis Research Foundation (to F.Z.).
Author information
Contributions
All authors contributed to the burst-dependent learning rule. A.P., F.Z. and R.N. designed the spiking simulations. A.P. performed the spiking simulations. J.G. designed the recurrent plasticity rule and performed the numerical experiments on CIFAR-10 and ImageNet. B.A.R. and R.N. wrote the manuscript, with contributions from J.G. and A.P. B.A.R. and R.N. cosupervised the project.
Ethics declarations
Competing interests
R.N., B.A.R. and A.P. have a provisional patent application for a neuromorphic implementation of the algorithm described in this article. The other authors declare no competing interests.
Additional information
Peer review information Nature Neuroscience thanks Gabriel Kreiman, Panayiota Poirazi and Nelson Spruston for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Effects of population size, randomized examples and absence of hidden-layer plasticity on the XOR task.
a, Comparison of costs for the XOR task. In blue is the cost for the network in Fig. 4 in the main text, but with 2000 neurons per population and slightly different parameter values. The dot-dashed pink line is for when the examples are randomly selected within an epoch. The dotted red line has no plasticity in the hidden layer. The dashed green line is for 400 neurons per population. b-e, Output event rate (ER) after learning. The dashed grey line separates ‘true (1)’ and ‘false (0)’ for the XOR. Only in c is XOR not solved.
Extended Data Fig. 2 Impact of different time scales on the XOR task.
a, Comparison of costs for when the duration of examples T (in s) (dashed green line) and the moving average time constant τavg (in s) (dotted orange line) are changed with respect to the values used in Fig. 4 (solid blue). b, Output event rate (ER) after learning for the three cases in panel a. The dashed grey line separates ‘true (1)’ and ‘false (0)’ for the XOR.
Extended Data Fig. 3 Learning XOR with symmetric feedback pathways.
a, Schematic diagram illustrating the symmetric feedback (⊕ and ⊕). b, Output-layer activity for the XOR task. Note that the XOR task is still solved. Only a single realization is displayed here. (ci-cii) The symmetric feedback yields very similar representations at the hidden layer.
Extended Data Fig. 4 Dynamics of the time-dependent rate model while learning MNIST.
a, Schematic of the network. The enlarged hidden layer population stresses the fact that the burst rate is equal to the event rate times the burst probability, with the event and burst probability nonlinearly integrating the feedforward and feedback signals, respectively. b, Example event rates (i, iii, v) and weights (ii, iv) for two consecutive examples during the first epoch. In (i), the teacher is illustrated as a dashed line. Learning intervals are indicated by light green vertical bars. c, Burst probabilities (i, iii) and differences of burst probabilities (ii, iv) for the same examples as in b.
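The relationship stated in panel a (burst rate equals event rate times burst probability) can be sketched in a few lines. This is a minimal illustration, not the authors' model: the logistic form of both nonlinearities, the function name `burst_rate` and the parameter `max_event_rate` are assumptions introduced here.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, used here as a stand-in nonlinearity (assumption)."""
    return 1.0 / (1.0 + math.exp(-x))


def burst_rate(feedforward_input: float,
               feedback_input: float,
               max_event_rate: float = 1.0):
    """Return (burst_rate, event_rate, burst_probability) for one unit.

    The event rate depends only on the feedforward signal and the burst
    probability only on the feedback signal, as described in the caption.
    The specific sigmoid shapes are hypothetical.
    """
    event_rate = max_event_rate * sigmoid(feedforward_input)
    burst_probability = sigmoid(feedback_input)
    return event_rate * burst_probability, event_rate, burst_probability


# Same feedforward drive, different feedback: only the burst fraction changes.
weak_feedback = burst_rate(0.0, -2.0)
strong_feedback = burst_rate(0.0, 2.0)
print(weak_feedback, strong_feedback)
```

In this sketch the event rate is identical for both calls, so any difference in burst rate reflects the feedback signal alone.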
Extended Data Fig. 5 Network mechanisms regulating the bursting nonlinearity.
All panels display the burst probability of a large population of two-compartment pyramidal neurons as a function of the intensity of the injected dendritic current. The insets illustrate the microcircuit - including the PV-like neurons (disks) and the SOM-like neurons (inverted triangles) - and the parameter that is being modified is indicated by a colored circuit element. Increasing color intensities corresponds to increasing values of the parameter. a, Increasing the strength of inhibitory synapses from SOM neurons onto the pyramidal neurons’ dendrites produces divisive burst probability control. b, Disinhibiting the pyramidal neurons’ dendrites by applying a hyperpolarizing current to the SOM neurons - mimicking inhibition from the VIP neurons - increases the slope. c, Increasing the probability of release onto SOM neurons produces a small divisive gain modulation. d, Increasing the dendritic excitability by increasing the strength of the regenerative dendritic activity produces an additive gain control.
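To make the distinction between divisive and additive control concrete, the sketch below applies both to a generic sigmoidal burst-probability curve. It assumes one common reading of these terms (divisive: the whole curve is scaled down by a factor; additive: the curve is shifted along the input axis). The logistic curve and the parameter names `divisive_factor` and `additive_offset` are hypothetical and are not taken from the simulations behind the figure.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function used as a generic bursting nonlinearity (assumption)."""
    return 1.0 / (1.0 + math.exp(-x))


def burst_probability(dendritic_current: float,
                      divisive_factor: float = 1.0,
                      additive_offset: float = 0.0) -> float:
    """Burst probability as a function of injected dendritic current.

    divisive_factor > 1 scales the whole curve down (divisive control);
    additive_offset shifts the curve along the current axis (additive control).
    Both parameters are illustrative, not fitted to the paper's circuits.
    """
    if divisive_factor <= 0.0:
        raise ValueError("divisive_factor must be positive")
    return sigmoid(dendritic_current + additive_offset) / divisive_factor


currents = [-2.0, 0.0, 2.0]
baseline = [burst_probability(i) for i in currents]
divided = [burst_probability(i, divisive_factor=2.0) for i in currents]
shifted = [burst_probability(i, additive_offset=1.0) for i in currents]
print(baseline, divided, shifted)
```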
Extended Data Fig. 6 The bursting nonlinearity controls the learning rate.
a, Schematic of the network. Each hidden layer had 500 units. The recurrent weights (Z(1) and Z(2)) and the feedback alignment weights (Y(1) and Y(2)) are explicitly represented. b, Angle between the weight updates W(1) in the standard backpropagation algorithm and in burstprop for the MNIST digit recognition task. The angle is displayed for different values of the slope of the dendritic nonlinearity (β). Results are displayed as the mean ± standard deviation over 10 realizations with randomly initialized weights.
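The angle in panel b compares two weight-update matrices of the same shape. A generic way to compute such an angle is to flatten both matrices and take the arccosine of their normalized dot product. The sketch below shows that calculation; the function name `angle_between_updates` and the toy matrices are illustrative assumptions, and the authors' code may compute the angle differently.

```python
import math


def angle_between_updates(update_a, update_b) -> float:
    """Angle in degrees between two weight-update matrices (lists of rows)."""
    flat_a = [value for row in update_a for value in row]
    flat_b = [value for row in update_b for value in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("updates must have the same number of entries")

    dot = sum(a * b for a, b in zip(flat_a, flat_b))
    norm_a = math.sqrt(sum(a * a for a in flat_a))
    norm_b = math.sqrt(sum(b * b for b in flat_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("angle is undefined for an all-zero update")

    # Clamp to [-1, 1] to guard against floating-point round-off.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cosine))


# Toy updates (hypothetical values): the second is a noisy copy of the first.
backprop_update = [[1.0, 2.0], [3.0, 4.0]]
approximate_update = [[1.1, 1.8], [3.2, 3.9]]
print(angle_between_updates(backprop_update, approximate_update))
```

An angle below 90 degrees means the two updates point in broadly the same direction in weight space; smaller angles indicate closer agreement.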
Extended Data Fig. 7 Linearity of feedback signals degrades with depth in deep convolutional network trained on ImageNet.
Each plot shows the change in burst probability of a unit in hidden layer l, Δpl, as the burst probability at the output layer, p8, is changed by Δp8 (n=1000), along with the Pearson’s correlation coefficient and two-tailed p-value (blue, top), as well as a random sample of 2000 burst probabilities after presentation of an input image (red, bottom).
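The linearity reported in each plot is summarized by a Pearson correlation coefficient between the output-layer change and the hidden-layer change. A minimal standard-library sketch of that coefficient follows; the sample values are made up for illustration, and the two-tailed p-value mentioned in the caption is not computed here.

```python
import math


def pearson_r(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0.0 or spread_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical changes in burst probability at the output and a hidden layer.
delta_p_output = [0.00, 0.01, 0.02, 0.03, 0.04]
delta_p_hidden = [0.000, 0.004, 0.009, 0.011, 0.017]
print(pearson_r(delta_p_output, delta_p_hidden))
```

A coefficient near 1 indicates that the hidden-layer change is close to a linear function of the output-layer change.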
Extended Data Fig. 9 The variance of the burst probability decreases during learning.
a, Variance of the burst probability as a function of the epoch for the MNIST task, for each layer in a network with 3 hidden layers with 500 units each. b, Variance of the burst probability as a function of the test error, showing that the magnitude of the variance is correlated with the test error.
Supplementary information
Supplementary Information
Supplementary Text, Tables 1–4 and Supplementary Figs. 1 and 2.
Cite this article
Payeur, A., Guerguiev, J., Zenke, F. et al. Burst-dependent synaptic plasticity can coordinate learning in hierarchical circuits. Nat Neurosci 24, 1010–1019 (2021). https://doi.org/10.1038/s41593-021-00857-x
DOI: https://doi.org/10.1038/s41593-021-00857-x