Introduction

Neurons in the brain display a variety of temporal discharging patterns, among which bursting represents the generation of multiple spikes with brief inter-spike intervals (typically several milliseconds) in a short period of time (typically, several tens to hundreds of milliseconds). Bursting neurons are found ubiquitously in the brain and are thought to play active roles in transferring and routing information1,2,3,4,5, inducing synaptic plasticity6,7, and supporting and/or altering cognitive functions2,7,8,9,10,11,12,13,14. Altered neuronal bursting has been implicated in neurodegenerative disorders15 and depression16. While our understanding of the roles of bursting has been advanced, the computational advantages of spike bursts over isolated spikes remain elusive.

Here, we show the benefits of bursting activity in learning sequences generated by a special class of random walks observed in various animal behaviors. We investigate whether and how bursting neurons improve the ability of neural network models to learn the dynamical trajectories of Lévy flight, which is a random walk with step sizes obeying a heavy-tailed distribution17,18,19. As a consequence, Lévy flight consists of many short steps and rare long-distance jumps. A well-known characteristic of Lévy flight is that it makes search more efficient than Brownian walks which only consist of relatively short steps20,21. Many processes observed in biology22,23,24 and physics25,26 can be described as Lévy flight. In neuroscience, an interesting example of Lévy flight is the stochastic trajectories of saccadic eye movement27 on which the visual exploration of the objects of interest significantly relies. Several cortical and subcortical regions including the frontal eye field, superior colliculus, and cerebellar cortex participate in controlling and executing saccades28 and various neurons show spike bursts in these regions8,9,29,30. Propagation of gamma-frequency (\(\sim\) 40 Hz) bursts of the local field potentials also obeys Lévy flight in the middle temporal cortex of marmosets, which is engaged in visual motion processing31. Other examples of Lévy flight are found in memory processing of animals. In the spatial exploration of rodents, the animal spends the majority of time for exploring small local areas but occasionally travels to distant places at greater speeds32. Hippocampal10 and subicular33 neurons can learn spatial receptive fields and are known to exhibit burst firing. In human subjects, memory recall can be viewed as foraging behavior obeying Lévy flight34,35,36. The appearance of Lévy flight in various types of foraging behavior and the participation of bursting neurons in the relevant brain regions motivate us to explore what benefits neuronal bursting brings to the learning and execution of such behavior.

For this purpose, we employ reservoir computing (RC) that uses a recurrent network model and FORCE learning of information-readout neurons for efficient learning of time-varying external signals (i.e., teaching signals)37. Originally, RC and FORCE learning were formulated for rate-coding neurons, and FORCE learning of continuous dynamical trajectories is generally fast. The RC system was also quite successful in modeling neural activities recorded from various cortical areas38,39,40,41. Later, RC was extended to networks of spiking neurons42,43, and variants of FORCE learning or some other learning method44 for spiking neurons have also been proposed45,46,47. Results of the previous studies have indicated that isolated spikes are sufficient for learning smooth trajectories. However, whether and how isolated spikes and bursts contribute differently to learning a more general class of sequences has not been explored. In this study, we clarify this by using a spiking-neuron version of FORCE learning for training an RC system of bursting neurons.

Results

Our model follows the conventional framework of reservoir computing except that neurons constituting a recurrent network called reservoir have regular-spiking (RS) and bursting modes. Neurons in the reservoir project to two readout neurons to describe the two-dimensional coordinates \((x_1, x_2)\) of Lévy flight, and the outputs of these neurons are fed back to all neurons in the reservoir (Fig. 1a). The weights of readout connections are modified based on the FORCE learning extended to spiking neurons47. In the RS mode, the neurons tend to generate isolated spikes (Fig. 1b) whereas they are strongly bursty in the bursting mode (Fig. 1c). See “Methods” for the details of the network model and construction of Lévy flight.

Figure 1
figure 1

The architecture and basic performance of the model. (a) The present RC system consists of a reservoir and two readout neurons. (b, c) Before learning, neurons in the reservoir tend to show isolated spikes in the RS mode (b) or intrinsic bursts in the bursting mode (c). Here, the firing patterns were simulated at \(G=50\). (d) A typical example of the target trajectories representing a finite portion of two-dimensional Lévy flight (orange) and the learned responses of the readout neurons (blue). (e) The time evolution of the two readout neurons are shown as functions of time. The target signal was lifted after the time point indicated by the vertical dashed line. Large discontinuous jumps in (d) and (e) indicate the start and end points of the repeated target signal. These jumps were not included in the target signal and hence not learned by the model.

During learning, the model was repeatedly exposed to a periodic target signal representing the repetition of a finite portion of Lévy flight trajectories. The model can learn these trajectories in either RS or bursting mode. Stochastic jumps in the trajectory are thought to be difficult for the model to accurately learn. As we will show later, the accuracy and speed of learning significantly depend on the mode of firing. Figure 1d displays an example of the time-varying output of the two readout neurons after the model learned a target signal in the bursting mode. As expected, the output of the model tends to deviate largely from the target trajectory when it shows relatively large jumps. Nonetheless, overall the model well replicates the target trajectory in the burst mode even after the learning process is turned off. The agreement between the target trajectory and the model’s output is more clearly visible in the time evolution of the variables \(x_1\) and \(x_2\) (Fig. 1e).

Figure 2
figure 2

Learning in the burst vs. RS modes. (a) Errors in the bursting and RS modes are plotted against the strength of recurrent connections after 25 learning trials. Error bars show the standard deviations. (b) Similar errors are plotted after 50 learning trials. (c) The time courses of errors during learning are shown for the optimal coupling strengths of the individual modes. (d, e) Similar time courses are shown in the bursting (d) and RS (e) modes for target signals of lengths 400, 800, and 1200 ms. The plots for 400 ms are copied from (c). (f) Error time courses are shown in the bursting mode when the target length is 1200 ms and the reservoir size is 1000 or 2000.

We quantitatively compare the performance of the model in learning between the bursting and RS modes. The strength of synaptic connections that gives an optimal performance may differ in the individual modes. To make a fair comparison, we first search an optimal coupling strength that minimizes the error in each mode. We calculate the average errors between a target trajectory and an actual output in the bursting mode and the RS mode as a function of the connection strength G. Figure 2a,b show the average errors of the trajectories obtained in trial-25 and trial-50 of learning, respectively, when the target length is 400 ms. The errors during trial-25 or trial-50 were first temporally averaged, and then the average and standard deviation of the resultant temporal averages were calculated over 20 simulation runs with different initial settings of neural states and synaptic weights. For each value of G, the standard deviations of the average error are calculated over these 20 simulations. As we can see from the figures, the error is minimized for relatively weak connections (\(G \sim\) 50) in the bursting mode. In contrast, the model achieves the least error at much stronger connections (\(G\sim\) 170) in the RS mode. The minimum average error is slightly smaller in the bursting mode than in the RS mode although the error sizes are not greatly different between the two modes after 50 cycles of training (Fig. 2b). Lyapunov exponents indicate that the initial network state was weakly non-chaotic in the RS mode and weakly chaotic in the (optimal) bursting mode (Supplementary Fig. 1). Similar differences in learning behavior are observed between the RS mode and bursting mode for another choice of the parameters of Lévy flight (Supplementary Fig. 2). Given these results, one might conclude that spike bursts have little advantage over isolated spikes in the present sequence learning task.

However, the results presented in Fig. 2a,b reveal two intriguing differences in learning between the RS mode and the bursting mode. First, while the two modes yield approximately the same minimum values of average errors (see “Methods”), the bursting mode yields a much smaller variance at the minimum error than the RS mode. In particular, Fig. 2a demonstrates that the variance almost vanishes for the optimal range of G values after 25 training cycles in the bursting mode. This is not the case for the optimal range of G values in the RS mode. Second and more importantly, the average error decreases much faster during learning in the bursting mode than in the RS mode, showing impressively different learning speeds between the two modes (Fig. 2c). Generally, the FORCE learning enables rapid learning of a smooth target trajectory even if the trajectory is chaotic37. However, our results show that the FORCE learning with isolated spikes requires several tens of trials for learning a target trajectory representing random walks of Lévy flight. In strong contrast, spike bursts enable the same rule to learn such a target trajectory at a similar accuracy within only ten trials. The merits of bursting are also suggested by the common observation that the individual neurons tend to generate spike bursts after learning at the corresponding optimal coupling strength irrespective of the mode (Supplementary Fig. 3).

As the length of target trajectories is increased, performance in sequence learning is degraded in both modes. However, the superiority of the bursting mode over the RS mode in rapid sequence learning remains (Fig. 2d,e). We note that the absolute values of the error are not really meaningful. These values become smaller as we include more neurons in the reservoir (Fig. 2f).

Now, we investigate why and how spike bursts improve the performance of the network model in learning trajectories of Lévy flight. We show that synchronized bursting of neurons plays an active role in the present sequence learning. Figure 3a shows the time evolution of a portion of the learned trajectory \(x_1(t)\) and \(x_2(t)\) with vertical dashed lines indicating the times of long-distance flights. Here, a long-distance flight refers to a step \((\Delta x_1, \Delta x_2)\) of which the length \(\sqrt{\Delta x_1(t)^2+ \Delta x_2(t)^2}\) is greater than 0.16, which approximately corresponds to the top 5% of long-distance jumps. In Fig. 3b,c, we show spike raster of 100 bursting neurons chosen randomly from the reservoir during the corresponding period of time before and after learning, respectively. While there are many neurons that rarely fire, some neurons intermittently generate brief (\(\sim\) 30 ms) to prolonged (\(\sim\) 150 ms) high-frequency bursts. The individual neurons change their firing patterns before and after learning, but the distributions of inter-spike intervals at the population level remain almost unchanged during learning (Supplementary Fig. 4a,b).

Figure 3
figure 3

Temporal coordination of bursts by learning. (a) A two-dimensional target trajectory shows big jumps at the times indicated by vertical dashed lines. (b) Spike raster of 100 neurons sampled randomly from the reservoir before learning. (c) Spike raster is shown for the same neurons after learning. (d,e) Distributions of the onset and end times of bursts around the times of big jumps are calculated before (d) and after (e) learning.

However, visual inspection of the spike raster suggests that many neurons start or stop generating spike bursts around the times of large flights after learning and that such a tendency is weak before learning. Therefore, assuming that spikes with their inter-spike intervals shorter than 6 ms belonged to a burst, we identified the onsets and end times of bursts of individual neurons and calculated the distributions of the onset/end times of bursts relative to the times of the nearest large jumps (i.e., the times of burst onsets/ends minus the times of the nearest neighbor large flights) before (Fig. 3d) and after learning (Fig. 3e). The threshold of 6 ms was determined from a gap in the inter-spike interval distribution (Supplementary Fig. 3b2). Intriguingly, the post-learning distributions exhibited sharp peaks around the origin of the axis for the relative time. The relative times of burst onsets show a particularly prominent peak. These results reveal that the RC system operating in the bursting mode learns the target trajectory of Lévy flight by shifting the times of bursts close to the occurrence times of large jumps. In other words, the RC system synchronizes bursting of the individual neurons around the times of large jumps. This synchronization of bursts is thought to advantage recurrent networks of bursting neurons in learning of sequences that involve abrupt changes in the trajectories.

The above results shown for the bursting mode and the bursting of many neurons after learning (Supplementary Fig. 3) suggest that bursting also plays a similar role for learning in the RS mode. We examined this possibility by investigating learning performance in the RS mode for different coupling strengths: \(G=\) 50, 100 and 150. Before learning, the majority of neurons showed regular spiking for \(G=50\) whereas a larger portion of neurons had bursting patterns for \(G=100\) and 150 (Fig. 4a, top). In all three cases, the number of bursting neurons was increased and the error was decreased after learning (Fig. 4a, bottom). Interestingly, learning with a larger value of G reduced the error more efficiently (Fig. 4b). In addition, the number of synchronous bursting near the onset of large jumps increased more prominently as the value of G was increased (Fig. 4c). These results show that the network set in the RS mode also develops bursting states to improve the accuracy of learning.

Figure 4
figure 4

Development of bursting states in the RS mode. The inter-spike interval distributions (a), learning curves (b) and distributions of burst timing (c) were calculated in the network models of RS-type neurons for three different values of coupling strength.

We further examined how synchronous neuronal bursting evolves during the progress of learning in the optimal model for the RS mode (\(G=170\)). As shown previously, this model has a tendency of bursting even before learning (Supplementary Fig. 4a). Synchronous bursting was not prominent at the trial-10 of learning but became prominent sometime between trial 10 and trial 20 (Fig. 5). Intriguingly, during this period the error was decreased to a similar magnitude to the minimum error of the optimal model for the bursting mode (c.f. Fig. 2c). The results indicate the pivotal role of burst synchronization in the present sequence learning task.

Figure 5
figure 5

Evolution of synchronous bursting in the RS mode. (top) Learning curve, (middle) the distributions of inter-spike intervals and (bottom) the distributions of burst timing are shown at trials 10, 20 and 30 for the optimally tuned model of the RS mode.

Discussion

We have trained an RC system of spiking neurons on a difficult sequence learning task where the target sequence represents random walks. FORCE learning can project the neural population activity of the reservoir quickly onto a target trajectory for a wide range of continuous trajectories including chaotic ones. This fast convergence of learning is a merit of RC, making RC useful for various practical applications. However, when a target trajectory consists of abrupt steps including long-distance jumps, as was the case in Lévy flight, FORCE learning with isolated spikes requires a large number of trials for minimizing the error signal. In contrast, the same learning rule can rapidly minimize the error by aligning the onsets as well as the end times of bursts in the neighborhoods of the times of long-distance jumps. This implies that the system synchronizes bursts of the individual neurons around these times. Such time-locked synchronization also emerges in the RS mode during learning. Moreover, the growth of synchronous bursting improves the performance of the trained model. This result suggests that the initial absence of synchronous bursts is the primary course of slow learning in the RS mode. Since the optimal model for the RS mode has a tendency of bursting before learning, a transition from the RS neuron to a bursting type is unlikely to be the primary course. Thus, the RC system can learn the Lévy flight trajectories more efficiently with bursts than with isolated spikes. Our model suggests that bursts contribute crucially to learning foraging-like cognitive behaviors.

Our results show an interesting qualitative agreement with some experimental observations. It has been known that the onsets of bursts in the saccade-related burst neurons are tightly linked to saccade onsets in the superior colliculus8,9. These neurons tend to discharge prior to a saccade if the movement is in their preferred direction, and their discharges follow rather than precede saccades for movements deviating from their preferred directions. Altough our model is far simpler compared to the actual neural circuits that control saccadic eye movements48, the sharp peak of burst onsets around the times of long-distance steps in Fig. 3e seems to be consistent with the characteristic behavioral correlates of the saccade-related burst neurons in the superior colliculus.

During spatial navigation, hippocampal place cells exhibit both bursts and isolated spikes3, and the different discharging patterns are thought to play distinct functional roles in the hippocampal memory processing3,11,49. The hippocampal area CA3, which has prominent recurrent excitatory connections, resembles a reservoir in this model. Furthermore, an abstract model of the entorhinal-hippocampal memory system accounted for the different statistical structures of hippocampal sequence generation, such as diffusive vs. Lévy flight-like random walks32. Therefore, the hippocampal circuits are of potential relevance to this study. However, the relationships between spatial information coding and the cells’ discharging patterns are not simple, depending on specific cell types and brain regions33,49. To our knowledge, whether CA3 neural population synchronizes their burst discharges around the times of long-distance runs of animals has not been known. On the other hand, it is known that bursts of CA3 neurons mostly occur in an inbound travel towards their receptive field centers10. Clarifying the distinct computational roles of isolated spikes and bursts to the hippocampal memory processing is an intriguing open question.

In summary, this study showed the advantages of bursting neuronal activity in rapid learning of dynamical trajectories obeying Lévy flight. Bursting is ubiquitously found in various regions of the brain, and previous studies suggest the active roles of bursts in robust spike propagation and induction of synaptic plasticity. Our results give a further insight into the unique role of bursts at the network-level learning and computation.

Methods

Neuron model

We describe neurons in the reservoir with the Izhikevich model, which is able to mimic the temporal discharging patterns of various neurons50:

$$\begin{aligned} \frac{dv_i}{dt}=&0.04v_i^2+5u_i+140-u_i+I_i, \nonumber \\ \frac{du_i}{dt}=&a(bv_i-u_i), (i=1, \ldots , N), \end{aligned}$$
(1)

where \(a=0.02\) and \(b=0.2\), i is a neuron index, and the number of neurons \(N=1000\). The values of \(v_i\) and \(u_i\) are reset to c and \(u_i+d\) when \(v_i\) reaches the threshold of 30 mV. We set \(c=-65\) mV and \(d=8\) in the RS mode and \(c=-50\) mV and \(d=2\) in the bursting mode. We use this model without taking refractory periods into account for simplicity of numerical simulations though some neurons may exhibit unrealistically high frequency bursting.

Synaptic current is given as \(I_i=s_i(t)+I_b\), where \(I_b\) is a constant bias and recurrent synaptic inputs are

$$\begin{aligned} s_i(t) =&\sum _{j=1}^N w_{ij}r_j(t), \end{aligned}$$
(2)
$$\begin{aligned} w_{ij} =&G w^0_{ij}+ Q\sum _{k=1,2}\eta _i^{(k)}\phi _j^{(k)}, \end{aligned}$$
(3)

in terms of the instantaneous firing rate \(r_i(t)\) of neuron i at time t. Throughout this study, we set \(I_b=10\). The synaptic weight matrix \(w_{ij}\) has non-modifiable components \(w^0_{ij}\) and modifiable components \(\phi _j^{(k)}\), with G and Q being constant parameters. The non-modifiable components have the connection probability \(p=0.1\) and their values are drawn from a normal distribution with mean 0 and variance \(1/\sqrt{Np^2}\). While \(Q=100\) throughout this paper, the value of G is mode-dependent, as shown later. The encoding parameter \(\eta _i^{(k)}\) (\(k=1,\ 2)\) is randomly drawn from the uniform distribution \([-1,+1]\). The linear decoder \(\phi ^{(k)}_i(t)\) determines activities of the readout units \(x^{(k)}(t)\):

$$\begin{aligned} x^{(k)}(t) = \sum _{j=1}^N \phi _j^{(k)}(t)r_j(t), (k=1, 2) \end{aligned}$$
(4)

which should approximate a given target trajectory.

FORCE learning

We used a straight-forward extension of the FORCE learning to spiking neurons47. A double exponential filter was used to low-pass filter the individual spikes of the i-th neuron in the reservoir:

$$\begin{aligned} {\dot{r}}_i= & {} -\frac{r_i}{\tau _d}+h_i, \end{aligned}$$
(5)
$$\begin{aligned} \tau _r {\dot{h}}_i= & {} -h_i + \frac{1}{\tau _d} \sum _{t_{ik}<t} \delta (t-t_{ik}), \end{aligned}$$
(6)

where \(\tau _r\) and \(\tau _d\) are the synaptic rise time and synaptic decay time, respectively. Values of these parameters were set as \(\tau _r=2\) ms and \(\tau _d= 20\) ms.

Using the error signals \(e^{(k)}(t)=f^{(k)}(t)-x^{(k)}(t)\), we update the decoders as follows:

$$\begin{aligned} \varvec{\phi }^{(k)}(t) =&\varvec{\phi }^{(k)}(t-\Delta t)-e^{(k)}(t)\mathbf{P}(t)\mathbf{r}(t), \end{aligned}$$
(7)
$$\begin{aligned} \mathbf{P}(t) =&\mathbf{P}(t-\Delta t) -\frac{\mathbf{P}(t-\Delta t) \mathbf{r}(t) \mathbf{r}(t)^T \mathbf{P}(t-\Delta t)}{1+\mathbf{r}(t)^T \mathbf{P}(t-\Delta t) \mathbf{r}(t)}. \end{aligned}$$
(8)

The initial conditions are given as \(\phi _j^{(k)}(0)=0\) and \(\mathbf{P}(0)=\mathbf{I}_N/ \lambda\), where \(\mathbf{I}_N\) is an N-dimensional identity matrix and \(\lambda =10\) for both regular and bursting modes. The performance of the model is evaluated by the average squared error between a target trajectory and the corresponding network output:

$$\begin{aligned} \mathrm{Error} = \sqrt{ \langle { e^{(1)}(t)^2+ e^{(2)}(t)^2 \rangle }_t}, \end{aligned}$$
(9)

where \(<\cdot >_t\) means averaging over time within the corresponding learning step.

Lévy flight

Trajectories obeying Lévy flight were generated by using the function, scipy.stats.levy_stable.rvs(), in the Scipy library of Python for scientific calculations. This function generates a series of random numbers that obey the Lévy distribution17,18. In short, a stable distribution has the characteristic function of the form,

$$\begin{aligned} \varphi (t;\alpha ,\beta ,c,\mu ) = e^{it\mu -|ct|^\alpha (1-i\beta \ \text {sign}(t) \Phi (\alpha ,t))}, \end{aligned}$$
(10)

where \(\alpha\), \(\beta\), c, and \(\mu\) are the characteristic exponent, skewness parameter, scale parameter, and location parameter, respectively, and

$$\begin{aligned} \Phi = {\left\{ \begin{array}{ll} \tan (\dfrac{\pi \alpha }{2}) &{} \alpha \ne 1 \\ -\dfrac{2}{\pi } \log |t| &{} \alpha = 1. \end{array}\right. } \end{aligned}$$
(11)

The probability density function for a stable distribution is given as

$$\begin{aligned} f(x; \alpha ,\beta ,c,\mu ) = \dfrac{1}{2\pi } \int ^{\infty }_{-\infty } \varphi (t; \alpha ,\beta ,c,\mu ) e^{-ixt} dt, \end{aligned}$$
(12)

where \(-\infty<x<\infty\). If we set \(c=1\) and \(\mu =0\), we obtain a class of long-tailed distributions in which Lévy distribution of a narrow sense is obtained for \(\alpha =0.5\) and \(\beta =1\). However, to obtain sufficiently significant long tails, unless otherwise stated, we used the values \(\alpha =1.5\) and \(\beta =0\) throughout this study. The distribution of jump distances is shown for the choice of parameter values together with another example also used in this study (Supplementary Fig. 2a).

Now, step sizes of a two-dimensional Lévy flight can be written as

$$\begin{aligned} \Delta x_1(t)&= R(t) \cos \theta (t), \end{aligned}$$
(13)
$$\begin{aligned} \Delta x_2(t)&= R(t) \sin \theta (t), \end{aligned}$$
(14)

where the angle of each step \(\theta (t)\) is drawn randomly from the uniform distribution \(0 \le \theta \le 2 \pi\), and the step amplitude R(t) was determined as \(R=F^{-1}(r; \alpha ,\beta ,c,\mu )\), where

$$\begin{aligned} F(r; \alpha ,\beta ,c,\mu ) = \int _{r}^{\infty } f(t; \alpha ,\beta ,c,\mu ) dt, \end{aligned}$$
(15)

and \(0 < r \le 1\) is a uniform random number.

We limited the target trajectories with in a square area \(|x_1| \le 2\), \(|x_2| \le 2\) by normalizing the coordinates of Lévy flight as

$$\begin{aligned} x_1(t) = 4\dfrac{\Delta x_1(t) - \Delta x_{1, \mathsf {min}}}{\Delta x_{1, \mathsf {max}} -\Delta x_{1,\mathsf {min}}} -2, \end{aligned}$$
(16)
$$\begin{aligned} x_2(t) = 4\dfrac{\Delta x_2(t) -\Delta x_{2, \mathsf {min}}}{\Delta x_{2,\mathsf {max}} - \Delta x_{2,\mathsf {min}}} -2, \end{aligned}$$
(17)

where \(\Delta x_{k, \mathsf {min}}\) and \(\Delta x_{k,\mathsf {max}}\) (\(k=1,2\)) stand for the minimum and maximum values of the step sizes, respectively, constituting the target trajectory.