Introduction

Reservoir computing1,2 is a computation framework that has shown great potential for solving complex problems in various fields ranging from speech recognition3 to robotics. The core element of this setup is a network of interacting components, called the reservoir, whose realizations vary from abstract nonlinear dynamical systems4 (in theoretical works) to various physical substrates5,6 (in experimental approaches). The main advantage of reservoir computing lies in the fact that the input signal is processed in a nonlinear way by the reservoir, which is a recurrent neural network, and only the output layer needs to be trained to obtain the desired output. This makes the training process much simpler and faster compared to conventional artificial neural networks.

One concept closely related to networked systems that can impact the performance of reservoir computers is criticality7. This denotes the property of a dynamical system marking the transition between ordered and disordered states. It has been shown that criticality is essential for efficient information processing in real neural systems, where it provides a balance between stability and flexibility8,9 and a rapid and accurate response to external inputs10, and it is believed to be crucial for many cognitive functions such as perception, attention, and decision-making11. It is therefore not surprising that oscillators are a natural choice for neuromorphic architectures, as they mimic the behavior of neurons in the brain and can be implemented using electronic circuits. Moreover, oscillator networks have been shown to exhibit critical behavior, which makes them suitable for reservoir computing applications. In particular, a number of studies have demonstrated that reservoirs operating at or near criticality outperform those operating away from criticality on a range of tasks, see, e.g., networks based on leaky-integrate-and-fire oscillators12 and spin-torque nano-oscillators13, Kuramoto networks14, nanowire networks15, and atomic switch networks16,17. Nevertheless, designing reservoir computers that are both critical and robust to parameter variations remains challenging. In particular, one drawback of many reservoirs is the relatively narrow range of admissible parameters enabling critical behavior. As a consequence, a hardware implementation can be complicated and requires precise parameter matching, which is not always possible18. Moreover, such networks are not sufficiently robust with respect to external disturbances and noise.

In this work we study ensembles of resistively coupled FitzHugh-Nagumo oscillators (FNOs)19,20 in the framework of criticality and reservoir computing. The FNO belongs to the class of relaxation-type oscillators and is a simplified version of the Hodgkin-Huxley model21. It is considered a prototype of an excitable system that describes the basal functionalities of a neuron22. If an external stimulus exceeds a threshold, an FNO starts to exhibit a sequence of action potentials (spikes) and relaxes after some time if the input stimulus vanishes. The FNO model has been widely applied to study the dynamics of oscillator networks, including but not limited to diffusive coupling, the impact of noise, and memristively coupled FNO ensembles23,24,25. Depending on the particular coupling mechanism and the FNO parameter set, a plethora of dynamic states has been observed, such as synchrony, symmetric patterns, and chaos26,27,28. FNOs have been studied less intensively in the context of criticality29, although an FNO represents a prototype of an excitable system. Indeed, ensembles of coupled FNOs may allow an in-depth exploration of complex brain states, as has recently been shown, for example, for epileptic-seizure-related synchronization phenomena and quasi-critical brain states30. While these results hint at the existence of criticality in FitzHugh-Nagumo oscillator ensembles, evidence of its existence has been lacking up to this point. In this work, we show how a robust critical state can be achieved in a network of resistively coupled FNOs. Besides this, we provide design concepts that lead to a network whose activity is both spatially and scale-invariant. Here, spatial invariance means that the average activity of nodes over a certain time interval does not depend on the number of nodes or on their spatial location, which allows for a free choice of readout nodes.
Moreover, we provide an alternative characterization of criticality in terms of the power dissipation in the network and demonstrate that criticality supports the robustness of the classification accuracy with respect to the readout shrinkage. All our investigations are performed under the restriction that all components are available in analog hardware, as we aim to support the development of analog bio-inspired reservoir computers.

Results

In this section, we present our bio-inspired reservoir computer. We derive conditions on the baseline coupling strength enabling the reservoir to operate in a critical regime. Furthermore, we establish a link between criticality and power flow, which allows for characterizing criticality from an electrical perspective. As a benchmark for the classification accuracy of our reservoir computer, we consider a drybean classification problem31. We have picked this task for three specific reasons. (i) Simulations in this work are performed on a circuit-level. Such simulations are slower than simulations of classical artificial neurons. Therefore, it is reasonable to pick a dataset with a moderate number of data samples. Datasets such as MNIST or CIFAR10 have many data samples, which require large computational effort. (ii) Each data sample in the drybean dataset has 16 attributes. To enhance our criticality analysis, we only supply an input to 20% of our network. Our reservoir is composed of 100 FitzHugh-Nagumo oscillators (FNOs). Thus, we can directly translate the 16 attributes of a data sample from the drybean dataset into analog signals for 16 oscillators within our network. (iii) The drybean dataset has been widely utilized in the literature as a benchmark for classical artificial neural networks. This enables us to compare the performance of our reservoir computer to existing works that have employed this dataset31,32,33,34,35.

Biological Scenario

Our electrical setup aims to mimic the basic topology of neural networks in animals36, see Fig. 1a. Here, we implicitly assume the existence of sensory neurons generating action potentials corresponding to a set of perceived features. The information is mainly encoded into the spiking rate of these action potentials. Specifically, the intensity of the perceived features is directly proportional to the spiking rate of the sensory neurons, hence we speak of spike-rate encoded signals in Fig. 1a, which is inspired by discoveries on neural coding in the visual and motor cortex37. These signals are forwarded to a subset of information processing neurons, which are responsible for filtering the input. From a hierarchical perspective, the information processing neurons are found in an intermediate stage between the sensory neurons and the brain, hence they can be understood as local information processors. Furthermore, the (local) neural network is assumed to operate at a critical state, as supported by many studies in literature38,39,40,41,42,43,44. Similar to the sensory neurons, the information processing neurons produce action potentials of which a subset is forwarded to a specialized neural network (e.g. in the brain) that classifies the perceived features.

Fig. 1: Overview of the reservoir computer setup drawing a comparison between the biological scenario and the electrical setup.

a A network of sensory neurons forwards the perceived information as action potentials to a network of information processing neurons. The latter filters the received input, and a subset of the generated action potentials is forwarded to another neural network (in the brain) that classifies the input. b Sensory information is encoded into the spike train distances (rate coding) with a fixed pulse width and then forwarded to a network of artificial neurons. The axonal connections from a are modeled by a network of resistors, see the red dashed box. The output voltages of a subset of oscillators are used as an input for a simple machine learning algorithm to perform the classification. c Circuit of a FitzHugh-Nagumo oscillator \({\mathcal{N}}_{\mu}\). The current source j0,μ is used to supply the pulse trains encoding data sample attributes. d Watts-Strogatz graph used in this work. Every node corresponds to an oscillator, as depicted in c, while every edge corresponds to a resistor. The green nodes represent oscillators receiving pulse trains, as depicted on the left side of b. The pink nodes represent readout oscillators; their voltages uμ are used to train the classification network depicted on the right side of b.

Electrical Setup

1) Spike-Rate-Coding for Input Generation

Moving on to the electrical setup as sketched in Fig. 1b, the sensory inputs are modeled by spike trains. Here, we generate a pulse train for every attribute of a data sample within the considered dataset. The amplitude J0 and pulse width Tp of every pulse are fixed and chosen such that each pulse leads to exactly one (voltage) spike at the receiving neuron. Every input attribute is normalized to the interval [0, 1]; the μ-th normalized input attribute of the ν-th data sample will subsequently be called aμ,ν. To linearly map the input attribute onto a spike train, we have defined a minimal and maximal spiking rate \({r}_{\min }\) and \({r}_{\max }\), such that the μ-th neuron receives a spike train rate of \({r}_{\nu }={r}_{\min }+{a}_{\mu ,\nu }[{r}_{\max }-{r}_{\min }]\). A detailed explanation regarding the input preparation is presented in the Methods section.
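The linear rate mapping described above can be illustrated with a short sketch. The function names, the sampling step `dt`, and all numerical values are assumptions made for illustration only; the actual input preparation is the one given in the Methods section.

```python
def spike_rate(a, r_min, r_max):
    """Linearly map a normalized attribute a in [0, 1] to a spiking rate."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("attribute must be normalized to [0, 1]")
    return r_min + a * (r_max - r_min)


def pulse_train(a, r_min, r_max, t_end, dt, pulse_width, amplitude):
    """Sampled rectangular pulse train with fixed amplitude and pulse width.

    Only the repetition rate depends on the attribute a. Integer step
    counts are used so that the pulse pattern is exactly periodic.
    """
    rate = spike_rate(a, r_min, r_max)
    period_steps = max(1, round(1.0 / (rate * dt)))   # samples per period
    width_steps = max(1, round(pulse_width / dt))     # samples per pulse
    n_steps = round(t_end / dt)
    return [amplitude if (i % period_steps) < width_steps else 0.0
            for i in range(n_steps)]


# Hypothetical example: attribute a = 1 maps to the maximal rate r_max.
train = pulse_train(a=1.0, r_min=1.0, r_max=4.0, t_end=1.0,
                    dt=0.01, pulse_width=0.05, amplitude=2.0)
```

In this sketch the information is carried solely by the distance between pulses, which mirrors the statement that signal strength relates to the repetition rate rather than the amplitude.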

2) FitzHugh-Nagumo Oscillators as Reservoir Nodes

Every information processing neuron is modeled by a biologically plausible neural oscillator, the so-called FitzHugh-Nagumo oscillator (FNO)20, see Fig. 1c. Although we only present emulation results, we have deliberately picked the FNO for one specific reason: the oscillator retains the main features of a biological neuron while also being realizable in practice45,46. In this context, biological plausibility means that the resulting spike form is quite similar to that of a real action potential, i.e. the oscillator closely mimics all the polarization phases of a real neuron. Furthermore, the oscillator can be parametrized such that a voltage spike is only produced when it is perturbed by an external input. Here, the input pulse trains described in 1) are supplied using a current source j0,μ, see Fig. 1c and refer to the Methods section for more details.

3) Resistive Coupling Network

The axonal connections depicted in Fig. 1a are modeled by a network of resistors, see the red dashed box in Fig. 1b. In biology, axonal connections are usually unidirectional (due to unidirectional terminal synapses), e.g. in mammals47, and induce a small delay. However, one can also find bidirectional axonal connections in other organisms such as the Hydra48. Bidirectional connections can be advantageous as they presumably provide the network with functional robustness, that is, the loss/death of neurons does not greatly influence the network’s connectivity and hence its overall functionality49. To reduce the number of optimization parameters in our criticality analysis later on, the resistances are drawn from a Gaussian distribution. The mean value of the Gaussian distribution serves as the control parameter of our criticality analysis, while a standard deviation of 10% is used to mimic the parameter spread observed under practical conditions. The delays of biological axons are not considered for two reasons. First, considering such delays would introduce another parameter that must be optimized to observe critical behavior. This would require a more sophisticated analysis and characterization of criticality. Second, implementing distortion-free delays with analog hardware is not a trivial task.
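As a minimal sketch of how such a resistor spread could be generated, the following snippet draws coupling resistances from a Gaussian distribution. It assumes that the 10% standard deviation is meant relative to the mean, and it rejects non-positive samples; both choices, as well as the function name, are illustrative assumptions.

```python
import random


def sample_resistances(n_edges, mean_r, rel_std=0.10, seed=0):
    """Draw n_edges resistances from N(mean_r, (rel_std * mean_r)^2).

    Assumption: the spread is relative to the mean. Non-positive draws
    are rejected because they are not physically meaningful.
    """
    rng = random.Random(seed)
    values = []
    while len(values) < n_edges:
        r = rng.gauss(mean_r, rel_std * mean_r)
        if r > 0.0:
            values.append(r)
    return values


# Hypothetical example: 500 resistors around a mean of 18 kOhm.
resistances = sample_resistances(n_edges=500, mean_r=18e3)
```

The mean `mean_r` then plays the role of the single control parameter, while the spread stays fixed.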

4) Network Topology

In this work, the network’s topology is kept fixed. A reasonable way of modeling the topology of biological networks is to use small-world network models, since biological networks are known to have the small-world property50. Here, we make use of a Watts-Strogatz model51, which is a small-world network with n nodes, a mean node degree of 2k, and nk edges. We have chosen n = 100, k = 5, and a rewiring parameter of β = 0.15; a graphical illustration of the network is given in Fig. 1d. Only 20% of the network receives an external (sensory) input (green nodes in Fig. 1d), while the outputs of another 20% serve training/classification purposes (pink nodes in Fig. 1d). By letting only a minor part of the network receive external inputs, we are able to quantify how much information is propagated throughout the network when the mean coupling resistance is varied. Note that the input and output sets are strictly disjoint. Also, we have evenly distributed both the input and output nodes throughout the network, as this ensures that (1) no part of the network is silent when an input is supplied to all nodes at the same time; and that (2) information is evenly extracted from the entirety of the network.
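The topology described above can be sketched with the standard library alone. The rewiring loop follows the usual Watts-Strogatz procedure (ring lattice with k neighbors on each side, each lattice edge rewired with probability β); the helper `pick_io_nodes` and its equidistant placement along the ring are assumptions, since the exact placement used in this work is not specified here.

```python
import random


def watts_strogatz(n, k, beta, seed=0):
    """Adjacency sets of a Watts-Strogatz graph with n nodes and n*k edges."""
    rng = random.Random(seed)
    adj = {u: set() for u in range(n)}
    # Ring lattice: connect every node to its k nearest neighbors per side.
    for u in range(n):
        for j in range(1, k + 1):
            v = (u + j) % n
            adj[u].add(v)
            adj[v].add(u)
    # Rewire each lattice edge (u, u+j) with probability beta.
    for j in range(1, k + 1):
        for u in range(n):
            if rng.random() >= beta:
                continue
            v = (u + j) % n
            candidates = [w for w in range(n) if w != u and w not in adj[u]]
            if not candidates:
                continue
            w = rng.choice(candidates)
            adj[u].discard(v)
            adj[v].discard(u)
            adj[u].add(w)
            adj[w].add(u)
    return adj


def pick_io_nodes(n, fraction=0.2):
    """Evenly spaced, disjoint input and output node sets (assumed layout)."""
    step = round(1.0 / fraction)
    inputs = list(range(0, n, step))
    outputs = list(range(step // 2, n, step))
    return inputs, outputs


adj = watts_strogatz(n=100, k=5, beta=0.15, seed=1)
inputs, outputs = pick_io_nodes(100)
n_edges = sum(len(neigh) for neigh in adj.values()) // 2
```

Because rewiring only moves edges, the edge count stays at nk, so the number of coupling resistors is fixed by n and k.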

5) Emulation Technique

To emulate a large number of resistively coupled FNOs, we make use of the wave digital concept52, which is an appropriate and powerful tool for emulating structurally-similar circuits in a highly parallel fashion53. A thorough description and derivation is presented in the Methods section. However, we would like to briefly mention that a wave digital model is essentially a signal flow diagram, whose iterative evaluation allows emulating the corresponding electrical circuit. In this work, we use the emulated voltages of every output oscillator to train a learning algorithm. Note that all parameters used within our emulations can be found in Table 1.

Table 1 Circuit parameters.

Avalanche criticality in coupled FitzHugh-Nagumo oscillators

The existence of avalanche criticality in a system of coupled FNOs is a natural implication of the network’s dynamical behavior. To clarify this statement, take the example of two coupled FNOs. If the coupling resistance is chosen so high that only a negligible amount of power is exchanged through the coupling resistor, then the oscillators have a low mutual influence and generally exhibit different oscillatory outputs. A vanishing power exchange can also be observed in the other extreme case: if the coupling resistance is chosen so low that the oscillators synchronize, then there is no power exchange once the synchronous state is reached54. Now, in the first case, we may speak of a disordered or subcritical system, while in the second case, we may speak of an ordered or supercritical system. Somewhere in between these two states, i.e. for a certain range of coupling resistances, we assume a phase transition to take place during which the network is in a critical regime. In the sequel, we show this to indeed be the case. However, it should be noted that a critical state can only be reached if (1) the network contains autonomous oscillators or (2) is stimulated by a set of sufficiently strong external signals (pulse trains), as the oscillators would otherwise stop firing and interacting once all the potential energy has been dissipated. In this work, we externally excite the network to maintain its interaction. Here, we say that the strength of a signal is proportional to its repetition rate and not its amplitude.

To test our hypothesis, we defined two types of input signals. The first one is called the nominal input, which describes the case where aμ,ν = 1 for all μ, i.e. the case where all input oscillators (simultaneously) receive a pulse train with the rate \({r}_{\max }\). This type of input induces a maximal amount of activity in the network and corresponds to the maximal possible perturbation. However, it does not reflect the average amount of activity that can be observed, when dealing with real data samples. Thus, we make use of a second input, termed bean input, stemming from the dataset of the considered drybean classification problem. Here, we picked 7 random sets of attributes from the drybean dataset and supplied the corresponding pulse trains as inputs. This allows us to gain a more realistic estimation of the network’s activity when dealing with real data.

Figure 2a shows the average network activity as a function of the baseline coupling strength Gμ = 1/Rμ when the nominal input is supplied. The average network activity is defined as the overall number of spikes produced by the reservoir divided by the number of sample points, i.e. it is the average number of spikes per discrete time instant. Since we are working with analog oscillators, we make use of a moving time window of width Δt = 0.2 μs in order to identify spike events. An oscillator \({\mathcal{N}}_{\mu}\) is said to have spiked if the corresponding output voltage uμ has a local maximum with a positive voltage value within the moving time window; thus Δt is chosen so that it is slightly wider than the length of one spike event. In Fig. 2a, we have also plotted the average network activity when the bean input is used, as indicated by the red dots. Since we have used 7 different attribute sets from the drybean dataset, we observed a deviation in the average network activity, which is indicated by the error bars. In general, we observe a second-order phase transition in the interval Gμ ∈ [20, 60] μS. The left and right sides of this phase transition correspond to the coupling strength regimes where the system is said to be subcritical and supercritical, respectively. On average, it can be seen that the network activity is slightly lower if we use the same coupling strength but the bean input instead of the nominal input. This outcome is natural, as many of the normalized attributes aμ,ν are usually smaller than 1 when the bean input is used. Hence, the network experiences a smaller perturbation leading to less activity on average. Like many spiking networks, the constructed FNO network features cascades of spikes that spread over the network both spatially and temporally. These cascades are called avalanches. The remainder of Fig. 2 (panels b–g) provides an overview of the relative frequency of occurrence of avalanches depending on their size and duration for different values of the baseline coupling strength. For some coupling strengths, the relative frequency follows a power law distribution. This power law property, combined with the evidence of different dynamical regimes on either side of the phase transition, serves as a criticality signature55 of our network. Overall, our analysis indicates that a coupling resistance of Rμ = 14 kΩ leads to critical network behavior. However, in terms of task performance, it is known that slightly subcritical behavior can lead to even better results56,57, which makes the resistance Rμ = 18 kΩ also an interesting value for benchmarking against particular tasks. The details on the power law fitting and the resulting power law exponents are summarized in Supplementary Note 1.
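The activity and avalanche statistics discussed above can be sketched as follows. This is a simplified illustration, not the procedure used for Fig. 2: spikes are detected as positive local maxima of a sampled voltage trace (instead of a moving window), and avalanches are taken as maximal runs of consecutive non-empty time bins. All function names and the bin length in samples are assumptions.

```python
def detect_spikes(u):
    """Sample indices where the trace u has a positive local maximum."""
    return [i for i in range(1, len(u) - 1)
            if u[i] > 0.0 and u[i - 1] < u[i] >= u[i + 1]]


def bin_spikes(spikes_per_node, n_samples, bin_len):
    """Count the spikes of all nodes in consecutive bins of bin_len samples."""
    n_bins = (n_samples + bin_len - 1) // bin_len
    counts = [0] * n_bins
    for indices in spikes_per_node:
        for i in indices:
            counts[i // bin_len] += 1
    return counts


def average_activity(spikes_per_node, n_samples):
    """Total number of spikes divided by the number of sample points."""
    return sum(len(indices) for indices in spikes_per_node) / n_samples


def avalanches(counts):
    """Return (size, duration) for every maximal run of non-empty bins.

    Size is the total number of spikes in the run; duration is the
    number of consecutive bins the run spans.
    """
    result = []
    size = duration = 0
    for c in counts:
        if c > 0:
            size += c
            duration += 1
        elif duration > 0:
            result.append((size, duration))
            size = duration = 0
    if duration > 0:
        result.append((size, duration))
    return result
```

Histograms of the returned sizes and durations are the quantities to which a power law would then be fitted; the fitting itself is outside the scope of this sketch.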

Fig. 2: Average network activity and power law fitting statistics for the avalanche size and avalanche duration under the bean-input for different baseline coupling strengths.

a Blue dots indicate the average network activity for a given coupling strength under the nominal input. Red dots indicate the average network activity for six selected coupling strengths (5 kΩ, 11.5 kΩ, 14 kΩ, 18 kΩ, 23.5 kΩ, and 60 kΩ) under the bean-input. Error bars indicate the standard deviation of the average activity over the seven different bean classes used for the bean-input. The standard deviation for the case of 60 kΩ is only 0.00977 and is therefore not visible. Relative frequencies of the occurrence of avalanches depending on their size and duration for six different coupling strengths (from strong/supercritical to weak/subcritical): 5 kΩ (b), 11.5 kΩ (c), 14 kΩ (d), 18 kΩ (e), 23.5 kΩ (f), and 60 kΩ (g). The avalanche duration corresponds to the number of subsequent Δt-long time bins during which spikes have been observed, and the avalanche size corresponds to the total number of spikes that occurred within a single avalanche. Dashed lines correspond to the best power law fit of the data (see the detailed information in Supplementary Figure 1 and Supplementary Table 1). For the case of 60 kΩ (g), all avalanches are of duration 1, i.e., every spike is isolated due to the weak coupling between the oscillators, and, therefore, no power law fitting is possible.

To understand the network behavior from a dynamical perspective, we present the output oscillation time series in Fig. 3e–h. Here, we excite the network with input signals relating to a random data sample from the drybean data in order to visualize its behavior during task performance, see Fig. 3e. Figure 3f–h depict the reactions of a subcritical, critical, and supercritical network, respectively. In Fig. 3f, we observe many inactive oscillators due to our choice of a weak coupling strength. Here, oscillators receiving an input are not able to excite their neighbors, such that a large part of the network remains inactive. The weak interaction corresponds to a disordered state, hence we refer to the network behavior as being subcritical. In Fig. 3h, we see the opposite case, a very active and synchronous network. Here, the coupling is strong, and the input oscillators are able to excite a large part of the network. This type of network behavior is ordered, hence, it is supercritical. Lastly, in Fig. 3g, we encounter a network with a baseline coupling strength within the phase transition zone. In this case, we observe diverse oscillation patterns and alternating spiking behavior, where parts of the network are active at times and inactive at others. This intricate behavior is labeled as “critical” due to the network’s complexity.

Fig. 3: Illustration of the network activity in the subcritical, critical, and supercritical state in terms of raster plots (top) and time series plots (bottom).

a Input pulse trains supplied to the ν-th oscillator over time. Raster plot representation of the oscillators' output voltages uν in the case of a subcritical network (b, Rc = 60 kΩ), a critical network (c, Rc = 18 kΩ), and a supercritical network (d, Rc = 5 kΩ). The time series plots corresponding to raster plots a–d are given in subfigures e–h.

As we plan to use our network as a reservoir computer, we have visualized the network behavior using raster plots in Fig. 3a–d. The raster plots can be used to visualize the quality of the nonlinear projection that is performed by our reservoir. In the case of a subcritical network, we see that the input-output behavior is 1-to-1, that is, the readout network perceives a scaled version of the input signals, see Fig. 3b. Therefore, the reservoir is effectively useless, since it does not process the input at all. In Fig. 3d, the input leads to synchronous outputs and hence to a predictable network behavior, since the network will behave this way for nearly any data sample. Thus, we conclude that both cases b and d represent poor nonlinear projections of the input in Fig. 3a. The spiking pattern in Fig. 3c, on the other hand, shows a rich variety of collective activity. In other words, we see that the reservoir projects the input signals in Fig. 3a in a strongly nonlinear manner, which is suitable for classification purposes, as we show later on.

Scale and spatial invariance

Scale invariance is an important property of critical networks, which states that the behavior of a complex network is invariant w.r.t. its scales. Here, the terms behavior and scales are specific to the network at hand. Thus, in order to identify scale invariance, we must first clearly define these terms. In the case of our network, its behavior can be characterized by the average network activity discussed in the previous section. If we can measure a phase transition in the average network activity irrespective of the network’s scales, we say that the network is scale-invariant. The scale of our network can be defined in different ways. For example, it can refer to the number of oscillators, the number of resistive connections, the average number of resistive connections per oscillator, etc. In this work, we limit our analysis to the number of oscillators. In other words, we analyze scale invariance by measuring the average network activity as a function of the number of oscillators. In terms of the Watts-Strogatz model, this means that n is changed, while k and β are kept fixed. Figure 4a depicts the results of our analysis. Here, we have measured the average network activity as a function of Gμ, while varying the number of oscillators n. In order to compare the average network activity for different network sizes, we have normalized the network activity to the interval [0, 1]. Our results in Fig. 4a show that the average network activity retains its phase transition even when the number of oscillators is varied. Moreover, we also observe that the phase transition regime is nearly invariant w.r.t. the number of oscillators.

Fig. 4: Spatial and scale invariance of network’s activity.

a Average network activity as a function of the baseline coupling strength Gμ and the number of oscillators n. For comparison purposes, the average activity of every network has been normalized to its maximum. b Average activity per node depending on the baseline coupling strength for the 20 output nodes (blue points) and five random 20-node sets (semi-transparent gray points). The random nodes are picked among all nodes that do not receive external input. c Average activity per node depending on the baseline coupling strength for different random sets of nodes that do not receive external input: all 80 nodes (blue points), five random 40-node sets (semi-transparent green points), five random 20-node sets (semi-transparent gray points), and five random 10-node sets (semi-transparent red points). The average number of spikes occurring in a randomly selected subset of the network’s nodes does not depend on the number of nodes in this subset or on their spatial location.

In addition to scale invariance, another compelling feature of our network is one that we refer to as spatial invariance. We define a spatially invariant network as a network where signatures of criticality can be observed in a sufficiently large subnetwork consisting of randomly chosen nodes and all their incident edges. This property is quite useful, as it implies that the output nodes can be chosen freely without greatly affecting the task performance. Furthermore, it also allows reducing the number of output nodes without greatly influencing the task performance, as we discuss later on. A good way of inducing this property in oscillator networks is to evenly distribute the oscillators receiving an input throughout the network. Achieving this type of distribution can also be seen as an optimization task, specifically, a vertex cover problem, where the goal is to find a predefined number of input oscillators (nodes) that can cover all the resistive connections (edges) in the network. This ensures that the input signals propagate throughout the entire network, yielding a maximal information spread. Note that if the number of input nodes is much smaller than the network size, there are two ways to achieve spatial invariance. The first option is to increase the network connectivity by increasing the average node degree. In our scenario, the Watts-Strogatz model can be parametrized with a predefined average node degree, which makes it a good choice for dealing with this specific problem. The second option is to find the largest possible vertex cover with a predefined number of nodes, which is only sensible if the number of input nodes is close to the cardinality of the minimal vertex cover set.

To verify the presence of spatial invariance, we let a set of 20 oscillators undergo the same type of analysis depicted in Fig. 2, see Fig. 4b. At first, these oscillators were chosen to be the same ones as our output oscillators. Then we chose 5 sets of 20 random oscillators among those that do not receive any external input. Here, we see that the same phase transition as depicted in Fig. 2 can be observed when evaluating the average network activity of only 20 oscillators within the network. The same picture holds quantitatively when we probe different numbers of oscillators (see Fig. 4c with 10, 20, and 40 randomly chosen nodes). Note that the phase transition takes place in the same interval Gμ ∈ [20, 60] μS, which indicates a form of scale-invariance that is typical of critical networks38,39.
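The subset comparison described above can be sketched in a few lines. The snippet assumes that per-node spike counts are already available for one coupling strength (for example from a prior spike detection step); the function names and the dictionary layout are hypothetical.

```python
import random


def activity_per_node(spike_counts, nodes, n_samples):
    """Average number of spikes per node and per sample point for a node subset."""
    total = sum(spike_counts[v] for v in nodes)
    return total / (len(nodes) * n_samples)


def random_subsets(candidates, size, n_sets, seed=0):
    """Draw n_sets random node subsets of the given size without replacement."""
    rng = random.Random(seed)
    return [rng.sample(candidates, size) for _ in range(n_sets)]


# Hypothetical usage: nodes 0..19 receive input, the remaining 80 do not.
non_input_nodes = list(range(20, 100))
subsets = random_subsets(non_input_nodes, size=20, n_sets=5)
```

Repeating `activity_per_node` for each subset and each coupling strength yields curves that can be compared directly; for a spatially invariant network these curves should nearly coincide, regardless of subset size.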

Relationship between criticality and power flow

In this section, we discuss a possibly new measure for criticality that can be applied in the context of resistively coupled neural oscillators. While power law and phase transition analysis are popular and well established methods for analyzing criticality in complex networks, these measures are based on statistics and require a great amount of data for their calculations. An instructive way of finding a simpler measure with a similar interpretation is to think about how information spreads throughout the network. In our setting, input information generates a spike at the receiving oscillator, which induces a current flow directed towards adjacent oscillators. Such a flow of current can also be understood in terms of a power flow, because any oscillator receiving an input is exciting its inactive neighboring oscillators, which in turn leads to more spikes within the network.

Avalanche criticality corresponds to a state of high network activity with diverse spiking behavior58. Hence, it is justified to think that a critical state is correlated to one with a high power transmission59. To test this hypothesis, we analyzed the average dissipated power (see Methods section) within the coupling network for different coupling strengths Gμ and different network sizes n, while using the nominal and bean input. Figure 5 presents the results of our analysis. The left and right plot depict the average dissipated power when the nominal and bean input are used, respectively. To relate the results in Fig. 5 to the phase transition in Fig. 2, we have shaded the phase transition regime in light blue in both plots. Let us first consider the left plot in Fig. 5. This plot illustrates that a maximal power transmission takes place for Rμ ≈ 18 kΩ, while both low and high values of Rμ lead to less power transmission. This result is very natural as high values of Rμ lead to weaker interactions and hence less power transmission. On the other hand, low values of Rμ lead to strong interactions and therefore synchrony, and the latter leads to less power exchange. Interestingly, the value of Rμ leading to maximal power transmission lies within the phase regime. In fact, power transmission quickly deteriorates outside the phase transition regime. Hence, we see a strong correlation between power transmission and the average network activity. In general, the same trends are seen in the right plot of Fig. 5. However, when the bean input is used, we can observe two major differences: (1) the maximal power transmission value is obtained for Rμ ≈ 11.5 kΩ and the value does not lie in the phase transition regime for the nominal input; and (2) the average network activity does not deteriorate as quick as when the nominal input is used. 
The first aspect is to be expected, because input signals representing a data sample from the drybeans dataset always supply less energy (on average) to the network due to their lower pulse repetition rate. Thus, the network perceives a weaker excitation, such that higher values of Gμ are required to obtain maximal power transmission. Moreover, the fact that the phase transition in Fig. 2 is measured using the nominal input explains why the optimal value Rμ = 11.5 kΩ does not lie in the shaded phase transition regime when the bean input is used. Measuring the phase transition using the bean input would shift the phase transition, as indicated by the red dots in Fig. 2. The second aspect can be interpreted as follows: power transmission within the network is less sensitive to changes in the coupling strength when the bean input is supplied. This implies a form of parametric robustness, which greatly supports the technical implementation of our network in practice. We will show in the next section that this parametric robustness also translates to robustness in task performance.

Fig. 5: Average dissipated power as a function of the baseline coupling strength and the network size.
figure 5

a The nominal input is used to excite the network. b The bean input is used to excite the network. In both plots, the blue shaded area denotes the phase transition regime, which can be seen in Fig. 2. For comparison purposes, the average dissipated power has been normalized to its maximum.

Overall, our analysis reveals two key aspects. First, to induce critical behavior, we can analyze the average dissipated power for different coupling strengths and pick the value that maximizes power dissipation. In fact, in contrast to typical analysis methods (phase transition, power law, etc.), a power flow analysis specifies the optimal coupling strength for the application at hand, is simpler to perform, and requires less simulation effort. Second, there is a wide range of coupling strengths that can be chosen while still inducing a critical state. In other words, the phase transition regime is rather wide, which, to the best of our knowledge, is quite rare for complex networks.

Classification accuracy and its robustness with respect to the readout shrinkage

We test the quality of the constructed FNO-based reservoir using a classification task on the drybeans dataset31. The readout consists of two fully connected feed-forward layers of artificial neurons: the input layer, which receives the signal from the reservoir, and the output layer with the softmax activation function, whose 7 nodes represent the probabilities that a bean belongs to each bean class. The 100 equidistantly sampled voltage values from each of the 20 output oscillators form the readout’s input layer; we refer the reader to the Methods section for more details. A total of 13611 samples are generated and split into training and validation sets following the 10-fold cross-validation procedure (i.e., we split all samples into 10 subsets and use 9 of them for training and 1 for validation, and then repeat the training procedure 10 times with different validation sets). The supervised learning procedure is performed in Python using the Keras API60 for 1000 epochs. The categorical cross-entropy loss is backpropagated using the stochastic gradient descent method Adam. The highest average classification accuracy of 90.75% over all 10 folds was achieved for the slightly sub-critical reservoir with the baseline coupling resistance of Rμ = 18 kΩ. A summary of the training process for this case is depicted in Fig. 6a and b. It is important to note that the relatively high classification accuracy is consistent throughout the wide range of near-critical resistances Rμ = 23.5 kΩ, 18 kΩ, 14 kΩ, and 11.5 kΩ (red markers in Fig. 6d). The case of Rμ = 11.5 kΩ, which corresponds to the maximum power transmission and the most critical behavior of the network (see Fig. 5 and Supplementary Fig. 1, respectively), yields a slightly lower classification accuracy than Rμ = 18 kΩ.
Also, the classification accuracy obtained using the FitzHugh-Nagumo reservoir is comparable to other approaches reported in the literature for this dataset, namely, Multilayer perceptron (MLP), Support Vector Machine (SVM), k-Nearest Neighbors (kNN), Decision Tree (DT) classification models from31 and Bayesian network (BNC) and Decision Tree (DT) from34 (see Fig. 6c).

Fig. 6: A comparison of the classification accuracy.
figure 6

a Summary of the 10-fold cross-validation for the critical network with 18 kΩ baseline resistance. The classification accuracy is consistent over all 10 folds with the average accuracy of 90.75%. b Confusion matrix for the case of Rμ = 18 kΩ. c A benchmark of the proposed FitzHugh-Nagumo reservoir (FHN RC) against other classification methods available in the literature for the same dataset. The red point corresponds to the average classification accuracy over 10 folds for Rμ = 18 kΩ (see a and b). Green points correspond to the classification using Bayesian network (BNC) and Decision Tree (DT) models from34. Blue points correspond to the classification accuracies using Multilayer perceptron (MLP), Support Vector Machine (SVM), k-Nearest Neighbours (kNN), Decision Tree (DT) classification models from31. d A summary of the drop of the classification accuracy due to the shrinkage of the readout. It is seen that the critical networks are less sensitive to the reduction of the readout layer compared to those that are initialized far away from criticality. Different colors correspond to different sizes and configurations of the readout layer (see the legend). In all figures, the error bars indicate the standard deviation over 10 folds.

Next, we compare the average classification accuracy for reservoirs with different mean coupling resistances and different sizes of the readout input layer. The latter has been varied by either reducing the number of samples from every output oscillator (50 and 20 equidistantly sampled voltages instead of the original 100 samples), or reducing the number of output oscillators (10 and 4 oscillators instead of the original 20 output oscillators). The results are summarized in Fig. 6d. It is clearly seen that the highest classification accuracy is achieved for resistances that correspond to criticality of the reservoir. Also, the non-critical regimes (60 kΩ and 5 kΩ) suffer significantly from the shrinkage of the readout: the classification accuracy drops dramatically, whereas the critically initialized reservoirs maintain relatively high classification accuracy despite the readout shrinkage.

Finally, we would like to highlight another favourable property of the critical networks: reservoir computer setups based on critical networks train better than those based on non-critical networks, not only in terms of accuracy but also in terms of training time/effort and convergence (see Fig. 7, which summarizes the training of the readout for reservoirs with different baseline coupling strengths). In particular, Fig. 7a shows that for networks close to criticality (i.e., Rμ = 11.5 kΩ, 14 kΩ, 18 kΩ, and 23.5 kΩ), the classification accuracy climbs rapidly during the first few dozen training epochs, after which the improvement is marginal. In contrast, the training is considerably slower for the data obtained from the supercritical network (Rμ = 5 kΩ) and dramatically slower for the non-critical network (Rμ = 60 kΩ). Qualitatively the same picture holds for the shrunk readouts. To quantify the steepness of the training curve in the case when every output node is sampled 20 times, we calculate the number of epochs needed for every network to reach a threshold of 90% of its maximal accuracy over the course of a 1000-epoch-long training process (Fig. 7b). These numbers, ranging from a few hundred epochs (for Rμ = 11.5 kΩ, 14 kΩ, 18 kΩ) to three, four, and even eight hundred epochs (for Rμ = 23.5 kΩ, 5 kΩ, and 60 kΩ, respectively), correlate well with the distance to criticality (see Supplementary Figure 1) for these networks.
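The epoch count used in Fig. 7b can be obtained from any recorded accuracy curve, as the following minimal sketch illustrates. The function name `epochs_to_threshold` and the synthetic curve are illustrative assumptions and are not taken from the original training code.

```python
import numpy as np

def epochs_to_threshold(accuracy, fraction=0.9):
    """Return the 1-based epoch at which `accuracy` first reaches
    `fraction` times its maximum over the whole training run."""
    accuracy = np.asarray(accuracy, dtype=float)
    threshold = fraction * accuracy.max()
    # argmax on a boolean array gives the index of the first True entry
    return int(np.argmax(accuracy >= threshold)) + 1

# Synthetic, saturating accuracy curve over 1000 epochs (illustrative only)
epochs = np.arange(1, 1001)
curve = 0.9 * (1.0 - np.exp(-epochs / 50.0))
n_epochs = epochs_to_threshold(curve)
```

Applying the function to every fold and every coupling resistance, and then averaging over folds, would give one value per network that can be compared across coupling strengths.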

Fig. 7: The course of the readout training shows a good correlation with the distance to criticality.
figure 7

a Classification accuracy during the first 400 epochs (out of 1000) for different baseline coupling strengths. The semitransparent lines correspond to the particular folds 1–10, whilst the solid lines indicate mean classification accuracies over all 10 folds. A visualization of the classification accuracies and loss function values for the entire training can be found in Supplementary Figures 2–7. b A comparison of the number of epochs needed for different networks to reach 90% of their maximal classification accuracy over the course of a 1000-epoch-long training for the case of the shrunk readout (20 output nodes sampled 20 times). Error bars depict standard deviation over 10 folds.

Discussion

In this work, we have presented an analog bio-inspired reservoir computer. Our key findings can be summarized as follows: (i) A wide phase transition can be observed in an oscillator network consisting of coupled FitzHugh-Nagumo oscillators when varying the baseline coupling strength; (ii) operating the network in the critical regime leads to the highest classification accuracy and the shortest training time; (iii) spatial invariance can be induced by a small world network topology and a suitable connectivity; (iv) coupled FitzHugh-Nagumo oscillators are robust w.r.t. parameter variations due to the wide phase transition, which (v) carries over to the classification accuracy; and (vi) criticality can be characterized from an electrical perspective and is tightly related to information/power flow between oscillators.

Many works in the literature explore the computational capabilities of reservoirs operating at the edge of chaos. While the results of such works may seem similar to ours, we would like to stress that we are dealing with a different type of network. Usually, a disordered state is associated with chaos61,62,63,64,65. In our work, a disordered state is associated with a very weak coupling, such that the network is effectively disconnected. Research dealing with networks at the edge of chaos attempts to find the control parameter value that brings the network close to a chaotic state. The aim is to enhance the nonlinear projection performed by the network by making it behave in a deterministic but complicated way. A network is said to have a high computational capability if small changes in the input lead to noticeable changes in the output that are not random/chaotic64,65. Such analysis is not directly applicable to our network, because there is no control parameter value that can make our network behave in a chaotic manner. Furthermore, we would like to point out that most of the aforementioned works deal with discrete systems, which are more accessible from a mathematical point of view. Thus, many analysis methods from preexisting works on criticality in discrete networks are difficult to transfer to an analog network such as ours. The methods introduced in this work, on the other hand, are applicable to analog networks and may be helpful for analyzing emerging analog technologies.

While the results of this work suggest that criticality enhances task performance in analog reservoirs, it is also important to stress that we have only measured the computational capability using classification tasks. As pointed out by some researchers12,66, criticality is only beneficial in certain tasks. In a different setting with different tasks, criticality may impede the performance of the reservoir computer. This is something that we plan to explore in future research. One of the main goals of this work was to show that criticality enhances the performance of reservoir computers on the specific example of classification tasks, since few works draw a direct correlation between criticality and classification accuracy13.

In a future work, we aim to replace the resistive inter-oscillator coupling with memristive coupling, similar to what has been recently done67,68. The goal is to introduce a sort of (synaptic) plasticity into the network, to autonomously drive the reservoir into a critical state. In this case, one may speak of self-organized criticality69. This will, however, require finding local update rules, both in space and time. Up to this point, most update rules are difficult to implement in analog hardware, as they require global information regarding the network activity and/or a way of storing information, e.g., via digital circuitry12,70. This work can be understood as a pre-investigation towards deriving such rules for analog neural networks, since one must first understand how criticality emerges and how it can be characterized.

Methods

Input Preparation

This subsection explains the mapping of dataset attributes onto analog pulse trains. Consider a data matrix \({\boldsymbol{M}}\in {{\mathbb{R}}}^{p\times k}\), where each row corresponds to an attribute, such as color or size, while each column corresponds to a data sample. Accordingly, p denotes the number of attributes of our dataset, while k denotes the number of samples. Our goal is to map the attributes of every data sample onto a set of pulse trains. Thus, every column of M should be represented by a set of pulse trains and supplied to a subset of oscillators within the reservoir. Assuming the entries mμ,ν of M are already given as real numbers, we start out by normalizing M, so that every attribute is described by a number in the interval [0, 1]. This can be achieved by dividing every row of M by the maximal entry of the corresponding row. The result is a normalized matrix A = [a1, a2,…, ak] with the column vectors aν comprising the entries aμ,ν. Every entry can be mapped onto a corresponding pulse train by using the mapping relation depicted in Fig. 8. Given some column aν representing the different attributes of the ν-th data sample, the first step is to map the entries of aν onto spike rates rμ,ν, where the mapping relation is given by

$$\begin{array}{r}r({a}_{\mu ,\nu })={r}_{\min }+{a}_{\mu ,\nu }[{r}_{\max }-{r}_{\min }],\end{array}$$
(1)

where \({r}_{\min }\) and \({r}_{\max }\) denote the minimal and maximal pulse train rates; their values are given in Table 1. Next, the spike rate rμ,ν is forwarded to a signal generator producing the current signal j0,μ(t) with the function

$$j({r}_{\mu ,\nu })={J}_{0}\mathop{\sum }\limits_{\ell =0}^{\infty }{\rm{rect}}\left(\frac{t-\ell {T}_{\mu ,\nu }}{{T}_{{\rm{p}}}}\right),\quad \,{\mbox{with}}\,\quad {r}_{\mu ,\nu }{T}_{\mu ,\nu }=1\quad \,{\mbox{and}}\,\quad {\rm{rect}}(x)=\left\{\begin{array}{ll}1,&0 \, < \, x \, < \, 1\\ 0,&{\rm{otherwise}}\end{array}\right.,$$
(2)

where Tp denotes the width of every pulse within the pulse train. Thus, the magnitude of the μ-th attribute is encoded onto the pulse repetition rate of the μ-th pulse train current signal j0,μ(t).
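The encoding chain described above (row-wise normalization, Eq. (1), and Eq. (2)) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names and all numerical values (`r_min`, `r_max`, `T_p`, `J_0`) are placeholders, since the values of Table 1 are not reproduced here, and the sketch assumes positive attribute values and non-overlapping pulses (Tp shorter than the pulse period).

```python
import numpy as np

def normalize_rows(M):
    """Divide every row (attribute) of M by its maximal entry."""
    M = np.asarray(M, dtype=float)
    return M / M.max(axis=1, keepdims=True)

def spike_rate(a, r_min, r_max):
    """Eq. (1): affine map from a normalized attribute to a pulse rate."""
    return r_min + a * (r_max - r_min)

def pulse_train(t, rate, T_p, J_0):
    """Eq. (2): rectangular pulses of height J_0 and width T_p that
    repeat with period 1/rate, starting at t = 0 (assumes T_p < 1/rate)."""
    period = 1.0 / rate
    phase = np.mod(t, period)          # time since the latest pulse onset
    return np.where((t > 0) & (phase > 0) & (phase < T_p), J_0, 0.0)

# Toy data matrix: 2 attributes (rows), 3 samples (columns)
M = np.array([[2.0, 4.0, 1.0],
              [10.0, 5.0, 20.0]])
A = normalize_rows(M)
rates = spike_rate(A[:, 0], r_min=1e5, r_max=5e5)    # placeholder rates in Hz
t = np.linspace(0.0, 60e-6, 6001)
j0 = pulse_train(t, rates[0], T_p=1e-6, J_0=1e-3)    # placeholder width and height
```

Each column of `A` yields one rate per attribute, and each rate drives one pulse train that would be supplied to one input oscillator.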

Fig. 8: Signal flow diagram depicting the preparation of input signals.
figure 8

Attributes aμν of the ν-th data sample are mapped to a spike rate rμ,ν via a function r(aμ,ν). The resulting spike rate rμ,ν serves as the input to a signal generator j(rμ,ν) producing a pulse train with the corresponding pulse repetition rate.

Readout layer and classification

To perform the drybeans classification task, the spike-coded signal generated from the entries of the drybeans dataset31 is fed into the 20 input nodes of the FHN network. The instantaneous voltages ui(t), i = 1,…,r, r = 20, at times t = T − jΔt, j = 1,…,q, q = 100, T = 60 μs, Δt = 0.2 μs are read out from the reservoir’s output nodes. These voltages form vectors \({{\boldsymbol{x}}}_{i}\in {{\mathbb{R}}}^{q},i=1,\ldots ,r\) (see Fig. 9). The time window of 20 μs corresponds to approximately 10 oscillation periods of the considered FHN oscillators. In total, N = 13611 sets of vectors \({{\boldsymbol{x}}}_{i}\in {{\mathbb{R}}}^{q},i=1,\ldots ,r\) and the respective labels \({{\boldsymbol{y}}}_{{\rm{GT}}}\in {\{0,1\}}^{7}\) are generated. All components of the 7-dimensional vectors yGT are zero except for a single component equal to one. The position of the non-zero component defines the corresponding bean class of the dataset entry (i.e., Seker, Barbunya, Bombay, Cali, Horoz, Sira, or Dermason).

Fig. 9: Readout and classification.
figure 9

Equidistant samples of the instantaneous voltages ui(t), i = 1,…,r of the output oscillators (red dots) are fed into the two-layered fully connected feedforward ANN, which comprises the input layer vector \({\boldsymbol{x}}={[{{\boldsymbol{x}}}_{1}^{{\rm{T}}},\ldots ,{{\boldsymbol{x}}}_{r}^{{\rm{T}}}]}^{{\rm{T}}}\), the matrix of weights \({\mathcal{W}}\), and the output layer vector \({\boldsymbol{y}}\in {[0,1]}^{7}\). The red shaded area denotes the time window from which the samples are extracted for training purposes.

The readout is an ANN consisting of two fully connected feed-forward layers of artificial neurons: the input layer, which receives the signal \({\boldsymbol{x}}={[{{\boldsymbol{x}}}_{1}^{{\rm{T}}},\ldots ,{{\boldsymbol{x}}}_{r}^{{\rm{T}}}]}^{{\rm{T}}},r=20\) from the reservoir, and the output layer \({\boldsymbol{y}}\in {[0,1]}^{7}\) with the softmax activation function, whose 7 nodes represent the probabilities that a bean belongs to each bean class. Collecting the weights of the ANN into the matrix \({\mathcal{W}}\), we get

$${\boldsymbol{y}}={\mathtt{softmax}}\left({\mathcal{W}}{\boldsymbol{x}}\right)\quad \,{\mbox{with}}\,\quad {\mathtt{softmax}}({\boldsymbol{z}})=\left(\frac{{e}^{{z}_{1}}}{\mathop{\sum }\nolimits_{j=1}^{7}{e}^{{z}_{j}}},\ldots ,\frac{{e}^{{z}_{7}}}{\mathop{\sum }\nolimits_{j=1}^{7}{e}^{{z}_{j}}}\right).$$

The categorical cross-entropy loss between the ground truth yGT and y for all training samples is used to optimize the weight matrix \({\mathcal{W}}\). The stochastic gradient descent optimization is implemented in Python using the Keras API60 for 1000 epochs using the whole training set as a single batch (with all other Keras parameters kept at their defaults). Smaller batch sizes (1000 and 256 training samples) were also tested; however, the resulting classification accuracy was slightly worse than with the single-batch approach. All generated samples were split into training and validation sets following the 10-fold cross-validation procedure: out of 10 fixed subsets of generated samples, 9 subsets were used for training and 1 subset was used for validation. We then repeated the training procedure 10 times with different validation sets and averaged the resulting classification accuracy over all 10 folds. This approach allows us to verify the consistency of the classification accuracy over the entire dataset and to compare with other classification approaches that use 10-fold cross-validation for the same dataset31,34 (see Fig. 6a and c, respectively).
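The building blocks of this procedure (softmax readout, categorical cross-entropy, and the 10-fold split) can be written down in a few lines of NumPy. The sketch below is only meant to make the definitions concrete: it does not train the weights, it is not the Keras setup used in this work, and the function names, the random stand-in data, and the zero-initialized weight matrix are assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax; subtracting the row maximum avoids overflow."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean cross-entropy between one-hot labels and predicted probabilities."""
    return float(-np.mean(np.sum(y_true * np.log(y_pred + eps), axis=1)))

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    folds = np.array_split(np.random.default_rng(seed).permutation(n_samples), k)
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, folds[i]

rng = np.random.default_rng(1)
n_samples, n_features, n_classes = 50, 2000, 7   # 2000 = 20 oscillators x 100 samples
X = rng.normal(size=(n_samples, n_features))     # stand-in for reservoir voltages
labels = rng.integers(0, n_classes, size=n_samples)
Y = np.eye(n_classes)[labels]                    # one-hot ground truth
W = np.zeros((n_classes, n_features))            # untrained weight matrix
P = softmax(X @ W.T)                             # each row is softmax(W x) for one sample
loss = categorical_cross_entropy(Y, P)
splits = list(k_fold_indices(n_samples))
```

With all weights at zero, every class receives the same probability, which provides a simple sanity check before an optimizer such as Adam is attached.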

The generation of vectors x for the study of the classification accuracy under the readout shrinkage is performed in a similar way. We test 4 different scenarios relative to the original readout: half as many samples per node, one fifth as many samples per node, half as many output nodes in the reservoir, and one fifth as many output nodes in the reservoir. In the first two cases, we use the same 20 μs-long time interval as in the initial training; however, the number of samples is q = 50 and q = 20, respectively. These correspond to sampling intervals of Δt = 0.4 μs and Δt = 1 μs. When reducing the number of output nodes, 10 and 4 random oscillators were chosen, and q = 100 equidistant instantaneous voltages were sampled from the 20 μs-long time interval (as described at the beginning of this subsection).
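The sampling scheme and its shrunk variants can be illustrated with the following sketch. It assumes that the simulated voltages are stored on a uniform time grid with a hypothetical step `dt_sim`; the function name `readout_vector` and the random voltage array are likewise illustrative and not part of the original code.

```python
import numpy as np

def readout_vector(U, dt_sim, T=60e-6, window=20e-6, q=100, nodes=None):
    """Stack q equidistant voltage samples per output node into one vector.

    U     : array (r, n_steps), U[i, k] is the voltage of node i at time k*dt_sim
    nodes : optional indices of the output nodes to keep
    Samples are taken at t = T - j*(window/q), j = 1..q.
    """
    U = np.asarray(U, dtype=float)
    if nodes is not None:
        U = U[np.asarray(nodes)]
    dt = window / q
    times = T - dt * np.arange(1, q + 1)
    idx = np.rint(times / dt_sim).astype(int)    # nearest simulation step
    return U[:, idx].reshape(-1)                 # [x_1^T, ..., x_r^T]^T

rng = np.random.default_rng(0)
U = rng.normal(size=(20, 601))                   # synthetic voltages, dt_sim = 0.1 us
x_full = readout_vector(U, dt_sim=0.1e-6)                  # 20 nodes x 100 samples
x_q50 = readout_vector(U, dt_sim=0.1e-6, q=50)             # 20 nodes x 50 samples
x_q20 = readout_vector(U, dt_sim=0.1e-6, q=20)             # 20 nodes x 20 samples
keep = rng.choice(20, size=10, replace=False)              # random subset of nodes
x_r10 = readout_vector(U, dt_sim=0.1e-6, nodes=keep)       # 10 nodes x 100 samples
```

Reducing `q` widens the sampling interval within the same 20 μs window, whereas passing `nodes` drops entire oscillators from the readout.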

Electrical Network Model

The FitzHugh-Nagumo oscillator is both a technologically realizable and biologically plausible oscillator. Originally, the oscillator was described by a unitless set of differential equations19. Later on, Nagumo, Arimoto, and Yoshizawa provided an equivalent circuit20, which inspired the circuit depicted in Fig. 1c. Its dynamic behavior can be captured by the following set of differential equations:

$$C\frac{{{{{{{{\rm{d}}}}}}}}u}{{{{{{{{\rm{d}}}}}}}}t} = {j}_{0}-{i}_{G}(u)+i+{i}_{{{{{{{{\rm{c}}}}}}}}},\quad L\frac{{{{{{{{\rm{d}}}}}}}}i}{{{{{{{{\rm{d}}}}}}}}t}={e}_{0}-{R}_{0}i-u,\quad \\ {i}_{G}(u) = {G}_{0}\left[\frac{{u}^{3}}{3{U}_{0}^{2}}-u\right],\quad u(0)={u}_{0},\quad i(0)={i}_{0},\quad {R}_{0},C,L \, > \, 0.$$
(3)

Here, u and i denote the voltage across the capacitor with capacitance C > 0 and the current flowing through the inductor with inductance L > 0, respectively, while u0 and i0 are the corresponding initial values. Furthermore, e0 and j0 denote an external input voltage and current, respectively. The current iG(u) represents the nonlinearity of the oscillator, which, in terms of electrical components, is given by a nonlinear resistor with a cubic (I, u)-curve. The nonlinearity was formerly realized by a tunnel diode20 and more recently by a combination of a negative impedance converter (NIC) and a diode clipper45,46. Finally, the current ic denotes the coupling current at the external coupling port, see the open port in Fig. 1c.
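To make Eq. (3) concrete, the following sketch integrates a single uncoupled oscillator. Two assumptions should be kept in mind: the parameter values are dimensionless placeholders and not the component values used in this work, and a generic Runge-Kutta integrator is used here in place of the wave digital algorithm described later.

```python
import numpy as np

def i_G(u, G_0, U_0):
    """Cubic nonlinearity of Eq. (3)."""
    return G_0 * (u**3 / (3.0 * U_0**2) - u)

def fhn_rhs(state, j_0, i_c, p):
    """Right-hand side of Eq. (3) for state = (u, i)."""
    u, i = state
    du = (j_0 - i_G(u, p["G_0"], p["U_0"]) + i + i_c) / p["C"]
    di = (p["e_0"] - p["R_0"] * i - u) / p["L"]
    return np.array([du, di])

def rk4_step(state, dt, j_0, i_c, p):
    """One classical Runge-Kutta step (generic integrator, not the
    wave digital algorithm)."""
    k1 = fhn_rhs(state, j_0, i_c, p)
    k2 = fhn_rhs(state + 0.5 * dt * k1, j_0, i_c, p)
    k3 = fhn_rhs(state + 0.5 * dt * k2, j_0, i_c, p)
    k4 = fhn_rhs(state + dt * k3, j_0, i_c, p)
    return state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# Dimensionless placeholder parameters, NOT the values of Table 1
params = {"C": 1.0, "L": 12.5, "R_0": 0.8, "e_0": -0.7, "G_0": 1.0, "U_0": 1.0}
state = np.array([0.0, 0.0])
trajectory = [state]
for _ in range(2000):
    state = rk4_step(state, 0.05, j_0=0.5, i_c=0.0, p=params)
    trajectory.append(state)
trajectory = np.array(trajectory)                # columns: u(t), i(t)
```

For a coupled network, `i_c` would be replaced at every step by the coupling current obtained from the neighbouring oscillators.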

A vector-valued electrical model presents a compact way of representing coupled oscillator networks. To achieve this representation, we start by considering a network consisting of n uncoupled FNOs, whose dynamic behavior can be described by vector-valued differential equations based on (3):

$$\begin{array}{r}{{{{{{{\boldsymbol{C}}}}}}}}\frac{{{{{{{{\rm{d}}}}}}}}{{{{{{{\boldsymbol{u}}}}}}}}}{{{{{{{{\rm{d}}}}}}}}t}={{{{{{{{\boldsymbol{j}}}}}}}}}_{0}-{{{{{{{{\boldsymbol{i}}}}}}}}}_{G}({{{{{{{\boldsymbol{u}}}}}}}})+{{{{{{{\boldsymbol{i}}}}}}}}+{{{{{{{{\boldsymbol{i}}}}}}}}}_{{{{{{{{\rm{c}}}}}}}}},\quad {{{{{{{\boldsymbol{L}}}}}}}}\frac{{{{{{{{\rm{d}}}}}}}}{{{{{{{\boldsymbol{i}}}}}}}}}{{{{{{{{\rm{d}}}}}}}}t}={{{{{{{{\boldsymbol{e}}}}}}}}}_{0}-{{{{{{{{\boldsymbol{R}}}}}}}}}_{0}{{{{{{{\boldsymbol{i}}}}}}}}-{{{{{{{\boldsymbol{u}}}}}}}},\\ {{{{{{{\boldsymbol{u}}}}}}}}(0)={{{{{{{{\boldsymbol{u}}}}}}}}}_{0}, \quad {{{{{{{\boldsymbol{i}}}}}}}}(0)={{{{{{{{\boldsymbol{i}}}}}}}}}_{0},\quad {{{{{{{{\boldsymbol{R}}}}}}}}}_{0}\ge {{{{{{{\bf{0}}}}}}}},\quad {{{{{{{\boldsymbol{C}}}}}}}},{{{{{{{\boldsymbol{L}}}}}}}} \, > \, {{{{{{{\bf{0}}}}}}}}.\end{array}$$
(4)

For the sake of simplicity, we choose \({\boldsymbol{C}}=C{\bf{1}}\), \({\boldsymbol{L}}=L{\bf{1}}\), and \({{\boldsymbol{R}}}_{0}={R}_{0}{\bf{1}}\), where \({\bf{1}}\) denotes the unit matrix; this corresponds to the simplifying, though not strictly necessary, assumption of identical oscillators. The dynamical variables of the ν-th oscillator are collected in the vectors \({\boldsymbol{u}}={[{u}_{1},{u}_{2},\ldots ,{u}_{\nu },\ldots ,{u}_{n}]}^{{\rm{T}}}\) and \({\boldsymbol{i}}={[{i}_{1},{i}_{2},\ldots ,{i}_{\nu },\ldots ,{i}_{n}]}^{{\rm{T}}}\), respectively. As such, the vector-valued function \({{\boldsymbol{i}}}_{G}({\boldsymbol{u}})\) can be understood as an element-wise evaluated version of the scalar function iG(u) in (3). Lastly, we define the vectors of input quantities \({{\boldsymbol{e}}}_{0}={e}_{0}{\mathbb{1}},{{\boldsymbol{j}}}_{0}={[{j}_{0,1},{j}_{0,2},\ldots ,{j}_{0,\nu },\ldots ,{j}_{0,n}]}^{{\rm{T}}}\), and \({{\boldsymbol{i}}}_{{\rm{c}}}={[{i}_{{\rm{c}},1},{i}_{{\rm{c}},2},\ldots ,{i}_{{\rm{c}},\nu },\ldots ,{i}_{{\rm{c}},n}]}^{{\rm{T}}}\), where \({\mathbb{1}}\) denotes a vector of ones.

Let Rμν be the coupling resistor interconnecting the μ-th and ν-th oscillator (μ < ν) and Gμν = 1/Rμν be its inverse value, which we term the coupling strength. Then the voltage across and the current through every coupling resistor are given by:

$$\begin{array}{r}{v}_{\mu \nu }={u}_{\mu }-{u}_{\nu }\quad \,{{\mbox{and}}}\,\quad {j}_{\mu \nu }={G}_{\mu \nu }{v}_{\mu \nu },\quad \,{{\mbox{with}}}\,\quad {i}_{{{{{{{{\rm{c}}}}}}}},\mu }=\mathop{\sum }\limits_{\mu > \nu }^{n}{j}_{\nu \mu }-\mathop{\sum }\limits_{\mu < \nu }^{n}{j}_{\mu \nu }.\end{array}$$
(5)

To write this relation compactly, we define the incidence matrix \({\boldsymbol{N}}\in {{\mathbb{R}}}^{n\times {n}_{k}}\) of the underlying interconnection graph, where n is the number of oscillators and nk is the number of resistive connections. For every resistive connection Rμν, the matrix contains a corresponding column whose μ-th and ν-th rows have the entries 1 and −1, respectively; all other entries in the column are zero. With this definition, we can rewrite (5) compactly as

$${\boldsymbol{v}}={{\boldsymbol{N}}}^{{\rm{T}}}{\boldsymbol{u}},\quad {\boldsymbol{j}}={{\boldsymbol{G}}}_{{\rm{d}}}{\boldsymbol{v}},\quad \,{\mbox{with}}\,\quad {{\boldsymbol{i}}}_{{\rm{c}}}=-{\boldsymbol{N}}{\boldsymbol{j}}\quad \Rightarrow \quad {{\boldsymbol{i}}}_{{\rm{c}}}=-{{\boldsymbol{W}}}_{{\rm{c}}}{\boldsymbol{u}},\quad \,{\mbox{with}}\,\quad {{\boldsymbol{W}}}_{{\rm{c}}}={{\boldsymbol{W}}}_{{\rm{c}}}^{{\rm{T}}}={\boldsymbol{N}}{{\boldsymbol{G}}}_{{\rm{d}}}{{\boldsymbol{N}}}^{{\rm{T}}},$$
(6)

where v and j are vectors whose entries are the voltages vμν across and the currents jμν through the coupling conductances, respectively, sorted first according to μ and then according to ν. The matrix Gd is a diagonal matrix with the diagonal entries Gμν, sorted in the same way as the two aforementioned vectors. Moreover, the matrix Wc can be understood as the coupling network’s weighted Laplacian, with the unit of a conductance, where the weights are given by the conductances Gμν. The relation between ic and u constitutes that of a so-called multi-port resistor71.
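The construction of the incidence matrix and the resulting weighted Laplacian can be checked numerically with a small example. The ring topology, the resistance values, the node voltages, and the function name `incidence_matrix` below are illustrative assumptions; indices are 0-based, as is customary in Python.

```python
import numpy as np

def incidence_matrix(n, edges):
    """Column k has +1 in row mu and -1 in row nu for the k-th edge (mu, nu), mu < nu."""
    N = np.zeros((n, len(edges)))
    for k, (mu, nu) in enumerate(edges):
        N[mu, k], N[nu, k] = 1.0, -1.0
    return N

# Hypothetical 4-oscillator ring with coupling resistances in ohms
edges = [(0, 1), (0, 3), (1, 2), (2, 3)]         # sorted by mu, then nu
R = np.array([18e3, 18e3, 11.5e3, 11.5e3])
N = incidence_matrix(4, edges)
G_d = np.diag(1.0 / R)
W_c = N @ G_d @ N.T                              # weighted Laplacian, Eq. (6)

u = np.array([1.0, -0.5, 0.2, 0.0])              # example node voltages
v = N.T @ u                                      # voltages across the resistors
j = G_d @ v                                      # currents through the resistors
i_c = -N @ j                                     # coupling currents into the oscillators
```

Since every column of the incidence matrix sums to zero, the rows of `W_c` sum to zero and the coupling currents add up to zero, which reflects that a purely resistive coupling network only redistributes current between the oscillators.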

Combining the vector-valued differential equation with the compact relation (6), we obtain the vector-valued circuit depicted at the left of Fig. 10. In the following, we make use of this circuit to derive a wave digital model with which we can emulate an arbitrarily large network of resistively coupled FNOs.

Fig. 10: Electrical and wave digital model of a FitzHugh-Nagumo oscillator network.
figure 10

a Vector-valued circuit representing a network of n coupled FitzHugh-Nagumo oscillators. The coupling is represented by the multiport conductance Wc. The current source j0 is used as an excitation source in order to supply pulse trains to the oscillators. b Wave digital model of the reference circuit in a. The wave digital model and the reference circuit have a 1-to-1 correspondence. The current source j0 translates to a reflective wave source. The ideal voltage source e0 translates to a non-reflective wave source. Both the capacitor and inductor translate to delay elements. The multiport conductance translates to a scattering matrix. Lastly, the series and parallel interconnection translate to a series and parallel adaptor, respectively.

Wave digital emulation

The wave digital concept is a powerful tool for emulating electrical circuits. The idea is to map a given reference circuit onto a signal flow diagram with wave quantities, the so-called wave flow diagram. An iterative evaluation of the signal flow diagram, which we refer to as the wave digital algorithm, allows for a real-time emulation of the electrical circuit. Furthermore, the compact representation of a wave flow diagram is referred to as the wave digital model, whose ports have a 1-to-1 correspondence with those of the reference circuit, see, for example, the right side of Fig. 10. To obtain this representation, we start by decomposing the reference circuit into a set of one- and multi-ports, see the left side of Fig. 10. The current and voltage at every port are related by some constitutive relation that is dependent on the electrical component at the same port. Every one of these components can be translated into the wave digital domain by applying the bijective mapping relation:

$${{{{{{{\boldsymbol{a}}}}}}}}= \, {{{{{{{\boldsymbol{u}}}}}}}}+{{{{{{{\boldsymbol{R}}}}}}}}{{{{{{{\boldsymbol{i}}}}}}}},\quad {{{{{{{\boldsymbol{b}}}}}}}}={{{{{{{\boldsymbol{u}}}}}}}}-{{{{{{{\boldsymbol{R}}}}}}}}{{{{{{{\boldsymbol{i}}}}}}}},\quad {{{{{{{\boldsymbol{R}}}}}}}} \, > \, 0,\quad {{{{{{{\boldsymbol{a}}}}}}}}\in {{\mathbb{R}}}^{p\times 1},\quad {{{{{{{\boldsymbol{b}}}}}}}}\in {{\mathbb{R}}}^{p\times 1},\quad \\ {{{{{{{\boldsymbol{R}}}}}}}},{{{{{{{\boldsymbol{G}}}}}}}}\in {{\mathbb{R}}}^{p\times p}.$$
(7)

Here, a and b denote the vectors of incident and reflected waves, respectively. Moreover, R is a diagonal positive-definite matrix, the so-called port resistance matrix; certain choices of R can greatly simplify the resulting wave flow diagram. Note that differential constitutive relationships, for example those of capacitors and inductors, must first be numerically integrated. In this work, we make use of the trapezoidal rule. For an overview of how different electrical components are translated to wave digital structures, the interested reader is referred to52.
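The bijectivity of the mapping (7) is easy to verify numerically: since R is positive definite, the voltage and current can be recovered as u = (a + b)/2 and i = R⁻¹(a − b)/2. The following sketch uses hypothetical port resistances and port quantities; the function names are assumptions made for illustration.

```python
import numpy as np

def to_waves(u, i, R):
    """Eq. (7): port voltages and currents -> incident and reflected waves."""
    return u + R @ i, u - R @ i

def from_waves(a, b, R):
    """Inverse of Eq. (7): u = (a + b)/2 and i = R^{-1}(a - b)/2."""
    return 0.5 * (a + b), np.linalg.solve(R, 0.5 * (a - b))

R = np.diag([50.0, 75.0, 100.0])     # hypothetical port resistances in ohms
u = np.array([1.0, -2.0, 0.5])       # example port voltages
i = np.array([0.01, 0.02, -0.03])    # example port currents
a, b = to_waves(u, i, R)
u_back, i_back = from_waves(a, b, R)
```

Because the round trip reproduces the original voltages and currents, no information is lost when a circuit element is described in terms of waves.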

Now, we translate the circuit on the left side of Fig. 10 into the wave digital domain. The capacitor (inductor) translates to a delay element (with sign inversion) with the port resistance(s)

$${{{{{{{{\boldsymbol{R}}}}}}}}}_{C}=\frac{T}{2}{{{{{{{{\boldsymbol{C}}}}}}}}}^{-1}\quad \,{{\mbox{and}}}\,\quad {{{{{{{{\boldsymbol{R}}}}}}}}}_{L}=\frac{2}{T}{{{{{{{\boldsymbol{L}}}}}}}},$$
(8)

where T denotes the sampling period of the wave digital algorithm. The voltage source with the internal resistance R0 translates to a wave source supplying the wave ae = −e0. The multi-port resistor Wc representing the coupling network translates to a scattering matrix with

$${{{{{{{{\boldsymbol{S}}}}}}}}}_{{{{{{{{\rm{c}}}}}}}}} = {[{{{{{{{\bf{1}}}}}}}}+{{{{{{{{\boldsymbol{W}}}}}}}}}_{{{{{{{{\rm{c}}}}}}}}}{{{{{{{{\boldsymbol{R}}}}}}}}}_{{{{{{{{\rm{c}}}}}}}}}]}^{-1}[{{{{{{{\bf{1}}}}}}}}-{{{{{{{{\boldsymbol{W}}}}}}}}}_{{{{{{{{\rm{c}}}}}}}}}{{{{{{{{\boldsymbol{R}}}}}}}}}_{{{{{{{{\rm{c}}}}}}}}}]\quad \,{{\mbox{and}}}\,\quad \\ {{{{{{{{\boldsymbol{R}}}}}}}}}_{{{{{{{{\rm{c}}}}}}}}} = {\left[{{{{{{{{\boldsymbol{R}}}}}}}}}_{j}^{-1}+{[{{{{{{{{\boldsymbol{R}}}}}}}}}_{L}+{{{{{{{{\boldsymbol{R}}}}}}}}}_{0}]}^{-1}\right]}^{-1}.$$
(9)
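The scattering matrix in Eq. (9) can be evaluated directly once Wc and Rc are known. The sketch below uses a hypothetical three-oscillator chain and a placeholder port resistance that is a multiple of the unit matrix (identical oscillators); neither is taken from the actual network.

```python
import numpy as np

def scattering_matrix(W_c, R_c):
    """Eq. (9): S_c = (1 + W_c R_c)^{-1} (1 - W_c R_c)."""
    one = np.eye(W_c.shape[0])
    return np.linalg.solve(one + W_c @ R_c, one - W_c @ R_c)

# Hypothetical 3-oscillator chain with equal coupling conductances
G = 1.0 / 18e3
W_c = G * np.array([[1.0, -1.0, 0.0],
                    [-1.0, 2.0, -1.0],
                    [0.0, -1.0, 1.0]])
R_c = 1e3 * np.eye(3)                 # placeholder port resistance, identical oscillators
S_c = scattering_matrix(W_c, R_c)
```

Because the rows of a weighted Laplacian sum to zero, a uniform incident wave is reflected unchanged in this setting, which corresponds to the fact that no coupling current flows when all oscillator voltages are equal.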

The series connection (parallel connection) translates to a so-called series adaptor (parallel adaptor), depicted as square boxes on the right side of Fig. 10. Finally, the current source, with the nonlinear internal resistance iG(u), is treated like an ideal voltage-controlled current source, such that it translates to the reflective wave source supplying the wave aj.

In a final note, we would like to discuss the topic of directed delay-free loops in our wave flow diagram. The wave flow diagram, being a directed signal flow diagram, can sometimes contain directed delay-free loops. The latter correspond to implicit relationships that must be resolved, as the wave flow diagram can otherwise not be evaluated. Such loops can be eliminated in two different ways, namely by using reflection-free ports or iteration methods. The former means that the reflected wave at the corresponding port is not dependent on its incident wave52. The latter is a method, where a missing wave is approximated by iteratively evaluating the part of the wave flow diagram, where the wave appears72. Our wave digital algorithm makes use of both of these techniques. In the right part of Fig. 10 the parallel adaptor port directed towards the scattering matrix Sc and the series adaptor port directed towards the parallel adaptor are both chosen to be reflection-free, denoted by the T-shaped symbol, as this eliminates delay-free loops that emerge, when interconnecting these wave digital structures. Furthermore, we make use of a fixed-point iteration to resolve the implicit relationship at the left port of the parallel adaptor, given by:

$$\boldsymbol{a}_{j}=2\boldsymbol{R}_{j}[\,\boldsymbol{j}_{0}-\boldsymbol{i}_{G}(\boldsymbol{u})],\quad \text{with}\quad \boldsymbol{u}=\frac{\boldsymbol{a}_{j}+\boldsymbol{b}_{j}}{2}.$$
(10)

This relation is implicit because aj depends on u, which in turn depends on aj.
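
A fixed-point iteration for Eq. (10) can be sketched as follows. This is a minimal illustration under stated assumptions: Rj is taken to be diagonal and stored as the vector `r_j`, the cubic characteristic in `i_G` is a hypothetical stand-in for the actual nonlinearity, and the initial guess, tolerance, and iteration limit are arbitrary choices. The iteration only converges if the update map is a contraction, which holds for the example values below but is not guaranteed in general.

```python
import numpy as np

def i_G(u, g=1.0):
    # Hypothetical cubic nonlinearity, used only to make the sketch runnable
    return g * (u**3 / 3.0 - u)

def solve_incident_wave(b_j, j_0, r_j, a_init=None, tol=1e-12, max_iter=200):
    """Fixed-point iteration for a_j = 2 R_j [j_0 - i_G((a_j + b_j) / 2)].

    r_j holds the diagonal entries of R_j (assumed diagonal here).
    """
    a_j = np.zeros_like(b_j, dtype=float) if a_init is None else np.array(a_init, dtype=float)
    for _ in range(max_iter):
        u = 0.5 * (a_j + b_j)                # voltage from incident and reflected waves
        a_next = 2.0 * r_j * (j_0 - i_G(u))  # right-hand side of Eq. (10)
        if np.max(np.abs(a_next - a_j)) < tol:
            return a_next
        a_j = a_next
    raise RuntimeError("fixed-point iteration did not converge")

# Hypothetical example values for two oscillators
b_j = np.array([0.1, -0.2])
j_0 = np.array([0.2, 0.0])
r_j = np.array([0.5, 0.5])
a_j = solve_incident_wave(b_j, j_0, r_j)
```

In a time-stepping emulation, the solution from the previous sampling instant would be a natural candidate for `a_init`, although whether the algorithm described here does so is not stated.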

Average dissipated power

The average dissipated power is a measure of how much power is exchanged through the resistive coupling network over time. In other words, it is a measure of how much communication takes place between the oscillators, averaged over a long time frame. To calculate this quantity, we start with the instantaneous power pc(t) of the resistive coupling network:

$$p_{\mathrm{c}}=\boldsymbol{v}^{\mathrm{T}}\boldsymbol{j}=-\boldsymbol{u}^{\mathrm{T}}\boldsymbol{i}_{\mathrm{c}}=\boldsymbol{u}^{\mathrm{T}}\boldsymbol{W}_{\mathrm{c}}\boldsymbol{u}.$$
(11)

Considering that we only have access to the discrete time points tk = t0 + kT, where t0 is a reference time and T is the wave digital algorithm’s sampling period, the average dissipated power can be calculated by averaging the instantaneous power over all discrete time instants:

$$\bar{p}_{\mathrm{c}}=\frac{1}{K}\sum_{k=0}^{K-1}p_{\mathrm{c}}(t_{k}),$$
(12)

where K is the number of sampling points in the wave digital emulation. However, it may also be beneficial to consider the instantaneous power pc(t) over single resistive connections:

$$\boldsymbol{p}_{\mathrm{c}}=\mathrm{diag}(\boldsymbol{v})\boldsymbol{G}_{\mathrm{d}}\boldsymbol{v}=\mathrm{diag}(\boldsymbol{N}^{\mathrm{T}}\boldsymbol{u})\boldsymbol{G}_{\mathrm{d}}\boldsymbol{N}^{\mathrm{T}}\boldsymbol{u},\quad \text{with}\quad p_{\mathrm{c}}=\mathbb{1}^{\mathrm{T}}\boldsymbol{p}_{\mathrm{c}}.$$
(13)

The entries of pc(t) describe the power that is dissipated by the (μ, ν)-th resistive connection at the time instant t (the entries of pc are sorted in the same manner as the diagonal entries of Gd). Hence, they are a measure of how much communication takes place between the μ-th and ν-th oscillators.
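
Equations (11) to (13) can be evaluated directly from a sampled voltage time series, as the following minimal sketch shows. It assumes that the samples u(tk) are stored row-wise in an array of shape (K, n), that N is an incidence matrix of shape (n, m) for m resistive connections, and that the diagonal of Gd is stored as the vector `g_d`. The relation Wc = N Gd N^T used in the example is inferred from the consistency of Eqs. (11) and (13). All function names and numerical values are hypothetical.

```python
import numpy as np

def instantaneous_power(u, W_c):
    """Eq. (11): p_c(t_k) = u^T W_c u for every row u(t_k) of u."""
    return np.einsum("ki,ij,kj->k", u, W_c, u)

def average_dissipated_power(u, W_c):
    """Eq. (12): mean of the instantaneous power over all K samples."""
    return instantaneous_power(u, W_c).mean()

def power_per_connection(u, N, g_d):
    """Eq. (13): rows index time instants, columns index connections."""
    v = u @ N           # each row is v(t_k) = N^T u(t_k)
    return v * g_d * v  # elementwise form of diag(v) G_d v

# Hypothetical example: chain of three oscillators with two connections
N = np.array([[1.0, 0.0],
              [-1.0, 1.0],
              [0.0, -1.0]])
g_d = np.array([0.5, 2.0])
W_c = N @ np.diag(g_d) @ N.T

rng = np.random.default_rng(0)
u = rng.standard_normal((1000, 3))  # placeholder for emulated voltages

p_avg = average_dissipated_power(u, W_c)
p_edges = power_per_connection(u, N, g_d)
```

Summing `p_edges` over its columns recovers the total instantaneous power of Eq. (11), which offers a simple consistency check for the chosen incidence matrix.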

Since the FNOs communicate by exchanging voltage spikes, which induce a current flow through the corresponding coupling resistors, the dissipated power is tightly correlated with transfer entropy73,74 and can serve as a simple measure of information flow. However, while transfer entropy requires a sophisticated statistical analysis, the average dissipated power can be calculated easily from knowledge of the network's topology and the oscillators' voltage time series.