Physical models typically assume time-independent interactions, whereas neural networks and machine learning incorporate interactions that function as adjustable parameters. Here we demonstrate a new type of abundant cooperative nonlinear dynamics where learning is attributed solely to the nodes, rather than to the network links, which are significantly more numerous. The fast nodal (neuronal) adaptation follows the relative timings of its anisotropic (dendritic) inputs, as indicated experimentally, similarly to the slow learning mechanism currently attributed to the links (synapses). It represents a non-local learning rule, where effectively many incoming links to a node concurrently undergo the same adaptation. Counterintuitively, the network dynamics is now governed by the weak links, which were previously assumed to be insignificant. This cooperative nonlinear dynamic adaptation presents a self-controlled mechanism that prevents divergence or vanishing of the learning parameters, as opposed to learning by links, and also supports self-oscillations of the effective learning parameters. It hints at a hierarchical computational complexity of nodes, following their number of anisotropic inputs, and opens new horizons for advanced deep learning algorithms and artificial-intelligence-based applications, as well as a new mechanism for enhanced and fast learning by neural networks.
Research in neurophysiology reveals detailed structures of neural networks, including a large diversity among their building blocks. The coarse picture is a unidirectional network, where each neuron collects the incoming signals from its input neurons (Fig. 1a). Specifically, the output neuron collects its incoming inputs via several dendritic trees1 (light-blue lines in Fig. 1a), where each input neuron transmits its output signal via a single axon (red lines in Fig. 1a). The connections between the many branches of the axon and the dendritic trees are called synapses (green-stars in Fig. 1a), which bridge between the output and the input signals. The current assumption is that synapses slowly change their strengths during the learning process2,3,4,5. A simplified scheme is demonstrated in Fig. 1b, where the branches of each neuronal dendritic tree are represented by a bar (light-blue), with multiple connecting synapses (green-arrows), where each input neuron (gray) can be connected to several dendritic trees, a scheme which is also used in deep learning6,7,8.
A network of connecting neurons consists of N × C learning parameters (synapses), where N and C stand for the number of nodes and the average number of incoming connections per node, respectively (Fig. 1c1). In the approach suggested here, the network consists of N × N_D learning parameters, which are the strengths of the dendritic trees, where N_D is the average number of dendritic trees per neuron (Fig. 1c2). The parameter C scales as O(N) in fully connected or dense networks and is estimated9 in neural networks to be O(10^4). This number is significantly larger than N_D, since typically there are only a few dendritic trees per neuron9,10,11.
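The gap between the two parameter counts can be illustrated with a minimal sketch. The function name and the specific values of N and N_D are assumptions chosen for illustration; only the order of magnitude of C comes from the text.

```python
def count_learning_parameters(n_nodes, avg_inputs, avg_dendrites):
    """Return (synaptic, dendritic) numbers of adjustable parameters."""
    synaptic = n_nodes * avg_inputs       # N x C: one parameter per link
    dendritic = n_nodes * avg_dendrites   # N x N_D: one parameter per dendritic tree
    return synaptic, dendritic


# Illustrative values only: C ~ 10^4 as estimated in the text;
# N = 1000 and N_D = 5 are assumed stand-ins for "a few" dendritic trees.
synaptic, dendritic = count_learning_parameters(
    n_nodes=1000, avg_inputs=10_000, avg_dendrites=5
)
print(synaptic, dendritic, synaptic // dendritic)
```

With these assumed values the dendritic scenario has C / N_D = 2000 times fewer adjustable parameters.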
Our central objective is to compare the cooperative dynamical properties of the synaptic (link) and dendritic (nodal) learning scenarios (Fig. 1c). In particular, we examine whether attributing the learning process to far fewer adjustable parameters, N × N_D, enriches or diminishes the learning capabilities. Finally, the consensus that the learning process is attributed solely to the synapses is questioned. A new type of experiment strongly indicates that a faster and enhanced learning process occurs in the neuronal dendrites, similarly to what is currently attributed to the synapses3,11.
The two types of learning processes are first compared using a prototypical feedforward network, a perceptron12,13,14 consisting of three input nodes, one output node and three weights with given delays and initial strengths (Fig. 2a). For synaptic learning, the adjustable parameters are the three weight strengths (color coded in Fig. 2a). For dendritic learning the weight strengths (W) are unchanged and the adjustable parameters are the two dendritic strengths (WD in Fig. 2b), connected to the first and to the last two input units, respectively (Fig. 2b). The synaptic and dendritic adaptations are identical and are based on the currently accepted modified Hebbian learning rule, known as spike-timing-dependent plasticity3,4,11. Specifically, the relative change in the strength of a weight, δW, during a learning step is a function of the time-lag, Δ, between an above-threshold stimulation, resulting in an evoked spike, and a stimulation that does not result in an evoked spike, e.g. a sub-threshold stimulation. A positive/negative Δ strengthens/weakens a weight following a typical profile (Fig. 2c).
The input units are stimulated above threshold simultaneously at 10 Hz and the following standard leaky integrate-and-fire model15,16 is used to evaluate the dynamics of the output neuron for both scenarios (Fig. 2a,b). Specifically, we simulated a perceptron consisting of N = 3 excitatory leaky integrate-and-fire input neurons and one output neuron. The voltage V(t) of the output neuron is given by the equation:
$$\frac{dV}{dt} = -\frac{V - V_{st}}{\tau} + \sum_{i=1}^{N} W_i \, W_{Di} \sum_{n} \delta\big(t - t_i(n) - d_i\big) \qquad (1)$$
where Wi and WDi are the strengths of the connection and of the dendrite from neuron i to the output neuron, respectively, di is the delay from neuron i to the output neuron, and N stands for the number of input neurons. τ = 20 ms is the membrane time constant and Vst = −70 mV stands for the stable membrane (resting) potential. The summation over ti(n) sums all the firing times of neuron i. A neuronal threshold is defined as Vth = −54 mV and a threshold crossing results in an evoked spike. In the event of synaptic learning, for every pair of a sub-threshold stimulation and an evoked spike, the weights, Wi, were modulated according to the learning curve (Fig. 2c). Similarly, in the event of dendritic learning, for every pair of a sub-threshold stimulation and an evoked spike, originating from two different dendrites, the dendritic weights, WDi, were modulated following the learning curve (Fig. 2c, see Methods for more details).
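The voltage dynamics described above can be sketched as an event-driven simulation of a single stimulation cycle. This is a minimal illustration, not the authors' code: it uses the scaled units of the Methods (threshold 1, resting potential 0), treats each input as an instantaneous voltage jump of size W·WD, and assumes that inputs arriving during the 2 ms refractory period are discarded. The function name `run_cycle` and the example weights are hypothetical.

```python
import math

TAU_MS = 20.0         # membrane time constant, from the text
REFRACTORY_MS = 2.0   # refractory period, from Methods
V_TH = 1.0            # scaled threshold (Methods: V_th = 1, V_st = 0)


def run_cycle(delays_ms, weights, dendrite_of, dendrite_strengths):
    """Simulate one stimulation cycle of the output neuron.

    Returns a list of (arrival_time_ms, input_index, spiked) tuples.
    """
    order = sorted(range(len(delays_ms)), key=lambda i: delays_ms[i])
    v, t_prev, refractory_until = 0.0, 0.0, -math.inf
    log = []
    for i in order:
        t = delays_ms[i]
        v *= math.exp(-(t - t_prev) / TAU_MS)  # leak toward rest between arrivals
        t_prev = t
        if t < refractory_until:
            v = 0.0  # assumption: arrivals during the refractory period are discarded
            log.append((t, i, False))
            continue
        v += weights[i] * dendrite_strengths[dendrite_of[i]]  # effective weight W * W_D
        spiked = v >= V_TH
        if spiked:
            v = 0.0  # voltage is reset to the (scaled) resting potential
            refractory_until = t + REFRACTORY_MS
        log.append((t, i, spiked))
    return log


# Delays as in Fig. 2 (12, 7 and 15 ms); the weights are hypothetical values.
log = run_cycle(
    delays_ms=[12.0, 7.0, 15.0],
    weights=[1.2, 0.3, 0.3],
    dendrite_of=[0, 1, 1],          # input 0 on the left dendrite, inputs 1-2 on the right
    dendrite_strengths=[1.0, 1.0],  # initial dendritic strengths
)
print(log)
```

With these assumed weights only the 12 ms input evokes a spike, while the 7 ms and 15 ms inputs remain sub-threshold.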
For synaptic learning (Fig. 2d, left), evoked spikes are initially generated by the orange-weight, and the preceding/later sub-threshold stimulation weakens/strengthens the green/red weight, respectively (Fig. 2c). Asymptotically, the green-weight vanishes and the red-weight is at threshold and the perceptron repeatedly generates pairs of evoked spikes (Fig. 2d, left).
For dendritic learning (Fig. 2d, right), an evoked spike is generated by the left dendrite after 12 ms (orange) and two sub-threshold stimulations arrive via the right dendrite after 7 ms (green) and 15 ms (red), i.e. 5 ms before and 3 ms after the evoked spike, respectively. Consequently, the right dendrite is strengthened on average (Fig. 2c). Asymptotically, all three effective weights, W*WD, are above threshold and generate triplets of evoked spikes (Fig. 2d, right).
For synaptic learning, the same perceptron with different initial weights converges to the same firing pattern and weight strengths (Fig. 2d,e, left). For dendritic learning (Fig. 2e, right), the right dendrite strengthens such that the red weight is effectively above threshold, while the green weight is still sub-threshold (Fig. 2e, right at ~15 s). The firing patterns now consist of orange-red pairs of spikes (Fig. 2e, right); however, the learning process continues. The green sub-threshold stimulation arrives before the orange spike, resulting in the weakening of the right dendrite and the termination of red spikes. The right dendrite is then strengthened again as in Fig. 2d, and so forth, resulting in a complex firing pattern with a longer periodicity.
The examples presented in Fig. 2d,e hint at two major differences between the two learning scenarios. For the same architecture but different initial weights, synaptic learning tends to stabilize on the same firing pattern, whereas dendritic learning may result in a variety of firing patterns. In addition, synaptic learning drives weights to extreme limits17,18, vanishing or threshold, whereas dendritic learning enables stabilization around intermediate values.
An extension to a perceptron with seven inputs (N = 7 in the abovementioned equation) and with three dendrites enriches the fundamental differences between the two adaptive dynamics (Fig. 3a,b). The seven delays (Fig. 3a,b, bottom) and initial weights (Fig. 3c1 and d1, top) are identical for both scenarios and the input units are simultaneously stimulated above threshold at 10 Hz. Weights in synaptic learning are again driven toward vanishing or threshold limits (Figs 2d,e and 3c1); dendritic learning, however, reveals a new phenomenon, oscillatory behavior of the weights. These trends are explained using several snapshots of the effective weights, color coded and ordered following their delays, representing different stages of the dynamics (Fig. 3c2 and d2). Since the neuronal voltage decays to the resting potential after an input arrival (Methods), a necessary condition to generate an evoked spike is an effective weight that reaches Vth − V(t), the difference between the threshold and the current neuronal voltage. For synaptic learning, initially only the dark-orange weight is at threshold (panel A in Fig. 3c2). Following the learning rule (Fig. 2c), the strengths of all weights with longer/shorter delays increase/decrease (panel B in Fig. 3c2), until only vanishing weights or weights at threshold remain (panel C in Fig. 3c2).
For dendritic learning, initially only the effective orange weight (W*WD) is at threshold (panel A in Fig. 3d2), generating a spike 40 ms after each input stimulation. Consequently, the red dendrite and effectively its three incoming weights are strengthened (panel B in Fig. 3d2), since its nearby sub-threshold input, via the 50 ms pink weight, arrives 10 ms later (Fig. 3b). Similarly, the strength of the green dendrite decreases as it generates sub-threshold stimulations prior to the evoked spikes. Spikes are now generated after 5 ms and also after 50 ms (panel B in Fig. 3d2) and the strength of the orange dendrite rapidly decreases, since its sub-threshold input arrives just before, after 46 ms (the origin of the orange decay-slope shape is demonstrated in Fig. S1). The red evoked spike at 5 ms now rapidly strengthens the green dendrite (with 20 ms and 25 ms delays) until it generates evoked spikes (panels B and C in Fig. 3d2). The 10 ms red-weight sub-threshold stimulations, arriving before the green spikes, weaken the red dendrite and the red spikes terminate (panel C in Fig. 3d2). In addition, the orange dendrite is strengthened and finally generates evoked spikes, as its sub-threshold stimulations arrive after the green spikes (panel D in Fig. 3d2). The green spikes now terminate, as a result of the green sub-threshold stimulation at 25 ms, prior to the orange spikes (panels D,E in Fig. 3d2). A loop of the weight strengths emerges (panels E and A in Fig. 3d2), generating an oscillatory behavior (Fig. 3d1). Identical architectures (Fig. 3a,b) with different initial weights (Fig. 3e,f) again result in extreme limit weights for synaptic learning, but in a different oscillatory behavior for dendritic learning.
Synaptic learning terminates in vanishing or threshold weights, independent of the initial conditions (Figs 2 and 3), which is biologically unrealistic. In addition, the large fraction of very weak weights has practically no impact on the dynamics. In contrast, dendritic learning can stabilize weights with intermediate strengths (Fig. 2e, right) and oscillatory behaviors (Fig. 3d1,f), which are significantly and instantaneously governed by the sub-threshold stimulations originating from the weak effective couplings.
The theoretical concept of dendritic learning is supported by the following new type of in-vitro experiments, where synaptic blockers are added to neuronal cultures such that sparse synaptic connectivity is excluded (Methods). A multi-electrode array (Fig. S2) is used to stimulate extracellularly a patched neuron19 via its dendrites (Fig. 4a). An online method is used to identify a subset of extracellular electrodes which reliably generate intracellularly evoked spikes (Fig. 4b). Low stimulation rates (e.g. 1 Hz) ensure stable neuronal response latencies, NRL, measuring the time-lag between the extracellular stimulation and the intracellularly recorded evoked spike, which is crucial for controlling the relative timings between pairs of intra- and extracellular stimulations20.
The learning process is based on a training set of typically 50 pairs of stimulations: an above-threshold intracellular stimulation followed by an extracellular stimulation that does not result in an evoked spike, e.g. a sub-threshold stimulation (Fig. 4c), arriving after a predefined delay, typically 2–5 ms, to enhance possible adaptation (Fig. 2c, Methods). We take into account only experimental realizations in which the sub-threshold stimulation following the above-threshold one produced a visible local depolarization (Fig. 4d). The demonstrated results were quantitatively repeated tens of times on many cultures (see statistical analysis in Methods).
The intracellular voltage recordings of a patched neuron stimulated extracellularly before and a few minutes after training (Fig. 4c) show a significant learning effect in the form of a 200–300% increase in the local depolarization (Fig. 4e). This learning effect emerges only a few minutes after the termination of the training procedure and was found to be stable and persistent over longer periods (by repeated measurements of solely extracellular stimulations over tens of minutes). Further evidence for such learning is the enhancement of the effect of extracellular stimulations from a small local depolarization to evoked spikes recorded intracellularly (Fig. 4f). Note that before training, the responsiveness of neurons was found to be time-independent over tens of minutes.
A reverse learning procedure, presenting the sub-threshold stimulation prior to the above-threshold one, was also examined in tens of experiments, indicating no effect or weakening of the local depolarization, but no strengthening (Fig. 5). It suggests, as indicated by some preliminary results, the possibility to first strengthen and then weaken the local depolarization, using sequential learning and reverse learning.
Most of the neural network links have relatively weak strengths in comparison to the threshold21,22. Hence, a persistent cooperation among many stimulation timings is required to reliably influence the dynamics; otherwise most of the links are dynamically insignificant. Using a nodal (dendritic) learning rule, we show that the dynamics is counterintuitively governed mainly by the weak links (Figs 2 and 3). Interestingly, the nodal learning exhibits a self-controlled mechanism for achieving intermediate and oscillatory weight strengths, as opposed to learning by the links, and hints at new horizons for online learning23. The emergence of fast (Fig. 2e) and slow (Fig. 3) oscillations as a result of the learning process might be related to high cognitive functionalities and might be a source of transitory binding activities among macroscopic cortical regions24. These oscillations were found to be robust also to the anisotropic nature of neurons25,26 and have to be distinguished from oscillations emerging from the stochastic neuronal responses20. The presented nodal adaptation calls into question the objective of the similar, currently accepted slower learning rules for the links, which operate over tens of minutes and are probably carried out in a serial manner (Fig. 1).
The experimental results were obtained using solely cortical pyramidal neurons (Methods), and their generality should be examined using other types of neurons. In addition, the experiments were designed such that the sub-threshold stimulation arrives shortly after or before the spike (2–5 ms) in order to enhance the effect of adaptation. To recover the full learning curve (Fig. 2c), more detailed experiments are required.
The adaptation process was examined when an extracellular sub-threshold stimulation was given after or before an intracellular above-threshold stimulation. Preliminary results indicate that a similar adaptation occurs also in the scenario of solely two sources of extracellular stimulations, one above- and one sub-threshold. The time-lag between the arrivals of both stimulations to the neuron was tuned carefully, taking into account the NRL, in order to imitate a similar scenario to Fig. 4e,f. Preliminary results also indicate the possibility to strengthen and then weaken the local depolarization by consecutive nodal learning and reverse nodal learning (Figs 4 and 5). The observation of the oscillatory behavior of the strength of a dendrite is a necessary condition to verify the similarity between the theoretical predictions (Figs 2 and 3) and experimental observations. It requires a stable control over intra- and extra- stimulations of several patched neurons which constitute small networks (Figs 2 and 3), which is currently beyond our experimental capabilities.
The oscillatory behavior is exemplified for a few specific sets of weights and delays (Figs 2 and 3); however, it represents a generic behavior. The architectures of a neuron with two or three dendrites, where each dendrite has several synapses, were simulated for a few thousand sets of initial conditions, i.e. synaptic delays and synaptic weights. Specifically, synaptic delays were randomly chosen between 1 and 50 ms, with a gap of at least 3 ms between synaptic delays belonging to the same dendrite, and with the constraint that the minimal and the maximal synaptic delays belong to the first dendrite (as in Figs 2 and 3). Weights were randomly chosen from a uniform distribution between 0.1 and 1.8, with at least one effective weight above threshold in order to initiate firing. Results indicate that a large fraction of random initial conditions leads to oscillatory behaviors, e.g. ~0.53 for three dendrites with three synapses per dendrite.
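The sampling of random initial conditions described above can be sketched with simple rejection sampling. This is an illustrative reading of the stated constraints rather than the authors' procedure: delays are assumed to be continuous and uniformly distributed, "above threshold" is taken as an effective weight of at least 1 in the scaled units of the Methods (with initial dendritic strengths of 1), and the names `is_valid` and `sample_initial_conditions` are hypothetical.

```python
import random

THRESHOLD = 1.0  # scaled threshold from Methods (V_th = 1)


def is_valid(delays, weights, min_gap_ms=3.0):
    """Check the constraints stated in the text for one candidate set."""
    # Delays on the same dendrite must be at least min_gap_ms apart.
    for dendrite in delays:
        ordered = sorted(dendrite)
        if any(b - a < min_gap_ms for a, b in zip(ordered, ordered[1:])):
            return False
    # The minimal and maximal delays must both belong to the first dendrite.
    flat = [d for dendrite in delays for d in dendrite]
    if min(flat) not in delays[0] or max(flat) not in delays[0]:
        return False
    # At least one weight must be able to initiate firing.
    return any(w >= THRESHOLD for dendrite in weights for w in dendrite)


def sample_initial_conditions(n_dendrites=3, synapses_per_dendrite=3,
                              rng=None, max_tries=100_000):
    """Draw delays (ms) and weights per dendrite until all constraints hold."""
    rng = rng or random.Random()
    for _ in range(max_tries):
        delays = [[rng.uniform(1.0, 50.0) for _ in range(synapses_per_dendrite)]
                  for _ in range(n_dendrites)]
        weights = [[rng.uniform(0.1, 1.8) for _ in range(synapses_per_dendrite)]
                   for _ in range(n_dendrites)]
        if is_valid(delays, weights):
            return delays, weights
    raise RuntimeError("no valid initial conditions found")


delays, weights = sample_initial_conditions(rng=random.Random(0))
print(delays[0], weights[0])
```

Rejection sampling keeps the sketch short; a few percent of the raw draws are expected to satisfy all constraints for three dendrites with three synapses each.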
The slow oscillatory behavior of the effective weight strengths is realized using a node with three adaptive dendrites, but is unreachable in our scheme using a node with two adaptive dendrites. It hints at a computational hierarchy among networks following the complex morphology of their nodes, e.g. the number of their dendrites27,28. In addition, preliminary results indicate that the number of different firing patterns, stationary or oscillatory, obtained for the nodal learning can exceed dozens in scenarios with only a few adjustable parameters (dendrites). More precisely, for a given architecture and delays, the number of different firing patterns is estimated using different initial time-independent synaptic weights, and is found to exceed a hundred, for instance, for feedforward networks consisting of three adjustable dendrites. These manifolds of firing patterns might be relevant to realities where each dendrite has many input synapses but only a subset is deliberately activated. In addition, preliminary results indicate that one can find several firing patterns for given synaptic strengths and different initial conditions for the dendritic strengths. The large number of time-dependent firing patterns, compared to the number of adjustable parameters, indicates that notions like the capacity of a network, capacity per weight and generalization have to be redefined in the light of nodal learning. The results call for an examination of the features of such dynamics, including the possible number of oscillatory attractors for the weights, using more complex feedforward and recurrent networks, and of their implications for advanced deep learning algorithms29,30,31,32,33,34.
Methods of Simulation
We simulated a perceptron consisting of N = 3 (Fig. 2) and N = 7 (Fig. 3) excitatory leaky integrate-and-fire input neurons and one output neuron using equation 1, where Wi and WDi are the strengths of the connection and of the dendrite from neuron i to the output neuron, respectively, di is the delay from neuron i to the output neuron and N stands for the number of input neurons. τ = 20 ms is the membrane time constant and Vst = −70 mV stands for the stable membrane (resting) potential. The summation over ti(n) sums all the firing times of neuron i. A neuronal threshold is defined at Vth = −54 mV and a threshold crossing results in an evoked spike followed by a refractory period of 2 ms. During the refractory period no evoked spikes are possible and the voltage is set to Vst. For simplicity, we scale the equation such that Vth = 1 and Vst = 0; consequently, V ≥ 1 is above threshold and V < 1 is below threshold. Nevertheless, results remain the same for both the scaled and unscaled equation. The initial voltage is V(t = 0) = 0 and WDi = 1.
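The rescaling to Vth = 1 and Vst = 0 can be made explicit with a short worked derivation. It assumes that equation 1 has the standard leaky integrate-and-fire form with delta-function inputs; the dimensionless symbols v and w_i are introduced here for illustration only and do not appear in the original text.

```latex
\begin{aligned}
v &\equiv \frac{V - V_{st}}{V_{th} - V_{st}}, \qquad
V_{th} - V_{st} = -54~\mathrm{mV} - (-70~\mathrm{mV}) = 16~\mathrm{mV}
\;\Rightarrow\; v_{th} = 1,\; v_{st} = 0, \\
\frac{dv}{dt} &= -\frac{v}{\tau}
  + \sum_{i=1}^{N} w_i \, W_{Di} \sum_{n} \delta\big(t - t_i(n) - d_i\big),
\qquad w_i \equiv \frac{W_i}{V_{th} - V_{st}}, \\
v(t) &= v(t_0)\, e^{-(t - t_0)/\tau}
\quad \text{between consecutive input arrivals, with } \tau = 20~\mathrm{ms}.
\end{aligned}
```

Under this assumed form, a scaled effective weight of 1 corresponds to a voltage jump of 16 mV, which is exactly the distance from the resting potential to the threshold.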
The connectivity was designed as stated for each experiment. One output neuron was defined, and several input neurons were connected to it by regular connections or through defined dendrites. Initially all Wi were set to their initial value, as stated for each simulation.
The input neurons were simultaneously stimulated above threshold at 10 Hz.
Learning Rule for synaptic learning
For every pair of a sub-threshold stimulation and an evoked spike (originating from two different input neurons), the connection weights were modulated according to the learning curve, δW(Δ) (Fig. 2c), where δW is the relative change in the weight, Wi, and Δ, measured in milliseconds, is the time-lag between the sub-threshold stimulation and the evoked spike. The learning curve had a cutoff at 50 ms. The weights were updated by this relative change, and the minimum possible weight value was set to 0.001.
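A single synaptic update can be sketched as follows. Only the 50 ms cutoff, the 0.001 floor and the sign convention (a positive time-lag strengthens, a negative one weakens) come from the text. The antisymmetric exponential shape of the curve, its amplitude and its decay constant are hypothetical stand-ins for the profile of Fig. 2c, and the multiplicative form of the update is an assumed reading of "relative change".

```python
import math

CUTOFF_MS = 50.0   # cutoff of the learning curve, from the text
W_MIN = 0.001      # minimum possible weight value, from the text
AMPLITUDE = 0.05   # hypothetical maximal relative change
DECAY_MS = 15.0    # hypothetical decay constant of the curve


def learning_curve(lag_ms):
    """Assumed relative weight change for a given time-lag (ms).

    lag_ms > 0: the sub-threshold stimulation arrives after the evoked spike.
    """
    if lag_ms == 0 or abs(lag_ms) > CUTOFF_MS:
        return 0.0
    magnitude = AMPLITUDE * math.exp(-abs(lag_ms) / DECAY_MS)
    return math.copysign(magnitude, lag_ms)


def update_weight(weight, lag_ms):
    """Apply one relative update (assumed multiplicative) with the lower bound."""
    return max(weight * (1.0 + learning_curve(lag_ms)), W_MIN)


print(update_weight(0.5, 3.0), update_weight(0.5, -5.0))
```

In this sketch a weight at the floor cannot decrease further, while time-lags beyond the cutoff leave the weight unchanged.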
Learning Rule for dendritic learning
For every pair of a sub-threshold stimulation and an evoked spike originating from two different dendrites, the dendritic weights, WDi, were modulated according to the same learning curve as for synaptic learning, where δW stands for the relative change in the dendritic weight.
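The pairing across dendrites can be illustrated with the timings quoted for Fig. 2d: an evoked spike at 12 ms via the left dendrite and sub-threshold arrivals at 7 ms and 15 ms via the right dendrite. As a sketch under stated assumptions, the curve shape and its parameters are hypothetical, the update is assumed to be multiplicative and applied to the dendrite carrying the sub-threshold stimulation, and the 0.001 floor stated for synaptic weights is assumed to apply to dendritic strengths as well. The function name `dendritic_step` is illustrative.

```python
import math

CUTOFF_MS, W_MIN = 50.0, 0.001     # values stated in the text
AMPLITUDE, DECAY_MS = 0.05, 15.0   # hypothetical curve parameters


def learning_curve(lag_ms):
    """Assumed antisymmetric exponential stand-in for the profile of Fig. 2c."""
    if lag_ms == 0 or abs(lag_ms) > CUTOFF_MS:
        return 0.0
    return math.copysign(AMPLITUDE * math.exp(-abs(lag_ms) / DECAY_MS), lag_ms)


def dendritic_step(arrivals, strengths):
    """Update dendritic strengths for one stimulation cycle.

    arrivals: list of (time_ms, dendrite_index, spiked) tuples.
    """
    updated = list(strengths)
    spikes = [(t, d) for t, d, spiked in arrivals if spiked]
    for t_sub, d_sub, spiked in arrivals:
        if spiked:
            continue
        for t_spike, d_spike in spikes:
            if d_sub == d_spike:
                continue  # only pairs from two different dendrites count
            delta = learning_curve(t_sub - t_spike)
            updated[d_sub] = max(updated[d_sub] * (1.0 + delta), W_MIN)
    return updated


# Timings from Fig. 2d: spike at 12 ms (dendrite 0), sub-threshold at 7 and 15 ms (dendrite 1).
arrivals = [(7.0, 1, False), (12.0, 0, True), (15.0, 1, False)]
new_strengths = dendritic_step(arrivals, [1.0, 1.0])
print(new_strengths)
```

With the assumed curve, the strengthening from the +3 ms time-lag outweighs the weakening from the −5 ms time-lag, so the right dendrite is strengthened on average, in line with the description of Fig. 2d.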
For experimental methods see ref.26. Additional procedures are detailed below:
Extracellular threshold estimation
The extracellular threshold remained stable during the experiments. After the training of coupled intra- and extra- stimulations, the extracellular threshold was re-estimated every several minutes in order to validate the stability of the parameters used in the experiment. The learning effect is visible only after several minutes, typically 3–6 minutes. The results presented in Fig. 4 are recorded ~5 minutes after training.
An extracellular electrode was selected and both intra- and extracellular thresholds were estimated. The neuronal response latency, NRL, and its stability for the extracellular electrode were estimated in order to accurately adjust the arrival timings of the intra- and extracellular stimulations. In order to enhance possible adaptation, the time-lags between stimulations were set to 2–5 ms (following the expected learning curve), and the sub-threshold stimulation amplitudes were chosen close to the threshold (with visible local depolarization). We note that an above-threshold extracellular stimulation given shortly, e.g. 2 ms, after the intracellular stimulation does not result in an evoked spike, and can be used to enhance the adaptation. The thresholds and NRL were rechecked at the end of the experiment, in order to ensure their stability.
The demonstrated results were quantitatively repeated tens of times on many cultures. In particular, the effect of Fig. 4e,f was observed in more than 85% of such examined experiments, where Δ was in the range of 2–5 ms. The increase in the local depolarization (Fig. 4e) typically ranges between 100% and 300%, and the enhancement from a small local depolarization to evoked spikes recorded intracellularly extended to one or several lowered stimulation amplitudes. A reverse learning procedure (Fig. 5) was also examined tens of times on many cultures and in more than 90% of the cases resulted in no effect or in weakening of the local depolarization.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
We thank Moshe Abeles for stimulating discussions. Technical assistance by Hana Arnon is acknowledged. This research was supported by the TELEM grant of the Council for Higher Education of Israel.