Introduction

A central paradigm of neuroscience is that memories can be stored by adapting the strengths of synaptic connections1. After learning, re-application of a stored pattern reproduces an associated pattern of neuronal activity. The details of the implementation can differ according to the learning rule used, the extent of dendritic processing, and the response metric taken as output2,3.

It has been suggested that the inputs may first undergo a transformation to a sparse pattern in a higher-dimensional space (for review see ref. 4). In a binary classification task, such a transformation could make the input patterns that are to be associated with each one of the binary outputs linearly separable by a hyperplane5. An expansion of input space into a higher-dimensional space is indeed observed in many neural systems, most prominently in the granular layer of the cerebellar cortex where granule cells outnumber their afferent mossy fibres by at least two orders of magnitude6,7. The granule cells presumably generate sparse patterns of activity8,9,10,11 that they convey to the principal neurons or Purkinje cells (PCs) via their ascending axons and parallel fibres (PFs).

The apparent lack of feedback has inspired theorists to model the PC as a perceptron3,11,12,13 that stores patterns through long-term depression (LTD) of active PF synapses during conjunctive climbing-fibre input14,15,16. The sparse activity of the granular layer enhances the storage capacity of the Purkinje cell (defined as the number of PF patterns that can be stored without intolerable error)3,9,17,18,19,20.

Nevertheless, these two views, of the granular layer as generating an expansive sparse code and of the Purkinje cell as a binary classifier, have recently been challenged on both experimental and theoretical grounds21. Firstly, the transformation of a (dense) input pattern into a sparse pattern is very sensitive to noise in the input layer4. This transformation therefore requires an intermediate (unsupervised) learning stage that maintains the clustering present in the input space22; plasticity of the mossy-fibre-to-granule-cell connection may provide the neural substrate for this transformation23. Secondly, and more importantly, LTD at the parallel-fibre-to-Purkinje-cell synapse requires the production and release of NO by PFs24,25. This NO diffuses to neighboring synapses and compromises the synapse specificity of LTD26,27,28,29,30,31,32,33. A recent theoretical study predicted that such nonspecific plasticity would be detrimental to memory34.

Hence both the lack of specificity at the input stage (pattern noise) and the lack of specificity of the learning rule (leakage of plasticity) are expected to affect memory storage and recall. In the present study, we used computer simulations and mathematical analyses of Purkinje cell models with different degrees of complexity and biological realism to examine whether the two drawbacks could compensate for each other, that is, whether nonspecific LTD could make pattern recognition more robust in the presence of local spatial noise.

Results

We examined the effect of leakage of plasticity (nonspecific LTD, or nsLTD) on the recognition of sparse, binary and stationary input patterns disrupted by local noise, in both a linear artificial neural network unit (further called ANN unit) and a morphologically realistic conductance-based Purkinje cell (PC) model (Table 1).

Table 1 The four models used in the present study.

Model     Response metric       Pattern size N    Synaptic neighborhood
ANN-1D    weighted input sum    147,400           1D ring array
PC-1D     pause duration        147,400           1D ring array (100 PFs per spine)
ANN-3D    weighted input sum    14,740            3D Euclidean distances (copied from PC-3D)
PC-3D     pause duration        14,740            3D Euclidean distances (one PF per spine)

These models are described in detail in the Methods section. Briefly, the response r of the simple ANN unit was given by the inner product of the synaptic weight vector w and the input pattern vector x:

$$ r = \mathbf{w} \cdot \mathbf{x} = \sum_{i=1}^{N} w_i x_i \qquad (1) $$
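In code, this readout amounts to a single dot product (a minimal NumPy sketch; the study's own implementation was in Matlab, and all names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

N = 147_400                  # number of PF synapses = pattern size
w = np.ones(N)               # synaptic weights, initialized at unity
x = np.zeros(N)              # a binary input pattern
x[rng.choice(N, size=1000, replace=False)] = 1.0   # 1000 ON bits (0.7%)

r = w @ x                    # Eq. 1: response r = w . x
```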
The multi-compartmental PC model35,36 contained a morphologically realistic representation of the dendrite and ten different types of voltage- and ligand-gated ion channels, modeled using Hodgkin–Huxley-type equations. The model received continuous background input through excitatory PF and inhibitory interneuron synapses, and was active at a baseline rate of 48 spikes per second.

The input patterns had N = 14,740 or 147,400 bits, one for each afferent PF, of which a randomly chosen subset (between 0.35 and 5.6%) was ON. A hundred such patterns were stored by LTD of the PF synapses using one-shot supervised Hebbian learning5,17 (Fig. 1a). (In the Mathematical Appendix in the Supplementary Information we show analytically that slightly potentiating the non-depressed synapses does not alter the characteristics of the learning rules.)
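A sketch of this one-shot storage step, under the assumption of specific LTD with the depression factor d = 0.5 defined in the Methods (our illustration, not the original Matlab code):

```python
import numpy as np

rng = np.random.default_rng(1)
N, P, K = 147_400, 100, 1000    # synapses, stored patterns, ON bits per pattern
d = 0.5                          # depression factor (Methods, Eq. 6)

# Draw 100 uncorrelated sparse binary patterns.
patterns = np.zeros((P, N), dtype=bool)
for j in range(P):
    patterns[j, rng.choice(N, size=K, replace=False)] = True

# One-shot supervised Hebbian learning with specific LTD: a synapse hit
# by an ON bit is depressed multiplicatively, once per stored pattern.
w = d ** patterns.sum(axis=0).astype(float)
```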

Figure 1: Pattern recognition in the presence of nonspecific LTD.

The pattern recognition simulations involved two steps, learning (a) and recall (b), illustrated here for an ANN unit with nearest-neighbor-only leakage between the synapses on its virtual dendrite. Each cartoon represents part of the virtual dendrite, with binary inputs arriving on the left via synapses whose color-coded weights have the values indicated on the right. (a) Learning phase, showing the array of weights after storing the single pattern on the left. All weights were initialized at unity and were depressed following the leakage rule given in Eq. 7. Note the spread of LTD to the adjacent synapses. (b) Recall phase, showing the readout of a stored, a noisy stored, and a novel pattern. The weights were fixed during recall, and the output at the soma was calculated as the inner product of the weight vector and the pattern vector. The noisy stored pattern was identical to the stored pattern except for one randomly selected ON bit that was shifted to a nearest-neighbor position.

In most simulations, the leakage of plasticity and the pattern noise were local, and could either be limited to a fixed radius of up to three nearest neighbors along the dendritic shaft (further called the 1D neighbor relationship; Fig. 1b) or show a volume spread according to a Gaussian distance profile (the 3D neighbor relationship). The same neighbor rules were used to select the noisy bits in noisy versions of the stored patterns, as sketched below. For the 3D relationship, the leakage of plasticity and the pattern noise could spread with the same profile or show a mismatch.
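For the 1D case, displacing ON bits to ring neighbors can be sketched as follows (our illustration; for brevity the target neighbor is drawn uniformly within the radius, whereas the simulations weighted neighbors by the nsLTD leakage profile, see Methods):

```python
import numpy as np

def add_local_noise_1d(pattern, alpha, radius, rng):
    """Return a noisy copy in which a fraction alpha of ON bits is
    moved to a randomly chosen neighbor within `radius` ring positions.
    Collisions with already-ON bits are ignored in this sketch."""
    noisy = pattern.copy()
    on = np.flatnonzero(pattern)
    moved = rng.choice(on, size=int(round(alpha * on.size)), replace=False)
    offsets = [s for s in range(-radius, radius + 1) if s != 0]
    for i in moved:
        noisy[i] = False
        noisy[(i + rng.choice(offsets)) % pattern.size] = True
    return noisy
```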

The pattern recognition performance was measured by comparing the responses to the 100 stored patterns (or 100 noisy stored patterns) with the responses to 100 novel random patterns. To this end, neuronal responses were quantified as the weighted input sum for ANN units (see Equation 1 and Fig. 1b), and as the duration of the pause in firing that followed the pattern-evoked burst for the PC model (see Fig. 2a,b and d,e and ref. 3). Figure 2c,f shows examples of the distributions of pause durations evoked in the PC model.

Figure 2: Pattern recognition in the biophysical PC model.

The learning algorithm was either standard LTD (a–c, left column) or nonspecific LTD (d–f, right column). After learning, three sets of 100 patterns were presented. The first set contained patterns that had been stored (black), the second set contained noisy versions of the stored patterns (noise level of 50%, blue), and the third set consisted of completely novel random patterns (red). The pattern (PF stimulus) was presented after 4 seconds of spontaneous activity (vertical red line). (a,d) Example membrane potential traces. (b,e) Raster plots of spike times for 100 different patterns of each category. (c,f) Response distributions as histograms of pause duration.

Clearly, nsLTD (right column) enhanced the separation between the responses to noisy stored patterns (blue) and novel patterns (red), while decreasing the separation between stored and noisy stored patterns (black versus blue) and, to a lesser extent, the separation between stored and novel patterns (black versus red). In the following sections, this phenomenon will be compared in a quantitative manner for ANN units and PCs, and for 1D and 3D synaptic neighborhood relationships.

The difference in response distributions (stored or noisy stored versus novel), that is, the pattern recognition performance, was then quantified using a signal-to-noise ratio (s/n)37:

$$ s/n = \frac{(\mu_s - \mu_n)^2}{(\sigma_s^2 + \sigma_n^2)/2} \qquad (2) $$

where μ_s and μ_n are the means, and σ_s² and σ_n² the variances, of the responses to stored and novel patterns, respectively.
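For the ANN unit, this measure reduces to a few lines (a sketch implementing Eq. 2):

```python
import numpy as np

def signal_to_noise(r_stored, r_novel):
    """Eq. 2: squared difference of the mean responses, divided by
    the mean of the two response variances."""
    mu_s, mu_n = np.mean(r_stored), np.mean(r_novel)
    return (mu_s - mu_n) ** 2 / ((np.var(r_stored) + np.var(r_novel)) / 2.0)
```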

Pattern recognition in an ANN unit with a 1D synaptic neighborhood function

Figure 3a plots, for varying degrees of pattern noise, the signal-to-noise ratio of the simulated responses of a linear ANN unit to 100 novel versus 100 noisy stored patterns, each pattern consisting of N = 147,400 bits of which 1000 bits (0.7%) were ON.

Figure 3: Pattern recognition in an ANN unit with a 1D synaptic neighborhood relationship.

(a) The signal-to-noise ratio on the vertical axis was calculated from the response distributions to 100 novel (non-trained) patterns versus 100 stored patterns (the zero-noise data points lie on the vertical axis), or to 100 novel patterns versus 100 noisy versions of the stored patterns. The degree of pattern noise on the horizontal axis expresses the percentage of ON bits that had been moved to a neighbor, for different radii of the neighborhood function. The learning rule was either standard LTD (black curve) or nonspecific LTD with exponential spread to one (red), two (cyan) or three (blue) nearest neighbors on either side along the virtual dendrite. Note that here the same neighborhood relations applied to the spread of pattern noise and the leakage of LTD. Simulations of model ANN-1D in Table 1, with N = 147,400 and a pattern sparsity of 0.7%. (b) Comparison of the numerical simulation results from (a) (solid lines) with an analytical calculation of the signal-to-noise ratio for the same model and parameters (dashed lines) (see main text, and the Mathematical Appendix in the Supplementary Information). (c) Simulation of the same ANN as in (a), but with combined standard and additive noise. The noise level on the horizontal axis now represents both the percentage of displaced ON bits in a pattern, as in (a), and the percentage of ON bits added. For instance, for patterns with 1000 ON bits, a noise level of 20% indicates that 200 ON bits had been swapped with OFF bits in their neighborhood and that, in addition, 200 ON bits had been newly added in the same neighborhood.

In the absence of noise (0% on the horizontal axis), standard LTD was always more effective than nonspecific LTD, by a factor of almost 2. With increasing local noise levels, however, the performance fell more sharply for standard LTD, such that above noise levels of 30–40% nsLTD outperformed LTD. In these simulations, the leakage of LTD and the spread of noise were matched, and decayed exponentially over a fixed radius of one, two or three neighbors (ANN-1D, see Table 1 and Methods Equation 7).

Analytical calculation of the signal-to-noise ratio

To better understand the results of the numerical simulations of the ANN unit we derived the signal-to-noise ratio analytically (see Mathematical Appendix in Supplementary Information). Figure 3b plots a comparison of the analytical and numerical results (numerical results as in Fig. 3a). The complete derivation given in the Appendix shows that, in the presence of nsLTD and for a neighborhood of 1, the relationship between the signal-to-noise ratio and the fraction α of noisy bits in a pattern can be approximated by (Appendix Eq. A10):

where d = 0.5 is the depression factor for activated synapses (ON bits in a pattern), and d_leak is the nsLTD depression factor in the neighborhood, set for example to 0.75 for nearest-neighbor synapses in the 1D neighborhood function. For specific LTD there was no additional depression in the neighborhood of activated synapses, and d_leak was equal to 1. As a consequence, the curves relating the noise level α to the signal-to-noise ratio had a shallower slope when the patterns had been stored with nsLTD (Fig. 3b).

In the absence of noise (for α = 0), the signal-to-noise ratio is given by

The value of the ratio in Eqs 3 and 4 can be derived analytically by assuming that the number of times a synapse is hit by an ON bit in a pattern follows a Poisson distribution (see Appendix Eqs A11–A20). In our simulations and analyses, nsLTD led to a smaller ratio than specific LTD, which meant that nsLTD resulted in a smaller signal-to-noise ratio in the absence of noise (Appendix Eq. A23), so that the s/n curves for LTD and nsLTD crossed each other at a particular noise level α.
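As an illustration of the kind of calculation involved (our sketch, not the Appendix derivation itself): if a given synapse carries an ON bit with probability f in each of p stored patterns, the number of hits k is approximately Poisson with mean λ = pf, and the moments of the resulting weight w = d^k follow from the Poisson generating function E[z^k] = e^{λ(z−1)}:

$$ \mathbb{E}[w] = \mathbb{E}[d^{k}] = e^{-\lambda(1-d)}, \qquad \mathbb{E}[w^{2}] = e^{-\lambda(1-d^{2})}, \qquad \lambda = p f . $$

For p = 100 patterns of sparsity f = 0.007 and d = 0.5, this gives λ = 0.7 and E[w] = e^{−0.35} ≈ 0.70.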

The learning rule is robust to additive noise

The present standard LTD learning rule, in which the depression of synapses is divisive and follows a geometric progression (see Methods Equation 6), is an elaboration of the learning rule used in the Willshaw associative nets38. A characteristic of these nets is that the patterns always have the same arity (density of ON bits), and it is well known that presenting patterns with fewer or more ON bits than the learned pattern will affect pattern recognition (see Supplementary Figure S1). There is one situation, however, where nsLTD offers an additional benefit: when the supernumerary synapses are activated within the neighborhood of the pattern's ON bits (Fig. 3c). Such patterns with local additive noise correspond more closely to the clustered patterns of PF activation observed after peripheral stimulation39. In that case, the ANN trained with nsLTD weights the additional ON bits by depressed synapses, whereas specific LTD, which has no neighborhood function, cannot tell them apart from randomly activated synapses. Nonspecific LTD now starts to outperform specific LTD at noise levels of 20% (Fig. 3c), as compared to 30% in the absence of additive noise (Fig. 3a).
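Local additive noise can be sketched analogously to the displacement procedure above (our illustration, with the same caveat of uniform neighbor choice):

```python
import numpy as np

def add_additive_noise_1d(pattern, alpha, radius, rng):
    """Switch ON an extra alpha * K bits, each placed at a random ring
    neighbor of an existing ON bit (local additive noise)."""
    noisy = pattern.copy()
    on = np.flatnonzero(pattern)
    seeds = rng.choice(on, size=int(round(alpha * on.size)), replace=False)
    offsets = [s for s in range(-radius, radius + 1) if s != 0]
    for i in seeds:
        noisy[(i + rng.choice(offsets)) % pattern.size] = True
    return noisy
```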

Pattern recognition in the PC model with a 1D synaptic neighborhood function

Very similar effects of local noise were observed for the biophysical PC model displayed in Fig. 4a.

Figure 4: Pattern recognition in the PC model with a 1D synaptic neighborhood relationship.

(a) Schematic representation of the PC-1D model, which had one spine on each of its 1474 spiny dendritic compartments. In this simplified architecture each spine received 100 individually weighted synapses from 100 PFs, hence the pattern size was N = 147,400. On each spine the synapses were arranged in a ring in order to define a neighborhood relation between them. (b) Pattern recognition performance, using the same format as in Fig. 3a. Simulations of model PC-1D in Table 1.

When plasticity and noise spread only to the nearest neighbors (red curve), nsLTD already outperformed standard LTD at a noise level as low as 20% (Fig. 4b). Note that the overall performance of the PC was an order of magnitude lower than that of the ANN unit (compare Figs 3 and 4), but this was partly a consequence of the different metrics used to characterize the responses (see Methods).

Pattern recognition using 3D synaptic neighborhood functions

To implement more biologically realistic 3D Euclidean distances between PF synapses in the PC model (as opposed to 1D nearest-neighbor relationships), we reduced the number of PF inputs to N = 14,740, but made each PF innervate a unique individual spine by increasing the number of spines on each dendritic compartment from 1 to 10 (Fig. 5a). Note that this manipulation did not alter the input–output relationship of the model PC36 (Supplementary Fig. S2).

Figure 5: Pattern recognition in the PC model with a 3D synaptic neighborhood relationship.

(a) Schematic representation of the PC-3D model, which had ten spines on each of its 1474 spiny dendritic compartments. The ten spines were equally spaced along the compartmental axis, with their necks oriented in random directions perpendicular to the axis. In this simplified architecture each spine received one weighted synapse from one PF, hence the pattern size was N = 14,740. (b) Pattern recognition performance for LTD and nsLTD, using the same format as in Figs 3 and 4b. Two spatial spreads of nsLTD are compared (expressed by the standard deviation σ_LTD of the Gaussian kernel, see Methods); in each case σ_noise matched σ_LTD. (c) Effect of varying the spatial spread of nsLTD (σ_LTD) in the absence of pattern noise. (d) Effect of varying σ_LTD for 50% noisy patterns, with σ_noise fixed at 0.75 μm. Simulations of model PC-3D in Table 1.

Figure 5b shows the same effects of local noise as observed above: a sharp decline in performance when noisy patterns were presented after training with standard LTD (black curve); a drop in performance with nsLTD in the absence of pattern noise; and an enhanced performance with nsLTD at local pattern noise levels above 20% (red and blue curves). In the absence of pattern noise, performance declined monotonically with the leakage radius of nsLTD (Fig. 5c). In contrast, when noise was present, performance was highest when the leakage radius of LTD matched the spatial spread of the pattern noise (σ_LTD = σ_noise = 0.75 μm), falling off at both smaller and larger radii (Fig. 5d).

ANN units and the biophysical PC model compared

To further explore the quantitative difference between the effects of nsLTD in the ANN and in the biophysical PC model, we introduced Euclidean distances between the synapses on the virtual dendrite of the ANN unit, using the same distances as those calculated between the corresponding synapses of the PC-3D model.

Figure 6 plots the performance of the ANN-3D unit in the same experiments as those plotted in Fig. 5b for the PC-3D model. Clearly, in the ANN-3D unit, for nonspecific LTD to outperform standard LTD, the patterns required higher noise levels than in the PC-3D model (about 40% versus 20%), and the gain in performance was lower. The difference between the PC model and the ANN unit is illustrated in Fig. 6b, which plots the gain in performance by nonspecific LTD relative to standard LTD, using the formula:

Figure 6: Pattern recognition performance of the PC-3D and ANN-3D models compared.

(a) Pattern recognition performance of the ANN-3D unit. To introduce a 3D synaptic relationship in the ANN units, the same dendritic architecture as that of the PC-3D model was used. Data are plotted in the same format as in Fig. 5b. (b) Relative performances for LTD and nsLTD, expressed as a normalized signal-to-noise ratio (see text), using σ_LTD = 0.75 μm. Positive (negative) values indicate that nsLTD performs better (worse) than LTD.

These results confirm that the greater robustness against noise conferred by nsLTD in the biophysical PC model, as compared to the ANN, must be based on the nonlinear synaptic integration in the PC model rather than on the spatial distribution of inputs across the dendrite, which was identical in the linear ANN-3D unit.

Effects of pattern loading and sparsity

The observed beneficial effect of nonspecific LTD relative to standard LTD, illustrated in the previous sections after training with 100 binary patterns of 0.7% sparsity (0.7% ON bits), extended to denser patterns and to higher loadings, in which both training and test sets contained greater numbers of patterns. For practical reasons, the effects of these two parameters were examined only in the ANN-3D model (ANN units to which the synaptic positions, and hence the inter-synaptic distances, of the PC model had been copied; see above).

Figure 7 compares LTD and nsLTD for two levels of local noise. At the lower noise level of 10%, standard LTD (cyan) was always better at distinguishing noisy stored patterns from novel patterns. In contrast, distinguishing very noisy patterns (60% noise level) from novel patterns was invariably better after training with the nsLTD rule (red). These conclusions held over the whole range of loadings (25 to 400 patterns, Fig. 7a) and pattern densities (0.35 to 5.6%, Fig. 7b) tested. Supplementary Fig. S3 plots the s/n ratio as a function of the density of ON bits.

Figure 7: The effect of varying the number of stored patterns and their sparsity on pattern recognition performance of the ANN-3D unit.

Learned patterns contaminated by either 10% or 60% local noise were used to compare the effectiveness of LTD (cyan) and nsLTD (red). The unit had 14,740 inputs (see Table 1), and both nsLTD and pattern noise had a 3D spread of σ_LTD = σ_noise = 0.75 μm. (a) The loading was varied from 25 to 400 patterns, for patterns of 1% sparsity. (b) The sparsity, defined as the percentage (density) of ON bits in each pattern, was varied from 0.35 to 5.6%, at a fixed loading of p = 100 patterns.

Compared to the ANN, the storage capacity of the biophysical PC model was rather limited (300–400 patterns in Steuber et al.3). The ANN capacity has been calculated to amount to several thousand patterns (see the work by Brunel et al.9, and our own calculations and Fig. A1 in the Mathematical Appendix of the Supplementary Information). On the other hand, Fig. 6b showed that nsLTD was more effective in the model PC than in the ANN unit. It must thus be concluded that the PC has a limited capacity for the storage of uncorrelated patterns, but that this limitation is compensated by a greater ability to recognize noisy (hence correlated) patterns. It is also possible that the actual readout occurs downstream, in cerebellar nucleus neurons onto which the outputs of about 40 PCs converge40.

Effects of combined (ns)LTD and LTP

In the previous simulations, the total change in synaptic weight was greater with nsLTD than with LTD, because the weights in the neighborhood were depressed in addition to those at active synapses (see Methods). To examine whether this difference in total weight change could affect our results, we compared the pattern recognition performance of the ANN-3D unit for LTD and nsLTD with equal mean synaptic weights after learning. As shown in Supplementary Fig. S4, this rescaling of the synaptic weights did not alter the signal-to-noise ratio, because the change in mean weight was compensated by an equivalent change in the variance. A candidate mechanism for weight homeostasis is LTP28,41: a slight potentiation of all inactive PF synapses each time a pattern is stored. The Mathematical Appendix predicts that adding such LTP to the learning rule should not affect the performance of the ANN, under the assumption that the number of times a synapse is potentiated versus depressed follows a binomial distribution. This lack of an effect of LTP on pattern recognition by the linear ANN unit was borne out by numerical simulations (compare Fig. 8a to Fig. 3a).
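The homeostatic rescaling itself can be sketched as follows (our illustration; the simulations instead used the fixed analytical LTP factors quoted in the legend of Fig. 8):

```python
import numpy as np

def homeostatic_ltp(w_before, w_after, depressed):
    """Potentiate all non-depressed synapses by the common factor that
    restores the summed weight to its value before learning.
    `depressed` is a boolean mask of the synapses touched by (ns)LTD."""
    w = w_after.copy()
    deficit = w_before.sum() - w_after.sum()
    w[~depressed] *= 1.0 + deficit / w_after[~depressed].sum()
    return w
```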

Figure 8: Adding LTP has little effect for the ANN unit but makes nsLTD superior to LTD at all noise levels in the PC model.

Simulations of the ANN-1D (a) and the PC-1D model (b). Each time a pattern was stored, the non-active PF synapses were slightly potentiated so as to keep the overall weight constant. The LTP factors were calculated following Equation (A14b) of the Mathematical Appendix in the Supplementary Information; they amounted to 1.0035 for specific LTD, 1.0072 for nsLTD with leakage to the nearest neighbors only, and 1.0091 and 1.0101 for nsLTD with leakage across two and three nearest neighbors, respectively.

In sharp contrast, the s/n ratio of the PC response was sensitive to the average synaptic weight, which determined not only the spontaneous spike rate but also the strength of the burst response and the duration of the subsequent pause. Interestingly, adding LTP to the learning rule in the default PC model made nsLTD equivalent or superior to LTD at all levels of pattern noise (Fig. 8b; see also Supplementary Fig. S5 for the weight distributions after training with combined LTP and nsLTD). The resulting weight homeostasis also prevented the burst response from becoming too weak to induce a pause (see raster plots in Supplementary Fig. S5) and, consequently, increased the number of patterns the PC could store with an s/n ratio >4 (from ~200 with simple LTD to more than 800 with combined LTP and nsLTD). This importance of LTP-induced weight homeostasis may explain the observed requirement for LTP in motor learning42,43,44.

Discussion

Theories of learning in neural systems typically assume specific weight changes at activated synapses. In apparent contrast to this common assumption, it has been shown that in brain areas such as the cerebellum synaptic plasticity can spread to neighboring inactive synapses26,27,28,29,30,31,32,33. This kind of nonspecific synaptic plasticity is expected to be detrimental to the recall of stored patterns34. We have investigated the storage and recall of input patterns in the presence and absence of nonspecific long-term depression (nsLTD) in cerebellar PC models with different levels of complexity and biological realism. At noise levels above 20–30%, nsLTD outperformed standard LTD in a biophysical PC model in a standard pattern recognition task. Compared to the ANN units, which are optimal linear decoders, individual PCs performed rather poorly, but the recognition-enhancing effect of nsLTD manifested itself over a broader range of noise levels in the model PC than in the ANN unit (Fig. 6b; beyond 20% versus beyond 40%). In addition, as has been shown before45,46, the signal-to-noise ratio will rise by several orders of magnitude when multiple PCs, trained on similar patterns, converge onto neurons in the cerebellar nuclei. Note that in the present model, the nuclear neurons would read out the patterns as an increase in their spike rate during the PC pause.

The leakage of LTD had to be restricted to within a distance of about 1 μm for a positive effect of nsLTD to be observed (Fig. 5d). This spatial confinement is narrower than the spread of LTD over tens of micrometers originally reported in in vitro studies26,29,32. These in vitro studies may, however, have overestimated the physiological action radius of NO, as a consequence of pharmacological manipulations (for instance bicuculline) or stimulation conditions (bundles of PFs being fired together). A more recent study, measuring NO-dependent LTP with a different stimulation protocol, observed a steep decline of heterosynaptic plasticity within 5–10 μm47,48. This is in closer agreement with modeling studies that simulated the NO concentration using the reaction–diffusion equation33. These studies found a strongly nonlinear dependence of the action radius of NO on the diameter of the fibre by which it was released: for a fibre of 0.1 μm diameter, [NO] falls off to 50% at a distance of 2 μm49. Note that in mice, parallel fibres have an average diameter of 0.15 μm50. Moreover, in insects, spatial arrangements between NO- and non-NO-producing fibres have been shown to sharpen the resolution of NO effects51. Taken together, it must be concluded that the present nonspecific LTD learning rule operates much more locally than the (bidirectional) heterosynaptic plasticity rule that has recently been suggested to serve as a homeostatic control mechanism for the overall distribution of synaptic weights52.

The noise level of 20% at which nsLTD started to outperform standard LTD in the PC model (Figs 4b and 5b) may seem high, but given the huge dimension of the input space (150,000 PF synapses on a rat PC), the probability that exactly the same pattern of parallel-fibre activity is generated twice during a lifetime seems vanishingly small10. Even though the synapses from mossy fibres onto granule cells are very reliable8,53, the mossy fibres converging onto a given granule cell may convey not only peripheral inputs from different modalities54, but also information from the neocortex that reaches the granular layer polysynaptically via the pontine nuclei, enhancing the probability of intervening noise.

An assumption of the present model is that noise preferentially spreads to neighboring parallel fibres, because the plasticity rule is inevitably local and the spread of noise must match the leakage of LTD (Fig. 5d). There are no indications that neighboring PF synapses on a PC originate from neighboring granule cells in the granular layer39,55; rather, the projections appear divergent. In their technically ingenious study, however, Wilms and Häusser39 did find that behaviorally relevant stimuli excite clusters of neighboring parallel fibres, in spite of their distributed coding in the granular layer. It is therefore conceivable that local noise in stimulus space propagates within clusters of parallel fibres, so that the natural neighborhood relationships are preserved. At first sight, the clusters of co-activated PFs observed by Wilms and Häusser39 may seem too large (median distance of 11 μm) for the very local action of nsLTD in the present simulations (Fig. 5), but the labeling of PFs in this imaging study was too sparse to measure cluster size reliably.

It should be noted that the findings of the present study do not depend on the identity of the modulator involved. For instance, intracellular free Ca²⁺ and Ca²⁺-dependent synaptic signals may invade neighboring spines along the dendritic shaft within a distance of 10 μm, not only in cerebellar Purkinje cells29,56, but also in hippocampal pyramidal cells57,58,59.

In summary, the present paper suggests that nonspecific synaptic depression, evoked by nitric oxide diffusing to neighboring synapses, may have a functional role. nsLTD made the response of a model Purkinje cell robust against noise in the precise location of the activated synapses. If this spatial noise or variability in synaptic activity reflects natural errors or variability in sensory signals or motor commands, nonspecific plasticity may be a mechanism for error correction and/or pattern generalization and completion. In the PC with 147,400 parallel-fibre synapses, however, nsLTD provided a significant advantage only when the noise and the leakage of plasticity were very local (on the order of a micrometer). This spatial confinement may be below the experimental detection limit, and it may therefore be useful to extend this study to model neurons with lower synapse densities.

Leakage of plasticity is also at the heart of the formation of neuronal maps60,61 and of bio-inspired clustering algorithms such as gas nets62, and volume transmission through diffusion of NO at parallel-fibre synapses33 or climbing-fibre synapses63 has been suggested to improve motor learning in robots. As a final remark, it must be admitted that a paradigmatic cerebellar task such as eyeblink conditioning has recently been attributed to adaptive timing by intrinsic Purkinje cell mechanisms64,65.

Methods

Pattern recognition task

Single neurons were trained with a set of sparse, static input patterns, and were then tested for their capacity to distinguish, by the strength of their response, learned patterns from random novel ones. The input patterns were uncorrelated and binary, with one bit for each afferent, set to ON if the corresponding synapse was activated by the pattern. In particular, we examined whether the leakage of plasticity to neighboring synapses during the training phase generated robustness to local noise applied to the patterns during the test (recall) phase (Fig. 1).

Neuron models

We simulated two categories of neuronal models (Table 1): artificial neural network (ANN) units and various versions of a biophysical model of a cerebellar Purkinje cell (PC).

The ANN units were simple linear summation units that generated as their output r the inner product of the synaptic weight vector w and the pattern vector x (Equation 1, and Fig. 1a,b). As the patterns were binary, and the weights positive, the output of the ANN was a positive value. The number of synapses, and hence the pattern size N, was either 147,400 or 14,740 (see Table 1).

The biophysical Purkinje cell model35,36 consisted of a soma and a dendrite of 1599 compartments, 1474 of which bore spines receiving AMPA receptor synapses from PFs. Since the number of spines on a single PC amounts to approximately 150,000 in rats66, and since each spine requires a neck and a head compartment for its implementation, it was not practical to model every spine in the present learning paradigm. Instead, two variants of the PC model were simulated (Table 1). PC-1D received the full set of 147,400 PF afferents, but these were lumped into groups of 100, each group innervating the single spine on one compartment (Fig. 4a). PC-3D had a more realistic configuration of spines, each spine being innervated by a unique PF, but their number (and hence the pattern size) was reduced to N = 14,740 (Fig. 5a).

With all synapses blocked, the PC model had an intrinsic spike rate of 70 s⁻¹. To mimic in vivo conditions, each of its GABA_A receptor synapses was randomly activated at 1 Hz, and the background PF spike rate was set at 0.28 Hz (2.8 Hz in the PC-3D model), so as to give the PC a spontaneous rate of 48 spikes s⁻¹.

Neighborhood functions

Both the leakage of plasticity and the spatial spread of pattern noise required a neighborhood function to be defined, which could be one-dimensional (1D) or three-dimensional (3D). In the 1D case (ANN-1D and PC-1D, see Table 1), each synapse onto an ANN unit, as well as each of the 100 PFs converging onto the same PC spine, was given a fixed index in a ring array, which also defined its nearest and next-nearest neighbors (see Fig. 4a). In the 3D case, in contrast, the actual architecture of the PC dendrite was used to calculate Euclidean distances between spines or, equivalently, between the bits in a pattern (PC-3D, see Fig. 5a). In ANN-3D, the synapses were mapped onto the PC morphology, but the output was calculated, as for ANN-1D, as a weighted sum.

Once the 3D distances between synapses had been determined, the leakage of plasticity was modelled as a 3D Gaussian kernel of distance, and the same kernel (albeit not necessarily with the same width) was used to represent the probability, decaying with distance, of a pattern bit being switched ON erroneously by noise (see below).
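A sketch of such a kernel (our notation; σ corresponds to σ_LTD or σ_noise):

```python
import numpy as np

def gaussian_kernel(delta, sigma):
    """Gaussian distance kernel, equal to 1 at the active synapse
    (delta = 0) and decaying with 3D Euclidean distance delta.
    The same shape served for the leakage of LTD (width sigma_LTD)
    and, after normalization, for the probability of a noisy bit
    landing on a given neighbor (width sigma_noise)."""
    return np.exp(-np.asarray(delta, dtype=float) ** 2 / (2.0 * sigma ** 2))
```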

Synaptic plasticity rules

In actual PCs, PF synapses undergo LTD when their activation is temporally associated with a dendritic complex spike evoked by activation of the PC's climbing fibre. This climbing-fibre signal functions as a teacher, but was not explicitly implemented in the present study.

In simulations with specific or standard LTD (briefly 'LTD'), only those PFs actually activated were depressed. We used a depression factor of d = 0.5 (the effect of d is made explicit in the Mathematical Appendix). Hence the weight w_i of synapse i, after storing n patterns (indexed by j), was equal to

$$ w_i = d^{\,\sum_{j=1}^{n} p_{i,j}} \qquad (6) $$

where p_{i,j} = 1 if the jth pattern had an ON bit at synapse i, and zero otherwise; all weights started from a value of 1. For the PC model, w_i was the factor by which the initial peak synaptic conductance of 200 pS was multiplied to obtain the conductance of the depressed PF-to-PC synapse.

In simulations with nonspecific LTD (nsLTD), the depression spread to neighboring PF synapses even if these were not active during climbing-fibre activation. For the 1D neighborhood function, the weights were updated as follows:

$$ w_i \leftarrow w_i \left( 1 - (1 - d)\, 2^{-\delta} \right) \qquad (7) $$

where δ is the distance of synapse i to the active PF synapse, counted as path length on the ring array; hence δ = 0 for the active synapse itself (giving the factor d = 0.5), δ = 1 for the two nearest-neighbor synapses (factor 0.75), δ = 2 for the two second-nearest neighbors, and so on. Usually the depression was limited to at most three nearest neighbors on either side (see Fig. 4b).

When the 3D neighborhood function was used, all synapses of the model were depressed by an amount equal to 0.5 times the value of a Gaussian distance kernel centred at the active PF synapse:

$$ w_i \leftarrow w_i \left( 1 - (1 - d)\, e^{-\delta^{2} / (2\sigma^{2})} \right) \qquad (8) $$

where δ is the distance to the active synapse in 3D space, and σ is the standard deviation of the Gaussian.
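In code, the weight update for a single ON bit of a stored pattern might look as follows (a sketch of Eqs 7 and 8; variable names are ours):

```python
import numpy as np

d = 0.5  # depression factor at the active synapse

def nsltd_1d(w, i, radius):
    """1D leakage (Eq. 7) on a ring of w.size synapses: the active
    synapse i is multiplied by d; a neighbor at ring distance delta
    by 1 - (1 - d) * 2**-delta (0.75, 0.875, 0.9375 for delta = 1, 2, 3)."""
    w[i] *= d
    for delta in range(1, radius + 1):
        factor = 1.0 - (1.0 - d) * 2.0 ** -delta
        w[(i - delta) % w.size] *= factor
        w[(i + delta) % w.size] *= factor

def nsltd_3d(w, dist_to_active, sigma):
    """3D leakage (Eq. 8): every synapse is multiplied by
    1 - (1 - d) * exp(-delta**2 / (2 * sigma**2)), i.e. by d at the
    active synapse and by a factor approaching 1 far away from it."""
    w *= 1.0 - (1.0 - d) * np.exp(-dist_to_active ** 2 / (2.0 * sigma ** 2))
```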

Patterns and noise

As stated above, the patterns were uncorrelated and sparse; a sparsity of 0.007 was used for most of the simulations (the exception being control simulations with varying sparsity, see Fig. 7), meaning that 0.7% of the pattern bits (1000 out of 147,400, or 100 out of 14,740) were ON, and hence 0.7% of the synapses were activated by each pattern. Noise was applied by displacing a percentage of ON bits from their original positions in the trained pattern. After a bit (synapse) had been randomly selected for displacement, it was reassigned to a neighbor according to the defined neighborhood relationship, 1D or 3D. In most simulations, the probability of a neighbor being selected as the target for the displaced bit was proportional to the degree of nsLTD applied to the corresponding synapse. Figure 5d examines the effect of a mismatch between the local spread of noise and that of LTD.

Hence, in the case of one-dimensional nsLTD (in ANN-1D or PC-1D) with leakage to only the nearest neighbor on either side, the probability of each neighbor being selected for activation (switching from OFF to ON) was 0.5. For a two-nearest-neighbor leakage, these values were 0.33 (for each nearest neighbor) and 0.17 (for each next-nearest neighbor), and so on.
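These probabilities follow from normalizing the depression amounts (1 − d)·2^(−δ) of Eq. 7 over the neighborhood (a quick numerical check of the values quoted above):

```python
import numpy as np

# Depression amounts for a two-neighbor leakage: 0.25 for each nearest
# neighbor and 0.125 for each next-nearest neighbor (two of each).
amounts = np.array([0.25, 0.25, 0.125, 0.125])
print(np.round(amounts / amounts.sum(), 2))   # -> [0.33 0.33 0.17 0.17]
```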

To select neighbors for pattern noise with the three-dimensional neighborhood (in ANN-3D or PC-3D), the cumulative distribution function of the Gaussian neighborhood function centred at the synapse selected for noise (the synapse being switched from ON to OFF) was calculated. A number was then drawn from a uniform distribution over the [0, 1] interval and mapped back, through the inverse of the cumulative distribution function, onto the domain of synapses. In this way, the probability of a pattern bit (synapse) being switched from OFF to ON by the noise was proportional to the value of the Gaussian kernel at its distance from the selected synapse (the central bit itself was excluded from selection).
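In code, this inverse-transform sampling step might look as follows (our sketch):

```python
import numpy as np

def sample_noise_target(kernel_values, rng):
    """Pick the index of the neighbor that receives the displaced ON
    bit, with probability proportional to its Gaussian kernel value
    (the originating synapse itself is excluded by the caller)."""
    cdf = np.cumsum(kernel_values)
    return int(np.searchsorted(cdf, rng.random() * cdf[-1]))
```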

Output metrics

The pattern recognition performance of a neuron model was assessed by the signal-to-noise ratio of its responses to 100 stored versus 100 novel patterns. The response criterion differed between the ANN units and the PC model. For an ANN unit, the response was its level of excitation, calculated as the weighted sum of its inputs (the inner product of weight and pattern vector, Equation 1). For the biophysical PC model, which generated action potentials, the most sensitive response metric3 was the duration of the pause in firing between the initial burst response to the pattern and the resumption of spontaneous spiking (see Fig. 2a,d).

From the distributions of the obtained responses (Fig. 2c,f), a signal-to-noise ratio was calculated as in ref. 67:

$$ s/n = \frac{(\mu_s - \mu_n)^2}{(\sigma_s^2 + \sigma_n^2)/2} $$

which is the square of the difference in mean response between 100 stored and 100 novel patterns, divided by the mean of their variances. For robustness, the whole procedure of learning and recognition was repeated 10 times to obtain an average s/n and a standard deviation, indicated by error bars in the figures.

Implementation

The ANN models and all analyses were implemented in Matlab (The MathWorks). The PC model was converted from its original GENESIS code to NEURON68. The simulations were run on the high-performance computing facility of the University of Hertfordshire Science and Technology Research Institute.

Additional Information

How to cite this article: Safaryan, K. et al. Nonspecific synaptic plasticity improves the recognition of sparse patterns degraded by local noise. Sci. Rep. 7, 46550; doi: 10.1038/srep46550 (2017).

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.