Introduction

The use of objective functions for the formulation of complex systems has seen a steady surge of interest. Objective functions, in particular objective functions based on information-theoretical principles1,2,3,4,5,6,7, are used increasingly as generating functionals for the construction of complex dynamical and cognitive systems. There is then no need to formulate equations of motion by hand, just as, in analogy, Newton's equations of motion can be generated in classical mechanics from an appropriate Lagrange function. When studying dynamical systems generated from objective functions encoding general principles, one may expect to obtain a deeper understanding of the resulting behavior. The kind of generating functional employed serves, in addition, to characterize the class of dynamical systems for which the results obtained may be generically valid.

Here we study the interplay between two generating functionals. The first generating functional is a simple energy functional. Minimizing this objective function one generates a neural network with predefined point attractors, the Hopfield net8,9. The second generating functional describes the information content of the individual neural firing rates. Minimizing this functional results in maximizing the information entropy6,7 and in the generation of adaption rules for the intrinsic neural parameters, the threshold and the gain. This principle has been denoted polyhomeostatic optimization7,10, as it involves the optimization of an entire function, the distribution function of the time-averaged neural activities.

We show that polyhomeostatic optimization destabilizes all attractors of the Hopfield net, turning them into attractor ruins. The resulting dynamical network is an attractor relict network and the dynamics involves sequences of continuously latching transient states. This dynamical state is characterized by trajectories slowing down close to the succession of attractor ruins visited consecutively. The two generating functionals can have incompatible objectives. Each generating functional, on its own, would lead to dynamical states with certain average levels of activity. Stress is induced when these two target mean activity levels differ. We find that the system responds to objective function stress by resorting to intermittent bursting, with laminar flow interspersed with bursts of transient-state latching.

Dynamical systems functionally equivalent to attractor relict networks have been used widely to formulate dynamics involving on-going sequences of transient states. Latching dynamics has been studied in the context of grammar generation with infinite recursion11,12 and in the context of reliable sequence generation13,14. Transient state latching has also been observed in the brain15,16 and may constitute an important component of the internal brain dynamics. This internal brain dynamics is autonomous and ongoing, being modulated, but not driven, by the sensory input17,18. In this context an attractor relict network has been used to model autonomous neural dynamics in terms of sequences of alternating neural firing patterns19. The modulation of this type of internal latching dynamics by a stream of sensory inputs results, via unsupervised learning, in a mapping of objects present in the sensory input stream to the preexisting attractor ruins of the attractor relict network20,21, the associative latching dynamics thus acquiring semantic content.

Results

As an introductory example we consider one of the simplest possible networks having two distinct attractors in terms of minima of the energy functional, as defined by Eq. (7). The dynamics of the 3-site network illustrated in Fig. 1 is given by

ẋ1 = −Γx1 + w+y2 − w−y3,   ẋ2 = −Γx2 + w+(y1 + y3),   ẋ3 = −Γx3 + w+y2 − w−y1,   (1)
with w+ > 0 and w− > 0 denoting the excitatory and inhibitory link strengths, respectively. Here the xi/yi correspond to neural membrane potentials and firing rates, respectively, related through a sigmoidal transfer function, see Eq. (5), which is parameterized by the gain ai and the threshold bi. For fixed intrinsic parameters a ≡ ai and b ≡ bi, viz for vanishing adaption rates εa = 0 = εb, the network has two possible phases, as shown in Fig. 1. There is either a single global attractor, with activities yi determined mostly by the value of the threshold b, or two stable attractors, with either y1 or y3 being small and the other two firing rates large.
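For illustration, the three-site dynamics can be integrated directly. The following is a minimal sketch, not code from the paper: it assumes the sigmoidal transfer function of Eq. (5), simple Euler integration, and placeholder parameter values chosen inside the two-attractor region.

```python
# Sketch: integrating the three-site dynamics of Eq. (1) with fixed intrinsic
# parameters, assuming y = 1/(1 + exp(a*(b - x))) for the transfer function (5).
# The values of a and b below are illustrative placeholders.
import numpy as np

Gamma, w_plus, w_minus = 1.0, 1.0, 1.0
a, b = 6.0, 0.5                                   # fixed gain and threshold

W = np.array([[0.0,      w_plus, -w_minus],       # site 1 <- sites 2 (+), 3 (-)
              [w_plus,   0.0,     w_plus],        # site 2 <- sites 1 (+), 3 (+)
              [-w_minus, w_plus,  0.0]])          # site 3 <- sites 2 (+), 1 (-)

def g(x):                                         # transfer function, Eq. (5)
    return 1.0 / (1.0 + np.exp(a * (b - x)))

def relax(x, dt=0.02, steps=20000):               # plain Euler integration of (1)
    for _ in range(steps):
        x = x + dt * (-Gamma * x + W @ g(x))
    return g(x)

for x0 in ([1.0, 1.0, -1.0], [-1.0, 1.0, 1.0]):   # two biased initial conditions
    print(np.round(relax(np.array(x0)), 2))       # close to (1,1,0) and (0,1,1)
```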

Figure 1

Left: A three-site graph with symmetric excitatory (solid green lines) and inhibitory connections (red dashed lines). Right: The corresponding phase diagram, for fixed Γ = 1 and w+ = 1 = w−, as a function of the gain a ≡ ai and threshold b ≡ bi, for i = 1, 2, 3. For large/small thresholds the sites tend to be inactive/active. The color encodes the activity of the most active neuron. Inside the shaded area there are two stable attractors, outside a single, globally attracting state. The binary representations of the two non-trivial attractors are (1,1,0) and (0,1,1), where 1/0 denotes an active/inactive site. The lower and upper lines (blue/white) of the phase transition are of second and first order respectively, with the lower phase transition line determined by a ȳ(1 − ȳ) = 1, where ȳ is the activity of the single fixed point at sites one and three. The two critical lines meet at (ac, bc) = (4, 0.413) (blue dot).

The point attractors present for vanishing intrinsic adaption become unstable for finite adaption rates εa, εb > 0. Point attractors, defined by ẋi = 0, have time-independent neural activities. The objective (9), to attain a minimal Kullback-Leibler divergence, can however not be achieved for constant firing rates yi. The polyhomeostatic adaption (11) hence forces the system to become autonomously active, as seen in Fig. 2.

Figure 2

Time series of the three-site network shown in Fig. 1, for Γ = 1, finite adaption rates εa and εb, λ1 = 0, λ2 = 0 and w± = 1.

The dynamics is given by (6) and (11). The dynamics retraces periodically the original attractor states, as one can see from the oscillations of the overlaps Op and Ap, compare Eqs. (13) and (14), between the neural activities and the attractor patterns.

In Fig. 2 we present the results for the overlaps Op and Ap, as defined by Eqs. (13) and (14), of the neural activity with the stored memories. One observes that the original attractors become unstable in the presence of finite polyhomeostatic adaption, retaining however a prominent role in phase space. Unstable attractors, transiently attracting the phase space flow, can be considered to act as ‘attractor relicts’. The resulting dynamical system is hence an attractor relict network, viz a network of coupled attractor ruins19,20. When coupling an attractor network to slow intrinsic variables, here the threshold b and the gain a, the attractors are destroyed but retain a presence in the flow in terms of attractor ruins, with the dynamics slowing down close to the attractor relicts.

The concept of an attractor ruin is quite general and attractor relict networks have implicitly been generated in the past using a range of functionally equivalent schemes. It is possible to introduce additional local slow variables, like a reservoir, which are slowly depleted when a unit is active19,20. Similarly, slowly accumulating spin variables have been coupled to local dynamic thresholds in order to destabilize attractors13 and local, slowly adapting, zero state fields have been considered in the context of latching Potts attractor networks11,12,22,23. Alternatively, attractors have been destabilized in neural networks by introducing slowly adapting asymmetric components to the inter-neural synaptic weights14.

Phase boundary adaption

In Fig. 3 we have superimposed, for the three-site network, a typical set of trajectories, with finite adaption rates εa and εb, onto the phase diagram evaluated for non-adapting neurons, with εa = 0 = εb, as shown in Fig. 1. The three trajectories (green: (a2, b2), red and blue: (a1, b1) and (a3, b3)) all start well inside the region with a single global fixed point, at ai = 5, bi = −0.5 (i = 1, 2, 3). Two features of the flow are eye-catching.

  • The intrinsic parameters of the active neurons settle, after a transient initial period, at the phase boundary and oscillate back and forth between the phase with a single stable fixed point and the phase with two attractors.

  • The trajectory (a2, b2) for the central neuron adapts to a large value for the threshold, taking care not to cross any phase boundary during the initial transient trajectory.

Figure 3

The trajectories of (ai(t), bi(t)), for finite adaption rates εa and εb, superimposed onto the phase diagram of the three-site network shown in Fig. 1. The color encodes the activity of the most active site. Inside the shaded blue area two attractors are stable, outside there is only a single global attractor. The parameters are the same as for the simulation presented in Fig. 2. The thresholds b1(t) and b3(t) (red/blue trajectories) oscillate across the second-order phase boundary (blue line); the threshold b2(t) (green trajectory), for the central neuron receiving two excitatory inputs, acquires a large value and avoids the first-order phase boundary (white line) during the initial transient. All three trajectories start in the center of the lower half of the phase diagram, at a = 5, b = −0.5 (black dot). Note that the underlying phase diagram is for identical thresholds bi and gains ai.

We first discuss the physics behind the second phenomenon. The Euclidean distance in phase space between the two attractors, when existing, can be used as an order parameter. In Fig. 4 the order parameter is given for vertical cuts through the phase diagram. The lower transition is evidently of second order, which can be confirmed by a stability analysis of (1), with the upper transition being of first order, in agreement with a graphical analysis of (1). A first order transition will not be crossed, in general, by an adaptive process, as there are no gradients for the adaption to follow. This is the reason that the trajectory for (a2, b2) shown in Fig. 3 needs to take a large detour in order to arrive at its target spot in phase space. This behavior is independent of the choice of initial conditions.
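Vertical cuts of the kind shown in Fig. 4 can be reproduced along the following lines. The sketch relaxes the three-site dynamics (1) from two biased initial conditions and measures the distance between the resulting fixed points; the gain value and the integration details are illustrative choices, not the settings of the original simulations.

```python
# Sketch: order parameter (distance between the two coexisting attractors)
# along a vertical cut of the phase diagram, at fixed gain a.
import numpy as np

Gamma, wp, wm, a = 1.0, 1.0, 1.0, 5.0
W = np.array([[0.0, wp, -wm], [wp, 0.0, wp], [-wm, wp, 0.0]])

def g(x, b):                                    # transfer function, Eq. (5)
    return 1.0 / (1.0 + np.exp(a * (b - x)))

def relax(x, b, dt=0.05, steps=8000):           # Euler relaxation to an attractor
    for _ in range(steps):
        x = x + dt * (-Gamma * x + W @ g(x, b))
    return g(x, b)

for b in np.linspace(-0.2, 1.0, 25):
    y1 = relax(np.array([ 1.0, 1.0, -1.0]), b)  # biased toward (1,1,0)
    y2 = relax(np.array([-1.0, 1.0,  1.0]), b)  # biased toward (0,1,1)
    print(f"b = {b:5.2f}   order parameter = {np.linalg.norm(y1 - y2):.3f}")
```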

Figure 4

Vertical cuts (for constant gains a = 4.0, 4.3, 4.7, 5.0, 6.0, bottom to top curves) through the phase diagram of the 3-site graph shown in Fig. 1. Plotted is the order parameter |y(1) − y(2)|, the Euclidean distance between the two fixed-point solutions y(γ) (γ = 1, 2) in the coexistence region. The lower/upper transitions are second/first order.

Using an elementary stability analysis one can show that the condition for the second-order line is a ȳ(1 − ȳ) = 1, where ȳ = y1 = y3 is the activity of sites one and three at the single global fixed point. Only a single global attractor is consequently stable for a ≤ 4, since ȳ(1 − ȳ) ≤ 1/4, and the second and the first order lines meet at (ac, bc) = (4, 0.413), where the critical threshold bc is determined by the self-consistent solution of bc = 1/[1 + exp(4bc − 4)] − 1/2. The trajectories (a1, b1) and (a3, b3) are hence oscillating across the locus in phase space where there would be a second-order phase transition, compare Fig. 3, for the case of identical internal parameters ai and bi. With adaption, however, the respective internal parameters take distinct values, (a, b ± δb) for the first/third neuron and (a, b2) for the second neuron, with a ≈ 6, b ≈ 0 and b2 ≈ 1. In fact the system adapts the intrinsic parameters dynamically in a way that a non-stopping sequence of states

… → ξ1 → ξ0 → ξ2 → ξ0 → ξ1 → …   (2)

is visited, where ξ1 = (1, 1, 0), ξ2 = (0, 1, 1) denote the two stable cliques and ξ0 = (1, 1, 1) the fixed point in the phase having only a single global attractor. It is hence not a coincidence that the system adapts autonomously, as shown in Fig. 3, to a region in phase space where there would be a second order phase transition for identical internal parameters. At this point, small adaptive variations make the sequence (2) achievable. The adaption process hence shares some phenomenological aspects with what is called ‘self-organized quasi-criticality’ in the context of absorbing phase transitions27,28, which denotes the circumstance that a system without energy conservation may become critical only when tuning an external recharging rate. For imperfect tuning the trajectories would linger close to the second-order phase transition point, without actually reaching it, a phenomenology close to the one seen in Fig. 3.
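The critical threshold quoted above can be checked numerically; a minimal sketch (plain fixed-point iteration, written here only for illustration):

```python
# Quick check of the self-consistency condition b_c = 1/[1 + exp(4*b_c - 4)] - 1/2
# at the critical gain a_c = 4.
from math import exp

b = 0.0
for _ in range(100):                     # plain fixed-point iteration
    b = 1.0 / (1.0 + exp(4.0 * b - 4.0)) - 0.5
print(round(b, 3))                       # -> 0.413
```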

Objective function stress

We will now investigate networks of larger size N. In principle we could generate random synaptic link matrices {wij} and find their eigenstates using a diagonalization routine. When using the Hopfield encoding (15) for the synaptic weights wij the attractors are known, corresponding, as long as the number Np of patterns is not too large8,9, to the stored patterns ξp. We here make use of the Hopfield encoding for convenience; no claim is made that memories in the brain are actually stored and defined by (15).

We consider in the following random binary patterns, as illustrated in Fig. 5,

ξip = 1 with probability α,   ξip = 0 with probability 1 − α,   (3)
where α is the mean activity level or sparseness. The patterns ξp have in general a finite, albeit small, overlap, as illustrated in Fig. 5. The target distribution q(y) for the intrinsic adaption has, for λ2 = 0, an expected mean

μ = ∫ y q(y) dy = 1/[1 − exp(−λ1)] − 1/λ1,   (4)
which can be evaluated noting that the support of q(y) is [0, 1]. The target mean activity μ can now differ from the average activity of an attractor relict, the mean pattern activity α, compare (3). The difference between the two quantities, viz between the two objectives, energy minimization vs. polyhomeostatic optimization, induces stress into the latching dynamics.
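As an illustration of this relation (compare Table 1), the mean of the target distribution can be evaluated numerically. The sketch below assumes that, for λ2 = 0, the maximum-entropy target reduces to q(y) ∝ exp(λ1 y) on the unit interval; the function names and the inversion step are illustrative additions.

```python
# Sketch: mean of the target distribution q(y) ∝ exp(λ1*y) on [0,1] (λ2 = 0),
# compared with the closed form of Eq. (4), and numerical inversion μ -> λ1.
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

def mu_of_lambda(l1):
    """Mean of q(y) ∝ exp(l1*y) on the unit interval."""
    Z, _ = quad(lambda y: np.exp(l1 * y), 0.0, 1.0)
    m, _ = quad(lambda y: y * np.exp(l1 * y) / Z, 0.0, 1.0)
    return m

print(mu_of_lambda(-5.1))                                  # ≈ 0.19
print(1.0 / (1.0 - np.exp(5.1)) + 1.0 / 5.1)               # closed form, same value
# which λ1 produces the target mean μ = 0.15 used below for the stressed case?
print(brentq(lambda l: mu_of_lambda(l) - 0.15, -60.0, -0.01))   # ≈ -6.6
```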

Figure 5

Left: Example of NP = 7 random binary patterns ξp, compare (3), with sparseness α = 0.3 for a network of N = 100 neurons. Right: Respective pairwise mutual pattern overlaps O(ξp, ξq). Patterns overlap maximally with themselves, the inter-pattern overlap is random and small.

In Fig. 6 we present the time evolution of the overlaps Ap and Op for α = 0.3 = μ and finite adaption rates εa and εb. In this case the target mean activity, μ, of the intrinsic adaption rules (11) is consistent with the mean activity α of the stored patterns ξp, viz with the mean activity level of the attractor relicts. One observes that the system settles into a limiting cycle with all seven stored patterns becoming successively transiently active, a near-to-perfect latching dynamics. The dynamics is very stable and independent of the initial conditions, which were selected randomly.

Figure 6

Overlaps Op and Ap (color encoded, see Eqs. (13) and (14)), for an N = 100 network, of the neural activities with the Np = 7 attractor ruins, as a function of time t. The sparseness of the binary patterns defining the attractor ruins is α = 0.3. Here the average activity α of the attractor relicts and the target mean activity level μ = 0.3 have been selected to be equal, resulting in a limiting cycle with clean latching dynamics.

In Fig. 7 we present the time evolution, for 20 out of the N = 100 neurons, of the respective individual variables. The simulation parameters are identical for Figs. 6 and 7. Shown in Fig. 7 are the individual membrane potentials xi(t), the firing rates yi(t), the gains ai(t) and the thresholds bi(t). The latching activation of the attractor relicts seen in Fig. 6 is reflected in corresponding transient activations of the respective membrane potentials and firing rates. The oscillations in the thresholds bi(t) drive the latching dynamics, interestingly, even though the adaption rate is larger for the gain.

Figure 7

Time evolution, for a selection of 20 out of N = 100 neurons, of the membrane potential xi, gain ai, threshold bi and firing rate yi for the time series of overlaps presented in Fig. 6.

The synaptic weights wij are symmetric and consequently so is the overlap matrix presented in Fig. 5. The latching transitions evident in Fig. 6 are hence spontaneous in the sense that they are not induced by asymmetries in the weight matrix14. We have selected uncorrelated patterns ξp and the chronological order of the transient states is hence determined by small stochastic differences in the pattern overlaps. It would however be possible to consider correlated activity patterns ξp incorporating a rudimentary grammatical structure22,23, which is however beyond the scope of the present study.

Stress-induced intermittency

In Fig. 8 we present the time evolution of the overlaps Ap and Op for the case when the two objective functions, the energy functional and the polyhomeostatic optimization, incorporate conflicting targets. We retain the average sparseness α = 0.3 for the stored patterns, as for Fig. 6, but reduce the target mean firing rate to μ = 0.15. This discrepancy between α and μ induces stress into the dynamics. The pure latching dynamics, as previously observed in Fig. 6, corresponds to a mean activity of about 0.3, in conflict with the target value μ = 0.15.

Figure 8

Time evolution of the overlaps Op and Ap (color coded, compare Eqs. (13) and (14)) for all NP = 7 binary patterns, with sparseness α = 0.3, for a network of N = 100 neurons. The target mean neural activity is μ = 0.15; the difference between the two objective functions, viz between μ and α, induces stress. The intermittent latching dynamics has a mean activity of about 0.3, which is too large. The phases of laminar flow between the bursts of latching have a reduced average activity, thus reducing the time-averaged mean activity level toward the target μ = 0.15.

Phases of laminar flow are induced by the objective function stress and the latching dynamics occurs now in the form of intermittent bursts. The neural activity, see Ap in Fig. 8, is substantially reduced during the laminar flow and the time-averaged mean firing rate is thus reduced towards the target activity level of μ = 0.15. The trajectory does not come close to any particular attractor relict during the laminar flow, the overall Op being close to 0.5 for all Np = 7 stored patterns.

The time evolution is hence segmented into two distinct phases, a slow laminar phase far from any attractor and intermittent phases of bursting activity in which the trajectories linger transiently close to attractor relicts, with comparatively fast (relative to the time scale of the laminar phase) latching transitions. Intermittent bursting dynamics has been observed previously in polyhomeostatically adapting neural networks with randomly selected synaptic weights7,10; the underlying causes had however not been clear. The results presented above show that the concept of competing generating functionals can be used to define objective function stress and that a given difference in target mean activities is the underlying force driving the system into the intermittent bursting regime.

Robustness of transient state dynamics

The latching dynamics presented in Figs. 6 and 8 is robust with respect to the system size N. We ran the simulation for networks of different sizes with up to N = 10⁵ neurons, a series of sparseness parameters α and numbers of stored patterns NP. As an example we present in Fig. 9 the overlap Op for N = 1000 neurons and NP = 20 binary patterns with a sparseness of α = 0.2. No stress is present; the target mean activity level is μ = 0.2. The latching dynamics is regular and no intermittent bursting is observed. There is no constraint, generically, forcing the limiting cycle to incorporate all attractor relicts. Indeed, for the transient state dynamics presented in Fig. 9, some of the stored patterns are never activated.

Figure 9

Time evolution, for N = 1000, of the overlaps Op for all NP = 20 binary patterns (vertically displaced) with sparseness α = 0.2 and a target mean activity of μ = 0.2. There is no objective-function stress and the latching is regular; the adaption rates εa and εb are finite.

The data presented in Figs. 6, 8 and 9 is for small numbers of attractor relicts Np, relative to the system size N; a systematic study for large values of Np is beyond the scope of the present work. Latching dynamics tends to break down, generically speaking, when the overlap between distinct attractors, as shown in Fig. 5, becomes large. The autonomous dynamics then becomes irregular.

Discussion

The use of generating functionals has a long tradition in physics in general and in classical mechanics in particular. Here we point out that using several multivariate objective functions may lead to novel dynamical behaviors and to an improved understanding of complex systems in general. We propose in particular to employ generating functionals which are multivariate in the sense that they are used to derive the equations of motion for distinct, non-overlapping subsets of dynamical variables. In the present work we have studied a neural network with fast primary variables xi(t), the membrane potentials, and slow secondary variables ai(t) and bi(t) characterizing the internal behavior of the individual neurons, here the gains and the thresholds. The time evolution of these two sets of interdependent variables is determined by two respective generating functionals.

The equations of motion (leaky integrator) for the primary dynamical variables, the individual membrane potentials xi(t), are generated by minimizing an energy functional. The Kullback-Leibler divergence is, on the other hand, an example of an information-theoretical functional. Minimizing the Kullback-Leibler divergence between the distribution pi(y) of the time-averaged neural firing rate yi and a target distribution function q(y) maximizing the information entropy generates intrinsic adaption rules for the gain ai and the threshold bi (polyhomeostatic optimization).

Generating functionals may incorporate certain targets or constraints, either explicitly or implicitly. We denote the interplay between distinct objectives incorporated by competing generating functionals “objective function stress”. For the two generating functionals considered in this study there are two types of objective function stress.

The minima of the energy functional are time-independent point attractors leading to firing-rate distributions pi(y) which are sharply peaked. The target firing-rate distribution q(y) for the information-theoretical functional is however smooth (polyhomeostasis). This functional stress leads to the formation of an attractor relict network. The mean target neural firing rate μ is, on the other hand, a scalar parameter of the target firing-rate distribution q(y) and hence encoded explicitly within the information-theoretical functional. The local minima of the energy functional, determined by the synaptic weights wij, determine implicitly the mean activity levels α of the corresponding point attractors. Scalar objective function stress is present for α ≠ μ.

For the two generating functionals considered, we find that the scalar objective function stress induces a novel dynamical state, characterized by periods of slow laminar flow interspersed with bursts of rapid latching transitions. We propose that objective function stress is a powerful tool, in general, for controlling the behavior of complex dynamical systems. The interplay between distinct objective functions may hence serve as a mechanism for guiding self organization30,31,32, a process also denoted targeted self organization25.

The main focus of this study has been to investigate and to quantify, for larger networks, the competition between an objective function based on energy minimization and a generating functional based on the principle of homeostasis. Within this investigation we also mapped out the phase diagram of a model 3-site network having either one or two transiently stable attractors. This application to a simple model network allowed us to investigate the adaptive flow within the underlying phase diagram, as generated in the adiabatic limit by the energy functional, compare Fig. 3. The results indicate that polyhomeostatic adaption has the tendency to drive the system toward regions of phase space close to a second order phase transition, as this allows it to generate the desired variability in neural activity. We plan to investigate this interesting phenomenon further in future studies.

Methods

We use N rate-encoding neurons in continuous time, with a non-linear transfer function,

yi(t) = g(xi(t)) = 1/[1 + exp(ai(bi − xi(t)))],   (5)

where xi(t) is the membrane potential and yi(t) ∈ [0, 1] the firing rate of the ith neuron. The dynamics of the neural activity is

ẋi = −Γxi + Σj wij yj,   (6)
which describes a leaky integrator24, with Γ being the leak rate and the wij the inter-neural synaptic weights. This kind of dynamics can be derived by minimizing a simple energy functional

E = −(1/2) Σij wij yi yj + Γ Σi ∫0^yi g⁻¹(u) du,   (7)
with respect to the membrane potentials xi8,9. When using an energy functional to derive (6), the resulting synaptic weights are necessarily symmetric, viz wij = wji. Here we do indeed consider only symmetric synaptic links; the dynamics of a network of leaky integrators would however remain well behaved when generalizing to non-symmetric weight matrices.
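A short numerical check of this Lyapunov-type property is given below. The sketch assumes the standard continuous-Hopfield form of the energy, taken here to correspond to Eq. (7), together with random symmetric placeholder weights; it is an illustration, not code from the original simulations.

```python
# Sketch: for symmetric weights the leaky-integrator dynamics (6) decreases a
# Hopfield-type energy E = -1/2 Σ w_ij y_i y_j + Γ Σ_i ∫_0^{y_i} g^{-1}(u) du,
# assumed here to correspond to Eq. (7).
import numpy as np

rng = np.random.default_rng(1)
N, Gamma, a, b, dt = 10, 1.0, 4.0, 0.3, 0.02
W = rng.normal(0.0, 0.5, (N, N))
W = 0.5 * (W + W.T)                       # symmetrize, w_ij = w_ji
np.fill_diagonal(W, 0.0)

def g(x):                                 # sigmoidal transfer function, Eq. (5)
    return 1.0 / (1.0 + np.exp(a * (b - x)))

def energy(x):
    y = g(x)
    # closed form: ∫_0^y g^{-1}(u) du = b*y + [y*ln(y) + (1-y)*ln(1-y)]/a
    integral = b * y + (y * np.log(y) + (1.0 - y) * np.log(1.0 - y)) / a
    return -0.5 * y @ W @ y + Gamma * integral.sum()

x = rng.normal(0.0, 1.0, N)
for step in range(2001):
    if step % 400 == 0:
        print(f"t = {step*dt:5.1f}   E = {energy(x):+.5f}")   # E is non-increasing
    x = x + dt * (-Gamma * x + W @ g(x))                       # Euler step of (6)
```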

The transfer function (5) contains two intrinsic parameters, a and b, which can be either set by hand or autonomously adapted, the approach considered here. The intrinsic parameters determine the shape of the individual firing-rate distributions pi(y), given generically by

pi(y) = (1/T) ∫0^T δ(y − yi(t)) dt,   (8)
where δ(·) is the Dirac δ-function and T → ∞ the observation time-interval. The Kullback-Leibler divergence25

DKL(pi, q) = ∫ pi(y) log[pi(y)/q(y)] dy   (9)
measures the distance between the neural firing-rate distribution pi(y) and a normalized distribution q(y), with DKL ≥ 0 and DKL = 0 only for pi(y) = q(y). Gaussian distributions

q(y) ∝ exp(λ1 y + λ2 y²)   (10)
maximize the information entropy whenever mean and standard deviation are given25. Minimizing the Kullback-Leibler divergence (9) with respect to the individual intrinsic parameters ai and bi is hence equivalent to optimizing the information content of the neural activity6. The optimization can be performed using variational calculus7,10,26; one obtains

ȧi = εa [1/ai + (xi − bi) θi],   ḃi = −εb ai θi,   (11)
with θi = 1 − 2yi + (λ1 + 2λ2yi)(1 − yi)yi and where εa and εb are the adaption rates. When these adaption rates are small with respect to the time scale of the cognitive dynamics (6), an implicit time average with respect to the distribution of input signals is performed10,26, compare (8), and the respective firing-rate distribution pi(y) approaches the target distribution function q(y). The rules (11) are also denoted stochastic adaption rules, since they correspond7,10,26, for small adaption rates εa and εb, to an implicit (stochastic) average over all input signals xi. The adaption rules (11) implement the optimization of an entire distribution function, and not of a single scalar quantity, and are hence equivalent to a polyhomeostatic optimization process7.
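The variational calculation can be made concrete in a few lines. The following single-neuron sketch is not code from the paper: it assumes the gradient-descent form ȧ = εa[1/a + (x − b)θ] and ḃ = −εb a θ for the rules (11), as obtained for the transfer function (5), drives the neuron with a stand-in random input instead of the recurrent input, and uses placeholder parameter values.

```python
# Schematic sketch of polyhomeostatic intrinsic adaption for a single neuron.
# The explicit update rules below are a gradient-descent reconstruction of
# Eq. (11), not a verbatim copy; all parameter values are placeholders.
import numpy as np

rng = np.random.default_rng(2)
Gamma, dt = 1.0, 0.1
eps_a, eps_b = 0.01, 0.001                 # slow intrinsic adaption rates
lam1, lam2 = -6.6, 0.0                     # target q(y) ~ exp(lam1*y), mean ≈ 0.15
a, b, x = 1.0, 0.0, 0.0

ys = []
for _ in range(200000):
    inp = rng.normal(0.5, 1.0)             # stand-in for the recurrent input Σ w_ij y_j
    y = 1.0 / (1.0 + np.exp(a * (b - x)))  # transfer function, Eq. (5)
    theta = 1.0 - 2.0 * y + (lam1 + 2.0 * lam2 * y) * (1.0 - y) * y
    x += dt * (-Gamma * x + inp)           # leaky integrator, Eq. (6)
    a += dt * eps_a * (1.0 / a + (x - b) * theta)
    b += dt * (-eps_b * a * theta)
    ys.append(y)

# the time-averaged firing rate should drift toward the target mean μ ≈ 0.15
print("time-averaged firing rate:", np.mean(ys[-50000:]))
```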

The original network (6) has point attractors, as given by ẋi = 0, for all i. These attractors are destabilized for any εa, εb > 0; the coupled system (6) and (11) has no stationary points, as defined by vanishing time derivatives of all dynamical variables. Stationary point attractors lead to neural firing statistics far from the target distribution (10), to a large Kullback-Leibler divergence DKL and hence to finite stochastic gradients (11). The principle of polyhomeostatic adaption is hence intrinsically destabilizing.

In deriving the stochastic adaption rules (11) one rewrites the Kullback-Leibler divergence DKL as an integral over the distribution pi(x) of the respective membrane potential xi, involving an appropriate kernel d(x). The optimal overall adaption rules depend on the specific shape of pi(x). Adaption rules not dependent on the distribution of the membrane potential are, on the other hand, obtained by minimizing the kernel d(x) itself with respect to the gain and the threshold, respectively, which leads to the adaption rules (11), which are instantaneous in time7,10,26.

For the analysis of the simulation results we denote with

ξp = (ξ1p, …, ξNp),   ξip ∈ {0, 1},   (12)

the binary activity patterns of the attractors of (6). These are ξ1 = (1, 1, 0) and ξ2 = (0, 1, 1) for the case of the N = 3 network presented in Fig. 1. We use two criteria in order to investigate to which degree the original binary attractors are retraced. The first criterion is the overlap Op ∈ [0, 1],

Op = (Σi yi ξip) / (|y| |ξp|),   (13)
which corresponds to the cosine of the angle between the actual neural activity (y1, …, yN) and the attractor state ξp. As a second measure of how close the actual activity pattern and the original attractors are, we consider the reweighted scalar product Ap ∈ [0, 1]

Note that the activities entering Ap are positive, representing the neural firing rates in the attractor states. For the general network analysis we use the Hopfield encoding8,9

wij ∝ Σp (ξip − ⟨ξi⟩)(ξjp − ⟨ξj⟩),   ⟨ξi⟩ = (1/Np) Σp ξip,   (15)

for the synaptic weights wij, where the ⟨ξi⟩ are the arithmetic means of all pattern activities for the respective sites. Here Np is the number of encoded attractors and α ∈ [0, 1] the mean pattern activity,

α = (1/(N Np)) Σi,p ξip.   (16)
When using the Hopfield encoding (15), the attractors are known to correspond, to a close degree, to the stored patterns ξp, as long as the number Np of patterns is not too large8,9.
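For the larger networks, the construction of the stored patterns and of the overlap matrix of Fig. 5 can be illustrated as follows. The sketch uses a covariance-type Hopfield rule with a 1/N normalization, which are assumptions standing in for the exact form of Eq. (15), together with the cosine overlap (13).

```python
# Sketch: random sparse patterns (3), a covariance-rule variant of the
# Hopfield encoding (15) and the cosine overlap (13); normalizations are
# illustrative assumptions.
import numpy as np

rng = np.random.default_rng(3)
N, N_p, alpha = 100, 7, 0.3

xi = (rng.random((N_p, N)) < alpha).astype(float)   # random binary patterns, Eq. (3)
xi_mean = xi.mean(axis=0)                           # site-wise mean pattern activity

W = (xi - xi_mean).T @ (xi - xi_mean) / N           # covariance-rule weights
np.fill_diagonal(W, 0.0)                            # no self-coupling

def overlap(y, pattern):                            # cosine overlap O_p, Eq. (13)
    return y @ pattern / (np.linalg.norm(y) * np.linalg.norm(pattern))

O = np.array([[overlap(xi[p], xi[q]) for q in range(N_p)] for p in range(N_p)])
print(np.round(O, 2))    # 1 on the diagonal, markedly smaller random off-diagonal overlaps
```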

We simulated Eqs. (6) and (11) using the classical 4th-order Runge-Kutta scheme29 and a timestep of Δt = 0.1. The resulting dynamics depends on the magnitude of the adaption rates εa and εb for the gain a and the threshold b. In general, latching dynamics is favored for a small threshold adaption rate εb and a somewhat larger gain adaption rate εa. We kept a constant leak rate Γ = 1. The results presented in Figs. 6 and 7 have been obtained setting λ2 = 0 and using Table 1 for λ1.
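For reference, one step of the classical fourth-order Runge-Kutta scheme named above is sketched below; the right-hand side f and the usage example are generic placeholders, not the network equations themselves.

```python
# Generic classical RK4 step with Δt = 0.1, as used for integrating the
# coupled system (6) and (11); shown here for an arbitrary right-hand side.
import numpy as np

def rk4_step(f, state, dt=0.1):
    """One classical Runge-Kutta step for d(state)/dt = f(state)."""
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

# usage example: exponential relaxation dx/dt = -Γx, integrated for t in [0, 5]
Gamma = 1.0
x = np.array([1.0])
for _ in range(50):
    x = rk4_step(lambda s: -Gamma * s, x)
print(x, np.exp(-5.0))     # RK4 result vs. exact solution exp(-5)
```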

Table 1 Relation between the parameter λ1, for λ2 = 0, and the mean value μ, as given by Eq. (4), of the target distribution q(y)