Novelty and imitation within the brain: a Darwinian neurodynamic approach to combinatorial problems

Czégel, Dániel; Giaffar, Hamza; Csillag, Márton; Futó, Bálint; Szathmáry, Eörs

doi:10.1038/s41598-021-91489-5

Download PDF

Article
Open access
Published: 15 June 2021

Novelty and imitation within the brain: a Darwinian neurodynamic approach to combinatorial problems

Dániel Czégel^1,2,3,4^na1,
Hamza Giaffar⁵^na1,
Márton Csillag¹^na1,
Bálint Futó¹ &
…
Eörs Szathmáry^1,2,3

Scientific Reports volume 11, Article number: 12513 (2021) Cite this article

2571 Accesses
2 Citations
5 Altmetric
Metrics details

Subjects

Abstract

Efficient search in vast combinatorial spaces, such as those of possible action sequences, linguistic structures, or causal explanations, is an essential component of intelligence. Is there any computational domain that is flexible enough to provide solutions to such diverse problems and can be robustly implemented over neural substrates? Based on previous accounts, we propose that a Darwinian process, operating over sequential cycles of imperfect copying and selection of neural informational patterns, is a promising candidate. Here we implement imperfect information copying through one reservoir computing unit teaching another. Teacher and learner roles are assigned dynamically based on evaluation of the readout signal. We demonstrate that the emerging Darwinian population of readout activity patterns is capable of maintaining and continually improving upon existing solutions over rugged combinatorial reward landscapes. We also demonstrate the existence of a sharp error threshold, a neural noise level beyond which information accumulated by an evolutionary process cannot be maintained. We introduce a novel analysis method, neural phylogenies, that displays the unfolding of the neural-evolutionary process.

Complex computation from developmental priors

Article Open access 19 April 2023

The generative grammar of the brain: a critique of internally generated representations

Article 30 November 2023

Canonical neural networks perform active inference

Article Open access 14 January 2022

Introduction

Life, with its “endless forms most beautiful”¹, is a result of a Darwinian evolutionary process operating over the enormous representational capacity of chemistry. Any Darwinian process is built on the principles of replicating units, hereditary variation, and selection²; beyond the ability to sustain these, a Darwinian process makes no further demands of the underlying substrate. Replication, hereditary variation and selection collectively operate at other levels of biological organization, for example in the mammalian immune system³, and may well feed on other-than-chemical substrates. The brain’s neural networks are another example of a system that produces seemingly endless beautiful forms, from neural activity to the mental states and actions that they support. The endeavor of Darwinian neurodynamics (DN) explores the possible ways in which (i) Darwinian dynamics might emerge as an effective high-level algorithmic mechanism from plasticity and activity dynamics of neural populations^4,5 (how does it work?) and (ii) this high level algorithmic mechanism fits into cognition (what is it good for?). This endeavour can be seen as part of a more general pursuit of a theory of high-dimensional adaptations that unifies our understanding of evolutionary and learning processes (see Table 1).

Table 1 A unified view of evolutionary and learning processes, as part of a theory of high-dimensional adaptations.

Full size table

In any model of DN, the modeller is required to specify how the ingredients of Darwinian evolution, namely replication, selection, and the heritable variation of firing patterns, emerge from lower-level neural activity and synaptic plasticity rules. A related early idea, termed neural Darwinism, together with its more recent implementations, considers selective amplification of a pre-existing but possibly adaptive repertoire of computational units^{27,28,29,30,31}. It is halfway between traditional neural adaptation mechanisms, implementing effective high-dimensional hill-climbing in representation space, and a bona fide evolutionary search, with multiple rounds of selection over heritable variation generated by high-fidelity replication^4,32. Building on these selectionist ideas, the neuronal replicator hypothesis³³ explores possible ways to add the crucial missing ingredient, replication, using exclusively local neural plasticity and activity rules. Neural replicators proposed to date come in two flavours: (i) combinatorial replicators copy a sequence of low-informational states of neural activity (represented by e.g. bistable neurons), akin to template replication^33,34, and( ii) holistic replicators³⁵ copy one high-informational state, such as memory traces of autoassociative attractor networks⁵. Three broad categories of neuronal units of evolution have been suggested: (i) neural activity patterns^5,33, (ii) connectivity³⁴, and (iii) evolvable neuronal paths³⁰. If neuronal replicators exist, their appearance marks a new way in which information is represented in evolutionary units and transmitted between different neuronal replicators: a filial major transition in evolution, as proposed in³⁶.

DN is similar to other approaches to understanding the computational functions of the mind and its neural correlates in the brain: efficient theoretical frameworks of adaptation, notably reinforcement learning and statistical learning, have been successfully guiding experimentation in neuroscience in recent years. As in the aforementioned theories, in DN, the neural correlates of crucial algorithmic elements, replication, selection and heritable variation, are a priori unknown, emphasizing the importance of hypothesis-driven search. When searching for these algorithmic ingredients of an emergent Darwinian process, three important conceptual issues arise recurrently.

1.
Unlimited hereditary potential Long term evolution rests on the ability of novel solutions, generated via mutation, to replicate. If the set of solutions that can be copied with high-enough fidelity is limited, no efficient evolutionary search can take place. This requirement for high enough fidelity copying of a very large range of potential solutions is one that directly in silico models of evolution rarely has to be considered, as replication can be achieved simply with arbitrary precision. When the replication mechanism is itself explicitly modelled, the question of its capacity to replicate solutions must be addressed.
2.
Maintenance of information Although the appearance of novel beneficial mutations is necessary, it is not sufficient for continuous evolutionary search; information must also be maintained. In particular, the information that is lost through deleterious mutation must (at least) be compensated by selection. This sets an upper limit on the mutation rate, depending on the selective advantage of the current solution over the surrounding ones.
3.
Representation of phenotype/solution In evolutionary theory, phenotypes over which selection acts, are represented by genotypes that replicate. In neurobiology, computational states (e.g. solutions to a given problem) are represented by neural activity patterns. These mappings between genotype/neural activity and phenotype/solution might be highly complex. Representation is important for at least two different reasons. On one hand, it provides enormous adaptive potential: appropriate transformation of rich-enough representations might render simple algorithms surprisingly powerful. In evolution, in particular, this permits genotypic mutations to be channeled along some, potentially fit, phenotypic directions; this is one of the mechanisms upon which evolvability rests. On the other hand, we must separate the effect of representation and the algorithm operating on top of it; in particular, the evaluation of the efficiency of algorithms based on a few hand-picked and hand-fabricated representations of a given problem. This issue is unavoidable in DN, as we have at this point no detailed knowledge of how particular mental states are represented in human and animal brains. That is, the understanding of these relevant variables at the computational/behavioral, algorithmic and implementational level remains at present far from complete.

In this paper, we propose a model of Darwinian neurodynamics based on the emergent replication of dynamic patterns of neural activity. Building on the versatility and efficiency of the reservoir computing paradigm, we demonstrate that dynamic activity patterns can be effectively replicated via training one recurrent network by another. The two main components of our model that map both to building blocks of evolutionary dynamics and proposed mechanisms of mental search (e.g. in the combinatorial space of action sequences) are evaluation of proposed activity patterns and nearly perfect copying of them. The latter proceeds via a teacher network, i.e. the network that generated the fitter signal, training a learner network, i.e. associated to the less fit signal. We show that an ensemble of such recurrent networks, capable of training one another in parallel based on previous evaluation of their output signals, can give rise of an effective evolutionary search that i) maintains information about already discovered solutions and ii) improves on them occasionally through the appearance and spread of beneficial mutations.

By continually selecting and imperfectly replicating solutions that are better fit to the demands of the cognitive task at hand, and discarding those which don’t meet the muster, DN can be seen as a process of stochastic parallel search with redistribution of resources. In the biosphere, this process of Natural Selection successfully operates in an enormously high dimensional (genomes range in size from $\sim 2$kb to $10^8$kb), combinatorial (each position can be one of four bases) and compositional (genomes are semantically and syntactically complex) space of genotypes. The solution spaces of problems in many cognitive domains share these characteristics to differing extents. Language is one such domain; indeed an evolutionary computational framework has previously been reported addressing aspects of this faculty³⁷, in which linguistic constructs (as units of evolution) can be coherently multiplied with hereditary variation under selection to achieve e.g. mutual comprehension in a population. Causal reasoning and action planning are two other examples of cognitive domains characterized by complex search spaces, where it is unclear if local gradient information is useful. To model these complex search spaces in the most generic manner possible, we focus on the capacity of neural-evolutionary dynamics to solve two classes of combinatorial search problems: the Travelling Salesman Problem (TSP) and the tunably rugged NK landscapes³⁸. We demonstrate novel analysis methods that combine approaches from computational neuroscience and phylogenetics, and are generally applicable to any proposed DN architecture.

This endeavour of understanding the implementational, algorithmic and computational features of Darwinian neurodynamics is worthwhile for at least two reasons: (i) for its potential as a useful, though not necessarily ubiquitous, computational module in wet brains; a capacity, which, if it exists, has been selected for by biological natural selection, and (ii) for its potential as a computational module that is orthogonal to current paradigms, implemented in engineered systems such as spiking neural network hardware architectures. We do not expect that DN, as a computational module, is necessarily topmost in the computational hierarchy and therefore we expect behavioral correlates to potentially be indirect. More emphasis is placed here on discussing DN as a fundamental problem-solving neural machine as opposed to relating its solutions to possibly convoluted human behavioral patterns.

Building blocks of models of Darwinian neurodynamics

When proposing a DN architecture, one has to specify how necessary elements of a Darwinian evolutionary process, namely, replication and selection of units of evolution, and heritable variation over them, are neurally implemented. In the following, we show how these choices can be made by building up our model architecture, based on dynamic firing patterns as evolutionary units, which we refer to as recurrent DN, in a step-by-step manner.

Unit of evolution

The unit of evolution in recurrent DN is the output firing pattern of a recurrent computational unit, as illustrated in Fig. 1a. It is, in the current model, a one-dimensional time-dependent signal, usually interpreted as neuronal firing rate. Importantly, the unit of evolution is not the recurrent network itself, it is the computation it does. We use reservoir computing units to implement these recurrent computations. In reservoir computing units, learning takes place only through the linear readout synapses; the recurrent connections within the network are fixed. This makes the map from computer to computation highly degenerate: the same computation, i.e., output signal, can be represented (at least approximately) by an enormous number of different reservoirs. The reservoirs themselves form representational slots for the replicators, i.e., the output signals.

Replication

The crucial step in any DN architecture is the replication of firing patterns, utilizing local learning rules. Here, in recurrent DN, replication takes place by one recurrent network training another, as shown in Fig. 1b. More precisely, the training signal of reservoir A is the output of reservoir B, rendering, after some training period, the two output signals approximately the same. In other words, the output of reservoir B serves as a template for the output of reservoir A, resulting in the replication of the output of reservoir B, with some variation. We use the FORCE algorithm³⁹ to train reservoirs. Note that the FORCE algorithm is a supervised learning algorithm, which is in general inapplicable for computations where a supervised signal is not available; an evolutionary process artificially creates supervised signals by displacing the low-dimensional informational bottleneck between the environment and the system from reservoir training to fitness computations.

Heritable variation

Heritable variation is a necessary for selection to act upon, and therefore, to improve solutions over time according to their fitness metric. Variation comes from two sources in our model. (i) It comes from imperfect training of one network by another via the FORCE algorithm, and (ii) it also comes from neural noise that we model as white noise added to the learned signal. Importantly, this variation is heritable if the resulting mutated signal can be learned by yet another reservoir. If the set of heritable signals is practically unlimited (i.e., much larger than the population size, the number of replicating signals), the system exhibits unlimited hereditary potential, upon which long-term evolution can take place.

Fitness landscape

Evolution, like reinforcement learning, necessarily channels all information about the environment through a one-dimensional bottleneck. It is this scalar-valued function, the fitness function, that must contain all task-related information if the architecture itself is meant to be of general purpose. It is not in the scope of this paper to discuss possible implementations of this evaluation signal in wet brains, but this similarity suggests that if Darwinian dynamics takes place in brains, the striatal reward system might be a possible candidate location of fitness assigment. In this paper, we intend to demonstrate the feasibility of an emergent Darwinian process over dynamic neural firing patterns in an as task-agnostic manner as possible. We therefore evaluate our recurrent DN architecture on general combinatorial landscapes with many non-equivalent local optima: on the Tavelling Salesman Problem and on NK-landscapes with different ruggedness. Figure 1c visualizes the idea of ruggedness of a continuous 2-dimensional landscape; in our simulations, we use high-dimensional discrete landscapes that are difficult to visualize.

The traveling salesman problem (TSP) is stated as: compute the shortest possible route that visits each city (from a given list of destinations) and returns to the origin city. Possible solutions can thus be represented as permutations of the cities. In this classic NP hard combinatorial optimization problem, the total path length l for a permutation is related to fitness f by any monotonously decreasing function.

NK-landscapes⁴⁰, originally modelling epistatic fitness interactions of genes, are simple non-deterministic landscapes that are able to model arbitrary level of ruggedness, set by a parameter K. They assign fitness f to binary sequences $x_1,x_2,\dots x_N$ as

$$\begin{aligned} f(x_1,x_2,\dots , x_N)=\sum _{i=1}^m f_i \end{aligned}$$

(1)

where each of the m fitness-components $f_i$ depend on exactly $K\le N$ coordinates. The value of a fitness component for all of its $2^K$ possible arguments is drawn from the uniform distribution over [0, 1] independently. As an effect of the stochastic nature of NK landscape generation, realizations of landscapes with the same parameters N and K generally differ. N and K therefore define task classes, with N, the number of bits or “genes”, parametrizing problem size and K parametrizing “ruggedness” or “frustration” in the system⁴¹. Note that NK landscapes include spin glass Hamiltonians as special case, corresponding to (a subset of) $K=2$ landscapes. $K>2$ landscapes, on the other hand, model higher order interactions between elements. However, even low levels of ruggedness (any $K>1$) make finding even local optima difficult⁴². Overall, NK landscapes form a simple and flexible benchmark for combinatorial problems.

Selection

Selection in DN refers to the assignment of number of offspring to replicating firing rate patterns. It is done through evaluating the solutions represented by the replicating signals according to a scalar function, the fitness. In our architecture, reservoirs that output higher-fitness signals become trainer reservoirs, reservoirs outputting lower-fitness signals are assigned to be learners. The signal of trainer reservoirs reproduce, while the signal of learner reservoirs dies (i.e., disappears form the population of signals). The full evolutionary process over a two-dimensional sheet of reservoirs is illustrated on Fig. 1d.

Population structure defined by cortical topology

Fitness values determine the reproduction rate of signals, in terms of assigning the corresponding recurrent unit to be either teacher or learner. This assignment is dependent on comparing a subset of signals with each other. We define these competing subsets in terms of a connection topology of recurrent units: each recurrent unit (which is a neural network itself) is competing with its direct neighbours according to this pre-defined meta-topology. Special cases of this network meta-topology include a one-dimensional array of networks for illustration purposes (see phylogenetic trees of signals in section 3), and a two-dimensional grid, that might be identified with a cortical sheet of canonical microcircuits, see Fig. 1a. Although these meta-topologies induce a local competition dynamics, competing groups overlap and therefore high-fitness solutions are able to spread in the population globally. Population structure, defined here by this meta-topology of recurrent units, strongly shapes the course of evolutionary competition dynamics in general. If the evolutionary system as a whole is itself subject to optimization, e.g. by another evolutionary process at a higher level, we might expect to observe fine-tuned hyper-parameters in general, and a fine-tuned cortical topology in particular. Although optimizing cortical topology for various performance measures is out of the scope of this paper, we refer to evolutionary graph theory⁴³ as a systematic study of amplifying and diminishing effects of population structure on selection.

Representation

Notice that replication acts at the level of signals whereas selection acts at the level of solutions represented by the signals. This mapping between signal to solution determines the accessibility of solutions by evolution, just like the genotype-phenotype map in biology determines the accessibility of phenotypes. Solutions that are represented by many signals can be found with relative ease by evolutionary trajectories, whereas solutions that are represented by a tiny fraction of signal space or not at all are difficult to find. Statistical properties of the signal-solution map are therefore of fundamental importance. Since our goal here is to explore how well our recurrent DN arcitecture performs over rugged fitness surfaces with a large number of non-equivalent local optima, we map the signals to combinatorial solutions represented by (i) permutations, in case of the travelling salesman problem (TSP) fitness landscape, or by (ii) binary sequences, in case of the NK-landscape. These signal-solution maps are visualized in Fig. 2a; their statistical properties are shown in Fig. 2b–e.

Results

Having described the essential ingredients, we now focus on exploring the characteristics of recurrent Darwinian neurodynamics as a coherent computational framework. In all cases, we either use a one dimensional array, or a two-dimensional sheet of reservoirs as meta-topology (i.e., population structure from an evolutionary point of view), mirroring the possibility of representation by cortical microcircuits. First, we illustrate the evolution of the best firing pattern and the corresponding solution on a well-known combinatorial optimization problem, the traveling salesman problem (TSP), see Fig. 3a. We then turn to a more general (but less visual) set of combinatorial problems, optimization over NK landscapes. We introduce an analysis technique for tracking the evolution of firing patterns as a population, neural phylogenies (Fig. 3b,c), and demonstate the ability of recurrent DN to continuously improve upon currently best solutions through beneficial mutations. Next, we formulate fitness in terms of information gain in order to make evolution over different landscapes comparable, and we show the time evolution of fitness distribution of the population over different landscapes and initial populations (Fig. 3d). Finally, we demonstrate the existence of a sharp neural-evolutionary error threshold, a neural noise level above which information gained by evolution cannot be maintained (Fig. 4a–c), and contrast it with the effect of varying selection strength (Fig. 4d–f) that does not entail a sharp breakdown of the evolutionary process.

We emphasize here that there is nothing inherent in the computational units (reservoirs) or in the specific signal-solution map that makes the system suitable to search in combinatorial spaces. From the point of view of the computational task, these are arbitrary choices. This highlights our point: it is the emergent evolutionary dynamics over the output signals that is responsible for any higher-than-random performance on a combinatorial task.

Breeding traveling salesmen

Given a list of cities, the Traveling Salesman Problem (TSP), asks for the shortest length route that visits each city and returns to the origin. While simple to state, both the rapid growth of the number of possible routes (permutations of the city list) with the number of cities and the difficulty of factoring the problem into simpler sub-problems renders the TSP NP hard. A wide range of algorithmic approaches have been proposed for the TSP, including those inspired by evolutionary dynamics. Here, we select the TSP as a classic example of a difficult combinatorial optimization problem to demonstrate the behaviour of this recurrent DN architecture. The goal of this illustration is not to compete with other specialized algorithms employing TSP specific heuristics, but to highlight the importance of representation and demonstrate the capacity of this DN system.

Figure 2a illustrates one possible encoding of a permutation in a time varying signal; here each coloured dot represents a city identity and the order along the y axis represents a route. Notice again that many other possible signals (genotypes) are consistent with a specific route (phenotype): this degeneracy may have important consequences for the evolvability of the representation (see “Discussion”). Figure 3a shows a toy example TSP with 10 cities (a solution space size of $\sim 2^{18}$) that is solved via evolutionary dynamics over a 1D array of reservoirs.

Although performance, in general, depends on the initial signals and the inner weights of reservoirs (determining the scope of signals that can be learned by each reservoir), the single run we visualize in Fig. 3a captures an essential feature of any evolutionary dynamics: the automatic reallocation of computational resources to improving upon currently best solutions. After pooling a diverse set of initial conditions, the dynamics automatically narrows down to local improvements around highly promising solutions, culminating, in this example, in the global optimum after 11 generations.

Phylogeny of firing patterns. Although tracking the best solution at all times already offers intuition behind our neuro-evolutionary dynamics, a higher resolution analysis can be given by following the fate of all signals as they replicate and mutate, leading to neural phylogenies (ancestry trees), visualized in Fig. 3b and c.

We twist conventional phylogenies in two ways. First, since the spatial arrangement of reservoirs (i.e., hosts of replicating signals) is fixed to be a 1D array or a 2D lattice, we simply use this additional information to fix the spatial position of replicators at all times to be those of their host reservoirs. As time (measured in the number of generations) passes vertically, a vertical line corresponds to replicators in the same host reservoir at subsequent generations. Second, in order to keep phylogenies simple, we visualize only the fitness of each signal at all times; since the fitness of different signals is likely to be different, this provides a good proxy for the actual phylogeny of signals.

Such neural phylogenies are especially useful in detecting the dynamics of relevant scales of competition. Initially, out of the sea of randomly chosen initial conditions, locally good signals emerge and start to spread. They then form growing homogeneous islands. Higher-fitness islands out compete lower-fitness ones, the relevant scale of competition increases. In the meantime, computations given by local fitness comparisons are allocated to compare higher and higher fitness solutions. Phylogenies also reveal the reservoirs’ differential ability to learn; in extreme cases, particularly inflexible reservoirs form a local wall, preventing otherwise high-fitness signals as replicators to spread.

Information gain

Since fitness values f cannot be compared across landscapes, we transform fitness to information gain $I_f$, calculated as $I_f=-\log _2 q_f$, where $q_f$ is the fraction of all types (here, binary sequences of length N) that have higher or equal fitness than f. If the fitness f of the current type is the median of all fitness values, then $I_f=1\, \text {bit}$; if f is at 75 percentile, $I_f=2\, \text {bits}$, and so on. Figure 3d shows the distribution of information gain $I_f$ in the population as it evolves over NK landscapes with $N=20$ and $K=5$. We compare runs over three axes: different population of initial signals, different landscapes, and the dimensionality of the meta-topology of reservoirs (i.e., 1D array or 2D lattice). The high observed between-run variability makes it uninformative to average performance over multiple runs, we therefore present results of single runs. In all runs, information gain increases monotonously. It increases in jumps, corresponding to rare beneficial mutations that improve the currently best solutions. Between jumps, information is maintained: highest-fitness solutions do not disappear from the population unless a higher-fitness mutant appears. This is marked by relatively steady periods of information gain distributions. Finally, islands of lower-fitness solutions are sometimes also kept by the dynamics, as the existence of multiple relative stable modes of the information gain distribution suggests.

Error threshold for Darwinian neurodynamics

The error threshold in population genetics refers to a critical mutation rate above which populations diffuse out of fitness peaks to neighbouring genotypes⁴⁴. Although these neighbouring genotypes have lower fitnesses, they are more numerous; lower mutation rate allows for (i) narrower or (ii) lower fitness peaks. Beyond the error threshold, locally good types are not maintained, and consequently, cannot be improved upon by beneficial mutations.

In analogy with population genetics, here we present an error threshold for Darwinian neurodynamics: a critical neural noise value above which the Darwinian evolutionary process over neural replicators breaks down. As shown in Fig. 4a and b, there is indeed a sharp transition from high to low fitness, and correspondingly, high to low information gain $I_f$ as neural the noise level increases. This threshold might provide a useful definition to what neural replication means: an approximate copying of firing patterns on top of which a functional Darwinian process, below error threshold, can be built. For such processes, theoretical frameworks and arguments from evolutionary biology and population genetics might be efficiently transferred to understand neural dynamics at a computational and algorithmic level.

In particular, we point out the possibility of an adaptive mutation rate in the context of neural replicators, where adaptation can occur both within a single unfolding of the Darwinian neurodynamic process and over genetic evolutionary timescales. Such adaptive mutation rate could be an outcome of selection pressures coming from the outer (genetic) evolutionary loop, optimizing for a specific computational function of the within-brain Darwinian neurodynamic process. Although throughout this paper we consider this computational function to be combinatorial search, here we propose another possibility: that such an evolutionary mechanism implements particle filtering, performing statistical inference. These two computational functions are not exclusive: brains might adaptively control for the mutation rate and the timescale of the process, achieving a “crossover” from one computational domain to the other on demand. Furthermore, this might serve as an explanation to how high-fidelity neural replication appeared over genetic evolutionary timescales in the very first place: through a selection pressure perfecting statistical inference, implemented through particle filters. Such selection pressure might have acted over the computational function of statistical inference (at a high mutation rate regime) way before it co-opted for an efficient Darwinian optimization mechanism at a low mutation rate (below error threshold) regime.

The effect of selection strength

Selection strength is another fundamental concept in evolutionary dynamics. It refers to the magnitude of differential survival in a population in a given environment. In accordance with evolutionary game theory^45,46, we model selection strength by introducing a temperature parameter $\tau$ controlling for the stochasticity of the selection process. Zero temperature $\tau =0$ implies a deterministic selection, i.e., higher-fitness individuals are always selected over less fit ones, whereas infinite temperature $\tau =\infty$ results in random selection: individuals are selected for replication independently of their fitness. Intermediate temperature values interpolate between these two extremes, with selection being correlated with fitness to a given degree. We introduce tunable selection strength through a temperature parameter to our model as follows. For each reservoir unit, we select its teacher unit from its neighborhood in two steps. First, we rank the focal unit and all its neighbors according to the fitness of their output. In case of a 1D and a 2D topology of reservoir units, this means comparing 3 and 5 outputs, respectively. This results in a rank $r_i$ associated with each reservoir i in the neighborhood. Second, we select the teacher reservoir (which might be the same as the focal reservoir) to be reservoir i with probability $p_i\propto \exp (-r_i/\tau )$. Clearly, setting $\tau =0$ reproduces deterministic selection of the reservoir with maximal fitness. However, it is a question whether increasing $\tau$ would result in a smooth deterioration of the evolutionary process or an abrupt one, as we see when neural noise amplitude is increased. We therefore analysed several actual runs through their phylogenetic tree (Fig. 4f) as well as the amount of information the evolutionary process can accumulate in the function of temperature $\tau$ (Fig. 4e,f). In contrast to the effect of an increase in neural noise, the effect of a gradual decrease of selection strength (i.e., increase in $\tau$) is smooth. Populations can still maintain some information they accumulated over their evolutionary history even when selection is relatively weak, corresponding to high $\tau$ values. Overall this result points at the qualitatively different nature of mutation rate (resulting from neural noise) and selection strength as evolutionary parameters: one resulting in an abrupt, the other in a gradual breakdown of the process.

Discussion

Darwinian neurodynamics links two well-studied and experimentally rooted conceptual systems: neuroscience and evolution. Two principal aims of this endeavor are (i) to ascertain whether or not brains use principles of Darwinian evolution to select actions or representations, and if so, to understand how these principles are reified in neural populations; (ii) to assemble a conceptual bridge over which evolutionary principles can be transferred to neuromorphic learning architectures. These two goals might very well overlap. This paper approaches goals (i) from a reverse engineering perspective, asking how local learning in populations of neurons can sustain evolutionary dynamics at a higher computational level. We discuss the key theoretical issues involved in mapping Darwinian dynamics onto neuronal populations, anchoring our discussion in a simple computational paradigm based on the emergent replication of dynamic firing patterns. We point at the series of decisions one has to make when setting up a DN architecture, from the computational context, in this case combinatorial search problems, all the way down to algorithmic and implementational choices.

Biological evolution has produced spectacular solutions in geological time. Were a similar mechanism operational in the brain on the millisecond timescale, spectacular intellectual achievements would be directly comparable to solutions in evolution. In a remarkable twist an evolutionary mechanism might be even more efficient in the brain than in the wild, for two reasons: first, it can be complemented by other features of the nervous system that are not available for ordinary biological populations; second, even the basic operations such as copying would not be limited by the linearity of informational macromolecules in the neural mechanism. Time will tell whether the latter is an asset or not.

Natural selection can be thought of as an algorithmic process that dynamically reallocates computational resources from globally bad to globally good solutions. It does this in a simple and therefore universally robust way: good solutions are copied in place of bad ones. Many engineering methods are inspired by this idea, notably evolution strategies and estimation of distribution algorithms^47,48: both iteratively propose new solutions based on the goodness of previously proposed ones. These, like all machine learning algorithms, trade off universality for efficiency in particular cases⁴⁹. Evolution as a dynamic reallocation algorithm, however, is only useful if existing solutions are not washed away by the sea of mutations while searching for better solutions. In other words, although mutations generate the necessary variation upon which selection can act, there is an upper limit, given by the Eigen error threshold in simple combinatorial systems. Mutations generate variation through which the space of possible solutions is explored. If replication and selection act at different levels, connected by e.g. a genotype-phenotype or, in our case, a signal-solution map, then generated variation at the replication level is transformed at the selection level. This leads to a potentially highly non-uniform exploration of phenotypes/solutions. These non-uniformities can take two forms. i) Evolvability/facilitated variation: generated variation at the phenotype level is informed by the evolutionary past and proposes solutions that have a higher expected fitness than proposed solutions without this mechanism. ii) Limited accessibility of phenotypic regions: phenotypes cannot be represented with equal ease. This effect combines with fitness to determine evolutionary trajectories. The recurrent DN architecture introduced in this paper illustrates these conceptual issues in a specific context, but they are sufficiently general that they are likely relevant to any non-trivial model of DN.

The recurrent DN model exemplifies another core idea: the use of a population of supervised learning modules to solve reinforcement learning like problems, where information from the environment is only available through a scalar bottleneck, called reward or fitness. Our architecture minimally couples these two dynamics: the role of being a teacher or learner unit is determined via a simple comparison of reward/fitness; variation on which the Darwinian process feeds is introduced ’blindly’ through necessarily imperfect supervised training. There is no need for gradient information or another computational module that is aware of the problem structure; the population of agents - in this case, neural firing patterns—explores both locally and globally via emergent Darwinian dynamics. We envisage that this recurrent architecture can be substantially extended and generalized along the lines of these core ideas to more realistic scenarios. One possibility involves adapting the reservoir structure to increase the range of signals that can successfully be learned⁵⁰; recent work suggests that embedding strongly connected cell assemblies within reservoir networks can enable rapid learning of complex signals within a single presentation of the teaching signal⁵¹. Another important direction involves copying (noisy) higher-dimensional attractors instead of one-dimensional time-varying signals.

One might expect that brains, having evolved under tight energy and space constraints, would employ an evolutionary process, including the maintenance of a population of solutions, only if it provides a qualitative advantage in search problems over non-population based processes. What are these qualitative advantages? Here we provide a subjective account, emphasizing algorithmic features that are unique to evolution, and pointing at their relation to cognitive problem solving.

A key feature of evolution is the unrestrictedness of representations, and the opportunity for open-endedness this unrestrictedness offers^52,53. Since there is no need for estimating how to improve on the current representation, e.g. by computing gradients or solving complicated credit assignment problems, in evolution, “anything goes”. Anything that is, from which assembly instructions can be copied. The more the phenotypes (or candidate solutions, in the language of DN) are decoupled from the replicating substrate that holds the set of instructions for making the phenotype, the more freely the (infinite) phenotype space can be explored. In biology, phenotypes are constructed from genotypes by (probabilistic) developmental programs; in this sense the phenotype is decoupled from the genotype. These programs are sets of instructions that encode incremental and modular construction, they allow for arbitrary re-use and can modify existing modules, paving the royal road to open-endedness. This royal road may well have been stumbled upon by neural systems at some point in their convoluted evolutionary history, replete with complex problem solving. Indeed, probabilistic programs leading to compositional/hierarchical action sequences and the inference of these programs from sensory data offers a successful modeling framework for human cognition^54,55.

Biological evolution goes beyond this, however, by employing an array of clever tricks to boost search in (developmental) program space. Tricks that are not tied to the specificities of life as we know it. These include i) a variety of methods for maintaining a diverse set of solutions^56,57, ii) (re)combining already invented modules at different timescales^58,59,60, and iii) accelerating change by setting up arms races of various kinds^61,62. We believe that exploring these substrate independent algorithmic tricks in the context of Darwinian neurodynamics potentially holds unexpected insights along the way.

Materials and methods