# Concurrence of form and function in developing networks and its role in synaptic pruning

## Abstract

A fundamental question in neuroscience is how the structure and function of neural systems are related. We study this interplay by combining a familiar auto-associative neural network with an evolving mechanism for the birth and death of synapses. A feedback loop then arises, leading to two qualitatively different types of behaviour. In one, the network structure becomes heterogeneous and disassortative, and the system displays good memory performance; furthermore, the structure is optimised for the particular memory patterns stored during the process. In the other, the structure remains homogeneous and incapable of pattern retrieval. These findings provide an inspiring picture of brain structure and dynamics that is compatible with experimental results on early brain development, and may help to explain synaptic pruning. Other evolving networks, such as those of protein interactions, might share the basic ingredients of this feedback loop, and indeed many of their structural features are as predicted by our model.

## Introduction

The fundamental question in neuroscience of how the structural and functional properties of neural networks are related has recently been considered in terms of large-scale connectomes and functional networks obtained with various imaging techniques1. But at the lower level of individual neurons and synapses there is still much to learn. The brain can be regarded as a complex network in which nodes represent neurons and edges stand for synapses. It is then possible to use mathematical models, which capture the essence of neural and synaptic activity, to study how a great many such elements can give rise to collective behaviour with at least some of the characteristics of cognition2,3,4. In this context, the structural properties of the underlying network supporting brain dynamics have been found to affect the behaviour of the system in various ways5,6,7,8. For instance, experiments have confirmed that actual neural networks exhibit degree heterogeneity that roughly accords with scale-free distributions, as well as negative degree–degree correlations (disassortativity), both of which strongly influence the dynamics of the system9,10.

These networks are also not static: new synapses grow and others disappear in response to neural activity11,12,13,14. In many species, early brain development seems to be dominated by a remarkable process known as synaptic pruning: the brain starts out with a relatively high density of synapses, which is gradually reduced as the individual matures. In humans, for example, synaptic density at birth is about twice what it will be at puberty, and certain disorders, such as autism and schizophrenia, have been related to details of this process15,16,17,18,19. It has been suggested that synaptic pruning may represent some kind of optimization, perhaps minimizing energy consumption and/or the genetic information needed to build an efficient and robust network20,21.

Some progress has been made in understanding how new synapses are grown. It has been shown that pre-synaptic activity and glutamate can act as trophic factors guiding the growth of new synaptic spines22, and other mechanisms of cooperation among neurons have also been proposed, such as spike-timing-dependent plasticity23,24. Imaging experiments also reveal that brain architecture is sculpted by spontaneous and sensory-evoked activity, in a process that continues into adulthood as synaptic circuits stabilize in the mature brain22,25. Even though individual synapses are highly dynamic, their overall statistics are preserved over time in the adult brain, indicating that synapse creation and pruning balance each other23.

Models in which networks are gradually formed, for instance by the addition of nodes and edges or by rewiring of the latter, have long been studied in different contexts. Typically, the probabilistic addition or deletion of elements is a function of the existing structure. For example, in the familiar Barabási–Albert model, a node’s probability of receiving a new edge is proportional to its degree26. These rules often give rise to phase transitions (almost invariably of a continuous nature), such that different kinds of network topology can ensue depending on parameters27, and they have been used in the past to reproduce connectivity data on human brain development28. More recently, models in which the evolution of network structure is intrinsically coupled with an activity model that runs on the nodes of the network—the so-called co-evolving network models—have gained attention as a way to approximate the evolution of real systems29,30,31. Previous studies of co-evolving brain networks have examined the temporal evolution of the mean degree32, particular microscopic mechanisms20, the development of certain computational capabilities33, and the effects of specific growth rules34, and have suggested a role for bistability and discontinuous transitions in the brain, for instance in the synaptic plasticity mechanisms involved in learning35,36.

Here we define a co-evolving model of brain network development by combining the Amari-Hopfield neural network2,3 with a plausible model of network evolution28, setting the probabilities of synaptic growth and death to depend on neural activity, as has been empirically observed22. We find that, for certain parameter ranges, the phase transition between memory and randomness becomes discontinuous (i.e., resembling a first-order thermodynamic transition). Depending on initial conditions, the system can evolve either towards heterogeneous networks with good memory performance, or towards homogeneous ones incapable of memory, as a consequence of the feedback loop between structure and function. To the best of our knowledge, this is the first time that this feedback loop, and the ensuing discontinuous transition, have been identified. Moreover, our model generates networks with optimal memory performance for the specific memory patterns they encode, thus allowing for a greater memory capacity than would be possible in the absence of such a mechanism. Our results therefore suggest a more complete explanation of synaptic pruning. Finally, we discuss the possibility that other biological systems—in particular, protein interaction networks—also owe their topologies to a version of the feedback loop between form and function that we identify here.

## Results

### Model construction

We define a co-evolving model of synaptic pruning that couples a traditional associative memory model, the Hopfield model37, with a preferential attachment model for network evolution28. Memory is measured via the overlap, $$m^\mu (t)$$, of the network state with each of the P memorized patterns of activity, $$\{ \xi ^\mu \}$$, which are stored through an appropriate definition of the synaptic weights $$w_{ij}$$. In the stationary regime, the system undergoes a continuous transition from a phase of memory recovery ($$m^\mu \to 1$$) to one dominated by noise ($$m^\mu \approx 0$$) as a function of the noise level or temperature T.

Network structure dynamics is defined by the probability each node i has to gain or to lose an edge:

$$P_i^g = u\left( \kappa \right)\pi \left( {I_i} \right),\quad P_i^l = d\left( \kappa \right)\eta \left( {I_i} \right),$$
(1)

respectively, where κ is the mean degree of the network and $$I_i$$ is a physiological variable that measures the incoming current at node i. A second node j is then randomly chosen to be connected to (or disconnected from) node i, so that there are two processes that can lead to an increase (or decrease) in node degree; we define the effective local probabilities $$\tilde \pi$$ and $$\tilde \eta$$ to account for both of them (see the Methods). We choose these to be power laws, $$\tilde \pi \propto I_i^\alpha$$ and $$\tilde \eta \propto I_i^\gamma$$, to allow for a smooth transition from a sub-linear to a super-linear dependence with a single parameter. Network structure is characterized by the homogeneity parameter g(t), which equals 1 if $$p(k) = \delta _{k_0,k}$$ and tends to 0 for highly heterogeneous (bimodal) networks. Depending on α and γ, networks become homogeneous (every node having a similar degree, g→1) or heterogeneous (with the appearance of hubs, g→0). The probabilities u and d are chosen to be consistent with empirical data on synaptic pruning, as we discuss in subsection Synaptic pruning. A full description of the model is given in the Methods.
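As an illustration, one structural update of Eq. (1) can be sketched as follows. This is a minimal sketch under our own assumptions, not the authors' code: the function name `evolve_step`, the normalization of the local probabilities, and the idea of passing `u` and `d` in as callables are ours.

```python
import numpy as np

def evolve_step(adj, I, alpha, gamma, u, d, rng):
    """One sketch step of the structural dynamics of Eq. (1): a node gains
    an edge with probability u(kappa)*pi(I_i) and loses one with probability
    d(kappa)*eta(I_i), where pi ~ I^alpha and eta ~ I^gamma (normalized),
    and the second endpoint is chosen at random."""
    N = adj.shape[0]
    kappa = adj.sum() / N                    # mean degree
    pi = I ** alpha
    eta = I ** gamma
    # Edge gain: first node chosen preferentially by activity, partner at random.
    if rng.random() < u(kappa):
        i = rng.choice(N, p=pi / pi.sum())
        j = rng.integers(N)
        if i != j:
            adj[i, j] = adj[j, i] = 1
    # Edge loss: node chosen according to eta, one of its edges removed at random.
    if rng.random() < d(kappa):
        i = rng.choice(N, p=eta / eta.sum())
        neigh = np.flatnonzero(adj[i])
        if neigh.size > 0:
            j = rng.choice(neigh)
            adj[i, j] = adj[j, i] = 0
    return adj
```

Any decreasing u(κ) and increasing d(κ) can be supplied for illustration; the specific forms used in the paper are those of Eq. (2).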

### Topological limit

Consider first the topological limit of the model as defined by Eq. (4), in which network structure is decoupled from neural activity. Given the biologically inspired probabilities of Eqs. (2) and (3), three qualitatively different behaviours, leading in practice to different phases, are possible for $$k_i \gg 1$$ (Fig. 1a). If γ > α, then $$\tilde \eta (k_i) > \tilde \pi (k_i)$$, and high-degree nodes are more likely to lose than to gain edges, so that p(k) = p(k,t→∞) is homogeneous, g(t→∞)→1, and the probability of high-degree nodes vanishes rapidly beyond the maximum. On the other hand, if γ < α, then $$\tilde \eta (k_i) < \tilde \pi (k_i)$$, and high-degree nodes are more likely to continue gaining than losing edges. Since the stationary number of edges Nκ is fixed, this leads to a bimodal p(k), with g(t→∞)→0. Finally, in the case γ = α, $$\tilde \eta (k_i) = \tilde \pi (k_i)$$, and highly connected nodes are as likely to gain as to lose edges. Excluding low degrees, p(k) then decays as a power law, $$p(k) \propto k^{ - \mu }$$ with exponent μ ≈ 2.5, in accordance with long-range connections observed in the human brain8 and with measurements in protein interaction networks38. We find that this condition is necessary to obtain critical behaviour, contrary to the assumption of earlier preliminary studies28, as shown in Fig. 1b.

### Synaptic pruning

The time evolution of κ(t) is controlled in our model by the global probabilities u(κ) and d(κ), which can be chosen to model experimental data on synaptic density during brain development28. In particular, here we analyze and fit two experimental data sets (Fig. 2): the first (a) corresponds to postmortem measurements on layers 1 and 2 of the human auditory cortex, obtained by directly counting synapses in the tissue32, whereas the second (b) corresponds to an electron microscopy imaging study of the mouse somatosensory cortex21. Even though they correspond to different animals and were obtained with different techniques, both data sets show the same overall behaviour: a rapid initial growth of synapses, followed by a maximum when pruning begins and connectivity starts decreasing, until a plateau is reached. Synaptic density decays roughly exponentially during pruning, and can be fitted by u(κ) and d(κ) as given by Eq. (2), which describe a situation in which synapses are less likely to grow, and more likely to atrophy, when connectivity is high, and vice versa, a situation that could easily arise in the presence of a finite quantity of nutrients. These lead to $$\kappa (t) = (\kappa _0 - \kappa _\infty ){\mathrm{e}}^{ - t/\tau _p} + \kappa _\infty$$, where $$\tau _p = \kappa _\infty /(2n)$$, so that κ(t) decays exponentially from κ0 to κ∞, assuming κ0 > κ∞ as in the case of interest.

On the other hand, the initial overgrowth of synapses can be related to the transient existence of growth factors, and it can be accounted for in our model by including a nonlinear, time-dependent term, $$c(t) \propto {\mathrm{e}}^{ - t/\tau _g}$$, in the growth probability u(κ). The solution is now $$\kappa (t) = \kappa _\infty \left[ {1 + b\,{\mathrm{e}}^{ - t/\tau _g} - a\,{\mathrm{e}}^{ - t/\tau _p}} \right]$$, where b is fixed by the amplitude of c(t) and the ratio of the two timescales, a = 1 − κ0/κ∞ + b (which ensures κ(0) = κ0), and τp is as before. With the inclusion of this term, the evolution of κ(t) in both data sets can be fully reproduced.
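To make the shape of this solution concrete, the closed form can be evaluated directly. The parameter values below are illustrative, not the fitted values of Fig. 2; b is treated here as a free coefficient, with a then fixed by the initial condition as in the text.

```python
import numpy as np

# Illustrative parameters (not the fitted values from the data sets)
kappa_inf = 10.0   # stationary mean degree
kappa_0 = 2.0      # initial mean degree
tau_g = 2.0        # growth-factor timescale (fast)
tau_p = 10.0       # pruning timescale (slow)
b = -2.0           # growth-factor coefficient, free in this sketch
a = 1.0 - kappa_0 / kappa_inf + b   # ensures kappa(0) = kappa_0

def kappa(t):
    """Mean degree with transient overgrowth followed by pruning."""
    return kappa_inf * (1.0 + b * np.exp(-t / tau_g)
                        - a * np.exp(-t / tau_p))

t = np.linspace(0.0, 100.0, 2001)
k = kappa(t)
# kappa rises from kappa_0, overshoots kappa_inf (synaptic overgrowth),
# then relaxes exponentially back to kappa_inf (pruning).
```

With these values the curve peaks above κ∞ before relaxing to it, reproducing the overshoot-then-pruning shape of the experimental data.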

Therefore, our model can approximate the evolution of the mean density of synapses in the mammalian brain during infancy and, with the inclusion of an initial growth factor, it also reproduces the fast growth and early maximum of the connectivity. Notice that this framework is also in line with previous studies highlighting the computational benefits of a pruning process whose rate decreases with time, as in our model, which can optimize both efficiency and robustness when growth takes place locally throughout the network21. In our model the decreasing rate is naturally obtained via a simple, physiologically inspired master equation for p(k,t). Even when an extra growth factor is included, once pruning begins its rate decreases over time, in accordance with ref. 21. Given that our goal here is to understand the effect of neuronal physiological activity on network development, in the framework of co-evolving growing models, we focus on the simplest version of the model, with linear u(κ) and d(κ), to illustrate the effect of coupling structure and physiology. Nonlinear global probabilities could also be considered, but would add an extra level of complexity which we do not explore here.

### Phase diagram

In the coupled model, where pruning depends on $$I_i$$ (see the Methods section), computer simulations reveal a rich emergent phenomenology depending on the stochasticity (T) and the emerging degree heterogeneity (α). The effect of the number of memorized patterns P is analyzed in subsection Capacity analysis, and we discuss the effect of γ in Supplementary Fig. 1. Other parameters, such as κ∞ or a0, were found to have only a quantitative effect on the resulting phase diagram, and they are set as in a previous study28. Preliminary simulations suggested a dependence on the heterogeneity of the initial condition (IC), so we consider two types of IC, namely homogeneous, $$p(k,t = 0) = \delta (k - \kappa _0)$$, and heterogeneous, $$p(k,t = 0) \propto k^{ - 2.5}$$, networks, with fixed κ0.

Our coupled dynamics lead to a rich phenomenology, including discontinuous transitions and multistability. Three different phases can be identified by monitoring the stationary values of the order parameters, $$\bar g(\alpha ,T),\bar m(\alpha ,T)$$, and the Pearson correlation coefficient, $$\bar r(\alpha ,T)$$, as illustrated in Fig. 3 (data for $$\bar r(\alpha ,T)$$ are not shown here, since they provide similar information to $$\bar g(\alpha ,T)$$). Analysis of these and similar curves for different parameter values leads to the phase diagram in Fig. 4. First, there is a homogeneous memory phase for low α and T, characterized by high $$\bar m$$, high $$\bar g$$ and low (negative, almost zero) $$\bar r$$ (Fig. 3a, c for α = 0.5 and low T, and Fig. 3b, d for T < 1.2 and low α). The system is then able to reach and maintain memory, while the topological processes lead to a homogeneous network configuration. Due to the existence of memory, there is a strong correlation between the physiological state of the network, as measured by the currents $$I_i$$, and its topology, as reflected by the degrees $$k_i$$ (Fig. 1c).

A continuous increase in α leads through a topological phase transition from homogeneous final networks (with roughly Poissonian degree distributions) to heterogeneous ones (with bimodal degree distributions). At the critical point $$\alpha = \alpha _c^t(T)$$ the emergent networks are scale-free (i.e., with power-law-like degree distributions). The phase transition is revealed in $$\bar g(\alpha )$$ (Fig. 3b for T = 0.9, 1.0 and 1.1) as a fast (and continuous) decay to zero, and it also appears in $$\bar m(\alpha )$$ (Fig. 3d) where, after an initial growth with α, $$\bar m$$ decreases at the transition and finally approaches a constant value for $$\alpha > \alpha _c^t(T)$$. This is a consequence of the strong coupling between structure and memory, and it shows that scale-free networks optimize memory recovery for a given set of control parameters. We call this the heterogeneous memory phase, characterized by high $$\bar m$$, very low (almost zero) $$\bar g$$ and high negative $$\bar r$$, indicating a memory state with a heterogeneous (bimodal) disassortative structure. Interestingly, this phase extends up to high noise levels as a consequence of network heterogeneity, which enhances memory performance9,10. Moreover, memory recovery in turn favours network heterogenization, in a feedback manner, due to the microscopic dynamics, enhancing the stability of the state. This is because $$I_i$$ becomes proportional to $$k_i$$ in the memory regime, whereas this correlation is reduced in disordered neural states (see Fig. 1c).

As T is further increased, the dynamics is finally governed by noise, and the stationary network is then homogeneous, resulting in the homogeneous noisy phase, characterized by low (almost zero) $$\bar m$$, high $$\bar g$$ and low (almost zero) $$\bar r$$ (Fig. 3a, c for α = 0.5 and 1.0 and high T, and Fig. 3b, d for T > 0.9 and low α).

A particularly interesting aspect of this phenomenology is that the nature of the phase transition with T depends on α. For low α ($$\alpha < \alpha _c^t(T)$$) there is a continuous (second order) transition with increasing T from a homogeneous memory phase to a homogeneous noisy one through $$\alpha = \alpha _c^m(T)$$ (Fig. 3a, c for α = 0.5). On the other hand, at higher α ($$\alpha > \alpha _c^t(T)$$) the transition from the heterogeneous memory phase to the homogeneous noisy one is discontinuous (first order), and includes a bistability region (striped area in Fig. 4) in which simulations starting from heterogeneous IC reach the heterogeneous memory state, whereas those starting from homogeneous ones fall into the homogeneous noisy one (Fig. 3b, d for T = 1.2, and Fig. 3a, c for α = 1.5).

The existence of multistability illustrates how memory promotes itself in a heterogeneous network, which is a direct consequence of the coupled dynamics in our model; this effect is lost in the topological limit (see Supplementary Fig. 1). This is because heterogeneous networks present higher memory recovery than homogeneous ones at high noise levels, particularly for T > 1 (refs. 9,10). At the same time, given that in our model the growth and death of links depend on the activity of the nodes, the evolution of the structure of the network is driven by its activity state. In this way, an ordered state of the activity of the system—that is, a memory state—is needed in order to allow for the formation of heterogeneous (ordered) structures. Hence, if the network is in a noisy state, edge birth and death are random processes, and thus lead to a homogeneous network configuration, whereas in a memory state there is a direct correlation between $$I_i$$ and $$k_i$$ (Fig. 1b) that allows for the emergence of structure—as given by the local probabilities. Correspondingly, the physiological state directly depends on the network structure through the currents $$I_i$$, thus closing a memory-heterogeneity feedback loop. As a consequence, homogeneous and heterogeneous IC evolve differently, which translates into a multistability region for high α and T, and the presence of memory for T > 1.

Therefore, a main observation here is that the model shows an intriguing relationship between memory and topology, which induces complex transitions that might be relevant for understanding actual systems. In particular, the inclusion of a topological process allows for memory recovery when T > 1, whereas the presence of thermal noise shifts the topological transition, which in the topological limit occurs at α = 1, to α > 1. These findings may also have implications for network design—for instance, to help optimize memory recovery in noisy environments. It is worth noting that one does not need to know the specific patterns that induce the synaptic weights in order to have a memory state, so the definition of memories is not essential—only an ordered state of neural activity. Therefore, our results could be extended to other models of microscopic activity, not necessarily based on Hebbian learning, or to other systems such as protein interaction networks (see subsection Protein interaction networks), as long as they present a transition between an ordered and a disordered activity state.

### Capacity analysis

We have so far considered P = 1 for the sake of simplicity. However, for neural networks—whether natural or artificial—to be useful, they must generally store many different memory patterns. Something analogous seems to be true for other complex systems, such as gene regulatory networks, which switch between different configurations. We therefore go on to study the capacity of the network—that is, how its memory capability depends on the number of memorized patterns P. For random uncorrelated memory patterns (a0 = 0.5) the performance (as measured by the overlap $$\bar m$$ of the recovered pattern) drops quickly as P grows39 (see Fig. 5a). However, there is experimental evidence that the configurations of neural activity related to particular memories in the animal brain involve many more silent neurons, $$\xi _i^\mu = 0$$, than active ones, $$\xi _i^\mu = 1$$ (ref. 40); notice that in this case there will be a positive correlation between different patterns. We therefore consider this kind of activity pattern in Fig. 5b, and find that good memory performance can be maintained at higher P, depending on α. These curves are combined with analogous ones for $$\bar g(P,\alpha )$$ to define the phase diagram of Fig. 5c. This shows that for α < 1 memory is preserved only for small P: the structural noise (or quenched disorder) introduced by the patterns has a similar effect to that of the thermal noise in Fig. 4, and a continuous (second-order) transition from homogeneous memory to homogeneous noisy networks is observed. On the other hand, for α > 1 the competition between different patterns boosts network heterogeneity, and the capacity of the network increases greatly. There is a transition from a pure memory state to a spin-glass-like state (SG), in which several patterns are partially retrieved at the same time.
However, due to the presence of hubs and the correlation among patterns, this corresponds to high overlap ($$\overline {m^\mu} \approx 1$$ for the recovered patterns), instead of a moderate value as one would expect in a typical spin-glass phase37.

Finally, Fig. 5d shows the same diagram for the topological limit of the model, where the memory phases are much narrower and SG states correspond to moderate overlap values ($$\overline {m^\mu } \approx 0.6$$), indicating a pure SG state. Therefore, the capacity analysis reveals another significant difference between the topological and coupled versions of the model, since the coupled one allows for good memory retrieval even at much larger numbers of patterns (P > 50) for α ≈ 1—that is, in the limiting case in which the emerging networks are approximately scale-free. It follows that the emerging network topologies in the coupled version do not owe their good memory performance solely to their power-law degree distributions. Rather, the network is tuned for good memory performance on the specific patterns encoded in the edge weights.

### Protein interaction networks

Many complex systems can be described as networks which evolve under the influence of node activity, and it is likely that the structure-function feedback loop described here plays a role in these settings too. In particular, it could be compatible with experimental and theoretical studies of protein interaction networks, which gather different types of metabolic interactions amongst proteins. These interactions can be either inhibitory or excitatory and can have different strengths, like the edges in our model. Moreover, network structure changes on an evolutionary timescale, during the evolution of the species, and these changes combine a random origin, typically due to mutations, with a “force” driven by natural selection, which is likely to be activity dependent38. The similarities between this picture and our model suggest a parallelism concerning the resulting network topologies as well.

Measurements on protein interaction networks show power-law behaviour of some important topological magnitudes: the degree distribution, $$p(k) \propto k^{ - \mu }$$ with μ ≈ 2.5 (ref. 38); the clustering coefficient, $$C(k) \propto k^{ - \theta }$$ with θ ≈ 1 (metabolic networks) or 2 (protein interaction networks)38,41; and the neighbours’ mean degree, $$k_{nn}(k) \propto k^{ - v}$$ with v ≈ 0.6 (ref. 42). We find that these magnitudes also follow power laws for networks in our model near the transition $$\alpha _c^t(T)$$ (Fig. 6), with exponents μM = 2.55 ± 0.01, θM = 0.98 ± 0.01 and vM = 0.95 ± 0.02, so there is good agreement with μ and θ. Moreover, $$k_{nn}(k)$$ decays in our model as a power law for almost every value of the parameters, which is related to the intrinsic topological dynamics of the model, which creates an asymmetry between the nodes that gain and lose edges42. We find vM ≈ 0.5 for homogeneous networks and vM ≈ 1.5 for bimodal ones, whereas scale-free networks lie in between, with vM ≈ 1.0. It is likely that this exponent could be better reproduced with further adjustments to the local probabilities, so as to reflect degree–degree correlations among different proteins.
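These three observables can be estimated on any graph. The sketch below is our own illustration, not the analysis pipeline used here: it measures them with networkx on a Barabási–Albert surrogate graph, whose exponents will generally differ from those of the model's emergent networks.

```python
from collections import defaultdict

import networkx as nx
import numpy as np

# Surrogate scale-free graph standing in for an emergent network
G = nx.barabasi_albert_graph(2000, 3, seed=1)

deg = dict(G.degree())                 # k_i
clus = nx.clustering(G)                # local clustering C_i
knn = nx.average_neighbor_degree(G)    # mean degree of node i's neighbours

# Average each quantity over all nodes of a given degree to obtain
# the degree-dependent curves p(k), C(k) and k_nn(k).
nodes_by_k = defaultdict(list)
for i, k in deg.items():
    nodes_by_k[k].append(i)

N = G.number_of_nodes()
p_of_k = {k: len(v) / N for k, v in nodes_by_k.items()}
C_of_k = {k: np.mean([clus[i] for i in v]) for k, v in nodes_by_k.items()}
knn_of_k = {k: np.mean([knn[i] for i in v]) for k, v in nodes_by_k.items()}
# Power-law exponents can then be estimated by least-squares fits of
# log p(k), log C(k) and log k_nn(k) against log k.
```

The same measurements applied to the model's networks near $$\alpha _c^t(T)$$ would yield the exponents quoted above.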

In conclusion, there are definite indications that some of the main topological properties of protein interaction networks could be qualitatively reproduced with simple adjustments or extensions of our model. This suggests a general mechanism underlying the dynamics of different biological systems, which is likely to extend as well to other contexts. Moreover, several studies have recently shown that there are specific patterns in protein interaction networks that can be determined experimentally50,51, and which could allow us to identify important biological substructures in the network52,53. This information could be used, together with the model proposed here, to determine the relevance of such patterns and of the complex interplay between the underlying structure of the network and its functional role, as in the present study.

## Discussion

It is well known that the brain stores information in synaptic conductances, that these mediate neural activity, and that this activity in turn affects the birth and death of synapses. To explore what this feedback might entail, we have coupled an auto-associative, or attractor, neural network model with a model for the evolution of the underlying network topology which has previously been used to describe synaptic pruning. Neural network models have long provided a means of relating cognitive processes, such as memory, to biophysical dynamics at the cellular level. Our coupled model includes the further ingredient of a changing underlying network structure, so that it provides a more general and complete account of synaptic pruning. The intention behind our theoretical framework is to identify a minimal set of ingredients which can give rise to the observed phenomena. Specific details of a given system, such as more realistic neuron and synapse dynamics, could be added to the model.

Taken separately, each of the two models involved exhibits a continuous phase transition: between a phase of memory and one of noise in the neural network, and between homogeneous and heterogeneous network topologies in the pruning model. Our coupled model continues to display both of these transitions in certain parameter regimes, but a new discontinuous transition emerges, giving rise to a region of phase coexistence (with the associated hysteresis). In other words, situations with the same parameters but slightly different initial conditions can lead to markedly different outcomes. In this case, whether the attractor neural network is initially capable of memory retrieval influences the emerging network structure, which feeds back into memory performance. There is therefore a crucial feedback loop between structure and function, which determines the capabilities of the eventual system that the process yields. Our picture thus addresses how neural activity can impact early brain development, and relates specific dynamic processes in the brain to well-defined mathematical properties, such as bistability, critical-like regions, and the emergence of a feedback loop.

The models we have coupled in this work are the simplest ones able to reproduce the behaviour of interest—namely, associative memory and network topologies with realistic features. However, there is no reason to believe that this feedback would disappear in more complex models, including those with asymmetric networks or more realistic choices of the synaptic weights. Hebbian synapses have been considered here as a standard way to define memory attractors, and therefore as a useful tool to understand the effect of heterogeneity, and its coupling with memory, on network dynamics. More realistic scenarios could include time-dependent synapses, incorporating for instance learning43 or fast noise44, which would undoubtedly add complexity to the model. Similarly, particular definitions of the memories could boost the capacity of the network, and even create topologically induced oscillations.

On the other hand, neural systems may not be the only ones to display the properties we found to be sufficient for the existence of this feedback loop between structure and function. Networks of proteins, metabolites or genes also adopt specific configurations (often associated to attractors of some dynamics), and the existence of interactions between nodes might depend on their activity. One may expect that the interplay between form and function we have described in this work could play a role in many natural, complex systems. We have shown some evidence that protein networks seem to have topological features which emerge naturally from the coupled models we have considered here.

Finally, an unresolved question in neuroscience is why brain development proceeds via severe synaptic pruning—that is, with an initial overgrowth of synapses followed by the subsequent atrophy of approximately half of them throughout infancy. Fewer synapses require less metabolic energy, but why not begin with the optimal synaptic density? Navlakha et al. have shown that network properties such as efficiency and robustness can be optimised by a pruning rule which favours short paths21. However, in order for synapses to “know” whether they belong to short paths, some kind of back-propagating signal must be postulated. Our coupled model provides a simple demonstration of how network structure can be optimised by pruning, as in Navlakha's model, with a rule that depends only on local information at each synapse—namely, the intensity of electrical current. Moreover, this rule is consistent with empirical results on synaptic growth and death11,12,13,14. In this view, a neural network would begin life as a more or less random structure with a synaptic density high enough for memory performance. This is in keeping with the description given by neuroscientists such as Kolb and Gibb, who say of the initial $$10^{14}$$ synapses in the human brain: “This enormous number could not possibly be determined by a genetic program, but rather only the general outlines of neural connections in the brain will be genetically predetermined”54. Throughout infancy, certain memory patterns are stored, and pruning gradually eliminates synapses experiencing less electrical activity. Eventually, a network architecture emerges which has a lower mean synaptic density but is still capable, by virtue of a more optimal structure, of retrieving memories.
Moreover, the network structure will be optimised for the specific patterns it stored and, when the emerging networks are scale-free, it becomes possible to store orders of magnitude more memory patterns via this mechanism than with the networks generated by the uncoupled (topological) version of the model. This seems consistent with the fact that young children can acquire memory patterns (such as languages or artistic skills) which remain with them indefinitely, yet as adults they struggle to learn new ones54,55.

## Methods

### The neural network model

Consider an undirected network with N nodes, and edges which can change in discrete time. At time t, the adjacency matrix is $$\{e_{ij}(t)\}$$, for i,j = 1,…,N, with elements 1 or 0 according to whether or not there exists an edge between the pair of nodes (i,j). The degree of node i at time t is $$k_i(t) = \mathop {\sum}\nolimits_j^N e_{ij}(t)$$, and the mean degree of the network is $$\kappa (t) = N^{ - 1}\mathop {\sum}\nolimits_i^N k_i(t)$$.

Each node represents a neuron, and each edge a synapse. We shall follow the Amari-Hopfield model, according to which each neuron has two possible states at each time t, either firing or silent, given by $$s_i(t) \in \{0,1\}$$ (ref. 37). Each edge (i,j) is characterized by its synaptic weight $$w_{ij}$$, and the local field at neuron i is $$h_i(t) = \mathop {\sum}\nolimits_{j = 1}^N w_{ij}e_{ij}(t)s_j(t).$$ The states of all neurons are updated in parallel at every time step according to the transition probability $$P\left\{ {s_i(t + 1) = 1} \right\} = \frac{1}{2}\left[ {1 + {\rm tanh}\left( {\beta \left[ {h_i(t) -\theta _i(t)} \right]} \right)} \right],$$ where $$\theta _i(t) = \frac{1}{2}\mathop {\sum}\nolimits_{j = 1}^N w_{ij}e_{ij}(t)$$ is the neuron’s threshold for firing, and β = T−1 is a noise parameter controlling stochasticity (analogous to the inverse temperature in statistical physics)45. The synaptic weights can be used to encode information in the form of a set of $$P$$ patterns $$\left\{ {\xi _i^\mu ;\;\mu = 1, \ldots ,P} \right\}$$, with mean $$a_0 = \left\langle {\xi ^\mu } \right\rangle$$, specific configurations of neural states which can be regarded as memories, via the Hebbian learning rule, $$w_{ij} = \left[ {\kappa _\infty a_0(1 - a_0)} \right]^{ - 1}\mathop {\sum}\nolimits_{\mu = 1}^P (\xi _i^\mu - a_0)(\xi _j^\mu - a_0),$$ where κ∞ = κ(t →∞). The order parameter of the model is the overlap of the state of the neurons with each of the memorized patterns of activity, $$m^\mu = \left( {Na_0(1 - a_0)} \right)^{ - 1}\mathop {\sum}\nolimits_{i = 1}^N (\xi _i^\mu - a_0)s_i(t)$$. The canonical Amari-Hopfield model, which serves here as a reference, is defined on a fully connected network ($$e_{ij} = 1$$ for all $$i \neq j$$), and it exhibits a continuous phase transition at the critical value T = 1 (ref. 37).
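The neural dynamics just described can be sketched in a few lines of Python. This is an illustrative implementation under our own conventions (the function names, the use of NumPy, and passing the full weight and adjacency matrices are our choices, not the authors' code):

```python
import numpy as np

def hopfield_step(s, w, e, beta, rng):
    """One parallel update of all neuron states s in {0,1}.

    h_i = sum_j w_ij e_ij s_j is the local field, theta_i = (1/2) sum_j w_ij e_ij
    the firing threshold; neuron i fires with probability
    (1/2)[1 + tanh(beta (h_i - theta_i))].
    """
    we = w * e                                  # effective (masked) couplings
    h = we @ s                                  # local fields
    theta = 0.5 * we.sum(axis=1)                # firing thresholds
    p_fire = 0.5 * (1.0 + np.tanh(beta * (h - theta)))
    return (rng.random(s.size) < p_fire).astype(int)

def hebb_weights(xi, a0, kappa_inf):
    """Hebbian weights for a P x N array of {0,1} patterns xi."""
    d = xi - a0
    return (d.T @ d) / (kappa_inf * a0 * (1.0 - a0))

def overlap(s, xi_mu, a0):
    """Overlap m^mu of the current state s with pattern xi_mu."""
    N = s.size
    return ((xi_mu - a0) * s).sum() / (N * a0 * (1.0 - a0))
```

At low noise (large beta) and starting from a stored pattern, the overlap should remain close to 1, reproducing the memory phase of the model.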

Notice that, even though synapses are generally asymmetric, we have defined an undirected, symmetric network, in the spirit of previous studies on synaptic pruning21. Earlier studies suggest that including asymmetry could induce chaos and affect learning, for instance by causing the system to oscillate among different states46. Here we have decided to simplify the picture and consider symmetric networks, and we expect that, given a reasonable definition of asymmetry, the main results of our work would hold.

### The pruning model

Edge dynamics is modelled as follows. At each time t, each node has a probability $$P_i^{\,g}(t)$$ of being assigned a new edge to another, randomly chosen node. Likewise, each node has a probability $$P_i^l(t)$$ of losing one of its edges. These probabilities are given by $$P_i^{\,g} = u\left( \kappa \right)\pi \left( {I_i} \right)$$ and $$P_i^l = d\left( \kappa \right)\eta \left( {I_i} \right),$$ (1) where we have dropped the time dependence of all variables for clarity, and $$I_i = |h_i - \theta_i|$$ is a physiological variable that characterizes the local dependence of the probabilities. The first factors on the right-hand side of each equation represent a global dependence, accounting for the fact that such processes rely in some way on the diffusion of different molecules through large areas of tissue; we take the mean degree κ at each time as a proxy. In order to describe synaptic pruning, we choose these probabilities to be consistent with empirical data describing synaptic density in mammals during infancy. The simplest choice is28,32

$$u\left( \kappa \right) = \frac{n}{N}\left( {1 - \frac{\kappa }{{2\kappa _\infty }}} \right),\quad d\left( \kappa \right) = \frac{n}{N}\frac{\kappa }{{2\kappa _\infty }},$$
(2)

where n is the number of edges to be added or removed at each step, which sets the speed of the process, and $$\kappa_\infty = \kappa(t \to \infty)$$ is the stationary mean degree. It should also be noticed that experimental studies32 have revealed a fast initial overgrowth of synapses associated with the transient existence of different growth factors. This and other particular mechanisms could be accounted for by adding extra factors in Eq. (2). For example, initial overgrowth has been reproduced by adding the term $$a\,{\rm exp}(-t/\tau_g)$$ to the growth probability28.
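Eq. (2) is simple enough to write down directly. The sketch below is ours (the function name is an assumption), and it makes the fixed point explicit: the rates balance exactly at the stationary mean degree κ∞.

```python
def growth_death_rates(kappa, kappa_inf, n, N):
    """Global growth and loss rates u(kappa) and d(kappa) of Eq. (2).

    u decreases and d increases linearly with the mean degree kappa, so
    the network relaxes towards kappa_inf, where u = d = n/(2N).
    """
    u = (n / N) * (1.0 - kappa / (2.0 * kappa_inf))
    d = (n / N) * (kappa / (2.0 * kappa_inf))
    return u, d
```

Below κ∞ growth dominates (u > d), reproducing the initial overshoot-free approach to the stationary density; adding a transient term to u would model the early overgrowth mentioned above.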

The second factors on the right-hand side of Eq. (1) were previously considered28 to be functions of $$k_i$$. This was meant as a proxy for the empirical observations that synaptic growth and death are determined by neural activity11,12,13,14. We now make this dependence explicit, and couple the evolving network with the neural dynamics by considering a dependence on the local current at each neuron, $$I_i = |h_i - \theta_i|$$, as stated before.

### Monte Carlo simulations

The initial conditions for the neural states are random. For the network topology we can draw an initial degree sequence from some distribution p(k,t = 0), and then place edges between nodes i and j with a probability proportional to $$k_i(0)k_j(0)$$, as in the configuration model. Time evolution is then accomplished in practice via computer simulations as follows. First, the numbers of links to be created and destroyed are chosen according to two Poisson distributions with means Nu(κ) and Nd(κ), respectively. Then, as many times as this draw requires, we choose a node i with probability $$\pi(I_i)$$ to be assigned a new edge to another, randomly chosen node; and similarly we choose a node j according to $$\eta(I_j)$$ to lose an edge to one of its neighbours, randomly chosen. This procedure uses the BKL algorithm to ensure proper evolution towards stationarity45. In this way, each node can gain (or lose) an edge via two paths: either through the process with probability $$\pi(I_i)$$ for a gain (or $$\eta(I_i)$$ for a loss), or when it is randomly connected to (or disconnected from) an already chosen node. Therefore, the effective values of the second factors in Eq. (1) are $$\tilde \pi = 1/2\left( {\pi \left( {I_i} \right) + 1/N} \right)$$ and $$\tilde \eta = 1/2\left( {\eta \left( {I_i} \right) + k_i/(\kappa N)} \right)$$, where the factor 1/2 ensures normalization.

For the sake of simplicity, we shall take $$\tilde \pi$$ and $$\tilde \eta$$ to be power laws of the local current28, which allows one to move smoothly from a sub-linear to a super-linear dependence with a single parameter: $$\tilde \pi = I_i^\alpha /(\left\langle {I^\alpha } \right\rangle N)$$ and $$\tilde \eta = I_i^\gamma /(\left\langle {I^\gamma } \right\rangle N)$$. This leads to:

$$\pi \left( {I_i} \right) = 2\frac{{I_i^\alpha }}{{\left\langle {I^\alpha } \right\rangle N}} - \frac{1}{N},\quad \eta \left( {I_i,k_i} \right) = 2\frac{{I_i^\gamma }}{{\left\langle {I}^{\gamma} \right\rangle N}} - \frac{{k_i}}{{\kappa N}},$$
(3)

where normalization of π, η, $$\tilde \pi$$ and $$\tilde \eta$$ has been imposed. Notice that by construction the death probability depends not only on $$I_i$$, but also on the degree $$k_i$$. Given that, in the memory regime of the Hopfield model, $$I_i \propto k_i$$, as shown in the Results section, here we consider the approximation $$\eta \approx \eta(I_i)$$. Notice also that with this definition the local probabilities could become negative, so we define $$\pi(I_i) \to {\rm max}(\pi(I_i),0)$$ and $$\eta(I_i) \to {\rm max}(\eta(I_i),0)$$. These definitions are crucial, as they characterize the coupling between neural activity and structure. However, the particular functions are an arbitrary choice, and others could be considered. In our scenario, the parameters α and γ characterize the dependence of the local probabilities on the local currents and account for the different proteins and factors that control synaptic growth. They could be obtained experimentally, although to the best of our knowledge this has not yet been done.
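Eq. (3), including the clipping at zero, can be sketched directly. The function name and the vectorized form are our own; the inputs are the vector of local currents, the degrees, and the mean degree:

```python
import numpy as np

def local_probabilities(I, k, kappa, alpha, gamma):
    """Per-node gain/loss factors pi(I_i) and eta(I_i, k_i) of Eq. (3),
    clipped at zero so they remain valid probabilities."""
    N = I.size
    pi = 2.0 * I**alpha / (np.mean(I**alpha) * N) - 1.0 / N
    eta = 2.0 * I**gamma / (np.mean(I**gamma) * N) - k / (kappa * N)
    return np.maximum(pi, 0.0), np.maximum(eta, 0.0)
```

For a homogeneous state (all currents equal and all degrees equal to κ), both factors reduce to the uniform value 1/N, as the normalization requires.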

The timescale for structural changes is set by the parameter n in Eq. (2), whereas the time unit for activity changes, $$h_s$$, is the number of Monte Carlo steps (MCS) for which the states of all neurons are updated according to the Hopfield dynamics between consecutive structural network updates. Our studies show little dependence on these parameters in the cases of interest, so we only report results here for $$h_s$$ = 10 MCS and n = 10.

### The macroscopic state

The macroscopic state is characterized by the overlap, as defined before, and by the mean neural activity, $$M(t) = N^{ - 1}\mathop {\sum}\nolimits_{i = 1}^N s_i(t)$$. Results in the main section are for P = 1, so we simplify the notation and use m = m1. A discussion of the effect of learning more patterns is included in the subsection Capacity analysis. Also of interest is the degree distribution at each time, p(k,t), whose homogeneity may be measured via g(t) = exp(−σ2(t)/κ2(t)), where σ2(t) is the variance of the distribution and κ(t) its mean, as defined before. g(t) = 1 for highly homogeneous networks, p(k) = δ(k−κ), and tends to zero for highly heterogeneous ones. Network structure is also characterized by the clustering coefficient of each node, $$C_i = 2t_i/(k_i(k_i-1))$$, where $$t_i$$ is the number of triangles incident to node i, that is, $$t_i = 1/2\mathop {\sum}\nolimits_{j,h} a_{ij}a_{ih}a_{jh}$$. We also monitored the Pearson correlation coefficient applied to the edges, $$r = \left( {[k_lk_{l^{\prime}}] - [k_l]^2} \right)/\left( {[k_l^2] - [k_l]^2} \right)$$, where $$[ \cdot ] = \frac{1}{{\left\langle k \right\rangle N}}\mathop {\sum}\nolimits_l ( \cdot )$$ stands for the average over edges47, and the time dependence has been dropped for clarity. It can be estimated as $$r = \left( {\left\langle k \right\rangle \left\langle {k^2k_{nn}\left( k \right)} \right\rangle - \left\langle {k^2} \right\rangle ^2} \right)/\left( {\left\langle k \right\rangle \left\langle {k^3} \right\rangle - \left\langle {k^2} \right\rangle ^2} \right),$$ where $$k_{nn}(k)$$ is the mean degree of the neighbours of nodes of degree k, with $$k_{nn,i} = k_i^{ - 1}\mathop {\sum}\nolimits_j a_{ij}k_j$$28. This characterizes the degree–degree correlations, which have important implications for network connectedness and robustness.
That is, whereas most social networks are assortative (r > 0), almost all other networks, whether biological, technological or information-related, seem to be generically disassortative (r < 0), meaning that high-degree nodes tend to have low-degree neighbours, and vice versa. Previous studies showed that heterogeneous networks favour the emergence of disassortative correlations47,48,49. Measures of the global variables in the stationary state are obtained by averaging over a long time window: $$\overline f = \Delta t^{ - 1}\mathop {\sum}\nolimits_{t = t_0}^{t_0 + \Delta t} f(t)$$.
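The structural observables above translate into a few lines of NumPy. This is an illustrative sketch under our own naming; the assortativity is computed directly as a Pearson correlation over the list of edge-endpoint degree pairs, which is equivalent to the edge-average formula in the text:

```python
import numpy as np

def homogeneity(e):
    """g = exp(-sigma^2/kappa^2): 1 for regular graphs, -> 0 for very
    heterogeneous degree distributions."""
    k = e.sum(axis=1)
    return np.exp(-k.var() / k.mean() ** 2)

def clustering(e):
    """Local clustering C_i = 2 t_i / (k_i (k_i - 1)); the number of
    triangles t_i is read off the diagonal of e^3."""
    k = e.sum(axis=1)
    t = np.diag(e @ e @ e) / 2.0
    denom = k * (k - 1)
    return np.divide(2.0 * t, denom,
                     out=np.zeros_like(t, dtype=float), where=denom > 0)

def assortativity(e):
    """Pearson degree-degree correlation r over edges."""
    k = e.sum(axis=1)
    i, j = np.nonzero(np.triu(e, 1))
    x = np.concatenate([k[i], k[j]])       # both orientations of each edge
    y = np.concatenate([k[j], k[i]])
    return np.corrcoef(x, y)[0, 1]
```

Simple sanity checks: a ring lattice gives g = 1, a triangle gives C_i = 1 for every node, and a star graph is perfectly disassortative (r = −1).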

### Topological limit

In the topological limit of the model the network topology is independent of the neural state, and one substitutes $$I_i \to k_i$$, so that $$\tilde \eta _i = \tilde \eta \left( {k_i} \right)$$ and $$\tilde \pi _i = \tilde \pi \left( {k_i} \right)$$. In this way we can construct a master equation for the evolution of the degree distribution by considering network evolution as a one-step process with transition rates $$u(\kappa )\tilde \pi (k)$$ for degree increments and $$d(\kappa )\tilde \eta (k)$$ for decrements. Approximating the temporal derivative of p(k,t) by its expected change at each time step, we get:

$$\begin{array}{ll}\frac{{{\rm d}p(k,t)}}{{{\rm d}t}} = & u\left( \kappa \right)\tilde \pi \left( {k - 1} \right)p\left( {k - 1,t} \right) + d\left( \kappa \right)\tilde \eta \left( {k + 1} \right)p\left( {k + 1,t} \right)\\ & - \left[ {u\left( \kappa \right)\tilde \pi \left( k \right) + d\left( \kappa \right)\tilde \eta \left( k \right)} \right]p\left( {k,t} \right),\end{array}$$
(4)

which is exact in the limit of no degree–degree correlations between nodes.

Finally, results in the manuscript are for γ = 1 since, in this topological limit, this corresponds to choosing links at random for removal: the probability of choosing an edge (i,j) is then $$p_{ij} = \frac{1}{{k_i}}\tilde \eta \left( {k_i} \right) + \frac{1}{{k_j}}\tilde \eta \left( {k_j} \right) = \frac{1}{{\left\langle k \right\rangle N}}$$. This can be seen as a first-order approximation to the pruning dynamics, and it also simplifies computations considerably. Furthermore, the relevant parameter determining the behaviour of the system is the ratio between α and γ, whereas their absolute values only have a quantitative effect (see Supplementary Fig. 1).
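Eq. (4) can be integrated numerically as a one-step process. The sketch below is ours (the function name, the forward-Euler discretization, and the finite degree cutoff `kmax` are assumptions): `pi_t[k]` and `eta_t[k]` hold the effective gain and loss factors evaluated on the degree grid k = 0,…,kmax.

```python
import numpy as np

def master_step(p, u, d, pi_t, eta_t, dt=1.0):
    """One Euler step of the master equation (4) for p(k, t).

    Gain terms bring probability in from k-1 (edge gained) and k+1
    (edge lost); the loss term removes it from k by either transition.
    """
    gain = np.zeros_like(p)
    gain[1:] += u * pi_t[:-1] * p[:-1]      # from k-1 by gaining an edge
    gain[:-1] += d * eta_t[1:] * p[1:]      # from k+1 by losing an edge
    loss = (u * pi_t + d * eta_t) * p
    return p + dt * (gain - loss)
```

With absorbing boundaries switched off at the edges of the grid (no growth at kmax, no loss at k = 0), the step conserves total probability exactly, which is a useful check on any implementation.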

### Statistics and general methods

In this work, we used system sizes N = 800, 1600 and 3200, as indicated in each section. Results for N < 800 presented strong finite-size effects, so they were discarded (data not shown). The sample size for each result was chosen by convergence of the mean value.

### Code availability

Generated codes are available from the corresponding author upon reasonable request.

### Data availability

All data that support this study are available from the corresponding author upon reasonable request.

## References

1. Bullmore, E. & Sporns, O. Complex brain networks: graph theoretical analysis of structural and functional systems. Nat. Rev. Neurosci. 10, 186–198 (2009).
2. Amari, S. Characteristics of random nets of analog neuron-like elements. IEEE Trans. Syst. Man Cybern. 5, 643–657 (1972).
3. Hopfield, J. J. Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci. USA 79, 2554–2558 (1982).
4.
5. Modha, D. S. & Singh, R. Network architecture of the long-distance pathways in the macaque brain. Proc. Natl. Acad. Sci. USA 107, 13485–13490 (2010).
6. Telesford, Q. K., Simpson, S. L., Burdette, J. H., Hayasaka, S. & Laurienti, P. J. The brain as a complex system: using network science as a tool for understanding the brain. Brain Connect. 1, 295–308 (2011).
7. Voges, N. & Perrinet, L. Complex dynamics in recurrent cortical networks based on spatially realistic connectivities. Front. Comput. Neurosci. 6, 41 (2012).
8. Gastner, M. T. & Ódor, G. The topology of large open connectome networks for the human brain. Sci. Rep. 6, 27249 (2016).
9. Torres, J. J., Muñoz, M. A., Marro, J. & Garrido, P. L. Influence of topology on the performance of a neural network. Neurocomputing 58, 229–234 (2004).
10. de Franciscis, S., Johnson, S. & Torres, J. J. Enhancing neural-network performance via assortativity. Phys. Rev. E 83, 036114 (2011).
11. Lee, K. S., Schottler, F., Oliver, M. & Lynch, G. Brief bursts of high-frequency stimulation produce two types of structural change in rat hippocampus. J. Neurophysiol. 44, 247–248 (1980).
12. Frank, E. Synapse elimination: for nerves it’s all or nothing. Science 275, 324–325 (1997).
13. Klintsova, A. Y. & Greenough, W. T. Synaptic plasticity in cortical systems. Curr. Opin. Neurobiol. 9, 203–208 (1999).
14. De Roo, M., Klauser, P., Mendez, P., Poglia, L. & Muller, D. Activity-dependent PSD formation and stabilization of newly formed spines in hippocampal slice cultures. Cereb. Cortex 18, 151–161 (2008).
15. Keshavan, M. S., Anderson, S. & Pettergrew, J. W. Is schizophrenia due to excessive synaptic pruning in the prefrontal cortex? The Feinberg hypothesis revisited. J. Psychiatr. Res. 28, 239–265 (1994).
16. Kolb, B. et al. Experience and the developing prefrontal cortex. Proc. Natl. Acad. Sci. USA 109, 17186–17193 (2012).
17. Geschwind, D. H. & Levitt, P. Autism spectrum disorders: developmental disconnection syndromes. Curr. Opin. Neurobiol. 17, 103–111 (2007).
18. Faludi, G. & Mirnics, K. Synaptic changes in the brain of subjects with schizophrenia. Int. J. Dev. Neurosci. 29, 305–309 (2011).
19. Fornito, A., Zalesky, A. & Breakspear, M. The connectomics of brain disorders. Nat. Rev. Neurosci. 16, 159 (2015).
20. Chechik, G., Meilijson, I. & Ruppin, E. Neuronal regulation: a mechanism for synaptic pruning during brain maturation. Neural Comput. 11, 2061–2080 (1999).
21. Navlakha, S., Barth, A. L. & Bar-Joseph, Z. Decreasing-rate pruning optimizes the construction of efficient and robust distributed networks. PLoS Comput. Biol. 11, e1004347 (2015).
22. Holtmaat, A. & Svoboda, K. Experience-dependent structural synaptic plasticity in the mammalian brain. Nat. Rev. Neurosci. 10, 647 (2009).
23. Deger, M., Helias, M., Rotter, S. & Diesmann, M. Spike-timing dependence of structural plasticity explains cooperative synapse formation in the neocortex. PLoS Comput. Biol. 8, e1002689 (2012).
24. Deger, M., Seeholzer, A. & Gerstner, W. Multi-contact synapses for stable networks: a spike-timing dependent model of dendritic spine plasticity and turnover. Preprint at arXiv:1609.05730 (2016).
25. Holtmaat, A. et al. Transient and persistent dendritic spines in the neocortex in vivo. Neuron 45, 279–291 (2005).
26. Barabási, A. L. & Albert, R. Emergence of scaling in random networks. Science 286, 509–512 (1999).
27. Johnson, S., Torres, J. J. & Marro, J. Nonlinear preferential rewiring in fixed-size networks as a diffusion process. Phys. Rev. E 79, 050104 (2009).
28. Johnson, S., Marro, J. & Torres, J. J. Evolving networks and the development of neural systems. J. Stat. Mech. 2010, P03003 (2010).
29. Vazquez, F., Eguiluz, V. M. & San Miguel, M. Generic absorbing transition in coevolution dynamics. Phys. Rev. Lett. 100, 108702 (2008).
30. Gross, T. & Blasius, B. Adaptive coevolutionary networks: a review. J. R. Soc. Interface 5, 259–271 (2008).
31. Sayama, H. et al. Modeling complex systems with adaptive networks. Comput. Math. Appl. 65, 1645–1664 (2013).
32. Huttenlocher, P. R. & Dabholkar, A. S. Regional differences in synaptogenesis in human cerebral cortex. J. Comp. Neurol. 387, 167–178 (1997).
33. Eguiluz, V. M., Chialvo, D. R., Cecchi, G. A., Baliki, M. & Apkarian, A. V. Scale-free brain functional networks. Phys. Rev. Lett. 94, 018102 (2005).
34. Iglesias, J., Eriksson, J., Grize, F., Tomassini, M. & Villa, A. E. Dynamics of pruning in simulated large-scale spiking neural networks. Biosystems 79, 11–20 (2005).
35. Litwin-Kumar, A. & Doiron, B. Formation and maintenance of neuronal assemblies through synaptic plasticity. Nat. Commun. 5, 5319 (2014).
36. Zenke, F., Agnes, E. J. & Gerstner, W. Diverse synaptic plasticity mechanisms orchestrated to form and retrieve memories in spiking neural networks. Nat. Commun. 6, 6922 (2015).
37. Amit, D. J. Modeling Brain Function: The World of Attractor Neural Networks (Cambridge University Press, 1992).
38. Albert, R. Scale-free networks in cell biology. J. Cell Sci. 118, 4947–4957 (2005).
39. Morelli, L. G. & Kuperman, M. N. Associative memory on a small-world neural network. Eur. Phys. J. B 38, 495–500 (2004).
40. Akam, T. & Kullmann, D. M. Oscillatory multiplexing of population codes for selective communication in the mammalian brain. Nat. Rev. Neurosci. 15, 111 (2014).
41. Maslov, S. & Sneppen, K. Specificity and stability in topology of protein networks. Science 296, 910–913 (2002).
42. Berg, J., Lässig, M. & Wagner, A. Structure and evolution of protein interaction networks: a statistical model for link dynamics and gene duplications. BMC Evol. Biol. 4, 51 (2004).
43. Song, S., Miller, K. D. & Abbott, L. F. Competitive Hebbian learning through spike-timing-dependent synaptic plasticity. Nat. Neurosci. 3, 919 (2000).
44. Cortés, J. M., Torres, J. J., Marro, J., Garrido, P. L. & Kappen, H. J. Effects of fast presynaptic noise in attractor neural networks. Neural Comput. 18, 614–633 (2006).
45. Bortz, A. B., Kalos, M. H. & Lebowitz, J. L. A new algorithm for Monte Carlo simulation of Ising spin systems. J. Comput. Phys. 17, 10–18 (1975).
46. Sompolinsky, H. & Kanter, I. Temporal association in asymmetric neural networks. Phys. Rev. Lett. 57, 2861 (1986).
47. Boccaletti, S., Latora, V., Moreno, Y., Chavez, M. & Hwang, D. U. Complex networks: structure and dynamics. Phys. Rep. 424, 175–308 (2006).
48. Johnson, S., Torres, J. J., Marro, J. & Muñoz, M. A. Entropic origin of disassortativity in complex networks. Phys. Rev. Lett. 104, 108702 (2010).
49. Williams, O. & Del Genio, C. I. Degree correlations in directed scale-free networks. PLoS ONE 9, e110121 (2014).
50. Turanalp, M. E. & Can, T. Discovering functional interaction patterns in protein-protein interaction networks. BMC Bioinform. 9, 276 (2008).
51. Liu, G., Wang, H., Chu, H., Yu, J. & Zhou, X. Functional diversity of topological modules in human protein-protein interaction networks. Sci. Rep. 7, 16199 (2017).
52. Xiong, W., Xie, L., Zhou, S. & Guan, J. Active learning for protein function prediction in protein-protein interaction networks. Neurocomputing 145, 44–52 (2014).
53. Ren, J., Wang, J., Li, M. & Wang, L. Identifying protein complexes based on density and modularity in protein-protein interaction network. BMC Syst. Biol. 7, S12 (2013).
54. Kolb, B. & Gibb, R. Brain plasticity and behaviour in the developing brain. J. Can. Acad. Child Adolesc. Psychiatry 20, 265–276 (2011).
55. Gómez, R. L. & Gerken, L. Infant artificial language learning and language acquisition. Trends Cogn. Sci. 4, 178–186 (2000).

## Acknowledgements

We are grateful for financial support from the Spanish MINECO (project of Excellence: FIS2017-84256-P) and from “Obra Social La Caixa”.

## Author information

A.P.M., J.J.T., J.M. and S.J. designed the analyses, discussed the results and wrote the manuscript. A.P.M. also wrote the codes and performed the simulations.

Correspondence to Ana P. Millán.

## Ethics declarations

### Competing interests

The authors declare no competing interests.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Millán, A.P., Torres, J.J., Johnson, S. et al. Concurrence of form and function in developing networks and its role in synaptic pruning. Nat Commun 9, 2236 (2018). https://doi.org/10.1038/s41467-018-04537-6
