Since the early stages of the effort to formulate a mathematical description of collective behaviour, the fundamental dynamical rule common to most theoretical models has been that of local mutual imitation: each individual within the group tends to adjust its state of motion to that of its neighbours1,2,3,4,5,6,7,8. This type of imitative behaviour can be either explicitly prescribed by the model through a direct interaction between the particles’ velocities1,3,4,5, or it may be an effective interaction emerging from simpler positional rules, as attraction and repulsion2,7,8,9, depending on the coarse-graining level we decide to work at. In either case, effective imitation of the local neighbours is the cornerstone of self-organized collective dynamics. The early models also assumed that all individuals within the group moved with the same constant speed1,2,3,4,5,6. In that case, mutual imitation requires each particle to only adapt the orientation of its velocity to that of its neighbours. However, in real instances of collective behaviour, be they natural or artificial, the individual speeds fluctuate10,11,12, hence mutual imitation requires a particle to adjusts also its speed to that of the neighbours. In contrast to orientation, though, speed control cannot be left just to mutual imitation, as nothing then would prevent particles to move in sync at unreasonably large (or small) speeds. One therefore needs to devise a control mechanism aimed at keeping the individual speed of each particle in the ballpark of some reference value, v0, which is set by the biomechanical constraints of a given species in the case of natural collectives, or by the technological constraints in the case of artificial swarms. While v0 is a system-specific parameter, typically linked to the hardware of the animal or robot, and therefore hard to change, how the individual speeds are allowed to fluctuate around v0 is what actually constitutes the software, namely the essence of control, which is a fundamental problem both for natural13,14,15,16 and for artificial collective behaviour17,18; fluctuations of the individual speeds cannot just occur randomly, because when the group is under perturbation the speed change of each individual must influence those of other individuals in order to propagate information across the system. Therefore, collective response is inextricably linked to the speed control mechanism.

A vivid exemplification of this connection comes from one of the most spectacular instances of natural collective behaviour, namely starling flocks. Experimental observations have shown that speed correlations in these systems are scale-free: the changes in speed of one individual are statistically connected to those of all other individuals across the flock10. More precisely, in statistical physics language, what happens is that the spatial range of speed correlations, i.e. the speed correlation length, grows linearly with the group’s size: no matter how large is a flock, individual speed changes are correlated to each other. This is a startling phenomenon that does not have an obvious mathematical explanation, and that very likely is essential to an efficient collective functioning of this natural system. In support of this view come the extensive numerical simulations of Hemelrijk and Hildenbrandt14, which show that scale-free correlations are necessary to grant cohesion to the group, preventing fragmentation under external perturbations. Collective response and propagation of information, and how these traits impact on the evasion manoeuvres that the group can display in the face of external perturbations, are not only concerns of starling flocks, and not even only of natural systems, but of all instances, biological and artificial, of self-organized collective behaviour. Speed control is therefore a crucial issue in both biology and engineering.

Scale-free correlation is not at all a mild requirement set on speed control and this is demonstrated by the fact that the most widespread speed control mechanism used in the literature, namely linear control, does not work. The simplest way to control speed is indeed through a linear restoring force: whenever the speed vi of particle i deviates from the natural reference value v0, it gets ‘pushed back’ proportionally to the deviation (Fig. 1). Linear speed control is widely used to study the collective behaviour of the most diverse systems, from migrating cells19, bird flocks13,14, fish schools15, and pedestrian collectives20,21,22, to robots swarms17, and vehicle crowds18, to name just a few examples; it lies at the centre of most current implementations of collective behaviour. By using field data on starling flocks, numerical simulations of self-propelled particles and statistical field theory, we show here that linear speed confinement entails an intrinsic conflict between yielding a reasonable group’s speed and producing scale-free correlations. At its core, the problem is that to reproduce long-range correlations linear control requires a weak speed-confining force, so that the particles’ speeds are very loosely confined around their reference natural value, v0; but when this happens, entropic forces push the typical speed of the group to grow significantly larger than v0, not only causing a disagreement between theory and experiments, but more generally giving a severe discrepancy between reference speed and group’s speed, which is problematic in any collective system.

Fig. 1: Qualitative sketch of linear vs. marginal speed-restoring force.
figure 1

In the linear case the force pulls the speed back to its natural reference value v0 proportionally to the deviation from v0. Instead, in the marginal case, the force is extremely weak for small deviations from the reference speed, while it increases very sharply for large deviations, harshly suppressing them.

We will show that, to resolve this conflict, it is convenient to employ a different speed control mechanism, whose fundamental idea is quite simple: small speed fluctuations elicit nearly zero restoring force, while larger speed fluctuations are pushed back extremely sharply (Fig. 1). The great advantage of this kind of control is that the low stiffness for small fluctuations boosts correlation, while the sharp increase of the confining force for large fluctuations always grants a plausible speed to the group. For mathematical reasons that will be clearer later on, we call this mechanism, marginal speed control. Marginal control was first studied on purely speculative grounds in ref. 23, although no self-propelled particles simulations, nor comparison with the experimental data were conducted in that study. Here, we will provide theoretical, numerical and—most importantly—field empirical evidence indicating that a model of collective behaviour based on marginal speed control can produce scale-free correlations and acceptable group’s speed without the need to fine-tune any parameters. Although we use starling flocks as a crucial biological benchmark for validation, our results are general, as both numerical results and theoretical calculations support marginal control, irrespective of the specific experimental system one explores. Hence, we argue that marginal speed control may become a centrepiece of collective behaviour at a general level.

Results and discussion

Experimental evidence from a natural system

We consider 3D experimental data on natural flocks of starlings (Sturnus vulgaris) in the field. To the data previously reported by our lab in refs. 24,25,26, we added new data from our most recent campaign of acquisition conducted in 2019–2020 (see the “Methods” section for details of the experiments, and Table S1 in the SI for all biological data in each acquisition). The new data expand the span of the group sizes between N = 10 and N = 3000 animals, a wider interval than any previously reported study. As we shall see, this expansion of the dataset will be crucial in selecting the correct theory.

The three main experimental results that are of interest for us here are the following: (i) speed fluctuations in natural flocks are correlated over long distances, namely their spatial correlation functions are scale-free (Fig. 2a)10. The connected correlation function measures the similarity between the speed changes of different birds at a certain distance (see the “Methods” section for the mathematical definition of connected correlation function); how rapidly this correlation decays with the distance defines the correlation length, ξ. In standard systems, normally ξ has a certain fixed value (the scale of correlation), which does not depend on the size of the system, L. In scale-free systems, though, one finds that ξ is proportional to L, meaning that there is no actual scale of the correlation, apart from that set by the system’s size, L. Flocks have this very non-trivial property. (ii) Flocks are highly ordered systems. The polarization, \({{\Phi }}=(1/N)\left|\mathop{\sum }\nolimits_{i}^{N}{{{{{{{{\bf{v}}}}}}}}}_{i}/{v}_{i}\right|\) (where N is the total number of birds in the flock, vi is the vector velocity of bird i, and vi is its modulus, i.e. speed), is always quite large, typically above 0.9 (Fig. 2b, c). The large polarization indicates that these systems are deep into their ordered phase, which rules out the possibility that flocks are close to an ordering transition; hence, near-criticality in the standard ferromagnetic sense cannot be invoked to explain scale-free correlations. Moreover, unlike the case of orientations, scale-free correlations of the speed cannot be explained as the effect of a spontaneously-broken continuous symmetry27; in fact, in standard statistical physics, fluctuations of the modulus of the order parameter are heavily suppressed in the ordered phase, so they are very much short-range correlated28. The origin of scale-free correlations of the speed is therefore far from being trivial. (iii) Finally, flock-to-flock speed fluctuations are moderate (Fig. 2d). The average cruising speed of starlings within a flock is about 12 meters-per-second (m s−1), with typical fluctuations of 2 m s−1 10. This is also the typical cruising speed of an entire flock, namely s = (1/N)∑ivi, whose distribution is reported in (Fig. 2d). Hence, neither the individuals, nor the group, ever cruise at a speed much different from the natural reference value, v0. As we shall see, this seemingly puny experimental trait may become tremendously difficult to reconcile with scale-free correlations at the theoretical and practical level.

Fig. 2: Experimental evidence on starling flocks.
figure 2

a The equal-time space correlation function of the speed fluctuations (for the definition of the correlation function see the “Methods” section), plotted against the distance r between the birds rescaled by the flock’s size L, for some typical flocks (each colour corresponds to a different flock); the fact that all the curves collapse onto each other indicates that the spatial range of the speed correlation, namely the correlation length ξsp, scales with L, i.e. that the system is scale-free (see also Fig. 3a, c). b Scatter plot displaying polarization vs. mean speed of each flock in all recorded events; as the polarization, \({{\Phi }}=(1/N)\left|{\sum }_{i}{{{{{{{{\bf{v}}}}}}}}}_{i}/{v}_{i}\right|\), is quite close to 1 for all flocks, it is more convenient to plot 1−Φ in log scale. Data show that starling flocks are highly ordered systems, incompatible with the standard notion of near-criticality (green points correspond to medians over time, error bars to median absolute deviations). The probability distributions of polarization and mean speed are reported in panels c and d, showing that the typical mean group’s speed is 12 m s−1, with fluctuations of about 2 m s−1.

General theory

The reference flocking dynamics we will consider here is one in which the animals’ velocities interact through a direct coupling, aimed at describing the effective imitation between neighbouring individuals. This kind of dynamics can be written in a compact way as follows:

$$\frac{{\mathrm {d}}{{{{{{{{\bf{v}}}}}}}}}_{i}}{{\mathrm {d}}t}=-\frac{\partial H}{\partial {{{{{{{{\bf{v}}}}}}}}}_{i}}+{{{{{{{{\boldsymbol{\eta }}}}}}}}}_{i}$$
$$\frac{{\mathrm {d}}{{{{{{{{\bf{x}}}}}}}}}_{i}}{{\mathrm {d}}t}={{{{{{{{\bf{v}}}}}}}}}_{i},$$

where ηi is a white noise with strength proportional to T, a parameter playing the role of an effective temperature in the statistical physics context, namely \(\langle {{{{{{{{\boldsymbol{\eta }}}}}}}}}_{i}(t)\cdot {{{{{{{{\boldsymbol{\eta }}}}}}}}}_{j}(t^{\prime})\rangle =2dT{\delta }_{ij}\delta (t\,-t^{\prime})\); H is a cost function (or effective Hamiltonian), whose derivative with respect to vi represents the social force acting on the particle’s velocity (the effective friction coefficient in front of \({\dot{{{{{{{{\bf{v}}}}}}}}}}_{i}\) in Eq. (1) can be set to 1 through an appropriate rescaling of time29). In order to implement an imitation dynamics we can use the following—very general—cost function,

$$H=\frac{1}{2}J\mathop{\sum }\limits_{i,j}^{N}{n}_{ij}{({{{{{{{{\bf{v}}}}}}}}}_{i}-{{{{{{{{\bf{v}}}}}}}}}_{j})}^{2}+\mathop{\sum }\limits_{i}^{N}V({{{{{{{{\bf{v}}}}}}}}}_{{{{{{{{\bf{i}}}}}}}}}),$$

where the first term represents the imitation interaction between particles’ velocities, having strength J, and the second term is the speed control term, which affects each particle independently. The adjacency matrix, nij, is 1 for interacting neighbours and 0 otherwise, and the self-propulsion part of the dynamics, Eq. (2), implies that the interaction network depends on time, nij = nij(t). The first term in the cost function favours neighbouring individuals to have similar velocity, hence it contains an alignment component of the dynamics that was first explored in the seminal Vicsek model4,30,31; in the Vicsek model, though, the speed of the particles was kept constant, vi = v0, so that velocity imitation only impacted upon the orientations. Here, on the other hand, we want to study speed fluctuations and their correlations, hence we relax the Vicsek constraint of fixed speed: the first term in H favours mutual imitation of orientation and speed, while the confining potential, V(vi), keeps the speed of each particle confined around the natural reference value, v0. Notice also that the Vicsek dynamics was a discrete one, while here we wish to work in the continuum time limit, a generalization far from trivial, which has required a great deal of elaboration since the original Vicsek model was introduced (see, for example, refs. 16,32).

Equation (3) is certainly not the only possible choice, as more sophisticated cost functions could be considered. Likewise, more complex dynamical evolutions than Eqs. (1) and (2) could be adopted, including different kinds of fluctuating terms32 or correlated noise. Our aim here is to seek the simplest possible description able to explain the data; we therefore choose the minimalistic framework of Eqs. (1)–(3) as a starting point for the theoretical analysis. Moreover, apart from the simplicity of this modelling approach, it must be said that the analysis of biological data based on statistical inference13,33,34 indicates that this class of model provides a good statistical description of natural flocks of birds. We will discuss in the conclusions how models based on separate fluctuations for speed and orientations32,35 may be relevant for describing other types of biological systems.

The interaction term in the cost function H in Eq. (3) involves the full velocity vectors and therefore regulates mutual adjustment of both speeds and flight directions: individuals who tend to have similar directions also tend to have similar speeds. The actual speed of a bird will then be the product of the interplay between the individual confining force and the collective imitation among the birds. We could have considered two distinct imitation terms for speeds and flight directions, allowing these quantities to be adjusted independently. However, it has been shown in ref. 13 that the two models are substantially equivalent in explaining experimental data on starling flocks, suggesting that a unique adjustment interaction between full velocities already captures most of the relevant information in the data.

Linear speed control

The simplest control, and indeed the one used in the majority of models with fluctuating speed to date, consists of a harmonic potential confining the speed13,14,19,32,35,36,


where vi = vi. This potential generates a linear restoring force acting on the speed in the equation of motion (1), hence it is called linear speed control. The parameter g is the stiffness of the restoring force, and it can be interpreted as the elastic constant of a spring keeping the speed around its natural reference value, v0. The determination of the correlation length ξsp of the speed fluctuations in the case of a linear control has been worked out in ref. 13, and it gives

$${\xi }_{{{{{{{{\rm{sp}}}}}}}}}={r}_{1}{\left(\frac{J{n}_{c}}{g}\right)}^{1/2},$$

where r1 is the mean inter-particle distance and nc is average number of interacting nearest-neighbours. The explanation of Eq. (5) is simple: the theory defined by Eqs. (3) and (4), has a critical point at g = 0, where the correlation length diverges. Conversely, large values of the speed stiffness g suppress the range of speed correlations13. To have scale-free correlations with linear control one must have have \({\xi }_{{{{{{{{\rm{sp}}}}}}}}}\gg {L}_{\max }\) (where \({L}_{\max }\) is the size of the largest flock in the dataset), which can be achieved by fixing the stiffness g to be much smaller than \(1/{L}_{\max }^{2}\).

This theoretical scenario is confirmed in Fig. 3a, where we report the correlation length of the speed fluctuations, ξsp, vs. the system’s size, L, in numerical simulations of self-propelled particles (SPP) regulated by linear speed control (coloured points—see the “Methods” section and SI for details of the SPP simulations): when the speed stiffness g is small enough, namely smaller than \(1/{L}_{\max }^{2}\), the correlation length ξsp scales linearly with L over the whole range (dark red points), thus reproducing the scale-free nature of the experimental correlation length (black points). On the contrary, if g is larger than \(1/{L}_{\max }^{2}\), the range of the correlation grows linearly with L only up to a certain size, and then it saturates to its bulk value (5) (orange and yellow points). We conclude that, if correlations were our only experimental concern there would be no need to increase the speed stiffness g beyond \(1/{L}_{\max }^{2}\), and everything would be fine.

Fig. 3: Linear vs. Marginal speed control.
figure 3

a Natural flocks show a clear scale-free behaviour of the speed correlation length, ξsp, which scales linearly with L (Pearson coefficient rP = 0.97, p < 10−9). SPP simulations with linear speed control yields scale-free correlations over the entire range of L only at the smallest value of the stiffness g (dark red). b Natural flocks show no detectable dependence of their mean speed on the number of birds in the flock (Spearman coefficient rS = −0.13, p = 0.21; the black line is the average over all flocks). SPP simulations with linear control give a near-constant speed compatible with experiments only at the largest value of the stiffness g (light yellow); coloured lines represent the theoretical prediction of Eq. (6). Linear speed control is therefore unable to reproduce both experimental traits at the same time. c The correlation length in SPP simulations with marginal speed control scales linearly with L over the full range, provided that the temperature/noise T is low enough to have a polarization equal to the experimental one. d At the same value of the parameters as in panel c, SPP simulations with marginal control give mean group’s speed very weakly dependent on N, fully compatible with the experimental data; the y scale of the main plot is set to make a comparison with panel b, while in the inset we report the same data over a smaller y range to appreciate the agreement between theory (blue line) and simulations, and to show that the trend of group’s speed with N of the marginal model is really weak. [Each length scale pertaining to numerical simulations is reported in meters by normalizing it to the inter-particle distances, \(\ell ={\ell }^{ \sim }({r}_{1}^{\exp }/{r}_{1}^{ \sim })\), where \({r}_{1}^{\exp }\) and \({r}_{1}^{ \sim }\) are the mean inter-particle distances of experiments and simulations respectively. Numerical and experimental correlation lengths are reported on the same scale by matching the curves at the scale-free value of the parameters; numerical and experimental speeds are reported on the same scale by matching the curves at the largest value of N. Coloured points correspond to averages over numerical data, error bars to standard deviations. Black points correspond to the median (over time) of experimental data for each individual flocking event, error bars to median absolute deviations].

However, when we turn our attention to the mean speed of the flock, s = (1/N)∑ivi, the linear theory becomes problematic. Empirical data show that the mean speed does not change much from flock to flock and it does not have any dependence on the flock’s number of birds, N (black points in Fig. 3b). Let us see what is the prediction of the linear theory for the mean speed, s. Calculating the probability distribution, P(s), from Eqs. (1) and (2) is a prohibitive task, due to the time-dependence of the interaction network, nij(t); however, previous studies have shown that, in the deeply ordered phase in which flocks live, the timescale for rearrangement of nij(t) is significantly larger than the relaxation time of the velocities, hence one obtains reasonably accurate results by assuming a time-independent form of nij, namely a fixed interaction network34; as we shall see from the perfect agreement between off-equilibrium SPP simulations and theory, this approximation works very well. Under this assumption (plus some more bland algebraic approximations—see SI for details) one can calculate the probability distribution of the mean speed with linear control, obtaining for d = 3 the result,

$$P(s)=\frac{1}{Z}{s}^{2}\exp \left[-\frac{Ng}{T}{(s-{v}_{0})}^{2}\right],$$

where Z is a normalization factor. We can easily evaluate the peak of this distribution, that is the typical value of the mean speed of the group,


For N →  we get stypical = v0, so all is good for infinitely large groups, as their typical speed is just the same as the natural reference speed, v0. But in finite groups serious troubles emerge, as the typical speed grows for decreasing N, eventually becoming absurdly larger than the natural reference value, v0; and because in Eq. (7) the combination Ng appears, this problematic effect is all the more serious the smaller the speed stiffness g, so that for very weak control, even relatively large flocks will have a biologically implausible speed. Yet weak stiffness g is exactly what we need to grant strong correlations! Linear control has therefore a serious problem.

The physical reason for this drift of the mean speed in the linear theory is the following. In absence of the prefactor s2 in the distribution (6), a decrease of the stiffness g would increase the flock-to-flock fluctuations of the mean speed, but its typical value would be always equal to the natural one, namely v0. The s2 prefactor, though, changes this, pushing the maximum of the distribution at larger and larger speed for decreasing g. Where is this prefactor from? It is essentially the Jacobian of the change of variable between the d-dimensional velocity vector and the modulus of the velocity (see SI); in generic dimension, sd−1ds is the volume in phase space of all configurations with the same mean speed, but variable velocity direction (an identical term appears in the Maxwell–Boltzmann speed distribution). This is an entropic term, which boosts the probability of large speed merely because there are more ways to realise larger rather than smaller velocity vectors. When the imitation force is strong (as it is within a flock) and the speed-confining force is weak (as it must be for the sake of scale-free correlation), the system is allowed to gain entropy by increasing in a coordinated fashion all the individual speeds of the particles; as this entropic push is not suppressed by a strong enough exponential decay, it gives rise to unreasonably fast groups.

This theoretical prediction is confirmed by a comparison between numerical SPP simulations ruled by linear speed control and experimental data. Figure 3b shows that, once the reference speeds v0 of theory and experiments are matched at the largest sizes, for small values of N and g numerical flocks with linear speed control (dark red points) have a mean speed that is incompatible with that of actual experimental flocks (black points), which shows no appreciable dependence on N. To contrast the increase of the mean speed in smaller SPP flocks one needs a larger value of the speed stiffness g (light yellow points), but this depresses the range of the speed correlations, so that one fails to reproduce scale-free behaviour, Fig. 3a. This is the blanket-too-short dilemma of linear speed control: either we use a speed stiffness g small enough to reproduce scale-free correlations even at the largest observed values of N, but in that case we get implausible large group’s speed at low N (dark red points), or we increase g to tame the entropic boost of the speed and keep it within the experimental fluctuations at low N, but then we lose scale-free correlations at large N (light yellow points). Linear speed control cannot yield both experimental traits at the same time.

The numerical simulations that we present in Fig. 3 have been performed in a standard cubic box of side L, with periodic boundary conditions; given that the density is 1 (see the “Methods” section), the number of particles in each simulation is N = L3. Natural flocks, on the other hand, do not have a cubic aspect ratio37, but rather three main axis of different sizes, so that N = L1L2L3, where the main (i.e. longest) axis L1 is what we define as the flock’s linear size, L = L1; flocks are in fact quite elongated, with an aspect ratio L1/L2 which typically ranges between 6 and 8, depending on the flock. This means that if a flock and a simulation have the same value of L, they will not have the same number of particles N, and vice-versa. Hence, one possible objection to our discussion about the failure of the linear theory is that the fair comparison should be between experimental elongated flocks and simulations in equally elongated non-cubic boxes, with aspect ratio as similar as possible to the natural one. We run these elongated simulations and we present the results in the SI; what we find is that the results do not change: even in an elongated geometry similar to real flocks, the linear model cannot fit the experimental data. More precisely, if we set the stiffness to g = 1, in order to have the group’s speed under control at all values of N (see Fig. 3b), then the correlation is not scale-free, and ξ, instead of scaling with L, rapidly saturates as in Fig. 3a (see SI for the data). The shortcomings of the linear model are therefore not related to the system’s geometry, nor to the different ways in which L and N are related to each other.

We have seen that linear speed control is unable to produce scale-free speed correlations and at the same time to keep a limited value of the group’s speed, with any fixed value of the stiffness g. But what about a tuning mechanism? Could one imagine that g changes with size, in order to keep both experimental traits on board? Let us see this. In order to have scale-free correlations at all observed sizes, one needs ξspL for each L, a condition that, together with Eq. (5), leads to,

$$g\ll \frac{a}{{L}_{\max }^{2}}$$

where \({L}_{\max }\) is the size of the largest flock in the dataset and \(a={r}_{1}^{2}J{n}_{\mathrm {{c}}}\) collects all size-independent quantities. On the other hand, from Eq. (7) we see that, in order to have a typical flock’s speed reasonably close to the natural reference value, v0, one must ensure that \(T/(Ng{v}_{0}^{2})\ll 1\) for all observed sizes; if we use the reasonable approximation \(N \sim {L}^{3}/{r}_{1}^{3}\), where r1 is the mean nearest-neighbour distance, we obtain the condition,

$$g\gg \frac{b}{{L}_{\min }^{3}}$$

where \({L}_{\min }\) is the size of the smallest flock in the data-set, and where once again we have grouped into the parameter \(b={r}_{1}^{3}T/{v}_{0}^{2}\) all size-independent constants. Once the spectrum of observed values of L is wide enough, the two bounds (8) and (9) cannot be satisfied both with one single value of the speed control stiffness g. The only way to reconcile linear speed control with the empirical observations then would be to assume the existence of a tuning mechanism such that the speed stiffness g depends on the size L of the flock according to the condition

$$\frac{b}{{L}^{3}}\ll g(L)\ll \frac{a}{{L}^{2}}$$

This is a rather narrow strip for g(L) to live in, so that a biological mechanism fulfilling (10) would require some very tricky size-dependent fine-tuning. But in fact, even that could be insufficient: the two inequalities in Eq. (10) are asymptotic, namely they require the stiffness g to stay well clear of both boundaries, 1/L3 and 1/L2, not just between them; for medium-small values of L this becomes harder and harder to achieve. We can think of condition (10) as a wedge on the (g, L) plane, a wedge that closes rapidly when L decreases; the only way to keep this wedge open also for small values of L would be to tune also all other parameters, r1, J, nc, v0 etc., beside tuning the stiffness g. Such a fine tuning seems unlikely, if not impossible, to achieve. We believe it is more realistic to turn to some other, tuning-free, control mechanism.

Marginal speed control

In statistical physics, the correlation length ξ is connected to the inverse of the quadratic curvature of the (renormalized) potential, calculated at its minimum38,39,40; very small curvature implies very large correlation length, so that a divergent ξ is always due to a zero second derivative (or marginal mode) along some direction of the (renormalized) potential. This is also the case for linear speed control (4): the second derivative of the quadratic potential along the speed is proportional to g, hence when g is small, the correlation length is large. The problem, however, is that because the function is quadratic, by decreasing g we weaken the whole speed-confining potential, not just its curvature, hence giving a freeway to the entropic boost we have discussed before, ultimately resulting in the implausible large speed of small flocks.

This state of affairs suggests that we must turn to a confining potential that does not vanish entirely when its curvature does. To find this potential we proceed through general considerations of symmetry and common sense. First, the potential must keep the speed around the reference natural value v0 and it must diverge for large values of the speed; secondly, it must be rotationally symmetric in the whole velocity vector; third, it must have the simplest mathematical form compatible with the previous conditions and with the experimental evidence. The most general form of a rotationally symmetric potential that confines the speed around the natural reference value v0 is, \(V({{{{{{{{\bf{v}}}}}}}}}_{{{{{{{{\bf{i}}}}}}}}})={\left({{{{{{{{\bf{v}}}}}}}}}_{i}\cdot {{{{{{{{\bf{v}}}}}}}}}_{i}-{v}_{0}^{2}\right)}^{p}\), where the integer power p ≥ 2 must be even, in order to produce a minimum of the potential at v0. For p = 2 we have the classic O(n) quartic potential of standard vector ferromagnets28,38, namely, \(V({{{{{{{{\bf{v}}}}}}}}}_{{{{{{{{\bf{i}}}}}}}}})={\left({{{{{{{{\bf{v}}}}}}}}}_{i}\cdot {{{{{{{{\bf{v}}}}}}}}}_{i}-{v}_{0}^{2}\right)}^{2}\); several works that study collective behaviour with variable speed indeed use this kind of quartic potential16,36,41,42. Although this may seem a speed control mechanism genuinely different from the harmonic one that we considered in the previous section (after all, it generates a cubic, rather than linear, speed-confining force), in fact it is not. As we have seen, in highly polarized and coherent flocks individual speed fluctuations are relatively mild, so that for vi ~ v0 we can rewrite the quartic potential as, \(V({{{{{{{{\bf{v}}}}}}}}}_{{{{{{{{\bf{i}}}}}}}}})={({v}_{i}+{v}_{0})}^{2}{({v}_{i}-{v}_{0})}^{2} \sim 4{v}_{0}^{2}{({v}_{i}-{v}_{0})}^{2}\), which is nothing else than the harmonic potential that we already took into consideration. Hence, the p = 2, or quartic, theory is not suitable for our purposes, because it has a non-zero quadratic expansion around v0, which gives a non-zero curvature, leading to a saturation of the correlation length, and therefore to a violation of the scale-free phenomenology. The only way to avoid this would be to put a vanishing amplitude (i.e. stiffness) g in front of the whole quartic potential; but this is exactly what we tried doing in the case of linear control, and it does not work, because then the mean speed of the group explodes. Incidentally, the fact that the ‘harmonic’ or quadratic speed control theory is essentially identical to the quartic O(n) model, shows that that theory is not ‘harmonic’ at all in the actual vectorial degrees of freedom, vi.

The next simplest possibility is p = 4, which gives the following speed-control potential23,

$$V({{{{{{{{\bf{v}}}}}}}}}_{i})=\frac{1}{{v}_{0}^{6}}\lambda {\left({{{{{{{{\bf{v}}}}}}}}}_{i}\cdot {{{{{{{{\bf{v}}}}}}}}}_{i}-{v}_{0}^{2}\right)}^{4}$$

where, thanks to the \({v}_{0}^{-6}\) normalization, the amplitude λ has the same physical dimensions as the other coupling constants, J and g. The crucial feature of the potential in Eq. (11) is that its second derivative with respect to the speed is always zero, irrespective of the value of the amplitude λ, hence we will call this marginal speed control. Higher order powers in the expansion of the potential are nonzero and very steep, though, thus confining the speed much more effectively to its reference value, v0, compared to linear control. Correspondingly, the speed-restoring force is very weak for small deviations from v0, but very strong for large deviations, as one can qualitatively see from the marginal speed-restoring force in Fig. 1. We will discuss later about the biological plausibility of this kind of nonlinear speed confinement; here this is for us merely the simplest confining potential, beyond the harmonic one. Although from a physical point of view simplicity is certainly appealing, it is not always the best path to interpreting the experimental results, especially in biological systems, whose complexity often defies the physicist’s desire of (over?) simplification; hence, we cannot exclude that other mechanisms, possibly within more complex theories, may equally successfully explain the empirical evidence. For example, in the spirit of ref. 3 different kinds of noise could be adopted in the equations, affecting the speed differently than the orientations (like the active noise considered in ref. 32) or with non-trivial temporal correlations. Moreover, the reference speed v0 might itself be a dynamical variable: even though we do not observe temporal trends of the mean group speed in the data, at least on the scale of our experiment, it could be a relevant issue for other instances of collective motion11. Once again, our criterion here is to explore the simplest mathematical possibility that may work.

The complete absence of a quadratic term in the expansion of Eq. (11) seems to suggest that the marginal potential gives rise to an infinite correlation length of speed fluctuations under all physical conditions; in fact, this is not the case. Speed correlations are regulated by the confining potential (i.e. the energy), but also by the fluctuations induced by the noise (i.e. the entropy): at very low noise the marginal potential dominates, so that speed fluctuations are indeed scale-free, while by increasing the noise the correlation is increasingly suppressed by entropic fluctuations23. In field theory terms, what happens is that at finite temperature entropy provides a non-zero second derivative of the renormalized potential, i.e. a non-zero mass of speed fluctuations, and therefore a finite correlation length. The renormalized curvature only goes to zero at T = 0, where the speed correlation length diverges. As a consequence, the marginal theory has a zero-temperature (or zero-noise) critical point, where the correlation length of the speed fluctuations diverges. A mean-field analysis23 shows that the speed correlation length diverges as

$${\xi }_{{{{{{{{\rm{sp}}}}}}}}} \sim \frac{1}{{T}^{1/2}},$$

where the generalized temperature T is the strength of the noise in Eq. (1). This scenario has an interesting and very convenient consequence: in the marginal theory, by simply decreasing the noise strength T, we bring home two out of three empirical traits, that is a large polarization and a large correlation length, a somewhat unusual result within standard statistical systems, which normally are the less correlated the more they are ordered (we are talking about connected correlation, of course). But what about the crucial constraint of producing a biologically reasonable group speed?

The calculation of the distribution of the mean speed of flocks under the marginal potential is more complicated than in the linear case (see SI), but under some reasonable approximations one obtains

$$P(s)=\frac{1}{Z}{s}^{2}\exp \left[-\frac{N\lambda }{T{v}_{0}^{6}}{({v}_{0}^{2}-{s}^{2})}^{4}\right]$$

which has two great differences compared to the linear case: first, the power four in the exponential tames the entropic push of the s2 term extremely sharply; secondly, the amplitude λ, unlike g, does not need to be small to grant scale-free correlations, so the exponential weight remains always effective in suppressing large values of the mean speed, s. The maximum of this distribution, i.e. the typical mean speed of the flock, is given by (see SI for details),

$${s}_{{{{{{{{\rm{typical}}}}}}}}}\simeq \left\{\begin{array}{ll}{v}_{0}\kern5.5pc &{{{{{{{\rm{for}}}}}}}}\,N\gg \frac{T}{\lambda {v}_{0}^{2}}\\ {v}_{0}{\left(\frac{T}{4N\lambda {v}_{0}^{2}}\right)}^{1/8}\kern2.2pc &{{{{{{{\rm{for}}}}}}}}\,N\ll \frac{T}{\lambda {v}_{0}^{2}}\end{array}\right.$$

As in the linear case, for N → , the typical speed of the flocks becomes the same as the natural reference speed, v0, while it increases for smaller sizes. However, in the marginal case this growth is very moderate indeed: the strength T of the noise is small (to have large correlation length) and the amplitude λ is finite, so that the crossover size \(N=T/(\lambda {v}_{0}^{2})\), below which the mean speed increases, is very small, thus shielding this regime; moreover, the exponent of the speed’s increase for small N is 1/8, significantly smaller than the exponent 1/2 of linear speed control (see Eq. (7)). For these two reasons marginal speed control is so much more effective than linear control at small N.

These theoretical results are fully confirmed by numerical simulations of SPP flocks regulated by marginal speed control (Fig. 3c, d). Moreover, the comparison with experimental data on real starling flocks in the field indicates that marginal control gives satisfactory results: with just one reasonable set of parameters—the only crucial one in fact being the low noise strength, T—numerical simulation of SPP flocks with marginal speed control reproduce the experimental data very well: the correlation length ξsp scales linearly with L up to the largest size (Fig. 3c), and the mean speed s shows only an extremely weak increase at low N, well within the scatter of the empirical data (Fig. 3d).

There is one further—and final—empirical test that we can make to compare the two theories, linear vs. marginal; it consists in investigating the individual speeds, and how they fluctuate in the experimental data, compared to the two models. This is a complementary view to the analysis of the mean speed of the group that we presented in Fig. 3. In Fig. 4 we report the single particle speed distributions for natural flocks, and compare it with the numerical simulations of the linear and marginal model; we do this for three different values of the number of particles N. The stiffness in the linear case has been fixed to a value granting scale-free correlations at all sizes (see Fig. 3a), while the parameters of the marginal model are the same as in Fig. 3. It is quite clear that individual speed fluctuations are far too large in the case of linear control, even in the largest N case, and they are completely out of scale in the medium-small N cases. On the contrary, marginal control gives very reasonable distributions at all sizes, in very good agreement with experiments.

Fig. 4: Single particle speed distributions.
figure 4

We measured the single particle speed distribution (as opposed to the mean speed of the group of Fig. 3), for different groups sizes, for real flocks (grey), the linear model (dark red) and the marginal model (light blue). The parameters in the marginal case are the same as in Fig. 3, while for the linear case we fixed the stiffness at g = 10−3, which is the only value giving scale-free correlations at all sizes (see Fig. 3). The individual speed on the abscissa has been normalized to the reference speed value v0, in order to compare all cases on the same plot. The data show that linear speed control is unable to fit the experimental distribution even at the largest N, and it is a total disaster in the medium and smallest N cases. On the other hand, the marginal theory fits the data in a rather satisfactory way at all values of N.

We stress once again that all three pieces of phenomenology—large polarization, large correlation length, moderate speed at all group sizes—are achieved with marginal control by doing just one very sensible thing, namely pushing the system into the ordered phase real flocks naturally belong to. The entropy-triggered conflict between scale-free correlation and moderate group speed that hinders linear control is therefore resolved by the marginal theory without any fuss.

Biological significance

The highly non-linear marginal potential implies that small speed fluctuations elicit nearly zero restoring force, while larger speed fluctuations are pushed back extremely sharply, in contrast with the constant slope of a linear confining force, Fig. 1. In bird flocks, small speed fluctuations are not prevented by biomechanical constraints, but they could be depressed by energetic expenditure concerns, as changing the speed requires extra energy consumption; however, starlings prove to be very liberal about their energy expenditure habits while flocking43,44,45: although their metabolic rate is dramatically higher in flight than on the roost43, these birds will spectacularly wheel every day for half an hour before landing, expending energy at a ferocious rate; this suggests that small extra energy expenditures due to small speed fluctuations may indeed be weaker-than-linearly suppressed. On the other hand, large speed fluctuations clash against biomechanical and aerodynamic constraints, which are set very stringently by anatomy, physiology and physics46,47,48; therefore, a stronger-than-linear suppression of large speed fluctuations also seems quite reasonable. Moreover, we notice that while the linear restoring force is completely symmetric around v0, hence suppressing fluctuations smaller than v0 as strongly as those larger than v0, the marginal force is asymmetric, suppressing decelerating speed fluctuations less harshly than accelerating ones (see Fig. 1). It seems reasonable to expect that for birds, as well as for animals in general, accelerating should be harder than slowing down; hence, the asymmetry in the speed-restoring force adds to the biologically plausibility of marginal control.

Even though we tested our conclusions on the case of starling flocks, it may be that marginal control is one simple way (possibly not the only one) to reconcile data with theory in other biological systems, although of course this should be tested experimentally. Field experiments reporting the dynamical trajectories within animal groups are quite rare, especially in three dimensions, and beyond starlings the only data are for pigeons (Columba livia)49,50,51, jackdaws (Corvus monedula)52,53,54 and chimney swifts (Chaetura pelagica)55. Monitoring the mean speed of these groups should be straightforward, while checking the speed correlations may be somewhat more laborious, as only a study of correlation at different group’s size L would reveal whether or not scale-free correlations are present also in these systems; and yet, this seems to us an essential step to establish on a firmer basis the connection between speed control and correlation, which is the cornerstone of our results.

One interesting issue raised by the experimental studies on jackdaws52 and pigeons49,50,51 is the fact that real biological flocks are heterogeneous: birds are old and young, male and females, and beyond these obvious traits one may have more influential individuals in the group, as well as outliers with a non-average behaviour. To what extent the results that we obtained within a perfectly homogeneous model, in which all agents are exactly the same, are robust against heterogeneities? In particular, an important problem is whether or not the deviations from the mean behaviour of a few keystone individuals are acted upon by the rest of the group. Despite the simplicity of the marginal model, we can introduce meaningful heterogeneities in two ways, namely by attributing a different reference speed or a different speed variability to a few individuals. The results (that we discuss in details in the SI) are quite interesting: both the deviation in the reference speed and the deviation in speed variability are indeed tamed by the rest of the group provided that the imitative interaction is strong enough, which is by no means a problem for the marginal model, as stronger interaction means stronger correlation. Hence, not only scale-free correlations and moderate group speed, but also robustness against heterogeneities, are all achieved in the marginal model by simply staying in the strongly interacting ordered phase. Of course, there are other types of heterogeneities that cannot be studied within our simple model: considering variations in structural size and body mass between individuals within the same species, as well as understanding what happens in mixed-species flocks, require more sophisticated models, as for example the one studied in ref. 14; and—of course—real experimental studies. The fact that simple heterogeneities are tamed very easily in the marginal model, makes us optimistic about the robustness of marginal control in more general cases.

A further crucial issue to consider when we think about real biological systems, is that the dynamical phases of natural collective behaviour are diverse: starlings’ aerial display studied in our data (sometimes called murmurations) is characterized by a very compact drop-like structure, moving coherently over the roost, often subject to predation56; chimney swifts display a remarkable circling geometry55, while the jackdaws data collected in ref. 53 display two group-level phases, namely a cruising-to-roost dynamics and an anti-predator mobbing dynamics. It would be helpful to understand what are the properties that these different phases have in common. This is particularly relevant for correlation, which—as we have seen—is a tricky trait to sustain. Consider, for example, correlations in the velocity orientations of animals: statistical physics tells us27 that when the rotational symmetry is spontaneously broken by the group, namely when out of many equivalent directions of motion only one is selected, scale-free correlations of the orientations emerge automatically in the system. If spontaneous symmetry-breaking seems certainly to be the relevant case for starling flocks swirling over the roost, jackdaws flocks travelling to the roost52,53,54 and homing pigeons49,50,51 need to follow one specific direction, hence there is no spontaneous symmetry breaking; similarly, in the case of migrating birds, when longer duration flight carry the individuals along one well-defined route, there is no spontaneous symmetry breaking. In these cases, correlations of the orientations could be quite different from those of starlings. However, speed requires a different correlation mechanism than orientation, as no physical reason automatically grants long-range correlation; hence, it would be really helpful to investigate the link between correlation and speed control in different phases. Our impression is that long-range speed correlations are essential to propagate information in all phases of collective motion, in order to keep a good degree of cohesion in the face of natural variations of the individual speeds; we therefore expect that the interplay between speed control and speed correlation is a very general concern of collective motion. But only experiments can confirm this.

Experimental data on two-dimensional collective motion are somewhat more accessible than 3D data. The recent study about sheep herds57 presents a case where exactly the same interplay between correlation and speed control could be at work, and the presence of intermittency—with its huge speed fluctuations—could make even more urgent the issue of speed control; again, obtaining correlations at different groups size could require some nontrivial work, but the two-dimensional nature of the systems makes the tracking somewhat simpler than in the case of bird flocks. Intermittent motion and large speed distributions have also been observed in locust swarms11. Similarly, 2D data of fish schools in shallow water have been studied for a long time, both numerically3 and experimentally58,59; as in the case of herds—and unlike the case of flocks—speed fluctuations in fish schools are substantial, hence the issue of speed control can be quite different than in bird flocks. In fact, there may exist a rather profound difference between animals that cannot change much their speed, as birds within a flock, for which not only very large speeds are forbidden, but also very low ones, because of the very aerodynamics of flight, and animals that can reduce considerably their speed, down to halting, as mammal herds or fish schools, at least to a certain extent; apart from a significantly larger asymmetry in the control mechanisms, the very possibility to reduce the speed to zero could be a game changer. As we have explained above, the entropic push that blows the group speed in the linear case is due to the Jacobian factor sd−1ds, which is not kept under control by the exponential factor in the speed distribution; this factor implies that the probability to have speed equal to zero is always vanishing, a property that is certainly true in the case of bird flocks (see Fig. 4), but it is not so for organisms that can actually halt. To study such a case, hence, a different kind of model than the one studied here would be required. One possibility would be to control separately the noise fluctuations of velocity orientation and speed, along the lines of the studies in refs. 32,35, in such a way to eliminate the Jacobian factor and shift the distribution towards lower speed. This would be an interesting direction to explore, provided—of course—that some new measures are taken to grant the second key ingredient we have been investigating here, namely long-range speed correlations.

Finally, beyond the horizon of broadening our study to include biological systems other than starling flocks, experiments on artificial self-organized swarms would be the next essential step forward. Artificial swarms would allow first to assess at the embodied level (which is quite different from the numerical one) to what extent group’s cohesion depends on the range of the correlation: the fact that long-range correlations (quite rare a condition in physical systems) are so frequent in biological collective behaviour—from bird flocks10, to midge swarms60 and down to bacterial clusters61—has prompted biophysicists to connect this trait to collective response and cohesion, with both theoretical62 and numerical backup14; yet, biology is one thing, engineering another one, and no matter how much inspiration the latter takes form the former, it is crucial to check whether the link between correlation and response holds at the technological level. Secondly, in artificial collectives it would be possible—at least to some extent—to tune the reference agents’ speed v0 and to tweak the manner individual speed can fluctuate around v0 in a controlled way, which is impossible to achieve in natural systems. Having the possibility to operate on both arms of the problem (correlation and speed control) would be invaluable, given the growing technological relevance of self-organized collective behaviour.



Empirical observations of starling flocks have been performed in Rome, from the terrace of Palazzo Massimo alle Terme, in front of a large roosting site at the Termini Railway Station. The experimental technique used is stereoscopic photography, where multiple synchronized video-sequences of a flocking event are acquired from different observation points with a calibrated multi-camera video-acquisition system63. Digital images are then analysed with a specifically designed tracking software64, in order to extract from the raw data the three-dimensional trajectories of the individual birds in the flock.

Data have been collected across the years during several experimental campaigns. The very first data were collected in the context of the Starflag project24,25, between 2007 and 2010, using Canon D1-Mark II cameras, shooting interlaced at 10 frames-per-second (FPS), with a resolution of 8.2 Megapixels (MP). A second campaign took place between 2011 and 2012, using faster cameras, namely IDT M5, shooting at 170FPS with a resolution of 5MP. A final campaign took place in the last months of 2019 and in January and February 2020. This campaign uses state-of-the-art IDT OS10-4K cameras, shooting at 155FPS with a resolution of 9.2MP. Overall, we have data from flocks with sizes ranging between 10 and 2500 birds, a span that is essential to differentiate between linear and marginal speed control.

All campaigns have been conducted with a three camera system exploiting trifocal geometry65. The image analysis—segmentation of individual birds, stereometric matching and dynamical tracking—have been performed using the method of refs. 24,25 for the first campaign, and the most advanced method of Attanasi et al. 64 for the second and third campaigns. We summarize in Table S1 in the SI all the experimental quantities used in our analysis for each flocking event.

Correlation functions and correlation length

The speed spatial connected correlation function is defined as66

$$C(r)=\frac{\mathop{\sum }\limits_{i,j}^{N}\delta {v}_{i}\,\delta {v}_{j}\,\delta (r-{r}_{ij})}{\mathop{\sum }\limits_{i,j}^{N}\delta (r-{r}_{ij})}$$

where N is the number of individuals in the system, rij = ri – rj is the mutual distance between individuals i and j, and

$$\delta {v}_{i}={v}_{i}-\frac{1}{N}\mathop{\sum }\limits_{k}^{N}{v}_{k}$$

is the fluctuation of the individual speed vi = vi with respect to the mean speed of the group s = (1/N)∑kvk, evaluated at a given instant of time, thus neglecting effects of common external environmental factors. The function C(r) represents the instantaneous average of mutual correlations among all pairs at distance r: in systems with local, distance-dependent interactions, for large enough system sizes, this quantity is a good proxy of the typical correlation at that distance, as computed with the correct theoretical measure. A full discussion of definition (15), its asymptotic limit, finite-size effects, and behaviour in known cases, can be found in ref. 66. Here we notice that the correlation function (15) is the only possible definition applicable to experimental data, where no a priori information is available on the true nature of the dynamics. This definition has indeed been used in all the previous analysis of speed correlations mentioned in this paper. We display in Fig. S5 of the SI the correlation function (15) computed, respectively, from experimental data (panel a), and from numerical simulations with a linear speed control model (panel b) and a marginal speed control model (panel c).

For each configuration of the system (at a given time) we estimate the correlation length as

$$\xi =\frac{\int\limits_{0}^{{r}_{0}}\,{{{{{{{\rm{d}}}}}}}}r\,r\,C(r)}{\int\limits_{0}^{{r}_{0}}\,{{{{{{{\rm{d}}}}}}}}r\,C(r)}$$

and then we perform a time average of this quantity over different configurations. The point r0 is the first point at which the correlation vanishes, C(r0) = 0; such a point always exists due to the very definition of correlation function given in Eq. (15)66. Definition (17) provides a reliable estimate of the correlation length in every regime, both when the system is scale-free with long-range correlations and when the system is far from criticality with short-range correlations (in the linear speed control model we can see all this phenomenology by changing the parameter g, see Fig. 3a). The reason is that Eq. (17) makes use of the information encoded in the zero-crossing point r0, together with the shape of the entire function C(r).

Different definitions of the correlation length are possible, but one has to be careful, as it is very difficult to find alternative definitions that provide a reasonable estimate of the spatial range of the correlation both in the scale-free case and in the short-range case. For example, in the original work on scale-free correlations in flocks10, we used the zero of the correlation function, r0, as an estimate of the spatial span of the correlations, because it scales properly when the system is scale-free, correctly identifying the size of the correlated domains, and scaling linearly with the system’s size, L; however, r0 is not a good estimate of the correlation length when a system is not scale-free, namely when there is an intrinsic short-range length-scale, ξ, as in that case one finds \({r}_{0} \sim \xi \log L\)66. Conversely, in the case of short-range correlations it is relatively easy to determine the correlation length, for example via an exponential fit, while this procedure is unfeasible in the scale-free case, when the correlation function does not have a short-range decay (see Fig. S5 in the SI).

On the other hand, definition (17) works in all regimes. In the case of a short-range correlation, for example exponential, (\(C(r) \sim {e}^{-r/\hat{\xi }}\)), it is easy to verify that (17) gives \(\xi \sim \hat{\xi }\). Conversely, when correlations are scale-free, regardless of the precise shape of the C(r) one has r0 ~ L, and Eq. (17) gives a correlation length that scales with L as well; more precisely, for scale-free correlations as the ones we find in flocks, namely with an almost linear decay, Eq. (17) gives, ξ = r0/3; considering that in ref. 10 we found r0 = L/3, one has that in scale-free flocks, ξ = L/9, which is consistent with the data we report in Fig. 3. We believe that this factor 3 correction with respect to r0 is inessential, as the only relevant information in the scale-free case is not the actual value of the span of the spatial correlation, but rather the fact that it scales with L.

Numerical simulations of SPP flocks

To investigate the flocking dynamics described by Eqs. (1) and (2), we perform numerical simulations with a system of self-propelled particles. The flock is modelled as a set of particles moving in a three-dimensional space with update rules for positions and velocities, which are a discretized version of Eqs. (1) and (2). Following a simple Euler integration scheme67, we get

$${{{{{{{{\boldsymbol{v}}}}}}}}}_{i}(t+{{\Delta }}t)={{{{{{{{\boldsymbol{v}}}}}}}}}_{i}(t)+{{\Delta }}t\,{{{{{{{{\boldsymbol{F}}}}}}}}}_{i}+\delta {{{{{{{{\boldsymbol{\eta }}}}}}}}}_{i}$$
$${{{{{{{{\boldsymbol{x}}}}}}}}}_{i}(t+{{\Delta }}t)={{{{{{{{\boldsymbol{x}}}}}}}}}_{i}(t)+{{\Delta }}t\,{{{{{{{{\boldsymbol{v}}}}}}}}}_{i}(t)$$

Here the force Fi = Fint + Fsc acting on particle i contains both an alignment term Fint,

$${{{{{{{{\boldsymbol{F}}}}}}}}}_{{\mathrm {int}}}=-J\mathop{\sum }\limits_{j}^{N}{n}_{ij}(t)\left({{{{{{{{\boldsymbol{v}}}}}}}}}_{i}(t)-{{{{{{{{\boldsymbol{v}}}}}}}}}_{j}(t)\right)$$

and a speed control term Fsc, which can be either linear:

$${{{{{{{{\boldsymbol{F}}}}}}}}}_{{\mathrm {sc}}}=2g\frac{{{{{{{{{\boldsymbol{v}}}}}}}}}_{i}}{| {{{{{{{{\boldsymbol{v}}}}}}}}}_{i}| }\left({v}_{0}-| {{{{{{{{\boldsymbol{v}}}}}}}}}_{i}| \right)$$

or marginal,

$${{{{{{{{\boldsymbol{F}}}}}}}}}_{{\mathrm {sc}}}=\frac{8\lambda }{{v}_{0}^{6}}{{{{{{{{\boldsymbol{v}}}}}}}}}_{i}{({v}_{0}^{2}-{{{{{{{{\boldsymbol{v}}}}}}}}}_{i}^{2})}^{3}$$

The last term in Eq. (18) is a white Gaussian noise with zero mean and variance:

$${\sigma }_{\eta }^{2}=2dT{{\Delta }}t$$

where d = 3 is the space dimension and T is the effective temperature. The matrix nij(t) is the adjacency matrix that defines which pairs interact; its entries can assume only the values 0 and 1, according to a rule of interaction that can be metric (i.e. nij ≠ 0 if and only if rij < rc) or topological (i.e. nij ≠ 0 if j is one of i’s first nc neighbours)33,68. When working at fixed average density and in the very low temperature region where density fluctuations are small, there is not great difference between metric and topological interaction. Even though natural flocks are known to have topological interactions33,68, we therefore decide to perform simulations with the metric rule, which are much less expensive computationally. In this way, we are able to study systems in d = 3 with N up to 3 × 105 particles. We consider a metric connectivity matrix with interaction radius rc = 1.2, such that the number of nearest neighbours at the time t = 0 is nc = 6, close to the biological value33,68. We then check a posteriori that the system remains spatially homogeneous in time by computing the distribution of the number of nearest neighbours for every simulation, and verifying that it is always sharply peaked around the initial value nc = 6. All the simulations are made in a cubic box (of linear size L) with periodic boundary conditions. Individuals are initialized in a global polarized configuration on a cubic lattice with lattice spacing (i.e. nearest-neighbour distance) rc = 1 and then evolve off-lattice according to rules (18). The effective temperature clearly drives the system from a disordered to an ordered state through a phase transition at fixed density. However, since we are considering self-propelled particles, the same configurations can be reached using another control parameter defined as the ratio between the mean first neighbour distance r1, which directly depends on the density of the system, and the interaction radius rc but at fixed noise. We decide to perform all the simulations at constant density ρ = 1, maintaining r1/rc constant and choosing the temperature according to the desired polarization.

We choose the value of the reference speed of the particles v0 and of the integration step Δt to ensure an average displacement Δrv0Δt much smaller than the size of the box L. In this way there is a weak rewiring of the interaction network during the time of simulation, consistently with the quasi-equilibrium condition of natural flocks34. The integration step is selected as the maximum value granting a robust numerical integration in terms of errors and stationarity of the system’s energy (absence of trends in time or in size). In simulations with marginal speed control this algorithmic stability is achieved with Δt = 0.01, while linear speed control requires a Δt = 0.001. Every simulation consists in a run of length Nsteps = 2 × 104 steps for thermalization and in an independent run long Nsteps = 1.2 × 106 steps. From the latter we extract configurations every 1000 steps in order to compute the quantities needed by our analysis. In Table S2 of the SI we report the values of the other parameters used in the simulations.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.