Information flow in finite flocks

We explore information flow in finite active matter flocks by simulating the canonical Vicsek model and estimating the flow of information as a function of noise (the variability in the extent to which each animal aligns with its neighbours). We show that the global transfer entropy for finite flocks not only fails to peak near the phase transition, as demonstrated for the canonical 2D Ising model, but remains constant from the transition throughout the entire ordered regime to very low noise values. This provides a foundation for future study regarding information flow in more complex models and real-world flocking data.


Introduction
Recent experimental studies of animal flocks, fish [17,36], birds such as pigeons [34], starlings [18], midges [4] and sheep [23] have dramatically increased our understanding of flocking dynamics.A central theoretical issue is how the communication range between flock members leads to global coordination of the flock, leading to proposals [32,4] that the flock exists at some critical point-the point at which flocks transition between order and disorder-providing the ideal scenario for collective reaction to external stim-uli, with enough order to form collective behaviour without overwhelming inertia.This carries concomitant implications for information flow.
We measure information flow among members of these flocks, using transfer entropy [37], in particular Global Transfer Entropy (GTE, eq.5), shown to peak on the disordered side of the phase transition in the 2D Ising spin model [9], where mutual information (MI) and pairwise transfer entropy (TE) peak at the transition.
Real world flocks are active matter -systems far from equilibrium, which do not conserve momentum or other dynamical quantities [22].We analyse the long-term limit GTE of the minimalist Standard Vicsek Model (SVM) [40] of collective motion [38,41] using a novel closed-form dimensional reduction, obtained by exploiting an approximate isometry in the SVM.This approach has demonstrated anomalous behaviour in MI [11], diverging as noise tends to zero and failing to peak at the transition.
This study reveals two unexpected informationtheoretic differences from the Ising model [9] at long observation times: absence of a peak in GTE on the disordered side of the phase transition; and a slowly changing value below the transition 1 .For shorter observation times, unlike the Ising model, the GTE peaks at, rather than above, the phase transition.
While the final closed-form expression for GTE (eq.9) requires an isometry approximation, two technical innovations (eqs.6 and 7) were needed to reduce computational requirements (without isometry): replacing the multi-dimensional vector of interacting particles with a consensus vector; and exploiting the independence of the noise, leading to the surprising result (eq.7) that calculation of global information flow requires no measurement of neighbouring particles.
This new behaviour may inform future studies on real-world flocks or other more sophisticated models, noting that for the SVM maximum information flow occurs not just near the phase transition, but throughout the entire ordered regime as well.

The Standard Vicsek Model
The SVM comprises a set of N point particles (labelled i = 1, . . ., N ) moving on a plane of linear extent L with periodic boundary conditions (see [1] for full details).Each particle moves with constant speed v, and interacts only with neighbouring particles within a fixed radius r.Positions x i (t) and headings θ i (t) are updated synchronously at discrete time intervals ∆t = 1 according to (1) respectively, ϕ i (t) is the average heading of all particles within the interaction radius, r i , of particle i (including particle i itself), and ω i (t) is white noise uniform on the interval [−η/2, η/2] with intensity η ∈ (0, 2π].The average heading [40], which constructs the consensus vector, is ϕ i (t) = arctan[ sin(θ(t)) i,r / cos(θ(t)) i,r ], where z i,r ≡ 1 N N j z j δ r (i, j) and δ r (i, j) is 1 if neighbour j is within the interaction radius, r, of particle i, and 0 otherwise; note that δ r (i, i) = 1, so that z i is always included in the average.The velocity vector v i (t) is constructed from the heading θ i (t) with constant speed v. Particle density ρ = N/L2 is fixed throughout at 0.25.
Considering the SVM as a steady-state statistical ensemble of finite size containing N particles with control parameter η, we use capitals to indicate quantities sampled from the ensemble; in particular, Θ i denotes an ensemble sample of the heading of the ith particle.The full order parameter for the SVM ensemble is the 2D mean particle velocity vector M with magnitude M ∈ [0, 1] and heading Φ ∈ (0, 2π] [1].M = 1 iff all particles are aligned, while in the fully-disordered case (η = 2π) we have M → 0 in the large-system limit N → ∞.The ensemble variance defines the susceptibility; a peak in χ as a function of η is taken to locate an (approximate) phase transition [5] 2 .
The phase transition in the original SVM was thought to be second order, but this was disputed [25,19] and it transpired that seemingly minor details affect the nature of the transition: type of noise statistics [20]; forward versus backward updating (especially at high particle velocities) [33]; boundary conditions associated with density bands or spin waves [2]; and the cone of influence on each particle [21,35].The consensus for SVM now appears to state that the phase transition is second order for low velocities, and first order for high velocities [3,7,8].Consequently, we utilise the original SVM model (backward updating, angular noise, periodic boundary conditions and low density) over a range of velocities.Observation of the Binder cumulant [15] for these regimes (not shown) indeed shows a sharp minimum-representative of a first order transition-only at high velocity magnitudes (v = 2.00), consistent with [3].
The finite-size SVM exhibits behaviours [10] akin to "continuously-broken ergodicity" [30].Over short observation windows the SVM is confined to a comparatively small volume of phase space, thus breaking symmetry and ergodicity.As the window increases, the SVM explores progressively larger vol-umes of phase space, until ergodicity is restored, albeit requiring very long windows at low noise.
Thus the ensemble statistics are observation timedependent [10], giving two regimes-short-term and long-term statistics.In the short-term regime, we collate statistics (with no ergodic assumptions) over ranges of observation sizes spanning several orders of magnitude, demonstrating the effect of observation time.In the long-term limit, since ergodicity is unbroken, the time average of the GTE will be equal to the ensemble average, and thus statistics can be measured from ensembles constructed of many independent realisations-each with shorter observation windows-rather than the prohibitively long time spans required for solitary realisations to traverse the entire phase space.

Global Transfer Entropy
The information flow between two continuous random processes, from Y to X, is given by the TE [37] : where X, Y are the process histories, and X is the updated state.We truncate process histories to just the most recent state as in [9].For a continuous random variable X, H(X) ≡ − p X (x) log p X (x)dx denotes differential entropy, which necessitates a continuous estimator [28].GTE extends TE to measure information flow from the entire system to a single particle, and here is defined as the ensemble statistic where I is uniform on the set of particle indices, Θ i is the updated heading at time t + 1 of the ith particle, Θ = (Θ 1 , . . ., Θ N ) is the vector of all N particle headings.
Since the update of a particle's heading is mediated purely by the consensus heading of its neighbours, the GTE may be reduced to three dimensions, i.e., H(Θ I | Θ) = H(Θ I | Φ I ), giving: thus eliminating dimensionality issues surrounding Θ.
Noting that θ i = [ϕ i + ω i ] for any particle iwhere [. ..] denotes modulo 2π, confining the result to (−π, π]-and that noise ω i is independent of ϕ i we have just where Ω is the (ensemble) noise.Thus all measurement of particle neighbours is eliminated and eq.6 reduces to the two dimensional: Finally, in the long term limit rotational symmetry remains approximately unbroken: that is, for any fixed angle α, the joint distribution (Θ 1 +α, . . ., Θ N + α) is the same as the joint distribution (Θ 1 , . . ., Θ N ).Under this isotropy approximation (see below), eq.7 reduces to a one-dimensional form in which only changes in particle heading θ i −θ i and noise Ω appear.Let p(θ 1 , θ 2 ) be the probability density function (pdf) of (Θ I , Θ I ) [1].Under the assumption of rotational symmetry we have: where q(θ) is the pdf of Θ I −Θ I .Since the marginal distributions of Θ I and Θ I are uniform on the unit circle in the long-term (ergodic) limit, we obtain ) which reduces eq.7 to the closed-form expression for the long-term GTE, where [• • • ] denotes the internal angle.Note that at η = 2π, H([Θ I − Θ I ]) = H(Ω) = log 2π and thus T LT gl vanishes at maximum noise, as expected.As noise decreases, the particles align more and more strongly, so that the distributions of both Θ I − Θ I and Ω become increasingly sharply peaked.Since these are both differential entropies, they both diverge to −∞.The exact nature of the divergence is established in simulations discussed below.
The isotropy approximation arises because the SVM on a 2D plane with periodic boundary conditions-i.e. a flat torus-is not in fact strictly isotropic.We tested its validity by repeating the (dotted) calculated according to eq. 9 for a range of particle velocities for 0 < η ≤ 2π.System size N = 1000 particles, density ρ = 0.25 and velocities v as indicated.Simulation: ensembles constructed from 20 realisations at observation time T = 500 time steps each, under ergodic assumptions and the isotropy approximation (see text).Error bars at 1 s.e.(smaller than symbols) were constructed by 10 repetitions of the experiment.Lines show susceptibility χ (eq.3).long-term simulations while randomly rotating the SVM frame of reference between each update, thus enforcing isotropy [6].Negligible error was introduced around the phase transition and at very low noise [1].

Results And Discussion
Figure 1 shows the long-term GTE T LT gl estimated in sample according to eq. 9 for a range of particle velocities.For v < 0.5 there is no peak in GTE and for v ≥ 0.5 peaks occur at or below the phase transition-identified as a peak in χ as per eq.3with all GTE values approaching approximately 0.72 bits as noise tends to zero.As v increases, a shift in the χ peak to higher noise values occurs.
For short observation times, by contrast, T gl and T 2D gl estimated according to eqs. 6 and 7 respectively  gl (dark lines) and susceptibility χ (light lines) estimated for a range of particle velocities (parameters as in fig.1).T 2D gl was estimated according to eq. 7 over T = 5000 time steps (that is, with no ergodic assumption) after relaxation to steady state, using a nearest-neighbour estimator (full simulation details in [1]).χ was estimated over the same realisations.Error bars at 1 s.e.constructed by 10 repetitions.
(and with no isotropy assumption) do peak at the transition, rather than in the disordered regime as in the Ising model [9]; see figs. 2 and 3.
Figure 3 shows the effect of eliminating the consensus vector measurement in eqs.6 and 7.The agreement between the two is extremely close, although eq.7 gives slightly better results for numerical reasons.
Some flattening at low noise occurs, particularly for higher velocities.Here GTE does not converge to the ∼0.72 bits observed in the T LT gl .The shorter the observation window, the nearer we are to ergodicitybreaking as in the Ising Model [9] and thus GTE → 0 as η → 0. This is confirmed in fig. 4

which shows T 2D
gl for a single fixed velocity at observation window size varying over two orders of magnitude, along with the long-observation time limit T LT gl .As observation time increases, the GTE peak flattens and constant GTE in the ordered regime starts to occur, approaching, as predicted, the long-observation time limit.Figure 3: T gl as measured by the one-dimensional form (eq. 9, circles), the three-dimensional form (eq. 6, crosses) and the two-dimensional form (eq. 7, squares) for v = 0.30.Error bars at 1 s.e.constructed by 10 repetitions.Finally, fig. 5 shows the effect of varying the system size.For η > 0.4, T 2D gl increases-converging to T LT gl -as N increases.Below this however, T 2D gl diverges further as N increases, reflecting the reduced capacity of the system to explore large volumes of the phase space.Near the transition, flocks are in flux, breaking apart and reforming.Given this fluidity, a larger number of particles results in more sub-flocks, which consequently are able to explore the phase space more efficiently, so that the GTE approaches T LT gl .At near-zero noise levels, however, flock stability predominates, with phase space exploration effected mostly by the flock's random walklike behaviour 3 [16].The magnitude of the random walk is inversely proportional to the number of interacting particles; i.e., as the number of interacting particles increases, the mean of the consensus heading at t + ∆t more closely matches the mean at t. Due to the slower random walk, the system explores less of the phase space-more closely approximat- 3 Although it is still possible for flocks to break apart over time.estimates according to eq. 6 at fixed velocity v = 0.30 for T = 5000 time steps for varying N , along with T LT gl according to eq. 9 (red circles).
ing ergodicity-breaking-and hence diverges from the long term limit T LT gl .Simulation establishes the nature of the diverging entropies, showing convergence to ∼0.72 bits as η → 0. Analysis of how particle headings evolve over time [1] reveals an approximate Gaussian distribution 4 of heading differences-i.e., ∆Θ-as well as an approximately Gaussian distribution in the heading of the consensus vector-relative to the appropriate particle-as noise tends to zero.From the heading update in eq.2-with particular note of the definition of T LT gl in eq.9-we have: which allows us to decompose ∆Θ into two independent distributions, defined by ϕ i (t) − θ i (t) and ω i (t).The relative consensus heading, ϕ i (t)−θ i (t), is approximately Gaussian with support approximately 4 Strictly speaking it is a Von Mises (circular) distribution; however, as the width of the distribution is much less than 2π at and below the transition, we may safely adopt a Gaussian approximation.See [1] for more details.
. By the Central Limit Theorem [14], summing these two distributions as per the RHS of eq. 10, yields a truncated Gaussian with range [−η, η] and variance twice that of the noise; i.e., σ 2 ∆Θ = 2σ 2 Ω .Empirical results match this, with σ 2 ∆Θ = cσ 2 Ω where c → 2 − as η → 0. Thus, closed form entropies for Gaussian and uniform distributions can be substituted into eq.9: which tends to 0.7546 − bits as c → 2 − , in reasonable agreement with simulation results, given the approximations involved.
Above the phase transition however, the distribution of Θ I − Θ I is no longer approximately Gaussian in nature.As η → 2π, ∆Θ becomes increasingly convolved with a uniform distribution before reaching uniformity at η = 2π, leading to the decrease seen in T gl over η c < η < 2π.
Finite-size scaling analyses [5]-showing good agreement with theory for susceptibility divergence at the phase transition-include the appearance of dense travelling bands of particle at higher velocities (fig.6) [33].While this could imply symmetry breaking, fig.6 reveals that band orientation, as well as Φ, performs a random walk through angle space, thus not truly breaking ergodicity.Notwithstanding, the behaviour of the GTE around the phase transition at higher velocities is indeed different to lower velocities: it exhibits a peak in the long-term limit.The appearance of travelling bands coincides exactly with this noise/velocity regime, indicating that these bands could be a source of information flow.
The flat GTE exhibited in the ordered regime is a result of the approximately Gaussian heading of particles relative to their consensus vectors with variance proportional to noise.Although the continuous nature of the SVM cannot be ruled out at this stage, Figure 6: Snapshots from a single simulation demonstrating random walk of heading of high density bands of a flock with N = 1000 particles at high velocity (v = 2.0).Snapshots taken at, from left to right, t = 23 × 10 3 , 24 × 10 3 , 28 × 10 3 , 40 × 10 3 , 47 × 10 3 , 49 × 10 3 .Top row shows the state of the flock, while the bottom row shows the two-dimensional order parameter M -that is, mean particle velocity-for the previous 1000 time steps going from blue (t − 1000) to red (t).Distance from the center of the circle corresponds to the order parameter magnitude M = |M |.Note that, as witnessed by the first two snapshots, the change in heading can be rapid, with only 1000 time steps required for the band to precess π/4 radians.Reprinted from [10]. it seems likely that a "discretised" SVM would display similar behaviour.While it might seem reasonable to have investigated the behaviour of GTE in the continuous equilibrium case for comparison, we note that the obvious contender-the classical XY model-features a BKT phase transition (at least in the 2D case) [27], which would seem to be of an entirely different nature to the transition observed in the 2D SVM model.
It is still a matter of debate whether real-world flocks are poised at criticality.Bialek et al. [13] develop a spin-wave approximation for 3-dimensional flocks of starlings, parametrised from real data.Using this model, along with analysis in [12], Bialek et al. discuss long-range order of the velocity (orientation) and speed fluctuations of the flock.At low noise, there is a spontaneous symmetry breaking of the continuous velocity fluctuations, leaving behind a Goldstone mode [24], wherein there is no energy cost for birds to perform certain changes in flight, which manifests as infinite correlation length [39].
Bialek et al. state, however, that there is no spontaneous symmetry breaking in relation to speed fluctuations, therefore no related Goldstone mode; and hence that long-range order of the flock must be a consequence of criticality.
Melfo [31] instead argues qualitatively that attraction and avoidance rules common in such models do in fact spontaneously break continuous translational invariance, thus giving rise to a Goldstone mode and correlation lengths on the order of the system size.Melfo further states that this should appear in other examples of collective motion-particularly noting the Active-Elastic model of Huepe et al. [26], which achieves scale-free correlation of speed and velocity fluctuations with only attraction and avoidance interaction rules.While we cannot draw any direct conclusions here regarding criticality in real-world flockssince the SVM utilises constant speed (and thus no speed fluctuations)-our work provides a foundation for further comparative studies of information flow as measured by GTE in alternative models that do feature speed fluctuations.

Figure 2 :
Figure 2: T 2Dgl (dark lines) and susceptibility χ (light lines) estimated for a range of particle velocities (parameters as in fig.1).T 2D

Figure 4 :Figure 5 :
Figure 4: T 2D gl estimates according to eq. 6 at fixed velocity v = 0.30 for a range of observation times T as indicated, along with the long-term T LT gl of eq. 9 as per fig. 1. System sizes N as indicated, other simulation details as for previous figures.Note that the blue right triangle line (N = 1000, T = 5000) and orange square line (N = 10000, T = 500) lie on top of each other and represent constant N T .