Introduction

The striking collective phenomena exhibited by social insects and animals have inspired complex systems scientists for decades. The study of the basic principles of their communication and self-organization into large ensembles can be traced back to the seminal work of Wilson on social insects1,2,3. Large collective behavioral structures, such as hives, swarms, bird flocks, foraging ants trails, etc., emerge out of local interactions. The sizes of the resulting complex global structures are several orders of magnitude larger than the scale that the individuals are able to communicate. Prototypical examples are the large trails formed by foragers ants4, the collective defensive reactions of entire bird flocks5,6, the transition from disordered motion to ordered marching in locusts7 or the shimmering phenomena8 seen in giant honeybees as waves rippling across their open nests. These emergent behaviors have an adaptive value, facilitating the species’ survival. Often, the dynamics of these apparently disparate large-scale phenomena exhibit scale-invariant properties both in space and time which are common across the species, an observation that could be studied from the perspective of statistical physics9,10.

It is well established that scale-free spatial and temporal correlation characterize systems near the second-order phase transition11. In the paradigmatic case of the Ising model, when the control parameter, temperature, approaches its critical value long-range correlations emerge between distance lattice sites. The theory of complex systems implies that the information processing capacities of the system, as well as its dynamic range and capacity to generate emergent behavior patterns, are amplified at criticality. Therefore it stands to reason that living systems could be evolutionarily poised to exist in the vicinity of critical point.

The best-known example of a biological system with scale-free correlations and space and time is the vertebrate brain12,13, however, evidence of scale-free correlations14 in animal collectives imply that they might be in a similar state. It should be noted, that the existence of statistical similarities between vertebrate brains and collective entities has wide-ranging philosophical and scientific implications, which largely lie outside the scope of this paper, but are actively explored in the literature15,16.

The application of statistical physics toolbox to biological data has boomed once the data became available. Signatures of critical behavior have been uncovered in flocks of murmurating starlings9, swarms of midges17, mammal collectives18, and fish shoals. It should be noted that honeybees, while being an archetypal example of complex group behavior19 have not been studied from this perspective, a deficiency we hope to address in our work. Modeling work has demonstrated, that fish shoals operate best in the parameter region close to criticality20. At the same time, proximity to the critical region does come with associated costs—small-scale perturbation could be unnecessarily amplified to the detrimental results to the system. Moreover, as biological systems, even those of the same species vary in size, a single combination of preset parameters cannot always result in critical behavior. Therefore, ongoing research investigates mechanisms through which critical state could be controlled and maintained, with current theory leaning towards the view that animal collectives can tune their distance to the critical point depending on the needs of the moment14.

Moreover, the aforementioned ample observations often lack the elucidation of generative mechanisms of scale-free correlations. At the same time, turning to simple models has been fruitful in explaining complex patterns of collective behavior21. Invoking Occam’s razor, one should seek a sufficiently simple explanation. Exploring statistical similarities between the novel system at hand and well-studied systems for which the generative mechanism for the phase transition and its crucial properties have been established is a route that avoids unnecessary complexity. As bees in natural conditions are densely packed, the suitable analogy is the jamming transition22. Such transition occurs in a system of moving agents when a certain density is reached—at this state, a non-trivial relation between the throughput (traffic) of agents and their density emerges—at low-density traffic grows linearly as a function of density, however once critical threshold is reached further increase in density only leads to a decreased throughput. At this density, critical phenomena, such as long-range correlations in time and space appear, stemming from the complex non-linear interactions of agents. Initially developed for car traffic this transition has been studied both in simulation and in the real traffic condition23,24. Concepts of jamming have been applied to study motile biological agents such as cells and bacteria25,26.

In this work we study the collective dynamics of a hive of thousands of bees, searching for clues of their large-scale dynamics, revisiting the data previously acquired by Gernat et al.27 graciously shared with the authors of the current work. The paper is organized as follows: In the next paragraphs, we describe briefly the main features of the data and the methods. Next, we study the univariate aspects of the fluctuations, including occupancy (density) fluctuations, nearest neighbors bees’ interaction distances, and insects’ speed. After that, we proceed to explore collective effects, by calculating spatial and temporal correlations for different coarse grainings of the data being considered. Finally, we investigate, in analogy to traffic dynamics, the relation between average mobility and local density. The paper concludes with a discussion on the possible origin of the scaling found in the hive collective dynamics.

Results and discussion

Data

The full details of the methods are described in Ref.27. In brief, the experiment is conducted with a colony of 1200 1-day-old worker bees with a queen which are placed in a rectangular hive (dimensions \(348 \times 232\) mm) covered by a transparent material. The hive thickness is such that prevents the bees from climbing on top of each other while permitting full observation for recording. The dataset includes the position (the center of mass) of each bee sampled every second. It should be noted that bees could exhibit two types of behaviors that allow them to elude observation and also exclude them from interactions with the rest of the insects in the hive: they could go inside the honeycomb cells to sleep, eat, and take care of the brood or they could walk on the glass cover of the hive. For the first 2 days and two nights of the recording, insects were kept sealed and were supplied with the necessary nutrients (hereafter termed “phase I”). Afterward, bees were allowed to leave at will to scout and forage (“phase II”). During the last days of the trial, forages were not permitted to return to the hive (“phase III”). (The top 16 rows of the honeycomb cells were provisioned with 150 g of honey, and the two rows below the honey cells were provisioned with 15 g of artificial “bee bread”. Each bee was marked with barcodes attached to their thoraxes, enabling the continuous tracking of their positions. The sampling rate of the raw data is 1 s. Five different colonies were recorded, each for several days and nights. Results presented in the paper are computed using data from the second trial (dataset #2)).

Density fluctuations

Figure 1A shows an example of the trajectories of two bees during the 7 h of the first day of the experiment, from which it can be seen that their activity covers the entire hive. This is best appreciated by computing the average density of bees discretizing the hive in \(30\times 20\) squares of side length = 11.6 mm and counting the number of bees in each grid cell at a sampling rate of 5 s. Figure 1B depicts, as a heat map, the average occupancy over the entire experiment. Typically, each bee, unless it is hiding inside one of the honeycomb cells, explores an average distance of \(\sim \)4–5 m every hour (see Panel D) displaying bursts of activity (as described before27) seen in the two examples presented in Panel C.

Figure 1
figure 1

Typical bees displacements in the hive. Panel (A) illustrates the trajectory of two bees during the 7 h of the first day of the experiment. Straight lines represent the approximate position of the bee when it walked on the glass and its exact coordinates are unavailable. Panel (B) depicts, as a heat map, the average occupancy over the entire experiment of the hive (discretized in \(30\times 20\) squares of side length = 11.6 mm). Panel (C) shows two examples of the distances covered in each 60-s intervals by two bees on the first day of the experiment. In the main panel, each point represents the distance covered while the inset shows the cumulative distance as a function of time. Panel (D) illustrates the distribution for all bees of the average distance traveled in 1 h computed for the entire duration of the experiment. The results in the main panel correspond to the statistics computed for the daytime, and those in the inset for the nighttime.

It is well known that bees in the hive usually are highly packed. To study the typical inter-bees interaction length over time we computed (for each bee) its distance to their nearest neighbor (NN) (between their centers of masses) at intervals of 5 s. It can be noted that the mean distance is of the same order as the typical bee’s body length \( \mu = 10\;{\text{mm}} \pm 2 \)) and that exhibits daily fluctuations as shown in Fig. 2A,B. The distribution of nearest neighbor distances that a single bee experiences also fluctuates, as illustrated by the histograms shown in Panels (D) and (E) of Fig. 2, which seem to be approximated by a log-normal function (see insets showing the histograms plotted in log-linear axis). As expected the bees’ displacement speed also exhibits changes. Notice in panel C the circadian fluctuations in speed as the experiment evolves, as well as the one-fold change at the end of the experiment, when the bee density in the hive decreases after the foragers were not allowed to return to the hive.

Figure 2
figure 2

Bees nearest neighbor distances over time. Panel (A) shows the mean \(\mu \) and variance \(\sigma ^2\), computed every 2 h for all bees for the entire duration of the experiment. The vertical dashed-dotted line at 48 h indicates the sunrise of the third day when the hive entrance was open and bees were able to go out. The vertical black dashed lines at 149 and 158 h indicate the moments at which foragers removal has begun/ended. Notice that the peaks in both the mean distance and its variance coincide approximately with each day’s sunrise. Panel (B) shows the bees motion polarization \(\Phi \) (i.e., Eq. 1), computed every 5 s and plotted here as averages of 2 h. Panel (C) shows the mean speed of all bees in the hive computed every 5 s and displayed as 2-h averages throughout the entire experiment (black circles). In addition, the number of bees detected in the field is plotted (red circles). Panels (D) and (E) Histogram of the nearest neighbor distances for a single bee, in linear axis in the main plots and semilogarithmic axis in the insets. The results on panel (D) correspond to the first 48 h of data and those in panel (E) to 48 h after the foragers’ removal for one of the bees that remained in the hive. The cumulative distribution of speeds for all the bees detected are depicted in panel (F); for the period preceding the hive opening (i.e., dashed vertical line at 48 h in main panels, “phase I”), for the period of time after the foragers removal ended (i.e., after the hour 158, “phase III”) and for the intermediate period, (i. e., between the hour 48 and hour 158 ”phase II”).

Now we turn to explore the collective properties of the fluctuations described. The first indication is given by the fluctuations in the polarization \(\Phi \), a parameter commonly used to quantify the degree of global order in collective motion17. This metric ranges from 0 to 1 and is higher when directions of individual velocity vectors \(\vec {v_{i}}\) are aligned.

$$ \Phi = \left\| {\frac{1}{N}\sum\limits_{{i = 1}}^{{i = N}} {\frac{{\overrightarrow {{v_{i} }} }}{{||v_{i} ||}}} } \right\| $$
(1)

By simple visual inspection of the traces in Fig. 2 it can be noticed that the increases in the nearest neighbor variance are associated with an increase in the polarization (although less obvious it seems that the increases in polarization seem to precede the spread of the bees nearest neighbor distances). Also is worth mentioning that polarization and speed seem to hold an inverse relation, an observation that will be commented later on. These results show that more packed conditions (i.e., smaller NN distances) and less polarized collective motion came together with slower speeds on average. Also during more polarized motion bees spread apart only slightly on the average (observe only a small change in the mean NN distance) but with notable non-uniformity.

Another approach to reveal emergent properties analyzes different statistics computed over increasingly larger ensembles. The results of this type of finite-size scaling analysis can distinguish collective properties when compared with a known null hypothesis in which there are no collective effects. For that, we use the time series of bee densities x(t) recorded at each of the 600 square grids. After constructing ensembles of increasing size N we computed the statistics of \(f(t)=1/N \sum _{i=1}^N x(t) \) for several different stochastic combinations. The average results on Fig. 3A demonstrate a manifest anomalous scaling of the f(t) variance \(\sigma ^2\) with increasing N, since it remains almost constant. This scaling is in clear contrast with the 1/N behavior of an independent hypothesis constructed by random scrambling of the same time series. Together with the polarization calculations already commented on, this result is an indication of collective behavior since it shows that the density fluctuations inside one grid are not independent of the other grids fluctuations. In addition, we explored the so-called fluctuation scaling28, also known as Taylor law29, which considers the proportionality between the mean and the variance of the fluctuations in each grid. The results shown in Fig. 3B indicate that the process can not be due to independent random processes since it shows excess variance with respect to the Poisson hypothesis, demonstrated by the exponent \(\alpha > 1/2\) in the scaling \(\sigma ^2 \propto \mu ^ {-\alpha } \).

Figure 3
figure 3

Fluctuation scaling of the bees density reflects the presence of correlations. Panel (A) Finite-size scaling of the variance as a function of the ensemble size N. Each point corresponds to the variance computed from density fluctuations of ensembles of N grid units (averages of 30 random realizations). The fluctuation of the raw data (black circles) exhibits a relatively constant variance while the randomized data (red diamonds) follows 1/N scaling as expected from the law of the large numbers. Panel (B) Taylor Law. The variance \(\sigma ^2\) of the density fluctuations in each grid site as a function of its average \(\mu \) does not follow the proportionality of an independent uncorrelated process. The scaling can be approximated by two power-laws with exponents: \(\alpha _1\propto 0.89\) and \(\alpha _2 \propto 0.59\). Crossover takes place near \(\log _{10}\mu =-0.22\). Dashed lines correspond to slopes 1 and 1/2 and serve for visual reference. Both panels represent the data from Phase I.

Density–density correlations

Information relevant to untangle collective phenomena usually can be grasped from the computation of the correlation functions in time (2) and in space (3)

$$ AC(k) = \frac{{\sum\nolimits_{{t = k + 1}}^{T} {(x_{t} - \langle x\rangle )(x_{{t - k}} - \langle x\rangle )} }}{{\sum\nolimits_{{t = 1}}^{T} {(x_{t} - \langle x\rangle )^{2} } }} $$
(2)
$$ C(r) = \frac{{\sum {(x_{i} (} r) - \langle x(r)\rangle )(y_{i} (r) - \langle y(r)\rangle )}}{{\sqrt {\sum {(x_{i} (} r) - \langle x(r)\rangle )^{2} } \sqrt {\sum {(y_{i} (} r) - \langle y(r)\rangle )^{2} } }} $$
(3)

between the occupancy time series. In particular, the correlation decay as a function of distance, as presented in Fig. 4A, where it can be seen that the fluctuations between two points in the hive are correlated an order of magnitude beyond the spatial scale the bees interact (\(\sim 8\) mm, see Fig. 2).

Figure 4
figure 4

Spatial and temporal correlations of the occupancy fluctuations are long-range and dependent on bee density. Panel (A) Average correlation function C(r) of the density fluctuations versus distance r in mm. Dashed lines indicate fitted slopes, which are equal to \(-0.5\), \(-0.3\), and \(-0.75\) for phases I, II, and III respectively. Panel (B) Average first coefficient of the autocorrelation function \(1-AC(1)\) as a function of the size of the ensemble N (number of grid sites) considered. Each point represents the average of 30 stochastic realizations. Circles represent the results computed from the raw data, while the diamonds correspond to the computation using the null hypothesis constructed by a circular random shifting of the timeseries. Note that the autocorrelations in the null hypothesis data remain constant irrespective of the ensemble size N, while in the raw data AC(1) increases as a power law as the size N of the ensemble increases. Dashed lines represent slopes, which are equal to \(-0.84\), \(-1.15\), and \(-0.7\) for the respective phases. Panel (C) shows the bees’ displacement (in analogy to “traffic”) as a function of density. The color distinguishes different phases of the experiment: brown (phase I) before the hive was open, black (phase III), after foragers have been removed from the hive and blue the remaining part (phase II). The red line with white circles represents the binned average (± SD) computed for all the points irrespective of the phase. The histograms at the bottom of the panel (C) represent the computed probability that such density was observed for each phase, providing an estimation of the residence time of the hive with those conditions.

It is expected, as in other complex systems near a critical point, that the properties of the correlations in space reflect into the temporal correlations30,31. Thus, in Fig. 4B we estimate the behavior of the autocorrelation function as a function of the ensemble size N. The results show that the temporal correlations of larger ensembles become asymptotically longer, i.e., 1-AC(1) vanishes, in contrast with the null hypothesis results (depicted with diamonds) which remain larger and constant. AC(1) is used as a proxy for the entire autocorrelation function30, in this computation, additional information concerning the scaling of the autocorrelation function with time is contained in the supplementary materials to the paper.

In addition, the correlation results in Panels A and B show an important feature: when the density in the hive decreases in phase III (due to the removal of foragers) the correlation properties seem to change to a regime that can be considered of relatively larger independence. Indeed, on the spatial aspects, the correlations shift to a weaker and shorter range (see filled circles in Panel A) which agrees with the temporal aspects where the autocorrelations become also weaker and shorter (see filled circles and diamonds in Panel B) (Additional information concerning numerical aspects of presented results is contained in the supplementary material).

We conjecture that the presence of density-dependent correlations may shed light on the mechanism responsible for the collective dynamics. To test that we computed an observable alike to traffic by first partitioning the timeseries into 5000-s intervals and computing the average occupancy of each grid square. For the same intervals, we compute the throughput of bees by dividing each square with a midline and counting the occurrences of bees crossing this line in either direction. The results, presented in Fig. 4, unveil a pattern that is interpreted as follows: For low values, the traffic is a linear function of occupancy. This is true for all phases when the occupancy within the interval is lower, but it is especially evident if only the data collected in phase III (black dots) is considered because there is one fold reduction in density due to the foragers removal (see also Fig. 2). For higher densities, there is a maximum in the traffic, beyond which any further increase in density leads to a reduction in the bee’s traffic. Considering the typical size of a bee (\(\sim 10\) mm) and the mean distance to the nearest neighbor computed (\(\sim 7\)–9 mm, see Fig. 2A) the observed maximum traffic occurs near a regime in which bees would only be able to move as a solid-like ensemble.

The changes in the traffic and the correlation properties seen in Fig. 4, together with the changes in the nearest neighbor distances shown in Fig. 2A seem to be interrelated. As a first hypothesis, these results may be explained by a process of jamming-unjamming as described in a variety of systems from car traffic22 to cells25. Jamming is a self-organized process found both in living and inert systems in which even a small change in the control parameter, such as density, can induce big changes in the macroscopic properties of the system or material. Slow power-law decay of correlation function with distance has been empirically demonstrated for car traffic in major cities24.

We should note, that although we posit the jamming transition as the generative mechanism for the scale-free correlations, as discussed above, we do not assume that it is an exhaustive explanation for all phenomena observed in the data. Thus it is evident from Fig. 2 that rhythmic oscillations of N.N. distance, mean speed, and Polarization appear in Phase II after the hive was open permitting foragers to go in and out. These fluctuations are correlated with the day–night cycle and have been likely caused by the interaction of the beehive with the surrounding environment.

Speed correlations

Additional evidence is provided by the results in Fig. 5, which presents the correlations in speed, computed for all bees present in the hive during different phases of the experiment. Speed is computed as \(v=||\frac{\delta \vec {r}}{\delta t}||\); \(\delta t =5\) s. Distance-binned speed correlation coefficient \(C_{sp}(r_1, r_2)\) for each bin is computed as a mean correlation coefficient between the speed time series \(v_i, v_{j}\) of all bee pairs located at mutual distance r: \(r_1<r<r_2\), were \(r_1\) and \(r_2\) are the bin’ edges (4).

$$\begin{aligned} C_{sp}(r_1, r_2)= & {} \frac{1}{\sum _{i\ne j}\theta (||\vec {r_{i}}-\vec {r_{j}}||)}\sum _{i\ne j}\frac{\sum _{1}^{t}(v^t_i-\langle v_i\rangle )(v^t_j-\langle v_j\rangle )\theta (||\vec {r^t_i}-\vec {r^t_j}||)}{\sqrt{\sum _{1}^{t} (v^t_i-\langle v_i \rangle )^2}\sqrt{\sum _{1}^{t}(v^t_j-\langle v_j \rangle )^2}} \nonumber \\ \theta= & {} {\left\{ \begin{array}{ll} 1 &{} \text {if } r_1<||\vec {r_i}-\vec {r_j}||<r_2\\ 0 &{} \text {otherwise } \end{array}\right. } \end{aligned}$$
(4)
Figure 5
figure 5

The speed spatial correlations fluctuations are also long-range and dependent on bee density. Speed correlation function: Distance-binned correlations in speed computed using 48-h periods at the three phases of the experiment. (30 mm bins are used, and median values of the bins are used to represent distance on x-axis).

Results demonstrate that the spatial correlations in speed are only long-range (i.e., decaying slowly with distance as a power law) for when the bee density in the hive is high enough (Phases I and II) and become short-range (i.e., decaying faster) when the density of bees is reduced in Phase III such that their interactions are weaker.

Conclusion

In summary, we have shown that bees in the hive exhibit complex collective dynamics that involve scale-free spatial and temporal correlations. We conjecture possible mechanisms, through which such dynamics could emerge. Several routes are open for further study including appropriate modeling efforts concerning the dynamics of jamming of active particles in 2D. Jamming transitions are well investigated with respect to traffic phenomena, however, one would expect additional effects coming from behavior germane to eusocial insects, as suggested previously32.

The correlation properties observed for the fluctuations in the hive might indicate that the system is tuning itself towards an optimal state by varying its control parameter—bee density. This critical state corresponds to a density that provides maximum throughput (i.e., peak at Fig. 4). It is striking that such throughput optimization is very similar to what car drivers experience in daily traffic, including the presence of long-range correlations in speed and occupancy, much larger than the short-range interactions. The present results call for modeling attempts to validate this conjecture.