System design for inferring colony-level pollination activity through miniature bee-mounted sensors

In digital agriculture, large-scale data acquisition and analysis can improve farm management by allowing growers to constantly monitor the state of a field. Deploying large autonomous robot teams to navigate and monitor cluttered environments, however, is difficult and costly. Here, we present methods that would allow us to leverage managed colonies of honey bees equipped with miniature flight recorders to monitor orchard pollination activity. Tracking honey bee flights can inform estimates of crop pollination, allowing growers to improve yield and resource allocation. Honey bees are adept at maneuvering complex environments and collectively pool information about nectar and pollen sources through thousands of daily flights. Additionally, colonies are present in orchards before and during bloom for many crops, as growers often rent hives to ensure successful pollination. We characterize existing Angle-Sensitive Pixels (ASPs) for use in flight recorders and calculate memory and resolution trade-offs. We further integrate ASP data into a colony foraging simulator and show how large numbers of flights refine system accuracy, using methods from robotic mapping literature. Our results indicate promising potential for such agricultural monitoring, where we leverage the superiority of social insects to sense the physical world, while providing data acquisition on par with explicitly engineered systems.

The covariance matrix describing the precision of a position estimate p n = x n y n T given by our flight recorder can be derived from a basic model of honey bee motion. This motion model uses an assumed honey bee speed v and a sequence of heading measurements {γ 0 ,γ 1 , ...,γ n } to produce estimates of honey bee position over time. We present the covariance matrix associated with each sequential position estimate as a running sum of diagonalized matrices, a form that clearly shows the impact of each error source governing the performance of our flight recorder. Several different error sources must be taken into account. First, heading measurements taken by the flight recorder will suffer from quantization error. To account for this error source, we denote the i th heading measurementγ i , and we call the quantization error present in this measurement ε γ,i . Second, the time between measurements will be subject to random variations stemming from noise processes in the integrated circuit at the core of the flight recorder. To capture this source of error, we denote the nominal timestep as∆t and the error in the i th timestep as ε ∆t,i . These errors can be included in the honey bee motion model as follows: where h is a unit vector pointing along the heading of the honey bee. By including these errors, one can derive the covariance matrix characterizing the position estimation precision achieved by the flight recorder.
The derivation begins with the definition of the covariance matrix for position estimate p n = x n y n T : The motion model is nonlinear with respect to ε γ,i , owing to the trignometric functions in h. To simplify analysis, the motion model can first be linearized with respect to ε γ,i : A convenient and illuminating form of the covariance matrix can be achieved by rewriting p n in terms of the in terms of the The expected value of p n can then be written as since the heading discretization error ε γ,i and timing error ε ∆t,i are both zero-mean. The difference between p n and E[p n ] is then: The covariance matrix equation can then be evaluated: This product-of-sums can be rewritten as a sum-of-products: Assuming independence of ε γ,i and ε ∆t,i , the sum can be re written as : This double summation can be separated into two parts: The expectation operator can be applied to each element in the above sum of matrices. Since ε γ,i and ε ∆t,i are zero-mean, the above equation can be rewritten as: Assuming no autocorrelation between successive measurements of ε γ,i and ε ∆t,i , respectively, the expectation operator in the second matrix above becomes separable: By the assumption of zero-mean error variables, all terms in the second matrix vanish, and we are left with the final result: Thus, we obtain the covariance matrix at timestep n as a sum of diagonal matrices conjugated by rotation matrices that are evaluated on the sequence of measured headings.

Section II: Estimated error in straight-line trajectories bounds error for all trajectories of same length
Let {σ * 1,n , σ * 2,n } denote the directed standard deviations computed at timestep n for a straight-line trajectory, and let {σ 1,n , σ 2,n } denote the directed standard deviations computed at timestep n for an arbitrary trajectory. We aim to derive the following pair of results: We begin by deriving σ 1,n and σ 2,n for an arbitrary sequence of measured headings The product of the summand matrices can be computed: Since the directed standard deviations are the square roots of the eigenvalues of Σ Σ Σ n , we wish to examine the eigenvalues of Σ Σ Σ n . These eigenvalues can be computed in the standard manner: |Σ Σ Σ n − λ I| = σ 11,n − λ σ 12,n σ 21,n σ 22,n − λ = 0 (A24) Evaluating this determinant yields the characteristic equation for Σ Σ Σ n : λ 2 − (σ 11,n + σ 22,n )λ + (σ 11,n σ 22,n − σ 12,n σ 21,n ) (A25)

3/7
The eigenvalues of Σ Σ Σ n are thus given by: λ 1,2 = (σ 11,n + σ 22,n ) ± (σ 11,n + σ 22,n ) 2 − 4(σ 11,n σ 22,n − σ 12,n σ 21,n ) 2 (A26) Since every covariance matrix is symmetric, we have σ 21,n = σ 12,n , and we can rewrite the above equation as λ 1,2 = (σ 11,n + σ 22,n ) ± (σ 11,n − σ 22,n ) 2 + (2σ 12,n ) 2 2 (A27) Observing each term in the above equation: We can evaluate the quadratic formula term-by-term: Term I: Term III: The eigenvalue equation then becomes: The eigenvalues λ 1,2 achieve their extreme values when the term ((σ 11,0 − σ 22,0 ) (∑ i cos(2γ i )) 2 + (∑ i sin(2γ i )) 2 ) is maximized with respect toγ i . Theγ i -dependent terms within this expression can be rewritten as: This sum maximizes ifγ i =γ j =γ 0 ∀{i, j}, thus demonstrating that λ 1,2 co-extremize for straight-line trajectories. Since the directed standard deviations are the square roots of the eigenvalues of the covariance matrix, we have therefore shown that straight-line directed standard deviations σ * 1,n and σ * 2,n provide upper and lower bounds on the directed standard deviations for any sequence ofγ i . In the main article, we introduced and discussed an orchard model created in MATLAB 1 for simulating honey bee flights. This section will present further details of our model as shown in Fig. S1. The circles in (Fig. S1-a) represent an aerial view of trees in our MATLAB 1 simulation. As mentioned in the main article, the radii are assigned randomly based on reported tree parameters 2 , with vertical and horizontal spacing also assigned randomly based on reported numbers. The hive is the yellow circle situated at (0, 0). In the main text, we discuss the terms denoting various parts of our model, but in Fig. S1-b, we outline this in an image for the reader. We zoom in on part of the map and overlay a honey bee path, with text describing what each 5/7 item on the map is in terms of our model. We hope this section will further aid a reader in designing a similar simulation setup to that described in this work. The section Honey Bee Motion Model in the main article describes our own investigation into honey bee trajectories. In this section, we present our setup for obtaining that data through Fig. S2. Honey bee facilities at Cornell University, seen in Fig. S2-a, were used for capturing all honey bee flight data. This consisted of a barn with several observation hives, with one selected for recording. A close-up of one observation hive can be seen in Fig. S2-b. Data for behavior near the hive was collected near the barn in Fig. S2-a while data for feeding behavior was collected near the site in Fig. S2-c. Flight data from hive to a feeding source was collected at several points in-between Fig. S2(a & c). Video data was then processed by MATLAB's 1 computer vision toolbox and computations performed to find the data reported in the main article.

Appendix: Reduced Flight Number Study
(a) (c) (b) Figure S3. Example of generated activity maps assuming differing forager counts for Iteration 1, with full (1750 flights) in a) and the estimated maps after 1050 flights (b), and 350 flights (c). Actual feeding sites are marked in magenta. Plots were created in MATLAB 1 .
Since tagging honey bees is often performed by hand in biological studies/settings, we examined the maps produced assuming fewer tagged foragers. This is important in the case a grower does not have the resources necessary to tag larger counts of bees or if tagged bees are lost due to predation, senescence, stress, and other factors. We studied the impact of having fewer instrumented honey bees, 1% and 3% respectively, by sampling 20% and 60% of all generated flights assuming no particular order for scout or regular foraging flights. In Fig. S3, for iteration 1, we see a comparison between full tagged flights ( Fig. S3-a), which assumes 5% of all foragers are tagged, 3% of foragers tagged in Fig. S3-b, and 1% of foragers in Fig. S3-c.
The most striking feature of all three maps is the high activity region. Despite reduced forager counts, we can see similar regions of high activity. We further verify the usefulness of the information contained in these maps by examining the distance 6/7 error in Fig. S4 as performed in the main article. We note that there is little change in this distance metric whether 1%, 3%, or 5% of foragers are tagged. Consequently, the flight data returned by any of these forager counts, when mapped, yields high activity regions within a mean distance of 1.4m of the true location of the feeding sites. This shows our approach can be useful even if there is reduced forager tagging capability.