Animal movement is a highly complex process driven by various random and deterministic mechanisms involving a large number of causing factors1,2. It has been proposed that spatiotemporal patterns in movement may arise from moving strategies that evolve to optimise foraging efficiency3,4, decision-making processes in response to external stimuli5, environmental conditions or landscape features6,7,8, collective dynamics and social interactions9,10, memory and home-return behaviour11,12, just to name a few. Fully unravelling the complexity of animal movement as well as sorting out the intricate relations between the observed spatiotemporal pattern and various underlying causing factors remains a difficult scientific challenge.

For over a century, our attempts to understand animal movement have been limited to a qualitative level due to the lack of high-quality data that can provide fine-grained spatiotemporal description of movement2. Recently, new tracking technologies such as the Global Positioning System (GPS) have been deployed in animal tracking to obtain continuous time-resolved moving trajectories with high spatiotemporal resolution. The emerging high-quality movement data enables the application of quantitative analysis and mathematical characterization on mobility patterns at different spatiotemporal scales, which provides new insights into possible factors that drive movement decisions.

One common approach to analyzing animal movement is to represent the time-resolved trajectory as discrete moving steps under the framework of random walks13,14,15. In this context, the dispersal kernel p(r) in space, which characterizes the general distribution of step length r in the trajectory, is considered to be a significant footprint of movement16,17. The detailed functional form of p(r) is indicative of a specific type of random walk and the underlying dynamics of movement. For example, an exponential kernel function p(r) ~ exp(−r/rc), with rc being the characteristic length scale, is the signature of the classic Brownian walk that obeys the central-limit theorem and exhibits a normal-diffusive pattern. A scaling dispersal kernel characterized by a power-law function p(r) ~ r^(−γ), with γ being the scaling exponent, is the signature of the Lévy walk, which exhibits high heterogeneity and a super-diffusive pattern. Much effort has been devoted to studying the dispersal kernel p(r) for different animal species using real movement data, from small insects like honey bees18, marine life like jellyfish and whales19 and birds like albatrosses20,21, to mammals like monkeys22 and humans17,23. In particular, a controversial topic that has attracted considerable attention is whether the observed movement follows a Brownian-like motion or a Lévy walk. Although many studies have shown strong evidence for the existence of Lévy walks in animals, it has been argued that this evidence may come from statistical artifacts or inappropriate manipulation of data24,25,26, suggesting the necessity of using high-resolution data and robust statistical methods to validate the characterization of movement patterns.

Despite its importance in characterizing movement, the dispersal kernel p(r) only provides partial information on the spatial pattern and does not capture all important aspects of animal movement, such as the temporal spectrum that depicts the switch between different activity modes over time. Recently it has been found that scaling phenomena in movement can also arise in the waiting time distribution p(τw) that characterizes the time span of non-moving periods, or the inter-event time distribution p(τe) that characterizes the time between two successive moving activities11,17,27. These findings suggest there is a need for more detailed investigation of the spatiotemporal pattern in movement beyond the dispersal kernel.

Here we use a dataset of high-frequency GPS samples to study the movement of grazing cattle. In contrast to most previous studies on animal movement, which focus only on the dispersal kernel or on statistics for one specific activity mode such as the waiting time, the high-resolution trajectories in our dataset allow us to classify activities along the trajectory and gain more comprehensive insight into the spatiotemporal pattern in each activity mode. In particular, we use a two-state ‘stop-and-move’ model to describe the mobility pattern, dividing the trajectory into alternating moving and non-moving states (see Material and Methods). The non-moving state indicates that the animal remains within a radius Δr in space for at least Δt in time, where Δr represents the spatial resolution limit of the observation and Δt is a tunable threshold parameter that specifies the minimum time span. A non-moving segment in the trajectory can be viewed as a single point in space, which we call a waiting location, with a duration τw ≥ Δt, which we correspondingly call the waiting time. On the other hand, the moving state indicates that the animal is in transition from one waiting location to another, which can be described as a trip (l, τm) in the trajectory, with l being the distance between the two waiting locations and τm being the elapsed time of the trip. The representation of the mobility pattern in this approach is shown in the schematic diagram of Fig. 1.

Figure 1

A schematic diagram of the spatiotemporal pattern under the two-state ‘stop-and-move’ representation.

(a) The temporal spectrum of activities illustrated in a spike train. Colour segments on the time-axis represent alternating waiting (white) and moving (red) activities over time. (b) The raw trajectory of an individual cow before processing. The two-dimensional x-y plane here represents the grazing area (in meters). (c) The spatial pattern extracted from the raw trajectory in panel (b) can be projected as a transition graph, where the waiting locations for non-moving segments are represented by red dots and the trips are represented by blue solid lines.

Under this representation, we observe a spatiotemporal pattern in which the two activity states have distinct statistical characterizations. In particular, we find that the dispersal kernel or trip length distribution p(l) is best described by a mixture exponential distribution, which indicates that the trajectory has a two-level hierarchical structure in space and each level appears to follow a Brownian walk. This is in contrast to the widely observed Lévy walk patterns in other species. Despite the absence of a scaling law in the spatial dispersal, we find that the waiting time distribution p(τw) in the time domain is best described by a truncated power-law. Possible underlying mechanisms and ecological implications accounting for these phenomena are discussed (see Discussion).

Understanding grazing/foraging animal movements is not only a critical issue in biological science but also of fundamental importance to many practical issues such as farm and livestock management28, the maintenance of biodiversity in ecosystems29 and the development of better tracking30 and virtual fencing technologies31. Our results provide new quantitative insights into grazing cattle movement that are largely lacking in most previous work. Our statistical characterization of the multi-modal mobility pattern is useful for understanding the biological basis of the complex grazing behaviors as well as the underlying driving factors behind these behaviors. The simple two-state model, along with the statistics extracted for each activity state, can also be used as a building block to develop more realistic mobility simulation platforms that can benefit disease spread modelling17 as well as the design of virtual fencing systems31.


Dispersal kernel in moving state

We first turn our attention to the spatial dispersal kernel or the trip length distribution p(l) over the whole population. Obtaining the functional form of p(l) directly from empirical data requires a binning process, which is known to introduce statistical distortion for data with a broad distribution26. To avoid the disadvantage of binning, we use the complementary cumulative distribution P(l) for statistical analysis. We process the data using three different parameter sets with Δr = 5 m, corresponding to the resolution limit of the positioning device, and Δt = 1, 2 and 5 mins respectively32. To describe the cumulative distribution P(l) shown in Fig. 2a–c, we consider four commonly used candidate models8: (1) power-law; (2) truncated power-law with exponential cut-off; (3) exponential; (4) mixture exponential. Using maximum-likelihood estimation (MLE) to fit the candidate models and the Akaike information criterion (AIC) for model selection24 (see Supporting Information), we find that the best model to describe P(l) is the mixture exponential

P(l) = q exp[−(l − lmin)/l1] + (1 − q) exp[−(l − lmin)/l2]    (1)

Figure 2

The statistics for the moving state with Δr = 5 m.

Panels (a–c) are the cumulative distributions P(l) for trip length with Δt = 1, 2 and 5 mins (from left to right). Panels (d–f) are the cumulative distributions P(τm) for trip time with Δt = 1, 2 and 5 mins (from left to right). The solid red lines represent the best-fitting mixture exponential obtained by the maximum-likelihood method using an expectation-maximisation algorithm.

where l1 and l2 are the characteristic lengths of the two mixture components, q is a parameter specifying the mixture proportion and lmin = Δr = 5 m is the lower bound of observation. Another significant statistical feature of the moving state is the trip time distribution p(τm), which is also best described by the mixture exponential model, as shown in Fig. 2d–f. This is consistent with our expectation that the trip time τm is strongly correlated with the trip length l.
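The fitting and model-selection step described above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes trip lengths have already been shifted by lmin (so `y = l − lmin ≥ 0`), uses a plain expectation-maximisation loop with a fixed number of iterations, and compares only the single and mixture exponential models. The function names, the starting values and the iteration count are hypothetical choices.

```python
import numpy as np

def fit_exponential(y):
    """MLE for a single exponential on shifted lengths y = l - l_min."""
    y = np.asarray(y, dtype=float)
    scale = y.mean()                         # closed-form MLE of the length scale
    loglik = -len(y) * (np.log(scale) + 1.0)
    return scale, loglik

def fit_mixture_exponential(y, n_iter=500):
    """EM fit of q*Exp(s1) + (1-q)*Exp(s2); starting values are an assumption."""
    y = np.asarray(y, dtype=float)
    q, s1, s2 = 0.5, 0.5 * y.mean(), 2.0 * y.mean()
    for _ in range(n_iter):
        f1 = q * np.exp(-y / s1) / s1
        f2 = (1.0 - q) * np.exp(-y / s2) / s2
        w = f1 / (f1 + f2)                   # E-step: responsibility of component 1
        q = w.mean()                         # M-step: mixture proportion
        s1 = (w * y).sum() / w.sum()         # M-step: weighted mean lengths
        s2 = ((1.0 - w) * y).sum() / (1.0 - w).sum()
    density = q * np.exp(-y / s1) / s1 + (1.0 - q) * np.exp(-y / s2) / s2
    return (q, s1, s2), np.log(density).sum()

def aic(loglik, n_params):
    """Akaike information criterion; lower is better."""
    return 2 * n_params - 2 * loglik

def akaike_weights(aics):
    """Relative support for each candidate model given its AIC value."""
    delta = np.asarray(aics, dtype=float) - np.min(aics)
    w = np.exp(-0.5 * delta)
    return w / w.sum()
```

With fitted log-likelihoods in hand, `akaike_weights([aic(ll_exp, 1), aic(ll_mix, 3)])` gives the relative weight of each model; the power-law and truncated power-law candidates would need their own likelihood functions.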

The mixture exponential here indicates that the spatial pattern of grazing cattle is governed by two different Brownian-like dynamics with different characteristic scales, suggesting a hierarchical structure of the movement. If we consider the landscape to be formed by a number of patchy areas, the first exponential distribution represents the short-range movement that occurs within a patch and the second exponential distribution, with a larger characteristic length, represents inter-patch movements. To better reveal this hierarchical structure in the mobility pattern, we cluster the waiting locations using the density-based clustering algorithm DBSCAN, which is efficient in discovering significant clusters of irregular shape from noisy data points, as shown in Fig. 3. After grouping the waiting locations into clusters, the trips in the mobility pattern fall into two categories: intra-cluster trips and inter-cluster trips. We find that the trip length distribution for each of these two types of trips can be well described by a single exponential distribution, as shown in Fig. 4. It is worth noting that the mixture exponential still has the highest AIC weight among the four candidate models for the intra-cluster trip length distribution, while the single exponential has the highest AIC weight when the mixture exponential is excluded. However, the difference between the two components in the mixture model is comparatively small (l1 = 11.20, l2 = 22.05, q = 0.44, Δt = 2 mins), suggesting that the two components are not strongly distinguishable and a single exponential is a reasonable alternative model in this scenario. We also observe that some long-distance inter-cluster trips are highly similar to one another, indicating that transitions from one cluster to another are not completely random and spontaneous, but could be driven by a deterministic process such as memory or herding.
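The clustering and trip-labelling step can be illustrated with a compact, self-contained DBSCAN. This is a sketch under stated assumptions rather than the implementation used in the study: the neighbourhood radius `eps` and the density threshold `min_pts` are hypothetical parameters whose values are not given in the text, and the rule that any trip touching an outlier counts as inter-cluster is also an assumption.

```python
import numpy as np

def dbscan(points, eps, min_pts):
    """Label each point with a cluster index, or -1 for outliers (noise)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    # Each neighbourhood includes the point itself.
    neighbours = [np.nonzero(dist[i] <= eps)[0] for i in range(n)]
    labels = np.full(n, -1)
    visited = np.zeros(n, dtype=bool)
    cluster = 0
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        if len(neighbours[i]) < min_pts:
            continue                      # not a core point; may become a border point later
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster       # core or border point joins this cluster
            if not visited[j]:
                visited[j] = True
                if len(neighbours[j]) >= min_pts:
                    queue.extend(neighbours[j])   # expand only through core points
        cluster += 1
    return labels

def classify_trips(labels):
    """Trip i goes from waiting location i to i+1 (in time order)."""
    labels = np.asarray(labels)
    a, b = labels[:-1], labels[1:]
    return np.where((a == b) & (a != -1), "intra", "inter")
```

The pairwise distance matrix keeps the sketch short but uses memory quadratic in the number of waiting locations, so a spatial index would be preferable for large datasets.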

Figure 3

The visualisation of clusters extracted by the DBSCAN algorithm and the corresponding inter-cluster trips.

Dots with different colours represent different clusters (the lightest colour represents outliers). Blue solid lines indicate inter-cluster trips.

Figure 4

The trip length distribution for intra-cluster movements and inter-cluster movements.

Both are well described by a single exponential distribution p(l) ~ exp(−l/l0) (red solid lines). The black solid line in panel (a) indicates the mixture exponential fit using Eq. 1. The mixture exponential is still the best candidate model for the intra-cluster trip length distribution according to the AIC weight. However, the difference between the two components in the mixture model is comparatively small, suggesting that a single exponential could be a reasonable alternative.

Waiting time distribution

To characterise the waiting time distribution, we compare three different models, namely exponential, power-law and truncated power-law with exponential cut-off (see Supporting Information). We observe that the waiting time distribution p(τw) is best described by a truncated power-law distribution

p(τw) ~ τw^(−γ) exp(−τw/τc)    (2)

with γ being the scaling exponent and τc being the structural cut-off. As shown in Fig. 5, varying the threshold parameter Δt in data processing does not affect the emergence of scaling phenomena. The scaling law in the waiting time distribution is indicative of the heterogeneous grazing dynamics of cattle, which could be related to the landscape heterogeneity, the complex decision-making dynamics or the energy management of movement (see Discussion). We also find that the waiting time distributions in the main clusters discovered by the DBSCAN algorithm are all well described by a truncated power-law, suggesting that the scaling behaviour in the waiting time distribution is invariant at the cluster level.
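Part of this comparison can be sketched in a few lines. The example below is illustrative only: it covers the pure power-law and the exponential candidates, for which the maximum-likelihood estimates have closed forms, and leaves out the truncated power-law, whose normalisation requires numerical evaluation. The lower bound `tau_min` (for instance Δt) and the function names are assumptions made for the illustration.

```python
import numpy as np

def fit_power_law(tau, tau_min):
    """MLE of gamma for p(tau) ~ tau^(-gamma) on tau >= tau_min (continuous case)."""
    tau = np.asarray(tau, dtype=float)
    log_ratio = np.log(tau / tau_min)
    gamma = 1.0 + len(tau) / log_ratio.sum()
    loglik = len(tau) * np.log((gamma - 1.0) / tau_min) - gamma * log_ratio.sum()
    return gamma, loglik

def fit_shifted_exponential(tau, tau_min):
    """MLE of the time scale for an exponential starting at tau_min."""
    tau = np.asarray(tau, dtype=float)
    scale = tau.mean() - tau_min
    loglik = -len(tau) * (np.log(scale) + 1.0)
    return scale, loglik

def aic(loglik, n_params):
    """Akaike information criterion; lower is better."""
    return 2 * n_params - 2 * loglik
```

Both candidates here have one free parameter, so the AIC comparison reduces to comparing log-likelihoods; adding the truncated power-law would introduce a second parameter for the cut-off.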

Figure 5

The statistics for the non-moving state with Δr = 5 m.

Panels (a–c) are the cumulative distributions P(τw) for waiting time with Δt = 1, 2 and 5 mins (from left to right). The solid red lines represent the best-fitting truncated power-law obtained by the maximum-likelihood method.

Without taking into account correlations between activities, the temporal spectrum of the mobility pattern can be approximated as a two-state renewal process in which the time spans of the alternating moving and non-moving activities are drawn at random from the distribution functions p(τm) and p(τw) respectively. To test the validity of this approximation, we measure the pairwise Pearson correlation coefficient between the time spans of consecutive activity segments in the following four situations: (1) a non-moving segment and the next moving segment (r = −0.0344, p = 0.0363); (2) a moving segment and the next non-moving segment (r = −0.0616, p = 0.000174); (3) two consecutive non-moving segments (r = 0.0787, p = 1.58 × 10^−6); (4) two consecutive moving segments (r = 0.0719, p = 1.24 × 10^−6). None of these shows a substantial correlation: although the p-values are small, the magnitude of r is below 0.08 in every case. The result suggests that short-range correlation in the temporal spectrum is negligible, i.e. the time span of the previous activity has little influence on the time span of the next activity, and the temporal dynamics can be approximately described by a two-state renewal process, setting aside possible long-range correlation.
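The four pairings tested above can be computed as in the following sketch. It assumes the activity sequence has already been split into two aligned arrays, with `wait[i]` followed by `move[i]`, which is in turn followed by `wait[i + 1]`; this layout and the function names are illustrative assumptions. Only the correlation coefficient r is computed; p-values would require an additional statistical routine.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    return float(np.corrcoef(np.asarray(a, dtype=float), np.asarray(b, dtype=float))[0, 1])

def consecutive_correlations(wait, move):
    """Correlations between durations of neighbouring activity segments."""
    n = min(len(wait), len(move))
    wait = np.asarray(wait[:n], dtype=float)
    move = np.asarray(move[:n], dtype=float)
    return {
        "wait->move": pearson(wait, move),            # (1) stop, then the next trip
        "move->wait": pearson(move[:-1], wait[1:]),   # (2) trip, then the next stop
        "wait->wait": pearson(wait[:-1], wait[1:]),   # (3) two consecutive stops
        "move->move": pearson(move[:-1], move[1:]),   # (4) two consecutive trips
    }
```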

Individual mobility pattern

The population-based statistics presented above are not necessarily representative of the individual patterns. It has been suggested that statistics aggregated over a population may differ in character from their individual counterparts. For example, the Lévy walk pattern observed in a population may arise from individual heterogeneity33. To test whether the individual pattern is consistent with the population-based statistics, we use the same model selection procedure to fit the individual statistics (Δr = 5 m, Δt = 2 mins). We find that the trip length and trip time distributions for each individual are best described by the mixture exponential, with only one exception in the trip length distribution. On the other hand, the waiting time distribution for each individual is best described by a power-law or truncated power-law (see Supporting Information). This suggests that the composite Brownian walk in space and the scaling law in the waiting time distribution are not statistical artefacts due to the mixture of different individual patterns, but appear to be universal for all individuals. Although all individual spatiotemporal patterns are best described by the same distribution functions, the fitted parameters vary from individual to individual. For example, the exponents of the truncated power-law for the waiting time distribution estimated by the maximum-likelihood method range from γ = 1.6 to γ = 2.5. This indicates that the internal properties encapsulated by the scaling exponent γ differ among individual cows, although their activities appear to be governed by the same dynamics.


In this study we have found that under a two-state ‘stop-and-move’ representation the spatiotemporal pattern of grazing cattle exhibits a hierarchical structure in space and an asymmetric temporal spectrum, which can be described by a composite Brownian walk interspersed with power-law distributed non-moving periods. This finding is in contrast to the patterns observed in human27 and T-cell34 mobility, where the moving and non-moving states are both characterised by a scaling law. Since detailed statistical characterisation of free-range animal movement based on high-frequency GPS trajectories is still largely missing, this finding can provide new perspectives on grazing animal movement and useful leads to the underlying ecological basis of grazing behaviour.

A simple deterministic scenario that can give rise to the observed scaling law in the waiting time distribution is that the environment is structured according to the same heterogeneous statistics. We can consider that different locations in the landscape are of different quality or resource abundance, which can be described by a quality function. If the cattle simply spend time feeding at each location in proportion to the quality at that location, i.e. a ‘greedy’ strategy, the distribution of the quality function would be the observed waiting time distribution.

Stochastic processes and spontaneous behaviour can also account for the observed spatiotemporal pattern. Recently, a plausible decision-based queueing process in which the animal executes activities from a stochastic priority list has been used to interpret the scaling law observed in the waiting time of marine predators19. This model was originally proposed to explain the power-law distributed inter-event time observed in human communication patterns23. Specifically, the model assumes that, on a regular basis, the animal performs one of the two activities, waiting or moving, with probability x1 and x2 = 1 − x1 respectively, where x1 and x2 are the priorities of the activities, drawn from a random distribution p(x). If the animal moves, it changes its context and therefore its likelihood to move or stay also changes. As a result, the priority is redrawn from the random distribution, representing the change of state due to the movement. This model can generate the power-law distribution in waiting time as well as the exponential distribution in step size. By introducing a deterministic component to the decision probability, the model can also be tuned to generate different scaling exponents γ, accounting for the various scaling phenomena in different species. The model has been recast in a dynamic prey-predator environment where the moving probability x2 can be interpreted as the likelihood of finding prey in the vicinity.
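A short worked example shows why redrawing the priority after each move produces a heavy-tailed waiting time. The calculation below is a sketch under assumptions that are not stated in the text: decisions are taken at discrete steps, the waiting priority x1 is held fixed until the next move, and p(x) is uniform on [0, 1]. Here k denotes the number of consecutive waiting steps before the animal moves.

```latex
% For a fixed priority x_1 = x the number of waiting steps is geometric:
%   P(\tau_w = k \mid x) = x^{k}(1 - x), \qquad k = 0, 1, 2, \dots
% Averaging over a uniform priority distribution p(x) = 1 on [0, 1]:
\begin{aligned}
P(\tau_w = k) &= \int_0^1 x^{k}(1 - x)\,\mathrm{d}x
              = \frac{1}{k+1} - \frac{1}{k+2}
              = \frac{1}{(k+1)(k+2)} \\
              &\sim k^{-2} \quad \text{for large } k .
\end{aligned}
```

Under these assumptions the exponent is 2; other choices of p(x), or a deterministic component in the decision probability, would change the exponent, in line with the tunability described above.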

We can also consider the movement as a two-state point process, in which the probabilities that the animal switches its state are qA (from moving to non-moving) and qB (from non-moving to moving)35,36. It is well known that the state duration is exponentially distributed when the switching probability is constant and independent of time36. Recently, it has been suggested that power-law distributed durations can be attributed to reinforcement dynamics, in which the switching probability is inversely proportional to the time that the animal has spent in its current state, i.e. the longer the animal stays in its current state, the less likely it is to change it35,37,38.
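The link between the switching probability and the duration distribution can be made explicit with a standard survival calculation. The continuous-time formulation below is an illustrative assumption, as are the symbols q(s) for the switching rate after a time s in the current state, t0 for the shortest observable duration and α for the proportionality constant of the reinforcement rule.

```latex
% Probability of still being in the same state after time t:
%   S(t) = \exp\!\Big(-\int_{t_0}^{t} q(s)\,\mathrm{d}s\Big), \qquad p(t) = -\frac{\mathrm{d}S}{\mathrm{d}t}
\begin{aligned}
q(s) = q_0 \ \text{(constant)} \quad &\Rightarrow\quad
  S(t) = e^{-q_0 (t - t_0)}, \qquad p(t) = q_0\, e^{-q_0 (t - t_0)} \\
q(s) = \frac{\alpha}{s} \ \text{(reinforcement)} \quad &\Rightarrow\quad
  S(t) = \Big(\frac{t}{t_0}\Big)^{-\alpha}, \qquad
  p(t) = \frac{\alpha}{t_0}\Big(\frac{t}{t_0}\Big)^{-(\alpha + 1)}
\end{aligned}
```

A constant rate therefore gives exponentially distributed durations, while a rate that decays as 1/s gives a power law with exponent α + 1.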

Another explanation is to associate the movement pattern with the energy state of the animal using a maximum entropy approach. In this context, each moving and non-moving activity is associated with a certain amount of energy loss El or energy gain Eg. According to the maximum entropy principle, the distribution of El and Eg over all activity segments should follow a Boltzmann distribution (see Material and Methods). The validity of the maximum entropy approach rests mainly on two conditions: (1) each individual activity is independent and has no influence on others; and (2) energy intake and expenditure are governed by two different mechanisms and can be treated as two isolated systems. The first condition is supported by our test on the correlation between consecutive activities, while the second is intuitively understandable. Following this formulation, it is straightforward to derive that when Eg ∝ log τw and El ∝ τm, the observed scaling law in waiting time as well as the exponential distribution in trip time can be reproduced. That is to say, the energy intake increases logarithmically as grazing time increases, while the energy expenditure due to moving increases linearly with the moving time or distance. It is interesting to note that the logarithmic energy intake function has been suggested for grazing animals before39 and the linear energy expenditure or cost function has been widely observed in many single-mode movements in human transportation activities40,41. It is well known that energy status can affect animal movement, but a quantitative understanding of the relation is still lacking. Our proposed maximum entropy approach can potentially fill this gap by establishing a connection between the energy function and the observed mobility pattern, suggesting that the detailed energy intake or expenditure as a function of time in different activities can be inferred from statistical features of the macroscopic mobility patterns such as waiting time or step-length distributions. The conjectured relation can be tested in future experiments by measuring detailed energy intake or consumption using laboratory techniques.

So far our study has focused on using a simple two-state movement model to reveal the statistical characterization of the spatiotemporal pattern of grazing cattle in a short observation window and a small confined area. In future, it would be interesting to extend this approach to build more realistic mobility models that capture more complex dynamics in movement, such as long-term memory effects and returning behavior, which can be extracted from data with a larger observation window in time and space11,42. For example, one can build a two-level mobility model to capture the hierarchical nature of the mobility pattern, with one level describing the bimodal ‘stop-and-move’ continuous-time random walk within a specific grazing area and another level describing transitions and recurrent movements between the grazing areas. Our results also open new avenues for studying the relation between the observed bimodal mobility pattern and other dynamical processes such as epidemic spreading and diffusion17,43,44.

Material and Methods

Dataset description

The dataset consists of continuous 0.5-Hz GPS samples for 34 individuals covering an observation period of over 50 hours. We select the data of the 31 individuals for which there is no discontinuity in the GPS samples and we choose a continuous 30-hour observation window during which the animals were grazing in a confined 600 m × 400 m rectangular area. The trajectory of each individual cow can be denoted by a sequence L = {pi}, where pi = (xi, yi, ti) represents a GPS sample with (xi, yi) being the position coordinates and ti being the timestamp. We use moving average filtering with a 10-s window to reduce the noise and smooth the trajectory, such that each sample pi is replaced by the average 〈pi−2, pi−1, pi, pi+1, pi+2〉 (five samples at 0.5 Hz span 10 s).
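The smoothing step can be sketched as a centred five-point moving average. This is a minimal illustration, assuming that only the position coordinates are averaged and that samples without a complete window at either end of the trajectory are dropped; the function name and the edge handling are not taken from the text.

```python
import numpy as np

def smooth_trajectory(xy, window=5):
    """Centred moving average of an (n, 2) array of positions.

    With 0.5-Hz sampling, window=5 spans 10 s. Only samples with a full
    window are returned, so the output has n - window + 1 rows.
    """
    xy = np.asarray(xy, dtype=float)
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    kernel = np.ones(window) / window
    # Smooth each coordinate independently; 'valid' drops incomplete edges.
    columns = [np.convolve(xy[:, k], kernel, mode="valid") for k in range(xy.shape[1])]
    return np.column_stack(columns)
```

The timestamps of the returned rows are those of the original samples from index `window // 2` to `n - window // 2 - 1`.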

Classification of mobility pattern

We define a non-moving segment of a trajectory as a set of consecutive points Lw = {pk, pk+1, …, pk+m−1} that satisfies the following three conditions: (1) the distance dk,j from the starting point pk to any other point pj of the segment must not exceed the threshold Δr, i.e. max{dk,j : k < j < k+m} ≤ Δr; (2) the distance from the starting point pk to the point pk+m that follows the end of the segment must be larger than Δr, i.e. dk,k+m > Δr; (3) the time span of the segment must be longer than a certain threshold Δt, i.e. tk+m − tk > Δt. In this definition, the first two constraints identify consecutive points that are likely to represent an identical position within a certain proximity. The third constraint imposes a minimum time span on the non-moving segment, which can be tuned to exclude very short random activities such as a pause when encountering an obstacle, and makes the extracted non-moving segments more representative of meaningful activities such as grazing or resting. After extracting the non-moving segments, we simply define the points between two non-moving segments as the moving segments. The approach here is analogous to the definition of staying points for continuous GPS samples in most spatiotemporal analyses of human mobility27,45. The value of Δt is suggested to be 2–3 mins32.
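The three conditions above translate into a simple forward scan over the trajectory. The sketch below is one possible reading and not the authors' implementation: it assumes positions in metres and timestamps in seconds, discards a final candidate segment that never leaves the radius (condition 2 cannot be checked for it), and takes the waiting location to be the mean position of the segment. All function names are hypothetical.

```python
import numpy as np

def find_stops(x, y, t, delta_r=5.0, delta_t=120.0):
    """Return (start, end) index pairs; samples start..end-1 form a non-moving segment."""
    x, y, t = (np.asarray(a, dtype=float) for a in (x, y, t))
    n = len(t)
    stops = []
    k = 0
    while k < n - 1:
        # Distance from the candidate starting point p_k to every later point.
        d = np.hypot(x[k + 1:] - x[k], y[k + 1:] - y[k])
        beyond = np.nonzero(d > delta_r)[0]
        if beyond.size == 0:
            break                          # never leaves the radius again
        end = k + 1 + int(beyond[0])       # index k+m: first point outside the radius
        if t[end] - t[k] > delta_t:        # condition (3): long enough to be a stop
            stops.append((k, end))
            k = end
        else:
            k += 1                         # too short: try the next starting point
    return stops

def trips_from_stops(x, y, t, stops):
    """Trip length l and trip time tau_m between consecutive waiting locations."""
    x, y, t = (np.asarray(a, dtype=float) for a in (x, y, t))
    trips = []
    for (s0, e0), (s1, e1) in zip(stops, stops[1:]):
        ax, ay = x[s0:e0].mean(), y[s0:e0].mean()   # waiting location of the first stop
        bx, by = x[s1:e1].mean(), y[s1:e1].mean()   # waiting location of the next stop
        trips.append((float(np.hypot(bx - ax, by - ay)), float(t[s1] - t[e0])))
    return trips
```

The waiting time of each stop is `t[end] - t[start]`. The scan recomputes distances from every candidate starting point, which is simple but quadratic in the worst case; an incremental scan would be more efficient for long trajectories.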

Maximum entropy principle

The maximum entropy principle originates from statistical mechanics. It assumes that the configuration of microscopic states of a complex system (e.g. the energy of each particle) leading to the macroscopic observation is the one that maximises the entropy of the system. Suppose the system consists of N non-interacting particles and has a total energy U, such that Σi ni = N and Σi ni Ei = U, where ni denotes the number of particles at a specific energy state Ei. Then the ensemble that represents all possible configurations of the system is called the canonical ensemble and the probability p(E) that a particle has a specific energy state E is given by the Boltzmann distribution p(E) ~ exp(−βE). Here we assume that the alternating moving and non-moving activities in the mobility pattern operate in two independent systems, while the individual activities are regarded as ‘particles’ and the energy state associated with an activity is the energy gain (or loss) incurred by the activity. To obtain the distribution of the time span τ of each activity, we use the transformation p(τ) ~ exp[−βE(τ)], where E(τ) is the energy function that describes the energy gain as a function of the time span of the activity. The detailed form of the distribution function p(τ) then depends on the energy function E(τ). For example, a logarithmic function E(τ) ~ log τ leads to a power-law distribution p(τ) ~ τ^(−β), while a linear function E(τ) ~ τ simply maintains an exponential form p(τ) ~ exp(−βτ).
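The two example cases can be written out explicitly. The following is a sketch under assumptions: the proportionality constants a and b of the energy functions are illustrative symbols, and the energy state is mapped onto the time span by direct substitution of E(τ) into the Boltzmann factor.

```latex
% Boltzmann weight evaluated at the energy of an activity of duration \tau:
%   p(\tau) \propto \exp[-\beta E(\tau)]
\begin{aligned}
E(\tau) = a \log \tau \quad &\Rightarrow\quad
  p(\tau) \propto e^{-\beta a \log \tau} = \tau^{-a\beta}
  \qquad \text{(power law, waiting time)} \\
E(\tau) = b\,\tau \quad &\Rightarrow\quad
  p(\tau) \propto e^{-\beta b \tau}
  \qquad \text{(exponential, trip time)}
\end{aligned}
```

If the mapping from energy to time span is instead treated as a change of variables, the Jacobian dE/dτ adds a factor 1/τ in the logarithmic case and shifts the exponent to aβ + 1, while the linear case is unchanged; the functional forms are the same either way.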

Additional Information

How to cite this article: Zhao, K. and Jurdak, R. Understanding the spatiotemporal pattern of grazing cattle movement. Sci. Rep. 6, 31967; doi: 10.1038/srep31967 (2016).