A random walk model that accounts for space occupation and movements of a large herbivore

Animal movement has been identified as a key feature in understanding animal behavior, distribution and habitat use and foraging strategies among others. Large datasets of invididual locations often remain unused or used only in part due to the lack of practical models that can directly infer the desired features from raw GPS locations and the complexity of existing approaches. Some of them being disputed for their lack of biological justifications in their design. We propose a simple model of individual movement with explicit parameters, based on a two-dimensional biased and correlated random walk with three forces related to advection (correlation), attraction (bias) and immobility of the animal. These forces can be directly estimated using individual data. We demonstrate the approach by using GPS data of 5 red deer with a high frequency sampling. The results show that a simple random walk template can account for the spatial complexity of wild animals. The practical design of the model is also verified for detecting spatial feature abnormalities and for providing estimates of density and abundance of wild animals. Integrating even more additional features of animal movement, such as individuals’ interactions or environmental repellents, could help to better understand the spatial behavior of wild animals.

• parsimony: the BCR dynamics are only depending on the set of parameters. They do not depend on additional or ad-hoc parameters. All the parameters can directly be inferred from the dataset, • interpretation: each parameter can be directly interpreted in terms of animal behavior and ecology, • flexible design: the number of parameters can be easily modified and the BCR can also be simulated in continuous or discrete time and space, depending on the process we wish to model, • well adapted to positioning data: the approach is adapted to discrete time series such as positioning data and can be used independently of the time and spatial scales of the dataset. Thus, it can directly exploit the data without the requirement of any additional procedure for data preparation.
This study aims at modeling animal movement of sedentary individuals over short periods of approximatively 1 year 39 , without reproduction and in a homogeneous landscape using GPS data sets and a BCR. The model is implemented with the three aforementioned parameters and we use the GPS dataset of 5 red deer (Cervus elaphus) to demonstrate the approach (pictured in Fig. 1). We also study how the temporal resolution (i.e. the sampling rate) affects the estimations of the three parameters. We finally challenge this approach as a practical tool for inferring environmental and behavior information from the dataset.

Methods
A simple methodological framework was established for testing the BCR model using empirical datasets, consisting of the GPS data of 5 animals. For each of these 5 animals, the three parameters were accordingly tuned using a straightforward estimation procedure. This procedure uses the empirical datasets to infer the parameters'values ( Fig. 1). We also used the datasets to assess the model's reliability-or performance-. We also detail other analyzes that were carried out to ensure the robustness and consistency of the approach, including the deterministic nature of the 5 statistics and a sensitivity analysis. This analysis consists in evaluating the performance of the BCR using a sweep method that produce arbitrary values of the parameters instead of using data-driven estimations. All BCR simulations and the five statistics were performed using MATLAB Version 7.13.0.564 (R2011b).
Data. The  www.nature.com/scientificreports/ up to 400 m a. s. l. The climate is continental with cool summers and mild winters (mean January and July temperatures of 1.4 and 19.6 • C, respectively, data from Phalsbourg weather station, Meteo France, from 2004 to 2017). Three ungulate species are present and mainly managed through hunting in the NHWR: wild boar, red deer and roe deer. The present study focuses on female red deer for test model. A detailed overview of the landscape and surroundings is given in 40 . The GPS data had regular observation frequencies with high frequency sampling (Table 1). In the following text, we note X i = [X (1) i , X i ] the locations of the individual with X i ∈ R 2 , i = 1, 2, . . . , n and where X (1) i , X (2) i represent the longitude and latitude respectively. We use t i ( t 1 = 0 ) as the time elapsed between two successive locations X i−1 , X i and as the average sampling time. The trajectory of the animal, or 'path' , was interpolated using linear interpolation between each pair of recorded observations ( Fig. 2 and detailed in Supplementary Methods (Eq. 21) and associated Graphic 2). It approximates the animal travels in straight lines at constant velocity between each pair of locations 41 . The attractor X F of one individual was estimated as the isobarycenter of all recorded locations: BCR model. The model aims at estimating the location at the next time step, given the actual location X at step i: such that the function f (·) is assumed to be representative of the behavior of the animal on sufficiently large time scales. We considered one individual of a given species with no interaction and simulated its movement in continuous space and discrete time in 2 dimensions. The BCR includes 3 parameters coupled with isotropic diffusion: • Diffusion: A random direction with uniform spatial distribution in a 2D plane, • Bias ( p F ): An increased probability to go to a fixed point named attractor 42 . This attractor was estimated as the isobarycenter of all recorded locations, defined as X F (Eq. 2). This yields a bias or advection parameter in the direction of X F . We use the term 'attraction' for the bias component of the BCR and the term ' den' for the attractor. In the data set we study, the den is equivalent to the deer's bunk. • Correlated component ( p I ): This parameter increases the probability to move forward, i.e. to perform one step in the direction of the previous step. This is equivalent to a short term bias in movement, when the animal has inertia. We refers to 'inertia' for the correlated component, • Immobility ( p s ): We included this as a specific parameter and the movement is stopped for one step. This takes into account the absence of movement between a pair of locations. It can be accredited to technological limitations with the satellite telemetry due to a weak GPS signal strength, possibly due to natural elements: such as when the animal was standing underneath a rock or due to dense clouds, dust particles, mountains or flying objects, such as airplanes). However, this can also be part of the behavior of the animals, during specific times: sleep cycles or foraging for instance. We use d min to denote this distance cutoff and set d min = 10 m which corresponds to the magnitude of the error typically found in GPS locations 43 . We also use d min to encapsulate GPS error and peculiar ecological behavior, not associated with p I or p F , that are beyond the scope of this study.
The effect of each parameter is detailed in Fig. 3. The typical model contains all three parameters: p I , p s and p F for describing animal motion while offering a trade-off between the number of parameters and the description of animal motion. Table 1. Data summary. For each animal, the total number of observations n is given along with the period of collection (date and time), the sampling rate T (i.e. the average time between 2 observations) (in min.) and corresponding standard deviation, total distance (in kilometers), total recording time (in days) and average speed s (in 10 −2 m.s -1 ). www.nature.com/scientificreports/ Estimation of the parameters. The estimation of the three parameters for each animal is based on the empirical datasets. We distinguished between the states, where one state is described by the pair {X i−1 , X i } and the situations, where one situation is described by the past ( X i−1 ), current ( X i ) and future ( X i+1 ) locations. Knowing both the state of the animal at a given time step i and its situation-the realization of movement at the next time step i + 1-allowed for collecting the occurrences of inertia, immobilism and attraction. This could be done provided we account for the variability of the movement: the animal may not be heading exactly toward the den, or performing inertia with an exact angular value of π . Thus we discretized the space around the animal in 8 quadrants at each time step i. For example, if the animal was heading straight forward with a margin of ±π/8 then it was considered in the situation of inertia. In other words, the state could fall in a situation of inertia with a margin of ±π/8 . Such a discretization can be represented as a matrix, depending on the state of the animal, its location and the location of the den at each time step (see Supplementary methods, Eq. 19). In order to gather enough data samples per situation, we arbitrary used angular thresholds of π/8 as a convenient trade-off between data scarcity and precision loss. Using smaller threshold values (say π/10 ) may result in too few samples per situations.
Using larger threshold values such as π/4 may result in a loss of precision while capturing additional movement samples that may not correspond to the situation. We first needed to define in which state is the animal at each step i. A state is the 2-tuple containing the previous and actual observation {X i−1 , X i } . We wanted to distinguish between non-conflicting and conflicting states, where a non-conflicting state is when the animal is in one state only, while a conflicting state is when the animal is in two states at once. We defined two conflicting states: when the animal was already heading toward the den X F , and: when the distance between two consecutive observations was too small ( ≤ d min m.), describing an individual that was already immobile. Such that the subset of non-conflicting states is: www.nature.com/scientificreports/ We then needed to assess in which situation the animal was for each corresponding state. A situation is the 3-tuple {X i−1 , X i , X i+1 } . We defined three subsets of situations corresponding to a straight forward motion (I), no motion (s) and a motion toward the den (F): With d(·, ·) the Euclidean distance between two locations. For the situations in s, we considered that the animal is not performing a motion if the Euclidean distance between two successive locations was ≤ d min m. We counted the number of states falling in each situation, for states in H (Eq. 6). We defined x 1 , x 2 , x 3 as the empirical proportion of cases corresponding to each situation: with χ = 8 + p I + p s + p F . The values of x 1 , x 2 and x 3 were then gathered for each animal. We did not use immobile locations (i.e. distances separating two successive observations must be > d min m) for the estimations of x 1 and x 3 . Solving Eq. (10) for χ with respect to x 1 , x 2 , x 3 yields: www.nature.com/scientificreports/ Plugging in Eq. 10: Note that we assumed that p IF = p I + p F in H IF and p Is = p I + p s in H Is as a convenient arrangement and ignoring higher order conflicting cases. Investigating the step-size distribution in the 5 deers, we found a lognormal step size distribution ( Supplementary Fig. S1). We then set a log-normal distribution ln N(µ, σ 2 ) for the step size distribution for the step size in the BCR. The same estimation procedure was used for configurations using a different number of parameters and quantity χ is accordingly calculated depending on the number of parameters used. It is possible to obtain negative values using this inference method. A parameter with a negative values reflects a direction that is not favored by the animal. In such a case, one should rethink the design of the BCR by changing the parameters (see Supplementary methods, section "negative parameters"). In the subsequent sections, we only consider parameters with positives values.
BCR dynamics. The BCR dynamics for each animal are completely determined by the three parameters p F , p I , p s , taking values in R + , and the step-size distribution. If p F = p I = p s = 0 , the BCR resumes to a typical two-dimensional random walk with a log-normal step size distribution ln N(µ, σ 2 ) . The dynamics can be visualized in Fig. 3 for different values of each parameter. When simulating a step in the model, the motion in H is described by: (14): The motion in H Is is: The motion in H IF is: Statistics for describing animal movement. We simulated N = 1000 BCR and used 5 statistics to assess the model reliability on spatial features including: (i) the distribution of relative turning angles which provides information about the movement of the animal, (ii) the home range which provides information about the spatial density of observations and (iii) observation counts using still and mobile transects, providing information on absolute observation abundance 44 . A detailed description of each statistic is provided in Supplementary Methods and Fig. S2. The reliability-or performance-was assessed in each animal and studied statistic using two error terms e 1 and e 2 . Error e 1 is the L 1 norm to compare the differences between the statistic S computed over a simulated path, and the statistic S computed over the data-set: www.nature.com/scientificreports/ With k = 1, . . . , N the number of simulations of the BCR. Error e 1 is the sum of absolute differences in the given statistic, and is a natural way of measuring the distance between the statistics computed on the data set and the trajectories generated using the BCR. We also focused on the average relative error e 2 as an indicator of the sensitivity: Distribution of turning angles. For each individual, the distribution of counter-clockwise relative turning angles This means that we only kept the angles from observations that were separated by an Euclidean distance greater than d min .
Home range. We used an adaptive kernel density estimator (matlab package kde2d-kernel density estimation version 1.3.0.0) as an estimator of the utilization distribution 45 to represent the home range of the animal. The approach of Z.I. Botev provided an estimate of observation density using a bivariate (Gaussian) kernel with diagonal bandwidth matrix 46 . The density was estimated over a grid of 210 × 210 nodes and we computed the home range area (in m 2 ) for various values: 100, 99, 95, 90, 80, ..., 20, 10% of the estimated density. Similarly to the distribution of turning angles, we compared each value of the data's home range against the simulated one.
Dilation. Dilation is generally used to account for the spatial attributes of an object such as to measure an area around the path or the volume of a brownian motion (see Wiener sausage 47 and Gromov-Hausdorff distance).
In our approach, we use dilation of both simulated and GPS paths for two reasons: to have a real-and comparable-number that accounts for how a trajectory has explored space and because it is natural tool from a census point of view (the dilated path corresponds to the area where the animal can be detected). Each simulated or real path was plotted in binary format in a window and dilated with a disk shape. The window size was set to a huge value in order to encapsulate the dilated path while preventing boundary effects, i.e. the convex envelope of the dilated area did not collide with any window border. We then estimated the surface covered by the dilated path for 100 different sizes of the disk, from disk size 1 to disk size 100. We compared each value of the data's estimated surface against the simulated one.
Immobile transects. We used still transects that counted the number of times the animal was seen in their line of sight. We arbitrary set the line-of-sight value at 200 m. The number of sightings of each transect was gathered and ordered in decreasing order, thus breaking the spatial dependence. We then compared the bins of the resulting histogram in the data and in the simulated path.
Mobile transects. First, the movement of the animal was linearly interpolated from the GPS data, meaning that between two recorded locations the individual followed a linear path. The speed of the animal between two locations was accordingly reconstructed using the recorded times t i between each location. Second, we used mobile transects as the ecological sampling method, where each transect 'count' the intersection between its path and the animal's one. The mobile transects followed a predefined path at a given constant speed as time increased. The area of vision of each transect was defined as a circle of a given radius. Each time the path of an individual collided with an area of vision, the count of the corresponding transect increased by 1. Two types of movements were used: linear and clockwise rotational transects. The initial locations of both types of transects are X 1 and X F . Both the animal and mobile transects started to move at the same time. At each of the two locations X 1 , X F , 8 linear transects moved in the 8 cardinal directions, totalizing 16 transects. For the linear transects, every 10,000 time steps, we set 2 × 8 new transects starting at the same locations and following the same directions. Clockwise rotational transects were rotated around X 1 and X F using a 500 m radius. When we reached t n , we gathered the total count (i.e. the count of all transects). For the two types of transects, we gathered the total count for 6 different lines of sight: 50, 100, 200, 400, 500, 1000 m. and 4 speeds: s/4, s/2, s, 2 s with s the average speed of the animal. We then aggregated the overall count in each of the two types of transects, and compared the results from the data and the simulated path (Supplementary Methods and Fig. S2).
Scale invariance. Several authors pointed out that the temporal resolution of the discretization is of importance: it should be relevant to the considered behavioral mechanisms 5,48-50 . Schlägel and Lewis focused on the quantification of movement models' robustness under subsampled movement paths 49 . They found that increased subsampling leads to a strong deviation of the central parameter in resource selection models 49,51 . They underlined that important quantities derived from empirical data (e.g. parameters estimates, travel distance or sinuosity) can differ based on the temporal resolution of the data 49,51 . Moreover, Postlethwaite and Dennis highlighted the difficulty of comparing model results amongst tracking-datasets that vary substantially in temporal grain 50 ). Each of the studied dataset has a relatively high sampling rate (roughly 10 m) and a period of study that is appropriate to the analysis of animal movement at the year scale (Table 1). In order to investigate such a possible effect on the BCR dynamic, we changed the sampling rate of the movement path to ensure that the three parameters We did not change the value of d min as we subsampled the movement path because we designed p s for capturing GPS noise and movements that are associated with peculiar ecological behaviors that are beyond the scope of this study in terms of time and spatial scales (foraging for instance). We then compared the resulting parameters p I , p F and p s as the resampling rate k increased.

Fluctuations.
Whereas the BCR is a stochastic process, the deterministic aspects of the 5 statistics were tested with an increasing number of steps n s . The statistic associated with each realization of the model (a simulated path) is a random variable. If the distribution of these random variables has low concentration (high variance) then it is not a convenient statistic as it cannot be used as a reference for assessing the model's performance, even when averaging over multiples realizations. On the opposite, if the statistic is deterministic (no fluctuations) it can provide a reliable tool to assess the model's performance. This was numerically tested over a range of increasing n s values with n s = 10 4 , 2 × 10 4 , . . . , 4 × 10 5 . For each of those step values, a set of 100 BCR was simulated with parameters p I , p F and p s estimated from the first deer (see Table 2) and we studied the variance of the statistics.

Sensitivity analysis.
In order to assess whether the estimated parameters are optimal (i.e. providing the best possible performance) and to study parameter scarcity, we also evaluated the performance of the model using arbitrary weight values. We first started by evaluating how using one parameter instead of the three could alter the performance of the model. We then extended this sensitivity analyse by drawing arbitrary values for each parameter from a multi-dimensional square mesh, whose center corresponds to the estimated values of p I , p s , p F , estimated using GPS data (Fig. 1). We additionally used values that are distant from the estimated ones, up to p I = 3 , p F = 3 and p s = 5 . We tested a total of 151 new configurations with these arbitrary values.
For each configuration, we ran 150 simulations and evaluated them using the 5 statistics. The mean error of |S −S k | and its standard deviation are gathered and plotted for each arbitrary configuration. As a resume, we replicate the framework described in Fig. 1 but we inject arbitrary parameters instead of using data-driven parameterisations.
Application. The proposed model could be used to infer environmental and behavior information from the dataset. We chose to illustrate such an application by trying to detect anomalous voids (or holes) in the spatial territory of the individual using the GPS dataset and Monte-Carlo simulations of the model. Anomalous means that the observed void is not related to the randomness of the movement, but rather related to a geographical artifact. The parameters p I , p F , p s , µ and σ 2 of the BCR were accordingly estimated from the data of each individual, similarly to previous experiments (Fig. 1). A simple heuristic was used to find voids in empirical and simulated paths for each individual: we computed the alpha shape of all locations using a fixed alpha radius of 60 m. This allowed for determining the surface covered by all locations while preserving the voids. We then collected the area of each void provided they had an area of at least 100 m 2 . We focused on voids near the center of the alpha shape in order to avoid artificial voids, generated by the weak density of locations at the boundaries. We ran 10,000 iterations of the model for each animal and estimated the probability p ∅ of finding voids of different sizes in the simulated paths. This probability was then compared to voids found in the GPS datasets and available environmental information was used to determine whether any geographical element(s) could explain the unexpected voids. Table 2. Estimated parameters. For each deer, the estimated parameters p I , p F , p s and the two parameters µ , σ that control the step size distribution are given.

Results
We found that the parameters µ and σ 2 were close for all individuals (Table 2), and estimates of the three parameters for individuals 3 and 4 were similar. The values of p I and p F showed that inertia and attractor play a greater role in the movement of deer 5 ( p I = 0.22 , p F = 0.24 ), compared to the 4 other individuals. On the opposite, the movement of Deer 1 is characterized by immobilism ( p I = 0.01 , p s = 2.01 p F = 0.01 ). The immobility p s was stable across the individuals while p I and p F varied together ( Table 2). The latter is a mechanistic effect, as they act as opposite forces.
Evaluation of the BCR model. The distribution of errors e 1 and e 2 in the 5 statistics is provided in Fig. 4 for deer 5 and in Supplementary Figure S3 for all 5 deers. The table containing the mean error and standard deviation, median error and interquartile range for all deer and for error e 1 is also provided in Supplementary file S2. For all deers, we showed that combining the parameters plays an important role in modeling deer behavior. Configurations with only one parameter did not perform well on average while further investigations showed that combining p I , p F and p s allows for a better description of movement, especially regarding the census statistics for both linear and rotational transects and home range estimates (Supplementary Figure S3 and Table S2).
Scale invariance. The parameters p I and p F remain almost constant with increased subsampling (Fig. 5).
On the other hand, distance-related parameters such as p s , µ , σ 2 are highly sensitive to the resampling rate k. This is a mechanistic effect of the subsampling procedure: as we increase k, the distance between each pair of GPS locations increases, resulting in less observations falling in the state s (Eq. 8) which in turn results in smaller p s .

Fluctuations.
We found that the variance decreases or does not change as the number of simulated steps increases in most of the statistics (Fig. S4)  www.nature.com/scientificreports/ statistic is expected to be similar to the one of immobile transects but the speed of convergence to the null variance may be very slow and it may take a much larger number of steps. The variance of the estimated areas in the dilation statistic increases with n s because we dilated the simulated paths in a huge window, encapsulating the whole path including a very large portion of empty space around it. This was done to prevent boundary effects when assessing the area of dilated paths: to make sure that dilated paths do not hit any of the window bounds. Otherwise this would produce biased, underestimated areas. However, using a smaller window or, again, a much larger number of steps would result in a null-variance.

Sensitivity analysis.
We showed that using the 3 parameters instead of one provides better results ( Fig. 4 and S3). The inferred values for the single parameter configuration of the BCR are given in table S1and corresponding mean, standard deviation, median and interquartile range are presented in table S2. When investigating arbitrary values for the three parameters, we show that the empirically inferred estimates provide better performance in all deers. It means that the BCR simulations that use the three parameters with values directly inferred from the dataset show better performance in average (Fig. S5). This is clear in the distribution of turning angles, where using arbitrary values leads to an increase of the mean error. While some arbitrary values may produce better performance, they only work for selected individuals and specific statistics. For instance, configuration 140 with arbitrary parameters values p I = 3, p s = 4, p F = 1 provide better results in all statistics except the distribution of turning angles for deer 1. The inferred parameters values for deer 1 are {0.01, 2.01, 0.01} , such that one could conclude that an arbitrary increase of p I and p F will produce better performance (to the exception of reproducing the correct turning angles distribution pattern) for all deers and statistics. However, using the parameters values of configuration 140 produce highest mean error in the distribution of turning angles, home range (Kernel) estimate and dilation for deer 4. Moreover, even a small change in the values of a successful configuration with arbitrary parameters can lead to complete different results in a given animal and statistics. One leading observation is that a small increase in immobilism p s -0.2 or 0.4 for instance-yields better results in most statistics for all deers (supporting Fig. S5). This is true provided the two other parameters p I and p F are not far from the inferred values. However, artificially increasing p s always lead to failure in reproducing the correct angular distribution. An arbitrary increase of p F also reduces the mean error in the 3 statistics which summarize the aggregated spatial localizations (i.e. kernel estimates, dilation and fixed transects) but ultimately fails in providing accurate results in mobile transects. www.nature.com/scientificreports/ Application: detecting spatial voids. The resulting alpha shapes and detected voids (holes) are presented for each deer in Supplementary Fig. S6. The probability p ∅ of observing such voids is computed and showed in Fig. 6. Many voids whose area fell in the interval 0, 1.5 × 10 4 were related to boundary conditions, where the alpha shape produced artificial voids due to less dense areas. However, the Monte Carlo simulations show that 3 voids, located inside the alpha shape (void 1 (deer 1) and voids 1 and 2 (deer 4)), should not appear. In other words, these voids are possibly not related to movement randomness but to other spatial features, with good probability. The distributions of errors for each configuration varied in each deer.

Discussion
In this work we aim at providing valuable and reliable ecological information regarding the components of animal movement. We introduce a simple and tractable model to deal with animal movement, based on a twodimensional BCR in discrete time and continuous space that allows for combining the ecological forces in a simple way. The parameters of the BCR are directly inferred from the GPS data recorded in a large herbivore and its performance is assessed in 5 spatial and ecologically-related statistics. Four of them differ from the typical signals or parameters calculated based on empirical relocations 52 and address the home range size, census issues and animal behavior. While there is no methodological consensus in the most appropriate way to check for similarity of individual movement paths, turning angles and home-range can provide concurrence in movement parameters over time without including positional information 53,54 . In addition, census information give an aggregated information, based on the intersections of movement paths with either mobile or fixed transects. The framework is presented in Fig. 4 and explains how synthetic (or simulated) paths parameterized from empirical observations are ultimately compared to the empirical paths throughout the statistics. However, we emphasize that the attributes of the data (such as the sampling rate for instance) should be consistent amongst different individuals in order to allow for inter-individual comparison 50 . It is also important to have a sufficient number of locations as precision in parameter estimation scales with the sample size, meaning that the more locations, the higher the precision. We focus on three essential forces that allow for an efficient description of animal motion over large periods of time. The results display that by combining those parameters, we get a much better description of animal movement compared to an unbiased random-walk with a log-normal step size distribution. Results also show that using three parameters produce accurate results in describing animal movement compared to using a single parameter. This confirms that animal movement is a complex process, driven by several forces instead of a single and dominant one. Both the absolute error e 1 and mean relative errors e 2 do provide the same results, showing that the BCR model is robust in terms of the norm ( e 1 ) and in sensitivity of the norm ( e 2 ). An additional analyze was conducted to further test if the parameters inferred from the data set provide the best performance. Results show that configurations with an increase in immobilism p s and-to lesser extent-den attraction p F values can improve the performance of the model. However, such an arbitrary increase comes at the cost of a lesser performance in describing the distribution of turning angles or estimating mobile transects. Thus, the configuration using the parameters estimated from the data set does provide the more balanced performance in all statistics for all deers amongst the 152 configurations tested.
The inertia p I , describing the short-term memory effect, is the first force introduced in this approach. Whether the use of land space by the animal is dependent on short-term or long-term memory is a debated topic. It gave rise to a series of studies that emphasized the importance of memory in animal movement from a biological or modeling perspective [55][56][57][58][59][60][61] . These studies also underlined that inferring memory effects directly from relocations is not a trivial task. Those relocations instead depend on a mixture of effects, including landscape and territorial constraints, resource patches and possibly long-term memory. Using a single memory feature p I might be a too simple approximation for efficiently capturing the memory effect. In our approach it is possible to alter p I in order to include several previous steps instead of just one.
Immobility p s combines several features of animal movement including animal at rest, in vigilant state, and GPS noise. Multiple factors are known to affect GPS noise, including topographic exposure, canopy cover, vegetation height and the slow movement of the ionosphere. The latter changes by a few centimeters during 30sec intervals 2 , possibly introducing up to 20 fold this bias in each of the recorded GPS observations. However, this is small regarding the average step size of non-immobile movements, ranging from 41 m (red deer 1) to 46 m (red deer 3 and 5). Thus, we assume that the measured step lengths and turning angles reflect the reality. Immobile (i.e. ≤ d min ) observations represent a large proportion in our total datasets: 25.0% (red deer 1), 17.2% (red deer 2), 22.0% (red deer 3), 19.0% (red deer 4) and 23.6% (red deer 5), associated with specific behaviors such as on-site foraging, eating, resting, etc. The estimates of p s in all five animals are greater than inertia or attractor (Table 2), underlining the importance of considering immobility when analyzing the movement of red deer. This is in line with previous experimental studies that showed the high frequency of feeding, resting cycles in red deer and labile diet 62 .
Site fidelity is the recurrent visit of an animal to a previously occupied location. This is a well-known and wide-spread behavior in the animal kingdom 63 . The animal favors locations that are ecologically valuable and related to a foraging or explorative behavior. In our approach, we rather and simply depict site fidelity using one single attractor p F . The fact it improves the performance of the model when combined with inertia and immobilism confirms that site fidelity (or a simplified estimation of it) should be taken into account when modeling deer movement.
The model can be sophisticated by including more complicated environmental aspects of individual movement, such as spatial memory 58,59 , reinforcement and site fidelity 64   www.nature.com/scientificreports/ distribution to analyse all movement is not effective 9,68 . In this work, we used 3 parameters and a single uniform distribution in all directions (diffusion) to account for other underlying or more subtle behaviors-foraging, etc.-. Similarly, we used a single log-normal distribution to describe the step-size distribution. The BCR model can be adapted to include more parameters that would result in a less diffusive movement, and multiple distributions can be used to model the step-size at different and smallest time scales. Distributions and parameters values can also be nested such that the model can adapt to different time and spatial scales. One leading rational of this work is to investigate the model's ability to address ecological challenges such as estimating the abundance of a given species (census) or detecting anomalous spatial features. We use both mobile and immobile simulated transects to illustrate how the model could be used in the first problem. The probability of counting the same animal multiple times can then be estimated using Monte Carlo simulations. The second issue is, for instance, to detect spatial voids in empirical movement paths. The location of animals may present empty spatial voids (or holes) of various sizes. This may be related to environmental conditions such as urban areas, water, cliff, or other ecological reasons (such as interactions with other individuals e.g. repulsive marks) and other factors. Using numerical simulations of the model, we are able to detect anomalous voids in the dataset, that are not related to randomness but to human activity. The void 1 (deer 1) reveals that some environmental changes took place between the recording time of the GPS location in 2010 and the satellite image in 2018. After cross-checking with additional information from the OFB, we learned that the identified area was a forest enclosure. This explains why the deer was not able to reach this area. Both voids 1 and 2 of deer 4 also are related to human activities: forest roads, buildings and one artificial reservoir.
This work introduces a tractable model, based on a two-dimensional BCR for describing animal movement in discrete time and continuous space. The model allows for a direct and explicit estimation of the three parameters that provide the optimal design regarding the studied statistics. Moreover, it allows for deriving reliable (i.e. independent from the GPS sampling rate) and quantitative information about the components of animal movement. Results show that combining the parameters is a key component in modeling movement, while allowing for an accurate description of the turning angles, home range size and census issues. The model also has practical applications for addressing ecological issues such as census or spatial anomalies. While we only focus on 5 animals to demonstrate the approach, the model is general and can be applied to any other species. We considered only one attractor per animal in the proposed approach and both the existence and influence of multiple attractors are yet to be investigated. Additional behavior features such as the spatial reinforcement, memory of n previous steps, activity rhythms (such as the circadian cycle), distance from the attractor, landscape/habitat effect 66 , interactions with other animals and topological issues are currently being investigated and will be included in a future work. Comparing the proposed method to other existing approaches, such as the HMMs, would allow for evaluating its benefits and disadvantages in terms of both precision (error) and computational resources. In particular, the scale invariance of the parameters may provide a benefit compared to other scale-dependent approaches. An additional improvement would be to consider d min as an average speed cutoff instead of a distance cutoff. This would allow for capturing immobile or slower movements independently from the average sampling time T . Another point of interest is the development of a continuous version of the proposed model, where the direction of the step is drawn from a specific distribution, whose parameters are yet to be empirically characterized.