Abstract
Animal telemetry data are often analysed with discrete time movement models. These models are defined with regular time steps. However, telemetry data from marine animals are observed irregularly. To account for irregular data, a timeirregularised firstdifference correlated random walk model with drift is introduced. The model generalizes the commonly used firstdifference correlated random walk with regular time steps by allowing irregular time steps, including a drift term, and by allowing different autocorrelation in the two coordinates. The model is applied to data from a ringed seal collected through the Argos satellite system, and is compared to related movement models through simulations. Accounting for irregular data in the movement model results in accurate parameter estimates and reconstruction of movement paths. Further, the introduced model can provide more accurate movement paths than the regular time counterpart. Extracting accurate movement paths from uncertain telemetry data is important for evaluating space use patterns for marine animals, which in turn is crucial for management. Further, handling irregular data directly in the movement model allows efficient simultaneous analyses of several animals.
Introduction
Understanding animal movement behaviour, space use patterns, and response to the environment relies on modelling animal telemetry data. Collecting telemetry data in marine environments is difficult: satellite systems require animals to surface, storage tags require recapture, and acoustic networks require a dense array of receivers. Nonetheless, the scope and scales of movement studies are continuously increasing with rapid technological advances in tracking devices^{1}.
In spite of technological advances leading to more accurate measurements and larger datasets, data from satellite tags have inherent measurement errors. Systems such as Fastloc GPS and Argos can only record data when the animal is above water, with an accuracy that depends on the surface time and diving behaviour^{2}. Typically, this results in data observed at irregular time steps with considerable uncertainty, which makes statespace models a valuable framework for analysing the data.
Statespace models naturally separate the movement process from the measurement mechanism. True animal locations are explicitly modelled separately from the observed locations. True locations are modelled directly by a stochastic process, while an observed location is modelled by the conditional distribution given the true location. Hence, the movement models developed can be combined with telemetry data from any source, for example, GPS, Argos, lightbased, or acoustic tags. However, inference in nonlinear or nonGaussian statespace models can be challenging. Since the true locations are not observed, maximum likelihood estimation in statespace models requires obtaining the marginal distribution of the observations. The marginal distribution is obtained by calculating the joint distribution of true and observed locations and integrating over all possible true locations that could have been observed. Recent software such as AD Model Builder^{3} and Template Model Builder (TMB^{4}) utilizes the Laplace approximation to rapidly approximate the marginal distribution of observations in statespace models. While care is needed to ensure the approximation is appropriate, this framework is generally applicable. For instance, TMB has been used for both discrete time^{5,6} and continuous time^{7,8,9} movement models. Alternatively, the statespace can be discretized and the resulting hidden Markov model can be used for inference^{10}.
Although continuous time movement models can handle irregular data, discrete time models are often thought of as more intuitive^{11}. Further, continuous time models can be difficult to construct or extend analytically, because they are often formulated through stochastic differential equations. Whereas continuous time models describe the instantaneous change, discrete time models describe the change between two observations. The change between observations is often modelled by the distance and the turning angle (e.g.^{12,13,14,15,16,17}), which is considered a natural way to represent movement^{18}. In discrete time models with regular time steps, irregular data can be handled by modifying the observational model to interpolate between states^{13}.
To facilitate the analysis of populations or metaanalysis of individuals, the model parameters for different individuals must be on the same scale. If the interpretation of parameter values depends on the chosen time steps between estimated locations^{11}, the values must be corrected before comparison is possible. Likewise, estimating population parameters based on individuals either requires movement models that use the same time steps for all individuals (e.g.^{19}), which can be suboptimal, or movement models that handle irregular data directly.
The firstdifference correlated random walk (DCRW^{13}) models animal movement as a discrete time first order autoregressive process on the difference between consecutive locations. The model includes a rotation parameter which allows tortuous and circular movement patterns. This is similar to the mean turning angle included in many discrete time models (e.g.^{12,15,16,17}). A continuous time version of the DCRW without rotation is the continuous time correlated random walk (CTCRW^{20}). The CTCRW models the velocity of an animal by an OrnsteinUhlenbeck process: the continuous time counterpart of the first order autoregressive process. Recently, an irregular time version of the DCRW was introduced without rotation including a timevarying parameter instead^{6}.
In this paper, a generalization of the DCRW is introduced. The generalization of the DCRW is trifold: the model allows irregular time steps, a drift vector is included, and different autocorrelations in the two coordinates can be used. As a byproduct, it is shown how the parameters in the DCRW model depend on the selected time steps. Further, a formula for transforming them to a timeinvariant scale is obtained. The generalized DCRW is introduced as a discretization of a stochastic differential equation for the animal’s velocity and location. Further, the close relationship between the generalized DCRW and a generalization of the CTCRW is shown. Through simulation studies, the generalized DCRW is shown to perform well compared to the CTCRW and DCRW. Finally, the applicability and extendability of the model are illustrated through a real animal movement dataset.
Materials and Methods
Movement model
The generalization of the DCRW movement model is constructed through a stochastic differential equation (SDE) for the velocity. From this SDE, the location process is the integrated velocity process. To obtain the timeirregular movement model, the SDE is solved analytically while the location process is a discrete time approximation.
SDE for velocity
The bivariate SDE for the velocity, V_{t}, of an animal is
with initial condition \({V}_{0}={v}_{0}\in {{\mathbb{R}}}^{2}\). In this model, γ_{1} and γ_{2} are autocorrelation parameters, μ is a vector of mean velocity parameters, and θ is a rotation parameter related to the mean turning angle of the DCRW model. The matrix S is a lower triangular matrix determining the covariance of the changes in velocity. Diagonal elements of S must be positive. When offdiagonal elements are zero, changes in velocity are independent between coordinates and the diagonal is their standard deviation. An overview of the parameters is given in Table 1. Animals moving persistently in one direction will have a high autocorrelation in the velocity and a rotation parameter close to zero, while a lower autocorrelation in the velocity and a rotation parameter different from zero results in tortuous movement trajectories with circular patterns and frequent changes in velocity (e.g.^{13,21}). When γ_{1} = γ_{2} and S = σI_{2}, where I_{n} denotes an n × n identity matrix, the model is identical to the rotationaladvective correlated velocity model discussed by Gurarie et al.^{22}. In this case, θ can be interpreted as a mean radial velocity^{22}.
The analytical solution to the SDE is a stochastic process with Gaussian increments (see e.g. Supplemental Material S1). The mean of an increment is
where \(\mu ={({\mu }_{1},{\mu }_{2})}^{T}\), while the covariance is
where \({\rm{vec}}(C)={({\rm{\Theta }}\oplus {\rm{\Theta }})}^{1}\,{\rm{vec}}({\rm{\Sigma }})\), \({\rm{\Sigma }}=S{S}^{T}\), ⊕ denotes the Kronecker sum, \(A\oplus B=A\otimes {I}_{B}+{I}_{A}\otimes B\), and vec is an operator that stacks the columns of a matrix to a vector. Based on the velocity, the location process, X_{t}, follows the SDE
Discretizing the SDE for locations
Considering the N predetermined time points \({\{{t}_{i}\}}_{i\in \{\mathrm{1,}\ldots ,N\}}\), an Euler–Maruyama approximation to the location process is obtained by
where \({{\rm{\Delta }}}_{i}={t}_{i}{t}_{i1}\) is the length of the time step. Inserting the expression for \({V}_{{t}_{i1}}\), the Generalized firstDifference Correlated Random Walk (GDCRW) is obtained:
The error terms, \({\varepsilon }_{{t}_{i}}\), follow a zero mean normal distribution with variance
with vec(C) as defined in Subsection (SDE for velocity).
When the animal movement is continuous in time, the GDCRW model provides a discrete time approximation. The quality of the approximation depends on the time between locations of the movement process; that is, the predetermined time points \({\{{t}_{i}\}}_{i\in \mathrm{\{1,}\ldots ,N\}}\). The time points do not need to correspond directly to the time of observations. Therefore, the quality of the approximation can be improved by adding additional locations between observations, thereby making the time discretization finer. The additional locations must be integrated out of the likelihood. When the process is observed directly, this can be done by simulation^{23,24}. In a statespace model, this simply corresponds to adding estimated locations without observations, since the entire location process is unobserved. Consequently, the additional estimated locations can be integrated out of the likelihood in the same way as the remaining process, for instance, with the Laplace approximation, filters or MCMC. In both cases, the approximation of the transition density between observations is improved, while only the marginal likelihood of observations is used for inference.
Special cases
The GDCRW model is closely related to four other models: The CTCRW, the DCRW, the modified DCRW by AugerMéthé et al.^{6}, and a random walk.
Relation to the CTCRW
When the turningangle parameter θ = 0 and the increment covariance S is diagonal, the velocity model, V_{t}, is identical to the velocity model of the CTCRW; however, the location models differ. While the location model of the CTCRW is an analytical solution to the two univariate integrated velocity models, the GDCRW is a discretization of the more general bivariate velocity model. Hence, when θ = 0 and S is a diagonal matrix, the discrete time GDCRW converges to the continuous time CTCRW as the time steps tend to zero. When θ ≠ 0 or S is not diagonal, the GDCRW converges to a generalization of the CTCRW when the length of the time steps approaches zero.
Relation to the DCRW
For regularly observed data, \({{\rm{\Delta }}}_{i}={\rm{\Delta }}\), with μ = 0 and γ = γ_{1} = γ_{2},
Hence, the movement model simplifies to
which is the movement model of Jonsen et al.^{13} when Δ = 1. Therefore, the GDCRW generalizes the DCRW in three ways: the restriction of regular time steps is relaxed, a drift vector is added, and different autocorrelations in the two coordinates are allowed. Further, equation (4) allows scaling the DCRW parameters to a common time scale. That is, if parameters \(\tilde{\gamma }\) and \(\tilde{\theta }\) are obtained from a DCRW model fitted with time steps \(\tilde{{\rm{\Delta }}}\), scaleindependent parameters are obtained by \(\gamma ={\tilde{\gamma }}^{\mathrm{1/}\tilde{\Delta }}\) and \(\theta =\tilde{\theta }/\tilde{\Delta }\). Being able to scale the DCRW parameters to a common scale allows comparison between model fits in meta or population studies even if different time scales are used.
Relation to the modified DCRW
Recently, AugerMéthé et al.^{6} proposed a modified version of the DCRW for FastlocGPS data. The modified DCRW allowed irregularly observed data and included a timevarying autocorrelation parameter, γ_{t}. Ignoring the timevarying autocorrelation, the modified DCRW can be obtained as a special case of the GDCRW introduced here by letting γ_{1} = γ_{2} = γ_{t}, μ = 0, θ = 0, and by letting Σ be a diagonal matrix. Note, however, that the two models differ slightly in parameterization. In the modified DCRW, γ_{t} is not scaled by the time differences and the variance of \({\varepsilon }_{{t}_{i}}\) is parameterized as a diagonal matrix \(diag({\sigma }_{1}^{2},{\sigma }_{2}^{2})\). The GDCRW can easily be extended to have a timevarying autocorrelation parameter in the same way as AugerMéthé et al.^{6}.
Relation to a random walk
Finally, the GDCRW can be reduced to a random walk model. From equation (3) it follows that when the autocorrelation parameters γ_{1} and γ_{2} tend to zero, the model is reduced to a random walk. This is also the case for the CTCRW, DCRW, and modified DCRW. If \(\mu \ne 0\), it will be a random walk with drift.
Measurement equation
Besides the movement process describing the true, latent locations, \({\{{X}_{{t}_{i}}\}}_{i\in \mathrm{\{1,}\ldots ,N\}}\), a measurement equation is needed to form a statespace model. For the DCRW, measurement errors are often included by linearly interpolating between the locations. For an observation \({Y}_{{s}_{j}}\) at time s_{j},
such that \({t}_{i}\le {s}_{j} < {t}_{i+1}\), \(q=\frac{{s}_{j}{t}_{i}}{{{\rm{\Delta }}}_{i+1}}\), and \({\eta }_{i}\) is a zero mean bivariate random variable. However, when the movement model allows irregular time steps, it is more natural to ensure that the estimated locations align with the observations, such that the measurement equation simplifies to
Then the observation at time s_{j} does not depend on an estimated location in the future. Note that the statespace model can include additional estimated locations between observations. These estimated locations only contributes to the likelihood function through the movement process (see Supplemental Material S2). Below, \({\eta }_{j}\) will either be modelled as a bivariate normal distribution^{20} or a bivariate tdistribution^{7}.
Simulation Studies
To evaluate the GDCRW, three simulation studies were conducted. The first simulation study evaluated the GDCRW against the CTCRW when the latter was the true movement model. The second study compared the GDCRW to the DCRW when a continuous time version of the DCRW was the true model. Finally, the third study investigated the effect of measurement error on the choice of time step lengths in the GDCRW. The simulation studies only considered the main movement parameters. Variance parameters were not considered since they were related to the coordinate system used. For the CTCRW, the main movement parameter was the autocorrelation. For the DCRW, the main movement parameters were the mean turning angle and the autocorrelation.
The first simulation study was conducted to illustrate that the GDCRW could adequately reconstruct movement trajectories and parameter estimates from the CTCRW model. Likewise, the second simulation study was included to show that the GDCRW could recover parameter estimates from a continuous time version of the DCRW, including the rotation parameter not present in the CTCRW model. In contrast, the DCRW model itself had to be combined with equation (4) to recover the parameter estimates. Finally, the third simulation study was carried out to show that the irregular time steps of the GDCRW was better at reconstructing continuous time movement trajectories than the regular time steps of the DCRW, especially when measurement error was low.
In all three simulation studies, location trajectories were simulated from the continuous time model defined by equations (1) and (2). In the first simulation study, the GDCRW was therefore a discretized version of the true model, while the CTCRW was the true model. In the second and third simulation studies, the DCRW and GDCRW were different discretizations of the true model. The EulerMaruyama method was used to simulate from the model. To emulate continuous time movement, the EulerMaruyama method was used with 200 additional simulated locations between observations. Time steps between observations were simulated from a mixture of an exponential and a normal distribution (see Supplemental Material S3) to ensure both short and long time steps between observations. All models in the simulation studies were fitted to the data by maximum likelihood through the Laplace approximation using the R (version 20180321 r74432^{25}) package argosTrack (version 1.2.2^{26}), which builds on the package TMB (version 1.7.14^{4}) (R code is available as supplementary material).
Evaluating the GDCRW against the CTCRW
Since the GDCRW location process is constructed as an Euler–Maruyama discrete time approximation to an underlying continuous time model, increasing the number of estimated locations between observations should improve the approximation because it makes the time discretization finer. This was investigated by simulating from the underlying continuous time model in the special case that reduces to the CTCRW. In this case, the GDCRW is a discretization of the true model. To investigate the effect of the discretization, the ability of the GDCRW to estimate the main movement parameter, γ = γ_{1} = γ_{2}, along with the ability to reconstruct the trajectory was evaluated against the CTCRW.
To evaluate the GDCRW against the CTCRW, 200 trajectories with 250 observations were simulated from the continuous time model defined by equations (1) and (2) with θ = 0 and diagonal S; that is, with the CTCRW as the true model. A total of six models were fitted for each simulated dataset. The CTCRW and the GDCRW with estimated locations at the time of observations were fitted. Further, the GDCRW with additional estimated locations every 8, 4, 2, and 1 time unit were fitted. The models with additional estimated locations were denoted GDCRW_{8}, GDCRW_{4}, GDCRW_{2}, and GDCRW_{1}, respectively.
For each of the 250 simulated locations, observations were simulated from a bivariate normal with covariance \({0.1}^{2}{I}_{2}\). The processes were simulated with γ = γ_{1} = γ_{2} = 0.9, μ = 0, and \({S}_{11}={S}_{22}={e}^{2}\approx 0.135\). For simplicity, the model was fitted with γ = γ_{1} = γ_{2}, θ = 0, μ = 0, and \({S}_{12}={S}_{21}\mathrm{=0}\).
The CTCRW models both the velocity and location giving a total of 1000 latent variables: 4 for each location. In contrast, the GDCRW only has 500 latent variables: 2 for each location. The mean of the time step distribution used in the simulation study was approximately 1.99, giving an average length of a simulated trajectory of 497.5 time units. Therefore, the GDCRW_{8} would include 63 additional location estimates on average, corresponding to a total of 626 latent variables. Likewise, the GDCRW_{4}, GDCRW_{2}, and GDCRW_{1} would on average include 125, 249, and 498 additional locations, respectively, corresponding to 750, 998, and 1496 latent variables.
Regardless of the different number of latent variables, all six estimation models (CTCRW, GDCRW, and the GDCRW with additional location estimates every 8, 4, 2, and 1 time unit) provided γ estimates close to the true value (Fig. 1); however, the standard deviation of the estimates was higher for the GDCRW than for the five other models. For the true continuous time model, the CTCRW, the average of the 200 estimates (standard deviation) was 0.897 (0.018), whereas it was 0.901 (0.028) for the discrete time approximation with estimated locations only at the time of the observations, the GDCRW. When additional estimated locations were included every 8, 4, 2, and 1 time unit, the average of the estimated γ s were 0.905 (0.018), 0.905 (0.016), 0.903 (0.016), and 0.899 (0.017), respectively.
Besides the ability to reobtain the true parameters, the GDCRW models with different numbers of estimated locations were evaluated on the distance between the estimated locations and the true simulated locations (Fig. 2). To evaluate the models, this distance was calculated for each location for each track for the GDCRW, GDCRW_{8}, GDCRW_{4}, GDCRW_{2}, and GDCRW_{1}, and divided by the distance obtained for the CTCRW to get a relative distance. Note that since the coordinate system of the simulation study is arbitrary, the distances are only meaningful relative to each other. Whereas the models performed similarly in estimating γ, the CTCRW performed slightly better than the other models in obtaining precise location estimates; however, the GDCRW models improved when additional locations were estimated, as expected. For the GDCRW, 46% of the estimated locations were closer to the true location than the corresponding estimated location from the CTCRW. For the GDCRW_{8}, 46.9% of the estimated locations were closer to the true location than the CTCRW, while it was 47.9, 48.5, and 49.6%, respectively, for the GDCRW_{4}, GDCRW_{2}, and GDCRW_{1}. Consequently, the median relative distances where 1.0125 (GDCRW), 1.0075 (GDCRW_{8}), 1.0043 (GDCRW_{4}), 1.0021 (GDCRW_{2}), and 1.0003 (GDCRW_{1}). Further, the range of relative distances narrowed when the number of estimated locations increased. When no additional estimated locations were included, the length of the range on logscale was 9.458 (−5.317; 4.141). Based on these 200 simulations, the range gradually narrowed for the GDCRW_{8} (8.894; min: −5.024; max: 3.87), GDCRW_{4} (7.711; min: −4.069; max: 3.642), GDCRW_{2} (6.746; min: −3.252; max: 3.494), and GDCRW_{1} (6.655; min: −2.938; max: 3.718).
This simulation study evaluated the effect of the discretization used by the GDCRW when the true movement was continuous in time. Even when assuming constant velocity between observations, the GDCRW performed well compared to the CTCRW from which the data was simulated. Although the GDCRW with estimated locations at the time of observations performed well, performance could be improved by including additional estimated locations between observations. Including additional estimated locations made the time discretization finer, thereby allowing the velocity to change between observations and improving the EulerMaruyama approximation. For a sufficient number of additional estimated locations, the discretized movement model would resemble the true continuous time model.
Comparing the GDCRW and the DCRW
Above it was established that the DCRW was a special case of the GDCRW. One of the ways the DCRW was generalized was that the GDCRW allowed estimated locations at arbitrary time points. In contrast, the DCRW only allowed regular time steps in discretizing the underlying SDE. In this simulation study the effect of the choice of discretization on parameter estimates and trajectory reconstruction was investigated. Trajectories were simulated from the underlying continuous time model in the special case that would reduce to the DCRW if the time steps were regular. The comparison consists of three parts. The first two parts compare how the DCRW and GDCRW are able to estimate parameters of the continuous time movement. The comparisons are made for a tortuous and a persistent movement, respectively. The final part compares how the DCRW and GDCRW are able to reconstruct the true trajectories, and how they are influenced by measurement error.
Datasets were simulated from two behavioural scenarios: a tortuous movement scenario with low autocorrelation and nonzero mean turning angle, and a persistent movement scenario with high autocorrelation and a mean turning angle equal to zero. The scenarios were selected to reflect the typical behavioural states the DCRW is used for (e.g^{13,21}.). For each scenario, 200 trajectories with 250 observations were simulated. Measurements were simulated from a normal distribution with covariance \({{\rm{\Sigma }}}_{{\rm{o}}}={0.1}^{2}{I}_{2}\).
Both the DCRW and the GDCRW are discretizations of the true model. As a result, the models are expected to perform better when the number of estimated locations is increased, since increasing the number of estimated locations makes the time discretization finer. For each trajectory, a total of 10 models were fitted. Both the GDCRW and the DCRW were fitted with \(N=\mathrm{250,500,750,1000,}\) and 1250 estimated locations. In the DCRW, the estimated locations were included at regular time intervals. As mentioned above, observations were, on average, 1.99 time units apart. Consequently, the average trajectory length was approximately 497.5 time units. Therefore, the number of estimated locations (N) corresponds to estimated locations every 1.99, 1, 0.66, 0.5, and 0.4 time unit, respectively. In the GDCRW, 250 estimated locations were aligned with the observations. Remaining estimated locations were included recursively in the middle of the longest time step. For simplicity, the DCRW models were fitted with S_{11} = S_{22} and S_{12} = S_{21} = 0. The GDCRW models were fitted with γ = γ_{1} = γ_{2}, μ = 0, S_{11} = S_{22}, and S_{12} = S_{21} = 0 to reduce to the same parameters as the DCRW models.
Persistent movement
The first scenario emulated a persistent movement with θ = 0 and \({\gamma }_{1}={\gamma }_{2}\mathrm{=0.9}\). The movement was simulated with diagonal S such that \({S}_{11}={S}_{22}={e}^{2}\).
In this movement scenario, the GDCRW and the DCRW corrected by equation (4) provided estimates close to the true values for both parameters for all five number of estimated locations (Fig. 3). For the GDCRW, the average γ estimates ranged from 0.898 to 0.905, and the average θ estimates ranged between −0.001 and 0. Likewise, the average DCRW estimates corrected by equation (4) ranged from 0.899 to 0.902 for γ, and from −0.001 to −0.001 for θ. Unlike for the GDCRW, the estimates obtained directly from the DCRW were not close to the true values. For the DCRW models, the average γ estimates increased with the number of estimated locations from 0.815 to 0.959. The average θ estimates were all close to zero (min: −0.001; max: 0.000); however, their standard deviation decreased from 0.035 to 0.007 as the number of estimated locations increased.
Similar to the comparison with the CTCRW, the GDCRW performed well for this persistent movement simulation study. For the rotation parameter, θ, the estimates barely changed by including additional location estimates, whereas the standard deviation decreased slightly by increasing the number of estimated locations. Likewise, when the DCRW parameters were corrected by equation (4), the DCRW performed well for all five number of location estimates. When the movement was persistent, changes in velocity between observations were small; therefore, few estimated locations were needed for the discretized movement to resemble the true continuous time trajectory.
Tortuous movement
The second scenario mimicked a more tortuous movement. Movement was simulated with \(\theta =\frac{\pi }{3}\approx 1.05\), μ = 0, and \({\gamma }_{1}={\gamma }_{2}=0.6\). Similar to the first scenario, the movement was simulated with \({S}_{11}={S}_{22}={e}^{2}\).
The GDCRW provided γ estimates close to the true value for all five numbers of estimated locations with a slight tendency to overestimate the value (Fig. 4). The average γ estimates (standard deviation) were 0.622 (0.059), 0.622 (0.059), 0.627 (0.061), 0.617 (0.064), and 0.614 (0.066) for the GDCRW with \(N=\mathrm{250,500,750,1000,}\) and 1250 estimated locations, respectively. A difference was, however, seen for the θ estimates. The GDCRW with 250 and 500 estimated locations provided downwards biased estimates of θ. Similarly to the persistent movement scenario, the estimates obtained directly from the DCRW were not close to the true values. For these models, the average γ estimates increased from 0.562 to 0.824 with the number of estimated locations, while the average θ estimates decreased from 1.851 to 0.426, illustrating the timescale dependence on the parameter estimates. After correcting the DCRW parameter estimates by equation (4), their values came closer to the true parameter value as the number of estimated locations increased. The average γ estimates decreased from 0.749 to 0.62, while the average θ estimates increased from 0.924 to 1.06.
Unlike the two previous simulation studies, additional estimated locations were necessary for the tortuous movement scenario. For both the GDCRW and the DCRW corrected by equation (4), estimates of the rotation parameter, θ, improved by including additional estimated location. The autocorrelation parameter was estimated accurately by the GDCRW for all five number of estimated locations. For the DCRW corrected by equation (4), the estimation improved with the number of estimated locations.
When the movement was tortuous, changes in velocity between observations were larger and more frequent than for a persistent movement. Consequently, more estimated locations were needed for the GDCRW to resemble the true continuous time movement.
From these comparisons with the DCRW, it was evident that parameter estimates of the DCRW depended on the time between estimated locations. When the time between estimated locations decreased, the autocorrelation parameter, γ, of the DCRW would increase, while a nonzero turning angle, θ ≠ 0, would decrease. While the parameter estimates could be corrected by equation (4), an advantage of the GDCRW is that the parameter estimates are independent of the time scale.
Effect of measurement error on the choice of time steps
In the final simulation study, the GDCRW and DCRW were compared once more. While the previous comparison focused on parameter estimates, this simulation study compared the ability to accurately obtain location estimates for different ratios of process variability and measurement variability. Again, 200 trajectories with 250 observations were simulated from two behavioural scenarios using the same procedure as for the two previous simulation studies. The same movement parameters as in the previous simulation study were used, and the GDCRW and DCRW with 250 estimated locations were fitted in the same manner as before; however, the observations were simulated with covariances \({e}^{2\cdot 6}{I}_{2},{e}^{2\cdot 5.5}{I}_{2},\ldots ,{e}^{2\cdot 1.5}{I}_{2},{e}^{2\cdot 2}{I}_{2}\), respectively. For each estimated location, the distance to the true location was calculated and compared between the GDCRW and the DCRW. Comparisons were made by dividing GDCRW distances with the corresponding DCRW distances to obtain relative distances. The DCRW locations were interpolated linearly to get values at the time of observations.
When the measurement standard deviation was low, the GDCRW performed notably better than the DCRW (Fig. 5). In both scenarios, 90% of the simulated locations had a relative distance less than 1; that is, the locations were better estimated by the GDCRW than the DCRW. As the measurement standard deviation increased relative to the movement standard deviation, the median relative distance tended to one, suggesting that the GDCRW and the DCRW returned similar location estimates.
When measurement error was low, and the number of estimated locations were the same as the number of locations, the GDCRW was better at reconstructing the true trajectories than the DCRW. Hence, it was better to have the estimated locations at the time of the observations than regularly spaced in time, because when the estimated locations were regularly spaced in time, several observations could be related to the same estimated locations. Consequently, the estimated location must provide a reasonable compromise for all related observations which was a disadvantage compared to the GDCRW when the measurement error was small. As the measurement error of each observation increases, the uncertainty of the GDCRW location estimate also increases and the two methods give similar results.
Case study
To illustrate its practical applicability, the GDCRW model was fitted to the subadult ringed seal Pusa hispida data of Albertsen et al.^{7} in this case study. Albertsen et al.^{7} fitted both the CTCRW and DCRW to the data. The dataset was chosen to illustrate the movement model when substantial measurement errors are present. The ringed seal data consists of N = 3583 locations collected by the Argos satellite system, which is known to have substantial nonGaussian measurement errors. Three observations with location class Z were excluded. In the first period of the track, the seal was moving northwest, followed by two periods with restricted space use close to land. Consequently, Thygesen et al.^{27} found indications that the CTCRW was not sufficient, but a nonconstant drift parameter was needed. To account for this behaviour, the mean velocity parameters of the GDCRW were set to be location dependent:
The bivariate mean velocity field \(\mu ({X}_{t})=({\mu }_{lat}({X}_{t}),{\mu }_{lon}({X}_{t}{))}^{T}\) was modelled through a 15 × 15 grid of random effects. Note that like Johnson et al.^{20}, the argosTrack package used for inference has latitude as the first coordinate^{26}. The order of the coordinates only affects the sign of θ. Each grid point was related to two random effects: one for the mean latitudinal velocity, μ_{1}, and one for the mean longitudinal velocity, μ_{2}. Between the grid points, the random effect values were interpolated by a search tree approximation to local polynomial regression^{28}. In this case study, the number of random effects to construct the grids were selected as a tradeoff between computational cost and flexibility. Locations of the movement process were estimated at every observation and every hour from the first observation, giving a total of 7459 location estimates each with two coordinates. The number of additional estimated locations was selected to include as many as possible while limiting the computational cost. Following Albertsen et al.^{7}, the measurements were modelled by a bivariate tdistribution with scale matrices and degrees of freedom depending on the location class of the observations. The model was fitted to the data by maximum likelihood through the Laplace approximation using TMB and argosTrack. Estimating the mean velocity field in TMB was time and memory consuming; therefore, the model was run on a high performance computing cluster with 128 GB memory.
Having a location dependent drift in the model, the animal movement may be related to available resources or hydrographic variables. The model cannot be estimated using the DCRW since the DCRW does not include a drift parameter. Furthermore, the model cannot easily be estimated using CTCRW without assuming constant velocity between estimated locations, in which case the CTCRW is a special case of the GDCRW (see gdcrwVSctcrw).
The fitted model resulted in small estimated longitudinal mean velocity field values (Fig. 6). On the contrary, the latitudinal mean velocity field was estimated to have a southward drift when the seal was south of the 57.5°N parallel. Despite the southward drift, the seal then entered a large area of northward drift between 57.5°N and 63.5°N. North of the 63.5°N parallel, the latitudinal mean velocity field was estimated to have a southward attraction, keeping the seal close to the 63.5°N parallel.
Further, the model estimated an overall mean turning angle, θ. This parameter was estimated to be significantly different from zero (Table 2), albeit close, whereas both the latitudinal and longitudinal autocorrelation parameters were estimated to be close to 0.56. Combined, this indicated a slightly tortuous movement; slightly circular, with neither high nor low autocorrelation when the drift was zero. When the drift in either direction was not zero, the movement was directed, and deviations from the drift were slightly tortuous. The estimated autocorrelation was lowered by including the velocity field in the model. Albertsen et al.^{7} estimated the autocorrelation in the CTCRW to be 0.65. Whereas the CTCRW explained directed movement by a high autocorrelation, this case study explained directed movement by a drift related to the seal’s position. Naturally, the drift lowers the remaining autocorrelation in the trajectory. All movement parameter estimates and their standard errors are seen in Table 2.
Discussion
The GDCRW introduced here generalizes the DCRW movement model of Jonsen et al.^{13} in three ways. Firstly, whereas the DCRW handles irregular observations by linear interpolation in the observational model, the GDCRW allows modelling irregular observations directly in the movement process. Through this construction, timescale corrections of the DCRW parameters were found by considering a time regular GDCRW. Secondly, the GDCRW generalizes the DCRW by allowing different autocorrelation parameters in the two coordinates. Hence, when the rotation parameter is zero, the GDCRW resembles the integrated velocity model of the CTCRW of Johnson et al.^{20}. Like the CTCRW and DCRW, the GDCRW includes the random walk as a limiting case. When the autocorrelation parameter approaches zero, the movement model approaches that of a random walk, which has been used in several studies, in particular when the data includes only daily observations (e.g.^{29,30}). Thirdly, the GDCRW extends the DCRW by including a drift term. Because the GDCRW is more general, it is also computationally more demanding to fit the model to data (see Supplementary Material S4). While the GDCRW has fewer latent variables than the CTCRW, and the same number as the DCRW, the transition equation includes matrix exponentials, which are computationally demanding. In the implementation used here, the matrix exponentials are calculated in each time step^{26}; however, estimation can, in general, be made faster by reusing calculations.
When the true animal movement is continuous in time, the GDCRW is a discrete time approximation. The quality of the approximation can be improved by including additional estimated locations between observations. The additional estimated locations make the time discretization finer, effectively allowing the discrete time movement to change several times between observations, similar to the continuous time movement. Likewise, the DCRW provides a discrete time approximation; but, with regular time steps. In the simulation studies, it was shown that it is generally better to place estimated locations at the time of observations than regularly throughout the track. This was particularly true when measurement error was low. Nonetheless, when computational power is limited, regular time steps can be used to decrease the number of estimated locations to fewer than the number of observations. Both the GDCRW and DCRW can be improved as approximations to continuous time movement by adding additional estimated locations between observations. The GDCRW, and the DCRW as a special case, is an EulerMaruyama approximation of the continuous time movement defined by equations (1) and (2). Increasing the number of estimated locations will make the time discretization of the approximation finer, thereby improving it. While more estimated locations are better, the number used in practice will often be limited by the increasing computational complexity of the estimation method. Further, the number needed depends on the type of movement analysed. For persistent movement trajectories with few changes, few additional estimated locations are needed. For tortuous movement trajectories with rapid changes of velocity and direction, more additional estimated locations are needed for accurate inference. The simulation studies suggest that from a certain point, the advantage of including more estimated locations will be limited. In the comparison with the DCRW, parameter estimates of the GDCRW did not improve much when more than 750 locations were estimated for 250 observations (Figs 3 and 4). When the number of additional estimated locations is a concern, the analysis can start by only including estimated locations at the time of observations and gradually increase the number of additional estimated locations until the inference is no longer improved.
Including a drift term in the model is useful for animals moving persistently between areas, such as the ringed seal analysed in the case study. In the case study, the drift term was modelled by a location dependent field. Having location or time dependent parameters in the SDE greatly complicates the calculations to obtain an analytical solution and in turn a fully continuous time model. Therefore, location dependent mean velocities could not easily be implemented in the CTCRW which has previously been used to analyse the track^{7,27}. Nonetheless, using the discrete time model, or a discrete time approximation, it is straightforward to extend the movement model with a location dependent drift. Using the same methods, spatial covariates such as sea surface temperature or depth could be included.
While the mean turning angle was estimated to be close to zero using the entire track, this may not be the case if, for instance, a time varying parameter or behavioural switching was included. Changes in behavioural states, \({S}_{{t}_{i}}\in \mathrm{\{1,}\,\mathrm{2,}\ldots ,n\}\), can be modelled by a Markov Chain such that
where Q is an n × n matrix such that \({Q}_{jk}\mathrm{ > 0}\) for \(j\ne k\) and \({Q}_{jj}={\sum }_{\{k:j\ne k\}}{Q}_{jk}\). The movement model is modified to have different parameters depending on the current behavioural state
This model generalizes the behavioural switching movement models of Jonsen et al.^{13} and Whoriskey et al.^{21}. It assumes that switching only occurs at the predetermined time points. That is, between two estimated locations, the animal only exhibits one type of behaviour and moves at a constant velocity; however, as it has been shown in both the case study and simulation studies, additional locations can be estimated either at regular time points, or at the time of observations and any time between them. While the extension to include behavioural switching is simple, maximum likelihood estimation is greatly complicated when measurement error is present. When observations are obtained without error, behavioural states and parameters can be estimated using hidden Markov model methods (e.g.^{21}).
The GDCRW can be constructed as a discretization of a generalization of the CTCRW. The first simulation study comparing the GDCRW and CTCRW suggests that using the analytical continuous time solution, when it can be found, generally provides better results than a discrete time approximation. Nonetheless, the discrete time approximation provided accurate parameter estimates, even when no additional estimated locations were included to improve the approximation. While the GDCRW was barely outperformed by the CTCRW in estimating true locations when the CTCRW was the true model, the first simulation study illustrated that the performance of the GDCRW could be improved to closely resemble the CTCRW. The performance of the GDCRW could be improved by including additional estimated locations between observations, because the movement process is then discretized on a finer time grid. Therefore, if the only aim is to reconstruct the true trajectory, any of the models perform well. Otherwise, the choice between the models should be based on which parameters are important to estimate.
The second simulation study compared the DCRW and the GDCRW in their ability to reestimate the movement parameters of the underlying continuous time model. In both the tortuous and persistent movement scenarios, the GDCRW performed well; however, in the tortuous movement scenario, the performance was notably improved by adding additional estimated locations between observations. Further, it became evident that the parameter estimates of the DCRW are highly dependent on the selected time steps between estimated locations. Nevertheless, the estimates could be corrected by equation (4): the GDCRW with regular time steps. This correction provided estimates close to the true values in all cases except in the tortuous movement case with 250 and 500 estimated locations. Clearly, the ability to have timescale independent movement parameters is a key feature of the GDCRW, regardless of whether it is used with regular or irregular time steps. Timescale independent movement parameters allow analysis of several animals without using the same time steps. Further, they allow comparing results from previous studies using the DCRW even if different time steps are used.
Besides accuracy of the estimates, the DCRW and GDCRW were compared on their ability to reconstruct the movement tracks. The third simulation study showed that modelling timeirregular data through a timeirregular movement could increase the accuracy of predicted locations. This is consistent with previous findings comparing the DCRW and CTCRW, which showed that the continuous time model performed better if both movement models were combined with tdistributed measurement errors^{7}. Overall, using the GDCRW with irregular time steps performed well in both scenarios compared to the DCRW. For small measurement errors, the irregular time steps outperformed the regular time steps with interpolation. For medium measurement errors, the two approaches performed equally in the persistent movement scenario, whereas the DCRW slightly outperformed the GDCRW in the tortuous movement case. For large measurement errors, the two approaches performed equally in both scenarios. Combined with the previous simulation study, these results suggest that the GDCRW performs well compared to the DCRW. In general, it is an advantage to have estimated locations at the time of observations for reconstructing trajectories. However, both models improve as approximations to the same underlying continuous time movement model when the number of estimated locations increases.
In this paper, a new method for dealing with telemetry data made up of time irregular locations was introduced. The model introduced was shown to generalize the DCRW and it was evaluated through simulation studies. Finally, the applicability of the model was illustrated by a case study including real telemetry data with time irregular observations. Directly modelling irregular data eases interpretation and comparison of movement parameters, because they are defined independently of the time between observations. From equation (4), parameters of the DCRW and GDCRW can be transformed to any time scale. However, the simulation study shows that fitting the DCRW with a poorly chosen time step introduces bias. This poses a problem in population and meta studies. With a discrete time model, all animals must be fitted with the same time steps to compare movement parameters. However, if the same time step is not optimal for all animals, bias can be introduced. Using an irregular or continuous time model transforms the parameters to a common time scale, even when the animals are observed, or behave, at different time scales.
Data Availability
Data analysed in the case study are available with the R package argosTrack in the Zenodo repository, https://doi.org/10.5281/zenodo.1420418. Code for the case study is available from the supplementary material and from GitHub, https://github.com/calbertsen/argosTrack/tree/c3a08ab/argosTrack/inst/examples/gdcrw_velocityfield.
References
Hussey, N. E. et al. Aquatic animal telemetry: A panoramic window into the underwater world. Science 348, https://doi.org/10.1126/science.1255642 (2015).
Costa, D. P. et al. Accuracy of ARGOS locations of pinnipeds atsea estimated using fastloc GPS. PLoS One 5, e8677, https://doi.org/10.1371/journal.pone.0008677 (2010).
Fournier, D. A. et al. Ad model builder: using automatic differentiation for statistical inference of highly parameterized complex nonlinear models. Optim. Methods Softw. 27, 233–249, https://doi.org/10.1080/10556788.2011.597854 (2012).
Kristensen, K., Nielsen, A., Berg, C., Skaug, H. & Bell, B. Tmb: Automatic differentiation and laplace approximation. J. Stat. Softw. 70, 1–21, https://doi.org/10.18637/jss.v070.i05 (2016).
AugerMéthé, M. et al. Statespace models’ dirty little secrets: even simple linear gaussian models can have parameter and state estimation problems. Sci. Reports 6, 26677, https://doi.org/10.1038/srep26677 (2016).
AugerMéthé, M. et al. Spatiotemporal modelling of marine movement data using template model builder. Mar. Ecol. Prog. Ser. 565, 237–249, https://doi.org/10.3354/meps12019 (2017).
Albertsen, C. M., Whoriskey, K., Yurkowski, D., Nielsen, A. & Flemming, J. Fast fitting of nongaussian statespace models to animal movement data via template model builder. Ecology 96, 2598–2604, https://doi.org/10.1890/142101.1 (2015).
Chambault, P. et al. Sea surface temperature predicts the movements of an arctic cetacean: the bowhead whale. Sci. Reports 8, https://doi.org/10.1038/s41598018279661 (2018).
Winton, M. et al. Estimating the distribution and relative density of satellitetagged loggerhead sea turtles using geostatistical mixed effects models. Mar. Ecol. Prog. Ser. 586, 217–232, https://doi.org/10.3354/meps12396 (2018).
Pedersen, M. W., Patterson, T. A., Thygesen, U. H. & Madsen, H. Estimating animal behavior and residency from movement data. Oikos 120, 1281–1290, https://doi.org/10.1111/j.16000706.2011.19044.x (2011).
McClintock, B. T., Johnson, D. S., Hooten, M. B., Ver Hoef, J. M. & Morales, J. M. When to be discrete: the importance of time formulation in understanding animal movement. Mov. Ecol. 2, 21, https://doi.org/10.1186/s4046201400216 (2014).
Morales, J. M., Haydon, D. T., Frair, J., Holsinger, K. E. & Fryxell, J. M. Extracting more out of relocation data: Building movement models as mixtures of random walks. Ecology 85, 2436–2445, https://doi.org/10.1890/030269 (2004).
Jonsen, I. D., Flemming, J. M. & Myers, R. A. Robust state–space modeling of animal movement data. Ecology 86, 2874–2880, https://doi.org/10.1890/041852 (2005).
Gurarie, E., Andrews, R. D. & Laidre, K. L. A novel method for identifying behavioural changes in animal movement data. Ecol. Lett. 12, 395–408, https://doi.org/10.1111/j.14610248.2009.01293.x (2009).
Tracey, J. A., Zhu, J. & Crooks, K. R. Modeling and inference of animal movement using artificial neural networks. Environ. Ecol. Stat. 18, 393–410, https://doi.org/10.1007/s1065101001388 (2010).
McClintock, B. T. et al. A general discretetime modeling framework for animal movement using multistate random walks. Ecol. Monogr. 82, 335–349 (2012).
Michelot, T., Langrock, R. & Patterson, T. A. moveHMM: an r package for the statistical modelling of animal movement data using hidden markov models. Methods Ecol. Evol. 7, 1308–1315, https://doi.org/10.1111/2041210x.12578 (2016).
Turchin, P. Quantitative Analysis of Movement: Measuring and Modeling Population Redistribution in Animals and Plants (Sinauer, Sunderland, MA, U.S.A, 1998).
Jonsen, I. Joint estimation over multiple individuals improves behavioural state inference from animal movement data. Sci. Reports 6, 20625, https://doi.org/10.1038/srep20625 (2016).
Johnson, D. S., London, J. M., Lea, M.A. & Durban, J. W. Continuoustime correlated random walk model for animal telemetry data. Ecology 89, 1208–1215, https://doi.org/10.1890/071032.1 (2008).
Whoriskey, K. et al. A hidden markov movement model for rapidly identifying behavioral states from animal tracks. Ecol. Evol. 7, 2112–2121, https://doi.org/10.1002/ece3.2795 (2017).
Gurarie, E. et al. Correlated velocity models as a fundamental unit of animal movement: synthesis and applications. Mov. Ecol. 5, https://doi.org/10.1186/s4046201701033 (2017).
Pedersen, A. R. A new approach to maximum likelihood estimation for stochastic differential equations based on discrete observations. Scand. J. Stat. 22, 55–71 (1995).
Pedersen, A. R. Consistency and asymptotic normality of an approximate maximum likelihood estimator for discretely observed diffusion processes. Bernoulli 1, 257–279, https://doi.org/10.3150/bj/1193667818 (1995).
R Core Team R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2018).
Albertsen, C. M. argosTrack: Fit Movement Models to Argos Data for Marine Animals, https://doi.org/10.5281/zenodo.1420418 R package version 1.2.2, https://github.com/calbertsen/argosTrack/tree/v1.2.2 (2018).
Thygesen, U. H., Albertsen, C. M., Berg, C. W., Kristensen, K. & Nielsen, A. Validation of ecological state space models using the laplace approximation. Environ. Ecol. Stat. 24, 317–339, https://doi.org/10.1007/s1065101703724 (2017).
Albertsen, C. M. covafillr: Local Polynomial Regression of State Dependent Covariates in StateSpace Models, R package version 0.4.3, https://CRAN.Rproject.org/package=covafillr (2018).
Nielsen, A., Bigelow, K. A., Musyl, M. K. & Sibert, J. R. Improving lightbased geolocation by including sea surface temperature. Fish. Oceanogr. 15, 314–325, https://doi.org/10.1111/j.13652419.2005.00401.x (2006).
Lam, C., Nielsen, A. & Sibert, J. Incorporating seasurface temperature to the lightbased geolocation model TrackIt. Mar. Ecol. Prog. Ser. 419, 71–84, https://doi.org/10.3354/meps08862 (2010).
Author information
Authors and Affiliations
Contributions
C.M.A. conceived and conducted the study, analysed the results and wrote the manuscript.
Corresponding author
Ethics declarations
Competing Interests
The author declares no competing interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Albertsen, C.M. Generalizing the firstdifference correlated random walk for marine animal movement data. Sci Rep 9, 4017 (2019). https://doi.org/10.1038/s4159801940405z
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s4159801940405z
This article is cited by

EQRSRL: an energyaware and QoSbased routing schema using reinforcement learning in IoMT
Wireless Networks (2023)

Multifractal Characteristics on Multiple Pollution Variables in Malaysia
Bulletin of the Malaysian Mathematical Sciences Society (2022)

VaryingCoefficient Stochastic Differential Equations with Applications in Ecology
Journal of Agricultural, Biological and Environmental Statistics (2021)
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.