Introduction

Ecologists have long been interested in the relative roles of intrinsic and extrinsic factors in population dynamics1. Analysis of time series of population abundance has been the most common method to understand the drivers of population dynamics across many species2,3,4,5. Data on population abundance of many species, including ecologically and economically important species like large ungulates6, are derived using various approaches that often include indirect counting methods. Large ungulates are typically distributed over large areas in habitats that conceal animals, preventing complete detection and driving the use of indirect counting methods7,8,9. In some circumstances, it is possible to estimate true population sizes from indirect counts using ecological and statistical models. In general, however, population time series of species like large ungulates inherently include error10 and yet most previous time-series analyses have not explicitly been able to account for error. Therefore, methods that account for the sources of error in time-series estimates are needed in ecology.

The two major components of error in any time series of population counts are observation and process error. Observation error, as the name suggests, results from variation in the (observation) methodology used to obtain the population count. Sources of observation error are many and can include: difficulty in detecting animals distributed over wide spatial scales; terrain, field conditions or observer experience that prevent animals from being detected; harsh environmental conditions that hinder logistics and replicability of counting animals; untrained observers; lack of technical expertise; insufficient funding; and human error8,9,11. Process error, on the other hand, is usually thought of as variation in true population size due to biotic or abiotic processes; that is, the real drivers of population fluctuations that ecologists are interested in quantifying. Process error can often get overlooked in statistical modelling, however, because of the inability of most traditional time-series methods to capture multiple complex population processes12. The trade-offs inherent to keeping models of population dynamics simple, accurate and meaningful can lead to models being incapable of capturing complex intra- and inter-species, life stage, trophic and community interactions13. Moreover it is often not possible to observe all life stages of a long-lived species, which is why population counts often fail to detect hidden population states14 and why time-lagged responses need to be incorporated in population models15. Despite these challenges, ecologists have long recognized the importance of separating observation from process error in ecological modelling.

Bayesian state-space models are being increasingly used to analyse population time series to separate process and observation error10,16,17,18,19,20. The last twenty years have seen the growing use of Bayesian statistical methods to address ecological questions21,22,23, which has been driven by advances in widely available software that have made these approaches easier to implement24. The appeal in using state-space models lies in their hierarchical approach that decomposes a population time series into not only growth parameters (depending on the choice of the population model) and process error, but also underlying observation error. Because of the conflict between the roles of density-dependence and density-independence (i.e., abiotic or climate variation) in regulating populations, accounting for observation error in population time series will be important to accurately quantify impacts of factors like climate change in the future. Past studies have often conducted naïve time-series analysis without explicitly acknowledging the role of observation error25. Although there has been a growing trend in acknowledging the presence of both observation and process error in time-series data10,12,26,27,28, important questions regarding trends, variation, relative role and the drivers of error in population time series remain unanswered.

We conducted a comparative global analysis of the population dynamics of two contrasting large herbivores using a Bayesian state-space modelling approach (hereafter referred to as BSS) with the goal of better understanding the sources of observation and process error in population time series and variation in the strength of both across populations. We used a time-series dataset that combined 27 Cervus (elk in North America and red deer in Europe) and 28 Rangifer (caribou in North America and reindeer in Europe) populations because of their broad ecological and economic importance. The four major objectives of the study were: 1) estimating and examining variation in the magnitude of observation and process error; 2) quantifying the contributions of ecological and methodological variables to the variance in observation and process error; 3) comparing the BSS estimates of error with estimates of error derived from standard non-Bayesian autoregressive integrated moving average (hereafter referred to as ARIMA) models; and finally, 4) quantifying the contributions of ecological and methodological variables to the variance in the strength of statistical direct density-dependence. All four objectives were applied to the dataset that combined time series of both focal genera (hereafter referred to as the combined dataset), in addition to the time-series datasets of each of the two focal genera.

The methodological explanatory variables included the temporal components of any time series (time-series length and proportion of missing data) and the population estimation method (e.g., aerial survey, ground survey, etc.). With respect to ecological variables, predation impacts growth of Cervus and Rangifer populations29 and has been found to be a significant predictor in dynamics and variation of both species30,31. Therefore we considered whether populations were broadly subject to predation by humans and large carnivores. Climate drives forage availability, which in turn drives population growth. Latitude, being a key determinant of climate and primary production32, was therefore used as a simple ecological explanatory variable, serving as both a direct surrogate for climate and an indirect surrogate for forage availability. As statistical direct density dependence was a necessary component of the chosen population model, we also analysed the overall variation in statistical direct density dependence estimates found across all populations and discuss potential drivers of variation in the strength of direct density dependence across the combined dataset and across the independent time-series datasets of each of the two focal genera. The intent of our analysis was not to derive the most parsimonious model of factors influencing the strength of direct density dependence for either species, and so inferences about the strength of density dependence deriving from our analyses are limited.

Results

Bayesian state-space modelling revealed high variation in both observation (0.0002–0.166) and process (0.0003–0.248) error across the combined dataset (Figs. 1, 2, Supplementary Information Table A2). Process error was much greater than observation error, with process error exceeding observation error for 23 of the 27 (85%) Cervus and 18 of the 28 (68%) Rangifer populations (Supplementary Information Table A2). In general, Bayesian estimates of error were higher for Rangifer (observation error: mean μ = 0.039; process error: μ = 0.057) than Cervus (observation error: μ = 0.015; process error: μ = 0.040) populations (Fig. 3). Bayesian estimates of observation and process error were strongly correlated across populations within the combined dataset (r = 0.73, n = 55, p < 0.001) and also in the set of Cervus (r = 0.77, n = 27, p < 0.001) and in the set of Rangifer (r = 0.74, n = 28, p < 0.001) populations separately.

Figure 1

Location map (generated using ArcGIS) of 27 globally distributed Cervus populations and time series of a subset of eight populations.

For each of the eight populations: the solid blue line is the empirical time-series data, the dashed black lines are the upper and lower Highest Posterior Density (HPD) estimates by a first-order autoregressive state-space model and the yellow and green bars are observation (O error) and process error (P error) estimated by the state-space models. The secondary right y-axis is for the observation and process error bar graphs and is the same scale for all figures, enabling direct inter-population comparisons. The populations are (starting with the top-left figure and moving clockwise): Northern Range, Yellowstone, USA; Sikhote-Alin Zapovednik, Russia; Population 4, Norway; Bialowieza Primeval Forest, Poland; Petite Pierre National Reserve, France; Isle of Rum, Scotland; Ya Ha Tinda Cervus herd, Banff National Park, Canada; Point Reyes, USA; Northern Range, Yellowstone, USA.

Figure 2

Location map (generated using ArcGIS) of 28 globally distributed Rangifer populations and time series of a subset of eight populations.

For each of the eight populations: the solid red line is the empirical time-series data, the dashed black lines are the upper and lower Highest Posterior Density (HPD) estimates by a first-order autoregressive state-space model and the yellow and green bars are observation (O error) and process error (P error) estimated by the state-space models. The secondary right y-axis is for the observation and process error bar graphs and is the same scale for all figures, enabling direct inter-population comparisons. The populations are (starting with the top-left figure and moving clockwise): Denali National Park, Alaska; Krasnoy, Russia; Tyumen, Russia; Alakyla, Finland; Palojarvi, Finland; Upernavik, Greenland; Manitsoq, Greenland; Nelchina, Alaska, USA.

Figure 3

Mean (±SE) of (a) Bayesian state-space model estimates of observation and process error and ARIMA model estimates of error and (b) Bayesian state-space and ARIMA modelling estimates of statistical direct density dependence in the detrended time series of 55 globally distributed Cervus (n = 27) and Rangifer (n = 28) populations.

On average, observation error was greatest (μ = 0.043) in time series for which data comprised harvest counts, followed by snow-track counts (μ = 0.038), aerial counts (μ = 0.014) and finally ground counts (μ = 0.001) (Fig. 4). Both time-series length (p = 0.004; Fig. 5a) and predation (p = 0.02) were negatively associated with observation error, whereas harvest counts (p = 0.03) and snow-track counts (p = 0.07) were positively associated with observation error, in the top model (AIC = −218.3; AIC of null model = −206.6) that fit the variation in observation error across the combined dataset (Table 1).

Table 1 Results of generalized linear models (GLM) that best fit the variation in Bayesian state-space (BSS) model estimates of observation error, process error and direct density dependence and autoregressive integrated moving average (ARIMA) model estimates of error and direct density dependence (β1) in the detrended time series of 55 globally distributed populations (27 Cervus and 28 Rangifer). The explanatory variables included methodological (time-series length, proportion of missing data, population estimation method) and ecological (species, latitude, presence/absence of hunting, wolves and large felids, predation [absence/presence] and number of predators [0, 1 or 2]). Only significant predictors are reported for each model and presented in order of decreasing significance.
Figure 4

Mean (±SE) of Bayesian state-space model estimates of observation error found in the detrended time series of 55 globally distributed Cervus (n = 27) and Rangifer (n = 28) populations using different survey methods: aerial counts; ground counts, which included drive counts, line counts, capture-resight counts, road counts and horseback counts; harvest counts; and, snow-track counts.

Figure 5

Generalized linear model (GLM) generated relationships between (a) Bayesian estimates of observation error, (b) Bayesian estimates of process error and (c) ARIMA estimates of error and the time series length, respectively.

The relationships (95% CI = grey shaded region) were calculated while keeping other explanatory variables constant in the respective GLMs.

In the separate analyses of observation error for each of the two species, harvest counts (p = 0.07 for Cervus; p = 0.01 for Rangifer), ground counts (p = 0.08 for Cervus; p = 0.04 for Rangifer), snow-track counts (p = 0.09 for Cervus; p = 0.02 for Rangifer) and proportion of missing data (p = 0.08 for Cervus; p = 0.01 for Rangifer) were all positively associated with observation error and were common to both top models that fit the variation in observation error across the set of Cervus (AIC = −157.6; AIC of null model = −129.8) and Rangifer (AIC = −98.22; AIC of null model = −92.36) populations (Table 2), respectively. The presence of both one predator species (p = 0.08) and two predator species (p = 0.08) were positively associated, but time-series length (p = 0.08) was negatively associated with observation error in the top model that fit the variation of observation error across the set of Cervus populations (Table 2).

Table 2 The generalized linear models (GLM) that best fit the variation in Bayesian state-space (BSS) model estimates of observation error, process error and direct density dependence and autoregressive integrated moving average (ARIMA) model estimates of error and direct density dependence in the independent sets of detrended time series of 27 Cervus populations and 28 Rangifer populations, respectively. The explanatory variables included methodological (time-series length, proportion of missing data, population estimation method) and ecological (species, latitude, presence/absence of hunting, wolves and large felids, predation [absence/presence] and number of predators [0, 1 or 2]). Only significant predictors are reported for each model and presented in order of decreasing significance.

Process error in the combined dataset increased with increasing time-series length (p = 0.004; Fig. 5b), whereas Cervus (p = 0.01), the presence of one predator (p = 0.05) and increasing latitude (p = 0.06) were negatively associated with process error in the top model (AIC = −191.2; AIC of null model = −167.1) that fit the variation in process error across the combined dataset (Table 1). In the analyses of process error for each of the two species, several predictors entered the top model that fit the variation in process error across the set of Cervus populations (AIC = −114.4; AIC of null model = −103.7): wolves (p = 0.05) and hunting (p = 0.05) were positively associated, and time-series length (p < 0.001) was negatively associated, with process error. In contrast, latitude (p = 0.08; negative association) was the only predictor of process error across the set of Rangifer populations (AIC = −87.9; AIC of null model = −73.1).

The BSS estimates of the strength of statistical direct density dependence were lower in the set of Rangifer (μ = −0.74) than in the set of Cervus (μ = −0.42) populations (Fig. 3). Time-series length was highly significant (p < 0.001) and positively associated, while the strength of density dependence was weaker for Cervus, in the top combined model that fit the variation of BSS estimates of statistical density dependence (AIC = 32.3; AIC of null model = 70.8) (Table 1). Time-series length was also both positively associated and the most significant predictor in the top models (Cervus: AIC = 21.73, AIC of null model = 25.93; Rangifer: AIC = −5.19, AIC of null model = 39.79) that fit the variation in the strength of statistical direct density dependence across the population sets of each genus (Table 2). In addition to ‘time-series length’, the presence of ‘one predator species’ (p < 0.001), ‘two predator species’ (p = 0.03) and ‘latitude’ (p = 0.004) were positively associated with direct density dependence in the model that best fit the variation in the strength of statistical direct density dependence across the Rangifer populations (Table 2).

Similar to the BSS estimates of error, the ARIMA estimates of overall error (not partitioned into observation and process error) were higher in the set of Rangifer (μ = 0.087) than in the set of Cervus (μ = 0.043) populations (Fig. 3). In general, time-series length was positively associated with the variation in ARIMA estimates of error across both the combined dataset (AIC of top model = −151.85, AIC of null model = −126.8; Fig. 5c) and the set of Cervus populations (AIC of top model = −89.11, AIC of null model = −91.19) (Tables 1 and 2).

The ARIMA estimates of variation in statistical direct density dependence, similar to the BSS estimates, were lower in the set of Rangifer (μ = −0.64) than in the set of Cervus (μ = −0.38) populations (Fig. 3). While the interaction of time-series length and predation was positively associated, predation by itself was negatively associated with statistical direct density dependence in the top model that fit the variation in ARIMA estimates of statistical direct density dependence across the combined set (AIC = 17.16; AIC of null model = 57.2; Table 1). Time-series length was positively associated with statistical direct density dependence in the top models that fit the variation in ARIMA estimates of statistical direct density dependence across both the set of Cervus (AIC = 10.45; AIC of null model = 12.83) and Rangifer (AIC = −1.9; AIC of null model = 37.0) populations, respectively (Table 2).

Discussion

Our results provide a comprehensive examination of variation in the strength and covariates of the different components of error in a large (55 populations) dataset of two species distributed across their entire global ranges. We found consistent differences in the magnitude of observation and process error between these two widely distributed species. Bayesian state-space model estimates of observation and process error were consistently greater for the more northerly species, Rangifer. The difference in observation error between the species may reflect differences in how population counts were obtained, but it is less clear what explains the difference in process error between the species. Methodologically, this suggests that comparative time series analyses that fail to account for different types of error might end up with incorrect conclusions about the relative roles of factors affecting population dynamics. For example, not accounting for observation error has been shown to hamper identifying demographic mechanisms that cause density-dependent population regulation in red-backed shrikes Lanius collurio33. Elsewhere, in a study of fossil pollen evidence for beech in southern Ontario, Canada, it was thought that the large observation error that was found overwhelmed details of the population growth process10. More generally, many authors10,20,26 have demonstrated that inferences regarding population dynamics may be clouded when conducted on naïve time-series without separating observation and process error.

Other Bayesian examples of state-space modelling of population time series have also found process error to dominate observation error. For example, an analysis of the time series of moose (Alces alces) in Bialowieza Primeval Forest, Poland10, found process error to be greater than observation error, similar to results of our analyses of a Cervus population from this same study area. However, Bayesian state-space analysis (using a model near-identical to the one used in this study) of an American redstart (Setophaga ruticilla) population in North America found greater observation error than process error27. The authors of the American redstart study thought that using data collected by the Breeding Bird Survey was the main reason for the high observation error; multiple observers with little training participate in the Breeding Bird Survey, which has the potential to compound observation error.

It is possible that the low observation error associated with both Bialowieza populations (Cervus and Alces) reflects greater precision in the estimates of the Bialowieza populations compared to the Breeding Bird Survey counts. Both Bialowieza populations are more-or-less closed populations, and counts of closed populations of large mammals normally have a higher level of precision and accuracy. This also may be why we found very low observation error for the Cervus populations at Point Reyes, on the Isle of Rum and in the Scottish highlands (C2, C12, C13; Supplementary Information Table A1), all of which are considered more-or-less closed populations and have been carefully monitored for decades33,34. In comparison, our analysis of Rangifer populations showed that indirect population estimation methods such as harvest counts yield higher observation error than other, ostensibly more accurate methods. Therefore, analyses combining time series across different methods, which is often the case in syntheses of global population dynamics2, need to pay particular attention to methodological differences in structuring error.

Comparing the BSS estimates of observation and process error to the ARIMA estimates of error highlights the level of detail that would be lost, and the potentially erroneous conclusions that could be drawn, if only ARIMA models were used. It is not reasonable to equate ARIMA error to BSS process error as ARIMA error contains both observation and process error. Comparing the predictors of the variation in BSS process error with the predictors of ARIMA error estimates across the entire dataset (Table 1) demonstrates that ARIMA estimates would tend to underestimate the difference in variation between species. For example, the ARIMA analysis lost all signatures of methodological differences in survey methods, especially for Rangifer, and underestimated differences between species. Our simple comparison of ARIMA and Bayesian state-space models confirms the results of previous studies: with the advent of these new methods, time-series analyses should take advantage of the ability to separate out process error in ecological analyses.

While we did not find many predictors for either observation or process error, a general result was that Bayesian process error increased with the time-series length, while observation error declined with the time-series length (Fig. 5). The positive relationship of process error with time-series length suggests that longer-term studies are more likely to capture shifts in the ecological processes in observed dynamics. A good example of the value of long-term time-series data that has helped capture important ecological processes (prey-predator in this case) is the 50-year long Isle Royale wolf-moose study35. Regarding the negative relationship of observation error with time-series length, perhaps observation error was lower for longer time-series because observer efficiency increases over time, and/or because animals become more accustomed to observers over time. Regardless, these counteracting effects of a simple parameter, time-series length, on process and observation error emphasize even more clearly the importance of separating out factors affecting these two components using state-space models.

The lack of significant predictors, beyond time-series length and genus, in variation in the strength of statistical direct density dependence across the pooled time series made it difficult to draw inferences regarding the drivers of variation in density dependence. Predation was, however, positively associated with density dependence in Rangifer populations, that is the presence of wolves increased the strength of density-dependence. This could be a function of wolves being important predators of Rangifer36,37, or simply a tradeoff between the strength of extrinsic and intrinsic factors in limiting populations of Rangifer. Although Cervus are known to be a significant component of all three predator (wolves, bear and mountain lion) diets38,39, only time-series length was positively associated with the variation in direct density dependence in the Cervus populations. The list of the predictors of statistical direct density dependence, however, may have been different if: 1) we had allowed the structure of statistical direct density dependence to vary among populations or species (sensu40); and 2) our models included environmental covariates that might influence the strength of direct density dependence (sensu41). It is for these reasons, therefore, that the focus of this study was confined primarily to error in time-series analysis and not population growth parameters such as direct density dependence.

In conclusion, this study highlights the usefulness of Bayesian state-space modelling of time series, without which conclusions regarding the strength of population regulatory processes like density dependence and the role of density-independent variables on population dynamics run the risk of being misleading. The differences in error between the species further highlight that estimating different components of error may be necessary to make meaningful comparative studies. Finally, this study endorses the usefulness of collecting long-term data, which increases detection of population processes while at the same time decreasing the amount of observation error.

Methods

Global Cervus and Rangifer population estimates

We obtained from published literature time series of annual abundance estimates of 27 Cervus and 28 Rangifer populations located in eight and four countries, respectively (Supplementary Information Table A1). These counts were either raw counts, or counts that were adjusted to account for issues like detectability. The non-uniformity and non-standardization in estimating methods, in addition to variation in the proportion of missing data, made this combined population time-series dataset an ideal test case to analyse error variation in population time series. There was considerable variation in the population estimation methods used, including: road counts, horseback counts, drive counts, harvest counts, mark-resight counts, snow-track counts and aerial counts with and without sightability correction (Supplementary Information Table A1). The length of the time series ranged from 12 to 48 years (μ = 30.93) for Cervus and from 10 to 74 years (μ = 28.25) for Rangifer. Seven of the Cervus time series were missing data, with a proportion of 0.04–0.22 of the years in a time series lacking data, and seven of the Rangifer time series were missing data, with a proportion of 0.01–0.43 of the years lacking data.

Markov Chain Monte Carlo (MCMC) Bayesian state-space population models

In general, time series of abundance of large herbivores inherently include components of both observation and process error42. The Markov Chain Monte Carlo (MCMC) Bayesian approach is a good choice for analysing these time series, because its two-tier modelling approach detects both observation and process error. In the first tier, the system process models the underlying ecological process, in this case population growth (and hence abundance) over time and its inherent stochasticity. In the second tier, the observation process takes into account the error associated with the population estimation method; detection of observation error is the primary advantage that state-space models provide over other dynamic population models. Direct density dependence is a common feature of the dynamics of both Cervus and Rangifer populations40,43,44, whether or not these populations additionally experience limitation by extrinsic factors such as climate or natural enemies. In general, direct density dependence is detected in a statistical framework as a first-order process, for example a negative relationship between $N_{t+1}$ and $N_t$. To ensure uniform modelling across a diverse set of populations that would allow comparing results across all 55 populations, we chose a first-order autoregressive Gompertz population growth model15. Our objectives here were to determine the factors affecting variation in the decomposed estimates of observation error and process error, and in the strength of density dependence, across populations, in order to understand the challenges of synthesizing factors affecting population dynamics across multiple populations.

In the first tier of the two-tier state-space model, for each population $i$, let $X_{i,t}$ be the time series of $\log_e$-transformed true abundances $x_{i,t}$. The statistical model approximating the inferred ecological process model of first-order density-dependent population dynamics is

$$X_{i,t} = \beta_{i,0} + (1 + \beta_{i,1})\,X_{i,t-1} + \varepsilon_{i,t} \tag{1}$$

in which $\beta_{i,0}$ represents the rate of intrinsic population growth and $\beta_{i,1}$ the strength of statistical direct density dependence in population growth14. The process stochasticity, $\varepsilon_{i,t}$, is assumed to be normally distributed with mean zero and standard deviation $\sigma_{i,x}$.

In the second tier of the state-space model, the state of the process $X_{i,t}$ is observed indirectly through estimates of abundance (observations) $Y_{i,t}$

$$Y_{i,t} \sim \mathrm{Normal}\left(X_{i,t},\ \sigma_{i,y}^{2}\right) \tag{2}$$

in which the observed abundance $Y_{i,t}$ is assumed to be randomly drawn from a normal distribution with the true ($\log_e$-transformed) abundance $X_{i,t}$ as the mean and an observation error standard deviation $\sigma_{i,y}$.
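The two tiers described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: it assumes the process tier takes the common first-order Gompertz form X[t] = beta0 + (1 + beta1) * X[t-1] + eps on the log scale, and the function name `simulate_gompertz_ss` and all parameter values are hypothetical, chosen for illustration rather than taken from the study.

```python
import random


def simulate_gompertz_ss(n_years, beta0, beta1, sigma_x, sigma_y, x0, seed=1):
    """Simulate true log-abundances and their noisy observations.

    Assumed (illustrative) form of the two tiers:
      process:      X[t] = beta0 + (1 + beta1) * X[t-1] + eps,  eps ~ Normal(0, sigma_x)
      observation:  Y[t] ~ Normal(X[t], sigma_y)
    sigma_x and sigma_y are standard deviations.
    """
    rng = random.Random(seed)
    states, observations = [], []
    x = x0
    for _ in range(n_years):
        # Process tier: density-dependent growth plus process error.
        x = beta0 + (1.0 + beta1) * x + rng.gauss(0.0, sigma_x)
        states.append(x)
        # Observation tier: the count is the true state plus observation error.
        observations.append(rng.gauss(x, sigma_y))
    return states, observations


# Hypothetical parameter values; the equilibrium log-abundance is -beta0 / beta1 = 6.
states, observed = simulate_gompertz_ss(
    n_years=30, beta0=3.0, beta1=-0.5, sigma_x=0.2, sigma_y=0.1, x0=6.0
)
```

Setting `sigma_y` to zero makes the observations coincide with the states, which is the situation implicitly assumed by analyses that ignore observation error.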

Because detrending a time series may be necessary to satisfy the assumption of stationarity, and to maintain consistency across all populations, we detrended all the population time series and ran the Bayesian state-space models on the detrended series. The state-space models were analysed using the Gibbs sampler, a special case of the Metropolis-Hastings sampler and the primary Markov Chain Monte Carlo algorithm implemented in JAGS 3.3.0 (ref. 45), using the rjags package of the R computing environment46. The Bayesian approach requires providing the state-space model with prior probabilities of the parent parameters ($\beta_{i,0}$, $\beta_{i,1}$, $\sigma_{i,x}$, $\sigma_{i,y}$), i.e., parameters that are not dependent on other parameters or data (refer to the model and R code described in Supplementary Information). We followed the general practice of providing vague prior probabilities to the parent parameters19. Posterior distributions of the unobservable processes of each population (growth parameters, observation error and process error) were derived by successive application of Bayes' theorem using 200,000 simulations after a burn-in of 100,000 simulations of the Gibbs sampler.
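The text does not state which detrending procedure was used. As one plausible reading, the sketch below removes an ordinary-least-squares linear trend in time from a series of log-abundances. The function name `linear_detrend` is hypothetical, and the sketch assumes a complete, equally spaced series (missing years would need separate handling).

```python
def linear_detrend(values):
    """Subtract the least-squares straight line (in time) from a series.

    Assumes equally spaced observations with no missing years.
    """
    n = len(values)
    times = range(n)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = v_mean - slope * t_mean
    # The residuals around the fitted line form the detrended series.
    return [v - (intercept + slope * t) for t, v in zip(times, values)]


# Hypothetical log-abundances with an upward trend.
detrended = linear_detrend([5.0, 5.3, 5.1, 5.6, 5.8, 5.7, 6.1])
```

The detrended series has zero mean and no linear trend by construction, which is what the stationarity assumption of a first-order autoregressive model requires at a minimum.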

Posterior distributions of statistical direct density dependence and of observation and process error were each summarized by their mean and 95% highest posterior density (HPD) interval; the HPD interval in Bayesian analyses is analogous to the confidence interval used in frequentist statistics. These statistics were calculated using the Bayesian Output Analysis program 1.1.5 implemented via the R computing environment.
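To make the HPD summary concrete, the following sketch computes an empirical HPD interval from a set of posterior draws as the shortest interval containing the requested share of the samples. It is a generic illustration for a unimodal posterior, not the routine used by the Bayesian Output Analysis program; the function name `hpd_interval` is hypothetical.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Shortest interval that contains `mass` of the samples (unimodal case)."""
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, math.ceil(mass * n))  # number of draws inside the interval
    best_lo, best_hi = ordered[0], ordered[window - 1]
    # Slide a window of fixed sample count and keep the narrowest one.
    for start in range(1, n - window + 1):
        lo, hi = ordered[start], ordered[start + window - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi
```

For a skewed posterior, such as that of a standard deviation close to zero, this interval is generally narrower than the equal-tailed interval with the same coverage.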

Non-Bayesian autoregressive integrated moving average (ARIMA) models

For comparative purposes, we also estimated the error and the statistical direct density dependence in each population time series using non-Bayesian autoregressive integrated moving average (ARIMA) models47. Each population time series was detrended to ensure stationarity and then fitted with a first-order ARIMA model similar to equation (1). The statistical direct density dependence and process variance within each time-series were estimated using the arima function of the ‘stats’ package in the R statistical analysis environment46.
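The idea of a first-order autoregressive fit can be sketched as follows. This is not R's arima function: it swaps in a conditional least-squares regression of each value on the previous one, which gives similar estimates for a simple AR(1) model. Reading density dependence as the autoregressive coefficient minus one assumes the Gompertz parameterization X[t] = beta0 + (1 + beta1) * X[t-1] + eps; the function name `fit_ar1` is hypothetical.

```python
def fit_ar1(series):
    """Conditional least-squares fit of x[t] = c + phi * x[t-1] + e[t]."""
    prev, curr = series[:-1], series[1:]
    n = len(curr)
    p_mean = sum(prev) / n
    c_mean = sum(curr) / n
    sxx = sum((p - p_mean) ** 2 for p in prev)
    sxy = sum((p - p_mean) * (c - c_mean) for p, c in zip(prev, curr))
    phi = sxy / sxx
    const = c_mean - phi * p_mean
    residuals = [c - (const + phi * p) for p, c in zip(prev, curr)]
    # Residual variance: a single error term that mixes observation
    # and process error, because this model cannot separate the two.
    error_variance = sum(r * r for r in residuals) / (n - 2)
    return {
        "phi": phi,
        "density_dependence": phi - 1.0,  # assumes the (1 + beta1) form
        "error_variance": error_variance,
    }
```

Because the model has one error term only, any observation error in the series ends up in `error_variance` and can also bias `phi`, which is the motivation for the state-space comparison.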

Analysing drivers of error and density dependence

We tested for ecological and methodological drivers of the variation in the Bayesian estimates of three response variables: observation error, process error and statistical direct density dependence. We also compared inferences obtained from BSS and ARIMA models as a final question. We expected observation error to be driven by the methodological factor of survey method and evaluated the effect of the following four survey method categories on observation error: i) ground counts, which included drive counts, line counts, capture-resight counts, road counts and horseback counts; ii) aerial counts; iii) harvest counts; and iv) snow-track counts. Survey method was not expected to drive process variation and, therefore, was not used to analyse process error. Two other methodological variables, length of time series and proportion of missing data, were considered for analysing observation and process error, as well as ARIMA error. We expected the time-series length to be positively related to the amount of process variation captured by the model, and missing data was a factor with obvious potential to impact error in general.

While we expected ecological variables to be more important in driving process error, we also examined their role in driving observation error. The two ecological factors that were used to analyse the different error estimates were latitude and predation. As ‘harvest counts’ was already included as a survey method category to analyse observation error, we evaluated the potential of the factor ‘hunting by humans’ only on process error. However, hunting by animal predators was analysed for both observation and process error. We included the presence of three different predators: wolves Canis lupus; large felids like mountain lions Felis concolor and Eurasian lynx Lynx lynx; and bears Ursus spp. Since wolves and bears were strongly correlated (r = 0.74, p < 0.001), we included only wolves and large felids and also tested two additional predator variables: (a) predation (categorical: absence/presence) and (b) number of predators (categorical: 0, 1, 2) for both process and observation error. Therefore, the initial model for observation error included: survey method, time-series length, proportion of missing data, latitude and hunting by animal predators; and the initial model for process error and ARIMA error included: survey method, time-series length, proportion of missing data, latitude, hunting by humans and animal predators. Since the two species occupied different habitats that could impact both process variation and potentially observation error, we tested whether species would be a significant predictor for all response variables.

As the distributions of the response variables were not uniform, we analysed their variation using generalized linear models (GLM)48. We used a stepwise approach to determine the model with the optimal set of explanatory variables based on both the lowest Akaike Information Criterion (AIC) and the lowest residual deviance. Both process and observation error were assumed to have Gaussian distributions and were analysed log-transformed, which reduced both AIC and residual deviance in their respective best-fit models. All GLM analyses were done in the R computing environment46.
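The comparison of a fitted model against a null model by AIC can be sketched for the Gaussian case as below. The sketch uses the standard definition AIC = 2k − 2 log L, counting the residual variance as an estimated parameter. The data values, variable names and helper functions are hypothetical and serve only to show the mechanics with a single predictor.

```python
import math


def gaussian_aic(residuals, n_coefficients):
    """AIC of a Gaussian linear model computed from its residuals."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    sigma2 = rss / n  # maximum-likelihood estimate of the error variance
    log_lik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    # +1 parameter for the variance, on top of the regression coefficients.
    return 2 * (n_coefficients + 1) - 2 * log_lik


def null_residuals(y):
    """Residuals of the intercept-only (null) model."""
    mean = sum(y) / len(y)
    return [v - mean for v in y]


def slope_residuals(x, y):
    """Residuals of a one-predictor least-squares regression."""
    n = len(y)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((a - x_mean) ** 2 for a in x)
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


# Hypothetical data: time-series length (years) and log-transformed error.
length = [10, 15, 20, 25, 30, 40]
log_error = [-2.0, -2.4, -2.9, -3.1, -3.8, -4.5]

aic_null = gaussian_aic(null_residuals(log_error), n_coefficients=1)
aic_length = gaussian_aic(slope_residuals(length, log_error), n_coefficients=2)
```

A lower AIC for the model that includes the predictor than for the null model indicates that the predictor improves the fit by more than the penalty for the extra parameter.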