Double-observer approach with camera traps can correct imperfect detection and improve the accuracy of density estimation of unmarked animal populations

Camera traps are a powerful tool for wildlife surveys. However, camera traps may not always detect animals passing in front of them. This constraint may create a substantial bias in estimating critical parameters such as the density of unmarked populations. We proposed the 'double-observer approach' with camera traps to counter the constraint, which involves setting up a pair of camera traps at a station and correcting imperfect detection with a reformulated hierarchical capture-recapture model for stratified populations. We performed simulations to evaluate this approach's reliability and determine how to obtain desirable data for this approach. We then applied it to 12 mammals in Japan and Cameroon. The results showed that the model assuming a beta-binomial distribution for the detection process could correct imperfect detection as long as paired camera traps detect animals nearly independently (correlation coefficient ≤ 0.2). Camera traps should be installed to monitor a predefined small focal area from different directions to satisfy this requirement. The field surveys showed that camera traps could miss animals by 3–40%, suggesting that current density estimation models relying on perfect detection may underestimate animal density by the same order of magnitude. We hope that our approach will be incorporated into existing density estimation models to improve their accuracy.

However, most of these models, except for Chandler and Royle 27 and Ramsey et al. 34 , assume that camera traps can detect animals passing in a specific area within the camera detection zone with absolute certainty (i.e. perfect detection). Rowcliffe et al. 17 suggested that the effective area of the camera detection zone (codeterminants of the camera sensitivity) can be determined using a distance sampling approach by applying detection function models to data on the position where animals were first detected (see also Howe et al. 31 ). Although this approach is an important step in quantifying camera sensitivity, it also requires perfect detection at a given point within the camera detection zone, which is still questionable.
A possible yet untested approach to correcting imperfect detection would be to apply the 'independent double-observer approach'. This approach has been used in point-count surveys based on direct observations by human observers 36–40 . In this approach, two observers (camera traps in our context) record animals concurrently and independently, and the detection probability is estimated from the match or mismatch of observation records. In surveys using camera traps, two cameras have often been installed at a station to improve detection probability and reliably identify the individuals of marked populations 7,41 . Thus, such placement may be a realistic approach in camera trapping surveys.
The capture-recapture and N-mixture models can be applied to correct imperfect detection using double-observer data 38 . The former requires that the observers confer with each other regarding each observation, while the latter may be based solely on the counts by each observer. Nichols et al. 38 applied these two models to bird surveys and reported that both models have potential, while the precision of estimates was higher in the former approach. Given that timestamps within images captured by camera traps would allow for efficient reconciliation of each count, the capture-recapture model may provide a reliable and efficient means to correct imperfect detection. Although capture-recapture models have usually been used to estimate animal abundance, the same approach may be helpful for estimating the detection probability of a camera trap, and hence the total number of animal passes.
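The match-or-mismatch logic can be illustrated with a minimal sketch. It assumes the simplest case only: two cameras that detect each pass independently and with the same probability p. If n1 passes were recorded by exactly one camera and n2 by both, then E[n1] = 2Np(1 − p) and E[n2] = Np², so p can be estimated as 2·n2/(n1 + 2·n2) and the number of passes N as (n1 + n2)/(1 − (1 − p)²). The function name and the example counts are hypothetical and are not taken from the study, and this moment estimator is not the hierarchical model developed in this paper.

```python
def double_observer_estimate(n_one, n_both):
    """Moment estimator for two independent, equally sensitive cameras.

    n_one  : number of passes recorded by exactly one camera
    n_both : number of passes recorded by both cameras
    Returns (estimated per-camera detection probability,
             estimated total number of passes, including missed ones).
    """
    if n_both <= 0:
        raise ValueError("at least one pass must be recorded by both cameras")
    p_hat = 2 * n_both / (n_one + 2 * n_both)
    # Probability that at least one of the two cameras records a pass.
    p_any = 1 - (1 - p_hat) ** 2
    n_hat = (n_one + n_both) / p_any
    return p_hat, n_hat


# Hypothetical counts: 32 passes seen by one camera only, 64 seen by both.
p_hat, n_hat = double_observer_estimate(32, 64)
print(f"detection probability: {p_hat:.3f}, estimated passes: {n_hat:.1f}")
```

The estimate of N exceeds the 96 recorded passes because some passes are expected to have been missed by both cameras.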
A constraint in applying the capture-recapture model to the double-observer approach may be the assumption that two observers (i.e. camera traps) detect animals independently. If this assumption is violated, detection probability is overestimated, and the number of animal passes is therefore underestimated. Regrettably, in practice, the detection probability of paired camera traps may depend on numerous unknown factors (e.g. animal body mass, ambient temperature, etc.), resulting in correlated detections between camera traps at a station and hence heterogeneity in detection probability among animal passes. One viable method to deal with this issue would be to assume a beta-binomial distribution instead of a binomial distribution to account for the correlated detection 42,43 . Alternatively, it may be possible to model the detection histories using a categorical-Dirichlet distribution 44 . Indeed, Clare et al. 44 successfully accommodated correlated detections by two different sampling devices to obtain reliable estimates of animal occupancy probability. However, given that the number of camera traps at a single station is at most two in our applications, the available sample size may not be sufficient to accommodate the heterogeneity in detection. Thus, it is necessary to determine how much heterogeneity this modelling approach can accommodate for a given sample size.
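The direction of this bias can be checked with a short calculation. The sketch below is an illustration under stated assumptions, not the model used in this study. It takes the per-pass detection probability to follow a beta distribution with mean mu and pairwise correlation rho, and asks what the independence-based estimator 2·n2/(n1 + 2·n2) converges to (n1 and n2 being the numbers of passes recorded by one and by both cameras). Under these assumptions P(both) = mu² + rho·mu(1 − mu) and P(exactly one) = 2·mu(1 − mu)(1 − rho), so the estimator tends to mu + rho(1 − mu), which is always at least mu. The function name is hypothetical.

```python
def naive_estimate_limit(mu, rho):
    """Large-sample value of 2*n2 / (n1 + 2*n2) for two cameras whose shared
    per-pass detection probability is beta distributed with mean `mu` and
    pairwise correlation `rho` (rho = 0 means independent detections)."""
    p_both = mu ** 2 + rho * mu * (1 - mu)    # E[p_i^2]
    p_one = 2 * mu * (1 - mu) * (1 - rho)     # 2 * E[p_i * (1 - p_i)]
    return 2 * p_both / (p_one + 2 * p_both)


# Bias of the independence-based estimator for the two mean detection
# probabilities considered in the simulations (0.8 and 0.4).
for rho in (0.0, 0.1, 0.2, 0.3, 0.5):
    print(rho,
          round(naive_estimate_limit(0.8, rho), 3),
          round(naive_estimate_limit(0.4, rho), 3))
```

The overestimate grows linearly with the correlation and is larger when the mean detection probability is low.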
This study develops a double-observer approach with camera traps, in which multiple cameras are installed at a single site to improve detection probability and the correlation between detections by the co-located cameras is accounted for using a reformulated hierarchical capture-recapture model in a Bayesian framework. Firstly, we introduced the hierarchical capture-recapture model for stratified populations. Secondly, Monte Carlo simulations were performed to evaluate the reliability of this approach, focusing on the degree to which the model can accommodate correlated detections. Thirdly, we performed additional simulations to determine desirable camera selections and placements for this approach. Fourthly, we applied the models to datasets obtained from two different habitats, in Cameroon and Japan, and quantified the detection probability of 12 mammals with varying body sizes. This survey aimed to assess the necessity of correcting imperfect detection when using widely used commercial camera traps and to evaluate the applicability of the double-observer approach in actual conditions.

Results
Testing the effectiveness of the hierarchical capture-recapture model. The results of the Monte Carlo simulations showed that the hierarchical capture-recapture models assuming a beta-binomial distribution could provide good estimates of detection probability and the number of animal passes, with reasonable credible interval coverage, as long as a pair of camera traps detects a passing animal nearly independently (Table 1). However, when the correlation coefficient was > 0.2, detection probability was overestimated, and the number of animal passes was underestimated. On the other hand, the models assuming a categorical-Dirichlet distribution could not sufficiently refine the detection probability estimates in any of the scenarios (Table 1). This pattern did not differ greatly between the higher (0.8) and lower (0.4) detection probabilities (Table 1).
Determining suitable camera traps and their installations. The simulations mimicking the detection processes of moving animals showed that camera models with a slower trigger speed produced more correlated detection histories (Table 2, Fig. 1). The degree of correlation also depended on the camera placement. Even when using a camera model with a high trigger speed, the detection histories of paired camera traps were highly correlated when monitoring the entire field of view (r = 0.53) or the small focal area from the same direction (r = 0.27). However, the correlation significantly decreased when monitoring the small focal area from different directions (r = 0.18).
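For readers who wish to compute a comparable statistic, the following is a minimal sketch of a correlation coefficient between two paired binary detection records (the Pearson correlation of 0/1 data, also known as the phi coefficient). It assumes that the outcome of every pass is known for both cameras, including passes missed by both, which holds in a simulation but not in field data. The function name and the example records are hypothetical.

```python
import math


def detection_correlation(cam_a, cam_b):
    """Pearson correlation between two equal-length 0/1 detection records,
    one entry per animal pass (1 = detected, 0 = missed)."""
    n = len(cam_a)
    if n == 0 or n != len(cam_b):
        raise ValueError("records must be non-empty and of equal length")
    mean_a = sum(cam_a) / n
    mean_b = sum(cam_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(cam_a, cam_b)) / n
    # For 0/1 data the variance is mean * (1 - mean).
    var_a = mean_a * (1 - mean_a)
    var_b = mean_b * (1 - mean_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("undefined when a camera always or never detects")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical records for eight passes.
camera_1 = [1, 1, 1, 0, 1, 0, 1, 1]
camera_2 = [1, 1, 0, 0, 1, 1, 1, 1]
print(round(detection_correlation(camera_1, camera_2), 3))
```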
Field surveys. In total, 12 species were recorded more than 10 times (7 in Japan and 5 in Cameroon), which were targets for our analyses. Although none of the species was perfectly detected, the detection probability was

Discussion
The Monte Carlo simulation showed that it is highly challenging to estimate the detection probability reliably when detections by two camera traps at a station are highly correlated (see Table 1). The model assuming a categorical-Dirichlet distribution could not sufficiently refine the detection probability estimates. The beta-binomial distribution model estimated the detection probability without bias when the correlation was small (≤ 0.2), but it overestimated the detection probability when the correlation increased. We also confirmed that increasing the number of camera stations to 100 did not improve the estimates sufficiently. The imperfect performance of the models is probably due to the significant limitations of the data available from two camera traps at a station. Clare et al. 44 showed that the model using the categorical-Dirichlet distribution allows for unbiased estimates of animal occupancy based on spatially and temporally repeated surveys using two devices with correlated detections. In their situation, a 0 in the data (i.e. not detected) at a site where an animal has been detected at least once is certain to be a "false zero" (and thus a wealth of information is available for estimating the detection probability). On the other hand, the only reliable information in our situation is the number of cameras that detected animals (1 or 2). Under this critical constraint in data availability, the only way to obtain more reliable estimates of the detection probability would be to increase the number of cameras per station. Indeed, we confirmed that, when installing ten camera traps at each station, both models could provide unbiased estimates even with data having a much higher correlation coefficient (e.g. 0.8, results not shown). Nonetheless, this installation is not practical, and it is more realistic to carefully design a survey to obtain data that are as independent as possible.
Table 2. Results of the simulations mimicking the process by which camera traps detect moving animals. In ins. 1, two camera traps were placed at the same position (i.e. mounted on the same tree) and monitored the entire field of view of the cameras. In ins. 2 and 3, camera traps monitored a specific equilateral triangle with a side length of 1.7 m from the same direction (ins. 2) or from different angles of 60 degrees (ins. 3). For each installation, the uses of camera models with a fast trigger speed (0.1 s) and a slow one (1.5 s) were considered. Detection probability indicates the number of successful detections for the total number of animal passes (200 times). The mean values of the two cameras are shown. For the details, see the main text.
The simulation mimicking the detection processes of moving animals suggested that camera traps should monitor a small focal area from different directions to obtain nearly independent data. A pair of camera traps facing the same direction have the same hazard landscape, so they often fail to detect animals passing through the periphery of the focal area. On the other hand, camera traps installed in different directions can compensate for each other's weak areas, where detection probability is lower. In addition, it may be critical to use a camera model with a high trigger speed to avoid missing fast-moving animals. There may be other viable approaches to address heterogeneity in detection probability. For example, one could measure the distance and angle of animals from the camera trap and incorporate them into the model as covariates. This approach is called MRDS (Mark-Recapture Distance Sampling), a well-developed theoretical framework 45,46 . However, this approach does not always function well 46 and also requires an intensive field survey. The installations proposed here may be much simpler and reduce labour costs. Nonetheless, it should also be noted that the results of our simulation were obtained from a single detection function, and the installation may not necessarily be the solution to the issue of dependence. The qualitative result of the simulation (i.e. better performance of camera traps installed in different directions) should hold as long as detection probability decays with the distance and angle from camera traps; nonetheless, further studies (e.g. experiments using farmed animals) should confirm the degree to which independence can be maintained in actual conditions.
The results of our field survey showed that it is critical to account for possible imperfect detection in actual analyses. Although the estimated detection probability was relatively high (> 0.8) for most species, camera traps could not detect any species completely.
In particular, the detection probability was lower for field mice and tree pangolins (Fig. 1), possibly reflecting the smaller body mass (field mice) or the scaly hairs preventing heat radiation from the body interior (tree pangolins). Given that the size and position of the focal area were specified to maximise the detection probability, the sensitivity of the camera model might be less than one throughout the detection zone for these species. This result warns against applying the current density estimation models without accounting for imperfect detection. In the present study, trapping rates were underestimated by 4-36%, which may lead to underestimating animal density by the same order of magnitude. Therefore, the double-observer approach proposed here should be incorporated into the existing density estimation models relying on perfect detection.
We admit that this approach has a shortcoming. In particular, it requires more cameras, which may be a constraint on implementation. Nonetheless, researchers may be able to address this constraint in various ways. For example, installing a pair of cameras at every camera station may not be necessary, as long as the variance in detection probability among camera stations is not too large. Instead, one may choose to install a pair of camera traps only in locations with high trapping rates. This is theoretically equivalent to the hybrid designs of camera installations (i.e. a combination of stations with double and single camera traps) proposed by Augustine et al. 47 to estimate animal population size using capture-recapture analyses. It may also be effective to divide the study period into two parts: one in which two cameras are installed to estimate the detection probability and one in which one camera is installed to estimate the trapping rate. Furthermore, if our approach were used at different locations, it would be possible to extrapolate the results to new sites by incorporating environmental conditions and animal characteristics as covariates. Given that many surveys estimating the density of marked populations use paired camera traps at each camera station (to recognise individual animals reliably), it may also be possible to roughly assess the detection probability from currently available data using the approach proposed here.
This study showed that the double-observer approach, combined with hierarchical capture-recapture models using Bayesian frameworks, might be an effective option for estimating detection probabilities and the number of animal passes. We also suggest that commercially available camera traps have higher detection probability within a small focal area but still do not perfectly detect animals. The hierarchical capture-recapture model used here can estimate the distribution of detection probability and the number of animals passing concurrently, and thus, it is readily incorporated into the current density estimation models. We hope that our approach will be incorporated into them to improve their accuracy.

Methods
Model framework. The capture-recapture model applied here is the hierarchical model for stratified populations proposed by Royle et al. 48 . The model aims to estimate local population size or community structure 49 using capture-recapture data from multiple independent locations. In the following, we briefly describe the model in our context, including how it addresses heterogeneity in detection probability.
Let us consider that we establish $S$ independent camera stations in a survey area. Then, we install $K$ camera traps at each station to monitor exactly the same focal area ($S \times K$ camera traps in total). We assume that these camera traps detect animals within the focal areas $N_T$ times in total. For animal pass $i$ ($i = 1, 2, 3, \ldots, N_T$), we obtain (1) the station at which the animal was detected (hereafter station identity; $g_i$), and (2) how many of the $K$ cameras at the station detected the animal pass (hereafter detection history; $y_i$). The hierarchical capture-recapture model uses these two data, $g_i$ and $y_i$.
Let the number of animal passes at station $s$ be $N_s$ ($s = 1, 2, 3, \ldots, S$). We assume that $N_s$ follows a Poisson distribution with parameter $\lambda$. In this case, the probability of pass $i$ occurring at station $s$ is expected to be $\lambda / (\lambda \times S)$. Thus, station identity, $g_i$, can be modelled as follows:
$$\Pr(g_i = s) = \frac{\lambda}{\lambda \times S}, \qquad s = 1, 2, \ldots, S$$
When the number of animal passes at station $s$, $N_s$, has a larger variation than expected under the Poisson distribution, we may assume a negative binomial distribution or add a station-level random effect to the parameter of the Poisson distribution.
The detection history $Y$ with elements $y_i$ can be modelled using a data augmentation procedure 47 . Specifically, the original detection history $Y$ is artificially augmented by a large number, $M - N_T$, of passes with all-zero histories (i.e. not detected by any camera). The augmented data $W$ with elements $w_i$ $(y_1, y_2, \ldots, y_{N_T}, 0, 0, \ldots, 0)$ will include passes that occurred but were not detected by any camera (false zeros), which occur with probability $\psi$, and passes that did not occur (structural zeros), which occur with probability $1 - \psi$. A set of latent binary augmentation variables, $z_1, z_2, \ldots, z_M$, is introduced, which denotes the false zero ($z = 1$) and the structural zero ($z = 0$). That is,
$$z_i \sim \mathrm{Bernoulli}(\psi)$$
The elements of the augmented data, $w_i$, can be modelled conditional on the latent variables $z_i$. There are two alternative approaches to modelling $w_i$.
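The simplest one regards $w_i$ as binomial random variables. That is,
$$w_i \mid z_i \sim \mathrm{Binomial}(K, z_i \times p)$$
where $p$ is the detection probability of a camera trap. Heterogeneity of detection among animal passes can be accommodated using a beta distribution as follows:
$$w_i \mid z_i \sim \mathrm{Binomial}(K, z_i \times p_i), \qquad p_i \sim \mathrm{Beta}(\alpha, \beta)$$
The expected detection probability can be derived from $\alpha / (\alpha + \beta)$ and the correlation coefficient can be calculated by $1 / (\alpha + \beta + 1)$.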
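The two relations above can be inverted to obtain the beta parameters for a target mean detection probability and correlation: since 1/(α + β + 1) equals the correlation, α + β = 1/correlation − 1, and the mean then splits that total between α and β. The following is a minimal sketch using only the Python standard library, with a small simulation of detection counts for a pair of cameras. The function names, the seed and the number of simulated passes are assumptions chosen for illustration, and this is not the supplementary R code of the study.

```python
import random


def beta_params(mean_p, corr):
    """Beta(alpha, beta) parameters giving mean detection probability
    `mean_p` and pairwise correlation `corr` = 1 / (alpha + beta + 1)."""
    if not (0 < mean_p < 1 and 0 < corr < 1):
        raise ValueError("mean_p and corr must lie strictly between 0 and 1")
    total = 1 / corr - 1                      # alpha + beta
    return mean_p * total, (1 - mean_p) * total


def simulate_pair_counts(mean_p, corr, n_passes, seed=1):
    """Number of cameras (0, 1 or 2) detecting each of `n_passes` passes,
    with a pass-specific detection probability drawn from the beta
    distribution and shared by the two cameras."""
    rng = random.Random(seed)
    alpha, beta = beta_params(mean_p, corr)
    counts = []
    for _ in range(n_passes):
        p_i = rng.betavariate(alpha, beta)
        counts.append(sum(rng.random() < p_i for _ in range(2)))
    return counts


alpha, beta = beta_params(0.8, 0.2)
counts = simulate_pair_counts(0.8, 0.2, n_passes=20000)
print(alpha, beta)
print("mean per-camera detection rate:", sum(counts) / (2 * len(counts)))
```

Note that the simulated counts include passes missed by both cameras (count 0), which would not be observable in field data.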
Alternatively, we can regard $w_i$ as a categorical variable that takes values from zero to $K$:
$$w_i \sim \mathrm{Categorical}(\boldsymbol{\pi})$$
where $\boldsymbol{\pi}$ is a probability vector of length $K + 1$. For simplicity, let us consider two camera traps installed at each station, with equal detection probability. Then, $w_i$ can take either 0 (i.e. $z_i = 0$, or both camera traps missed the animal conditional on $z_i = 1$), 1 (i.e. only one camera trap detected the animal conditional on $z_i = 1$), or 2 (i.e. both camera traps detected the animal conditional on $z_i = 1$). Thus, when we define the probabilities that $w_i$ takes 0, 1 and 2 conditional on $z_i = 1$ as $\phi_m$ ($m = 0, 1, 2$), the elements of $\boldsymbol{\pi}$ are
$$\boldsymbol{\pi} = \{(1 - z_i) + z_i \phi_0,\; z_i \phi_1,\; z_i \phi_2\}$$
We then take different modelling approaches depending on whether detection probability among animal passes is heterogeneous or not. When two camera traps at a station detect animals independently with the same probability $\rho$, $\phi_0$, $\phi_1$ and $\phi_2$ can be expressed as functions of $\rho$, i.e. $(1 - \rho)^2$, $2 \times \rho \times (1 - \rho)$ and $\rho^2$, respectively (Clare et al. 44 ). On the other hand, when detections by the two camera traps are correlated, we need to estimate three real parameters $\phi_m$ that designate the probabilities of all outcomes $w_i \mid z_i = 1$. We assume that $\phi_m$ follows the Dirichlet distribution with the parameters $\gamma_m$ ($m = 0, 1, 2$). That is,
$$(\phi_0, \phi_1, \phi_2) \sim \mathrm{Dirichlet}(\gamma_0, \gamma_1, \gamma_2)$$
In this approach, the expected detection probability can be derived from $\phi_1 / 2 + \phi_2$ and the correlation coefficient can be calculated by $\phi_2 - (\phi_1 / 2 + \phi_2)^2$.
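As a small numerical check of the categorical formulation for two cameras, the sketch below builds the outcome probabilities for 0, 1 and 2 detections under independence, evaluates the two summary expressions quoted in the text, and forms the probability vector of an augmented pass. It assumes that a structural zero (z = 0) always yields w = 0; the function names are hypothetical, and the argument `p_detect` plays the role of the common detection probability written as ρ in the text.

```python
def independent_cell_probs(p_detect):
    """Probabilities of 0, 1 and 2 detections for two cameras that detect
    independently with the same probability `p_detect`."""
    return ((1 - p_detect) ** 2,
            2 * p_detect * (1 - p_detect),
            p_detect ** 2)


def expected_detection(phi):
    """Expected per-camera detection probability, phi_1 / 2 + phi_2."""
    return phi[1] / 2 + phi[2]


def dependence_term(phi):
    """The expression phi_2 - (phi_1 / 2 + phi_2)^2 quoted in the text;
    it equals zero when the two cameras detect independently."""
    return phi[2] - expected_detection(phi) ** 2


def augmented_probs(phi, z):
    """Probability vector of w for an augmented pass with latent state z
    (z = 1: the pass occurred; z = 0: structural zero, so w must be 0)."""
    return (1 - z + z * phi[0], z * phi[1], z * phi[2])


phi_indep = independent_cell_probs(0.8)
print(phi_indep, expected_detection(phi_indep), dependence_term(phi_indep))

# A hypothetical correlated case: more "both" and "neither" outcomes than
# independence would give for the same expected detection probability.
phi_corr = (0.10, 0.20, 0.70)
print(expected_detection(phi_corr), dependence_term(phi_corr))
```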
Compared to the beta-binomial distribution approach, the approach using the categorical-Dirichlet distribution might be more flexible in accommodating detection heterogeneity, while it might be more challenging to estimate the model parameters. In either approach, the expected total number of animal passes can be expressed as $\lambda \times S$. Thus, $\psi$ can be fixed as follows:
$$\psi = \frac{\lambda \times S}{M}$$
For more details of the models, see Royle et al. 48 and Clare et al. 44 .
Monte Carlo simulations to evaluate the effectiveness of the hierarchical capture-recapture model. Because the model reliability has been well confirmed 48 , we here focused on the effects of heterogeneity in detection probability on the accuracy and precision of the estimates. We assumed that the number of detections by camera traps followed a negative binomial distribution with a mean of 5.0 and a dispersion parameter of 1.27, which were derived from actual data on an ungulate in African rainforests 34 . We also assumed two camera traps at each of 30 stations (i.e. 60 camera traps in total). We generated detection histories (i.e. the number of camera traps successfully detecting animals in each animal pass) using a beta-binomial distribution with the expected detection probability at 0.8 or 0.4. We varied the correlation coefficient (= 1/(α + β + 1)) from 0.1 to 0.5 in 0.1 increments. The scale parameters of the beta distributions for each scenario are shown in Table 1. Additionally, to determine the effects of sample size on the accuracy and precision of the estimates, we increased the number of camera stations to 100. Since this setting requires much computation time, we only assumed a detection probability of 0.4 and a correlation coefficient of 0.3.
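The data-generating steps described above can be sketched as follows. This is an illustration under stated assumptions, not the supplementary R code: the negative binomial counts are drawn as a gamma-Poisson mixture, the dispersion parameter 1.27 is assumed to be the negative binomial size parameter, and the function names, the seed and the number of augmentation rows are hypothetical. Passes missed by both cameras are dropped, because they would never enter a real dataset, and the observed histories are then padded with zeros as in the data augmentation procedure.

```python
import math
import random


def poisson_draw(rng, lam):
    """Poisson draw by Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_survey(n_stations=30, n_cameras=2, mean_passes=5.0, size=1.27,
                    mean_p=0.8, corr=0.2, seed=1):
    """Return (station identities, detection histories, true number of passes)."""
    rng = random.Random(seed)
    total = 1 / corr - 1                        # alpha + beta
    alpha, beta = mean_p * total, (1 - mean_p) * total
    station_id, history, n_true = [], [], 0
    for s in range(n_stations):
        # Gamma-Poisson mixture = negative binomial with mean `mean_passes`.
        lam = rng.gammavariate(size, mean_passes / size)
        n_s = poisson_draw(rng, lam)
        n_true += n_s
        for _ in range(n_s):
            p_i = rng.betavariate(alpha, beta)  # pass-specific detectability
            y = sum(rng.random() < p_i for _ in range(n_cameras))
            if y > 0:                           # undetected passes are unobserved
                station_id.append(s)
                history.append(y)
    return station_id, history, n_true


def augment(history, m_total):
    """Pad the observed histories with all-zero rows up to `m_total` rows."""
    if m_total < len(history):
        raise ValueError("m_total must be at least the number of observed passes")
    return history + [0] * (m_total - len(history))


g, y, n_true = simulate_survey()
w = augment(y, len(y) + 200)
print(len(y), "observed passes out of", n_true, "true passes;", len(w), "augmented rows")
```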
We estimated the parameters of the hierarchical capture-recapture models assuming a beta-binomial distribution and a categorical-Dirichlet distribution using Markov chain Monte Carlo (MCMC) implemented in JAGS (version 3.4.0) in all the simulations. We assumed that the number of animal passes followed a negative binomial distribution. For the model assuming a beta-binomial distribution, we reparameterised the scale parameters α and β as p × phi and (1 − p) × phi, respectively (p is the expected detection probability). We then used a weakly informative prior (gamma distribution with shape = 10 and rate = 2) for phi and a non-informative uniform distribution from 0 to 1 for the detection probability 49 . For the model assuming a categorical-Dirichlet distribution, the Dirichlet prior distribution was induced by treating each $\gamma_m \sim \mathrm{Gamma}(1, 1)$ and calculating each probability by $\phi_m = \gamma_m / \sum_{m'} \gamma_{m'}$, following Clare et al. 44 . We generated three chains of 3000 iterations after a burn-in of 1000 and thinned by 5. The convergence of models was assessed using the Gelman-Rubin statistic, where values < 1.1 indicated convergence. These procedures were repeated 300 times. We report the mean of the estimated median detection probability and the expected number of animal passes, and the coverage of their 95% credible intervals (CI). The R code to implement this simulation is available as supplementary material (Supplementary R1).
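The convergence criterion can be illustrated with a minimal sketch of the basic Gelman-Rubin statistic (potential scale reduction factor), which compares between-chain and within-chain variance. This is a simplified textbook form and an assumption on our part about what is being computed; standard MCMC software may apply additional corrections, so values can differ slightly. The chains below are artificial draws, with 600 values per chain chosen to mimic 3000 iterations thinned by 5, and are not output from the models in this study.

```python
import math
import random


def gelman_rubin(chains):
    """Basic potential scale reduction factor for one parameter.
    `chains` is a list of equal-length lists of posterior draws."""
    m, n = len(chains), len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    between = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)
    within = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
                 for c, mu in zip(chains, means)) / m
    if within == 0:
        raise ValueError("within-chain variance is zero")
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


rng = random.Random(1)
# Three artificial chains sampling the same distribution.
mixed = [[rng.gauss(0.8, 0.05) for _ in range(600)] for _ in range(3)]
# Three artificial chains stuck at different values.
stuck = [[rng.gauss(0.6 + 0.1 * k, 0.05) for _ in range(600)] for k in range(3)]
print("well mixed:", round(gelman_rubin(mixed), 3), "converged:", gelman_rubin(mixed) < 1.1)
print("stuck:", round(gelman_rubin(stuck), 3), "converged:", gelman_rubin(stuck) < 1.1)
```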
Determining a suitable camera installation. The above simulations suggested that a key to safely applying the hierarchical capture-recapture model may be to avoid correlated detections. To determine a preferred survey design to secure independent detections, we performed additional simulations. Specifically, we tested how the position and the trigger speed of camera traps may affect the independence of detections by considering the process by which cameras detect moving animals.
The simulation was performed using procedures similar to those of Rowcliffe et al. 17 . We assumed that the sensor of camera traps has a two-dimensional detection surface defining the instantaneous 'risk' (analogous to the risk of mortality in survivorship analysis) of an animal being detected at any given location within the camera's field of view (FOV). The instantaneous risk landscape was defined with respect to distance r and angle θ relative to the camera by a function with four parameters, where a defines the maximum risk close to the sensor, s and b define the position and shape of the decline in risk with distance, respectively, and c defines the rate of decline in risk with angle. We set a, s, b, and c at 0.2, 3.0, 5.0, and 0.5, respectively. Although there is no empirical evidence for this particular functional form, the simulated results were comparable to those observed. Note that our interest is not in estimating the actual values of the correlation but instead in obtaining information that will help us decide what cameras to install and how to install them.
We made an animal pass through the risk landscape in a straight line in a random direction. The animal movement speed followed a log-normal distribution with a mean (± SD) of 0.18 ± 0.15 m s⁻¹, which roughly accords with the speed of red duikers in our study sites (Y Nakashima, unpublished data). We then generated a random position at which the animal was detected, considering the cumulative risk of detection and the camera trap's trigger speed (for the details, see Rowcliffe et al. 19 ).
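The role of cumulative risk can be illustrated with a short sketch: if an animal is exposed to an instantaneous risk h along its path, the probability of being detected at least once is 1 − exp(−∫ h dt), as in survivorship analysis. The risk surface used below is HYPOTHETICAL. It is not the function used in this study; it was chosen only because it declines with distance and angle and uses the four parameter values quoted above. The risk is assumed to be expressed per second, the camera is assumed to sit at the origin facing the positive y direction, and the trigger speed is not modelled. All function names are assumptions.

```python
import math

A, S, B, C = 0.2, 3.0, 5.0, 0.5   # parameter values quoted in the text


def hazard(r, theta):
    """HYPOTHETICAL risk surface (not the paper's equation): maximum risk A
    near the sensor, declining with distance r around S with steepness B,
    and declining with angle theta (radians) at rate C."""
    return A / (1.0 + (r / S) ** B) * math.exp(-C * theta ** 2)


def detection_probability(start, end, speed, dt=0.01):
    """Probability of at least one detection for an animal moving in a
    straight line from `start` to `end` (metres) at `speed` (m/s)."""
    duration = math.dist(start, end) / speed
    steps = max(1, int(duration / dt))
    step_t = duration / steps
    cumulative = 0.0
    for i in range(steps):
        frac = (i + 0.5) / steps                 # midpoint of each time step
        x = start[0] + frac * (end[0] - start[0])
        y = start[1] + frac * (end[1] - start[1])
        r = math.hypot(x, y)
        theta = math.atan2(x, y)                 # angle from the facing direction
        cumulative += hazard(r, theta) * step_t  # integrate the risk over time
    return 1.0 - math.exp(-cumulative)


# The same path, 2 m in front of the camera, crossed slowly and quickly.
print(detection_probability((-2.0, 2.0), (2.0, 2.0), speed=0.18))
print(detection_probability((-2.0, 2.0), (2.0, 2.0), speed=1.0))
```

Under these assumptions a slower animal accumulates more risk along the same path and is therefore more likely to be detected.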
We considered three designs of camera placement (Fig. 1). The first is to install two camera traps at the same position (i.e. mounted on the same tree) and facing the same direction to monitor the entire field of view within 10 m from the cameras (ins. 1). The second and third monitor a specific small area within the FOV. The focal area was a small equilateral triangle with a side length of 1.9 m and was centred within the FOV. The nearest vertex was set to be 1.9 m away from the cameras. This area corresponds to the highest detection probability in the camera model used in our field study. This focal area is monitored from the same direction (ins. 2) or from directions differing by 60 degrees (ins. 3). We then considered using camera models with a fast trigger speed (0.1 s) and a slow one (1.5 s) for each installation. Finally, we generated the detection histories of 500 animal passes and calculated the correlation coefficients between the two camera traps for each scenario.
japonica and Chamaecyparis obtusa). The monthly mean temperature was 20.6 ± 5.9 °C during the study period, with the highest in August (27.4 °C) and the lowest (13.6 °C) in November. The simulations suggest that camera traps should monitor the predefined focal area from different positions (see below). According to the results, we used camera traps with a high trigger speed (0.15 s) (Browning Strike Force Pro, BTC-5HDP, Browning, Missouri, US) at both study sites. We set two camera traps to monitor the same equilateral triangular area as in the simulation from directions differing by 60 degrees (the right panel of Fig. 1). We surrounded the focal area with a white rope and manually filmed it with the camera traps as a reference. The rope was removed after filming to avoid disturbing animal behaviour. We set camera traps at approximately 0.7 m above ground without baits or lures.
We used the 'video mode' and set the video length to 20 s and the delay period between videos to 1 s (the minimum delay period for this product). In Japan, we established seven camera stations at least 2 km apart and installed two camera traps at each station. In Cameroon, we set 26 camera stations at least 2 km apart from each other. Since the number of camera stations in Japan was not enough to estimate the expected number of animal passes, we focused on the detection probability.
We determined whether animals passed within or outside the focal area by superimposing videos and the reference image. We used only images of animals crossing the focal area for subsequent analyses. We then matched each detection from the two cameras to determine whether two camera traps successfully recorded an animal pass. We applied the model to the detection probability of the species detected more than 10 times in both Japan and Cameroon. The analysis was limited to images in which the animal species were reliably identified. We plotted the estimated detection probability against the median value of body mass (kg), drawn from Ohdachi et al. 50 for animals in Japan and Kingdon 51

Data availability
All the data used here are available in the Supplementary Information.