Timely poacher detection and localization using sentinel animal movement

Wildlife crime is one of the most profitable illegal industries worldwide. Current actions to reduce it are far from effective and fail to prevent population declines of many endangered species, pressing the need for innovative anti-poaching solutions. Here, we propose and test a poacher early warning system that is based on the movement responses of non-targeted sentinel animals, which naturally respond to threats by fleeing and changing herd topology. We analyzed human-evasive movement patterns of 135 mammalian savanna herbivores of four different species, using an internet-of-things architecture with wearable sensors, wireless data transmission and machine learning algorithms. We show that the presence of human intruders can be accurately detected (86.1% accuracy) and localized (less than 500 m error in 54.2% of the experimentally staged intrusions) by algorithmically identifying characteristic changes in sentinel movement. These behavioral signatures include, among others, an increase in movement speed, energy expenditure, body acceleration, directional persistence and herd coherence, and a decrease in suitability of selected habitat. The key to successful identification of these signatures lies in identifying systematic deviations from normal behavior under similar conditions, such as season, time of day and habitat. We also show that the indirect costs of predation are not limited to vigilance, but also include (1) long, high-speed flights; (2) energetically costly flight paths; and (3) suboptimal habitat selection during flights. The combination of wireless biologging, predictive analytics and sentinel animal behavior can benefit wildlife conservation via early poacher detection, but also solve challenges related to surveillance, safety and health.


Introduction
Wildlife trade is a low-risk yet high-profit crime, ranking fourth in terms of revenue after trade in drugs, humans and arms 1 . Wildlife crime is driven by a rapidly expanding wealthy class in some cultures that views animal parts as medicine or status-enhancing luxury goods 2 . The demand for animal parts has led to escalating prices 3 , which consequently fuels poaching. As one of the main causes for biodiversity decline 4 , poaching increasingly threatens the existence of wildlife, notably pangolins, rhinos, elephants and tigers. Ultimately, losses of these and other species can reshape entire ecosystems via cascading effects.
Although the ultimate solution is to reduce the global demand for wildlife products, efforts to do so have not been successful enough 5 . Local efforts thus often aim at deterring poachers, mainly through ranger patrols. Deadly force used by poachers incites conservation authorities into intensified 'militarized conservation', resulting in frequent shootouts between poachers and conservation officers 6 . Sadly, poaching of wildlife still continues to be a threat to the preservation of many wildlife species 1 , as anti-poaching rangers often arrive too late at crime scenes 7 . An effective method for early poacher detection and localization is thus urgently needed, so that preventive action can be taken. With situational awareness, law enforcers can operate under safer conditions with reduced risk of fatalities and potential to de-escalate conflicts. An effective poacher early warning system (EWS) thus contributes to preventing lethal violence, not only against wildlife, but also against conservation officers and poachers 6 .
Animal sentinels, especially those that are abundant and not targets themselves, may provide an early warning that poachers are en route. Prey species may be good sentinels as these species have evolved a suite of traits preventing them from being killed, e.g., via early predator detection and escape 8 . This often extrapolates to humans as well, since many prey species evolved together with human hunters, leading to anthropogenic disturbance stimuli triggering similar, or often even stronger, evasive responses 9,10 . Until now, practical constraints have hampered the development of a sentinel-based EWS 11 . Although wireless sensors can generate large volumes of data, the areas in which poaching occurs often lack infrastructure that allows real-time wireless communication of sufficient bandwidth 7 . Moreover, animal behavior is known to be complex and context-dependent, thus an EWS needs to be able to handle rich contextual data when identifying behavioral abnormalities linked to anthropogenic disturbances. Fortunately, advances in technology, computing and analytics have now alleviated these constraints 12 . We therefore tested whether the behavior of sentinel animals can be used to detect and localize human intrusions in real time using wearable biologging sensors and predictive algorithms.
We tested the sentinel-based EWS in an African savanna, home to several targeted species (e.g., pangolin, elephant, rhino and lion) that coexist with an assemblage of mammalian prey species that could be potential sentinels. We deployed wearable GPS and tri-axial accelerometer sensors on 138 animals of four species (plains zebra, blue wildebeest, common eland and impala) in a 1200 ha fenced, predator-free area inside Welgevonden Game Reserve (WGR), South Africa (Fig 1). These sensors transmitted data wirelessly via a LoRa network connected to a backhaul allowing for real-time analytics. During a period of seven months, WGR park officials executed 57 intrusions mimicking poachers (referred to as 'experimental intrusions'). Data collected in the absence of experimental intrusions were used to characterize undisturbed behavior, allowing quantification of the degree of abnormality of movement behavior at any point in time.
We engineered a plethora (2117) of potentially meaningful features, describing the geometry of individual trajectories as well as emergent herd topologies and various characteristics of the animal-environment interplay. Then, we applied dimensionality reduction and segmented the dataset into experimental intrusions and controls. Data during experimental intrusions were matched with control data of the same period, one or two days earlier or later, when no intrusion took place. To generate predictive signatures for the EWS, we followed a three-step process: behavioral response classification, intrusion detection and intrusion localization. We allocated each experimental intrusion or control segment to either the training phase or the evaluation phase, applying a leave-one-group-out cross-validation approach on these segments to make the best use of all data (see Materials and Methods for details).

Results
Exploration of the animals' reaction to the experimental intrusions highlighted several broad characterizations of their response. First, the experimental intrusions triggered nearby sentinels to divert their movement away from the perceived threat while increasing their speed, body acceleration and directional persistence (Fig 2). This, together with elevated variation in such features, resulted in more directional, brisk, straight and erratic movements. These evasive flights lasted on average 47 minutes per fleeing group of zebra (SD=28, n=29), 39 minutes for wildebeest (SD=33, n=15), 46 minutes for eland (SD=18, n=15), and 43 minutes for impala (SD=14, n=14). Second, the difference between the sentinels' response behavior and their normal behavior was larger when comparing the individuals' movement in the same spatial (location and habitat) and temporal (seasonal and diurnal) context. Third, the sentinels selected sub-optimal habitat and chose flight paths that incurred higher energetic costs via faster and uncommon uphill movement in response to the experimental intrusions, possibly in an effort to find refuge (Fig 2, 3). Fourth, apart from alterations in the geometry of individual movement trajectories, patterns of collective geometry changed in the vicinity of the experimental intrusions. Generally, nearby individuals tended to form groups with more synchronized and aligned movements (Fig 2f).
We trained a Support Vector Machine (SVM) to algorithmically classify the animals' response behavior as either undisturbed (i.e., calm or normal) or disturbed (a summary label for the above-described responses). We were able to achieve an average precision of classification (i.e., the area under the precision-recall curve) of 46%. Depending on the chosen value of the response probability decision boundary, the classification performance achieved up to 100% precision, or 100% recall, with a maximum F1-score of 47% (Supplementary Fig S1).
Following animal behavior classification we were able to distinguish intrusions from controls with 86.1% accuracy (82.6% precision, 89.2% recall) using logistic regression, exclusively using the movement data of the sentinels. The odds of an intrusion increased considerably with higher SVM-predicted probabilities of response behavior, the degree of local spatial autocorrelation therein, and a decrease in spatial clustering of sentinels that were predicted to be undisturbed. Including more features in the detection classifier boosted its predictive accuracy to 91% (Supplementary Fig S3), but also increased the risk of lowering its generalizability to other areas due to potential overfitting.
Following detection, we predicted the location of the intrusion relative to the position, movement direction and SVM-predicted response probabilities of the sentinels. We summarized the performance of the localization prediction through the Euclidean distance between the peak prediction and the true location of the intrusion, followed by computing the spatial error of the 10 most dense probability surfaces per experimental intrusion. In 20.8% of them the predictions were highly accurate, namely within 100 m of the true location, increasing to 41.7% and 54.2%, respectively, for distances up to 300 m and 500 m (Fig 3b).

Discussion
Our study thus clearly demonstrates that sentinel animal behavior can be used in a real-time poacher EWS, since predictable signatures in behavioral responses to disturbance stimuli can be used to detect and locate human intrusions. Indeed, the sentinels took systematic and detectable evasive action when experimental intruders came near.
The sentinels increased their movement speed and body acceleration as they generally do during anti-predator responses 8 , whilst moving away from the perceived threat with higher directional persistence (Fig 2). They did so for a considerable amount of time per flight response (45 minutes on average), longer than only instantaneously running away, thereby substantially trading off energy for safety 13 . This signal became even more pronounced in the context of the individuals' normal behavior given the prevailing conditions (season, time of day and habitat), since a systematic deviation from normality is key to successful identification of disturbed behavior. It thus proved to be important to explicitly consider the spatial-temporal context of the movement-environment interplay when using sentinel movement metrics as early warning indicators. Solely using movement speed as indicator 14 without incorporating environmental conditions is therefore not very informative (Fig 1).
These findings suggest that the sentinels elevated their energy expenditure while fleeing, in line with theory on energy landscapes and the landscape of fear 8,13,15,16 . However, not only did experimental intrusions trigger faster-than-normal movement, the sentinels also tended to utilize the terrain by moving uphill, thereby increasing their energy expenditure (Fig 3). Moreover, the sentinels seemed to alter their decision-making during evasive actions, selecting less optimal habitat than they would do when undisturbed (Fig 2e, 3d). This suggests that anti-predator trade-offs relate to energy trade-offs and that perceived threats can induce resource avoidance 17 . Together, these consequences of anti-predator behavior can incur significant energetic and opportunity costs 17 . These energetic costs are generally not considered in the indirect costs of predation within the landscape of fear framework, but are now increasingly being recognized 13,17 . Our findings suggest that anti-predator behavior not only incurs costs in terms of trading off foraging and resting for vigilance, but also in terms of increased costs due to 1) performing long, high-speed flights; 2) choosing energetically costly flight paths; and 3) selecting suboptimal habitats during flights.
Although the study of collective behavior of animals within groups has predominantly relied on controlled laboratory-based studies and theoretical models 18,19 , our high-resolution data on many large terrestrial mammals allowed the detailed computation of collective movement properties in their natural habitat in relation to perceived threats. The sentinels increased group coherence when intruders were near (Fig 2f), presumably in an effort to find safety in numbers 20 , whilst at the same time reducing the likelihood of collisions by increasing alignment during escape 21 . These findings support predictions from theoretical studies 22 and controlled laboratory experiments 23 .
Central to these findings is that the responsive and evasive behavior of animal sentinels can be used to algorithmically detect and localize poachers. A sentinel-based EWS is robust against adaptive behavior of poachers, as an abundance of sentinels cannot easily be manipulated and fooled 24,25 . Additionally, shooting sentinel animals would give away the poacher's position, both via the acoustic signal of the shot 26 and through the sensor data of the shot animal. Moreover, if hackers were to tap into the dataflow, only the locations of the sentinels may be revealed, but not those of targeted species. Applying biologging technology directly to targeted species is risky, and will rule out preventive intervention as it only enables the post hoc identification of mortalities 26 . Instead, the responsive behavior of untargeted sentinels crossing paths with poachers en route provides an early warning and situational awareness to anti-poaching personnel.
Using animal sentinels as a lens to the environment is in itself not new, as they have long been employed to detect human exposure to biological and chemical hazards (e.g., canaries in coal mines) 27,28 . Moreover, anecdotal evidence has linked animal behavior to the onset of natural disasters 29,30 , and recent evidence suggests that dogs can be used to provide an early warning of epileptic seizures 31 or outbursts of violence 32 . Elucidating the hitherto hidden information in the behavior of animals with cutting-edge technology can help us gauge the conditions of life on Earth 33 . More specifically, this approach can expose illicit human activities, such as illegal fishing 34 and, as shown here, poaching. Our study is the first to document the use of untargeted sentinel behavior as a real-time early warning against wildlife crime, yet our approach is generalizable beyond animals as sentinels. Similar methods could be utilized to detect anomalous behavior of people in crowds in response to a perceived threat 35 . Harnessing the collective sensing capacities of sentinels will thus not only innovate wildlife conservation and help turn protected areas into safe havens, it has the potential to advance many other applications as well.

Study system and species
This study was performed in Welgevonden Game Reserve (WGR), a privately owned game reserve in the Limpopo province, South Africa (24°10'S; 27°45'E to 24°25'S; 27°56'E). The reserve is located in the mountainous Waterberg region. WGR was established on former agricultural lands in the early 1980s and the main occurring vegetation types are Waterberg Mountain Bushveld and Sour Bushveld. The Waterberg region has a temperate climate, with two distinct seasons, characterized by the rainfall regime: a dry season ranging from April to September and a wet season ranging from October to March, with an average annual precipitation in WGR of 634 mm. Our study area is an enclosed breeding camp within WGR, with a size of approximately 1200 ha. Main predator species such as lion, cheetah and spotted hyena were excluded from this study area, as well as elephant and rhino.
WGR equipped 35 impala (Aepyceros melampus), 34 blue wildebeest (Connochaetes taurinus), 35 plains zebra (Equus burchellii) and 34 common eland (Taurotragus oryx) with a collar carrying a GPS and accelerometer sensor. The animal movement data were transmitted wirelessly in near real-time to five long-range low-power LoRa radiocommunication gateways in the study area, from where data packages were routed to an on-line data warehouse via a 3G/4G backhaul. The deployment of these sentinel animals was approved by the board and CEO of WGR as a management action and was performed in accordance with relevant guidelines and regulations (see Supplementary GPS Collaring letter).

Experimental intrusions
Between September 2017 and March 2018, WGR employees performed experimental intrusions (lasting ca. 2 hours) on foot and by car through the study area, at varying locations and along varying movement routes, independent of the locations of the sentinel animals. The movement of the intrusions was tracked by GPS, and the relevant metadata for each intrusion were recorded (mode of transport, group size, start time, end time). The intrusions were distributed in a stratified way over the mornings, middays and afternoons (with time slots relative to specific solar positions: sunrise, solar noon and sunset). Furthermore, the intrusions were temporally spread in such a way as to avoid a disturbance overflow for the sentinel animals, by performing a maximum of five experiments per week and a maximum of two experiments per day (and then only with one intrusion in the morning and one in the afternoon).

Data gathering
The animal sensors gathered location data via GPS and overall dynamic body acceleration (ODBA) via a tri-axial accelerometer (range ±2g). The GPS was scheduled to record spatial position at irregular intervals depending on the level of activity as gauged by ODBA. All sensors were scheduled to record locations every 15 minutes in the absence of sufficient activity (given that successive fixes were further than 5 m apart, else a geofence was applied and the new coordinate was omitted to save bandwidth and battery power, thereby assuming that the animal still was at its previous location). The GPS fix rate was increased up to 2- or 10-minute intervals (depending on two different sensor settings) when ODBA indicated sufficient activity (after checking for the geofence). ODBA data were sampled continuously and summarized per 15-second window in a mean, maximum and variance value.
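The fix-scheduling rules described above can be illustrated with a minimal sketch. It is not the collar firmware: the function names and the ODBA activity threshold are hypothetical, and only the 15-minute idle interval, the 2- or 10-minute active intervals and the 5 m geofence are taken from the text.

```python
import math

IDLE_INTERVAL_MIN = 15    # from the text: interval without sufficient activity
ACTIVE_INTERVAL_MIN = 2   # from the text: 2 or 10 minutes, depending on sensor setting
GEOFENCE_M = 5.0          # from the text: fixes not further than 5 m apart are omitted
ODBA_THRESHOLD = 0.1      # hypothetical activity threshold (not given in the text)


def next_fix_interval(mean_odba, active_interval=ACTIVE_INTERVAL_MIN):
    """Minutes until the next GPS fix, given the recent mean ODBA."""
    if mean_odba >= ODBA_THRESHOLD:
        return active_interval
    return IDLE_INTERVAL_MIN


def apply_geofence(previous_xy, new_xy, radius=GEOFENCE_M):
    """Return (position to keep, whether the new fix is transmitted).

    A new fix within the geofence radius is dropped and the animal is
    assumed to still be at its previous location.
    """
    if math.dist(previous_xy, new_xy) <= radius:
        return previous_xy, False
    return new_xy, True
```

Coordinates are assumed to be projected (metres), so that plain Euclidean distance is meaningful.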
The experimentally intruding groups were outfitted with handheld GPS devices that recorded their location every 5 seconds and these groups logged and timestamped all their pre-defined activities and metadata on a tablet using CyberTracker 36 during their intrusion. Most cars traveling through the study area were tracked by GPS as well to filter the animal data for disturbances by cars unrelated to the experimental intrusions.
Weather data (temperature, radiation, precipitation and wind) in the study area were recorded at a 3-minute resolution with a weather station in the north of the study area. We assumed the 1200 ha study area to be sufficiently small for the weather station data to be representative of the prevailing weather conditions throughout the study area. GIS data of the study area (summarized in Supplementary Table S1) consisted of information on topography, infrastructure (e.g., fences, roads, powerlines, etc.) and vegetation cover (supervised classification of 25 cm resolution aerial imagery into four classes: trees, herbaceous/grass, sand/soil and other/built-up area).

Data pre-processing
To link the animal location data with the intrusion location data, as well as to correct for the substantial level of positional noise present in the animal location data, we modelled the animal location data to regular 1-minute resolution trajectories using the following five steps. First, we filtered out large obvious errors (e.g., obvious outliers and irregularities such as locations far outside the study area) from the data. Second, we corrected systematic medium-scale outliers: 'spikes' that occurred due to positional outliers. Such spike-like outliers were visible during sensor testing while following known straight-line trajectories along an airstrip, thereby confirming that these spike-like geometries most likely resulted from positional error rather than true animal movement. Therefore, we corrected the locations that were classified as spike-like anomalies by shifting them closer to the straight line between the neighboring points. The extent of this shift was set relative to the degree of spikiness of the points (the spikier the pattern, the larger the shift towards the midpoint of the adjacent coordinates). Third, after filtering and correcting the original locations we smoothed the timeseries of x/y coordinates at each original timepoint with a Kalman smoother using a dynamic linear model. Fourth, we linearly interpolated the locations to a 10 second resolution based on ODBA, where we considered the animal to be stationary between multiple timepoints if the accelerometer signal suggested the animal was not moving. Fifth, we fitted an X-spline through the data, where we gave the linearly ODBA-interpolated locations a smaller weight, and sampled the fitted spline on a regular 1-minute resolution. These preprocessing steps resulted in the modelled animal trajectory data, composed of spatial locations every minute, and averaged ODBA statistics per step (i.e., the segments between consecutive coordinates). 
These data were used as input for the next steps in the analyses. In contrast to the animal data, the raw intrusion data were of a high temporal resolution and spatial accuracy so that we only needed to subset the data in order to acquire 1-minute resolution time-synchronized intrusion trajectories.
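The second pre-processing step (shifting spike-like outliers towards the line between their neighbors) could look roughly like the sketch below. The spikiness measure and the threshold are assumptions chosen for illustration; the text only states that the shift grows with the degree of spikiness and is directed towards the midpoint of the adjacent coordinates.

```python
import math


def spikiness(prev, point, nxt):
    """Hypothetical spikiness in [0, 1]: 0 for a straight line, close to 1
    for a sharp out-and-back spike (detour much longer than direct path)."""
    detour = math.dist(prev, point) + math.dist(point, nxt)
    if detour == 0.0:
        return 0.0
    return 1.0 - math.dist(prev, nxt) / detour


def correct_spikes(track, threshold=0.8):
    """Shift spike-like points towards the midpoint of their neighbors.

    `track` is a list of (x, y) tuples in metres; the shift fraction equals
    the spikiness, so spikier points move further. Neighbors are taken from
    the original track, not from already corrected points.
    """
    corrected = list(track)
    for i in range(1, len(track) - 1):
        prev, point, nxt = track[i - 1], track[i], track[i + 1]
        s = spikiness(prev, point, nxt)
        if s > threshold:
            mid = ((prev[0] + nxt[0]) / 2.0, (prev[1] + nxt[1]) / 2.0)
            corrected[i] = (point[0] + s * (mid[0] - point[0]),
                            point[1] + s * (mid[1] - point[1]))
    return corrected
```

In this sketch a point on a straight line is left untouched, whereas a fix that jumps 100 m sideways between two nearby neighbors is pulled almost entirely back to their midpoint.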

Feature engineering and processing
We computed a plethora of human-engineered features from the animal trajectories, ODBA data, weather data and several GIS layers with environmental data from the study area (summarized in Supplementary Table S1). All features were computed such that they could not directly be linked to specific points in space or time (by computing movement features relative to the environmental variables), so that only behavioral patterns and abnormalities therein could be linked to intrusion presence. After engineering these base features, we transformed features (after visual inspection of the histograms) to approximately unimodal and symmetric distributions. Then we truncated the distributions to the lower and upper 0.001 percentile to correct possible outliers. After that, we standardized all computed features to zero mean and unit variance per species. We also computed scaled versions of selected features by subtracting the mean and dividing by the variance of the selected features per reference set to capture deviations from normal behavior (see Supplementary Table S1). Furthermore, after computing and standardizing the features, we computed more features by applying moving window computations (5 minutes centered, 10 and 20 minutes lagging, 10 and 20 minutes leading minus lagging) on the standardized features to capture (the change in) the recent history of animal movement descriptors (mean and standard deviation of all features, fitted Mean Squared Displacement exponential function parameters, net-gross distance ratio and variance of log First Passage Times). Finally, we discretized all features to ordinal values to avoid odd-, fat- and heavy-tailed distributions. In total we computed 2117 features describing different aspects of movement geometry of individual trajectories, herd topology and the interactions with landscape variation.
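Two of the processing steps above, standardization per species and moving-window summaries on the 1-minute series, can be sketched as follows. The function names are hypothetical, the population standard deviation is used for the z-score, and the edge handling of the windows is an assumption; none of these details is specified in the text.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_per_group(values, groups):
    """Scale values to zero mean and unit variance within each group
    (e.g., per species)."""
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    stats = {g: (mean(v), pstdev(v)) for g, v in by_group.items()}
    scaled = []
    for value, group in zip(values, groups):
        m, s = stats[group]
        scaled.append((value - m) / s if s > 0 else 0.0)
    return scaled


def lagging_window_mean(series, width):
    """Mean over the current and the previous width - 1 samples."""
    return [mean(series[max(0, i - width + 1):i + 1])
            for i in range(len(series))]


def leading_minus_lagging(series, width):
    """Mean of the next `width` samples minus mean of the previous `width`
    samples; None where either window is empty (series edges)."""
    out = []
    for i in range(len(series)):
        lag = series[max(0, i - width):i]
        lead = series[i + 1:i + 1 + width]
        out.append(mean(lead) - mean(lag) if lag and lead else None)
    return out
```

With 1-minute trajectories, `width=10` corresponds to the 10-minute windows mentioned above; the leading-minus-lagging difference becomes large when a feature such as speed changes abruptly.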

Subsetting and dimensionality reduction
Before analyzing the computed animal movement features, we applied some filtering on the data. We removed all periods with an experimental intrusion during which there were fewer than 30 active animal sensors in total. We also removed data of both animals and intrusions when they were close to the reserve's main gate in order to avoid dilution of the data with other known disturbances. This resulted in 57 intrusions that were selected for further analyses. For every intrusion we selected control data of the same period one or two days earlier or later during which no intrusion took place, resulting in an approximately balanced intrusion-control dataset. Furthermore, we removed data from animals that were located within 250 m and within 20 minutes of a vehicle moving through the area that was not part of our experiment.
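The vehicle filter in the last sentence can be sketched as a simple space-time proximity test. The record layout (time in minutes, projected x and y in metres) and the function names are assumptions made for this illustration.

```python
import math


def near_vehicle(animal_fix, vehicle_fixes, max_dist_m=250.0, max_minutes=20.0):
    """True if the animal fix lies within 250 m and 20 minutes of any
    fix of a vehicle that was not part of the experiment."""
    t, x, y = animal_fix
    return any(
        abs(t - vt) <= max_minutes and math.hypot(x - vx, y - vy) <= max_dist_m
        for vt, vx, vy in vehicle_fixes
    )


def filter_vehicle_disturbance(animal_fixes, vehicle_fixes):
    """Keep only animal fixes that are not close to an unrelated vehicle."""
    return [fix for fix in animal_fixes if not near_vehicle(fix, vehicle_fixes)]
```

For a full dataset a spatial index would be preferable to this brute-force comparison, which checks every animal fix against every vehicle fix.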
For each feature, we computed 4 importance metrics based on binary labelled data: records associated with locations within 1 km from the intrusion (subscript 1) versus an equally-sized random selection of data points during control periods (subscript 0): Mahalanobis distance, marginality (computed as $(\bar{x}_1 - \bar{x}_0)/s_0$, for sample mean $\bar{x}$ and sample standard deviation $s$), specialization (computed as $s_1/s_0$) and the Mean Decrease Accuracy of a Random Forest classifier (with default hyperparameters). We then ranked the features according to their importance and selected a feature for further analyses if it occurred in the top 125 features for any of the 4 importance measures described above (resulting in a total of 361 selected features). Subsequently, we converted the selected features per main feature class (Supplementary Table S1) to principal components, keeping those principal components that capture the most variation (in total 95%), which resulted in 99 selected components in total. Finally, we transformed these components again via a second principal component analysis, now across all the selected 99 components. In subsequent training of the animal behavior classifier, we optimized the total number of included components as a hyperparameter, which resulted in the first 8 principal components in the best performing classifier.
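Two of the importance metrics and the top-k selection rule can be sketched as below. The sketch assumes that marginality is the difference in sample means divided by the control standard deviation and that specialization is the ratio of the two standard deviations; these definitions are a reading of the text, and the function names are hypothetical.

```python
from statistics import mean, stdev


def marginality(near, control):
    """Shift of the mean near intrusions, in control standard deviations."""
    return (mean(near) - mean(control)) / stdev(control)


def specialization(near, control):
    """Ratio of the spread near intrusions to the spread during controls."""
    return stdev(near) / stdev(control)


def select_features(rankings, top_k=125):
    """Union of the top_k features over several importance rankings.

    `rankings` maps a metric name to a list of feature names ordered from
    most to least important.
    """
    selected = set()
    for ranked in rankings.values():
        selected.update(ranked[:top_k])
    return selected
```

Taking the union over metrics explains why the number of selected features (361) can exceed 125 but stays below 4 × 125: features ranked highly by several metrics are counted once.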

Labelling
We labelled the sentinel movement data through visual inspection of the animal and intruder trajectories, where we considered the animals' behavior to be undisturbed when the animal was not near an intrusion, or when the animal was close to an intrusion yet did not visually display a change in behavior. However, when the animal was near the intrusion and displayed a sudden or gradual behavioral change in response to intrusion proximity, we labelled the data as 'flight' (changing the movement direction away from the intrusion, possibly with increased speed) or 'regroup' (when individuals clustered together). In total, only ca. 1 % of the animal data were associated to either flight or regroup behavior (which we will refer to as 'response' behavior). A few animals also appeared to exhibit behavior we could label as 'freeze', i.e., halting movement in the proximity of the intrusion, yet this class was too underrepresented to be accurately predicted and hence dropped from the final dataset. Furthermore, we assigned a qualitative measure of intensity to each labelled behavioral response ('low', 'medium', 'high') to describe how visually pronounced this response was. Besides the supervised labelling based on visual inspection of behavioral responses via video animations of the trajectories, we also labelled data using an unsupervised k-means nearest neighbor classifier, where we clustered the feature space consisting of the 99 features selected as described above into 25 clusters per species.

Animal behavior classification
We trained an RBF kernel C-classification Support Vector Machine (SVM) with a subsequent moving window over the outputted probabilities to distinguish undisturbed vs. response behavior. In the training datasets we only included the data separated by more than 1 km from the intrusion and labelled as 'undisturbed', and removed 90 % thereof to train algorithms with a more balanced dataset. Furthermore, we only trained and validated on data with intrusions present in the area. We trained another SVM to distinguish the flight response from the regroup response. All computations were done in R 3.5.0 with the e1071 package on the Linux High Performance Cluster of Wageningen University and Research. We optimized the following hyperparameters and model settings during the training phase for the Average Precision via a grid search (with the selected values between brackets):
- gamma (undisturbed-response: $10^{-3.2}$; flight-regroup: $10^{-2.0}$);
- cost (undisturbed-response: $10^{-2.2}$; flight-regroup: $10^{-1.5}$);
- number of principal components to include as features (undisturbed-response: 8; flight-regroup: 12);
- species-specific models vs. one model with species dummy variables included in the features (species-specific models);
- specific models for the different times of day vs. one model with time of day dummy variables (one model);
- response intensities to include in the training data (only medium and high intensities);
- weights to assign to the classes (equal weights);
- the quantile to be computed of the SVM probabilities by the moving window (100 %, i.e., maximum value);
- the alignment of the moving window (centered);
- the size of the moving window (15 minutes on both sides).
The best model was selected via a leave-one-intrusion-out cross-validation approach. We summarized the predictive performance by computing the Average Precision of the least occurring class (i.e., 'response' for the undisturbed-response model: 46 %, Supplementary Fig S1; and 'regroup' for the flight-regroup model: 80 %, Supplementary Fig S2). After having computed these probabilities with an SVM and a temporal window smoother, we tried to improve the predictive performance by including the predicted animal response probabilities of nearby animals. However, this spatially explicit approach hardly improved the predictive performance, indicating that the spatial contextualization of behavioral response was sufficiently captured by the computed features. We therefore did not include this spatial contagion effect of predicted animal response probabilities in the final analysis.
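The temporal smoothing of the SVM output (a centered moving window taking the maximum over 15 minutes on both sides) and the Average Precision used for model selection can be sketched as follows. This is an illustration in Python rather than the R/e1071 code used in the study; the function names are hypothetical and Average Precision is computed in its common rank-based form, which is an assumption about the exact estimator.

```python
def centered_window_max(probabilities, half_width=15):
    """Maximum of the per-minute response probabilities within a centered
    window of half_width minutes on both sides (truncated at the edges)."""
    n = len(probabilities)
    return [max(probabilities[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]


def average_precision(labels, scores):
    """Rank-based Average Precision for binary labels (1 = response).

    Precision is evaluated at the rank of every positive record, after
    sorting by decreasing score, and averaged over the positives.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0
```

Taking the window maximum spreads a short burst of high response probability over the surrounding minutes, which suits a rare class (ca. 1 % of the data) where isolated peaks would otherwise be easy to miss.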

System classification - detection
Based on the predicted SVM response probabilities and feature cluster analysis, we computed summary features per 15 minutes of each intrusion and control period. These summary features related to the odds ratios of the probability of association of unsupervised clusters with intrusions vs. controls, the SVM predicted probabilities of behavioral response, and several features describing the values (and its spatial structure, e.g., clustering or autocorrelation) of these SVM predicted response probabilities. After computing summary features per 15 minutes, we summarized them even further for the intrusions vs. controls using the following eight statistics: mean, standard deviation, minimum, maximum, mean of the lagged differences, standard deviation of the lagged differences, minimum of the lagged differences and maximum of the lagged differences.
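The eight statistics used to summarize each 15-minute summary feature over an intrusion or control period can be written compactly. The sketch below assumes that "lagged differences" are first-order differences between consecutive 15-minute windows and requires at least three windows; the function name and the dictionary keys are hypothetical.

```python
from statistics import mean, stdev


def summarize_period(values):
    """Eight summary statistics of one feature over consecutive
    15-minute windows of a single intrusion or control period."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return {
        "mean": mean(values),
        "sd": stdev(values),
        "min": min(values),
        "max": max(values),
        "diff_mean": mean(diffs),
        "diff_sd": stdev(diffs),
        "diff_min": min(diffs),
        "diff_max": max(diffs),
    }
```

An intrusion of ca. 2 hours yields about eight 15-minute windows, so each summary feature is reduced to eight numbers per period.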
After computing the summary features, we built a logistic regression classifier to distinguish intrusions from controls. To create a parsimonious model, we iteratively added features to the model and evaluated its performance after each iteration. We evaluated the performance based on the model accuracy and performed validation through 25 times 2-fold cross-validation in a stratified way (by 25 times choosing a balanced random sample of intrusions and controls). We determined the sequence of adding features to the model by performing an independent two-sample t-test for each feature between the intrusions and controls. The feature with the largest t-value was then added to the model. After each feature addition, we removed its correlation with the remaining features using linear regressions with the added feature as independent variable and the remaining features as dependent variables, from which we extracted the residuals, standardized them to zero mean and unit variance, and applied the t-tests again. The (original) feature with the largest t-value was then added to the model again. This procedure was repeated until all features were ordered corresponding to their "importance". We then performed logistic regressions without interactions between the features for an increasing number of features (Supplementary Fig S3). The model already performed quite accurately with only 7 features (86.1 % accuracy ± SD 3.3 %, precision 82.6 % ± SD 6.9 %, recall 89.2 % ± SD 5.1 %). However, with 20 features and 2-way interactions the model achieved the maximum accuracy (90.9 %).
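The feature-ordering procedure described above (t-test, add the best feature, residualize the rest, repeat) can be sketched as follows. Several details are assumptions: Welch's form of the two-sample t statistic is used, features are ranked by the absolute t-value, and the remaining features are residualized on the (already residualized) version of the feature that was just added. The function names are hypothetical.

```python
from statistics import mean, stdev, variance


def t_statistic(a, b):
    """Welch two-sample t statistic; 0.0 if both samples are constant."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def residualize(y, x):
    """Standardized residuals of the simple linear regression y ~ x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx if sxx > 0 else 0.0
    residuals = [yi - my - slope * (xi - mx) for xi, yi in zip(x, y)]
    sd = stdev(residuals)
    return [r / sd for r in residuals] if sd > 1e-12 else [0.0] * len(residuals)


def order_features(features, labels):
    """Order feature names by iterative t-test ranking.

    `features` maps names to lists of values (one per period); `labels`
    holds 1 for intrusions and 0 for controls.
    """
    remaining = {name: list(col) for name, col in features.items()}
    order = []
    while remaining:
        def score(name):
            col = remaining[name]
            intrusions = [v for v, lab in zip(col, labels) if lab == 1]
            controls = [v for v, lab in zip(col, labels) if lab == 0]
            return abs(t_statistic(intrusions, controls))

        best = max(remaining, key=score)
        order.append(best)
        added = remaining.pop(best)
        # Remove the part of every remaining feature explained by the added one.
        remaining = {name: residualize(col, added) for name, col in remaining.items()}
    return order
```

The resulting order would then be used to fit logistic regressions with the first 1, 2, 3, ... features and to compare their cross-validated accuracy.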

System classification - localization
The data gathered during intrusions that were correctly predicted as such by the detection classifier were used to train the intrusion localization algorithm. The probability surface of the location of the intrusion was fitted relative to that of the sentinel animals using:

$$OR_{x,i} \sim \frac{p_i \left( W(\theta_{x,i}, \varphi_i, \sigma_1)\, L(d_{x,i}, \mu_1, s_1) \right) + (1 - p_i) \left( W(\theta_{x,i}, \varphi_i, \sigma_0)\, L(d_{x,i}, \mu_0, s_0) \right)}{W(\theta_{x,i}, \varphi_i, \sigma_0)\, L(d_{x,i}, \mu_0, s_0)}$$

where $OR_{x,i}$ is the odds ratio of intrusion presence at location $x$ evaluated for individual $i$, and $p_i$ is the SVM-predicted probability that individual $i$ is exhibiting response behavior. The function $W$ is the wrapped normal probability density function, $\theta_{x,i}$ is the direction from location $x$ to the location of the focal animal $i$, $\varphi_i$ is the movement direction of individual $i$, and $\sigma_1$ and $\sigma_0$ are the standard deviations of the unwrapped distributions. The function $L$ is the lognormal probability density function, where $d_{x,i}$ is the distance of location $x$ to $i$, and $\mu_1$ and $\mu_0$ as well as $s_1$ and $s_0$ are the lognormal distribution parameters (respectively log-mean and log-sd).
The parameters $\sigma_1$, $\mu_1$ and $s_1$ capture the geometry of intrusion-animal topology for animals that exhibited a predicted behavioral response to the intrusion. Similarly, $\sigma_0$, $\mu_0$ and $s_0$ are the corresponding parameters for animals that were predicted to be undisturbed. The three response parameters (two of them log-transformed) were fitted to the data assuming a third-order polynomial relationship to $t$: the time (in minutes) since the start of the predicted behavioral response (using the maximum F1 classification score). Since the behavioral response signature is lost over time, we truncated $t$ to 45 minutes (thus $t > 45$ minutes was set to $t = 45$). The parameters $\sigma_0$, $\mu_0$ and $s_0$ were estimated using the data of the controls and with randomly generated intrusion locations in the study area, in order to correct for the effects of geometry of the study area on the predicted response surfaces. The probability surface was then calculated from these odds ratios, scaled by a normalization constant so that it integrates to 1 over the area covered by the rectangular axis-aligned bounding box around the study area.
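To make the ingredients of the per-animal odds ratio concrete, the sketch below evaluates it for one candidate location. It rests on several assumptions that should be checked against the published equation: the numerator is taken to be a mixture of the "responding" and "undisturbed" likelihoods weighted by the SVM probability, the infinite wrapped normal sum is truncated to a finite number of terms, and all function names and parameter values are hypothetical.

```python
import math


def wrapped_normal_pdf(theta, mean, sd, terms=10):
    """Wrapped normal density on the circle, truncating the infinite sum."""
    total = 0.0
    for k in range(-terms, terms + 1):
        z = (theta - mean + 2.0 * math.pi * k) / sd
        total += math.exp(-0.5 * z * z)
    return total / (sd * math.sqrt(2.0 * math.pi))


def lognormal_pdf(distance, log_mean, log_sd):
    """Lognormal density of a positive distance."""
    if distance <= 0.0:
        return 0.0
    z = (math.log(distance) - log_mean) / log_sd
    return math.exp(-0.5 * z * z) / (distance * log_sd * math.sqrt(2.0 * math.pi))


def odds_ratio(theta, phi, distance, p, responding, undisturbed):
    """Odds ratio of intrusion presence at one location for one animal.

    theta: direction from the candidate location to the animal (radians)
    phi: movement direction of the animal (radians)
    p: SVM-predicted probability of response behavior
    responding, undisturbed: (sd of wrapped normal, log-mean, log-sd) tuples
    Assumption: probability-weighted mixture in the numerator.
    """
    like_resp = (wrapped_normal_pdf(theta, phi, responding[0])
                 * lognormal_pdf(distance, responding[1], responding[2]))
    like_undist = (wrapped_normal_pdf(theta, phi, undisturbed[0])
                   * lognormal_pdf(distance, undisturbed[1], undisturbed[2]))
    return (p * like_resp + (1.0 - p) * like_undist) / like_undist


def effective_time(minutes, cap=45.0):
    """Truncate the time since response onset, as described in the text."""
    return min(minutes, cap)


# Hypothetical parameters: responding animals move more directly away.
resp_params = (0.5, 6.0, 0.8)
undist_params = (3.0, 6.0, 0.8)
example_or = odds_ratio(0.3, 0.3, 400.0, 0.9, resp_params, undist_params)
```

Under this form, an animal with zero predicted response probability contributes an odds ratio of 1, so undisturbed animals leave the surface unchanged, whereas a responding animal moving directly away from a candidate location raises the odds there.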
To measure the prediction accuracy of each localization surface, we simplified each surface to a point coordinate at the location of maximum probability, and computed the Euclidean distance to the known true position of the intrusion. We then summarized each experimental intrusion by selecting the 10 prediction surfaces with the most concentrated high-probability region, i.e., those in which the top 5% of the probability density is contained in the smallest area. The spatial error of the localization prediction was further summarized by taking the average Euclidean distance over these 10 selected predictions.
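The error summary described above can be sketched as follows. The sketch assumes each surface is stored as a list of `(x, y, probability)` cells on a regular grid with equal cell area, and it interprets "top 5% probability density" as the highest-density cells that together hold 5% of the total probability mass. Both are assumptions, and all names are hypothetical.

```python
import math
import statistics


def peak_location(surface):
    """Coordinates of the grid cell with the maximum probability."""
    x, y, _ = max(surface, key=lambda cell: cell[2])
    return (x, y)


def top_mass_area(surface, cell_area, mass=0.05):
    """Area of the highest-density cells that together hold `mass` of the total."""
    total = sum(cell[2] for cell in surface)
    cumulative, n_cells = 0.0, 0
    for prob in sorted((cell[2] for cell in surface), reverse=True):
        cumulative += prob
        n_cells += 1
        if cumulative >= mass * total:
            break
    return n_cells * cell_area


def mean_localization_error(surfaces, truth, cell_area, keep=10):
    """Average peak-to-truth distance over the `keep` most concentrated surfaces."""
    ranked = sorted(surfaces, key=lambda s: top_mass_area(s, cell_area))
    errors = [math.dist(peak_location(s), truth) for s in ranked[:keep]]
    return statistics.fmean(errors)


# Hypothetical surfaces: one sharply peaked, one nearly uniform.
sharp = [(0.0, 0.0, 0.01), (100.0, 0.0, 0.90), (200.0, 0.0, 0.09)]
diffuse = [(0.0, 500.0, 0.01)] + [(float(k), 0.0, 0.01) for k in range(1, 100)]
error_best = mean_localization_error([diffuse, sharp], (0.0, 0.0), 1.0, keep=1)
```

Ranking by the area of the high-density region favors surfaces on which the algorithm is most certain, so diffuse predictions do not dominate the reported spatial error.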

Data Availability
Our data and code will be made available in the public 4TU.ResearchData repository upon publication. The data and code are available to the editors and reviewers upon contacting the corresponding authors.

Figure 1. Overview of the study area with three examples of how normal behavior varies spatially: (a) topography and tree cover in the study area (white to green with increasing tree cover); (b) movement speed (third quartile) and directionality of wildebeest during the afternoon (blue to red with increasing speed; length and darkness of line segments indicate the degree of directional preference and orientation indicates the preferred movement direction); and (c) modelled habitat suitability of wildebeest during the afternoon as a function of habitat characteristics (white to green with increasing suitability). The inset figures exemplify the importance of considering environmental context in the early warning system, since fast, straight and directional movements through low-suitability areas are part of the sentinels' normal behavior. Thus, solely detecting fast and straight movements may not suffice as an early warning indicator.

Figure 2. A sample of the 2117 computed animal movement features characterizing the sentinels' behavior near experimental intrusions, shown here as a function of the time since the annotated start of their response behavior (i.e., 'flight' and 'regroup' as described in the main text). All y-axes show standardized values (zero mean and unit variance when undisturbed), and the shaded area around each line (i.e., sentinel species) depicts the pointwise 95% CI of a Generalized Additive Model. When encountering the experimental intrusions, the sentinels moved faster (a), straighter (b), away from the intrusion (c), and with higher body acceleration (d). The sentinel species that prefer more grass-dominated habitats (i.e., lower tree cover) tended to move towards areas with higher tree cover (e) and thus lower habitat suitability. Moreover, encountering the intrusions induced more aligned collective movement (f).

Figure 3. Spatial performance of our early warning system. Panel (a) shows the predicted spatial probability surface for the intrusion's location (based on data from the sentinel animals only) for one of the experimental intrusions. For all experiments where the intrusion was algorithmically detected (82.5%), the spatial localization accuracy as a function of threshold distance (b) shows that 54.2% of these correctly detected intrusions could be localized with a spatial error of less than 500 m and 20.8% within 100 m. The dashed focal area shown in panel (a) is highlighted in panels (c-e), where the sentinels' (here: wildebeest) movements in the next 10 minutes are indicated with dashed lines. Panel (c) shows the spatial localization prediction of the intrusion. The evasive movements of the fleeing wildebeest are fast compared to their normal movement at that location (Fig. 1b), and highly aligned. While fleeing, the wildebeest move through habitat with a low suitability (d, see Fig. 1c), and towards areas that are energetically costly to reach (e; movement costs are computed based on topography and relative to their current position, where the cost of movement is assumed to be inversely proportional to movement speed on an incline as computed using Tobler's hiking function). The experimental intrusion depicted in this figure is animated in Supplementary Movie S1, including output from the animal classification and intrusion localization algorithms.