Artificial intelligence for automated detection of large mammals creates path to upscale drone surveys

Imagery from drones is becoming common in wildlife research and management, but processing data efficiently remains a challenge. We developed a methodology for training a convolutional neural network model on large-scale mosaic imagery to detect and count caribou (Rangifer tarandus), compared model performance with that of an experienced observer and a group of naïve observers, and discuss the use of aerial imagery and automated methods for large mammal surveys. Combining images taken at 75 m and 120 m above ground level, a faster region-based convolutional neural network (Faster R-CNN) model was trained using annotated imagery with the labels “adult caribou”, “calf caribou”, and “ghost caribou” (animals moving between images, producing blurred individuals during photogrammetry processing). Accuracy, precision, and recall of the model were 80%, 90%, and 88%, respectively. Detections by the model and the experienced observer were highly correlated (Pearson’s r: 0.96–0.99, P < 0.05). The model was generally more effective than naïve observers at detecting adults, calves, and ghosts at both altitudes. We also discuss the need to improve the consistency of observers’ annotations if manual review is to be used to train models accurately. Generalization of automated methods for large mammal detection will be necessary for large-scale studies with diverse platforms, airspace restrictions, and sensor capabilities.


Project Overview
The objective of this research was to investigate automated methods of detecting individual caribou (Rangifer tarandus) from drone imagery. To do this, we conducted drone surveys over a small herd of caribou along the northwestern border of Wapusk National Park, Manitoba, Canada (Figure S1), as part of a larger effort focused on surveying a common eider (Somateria mollissima) nesting colony. We collected still RGB imagery of caribou on 18 July 2016 with a fixed-wing aircraft and created georeferenced orthomosaics for use in manual and automated detection methods. In the following sections we describe the technical specifications of the drone platform and sensors used, as well as additional details on image collection and processing prior to manual image review and automated methods.

Platform specifications
Flights were conducted with a fixed-wing, rear-propelled Trimble UX5 (Figure S2). The Trimble UX5 is black in color, with a 100 cm wingspan and a weight of 2.5 kg. It has a cruising speed of 80 km/h and is powered by a single removable lithium polymer battery (14.8 V, 6000 mAh). The UX5 has an estimated endurance of 50 min and an estimated range of < 5 km (see section 2.3 Flight planning and method of operation). The Trimble UX5 is no longer commercially available from Trimble and has been replaced by the newer Delair UX11 (https://delair.aero/delair-commercialdrones/professional-mapping-drone-delair-ux11/).

Takeoff and retrieval
Prior to takeoff, the pilot-in-command performed a pre-flight checklist for the UX5 and uploaded the pre-programmed flight plan to the drone's onboard navigational system. UX5 takeoffs were initiated using an elastic catapult launcher to ensure a takeoff speed of at least 65 km/h, which is required to activate the electric motor (Figure S3). For optimal takeoff conditions, the launcher was faced into the prevailing wind and angled 45° from the ground. No specialized landing or retrieval equipment was used, as the UX5 is designed with a reinforced ventral surface to facilitate "belly landings". Landings require a relatively flat patch of land, approximately 30 × 75 m, to accommodate error or varying environmental conditions during the landing sequence (e.g., GPS error, crosswinds). All takeoffs during this study occurred within the La Pérouse Bay research compound, while landings took place adjacent to the compound as an extra safety precaution (i.e., to avoid collisions with personnel and buildings).

Flight planning and method of operation
All flight plans were preprogrammed semiautonomous line transects created using Trimble Access Aerial Imaging V2.0.00.40 (Trimble, Sunnyvale, CA). Flight planning involved specifying locations for takeoff and landing and the desired percent image overlap (which dictated the spacing between adjacent line transects). The direction and orientation of flight lines were determined based on environmental conditions (i.e., wind speed and direction). Programming drone flights in a remote environment required us to download Google Earth background imagery prior to field work (due to a lack of internet connectivity). Real-time monitoring of the UX5 position during flight was done using a Trimble Yuma 2 Ground Control Station (GCS), which provided constant feedback on UX5 flight parameters such as location, distance from GCS, battery level, altitude, and cruise speed. During flight, the UX5 executed flight plans automatically, but the pilot could intervene for safety by recalling the aircraft to the takeoff/landing site, executing a circular holding pattern, or aborting the flight via immediate landing. In accordance with our Transport Canada Special Flight Operations Certificate (see section 6. Permits, Regulations, Training, and Logistics), drone operations required the presence of at least two researchers on the ground: a pilot-in-command monitoring the status of the drone using the Yuma GCS, and an observer tasked with maintaining visual contact with the UX5 at all times during flight.
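Trimble Access computed line spacing internally from the requested overlap; as an illustration of the underlying geometry only, the spacing between adjacent transects can be sketched from the camera's ground footprint. This is a minimal sketch, not the authors' software, and it assumes a sensor width of approximately 23.4 mm for the Sony NEX-5R (a value not stated in the text); function names are our own:

```python
def ground_footprint_m(sensor_dim_mm, focal_mm, altitude_m):
    """Ground coverage (m) of one image dimension for a nadir photo."""
    return sensor_dim_mm / focal_mm * altitude_m

def transect_spacing_m(footprint_m, overlap):
    """Distance between adjacent flight lines for a given lateral overlap fraction."""
    return footprint_m * (1.0 - overlap)

# Assumed parameters: 23.4 mm sensor width, 15.5 mm focal length, 75 m AGL
swath = ground_footprint_m(23.4, 15.5, 75)   # ~113 m swath width
spacing = transect_spacing_m(swath, 0.80)    # ~23 m between adjacent lines at 80% overlap
```

Under these assumptions, the 80% overlap used in this study implies flight lines roughly 23 m apart at 75 m AGL.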

Data overview
Surveys of caribou were done by the UX5 collecting still images with 80% vertical and horizontal overlap. Imagery was collected in Red Green Blue format (RGB, 3 visible bands) and saved as JPG files on 316 GB SD cards housed onboard the UX5. Images were geotagged with latitude and longitude during post-processing based on a CSV file recording each camera trigger during the flight, and this information was used in the mosaic creation process (see section 5 Data Post-Processing). For an example of raw image quality, see Figure S4.
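The geotagging step pairs each image, in capture order, with the corresponding row of the trigger log. Pix4D performed this association in the study itself; the sketch below only illustrates the pairing logic, with hypothetical column names (latitude, longitude, altitude) standing in for whatever the actual log used:

```python
import csv

def geotag_images(image_names, trigger_log):
    """Pair images (in capture order) with rows of a camera-trigger log.

    trigger_log: iterable of CSV lines with hypothetical columns
    latitude, longitude, altitude -- one row per camera trigger.
    Returns {image name: (lat, lon, alt)}.
    """
    rows = list(csv.DictReader(trigger_log))
    if len(rows) != len(image_names):
        raise ValueError("trigger log row count does not match image count")
    return {name: (float(r["latitude"]), float(r["longitude"]), float(r["altitude"]))
            for name, r in zip(image_names, rows)}
```

Matching by capture order is only valid when every trigger produced exactly one image, which is why the sketch refuses to proceed on a count mismatch.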

Payload or sensor description, and data collection methods
The UX5 was equipped with a single payload during surveys: a nadir-oriented Sony NEX-5R 16.1 MP camera (Sony Corporation of America, New York, NY). During flight, images were automatically collected at a rate set to achieve the desired degree of image overlap (approximately one per second). Camera settings for all flights were as follows: exposure time 1/4000 s, focal length 15.5 mm, and automatic white balance. Ground sampling distances of images at 75 m and 120 m Above Ground Level (AGL) were 2.4 cm and 3.8 cm, respectively (see section 4. Field Operation Details).
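The reported ground sampling distances follow the standard nadir relation: pixel footprint on the ground = pixel pitch × altitude / focal length. A minimal sketch, assuming the NEX-5R's APS-C sensor is roughly 23.4 mm wide with 4912 pixels across (values not stated in the text):

```python
def gsd_cm(sensor_width_mm, image_width_px, focal_mm, altitude_m):
    """Ground sampling distance (cm/pixel) of a nadir image."""
    pixel_pitch_mm = sensor_width_mm / image_width_px
    return pixel_pitch_mm * altitude_m / focal_mm * 100.0

gsd_cm(23.4, 4912, 15.5, 75)   # ~2.3 cm (2.4 cm reported)
gsd_cm(23.4, 4912, 15.5, 120)  # ~3.7 cm (3.8 cm reported)
```

The small discrepancy from the reported values likely reflects rounding or slightly different assumed sensor dimensions.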

Field Operation Details
Drone surveys in this study were done using the UX5 on 18 July 2016, from 09:08 to 12:41 (seconds not recorded). Four flights were conducted to survey caribou, two at 120 m AGL and two at 75 m AGL (Table S1). Average flight duration was 28.5 min (range: 25–32 min). Weather conditions during flight were cloudy, as recorded by the pilot-in-command. While operations of the UX5 were limited to within visual line of sight, the caribou herd being surveyed fortunately remained close to the La Pérouse Bay field camp, well within line of sight.

Data Post-Processing
At the conclusion of each flight, we downloaded a comma-delimited (CSV) file with latitude, longitude, altitude, yaw, pitch, and roll for each camera trigger. This information was associated with the respective images in Pix4D (Version 4.x) at the start of mosaic processing. While not used in this analysis, the post-processing also produced a digital surface model (DSM) and point cloud. We did not use ground control points for additional georeferencing, since that level of spatial accuracy was not needed for the objective of the project. The digital orthomosaic TIFF files created (Figure S5) were then uploaded into the Open UAS Repository (OUR; https://digitalag.org/our/) for ease of viewing and classification of images, as described in the methods section. For full details of image geoprocessing and associated errors, we have included a full Pix4D Quality Report at the end of this document.