Short-term behavioural impact contrasts with long-term fitness consequences of biologging in a long-lived seabird

Biologging has emerged as one of the most powerful and widely used technologies in ethology and ecology, providing unprecedented insight into animal behaviour. However, attaching loggers to animals may alter their behaviour, leading to the collection of data that fails to represent natural activity accurately. This is of particular concern in free-ranging animals, where tagged individuals can rarely be monitored directly. One of the most commonly reported measures of impact is breeding success, but this ignores potential short-term alterations to individual behaviour. When collecting ecological or behavioural data, such changes can have important consequences for the inference of results. Here, we take a multifaceted approach to investigate whether tagging leads to short-term behavioural changes, and whether these are later reflected in breeding performance, in a pelagic seabird. We analyse a long-term dataset of tracking data from Manx shearwaters (Puffinus puffinus), comparing the effects of carrying no device, small geolocator (GLS) devices (0.6% body mass), large Global Positioning System (GPS) devices (4.2% body mass) and a combination of the two (4.8% body mass). Despite exhibiting normal breeding success in both the year of tagging and the following year, incubating birds carrying GPS devices altered their foraging behaviour compared to untagged birds. During their foraging trips, GPS-tagged birds doubled their time away from the nest, experienced reduced foraging gains (64% reduction in mass gained per day) and reduced flight time by 14%. These findings demonstrate that the perceived impacts of device deployment depends on the scale over which they are sought: long-term measures, such as breeding success, can obscure finer-scale behavioural change, potentially limiting the validity of using GPS to infer at-sea behaviour when answering behavioural or ecological questions.


Results
In total, 591 foraging trips were extracted for subsequent analyses, representing 405 individuals from 174 nests. To ensure that all our measures were fully comparable, we subset these trips to include only those individuals for which overall breeding success and, if applicable, measures of chick provisioning, were known. From this, 171 foraging trips, representing 130 individuals from 72 nests, remained. All ensuing analyses were performed on both the complete and comparative datasets; effect sizes and significance values did not differ depending on which dataset was used. For brevity, only results from the subset comparative dataset are presented here. Results from analyses conducted on the complete dataset are reported in supplementary materials. Figures include data from the entire dataset.
The starting masses for birds in each deployment group did not significantly differ (no device: 388.8 ± 9.2 g, GLS: 405.3 ± 7.3 g, GPS: 397.4 ± 6.2 g, combined: 402.7 ± 3.9 g; χ 2 = 4.0, df = 3, p = 0.264). Given these mean starting masses, the percentage body mass of each device equated to 4.8 ± 0.04% for the combined deployment, 4.3 ± 0.06% for the GPS-only deployment and 0.7 ± 0.08% for the GLS deployment. The metal ring carried by all birds equated to 0.2 ± 0.1% of body mass.
Effects of tagging on breeding success. Deployment type did not significantly predict breeding success in the year of deployment (χ 2 = 0.42, df = 3, p = 0.94) or the subsequent year (χ 2 = 3.09, df = 3, p = 0.38). A breakdown of the percentage of eggs surviving until at least late-stage chicks is shown in Table 1.
Effects on chick provisioning. In total, the mass changes of 48 chicks were recorded over the years 2013-2017. Neither deployment of GLS or GPS during the provisioning period, nor deployment of GPS during incubation, had an effect on the duration of the provisioning period for chicks, the number of nights they were fed, the maximum mass they attained, or their average feed per night, relative to chicks of non-tracked parents (

Discussion
In this study we compared behavioural and fitness measures of tagging impact in a free-ranging breeding seabird. We found that the short-term deployment of GPS tags on Manx shearwaters is associated with significant foraging behavioural changes during incubation, but that these do not translate into reduced breeding success in either the year of tagging or the subsequent year. Thus, the perceived impact of tag deployment for this species may depend on the measure used to assess it, potentially leading to substantial effects on behaviour being overlooked. More specifically, during incubation, shearwaters which carried a GPS doubled their foraging trip durations, reduced flight time, and gained less mass per day of their trips. However, during chick rearing, no effects on parental provisioning could be observed: pairs in which one adult was tagged provisioned as much and as often as unmanipulated pairs. Thus, breeding success may be insufficient to capture short-term alterations in behaviour. There are several mechanisms by which the attachment of the GPS tag could have led to our observed changes in behaviour, including: increased mass, increased drag forces, modifications to the centre of mass, or via the attachment procedure itself, which may induce stress and, in turn, lead to alterations in behaviour [38][39][40][41][42] . While the masses of our devices exceeded the 3% threshold 43 upheld by some authors, the negative impacts associated with tagging have not been found to increase linearly with tag mass 11 , suggesting that a reduced mass threshold would not have been sufficient to eliminate the effects we observed here. Indeed, the attachment of tags < 3% of body mass has been found to cause negative impacts in a wide variety of species (e.g. great cormorant 18 ; common starlings Sturnus vulgaris 44 ; grey seal Halichoerus grypus 45 ; Magellanic penguin Spheniscus magellanicus 46 ; green turtle Chelonia mydas 47 ; bottlenose dolphins 15 ). In particular, the disruption of air or water flow associated with attaching an external tag means that the cost of movement can be increased substantially 44,48,49 , especially for aquatic or volant animals. In addition to differences in the mass or impediment imposed by the different deployments discussed here, there are small differences in the duration and type of handling experienced by birds, so it is possible that these also contributed to the observed effects. However, any such effects are unlikely to be a main driver of our observed results. All birds, even those which were untagged, experienced handling, which was always short, and during deployments we did not observe visible signs of stress such as vocalisation or escape behaviour. The fact that we found no differences in foraging trip duration between GLS birds which had experienced occasional or frequent handling also supports this. Responses to handling have been shown to be less disturbing than device deployment in other seabirds 50 , and several studies in other Procellariiformes have shown that heart rate increases caused by handling return to base levels within a few hours 51,52 . It is therefore likely that in our study, the effects of handling dissipated quickly, and that the longer-term behavioural changes in flight and foraging behaviour we observed were driven by the persistent effects of carrying the device.
This may be the case in Manx shearwaters, which exhibited reduced flight time when carrying a GPS. It is unlikely that the birds in this study are physically prevented from ranging to long distance, profitable sites: while it is not possible to compare with untracked controls, GPS-tracked Manx shearwaters forage at considerable distances from the colony, covering the length of the Irish sea (approximately 280 km 40 ), with trips averaging a total length of 1517 km (maximum 2117 km 1 ). However, if increased flight costs compromise competitive ability, tracked birds may be competitively displaced from profitable patches, as may be the case for immature shearwaters 53 . Alternatively, the reduction in flight time observed may reflect a decrease in foraging efficiency associated with tagging. While tagged shearwaters in our study did not alter the amount of time they dedicated to foraging, it is possible that they increased foraging effort within this time. If shearwaters operate under an intrinsic energy ceiling 49 , and GPS-tracked birds invest more time in high-energy foraging, it may be the case that this increased energetic expenditure must be traded-off against reduced flight behaviour (energetically expensive) and increased rest (energetically inexpensive). Finally, while the change in rest time for GPS-tracked birds was not found to be significant, it is possible that the marginal increase in resting behaviour could reflect time 'wasted' pecking or preening at the tag 21 , with flight reducing as a consequence. If this is the case, individuals may be expected to habituate to tags over longer or multiple deployments, and so eventually return closer to baseline behaviour, though we did not have sufficient repeated measures of individuals to examine this in our dataset. Regardless of the specific mechanisms, we find that foraging gains per day are reduced. That the foraging trip duration overall is substantially increased in GPS-tracked individuals may reflect birds attempting to compensate for compromised foraging ability by extending the length of the trip itself. Consequently, the behaviour of GPS-tagged shearwaters is unlikely to be faithfully representative of unmanipulated individuals.
Despite this substantial disruption to behaviour, we did not observe changes in breeding success which could be related to device deployment type. This may in part reflect compensation by the partner. It is well established that in many species (e.g. Kentish plover Charadrius alexandrinus 54 ; burying beetle 55 ; starling 56 ), including seabirds (e.g. Cape gannet Morus capensis 57 ; great frigatebird Fregata minor 58 ; Cory's shearwater Calonectris borealis 59 ), a reduction in care by one parent can be compensated by the partner. It is possible in this case that tagging disrupts the normally equally shared 60 parental care burden. If the costs of tagging during chick rearing are absorbed by the un-tagged parent, then overall breeding outcome may not give an accurate assessment of tagging impacts in this species. This has been observed in thick-billed murres 38 and black-legged kittiwakes (Rissa tridactyla) 61 , where a reduction in provisioning rate by tagged adults appears to be compensated by the partner, leading to normal fledging success. The stark differences in impact we observed between the two halves of the breeding season may therefore reflect the fact that during incubation we can measure precisely individual investment, while during chick-rearing we only measure the response of the pair as a whole, meaning compensation could occlude finer scale impacts on individual behaviour. It will therefore be important for future work to test whether nest visitation and feed delivery differs between GPS-tracked and untracked parents. Individual costs may additionally be absorbed into future reproductive attempts, as has been observed in northern wheatears (Oenanthe oenanthe 28 ). In Manx shearwaters, an experimental increase in parental effort in one year has been found to lead to reduced breeding success in the following year 62 63 . Consequently, long-term assessment of breeding success may be required to identify costs to reproduction. In long-lived species that produce few offspring, breeding success at a single point in time is unlikely to capture the full breadth of impact on an instrumented bird sufficiently. In our study, even major changes to individual behaviour, here identified as significant disruption to behaviour when foraging, were not reflected in breeding outcomes as we measured them. These results demonstrate that focusing on breeding success is inadequate to identify the full complement of impacts experienced by animals carrying tags.
It is now well understood that the instrumentation of wild animals can lead to undesirable effects on their fitness and behaviour. However, it is less clear how well fitness can provide a proxy for behavioural changes. Here, we identify major alterations to foraging behaviour which are not reflected in our measures of breeding success, highlighting the critical need for alternative measurements when considering the impacts of tagging. It is likely that the disjunction between effects on fitness and behaviour observed here are commonplace in instrumented species. Consequently, careful consideration of which impacts are of interest from the perspective of the study design are necessary to determine whether data collected are representative of the wider population. By considering tagging impacts beyond coarse measures such as breeding success in a single season, authors can better identify from where their impacts arise and hence attempt to mitigate them. Even simple measures, such as comparisons of individual condition (easily measured as mass) or assessments of behavioural measures (such as trip length), in comparison to controls, will give a more complete picture of the often complex impacts of tag deployment on individuals, ultimately giving us a better toolkit with which we can answer fundamental questions about animal behaviour.

Methods
Study site and species. The study was conducted from 2009 to 2017 on UK Manx shearwater breeding colonies and long-term studies sites at Skomer Island, Wales (51°44′ N, 5°17′ W) and Lighthouse Island in the Copelands group, Northern Ireland (54°44′ N, 5°31′ W). Manx shearwaters breed on dense island colonies between April and September, during which time they come onto land only at night. In April, females lay a single egg in an underground burrow, which is incubated for about 51 days, with stints of 5-7 days taken by each parent. Following hatching, both parents feed the chick most nights for approximately 60 days. Occupancy and breeding success were monitored at the Skomer colony each year, with birds on both islands being easily accessed at the nest through the burrow entrance or purpose-built inspection hatches. To allow individual identification, all birds in this study were ringed with a permanent stainless steel ring, provided by the British Trust for Ornithology. Females were identified at the point of laying through cloacal inspection 64 , and males by inference.
Sampling methods. GPS tracking campaigns lasted for approximately two weeks during the incubation (May-June) and chick rearing periods (July-August). Deployments on the two islands were carried out simultaneously in each given year. We analysed the foraging trip metrics during incubation and provisioning behaviour of birds subject to four different logger deployments: no device (ring only, 0.78 g), a geolocator only (2.5 g), a single global positioning system (GPS) logger (total attachment mass 17 g), or a GPS logger and geolocator (GLS, c. 19.5 g). The sample sizes in each of these groups can be found in Table 4.
For individuals carrying no logger, foraging trips were identified by directly identifying the incubating parent once a day from laying to hatching, by removing the incubating adult from the nest and reading its ring number, which took no longer than 1 min, adding to a total of c. 20 min over the whole incubation period. Whichever bird was not present at the nest was deemed to be on a foraging trip.
GLS (Migrate Technology Intigeo-C250 or Intigeo-C65; BioTrack Mk4083; British Antartic Survey MK-14 or MK-19), were deployed either simultaneously with GPS (combined group) or at the beginning of the breeding season (GLS group), in April, and were retrieved at the beginning of the breeding season the following year. GLS were attached by two small cable ties to a plastic leg ring to ensure immersion when sitting on the sea (see 65 ). Deployment and retrieval of GLS was conducted in the field and handling time did not normally exceed 5 min. 27 individuals carrying GLS were handled daily, as per the no device birds, as part of a separate experiment. We analysed whether these two subgroups differed in their mean foraging trip duration; as they did not, we pooled the subgroups for the remainder of the analysis (see supplementary materials). GPS loggers (I-gotU GT-120) were waterproofed using lightweight heat-sealed plastic sleeves and attached to the dorsal feathers of birds using 5 strips of marine TESA tape (see 40 for details). Birds were captured on the nest during changeover with the partner (egg incubation) or during a provisioning visit to the nest (chick rearing), after they had been allowed approximately 30 min to feed their chick. GPS deployment was carried out in a darkened lab and processing time was approximately 10 min. On both islands the lab is close to the colony and so transportation did not exceed 10 min, giving a total maximum handling time, from capture to release, of 20 min. Following deployment of the www.nature.com/scientificreports/ GPS, the bird was returned to the nest and allowed to depart on its foraging trip naturally. Incubating birds were recaptured as soon as they returned from foraging; efforts to recapture chick-provisioning birds began 3 days after departure until their subsequent return to the nest. The foraging trip duration of incubating birds could thus be identified as the period it was not found in the nest.

Mass changes and breeding success.
To measure foraging gains from each trip, mass measurements were taken for all GPS birds at deployment and retrieval, and were taken daily during direct observations of non-instrumented and 23 GLS-tracked nests made during incubation in 2015 and 2016. Birds were inserted head first into a draw-string muslin bag attached to a 600 g Pesola spring balance (precise to 5 g). The resulting mass change (mass at return -mass at departure) was divided by the duration of the foraging trip (in days) to give a measure of daily mass gain.
To determine the meal sizes provided by parents, chick masses were collected on Skomer in 96 nests where at least one parent carried a GPS, 48 where at least one carried a GLS, and 89 where neither parent carried a device, between 2013 and 2017. Chicks were placed into a plastic box and weighed on a digital balance precise to 1 g. Chick masses on Skomer were collected daily from the day following hatching until the chick was not found in the nest for 3 consecutive days, at which point it was presumed to have fledged. For each nest, the duration of the provisioning period was identified as the time between hatching and the last known feed (taken to be an increase in the chick's body mass). For those chicks surviving to fledgling stage, the maximum mass attained, total number of nights fed, and average daily mass change was calculated.
At the end of the season, data on breeding outcome were collected for each nest in the Skomer study colony. Breeding outcomes were not available for Copeland birds. extracting foraging trip metrics. As foraging trip durations were not available for unmanipulated or GLS-carrying birds during chick-rearing, differences in foraging trip metrics were only investigated during the incubation period. For each year and tracking campaign, only foraging trips from GLS-only and no device birds that occurred simultaneously with the GPS-tracking period were used in the analysis. GLS devices tested for saltwater immersion every 3 or 6 s and recorded the proportion of samples immersed in water in each 10-or 5-min epoch respectively. The frequency distributions of total daily immersion (sum of each 10-or 5-minute immersion score) recorded by the GLS were used to separate incubation stints from foraging trips, using the expectation that days of incubation in the burrow would be characterized by a distribution of very low total immersion values. The package mixtools 66 was used to fit a Gaussian mixture model to the frequency data, which was used to assign days with lower total immersion as incubation days. These assignments were subsequently validated using individuals for which we knew through direct observation whether the bird was incubating or not. This yielded a 92% accuracy for assignment of foraging dates.
At-sea behaviour was classified using immersion data, using the same threshold methods outlined by Fayet et al., 2016 67 , where classifications were outlined as: < 2% maximum immersion for directed flight, > 98% maximum immersion for resting on the water, and intermediate values for foraging. Intermediate values are caused by the bird taking off and landing on the water, and have been shown to be indicative of foraging behaviours in this species by validation with dive loggers 1 .
Statistical analysis. All statistical analyses were completed using R version 3.2.2 68 . The R package lme4 69 was used to construct linear mixed effects models (LMMs), and p-values were obtained by comparing models to null models without the effect of interest, using a likelihood ratio test. Least squares means for each level of the deployment type were calculated using the R package emmeans 70 , and Tukey's range test used to calculate significant differences between them. To ensure that repeated measurements from the same individuals or inter-year differences did not impact our parameter estimates, all models included a random intercept effect of individual nested within year. For our model of breeding success, the random intercept was burrow identity nested within year.
To investigate differences in breeding success according to deployment type, a generalized linear mixed effects model with a binomial distribution was fitted to breeding success in the current (t) and subsequent (t + 1) year. Breeding success in t + 1 additionally included the fixed effect of breeding success in t.
To ensure that differences in trip length could not be explained by differences in the starting masses of birds, an LMM was fitted to determine the effect of deployment type (no device, GLS, GPS or combined) on (1) starting mass (g). Further LMMS were fitted to examine the effect of deployment type on (2) foraging trip duration, (3) daily foraging gain, and (4) total foraging gains. Models 2-4 included the fixed effect of starting mass (g) and sex; model 2 additionally included the interaction of deployment type and body condition was as a fixed predictor. To allow for an increased sample size and greater variation in foraging trip duration across the treatments, foraging trips from both Copeland and Skomer birds were included in the analysis. As mean foraging trip duration differs between the two islands, island was included as a fixed effect in all 3 models. For data collected during the chick rearing period, LMMS were also fitted to (5) provisioning trip duration, (6) maximum chick mass obtained, (7) daily feed size, and (8) number of nights fed. Models 5-8 included the random effect of nest within year to control for repeated measurements from nests.
To investigate differences in at-sea behaviour during foraging trips for birds in the GLS and combined deployment groups, LMMs were fitted to determine the effect of deployment type on the proportion of time spent in each of foraging, resting, and flight states per day of the trip. Trip duration and island were additionally included as fixed effects.