Perceived reward attainability may underlie dogs’ responses in inequity paradigms

Dogs have repeatedly been shown to give their paw to an experimenter more times for no reward when a rewarded conspecific partner is absent than when a rewarded conspecific is present, thereby showing inequity aversion. However, rather than being inequity averse, dogs might give their paw more when a partner is absent due to the experimenter’s procedure in which they move food in front of the subject to mimic feeding a partner. This action could increase subjects’ perception of reward attainability. We tested this hypothesis by introducing an improved type of control condition in which subjects were unrewarded for giving the paw in the presence of a rewarded box, a condition that more closely resembles the inequity condition. Inequity averse subjects’ performance did not differ based on whether the partner was another dog or a box. Moreover, these subjects gave the paw more times when no partner was present and the experimenter mimicked the feeding of a partner than when rewards were placed in the box. These results suggest that responses in the previous studies were inflated by subjects’ increased perception of reward attainability when no partner was present and, therefore, exaggerated dogs’ propensity to give up due to inequity aversion.


All models were fitted in R (versions 3.6.2 to 4.3.0 [1]). The packages and functions used in each case are given below. Random slopes were identified, and overdispersion and model stability were assessed, where necessary, using functions kindly provided by Roger Mundry. Boxplots were created using the R package ggplot2 (versions 3.3.2 to 3.4.2 [2]).

No. of times the subjects gave the paw (latency to give up)

To analyse the effect of condition on the number of times the subjects gave the paw, we fitted a Cox proportional hazards mixed effects regression model. The response variable included the number of trials completed by the subject and whether the event of "giving up" occurred (i.e. subjects who gave the paw on 30 trials did not give up, whereas a count lower than 30 meant the subject gave up).

We included fixed effects of "rewarded" (i.e. whether the subject was rewarded or not), "partner" (i.e. the type of partner, which was either a dog, the box, or no partner [empty space]), and an interaction between these two factors, with the interaction being the main term of interest. To control for its potential effect, we included test day order (i.e. whether a particular condition occurred on the first, second, or third test day for each subject) as an additional fixed effect. We included random intercept effects of subject (i.e. the identity of the dog) and dyad (i.e. the identity of the subject-partner pairing).

In order to avoid overconfidence with regard to the precision of the estimates for fixed effects, and to keep the type I error rate at the nominal level of 5%, we included almost all identifiable random slopes [3,4]: the random slopes of "rewarded", "partner", and test day order within the random effects of subject and dyad.

The model could only be fitted by excluding the correlations between the random slopes and random intercept; therefore, we excluded these correlations. We fitted the model using the function "coxme" in the package "coxme" (version 2.2-16 [5]). Prior to fitting the model, we z-transformed test day order to a mean of zero and a standard deviation of one to allow for an easier interpretation of results. The factors "rewarded" and "partner" were dummy coded and centred for inclusion as random slopes.
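As an illustration of the data preparation described above, the following Python sketch builds the (trials, event) response for a survival model, z-transforms a covariate, and dummy codes and centres a factor. It is not the R code used in the study: the function names, the choice of "dog" as the reference level, and all data values are hypothetical. Only the 30-trial cut-off is taken from the text.

```python
from statistics import mean, stdev

MAX_TRIALS = 30  # from the text: 30 paw-gives means the subject did not give up


def survival_response(paw_count, max_trials=MAX_TRIALS):
    """Return (time, event): trials completed and whether 'giving up' occurred."""
    event = 1 if paw_count < max_trials else 0  # 0 marks a censored observation
    return paw_count, event


def z_transform(values):
    """Scale to a mean of zero and a (sample) standard deviation of one."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def centred_dummies(levels, reference):
    """Dummy code a factor against `reference`, then centre each dummy column."""
    columns = {}
    for level in sorted(set(levels) - {reference}):
        raw = [1.0 if x == level else 0.0 for x in levels]
        m = mean(raw)
        columns[level] = [r - m for r in raw]
    return columns


# Hypothetical data for four sessions
paw_counts = [30, 12, 25, 30]
test_day = [1, 2, 3, 1]
partner = ["dog", "box", "none", "box"]

responses = [survival_response(c) for c in paw_counts]
day_z = z_transform(test_day)
partner_cols = centred_dummies(partner, reference="dog")
print(responses)
print(partner_cols)
```

Centring the dummy variables gives each column a mean of zero, so that a random slope for one level is expressed relative to the average session rather than to the reference level.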
Model stability plot. This plot represents the range of model estimates for each term in the model when the levels of the random effects were excluded one at a time. Names beginning with "dyad" and "subject" are the random effects: the term after the first full stop is either the random intercept or a random slope.

As an overall test of the effect of the interaction between the factors "rewarded" and "partner", we conducted a full-null model comparison [7], which aims to avoid cryptic multiple testing. The null model lacked the interaction between "rewarded" and "partner" but was otherwise identical to the full model. This comparison was based on a likelihood ratio test [8] using the R function "anova" with the "test" argument set to "Chisq". The sample for this model included a total of 117 observations across 20 subjects and 10 dyads.

A cumulative incidence plot representing the probability of discontinuing with the task across conditions and trials was created using the package "survminer" (version 0.4.9 [9]).

To assess interobserver reliability, we calculated the intraclass correlation coefficient using the function "icc" in the package "irr" (version 0.84.1 [10]), setting the "model" argument to "twoway" and the "type" argument to "consistency".

Inequity averse individuals only

Pairwise comparisons were conducted using a Wilcoxon signed-ranks test, specifically using the "wilcoxsign_test" function in the package "coin" (version 1.3-1 [11,12]), setting the "distribution" argument to "exact" and the "alternative" argument to "two.sided".
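To make the logic of an exact, two-sided signed-ranks test concrete, the following Python sketch enumerates every possible assignment of signs to the ranked absolute differences. It is a simplified illustration under stated assumptions, not a reimplementation of "wilcoxsign_test": zero differences are dropped before ranking, which may differ from how the "coin" package handles zeros, and the function names and data are hypothetical.

```python
from itertools import product


def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def exact_signed_rank(x, y):
    """Return (W+, exact two-sided p) for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # assumption: zeros dropped
    ranks = midranks([abs(d) for d in diffs])
    observed = sum(r for r, d in zip(ranks, diffs) if d > 0)
    centre = sum(ranks) / 2  # expected W+ when signs are assigned at random
    extreme = total = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        total += 1
        if abs(w - centre) >= abs(observed - centre) - 1e-9:
            extreme += 1
    return observed, extreme / total


# Hypothetical paw counts for five subjects in two conditions
condition_a = [10, 12, 15, 20, 8]
condition_b = [7, 13, 10, 14, 9]
w_plus, p_value = exact_signed_rank(condition_a, condition_b)
print(w_plus, p_value)
```

Full enumeration grows as 2^n, so this approach is only practical for small samples such as those in pairwise within-subject comparisons.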

Effect of age and sex on the probability of being inequity averse

To analyse the effect of age and sex on the probability of being inequity averse, we fitted a Generalized Linear Mixed Model (GLMM) with a binomial error distribution and a logit link function [13,14]. The response was binary (0, not inequity averse; 1, inequity averse).

We included condition, age and sex as fixed effects and dyad as a random intercept effect. No random slopes were included. Age was z-transformed to a mean of zero and a standard deviation of one prior to fitting the model. We fitted the model using the function "glmer" from the package "lme4" (version 1.1-33 [15]). Assessment of model stability, carried out as above, produced extreme estimates (see Fig. S3). This appeared to be due to one specific dyad (dyad 6). To test for the effects of age and sex we conducted a full-null model comparison as above. The null model lacked the fixed effects of age and sex. The sample for this model included a total of 20 observations across 10 subjects.

... number of paw commands issued plus the total number of sit commands issued.

We included fixed effects of "rewarded", "partner", and an interaction between these two factors, with the interaction being the main term of interest. To control for its potential effect, we included test day order as an additional fixed effect. To account for the differing number of trials completed across subjects, we included the log of the number of trials on which the subject gave the paw as an offset term [14].

We included random intercept effects of subject and dyad. All theoretically identifiable random slopes were included [3,4]; these were the random slopes of "rewarded", "partner", and test day order within the random effects of both subject and dyad. "Rewarded" and "partner" were manually dummy coded and centred for inclusion as random slopes.
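A minimal sketch of what the log offset does, assuming a count model with a log link: adding log(trials) to the linear predictor with a fixed coefficient of one means the remaining terms describe the number of commands per trial. The function name and the numbers are hypothetical and do not come from the fitted model.

```python
import math


def expected_commands(linear_predictor, n_paw_trials):
    """Expected count under a log link with log(n_paw_trials) as an offset."""
    return math.exp(linear_predictor + math.log(n_paw_trials))


# Hypothetical linear predictor (sum of fixed and random effect terms)
eta = 0.2
for trials in (10, 30):
    total = expected_commands(eta, trials)
    # The per-trial rate is the same regardless of how many trials were completed
    print(trials, total, total / trials)
```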
We fitted the model using the function "glmer" from the package "lme4" (version 1.1-23 [15]). Prior to fitting the model, we z-transformed test day order to a mean of zero and a standard deviation of one. This model was overdispersed (dispersion parameter: 1.825). To deal with overdispersion we included an observation level random effect. However, this resulted in an underdispersed model (dispersion parameter: 0.45).

Given these issues with overdispersion and underdispersion, we decided to fit a GLMM with a negative binomial distribution, dropping the observation level random effect. We fitted this model using the function "glmer.nb" from the "lme4" package (version 1.1-23 [15]).

Assessment of model stability was carried out as above. The model was of good stability for the fixed effects, random intercepts, and random slopes; model stability for correlations among random slopes and random intercepts was acceptable (see Fig. S4; for four models out of 30 in the model stability analysis, the iteration limit was reached).
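For readers unfamiliar with dispersion parameters, the sketch below computes one common version for a Poisson model: the sum of squared Pearson residuals divided by the residual degrees of freedom. This is an assumption about the general approach; the functions used in the study may compute the parameter differently, and the observed and fitted values here are hypothetical.

```python
def poisson_dispersion(observed, fitted, n_parameters):
    """Pearson chi-square divided by the residual degrees of freedom."""
    pearson = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return pearson / (len(observed) - n_parameters)


# Hypothetical counts and model-fitted means
observed = [3, 7, 1, 9, 4, 12]
fitted = [4.0, 5.0, 2.0, 7.0, 5.0, 10.0]
dispersion = poisson_dispersion(observed, fitted, n_parameters=2)
print(dispersion)
```

Values well above one indicate more variance than a Poisson model expects (overdispersion), and values well below one indicate less (underdispersion). On this scale, the reported values of 1.825 and 0.45 fall on opposite sides of one.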

Names including an ampersat symbol (@) refer to random effects: the first term in the name is the grouping variable, the term after the first ampersat is either a random intercept (indicated by "(Intercept)") or a random slope; if there is also a name after the second ampersat, this effect represents a correlation within the random effect.

As an overall test of the effect of condition, we conducted a full-null model comparison, as above [7]. The null model lacked the interaction between "rewarded" and "partner". The sample for this model included a total of 117 observations across 20 subjects and 10 dyads.

Interobserver reliability for the total number of commands issued was excellent (ICC = 0.944, n observations = 24, p < 0.001).
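The intraclass correlation coefficient reported above can be illustrated with a short Python sketch of the two-way, consistency, single-measure form, computed from the mean squares of a sessions-by-observers table. This is a sketch under the assumption that this standard formula matches the settings described in the methods. The function name and the counts are hypothetical, and the example does not reproduce the reported value.

```python
def icc_consistency(ratings):
    """Two-way consistency ICC for single measures.

    `ratings` is a list of rows, one per session, each holding one value
    per observer.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical command counts from two observers for four sessions
counts = [[1, 2], [2, 1], [3, 4], [4, 3]]
icc = icc_consistency(counts)
print(icc)
```

Because the consistency form ignores systematic differences between observers, a second observer who always counts a fixed amount more than the first would still yield an ICC of one.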

Inequity averse individuals only

Pairwise comparisons were conducted using a Wilcoxon signed-ranks test, specifically using the "wilcoxsign_test" function in the package "coin" (version 1.3-1 [11,12]), setting the "distribution" argument to "exact" and the "alternative" argument to "two.sided".

Pilot study

An initial pilot study was carried out prior to the study presented in the main article. The methods resembled those for the main study, except for the differences identified below.

Sixteen pet dogs were tested in this study (6 f; 10 m; mean age ± SD = 5.69 ± 3.42 years). These sixteen dogs were different from those tested in the main study.

... Table 2 in the main manuscript). In most cases, both dogs in the dyad were included in the study as subjects. After the first subject in a dyad had completed all four conditions, the roles were reversed. The order in which the social and asocial conditions were experienced was counterbalanced across all subjects. An error in performance of the procedure occurred in a total of four sessions; thus, these sessions were repeated on a third test day. Three of the subjects were recruited from the same household; therefore, one dog played the role of the partner twice.

Videos were coded by one experimenter. Interobserver reliability was assessed by comparing counts obtained by video coding with the counts from score sheets, which were manually scored during the test sessions. This was performed for 20% of test sessions. Interobserver ...

The model was generally of acceptable stability, with the exception of the interaction between "rewarded" and "partner", which produced an extreme range, and the random slope of "rewarded" within the random effect of "subject" (see Fig. S7).

Fig. S7 Model stability plot for the Cox regression model analysing subjects' latency to discontinue with the task. This plot represents the range of model estimates for each term in the model when the levels of the random effects were excluded one at a time. Names beginning with "dyad" and "subject" are the random effects: the term after the first full stop is either the random intercept or a random slope.

Effect of beginning the study as a subject or a partner

We analysed whether the number of times subjects gave the paw depended on whether they carried out all of their sessions as the subject first or began the study playing the role of the partner. We used a Wilcoxon rank sum test, specifically the "wilcox_test" function in the package "coin" (version 1.4-2 [11,12]), setting the "distribution" argument to "exact" and the "alternative" argument to "two.sided".
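The idea behind an exact, two-sided rank sum test for two independent groups can be sketched in Python by enumerating every way of assigning the pooled ranks to the first group. This is an illustration under stated assumptions rather than a reimplementation of "wilcox_test": ties receive midranks, the two-sided p-value counts rank sums at least as far from their expectation as the observed one, and the function names and data are hypothetical.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def exact_rank_sum(group_a, group_b):
    """Return (rank sum of group_a, exact two-sided p)."""
    pooled = list(group_a) + list(group_b)
    ranks = midranks(pooled)
    n_a = len(group_a)
    observed = sum(ranks[:n_a])
    expected = n_a * (len(pooled) + 1) / 2  # mean rank sum under the null
    extreme = total = 0
    for chosen in combinations(range(len(pooled)), n_a):
        w = sum(ranks[i] for i in chosen)
        total += 1
        if abs(w - expected) >= abs(observed - expected) - 1e-9:
            extreme += 1
    return observed, extreme / total


# Hypothetical paw counts for dogs that began as subject vs. as partner
began_as_subject = [5, 8, 11]
began_as_partner = [14, 20, 25, 30]
rank_sum, p_value = exact_rank_sum(began_as_subject, began_as_partner)
print(rank_sum, p_value)
```

With sixteen dogs split into two groups, the number of possible assignments remains small enough to enumerate directly, which is what makes an exact distribution feasible for samples of this size.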

Effect of experience of hunting or nosework

We also analysed the effect of previous experience with hunting or nosework on the number of times subjects gave the paw in the inequity-box condition. Subjects were divided into two categories: no experience of hunting or nosework, and some experience of hunting or nosework. This information was based on a questionnaire typically filled out by participants of studies at the Clever Dog Lab. We carried out a Wilcoxon rank sum test using the "wilcox_test" function in the package "coin" (version 1.4-2 [11,12]), setting the "distribution" argument to "exact" and the "alternative" argument to "two.sided".

Names including an ampersat symbol (@) refer to random effects: the first term in the name is the grouping variable, the term after the first ampersat is either a random intercept (indicated by "(Intercept)") or a random slope.

Number of times the subjects gave the paw (latency to give up)

Overall, no significant interaction between the factors "rewarded" and "partner" was detected in the model assessing subjects' latency to give up (i.e. the number of times they gave the paw; ...)

Effect of beginning the study as a subject or a partner

There was an effect of the subject's first role on the number of times they gave the paw (Wilcoxon rank sum test: z = 2.4697, p = 0.013). Individuals that began the study playing the role of the subject gave the paw fewer times than those that began by playing the role of partner and later became subjects (see ...)

Effect of experience of hunting or nosework

There was no effect of experience of hunting or nosework on the number of trials in which the subjects gave the paw in the inequity-box condition (Wilcoxon rank sum test: z = -0.7000, p = 0.514).