Introduction

An intertemporal choice is a decision between outcomes that occur at distinct points in time and typically requires a trade-off between how much we care for a given benefit and how long it takes to get it1. We confront time-sensitive choices in many areas of our lives, ranging from dietary choices to financial decisions, and as such intertemporal trade-offs have become a major issue of interest in the fields of economics, psychology, and biology, among others. Patience can be defined as the tendency to forgo an immediate reward to get a larger benefit thereafter, and correlates with a wide range of measures of adaptation and health; for instance, more patient individuals are less prone to addiction, obesity, or criminal behavior2,3,4.

Time preferences can be formally captured by a discount curve that describes how we value a good as a function of the time we need to wait to obtain it. The Discounted Utility model typically used in the social sciences entails discounting exponentially the value of a delayed reward5. In this model, future benefits are discounted at a constant rate over time, which implies that we consider each delay equally, irrespective of its position in our temporal horizon. However, research over the last few decades has shown that our time preferences may be better described by a more concave discount function, resembling a hyperbola, with a steeper discount rate for rewards in the immediate future6,7. This hyperbolic (or quasi-hyperbolic) function captures our tendency to make more impatient decisions when a delay is nearer in time. For instance, when offered the choice between 100 dollars today or 110 dollars in one month, many of us would accept the immediate reward, but if given the same choice one year in advance (100 dollars in twelve months or 110 dollars in thirteen months) most would switch to the second option.
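The contrast between the two discount functions can be illustrated with a short numerical sketch. The snippet below uses the standard exponential form, V = A·exp(−r·D), and the standard hyperbolic form, V = A/(1 + k·D). The function names and the parameter values (r = 0.15 and k = 0.2 per month) are assumptions chosen only for illustration and are not estimates from any of the studies cited here.

```python
import math

def exponential_value(amount, delay, rate):
    """Exponential discounting: value falls by a constant proportion per unit of delay."""
    return amount * math.exp(-rate * delay)

def hyperbolic_value(amount, delay, k):
    """Hyperbolic discounting: value falls steeply at short delays, then flattens."""
    return amount / (1.0 + k * delay)

def prefers_later(value_fn, param, small, large, delay_small, delay_large):
    """True if the discounted larger-later reward beats the smaller-sooner one."""
    return value_fn(large, delay_large, param) > value_fn(small, delay_small, param)

# Hypothetical parameters (per month), chosen only to illustrate the pattern.
RATE_EXPONENTIAL = 0.15
K_HYPERBOLIC = 0.2

# 100 dollars now vs. 110 in one month, and the same pair shifted by twelve months.
near = (100, 110, 0, 1)
far = (100, 110, 12, 13)

exp_near = prefers_later(exponential_value, RATE_EXPONENTIAL, *near)
exp_far = prefers_later(exponential_value, RATE_EXPONENTIAL, *far)
hyp_near = prefers_later(hyperbolic_value, K_HYPERBOLIC, *near)
hyp_far = prefers_later(hyperbolic_value, K_HYPERBOLIC, *far)
```

Under these assumed parameters, the exponential chooser takes the smaller-sooner reward in both framings, whereas the hyperbolic chooser switches to the larger-later reward once both options are pushed twelve months into the future.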

Therefore, when faced with the choice between an immediate reward or a larger reward available after a delay, we frequently exhibit impatience by opting for the smaller, immediate gratification. However, when presented with the same choice in advance, we tend to display more patience and opt for the larger, delayed reward. Such a preference reversal implies that our decisions depend on our relative position in the temporal continuum and that our choices tend to change over time. This dynamic attribute of our decisions, often referred to as dynamic inconsistency or time inconsistency, has greatly enriched the study of traditional models of economic decision-making8, and has been found to apply to different types of rewards and timescales9 and to be present in children10 and across different cultures11. Alongside hyperbolic preferences, researchers in decision-making have proposed various other models to describe the fact that we tend to act more patiently when our options are further in the future [see Ref.12 for a review]. For example, several studies13,14,15 posit temporal preferences as a drive for immediate gratification that may be counteracted, at the expense of associated utility costs, by self-control cognitive systems.

Although distinct theories about why time inconsistency occurs have been proposed over the years, there has classically been some agreement on the involvement of more visceral processes in near-term decisions and cooler systems in long-term plans. While decisions for the here and now tend to be more context-dependent and more clearly associated with emotional reactivity, the decisions we make for the long run tend to be more stable and linked to beliefs and long-term goals16,17. Moreover, the latter are less self-centered; for instance, we make similar decisions when our choices include temporally distant alternatives and when they involve socially distant ones, such as deciding for another person18.

Non-human animals also face time-sensitive decisions in social and physical contexts, and their temporal choices can be modeled in the laboratory [but see Ref.19]. There is ample evidence that non-human species such as pigeons, rodents, and rhesus macaques also make more patient decisions when their alternatives are far in time, as opposed to when choices include one immediate alternative. In fact, George Ainslie’s work on operant conditioning with pigeons was crucial for the conceptual development of time inconsistency and hyperbolic discounting20,21. In a classical study22, six pigeons were presented with the option to peck on one of two discs, resulting in either 2 s (small reward) or 4 s (large reward) of access to a grain feeder. The small reward was accessible after a certain delay, ranging from 0.01 to 12 s, while the large reward always had a delay of 4 s more than the small reward. All subjects preferred the smaller but sooner reward when it was immediate but, as the delay to both rewards increased, they reversed their preference and started choosing more frequently the larger but later reward. Similar results were found in a broad array of experiments with pigeons and rats23,24. More recently, studies in psychopharmacology and neuroscience have also demonstrated that rhesus monkeys' decisions over time fit a hyperbolic discounting curve better than an exponential one. This pattern appears when macaques repeatedly choose between small amounts of saccharin, juice, or cocaine with different time delays [e.g. Refs.25,26], and it is also reflected in their performance in instrumental tasks when the delay of the reward for a correct response is taken into consideration27,28, see also Ref.19.

However, the decisions between an immediate and a delayed reward have attracted most of the attention in comparative studies. This has been the case, for instance, with the research line that uses matched tasks for multiple species and connects their performance to their ecological and social context. On the one hand, an ecological-intelligence hypothesis29 postulates that the ability to delay gratification may be favored in species that depend on more variable food resources that are difficult to obtain, among other physical challenges. On the other hand, in line with a social-intelligence hypothesis30, it may have evolved in complex societies that pose the difficulties associated with living in larger, more variable, or highly hierarchical groups. Also, as the two hypotheses are not incompatible, a variety of both ecological and social factors may have also tailored the intertemporal preferences in different species.

The methodology that researchers have applied with similar parameters to a larger number of species is the delay-adjusting procedure31, in which an animal chooses between a quantity of food delivered immediately and a quantity three times larger that, if chosen, is delivered after a delay. This delay is adjusted until it reaches an indifference point at which the animal chooses each alternative equally often. This task has revealed significant differences between species: while pigeons and rats typically wait only a few seconds to get a larger reward, great apes are willing to wait up to an average of one minute (orangutans, bonobos, and gorillas) or two minutes (chimpanzees), with other species falling somewhere along the continuum32, see also Refs.33,34,35.
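The logic of a delay-adjusting procedure can be sketched as a simple staircase. The code below is a minimal illustration under stated assumptions, not the protocol of the cited studies: the simulated subject is a deterministic hyperbolic discounter, and the step size, the starting delay, the stopping rule (a fixed number of reversals), and all function names are hypothetical.

```python
def hyperbolic_value(amount, delay, k):
    """Subjective value of a reward after a delay (hyperbolic form, assumed)."""
    return amount / (1.0 + k * delay)

def estimate_indifference_delay(k, small=1, large=3, step=1.0, start=0.0,
                                n_reversals=6, max_trials=1000):
    """Staircase sketch: lengthen the delay after a patient choice, shorten it
    after an impatient one, and average the delays at which the choice flips.

    Returns None if the choice never flips often enough within max_trials.
    """
    delay = start
    last_direction = 0
    reversal_delays = []
    for _ in range(max_trials):
        # The small reward is immediate, so its value is simply its amount.
        chooses_large = hyperbolic_value(large, delay, k) > small
        direction = 1 if chooses_large else -1
        if last_direction != 0 and direction != last_direction:
            reversal_delays.append(delay)
            if len(reversal_delays) == n_reversals:
                return sum(reversal_delays) / n_reversals
        last_direction = direction
        delay = max(0.0, delay + direction * step)
    return None

# Hypothetical discount parameter (per second) for a simulated subject.
estimate = estimate_indifference_delay(0.07)
```

For this simulated subject the staircase ends up oscillating between 28 and 29 s, close to the analytical indifference point of 2/k ≈ 28.6 s implied by the assumed hyperbolic form with a threefold reward ratio.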

Some of the differences among species, both in the delay-adjusting procedure and in other time-related tasks, can be attributed to variations in their ecological niche. For instance, some features of each species’ feeding ecology can explain why, despite their close phylogenetic distance, chimpanzees are more patient (and risk-seeking) than bonobos36 or why common marmosets are more willing to wait for a reward, but less prone to travel for it, than cotton-top tamarins37. In general, there is some evidence that species with larger body sizes and larger home ranges behave more patiently32,38, although the sampling of species is still partly incidental and too small to draw solid evolutionary inferences. Social complexity, on the other hand, can also account for some of the observed variations. Amici et al.39 found, in this regard, that primate species with fission–fusion dynamics had longer waiting times in the delay adjusting procedure reviewed above [see also Ref.40]. Recently, De Petrillo et al.41 selected four lemur species with independent variation in both ecological and social features and administered an intertemporal choice task, among other self-control measures. Interestingly, they found evidence supporting both the ecological- and social-intelligence hypothesis in the delay choice task. A fruit-eating species was more successful than a leaf-eating one when social complexity was similar, and a socially complex species surpassed a pair-bonded one when ecological complexity was comparable [see also Ref.34].

In addition, the efforts to connect time preferences with developmental or contextual factors, among others [e.g. Refs.42,43], have also focused on documenting short-term tradeoffs. In recent years, the most common method to measure temporal discounting in animals has been a delay choice task where subjects can choose between receiving a reward immediately or opting for a larger reward, often three times bigger, with a delay that, unlike in the delay-adjusting procedure, is preset. In any case, they can opt for an alternative with an instant payoff. The same holds true for other delay of gratification tasks that have been administered to several primate and avian species, such as the accumulation task, exchange task, or hybrid delay task34. Each of these tests allows the subjects to choose immediate rewards over delayed ones, confronting them with the temptation of instant gratification.

Nevertheless, prior research in other areas, briefly reviewed above, suggests that time preferences may vary when instant gratification is removed. On one hand, we could expect that individuals across various species will behave more patiently when selecting between delayed rewards. On the other hand, differences between species, as well as inter- and intra-individual variations, could manifest diverse patterns in such choices. As animals frequently confront decisions between two options without immediate consequences, it would be equally relevant to study how this is influenced by socio-ecological factors or how it relates to individual or situational variables. Additionally, such manipulation would introduce prospection abilities as a potentially more relevant covariate.

Bonobos, chimpanzees, and orangutans are able to prepare for events occurring up to hours or even a day later, and their future-oriented behavior has been the subject of considerable research interest in recent years44,45. Furthermore, these species have shown high performance in several self-control and inhibition tasks46,47,48. Additionally, great apes have been one of the main models for studying delay of gratification in non-human species, and all great ape species have been previously evaluated with delay choice tasks34. However, their time preferences have only been studied in contexts involving one immediate and one delayed reward36,39, but see Ref.49. Hence, studying their reaction to delayed payoffs would contribute to a more complete understanding of their valuation of future rewards.

In this study, we evaluate the time preferences of great apes across two distinct time frames to determine if they exhibit dynamically inconsistent preferences. To that end, we administer a modified delay choice task to six orangutans, five bonobos, and four gorillas. In this task, participants can choose to receive one unit of appetizing food at a specific time or opt for three units of food three minutes later. In condition 1 (test), which is how temporal preferences are typically assessed, the first option becomes available immediately after the choice, and the other after three minutes (see Fig. 1). In condition 2 (test), both options are delayed further, becoming available after six and nine minutes, respectively. Considering the evidence of time inconsistency in other non-human species, we hypothesize that great apes will also be more likely to choose the larger reward in this second scenario. Additionally, we administer various series of choices in which the alternatives differ only in size (pretest) or in time (controls). Further, much research has demonstrated the value of non-human primates in economics and other social sciences, ranging from studies in game theory and behavioral biases to research on child development and the study of intergenerational impact of early advantage on health and social status50,51,52. Viewed from this perspective, we contribute to this important literature by studying the nature of our intertemporal preferences.

Figure 1

Design. The illustration depicts the options presented to the subjects throughout the different phases of the study. The quantity of food units is represented by the number of dots, while the delay of the food is indicated by the time-lapse below each option. The labels A and B are used in this report as abbreviated names for each alternative.

Results

Great apes adjusted their choices to variations in delay and size

One gorilla (Gorgo) failed to meet the pretest criterion (see Methods) and was excluded from the study. An initial analysis of the data of the remaining 14 apes revealed that they were sensitive to the delay in the rewards, choosing less frequently the bigger reward in the test of both conditions than in the pretest (Wilcoxon test: pretest vs. test of condition 1: z = − 3.17, p < 0.001; pretest vs. test of condition 2: z = − 3.30, p < 0.001). Also, within each condition, they reacted to the changes in the magnitude of the rewards, choosing the more delayed option more frequently in the test phases than in their respective control phases (Wilcoxon test: condition 1: z = − 2.57, p = 0.008; condition 2: z = − 2.95, p = 0.001). Figure 2 presents the median percentage of choices in all phases (n = 14).

Figure 2

Group results. The figure shows the median (± semi-interquartile range) percentage of choices of each alternative as a function of the study phase.

Great apes made more patient decisions when choosing between delayed rewards

A comparison between the tests of the two conditions reveals that the apes chose option B (see Fig. 1 for equivalences) more often in condition 2 than in condition 1 (Wilcoxon test: test of condition 1 vs. test of condition 2: z = − 2.14; p = 0.031). Regarding individual data, half of the subjects showed a significant change in the same direction (choosing option B more often in the test of condition 2) while one of them (Dunja) displayed the opposite pattern (choosing option B more often in the test of condition 1). A combined analysis of the individual results indicates that there is a globally significant effect in the main direction (Fisher’s method: χ2(28) = 54.14; p = 0.002), reinforcing the idea that the change in preferences occurs at both the group and individual levels. In the control phases, the group results showed an effect analogous to that of the test phases, since the apes chose option B more often in the control of condition 2 (Wilcoxon test: control of condition 1 vs. control of condition 2: z = − 3.01; p < 0.001).
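As a reminder of how Fisher's method combines individual results, the statistic is −2 times the sum of the natural logarithms of the k individual p-values, referred to a chi-square distribution with 2k degrees of freedom (28 for 14 subjects). The sketch below uses only the Python standard library. The function names are hypothetical, and the individual p-values are not given in this section, so the code evaluates only the reported statistic.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with an even number of df.

    For df = 2k the survival function has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def fisher_combined(p_values):
    """Return (statistic, df, combined p) for a list of independent p-values."""
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    df = 2 * len(p_values)
    return statistic, df, chi2_sf_even_df(statistic, df)

# Tail probability at the statistic reported in the text.
reported_p = chi2_sf_even_df(54.14, 28)
```

Evaluating the tail probability at the reported statistic, χ2(28) = 54.14, gives a value close to 0.002, in line with the p-value stated above.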

As we observed that the sample variability is substantially higher in the test of condition 1 than in the test of condition 2 (Table 1), we ran exploratory paired comparisons of the variance across control and test phases. While there are no significant differences in the variances either between each control and its corresponding test phase (Wilcox’s HC4 test: condition 1: t(12) = − 2.16, p = 0.052; condition 2: t(12) = 0.22, p = 0.828) or between controls (t(12) = 1.18, p = 0.259), the apes’ choices do have significantly more within-group variation in the test of condition 1 than in its counterpart in condition 2 (t(12) = 3.64, p = 0.003; the p-values for Wilcox’s test are asymptotic).

Preferences within phases

An analysis of the choices within each phase indicates that, in the pretest, the group selected option B above chance (Wilcoxon test: z = 3.33, p < 0.001), as did all the subjects (see Table 1 for individual data). In the test of condition 1, the group did not show a clear preference (Wilcoxon test: z =  − 1.58, p = 0.126), although six of the apes did select option B below chance, and one of them (Dunja) above chance. In the test of condition 2, the group selected option B above chance (Wilcoxon test: z = − 2.69, p = 0.004), as did two of the subjects. Regarding the data from the control phases, the analysis revealed that in condition 1, the group selected option B below chance (Wilcoxon test: z =  − 3.19, p < 0.001), as well as nine of the fourteen subjects. In the control phase of condition 2, the group did not show any preference (Wilcoxon test: z =  − 1.09, p = 0.336), although one individual (Dokana) selected option B below chance.

Table 1 Individual results.

Exploratory analyses on control phases

Some of these results imply an unexpected near-equality in the choices of each alternative in the second control which in principle might indicate a lack of discrimination between the time magnitudes. In fact, it is more difficult for both human and non-human animals to distinguish between longer intervals53. However, in both tests the apes tended to select option B more frequently compared to the control phases, but less frequently than they did during the pretest. This pattern would be hard to explain if the subjects did not react in both conditions to the delay of the rewards as well as to their size. In order to find potential variations in the relative weighting of these two parameters across conditions, we calculated the difference scores for each individual in each condition (the difference between the percentage of choices of option B in the control and the test) and found no significant difference between them (Wilcoxon test: condition 1 vs condition 2: z =  − 0.12, p = 0.924). Thus, we could not find any changes in the data structure between the two conditions (Table 2).
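The difference-score comparison can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code of the study: the sign convention (test minus control), the function names, and the four-subject data are hypothetical, and the z value uses the plain normal approximation without tie or continuity correction, so it is not expected to reproduce the reported statistics.

```python
import math

def difference_scores(control_pct, test_pct):
    """Per-individual score: % of option-B choices in the test minus the control."""
    return [t - c for c, t in zip(control_pct, test_pct)]

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks

def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus, z) for paired samples, dropping zero differences."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean_w = n * (n + 1) / 4.0
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    return w_plus, w_minus, (w_plus - mean_w) / sd_w

# Hypothetical percentages of option-B choices for four subjects.
scores_1 = difference_scores([10, 20, 30, 40], [30, 50, 40, 80])
scores_2 = difference_scores([40, 50, 45, 55], [55, 70, 70, 85])
w_plus, w_minus, z = wilcoxon_signed_rank(scores_1, scores_2)
```

A z value near zero, as in this toy example, corresponds to the situation described above, in which the relative weighting of delay and size does not differ detectably between conditions.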

Table 2 Difference scores.

Upon examining the correlations between phases (Table 3), we can also observe that there is no significant relationship at p < 0.05 for any pair of scores. However, the only correlations that are significant at p < 0.10 appear within each condition, between the control and test phases. Therefore, the best predictor of the apes’ intertemporal choices during the test phases seems to be their choices in the corresponding control phase (i.e., their reaction to the time intervals), although this relationship is not statistically significant at conventional levels. Also, there is no indication that the strength of this relationship differs between conditions. Overall, the best explanation for these findings seems to be that apes traded off the quantity and the timing of the rewards throughout the entire procedure, including condition 2.

Table 3 Relationships between measures across phases.

In fact, a post-hoc re-examination of both group preferences and individual data seems to confirm that conclusion. First, considering the pretest scores, if the apes had perceived the delays as equal, they would have chosen the larger reward almost universally in the test. Secondly, the participants who selected the earlier option above chance (Table 1), thus showing a clear preference for the six-minute delay (Dokana, over both series of the control, and Padana, in the last one), fall within the range of the rest of the sample in the test, and near its median. In light of all the evidence, we think the behavior of the apes in the second control is better explained by a certain indifference (a weak preference in terms of rational choice theory) for earlier payoffs in extended time ranges.

Discussion

Our results support the hypothesis that great apes do not have a constant rate of time preference but display dynamically inconsistent preferences. Overall, the participants preferred the larger payoff but were more prone to wait for it when the rewards were distant in time. If both options were immediate, the apes selected three units of food over one in a median of 92% of the trials. When they had to wait three minutes to get the three units while the small reward was still immediate, they chose them in a median of only 25% of the trials. However, when the small and the large rewards were available after six and nine minutes, respectively, their preference for the large quantity recovered above chance levels (58% of the trials). All else being equal, great apes made more patient decisions when facing a choice between future rewards than between an immediate and a future reward.

We take these data, first, as confirming the results of previous studies showing that a three-minute wait, compared with an immediate reward, can be a long waiting period for most species, including humans, when facing a food-related task36. More importantly, the novel finding here is that, when a constant delay of six minutes was added to both options, subjects preferred to wait. Therefore, when they had a choice between a smaller-sooner and a larger-later reward, apes tended to act impatiently if they decided just before the first one became available but patiently if they decided in advance. In both cases, the decision was the same, and the only change was the point in time at which they made it.

This preference reversal, which depends on the temporal distance to the earlier payoff, mirrors a pattern observed in delay choice tasks across multiple species, including humans, pigeons, rodents, and rhesus macaques19,54. This tendency, which can be described as a higher rate of discounting in the short term, is common across various types of rewards like food, drugs, and, in the case of humans, money. However, to our knowledge, this phenomenon had not been documented in any species of great apes before. In doing so, we have used a methodology that is easily adaptable to a wide range of species across distant taxa. Therefore, our design offers a simple and versatile approach for measuring choices across various time frames.

Our results are based on a small sample, which limits our ability to make valid comparisons between species in either of the two conditions. However, there is no reason to assume that the pattern of differences between species or individuals will be the same when choosing over short delays with an option for instant gratification as when choosing over longer time periods between two delayed rewards. In our exploratory correlational analysis, we did not find any covariation between the choices of the individuals over short- and long-run payoffs. Indeed, the apes that more frequently chose to wait three additional minutes to receive a larger reward in the first condition were not the same as those in the second condition. This discrepancy between conditions may point to the involvement of different underlying mechanisms that could be explored in further studies.

We found substantial variation among individuals when they made choices for the near future. This replicates earlier findings that, like humans, non-human animals also display ample individual differences when given the opportunity for instant gratification55. However, the inter-individual variability substantially decreased in the condition with two delayed rewards. When choosing between delayed outcomes, apes preferred the larger reward as a group and showed quite a homogeneous behavior. A possible explanation for this finding could be that differences in impulsivity between individuals may diminish when making decisions about outcomes that are farther in the future. In that instance, the observed variations in the short-term decisions could be more closely related to the mechanisms that are engaged when immediate rewards are processed (e.g., individual differences in inhibition capacities) and less associated with the more cognitive processes that may prevail when deciding delayed outcomes.

A more careful evaluation of the patterns of covariation in a wider range of relevant measures when animals decide over different time ranges would be necessary to confirm the hypotheses relative to the distinct mechanisms that could be involved when deciding for the short and long term. For instance, this could entail exploring the correlations between behavioral and emotional responses when making decisions with immediate rewards versus decisions with only delayed rewards. Rosati and Hare56 presented chimpanzees and bonobos with, among other tasks, an intertemporal choice between an immediate reward and a three-times larger reward available after one or two minutes of waiting, and measured both their decisions and the indicators of a negative affective reaction immediately after the choice (such as screaming, scratching or banging). Their results demonstrate that apes from both species can willingly choose to wait for the larger reward, yet they still exhibit a stronger negative reaction when they do, particularly the chimpanzees [see also Ref.42]. An interesting addition to the line of research that we present here would be to ascertain if the negative emotional response would decrease when deciding between delayed outcomes.

Additionally, the difference between the control phases in our study suggests changes in the differential sensitivity to delay across different time ranges. As the time perception of humans and non-human animals is non-linear, two points in time are perceived as more similar the farther they are from the present53. In fact, our time discounting is closer to an exponential function when calculated over this subjective timeline instead of the objective time magnitudes [e.g., Refs.57,58,59]. Also, when deciding about the future, it is more difficult to integrate the magnitude of the delay in our mental representations and we appear to rely on noisier and more heuristic processes for our decisions. As a result, our cognitive uncertainty increases and our choices become relatively insensitive to delay, and that alone could explain a hyperbolic discounting pattern in money-earlier-or-later paradigms even among exponential discounters60, see also Refs.61,62,63,64. These perceptual and cognitive limitations that appear to be associated with the valuation of future outcomes have been proposed to underlie human time inconsistency, among other features of our decision-making65,66, and could partly explain the variations in preferences over the short and long run in non-human species as well [cf. Ref.67].

Time processing is central to intertemporal choice in different ways, and a growing body of literature suggests that time perception and time preference are correlated at the interindividual level both in humans and rodents68,69. Still, few studies of time preference across species include controls aimed at detecting differences in the response to delays (for example, by offering the subjects the same amount of food available at different moments). In principle, when an organism decides whether to take a delayed benefit, it might take into account not only the discomfort associated with waiting (delay intolerance) but also the perceived length of the delay (time perception). For instance, in a recent study70 researchers asked 7- to 11-year-old children how long different time intervals (ranging from one day to three months) felt to them. The children had to pull a cord to indicate the duration. After adjusting for age and intelligence, their sense of time (how much they felt time “compressed” in the future) was a significant predictor of their choices concerning hypothetical monetary rewards. This interplay between the perception of time and decision-making has mainly been used to explain differences between individuals but it could also be relevant in cross-species comparisons, since variations in time perception among species and taxa may contribute to their differences in how they choose over time.

The study of the reaction of non-human animals to delayed rewards also has significant implications for processes of anticipation and representation1, especially when they concern decisions between future payoffs. Many species can anticipate and mentally represent future events71,72,73, and it has been suggested that future-oriented cognition is related to future-oriented choices. We tend to place more value on rewards that are far in time when we vividly imagine future scenarios74, and mental time travel might facilitate patient and flexible intertemporal choices75,76. The processing of future payoffs could also overlap with that of socially distant or hypothetical outcomes and be linked to abstract construal18,77. Studying the co-occurrence of these abilities among different species and taxa could help in determining their evolutionary past and adaptive influence34 but, additionally, intertemporal choices may be a direct means to explore future-oriented cognition78. Under natural conditions, delayed payoffs tend to have minimal perceptual cues and a patient response often requires some capacity to foresee future outcomes. Although this is true when an individual makes a patient choice between an immediate-present and a delayed-absent reward [but see Ref.45], we agree with Ref.79 that an in-depth study of the choices between delayed and invisible rewards, e.g. by looking for effects that are observable with immediate and/or visible rewards, could enrich the possibilities of this approach.

Moreover, the dynamics of time preferences (i.e. how preferences change over time) can be leveraged to determine if great apes would spontaneously select future situations where the impatient choice would not be available80. Within operant conditioning paradigms, some pigeons can learn to peck a key that prevents the posterior appearance of a smaller-sooner option [Refs.21,81; see Ref.23 for a review of similar studies]. In more natural settings, humans tend to learn from experience that our preferences reverse over time and that we often stray from our long-term plans. As a result, we sometimes develop strategies to precommit to a specific course of action, making an impulsive drift more difficult or costly82. In doing that, we may take into account future changes in our own motivational state83,84. Indeed, there is literature in economics on the endogenous determination of time preferences that studies our efforts to reduce the discount on future utilities85, and literature on procrastination that shows the important consequences of differences between naïve, partially sophisticated, and fully sophisticated humans9. Some, but not all, of our basic strategies to cope with temptation are shared with other animals86, and how we deal with the inconsistency of our preferences is a key piece to understanding how we navigate through time that can be further explored in other species79,80.

In sum, we have shown here that great apes, as expected, are more patient when choosing between future rewards than when making short-run tradeoffs that produce immediate outcomes. The fact that the results are consistent with what is considered to be a core feature of human decision-making shows the value of non-human primates for research in economics and other social sciences, in this case regarding the nature of human preferences. Moreover, the ability to delay gratification may have been a key piece for the emergence of goal-directed behaviors such as cooperation87,88, yet the experimental measures designed to ascertain these relationships have mainly examined choices between sets of alternatives that always included an instant gratification [e.g. Refs.89,90]. However, in natural settings, an individual faces decisions that only sometimes produce an immediate outcome. For example, when an animal can exploit different patches of food, its set of alternatives will only include an instant payoff if it is already in one of the patches. Instead, it will often choose among options with various non-null temporal costs, and the previous literature and our results suggest that the observed patterns will change if that is the case. Therefore, research that also includes choices between two or more delayed rewards may better account for decisions in natural settings. It may also provide a more comprehensive approach to the study of time preference in non-human animals, one that helps us to better understand how animals process and value the future consequences of their behavior.

Methods

Subjects

We tested 15 apes (6 orangutans, 5 bonobos, and 4 gorillas) that were socially housed at the Wolfgang Köhler Primate Research Center in Zoo Leipzig, Germany. The mean age of the sample was 19 years, ranging from 5 to 29 years, and 67% of the subjects were female. Before this experiment, all subjects had participated in a previous study on time preferences, where they performed tasks that resembled those in condition 1 of this experiment (refer to Supplementary Table S1 online for complete subject data). Furthermore, they had been tested in various cognitive and behavioral tasks throughout their lifetime. Participants were never water- or food-deprived and could stop participating at any time. The apes were selected for their availability, without any prior calculation of sample size.

Materials

The apparatus consisted of a flat sliding platform attached to the lower part of an upright Plexiglas panel and two containers filled with food. Three circular holes at the base of the Plexiglas panel allowed the ape to choose among the containers (Fig. 3). All subjects were acquainted with the basic choice procedure before the beginning of the study. Throughout the experiment, we used five kinds of plastic trays that differed in shape and color as containers, and banana slices and grapes as rewards.

Figure 3

Experimental setup. At the beginning of the trial, the platform was in the backward position (a); later, the experimenter would push the platform forward, thus allowing the subject to touch one of the containers (b).

Procedure

Apes were tested individually, except for females with dependent offspring. At the beginning of the trial, the experimenter placed two containers on the platform and filled them with either one, two, or three units of food (see Fig. 1). Next, she pushed the sliding platform forward, thus allowing the subject to choose one of the containers by touching it (Fig. 3). Then, she pulled the platform backward, moved the unchosen container to a hidden location, and started the delivery of food according to a delay associated with the kind of container that had been selected (see Fig. 1). If the chosen container implied an immediate delivery, she gave its contents to the subject right away. If the container was associated with a delayed delivery, she left the room and came back in time to give its contents to the subject according to the scheduled time interval (either 3, 6, or 9 min). The selected container and the food inside it remained in full sight of the ape for the whole waiting period. Once the ape had received the reward, the experimenter gathered up the empty container and the trial ended.

During the pretest, when only immediate rewards were available, identical trays were used as containers for the whole sample. During condition 1 and condition 2, when the rewards were delivered with four different delays (0, 3, 6, or 9 min), one distinctive tray (different from that used in the pretest) was presented for each delay, with the correspondence between container and delay counterbalanced across subjects.

Design

The general procedure involved the presentation of two alternatives for the subjects to choose from. Each alternative consisted of an amount of food that was available after a length of time. The precise amount of food and/or the length of time required to get it varied as a function of the experimental phase and condition (Fig. 1).

Pretest phase

The pretest phase was designed to ensure that subjects preferred three units of food over one (Fig. 1). All apes completed one series of 12 trials and advanced to the next phase if they chose option B in at least 10 out of 12 trials. The subjects that did not fulfill this criterion completed a second series of 12 trials in which the same standard was applied. If they failed to meet it in the second series, they were dropped from the experiment. This exclusion criterion was predetermined during the design stage before the experiment began.
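
To give a sense of how strict the 10-out-of-12 criterion is, the following minimal sketch computes the probability that a subject choosing at random between the two options would meet it. The 0.5 chance level, the assumption of independent trials, and the function name are illustrative; the original report does not present this calculation.

```python
from math import comb


def prob_at_least(k_min: int, n: int, p: float = 0.5) -> float:
    """Binomial probability of k_min or more successes in n independent trials."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# Assumption: an indifferent subject picks each option with probability 0.5.
single_series = prob_at_least(10, 12)

# Chance of meeting the criterion in at least one of the two permitted series.
either_series = 1 - (1 - single_series) ** 2

print(f"P(>=10 of 12 by chance, one series) = {single_series:.4f}")
print(f"P(criterion met in either of two series) = {either_series:.4f}")
```

Under these assumptions, a subject with no preference would rarely pass, which is consistent with the criterion's stated purpose of confirming a preference for the larger amount.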

Control phase (condition 1 & condition 2)

The control phase assessed the reaction of the subjects to the different temporal delays (Fig. 1). To allow them to learn the temporal contingencies of the task, we introduced forced-choice trials in which only one of the alternatives was presented and available to choose. In both forced- and free-choice trials, the subject had to actively touch the container to receive the reward; the only difference between the two types of trials was the presence of one or two options to choose from. All apes completed one series of 12 forced-choice trials (6 for each alternative) followed by one series of 12 free-choice trials. Only the free-choice trials were scored. If the ape chose option A in at least 10 out of 12 free-choice trials, it passed directly to the next phase. The apes that failed to meet this criterion received an additional set of 12 forced-choice trials and 12 free-choice trials before they proceeded, independently of their results, to the next phase.

Test phase (condition 1 & condition 2)

Here, we tested the preferences of the subjects when facing a decision that implied a trade-off between delay and size of the reward (Fig. 1). As in the control phase, we used forced-choice trials to permit the subjects to experience the outcomes of both alternatives before being presented with the free-choice trials. Again, in both forced- and free-choice trials, the subject needed to touch the container to get the reward. The key difference was that in forced-choice trials, the ape was offered only one of the options, while in free-choice trials, both options were presented. The test phase of each condition consisted of one series of 12 forced-choice trials (6 for each alternative) followed by one series of 12 free-choice trials. Only the free-choice trials were scored.

All apes completed the pretest first, followed by conditions 1 and 2 in a counterbalanced order. Thus, the sequence they received was pretest–condition 1–condition 2 for half of the subjects and pretest–condition 2–condition 1 for the other half. Each ape was manually assigned to one of these two subgroups, aiming to match their distribution of species, age, and gender as closely as possible (for group membership, see Supplementary Table S1 online). Within both condition 1 and condition 2, all apes completed the control phase before they received the test phase. Based on the aforementioned criteria, each subject was scored in either 12 or 24 free-choice trials in the pretest, 12 or 24 in each control phase, and 12 in each test phase. The specific number of free-choice trials (12 or 24) completed by each animal in each phase can be seen in Supplementary Table S2 online. All trials within a phase were considered for the analysis.
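
The progression rules described above determine whether a subject contributes 12 or 24 scored free-choice trials to a phase. The sketch below restates those rules in code as an illustration only; the function names and return values are hypothetical and are not taken from the authors' materials.

```python
from typing import Optional, Tuple

SERIES_LENGTH = 12  # free-choice trials per series
CRITERION = 10      # minimum number of criterion choices per series


def pretest_outcome(first: int, second: Optional[int] = None) -> Tuple[str, int]:
    """Return (decision, scored free-choice trials) for the pretest.

    first, second: number of option-B choices in each 12-trial series.
    """
    if first >= CRITERION:
        return "advance", SERIES_LENGTH
    if second is None:
        raise ValueError("a second series is required after a failed first series")
    decision = "advance" if second >= CRITERION else "excluded"
    return decision, 2 * SERIES_LENGTH


def control_outcome(first: int) -> Tuple[str, int]:
    """Return (decision, scored free-choice trials) for a control phase.

    first: number of option-A choices in the first free-choice series.
    A failed first series adds one more series, after which the subject
    advances regardless of its results.
    """
    scored = SERIES_LENGTH if first >= CRITERION else 2 * SERIES_LENGTH
    return "advance", scored


print(pretest_outcome(11))     # passes on the first series
print(pretest_outcome(8, 9))   # fails both series
print(control_outcome(7))      # runs a second series, then advances
```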

Throughout the entire study, we ran two trials per session unless the subject did not respond to the first trial, in which case no further trial was presented. During a session, if the two trials were conducted, the second one was run immediately after the first (the intertrial interval or ITI was thus held constant at approximately 5 s). Each ape received a maximum of one session per day. Due to the apes' varying availability, sessions were scheduled on a weekly basis, with an average of 4 sessions per subject per week. We counterbalanced the order of side assignments and trial types (the latter being only applicable to forced-choice trials) for each individual, series of trials, and session. Except for the second observer who participated in the reliability analysis (see below), the study was not blinded at any stage. Therefore, the experimenter and the individuals who analyzed the data were aware of the main hypotheses throughout the entire procedure.

Data scoring and analysis

The experimenter scored each subject's choice for each trial live and later checked the record against videotapes. Additionally, a second observer (who was unaware of the study design and hypothesis) coded a random sample of 20% of the trials. He agreed with the experimenter in 98% of the sampled trials (Cohen’s κ = 0.96). We first computed the number of times option B was chosen for each animal and series of trials (Supplementary Table S2 online). Then, we calculated the percentage of times option B was chosen for each subject within each phase (Table 1) and used this as the dependent measure for statistical testing. Only the data from the animals that completed the whole procedure were considered for the analysis.
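
As an illustration of the two measures just described, the sketch below computes Cohen's kappa for two coders and the percentage of option-B choices using only the Python standard library. The choice codes are invented for the example and are not the study's data; the function names are hypothetical.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(rater1: Sequence[str], rater2: Sequence[str]) -> float:
    """Cohen's kappa: agreement between two raters corrected for chance."""
    if len(rater1) != len(rater2) or len(rater1) == 0:
        raise ValueError("both raters must code the same non-empty set of trials")
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    counts1, counts2 = Counter(rater1), Counter(rater2)
    # Chance agreement: product of the two raters' marginal proportions per label.
    expected = sum(counts1[label] * counts2[label] for label in counts1) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def percent_option_b(choices: Sequence[str]) -> float:
    """Percentage of trials in which option B was chosen."""
    return 100 * sum(c == "B" for c in choices) / len(choices)


# Invented codes for ten trials; the second coder disagrees on the last one.
experimenter = ["A", "A", "B", "B", "B", "A", "B", "A", "A", "B"]
observer = ["A", "A", "B", "B", "B", "A", "B", "A", "A", "A"]

print(f"kappa = {cohens_kappa(experimenter, observer):.2f}")
print(f"option B chosen in {percent_option_b(experimenter):.0f}% of trials")
```

Kappa is lower than raw percentage agreement because it subtracts the agreement expected if both coders assigned labels independently at their own base rates.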

Due to the small sample size, we used non-parametric methods. We used the StatXact module in Cytel Studio (StatXact 10, Cytel Software Corporation) to calculate Barnard’s test, performed Fisher’s combined probability test manually (see Supplementary Table S3 online for details), used R (version 4.0.3) to conduct Wilcox’s HC4 test, and ran the remaining analyses in SPSS (IBM SPSS Statistics version 29). Unless stated otherwise, all reported tests use two-tailed, exact probabilities and a significance level of 0.05.
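
For readers unfamiliar with Fisher's combined probability test, the following minimal sketch shows the standard textbook calculation: the statistic is -2 times the sum of the natural logarithms of k independent p-values, referred to a chi-square distribution with 2k degrees of freedom. The p-values below are placeholders, and this is not a reproduction of the authors' manual computation, which is given in Supplementary Table S3.

```python
from math import exp, factorial, log
from typing import Sequence, Tuple


def fisher_combined(p_values: Sequence[float]) -> Tuple[float, float]:
    """Fisher's method: return (chi-square statistic, combined p-value)."""
    k = len(p_values)
    if k == 0 or any(not 0 < p <= 1 for p in p_values):
        raise ValueError("p-values must be non-empty and lie in (0, 1]")
    statistic = -2 * sum(log(p) for p in p_values)
    half = statistic / 2
    # Upper tail of a chi-square with 2k degrees of freedom; for even
    # degrees of freedom it has this closed form, so no external library is needed.
    combined_p = exp(-half) * sum(half**i / factorial(i) for i in range(k))
    return statistic, combined_p


# Placeholder p-values from three hypothetical independent tests.
stat, p = fisher_combined([0.08, 0.12, 0.20])
print(f"chi-square = {stat:.3f} on 6 df, combined p = {p:.4f}")
```

The method assumes that the combined tests are independent; with a single p-value, it returns that p-value unchanged.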

Ethics declaration

The procedure of this study was designed in accordance with the German laws on animal experimentation and was approved by the joint committee of the Max Planck Institute for Evolutionary Anthropology and Zoo Leipzig. The current report complies with the essential ARRIVE guidelines.