# Macaques are risk-averse in a freely moving foraging task

## Abstract

Rhesus macaques (Macaca mulatta) appear to be robustly risk-seeking in computerized gambling tasks typically used for electrophysiology. This behavior distinguishes them from many other animals, which are risk-averse, albeit measured in more naturalistic contexts. We wondered whether macaques’ risk preferences reflect their evolutionary history or derive from the less naturalistic elements of task design associated with the demands of physiological recording. We assessed macaques’ risk attitudes in a task that is somewhat more naturalistic than many that have previously been used: subjects foraged at four feeding stations in a large enclosure. Patches (i.e., stations) provided either stochastically or non-stochastically depleting rewards. Subjects’ patch residence times were longer at safe than at risky stations, indicating a preference for safe options. This preference was not attributable to a win-stay-lose-shift heuristic and reversed as the environmental richness increased. These findings highlight the lability of risk attitudes in macaques and support the hypothesis that the ecological validity of a task can influence the expression of risk preference.

## Introduction

Many animals, including humans, prefer sure things to gambles1. The tendency to minimize risk, i.e. unknowable and unpredictable variation, has been a topic of interest from behavioral ecology2,3 to economics4,5 and neuroscience6,7,8,9,10,11. Furthermore, cognitive processes related to decision making in risky contexts underlie many maladaptive behaviors such as addiction and problem gambling12,13. Consequently, understanding risk attitudes in varying contexts provides important insight into the evolutionary origin, and thus the psychological and neural mechanisms, of addiction and maladaptive choice14.

Theoretical and experimental work on risk preference in non-human animals has delineated risk-aversion as a default preference for many species1,15,16,17,18,19. However, risk preferences may not be as rigidly fixed as we might imagine; several factors have been demonstrated to shift risk preference. Internal factors related to energetic states and metabolic processes modulate risk preferences17,20,21,22,23. When faced with the possibility of starvation, many species will increase their tolerance for risk17,21,24. Likewise, external factors related to environmental richness, that is, how much food is readily available, shift risk tolerance16,25,26,27. Risk preferences are also sensitive to the reward rate, both the timing of delivery and the overall size of rewards15,28,29,30,31. Lastly, whether risk is explicitly cued or learned through experience impacts the expression of risk preference32,33,34.

Rhesus macaques, the predominant model in neuroscience for understanding human decision making, are robustly risk-seeking in a variety of contexts6,8,35,36,37,38,39. Risk-seeking in macaques persists even when factors known to shift risk preferences are manipulated. For example, altering the cost of engaging in risk-seeking by increasing the inter-trial-interval reduces, but does not eliminate, macaques’ preference for risk35. In fact, only one study we know of has reported risk-aversion in rhesus macaques40.

Explanations for why macaques exhibit robust risk-seeking in experimental tasks come in two types. One type of explanation assumes that macaques’ risk attitudes are an evolved reflection of their foraging history. This view is supported by observed patterns of risk-seeking in primate species across a variety of experimental methods41,42,43,44. Another possibility is that macaques’ risk-seeking is a consequence of experimental tools typically used to measure their risk preferences. The manner in which macaques’ risk attitudes are measured is generally different from methods used for other species3,45,46 but it remains unclear how influential that difference is on general risk attitudes. The majority of data on macaque risk preference comes from studies tailored to the needs of electrophysiology, not cross-species comparison. Thus they are tested with rapid trials, often as fast as three seconds per trial, extremely small stakes, abstract stimuli, immediate rewards, overtraining, oculomotor responses, and hundreds or thousands of trials in a few hours. It may be that one of these factors, or some combination thereof, motivates risky choice. Indeed, even humans can become risk-seeking when gambling for small rewards in conditions designed to be similar to those used in non-human primate experiments34,45.

For foraging animals, risk manifests as an embedded component of their environment15,28,47. A macaque foraging for fruit may experience risk as variation in the likelihood of encountering patches of fruit-bearing trees or as variation in the quality and quantity of fruit located at an individual tree. In the former, risk may affect the decision on where to search and which tree to climb, while the latter may impact the decision for how long to reside within a particular patch or tree. Risk is also mitigated or exacerbated by the local dynamics of the foraging environment. These can include both the environmental richness and the movement costs related to the spatial position of food patches48,49. When food is plentiful, the energetic cost of engaging in riskier foraging strategies is minimized27. Evolutionary pressures are believed to have shaped the cognitive architecture of foragers to navigate risk within nature, and the degree to which experimental tasks map onto the natural dynamics of the environment likely impacts the expression of risk50,51,52.

We hypothesized that embedding the experience of risk within a more naturalistic setting would result in macaques expressing risk preferences opposite to the trend of robust risk-seeking. We designed a naturalistic foraging task based on the patch-leaving problem from foraging theory2,53,54. We tested subjects (n = 3) using a single-subject design within a large enclosure that allowed for free movement between four different feeding stations. Our task design incorporates risk within the stochasticity of patch harvest rates. Thus, we were able to examine the influence of risk across the use of patch types in addition to within particular patches. We found that macaques were risk-averse under these foraging conditions. We were able to abolish risk-averse preferences by increasing the overall richness of the environment in relation to the amount of variation in a risky patch. Two of the same subjects exhibited risk-seeking in a standard risk task designed for physiological recording, indicating that their risk preferences are task-specific, not individual-specific. Taken together, our results demonstrate the effect of the environmental structure on the expression of risk attitudes in rhesus macaques and highlight the importance of using naturalistic tasks for studying cognitive processes.

## Methods

### Subjects and apparatus

Three male rhesus macaques served as subjects for the experiment. Two of the subjects (C and K) had previously served as subjects on standard neuroeconomic tasks, including a set shifting task55, a diet selection task56,57, intertemporal choice tasks58, and a juice gambling task10, while the third subject (Y) was naïve to all experimental procedures. All three subjects were fed ad libitum and pair housed within a light and temperature controlled colony room. Subjects were water restricted to 25 mL/kg for initial training, and readily worked to maintain 50 mL/kg throughout experimental testing. All research and animal care was conducted in accordance with University of Minnesota Institutional Animal Care and Use Committee approval and in accord with National Institutes of Health standards for the care and use of non-human primates.

Subjects were behaviorally tested in a large cage (~3 m × 3 m × 3 m) made from framed panels consisting of 5 cm wire mesh (Fig. 1). This allowed for free movement of the subjects within the cage in three dimensions. Five 208 L drum barrels weighted with sand were placed within the cage to serve as perches for the subjects to sit upon. One juice feeder was placed at each of the four corners of the cage in a rotationally symmetric alignment. Each juice feeder consisted of a 16 × 16 LED screen, a lever, a buzzer, and a solenoid (Parker Instruments), and was controlled via an Arduino Uno microcontroller. Data were collected in MATLAB (MathWorks) via Bluetooth communication with each of the juice feeders.

### Training

Previous training history for two of these subjects included two types of foraging tasks57,59, intertemporal choice tasks34,60, two types of gambling tasks10,61, attentional tasks similar to those in62, and two types of reward-based decision tasks63,64.

We first introduced subjects to the large cage and allowed them to acclimate to it. Acclimation consisted of placing subjects within the large cage for progressively longer periods of time over the course of about five weeks. To make the cage environment more positive, we provisioned the subjects with copious food rewards (chopped fruit and vegetables) placed throughout the enclosure. This process ensured that subjects were comfortable with the large cage. We then trained subjects to use the juice dispenser. All three subjects were initially trained to lever press for juice rewards in the testing enclosure. Acquisition of reliable lever pressing took about three weeks. We defined acquisition as obtaining juice rewards in excess of their daily water minimum. After completing lever training, we placed subjects onto the first experimental condition of the freely moving patch-leaving task.

### Experimental testing

Working with captive non-human primate subjects imposes unavoidable practical limitations on the number of available subjects. We therefore structured our research design and analysis around a single-subject approach65. Formally, we used a multiple baseline approach for collecting and analyzing behavior in the freely moving patch-leaving task66. We tested subjects on the first experimental condition until five days of consistent behavior were observed. This training period also served as the initial learning period for the task contingencies. The criterion of five days was chosen a priori based on previous studies using foraging tasks59,67. This criterion ensured that the subjects were well trained and had ample opportunity to learn the task contingencies. We defined consistent behavior as similar allocation of lever presses at a juice feeder across days. We measured behavioral consistency as the total amount of juice collected at each feeder across days, within a criterion of ±5 mL. After observing five days of consistent behavior, we then tested subjects for an additional five days. We then implemented the second experimental condition and repeated the same observation sequence. Throughout both experimental manipulations we used the same criterion of five days of consistent behavior as a metric for ensuring that subjects understood the experimental contingencies within a condition and were at a stable state of responding. All subjects were tested in the same order, experiencing the standard environmental condition first and then the rich condition. Post hoc analyses of subject behavior revealed no significant changes across the five days of experimental testing. We performed all analyses on the five days of testing after establishing consistent behavior.
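The stability criterion described above can be sketched in code. This is a minimal illustration, not the authors' procedure: it assumes that "within ±5 mL" means each feeder's daily juice total stays within 5 mL of that feeder's mean over the most recent five days. The function name `is_stable` and the dictionary layout of the data are hypothetical.

```python
def is_stable(daily_totals, tolerance_ml=5.0, window=5):
    """Check a hypothetical stability criterion over the last `window` days.

    daily_totals: list of dicts, one per day, mapping feeder ID -> juice (mL).
    Assumption: 'consistent' means every feeder's daily total lies within
    tolerance_ml of that feeder's mean across the window.
    """
    if len(daily_totals) < window:
        return False  # not enough days observed yet
    recent = daily_totals[-window:]
    for feeder in recent[0]:
        values = [day[feeder] for day in recent]
        center = sum(values) / window
        if any(abs(v - center) > tolerance_ml for v in values):
            return False
    return True
```

Under this reading, a run of five days with feeder totals of 50, 52, 48, 51, and 49 mL would count as stable, whereas a single day at 70 mL would not.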

The freely moving patch-leaving task incorporates the dynamics of the natural environment by using multiple patches and a reward schedule designed to mimic the natural depletion of prey items from a patch the longer a subject forages from it57,67 (Fig. 2). Two of the four feeders diagonally across from each other were designated as variable (risky) feeders, while the other two served as safe feeders and had no variation in reward delivery. Feeders were visually identical, although they could be readily discriminated by their position relative to landmarks outside the cage. The feeder designations remained spatially fixed for each subject across experimental days. Each feeder displayed the total amount of juice available within the patch via a blue bar (8 × 16 LEDs). With each lever press, juice would be delivered and a portion of the blue bar would disappear, explicitly indicating its depletion status. Leaving a feeder to activate any of the other three feeders would cause the previously activated feeder to immediately fully replenish, cued by a full bar being displayed. Subjects were placed within the testing enclosure and allowed to forage freely between the four feeders for two hours each day.

#### Patch reward statistics and risk

Each feeder was programmed to deliver a base reward schedule that decreased by a specified amount. In the standard condition each feeder delivered a base reward consisting of an initial 2 mL of juice that decreased by 0.125 mL with each subsequent delivery (turn). In the rich condition, the feeders provided 4 mL of juice that decreased by 0.25 mL each turn (Table 1). Risk, here defined as variation in reward amounts, was introduced by programming two of the juice feeders to randomly increase or decrease the juice delivery amount by 1 mL in addition to the base reward schedule at a probability of 0.5. Thus, on any given turn, a response at the risky feeder may produce more or less juice, including non-reward delivery, than a safe feeder. Both feeder types delivered rewards following their respective schedules until reaching the base value of 0, at which point the patch was depleted and no more rewards were delivered. In practice this depletion process results in identical gain functions over the majority of patch residence times; that is, the long-run expectations of both feeder schedules are identical. However, because the schedule had a bound at 0 mL, the tail end of the gain function for risky patches does diverge from safe patches (Fig. 3).
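The reward schedules can be illustrated with a short simulation. This is a sketch under stated assumptions, not the feeder firmware: it assumes the risky perturbation is +1 mL or −1 mL with equal probability on every turn, and that a delivery cannot fall below 0 mL. The function names and the `swing_ml` parameter are hypothetical.

```python
import random

def base_schedule(initial_ml, decay_ml):
    """Safe schedule: start at initial_ml, drop by decay_ml each turn,
    and stop once the base value reaches 0 (patch depleted)."""
    rewards = []
    value = initial_ml
    while value > 0:
        rewards.append(value)
        value -= decay_ml
    return rewards

def risky_schedule(initial_ml, decay_ml, swing_ml=1.0, rng=None):
    """Risky schedule: base schedule plus or minus swing_ml on each turn.
    Assumption: +/- are equally likely and deliveries are floored at 0 mL."""
    rng = rng or random.Random()
    rewards = []
    for base in base_schedule(initial_ml, decay_ml):
        delta = swing_ml if rng.random() < 0.5 else -swing_ml
        rewards.append(max(0.0, base + delta))
    return rewards

standard_safe = base_schedule(2.0, 0.125)   # 16 turns, 17 mL in total
rich_safe = base_schedule(4.0, 0.25)        # 16 turns, 34 mL in total
```

Under these assumptions, the floor at 0 mL only binds once the base reward drops below 1 mL, which is where the risky gain function would start to diverge from the safe one.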

#### Definition of risk preference

Our task design defines risk preference as frequency of risky decisions made by an animal. We defined the proportion of patch entries greater than chance into risky patches as risk-seeking, and the inverse of that as risk aversion. Equal entry into both patch types was considered risk neutral. For patch residence time we defined risk-seeking as a significant tendency to stay longer in risky patches than safe ones, and risk aversion as the opposite of this trend.

#### Coefficient of variation

Rich environments in which food sources are abundant have been demonstrated to increase risk-seeking foraging strategies27. The coefficient of variation describes this effect as the relationship between experienced variation and the overall mean reward rate. In many species risk-seeking increases as the coefficient of variation decreases25,26,68. We manipulated the coefficient of variation by increasing the overall rate of reward from 2 mL with a decay of 0.125 mL/lever press to 4 mL with a decay of 0.25 mL/lever press while holding the variation constant at +/− 1 mL. Importantly this manipulation does not change the overall expectation of the reward schedules; they are still matched for both risky and safe patches.
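As a rough worked illustration of this manipulation, assume the coefficient of variation is taken as the standard deviation of the reward perturbation divided by the mean reward, evaluated at the first turn in a patch, with the ±1 mL perturbation equally likely in either direction (so that σ = 1 mL) and truncation at 0 mL ignored. These choices are assumptions made here for illustration only.

$$CV = \frac{\sigma}{\mu}, \qquad CV_{\text{standard}} = \frac{1\ \text{mL}}{2\ \text{mL}} = 0.5, \qquad CV_{\text{rich}} = \frac{1\ \text{mL}}{4\ \text{mL}} = 0.25$$

Doubling the base reward while holding the variation fixed therefore halves the coefficient of variation under this reading.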

Data from the juice gambling task (Fig. 4), which we used as a comparison, were previously collected for electrophysiology experiments10,69,70 and were only available for subjects C and K. In brief, the task consisted of paired choices presented rapidly (~3 sec duty cycle) while subjects sat in a specially designed chair (Crist Instruments, Hagerstown, MD). On each trial offers were presented asynchronously. The first offer appeared for 400 ms either on the left or right with equal probability. A blank period of 600 ms followed. Then the second offer appeared for 400 ms, followed by another 600 ms blank period. Following a brief central fixation period, subjects expressed their choices with saccades to the presented offers. Offers were colored bars that indicated probability and stakes. The stakes were indicated by the color of the displayed bar indicating a base reward amount (red = 0 μL, grey = 125 μL, blue = 165 μL, green = 240 μL). The probability was drawn from a uniform distribution and indicated by the height of an overlapping red bar. For example, a blue bar covered halfway with a red bar represents a probability of 0.5 for receiving the reward corresponding to the color blue. Within the juice gambling task, risk is characterized as trial-to-trial variation in the probability of receiving a particular reward amount. Subjects were well trained on the task, having completed over 10,000 trials across many sessions before electrophysiological recording. For analysis we chose a random set of five days from the period of electrophysiological recording.

### Data analysis

We focused our analysis of the freely moving patch-leaving task on the five days of testing after the initial learning period in both the standard and rich conditions. Drawing from our experimental design, we restricted our analyses to changes within each individual subject’s behavior. A key strength of this approach lies in our ability to rule out individual differences as an explanation for behavioral changes, as each subject serves as their own control. Furthermore, each subject serves as a replication of the previous one, differing only in the inter-subject domain. This allows us to infer strong causal relationships between our experimental manipulations and the subsequent behavior of our subjects.

#### Freely moving patch-leaving risk task

For the freely moving patch-leaving risk task we recorded lever presses at each of the four juice feeders throughout the 2-hour testing session. We defined patch entries as a lever press at a patch different from the previously recorded lever press. We defined consecutive lever presses or turns at a juice feeder as the patch residence time. Each daily session consisted of multiple patch entries at each of the four feeders of variable patch residence times. Data from daily sessions were combined across the five days following the initial learning period for each subject within the experimental condition.
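These definitions translate directly into a run-length grouping of the lever-press record. The sketch below assumes the data are a simple time-ordered sequence of feeder IDs, one per lever press; the function names and the feeder numbering are hypothetical.

```python
from itertools import groupby
from statistics import mean

def patch_visits(presses):
    """Collapse a time-ordered sequence of feeder IDs (one per lever press)
    into (feeder, residence_time) pairs. A new visit starts whenever the
    feeder differs from the previous press; residence time is the number
    of consecutive presses (turns) in that visit."""
    return [(feeder, len(list(run))) for feeder, run in groupby(presses)]

def mean_residence_by_type(visits, risky_feeders):
    """Average residence time (turns) separately for risky and safe feeders."""
    risky = [n for feeder, n in visits if feeder in risky_feeders]
    safe = [n for feeder, n in visits if feeder not in risky_feeders]
    return {
        "risky": mean(risky) if risky else None,
        "safe": mean(safe) if safe else None,
    }

# Hypothetical record: feeders 1 and 3 are risky, 2 and 4 are safe.
example_visits = patch_visits([1, 1, 1, 2, 2, 3, 1])
example_means = mean_residence_by_type(example_visits, risky_feeders={1, 3})
```

Here `example_visits` contains four patch entries, and a return to feeder 1 after visiting other feeders counts as a new entry.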

We analyzed the differences in the proportion of subjects’ patch entries between risky and safe patches across the two conditions of environmental richness using a 1-factor ANOVA. We investigated subjects’ risk preferences on patch residence times across the manipulation of environmental richness using a 2-factor ANOVA (patch type × environmental richness). We analyzed differences in patch residence times between risky and safe patches using unpaired t-tests. To examine whether subjects used a win-stay-lose-leave strategy, we examined the effect of reward outcomes one turn back and two turns back from the end of each patch residency within risky patches.
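For readers who want to reproduce the residence-time comparison, the unpaired t-test and its effect size can be sketched with the standard library alone. This sketch assumes the equal-variance (pooled) form of the test and Cohen's d computed with the pooled standard deviation; the text does not state which variant was used, so both choices are assumptions, and the function name is hypothetical.

```python
import math
from statistics import mean, variance

def pooled_t_test(a, b):
    """Student's unpaired t-test (equal variances assumed) plus Cohen's d.

    Returns (t statistic, degrees of freedom, Cohen's d) for samples a and b,
    e.g. residence times in safe patches versus risky patches."""
    n1, n2 = len(a), len(b)
    m1, m2 = mean(a), mean(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    pooled_sd = math.sqrt(pooled_var)
    t = (m1 - m2) / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))
    d = (m1 - m2) / pooled_sd
    return t, df, d
```

A p-value would then be obtained from the t distribution with `df` degrees of freedom, for example with a statistics package.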

#### Risk parameter estimation

A second way to categorize risk preferences is to examine the utility function derived from the expressed choices of subjects40,71. To analyze differences in risk preferences between the juice gambling task and our patch-leaving risk task, we fit each subject’s choice preferences for offer 1 from the juice gambling task or for the decision to stay in the current risky patch to the two equations below (Eqs 1 and 2) using maximum likelihood estimation. Equations 1 and 2 produce expected utility curves whose shape is dictated by the parameter α. The parameter α functions as an index of risk preference such that α < 1 indicates risk-aversion, α > 1 indicates risk-seeking, and α = 1 indicates risk-neutrality. Graphically, a value of α = 1 will produce a straight line in which all reward amounts are equally weighted. Values of α < 1 produce a concave utility curve in which larger rewards undergo diminishing returns, while values of α > 1 produce convex utility curves in which larger rewards are given greater weight. The parameter b in both equations represents the slope of the sigmoid choice function around the point of indifference, p(choice) = 0.5. As such, b provides a measure of variation in choice.

Juice gambling task:

$$p(\text{choice} \mid \text{offer 1}) = \frac{1}{1 + \exp\left(\left(p_1 v_1^{\alpha} - p_2 v_2^{\alpha}\right) b\right)} \tag{1}$$

where:

$p_1$ = probability of offer 1

$v_1$ = value of offer 1 (s)

$p_2$ = probability of offer 2

$v_2$ = value of offer 2 (s)

$\alpha$ = risk preference index

$b$ = measure of choice stochasticity

Patch-leaving risk task:

$$p(\text{stay} \mid t) = \frac{1}{1 + \exp\left(\left(\text{threshold}^{\alpha} - V(t)^{\alpha}\right) b\right)} \tag{2}$$

where:

$t$ = time measured in discrete lever presses

threshold = point of indifference between staying and leaving a patch

$V(t)$ = current reward amount available given the time spent in the patch

$\alpha$ = risk preference index

$b$ = measure of choice stochasticity
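To make Eq. 2 concrete, the sketch below evaluates the stay probability and the negative log-likelihood that a maximum likelihood fit would minimize. It reads b as a multiplier inside the exponent, consistent with its description as the slope of the sigmoid; that reading, the function names, and the `(v_t, stayed)` data layout are assumptions made for illustration rather than the authors' fitting code.

```python
import math

def p_stay(v_t, threshold, alpha, b):
    """Eq. 2 under the stated reading: probability of staying in the patch
    given the currently available reward v_t (same units as threshold)."""
    z = (threshold ** alpha - v_t ** alpha) * b
    return 1.0 / (1.0 + math.exp(z))

def neg_log_likelihood(observations, threshold, alpha, b, eps=1e-12):
    """Negative log-likelihood of stay/leave decisions.

    observations: iterable of (v_t, stayed) pairs, where stayed is True if
    the subject pressed again and False if it left the patch."""
    total = 0.0
    for v_t, stayed in observations:
        p = min(max(p_stay(v_t, threshold, alpha, b), eps), 1.0 - eps)
        total -= math.log(p if stayed else 1.0 - p)
    return total
```

Estimates of threshold, α, and b would then be the values minimizing `neg_log_likelihood`, found with any general-purpose optimizer; when the available reward equals the threshold, `p_stay` returns 0.5 regardless of α and b.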

## Results

### Macaques spend more time in safe patches in a standard environment

We examined patch residence times in safe and risky patches defined as the number of turns spent at a feeder. Within the standard environment, all three subjects remained in the safe patches (turn means C: 9.10, K: 10.03, Y: 10.70) longer than in the risky (turn means C: 8.15, K: 8.67, Y: 9.52) ones (Fig. 5, unpaired t-test C: 0.9479 turns, t(248) = 2.198, p = 0.0144, d = 0.278; K: 1.35 turns, t(176) = 2.0289, p = 0.022, d = 0.304; Y: 1.17 turns, t(184) = 1.6842, p = 0.0469, d = 0.247). That is, all three subjects made more consecutive lever presses in safe patches than risky ones.

### No evidence for win-stay/lose-shift heuristic in guiding patch-leaving

It is possible that macaques’ longer residence times in safe patches are due to a data censoring effect: perhaps they leave when any individual outcome is lower than some threshold. That is, they may obey a win-stay lose-shift heuristic72,73,74,75. To determine if subjects used this heuristic, we examined the likelihood of leaving a risky patch given the recent history of wins and losses. None of the three subjects exhibited a significant preference of increased patch-leaving immediately after losses (one sample t-test C: t(122) = 1.1740, p = 0.2427, K: t(82) = 0.5465, p = 0.5862, Y: t(85) = 0.6448, p = 0.5208). Nor did we observe any effect of harvest outcomes two steps back (ANOVA C: F(3,119) = 0.83, p = 0.8009, K: F(3,79) = 0.13, p = 0.9413; Y: F(3,82) = 1.44, p = 0.237).

### Macaque risk preferences shift with the coefficient of variation

Shifting the environmental richness serves to alter the overall mean rate of reward for the environment. When the mean rate of reward increases and variation or risk remains constant, the overall coefficient of variation decreases. We found an environment by patch type interaction on patch residence times that was significant in subjects C and Y and approached significance in subject K (2-factor ANOVA K: F(1,314) = 3.1928, p = 0.07; C: F(1,376) = 18.276, p < 0.001; Y: F(1,293) = 6.7078, p = 0.01). All three subjects exhibited shifts away from risk-aversion to risk-neutrality/seeking as the coefficient of variation decreased (turn mean risky C: 11.32, K: 7.59, Y: 5.89; turn mean safe C: 8.75, K: 6.49, Y: 4.44; unpaired t-test C: t(117) = 3.3303, p = 0.0005, d = 0.605; K: t(99) = 1.2077, p = 0.115, d = 0.226; Y: t(94) = 1.7483, p = 0.0418, d = 0.351). Thus, subjects were willing to stay longer in risky patches as the overall magnitude of reward for the environment increased relative to the variation within risky patches (Fig. 6).

### Macaques are indifferent between patch types

Foragers may choose to strategically engage with patches of a particular type as a way of avoiding variation. We found no evidence to support a preference for either patch type in any of our subjects for both standard and rich environmental conditions (1-factor ANOVA C: F(1,378) = −0.442, p = 0.51, K: F(1,316) = −0.034, p = 0.85, Y: F(1,295) = 0.01, p = 0.9374).

### Two of these macaques are risk-prone in computerized task

We next analyzed risky choice behavior in two subjects (C and K) in a standard (not foraging-based, not freely moving) juice gambling task10. Both subjects exhibited strong risk-seeking behavior. On trials with matched expected values, subject C chose the risky option 67% of the time (one sample t-test: t(1232) = 12.86, p < 0.0001), while subject K chose the risky option 66% of the time (one sample t-test: t(1437) = 12.55, p < 0.0001).

This preference can be quantified using the shape of the utility curve. Both subjects showed convex utility curves (Fig. 7, C: α = 2.284, 95% CI = 1.983–2.584; K: α = 3.632, 95% CI = 3.441–3.822). However, within the more naturalistic freely moving patch-leaving task the same subjects exhibited concave utility curves indicative of strong risk aversion (Fig. 7, C: α = 0.550, 95% CI = 0.508–0.5922; K: α = 0.743, 95% CI = 0.586–0.889).

## Discussion

Risk is ubiquitous in the natural environment and foragers must develop strategies for dealing with it1. Animals are generally observed to be, for the most part, risk-averse. The earliest studies of the neurophysiology of macaque risk attitudes were puzzling in this light because they demonstrated clear risk-seeking8,36,72,75,76,77. In other words, macaques appeared to be different from other species. We hypothesize that this difference is not innate. Instead, we believe it reflects the strategic adjustments macaques make when faced with the specific environment of the laboratory gambling task18,45,46.

To test this hypothesis, we sought to examine risk attitudes in a more complex naturalistic task. To that end, we developed a large cage apparatus with four feeding stations that allowed free movement, trained our subjects to forage from variable and stable stations, and assessed their risk attitudes.

While we would expect risk preference to manifest as a preference for using one patch type over another, we did not observe this trend. This null result can be interpreted as an expression of risk neutrality (i.e. stochastic optimality) at the level of patch choice. Foraging primates have been shown to follow simple navigation rules for moving between patches of food, and within the spatial arrangement of our task these rules would manifest as risk indifference for patch entries78. Future research is needed to investigate how variation in the reward rates of a patch interacts with the spatial arrangement of patches within the environment to shape patch choice. We did observe that subjects remained longer in safe patches than in risky ones. This increased tolerance for safe rather than risky outcomes allows us to infer that subjects value safe patches more than they value the risky ones, and demonstrates that risk attitudes are fundamentally labile. Moreover, these findings suggest that effort made to make the task naturalistic pays off in the form of behavior that more closely resembles that found in the wild.

Our subjects’ willingness to stay longer in risky patches as the environmental richness increases indicates that a subjective weighting of the experienced variation of rewards influences the valuation of a patch. Had subjects followed rate maximization policies under a condition of information uncertainty, we would have observed risk preferences manifest as myopic short-term rate maximization strategies that produced a consistent censoring effect of early leaving from risky patches in both standard and rich environments15,47,71. One interesting question warranting further study is how the degree of information regarding the variance in reward influences the expression of risk between short-term maximization policies and the subjective weighting effects seen in conditions of pure risk.

Our results point to ostensibly minor task factors as a major component in the expression of macaque risk preferences3,18,46. These are the kinds of factors that tend to be ignored in economics-inspired models of risky choice. Our results suggest that risk attitudes are so labile that one must carefully consider all parameters of the task design when interpreting economic preferences79,80. More fundamentally, these results suggest that animals may not have such a thing as a stable risk attitude. Rather, we believe that each subject has a consistent but flexible cognitive repertoire that they use when encountering risk. In the case of rhesus macaques, their evolution and spread across diverse ecologies likely shaped their ability to adaptively shift choice strategies and preferences as environmental contingencies changed81. By considering how experimental tasks map onto the natural environment we can begin to fully elucidate how diverse cognitive functions such as memory, prospection, and estimation subserve choice.

Subjects’ measured risk aversion likely does not reflect lack of training or intolerance for ambiguity. In our freely moving patch-leaving risk task subjects were well trained. Reward schedules were fixed and subjects were fully trained in the reward contingencies before testing. This represents a case of “pure risk”, in which the subject knows the reward statistics and can distinguish variable patches from constant patches71, i.e. there is no additional ambiguity present. Furthermore, our manipulation of the coefficient of variation allows for a dissociation of reward rate strategies from subjective risk preferences in guiding patch usage, as the overall expectations of the reward schedules remain the same.

One limitation of all laboratory approaches arises out of constraints in sample size, and care should be taken with regard to any species-level conclusions regarding macaque risk preference. However, we are able to clearly demonstrate at the level of single subjects a divergence in risk attitudes arising from the task structure. These results therefore both constitute an existence proof (the effects we hypothesized can be observed in members of the macaque species) and motivate a prediction that further studies will demonstrate a species-wide generality of these effects. In this regard, it is worth emphasizing that we did not pre-select subjects for behavior; nor did we exclude subjects for any reason.

Finally, our results call for greater effort to mimic the natural structure of the environment in order to study the evolved cognitive faculties of animals82. Foraging animals evolved to make decisions between foreground and background options83,84. Their cognitive strategies are adapted for exploiting the regularities of their natural environment, e.g. depleting patches and clumpy resource distributions57,85,86,87. It is only by carefully considering the ecological validity of our tasks that we will begin to untangle the cognitive and neural processes underlying decision making28,60,88,89. In this vein we join many others in arguing for greater consideration of how the environment shapes cognition and behavior11,28,51,52,89.

## Data availability

All data collected and used in the analysis are available from the corresponding author upon reasonable request or can be found at www.haydenlab.com/www.zimmermannlab.com.

## References

1. Kacelnik, A. & Bateson, M. Risky theories: the effects of variance on foraging decisions. Am. Zool. 36, 402–434, https://doi.org/10.1093/icb/36.4.402 (1996).

2. Stephens, D. W. & Krebs, J. R. Foraging Theory. (Princeton University Press, 1986).

3. Heilbronner, S. R. Modeling risky decision-making in nonhuman animals: shared core features. Curr. Opin. Behav. Sci. 16, 23–29, https://doi.org/10.1016/j.cobeha.2017.03.001 (2017).

4. Kahneman, D. & Tversky, A. Prospect theory: an analysis of decision under risk. Econometrica 47, 263–291, https://doi.org/10.2307/1914185 (1979).

5. O’Donoghue, T. & Somerville, J. Modeling risk aversion in economics. J. Econ. Perspect. 32, 91–114, https://doi.org/10.1257/jep.32.2.91 (2018).

6. Genest, W., Stauffer, W. R. & Schultz, W. Utility functions predict variance and skewness risk preferences in monkeys. Proc. Natl. Acad. Sci. 113, https://doi.org/10.1073/pnas.1602217113 (2016).

7. Knutson, B. & Bossaerts, P. Neural antecedents of financial decisions. J. Neurosci. 27, 8174–8177, https://doi.org/10.1523/JNEUROSCI.1564-07.2007 (2007).

8. McCoy, A. N. & Platt, M. L. Risk-sensitive neurons in macaque posterior cingulate cortex. Nat. Neurosci. 8, 1220–1227, https://doi.org/10.1038/nn1523 (2005).

9. Preuschoff, K., Quartz, S. R. & Bossaerts, P. Human insula activation reflects risk prediction errors as well as risk. J. Neurosci. 28, 2745–2752, https://doi.org/10.1523/JNEUROSCI.4286-07.2008 (2008).

10. Strait, C. E., Blanchard, T. C. & Hayden, B. Y. Reward value comparison via mutual inhibition in ventromedial prefrontal cortex. Neuron 82, 1357–1366, https://doi.org/10.1016/j.neuron.2014.04.032 (2014).

11. Calhoun, A. J. & Hayden, B. Y. The foraging brain. Curr. Opin. Behav. Sci. 5, 24–31, https://doi.org/10.1016/j.cobeha.2015.07.003 (2015).

12. Peters, S. K., Dunlop, K. & Downar, J. Cortico-striatal-thalamic loop circuits of the salience network: a central pathway in psychiatric disease and treatment. Front. Syst. Neurosci. 10, 1–23, https://doi.org/10.3389/fnsys.2016.00104 (2016).

13. Wilson, M. J. & Vassileva, J. Decision-making under risk, but not under ambiguity, predicts pathological gambling in discrete types of abstinent substance users. Front. Psychiatry 9, 1–10, https://doi.org/10.3389/fpsyt.2018.00239 (2018).

14. Santos, L. R. & Rosati, A. G. The evolutionary roots of human decision making. Annu. Rev. Psychol. 66, 321–347, https://doi.org/10.1146/annurev-psych-010814-015310 (2015).

15. McNamara, J. Optimal patch use in a stochastic environment. Theor. Popul. Biol., https://doi.org/10.1016/0040-5809(82)90018-1 (1982).

16. Kacelnik, A. & Abreu, F. B. E. Risky choice and Weber’s law. J. Theor. Biol. 194, 289–298 (1998).

17. Kacelnik, A. & El Mouden, C. Triumphs and trials of the risk paradigm. Anim. Behav. 86, 1117–1129, https://doi.org/10.1016/j.anbehav.2013.09.034 (2013).

18. Farashahi, S., Azab, H., Hayden, B. & Soltani, A. On the flexibility of basic risk attitudes in monkeys. J. Neurosci. 38, 4383–4398, https://doi.org/10.1523/JNEUROSCI.2260-17.2018 (2018).

19. Farashahi, S., Donahue, C. H., Hayden, B. Y., Lee, D. & Soltani, A. Flexible combination of reward information across primates. Nat. Hum. Behav., https://doi.org/10.1038/s41562-019-0714-3 (2019).

20. Real, L. & Caraco, T. Risk and foraging in stochastic environments. Annu. Rev. Ecol. Syst. 17, 371–390, https://doi.org/10.1146/annurev.es.17.110186.002103 (1986).

21. Caraco, T. Energy budgets, risk and foraging preferences in dark-eyed juncos (Junco hyemalis). Behav. Ecol. Sociobiol. 8, 213–217, https://doi.org/10.1007/BF00299833 (1981).

22. McNamara, J. M. & Houston, A. I. Optimal foraging and learning. J. Theor. Biol. 117, 231–249, https://doi.org/10.1016/S0022-5193(85)80219-8 (1985).

23. Pietras, C. J., Locey, M. L. & Hackenberg, T. D. Human risky choice under temporal constraints: tests of an energy-budget model. J. Exp. Anal. Behav. 80, 59–75, https://doi.org/10.1901/jeab.2003.80-59 (2003).

24. Craft, B. B. Risk-sensitive foraging: changes in choice due to reward quality and delay. Anim. Behav. 111, 41–47, https://doi.org/10.1016/j.anbehav.2015.09.030 (2016).

25. Shafir, S. Risk-sensitive foraging: the effect of relative variability. Oikos, https://doi.org/10.1034/j.1600-0706.2000.880323.x (2000).

26. Weber, E. U., Shafir, S. & Blais, A.-R. Predicting risk sensitivity in humans and lower animals: risk as variance or coefficient of variation. Psychol. Rev. 111, 430–445, https://doi.org/10.1037/0033-295X.111.2.430 (2004).

27. Gilby, I. C. & Wrangham, R. W. Risk-prone hunting by chimpanzees (Pan troglodytes schweinfurthii) increases during periods of high diet quality. Behav. Ecol. Sociobiol., https://doi.org/10.1007/s00265-007-0410-6 (2007).

28. Stephens, D. W. Decision ecology: foraging and the ecology of animal decision making. Cogn. Affect. Behav. Neurosci. 8, 475–484, https://doi.org/10.3758/CABN.8.4.475 (2008).

29. Caraco, T., Kacelnik, A., Mesnick, N. & Smulewitz, M. Short-term rate maximization when rewards and delay covary. Anim. Behav. 44, 441–447, https://doi.org/10.1017/CBO9781107415324.004 (1992).

30. Shapiro, M. S., Schuck-Paim, C. & Kacelnik, A. Risk sensitivity for amounts of and delay to rewards: adaptation for uncertainty or by-product of reward rate maximising? Behav. Processes 89, 104–114, https://doi.org/10.1016/j.beproc.2011.08.016 (2012).

31. Krebs, J. R. & Kacelnik, A. Time horizons of foraging animals. Ann. N. Y. Acad. Sci. 423, 278–291, https://doi.org/10.1111/j.1749-6632.1984.tb23437.x (1984).

32. Hertwig, R., Barron, G., Weber, E. U. & Erev, I. Decisions from experience and the effect of rare events in risky choice. Psychol. Sci., https://doi.org/10.1111/j.0956-7976.2004.00715.x (2004).

33. Hertwig, R. & Erev, I. The description-experience gap in risky choice. Trends Cogn. Sci., https://doi.org/10.1016/j.tics.2009.09.004 (2009).

34. Heilbronner, S. R. & Hayden, B. Y. The description-experience gap in risky choice in nonhuman primates. Psychon. Bull. Rev., https://doi.org/10.3758/s13423-015-0924-2 (2016).

35. Hayden, B. Y. & Platt, M. L. Temporal discounting predicts risk sensitivity in rhesus macaques. Curr. Biol. 17, 49–53, https://doi.org/10.1016/j.cub.2006.10.055 (2007).

36. O’Neill, M. & Schultz, W. Coding of reward risk by orbitofrontal neurons is mostly distinct from coding of reward value. Neuron 68, 789–800, https://doi.org/10.1016/j.neuron.2010.09.031 (2010).

37. So, N.-Y. & Stuphorn, V. Supplementary eye field encodes option and action value for saccades with variable reward. J. Neurophysiol. 104, 2634–2653, https://doi.org/10.1152/jn.00430.2010 (2010).

38. Stauffer, W. R. et al. Economic choices reveal probability distortion in macaque monkeys. J. Neurosci. 35, 3146–3154, https://doi.org/10.1523/JNEUROSCI.3653-14.2015 (2015).

39. Xu, E. R. & Kralik, J. D. Risky business: rhesus monkeys exhibit persistent preferences for risky options. Front. Psychol. 5, 1–12, https://doi.org/10.3389/fpsyg.2014.00258 (2014).

40. Yamada, H., Tymula, A., Louie, K. & Glimcher, P. W. Thirst-dependent risk preferences in monkeys identify a primitive form of wealth. Proc. Natl. Acad. Sci. 110, 15788–15793, https://doi.org/10.1073/pnas.1308718110 (2013).

41. Heilbronner, S. R., Rosati, A. G., Stevens, J. R., Hare, B. & Hauser, M. D. A fruit in the hand or two in the bush? Divergent risk preferences in chimpanzees and bonobos. Biol. Lett. 4, 246–249, https://doi.org/10.1098/rsbl.2008.0081 (2008).

42. De Petrillo, F., Ventricelli, M., Ponsi, G. & Addessi, E. Do tufted capuchin monkeys play the odds? Flexible risk preferences in Sapajus spp. Anim. Cogn. 18, 119–130, https://doi.org/10.1007/s10071-014-0783-7 (2015).

43. Rosati, A. G. & Hare, B. Decision making across social contexts: competition increases preferences for risk in chimpanzees and bonobos. Anim. Behav. 84, 869–879, https://doi.org/10.1016/j.anbehav.2012.07.010 (2012).

44. Rosati, A. G. & Hare, B. Chimpanzees and bonobos exhibit emotional responses to decision outcomes. PLoS One, https://doi.org/10.1371/journal.pone.0063058 (2013).

45. Hayden, B. Y. & Platt, M. L. Gambling for Gatorade: risk-sensitive decision making for fluid rewards in humans. Anim. Cogn. 12, 201–207, https://doi.org/10.1007/s10071-008-0186-8 (2009).

46. Heilbronner, S. R. & Hayden, B. Y. Contextual factors explain risk-seeking preferences in rhesus monkeys. Front. Neurosci. 7, 1–7, https://doi.org/10.3389/fnins.2013.00007 (2013).

47. Oaten, A. Optimal foraging in patches: a case for stochasticity. Theor. Popul. Biol., https://doi.org/10.1016/0040-5809(77)90046-6 (1977).

48. Fauchald. Foraging in a hierarchical patch system. Am. Nat., https://doi.org/10.2307/2463618 (2017).

49. Searle, K. R., Vandervelde, T., Hobbs, N. T., Shipley, L. A. & Wunder, B. A. Spatial context influences patch residence time in foraging hierarchies. Oecologia, https://doi.org/10.1007/s00442-005-0285-z (2006).

50. Real, L. A. Animal choice behavior and the evolution of cognitive architecture. Science 253, https://doi.org/10.1126/science.1887231 (1990).

51. Todd, P. M. & Gigerenzer, G. Environments that make us smart. Curr. Dir. Psychol. Sci. 16, 167–171, https://doi.org/10.1111/j.1467-8721.2007.00497.x (2007).

52. Mallpress, D. E. W., Fawcett, T. W., Houston, A. I. & McNamara, J. M. Risk attitudes in a changing environment: an evolutionary model of the fourfold pattern of risk preferences. Psychol. Rev. 122, 364–375, https://doi.org/10.1037/a0038970 (2015).

53. Charnov, E. L. Optimal foraging, the marginal value theorem. Theor. Popul. Biol. 9, 129–136 (1976).

54. Nonacs, P. State dependent behavior and the marginal value theorem. Behav. Ecol. 12, 71–83, https://doi.org/10.1093/oxfordjournals.beheco.a000381 (2001).

55. Sleezer, B. J. & Hayden, B. Y. Differential contributions of ventral and dorsal striatum to early and late phases of cognitive set reconfiguration. J. Cogn. Neurosci., https://doi.org/10.1162/jocn_a_01011 (2016).

56. Blanchard, T. C. & Hayden, B. Y. Neurons in dorsal anterior cingulate cortex signal postdecisional variables in a foraging task. J. Neurosci. 34, 646–655, https://doi.org/10.1523/JNEUROSCI.3151-13.2014 (2014).

57. Blanchard, T. C. & Hayden, B. Y. Monkeys are more patient in a foraging task than in a standard intertemporal choice task. PLoS One 1–11, https://doi.org/10.1371/journal.pone.0117057 (2015).

58. Blanchard, T. C., Pearson, J. M. & Hayden, B. Y. Postreward delays and systematic biases in measures of animal temporal discounting. Proc. Natl. Acad. Sci. 1–6, https://doi.org/10.1073/pnas.1310446110 (2013).

59. Blanchard, T. C., Strait, C. E. & Hayden, B. Y. Ramping ensemble activity in dorsal anterior cingulate neurons during persistent commitment to a decision. J. Neurophysiol. 114, 2439–2449, https://doi.org/10.1152/jn.00711.2015 (2015).

60. Hayden, B. Y. Economic choice: the foraging perspective. Curr. Opin. Behav. Sci. 24, 1–6 (2018).

61. Azab, H. & Hayden, B. Y. Correlates of decisional dynamics in the dorsal anterior cingulate cortex. PLoS Biol. 15, 1–25, https://doi.org/10.1371/journal.pbio.2003091 (2017).

62. Hayden, B. Y. & Gallant, J. L. Working memory and decision processes in visual area V4. Front. Neurosci., https://doi.org/10.3389/fnins.2013.00018 (2013).

63. Sleezer, B. J., Castagno, M. D. & Hayden, B. Y. Rule encoding in orbitofrontal cortex and striatum guides selection. J. Neurosci., https://doi.org/10.1523/jneurosci.1766-16.2016 (2016).

64. Wang, M. Z. & Hayden, B. Y. Reactivation of associative structure specific outcome responses during prospective evaluation in reward-based choices. Nat. Commun. 8, 1–13, https://doi.org/10.1038/ncomms15821 (2017).

65. Skinner, B. F. The Behavior of Organisms: An Experimental Analysis. (D. Appleton Century Crofts, Inc., 1938).

66. Shadish, W. R., Cook, T. D. & Campbell, D. T. Experimental and Quasi-Experimental Designs for Generalized Causal Inference, https://doi.org/10.1198/jasa.2005.s22 (2002).

67. Hayden, B. Y., Pearson, J. M. & Platt, M. L. Neuronal basis of sequential foraging decision in a patchy environment. Nat. Neurosci. 14, 933–939, https://doi.org/10.1038/nn.2856 (2013).

68. Ludvig, E. A., Madan, C. R., Pisklak, J. M. & Spetch, M. L. Reward context determines risky choice in pigeons and humans. Biol. Lett. 10, https://doi.org/10.1098/rsbl.2014.0451 (2014).

69. Strait, C. E., Sleezer, B. J. & Hayden, B. Y. Signatures of value comparison in ventral striatum neurons. PLoS Biol., https://doi.org/10.1371/journal.pbio.1002173 (2015).

70. Blanchard, T. C. et al. Neuronal selectivity for spatial positions of offers and choices in five reward regions. J. Neurophysiol., https://doi.org/10.1152/jn.00325.2015 (2015).

71. Stephens, D. W. & Charnov, E. L. Optimal foraging: some simple stochastic models. Behav. Ecol. Sociobiol., https://doi.org/10.1007/BF00302814 (1982).

72. Hayden, B. Y., Nair, A. C., McCoy, A. N. & Platt, M. L. Posterior cingulate cortex mediates outcome-contingent allocation of behavior. Neuron 60, 19–25, https://doi.org/10.1016/j.neuron.2008.09.012 (2008).

73. Pearson, J. M., Hayden, B. Y., Raghavachari, S. & Platt, M. L. Neurons in posterior cingulate cortex signal exploratory decisions in a dynamic multioption choice task. Curr. Biol. 19, 1532–1537, https://doi.org/10.1016/j.cub.2009.07.048 (2009).

74. Barraclough, D. J., Conroy, M. L. & Lee, D. Prefrontal cortex and decision making in a mixed-strategy game. Nat. Neurosci. 7, 404–410, https://doi.org/10.1038/nn1209 (2004).

75. Seo, H. & Lee, D. Temporal filtering of reward signals in the dorsal anterior cingulate cortex during a mixed-strategy game. J. Neurosci. 27, 8366–8377, https://doi.org/10.1523/JNEUROSCI.2369-07.2007 (2007).

76. Heilbronner, S. R., Hayden, B. Y. & Platt, M. L. Decision salience signals in posterior cingulate cortex. Front. Neurosci. 5, 1–9, https://doi.org/10.3389/fnins.2011.00055 (2011).

77. Hayden, B. Y., Heilbronner, S. R. & Platt, M. L. Ambiguity aversion in rhesus macaques. Front. Neurosci. 4, 1–7, https://doi.org/10.3389/fnins.2010.00166 (2010).

78. Teichroeb, J. A. & Smeltzer, E. A. Vervet monkey (Chlorocebus pygerythrus) behavior in a multi-destination route: evidence for planning ahead when heuristics fail. PLoS One 13, 1–18, https://doi.org/10.1371/journal.pone.0198076 (2018).

79. Stephens, D. W. & Anderson, D. The adaptive value of preference for immediacy: when shortsighted rules have farsighted consequences. Behav. Ecol., https://doi.org/10.1093/beheco/12.3.330 (2001).

80. Stephens, D. W., Kerr, B. & Fernández-Juricic, E. Impulsiveness without discounting: the ecological rationality hypothesis. Proc. R. Soc. 271, 2459–2465, https://doi.org/10.1098/rspb.2004.2871 (2004).

81. Richard, A. F., Goldstein, S. J. & Dewar, R. E. Weed macaques: the evolutionary implications of macaque feeding ecology. Int. J. Primatol. 10, 569–594 (1989).

82. Pearson, J. M., Watson, K. K. & Platt, M. L. Decision making: the neuroethological turn. Neuron 82, 950–965, https://doi.org/10.1016/j.neuron.2014.04.037 (2014).

83. Stephens, D. W. & Dunlap, A. S. Why do animals make better choices in patch-leaving problems? Behav. Processes 80, 252–260, https://doi.org/10.1016/j.beproc.2008.11.014 (2009).

84. Dunlap, A. S. & Stephens, D. W. Tracking a changing environment: optimal sampling, adaptive memory and overnight effects. Behav. Processes 89, 86–94, https://doi.org/10.1016/j.beproc.2011.10.005 (2012).

85. Wilke, A. & Barrett, H. C. The hot hand phenomenon as a cognitive adaptation to clumped resources. Evol. Hum. Behav. 30, 161–169, https://doi.org/10.1016/j.evolhumbehav.2008.11.004 (2009).

86. Blanchard, T. C., Wilke, A. & Hayden, B. Y. Hot-hand bias in rhesus monkeys. J. Exp. Psychol. Anim. Learn. Cogn. 40, 280–286, https://doi.org/10.1037/xan0000033 (2014).

87. Hammack, T., Cooper, J., Flach, J. M. & Houpt, J. Toward an ecological theory of rationality: debunking the hot hand “illusion”. Ecol. Psychol. 29, 35–53, https://doi.org/10.1080/10407413.2017.1270149 (2017).

88. Krakauer, J. W., Ghazanfar, A. A., Gomez-Marin, A., MacIver, M. A. & Poeppel, D. Neuroscience needs behavior: correcting a reductionist bias. Neuron 93, 480–490, https://doi.org/10.1016/j.neuron.2016.12.041 (2017).

89. Juavinett, A. L., Erlich, J. C. & Churchland, A. K. Decision-making behaviors: weighing ethology, complexity, and sensorimotor compatibility. Curr. Opin. Neurobiol. 49, 42–50, https://doi.org/10.1016/j.conb.2017.11.001 (2018).

## Acknowledgements

This research was supported by National Institute on Drug Abuse grant R01 DA038106 to BYH, an NIH T32 to BRE, and the UMN DTI and AIRP to BYH and JZ.

## Author information

### Contributions

B.R.E., B.Y.H. and J.Z. designed experimental protocols. B.R.E. collected all data and performed data analysis. B.R.E., B.Y.H. and J.Z. wrote the manuscript.

### Corresponding author

Correspondence to Benjamin R. Eisenreich.

## Ethics declarations

### Competing interests

The authors declare no competing interests.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Eisenreich, B.R., Hayden, B.Y. & Zimmermann, J. Macaques are risk-averse in a freely moving foraging task. Sci Rep 9, 15091 (2019). https://doi.org/10.1038/s41598-019-51442-z
