Abstract
People with diabetes mellitus are at high risk of developing psychiatric disorders, especially mood disorders, yet the link between hyperglycemia and altered motivation has not been thoroughly explored. Here, we characterized value-based decision-making behavior of a streptozocin-induced diabetic mouse model on Restaurant Row, a naturalistic neuroeconomic foraging paradigm capable of behaviorally capturing multiple decision systems known to depend on dissociable neural circuits. Mice made self-paced choices on a daily limited time-budget, accepting or rejecting reward offers based on cost (delays cued by tone pitch) and subjective value (flavors), in a closed-economy system tested across months. We found that streptozocin-treated mice disproportionately undervalued less-preferred flavors and inverted their meal-consumption patterns, shifting toward a more costly strategy that overprioritized high-value rewards. These foraging behaviors were driven by impairments in multiple decision-making processes, including the ability to deliberate when engaged in conflict and to cache the value of the passage of time as sunk costs. Surprisingly, diabetes-induced changes in motivation depended not only on the type of choice being made, but also on the salience of reward-scarcity in the environment. These findings suggest that complex relationships between metabolic dysfunction and dissociable valuation algorithms underlying unique cognitive heuristics and sensitivity to opportunity costs can disrupt distinct computational processes leading to comorbid psychiatric vulnerabilities.
Introduction
Reward-seeking motivated behavior derives from multiple decision-making systems in the brain1, which play coordinated roles in overall limbic function. Distinct circuits can contribute to separable valuation algorithms via fundamentally distinct computations that can uniquely go awry in different psychiatric disorders2,3,4,5,6. Neuroeconomics is an emerging field of decision science that leverages complex approaches in behavior to quantify how the physical limits of the brain constrain the way cognitive mechanisms process reward-related information7. This encompasses characterizing multifactorial aspects of motivation that operationalize reward value along several dimensions (e.g., reward magnitude, probability, cost, or subjective preferences), integrate choice processes with environmental circumstances, and capture brain-body interactions with evolutionarily conserved cognitive heuristics that may depend on, for instance, energy balance and metabolic demand6,8,9. Recent insights from neuroeconomic principles offer novel approaches to investigate decision-making information processing capable of resolving behavior into discretely measurable computational units in a manner that is biologically tractable and readily translatable across species10,11,12.
Diabetes mellitus, known colloquially as diabetes, refers to a group of metabolic diseases that impair the ability to regulate blood glucose via perturbations in insulin function: either insulin deficiency as in Type 1 diabetes or impaired signaling through the insulin receptor as in Type 2 diabetes13. The brain is the body’s largest user of glucose as an energy source yet has often been considered spared from the effects of diabetes, despite literature refuting this14. Recent studies have demonstrated that diabetes, even in the absence of associated vascular dysfunction, leads to altered cognitive performance and can double the risk of psychiatric disorders, especially mood disorders such as depression15,16,17,18,19,20,21. Depression is associated with dysfunction of reward circuitry22, with several lines of evidence suggesting that diabetes can impact overlapping neurobiology. Neuroimaging studies demonstrate structural and functional alterations in reward circuitry in people with diabetes, including changes in functional connectivity within mesolimbic systems23,24. Altered gene expression in postmortem brain tissue from people with diabetes is greatest in key regions of the reward system implicated in psychiatric sequelae, most notably the caudate, hippocampus, nucleus accumbens, and amygdala, which are disproportionately affected compared to other parts of the brain25,26,27,28,29,30. In models of diabetes, rodents display altered motivation and reward sensitivity for both food and non-food rewards, including decreased progressive ratio breakpoints and increased thresholds for intracranial self-stimulation31,32. This is perhaps not surprising in the setting of impaired food-related signaling when considering that food is a primary reward, and that motivated behavior functions to promote survival, including feeding.
However, beyond insulin function in the periphery, insulin receptors are distributed throughout the brain, especially in limbic circuitry, despite glucose entry into the brain being insulin-independent. Together, these findings raise the possibility that, rather than being a consequence of insulin dysfunction in the periphery, reward sensitivity could be disturbed as a direct result of impaired central insulin action on value-related information processing in the brain. This posits a more direct link between diabetes and increased comorbid psychiatric vulnerabilities. Nevertheless, our understanding of how diabetes alters reward processing remains limited. We reasoned that neuroeconomic approaches, applied to translational studies of diabetes, would be critically informative for disentangling, through behavior, multiple dissociable circuit-specific computational processes that could uniquely go awry.
In this exploratory study, we set out to broadly examine how diabetes alters multiple aspects of value-based decision-making behavior, as such phenotypes in rodent models of diabetes have been characterized to date with little depth. This is driven by our hypothesis that reward processing is likely to be disrupted in diabetes, but in a heterogeneous manner, where impaired metabolic signaling is likely interacting with complex computational processes in the brain. Here, we asked if this could be accessed through complex decision-making behavior. We characterized a rich dataset of neuroeconomic choice behavior in a modified multiple low-dose streptozocin (STZ) mouse model of diabetes33 tested longitudinally in a naturalistic foraging paradigm called Restaurant Row34,35. In this complex behavioral task, mice must forage for their primary source of food while on a limited daily time budget in a closed-economy system by navigating a maze with four uniquely flavored and contextualized feeding sites, or “restaurants.” Mice learn to associate the pitch of a tone with reward cost in the form of a delay required to obtain food. Mice are free to choose in a self-paced manner how to invest time in competing actions. This allows us to break down dissociable action-selection processes into discrete stages of the decision stream. Mice decide whether to skip offers presented in a spatially segregated offer zone and proceed to the next restaurant or to accept and then separately wait for cued delays in a distinct wait zone: these zones capture dissociable choice processes separated across space and time that we previously demonstrated derive from physically separable circuits and recruit fundamentally distinct valuation algorithms3,4,5,11.
Because time is a limited commodity, choices on this task can have dire opportunity costs and are interdependent both between trials and across days. Leveraging such elements of our task has led to key breakthroughs by our group capturing psychological phenomena across species previously thought to be unique to humans such as sensitivity to sunk costs and regret3,11,34,35,36. Sensitivity to sunk costs describes the economic concept that individuals tend to overvalue rewards and escalate commitment to an ongoing pursuit as a function of irrecoverable losses, measured here in the form of time spent and orthogonal to future costs9,11,36,37,38,39,40. Relevant here, competing theories from the behavioral ecology field (i.e., state-dependent valuation learning and within-trial contrast) posit that sensitivity to sunk costs has been conserved across evolution in part due to drivers of cognitive heuristics that depend on considerations of caloric balance and a comparison among one’s depleting energetic state, possible gains from future rewards, and the demands of the environment11,41,42; however, this has not yet been formally tested in individuals with impaired metabolic signaling. Thus, here, the use of STZ-induced diabetic mice allows us to test the hypothesis that sensitivity to sunk costs critically depends on interactions between one’s metabolic state and changes in the environment, and that we might disentangle this from other reward valuation processes. To this end, we experimentally manipulated not only the distribution of reward costs presented to the animals across weeks to months but also systematically varied the rate of the changing economic landscape of the environment. This allowed us to assess how one’s experience of the rate of change of costs in the environment, or history of economic contrast, shapes decision-making behavior. Finally, we compared how being food restricted vs. having ad libitum access to food altered the effects of environmental demand on diabetic mice, in order to evaluate differential influences of global energetic state on dissociable decision-making processes.
We discovered that STZ-treated mice employed altered behavioral strategies depending on the type of choice being made. This not only developed across an increasingly reward-scarce environment but also depended on the salience of contrast in the environment’s changing cost distribution in order to drive changes in fundamentally distinct valuation algorithms. We found that sensitivity to sunk costs only emerged after individuals transitioned to a reward-scarce environment. Further, consistent with an impairment in brain-body signaling of one’s metabolic state, we found that sunk cost sensitivity was strikingly abolished only in STZ-treated mice who experienced a low-contrast transition into a reward-scarce environment, and that this effect was dependent on being food restricted. These results indicate sunk cost sensitivity, separate from other decision-making behaviors, is a reward valuation process critically dependent on intact metabolic signaling that dynamically interacts with the global energetic state of the individual. Here, we demonstrate the utility of applying neuroeconomic approaches to animal behavior. This work will help uncover dissociable motivational processes capable of altering decision-making behavior in a disease that disturbs fundamental biology, such as diabetes, with implications for shared pathophysiology with comorbid psychiatric vulnerabilities.
Methods
Subjects
Adult male C57BL/6J mice (Jackson Labs, stock # 000664) were used in this study. Mice were maintained on a 12-h regular light/dark cycle with ad libitum access to water throughout the entire experiment. During the treatment protocol to induce chronic hyperglycemia, mice were group-housed with ad libitum access to regular chow (LabDiet 5053; Protein 21%, Fat 6%, Carbohydrate 53.5%, Fiber 4.4%, Ash 6%, 4.11 kcal/gm). Beginning 3 days prior to and continuing during testing on the Restaurant Row paradigm, mice were single-housed to avoid shared flavor preferences and food restricted to 80–85% of pre-task body weight. Behavioral testing was conducted during the light phase in dim lighting. Experiments were approved by the Mount Sinai Institutional Animal Care and Use Committee (IACUC) and adhered to the National Institutes of Health (NIH) guidelines.
Chronic hyperglycemia
We utilized a modified low-dose streptozotocin (STZ) protocol to induce insulin deficiency and resultant hyperglycemia33,43. At 8 weeks of age, 45 C57BL/6J mice received a daily intraperitoneal injection of either Hank’s balanced saline solution (HBSS, n = 20) as the vehicle (VEH) control or STZ (50 mg/kg, n = 25) for 5 consecutive days (Fig. 1a). Mice were maintained on ad libitum food for the next 6 weeks. Body weights and fasting blood sugar (morning fast, 4–5 h) were assessed weekly to monitor progression of diabetes using a handheld glucometer (Bayer Contour). At 14 weeks of age, hemoglobin A1c (HbA1c) tests of tail vein blood were performed to confirm a chronic hyperglycemia phenotype using A1CNow+ point-of-care kits (PTS Diagnostics), given that not all STZ-treated mice develop hyperglycemia. The 20 STZ-treated mice with the highest HbA1c (range for all 25 STZ-treated: 5–8.5%; VEH range: 4–4.5%) were then selected for inclusion in the study. Mice were then single-housed and food restricted to 80–85% of their free-feeding body weight over the 3 days preceding the beginning of the task.
a Experimental timeline. Dashed box indicates time period relevant for this figure: hyperglycemia incubation. Randomly selected male C57BL/6J mice received daily intraperitoneal injections of vehicle (VEH, Hank’s balanced saline solution, n = 20) or streptozocin (STZ, 50 mg/kg, n = 20) for 5 consecutive days (gray or orange tick icon). STZ is an antibiotic that ablates insulin-producing beta cells of the pancreas. Mice were allowed to incubate for 8 weeks while sampling body weights and fasting tail vein blood glucose levels (small droplet icon) weekly before sampling tail vein blood Hemoglobin A1c once (large droplet icon) and then beginning longitudinal neuroeconomic behavioral testing for an additional ~2 months. In brief, mice foraged for food rewards of varying costs across distinct economic landscapes that changed at different rates, described further in subsequent figures. b Weekly body weights. As expected, STZ-treated mice experienced an initial drop in weight before returning to a rate of weight gain similar to VEH-treated mice. c Fasting tail vein blood glucose levels. STZ treatment significantly elevated blood glucose measurements compared to VEH-treated mice. d Tail vein blood Hemoglobin A1c obtained at week 8, confirming chronic hyperglycemia in STZ-treated mice, sampled 3 days before starting behavioral testing. *Represents significant differences between VEH- and STZ-treated groups, p < 0.0001. Dots represent individual mice. Shading / error bars represent ±1 SEM.
Neuroeconomic decision-making paradigm
General structure
20 VEH- and 20 STZ-treated mice were then characterized longitudinally on the Restaurant Row task34. On this task, mice had a limited amount of time each day to forage for their primary source of food by navigating a maze with four uniquely flavored and contextualized feeding sites, or “restaurants” (Fig. 2b). Food rewards consisted of 20 mg full-nutrition pellets (BioServ dustless precision pellets) that varied in flavor (chocolate, banana, grape, or plain) but not caloric content (3.6 kcal/gm). Macronutrient content did not vary grossly across flavors (18.7% protein, 5.6% fat, 59.1% carbohydrate, 4.7% fiber, 6.5% ash for banana, grape or plain; 18.4% protein, 5.5% fat, 59.1% carbohydrate, 4.6% fiber, 6.5% ash for chocolate). Each restaurant was decorated with either horizontal stripes (chocolate), dots (banana), triangles (grape), or vertical stripes (plain), whose locations remained spatially fixed throughout the entire paradigm. Animals were placed into an arena (approximately 36” × 36”) consisting of 4 fixed restaurants positioned in the corners of a square maze and connected by hallways for 45 min (their limited daily time budget) and video-tracked under dim lighting using a standard USB camera and computer running the task programmed in ANY-Maze (Stoelting). Animal position tracking in real-time controlled engagement with the task whereby centroid body crossings into distinct zones in the maze triggered task events, including speaker playback of various tones or activating custom-built 3D-printed automated pellet dispensers (open-source pellet dispensers used in this experiment: www.hackaday.io/project/171116-fed0). Task rules required animals to run in a counterclockwise direction so that they properly approached each restaurant’s T-shaped intersection in order to engage task contingencies and trigger events, a rule that all animals acquired within the first week of testing. Each restaurant had a separate offer zone and wait zone.
Upon entry into the offer zone from the correct heading direction, a tone sounded whose pitch indicated the offer length, i.e., the delay mice would have to wait in a cued countdown should they choose to enter the wait zone in order to earn a food reward. The range of tone pitches used in this paradigm varied from 4000 Hz (lowest pitch, signaling a 1 s offer) in increments of 387 Hz to 15,223 Hz (highest pitch, signaling a 30 s offer), determined by the schedule animals were assigned to, described in detail below. In the offer zone, tones played for 500 ms and repeated every second at the same pitch until either an enter decision was made by turning right into the wait zone or a skip decision was made by turning left into the hallway advancing toward the next restaurant. If mice chose to enter the wait zone, tones (500 ms) descended by 387 Hz every second, signaling the countdown to reward delivery. If mice decided to quit and exit the wait zone prematurely before the countdown completed, tones silenced, the offer was rescinded, and mice had to advance to the next trial in the next restaurant. Thus, this task is economic in nature, requiring animals to budget their limited time effectively in a self-paced manner in order to earn a sufficient amount of food. Animals were maintained at 80–85% body weight per IACUC regulations and both VEH- and STZ-treated groups were supplemented with additional food post-task if needed to maintain safety.
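The pitch-to-delay mapping described above is linear and can be written down directly. The following is a minimal Python sketch for illustration only (the task itself was programmed in ANY-Maze); the function names are hypothetical, and the assumption that the countdown starts at the offer pitch and ends on the 1 s tone is ours rather than stated in the text.

```python
BASE_HZ = 4000  # pitch cueing a 1 s offer (from the text)
STEP_HZ = 387   # pitch increment per additional second of delay (from the text)


def offer_to_pitch(delay_s):
    """Return the tone pitch (Hz) that cues a delay of 1-30 whole seconds."""
    if not 1 <= delay_s <= 30:
        raise ValueError("delay must be between 1 and 30 s")
    return BASE_HZ + STEP_HZ * (delay_s - 1)


def countdown_pitches(delay_s):
    """Pitches played once per second in the wait zone.

    Assumption: the countdown begins at the offer pitch and descends by
    STEP_HZ each second until it reaches the 1 s tone.
    """
    return [offer_to_pitch(s) for s in range(delay_s, 0, -1)]
```

For example, `offer_to_pitch(30)` returns 15223, matching the highest pitch reported for a 30 s offer.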
a Timeline. Dashed box indicates time period relevant for this figure: first week of behavioral testing when all costs were fixed at 1 s delays (green epoch). b Task schematic. Mice were allotted a 45 min daily budget to invest time foraging for their primary source of food in a self-paced manner. Costs to obtain rewards were in the form of delays mice would have to wait near feeder sites. Mice were required to run in a counterclockwise direction encountering offers for different flavors at each “restaurant” in serial order. Each restaurant, separated by hallways, was divided into a T-shaped “offer zone” choice point and a separate “wait zone” that housed the pellet dispenser. Upon offer zone entry from the correct heading direction, a tone sounded whose pitch indicated the delay mice would have to wait if accepting the offer by entering the wait zone. If entered, tone pitch descended in the wait zone, cuing the indicated delay. Each trial terminated if mice skipped in the offer zone, quit during the countdown in the wait zone, or earned a reward, after which animals were required to proceed to the next restaurant. This task captures dissociable motivational elements of choice deliberation, re-evaluation, and opportunity costs that depend on subjective value and the economic demand of the environment. c–g Simple behavioral metrics across the first week of testing during which all offers were 1 s only (lowest pitch, 4 kHz): c laps run in the correct direction, d inter-trial travel time between restaurants, e total rewards earned, f earnings split by flavors ranked from least to most preferred by summing each day’s end-of-session totals in each restaurant and g normalized to number of laps run (treatment x rank: F = 3.823, p < 0.01). In (c–e) right, *represents significant differences between VEH- and STZ-treated groups on day 7 (dashed box), p < 0.01, t-test. Dots represent individual mice. Error bars represent ±1 SEM.
Offer schedule: stepwise vs. gradual
A key manipulation and focus of the present study was the way in which the distribution of offers presented to the animals changed across weeks. For the first week of behavioral testing, all offers were 1 s only (i.e., the lowest 4000 Hz pitch tone presented, green epoch). This allowed animals to easily learn the basic structure of the task and revealed individual differences in flavor preferences. Beginning on day 8 until day 22, the range of offers presented to the animals increased, ultimately to the final 1–30 s offer range, but did so via one of two schedules: (i) stepwise or (ii) gradual. For the stepwise schedule, during the next 7 days of testing (block 2, days 8–14, yellow epoch), offers ranged from 1 to 5 s; block 3 (days 15–21, orange epoch) consisted of a 1–15 s range. The fourth and final block (days 22–49, red epoch) consisted of offers ranging from 1 to 30 s. For the gradual schedule, during the next 15 days of testing (days 8–22), the maximal offer increased each day by 2 s (e.g., day 8: 1–2 s range; day 9: 1–4 s range; day 10: 1–6 s range, and so on) until reaching the same 1–30 s offer range by day 22. These two schedules allowed us to examine how mice initially shaped their behavior as they learned the basic structure of the task as well as how they adjusted their foraging strategies across long timescales (days, weeks, and months) as they transitioned from reward-rich to reward-scarce environments in either a stepwise or gradual manner. In this way, we manipulated the economic contrast or salience in how the distribution of costs changed each day. This design minimized as many discrepancies between the schedules as possible. Regardless of the range of offers on any given day, offers were always sampled from a uniform distribution separately in each restaurant.
Thus, after day 7, VEH- and STZ-treated mice were split to continue either on the stepwise or gradual schedule, randomly assigned and counterbalanced across numerous behavioral metrics. Animals developed strategies longitudinally across the changing economic landscape. Testing continued on the 1–30 s offer range until the main experiment terminated on day 49. This latter period (days 22–49) allowed us to characterize how differences in one’s prior history of environmental change (during days 8–21) influenced decision-making behavior despite arriving in the same final environment (days 22–49). At the very end of the experiment, after the main behavioral period during days 1–49 concluded, subjects were placed back on ad libitum access to regular chow in the home cage for an additional 5 days while continuing testing on Restaurant Row in their final week. For graphical comparison, data analyzed from the time period when mice were food restricted were calculated either from the 2 days leading up to the ad lib sessions or from the entire 1–30 s epoch, as indicated within each figure. Pre- and post-task tail vein blood glucose levels were sampled on days 7 and 22 in both the VEH- and STZ-treated animals. In the final week of testing, only the STZ-treated mice were sampled again for pre-task tail vein blood glucose levels to ensure safety and maintenance of hyperglycemia at long time points. Terminal HbA1c levels were measured in all mice at the conclusion of the study.
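The two offer schedules can be summarized as a function from testing day to the longest delay on offer. The Python sketch below is illustrative only: the function names are hypothetical, and drawing offers as whole seconds is an assumption (the text states only that offers were sampled from a uniform distribution within the day's range).

```python
import random


def max_offer(day, schedule):
    """Longest delay (s) on offer for a given testing day (1-49)."""
    if day <= 7:
        return 1    # first week: all offers cost 1 s
    if day >= 22:
        return 30   # final 1-30 s environment for both schedules
    if schedule == "stepwise":
        return 5 if day <= 14 else 15   # days 8-14: 1-5 s; days 15-21: 1-15 s
    if schedule == "gradual":
        return 2 * (day - 7)            # day 8: 2 s, day 9: 4 s, ...
    raise ValueError("schedule must be 'stepwise' or 'gradual'")


def draw_offer(day, schedule, rng=random):
    """Sample one offer uniformly from 1 s up to the day's maximum.

    Assumption: offers are whole seconds.
    """
    return rng.randint(1, max_offer(day, schedule))
```

Under this sketch the gradual schedule reaches a 28 s maximum on day 21 and 30 s on day 22, the day on which both schedules share the same environment.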
Data & statistical analyses
Data were processed in MATLAB with statistical analyses in JMP Pro 16 using standard built-in functions. All data are expressed as mean ±1 standard error. Statistical significance was assessed using Student’s t tests and one-way, two-way, and repeated measures ANOVAs, and sign-tests. Correlations were reported using Pearson correlation r coefficients. Several behavioral analyses were developed and either previously published or newly described in this manuscript, detailed below. Many analyses of interest were calculated by collapsing across the entire 1–30 s epoch (days 22–49) unless otherwise specified.
Meal consumption patterns were calculated by binning the full 45 min session into 18 segmented 2.5 min bins. Within each bin, we calculated the total number of rewards earned, rewards earned in each restaurant, as well as cumulative sums of both metrics either in raw pellet counts or as a percentage of either total global earns or total earns within-flavor. This allowed us to capture rate of within-flavor meal consumption patterns. Flavors were ranked from most to least preferred each day by summing the end of session earn totals in each restaurant. To summarize mid-session meal patterns, we extracted % cumulative earnings within-flavor metrics at the half-way point and calculated a difference score between the most preferred and least preferred flavors in this metric to generate a summary measure reflecting a sign change in the temporal rank order of flavor-specific meal consumption.
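As an illustration of the binning described above, the following Python sketch computes the percent cumulative earnings within each flavor across 18 bins and the mid-session difference score. The original analyses were run in MATLAB; the function names and input format (earn timestamps in seconds plus a flavor label per earn) are assumptions made for this example.

```python
import numpy as np

SESSION_S = 45 * 60            # 45 min session
BIN_S = 150                    # 2.5 min bins
N_BINS = SESSION_S // BIN_S    # 18 bins


def cumulative_pct_by_flavor(earn_times_s, earn_flavors, flavors):
    """Percent of each flavor's session total earned by the end of each bin."""
    edges = np.arange(0, SESSION_S + BIN_S, BIN_S)
    times = np.asarray(earn_times_s, dtype=float)
    labels = np.asarray(earn_flavors)
    result = {}
    for flavor in flavors:
        counts, _ = np.histogram(times[labels == flavor], bins=edges)
        total = counts.sum()
        if total:
            result[flavor] = 100 * np.cumsum(counts) / total
        else:
            result[flavor] = np.full(N_BINS, np.nan)  # nothing earned
    return result


def midsession_difference(cum_pct, most_preferred, least_preferred):
    """Most- minus least-preferred cumulative % at the half-way point."""
    half = N_BINS // 2 - 1     # bin ending at 22.5 min
    return cum_pct[most_preferred][half] - cum_pct[least_preferred][half]
```

A positive score means the most preferred flavor was consumed earlier in the session than the least preferred one; a sign change indicates an inverted temporal order.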
Percent change in earns across the entire study was measured within-subject relative to the average of stable total earns obtained from days 5–7. This allowed us to capture, within-subject, how the subsequent changing economic landscape of the task caused a decrease in total earnings relative to this day 5–7 average (termed 100%). From this, we calculated difference scores in this metric at each transition point in the stepwise schedule (i.e., day 7–8, day 14–15, and day 21–22) showing the drastic stairstep effects on earnings this schedule has on both VEH- and STZ-treated mice compared to the gradual schedule.
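The normalization to the day 5–7 baseline can be sketched as follows. This Python example is illustrative, with hypothetical function names; it assumes a per-animal list of daily total earns ordered by day, starting on day 1.

```python
import numpy as np


def percent_of_baseline(daily_earns, baseline_days=(5, 6, 7)):
    """Express each day's total earns as a % of the mean of the baseline days."""
    earns = np.asarray(daily_earns, dtype=float)
    baseline = earns[[d - 1 for d in baseline_days]].mean()
    return 100 * earns / baseline


def transition_drop(pct, day_before, day_after):
    """Difference score across a schedule transition (e.g., day 7 to day 8)."""
    return pct[day_after - 1] - pct[day_before - 1]
```

With earnings that are stable at 100 pellets on days 5–7 and fall to 80 on day 8, `transition_drop(pct, 7, 8)` gives a 20 percentage-point decrease.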
In order to approximate economic thresholds, or indifference points, of willingness to accept and earn rewards, we fit a Heaviside-step regression to choice outcomes as a function of cost in each zone and measured curve inflection points each day across the entire experiment. Wait zone thresholds, which reflect the offers animals are actually willing to wait for and earn and are relatively stable, were used to determine the value of an offer on any given trial for that day via the following formula: (offer value = wait zone threshold minus offer). Offer here and throughout the manuscript refers to the independent variable, the randomly selected delay, presented to the animal on each trial signaled by the pitch of a tone. Thresholds refer to the indifference point of the Heaviside-step regression. Thus, subtracting offer from wait zone thresholds allows offers to be normalized into value terms across animals, across days, or within-animal across flavors. For this normalization, session-level thresholds were recalculated using a leave-one-out method on a trial-by-trial basis to avoid conflating value normalization with the offer presented and behavioral outcome on any given trial. The value remaining at the moment of quitting could similarly be calculated using the following formula: (value left = wait zone threshold minus time remaining in the countdown at the moment of quitting) to categorize quit events into separate economic categories. Difference scores in thresholds between zones were also calculated via the following formula: (offer zone minus wait zone thresholds) in order to summarize to what degree offer zone and wait zone thresholds were in register (delta score of 0) or out of register with one another (e.g., if offer zone thresholds > wait zone thresholds, animals would be more likely to quit accepted offers).
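To make the threshold and value calculations concrete, the sketch below fits a descending step function by least squares over a grid of candidate thresholds and then computes leave-one-out offer values. This is one plausible reading of a Heaviside-step regression, not the authors' MATLAB implementation; the candidate grid (half-second midpoints from 0.5 to 30.5 s) and the function names are assumptions.

```python
import numpy as np


def step_threshold(offers, outcomes, candidates=None):
    """Least-squares fit of a step: predict 1 (accept/earn) when offer < threshold.

    Assumption: thresholds are searched on a half-second grid, and the
    smallest best-fitting candidate is returned when several tie.
    """
    offers = np.asarray(offers, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    if candidates is None:
        candidates = np.arange(0.5, 31.0, 1.0)
    errors = [np.sum((outcomes - (offers < c)) ** 2) for c in candidates]
    return float(candidates[int(np.argmin(errors))])


def leave_one_out_values(offers, outcomes):
    """Offer value = threshold (fit without the current trial) minus offer."""
    offers = np.asarray(offers, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    keep = np.ones(len(offers), dtype=bool)
    values = np.empty(len(offers))
    for i in range(len(offers)):
        keep[i] = False   # drop the current trial before fitting
        values[i] = step_threshold(offers[keep], outcomes[keep]) - offers[i]
        keep[i] = True
    return values
```

Positive values mark offers cheaper than the animal's threshold and negative values mark offers that cost more than it is typically willing to wait. The same subtraction with time remaining in place of the offer gives the value left at the moment of quitting.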
In order to capture deliberation behaviors, we quantified behavioral path trajectories as mice traversed through the offer zone en route to making a skip or enter decision. Video-tracked body positions during the pass through the offer zone choice point can be transformed into absolute integrated angular velocity, a metric of hesitation or physical “hemming and hawing” known as vicarious trial and error (VTE) behavior. Trajectories in this analysis started at offer onset entering the stem of the T-shaped 180-degree choice point until either hallway entry (left turn, skip) or wait zone entry (right turn, enter). This behavior is best measured by calculating changes in velocity vectors of discrete body x and y positions over time as dx and dy. From this, we can calculate the momentary change in angle, Phi, as dPhi. When this metric is integrated over the duration of the pass through the offer zone, VTE is measured as the absolute integrated angular velocity, or IdPhi. For a given animal on a given day, we can normalize all IdPhi metrics obtained on every trial and generate a zIdPhi value that can then be split post-hoc for comparisons under any number of trial conditions (e.g., skip vs. enter, by flavor, etc). This allowed us to capture within-subject but between-decision-condition differences in internal choice patterns. Thus, we could calculate difference scores between zIdPhi when skipping minus entering, for instance, as a within-subject metric of choice conflict.
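The IdPhi computation described above can be sketched in a few lines. The Python example below is illustrative rather than the authors' code: it assumes evenly sampled x and y positions, uses finite differences for the derivatives, and unwraps the heading angle to avoid spurious jumps at ±π, all of which are choices made for this example.

```python
import numpy as np


def idphi(x, y, dt):
    """Absolute integrated angular velocity of one pass through the offer zone."""
    dx = np.gradient(np.asarray(x, dtype=float), dt)   # velocity components
    dy = np.gradient(np.asarray(y, dtype=float), dt)
    phi = np.unwrap(np.arctan2(dy, dx))                # heading angle over time
    dphi = np.gradient(phi, dt)                        # angular velocity
    return float(np.sum(np.abs(dphi)) * dt)            # integrate |dPhi|


def zscore(values):
    """Normalize one animal's IdPhi values from one session (zIdPhi)."""
    v = np.asarray(values, dtype=float)
    return (v - v.mean()) / v.std()
```

A straight pass through the choice point yields an IdPhi near zero, whereas a path with repeated reorientations yields a large value. A within-subject conflict score can then be taken as the mean zIdPhi on skip trials minus the mean on enter trials.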
In order to capture sensitivity to sunk costs, we previously developed a dynamic analysis capable of extracting the influence of time spent in the wait zone on the likelihood of quitting that is orthogonal to temporal distance to the goal11,36. First, each quit event was binned into [time spent, time left] pairs. From this, we calculated the probability of earning a reward using a sliding window survival analysis as animals continuously reevaluated staying in the wait zone. In order to reduce the dimensions of this analysis, we collapsed each time spent condition across the time left axis. To control for artificial inflation of the probability of earning calculation simply due to increasingly non-existent data being left out of the grand means at higher time spent conditions (because data points do not exist above a certain time left amount for a specific time spent waiting condition), we resampled data from the 0 s time spent condition and collapsed along the time left dimension, iteratively excluding points to match the data range of each time spent condition. In other terms, the 0 s condition includes data points from which wait zone thresholds derive, and thus this control analysis effectively recalculates the probability of earning by iteratively removing data from the right side of the offer distribution. This controls for the probability of earning approaching 1 due simply to excluded data in the shorter curves of the higher sunk cost conditions when reducing dimensions. This resultant dimension-reduced analysis yields an observed and control curve that can be subtracted to yield the final delta curve. The final delta curve captures the envelope of sensitivity to sunk costs, measuring the relative magnitude of change in wait zone behavior, normalized to each animal’s likelihood of quitting in the 0 s sunk condition.
Note the rising phase of this curve captures meaningful sunk cost sensitivity while the falling phase of this curve diminishes merely due to ceiling effects of data samples used at high time spent conditions seen in the first-order non-dimension reduced analysis plots. Peaks were extracted from this delta curve for summary statistical comparisons between groups and tested against 0 (no sensitivity to sunk costs). Additional analyses of the effects of different forms of time spent on the probability of staying in the wait zone during the countdown, in contrast to sunk costs accrued in the wait zone, were also examined, including time spent deliberating in the offer zone before accepting an offer or time elapsed since last reward earned before accepting an offer. In both of such cases, the read-out metric of the probability of earning a reward upon arrival in the wait zone was calculated. These metrics of choice outcome are distinct from other metrics of choice latency, such as quit reaction time.
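The logic of the observed-minus-control comparison can be illustrated with a deliberately simplified Python sketch. It is not the authors' sliding-window survival analysis: it assumes whole-second offers, treats a trial as still "at risk" at a given time spent if the animal stayed at least that long, and averages over time left with equal weight. All names are hypothetical.

```python
import numpy as np


def p_earn(offers, stay_s, earned, spent, left):
    """P(earn) among trials with offer == spent + left still waiting at `spent` s."""
    at_risk = (offers == spent + left) & (stay_s >= spent)
    return earned[at_risk].mean() if at_risk.any() else np.nan


def sunk_cost_delta(offers, stay_s, earned, max_offer=30):
    """Observed minus control P(earn) for each time-spent condition.

    The control re-uses the 0 s time-spent condition, restricted to the same
    range of time left that exists for the given time-spent condition.
    """
    offers = np.asarray(offers)
    stay_s = np.asarray(stay_s, dtype=float)
    earned = np.asarray(earned, dtype=float)
    delta = {}
    for spent in range(1, max_offer):
        lefts = range(1, max_offer - spent + 1)
        obs = np.array([p_earn(offers, stay_s, earned, spent, l) for l in lefts])
        ctl = np.array([p_earn(offers, stay_s, earned, 0, l) for l in lefts])
        ok = ~np.isnan(obs) & ~np.isnan(ctl)
        delta[spent] = float(np.mean(obs[ok] - ctl[ok])) if ok.any() else np.nan
    return delta
```

In this simplified form, a positive delta at a given time spent means that animals with that much time already invested were more likely to earn than animals with the same time left but nothing yet invested; the peak of the delta values would correspond to the summary statistic tested against 0.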
Post-consumption behaviors were measured as time spent consuming and lingering at the feeding site after earning a reward from reward delivery onset until exiting the wait zone. Transit times between restaurants were measured from trial termination (i.e., skip onset, quit onset, or post-earn wait zone exit onset) until mice arrived at the next restaurant’s offer zone choice point, triggering the next trial onset. Inter-earn-intervals were calculated as the amount of time elapsed between subsequent earns either of any flavor (i.e., time elapsed to any next earn of any flavor following an earn in any restaurant) or specifically of a same, given flavor (i.e., time elapsed to the next earn of a specific flavor following an earn in that same restaurant, ignoring other intervening earns), indicated in each corresponding figure.
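The two inter-earn-interval definitions differ only in whether intervening earns of other flavors are ignored. The short Python sketch below illustrates both, assuming earn timestamps (in seconds, in session order) and a flavor label per earn; the function name is hypothetical.

```python
def inter_earn_intervals(earn_times_s, earn_flavors, same_flavor=False):
    """Time between successive earns, of any flavor or of the same flavor."""
    if not same_flavor:
        # Interval to the next earn regardless of restaurant.
        return [t2 - t1 for t1, t2 in zip(earn_times_s, earn_times_s[1:])]
    last_seen = {}   # most recent earn time per flavor
    intervals = []
    for t, flavor in zip(earn_times_s, earn_flavors):
        if flavor in last_seen:
            intervals.append(t - last_seen[flavor])
        last_seen[flavor] = t
    return intervals
```

For earns at 10, 40, 100, and 160 s alternating between two flavors, the any-flavor intervals are 30, 60, and 60 s, while the same-flavor intervals are 90 and 120 s.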
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Results
STZ treatment induces chronic hyperglycemia as a mouse model of diabetes
In order to generate a mouse model of diabetes, 45 C57BL/6J male mice were treated with intraperitoneal injections for 5 consecutive days of either Hank’s buffered saline solution (HBSS) as the vehicle (VEH, n = 20) control group or low-dose streptozotocin (STZ, 50 mg/kg, n = 25), which ablates insulin-producing beta cells of the pancreas33 resulting in sustained hyperglycemia in the majority of treated mice (Fig. 1a). Body weight and morning fasting blood glucose levels sampled from the tail vein were monitored weekly over the next 7 weeks to allow for the effects of chronic hyperglycemia to incubate while mice remained group-housed with ad libitum access to regular chow. We found that treatment with STZ induced a robust and long-lasting increase in blood glucose levels (week x treatment, blood glucose: F = 73.841, p < 0.0001) accompanied by a transient decrease in body weight that briefly delayed subsequent weight gain compared to VEH-treated mice (week x treatment, weight: F = 19.237, p < 0.0001; Fig. 1b, c). Prior to starting the behavioral task, glycated hemoglobin (HbA1c) was measured to confirm diabetic phenotype. We selected the 20 mice with the highest HbA1c and found STZ-treated mice displayed significantly elevated HbA1c levels (t-test, t = 9.96, p < 0.0001), confirming that these animals were chronically hyperglycemic compared to VEH-treated mice (Fig. 1d). Next, mice were single-housed and food-restricted to between 80–85% of free feeding weight as they began longitudinal testing on the Restaurant Row paradigm.
STZ-treated mice are able to acquire the basic structure of the Restaurant Row task
During the first week of behavioral testing on the Restaurant Row task, all reward offers cost only 1 s (Fig. 2a, b). During this period, all mice quickly acquired directionality of the task with no differences in number of laps run in the correct counterclockwise direction between STZ- and VEH-treated mice (t = 1.24, p = 0.223, Fig. 2c). We also found no differences in travel time when running in the hallways between restaurants among STZ- and VEH-treated mice (t = 0.17, p = 0.863, Fig. 2d). Despite learning to run in the correct direction at the same rate and stabilizing at a similar number of laps, STZ-treated mice earned modestly fewer total pellets than VEH-treated mice at the end of the first week of behavioral testing (t = 3.53, p < 0.01, Fig. 2e, f). This effect negatively correlated with HbA1c levels (p = 0.027) but did not correlate with body weight (p = 0.147) in STZ-treated mice, reflecting a change in foraging behavior that is likely mediated by their hyperglycemic profile (Supplementary Fig. 1a–d).
STZ-treated mice display altered flavor-specific meal consumption patterns
Upon closer examination of rewards earned during the first week, we found that differences in food intake between STZ- and VEH-treated mice were largely driven by disproportionately fewer earnings for less preferred flavors, stabilizing by day 7 (Fig. 2g). These data indicate STZ-treated mice exhibited changes in revealed preferences that were asymmetrically skewed rather than globally shifted among flavors. To characterize meal consumption patterns, we next calculated earnings across the 45 min session segregated into 2.5 min bins (Fig. 3a). Overall, we found that both groups earned fewer rewards across the session (Fig. 3b), reflecting a measure of within-session satiety (main effect of session time on total earnings across groups: F = 142.022, p < 0.0001). STZ-treated mice, however, displayed this decrease to a greater degree, despite early in the session earning amounts of food similar to VEH-treated mice in each restaurant (Fig. 3b, c; treatment x session time: F = 12.469, p < 0.01; Fig. 3d; midsession: t = 3.13, p < 0.01, Supplementary Fig. 1e–h).
a Timeline. Dashed box indicates relevant time period for this figure: day 7. b, c Rewards earned within each 2.5 min bin across the session (b) in total earnings or (c) earnings split by flavor ranking. * in black near the VEH / STZ legend labels represents significant differences between VEH- and STZ-treated groups collapsing across time. * in black near individual time bins represents significant differences between groups at those times. Blue arrows indicate a main effect of time (e.g., satiety related changes across the session), with significance reported in blue. d Cumulative total rewards earned summed across the session. e Percentage of total session rewards earned split by flavor. f Tail vein blood glucose levels sampled immediately before and after day 7’s session. g Change in blood glucose from f post minus pre task. h Scatter plot of change in blood glucose from g against change in body weight measured immediately before and after day 7’s session. Gray dashed lines indicate 0 on both axes. Scatter plot of pre-task blood glucose from f against day 7’s end-of-session earns for i least and j most preferred flavors. Dots represent individual mice. Error bars represent ±1 SEM. Shading represents 95% confidence interval of linear fit.
When examining within-session meal patterns by restaurant, we found that decreased earnings across the session were largely driven by the most preferred flavor for both the VEH- and STZ-treated groups (flavor x session time: F = 74.690, p < 0.0001, Fig. 3c, Supplementary Fig. 1e–h). That is, on day 7, satiety-related effects were predominately observable in the most preferred restaurant. Thus, earnings for less preferred flavors remained relatively flat within session, albeit overall downshifted in STZ-treated mice (Fig. 3c). When calculating meal patterns normalized to total food earned, we found that STZ-treated mice consumed more of their day’s proportion of the most preferred flavor earlier in the session and prioritized over other flavors compared to VEH-treated mice (first time bin: treatment x rank: F = 5.317, p < 0.01; Fig. 3e, Supplementary Fig. 1e–h). These data suggest that elements of satiety, which may be more prominent after STZ treatment, interact with more intricate aspects of value-based decision-making and goal-directed behavior while foraging.
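As an illustration of how within-session meal patterns of this kind can be computed, the following Python sketch bins earn times from a 45 min session into 2.5 min bins (18 bins) and expresses each bin as a share of the session total. The function names and the input format (earn times in seconds from session start) are assumptions made for this example.

```python
def bin_earns(earn_times_s, session_s=45 * 60, bin_s=150):
    """Count rewards earned in each 2.5 min (150 s) bin of the session."""
    n_bins = int(session_s // bin_s)
    counts = [0] * n_bins
    for t in earn_times_s:
        if 0 <= t < session_s:  # ignore timestamps outside the session
            counts[int(t // bin_s)] += 1
    return counts


def share_of_total(counts):
    """Normalize binned counts to the total earned in the session."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]
```

Applying `bin_earns` separately to the earn times of each restaurant would give flavor-specific patterns, and `share_of_total` corresponds to normalizing to the total food earned.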
To examine to what extent hyperglycemia might contribute to these effects, we measured body weight and blood glucose pre- and post-task. As expected, blood glucose increased post-task in both groups (main effect of time point pre vs. post task: F = 23.915, p < 0.0001), with STZ-treated mice remaining hyperglycemic despite being food-restricted compared to VEH-treated mice (main effect of treatment: F = 36.666, p < 0.0001) (Fig. 3f, g). Change in blood glucose positively correlated with change in body weight in STZ- but not VEH-treated mice (VEH: p = 0.6, STZ: p = 0.018), consistent with dysregulation of glucose homeostasis in response to a meal (Fig. 3h). Pre-task blood glucose negatively correlated with earns for both least (p = 0.008) and most preferred (p = 0.005) flavors in STZ-treated mice on day 7 (Fig. 3i, j). Next, to investigate dissociable valuation processes that may be altered in these mice, we characterized how economic decision strategies developed as task complexity and environmental demand increased.
Mice learn to forage in a stepwise or gradually increasingly reward-scarce environment
A feature of the Restaurant Row task is the longitudinal nature of its closed-economy system in which behavior is interdependent across days as animals work for their primary source of food. Increasing the distribution of offer costs while animals remain on a fixed, limited time budget can elicit an economic challenge9. Because such a challenge can be metabolically demanding, particularly for mice that may have differing energetic needs such as these STZ-treated mice, we experimentally manipulated the rate of change of reward scarcity in the environment in two ways. After the first week of testing during which all offers remained at 1 s only (epoch 1: a relatively reward-rich environment), groups of mice were split and advanced to the next stage of testing where the range of offers available in the task environment increased via one of two schedules: (i) stepwise (high contrast) or (ii) gradual (low contrast, Fig. 4a, Supplementary Fig. 2). These schedules elicited a relative decrease in earned rewards that was either stepwise or gradual, respectively, across days 8–22 similarly for both STZ- and VEH-treated mice (main effect of schedule: F = 52.500, p < 0.0001; no main effect of treatment: F = 0.911, p = 0.342; Fig. 4b, c). Both schedules yielded matched earnings and overall reinforcement rate (i.e., time elapsed to any next earn of any flavor following an earn in any restaurant) across groups of mice at each stepwise transition point and were equivalent by day 22 and beyond (treatment x schedule, % baseline earns after transition points: F = 0.937, p = 0.335; treatment x schedule, inter-earn interval after transition points: F = 0.051, p = 0.821; Fig. 4b, d). 
This allowed us to examine, ceteris paribus, not only how mice adapted choices to different rates of changing environments between days 8–22 but also how the experience and salience or contrast of the prior environmental change might bolster different valuation strategies once in the same, final reward-scarce environment after day 22.
a Timeline. Dashed box indicates time period relevant for this figure: entire 7-week Restaurant Row paradigm. b Total rewards earned each day normalized to the average of days 5–7 earnings (termed 100%, horizontal dashed green line). Visual guidance color bars along the x-axis reflect the experimental schedules (stepwise or gradual) illustrated in the timeline in a. Vertical dashed green-yellow-red lines indicate the transition points of the stepwise schedule (or matched days of the gradual schedule) and are re-used throughout all other figures as a visual aid. Top: VEH; bottom: STZ. c Change in b at each transition point of the stepwise schedule (or matched days of the gradual schedule): d7–8 (1 s only to 1–5 s offers, green to yellow), d14–15 (1–5 s to 1–15 s offers, yellow to orange), and d21–22 (1–15 s to 1–30 s offers, orange to red). Horizontal dashed gray line indicates 0 change. d Time elapsed between subsequent earns of any flavor. e Within-session cumulative earnings summed across 2.5 min bins normalized to end-of-session earnings for each restaurant. Data collapsed across the entire 1–30 s epoch (days 22–49, red). Horizontal dashed gray line indicates 50% of each flavor’s meal consumed. Dashed square box highlights mid-session data plotted within inset figure showing ranking spread (left) and a summary difference score (right) between most minus least preferred restaurants to depict whether favorite flavors are consumed relatively sooner (+) or later (−) in the session. Horizontal black line indicates 0 difference. f Time elapsed between subsequent earns of the same flavor split by ranking following an earn in the same restaurant (ignoring intervening earns). Data collapsed across either the 1 s only (green) or 1–30 s epoch (red). Shading / error bars represent ±1 SEM.
When examining meal consumption in the 1–30 s reward-scarce environment split by restaurant, we found that STZ-treated mice displayed inverted meal patterns across the session and prioritized consuming more of their favorite flavor sooner compared to VEH-treated mice regardless of schedule history (main effect of treatment: F = 25.326, p < 0.0001; Fig. 4e, Supplementary Fig. 3a–h). When calculating reinforcement rate by restaurant (i.e., time elapsed to the next earn of a specific flavor following an earn in that same restaurant, ignoring other intervening earns), we found a significantly higher latency between earns in STZ-treated mice compared to VEH-treated mice only after experiencing the gradual but not stepwise schedule that predominately affected less preferred flavors (treatment x rank: stepwise: F = 0.752, p = 0.525; gradual: F = 9.779, p < 0.0001; Fig. 4f). These data suggest environmental history interacts with valuation strategies uniquely altered by STZ.
STZ treatment alters economically distinct decision-making policies
To characterize economic decision patterns, next, we calculated the likelihood of selecting choices as a function of offer cost in both the offer zone (enter vs. skip) and wait zone (earn vs. quit) of each restaurant. We found that all mice were capable of discriminating tone pitch and could adhere to the basic economic structure of the Restaurant Row task, whereby interactions between choice outcomes and offer cost scaled with flavor preferences (Supplementary Fig. 3i–l). In order to approximate economic thresholds, or indifference points, of willingness to accept and earn rewards, we fit a Heaviside-step regression to choice outcomes as a function of cost and measured curve inflection points each day across the entire experiment (Fig. 5a–d). This analysis revealed a large discrepancy in economic thresholds between offer zone and wait zone decision policies upon transitioning into the 1–30 s reward-scarce environment (see Supplementary Fig. 3m–p for examination of thresholds between days 8–22). Overall, offer zone thresholds were significantly higher than wait zone thresholds, indicating mice were likely to accept offers more expensive than they were actually willing to wait for and earn (offer zone >wait zone thresholds: F = 74.715, p < 0.0001). This discrepancy captures a metric of economic choice conflict between zones and was greater for more preferred flavors. We discovered altered economic thresholds between STZ- and VEH-treated mice, but only in those tested on the gradual and not stepwise schedule and whose threshold direction of change depended on both the restaurant and zone type (stepwise: offer zone treatment x rank: F = 0.559, p = 0.645; wait zone treatment x rank: F = 0.572, p = 0.635; gradual: offer zone treatment x rank: F = 4.354, p < 0.01; wait zone treatment x rank: F = 3.390, p < 0.05; Fig. 5b, d). Offer zone thresholds of less preferred flavors were lower in STZ- compared to VEH-treated mice (Fig. 5d). 
Wait zone thresholds of more preferred flavors were higher in STZ- compared to VEH-treated mice (Fig. 5b). These data indicate complex changes in distinct stages of the decision process within-trial – willingness to accept (offer zone) vs. willingness to wait (wait zone) – that are differentially altered in STZ-treated mice.
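To make the threshold estimate concrete, the sketch below fits a Heaviside step to binary choice outcomes as a function of offer cost. The text does not specify the fitting procedure, so this example assumes a least-squares fit over candidate thresholds, which for binary outcomes is equivalent to minimizing the number of misclassified trials; the function name and input format are hypothetical.

```python
def fit_step_threshold(costs, outcomes):
    """Estimate the cost threshold of a Heaviside step fit.

    costs: offer delays in seconds, one per trial.
    outcomes: 1 for accept (or earn), 0 for skip (or quit).
    The step predicts 1 below the threshold and 0 at or above it.
    Candidate thresholds are midpoints between adjacent tested costs,
    plus one candidate beyond each end of the tested range.
    """
    unique = sorted(set(costs))
    candidates = [unique[0] - 0.5]
    candidates += [(a + b) / 2 for a, b in zip(unique, unique[1:])]
    candidates.append(unique[-1] + 0.5)

    def misclassified(theta):
        return sum((1 if c < theta else 0) != o
                   for c, o in zip(costs, outcomes))

    # Ties are resolved in favor of the lowest candidate threshold.
    return min(candidates, key=misclassified)
```

Fitting this separately to offer zone outcomes (enter vs. skip) and wait zone outcomes (earn vs. quit) for each restaurant would yield the two thresholds whose difference is described above as economic choice conflict between zones.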
a–d Offer zone (top) and wait zone (bottom) thresholds plotted (a, c) each day across the entire Restaurant Row paradigm or (b, d) split by flavor ranking collapsed across the 1–30 s epoch. Green (offer zone) and blue (wait zone) dashed staircase represent the maximum possible threshold for stepwise (top) or gradual (bottom) schedules. e, f Offer zone and wait zone thresholds within each 2.5 min bin across the session (data collapsed from entire 1–30 s epoch). Both schedules collapsed in e depicting all VEH- (top) or STZ-treated mice (bottom). Horizontal shaded bands represent offer zone threshold of 30 (green) or wait zone threshold of 9.s-13.5 (blue), corresponding to thresholds representing a strategy that would yield the maximum amount of total food as one type of optimal strategy, should animals ignore flavors, that was previously theoretically and empirically determined9. Analysis from e replotted and split by least and most preferred flavors in f for both schedules collapsed (left) or for all groups splitting flavor rankings (right). g Standard deviation of offer zone and wait zone thresholds calculated across the four flavors within each 2.5 min bin. h Standard deviation of the number of rewards earned calculated across the four flavor rankings within each 2.5 min bin. Shading in (a, c)/error bars represent ±1 SEM.
By fitting decision thresholds restricted to sliding windows of time across the session, we found that VEH-treated mice displayed a fundamentally distinct decision-making profile compared to STZ-treated mice. Overall, offer zone thresholds significantly decreased across the session while wait zone thresholds remained relatively unchanged in VEH-treated mice (VEH: main effect of time on offer zone threshold: F = 12.695, p < 0.001, but not wait zone threshold: F = 0.039, p = 0.843; Fig. 5e). Whereas in STZ-treated mice, the opposite and inverse was true: offer zone thresholds remained relatively unchanged while wait zone thresholds significantly increased (STZ: main effect of time on wait zone threshold: F = 50.312, p < 0.0001, but not offer zone threshold: F = 3.577, p = 0.060; Fig. 5e). When segregating this analysis by restaurant, complex profiles emerged that were more dynamic with respect to flavor in VEH- compared to STZ-treated mice (Fig. 5f). In VEH-treated mice, offer zone thresholds for all flavors started at ~30 s and decreased across the session, predominantly for less preferred flavors (VEH: offer zone rank x time: F = 7.241, p < 0.0001). Wait zone thresholds for all flavors started at ~10 s and bifurcated across the session (VEH: wait zone rank x time: F = 22.096, p < 0.0001). In contrast, STZ-treated mice initially displayed flavor-disparate offer zone thresholds that were relatively stable across the session whereas wait zone thresholds, while also initially flavor-disparate, increased in all restaurants (STZ: offer zone rank x time: F = 3.518, p < 0.05, wait zone rank x time: F = 103.712, p < 0.0001). These distinct profiles, summarized by taking the standard deviation of thresholds among restaurants across the session, were more different between VEH- and STZ-treated mice tested on the gradual than stepwise schedule (treatment x schedule: offer zone: F = 56.443, p < 0.0001, wait zone: F = 9.031, p < 0.01; Fig. 5g). 
These data explain the decision-making policies behind meal consumption patterns across the session: VEH-treated mice focused first on securing food instead of exacting preferences by accepting all offers and waiting the optimal food-maximizing threshold in the wait zone (as we have previously characterized), before subsequently relaxing policies9,34; STZ-treated mice were skewed toward investing in more expensive offers to earn disproportionately more higher-preferred flavors at the expense of yielding less food overall as a consequence (main effect of treatment: F = 60.858, p < 0.0001; Fig. 5h, Supplementary Fig. 3i–p, recall Fig. 4e, f). These findings suggest STZ-altered sensitivity to economic choice competes with basic food security needs and depends on (i) the subjective value of the reward target at hand [flavor], (ii) the decision algorithm engaged [offer zone vs. wait zone], and (iii) the salience or contrast of reward scarcity in the environment derived from one’s prior training history [gradual vs. stepwise schedules].
STZ-treated mice reveal diminished economic choice conflict
In addition to capturing economic conflict in the form of offer zone vs. wait zone decision policies, we measured a separate form of conflict within-trial during the decision process itself. We quantified behavioral path trajectories as mice traversed through the offer zone en route to making a skip or enter decision (Fig. 6a). Video-tracked body positions during the pass through the offer zone choice point can be transformed into absolute integrated angular velocity – a metric of hesitation or physical “hemming and hawing” known as vicarious trial and error (VTE) behavior44,45,46. VTE behavior has been previously shown to correlate with alternating neural representations of deliberation between competing choice options47,48. Overall, skip decisions elicited greater VTE behavior in the offer zone compared to enter decisions (main effect of offer zone outcome skip > enter: F = 121.070, p < 0.0001; Fig. 6b). The size of this discrepancy scaled with the ordinal ranking of flavor preferences (offer zone outcome x rank: F = 50.263, p < 0.0001, treatment: F = 5.876, p < 0.05; Fig. 6c, Supplementary Fig. 4). These data indicate mice demonstrated an aversion to skip in the offer zone that grows stronger with subjective flavor preferences. We discovered STZ-treated mice, particularly those previously tested on the gradual schedule, displayed diminished conflict that was more pronounced for less preferred rewards (Fig. 6c, Supplementary Fig. 4). We also found that this difference is related to but only partially explains the discrepancy between offer zone and wait zone thresholds (Supplementary Fig. 4). These data indicate that within-trial conflict during the choice process itself in the offer zone captures a unique aspect of reward valuation during planning behaviors altered by STZ treatment.
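The VTE metric can be sketched in Python as follows. This is a minimal illustration that assumes tracked centroid positions sampled at a fixed frame rate; the function names are hypothetical, and any smoothing of the tracked positions that a full pipeline might apply is omitted.

```python
import math
from statistics import mean, pstdev


def integrated_abs_angular_velocity(xs, ys):
    """Sum of absolute frame-to-frame changes in heading (radians).

    With a fixed frame interval dt, integrating |dphi/dt| over time
    reduces to summing |dphi|, so dt does not appear explicitly.
    """
    headings = []
    for i in range(len(xs) - 1):
        dx, dy = xs[i + 1] - xs[i], ys[i + 1] - ys[i]
        if dx == 0 and dy == 0:
            continue  # heading is undefined when the animal has not moved
        headings.append(math.atan2(dy, dx))
    total = 0.0
    for h0, h1 in zip(headings, headings[1:]):
        # wrap the heading change into [-pi, pi) before taking its size
        dphi = (h1 - h0 + math.pi) % (2 * math.pi) - math.pi
        total += abs(dphi)
    return total


def zscore(values):
    """Normalize values (e.g., all trials of one mouse on one day)."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]
```

A straight pass through the choice point yields a value near 0, whereas each reorientation adds to the total, so trials with multiple reorientation events score high.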
a Example video-tracked body centroid positions in one restaurant from a single mouse on a single day. All tracked positions in black, skip decisions in red (top), enter decisions in green (bottom), tracked from offer onset upon entering the T-shaped choice point until crossing either boundary (skip: leftward hallway entry; enter: rightward wait zone entry). High vicarious trial and error (VTE) trials capture multiple reorientation events in the offer zone. b VTE behavior across the entire Restaurant Row paradigm. c VTE behavior, first normalized to all trials on a given day for a given mouse, then collapsed across the entire 1–30 s epoch and split by flavor rankings. Horizontal dashed gray line represents z-score of 0. Shading/error bars represent ±1 SEM.
STZ-treated mice carry altered mental accounts of time
Next, we characterized the different ways in which mice value the passage of time on the Restaurant Row task. First, we examined ongoing choice processes in the wait zone by measuring the amount of time elapsed during the countdown before making a quit decision. When offer zone thresholds are higher than wait zone thresholds, this generally results in mice being more likely to quit accepted offers in the wait zone but does not capture how long it takes to quit. Overall, we found that all STZ-treated mice spent significantly more time waiting before deciding to quit compared to VEH-treated mice, regardless of prior schedule (Supplementary Fig. 5a, c). This was driven by trials in which STZ-treated mice accepted long-delay but not short-delay offers (Supplementary Fig. 5e, f). Furthermore, increased latencies to quit scaled with flavor in VEH- but not STZ-treated mice (Supplementary Fig. 5b, d). These data suggest a blunted relationship between subjective valuation and select elements of the choice process in STZ-treated mice that may be distinct from other forms of time spent on the task (Supplementary Figs. 5–7).
The majority of quit trials may occur on long-delay offers simply because mice have more opportunity to quit. However, as we have previously characterized, there is more value structure to these quit decisions, which efficiently correct offer zone mistakes in a time-saving manner (Supplementary Fig. 5k–o)3,5,11,36. What is more, quit events comprise a unique economic dilemma in which the longer an animal takes to decide to quit, the closer it may be to earning a reward. This highlights a conflict in deciding whether to continue waiting, encompassing a well-studied cognitive phenomenon known as the sunk cost bias. This describes the tendency to escalate commitment to an ongoing investment due to an accumulation of irrecoverable losses that, according to classic economic theory, should be ignored. We previously developed a dynamic analysis capable of extracting the influence of time spent in the wait zone on the likelihood of quitting that is orthogonal to temporal distance to the goal11,36. First, each quit event was binned into [time-spent, time-left] pairs (Supplementary Fig. 5k). From this, we calculated the probability of earning a reward using a sliding-window survival analysis as animals continuously reevaluated staying in the wait zone (Fig. 7a–c, Supplementary Fig. 8a–g).
a–c Sunk cost analysis of staying behavior in the wait zone, demonstrated using all mice. a The likelihood of staying in the wait zone and earning a reward (i.e., not quitting) is plotted as a function of time left in the countdown along the x-axis and time already spent waiting orthogonally in color. Note the black 0 s time spent curve represents animals having just entered the wait zone from the offer zone. Inset vertical dashed gray line illustrates an example analysis point comparing three sunk cost conditions originating from different starting offers but matched at 10 s left. b Data from a, dimension-reduced by collapsing across time left to highlight the grand mean of each time spent sunk cost condition (color and x-axis). Insets depict how data from curves in a are collapsed into the observed (sunk condition) and control (0 s condition) lines. c Difference between curves in b, plotted to summarize the envelope of the overall effect of time already spent on escalating the commitment of staying in the wait zone. Horizontal dashed line represents 0. d Sensitivity to sunk costs plotted across the entire Restaurant Row paradigm, where the delta curve in c is plotted as vertical slices with time already spent along the y-axis and heatmap representing magnitude. Note no sensitivity to sunk costs until after transitioning to the 1–30 s epoch in all mice, and never in STZ-gradual mice. Delta curves collapsed across the entire 1–30 s epoch for e stepwise or f gradual schedules. Dashed gray box highlights peak differences. g Peaks from e and f for all groups. h Peaks split by flavor ranking. Shading/error bars represent ±1 SEM.
Consistent with our previous reports, we found that, in VEH-treated mice, the probability of earning a reward significantly increased as a function of time already waited (i.e., sunk costs, isolated from time left) but only after transitioning to the 1–30 s reward-scarce environment (VEH: stepwise: F = 76.544, p < 0.0001; gradual: F = 292.004, p < 0.0001; Fig. 7d, e). In STZ-treated mice, we found that sensitivity to sunk costs depended on the history of their prior environment schedule. STZ-treated mice that previously experienced the stepwise schedule displayed robust sensitivity to sunk costs (STZ: stepwise: F = 142.921, p < 0.0001; Fig. 7d, e). However, STZ-treated mice tested previously on the gradual schedule displayed diminished sensitivity to sunk costs that was abolished across the entire experiment (treatment x schedule: F = 7.759, p < 0.01; Fig. 7d, f–h, Supplementary Fig. 9a–c).
The fact that sunk cost sensitivity emerges only after the transition to the 1–30 s reward-scarce environment indicates that this valuation process may depend on enhanced appetitive motivational levels driven by lower food availability, provided individuals have intact systems capable of sensing this environmental change. Such a change may be more accessible to the individual in an environment that changes with higher contrast. VEH-treated mice remain attuned to this level of reward scarcity regardless of prior schedule. While such a process may be impaired by STZ treatment, the stepwise schedule appears to preserve sunk cost valuations in STZ-treated mice through its higher contrast in environmental change compared to the gradual schedule. Thus, we predicted that decreasing the relative scarcity of the environment by reducing food pressure ought to diminish sunk cost valuations, as this may be a driver of blunted sensitivity in STZ-gradual mice. To test this without changing the distribution of offers in the environment, we restored ad libitum access to food in the home cage for all animals for an additional 5 days and continued testing on Restaurant Row. In addition to the expected decreases in laps run, pellets earned, and travel time (main effect of ad libitum vs. restricted: laps: F = 41.383, p < 0.0001; earns: F = 45.597, p < 0.0001; Fig. 8a–c, Supplementary Fig. 10a, d), we found differential changes in thresholds. Offer zone thresholds did not change in any group (F = 1.725, p = 0.190; Fig. 8d), nor did offer zone VTE behavior (effect of treatment on enter: F = 0.346, p = 0.557, or skip: F = 0.791, p = 0.375; Fig. 8f), whereas wait zone thresholds decreased in VEH- but increased in STZ-treated mice, regardless of prior schedule (F = 21.027, p < 0.0001; Fig. 8e, Supplementary Fig. 10b). Note that this observation is similar to that observed in the second half of the session when food restricted (Fig. 5), and likely reflects the distinct reward-based decision-making policies when sated. Lastly, we found that sensitivity to sunk costs remained intact in VEH-treated mice but was abolished in STZ-treated mice regardless of prior schedule experience (main effect of treatment: F = 15.045, p < 0.001; Fig. 8g, Supplementary Fig. 10e). This reflects a core phenotype of altered mental accounting unique to STZ-treated mice revealed by perturbing the global energy state of the individual, not seen in VEH-treated mice.
Number of laps run in the correct direction (a) and total rewards earned (b) comparing between when mice were food restricted vs. had unlimited access to regular chow in the home cage (purple outlines). c Rewards earned split by flavor rankings. Offer zone thresholds (d) and wait zone thresholds (e) split by flavor ranking. Note no change in offer zone thresholds but a bidirectional change in wait zone thresholds (VEH: decreased; STZ: increased). f Normalized vicarious trial and error (VTE) behavior split by flavor ranking and by skip vs. enter offer zone outcomes. g Sensitivity to sunk cost delta curves plotted as a function of time already spent in the wait zone. Note abolished sunk costs in both STZ-treated groups regardless of schedule, in spite of no change in offer zone thresholds and similar shifts in wait zone thresholds. Insets depict peak scores. Horizontal dashed gray line represents 0. Shading/error bars represent ±1 SEM.
Discussion
To explore the effect of impaired glucose homeostasis on reward processing, we characterized complex neuroeconomic decision-making behavior on the Restaurant Row paradigm in chronically hyperglycemic mice using the STZ diabetic mouse model (Fig. 1). We tested mice longitudinally across months in a closed-economy system. We found that both VEH- and STZ-treated mice learned to respond to the changing economic landscape of rewards distributed throughout the environment as a function of cost in the form of delays required to earn food and subjective value in the form of flavors. Thus, cognitive impairments that may be associated with diabetes did not prevent task acquisition. However, STZ treatment did result in fewer total earns on the task compared to VEH-treated mice and the number of earned rewards correlated with HbA1c levels (Fig. 2). This change in rewards earned was attributed to skewed flavor preferences, with STZ-treated mice being biased toward their preferred flavors, accepting most offers regardless of cost and ignoring less preferred flavors even when it was economically advantageous to choose those offers (Figs. 3 and 4). In contrast, VEH-treated mice optimized flavor preferences with offer cost, especially early in the session, effectively maximizing food earnings in this closed-economy system. What is more, these outcomes derived from changes in only certain types of choices (Figs. 5 and 6) and influenced the value of the passage of only certain categories of time (Fig. 7), consistent with our hypothesis that computation- and decision-specific aspects of reward processing are altered in diabetes beyond global motivational changes. By incorporating two different training schedules, we discovered that hyperglycemia-related changes in behavior depended on one’s prior experience with how rapidly the environment changed from reward-rich to reward-scarce. 
We found that STZ-induced alterations in behavior were more pronounced when this transition was gradual or when mice were sated (Fig. 8), including loss of cognitive heuristics such as sensitivity to sunk costs. These findings suggest that diabetes is capable of altering dissociable forms of mental accounting as a result of a complex interaction between glycemic regulation, energy demand, multiple decision-making algorithms, and the salience of reward scarcity in the environment.
Given the complex and multifactorial changes in dissociable decision-making behaviors observed on the Restaurant Row task in diabetic mice, we can begin to point to multiple circuit-computation-specific valuation algorithms that may be uniquely disrupted in diabetes. Separate behaviors in the offer zone and wait zone reflect fundamentally distinct types of choices6,49. In the offer zone, mice choose between accepting and rejecting an offer whose cost is cued but toward which no time has yet been invested. In the wait zone, by contrast, mice are continuously re-evaluating commitment to an ongoing investment as they temporally approach earning a reward. Here, we found that STZ-treated mice displayed lower offer zone thresholds for less preferred flavors and higher wait zone thresholds for most preferred flavors than VEH-treated mice (Fig. 5), consistent with the observation that earned rewards were skewed toward the most preferred flavor in STZ-treated mice and suggesting that decision-making policies are distinct in STZ-treated mice. Indeed, when we examined behavior over a single session, we found that offer zone but not wait zone thresholds decreased in VEH-treated mice as they became sated, whereas STZ-treated mice displayed increasing wait zone but not offer zone thresholds, which was more pronounced on the gradual training schedule. We and others previously demonstrated that offer zone decisions and wait zone decisions access computationally distinct functions of physically separable circuits in the brain5,49. The cognitive mechanisms underlying offer zone behaviors have been previously linked to deliberative decision-making processes shown to engage circuits that support prospective thinking50,51. Hippocampal recordings during such choices are capable of decoding alternating representations of competing actions that sweep ahead of the animal through the choice point leading to potential future goal locations44,45,46,47,48.
Failure to engage in such processes can cause individuals to rely on other Pavlovian or procedural memory systems1,34,52,53,54,55. Chemogenetically inactivating the medial prefrontal cortex can disrupt synchrony with the hippocampus, impair hippocampal sequences in the offer zone, and reduce deliberative behaviors such as VTE50,54. In humans tested on translated variants of the Restaurant Row task, functional magnetic resonance imaging of the default mode network, including hippocampal and prefrontal regions, has been shown to decode current and future goal locations prior to making decisions in the offer zone56,57.
Conversely, the cognitive processes underlying wait zone behaviors are thought to capture change-of-mind decisions that have far less understood neural mechanisms58,59. Continuously re-evaluating an ongoing investment on Restaurant Row requires mice to wait patiently in a confined location while tracking the passage of time36. While waiting, animals lose irrecoverable time and have a longer window in which to reconsider opportunity costs in tandem with growing hunger states, all while their temporal distance to the goal decreases. Recent recordings of dopamine release in the striatum using fiber photometry during Restaurant Row have revealed differential signaling patterns of costs coded in the offer zone vs. wait zone60. Further, dopamine encoded perplexing negative reward-prediction-error-like signals time-locked to volitional change-of-mind quit decisions in the wait zone that could be disrupted optogenetically60,61. Altering synaptic strength of glutamatergic inputs into the striatum can also selectively disrupt wait zone behavior without influencing offer zone choices4. Together, the changes observed in offer zone and wait zone thresholds, VTE behavior, and wait zone quitting behavior of STZ-treated mice characterize a suite of alterations in information processing that moves beyond simple dysfunction of glucose metabolism in the periphery and suggests a more complex and dynamic dysregulation of multiple decision-making systems in the brain of those with diabetes.
Diabetes mellitus is fundamentally a disorder of energy utilization characterized by dysregulated glucose metabolism mediated through impairments in insulin signaling either as a consequence of deficient insulin production or insulin receptor resistance13,62. How diabetes may affect motivated behavior is multifactorial. Principally, increased circulating levels of glucose in the bloodstream and lower levels of glucose uptake in tissue throughout the body are capable of altering peripheral drivers of reward-seeking behavior as a result of changes in global energy levels, body muscle and fat composition, and bottom-up signals of hunger and satiety14,63. As a key example, inability to utilize glucose as a peripheral fuel source leads to fatty-acid oxidation and weight loss in STZ-treated rodents, with a subsequent reduction in circulating leptin levels that is thought to contribute to the hyperphagia associated with diabetes via a central mechanism64,65. Similar dysregulation of other hormones (e.g., cortisol, ghrelin) also likely contributes to altered reward behavior in diabetes63,66,67,68. Centrally, elevated circulating levels of glucose in the brain lead to reduced cerebral perfusion as a means to limit excessive glucose entry14,69,70. This results in decreased blood flow and oxygen delivery that could interfere with ongoing cognitive processes, which may contribute to differences in motivation or altered decision-making in diabetes17,71.
Consistent with reduced cerebral blood flow, reduced processing speeds on simple cognitive tasks have been reported in people with diabetes15,18. Diabetes in animals can impair novel spatial and object recognition as well as performance in Barnes maze, Morris water maze, and active-avoidance foot-shock tasks, aspects of which may be correctable with glycemic control72,73,74. Conversely, mildly increased glucose levels are associated with increased cognitive efficiency in real-world assessment of cognition in people with diabetes, supporting the idea that glycemic state is one of the internal factors that can affect decision-making. In line with this premise, recent evaluation of risky decision-making in people with diabetes also found that diabetes directly affects reward circuitry function, and the degree of impairment correlates with glycemic control. Using the Balloon Analogue Risk Task, it was found that people with Type 1 diabetes utilized a rigid, conservative, risk-averse strategy and displayed differences in activity of brain areas associated with inhibitory control that correlated with HbA1c75. This may be related to reports of impaired ability to discriminate ambiguous stimuli in the Iowa Gambling Task in people with diabetes76,77.
However, glucose is not the only fuel available to the brain; the brain can also utilize ketones, produced by breaking down fat, as a fuel source, although this switch in metabolism requires mitochondrial adaptations, some of which have been linked to changes in motivated behavior14,78,79. Normally, in the setting of chronic diabetes, ketones are not present in significant quantities in the blood unless there is illness or food is restricted, at which point ketone levels can rapidly rise, resulting in diabetic ketoacidosis and possibly death. To prevent this in our cohort, both VEH- and STZ-treated mice were provided with a small ration of food overnight; thus, it is unlikely that ketones are significantly driving the differences in decision-making observed here. More recent work has implicated glucose-independent roles of insulin function directly in the central nervous system25,26,27,28,29. Insulin is actively transported across the blood-brain barrier and has been shown to alter the function of the mesolimbic dopamine system29. Insulin receptors are present on ventral tegmental area dopaminergic neurons, as well as on neurons in the nucleus accumbens, where activation of the receptor by insulin is thought to suppress hedonic feeding62,80,81,82. The observation that insulin-deficient STZ-treated mice are biased toward most-preferred flavors is consistent with a role for insulin in suppressing hedonic feeding. Thus, the absence of central insulin in Type 1 diabetes may have profound consequences on dopaminergic function, either acutely by altering transient activity or across longer timescales by affecting plasticity and reward circuit remodeling. Such changes could influence not only reward-related behavior during meals but also dissociable aspects of decision-making information processing, giving rise to complex psychiatric disease vulnerabilities62,80.
In addition to understanding how diabetes may alter complex decision systems in the brain, interactions between an agent and one’s changing environment are critically important, especially when considering changing reward availability and metabolic demand83. We found that, by and large, most of the differences in decision-making behavior between STZ- and VEH-treated mice emerged only in those tested across an environment that grew increasingly reward-scarce on a gradual but not stepwise schedule. We previously demonstrated that the abruptness of the stepwise schedule serves as a naturalistic economic stressor capable of interacting with one’s prior history of stress and can alter subsequent decision-making strategies3,9. Thus, the gradual schedule may be less detectable and could dampen the perception of reward-scarcity in the environment compared to the stepwise schedule, despite ultimately arriving in the same environment. This concept captures the well-known economic framework of creeping normality, also known as the allegory of the “boiling frog,” in which behavioral responses are blunted when external challenges escalate gradually84. The stark differences in reward-scarcity in the stepwise schedule may be triggering a switch into an altered brain-body state that could pressure the mobilization of energy stores or activate alternative decision-making strategies in VEH- and STZ-treated mice alike. However, the gradual schedule may elicit a sub-threshold economic stress response phenotype that when combined with diabetic physiology reveals striking differences in foraging behavior that are otherwise masked when environmental circumstances change abruptly and instead overwhelm the system. This could be in part due to impaired glucose sensing in diabetes that is dependent on the salience of or contrast in reward availability in the environment – a dysfunction exaggerated in the gradual schedule.
This could also stem from decision-making algorithms that should switch in response to recognizing any change in the environment, regardless of its rate, contrast level, or consequence, but fail to do so in STZ-gradual mice only. Interestingly, we found that when placed back on ad libitum access to food, despite observing changes in foraging that would be expected when sated in all animals (e.g., decreased laps run and pellets earned), specific decision-making phenotypes that may be a core feature of diabetes emerged. Most notably, this manifested as abolished sensitivity to sunk costs in STZ-treated mice irrespective of prior testing schedule. So, why then does sensitivity to sunk costs emerge in animals at all?
Mice, like humans, are capable of valuing the passage of time in such a way that can inflate the value of continuing to invest in an ongoing endeavor11,36. This reflects the well-studied decision-making phenomenon termed the sunk cost fallacy; according to classic economic theory, sunk costs should be ignored when making re-evaluative choices85. We previously discovered that this valuation algorithm has been conserved across evolution. Among several theories, it has been postulated to arise from state-dependent valuation learning, a driver of this phenomenon that may be useful from an energetic standpoint11,36,41,42. This theory hypothesizes that time and energy expended in pursuit of a food reward could shift an individual into a poorer energy state41,42. As a result, the yet-to-be-earned food reward may have enhanced perceived value that can escalate the commitment of continued investments41,42. Practically, this may help the agent when survival is at stake. We and others have observed sensitivity to sunk costs only in a reward-scarce environment, suggesting that this within-trial state-dependent valuation learning explanation of sunk costs depends on task demand or enhanced contrast between the energy state of the individual and potential future reward states depending on the availability of rewards in the environment9,11,36,39,86. Here, we found that diabetic mice tested on the gradual schedule failed to demonstrate sensitivity to sunk costs despite being matched with other groups in several other behavioral metrics. This may be because of a blunted ability to register internal states with external circumstances due to impaired brain-body metabolic signaling. Importantly, this depended on their training history, one in which the economic landscape of the environment changed gradually.
A related theory, within-trial contrast, suggests that it is the ability to detect a relative difference between the state of the agent and the goal state that is permissive for sensitivity to sunk costs to manifest41. Thus, STZ-treated mice trained on a gradual, low-contrast schedule are likely impaired in their ability to detect the salience of sunk cost interactions between the agent’s state and their environmental demand, exacerbated by one’s metabolic dysfunction. This reveals a role for energetics as a key evolutionary driver of the sunk cost phenomenon that has been postulated but never tested. Further, by placing all mice back on ad libitum food access, thereby altering the global energetic state of the individual, we revealed a core phenotype of insensitivity to sunk costs in all STZ-treated mice, unlike VEH-treated mice who maintained intact sunk cost sensitivity regardless of training schedule or food restriction status. These findings suggest that discrete forms of mental accounting such as sensitivity to sunk costs can be uniquely perturbed in diabetes and provide more evidence for the metabolic and energetic drivers behind why cognitive biases such as this may have been conserved across evolution, constrained by our biology.
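As an illustration of how a sunk cost delta curve of the kind discussed above could be derived from wait zone behavior, the following is a minimal sketch and not the analysis code used in this study. It assumes, hypothetically, that each wait zone trial is summarized as an offer delay in whole seconds plus the number of seconds waited before quitting (or `None` if the reward was earned), and that sensitivity to sunk costs is expressed as the probability of earning given time already spent minus the probability of earning at the same time remaining with zero time spent. The function names, the data layout, and the 1-s binning are all assumptions made for this example.

```python
from collections import defaultdict


def earn_probabilities(trials):
    """Return {(time_spent, time_remaining): P(earn)} in whole seconds.

    trials: iterable of (delay, quit_after) pairs (hypothetical format).
    quit_after is None when the reward was earned, otherwise the number
    of whole seconds waited before the mouse quit.
    """
    reached = defaultdict(int)  # trials still waiting in each state
    earned = defaultdict(int)   # of those, trials that ended in a reward
    for delay, quit_after in trials:
        did_earn = quit_after is None
        last_spent = delay - 1 if did_earn else min(quit_after, delay - 1)
        for spent in range(last_spent + 1):
            state = (spent, delay - spent)
            reached[state] += 1
            earned[state] += did_earn
    return {state: earned[state] / reached[state] for state in reached}


def sunk_cost_delta(probs):
    """Return {time_spent: mean increase in P(earn) over the baseline}.

    The baseline for a state is P(earn) with the same time remaining but
    no time yet spent. States without a matching baseline are skipped.
    """
    diffs = defaultdict(list)
    for (spent, remaining), p in probs.items():
        baseline = probs.get((0, remaining))
        if spent > 0 and baseline is not None:
            diffs[spent].append(p - baseline)
    return {spent: sum(v) / len(v) for spent, v in sorted(diffs.items())}


# Toy data, invented purely for illustration.
toy_trials = [(1, None), (1, 0), (2, None), (2, 0), (4, None), (4, 2)]
probs = earn_probabilities(toy_trials)
curve = sunk_cost_delta(probs)
print(curve)
```

Under this assumed definition, positive values indicate that time already invested raises the likelihood of staying to earn the reward, and a flat curve near zero would correspond to abolished sensitivity to sunk costs.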
Limitations of the present study include no direct measures of metabolic physiology, body composition, or calorimetry as it relates to energy expenditure of animals tested across this longitudinal experiment. Further, food restriction-based studies in diabetic animals are incredibly difficult to manage while ensuring animal safety, particularly at such long timescales. Future work should explore how behaviors shift under different economic constraints: for instance, (i) if mice have, instead of a fixed time-budget, a fixed number of choices with an open-ended amount of time; or (ii) instead of titrating mice to a fixed body weight, evaluating the weight individual mice settle at as a dependent outcome measure as a function of distribution of costs in the environment. After validating and eliciting robust diabetic phenotypes in female mice, which has limitations presently in the STZ model33, sex differences in the effects of hyperglycemia on decision-making behaviors should be explored next. Investigating the role of insulin signaling in vivo will be critically important in future studies, including comparing and contrasting the effects of Type 1 vs. Type 2 diabetes on decision-making. This should take the form of experiments studying insulin replacement challenges but also measuring how peripheral and central targets of insulin might change throughout these two testing schedules or across different decision-making processes, including recording the activity of neural signals such as dopamine in diabetic mice on the Restaurant Row task.
We demonstrated that in a diabetic mouse model, complex neuroeconomic decision-making behaviors that are multifaceted and reflect fundamentally distinct valuation algorithms can be uniquely perturbed not only as a function of hyperglycemia but also as a consequence of environmental dynamics. We found that diabetes altered how mice deliberate for and re-evaluate rewards of varying costs and subjective value in a manner that depends on the salience of reward-scarcity in the environment. Our task, which has been translated for use across species, including humans11,36,56,57,87,88, allows for a thorough neuroeconomic investigation linking the biological mechanisms underlying metabolic disorders to unique decision-making vulnerabilities that could predispose individuals to develop co-morbid psychiatric illnesses.
Data availability
All data, code, and materials used in the analysis are available in the manuscript, materials and methods section, supplementary information, or upon request.
References
Redish, A. D. The mind within the brain: How we make decisions and how those decisions go wrong (Oxford University Press; 2013).
Redish, A. D. et al. Computational validity: using computation to translate behaviours across species. Philos. Trans. R. Soc. Lond. B Biol. Sci. 377, 20200525 (2022).
Durand-de Cuttoli, R. et al. Distinct forms of regret linked to resilience versus susceptibility to stress are regulated by region-specific CREB function in mice. Sci. Adv. 8, eadd5579 (2022).
Sweis, B. M., Larson, E. B., Redish, A. D. & Thomas, M. J. Altering gain of the infralimbic-to-accumbens shell circuit alters economically dissociable decision-making algorithms. Proc. Natl Acad. Sci. USA 115, E6347–E6355 (2018).
Sweis, B. M., Redish, A. D. & Thomas, M. J. Prolonged abstinence from cocaine or morphine disrupts separable valuations during decision conflict. Nat. Commun. 9, 2521 (2018).
Sweis, B. M., Thomas, M. J. & Redish, A. D. Beyond simple tests of value: measuring addiction as a heterogeneous disease of computation-specific valuation processes. Learn Mem. 25, 501–512 (2018).
Loewenstein, G., Rick, S. & Cohen, J. D. Neuroeconomics. Annu Rev. Psychol. 59, 647–672 (2008).
Glimcher, P. W., Dorris, M. C. & Bayer, H. M. Physiological utility theory and the neuroeconomics of choice. Games Econ. Behav. 52, 213–256 (2005).
Durand-de Cuttoli, R. et al. A double-hit of social and economic stress in mice precipitates changes in decision-making strategies. Biol. Psychiatry 96, 67–78 (2023).
Kalenscher, T. & van Wingerden, M. Why we should use animals to study economic decision making - a perspective. Front. Neurosci. 5, 82 (2011).
Sweis, B. M. et al. Sensitivity to “sunk costs” in mice, rats, and humans. Science 361, 178–181 (2018).
Sweis, B. M. & Nestler, E. J. Pushing the boundaries of behavioral analysis could aid psychiatric drug discovery. PLoS Biol. 20, e3001904 (2022).
Cole, J. B. & Florez, J. C. Genetics of diabetes mellitus and diabetes complications. Nat. Rev. Nephrol. 16, 377–390 (2020).
Dienel, G. A. Brain Glucose Metabolism: Integration of Energetics with Function. Physiol. Rev. 99, 949–1045 (2019).
Brands, A. M., Biessels, G. J., de Haan, E. H., Kappelle, L. J. & Kessels, R. P. The effects of type 1 diabetes on cognitive performance: a meta-analysis. Diab. Care 28, 726–735 (2005).
Xia, X., Jiang, Q., McDermott, J. & Han, J. J. Aging and Alzheimer’s disease: Comparison and associations from molecular to system level. Aging Cell 17, e12802 (2018).
Broadley, M. M., White, M. J. & Andrew, B. A Systematic Review and Meta-analysis of Executive Function Performance in Type 1 Diabetes Mellitus. Psychosom. Med. 79, 684–696 (2017).
Gaudieri, P. A., Chen, R., Greer, T. F. & Holmes, C. S. Cognitive function in children with type 1 diabetes: a meta-analysis. Diab. Care 31, 1892–1897 (2008).
Holt, R. I. & Mitchell, A. J. Diabetes mellitus and severe mental illness: mechanisms and clinical implications. Nat. Rev. Endocrinol. 11, 79–89 (2015).
Sartorius, N. Depression and diabetes. Dialogues Clin. Neurosci. 20, 47–52 (2018).
Rustad, J. K. et al. Decision-making in diabetes mellitus type 1. J. Neuropsychiatry Clin. Neurosci. 25, 40–50 (2013).
Russo, S. J. & Nestler, E. J. The brain reward circuitry in mood disorders. Nat. Rev. Neurosci. 14, 609–625 (2013).
van Duinkerken, E. et al. Subgenual Cingulate Cortex Functional Connectivity in Relation to Depressive Symptoms and Cognitive Functioning in Type 1 Diabetes Mellitus Patients. Psychosom. Med. 78, 740–749 (2016).
van Duinkerken, E. et al. Altered eigenvector centrality is related to local resting-state network functional connectivity in patients with longstanding type 1 diabetes mellitus. Hum. Brain Mapp. 38, 3623–3636 (2017).
Kar, S., Chabot, J. G. & Quirion, R. Quantitative autoradiographic localization of [125I]insulin-like growth factor I, [125I]insulin-like growth factor II, and [125I]insulin receptor binding sites in developing and adult rat brain. J. Comp. Neurol. 333, 375–397 (1993).
Cai, W. et al. Peripheral Insulin Regulates a Broad Network of Gene Expression in Hypothalamus, Hippocampus, and Nucleus Accumbens. Diabetes 70, 1857–1873 (2021).
Soto, M., Cai, W., Konishi, M. & Kahn, C. R. Insulin signaling in the hippocampus and amygdala regulates metabolism and neurobehavior. Proc. Natl Acad. Sci. USA 116, 6379–6384 (2019).
Fetterly, T. L. et al. Insulin Bidirectionally Alters NAc Glutamatergic Transmission: Interactions between Insulin Receptor Activation, Endogenous Opioids, and Glutamate Release. J. Neurosci. 41, 2360–2372 (2021).
Stouffer, M. A. et al. Insulin enhances striatal dopamine release by activating cholinergic interneurons and thereby signals reward. Nat. Commun. 6, 8543 (2015).
Zhou, Z., Zhu, Y., Liu, Y. & Yin, Y. Comprehensive transcriptomic analysis indicates brain regional specific alterations in type 2 diabetes. Aging 11, 6398–6421 (2019).
Gallego, M., Setien, R., Izquierdo, M. J., Casis, O. & Casis, E. Diabetes-induced biochemical changes in central and peripheral catecholaminergic systems. Physiol. Res. 52, 735–741 (2003).
Ramakrishnan, R., Sheeladevi, R. & Suthanthirarajan, N. PKC-alpha mediated alterations of indoleamine contents in diabetic rat brain. Brain Res. Bull. 64, 189–194 (2004).
Furman, B. L. Streptozotocin-Induced Diabetic Models in Mice and Rats. Curr. Protoc. 1, e78 (2021).
Sweis, B. M., Thomas, M. J. & Redish, A. D. Mice learn to avoid regret. PLoS Biol. 16, e2005853 (2018).
Steiner, A. P. & Redish, A. D. Behavioral and neurophysiological correlates of regret in rat decision-making on a neuroeconomic task. Nat. Neurosci. 17, 995–1002 (2014).
Redish, A. D. et al. Sunk cost sensitivity during change-of-mind decisions is informed by both the spent and remaining costs. Commun. Biol. 5, 1337 (2022).
Durand-de Cuttoli, R. et al. Sex differences in change-of-mind neuroeconomic decision-making is modulated by LINC00473 in medial prefrontal cortex. bioRxiv, https://www.biorxiv.org/content/10.1101/2024.05.08.592609v1 (2024).
Durand-de Cuttoli, R. & Sweis, B. M. Ketamine reverses stress-induced hypersensitivity to sunk costs. bioRxiv, https://www.biorxiv.org/content/10.1101/2024.05.12.593597v1.full (2024).
Duin, A. A., Aman, L., Schmidt, B. & Redish, A. D. Certainty and uncertainty of the future changes planning and sunk costs. Behav. Neurosci. 135, 469–486 (2021).
Kazinka, R., MacDonald, A. W. 3rd & Redish, A. D. Sensitivity to Sunk Costs Depends on Attention to the Delay. Front. Psychol. 12, 604843 (2021).
Zentall, T. R. Within-trial contrast: when you see it and when you don’t. Learn. Behav. 36, 19–22 (2008).
Pompilio, L., Kacelnik, A. & Behmer, S. T. State-dependent learned valuation drives choice in an invertebrate. Science 311, 1613–1615 (2006).
Deeds, M. C. et al. Single dose streptozotocin-induced diabetes: considerations for study design in islet transplantation models. Lab. Anim. 45, 131–140 (2011).
Tolman, E. C. Prediction of vicarious trial and error by means of the schematic sowbug. Psychological Rev. 46, 318–336 (1939).
Muenzinger, K. F. On the origin and early use of the term vicarious trial and error (VTE). Psychological Bull. 53, 493–494 (1956).
Redish, A. D. Vicarious trial and error. Nat. Rev. Neurosci. 17, 147–159 (2016).
Johnson, A. & Redish, A. D. Neural ensembles in CA3 transiently encode paths forward of the animal at a decision point. J. Neurosci. 27, 12176–12189 (2007).
Kay, K. et al. Constant Sub-second Cycling between Representations of Possible Futures in the Hippocampus. Cell 180, 552–67 e25 (2020).
Diehl, G. & Redish, A. D. Differential processing of decision information in subregions of rodent medial prefrontal cortex. elife 12, e82833 (2022).
Schmidt, B., Duin, A. A. & Redish, A. D. Disrupting the medial prefrontal cortex alters hippocampal sequences during deliberative decision making. J. Neurophysiol. 121, 1981–2000 (2019).
Schmidt, B. & Redish, A. D. Disrupting the medial prefrontal cortex with designer receptors exclusively activated by designer drug alters hippocampal sharp-wave ripples and their associated cognitive processes. Hippocampus 31, 1051–1067 (2021).
Hasz, B. M. & Redish, A. D. Deliberation and Procedural Automation on a Two-Step Task for Rats. Front Integr. Neurosci. 12, 30 (2018).
Lind, E. B. et al. A quadruple dissociation of reward-related behaviour in mice across excitatory inputs to the nucleus accumbens shell. Commun. Biol. 6, 119 (2023).
Schmidt, B., Papale, A., Redish, A. D. & Markus, E. J. Conflict between place and response navigation strategies: effects on vicarious trial and error (VTE) behaviors. Learn Mem. 20, 130–138 (2013).
van der Meer, M. A., Johnson, A., Schmitzer-Torbert, N. C. & Redish, A. D. Triple dissociation of information processing in dorsal striatum, ventral striatum, and hippocampus on a learned spatial decision task. Neuron 67, 25–32 (2010).
Abram, S. V., Breton, Y. A., Schmidt, B., Redish, A. D. & MacDonald, A. W. 3rd The Web-Surf Task: A translational model of human decision-making. Cogn. Affect Behav. Neurosci. 16, 37–50 (2016).
Abram, S. V., Hanke, M., Redish, A. D. & MacDonald, A. W. 3rd Neural signatures underlying deliberation in human foraging decisions. Cogn. Affect Behav. Neurosci. 19, 1492–1508 (2019).
Resulaj, A., Kiani, R., Wolpert, D. M. & Shadlen, M. N. Changes of mind in decision-making. Nature 461, 263–266 (2009).
Stone, C., Mattingley, J. B. & Rangelov, D. On second thoughts: changes of mind in decision-making. Trends Cogn. Sci. 26, 419–431 (2022).
Kocharian, A. Dopaminergic control of neuroeconomic decision-making (ProQuest, University of Minnesota, 2023).
Eshel, N. et al. Striatal dopamine integrates cost, benefit, and motivation. Neuron 112, 500–514.e5 (2023).
Ferrario, C. R. & Finnell, J. E. Beyond the hypothalamus: roles for insulin as a regulator of neurotransmission, motivation, and feeding. Neuropsychopharmacology 48, 232–233 (2023).
Figlewicz, D. P. & Benoit, S. C. Insulin, leptin, and food reward: update 2008. Am. J. Physiol. Regul. Integr. Comp. Physiol. 296, R9–R19 (2009).
Fan, S. et al. A neural basis for brain leptin action on reducing type 1 diabetic hyperglycemia. Nat. Commun. 12, 2662 (2021).
German, J. P. et al. Leptin deficiency causes insulin resistance induced by uncontrolled diabetes. Diabetes 59, 1626–1634 (2010).
Chuang, J. C. et al. Ghrelin mediates stress-induced food-reward behavior in mice. J. Clin. Invest 121, 2684–2692 (2011).
Lutter, M. & Elmquist, J. Depression and metabolism: linking changes in leptin and ghrelin to mood. F1000 Biol. Rep. 1, 63 (2009).
Adam, T. C. & Epel, E. S. Stress, eating and the reward system. Physiol. Behav. 91, 449–458 (2007).
Hwang, J. J. et al. Blunted rise in brain glucose levels during hyperglycemia in adults with obesity and T2DM. JCI Insight 2, e95913 (2017).
Hwang, J. J. et al. Glycemic Variability and Brain Glucose Levels in Type 1 Diabetes. Diabetes 68, 163–171 (2019).
Satrom, K. M. et al. Neonatal hyperglycemia induces CXCL10/CXCR3 signaling and microglial activation and impairs long-term synaptogenesis in the hippocampus and alters behavior in rats. J. Neuroinflammation 15, 82 (2018).
Popovic, M., Biessels, G. J., Isaacson, R. L. & Gispen, W. H. Learning and memory in streptozotocin-induced diabetic rats in a novel spatial/object discrimination task. Behav. Brain Res. 122, 201–207 (2001).
Flood, J. F., Mooradian, A. D. & Morley, J. E. Characteristics of learning and memory in streptozocin-induced diabetic mice. Diabetes 39, 1391–1398 (1990).
Biessels, G. J. et al. Water maze learning and hippocampal synaptic plasticity in streptozotocin-diabetic rats: effects of insulin treatment. Brain Res. 800, 125–135 (1998).
Jorge, H., Duarte, I. C., Paiva, S., Relvas, A. P. & Castelo-Branco, M. Abnormal Responses in Cognitive Impulsivity Circuits Are Associated with Glycosylated Hemoglobin Trajectories in Type 1 Diabetes Mellitus and Impaired Metabolic Control. Diab. Metab. J. 46, 866–878 (2022).
Reiss, A. L. et al. A Pilot randomized trial to examine effects of a hybrid closed-loop insulin delivery system on neurodevelopmental and cognitive outcomes in adolescents with type 1 diabetes. Nat. Commun. 13, 4940 (2022).
Sun, D. M. et al. Decision-making in primary onset middle-age type 2 diabetes mellitus: a BOLD-fMRI study. Sci. Rep. 7, 10246 (2017).
Myette-Cote, E., Soto-Mota, A. & Cunnane, S. C. Ketones: potential to achieve brain energy rescue and sustain cognitive health during ageing. Br. J. Nutr. 128, 407–423 (2022).
Omori, N. E., Malys, M. K., Woo, G. & Mansor, L. Exploring the role of ketone bodies in the diagnosis and treatment of psychiatric disorders. Front Psychiatry 14, 1142682 (2023).
Gruber, J. et al. Impact of insulin and insulin resistance on brain dopamine signalling and reward processing - An underexplored mechanism in the pathophysiology of depression? Neurosci. Biobehav. Rev. 149, 105179 (2023).
Patel, J. C., Carr, K. D. & Rice, M. E. Actions and Consequences of Insulin in the Striatum. Biomolecules 13, 518 (2023).
Geisler, C. E. & Hayes, M. R. Metabolic hormone action in the VTA: Reward-directed behavior and mechanistic insights. Physiol. Behav. 268, 114236 (2023).
van Wingerden, M., Marx, C. & Kalenscher, T. Budget Constraints Affect Male Rats’ Choices between Differently Priced Commodities. PLoS One 10, e0129581 (2015).
Offerman, T. & van der Veen, A. How to subsidize contributions to public goods: Does the frog jump out of the boiling water? Eur. Economic Rev. 74, 96–108 (2015).
Arkes, H. & Blumer, C. The psychology of sunk cost. Organ. Behav. Hum. Decis. Process. 35, 124–140 (1985).
Wikenheiser, A. M., Stephens, D. W. & Redish, A. D. Subjective costs drive overly patient foraging strategies in rats on an intertemporal foraging task. Proc. Natl Acad. Sci. USA 110, 8308–8313 (2013).
Huynh, T., Alstatt, K., Abram, S. V. & Schmitzer-Torbert, N. Vicarious Trial-and-Error Is Enhanced During Deliberation in Human Virtual Navigation in a Translational Foraging Task. Front. Behav. Neurosci. 15, 586159 (2021).
McInnes, A. N., Sullivan, C. R. P., MacDonald, A. W. 3rd & Widge, A. S. Psychometric validation and clinical correlates of an experiential foraging task. bioRxiv, https://www.biorxiv.org/content/10.1101/2023.12.28.573439v1 (2023).
Acknowledgements
We thank members of the labs of Eric Nestler, Scott Russo, and Denise Cai for helpful discussion. Open-source illustrations obtained from SciDraw (www.scidraw.io), credit Federico Claudi. Funding sources include National Institute of Mental Health grant R01MH136230 (B.M.S.), L40MH127601 (B.M.S.), R01MH051399-31S1 (B.M.S.), Leon Levy Scholarship in Neuroscience, New York Academy of Sciences (J.L.A. and B.M.S.), Burroughs Wellcome Fund Career Award for Medical Scientists (B.M.S.), Animal Models for the Social Dimensions of Health and Aging Research Network via NIH/NIA R24 AG065172 (B.M.S.), Brain & Behavior Research Foundation NARSAD Young Investigator Awards 32856 (B.M.S.), 31140 (RDC), 28240 (J.L.A.), Einstein-Mount Sinai Diabetes Research Center Pilot & Feasibility Award P30DK020541-FS553 (J.L.A.), Mount Sinai SURP4US (C.A.N.).
Author information
Contributions
Conceptualization: J.L.A., B.M.S., Methodology: J.L.A., B.M.S., Investigation: C.A.N., R.D.C., Z.M.O., S.O.B., J.E.H., A.M., M.J.F., Y.Z.C., S.A., S.L., J.L.A., B.M.S., Data curation: C.A.N., R.D.C., J.L.A., B.M.S., Formal analysis: C.A.N., R.D.C., J.L.A., B.M.S., Visualization: C.A.N., R.D.C., B.M.S., Funding acquisition: J.L.A., B.M.S., Supervision: R.D.C., J.L.A., B.M.S., Writing – original draft: C.A.N., J.L.A., B.M.S., Writing – review & editing: all authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Communications Biology thanks Sean Ostlund, Duda Kvitsiani, and the other, anonymous, reviewer for their contribution to the peer review of this work. Primary Handling Editor: Benjamin Bessieres.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Nwakama, C.A., Durand-de Cuttoli, R., Oketokoun, Z.M. et al. Neuroeconomically dissociable forms of mental accounting are altered in a mouse model of diabetes. Commun Biol 8, 102 (2025). https://doi.org/10.1038/s42003-025-07500-6