Introduction

Impaired goal-directed motivation is a debilitating symptom affecting many individuals with schizophrenia [1,2,3,4]. Diminished motivation is associated with deficits in cost-benefit decision making, with patients having difficulty generating accurate estimates of the value of future rewards [5,6,7] or difficulty estimating the effort required to complete tasks [8,9,10,11]. It is therefore unclear whether impairments in processing information about value, effort, or both drive the motivation deficits observed in patients. Consequently, the neural mechanisms underlying the disruption of cost-benefit decision making in schizophrenia are not yet well understood; clarifying them could help develop treatments for this pervasive and persistent symptom.

Preclinical models have been useful in understanding the neural mechanisms underlying motivation and cost-benefit decision making. Rodent studies show that mesolimbic dopamine signaling is critical for invigorating behavior in effort-demanding situations [12,13,14]. Dopamine signaling in the Nucleus Accumbens (NAc) in the ventral striatum is required for the increase in neuronal activity in the NAc that drives the invigoration of reward-seeking behaviors [15]. Disruptions to normal mesolimbic dopamine neurotransmission through cell body lesions or inactivation of the Core region of the NAc (NAcc), neurotoxic lesions of mesoaccumbal dopamine terminals, or blockade of dopamine D1 or D2 receptors within the NAcc impair cost-benefit decision making. Such manipulations bias response selection away from high effort/high reward toward low effort/low reward response options [16,17,18,19,20]. However, because typical rodent cost-benefit decision-making tasks vary both value and effort within the same choice trial, it is difficult to dissociate which factor(s) are affected.

We recently developed novel behavioral assays to evaluate the influence of Value and Effort on behavior independently (Bailey et al., under review, and [21]). Here, we use these tasks to examine cost-benefit decision making in transgenic mice that over-express Dopamine D2 receptors in the striatum (D2R-OE mice) [22]. This molecular manipulation models the ~12% increase in D2R occupancy observed in schizophrenia [23,24,25] that may be part of the etiology of the disease, or may result from chronic antipsychotic treatment [26]. Increased expression of the D2Rs occurs selectively in post-synaptic medium spiny neurons in the striatum and can be temporally regulated [22], enabling comparison of concurrent and developmental effects of D2R over-expression.

D2R-OE mice show deficits in goal-directed behavior and cost-benefit decision making but the prior studies did not determine if the mice are impaired in the ability to use information about effort, value or both to guide their decision making [27,28,29,30]. In the current study, our novel tasks determined that the increase in striatal D2Rs reduces willingness to work because of an altered sensitivity to changing effort requirements, while value-related decision making is intact. We also measured the level of extracellular dopamine in the striatum in D2R-OE mice during an effortful task and found a dampening of dopamine levels compared to controls. In control mice, we observed dopamine encoding of response costs in the ventral striatum but this relationship was absent in D2R-OE mice.

Materials and methods

Subjects

Generation of the D2R-OE mice, the breeding scheme, and regulation of the transgene have been previously described [22]. All subjects were females that began experiments between 10 and 12 weeks of age. All mice were group housed, with the exception of the mice used for microdialysis, which were single housed to protect the guide cannula implants. Throughout experiments, all subjects were maintained at 85% of their ad libitum bodyweight to motivate them to work for food rewards. Naïve groups of mice were used for each behavioral paradigm, and the number of subjects in each experiment is provided in the relevant figure legends. Complete details are provided in the Supplementary Information (SI).

Behavioral testing

Before implementing specific procedures, mice were trained to make lever presses [27] and lever holds [31] as previously described. Each task is described in the Results section along with the presentation of the data. Detailed descriptions of all training and testing procedures are provided in the SI and have been previously described. The cost-benefit decision-making task allows the mice to choose between pressing a lever to earn a preferred reward (milk) and eating freely available home cage chow [28, 32]. Milk was earned on a random ratio (RR) schedule, e.g., on RR20 mice were rewarded after an average of 20 lever presses. We chose an RR schedule because it promotes goal-directed responding compared to other reinforcement schedules, which result in habitual responding after several sessions [33]. Mice were tested in an ascending series of RR schedules, which allowed us to compare genotypes over a wide range of press rates. The concurrent value choice (CVC) task allows mice to choose between working for a fixed cost to obtain sucrose of a given concentration or working for a pellet reward whose cost varies across sessions ([21] and Bailey et al., submitted). The concurrent effort choice (CEC) task allows mice to earn a fixed reward by either making multiple lever presses of any duration or by holding the lever down for a long criterion duration ([21] and Bailey et al., submitted). In the progressive ratio (PR) task, mice must press a lever an increasing number of times for each successive reward [27, 28, 31]. In the progressive hold down (PHD) task, mice must hold the lever down for an increasing length of time for each subsequent reward [31]. For microdialysis sessions, we used an RR schedule because there was an effect of genotype across RR schedules in the cost-benefit decision-making task.
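To make the RR contingency concrete, the following Python sketch assumes that each lever press is reinforced independently with probability 1/N, a common way to obtain a mean of N presses per reward. The function name, the seed, and this particular implementation are illustrative assumptions and do not reproduce the control code used in the study.

```python
import random

def presses_until_reward(ratio, rng):
    """Count presses up to and including the rewarded one.

    Assumption: every press is reinforced with probability 1/ratio,
    independently of previous presses.
    """
    presses = 1
    while rng.random() >= 1.0 / ratio:
        presses += 1
    return presses

rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [presses_until_reward(20, rng) for _ in range(20000)]
mean_presses = sum(samples) / len(samples)
print(f"mean presses per reward on RR20: {mean_presses:.1f}")
```

Under this assumption the number of presses per reward varies widely from reward to reward, while its long-run average stays close to the nominal ratio.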

Microdialysis sampling

Surgeries

Under ketamine/xylazine anesthesia, female mice were stereotaxically implanted unilaterally with microdialysis guide cannulae for the insertion of CMA7 probes (CMA, Boston, MA) to position the 1 mm of dialysis membrane in the ventral striatum using the coordinates: AP: + 1.0 ML: −1.5 DV: −3.50 from skull surface. The guides, a head block tether (Instech, Plymouth Meeting, PA) and four stabilizing screws were secured using dental cement. Animals recovered for 1 week before food restriction and behavioral training.

Microdialysis sample collection

All subjects underwent two sampling sessions during RR10 sessions in two different motivational states: fasted (food was removed from the home cage the day before testing) and pre-fed (home chow was available for 2 h before testing). The order of sampling sessions was counterbalanced across subjects. On sampling days, CMA 7 probes (cuprophane membrane 1 mm in length; outer diameter, 0.24 mm; molecular cutoff, 6 kDa; CMA) were inserted into the guide and continuously perfused with artificial CSF (CNS perfusion fluid, CMA) at a flow rate of 1 µL/min for 2 h and 20 min while the mouse was inside a circular container within the operant test chamber. After equilibration, samples were collected at 20-min intervals: two samples prior to behavior (considered baseline), three samples during behavior, and two samples after the behavioral test ended. Sample collection was staggered from event times to account for the delay in recovery through the tubing and swivel.

Quantification of dopamine by UPLC/MS/MS analysis

Samples and dopamine standards were pre-processed by dansyl chloride derivatization based on the method of ref. [34], as described (SI). Quantification was performed by the Biomarker Core at the Irving Center for Clinical and Translational Research at Columbia University Medical Center as described (SI).

Confirmation of microdialysis probe placement

Under anesthesia, a CMA7 probe stripped of membrane was inserted into the implanted guide and 1 µL methylene blue dye was infused at 1 µL/min. After euthanasia, brains were cryo-sectioned and probe locations were identified using a mouse brain atlas in stereotaxic coordinates [35].

Data analysis

In experiments that tested concurrent vs developmental transgene expression, four groups of mice were used: Control Chow, D2R-OE Chow, Control Dox, and D2R-OE Dox (see SI). No significant differences existed between Control Chow and Control Dox mice, so data were collapsed across these two groups into a single control group to simplify the presentation of the results.

Behavioral measures and dopamine values were analyzed using mixed ANOVAs. Between-subject factors included group (D2R-OE Chow, D2R-OE Dox and Controls); within-subject factors included conditions (e.g., sucrose concentrations, response costs, motivational state, time, etc.). Planned ANOVA results are reported in the text, and significant Bonferroni-corrected multiple comparisons are reported in the figures using asterisks. Statistical analyses were performed using MATLAB Statistics Toolbox (MathWorks), GraphPad Prism, or IBM SPSS Statistics. For all analyses, α was set at 0.05.

Results

Striatal D2 receptor overexpression alters cost-benefit decision making in mice

To measure cost-benefit decision making, we used a task in which subjects choose between working for a preferred reward (lever pressing on an RR schedule to earn evaporated milk) or consuming less preferred, but freely available, home cage chow [36]. Mice expressing the D2R transgene (D2R-OE chow mice) made fewer lever presses and consequently earned fewer rewards at all ratios tested (Fig. 1a). There was a main effect of ratio (F (2, 58) = 42.94, p < 0.0001), a main effect of group (F (2, 29) = 15.48, p < 0.001), and a significant ratio x group interaction (F (4, 58) = 3.996, p = 0.0062). Post hoc comparisons are presented in Fig. 1a. Instead of earning milk, D2R-OE chow mice consumed more of the free chow (Fig. 1b). We observed main effects of ratio (F (2, 58) = 23.51, p < 0.0001) and group (F (2, 29) = 5.469, p = 0.0097), without a ratio x group interaction. These results indicate that transgenic overexpression of D2 receptors in the striatum disrupts cost-benefit decision making, which is normalized when the transgene is switched off. These results are consistent with previous studies showing a reduction in the willingness to work by D2R-OE mice in other operant paradigms [29, 30].

Fig. 1

D2R-OE mice are less willing to work for preferred rewards in a cost-benefit decision-making task. a Number of rewards earned. b Chow consumption (g) by the different groups of mice. Graphs show Mean ± (SEM). Significant main effects are reported in the results section; asterisks denote the results of post hoc comparisons. * p < 0.05; ** p < 0.01. Control mice, gray, (n = 16); D2R-OE Chow, green, (n = 8); D2R-OE Dox, blue, (n = 8)

Striatal D2R overexpression does not alter sensitivity to reward value

The altered behavior we observed in the cost-benefit decision-making task could be due to an increased sensitivity to costs (the effort of lever pressing), or a decreased sensitivity to benefits (the relative value of milk reward over chow). A third possibility is a disruption to cost-benefit computation. We recently developed novel behavioral tasks to distinguish between the first two possibilities (Bailey et al., under review) and used them to test D2R-OE mice for this study.

To examine value-based decision making we used the CVC task. In this task, mice can choose to work for one of two distinct rewards: a liquid sucrose solution, which always costs 5 lever presses (FR5), vs a solid sucrose pellet, which costs 5, 10, 20, 40, or 80 presses depending on the day (see schematic, Fig. 2a). Varying the cost of the pellet reward enables us to obtain each subject’s choice function between these two reward options. Importantly, by manipulating the concentration of the liquid sucrose solution from 5% to 20% (i.e., changing the reward value), we can measure the change in each subject’s choice function to determine whether they are sensitive to reward value.

Fig. 2

D2R-OE mice are sensitive to reward magnitude manipulations. a Schematic representation of the CVC task. Each session begins with 10 single lever trials in which either the Sucrose Lever or Pellet Lever is presented. The number of presses required on the Pellet Lever varied over days (10, 20, 40, and 80), whereas the number of presses required on the Sucrose Lever was always 5. Upon completing the 10 single lever trials subjects then received 20 choice trials in which both levers were presented for subjects to choose which lever to work on to obtain reward. b Shows proportion of sucrose choice for controls (left), D2R-OE chow (center), and D2R-OE dox (right) mice in two different sucrose concentration conditions. c Shows the point of subjective equality (PSE) in the two different sucrose concentration conditions for each group, Control mice, gray, (n = 16); D2R-OE Chow, green, (n = 6); D2R-OE Dox, blue, (n = 6). d Shows the proportion of sucrose choice for controls (left) and D2R-OE chow mice (right) in the CVC task in two different valuation conditions. e Shows the PSE for subjects in the two different valuation conditions. Control mice, gray, (n = 7); D2R-OE Chow, green, (n = 7). All graphs depict Mean ± (SEM) values. Significant main effects are reported in the results section, asterisks denote the results of post hoc comparisons. * p < 0.05; ** p < 0.01

An upward shift (an increased preference for sucrose over pellets) in the choice function for 20% compared to 5% sucrose demonstrates sensitivity to the value of the sucrose reward. We observed this shift in all groups of mice (Fig. 2b). Using individual subjects’ sucrose choice functions, we extrapolated the point of subjective equality (PSE) for each subject, defined as the point at which subjects choose the sucrose reward and the pellet reward with equal probability, i.e., where the subject’s data cross the dashed line on the Y axis that represents a choice probability of 0.5. The PSE for 5% sucrose was higher than the PSE for 20% sucrose for all groups (Fig. 2c), and a mixed ANOVA detected a main effect of sucrose concentration (F (1, 11) = 17.741, p = 0.0001), but no main effect of group and no sucrose concentration x group interaction.
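As an illustration of how a PSE can be read off a choice function, the Python sketch below linearly interpolates the pellet cost at which the proportion of sucrose choices crosses 0.5. The interpolation method, the function name, and the choice proportions are assumptions made for illustration; they are not the study's data or necessarily its fitting procedure.

```python
def point_of_subjective_equality(costs, p_sucrose):
    """Return the pellet cost at which sucrose choice crosses 0.5.

    costs: ascending pellet costs (lever presses).
    p_sucrose: proportion of sucrose choices at each cost.
    Uses linear interpolation between the two bracketing points and
    returns None if 0.5 is never crossed within the tested range.
    """
    points = list(zip(costs, p_sucrose))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if (y0 - 0.5) * (y1 - 0.5) <= 0 and y0 != y1:
            return x0 + (0.5 - y0) * (x1 - x0) / (y1 - y0)
    return None

# Made-up choice proportions for one hypothetical subject.
costs = [5, 10, 20, 40, 80]
low_sucrose = [0.10, 0.20, 0.40, 0.70, 0.90]   # hypothetical 5% condition
high_sucrose = [0.30, 0.60, 0.80, 0.90, 1.00]  # hypothetical 20% condition

pse_low = point_of_subjective_equality(costs, low_sucrose)
pse_high = point_of_subjective_equality(costs, high_sucrose)
print(pse_low, pse_high)
```

With these invented numbers, the more concentrated sucrose is preferred even when pellets are cheap, so the crossing point moves to a lower pellet cost, which is the direction of the shift reported in Fig. 2c.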

As a second measure of sensitivity to reward value, we tested a separate cohort of control and D2R-OE chow subjects using a procedure that devalued the pellet reward. For 3 weeks, subjects were tested in the CVC task using the 5% liquid sucrose concentration throughout. In weeks 1 and 3, testing was exactly as before. In week 2, we devalued the pellet by giving subjects 30 min of free access to pellets prior to the session (Devalued). Data from weeks 1 and 3 (Valued) were averaged for each subject. The sucrose choice functions in Fig. 2d show that when the pellet is devalued, both D2R-OE and Control mice prefer to work for sucrose. Figure 2e shows the PSE for each group in the valued vs devalued condition. We observed a main effect of reward devaluation (F (1, 12) = 23.455, p < 0.001), but no main effect of genotype (F (1, 12) = 4.165, p = 0.0639) and no interaction, demonstrating that striatal D2R overexpression did not impair the ability to use changes in reward value to guide decision making.

Effort-based decision making is altered by striatal D2 receptor overexpression

After determining that value-based decision making is normal in D2R-OE mice, we next evaluated effort-based decision making to determine whether their altered behavior in the cost-benefit decision-making task results from an increased sensitivity to effort. In our concurrent effort choice (CEC) task (Bailey et al., under review), mice can choose to earn 0.01 ml of milk through one of two types of work: making lever presses on the Press Lever (5, 10, 20, 40, 80, or 160 presses) vs making a single lever press of a required duration on the Hold Lever (5, 10, or 20 s); see schematic (Fig. 3a). By varying the press and hold requirements on different days, while holding the reward value constant for both work types, we obtained an effort-choice function from each subject.

Fig. 3

D2R-OE mice are more sensitive to making repeated numbers of responses. a Schematic representation of the CEC task. Each session begins with 10 single lever trials in which either the Press Lever (PL) or Hold Lever (HL) is presented. The number of presses required on the PL varies over days (1, 5, 10, 20, 40, 80, or 160), as does the hold requirement on the HL (5, 10, or 20 s). Upon completing the 10 single lever trials, subjects then received 40 choice trials in which both levers were presented for subjects to choose which lever to work on to obtain the milk reward. b Proportion of hold choices for control (left), D2R-OE chow (center), and D2R-OE dox (right) mice in three different hold duration conditions in the CEC task. c PSE in the three different hold duration conditions. Control mice, gray, (n = 14); D2R-OE Chow, green, (n = 8); D2R-OE Dox, blue, (n = 8). All graphs depict Mean ± (SEM) values. Significant main effects are reported in the results section; asterisks denote the results of post hoc comparisons. * p < 0.05; ** p < 0.01

To examine subjects’ sensitivity to the different effort requirements of response number and response duration, we computed the proportion of hold choices subjects made during the choice trials, with a hold choice defined as a trial in which holding was successful and led to reward. Control and D2R-OE Dox mice are sensitive to both types of work requirements, showing hold choice functions that depend on the press requirement and shift upward as the hold requirement increases. In contrast, D2R-OE Chow mice show hold choice functions that depend on the press requirement but exhibit a consistent bias towards holding to earn rewards compared to the other groups, and little shift in choice functions when the hold requirement increases (Fig. 3b, center). Analysis of the PSE, defined as the number of lever presses at which subjects choose between lever pressing and holding on an equal proportion of trials, reflected the D2R-OE bias for holding versus pressing (Fig. 3c). A mixed ANOVA detected a main effect of group (F (2,25) = 9.015, p = 0.001), a main effect of hold duration (F (2,50) = 15.29, p < 0.0001), and a significant group x hold duration interaction (F (4,50) = 5.481, p = 0.001). The PSE for the D2R-OE Chow group was consistent across all three hold requirements, while the PSE increased for the other two groups (indicating a preference for pressing over holding) as the hold requirement increased. See Fig. 3c for results of post hoc comparisons.

Striatal D2R overexpression changes sensitivity to the effort associated with specific types of work

Results from the CEC task suggest that D2R-OE mice are very sensitive to the work requirement of repeated lever pressing, explaining their preference for free chow in the cost-benefit decision-making task, which requires lever pressing. The CEC data also show that D2R-OE mice are selectively more sensitive to the cost of the number of lever presses but not the cost of press duration. To determine whether this selective sensitivity would alter behavior in other effort-related tasks, we tested D2R-OE and Control mice in two different progressive schedules. In the first task, the number of lever presses required to earn rewards increases (by a factor of 2) for each subsequent reward (PRx2 task). In the second task, the PHD task, the requirement is always a single press, but of progressively longer hold durations [31]. Two PHD schedules were used. In the “Easy” version, the first hold requirement was 2 s and the requirement was multiplied by 1.13 for each subsequent reward. In the “Hard” version, the first requirement was 3 s and was multiplied by 1.4 thereafter.
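The progressive schedules described above are geometric series, which the following Python sketch generates. It reads the stated 1.13 and 1.4 as multiplicative factors applied to the previous hold requirement. The first PR requirement is not given in the text, so the starting value of 1 press is an assumption, as are the function name and the number of steps shown.

```python
def progressive_schedule(first, factor, n_steps):
    """Requirement for each successive reward: first, first*factor, ..."""
    return [first * factor ** i for i in range(n_steps)]

# PR x2: press requirement doubles per reward (starting value assumed).
pr_x2 = progressive_schedule(1, 2, 8)

# PHD: hold requirement in seconds, rounded for display only.
phd_easy = [round(v, 2) for v in progressive_schedule(2.0, 1.13, 5)]
phd_hard = [round(v, 2) for v in progressive_schedule(3.0, 1.4, 5)]

print(pr_x2)
print(phd_easy)
print(phd_hard)
```

The sketch makes the difference between the two PHD versions easy to see: the Hard schedule both starts higher and grows faster per reward than the Easy schedule.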

In a PR x 2 schedule, D2R-OE mice quit working sooner than controls (Fig. 4a). The PR sessions were terminated when mice did not make a lever press for five consecutive minutes, or after a maximum of 2 h. We used this 5 min cut off period because in previous PR studies that had no such termination procedure, we found that when mice stopped responding for 5 min, they never resumed working. Symbols in Fig. 4a depict the time of the last lever press made before the termination of each subject’s session. A Mantel-Cox Log-rank analysis of the survival function revealed an effect of genotype (χ2 = 8.606, p = 0.003). D2R-OE mice make fewer lever presses (Fig. 4b; F(3,108) = 285.8, p < 0.001), and reach a significantly lower breakpoint than controls (Fig. 4c; F(3,108) = 285.8, p < 0.001).
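The termination rule described above can be sketched in Python as follows: a session ends at the first 5-min gap without a press, and a mouse still responding near the 2-h cap would be treated as censored in a survival analysis. The function name, the input format (press times in minutes), and the handling of the cap are illustrative assumptions rather than the analysis code used in the study.

```python
def session_end(press_times_min, cutoff_min=5.0, max_min=120.0):
    """Return (time of last press before quitting, quit flag).

    press_times_min: ascending press times in minutes from session start.
    The mouse is considered to have quit at the first gap of at least
    cutoff_min without a press. If no such gap occurs before max_min,
    the quit flag is False (the observation would be censored).
    """
    last = 0.0
    for t in press_times_min:
        if t > max_min:
            break
        if t - last >= cutoff_min:
            return last, True
        last = t
    return last, (max_min - last) >= cutoff_min

# Hypothetical sessions: one mouse stops after 4.5 min, one keeps working.
quitter = session_end([0.5, 1.0, 3.0, 4.5, 11.0])
worker = session_end([float(t) for t in range(1, 120, 2)])
print(quitter, worker)
```

Pairs of this form (time, event flag) are the usual input to a log-rank comparison of survival functions such as the one reported for Fig. 4a.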

Fig. 4

D2R-OE mice quit working sooner in a PR task, but not a PHD task. a Shows a survival function of how long mice continued lever pressing in a PR before quitting for a period of 5 min without a single press. b The number of lever presses made in a PR x 2 schedule of reinforcement. c The breakpoint in a PR x 2. Control mice, gray, (n = 12); D2R-OE Chow, green, (n = 12). d Shows survival functions of how long mice continued lever holding in a PHD task in an Easy (top) and Hard (bottom) schedule before quitting for a period of 5 min without a single press. e Shows the number of lever holds in two different PHD schedules. f Shows the breakpoint in two different PHD schedules (Asterisks denote significance of post hoc tests, main effects are reported in the text). Control mice, gray, (n = 15); D2R-OE Chow, green, (n = 8). Bar graphs depict Mean ± (SEM) values. * p < .05; ** p < .01; *** p < .001

We also observed an effect of genotype in the PHD task, with the transgene appearing to have a beneficial effect on performance. PHD sessions were terminated when mice did not make a lever press for 15 consecutive minutes, or after a maximum of 2 h. We used this 15 min cut off period because in previous PHD studies that had no such termination procedure, we found that when mice stopped responding for 15 min, they never resumed working again. Symbols in Fig. 4d depict the time of the last lever press made before the termination of each subject’s session. There was no difference in how long D2R-OE mice and controls continued working in the PHD task before giving up in either an “Easy” (χ2 = 0.097, p = 0.7555) or “Hard” (χ2 = 0.2996, p = 0.5841) PHD schedule (Fig. 4d). The PHD schedule difficulty had a main effect on number of hold attempts (F(1, 13) = 17.76, p = 0.0010), but there was no difference between D2R-OE and control mice (Fig. 4e). There was a main effect of schedule on the breakpoint (F (1, 13) = 13.36, p = 0.0029) and a significant effect of genotype (F (1, 13) = 10.67, p = 0.0061) because the D2R-OE mice reached significantly higher breakpoints in the PHD task on both schedules (Fig. 4f).

Together, the results from the CVC, CEC, PR, and PHD tasks indicate that D2R-OE mice have altered cost-benefit decision making because of a difference in how they respond to effort requirements involving the number, but not duration, of responses.

Striatal D2R overexpression dampens the increase in striatal extracellular dopamine that occurs during motivated instrumental conditioning

Like our observations in D2R-OE mice, dopamine receptor blockade in the striatum also shifts effort-based choice to less effortful (and less rewarding) choices (see Introduction). We therefore wondered whether extracellular dopamine levels in the striatum may be reduced in D2R-OE mice during effortful behavior, leading to a motivational deficit. Reduced extracellular dopamine in the striatum would also be consistent with our findings of a reduction in burst firing in midbrain dopamine neurons in D2R-OE mice [37], and a decrease in local synchrony among midbrain dopamine neurons [38]. We implanted D2R-OE and Control mice with guide cannulae in the ventral striatum prior to lever press training to collect microdialysate samples during RR10 conditioning. To determine the impact of motivational state on behavior and dopamine release, samples were collected during one session performed after overnight fasting (fasted) and another session performed after 2 h of free access to home cage chow (pre-fed).

The number of lever presses made was affected by genotype (F (1, 16) = 6.97, p = 0.018) and motivational state (F (1, 16) = 18.2, p = 0.0006) with no genotype x state interaction (Fig. 5a). Consequently, the number of rewards earned was affected by genotype (F (1, 16) = 6.99, p = 0.018) and motivational state (F (1, 16) = 13.13, p = 0.0023), with no interaction (Fig. 5b). These results show that all mice work less when pre-fed compared to the fasted condition, and D2R-OE mice work less than controls in both conditions. This sensitivity to reward devaluation suggests that all mice were responding in a goal-directed manner, as would be expected on an RR schedule [33]. Consistent with this hypothesis, the percentage of earned rewards that was consumed was unaffected by genotype or state, and there was no interaction (Fig. 5c). Habitual lever pressing when pre-fed would likely result in a decrease in the percentage of earned rewards consumed. The lack of effect of genotype demonstrates that D2R-OE mice are equally interested in consuming the reinforcer when it is presented.

Fig. 5

Increased striatal D2R expression reduces motivational state dependent striatal dopamine release during goal directed behavior. During RR10 conditioning, bar graphs depict Mean ± (SEM) values for: (a) total lever presses, (b) number of rewards earned, (c) % of earned rewards consumed. Significant main effects of state and genotype were observed for lever presses and rewards earned, but not for % of earned rewards consumed. d Mean ( ± SEM) dopamine concentrations expressed as percentage of baseline in consecutive 20 min dialysate samples collected before (Pre), during (Test), and after (Post) RR10 instrumental conditioning for controls (gray circles) and D2R-OE (green rectangles) mice during fasted (open symbols) and pre-fed sessions (filled symbols). e Localization of microdialysis probes within the striatum of Control (gray, n = 8) and D2R-OE (green, n = 10) mice. Scatter plots show the relationship between changes in conditioning-induced dopamine efflux and: (f) the total number of lever presses made in the session, (g) the number of rewards earned in the session, and (h) the average response cost (lever presses made/rewards earned) for Control (gray circles) and D2R-OE mice (green rectangles) during fasted sessions (significant correlation is denoted with a trend line). Main effects are reported in the text; asterisks denote post hoc comparisons. * p < 0.05; ** p < 0.01; *** p < 0.001

We quantified dopamine concentration from all samples taken on both sessions from 8 out of the 8 control mice, and 8 out of the 10 D2R-OE mice whose behavioral data are presented in Fig. 5. One of the control mice was sampled twice in the fasted condition. Because we did not use the no-net-flux method of microdialysis, normal variation in probe performance would make the comparison of raw values of extracellular dopamine across subjects unreliable [39, 40]. To compare dopamine function across subjects and conditions, we calculated changes in dopamine within each subject over the course of an experimental session and analyzed the changes in dopamine, relative to each individual subject’s baseline.
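The within-subject normalization can be sketched in Python as follows, assuming that baseline is the mean of the two pre-behavior samples and that each of the seven 20-min samples is expressed as a percentage of that baseline. The function name and the concentration values are invented for illustration.

```python
def percent_of_baseline(samples, n_baseline=2):
    """Express each sample as a percentage of the mean baseline sample.

    samples: dopamine concentrations in collection order for one session.
    n_baseline: number of leading samples treated as baseline (assumed 2).
    """
    baseline = sum(samples[:n_baseline]) / n_baseline
    return [100.0 * s / baseline for s in samples]

# Made-up concentrations (arbitrary units): 2 Pre, 3 Test, 2 Post samples.
raw = [2.0, 2.2, 2.9, 3.4, 3.1, 2.5, 2.1]
normalized = percent_of_baseline(raw)
print([round(v, 1) for v in normalized])
```

Because every session is scaled to its own baseline, differences in probe recovery between animals cancel out, at the cost of losing information about absolute dopamine concentration.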

A three-factor ANOVA revealed significant main effects of genotype (F (1, 14) = 7.082, p = 0.019), time (F (6, 84) = 8.326, p < 0.001) and state (F (1, 14) = 6.621, p = 0.022) on the increase in extracellular dopamine observed during conditioning. The time course of conditioning-induced dopamine efflux was affected by both motivational state and genotype (time x state x genotype interaction: F (6, 84) = 4.125, p = 0.001; Fig. 5d). At the termination of the experiment we determined the precise location of the 1 mm long dialysis membranes. Probes were predominantly located in the Core of the Nucleus Accumbens (NAcc), with some extending into the most ventral aspect of the dorsomedial striatum. There was no bias in location across genotype (Fig. 5e).

State dependent, conditioning-induced dopamine efflux encodes effort requirement in striatum of control, but not D2R-OE mice

In some behavioral conditions, dopamine tone can correlate with amount of motor activity, the amount of reinforcement, or response costs, i.e., the effort required to earn rewards [41]. Therefore, we examined the relationship between individual differences in the dopamine response during instrumental conditioning (average of test bins 1–3 for each subject) and these variables in our experiment. Because there was no increase in dopamine efflux in either genotype in the pre-fed condition, we computed Pearson’s correlation coefficient (two-tailed) only for data from fasted sessions. There was no significant relationship between change in dopamine efflux and total lever presses for either genotype (Control: r = −0.34, p = 0.37, D2R-OE: r = −0.06, p = 0.91, Fig. 5f). There was also no relationship between dopamine and the total number of rewards earned for either genotype (Control: r = −0.13, p = 0.75, D2R-OE: r = −0.07, p = 0.86, Fig. 5g).

Because reinforcement was earned on a probabilistic RR10 schedule, variability in the average number of responses made to obtain rewards across sessions (i.e., total number of presses performed in a session/total number of rewards earned in that session) allowed us to explore the relationship between dopamine and response cost. Due to the relatively small number of animals in each group, the following analysis should be considered exploratory. While dopamine was unrelated to total response output (Fig. 5f) and total rewards earned (Fig. 5g), we observed a negative correlation between the change in dopamine efflux and average response cost in control (r = −0.71, p = 0.03) but not D2R-OE mice (r = 0.5, p = 0.21; Fig. 5h). If these results were to be confirmed with larger numbers of mice, they would corroborate a previous report that response cost can be encoded by extracellular dopamine in the ventral striatum [41]. The results also show that increased D2Rs in the striatum disrupt any potential dopamine response-cost relationship, specifically because there was no observable increase in extracellular dopamine under any response cost condition. This flattening of dopamine in D2R-OE mice may underlie the alterations in effort-related choice and cost-benefit decision making documented throughout this study.
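The response-cost measure and its correlation with the dopamine response can be sketched in Python as below. Response cost is computed per session as presses divided by rewards, as defined above, and Pearson's r is computed from its textbook formula. All values are made up for illustration and the variable names are hypothetical; no significance test is included.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up per-subject session totals (not the study's data).
presses = [400, 520, 610, 450, 700, 380, 560, 490]
rewards = [44, 50, 55, 47, 58, 42, 52, 49]
# Made-up dopamine response (% of baseline, mean of the Test samples).
dopamine_change = [150, 130, 120, 145, 110, 160, 125, 135]

# Average response cost: presses made per reward earned in the session.
response_cost = [p / r for p, r in zip(presses, rewards)]
r = pearson_r(response_cost, dopamine_change)
print(f"r = {r:.2f}")
```

On an RR10 schedule the nominal cost is 10 presses per reward, so the between-subject spread in this ratio comes only from the probabilistic nature of the schedule, which is why the analysis is restricted to a narrow range of costs.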

Discussion

Compared to healthy individuals, patients with schizophrenia are more likely to favor low cost options leading to smaller rewards over high cost tasks that earn large rewards [8, 10, 11, 42, 43]. In such studies, costs typically are a combination of physical effort and delay. In delay discounting tasks that do not involve physical effort, patients also show steeper discounting [44,45,46,47]. In this study, we determined that in mice, an increase in striatal D2 receptor expression results in a similar increase in sensitivity to costs.

D2R-Overexpression alters dynamic, motivation dependent dopamine release

Concurrent over-expression of D2Rs within the striatum alters willingness to work for a preferred reward, the same result produced by D2R blockade [48, 49] and dopamine depletion [49]. That an increase in D2R expression had the same effects on cost-benefit choice behavior as antagonizing D2Rs and dopamine depletion may seem paradoxical. However, the results of our microdialysis study, in combination with previous electrophysiology studies, may explain why increased D2Rs result in the same phenotype as dopamine depletion and D2R antagonism. Because we measured only changes in extracellular dopamine, rather than absolute dopamine concentration, the lack of increase of dopamine observed in D2R-OE mice can be interpreted in two ways: (1) dopamine levels are so high in the striatum of D2R-OE mice that no further increase can be observed, or (2) motivational state dependent striatal dopamine release is dampened in D2R-OE mice. Results from in vivo electrophysiology studies support the latter interpretation. The tonic firing rate of VTA dopamine neurons in vivo is reduced in D2R-OE mice, and normalized by switching off the transgene [37]. Disruption to presynaptic dopamine function in D2R-OE mice is also evidenced by a lack of recruitment of VTA dopamine neurons in a cognitive decision-making task [38]. Taken together, these results suggest that D2R overexpression results in reduced dopamine release, producing the same behavioral deficits associated with dopamine depletion and dopamine receptor antagonism. Measurements of extracellular dopamine during behavior confirmed this hypothesis.

In control subjects only, dopamine in the ventral striatum negatively correlated with the average amount of effort expended per reward earned. This is consistent with both a computational model of tonic dopamine during instrumental conditioning [50] and experimental data [41]. Together with our results, these studies suggest that dopamine tone in the ventral striatum provides a running estimate of current costs and benefits to guide goal-directed actions.
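The relationship described above can be illustrated with a minimal sketch, assuming that effort per reward is summarized as lever presses divided by rewards earned and that the association is assessed with a Pearson correlation. The function names, the choice of statistic, and all numbers below are hypothetical and serve only as an illustration; they are not data or analysis code from this study.

```python
from math import sqrt


def effort_per_reward(lever_presses, rewards_earned):
    """Average number of lever presses expended per reward earned."""
    if rewards_earned <= 0:
        raise ValueError("rewards_earned must be positive")
    return lever_presses / rewards_earned


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-subject values, for illustration only (not study data).
presses = [120, 200, 320, 450]
rewards = [12, 10, 8, 6]
dopamine_pct_baseline = [140.0, 130.0, 115.0, 105.0]

effort = [effort_per_reward(p, r) for p, r in zip(presses, rewards)]
r = pearson_r(effort, dopamine_pct_baseline)
print(f"effort per reward: {effort}; Pearson r = {r:.2f}")
```

In this toy example the subjects that expend more presses per reward show smaller dopamine increases, so the coefficient is negative, which is the direction of the relationship reported for control subjects.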

Dissection of effort-based versus value-based contributions to cost-benefit decision making

The cost-benefit decision-making task we used here (see Fig. 1) is an excellent assay for assessing general motivation for food reinforcement and hunger or satiation. However, because subjects choose between lever pressing (high effort) and foraging (low effort) in order to obtain milk (high value) or home cage chow (low value), observed differences in choice could result from an altered sensitivity to effort demands, an altered sensitivity to reward value, or both. Other paradigms can be used to measure sensitivity to reward, including sucrose preference tests [51]. Dissociating sensitivity to effort is more difficult. Most effort-based choice tasks involve choices that contrast both effort and reward, effectively testing cost-benefit decision making [52, 53]. A recently described effort task involved fixed rewards for varied amounts of physical effort, but the rats did not choose between variable-effort trials; instead, trials were scheduled by the experimenter for the purpose of correlating effort demand with recorded neuronal activity [54].

To help identify the most affected factors in D2R-OE mice, we employed tasks that we recently developed to isolate the specific impact of effort or reward on decision making.

Striatal D2R overexpression does not disrupt value-related decision making but alters the sensitivity to the effort demands of response number vs time

Our value-based choice assay determined that D2R-OE mice were as sensitive as controls to two different kinds of shifts in reward value, and that they adjusted their choice behavior accordingly. Using the CEC task, we generated effort choice functions by manipulating the work requirement of two work options (number of lever presses vs duration of hold) to see how subjects scaled the effort of one type of work against the other. Importantly, in this task the reward outcome was the same for the two work options, so the only variable manipulated was the cost (lever pressing or time) of each work option. Striatal D2R over-expression resulted in differential sensitivity to the cost of response number, as these mice strongly preferred maintaining a single sustained action over repeatedly initiating the action of lever pressing. This differential sensitivity to the effort demands of response number was substantiated using other behavioral paradigms and was largely normalized when the transgene was switched off.
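One way to summarize an effort choice function of this kind is by its indifference point, the requirement at which the two work options are chosen equally often. The following is a minimal sketch under stated assumptions: the press requirement is varied against a fixed hold requirement, preference is expressed as the proportion of press-option choices, and the crossing of 0.5 is found by linear interpolation. The function name and all values are hypothetical and are not taken from the analysis used in this study.

```python
def indifference_point(press_requirements, p_choose_press):
    """Press requirement at which the proportion of press choices crosses 0.5.

    Uses linear interpolation between adjacent tested requirements.
    Returns None if the choice function never crosses 0.5.
    """
    points = list(zip(press_requirements, p_choose_press))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        crosses = (p0 - 0.5) * (p1 - 0.5) <= 0
        if crosses and p0 != p1:
            return x0 + (0.5 - p0) * (x1 - x0) / (p1 - p0)
    return None


# Hypothetical choice function, for illustration only (not study data):
# preference for the press option falls as its requirement increases.
requirements = [5, 10, 20, 40]
proportion_press = [0.9, 0.7, 0.3, 0.1]
print(indifference_point(requirements, proportion_press))
```

Under these assumptions, a group with greater sensitivity to response number would show an indifference point shifted toward smaller press requirements for the same hold duration.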

Considerations for our novel effort-based and value-based choice paradigms

The fact that we observed genotype effects in our novel effort-based, but not value-based, choice tests validates the use of these tasks for isolating changes specific to each of these closely related behaviors. These tests are not rapid, requiring a minimum of 4 weeks of testing in addition to training time, but they do not require large groups of animals. The effects of genotype or experimental conditions were documented with an average of approximately 8 mice per group. Our task design presented reward and effort requirements in fixed or ascending order. We observed genotype effects using this design but cannot exclude the possibility that order effects preferentially affected one genotype. Adapting the tasks to include a random or counterbalanced order would minimize potential order effects.

Conclusions and implications

Patients with schizophrenia suffer from motivational deficits that are driven by difficulties in anticipating benefits and a reduction in effort allocation. Decreased effort allocation is particularly important because it inversely predicts functionality [8], happiness, life satisfaction and success [4]. Here we demonstrate that, in mice, overexpression of striatal D2 receptors specifically impacts effort-based choice and is associated with dampened extracellular dopamine during goal-directed activity. The blunting of motivation-stimulated dopamine tone observed in the ventral striatum of D2R-OE mice has relevance to findings in patients. The severity of negative symptoms (which include amotivation) negatively correlates with D2R occupancy in the striatum [55]. Also, PET studies suggest that hyperdopaminergia exists selectively in the head of the caudate (the associative striatum), with no dopamine excess in the limbic (ventral) striatum [55]. In fact, there is evidence to suggest that hypodopaminergia is prevalent in many brain regions outside of the associative striatum [56, 57].

Developing treatments for debilitating deficits in motivation requires a better understanding of the neural mechanisms underlying effort-related and value-related processes. Future rodent studies that employ behavioral strategies to specifically isolate effort and value variables, in the context of cell-type-specific and circuit-specific manipulations, could provide such information.