INTRODUCTION

Different forms of behavioral flexibility, mediated by anatomically dissociable regions of the prefrontal cortex (PFC), are critically dependent on interactions between these regions and their striatal afferents. For example, shifts between stimulus–reward associations within a particular dimension (ie reversal learning) are sensitive to lesions to either the orbitofrontal cortex (OFC) or the dorsal striatum (Divac et al, 1967; Dias et al, 1996; Schoenbaum et al, 2002; Ragozzino et al, 2002; McAlonan and Brown, 2003; Clarke et al, 2008). In rodents, more complex forms of flexibility entailing shifts between rules, strategies, or attentional sets, are dependent on neural circuits incorporating the medial PFC and the core of the nucleus accumbens (NAc) (Reading and Dunnett, 1991; Ragozzino et al, 1999; Birrell and Brown, 2000; Floresco et al, 2006a; Block et al, 2007). Moreover, the cortical and striatal nodes within these circuits appear to make distinct contributions to these processes. Disruptions of frontal lobe functioning typically induce perseverative deficits (Ragozzino et al, 1999; Block et al, 2007; Boulougouris et al, 2007). Conversely, manipulations of striatal outputs do not disrupt the ability to disengage from a previously relevant strategy, but do impair maintenance of a novel strategy once perseveration has ceased (Ragozzino et al, 2002; Floresco et al, 2006a; Block et al, 2007).

Dopamine (DA) is also important in facilitating behavioral flexibility mediated by these cortico-striatal circuits. Systemic blockade of D2 (but not D1) receptors in monkeys impairs reversal learning (Ridley et al, 1981; Lee et al, 2007), although the specific terminal regions where DA may be acting to facilitate this form of flexibility remain unclear. DA depletions in the PFC (including orbital regions) leave reversal learning intact (Crofts et al, 2001), and similar depletions in the striatum have yielded mixed effects (Collins et al, 2000; Crofts et al, 2001; O’Neill and Brown, 2007). In contrast, DA depletion of the dorsolateral PFC in primates made before training impairs the acquisition of an attentional set (Crofts et al, 2001), whereas, in rats, blockade of D1 or D2 receptors in the medial PFC before a set-shift induces severe perseverative impairments (Ragozzino, 2002; Floresco et al, 2006b). Interestingly, stimulation of PFC DA receptors does not affect these forms of behavioral flexibility (Fletcher et al, 2005; Floresco et al, 2006b), even though systemic administration of D2 receptor agonists does impede this form of executive functioning (Smith et al, 1999; Mehta et al, 2001; Boulougouris et al, 2009).

Comparatively less is known about the function of ventral striatal DA in reversal learning and set-shifting. This is somewhat surprising, given that anatomical and neurophysiological studies suggest that DA in this nucleus seems ideally suited to facilitate changes in behavior in unpredictable environments (Mogenson et al, 1993; Pennartz et al, 1994; Floresco, 2007, 2008). DA lesions of the NAc impaired spatial reversal learning; however, interpretation of these data was complicated by the fact that these lesions also retarded initial discrimination learning (Taghzouti et al, 1985). More recently, Goto and Grace (2005) reported that unilateral D1 receptor blockade in the NAc combined with contralateral hippocampal inactivation disrupted shifting between different strategies. Similar asymmetrical manipulations combining PFC inactivation with intra-NAc infusions of a D2 agonist induced a perseverative impairment in set-shifting. However, a thorough assessment of the contributions of NAc D1 and D2 receptor activity to different forms of behavioral flexibility is lacking. Thus, the present study investigated the effects of blockade and stimulation of D1 and D2 receptors in the NAc core, on these two complementary forms of flexibility.

MATERIALS AND METHODS

Subjects and Surgery

Long Evans rats (275–300 g; Charles River, Montreal, QC) were anesthetized with 100 mg/kg of ketamine hydrochloride and 7 mg/kg of xylazine and implanted with 23-gauge stainless-steel guide cannulae bilaterally into the NAc core (flat skull: AP=+1.5 mm, ML=±1.8 mm from bregma, and DV=−5.9 mm from dura) (Paxinos and Watson, 1998). Cannulae were held in place with stainless-steel screws and dental acrylic. Stainless-steel obturators (30-gauge) flush with the end of the guide cannulae were inserted after surgery. Each rat was given at least 7 days to recover from surgery before training. During this period, animals were food-restricted to 85% of their free-feeding weight and handled for at least 5 min per day.

Apparatus

Testing was conducted in four operant chambers (30.5 × 24 × 21 cm; Med Associates, St Albans, VT, USA), enclosed in sound-attenuating boxes with a fan to provide ventilation and mask outside noise. Two retractable levers were located on either side of a central food cup where food reinforcement (45 mg; BioServ, Frenchtown, NJ) was delivered by a pellet dispenser. A stimulus light was positioned above each lever. Each chamber was illuminated by a 100 mA houselight located in the top center of the wall opposite the levers. Four infrared photobeams, mounted 3 cm above the grid floor on the sides of each chamber, measured locomotor activity.

Initial Lever-Pressing Training

These training procedures have been described previously (Floresco et al, 2008). Briefly, over 2–3 days rats were trained to press each lever on a fixed-ratio 1 schedule to a criterion of at least 50 presses in 30 min. Over the next 5 days, they were trained to press the retractable levers within 10 s of their insertion into the chamber. The stimulus lights were not illuminated during these sessions.

Immediately after the last lever training session, the side bias for the rat was determined (Floresco et al, 2008). Here, every 20 s, both levers were inserted into the chamber simultaneously. Again, the stimulus lights were not illuminated. On the first trial, food was delivered after responding on either lever, but on subsequent trials, food was delivered only if the rat responded on the lever opposite to the one chosen initially. If the rat chose the same lever as the initial choice, no food was delivered, and the houselight was switched off. This continued until the rat chose the lever opposite to that chosen initially. After choosing both levers, a new trial commenced. Thus, one trial of the side-bias procedure consisted of responding on both levers (rats typically completed testing in 12–15 trials). The lever (right or left) that a rat responded on first during the initial choice of a trial was recorded and counted toward its bias. If the total number of responses on each lever were comparable, the lever that a rat chose initially four or more times over seven total trials was considered its side bias. However, if a rat made a disproportionate number of responses on one lever over the session (ie greater than a 2 : 1 ratio), that lever was considered its side bias. After side bias testing, rats received a mock infusion where injectors were placed in the guide cannulae for 2 min, but no infusion was administered.
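The side-bias rule described above can be sketched in a few lines of Python. This is only an illustration of the decision rule as stated in the text, not the software used in the study; the function name `side_bias`, its arguments, and the handling of ties are assumptions introduced here.

```python
from collections import Counter


def side_bias(initial_choices, total_presses, ratio=2.0):
    """Return 'left' or 'right' as the side bias (hypothetical helper).

    initial_choices: first lever pressed on each side-bias trial,
                     e.g. ['left', 'right', ...] (seven trials in the text).
    total_presses:   {'left': n, 'right': m}, all presses over the session.
    ratio:           'greater than 2:1' threshold for disproportionate responding.
    """
    left, right = total_presses["left"], total_presses["right"]
    # A disproportionate number of responses on one lever decides the bias.
    if left > ratio * right:
        return "left"
    if right > ratio * left:
        return "right"
    # Otherwise use the lever chosen first on the majority of trials
    # (four or more out of seven).
    counts = Counter(initial_choices)
    if counts["left"] == counts["right"]:
        # Assumption: a tie cannot occur with an odd number of trials.
        raise ValueError("tie in initial choices; rule is undefined")
    return "left" if counts["left"] > counts["right"] else "right"
```

With seven trials, a rat that chose the left lever first on four trials and pressed the levers about equally often would be assigned a left bias, whereas a rat with the same initial choices but more than twice as many right presses overall would be assigned a right bias.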

Drugs and Microinfusion Procedure

We tested the effects of infusions of selective D1 and D2 antagonists and agonists on strategy set-shifting and reversal learning. Specifically, we used the D1 antagonist SCH23390 (0.1, 1.0 μg; Sigma-Aldrich Canada, Oakville, ON, Canada), the D2 antagonist eticlopride (0.1, 1.0 μg; Sigma-Aldrich), the D1 agonist SKF81297 (0.2, 1.0, 2.0 μg; Tocris Biosciences, Ellisville, MO), and the D2 agonist quinpirole (0.1, 1.0, 10.0 μg; Sigma-Aldrich). Doses of these drugs were chosen from previous studies demonstrating these compounds to be behaviorally active when infused into the NAc or PFC at the higher doses (Floresco and Phillips, 1999; Goto and Grace, 2005; Floresco et al, 2006b; Pattij et al, 2007). All drugs were mixed in saline.

Infusions into the NAc were made through 30-gauge injection cannulae attached to a 10 μl syringe with polyethylene tubing. Injectors extended 0.8 mm below the guide cannulae. Saline or drugs were infused at a volume of 0.5 μl over 72 s by a microsyringe pump. Injection cannulae were left in place for an additional 1 min, after which the rat was placed in the chamber. Behavioral testing commenced 10 min later. Unless otherwise specified, rats received a mock infusion before the initial discrimination (visual cue or response learning on day 1) and a saline/drug infusion before the set-shift or reversal on day 2.

General Experimental Design and Procedures

Each test was conducted over 2 days, where rats received an initial discrimination session on day 1 and a set-shift or a reversal on day 2. We tested the effects of D1 and D2 antagonists and agonists on shifts from a visual cue to a response strategy, or reversal of a response discrimination. Our previous experience has shown that most rats can acquire these shifts within one training session (150 trials). Rats were typically tested in cohorts of 8–16 animals. Within each cohort, we always included at least 2–3 rats that received saline treatments. We did not test rats on the response-to-cue shift because this procedure typically yields highly variable data in control animals, and is not as sensitive to disruption following PFC inactivation (Floresco et al, 2008). To confirm that any effects on set-shifting were specifically attributable to a disruption in this form of behavioral flexibility, we determined a priori that we would assess the effects of the behaviorally effective doses of all four drugs on reversal learning of a response discrimination. We further determined that for any treatment effective at disrupting performance during both the set-shift and reversal, we would also test the effects of these treatments on the initial learning of a response discrimination. The set-shifting experiments (Experiments 1A and 1B) were conducted in a sequential order over several months, testing squads of rats with all doses of the DA antagonists first, and then agonists in subsequent squads. Separate groups of saline-treated rats were tested at the same time as those in the DA antagonists and agonists groups, permitting us to form separate control groups for each. The reversal experiments (Experiments 2 and 3) used substantially fewer animals, allowing us to use one saline-treated control group for each study.

Strategy Set-Shifting

Day 1: visual-cue discrimination

We used a procedure modified slightly from one described previously (Floresco et al, 2008). Rats were initially trained to press one of the two levers that had a stimulus cue-light illuminated above it to obtain food (Figure 1a). A session began in darkness with the levers retracted. Trials began every 20 s with illumination of one of the stimulus lights; 3 s later, the houselight was turned on and both levers were extended. In every pair of trials, the left or right stimulus light was illuminated once, and the order within the pair was random. Rats had 10 s to press one of the levers, which caused both to be retracted. Failure to choose either lever within 10 s resulted in the retraction of both (omission). A response on the lever under the illuminated light (a correct response) resulted in delivery of a food pellet, and the houselight was switched off 4 s later. Choice of the other lever caused both levers to be retracted and the houselight to be switched off, and no food was delivered. On each trial, the lever that the animal chose and the position of the stimulus light were recorded, as were response latencies. Trials continued until either (1) a rat had received a minimum of 30 trials and achieved criterion performance of 10 consecutive correct responses or (2) 150 trials had been completed, whichever came first. Omission trials were not included in the trials to criterion measure.
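The stopping rule for a session can be illustrated with a short Python sketch. It is a reading of the criterion as worded above rather than the authors' control program. The function name and outcome labels are hypothetical, and three points are assumptions: omissions do not break a run of correct responses, the 30-trial minimum refers to scored (non-omission) trials, and the 150-trial ceiling refers to presented trials.

```python
def trials_to_criterion(outcomes, run_length=10, min_trials=30, max_trials=150):
    """Return (scored_trials, reached_criterion) for one session.

    outcomes: per-trial labels in order: 'correct', 'error' or 'omission'.
    Omissions are skipped when counting trials to criterion.
    """
    scored = 0   # non-omission trials so far
    streak = 0   # current run of consecutive correct responses
    for presented, outcome in enumerate(outcomes, start=1):
        if presented > max_trials:
            break  # session ceiling reached
        if outcome == "omission":
            continue  # assumption: omissions neither count nor reset the run
        scored += 1
        streak = streak + 1 if outcome == "correct" else 0
        if streak >= run_length and scored >= min_trials:
            return scored, True
    return scored, False
```

Under these assumptions, a rat that responds correctly from the first trial still receives 30 trials, and a rat that never strings together 10 correct responses is stopped at the session ceiling without reaching criterion.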

Figure 1

Diagram of the types of discriminations used for the strategy set-shifting procedures conducted in an operant chamber. During visual-cue discrimination learning (a), rats were required to always press a lever that had a stimulus light illuminated above it. For response discrimination learning (b), rats were trained to always press one of the levers (eg left) regardless of the position of the cue light.

Day 2: set-shift to response discrimination

Animals were matched based on the performance on day 1 and assigned to either a saline or a drug group. The day following the visual-cue task, rats were trained on a response discrimination (Figure 1b). Here, rats were required to disengage from the previously relevant visual-cue strategy and instead use a response strategy by pressing the lever opposite their side bias (left or right). Trials were given in a manner identical to visual-cue training and continued every 20 s until a rat achieved criterion performance of 10 correct consecutive choices, within a maximum of 150 trials (ie a 50 min session maximum). If a rat did not achieve criterion within this allotted number of trials, its data were included and it was given a score of 150 trials.

Response Reversal Learning

Day 1: initial response discrimination

For this experiment, rats were initially trained on the response discrimination, in a manner identical to that described above. Rats were required to press the lever opposite their side bias, regardless of the position of the visual-cue lights, which in this experiment served as distracters. Trials continued until a rat achieved criterion of 10 correct consecutive choices. There was no limit to the number of trials rats received to achieve this criterion.

Day 2: response reversal

As in the set-shifting experiment, rats were matched based on the performance on day 1 and assigned to either a saline or a drug group. On day 2, they were trained on a reversal of the discrimination, where a correct choice required a press of the lever opposite to that which was reinforced on day 1. All other aspects of training were identical to those used on day 1. Trials continued every 20 s until a rat achieved criterion performance of 10 correct consecutive choices, within a maximum of 150 trials.

Error Analysis

Errors committed during the set-shift were broken down into three error subtypes. A ‘perseverative’ error was scored when a rat responded on a lever with the stimulus light illuminated above it on trials that required the rat to press the opposite lever during the initial phases of the set-shift. For example, during the set-shift, the rat may have been required to always press the left lever. A perseverative error was scored after a right lever press when the stimulus light was illuminated above it. Eight out of every sixteen consecutive trials required the rat to respond in this manner. In a manner similar to that described previously (Ragozzino, 2002; Floresco et al, 2006a, 2006b, 2008), these trials were separated into consecutive blocks of eight trials each. Perseverative errors were scored when a rat pressed the incorrect lever on six or more out of eight trials per block. Once a rat made five or fewer perseverative errors in a block for the first time, all subsequent errors of this type were no longer counted as perseverative, because at this point it was using the original strategy on 62.5% of trials or fewer. Instead, they were now scored as ‘regressive’, and were scored as such even if a rat made more than five such errors in subsequent blocks. ‘Never-reinforced errors’ were scored when a rat pressed the incorrect lever on trials where the visual-cue light was illuminated above the same lever that the rat was required to press during the set-shift. Regressive and never-reinforced errors are used as indices of maintaining and acquiring a new strategy, respectively.
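The three-way scoring of set-shift errors can be made concrete with a small Python sketch. It follows the rule as worded above but is not the authors' analysis code. The function and key names are hypothetical, omission trials are assumed to have been removed beforehand, errors in the first block that falls to five or fewer are assumed to count as regressive, and an incomplete final block is scored with the same threshold.

```python
def classify_set_shift_errors(trials, correct_lever, block_size=8, persev_min=6):
    """Count perseverative, regressive and never-reinforced errors.

    trials: ordered (light_side, pressed_lever) tuples, omissions removed.
    correct_lever: the lever required during the set-shift ('left' or 'right').
    """
    counts = {"perseverative": 0, "regressive": 0, "never_reinforced": 0}
    conflict_errors = []  # one flag per trial with the light over the wrong lever
    for light_side, pressed in trials:
        if light_side == correct_lever:
            # Light and response rule agree: an error here was never reinforced.
            if pressed != correct_lever:
                counts["never_reinforced"] += 1
        else:
            # Light is over the incorrect lever: following it is an old-strategy error.
            conflict_errors.append(pressed != correct_lever)
    still_perseverating = True
    for start in range(0, len(conflict_errors), block_size):
        errors = sum(conflict_errors[start:start + block_size])
        if still_perseverating and errors >= persev_min:
            counts["perseverative"] += errors
        else:
            # First block below threshold ends perseveration for good.
            still_perseverating = False
            counts["regressive"] += errors
    return counts
```

Note that once the flag flips, a later block with seven old-strategy errors is still scored as regressive, matching the statement that such errors "were scored as such even if a rat made more than five such errors in subsequent blocks".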

Reversal learning errors were subdivided into perseverative and regressive subtypes and analyzed over blocks of 16 trials. Perseverative errors were scored when rats made an incorrect response, pressing the lever reinforced during initial response discrimination training on day 1. Once a rat made fewer than 10 perseverative errors within a block of 16 trials for the first time, all subsequent errors were scored as regressive. We also compared the number of errors committed on trials when the stimulus light was illuminated above the incorrect (toward distracter) or correct (away from distracter) lever.
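A comparable sketch for the reversal analysis is given below, again as an illustration under stated assumptions rather than the original analysis code. Names are hypothetical, omissions are assumed to be excluded, blocks are formed over all scored trials, and errors in the first block with fewer than 10 errors are assumed to be regressive. Each error is also tallied by whether the distracter light was over the incorrect lever (toward distracter) or the correct lever (away from distracter).

```python
def classify_reversal_errors(trials, correct_lever, block_size=16, persev_min=10):
    """Count reversal errors by subtype and by distracter position.

    trials: ordered (light_side, pressed_lever) tuples, omissions removed.
    correct_lever: the lever reinforced during the reversal.
    """
    counts = {
        "perseverative": 0,
        "regressive": 0,
        "toward_distracter": 0,
        "away_from_distracter": 0,
    }
    still_perseverating = True
    for start in range(0, len(trials), block_size):
        block = trials[start:start + block_size]
        # An error is any press of the lever that was reinforced on day 1.
        error_lights = [light for light, pressed in block if pressed != correct_lever]
        if still_perseverating and len(error_lights) >= persev_min:
            counts["perseverative"] += len(error_lights)
        else:
            still_perseverating = False
            counts["regressive"] += len(error_lights)
        for light in error_lights:
            if light != correct_lever:
                counts["toward_distracter"] += 1    # light over the incorrect lever
            else:
                counts["away_from_distracter"] += 1  # light over the correct lever
    return counts
```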

Data Analysis

Trials and errors to criterion were analyzed separately using two-way ANOVAs, with Treatment as the between-subjects factor and Choice (correct/incorrect) or Error Type as a within-subjects factor. Significant main effects of Treatment were followed up with multiple comparisons using Dunnett's test. Subsequent targeted analyses comparing the number of each error subtype were conducted on data obtained from treatment groups that differed from controls in the overall analysis, to minimize the number of comparisons. Response latencies were analyzed with two-way ANOVAs, using Treatment as a between-subjects factor, and Test Day (day 1 or 2) as a within-subjects factor, to account for any baseline differences between groups. Locomotor activity was indexed by the number of photobeam breaks divided by the time to complete the session (beam breaks per minute) and was analyzed with one-way ANOVAs.
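For readers who want to see the arithmetic behind the locomotor index and the one-way ANOVA, a minimal Python sketch follows. It uses the textbook formula for the F ratio and is not the statistical package used in the study (which is not named in the text); the function names are hypothetical, and the follow-up Dunnett comparisons are not shown.

```python
def beam_breaks_per_minute(beam_breaks, session_seconds):
    """Locomotor index: photobeam breaks divided by session duration in minutes."""
    return beam_breaks / (session_seconds / 60.0)


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for group in groups for x in group]
    grand_mean = sum(all_scores) / len(all_scores)
    means = [sum(group) / len(group) for group in groups]
    # Between-groups sum of squares: group size times squared deviation of its mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups sum of squares: squared deviations from each group's own mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within
```

The degrees of freedom depend only on the number of groups and the total number of animals, so five groups totaling 46 rats give F(4,41) and seven groups totaling 59 rats give F(6,52).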

Histology

After completion of behavioral testing, rats were killed in a CO2 chamber. Brains were removed and fixed in a 4% formalin solution. Brains were frozen, sliced in 50 μm sections, mounted, and stained with cresyl violet (see Supplementary Figure 1).

RESULTS

Experiment 1A: Blockade of NAc D1 and D2 Receptors and Strategy Set-Shifting

A total of 46 rats with acceptable placements were included in the data analysis. These rats acquired the discrimination on day 1 in 48±5 trials. Rats were assigned to saline or drug groups based on their performance during visual-cue training, so that there were no differences between groups on this measure (F(4,41)=0.50, NS; Figure 2a).

Figure 2

Experiment 1A; blockade of D1, but not D2, receptors in the nucleus accumbens (NAc) impairs strategy set-shifting. Data are expressed as mean+SEM. (a) Trials to criterion during initial visual-cue discrimination, when rats in all groups received a mock infusion before training. (b) Trials and (c) errors to criterion on the set-shift to the response strategy following intra-NAc infusions of either saline, or different doses of the D1 antagonist SCH23390 (SCH) or the D2 antagonist eticlopride (Etic). Stars denote p<0.05 significant difference vs saline. (d) Analysis of the type of errors committed during the set-shift by rats receiving the 1.0 μg dose of SCH23390. Blockade of D1 receptors in the NAc did not enhance perseveration, but did increase the number of regressive errors, indicative of an impairment in the maintenance of a novel strategy. Dagger denotes p=0.058 vs saline.

The effects of infusions of D1 and D2 receptor antagonists into the NAc core on shifting from a visual cue to a response strategy are presented in Figure 2b–d. All control rats (n=10) achieved criterion performance of 10 correct consecutive choices in fewer than 150 trials, whereas 2 rats that received the 1.0 μg dose of the D1 antagonist SCH23390 and 1 rat that received the 1.0 μg dose of the D2 antagonist eticlopride did not. Analysis of these data revealed a significant effect of Treatment on both trials (F(4,41)=2.61, p<0.05; Figure 2b) and errors to criterion (F(4,41)=2.87, p<0.05; Figure 2c). Multiple comparisons revealed that rats receiving the 1.0 μg (n=11), but not the 0.1 μg (n=8), dose of SCH23390 required significantly more trials (p<0.05) and made more errors (p<0.05) compared to controls. In contrast, blockade of D2 receptors with 1.0 μg (n=9) or 0.1 μg (n=8) of eticlopride did not significantly alter these measures of performance. Subsequent analyses on the types of errors made during the set-shift by rats receiving 1.0 μg SCH23390 indicated that these treatments did not alter perseverative or never-reinforced errors (both F's<1.0, NS). However, blockade of D1 receptors in the NAc did result in an increase in the number of regressive errors; analysis of these data revealed a trend toward statistical significance (F(1,19)=4.08, p=0.058; Figure 2d). Thus, the maintenance of novel strategies during set-shifting is facilitated by D1, but not D2, receptor activity in the NAc core.

Examination of the response latencies revealed a significant Treatment × Test Day interaction (F(4,41)=3.38, p<0.05). Simple main effects analysis further indicated that 1.0 μg eticlopride significantly (p<0.05) increased response latencies relative to controls, whereas all other treatment groups did not differ (Table 1). Infusions of all doses of the D1 and D2 antagonists decreased the number of photobeam breaks per minute compared to saline (F(4,41)=9.41, p<0.001 and Dunnett's, p<0.01; Table 1). However, there were no differences between groups in the number of trial omissions (F(4,41)=1.41, NS). These results suggest that although D2 receptor blockade did not interfere with set-shifting, these doses of eticlopride were effective at reducing overall locomotor activity and slowing response latencies.

Table 1 Mean (±SEM) Locomotor Activity and Response Latencies During the Set-Shift or Reversal in Experiments 1 and 2

Experiment 1B: Stimulation of NAc D1 and D2 Receptors and Strategy Set-Shifting

A total of 59 rats with acceptable placements were included in the data analysis, acquiring the discrimination on day 1 in 45±4 trials. As in Experiment 1A, rats were assigned to either a saline or drug treatment group based on their performance during the visual-cue training, so that there were no differences between groups on this measure (F(6,52)=1.0, NS; Figure 3a).

Figure 3

Experiment 1B; stimulation of D2, but not D1 receptors in the nucleus accumbens (NAc) impairs strategy set-shifting. (a) Trials to criterion during the initial visual-cue discrimination, when rats in all groups received a mock infusion before training. (b) Trials and (c) errors to criterion on the set-shift to the response strategy following intra-NAc infusions of either saline, or different doses of the D1 agonist SKF81297 (SKF) or the D2 agonist quinpirole (Quin). (d) Analysis of the type of errors committed in Experiment 1B during the set-shift by rats receiving the 10 and 1.0 μg doses of quinpirole. Stimulation of D2 receptors in the NAc caused a pronounced increase in perseverative errors. Stars denote p<0.05 significant difference vs saline.

Of the 12 rats that received saline infusions before the set-shift, all but one achieved criterion in fewer than 150 trials. In contrast, 4 out of 10 rats in the 10 μg quinpirole group and 3 out of 9 in the 1.0 μg quinpirole group did not achieve criterion within the allotted number of trials. Rats in all other drug treatment conditions were able to complete the shift within this period. Analysis of these data revealed a significant effect of Treatment on both trials (F(6,52)=3.35, p<0.01; Figure 3b) and errors to criterion (F(6,52)=3.76, p<0.01; Figure 3c). Multiple comparisons revealed that infusions of the D1 agonist SKF81297 were ineffective at altering performance during the set-shift. None of the 2.0 μg (n=8), 1.0 μg (n=6), or 0.2 μg (n=8) doses of SKF81297 induced a significant change in trials or errors to criterion relative to saline-treated rats, although rats treated with the lowest dose required numerically fewer trials to reach criterion compared to all other groups. In stark contrast, infusions of the 10 μg dose of quinpirole caused a pronounced impairment in set-shifting, leading to a significant (p<0.01) increase in the number of trials to criterion. Furthermore, both the 10 and 1.0 μg doses of quinpirole increased the number of errors committed during the set-shift (both p's<0.01), whereas the 0.1 μg dose (n=6) did not affect either measure relative to the control group. Subsequent targeted analyses on the types of errors made during the set-shift following quinpirole infusions revealed an impairment distinct from that induced by D1 receptor blockade. Specifically, infusions of the 10 μg dose induced a pronounced increase in the number of perseverative errors (F(2,28)=3.14 and Dunnett's, p<0.05), without affecting regressive or never-reinforced errors (all F's<1.3, NS). The effect of the 1.0 μg dose on perseverative errors did not achieve statistical significance. There were no differences between treatment groups with respect to response latencies (F(6,52)=0.77, NS; Table 1), locomotor activity (F(6,52)=1.08, NS; Table 1), or trial omissions (F(6,52)=0.41, NS). These findings indicate that excessive stimulation of D2, but not D1, receptors in the NAc core severely impairs strategy set-shifting by disrupting the ability to disengage from a previously relevant strategy.

Experiment 2: Blockade or Stimulation of NAc D1 and D2 Receptors and Reversal Learning

This experiment was conducted to clarify whether the effects of alterations in NAc DA transmission were specific to strategy set-shifting, or attributable to a more fundamental disruption in behavioral flexibility. We assessed the effect of behaviorally active doses of D1 and D2 receptor antagonists and agonists on a reversal of a response discrimination, requiring rats to use the same basic strategy (always press a lever in one position), but with a shift in the specific stimulus–reward contingency. We used the highest doses of the D1 and D2 antagonists used in Experiment 1A (1 μg). For the D1 agonist SKF81297, we used the lowest dose tested in Experiment 1B (0.2 μg), because this was the only one to cause any noticeable, albeit nonsignificant, change in performance. For the D2 agonist quinpirole, we used the middle, 1 μg dose in an attempt to maximize receptor selectivity with this drug. This dose was as effective as the higher, 10 μg dose at increasing the number of errors during the set-shift.

Forty-three rats with acceptable placements in the NAc core that received infusions of either saline, DA antagonists, or agonists were included in the data analysis. These rats acquired the initial response discrimination on day 1 in an average of 90±5 trials. Again, rats were assigned to a treatment group based on their day 1 performance, so that there were no differences between groups (F(4,38)=0.15, NS; Figure 4a).

Figure 4

Experiment 2; the effects of blockade and stimulation of nucleus accumbens (NAc) dopamine (DA) receptors on reversal learning. (a) Trials to criterion during the initial response discrimination. (b) Trials to criterion during the response reversal following intra-NAc infusions of either saline, or different doses of the D1 or D2 antagonists and agonists. Only infusions of the D2 agonist quinpirole impaired reversal learning. (c) Analysis of the type of errors committed in Experiment 2 during the reversal by rats receiving the 1.0 μg dose of quinpirole. In this experiment, stimulation of D2 receptors in the NAc increased regressive (regress), rather than perseverative (persev) errors. This impairment was associated with an increase in errors committed on trials when the visual-cue light was illuminated above the incorrect (toward distracter) rather than the correct lever (away from distracter). Stars denote p<0.05 significant difference vs saline.

The effects of infusions of saline, D1 and D2 antagonists and agonists into the NAc core on reversal learning are presented in Figure 4b and c. From a nonparametric perspective, only two rats in the control group (n=14) were unable to achieve criterion performance of 10 correct consecutive choices within 150 trials. Two rats that received infusions of SCH23390 (n=6) also failed to achieve criterion, whereas all rats that received infusions of eticlopride (n=9) achieved criterion within this period. Similarly, only one rat in the SKF81297 group (n=6) did not make 10 consecutive correct choices within 150 trials. In contrast, of the rats that received infusions of quinpirole (n=8), six failed to achieve criterion. Analysis of the trials to criterion data yielded a significant effect of Treatment (F(4,38)=2.64, p<0.05). As opposed to what was observed in Experiment 1A, infusions of neither the D1 antagonist SCH23390 nor the D2 antagonist eticlopride impaired reversal learning. Similarly, rats receiving infusions of the D1 agonist SKF81297 reached criterion in a comparable number of trials relative to controls. However, infusions of the D2 agonist quinpirole significantly (p<0.01) increased this measure relative to saline-treated rats. In this experiment, we did not observe a significant difference between groups in locomotor activity (F(4,38)=1.03, NS; Table 1) or omissions (F(4,38)=2.09, NS). However, a significant effect of Treatment on response latencies was obtained (F(4,38)=3.19, p<0.05; Table 1), attributable to the fact that infusions of both D1 and D2 antagonists significantly increased latencies relative to saline-treated rats (p<0.05). Infusions of DA agonists did not affect this measure. Thus, as opposed to the effects on set-shifting, blockade of neither D1 nor D2 receptors in the NAc core impaired shifting between different stimulus–reward associations. Likewise, stimulation of D1 receptors did not affect reversal learning. In contrast, excessive activation of D2 receptors in the NAc core induced a marked impairment in reversal learning.

Analysis of the error data indicated that D2 receptor stimulation induced a statistically significant increase in regressive (F(1,20)=4.43, p<0.05), but not perseverative, errors (F(1,20)=0.87, NS; Figure 4c). Further insight into the nature of this deficit was obtained from a separate analysis on the number of errors committed when the distracter light was illuminated above either the incorrect (toward distracter) or correct (away from distracter) lever. Infusions of quinpirole led to a disproportionate number of errors on trials when the stimulus lights were illuminated above the incorrect lever relative to the control group (F(1,20)=9.21, p<0.01), but not when the light was above the correct lever (F(1,20)=1.55, NS; Figure 4c). This pattern of errors suggests that impairments in reversal learning induced by excessive stimulation of NAc D2 receptors may be attributable to rats having attempted to use a visual-cue strategy, even though these cues had not been reliably associated with reward.

Experiment 3: Stimulation of NAc D2 Receptors and Initial Response Discrimination Learning

Increasing D2 receptor activity in the NAc impaired the acquisition of a response discrimination on day 2 regardless of whether rats learned a visual-cue-based rule (Experiment 1B) or were trained to respond on a different lever (Experiment 2) on day 1. In Experiment 3, we tested the effects of infusions of 1 μg quinpirole on the initial learning of a response discrimination. Animals were trained in a manner identical to those in Experiment 2. Again, on each trial, a stimulus light was illuminated above one of the levers in a pseudorandom manner. As opposed to the previously described findings, infusions of the D2 agonist into the NAc core (n=8) did not disrupt the learning of a response discrimination on day 1 relative to saline-treated rats (n=10). No significant difference between the groups was observed on the number of trials required to make 10 correct consecutive choices (F(1,16)=0.41, NS; Figure 5a). Furthermore, under these conditions, infusions of quinpirole did not affect the number of errors made on trials when the distracter light was illuminated above the incorrect lever (26.5±3) compared to saline-treated rats (24.6±3; F(1,16)=0.16, NS). Similarly, the groups did not differ in terms of locomotor activity, response latencies, or omissions (all F's<1.8, NS). The following day, all rats received a saline infusion and were trained on a reversal. Again, both groups achieved criterion in a comparable number of trials (F(1,16)=0.01, NS; Figure 5b). There were no differences in locomotor activity, response latencies, or omissions (all F's<0.67, NS). Viewed collectively, these findings suggest that excessive activation of D2 receptors in the NAc core does not impair learning of a response discrimination. Rather, these treatments appear to disrupt the ability of rats to learn novel discriminations that conflict with previously acquired strategies or stimulus–reward associations.

Figure 5

Experiment 3; stimulation of nucleus accumbens (NAc) D2 receptors does not affect initial learning of a response discrimination. (a) Trials to criterion during the initial response discrimination for rats receiving intra-NAc infusions of either saline or the D2 agonist quinpirole. (b) Trials to criterion during the response reversal following intra-NAc infusions of saline for rats in both treatment groups. Infusions of quinpirole before initial response discrimination training did not affect learning or reversal of this discrimination.
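The trials-to-criterion measure reported for this experiment (the number of trials required to make 10 correct consecutive choices) can be illustrated with a minimal sketch. The function name, the example session, and the decision to include the trials of the criterion run itself in the count are assumptions made for illustration; they are not taken from the analysis procedures used in this study.

```python
from typing import Optional, Sequence


def trials_to_criterion(outcomes: Sequence[bool], run_length: int = 10) -> Optional[int]:
    """Return the 1-indexed trial on which the criterion run is completed.

    outcomes: per-trial results in session order (True = correct choice).
    run_length: consecutive correct choices required (10 in the text).
    Returns None if the criterion is never reached.
    Assumption: the trials forming the criterion run are included in the count.
    """
    streak = 0
    for trial_number, correct in enumerate(outcomes, start=1):
        # An error resets the running count of consecutive correct choices.
        streak = streak + 1 if correct else 0
        if streak == run_length:
            return trial_number
    return None


# Hypothetical session: a few early errors, then 10 correct choices in a row.
session = [True, False, True, True, False, False] + [True] * 10
print(trials_to_criterion(session))  # prints 16
```

Under a convention that excludes the criterion run from the count, the same session would score 6 rather than 16, so the convention should be stated when comparing values across studies.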

DISCUSSION

Here we report that pharmacological blockade of D1, but not D2, receptors in the NAc core retards the ability to shift between conflicting discrimination strategies, suggesting that normal D1 tone in the ventral striatum facilitates these processes. However, DA receptor blockade does not interfere with a simpler form of flexibility requiring shifts between stimulus–reward associations within a particular dimension. In contrast, supranormal stimulation of D2, but not D1, receptors induces a more fundamental deficit, impeding both set-shifting and reversal learning, in a manner qualitatively different from that observed following D1 receptor blockade.

As opposed to other attentional set-shifting tasks (Birrell and Brown, 2000), the strategy shifting task used here places a heavier emphasis on response conflict, because the same set of stimuli is presented during the initial discrimination and the set-shift, although it also requires a shift of attention from one stimulus dimension to another (ie visual cue to spatial position). Thus, these procedures do not readily permit an assessment of the mechanisms underlying formation and shifting between attentional sets, but can be used to assess shifting between different discrimination strategies, which presumably incorporates attentional set-shifting as a component process (Slamecka, 1968). It is important to note that behavioral flexibility is not a unitary phenomenon, but may be viewed as a hierarchical construct, ranging from simpler (extinction, reversal learning) to more complex (set-shifting) mechanisms for adjusting behavior. Moreover, these forms of flexibility can be further subdivided into component processes that, using the present task, can be assessed with a detailed analysis of the type of errors rats make during the shift. Our findings that D1 receptor blockade and D2 receptor stimulation in the NAc induced dissociable patterns of errors further substantiate the notion that different neural mechanisms mediate the ability to suppress the use of irrelevant strategies and acquire and maintain novel modes of responding (Block et al, 2007).

A Selective Role for NAc D1 Receptors in Strategy Set-Shifting

The present data complement the findings that unilateral infusions of a D1 antagonist combined with contralateral hippocampal inactivations induce nonperseverative deficits in shifting between different discrimination strategies, assessed with a maze-based set-shifting procedure (Goto and Grace, 2005). In this study, D1 antagonism did not increase perseveration, but instead increased regressive errors. This expands on our previous report that NAc core inactivation causes similar, nonperseverative impairments in set-shifting, and further substantiates the notion that neural activity within this nucleus, in conjunction with D1 receptor tone, facilitates the maintenance of novel strategies once perseveration has ceased (Floresco et al, 2006a). The D1 antagonist used in these studies (SCH23390) has affinity for both D1 and 5HT2A receptors. However, the observation that systemic blockade of 5HT2A receptors with M100907 produces no apparent impairment in set-shifting (Rodefer et al, 2008) suggests that the effects of SCH23390 reported here are likely attributable to actions on NAc D1 receptors.

Importantly, D1 receptor blockade did not affect reversal learning, where rats were required to use the same basic strategy, but had to shift responding from one lever to another. This lack of effect would indicate that impairments in shifting between strategies observed in Experiment 1A were not caused by general disruptions in the ability to adjust behavior upon changes in reinforcement contingencies. Rather, NAc D1 receptors appear to facilitate selectively higher-order forms of response flexibility, requiring fundamental changes in the manner that animals discriminate between complex stimuli associated with reward, but not during intradimensional shifts between different stimulus–reward contingencies. Notably, the rodent medial PFC also has a selective function in set-shifting, but not in reversal learning (Ragozzino et al, 1999; Birrell and Brown, 2000; Floresco et al, 2008). The fact that disconnection between the PFC and the NAc also disrupts strategy set-shifting (Block et al, 2007) suggests that D1 receptor tone modulates PFC-evoked activity of NAc neurons. This may occur through presynaptic modulation of cortico-striatal inputs (Nicola et al, 1996) or attenuation of lateral inhibition, which may diminish competitive interactions between ensembles of NAc neurons (Taverna et al, 2005).

Recent in vivo neurochemical studies provide additional insight into the function of NAc D1 receptors in set-shifting (Stefani and Moghaddam, 2006). These investigators observed that tonic levels of NAc DA displayed a relatively small increase during acquisition of an initial discrimination, but showed a much more pronounced increase when animals were required to shift strategies on a cross-maze. Thus, mesoaccumbens DA transmission may be particularly sensitive to unexpected changes in reinforcement contingencies. Under these conditions, activation of NAc D1 receptors by increased tonic DA may serve to stabilize the use of novel response strategies upon disengagement from previously effective, but now incorrect strategies. Stabilization of new strategies may be enabled by D1 receptor-mediated increases in cortico-striatal synaptic strengths (Floresco et al, 2001; Schotanus and Chergui, 2008) that could occur over the course of new learning during the set-shift. Changes in phasic DA signaling may also contribute to the effects of D1 receptor activation in facilitating set-shifting, given that reward-related stimuli promote DA transients in the NAc (Day et al, 2007; Sunsay and Rebec, 2008). However, phasic DA activity in the ventral striatum is thought to act primarily on receptors located in or around DA synapses, whereas increases in tonic DA are believed to affect extrasynaptic receptors (Floresco, 2007). In this regard, the majority of striatal D1 receptors are localized extrasynaptically, not within DA synapses (Caillé et al, 1996; Hara and Pickel, 2005), as opposed to D2 receptors that reside at both synaptic and extrasynaptic sites (Sesack et al, 1994). When viewed collectively, these ultrastructural findings suggest that stimulation of D1 receptors through volume transmission of tonic (ie extrasynaptic) DA may be the more predominant means through which these receptors are activated to facilitate shifting between strategies.

D2 receptor blockade did not affect set-shifting or reversal learning, although these treatments did decrease locomotion and increase response latencies. This is in keeping with other studies reporting that NAc D1 receptors mediate response accuracy whereas D2 receptors have a greater role in motivational aspects of performance (Floresco and Phillips, 1999; Floresco, 2007; Pezze et al, 2007). The lack of effect of eticlopride on either type of shift contrasts with the effect of systemic D2 receptor blockade, which has been shown to impair reversal learning (Ridley et al, 1981; Lee et al, 2007). The present results suggest that dopaminergic modulation of reversal learning likely occurs through actions on D2 receptors located in regions other than the NAc. In this regard, although the OFC is critical in facilitating reversal shifts (Dias et al, 1996; McAlonan and Brown, 2003; Ragozzino, 2007; Ghods-Sharifi et al, 2008), mesocortical DA depletion does not affect this form of flexibility (Crofts et al, 2001). The possibility remains that certain subregions of the dorsal striatum may be the critical locus where D2 receptors enable shifts between different stimulus–reward contingencies, although striatal DA depletion has yielded inconsistent effects on reversal learning (Collins et al, 2000; Crofts et al, 2001; O’Neill and Brown, 2007). Nevertheless, the potential exists for an intriguing double dissociation between the contribution of striatal DA receptors and different forms of behavioral flexibility. D2 receptors in the dorsal striatum may facilitate reversal shifts within a stimulus dimension, whereas ventral striatal D1 receptors contribute to shifts between different discrimination strategies. Clearly, further research is required to elucidate the mechanisms through which striatal DA mediates these complementary forms of behavioral flexibility.

Excessive D2, but not D1, Receptor Activation Induces Fundamental Disruptions in Behavioral Flexibility

Supranormal stimulation of NAc D1 receptors neither improved nor impaired strategy and reversal shifts, although rats tended to make fewer set-shifting errors after treatment with the lowest dose. The lack of effect of SKF81297 on set-shifting contrasts with the impairments induced by a D1 receptor antagonist. One potential reason for these asymmetrical results may merely be related to a floor effect. As discussed above, D1 receptor activity appears to have a selective function in maintenance of novel strategies, indexed by the number of regressive errors made during the shift. Using these procedures, control animals typically make relatively few of these types of errors (<10), rendering it difficult to observe a statistically significant improvement in performance using this measure. Yet, it is interesting to note that similar asymmetrical effects of D1 agents on set-shifting have been reported following pharmacological blockade or stimulation of these receptors within the PFC. Blockade of prefrontal D1 receptors impaired set-shifting (Ragozzino, 2002) whereas infusions of D1 agonists were without effect (Fletcher et al, 2005; Floresco et al, 2006b). This is in contrast to the effects of mesocortical D1 receptor manipulations on working memory, with D1 receptor blockade impairing performance and stimulation of these receptors either impairing or improving working memory, dependent on a number of factors (reviewed in Floresco and Magyar, 2006). Taken together, these data indicate that although physiological levels of D1 receptor tone facilitate efficient shifts between strategies, pharmacological augmentation of D1 activity beyond normal levels does not appear to provide any additional beneficial effect to these processes.

In contrast to the above-mentioned findings, infusions of quinpirole into the NAc impaired both set-shifting and reversal learning. When interpreting these findings with respect to the contribution of ventral striatal D2 receptors to these executive functions, it is important to note that pharmacological stimulation does not mimic the precise pattern of D2 receptor activation by tonic and phasic DA transmission that may occur under normal conditions. Rather, the use of D2 agonists in these studies is better suited to ascertain the effects of abnormal stimulation of these receptors, as may occur in certain pathophysiological states involving disruptions in tonic or phasic DA signaling. Indeed, the finding that intra-NAc infusions of quinpirole impaired reversal learning is of particular interest, given that ventral striatal lesions do not reliably impair this form of flexibility (Stern and Passingham, 1995; Schoenbaum and Setlow, 2003). The finding that D2 receptor stimulation in the NAc did impair reversal shifts suggests that this manipulation induced an abnormal contribution by this nucleus to this form of learning.

The impairments induced by quinpirole observed here are likely attributable to excessive activation of D2 receptors, in light of a recent study where disruption of reversal learning induced by systemic quinpirole was ameliorated by co-administration of a D2 but not a D3 antagonist (Boulougouris et al, 2009). An important consideration in evaluating these findings is that activation of NAc D2 autoreceptors can suppress DA efflux (Pierce et al, 1995). However, we find it unlikely that these impairments were caused by reductions in DA levels in the NAc because (1) stimulation of NAc D2 receptors induced a qualitatively different form of impairment in set-shifting (perseveration) compared to D1 receptor blockade (strategy maintenance); and (2) neither D1 nor D2 antagonism affected reversal learning, whereas quinpirole was effective at impairing this form of flexibility. Nevertheless, the possibility remains that the impairments in set-shifting induced by quinpirole may have been exacerbated by increased D2 receptor activity in combination with reduced D1 receptor tone in the NAc.

The perseverative set-shifting deficit induced by D2 receptor stimulation in the NAc was similar to that observed following PFC–NAc disconnection (Block et al, 2007), or combined unilateral PFC inactivation/contralateral NAc quinpirole assessed with maze-based set-shifting procedures (Goto and Grace, 2005). In a similar vein, intra-NAc infusions of quinpirole or PFC–NAc disconnections also enhanced perseverative responding on a serial reaction time task (Christakou et al, 2004; Pezze et al, 2007). Thus, excessive activation of D2 receptors may hamper communication in these cortico-striatal circuits that contribute to response selection mechanisms, thereby disrupting the ability to modify behavior in accordance with changing reinforcement contingencies. This notion is in keeping with neurophysiological studies demonstrating that D2 receptor activation attenuates neural activity in NAc neurons driven by PFC inputs (Brady and O’Donnell, 2004; Goto and Grace, 2005).

The present data parallel those obtained from human subjects, where systemic administration of D2 agonists in healthy subjects also impairs reversal learning (Mehta et al, 2001). Interestingly, in the present study, excessive D2 receptor stimulation did not uniformly disrupt the ability to inhibit a prepotent response. Specifically, impairments in reversal learning observed here were not associated with excessive perseveration, but rather, were manifested as an impediment in maintaining a novel stimulus–reward association. In a similar vein, Cools and colleagues (2007) reported that administration of L-DOPA to Parkinson's patients modulated reversal-related activity in the NAc, but not in the dorsal striatum or the PFC. In that study, attenuation of NAc activity by L-DOPA was most pronounced during the final reversal errors that preceded behavioral switching, a result in keeping with the increased regressive errors induced by quinpirole during reversal learning. Under these conditions, rats may have attempted to solve this problem using other, inefficient strategies, as evidenced by the disproportionate number of errors committed on trials where the cue was illuminated above the incorrect lever, even though this distracter was not associated with reward reliably (ie only on 50% of trials). Note that during set-shifting, quinpirole-treated rats also responded excessively toward the cue that, in this instance, was previously associated with reward in a consistent manner. As such, D2 receptor stimulation may impair modifications in behavior mandated by changes in reinforcement contingencies by inappropriately focusing attention to salient environmental cues, regardless of whether they have been consistently or only partially associated with reward. This effect may be linked to the proposed role of DA transmission in signaling reward prediction errors through suppression of DA neuron firing (Hollerman and Schultz, 1998; Tobler et al, 2003). Saturating D2 receptors would dilute the ability of brief, phasic decreases in DA neuron activity to signal nonrewarded events, potentially providing an aberrant reward signal that could interfere with the ability to alter behavior in response to changes in reinforcement contingencies. Thus, even though a decrease in NAc D2 receptor tone does not affect set-shifting or reversal learning, it is apparent that abnormal increases in D2 receptor activity severely interfere with these processes.

SUMMARY AND CONCLUSIONS

Behavioral and neurophysiological studies have implicated the DA system in switching of attentional and behavioral resources to unexpected, behaviorally important stimuli (Oades, 1985; Redgrave et al, 1999; Floresco et al, 2001; Floresco, 2007). Our findings permit a refinement of this view, in that mesoaccumbens DA, acting on D1 receptors, has a selective function in facilitating shifts between different response strategies. Furthermore, excessive D2 receptor activity induces a fundamental disruption in these processes, suggesting that the ability to modify behavior in a timely manner is critically dependent on a balance of DA receptor tone within the NAc. It is notable that, in comparison, blockade (but not stimulation) of PFC D1 or D2 receptors induces severe perseverative deficits in set-shifting (Ragozzino, 2002; Floresco et al, 2006b), possibly by disrupting dopaminergic modulation of thalamic inputs to the PFC (Floresco and Grace, 2003). Thus, it is apparent that multiple DA pathways enable complex forms of behavioral flexibility, but the specific contributions and the receptor mechanisms through which the DA system facilitates these processes vary considerably between cortical and striatal terminal regions.

From these findings, important insights may be obtained regarding the neural mechanisms underlying impairments in different forms of behavioral flexibility associated with neurological conditions linked to perturbations in the DA system. Degeneration of DA neurons in parkinsonian patients would be expected to lead to reduced D1 receptor tone in the ventral striatum, which may contribute to impairments in set-shifting observed in this disease (Gauntlett-Gilbert et al, 1999). Conversely, schizophrenia has long been associated with excessive activation of D2 receptors. Impairments in set-shifting and reversal learning observed in these patients have been attributed to disruptions in frontal lobe functioning (Goldberg and Weinberger, 1988; Waltz and Gold, 2007; Murray et al, 2008). However, the finding that excessive stimulation of D2 receptors in the NAc severely impedes these forms of behavioral flexibility suggests that pathophysiological increases in mesolimbic DA activity may also be a contributing factor to inflexible patterns of behavior observed in this disorder.