INTRODUCTION

Environmental cues help shape our voluntary actions and are believed to exert an even more powerful influence over compulsive behaviors like drug seeking (Robinson and Berridge, 1993; Everitt and Robbins, 2005). In addition to biasing the selection of actions, such cues help control the vigor with which those actions are performed. For instance, rats trained to lever press for food tend to respond at a higher rate if presented with a conditioned stimulus (CS) that has been separately paired with food, a phenomenon known as Pavlovian–instrumental transfer (Balleine and Ostlund, 2007). This response-invigorating influence of a CS tends to transfer across appetitively motivated actions in an outcome-independent manner (Balleine, 1994; Corbit and Balleine, 2005; Corbit et al, 2007); for example, Balleine (1994) found that, for thirsty rats trained to lever press for water, a CS paired with food pellets was just as effective in increasing performance as a CS paired with water. Importantly, the response-invigorating properties of a CS appear to be tightly regulated by primary motivational processes (Corbit et al, 2007; Dickinson and Dawson, 1987); for example, Balleine (1994) found that if rats trained to press for water were tested thirsty (but sated on food), a food-paired CS actually suppressed performance.

Interestingly, subjects trained on multiple stimulus–outcome and action–outcome contingencies tend to show an outcome-specific form of transfer, such that a CS will bias action selection in favor of whichever action was trained with the same outcome as that CS (Kruse et al, 1983). For example, rats are more likely to perform an action trained with sucrose solution than an action trained with grain pellets if presented with a CS that signals sucrose solution (Colwill and Rescorla, 1988). Recent findings indicate that the outcome-specific and general forms of transfer are mediated by separate behavioral and neural processes (Corbit and Balleine, 2005; Corbit et al, 2007). For instance, unlike general transfer, which is sensitive to shifts in primary motivational state, Corbit et al (2007) found that outcome-specific transfer is relatively insensitive to such manipulations. The two types of transfer also appear to be mediated by circuits involving distinct subnuclei of the amygdala (Corbit and Balleine, 2005).

Studies using procedures likely to support the general form of transfer have shown that this effect is attenuated by dopamine receptor antagonism (Dickinson et al, 2000; Lex and Hauber, 2008; Wassum et al, 2011) and is potentiated by central infusions of the indirect dopamine agonist amphetamine (Wyvell and Berridge, 2000), which is consistent with a large literature implicating dopamine in incentive motivation (Blackburn et al, 1992; Ikemoto and Panksepp, 1999; Robinson and Berridge, 1993) and response invigoration (Robbins and Everitt, 2007; Salamone et al, 2007). Given the dissociations between outcome-specific and general transfer described above, one might expect the outcome-specific, response-biasing influence of CSs to be independent of dopamine signaling (but see Corbit et al, 2007).

The Pavlovian–instrumental transfer paradigm has much in common with the reinstatement-from-extinction paradigm. In both cases, the noncontingent delivery of a stimulus (a reward-paired CS or the reward itself) elicits an increase in the performance of a response that has typically been extinguished through nonreinforcement. Similar to transfer, in reinstatement, noncontingent rewards tend to both invigorate responding (Rescorla and Skucy, 1969) and bias action selection in an outcome-specific manner (Colwill, 1994; Delamater et al, 2003; Leri and Stewart, 2001; Ostlund and Balleine, 2007a) through behaviorally and neurally dissociable processes (for review, see Balleine and Ostlund, 2007). There is some indication that dopamine plays a role in the reinstatement of responses reinforced with natural rewards like food and water (Chausmer and Ettenberg, 1997; Horvitz and Ettenberg, 1988; McFarland and Kalivas, 2001). However, as much of this evidence has come from studies using discrete trial runway tasks, the involvement of dopamine in the reinstatement of free-operant (ie, self-paced) responding for natural rewards has remained largely unexplored. Furthermore, there is no direct information on the role of dopamine in outcome-specific reinstatement. As with transfer, dissociations between the response-biasing and response-invigorating effects of noncontingent food rewards suggest that these phenomena may be differentially dependent on dopamine transmission. Given the hypothesis that dopamine selectively mediates the response-invigorating, as opposed to the response-biasing, effects of CSs, one might also predict that it would play a similarly selective role in reinstatement performance.

Our aim was to examine the contributions of dopamine signaling to the expression of outcome-specific transfer and reinstatement performance by pretreating rats with the nonselective dopamine receptor antagonist flupentixol before testing. We sought to determine whether dopamine preferentially mediates the response-invigorating effects of rewards and/or reward-paired cues.

SUBJECTS AND METHODS

Subjects

A total of 40 adult (90 days at the start of the experiment) male Long–Evans rats (Harlan Laboratories, Indianapolis, IN) weighing between 280 and 330 g at the beginning of training served as subjects. The rats were group housed (three per cage) in a humidity- and temperature-controlled vivarium at UCLA. Rats were food deprived throughout behavioral training and testing by restricting their access to home chow to 12 g per rat per day in order to maintain them at 85% of their free-feeding bodyweight. All procedures were in compliance with the National Research Council's Guide for the Care and Use of Laboratory Animals and were authorized by the institutional animal care and use committee of UCLA.

Apparatus

Eight identical Med Associates operant chambers were used. Each chamber was equipped with two retractable levers located on the left and right sides of the front wall. Food pellets (45 mg, Bioserv, Frenchtown, NJ; grain-based pellets were used in experiment 1 and chocolate-flavored purified pellets were used in experiment 2) and 20% sucrose solution (0.1 ml per delivery) could be delivered into separate wells within a common food magazine located at the center of the front wall, between the two levers. An infrared beam was positioned across the magazine opening, allowing for the detection of head entries. A house light (24 V) located at the top of the rear wall provided illumination during training and testing. White noise (70 dB) and tone (2000 Hz, 70 dB) generators were attached to the exterior of each chamber, and each chamber was housed in a separate sound- and light-attenuating shell.

Procedure

Pavlovian and instrumental conditioning

Rats were given one session of Pavlovian conditioning each day for a total of 8 days. Each session consisted of eight tone and eight white noise presentations. During each 2-min CS presentation, the appropriate outcome was delivered on a 30-s random time schedule, resulting in an average of four stimulus–outcome pairings per trial. Each CS was paired with a different outcome; for half the rats in each experiment, the white noise was paired with pellets and the tone was paired with sucrose, whereas the remaining rats were given the opposite arrangement. CSs were delivered in a pseudorandom order (no more than two successive presentations of the same CS) and were separated by a variable intertrial interval (ITI; mean=3.125 min; range=2.25–4.0 min). The rate at which rats entered the food magazine before (2 min) and during each CS (interval between CS onset and first US delivery) was recorded.
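The schedule arithmetic above can be illustrated with a minimal Python sketch. It assumes that a random time schedule can be approximated by an independent delivery probability of 1/30 in each second of the CS; the function names and the 1-s resolution are illustrative assumptions and do not describe the control program actually used.

```python
import random

CS_DURATION_S = 120   # each CS lasted 2 min
RT_MEAN_S = 30        # 30-s random time schedule


def expected_pairings(cs_duration_s=CS_DURATION_S, rt_mean_s=RT_MEAN_S):
    """Expected number of outcome deliveries per CS presentation."""
    return cs_duration_s / rt_mean_s


def simulate_cs(rng, cs_duration_s=CS_DURATION_S, rt_mean_s=RT_MEAN_S):
    """Return the seconds (0-based) at which outcomes are delivered in one CS.

    Assumption: every second carries an independent 1/rt_mean_s chance
    of an outcome delivery.
    """
    p = 1.0 / rt_mean_s
    return [t for t in range(cs_duration_s) if rng.random() < p]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
counts = [len(simulate_cs(rng)) for _ in range(10_000)]
print(expected_pairings())        # 4.0
print(sum(counts) / len(counts))  # close to 4 across many simulated trials
```

Under this approximation the number of pairings varies from trial to trial, which is why the text describes four pairings as an average rather than a fixed count.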

Rats were then given 11 days of instrumental conditioning, receiving two sessions of training each day, one with the left lever and one with the right lever (session order alternating over days). Each action was reinforced with a different outcome (counterbalanced with Pavlovian training contingencies); for half the rats in each experiment, pressing the left lever earned pellets and pressing the right lever earned sucrose, whereas the remaining rats received the opposite training arrangement. Each session was terminated after 30 outcomes had been earned or after 60 min, whichever came first. The outcomes were earned according to a continuous reinforcement schedule for sessions 1 and 2. Reinforcement was then delivered according to an ascending series of random ratio (RR) schedules; a RR-5 schedule was in place during sessions 3–5, a RR-10 schedule was used during sessions 6–8, and a RR-20 schedule was used during sessions 9–11.
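The ascending reinforcement schedule can be sketched as follows. This is a hedged illustration that assumes a random ratio schedule reinforces each press independently with probability 1/ratio; the function names are hypothetical, and the 60-min session limit is deliberately left out for brevity.

```python
import random


def schedule_for_session(session):
    """Map an instrumental session number (1-11) to its ratio requirement.

    A value of 1 stands for continuous reinforcement.
    """
    if not 1 <= session <= 11:
        raise ValueError("session must be between 1 and 11")
    if session <= 2:
        return 1    # continuous reinforcement
    if session <= 5:
        return 5    # RR-5
    if session <= 8:
        return 10   # RR-10
    return 20       # RR-20


def presses_to_earn(rng, ratio, max_outcomes=30):
    """Count the presses needed to earn max_outcomes outcomes.

    Assumption: each press pays off independently with probability 1/ratio.
    """
    presses = earned = 0
    while earned < max_outcomes:
        presses += 1
        if rng.random() < 1.0 / ratio:
            earned += 1
    return presses


rng = random.Random(0)
for session in (1, 4, 7, 10):
    ratio = schedule_for_session(session)
    print(session, ratio, presses_to_earn(rng, ratio))
```

Because sessions ended after 30 outcomes, the work required per session grows with the ratio: exactly 30 presses under continuous reinforcement and, on average, about 600 presses under RR-20.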

Experiment 1: effects of flupentixol on outcome-specific Pavlovian–instrumental transfer and reinstatement: After initial Pavlovian and instrumental training, each rat (N=24) received four tests (see Figure 1). On the day before each test, the rats received a 30-min extinction session in which both levers were available but were not reinforced. On the following day, the rats received either a transfer test or a reinstatement test. Each test lasted 26 min and consisted of four trials, the first of which began 4 min into the test session. The transfer test consisted of two white noise trials and two tone trials (trial order: tone–noise–tone–noise), separated by a fixed 4-min interval. As during training, each CS was presented for 2 min. During the reinstatement test, rats received two pellet and two sucrose trials (trial order: pellet–sucrose–pellet–sucrose). Each trial consisted of two back-to-back presentations of the appropriate outcome over a 4-s period. To ensure that trial onset times were identical across transfer and reinstatement tests, a 5 min 56 s ITI was used for the latter. At 15 min before each test, rats were given an i.p. injection (1.0 ml/kg) of either sterile saline solution or 0.5 mg/kg of the nonspecific dopamine receptor antagonist flupentixol (Sigma Aldrich, St Louis, MO). Our dose of flupentixol was selected because it has been shown to abolish the general transfer effect without excessively disrupting baseline instrumental performance (Dickinson et al, 2000). Rats were given two transfer tests (one on saline and one on flupentixol) and two reinstatement tests (one on saline and one on flupentixol). In all, 12 of the rats were given both transfer tests before the reinstatement tests, whereas the remaining 12 rats were tested in the opposite order. 
Within each of these conditions, which were counterbalanced across Pavlovian and instrumental training contingencies, half of the rats were administered the saline test before the flupentixol test, whereas the remaining rats were given the opposite drug treatment order. Treatment order was held constant across transfer and reinstatement testing. After each test, rats were given 1 day of Pavlovian retraining and 2 days of instrumental retraining (RR-10 for first day and RR-20 for the second day).
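The timing of the two test types can be checked with a short Python sketch. It assumes that the stated interval runs from the end of one eliciting event to the onset of the next; the function and variable names are illustrative only.

```python
FIRST_ONSET_S = 4 * 60   # first trial began 4 min into the session
N_TRIALS = 4


def onset_times(event_duration_s, interval_s,
                first_onset_s=FIRST_ONSET_S, n_trials=N_TRIALS):
    """Onset of each trial in seconds from the start of the session.

    Assumption: interval_s is measured from event offset to the next onset.
    """
    cycle = event_duration_s + interval_s
    return [first_onset_s + i * cycle for i in range(n_trials)]


# Transfer test: 2-min CS followed by a fixed 4-min interval.
transfer = onset_times(event_duration_s=2 * 60, interval_s=4 * 60)
# Reinstatement test: 4-s outcome delivery followed by a 5 min 56 s ITI.
reinstatement = onset_times(event_duration_s=4, interval_s=5 * 60 + 56)

print(transfer)        # [240, 600, 960, 1320]
print(reinstatement)   # [240, 600, 960, 1320]
```

Both tests therefore place trial onsets at 4, 10, 16, and 22 min, leaving 4 min after the final onset before the 26-min session ends.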

Figure 1

Schematic of design for experiment 1. Hungry rats underwent differential Pavlovian conditioning and instrumental training before Pavlovian–instrumental transfer and reinstatement testing. Each eliciting event was selectively associated with one of the two available actions (indicated by italics). Subjects were given systemic injections of flupentixol (0.5 mg/kg) or saline before each test. Each subject underwent all four test-drug conditions (pseudorandom order). CS, conditioned stimulus; R, response; O, outcome; sal, saline; flu, flupentixol.


Experiment 2: effects of low-dose flupentixol treatment on outcome-specific Pavlovian–instrumental transfer: Two rats in this study were excluded from the experiment because of equipment malfunction during initial training. Each of the remaining rats (N=14) received four Pavlovian–instrumental transfer tests using the same testing and retraining procedures as in experiment 1. However, rats in experiment 2 were treated with two different, lower doses of flupentixol (0.05 and 0.25 mg/kg). A longer (1 h) injection-to-test interval was used in experiment 2 to ensure more stable levels of flupentixol were attained at test. The tests were divided into two pairs. Rats were treated with saline before one test in each pair and flupentixol (either high or low dose) before the other test. The flupentixol dose was changed for the second round of tests so that each rat was tested under both dose conditions. Both the order of saline and flupentixol treatments within each pair of tests and the order of flupentixol doses across tests were counterbalanced with training conditions.

RESULTS

Experiment 1: Effects of Flupentixol on Outcome-Specific Pavlovian–Instrumental Transfer and Reinstatement

Behavioral training

Figure 2a presents the mean rate at which rats entered the magazine before and during CS presentations across Pavlovian conditioning sessions. As expected, over days, rats learned to enter the food magazine when the CS was delivered. Consistent with this interpretation, a session × CS period ANOVA detected a significant effect of session (F7, 161=24.25; p<0.001), a significant effect of period (F1, 23=164.29; p<0.001), and, most importantly, a significant session × CS period interaction (F7, 161=26.13; p<0.001).
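As a reading aid for the F ratios reported throughout, the degrees of freedom for a fully within-subject ANOVA can be reproduced with a small Python sketch. It assumes no sphericity correction was applied; the helper name is hypothetical and is not taken from any statistics package.

```python
from math import prod


def within_subject_df(n_subjects, *factor_levels):
    """Return (df_effect, df_error) for a within-subject effect.

    factor_levels lists the number of levels of each factor involved in
    the effect (one value for a main effect, two for an interaction).
    Assumption: uncorrected repeated-measures degrees of freedom.
    """
    df_effect = prod(levels - 1 for levels in factor_levels)
    return df_effect, df_effect * (n_subjects - 1)


# Experiment 1 Pavlovian training: 24 rats, 8 sessions, 2 CS periods.
print(within_subject_df(24, 8))      # session main effect: (7, 161)
print(within_subject_df(24, 2))      # CS period main effect: (1, 23)
print(within_subject_df(24, 8, 2))   # session x CS period: (7, 161)
```

The same rule gives (10, 230) for the 11-session instrumental training analysis with 24 rats.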

Figure 2

Training results for experiment 1. (a) Magazine approach rate (±SEM) before (pre-CS) and during CS presentations (collapsed across CSs) over Pavlovian conditioning sessions. (b) Rate of lever pressing (collapsed across actions; ±SEM) over instrumental conditioning sessions.


The results of instrumental training, plotted as the mean rate of lever pressing over sessions, are presented in Figure 2b. A one-way ANOVA confirmed that the apparent increase in lever pressing over days was significant (F10, 230=130.87; p<0.001).

Pavlovian–instrumental transfer testing

To explore the role of dopamine transmission in the expression of Pavlovian–instrumental transfer, all rats were given two transfer tests: one after a saline injection and one after a flupentixol injection. The results are plotted in Figure 3. In the Introduction, we developed the hypothesis that dopamine receptor activation mediates the nonspecific, response-invigorating influence of reward-paired CSs over instrumental performance but does not play a significant role in mediating the outcome-specific, response-biasing influence that these CSs have on action selection. To target the former, we computed a general elevation ratio (X/(X+Baseline)), collapsing the data across CSs and actions and using a 1-min pre-CS baseline period and 1-min CS increments (for a similar approach, see Zorawski and Killcross, 2003). It is clear from these data, which are presented in Figure 3a, that the CS deliveries were effective in elevating the rate at which rats pressed the two levers when they were tested in the control condition. However, flupentixol pretreatment appeared to attenuate this effect. These data were analyzed using a drug × minute (two CS periods) ANOVA, which detected a main effect of drug (F1, 23=5.678; p=0.03) and minute (F1, 23=7.742; p=0.01). No interaction between these factors was detected (F1, 23=0.818; p>0.05). Further analysis (one-sample t-test, two tailed, df=23; H0=0.5) revealed that responding was significantly elevated (ie, >0.5) during both the first (p=0.001) and second (p<0.001) minute of the CS when rats were tested after being injected with saline. In contrast, when the same rats were tested on flupentixol, no significant elevation in responding was detected in either the first (p>0.05) or second (p>0.05) minute of the CS.
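The elevation ratio and its comparison with 0.5 can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function names, the handling of zero responding, and the example values are assumptions made for illustration and are not data from the study.

```python
import math
import statistics


def elevation_ratio(cs_presses, baseline_presses):
    """X / (X + Baseline); 0.5 indicates no change from baseline."""
    total = cs_presses + baseline_presses
    if total == 0:
        return None  # assumed handling: undefined when no presses occurred
    return cs_presses / total


def one_sample_t(values, mu0=0.5):
    """Return the t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - mu0) / sem, n - 1


# Hypothetical press counts: 12 presses in a CS minute, 8 in the pre-CS minute.
print(elevation_ratio(12, 8))   # 0.6, ie, responding elevated above baseline

# Hypothetical ratios for four rats (illustrative values only).
t_stat, df = one_sample_t([0.6, 0.7, 0.5, 0.6])
print(round(t_stat, 3), df)
```

Because the ratio expresses CS responding relative to each rat's own baseline, it is insensitive to overall differences in response rate between drug conditions.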

Figure 3

Effects of flupentixol pretreatment on outcome-specific transfer performance (experiment 1). (a) Response rate during CS presentations (collapsed across CSs and actions; ±SEM), plotted over minutes using an elevation ratio (X/(X+Pre-CS)). Dotted line indicates no change from baseline (ratio=0.5). (b) Choice of the action (Same) whose training outcome was the same as the presented CS (collapsed across CSs; ±SEM), plotted over minutes as the percentage of total actions performed on that action ((Same/Total Actions) × 100). Dotted line indicates no preference between actions (50%). (c) Average number of presses per min (±SEM) during the pre-CS period (collapsed across CSs and actions), during the CS whose outcome was the same as the response (Same) and during the CS whose outcome was different than the response (Different).


These results indicate that dopamine receptor blockade disrupts the response-invigorating effect of CSs on instrumental performance. However, such cues can also bias action selection toward responses trained with the outcome signaled by that cue and away from responses trained with another, qualitatively different outcome. Therefore, if a rat in the current experiment was given tone–sucrose and noise–grain pairings during Pavlovian conditioning before being trained to press the right lever for sucrose and the left lever for grain, we should expect the rat to choose the right lever more often than the left lever when the tone is being presented, and vice versa. To isolate this response-biasing effect of the CS presentations during transfer testing and determine whether this effect was sensitive to dopamine receptor antagonism, we computed the percentage of total lever presses performed for the action whose outcome was the SAME as the CS being presented on that trial ((SAME/Total) × 100) for the minute before the CS and for each of the 2 min during the CS. These data are presented in Figure 3b. Inspection of these data indicates that the CSs were effective in guiding action selection on the basis of a shared outcome, and did so regardless of whether rats were injected with saline or flupentixol. A drug × minute (pre-CS and both minutes of the CS) ANOVA resulted in a significant main effect of minute (F2, 46=5.23; p=0.009), indicating a shift in response selection over minutes, but did not detect an effect of drug (F1, 23=0.24; p>0.05) or a drug by minute interaction (F2, 46=0.28; p>0.05). One-sample t-tests (two tailed, df=23) comparing these percentages with 50% (indifference; H0=50) confirmed that, regardless of drug treatment, the rats shifted their choice of actions toward the lever whose outcome was being signaled by the CS. 
However, this effect reached significance only during the second minute of the CS presentation in both the saline (p=0.016) and flupentixol (p=0.006) conditions (all other p's>0.05).
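The choice measure can be illustrated with a brief Python sketch built on the worked example in the text (tone–sucrose pairings, right lever earning sucrose, left lever earning grain). The function names, dictionary layout, and press counts are hypothetical and serve only to show the calculation.

```python
def label_levers(cs_outcome, lever_outcomes):
    """Split lever names into (same, different) relative to the CS outcome."""
    same = [lever for lever, outcome in lever_outcomes.items()
            if outcome == cs_outcome]
    different = [lever for lever, outcome in lever_outcomes.items()
                 if outcome != cs_outcome]
    return same, different


def percent_same(same_presses, different_presses):
    """(Same / Total) x 100; 50 indicates no preference between levers."""
    total = same_presses + different_presses
    if total == 0:
        return None  # assumed handling: undefined when no presses occurred
    return 100.0 * same_presses / total


# Training arrangement from the example in the text.
levers = {"left": "grain", "right": "sucrose"}
print(label_levers("sucrose", levers))   # (['right'], ['left'])

# Hypothetical counts for one minute of the tone: 9 right, 6 left presses.
print(percent_same(9, 6))                # 60.0
```

A value above 50% during the tone would indicate a bias toward the lever that shares the tone's outcome, independent of how many presses the rat made overall.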

Figure 3c presents the test results using a more conventional method of analysis, plotting the mean rate of responding on a lever (collapsed across levers) during the minute before each CS delivery (Pre), during the CS (2-min) that was paired with the same outcome as that lever (Same), and during the CS (2-min) that signaled the other outcome (Different). Although the analyses presented in Figure 3a and b provide a more focused method for evaluating the role of dopamine in mediating the response-invigorating and response-biasing effects of reward-paired cues, there are several aspects of the data shown in Figure 3c worth noting. First, although flupentixol appeared to have a suppressive effect on transfer performance, it did not prevent rats from exhibiting strong outcome-specific transfer, preferentially increasing their rate of responding during CS Same relative to CS Different. Instead, it appeared that this treatment had a roughly equivalent effect on the increment in responding produced by both cues. Second, flupentixol appeared to have little to no effect on baseline (Pre-CS) responding. A drug × period (Pre, Same, and Different) ANOVA found a significant effect of period (F2, 46=15.28; p<0.001). For both drug conditions, rats responded significantly more during CS Same than during CS Different (paired samples t-test: saline, p=0.03; flupentixol, p=0.03) or during the Pre-CS period (saline, p<0.001; flupentixol, p=0.03). However, the effect of drug did not quite reach significance (F1, 23=3.70; p=0.07) nor was there a significant drug by period interaction (F2, 46=0.82; p>0.05). Thus, the more finely tuned analysis performed on data shown in Figure 3a and b was effective in exposing a statistically significant effect that was marginal when conventional methods were used.

Altogether, the results of the outcome-specific transfer test provide evidence that the response-invigorating and response-biasing effects of reward-paired CSs are dissociable, with dopamine transmission playing a particularly important role in the former but not the latter. As such, they tend to favor the view that dopamine signaling mediates the nonspecific incentive motivational process.

Reinstatement testing

To characterize the role of dopamine transmission in reinstatement, rats were given two tests modeled after the transfer test but substituting noncontingent rewards for CS deliveries. All rats were pretreated with saline for one test and pretreated with flupentixol for the second test. The results are presented in Figure 4. As with transfer testing, we were interested in targeting the response-invigorating and response-biasing effects of the noncontingent reward deliveries during reinstatement testing. Figure 4a plots nonspecific changes in the rate of instrumental performance over minutes (collapsed across trials), computed as an elevation ratio using the minute before each reward delivery as the baseline period. There are several aspects to these data worth noting. First, the rats appeared to decrease their response rate immediately after the reward delivery, which was most likely the result of response competition with feeding and magazine approach behavior. Second, following this brief dip, response rates increased above baseline levels. Third, although rats showed a slightly larger dip in responding following the reward deliveries in the flupentixol test, this drug did not appear to have a marked effect on the response elevation produced by those rewards. A drug × minute (3 postreward minutes) ANOVA found no effect of drug (F1, 23=0.02; p>0.05) but did detect a significant main effect of minute (F2, 46=26.44; p<0.001), confirming that the rewards had altered the instrumental performance of the rats. The drug by minute interaction also failed to reach the conventional level of significance (F2, 46=2.66; p=0.081), indicating that the immediate, nonspecific effects of reward delivery on instrumental performance were minimally affected by dopamine receptor blockade. 
One-sample t-tests (two tailed, df=23) comparing these elevation ratio scores with 0.5 (H0=0.5) found that the reduction in responding in the first minute of the trial was significant in the flupentixol (p=0.005) but not in the saline test (p>0.05). A significant elevation in responding was observed in both tests during the second minute of the trial (flupentixol: p=0.001; saline: p=0.003). Responding continued to be significantly elevated in the third minute for the flupentixol test (p=0.044) but fell short of the threshold for the saline test (p=0.080). The modest size and duration of this effect, particularly when compared with flupentixol's effect on transfer performance (Figure 3a), suggest that dopamine signaling does not play a primary role in mediating the immediate, response-invigorating properties of noncontingently presented rewards. It should be noted that flupentixol administration resulted in lower levels of baseline (prereward delivery) response rates during reinstatement testing, a finding that we discuss in more detail below (see Figure 4c). The elevation ratio score attempts to control for such differences by focusing on relative change in performance.

Figure 4

Effects of flupentixol pretreatment on outcome-specific reinstatement performance (experiment 1). (a) Response rate after outcome presentations (collapsed across outcomes and actions; ±SEM), plotted over minutes using an elevation ratio (X/(X+Pre-Outcome)). Dotted line indicates no change from baseline (ratio=0.5). (b) Choice of the action (Reinst) trained with the reinstating outcome (collapsed across outcomes; ±SEM), plotted over minutes as the percentage of total actions performed on that action ((Reinst/Total Actions) × 100). Dotted line indicates no preference between actions (50%). (c) Average number of presses per min (±SEM) during the predelivery period (collapsed across outcomes and actions), during the 3 min after delivery of the outcome that was paired with the response (Reinst), and during the 3 min after delivery of the other outcome (Other).


To determine if flupentixol affected the tendency of the reinstating outcome to bias action selection, we computed the percentage of total actions performed on the lever that had earned the reinstating outcome during training. From these data, which are presented in Figure 4b, it is clear that the delivery of an outcome caused rats to choose the action that was trained with that outcome more often than they chose the other action. Furthermore, the flupentixol treatment appeared to have little if any effect on the tendency of the outcome to bias response selection. A drug × minute (prereward baseline and the three postreward minutes) ANOVA failed to detect an effect of drug (F1, 23=0.80; p>0.05), but did detect a significant effect of minute (F3, 69=12.21; p<0.001). The drug by minute interaction was not significant (F3, 69=0.61; p>0.05). One-sample t-tests (two tailed, df=23) comparing these values with 50% (H0=50) found that the rats were more likely to choose the action trained with the reinstating outcome in each of the 3 min following its delivery, an effect that was present in both the saline (p's≤0.02) and flupentixol tests (p's≤0.001).

Figure 4c plots the reinstatement data as the mean rate of responding on a lever (collapsed across levers) during the minute before each outcome delivery (Pre), during the 3 min following the delivery of the outcome earned by that lever (Reinst), and during the 3 min following delivery of the other outcome (Other). Using this measure, flupentixol appeared to have a generally suppressive effect on the overall rate of responding during reinstatement testing, an effect that was largely restricted to the second half of the test (see below). Flupentixol seemed to have no effect on the outcome specificity of reinstatement. A drug × period (Pre, Reinst, and Other) ANOVA found a significant effect of period (F2, 46=35.59; p<0.001). For both drug conditions, rats showed more responding after delivery of outcome Reinst than after the delivery of outcome Other (paired samples t-test: saline, p<0.001; flupentixol, p<0.001) or during the preoutcome period (saline, p<0.001; flupentixol, p<0.001). The ANOVA also detected a significant effect of drug (F1, 23=9.15; p=0.006). However, the drug × period interaction did not reach significance (F2, 46=2.76; p=0.074).

Although flupentixol injections did not appear to have a dramatic disruptive effect on the ‘short-term’ enhancement of instrumental performance observed during the first few minutes following noncontingent outcome presentations, further analyses revealed that this treatment did have a general response-suppressing effect on performance during the second half of the reinstatement test. Figure 5 presents the mean rate of lever pressing during transfer and reinstatement tests, plotted separately for the first and second half (13 min each) of each test. Despite receiving different kinds of stimuli, the rats lever pressed at similar rates during the first half of the reinstatement and transfer tests. Furthermore, flupentixol had relatively little effect on performance during this period. This interpretation was supported by the results of a drug × test ANOVA applied to these data, which found no effect of drug (F1, 23=0.08; p>0.05) or test (F1, 23=0.88; p>0.05). The drug by test interaction was also nonsignificant (F1, 23=0.001; p>0.05). A different pattern of results was observed during the second half of these test sessions. In the control (saline) condition, rats exhibited dramatically higher response rates during the second half of the reinstatement test than during the second half of the transfer test. Indeed, although both tests were conducted under extinction conditions, we found that saline-treated rats significantly reduced their rate of responding (first vs second half; two-tailed paired t-test, df=23) in the transfer test (p<0.001), but failed to do so in the reinstatement test (p>0.05), suggesting that the noncontingent outcome deliveries were more effective in opposing the suppressive effects of nonreinforcement than were the reward-paired CSs. Interestingly, flupentixol-treated rats reduced their response rates in both transfer (p=0.001) and reinstatement (p<0.001) tests. 
A drug × test ANOVA conducted on these data detected significant main effects of drug (F1, 23=10.09; p=0.004) and test (F1, 23=7.07; p=0.014), and detected a significant drug by test interaction (F1, 23=5.92; p=0.023), confirming that the effect of flupentixol on performance varied across tests. Response rates were significantly higher (two-tailed paired t-test; df=23) in the saline-reinstatement condition than in the saline-transfer condition (p=0.006), flupentixol-reinstatement condition (p=0.002), or flupentixol-transfer condition (p=0.003). Response rates during these other conditions did not significantly differ from each other (p's>0.05).

Figure 5

Response rate (±SEM) during the first (left) and second (right) halves of transfer and reinstatement tests (experiment 1). Asterisks indicate significant difference from Saline-Reinst test (p<0.05).


Experiment 2: Effects of Low-Dose Flupentixol Treatment on Outcome-Specific Pavlovian–Instrumental Transfer

Experiment 1 found that flupentixol pretreatment attenuated the nonspecific, response-invigorating effects of reward-paired cues on lever pressing without affecting their ability to bias action selection. However, the use of a single, high dose (0.5 mg/kg) in that study left open the possibility that lower doses may indeed exert selective effects to attenuate reward-specific cue-induced invigoration of actions. Additionally, the relatively short (15-min) injection-to-test interval used in that experiment may have resulted in flupentixol levels in the brain that were unstable at test, and continued to rise over trials. Experiment 2 was therefore conducted to determine if the effects on transfer performance observed in experiment 1 were replicable using lower doses of flupentixol and a longer (1-h) injection-to-test interval.

Behavioral training

Rats in experiment 2 readily acquired conditioned approach behavior. By the last day of training, rats performed 22.05 entries per min (SEM=1.51) during CS presentations and 9.06 entries per min (SEM=0.75) during pre-CS periods. A session × CS period ANOVA resulted in significant effects of session (F7, 91=7.69; p<0.001) and CS (F1, 13=108.38; p<0.001), as well as a significant session × CS interaction (F7, 91=15.93; p<0.001). The rats also acquired robust levels of instrumental performance, responding at a rate of 50.15 lever presses per min (SEM=2.99) by the end of training. A one-way ANOVA found a significant effect of session (F10, 130=58.27; p<0.001).

Pavlovian–instrumental transfer testing

Figure 6a shows the results of transfer testing plotted using the general elevation ratio measure. Flupentixol appeared to suppress the response-invigorating influence of the CSs on lever pressing at the 0.25 but not 0.05 mg/kg dose. A dose (saline, 0.05 mg/kg, 0.25 mg/kg) × minute ANOVA resulted in a significant effect of dose (F2, 26=3.40; p=0.049), but found no effect of minute (F1, 13=1.52; p>0.05) or dose × minute interaction (F2, 26=0.55; p>0.05). Further analyses (repeated one-way ANOVA) found a significant difference between the saline condition and the high dose (F1, 13=9.10; p=0.01), but not between saline and the low dose (F1, 13=0.16; p>0.05). Although a difference of comparable size was observed between the low- and high-dose conditions, this effect was not statistically significant (F1, 13=2.89; p=0.11).

Figure 6

Effects of low-dose flupentixol pretreatment on outcome-specific transfer performance (experiment 2). (a) Response rate during CS presentations (collapsed across CSs and actions; ±SEM), plotted over minutes using an elevation ratio (X/(X+Pre-CS)). Dotted line indicates no change from baseline (ratio=0.5). (b) Choice of the action (Same) whose training outcome was the same as the presented CS (collapsed across CSs; ±SEM), plotted over minutes as the percentage of total actions performed on that action ((Same/Total Actions) × 100). Dotted line indicates no preference between actions (50%). (c) Average number of presses per min (±SEM) during the pre-CS period (collapsed across CSs and actions), during the CS whose outcome was the same as the response (Same), and during the CS whose outcome was different from the response (Different).


The effect of the CS presentations on choice performance is presented in Figure 6b. As in experiment 1, flupentixol appeared to have little or no effect on the tendency for a CS to influence action selection in an outcome-specific manner at either of the two doses used. A dose × period ANOVA detected a significant effect of period (F2, 26=5.02; p=0.014), confirming that the CSs were able to bias action selection to favor whichever action was paired with the same outcome as the cue being presented. Importantly, there was no effect of dose (F2, 26=0.62; p>0.05) and no dose × period interaction (F4, 52=0.15; p>0.05), indicating that this effect did not differ across drug conditions.
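The two measures plotted in Figure 6a and 6b can be illustrated with a minimal Python sketch. The function names and the example numbers are hypothetical and are not taken from the study's data or analysis code; the handling of zero-response periods (returning None) is likewise an assumption made here for illustration.

```python
from typing import Optional


def elevation_ratio(cs_rate: float, pre_cs_rate: float) -> Optional[float]:
    """Elevation ratio X / (X + Pre-CS), where X is the response rate during the CS.

    A value of 0.5 indicates no change from the pre-CS baseline.
    Returns None when both rates are zero (ratio undefined; an assumption here).
    """
    total = cs_rate + pre_cs_rate
    if total == 0:
        return None
    return cs_rate / total


def percent_same(same_presses: float, different_presses: float) -> Optional[float]:
    """Percentage of total actions performed on the 'Same' action.

    A value of 50 indicates no preference between the two actions.
    Returns None when no actions were performed (an assumption here).
    """
    total = same_presses + different_presses
    if total == 0:
        return None
    return 100.0 * same_presses / total


if __name__ == "__main__":
    # Hypothetical rates (presses per min), for illustration only.
    print(elevation_ratio(cs_rate=30.0, pre_cs_rate=10.0))
    print(percent_same(same_presses=3.0, different_presses=1.0))
```

Separating the two computations mirrors the logic of the analysis: the elevation ratio captures how much a CS changes overall response rate, whereas the percentage measure captures only how responses are distributed between the two actions.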

Figure 6c presents the test results using the conventional response rate measure. These data generally support the above interpretation, in that neither dose of flupentixol interfered with the outcome selectivity of transfer performance. A dose × period ANOVA detected a significant effect of period (F2, 26=11.66; p<0.001) and dose (F2, 26=4.12; p=0.028), but the dose × period interaction did not reach significance (F4, 56=1.79; p>0.05). Although the lack of interaction between these factors suggests that the distribution of responses of rats across CS periods was not significantly altered by flupentixol administration, this treatment did reduce the rate of responding during CS periods. Indeed, statistical analysis (paired samples t-tests) found that response rates during the pre-CS period did not significantly differ across drug doses (all p's>0.05). Although no differences were detected between saline and the 0.05 mg/kg dose for either CS delivery period (p>0.05), rats responded at a significantly lower rate during CS Same (p<0.01) and CS Different (p<0.01) after administration of the 0.25 mg/kg dose than after the saline injection. Therefore, the suppression of responding was not specific to either CS period. It is important, however, to note that the 0.25 mg/kg dose had such a pronounced effect on cue-evoked responding that press rates during CS Same did not significantly differ from pre-CS rates (p>0.05), unlike in the 0.05 mg/kg (p<0.05) or saline (p<0.001) conditions. Although this might be taken as evidence that flupentixol blocked outcome-specific transfer, the lack of cue-specificity here, together with our finding that flupentixol left intact the outcome-specific response bias produced by the CSs (Figure 6b), suggests that dopamine is primarily involved in mediating the outcome-independent, response-invigorating properties of reward-paired cues.

DISCUSSION

The current study explored the role of dopamine transmission in the expression of Pavlovian–instrumental transfer and reward-elicited reinstatement of extinguished instrumental performance. During transfer testing, we found that systemic administration of the dopamine receptor antagonist flupentixol selectively attenuated the facilitation of instrumental performance elicited by reward-paired CSs but did not affect their tendency to bias action selection by signaling a specific rewarding outcome. During reinstatement testing, flupentixol had only a slight effect on the immediate response invigoration produced by individual noncontingent reward deliveries but dramatically disrupted the cumulative effect of these reward deliveries on performance. As with reward-paired CSs, the response-biasing influence of the reward deliveries was not significantly affected by flupentixol administration. Thus, it appears that dopamine transmission plays a critical role in mediating the general incentive motivational properties of noncontingently presented rewards and CSs, but is not directly involved in mediating their influence over action selection. This interpretation is compatible with previous findings indicating that perturbation of dopamine signaling results in suppressed levels of behavioral output without significantly affecting the ability of rats to choose between actions based on their history of reinforcement (Evenden and Robbins, 1983; see Salamone, 1987 for review and discussion). The current findings have implications for the study of cue-guided behavior and for theories that assign dopamine a central role in behavioral control.

Recent findings support a distinction between general and outcome-specific Pavlovian–instrumental transfer (Corbit and Balleine, 2005; Corbit et al, 2007). As noted in the Introduction, these two effects are typically observed under different training conditions; whereas general transfer tends to be observed in simple one-action experiments, specific transfer is observed when multiple action–outcome contingencies are used during instrumental training. In both types of study, the magnitude of transfer tends to be quantified using the same basic metric: the tendency for a CS to increase performance over some baseline level. In general transfer experiments, this increase does not depend on the identity of the anticipated outcome, whereas in outcome-specific transfer, the response increment is only observed when a CS and instrumental action share a common outcome. However, this comparative approach may be suboptimal when the goal is to differentiate the processes underlying these two forms of transfer. It is possible, for example, that reward-paired cues influence instrumental performance through two separate but interacting processes: a nonspecific incentive motivational process capable of influencing the vigor with which instrumental actions are performed (see, eg, Rescorla and Solomon, 1967), and an outcome-specific response retrieval process that involves the integration of Pavlovian and instrumental associations (CS1 → O1 → R1; see, eg, Trapold and Overmier, 1972). According to this view, the incentive motivation process that supports the general transfer effect should also contribute to specific transfer by invigorating instrumental performance (ie, increasing response rate). The choice of which action to perform, however, would be guided by the outcome-specific retrieval process. In this case, the tendency of a CS to increase performance during outcome-specific transfer should be dissociable from its ability to guide action selection. 
This hypothesis was supported by our finding that blocking dopamine receptor activity under conditions in which outcome-specific transfer is favored disrupted the overall (nonspecific) increment in instrumental performance produced by CSs without affecting their ability to bias action selection. This finding is also complemented by reports of the converse dissociation, which show that certain treatments (including lesions of areas implicated in stimulus–outcome and/or action–outcome learning) disrupt the selectivity of outcome-specific transfer, producing a general transfer effect (Blundell et al, 2001; Corbit et al, 2007; Ostlund and Balleine, 2007b, 2008; Johnson et al, 2007; Zorawski and Killcross, 2003).

Incentive motivational theories of dopamine function (eg, Blackburn et al, 1992; Ikemoto and Panksepp, 1999; Robinson and Berridge, 1993) tend to assume that dopamine transmission allows reward-paired CSs to produce a state of heightened appetitive motivation, which, in turn, has the tendency to invigorate various appetitively motivated behaviors, including approach behavior and instrumental actions. Indeed, the incentive sensitization theory of addiction extends this general account to explain how drug-induced alterations in dopamine signaling allow cues associated with drugs or natural rewards to exert greater control over behavior, generating compulsive drug-seeking behavior (Robinson and Berridge, 1993). Importantly, according to such theories, the incentive motivational effects of a CS should not depend on the specific sensory features of the reward with which it is paired.

Consistent with this view, most studies implicating dopamine in Pavlovian–instrumental transfer have employed training and testing procedures that support the general form of transfer (Dickinson et al, 2000; Lex and Hauber, 2008; Murschall and Hauber, 2006; Wyvell and Berridge, 2000; for review, Yin et al, 2008). However, a recent study by Corbit et al (2007) found that expression of both the general and outcome-specific transfer effects (specifically, the increments in lever pressing during CS deliveries) was abolished by inactivation of the ventral tegmental area (VTA), the origin of mesolimbic and mesocortical dopamine pathways. Our results suggest that the impairment in outcome-specific transfer may have resulted from disruption of the same nonspecific incentive motivational process that supports general transfer, not a separate outcome-specific process. Of course, as noted by the authors, this finding could also be explained by the dopamine-independent effects of VTA inactivation. Therefore, more studies are needed to determine the role of dopamine in outcome-specific transfer. However, the current results suggest that an effort should be made to target the response-invigorating and response-biasing effects of CS presentations separately. Teasing these effects apart may require the careful consideration of dosing parameters. For example, although we would predict VTA inactivation to spare the influence of reward-paired CSs on action selection, the manipulation used by Corbit et al (2007) effectively abolished the transfer effect, which may have made it difficult to evaluate this prediction.

Our findings clearly demonstrate that the general and outcome-specific influences of reward-paired cues on instrumental performance are dissociable and that the former depends on a dopamine-dependent incentive motivational process. This raises an important question: are the outcome-specific effects of cues mediated by a similar, although neurochemically distinct, motivational process or do they depend on fundamentally different psychological operations? Rats are certainly capable of adjusting their choice between instrumental actions in a flexible manner if the value of the particular outcome of one of the actions has been changed (eg, devalued through specific satiety) before testing (Colwill and Rescorla, 1985; Balleine and Dickinson, 1998). Such goal-directed decision making demonstrates that outcome-specific motivational processes can play an important role in instrumental action selection. Perhaps reward-paired cues engage these processes, motivating rats to seek out a particular goal through instrumental action. However, this interpretation is undermined by a growing body of evidence showing that goal-directed action selection and outcome-specific transfer depend on distinct neural circuits (Corbit et al, 2007; Ostlund and Balleine, 2007a, 2007b, 2008; Shiflett and Balleine, 2010). Moreover, the outcome-specific influence of reward-paired cues on action selection is not modulated by the value of the outcome mediating this effect; that is, a CS that predicts a devalued outcome is just as effective in biasing action selection as one that predicts a valued outcome (Rescorla, 1994; Holland, 2004; see Ostlund et al, 2008 for review). These and other findings suggest that the outcome specificity of the transfer effect is mediated by a separate action selection process guided primarily by associations between cues, actions, and the sensory features of the outcomes of these events.

Although there has been considerable interest in the role of dopamine in drug-priming-induced reinstatement of drug self-administration (Khroyan et al, 2000; McFarland and Kalivas, 2001; Schmidt et al, 2005; Sun and Rebec, 2005), there is relatively little evidence that dopamine contributes to the reinstatement of responding for natural rewards (Chausmer and Ettenberg, 1997; Horvitz and Ettenberg, 1988; McFarland and Kalivas, 2001). The current study found that flupentixol administration did not affect the tendency for noncontingent reward deliveries to influence the choice of rats between instrumental actions. Together with the results of transfer testing, this finding indicates that dopamine receptor activation is not required for the associative retrieval of instrumental actions. Flupentixol administration also appeared to have little effect on the response invigoration produced by noncontingent rewards, although rats treated with this drug did appear to exhibit slightly lower rates of lever pressing during the minute immediately following reward deliveries. This modest effect sharply contrasts with the near complete reduction in the CS-induced response invigoration produced by the drug during transfer testing, suggesting that these two forms of response instigation are supported by different processes. This is not particularly surprising given previous reports of dissociations between outcome- and CS-mediated response retrieval (Colwill, 1994; De Wit et al, 2009; Ostlund and Balleine, 2008). For instance, post-training lesions of the mediodorsal thalamus disrupt the former but not the latter (Ostlund and Balleine, 2008). Studies on drug- and cue-induced reinstatement also indicate that these events trigger self-administration through somewhat separate processes (see Fuchs et al, 2008 for a recent review of this literature). 
It must be noted, however, that studies on cue-induced reinstatement rarely assess the impact of noncontingent presentations of separately trained drug-paired CSs on self-administration performance (but see Corbit et al, 2007; Glasner et al, 2005). Instead, such studies typically arrange for a response–cue relationship at test or deliver noncontingent presentations of a discriminative cue that has been explicitly associated with the self-administration response, making it impossible to determine whether the cue's ability to ‘reinstate’ responding is supported by its response-invigorating effects, its ability to support conditioned reinforcement, or through stimulus-response learning.

Although reward-paired CSs appear to engage a dopamine-dependent incentive motivational process, it is not well understood how actual reward deliveries invigorate instrumental performance. It is possible that the discriminative stimulus properties of an outcome bias action selection and increase response rate through a common associative process mediated by stimulus–response associations (see Balleine and Ostlund, 2007; Colwill, 1994; Rescorla and Skucy, 1969). However, this explanation is undermined somewhat by the dissociable effects of medial prefrontal cortex lesions on outcome selective reinstatement performance (Ostlund and Balleine, 2005); whereas lesioned rats exhibited normal sensitivity to the response biasing effects of noncontingent reward delivery, they showed a significantly smaller response increment than control rats. Interestingly, lesions of this structure appear to have no detectable effect on outcome-specific Pavlovian–instrumental transfer performance (Corbit and Balleine, 2003), providing further evidence of the dissociable nature of these phenomena.

Dopamine receptor blockade was not without effect on reinstatement performance; it produced a pronounced reduction in response rate during the second half of the test session. There are several aspects of this effect worth noting. First, a similar effect was not observed during transfer testing, suggesting that the effect was specific to a situation involving the delivery of actual rewards. Second, when rats were treated with saline, their overall response rates significantly declined over the course of the transfer test but were maintained during the reinstatement test. This finding suggests that the free reward deliveries made during the reinstatement test were particularly potent in invigorating responding over longer intervals. Research on the role of background context conditioning in instrumental performance provides a simple explanation for these results. Cues embedded within the training situation are widely viewed as playing an important, modulatory role in instrumental performance (for review, see Rescorla and Solomon, 1967; Trapold and Overmier, 1972). Indeed, the transfer effect was designed to isolate this Pavlovian–instrumental interaction (Estes, 1948; Walker, 1942). Therefore, it is important to consider the role that background context conditioning plays in the reinstatement of responding produced by noncontingent rewards. For instance, Baker et al (1991) found that extinguished lever pressing could be reinstated by testing rats in the presence of contextual cues that were paired with noncontingent reward deliveries before testing. As for the current results, it is possible that the rewards delivered at test maintained or re-established context–reward associations that would have otherwise been extinguished, resulting in an invigoration of instrumental performance. 
One might expect this process to be particularly important in supporting responding after a significant period of extinction, when other sources of support (eg, context–response associations) should be inhibited (Rescorla, 1997). According to this account, the effect of flupentixol on response rate during the reinstatement test should not be surprising given the established role of dopamine in Pavlovian incentive motivation. Alternatively, it is possible that the noncontingent rewards maintained responding through the unintended reinforcement of lever pressing; that is, unscheduled response–reward pairings strengthened these associations. Indeed, treatments that disrupt dopamine transmission tend to attenuate reinforced instrumental performance (Dickinson et al, 2000; Salamone et al, 2007; Willner et al, 1988). The current results do not provide definitive evidence for either of these accounts.

It is important to emphasize that the aim of the current study was to examine the effects of dopamine receptor blockade on the influence of rewards and reward-paired cues over the expression of instrumental performance, apart from their ability to modify performance through new learning (primary or conditioned reinforcement, respectively). The Pavlovian–instrumental transfer and reinstatement-from-extinction procedures are designed to isolate the effects of noncontingently presented stimuli on action selection and initiation (for review, see Balleine and Ostlund, 2007; Ostlund et al, 2008) from their ability to modify response tendencies through learning. Therefore, the finding that dopamine receptor antagonism attenuated the general, response-invigorating effects of reward-paired cues cannot easily be explained by dopamine's contributions to learning. Although the current findings do not address dopamine's role in learning, there is growing evidence that it is essential for the attribution of incentive salience to reward-paired cues, allowing them to invigorate reward seeking (Dickinson et al, 2000) and elicit approach behavior (Flagel et al, 2011), and appears to be critical for certain aspects of primary and conditioned reinforcement (Wise, 2006; Wickens et al, 2007).

It must also be noted that dopamine has been repeatedly implicated in effort-based decision making (Walton et al, 2006; Salamone et al, 2007) and in processing response cost (Gan et al, 2009; Day et al, 2010; Ostlund et al, 2011). Although response-contingent costs can bias the choice of rats between actions, as with conditioned reinforcement, such action biases are the result of learning (ie, dependent on response–outcome contingency) and should therefore be distinguished from the action-biasing influence of noncontingently presented rewards and reward-paired cues, which our findings suggest is not dependent on dopamine signaling.

Finally, we chose to use flupentixol because of its high affinity for both D1 and D2 dopamine receptors and because it has been widely used to assess the role of dopamine transmission in incentive motivation (eg, Dickinson et al, 2000; Flagel et al, 2011; Wassum et al, 2011). We are unable, therefore, to draw conclusions regarding the subtype of dopamine receptors mediating the observed effects. Furthermore, as flupentixol is also known to bind with high affinity to other receptors, including serotonin (2A) and adrenergic (α1) receptors (Reimold et al, 2007; Testa et al, 1989), we cannot rule out the possibility that antagonism of these other neurotransmitter systems contributes to the behavioral effects of flupentixol reported here and elsewhere.

In summary, our results provide new details about how dopamine transmission supports a response-invigorating function in instrumental conditioning. Dopamine appears to be particularly important for mediating the incentive motivational effects of noncontingently presented Pavlovian reward-paired cues but does not mediate their ability to bias action selection. Future research will be needed to determine the neurochemical system(s) that underlie this outcome-specific influence of reward-paired cues on action selection.