INTRODUCTION

As a consequence of associative learning, an environmental stimulus paired with reward experience (a conditioned stimulus; CS) not only acquires predictive properties that serve to signal the availability and/or location of the reward (discriminated approach or goal-tracking; Boakes, 1977), but may also acquire incentive properties that enable CSs to attract (auto-shaping or sign-tracking; Brown and Jenkins, 1968), energize (Pavlovian-instrumental transfer; Estes, 1948) or directly reinforce (conditioned reinforcement; Mackintosh, 1974) appetitive behaviors (see also Flagel et al, 2009; Robinson and Flagel, 2009). Although the predictive and incentive functions of CSs have clear adaptive value, the neural systems that mediate the learning of incentive properties (the acquisition) and the CSs' subsequent effects on behavior (the expression) are proposed to be subverted by drugs of abuse (Everitt et al, 2001; Hyman et al, 2006; Kelley, 2004). Thus, contemporary theories of drug addiction ascribe particular importance to the role of drug-paired CSs in maintaining drug taking and triggering relapse (Everitt et al, 2001; Robinson and Berridge, 1993; Stewart et al, 1984). The powerful influence of CSs over the consumption of natural rewards (for example, cue-potentiated feeding; Weingarten, 1983; Zambie, 1973) has similarly led to the proposition that food-paired CSs may contribute to the development and maintenance of certain eating disorders and obesity (Holland and Petrovich, 2005; Volkow et al, 2008).

The neural circuitry underlying incentive learning and control over appetitive behaviors by CSs involves, in part, convergence within the striatum of dopaminergic projections from the ventral tegmental area (VTA) and substantia nigra, with glutamatergic inputs originating in the prefrontal cortex, hippocampus and amygdala (Cardinal and Everitt, 2004; Goto and Grace, 2008; Robbins and Everitt, 2002; Schultz et al, 1997). Glutamate signaling through ionotropic AMPA and NMDA receptors appears particularly important in mediating the expression of control over appetitive behaviors by CSs (Backstrom and Hyytia, 2006; Conrad et al, 2008; Crombag et al, 2008; Di Ciano and Everitt, 2001; Mead and Stephens, 2003a, 2003b). However, much less is known about the role of metabotropic glutamate receptors in these incentive processes.

The group I metabotropic glutamate receptor, mGluR5, is found throughout the CNS, but is most densely expressed in the striatum, cortex and hippocampus (Romano et al, 1995). Typically located postsynaptically on dendritic spines and concentrated at perisynaptic sites (Luján et al, 1996; Shigemoto et al, 1993), mGluR5 has a central role in different forms of synaptic plasticity, including long-term potentiation (LTP; see Anwyl, 2009 for review) and long-term depression (LTD; see Bellone et al, 2008 for review), that are thought to be involved in a variety of learning and memory processes (Hyman et al, 2006; Kelley, 2004; Malenka and Bear, 2004). Mechanisms by which group I mGluRs influence synaptic plasticity include control over presynaptic transmitter release via retrograde endocannabinoid signaling (Robbe et al, 2002) and changes in postsynaptic sensitivity to excitatory input through alterations in AMPA receptor expression (Bellone and Luscher, 2005; Jo et al, 2008; Kelly et al, 2009; Mameli et al, 2007; Snyder et al, 2001; Zhang et al, 2008). Thus, mGluR5 appears ideally positioned to mediate learning processes necessary for the acquisition of predictive and/or incentive properties by reward-paired stimuli, which enable them to subsequently influence behavior.

We explored this idea using the mGluR5 antagonist, MTEP, in mice trained to associate a simple stimulus with the delivery of a food reward. By administering MTEP to mice during the learning of this stimulus-reward association (Pavlovian conditioning), we were able to examine the role of mGluR5 in the acquisition of predictive properties by the food-paired CS that serve to signal the availability of reward at its location (goal-tracking test), and incentive properties necessary to reinforce an entirely novel instrumental response (conditioned reinforcement test). To determine whether mGluR5 was necessary for the expression of control over behaviors by the CS, we administered MTEP during the tests of goal-tracking and conditioned reinforcement to mice that had received vehicle during Pavlovian conditioning sessions. Critically, tests of goal-tracking and conditioned reinforcement were performed under extinction conditions, therefore allowing the predictive and incentive motivational features of the CS to be examined without interference from presentation of the primary reward.

MATERIALS AND METHODS

Subjects

Mice (n=62; male C57BL/6 × Sv129; derived in house; minimum 8 weeks old) were housed in groups of two or three and allowed to habituate to the holding room for 1 week before beginning the experiment. Animals were maintained on a 12:12 h light–dark cycle (lights on at 0700 hours) under controlled temperature (21±2°C) and humidity conditions (50±5%). Body weights were maintained at approximately 85% of free-feeding weight by the provision of a limited amount of standard lab chow (B&K Feeds, Hull, UK) approximately 2 h after daily experiment completion. Experiments took place during the light-phase between 0900 and 1500 hours. All procedures were performed in accordance with the United Kingdom 1986 Animals (Scientific Procedures) Act, following institutional ethical review.

Drugs

All injections were administered at a volume of 10 ml/kg i.p. The non-competitive mGluR5 antagonist, 3-((2-methyl-1,3-thiazol-4-yl)ethynyl)pyridine (MTEP; Sequoia Research Products, Pangbourne, UK), was dissolved in 10% v/v Tween 80, 90% water.

Apparatus

Behavioral training and testing were performed in eight standard mouse operant chambers (15.9 × 14 × 12.7 cm; Med Associates, Vermont, USA). Each chamber was housed within a sound-attenuating and light-resistant cubicle, fitted with an exhaust fan that served to both ventilate the unit and mask any external noise. The front access panel, ceiling and rear wall of the conditioning chambers were constructed from clear Plexiglas and the side walls consisted of removable aluminum panels. Each chamber was fitted with a pellet dispenser system that delivered 20 mg food pellets (5TUL, Cat no. 1811142; Test Diets, Indiana, USA) into a recessed food magazine situated at the center of one side wall. An infrared beam detected head entries into the food magazine. A retractable response lever was located on either side of the food magazine and an LED stimulus light was positioned approximately 8 cm above each lever. A tone generator (2.9 kHz, 5 dB above background) was situated between the stimulus lights. The presentation of stimuli, the delivery of food pellets and the recording of both entries into the food magazine and lever responses were performed using Med-PC IV (Med Associates).

Procedure

A summary of the experimental design is shown in Figure 1. Mice were allocated to one of three Pavlovian conditioning (PC) treatment groups that received injections of either vehicle (PC: Veh group; n=22), 3 mg/kg (PC: 3; n=19) or 10 mg/kg (PC: 10; n=21) i.p. MTEP before each Pavlovian conditioning session (Phase 1). Following conditioning, each conditioning treatment group (PC: Veh, 3 and 10) was exposed to two tests of conditioned reinforcement (CRf; Phase 2). Mice from each conditioning treatment group were injected with vehicle during one CRf test and MTEP during the other CRf test, the order of CRf test treatment (that is, Veh or MTEP) being counterbalanced. Specifically, group PC: Veh received 10 mg/kg MTEP during one CRf test, whereas groups PC: 3 and PC: 10 received 3 and 10 mg/kg MTEP during one CRf test, respectively. Each conditioning treatment group was then exposed to two tests of goal-tracking (GT; Phase 3). As described for the CRf tests, each conditioning treatment group was injected with MTEP during one of the GT tests and vehicle during the other test; the order of GT test treatments being counterbalanced. Two further Pavlovian conditioning sessions were conducted between each CRf and each GT test. Mice received injections of vehicle (PC: Veh group) or 3 or 10 mg/kg MTEP (PC: 3 and 10 groups, respectively) before each reconditioning session to ensure that learning conditions were identical to those experienced during the initial conditioning phase. All drug injections were made 20 min before the start of the experimental sessions. The doses of MTEP used have previously been shown to not affect locomotor activity in mice (Cowen et al, 2007), and 3 mg/kg i.p. MTEP was reported to achieve >75% receptor occupancy for at least 15 min post-dosing in mice (Anderson et al, 2003).
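The counterbalancing of test treatments described above can be illustrated with a minimal sketch. The test doses per conditioning group are taken from the text; the function name `counterbalance`, the mouse identifiers and the simple alternating assignment rule are assumptions made for illustration only, as the source does not state how treatment order was allocated to individual mice.

```python
from itertools import cycle

# Test dose (mg/kg MTEP) given to each conditioning group, as stated in the text.
TEST_DOSE = {"PC: Veh": 10, "PC: 3": 3, "PC: 10": 10}


def counterbalance(mouse_ids, group):
    """Assign each mouse an order of (first test, second test) treatments.

    Alternating the two possible orders across mice is an assumed rule,
    used here only to illustrate a counterbalanced within-subjects design.
    """
    mtep = f"MTEP {TEST_DOSE[group]} mg/kg"
    orders = cycle([("Veh", mtep), (mtep, "Veh")])
    return dict(zip(mouse_ids, orders))


if __name__ == "__main__":
    # Hypothetical mouse identifiers.
    for mouse, order in counterbalance(["m1", "m2", "m3", "m4"], "PC: 3").items():
        print(mouse, order)
```

Under this scheme every mouse receives both vehicle and MTEP, and each treatment order occurs equally often (or differs by at most one mouse) within a group.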

Figure 1

Experimental design summary. Mice were allocated to one of three groups that received injections of vehicle (PC: Veh), 3 mg/kg (PC: 3) or 10 mg/kg (PC: 10) MTEP before 11, once daily, Pavlovian conditioning sessions (Phase 1). Two tests of conditioned reinforcement (CRf; Phase 2) and goal-tracking (GT; Phase 3) were subsequently undertaken in each group. Injections of vehicle or MTEP were given before each test, the order of treatments being counterbalanced. Two Pavlovian conditioning sessions were conducted between each test (block arrows). See methods section for further details.

Magazine Training

To familiarize mice with the food reinforcer used in Pavlovian conditioning sessions, a small amount of the food was provided to mice in their home cage. The following day, mice received a single 30 min magazine training session in which food pellets were delivered once every 60 s, on average (range of 25–95 s). No drug injections were made before the magazine training session and no stimuli or response levers were presented.

Phase 1: Pavlovian Conditioning

Commencing 24 h after the magazine training session, mice received 11, once daily, Pavlovian conditioning sessions. Each 60 min session consisted of 16 trials in which presentation of a stimulus was paired with food delivery (CS+) and 16 trials in which presentation of an alternative stimulus was not paired with food (CS−). The order of stimulus presentations was randomly determined and each stimulus trial was separated by a variable, no-stimulus, inter-trial interval (ITI; range of 80–120 s; mean=100 s). For half of the mice a constant 10 s tone served as the CS+ and the 10 s flashing (1 Hz) of both cue lights served as the CS−. This contingency was reversed for the remaining mice. A single food pellet was delivered 5 s after CS+ onset. The total number of entries made into the food magazine during each stimulus trial (CS+ or CS−) was recorded and expressed as a percentage of total magazine entries made during the session (percentage of magazine entries). Food magazine entries that occurred in the first five seconds following CS+ onset (that is, before food delivery) were recorded to provide a preliminary assessment of the acquisition of goal-tracking responses. The latency to enter the food magazine following onset of the CS+ (retrieval latency) was also measured.
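The session structure and the percentage measure described above can be sketched as follows. The trial counts, stimulus duration and ITI range come from the text; the uniform ITI distribution, the choice to begin the session with an ITI, and all function names are assumptions for illustration and do not represent the authors' Med-PC program.

```python
import random


def build_session(n_per_stimulus=16, iti_range=(80, 120), cs_duration=10, seed=None):
    """Return a list of (onset_s, stimulus) pairs for one conditioning session."""
    rng = random.Random(seed)
    trials = ["CS+"] * n_per_stimulus + ["CS-"] * n_per_stimulus
    rng.shuffle(trials)  # random order of stimulus presentations
    schedule, t = [], 0.0
    for stimulus in trials:
        t += rng.uniform(*iti_range)  # variable no-stimulus ITI (uniform is assumed)
        schedule.append((t, stimulus))
        t += cs_duration  # stimulus stays on for its full duration
    return schedule


def expected_duration_s(n_trials=32, mean_iti=100, cs_duration=10):
    """Mean session length implied by the trial count, mean ITI and CS duration."""
    return n_trials * (mean_iti + cs_duration)


def percentage_of_magazine_entries(entries_during_stimulus, total_session_entries):
    """Entries during one stimulus type as a percentage of all session entries."""
    if total_session_entries == 0:
        return 0.0
    return 100.0 * entries_during_stimulus / total_session_entries


if __name__ == "__main__":
    session = build_session(seed=1)
    print(len(session), "trials; mean duration", expected_duration_s(), "s")
```

With 32 trials, a 100 s mean ITI and a 10 s stimulus, the mean duration is 3520 s, which fits within the stated 60 min session.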

Phase 2: Conditioned Reinforcement

The 60 min CRf test commenced with insertion of both response levers into the operant chamber. A single response on one lever resulted in a 1.5 s presentation of the CS+, whereas a single response on the alternate lever resulted in a 1.5 s presentation of the CS−. For half of the mice, the left lever was designated the CS+ lever and the right lever the CS− lever. This contingency was reversed for the remaining mice. No food was delivered during the test. The ability of the CS+ to serve as a conditioned reinforcer is shown by a greater number of responses on the CS+ lever than on the CS− lever.

Phase 3: Goal Tracking

The GT test was 30 min in duration and consisted of eight trials of the CS+ and eight trials of the CS−. The order of stimulus presentations was randomly determined and each stimulus trial was separated by a fixed 100 s ITI, during which no stimuli were presented. No food was delivered during the test. The total number of entries made into the food magazine during each stimulus trial was recorded. Four mice died before completion of the GT tests, reducing the size of groups PC: 3 and PC: 10 to n=17 and n=19, respectively.

Statistical Analysis

Data were initially analyzed by mixed-factor analysis of variance (ANOVA), where the three conditioning treatment groups (PC: Veh, 3 or 10) were represented by the between-subjects factor of PC treatment. The drug treatment (Veh or MTEP) administered to each of the three conditioning treatment groups during subsequent CRf and GT test sessions was included in analyses as a within-subjects factor of CRf treatment or GT treatment, respectively. Where a significant (p<0.05) main effect or interaction term was found, further analysis was performed using ANOVA and post hoc comparisons by two-tailed t-tests. To permit analysis by parametric tests, appropriate transformations were applied to bring skewed distributions closer to a normal distribution and to reduce heterogeneity of variance (Cardinal and Aitken, 2006). Specifically, for analysis of percentage of magazine entries (Phase 1), data were arcsine transformed (Y′=arcsin√(Y)). For analysis of magazine entries made during the first 5 s of CS+ presentations in conditioning sessions (Phase 1), lever responses and magazine entries in the test of CRf (Phase 2), and magazine entries in the test of goal-tracking (Phase 3), data were square root transformed (Y′=√Y). For within-subjects ANOVA, the Greenhouse-Geisser correction was used where the assumption of sphericity was violated. All figures show group mean (±SEM).
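The two transformations named above can be written out as a minimal sketch. It assumes that the percentage measure is first rescaled to a proportion between 0 and 1 before the arcsine transform is applied, which the text does not state explicitly; the function names are illustrative only.

```python
import math


def arcsine_transform(percentage):
    """Y' = arcsin(sqrt(Y)), with Y taken as a proportion in [0, 1].

    Rescaling the percentage to a proportion is an assumption; the
    arcsine of a square root is only defined for values in [0, 1].
    """
    proportion = percentage / 100.0
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("percentage must lie between 0 and 100")
    return math.asin(math.sqrt(proportion))


def sqrt_transform(count):
    """Y' = sqrt(Y), applied to count data such as lever responses."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return math.sqrt(count)


if __name__ == "__main__":
    print(arcsine_transform(50.0))  # a proportion of 0.5
    print(sqrt_transform(9))
```

The arcsine transform stretches values near 0% and 100%, where percentage data are compressed, while the square root transform reduces the positive skew and mean-dependent variance typical of response counts.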

RESULTS

Phase 1: Pavlovian Conditioning

Pavlovian conditioning performance did not differ between groups of mice that received either vehicle (PC: Veh group), 3 mg/kg (PC: 3) or 10 mg/kg (PC: 10) MTEP before each conditioning session. Across conditioning sessions, mice from all three conditioning treatment groups (PC: Veh, 3 or 10) directed a greater proportion of total session entries into the food magazine (percentage of magazine entries; Figure 2a) during presentations of the food-paired stimulus (CS+) than during presentations of the unpaired stimulus (CS−). This finding was confirmed by a mixed-factor ANOVA, which included stimulus (CS+ or CS−) and session (1–11) as within-subjects factors. A significant difference in responding to the two stimuli across conditioning sessions was identified (main effect of stimulus, F(1, 59)=1432.62, p<0.001; stimulus × session interaction, F(10,590)=83.26, p<0.001). However, there was no difference between the three conditioning treatment groups in the percentage of magazine entries directed toward the stimuli (stimulus × session × conditioning treatment interaction, not significant (NS)).
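As a reading aid, the degrees of freedom reported above can be reproduced from the design (62 mice in three conditioning groups, two stimuli, 11 sessions). The sketch below uses the standard formulas for a mixed design without sphericity correction; the function names are illustrative and this is not the authors' analysis code.

```python
def within_effect_df(*n_levels):
    """Numerator df for a within-subjects effect or interaction of such factors."""
    df = 1
    for k in n_levels:
        df *= k - 1
    return df


def mixed_anova_error_df(n_subjects, n_groups, *n_levels):
    """Error df for a within-subjects effect in a mixed design: (N - g) * prod(k - 1)."""
    return (n_subjects - n_groups) * within_effect_df(*n_levels)


if __name__ == "__main__":
    # Main effect of stimulus (2 levels): F(1, 59)
    print(within_effect_df(2), mixed_anova_error_df(62, 3, 2))
    # Stimulus x session interaction (2 x 11 levels): F(10, 590)
    print(within_effect_df(2, 11), mixed_anova_error_df(62, 3, 2, 11))
```

Both pairs match the reported values, F(1, 59) and F(10, 590), indicating that all 62 mice contributed to the Phase 1 analysis.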

Figure 2

Measures of food magazine entry activity during 11 Pavlovian conditioning sessions (Phase 1) in which mice received presentations of a stimulus paired with food delivery (CS+) and a second, unpaired stimulus (CS−). Mice were injected with either vehicle or 3 or 10 mg/kg i.p. MTEP (PC: Veh, 3, or 10) 20 min before each conditioning session. (a) Magazine entries during presentation of the CS+ and CS−, expressed as a percentage of total session entries (percentage of magazine entries), did not differ between conditioning treatment groups and stabilized from session 8 onward. (b) Magazine entries made during the first 5 s of the CS+ presentation (that is, before food delivery) increased across conditioning sessions and were unaffected by treatment with MTEP. (c) The mean retrieval latency to enter the food magazine following CS+ activation stabilized at 4–5 s, which corresponded with the time of food (US) delivery. Retrieval latencies did not differ among the three groups.

The number of magazine entries made during the first 5 s of CS+ presentations (that is, before delivery of the food reward; Figure 2b) increased across conditioning sessions (main effect of session, F(10,590)=22.01, p<0.001), but did not differ between the conditioning treatment groups (session × conditioning treatment interaction, NS). In contrast, the total number of magazine entries made during CS− presentations decreased across conditioning sessions (main effect of session, F(10,590)=43.91, p<0.001), but also did not differ among the conditioning treatment groups (session × conditioning treatment interaction, NS; data not shown).

Mice came to enter the food magazine at 4–5 s after CS+ onset (retrieval latency; Figure 2c), corresponding with the time of food delivery. The mean retrieval latency to enter the food magazine following activation of the CS+ significantly decreased across conditioning sessions (main effect of session, F(10,590)=43.23, p<0.001), and there was no difference in retrieval latencies among the three conditioning treatment groups (conditioning treatment × session interaction, NS).

Stability of conditioning performance (indicated by asymptotic responding) before the first test of CRf was observed from the eighth conditioning session. Percentage of magazine entries (Figure 2a) did not differ across sessions 8–11 (main effect of session, NS), and there was no difference between conditioning treatment groups (stimulus × session × conditioning treatment interaction, NS). Similarly, magazine entries made in the first five seconds of CS+ presentations (Figure 2b) and mean retrieval latencies (Figure 2c) did not differ across sessions 8–11 (main effect of session, NS), nor between conditioning treatment groups (session × conditioning treatment interaction, NS). No further change in conditioning performance was observed during any of the subsequent Pavlovian reconditioning sessions that occurred between the CRf and GT tests.

Phase 2: Conditioned Reinforcement

Conditioned reinforcement was influenced by the MTEP treatment given before Pavlovian conditioning sessions, but not by the MTEP treatment given during the CRf tests (Figure 3a). An initial mixed-factor ANOVA, which included lever (CS+ or CS− paired) as a within-subjects factor, confirmed that lever responding significantly differed as a result of the treatment received during conditioning sessions (lever × conditioning treatment interaction, F(2, 59)=3.80, p<0.05). However, lever responding did not reliably differ as a result of the MTEP treatment received during the CRf test (lever × CRf treatment interaction, NS).

Figure 3

Lever responding in tests of conditioned reinforcement (CRf), which examine the ability of a conditioned stimulus to reinforce a novel instrumental action. In each 60 min test session, mice were presented with two response levers; responses on one lever led to presentation of the food-paired stimulus (CS+) and responses on the alternate lever led to presentation of the unpaired stimulus (CS−). No food was delivered during each CRf test. (a) Responding for CRf was observed in mice that received vehicle or 3 mg/kg MTEP during Pavlovian conditioning (PC: Veh and 3, respectively). CRf was significantly impaired in mice that received 10 mg/kg MTEP during conditioning (PC: 10). In contrast, 10 mg/kg MTEP during the CRf test did not impair CRf in mice that received vehicle during conditioning (PC-CRf: Veh-10). *p<0.05, post hoc t-test comparison between Veh-Veh and 10-Veh CS+ lever responses; #p<0.05, post hoc t-test comparison between Veh-10 and 10-10 CS+ lever responses. (b) 10 mg/kg MTEP did not alter the temporal profile of lever responding in mice that received vehicle during conditioning (PC: Veh). Mice that received 10 mg/kg MTEP during conditioning (PC: 10) failed to show any significant difference in CS+ and CS− lever responding in any 15-min period of each CRf test.

Within-subjects ANOVA comparisons of CS+ and CS− lever responding, which included both CRf treatment conditions (Veh or MTEP), were undertaken to determine whether each conditioning treatment group showed CRf (that is, more responding on the CS+ lever than the CS− lever). Conditioned reinforcement was shown in the PC: Veh group (main effect of lever, F(1, 21)=26.53, p<0.001) and in the PC: 3 group (main effect of lever, F(1, 18)=8.55, p<0.01). However, the PC: 10 group failed to show any difference in CS+ and CS− lever responding (main effect of lever, NS).

The impairment in responding for CRf in the PC: 10 group was due to a specific reduction in responding for the food-paired stimulus (CS+), rather than a general reduction in the ability of these mice to perform an instrumental response. A mixed-factor ANOVA, performed for each stimulus-paired lever, showed that CS+ lever responding was significantly influenced by the treatment received during conditioning (main effect of conditioning treatment, F(2, 59)=3.59, p<0.05). In contrast, CS− lever responding was unaffected by the treatment received during conditioning (main effect of conditioning treatment, NS). Post hoc comparisons indicated that CS+ lever responding was significantly reduced in the PC: 10 group, in comparison with the PC: Veh group, during CRf tests that were preceded by injection of vehicle (t=2.68, d.f.=41, p<0.05) and by 10 mg/kg MTEP (t=2.70, d.f.=41, p<0.05). Consistent with a dose-related effect of MTEP, there were no differences in CS+ lever responding between the PC: Veh and PC: 3 groups or the PC: 3 and PC: 10 groups in either CRf test (t-test comparisons, NS).

Although CRf was not impaired by pre-test administration of 10 mg/kg MTEP in the PC: Veh group, a possibility existed that the temporal profile of lever responding may have been altered by acute 10 mg/kg MTEP treatment. Further analysis was therefore performed to determine whether administration of 10 mg/kg MTEP in the test of CRf had any effect on the temporal profile of lever responding (Figure 3b). Conditioned reinforcement (that is, greater responding on the CS+ lever) was evident in each 15-min time period of the 60-min CRf test in the PC: Veh group (main effect of lever, F(1, 21)=25.81, p<0.001), but not in the PC: 10 group (main effect of lever, NS). In the PC: Veh group, 10 mg/kg MTEP during the test of CRf did not alter the temporal profile of either CS+ lever responding (period × CRf treatment interaction, NS), or CS− lever responding (period × CRf treatment interaction, NS).

Magazine entry activity during CRf tests was also examined (Table 1), as this could provide further indication of whether MTEP administration had any gross effects on activity. A mixed-factor ANOVA of mean total magazine entries indicated that entries were significantly increased during CRf tests in which MTEP was administered (main effect of CRf treatment, F(1, 59)=11.19, p<0.01), but that the effect of MTEP on magazine entries did not differ among conditioning treatment groups (CRf treatment × conditioning treatment interaction, NS). Analysis of the time course of magazine entries during the CRf tests indicated that entries decreased over the course of the test session (main effect of period, F(3, 177)=7.0, p<0.01), but the effects of MTEP given during the CRf test did not reach statistical significance (CRf treatment × period interaction, NS).

Table 1 Conditioned Reinforcement: Magazine Entries

Phase 3: Goal Tracking

Presentation of the food-paired stimulus (CS+), in the absence of food delivery, elicited approach responses into the food magazine (that is, towards the goal). Mice made fewer head entry responses into the magazine during presentation of the unpaired stimulus (CS−), indicating that the CS+ was able to serve as a predictor of food availability (Figure 4a). There was no effect of MTEP given during the Pavlovian conditioning phase, or MTEP given during the GT test, on goal-tracking responses. These findings were confirmed by a mixed-factor ANOVA, which included stimulus (CS+, CS−) as a within-subjects factor. Mean total magazine entry responses significantly differed depending on the identity of the stimulus (main effect of stimulus, F(1, 55)=200.51, p<0.001), but there was no effect of either the treatment received during conditioning (stimulus × conditioning treatment interaction, NS) or during the GT test (stimulus × GT treatment interaction, NS) on goal-tracking responses.

Figure 4

Food magazine entries in tests of goal-tracking (GT), which examine the ability of a conditioned stimulus to elicit approach responses to the place of food delivery. No food was delivered during each GT test. (a) Mice that received vehicle, 3 or 10 mg/kg i.p. MTEP during conditioning sessions (PC: Veh, 3 and 10, respectively) made more entries into the food magazine during presentation of the food-paired stimulus (CS+) than during presentation of the unpaired stimulus (CS−). There was no difference in magazine activity between the conditioning treatment groups, and magazine activity was not altered by the MTEP treatment received during the GT test. (b) Magazine entries made in each CS+ stimulus trial decreased across successive trials. The number of magazine entries made during each CS+ stimulus trial was unaffected by the treatment (vehicle or 10 mg/kg MTEP) received during the GT test in PC: Veh and PC: 10 groups.

Analysis of magazine entries made during each stimulus trial was performed to determine whether acute 10 mg/kg MTEP treatment altered the profile of goal-tracking responses in the PC: Veh group and whether response profiles differed between the PC: Veh and PC: 10 groups (Figure 4b). For both PC: Veh and PC: 10 groups, the number of magazine entries made during each CS+ trial decreased across the course of the GT test and few responses were made across all CS− trials. Analysis of magazine entries during CS+ trials was performed using a mixed-factor ANOVA, which included trial (1–8) as a within-subjects factor. This analysis confirmed that magazine entries made during each CS+ trial significantly decreased with successive trials (main effect of trial, F(7, 273)=17.21, p<0.001), but that this profile of responding was unaffected by the treatment received during conditioning (trial × conditioning treatment interaction, NS), or the treatment received during the GT test (trial × GT treatment interaction, NS).

DISCUSSION

This study explored the effects of the selective mGluR5 antagonist, MTEP, on the acquisition of a Pavlovian association that enables a food-paired stimulus to acquire predictive properties that signal reward availability (goal-tracking) and incentive properties necessary to reinforce a novel instrumental response (conditioned reinforcement). We report that MTEP did not affect performance during Pavlovian conditioning sessions, indicating that the overall motivation to obtain food and the ability of mice to discriminate between the food-paired stimulus and the stimulus not paired with food were unaffected by blockade of mGluR5. In addition, mGluR5 function was not required for the acquisition of predictive properties necessary for the control over goal-tracking responses by the food-paired stimulus. However, mGluR5 function was critical for the associative learning processes through which the CS acquires the incentive value that allows it to serve as a conditioned reinforcer. Once incentive learning had taken place, mGluR5 function was not required for the expression of this CS-reinforced behavior, which has been proposed to depend upon CS-elicited representations of general affect (Burke et al, 2007; Parkinson et al, 2005). These findings add important new information regarding the function of mGluR5 in the control over appetitive behaviors by reward-paired stimuli.

A potential explanation for the findings reported here is that impaired CRf in mice that had received MTEP during conditioning sessions (PC: 10 group) was due to a state-dependent learning process (Stephens et al, 2000). That is, MTEP may have induced an interoceptive state during conditioning sessions and the subsequent retrieval of the CS memory during the CRf test may have been disrupted due to the presence of a different interoceptive state, namely the absence of MTEP. However, this account is unlikely since CRf responding was also impaired in the PC: 10 group when 10 mg/kg MTEP was given during the CRf test to induce the same state that existed during conditioning sessions.

That we found contrasting effects of MTEP on responding for conditioned reinforcement and goal-tracking responses may have been due to mice having experienced relatively more stimulus-food (CS–US) pairings before the GT tests than the CRf tests. Thus, goal-tracking responses may have been less susceptible to the effects of MTEP due to strengthened CS–US associations. At variance with this possibility is the observation that mice came to use the CS+ as a predictor of food delivery even during Pavlovian conditioning sessions that preceded the first CRf test (Figure 2b and c). Critically, the acquisition of these goal-tracking responses was unaffected by administration of MTEP, thereby supporting our proposition that mGluR5 has a dissociable role in the acquisition of predictive and incentive motivational properties by CSs.

Our finding that CRf was not impaired by administration of 10 mg/kg MTEP, during the test only, in mice that had received vehicle before conditioning sessions (PC: Veh group) is in apparent contrast to behavioral studies of cue-induced reinstatement that have reported a role of mGluR5 in the expression of control over responding maintained by both natural- and drug-paired CSs (Backstrom and Hyytia, 2006; Bespalov et al, 2005; Gass et al, 2009; Kumaresan et al, 2009; Martin-Fardon et al, 2009; Schroeder et al, 2008; Tessari et al, 2004). As it is possible that higher doses of MTEP would have reduced the expression of CRf in our study, our findings do not exclude a role of mGluR5 in the control over appetitive behaviors by reward-paired stimuli. Alternatively, subtle methodological differences may have contributed to this apparent contrast in findings. First, in our study, the CS reinforced an instrumental response that had not been previously associated with primary reinforcement. Second, mice were trained on a purely Pavlovian (stimulus-outcome) association, whereas an instrumental (response-outcome) component is embedded in the acquisition of associations between environmental stimuli and reward in studies of self-administration and cue-induced reinstatement. Finally, we examined instrumental responding supported by a CS immediately following the conditioning phase, whereas extinction learning or periods of withdrawal are commonly employed in studies of cue-induced reinstatement and may contribute to neural changes mediating the subsequent expression of control over appetitive behaviors by CSs (Conrad et al, 2008; Ghasemzadeh et al, 2009; Grimm et al, 2003; Lu et al, 2005).

Our finding that mGluR5 antagonism was effective in reducing a CS-reinforced behavior when administered during the acquisition of a Pavlovian association shares some similarity with studies examining the role of mGluR5 in conditioned place preference (CPP) learning. Administration of the mGluR5 antagonist 6-methyl-2-(phenylethynyl)pyridine (MPEP), during conditioning (that is, the acquisition phase), reduced the development of cocaine CPP in mice while having no effect on the development of amphetamine, ethanol, morphine or nicotine CPP (McGeehan and Olive, 2003). Another study reported that higher doses of MPEP attenuated both the acquisition and expression of morphine CPP in mice (Popik and Wrobel, 2002). In rats, the expression of cocaine CPP was unaffected by a dose of MPEP that reduced the expression of morphine CPP (Herzig and Schmidt, 2004). Thus, mGluR5 can contribute to the acquisition of associations that enable reward-paired, contextual stimuli to mediate CPP and can also influence the expression of CPP, a finding that may depend on the extent of mGluR5 blockade and/or the primary reward experienced during conditioning. However, the expression of CPP may be due to either predictive or incentive motivational associations formed between the contextual cues and the paired outcome (Stephens et al, 2010). While acknowledging that substantial differences exist between contextual vs discrete cue conditioning, our findings may provide further insight into the psychological mechanisms underlying these earlier CPP reports by identifying a specific role of mGluR5 in the acquisition of incentive associations between an environmental stimulus and reward, while the ability of a reward-paired stimulus to acquire predictive properties is unaffected by mGluR5 antagonism.

An advantage of the behavioral models used in this study is that the underlying neural circuitry is relatively well characterized. Brain areas that mediate control over behavior by conditioned reinforcers, and that are also rich in expression of mGluR5 (Romano et al, 1995; Shigemoto et al, 1993), include the nucleus accumbens (NAc) core of the striatum (Ito et al, 2004; Parkinson et al, 1999) and the orbitofrontal cortex (Burke et al, 2008; Pears et al, 2003). The ventral striatum is densely populated with medium spiny neurons (MSNs) that control motoric output primarily through the integration of glutamatergic inputs from the cortex, hippocampus and amygdala and dopaminergic signals arising from the VTA (Cardinal and Everitt, 2004; Grace et al, 2007). Expression of mGluR5 is found on both striatonigral and striatopallidal projection MSNs (Tallaksen-Greene et al, 1998; Testa et al, 1998) and mGluR5 has a central role in multiple forms of plasticity (Anwyl, 2009; Bellone et al, 2008) that are likely to underpin a variety of appetitive learning and memory processes (Hyman et al, 2006; Kelley, 2004; Malenka and Bear, 2004). Electrophysiological studies have shown that mGluR5 has an important role in regulating MSN excitability (D'Ascenzo et al, 2009), and is necessary for the induction of synaptic plasticity in the NAc that occurs following stimulation of glutamatergic cortical inputs (Schotanus and Chergui, 2008), but is not involved in the maintenance of plasticity following its induction (Gubellini et al, 2003; Sung et al, 2001). Thus, it is particularly interesting that we found mGluR5 to be involved in the acquisition of an incentive association, but not in the expression of responding for the reward-paired CS.

At the cellular level, neuroplastic changes that occur in the NAc during associative learning, and which determine the subsequent expression of control over appetitive behaviors by CSs, are likely to depend, in part, on AMPA-mediated currents (Backstrom and Hyytia, 2006, 2007; Di Ciano and Everitt, 2001). Most AMPA receptors are heteromers, consisting of at least two different subunits (GluR1–GluR4), with the absence of the GluR2 subunit giving rise to higher conductance by conferring permeability to Ca2+ (Schoepfer et al, 1994). The subunit composition of AMPA receptors therefore provides a means to regulate membrane excitability, and behavioral studies have shown that incentive learning processes are influenced by AMPA receptor expression and subunit composition. For example, mice lacking the GluR1 AMPA receptor subunit (gria1 knock out) show impaired responding for CRf (Mead and Stephens, 2003b), whereas mice lacking the GluR2 AMPA receptor subunit (gria2 knock out) show enhanced responding for a CS paired with food (Mead and Stephens, 2003a). Similarly, changes in the number and subunit composition of AMPA receptors within the NAc, following cocaine self-administration, are proposed to mediate enhanced responding for cocaine-paired stimuli (Conrad et al, 2008).

The above reports are particularly relevant because stimulation of group I mGluRs, including mGluR5, can produce changes in the expression of AMPA receptors (Bellone and Luscher, 2005; Jo et al, 2008; Kelly et al, 2009; Mameli et al, 2007; Snyder et al, 2001; Waung et al, 2008; Zhang et al, 2008). In the striatum, activation of mGluR5 is required for phosphorylation of striatal GluR1–Ser831 and -Ser845 (Ahn and Choe, 2009), and GluR2–Ser880 residues (Ahn and Choe, 2010). A recent study in AMPA GluR1 Ser831 mutated mice identified that action at Ser831 was necessary for normal conditioned reinforcement (Crombag et al, 2008). Phosphorylation of GluR2–Ser880 appears critically important for the regulation of AMPA receptor internalization during synaptic plasticity (Chung et al, 2000; Xia et al, 2000). Thus, it is tempting to propose that blockade of mGluR5 during Pavlovian conditioning in our study may have prevented alterations in the conductance, kinetics, glutamate affinity, or number and distribution of AMPA receptors in the postsynaptic membrane that may normally be necessary for experience-dependent alterations in synaptic plasticity (Derkach et al, 2007; Shepherd and Huganir, 2007) and which subsequently determine the sensitivity to control over appetitive behaviors by reward-paired stimuli.

However, mGluR5 has many diverse roles in the CNS, including involvement in astrocytic control over synaptic transmission and plasticity (see Haydon et al, 2009 for review) and regulation of neurotransmitter release via retrograde endocannabinoid signaling (Robbe et al, 2002). Further studies will be required to determine both the location of mGluR5 and downstream signaling pathways that are involved in mediating the effects observed in our study. Recent reports may guide these investigations by pointing to mGluR5 within limbic brain regions as being critical for the reinstatement of cocaine seeking induced by a cocaine prime (Kumaresan et al, 2009) and signaling through the extracellular signal-regulated kinase (ERK1/2) pathway as a mechanism by which mGluR5 antagonism effectively disrupts cue-induced reinstatement of alcohol seeking (Schroeder et al, 2008).

CONCLUSIONS

In this study we identify a necessary role of mGluR5 in the learning of an incentive association between an environmental stimulus and food delivery that enables the food-paired stimulus to subsequently reinforce a novel instrumental action. There is strong supporting evidence to hypothesize that these findings are due to a blockade of neuronal plasticity, mediated by changes in the expression of AMPA receptors within striatal circuits that are normally required for reward-paired cues to gain control over behavior. The acquisition of incentive associations is necessary for many aspects of adaptive behavior, but conditioned incentives are also proposed to contribute to compulsive drug seeking and relapse observed in drug addiction (Everitt et al, 2001; Robinson and Berridge, 2000; Stewart et al, 1984), and non-homeostatic eating that may lead to obesity (Holland and Petrovich, 2005; Volkow et al, 2008). Electrophysiological studies have shown that cocaine exposure can produce long-lasting plastic changes within the VTA and accumbens (Borgland et al, 2004; Chen et al, 2008; Mameli et al, 2009; Ungless et al, 2001), and mGluR5-mediated plasticity in both of these regions is involved in, or affected by, cocaine experience (Bird et al, 2010; Fourgeaud et al, 2004; Moussawi et al, 2009). The behavioral consequences of these plastic changes are still emerging, but alterations in plasticity following drug exposure may impair the ability of drug addicts to effectively learn about and/or employ strategies that could compete with drug-seeking behaviors (Kalivas, 2009; Stephens and Duka, 2008). Our findings point to two interesting hypotheses. First, mGluR5-mediated plasticity during drug self-administration may be required for the attribution of incentive value to drug-paired cues that enables them to support drug seeking and relapse, without inducing generalized deficits in reward learning. Second, disruption of mGluR5-mediated plasticity following drug experience may impair further incentive learning necessary for implementing new behaviors that could compete with drug seeking. Understanding the intricate mechanisms through which mGluR5 mediates learning and memory processes may provide therapeutic targets for a range of clinical disorders that are characterized by maladaptive responding for reward-paired cues.