Mesocorticolimbic dopamine (DA) is known to play a role in cognitive processes of working memory and reward-based learning (Goldman-Rakic, 1992; Schultz, 2002). However, there is controversy with regard to the neural site at which DA acts to modulate these cognitive functions.

Identification of the neural sites of DA action on cognitive function holds particular relevance given the large variability in the effects of dopaminergic drugs: Dopaminergic drugs may improve some, but impair other cognitive functions (Cools et al, 2001; Mehta et al, 2001; Mehta et al, 2004). It has been proposed that the direction of drug effects depends critically on baseline DA levels in the neural system on which DA acts to modulate the task under study (Robbins, 2000; Crofts et al, 2001). More specifically, DA-enhancing drugs may improve functioning of neural systems with low baseline DA levels whereas the same drugs may detrimentally ‘over-dose’ other systems where DA levels are already optimal (Zahrt et al, 1997; Arnsten, 1998; Granon et al, 2000; Mattay et al, 2003; Phillips et al, 2004).

One approach to addressing the role of DA in human cognition is investigating Parkinson's disease (PD). PD is characterized by striatal DA depletion and is accompanied by cognitive deficits from the earliest disease stages (Owen et al, 1992). The DA depletion progresses from the dorsal to the ventral striatum, so that, in early PD, the dorsal striatum is severely depleted, but the ventral striatum is relatively intact (Farley et al, 1977; Kish et al, 1988). Hence, mild PD provides a unique model for assessing dopaminergic drug effects on neural systems with differential baseline DA levels.

We have shown that medication with L-DOPA, which is a precursor affecting primarily DA in the striatum (Hornykiewicz, 1974), improved task-switching performance in mild PD patients. By contrast, L-DOPA impaired probabilistic reversal learning in the same patients (Cools et al, 2001). It was hypothesized that the beneficial effect of L-DOPA on task-switching reflects a remediation of DA levels in severely depleted dorsal fronto-striatal circuitry, while, conversely, the impairing effect of L-DOPA on reversal learning reflects a detrimental ‘over-dosing’ of intact ventral fronto-striatal circuitry (Gotham et al, 1988; Swainson et al, 2000; Cools et al, 2001). In keeping with this hypothesis, previous functional imaging studies in PD patients have revealed that the beneficial effect of L-DOPA on working memory is accompanied by modulation of the dorsolateral PFC, which is strongly connected with the severely depleted dorsal striatum (Cools et al, 2002b; Mattay et al, 2002). However, no study has yet investigated the neural site at which L-DOPA acts to impair cognitive functioning. In the present study, we aimed to provide evidence for the hypothesis that L-DOPA in PD interacts with the intact nucleus accumbens (NAc) and/or the ventral PFC, but not the severely depleted dorsal striatum and dorsal PFC to bias reversal learning. To this end, we investigated the effects of L-DOPA withdrawal in mild PD on BOLD responses during probabilistic reversal learning.



Fourteen PD patients participated in the study (52–80 years old; mean age of 67.3±8.8; three female). All patients were assessed on two occasions. For one occasion they were asked to abstain from their L-DOPA the night before the assessment, at least 18 h prior to the experiment. On the other occasion they were taking their medication as normal. Three ON L-DOPA sessions and four OFF L-DOPA sessions were excluded (one ON and two OFF sessions due to movement artifacts; two ON and two OFF sessions due to failure to perform the task; see below for further details), leaving eight complete ON-OFF data sets (ie data from patients tested on both ON and OFF occasions; three were tested ON first).

All patients were diagnosed by one of two neurologists with a specialist interest in movement disorders (RAB or SJGL) as having idiopathic PD according to UK PDS brain bank criteria. Patients with a significant medical (or neurological) history not related directly to PD (eg stroke, head injury), as well as those with dementia (Mini Mental State Examination, MMSE≤24; Folstein et al, 1975) or depression, were excluded from the study. The mean score on the Beck Depression Inventory (BDI; Beck et al, 1961) was within the normal range (Table 1). All patients had normal color vision. The severity of clinical symptoms was assessed according to the Hoehn and Yahr rating scale (Hoehn and Yahr, 1967) and the Unified PD (44-item) Rating Scale (UPDRS) (Fahn et al, 1987). Hoehn and Yahr ratings ranged between I and II, and both the Hoehn and Yahr and the UPDRS ratings were significantly higher when patients were OFF than when they were ON medication (Hoehn and Yahr: T7=−4.96, P=0.002; UPDRS: T7=−6.5, P<0.0001). All patients included in the study were receiving daily L-DOPA preparations, and two were also taking selective serotonin reuptake inhibitors (SSRIs: citalopram and paroxetine; these patients were not depressed at the time of testing). They were not taking any other forms of medication. All patients were on stable medication for at least 3 months prior to the study.

Table 1 Demographic Characteristics

Other demographics are summarized in Table 1.

The study was approved by the Local Research Ethical Committee in Cambridge and carried out in accordance with the Declaration of Helsinki. All volunteers gave written informed consent.

Experimental Design

Each participant was scanned while performing the probabilistic reversal learning task in three successive 9 min sessions (see Figure 1). This task has been described elsewhere and the reader is referred to Cools et al (2002a) for more detail. Before entering the scanner, participants performed a 30 trial training session. This was a simple probabilistic discrimination task (ie without reversal stages) designed to introduce the participant to the concept of a probabilistic error without the need to reverse responding. Subsequently, participants practiced one or two complete reversal blocks to familiarize them with the reversals and to minimize practice/test-retest effects from session to session.

Figure 1

Schematic of the probabilistic reversal learning task. Subjects were presented with the same two abstract visual patterns (A, B) throughout a task block, as displayed in the left panel. They were instructed to choose the pattern that was usually correct, by making right or left button presses. The positive:negative feedback contingencies were always 80:20, so that 20% of correct responses would be accompanied by spurious negative feedback. This response constituted a probabilistic error. Subjects were told that the rule could change, so that, at some point, the other pattern would become correct. The instructions emphasized that they should change responding to the other pattern only when they were certain that the rule had changed. Following contingency reversal, subjects would typically continue to respond to the previously relevant pattern (A) and reverse their responding to the other pattern (B) only after two or three errors.

On each go, the same two patterns were presented. One of the patterns was correct and the other pattern was wrong, and subjects had to choose the correct pattern on each go. During the task, the rule changed intermittently so that the alternate pattern became usually correct. Subjects were instructed to only start choosing the other pattern when they were sure that the rule had changed.

Stimuli were abstract colored patterns presented simultaneously in the left and right visual fields (location randomized) on a computer display projected onto a mirror in the MRI scanner. Each block consisted of 10 discrimination stages, and therefore, nine reversal stages. Reversal of the stimulus-outcome contingencies occurred after between 10 and 15 correct responses (including probabilistic errors). This learning criterion varied from stage to stage. The number of probabilistic errors between each reversal varied from 0 to 4. Responses were made using the left or right button on a button box positioned on the stomach of the subject. On each individual trial, the stimuli were presented for 2000 ms within which the response had to be made (or else a ‘too late’ message was presented). Feedback, consisting of a green smiley face for correct responses or a red sad face for incorrect responses, was presented immediately after the response. The feedback faces were presented centrally, between the two stimuli, for 500 ms during which the stimuli also remained on the screen. After feedback, the stimuli were removed and the face was replaced by a fixation cross for a variable interval so that the overall inter-stimulus interval was 3253 ms, enabling precise desynchronization from the repetition time (TR).
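The effect of the 3253 ms interval can be illustrated with a short calculation. The sketch below assumes the 1.6 s TR reported for the EPI acquisition and an arbitrary run of 32 trials; the function name is hypothetical and is not part of the original task software.

```python
# Lag of each trial onset relative to the most recent scan onset.
# ISI_MS comes from the task description and TR_MS from the EPI acquisition
# (TR 1.6 s). The number of trials used below is an illustrative assumption.
ISI_MS = 3253
TR_MS = 1600


def onset_offsets(n_trials, isi_ms=ISI_MS, tr_ms=TR_MS):
    """Return each trial onset's lag (ms) after the preceding scan onset."""
    return [(trial * isi_ms) % tr_ms for trial in range(n_trials)]


offsets = onset_offsets(32)
print(offsets[:5])        # the lag advances by 53 ms from trial to trial
print(len(set(offsets)))  # number of distinct lags across the 32 trials
```

Because 3253 ms is not a multiple of 1600 ms, successive trials fall at different points within the acquisition cycle, so the hemodynamic response is sampled at many different phases over a run.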

Imaging Acquisition

Imaging data were collected using a Bruker Medspec scanner (S300; Bruker, Ettlingen, Germany) operating at 3 Tesla. T2*-weighted echo-planar images (EPIs), depicting blood oxygenation level-dependent contrast, were acquired in each session (TR, 1.6 s; echo time, 27 ms). A total of 21 slices (each of 4 mm thickness; interslice gap, 1 mm; matrix size, 128 × 128; bandwidth, 100 kHz; voxel size before normalization 1.56 × 1.56 × 5 mm and after normalization 3 × 3 × 3 mm; axial oblique acquisition orientation) per image were acquired. In addition, high-resolution T1-weighted images for spatial normalization were acquired of each participant (voxel size 1 × 1 × 1 mm). The first 12 EPIs in each session were discarded to avoid T1 equilibrium effects.

Imaging Analysis

Data analysis was performed using SPM2 (Statistical Parametric Mapping; Wellcome Department of Cognitive Neurology, London, UK). Preprocessing procedures included slice acquisition time correction, within-subject realignment, geometric undistortion using fieldmaps (Cusack and Papadakis, 2002), spatial normalization using each individual subject's SPGR (spoiled gradient echo sequence) image, skull-stripped using the Brain Extraction Tool (Smith, 2002), and the Montreal Neurological Institute (MNI) skull-stripped structural template (SPM2), and spatial smoothing using a Gaussian kernel (10 mm full-width at half-maximum). Time series were high-pass filtered (128 s). The movement parameters that were generated by realignment revealed that three subjects moved excessively, possibly indicating tremor, and were excluded on this basis.

A canonical hemodynamic response function was used as a covariate in a general linear model and a parameter estimate was generated for each voxel for each event type. Six movement parameters obtained from the realignment stage of preprocessing, as well as the global session means were entered as covariates of no interest. The parameter estimate, derived from the mean least squares fit of the model to the data, reflects the strength of covariance between the data and the canonical response function for a given condition. Individuals' contrast images, derived from pair-wise contrasts between parameter estimates for different events, were taken to a second-level group analysis in which t values were calculated for each voxel, treating inter-subject variability as a random effect. The t values were transformed to unit normal Z distribution to create a statistical parametric map for each of the planned contrasts (described below).
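In standard notation, the first-level model described above can be sketched as follows. The symbols are introduced here for illustration and do not appear in the original text; the estimator shown is ordinary least squares and assumes a design matrix of full column rank.

```latex
\begin{aligned}
y &= X\beta + \varepsilon, \\
\hat{\beta} &= (X^{\top}X)^{-1}X^{\top}y, \\
\text{contrast} &= c^{\top}\hat{\beta}.
\end{aligned}
```

Here $y$ is the time series of one voxel, the columns of $X$ hold the event regressors (event onsets convolved with the canonical hemodynamic response function) together with the covariates of no interest, $\hat{\beta}$ contains one parameter estimate per column, and $c$ is a weight vector that picks out a pair-wise difference between event types.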

The hemodynamic response function was modeled to the onset of the responses, which co-occurred with the presentation of the feedback. The following events were modeled: (1) correct responses, co-occurring with positive feedback, as a baseline (approximately 320 trials per session); (2) misleading probabilistic errors, on which negative feedback was given to correct responses (approximately 52 trials per session); (3) final reversal errors, resulting in the subject shifting their responding; and (4) the other preceding reversal errors, following a contingency reversal but preceding the final reversal errors (the total number varied as a function of performance). The final reversal errors (co-occurring with the final negative feedback) were chosen as critical events of interest (ie reflecting reversal learning) because activation of a reversal network was assumed to follow this last negative feedback. The correct responses were chosen as our baseline measure. Error trials that could not be classified as probabilistic errors or reversal errors (so-called ‘spontaneous’ errors) were not included in the model. Probabilistic error trials that led to inappropriate switches and the trial that led to a switch back to the currently relevant stimulus were also excluded from the model.

The following contrasts were computed for each session: (i) final reversal errors minus correct responses, (ii) other nonswitch errors (including the probabilistic and preceding reversal errors) minus correct responses, and (iii) final reversal errors minus other nonswitch errors. These contrast images were taken to second-level group analyses.
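The three contrasts can be written as weight vectors over the four modeled event types. The sketch below is illustrative only: the event labels are hypothetical names, and the equal weighting of the two nonswitch error types in contrasts (ii) and (iii) is an assumption, since the text does not state how they were combined.

```python
# Event labels are hypothetical names for the four modeled event types.
EVENTS = [
    "correct",
    "probabilistic_error",
    "final_reversal_error",
    "preceding_reversal_error",
]

# Assumption: the two nonswitch error types are averaged with equal weight.
CONTRASTS = {
    "final_minus_correct": {
        "final_reversal_error": 1.0, "correct": -1.0},
    "nonswitch_minus_correct": {
        "probabilistic_error": 0.5, "preceding_reversal_error": 0.5,
        "correct": -1.0},
    "final_minus_nonswitch": {
        "final_reversal_error": 1.0,
        "probabilistic_error": -0.5, "preceding_reversal_error": -0.5},
}


def weight_vector(weights):
    """Expand a sparse weight mapping into a vector ordered like EVENTS."""
    return [weights.get(event, 0.0) for event in EVENTS]


def contrast_value(betas, weights):
    """Weighted sum of parameter estimates for one contrast."""
    return sum(w * betas[event] for event, w in weights.items())


# Made-up parameter estimates, for demonstration only.
example_betas = {
    "correct": 1.0,
    "probabilistic_error": 2.0,
    "final_reversal_error": 4.0,
    "preceding_reversal_error": 3.0,
}
for name, weights in CONTRASTS.items():
    print(name, weight_vector(weights), contrast_value(example_betas, weights))
```

Each weight vector sums to zero, so a contrast value reflects a difference between event types rather than overall signal level.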

In the present study, we predicted that L-DOPA would modulate reversal-related signal change in the NAc and/or the ventral PFC, but not in the dorsal striatum or dorsal PFC. These predictions were assessed using whole-brain analyses with a statistical threshold of P<0.05, corrected for multiple comparisons according to Random Field Theory (Worsley et al, 1996). In addition, the above-mentioned strong predictions justified analyses within regions of interest (ROIs).

Regions of Interest

The NAc ROI was defined anatomically and drawn by hand on slices of the Montreal Neurological Institute (MNI) skull-stripped structural (average 152 T1) template (volume of ROI=1652 mm3; Figure 2). The dorsal striatum ROI was defined by subtracting our NAc ROI from a template ROI for the striatum. This striatal ROI consisted of the right and left putamen and right and left caudate nucleus, as defined and provided by Tzourio-Mazoyer et al (2002). Although there is considerable agreement about the anatomical boundaries of and within striatal brain regions, a clear definition of both anatomical and functional subregions within the PFC is lacking. Therefore, frontal ROIs were derived from the data itself. In order to obtain task-related prefrontal activation clusters that were orthogonal to our comparisons of interest between the ON and OFF L-DOPA sessions, contrasts were calculated across the two sessions, in which the final reversal errors were contrasted with the baseline correct responses. All sessions were included in this one-sample t-test. The activation clusters were thresholded at P<0.05 (corrected for multiple comparisons at the whole brain level) with extent of at least 20 voxels (Table 2) and all PFC regions (Talairach y coordinate >0) were selected directly for subsequent ROI analysis.

Figure 2

The nucleus accumbens region of interest.

Table 2 Signal Change during Final Reversal Errors Relative to Correct Responses

The above-described statistical models were then reapplied to the average signal within the ROIs for each subject's session, using the MarsBar tool for SPM2 (Brett et al, 2002). Average parameter estimates, representing mean signal change, were extracted from each ROI and each subject's session, and the mean values across subjects are reported in Figures 3, 4 and 5. These mean values were submitted to a repeated measures ANOVA (SPSS 11.0, Chicago, IL) with three within-subjects factors: ROI (eight levels: the NAc, the dorsal striatum, and the six task-related frontal regions), L-DOPA treatment (two levels: ON and OFF) and event-type (two levels: the final reversal errors and the baseline correct responses). No epsilon correction was applied, because the sphericity assumption was met (Mauchly's W was not significant for our ROI × L-DOPA treatment × event-type interaction; Mauchly's W (27)=0.001, P=0.5). For the ROI analyses we report two-tailed P-values (statistical threshold P<0.05).

Figure 3

Reversal-related signal change across L-DOPA treatment sessions. The BOLD activation pattern during the final reversal errors relative to the baseline correct responses, across both ON and OFF L-DOPA treatment sessions, superimposed on the Montreal Neurological Institute (MNI) template brain (individual brain considered most typical of the 305 brains used to define the MNI template). The figure displays all activation above a threshold of P=0.001 (uncorrected for multiple comparisons). We chose this criterion for display in order to (i) reveal the physiological plausibility of the signal and (ii) facilitate comparison with the data in our previous report (Cools et al, 2002a) in which we also displayed all activation at P=0.001. See Table 2 for peaks that reached significance at P=0.05 (after family-wise error-rate correction for multiple comparisons). Numbers in top left-hand corner of each quadrant indicate Talairach coordinates. R=right hemisphere.

Figure 4

Effects of L-DOPA on reversal-related signal change in the striatum and the prefrontal cortex. Values represent mean parameter estimates (betas, as estimated by SPM2) and error bars represent SE of the mean. Abbreviations: PD=Parkinson's disease; NAc=nucleus accumbens, VLPFC=ventrolateral prefrontal cortex, DLPFC=dorsolateral prefrontal cortex, OFC=orbitofrontal cortex, FEF=frontal eye fields, RCZ=rostral cingulate zone, IFJ=inferior frontal junction. *Significant treatment by event-type interaction.

Figure 5

L-DOPA disrupted activation in the nucleus accumbens. (a) Reversal-related signal change in the NAc in the OFF L-DOPA condition relative to the ON L-DOPA condition, as revealed by a whole-brain analysis, is shown on a section of slice 139 of the MNI ‘colin27’ template with MRIcro. All T-values >1 are shown. (b) Mean signal change from the NAc ROI (Figure 2) is shown as a function of event-type and L-DOPA treatment condition. It can be seen that signal change is increased during the final reversal errors (which lead to behavioral adaptation) when patients are OFF L-DOPA, but not when patients are ON L-DOPA. Values represent mean parameter estimates (betas, as estimated by SPM2) and error bars represent SE of the mean. *Significant treatment by event-type interaction.

We also examined the main effect of L-DOPA by contrasting all task-related regressors from the ON L-DOPA sessions with all task-related regressors from the OFF L-DOPA sessions in each ROI and with whole-brain analyses.

Behavioral Data Analysis

Postacquisition inspection of the data revealed that some patients were unable to perform the task. It was unclear from simple inspection whether these patients learnt anything or whether they responded at random, thereby reaching learning criteria by chance. For this reason, a learning criterion was imposed on a post hoc basis: Data were included, only if participants completed all nine reversals in each of the three scanning runs and reached an additional criterion of six consecutively correct responses during at least seven out of the 10 learning stages in each scanning run. This criterion led to exclusion of four sessions (two OFF and two ON), resulting in a sample size of eight patients (and 16 scanning sessions).
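The post hoc inclusion rule can be expressed as a short check. This is a minimal sketch under stated assumptions: the data layout (per-stage lists of booleans marking correct responses, plus a count of completed reversals per run) and all function names are hypothetical, and the text does not specify how probabilistic errors were scored for the purpose of this criterion.

```python
def longest_correct_run(responses):
    """Length of the longest run of consecutive correct responses."""
    longest = current = 0
    for correct in responses:
        current = current + 1 if correct else 0
        longest = max(longest, current)
    return longest


def run_meets_criterion(stages, n_reversals_completed,
                        required_reversals=9, run_length=6, min_stages=7):
    """Check one scanning run.

    stages: list of per-stage lists of booleans (True = correct response).
    """
    if n_reversals_completed < required_reversals:
        return False
    passed = sum(longest_correct_run(stage) >= run_length for stage in stages)
    return passed >= min_stages


def session_included(runs, n_runs=3):
    """runs: list of (stages, n_reversals_completed), one tuple per run."""
    return len(runs) == n_runs and all(
        run_meets_criterion(stages, n_rev) for stages, n_rev in runs)
```

A session is retained only if every one of its three runs passes both parts of the rule.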

Dependent measures were (i) the proportion of errors due to switching after a probabilistic (misleading) error, (ii) the number of perseverative errors following a contingency reversal (excluding the first error), (iii) the number of spontaneous errors (see above for definition), (iv) the mean response latency following final reversal errors, and (v) the mean response latency following correct responses. Data were analyzed using paired sample t-tests.
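For reference, the paired-sample t statistic can be computed as in the sketch below. The function name and the example scores are hypothetical and are not data from the study; the P-value lookup is omitted because the Python standard library has no t distribution.

```python
import math
from statistics import mean, stdev


def paired_t(on_scores, off_scores):
    """Return (t, degrees of freedom) for paired ON and OFF measurements."""
    if len(on_scores) != len(off_scores):
        raise ValueError("paired samples must have equal length")
    diffs = [on - off for on, off in zip(on_scores, off_scores)]
    n = len(diffs)
    # Mean difference divided by its standard error (sample SD / sqrt(n)).
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Made-up scores for four hypothetical participants.
t_value, df = paired_t([2, 4, 6, 8], [1, 2, 3, 4])
print(round(t_value, 3), df)
```

With eight patients tested both ON and OFF, such a test has seven degrees of freedom, which matches the T7 values reported in the text.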


Table 2 shows all significant effects revealed by a second-level one-sample t-test group analysis of the critical contrast comparing the final reversal errors with the baseline correct responses (with the ON and OFF sessions collapsed into one group). Consistent with our previous study (Cools et al, 2002a), the final reversal errors induced significant signal change in the bilateral ventrolateral PFC/insulae and the rostral cingulate zone. We also observed effects in areas that were not activated in our previous study: the right orbitofrontal cortex and the dorsolateral PFC (Figure 3). This apparent discrepancy is most likely due to the use of a different MR acquisition sequence, better adjusted for picking up signal in areas susceptible to signal dropout. In addition, this may reflect the fact that the participants in the current study were older PD patients, who found the task more difficult than the participants in the previous study, who were young healthy Cambridge students.

Treatment effects were examined in the a priori defined striatal and all task-related frontal regions of interest (ROIs). Although the omnibus ROI × L-DOPA treatment × event-type three-way interaction only tended towards significance (F7,49=1.9, P=0.08), inspection of the data clearly suggests differences in the effects of L-DOPA in the different ROIs (Figure 4). Moreover, we had a priori hypothesized that L-DOPA would modulate ventral, but not dorsal fronto-striatal circuitry during reversal learning, which allowed us to perform simple interaction effect analyses. These analyses confirmed that the L-DOPA by event-type interaction was significant in the NAc (F1,7=6.9, P=0.03) (Figures 4 and 5). Conversely, there was no L-DOPA treatment by event-type interaction in the dorsal striatum (F1,7=0.0, P=0.99). Moreover, there were no significant interactions in any of the task-related frontal ROIs, including the ventrolateral PFC (F1,7=1.6, P=0.24), the dorsolateral PFC (F1,7=2.4, P=0.17), the orbitofrontal cortex (F1,7=0.35, P=0.6), the rostral cingulate zone (F1,7=2.9, P=0.13), the bilateral frontal eye fields (F1,7=2.96, P=0.13) or the inferior frontal junction (F1,7=0.84, P=0.39). Thus, L-DOPA modulated reversal-related signal change in the NAc, but not in the dorsal striatum or the PFC.

Further decomposition of the significant L-DOPA treatment by event-type interaction in the NAc into simple effects revealed a significant effect of event type, that is, greater signal change during final reversal errors than baseline correct responses when patients were OFF L-DOPA (T7=−2.6, P=0.03), but not when they were ON L-DOPA (T7=−0.15, P=0.9). Note that this interaction accounts for the finding that there was no main effect of reversal learning in this ROI (Table 2). In other words, the NAc was active or silent during reversal learning depending on the medication status of the patients.

Supplementary analyses revealed that the effect of L-DOPA in the NAc was not lateralized: a direct comparison between the right and the left NAc revealed that there was no significant interaction between ROI (left vs right NAc), L-DOPA and event-type (F1,7=0.08, P=0.8). Thus, the effect in the right NAc was not significantly different from that in the left NAc.

Furthermore, the effect was not confounded by testing order: Decomposition of the within-subjects data according to testing order revealed that the reported L-DOPA treatment effect in the NAc was equally large for those tested ON L-DOPA first and those tested OFF L-DOPA first and there was no treatment by event-type by testing order interaction (F1,6=0.01, P=0.9). Moreover, the critical L-DOPA treatment by event-type interaction remained significant after exclusion of the two patients who also took selective serotonin reuptake inhibitors (F1,5=10.2, P=0.02). Thus, the effect was not driven by SSRI treatment, but rather was due to L-DOPA treatment.

The omnibus ROI × L-DOPA × event-type ANOVA revealed that there was no significant main effect of L-DOPA (F1,7=0.09, P=0.8) nor an L-DOPA × ROI interaction (F1,7=0.6, P=0.6), indicating that there were no global, task-independent effects of L-DOPA on the BOLD signal in our ROIs. Analyses of simple main effects of L-DOPA in each ROI separately confirmed that L-DOPA did not induce any significant task-independent BOLD changes (no main effects of L-DOPA treatment in the NAc: F1,7=1.2, P=0.3; the dorsal striatum: F1,7=1.2, P=0.3; the ventrolateral PFC: F1,7=0.3, P=0.6; the dorsolateral PFC: F1,7=1.0, P=0.3; the orbitofrontal cortex: F1,7=0.7, P=0.4; the rostral cingulate zone: F1,7=1.0, P=0.3; the inferior frontal junction: F1,7=0.05, P=0.8; or the frontal eye fields: F1,7=0.4, P=0.5).

Although the L-DOPA treatment by event-type interaction effect in the NAc did not reach significance when the final reversal errors were compared with the other nonswitch errors (F1,7=2.3, P=0.18), inspection of the data in Figure 5b suggests that L-DOPA modulated signal change during the final reversal errors, but not during the other nonswitch errors. We suggest that the absence of a significant interaction may reflect the increased variability in the signal during the nonswitch errors, which most likely activated a reversal-related network on some but not all trials. The hypothesis that L-DOPA modulated signal change in the NAc only when error trials consistently activated a reversal-related network was supported by the finding that there was no significant interaction when the other nonswitch errors were compared with the baseline correct responses (F1,7=0.001, P=0.9).

Whole brain analyses did not reveal any significant main effects of L-DOPA or event-type by treatment interaction effects.

We successfully matched performance between the ON and OFF L-DOPA state as revealed by the lack of behavioral effects of L-DOPA. These data are presented in Table 3.

Table 3 Behavioral Data


Our findings demonstrate a role for the NAc in the dopaminergic modulation of reversal learning in mild PD patients. Reversal learning was accompanied by increased NAc activity when patients were OFF, but not ON their L-DOPA. L-DOPA did not affect reversal-related activity in the dorsal striatum or PFC.

These data concur with findings from studies with experimental animals showing that reversal learning is altered by (i) damage to the ventral striatum, specifically the NAc (Divac et al, 1967; Annett et al, 1989; Stern and Passingham, 1995; Schoenbaum and Setlow, 2003) and (ii) dopaminergic modulation of the NAc (Taghzouti et al, 1985; Smith et al, 1999). Neurophysiological findings have shown that NAc neurons encode a combination of outcome-predictive information and behavioral switching and may indicate that NAc neurons encode behavioral switching only when they encode outcome-predicting information (Wilson and Bowman, 2005). Those findings may reconcile our observation that the NAc is active during final reversal errors, which signal not only a behavioral switch but also an upcoming rewarding outcome, with previous findings that (i) the NAc subserves switching (Cools, 1980; Redgrave et al, 1999) and (ii) other neurophysiological and neuroimaging data showing NAc-activity during reward-anticipation (Hollerman and Schultz, 1998; Knutson et al, 2001; Carelli, 2004; Knutson et al, 2005).

The L-DOPA-induced disruption of NAc activity parallels observations from Goto and Grace (2005). These authors revealed that increases in tonic DA release in the NAc and administration of a DA (D2) receptor agonist in the NAc disrupted PFC-evoked responses in the NAc and impaired behavioral reversal learning in rats. Our data are also reminiscent of findings showing that injection of D-amphetamine in the NAc of rats potentiates behavioral control by stimuli formerly associated with reward (ie conditioned reinforcement) in a DA-dependent way (Robbins et al, 1989). The observation that this D-amphetamine-induced potentiation of control by previously rewarded stimuli was abolished by lesions of the NAc (Parkinson et al, 1999) concurs with the present conclusion that L-DOPA in humans interacts with the NAc to impair reversal learning. Thus, in the present study, L-DOPA in the NAc of PD patients may have induced aberrant potentiation of control by previously rewarded stimuli and disrupted input to the NAc signaling the need for a switch.

The observation that the L-DOPA effect in the NAc was most pronounced during the final reversal errors that directly preceded behavioral switching reinforces the hypothesis that neural activity in the NAc is switch- rather than outcome-related (Redgrave et al, 1999). This result concurs with our recent finding that the ventral striatum was selectively activated on switch, but not nonswitch trials (Cools et al, 2004). The current finding contrasts with a recent finding showing that manipulation of central serotonin levels modulated activity in the PFC during the same reversal learning paradigm in young healthy volunteers (Evers et al, 2005). That effect was not specific to the final reversal errors but extended to the nonswitch errors, not followed by behavioral adaptation (Evers et al, 2005). Those data highlight the selectivity of the present dopaminergic modulation of switch-related activity in the NAc. Together, these findings concur with suggestions that the NAc is key for the integration of (changes in) outcome values to bias behavioral adaptation (Mogenson, 1987).

The finding that L-DOPA modulates the NAc during probabilistic reversal learning in mild PD is also consistent with the ‘L-DOPA over-dose’ hypothesis (Gotham et al, 1988; Swainson et al, 2000; Cools et al, 2001, 2003). This hypothesis states that L-DOPA doses necessary to remedy the DA loss in severely depleted brain areas may detrimentally ‘over-dose’ brain areas that are relatively intact. Mild PD patients are impaired on probabilistic reversal learning when they are ON, but not when they are OFF L-DOPA (Cools et al, 2001). It was proposed that this impairment was due to detrimental ‘over-dosing’ of the ventral striatum, which is relatively spared of DA loss in early-stage PD (Farley et al, 1977; Kish et al, 1988). In keeping with this prediction, the present observation confirms that L-DOPA interacts with the NAc, not the dorsal striatum during reversal learning. One mechanism by which L-DOPA may ‘over-dose’ reversal learning was proposed by Frank et al (2004). These authors suggested that L-DOPA-induced increases in tonic DA may ‘fill in’ phasic DA dips, thought to accompany omissions of reward (Hollerman and Schultz, 1998), thereby attenuating reward-prediction error signals. In keeping with this model, we demonstrate that L-DOPA disrupted normal neural processing at the time of the critical negative feedback signal leading to behavioral adaptation. Future study is necessary to address the pharmacological mechanisms underlying the medication-induced reversal impairment. In particular, studies in patients with severe PD, accompanied by DA loss in the NAc, will reveal whether or not the L-DOPA-induced deficit in mild PD depends on the level of DA depletion in the NAc. Whereas other accounts of the medication-induced impairment do not require the NAc to be intact (Frank et al, 2004), we predict that the impairment is abolished with progression of the disease.

Our finding concurs with the observation that the effect of L-DOPA stems from its ability to elevate DA in the striatum (Hornykiewicz, 1974; Carey et al, 1995) and suggests that at least some ‘frontal-like’ deficits in PD reflect disruption of frontal input to the striatum (Goto and Grace, 2005) or, alternatively, of striatal outflow to the PFC (Owen et al, 1998; Dagher et al, 2001). The present lack of cortical modulation of reversal-related activity apparently contradicts results from previous imaging studies in PD, which have revealed cortical changes (Cools et al, 2002b; Mattay et al, 2002; Lewis et al, 2003; Monchi et al, 2004). There are three possible explanations for this discrepancy. First, the paradigms employed in previous L-DOPA withdrawal imaging studies measured working memory, which strongly implicates dorsolateral PFC (Cools et al, 2002b; Mattay et al, 2002). This concurs with overwhelming evidence for a role of PFC DA in working memory (Goldman-Rakic, 1992; Mattay et al, 2003; Meyer-Lindenberg et al, 2005). We argue that the neural site at which L-DOPA acts to modulate function is the critical determinant of its behavioral effect. L-DOPA-induced modulation of the NAc during reversal learning may well coexist with modulation of the dorsal PFC during working memory. A second reason for the discrepancy may be the performance differences observed in previous studies. In the present study, we expressly matched performance between the ON and the OFF L-DOPA sessions (by means of practice prior to scanning), with the aim of minimizing the likelihood that any activation changes reflect differential effort or recruitment of task-relevant or compensatory mechanisms. As a result of this matched performance, there were no differences between the treatment sessions in terms of the number of events included in the analysis. At least some of the cortical activation changes observed in previous studies may have reflected the fact that patients performed significantly more poorly than controls (Lewis et al, 2004; Monchi et al, 2004). Finally, it is possible that the observed null effects in the PFC (and indeed, the dorsal striatum) reflect the relatively small sample size employed in the present study. The use of a larger sample size may have revealed additional (smaller) medication effects in the PFC and dorsal striatum. Thus, we recognize that caution is warranted regarding the interpretation of the null effects, particularly given that the ROI-based fMRI method may be insensitive to subthreshold (ie nonsignificant) focal effects. Nevertheless, the present data indicate that the effect in the ventral striatum is disproportionately large relative to effects in other loci.

Despite the matched performance, there was a significant effect of medication on NAc activity. This suggests that the physiological BOLD response provided a more sensitive measure than behavioral performance in this over-trained paradigm. Our finding of normal reversal learning coupled with disrupted NAc activity in patients ON L-DOPA suggests that switching in the present paradigm did not depend on supra-threshold NAc activity. Reversal learning most likely does depend on supra-threshold NAc activity in single-reversal tasks, because tonic DA release, DA receptor activation and damage in the NAc have been shown to disrupt such one-time reversal learning (Smith et al, 1999; Schoenbaum and Setlow, 2003; Goto and Grace, 2005; Divac et al, 1967; Taghzouti et al, 1985; Annett et al, 1989; Stern and Passingham, 1995). However, the extensive practice in the present task likely enabled participants (ON as well as OFF L-DOPA) to rely on different (cortical) neural mechanisms, helping them to perform well despite accumbens abnormality. Thus, although the present experiment does not provide direct evidence for this hypothesis, we argue that the failure to activate the NAc in patients ON L-DOPA is likely to account for the previously observed impairment on the single-reversal task (Cools et al, 2001). This argument is based on the following reasoning. First, our previous study revealed an impairment in patients ON, but not OFF, medication on a similar but more sensitive task (Cools et al, 2001). Second, significant reversal-related activity in the ventral striatum was observed during the same task in our previous fMRI experiment with young controls (Cools et al, 2002a). Third, studies with experimental animals indicate that the ventral striatum is essential for reversal learning (Divac et al, 1967; Annett et al, 1989; Stern and Passingham, 1995; Schoenbaum and Setlow, 2003; Taghzouti et al, 1985; Smith et al, 1999). The effects of L-DOPA on reversal-related activity in the NAc of older healthy participants during probabilistic reversal learning are currently under investigation.

In conclusion, we have demonstrated a selective effect of L-DOPA in mild PD on reversal-related activity in the NAc, but not the dorsal striatum or the PFC. In the OFF state, patients demonstrated NAc activity during the final reversal errors that preceded behavioral adaptation. L-DOPA disrupted this NAc activity, consistent with work in experimental animals (Goto and Grace, 2005). These findings support the suggestion that DA interacts with the NAc to bias reversal learning, and they demonstrate the utility of early PD, with its differentially depleted neural systems, as a model for exploring the role of DA in normal cognitive function.