Serotonergic Modulation of Prefrontal Cortex during Negative Feedback in Probabilistic Reversal Learning

Evers, Elizabeth A T; Cools, Roshan; Clark, Luke; van der Veen, Frederik M; Jolles, Jelle; Sahakian, Barbara J; Robbins, Trevor W

doi:10.1038/sj.npp.1300663

Original Article
Published: 02 February 2005

Clinical Research

Elizabeth A T Evers¹,
Roshan Cools²,³,
Luke Clark³,
Frederik M van der Veen¹,
Jelle Jolles¹,
Barbara J Sahakian⁴ &
Trevor W Robbins³

Neuropsychopharmacology volume 30, pages 1138–1147 (2005)

Abstract

This study used functional magnetic resonance imaging to examine the effects of acute tryptophan (TRP) depletion (ATD), a well-recognized method for inducing transient cerebral serotonin depletion, on brain activity during probabilistic reversal learning. Twelve healthy male volunteers received a TRP-depleting drink or a balanced amino-acid drink (placebo) in a double-blind crossover design. At 5 h after drink ingestion, subjects were scanned while performing a probabilistic reversal learning task and while viewing a flashing checkerboard. The probabilistic reversal learning task enabled the separate examination of the effects of ATD on behavioral reversal following negative feedback and negative feedback per se that was not followed by behavioral adaptation. Consistent with previous findings, behavioral reversal was accompanied by significant signal change in the right ventrolateral prefrontal cortex (PFC) and the dorsomedial prefrontal cortex. ATD enhanced reversal-related signal change in the dorsomedial PFC, but did not modulate the ventrolateral PFC response. The ATD-induced signal change in the dorsomedial PFC during behavioral reversal learning extended to trials where subjects received negative feedback but did not change their behavior. These data suggest that ATD affects reversal learning and the processing of aversive signals by modulation of the dorsomedial PFC.

INTRODUCTION

Serotonin (5-HT) has been extensively implicated in depressed mood and the processing of motivational signals (Graeff et al, 1986; Soubrie, 1986; Deakin, 1991; Wilkinson et al, 1995). Animal research has demonstrated that 5-HT-enhancing drugs attenuate the aversive effects of brain stimulation (Patkina and Lapin, 1976; Graeff et al, 1986; Smith and Kennedy, 2003), and, conversely, potentiate self-stimulation in so-called ‘reward’ centers and enhance the motivational properties of stimuli predictive of rewards (Redgrave and Horrell, 1976; Aronson et al, 1995; Sasaki-Adams and Kelley, 2001; Orosco et al, 2004). Reduced motivation (anhedonia, apathy) is a cardinal feature of depression, where neuropsychological studies have further emphasized the relevance of incentive motivation and the processing of reinforcement; depression has been associated with a ‘catastrophic response to perceived failure’ (Beats et al, 1996) or an oversensitivity to negative feedback (Elliott et al, 1997; Steffens et al, 2001; Murphy et al, 2003). Selective 5-HT reuptake inhibitors exert antidepressant effects and acute reduction of central 5-HT function through dietary depletion of tryptophan (TRP), a precursor of 5-HT, can induce temporary depressive relapse in remitted patients (Young et al, 1985; Smith et al, 1997).

5-HT neurotransmission has been implicated not only in the processing of reward and punishment signals, but also in the inhibitory control of behavior (Soubrie, 1986; Evenden, 1999), where impulsive pathology is typically associated with reductions in central 5-HT (Coccaro et al, 1989; Cherek and Lane, 2000). Findings from studies with clinical populations are corroborated by animal studies linking impulsive choice in delay-discounting paradigms and premature responding in choice reaction-time tasks with 5-HT dysregulation (Harrison et al, 1997; Puumala and Sirvio, 1998; Koskinen et al, 2000; Mobini et al, 2000; Dalley et al, 2002; Liu et al, 2004).

These studies with animals have implicated particularly the medial and orbital prefrontal cortex (PFC) in impulsive performance (Dalley et al, 2002; Chudasama and Robbins, 2003; Liu et al, 2004), and this concurs indirectly with findings that manipulation of the 5-HT system in humans affects tasks that implicate the ventral and medial aspects of PFC (Robbins, 2000). Thus, neuropsychological studies have shown that acute TRP depletion (ATD) impairs performance on tasks of reversal learning, response inhibition, and affective decision-making (Park et al, 1994; Murphy et al, 2002; Walderhaug et al, 2002; Rogers et al, 2003), which have all been associated with ventral and/or medial PFC circuitry (Iversen and Mishkin, 1970; Jones and Mishkin, 1972; Dias et al, 1996; Rogers et al, 1999; O'Doherty et al, 2001; Cools et al, 2002; Fellows and Farah, 2003; Kringelbach and Rolls, 2003; Hornak et al, 2004).

While the ascending 5-HT projection has a widespread cortical distribution, receptor subtypes including the 5-HT₂A receptor show regional specificity to the frontal cortex and are overexpressed in medial and orbital regions in animal models of depression and anxiety (Poeggel et al, 2003; Preece et al, 2004). Structural and functional imaging studies in depressed patients also indicate reasonably selective abnormalities in the ventral and medial aspects of PFC (Drevets et al, 1997; Mayberg et al, 1999; Elliott et al, 2002; Ballmaier et al, 2004; Lacerda et al, 2004).

In the present study, we examined the effects of ATD on the blood oxygenation level-dependent (BOLD) response during probabilistic reversal learning, which requires the adaptation of behavior following changes in reward (and punishment) values as well as the maintenance of behavior in the face of misleading negative (probabilistic) feedback. ATD is a well-recognized research method for reducing central 5-HT in humans and studying the effects of low 5-HT on cognition. ATD produces a rapid decrease in the synthesis and release of brain 5-HT (Nishizawa et al, 1997; Carpenter et al, 1998; Williams et al, 1999). TRP is depleted by ingesting an amino-acid mixture that does not contain TRP but does include other large neutral amino acids (LNAA) (Young et al, 1985). Depletion is achieved because the amino-acid load increases protein synthesis in the liver, which subsequently decreases plasma TRP stores. In addition, the amino-acid load results in competition for the active transport system that the amino acids share for entry across the blood–brain barrier, resulting in reduced availability of TRP in the brain. The probabilistic reversal learning task enables the relatively separate examination of behavioral adaptation following negative feedback (aversive signals) and the processing of negative feedback without subsequent behavioral adaptation. We used the same probabilistic reversal learning task that was previously employed by Cools et al (2002). That study revealed significant BOLD changes during probabilistic reversal learning in the ventrolateral prefrontal cortex (VLPFC) and the dorsomedial PFC. Based on the strong a priori association between depression, 5-HT, reversal learning, and orbital PFC, we decided not to restrict our regions of interest (ROIs) to the task-related brain areas but to extend these to the other orbital frontal regions not activated by the task. We predicted that ATD would modulate signal change in the ventral (including orbital) and medial PFC during the reception of negative feedback and the subsequent adaptation of behavior to the new contingencies.

MATERIALS AND METHODS

Participants

Twelve healthy right-handed male volunteers (18–28 years old; mean age 23.8±2.8 years) participated in this experiment. The study was approved by the Local Research Ethical Committee in Cambridge and carried out in accordance with the Declaration of Helsinki. Participants were recruited via local advertisements, and screened for psychiatric and neurological disorders and MRI contraindications by means of prescreening questionnaires and interview by EATE. All volunteers gave written informed consent, and were paid for participation. The exclusion criteria were any history of cardiac, hepatic, renal, pulmonary, neurological or gastrointestinal disorder, medication use, and a history of major depression or bipolar affective disorder.

One participant vomited after ingesting the amino-acid mixture and was replaced by a substitute. One participant was excluded from the analysis due to poor performance on the reversal learning task (final n=11). His mean reaction time (RT) and the total number of trials on the reversal learning task were between 2.5 and 3.0 standard deviations higher than the group mean after the balanced drink.

Experimental Design

Participants attended two test sessions at least 1 week apart, and were administered either a TRP-depleted (TRP−) drink or a balanced (BAL) amino-acid drink in a double-blind crossover design (four participants received TRP− and seven received the BAL drink on the first session). Prior to a test session, volunteers fasted overnight and low-protein food was provided during the test days. Following a resting period of 5 h (4.5 h, SD=35 min, in the TRP− condition and 5.0 h, SD=40 min, in the balanced condition), to ensure stable and low TRP levels (Riedel et al, 1999), participants entered the functional magnetic resonance imaging (fMRI) scanner at the Wolfson Brain Imaging Centre (WBIC). They were scanned while performing three blocks of the probabilistic reversal learning task each for about 9 min (Cools et al, 2002) and the checkerboard task. Behavioral performance on the reversal learning task was assessed using button presses on a response box. Structural scans were obtained at the end of a test session or on a separate session.

Probabilistic Reversal Learning Task

The probabilistic reversal learning task was described in detail by Cools et al (2002). The task was a two-choice visual discrimination task in which the same two abstract patterns were presented on each trial. Using trial-and-error feedback after each response (a green happy face or a red sad face), subjects learned to select the stimulus that was usually correct. This rule intermittently reversed so that the other stimulus was usually correct. Consequently, responding had to be adjusted in order to gain reward and avoid punishment. On a minority of trials (10–20%), false-negative feedback was given for a correct response; these were the so-called ‘probabilistic errors’ (0–4 per reversal). Reversal of the stimulus-reward contingency occurred after 10–15 correct responses (including probabilistic errors). Participants performed three successive 9-min blocks of the task, each comprising 140–160 trials (block length was determined by the number of errors made). Stimuli were presented for a 2000 ms response window (RTs >2000 ms were followed by a ‘too late’ message). Feedback was presented immediately after the response for 500 ms. After feedback, the stimuli were replaced by a fixation cross for a variable duration so that the overall interstimulus interval was 3215 ms, enabling precise desynchronization from the repetition time (TR) of 1600 ms.
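
The contingency rules described above can be illustrated with a minimal Python sketch. It is not the task software used in the study. The class name `ReversalTask`, the helper `run_ideal_observer`, the fixed 15% rate of misleading feedback, and the uniform draw of the 10–15 criterion are all assumptions made for illustration; the constraint of 0–4 probabilistic errors per reversal is not enforced.

```python
import random


class ReversalTask:
    """Toy model of the two-choice probabilistic reversal contingencies."""

    def __init__(self, p_false_negative=0.15, seed=0):
        self.rng = random.Random(seed)
        self.p_false_negative = p_false_negative  # assumed value within 10-20%
        self.correct = 0          # index (0 or 1) of the usually-correct pattern
        self.correct_count = 0    # correct responses since the last reversal
        self.criterion = self.rng.randint(10, 15)
        self.reversals = 0

    def respond(self, choice):
        """Return 'positive' or 'negative' feedback for a choice (0 or 1)."""
        if choice != self.correct:
            return "negative"
        # A correct response counts toward the reversal criterion even when
        # the feedback shown is misleading (a probabilistic error).
        self.correct_count += 1
        misleading = self.rng.random() < self.p_false_negative
        if self.correct_count >= self.criterion:
            self.correct = 1 - self.correct
            self.correct_count = 0
            self.criterion = self.rng.randint(10, 15)
            self.reversals += 1
        return "negative" if misleading else "positive"


def run_ideal_observer(n_trials=150, seed=0):
    """Simulate an observer who always picks the currently correct pattern."""
    task = ReversalTask(seed=seed)
    feedback = [task.respond(task.correct) for _ in range(n_trials)]
    return task.reversals, feedback
```

An observer who never errs would pass through between 10 and 15 reversals in 150 trials under these assumptions; real participants, who must detect each reversal from feedback, complete fewer.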

Four types of events were modeled: (i) a correct response followed by positive feedback, (ii) a correct response followed by negative feedback (probabilistic error), (iii) an incorrect response where the subject reversed on the subsequent trial (reversal switch error), and (iv) an incorrect response where the subject did not reverse (ie perseverated) on the subsequent trial (preceding error). Spontaneous discrimination errors (those which could not be categorized as reversal or probabilistic errors) were not included in the model.
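
The four modeled event types can be written as a small decision rule. The sketch below is illustrative only: the function name, the argument names, and in particular the `post_reversal` flag (marking an error made after a contingency change and before the subject has switched) are hypothetical conveniences, not variables taken from the original analysis.

```python
def classify_event(chose_correct, feedback_positive, switched_next, post_reversal):
    """Map one trial onto the event types (i)-(iv), or None if not modeled.

    chose_correct     -- subject chose the currently correct stimulus
    feedback_positive -- feedback shown was positive
    switched_next     -- subject chose the other stimulus on the next trial
    post_reversal     -- hypothetical flag: error followed a contingency reversal
    """
    if chose_correct:
        # (i) correct response, or (ii) correct response with misleading feedback
        return "correct" if feedback_positive else "probabilistic_error"
    if not post_reversal:
        # Spontaneous discrimination error: excluded from the model
        return None
    # (iii) error followed by a switch, or (iv) error followed by perseveration
    return "reversal_switch_error" if switched_next else "preceding_error"
```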

Checkerboard Task

The checkerboard task was a passive visual task where the subject viewed two configurations of black and white squares in an 8 × 8 matrix that switched at a frequency of 8 Hz. Using a blocked ABAB design, 20 s checkerboard blocks alternated with 20 s crosshair fixation for six cycles, taking a total of 4 min.
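
As a check on the timing, six cycles of 20 s stimulation plus 20 s fixation give 240 s, ie 4 min. The following sketch builds the corresponding box-car vector at one value per volume. It assumes that the TR of 1600 ms also applied to the checkerboard run and that each cycle began with the checkerboard block; neither is stated explicitly in the text.

```python
def boxcar_regressor(tr_ms=1600, block_ms=20000, cycles=6):
    """Return 1 for volumes in checkerboard blocks and 0 for fixation.

    Times are kept in integer milliseconds to avoid floating-point edge
    cases at block boundaries. Assumption: checkerboard comes first.
    """
    n_vols = (2 * block_ms * cycles) // tr_ms
    return [1 if ((i * tr_ms) // block_ms) % 2 == 0 else 0 for i in range(n_vols)]


regressor = boxcar_regressor()
total_seconds = len(regressor) * 1600 / 1000  # run length implied by the design
```

Because 20 s is not a whole multiple of the TR, stimulation blocks cover 13 volumes and fixation blocks 12 volumes in this sketch.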

Amino-Acid Mixture

The TRP-deficient amino-acid drink (TRP−) contained a total of 75 g of amino acids using the proportions described by Young et al (1985): 4.1 g L-alanine, 2.4 g glycine, 2.4 g L-histidine, 6.0 g L-isoleucine, 10.1 g L-leucine, 6.7 g L-lysine, 4.3 g L-phenylalanine, 9.2 g L-proline, 5.2 g L-serine, 4.3 g L-threonine, 5.2 g L-tyrosine, 6.7 g L-valine, 3.7 g L-arginine, 2.0 g L-cysteine, and 3.0 g L-methionine (SHS International Ltd, Liverpool, UK). The balanced mixture contained the same amino acids, plus 3.0 g TRP. The drinks were prepared with 200 ml tap water and fruit flavoring to compensate for the unpleasant taste.
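
A quick arithmetic check of the composition, using only the masses listed above: the 15 amino acids sum to 75.3 g, consistent with the stated total of 75 g, and the balanced mixture therefore contains 78.3 g. The variable and function names are illustrative.

```python
# Amino-acid masses in grams, copied from the composition listed in the text.
TRP_MINUS_G = {
    "L-alanine": 4.1, "glycine": 2.4, "L-histidine": 2.4, "L-isoleucine": 6.0,
    "L-leucine": 10.1, "L-lysine": 6.7, "L-phenylalanine": 4.3, "L-proline": 9.2,
    "L-serine": 5.2, "L-threonine": 4.3, "L-tyrosine": 5.2, "L-valine": 6.7,
    "L-arginine": 3.7, "L-cysteine": 2.0, "L-methionine": 3.0,
}
TRP_ADDED_G = 3.0  # tryptophan added to the balanced mixture


def total_mass(mixture):
    """Sum the masses and round to 0.1 g to suppress floating-point noise."""
    return round(sum(mixture.values()), 1)


trp_minus_total = total_mass(TRP_MINUS_G)
balanced_total = round(trp_minus_total + TRP_ADDED_G, 1)
```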

Biochemical Measures

Blood samples (10 ml) were taken prior to ingestion of the amino-acid mixture and after the fMRI scan (about 6.5 h later), to determine the plasma TRP level and the TRP/ΣLNAA ratio. This ratio is important because the uptake of TRP in the brain is strongly associated with the amounts of other LNAA competing at the blood–brain barrier. Venous samples were taken in lithium heparin tubes, centrifuged, and stored at −20°C. Plasma TRP was determined by an isocratic high-performance liquid chromatography (HPLC) method of analysis. Plasma proteins were removed by precipitation with 3% trichloroacetic acid (TCA) and centrifugation at 3000 rpm and 4°C for 10 min, and then pipetted into heparin aliquots. An aliquot was then diluted in mobile phase before injection into the HPLC analysis column. Fluorescence end-point detection was used to identify TRP.
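
The TRP/ΣLNAA ratio is simply the plasma TRP concentration divided by the summed concentrations of the competing LNAA. The sketch below shows the calculation. The set of competing amino acids is an assumption (a commonly used set), since the text does not list which LNAA entered the sum, and the function name is illustrative.

```python
# Assumption: the competing LNAA set; the source does not specify it.
ASSUMED_LNAA = ("tyrosine", "phenylalanine", "leucine", "isoleucine", "valine")


def trp_lnaa_ratio(plasma, lnaa=ASSUMED_LNAA):
    """Return TRP / sum(LNAA) for a dict of plasma concentrations.

    All concentrations must be in the same units (eg micromol/l).
    """
    competitors = sum(plasma[name] for name in lnaa)
    if competitors <= 0:
        raise ValueError("summed LNAA concentration must be positive")
    return plasma["tryptophan"] / competitors
```

For example, a hypothetical sample with TRP at 50 units and each of the five competitors at 100 units gives a ratio of 0.1. The ratio falls both when TRP falls and when the competing amino acids rise, which is why it is reported alongside the plasma TRP level.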

Paired-sample t-tests were used to compare the two baseline measurements of plasma TRP levels and TRP/ΣLNAA ratios, and to compare measurements of plasma TRP levels and the TRP/ΣLNAA ratio in the balanced and TRP− condition. A repeated-measures ANOVA was performed to look at the effect of ATD on plasma TRP levels and the TRP/ΣLNAA ratios.

Psychological Ratings

Visual Analogue Scales (VAS) containing the items drowsy, sad, happy, anxious and nauseous were administered five times during the test day (at roughly 90 min intervals). The Positive and Negative Affect Scale (PANAS; Watson et al, 1988) was completed prior to ingestion and after the scan. Ratings were analyzed using repeated-measures ANOVA with drink treatment (TRP− and balanced) and time (five time points for the VAS, and two for the PANAS) as within-subject factors. Greenhouse–Geisser corrections were applied when the sphericity assumption was violated.

Behavioral Data Analysis

Dependent measures were the number of reversal contingencies during the task as a whole, the number of errors due to switching after a probabilistic error, mean RT, and a maintenance score that was calculated by dividing the number of errors made following five correct responses but prior to the next contingency reversal by the number of trials remaining prior to the next contingency reversal (adapted from Swainson et al, 2000) for each reversal block. Data were analyzed using repeated-measures ANOVA with block (1–3) and treatment (balanced and TRP−) as within-subjects factors and order of drug treatment (balanced first or TRP− first) as between-subjects factor. Greenhouse–Geisser corrections were applied when the sphericity assumption was violated. Simple effects of block and treatment were analyzed using post hoc tests with Bonferroni correction for multiple comparisons. Medians were used for analysis because RTs were not normally distributed. Measures that were not normally distributed were analyzed with the nonparametric Wilcoxon signed ranks test.
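
The maintenance score can be made concrete with a short sketch for a single reversal stage. It assumes that ‘five correct responses’ means five consecutive correct responses and that the stage is supplied as a list of booleans running from one contingency change to the next; both are interpretations, and the function name is illustrative.

```python
def maintenance_score(correct, criterion=5):
    """Proportion of errors after the learning criterion within one stage.

    correct   -- list of booleans, one per trial, from one contingency
                 change up to (not including) the next
    criterion -- assumed: number of consecutive correct responses required
    Returns None if the criterion is never reached or no trials remain.
    """
    run = 0
    for i, ok in enumerate(correct):
        run = run + 1 if ok else 0
        if run == criterion:
            remaining = correct[i + 1:]
            if not remaining:
                return None
            errors = sum(1 for c in remaining if not c)
            return errors / len(remaining)
    return None
```

For a stage with one initial error, five correct responses, and then one error among the four remaining trials, the score is 1/4 = 0.25. Lower scores indicate better maintenance of the learned rule.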

Image Acquisition

Participants were scanned in a 3 T Bruker Medspec scanner (S300; Bruker, Ettlingen, Germany), at the WBIC. T2*-weighted gradient echo-planar images (EPI) (TE 27 ms) were acquired with blood oxygenation level-dependent (BOLD) contrast. A whole-brain acquisition consisted of 21 slices (TR 1.6 s; voxel size before normalization 1.56 × 1.56 × 5 mm³ and after normalization 3 × 3 × 3 mm³; inter-slice gap 1 mm; matrix size 128 × 128; bandwidth 100 kHz; oblique orientation) and the total number of volumes acquired varied from run to run (from 142 to 166) depending on the participant's performance. In addition, high-resolution T1-weighted images for spatial normalization were acquired of each participant (voxel size 1 × 1 × 1 mm³). We were unable to acquire reliable data from a section of ventromedial PFC because of susceptibility artifacts.

Image Analysis

Data analysis was performed using SPM99 and SPM2 (Statistical Parametric Mapping; Wellcome Department of Cognitive Neurology, London, UK). Preprocessing procedures consisted of (linear) slice acquisition time correction, within-subject realignment (SPM2), geometric undistortion using fieldmaps (Cusack et al, 2003), spatial normalization using each individual subject's skull-stripped SPGR (using the Brain Extraction Tool; Smith, 2002), and the Montreal Neurological Institute (MNI) skull-stripped structural template (SPM2) and spatial smoothing using a Gaussian kernel (10 mm full-width at half-maximum).

A canonical hemodynamic response was used as a covariate in a general linear model and a parameter estimate was generated for each voxel for each event type. For each event, the hemodynamic response function was modeled to the onset of the response, which co-occurred with the presentation of the feedback for the reversal learning task.
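
To illustrate how an event-related regressor of this kind is formed, the sketch below samples a double-gamma response function at each volume for a list of event onsets. This is not SPM code. The shape parameters (response shape 6, undershoot shape 16, undershoot ratio 1/6) are the commonly used defaults and are assumed here rather than taken from the paper; the function names are illustrative.

```python
import math


def canonical_hrf(t):
    """Double-gamma response at t seconds after an event (assumed shape)."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)          # gamma pdf, shape 6
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)  # gamma pdf, shape 16
    return peak - undershoot / 6.0


def event_regressor(onsets_s, n_vols, tr=1.6):
    """Sum the response to every event, sampled once per volume."""
    return [
        sum(canonical_hrf(i * tr - onset) for onset in onsets_s)
        for i in range(n_vols)
    ]
```

With these parameters the response peaks roughly 5 s after the event and dips below baseline at around 15 s. One such regressor per event type would enter the general linear model, and the fitted weight for each regressor is the parameter estimate for that event type.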

For each subject, the following contrasts were computed: (i) Main task effect 1: Reversal switch errors vs baseline correct responses for the balanced condition only. (ii) Main task effect 2: Reversal nonswitch errors (which included probabilistic and preceding errors which were not followed by the subject switching responding) vs baseline correct responses for the balanced condition only. (iii) Main task effect 3: Reversal switch errors vs the other nonswitch errors for the balanced condition only. (iv) Treatment × task interaction 1, reflecting the effect of ATD on task effect 1. (v) Treatment × task interaction 2, reflecting the effect of ATD on task effect 2. (vi) Treatment × task interaction 3, reflecting the effect of ATD on task effect 3. Thus, treatment was modeled as a within-subject variable within each individual's general linear model. For the checkerboard task, an epoch (box-car) design contrasted checkerboard visual stimulation with crosshair fixation. Individual contrast images were taken to a second level analysis in which t-values were calculated for each voxel, treating inter-subject variability as a random effect.

The MarsBar tool (Brett et al, 2002) was used to average signal within independently defined ROIs at the group level. ROIs for the reversal task analysis were defined from the activation peaks found by Cools et al (2002); 10 mm spheres (corresponding to the smoothing filter) were built around the dorsomedial PFC (x, y, z=8, 32, 52), right VLPFC (x, y, z=38, 24, −2) and left VLPFC (x, y, z=−32, 24, −4). The random effects model was then reapplied to the average signal within these ROIs to test the statistical significance of the contrasts of interest (a one-sample t-test). Average signal change was extracted from each ROI and these are the values reported in Figure 2. In addition, we also performed whole-brain analyses. Both ROI and whole-brain analyses were thresholded at P<0.05 (corrected for multiple comparisons). Given the a priori prediction concerning the modulation of the orbitofrontal cortex by ATD during reversal learning, we also examined the inferior, medial, and superior orbitofrontal cortex using ROIs from the Automated Anatomical Labelling (AAL) map based on the MNI average brain (Tzourio-Mazoyer et al, 2002), also thresholded at P<0.05. Finally, the main effect of ATD was assessed by contrasting all task-related regressors from the TRP− condition with all task-related regressors from the balanced condition.
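
The idea of a spherical ROI around a reported peak can be sketched as follows. This does not reproduce the MarsBar implementation. It assumes that 10 mm denotes the sphere radius and that the 3 mm voxel grid is aligned so that one voxel centre lies exactly on the peak; the function names and the `signal_by_voxel` mapping are hypothetical.

```python
import math

# Peak coordinates (mm) reported in the text for the three task ROIs.
ROI_CENTERS = {
    "dorsomedial_PFC": (8, 32, 52),
    "right_VLPFC": (38, 24, -2),
    "left_VLPFC": (-32, 24, -4),
}


def sphere_voxels(center, radius_mm=10.0, voxel_mm=3.0):
    """Return voxel-centre coordinates (mm) within radius_mm of center."""
    reach = int(radius_mm // voxel_mm)
    voxels = []
    for i in range(-reach, reach + 1):
        for j in range(-reach, reach + 1):
            for k in range(-reach, reach + 1):
                offset = (i * voxel_mm, j * voxel_mm, k * voxel_mm)
                if math.dist(offset, (0.0, 0.0, 0.0)) <= radius_mm:
                    voxels.append(tuple(c + o for c, o in zip(center, offset)))
    return voxels


def roi_mean(signal_by_voxel, voxels):
    """Average a per-voxel value over the ROI; None if no voxel has data."""
    values = [signal_by_voxel[v] for v in voxels if v in signal_by_voxel]
    return sum(values) / len(values) if values else None
```

Averaging within the ROI before testing yields a single value per subject and contrast, so one one-sample t-test is carried out per ROI rather than one per voxel.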

RESULTS

Functional Imaging Data

All significant task-related effects from the balanced condition are shown in Table 1. ATD significantly increased the BOLD response in the dorsomedial PFC during reversal switch errors relative to correct baseline responses. This effect reached significance in both the ROI (contrast iv; T₁₀=2.04; P=0.03; Figure 1) and whole-brain analyses (Talairach coordinates x, y, z=9, 39, 48; T₁₀=11.95; corrected P=0.006). However, the effect of ATD did not reach significance when the reversal switch errors were compared with the other nonswitch errors (contrast vi; T₁₀=1.33; P=0.1). Furthermore, the increase in signal in the dorsomedial PFC tended towards significance when the reversal nonswitch errors were compared with baseline correct responses (contrast v; T₁₀=1.44; P=0.09) (Figure 2). Thus, ATD increased signal changes during all negative feedback, irrespective of whether the errors were followed by behavioral reversal. ATD did not affect the BOLD response in the left and right VLPFC (T₁₀=−0.13, P=0.55 and T₁₀=−0.21, P=0.6, respectively). Furthermore, ATD did not significantly affect global activation changes during the task, as revealed by both whole-brain and ROI analyses (ROI analyses: left VLPFC: P=0.33; right VLPFC: P=0.32; medial PFC: P=0.23).

Table 1 Significant Task Effects Revealed by Regions of Interest Analyses (From the Balanced Condition Only)

While an ATD-induced increase in the orbitofrontal cortex during the reversal switch errors compared with correct responses did not reach significance in the ROI or whole-brain analysis according to our criterion (AAL's left middle orbital gyrus: T₁₀=1.12; P=0.14), for completeness we report that whole-brain analysis revealed a nonsignificant effect at x, y, z=−48, 42, −3 (T₁₀=5.72; Z=3.7).

Supplementary analysis revealed that our findings are not confounded by the fact that four participants started with the TRP− condition and seven participants started with the balanced condition. This analysis of the individual parameter estimates, extracted from the dorsomedial PFC ROI (reversal switch errors minus baseline correct responses), revealed that the effect of ATD was not qualified by testing order (ATD × testing order interaction: F₁,₉=0.5, P=0.5). No differences in signal change were observed between subjects who ingested the TRP− drink on the first occasion (mean signal change=0.16) and subjects who ingested the TRP− drink on the second occasion (mean signal change=0.19), and no differences were observed between subjects who ingested the balanced mixture on the first occasion (mean signal change=0.11) and subjects who ingested the balanced mixture on the second occasion (mean signal change=0.12).

Whole-brain analyses did not reveal any other significant effects.

Behavioral Effects of ATD on Reversal Learning

There was a nonsignificant tendency for ATD to slow overall RT (F₁,₉=4.72, P=0.06; mean RT, TRP−=554 ms, BAL=522 ms). No other differences were found between the TRP− and balanced condition. Mean values are presented in Table 2.

Table 2 Behavioral Effects of ATD
