INTRODUCTION

Historically, the neurotransmitter serotonin (5-HT) has been associated with both aversive processing (Deakin, 1983) and behavioral inhibition (Soubrie, 1986), but scientists are only beginning to understand the mechanisms through which 5-HT modulates a vast range of normal and abnormal behaviors. Updated theoretical accounts of 5-HT function in motivation have adroitly pointed out that aversive processing and behavioral inhibition, though orthogonal in theory, are usually intertwined in practice (Dayan and Huys, 2008, 2009; Boureau and Dayan, 2011; Cools et al, 2011). Specifically, the inhibition of ongoing behavior is a reflexive and adaptive consequence of aversive predictions. This makes it difficult to disentangle effects of 5-HT manipulations on aversive processing and behavioral inhibition, as the two are almost always correlated in experiments. Understanding the effects of 5-HT on aversive processing is important because 5-HT is hypothesized to have a role in a range of psychiatric disorders, including depression, anxiety, obsessive-compulsive disorder, and aggression (Dayan and Huys, 2009).

Recently, we addressed this issue in an experiment that separately measured aversive processing, behavioral inhibition, and their interaction. We found that temporarily lowering 5-HT in healthy volunteers abolished punishment-related slowing of responding (‘punishment-induced inhibition’), without affecting overall motor response inhibition or general sensitivity to aversive outcomes (Crockett et al, 2009). Thus, 5-HT's role in motivation appears to operate at the interface of aversion and inhibition, reducing response vigor in the face of aversive predictions (Dayan and Huys, 2008, 2009; Boureau and Dayan, 2011; Cools et al, 2011). However, the specific cognitive mechanisms through which 5-HT modulates such withdrawal behavior remain poorly understood. As long as aversive outcomes are contingent on responses, punishment-induced inhibition reflects at least two concurrent processes: an instrumental process that inhibits behavior by virtue of the link between responses and the aversive outcomes they produce; and a Pavlovian process that reflexively suppresses behavior as a direct consequence of aversive predictions (Rescorla and Solomon, 1967; Bolles et al, 1980; Church et al, 1970). Although many studies have demonstrated a link between 5-HT and punishment-induced inhibition (Thiébot et al, 1982, 1983; Tye et al, 1977, 1979; Wise et al, 1972; Graeff and Schoenfeld, 1970; Crockett et al, 2009), no study has investigated whether this relationship depends on Pavlovian (stimulus-outcome) or instrumental (stimulus-response-outcome) aversive predictions. Here, we examine to what extent 5-HT modulates instrumental vs Pavlovian processes in punishment-induced inhibition. This question is particularly important in light of recent computational approaches to affective decision making emphasizing a distinction between instrumental and Pavlovian control of learning and choice (Dayan et al, 2006; Dayan, 2008).

In the current experiment, we used acute tryptophan depletion (ATD; Young et al, 1985) to temporarily lower 5-HT levels in healthy human volunteers, and tested the effects on behavior in a novel task designed to separate the effects of Pavlovian and instrumental aversive predictions on punishment-induced inhibition. Specifically, on every trial subjects had to categorize a stimulus as one of two types. We compared reaction times (RTs) to both stimuli in a reward-only (RO) condition, in which both stimuli were rewarded if correctly categorized, with RTs in a reward+punishment (RP) condition, in which both stimuli were rewarded if correctly categorized but only one of the stimuli was punished if incorrectly categorized (Figure 1). Thus, in the RP condition only one of the stimuli (the ‘punished stimulus’) was associated with punishment, and only one of the responses (the ‘punished button’) led to punishment. Critically, this design allowed us to disentangle the effects of manipulating 5-HT on instrumental and Pavlovian aversive predictions. Specifically, in the RP condition, Pavlovian (stimulus-outcome) aversive predictions would be expected to slow responses in the presence of the punished stimulus regardless of response. Meanwhile, instrumental (stimulus-response-outcome) aversive predictions would be expected to slow responses in the presence of the punished stimulus specifically on the punished button.

Figure 1

Task design. In the reinforced categorization task (RCAT), participants viewed checkerboard stimuli and indicated whether blue or yellow was in the majority by pressing the appropriate key as quickly as possible. One stimulus category was assigned to be the ‘punished stimulus’ (blue for half the participants and yellow for the other half), and the other was assigned to be the ‘non-punished stimulus’. In the reward-only (RO) block, both stimuli were rewarded with 10 points if correctly categorized, and received no points if incorrectly categorized. In the reward+punishment (RP) block, both stimuli were rewarded with 10 points if correctly categorized; the punished stimulus was punished with a loss of 10 points if incorrectly categorized, and the non-punished stimulus received no points if incorrectly categorized. Note that the nomenclature (eg, ‘punished stimulus’ and ‘punished button’) refers to the stimulus and response types regardless of block, even though the ‘punished stimulus’ and ‘punished button’ receive punishments only in the RP block.

Relative to the RO condition, we expected that response latencies would be slower in the RP condition, reflecting punishment-induced inhibition, and that this effect would be abolished by ATD, as in our previous experiment. Further, we hypothesized that if 5-HT modulates instrumental aversive predictions, the effects of ATD would be restricted to responses on the punished button in the presence of the punished stimulus. In contrast, if 5-HT modulates Pavlovian aversive predictions, then we would expect ATD to abolish slowing of all responses in the presence of the punished stimulus.

METHODS

Participants

Thirty healthy volunteers (13 males, mean age=25.1±3.2 years) participated. Exclusion criteria included history of cardiac, hepatic, renal, pulmonary, neurological, psychiatric or gastrointestinal disorders, medication/drug use, and personal or family history of major depression or bipolar affective disorder. Participants gave written informed consent before participating and were financially compensated. Two participants were excluded due to technical errors during data collection. Because the rewards and punishments used in this study consisted of monetary wins and losses, four additional participants were excluded for indicating at debriefing they did not believe they would be paid for their performance. Therefore, the final analysis was carried out in 24 participants.

General Procedure

The protocol was approved by the Cambridgeshire Research Ethics Committee (09/H0308/051). Participants attended two sessions, spaced at least 1 week apart, and were randomized to receive either ATD (N=14) or placebo (N=10) on the first session. The ATD procedure was carried out according to an established protocol (Crockett et al, 2009).

Upon arrival at the Wellcome Trust Clinical Research Facility (between 0830 and 1000 h), participants completed a baseline mood questionnaire, gave a blood sample, and ingested either the placebo or ATD drink (75 g). After 5.5 h, participants completed a number of cognitive tests, including the Reinforced Categorization Task (RCAT; described below). Participants completed the RCAT after completing an ultimatum game task, a reversal learning task, and a covert facial emotion recognition task (all completed in the fMRI scanner) and a third-party punishment task. Following the RCAT, participants completed an emotion regulation task and a delay-discounting task. Task order was consistent across treatments and subjects. Mood was assessed using the Positive and Negative Affect Scale (PANAS; Watson et al, 1988).

Reinforced Categorization Task

In the RCAT, subjects were instructed to categorize stimuli as quickly as possible to win points exchangeable for money. The RCAT was adapted from the Reinforced Go/No-go task used in our previous study (Crockett et al, 2009). As in the original Go/No-go task, the stimuli were checkerboards composed of blue and yellow squares (see Figure 1). These stimuli were designed to introduce a tradeoff between speed and accuracy, and to be able to vary task difficulty. Stimuli could be easy (blue/yellow ratios of 16 : 9 and 9 : 16) or difficult (blue/yellow ratios of 12 : 13 and 13 : 12). All task conditions contained 50% ‘yellow’ trials and 50% ‘blue’ trials, distributed evenly across difficulty level.

In our previous study, participants were assigned a ‘target color’ (blue or yellow), and instructed to respond via button-press (‘Go’) if the target color was in the majority on the checkerboard. In the current study, participants were instructed to press one button (eg, ‘right’) if yellow was in the majority, and another button (eg, ‘left’) if blue was in the majority. Thus, the RCAT is a ‘Go/Go’ task rather than a ‘Go/No-go’ task. We adapted the task from our previous study (Crockett et al, 2009) to have two separate responses, only one of which would eventually be punished.

In some of the task conditions, participants received feedback for their responses. Sometimes correct responses were rewarded with 10 points, a flourishing tone, and a happy face. Sometimes incorrect responses were punished with a loss of 10 points, a long buzzing tone, and an angry face. Throughout the task, feedback was presented for 750 ms. Participants were instructed that points would be exchangeable for money at the end of the experiment. Faces were taken from the NimStim set of facial expressions (Tottenham et al, 2009).

The task consisted of several phases. First, participants completed 48 practice trials without feedback to minimize learning and practice effects in the main task. Stimuli were presented for 2000 ms, with an inter-trial interval of 1500 ms. The mean RT for correct responses was extracted from the practice session and set as the stimulus duration for the main task, to match task difficulty across participants and sessions.

The main task began with a neutral block of 36 trials to obtain a baseline RT. Next, participants completed two key experimental blocks, each with 48 trials. In the RO block, participants were rewarded for correct responses. Incorrect responses received no feedback. In the RP block, participants were also rewarded for correct responses. However, in the RP block one of the stimuli (‘blue’ for half the participants, ‘yellow’ for the other half—henceforth the punished stimulus) was punished if incorrectly categorized, while the other stimulus (henceforth the non-punished stimulus) received no feedback if incorrectly categorized. The experimental blocks were separated by neutral blocks of 36 trials without feedback to allow response biases to return to baseline. The RP block took place after the RO block for all participants (see Supplementary Results for an analysis of potential order effects). At the start of each experimental block, participants were explicitly instructed about the response-outcome contingencies in the upcoming trials, and completed four guided practice trials to observe the consequences of correct and incorrect responses. For a summary of response-outcome contingencies in the experimental blocks, see Figure 1.

Data Analysis

Raw data (response accuracy and RTs) are available in Supplementary Tables S1 and S2. For the response data, we computed measures from signal detection theory (Swets et al, 1961), including sensitivity (d′) and response bias (ln(β)). Formulae for calculating d′ and ln(β) are widely available (Stanislaw and Todorov, 1999). Sensitivity is a measure of discrimination accuracy (the ability to correctly categorize stimuli), and is independent from response bias, which measures subjects’ tendency to favor one response over the other. Accurate calculation of discrimination and response bias measures requires that the raw proportions of false alarms and omission errors be non-zero. Because performance on easy trials was nearly perfect, we were unable to calculate discrimination and response bias measures for easy trials, so we restricted the analysis of d′ and ln(β) to difficult trials only. For completeness, we also repeated the analysis of response bias on all trials (easy and difficult combined). We predicted that the response bias data would reflect a shift in preference away from the punished response in the RP block. One explanation for such a shift is the input of instrumental aversive predictions (stimulus-response-outcome); however, Pavlovian aversive predictions can also influence instrumental actions via Pavlovian-instrumental transfer (PIT) (Huys et al, 2011; Overmier et al, 1971). Thus, any observed effects of ATD on response bias could reflect Pavlovian or instrumental processes.
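As a minimal sketch of the signal detection measures described above (not the analysis code used in this study), the standard formulae can be written in a few lines of Python. Here z denotes the inverse of the standard normal cumulative distribution function, d′ = z(H) − z(F), and ln(β) = (z(F)² − z(H)²)/2. The function name `sdt_measures` is hypothetical, and treating a press of the punished button as the ‘yes’ response (so that positive ln(β) indicates a bias away from that button) is an assumption made for illustration.

```python
from statistics import NormalDist


def sdt_measures(hit_rate: float, false_alarm_rate: float) -> tuple[float, float]:
    """Return (d_prime, ln_beta) for one participant and condition.

    Assumption for illustration: a 'hit' is a press of the punished button
    when that press is correct, and a 'false alarm' is a press of the
    punished button when it is incorrect.
    """
    # Rates of exactly 0 or 1 map to infinite z-scores, which is why
    # near-perfect performance cannot be analyzed with these measures.
    if not (0.0 < hit_rate < 1.0 and 0.0 < false_alarm_rate < 1.0):
        raise ValueError("hit and false-alarm rates must lie strictly between 0 and 1")

    z = NormalDist().inv_cdf  # inverse standard normal CDF
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)

    d_prime = z_hit - z_fa                 # sensitivity
    ln_beta = (z_fa ** 2 - z_hit ** 2) / 2  # response bias (natural log of beta)
    return d_prime, ln_beta


# Hypothetical rates: fewer presses of the 'yes' button overall
# give a positive (conservative) bias.
d_prime, ln_beta = sdt_measures(hit_rate=0.7, false_alarm_rate=0.1)
```

With symmetric rates (for example, hits of 0.8 and false alarms of 0.2) the sketch returns ln(β)=0, that is, no bias toward either response.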

As in our previous study, we assessed punishment-induced inhibition by examining RTs for correct responses in the RP block relative to the RO and neutral blocks. This approach has been used in other studies of punishment-induced behavioral inhibition (Newman et al, 1997; Avila, 2001) and follows from the observation that the automatic response to aversive outcomes and their prediction is to freeze or depress responding (LeDoux, 1996; Gray and McNaughton, 2000). We reasoned that punishment-induced inhibition would result in slower responding in the RP block, relative to the RO and neutral blocks, as has been observed in previous studies (Newman et al, 1997; Avila, 2001; Crockett et al, 2009). RTs in the experimental conditions (RO and RP) were converted to z-scores by normalizing against matched RTs in the neutral condition as follows: first, we calculated means and standard deviations of RTs for correct responses in the neutral condition, separately for easy and difficult blue and yellow stimuli. Next, we calculated means of RTs for correct responses in the RO and RP conditions, again separately for easy and difficult blue and yellow stimuli. Finally, we computed z-scores for the RTs in the experimental conditions (RO and RP) by normalizing against the RTs in the neutral condition, again separately for easy and difficult blue and yellow stimuli. So for example, the normalized RT for RP/difficult/blue was computed by subtracting the mean RT for neutral/difficult/blue from the mean RT for RP/difficult/blue, and dividing by the standard deviation for neutral/difficult/blue. We employed this normalization procedure because we were primarily interested in how rewards and punishments influenced response vigor, relative to a neutral baseline.
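The normalization steps above can be sketched as follows. This is an illustrative outline under stated assumptions rather than the code used for the analysis: the function name `normalize_cells`, the cell labels, and the RT values are hypothetical, and the use of the sample standard deviation (n−1 denominator) is an assumption, since the type of standard deviation is not specified.

```python
from statistics import mean, stdev


def normalize_cells(neutral: dict, experimental: dict) -> dict:
    """Normalize mean RTs in an experimental block against the neutral block.

    Both arguments map a (difficulty, color) cell to a list of RTs (ms) for
    correct responses. For each cell:
        z = (mean experimental RT - mean neutral RT) / SD of neutral RTs
    """
    return {
        cell: (mean(rts) - mean(neutral[cell])) / stdev(neutral[cell])
        for cell, rts in experimental.items()
    }


# Hypothetical RTs for a single cell, for illustration only.
neutral_rts = {("difficult", "blue"): [400, 420, 440, 460, 480]}
rp_rts = {("difficult", "blue"): [450, 470, 490]}

z_scores = normalize_cells(neutral_rts, rp_rts)
# A positive value means slower responding than in the neutral baseline.
```

In this sketch a negative normalized RT indicates faster responding than at baseline, and a positive value indicates slower responding.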

We were also interested in whether the effects of punishment, and their modulation by ATD, were present at the start of the block or emerged only with learning. To assess these potential learning effects, we sorted the response bias and RT data within each block into ‘early’ (first 24 trials) and ‘late’ (second 24 trials) bins.

The transformed raw data (d′, ln(β), and normalized RTs) were analyzed using repeated-measures ANOVAs with treatment (ATD, placebo) and block (RO, RP) as within-subjects factors, and gender and treatment order as between-subjects factors. Additional analyses were conducted using, where appropriate, time (early trials, late trials), stimulus (punished stimulus, non-punished stimulus), response type (non-punished button, punished button), and stimulus difficulty (easy, difficult) as within-subjects factors. Factors were dropped from subsequent analyses when non-significant. Post hoc comparisons were conducted using paired t-tests.

In within-subject designs, the appropriate index of variation is not the standard error of the mean, but the standard error of the difference of the means (SED), which is used when one is interested in the relationship between variables rather than the variables themselves. The SED is therefore used in the figures as an index of variation. The SED is the denominator for Student's t-test and also provides a visual method of comparing mean values in graphical depictions of within-subject designs.
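To illustrate the relationship between the SED and the paired t-test, the following sketch computes both from paired observations. It assumes that the SED is taken as the standard error of the per-participant differences (the standard deviation of the differences divided by the square root of the number of participants); the function names and the data in the usage lines are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_differences(a: list, b: list) -> list:
    """Per-participant differences between two within-subject conditions."""
    if len(a) != len(b):
        raise ValueError("conditions must contain one value per participant")
    return [x - y for x, y in zip(a, b)]


def sed(a: list, b: list) -> float:
    """Standard error of the difference for paired data (assumed definition)."""
    diffs = paired_differences(a, b)
    return stdev(diffs) / sqrt(len(diffs))


def paired_t(a: list, b: list) -> float:
    """Paired t statistic: mean difference divided by the SED."""
    return mean(paired_differences(a, b)) / sed(a, b)


# Hypothetical values for four participants in two conditions.
condition_rp = [1.0, 2.0, 3.0, 4.0]
condition_ro = [0.0, 0.0, 1.0, 1.0]
t_value = paired_t(condition_rp, condition_ro)  # degrees of freedom = n - 1
```

Because the SED is the denominator of the t statistic, a difference between two means that is roughly twice the plotted SED corresponds approximately to the conventional significance threshold.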

RESULTS

Serotonin Manipulation

Plasma samples were analyzed for tryptophan content according to the procedure described in Crockett et al (2009). ATD resulted in significant reductions in both plasma tryptophan levels and the TRP/ΣLNAA ratio. A repeated-measures ANOVA revealed a significant two-way interaction between treatment (ATD, placebo) and time point (baseline, +5.5 h), resulting from significant reductions in total tryptophan levels (F(1, 23)=108.524, p<0.0001) and the TRP/ΣLNAA ratio (F(1, 23)=28.605, p<0.0001), 5.5 h following ATD relative to placebo. Simple effects analyses showed a significant decrease in plasma tryptophan levels (t(23)=13.883, p<0.0001) on the ATD session, averaging 66%. There was also a significant decrease in TRP/ΣLNAA ratios (t(23)=12.404, p<0.001) on the ATD session, averaging 85%. On the placebo session, plasma tryptophan levels increased by an average of 88% (t(23)=−6.213, p<0.0001); there was no significant change in TRP/ΣLNAA ratios (t(23)=0.537, p=0.598).

Serotonin Modulates the Effects of Aversive Predictions on Choice

To examine the effects of ATD on the behavioral suppression of punished responding across conditions, we analyzed the effects of treatment (ATD, placebo) and block (RO, RP) on response bias (ln(β)). Lower (more negative) numbers indicate a bias toward responding on the punished button, while higher (more positive) numbers indicate a bias away from responding on the punished button. In the RO block, we did not expect there to be a bias toward one response or the other, since both responses yielded the same payoffs; however, in the RP block we expected there to be a bias away from the response that received punishments if incorrect. There was a significant interaction between treatment and block on response bias (F(1, 23)=7.455, p=0.012, difficult trials only; F(1, 23)=4.860, p=0.038, easy and difficult trials combined). On placebo, participants were biased away from responding on the punished button in the RP block (mean±SE, 0.210±0.102) relative to the RO block (mean±SE, −0.051±0.082; t(23)=−2.485, p=0.021; see Figure 2, left panel). However, this punishment-induced suppression of responding on the punished button was released by ATD; there was no difference in response bias between the RP block (mean±SE, −0.056±0.122) and the RO block (mean±SE, 0.078±0.089; t(23)=1.013, p=0.321; see Figure 2, right panel). ATD specifically influenced response bias in the RP block; bias away from the punished button was significantly reduced on ATD, relative to placebo, in the RP block (t(23)=−2.113, p=0.046), but not in the RO block (t(23)=1.289, p=0.210). The effect of ATD on response bias in the RP block was present at the start of the block and remained constant across trials. When we sorted the data into early and late trials and repeated the above analysis with treatment, block and time as factors, the treatment × block interaction remained significant (F(1, 23)=4.989, p=0.034), but there were no significant effects of time (F(1, 23)=0.013, p=0.911), time × treatment (F(1, 23)=0.624, p=0.437), time × block (F(1, 23)=0.006, p=0.940), or time × treatment × block (F(1, 23)=0.369, p=0.549).

Figure 2

Effect of ATD on punishment avoidance, assessed by comparing response bias in the RP block with the RO block. Response bias was assessed by the natural log of β from signal detection theory; more positive values indicate a bias away from responding on the punished button. Floating error bars depict the SED for the RO vs RP effect. *p<0.05. RP: reward+punishment; RO: reward only.

Serotonin Modulates the Effects of Aversive Predictions on Response Vigor

We first considered the influence of punishment expectations on response vigor by analyzing the effects of block (RO, RP) and treatment (ATD, placebo) on RTs for correct responses (irrespective of stimulus). This analysis indicated that on the placebo session, the possibility of punishment made subjects respond more slowly, but this effect was abolished by ATD (treatment × block interaction, F(1, 23)=4.734, p=0.042). On placebo, participants were slower to respond in the RP block (mean±SE, −0.219±0.082) than in the RO block (mean±SE, −0.367±0.083; t(23)=−2.254, p=0.034; see Figure 3, left panel). However, this punishment-induced inhibition of responding was absent following ATD; responses were not slower in the RP block (mean±SE, −0.371±0.099) compared with the RO block (mean±SE, −0.329±0.111; t(23)=0.456, p=0.653; see Figure 3, right panel). We therefore replicated our previous finding that ATD abolishes punishment-induced inhibition (Crockett et al, 2009). The effects of ATD on punishment-induced inhibition appeared early and were consistent across trials. Separating the RT data into early and late trials and including time as a factor in our model, the treatment × block interaction remained significant (F(1, 23)=4.306, p=0.049), but there were no significant effects of time (F(1, 23)=0.964, p=0.337), time × treatment (F(1, 23)=0.918, p=0.348), time × block (F(1, 23)=0.043, p=0.838), or time × treatment × block (F(1, 23)=0.287, p=0.598).

Figure 3

Effect of ATD on punishment-induced inhibition, assessed by comparing RTs for correct responses in the RP block relative to the RO block. All RTs were normalized against a neutral baseline and converted to z-scores. Floating error bars depict the SED for the RO vs RP effect. *p<0.05. RP: reward+punishment; RO: reward only.

Serotonin Modulates the Effects of Pavlovian Aversive Predictions on Response Vigor

We next examined the RT data at a finer level of detail to test the hypothesis that 5-HT modulates the effects of Pavlovian (stimulus-outcome) aversive predictions. We analyzed RTs (regardless of response button) to the easy and difficult punished and non-punished stimuli in the RO and RP blocks. In a repeated-measures ANOVA with block (RO, RP), stimulus (non-punished, punished), difficulty (easy, difficult) and treatment (ATD, placebo) as factors, we found a significant main effect of difficulty (F(1, 23)=12.483, p=0.002); subjects were faster to respond to easy stimuli than to difficult stimuli. We also found a significant three-way interaction between treatment, block, and stimulus (F(1, 23)=4.534, p=0.047), and a trend-level four-way interaction between treatment, block, stimulus, and difficulty (F(1, 23)=3.713, p=0.069).

To explore these interactions, we examined the effects of treatment, block, and stimulus for easy and difficult stimuli separately. We suspected that Pavlovian effects would be weaker for the easy stimuli, since these were less predictive of punishments; mean accuracy for easy trials was over 90%, with more than half of subjects performing perfectly and thus never receiving punishments for easy stimuli. Consistent with this prediction, we did not find evidence of Pavlovian slowing to the punished easy stimulus; on the placebo session, participants were not slower to respond to the punished stimulus in the RP block, relative to the RO block (t(23)=−0.719, p=0.479), and within the RP block, they were not slower to respond to the punished stimulus, relative to the non-punished stimulus (t(23)=−0.641, p=0.528). Furthermore, the three-way interaction between treatment, block, and stimulus was not significant (F(1, 23)=0.318, p=0.579), indicating that ATD did not affect slowing to the punished stimulus for easy trials.

In contrast, when focusing on the difficult stimuli we found a significant three-way interaction between treatment, block, and stimulus (F(1, 23)=4.618, p=0.042). On the placebo session, participants were slower to respond to the punished stimulus in the RP block, compared with the RO block (t(23)=2.459, p=0.022), whereas they responded with equal speed to the non-punished stimulus in the RP and RO blocks (t(23)=−0.391, p=0.699). Following ATD, participants did not exhibit slowing in the RP block, relative to the RO block, for either the punished stimulus (t(23)=−0.986, p=0.334) or the non-punished stimulus (t(23)=0.184, p=0.856).

We next confirmed that ATD abolished punishment-induced slowing to the punished stimulus. We computed slowing scores for the punished and non-punished stimuli by taking the RT difference between the RO and RP blocks. Relative to placebo, ATD significantly reduced punishment-induced slowing to the punished stimulus (t(23)=−2.353, p=0.028), without affecting slowing to the non-punished stimulus (t(23)=0.527, p=0.603); Figure 4.

Figure 4

Effect of ATD on punishment-induced inhibition in the presence of punished vs non-punished stimuli. Punishment-induced inhibition was calculated by computing difference scores between RTs in the RO and RP blocks. RTs were normalized against a neutral baseline and converted to z-scores. Floating error bars depict the SED for the ATD effect; fixed error bars depict the SEM for the RO vs RP effect. *p<0.05.

Finally, we focused on responses to the punished stimulus (difficult trials, as in the previous analysis) and examined the slowing of responses on the punished button vs the non-punished button. We reasoned that slowing of responses on the non-punished button in the presence of the punished stimulus should only reflect Pavlovian (stimulus-outcome) processes, while slowing of responses on the punished button in the presence of the punished stimulus should reflect both Pavlovian and instrumental (stimulus-response-outcome) processes. Thus, if 5-HT influences Pavlovian inhibition we should see a main effect of ATD on all responses, whereas if 5-HT influences instrumental inhibition we should see effects of ATD on responses on the punished button only, resulting in a treatment-by-response interaction.

We tested these predictions by analyzing slowing scores (computed as in above analysis by taking the RT difference between the RO and RP blocks) in a repeated-measures ANOVA with the factors treatment (ATD, placebo) and response (non-punished button, punished button); Figure 5. We found a main effect of response (F(1, 23)=5.246, p=0.033); responses on the punished button showed more slowing than responses on the non-punished button, reflecting an instrumental inhibition of the punished response. We also found a main effect of treatment (F(1, 23)=4.812, p=0.040); across all responses, ATD reduced slowing to the punished stimulus in the RP block, relative to the RO block. Importantly, the treatment-by-response interaction was not significant (F(1, 23)=1.007, p=0.328); in other words, the effects of ATD on RTs in the presence of the punished stimulus were not restricted to responses on the punished button, as would be predicted by a role for 5-HT in instrumental (stimulus-response-outcome) aversive predictions. Instead, the main effect of treatment suggests a role for 5-HT in Pavlovian (stimulus-outcome) aversive predictions.
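The logic of separating a main effect of treatment from a treatment-by-response interaction can be made concrete with per-participant contrasts. In a fully within-subject 2 × 2 design each effect has one degree of freedom, so testing it is equivalent to a one-sample t-test on the corresponding contrast (F=t²). The sketch below is illustrative only: the function name `contrasts`, the cell labels, and the slowing scores are hypothetical, and slowing is assumed to be scored as RP minus RO so that positive values indicate slowing.

```python
def contrasts(slowing: dict) -> tuple[float, float]:
    """Return (treatment_effect, interaction) for one participant.

    `slowing` maps (treatment, button) to a slowing score, assumed here to be
    normalized RT in the RP block minus normalized RT in the RO block.
    """
    plc_pun = slowing[("placebo", "punished")]
    plc_non = slowing[("placebo", "non_punished")]
    atd_pun = slowing[("atd", "punished")]
    atd_non = slowing[("atd", "non_punished")]

    # Main effect of treatment: placebo vs ATD, averaged over both buttons.
    treatment_effect = (plc_pun + plc_non) / 2 - (atd_pun + atd_non) / 2
    # Interaction: is the button difference larger on placebo than on ATD?
    interaction = (plc_pun - plc_non) - (atd_pun - atd_non)
    return treatment_effect, interaction


# Hypothetical 'Pavlovian' pattern: ATD reduces slowing equally on both buttons.
pavlovian = {
    ("placebo", "punished"): 0.6, ("placebo", "non_punished"): 0.4,
    ("atd", "punished"): 0.2, ("atd", "non_punished"): 0.0,
}
# Hypothetical 'instrumental' pattern: ATD reduces slowing on the punished button only.
instrumental = {
    ("placebo", "punished"): 0.6, ("placebo", "non_punished"): 0.0,
    ("atd", "punished"): 0.0, ("atd", "non_punished"): 0.0,
}

pav_treatment, pav_interaction = contrasts(pavlovian)
ins_treatment, ins_interaction = contrasts(instrumental)
```

In the hypothetical Pavlovian pattern the interaction contrast is approximately zero while the treatment contrast is positive; in the hypothetical instrumental pattern the interaction contrast is clearly non-zero. Across participants, these contrasts would be tested against zero.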

Figure 5

Effect of ATD on punishment-induced inhibition of responses to punished stimuli on the non-punished vs punished buttons. Punishment-induced inhibition was calculated by computing difference scores between RTs in the RO and RP blocks. RTs were normalized against a neutral baseline and converted to z-scores. Fixed error bars depict the SEM for the RO vs RP effect.

For completeness, we repeated the above analysis for responses to the non-punished stimulus. We did not find any evidence of slowing of responses to the non-punished stimulus, on either the non-punished or punished button, on either ATD or placebo (see Supplementary Results).

As an additional test of the hypothesis that 5-HT modulates the effect of Pavlovian aversive predictions on response vigor, we examined the immediate after-effects of punishment. We reasoned that the effects of Pavlovian aversive predictions on response vigor should be strongest on trials that immediately follow punishment, resulting in slower responding on trials following punishment (vs trials following non-punishment). We expected this ‘post-punishment slowing’ effect to be reduced following ATD. As predicted, participants were significantly slower to respond following punishments than following correct responses, and this effect was abolished by ATD (see Supplementary Results and Supplementary Figure S1).

No Effect of Low Serotonin on Discrimination Performance or Mood

To rule out the possibility that ATD influenced performance via effects on attention or executive function, we assessed the effects of treatment on sensitivity (d′) in the experimental blocks in a repeated-measures ANOVA with treatment (ATD, placebo) and block (RO, RP) as within-subjects factors. There was a trend toward better discrimination performance in the RO block, relative to the RP block (F(1, 23)=3.105, p=0.091), but there were no significant effects of treatment (F(1, 23)=0.013, p=0.911) or treatment × block (F(1, 23)=0.325, p=0.574).

Consistent with previous studies in healthy volunteers, ATD did not affect subjects’ self-reported mood. PANAS scores were analyzed immediately before drink ingestion, and immediately before cognitive testing. A repeated-measures ANOVA with treatment (ATD, placebo) and time point (baseline, +5.5 h) as within-subjects factors found no significant effects of treatment, time point, or their interaction on PANAS positive affect (all p>0.13) or negative affect (all p>0.15).

DISCUSSION

Temporarily lowering 5-HT in humans produced a selective reduction in aversively motivated behavioral inhibition, replicating our previous findings (Crockett et al, 2009). This effect was evident in terms of both response bias and response vigor (ie, latencies). Critically, our task design allowed us to separate instrumental (stimulus-response-outcome) and Pavlovian (stimulus-outcome) processes in punishment-induced inhibition. Our results suggest that 5-HT is specifically involved in translating Pavlovian aversive predictions into behavioral inhibition, a function consistent with updated theoretical accounts of 5-HT in motivation and action (Dayan and Huys, 2009; Boureau and Dayan, 2011; Cools et al, 2011).

Recent computational approaches to affective learning and decision making have emphasized a distinction between instrumental control systems, which learn to emit arbitrary responses in pursuit of optimal outcomes, and Pavlovian control systems, which emit evolutionarily pre-programmed reflexes (eg, approach rewards and avoid punishments) to biologically relevant outcomes and their anticipation (Dayan and Huys, 2008; Dayan et al, 2006). Since any experiment with an instrumental contingency between stimulus, response, and outcome also contains a Pavlovian contingency between stimulus and outcome, behavior reflects a combination of Pavlovian and instrumental processes (Rescorla and Solomon, 1967; Mackintosh, 1983). Thus, any explanation of 5-HT's involvement in punishment-induced inhibition is incomplete without considering both instrumental and Pavlovian processes, especially because in the case of punishment the two are usually aligned (Bolles et al, 1980; Dayan et al, 2008; Boureau and Dayan, 2011; one exception is negative automaintenance). This analysis becomes all the more important when considering that Pavlovian responses to predictions of reward and punishment may account for a significant portion of anomalies in human decision making (Dayan et al, 2006), as well as psychopathologies such as depression (Dayan and Huys, 2008), in which 5-HT has historically played a prominent role. For example, a recent theoretical account of 5-HT in depression posits that 5-HT normally mediates reflexive (Pavlovian) avoidance of distressing thoughts; in depression, low 5-HT leads to increased engagement with aversive mental states and inflated predictions of the likelihood of aversive outcomes (Dayan and Huys, 2008).

In the current study, we separately assessed the impact of aversive predictions on response bias for punished vs non-punished responses, and on response vigor, reflected in the speed of responses in the presence of stimuli predictive of punishments. Aversive predictions influenced response selection: in an aversive context, subjects were biased against responding on the punished button, but only on placebo; tryptophan depletion abolished the bias against the punished response. Aversive predictions also influenced response vigor: in an aversive context, subjects responded more slowly than in a non-aversive context, but again only on placebo. Tryptophan depletion abolished the influence of aversive predictions on response vigor, replicating our previous findings (Crockett et al, 2009). These observations strongly support a role for 5-HT in the processing of aversive predictions.

But aversive predictions can take several forms. Specifically, aversive predictions can be instrumental, linking stimuli, responses, and outcomes; or they can be Pavlovian, linking stimuli and outcomes. Many studies have shown that 5-HT influences the effects of aversive predictions on punishment-induced behavioral inhibition (Thiébot et al, 1982, 1983; Tye et al, 1977, 1979; Wise et al, 1972; Graeff and Schoenfeld, 1970; Crockett et al, 2009), but no study has examined to what extent 5-HT modulates Pavlovian vs instrumental aversive predictions. This is likely because the two are difficult to disentangle: in the case of punishment, where a specific combination of stimulus and response produces an aversive outcome, Pavlovian and instrumental aversive predictions operate in parallel (Bolles et al, 1980). One potential explanation of our finding that ATD abolished a bias against responses that lead to punishment is that 5-HT is necessary for instrumental punishment sensitivity. An alternative explanation is that 5-HT mediates the effects of Pavlovian aversive predictions on instrumental choice. In aversive PIT, stimuli predictive of punishments increase the likelihood of selecting actions that avoid punishment (Overmier et al, 1971; Huys et al, 2011). Such effects could contribute to the response bias we observed in the RP block, and a role for 5-HT in aversive PIT could partly explain the effects of ATD on response bias. The current experiment was not designed to specifically examine PIT effects, but this would be a promising target for future research.

In analyzing the effects of aversive predictions on response vigor, our experimental design allowed us to separately assess Pavlovian aversive predictions, reflected in the slowing of all responses in the presence of punished stimuli, and instrumental aversive predictions, reflected in the slowing of punished responses in the presence of punished stimuli. ATD had a main effect on response latencies for both punished and non-punished responses in the presence of punished stimuli, but did not affect response latencies to non-punished stimuli, suggesting a role for 5-HT in modulating the effects of Pavlovian aversive predictions on behavioral inhibition. However, we note that the lack of a significant interaction effect in the presence of a main effect of ATD could reflect a lack of power, and further studies should seek to replicate our findings. Furthermore, ATD abolished the slowing of responses immediately following punishment, but did not affect slowing of responses following errors. The effects of ATD on post-punishment slowing were again concentrated on responses to punished stimuli, providing further evidence for serotonergic modulation of Pavlovian aversive predictions.
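The logic of separating the two kinds of slowing can be illustrated with a minimal sketch. It is not the analysis reported in this study, which relied on statistical tests of main and interaction effects. The function name, the dictionary keys, and the use of simple mean differences are assumptions made for illustration only. Under these assumptions, a Pavlovian index compares all responses to punished stimuli against responses to non-punished stimuli, and an instrumental index compares punished against non-punished responses within punished stimuli.

```python
from statistics import mean


def slowing_indices(latencies):
    """Return (pavlovian, instrumental) slowing indices in ms.

    `latencies` is a hypothetical dict of response-latency lists (ms):
      "punished_stim_punished_resp":    punished responses to punished stimuli
      "punished_stim_nonpunished_resp": non-punished responses to punished stimuli
      "nonpunished_stim":               all responses to non-punished stimuli
    These keys are illustrative, not taken from the study.
    """
    pun_pun = latencies["punished_stim_punished_resp"]
    pun_non = latencies["punished_stim_nonpunished_resp"]
    non = latencies["nonpunished_stim"]

    # Pavlovian: slowing of ALL responses in the presence of punished stimuli,
    # relative to responses in the presence of non-punished stimuli.
    pavlovian = mean(pun_pun + pun_non) - mean(non)

    # Instrumental: extra slowing of the punished response itself,
    # relative to the non-punished response, within punished stimuli.
    instrumental = mean(pun_pun) - mean(pun_non)

    return pavlovian, instrumental
```

On this sketch, a manipulation that shrinks the first index while leaving the second unchanged would correspond to a selective effect on Pavlovian, rather than instrumental, aversive predictions.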

Reflexive withdrawal responses to Pavlovian aversive predictions involve the amygdala, in particular the central nucleus (CeA) (Maren, 2001; Killcross et al, 1997; Cardinal et al, 2002; Balleine and Killcross, 2006). Lesions of the CeA reduce conditioned suppression of non-punished responses without affecting suppression of punished responses (Killcross et al, 1997), paralleling the current findings. Another region likely involved in Pavlovian aversive predictions is the insula (Calder et al, 2001), which is extensively anatomically connected with the amygdala (Barbas, 2007; Stein et al, 2007). An fMRI study that specifically modeled the neural representation of Pavlovian aversive predictions during learning found aversive value-related activity in the insula, as well as a region of the brainstem consistent with the location of the dorsal raphé nucleus (Seymour et al, 2004). Both the CeA and the insula receive a high level of serotonergic input, measured by 5-HT transporter density (Smith et al, 1999; O’Rourke and Fudge, 2006; Way et al, 2007), supporting the idea that 5-HT may promote punishment-induced inhibition by modulating activity in these regions. Future studies combining tryptophan depletion with fMRI are needed to explore these possibilities.

The method we used to manipulate 5-HT function, acute tryptophan depletion, is well known to deplete central 5-HT levels (Crockett et al, 2011) by way of reducing 5-HT synthesis (Nishizawa et al, 1997) and release in projection regions (Stancampiano et al, 1997; Fadda et al, 2000; van der Stelt et al, 2004). As a caveat, though, this method reduces 5-HT levels to a modest extent, and does so globally, which precludes drawing conclusions about region-specific effects without the concurrent use of neuroimaging. It is possible that more profound loss of central 5-HT would affect other processes, including aversive instrumental prediction, possibly mediated by distinct terminal regions. It has been suggested that tryptophan depletion may selectively influence tonic, rather than phasic, serotonergic signaling, which has important consequences for the interpretation of our results. Tonic 5-HT has been hypothesized to report the average rate of punishments, and a reduction in response vigor serves as an adaptive response to increased expectations of punishments (Cools et al, 2011). This proposal is consistent with our results; response vigor was indeed reduced in an aversive context, and this effect was abolished by tryptophan depletion. Although there is certainly room for future research using more precise methods to pin down the role of phasic 5-HT signaling, we note that psychiatric disorders as well as their treatments involve changes in global, tonic levels of 5-HT rather than region-specific changes in phasic 5-HT. Thus, an understanding of 5-HT function at the global level is necessary for resolving disorders of 5-HT in psychopathology.

In summary, we replicated previous findings that 5-HT is critical for punishment-induced inhibition, and now assign further specificity to 5-HT in this process by demonstrating that 5-HT modulates Pavlovian aversive predictions. ATD specifically abolished the slowing of response latencies in the presence of punishment-predicting stimuli. Our data provide early empirical support for emerging ideas about the role of 5-HT in motivation: operating at the intersection of aversion and inhibition, it may function to reduce the vigor of responding in the face of aversive predictions (Dayan and Huys, 2009; Boureau and Dayan, 2011; Cools et al, 2011).