Introduction

Selective serotonin reuptake inhibitors (SSRIs), such as fluoxetine, are the first-line treatments for psychiatric disorders with emotional dysfunctions, such as depression and anxiety. These agents selectively block the serotonin transporter (SERT), which leads to an increase of serotonin in the extracellular compartment. Despite the widespread use of SSRIs, the brain mechanisms that underpin their antidepressant and anxiolytic actions remain unclear. To date, most studies have focused on molecular and cellular mechanisms, including the desensitization of serotonin receptors [1], increased synaptic plasticity [2], and adult neurogenesis [3]. However, the insights gained from these studies are questioned by the time lag required to elicit therapeutic improvements. An alternative view suggests that SSRIs act primarily on the processing of emotionally valenced information [4,5,6]. SSRIs quickly facilitate the relative processing of positive versus negative affective information in patients and healthy subjects [7,8,9,10]. For instance, it has been shown that a short-term SSRI treatment promotes the recognition and categorization of social cues associated with a positive valence over negatively valenced cues in a facial expression recognition task [7, 11, 12]. These early changes in the processing of emotional valence predict later clinical outcomes [13], suggesting that the induction of a positive emotional bias is an early manifestation of the therapeutic actions of SSRIs.

Neuroimaging studies have shown that clinical improvements are tightly coupled with valence-specific effects of SSRIs on widely distributed brain networks [8, 14,15,16,17,18,19]. While considerable effort has been devoted to elucidating SSRI effects on both positive and negative emotions in the amygdala, anterior cingulate and orbitofrontal cortex [20,21,22], much less attention has been given to the ventral striatum (i.e., the nucleus accumbens [23]) in this context [24, 25]. Because the ventral striatum receives a strong serotonergic innervation [26, 27], expresses many SERTs [28], and makes an important contribution to regulating motivation and reward processing [29, 30], we hypothesize that this site could play a major role in SSRI actions on the processing of emotionally valenced information. A recent series of intracerebral interventions performed in non-human primates has provided causal evidence that the motivation both to obtain rewards and to avoid negative events is mediated by the ventral striatum [31, 32]. Dysfunctions of these limbic circuits even produced anxiety-related behaviors or an apathetic state depending on the subregions affected within the ventral striatum [32,33,34,35], making this brain region a good candidate for regulating biases in emotional processing.

To determine if the ventral striatum conveys signals consistent with its predicted role in SSRI actions on emotional processing, we first trained two monkeys to perform an approach-avoidance task in which we manipulated the valence of sensory stimuli [31, 36]. Then, we measured neural responses to positive and negative events before and during a 4-week treatment period with fluoxetine. Combined with PET scans in which we assessed the fluoxetine binding levels at a therapeutic dose [37, 38], our results support an active role for the ventral striatum in SSRI actions by showing that fluoxetine potentiated positively valenced signals and attenuated negative emotion processing in this brain region. Thus, the ventral striatum is likely to be central in therapeutic approaches for affective disorders.

Materials and methods

Animals

Two cynomolgus monkeys (C and S) completed both the behavioral and electrophysiological parts of the study. To facilitate a quantitative approach in the imaging part, we added two extra macaques so that brain scans were acquired from a total of four animals. Animal care and housing complied with NIH guidelines (1996) and the 2010 European Council Directive (2010/63/EU) recommendations.

Apparatus

Each monkey was seated on a chair in front of a touch-screen on which it executed the task with the left arm. To initiate the task, the monkeys placed their left hand on an infrared-sensitive resting key installed on the chair. Presentation (Neurobehavioral Systems, CA, USA) and Scenario Manager (ISCJM, France) controlled the presentation of visual cues on the screen, monitored behavioral responses, and regulated solenoid valves for both reward delivery and airpuff systems. Single drops of apple juice (0.2 mL) or single puffs of air directed to the face (between the cheek and eye; 25–35 psi) were delivered depending on task conditions. Gaze position and blinking were monitored using an infrared camera system (120 Hz; ETL-200, ISCAN, MA, USA). Licking was detected whenever the tongue interrupted an infrared beam installed on the juice delivery system. A detailed description of the apparatus and paradigm can be found in our previous studies [31, 36].

Approach-avoidance task

The monkeys were trained to perform an approach-avoidance task that manipulated the valence of sensory stimuli (Fig. 1A). The attribution of valence to sensory information presented on the screen guided monkeys’ decisions to approach rewards and avoid punishments. Each trial began with a white central dot when the monkey placed its hand on the resting key. After 1.3 s, a conditioned stimulus was presented on the screen (for 1 s). The location of the cue alternated randomly between the left and the right sides of the monitor. We used two sets of images as conditioned stimuli: a first series of images signaled the possibility of obtaining a juice reward at the end of the trial (defined as events with a positive valence), while a second series signaled the possibility of receiving an airpuff as punishment (defined as events with a negative valence). Images of abstract or concrete objects, including fractals and food items, served as conditioned stimuli. After a random delay period (1.5–2 s after cue offset), two green squares appeared on both sides of the screen, cueing the animal to touch one of the targets (within 2 s). When the monkey chose the green target at the location where the conditioned stimulus had appeared, it approached the stimulus. In that case, the juice reward or airpuff was delivered after a random delay (1.5–2 s). Alternatively, the monkey could avoid the anticipated outcome by selecting the contralateral green target. In that case, nothing happened at the end of the trial, i.e., the animal missed out on the opportunity to earn a reward or successfully prevented an airpuff. The trials were separated by 0.8–1.5 s intertrial intervals during which the screen was black. Errors in task performance were detected according to the specific task requirement not fulfilled.
In particular, (i) the monkey was required to hold its hand in the start position until the presentation of the green targets; (ii) only a response time (reaction time [RT] + movement duration [MD]) of <2 s was allowed; and (iii) the touch of the screen was required to be in one of the target zones (5 cm²). If any one of these rules was broken, an error was registered, a blank screen appeared (1 s), followed by an intertrial interval, and the trial was repeated. Thus, monkeys were obliged to complete all trials according to the requirements in order for the task to proceed to other conditions.

Fig. 1: Approach-avoidance task and behavior.

A Temporal sequence of events for the two trial types. After the monkey initiated a trial by positioning its hand on a resting key, a conditioned stimulus was presented briefly (left or right position chosen at random). During instruction cue presentation, some of the visual stimuli predicted liquid reward (green), while others predicted airpuff punishment (red). Depending on the condition, the monkey was then required to approach or avoid the location of the presented cue and thereby the anticipated outcome by selecting the ipsi- or contralateral green target, respectively. B Licking and C blinking behaviors (mean ± SEM) as detected around the time of outcomes (reward vs. airpuff). Differences between trial types were examined across sessions using a series of two-tailed t-tests. P values are shown in plots with a logarithmic scale. The horizontal dashed line indicates the statistical threshold (P < 0.05, corrected for 3001 time bins). D Performance of the two monkeys. Behavioral measures were averaged (mean ± SEM) separately for each trial type (appetitive vs. aversive) and drug condition (control vs. fluoxetine) across all recording sessions per animal. The number of trials successfully performed, the rate of approach, the error rate, the reaction time (RT), and the movement duration (MD) were overall affected by the valence of trials and by fluoxetine administration (two-way ANOVA, *P < 0.05, **P < 0.01, ***P < 0.001).

Prior to the experiment, the animals learned the valence associated with each conditioned stimulus (positive vs. negative) during a training period (>8 months) and were then free to choose any option they preferred (approach vs. avoidance). To maintain a sufficient level of motivation for performance and to minimize the risk of task disengagement, the aversiveness of the airpuff was limited (<36 psi) and negative cues were displayed only in ~40% of the trials. Specifically, a negative cue occurred only after a positive one, and no more than two positive cues could appear in consecutive trials. Hence, in addition to error trials, monkeys could predict with certainty the upcoming task condition when a negative cue or a pair of positive cues was presented. Conversely, it was not possible to anticipate the valence of the next trial after the presentation of a single positive cue.
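The sequencing rules described above can be illustrated with a short simulation. This is only a sketch under stated assumptions: the function name, the `p_negative_after_single` parameter and its value of 0.5 are hypothetical and do not come from the study, and repeated error trials are ignored. Under these rules, a probability of 0.5 for a negative cue after a single positive cue gives a long-run share of negative trials of 1/(3 − 0.5) = 40%, which matches the proportion reported above.

```python
import random

def generate_trials(n_trials, p_negative_after_single=0.5, seed=0):
    """Return a list of (valence, certain) tuples obeying the sequencing rules.

    p_negative_after_single is an assumed parameter, not a value from the study.
    """
    rng = random.Random(seed)
    trials = []
    previous = None          # valence of the preceding trial
    positives_in_a_row = 0   # consecutive positive cues just presented
    for _ in range(n_trials):
        after_negative = previous == "negative"
        after_two_positives = positives_in_a_row >= 2
        if previous is None or after_negative:
            valence = "positive"      # a negative cue only follows a positive one
        elif after_two_positives:
            valence = "negative"      # never more than two positives in a row
        elif rng.random() < p_negative_after_single:
            valence = "negative"
        else:
            valence = "positive"
        # The upcoming valence is predictable only in the two forced cases.
        certain = after_negative or after_two_positives
        trials.append((valence, certain))
        positives_in_a_row = positives_in_a_row + 1 if valence == "positive" else 0
        previous = valence
    return trials

trials = generate_trials(20000)
fraction_negative = sum(v == "negative" for v, _ in trials) / len(trials)
print(round(fraction_negative, 3))
```

Trials flagged as `certain` correspond to those following a negative cue or a pair of positive cues; trials following a single positive cue are the uncertain ones.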

Surgery and localization of the recording site

After training, a recording chamber and head holder (Crist Instruments, MD, USA) were fixed to the skull on the right side under general anesthesia and sterile conditions. The anatomical location of the anterior striatum (between AC+2 and AC+6) and proper vertical positioning of the recording chamber to access it were estimated from structural MRI scans (Siemens 1.5 T, voxel size of 0.6 mm). The boundaries of brain structures were identified on the basis of standard criteria including relative location, neuronal spike shape, firing pattern and responsiveness to behavioral events.

Recording and data acquisition

During recording sessions, an epoxy-insulated tungsten microelectrode (FHC Inc., ME, USA; 2–4 MOhm) was advanced into the target nucleus using a computer-controlled microdrive (Nan Instruments, Israel). Neuronal signals were amplified (×1000, Plexon Inc., TX, USA), bandpass filtered (0.3–10 kHz), and continuously sampled at 20 kHz (Spike 2, CED, UK). Individual spikes could be sorted on-line or off-line. Spikes were isolated using clustering on the basis of several waveform parameters including principal components, peak and trough amplitudes, as well as the presence of a refractory period. The timing of detected spikes and of relevant task events was sampled digitally at 1 kHz. We used two standard criteria to dissociate phasically active neurons (i.e., spiny projection neurons (SPNs), which constitute the main striatal output neurons) from interneurons: the cells’ average firing rate, and extracellular spike waveform duration from the first negative peak to the following positive peak. Neurons with waveform durations of 0.9–2.5 ms and average firing rates <6 Hz were classified as presumed SPNs [39,40,41].
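The two classification criteria can be written as a simple predicate. This is a minimal sketch: the function name is hypothetical, and treating the 0.9–2.5 ms bounds as inclusive is an assumption, since the text does not say how boundary values were handled.

```python
def is_presumed_spn(waveform_duration_ms, mean_rate_hz):
    """Apply the two criteria given in the text for presumed SPNs.

    Inclusive duration bounds are an assumption made for illustration.
    """
    return 0.9 <= waveform_duration_ms <= 2.5 and mean_rate_hz < 6.0

print(is_presumed_spn(1.2, 2.0))   # long waveform, low rate
print(is_presumed_spn(0.5, 20.0))  # short waveform, high rate
```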

Fluoxetine administration

Fluoxetine (2 mg/kg) was intramuscularly injected daily for 4 weeks. We performed each injection in the animal’s cage 4 h before testing the drug effects in the approach-avoidance task. This dosage has been shown to reduce anxiety-related symptoms such as self-injuries and stereotypic behaviors in monkeys [37, 38]. Drug concentrations and metabolic profiles are reported to be consistent with those seen in humans [42]. Fluoxetine administration was initiated after completion of experimental sessions defined as control days in which data were collected without any injection.

PET imaging

PET scans with 11C-N,N-dimethyl-2-(-2-amino-4-cyanophenylthio) benzylamine ([11C]-DASB), a radioligand selective for the SERT, were performed following an injection of fluoxetine (acute intramuscular injection of 4 mg/kg, 4 h before acquisition) or nothing in four anaesthetized monkeys. PET imaging was performed with a Siemens Biograph mCT/S64 scanner (CERMEP). As reported in our previous studies [43, 44], the non-displaceable binding potential (BP) of [11C]-DASB with (test) and without (control) drug injection was calculated for each animal. Parametric images of BP were calculated using a simplified tissue model with the cerebellum as reference [43]. Images were transformed into a common space using a brain Macaca fascicularis MRI template [45], and the values for the structure of interest were compared between conditions (control–test).

Analysis of behavioral data

We analyzed (i) whether behavior varied according to the valence assigned to distinct conditioned stimuli, and (ii) whether fluoxetine changed behavior. To avoid any variation resulting from a switch in monkeys’ strategy or a drift in their internal representation of cue values (caused by different phases in the learning of the task, for instance), we initiated the data collection when monkeys showed a stable level in task performance across sessions. We tested the stability of animals’ performance by comparing behavioral parameters between two groups of control sessions collected without drug administration (two-way ANOVAs). Then, rates of approach (selection of the ipsilateral target), RT (the interval between cue appearance and key release), MD (the interval between key release and target capture), and error rates were tested across types of trials (positive vs. negative valence) and drug conditions (ON vs. OFF) using two-way ANOVAs. As a control, we also tested whether behavioral parameters varied according to the certainty levels, which depended on the recent history of trials presented to the animals (three-way ANOVAs).

Neuronal data analysis

Neuronal recordings were accepted for analysis based on electrode location, recording quality and duration (>5 successful trials for each task condition). Only SPNs were included in this study. To analyze the spontaneous neuronal activity, we concatenated data from the intertrial intervals during which the screen was black. The mean spontaneous firing rate of a neuron was calculated as the total number of spikes across all intervals divided by the summed duration of those periods. Neuronal burst firing was quantified using the Legendy surprise method [46]. Bursts were defined as groups of 4 or more spikes whose interspike intervals (ISIs) were unusually short compared with other ISIs of a spike train. We used a surprise threshold of 5, which corresponds to alpha <0.001 that the candidate burst would occur as a part of a Poisson-distributed sequence of spikes. The prevalence of bursts in a spike train was measured as the fraction of the total duration of recording that a spike train spent in bursts. Furthermore, the general variability of a neuron’s firing rate was computed as the coefficient of variation of the spike train’s ISIs (i.e., SD ISI/mean ISI).
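The spontaneous firing rate and the coefficient of variation defined above reduce to a few lines of arithmetic. The sketch below uses hypothetical function names and toy spike times; the use of the sample standard deviation is an assumption, since the text only specifies SD ISI/mean ISI. Burst detection with the Legendy surprise method is not reproduced here.

```python
import statistics

def spontaneous_rate(intervals):
    """Pooled firing rate over intertrial intervals.

    intervals: list of (spike_times_s, duration_s) pairs, one per interval.
    """
    total_spikes = sum(len(spikes) for spikes, _ in intervals)
    total_time = sum(duration for _, duration in intervals)
    return total_spikes / total_time

def isi_cv(spike_times_s):
    """Coefficient of variation of the interspike intervals (SD ISI / mean ISI)."""
    isis = [b - a for a, b in zip(spike_times_s, spike_times_s[1:])]
    return statistics.stdev(isis) / statistics.fmean(isis)

# Toy data: three spikes over two 1-s intervals, and two short spike trains.
rate = spontaneous_rate([([0.1, 0.5], 1.0), ([0.2], 1.0)])
cv_regular = isi_cv([0.0, 0.1, 0.2, 0.3, 0.4])
cv_irregular = isi_cv([0.0, 0.01, 0.02, 0.5, 0.51])
print(rate, cv_regular, cv_irregular)
```

A perfectly regular train gives a coefficient of variation near zero, whereas a train with clustered spikes gives a larger value.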

For different task events (i.e., cue presentation and trial outcomes), continuous neuronal activation functions [spike density functions (SDFs)] were generated by convolving each discriminated action potential with a Gaussian kernel (20-ms variance). Mean perievent SDFs (averaged across trials) for each of the task conditions (i.e., events with positive vs. negative valence) were constructed. A phasic response to a task event was detected by comparing SDF values during a post-event epoch (1000 ms) relative to a cell’s firing rate preceding the presentation of the conditioned stimulus (P < 0.01, two-tailed t-test). A neuron was judged to be task-related if it generated a significant phasic response for at least one of the task events. Only trials for which animals approached rewards (positive valence) or punishments (negative valence) were included in the analysis concerning the detection of neural responses to cues and outcomes.
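As an illustration of how a spike density function is built, the sketch below places a Gaussian on each spike and averages the resulting curves across trials. It is not the authors' code: the function names are hypothetical, the kernel width of 20 ms is treated here as the standard deviation (an assumption about the kernel description above), and the 1-ms grid is chosen to match the 1-kHz event sampling.

```python
import math

def spike_density(spike_times_ms, t_start_ms, t_stop_ms, sigma_ms=20.0, step_ms=1.0):
    """Sum one Gaussian per spike on a regular time grid; output in spikes/s."""
    n_bins = int(round((t_stop_ms - t_start_ms) / step_ms)) + 1
    times = [t_start_ms + i * step_ms for i in range(n_bins)]
    # Gaussian density per ms, scaled by 1000 to express the result in spikes/s.
    norm = 1000.0 / (sigma_ms * math.sqrt(2.0 * math.pi))
    sdf = [
        sum(norm * math.exp(-0.5 * ((t - s) / sigma_ms) ** 2) for s in spike_times_ms)
        for t in times
    ]
    return times, sdf

def mean_sdf(trials_spikes_ms, t_start_ms, t_stop_ms, sigma_ms=20.0):
    """Average the single-trial SDFs across trials (perievent mean)."""
    per_trial = [
        spike_density(spikes, t_start_ms, t_stop_ms, sigma_ms)[1]
        for spikes in trials_spikes_ms
    ]
    return [sum(values) / len(values) for values in zip(*per_trial)]

# One spike at the event time, evaluated from -200 to +200 ms.
times, sdf = spike_density([0.0], -200.0, 200.0)
average = mean_sdf([[0.0], []], -200.0, 200.0)
```

Because each kernel integrates to one spike, the area under a single-trial curve (in seconds) equals the number of spikes in the window.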

To test how individual task-related neurons encoded distinct events in the task, we used time-resolved multiple linear regressions. We simultaneously tested whether trial-to-trial neuronal activity was modulated by the valence of events (positive vs. negative), the location of the chosen target (left vs. right), the level of anticipation of the current trial (certain vs. uncertain; see task description above), and interactions between those parameters. For each task-related neuron, we counted spikes (SC) trial-by-trial within a 250-ms test window that was stepped in 20-ms increments from −1000 ms to +2000 ms relative to the time of the event. For each bin, we applied the following model:

$$\begin{aligned}\mathrm{SC}_{i} ={}& \beta_{0} + \beta_{1}\,\mathrm{Valence}_{i} + \beta_{2}\,\mathrm{Location}_{i} + \beta_{3}\,\mathrm{Certainty}_{i} \\ &+ \beta_{4}\,\mathrm{Valence}_{i}\times\mathrm{Location}_{i} + \beta_{5}\,\mathrm{Valence}_{i}\times\mathrm{Certainty}_{i}\end{aligned}$$

where all regressors for the ith trial were represented by dummy variables and were normalized to obtain standardized regression coefficients (Z-scored in standard deviation units). The β0–5 coefficients were estimated using the ‘glmfit’ function in MATLAB (The MathWorks, MA, USA). To test whether individual coefficients were significant, we shuffled spike counts 1000 times across trials and compared actual coefficients to the confidence intervals yielded by shuffling (P < 0.05, corrected for 150 time bins). We then used the polarity of β1 values to characterize the encoding of the valence assigned to task events. P-type neurons showed preferential involvement in the processing of positive events as indicated by significantly positive β1 parameter estimates (i.e., responses on appetitive trials significantly exceeded those on aversive trials). In contrast, N-type neurons were characterized by significantly negative β1 parameter estimates (i.e., responses on aversive trials significantly exceeded those on appetitive trials).
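The regression and shuffling procedure for a single time bin can be sketched as follows. This is an illustration under several assumptions rather than the analysis code: ordinary least squares stands in for ‘glmfit’ (the distribution passed to it is not stated), interaction terms are Z-scored after taking the product, windows are indexed by their start time, the permutation P value is two-sided and uncorrected, and all function names and toy data are hypothetical. The correction across 150 time bins is not implemented because its exact form is not described.

```python
import random

def sliding_counts(spike_times_ms, start=-1000, stop=2000, step=20, width=250):
    """Spike counts in 250-ms windows stepped by 20 ms (150 windows)."""
    return [sum(1 for s in spike_times_ms if t <= s < t + width)
            for t in range(start, stop, step)]

def zscore(column):
    mean = sum(column) / len(column)
    sd = (sum((x - mean) ** 2 for x in column) / (len(column) - 1)) ** 0.5
    return [(x - mean) / sd for x in column]

def design_matrix(valence, location, certainty):
    """Columns: intercept, Valence, Location, Certainty, V x L, V x C."""
    columns = [valence, location, certainty,
               [v * l for v, l in zip(valence, location)],
               [v * c for v, c in zip(valence, certainty)]]
    columns = [zscore(c) for c in columns]
    return [[1.0] + [c[i] for c in columns] for i in range(len(valence))]

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    return [A[i][p] / A[i][i] for i in range(p)]

def permutation_p(X, y, index=1, n_shuffles=1000, seed=0):
    """Compare one coefficient with its distribution under shuffled spike counts."""
    rng = random.Random(seed)
    observed = ols(X, y)[index]
    shuffled = list(y)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if abs(ols(X, shuffled)[index]) >= abs(observed):
            extreme += 1
    return observed, (extreme + 1) / (n_shuffles + 1)

# Toy data: 80 trials in a balanced design, with higher counts on positive trials.
valence = [(i // 4) % 2 for i in range(80)]
location = [(i // 2) % 2 for i in range(80)]
certainty = [i % 2 for i in range(80)]
counts = [2 + 3 * v + (i % 3) for i, v in enumerate(valence)]
X = design_matrix(valence, location, certainty)
beta1, p_value = permutation_p(X, counts, index=1, n_shuffles=200)
```

With these toy counts, β1 is positive, which under the classification above would correspond to a P-type response in that bin.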

In addition, to quantify how strongly neuronal activity was influenced by regressors present in the model, we used the coefficient of partial determination (CPD). The CPD for the nth regressor Xn corresponds to:

$${\mathrm{CPD}}\left( {{X}_{n}} \right) = \left[ {{\mathrm{SSE}}\left( {{X}_{{- n}}} \right) - {\mathrm{SSE}}\left( {X} \right)} \right]/{\mathrm{SSE}}\left( {{X}_{{ - n}}} \right)$$

where SSE(X) refers to the sum of squared errors in the regression model that includes a set of regressors X, and X−n denotes all regressors included in the model except regressor Xn.
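The CPD can be computed by fitting the model twice, once with and once without the regressor of interest. The sketch below is a minimal illustration with hypothetical function names and toy data; it assumes ordinary least squares and a two-regressor model instead of the full model used in the study.

```python
def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    return [A[i][p] / A[i][i] for i in range(p)]

def sse(X, y):
    """Sum of squared errors of the least-squares fit of y on X."""
    b = ols(X, y)
    return sum((yi - sum(bj * xj for bj, xj in zip(b, row))) ** 2
               for row, yi in zip(X, y))

def cpd(X, y, n):
    """CPD of column n: [SSE(X without n) - SSE(X)] / SSE(X without n)."""
    reduced = [[x for j, x in enumerate(row) if j != n] for row in X]
    sse_reduced = sse(reduced, y)
    return (sse_reduced - sse(X, y)) / sse_reduced

# Toy data: counts depend on valence (column 1) but not on location (column 2).
rows = range(40)
valence = [i % 2 for i in rows]
location = [(i // 2) % 2 for i in rows]
noise = [((i * 7) % 5 - 2) * 0.1 for i in rows]
y = [1 + 2 * v + e for v, e in zip(valence, noise)]
X = [[1.0, float(v), float(l)] for v, l in zip(valence, location)]
cpd_valence = cpd(X, y, 1)
cpd_location = cpd(X, y, 2)
```

In this toy example nearly all of the otherwise unexplained variance is accounted for by valence, while the CPD for location is essentially zero.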

Results

Behavioral distinction between appetitive and aversive conditions

After an extensive training phase, the two monkeys were familiar with the task and differentiated appetitive trials from aversive ones. Animals showed stable levels in task performance across control sessions with equivalent preferences and willingness to work (Fig. S1). They increased their frequency of licking (Fig. 1B) or blinking (Fig. 1C) specifically in anticipation of reward or airpuff delivery (two-tailed t-test, P < 0.05, corrected for 3001 time bins). The conditioned stimuli presented during the task effectively acquired distinct valence as evidenced by consistent effects on the animals’ task performance (Fig. 1D). Both monkeys preferentially approached reward-predictive cues (two-way ANOVA; monkey C: F(1,45) = 343 P < 0.001, monkey S: F(1,48) = 2635 P < 0.001) with shorter RTs (C: F(1,45) = 32.8 P < 0.001, S: F(1,48) = 57.9 P < 0.001) and faster movements (C: F(1,45) = 39.3 P < 0.001, S: F(1,48) = 14.7 P < 0.001), while punishment-predictive cues were often avoided by slowly selecting the contralateral target or by making errors (only significant for monkey S: F(1,48) = 26.7 P < 0.001). Except for an effect on error rates for monkey C (Fig. S2, F(1,45) = 6.24 P = 0.013), no consistent behavioral changes across certainty levels were detected, suggesting that certainty did not interact with how animals experienced the valence assigned to task events.

Fluoxetine effects on behavior

Fluoxetine administration improved the animals’ performance in both appetitive and aversive conditions (Fig. 1D). They were more willing to work, as evidenced by a 52% increase in the number of trials completed per session (two-tailed t-test; C: t(45) = 1.64 P = 0.049, S: t(48) = 7.08 P < 0.001) and faster approach responses for reward (RT; C: F(1,45) = 4.02 P = 0.046, S: F(1,48) = 35.56 P < 0.001; and MD; C: F(1,45) = 6.36 P = 0.013, S: F(1,48) = 4.92 P = 0.039). In aversive trials, animals actively avoided punishment more often (C: F(1,45) = 6.35 P = 0.013, S: F(1,48) = 9.86 P = 0.002) while making fewer errors (C: F(1,45) = 4.15 P = 0.044, S: F(1,48) = 7.18 P = 0.008) with fluoxetine, again reflecting a drug effect on motivated decision making and increased task engagement.

Fluoxetine binding in the primate ventral striatum

To test whether fluoxetine can act directly in the ventral striatum of our monkeys, we compared [11C]-DASB binding from PET acquisitions with and without drug administration (Fig. 2A). We found a strong reduction of BPND values in the ventral striatum when fluoxetine was co-administered (n = 8 hemispheres; Mann–Whitney U-test, U = 64 P < 0.001). This finding is consistent with the drug binding to SERTs within the ventral striatum. Fluoxetine also showed strong binding in other brain regions, such as the thalamus and the amygdala (Fig. S3). Still, it is possible that SSRIs act directly on striatal activity. To test this possibility, we next measured neuronal activity in the ventral striatum with and without fluoxetine administration.
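The reported statistic can be read as follows. With eight hemisphere values per condition, U = 64 equals 8 × 8, the largest value the statistic can take, which means that the control and fluoxetine values did not overlap. The short sketch below computes the exact two-sided P value for that situation; it assumes eight independent values per condition and no ties, and the function name is hypothetical.

```python
from math import comb

def exact_p_complete_separation(n_control, n_test):
    """Exact two-sided Mann-Whitney P value when the two samples do not overlap."""
    # Only one of the C(n1 + n2, n1) equally likely rank assignments places all
    # of one group above the other; the factor 2 covers both directions.
    return 2 / comb(n_control + n_test, n_control)

n_control = n_test = 8        # hemispheres per condition (4 animals x 2)
u_max = n_control * n_test    # largest possible U statistic
p_exact = exact_p_complete_separation(n_control, n_test)
print(u_max, p_exact)
```

The result, about 0.00016, is in line with the reported P < 0.001.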

Fig. 2: Fluoxetine effects in the ventral striatum.

A Population-averaged [11C]-DASB PET images superimposed on an MRI template (n = 4 animals). The non-displaceable binding potential (BPND) of [11C]-DASB with (right) and without (control, left) fluoxetine co-injection was calculated for each hemisphere (n = 4 × 2). The ventral striatum (VS) and the dorsal striatum (DS) are delineated by white lines. Fluoxetine strongly reduced BPND in the ventral striatum (as summarized by the histogram; Mann–Whitney U-test, ***P < 0.001). Only monkeys C and S were used for electrophysiological recordings. B Population-averaged activities of task-related neurons in the ventral striatum before (left column) and during fluoxetine administration (right column). The spike density functions were aligned to the onset of cues (conditioned stimuli) and outcomes (unconditioned stimuli). The shaded area of the spike density function line indicates the population SEM. C Fraction of neurons showing a change in activity to different task parameters such as valence, location and certainty (P < 0.05, corrected for 150 time bins). Interactions are not plotted here (see Table S1). The fraction of neurons encoding valence around the time of outcome was larger during fluoxetine administration (χ2 test, *P < 0.05). D Population averages (±SEM) of the coefficient of partial determination for the same regressors. The proportion of variance accounted for by valence increased with fluoxetine (two-tailed t-test, *P < 0.05).

Fluoxetine effects on neurons in the ventral striatum

While the monkeys performed the task, we recorded single-unit activity from 274 SPNs without fluoxetine and 202 SPNs with fluoxetine (Table 1). The mean spontaneous firing rate of striatal neurons increased by 54% with drug administration (two-tailed t-test, t(474) = 3.9 P < 0.001; Fig. 2B). We found no consistent drug effect on burstiness (t(474) = 0.97 P = 0.33) or on the prevalence of cells with phasic task-related activity (Chi-square test, χ2(1,476) = 0.42 P = 0.51; Table 1). However, fluoxetine changed how SPNs encoded the acquired valence of distinct task events.

Table 1 Effects of fluoxetine on striatal neurons.

Especially around the time of the outcomes when animals received rewards or punishments by approaching targets (i.e., not by avoiding), the neurons encoded valence more strongly as evidenced by the fraction of cells showing significant regressions (χ2(1,199) = 6.58 P = 0.01; Fig. 2C, Table S1) and the proportion of variance accounted for (CPD; t(197) = 2.49 P = 0.01; Fig. 2D). Neuronal effects were not as consistent during cue presentation (Fig. 2C: χ2(1,199) = 0.30 P = 0.58; Fig. 2D: t(197) = 2.05 P = 0.04), suggesting that fluoxetine mainly altered the responses of striatal neurons to unconditioned stimuli, i.e., reward and punishment.

Furthermore, we found a limited effect of fluoxetine on the neuronal encoding of non-valence information. While the striatal encoding of the location of the chosen target remained unchanged during treatment (χ2(1,199) = 1.26 P = 0.261; Table S1), the fraction of cells showing significant regressions with certainty levels was increased by fluoxetine (χ2(1,199) = 4.52 P = 0.033; Table S1). However, no change in the proportion of variance accounted for by certainty (CPD, t(197) < 1.71 P > 0.05; Fig. 2D) was detected to confirm this drug effect. Together, our data suggest that fluoxetine primarily impacted the affective processes in the ventral striatum.

Two types of neurons in the ventral striatum

We identified two subsets of valence-encoding neurons based on the polarity of the regression coefficients β1 during evoked responses (P < 0.05, corrected for 150 time bins; Fig. 3). Some neurons preferentially responded to positive events in appetitive trials (P-type cells; Fig. 3A), while others primarily responded to negative events in aversive trials (N-type cells). N-type cells were selectively excited by the airpuff itself and showed no response when the monkeys avoided it (Fig. S4), confirming that N-type neurons did not simply encode the absence of reward. We observed no difference in the topographic organizations of different subtypes of cells in the ventral striatum (Fig. 3B), suggesting that the encoding of positive and negative valence was supported by intermixed striatal territories. In terms of populations, both types of neurons were similarly prevalent at the time of conditioned stimuli in the control condition (responses to Cue: P-type n = 14; N-type n = 18; Fig. 4 left). However, at the time of unconditioned stimuli, N-type neurons were more prevalent (73% of valence-discriminating neurons; responses to Outcome: P-type n = 15; N-type n = 40). Thus, before drug administration, more ventral striatum neurons encoded punishments than rewards.

Fig. 3: Two types of valence-discriminating neurons.

A Activity of two exemplar neurons classified as P-type cells (left) or N-type cells (right). Spike density functions and raster plots around the times of cues and outcomes. A sliding window regression analysis compared firing rates between appetitive and aversive trials. The regression coefficients (gray line) were used to characterize the encoding of the valence assigned to task events: positive β1 values reflected preferential encoding of positive events, while negative β1 values reflected preferential encoding of negative ones. The bottom row shows P values calculated from the regression analysis. The horizontal dashed line indicates the statistical threshold (P < 0.05 corrected for 150 time bins). B Topography of cell types in the ventral striatum in a coronal plane. No differences were found in the locations of neuronal subtypes.

Fig. 4: Fluoxetine effects on valence-encoding cells.

Population-averaged activities of A P-type neurons and B N-type neurons aligned to the times of cues and outcomes. Spike density functions were grouped according to the response pattern evoked during events: increase or decrease in β1 values. No effects of fluoxetine on population-averaged β1 values were detected (two-tailed t-test, P > 0.05). The width of the lines indicates the population SEM. N refers to the number of cells in each population. C The ratio of P-type cells to N-type cells was affected by fluoxetine around the time of the trial outcome (χ2 test, **P < 0.01). During fluoxetine administration, the striatal neurons processed punishments less often and rewards more often than during the control condition.

Fluoxetine effects on valence-encoding

While we found no effects of fluoxetine on the strength of valence encoding when comparing population-averaged β1 values between drug conditions (t < 1.8 P > 0.05; Fig. 4A, B), the drug modified the prevalence of neurons selectively activated by rewards or punishments (Fig. 4C). Specifically, fluoxetine increased the number of P-type cells and decreased the number of N-type cells at the time of outcomes (χ2(1,109) = 7.9 P = 0.004; P-type n = 29; N-type n = 25). These data suggest that fluoxetine enhances the relative processing of positive versus negative affective information in the ventral striatum by rebalancing the fractions of valence-discriminating neurons in favor of rewards over punishments. Fluoxetine effects on valence encoding appeared consistent over time, with no difference between the first and the second 2 weeks of treatment (χ2(1,54) = 0.74 P = 0.39; Table S2). The analysis of individual animals confirmed a drug effect on valence-encoding neurons by showing consistent increases in the number of cells responding to rewards versus punishments (Figs. S5 and S6). However, due to smaller samples of cells analyzed per animal, only monkey S had a significant amplification of P-type cells relative to N-type cells (χ2(1,51) = 4.58 P = 0.032). Notably, monkey S was also the animal with the strongest drug effects on task performance, reflecting a higher sensitivity to fluoxetine.
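The outcome-period comparison can be checked directly from the cell counts given in the text (control: 15 P-type and 40 N-type cells; fluoxetine: 29 P-type and 25 N-type cells). The sketch below computes a Pearson χ2 statistic for this 2 × 2 table. The absence of a continuity correction is an assumption; it is the choice that reproduces the reported value of about 7.9. The function name is hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

# Rows: control, fluoxetine. Columns: P-type, N-type (counts at outcome).
chi2, p = chi_square_2x2(15, 40, 29, 25)
share_n_type_control = 40 / (15 + 40)
print(round(chi2, 2), round(share_n_type_control, 2))
```

The same counts give the 73% share of N-type cells reported for the control condition.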

Discussion

Our findings reveal that two subsets of neurons in the primate ventral striatum encode positive and negative information, and that repeated administration of an SSRI retunes the ability of these neurons to selectively encode emotional valence. The beneficial effects of the SSRI on monkeys’ performance in approaching reward and avoiding punishment were coupled with changes in both tonic and phasic activities of striatal neurons. In addition to a general increase of the spontaneous firing rate (Fig. 2B), we found that fluoxetine potentiated positively valenced signals and attenuated negatively valenced signals by rebalancing the fractions of neurons that encoded either reward or punishment (Fig. 4). By contrast, fluoxetine administration did not reliably alter the encoding of non-valence information such as the certainty levels and the location of targets. Together with PET scans showing that the major binding regions of fluoxetine include the ventral striatum (Fig. 2A), our results are consistent with the proposal that the ventral striatum plays an active role in SSRI actions on the processing of valenced information. Thus, the ventral striatum may support the early effects of SSRIs on emotion processing biases that are commonly observed in psychiatric disorders with affective impairments [7,8,9,10].

The original idea that SSRIs act directly by modulating the brain’s emotional valence circuits arose primarily from psychological studies in which the relative processing of positive versus negative affective information was investigated in patients with various mood disorders [47,48,49,50]. Experimental evidence suggests that depressed and anxious states are characterized by a tendency to perceive social events as more negative, while disregarding positive information [5, 51]. Such negative affective bias has been related to increased risk of relapse [52] and psychological models suggest that it fuels recurring harmful thoughts in patients [53]. Despite inconsistent clinical findings [54, 55], repeated and even single administrations of SSRIs appear to normalize these affective impairments by producing positive shifts in the processing of emotionally valenced information [7,8,9,10,11,12]. For example, antidepressants increase the relative recognition of positive over negative social cues in patients and healthy individuals performing a facial expression recognition task [7, 9]. At a neural level, clinical improvements are tightly coupled with valence-specific effects distributed across diverse brain regions [20]. SSRIs affect the medial prefrontal and core limbic parts of the emotional network (including the amygdala, the anterior cingulate, the insula, the striatum and the thalamus), by primarily decreasing their activity related to negative emotions, but occasionally also by increasing their activity related to positive emotions [8, 16, 56,57,58,59]. Depending on patients’ symptoms and the method used (e.g., treatment, task), neuroimaging studies report inconsistent SSRI effects in the ventral striatum. For instance, it is unclear whether SSRIs alter emotional states by simultaneously improving the processing of both positive and negative affective information in the striatum.
Some studies show a normalization in BOLD responses by producing positive shifts in all types of emotion [60, 61], while others describe SSRI-mediated decreases in emotional regulation of positive valence that could account for the experience of emotional blunting described by some patients during SSRI treatment [62]. Nonetheless, depressed patients who exhibit the largest SSRI-induced effect on positive emotions are those who exhibit the greatest change in ventral striatum activity [63, 64], making this brain region a powerful marker for SSRI efficacy on anhedonia or depressed mood.

Consistent with the majority of neuroimaging results, we find a positive shift in the processing of emotionally valenced information when an SSRI was administered to healthy monkeys at therapeutic doses [37, 38]. In terms of the number of neurons recruited, fluoxetine retuned the relative encoding of positive versus negative affective information in the ventral striatum (Fig. 4). In addition to informing neuroimaging studies, our neurophysiological data shed light on the neural mechanisms that may underlie SSRI effects. However, our methods cannot determine whether these striatal changes are causally related to the behavioral improvement. The monkey with the largest effects on valence-encoding neurons (monkey S) was also the animal with the strongest drug-induced behavioral effects, and two explanations are compatible with this observation. The effects of fluoxetine on striatal activity could lead more or less directly to behavioral changes or, alternatively, better task performance, with more rewards collected, could itself result in changes in neuronal processes. Further research is needed to determine whether fluoxetine acting within the ventral striatum truly drives the beneficial actions of the drug on the animals’ behavior. One approach would be to test the drug action by injecting fluoxetine focally into the ventral striatum in animal models. While we showed that fluoxetine binds within the ventral striatum at a therapeutic dose (Fig. 2), indirect actions on striatal activity from other brain regions cannot yet be excluded.

Our findings suggest that the primate ventral striatum plays an active role in the processing of both positive (appetitive) and negative (aversive) information. Limbic circuits of the basal ganglia have traditionally been associated with the control of goal-directed actions toward positive events, with a key role in motivational drive and reward learning [30]. However, growing evidence suggests that these circuits also participate in avoidance behaviors and aversive processing [24, 25, 36, 65,66,67]. Thus, a dysregulation of opposite valence signaling within this network appears to be a possible contributor to several psychiatric illnesses [68,69,70]. Our data concur, and the finding that an SSRI retunes emotional processing indicates that the ventral striatum may be an interesting target for treating affective symptoms in both depression and anxiety.

Our results point in the same direction as the findings obtained in depressed and anxious patients [8, 16, 56,57,58,59]. However, our monkey model focuses on different pathophysiological underpinnings of emotional disorders, our results were collected from only two monkeys without a control group, and our SSRI administrations were not equivalent to chronic medication. These differences may explain relative differences in the beneficial effects of SSRIs. Despite these limitations, our study describes how SSRI treatment refines the encoding of emotional valence in the ventral striatum, a brain region that is involved in the control of motivated and anxiety-related behaviors [32,33,34].

Funding and disclosure

This work and BP were supported by the French National Agency of Research (ANR-11-LABX-0042 and ANR-11-IDEX-0007). GD, YS, PNT, and LT were supported by the Swiss National Science Foundation (CRSII3-141965). AR and MM were respectively supported by the “Fondation pour la Recherche Médicale” (DEQ20110421326) and the ANR (ANR-11-LABX-0042). The authors declare no competing interests.