Introduction

The ability to learn from social outcomes is crucial for successful interpersonal interactions. We have previously shown that impaired social learning is associated with diminished social engagement motivation and more frequent experiences of negative interpersonal encounters in everyday life [1, 2]. These findings are particularly relevant to the understanding of social impairments in major depressive disorder, as depressed individuals demonstrate reduced learning from social feedback, as well as altered neural encoding of social learning signals [1, 2].

In order to identify potential treatment targets for social learning deficits in depression, it is important to determine which neurotransmitters may contribute to these impairments. Previous research points to a potential involvement of dopamine (DA) or serotonin (5-HT), as these neurotransmitters have been implicated in the psychopathology of depression [3, 4], social processing [5,6,7], and non-social learning [8,9,10,11].

While studies using DA or 5-HT manipulations in combination with social learning paradigms are lacking, there is extensive research on the effects of these neurotransmitters on learning from non-social outcomes. For instance, behavioral studies have found that lowering DA functioning impairs reward and enhances punishment learning [12,13,14,15,16], whereas increasing DA levels has the opposite effect [17,18,19,20,21,22]. Moreover, reducing 5-HT functioning has been shown to diminish both reward and punishment learning [23,24,25,26], although in some paradigms heightened punishment learning has been observed after 5-HT depletion [27, 28].

On a mechanistic level, it has been suggested that DA and 5-HT neurons contribute to the learning process by propagating learning signals. In particular, it is thought that DA neuron firing represents reward predictions and prediction errors (PEs; indicating the discrepancy between predicted and actual rewards), whereas 5-HT neuron firing may encode punishment PEs [10, 29, 30]. These mechanisms have been formalized by computational models which, in turn, have been utilized to inform fMRI analyses in humans. Using this approach, it has been shown that increased DA levels are associated with enhanced reward prediction representations in the ventromedial prefrontal cortex (PFC), as well as with heightened reward PE signals in the striatum [17, 19, 31]. By contrast, reducing DA functioning has been found to diminish prediction responses in the caudate, thalamus, and midbrain, and to attenuate PE encoding in the caudate, thalamus, and amygdala [13, 32].

In addition, lowering 5-HT levels has been reported to decrease reward prediction representations in the dorsolateral and ventromedial PFC, anterior cingulate cortex (ACC), insula, and precuneus [25, 32], while also diminishing punishment prediction encoding in the orbitofrontal cortex and amygdala [33]. Moreover, reduced 5-HT functioning has been associated with attenuated reward PE encoding in ACC, putamen, and hippocampus [25, 34].

The above findings demonstrate that DA and 5-HT are involved in behavioral and neural learning processes when non-social outcomes are involved. However, it is less clear what role these neurotransmitters play during social learning. The current study aimed to examine this question by lowering DA or 5-HT levels in healthy volunteers through acute tyrosine/phenylalanine or tryptophan depletion, respectively. After consumption of the depletion drink (or a placebo), participants performed a social learning task in the MRI scanner during which they learned and rated associations between name cues and rewarding (happy faces) or aversive (fearful faces) social outcomes. Computational modeling was applied to the data to assess depletion effects on the neural representation of social learning signals. It was hypothesized that both depletion manipulations would impair social reward learning, as indicated by less accurate ratings in the task and reduced encoding of neural learning signals, while social aversion learning may be enhanced after DA depletion and reduced after 5-HT depletion.

Methods and materials

Participants

Seventy right-handed, healthy individuals between the ages of 18 and 45 years took part in the current study. Volunteers were screened with the Structured Clinical Interview for DSM-IV (SCID; [35]), and answered several questions about their medical history. Subjects were ineligible if they had a history of any DSM Axis I disorder, a significant current or past medical condition, or any contraindications to MRI scanning. Further exclusion criteria were the current use of any medications besides contraceptives, the use of any psychotropic medications or recreational drugs within the past 3 months, or smoking more than five cigarettes per week.

In a double-blind design, eligible participants were randomly allocated to the DA depletion (N = 24), 5-HT depletion (N = 24), or placebo (N = 22) group. These sample sizes are comparable to other learning-related depletion studies which observed group effects. A between-subject design was chosen because it was expected that the unpleasant taste of the depletion drink and the required time commitment for the testing session (9 a.m. to 5 p.m.) would have resulted in large numbers of drop-outs if each participant had been required to attend three testing sessions. In addition, practice effects in the task would likely have occurred in a cross-over design.

The study was approved by the University of Reading Ethics Committee (UREC 15/61) and all subjects provided written informed consent.

Amino acid depletion drink

The relative amino acid amounts for the depletion drinks were based on previous 5-HT [36] and DA [37] depletion studies. However, to reduce the experience of side effects, the absolute amounts were adjusted to each participant’s body weight (which has been shown to lead to a reliable depletion effect with a slightly different mixture; see Dingerkus et al. [38]).

Specifically, the placebo drink contained the following amounts for a subject weighing 83.6 kg (i.e. the average male weight in the UK), which were adjusted proportionally for lower or higher body weights: l-alanine, 4.1 g; l-arginine, 3.7 g; l-cystine, 2.0 g; glycine, 2.4 g; l-histidine, 2.4 g; l-isoleucine, 6 g; l-leucine, 10.1 g; l-lysine, 6.7 g; l-methionine, 2.3 g; l-proline, 9.2 g; l-phenylalanine, 4.3 g; l-serine, 5.2 g; l-valine, 6.7 g; l-threonine, 4.9 g; l-tyrosine, 5.2 g; and l-tryptophan, 3.0 g.

The 5-HT and DA depletion mixtures were identical to that of the placebo drink, except that they did not contain tryptophan or tyrosine and phenylalanine, respectively. All drinks were prepared by stirring the amino acids and a pinch of salt (to neutralize the bitter taste) into 120 mL of tap water, 30 mL of caramel syrup, and a tablespoon of oil (with liquid quantities being adjusted to the amino acid amounts).
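The proportional weight adjustment and the omissions described above can be illustrated with a minimal Python sketch. The reference amounts are those listed in the text; the function name `scaled_mixture`, the group labels, and the rounding to two decimal places are illustrative assumptions and not part of the study protocol.

```python
# Reference amounts (grams) for the placebo drink at the 83.6 kg reference
# weight, as listed in the text.
REFERENCE_WEIGHT_KG = 83.6
PLACEBO_REFERENCE_G = {
    "l-alanine": 4.1, "l-arginine": 3.7, "l-cystine": 2.0, "glycine": 2.4,
    "l-histidine": 2.4, "l-isoleucine": 6.0, "l-leucine": 10.1,
    "l-lysine": 6.7, "l-methionine": 2.3, "l-proline": 9.2,
    "l-phenylalanine": 4.3, "l-serine": 5.2, "l-valine": 6.7,
    "l-threonine": 4.9, "l-tyrosine": 5.2, "l-tryptophan": 3.0,
}

# Amino acids left out of each mixture (group labels are hypothetical keys).
OMITTED = {
    "placebo": set(),
    "5-HT depletion": {"l-tryptophan"},
    "DA depletion": {"l-tyrosine", "l-phenylalanine"},
}


def scaled_mixture(body_weight_kg, drink="placebo"):
    """Scale every amount linearly with body weight and drop omitted acids."""
    factor = body_weight_kg / REFERENCE_WEIGHT_KG
    return {
        name: round(grams * factor, 2)  # rounding is an assumption
        for name, grams in PLACEBO_REFERENCE_G.items()
        if name not in OMITTED[drink]
    }
```

For example, `scaled_mixture(60.0, "DA depletion")` would return the 14 remaining amino acids, each at roughly 72% of the reference amount.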

General procedure

After an initial screening visit, eligible participants were sent online versions of the Beck Depression Inventory (BDI; Beck et al. [39]) and a demographics form to complete at home. Subjects were then invited to attend the testing session. They were asked not to consume any food or drinks besides water after 10 p.m. on the previous day, and to arrive at the study location at 9 a.m. on the testing day. At this point, participants completed the Positive and Negative Affect Scale (PANAS; Watson et al. [40]) and gave a blood sample which was used to assess baseline amino acid levels. Subsequently, subjects consumed one of the three depletion drinks and were given a protein-free breakfast bar. During the following 3.5 h, participants occupied themselves in a waiting room, with lunch (protein-free pasta and tomato sauce) provided at 12 noon. This waiting period was chosen to ensure that the MRI scan took place 5 h after the consumption of the depletion drink, which is when the maximum depletion effect has been shown to occur [41].

After the waiting period, subjects filled in the PANAS and a side effects questionnaire. Subsequently, they completed a name learning test (see supplement) and the practice trials of the social learning task. Additionally, a second blood sample was collected which was used to assess whether relevant amino acid levels had been successfully depleted (see supplement). Participants then performed the experimental trials of the social learning task in the MRI scanner, and, after the scan, completed a task feedback and drink guess questionnaire (see Fig. 1).

Fig. 1: Flow chart of the study procedure.

Left panel describes the screening procedure and the right panel the testing procedure.

Social learning task

Participants’ aim during the task was to learn associations between name cues and happy, neutral, or fearful facial expressions. The task consisted of 48 practice and 72 experimental trials, which were divided into social reward and aversion blocks. The blocks were performed in counterbalanced order and three name-face (identity) pairings were randomly allocated to each block. On each trial, participants were presented with a name cue and a rating scale (see below), followed by the face that had been paired with the name (see Fig. 2). In the social reward block, each face had a different likelihood (25%, 50%, or 75%) of displaying a happy rather than a neutral expression. Similarly, in the social aversion block, each face had a different likelihood (25%, 50%, or 75%) of showing a fearful rather than a neutral expression. Participants were asked to learn how likely it was that a given name was associated with an emotional (rather than a neutral) expression and to indicate this likelihood on a visual analog scale (ranging from 0% to 100%) on each trial before being shown the face. Subjects were instructed to start with a guess and to subsequently base their ratings on the intuition they gained from all the times they had seen the name-face pairing before.
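To make the contingency structure concrete, the following Python sketch generates one hypothetical experimental block. It assumes that the 72 experimental trials were split evenly across the two blocks and the three name-face pairings (12 trials per pairing), and that outcomes were drawn independently on each trial; neither detail is stated in the text, so the trial counts, the sampling scheme, and the function name `simulate_block` are assumptions for illustration only.

```python
import random


def simulate_block(probabilities=(0.25, 0.50, 0.75), trials_per_cue=12, seed=0):
    """Return a shuffled list of (cue_index, probability, emotional_outcome).

    emotional_outcome is 1 for an emotional face (happy or fearful, depending
    on the block) and 0 for a neutral face.
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    trials = []
    for cue, p in enumerate(probabilities):
        for _ in range(trials_per_cue):
            emotional = 1 if rng.random() < p else 0
            trials.append((cue, p, emotional))
    rng.shuffle(trials)  # interleave the three name cues
    return trials
```

A well-calibrated participant would end the block with ratings near 25%, 50%, and 75% for the three cues, which is the pattern against which the group differences in ratings are interpreted.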

Fig. 2: Social Learning Task.

The figure shows an example timeline of events, with durations given in milliseconds (ms).

Analysis

Behavioral analysis

Where normality assumptions were met, measures were analyzed using one-way ANOVAs. Otherwise Kruskal–Wallis H tests were used. Additionally, relations between categorical variables were assessed using χ2 tests.

Box-and-whisker plots were used to visually detect outliers in all data before unblinding of the groups. This procedure revealed several clear outliers in the learning task likelihood ratings (but not in the other data). Therefore, values outside ±2 standard deviations of the mean were removed from the learning task rating data (removed: 5-HT depletion, N = 3; placebo, N = 3; DA depletion, N = 4). Subsequently, a group × valence × probability mixed-measure ANOVA was conducted, and interactions were followed up with one-way ANOVAs. As the sphericity assumption was violated for the probability factor, Greenhouse–Geisser corrected results are reported for the associated effects.
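The ±2 standard deviation exclusion rule can be sketched as follows. This is an illustration rather than the analysis code used in the study: the use of the sample standard deviation and the application to a single list of values are assumptions, since the text does not specify whether the criterion was applied per group or per condition.

```python
from statistics import mean, stdev


def remove_outliers(values, n_sd=2.0):
    """Keep only values within mean +/- n_sd sample standard deviations."""
    m = mean(values)
    sd = stdev(values)  # sample SD (n - 1 denominator); an assumption
    lower, upper = m - n_sd * sd, m + n_sd * sd
    return [v for v in values if lower <= v <= upper]
```

With the hypothetical ratings `[50, 52, 48, 51, 49, 50, 47, 53, 95]`, the mean is 55 and the sample standard deviation is about 15.1, so only the value 95 falls outside the retained range.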

Computational modeling

A Rescorla–Wagner model [42] was fit to the data by minimizing the sum of squared errors between participants’ likelihood ratings and the model prediction value (multiplied by 100; similar to Hindi Attar et al. [33]) using the fmincon function in MATLAB. The model included a learning rate (α) and a decay (γ) parameter, the latter of which accounted for potential forgetting of the contingencies between the practice and experimental trials (see supplement for details). Group differences in the model fit and parameters were assessed using Kruskal–Wallis H tests.
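The fitting logic can be illustrated with a simplified Python sketch. It is not the authors' implementation: a grid search over the learning rate stands in for MATLAB's fmincon, the decay parameter is omitted because its definition is given only in the supplement, and the initial prediction value of 0.5 is an assumption. All function names are hypothetical.

```python
def rw_predictions(outcomes, alpha, v0=0.5):
    """Return the prediction value held before each trial's outcome."""
    v = v0  # assumed starting prediction (not specified in the text)
    predictions = []
    for outcome in outcomes:
        predictions.append(v)
        v += alpha * (outcome - v)  # Rescorla-Wagner update
    return predictions


def sum_squared_error(ratings, outcomes, alpha, v0=0.5):
    """SSE between 0-100 ratings and model predictions scaled by 100."""
    preds = rw_predictions(outcomes, alpha, v0)
    return sum((r - 100.0 * p) ** 2 for r, p in zip(ratings, preds))


def fit_alpha(ratings, outcomes, v0=0.5, steps=1000):
    """Grid search for the learning rate in [0, 1] that minimizes the SSE."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda a: sum_squared_error(ratings, outcomes, a, v0))
```

Here `outcomes` codes emotional faces as 1 and neutral faces as 0 for a single name cue, and `ratings` holds the corresponding likelihood ratings on the 0 to 100 scale.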

It should be noted that extensive model fitting, comparison, and validation were not performed because the main purpose of the modeling approach was to assess the neural encoding of learning signals. A previous systematic exploration of the effects of model parameter values on fMRI results has shown that parametric modulation results for prediction and prediction error values do not differ substantially as model parameters are varied, rendering precise model fitting unnecessary for model-based fMRI analyses [43]. Given that no extensive model fitting was performed, we refrain from drawing conclusions about the behavioral performance from the model parameters and rely on the raw data for such inferences instead.

fMRI acquisition and analysis

Functional MRI images were acquired using a three-Tesla Siemens scanner (Siemens AG, Erlangen, Germany) and analyzed with Statistical Parametric Mapping software (SPM12; http://www.fil.ion.ucl.ac.uk/spm; see supplement for details).

Neural prediction encoding was assessed by entering computational modeling-derived prediction values into the first-level fMRI analysis as parametric modulators at the time of the cue (as two separate regressors for social reward and aversion blocks). On the second level, whole-brain one-way ANOVAs were performed to assess group effects (placebo vs. DA depletion, placebo vs. 5-HT depletion, and DA vs. 5-HT depletion). Reported results were thresholded at p < 0.005 (uncorrected) on the voxel level and are family-wise error corrected at the cluster level.

Additionally, to examine prediction error (PE) encoding, the two PE components (i.e. inverse predictions and outcome values) were used as parametric modulators at the time of the face presentation in the first-level analysis (separately for social reward and aversion blocks). Subsequently, MarsBaR (Brett, Anton, Valabregue, & Poline, 2002) was used to extract average parameter estimates for the two components from a 6-mm sphere around striatal coordinates that have been found to encode PEs in a previous meta-analysis (left ROI: −10, 8, −6; right ROI: 10, 8, −10; Chase et al. [44]). The extracted values were then compared between groups by conducting one-way ANOVAs.
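The decomposition of the PE into its two components can be sketched as follows, assuming that "inverse predictions" denotes the negated prediction values, so that the PE on each trial is the sum of the outcome value and the inverse prediction. The function name `pe_components` is hypothetical.

```python
def pe_components(outcomes, predictions):
    """Split prediction errors into outcome and inverse-prediction parts.

    Returns (outcome_values, inverse_predictions, prediction_errors), where
    prediction_errors[t] = outcome_values[t] + inverse_predictions[t].
    """
    outcome_values = list(outcomes)
    inverse_predictions = [-p for p in predictions]  # assumed meaning of "inverse"
    prediction_errors = [
        o + ip for o, ip in zip(outcome_values, inverse_predictions)
    ]
    return outcome_values, inverse_predictions, prediction_errors
```

Entering the two components as separate modulators, rather than the PE itself, allows each to be tested on its own: a region that encodes PEs would be expected to covary with both components.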

Results

Behavioral results

Questionnaires and demographic measures

Demographic and questionnaire measures are shown in Table 1. No significant group differences were observed in the change of pre- to post-depletion PANAS ratings on the positive (F(2, 66) = 1.38, p = 0.260) or negative (F(2, 66) = 0.57, p = 0.567) affect subscale. Chi-square tests demonstrated a marginally significant relationship between the depletion groups and drink guesses (χ2(2) = 9.23, p = 0.056). It should, however, be noted that this was not due to the number of correct guesses (which was below 37% in each group), so this finding is likely spurious. The association between group and side effect reporting could not be assessed with a chi-square test, because the assumption that <20% of the cells have expected counts of below 5 was not met. However, as can be seen from Table 1, numerically the count of individuals reporting side effects did not differ substantially between the groups.

Table 1 Questionnaire and demographic measures by group.

The remaining demographic and baseline measures were not statistically compared between groups, as statistical tests to assess whether baseline group differences are due to chance are not appropriate in randomized trials in which such differences are known to occur by chance (see CONSORT guidelines).

Social learning task performance

As expected, the mixed-measure ANOVA (group × valence × probability) of participants’ likelihood ratings revealed a significant main effect of probability (F(1.36, 77.65) = 209.71, p < 0.001), as participants made higher likelihood ratings when the probability of an emotional outcome was greater. Additionally, significant valence by probability (F(1.92, 109.45) = 3.35, p = 0.040), group by probability (F(2.73, 77.65) = 4.42, p = 0.008), and group by valence by probability (F(3.84, 109.45) = 3.72, p = 0.008) interactions were observed.

Follow-up one-way ANOVAs showed significant group differences in the 75% (F(2, 57) = 4.81, p = 0.012), 50% (F(2, 57) = 3.29, p = 0.044), and 25% (F(2, 57) = 7.03, p = 0.002) social reward conditions, with no group effect in any of the social aversion conditions (all F < 2.65). Bonferroni corrected post-hoc tests indicated that, compared to placebo, 5-HT depleted subjects made significantly lower likelihood ratings on trials with a 75% chance of displaying a happy expression (p = 0.010), but made significantly higher ratings on trials with a 25% chance of presenting a happy face (p = 0.002). Moreover, DA depleted participants made significantly higher ratings than placebo controls on trials with a 25% chance of displaying a happy face (p = 0.040), as well as significantly higher ratings than 5-HT depleted individuals on trials with a 50% chance of presenting a happy expression (p = 0.045). These findings indicate that the depletion manipulation, especially 5-HT depletion, impaired social reward learning, seemingly leading to increased uncertainty about what social outcomes to expect (as indicated by ratings close to 50% across all outcome probabilities; see Fig. 3 below and uncertainty score analysis in the supplement).

Fig. 3: Likelihood ratings by group and probability.

a Social reward trials. b Social aversion trials. * indicates p < 0.05.

Computational modeling

There were no significant group differences in the learning rate (social reward block: H(2) = 1.89, p = 0.389; social aversion block: H(2) = 0.80, p = 0.672), or decay (reward block: H(2) = 3.37, p = 0.185; aversion block: H(2) = 1.56, p = 0.459) parameters. Similarly, no significant group effects were observed for the model fit, as indicated by mean squared errors, when using individual (reward block: H(2) = 2.77, p = 0.250; aversion block: H(2) = 1.14, p = 0.565) or averaged (reward block: H(2) = 2.35, p = 0.309; aversion block: H(2) = 1.81, p = 0.406) parameters.

fMRI results

Neural prediction value encoding

Compared to placebo controls, 5-HT depleted subjects displayed significantly decreased social reward prediction encoding, as indicated by a reduced covariation between computational modeling-derived prediction values and BOLD responses in the parametric modulation analysis. This group effect was seen in the dorsal anterior cingulate cortex (ACC)/dorsomedial prefrontal cortex (PFC), premotor cortex/dorsolateral PFC, bilateral temporal lobe/fusiform gyrus, and in the right insula. Moreover, DA depleted individuals demonstrated significantly reduced social reward prediction representations in the dorsal ACC and dorsomedial PFC/pre-supplementary motor area compared to controls (see Fig. 4 below and Table S1 in the supplement). Contrasts between the depletion groups did not reveal any significant clusters.

Fig. 4: Neural prediction value encoding.

Clusters showing lower social reward prediction encoding in 5-HT depleted (a, b) or DA depleted (c) subjects than in placebo controls, as well as parameter estimates extracted from the peak voxel of the group contrasts in the insula (a) and the dorsal anterior cingulate cortex (dACC; b, c).

Additionally, in the social aversion condition, 5-HT depleted participants demonstrated stronger prediction signals than placebo controls and DA depleted individuals in the thalamus and precentral gyrus, respectively (see Table S1 in the supplement). All other contrasts yielded no significant clusters.

Neural prediction error encoding

One-way ANOVAs were conducted on the average parameter estimates extracted from the striatal regions of interest for the encoding of outcome and inverse prediction values (i.e. the two prediction error components). This analysis revealed no significant group differences for either the social reward or the social aversion block (all F < 0.8).

Discussion

Effects of 5-HT depletion on social learning

The present study aimed to examine the effects of 5-HT and DA depletion on learning from social outcomes. The behavioral findings revealed that 5-HT depletion impaired participants’ ability to learn from social rewards, giving rise to heightened uncertainty about what social outcomes to expect. These results are in line with previous reports of decreased non-social learning after reductions in 5-HT functioning [23,24,25,26]. Interestingly, using the same task, we previously observed very similar results in individuals with high depression scores [2], which suggests that low levels of 5-HT may contribute to social learning deficits in depression.

Moreover, consistent with the behavioral findings, 5-HT depletion also affected neural learning signals. Specifically, 5-HT depleted subjects demonstrated altered social reward prediction encoding in the dorsal ACC, PFC, insula, and temporal lobe. These observations are in keeping with previous reports of reduced reward prediction signals in the ACC, PFC, and insula following lowered 5-HT functioning [25, 32, 34].

The engagement of the insula and temporal lobe during the prediction phase of our task may have been due to the role of these regions in the working memory maintenance of faces [45], which may have aided the learning process. Moreover, the dorsal ACC may have contributed to cue value computations [46, 47], while the dorsolateral PFC may have directed attentional resources toward cues that were particularly salient due to their association with happy faces [48].

At first sight, this may suggest that the altered prediction encoding in 5-HT depleted subjects in the above regions may be linked to reduced attentional and working memory processing. However, it should be noted that 5-HT depletion did not merely lower, but instead reversed, the neural prediction signals in the abovementioned areas (see supplement). This indicates that, instead of covarying with the prediction of happy faces (as in participants on placebo), brain responses of 5-HT depleted individuals seemed to track the prediction of neutral faces.

A possible explanation for this finding is that 5-HT depletion may have given rise to negative biases [49], which may have led to the perception of ambiguous neutral faces as negative. This may have made the latter more salient, resulting in the recruitment of attentional and working memory processes to support the prediction of neutral faces. This interpretation is, of course, speculative and more direct assessments of this hypothesized effect, and the role of the different brain regions, are needed. Yet, it is interesting to note that, using the same task, we previously found a similar pattern of reversed social reward prediction encoding in the insula and temporal lobe of individuals with high depression scores [2]. Taken together, these findings suggest that low levels of 5-HT may contribute to impaired social reward learning in depression by biasing learning towards negatively perceived ambiguous stimuli.

Following on from the above interpretation, it may seem surprising that no group differences were found in the happy vs. neutral face contrast. However, it is possible that the increased engagement of the PFC in anticipation of neutral faces may have facilitated a preparatory downregulation of limbic regions in 5-HT depleted subjects. This preparatory response may have equalized the otherwise potentially stronger activation to neutral faces in 5-HT depleted subjects compared to placebo controls.

At first sight, the above interpretation of the neuroimaging findings may appear to be inconsistent with the behavioral results, given the increased likelihood ratings on low probability social reward trials after 5-HT depletion. However, it is possible that the mismatch between task demands (for happiness prediction) and neural processing (focused on the prediction of negatively interpreted neutral faces) may have led to enhanced uncertainty (rather than a negative bias) on the behavioral level, thus leading to ratings close to 50% for both high and low probability trials in the 5-HT depletion group. This suggestion is in line with previous proposals stating that performance may be impaired if the framing of the task does not match the participants’ cognitive style [50, 51]. Nevertheless, it should be noted that the current interpretation is speculative and alternative explanations of the findings exist. For instance, 5-HT depletion may have induced a general deficit in the discrimination of decision options, as previously observed [52] (although this appears somewhat less likely in the current study, given that there were no group differences in the learning from fearful faces).

Effects of DA depletion on social learning

The current study further found that DA depleted participants tended to be less certain about what social rewards to expect compared to placebo controls. This observation is in line with previous findings showing that decreased DA levels are associated with impaired learning from non-social rewards [12,13,14,15,16], while increased DA functioning enhances learning from positive outcomes [17,18,19,20, 22, 31].

Moreover, on the neural level, DA depletion reduced social reward prediction encoding in the dorsomedial PFC and dorsal ACC. This may have been due to an effect of DA depletion on the stability of frontal prediction representations. More concretely, it is thought that the strength of input representations in the frontal cortex is influenced by the balance between D1 and D2 binding, with low levels of DA inducing preferential D2 (rather than D1) binding, which is associated with weak input representations [53]. Therefore, DA depletion may have impaired the stability of prediction representations in the frontal cortex through a shift to predominant D2 binding. This interpretation is in line with that of Jocham et al. [31], who found that the D2 receptor antagonist amisulpride increased predictive value signals in the vmPFC, possibly by facilitating more stable D1- (rather than D2-) mediated value representations.

It should be noted that the observed effects of DA depletion on frontal cortex signals, as well as on behavioral responses, were similar to those seen after 5-HT depletion. This might be the case due to interactions between these neurotransmitter systems. While there is little evidence for an influence of DA on 5-HT functioning, the reverse effect is well documented [54]. Specifically, 5-HT2C receptors seem to tonically inhibit DA functioning, whereas other 5-HT receptor subtypes appear to enhance DA activity when 5-HT release is stimulated [55]. It can thus not be ruled out that 5-HT depletion led to reduced DA activity in the frontal cortex, and that decreased DA rather than 5-HT functioning played a crucial role in the observed PFC effects. However, even if this was the case, other findings of the current study (e.g. in the temporal lobe and insula) were more unambiguously 5-HT related, as they were present only under 5-HT and not under DA depletion. Importantly, it was these findings (and not those observed in the PFC) that were highly similar between 5-HT depleted subjects in the current study and individuals with depression symptoms in our previous work. Thus, the main conclusions drawn above remain unaffected by the potential interactions between the 5-HT and DA systems.

In addition, it is noteworthy that in our task, DA depletion had a less extensive effect on behavioral and neural responses than 5-HT depletion. This may suggest either that DA is less crucially involved in social learning in particular, or that the stimuli used in our task (happy faces of strangers) were not rewarding enough to elicit a robust DA response. Future studies using different, more rewarding social stimuli (such as pictures of friends) are needed to distinguish between these possibilities.

Conclusion

Taken together, the results of the current study indicate that 5-HT depletion impairs social reward learning on both the behavioral and the neural level, possibly partly by increasing attentional and working memory processing of negatively perceived neutral faces. DA depletion had a similar, although less pervasive, effect. Interestingly, the behavioral and neural responses observed after 5-HT depletion in the current study closely resemble our previous findings in individuals with high depression scores. It may thus be the case that decreased 5-HT levels contribute to social learning deficits in depression. It would be of interest for future studies to examine whether serotonergic antidepressants alleviate social learning impairments in depressed individuals.

Funding and disclosure

The current work was funded by a Medical Research Council PhD studentship. The authors declare no competing interests.