Introduction

Disrupted reward processing is a central component in neurobiological theories of drug addiction [1,2,3,4,5]. In humans, addiction is characterized by exaggerated responses to drugs and drug cues at the expense of pleasure from and motivation for non-drug rewards [6]. Indeed, decreased neural and subjective responses to natural rewards [anhedonia; [7,8,9,10]] may predict relapse to drug use [7]. Symptoms of anhedonia have been observed across a range of substance use disorders [11,12,13], as well as during abstinence [14, 15], withdrawal [16], and pharmacotreatment [7, 15, 17]. Opioid addiction can be considered the prototypical addiction [18], with devastating consequences for individuals and society [19]. The current ‘gold standard’ treatment is opioid maintenance treatment (OMT) with methadone or buprenorphine [20]. Akin to heroin, OMT medications act as agonists on the opioid receptors, but with long-acting pharmacological properties. OMT reduces illicit opioid use, drug craving, and behavior associated with illegal drug use [20,21,22]. Patients in stable OMT are relieved of the highs and lows of the stressful ‘drug taking cycle’, i.e., binge, withdrawal, and drug seeking [1, 23]. However, because these pharmacotreatments act as surrogates at the receptor level, the patients remain opioid dependent in the ‘physiological’ sense; that is, they are opioid tolerant and will experience withdrawal symptoms if treatment is discontinued. It is unclear whether OMT medications prolong or even cause reward dysregulation. An alternative hypothesis is that patients in long-term stable OMT could exhibit intact reward processing. Life-style related factors such as psychosocial and somatic stress, social stigma, and poly-drug use [24] also contribute to anhedonia.

To determine whether reduced responsiveness to non-drug rewards is a necessary consequence of opioid dependence, we assessed reward responses in a unique group of patients with opioid dependence. Patients had been in stable OMT for >7 years, exhibited few psychosocial vulnerability factors compared to the general OMT population, and had minimal concurrent drug use. All patients were mothers who started OMT during pregnancy and who retained custody of their children.

We compared reward responsiveness in OMT patients to that of a healthy comparison group (COMP) using both subjective (self-reported hedonia) and objective (behavioral) reward responsiveness measures. As a behavioral measure, we used a well-established probabilistic reward task (PRT), which is sensitive to disrupted reward responsiveness during depression [25, 26], stress [27], cannabis dependence [28], and nicotine withdrawal [29]. Data from the task were analyzed using classical signal detection methods as well as the drift diffusion model of decision making [DDM; [30]] to assess potential differences in decision sub-processes. This approach allows for a fine-grained comparison of the mechanisms underpinning reward-based decisions between groups. The DDM yields estimates of key decision parameters: speed-accuracy trade-off, time spent on perceptual and motor processes, signal processing efficiency, and stimulus preference. We expected that potential group differences in cognitive processing would be reflected in one or several of these parameters. To establish the typical range of healthy response bias in the PRT, we performed a meta-analysis of the performance of 968 healthy controls from previous studies.

In summary, we measured reward responsiveness in OMT using three distinct approaches (self-report, behavioral test, and decision sub-processes), and included two independent comparison samples (total n = 995). Reductions in reward responsiveness in the OMT group would support the notion that chronic opioid agonist use, even in the relative absence of additional psychosocial vulnerability factors, can lead to enduring disruption of non-drug reward processing. In contrast, intact reward responsiveness in patients would provide evidence that long-term µ-opioid receptor stimulation does not necessarily cause anhedonia.

Materials and methods

Participants

Fifty-six mothers were recruited as part of a 7-year follow-up of a longitudinal study of mothers in OMT (and their children). Three OMT mothers had recently discontinued the drug treatment program (tapered); therefore, the final sample included 23 mothers in current OMT (in treatment for >7 years) and 30 comparison mothers (Table 1). Apart from six foster mothers in the comparison group, participants were recruited during pregnancy in 2005–2007; group characteristics at study inclusion are described in [31]. Requirements for entering OMT include an ICD-10 opioid dependence diagnosis and a medical evaluation of treatment eligibility. Lifetime duration of heroin injection before entering OMT was on average 8 years [32].

Table 1 Group characteristics and questionnaire measures

Comparison mothers with no reported illicit drug use or psychiatric illness were matched (at the time of study inclusion) to the OMT group on age, gender, and time of pregnancy (apart from the six foster mothers) but not on tobacco use or years of education. Therefore, we refer to a ‘comparison’ (COMP) group rather than a ‘control’ group.

Procedure

Procedures were approved by the Regional Ethics Committee (2013/1606/REK Sør-Øst B). Participants received verbal and written information about the study and signed a separate consent form before completing the reward responsiveness tasks. Participants received 10–15 USD corresponding to task performance.

Measures

Characterizing the study groups

We used the 25-item Shortened Hopkins Symptom Checklist [SCL-25, [33]] and a cut-off of 1.0 to identify participants with at least some anxiety and/or depression [34]. General life-satisfaction was operationalized as the mean rating [35] of the life-satisfaction questionnaire [LISAT-9, [36]]. We also employed the behavioral approach/inhibition scale [BIS/BAS, [37]]. A locally developed questionnaire based on [38, 39] was used to assess current mood and potential opiate side effects on an electronic visual analogue scale (VAS).

Subjective reward responsiveness

The state measure of anhedonia was modified [40] from the Snaith-Hamilton Pleasure Scale [SHAPS, [41]]. Unlike the original version, which centers on hypothetical enjoyment of everyday rewards (e.g., food, social contact, and esthetics) during the last few days, the present version assessed current hedonic state (“Right now I would [appreciate/enjoy/etc]”) as a means to avoid recall bias [42]. Further, to capture variance in hedonic capacity in both patients and healthy participants, items were rated on an electronic VAS (“not at all”—“extremely”).

Objective reward responsiveness

Behavioral reward responsiveness was measured with the well-established PRT, a perceptual decision task with skewed rewards [sometimes called the ‘objective anhedonia test’; [43, 44]]. In the majority of healthy participants, the skewed reward schedule induces a response bias reflecting the propensity to change behavior as a function of reward. During each trial, participants see a mouthless schematic face (Fig. 1) for 500 ms before the mouth is briefly presented (100 ms). The task is to identify which of two possible mouths was presented (long/short). The marginal difference in mouth length (11 and 12 mm) together with the brief presentation time makes the identification challenging (piloted to yield an average accuracy of 75%). Participants are instructed that upon correct identification of the mouth there is a chance to receive a small monetary reward (NOK 1, ~12 cents). For one stimulus, 75% of all correct identifications are rewarded (termed the rich response option) and for the other stimulus only 25% of the correctly identified mouths lead to reward (lean option). This reward schedule is unknown to participants, who nevertheless develop a response bias towards the rich stimulus. Participants rarely become aware of this skewed reward schedule [e.g., see [45,46,47]]. The experiment consisted of three blocks of 100 trials separated by short self-paced breaks. The task was implemented using E-Prime 2.0® (Psychology Software Tools Inc., Pittsburgh, PA, USA).
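The asymmetric reward schedule can be illustrated with a minimal simulation sketch. It assumes that each correct response is rewarded independently with the stated probability (75% for rich, 25% for lean) and that the observer is unbiased with a fixed accuracy of 75%. The actual E-Prime implementation may schedule rewards differently, and all function and variable names are hypothetical.

```python
import random

# Reward probabilities for correct responses, as stated in the task description.
REWARD_PROB = {"rich": 0.75, "lean": 0.25}


def is_rewarded(stimulus, response, rng):
    """Return True if the trial is rewarded.

    Assumption: rewards are drawn independently for each correct response;
    incorrect responses are never rewarded.
    """
    if response != stimulus:
        return False
    return rng.random() < REWARD_PROB[stimulus]


def simulate_block(n_trials=100, accuracy=0.75, seed=0):
    """Count rewards per stimulus for an unbiased observer (illustrative only)."""
    rng = random.Random(seed)
    rewards = {"rich": 0, "lean": 0}
    for _ in range(n_trials):
        stimulus = rng.choice(["rich", "lean"])
        other = "lean" if stimulus == "rich" else "rich"
        response = stimulus if rng.random() < accuracy else other
        if is_rewarded(stimulus, response, rng):
            rewards[stimulus] += 1
    return rewards
```

Under these assumptions, an observer with no bias still receives roughly three times as many rewards for the rich stimulus as for the lean one, which is the asymmetry expected to induce the response bias.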

Fig. 1

Experimental task. Participants were presented with schematic faces and instructed to identify which of two alternative mouths was shown (short or long mouth). Unknown to the participant, correct identification of one of the alternatives led to a monetary reward three times more often than the other alternative stimulus (75% vs. 25% reward probability, rich and lean stimulus respectively). Non-rewarded and incorrect trials were followed by a fixation cross. The differences between the face stimuli have been inflated for illustrative purposes

Statistical analyses

Experimental data were analyzed using R [version 3.1.2, [48]] and IBM SPSS (version 24). Group differences on questionnaire measures were assessed using independent samples t-tests (Welch’s t-test when Levene’s test indicated unequal variances). Mixed effects ANOVAs were used for experimental task data. Greenhouse-Geisser correction was employed when sphericity assumptions were violated. For non-significant group contrasts that are interpreted in the discussion section, Bayes Factor, which gives information about the relative evidence for two competing models (H1: group difference vs H0: no group difference) given the data, was calculated in R using the BayesFactor package [49].
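For readers unfamiliar with Welch's test, the following minimal sketch shows how the test statistic and the Welch-Satterthwaite degrees of freedom are obtained from two samples. It is an illustration in Python rather than the R/SPSS routines used for the reported analyses, and the function name is hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Return (t, df) for Welch's unequal-variance t-test."""
    nx, ny = len(x), len(y)
    # Squared standard errors of the two sample means.
    se2_x = variance(x) / nx
    se2_y = variance(y) / ny
    t = (mean(x) - mean(y)) / math.sqrt(se2_x + se2_y)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_x + se2_y) ** 2 / (se2_x ** 2 / (nx - 1) + se2_y ** 2 / (ny - 1))
    return t, df
```

The non-integer degrees of freedom produced by this approximation explain why Welch's test is reported with fractional df.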

Missing data and outlier exclusion

The final number of participants included in the behavioral task analysis was OMT = 20 and COMP = 27 (one dataset from each group was lost; two OMT participants and one COMP participant were excluded due to failure to follow task instructions; one COMP participant failed to complete the task). Trials with responses <250 ms or >2500 ms were excluded prior to analysis (97 trials, 0.7%). Some responses to individual questionnaire items were also missing (see Table 1). Tukey’s method was used to identify extreme outliers, defined as values more than 1.5*IQR (interquartile range) below the first or above the third quartile, resulting in removal of one OMT data point from the subjective reward responsiveness dataset prior to analysis.
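The two exclusion rules can be sketched as follows. This is an illustrative Python version, not the code used for the reported analyses; the function names are hypothetical, and the quartile definition (the "inclusive" method, which corresponds to R's default quantile type) is an assumption.

```python
from statistics import quantiles


def keep_valid_rts(rts_ms, lower=250, upper=2500):
    """Keep trials with reaction times between `lower` and `upper` (ms)."""
    return [rt for rt in rts_ms if lower <= rt <= upper]


def tukey_outliers(values, k=1.5):
    """Return values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # Assumption: quartiles use the "inclusive" interpolation method.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low_fence or v > high_fence]
```

For example, `keep_valid_rts([100, 300, 2600, 900])` retains only the 300 ms and 900 ms trials.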

Experimental task data

Reaction time and accuracy (precision) data from the PRT were analyzed using ANOVA with block (1, 2, and 3) and stimulus type (rich, lean) as within-subjects factors and group (OMT, COMP) as between-subjects factor. In addition to accuracy and reaction time measures, we calculated response bias (log b) and discriminability (log d) in accordance with established procedures for signal detection analysis [50]. A constant of 0.5 was added to each cell when calculating log b and log d.

$$\log b = \frac{1}{2}\log \left( {\frac{{\mathrm{rich}_{\mathrm{correct}} \times \mathrm{lean}_{\mathrm{incorrect}}}}{{\mathrm{rich}_{\mathrm{incorrect}} \times \mathrm{lean}_{\mathrm{correct}}}}} \right)$$

The log b formula gives a log transformed ratio of presses to each of the two buttons associated with high or low probability of reward. This is an index of the reward sensitivity.

$$\log d = \frac{1}{2}\log \left( {\frac{{\mathrm{rich}_{\mathrm{correct}} \times \mathrm{lean}_{\mathrm{correct}}}}{{\mathrm{rich}_{\mathrm{incorrect}} \times \mathrm{lean}_{\mathrm{incorrect}}}}} \right)$$

The log d formula gives a log transformed ratio of correct and incorrect responses. The discriminability indicates ability to distinguish between the two stimuli. For both measures we used mixed ANOVAs with block as within-subject and group as between-subject factor.
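The two formulas translate directly into code. The sketch below is a minimal illustration, assuming base-10 logarithms (the base is not stated above) and taking the four response counts per block as input; the function name is hypothetical.

```python
import math


def signal_detection(rich_correct, rich_incorrect, lean_correct, lean_incorrect):
    """Return (log_b, log_d) computed from the four response counts.

    0.5 is added to every cell, as described in the text, so that empty
    cells cannot produce a division by zero or log(0).
    Assumption: base-10 logarithm.
    """
    rc, ri = rich_correct + 0.5, rich_incorrect + 0.5
    lc, li = lean_correct + 0.5, lean_incorrect + 0.5
    log_b = 0.5 * math.log10((rc * li) / (ri * lc))  # response bias
    log_d = 0.5 * math.log10((rc * lc) / (ri * li))  # discriminability
    return log_b, log_d
```

With equal performance on both stimuli (e.g., 40 correct and 10 incorrect for each), log b is zero and log d is positive; more correct rich responses combined with more incorrect lean responses shifts log b above zero.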

Meta-analysis: what is “normal” reward responsiveness?

To obtain a robust estimate of what constitutes a healthy range of response bias, we conducted a meta-analysis of previously published data on healthy participants’ performance on the PRT. The meta-analysis included publications that have used the PRT and at least one group of healthy participants (see supplementary materials (SM) for the search and selection procedures). Data from 14 studies including 968 healthy (control group) participants were included and analyzed with a random-effects meta-analysis using the “metafor” package [51] in R statistical software [52] (Fig. 3, see SM for details).
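The logic of a random-effects pooled estimate can be sketched in a few lines. Note that this illustration uses the closed-form DerSimonian-Laird estimator of between-study variance for brevity, whereas metafor defaults to restricted maximum likelihood, so it should not be read as a reproduction of the reported analysis. The function name and any input values are hypothetical.

```python
import math


def random_effects(estimates, variances):
    """Pool study estimates with DerSimonian-Laird random-effects weights."""
    k = len(estimates)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    return {"pooled": pooled, "se": se, "tau2": tau2, "Q": q, "I2": i2}
```

The inputs are one response-bias estimate and its sampling variance per study; the heterogeneity statistics (Q, the between-study variance, and I squared) fall out of the same computation.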

DDM: group differences in processing of reward information?

To assess whether reward-based choices are processed differently in the two groups, accuracy and reaction times were fitted with a Bayesian hierarchical 4-parameter implementation of the DDM [[53]; see SM for details].

The DDM class of computational models is increasingly used in psychology and neuroscience, for instance to demonstrate how decision processes are altered by alcohol intoxication [non-decision time and drift rate; [54, 55]] and different psychiatric disorders [decision threshold and drift rate; [56, 57]]. It is a sequential sampling model that allows decomposition of reaction time and accuracy from two-alternative decisions into subcomponents reflecting mechanisms underpinning the observed effects of the task [58]. The DDM extends the classical signal detection analysis of two-alternative forced-choice data, in that it (i) includes trial-by-trial data instead of aggregated data and (ii) incorporates both reaction times and accuracy information to estimate characteristics of the decision process. Specifically, the DDM estimates parameters reflecting the efficiency of evidence accumulation (drift rate); decision caution (speed-accuracy trade-off); potential a priori preference for either stimulus (starting point) and non-decision time (encoding and motor response) [59] (see SM Fig. S1).
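To make the four parameters concrete, the following sketch simulates single trials from a basic drift diffusion process using a simple Euler scheme. It illustrates the generative model only and is not the Bayesian hierarchical estimation procedure used here; the function name, the step size, the unit noise scale, and the parameter values in the usage note are all hypothetical.

```python
import math
import random


def simulate_ddm_trial(drift, threshold, start, non_decision, rng,
                       dt=0.001, noise_sd=1.0, max_time=5.0):
    """Simulate one trial; return (choice, reaction_time_in_seconds).

    drift        -- evidence accumulation rate
    threshold    -- boundary separation; boundaries lie at 0 and `threshold`
    start        -- relative starting point in (0, 1); 0.5 means no bias
    non_decision -- time for stimulus encoding and motor response (s)
    """
    x = start * threshold
    t = 0.0
    step_sd = noise_sd * math.sqrt(dt)
    while 0.0 < x < threshold and t < max_time:
        # Deterministic drift plus Gaussian noise for each small time step.
        x += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
    if x >= threshold:
        choice = "upper"
    elif x <= 0.0:
        choice = "lower"
    else:
        choice = None  # no boundary reached within max_time
    return choice, t + non_decision
```

For instance, calling `simulate_ddm_trial(2.0, 1.5, 0.5, 0.3, random.Random(42))` repeatedly yields mostly upper-boundary choices, because a positive drift pushes the accumulated evidence towards the upper boundary. Moving `start` above 0.5 would instead favor the upper boundary through an a priori preference, which is how a bias towards the rich option is represented in the model.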

Results

Group characteristics

Group characteristics are presented in Table 1. Inspection of individual scores on the SCL-25 showed a higher incidence of scores above the cut-off (1) in the OMT group on both anxiety (OMT: n = 8; COMP: n = 1) and depression (OMT: n = 8; COMP: n = 2). As a group, mothers in OMT also showed significantly higher anxiety and depression symptoms, lower life satisfaction (~1 point lower across life domains), and more discomfort in muscles and joints (Table S2 in SM), but no other significant group differences were observed on mood or state items (see SM), or on trait measures (BIS/BAS). Patients who remained in OMT after seven years did not differ in anxiety or depression scores at study inclusion (third trimester) from those who dropped out, nor did longitudinal analysis reveal substantial changes in anxiety and depression from inclusion until the 7-year follow-up (see SM).

Subjective reward responsiveness

Self-report of state hedonic capacity was high in both the OMT and comparison group (mean OMT = 6.59, mean COMP = 6.67; Welch’s t(39.6) = 0.201, p = 0.844; 95% CI: −0.85, 0.69; see Fig. 2a). Calculation of Bayes Factor (BF01) indicated that the data are 3.42 times more likely given no group differences (H0) than under the alternative hypothesis of different group means (H1; BF10 of 0.292 ± 0.02%). The OMT group’s hedonic capacity ratings were comparable to scores from the 49 healthy men reported in [40]. For reasons of statistical power, methadone and buprenorphine users were not compared; however, as illustrated in Fig. 2a, there was no indication of systematic treatment differences.

Fig. 2

Subjective and objective reward responses. Individual data points are marked with circles (OMT) and triangles (COMP). The color of the circles indicates type of OMT: methadone (light gray); buprenorphine (white). Group means are indicated with transparent horizontal bars. Block-wise means and standard errors are reported in SM. For reasons of statistical power methadone and buprenorphine users were not directly compared. a Subjective reward responsiveness. Average rating on a visual analogue version of the Snaith-Hamilton Pleasure scale from 0 to 10. Methadone: n = 14; buprenorphine: n = 8; b Objective reward responsiveness. Methadone: n = 13; buprenorphine: n = 7

Objective reward responsiveness task

Statistical analyses showed no significant main or interaction effects of group on accuracy, reaction time or discriminability (see SM for descriptive and inferential statistics).

Response bias

Objective reward responsiveness, operationalized as task response bias (log b) toward the rich stimulus, was evident in both groups (mean OMT: 0.12 (95% CI: 0.03–0.20); mean COMP: 0.12 (95% CI: 0.04–0.19)), as confirmed by one-sample t-tests showing that both groups displayed reward responsiveness significantly different from zero (COMP: t(26) = 3.07, p = 0.005, BF10 = 8.6, Cohen’s d = 0.59; OMT: t(19) = 2.72, p = 0.013, BF10 = 3.98, d = 0.61). There was no group difference in bias (means were identical; F(1, 45) = 0.002, p = 0.97, η² < 0.001, see Fig. 2b). Response bias did not change across blocks (effect of block: F(1.52, 68.5) = 0.817, p = 0.46, η² = 0.018) and there was no significant block*group interaction (F(2, 90) = 0.048, p = 0.95, η² = 0.001). During informal debriefing after the session, none of the participants reported noticing the asymmetric reward schedule in this task. Bayes Factor (BF10) for the group contrast was 0.183, indicating that the data are 5.46 times more likely given no group differences (H0) than under the alternative hypothesis of different group means.
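The two directions of the Bayes Factor are reciprocals of each other, so the evidence for H0 follows directly from the reported BF10 for the group contrast:

$$\mathrm{BF}_{01} = \frac{1}{\mathrm{BF}_{10}} = \frac{1}{0.183} \approx 5.46$$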

‘Healthy’ response bias

The meta-analysis (n = 968) provided further confirmation that the OMT group displayed intact (‘healthy’) response bias in the objective reward responsiveness task (Fig. 3). The analysis yielded a mean response bias of 0.145 across the 14 studies included (95% CI (0.13, 0.16); p < 0.001; SE = 0.001). The sample of studies showed moderate heterogeneity (τ² < 0.01; Q (df = 14) = 24.57; p < 0.03; I² = 47.1%).

Fig. 3

Forest plot for meta-analysis of response bias (log b) across healthy groups of participants. N = number of participants. Estimates from the two groups from the current study are shown below the meta-analytic results on the same x-axis in a gray box. Estimates are shown with two decimal precision as the majority of publications provide this level of detail. The zero-point indicates no bias (not changing behavior as a function of rewards received)

Objective reward responsiveness task: Sub-processes

The computational modelling results corroborate the signal detection analysis of reward bias, showing the expected skewed starting points (z), which indicate a robust bias for the rich option in both the patients in OMT and the comparison group (Fig. 4b). Similarly, the efficiency of evidence accumulation for high-reward-probability (rich) trials was higher (mean: 1.5; highest density interval (HDI): 1.3–1.7) than for lean trials (mean: 1.16; HDI: 0.8–1.5). There was no evidence to support group differences on any of the examined subprocesses (Fig. 4a–d). See SM for details.

Fig. 4

Group estimates for drift diffusion model decision parameters. All panels indicate parameter means and 95% HDI (highest density interval) from posterior distributions. Circles = OMT, triangles = COMP. a non-decision time; b starting point: 0.5 indicates no bias, higher numbers indicate preference for the high-reward-probability option (rich). c Evidence accumulation efficiency (drift rate) for rich and lean trials. d Decision threshold that indicates the speed-accuracy trade-off. Posterior probability for the group contrast is shown in italics (O = opioid maintenance group, C = comparison group). Group contrasts < 0.05 and >0.95 would indicate credible group differences

Discussion

Across the measures and levels of analysis employed here, former heroin-addicted individuals in stable OMT displayed intact responsiveness to non-drug rewards. Importantly, the OMT group significantly and robustly modulated their behavior as a function of monetary reward probability. The degree of response bias towards the most frequently rewarded option was comparable with data from ~1000 healthy participants. Computational modelling revealed that despite previous heroin addiction and a minimum of 7 years of OMT, OMT patients’ performance in this probabilistic reward task was comparable to that of a healthy sample even at the level of decision subprocesses. A self-report anhedonia measure also revealed high responsiveness to non-drug rewards in this unique OMT group. All patients were mothers who retained custody of their children after >7 years of OMT. Accordingly, the OMT group tested here is characterized by fewer psychosocial vulnerability factors than other OMT cohorts [60, 61], but is also relatively small (N = 20–23 in key analyses).

The objective reward responsiveness task employed here measures participants’ propensity to favor a more frequently rewarded response option. The task is designed to induce a bias that remains outside of conscious awareness. Disrupted response bias has been reported for several patient groups with subjective anhedonia [26, 50, 62,63,64], notably cannabis dependence [28], and nicotine withdrawal [29]. Since correct responses are rewarded with money in this task, the robust response bias demonstrated by the OMT group indicates intact sensitivity to a non-drug reward. Indeed, comparison with both a local comparison group and data from 968 healthy participants tested previously, confirmed that the patients’ mean response bias fell within the normal range. This evidence for intact reward responsiveness after prolonged OMT contrasts with the notion that chronic opioid use causes anhedonia. For instance, OMT patients have shown blunted neural responses to non-drug reward [65,66,67]. However, most previous demonstrations of anhedonia in substance abuse have included patients during withdrawal or early abstinence [e.g., see reviews: [11, 12]]. In contrast, the OMT group tested here had been in stable treatment for an average of 9 years. A recent study pointed to illicit opiate use as a potential cause of self-reported anhedonia among patients in OMT [15]. In line with this evidence, illicit drug use is very limited in the OMT group tested here. To our knowledge, this is the first demonstration of intact ability to modify behavior to optimize reward in long-term OMT.

Somewhat surprisingly, computational modeling revealed that all the estimated decision subprocesses were comparable between study groups. Thus, we found no evidence for impairment in efficiency of information processing, response caution, degree of prior response preference, or non-decision time (time spent on perceptual and motor processes) in the OMT group. This result contrasts with previous reports that prolonged opioid use is associated with deficits in cognition and executive function, including decision making, cognitive flexibility, and memory [reviewed in [68]]. In active substance use disorder, immediate rewards are often favored impulsively [delay discounting, [69, 70]]. Deficits in risky decision making [71,72,73] and prolonged reaction times [66, 74] have also been observed in some OMT groups. Here, however, patients in OMT responded to stimuli with the same speed as comparison mothers. The lack of impairment in these underlying sensory, motor, and cognitive processes corroborates the main results, i.e., that these patients were able to respond normally to non-drug rewards. Darke et al. [75] suggested that cognitive impairment may stem from indirect effects of opiates, such as lifestyle, poor health and nutrition, or exposure to violence. Notably, these vulnerability factors were less pronounced in the OMT patients tested here.

Acute stress induction [76] and high levels of daily stress [77] reduced reward responsiveness as measured by the probabilistic task used here. OMT using drugs with long-acting pharmacological properties (methadone and buprenorphine [78]) is thought to reduce stress [79] and to support normalized brain metabolite profiles and sex and stress hormone function [80,81,82]. Further, abnormal neural responses to drug cues in short-term OMT (<1 year) were attenuated in patients with 2–3 years of treatment [83]. In light of this literature, the present findings are consistent with the notion that long-term stable OMT could “renormalize” brain reward systems. An alternative explanation is that this sample’s reward responsiveness was never impaired, potentially contributing to their successful treatment. Available longitudinal data from this cohort show neither an improvement nor a decline in symptoms of depression or anxiety. Further, we found no evidence of systematic differences in depression or anxiety levels at study inclusion between patients who did or did not remain in treatment (see SM for details). Future studies should characterize subjective or objective reward responsiveness in ongoing opioid addiction, and use longitudinal designs to test whether the measures could represent useful predictors of OMT treatment success [as suggested by [7]]. While the subgroups in our OMT cohort were too small to warrant statistical comparison, potential differential effects of methadone and buprenorphine on reward responsiveness should be addressed in future studies with larger patient groups.

We used a modified version of the SHAPS to assess subjective hedonic capacity in long-term OMT and healthy controls. The original questionnaire has been used in several previous studies of opioid dependence and misuse [15, 16, 84,85,86], revealing primarily large effect sizes (median Hedges’ g of 0.81, n > 300 opioid misusers in the references listed above). Here, the OMT and comparison group mean scores showed a negligible difference (0.08 on an 11-point scale). The objective reward responsiveness task employed here has also revealed impairments in substance dependent populations [28, 29]. An advantage of the present approach is that the ubiquitous nature of monetary reinforcement and the wide scope of pleasurable everyday scenarios included here render it unlikely that clinical anhedonia would go undetected in this population. Nonetheless, we observed somewhat higher scores of depression and anxiety in the OMT mothers. It is well established that not all depression involves anhedonia, and vice versa [87]. Note that we cannot exclude the possibility of a medium or small effect of OMT on non-drug reward responsiveness that cannot be detected here due to the small number of participants.

Several limitations of this study warrant consideration. The comparison group was matched to OMT patients on gender and age. Matching groups on potentially important variables such as smoking, socioeconomic status and years of education was not feasible. Nevertheless, despite somewhat higher self-reported distress in the OMT group, the two groups displayed strikingly similar performance on both the objective and subjective reward responsiveness measures. Importantly, performance was also comparable to data from a large group of healthy adults reported in the literature. We cannot exclude that other measures of responsiveness to rewards could reveal differences undetected here. Only women were tested in this study. The prevalence of opioid dependence differs in men and women [88] and most studies have tested mainly [14, 89] or exclusively men [66, 84, 90]. The relatively small number of patients included here limits the generalizability of results. Nevertheless, evidence of intact reward responsiveness in this group shows that normal reward behavior is possible even after a minimum of seven years in OMT preceded by on average eight years of heroin addiction.

The OMT group tested here is unique in a national and international context. Each patient was recruited during pregnancy and at the time of testing had received OMT for at least seven years. To maintain custody of their children, mothers in OMT in Norway commit to frequent testing and evaluation and are required to abstain from on-top illicit drug use. Compared to many other former heroin addicts, this subgroup leads a stable life-style with fewer psychosocial vulnerability factors. Furthermore, raising a child prompts the expression of caring behaviors, prioritizing others above oneself, and frequent problem solving. It is possible that the joys and responsibilities associated with child rearing may buffer against anhedonia in mothers in OMT. In support of this, a study of 390 Austrian mothers in OMT showed remarkably high self-reported quality of life [91]. Future studies should address the impact of psychosocial factors on reward responsiveness in current and former substance use disorder populations.

In sum, our findings suggest that long-term stable opioid agonist drug treatment does not necessarily lead to anhedonia or reduced reward responsiveness. The ability to adapt behavior according to non-drug reward may indicate a recovery of reward systems in long-term stable pharmacotherapy. Whether reward responsiveness is intact in OMT groups with more psychosocial difficulties remains to be seen. The present results may however extrapolate to other groups in long-term opioid treatment with a comparably small burden of psychosocial problems, such as patients with chronic pain. These results provide a new line of evidence supporting the utility of long-term opioid drug treatment for certain vulnerable patient groups.

Funding and disclosure

The study was funded by a PhD grant from the South-Eastern Norway Regional Health Authority (2013053) and internal funding from The Norwegian Centre for Addiction Research (SERAF) to MS. ME, PPL, MLP, NK, AMM, SL, and MS reported no biomedical financial interests or potential conflicts of interest.