Introduction

Classical associative learning theory explains fear conditioning by an aversive prediction error signal δav generated when an initially non-predictive conditioned stimulus (CS) is unexpectedly followed by an unconditioned stimulus (UCS). This establishes a UCS prediction, or aversive value Vav, for the CS that grows over successive pairings.1, 2 If at some point the UCS is unexpectedly omitted, this generates a negative (oppositely signed) aversive prediction error that will reduce Vav. The latter mechanism is thought to underlie the extinction of conditioned fear responses by repeated unpaired CS presentations.1 It is the theoretical basis of exposure therapy, in which a patient is repeatedly confronted with the trigger of his or her fears (the CS; for an agoraphobic patient, for example, an open space) and learns from experience that the predicted outcome (the UCS) is absent or less disastrous than expected (for example, the patient does not collapse).3

A frequent finding is that conditioned fear responses can return after successful extinction, indicating that the CS–UCS association (Vav) is not simply unlearned or erased during extinction but rather complemented by a competing inhibitory CS–noUCS association that may, or may not, dominate the CS–UCS association at future CS presentations.4, 5, 6 Moreover, there is compelling evidence for a partial segregation in the neural systems subserving conditioning and extinction.7, 8, 9, 10, 11, 12 The above simple account of extinction, as being solely mediated by the same learning system that also mediates conditioning, cannot accommodate these observations.

Alternatively, the omission of an expected aversive UCS could be conceptualized as an appetitive-like or reward prediction error δapp and the consequential reduction of the UCS prediction Vav during extinction as generation of a reward-like safety prediction Vapp. From this perspective, part of a solution for the above problem could be that extinction is driven by an opponent appetitive learning system. Reward learning has been strongly linked with the mesostriatal dopamine (DA) system.13, 14, 15 There is evidence that δapp is signaled by a phasic increase in the firing of DAergic neurons that originate in the ventral tegmental area and substantia nigra and project to the ventral striatum (VS).15 It has therefore been hypothesized that VS DA release is involved in putative δapp signaling during fear extinction as well.16 One rodent study that showed that DA signaling via D1 receptors is necessary for extinction17 further supports the potential link between fear extinction and the reward system. One goal of this study was therefore to test whether the VS encodes appetitive-like prediction error signals during extinction in humans.

Extracellular DA levels in the striatum are prominently regulated by neuronal DA reuptake via the DA transporter (DAT).18 The human transporter gene DAT1 features a frequent and functional variable number of tandem repeat (VNTR) polymorphism in a region that encodes the 3′ untranslated region.19 The 40-bp VNTR element is mainly repeated either 9 or 10 times, with the 9-repeat (9R) form most likely reducing DAT expression20, 21, 22, 23, 24 (but see Dyck et al.25) and thus presumably enhancing extrasynaptic striatal DA levels, in particular, during phasic DA release.18 Hence, if the mesostriatal DA system is involved in extinction in the fashion outlined above, one would expect the DAT1 9R allele to be associated with relatively enhanced extinction learning as well as with enhanced neural δapp signaling in the VS. To examine this hypothesis, we conducted the experiment in a sample of normal healthy volunteers that were preselected on the basis of their DAT1 genotype. In particular, we compared 9R carriers (genotypes 9/9 and 9/10) with non-9R carriers (genotypes 10/10). The grouping was chosen because of the relative scarcity of the 9R allele and to be in keeping with previous DA binding neuroimaging studies.20, 21

We also examined effects of interindividual variation in COMT (catechol-o-methyltransferase) function. In contrast to DAT, COMT is most strongly expressed in the prefrontal cortex (PFC)26 where it degrades released DA, thereby regulating extracellular PFC DA levels.27 The human COMT gene contains a functional single-nucleotide polymorphism that codes for a Val to Met change at position 158,28 the Met variant of the protein being less active29 and associated with higher prefrontal baseline synaptic DA.27 Prefrontal DA appears to have a role in extinction in rats,30 and a recent human study had suggested impaired extinction in COMT Met/Met carriers.31 Hence, including COMT genotype in the design allowed us to also explore potential contributions of extrastriatal DA to human fear extinction. As the Val and Met alleles are codominant,27 participants were preselected in a way to obtain three similarly sized groups of Val/Val, Val/Met and Met/Met carriers. This resulted in a 2 by 3 (DAT1 by COMT) factorial between-subject design.

Participants and methods

Participants

A total of 69 healthy male right-handed Caucasian participants with DAT1 and COMT genotypes 9R-Val/Val (n=13), 9R-Val/Met (n=10), 9R-Met/Met (n=10), non-9R-Val/Val (n=14), non-9R-Val/Met (n=12) and non-9R-Met/Met (n=10) were examined. As participants were drawn from a larger population to obtain a stratified and matched sample, calculations of Hardy–Weinberg equilibrium (HWE) were only appropriate for the basic population (n=450). Genotype distributions were as follows: DAT1 9/9: n=28; 9/10: n=148; 10/10: n=252 (HWE χ2=0.97), COMT Val/Val: n=80; Val/Met: n=237; Met/Met: n=118 (HWE χ2=4.18), so that neither distribution deviated from HWE at the threshold of P=0.01. Details on participant selection, sample characteristics and genotyping can be found in the Supplementary Methods and Supplementary Table 1. A different analysis of an overlapping sample has been reported before.32
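The HWE statistics quoted above can be reproduced from the genotype counts alone. The following is a minimal sketch, assuming a standard Pearson chi-square test with one degree of freedom; the function name and the constant are illustrative and not taken from the original analysis.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square statistic for deviation from Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of the first allele
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Chi-square critical value for P = 0.01 with 1 degree of freedom.
CRITICAL_CHI2_P01_DF1 = 6.635

dat1_chi2 = hwe_chi_square(28, 148, 252)     # DAT1 9/9, 9/10, 10/10
comt_chi2 = hwe_chi_square(80, 237, 118)     # COMT Val/Val, Val/Met, Met/Met
print(round(dat1_chi2, 2), round(comt_chi2, 2))  # prints 0.97 4.18
```

Both statistics fall below the critical value, which is what the statement about the P=0.01 threshold expresses.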

Experiment

Participants performed a simple uninstructed fear conditioning, extinction and reacquisition task, which has been described in detail elsewhere.32 Briefly, participants were first habituated to the CSs, the task and the scanner noise by presenting each CS four times before the actual experiment. In the subsequent acquisition phase (Acq), participants saw 18 pseudorandomized 5-s presentations of each of two geometric symbols (a circle, a triangle), one of which (CS+) was paired in 80% of cases with a painful electric stimulus (UCS) applied to the back of the right hand. The other symbol served as a control stimulus (CS−) for non-associative effects and was never paired with the UCS. Assignment of symbols as CS+ or CS− was counterbalanced across participants. In the extinction phase (Ext), both stimuli were again presented 18 times each, but in the absence of the UCS. The subsequent reacquisition phase (RAcq) was identical to the acquisition phase. The intertrial interval was jittered between 9 and 14 s, with an average of 11.5 s. We intermittently asked participants to give explicit ratings of their CS-evoked stress/fear/tension: at baseline (that is, after the habituation phase) and every 12 trials (six CS+ and six CS− trials) thereafter, resulting in three CS+ and three CS− ratings per phase. Throughout the experiment, the participants had to perform a speeded decision task on the geometric symbols (see Supplementary Methods). UCS intensity was individually adjusted before the experiment to achieve maximum tolerable pain.

Data acquisition and preprocessing

Acquisition and preprocessing of skin conductance and functional magnetic resonance imaging (fMRI) data followed standard procedures (see Supplementary Methods).

Data analysis

Fear ratings

All ratings were normalized by subtracting the baseline ratings given at the onset of the experiment (after habituation) such that positive ratings reflected an increase in fear relative to baseline and negative ratings a relative decrease in fear. In four participants, ratings were not acquired or lost because of technical problems. The remaining sample size for analysis of ratings was n=65.

Rescorla–Wagner model

The Rescorla–Wagner (RW) model is a simple and established associative learning model that formalizes the laws of learning outlined in the introduction. If learning is about enabling an organism to predict relevant future events from present stimuli, then classical conditioning should result in the organism being able to predict a UCS from the presentation of the CS. That is, the CS should activate a UCS expectation (or CS–UCS association or UCS prediction) that reflects the probability and magnitude of the UCS. This ‘associative strength’ or affective value of the CS is expressed in the V term of the RW equation below and will increase over the course of conditioning. It is thought to determine conditioned responding. Every violation of this expectation (for example, because a UCS occurs unexpectedly, as initially in the beginning of conditioning when the UCS prediction is still 0, or because an expected UCS does not occur, as in initial extinction) must result in an adjustment of the expectation, that is, learning. Therefore, the update of V in the RW model is directly proportional to a prediction error term δ that reflects the difference between actual and predicted reinforcement (that is, the expectation violation).1 This class of models is known to be relevant for learning about punishments and rewards, and has been successfully used to predict learning-related neural activation.33, 34, 35, 36, 37, 38, 39 Specifically, the RW model updates V at trial t+1 according to

$$V_{t+1} = V_t + \alpha\,(R - V_t)$$

with α being the constant learning rate (0–1), R being a fixed value assigned to the reinforcement/UCS and (R − Vt) corresponding to the prediction error δ that is generated at the time of reinforcement.
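The update rule described here can be sketched in a few lines. This is a minimal illustration, assuming a single constant learning rate and made-up reinforcement values (R = 1 for paired trials, R = 0 for omissions); the function names are hypothetical and the numbers are not those used in the study.

```python
def rw_update(v, r, alpha):
    """One Rescorla-Wagner step: return the updated value and the prediction error."""
    delta = r - v                      # prediction error at the time of reinforcement
    return v + alpha * delta, delta


def rw_trajectory(v0, reinforcements, alpha):
    """Apply the update trial by trial; returns the values (including v0) and the errors."""
    values, deltas = [v0], []
    for r in reinforcements:
        v_next, delta = rw_update(values[-1], r, alpha)
        values.append(v_next)
        deltas.append(delta)
    return values, deltas


# Hypothetical illustration: five reinforced trials followed by five omissions.
values, deltas = rw_trajectory(v0=0.0, reinforcements=[1.0] * 5 + [0.0] * 5, alpha=0.2)
```

In this toy run the value rises during the reinforced trials, and the first omission produces a negative prediction error that starts to pull the value back down, which is the extinction mechanism outlined in the introduction.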

We used this rule to model how participants change their aversive CS+ and CS− values Vav depending on their associated reinforcement pattern. A flow chart showing the separate analysis steps is provided in the Supplementary Methods. We first range corrected each participant's CS+ and CS− fear rating data (see Supplementary Figure 1a for sample average) according to

$$x_{i,\mathrm{corr},\mathrm{CS+orCS-}} = \frac{x_{i,\mathrm{CS+orCS-}} - \mathrm{min}}{\mathrm{max} - \mathrm{min}}$$

with xi,CS+orCS− (i=1,…,10) being the successive (CS+ or CS−) fear ratings, min being the sample-wide minimum of all ratings (−58) and max being the sample-wide maximum of all ratings (100). This resulted in individual rating time courses in which ratings ranged from 0 to 1 but retained interindividual differences in how participants used the rating scale. The (CS+ and CS−) baseline ratings x1,CS+orCS− (after habituation and before conditioning), which were 0 by definition in each participant (see above), became 0.36 (=x1,corr,CS+orCS−).
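As a small worked illustration of this range correction, the sketch below assumes a standard min-max rescaling with the sample-wide minimum and maximum given in the text; the function and constant names are hypothetical. With these bounds a baseline rating of 0 maps to 58/158, that is, roughly 0.37, close to the value of 0.36 stated above.

```python
RATING_MIN = -58.0   # sample-wide minimum of all ratings (from the text)
RATING_MAX = 100.0   # sample-wide maximum of all ratings (from the text)


def range_correct(ratings, lo=RATING_MIN, hi=RATING_MAX):
    """Rescale baseline-normalized ratings to the interval 0..1 using sample-wide bounds."""
    return [(x - lo) / (hi - lo) for x in ratings]


# Baseline (0), the sample minimum and the sample maximum.
corrected = range_correct([0.0, -58.0, 100.0])
```

Because the bounds are shared across the sample, two participants with different raw rating ranges keep different corrected ranges, which preserves interindividual differences in scale use.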

After complete learning, the aversiveness R of the UCS is reflected in the aversiveness of the CS+, that is, in the last CS+ fear rating after acquisition (x4,corr,CS+). With a partial reinforcement schedule of 80%, a participant's R in paired CS+ trials can thus be approximated as x4,corr,CS+/0.8.40 R in CS− trials (0% reinforcement) was set at each participant's x4,corr,CS− rating. The same value of R was used for unpaired CS+ trials. See the Supplementary Methods for a more detailed explanation. As mentioned above, UCS (pain) intensity in this experiment was individually calibrated to each participant's subjectively tolerable maximum to eliminate interindividual differences in UCS processing. Concordantly, the calculated individual R-values for paired CS+ trials were not affected by genotype (DAT1: F(1,64)=1.24, P=0.298; COMT: F(2,64)=0.74, P=0.438; and interaction: F(2,64)=0.06, P=0.978). We nevertheless used individual R-values to factor out any potential interindividual differences in learning that might in fact merely result from differences in UCS processing.

Vav,CS+ and Vav,CS− were modeled separately and set at 0.36 (=x1,corr,CS+orCS−; see above) before learning. On the basis of the idea of dissociable neural systems for fear acquisition and extinction (and possibly reacquisition as well), we used three free parameters αAcq, αExt and αRAcq (one for each of the three experimental phases), which were adjusted to minimize the distance between the change in Vav,CS+ and Vav,CS− and the change in fear ratings xcorr,CS+ and xcorr,CS− using a least-squares approach. We did not use CS+>CS− difference scores, and Vav,CS+ and Vav,CS− were estimated at the same time within the same model. Vav,CS+ and Vav,CS− time courses averaged across the entire sample are shown in Figure 1a. Average least sums of squares were similar between genotypes (DAT1: F(1,64)=0.55, P=0.461; COMT: F(2,64)=0.13, P=0.878; and interaction: F(2,64)=0.49, P=0.613). Model fits were substantially worse when assuming one identical learning rate across all three phases of the experiment (data not shown).
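The fitting step can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the study: the optimizer is replaced by an exhaustive grid search, the R values and the positions of unpaired CS+ trials are made up, and the "ratings" are synthetic values generated from known learning rates so that the recovery can be checked. Since modeled values and ratings start from the same baseline, matching their levels is equivalent to matching their changes.

```python
from itertools import product

RATING_EVERY = 6  # one rating after every sixth trial of a given CS (from the text)


def alpha_grid(step):
    """Candidate learning rates between 0 and 1."""
    return [i * step for i in range(int(round(1 / step)) + 1)]


def values_at_ratings(v0, r_by_phase, alphas):
    """Run the RW update through all phases; collect V at each rating time point."""
    v, collected = v0, []
    for alpha, reinforcements in zip(alphas, r_by_phase):
        for trial, r in enumerate(reinforcements, start=1):
            v += alpha * (r - v)
            if trial % RATING_EVERY == 0:
                collected.append(v)
    return collected


def sum_of_squares(alphas, streams):
    """Squared model-rating distance, summed over the CS+ and CS- streams."""
    total = 0.0
    for v0, r_by_phase, ratings in streams:
        modeled = values_at_ratings(v0, r_by_phase, alphas)
        total += sum((m - x) ** 2 for m, x in zip(modeled, ratings))
    return total


def fit_learning_rates(streams, step=0.1):
    """Exhaustive grid search over (alpha_acq, alpha_ext, alpha_racq)."""
    grid = alpha_grid(step)
    return min(product(grid, repeat=3), key=lambda a: sum_of_squares(a, streams))


# Hypothetical inputs: R values and the positions of unpaired CS+ trials are made up.
R_PAIRED, R_UNPAIRED, V0 = 0.9, 0.3, 0.36
paired_phase = [R_UNPAIRED if t in (4, 9, 14) else R_PAIRED for t in range(18)]
cs_plus_r = [paired_phase, [R_UNPAIRED] * 18, paired_phase]
cs_minus_r = [[R_UNPAIRED] * 18] * 3

grid = alpha_grid(0.1)
true_alphas = (grid[2], grid[3], grid[2])  # used only to generate synthetic ratings
streams = [
    (V0, cs_plus_r, values_at_ratings(V0, cs_plus_r, true_alphas)),
    (V0, cs_minus_r, values_at_ratings(V0, cs_minus_r, true_alphas)),
]
fitted = fit_learning_rates(streams, step=0.1)
```

Both CS streams share the same three learning rates and contribute to one sum of squares, mirroring the statement that Vav,CS+ and Vav,CS− were estimated within the same model. With real ratings, a finer grid or a continuous optimizer would be a natural replacement for the coarse search used here.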

Figure 1

Formal modeling of fear ratings. (a) Lines show the sample average of the modeled trial-by-trial estimates of aversive conditioned stimulus values (Vav,CS+ and Vav,CS−). Dots show sample-average range-corrected fear ratings (made after every sixth CS+ and CS− trial). 0: baseline rating after habituation. (b) An example of a resulting individual time course of trial-by-trial aversive prediction error (δav) estimates associated with the CS+. Black squares mark unpaired CS+ trials during acquisition and reacquisition. CS+ trials during extinction were all unpaired. Prediction errors associated with the CS− were always 0 and are not shown for simplicity. x axis: CS+ or CS− trials.

Note that the original RW formula assumes different learning rates for reinforced and non-reinforced CS trials, but this differentiation is not critical for most of the model's predictions1 and has not been made in neuroimaging studies.33, 34, 35, 36, 37, 38, 39 Allowing different learning rates for reinforced and non-reinforced CS trials yielded worse fits (data not shown).

Imaging data

Analysis of imaging data was restricted to those 65 participants from whom fear ratings were also available. In the comparison of DAT1 9R with non-9R carriers, group sizes were n=32 and n=33, respectively. To prepare the analysis, we used the sample-averaged learning rates αAcq=0.16, αExt=0.21 and αRAcq=0.19 to derive individual trial-by-trial Vav and δav estimates from the above modeling of the rating data, separately for acquisition, extinction and reacquisition. Averaging of learning rates was necessary to reduce noise in the data that resulted from a limited number of data points for fitting (10 ratings), and thus to obtain robust estimates. An exemplary individual δav time course is shown in Figure 1b. We emphasize that our estimated average learning rates are comparable to those used in previous neuroimaging studies.34

This information was fed into the imaging data analysis that used a standard approach for fMRI, involving a general linear model (multiple regression) at the single-subject level and a random-effects analysis at the group level within the SPM5 software (www.fil.ion.ucl.ac.uk/spm).41 For each participant, regressors were defined that modeled the time course of the experimental events. Onsets of CSs, irrespective of whether they were a CS+ or a CS−, were modeled as categorical ‘events’, that is, one series of delta functions. This regressor was parametrically modulated in a trial-by-trial fashion by the individual's sequence of Vav estimates. Another categorical event regressor modeled CS (both CS+ and CS−) offsets and was parametrically modulated by the individual's sequence of δav estimates. This was done for acquisition, extinction and reacquisition separately. CS+ and CS− trials were not differentiated in this analysis, as the concept of predictions and prediction errors is a purely quantitative one that does not make qualitative distinctions between types of stimuli. We therefore assumed identical neural substrates for predictions and prediction errors, whether associated with a CS+ or a CS−. Additional categorical regressors modeled UCSs (events), key presses (events), and the occurrence of fear ratings (14-s box car). Each regressor was convolved with a canonical hemodynamic response function. Using these regressors in a general linear model of brain activation at each voxel yields parameter estimates of the contribution of each regressor to the fMRI signal measured in each voxel. 
The convolved regressors of interest (Vav,Acq, δav,Acq, Vav,Ext, δav,Ext, Vav,RAcq and δav,RAcq) were sufficiently decorrelated from each other and from the UCS regressor to allow for robust estimation (Pearson's Rs for the correlations between Vav,Acq and δav,Acq: 0.01; Vav,Ext and δav,Ext: −0.21; Vav,RAcq and δav,RAcq: 0.01; Vav,Acq and UCS: −0.12; δav,Acq and UCS: −0.43; Vav,Ext and UCS: 0; δav,Ext and UCS: 0; Vav,RAcq and UCS: −0.12; and δav,RAcq and UCS: −0.41). Note further that the use of a comparatively high reinforcement ratio of 80% during acquisition ensured a high initial amount of prediction error signaling during extinction in combination with a relatively steep approach toward zero signal (see Figure 1b). This characteristic time course was sufficiently different in shape from the constant categorical CS offset regressor, of which it was a parametric modulator, for it to be able to explain additional variance. At the same time, choosing a partial reinforcement ratio also seemed preferable to a 100% schedule, because the latter would have generated a very steep approach toward zero, which would leave little room for modulation by individual-difference factors.
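To make the regressor construction more concrete, the following sketch builds a categorical event regressor and its parametric modulator and checks how strongly the two are correlated. It is an illustration only: the double-gamma function is a common approximation of a canonical hemodynamic response and does not reproduce SPM5's implementation, the modulator is mean-centered as an assumption, and the repetition time, event timing and prediction error values are made up.

```python
import math

import numpy as np


def double_gamma_hrf(dt, duration=32.0):
    """Approximate canonical HRF: gamma-shaped peak minus a scaled late undershoot."""
    t = np.arange(0.0, duration, dt)
    peak = t ** 5 * np.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * np.exp(-t) / math.gamma(16)
    hrf = peak - undershoot / 6.0
    return hrf / hrf.sum()


def event_regressors(onsets, modulator, n_scans, tr, dt=0.1):
    """Return the convolved categorical regressor and its parametric modulator."""
    n_fine = int(round(n_scans * tr / dt))
    sticks = np.zeros(n_fine)
    weighted = np.zeros(n_fine)
    centered = np.asarray(modulator, dtype=float) - np.mean(modulator)
    idx = np.round(np.asarray(onsets) / dt).astype(int)
    sticks[idx] = 1.0                  # one delta function per event
    weighted[idx] = centered           # same events, scaled by the trial-wise estimate
    hrf = double_gamma_hrf(dt)
    step = int(round(tr / dt))         # subsample the fine grid to one value per scan
    main = np.convolve(sticks, hrf)[:n_fine][::step]
    pmod = np.convolve(weighted, hrf)[:n_fine][::step]
    return main, pmod


# Hypothetical timing: ten CS offsets spaced 16.5 s apart (5-s CS plus an 11.5-s interval).
offsets = 5.0 + 16.5 * np.arange(10)
delta_av = -0.6 * 0.8 ** np.arange(10)   # made-up decaying prediction errors
main, pmod = event_regressors(offsets, delta_av, n_scans=100, tr=2.0)
r = np.corrcoef(main, pmod)[0, 1]
```

Because the modulator is centered, the parametric regressor carries only the trial-to-trial variation around the mean response, which is why it can explain variance beyond the constant categorical regressor.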

For the voxel-wise random-effects group analyses, the subject-specific parameter estimate images from the parametric δav and Vav regressors were spatially smoothed (FWHM 10 mm, resulting in total smoothing with an 11-mm kernel) to account for interindividual anatomical and functional variance, and to fulfil the requirements for later correction for multiple comparisons following Gaussian random field theory (see below). DAT1 and COMT genotype effects were assessed using SPM's ‘full factorial’ model, which allows for correcting for possible non-sphericity of the error term (here unequal between-group variance). Separate models were calculated for each effect of interest (for example, Vav,Acq). A design matrix included six regressors, one for each possible genotype combination. The significance of linear combinations of the regressors (for example, 1 1 1 −1 −1 −1 when asking which voxels show larger effects for DAT1 9R than for non-9R carriers in a given parameter estimate image) was assessed using one-tailed t-tests.
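The contrast logic for a single voxel can be sketched as below. This is a simplified illustration, assuming equal error variance across groups (the analysis described above corrects for non-sphericity, which this sketch does not), an assumed regressor order in which the first three cells are the 9R groups, and synthetic data; the function name is hypothetical.

```python
import numpy as np


def group_contrast_t(values, groups, contrast):
    """t statistic for a linear contrast of group means in a one-regressor-per-cell model."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    c = np.asarray(contrast, dtype=float)
    X = np.zeros((values.size, c.size))
    X[np.arange(values.size), groups] = 1.0          # indicator column per genotype cell
    beta, *_ = np.linalg.lstsq(X, values, rcond=None)
    residuals = values - X @ beta
    dof = values.size - c.size
    sigma2 = residuals @ residuals / dof             # pooled error variance (assumption)
    se = np.sqrt(sigma2 * (c @ np.linalg.inv(X.T @ X) @ c))
    return (c @ beta) / se, dof


# Synthetic example: ten hypothetical participants per cell, 9R cells shifted upward.
rng = np.random.default_rng(0)
groups = np.repeat(np.arange(6), 10)
values = np.where(groups < 3, 1.0, 0.0) + rng.normal(0.0, 0.5, groups.size)
t_value, dof = group_contrast_t(values, groups, [1, 1, 1, -1, -1, -1])
```

A one-tailed test then compares the t value against the upper tail of the t distribution with the returned degrees of freedom.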

Correction for multiple comparisons following Gaussian random field theory (family-wise error method) at P<0.05 was limited to the VS regions of interest (ROIs). Left- and right-sided ROIs were conservatively defined as spheres of 6-mm radius around coordinates −27, 3, −9 and 27, −9, −9, respectively, which were taken from the first fMRI study that investigated neural reward prediction error coding using a formal associative learning model.34 Where no anatomical hypothesis existed (exploratory analyses across the entire scan volume), an uncorrected threshold of P<0.001 was used.
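A spherical ROI of this kind amounts to a simple distance check in millimeter space. The sketch below assumes the center coordinates and the 6-mm radius given in the text; the dictionary, function name and test points are hypothetical.

```python
import math

VS_ROI_CENTERS_MM = {"left": (-27, 3, -9), "right": (27, -9, -9)}  # from the text
ROI_RADIUS_MM = 6.0


def in_sphere(coord_mm, center_mm, radius_mm=ROI_RADIUS_MM):
    """True if a coordinate lies within the sphere around the given center."""
    return math.dist(coord_mm, center_mm) <= radius_mm


# Made-up test points 5 mm and 9 mm above the left center.
inside = in_sphere((-27, 3, -4), VS_ROI_CENTERS_MM["left"])
outside = in_sphere((-27, 3, 0), VS_ROI_CENTERS_MM["left"])
```

Applying the check to every voxel coordinate of an image grid would yield the binary mask within which the small-volume correction is carried out.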

Results and discussion

Behavioral data

In the RW model of associative learning (see Participants and methods), the prediction error δav is weighted by a constant α, the learning rate, that determines how much a deviation from prediction at trial t is taken into account when formulating the prediction for the next trial t+1. A high learning rate signifies rapid prediction adjustment and thus quick learning. If extinction relies (in part) on a different learning system than conditioning, it may well show a different dynamic. We thus assessed learning separately for the three phases of the experiment, allowing separate learning rates αAcq, αExt and αRAcq. These were treated as free parameters, which we optimized so that the individual Vav time courses fit changes in individual skin conductance responses (SCRs) or fear ratings across the entire experiment (see Supplementary Figure 1 for sample-average SCR and rating time courses).

An attempt to model SCRs failed because of excessive noise in the data. Modeling of fear ratings (Figure 1), followed by three separate two-way analyses of variance (one per phase, each with between-subject factors DAT1 and COMT, and learning rate as the dependent variable) revealed significantly higher learning rates in DAT1 9R carriers compared with non-9R carriers during extinction (αExt: DAT1 main effect F(1,65)=4.57, P=0.037) but not during acquisition (αAcq: F(1,65)=0.13, P=0.725) or reacquisition (αRAcq: F(1,65)=3.27, P=0.075; Figure 2). This suggests DAT1 9R carriers have a more sensitive extinction learning system and is consistent with the idea that striatal DA might positively contribute to extinction learning.

Figure 2

DAT1 genotype affects learning rates during extinction. Formal modeling of fear rating data showed that 9-repeat (9R) carriers have significantly higher learning rates during extinction than non-9R carriers. Error bars: s.e.m. *P<0.05 (F test).

Learning rates were unaffected by COMT genotype (all P>0.267), but there was a significant DAT1 × COMT interaction effect on learning rates in the reacquisition phase (αRAcq: F(2,65)=4.48, P=0.015). A post hoc t-test showed that DAT1 9R carriers had significantly higher learning rates than non-9R carriers only in the COMT Val/Met group (9R: 0.42±0.36 (mean±s.d.) versus non-9R: 0.09±0.13; t(22)=2.55, P=0.025, two tailed; Supplementary Figure 2a). This incidental finding will be discussed further below. A standard, non-computational analysis of SCR and rating responses for genotype effects yielded no significant results.

Taken together, behavioral analysis suggested a significant contribution of DAT1 to extinction in the predicted direction but found no evidence for an involvement of COMT. We note that the cited COMT study by Lonsdorf et al.31 also reported a negative result for SCR and that their COMT effect on startle potentiation might equally be explained by a modulation of fear memory consolidation rather than extinction learning itself. The current data from human subjects therefore do not lend strong support to the idea30 that prefrontal DA function is important for extinction.

Imaging data: entire sample

In the entire sample, exploratory analysis of extinction data for δav signals (the aversive prediction error characterized by phasic relative decreases in activation when a UCS is unexpectedly omitted, compare with Figure 1b) yielded no significant results. The putative appetitive-like prediction error δapp is the mathematical inverse of δav and is characterized by relative phasic increases at UCS omission. Activity conforming to δapp was found in, among others, left, and less so right, anterior insula, bilateral anterior lateral PFC, right ventrolateral PFC/lateral orbitofrontal cortex and right VS (ventral putamen and/or nucleus accumbens; x, y, z=14, 8, −6; z-score=3.22; P<0.001 uncorrected; see Figure 3a; Supplementary Table 2), areas previously associated with δapp coding in reward studies42 and with phasic activations to UCS omission during fear conditioning.43 The VS activation was, however, not located within our conservatively defined (left or right) VS ROIs (see Participants and methods for definition). As for δav, there were no suprathreshold Vav signals (the aversive CS value that decreases across extinction, compare with Figure 1a). By contrast, the putative reward-like safety value of the CS, Vapp, which is the mathematical inverse of Vav and accordingly builds up across extinction, was found to be encoded in left orbitofrontal cortex/ventrolateral PFC, dorsomedial and lateral PFC, temporal cortex, left caudate, cerebellum and others (Supplementary Table 2). These observations might suggest that extinction is indeed primarily driven by reward-like safety learning. Results for the acquisition and reacquisition phases are reported in Supplementary Table 2.

Figure 3

Ventral striatal (VS) prediction error signaling during extinction. (a) Appetitive prediction error (δapp) signal in right VS in the entire sample. (b) Stronger δapp signal in DAT1 9-repeat (9R) compared with non-9R carriers in left VS. Activations superimposed on a canonical structural image. Display threshold: P<0.01 uncorrected. Bar graphs show average δapp parameter estimates in extinction, as well as, for comparison, in acquisition and reacquisition in the peak voxel indicated by the crosshair. Error bars: s.e.m.

Imaging data: genetic analysis

In the genetic analysis, we focused on the comparison of DAT1 9R with non-9R groups, as COMT genotype had not affected extinction learning rates in the behavioral analysis. If DAT1 9R carriers weight prediction errors during extinction more strongly (have higher learning rates), then they can be expected to show larger neural δav and/or δapp signals during this phase. As predicted, 9R carriers had significantly stronger signal increases to UCS omission than non-9R carriers, corresponding to stronger δapp coding, in our left VS ROI (x, y, z=−32, 8, −6; z-score=2.99; P=0.03 corrected). The activation was located in the putamen and extended into the anterior insula (Figure 3b). This result supports our conclusion from the behavioral analysis that DAT1 9R carriers have a more sensitive extinction learning system and is evidence for an involvement of the mesostriatal DA system in extinction. Further group differences, all going in the same direction, were observed in left anterior cingulate sulcus and other areas (exploratory analysis at P<0.001 uncorrected; Supplementary Table 3). Group comparisons of CS value encoding (Vav or Vapp) surprisingly showed stronger Vapp signals in the non-9R compared with the 9R group in two frontal areas (P<0.001 uncorrected; Supplementary Table 3). We stress the exploratory nature of the latter comparisons and the corresponding likelihood of false-positive results.

The unexpected behavioral finding of higher reacquisition learning rates in 9R compared with non-9R carriers specifically in the COMT Val/Met group (see above) was reflected in relatively higher δapp signals in the right VS ROI in 9R-Val/Met participants (x, y, z=22, −8, −8; z-score=2.62; P=0.043 corrected; Supplementary Figure 2b). The trial-by-trial variance that is captured by the parametric δapp regressor during reacquisition mainly stems from the δapp increases to the three unexpected UCS omissions (compare with the inverse of the curve shown in Figure 1b, reacquisition part). A ventral striatal DAT1 effect on this type of signal suggests that the safety or appetitive-like mechanisms, which we propose are activated during extinction, carry over to the reacquisition situation. The observation that the DAT1 effect is limited to COMT Val/Met carriers might speculatively be attributed to a situationally dependent influence of prefrontal DA on striatal DA function.44, 45 In this context, it is worth noting that epistatic DAT1 by COMT interactions in VS reward signaling have been observed before.46 The exact nature of the effect must remain open until further investigation.

Supplementary Table 4 reports a genetic analysis of the VS ROIs from all experimental phases and contrasts for DAT1, COMT and DAT1 by COMT effects.

Conclusion

To summarize, our combined behavioral and imaging data hint at signaling of UCS omission during extinction by phasic DA release in the VS, in analogy to the role of the mesostriatal DA system in reward learning.15 They support a conceptualization of extinction as a reward-like safety learning process.16 More globally, such a conceptualization can be integrated within a perspective of opponent aversive and appetitive systems.47, 48

Several limitations of the current study should be noted. First, there are still controversies with respect to the actual impact of DAT1 genotype on in vivo DAT function and striatal DA clearance (see introduction), and the prevailing hypothesis of stronger phasic DA signals in 9R carriers still has to be substantiated. Second, our approach is correlative, that is, we did not experimentally manipulate striatal DA signaling. Pharmacological manipulations in rodents have shown generally higher levels of conditioned freezing during extinction with DA antagonists given systemically16, 49, 50 or directly in the amygdala51 or nucleus accumbens.16 A systemic DA agonist reduced conditioned freezing during extinction.50 Although these results could be taken to support a facilitatory role for DA in extinction learning, it should be noted that DAergic manipulations also affect locomotion and baseline freezing, and it is therefore difficult to rule out that the enhanced conditioned freezing observed under DA antagonists might be explained by their motor side effects.16, 49, 50, 51 Further, these studies have analyzed average freezing across the entire extinction session, a measure that might also be confounded by potential general fear-potentiating drug effects. It might therefore be advantageous to instead focus on rates of decay of freezing as a primary outcome measure for extinction in animals. In humans, where conditioned responding is normally not read out from motor behavior, pharmacological experiments might suffer from other confounds such as potential drug effects on arousal.36 Such experiments will therefore have to carefully control for non-specific effects but might nevertheless be a valuable source of evidence. Third, showing a contribution of the mesostriatal DA system does not necessarily prove that extinction is appetitive, as the DA system is not exclusively involved in reward learning. 
Here, a direct demonstration of the appetitive nature of extinguished CSs would be beneficial. Another potentially interesting approach would be a direct formal comparison of extinction with a reward-learning task within the same sample. Fourth, our data do not exclude that non-appetitive, that is, aversive learning mechanisms also contribute to extinction. Fifth, although conditioning and extinction are generally considered relevant paradigms for the study of pathological anxiety and its therapy,3 it must be stressed that they cannot provide explanations for every aspect of anxiety and we can, in particular, not exclude that other mechanisms have a role in therapeutic fear relief. Sixth, our sample included mainly university students and exclusively comprised males. The latter selection criterion was introduced following reports of considerable gender and cycle effects on extinction52 and was intended to reduce variance, thus allowing us to limit sample size. Reproduction in other samples is therefore required. Seventh, where we reported results from exploratory whole-brain analyses, these are not corrected for multiple comparisons, as emphasized earlier. Again, reproduction will be paramount.

It is worth noting that we did not find evidence for a role for DA in fear acquisition, in line with one genetic study examining COMT genotype effects on conditioning.31 By contrast, a recent pharmacological fMRI study36 had reported enhanced aversive prediction errors δav in the caudate nucleus and the substantia nigra/ventral tegmental area during conditioning in participants under amphetamine compared with participants under haloperidol. As amphetamine participants also reported feeling less tired, drowsy and slowed, these results might also reflect a general attentional effect. It should, however, also be noted that the absence of DAT1 or COMT effects on our and other people's measures of conditioning does not exclude a contribution of DA to fear acquisition. Further research will be necessary to settle this question.

To conclude, our findings highlight DA as a candidate neurotransmitter for fear extinction. This opens up interesting perspectives for neurobiological therapy augmentation, for instance, via adjunctive treatment with DAergic drugs. Experimental studies using the NMDA receptor agonist D-cycloserine to enhance the effects of exposure therapy have demonstrated the potential for such a strategy53 (reviewed in Grillon54). Pharmacological augmentation might be particularly useful in patients resistant to standard forms of behavior therapy. We would, however, caution against testing in patients any drug for which possible potentiating effects on fear expression or conditioning have not been carefully ruled out in previous non-clinical studies. Another promising lead for future studies would be to examine interactions with the endogenous opioid system, which, animal studies suggest, is another potential substrate of error signaling during fear extinction55, 56 and therefore another interesting candidate neurotransmitter system for translational research.