Introduction

Nearly all behavioral decisions involve judgements about the value of desired outcomes. Intuitively, all animals should choose actions that maximize outcome values when comparing multiple options that vary in costs and benefits. However, animals, including humans, vary in their decision preferences. For example, to some individuals, the subjective value of a small, certain outcome exceeds the subjective value of a much larger, uncertain outcome even if the expected value (i.e., probability of obtaining a reward multiplied by the reward amount) of the uncertain option is numerically greater. Similarly, humans regularly spend money now that would have much more spending power later if saved and invested. This tendency to discount the future such that the subjective value of a larger, delayed reward is lower than a smaller reward available now is common across many animal species.

Neuroimaging research has shown that although similar networks of regions represent subjective value across individuals, both behavioral preferences and neural representations of subjective value are also highly variable between people1. What, then, accounts for differences between people? Some have suggested that specific neurotransmitters, such as dopamine (DA), may influence subjective value computation and account for variation in neural representations1,2,3. While many studies using non-human primates or rodents have linked direct measurement of DA levels or the activity of DA-releasing cells to discounting behavior and subjective value coding4, no study to date has explored individual differences in direct measures of dopamine function and neural representations of subjective value.

Functional MRI (fMRI) studies have consistently shown that subjective value is reflected in modulation of brain activation in a network of regions including the ventromedial prefrontal cortex (vmPFC), ventral striatum (VS), and posterior cingulate cortex (PCC)5,6. Since subjective value scales with DA signals in the VS in nonhuman models, DA measures might vary with individual differences in subjective valuation. Direct recordings from midbrain DA neurons in monkeys and rodents provide evidence that DA neurons are sensitive to the subjective value of rewards over decreasing delays7,8,9. Providing indirect support for this mesolimbic DA narrative in humans, pharmacological manipulation of D2Rs in humans impacts delay discounting behavior10. However, the regional non-specificity of drugs that target D2Rs11 limits attempts to detail the role of specific regions in this circuit, especially since D2Rs are present across the striatum and cortex12,13.

Although these studies demonstrate the impact of DA on discounting behavior, less is known about how it impacts neural subjective value signals. There are two signaling pathways that may account for effects of DA on discounting: (1.) the corticostriatal loop14 may account for mesolimbic DA influences on prefrontal inputs to the VS15,16 and (2.) the ventral striatopallidal loop may account for interactions between the VS and midbrain DA17,18. Potentially, individual differences in the components of either of these signaling pathways might underlie differences in prefrontal, striatal, or midbrain value computation. We therefore hypothesized that individual differences in D2R availability would correlate with subjective value signals in regions encoding subjective value. Based on previous research with both human and non-human animals, we had the strongest predictions for associations of D2Rs in the ventral striatum and midbrain with subjective value signals in ventral striatum, midbrain, and vmPFC. However, we explored multiple potential associations between the VS, midbrain, vmPFC, and PCC due to evidence from functional neuroimaging studies for subjective value signals in all of these regions5.

In this study, healthy young adults completed a delay discounting task for monetary rewards during an fMRI scan. On a separate visit, we collected direct measures of DA D2R availability using positron emission tomography (PET) combined with the high affinity D2R ligand [18 F]fallypride. We examined whether individual differences in measures of D2R availability were related to discounted subjective value representations in the brain.

Methods

Participants and screening procedures

Twenty-five healthy young adults (ages 18–24, M = 20.9, SD = 1.83, 13 females) were recruited from Vanderbilt University, Nashville, TN in 2012. Participants were subject to the following exclusion criteria: any history of psychiatric illness on a screening interview (a Structural Interview for Clinical DSM-IV Diagnosis was available for all subjects and confirmed no history of major Axis I disorders)19, any history of head trauma, any significant medical condition, or any condition that would interfere with MRI (e.g., inability to fit in the scanner, claustrophobia, cochlear implant, metal fragments in eyes, cardiac pacemaker, neural stimulator, and metallic body inclusions or other contraindicated metal implanted in the body). Participants with major medical disorders including diabetes and/or abnormalities on screening comprehensive metabolic panel or complete blood count were excluded. Participants were also excluded if they reported a history of substance abuse, current tobacco use, alcohol consumption greater than 8 ounces of whiskey or equivalent per week, use of psychostimulants (excluding caffeine) more than twice at any time in their life or at all in the past 6 months, or any psychotropic medication in the last 6 months other than occasional use of benzodiazepines for sleep. Any illicit drug use in the last 2 months was grounds for exclusion, even in participants who did not otherwise meet criteria for substance abuse. Urine drug screens were administered, and subjects testing positive for the presence of amphetamines, cocaine, marijuana, PCP, opiates, benzodiazepines, or barbiturates were excluded. Female participants had negative pregnancy tests both at intake and on the day of the PET scan. PET and discounting behavioral measures for all participants in this sample were previously reported as subsamples of multiple data sets2,20,21 and the present analysis is comprised of data from participants with valid PET and fMRI data.

Approval for the [18 F]fallypride study protocol was obtained from the Vanderbilt University Human Research Protection Program and the Radioactive Drug Research Committee. All participants completed written informed consent and study procedures were approved by the Institutional Review Board at Vanderbilt University in accordance with the Declaration of Helsinki’s guidelines for the ethical treatment of human participants.

Delay discounting task

The delay discounting task was adapted from a previously used paradigm22. On each trial, participants chose between an early reward and a late reward. The delay of the early reward was set to today, 2 weeks, or 1 month, while the delay of the late reward was set to 2 weeks, 1 month, or 6 weeks later. The early reward magnitude ranged between 1% and 50% less than the late reward (determined by a Gaussian distribution, min = $5, max = $30, mean = $15, standard deviation = $10). Participants played 84 trials (42 trials in two runs) of the task. Participants had up to 8 seconds to select a reward, after which their selection was highlighted for 2 seconds, followed by an inter-trial-interval that lasted up to 10 seconds minus the reaction time. If no response was made on the choice slide within 7950 milliseconds, the blank (ITI) slide was set to 2050 milliseconds. Thus, each trial lasted approximately 12 seconds from the choice screen onset to the end of an ITI. (See Fig. 1A). To ensure participants were motivated in their choices, the task was incentive compatible. Participants were instructed to treat all decisions as real because a random trial would be selected for actual payout at the end of the experiment. Participants were paid in Amazon.com credit that was emailed to them either that afternoon (if the participant selected a reward available today) or scheduled to be delivered to them at the delayed date (if the participant selected a reward available later).

Figure 1
figure 1

(A) Task outline. Participants were given up to 8 seconds to indicate a preference for a smaller-sooner or a larger-later monetary reward, after which their choice was highlighted for two seconds. Choice trials were separated by an inter-trial-interval (ITI) scaled by the difference between 10 seconds and the choice response time so that every trial lasted 12 seconds from choice onset to ITI. (B) Top Row: Mean BPND map (left) and ventral striatum ROI (right) from which the average DA D2 receptor availability was extracted for each subject. Bottom Row: Ventromedial prefrontal cortex (left) and midbrain (right) ROIs from which subjective value parameter estimates and DA D2 receptor availability were extracted for each participant. BPND map and ROIs shown are overlaid on the mean participant T1-weighted image in MNI space.

Subjective value modeling

We used a computational model to estimate subjective value from behavioral preferences to create a timeseries regressor for fMRI analysis. Existing theories of reward discounting vary in their assumption about the shape of the discounted value function. To identify the model that provides the best fit to the data, we performed a model comparison across five models: (1.) hyperbolic (\(SV=\frac{A}{1+kD}\))23, (2.) exponential (\(SV=A\cdot {e}^{-kD}\)), (3.) double-exponential beta-delta (\(SV=A\cdot ({e}^{-\beta D}+{e}^{-\delta D})\))24,25, (4.) discounted utility (\(SV=\frac{{e}^{-rA}}{r(1+kD)}\))11, and (5.) a quasi-hyperbolic model with exponentiated delay (\(SV=\frac{A}{1+k{D}^{s}}\))26,27. Discounting models were also compared against a random choice model with probabilities of selecting an option fixed at 0.5. Across models, A represents the monetary reward magnitude, k represents the discount rate, D represents the delay in days, and SV represents the subjective value of available options. In the beta-delta model, βand δ represent exponential discounting for immediate choices and delayed ones, respectively. In the discounted utility model, r represents a concave weight that captures risk aversion with increasing values. For all models, the probability of selecting an option a on trial t was fit using a softmax decision function (\({p}_{t}(a)=\,\frac{{e}^{\beta \cdot S{V}_{a}}}{{\sum }_{i=1}^{2}{e}^{\beta \cdot S{V}_{i}}}\)) with the free parameter β representing the inverse temperature which captures choice stochasticity. Optimal parameters were identified using a Nelder-Mead simplex algorithm to minimize the log-likelihood of the data. Model fit quality was evaluated using Bayesian Information Criterion (BIC) scores. Paired-samples t-tests on BIC scores revealed that the hyperbolic model provided the overall best fit. While the hyperbolic model provided a better fit than the beta-delta model (t21 = −4.30, p < 0.001), the discounted utility model (t21 = −4.57, p < 0.001), the quasi-hyperbolic model with exponentiated delay (t21 = −4.28, p < 0.001), and the random choice model (t21 = −12.0, p < 0.001), it did not provide a significantly better fit than the exponential model (t21 = 0.709, p = 0.485).

Nevertheless, even a good-fitting model could provide inaccurate predictions of actual choices. To ensure that the hyperbolic model best predicted choices based on subjective values estimated from participant-specific free model parameters, we estimated the balanced accuracy for each participant across all models. Balanced accuracy provides an ideal choice prediction measure because it does not assume that the number of smaller-sooner and larger-later preferences are equal28. Balanced accuracy reflects the proportion of choices correctly predicted by the model (smaller sooner: tSS or larger-later: tLL) to all choices (correctly and incorrectly (fSS or fLL) identified): \(\frac{1}{2}(\frac{tSS}{tSS+fSS}+\frac{tLL}{tLL+fLL})\). The hyperbolic model provided higher balanced accuracy (78.0%) over the beta-delta model (68.3%) and the discounted utility model (73.1%), but not the quasi-hyperbolic model with exponential delay (79.1%) and only slightly better than the exponential model (77.7%). Based on the combination of the model fit quality and predictive accuracy, the hyperbolic model was selected. The hyperbolic model has additional advantages of parsimony with a single free parameter (k) in the value function. Since k values from the hyperbolic model are skewed, we used the natural log-transformed values Ln(k) for behavioral correlations.

PET acquisition and processing

[18 F]fallypride, (S)-N-[(1-allyl-2-pyrrolidinyl)methyl]−5-(3[18 F]fluoropropyl)−2,3-dimethoxybenzamide, was produced in the radiochemistry laboratory attached to the PET unit at Vanderbilt University Medical Center, following synthesis and quality control procedures described in US Food and Drug Administration IND 47,245. PET data were collected on a GE Discovery STE (DSTE) PET scanner (General Electric Healthcare, Chicago, IL, USA). The scanner had an axial resolution of 4 mm and in-plane resolution of 4.5 to 5.5 mm FWHM at the center of the field of view. Serial scan acquisition was started simultaneously with a 5.0 mCi (185 MBq) slow bolus injection of the DA D2/3 tracer [18 F]fallypride (median specific activity = 5.33 mCi). CT scans were collected for attenuation correction prior to each of the three emission scans, which together lasted approximately 3.5 hours with two breaks for participant comfort.

[18 F]fallypride binding potential (BPND) image calculation

Voxelwise D2/D3 binding potential images were calculated using the simplified reference tissue model, which has been shown to provide stable estimates of [18 F]fallypride BPND29. The cerebellum served as the reference region because of its relative lack of D2/D3 receptors13. The cerebellar reference region was obtained from an atlas provided by the ANSIR laboratory at Wake Forest University. Limited PET spatial resolution introduces blurring and causes signal to spill onto neighboring regions. Because the cerebellum is located proximal to the substantia nigra and colliculus, which both have D2Rs, only the posterior 3/4 of the cerebellum was included in the region of interest (ROI) to avoid contamination of [18 F]fallypride signal from the midbrain nuclei. The cerebellum ROI also excluded voxels within 5 mm of the overlying cerebral cortex to prevent contamination of cortical signals. The bilateral putamen ROI, drawn according to established guidelines30 on the MNI brain, served as the receptor rich region in the analysis. The cerebellum and putamen ROIs were registered to each participant’s T1-weighted anatomical image using FSL non-linear registration of the MNI template to the individual participant’s T1. T1 images and their associated cerebellum and putamen ROIs were then co-registered to the mean image of all realigned frames in the PET scan using FSL-FLIRT (http://www.fmrib.ox.ac.uk/fsl/, version 6.00). Emission images from the 3 PET scans were merged temporally into a 4D file. To correct for motion during scanning and misalignment between the 3 PET scans, all PET frames were realigned using SPM8 (http://www.fil.ion.ucl.ac.uk/spm/) to the frame acquired 10 minutes post injection. Model fitting and BPND calculation were performed using PMOD Biomedical Imaging Quantification software (PMOD Technologies, Switzerland). Binding potential images represent the ratio of specifically bound ligand ([18 F]fallypride in this study) to its free concentration.

The bilateral midbrain and ventral striatum ROIs were drawn in MNI standard space using previously described guidelines30,31,32 and registered to PET images using the same transformations used in BPND calculation (see Fig. 1B). An additional ROI for the medial frontal cortex in MNI space was derived from the Harvard-Oxford Atlas and registered to PET images using the same transformations used in BPND calculation.

MRI data acquisition

Brain images were collected using a 3 T Phillips Intera Achieva whole-body MRI scanner using a 32-channel head coil (Philips Healthcare, Best, The Netherlands). For each run of the delay discounting task, we used T2*-weighted gradient echo-planar imaging (EPI) to acquired 262 volumes of 38 ascending slices, 3.2 mm thick with 0.35 mm gap (in-plane resolution 3 × 3 mm), FOV = 240 mm × 240 mm, flip angle (FA) = 79, TR = 2000 ms, TE = 35 ms. A high resolution T1-weighted image (TFE SENSE protocol; 150 slices (in-plane resolution 1 × 1 mm), FOV = 256 × 256, FA = 8, TR = 8.9 ms, TE = 4.6 ms) was acquired for registration purposes and ROI definition. The average time between a PET imaging session and fMRI session was 18.5 ± 13.1 days.

fMRI data preprocessing

Data preprocessing was performed using fMRIPrep version 1.0.0-rc933, a Nipype34 based tool. Each T1-weighted volume was corrected for bias field using N4BiasFieldCorrection v2.1.035 and skull-stripped using antsBrainExtraction.sh v2.1.0 (using OASIS template). Cortical surface was estimated using FreeSurfer v6.0.036. The skull-stripped T1-weighted volume was co-registered to a skull-stripped ICBM 152 Nonlinear Asymmetrical template version 2009c37 using a nonlinear transformation implemented in ANTs v2.1.038.

Functional data was slice time corrected using AFNI39 and motion corrected using MCFLIRT v5.0.940. “Fieldmap-less” distortion correction was performed by co-registering the functional image to the same participant’s T1w image with its intensity inverted41,42 and constrained with an average fieldmap template43, implemented with antsRegistration (ANTs). This was followed by co-registration to the corresponding T1-weighted volume using boundary-based registration44 with 9 degrees of freedom, implemented in FreeSurfer v6.0.0. Motion correcting transformations, T1-weighted transformation and MNI template warp were applied in a single step using antsApplyTransformations v2.1.0 with Lanczos interpolation.

Three tissue classes were extracted from T1w images using FSL FAST v5.0.945. Frame-wise displacement46 was calculated for each functional run using Nipype. For more details of the pipeline see https://fmriprep.readthedocs.io/en/latest/workflows.html.

We performed voxelwise nuisance signal removal using publicly-available scripts (https://github.com/arielletambini/denoiser) to clean the data. Specifically, we denoised the data for 10 fMRIPrep-derived confounds: CSF, white matter, standardized DVARS, framewise displacement (over 0.5 mm), and six motion parameters. Functional and structural image registration was verified by visual inspection of quality assessment reports automatically generated by the fMRIPrep software used to preprocess the data.

FSL FEAT (www.fmrib.ox.ac.uk/fsl) was run for each participant with fixed effects across runs. Functional data were high-pass filtered with a cutoff of 100 seconds, spatially smoothed with a 5 mm full-width-at-half-maximum (FWHM) Gaussian kernel, and grand-mean intensity normalized. FSL FILM pre-whitening was carried out for autocorrelation correction. Events were convolved with a double-gamma hemodynamic response function. A general linear model was fit to the data with a regressor for the mean (un-modulated) signal over the duration of the choice period and a regressor for the parametric modulation of subjective value at the choice reaction time with a duration of zero seconds. We applied temporal filtering and added the temporal derivative to the waveform. Visual inspection of data quality using outputs from MRIQC47 suggested one participant had fMRI scans with strong artefactual features. This participant was excluded from analysis. One participant was excluded from analyses because this person only had data for a single run of the task. One participant was excluded for corrupted fMRI data. This provided a final sample of 22 participants. Participant demographics and characteristics are listed in Table 1.

Table 1 Study sample characteristics.

Statistical analyses

Linear regressions between D2R BPND in each of the 3 PET ROIs (VS, midbrain, vmPFC) and discounting behavior (indexed with Ln(k) or proportion of smaller-sooner choices) was run in JASP (Version 0.9.2)48. For associations between PET ROIs and fMRI subjective value signal, we extracted the mean subjective value parameter estimates (percent signal change) for each participant from ROIs in the medial frontal cortex, posterior cingulate, midbrain, and ventral striatum (defined using the same methods described above for the PET ROIs) (see Fig. 1B). Like the medial frontal cortex ROI, the posterior cingulate was derived from the Harvard-Oxford Atlas. Statistically significant relationships were defined using a Bonferroni-correction for 12 tests (3 PET ROIs: VS, midbrain, vmPFC by 4 fMRI ROIs: VS, midbrain, vmPFC, PCC) on an alpha of 0.05 (p < 0.004). Effects surviving correction for multiple comparisons were followed-up with regressions controlling for age and sex as covariates of no interest. All regression coefficients reported are standardized.

For D2R ROIs significantly associated with subjective value signal, we conducted whole-brain analyses of the fMRI data to better localize the effects or identify associations in other regions. All whole-brain fMRI analyses were carried out in FSL FEAT with mixed effects using FLAME 1. Statistical maps were thresholded using a cluster-forming threshold with a height of Z > 2.3, and cluster-corrected significance of p < 0.05. Analyses were run to examine: (1) the mean effect of subjective value parametric modulation of the fMRI BOLD signal across all participants and (2) the correlation between individual differences in BPND and subjective value parametric modulation of the BOLD signal.

Results

Dopamine D2Rs and delay discounting behavior

As expected, computationally-derived discount rates, Ln(k), were strongly positively correlated with the proportion of smaller-sooner options chosen (β = 0.911, 95% CI [0.794, 0.963], p < 0.001). As already reported in a previous publication2, D2R BPND was not correlated with the proportion of smaller-sooner choices or Ln(k) values for any ROI: VS (prop sooner: β = 0.036, 95% CI [−0.391, 0.451], p = 0.873; Ln(k): β = −0.021, 95% CI [−0.439, 0.404], p = 0.927), Fig. 2D; midbrain (prop sooner: β = 0.036, 95% CI [−0.392, 0.451], p = 0.874; Ln(k): β = –0.054, 95% CI [−0.465, 0.376], p = 0.810), and vmPFC (prop sooner: β = 0.218, 95% CI [−0.224, 0.586], p = 0.330; Ln(k): β = 0.195, 95% CI [−0.247, 0.570], p = 0.384). These D2R-discounting behavior results presented here are based on a subset of the data used in the prior publication. That prior publication showed no significant associations between D2R and discounting behavior in healthy adults across three samples that included this sample2.

Figure 2
figure 2

(A) Mean effect of subjective value (N = 21) overlaid on the mean participant T1-weighted image in standard space, whole brain cluster-forming threshold Z > 2.3, cluster-corrected p < 0.05. Delay discounting was not correlated with the effect of subjective value on fMRI signal in the (B) vmPFC (N = 22, r = −0.312, p = 0.158) or (C) midbrain (N = 22, r = −0.097, p = 0.669). DA D2-like receptor availability in the ventral striatum was not correlated with (D) delay discounting (N = 22, r = −0.053, p = 0.821). DA D2-like receptor availability was positively correlated with the effect of subjective value on fMRI signal in the (E) vmPFC (N = 21, r = 0.624, p = 0.003) and (F) midbrain (N = 22, r = 0.597, p = 0.003). Shaded regions indicate 95% confidence interval.

Localization of subjective value representations within fMRI data

Voxelwise analysis of the mean effect of subjective value of the chosen option revealed significant parametric modulation in the dorsomedial PFC. Exclusion of a single outlier revealed stronger and spatially extended activation in the vmPFC and PCC (see Table 2 and Fig. 2A). These effects are consistent with previous studies using subjective value as a parametric regressor6,25,49. Both unthresholded maps with and without the outlier are available to view/download on Neurovault (https://neurovault.org/collections/PDSRXDAH/).

Table 2 Average neural representations of subjective value of the chosen option.

Subjective value representations and delay discounting behavior

We ran correlations between fMRI parameter estimates for subjective value in each of the 4 subjective value ROIs (VS, midbrain, vmPFC, PCC) and discounting rates to evaluate whether individual differences in subjective value representations were associated with individual differences in time discounting behavior. There were no significant correlations between discounting and subjective value in the VS (prop sooner: β = −0.046, 95% CI [−0.459, 0.383], p = 0.839; Ln(k): β = −0.046, 95% CI [−0.383, 0.458], p = 0.840), midbrain (prop sooner: β = −0.060, 95% CI [−0.470, 0.371], p = 0.790; Ln(k): β = −0.097, 95% CI [−0.498, 0.339], p = 0.669), vmPFC (prop sooner: β = −0.299, 95% CI [−0.640, 0.140], p = 0.176; Ln(k): β = −0.312, 95% CI [−0.648, 0.126], p = 0.158), and PCC (prop sooner: β = −0.014, 95% CI [−0.433, 0.410], p = 0.952; Ln(k): β = −0.009, 95% CI [−0.429, 0.414], p = 0.967) (See Fig. 2B,C).

Ventral Striatum D2Rs (PET) and subjective value representations (fMRI)

We identified a positive correlation between D2R BPND in the VS and subjective value-related fMRI signal in the vmPFC (β = 0.466, 95% CI [0.056, 0.742], p = 0.029). Combined visual inspection of the correlation and bivariate outlier statistics (Cook’s distance greater than 4 times the mean distance, t-test of studentized residuals (p < 0.05), and test of heteroskedasticity (p < 0.05)) identified an influential outlier with high D2R BPND but low subjective value parameter estimates that may have biased the estimated effect. This is the same outlier mentioned above in the fMRI analyses. Exclusion of this outlier revealed a stronger association (β = 0.624, 95% CI [0.263, 0.832], p = 0.003) (see Fig. 2E). This effect remained significant after controlling for age and sex as covariates of no interest (β = 0.548, p = 0.003). D2R BPND in the VS was also significantly positively associated with subjective value-related fMRI signal in the midbrain (β = 0.597, 95% CI [0.234, 0.814], p = 0.003) (see Fig. 2F). This effect remained significant after controlling for age and sex (β = 0.576, SE = 0.012, t(18) = 3.27, p = 0.004). By contrast, D2 BPND in the VS was not significantly associated with subjective value BOLD signal in the VS itself (β = 0.333, 95% CI [−0.103, 0.661], p = 0.130) or PCC (β = 0.026, 95% CI [−0.400, 0.443], p = 0.909).

Exploratory voxelwise analysis of the fMRI data using ventral striatal D2R BPND revealed a significant correlation between BPND and subjective value representation in a cluster including the left precentral gyrus, inferior frontal gyrus (IFG) pars opercularis, and a superior portion of the posterior insula (See Fig. 3, Table 3).

Figure 3
figure 3

Positive correlation between ventral striatum D2 BPND and subjective value in the left inferior frontal gyrus, whole brain cluster-forming threshold Z > 2.3, cluster-corrected p < 0.05 shown on the mean participant T1-weighted image in MNI space.

Table 3 Positive correlation between ventral striatum D2 BPND and neural representations of subjective value of the chosen option.

Midbrain D2Rs and subjective value representations

Midbrain D2R availability and subjective value-related fMRI signal in the vmPFC were not related (β = 0.254, 95% CI [−0.187, 0.610], p = 0.254). As before, combined visual inspection of the correlation and bivariate outlier statistics (Cook’s distance greater than 4 times the mean distance, t-test of studentized residuals (p < 0.05), and test of heteroskedasticity (p < 0.05)) identified an influential outlier (same as above) with high D2R BPND but low subjective value parameter estimates that may have biased the estimated effect. Exclusion of this outlier revealed a stronger association (β = 0.513, 95% CI [0.105, 0.773], p = 0.017) but the effect did not survive correction for multiple comparisons. D2 BPND in the midbrain was not significantly associated with subjective value in the midbrain (β = 0.398, 95% CI [−0.029, 0.702], p = 0.067), ventral striatum (β = 0.026, 95% CI [–0.400, 0.443], p = 0.909), or PCC (β = −0.172, 95% CI [−0.554, 0.269], p = 0.443). We did not conduct exploratory voxelwise analysis of the fMRI data using midbrain BPND.

Prefrontal D2Rs and subjective value representations

vmPFC D2R availability was not significantly associated with subjective value-related fMRI signal in the VS (β = −0.127, 95% CI [−0.521, 0.311], p = 0.573), midbrain (β = 0.143, 95% CI [−0.296, 0.533], p = 0.525), vmPFC (β = −0.129, 95% CI [−0.522, 0.309], p = 0.567), or PCC (β = −0.041, 95% CI [−0.454, 0.388], p = 0.858). We did not conduct exploratory voxelwise analysis of the fMRI data using vmPFC BPND.

Discussion

Here we tested the hypothesis that individual differences in mesolimbic DA D2Rs relate to neural representations of subjective value. We predicted associations between D2R in ventral striatum and midbrain and subjective value signals in ventral striatum, midbrain, and vmPFC. We identified a positive correlation between VS D2R availability and the strength of subjective value signals in the vmPFC and midbrain. However, neither D2R availability nor functional neural representation of subjective value were directly correlated with discounting behavior.

The positive correlations between mesolimbic D2Rs and subjective value in the vmPFC and midbrain are consistent with past findings converging on two key circuits: a corticostriatal loop and a ventral striatopallidal loop. The corticostriatal loop is comprised of a series of pathways that promote approach behavior. Activation of D2Rs in the ventral striatum increases GABAergic signaling to the ventral pallidum which projects to the thalamus50,51. Neurons in the vmPFC receive these thalamic projections and promote local release of DA in the ventral striatum52. The ventral striatopallidal loop is comprised of connections linking the ventral striatum and dopaminergic midbrain that promote reward “wanting”53. Specifically, D2-mediated ventral pallidal signals from the ventral striatum that complete the corticostriatal loop also promote DA release to the ventral striatum via GABAergic signals to the midbrain18,54 (See Fig. 4 for an illustration of these two potential mechanisms). Prevention of hyperdopaminergic states in these loops are regulated by dopamine transporters in the ventral striatum and somatodendritic autoreceptors in the midbrain55. Importantly, we did not measure DA release specifically in this study. Further studies with multiple measures of DA function are needed to test the specific links between subjective reward valuation and integration of value signals between the corticostriatal and ventral striatopallidal circuits.

Figure 4
figure 4

Illustration of a potential mechanism by which mesolimbic D2Rs impact subjective value (SV) and vice-versa. Binding of DA to D2Rs in the ventral striatum (VS) increases GABAergic signaling to the ventral pallidum (VP), which sends GABAergic projections to the thalamus (Thal) and midbrain (MB). GABAergic VP-MB signaling promotes DA release to the VS, while VP-Thal signaling promotes glutamate signaling in the ventromedial prefrontal cortex (vmPFC). The glutamatergic afferents from the vmPFC project to and promote local DA release in the VS.

The observed voxelwise associations between D2Rs in the VS and subjective value representations in select cortical regions are consistent with prior reported effects of dopaminergic drugs11,56,57. While these cortical regions are not often emphasized in fMRI studies of subjective value, variability in the encoding of subjective value in the IFG and precentral gyrus has been identified in studies of effort discounting49,58 and risky decision making59. In particular, Since the IFG and precentral gyrus support inhibitory control60 and motor control61, respectively, individual differences in DA function may impact corticostriatal signaling. Specifically, increased subjective value representations in the vmPFC (mediated by VS D2R) may recruit additional resources that increase motivational vigor by facilitating direct control of goal-directed movements toward highly-valued rewards62. The SV activation clusters in the insula in relation to VS D2R is similar to clusters previously identified in the posterior insula in relation to preferences for delayed rewards in a discounting task63. It is possible that individuals with more VS D2Rs in our sample may be representing SV associated with delayed preferences more strongly in the insula.

This study examined associations between PET measures of DA function, fMRI measures of subjective value processing, and discounting behavior. A related recent study identified associations between individual differences in D2R availability in the midbrain and neural representations of expected value (i.e., reward magnitude multiplied by probability) in the ventral striatum during a simple gambling task64. The findings show a similarity in that they both identified associations between value-related functional neural activity in one brain region and individual differences in D2R availability in a different region, with the present study exploring a broader set of ROIs in a larger sample to examine potential associations across more of the reward circuit.

Perhaps more importantly, the studies differed both in terms of the task utilized and the emphasis on objective vs. subjective valuation. The neural measure of value processing in the previous study was based on sensitivity to expected value64, an objective function that does not convey details about individual subjective utility. Nevertheless, some of the signal in that study may have reflected subjective valuation, as there were individual differences in the magnitude of representations of value in the fMRI data that were correlated with dopamine receptor availability. That is to say that there may have been subject-specific valuation in addition to the objective valuation that was shared across subjects. To evaluate the uniqueness of the association with subjective value signals in the present study, we conducted additional analyses that examined the robustness of the subject-specific effects in the present study. For comparability to the objective value representations in the prior paper, we also computed estimates of value in the fMRI data using a group-averaged discount rate. At the group level, mean SV provided a worse fit to the data and did not reveal significant activation clusters in any vmPFC or striatal regions (Supplementary Fig. S1). When we examined the association between SV BOLD parameter estimates and dopamine using the group averaged discount rate, positive associations between ventral striatal D2R availability and vmPFC subjective value remained significant (Supplementary Fig. S2). This may not be surprising given that rank-order participant differences in vmPFC subjective value were largely preserved across statistical maps based on either group-average or subject-specific discount rates. Given this rank consistency, individual differences with D2R availability were also preserved. Prior studies of value-related neural activity using computational models have documented similar consistency of fMRI estimates across wide ranges of parameter estimates65. However, the robustness of this association raises the possibility that the correlation between SV BOLD parameter estimates and D2R availability may not require a value-related model at all. To test this, we ran an additional model that did not include a parametric regressor for subjective value during the choice period. This analysis did not reveal mean BOLD activation during the choice period in the frontal cortex (Supplementary Fig. S3). In addition, vmPFC BOLD parameter estimates during the choice period were not correlated with D2 receptor availability (Supplementary Fig. S4). These supplemental analyses support the contention that the D2 receptor availability was related to subjective valuation representations rather than a group-level objective valuation signal.

A recent study66 in humans identified an association between PET measures of DA D1 receptor availability in the ventral striatum and reinforcement learning-based value signals in the vmPFC (but not with reinforcement learning behavior per se), but the dual role of DA in learned value and motivation complicates interpretation of mesolimbic DA influences on prefrontal reward processing67. Specifically, it remains unclear the extent to which updated state values emerging from prediction errors across time in a learning task are similar to goal values from one-shot decisions. Nevertheless, a positive correlation between vmPFC value representations and D1Rs in the ventral striatum in that study and D2Rs in the present study suggests a more nuanced relationship between DA function and value. Although D1Rs and D2Rs have opposite effects in the direct and indirect pathways of the basal ganglia, an emerging view suggests this dichotomy is specific to the dorsal striatum and non-existent in the ventral striatum68. Thus, mesolimbic DA signaling in the ventral striatum could support subjective value representations effectively in the same way via D1Rs or D2Rs.

While prior studies5 have identified subjective value representations in the VS, we did not observe this effect in this study, consistent with a recent analysis of healthy adults using a similar task49. As described in a meta-analysis of different kinds of value representations69, it has been suggested that subjective value-like signals in the VS might represent reward prediction errors and not goal values (which are more strongly represented in the medial prefrontal cortex). It is possible that some delay discounting tasks might have features that increased ventral striatal sensitivity to positive prediction errors—such as larger changes in the magnitudes of presented values from trial-to-trial. Since the trial-to-trial changes in presented reward magnitudes of the present study fluctuated around a normal distribution, it is possible that our design minimized ventral striatal sensitivity to strong fluctuations in values. A more liberal statistical voxel threshold did reveal average subjective value representations in the VS and caudate (see unthresholded results on NeuroVault: https://neurovault.org/collections/PDSRXDAH/). The lack of a clear VS subjective value signal may contribute to the absence of a correlation between VS BOLD responses and VS D2R availability. It is intriguing that VS D2R’s may nevertheless influence processing in other regions even in the absence of demonstrating a more direct influence on BOLD responses in the VS within the task paradigm.

We explored all potential associations between all PET and fMRI ROIs including the PCC due to evidence from functional neuroimaging studies for subjective value signals in the PCC5. We did not find significant associations between D2R in any of our ROIs and subjective value signals in the PCC. It is possible that subjective value-related neural activity observed using fMRI during discounting tasks is not as DA-mediated as in the striatum, midbrain, or vmPFC. Although the PCC is often functionally co-activated with the striatum and vmPFC, it is often not included in models of reward circuitry14. We also did not observe associations between vmPFC D2R and subjective value signals in any of the ROIs. It is possible that effects of prefrontal DA on discounting may be more D1R-mediated as prior work suggests D1Rs and D2Rs make dissociable contributions to specific features of discounting in rodents70,71.

It is striking that individual differences in DA function were correlated with neural representations of subjective value but not the behavior presumed to be influenced by regions encoding subjective value. This could suggest that DA function may impact idiosyncrasies in how choice values are computed without necessarily impacting a wide array of possible choice behaviors. The lack of a correlation between D2Rs and reward discounting is consistent across studies of healthy adults in a larger sample2. Together, these individual differences may suggest that revealed preferences indexed by behavioral choices are not aligned tightly enough to valuation signals indexed by BOLD responses to capture the biological mechanisms that shape valuation. Although this runs counter to assumptions in the neuroeconomics literature, the lack of strong associations complements some theoretical models of cognition. For example, fitting David Marr’s levels of analysis, DA signaling provides a neural substrate at the implementation level, subjective value provides the strategy at the algorithmic level, and preference for smaller-sooner options describe the problem at the computational level72. As Marr and others have described, the neural processes alone at the implementation level cannot adequately describe behavior at the computational level, but only have meaning inasmuch as each of these levels are linked by the intermediate algorithmic level73. This hierarchical structure might explain why D2R receptor availability (implementation level) alone does not reveal associations with discounting behavior (computational level), even though it does explain neural subjective value representation (algorithmic level). This precludes dynamic measures of dopamine signals (for example fast-scan cyclic voltammetry) which may more directly encode value signals9.

As with most neuroreceptor PET studies, the most important limitation of the present study is the sample size, which limits statistical power. Although the sample size of this study is comparable to or larger than other recent studies measuring both fMRI signal and DA PET measures within subjects64,74,75, no prior studies have used large enough samples to better estimate the effect sizes that might be expected. As such, even the strongest effects have quite wide confidence intervals, so the sizes of the true associations between these measures are unclear. Despite the relatively small sample, these results provide valuable information given the direct measurement of DA receptors and past speculation on the role of DA in the study of reward discounting. It is also important to note that since we did not measure DA release, we are limited from making stronger claims about transient changes in VS DA concentrations. Instead, baseline measures of D2R availability reflect individual differences that are more trait-like. Nonetheless, baseline measures have previously been shown to be positively correlated with DA release76. Importantly, these findings in a healthy young adult sample may not generalize to clinical samples. In fact, meta-analytic correlations between dopamine function and discounting behavior suggest that associations between dopamine and discounting vary across clinical and healthy samples2. Since all neuroimaging data (fMRI and PET) are publicly available on OpenNeuro (https://openneuro.org/datasets/ds002041), we hope this initial set of analyses and the complete data set provide a unique resource for other scientists to better understand associations between DA receptors and reward-related functional brain activation. Until now, it has been unclear how neural representations of subjective value arise to support a broad range of intertemporal choice behaviors in humans. The present findings suggest that variation in dopamine function may account for differences between people in neural representations of subjective value.