Valuation in major depression is intact and stable in a non-learning environment

The clinical diagnosis and symptoms of major depressive disorder (MDD) have been closely associated with impairments in reward processing. In particular, various studies have shown blunted neural and behavioral responses to the experience of reward in depression. However, little is known about whether depression affects individuals’ valuation of potential rewards during decision-making, independent from reward experience. To address this question, we used a gambling task and a model-based analytic approach to measure two types of individual sensitivity to reward values in participants with MDD: ‘risk preference,’ indicating how objective values are subjectively perceived, and ‘inverse temperature,’ determining the degree to which subjective value differences between options influence participants’ choices. On both of these measures of value sensitivity, participants with MDD were comparable to non-psychiatric controls. In addition, both risk preference and inverse temperature were stable over four laboratory visits and comparable between the groups at each visit. Neither valuation measure varied with severity of clinical symptoms in MDD. These data suggest intact and stable value processing in MDD during risky decision-making.

in such tasks. The first, 'risk preference' (RP) 13,14 , reflects how objective values are subjectively perceived (subjective value) and is quantified by the curvature of a power utility function 12 . This approach is equivalent to defining risk as the variance of each option and thus takes into account both the spread of outcome magnitudes and the probability information. The second component determines the degree to which subjective value differences between options affect the probability of choosing one option over the other, and is often referred to as 'inverse temperature' (IT) 15 . Both components characterize individual differences in the direction and the degree to which objective values impact individual choices, and thus are used as measures of value sensitivity in the current study. Note that each measure captures a different functional relationship between subjective values and decision-making: RP accounts for nonlinear (concave or convex) subjective valuation, whereas IT is a linear scaling of subjective values (similar to 'reward sensitivity' in other MDD studies 1 ; see Methods for expected utility model specifications). The value sensitivity measures were estimated from individuals' choices using maximum a posteriori fitting (see Methods for parameter estimation procedure).
Participants completed the decision-making task on up to four laboratory visits as part of a longitudinal study; on average, visits were separated by 5.5 weeks (mean of 116.27 days between Time 1 and Time 4 visits). At each visit, participants were instructed that one of their actual choices would be randomly selected and played out to determine their payoff at the end of the visit. The payoff was determined by the values of a gamble selected via random number generator from the participant's actual choices and a roll of a hundred-sided die (the first determining which gamble would be played and the second determining the payoff). Participants who made fewer than two visits to the laboratory, had Beck Depression Inventory (BDI-II) 16 scores > 12 for controls or < 13 at Time 1 for MDD participants, or always chose the option with smaller expected value were excluded from analyses (see Methods for numbers of excluded participants for each criterion). The analyzed sample for RP and IT parameter estimation included 33 non-psychiatric controls (14 females; age = 33.00 ± 11.31) and 65 individuals with MDD (48 females; age = 37.92 ± 11.48). See Table 1 for further demographic information.

Results
Valuation is comparable between MDD and non-psychiatric control participants. To compare value sensitivity in MDD participants with that of non-psychiatric controls, we estimated each individual's risk preference and inverse temperature for each visit, and first compared the means of these parameters between groups (see Methods for details about parameter estimation). Thus, RP and IT at each of the four visits were computed for each individual, for participants who visited all four times (N control = 28, N MDD = 47). Group mean parameter values were: RP control = 0.50 ± 0.31; RP MDD = 0.46 ± 0.31; IT control = 3.41 ± 0.41; and IT MDD = 3.25 ± 0.43 (mean ± s.d.). Note that both the MDD and non-psychiatric control groups showed risk aversion (RP < 1) consistent with Holt & Laury 8 . Across four laboratory visits, participants with MDD showed RP and IT comparable to those of non-psychiatric controls ( [...] 19,20 . As an initial test of RP and IT stability within MDD and control participants, we compared within-group means over time; these analyses indicate that neither parameter differed over time for either group (RP control : χ²(3, 81) = 2.12, P = 0.55; RP MDD : [...] Table S2 for sample size at each visit). Note that the proportion of risky choices, a model-free measure of risk preference, was also stable over time in both the MDD and control groups (see Supplementary Fig. S2 for model-free risk preference stability over time). Again, Bayesian null hypothesis significance testing confirmed the stable RP and IT across visits (see Supplementary Text and Table S5 for details and Bayes factors). These significant correlations indicate that for MDD and control participants, the risk preference and inverse temperature measures of value sensitivity at the decision phase are stable over time.
Figure 1. Payoffs and probabilities of paired gambles. Participants played a gambling task that consisted of a menu of probabilities of high- and low-payoff values. As per Holt & Laury 8 , participants made nine choices between two risky gambles, 'Option A' and 'Option B'. The high- and low-payoffs assigned to each option were fixed as shown here. The probability associated with payoff values was represented as a range of numbers; this allowed participants to easily match the probability of each outcome with a roll of a hundred-sided die; this roll was performed after the task for one randomly selected gamble to determine the final outcome for payoff. The rightmost column shows the expected value differences between Options A and B. Expected utility theory predicts that a risk neutral individual will choose Option A in decisions 1-4 where EV(B) < EV(A) and Option B in decisions 5-9 where EV(B) > EV(A).
[Figure caption, beginning not recovered] [...] Table S2 for statistical results. Each point represents an individual participant, and the color-coded lines are the robust regression lines between measures from two visits. The x- and y-axes each represent the rank order of individual participants at each visit (for simplicity, not labeled here); *P < 0.05, **P < 0.01, ***P < 0.001, uncorrected; all correlations were significant after applying multiple comparison correction (FDR q < 0.0001).

Valuation does not vary with severity of clinical symptoms in MDD.
Given previous reports that reward sensitivity at decision outcome varies with symptoms in depression 1,21-23 , we also examined whether RP and IT varied systematically with depressive or anxious symptoms. Symptoms were measured using the BDI-II 16 , the Anhedonia subscale of the BDI-II (sum of BDI items 4, 12, 15, and 21) 24 , the State Anxiety Scale of the Spielberger State-Trait Anxiety Inventory (STAI) 25 , and the five subscales of the Mood and Anxiety Symptoms Questionnaire (MASQ; Anhedonic Depression, Anxious Arousal, General Distress (GD): Anxiety, GD: Depression, and GD: Mixed) 26 ; correlations were performed within the MDD group. None of the clinical symptom scores or changes in symptoms over time were related to MDD participants' RP or IT parameter values (see Supplementary Fig. S1 and Supplementary Tables S3 and S4 for statistical test results). These data demonstrate that individual differences in value sensitivity during decision-making are not explained by clinical characteristics of MDD.

Discussion
The current study used a risky decision-making task to investigate MDD individuals' value sensitivity at the decision phase independent from learning and feedback. The within-subjects repeated-measures design allowed us to examine the stability of the value sensitivity measures, and the model-based approach dissociated linear (inverse temperature) and nonlinear (risk preference) value sensitivities that together determine behavioral choices during risky decision-making.
A few previous studies have used risky decision-making paradigms and measured MDD individuals' risk preferences. The results, however, have been inconsistent. Some studies have reported decreased risk-seeking behavior in individuals with MDD 21,27,28 , while other studies have reported comparable risk preferences between individuals with MDD and healthy individuals 29,30 . In the current study, we showed that risk preferences (nonlinear value sensitivity) in individuals with MDD are comparable with those of healthy individuals. The stability of risk preferences was tested across four repeated visits, and consistent with previous findings in unselected control individuals 19,20,31 , MDD participants showed stable risk preferences over time (cf. model-free measures showing lower reliability 32-34 ). In addition to estimating risk preference, we examined inverse temperature (linear value sensitivity, similar to 'reward sensitivity' in other MDD studies 1 ) at the decision phase, and showed that inverse temperature in MDD participants is stable and comparable to that of non-psychiatric controls. In addition, none of the clinical symptom severity measures within participants with MDD were related to individual differences in risk preference or inverse temperature. These results indicate that, in contrast with previous decision-making studies showing blunted valuation at the outcome phase in MDD 1 , neither linear nor nonlinear value sensitivity at the decision phase in MDD was different from that of controls.
To date, studies examining valuation in MDD have focused primarily on the outcome phase of reward learning tasks and shown impaired valuation, including diminished neural reward responses 35-37 , reduced learning rate 38 , lower reward sensitivity 1 , or enhanced exploration (more frequent choice shifting) 39,40 in participants with MDD. A few other studies have used various non-learning tasks and have suggested that individuals with MDD have low motivation for monetary reward 21,41,42 ; however, in these studies, the focus was also on responses at the outcome phase 22,23 . Unlike the abundant literature about responses to reward outcome (particularly during reward learning), little is known about whether individuals with MDD have intact ability to process and compare values during decision-making when no learning is required. The current study provided no outcome feedback during the task and thus focused on the decision phase dissociated from learning and reward experience. These data showed that during the decision phase, participants with MDD have value processes comparable to those of healthy individuals. This is consistent with previous studies showing intact neural responses in individuals with MDD during reward anticipation (prior to outcome) 43,44 . Together, the present data suggest that individuals with MDD have intact valuation when reward contingencies are fully known (no reward learning required) and suggest that previously reported valuation deficits in MDD are specific to the outcome phase of tasks in which rewards are experienced and learning occurs. We note that this conclusion rests on the assumption of linear probability weighting (see Supplementary Text for nonlinear probability weighting function) and on the present task, in which each pair of gambles had the same high- and low-payoff probabilities.
Individuals with MDD may exhibit altered valuation in other environments (e.g., when making choices between two options that have different payoff probabilities); these possibilities cannot be ruled out in the present data.
In MDD, intact valuation, dissociated from learning, may provide mechanistic insight into behavioral activation therapies for depression 45 . These types of therapies engage individuals with potential positive reinforcers (rewards) in a structured manner and, in doing so, allow individuals with MDD to largely bypass disrupted learning processes. That is, behavioral activation provides a guided learning environment wherein engagement and experience of action-reward contingencies are enforced, allowing for the value of rewards to evolve from being unsampled and ambiguous to sampled and fully known. Once these values are known, intact decision processes such as those identified here allow individuals to engage in healthy choices. As our data indicate, when action-reward contingencies are fully known, participants with MDD show intact valuation during decision-making. We speculate that this state is comparable to the endpoint of successful behavioral activation wherein the experience of reward is restored. In summary, the current study suggests specificity of previously reported value processing disruptions in MDD, informs the conditions under which sensitivity to reward values is preserved, and offers the possibility that learning about reward values, rather than discriminating among values when making decisions, may be a mechanistic target for intervention in MDD.
Scientific Reports | 7:44374 | DOI: 10.1038/srep44374
Methods
Participants. Fifty non-psychiatric controls and 80 individuals with MDD were recruited as part of a larger ongoing study examining neural substrates of treatment response in MDD (neural and treatment data will be analyzed as part of another manuscript). Among these participants, we included individuals who participated in at least the Time 1 and Time 4 laboratory visits to maximize the time interval for test-retest reliability. These inclusion criteria yielded 41 non-psychiatric controls and 69 individuals with MDD for the present study. Basic inclusion/exclusion criteria were initially assessed via telephone and were confirmed during the first laboratory visit using the Structured Clinical Interview for DSM-IV-TR Axis I Disorders - Research Version - Patient Edition (With Psychotic Screen) (SCID-I/P) 46 and selected modules of the Mini-International Neuropsychiatric Interview (M.I.N.I.) 47 . At study intake, individuals in the MDD group met DSM-IV criteria for MDD and/or dysthymia while individuals in the control group did not meet criteria for any current Axis I disorder. Exclusion criteria for all participants included contraindications to magnetic resonance imaging (MRI) and history of neurological disease. Following the initial screening visit (Time 1), participants returned to the lab up to three times; on average, there were 5.5 weeks between each visit. Three controls whose BDI-II scores were above the non-depressive range (i.e., greater than 12) at any visit and two individuals with MDD who had BDI-II scores in the non-depressive range (i.e., less than 13) at Time 1 were additionally excluded from analyses 48 . Five controls and two individuals with MDD who always chose the option with smaller expected value were also excluded.
Therefore, the analyzed sample for RP and IT parameter estimation included 33 healthy controls (14 females; age = 33.00 ± 11.31) and 65 participants with MDD (48 females; age = 37.92 ± 11.48). See Table 1 for additional demographic information. Use of psychotropic medication was not an exclusion criterion for individuals with MDD, and at study enrollment, 20 participants with MDD were taking one or more psychotropic medications. As noted above, these data were collected as part of a larger study examining neural substrates of treatment response in MDD, and a subgroup of individuals with MDD received cognitive behavioral therapy (CBT) over the course of participation (N = 45; see Supplementary Table S1 for demographic data for the treatment group). As such, participants showed decline in symptoms over time; neither symptoms nor symptom change was related to either RP or IT (as detailed throughout the main text). All participants provided written informed consent following an explanation of study procedures. The study was approved by the Institutional Review Board (IRB) of Virginia Tech, and all experimental procedures followed relevant institutional guidelines and regulations.
Experimental procedure. Participants made a series of nine choices between two gambles, one of which was objectively riskier than the other (adapted from Holt & Laury 8 ) (Fig. 1). Each pair of gambles had the same high- and low-payoff probabilities, which varied from 10% to 90% in 10% increments across the nine pairs. Payoff spreads between high- and low-payoffs were fixed for each option; 'Option A' had $5.00 and $4.00, and 'Option B' had $9.63 and $0.25 as potential payoffs. Participants were paid based on the actual outcome of one of their choices; the payoff was determined by the values of a gamble selected via random number generator from the participant's actual choices and a roll of a hundred-sided die (the first determining which gamble would be played and the second determining the payoff).
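As an illustration of the task structure described above, the following minimal Python sketch tabulates the expected value difference for each of the nine decisions. It assumes that the stated 10% to 90% probabilities refer to the high payoff of each option; the names `PAYOFFS`, `P_HIGH`, `expected_value`, and `ev_differences` are illustrative and are not taken from the study's analysis code.

```python
# Payoffs from the task description, as (high, low) in dollars.
PAYOFFS = {"A": (5.00, 4.00), "B": (9.63, 0.25)}

# Assumption: the probability of the high payoff runs 0.1, 0.2, ..., 0.9
# across decisions 1-9 and is shared by both options within a pair.
P_HIGH = [k / 10 for k in range(1, 10)]


def expected_value(option, p_high):
    """Expected value of one gamble given the probability of its high payoff."""
    high, low = PAYOFFS[option]
    return p_high * high + (1 - p_high) * low


def ev_differences():
    """Return EV(B) - EV(A) for each of the nine decisions, in order."""
    return [expected_value("B", p) - expected_value("A", p) for p in P_HIGH]


if __name__ == "__main__":
    for decision, diff in enumerate(ev_differences(), start=1):
        print(f"decision {decision}: EV(B) - EV(A) = {diff:+.2f}")
```

Under these assumptions the difference is negative for decisions 1-4 and positive for decisions 5-9, which matches the risk-neutral switching point described for Fig. 1.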
Model-free analyses. For model-free behavioral analyses, the proportion of choices of the risky option (P(risky)) among the nine pairs of gambles was used as a measure of risk preference. Given the expected value (EV) differences between the paired options (Fig. 1), a risk neutral individual should show P(risky) = 5/9 ≈ 0.56 (as per expected utility theory, a risk neutral individual is expected to choose Option B in the trials where EV(B) > EV(A), decisions 5-9, and to choose Option A in the trials where EV(B) < EV(A), decisions 1-4). P(risky) above this value thus indicates risk seeking, and P(risky) below it indicates risk aversion. P(risky) was calculated per visit and used to examine stability of model-free risk preferences over time in each group.
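The model-free measure can be sketched in a few lines of Python. The choice encoding (a list of 'A'/'B' strings, with Option B treated as the risky option) and the function name `p_risky` are assumptions made for illustration only.

```python
def p_risky(choices):
    """Proportion of risky (Option B) choices in a sequence of 'A'/'B' choices."""
    if not choices:
        raise ValueError("choices must not be empty")
    return sum(choice == "B" for choice in choices) / len(choices)


# Hypothetical risk-neutral participant: Option A in decisions 1-4,
# Option B in decisions 5-9.
risk_neutral_choices = ["A"] * 4 + ["B"] * 5

if __name__ == "__main__":
    print(f"P(risky) = {p_risky(risk_neutral_choices):.2f}")
```

For the hypothetical risk-neutral pattern this gives 5/9; a participant who switches to Option B later than decision 5 would fall below that benchmark.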
Estimates of individual risk preference. We applied expected utility theory 12 to estimate each individual's risk preference (RP) and inverse temperature (IT) that predict the individual's choices. We used a standard power utility function and softmax choice rule as described below:

$$U_A = P_{\text{high payoff}} \, V_{A:\,\text{high payoff}}^{\alpha} + \left(1 - P_{\text{high payoff}}\right) V_{A:\,\text{low payoff}}^{\alpha}$$

$$U_B = P_{\text{high payoff}} \, V_{B:\,\text{high payoff}}^{\alpha} + \left(1 - P_{\text{high payoff}}\right) V_{B:\,\text{low payoff}}^{\alpha}$$

$$P(\text{choose B}) = \frac{1}{1 + e^{-\mu \left(U_B - U_A\right)}}$$

where U A (U B ) is the utility of Option A (Option B), P is the probability of earning a payoff, V represents the payoff amount for each gamble, α is the risk preference, and μ is the inverse temperature. The estimated RP, α , indicates whether an individual is risk averse (0 < α < 1), risk neutral (α = 1), or risk seeking (α > 1). The estimated IT, μ , indicates how sensitive an individual is to the utility differences between the two gambles; larger μ indicates higher sensitivity to utility differences and μ ≈ 0 indicates utility (subjective value) insensitivity. To achieve a more stable parameter estimation for each individual, we adopted a hierarchical model structure of the population 49 in which it is assumed that a participant i's parameters (μ i and α i ) are sampled from the population's parameter distribution. Of importance, both controls and participants with MDD were considered to share the same group-level (population) distribution (equal prior), which allowed us to compare the two participant groups in the further analyses. That is, the two groups were assumed to be coming from the same distribution every time the group-level parameters were updated throughout the estimation procedures.
As this approach may bias against finding group differences (because individuals from two groups are treated as samples from equal priors), we also implemented two additional approaches with different assumptions: i) separately estimating group-level distributions (biasing toward finding group differences) and ii) estimating an additional variable that captures a potential group mean difference (examining whether the mean group difference is different from zero) (see Supplementary Text for details).
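To make the roles of α and μ concrete, the sketch below computes the log-likelihood of one participant's nine choices under a power utility function and a softmax (logistic) choice rule. The exact functional form used here, U = P·V_high^α + (1 − P)·V_low^α with P(choose B) = 1 / (1 + exp(−μ(U_B − U_A))), is an assumption consistent with the verbal definitions in the text and is not a transcription of the study's code. The hierarchical prior and the group-level estimation are deliberately omitted, and all function and variable names are hypothetical.

```python
import math

PAYOFFS = {"A": (5.00, 4.00), "B": (9.63, 0.25)}  # (high, low) in dollars
P_HIGH = [k / 10 for k in range(1, 10)]           # assumed P(high payoff), decisions 1-9


def utility(option, p_high, alpha):
    """Expected power utility of a gamble; alpha is the risk preference."""
    high, low = PAYOFFS[option]
    return p_high * high ** alpha + (1 - p_high) * low ** alpha


def _sigmoid(x):
    """Numerically stable logistic function (the exponent is never positive)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def prob_choose_b(p_high, alpha, mu):
    """Softmax probability of choosing Option B; mu is the inverse temperature."""
    diff = utility("B", p_high, alpha) - utility("A", p_high, alpha)
    return _sigmoid(mu * diff)


def log_likelihood(choices, alpha, mu):
    """Log-likelihood of nine 'A'/'B' choices for given alpha and mu."""
    total = 0.0
    for p_high, choice in zip(P_HIGH, choices):
        p_b = prob_choose_b(p_high, alpha, mu)
        p = p_b if choice == "B" else 1.0 - p_b
        total += math.log(max(p, 1e-12))  # floor guards against log(0)
    return total


if __name__ == "__main__":
    choices = ["A"] * 4 + ["B"] * 5  # hypothetical risk-neutral pattern
    print(log_likelihood(choices, alpha=1.0, mu=3.0))
```

In a full analysis, this likelihood would be combined with the group-level prior to obtain each individual's maximum a posteriori estimate; that step is outside the scope of this sketch.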
Based on the assumptions for each approach, we estimated the group-level parameter distribution for each parameter and set the distribution as a prior for individual estimation (maximum a posteriori (MAP) estimation). In the current study, we set the group-level distribution of each parameter as a gamma distribution 50 with a shape parameter, k, and a scale parameter, θ (μ ~ Γ(k μ , θ μ ) and α ~ Γ(k α , θ α )). For each iteration of the group-parameter estimation (maximum of 15,000 iterations), 100 random samples were drawn from each parameter distribution for each participant, and the average of the calculated likelihoods was used as an approximation of the integral in the following equation: