Title: Risky Decision-making in Major Depression Is Stable and Intact

Correspondence may be addressed to Pearl Chiu (chiup@vtc.vt.edu) or Brooks King-Casas (bkcasas@vtc.vt.edu). The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.


Abstract:
The clinical diagnosis and symptoms of major depressive disorder (MDD) have been closely associated with impairments in reward processing. In particular, various studies have shown blunted neural and behavioral responses to the experience of reward in depression. However, little is known about whether depression affects individuals' valuation of potential rewards during decision-making, independent of reward experience. To address this question, we used a gambling task and a model-based analytic approach to measure two types of individual sensitivity to reward values in participants with MDD: 'risk preference,' indicating how objective values are subjectively perceived, and 'inverse temperature,' determining the degree to which subjective value differences between options influence participants' choices. On both of these measures of value sensitivity, participants with MDD were comparable to non-psychiatric controls. Both risk preference and inverse temperature were also stable over four laboratory visits and comparable between the groups at each visit. Moreover, neither value sensitivity measure varied with severity of clinical symptoms in MDD. These data suggest intact and stable value processing in MDD during risky decision-making.

Main text:
Major depressive disorder (MDD) has been associated with impairments in reward processing, and many studies indicate that symptoms of MDD correlate with diminished neural and behavioral responses when rewards are presented [1][2][3][4][5]. These studies have typically used reward learning and other tasks that provide feedback about rewards and focused on individuals' responses at this feedback or 'reward outcome' phase (see Rizvi et al. 6 for review). However, little is known about how depression affects reward valuation during decision-making in the absence of learning and feedback. Understanding whether individuals with MDD have disrupted valuation during decision-making at the 'decision phase', separate from reward outcome, will clarify whether individuals with MDD are disrupted overall in reward valuation or more specifically in experiencing rewards. Here, we used a risky decision-making task, a model-based analytic approach, and a repeated-measures within-subject design across four visits to investigate whether participants with MDD have intact or disrupted valuation during decision-making in the absence of learning and feedback.
Sixty-nine individuals with current MDD and 41 non-psychiatric controls were included in the current study. To investigate 'value sensitivity' during decision-making independent of feedback, we asked participants to complete a risky decision-making task (adapted from Holt & Laury 7) (Fig. 1). During the task, participants made a series of nine choices between two gambles, one of which was objectively riskier than the other 7. Each pair of gambles had the same high- and low-payoff probabilities, which increased from 10% to 90% in 10% increments along the nine pairs. Participants' choices between the safer and riskier options, at each payoff and probability combination, were recorded to investigate individual value sensitivity. Participants were paid based on the actual outcome of one of their choices; the outcome was determined after all choices had been made (i.e., no feedback at each decision). This paradigm allowed us to examine valuation during decision-making, independent of potential learning and outcome effects.
Tasks of this sort are classically used to study individuals' value-based decision-making under risk, and expected utility theory 8 points to two basic components that account for differences among individuals' choices in such tasks. The first, 'risk preference' (RP) 9,10, reflects how objective values are subjectively perceived (subjective value) and is quantified by the curvature of a power utility function 8. The second component determines the degree to which subjective value differences between options affect the probability of choosing one option over the other, and is often referred to as 'inverse temperature' (IT) 11. Both components characterize individual differences in the direction and the degree to which objective values impact individual choices, and thus are used as measures of value sensitivity in the current study. Note that each measure explains a different functional relationship between value and decision-making: RP accounts for nonlinear (concave or convex) subjective valuation and IT is a linear scaling of values (similar to 'reward sensitivity' in other MDD studies 1; see Methods for expected utility model specifications). Based on maximum a posteriori fitting, the value sensitivity measures were estimated from individuals' choices (see Methods for parameter estimation procedure).
Participants completed the decision-making task on up to four laboratory visits as part of a longitudinal study; on average, visits were separated by four weeks (mean of 116.27 days between Time 1 and Time 4 visits). At each visit, participants were instructed that one of their actual choices would be randomly selected and played out to determine their payoff at the end of the visit. The payoff was determined by the values of a gamble selected via random number generator from the participant's actual choices and a roll of a hundred-sided die (the first determining which gamble would be played and the second determining the payoff). Participants who made fewer than two visits to the laboratory, had Beck Depression Inventory (BDI-II) 12 scores > 12 for controls or < 13 at Time 1 for MDD participants, or always chose the option with the smaller expected value were excluded from analyses (see Methods for numbers of excluded participants for each criterion). The analyzed sample for RP and IT parameter estimation included 33 non-psychiatric controls (14 females; age = 33.00 ± 11.31) and 65 individuals with MDD (48 females; age = 37.92 ± 11.48). See Table 1 for further demographic information.
Previous studies have shown that risk preferences measured with variations of the Holt & Laury task 7 are stable over time in unselected control individuals, particularly when model-based estimates were used 14,15. Adopting the approach of these studies for measuring temporal stability, we examined the stability of RP and IT within controls and participants with MDD by correlating the value of each parameter between pairs of visits ([1st vs 2nd visit], [1st vs 3rd visit], … [3rd vs 4th visit]) (Fig. 2aii, 2bii). Both control and MDD participants showed moderate to high stability in both RP and IT (mean correlation coefficients: RP, control: Spearman ρ = 0.57; RP, MDD: ρ = 0.54; IT, control: ρ = 0.48; IT, MDD: ρ = 0.57; see Fig. 2aii and 2bii for the full correlation matrix). Note that the proportion of risky choices, a model-free measure of risk preference, was also stable over time in both the MDD and control groups (see Fig. S2 in Supplementary materials for model-free risk preference stability over time). These significant correlations indicate that for MDD and control participants, the risk preference and inverse temperature measures of value sensitivity at the decision phase are stable over time.
Given previous reports that reward sensitivity at decision outcome varies with symptoms in depression 1,[16][17][18], we also examined whether RP and IT varied systematically with depressive or anxious symptoms. Symptoms were measured using the BDI-II 12, the Spielberger Anxiety Inventory state measure (SAI) 19, and the five subscales of the Mood and Anxiety Symptoms Questionnaire (MASQ) (Anhedonic Depression, Anxious Arousal, General Distress (GD): Anxiety, GD: Depression, and GD: Mixed) 20; correlations were performed within the MDD group. None of the clinical symptom scores were related to MDD participants' RP or IT parameters (see Fig. S1, Table S1, and Table S2 in Supplementary materials for statistical test results). These data demonstrate that individual differences in value sensitivity during decision-making are not explained by clinical characteristics of MDD.
The current study used a risky decision-making task to investigate MDD individuals' value sensitivity at the decision phase independent of learning and feedback. The within-subjects repeated-measures design allowed us to examine the stability of the value sensitivity measures, and the model-based approach dissociated linear (inverse temperature) and nonlinear (risk preference) value sensitivities that together determine behavioral choices during risky decision-making.
A few previous studies have used risky decision-making paradigms and measured MDD individuals' risk preferences. The results, however, have been inconsistent. Some studies reported decreased risk-seeking behavior in individuals with MDD 16,21,22, while other studies reported comparable risk preferences between individuals with MDD and healthy individuals 23,24. In the current study, we showed that risk preferences (nonlinear value sensitivity) in individuals with MDD are comparable with those of healthy individuals. The stability of risk preferences was tested across four repeated visits, and consistent with previous findings in unselected control individuals 14,15,25, MDD participants showed stable risk preferences over time (cf. model-free measures showing low reliability [26][27][28]). In addition to estimating risk preference, we examined inverse temperature (linear value sensitivity, similar to 'reward sensitivity' in other MDD studies 1) at the decision phase, and showed that inverse temperature in MDD participants is stable and comparable to that of non-psychiatric controls. In addition, none of the clinical symptom severity measures within participants with MDD were related to individual differences in risk preference or inverse temperature. These results indicate that, in contrast with previous decision-making studies showing blunted valuation at the outcome phase in MDD 1, neither linear nor nonlinear value sensitivity at the decision phase in MDD is different from that of controls.
To date, studies examining valuation in MDD have primarily focused on the outcome phase of reward learning tasks and shown impaired valuation, including diminished neural reward responses [29][30][31], reduced learning rate 32, lower reward sensitivity 1, or enhanced exploration (more frequent choice shifting) 33,34 in participants with MDD. A few other studies have used various non-learning tasks and have suggested that individuals with MDD have low motivation for monetary reward 16,35,36; however, in these studies, the focus was also on responses at the outcome phase 17,18. Unlike the abundant literature about responses to reward outcome (particularly during reward learning), little is known about whether individuals with MDD have intact ability to process and compare values during value-based decision-making when no learning is required. The current study provided no outcome feedback during the task and thus focused on the decision phase, dissociated from reward experience. These data showed that during the decision phase, participants with MDD do not have impaired valuation compared with healthy individuals. This is consistent with previous studies showing intact neural responses in individuals with MDD during reward anticipation (prior to outcome) 37,38. Together, the present data indicate that individuals with MDD have intact valuation when reward contingencies are fully known (no reward learning required) and suggest that previously reported valuation deficits in MDD are specific to the outcome phase of tasks in which reward experience and learning occur.
In MDD, the intact valuation, dissociated from learning, may provide mechanistic insight about behavioral activation therapies for depression 39. These types of therapies engage individuals with potential positive reinforcers (rewards) in a structured manner and, in essence, allow individuals with MDD to largely bypass disrupted learning processes. That is, behavioral activation provides a guided learning environment wherein action-reward contingencies can evolve from being unsampled and ambiguous to sampled and fully known. As our data indicate, when action-reward contingencies are fully known, participants with MDD show intact valuation processes. We speculate that this state is comparable to the endpoint of successful behavioral activation wherein the experience of reward is restored. In conclusion, the current study both suggests specificity of previously reported value processing disruptions in MDD to the experience of reward during learning and calls attention to the mechanistic precision about disease processes and treatment that may be obtained through task-specific decision models.

Methods and Materials:
Participants. Fifty non-psychiatric controls and 80 individuals with MDD were recruited as part of a larger ongoing study examining neural substrates of treatment response in MDD (neural and treatment data will be analyzed as part of another manuscript). Among these participants, for the current study, we included individuals who participated in at least the Time 1 and Time 4 laboratory visits, to maximize the time interval for test-retest reliability. These inclusion criteria yielded 41 non-psychiatric controls and 69 individuals with MDD for the present study. Basic inclusion/exclusion criteria were initially assessed via telephone and were confirmed during the first laboratory visit using the Structured Clinical Interview for DSM-IV-TR Axis I Disorders - Research Version - Patient Edition (With Psychotic Screen) (SCID-I/P) 40 and selected modules of the Mini-International Neuropsychiatric Interview (M.I.N.I.) 41. At study intake, individuals in the MDD group met DSM-IV criteria for MDD and/or dysthymia, while individuals in the control group did not meet criteria for any current Axis I disorder. Exclusion criteria for all participants included contraindications to magnetic resonance imaging (MRI) and history of neurological disease. Following the initial screening visit (Time 1), participants returned to the lab up to three times; on average, there were four-week intervals between visits. All participants provided written informed consent and were given instruction about the task. The study was approved by the Institutional Review Board of Virginia Tech. Three controls whose BDI-II scores were above the non-depressive range (i.e., greater than 12) at any visit and two individuals with MDD who had BDI-II scores in the non-depressive range (i.e., less than 13) at Time 1 were additionally excluded from analyses 42. Five controls and two individuals with MDD who always chose the option with the smaller expected value were also excluded. Therefore, the analyzed sample for RP and IT parameter estimation included 33 healthy controls (14 females; age = 33.00 ± 11.31) and 65 participants with MDD (48 females; age = 37.92 ± 11.48). See Table 1 for additional demographic information.
Experimental procedure. Participants made a series of nine choices between two gambles, one of which was objectively riskier than the other (adapted from Holt & Laury 7) (Fig. 1). Each pair of gambles had the same high- and low-payoff probabilities, which varied from 10% to 90% in 10% increments along the nine pairs. Payoff spreads between high and low payoffs were fixed for each option; 'Option A' had $5.00 and $4.00, and 'Option B' had $9.63 and $0.25 as potential payoffs. Participants were paid based on the actual outcome of one of their choices; the payoff was determined by the values of a gamble selected via random number generator from the participant's actual choices and a roll of a hundred-sided die (the first determining which gamble would be played and the second determining the payoff).
Model-free analyses. For model-free behavioral analyses, the proportion of choosing the risky option (P(risky)) among the nine pairs of gambles was used as a measure of risk preference. Given the expected value (EV) between pairs of choices (Fig. 1), a risk-neutral individual should show P(risky) = 5/9 ≈ 0.56 (as per expected utility theory, a risk-neutral individual is expected to choose Option B in the trials where EV(B) > EV(A), decisions 5-9, and to choose Option A in the trials where EV(B) < EV(A)). Higher P(risky) thus indicates risk seeking; P(risky) was calculated per visit and used to examine stability of model-free risk preferences over time in each group.
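The risk-neutral benchmark of 5/9 can be checked directly from the payoffs and probabilities given in the Experimental procedure. The following is a minimal sketch; the function and variable names are illustrative and are not taken from the original analysis scripts (which were written in MATLAB).

```python
# Payoffs (high, low) as stated in the Experimental procedure.
OPTION_A = (5.00, 4.00)   # safer option
OPTION_B = (9.63, 0.25)   # riskier option
# Probability of the high payoff for decisions 1-9 (10% ... 90%).
P_HIGH = [k / 10 for k in range(1, 10)]


def expected_value(p_high, payoffs):
    """Expected value of a two-outcome gamble."""
    high, low = payoffs
    return p_high * high + (1 - p_high) * low


def risk_neutral_choices():
    """Choice ('A' or 'B') of an expected-value maximizer at each decision."""
    return [
        "B" if expected_value(p, OPTION_B) > expected_value(p, OPTION_A) else "A"
        for p in P_HIGH
    ]


def p_risky(choices):
    """Proportion of decisions in which the riskier Option B was chosen."""
    return sum(c == "B" for c in choices) / len(choices)


choices = risk_neutral_choices()
print(choices)           # Option B is preferred from decision 5 onward
print(p_risky(choices))  # 5/9
```

Under these payoffs, EV(B) first exceeds EV(A) at a 50% high-payoff probability, which is why decisions 5-9 define the risk-neutral pattern.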
Estimates of individual risk preference. We applied expected utility theory 8 to estimate each individual's risk preference (RP) and inverse temperature (IT), which together predict the individual's choices. We used a standard power utility function and a softmax choice rule. In this model, U_A (U_B) is the utility of Option A (Option B), P_high-payoff is the probability of earning the high payoff, V represents the payoff amount for each gamble, α is the risk preference, and µ is the inverse temperature. The estimated RP, α, indicates whether an individual is risk averse (0 < α < 1), risk neutral (α = 1), or risk seeking (α > 1). The estimated IT, µ, indicates how sensitive an individual is to the utility differences between the two gambles; larger µ indicates higher sensitivity to utility differences and µ ≈ 0 indicates utility (subjective value) insensitivity.
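As an illustration of the model described above, the sketch below assumes the conventional forms of a power utility and a two-option softmax: U = P_high-payoff × V_high^α + (1 − P_high-payoff) × V_low^α for each option, and P(choose B) = 1 / (1 + exp(−µ (U_B − U_A))). These forms are an assumption consistent with the symbol definitions in the text; the exact parameterization in the original analysis may differ, and all function names are hypothetical.

```python
import math

OPTION_A = (5.00, 4.00)   # (high, low) payoffs of the safer option
OPTION_B = (9.63, 0.25)   # (high, low) payoffs of the riskier option


def utility(p_high, payoffs, alpha):
    """Expected power utility of a gamble (assumed form: V ** alpha)."""
    high, low = payoffs
    return p_high * high ** alpha + (1 - p_high) * low ** alpha


def logistic(x):
    """Numerically stable 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def prob_choose_b(p_high, alpha, mu):
    """Softmax probability of choosing Option B (assumed two-option form)."""
    diff = utility(p_high, OPTION_B, alpha) - utility(p_high, OPTION_A, alpha)
    return logistic(mu * diff)


# With alpha = 1 utilities equal expected values; mu = 0 gives random choice.
print(prob_choose_b(0.5, alpha=1.0, mu=0.0))   # 0.5
print(prob_choose_b(0.5, alpha=0.5, mu=2.0))   # below 0.5 for a risk-averse alpha
```

In this sketch α changes which option has the higher utility at a given probability, while µ only scales how strongly that utility difference drives choice, which is the distinction between nonlinear and linear value sensitivity drawn in the text.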
To achieve a more stable parameter estimation for each individual, we adopted a hierarchical model structure of the population 43 in which it is assumed that participant i's parameters (µ_i and α_i) are sampled from the population's parameter distribution. Importantly, controls and participants with MDD were assumed to share the same group-level (population) distribution (equal prior), which allowed us to compare the two participant groups in further analyses. This is a conservative approach, because the equal prior does not introduce potential bias about different parameter distributions between groups. Based on these assumptions, we estimated the group-level parameter distribution for each parameter and set the distribution as a prior for individual estimation (maximum a posteriori (MAP) estimation). In the current study, we set the group-level distribution of each parameter as a gamma distribution 44 with a shape parameter, k, and a scale parameter, θ (µ ~ Γ(k_µ, θ_µ) and α ~ Γ(k_α, θ_α)). For each iteration of the group-parameter estimation (maximum of 15,000 iterations), 100 random samples were drawn from each parameter distribution for each participant, and the average of the calculated likelihoods was used to approximate the integral over individual-level parameters in the group-level likelihood. Note that all participants visited at least twice, including the 1st and the 4th visits.
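The sampling step described above can be sketched as a Monte Carlo approximation of each participant's marginal likelihood under the gamma priors. This is a minimal illustration, not the original MATLAB implementation: the utility and softmax forms are the conventional ones (an assumption), the prior values in the example are hypothetical, and choices are coded 1 for Option B and 0 for Option A.

```python
import math
import random

OPTION_A = (5.00, 4.00)
OPTION_B = (9.63, 0.25)
P_HIGH = [k / 10 for k in range(1, 10)]


def logistic(x):
    """Numerically stable 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def choice_likelihood(choices, alpha, mu):
    """Likelihood of nine choices (1 = Option B, 0 = Option A)."""
    lik = 1.0
    for p, chose_b in zip(P_HIGH, choices):
        u_a = p * OPTION_A[0] ** alpha + (1 - p) * OPTION_A[1] ** alpha
        u_b = p * OPTION_B[0] ** alpha + (1 - p) * OPTION_B[1] ** alpha
        p_b = logistic(mu * (u_b - u_a))
        lik *= p_b if chose_b else 1.0 - p_b
    return lik


def marginal_likelihood(choices, k_mu, theta_mu, k_alpha, theta_alpha,
                        n_samples=100, rng=random):
    """Average likelihood over parameter draws from the gamma priors."""
    total = 0.0
    for _ in range(n_samples):
        mu = rng.gammavariate(k_mu, theta_mu)           # shape, scale
        alpha = rng.gammavariate(k_alpha, theta_alpha)  # shape, scale
        total += choice_likelihood(choices, alpha, mu)
    return total / n_samples


def group_log_likelihood(all_choices, k_mu, theta_mu, k_alpha, theta_alpha,
                         rng=random):
    """Sum of log marginal likelihoods across choice sets."""
    return sum(
        math.log(max(marginal_likelihood(c, k_mu, theta_mu, k_alpha,
                                         theta_alpha, rng=rng), 1e-300))
        for c in all_choices
    )


rng = random.Random(0)
data = [[0, 0, 0, 0, 1, 1, 1, 1, 1], [0, 0, 0, 0, 0, 0, 1, 1, 1]]
# Hypothetical prior values, chosen only for illustration.
print(group_log_likelihood(data, 2.0, 1.0, 4.0, 0.25, rng=rng))
```

An outer optimizer would then adjust the four prior parameters (k_µ, θ_µ, k_α, θ_α) to maximize this group-level log-likelihood.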
Because we tested whether an individual's value sensitivity (RP and IT) changes across multiple visits, we chose not to provide any information about the subject's identity in the estimation step; behavioral choices from a participant's two visits were treated as decision patterns from two independent participants. Estimated value sensitivities for the same subject from repeated visits were then treated as repeated measures for post-estimation stability testing. This is a more stringent approach to examining within-subject stability over repeated visits. To apply this method, we used 196 sets of behavioral choices for the group-level parameter estimation ([33 HC + 65 MDD] × [1st visit + 4th visit]; only the 1st and 4th visits were used to provide an equal amount of choice information from each individual participant). The group-level parameters were used to define each parameter's prior distribution for individual-level estimation, which was applied equally to individual-level estimations for all four visits. We fit the data using MAP estimation, in which the posterior is proportional to the product of the choice likelihood and the group-level prior for each parameter.
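The individual-level MAP step can be illustrated as maximizing the log-likelihood plus the log gamma priors. The sketch below is an assumption-laden stand-in: it uses a coarse grid search in place of the simplex optimizer (fminsearch) used in the original analysis, conventional utility and softmax forms, and hypothetical prior values.

```python
import math

OPTION_A = (5.00, 4.00)
OPTION_B = (9.63, 0.25)
P_HIGH = [k / 10 for k in range(1, 10)]


def log_sigmoid(x):
    """Numerically stable log(1 / (1 + exp(-x)))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def log_likelihood(choices, alpha, mu):
    """Log-likelihood of nine choices (1 = Option B, 0 = Option A)."""
    total = 0.0
    for p, chose_b in zip(P_HIGH, choices):
        u_a = p * OPTION_A[0] ** alpha + (1 - p) * OPTION_A[1] ** alpha
        u_b = p * OPTION_B[0] ** alpha + (1 - p) * OPTION_B[1] ** alpha
        x = mu * (u_b - u_a)
        total += log_sigmoid(x) if chose_b else log_sigmoid(-x)
    return total


def gamma_logpdf(x, k, theta):
    """Log density of a gamma distribution with shape k and scale theta."""
    return (k - 1) * math.log(x) - x / theta - math.lgamma(k) - k * math.log(theta)


def map_estimate(choices, prior_alpha, prior_mu):
    """Grid-search MAP estimate of (alpha, mu); a stand-in for fminsearch."""
    best = (-math.inf, None, None)
    for a_step in range(1, 61):          # alpha from 0.05 to 3.00
        alpha = a_step * 0.05
        for m_step in range(1, 101):     # mu from 0.1 to 10.0
            mu = m_step * 0.1
            log_post = (log_likelihood(choices, alpha, mu)
                        + gamma_logpdf(alpha, *prior_alpha)
                        + gamma_logpdf(mu, *prior_mu))
            if log_post > best[0]:
                best = (log_post, alpha, mu)
    return best[1], best[2]


# Hypothetical (shape, scale) priors, for illustration only.
PRIOR_ALPHA = (4.0, 0.25)
PRIOR_MU = (2.0, 1.0)

averse = [0, 0, 0, 0, 0, 0, 0, 1, 1]    # switches to Option B late
seeking = [0, 0, 1, 1, 1, 1, 1, 1, 1]   # switches to Option B early
print(map_estimate(averse, PRIOR_ALPHA, PRIOR_MU))
print(map_estimate(seeking, PRIOR_ALPHA, PRIOR_MU))
```

Because each visit's nine choices are passed in separately, the same function can be applied to every visit of every participant without any information about subject identity, mirroring the estimation design described above.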
All parameter estimations were conducted with custom MATLAB R2015b (MathWorks) scripts and the fminsearch function in MATLAB with multiple initial values.
Clinical measures. At each visit, participants completed a battery of self-report measures to assess current depression and anxiety symptoms. Depressive symptom severity was measured using the BDI-II and the Anhedonic Depression subscale score of the MASQ. Anxiety symptom severity was measured using the State scale of the SAI (Spielberger Anxiety Inventory) and the Anxious Arousal subscale of the MASQ. Additionally, general distress (GD) related to depressive symptoms, anxious symptoms, or a mixture of the two was measured using the MASQ subscales GD: Depression, GD: Anxiety, and GD: Mixed, respectively.

Statistical analyses. We examined whether model-free risk preference (P(risky)) and model-based measures of value sensitivity (inverse temperature and risk preference) were consistent across multiple visits. IT and RP measures in both participant groups were not normally distributed (Shapiro-Wilk test P < 0.01 for IT and RP in each group and at each visit), and thus non-parametric tests were used as appropriate and available. First, to compare the means of IT and RP across four laboratory visits and between groups, we used mixed-design ANOVA in which visit number (Time 1, Time 2, Time 3, Time 4) was the within-subject factor and diagnostic group (MDD, control) was the between-subject factor. Parameters were first rank transformed and then entered into the mixed-design ANOVA 13. In addition, we used Friedman's test to examine whether IT and RP were stable across the four visits within each group. Second, Spearman's correlations between risk preference measures from two different visits ('1st visit' (T1) vs T2, T1 vs T3, T1 vs T4, T2 vs T3, T2 vs T4, and T3 vs T4) were calculated to test whether the rank order of risk preference within each group is consistent across multiple visits. False discovery rate (FDR) adjusted q-values were reported, where indicated, for multiple comparisons 45. MATLAB R2015b was used for all statistical tests.
Figure 2. We used a standard power utility function and softmax choice rule to identify separate 'risk preference' and 'inverse temperature' parameters to explain nonlinear and linear value sensitivities in decision-making. (ai, bi) Estimated RP and IT were stable across four repeated visits for both MDD and control participants. Across the repeated visits, both RP and IT were comparable between the control and MDD groups (no main effect of group using mixed-design ANOVA with rank transformation). (aii, bii) Spearman's correlation coefficients were calculated to test whether the rank order of the parameters among individuals was consistent between visits to the lab ([1st vs 2nd visit], [1st vs 3rd visit], … [3rd vs 4th visit]). Each point represents an individual participant, and the color-coded lines are the robust regression lines between measures from two visits. Gray and red shades represent the distribution of data points along the y-axis; *P < 0.05, **P < 0.01, ***P < 0.001, uncorrected; all correlations were significant after applying multiple comparison correction (FDR q < 0.0001).
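The pairwise test-retest analysis described in the Statistical analyses can be sketched as follows: Spearman's ρ is computed for each of the six visit pairs, and a set of p-values is adjusted for false discovery rate. This is a minimal illustration only. The Benjamini-Hochberg procedure is assumed here as the FDR method, the parameter values and p-values in the example are made up, and the original analyses were run in MATLAB.

```python
from itertools import combinations


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted q-values (assumed FDR procedure)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):   # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        q[i] = running_min
    return q


# Made-up risk preference estimates for five participants at four visits.
visits = {
    "T1": [0.6, 0.9, 1.1, 0.7, 1.3],
    "T2": [0.7, 0.8, 1.2, 0.6, 1.4],
    "T3": [0.5, 1.0, 1.0, 0.8, 1.2],
    "T4": [0.6, 0.9, 1.3, 0.7, 1.1],
}
for a, b in combinations(visits, 2):   # the six visit pairs
    print(a, b, round(spearman_rho(visits[a], visits[b]), 2))

print(bh_qvalues([0.01, 0.04, 0.03, 0.005]))
```

Spearman's ρ is used because it depends only on rank order, which matches the stated aim of testing whether the ordering of participants on each parameter is preserved between visits.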