Rationally inattentive intertemporal choice

Gershman, Samuel J.; Bhui, Rahul

doi:10.1038/s41467-020-16852-y

Published: 03 July 2020

Nature Communications volume 11, Article number: 3365 (2020)

Abstract

Discounting of future rewards is traditionally interpreted as evidence for an intrinsic preference in favor of sooner rewards. However, temporal discounting can also arise from internal uncertainty in value representations of future events, if one assumes that noisy mental simulations of the future are rationally combined with prior beliefs. Here, we further develop this idea by considering how simulation noise may be adaptively modulated by task demands, based on principles of rational inattention. We show how the optimal allocation of mental effort can give rise to the magnitude effect in intertemporal choice. In a re-analysis of two prior data sets, and in another experiment, we reveal several behavioral signatures of this theoretical account, tying choice stochasticity to the magnitude effect. We conclude that some aspects of temporal discounting may result from a cognitively plausible adaptive response to the costs of information processing.

Introduction

The preference for sooner over later rewards is traditionally interpreted as an intrinsic decline in value as outcomes recede into the future. However, recent evidence suggests an alternative (although not mutually exclusive) viewpoint: temporal discounting could arise from internal uncertainty in value representations of future rewards. Imagining the future allows an agent to immediately experience anticipated outcomes, helping them to delay gratification, but this prospection may lose its impact when mental simulations are noisy. A number of influential studies show that patience is enhanced by treatments that may be thought of as increasing the precision of mental simulation. For example, discounting is attenuated when people are asked to imagine spending future rewards¹, when they imagine future outcomes in greater detail², and when episodic tags are provided to facilitate such imagination³.

Thus, even if agents with imperfect foresight intrinsically valued delayed rewards just as much as immediate rewards, the myopic nature of their prospection could reduce the impact of the future. This idea has recently been formalized in a Bayesian model of discounting⁴, in which an agent observes noisy simulations of future value and applies Bayes’ rule to obtain a posterior estimate. Assuming simulations become noisier the further they reach into the future, the agent increasingly relies on their prior beliefs and discounts the reward value. Gabaix and Laibson⁴ showed how this can lead to hyperbolic discounting while accommodating the effects of experience on intertemporal choice tasks. However, this analysis is predicated on a fixed relationship between reward delay and simulation noise, although there is reason to think that the relationship is not fixed. We extend this perspective by considering how the degree of simulation noise may be adaptively controlled, and propose how such a mechanism contributes to the well-known magnitude effect in intertemporal choice—the finding that people are disproportionately more patient when judging high-value outcomes^5,6,7,8,9.

According to our theory, vivid prospection can help agents to delay gratification, but this comes at a cost. Making simulations more precise requires mental effort, and this effort may only be invoked if its benefits outweigh its costs. We formalize this intuition in terms of rate-distortion theory, an information-theoretic framework for modeling the optimal level of internal uncertainty (ref. ¹⁰, see also refs. ^11,12,13). Richer simulations are cognitively costly, and therefore a decision maker must make a trade-off involving precision and effort. Most relevant in the present context, larger rewards may be more important to evaluate carefully. In this case, greater magnitudes would be simulated more precisely and, in light of the above argument, would engender more patience. The model thus implies a direct connection between stochasticity and discounting.

Our theory is consistent with several lines of psychological evidence. Mental representations of events farther in the future generally contain fewer sensory and contextual details than those closer in time^14,15. Future events are imagined with greater vividness when cued by more rewarding stimuli¹⁶, and people produce longer lists of thoughts when prompted to evaluate higher magnitude intertemporal choices¹⁷. Moreover, when people are asked to write down justifications for their choices, patience is enhanced specifically for lower magnitude rewards, as if cognitive control is already being exerted at higher magnitudes¹⁸.

In what follows, we investigate the behavioral implications of this theory. We show how it qualitatively accounts for several empirical findings pertaining to the magnitude effect, quantitatively improves model fit in a large existing data set, and accurately predicts patterns of discounting and stochasticity in another experiment. These results help sharpen our understanding of the relationship between patience, reward, and mental effort.

Results

A Bayesian model of as-if temporal discounting

In this section, we first describe the Bayesian model of discounting developed by Gabaix and Laibson⁴. In the next section, we extend this analysis by endogenizing the simulation noise variance using a rational inattention analysis.

Following Gabaix and Laibson⁴, we model an agent who is faced with a choice between several rewards that occur at some time in the future. For ease of exposition, we will consider a single reward r_t delivered after delay t, whose true value is denoted by u. We assume that this value is drawn from a Gaussian distribution with mean μ and variance $\sigma _u^2$: $u \sim {\cal{N}}(\mu ,\sigma _u^2)$. We further assume that the agent does not directly observe u, but instead observes a noisy signal $s \sim {\cal{N}}(u,\sigma _\varepsilon ^2t)$ generated by some form of mental simulation.

Noise arises from the agent’s limited ability to simulate the event’s future value. Gabaix and Laibson⁴ assumed that the variance increases linearly with the delay because events farther in the future are harder to simulate. Combined with the assumption that the prior mean μ is 0 (which we suppose for the remainder of the paper), this leads to the following expression for the posterior mean:

$$\hat u = {\Bbb E}[u|s] = \mathop {\int}\limits_u p (u|s)u\,{\mathrm{d}}u = D_ts,$$

(1)

where p(u|s) is the posterior, computed using Bayes’ rule:

$$p(u|s) = \frac{{p(s|u)p(u)}}{{\int_u p (s|u)p(u){\mathrm{d}}u}},$$

(2)

with likelihood p(s|u) and prior p(u) as defined above. The term D_t expresses an as-if hyperbolic discount function:

$$D_t = \frac{1}{{1 + kt}},$$

(3)

with the as-if discount rate k given by:

$$k = \frac{{\sigma _\varepsilon ^2}}{{\sigma _u^2}}.$$

(4)

The discount function is as-if because the agent in fact has a neutral time preference, but chooses in accordance with hyperbolic discounting, one of the most broadly supported models of intertemporal choice (see ref. ¹⁹ for a review). Figure 1 illustrates how Bayesian inference in this model produces temporal discounting.
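As a minimal numerical sketch of Eqs. (1)–(4), the standard Gaussian conjugate update with prior mean 0 can be written out directly. The function names and the parameter values below are illustrative assumptions, not part of the original analysis.

```python
def discount_rate(noise_var: float, prior_var: float) -> float:
    """As-if discount rate k = sigma_eps^2 / sigma_u^2 (Eq. 4)."""
    return noise_var / prior_var


def discount_function(t: float, k: float) -> float:
    """As-if hyperbolic discount function D_t = 1 / (1 + k t) (Eq. 3)."""
    return 1.0 / (1.0 + k * t)


def posterior_mean(signal: float, t: float, noise_var: float, prior_var: float) -> float:
    """Posterior mean of u given s ~ N(u, noise_var * t) and prior u ~ N(0, prior_var).

    Gaussian shrinkage: the weight on the signal is
    prior_var / (prior_var + noise_var * t), which equals D_t.
    """
    weight = prior_var / (prior_var + noise_var * t)
    return weight * signal


# Assumed, purely illustrative values.
noise_var, prior_var, signal = 0.5, 2.0, 10.0
k = discount_rate(noise_var, prior_var)
for t in (0, 1, 4, 12):
    # The shrunken estimate and D_t * s should coincide at every delay.
    print(t, posterior_mean(signal, t, noise_var, prior_var),
          discount_function(t, k) * signal)
```

The two printed columns agree because the Bayesian shrinkage weight and the hyperbolic discount function are the same expression once k is defined as the noise-to-prior variance ratio.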

**Fig. 1: Illustration of rational-discounting model.**

The estimated value of a reward will be regularized towards the mean μ (0 in this case). The strength of this regularization depends on k, which can be thought of as an inverse signal-to-noise ratio. Intuitively, when the simulation noise variance $\sigma _\varepsilon ^2$ is large relative to the prior variance $\sigma _u^2$, the simulations are less reliable and the agent will rely more on their prior, whereas when the simulation noise variance is relatively small, then the agent will rely more on their simulations.

Because we (the experimenters) cannot directly observe the signal s, we use the objective reward r_t as a proxy. This allows us to link the model directly to experimentally observable variables. We note, however, that this assumption may generate erroneous inferences. For example, we may misinterpret the effects of model misspecification in terms of simulation noise.

Rational inattention

The Gabaix and Laibson⁴ analysis assumed that the agent has a fixed simulation noise variance. Here we develop the idea that the simulation noise variance is determined by the agent’s attention to the signal. Intuitively, an agent can improve the reliability of their mental simulations by exerting cognitive effort (i.e., attending more), but pays a cost for this effort.

We approach this problem through the lens of rate-distortion theory¹⁰. Rate-distortion theory offers us a principled way to study the optimal precision of internal representations, formalized using information theory. As such, it has been fruitfully applied to human cognition in domains such as perceptual judgment and working memory²⁰, and its close relative, rational inattention^11,21, has been used to analyze a variety of economic problems^12,13. In this framework, the agent is modeled as a communication channel that takes as input the signal and outputs an estimate of the value. The agent can select the design of the channel subject to a constraint on the information rate of the channel (the number of bits that can be communicated per signal).

In this case, we define a family of channels parametrized by the simulation noise scaling parameter, $\sigma _\varepsilon ^2$. The optimization problem is to select the value of $\sigma _\varepsilon ^2$ that minimizes the expectation of a squared error distortion (aka loss) function that quantifies the cost of estimation error. As shown in the “Methods” section, the optimal simulation noise parameter under some assumptions is given by:

$$\bar \sigma _\varepsilon ^2 = \frac{{\sigma _u^2}}{{\beta |r_t|}},$$

(5)

where β > 0 is a sensitivity parameter that governs the link between information rate and magnitude. As β increases, the rate becomes increasingly sensitive to variations in reward and delay. Plugging this into Eq. (4) yields the optimal discount parameter:

$$\bar k = \frac{1}{{\beta |r_t|}}.$$

(6)

Thus, the rate-distortion framework can lead us to a model that captures the magnitude effect (inverse relation between discount rate and reward magnitude; Fig. 1c). As shown in the “Methods” section, the model also predicts a choice stochasticity magnitude effect: choices should become less stochastic as magnitude increases (Fig. 1d). This arises in the model because choice stochasticity is partially driven by simulation noise, which should decrease with reward magnitude.
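To make the magnitude effect in Eqs. (5) and (6) concrete, the sketch below evaluates the optimal noise parameter and discount parameter for a few reward sizes. The value of β, the delay, and the reward amounts are assumptions chosen only for illustration, and the helper names are hypothetical.

```python
def optimal_noise_var(prior_var: float, beta: float, reward: float) -> float:
    """Optimal simulation noise parameter sigma_u^2 / (beta * |r|) (Eq. 5)."""
    return prior_var / (beta * abs(reward))


def optimal_discount_rate(beta: float, reward: float) -> float:
    """Optimal discount parameter 1 / (beta * |r|) (Eq. 6)."""
    return 1.0 / (beta * abs(reward))


def retained_fraction(beta: float, reward: float, delay: float) -> float:
    """Fraction of value retained after `delay`: D = 1 / (1 + k * delay)."""
    return 1.0 / (1.0 + optimal_discount_rate(beta, reward) * delay)


beta, delay, prior_var = 0.05, 1.0, 1.0  # assumed values
for reward in (20.0, 200.0, 2000.0):
    print(reward,
          optimal_noise_var(prior_var, beta, reward),
          optimal_discount_rate(beta, reward),
          retained_fraction(beta, reward, delay))
```

Larger rewards yield a smaller noise parameter and a smaller discount parameter, so a larger share of their value survives the same delay.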

Applications to prior experimental results

In this section, we explore the empirical implications of the rational inattention analysis. We begin by examining experimental data collected by Ballard et al.¹⁸, in which subjects reported their indifference point between an immediate and delayed reward. The reward magnitude was manipulated across subjects (see “Methods” section for more details). In addition, some subjects were assigned to a justification condition in which they were asked to explicitly justify their choices. Ballard et al.¹⁸ hypothesized that the magnitude effect arises from increased self-control in response to large magnitudes, and reasoned that justification would elevate the ceiling on self-control. In the language of rational inattention, we interpret justification as prompting increased allocation of cognitive resources to prospective simulations. This hypothesis can be formalized by increasing the β parameter in the justification condition compared to the no justification condition.

Five predictions follow from this hypothesis, all of which are confirmed in Fig. 2, and quantified by a regression with regressors for justification (no justification coded as +1, justification coded as 0), magnitude, and the interaction between justification and magnitude (negative coefficient indicates a reduced justification effect for larger magnitudes). For all of the following analyses, we report bootstrapped 95% confidence intervals. First, the average discount rate k should be larger in the no justification condition, because k decreases monotonically with β (regression coefficient for the main effect of justification: CI = [0.064, 0.155]). Second, the justification effect should diminish with magnitude, because dk/dβ is a concave function of |r| (regression coefficient for the interaction: CI = [−0.036, −0.015]).

**Fig. 2: Justification effect results.**

The next three predictions are distinctive of our theory, and pertain to the variability of k, which we quantify by the standard deviation. The third prediction is that the standard deviation of k should be higher for small magnitudes (i.e., a magnitude effect for response variability; regression coefficient for the main effect of magnitude: CI = [−0.016, −0.008]). The fourth prediction is that the standard deviation should be lower in the justification condition, because response variability decreases with β (regression coefficient for the main effect of justification: CI = [0.203, 0.501]). The fifth prediction is that the justification effect for response variability should diminish with magnitude (regression coefficient for the interaction effect: CI = [−0.109, −0.045]).

The Ballard data set confirms several predictions qualitatively, but is ill-suited to confirming quantitative predictions because each subject only saw a single experimental condition. To quantitatively assess the validity of our model, we re-analyzed a large data set (N = 1284) of intertemporal choices collected by Chávez et al.²². Each subject in this study was presented with the same set of 27 choices, taken from ref. ⁷. The rewards for both options and the delay for the larger-later option varied across trials, while the delay for the smaller-sooner option was held fixed at 0 days.

We compared our rational inattention model with several alternatives using random-effects Bayesian model selection (see “Methods” section). In particular, we compared the full rational model (R2) to a variant (R1), which uses the optimal discount factor, but treats the inverse temperature α as a free parameter. We also compared against standard quasi-hyperbolic (QH) discounting²³, and several variations of hyperbolic discounting, including the basic functional form (H0), and generalized versions that incorporate magnitude-dependent discounting and choice stochasticity (H1–H3²⁴). We used the protected exceedance probability (PXP) as a measure of model evidence. The PXP measures the probability that a particular model is more frequent in the population than all the other models under consideration, adjusting for the probability of differences arising by chance.

We found that the full rational inattention model (R2) was decisively favored (PXP > 0.99). Among the four variants of hyperbolic discounting, H3 was favored. We used this model to assess the qualitative predictions of the rational inattention theory (note that the rational inattention theory assumes discounting and choice stochasticity magnitude effects, so it cannot be used to falsify these predictions). Consistent with the theory’s predictions, the magnitude scaling parameter for inverse temperature (m_α) was significantly >0 [t(1283) = 7.47, p < 0.0001], indicating that choice stochasticity decreases with reward magnitude, whereas the magnitude scaling parameter for discounting (m_k) was significantly <0 [t(1283) = 15.42, p < 0.0001], indicating that myopia decreases with reward magnitude (Fig. 3a). Finally, we observed that the two magnitude scaling effects are negatively correlated (r = −0.21, p < 0.0001; Fig. 3b), consistent with the rational inattention model’s predictions. Thus, the data support the theory both qualitatively and quantitatively.

To further support the rational inattention model, we compared the psychometric functions of standard hyperbolic discounting (H0) and the full rational inattention model (R2), finding that choice probabilities were much better fit by R2, despite having fewer parameters (Fig. 3c, d).

The effect of reward variance on discounting and choice stochasticity

The rational inattention model predicts that the choice stochasticity magnitude effect should decrease with reward variance, because the noisy simulations become increasingly down-weighted as the reward variance increases, and this down-weighting interacts multiplicatively with the reward magnitude. The model also predicts that there should be no effect of reward variance on the discounting magnitude effect. We tested these predictions in another experiment (N = 221) in which the reward variance was manipulated while holding the mean and range of rewards fixed.

To evaluate the variance predictions, we fit the same models described above to the choice data. In this case, the model with the strongest support (PXP = 0.61) was H3 (hyperbolic discounting with magnitude-dependent discounting and choice stochasticity). The key parameter estimates are shown in Fig. 4, broken down by variance condition. Replicating our prior results with the Chávez data set, we found a significant discounting magnitude effect [m_k < 0: t(220) = 7.16, p < 0.0001] and a significant choice stochasticity magnitude effect [m_α > 0: t(219) = 9.95, p < 0.0001] when collapsing across conditions. Critically, the choice stochasticity magnitude effect was significantly lower in the high variance condition [t(219) = 2.22, p < 0.05], whereas there was no effect of variance on the discounting magnitude effect (p = 0.84). Using a Bayesian t test with a scaled JZS (Jeffreys–Zellner–Siow) prior²⁵, we found a posterior probability >0.99 favoring the null hypothesis that variance does not modulate the discounting magnitude effect. These results collectively provide evidence consistent with our rational inattention model.

Discussion

Temporal discounting may stem partly from the inability of decision makers to perfectly simulate future outcomes²⁶. In this paper, we develop a theoretical account of prominent regularities in intertemporal choice, based on the idea that mental simulation of the future is noisy but controllable. Our approach connects the Bayesian model of discounting of Gabaix and Laibson⁴ with the information-theoretic framework of rate-distortion theory¹⁰ (see ref. ²⁰ for an overview of rate-distortion theory applications to human perception; see refs. ^11,12,13 for closely related economic applications of rational inattention). Supposing the prospective value of a reward becomes noisy when it is internally projected into the future, Bayesian agents should compensate for this uncertainty by relying more heavily on their prior beliefs—if priors are centered near zero, this leads to discounting of value. However, the degree of noise in the simulation may be controlled by the agent, at a cost. If it is more important to accurately evaluate larger rewards, the agent should spend extra mental effort to make their simulations more precise when dealing with greater magnitudes. This mechanism could lead to reduced temporal discounting when dealing with large rewards, a commonly observed phenomenon known as the magnitude effect. Our model can also account for how reward magnitude and contextual variability are simultaneously related to stochasticity in choice, which we validate in the re-analysis of two data sets and another experiment.

Note that the uncertainty we are dealing with is internal. This contrasts with theories of discounting based on objective risk in the arrival of rewards (e.g., refs. ^27,28). In the present framework, discounting can occur even when the decision maker has no innate preference for earlier rewards and there is no extrinsic risk. Of course, these pathways are not mutually exclusive, and we do not claim the others are inconsequential. Our goal is rather to clearly describe how apparent anomalies of intertemporal choice could arise from a cognitively plausible adaptive response to limits on information processing.

Our proposal is supported by a range of neural and behavioral evidence. Psychologically speaking, the allocation of attention in our framework (and what Gabaix and Laibson⁴ refer to as mental effort) may manifest as cognitive control—the set of mechanisms required to pursue a goal in the face of distractions and competing responses. It has been argued that the exertion of cognitive control depends on its expected value, the combination of its effort costs and payoff benefits in a given task, and that this plays a role in many decisions including intertemporal choices²⁹. Future events have been found to be imagined with greater vividness when cued by more rewarding stimuli¹⁶, and people list a greater number of thoughts when prompted to evaluate higher magnitude intertemporal choices¹⁷. Moreover, when people are asked to explicitly justify their choices, they exhibit more patience specifically for lower magnitude rewards, as if they have already hit a ceiling for higher magnitudes¹⁸. Our model formally draws out the implications of this cost-benefit logic, providing a high-level normative perspective that complements more mechanistic analyses of cognition and discounting (e.g., refs. ^30,31).

From a neuroscientific perspective, the exertion of cognitive control is known to rely on a network of regions in prefrontal cortex, which some studies have linked directly to temporal discounting^1,32. Shenhav et al.²⁹ have proposed that the expected value of control is computed by connected regions and guides the investment of attention into each task, while Ballard et al.¹⁸ demonstrated that frontal executive-control areas of the brain are particularly engaged in challenging intertemporal choices with high-magnitude rewards. Moreover, disruption of activity in such areas via transcranial magnetic stimulation reduces the magnitude effect³³. Taken together, these studies indicate that the brain adaptively modulates simulation noise and this plays a meaningful role in temporal discounting.

Another perspective from neuroscience is provided by studies of patients with Parkinson’s disease, who are known to have systemically low levels of dopamine. Foerde et al.³⁴ observed that patients on medication (with putatively higher dopamine levels) exhibit both more patience (lower estimated values of the k parameter) and a weaker magnitude effect compared to patients off medication. Both of these findings are consistent with the idea that higher levels of dopamine correspond to higher values of the sensitivity parameter β. Higher sensitivity means that reward will induce a greater willingness to exert cognitive effort, which in this case means reducing simulation noise and thereby reducing discounting. At the same time, increases in sensitivity will actually make the magnitude effect smaller, because of the concave relationship between the discount parameter and reward magnitude. Our interpretation of dopamine in terms of sensitivity is consistent with other work on Parkinson’s patients showing that high levels of dopamine produce greater reward sensitivity^35,36. More broadly, it has been suggested that dopamine may control allocation of cognitive effort³⁷. We conjecture that dopamine may play a specific role in mediating the relationship between reward and information rate, but further research will be required to directly test this hypothesis.

An important limitation of our experimental study was the hypothetical nature of choices made by subjects, a design element prompted by the impracticality of payment at the lengthy delays needed to precisely estimate discount rates. Many of the classic (e.g., refs. ^5,6,38) and modern (e.g., refs. ^18,39) studies of the magnitude effect are not incentive compatible, for the same reason. A recent survey has argued that comparisons between incentive compatible and incompatible designs typically yield the same results for studies of intertemporal choice⁴⁰. For example, Bickel et al.⁴¹ have found that discount rates are highly correlated across real and hypothetical rewards⁹, as are their neural responses. Moreover, according to our analysis (following Gabaix and Laibson⁴), all decisions involve some future simulation, with the difference resting in the degree of simulation noise. Thus, although incentive compatibility is an important criterion towards which to strive in decision-making studies, practical and theoretical considerations render it less applicable to the experimental questions pursued here.

Finally, while our theory naturally captures a number of empirical phenomena surrounding the magnitude effect, future work may examine what other observations might be accommodated under other assumptions. For instance, people seem to savor and dread future outcomes⁴², which could lead people to prefer early resolution of losses, and the magnitude effect has been found to reverse in the loss domain¹⁷. The Bayesian discounting model cannot account for this inverted pattern, as it implies that deferred losses should be treated better than immediate ones. Nonetheless, there might be adaptive value in anticipation if it could facilitate planning and decision making⁴³. A more formal account of the costs and benefits involved may help predict when people will channel energy into such anticipatory thoughts.

Methods

Derivation of optimal precision

In order to derive the optimal precision $\bar \sigma _\varepsilon ^2 = {\mathrm{argmin}}_{\sigma _\varepsilon ^2}{\Bbb E}[{\cal{L}}(u,\hat u)|\sigma _\varepsilon ^2]$, the expected quadratic loss is computed as follows. Conditioning on u and s:

$$(u - \hat u)^2 = (u - D_ts)^2$$

(7)

$$= (D_t(u - s) + u(1 - D_t))^2$$

(8)

$$= D_t^2(u - s)^2 + 2D_t(1 - D_t)u(u - s) + u^2(1 - D_t)^2.$$

(9)

Taking the expectation over p(s|u), and subsequently over p(u),

$${\Bbb E}_s[(u - \hat u)^2] = \sigma _\varepsilon ^2tD_t^2 + 0 + u^2(1 - D_t)^2,$$

(10)

$${\cal{L}} = {\Bbb E}_u[{\Bbb E}_s[(u - \hat u)^2]] = \sigma _\varepsilon ^2tD_t^2 + \sigma _u^2(1 - D_t)^2$$

(11)

$$= \frac{{\sigma _\varepsilon ^2t(\sigma _u^2)^2}}{{(\sigma _u^2 + \sigma _\varepsilon ^2t)^2}} + \frac{{\sigma _u^2(\sigma _\varepsilon ^2t)^2}}{{(\sigma _u^2 + \sigma _\varepsilon ^2t)^2}}$$

(12)

$$= \frac{{\sigma _u^2\sigma _\varepsilon ^2t}}{{\sigma _u^2 + \sigma _\varepsilon ^2t}}.$$

(13)
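The closed form in Eq. (13) can be checked against a simple simulation of the generative model. The following is a sketch under assumed parameter values; the function names, sample size, and seed are arbitrary choices made for illustration.

```python
import random


def expected_loss_closed_form(prior_var: float, noise_var: float, t: float) -> float:
    """Expected squared error from Eq. (13)."""
    total_noise = noise_var * t
    return prior_var * total_noise / (prior_var + total_noise)


def expected_loss_monte_carlo(prior_var: float, noise_var: float, t: float,
                              n: int = 200_000, seed: int = 0) -> float:
    """Average (u - D_t * s)^2 over draws u ~ N(0, prior_var), s ~ N(u, noise_var * t)."""
    rng = random.Random(seed)
    total_noise = noise_var * t
    d_t = prior_var / (prior_var + total_noise)  # same as 1 / (1 + k t)
    acc = 0.0
    for _ in range(n):
        u = rng.gauss(0.0, prior_var ** 0.5)
        s = rng.gauss(u, total_noise ** 0.5)
        acc += (u - d_t * s) ** 2
    return acc / n


# Assumed, illustrative values.
closed = expected_loss_closed_form(prior_var=2.0, noise_var=0.5, t=4.0)
simulated = expected_loss_monte_carlo(prior_var=2.0, noise_var=0.5, t=4.0)
print(closed, simulated)
```

The simulated average should approach the closed-form value as the number of draws grows, up to Monte Carlo error.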

We then plug this into the rate-distortion function for a Gaussian source (which reflects the rate-distortion frontier, that is, the minimal achievable information rate for a given distortion level, or equivalently the minimal achievable distortion for a given rate):

$$R = \frac{1}{2}{\mathrm{ln}}\left( {\frac{{\sigma _u^2}}{{\cal{L}}}} \right)$$

(14)

$$= \frac{1}{2}{\mathrm{ln}}\left( {\sigma _u^2\left( {\frac{{\sigma _u^2 + \sigma _\varepsilon ^2t}}{{\sigma _u^2\sigma _\varepsilon ^2t}}} \right)} \right)$$

(15)

$$= \frac{1}{2}{\mathrm{ln}}\left( {1 + \frac{{\sigma _u^2}}{{\sigma _\varepsilon ^2t}}} \right),$$

(16)

which can be rearranged to yield the optimal precision:

$$\bar \sigma _\varepsilon ^2 = \frac{{\sigma _u^2}}{{(e^{2R} - 1)t}},$$

(17)

where R is the information rate constraint in nats (i.e., units of information in base e).

We impose an additional constraint on this formulation, by assuming that information rate increases with reward magnitude (greater incentive to expend cognitive resources) and decreases with delay (simulation of distal events is more cognitively demanding):

$$R = \frac{1}{2}{\mathrm{ln}}\left( {\frac{{\beta |r_t|}}{t} + 1} \right),$$

(18)

where β > 0 is a sensitivity parameter that governs the relationship between rate, magnitude, and delay. As β increases, the rate becomes increasingly sensitive to variations in reward and delay. The constraint follows in the spirit of Gabaix and Laibson’s framework, reflecting a cost-and-benefit perspective on their baseline assumptions. The greater cost of simulating more distal events parallels their supposition of greater noise for projections extending farther into the future. Plugging R into Eq. (17) yields the optimal simulation noise parameter:

$$\bar \sigma _\varepsilon ^2 = \frac{{\sigma _u^2}}{{\beta |r_t|}}.$$

(19)

Note that although the optimal simulation noise variance in Eq. (17) depends on t, this dependence disappears when we use the rate constraint specified in Eq. (18).
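The cancellation of t noted above can be verified numerically by composing Eq. (18) with Eq. (17). This is a small sketch with assumed values of β, reward, and prior variance; the helper names are hypothetical.

```python
import math


def rate_constraint(beta: float, reward: float, t: float) -> float:
    """Information rate in nats assumed in Eq. (18)."""
    return 0.5 * math.log(beta * abs(reward) / t + 1.0)


def noise_var_from_rate(prior_var: float, rate: float, t: float) -> float:
    """Noise parameter implied by a given rate, Eq. (17).

    math.expm1(x) computes exp(x) - 1.
    """
    return prior_var / (math.expm1(2.0 * rate) * t)


prior_var, beta, reward = 2.0, 0.1, 50.0  # assumed values
for t in (0.5, 1.0, 6.0, 24.0):
    rate = rate_constraint(beta, reward, t)
    # The result should match prior_var / (beta * |reward|) from Eq. (19) at every delay.
    print(t, rate, noise_var_from_rate(prior_var, rate, t))
```

The rate itself falls as the delay grows, but the implied noise parameter stays constant, which is why the resulting discount parameter depends only on β and reward magnitude.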

We can draw out further implications of this model by connecting it to choice behavior. Let us assume, in the simplest case, that the agent deterministically chooses the option with highest estimated value. In this case, all stochasticity in choice behavior is driven by stochasticity in the agent’s simulation process. Marginalizing over these noisy simulations, the choice probability for a standard two-alternative choice (early vs. late) is given by:

$$P({\mathrm{choose}}\,{\mathrm{early}}) = {\mathrm{\Phi }}\left( {\alpha \left[ {D_tr_t - D_{t + \tau }r_{t + \tau }} \right]} \right),$$

(20)

where Φ is the standard Gaussian cumulative distribution function, τ is the difference in delay between early (r_t) and late (r_{t+τ}) options, and

$$\alpha = \frac{1}{{\sigma _\varepsilon \sqrt {D_t^2t + D_{t + \tau }^2(t + \tau )} }}$$

(21)

is an inverse temperature parameter controlling the degree of choice stochasticity (smaller values of α produce greater stochasticity). In the case where the early option is immediate (i.e., t = 0), as in many studies of discounting, this simplifies to:

$$\alpha = \frac{1}{{\sigma _\varepsilon D_\tau \sqrt \tau }}.$$

(22)

Plugging in the optimal simulation noise parameter gives:

$$\bar \alpha = \frac{{\sqrt {\beta |r_\tau |} (1 + \tau {\mathrm{/}}\beta |r_\tau |)}}{{\sigma _u\sqrt \tau }}.$$

(23)
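As a sketch of how Eqs. (20)–(22) translate into a choice probability, the code below uses the error function to evaluate Φ. It assumes a single discount parameter k and a single noise scale σ_ε shared by both options, as in Eq. (21); the function names and numerical values are illustrative assumptions.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard Gaussian cumulative distribution function, via math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def hyperbolic(t: float, k: float) -> float:
    """D_t = 1 / (1 + k t)."""
    return 1.0 / (1.0 + k * t)


def inverse_temperature(noise_sd: float, k: float, t: float, tau: float) -> float:
    """Inverse temperature alpha from Eq. (21); requires t + tau > 0."""
    d_early, d_late = hyperbolic(t, k), hyperbolic(t + tau, k)
    return 1.0 / (noise_sd * math.sqrt(d_early ** 2 * t + d_late ** 2 * (t + tau)))


def p_choose_early(r_early: float, r_late: float, t: float, tau: float,
                   noise_sd: float, k: float) -> float:
    """Probability of choosing the early option, Eq. (20)."""
    alpha = inverse_temperature(noise_sd, k, t, tau)
    value_gap = hyperbolic(t, k) * r_early - hyperbolic(t + tau, k) * r_late
    return std_normal_cdf(alpha * value_gap)


# Assumed values: immediate option (t = 0) versus a reward delayed by tau = 4.
print(p_choose_early(r_early=5.0, r_late=10.0, t=0.0, tau=4.0, noise_sd=1.0, k=0.25))
print(p_choose_early(r_early=7.0, r_late=10.0, t=0.0, tau=4.0, noise_sd=1.0, k=0.25))
```

With t = 0 the early option contributes no simulation noise, so the expression for α reduces to Eq. (22).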

One can show that

$$\frac{{\partial \bar \alpha }}{{\partial |r_\tau |}} \ge 0\,{\mathrm{for}}\,\beta |r_\tau | \, > \, \tau ,$$

(24)

which means that for sufficiently large rewards and sufficiently short delays, the model predicts a choice stochasticity magnitude effect: as reward magnitude gets larger, choice stochasticity should get smaller. One can also show that

$$\frac{{\partial ^2\bar \alpha }}{{\partial |r_\tau |\partial \sigma _u}} \le 0\,{\mathrm{for}}\,\beta |r_\tau | \, > \, \tau ,$$

(25)

which means that the choice stochasticity magnitude effect declines with reward variance (under the same conditions on reward and delay).

Finally, we can examine what happens to the two magnitude effects when the sensitivity parameter β changes:

$$\frac{{\partial ^2\bar k}}{{\partial |r_\tau |\partial \beta }} \ge 0,$$

(26)

$$\frac{{\partial ^2\bar \alpha }}{{\partial |r_\tau |\partial \beta }} \ge 0\,{\mathrm{for}}\,\beta |r_\tau | \, > \, \tau .$$

(27)

Because $\frac{{\partial \bar k}}{{\partial |r_\tau |}} \le 0$, this means that increasing β will decrease the discounting magnitude effect (i.e., push it closer to 0). This is somewhat counterintuitive, since one might reason that greater sensitivity to reward should translate into a stronger magnitude effect. This intuition is correct for the choice stochasticity magnitude effect: increasing β will magnify the dependence of choice stochasticity on reward magnitude. The key implication of this analysis is that a change in sensitivity will push the two magnitude effects in opposite directions.
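The sign claims in Eqs. (24)–(27) can be illustrated with central finite differences. This is a numerical sketch at one assumed parameter setting satisfying β|r| > τ, not a proof; the step size and all values are arbitrary.

```python
import math


def k_bar(beta: float, reward: float) -> float:
    """Optimal discount parameter, Eq. (6)."""
    return 1.0 / (beta * abs(reward))


def alpha_bar(beta: float, reward: float, tau: float, prior_sd: float) -> float:
    """Optimal inverse temperature for an immediate early option, Eq. (23)."""
    x = beta * abs(reward)
    return math.sqrt(x) * (1.0 + tau / x) / (prior_sd * math.sqrt(tau))


def slope(f, r: float, h: float = 1e-4) -> float:
    """Central finite-difference derivative of f with respect to reward at r."""
    return (f(r + h) - f(r - h)) / (2.0 * h)


reward, tau = 20.0, 1.0  # assumed values; beta * reward > tau for both betas below
k_slopes, alpha_slopes = {}, {}
for beta in (0.5, 1.0):
    k_slopes[beta] = slope(lambda r: k_bar(beta, r), reward)
    alpha_slopes[beta] = slope(lambda r: alpha_bar(beta, r, tau, 1.0), reward)

# Effect of a larger prior standard deviation on the alpha slope (Eq. 25).
alpha_slope_wide_prior = slope(lambda r: alpha_bar(0.5, r, tau, 2.0), reward)

print(k_slopes, alpha_slopes, alpha_slope_wide_prior)
```

At this setting the slope of the discount parameter in reward is negative and moves toward zero as β rises, while the slope of the inverse temperature is positive and grows with β, matching the opposite-direction pattern described above.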

Ballard data set description

Ballard et al.¹⁸ recruited 1500 subjects for their Study 3. After exclusions, the final sample size was 1382. Subjects considered a hypothetical choice between an immediate reward vs. a reward in 1 month. Each subject was randomly assigned to one immediate reward magnitude ($20, $50, $100, $200, $2000) and reported the delayed reward that would make them indifferent between the two options. Subjects in the justification condition were asked to justify their responses in two to three written sentences; subjects in the no justification condition did not have to provide any written justification.

Chávez data set description

Chávez et al.²² collected data from 1284 Mexican students (a mix of high school juniors and seniors and first-year university students). Subjects completed an intertemporal choice questionnaire developed by Kirby et al.⁴⁴, consisting of 27 questions, each presenting a hypothetical choice between a smaller-sooner (immediately available) monetary reward and a larger-later one. Monetary amounts were the same as in the original questionnaire, but expressed in Mexican pesos rather than US dollars.

Experimental methods

Two hundred and twenty-one people were recruited from Amazon Mechanical Turk via TurkPrime⁴⁵ and paid $1.25 for their participation. To elicit time preferences, we used a choice titration task in which subjects made a series of binary choices between a smaller-sooner reward and a larger-later reward (see, e.g., ref. ³⁴). Subjects faced 40 titrator trials, each consisting of six hypothetical binary choices between fixed smaller-sooner and larger-later options; the six choices differed only in the larger-later delay, which varied from 1 to 6 months. The smaller-sooner reward was $1 on every trial. The larger-later rewards were drawn from a Gaussian distribution with mean $5, truncated to lie between $1 and $9 and rounded to the nearest cent. Subjects were randomly assigned to one of two conditions: in the low variance condition, the larger-later distribution had (untruncated) standard deviation 1, while in the high variance condition it had (untruncated) standard deviation 5. Empirically, the variance of the rewards was 1.03 in the former condition and 4.97 in the latter. The task was coded in JavaScript using jsPsych⁴⁶. Participants provided informed consent, and the study was approved by the Harvard Committee on the Use of Human Subjects.
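The reward-generation procedure described above can be sketched as follows. This is a minimal Python illustration, not the jsPsych code used in the experiment: it assumes truncation is implemented by rejection sampling, and the function name, seeds, and sample sizes are hypothetical.

```python
import random
import statistics


def sample_larger_later(n, sd, mean=5.0, low=1.0, high=9.0, seed=0):
    """Draw n larger-later rewards from a Gaussian truncated to (low, high).

    Draws outside the bounds are rejected and redrawn; accepted draws
    are rounded to the nearest cent.
    """
    rng = random.Random(seed)
    rewards = []
    while len(rewards) < n:
        x = rng.gauss(mean, sd)
        if low < x < high:
            rewards.append(round(x, 2))
    return rewards


low_var = sample_larger_later(5000, sd=1.0)
high_var = sample_larger_later(5000, sd=5.0, seed=1)
print(statistics.pvariance(low_var), statistics.pvariance(high_var))
```

Because the truncation bounds lie four standard deviations from the mean in the low variance condition but less than one in the high variance condition, truncation barely changes the variance in the former and shrinks it substantially in the latter. This is why the realized variance in the high variance condition is far below 25.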

Model fitting and comparison

We fit and compared several models with varying degrees of flexibility.

QH: quasi-hyperbolic discounting function, defined by $D_{t > 0} = \beta \delta ^t$ and D₀ = 1 (here β is the present-bias parameter of the quasi-hyperbolic model, not the sensitivity parameter of the rational inattention model). We modeled choices using a generalized choice rule with a lapse term:
$$P({\mathrm{choose}}\,{\mathrm{early}}) = (1 - \omega ){\mathrm{\Phi }}\left( {\alpha \left[ {D_tr_t - D_{t + \tau }r_{t + \tau }} \right]} \right) + \frac{\omega }{2},$$
(28)
where α is the inverse temperature and ω is a lapse probability, capturing occasional random responses (see also ref. ²⁴). All subsequent models share the same choice probability function. This model has four free parameters: β, δ, α, and ω.
H0: standard hyperbolic discounting function, $D_t = 1{\mathrm{/}}(1 + kt)$. This model has three free parameters: k, α, and ω.
H1: hyperbolic discounting with baseline- and magnitude-dependent discount factor, using the parametrization of Vincent²⁴:
$$k_t = {\mathrm{exp}}(c_k + m_k{\mathrm{log}}|r_t|),$$
(29)
where c_k is a free parameter capturing baseline discounting (i.e., the component of discounting that is independent of magnitude), and m_k captures magnitude-dependent discounting. This model has four free parameters: c_k, m_k, α, and ω.
H2: same as H1, but with baseline- and magnitude-dependent inverse temperature:
$$\alpha = {\mathrm{exp}}(c_\alpha + m_\alpha {\mathrm{log}}(|r_t| + |r_{t + \tau }|)).$$
(30)
This model has five free parameters: c_k, m_k, c_α, m_α, and ω.
H3: same as H2, but without the baseline discounting and inverse temperature parameters (i.e., with c_k and c_α fixed to 0). In this case, the parametrization simplifies to $k_t = |r_t|^{m_k}$ and $\alpha = (|r_t| + |r_{t + \tau }|)^{m_\alpha }$. This model has three free parameters: m_k, m_α, and ω.
R1: hyperbolic discounting with endogenized discount factor, using Eq. (6), fitting β as a free parameter. We approximated $\sigma _u^2$ as the empirical variance of the rewards each individual subject observed in the experiment. As in the other models, we use Eq. (28) to model choices, treating α and ω as free parameters. This model has three free parameters: β, α, and ω.
R2: hyperbolic discounting with endogenized discount factor and inverse temperature. This model uses the same formulation as R1, but sets α using Eq. (21). The model has two free parameters: β and ω.
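To make the shared choice rule concrete, here is a minimal sketch of Eq. (28) for the H0 model, together with the negative log-likelihood that maximum likelihood estimation would minimize. It assumes Φ is the standard normal cumulative distribution function. The function names and example trials are hypothetical, and this is not the authors' fitting code.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def discount_h0(t, k):
    """Standard hyperbolic discount function D_t = 1 / (1 + k t)."""
    return 1.0 / (1.0 + k * t)


def p_choose_early(r_early, t_early, r_late, t_late, k, alpha, omega):
    """Eq. (28): choice rule mixed with a lapse probability omega."""
    diff = discount_h0(t_early, k) * r_early - discount_h0(t_late, k) * r_late
    return (1.0 - omega) * normal_cdf(alpha * diff) + omega / 2.0


def neg_log_likelihood(params, trials):
    """Negative log-likelihood of binary choices under H0.

    params = (k, alpha, omega); each trial is
    (r_early, t_early, r_late, t_late, chose_early).
    """
    k, alpha, omega = params
    nll = 0.0
    for r_early, t_early, r_late, t_late, chose_early in trials:
        p = p_choose_early(r_early, t_early, r_late, t_late, k, alpha, omega)
        nll -= math.log(p if chose_early else 1.0 - p)
    return nll


# Hypothetical trials (illustrative only).
trials = [(1.0, 0, 5.0, 1, False), (1.0, 0, 1.5, 6, True)]
print(neg_log_likelihood((0.5, 1.0, 0.05), trials))
```

With ω > 0 the choice probability is bounded between ω/2 and 1 − ω/2, so the log-likelihood stays finite even when the discounted values differ greatly. The other models would change only how k and α are computed before calling the same choice rule.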

Note that none of the models defined above, except for R1 and R2, are constrained to make the same qualitative predictions as the rational inattention theory. For example, the magnitude scaling parameters in H1–H3 might be 0 on average, or might go in a direction opposite what the theory predicts. Thus, fitting these models gives us the opportunity to test whether the parameter estimates are in qualitative alignment with the rational inattention theory, without making a commitment to the specific parametrization of that theory.

All models were fit using maximum likelihood estimation. To compare models, we computed the protected exceedance probability (PXP)⁴⁷, the probability that each model is more frequent in the population than all the other models, taking into account the probability that apparent differences between models may have arisen by chance. To approximate model evidence, we used the Bayesian information criterion.
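The Bayesian information criterion step can be sketched as below, using the standard definition BIC = K ln N − 2 ln L̂ and the usual approximation of log model evidence as −BIC/2. The log-likelihood values are hypothetical, the function names are illustrative, and the sketch stops short of the group-level PXP computation, which takes per-subject log evidences as input.

```python
import math


def bic(max_log_likelihood, n_params, n_observations):
    """Bayesian information criterion: K ln N - 2 ln L-hat (lower is better)."""
    return n_params * math.log(n_observations) - 2.0 * max_log_likelihood


def approx_log_evidence(max_log_likelihood, n_params, n_observations):
    """BIC-based approximation to the log model evidence (higher is better)."""
    return -0.5 * bic(max_log_likelihood, n_params, n_observations)


# Hypothetical fits for one subject with 240 choices (40 titrators x 6 choices).
n_choices = 40 * 6
bic_h0 = bic(-120.0, n_params=3, n_observations=n_choices)  # k, alpha, omega
bic_r2 = bic(-121.0, n_params=2, n_observations=n_choices)  # beta, omega
print(bic_h0, bic_r2)
```

In this made-up case the two-parameter model has a slightly worse likelihood but a lower BIC, because the complexity penalty of ln N per parameter outweighs the difference in fit.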

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

The data sets analyzed here are available online at https://github.com/sjgershm/rational-discounting. A reporting summary for this Article is available as a Supplementary Information file.

Code availability

The code reproducing the analysis is available online at https://github.com/sjgershm/rational-discounting.

References

1. Benoit, R. G., Gilbert, S. J. & Burgess, P. W. A neural mechanism mediating the impact of episodic prospection on farsighted decisions. J. Neurosci. 31, 6771–6779 (2011).
2. Lebreton, M. et al. A critical role for the hippocampus in the valuation of imagined outcomes. PLoS Biol. 11, e1001684 (2013).
3. Peters, J. & Büchel, C. Episodic future thinking reduces reward delay discounting through an enhancement of prefrontal–mediotemporal interactions. Neuron 66, 138–148 (2010).
4. Gabaix, X. & Laibson, D. Myopia and Discounting (National Bureau of Economic Research, 2017).
5. Thaler, R. Some empirical evidence on dynamic inconsistency. Econ. Lett. 8, 201–207 (1981).
6. Green, L., Myerson, J. & McFadden, E. Rate of temporal discounting decreases with amount of reward. Mem. Cogn. 25, 715–723 (1997).
7. Kirby, K. N. Bidding on the future: evidence against normative discounting of delayed rewards. J. Exp. Psychol. 126, 54–70 (1997).
8. Raineri, A. & Rachlin, H. The effect of temporal constraints on the value of money and other commodities. J. Behav. Decis. Mak. 6, 77–94 (1993).
9. Johnson, M. W. & Bickel, W. K. Within-subject comparison of real and hypothetical money rewards in delay discounting. J. Exp. Anal. Behav. 77, 129–146 (2002).
10. Berger, T. Rate Distortion Theory: A Mathematical Basis for Data Compression (Prentice-Hall, 1971).
11. Sims, C. A. Implications of rational inattention. J. Monetary Econ. 50, 665–690 (2003).
12. Matějka, F. & McKay, A. Rational inattention to discrete choices: a new foundation for the multinomial logit model. Am. Econ. Rev. 105, 272–298 (2015).
13. Caplin, A. Measuring and modeling attention. Annu. Rev. Econ. 8, 379–403 (2016).
14. D’Argembeau, A. & Van der Linden, M. Phenomenal characteristics associated with projecting oneself back into the past and forward into the future: influence of valence and temporal distance. Conscious. Cogn. 13, 844–858 (2004).
15. Trope, Y. & Liberman, N. Temporal construal. Psychol. Rev. 110, 403 (2003).
16. Bulganin, L. & Wittmann, B. C. Reward and novelty enhance imagination of future events in a motivational-episodic network. PLoS ONE 10, e0143477 (2015).
17. Hardisty, D. J., Appelt, K. C. & Weber, E. U. Good or bad, we want it now: fixed-cost present bias for gains and losses explains magnitude asymmetries in intertemporal choice. J. Behav. Decis. Mak. 26, 348–361 (2013).
18. Ballard, I. C. et al. More is meaningful: the magnitude effect in intertemporal choice depends on self-control. Psychol. Sci. 28, 1443–1454 (2017).
19. Frederick, S., Loewenstein, G. & O’Donoghue, T. Time discounting and time preference: a critical review. J. Econ. Lit. 40, 351–401 (2002).
20. Sims, C. R. Rate-distortion theory and human perception. Cognition 152, 181–198 (2016).
21. Denti, T., Marinacci, M. & Montrucchio, L. A note on rational inattention and rate distortion theory. Decis. Econ. Fin. 44, 1–15 (2019).
22. Chávez, M. E. et al. Hierarchical Bayesian modeling of intertemporal choice. Judgm. Decis. Mak. 12, 19 (2017).
23. Laibson, D. Golden eggs and hyperbolic discounting. Q. J. Econ. 112, 443–478 (1997).
24. Vincent, B. T. Hierarchical Bayesian estimation and hypothesis testing for delay discounting tasks. Behav. Res. Methods 48, 1608–1620 (2016).
25. Rouder, J. N. et al. Bayesian t tests for accepting and rejecting the null hypothesis. Psychon. Bull. Rev. 16, 225–237 (2009).
26. Bulley, A. & Schacter, D. L. Deliberating trade-offs with the future. Nat. Hum. Behav. 4, 238–247 (2020).
27. Sozou, P. D. On hyperbolic discounting and uncertain hazard rates. Proc. R. Soc. Lond. Ser. B 265, 2015–2020 (1998).
28. Dasgupta, P. & Maskin, E. Uncertainty and hyperbolic discounting. Am. Econ. Rev. 95, 1290–1299 (2005).
29. Shenhav, A., Botvinick, M. M. & Cohen, J. D. The expected value of control: an integrative theory of anterior cingulate cortex function. Neuron 79, 217–240 (2013).
30. Kurth-Nelson, Z., Bickel, W. & Redish, A. D. A theoretical account of cognitive effects in delay discounting. Eur. J. Neurosci. 35, 1052–1064 (2012).
31. Amasino, D. R. et al. Amount and time exert independent influences on intertemporal choice. Nat. Hum. Behav. 3, 383–392 (2019).
32. Figner, B. et al. Lateral prefrontal cortex and self-control in intertemporal choice. Nat. Neurosci. 13, 538 (2010).
33. Ballard, I. C. et al. Causal evidence for the dependence of the magnitude effect on dorsolateral prefrontal cortex. Scientific Rep. 8, https://doi.org/10.1038/s41598-018-34900-y (2018).
34. Foerde, K. et al. Dopamine modulation of intertemporal decision-making: evidence from Parkinson disease. J. Cogn. Neurosci. 28, 657–667 (2016).
35. Manohar, S. G. et al. Reward pays the cost of noise reduction in motor and cognitive control. Curr. Biol. 25, 1707–1716 (2015).
36. Chong, T. T. J. et al. Dopamine enhances willingness to exert effort for reward in Parkinson’s disease. Cortex 69, 40–46 (2015).
37. Westbrook, A. & Braver, T. S. Dopamine does double duty in motivating cognitive effort. Neuron 89, 695–710 (2016).
38. Benzion, U., Rapoport, A. & Yagil, J. Discount rates inferred from decisions: an experimental study. Manag. Sci. 35, 270–284 (1989).
39. Green, L. et al. Delay discounting of monetary rewards over a wide range of amounts. J. Exp. Anal. Behav. 100, 269–281 (2013).
40. Cohen, J. D. et al. Measuring Time Preferences (National Bureau of Economic Research, 2016).
41. Bickel, W. K. et al. Congruence of BOLD response across intertemporal choice conditions: fictive and real money gains and losses. J. Neurosci. 29, 8839–8846 (2009).
42. Loewenstein, G. Anticipation and the valuation of delayed consumption. Econ. J. 97, 666–684 (1987).
43. Pezzulo, G. & Rigoli, F. The value of foresight: how prospection affects decision-making. Front. Neurosci. 5, 79 (2011).
44. Kirby, K. N., Petry, N. M. & Bickel, W. K. Heroin addicts have higher discount rates for delayed rewards than non-drug-using controls. J. Exp. Psychol. 128, 78 (1999).
45. Litman, L., Robinson, J. & Abberbock, T. TurkPrime.com: a versatile crowdsourcing data acquisition platform for the behavioral sciences. Behav. Res. Methods 49, 433–442 (2017).
46. De Leeuw, J. R. jsPsych: a JavaScript library for creating behavioral experiments in a Web browser. Behav. Res. Methods 47, 1–12 (2015).
47. Rigoux, L. et al. Bayesian model selection for group studies—revisited. Neuroimage 84, 971–985 (2014).

Acknowledgements

We are grateful to Xavier Gabaix and David Laibson for helpful discussions. This research was supported by the Office of Naval Research (N00014-17-1-2984), the Center for Brains, Minds and Machines (funded by NSF STC award CCF-1231216), and a research fellowship from the Alfred P. Sloan Foundation.

Author information

Authors and Affiliations

Department of Psychology, Harvard University, Cambridge, MA, USA
Samuel J. Gershman & Rahul Bhui
Center for Brain Science, Harvard University, Cambridge, MA, USA
Samuel J. Gershman
Center for Brains, Minds and Machines, Cambridge, MA, USA
Samuel J. Gershman
Department of Economics, Harvard University, Cambridge, MA, USA
Rahul Bhui

Authors

Samuel J. Gershman
Rahul Bhui

Contributions

S.J.G. and R.B. developed the theory, designed the experiment, and wrote the paper. S.J.G. performed the data analysis. R.B. conducted the experiment.

Corresponding author

Correspondence to Samuel J. Gershman.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Communications thanks Nicolette Sullivan and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Gershman, S.J., Bhui, R. Rationally inattentive intertemporal choice. Nat Commun 11, 3365 (2020). https://doi.org/10.1038/s41467-020-16852-y

Received: 14 November 2019
Accepted: 26 May 2020
Published: 03 July 2020
DOI: https://doi.org/10.1038/s41467-020-16852-y
