Children’s value-based decision making

Smith, Karen E.; Pollak, Seth D.

doi:10.1038/s41598-022-09894-3

Published: 08 April 2022

Scientific Reports volume 12, Article number: 5953 (2022)

Abstract

To effectively navigate their environments, infants and children learn how to recognize events that predict salient outcomes, such as rewards or punishments. Relatively little is known about how children acquire this ability to attach value to the stimuli they encounter. Studies often examine children’s ability to learn about rewards and threats using either classical conditioning or behavioral choice paradigms. Here, we assess both approaches and find that they yield different outcomes in terms of which individuals had efficiently learned the value of information presented to them. The findings offer new insights into understanding how to assess different facets of value learning in children.

Introduction

Learning the predictive value associated with environmental stimuli is essential to adaptive decision making^1,2. It facilitates a wide range of behaviors across development including obtaining food, avoiding life-threatening injury, adequate seeking of protection, and effective navigation of the social world^3,4. For these reasons, there has been a growth in research aimed at elucidating the mechanisms through which value learning emerges^5,6. In these studies, acquisition of value information is typically assessed by pairing neutral stimuli with a salient outcome. Learning is then inferred in one of two ways: through subsequent physiological or behavioral reactivity to the neutral stimuli, or the extent to which participants use the stimuli to guide their behavioral choices^7,8,9. However, it is not clear that these two approaches index similar measures of learning. To test the comparability of these approaches, we presented children with an opportunity to learn the value of stimuli and then assessed both changes to their reactivity to the new information as well as the degree to which that new information changed their decision-making.

Humans can recognize and use probabilistic relationships in their environments as early as infancy^10,11. Indeed, probabilistic learning during early childhood has been implicated in a number of different processes, including language learning, category formation, and social behavior^12,13,14. Additionally, probabilistic learning of reward and threat has been implicated in behavioral outcomes, such as the emergence of risk taking and aggression^15,16 and a range of psychopathologies^17,18,19. Together this suggests that value learning is present early in development and plays a critical role in shaping how children interact and engage with their environment.

Because early value learning has been associated with the emergence of behavioral and mental health problems, there has been interest in examining how early environments might influence the development of value learning processes. However, this literature remains inconclusive. Some evidence suggests that stress exposure and predictability in childhood are linked to the development of reward and threat learning processes^7,9,20,21,22,40. Other reports indicate limited support for any association^7,23,24. One potential explanation for these discrepancies is that these studies use a variety of different paradigms to assess children’s learning, which may tap very different components of these processes.

Specifically, the different paradigms used in studies of children’s value learning may not represent comparable measures of learning. The types of paradigms employed to assess children’s value learning are typically either classical Pavlovian or instrumental conditioning paradigms. In both of these paradigms, neutral cues (or unsigned value signals) are paired with appetitive or aversive reinforcers. In classical Pavlovian conditioning, learning is inferred based on whether or not an organism demonstrates physiological and behavioral responses to the neutral cue (e.g., heart rate and skin conductance responses, freezing, reaction times). In instrumental paradigms, learning is inferred based on whether an organism executes the expected behavioral response to the neutral cue. For example, if an organism presses a lever to avoid administration of an aversive shock, it is assumed they have learned the predictive value of the neutral stimulus^1,16,25. Although these two paradigms are both used to assess value learning in children, there are potentially important differences between them. Classical conditioning tasks use physiological and behavioral responses elicited directly by the neutral cue to assess whether the child has linked some value to a stimulus, but do not determine whether that information is directly or meaningfully translated into guiding decision-making or behavior^25,26. Instrumental tasks provide insight into goal-directed behaviors. However, they are dependent on inferences of learning based upon whether a child executes a behavioral response to either approach an appetitive reinforcer or avoid an aversive reinforcer^27,28. In sum, there may be many reasons that a child decides whether or not to execute a behavioral response that are not necessarily related to whether they have acquired relevant, new information.

As one example, motivation may influence how a child behaves during assessment of learning. This potential confound between learning and motivation is illustrated by studies in which monetary (or point) rewards are used as appetitive reinforcers^7,29,30. These approaches are predicated upon the idea that children who have learned neutral cue–reinforcer relationships will respond to the neutral cues in ways that maximize their receipt of monetary rewards. Children’s learning is then modeled based on their behavioral choices^25,31. While it could be the case that children do not execute the expected behavioral responses because they have not learned, they also may not execute the expected responses because they are not sufficiently motivated by the monetary reward to use the information they have learned. Conversely, a loss of monetary reward may not be sufficiently salient to change behavior³². Comparing children’s performance across different assessments of learning can aid in illuminating whether behavior is being driven by learning or other motivational drives.

To test whether different measures of value learning identify similar groups of children as having learned, we compared children’s performance on a classical conditioned learning task and on a behavioral choice task. Children underwent a Pavlovian conditioning paradigm involving both appetitive and aversive reinforcers. Next, we assessed the extent to which they were able to use the conditioned stimuli to make decisions about whether to approach or avoid the various reinforcers. If both approaches are measuring comparable aspects of learning, children who demonstrate good associative learning on the conditioning paradigm should also be able to use that learned information to guide their approach and avoidance decisions. We assessed learning through both overt behavior and autonomic nervous system reactivity, and tested learning using a variety of different reinforcers to guard against stimulus-specific effects.

Results

Value learning as assessed by conditioning

We began by confirming that children learned the value of the previously neutral stimulus items during conditioning (for task design see Fig. 1 and Figure S1). Learning was assessed in three ways. First, we ran a hierarchical linear model (HLM) including pre-conditioning and post-conditioning rating and reinforcer type as fixed effects with random effects for reinforcer type nested within subject for Visual Analogue Scale ratings. Children rated cues paired with appetitive reinforcers more positively after conditioning and those paired with aversive reinforcers more negatively (χ²(4) = 15.28, p = 0.004). This was especially true for the points and aversive noise reinforcers. Second, we examined children’s modeled learning rates. Across participants, the best fit yielded a learning rate of 0.2, which is similar to those utilized in other studies^33,34,35. As a tertiary measure of learning, we also examined heart rate reactivity (using IBIs) to different reinforcer trials during the conditioning task. Children demonstrated differential heart rate reactivity to the different reinforcers (χ²(4) = 50.27, p < 0.001), further indicative of learning during the task (for full discussion of heart rate analyses see Supplemental Materials). Including age, gender, and WASI-II score in the models did not change any of the reported effects. Further details of the analyses are reported in the Supplemental Materials Tables S1 and S2.

Value learning as assessed by behavioral choice

We next examined children’s behavior on the behavioral choice task (Fig. 1) to determine how they used the information they learned in the conditioning task. To do so, we ran logistic HLM models with a random effect for reinforcer type nested within participant. Reinforcer type (points, positive image, aversive noise, negative image) was included as a fixed factor and whether or not the child approached was the outcome variable. As expected, children approached the appetitive reinforcers and avoided the aversive reinforcers (χ²(4) = 143.37, p < 0.001). As with the Visual Analogue Scale ratings, this was most pronounced for the points and aversive noise. Including age, gender, and WASI-II score did not change any of the reported effects. Further details of the analyses are reported in the Supplemental Materials.

Comparing performance on the conditioning and behavioral choice tasks

To determine if performance on the conditioning task and behavioral choice task identified similar groups of children as having acquired the value of the stimuli, we examined clusters of behavior using change in pre- and post-conditioning ratings of the neutral shapes and use behaviors on the approach avoidance task. Our measure of learning during the conditioning task was change in visual analogue scale ratings of the shapes measured using unstandardized residualized change scores. For the points and the positive image, more positive change is indicative of increased learning, and for the aversive noise and negative image, more negative change is indicative of greater learning. Our measure of use behaviors was children’s likelihood of demonstrating the expected behavior. Effective use of information reflects a participant choosing to approach appetitive reinforcers and avoid aversive reinforcers. Because there is an opportunity for additional learning in the behavioral choice task, analyses were run using only behavior on the first five trials of the task. If performance on the two tasks is comparable, the cluster analysis should identify two latent subgroups of behavior: one in which children demonstrate learning on the conditioning task and effective use of information on the behavioral choice task and a second in which individuals demonstrate little evidence of learning on the conditioning task and poor use of information on the behavioral choice task. Consistent with this, we identified a group of children that appeared to demonstrate higher conditioned learning and higher effective use of information as well as a group of children with lower conditioned learning and lower effective use of information. 
However, two additional groups of behavior were also identified: a group of children demonstrating higher conditioned learning but lower effective use of information and a group of children demonstrating lower conditioned learning and higher effective use of information. The presence of these two additional groups suggests that the two tasks do not provide comparable inferences of learning for all individuals. The four clusters were similar across the different reinforcer conditions (points, positive image, aversive noise, negative image) (Fig. 2). We also ran all cluster analyses using an alternative measure of learning (behavior derived from our computational reinforcement model) and behavior across all trials and found comparable patterns which are reported in Figures S2–S6.

Examination of alternative hypotheses

It was possible that children did not use information they learned in the conditioning task because they had forgotten the cue-reinforcer relationships by the time they completed the behavioral choice task. To test this alternative hypothesis, we assessed explicit recall for these relationships at the completion of the behavioral choice task (Figure S1). We found no evidence that an inability to recall the cue-reinforcer relationships accounted for performance differences between high and low use groups for high learners (ps > 0.10; Table S3). Additionally, there was no consistent evidence that the relationship between learning and use of value information was associated with children’s age, gender, or general cognitive ability (ps > 0.10; Table S3). It was also possible that the observed pattern of behaviors across the two tasks was due to children’s relative cognitive immaturity. To assess whether our findings replicate in an adult sample, we conducted a follow-up study with young adults (n = 74) using the same experimental paradigm (see also³⁶). We found comparable effects (Fig. 2), suggesting our findings are not specific to early childhood. Full methods and results for the adult follow-up study are reported in the Supplemental Materials (Table S4).

Discussion

We tested whether classical conditioning and behavioral choice tasks identify similar groups of children as having learned a set of value information and found that they do not. Some children demonstrated learning on the conditioning task that was similarly reflected in their actions in the behavioral choice task. Yet others demonstrated evidence of learning based only on the conditioning task that was not reflected in their behavioral choices, or, conversely, showed little evidence of learning on the conditioning task but clearly used the learned value information in making their behavioral choices. Last, some children demonstrated little evidence of learning based on either task. This finding helps account for some reported inconsistencies in the reward and threat learning literature.

The robustness of this dissociation is supported by several lines of convergent evidence. First, we found similar evidence for this phenomenon across four distinct types of reinforcers, making it unlikely that there are stimulus-specific effects across individuals. Second, we found similar patterns using different ways to assess learning on the conditioning task, making it unlikely that the effects are specific to one type of measurement. Third, children’s using previously learned information less effectively was not accounted for by the participant’s forgetting the value information or general cognitive factors. Our findings suggest that studies assessing value (or reward/threat) learning in children should carefully consider whether the method of assessment being used aligns with the primary question of interest—neither approach is more or less accurate, but each approach assesses a distinct aspect of learning. For this reason, there is utility in using multiple tasks when examining learning and motivational processes.

Utilizing multiple assessments of learning may be especially critical when assessing individual differences in value learning processes, or the effects of experiences on the acquisition of value information. Altered value learning has been implicated in a range of psychopathologies^17,18,19, and there is growing evidence that altered value learning may be one mechanism through which adverse experiences early in childhood increase risk for later psychopathology and behavioral problems^9,37,38. To date, research in this area cannot make clear claims about whether observed differences are linked to learning or differences in other motivational drives^39,40. Some recent evidence suggests it is likely the latter⁴¹. Using multiple assessments can help clarify what motivational components underlie these observed behavioral differences.

Taken together, the present data indicate a need for further research examining the mechanisms underlying what transforms learned information into action in early childhood—or what prevents acquired information from being transformed into action. Of particular interest are central prefrontal-dopaminergic striatal circuits. These circuits appear to play an important role in encoding value and informing goal directed approach avoidance behaviors in instrumental learning tasks^27,42. Dopaminergic activity has been linked to effective approach and avoidance of appetitive and aversive stimuli respectively^1,43. Additionally, dopamine may be particularly relevant to the motivational salience of stimuli⁴⁴. Research examining reactivity in central circuits can test whether altered reactivity in dopaminergic circuits contributes to the differential behavioral patterns observed.

Future research can refine the relationship between children’s learning and use of value information. One potentially surprising finding is that we identified children who demonstrated lower learning on the conditioning task but still executed the expected behavioral choices on the approach avoidance task. We suspect that the approach avoidance task provided some additional opportunity for learning. This is likely because children continue to be exposed to and receive feedback about the cue-reinforcer relationships across trials. Since our measures of learning are continuous and not dichotomous (learning/no learning)^16,25, this re-exposure to the stimuli may have reactivated these representations for some participants. An alternative explanation is that the instrumental nature of the approach avoidance task provides participants with increased control over the outcomes. Provision of control has been demonstrated to increase motivation during value-based tasks and increase the subjective value of reinforcers^45,46. Thus, this increased control may have facilitated faster encoding of value information for some individuals who previously demonstrated poorer learning. The lack of substantial differences between clusters examined using behavior during the first five trials and across the entire task could be interpreted as evidence against additional learning. However, both reactivation of prior representations of shape-reinforcer relationships and facilitation of learning by increased control could have occurred rapidly, within the first five trials. Further research can examine what may be driving this effect.

It could be argued that the differences we observe in performance across the tasks are reflective of other within-individual differences not captured by our measures of learning or memory for cue-reinforcer relationships. However, we did utilize multiple measures of learning and found a similar pattern of dissociation in performance on the tasks across these convergent measures. It could also be argued that the differences we observe are a result of the two tasks indexing different forms of learning^27,28; yet the observation that some children only approached positive stimuli and only avoided negative stimuli suggests this is not driving differential performance on the two tasks. These behavioral decisions would not have been possible if children were not applying previously learned information from the conditioning task. Finally, the current sample was primarily White and the samples for some of the low use groups are relatively small. Future research should replicate this effect in larger, more diverse samples with more representation in both high use and low use groups to reduce limitations associated with generalizability.

Overall, these data cast a new light on how to assess the processes through which children acquire critical information from stimuli in the environment and transform that information to guide their actions, broadly referred to as value-based decision making. Research aimed at further examining the factors that prevent learned information from being used in decision making can aid in our understanding of the mechanisms underlying these effects and holds potential for informing intervention and treatment for behavioral problems associated with disrupted value learning and use.

Method

Participants

We aimed to recruit 70 child participants (see also⁴⁷). Final recruitment was 72 children (29 female; 8–9 years old, M = 8.43, SD = 0.50; Race: 65.3% White Non-Hispanic; 2.8% Asian; 9.7% Black/African American; 9.7% White Hispanic; 4.2% Hispanic; 4.2% Multi-Racial; 4.2% Other). We recruited children in this age range because it appears to be the earliest period when children reliably exhibit both appetitive and aversive conditioned learning^7,48. Children provided verbal assent, and their parents provided written informed consent. Child participants received a toy prize and their parents received $25. This study was approved by the University of Wisconsin-Madison Institutional Review Board and performed in accordance with relevant guidelines and regulations.

Procedure

Methods are the same as those described previously⁴⁷. Participants attended one laboratory session lasting approximately ninety minutes. On arrival, participants completed a conditioned learning task, assessing their ability to learn associations between value information and neutral stimuli. They then completed an approach avoidance task, assessing their ability to use the information they learned in the conditioning task to guide behavior. To ensure potential differences in performance on the two tasks were not driven by differences in memory for the learned relationships, participants completed an explicit recall task at the end of the experiment. After the experiment, all participants were debriefed. Tasks were presented using E-Prime 2.0 on a touch screen Windows PC. An electrocardiogram (ECG) was collected using standard lead II electrode configuration throughout the experiment. To control for any potential differences in cognitive functioning, the Matrix Reasoning and Vocabulary subtests of the Wechsler Abbreviated Scale of Intelligence-Second Edition (WASI-II)⁴⁹ were administered to all participants.

Conditioned value learning

Participants completed a Pavlovian conditioning paradigm where they saw five colored shapes followed by either appetitive, aversive, or neutral reinforcers^33,47. Appetitive reinforcers consisted of points and a positive image; aversive reinforcers were an unpleasant 95 dB noise and a negative image (Fig. 1). The images were taken from the Open Affective Standardized Image Set (OASIS; Kurdi et al.⁵⁰; Positive Image: I256; Negative Image: I287). During conditioning, participants saw a visual cue (geometric colored shape) that was displayed until a keyboard response was made or 1.5 s had passed. This cue was followed by a delay period of 6 s during which a fixation cross was displayed. The delay was followed by either a corresponding reinforcer or a scrambled neutral image presented for 1.5 s with a probability of 0.8 for the reinforcer and 0.2 for the scrambled neutral image. Each trial was followed by a jittered inter-trial interval of 2.5–5.5 s. A fifth neutral condition consisted of a geometric cue always followed by the neutral scrambled picture. To maintain attention and as a measure of conditioning, participants were asked to press a keyboard response button as soon as they saw the geometric cue. Participants completed 14 trials of each condition for a total of 70 trials. Presentation of each trial was randomized within participants. Across participants, the shape-reinforcer pairings were counterbalanced using a Latin Square design.
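The trial structure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the E-Prime implementation used in the study. The condition labels and the function names `build_schedule` and `latin_square` are hypothetical, and the sketch assumes that each trial is an independent draw at probability 0.8 and that the jitter is uniform; the text states only the probabilities, trial counts, and interval range.

```python
import random

# Condition labels are hypothetical names for the five cue types in the text.
CONDITIONS = ["points", "positive_image", "aversive_noise", "negative_image", "neutral"]
TRIALS_PER_CONDITION = 14   # 5 conditions x 14 trials = 70 trials
P_REINFORCER = 0.8          # otherwise the scrambled neutral image is shown


def build_schedule(seed=0):
    """Return a shuffled list of 70 trial dicts for one participant."""
    rng = random.Random(seed)
    trials = []
    for condition in CONDITIONS:
        for _ in range(TRIALS_PER_CONDITION):
            # Assumption: each trial is an independent draw at p = 0.8.
            reinforced = condition != "neutral" and rng.random() < P_REINFORCER
            trials.append({
                "condition": condition,
                "outcome": condition if reinforced else "scrambled_image",
                # Assumption: jitter is uniform over the 2.5-5.5 s range.
                "iti_s": rng.uniform(2.5, 5.5),
            })
    rng.shuffle(trials)  # trial order is randomized within participant
    return trials


def latin_square(items):
    """Cyclic Latin square: row r is one counterbalancing order of the items."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]
```

With independent draws, the realized proportion of reinforced trials in a 14-trial condition will vary around 0.8; a fixed count per condition would be an equally plausible reading of the text.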

To measure conditioned learning, participants were asked to rate how good or bad they thought each neutral shape was prior to and after the conditioning task using a Visual Analogue Scale. Visual Analogue Scale ratings ranged from 0 (Bad) to 100 (Good) (Figure S1). Consistent with previous research, response times from participants’ button press to the neutral shapes were also used as a convergent secondary behavioral measure of learning³³. These response times were used to model participants’ learning rates during conditioning using a reinforcement learning framework⁵¹. We derived participant-level learning rates from the response times (RTs) of participants’ keyboard responses to the neutral shapes during the conditioning task. RTs have been shown to be good indicators of conditioning^52,53, and learning rates represent the speed of integration of recent outcomes^16,25.
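To illustrate what a learning rate means in this kind of framework, here is a minimal Python sketch of a delta-rule (Rescorla-Wagner style) value update. The text says only that a reinforcement learning framework was used, so the specific update rule, the function name `delta_rule_values`, and the 0/1 outcome coding are assumptions. The step that links cue values to response times is not described in this section and is not shown.

```python
def delta_rule_values(outcomes, alpha=0.2, initial_value=0.0):
    """Track the expected value of one cue across trials.

    outcomes: sequence of outcome magnitudes (assumed coding: 1.0 when the
              reinforcer is delivered, 0.0 when the scrambled image is shown).
    alpha:    learning rate; 0.2 is the best-fitting value reported in Results.
    Returns (values_before_each_trial, final_value).
    """
    values = []
    value = initial_value
    for outcome in outcomes:
        values.append(value)                 # expectation held when the cue appears
        prediction_error = outcome - value   # how surprising the outcome was
        value = value + alpha * prediction_error
    return values, value
```

A larger `alpha` weights recent outcomes more heavily, which is the sense in which a learning rate reflects the speed of integration of recent outcomes.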

Use of learned information to guide behavioral choice

After the conditioning task, participants completed a behavioral choice task in which they were asked to use information from the conditioning task to approach or avoid appetitive and aversive stimuli⁴⁷. This task was similar to the conditioned learning task, with the following exceptions. On each trial, participants were presented with the same shapes they had encountered on the previous task. After 1.5 s, a green and a red button appeared on either side of the screen. These buttons remained on screen until participants made a response (Fig. 1). If participants selected the green button, the paired reinforcer was presented. However, if participants selected the red button, a blank screen appeared without any reinforcer. In this manner, selecting the green button represented an approach response and pressing the red button represented an avoidance response. Participants completed 14 trials of each condition for a total of 70 trials. Trial presentation was randomized within participants and the side of the screen where the green and red buttons appeared was counterbalanced across participants.
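The behavioral measure derived from this task can be sketched as the proportion of trials on which a child made the expected choice. The Python below is a hedged illustration: the labels and the function names `expected_choice` and `use_score` are hypothetical, and scoring as a simple proportion is an assumption. The default window of five trials follows the first-five-trials analysis described in Results.

```python
# Hypothetical labels for the four reinforced conditions.
APPETITIVE = {"points", "positive_image"}
AVERSIVE = {"aversive_noise", "negative_image"}


def expected_choice(condition):
    """Approach (green button) is expected for appetitive cues, avoid (red) for aversive."""
    if condition in APPETITIVE:
        return "approach"
    if condition in AVERSIVE:
        return "avoid"
    raise ValueError(f"no expected choice defined for condition: {condition}")


def use_score(choices, condition, first_n=5):
    """Proportion of the first `first_n` choices that match the expected choice.

    choices: list of "approach" / "avoid" strings in trial order for one cue.
    """
    window = choices[:first_n]
    if not window:
        raise ValueError("no choices to score")
    target = expected_choice(condition)
    return sum(choice == target for choice in window) / len(window)
```

For example, `use_score(["approach", "approach", "avoid", "approach", "approach"], "points")` gives 0.8, while the same choices scored against an aversive cue give 0.2.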

Memory of learned information

To ensure that differences in performance on the conditioning task and the behavioral choice task were not a result of participants forgetting the shape-reinforcer pairings, participants also completed an explicit recall task at the end of the experiment⁴⁷. Memory was assessed two different ways. In one block, participants saw each neutral shape and were asked to identify what came after it by selecting one of four choices. In another block, participants were presented with each reinforcer and asked to identify what came before it by selecting one of four choices. Presentation of trials within blocks was randomized, and order of blocks was counterbalanced across participants. Details of the task are shown in Figure S1.

Physiological measures

As a tertiary convergent measure of learning, we examined autonomic cardiac reactivity during the conditioned learning task. Heart rate was derived from the ECG continuously throughout the study. Heart rate results are reported as inter-beat intervals (IBIs). The IBI represents the time in milliseconds between two heart beats, such that as heart rate decreases, IBI increases. The IBI series, derived from ECG, was time sampled at 4 Hz (with interpolation) to yield an equal interval time series. The ECG was measured using a Bionex system (MindWare Technologies LTD, Gahanna, OH). MindWare software was used to visually inspect all physiological data. To examine whether there were differences in autonomic reactivity during anticipation and presentation of reinforcers, IBIs were coded for the six second anticipatory period between cue presentation and reinforcer presentation to assess reactivity in anticipation of the reinforcer. IBIs were also coded for the period between reinforcer presentation and the next cue presentation (4–7 s) to assess autonomic reactivity to the reinforcers.
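The relationship between heart beats, IBIs, and an evenly sampled IBI series can be sketched as follows. This is a minimal Python illustration, not the MindWare processing pipeline. The function names are hypothetical. The sketch assumes linear interpolation and assigns each IBI to the time of the beat that ends it; the text states only that the series was sampled at 4 Hz with interpolation.

```python
def ibi_from_r_peaks(r_peak_times_s):
    """Inter-beat intervals in milliseconds from R-peak times in seconds."""
    return [(b - a) * 1000.0 for a, b in zip(r_peak_times_s, r_peak_times_s[1:])]


def heart_rate_bpm(ibi_ms):
    """Heart rate falls as IBI rises: 1000 ms between beats is 60 beats per minute."""
    return 60000.0 / ibi_ms


def resample_ibi(r_peak_times_s, rate_hz=4.0):
    """Return (time_s, ibi_ms) samples on an equal-interval grid.

    Assumption: each IBI is placed at the beat that closes it, and values
    between beats are filled by linear interpolation.
    """
    ibis = ibi_from_r_peaks(r_peak_times_s)
    beat_times = r_peak_times_s[1:]
    if len(ibis) < 2:
        raise ValueError("need at least three R peaks to interpolate")
    step = 1.0 / rate_hz
    n_steps = int((beat_times[-1] - beat_times[0]) / step + 1e-9)
    samples = []
    i = 0
    for k in range(n_steps + 1):
        t = beat_times[0] + k * step
        # Advance to the pair of beats that brackets time t.
        while i < len(beat_times) - 2 and beat_times[i + 1] < t:
            i += 1
        t0, t1 = beat_times[i], beat_times[i + 1]
        fraction = (t - t0) / (t1 - t0)
        samples.append((t, ibis[i] + fraction * (ibis[i + 1] - ibis[i])))
    return samples
```

Averaging the resampled values within the six second anticipatory window, or within the window after the reinforcer, would then give one reactivity value per trial; that windowing step is left out here.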

Statistical analyses

We used hierarchical linear modeling (HLM) techniques to examine participants’ pre- and post-conditioning Visual Analogue Scale ratings of the neutral shapes by reinforcer condition, reaction times to the neutral shapes by reinforcer condition, memory for the shapes by reinforcer condition, behavior on the behavioral choice task, and changes in IBI reactivity to reinforcers during conditioning. All HLM analyses were run using the lmer and glmer functions in the lme4 package in R v3.5.1; the Anova function in the car package was used to examine significance of the fixed effects. The emmeans package was used to examine simple slopes for interactions in linear models, as recommended by Preacher et al.⁵⁴, and to estimate marginal effects for predicted response probabilities for interactions in logistic models^55,56.

To examine whether individuals exhibited different patterns of learning and use behaviors across the conditioned learning and behavioral choice tasks, we used k-means clustering methodology⁵⁷. K-means takes a data driven approach that allows for identification of latent subgroups across measures of interest – in this case behavioral performance across the two tasks. We opted to use a data driven approach as this allows for identification of potential subgroups of behavioral performance without imposing a priori assumptions about how the behaviors should relate. Our measure of learning during the conditioning task was change in visual analogue scale ratings of the neutral shapes pre- and post-conditioning. Specifically, we calculated unstandardized residualized change scores by subtracting participants’ pre-conditioning rating from their post-conditioning rating. We then regressed the change score onto the pre-conditioning rating to remove baseline variance in ratings⁵⁸. These unstandardized residualized change scores and performance on the behavioral choice task were included as the k-means clustering factors. Clusters were run separately for each condition (points, noise, positive image, and negative image). Further methodological and analytic details, including discussion of other potential cluster solutions, are presented in the Supplemental Materials and Table S5.
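The two analytic steps in this paragraph can be sketched in plain Python. This is an illustrative sketch and not the authors’ analysis code, which is in R and available on OSF. The function names `residualized_change` and `kmeans` are hypothetical. The k-means routine is a basic Lloyd-style implementation with random initial centers, and the sketch does not rescale the two features, since the text does not say whether they were standardized before clustering.

```python
import random


def residualized_change(pre, post):
    """Unstandardized residuals from regressing (post - pre) on pre."""
    change = [b - a for a, b in zip(pre, post)]
    n = len(pre)
    mean_pre = sum(pre) / n
    mean_change = sum(change) / n
    sxx = sum((x - mean_pre) ** 2 for x in pre)
    if sxx == 0:
        raise ValueError("pre-conditioning ratings have no variance")
    slope = sum((x - mean_pre) * (y - mean_change) for x, y in zip(pre, change)) / sxx
    intercept = mean_change - slope * mean_pre
    return [y - (intercept + slope * x) for x, y in zip(pre, change)]


def kmeans(points, k, n_iter=100, seed=0):
    """Basic k-means on a list of equal-length tuples.

    Assumption: initial centers are k randomly chosen data points.
    Returns (labels, centers).
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its nearest center (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centers[c])))
            for pt in points
        ]
        # Move each center to the mean of its members; keep empty clusters in place.
        new_centers = []
        for c in range(k):
            members = [pt for pt, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])
        converged = new_labels == labels and new_centers == centers
        labels, centers = new_labels, new_centers
        if converged:
            break
    return labels, centers
```

In this framing, each child contributes one point per reinforcer condition, for example `(residualized change, proportion of expected choices)`, and the clustering is run separately for each condition.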

Data availability

Associated data and code are available on the Open Science Framework (OSF; https://osf.io/ns3ke/).

References

Daw, N. D. & Tobler, P. N. Value learning through reinforcement: The basics of dopamine and reinforcement learning. Neuroecon. Decis. Mak. Brain Sec. Ed. https://doi.org/10.1016/B978-0-12-416008-8.00015-2 (2013).
Padoa-Schioppa, C. & Assad, J. A. Neurons in the orbitofrontal cortex encode economic value. Nature 441, 223–226 (2006).
Debiec, J. & Olsson, A. Social fear learning: From animal models to human function. Trends Cogn. Sci. 21, 546–555 (2017).
Knutson, B. & Srirangarajan, T. Toward a deep science of affect and motivation. In Emotion in the Mind and Body (eds Neta, M. & Haas, I. J.) 193–220 (Springer, 2019).
Olsson, A., FeldmanHall, O., Haaker, J. & Hensler, T. Social regulation of survival circuits through learning. Curr. Opin. Behav. Sci. 24, 161–167 (2018).
Denison, S. & Xu, F. The origins of probabilistic inference in human infants. Cognition 130, 335–347 (2014).
Gerin, M. I. et al. A neurocomputational investigation of reinforcement-based decision making as a candidate latent vulnerability mechanism in maltreated children. Dev. Psychopathol. 29, 1689–1705 (2017).
Hanson, J. L. et al. Early adversity and learning: Implications for typical and atypical behavioral development. J. Child Psychol. Psychiatry Allied Discip. 58, 770–778 (2017).
Silvers, J. A. et al. Vigilance, the amygdala, and anxiety in youths with a history of institutional care. Biol. Psychiatry Cogn. Neurosci. Neuroimaging 2, 493–501 (2017).
Montag, J. L. Limited evidence for probability matching as a strategy in probability learning tasks. Psychol. Learn. Motiv. Adv. Res. Theory https://doi.org/10.1016/bs.plm.2021.02.005 (2021).
Rabagliati, H., Ferguson, B. & Lew-Williams, C. The profile of abstract rule learning in infancy: Meta-analytic and experimental evidence. Dev. Sci. 22, 1–18 (2019).
Plate, R. C., Fulvio, J. M., Shutts, K., Green, C. S. & Pollak, S. D. Probability learning: Changes in behavior across time and development. Child Dev. 89, 205–218 (2018).
Romberg, A. R. & Saffran, J. R. Statistical learning and language acquisition. Wiley Interdiscip. Rev. Cogn. Sci. 1, 906–914 (2010).
Gweon, H., Tenenbaum, J. B. & Schulz, L. E. Infants consider both the sample and the sampling process in inductive generalization. Proc. Natl. Acad. Sci. U. S. A. 107, 9066–9071 (2010).
Galván, A. Neural systems underlying reward and approach behaviors in childhood and adolescence. In Brain Imaging in Behavioral Neuroscience 167–188 (2013). https://doi.org/10.1007/7854_2013_240.
Nussenbaum, K. & Hartley, C. A. Reinforcement learning across development: What insights can we draw from a decade of research? Dev. Cogn. Neurosci. 40, 100733 (2019).
Shankman, S. A. et al. A psychophysiological investigation of threat and reward sensitivity in individuals with panic disorder and/or major depressive disorder. J. Abnorm. Psychol. 122, 322–338 (2013).
Goris, J. et al. Autistic traits are related to worse performance in a volatile reward learning task despite adaptive learning rates. Autism 25, 440–451 (2021).
Browning, M., Behrens, T. E., Jocham, G., O’Reilly, J. X. & Bishop, S. J. Anxious individuals have difficulty learning the causal statistics of aversive environments. Nat. Neurosci. 18, 590–596 (2015).
VanTieghem, M. R. & Tottenham, N. Neurobiological programming of early life stress: Functional development of amygdala prefrontal circuitry and vulnerability for stress related psychopathology. Curr. Top. Behav. Neurosci. 38, 117–136 (2018).
Boecker, R. et al. Impact of early life adversity on reward processing in young adults: EEG-fMRI results from a prospective study over 25 years. PLoS ONE 9, e104185. https://doi.org/10.1371/journal.pone.0104185 (2014).
Kasparek, S. W., Jenness, J. L. & McLaughlin, K. A. Reward processing modulates the association between trauma exposure and externalizing psychopathology. Clin. Psychol. Sci. 8, 989–1006. https://doi.org/10.1177/2167702620933570 (2020).
Dennison, M. J. et al. Differential associations of distinct forms of childhood adversity with neurobehavioral measures of reward processing: A developmental pathway to depression. Child Dev. 90, 96–113 (2017).
Boecker-Schlier, R. et al. Interaction between COMT Val158Met polymorphism and childhood adversity affects reward processing in adulthood. Neuroimage 132, 556–570 (2016).
Glimcher, P. W. Understanding dopamine and reinforcement learning: The dopamine reward prediction error hypothesis. PNAS 108, 15647–15654 (2011).
Dayan, P. & Berridge, K. C. Model-based and model-free Pavlovian reward learning: Revaluation, revision, and revelation. Cogn. Affect. Behav. Neurosci. 14, 473–492 (2014).
LeDoux, J. & Daw, N. D. Surviving threats: Neural circuit and computational implications of a new taxonomy of defensive behaviour. Nat. Rev. Neurosci. 19, 269–282 (2018).
Daw, N. D. & O’Doherty, J. P. Multiple systems for value learning. In Neuroeconomics 393–410 (Elsevier, 2014). https://doi.org/10.1016/B978-0-12-416008-8.00021-8.
Harms, M. B., Shannon-Bowen, K. E., Hanson, J. L. & Pollak, S. D. Instrumental learning and cognitive flexibility processes are impaired in children exposed to early life stress. Dev. Sci. 21, 1–13 (2018).
Daw, N. D., O’Doherty, J. P., Dayan, P., Seymour, B. & Dolan, R. J. Cortical substrates for exploratory decisions in humans. Nature 441, 876–879 (2006).
O’Doherty, J. P., Cockburn, J. & Pauli, W. M. Learning, reward, and decision making. Annu. Rev. Psychol. 68, 73–100 (2017).
Verharen, J. P. H., Adan, R. A. H. & Vanderschuren, L. J. M. J. How reward and aversion shape motivation and decision making: A computational account. Neuroscientist 26, 87–99 (2020).
Metereau, E. & Dreher, J. C. The medial orbitofrontal cortex encodes a general unsigned value signal during anticipation of both appetitive and aversive events. Cortex 63, 42–54 (2015).
Jensen, J. et al. Separate brain regions code for salience vs valence during reward prediction in humans. Hum. Brain Mapp. 28, 294–302 (2007).
O’Doherty, J. P., Buchanan, T. W., Seymour, B. & Dolan, R. J. Predictive neural coding of reward preference involves dissociable responses in human ventral midbrain and ventral striatum. Neuron 49, 157–166 (2006).
Smith, K. E. & Pollak, S. D. Approach motivation and loneliness: Individual differences and parasympathetic activity. Psychophysiology https://doi.org/10.1111/psyp.14036 (2022).
Hanson, J. L., Knodt, A. R., Brigidi, B. D. & Hariri, A. R. Heightened connectivity between the ventral striatum and medial prefrontal cortex as a biomarker for stress-related psychopathology: Understanding interactive effects of early and more recent stress. Psychol. Med. 48, 1–9 (2017).
Risbrough, V. B. et al. Does anhedonia presage increased risk of posttraumatic stress disorder. In Behavioral Neurobiology of PTSD (Springer, 2018). https://doi.org/10.1007/7854.
Birn, R. M., Roeber, B. J. & Pollak, S. D. Early childhood stress exposure, reward pathways, and adult decision making. Proc. Natl. Acad. Sci. 114, 13549–13554 (2017).
Hanson, J. L. et al. Behavioral problems after early life stress: Contributions of the hippocampus and amygdala. Biol. Psychiatry 77, 314–323 (2015).
Patterson, T. K., Craske, M. G. & Knowlton, B. J. Enhanced avoidance habits in relation to history of early-life stress. Front. Psychol. 10, 1–13 (2019).
Berridge, K. C. & Kringelbach, M. L. Neuroscience of affect: Brain mechanisms of pleasure and displeasure. Curr. Opin. Neurobiol. 23, 294–303 (2013).
Oleson, E. B., Gentry, R. N., Chioma, V. C. & Cheer, J. F. Subsecond dopamine release in the nucleus accumbens predicts conditioned punishment and its successful avoidance. J. Neurosci. 32, 14804–14808 (2012).
Kringelbach, M. L. & Berridge, K. C. The affective core of emotion: Linking pleasure, subjective well-being, and optimal metastability in the brain. Emot. Rev. 9, 191–199 (2017).
Inzlicht, M., Shenhav, A. & Olivola, C. Y. The effort paradox: Effort is both costly and valued. Trends Cogn. Sci. 22, 337–349 (2018).
Bhanji, J. P. & Delgado, M. R. The social brain and reward: Social information processing in the human striatum. Wiley Interdiscip. Rev. Cogn. Sci. 5, 61–73 (2014).
Smith, K. E. & Pollak, S. D. Early life stress and perceived social isolation influence how children use value information to guide behavior. Child Dev. https://doi.org/10.1111/cdev.13727 (2021).
McLaughlin, K. A., DeCross, S. N., Jovanovic, T. & Tottenham, N. Mechanisms linking childhood adversity with psychopathology: Learning as an intervention target. Behav. Res. Ther. 118, 101–109 (2019).
Wechsler, D. Wechsler Abbreviated Scale of Intelligence Second Edition (WASI-II) (NCS Pearson, 2011).
Kurdi, B., Lozano, S. & Banaji, M. R. Introducing the open affective standardized image Set (OASIS). Behav. Res. Methods 49, 457–470 (2017).
Rescorla, R. A. & Wagner, A. R. A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In Classical Conditioning II: Current Research and Theory (eds Black, A. H. & Prokasy, W. F.) 64–99 (Appleton-Century-Crofts, 1972).
Critchley, H. D., Mathias, C. J. & Dolan, R. J. Fear conditioning in humans: The influence of awareness and autonomic arousal on functional neuroanatomy. Neuron 33, 653–663 (2002).
Gottfried, J. A., O’Doherty, J. & Dolan, R. J. Encoding predictive reward value in human amygdala and orbitofrontal cortex. Science 301, 1104–1107 (2003).
Preacher, K. J., Curran, P. J. & Bauer, D. J. Computational tools for probing interactions in multiple linear regression, multilevel modeling, and latent curve analysis. J. Educ. Behav. Stat. 31, 437–448 (2006).
Long, J. S. & Mustillo, S. A. Using predictions and marginal effects to compare groups in regression models for binary outcomes. Sociol. Methods Res. 50, 1284–1320. https://doi.org/10.1177/0049124118799374 (2021).
McCabe, C. J., Halvorson, M. A., King, K. M., Cao, X. & Kim, D. S. Interpreting interaction effects in generalized linear models of nonlinear probabilities and counts. Multivar. Behav. Res. 1–27. https://doi.org/10.1080/00273171.2020.1868966 (2021).
Hartigan, J. & Wong, M. Algorithm AS 136: A K-means clustering algorithm. J. R. Stat. Soc. Ser. C (Appl. Stat.) 28, 100–108 (1979).
Allison, P. D. Change scores as a dependent variable in regression analysis. Sociol. Methodol. 20, 93–114 (1990).

Acknowledgements

We express thanks to Kristina Woodard, Keith Yoder, and Maryellen MacDonald for input on a previous version of the paper. This work was supported by the National Institute of Mental Health [R01MH61285 (SDP), T32MH018931-30 (KES)] and by a core grant to the Waisman Center from the National Institute of Child Health and Human Development [P50HD105353].

Author information

Authors and Affiliations

Department of Psychology and Waisman Center, University of Wisconsin – Madison, 1500 Highland Ave, Rm 392, Madison, WI, 53705, USA
Karen E. Smith & Seth D. Pollak

Contributions

K.E.S. and S.D.P. designed the study and K.E.S. was responsible for all data collection and analyses. K.E.S. and S.D.P. wrote the main manuscript text and prepared all figures and tables. All authors reviewed the manuscript.

Corresponding author

Correspondence to Karen E. Smith.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Smith, K.E., Pollak, S.D. Children’s value-based decision making. Sci Rep 12, 5953 (2022). https://doi.org/10.1038/s41598-022-09894-3

Received: 13 September 2021
Accepted: 28 March 2022
Published: 08 April 2022
DOI: https://doi.org/10.1038/s41598-022-09894-3
