

A prospective code for value in the serotonin system

Abstract

The in vivo responses of dorsal raphe nucleus serotonin neurons to emotionally salient stimuli are a puzzle1. Existing theories centring on reward2, surprise3, salience4 and uncertainty5 individually account for some aspects of serotonergic activity but not others. Merging ideas from reinforcement learning theory6 with recent insights into the filtering properties of the dorsal raphe nucleus7, here we find a unifying perspective in a prospective code for value. This biological code for near-future reward explains why serotonin neurons are activated by both rewards and punishments3,4,8,9,10,11,12,13, and why these neurons are more strongly activated by surprising rewards but have no such surprise preference for punishments3,9—observations that previous theories have failed to reconcile. Finally, our model quantitatively predicts in vivo population activity better than previous theories. By reconciling previous theories and establishing a precise connection with reinforcement learning, our work represents an important step towards understanding the role of serotonin in learning and behaviour.


Fig. 1: A predictive model of serotonin neuron activity.
Fig. 2: A prospective code for value explains qualitative tuning features of serotonin neurons from previous literature.
Fig. 3: Individual serotonin neurons exhibit reward coding features consistent with a prospective code for value.
Fig. 4: A prospective code for value better explains serotonin neuron population activity than competing theories.


Data availability

Previously published data from ref. 5 are available at Dryad (https://doi.org/10.5061/dryad.cz8w9gj4s)41. Previously published data from ref. 10 are available at Zenodo (https://doi.org/10.5281/zenodo.12776509)56. All data used in the present work are available at Zenodo (https://doi.org/10.5281/zenodo.14623230)57.

Code availability

All code used in the present work is available at Zenodo (https://doi.org/10.5281/zenodo.14623230)57.

References

  1. Dayan, P. & Huys, Q. Serotonin’s many meanings elude simple theories. eLife 4, e07390 (2015).

  2. Liu, Z., Lin, R. & Luo, M. Reward contributions to serotonergic functions. Annu. Rev. Neurosci. 43, 141–162 (2020).

  3. Matias, S., Lottem, E., Dugué, G. P. & Mainen, Z. F. Activity patterns of serotonin neurons underlying cognitive flexibility. eLife 6, e20552 (2017).

  4. Paquelet, G. E. et al. Single-cell activity and network properties of dorsal raphe nucleus serotonin neurons during emotionally salient behaviors. Neuron 110, 2664–2679 (2022).

  5. Grossman, C. D., Bari, B. A. & Cohen, J. Y. Serotonin neurons modulate learning rate through uncertainty. Curr. Biol. 32, 586–599 (2022).

  6. Sutton, R. S. & Barto, A. G. Reinforcement Learning 2nd edn (MIT, 2018).

  7. Harkin, E. F. et al. Temporal derivative computation in the dorsal raphe network revealed by an experimentally-driven augmented integrate-and-fire modeling framework. eLife 12, e72951 (2023).

  8. Bromberg-Martin, E. S., Hikosaka, O. & Nakamura, K. Coding of task reward value in the dorsal raphe nucleus. J. Neurosci. 30, 6262–6272 (2010).

  9. Cohen, J. Y., Amoroso, M. W. & Uchida, N. Serotonergic neurons signal reward and punishment on multiple timescales. eLife 4, e06346 (2015).

  10. Li, Y. et al. Serotonin neurons in the dorsal raphe nucleus encode reward signals. Nat. Commun. 7, 10503 (2016).

  11. Zhong, W., Li, Y., Feng, Q. & Luo, M. Learning and stress shape the reward response patterns of serotonin neurons. J. Neurosci. 37, 8863–8875 (2017).

  12. Ren, J. et al. Anatomically defined and functionally distinct dorsal raphe serotonin sub-systems. Cell 175, 472–487 (2018).

  13. Feng, Y.-Y., Bromberg-Martin, E. S. & Monosov, I. E. Dorsal raphe neurons integrate the values of reward amount, delay, and uncertainty in multi-attribute decision-making. Cell Rep. 43, 114341 (2024).

  14. Soubrié, P. Reconciling the role of central serotonin neurons in human and animal behavior. Behav. Brain Sci. 9, 319–335 (1986).

  15. Deakin, J. F. W. & Graeff, F. G. 5-HT and mechanisms of defence. J. Psychopharmacol. 5, 305–315 (1991).

  16. Jacobs, B. & Fornal, C. in Psychopharmacology: 4th Generation of Progress (eds Bloom, F. E. & Kupfer, D. J.) 461–469 (Raven, 1995).

  17. Doya, K. Metalearning and neuromodulation. Neural Netw. 15, 495–506 (2002).

  18. Dayan, P. & Huys, Q. Serotonin in affective control. Annu. Rev. Neurosci. 32, 95–126 (2009).

  19. Boureau, Y.-L. & Dayan, P. Opponency revisited: competition and cooperation between dopamine and serotonin. Neuropsychopharmacology 36, 74–97 (2011).

  20. Cools, R., Nakamura, K. & Daw, N. D. Serotonin and dopamine: unifying affective, activational, and decision functions. Neuropsychopharmacology 36, 98–113 (2011).

  21. Azmitia, E. C. in Handbook of the Behavioral Neurobiology of Serotonin Vol. 31 (eds Müller, C. P. & Cunningham, K. A.) 3–22 (Elsevier, 2020).

  22. Shine, J. M. et al. Understanding the effects of serotonin in the brain through its role in the gastrointestinal tract. Brain 145, 2967–2981 (2022).

  23. Koolschijn, R. S. et al. Resources, costs and long-term value: an integrative perspective on serotonin and meta-decision making. Curr. Opin. Behav. Sci. 60, 101453 (2024).

  24. Luo, M., Li, Y. & Zhong, W. Do dorsal raphe 5-HT neurons encode ‘beneficialness’? Neurobiol. Learn. Mem. 135, 40–49 (2016).

  25. Spring, M. G. & Nautiyal, K. M. Striatal serotonin release signals reward value. J. Neurosci. 44, e0602242024 (2024).

  26. Haider, P. et al. Latent equilibrium: a unified learning theory for arbitrarily fast computation with arbitrarily slow neurons. Adv. Neural Inf. Process. Syst. 34, 17839–17851 (2021).

  27. Srinivasan, M., Laughlin, S. & Dubs, A. Predictive coding: a fresh view of inhibition in the retina. Proc. R. Soc. Lond. 216, 427–459 (1982).

  28. Spratling, M. A review of predictive coding algorithms. Brain Cognition 112, 92–97 (2017).

  29. Chalk, M., Marre, O. & Tkačik, G. Toward a unified theory of efficient, predictive, and sparse coding. Proc. Natl Acad. Sci. USA 115, 186–191 (2018).

  30. Masani, K., Vette, A. & Popovic, M. Controlling balance during quiet standing: proportional and derivative controller generates preceding motor command to body sway position observed in experiments. Gait Posture 23, 164–172 (2006).

  31. De Jong, J. W., Liang, Y., Verharen, J. P. H., Fraser, K. M. & Lammel, S. State and rate-of-change encoding in parallel mesoaccumbal dopamine pathways. Nat. Neurosci. 27, 309–318 (2024).

  32. Lundstrom, B. N., Higgs, M. H., Spain, W. J. & Fairhall, A. L. Fractional differentiation by neocortical pyramidal neurons. Nat. Neurosci. 11, 1335–1342 (2008).

  33. Kim, H. R. et al. A unified framework for dopamine signals across timescales. Cell 183, 1600–1616.e25 (2020).

  34. Baleanu, D., Fernandez, A. & Akgül, A. On a fractional operator combining proportional and classical differintegrals. Mathematics 8, 360 (2020).

  35. dos Santos Matias, S. P. Dynamics of Serotonergic Neurons Revealed by Fiber Photometry. PhD thesis, Univ. NOVA de Lisboa (2016).

  36. Daw, N. D., Kakade, S. & Dayan, P. Opponent interactions between serotonin and dopamine. Neural Netw. 15, 603–616 (2002).

  37. Kobayashi, S. & Schultz, W. Influence of reward delays on responses of dopamine neurons. J. Neurosci. 28, 7837–7846 (2008).

  38. Masset, P. et al. Multi-timescale reinforcement learning in the brain. Preprint at bioRxiv https://doi.org/10.1101/2023.11.12.566754 (2023).

  39. Sousa, M. et al. Dopamine neurons encode a multidimensional probabilistic map of future reward. Preprint at bioRxiv https://doi.org/10.1101/2023.11.12.566727 (2023).

  40. Miyazaki, K., Miyazaki, K. W. & Doya, K. Activation of dorsal raphe serotonin neurons underlies waiting for delayed rewards. J. Neurosci. 31, 469–479 (2011).

  41. Cohen, J., Grossman, C. & Bari, B. Serotonin neurons modulate learning rate through uncertainty. Dryad https://doi.org/10.5061/dryad.cz8w9gj4s (2021).

  42. Aghajanian, G. & Vandermaelen, C. Intracellular recordings from serotonergic dorsal raphe neurons: pacemaker potentials and the effects of LSD. Brain Res. 238, 463–469 (1982).

  43. Vandermaelen, C. & Aghajanian, G. Electrophysiological and pharmacological characterization of serotonergic dorsal raphe neurons recorded extracellularly and intracellularly in rat brain slices. Brain Res. 289, 109–119 (1983).

  44. Lynn, M. B. et al. A slow 5-HT1AR-mediated recurrent inhibitory network in raphe computes contextual value through synaptic facilitation. Preprint at bioRxiv https://doi.org/10.1101/2022.08.31.506056 (2022).

  45. Miyazaki, K. et al. Reward probability and timing uncertainty alter the effect of dorsal raphe serotonin neurons on patience. Nat. Commun. 9, 2048 (2018).

  46. Okaty, B. W., Commons, K. G. & Dymecki, S. M. Embracing diversity in the 5-HT neuronal system. Nat. Rev. Neurosci. 20, 397–424 (2019).

  47. Dabney, W. et al. A distributional code for value in dopamine-based reinforcement learning. Nature 577, 671–675 (2020).

  48. Lee, R. S., Sagiv, Y., Engelhard, B., Witten, I. B. & Daw, N. D. A feature-specific prediction error model explains dopaminergic heterogeneity. Nat. Neurosci. 27, 1574–1586 (2024).

  49. Calizo, L. H. et al. Raphe serotonin neurons are not homogenous: electrophysiological, morphological and neurochemical evidence. Neuropharmacology 61, 524–543 (2011).

  50. Fernandez, S. P. et al. Multiscale single-cell analysis reveals unique phenotypes of raphe 5-HT neurons projecting to the forebrain. Brain Struct. Funct. 221, 4007–4025 (2016).

  51. van Seijen, H. & Sutton, R. True online TD(λ). Int. Conf. Mach. Learn. 32, 692–700 (2014).

  52. van Seijen, H., Mahmood, A. R., Pilarski, P. M., Machado, M. C. & Sutton, R. S. True online temporal-difference learning. J. Mach. Learn. Res. 17, 5057–5096 (2016).

  53. Ranade, S. P. & Mainen, Z. F. Transient firing of dorsal raphe neurons encodes diverse and specific sensory, motor, and reward events. J. Neurophysiol. 102, 3026–3037 (2009).

  54. Elber-Dorozko, L. & Loewenstein, Y. Striatal action-value neurons reconsidered. eLife 7, e34248 (2018).

  55. Hesterberg, T. C. What teachers should know about the bootstrap: resampling in the undergraduate statistics curriculum. Am. Statistician 69, 371–386 (2015).

  56. Li, Y. & Luo, M. In vivo electrophysiological data of DRN serotonin neurons. Zenodo https://doi.org/10.5281/zenodo.12776509 (2024).

  57. Harkin, E. F. & Naud, R. Code and data for ‘A prospective code for value in the serotonin system’. Zenodo https://doi.org/10.5281/zenodo.14623230 (2025).


Acknowledgements

We acknowledge that this work was carried out on the unceded and unsurrendered land of the Algonquin Anishinaabe people. The data we re-analysed were collected on the unceded lands of the Piscataway and Susquehannock people. We thank Z. Mainen, B. Miller, N. Uchida and P. Albert for providing feedback on earlier versions of this paper. We thank P. Dayan for extensive helpful discussions. We also thank M. Lynn and S. Maillé for many helpful brainstorming sessions and input on figure design; J. Beninger for assisting with troubleshooting population-level analysis; K. Lloyd and W. C. Riedel for feedback on analysis of data from freely moving animals; and all members of the Naud, Béïque and Dayan laboratories for helpful discussions. We particularly thank M. Luo and Y. Li for sharing their data and helping us understand its format. Funding: E.F.H. is grateful for PGS-D and Queen Elizabeth II Scholarships in Science and Technology awards from the Natural Sciences and Engineering Research Council and Government of Ontario, respectively. This work was funded by grants to J.-C.B. and R.N. from the Canadian Institutes of Health Research (grant numbers RN442369 and RN442338).

Author information

Authors and Affiliations

Authors

Contributions

E.F.H. conceptualized the project, created the model, performed all simulations and data analysis, performed mathematical analysis, created all figures and wrote the first and final drafts of the manuscript, as well as all drafts of the Discussion section. R.N. provided supervision, funding and extensive input on all aspects of the project, as well as performing mathematical analysis and writing the intermediate draft. J.-C.B. provided funding, extensive input and helpful discussions throughout the project, as well as detailed comments on the manuscript. C.D.G. and J.Y.C. provided data and validated the design of our analysis. C.D.G. provided helpful discussions and extensive input on comparisons with the uncertainty model, substantially strengthening this aspect of the work. J.Y.C. provided helpful discussions and input on the manuscript.

Corresponding authors

Correspondence to Emerson F. Harkin or Jean-Claude Béïque.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Minmin Luo and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Simplified version of a prospective code for value explains serotonin neuron responses to reward and punishment over short timescales.

This figure qualitatively reproduces the results shown in Fig. 2a,b using a simplified model of serotonin neuron firing involving value and its rate of change over time \({\rho }_{t}={[\Delta {v}_{t}+(1-\psi ){v}_{t-1}+1]}_{+}\). The value signals used as input are shown in gray at top, using a baseline offset of B = 1, trial duration of 3 s, reward duration of 1 s, and punishment duration of 50 ms, as in the main text. Model predictions are shown for selected values of ψ indicated at left. Note that ψ = 0 causes the model output to be the same as the input (ρt = vt), representing a simple code for value. For ease of comparison with the exemplar neuron from ref. 9 (vignette in Fig. 2b, reproduced at bottom right), the right column shows the result of superimposing the two traces after smoothing with the biexponential kernel \({e}^{-t/1\,{\rm{ms}}}-{e}^{-t/200\,{\rm{ms}}}\) as in the original work. Note also that for ψ = 0.95, the smoothed model output (right) more closely resembles the input (top) than the smoothed model output for other values of ψ. This is because the model with ψ = 0.95 is an ideal prospective encoder for a leaky integrator with a time constant of 195 ms (ignoring rectification), which is very similar to the smoothing kernel. This illustrates how a prospective code for value can compensate for smoothing/filtering effects of a downstream decoder. Figure adapted from ref. 9, eLife, under a CC BY 4.0 licence.
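The simplified model in this legend can be sketched in a few lines of NumPy. This is an illustrative reimplementation of the equation above, not the authors' code; the function name and array conventions are ours.

```python
import numpy as np

def prospective_code(v, psi):
    """Simplified prospective code: rho_t = [dv_t + (1 - psi) * v_{t-1} + 1]_+ .

    v   : 1-D array of value over time.
    psi : prospectivity parameter in [0, 1]; psi = 0 reduces to the value
          signal plus the unit baseline, i.e. a simple code for value.
    """
    v = np.asarray(v, dtype=float)
    dv = np.diff(v, prepend=v[0])               # Delta v_t (no change at t = 0)
    v_prev = np.concatenate(([v[0]], v[:-1]))   # v_{t-1}
    return np.maximum(dv + (1.0 - psi) * v_prev + 1.0, 0.0)  # rectification [.]_+
```

Larger ψ weights the temporal-derivative term more heavily relative to the value itself, which is what makes the output "prospective".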

Extended Data Fig. 2 Predictions for experiments with time-varying rewards.

Model predictions for a reward of size 1/2 delivered over the course of 1 s, either increasing over the course of delivery (left), remaining stable (center, as in main text), or quickly depleting (right). Value (black) is calculated according to \({v}_{t}=1/4+{\sum }_{i=0}^{\infty }{e}^{-i\Delta t/\tau }{r}_{t+i+1}\Delta t\) where 1/4 is the baseline offset, Δt is the timestep (Δt = 10 ms), τ is the discounting timescale (τ = 3 s), and rt is the reward rate at time t (shown at top in gray). The prospective code for value (blue) uses adaptation strength A = 3 and timescale τad = 1 s as elsewhere.
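The discounted-value computation in this legend can be sketched as follows. This is an illustrative reimplementation under the stated parameters; rewards beyond the end of the simulated reward-rate trace are assumed to be zero.

```python
import numpy as np

def discounted_value(r, dt=0.01, tau=3.0, baseline=0.25):
    """Exponentially discounted value of a future reward-rate trace.

    Implements v_t = baseline + sum_{i>=0} exp(-i*dt/tau) * r_{t+i+1} * dt,
    as in this legend, with r the reward rate sampled every dt seconds.
    """
    r = np.asarray(r, dtype=float)
    T = len(r)
    discount = np.exp(-np.arange(T) * dt / tau)   # exp(-i*dt/tau) for i = 0, 1, ...
    v = np.empty(T)
    for t in range(T):
        future = r[t + 1:]                        # r_{t+i+1}
        v[t] = baseline + np.sum(discount[:len(future)] * future) * dt
    return v
```

With no future reward the value sits at the baseline offset, and it rises as the reward approaches, faster for shorter discounting timescales τ.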

Extended Data Fig. 3 Effect of punishment duration on response.

A Prospective code for value during punishment trials as a function of punishment duration. Note that the punishment offset response (defined as the baseline-subtracted normalized model output at the time of punishment offset) increases as the punishment duration shortens. This happens because when the punishment is short, the punishment-induced inhibition is removed more rapidly than adaptation-induced inhibition is re-established. B Example traces corresponding to the top, middle, and bottom of A. Simulation parameters: 3 s trial duration, 5 ms to 1000 ms punishment duration, 10 s mean ITI duration, 4 s discounting timescale, adaptation strength of A = 5, 1 s adaptation timescale, baseline offset of B = 2, 5 ms time step. All activity measurements are normalized to baseline for visualization purposes; this does not affect our results.

Extended Data Fig. 4 Effects of discounting over extreme timescales.

This figure extends the horizontal axis of main text Fig. 2d. Heatmap shows the ITI value relative to the maximal value during the trial (i.e., the value just before the start of reward delivery) as a function of experimental parameters (vertical axis) and the duration of the trial relative to the discounting timescale of the agent (horizontal axis). Ribbons next to the vertical axis are to scale, gray represents the mean ITI duration and colours represent trial epochs. Traces show the true value dynamics (black) and prospectively-encoded value (blue) for a trial consisting of a 3 s combined cue and delay epoch and a 1 s reward epoch. True value is normalized to the maximum just before reward as in the heatmap; different reward sizes (and learning) can be accommodated by scaling the traces. Note that the value signal is nearly flat if the discounting timescale is much longer than the trial duration (traces 1 and 4), similar to the dynamics of some serotonin neurons (e.g., Fig. 3g(ii)).

Extended Data Fig. 5 Online value estimation.

A Estimation of value during 300 trials of trace conditioning. Comparison between true value vt (black) and value estimated using TD(λ) with Dutch eligibility traces (shades of copper). B Prospectively-encoded value signals. Scale bars: 5 min, 1 arbitrary value unit. Inset adapted from ref. 11, Society for Neuroscience, under a CC BY 4.0 licence.
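For readers unfamiliar with the estimator, true online TD(λ) with Dutch eligibility traces (refs. 51, 52) can be sketched for linear function approximation as below. This is the textbook update rule, not the authors' implementation; the interface (an episode generator yielding one-hot features) is illustrative.

```python
import numpy as np

def true_online_td_lambda(make_episode, n_features,
                          alpha=0.1, gamma=1.0, lam=0.9, n_episodes=500):
    """True online TD(lambda) with Dutch eligibility traces.

    make_episode() returns one episode as a list of (x, reward, x_next)
    tuples with feature vectors x; a terminal successor is all zeros.
    """
    w = np.zeros(n_features)
    for _ in range(n_episodes):
        e = np.zeros(n_features)   # Dutch eligibility trace
        v_old = 0.0
        for x, r, x_next in make_episode():
            v, v_next = w @ x, w @ x_next
            delta = r + gamma * v_next - v
            # Dutch trace update (uses the trace from the previous step)
            e = gamma * lam * e + x - alpha * gamma * lam * (e @ x) * x
            # True online weight update
            w = w + alpha * (delta + v - v_old) * e - alpha * (v - v_old) * x
            v_old = v_next
    return w
```

Unlike conventional TD(λ) with accumulating traces, this update exactly matches the "forward view" of the λ-return at every step, which is why it is called "true online".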

Extended Data Fig. 6 Tuning features of identified serotonin neurons in the dynamic Pavlovian task.

A Trial-averaged activity patterns of six exemplar serotonin neurons. Shades of red indicate levels of reward history as in main text (darker colours indicate higher proportions of recently-rewarded trials). Inset shows firing rate of the corresponding neuron averaged across all trials (scale bar 1 Hz). Firing rate is estimated using a 500 ms PSTH in all cases. The top left neuron is the cell shown in Fig. 3g(i), and the top right neuron is the cell shown in Fig. 3a. B–E Top row shows all individual neurons along the vertical axis, and bottom row shows the corresponding population-level measures. B Trial averaged activity patterns of all neurons. Heatmap shows firing rate normalized to baseline for each neuron. Population firing rate at bottom is generated by averaging the unnormalized activity of all neurons. Firing rate is estimated using a 500 ms PSTH in all cases. C Cue response amplitude. Related to the horizontal axis of Fig. 3f. The cue response is defined as the extremum of the PSTH during the cue period. D Regression slope of whole trial activity against mean reward. Related to Fig. 3d. E Regression slope of baseline activity (number of spikes in a 1.5 s period immediately before cue onset) against mean reward. Shaded areas in C–E indicate 95% bootstrap CIs. Individual neurons are sorted vertically according to whole trial and baseline slopes. Statistical annotations in D and E reflect Wilcoxon signed-rank tests (N = 37 neurons).

Extended Data Fig. 7 Tuning features of serotonin neurons in a dynamic foraging task.

A Design of the dynamic foraging task of ref. 5. Head-fixed mice were presented with two spouts which delivered water rewards probabilistically. Reward probabilities changed independently for each spout according to a block structure. On each trial, the animal could lick one spout or the other and immediately receive a probabilistic reward. The un-chosen spout was withdrawn immediately after the first lick to prevent animals from attempting to sample both spouts, and replaced 2.5 s after cue onset to signal the end of the trial. B Activity patterns of N = 66 identified serotonin neurons. Heatmap shows the trial-averaged firing rates of all recorded neurons, calculated using a 50 ms PSTH and normalized to the baseline firing rate, sorted vertically according to the cue response. Un-normalized PSTHs for four exemplar neurons are shown at right. Histogram at top shows distribution of reward times across all trials. Inset in 1 shows a prospective code for value assuming that the animal collects the reward 200 ms to 400 ms after the start of the cue, and using the following parameters: adaptation strength A = 3, adaptation timescale τad = 1 s, baseline activity B = 1, and discounting timescale τ = 3 s. Note that the model qualitatively captures the observed transient increase in firing followed by a ramp from a lower level. C Transient firing precedes reward delivery. Histogram shows the time of the PSTH maximum relative to the median reward time across all trials for each of N = 53 neurons with a transient increase in firing following cue onset. Statistical annotation reflects a Wilcoxon signed-rank test. Note that transient firing that precedes the reward is expected under a prospective code for value, but not if serotonin neurons directly signal reward. 
D Effect of recent mean reward on baseline firing estimated using the regression model \(\widehat{y}={\beta }_{\widehat{p}}\widehat{p}+{\beta }_{0}\) where \(\widehat{y}\) is the number of spikes during a 1.5 s pre-trial baseline. Each point is one neuron and error bars indicate 95% bootstrap CIs. For clarity, error bars for the cue response are not shown. While there is significant uncertainty in the estimated slopes \({\beta }_{\widehat{p}}\) (partly due to the fact that overall reward history poorly predicts value-based decisions in upcoming trials), positive modulation of baseline firing by reward history is the dominant tuning feature among neurons with a positive cue response (Wilcoxon signed-rank test p = 0.0012, N = 53), consistent with a prospective code for value. E Baseline activity is more strongly related to mean reward than uncertainty, consistent with a prospective code for value. The fitting procedure used in D was repeated using five-trial reward variance, standard deviation, and entropy in place of the five-trial mean reward, as in Fig. S6. Slopes are normalized to the dynamic range of the independent variable (as in Fig. S6). Statistical annotations reflect Wilcoxon signed-rank tests (N = 66).
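The per-neuron slope estimates with 95% bootstrap CIs described in this legend can be sketched generically as a percentile bootstrap over trials. This is an illustrative sketch; the authors' exact resampling procedure may differ, and the function name and defaults are ours.

```python
import numpy as np

def bootstrap_slope_ci(p_hat, y, n_boot=2000, ci=95, seed=0):
    """Percentile bootstrap CI for the slope of y ~ beta * p_hat + beta_0.

    p_hat : recent mean reward per trial (regressor).
    y     : baseline spike counts per trial (response).
    Trials are resampled with replacement and the OLS slope refitted.
    """
    rng = np.random.default_rng(seed)
    p_hat, y = np.asarray(p_hat, float), np.asarray(y, float)
    n = len(y)
    slopes = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)                 # resample trials
        slopes[b] = np.polyfit(p_hat[idx], y[idx], 1)[0]
    lo, hi = np.percentile(slopes, [(100 - ci) / 2, 100 - (100 - ci) / 2])
    return lo, hi
```

A neuron whose CI excludes zero would be counted as significantly modulated by reward history; aggregating signed slopes across neurons then supports population-level tests such as the Wilcoxon signed-rank test used here.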

Extended Data Fig. 8 Diverse tuning features of serotonin neurons in freely moving animals10 are explained by a prospective code for value with variable discounting.

A Experimental setup and example traces from individual serotonin neurons. Animals were trained to leave a “trigger zone” at one end of a linear track and subsequently enter a “reward zone” at the other end to receive a sucrose reward delivered 2 s after reward zone entry. Firing rates of optogenetically-tagged serotonin neurons were recorded extracellularly. Most serotonin neurons displayed increased activity after reward zone entry (example traces at top), but the dynamics of activity varied. Scale bar: 1 s, 5 Hz. B Trial-averaged serotonin neuron firing activity aligned to reward zone entry. Each row is one neuron. Activity is normalized to the mean firing rate between reward zone entry and reward delivery for each cell. Neurons with a <1 Hz increase in firing after reward zone entry were considered unresponsive and are shown at bottom. Rows labeled 1 and 2 indicate example neurons from A. Firing rate is calculated using non-overlapping 50 ms bins, as in the original work. C Smoothed firing activity of responsive neurons. Data are the same as in the top part of B. A Gaussian filter with a standard deviation of 2 cells in the vertical direction and 50 ms (1 bin) in the horizontal direction was used for smoothing. D Firing dynamics during the pre-reward epoch quantified using the slope of the normalized PSTH for each neuron. Positive slope indicates that firing increases as the animal gets closer to reward and negative slope indicates a decrease in firing. Slopes were fitted to a 1.5 s window beginning 250 ms after reward zone entry and ending 250 ms before reward delivery. Shaded region indicates 95% bootstrap CI. Note that a negative slope rules out value coding in the relevant cells. E Predicted activity under a prospective code for value as a function of the discounting timescale. As the discounting timescale increases, pre-reward activity changes from ramping up towards reward (top) to ramping down (bottom). 
F Predicted activity under a simple code for value as a function of the discounting timescale. As the discounting timescale increases, pre-reward activity changes from ramping up towards reward (top) to constant high activity (bottom). Decreasing activity is never predicted. G Predicted activity under surprise with adaptation as a function of the relative level of surprise attached to the reward. As the reward becomes less surprising relative to the cue, activity changes from a downward ramp during the pre-reward epoch and strong activation by reward (top) to the same downward ramp followed by weak activation by the reward (bottom). Increasing activity during the pre-reward epoch is never predicted. All model predictions are normalized in the same way as the raw data. Value-based models assume that reward zone entry acts as a cue, that value outside the reward zone is negligible, and that the perception of reward lasts 1 s (for example, due to residual sweet taste on the lick spout). Predictions are lagged by 150 ms to account for perceptual delays, as in the rest of this work.

Supplementary information

Supplementary Information

Supplementary Methods, Tables, Discussion and Notes. The Methods provide the derivation of true value in trace conditioning experiments. Tables (four in total) include result summaries and theoretical predictions. Notes (nine total) include analytical treatment of key ideas (for example, normative justification for theory) and comparison with the uncertainty-related results of Grossman et al.5 using overlapping data.

Reporting Summary

Peer Review File

Source data

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article


Cite this article

Harkin, E.F., Grossman, C.D., Cohen, J.Y. et al. A prospective code for value in the serotonin system. Nature (2025). https://doi.org/10.1038/s41586-025-08731-7

