Reward revaluation biases hippocampal replay content away from the preferred outcome


The rodent hippocampus spontaneously generates bursts of neural activity (replay) that can depict spatial trajectories to reward locations, suggesting a role in model-based behavioral control. A largely separate literature emphasizes reward revaluation as the litmus test for such control, yet the content of hippocampal replay under revaluation conditions is unknown. We examined the content of awake replay events following motivational shifts between hunger and thirst. On a T-maze offering free choice between food and water outcomes, rats shifted their behavior toward the restricted outcome, but replay content was shifted away from the restricted outcome. This effect preceded experience on the task each day and did not reverse with experience. These results demonstrate that replay content is not limited to reflecting recent experience or trajectories toward the preferred goal and suggest a role for motivational states in determining replay content.


Fig. 1: Experimental design, task and behavior.
Fig. 2: Sequenceless SWR content is biased in the opposite direction from motivational shifts.
Fig. 3: Sequenceless SWR content is biased away from the preferred outcome before experience on the task that day.
Fig. 4: Hippocampal sequences detected during example SWR events and their content.
Fig. 5: Motivational state changes shift SWR sequence content away from the behaviorally preferred maze arm.
Fig. 6: Neither differences in decoding accuracy across conditions nor experience-dependent scenarios can account for the observed SWR content.

Data availability

Data files including metadata are publicly available on DataLad (‘MotivationalT’ dataset).

Code availability

All analyses were performed using MATLAB 2017a and can be reproduced using code available on our public GitHub repository.




We thank N. Gibson, M. Ryan and J. Flanagan for animal care and M.-C. Kuo, J. Espinosa and E. Carmichael for technical assistance. We thank E. Grant for developing the SWR detection method used for the main analyses in this paper. This work was supported by the University of Waterloo and Dartmouth College (start-up funds to M.v.d.M.), the Netherlands Organization for Scientific Research (grant no. 863.10.013 to M.v.d.M.) and the Human Frontiers Science Project (grant no. RGY0088/2014).

Author information




A.A.C. performed the experiments and preprocessed the data. A.A.C., Y.T. and M.v.d.M. wrote the analysis code. A.A.C. and M.v.d.M. performed the data analysis. M.v.d.M. wrote the paper with comments from A.A.C. and Y.T.

Corresponding author

Correspondence to Matthijs A. A. van der Meer.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information: Nature Neuroscience thanks Daojun Ji and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Fig. 1 Schematic of SWR content analysis using sequenceless decoding (a) and sequence-based decoding (b).

In the sequenceless decoding analysis (a), each candidate SWR event is decoded as a single time bin, using joint tuning curves containing left and right trial data. This analysis produces two outputs reported in the paper (drawn as cyan diamond shapes): the z-scored left vs. right log odds averaged across all events, and a left vs. right count ratio of significant events only. In contrast, sequence-based decoding (b) first decodes all data in a given session for left and right trial tuning curves separately, using a 25 ms moving window (step size, 5 ms). Next, sequences are detected in the left and right posteriors separately, before removing sequences that (1) did not overlap with a candidate SWR event, (2) were not sufficiently distinct from the number of sequences obtained from a random resampling procedure, or (3) were sequences for both left and right.
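The 25 ms moving window with a 5 ms step used in the sequence-based analysis can be illustrated with a minimal Python sketch. The original analyses were performed in MATLAB; the function name `sliding_window_counts`, the argument layout and the half-open window convention below are assumptions made for illustration and are not taken from the authors' code.

```python
import math


def sliding_window_counts(spike_times, t_start, t_end, window=0.025, step=0.005):
    """Count spikes per cell in overlapping windows (times in seconds).

    spike_times: list with one list of spike times per cell (hypothetical layout).
    Returns (starts, counts), where counts[i][c] is the number of spikes of
    cell c in the half-open window [starts[i], starts[i] + window).
    """
    # Number of windows that fit entirely inside [t_start, t_end];
    # the small epsilon guards against floating-point round-off.
    n_windows = int(math.floor((t_end - t_start - window) / step + 1e-9)) + 1
    starts = [t_start + i * step for i in range(max(n_windows, 0))]
    counts = []
    for s in starts:
        row = [sum(1 for t in cell if s <= t < s + window) for cell in spike_times]
        counts.append(row)
    return starts, counts


# Toy example: one cell, three spikes, 50 ms of data.
starts, counts = sliding_window_counts([[0.001, 0.012, 0.031]], 0.0, 0.05)
```

Because consecutive windows overlap by 20 ms, neighboring decoded bins are not independent, which is one reason detected sequences are compared against a resampling procedure rather than against a fixed threshold.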

Supplementary Fig. 2 Schematic of sequenceless decoding analysis.

Each SWR event (top left) is converted into a vector of spike counts and decoded into a joint probability distribution that includes both left and right trajectories of the T-maze. This probability distribution yields a left vs. right log odds score for each event. Because any left or right bias in this score may simply be due to unequal distributions of the number of place cells, average firing rates, and so on, this raw log odds score is compared to a distribution of log odds obtained from 1000 permutations of left and right trial tuning curves used in the decoding. This comparison yields a log odds z-score for each event, which is either averaged across all events in a session (bottom right figure and Fig. 2b), or thresholded to keep only significant events to yield a proportion of left events (Fig. 2c). To determine if SWR content is related to motivational state, both measures are averaged across food-restriction (fr) and water-restriction (wr) sessions, and the resulting values (black vertical bars in bottom right plot; dots indicate single sessions) compared to a distribution of averages obtained from randomly permuting food- and water-restriction labels across sessions (top right; gray bars indicate mean and SD of this shuffled distribution). Thus, this analysis uses two independent bootstraps: the first ‘tuning curve shuffle’ quantifies SWR content on an event-by-event basis, and the second ‘motivational state shuffle’ quantifies the effect of motivational state on SWR content averaged across sessions.
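The event-by-event procedure described above (decode one spike-count vector, compute a left vs. right log odds score, then z-score it against shuffled tuning curves) can be sketched as follows. This is a simplified Python illustration, not the authors' MATLAB implementation: it assumes a Poisson decoder with a flat prior, and it assumes that the ‘tuning curve shuffle’ swaps the left and right tuning curves of each cell independently. All function names and the toy data are hypothetical.

```python
import math
import random
import statistics


def decode_posterior(counts, tuning, dt):
    """Poisson Bayesian decoding of one time bin with a flat prior.

    counts[c]: spike count of cell c; tuning[c][x]: firing rate (Hz) at bin x.
    """
    log_post = []
    for x in range(len(tuning[0])):
        lp = 0.0
        for c, n in enumerate(counts):
            rate = max(tuning[c][x], 1e-6) * dt  # expected spike count
            lp += n * math.log(rate) - rate
        log_post.append(lp)
    peak = max(log_post)
    post = [math.exp(lp - peak) for lp in log_post]
    total = sum(post)
    return [p / total for p in post]


def log_odds(counts, tc_left, tc_right, dt):
    """Left vs. right log odds from joint (concatenated) tuning curves."""
    joint = [left + right for left, right in zip(tc_left, tc_right)]
    post = decode_posterior(counts, joint, dt)
    n_left = len(tc_left[0])
    p_left, p_right = sum(post[:n_left]), sum(post[n_left:])
    return math.log((p_left + 1e-12) / (p_right + 1e-12))


def zscored_log_odds(counts, tc_left, tc_right, dt, n_perm=1000, seed=0):
    """Z-score the observed log odds against per-cell left/right swaps (assumed shuffle)."""
    rng = random.Random(seed)
    observed = log_odds(counts, tc_left, tc_right, dt)
    null = []
    for _ in range(n_perm):
        perm_left, perm_right = [], []
        for left, right in zip(tc_left, tc_right):
            if rng.random() < 0.5:
                left, right = right, left
            perm_left.append(left)
            perm_right.append(right)
        null.append(log_odds(counts, perm_left, perm_right, dt))
    sd = statistics.pstdev(null)
    return (observed - statistics.mean(null)) / sd if sd > 0 else 0.0


# Toy data: cells 0-1 prefer the left arm, cells 2-3 the right arm (rates in Hz).
tc_left = [[10, 10], [10, 10], [1, 1], [1, 1]]
tc_right = [[1, 1], [1, 1], [10, 10], [10, 10]]
event_counts = [4, 3, 0, 0]  # only the left-preferring cells fire in this event
raw = log_odds(event_counts, tc_left, tc_right, dt=0.1)
z = zscored_log_odds(event_counts, tc_left, tc_right, dt=0.1)
```

In this toy event the raw log odds and the z-score are both positive (left-biased). The z-score is the quantity that is insensitive to overall asymmetries in cell numbers or firing rates between the two arms, since those asymmetries also shape the shuffled distribution.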

Supplementary Fig. 3 Left and right trials are clearly distinguishable during running on the track.

a: Decoding confusion matrix for a single example session. Each decoded time bin (bin size = 100 ms) is assigned to the corresponding actual trial type (left, right) and position of the animal (horizontal axis). Decoded posteriors for all bins at that location are averaged to obtain the values in each column of the matrix. Perfect decoding would result in all diagonal elements being 1. Note the clear diagonal indicating non-random decoding overall, and the fainter, off-diagonal elements corresponding to confusion of left and right trials.
b: Average classification performance across all subjects and sessions, based on the trial type (left, right) and location (before the choice point (CP), ‘pre’; or after, ‘post’) of the maximum a posteriori decoding. Note that for all trial types and locations, classification performance is clearly above chance (all p < .001, bootstrap compared to shuffling trial type and location). Overall, classification performance was better for the right (water) arm compared to the left (food) arm (post-CP .92 ± .04 for right, .83 ± .06 for left, p = .009; n.s. for pre-CP; two-tailed Mann-Whitney U test). Importantly, there was no indication that classification performance differed between food- and water-restriction sessions (all comparisons n.s.). Analyses were performed on n = 4 rats and 19 total sessions.
c: Single-trial tuning curve Pearson’s correlations between trials of the same type (left-left, right-right) are systematically higher than correlations between trials of different types (left-right), both when averaged across all cells within a session (left panel) and on a cell-by-cell basis (right panel; both two-tailed Mann-Whitney U tests, n = 971 cells across all rats and sessions). Only positions on the common, central stem of the maze were included in this analysis. The higher correlations between trials of the same type, compared to the correlations across types, indicate that left and right trials are distinct.
Thus, as measured by two different approaches, left and right trials are clearly distinguishable during behavior on the track, even on the common portion of the maze.

Supplementary Fig. 4 SWR content, as measured by z-scored log odds, for every individual recording session.

Columns correspond to pre-task rest, intertrial intervals, and post-task rest respectively. Rows show data from individual subjects (n = 4 rats), and each data point corresponds to a single session (19 total). Individual subjects may show idiosyncratic biases such as an overall shift in SWR content towards the water (right) arm, but in each subject and epoch, changes across days in SWR content tended to be opposite to changes in behavior. For instance, in the pre-task (left) column, this is the case for all day-to-day pairs except for R042 day 3 to 4, and R050 day 3 to 4. In other words, 13 out of 15 motivational shifts resulted in an SWR content shift opposite from behavior.

Supplementary Fig. 5 SWR content z-scored log odds based on maze arms only (that is, excluding the central stem of the maze).

Figure layout is the same as Fig. 2, as is the pattern of results: across sessions, sequenceless SWR content is negatively correlated with behavioral experience (a, Pearson’s correlation –.88, p = 10⁻⁵). Both unthresholded (b) and thresholded z-scored log odds (c) showed an SWR content shift between food- and water-restriction sessions in the opposite direction from behavior (unthresholded z-scored log odds difference z = –3.09, p = 2.1 × 10⁻³; thresholded proportions difference z = –2.60, p = 9.2 × 10⁻³). Asterisks indicate significance level (*: p < 0.05, **: p < 0.01, ***: p < 0.001, two-tailed bootstrap) by comparison with a resampled distribution based on shuffling food and water restriction labels across sessions (gray bar width indicates standard error of the mean across shuffles). Analyses were performed on n = 4 rats and 19 total sessions.
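The comparison with a resampled distribution obtained by shuffling food- and water-restriction labels across sessions can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' MATLAB code: the function name `label_shuffle_test`, the way the z-score and two-tailed p value are derived from the shuffled distribution, and the session values are all hypothetical.

```python
import random
import statistics


def label_shuffle_test(values, labels, n_shuffles=1000, seed=0):
    """Compare the food- vs. water-restriction difference against shuffled labels.

    values: one SWR content score per session (for example a mean z-scored log odds).
    labels: "fr" (food restriction) or "wr" (water restriction) per session.
    Returns (observed difference, z-score, two-tailed empirical p value).
    """
    rng = random.Random(seed)

    def mean_difference(lbls):
        fr = [v for v, l in zip(values, lbls) if l == "fr"]
        wr = [v for v, l in zip(values, lbls) if l == "wr"]
        return statistics.mean(fr) - statistics.mean(wr)

    observed = mean_difference(labels)
    shuffled_labels = list(labels)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled_labels)  # keeps the number of fr and wr sessions fixed
        null.append(mean_difference(shuffled_labels))

    mu, sd = statistics.mean(null), statistics.stdev(null)
    z = (observed - mu) / sd
    # Two-tailed p: fraction of shuffles at least as extreme as the observation.
    n_extreme = sum(abs(d - mu) >= abs(observed - mu) for d in null)
    p = (1 + n_extreme) / (n_shuffles + 1)
    return observed, z, p


# Made-up session scores, for illustration only.
scores = [-1.2, -0.8, -1.0, -0.9, 1.1, 0.9, 1.0, 0.7]
states = ["fr", "fr", "fr", "fr", "wr", "wr", "wr", "wr"]
observed, z, p = label_shuffle_test(scores, states)
```

Shuffling labels rather than resampling sessions keeps the number of sessions per condition fixed, so the null distribution reflects only the assignment of motivational state to sessions.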

Supplementary Fig. 6 Raw SWR sequence count data for the sequence-based analysis (Fig. 5).

Main panel (top) shows the total number of detected sequences (across all subjects) corresponding to the left (L) and right (R) trajectories on the maze, for food-restriction sessions (fr, red) and water-restriction sessions (wr, blue). Note that the same overall pattern is apparent in these raw sequence counts as is shown in the proportion-based analysis (main text, Fig. 5): for food-restriction sessions, more sequences are detected on the right trajectory (leading to water) and for water-restriction sessions, more sequences are detected on the left trajectory (leading to food), indicating a change in SWR sequence content in the opposite direction from the motivational shift. Such a shift between food- and water-restriction sessions was apparent in each individual subject (lower panels). Individual sequence numbers are 1257 sequences for R042 (5 sessions), 95 for R044 (2 sessions), 429 for R050 (6 sessions) and 1999 for R064 (6 sessions).

Supplementary Fig. 7 SWR content, as measured by the proportion of detected sequences on the left (food) trajectory, for every individual recording session.

Columns correspond to pre-task rest, intertrial intervals, and post-task rest respectively. Rows show data from individual subjects (n = 4 rats), and each data point corresponds to a single session (19 total).

Supplementary Fig. 8 SWR sequence content (proportion of sequences depicting the left/food trajectory) for forward sequences (a), reverse sequences (b) and sequences beyond the choice point (c).

The same overall pattern as in the main analysis was apparent (compare Fig. 5), although individual comparisons with a resampled distribution based on shuffling food and water restriction labels across sessions did not reach statistical significance (forward sequences difference z = −1.62, p = .10; reverse sequences difference z = −1.45, p = .15, post-choice point sequences z = −1.04, p = .29). Layout as in Fig. 5b (n = 4 rats, 19 sessions total).

Supplementary Fig. 9 Comparison between the data and SWR content expected based on different hypotheses.

a: Schematic depicting the hypothesis that replay content is proportional to experience (solid line) alongside a sketch of the pattern found in the data (dashed line). The horizontal axis indicates time, including four daily recording sessions starting with a water restriction session (shown in blue on the lower left) and ending with a food restriction session (shown in red on the top right). During each recording session, experience is biased towards the restricted food type (for illustration purposes given here as a .75 probability of choosing water for water-restriction sessions, and .75 food probability for food-restriction sessions). This bias shifts predicted replay content (shown on the vertical axis) towards the corresponding side of the maze (right trials for water, left trials for food); note the changes in the solid line (indicating predicted replay content) that occur during the ITI epoch of each recording session. (The size of the experience-driven changes depends on factors such as whether all experience or only within-session experience is considered, as in Fig. 6c, d, but the pattern is the same.) The experience account thus predicts (1) a difference in replay content between the pre- and post-task rest periods, and (2) a bias in post-task replay content towards the recently preferred outcome. As indicated by the dashed line, neither prediction is confirmed by the data.
b: Schematic depicting the hypothesis that replay content favors the preferred behavioral choice (or outcome). This account predicts that pre-task replay content (1) and ITI replay content (2) are shifted towards the preferred choice (note that the solid line is on the water side during water-restriction sessions, and on the food side during food-restriction sessions). As indicated by the dashed line, neither prediction is supported by the data. A variation of this account, which assumes animals can plan for the next session, correctly predicts post-task replay content, but not pre-task replay content.

Supplementary Fig. 10 Comparison between the data and SWR content expected based on delayed-experience and motivational state accounts.

a: Schematic depicting the hypothesis that replay content reflects delayed experience. Although unlikely given the reported rapid effects of experience on replay content (see main text for discussion), this scenario correctly predicts no change between pre- and post-task rest, and an overall replay bias opposite the preferred outcome. However, it further predicts that the behavioral bias (preference for one side or the other, defined as max(p_left, 1 − p_left)) on day n predicts SWR content bias the next day: note the relatively small swing in SWR content following a relatively unbiased session (for example, session 2 with 63% food (left) arm experience) and comparatively large swing following a strongly biased session (session 4 with 95%, top right). The two bottom panels depict the three example session data points shown (dark gray symbols), which form the predicted positive correlation; the data, depicted here schematically as light gray circles, do not exhibit such a relationship. This figure also illustrates why it is informative to compute bias scores (bottom right panel; ranging from 0.5 to 1 by using the max operation above) rather than using raw values (bottom left panel). This is an instance of Simpson’s Paradox, where using raw values would always show a positive correlation between day n behavior and day n + 1 pre-task replay content (lower left panel): the structure of the task combined with the overall opposite bias in replay content confines the data points to the lower left and upper right quadrants. Geometrically, the bias scores align the food and water sessions to a common axis (note the reflection of the yellow and green quadrants in the lower right panel, made visible by the notch in one corner), enabling the testing of the more specific predictions shown here.
b: If motivational state determines replay content, replay content bias during the pre-task rest period as well as other epochs should predict that session’s behavioral bias (after all, a hungrier animal would show a stronger preference for food). This prediction, illustrated in the lower two panels, is confirmed in the data. Note that again, bias scores are important in avoiding spurious results (Simpson’s Paradox). Because the comparison is within-session rather than across-session (as in a), the raw data are now confined to the upper left and lower right quadrants (lower left panel).
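The difference between raw values and bias scores described in panel a can be made concrete with a small Python sketch. The bias score follows the definition given in the caption, max(p_left, 1 − p_left); everything else (the function names and all numbers) is invented for illustration and does not come from the recorded data.

```python
import math


def bias_score(p_left):
    """Strength of preference regardless of side; ranges from 0.5 to 1."""
    return max(p_left, 1.0 - p_left)


def pearson(xs, ys):
    """Pearson correlation coefficient, written out to stay dependency-free."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: proportion of left (food) choices on day n, and
# proportion of left SWR content during pre-task rest on day n + 1.
# The first four pairs follow food-restriction days, the last four follow
# water-restriction days, so the points fall in two opposite quadrants.
behavior_day_n = [0.95, 0.63, 0.80, 0.70, 0.05, 0.37, 0.20, 0.30]
replay_day_n1 = [0.60, 0.70, 0.62, 0.68, 0.40, 0.30, 0.38, 0.32]

raw_r = pearson(behavior_day_n, replay_day_n1)
bias_r = pearson([bias_score(p) for p in behavior_day_n],
                 [bias_score(p) for p in replay_day_n1])
```

In this made-up example the raw correlation is positive purely because the points form two clusters, whereas the correlation between bias scores is negative. Folding both session types onto a common axis removes the between-cluster structure and leaves only the relationship between the strength of the behavioral preference and the strength of the replay bias.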


Cite this article

Carey, A.A., Tanaka, Y. & van der Meer, M.A.A. Reward revaluation biases hippocampal replay content away from the preferred outcome. Nat Neurosci 22, 1450–1459 (2019).
