Cerebellar granule cells encode the expectation of reward

Wagner, Mark J.; Kim, Tony Hyun; Savall, Joan; Schnitzer, Mark J.; Luo, Liqun

doi:10.1038/nature21726

Letter
Published: 20 March 2017

Nature volume 544, pages 96–100 (2017)

This article has been updated

Abstract

The human brain contains approximately 60 billion cerebellar granule cells¹, which outnumber all other brain neurons combined. Classical theories posit that a large, diverse population of granule cells allows for highly detailed representations of sensorimotor context, enabling downstream Purkinje cells to sense fine contextual changes^2,3,4,5,6. Although evidence suggests a role for the cerebellum in cognition^7,8,9,10, granule cells are known to encode only sensory^11,12,13 and motor¹⁴ context. Here, using two-photon calcium imaging in behaving mice, we show that granule cells convey information about the expectation of reward. Mice initiated voluntary forelimb movements for delayed sugar-water reward. Some granule cells responded preferentially to reward or reward omission, whereas others selectively encoded reward anticipation. Reward responses were not restricted to forelimb movement, as a Pavlovian task evoked similar responses. Compared to predictable rewards, unexpected rewards elicited markedly different granule cell activity despite identical stimuli and licking responses. In both tasks, reward signals were widespread throughout multiple cerebellar lobules. Tracking the same granule cells over several days of learning revealed that cells with reward-anticipating responses emerged from those that responded at the start of learning to reward delivery, whereas reward-omission responses grew stronger as learning progressed. The discovery of predictive, non-sensorimotor encoding in granule cells is a major departure from the current understanding of these neurons and markedly enriches the contextual information available to postsynaptic Purkinje cells, with important implications for cognitive processing in the cerebellum.

**Figure 1: Two-photon Ca²⁺ imaging of cerebellar granule cells during an operant task.**

**Figure 2: Granule cells encode reward context during a forelimb movement operant task.**

**Figure 3: Granule cells encode reward context during a Pavlovian tone–reward task.**

**Figure 4: Emergence of reward expectation responses during forelimb movement task learning.**

Change history

05 April 2017
Fig. 1b was corrected to remove an erroneous dashed line.

References

1. Herculano-Houzel, S. Coordinated scaling of cortical and cerebellar numbers of neurons. Front. Neuroanat. 4, 12 (2010)
2. Marr, D. A theory of cerebellar cortex. J. Physiol. (Lond.) 202, 437–470 (1969)
3. Albus, J. S. A theory of cerebellar function. Math. Biosci. 10, 25–61 (1971)
4. Fujita, M. Adaptive filter model of the cerebellum. Biol. Cybern. 45, 195–206 (1982)
5. Rancz, E. A. et al. High-fidelity transmission of sensory information by single cerebellar mossy fibre boutons. Nature 450, 1245–1248 (2007)
6. Huang, C.-C. et al. Convergence of pontine and proprioceptive streams onto multimodal cerebellar granule cells. eLife 2, e00400 (2013)
7. Ito, M. Control of mental activities by internal models in the cerebellum. Nat. Rev. Neurosci. 9, 304–313 (2008)
8. Strick, P. L., Dum, R. P. & Fiez, J. A. Cerebellum and nonmotor function. Annu. Rev. Neurosci. 32, 413–434 (2009)
9. Stoodley, C. J., Valera, E. M. & Schmahmann, J. D. Functional topography of the cerebellum for motor and cognitive tasks: an fMRI study. Neuroimage 59, 1560–1570 (2012)
10. Tsai, P. T. et al. Autistic-like behaviour and cerebellar dysfunction in Purkinje cell Tsc1 mutant mice. Nature 488, 647–651 (2012)
11. Bengtsson, F. & Jörntell, H. Sensory transmission in cerebellar granule cells relies on similarly coded mossy fiber inputs. Proc. Natl Acad. Sci. USA 106, 2389–2394 (2009)
12. Bing, Y.-H., Zhang, G.-J., Sun, L., Chu, C.-P. & Qiu, D.-L. Dynamic properties of sensory stimulation evoked responses in mouse cerebellar granule cell layer and molecular layer. Neurosci. Lett. 585, 114–118 (2015)
13. Ishikawa, T., Shimuta, M. & Häusser, M. Multimodal sensory integration in single cerebellar granule cells in vivo. eLife 4, e12916 (2015)
14. Powell, K., Mathy, A., Duguid, I. & Häusser, M. Synaptic representation of locomotion in single cerebellar granule cells. eLife 4, e07290 (2015)
15. Coltz, J. D., Johnson, M. T. & Ebner, T. J. Cerebellar Purkinje cell simple spike discharge encodes movement velocity in primates during visuomotor arm tracking. J. Neurosci. 19, 1782–1803 (1999)
16. Waelti, P., Dickinson, A. & Schultz, W. Dopamine responses comply with basic assumptions of formal learning theory. Nature 412, 43–48 (2001)
17. Galliano, E. et al. Silencing the majority of cerebellar granule cells uncovers their essential role in motor learning and consolidation. Cell Reports 3, 1239–1251 (2013)
18. Medina, J. F. & Lisberger, S. G. Links from complex spikes to local plasticity and motor learning in the cerebellum of awake-behaving monkeys. Nat. Neurosci. 11, 1185–1192 (2008)
19. Brooks, J. X. & Cullen, K. E. The primate cerebellum selectively encodes unexpected self-motion. Curr. Biol. 23, 947–955 (2013)
20. Schultz, W. Predictive reward signal of dopamine neurons. J. Neurophysiol. 80, 1–27 (1998)
21. Cohen, J. Y., Haesler, S., Vong, L., Lowell, B. B. & Uchida, N. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature 482, 85–88 (2012)
22. Schultz, W., Apicella, P., Scarnati, E. & Ljungberg, T. Neuronal activity in monkey ventral striatum related to the expectation of reward. J. Neurosci. 12, 4595–4610 (1992)
23. Tremblay, L. & Schultz, W. Reward-related neuronal activity during go-nogo task performance in primate orbitofrontal cortex. J. Neurophysiol. 83, 1864–1876 (2000)
24. Miyazaki, K., Miyazaki, K. W. & Doya, K. Activation of dorsal raphe serotonin neurons underlies waiting for delayed rewards. J. Neurosci. 31, 469–479 (2011)
25. Matsumoto, M. & Hikosaka, O. Representation of negative motivational value in the primate lateral habenula. Nat. Neurosci. 12, 77–84 (2009)
26. Kawai, T., Yamada, H., Sato, N., Takada, M. & Matsumoto, M. Roles of the lateral habenula and anterior cingulate cortex in negative outcome monitoring and behavioral adjustment in nonhuman primates. Neuron 88, 792–804 (2015)
27. Chen, T.-W. et al. Ultrasensitive fluorescent proteins for imaging neuronal activity. Nature 499, 295–300 (2013)
28. Madisen, L. et al. Transgenic mice for intersectional targeting of neural sensors and effectors with high specificity and performance. Neuron 85, 942–958 (2015)
29. Li, L. et al. Visualizing the distribution of synapses from individual neurons in the mouse brain. PLoS One 5, e11503 (2010)
30. Matei, V. et al. Smaller inner ear sensory epithelia in Neurog1 null mice are related to earlier hair cell cycle exit. Dev. Dyn. 234, 633–650 (2005)
31. Ben-Arie, N. et al. Math1 is essential for genesis of cerebellar granule neurons. Nature 390, 169–172 (1997)
32. Madisen, L. et al. A robust and high-throughput Cre reporting and characterization system for the whole mouse brain. Nat. Neurosci. 13, 133–140 (2010)
33. Figielski, A., Bonev, I. A. & Bigras, P. Towards development of a 2-DOF planar parallel robot with optimal workspace use. In 2007 IEEE International Conference on Systems, Man and Cybernetics. 1562–1566 (ISIC, 2007)
34. Lecoq, J. et al. Visualizing mammalian brain area interactions by dual-axis two-photon calcium imaging. Nat. Neurosci. 17, 1825–1829 (2014)
35. Pologruto, T. A., Sabatini, B. L. & Svoboda, K. ScanImage: flexible software for operating laser scanning microscopes. Biomed. Eng. Online 2, 13 (2003)
36. Thévenaz, P., Ruttimann, U. E. & Unser, M. A pyramid approach to subpixel registration based on intensity. IEEE Trans. Image Process. 7, 27–41 (1998)
37. Mukamel, E. A., Nimmerjahn, A. & Schnitzer, M. J. Automated analysis of cellular signals from large-scale calcium imaging data. Neuron 63, 747–760 (2009)
38. Simon, H., Le Moal, M. & Calas, A. Efferents and afferents of the ventral tegmental-A10 region studied after local injection of [³H]leucine and horseradish peroxidase. Brain Res. 178, 17–40 (1979)
39. Ikai, Y., Takada, M., Shinonaga, Y. & Mizuno, N. Dopaminergic and non-dopaminergic neurons in the ventral tegmental area of the rat project, respectively, to the cerebellar cortex and deep cerebellar nuclei. Neuroscience 51, 719–728 (1992)
40. Swanson, L. W. The projections of the ventral tegmental area and adjacent regions: a combined fluorescent retrograde tracer and immunofluorescence study in the rat. Brain Res. Bull. 9, 321–353 (1982)
41. Dahlstroem, A., Fuxe, K., Olson, L. & Ungerstedt, U. Ascending systems of catecholamine neurons from the lower brain stem. Acta Physiol. Scand. 62, 485–486 (1964)
42. Kizer, J. S., Palkovits, M. & Brownstein, M. J. The projections of the A8, A9 and A10 dopaminergic cell bodies: evidence for a nigral-hypothalamic-median eminence dopaminergic pathway. Brain Res. 108, 363–370 (1976)
43. Panagopoulos, N. T., Papadopoulos, G. C. & Matsokis, N. A. Dopaminergic innervation and binding in the rat cerebellum. Neurosci. Lett. 130, 208–212 (1991)
44. Glaser, P. E. A. et al. Cerebellar neurotransmission in attention-deficit/hyperactivity disorder: does dopamine neurotransmission occur in the cerebellar vermis? J. Neurosci. Methods 151, 62–67 (2006)
45. Schwarz, L. A. et al. Viral-genetic tracing of the input–output organization of a central noradrenaline circuit. Nature 524, 88–92 (2015)
46. Hnasko, T. S. et al. Cre recombinase-mediated restoration of nigrostriatal dopamine in dopamine-deficient mice reverses hypophagia and bradykinesia. Proc. Natl Acad. Sci. USA 103, 8858–8863 (2006)
47. Beier, K. T. et al. Circuit architecture of VTA dopamine neurons revealed by systematic input-output mapping. Cell 162, 622–634 (2015)
48. Tervo, D. G. et al. A designer AAV variant permits efficient retrograde access to projection neurons. Neuron 92, 372–382 (2016)

Acknowledgements

We thank C. Kim for designing and assembling the capacitive lick sensor, L. Kitch for image processing code, J. Lecoq for microscope design, H. Zeng, E. Kim, E. Callaway and members of the Luo laboratory for reagents, mouse lines, and helpful discussions, and W. Newsome and J. Raymond for critical comments on the manuscript. M.J.W. was supported by Epilepsy Training Grant. M.J.S. and L.L. are HHMI investigators. This work was supported by NIH grants and Hughes Collaborative Innovation Award to L.L.

Author information

Mark J. Wagner and Tony Hyun Kim: These authors contributed equally to this work.

Authors and Affiliations

Department of Biology and Howard Hughes Medical Institute, Stanford University, Stanford, 94305, California, USA
Mark J. Wagner, Tony Hyun Kim, Joan Savall, Mark J. Schnitzer & Liqun Luo
Department of Electrical Engineering, Stanford University, Stanford, 94305, California, USA
Tony Hyun Kim
Department of Applied Physics, Stanford University, Stanford, 94305, California, USA
Mark J. Schnitzer

Contributions

M.J.W. designed and executed all experiments and analysed the data. T.H.K. contributed microscopy instrumentation as well as processing of brain imaging and behavioural videos. J.S. contributed to manipulandum design. M.J.S. provided imaging hardware, software, and expertise. L.L. supervised the project. M.J.W. and L.L. wrote the paper with contributions from all authors.

Corresponding authors

Correspondence to Mark J. Wagner, Mark J. Schnitzer or Liqun Luo.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Reviewer Information Nature thanks C. De Zeeuw, T. Knopfel and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Extended data figures and tables

Extended Data Figure 1 Ca²⁺ imaging in cerebellar granule cells.

a, Parasagittal section of the cerebellum of a transgenic mouse (Math1-cre/CAG-lox-stop-lox-tTA/TRE-lox-stop-lox-GCaMP6f) used for in vivo two-photon Ca²⁺ imaging. GCaMP6f expression (green) is widespread throughout most granule cells. GCaMP-expressing somas were not detected in the molecular layer, and only rarely coincided with Purkinje cells (red). For unknown reasons, granule cell expression is substantially reduced in lobules IX and X. A, anterior; P, posterior; D, dorsal; V, ventral. b, Mean two-photon fluorescence image for the session shown in Fig. 1c, d. c, Location of all identified active cerebellar granule cells in the field of view in b (n = 53 cells total). Numbered cells indicate the example cell traces shown in Fig. 1d, counting from the bottom to the top.

Extended Data Figure 2 Granule cells encode movement in a forelimb movement operant task.

a, b, Distribution of times of peak (a) or minimum (b) trial-averaged fluorescence response relative to reach midpoint (blue histograms, n = 561 total neurons from 6 mice). Orange histograms denote the subset of cells whose peak (a) or minimum (b) trial-averaged fluorescence modulation was significant. 85% of cells exhibited significant positive modulation, while 90% of cells exhibited significant negative modulation, at a point between –2 to 2 s relative to forelimb movement. To compute significance we compared observed peak and minimum fluorescence to fluorescence for randomized datasets (Methods). c, For each cell we computed the Spearman correlation coefficient between single-trial fluorescence (mean from –0.1 to +0.3 s relative to movement midpoint) and peak movement velocity. Histogram denotes distribution of Spearman coefficients across neurons (n = 561 total neurons from 6 mice). Neurons correlated with P < 0.01 (permutation test) are shown in orange. d, Mean movement-aligned fluorescence of granule cells whose single-trial fluorescence correlated significantly with peak movement speed, shown in c (n = 111 neurons with P < 0.01 for correlation coefficients, shown in orange in c). e, f, Two example granule cells that encode licking. For these cells, response differences between reward outcomes (top row, examples) can be explained by the encoding of the licking response on rewarded trials (bottom row, 25 trials with the most and least licking from 0.1 to 1 s), n = 209 rewarded and 68 omitted reward trials. Dashed vertical lines denote average time of forelimb movement midpoint, solid vertical lines denote time of reward. In this and all subsequent figures, shaded regions denote s.e.m.
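
The correlation analysis in panel c can be illustrated with a minimal sketch. It computes a Spearman coefficient between single-trial fluorescence and peak movement velocity, then estimates a permutation P value by shuffling one variable across trials. The shuffling scheme, the two-sided comparison, the function names and the toy data are all assumptions made for illustration; the legend only states that a permutation test was used, and the authors' exact procedure is described in their Methods.

```python
import random

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman coefficient = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def permutation_p_value(x, y, n_shuffles=2000, seed=0):
    """Two-sided P value from shuffling y across trials (assumed scheme)."""
    rng = random.Random(seed)
    observed = abs(spearman(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if abs(spearman(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)

# Hypothetical toy data: one value per trial for a single cell.
peak_velocity = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
fluorescence = [0.1, 0.3, 0.2, 0.5, 0.6, 0.55, 0.9, 1.0]
rho = spearman(peak_velocity, fluorescence)
p_value = permutation_p_value(peak_velocity, fluorescence)
print(f"rho = {rho:.3f}, permutation P = {p_value:.4f}")
```

Adding one to both the hit count and the number of shuffles keeps the estimated P value away from exactly zero, which is a common convention for Monte Carlo permutation tests rather than something the legend specifies.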

Extended Data Figure 3 Granule cell reward responses during the operant task.

a, b, Fluorescence response of all granule cells recorded from three experiments in lobules VIa, VIb and simplex from one example mouse on rewarded trials and omitted reward trials. Each row shows the trial-averaged response of a single neuron. Dashed vertical lines denote the average forelimb movement midpoint; solid vertical lines denote time of reward delivery. Many more neurons appear to respond preferentially following omitted reward than reward delivery (n = 188 neurons). c–e, Average reward-aligned fluorescence of all reward-preferring cells (c), omitted-reward-preferring cells (d) and reward-anticipation cells (e), from all mice and lobules during the forelimb pushing movement task (n = 31 reward cells, 69 reward-omission cells, 50 reward-anticipation cells from 13 forelimb movement sessions in 6 mice). See Methods for cell identification criteria. f, g, Comparison of the cohort of mice that performed the operant task with briefer delay periods (f, n = 6 experiments in 3 mice with delay between the end of forelimb movement and reward delivery = 0.6 s and delay between reward delivery and manipulandum handle return = 2 s), or longer delay periods (g, n = 7 experiments in 3 mice with reward delay = 0.8 s and post-reward delay = 3.5 s). Top, prevalence of reward response types as fraction of total neurons (error bars denote counting error). Bottom, average movement and licking behaviour across mice on each task version. Results did not differ substantially between the two task versions and thus all data were pooled for all analyses aside from these figure panels. Across all mice, 50% of peak licking rise from baseline was reached in anticipation 0.8 ± 0.04 s before reward. Licking was prolonged following reward compared to omitted reward (P = 4 × 10⁻⁴ Wilcoxon rank-sum test, n = 6 mice; licking declined to half of its anticipatory level by 1.4 ± 0.14 s following reward compared to 0.7 ± 0.08 s following omitted reward). 
h, Venn diagram illustrating multiplexed representations in granule cells. Relative areas are true to observed cell proportions. Corresponding counting errors for reward-related cell classifications are provided in Fig. 2h. For forelimb speed cells, counting error was 1.7%. The prevalence of multiple representations in a granule cell matched predictions of independent probabilities of each representation (1.1% of cells encode reward and forelimb speed, 2% encode reward omission and forelimb speed, and 2.3% encode reward anticipation and forelimb speed, compared to the independence null hypothesis of 1.1%, 2.4%, and 1.8%, respectively).
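
The independence null hypothesis in panel h can be checked with simple arithmetic. The sketch below assumes that the cell counts quoted in the legends above (31 reward, 69 reward-omission and 50 reward-anticipation cells, and 111 forelimb speed cells) all share the denominator of 561 neurons, and that "counting error" means the binomial standard error of a proportion. Both are assumptions made for illustration, not statements from the source.

```python
import math

N_TOTAL = 561   # total neurons, from the Extended Data Figure 2 legend
N_SPEED = 111   # cells correlated with peak forelimb speed (P < 0.01)
REWARD_CLASS_COUNTS = {
    "reward": 31,
    "reward omission": 69,
    "reward anticipation": 50,
}

def independence_prediction(n_a, n_b, n_total):
    """Expected joint fraction if the two representations are independent."""
    return (n_a / n_total) * (n_b / n_total)

def counting_error(n_class, n_total):
    """Binomial standard error of a proportion (assumed definition)."""
    p = n_class / n_total
    return math.sqrt(p * (1 - p) / n_total)

predicted_pct = {
    name: 100 * independence_prediction(n, N_SPEED, N_TOTAL)
    for name, n in REWARD_CLASS_COUNTS.items()
}
speed_error_pct = 100 * counting_error(N_SPEED, N_TOTAL)

for name, pct in predicted_pct.items():
    print(f"{name} and forelimb speed: {pct:.1f}% expected under independence")
print(f"forelimb speed counting error: {speed_error_pct:.1f}%")
```

Under these assumptions the products come out near 1.1%, 2.4% and 1.8%, and the standard error near 1.7%, in line with the values stated in the legend.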

Extended Data Figure 4 Body movement does not explain reward signalling in granule cells.

We placed mice (n = 3) in a clear tube during imaging experiments and recorded video of their body movement from the right side and from underneath the animal (Supplementary Video 2). a, For an example mouse, we computed the average body trajectory for each trial type: omitted reward, and the 25 trials most similar or most dissimilar to omitted reward body motion (Methods). AP, anterior–posterior, DV, dorsal–ventral, ML, medial–lateral. Motion on reward-similar-to-omitted-reward trials more closely matched motion on omitted-reward trials than did motion on reward-dissimilar-to-omitted-reward trials. b–g, For reward cells, reward-omission cells and reward-anticipation cells, despite robust signalling of reward outcome (b, d, f), higher similarity of body trajectory on rewarded trials to that on omitted-reward trials did not result in cellular responses more similar to those on omitted-reward trials (c, e, g), n = 21 reward cells, 41 reward-omission cells, 10 reward-anticipation cells (from n = 201 total granule cells analysed from 3 mice). Therefore body movement is unlikely to be the cause of granule cell reward signalling. Dashed vertical lines denote average time of forelimb movement midpoint.

Extended Data Figure 5 Inter-trial interval analyses do not support the theory that reward-omission responses encode preparation for the next trial.

One alternative explanation for the response of reward-omission cells on omitted-reward trials is that, following a trial in which the mouse does not receive a reward, the mouse is more anxious to begin the next trial and therefore quickly begins preparing for the next forelimb movement. If reward-omission cells were actually just ‘next trial preparation cells’, then these putative earlier motor preparations on omitted-reward trials would elicit a larger response. That these cells exhibit on average no response following rewarded trials could reflect mice choosing to wait before preparing the next trial following reward delivery compared to omitted reward. We tested two predictions of this hypothesis. First, we reasoned that if, following a rewarded trial, mice choose to initiate the next trial very quickly, putative ‘next trial preparation cells’ should exhibit increased response, as they do following omitted reward. By contrast, on rewarded trials after which mice wait before initiating the next trial, the lack of motor preparations should result in a smaller response in ‘next trial preparation cells’. Second, if mice were substantially more anxious to initiate the next trial following omitted reward, inter-trial intervals (ITIs) following omitted reward trials should be shorter compared to ITIs following rewarded trials. a–d, To test the first prediction, we leveraged natural variability in mouse behaviour to identify rewarded trials after which mice initiated the next movement very quickly and therefore had the shortest ITI (the earliest time that the robot returns to permit the mouse to initiate the next trial is 2 or 3.5 s following the previous reward, each in 3 mice). For each imaging session, we identified groups of 25 rewarded trials with the longest ITIs and those with the shortest. 
These two groups of rewarded trials had substantially different ITIs, indicating that their next-trial-preparatory movements varied substantially (mean ITI for the ‘short’ group was 3.6 s, for the ‘long’ group 5.8 s, n = 13 sessions). Each line in a represents one imaging session. Despite the large difference in next-trial preparations in these two groups of trials, reward-omission cells remained silent in both cases, despite robust responses on omitted-reward trials (two cells from two example mice in b, c; b is the example cell from Fig. 2b, n = 97 rewarded and 25 omitted-reward trials; for c, n = 129 rewarded and 34 omitted-reward trials). Across all 69 identified reward omission cells (d), there was no tendency for a stronger response when mice initiated the next trial quickly compared to when they waited before doing so. Thus the prediction that putative ‘next trial preparation cells’ respond to earlier next-trial preparations was not borne out. e, To test the second prediction that mice were preparing the next trial more quickly following omitted-reward trials, thereby leading to greater preparatory movements encoded by putative ‘next trial preparation cells’, we grouped ITIs according to whether they followed rewarded or omitted-reward trials within each imaging session (indicated by each line). We found no consistent difference in how long mice chose to wait before initiating the next trial following either reward or omitted-reward trials (P = 0.93 Wilcoxon signed-rank test, n = 13 imaging sessions from 6 mice). Thus, the second prediction was also not borne out. Taken together, the selective response of reward-omission cells to omitted-reward trials is more likely to be related to reward than next-trial preparations.
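
The trial grouping used to test the first prediction can be sketched as follows: rewarded trials are sorted by the inter-trial interval that follows them, and the 25 shortest and 25 longest are compared. The function name, the returned structure and the synthetic data are hypothetical; only the group size of 25 comes from the legend.

```python
import random

def split_by_iti(itis, responses, group_size=25):
    """Compare the trials followed by the shortest and longest ITIs.

    itis[i] is the inter-trial interval (s) after rewarded trial i and
    responses[i] is that trial's cell response (for example, mean fluorescence).
    """
    if len(itis) != len(responses):
        raise ValueError("itis and responses must have the same length")
    if len(itis) < 2 * group_size:
        raise ValueError("need at least 2 * group_size trials for disjoint groups")
    order = sorted(range(len(itis)), key=lambda i: itis[i])
    groups = {"short": order[:group_size], "long": order[-group_size:]}
    summary = {}
    for label, idx in groups.items():
        summary[label] = {
            "mean_iti": sum(itis[i] for i in idx) / len(idx),
            "mean_response": sum(responses[i] for i in idx) / len(idx),
        }
    return summary

# Synthetic example: responses are unrelated to the ITI by construction.
rng = random.Random(1)
example_itis = [rng.uniform(3.0, 7.0) for _ in range(80)]
example_responses = [rng.gauss(0.0, 0.1) for _ in range(80)]
summary = split_by_iti(example_itis, example_responses)
for label in ("short", "long"):
    print(label, summary[label])
```

If a cell encoded preparation for the next trial, its mean response would be expected to be larger in the "short" group; the legend reports that no such tendency was observed across the 69 reward-omission cells.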

Extended Data Figure 6 Granule cell responses in alternate push-for-reward and pull-for-reward trials.

a, b, We identified reward (a) and reward-omission (b) cells based only on push-for-reward trials and computed their average response (top). We then computed the average response of these same cells on pull-for-reward trials (bottom) and found they were highly preserved (n = 23 reward omission and 30 reward cells from 4 mice). c, d, For comparison, we identified cells that responded to forelimb movement based only on push-for-reward trials (n = 25 pushing cells) and computed their average response (c, top). We then compared this to the average response of these cells on pull-for-reward trials (c, bottom) and found it was substantially weaker. Similarly, when we identified cells responsive to forelimb motion based only on pulling trials (d, bottom, n = 42 pulling cells) the response of these cells on pushing trials (top) was substantially weaker. This indicates that movement responses (c, d) are substantially less generalized across sensorimotor contexts than reward signalling (a, b). Dashed vertical lines indicate average time of forelimb pushing or pulling movement midpoint, solid lines denote time of reward.

Extended Data Figure 7 Granule cell ensembles discriminate reward outcome and decode behaviour.

a, We sought to discriminate reward from omitted-reward trials by linearly decoding ensemble granule cell activity. We first used lasso logistic regression to identify the minimal set of neurons that achieve optimal decoding accuracy for each imaging session. For this minimal set, we fitted a linear discriminant to the mean fluorescence from 0 to 1 s of each cell on each trial. We tabulated the discriminant’s cross-validated accuracy for each imaging session (dots). Red bars denote mean ± s.e.m. across sessions (n = 13 experiments in 6 mice; Methods). Dashed line denotes chance accuracy. Green dot denotes example session used in b and d. b, For an example imaging session, we applied the discriminant weighting to the time-varying cellular responses on each trial and averaged the output across all rewarded and omitted reward trials (n = 56 neurons, 64 rewarded trials, 19 omitted-reward trials). The large separation following reward versus reward-omission reflects accurate neural decoding. c, In general, the lasso determined that optimal cross-validated decoding was achieved with a minority of recorded cells. d, For the example session shown in b, we examined how cross-validated reward outcome decoding accuracy varied with the number of neurons included in the decoder, by varying the lasso penalty. We found that optimal performance was achieved with a subset of cells, indicating that larger groups of cells resulted in some overfitting (Methods). Error bars indicate s.e.m. from cross-validation. e, To determine the importance of reward-selective cells in decoding, we fitted linear discriminants while excluding reward-selective cells (>0.2 s.d. absolute fluorescence difference between reward conditions averaged from 0.1 to 1 s), as well as discriminants using only reward-selective cells. We compared these decoders’ performance to the optimal subset determined from lasso regression, and found that reward-selective cells recover most of the optimal decoder performance. 
Each line represents one imaging session (n = 13 sessions). f, We reasoned that if granule cells can signal the mouse’s reward anticipation, it should be possible to use neuronal activity to decode this anticipation on a moment-by-moment basis. We therefore defined the mouse’s instantaneous anticipation state to be its lick rate (in 200-ms bins) until it received reward, in which case we defined anticipation to decline to zero (Methods). For each imaging session, we performed a linear regression to approximate the mouse’s time-varying reward anticipation behaviour by using the time-varying fluorescence of all cells. We quantified regression performance as the R² fraction of variance in reward anticipation that was accounted for by the regression output (using cross-validation). Each dot denotes a single imaging session. Red bars denote average decoder performance. Green dot denotes example session used in g and h. g, For one example session, concurrence between decoded anticipation (top) and observed anticipation according to the definition in f (bottom), from a single imaging session averaged across all rewarded (blue) and omitted reward trials (red) (n = 26 neurons, 171 rewarded trials, 54 omitted-reward trials). h, For the example session in e, we performed a lasso regression that penalizes non-zero weights on cells, to restrict the number of cells used for decoding. We varied the penalty from zero to maximum in order to determine how accuracy scales with the number of cells (Methods). Reward anticipation decoding accuracy (using cross-validation) reached nearly asymptotic levels with typically 10–20 included neurons. Error bars indicate s.e.m. from cross-validation.
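
The anticipation state defined in panel f, and the R² used to score the decoder, can be sketched as below. The sketch assumes that anticipation drops to zero as a step at the moment of reward delivery and that omitted-reward trials follow the lick rate throughout; the legend says only that anticipation was defined "to decline to zero", so the step shape, the function names and the toy lick times are assumptions. Times are in integer milliseconds to avoid floating-point binning errors.

```python
def anticipation_trace(lick_times_ms, reward_time_ms, start_ms, end_ms, bin_ms=200):
    """Lick rate (licks/s) per bin, set to zero from reward delivery onward.

    Pass reward_time_ms=None for omitted-reward trials, in which case the
    trace follows the lick rate for the whole window.
    """
    n_bins = (end_ms - start_ms) // bin_ms
    counts = [0] * n_bins
    for t in lick_times_ms:
        idx = (t - start_ms) // bin_ms
        if 0 <= idx < n_bins:
            counts[idx] += 1
    trace = []
    for b, count in enumerate(counts):
        bin_start = start_ms + b * bin_ms
        if reward_time_ms is not None and bin_start >= reward_time_ms:
            trace.append(0.0)  # assumed step to zero once reward arrives
        else:
            trace.append(count * 1000 / bin_ms)
    return trace

def r_squared(observed, predicted):
    """Fraction of variance in `observed` accounted for by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical lick times (ms) relative to reward time at 0 ms.
licks = [-700, -500, -450, -100, 50, 300, 900]
rewarded = anticipation_trace(licks, 0, -1000, 1000)
omitted = anticipation_trace(licks, None, -1000, 1000)
print("rewarded:", rewarded)
print("omitted: ", omitted)
```

In a cross-validated setting, `r_squared` would be evaluated on held-out trials, with `predicted` coming from a regression fitted on the remaining trials. The regression itself is not shown here.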

Extended Data Figure 8 Granule cell reward responses during a Pavlovian tone–reward task.

a–c, Average reward-aligned fluorescence of all reward preferring cells (a), reward-omission cells (b) and reward-anticipation cells (c), from all mice and lobules during the tone–reward task (n = 23 reward, 42 reward omission, and 25 reward anticipation cells from 11 experiments in 5 mice). On average, reward anticipation neurons were silent following unexpected reward (P = 0.24 Wilcoxon signed-rank test; mean fluorescence change of –0.05 ± 0.05 s.d. comparing 0 – 1 s to –0.25 to –0.05 s relative to unexpected reward, n = 25 neurons). Reward-omission cells did not distinguish expected from unexpected reward (P = 0.48 Wilcoxon signed-rank test comparing mean fluorescence from 0 to 1 s, n = 42 reward omission neurons). Dashed vertical lines indicate time of tone onset. d, Scatter of response properties of individual neurons (coloured dots) showing reward preference (x axis) versus licking sensitivity (y axis) during the tone–reward task (n = 450 neurons). e, Single-trial correlation between licking and activity of each reward anticipation neuron either before reward delivery, after reward omission, or after reward delivery, averaged across all reward anticipation neurons during the Pavlovian task (n = 25 reward anticipation neurons from 11 experiments in 5 mice; P = 0.02 pre-reward, P = 0.015 post-omitted-reward, P = 0.72 post-reward; Wilcoxon signed-rank test). As during forelimb movements, reward-anticipation neurons correlate with licking only when licking represents anticipation. Following reward, when anticipation ceases, licking exerts no effect on activity. f, A subset of cells exhibited decreased fluorescence following the tone. To determine what these cells might be encoding, we identified all such neurons (Methods) and examined their response on the various trial types. We determined that these cells remain inhibited while the mouse is licking, beginning with anticipatory licking through reward consumption (n = 20 cells from 5 mice). 
Importantly, on unexpected reward trials, these neurons are also inhibited. This is unlike reward-anticipation cells in c that cease to be active following reward delivery and also remain silent on surprise reward trials. Thus cells inhibited by licking are more classically sensorimotor. g, Top, trials with a normal sized reward compared to randomly interspersed trials with a larger reward. Bottom, normal reward trials compared to trials with the most and least reward licking. h, Plot of each cell’s response difference between normal and large rewards (x axis) and preference for licking on normal reward trials (y axis). Dashed boxes indicate reward magnitude sensitive neurons without substantial licking sensitivity. Example cell from g is outlined. i, Each row shows the trial-averaged Ca²⁺ response of a single neuron. Cells in each panel (trial types indicated above) are ordered identically based on their response on rewarded trials (n = 135 neurons from three sessions in lobules VIa, VIb and simplex from an example mouse).

Extended Data Figure 9 Chronic imaging cell tracking and registration.

a–c, Magnified view of mean two-photon image from the regions shown in Fig. 4a on day 1 (a), day 4 (b) and day 6 (c). d, Colourized overlay of the images in a–c in red, blue and green, respectively. We rigidly aligned the mean fluorescence image on each day to that of the final day using TurboReg³⁶, resulting in unambiguous alignment of visible morphological features of individual granule cells. e, To quantify any ambiguity in the image registration we offset our images from optimal alignment by small amounts. For one example session, we quantified the image concordance of day 1 and day 6 as a function of displacing the day 1 image in the x and y directions relative to the registered optimum at zero (sum squared pixel differences between days, normalized to the registered optimum). There is a clear trough in the alignment error at the optimum, demonstrating that even slight, sub-micrometre misalignments are easily detected by image registration. Thus, there is little appreciable ambiguity in the alignment procedure. f, g, Average alignment error as a function of image displacement from the registered optimum, as in e, here averaged across all sessions and mice (n = 15 alignments from 3 mice). Error bars denote s.e.m. across alignments. Even the smallest, sub-micrometre, single-pixel displacements result in significantly higher alignment error than the registered optimum (P = 4.4 × 10⁻⁶ and 5.8 × 10⁻⁵ for one-pixel x and y misalignments, respectively, Wilcoxon signed-rank test). h, Mean fluorescence response of all neurons for the example mouse shown in Fig. 4c, here ordered by their day 1 activity peak response time (n = 97 neurons). i, j, Change over the 6 days of the imaging study in licking behaviour (i) and forelimb movement behaviour (j) for the mouse in h. Gross changes in motor behaviour were relatively modest over the days of the imaging study (Methods).
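
The displacement analysis in e–g can be sketched as follows. This is an illustrative Python example under stated assumptions, not the registration pipeline used in the study (which relied on TurboReg). The synthetic Gaussian "cells", the image size, the checkerboard perturbation and all function names are hypothetical. The sum of squared pixel differences is evaluated over a fixed central crop so that every displacement compares the same number of pixels, which is a choice made for this sketch.

```python
import math


def make_image(size, centers, sigma=2.0):
    """Synthetic mean image: a sum of Gaussian blobs standing in for cells."""
    return [
        [
            sum(
                math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
                for cx, cy in centers
            )
            for x in range(size)
        ]
        for y in range(size)
    ]


def sum_squared_diff(ref, img, dx, dy, margin):
    """Sum of squared pixel differences between ref and img displaced by
    (dx, dy), over a central crop that excludes `margin` pixels per edge."""
    height, width = len(ref), len(ref[0])
    total = 0.0
    for y in range(margin, height - margin):
        for x in range(margin, width - margin):
            diff = ref[y][x] - img[y - dy][x - dx]
            total += diff * diff
    return total


def alignment_error_map(ref, img, max_shift):
    """Alignment error for each displacement, normalized to the error at
    zero displacement (assumed to be the registered optimum and non-zero)."""
    baseline = sum_squared_diff(ref, img, 0, 0, max_shift)
    return {
        (dx, dy): sum_squared_diff(ref, img, dx, dy, max_shift) / baseline
        for dy in range(-max_shift, max_shift + 1)
        for dx in range(-max_shift, max_shift + 1)
    }


# Hypothetical data: "day 6" reference and an already registered "day 1"
# image that differs only by a small checkerboard perturbation.
day6 = make_image(24, [(6, 7), (15, 12), (10, 18)])
day1 = [
    [value + 0.01 * ((x + y) % 2) for x, value in enumerate(row)]
    for y, row in enumerate(day6)
]
error_map = alignment_error_map(day6, day1, max_shift=2)
best_shift = min(error_map, key=error_map.get)
```

With these toy images the normalized error equals 1 at zero displacement and rises for every non-zero shift, which is the kind of trough described for panel e.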

Extended Data Figure 10 Granule cell reward responses probably do not result from a direct midbrain dopaminergic projection to the cerebellar cortex.

Previous literature on the topic of dopamine in the cerebellum has been controversial, with some anatomical tracing studies suggesting a projection to cerebellar cortex from ventral tegmental area (VTA)^38,39, while others failed to find such a projection⁴⁰. Some studies identified the presence of dopamine in the cerebellar cortex directly^41,42,43, yet a major confound arises owing to the large noradrenergic projection to the cerebellum from the locus coeruleus, as dopamine is a precursor to noradrenaline⁴⁴. To determine whether our widespread reward-related signals were likely to be driven by a direct dopaminergic projection, we traced the inputs to the cerebellar cortex using viral methods. a, Schematic. We injected CAV-cre, Cre recombinase expressed from canine adenovirus-2 known to robustly infect axons and their terminals in many neuronal types⁴⁵ including dopaminergic neurons specifically^46,47, into the cerebellar cortex of a highly sensitive Cre-reporter Ai14 transgenic mouse. Thus any neuron in a region presynaptic to the cerebellar injection site infected by CAV2 will express tdTomato. We injected either the vermis of lobule VI (n = 3 mice) or for comparison the hemisphere lobule Crus I (one mouse). b, We stained serial coronal brain sections for tyrosine hydroxylase (TH, a marker for dopaminergic neurons) and examined the distribution of input cells in the midbrain. In all four mice examined (sixty-four 40- or 60-μm sections encompassing all midbrain dopamine neurons), we did not find any VTA or substantia nigra pars compacta (SNc) dopamine neurons projecting to the cerebellar cortex. As a positive control, we noted that all mice exhibited robust tdTomato expression in known inputs to the cerebellum, such as the pontine nuclei. 
To exclude the unlikely possibility that putative VTA dopamine neurons that project to the cerebellum cannot take up CAV2 efficiently, we also performed an experiment where we injected AAVretro-EF1a-FLPo, a virus that robustly infects axonal terminals⁴⁸, into cerebellar lobule VI of a mouse that expresses FLP-dependent tdTomato, and again did not find tdTomato⁺ neurons in the VTA or SNc, but abundant tdTomato⁺ neurons in pontine nuclei (data not shown). Thus if a direct midbrain dopaminergic projection to the cerebellum exists, it must be very sparse, and therefore unlikely to drive the very large and widespread reward-related signals in our granule cell imaging data.

Supplementary information

Supplementary Information

This file contains Supplementary Text 1-2, additional references and Supplementary Table 1. (PDF 206 kb)

Example two-photon Ca²⁺ imaging of cerebellar granule cells during a forelimb movement task

The video is 4x temporally down-sampled from the 13.5-Hz acquisition rate. (MP4 26904 kb)

Example body motion tracking during a forelimb movement task

Side view was used to track right forepaw and base of tail motion while bottom view was used to track motion of each hind paw. (MP4 22279 kb)

About this article

Cite this article

Wagner, M., Kim, T., Savall, J. et al. Cerebellar granule cells encode the expectation of reward. Nature 544, 96–100 (2017). https://doi.org/10.1038/nature21726

Received: 27 August 2016
Accepted: 02 March 2017
Published: 20 March 2017
Issue Date: 06 April 2017
DOI: https://doi.org/10.1038/nature21726
