Cerebellar disruption impairs working memory during evidence accumulation

To select actions based on sensory evidence, animals must create and manipulate representations of stimulus information in memory. Here we report that during accumulation of somatosensory evidence, optogenetic manipulation of cerebellar Purkinje cells reduces the accuracy of subsequent memory-guided decisions and causes mice to downweight prior information. Behavioral deficits are consistent with the addition of noise and leak to the evidence accumulation process. We conclude that the cerebellum can influence the accurate maintenance of working memory.

One might expect the perturbations to change the reaction time statistics, but no difference in mean reaction time between control and laser was observed in the pooled data. Did any individual animals exhibit reaction time differences? The standard deviation is rather large; does this reflect differences between animals? Furthermore, the standard deviation appears to be higher for laser (332 ms) than control trials (222 ms). It would be useful to see the mean and standard deviation of the reaction time for control and laser trials for each individual animal.
How well did the logistic regression model predict behavior? The authors should assess this on a subset of data not used to fit the model. More details should be provided for the logistic regression analysis, and the model should be written explicitly in the methods. Were three independent variables (number of peri-cue bins) used, and if so, what were the bin edges? The caption for fig. 1e states that success rate is proportional to the sum of the logistic regression weights; why is this true? It would be useful to see the weights for all animals plotted individually for fig. 2 and 1e. Finally, please label the x axis at the three plotted points, rather than 0 and 0.25 s.
Some aspects of the drift-diffusion model should be explained and justified more thoroughly:
(1) The goodness-of-fit of the drift-diffusion model should be characterized in some way, and should be assessed on a test set not used to fit the model.
(2) Is left/right bias the difference between ηR and ηL, or is there an additional bias parameter? If the latter is the case, it should be included in the equation on line 360.
(3) What is the probability of a rightward lick if a(t) > 0 at the end of the trial? It appears to be 1-.5*lapse, but this should be stated.
(4) Why is a lapse rate parameter needed? It seems that larger values of this parameter reflect a weaker influence of a(t) on behavior. So why not fit the model without a lapse rate and see if the goodness-of-fit is poorer in the perturbation conditions?
(5) If the model was fit using the BPupsModel package, the github page should be linked. The fitting procedure should be explained in detail.

Minor points
In fig. 1b (or the supplement), please show a similar panel for laser-on trials.
In fig. 1c (or the supplement), the authors should also show laser-on vs laser-off as a scatter plot, in addition to showing the difference.
From the discussion: "cerebellar disruption impaired the noise and stability." Would a different word than "stability" be better, as λ < 0 will increase (not decrease) the stability of a(t) around 0?
Reviewer #2: Remarks to the Author:
This ms follows up on a recent study by the same lab that describes a role for the lateral cerebellum in an evidence accumulation task in mice. Whereas the past study inactivated lateral cerebellum w muscimol, the present study takes advantage of temporal control afforded by optogenetics to attempt to gain insights into the specific role of the cerebellum in this task. The issue of cerebellar roles in non-motor function is timely and of general interest. Because similar decision-making tasks have been well studied in the context of numerous forebrain structures in rodents, the present study could (potentially) elucidate interactions between the cerebellum and forebrain.
1. An important methodological concern is whether the effects of the manipulation are truly related to evidence accumulation or other decision-related processes versus a host of other possibilities more in line with traditional roles of cerebellum in sensori-motor functions. The authors argue that motor behavior is unaffected by the stimulation but they should also check whether the optogenetic stimulation has effects on the movement and position of the whiskers during the stimulation, as might be expected based on previous studies (for example from the Lena group). Video monitoring of whiskers was performed previously for muscimol inactivation but should be reported again here for the optogenetic stimulation.
Another recent study (Brown and Raman) shows that air-puffs evoke reflexive whisker protraction in awake mice. Moreover, opto activation of Purkinje cells alters such movements (as well as firing of PCs). If both the stimulus used and the opto manipulation are engaging/altering whisker sensorimotor loops, the interpretation of the results could be very different (i.e. the mice are not just passively integrating sensory evidence).
2. A recent paper (Gao et al. Nature) reports a necessary role for the medial cerebellar nucleus in a whisker-based working memory task. The paper shows that the lateral nucleus is not needed. How does this square with the present results? Presumably, the cortical area activated here projects to the lateral nucleus. A weakness of the present study is that no attempts were made to map the cerebellar regions involved in the task. As far as I can tell, the previous paper from the same group did not do this either. Especially in light of the Gao paper, what, if anything, can we conclude regarding how different regions of the cerebellum might contribute to the authors' task? The interpretation would be different if activating wide swaths of the cerebellum had exactly the same effect.
3. I have some questions about the main conclusions. The previous paper ended with: "For example, output signals from the cerebellum may be combined with signals in sensory circuits to control the input gain of sensory information into accumulators elsewhere in the brain. This model would be consistent with observations of cerebellar involvement in gating sensory information (Apps et al., 1997;Ozden et al., 2012) and inputs to working memory (Baier et al., 2014;Sobczak-Edmans et al., 2016). In a second possibility, the cerebellum may modulate dynamics of the accumulation process. Finally, cerebellar signals may modulate activity that converts the accumulator value into a decision. Such a post-categorization influence has been observed in prefrontal regions during evidence accumulation (Erlich et al., 2015). Detailed inactivation studies with high spatial and temporal precision can resolve these alternatives." I feel that the present ms should give me the answer but I am not sure what it is. This could be a problem with the writing, the experiments, the reviewer or some combination. Perhaps the authors can clarify?
More specifically, I am not sure why in the model-based analysis of figure 3a the experiments in which light was only given during part of the cue period are not included? The authors seem to interpret the analysis in fig. 3 to indicate impairment in integration but earlier in the paper they say that surprisingly, integration is fine when the light is on. How are these consistent? Also, Figure 3 shows a change in the lapse rate in both experimental conditions, however, the authors seem to ignore this, concluding that the deficits in the two conditions are caused by separate processes.
4. I found the discussion section of the paper too brief and too vague. Given the large amount of past work in this area, I see this as a major deficiency. Readers will want to understand how the cerebellum fits into existing schemes regarding the neural basis for decision making. Presumably, a major motivation of the study is to understand the interrelations between the cerebellum and forebrain regions, motivating the use of a task that has been heavily studied in the context of the neocortex, as well as the striatum. Vague statements such as "The behavioral effects we characterized here were not observed with perturbations of other brain regions in similar paradigms 3-5,7 ." should be expanded upon such that the reader can understand in reasonable detail how the present results compare to ostensibly very similar manipulations performed in very similar tasks in other brain regions, including PPC, FOF, and dorsal striatum. Even if the answers are not clear, stating clearly what the remaining questions are will be helpful.
The authors should also comment on how the result of the optogenetic activations used here compare to results they obtained previously using muscimol. This is an important issue given many recent results showing that different manipulations can give unexpectedly different results in the same task.

Response to Reviews
We thank the reviewers for their thoughtful and constructive critiques. Below we address each point, with reviewer text in bold and author responses in blue.

Reviewer #1 (Remarks to the Author):
In this manuscript, Deverett et al. use optogenetic perturbations in a left-right discrimination task to study the role of the cerebellum in evidence accumulation.

They demonstrate that activation of Purkinje cells in the hemispheres on either side during the cue sampling period disrupts performance in the task. Activation during a subinterval of the cue epoch selectively disrupts the effect of cues that occur prior to this interval. Fitting a drift-diffusion model to the data suggests that cue-period inactivation increases the noise in the decision signal and increases the leak, while delay-period inactivation disrupts the effect of this decision signal on the behavioral choice.
In general, the experiments are well-designed and the figures presented clearly. The results represent an advance over the authors' previous work, which used pharmacology and calcium imaging in Purkinje cells (ref. 17), but did not reveal the trial- or trial-epoch-specific effects of cerebellar perturbation.

Major points
How frequently did animals fail to lick in either direction? The authors should indicate, for each condition, the fraction of trials with no licks. Is it the case that the "Fraction R choices" plots (e.g. fig. 1d) display data only from trials in which licks occurred?
We have measured the percentage of lick trials (as a fraction of lick plus no-lick trials) and report this in lines 79-81; the percentage of trials with no licks is equal to 100 minus this number. As we note in the text, the no-lick trial rates were not significantly affected by the perturbation. (The percentage of no-lick trials for the sub-cue-period perturbations was 0.4±0.5%, 0.5±0.4%, 1.0±0.8%; mean±sd for first-third, middle-third, last-third, respectively.) It is indeed the case that all of our analyses used exclusively trials in which a decision lick was made. We have specified this in the Methods section on lines 427-429.
One might expect the perturbations to change the reaction time statistics, but no difference in mean reaction time between control and laser was observed in the pooled data. Did any individual animals exhibit reaction time differences? The standard deviation is rather large; does this reflect differences between animals? Furthermore, the standard deviation appears to be higher for laser (332 ms) than control trials (222 ms). It would be useful to see the mean and standard deviation of the reaction time for control and laser trials for each individual animal.
The light-on and light-off reaction time data are now shown for each individual subject in the new Supplementary Fig. 5. Reaction time distributions are largely overlapping in all animals, and one animal exhibited increased reaction time with light delivery.
How well did the logistic regression model predict behavior? The authors should assess this on a subset of data not used to fit the model.
We have now added a cross-validated evaluation of logistic regression model performance (lines 71-72, 445-447). The regression model predicts animal choice with approximately 75% and 58% accuracy for light-off and light-on trials, respectively.

More details should be provided for the logistic regression analysis, and the model should be written explicitly in the methods. Were three independent variables (number of peri-cue bins) used, and if so, what were the bin edges?
We have now written out the model on lines 440-445 and included these details. Indeed three independent variables were used, comprising three temporally uniform bins spanning the 3.8-second cue period (i.e. bin edges of 0-1.27 s, 1.27-2.53 s, and 2.53-3.8 s).
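The binning described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each regressor is the difference between right and left cue counts within a bin (the response does not state the regressor definition), and `bin_cue_counts` and the example cue times are hypothetical.

```python
import numpy as np

CUE_PERIOD_S = 3.8  # cue period duration stated in the response
N_BINS = 3          # three temporally uniform bins

def bin_cue_counts(right_times, left_times, n_bins=N_BINS, duration=CUE_PERIOD_S):
    """Return (#right - #left) cue counts per uniform time bin, plus the bin edges.

    The right-minus-left regressor is an assumption made for illustration.
    """
    edges = np.linspace(0.0, duration, n_bins + 1)
    n_right, _ = np.histogram(right_times, bins=edges)
    n_left, _ = np.histogram(left_times, bins=edges)
    return n_right - n_left, edges

# Hypothetical trial: four right puffs and two left puffs (times in seconds).
x, edges = bin_cue_counts([0.2, 1.5, 3.0, 3.5], [0.9, 2.0])
print(np.round(edges, 2))  # bin edges, matching the 1.27 s and 2.53 s boundaries
print(x)                   # one regressor value per bin
```

A vector like `x` would then be passed, trial by trial, to whichever logistic regression routine is used for fitting.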
The caption for fig. 1e states that success rate is proportional to the sum of the logistic regression weights; why is this true?
We demonstrate this point in Figure R1 below. This relationship is related to the fact that the regression model considers choice as a weighted sum of inputs, transformed by a logit. Because the data are in the approximately linear range of the logit function, the statement about proportionality holds. We have now included these plots in the new Supplementary Fig. 7.
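The linear-range argument can be made explicit with a first-order expansion of the logistic function. This is a sketch under an assumed model form: the symbols $w_i$ (bin weights), $x_i$ (binned evidence), and $w_0$ (offset) are introduced here for illustration and are not taken from the manuscript.

```latex
P(\text{right}) = \sigma(z), \qquad
z = w_0 + \sum_{i=1}^{3} w_i x_i, \qquad
\sigma(z) = \frac{1}{1 + e^{-z}}

% Taylor expansion of the logistic function about z = 0:
\sigma(z) = \frac{1}{2} + \frac{z}{4} + O(z^{3})

% For a trial with equal evidence x in every bin and small |z|:
P(\text{right}) - \frac{1}{2} \approx \frac{1}{4}\Bigl(w_0 + x \sum_{i=1}^{3} w_i\Bigr)
```

Under these assumptions, the deviation of choice probability from chance grows linearly with the summed weights as long as $|z|$ stays small, which is the regime the response refers to.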
Finally, please label the x axis at the three plotted points, rather than 0 and 0.25 s.
Some aspects of the drift-diffusion model should be explained and justified more thoroughly:
(1) The goodness-of-fit of the drift-diffusion model should be characterized in some way, and should be assessed on a test set not used to fit the model.
We have now characterized the goodness of fit of the drift-diffusion model using a cross-validated procedure (lines 511-514), which is presented in Supplementary Table 1. Specifically, for each of 1000 fit repetitions on subsets (80%) of the data, we used the resulting best-fit parameters to predict animal choice in the remaining 20% of the data not used to fit the model. Based on this procedure, the model predicted animal choice with approximately 73% accuracy. We also include in the same table the Bayesian Information Criterion (lines 514-522) measure for the sake of comparison to other model options.
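The repeated 80/20 procedure described above can be sketched generically. This is an illustrative outline only: `repeated_holdout_accuracy` is a hypothetical helper, and the threshold model and synthetic data are toy stand-ins for the drift-diffusion fit, which is not reproduced here.

```python
import numpy as np

def repeated_holdout_accuracy(X, y, fit, predict, n_reps=1000, train_frac=0.8, seed=0):
    """Mean held-out choice-prediction accuracy over repeated random splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(round(train_frac * n))
    scores = np.empty(n_reps)
    for i in range(n_reps):
        order = rng.permutation(n)
        train, test = order[:n_train], order[n_train:]
        params = fit(X[train], y[train])  # fit on 80% of trials
        scores[i] = np.mean(predict(params, X[test]) == y[test])  # score on the other 20%
    return scores.mean()

# Toy stand-in model (hypothetical): choose right when evidence exceeds a fitted threshold.
def fit_threshold(X, y):
    # midpoint between mean evidence on left-choice and right-choice trials
    return 0.5 * (X[y == 0].mean() + X[y == 1].mean())

def predict_threshold(threshold, X):
    return (X > threshold).astype(int)

# Synthetic data: choices follow noisy evidence.
rng = np.random.default_rng(1)
evidence = rng.normal(size=500)
choice = (evidence + rng.normal(size=500) > 0).astype(int)
acc = repeated_holdout_accuracy(evidence, choice, fit_threshold, predict_threshold, n_reps=200)
print(f"held-out accuracy: {acc:.2f}")
```

Passing `fit` and `predict` as callables keeps the splitting logic independent of the model, so the same loop could wrap a regression or a drift-diffusion fit.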

(2) Is left/right bias the difference between ηR and ηL, or is there an additional bias parameter? If the latter is the case, it should be included in the equation on line 360.
Bias is indeed a separate parameter; whereas ηR and ηL (along with σs) parameterize the noise associated with each individual pulse of evidence, bias is a time-independent constant offset in the value of a.
Because this offset does not contribute to da/dt but rather to a in a time-invariant manner, we do not include it in the equation mentioned. The meaning of the parameter can be expressed in this form:
We include the bias parameter in our fits for the sake of comparisons with other applications of this drift-diffusion model, and we have now also included the results of fitting the model without the bias parameter to demonstrate that it does not influence our study conclusions (Supplementary Table 1).
(3) What is the probability of a rightward lick if a(t) > 0 at the end of the trial? It appears to be 1-.5*lapse, but this should be stated.
We now state this on lines 497-498.
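The reviewer's reading of the lapse parameter can be written as a small function. This is a sketch under the assumption that, on a fraction `lapse` of trials, the animal chooses left or right with equal probability regardless of a(t); the function name `p_right` is hypothetical, and the bias offset is omitted for simplicity.

```python
def p_right(a_final, lapse):
    """P(rightward lick) given the final accumulator value, under the assumed lapse rule."""
    if not 0.0 <= lapse <= 1.0:
        raise ValueError("lapse must lie in [0, 1]")
    if a_final > 0:
        return 1.0 - 0.5 * lapse  # follows the accumulator unless a lapse picks left
    if a_final < 0:
        return 0.5 * lapse        # chooses right only through a lapse
    return 0.5                    # tie: no evidence either way

print(p_right(1.3, lapse=0.2))
print(p_right(-1.3, lapse=0.2))
```

With `lapse=0` the choice is fully determined by the sign of a(t), and with `lapse=1` it is at chance, which matches the reviewer's point that larger lapse values weaken the influence of a(t) on behavior.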

(4) Why is a lapse rate parameter needed? It seems that larger values of this parameter reflect a weaker influence of a(t) on behavior. So why not fit the model without a lapse rate and see if the goodness-of-fit is poorer in the perturbation conditions?
We have now also fitted the model without a lapse rate parameter (results in Supplementary Table 1). Indeed the model without the lapse rate parameter more poorly predicts choice in the perturbation conditions than in the light-off condition.
Relative to the full model, cross-validated choice prediction is indistinguishable in each condition, while BIC is generally slightly improved as a result of a simpler model with one fewer parameter. While our main conclusions are not strongly altered by this, we agree that it is important to include all of this information and we now do so.
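The BIC comparison above follows from the standard definition, BIC = k ln(n) - 2 ln(L), where lower values are better. The sketch below uses hypothetical log-likelihoods, parameter counts, and trial counts purely to show why removing one parameter can lower BIC when the fit quality barely changes; none of these numbers come from the manuscript.

```python
import math

def bic(log_likelihood, n_params, n_trials):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_trials) - 2.0 * log_likelihood

# Hypothetical values: the reduced model fits almost as well with one fewer parameter.
full = bic(log_likelihood=-520.0, n_params=9, n_trials=1000)
reduced = bic(log_likelihood=-521.0, n_params=8, n_trials=1000)
print(f"full: {full:.1f}, reduced: {reduced:.1f}")
```

Here the complexity penalty saved by dropping a parameter, ln(1000) or about 6.9, outweighs the small loss in log-likelihood, so the reduced model has the lower BIC.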
(5) If the model was fit using the BPupsModel package, the github page should be linked. The fitting procedure should be explained in detail.
The GitHub page has now been linked and the fitting procedure explained on lines 501-510.

Minor points
In fig. 1b (or the supplement), please show a similar panel for laser-on trials.
We now include this panel in the new Supplementary Fig. 4b.
In fig. 1c (or the supplement), the authors should also show laser-on vs laser-off as a scatter plot, in addition to showing the difference.
We now include this panel in the new Supplementary Fig. 4a.
From the discussion: "cerebellar disruption impaired the noise and stability." Would a different word than "stability" be better, as λ < 0 will increase (not decrease) the stability of a(t) around 0?
We thank the reviewer for noting this confusing terminology and we have changed the phrasing from "impaired the noise and stability" to "impaired the noise and persistent time course" for clarity (lines 145-146).

Reviewer #2 (Remarks to the Author):
This ms follows up on a recent study by the same lab that describes a role for the lateral cerebellum in an evidence accumulation task in mice. Whereas the past study inactivated lateral cerebellum w muscimol, the present study takes advantage of temporal control afforded by optogenetics to attempt to gain insights into the specific role of the cerebellum in this task. The issue of cerebellar roles in non-motor function is timely and of general interest. Because similar decision-making tasks have been well studied in the context of numerous forebrain structures in rodents, the present study could (potentially) elucidate interactions between the cerebellum and forebrain.
1. An important methodological concern is whether the effects of the manipulation are truly related to evidence accumulation or other decision-related processes versus a host of other possibilities more in line with traditional roles of cerebellum in sensorimotor functions. The authors argue that motor behavior is unaffected by the stimulation but they should also check whether the optogenetic stimulation has effects on the movement and position of the whiskers during the stimulation, as might be expected based on previous studies (for example from the Lena group). Video monitoring of whiskers was performed previously for muscimol inactivation but should be reported again here for the optogenetic stimulation.

Another recent study (Brown and Raman) shows that air-puffs evoke reflexive whisker protraction in awake mice. Moreover, opto activation of Purkinje cells alters such movements (as well as firing of PCs). If both the stimulus used and the opto manipulation are engaging/altering whisker sensorimotor loops, the interpretation of the results could be very different (i.e. the mice are not just passively integrating sensory evidence).
We agree that the effects of movement and sensory processing are important. In response to the reviewer's concern, we have now performed video monitoring of whiskers and included it in Supplementary Fig. 10. Our results are consistent with those of Brown and Raman 2018 (their Figure 4) and of Proville et al. 2014 (their Figure 6), which show a transient effect on some parameters of whisker movement following crus I Purkinje-cell stimulation. Specifically, like their results, we observed brief increases in whisker movement following the onset and offset of light delivery (Supplementary Fig. 10). We hypothesize that these are attributable to attentional shifts associated with light delivery (comparable to that which we observe under normal circumstances at the start and end of the cue period), possibly in combination with transient modulations of the whisker system. We comment on these findings on lines 163-173.
We thank the reviewer for noting this point and we believe the alignment of our new analysis with previous reports is important. These results do not significantly alter our interpretation of the memory-related effects we have characterized. We think the fact that our perturbation does not prevent animals from utilizing evidence presented during or following the light, and that it has effects on stimuli delivered seconds prior to light onset, is best explained using the models we have proposed, rather than as a function of the transient whisker movement modulations observed. In our view, cerebellar involvement in sensorimotor loops is a well-characterized finding that sets the stage for a role in working memory.

2. A recent paper (Gao et al. Nature) reports a necessary role for the medial cerebellar nucleus in a whisker-based working memory task. The paper shows that the lateral nucleus is not needed. How does this square with the present results? Presumably, the cortical area activated here projects to the lateral nucleus. A weakness of the present study is that no attempts were made to map the cerebellar regions involved in the task. As far as I can tell, the previous paper from the same group did not do this either. Especially in light of the Gao paper, what, if anything, can we conclude regarding how different regions of the cerebellum might contribute to the authors' task? The interpretation would be different if activating wide swaths of the cerebellum had exactly the same effect.
We agree that the question of the distinct roles of cerebellar regions is important, and systematic optical perturbation studies with laser-scanning systems are likely to shed light on this.
Gao et al. did find preparatory neural activity in the lateral nucleus, but found that lesions did not affect choices or movement. We think this is likely explained by important differences in our tasks: consistent with its proposed roles in higher cognitive function and relationship to prefrontal cortex, the dentate nucleus may be needed for manipulation of working memory contents, such as in the accumulation of evidence in our task. Since the task in Gao et al. does not put strong demands on working memory (substantially shorter memory durations involving a single unchanging stimulus), dentate activity may not have been crucial to execution (while fastigial nucleus may play a coarser motor-preparatory role, consistent with proposed motor roles for more medial cerebellar structures).
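We have added discussion of this topic on lines 210-216.

3. I have some questions about the main conclusions. The previous paper ended with: "For example, output signals from the cerebellum may be combined with signals in sensory circuits to control the input gain of sensory information into accumulators elsewhere in the brain. This model would be consistent with observations of cerebellar involvement in gating sensory information (Apps et al., 1997;Ozden et al., 2012) and inputs to working memory (Baier et al., 2014;Sobczak-Edmans et al., 2016). In a second possibility, the cerebellum may modulate dynamics of the accumulation process. Finally, cerebellar signals may modulate activity that converts the accumulator value into a decision. Such a post-categorization influence has been observed in prefrontal regions during evidence accumulation (Erlich et al., 2015). Detailed inactivation studies with high spatial and temporal precision can resolve these alternatives." I feel that the present ms should give me the answer but I am not sure what it is. This could be a problem with the writing, the experiments, the reviewer or some combination. Perhaps the authors can clarify?
We have added text explicitly linking our previous statements to the results of the current study (lines 181-187). In short, our current results suggest that the cerebellar involvement is not related to input gain (since regression weights are normal during light delivery), but that it does modulate the dynamics of the accumulation process (i.e. the noise and leakiness results). We further elaborate on this below in response to reviewer question 3c.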
(b) More specifically, I am not sure why in the model-based analysis of figure 3a the experiments in which light was only given during part of the cue period are not included?
We have now included the best-fit drift-diffusion model parameters for the sub-cue-period perturbation trials (Table 1). We do not focus on this analysis because the drift-diffusion model we used is not ideally suited for parameter values that change throughout the cue period of a trial, which is likely to occur in the sub-cue-period perturbation, especially given the effects seen in the regression analyses. Nevertheless, we include these fits to demonstrate for consistency that the primary feature of the middle- and last-third perturbation is picked up as leakiness, i.e. failure to sufficiently weight prior evidence in the trial.
(c) The authors seem to interpret the analysis in fig. 3 to indicate impairment in integration but earlier in the paper they say that surprisingly, integration is fine when the light is on. How are these consistent?
We use "evidence integration" to refer to the entire process of both (a) maintaining evidence in memory and (b) adding new evidence to memory (one cannot integrate without both of these functions intact).
Our statement earlier in the paper, "mice had no difficulty using the evidence presented concurrent with light delivery," relates only to the latter function, whereas we subsequently show that the former function is impaired.
It is therefore a subset of integration functions that is impaired, and that subset is related to the stability of the existing memory, specified by the parameters σa and λ (but not σs, which is related to the addition of new information into memory; this is consistent with the earlier mention of unimpaired usage of evidence presented concurrently with light).
Also, Figure 3 shows a change in the lapse rate in both experimental conditions, however, the authors seem to ignore this, concluding that the deficits in the two conditions are caused by separate processes.
We focused on σa and λ because they were significantly affected at the p=0.05 level.
We nevertheless now comment on the possible rise in lapse rate in the cue-period-perturbation condition (lines 146-151, 187-190).
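The distinction drawn above between memory parameters (σa, λ) and the input parameter (σs) can be illustrated with a small simulation. This is a sketch under assumptions, not the fitted model: it uses a simplified accumulator of the form da = λ·a·dt + σa·dW plus unit-sized pulses with per-pulse noise σs, and it omits bias, lapse, and any other terms of the manuscript's equation. The function name and all values are hypothetical.

```python
import numpy as np

def final_accumulator(pulse_times, pulse_signs, lam, sigma_a=0.0, sigma_s=0.0,
                      duration=3.8, dt=0.001, seed=0):
    """Euler simulation of a simplified leaky accumulator; returns a at trial end."""
    rng = np.random.default_rng(seed)
    n_steps = int(round(duration / dt))
    pulse_steps = np.round(np.asarray(pulse_times) / dt).astype(int)
    signs = np.asarray(pulse_signs)
    a = 0.0
    for step in range(n_steps):
        # memory dynamics: leak (lam < 0) or instability (lam > 0), plus diffusion noise
        a += lam * a * dt + sigma_a * np.sqrt(dt) * rng.normal()
        # input: each pulse adds +1 (right) or -1 (left), scaled by per-pulse sensory noise
        for sign in signs[pulse_steps == step]:
            a += sign * (1.0 + sigma_s * rng.normal())
    return a

# One rightward pulse, delivered early or late, with a leaky memory and no noise.
early = final_accumulator([0.5], [+1], lam=-1.0)
late = final_accumulator([3.3], [+1], lam=-1.0)
print(f"early pulse: {early:.3f}, late pulse: {late:.3f}")
```

With λ < 0 the early pulse has decayed far more than the late one by the end of the trial, which corresponds to the downweighting of prior evidence described in the response, while the size of each pulse at the moment of delivery is unaffected.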
4. I found the discussion section of the paper too brief and too vague. Given the large amount of past work in this area, I see this as a major deficiency. Readers will want to understand how the cerebellum fits into existing schemes regarding the neural basis for decision making. Presumably, a major motivation of the study is to understand the interrelations between the cerebellum and forebrain regions, motivating the use of a task that has been heavily studied in the context of the neocortex, as well as the striatum. Vague statements such as "The behavioral effects we characterized here were not observed with perturbations of other brain regions in similar paradigms 3-5,7 ." should be expanded upon such that the reader can understand in reasonable detail how the present results compare to ostensibly very similar manipulations performed in very similar tasks in other brain regions, including PPC, FOF, and dorsal striatum. Even if the answers are not clear, stating clearly what the remaining questions are will be helpful.
We have now added discussion of how our cerebellar findings align with previous reports from decision-making studies in other brain regions (lines 191-201). These include explicit comparisons between our results and those from similar previous studies.
The authors should also comment on how the result of the optogenetic activations used here compare to results they obtained previously using muscimol. This is an important issue given many recent results showing that different manipulations can give unexpectedly different results in the same task.
We now comment on the relationship between our optogenetic and muscimol results (lines 179-187). Compression of the psychometric curve was observed in both perturbations, while most of the other effects characterized in this study were not possible to assess in the pharmacological study due to the lack of temporal resolution and perturbation trial counts.