Laminar differences in decision-related neural activity in dorsal premotor cortex

Dorsal premotor cortex is implicated in somatomotor decisions. However, we do not understand the temporal patterns and laminar organization of decision-related firing rates in dorsal premotor cortex. We recorded neurons from dorsal premotor cortex of monkeys performing a visual discrimination task with reaches as the behavioral report. We show that these neurons can be organized along a bidirectional visuomotor continuum based on task-related firing rates. “Increased” neurons at one end of the continuum increased their firing rates ~150 ms after stimulus onset and these firing rates covaried systematically with choice, stimulus difficulty, and reaction time—characteristics of a candidate decision variable. “Decreased” neurons at the other end of the continuum reduced their firing rate after stimulus onset, while “perimovement” neurons at the center of the continuum responded only ~150 ms before movement initiation. These neurons did not show decision variable-like characteristics. “Increased” neurons were more prevalent in superficial layers of dorsal premotor cortex; deeper layers contained more “decreased” and “perimovement” neurons. These results suggest a laminar organization for decision-related responses in dorsal premotor cortex.


#### Some detailed comments
Line 15 - "... are not well understood": This phrasing has unfortunately become a universal technique for general motivation of any paper. I wish the authors could be a bit more specific in the abstract as to the hypothesis that they are testing. Your paper is about the laminar organization. So something like "we have previously described the response properties of neurons in PMd using techniques that did not allow us to identify the laminar location... "

Lines 19-25: On line 19 "firing rates of PMd neurons are organized ...." but at line 24 "units at the center". Is it a continuum of units or of firing rates? If it is a continuum of units then you should write (on 19): "Units in PMd can be organized along a continuum by their responses in our reach task." (or something to that effect.)

Line 19 "Consistent with a model, which": Remove comma after model.

Figure 2: "Check": I would encourage the authors to just write "cue" rather than "check" in Figure 2. "Check" is not a standard short form for "checkerboard", but "cue" is pretty easy to interpret (especially if you use that term in fig 1 and the rest of the text... e.g. the "checkerboard cue".)

Figure 3e: Regression: The regression should be done on the actual data, not the binned and averaged data. Why did the authors do that? This figure needs to be made for each animal separately.
Line 88-90: "..the DDM framework applies even when sensory evidence is available all at once": This is not news. You cite two (relatively recent) papers. But much of the early psychophysics work related to the DDM is based on static visual discrimination. What is unique about your design?

Figure 4B: Why use a line to fit this when it clearly looks non-monotonic? A quadratic fit (or two lines) seems more appropriate.
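The question about Figure 4B (line versus quadratic fit) can be made concrete with a small model comparison. The following is a minimal sketch, not the authors' analysis: it assumes a per-unit index and a recording depth, uses synthetic data, and compares polynomial fits with the Akaike information criterion. The function name `compare_polynomial_fits` and all numbers are hypothetical.

```python
import numpy as np


def compare_polynomial_fits(depth, index, degrees=(1, 2)):
    """Return {degree: AIC} for least-squares polynomial fits of index on depth."""
    depth = np.asarray(depth, dtype=float)
    index = np.asarray(index, dtype=float)
    n = depth.size
    aic = {}
    for degree in degrees:
        coeffs = np.polyfit(depth, index, degree)
        residuals = index - np.polyval(coeffs, depth)
        rss = float(np.sum(residuals ** 2))
        n_params = degree + 2  # polynomial coefficients plus the noise variance
        # Gaussian-error AIC up to an additive constant shared by all models.
        aic[degree] = n * np.log(rss / n) + 2 * n_params
    return aic


# Hypothetical, synthetic data: a non-monotonic dependence on depth plus noise.
rng = np.random.default_rng(0)
depth_mm = np.linspace(0.0, 2.4, 16)
index = 0.3 - 0.5 * (depth_mm - 1.5) ** 2 + rng.normal(0.0, 0.02, depth_mm.size)

aic = compare_polynomial_fits(depth_mm, index)
best_degree = min(aic, key=aic.get)  # lower AIC indicates the preferred model
```

A lower AIC for the quadratic model would support the reviewer's suggestion; with real data the comparison should be made per animal and per session where possible.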

Reviewer #2 (Remarks to the Author):
Chandrasekaran et al. present an analysis of multi-unit recordings from laminar probes in premotor cortex during a perceptual decision making task. They found that, in some units whose activity positively correlated with reaction times, activity bears signatures of a decision variable as predicted by a drift-diffusion-type model. Depth measurements from the laminar probes revealed that decision-related units were preferentially distributed in the superficial layers, whereas units in the deeper layers reflected more motor-preparatory responses.
I find this to be an interesting and compelling study that provides a novel advance to our understanding of the cortical circuit dynamics during decision making. The analyses are thorough and well-designed, and the figures and text are carefully and clearly presented. The laminar dependence of decision vs. motor signals in primate premotor cortex is of particular novelty and importance, as they suggest a circuit-level transformation in the flow of information across layers during decision making. I have only two comments/concerns, which should be readily addressable, described below.

Comments:

1. Relationship between visuomotor index and choice selectivity: What is the relationship across units between the visuomotor index and choice selectivity? Fig. 3C speaks to this question at the group level, averaging within decreased, perimovement, and increased divisions, but not at how these may be correlated or clustered at the unit level. This could be shown with a scatter plot showing, for each unit, the visuomotor index and some measure of choice selectivity (e.g. for each unit, the immediate pre-movement difference in firing rates or slopes between choices, averaged across coherences, and maybe normalized by baseline/overall firing rate). This analysis could address the important question of whether this sample can be described as a unimodal distribution showing a range of choice selectivity, or whether there is clustering with some units showing very little or very high choice selectivity. For instance, the example units at channel 11 in Fig. 4A show very little choice selectivity at the time before movement. In that sense they resemble the "fixation" neurons recorded in the frontal eye field (by Jeff Schall and others), which show omni-directional decreases preceding movement. So is there a subgroup of units that show a strong choice signal, beyond the expected continuum of a unimodal distribution?

2. Potential limitations of multi-unit analysis: A key limitation of the current work is that the analyses are done on only multi-unit data, without any validation that the main conclusions would hold if analyses were applied to only isolated single-neuron recordings. Indeed, with this limitation it is not even clear how one should interpret the selectivity and visuomotor indices of these units, when multiple neurons are combined to produce an analyzed unit. The Discussion section should note this limitation and describe what aspects of the conclusions may be sensitive

Having said that, I do recognize that the present paper quantifies the specific relationship with decision-related variables, which has not been done before. However, the index that is used is perhaps not ideal to tease that apart from other stimulus-locking or burst vs. tonic properties. The index is straightforward and reasonable, but it could potentially confound important phenomena. In particular, the FR is averaged across a lengthy period of time during which cells exhibit temporal changes, some increasing over the course of the trial and some decreasing. The classification of a cell as having a positive index could be caused by several different scenarios: 1) it could reach a similar level of activity but for a longer time prior to movement onset in long RT trials (perhaps simply because it is "stimulus locked" and "tonic"); 2) it could exhibit a burst that always lasts the same amount of time but is stronger in long RT trials; 3) it could build up as a function of elapsed time independently of difficulty. All of the examples shown illustrate scenario 1, but were other scenarios observed? If so, how often? A negative index could likewise be produced by different scenarios, and it is very conceivable that stronger bursts are associated with shorter RTs (and/or higher velocities).
The index might therefore inappropriately partition cells that are in fact closely functionally related, or group together cells that are very different. It seems to me that one needs to be more careful in characterizing these trends (aim 1), possibly through the use of multiple indices that measure different aspects of cell responses (e.g. duration of burst, amplitude of burst, time-locking with stimulus or movement onset, etc.). The analyses used by Sato & Schall and Song & McPeek, examining the relationship between RT and neural discrimination time, could be particularly insightful here, but only as one among several ways of quantifying response patterns. Assuming that the trends in response patterns are well-characterized, mapping them to different cortical layers (aim 3) is novel and very interesting. In fact, I consider this to be the most significant advance presented in this paper. The result in Figure 4b is compelling, and it would be very good to show that for individual recording sessions, perhaps even as a 3-D map that examines how robust these trends are along rostro-caudal or medial-lateral axes. Does the prevalence of superficial increasing cells decrease as you move caudally? Do they appear at different depths? It would also be desirable to have more detailed reconstructions so as to be able to judge what specific cortical layers correspond to the different depths. The online methods describe the careful placement of the probe to maximize consistency across sessions, but this cannot do anything about the different angles at which the probes intersect the curving cortical surface. A histological reconstruction would be best, but if the animals are still being used in experiments then perhaps a careful 3D reconstruction based on MRI data would suffice. I emphasize this because the laminar trends described here would be even more valuable if they could be more precisely related to specific cortical layers.
Regarding aim 2, the paper is rather underwhelming. The authors contrast their findings with a study, in which it was claimed that PMd activity is not compatible with the DDM but instead quickly tracks sensory information and combines it with a growing urgency signal. Why is that explanation not compatible with the present data? The central assumption of the DDM is that build-up of neural activity is caused by an integrator with a long time constant, and Thura & Cisek argued that their data suggests the PMd time constant is short. Here, the authors argue that their data is compatible with the DDM (i.e. an integrator with a long time constant). But they can't actually test that proposal because it is not possible to distinguish between a long and short time constant with a task, like the present one, in which the stimulus information is static. All of the data shown here, and all of the fits, could be done with a variety of settings of time constants and urgency signals. The authors propose that perhaps the brain adjusts time constants in a task-dependent way, using long ones when the stimulus is static. This is plausible, but it has also been claimed that time constants are short even for the classic random-dot motion task (Carland et al. 2016). Regardless, no case can be made either for or against the DDM using data, like that presented here, which does not actually test the DDM. Consequently, I would recommend that the authors just remove the DDM from the manuscript, including the discussion as well as the model fits. It merely distracts from the main contribution, which (at least in my mind) is the characterization of differences in activity patterns at different cortical layers. Furthermore, given the ambiguous status of the DDM as an explanation for this kind of experiment, I would recommend not basing any of the analyses on fits to that model.
This specifically pertains to estimates of decision times, which are derived from parametric fits that assume the DDM as the correct model, even though it is not tested. In fact, the non-decision times of 326 and 360ms for the two monkeys, respectively, are quite long. If that is the sum of sensory plus motor delays, then it would be hard to explain how the monkeys could perform a simple reaction time task with an RT that is less than 300ms, which presumably they often do. Specific comments: Line 168: It is reported that the neural data comes from 546 units in monkey T and 450 in monkey O, and that this includes both single units and multi-units. Why include the multi-units? One critical prerequisite to addressing aims 1 and 3 is to characterize the diversity of firing patterns of real neurons, and so averaging across neurons is exactly the wrong thing to do. It only raises the concern that many important phenomena were missed. Line 176: It is interesting that stimuli with higher coherence were associated with earlier divergence of FRs, but the statistics that test this should be reported (In fact, the strength of this correlation would make a very interesting index for characterizing response patterns, as noted above). Similarly, statistical tests should be reported for all of the other claims made as well.

Response to reviewers for "Laminar differences in decision-related neural activity in dorsal premotor cortex"

We thank all three reviewers for the positive evaluation of the novelty of the results presented here. We are also grateful to all the reviewers for the insightful comments and the suggestions that we believe have greatly improved the manuscript. In this revised manuscript, we have made every attempt to address these comments and clarify the novelty as well as strength of the results presented in the manuscript. Please find below the comments from the reviewers in black and our response to the reviewers in red. The text/analysis changes we have made in the manuscript to respond to the reviewer comments are provided in blue in this document (as well as in the manuscript). Before addressing each comment, we provide a brief preamble to list the major additions or clarifications we have made in the manuscript.

In our original manuscript, we relied heavily on the index as the primary way to separate neuronal firing rates because we believe that it is simple and readily understood. We did not include in the original manuscript other analyses we had performed because of the worry that it may be unwieldy or overwhelming for the reader. However, we agree with the reviewers that adding additional depth to these analyses improves the impact, readability, and relevance of this manuscript. We have attempted to perform the appropriate analyses and include them in the manuscript in the appropriate location. The new analyses included in this revised manuscript are described below here first and also when appropriate for the questions separately for each reviewer.

1. Performed a neural discrimination time analysis on single neurons for a range of RTs and show that discrimination time correlates with the visuomotor index and other metrics of interest (Extended Data Fig. 
4d to address concerns about normality and correlations between units. Results did not change when using these nonparametric methods.

The authors trained 2 monkeys to perform a visual discrimination reach task. Despite using a static stimulus, the monkeys' behavior was well fit by a drift-diffusion model (DDM). Recordings from PMd revealed diverse neuronal populations, with some being more highly correlated with reaction time and others less correlated. The authors claim that the units that were positively correlated with reaction time were also correlated with choice and stimulus difficulty. But the units that were negatively correlated with RT did not show "decision-variable" like characteristics. Using a laminar probe the authors claim that superficial layers contain more of the "increasing" neurons (more correlated with RT.)

#### General Comments

The laminar organization of cortical function/computation is of great interest to neuroscientists. This paper is one of the first to describe laminar distinctions in a perceptually difficult reach task, which (in theory) permits the separation of perceptual, decision, and action related signals. There are a few problems that prevent me from recommending the paper for publication in Nature Comm. in its current form.

We thank the reviewer for the positive evaluation of the novelty of the results presented here. We have attempted to perform appropriate analyses and rewritten parts of the manuscript to address the issues raised by you. Please find below our point-by-point response to your comments.

1.2.
The behavior of the two monkeys is quite different, but all the analyses lump the neurons together, calling into question the generality of the results. The authors need to show that the main results and claims hold up independently in both animals.

We thank the reviewer for this comment. We carefully described in the original manuscript that there are differences between the monkeys' behavior primarily at the level of the RT. Without the RT curve and the DDM, and in the revised manuscript the UGM, the thresholds, and psychometric curves appear similar and in a fixed discrimination task the behavior of the monkeys might be considered near equivalent.

Nevertheless, we concur with the reviewer that the differences justify the need to show that the neurophysiological effects are largely similar in both animals and we have done so where appropriate by plotting data separately for each animal (Extended Data Figs. 1, 2, 3, 4, 11, and 12).

a. Extended Data Fig. 1 shows the fit of the DDM to the behavior of the two monkeys separately. Table 1 shows the goodness of fit statistics for the 4 candidate DDM models we considered for each monkey separately.

The logic of the paper is convoluted. I feel that it could be substantially improved with new analyses and writing. I detail the issues below, but the main problem is with the overuse of the "index" (correlation between pre-movement firing rate and reaction time).

We apologize to the reviewer for the lack of clarity. As suggested by the reviewer, we have now added several new analyses to extend the main conclusions of this paper that relied on the index. We have also attempted to clarify and streamline the text throughout.
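For readers unfamiliar with the "index" under discussion, the reviewer's description (a correlation between pre-movement firing rate and reaction time) can be sketched as follows. This is an illustration under stated assumptions, not the manuscript's exact computation: it assumes one firing-rate value per trial (the analysis window is not specified here), uses a Pearson correlation, and runs on synthetic units. The function name `rt_correlation_index` is hypothetical.

```python
import numpy as np


def rt_correlation_index(firing_rates, reaction_times):
    """Pearson correlation between per-trial firing rate and RT for one unit."""
    fr = np.asarray(firing_rates, dtype=float)
    rt = np.asarray(reaction_times, dtype=float)
    if fr.ndim != 1 or fr.shape != rt.shape:
        raise ValueError("expected two 1-D arrays of equal length")
    return float(np.corrcoef(fr, rt)[0, 1])


# Hypothetical, synthetic trials (RT in ms, firing rate in spikes/s).
rng = np.random.default_rng(1)
rt_ms = rng.uniform(300.0, 900.0, 200)
increased_unit = 10.0 + 0.02 * rt_ms + rng.normal(0.0, 2.0, rt_ms.size)
decreased_unit = 40.0 - 0.02 * rt_ms + rng.normal(0.0, 2.0, rt_ms.size)
perimovement_unit = 25.0 + rng.normal(0.0, 2.0, rt_ms.size)

index_increased = rt_correlation_index(increased_unit, rt_ms)        # positive
index_decreased = rt_correlation_index(decreased_unit, rt_ms)        # negative
index_perimovement = rt_correlation_index(perimovement_unit, rt_ms)  # near zero
```

Because a single correlation collapses the whole time course into one number, different firing-rate dynamics can produce the same value, which is the concern raised about relying on the index alone.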

1.4.
Of the 3 "concrete goals" the authors describe in the discussion, only the third one (laminar distribution of decision encoding) merits enough interest for publication in Nature Comm. However, the authors did not perform any analyses of the simultaneously recorded units that would provide stronger evidence that the decision-related activity in superficial (and also the deepest) layers of cortex drive the more movement related neurons.

We appreciate that the reviewer concurs with us on the importance of the laminar differences in PMd during a decision-making task. Regarding the other two points, we respectfully feel that the systematic characterization of the relationship between different types of response patterns in PMd and the decision-formation process is quite important and was largely lacking in the literature. For instance, as reviewer 2 points out, the decreased neurons are a novel finding for PMd and have only been previously described in the FEF and the superior colliculus.

Similarly, , the study with the most similar design as ours, documented decision-related responses in ~60 PMd neurons (many hundreds of neurons fewer than in the present manuscript) but did not describe the response diversity that we attempt to show. They also did not show that different units have different choice selectivity profiles and they provided one pooled estimate of the time of choice selectivity. These types of analyses are a major component of the manuscript, and our conclusions do not rest entirely on the laminar differences discussed here. The laminar differences are the final analyses we perform to investigate the complexity observed in the FRs of PMd neurons during the decision-making task.

Another important feature of this revised manuscript is that we also have now modeled the behavior of the monkeys using the DDM and an urgency-gating model.
We show how both models explain some features of the data, but neither model is perfect (Extended Data Fig. 2). The DDM overestimates the highest RT quantiles (70th and 90th percentiles). The UGM predicts a shape for the quantile probability plots that is inconsistent with the data.

We have also added a decoding analysis to show that information about choice emerges earlier in the superficial compared to the deeper layers of PMd (

unconvincing. It turns out that for the "decreased" population, choice signals emerge later, but it did not have to turn out that way. Thus the index is not itself the property of interest. Rather than report the laminar distribution of the "index" why not directly report the

where we show that there is a near-perfect relationship between our correlation-derived visuomotor index and the first principal component of an unbiased PCA analysis. The first PC explains ~50% of the variance in the neural data set, substantially bolstering confidence in our correlation-derived index (Extended Data Fig. 8e).
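The comparison between a per-unit index and the first principal component can be illustrated with a short sketch. It is not the authors' pipeline: it assumes a units-by-time matrix of trial-averaged firing rates, centers each time bin across units, and takes the first component from an SVD. The data are synthetic and the names (`first_pc_scores`, `unit_index`) are hypothetical.

```python
import numpy as np


def first_pc_scores(rates):
    """rates: (n_units, n_timebins) array of trial-averaged firing rates.

    Returns the PC1 score of every unit and the fraction of variance
    explained by PC1.
    """
    x = np.asarray(rates, dtype=float)
    x = x - x.mean(axis=0, keepdims=True)  # center each time bin across units
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u[:, 0] * s[0]
    explained = float(s[0] ** 2 / np.sum(s ** 2))
    return scores, explained


# Hypothetical, synthetic population: each unit ramps up or down in
# proportion to a per-unit index, plus independent noise.
rng = np.random.default_rng(2)
n_units, n_bins = 120, 50
time = np.linspace(0.0, 1.0, n_bins)
unit_index = rng.uniform(-1.0, 1.0, n_units)
rates = 20.0 + 10.0 * np.outer(unit_index, time) + rng.normal(0.0, 1.0, (n_units, n_bins))

scores, explained = first_pc_scores(rates)
# The sign of a principal component is arbitrary, so compare magnitudes.
r = float(np.corrcoef(scores, unit_index)[0, 1])
```

In this toy population a single ramping pattern dominates, so PC1 scores track the index closely; how close the relationship is in real data is an empirical question.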

A major reference for the "visuo-motor continuum" type analysis is DiCarlo and Maunsell 2005 "Using Neuronal Latency to Determine Sensory-Motor Processing Pathways in Reaction Time Tasks." That paper provides a more robust technique for doing neural-behavior latency analysis (rather than just comparing spike-rates with RT). Another technique for neural-behavior latency analysis is described in Erlich, Bialek & Brody (2011) which is applicable to neurons with more complex dynamics.

We thank the reviewer for drawing our attention to both of these analysis techniques. We were obtaining reasonable results with simple regressions between firing rates and RT and thus did not pursue more sophisticated techniques for estimating this relationship. We have now added the analysis performed in DiCarlo and Maunsell as an extended data figure (Extended Data Fig. 5). We have now also used the technique from Erlich, Bialek and Brody (2011)

violate the assumption of independent samples for regression. Firing rates may or may not satisfy the assumption that the data came from a normal distribution.

Thank you for this suggestion. Where applicable we have now included nonparametric statistics that measure whether the medians are significantly different from one another to assuage the concern that the means of these distributions may not be meaningful because of the non-Gaussian nature of these distributions. As the reviewer has surmised, adopting tests that compare between the medians did not change the main results of the manuscript. We have also included nonparametric statistics using permutation tests where necessary. Again using these methods did not change the results presented in the manuscript.

#### Some detailed comments

Line 15 - "... are not well understood": This phrasing has unfortunately become a universal technique for general motivation of any paper. I wish the authors could be a bit more specific in the abstract as to the hypothesis that they are testing. Your paper is about the laminar organization. So something like "we have previously described the response properties of neurons in PMd using techniques that did not allow us to identify the laminar location... "

Many studies in the oculomotor system have delineated the role of different brain regions in perceptual decision-making. We feel that the mechanisms are not well understood for the somatomotor system. That's why we included this opening sentence. However, we fully agree that these types of sentences have become a standard opening or motivating statement for many papers and we apologize if the opening sentence of this manuscript is trite. We have therefore edited the first two sentences to better set up the rationale for our study.

We thank the reviewer for the suggestion and have amended the text as well as the methods to be clearer. We have replaced the word check in the figures with the word "cue" and refer to it as "checkerboard cue" in the text.

Figure 3e: Regression: The regression should be done on the actual data, not the binned and averaged data. Why did the authors do that? This figure needs to be made for each animal separately.

We chose to perform the regression on latencies estimated from the average population data because we were concerned that single neuron estimates of this regression might be noisy. However, the analysis was just as robust when we performed the analysis using the individual neurons and found very similar results (Extended Data Fig. 4d-f, Extended Data Fig. 7, Fig. 4d, Extended Data Fig. 11d). We thank the reviewer for the suggestion and agree that this is statistically a more robust measure of the latency differences in PMd. We now include these neuron-by-neuron estimates as figures separately for each monkey (Extended Data Fig. 4e-f) and also pooled across both monkeys (Extended Data Fig. 4d). We also include a correlation between the discrimination time measured on a single neuron basis over all RTs and our simple visuomotor index (Extended Data Fig. 4d-f). As advised by the reviewer, we also included analyses from DiCarlo and Maunsell (2005)

if one of them describes the data better than the other. The DDM predicts the shape of the quantile probability plots better but overestimates the RTs for the highest quantiles (Extended Data Fig. 2a, left panel). In contrast, the UGM predicts a different shape but predicts faster RTs for the highest quantiles (Extended Data Fig. 2b, right panel). Depending on the relative balance, either the DDM or the UGM describes the data better.

Nevertheless, we agree with both Reviewer 1 and Reviewer 3 that the DDM vs. UGM is not the main focus of this manuscript. The more novel results in the manuscript are the temporal heterogeneity of decision-related responses and the demonstration of a laminar structure in PMd.
We have rewritten this introductory paragraph and deemphasized the DDM in the results and provided an explicit discussion on the DDM vs. the UGM.

We have adopted the strategy of keeping the modeling results but minimizing their emphasis in the manuscript that the DDM is the only model that can fit this data. All of these decision-making models predict very similar FR profiles for trial-averaged data with the stimuli used by us.

1.15. Figure 4B: Why use a line to fit this when it clearly looks non-monotonic? A quadratic fit (or two lines) seems more appropriate.

We thank the reviewer for this suggestion to use quadratic and higher order fits to describe the dependence of visuomotor index on cortical depth (Fig. 4c

We also agree that this is an interesting question. We have now attempted to answer this question using the following analysis.

We have added an extended data figure (Extended Data Fig. 4h) that plots the correlation between the index and the choice probability (measured using an ROC analysis on the smoothed firing rates) at each time point. There is a reliable positive correlation between the visuomotor index we defined and choice probability. The correlation plot shows that as the visuomotor index becomes more positive (that is towards a more increased neuron) the choice selectivity also increases. We also show three scatter plots for each of the three different time points shown in the left panel. The data are more consistent with a continuum and less consistent with clusters of neurons with stronger vs. weaker choice selectivity.
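The choice probability referred to above is the area under an ROC curve that separates firing rates on trials ending in one choice from those ending in the other. A minimal sketch of that quantity follows, assuming per-trial firing rates are already smoothed and grouped by choice; the pairwise formulation, the function names, and the synthetic data are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def choice_probability(rates_choice_a, rates_choice_b):
    """Area under the ROC curve separating two sets of per-trial firing rates.

    Computed as the probability that a random choice-A trial has a higher
    rate than a random choice-B trial, with ties counted as one half.
    """
    a = np.asarray(rates_choice_a, dtype=float)
    b = np.asarray(rates_choice_b, dtype=float)
    if a.size == 0 or b.size == 0:
        raise ValueError("both choices need at least one trial")
    diff = a[:, None] - b[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))


def choice_probability_over_time(rates_a, rates_b):
    """rates_*: (n_trials, n_timebins) arrays; returns one value per time bin."""
    rates_a = np.asarray(rates_a, dtype=float)
    rates_b = np.asarray(rates_b, dtype=float)
    if rates_a.shape[1] != rates_b.shape[1]:
        raise ValueError("both arrays need the same number of time bins")
    return np.array([
        choice_probability(rates_a[:, t], rates_b[:, t])
        for t in range(rates_a.shape[1])
    ])


# Hypothetical, synthetic unit whose choice signal grows across five time bins.
rng = np.random.default_rng(3)
separation = np.linspace(0.0, 4.0, 5)
rates_a = 20.0 + separation + rng.normal(0.0, 1.0, (100, 5))
rates_b = 20.0 + rng.normal(0.0, 1.0, (100, 5))
cp = choice_probability_over_time(rates_a, rates_b)
```

A value of 0.5 means the two choices are indistinguishable from this unit's firing rate, and values near 1 (or 0) mean strong selectivity; correlating `cp` at a given time bin with a per-unit index across units gives the kind of relationship described above.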

2.2.
In that sense, they resemble the "fixation" neurons recorded in the frontal eye field

per convention, we term them single neurons. Some of these single neurons were collected using high impedance (i.e., small electrode contact area, > 6 MΩ) sharp FHC electrodes. When using these electrodes, every attempt was made to isolate and track single neurons and to stably record from them.

We also had excellent success recording from isolated single neurons from the U-probes. The U-probes are low impedance electrodes (~100 kΩ) with a small contact area and were thus excellent for isolation of single units. We used a conservative threshold to maximize the number of clearly defined waveforms and minimize contamination from spurious non-neuronal events. Online, we used the hoops provided by the software client for our Cerebus system (BlackRock MicroSystems) to delineate single neurons after the electrodes had been placed in the cortex for at least half an hour to 45 minutes. Every time a spike was detected by the threshold method, a 1.6 ms snippet was stored and used for subsequent evaluation of the clusters as well as adjustments needed for spike sorting. Recordings from some electrodes in the U-probes consisted of mixtures of 2 or more neurons well separated from the noise and from one another. In a majority of these cases, the waveforms were clearly separated, and these were labeled as single units.

Discussion Section

The majority of the electrophysiological data reported in this manuscript were single neurons (~80%) recorded in PMd during the decision-making task. However, we also included a substantial fraction of multiunits in the data (~20%). The multiunits certainly provided us with additional power for the analyses presented here, but it may have also led to spurious misclassification of units in the continuum. First, in the worst case, there is the possibility of combining the FRs of an increased neuron with a decreased neuron and a perfect cancelling in the decision-formation period could result in a spurious perimovement unit. Fortunately, as our laminar recordings show, the increased, decreased and perimovement like FRs appear to be roughly segregated as a function of cortical depth, so this type of spurious mixing will be minimized due to this topographic organization. Second, because a multiunit contains additional spikes, there is a slightly greater chance that a decreased neuron will be misclassified as a perimovement or an increased unit and this will increase the preponderance of increased units in our database. Third, there is also the possibility that some finer grained temporal patterns are smeared because of combining multiple neurons into a unit. Finally, inclusion of the multiunits could have resulted in smoother visuomotor continuum than what is actually present in PMd. Future studies that use a laminar electrode with a tetrode configuration that will improve isolation or the next generation of silicon electrodes that provide high

Three corresponding main results are reported: 1) PMd cells can be characterized as lying along a continuum that is indexed by their trial-to-trial relationship to reaction time. At one end there are "increasing cells" whose movement-aligned activity is larger when reaction times are longer, and which are most clearly related to the difficulty of the decision. At the other end are "decreasing cells" whose activity is larger for shorter reaction times, and which do not reflect the choice until just before movement onset. In-between there are "perimovement cells" that do not correlate with RT and simply indicate movement direction about 150ms prior to onset. 2) The activity of increasing cells, and the monkeys' behavior, is consistent with the DDM. 3) There is a clear trend in the laminar distributions of the cell types, with increasing cells more common in superficial layers, while decreasing and perimovement cells are more common at deeper layers. I will comment on each of these aims and findings in turn, but I will begin with aims 1 and 3, as they are related, before turning to aim 2.

authors acknowledge all of this, but then it's unclear why they state that a "detailed description of the temporal patterns" is lacking (p. 2), and unclear how their analyses provide a description that is more detailed than what was already shown previously. Having said that, I do recognize that the present paper quantifies the specific relationship with decision-related variables, which has not been done before.

Many previous studies have reported similar kinds of trends in PMd neural firing
We apologize for our lapse in including papers from classical motor studies. We

As the reviewer has recognized by our citations of some of the most relevant papers, we were well aware of the many different studies that had previously demonstrated stimulus related and movement related units in PMd and the frontal eye fields. Indeed, we built upon these ideas that a visuomotor continuum can describe the FR patterns and used it as the simplest way to tease apart the different types of units involved in the decision-formation process. We agree that we were remiss in not citing these key references. We have now included them in the revised manuscript.

However, as the reviewer has recognized, first, there is a gap in our understanding of the types of responses in PMd during a difficult reach decision-making task and whether some of them are better described by a decision-variable. Second, to our knowledge decreased neurons have not been reported in PMd. As reviewer 2 points out, the decreased neurons are a novel finding for PMd and have only been previously described in the FEF and the superior colliculus.

To illustrate, Coallier, Michelet, and Kalaska 2015, the most recent study with quite a similar design as ours, documented decision-related responses in ~60 PMd neurons (hundreds of neurons fewer than reported in the present manuscript) but did not describe the response diversity that we attempt to show. They also did not show that different units had different choice selectivity profiles and provided one pooled estimate of the time of choice selectivity (for instance, Table 3A of Coallier et al. (2015)). We attempted to provide many different analyses that help us dissociate between the different firing rate patterns.

Similarly, Song and McPeek (2010) described neurons in the visual and motor ends of the continuum but did not describe any of the decreased neurons.
Their task also involved a visual search. Not only did we replicate their findings in this study, we also show that a similar visuomotor continuum is present in PMd during a difficult decision-making task with an arm reach as the behavioral report.

3.2.
However, the index that is used is perhaps not ideal to tease that apart from other stimulus-locking or burst vs. tonic properties. The index is straightforward and reasonable, but it could potentially confound important phenomena. In particular, the FR is averaged

We thank the reviewer for raising this important concern. We have now attempted to use several other analyses with varying degrees of supervision to describe these response patterns. We are pleased to note that all of these alternative methods still provide very similar results to our main conclusions using the straightforward and reasonable visuomotor index that we derived here.
• We have now implemented the analyses originally developed by    Fig. 6). This sophisticated correlation is very well related to the visuomotor index we proposed here.
• We have included an analysis from  to identify the position of these neurons along a visuomotor continuum (Extended Data Fig. 5).
• It is entirely plausible that there are many different patterns that we have completely missed. However, if this is the case, then the number of principal components should increase with the number of distinct patterns. The variance in the FRs is well explained by the top 5 components (Extended Data Fig. 8).
• As regards the unconvincing nature of the index, we have now incorporated an analysis where we show that there is a near-perfect correlation between the correlation-derived visuomotor index and the first principal component that explains ~50% of the variance (Extended Data Fig. 8).
• Another analysis we performed was based on an analysis by Meister et al. (2013) that used clustering to separate out neural signals in LIP during perceptual decisions. We find that most of our data is well described by ~5 clusters.
When we increase the number of clusters we further fractionate the neural populations, but the gain in explanatory power is modest (Extended Data Fig. 9).
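The "modest gain" argument in the clustering analysis can be illustrated with a small sketch: cluster the units, compute the fraction of variance explained for each number of clusters, and check how quickly that fraction saturates. The following is a minimal illustration only, not the authors' code. It assumes a plain k-means with a deterministic farthest-point initialisation, and the function names and toy data are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def kmeans(points, k, n_iter=50):
    """Plain k-means; farthest-point initialisation is an assumption made for reproducibility."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its nearest center, then recompute the centers.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = mean_point(members)
    return labels, centers


def variance_explained(points, labels, centers):
    """1 - (within-cluster sum of squares / total sum of squares)."""
    grand = mean_point(points)
    total = sum(sq_dist(p, grand) for p in points)
    within = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return 1.0 - within / total


# Hypothetical toy data: two well-separated groups of "units" in a 2-D feature space.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
for k in (1, 2, 3):
    labels, centers = kmeans(toy, k)
    print(k, round(variance_explained(toy, labels, centers), 4))
```

On real firing-rate data the points would be per-unit feature vectors (for example, scores on the leading principal components), and the number of clusters at which the curve flattens is a judgment call rather than a fixed rule.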

3.3.
Assuming that the trends in response patterns are well-characterized, mapping them to different cortical layers (aim 3) is novel and very interesting. In fact, I consider this to be the most significant advance presented in this paper. The result in Figure 4b is compelling, and it would be very good to show that for individual recording sessions.

As requested by the reviewer, we have now added single-session examples of this index in Fig. 4b and Extended Data Fig. 10b.

3.4.
Perhaps even as a 3-D map that examines how robust these trends are along the rostro-caudal or medial-lateral axes. Does the prevalence of superficial increasing cells decrease as you move caudally?

Successfully making U-probe recordings meant that we needed to constantly monitor the dura to see if there was any dimpling or resistance. The use of a grid system would drastically reduce the visibility in the chamber and compromise the quality/safety of the recordings. As a consequence, we have very little access to the precise locations of the recordings.

However, we did perform a modest number of U-probe recordings in a more caudal location in our chamber (putative M1). For these sessions, we did not observe the same trends that we observed in dorsal premotor cortex (Extended Data Fig. 12a)

urgency signals. The authors propose that perhaps the brain adjusts time constants in a task-dependent way, using long ones when the stimulus is static. This is plausible, but it has also been claimed that time constants are short even for the classic random-dot motion task (Carland et al. 2016). Regardless, no case can be made either for or against the DDM using data, like that presented here, which does not actually test the DDM.

We are very receptive to this concern, and we thank the reviewer for pointing out this lack of rigor in the manuscript. We were remiss in loosely claiming that the DDM provides a better fit to the data without testing the alternative Urgency Gating Model (UGM). As the reviewer correctly identifies, many candidate models with different assumptions can potentially explain discrimination behavior, and this is an active, fascinating debate that is currently occurring between Cisek and colleagues and Ratcliff Fig. 2a).

We, therefore, examined the quantile probability plots for the two different monkeys (Extended Data Fig. 2b).
Visually, the data were more consistent with a DDM than a UGM with a 100 ms time constant. We went one step further and used the same toolbox developed by Hawkins et al. to fit the data (code available from Guy Hawkins on request, Extended Data Fig. 2c). We tested the UGM developed in Thura et al. (2012) and an evidence accumulation model (or DDM) developed by Stone et al. (Stone, 1960). Consistent with the observation from the quantile plots, we found that the DDM fits the data slightly better than the UGM for the fast RTs. In contrast, the UGM fits the slowest RTs better (Extended Data Figs. 2c, d). We nevertheless agree with the reviewer that neither the DDM nor the UGM can completely describe the data. We have removed the overemphasis on the DDM from the introduction and written the discussion to be broader and incorporate the two different viewpoints espoused by the Ratcliff and colleagues approach and also Cisek and others' UGM approach. We have also amended the various portions of the text to better reflect this distinction and written an explicit discussion, which now expands on this relevant debate and shows how our increased neurons could be explained by both of these mechanisms.

We did not want to remove the whole section on DDMs because Reviewer 2 suggested that these parameters are of interest. This is also a very relevant and interesting topic for interpreting the myriad reports of decision-related firing rates in many brain regions which usually are thought to be described by the DDM. Another reason is that detailed modeling of behavior from decision-making experiments in different species appears to support different mechanisms. Our hope is that these new analyses and discussions suffice to ameliorate the concern of the reviewer that we were not precise enough in our results and discussion section about the different models that can explain the data.
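For readers unfamiliar with quantile probability plots, the underlying table can be sketched as follows: for each stimulus condition, the RT quantiles of correct and error trials are computed separately and paired with the proportion of correct or error responses. This is a minimal sketch under stated assumptions, not the toolbox referred to above. The 0.1 to 0.9 quantile set is a common convention rather than something stated in the response, and the function names, trial tuple layout, and toy data are hypothetical.

```python
def quantile(sorted_vals, q):
    """Linear-interpolation quantile of pre-sorted data, with q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def quantile_probability(trials, qs=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """trials: iterable of (coherence, correct, rt_ms) tuples (hypothetical layout).

    Returns one row per (coherence, outcome) with the response probability
    and the RT quantiles for that outcome.
    """
    by_cond = {}
    for coh, correct, rt in trials:
        by_cond.setdefault(coh, {True: [], False: []})[bool(correct)].append(rt)
    rows = []
    for coh in sorted(by_cond):
        n_total = sum(len(v) for v in by_cond[coh].values())
        for correct in (True, False):
            rts = sorted(by_cond[coh][correct])
            if not rts:
                continue  # no trials with this outcome in this condition
            rows.append({
                "coherence": coh,
                "correct": correct,
                "probability": len(rts) / n_total,
                "rt_quantiles": [quantile(rts, q) for q in qs],
            })
    return rows


# Hypothetical toy trials: (coherence, correct, RT in ms).
toy_trials = [
    (0.9, True, 300), (0.9, True, 400), (0.9, True, 500), (0.9, True, 600), (0.9, True, 700),
    (0.1, True, 500), (0.1, True, 600), (0.1, True, 700), (0.1, False, 400),
]
for row in quantile_probability(toy_trials):
    print(row)
```

Plotting `rt_quantiles` against `probability` for every row gives the quantile probability plot; model predictions would be overlaid in the same coordinates.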
We thank you for this comment. We have taken an approach where we have fit both models to the behavioral data and then shown that neither model provides a complete account of the behavior of the monkeys. Fortunately, none of the analyses of the neural responses presented here depend on whether the DDM or the UGM is the right model. As we understand it, when trial averaging, both models predict very similar FR profiles. We have elected to retain the DDM (and UGM) discussion, for the reasons described directly above.

3.8.
This specifically pertains to estimates of decision times, which are derived from parametric fits that assume the DDM as the correct model, even though it is not tested. In fact, the non-decision times of 326 and 360 ms for the two monkeys, respectively, are quite long. If that is the sum of sensory plus motor delays, then it would be hard to explain how the monkeys could perform a simple reaction time task with an RT that is less than 300 ms, which presumably they often do.

We note that the best DDM fits (measured by the chi square estimate in Table 1) were obtained when a variability parameter was also included for the non-decision time. This variability parameter is the range of the uniform distribution centered on the mean non-decision time. This variability range was ~140 ms for monkey T and ~120 ms for monkey O. So non-decision times could be in principle as low as ~256 ms for monkey T and ~290 ms for monkey O.

Thus our estimates of decision-times are likely a lower bound. Even if our estimates of non-decision times are higher than expected, it only bolsters confidence that the monkeys are monitoring the stimulus more to make their decisions. It might mean that our "200 ms" decision time for the monkeys is actually a lower limit on what the monkeys are using to form their decisions.

Regarding absolute magnitudes for non-decision times, we do not feel that these numbers are hugely inconsistent with other reports of non-decision times. Thura and Cisek (2014) estimated non-decision times using the RTs from a simple delayed reach task that is similar to the simple RT task which the reviewer has suggested. These numbers for RTs and thus by proxy non-decision times for the two monkeys reported in Thura and Cisek were as follows (mean ± SD: 291±40 ms, 335±93 ms). These numbers were not too far from the mean non-decision times we report here.
Similarly, Coallier that non-decision times may be 35-50 ms larger in tasks that involve volitional commitment (as in our RT discrimination task) compared to simple delayed reach tasks. Together, even though our estimates seem slightly on the higher side, they are roughly in the same range as those for monkeys performing a very simple delayed reach task, which in principle should only involve minimal sensory and motor delays.

Specific comments:

Line 168: It is reported that the neural data comes from 546 units in monkey T and 450 in monkey O and that this includes both single units and multi-units. Why include the multi-units? One critical prerequisite to addressing aims 1 and 3 is to characterize the diversity of firing patterns of real neurons, and so averaging across neurons is exactly the wrong thing to do. It only raises the concern that many important phenomena were missed.

We included the multi units because we were circumspect about excluding any recorded data and we wanted as much power as we could for understanding how cortical depth influences decision-related activity. We have now provided a more detailed account of the types of units that were included in the database in the methods section. As suggested by reviewer 2 we have placed a caveat at an appropriate location in the discussion section. We note that the multiunits were only 20% of the units recorded in our full dataset and thus unlikely to dominate the results presented here. 80% of the units are true single neuron recordings as determined by the usual electrophysiology standards in the field.

The text included in the revised manuscript can be found in the answer to question 3 posed by reviewer 2 (2.3).

We apologize for this lack of clarity in the manuscript. We meant to say faster rate of divergence, which is to say that choice information increases faster for easier compared to harder coherences as in Figs. 3b-c. We report the correlation between the slope of the FR dependence on coherence and the visuomotor index in this revised manuscript (positive correlation between the visuomotor index and the dependence of slopes on color coherence, Spearman's r=0.25, p=6.35e-16).
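For reference, a Spearman correlation of the kind reported above is the Pearson correlation of the ranked data. The sketch below is a minimal, dependency-free illustration and not the authors' analysis code. The function names and the example values for the per-unit coherence slopes and visuomotor indices are hypothetical.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            r[order[k]] = mean_rank
        i = j + 1
    return r


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-unit values, for illustration only.
visuomotor_index = [-0.8, -0.3, 0.0, 0.2, 0.5, 0.9]
coherence_slope = [0.1, 0.4, 0.2, 0.5, 0.3, 0.9]
print(spearman(visuomotor_index, coherence_slope))
```

A significance value such as the one quoted would additionally require a null distribution (analytic or by shuffling), which this sketch does not compute.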

### Summary
The manuscript is improved. There are still a few places where the text needs clarification, but overall I recommend it for publication.
> We gently restrained the arm the monkey was not using ...
Figure 1C: Generally, the negative coherence (more green) should be to the left of the more positive coherence.
line 159: "than, the" remove comma
line 291: "Both left, and right" remove comma
line 303: The perimovement group is not defined in the same way as the other groups, since it is by definition the "leftovers". I wonder if the authors could "clean up" that group? I recognize that this could be a bit complicated, but the techniques in Rouder et al. (2009) could be of use. This may help to distinguish the decreased and the perimovement groups.
line 337: Not sure how you are estimating p in your permutation test, but with a 2-tailed test and 10000 repeats your p should not be less than 1/5000. My guess is that your data is completely outside of the shuffled distribution, so you put p~0, but it should just be 2/(# of shuffles).
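To make the bound concrete, here is a minimal sketch of a two-tailed permutation test whose reported p-value is floored at 2/(number of shuffles), so that 10000 shuffles can never yield less than 1/5000. It is an illustration of the convention suggested in this comment, not the authors' procedure. Other conventions exist (for example, adding one to both numerator and denominator), and the function names and toy data are hypothetical.

```python
import random


def shuffle_null(a, b, n_shuffles=10000, seed=0):
    """Observed difference in means and its null distribution under label shuffling."""
    rng = random.Random(seed)
    pooled = list(a) + list(b)
    observed = sum(a) / len(a) - sum(b) / len(b)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        pa, pb = pooled[:len(a)], pooled[len(a):]
        null.append(sum(pa) / len(pa) - sum(pb) / len(pb))
    return observed, null


def permutation_p_value(observed, null):
    """Two-tailed p-value, never reported below 2 / n_shuffles."""
    n = len(null)
    n_ge = sum(1 for s in null if s >= observed)  # upper-tail count
    n_le = sum(1 for s in null if s <= observed)  # lower-tail count
    p = 2.0 * min(n_ge, n_le) / n
    return min(1.0, max(p, 2.0 / n))


# Hypothetical toy data: two groups that do not overlap at all.
group_a = list(range(100, 120))
group_b = list(range(0, 20))
obs, null = shuffle_null(group_a, group_b, n_shuffles=1000)
print(permutation_p_value(obs, null))  # bounded below by 2 / 1000
```

When the observed statistic lies entirely outside the shuffled distribution, the function returns the floor rather than zero, which is the quantity that should be reported as an upper bound (p < 2/n).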
line 347: Is that really true? Does this sentence even make sense? The RT is a property of the trial (the specific trajectory that was taken), so how can you say "choice ... signal should emerge on average ... regardless of RT". I think you mean this for a given level of difficulty. That's important to say. But I still think this sentence is problematic. The DDM and UGM are behavioral models. They don't make strong predictions about how neural circuits instantiate the models. In order for this sentence to be meaningful, a lot of unstated assumptions should be stated (such as the correlation between neurons and how the DV is read out). Maybe easier to remove?
line 390: I think the models predict crossing a threshold, which is not the same as convergence (the neurons could converge to 0 FR. Would that support the models?). Even if convergence was sufficient, is absence of correlation evidence for convergence? In Murakami et al. (2014) they do a lot of analyses to argue for a rise to threshold mechanism.

In their resubmission, the authors have performed extensive and thorough additional analyses, which address all major concerns raised in the initial review. This has greatly clarified and strengthened the manuscript. I now find this manuscript suitable for publication.

Reviewer #3 (Remarks to the Author):
In my opinion, the authors have significantly improved the manuscript. In particular, all three reviewers requested additional analyses and made a variety of suggestions, and the authors have heroically implemented all of these. Furthermore, these additional analyses generally support the initial conclusions that used the simple visuomotor index. While some of these are not terribly strong (e.g. Extended Figs 5 and 7), some are very compelling (e.g. Extended Fig 6f and 8e), and all are very worthwhile. Thus, my initial concerns about the suitability of the visuomotor index are well-addressed, and the new analyses make this a stronger paper. However, as a consequence the manuscript is now very large. By my estimate, the main text is more than twice over the word limit of Nature Communications. I would recommend that the details of most of the new analyses be relegated into the online methods and supplemental materials, and only mentioned in the main text as generally confirming the conclusions gained by the visuomotor index. Don't get me wrong -I like these new analyses and think they should remain (all the details can be in online methods and extended figures). I just think the reader should be allowed to move on more quickly to the main finding, the laminar trends.
A more serious flaw with the paper is the issue of DDM versus UGM. I understand the authors wanted to leave this in because they use modeling to estimate non-decision times. But what does it actually add? First of all, this issue is completely orthogonal to the real message of the manuscript, which quantifies the differences in cell types in different layers in PMd. That quantification is accomplished with the visuomotor index and all of the methods requested by the reviewers, none of which require a DDM or UGM or any other additional assumptions.
But second of all, the results of the model fits are incredibly inconclusive! The DDM works better for the early part of the RT distribution in error trials; the UGM better for the late part of RT distributions. The DDM is better for monkey T; the UGM is better for monkey O. So what are we to conclude? That animals sometimes use the DDM and make early errors, but as time passes they switch to a UGM? That different animals use different mechanisms? I'm sure the authors do not intend to make such proposals. In my opinion, the only thing that this model fitting exercise demonstrates is its own futility. There is no value in doing model fits to data from experiments that do not discriminate the models. Let's be honest here -these models are very simplified cartoons and surely nobody would expect them to capture the richness of the true (recurrent, non-linear) dynamics of the real brain. Thus, each of them captures a lot of the data but suffers "at the edges". Comparing how much each suffers at each edge is pointless, in my opinion. All it does is add noise to an already noisy debate. A better approach is to use "strong inference" (Platt JR 1964 Science) -i.e., to design experiments that force the models to make different predictions. This has been done, and the authors cite some of the relevant papers.
Thus, I again strongly recommend that the discussion of DDM vs UGM is removed from the manuscript, at least from the main body. It would be useful to include the modeling analyses in the supplemental data, but only as an instructive demonstration of the futility of doing such model fits on this kind of data. I'm sorry but there is nothing else you can conclude from these. In the main text, the authors should remove all statements and conclusions based on these utterly inconclusive results.
A final concern I had was with the use of "multiunits". The authors now report that single, well-isolated units made up 80% of their data, and multiunits only 20%. That's great. But then what is the point of including the multiunits? They only raise concerns (Lines 743-761) and I doubt that with all of the single units (N=881, which is excellent) you really need the additional power. Why not just include the nice clean data that you worked so hard to obtain? Do any of your results change or lose significance when you remove the 20% multiunits?

Specific comments:
Line 192-206: If you do choose to keep the model fitting in the supplemental, then you should give both models the same flexibility. For example, to produce early decisions, the urgency signal in the UGM needs to have both a slope and intercept parameter, otherwise it always starts at zero (Line 1259). Also, I'm confused why you say that the models have the same number of parameters (Line 1262). Shouldn't the UGM have at least one or two more? What are they?
Line 250: I was surprised that you did not find onset-related activity. Could it be because the target locations were always the same, so that onset was not informative? Nevertheless, the previous study cited (Song & McPeek 2010) did show activity related to cue onset, so the statement is incorrect (unless I misunderstood what you meant). Line 796: Indeed, in the tokens task all sensory evidence remains visible. But is that not also the case in the present task, in which the color checkerboard is always visible? Is that also not the case in the random-dot motion discrimination task, in which the motion (the thing the subject is deciding about) is always present in the stimulus? Perhaps I don't understand the distinction that is being made here, in which case, please explain.
Responses to reviewers for NCOMMS-16-15643B: "Laminar differences in decision-related neural activity in dorsal premotor cortex"

We thank all three reviewers for the positive evaluation of the novelty of the results presented here. We are also grateful to all the reviewers for the new comments and the suggestions on the revised manuscript. In this re-revision, we have made every attempt to address these comments and clarify the novelty as well as strength of the results presented in the manuscript. Please find below the comments from the reviewers in black and our response to the reviewers in blue. The text/analysis changes we have made in the manuscript to respond to the reviewer comments are provided in orange in this document (as well as in the manuscript

We gently restrained the arm the monkey was not using with a plastic tube and cloth sling

2. Figure 1C: Generally, the negative coherence (more green) should be to the left of the more positive coherence.

We thank the reviewer for this suggestion and have now changed Figure 1C.

3. line 159: "than, the" remove comma,
line 291: "Both left, and right" remove comma

We have now fixed these punctuation errors in the revised manuscript.

As the reviewer has recognized, this is a difficult problem. As per the suggestion of the reviewer, we examined the Bayes Factors provided by the Rouder toolbox and found that our index and method of separation into these neural populations was quite robust (Supp. Fig. 3b). The Bayes Factors for the perimovement groups were significantly different from the Bayes Factors for the increased and decreased units (Supp. Fig. 3b). We found minimal overlap between the perimovement and decreased groups in terms of Bayes Factors, ruling out significant contamination of the perimovement group by the decreased units and vice versa. We have now included the following text in the methods section.
The perimovement units were defined as the ones with non-significant indices. We were concerned that some of the decreased units, which had smaller values of the index, could be mistakenly classified as perimovement units and vice versa. To address this concern, we used the Bayes Factor method, which provides the ratio of the likelihood of two competing hypotheses or models (ref. 68) and is typically interpreted as evidence for one model over the other. In our case the ratio is between classifying a unit as decreased (or increased, H1) vs. perimovement (H0) (ref. 68). A large value of Bayes Factor suggests that model H1 is more likely than the model H0 and in our case would provide strong support that the unit is correctly classified as increased or decreased. In contrast, a low Bayes Factor would suggest strong evidence for H0 and support the classification of the unit as perimovement in nature. We examined the Bayes Factors computed for the different broad unit categories and found that the Bayes Factors for the units classified as perimovement based on the visuomotor index were very low, suggesting that they were correctly classified (Supp. Fig. 3b). The Bayes Factors computed for perimovement units also had very little overlap with Bayes Factors for both the decreased and increased units (Supp. Fig. 3b)

usually defined as a one dimensional mechanism for individual neurons, and it is unclear if in PMd, a rise to threshold mechanism is operating, even in simple tasks that do not require decision-making (refs. 2, 3). We believe it would be somewhat convoluted to argue for a rise-to-threshold mechanism in PMd for decision-making but not in simple delayed-reach tasks.
We 152 also emphasize that in this figure we are plotting the choice selective signal which is the 153 difference in firing rates and thus cannot rule out the presence or absence of a threshold per 154 se which in principle would operate at the level of firing rates. Finally, all three neuronal 155 types seem to show very similar properties. Whether the neurons actually rise to a threshold 156 does not materially change the conclusions we make here in this paper about laminar 157 differences in firing rate profiles in PMd during perceptual decisions. 158 159 The revised section from the paper is provided below. 160 sign-test to compare median slopes for the average FR in -100 ms to move as a function of 168 color coherence and the slopes for a shuffled curve, increased: 0.30±0.24 spks/s 2 /100% 169 color coherence, p < 0.1; decreased: 0.49±0.33 spks/s 2 /100% color coherence, p < .16; 170 perimovement: 0.54±0.25 spks/s 2 /100% color coherence, p < 0.09; Figs. 3g, h, Supp. Fig.  171 4j). These results were again observed in both monkeys (Supp. Fig. 4j,  In their resubmission, the authors have performed extensive and thorough additional 192

Magnitude of choice selective signal does not depend on coherence at the
analyses, which address all major concerns raised in the initial review. This has greatly 193 clarified and strengthened the manuscript. I now find this manuscript suitable for 194 publication. 195 196 We thank reviewer 2 for the kind words and feedback on the revised manuscript and are 197 delighted that the manuscript has been recommended for publication. 198 199 200 201 Thank you for this suggestion. We have now moved the entire section on the DDM vs. 249 UGM to the supplementary methods. We have also moved sections that use alternative 250 analyses to examine these FRs to the supplementary materials. Please find the text we wrote 251 in the manuscript below. 252 253 We also further investigated the behavior of the monkeys by fitting the RT 254 distributions and accuracy using the drift-diffusion (DDM) and urgency gating models 255 (UGM) that have been developed to explain behavior in two alternative forced choice tasks 54, 256 55, 56, 57, 58 (Supp. Note 1 and Supp. Figs. 1,2). We performed this model-fitting analysis to 257 identify if these candidate computational frameworks could help us interpret decision-related 258 responses in PMd, and if the behavior was better explained by the DDM, estimate decision 259 times for the monkeys. In comparison to the well-studied random dots discrimination task 3, 260 59 , quantitative modeling of how monkeys perform discrimination of static stimuli such as 261 the checkerboard used here 55, 56 is lacking 55, 60 . We found that, while both the UGM and the 262 DDM provided very reasonable fits, neither model was completely sufficient to describe the 263 RT and accuracy of the monkeys performing discrimination of static checkerboard stimuli 264 used here (Supp . Figs 1 and 2). The UGM with an intercept and slope term provided the 265 best fit of all three models considered here 13 . 
Our results here further highlight the increasing 266 realization that differentiating between these models of decision-making behavior using 267 purely statistical techniques is currently very difficult 55, 56, 57, 58, 61, 62 -explicit stimulus 268 manipulations are necessary. Additional elaboration of these models is likely needed to better 269 describe the behavior of these monkeys in this static checkerboard discrimination task 63 . 270 Choice selectivity is distributed in a continuum in this PMd neural population and is 271 not well described as clusters of high and low choice selectivity (Supp. Note 2, Supp. Fig.  272 4h). One concern is that the visuomotor index we use to partition and understand our large 273 dataset of neurons is too simplistic and collapses over many key features of the data. To 274 ensure that increased, decreased and perimovement units were not a spurious artifact 275 specific to the index that we developed here, we also tested if our results were consistent 276 when applying other methods for describing a visuomotor continuum 12, 40 as well as as a We now note in the results and the methods section that none of the conclusions 343 change when excluding the multiunits from key analyses and have now created a new 344 supplementary note (Supp. Note 5) and a Supplementary Figure (Supp. Fig. 13) that details 345 several key results (dependence on coherence, discrimination time vs. RT, discrimination 346 time for the three broad neuronal categories, scatter plot of discrimination time vs. index, 347 laminar distribution of the visuomotor index, and discrimination time as a function of 348 cortical depth) for just the single units. 349 350 Methods statement: 351 352 All main claims made in the paper were unchanged when only restricting the analysis 353 to single neurons (Supp. Fig. 13  We also confirmed that our results were not influenced by the inclusion of the multi-units in 358 the database. 
We examined these effects in four different key analyses that we report in the main results. We focused on the results describing the dependence of slopes on coherence, discrimination time, laminar distribution of the index and discrimination time as a function of cortical depth. None of the conclusions change when only considering the isolated single units (Supp. Fig. 13a-f).

First, the dependence on coherence was stronger for the increased compared to the decreased and perimovement neurons (Supp. Fig. 13a Spearman's r=0.36, p < 3.47e-25; Increased vs. Decreased: ranksum p=1.23e-10, increased vs. perimove: p < 8.91e-12). Third, the visuomotor index again showed the same dependence as a function of cortical depth for both single neurons and multi-units (Supp. Fig. 13e, goodness of fit for single neurons: R² = 0.88, p < 2e-3, 1000 shuffles). Finally, the discrimination time also increases as a function of cortical depth (Supp. Fig. 13f

We thank you for alerting us to this typographical error. This is fixed in the revised manuscript.