Flexible timing by temporal scaling of cortical responses

Musicians can perform at different tempos, speakers can control the cadence of their speech, and children can flexibly vary their temporal expectations of events. To understand the neural basis of such flexibility, we recorded from the medial frontal cortex of nonhuman primates trained to produce different time intervals with different effectors. Neural responses were heterogeneous, nonlinear and complex, and exhibited a remarkable form of temporal invariance: firing rate profiles were temporally scaled to match the produced intervals. Recording from downstream neurons in the caudate and thalamic neurons projecting to the medial frontal cortex indicated that this phenomenon originates within cortical networks. Recurrent neural network models trained to perform the task revealed that temporal scaling emerges from nonlinearities in the network and degree of scaling is controlled by the strength of external input. These findings demonstrate a simple and general mechanism for conferring temporal flexibility upon sensorimotor and cognitive functions.

M ental capacities such as anticipation, motor coordination, deliberation, and imagination lie at the heart of higher brain function. A fundamental feature of these capacities is that they are not tied to immediate sensory or motor events and unfold at different timescales. To support such temporal flexibility, the brain must control the dynamics of ongoing patterns of neural activity. An example of such flexible behavior is the control of selfinitiated movements. Humans can precisely control the timing of their movements and can make rapid adjustments based on instruction. However, the mechanisms that confer such flexibility are not well understood.
We investigated the neural mechanisms underlying flexible temporal control. We developed a task in which monkeys were instructed to produce different time intervals using different effectors. While monkeys performed the task, we evaluated the causal function and signaling properties of neurons across three brain areas that have been strongly implicated in timing: (i) the medial frontal cortex (MFC), which has been implicated in the inhibition 1 , initiation 2, 3 , and coordination 4-7 of movements, (ii) the caudate nucleus downstream of MFC, which is thought to play a major role in timing tasks [8][9][10][11][12][13][14][15] , and (iii) thalamic regions that project to MFC and causally influence self-initiated movements 16 .
Neurons exhibited a diversity of complex response profiles that could not be reconciled with dominant models of timing 13 , including clock-accumulator models 17,18 , oscillation-based models 19 , and population clock models 20,21 . Instead, responses were unified under a general principle of temporal scaling that was evident at both individual and population levels. Specifically, when animals produced longer intervals, the population activity evolved along an invariant neural trajectory but at a slower speed. Notably, speed was adjusted on a trial-by-trial basis and in accordance with the instruction provided to the animal. Although these findings are at odds with classic models of timing, they corroborate observations of temporal scaling in other tasks and areas 8,[22][23][24][25] .
To investigate the mechanisms underlying such flexible speed control, we analyzed the dynamics of recurrent neural network mod-els capable of using graded input to produce different time intervals. Analysis of these models revealed a previously uncharacterized yet simple mechanism for flexible temporal scaling: degree of scaling was controlled by an external input acting upon the nonlinear activation function of individual neurons in a recurrent network.
Temporal scaling of complex response profiles. To estimate each neuron's firing rate, we binned trials based on Tp and computed average spike counts after aligning trials to the time of the motor response (Fig. 2c). Across neurons, response profiles were highly heterogeneous and included linear, nonlinear, monotonic, nonmonotonic, and multimodal activity profiles (Fig. 2d). We tested each neuron's activity profile against predictions of various models of motor timing using a cross-validation procedure (Fig. 2e). We considered three variants of the clock-accumulator model: one in which flexible timing was achieved by adjusting a threshold over a ramping process, one in which the clock was adjusted, and one in which both were adjusted. Since clock models can only accommodate neurons with linear ramping profiles 17,18,[28][29][30] , they failed to capture the nonlinear profiles exhibited by the majority of neurons in the population. Cross-validated polynomial fits of different degrees of freedom indicated that only 11% (47 of 416) of responses increased linearly; the rest were explained by higherorder polynomials. This number increased by only 4% when the   Animals produced either an 800-ms (Short) or a 1,500-ms (Long) time interval, either by making a saccade (Eye) or a button press (Hand). These four conditions were randomly interleaved and were cued throughout the trial by the color and shape of two central stimuli, a circular fixation spot for Eye or a square fixation spot that cued the animal to place its hand on a button. The colored shape (circle or square) cued the effector, and the hue (red or blue) cued the desired interval (red, Short; blue, Long). After a random delay, a white circle was flashed to the left or right of the fixation point. This peripheral flash specified the saccadic target for the eye trials and played no role in the hand trials. After another random delay, a Set cue (a ring flashed around the fixation stimuli) initiated the motor timing epoch. The animal's production interval (Tp) was measured as the interval between Set and when either the saccade was made or the button was pressed. When Tp was generated with the desired effector and was within a specified reward window, the peripheral target (or square fixation) turned green, auditory feedback was provided, and the animal received a juice reward. The reward window was adjusted adaptively on a trial-by-trial basis and independently for the Short and Long conditions so that the animal received reward on approximately 50% of trials for both interval context on every session (on average, 57% for monkey A and 51% for monkey D). The reward magnitude increased linearly with accuracy as shown by the green triangular reward function. starting and terminating points of the linear ramps were allowed to vary by up to 200 ms. We also tested two oscillation-based models of interval timing, in which the response time is determined by the collective phase of oscillators and different frequencies 19 . In one variant, a single sinusoid was fit to the response of each neuron, and in another, multiple sinusoids (up to four) of different frequencies were used. These models were also unable to capture the diversity of MFC responses (Fig. 2e).
Finally, we tested MFC responses against a simple variant of the population-clock model 20,21 , in which the response profile of each individual neuron is unique and context-independent, and the collective activity of the population determines movement initiation time. Accordingly, we modeled each neuron by the best-fitting polynomial (cross-validated) that captured the activity across both the Short and Long contexts. This model performed better than the clock-accumulator and oscillation models. However, MFC data violated a key qualitative prediction of the population clock model: unlike in the population clock model, the vast majority of MFC responses differed for the Short and Long conditions from early on after the Set cue (Fig. 2d).
Our initial inspection indicated that response profiles were selfsimilar when stretched or compressed in accordance with the produced interval (Fig. 2c,d and Supplementary Fig. 1). This was true for both random fluctuations of Tp within each temporal context (i.e., 800 ms or 1,500 ms) and deliberate adjustments of Tp across the two contexts. Consistent with this observation, a temporally scaled polynomial function fitted to the data for different conditions clearly outperformed all other models in terms of explanatory power ( Fig. 2e; one-way ANOVA, F 6, 2,859 = 125.2, P < 0.001).
Speed control across the population. We quantified the degree of scaling by a scaling index (SI) that was computed as a coefficient of determination (R 2 ) across temporally scaled responses associated with different Tp bins. This analysis revealed a wide range of SI values across the population ( Supplementary Fig. 1a). When activity of a population of neurons is plotted in a coordinate system in which each axis represents the firing rate of one neuron, also known as the state space, the response dynamics of the population can be depicted as a high-dimensional neural trajectory. In this representation, perfect temporal scaling would result in perfectly overlapping neural trajectories evolving at different speeds. When we plotted MFC neural trajectories within the space spanned by the first three principal components (PCs) of neural activity, responses did not overlap perfectly, indicating that MFC responses comprised a mixture of scaling and nonscaling signals (Fig. 3a), which was also evident from the distribution of SI values across individual neurons ( Supplementary Fig. 1a). Ts is the desired interval) computed from minisession (randomly sampled subsets of trials without replacement; see Methods) before and after the injection of muscimol (above) and saline (below) for the two intervals (red, Short; blue, Long) and two effectors (circle, Eye; square, Hand). The white-over-black bar graphs partition MSE to bias 2 (black) and variance (white). Significance tests correspond to comparisons of MSE (see Table 1 for details) across minisessions (n, number of minisessions; **P < 0.001; N.S., not significant). c, Average firing rates were computed after aligning spike times to movement initiation time. Top: raster plot of spike times (black ticks) for an example neuron aligned to movement initiation time (dashed line) across trials (rows). Trials were sorted and grouped into bins according to the produced interval (Tp). box, 25th to 75th percentiles; whiskers, ± 1.5 × the interquartile range; dots, neurons whose R 2 values lie outside whiskers). The temporal scaling model (top) had the highest explanatory power (R 2 ) across models (one-way ANOVA, F 6, 2,859 = 125.2, P < 0.001; one-tailed paired-sample t test between temporal scaling and population-clock model, n = 416, t 415 = 6.32, ***P < 0.001). Models were cross-validated.
We hypothesized that perfect scaling might be found within a subspace of the population activity, i.e., a scaling subspace (Fig. 3b). As a first step, we examined the degree of scaling in the first few PCs. Using the same SI metric used for single neurons, we found that the first two PCs that explained nearly 40% of the variance (Fig. 3b) had scaling indices of 0.91 and 0.97, respectively (Fig. 3a). The third PC, however, did not exhibit temporal scaling and had a SI of 0.20. This provided initial evidence that certain high-variance dimensions in the state space exhibit strong scaling. However, scaling dimensions need not coincide with PCs, since PCs correspond to dimensions of maximum variance, not maximum scaling. To identify the scaling dimensions, we developed a dimensionality reduction technique that furnished a set of scaling components (SCs) that were ordered according to the degree of scaling in the data (see Methods).
The SI values for the first few SCs were relatively large, indicating that the optimization process correctly identified the scaling dimensions ( Fig. 3c and Supplementary Fig. 2). Because SCs were crossvalidated, the scaling index for SCs of the test data did not follow a strictly decreasing order, although this was the case for the dataset used to determine the SCs (data not shown). Responses projected Projections of neural responses onto a scaling subspace result in overlapping trajectories (purple) whose speed determines the produced interval, fast for Short (red) and slow for Long (blue). Bottom: cumulative percentage variance (cum. var.) explained by PCs and SCs. c, Top: population activity sorted according to produced interval (Tp) bins and projected onto the first three SCs. As expected, in this subspace, the trajectories overlap. Bottom: the first three SCs with their corresponding SI values. Because of cross-validation, SIs were not in decreasing order (see text). d, Variance (var) explained for individual SCs as a function of SI. SCs with the larger SI explain a large percentage of variance for both Hand (square) and Eye (circle) conditions. Inset: variance explained as a function of SI derived along 200 random one-dimensional projections of MFC activity in the state space. Individual projections were binned and pseudocolored to indicate the frequency of occurrence. The data show that high scaling indices are associated with high variance explained. e, Comparison of SI in the MFC, caudate, and thalamus with surrogate data generated from three Gaussian process models that were constrained to match the observed response profiles with increasing levels of sophistication (Supplementary Note and Supplementary Fig. 3). Inset: the hypothesis space in relation to various constraints and their combinations, with distinct colors and their overlaps. Perfect scaling (middle ellipse) is a subset of the possibilities that satisfy all four constraints. Each model consisted of the same number of neurons as in the MFC data, and the number of bootstrapped samples for each model was n = 200. The plot shows the average SI across all SCs computed from bootstraps (small circles), along with the corresponding means (vertical lines) for each of three brain areas and each of the surrogate models. The average SI for each surrogate model was significantly lower than the values associated with the MFC and caudate but not thalamus (N.S., not significant; ***P < 0.001). f, The speed of neural trajectory within the scaling subspace spanned by the first three SCs predicted average Tp values across bins. The relationship between speed and Tp was fit to a linear log-log function. The scaling subspace was computed from training data (arrows, two Tp bins) and used to evaluate speed on the remaining test data (14 Tp bins). R 2 was computed by repeating the procedure using bootstrapping (n = 10). Both axes are in log scale.
onto the subspace spanned by the first three SCs traced nearly identical trajectories that evolved at different speeds ( Fig. 3c), which is precisely what is expected in the scaling subspace. Next, we asked how much variance in the neural data the scaling subspace could account for. Ordered SCs explained less variance than the corresponding PCs, suggesting that the scaling dimensions were not identical to PC dimensions (Fig. 3c). To better quantify the relationship between scaling and variance explained, we performed two complementary analyses. First, we examined the relationship between SI and variance explained for each SC. This analysis provided initial evidence that SCs with large SIs explained a relatively large percentage of variance (Fig. 3d). Second, we developed a procedure for quantifying the relationship between scaling and variance without relying on projections onto specific directions, such as PCs or SCs. We used a bootstrap procedure and quantified the relationship between variance explained and SI along 200 random projections in the state space. We then constructed a two-dimensional probability distribution of the relationship between variance explained and SI across those random projections (Fig. 3d). This analysis verified that the dimensions with large degrees of scaling also explained a large portion of the variance.
To validate SI as a reliable metric for scaling, we quantified SI for surrogate data created from Gaussian processes. The surrogate data was constructed to statistically match MFC responses in terms of smoothness, starting and terminal firing rates, dimensionality, and the correlation between Short and Long activity profiles, but it was not constrained to exhibit temporal scaling (see Supplementary Note and Supplementary Fig. 3). The surrogate data, despite being matched to the statistics of MFC responses, had smaller SIs than those computed for MFC neurons (Fig. 3e). This verified that a significant portion of variance in MFC resides within a scaling subspace in which activity evolves along invariant trajectories at different speeds.
Finally, we quantified the relationship between speed in the scaling subspace and behavior. Using cross-validation, we derived the scaling subspace from a subset of shortest and longest trials and asked whether the speed of neural trajectories of the remaining trials in that subspace could predict Tp. Results indicated that longer Tps were associated with slower speeds (Fig. 3f and Supplementary  Fig. 4) and that the average speed was inversely proportional to Tp (R 2 = 0.87). These results suggest that the brain controls the speed of neural trajectories in order to flexibly produce different time intervals. Notably, this speed control seemed to explain both behavioral variability within each temporal context and flexible switching between the two contexts.

Speed control across cortico-basal ganglia circuits.
Having established speed control in MFC as a potential mechanism for temporal flexibility, we asked whether this property was also present downstream of MFC in the basal ganglia. We focused on a region of the caudate that is thought to receive direct input from MFC 31,32 (Fig. 4a,b). First, we used reversible inactivation to verify the causal involvement of this region in the task (Fig. 4b and Table 1). Afterwards, we recorded from individual neurons (Fig. 4c) and analyzed their responses with respect to the temporal scaling property. Caudate responses, like those in MFC, were complex and heterogeneous and had different profiles for Short and Long trials. At the level of single neurons, the degree of scaling in the caudate was similar to that in MFC ( Supplementary Fig. 1). At the population level, analysis of PCs and SCs verified the presence of a scaling subspace in the caudate ( Fig. 3e and Supplementary Fig. 5). Finally, the SI values of PCs, as well as an unbiased analysis of responses across random projections in the state space, indicated that dimensions with strong scaling explained a large part of variance in the data (Fig. 4d). These analyses verified that neural signals in the caudate shared the same key properties with MFC and could contribute to subspace speed control.
In addition to receiving inputs from MFC, the basal ganglia also projects back to MFC through the thalamus. The presence of this anatomical substrate raises the possibility that MFC inherits temporal scaling from the basal ganglia via transthalamic projections. To test this possibility, we targeted a region of the thalamus where MFC-projecting thalamocortical neurons were identified antidromically ( Fig. 4e and see Methods). Consistent with previous work 16 , reversible inactivation strongly influenced animals' timing behavior (Table 1). However, several observations indicated that the function of thalamocortical signals was different from the functions of caudate and MFC signals (Fig. 4g). First, SIs of single thalamic neurons (n thalamus = 846) were significantly smaller across the population compared to the other areas (n MFC = 416 and n caudate = 278, Mann-Whitney-Wilcoxon test, W 1,260 = 310,733, z = 7.89, P < 0.001 for MFC; W 1,120 = 189,163, z = 6.98, P < 0.001 for caudate; Supplementary Fig. 1a). Second, scaling in the thalamus was significantly smaller than the C+D+E+S surrogate data that matched the neural data in terms of smoothness (S), endpoints (E), dimensionality (D), and correlation (C; one-tailed two-sample t test, n = 200, t 398 = 35.2, P < 0.001; Fig. 3e). Third, scaling was less prominent in the thalamus, as indicated by the relationship between the magnitude of scaling and variance explained along random projections in the state space (Fig. 4h). Fourth, unlike in the caudate and MFC, neural trajectories in the thalamus were not invariant in the space spanned by the first three SCs ( Supplementary Fig. 5). This was also evident in the profile of the second PC, which systematically changed in average value, as opposed to scaling. Together, these observations provide strong evidence that thalamic neurons exhibit significantly less scaling than the MFC neurons they project to. Since the output of the basal ganglia to cortex is routed through the thalamus, the weak scaling in thalamocortical neurons implies that scaling may originate within MFC or in other cortical circuits projecting to MFC.
A model for flexible subspace speed control. Since the timescales of MFC response modulations were slower than the intrinsic time constants of single neurons, we assumed that the observed dynamics were the result of network-level interactions. Motivated by recent advances in understanding the dynamics of cortical population activity using network models 33-35 , we used a recurrent neural network model to investigate the potential underlying mechanisms  of speed control (Fig. 5). The model received a context input (Cue) whose magnitude specified the desired interval and a transient pulse (Set) that cued the start of the interval (Fig. 5a). The network was trained so that its output (a weighted linear sum of its units) had to breach a fixed threshold at the desired time 36 .
The network learned to generate the desired output function (Fig. 5d), and the activity of model neurons emulated the key features observed in MFC: response profiles of individual network units were heterogeneous, complex, and temporally scaled (Fig. 5b). Moreover, the speed of population dynamics directly determined the produced interval (Fig. 5c). These observations were robust regardless of whether the training objective was linear, nonlinear, scaling, or nonscaling ( Supplementary Fig. 6). The scaling behavior also persisted when the Cue input was provided transiently (Supplementary Fig. 6). Motivated by the robustness and generality of these results, we reverse-engineered the networks to investigate the underlying mechanisms of temporal scaling 37 .
Temporal scaling could be explained in terms of a pair of inputdependent stable fixed points, F init and F terminal . At the start of the trial, the Cue initialized the state of the network to an inital fixed point, F init . Activation of the Set pulse drove the system away from F init , allowing the system to evolve toward F terminal with a speed that was determined by the magnitude of the Cue input (Fig. 5c,e). Within the network, the input and the recurrent dynamics played complementary roles (Fig. 5c). The input specified the position of the initial and terminal fixed points along a direction, which we refer to as the input subspace. Recurrent dynamics on the other hand, established a recurrent subspace, which determined the neural trajectory between these fixed points. These two subspaces emerged from different components of the network. The input subspace was governed by the direction specified by the input weights. In contrast, the recurrent subspace emerged from the constraints imposed by the recurrent weights. The two subspaces also differed in terms of their relationship to the scaling phenomenon. Within the input subspace, different intervals were associated with changes in the level of activity but did not exhibit scaling. This change in level controlled the speed by setting the position of the neural state along the axis of the input subspace. The recurrent space, on the other hand, did not control the speed but was responsible for the emergence of invariant trajectories and temporal scaling.
The division of labor between these subspaces provides a simple explanation of why scaling and nonscaling signals might coexist within the same network. Nonscaling signals reflect the input that sets the speed, and scaling signals correspond to the evolution of activity with the desired speed. This organization predicts that MFC neurons with weak temporal scaling are likely recipients of relatively strong context-dependent input, possibly derived from signals in upstream thalamic neurons (Fig. 4g), and that neurons with strong temporal scaling are more directly engaged in recurrent interactions. Finally, the model-based distinction between these two subspaces provides a theoretical basis for analyzing MFC responses within a scaling subspace that corresponds to the recurrent subspace in the model. Notably, the model allows us to infer that within the nonscaling input subspace, production times should be correlated with the average level-not speed-of neural activity. To test this prediction, we investigated whether Tp could be predicted by the nonscaling component of MFC activity. We inferred the least-scaling  Fig. 2b. c, Activity profile of three example caudate neurons (same format as in Fig. 2d). d, Top: the relationship between variance (var) explained and SI in the caudate (same format as the inset in Fig. 3d). Bottom: the first three PCs with their corresponding SI values. e, As in a but showing the region of interest in the thalamus. We recorded from neurons in the region where MFC-projecting neurons were identified antidromically. Inset: example of reliable and low-latency spikes detected after antidromic stimulation. f-h, Inactivation, electrophysiology, and temporal scaling in the thalamus (format as in b-d).
Responses in the thalamus are qualitatively different from the caudate (d) and MFC (Fig. 3d) in that most projections in the state space do not exhibit temporal scaling. N.S., not significant; *P < 0.05; ***P < 0.001. direction from our scaling component analysis. SCs specified an orthonormal basis whose axes were ordered according to the level of scaling (Supplementary Fig. 7). Therefore, we used the last SC (SC9) as an estimate of the least-scaling direction and compared Tp to average MFC activity projected onto SC9. As predicted by the model, the average activities of the nonscaling components of MFC were indeed predictive of Tp ( Supplementary Fig. 8). This is a compelling result, as it bears out a key prediction about an unsuspected relationship between cortical activity and behavior made by a model that was constrained only to perform the task.
A potential neural mechanisms for speed control. To further investigate the role of input in speed control, we analyzed the eigenvalues of the system near F terminal . In the vicinity of this fixed point, stronger inputs caused the eigenvalues to decrease systematically (Fig. 5f). In a linear dynamical system, such contraction in the eigenvalue spectrum corresponds to a systematic increase in the network's effective time constants, τ eff (Fig. 5f). From this, we concluded that the action exerted by the input is equivalent to adjusting the system's effective time constant in a flexible input-dependent manner.
To gain insight into the mechanism that provides such powerful and modular control of time constants, we focused on a simplified model composed of only two mutually inhibitory neurons with a common input (Fig. 6a and Supplementary Note). Previous work has demonstrated that adjustments of the common input in this model could alter its recurrent dynamics to either relax to a single fixed point with a specific time constant or act as an integrator with exceedingly long time constants 38 . We reasoned that exploring the model's behavior while between these two regimes might lead us to a mechanistic understanding of how the effective time constant of a network can be flexibly adjusted.
In the presence of balanced input (Cue), the two-neuron model is associated with an energy landscape that engenders a pair of stable fixed points, similarly to the recurrent model (Fig. 6b). We analyzed the phase plane of the model (Fig. 6c) and verified that the input level can be used to create a continuum of τ eff . This is analogous to the recurrent network model in which activity along the input subspace served to control the speed. However, the two-neuron model helped us understand the underlying mechanisms: stronger input drives neurons toward their saturating nonlinearity, where  N  (0, 1 ). Decreasing the gain values from g = 1.0 (red) to g = 0.57 (blue) progressively decreases the magnitude of the eigenvalues and increases the effective time constants τ eff = τ/g. g, Units in the RNN model were sorted based on their maximal activity when the network was near F terminal . The plot shows maximum activity as a function of Cue input. Vertical arrows mark two neurons, one with positive and one with negative activity, plotted in h. h, Stronger input drives units toward the saturation point of their nonlinear activation function where the shallowness of slopes leads to reduced gain of neural activity. This is true both for units with a positive response whose responses increased with Cue input (right) and for units with a negative response whose responses decreased with input drive (left). In all plots, different colors correspond to different intervals, as shown by the color bar.
the slopes of activation functions are shallower (Fig. 6d). Shallower slopes correspond to smaller derivatives and larger values of τ eff . In other words, the presence of single-neuron nonlinearities provides a reservoir of slopes that an input can exploit to control the network's energy gradients (Fig. 6b).
Having established a low-level mechanism in the two-neuron model, we asked whether the same mechanism was operative in the recurrent network model. For the recurrent model, we analyzed the operating points of units as a function of the input drive near F terminal . Notably, for stronger inputs, units were systematically driven further toward their saturating nonlinearity (Fig. 5g,h), which is consistent with the mechanism of speed control in the simple network model. These results underscore a simple and powerful mechanism at the level of single neurons for controlling the speed of dynamics independent of the neural trajectory.

Discussion
We found that flexible motor timing was governed by controlling the speed of slow dynamics across populations of MFC and caudate neurons. Speed control also emerged as a natural solution in recurrent network models trained to produce different time intervals. This was achieved by an input that drove the system to the appropriate region of the state space, where recurrent interactions unfolded at desired speeds. In both systems, fluctuations of speed predicted variability within each temporal context, and systematic adjustments of speed provided the means for flexible control of timing. These results suggest that the brain uses a speed-control mechanism to deliberately control movement initiation time.
The division of labor conferred by the input and recurrent interactions has broad implications for flexible control of behavior, allowing the same motor and cognitive functions to unfold along the same neural trajectory at different timescales. For example, in decision-making tasks, adjustment of a speed command could explain how the brain might flexibly implement different speed-accuracy tradeoffs 39 . Indeed, if the speed command is controlled by a sensory input, our recurrent network would behave similarly to more detailed network models consisting of excitatory and inhibitory units that approximate temporal integration of sensory information 40 . However, biophysical models of decision making have not yet been extended to generate the diversity of scaling response profiles that we observed in vivo and in our recurrent model.
The engineered two-neuron model highlights the crucial role of single-neuron nonlinearities; adjustments of speed were governed by the interaction of input with these nonlinearities. This finding suggests that circuits and subcircuits could exploit different inputs and different biophysical properties to adjust speed independently and operate at different timescales. It also predicts that neuromodulatory effects and pharmacological treatments that interfere with the nonlinear response curve of individual neurons could alter the speed of cortical dynamics, as observations from numerous studies of interval timing might suggest 41 .
The source of the external input that adjusts the speed remains a pertinent and unresolved question. One possibility is that MFC receives this input directly from neurons in other cortical areas, which is consistent with recent observations in the parietal cortex 42 . Another possibility is that the input has a thalamocortical origin. Thalamic neurons, in turn, may inherit this signal from other cortical and/or subcortical regions. Neuromodulatory signals could also alter cortical dynamics. A number of physiology and pharmacology studies have implicated dopamine in regulating timing behavior 43,44 . Cortical dynamics are also known to depend on cellular properties, such as those mediated by NMDA receptors, which are thought to facilitate the generation of stable slow cortical dynamics 45 .
Another question for future work concerns the exact mechanisms that give rise to the diversity of response profiles in MFC. According to our model, this diversity emerges from recurrent interactions in direct response to an input drive. Alternatively, these activity patterns could be the result of cortical nonlinearities acting upon simpler ramping inputs, which constituted a minority of response profiles in the cortico-basal ganglia circuits we recorded from. Indeed, considering the bidirectional connections between thalamus and cortex, we cannot rule out the possibility that ramping activity in thalamus and/or other cortical areas might contribute to the scaling of more complex response profiles in MFC. Nevertheless, the model seems to provide the most parsimonious account of the data for both cortex and thalamus. The exact details of the signaling pathways, recurrent microcircuitry, and biophysical properties notwithstanding, the mechanisms that we have identified have the potential to explain how the brain flexibly controls the speed of cortical dynamics. The network has a bistable energy landscape whose gradients depend on the strength of the Cue input. Stronger inputs (blue) lead to shallower energy gradients, and vice versa (red). The Set pulse moves the state away from the initial fixed point (F init , filled circle) and over the saddle point (F saddle , open circle). The network then spontaneously moves toward the terminal fixed point (F terminal , filled circle). The speed of the movement toward F terminal is relatively slow when the energy gradient is shallow (blue) due to stronger common input. c, Phase plane analysis of the two-neuron model. The two axes on the lower left correspond to the activity of the two neurons (U and V). The input is applied to both units and thus drives the system along the diagonal (input subspace). The input level moves the sigmoidal nullclines of the two units (du/dt = 0, dashed; dv/dt = 0, solid; see Supplementary Note) and adjusts the location of the three fixed points (F init , F terminal , and the intermediate F saddle ). The figure shows the two nullclines and the corresponding fixed points for two inputs levels (red and blue). Activation of Set moves the system along a recurrent subspace, which is orthogonal to the input subspace. The proximity of nullclines (crosses below the input subspace) controls the speed. When the input is stronger, the nullclines are closer, which causes the system to become slower. d, Interaction of the input drive with the saturating nonlinearity of one unit. The action of the input upon the nonlinear activation functions moves the saddle point and controls the speed of the system. Stronger inputs push the neurons toward the shallower part of the nonlinear activation function and move the saddle point to slower regions of the phase plane, causing recurrent interactions to slow down.

Methods
Methods. Two adult rhesus monkeys (Macaca mulatta, a 6.5-kg female and a 9.0-kg male, both 5 years old) were trained on a two-interval two-effector motor timing task. All surgical, behavioral, and experimental procedures conformed to the guidelines of National Institutes of Health and were approved by the Committee of Animal Care at Massachusetts Institute of Technology.
Behavior. The MWorks software package (https://mworks.github.io/) running on a Mac Pro was used to deliver stimuli and to control behavioral contingencies. Visual stimuli were presented on a 23-inch (58.4-cm) monitor at a refresh rate of 60 Hz. Eye positions were tracked with an infrared camera (Eyelink 1000; SR Research Ltd, Ontario, Canada) and sampled at 1 kHz. A custom-made manual button, equipped with a trigger and a force sensor, was used to register button presses.
Motor timing task. Each trial began with the appearance of two fixation cues (FCs), a circle at the center of the screen and a square 0.5° below the circle. The animal had to shift its gaze to the circle, and the square informed the animal to hold its hand gently on the button. On each trial, one FC was colored and the other was white. The colored FC indicated the desired response effector (colored circle for saccade and colored square for button press). The color indicated the desired interval (red for 800 ms and blue for 1,500 ms). We denote these four trial conditions by EL, ES, HL, and HS, where E and H refer to Eye and Hand and S and L to Short (800-ms) and Long (1,500-ms) intervals. After a delay period (500-1,500 ms, uniform hazard), the saccade target was briefly presented 8° to the left or right of the FC. For button-press trials (colored square), the saccadic target was not relevant but was presented so that stimuli were consistent across trials. After another delay (500-1,500 ms, uniform hazard), a 48-ms annulus (Set cue) flashed around the FCs cued the animal to start timing. Trials were aborted if the animal made premature eye or hand movements (before Set or long before the desired time). To receive the reward, animals had to initiate a movement with the desired effector (cued by the colored FC) within a small window ('acceptance window') around the desired interval (cued by the color of FC). The saccade responses had to land inside a circular window of radius 2.5° centered on the location of the extinguished target and had to be made directly (less than 33 ms after exiting the FC window). Button-press responses had to be made with the hand contralateral to the recorded hemifield 46 . The production interval was measured from the endpoint of Set to the moment the saccade was initiated or the button was triggered. The width of the acceptance window was adjusted dynamically on a trial-by-trial basis and independently for the Short and Long conditions using a one-up-one-down staircase procedure. As such, animals were rewarded for nearly half of trials (on average, 57% in monkey A and 51% in monkey D) for both temporal contexts. For trials that were rewarded, in addition to reward delivery, the color of the stimulus changed to green and an auditory clicking sound was simultaneously presented. Within the acceptance window, the magnitude of the reward scaled with accuracy.
Electrophysiology. Animals were comfortably seated in a dark and quiet room. Each session began with an approximately 10-min warm-up period to allow animals to recalibrate their timing and exhibit stable behavior during electrophysiology recordings. Recordings were made through a craniotomy within a recording chamber while the animal's head was immobilized. Structural MRI scans were used to aid in targeting regions of interest. Single-and multiunit responses were recorded using a 24-channel laminar probe with 100-µ m or 200-µ m interelectrode spacing (V-probe, Plexon Inc.). Eye position was sampled at 1 kHz, and all behavioral and electrophysiological data were time-stamped at 30 kHz and streamed to a data acquisition system (OpenEphys).
The dataset collected for this study included 1,967 single units or multiunits recorded from the MFC, caudate and thalamus of two monkeys (Table 2), in which 69% (1,351/1,967) were tentatively single units. Neurons with firing rates less than 2 spikes per s during the timing epoch were excluded from subsequent analyses.

Reversible inactivation.
Injections were made with a microinjection pump (UMP3, World Precision Instruments) and a Hamilton syringe, which was connected to a custom 30 G stainless steel injection cannula via a fused silica injection line (365-µ m OD, 100-µ m ID, Polymicro Technologies). In each injection session, we first established the animal's baseline behavioral performance. Afterwards, we pressure-injected muscimol hydrobromide (5 µ g/µ L in saline) in the region of interest at a rate of 0.2 µ L/min. In the MFC and caudate, a total of 2 µ L was injected per session. In pilot inactivation experiments in the thalamus, we noticed that animals stopped performing the task after 2 µ L muscimol injection. To ensure animals would perform the task, the total volume of muscimol in the thalamus was reduced to 1.5 µ L. The behavioral task was resumed 10 min after the injection was completed. As a control, in separate sessions, sterile saline was injected following the same procedure. The experimental data consisted of unequal test sessions for muscimol and saline, and unequal numbers of trials in the before and after muscimol injection. For statistical comparison, these inequalities may introduce sampling biases. To avoid such biases, we created 50-trial minisessions from before and after the injections, in which the trials within a minisession were randomly sampled. The sampling was made without repeats to ensure trials were not counted twice. We quantified the effects of inactivation by comparing mean squared error, bias and variance, , before and after the injection for every minisession. The same procedure was used to assess the results of the saline injection experiments.
Antidromic stimulation. We used antidromic stimulation to localize thalamocortical MFC-projecting neurons. Antidromic spikes were recorded on a 24-channel electrode (V-probe, Plexon Inc.) in response to a single biphasic pulse of duration 0.2 ms (current < 500 µ A) delivered to MFC via low-impedance tungsten microelectrodes (100-500 kΩ , Microprobes). The guide tube for the tungsten electrode was used as the return path for the stimulation current. Antidromic activation evoked spikes reliably at a latency ranging from 1.8 to 3 ms, with less than 0.2 ms jitter. The region of interest targeted in the thalamus was within 1 mm of antidromically identified neurons. f ( )) regardless of whether they were applied to scalars or vectors. Data analysis. All offline data processing and analyses were performed in Matlab (2016b, MathWorks). Spiking data were bandpass-filtered between 300 Hz and 7 kHz, and spike waveforms were detected at a threshold that was typically set to 3 × the RMS noise. Single units and multiunits were sorted offline using custom software, MKsort (https://github.com/ripple-neuro/mksort). The majority of the neurons were recorded in separate behavior sessions.
Estimating firing rates accurately is challenging when rates change dynamically and trials have different durations 47,48 , which was the case in our data. Since our focus was on firing rates leading up to the movement, we aligned trials with respect to movement time (Fig. 2c). Additionally, for each condition, we discarded trials with Tp values more than 3 s.d. away from the mean (1.46% of trials). Firing rates were estimated by (i) averaging spike counts per time bin, (ii) using a 40-ms Gaussian kernel to compute smooth spiking density functions, and (iii) z-scoring to minimize sampling bias due to baseline and amplitude differences across neurons.
To examine the relationship between firing rates and Tp values, we binned trials according to Tp and compared average firing rates for each bin. For the 800ms interval, we used seven bins centered on 740 to 860 ms every 20 ms, and for the 1,500-ms interval, we used nine bins centered on 1,300 to 1,620 ms every 40 ms. We denoted the average firing rate of a neuron as a function of time by r t ( ), average firing rate for a specific condition c (EL, ES, HL or HS) by r t c ( ; ), and average firing rate for a specific condition and a specific Tp bin by r t c Tp ( ; , ). For population analyses, response vectors of individual neurons were organized into rows of a matrix denoted by r t c Tp ( ; , ). To test whether activity profiles could be described by a linear function (for example, ramping activity), we compared 0-order to 8th-order polynomial fits to r t ( ) using cross-validation with randomized train and test sets. All neurons that were best explained by a polynomial of order 0 or 1 were considered linear so long as the fit explained at least 50% of variance. We also applied the same procedure allowing up to 200 ms offset from the beginning or end of the timing interval to ensure our results were robust.
Comparing the motor timing models at the level of single-and multiunits.
To avoid overfitting and facilitate comparison of models with different levels of complexity, all model fitting was performed on the training set and the goodness

Replication
Describe whether the experimental findings were reliably reproduced.
Both animals were trained over a period of 2 years and performed thousands of trials. Results were highly consistent (reported in Fig. 1) Causal experiments were performed with multiple sessions and of trials. See Methods for details

Randomization
Describe how samples/organisms/participants were allocated into experimental groups.

Blinding
Describe whether the investigators were blinded to group allocation during data collection and/or analysis.
Note: all studies involving animals and/or human research participants must disclose whether blinding and randomization were used.

Statistical parameters
For all figures and tables that use statistical methods, confirm that the following items are present in relevant figure legends (or in the Methods section if additional space is needed).

n/a Confirmed
The exact sample size ( ) for each experimental group/condition, given as a discrete number and unit of measurement (animals, litters, cultures, etc.) A description of how samples were collected, noting whether measurements were taken from distinct samples or whether the same sample was measured repeatedly A statement indicating how many times each experiment was replicated The statistical test(s) used and whether they are one-or two-sided (note: only common tests should be described solely by name; more complex techniques should be described in the Methods section) A description of any assumptions or corrections, such as an adjustment for multiple comparisons Describe the software used to analyze the data in this study.
For manuscripts utilizing custom algorithms or software that are central to the paper but not yet described in the published literature, software must be made available to editors and reviewers upon request. We strongly encourage code deposition in a community repository (e.g. GitHub). guidance for providing algorithms and software for publication provides further information on this topic.

Materials and reagents
Policy information about availability of materials 8. Materials availability Indicate whether there are restrictions on availability of unique materials or if these materials are only available for distribution by a for-profit company.

Antibodies
Describe the antibodies used and how they were validated for use in the system under study (i.e. assay and species).