Introduction

Imagine you are driving on a busy highway and wish to change lanes. To safely do so, you need to mentally track the pattern of traffic behind you even when not looking into the rear-view mirror. Many everyday tasks require maintaining and updating beliefs about state variables that are not directly observable. This can be computationally hard, especially if the latent world states are continuous-valued (i.e., they take on a continuum of values) and dynamic (i.e., they vary in time); these properties are typically true in the real world1. Mechanisms underlying sensory perception and movement generation have been extensively investigated under a wide variety of conditions, such that we are converging on good computational models that are consistent with neural data2,3,4,5. In contrast, we do not understand how the intermediate, continuous-valued, time-varying, latent states - the stuff of thoughts - are represented in the brain, nor the mechanisms used to compute those states6,7. Filling this void is essential to building a complete picture of neural computations in the sensorimotor loop.

The past few decades have seen the emergence of two distinct approaches in the study of neural representation of latent world states. These have contributed significantly to our understanding in complementary ways. One approach, following the tradition of sensory neuroscience, uses binary decision-making tasks (e.g., motion direction discrimination) in which participants gradually integrate sensory evidence over time and then report one perceived outcome (e.g., dots moving to the left or right)8,9,10,11. The high degree of experimental control afforded by this paradigm has helped reveal a tight link between the neural activity in the posterior parietal cortex and the time course of decision variables that guide behavior12,13. However, because the latent world states themselves tend to be discrete and/or static in such tasks, it is difficult to fully extrapolate those insights to continuous, interactive behaviors where those latent states change continually as a consequence of one’s own actions. The alternative approach, which emerged from cognitive psychology, has sought to characterize neural correlates of continuously changing latent world states (e.g., position, heading of freely foraging animals)14,15. This has led to a rich description of neural maps in the hippocampal formation that can potentially be used for computing latent world states. However, because neither sensory input nor behavior is controlled in this approach, it is difficult to determine the precise relationship between neural activity and the animal’s momentary beliefs in such settings. To overcome these limitations, we combined the desirable elements of both approaches, using a task that was ecologically valid, yet well-defined and controllable. Our goal was threefold: (i) to characterize the neural representation of the repertoire of sensory, latent, and motor variables in a naturalistic closed-loop task featuring action-perception loops, (ii) to test whether the latent states computed by the neural population influence behavior, and (iii) to constrain the space of possible mechanisms that create the neural representation of the latent states.

We created a virtual environment in which monkeys used a joystick to steer to a transiently cued, random target location by integrating sparse optic flow cues16. To successfully perform the task, monkeys had to continuously update an internal estimate of the relative target location (the latent state) by integrating their own movement velocity inferred from the sparse optic flow cues. Brain regions in the posterior parietal cortex (PPC) have been implicated in various aspects of this computation such as optic flow processing17,18, working memory19,20,21, as well as planning of spatial movements22,23. Because we are primarily interested in understanding the mechanisms of latent state computation rather than optic flow processing per se, we wanted to record neural activity in a region within PPC that likely already receives abstract velocity signals, such that it may serve as the locus of latent state computation in our task. Several properties make area 7a a better candidate than other parts of PPC. First, anatomical tracing studies have consistently found a pattern of inter-areal connectivity that places area 7a at the top of the motion-processing (‘dorsal stream’) hierarchy24,25. Moreover, it is one of the few areas in PPC that directly projects to the hippocampal formation26,27, with lesions to area 7a affecting navigation performance28,29. Second, area 7a neurons are known to have large, bilateral receptive fields (15–25 degrees) and are activated by the full-field motion stimuli used in our VR environment30. Third, response properties of area 7a neurons indicate that they are capable of marginalizing away the influence of eye movements thereby representing visual inputs in a navigationally useful, non-retinotopic format at the population level31. Fourth, we confirmed in prior work under passive viewing conditions that neurons in area 7a indeed encode linear and angular velocity in an abstract format, regardless of stimulus modality18. Finally, previous work has shown that representation of cognitive variables in area 7a is clearly decoupled from the influence of sensory and motor variables32,33, whereas such decoupling has not been demonstrated elsewhere in PPC. Therefore, we simultaneously recorded from a large number of neurons from area 7a of PPC while monkeys performed this task.

We found that neural populations exhibited sequential activity during this task, and that coupling between neurons contributed substantially to the neural activity. Furthermore, single neurons carried information about sensory, latent, and motor variables, and latent world states decoded from the population activity were predictive of monkeys’ behavioral errors on individual trials. Finally, task manipulations that perturbed the world model dramatically altered both neuronal coupling and latent state tuning, but only minimally affected tuning to sensory and motor variables. These results suggest that PPC maintains dynamical beliefs about latent world states during naturalistic behaviors involving action/perception loops.

Results

Three monkeys performed a visual navigation task in which they used a joystick to steer to a transiently cued target location in a three-dimensional virtual reality (VR) environment without allocentric reference cues (i.e., stable landmarks) (Figs. 1a and S1a and “Methods”). Individual visual elements comprising the ground plane were visible only transiently and could not be used as landmarks. At the beginning of each trial, a circular target on the ground plane blinked briefly at a random location within the field of view, and then disappeared. The joystick controlled forward and angular velocities, allowing subjects to steer freely in two dimensions (Fig. 1b—left). The goal was to steer toward the target and stop when their position fell within a circular reward zone centered on the target (Fig. 1b—middle). The joystick was controlled via a mixture of frontal and lateral hand movements (Fig. 1b—right and Fig. S1b). On each trial, a target location was drawn randomly from a uniform distribution over the ground plane area within the subject’s field of view (1–4 m, ±40°; Fig. 1c—left), eliciting diverse steering maneuvers as seen from their movement trajectories across trials (Fig. 1c—right). Performance feedback was provided at the end of each trial in the form of juice reward for correctly stopping within the reward zone (0.6 m radius; Fig. 1d) after waiting for a variable delay period (0.2–0.6 s).
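
The trial structure can be summarized with a short simulation sketch. The code below is illustrative only; it assumes simplified unicycle kinematics, a 60 Hz update rate, and helper names (sample_target, integrate_path) of our own choosing, and is not the actual experiment code.

import numpy as np

rng = np.random.default_rng(0)

def sample_target(r_range=(1.0, 4.0), theta_range_deg=(-40.0, 40.0)):
    # Spatially uniform density over the ground-plane sector: sample r^2 uniformly.
    r = np.sqrt(rng.uniform(r_range[0] ** 2, r_range[1] ** 2))
    th = np.deg2rad(rng.uniform(*theta_range_deg))
    return np.array([r * np.sin(th), r * np.cos(th)])  # x (rightward), y (forward), in meters

def integrate_path(lin_vel, ang_vel, dt=1.0 / 60):
    # Integrate joystick-controlled linear (m/s) and angular (rad/s) velocities.
    heading, pos = 0.0, np.zeros(2)
    for v, w in zip(lin_vel, ang_vel):
        heading += w * dt
        pos += v * dt * np.array([np.sin(heading), np.cos(heading)])
    return pos

# One schematic trial: constant velocity commands for 2 s, then a reward check at the stop.
target = sample_target()
stop = integrate_path(np.full(120, 1.5), np.full(120, np.deg2rad(10.0)))
rewarded = np.linalg.norm(stop - target) < 0.6  # 0.6 m reward-zone radius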

Fig. 1: Monkeys navigated to a remembered goal by integrating optic flow.
figure 1

a Monkeys used a joystick with two degrees-of-freedom to navigate to a cued target (yellow disc) using optic flow cues generated by ground plane elements (brown triangles) in a virtual environment. b Left: The time course of sensory input variables—linear (top) and angular (bottom) velocities—during one example trial. Middle: Overhead view of the spatial position of the monkey during the trial. Black open circle denotes the monkey’s response (stopping location). As there are no visual landmarks, the position becomes latent to the monkey as soon as the target is turned off. Right: Monkey’s hand velocity along two leading principal components of hand position, while maneuvering the joystick. Yellow shaded regions correspond to the time period (~300 ms) when the target was visible on the screen. Time is also coded by color. c Top: Overhead view of the spatial distribution of target positions across trials. Bottom: Movement trajectories of one monkey during a representative subset of trials. Blue dot denotes starting location. d Example trials showing “incorrect” (left) and “correct” (right) responses of a monkey. e Left: Comparison of the radial distance of the response against radial distance of the target across a subset of trials from three different monkeys. Right: Angular eccentricity of the response versus target angle. Black diagonal lines have unity slope. The starting position was taken as the origin. f Left: Cumulative distribution of stopping distance (from the target center) across trials of the three monkeys. Dashed curves show the corresponding null distribution calculated by shuffling response and target locations. Gray region highlights the range of stopping distances that guaranteed reward. The cumulative probability of an arbitrary stopping distance can also be interpreted as the hit rate (fraction of correct trials) if that stopping distance was taken to be the edge of the reward zone. With this interpretation, we can construct ROC curves by plotting the true hit rates against shuffled hit rates across the range of stopping distances. Right: ROC curves from the three monkeys, averaged across sessions. Data from individual recording sessions are overlaid in thin lines. Inset—Histograms of the area under the corresponding ROC curves (AUC). g In a random subset (10%) of the trials, the target remained visible throughout such that the world state was fully observable to the monkeys. Histograms of the AUCs for trials in which the world state was latent (gray shaded) or fully observable (black open). Trials were pooled across monkeys. Inset—Mean AUCs of individual monkeys under the two conditions (L latent, O fully observable). a–c reprinted from Lakshminarasimhan et al.16, Copyright (2020), with permission from Elsevier. Source data are provided as a Source data file.

Behavioral performance

Because target locations were randomized, travel durations varied widely across trials (median ± interquartile range [IQR]: 1.9 ± 0.8 s). On average, 61.5 ± 4% of the trials were rewarded, and the average error in stopping position was 0.41 ± 0.1 m. Both radial distance (Fig. 1e—left) and angular eccentricity (Fig. 1e—right) of the monkeys’ responses (stopping location) were highly correlated with the target location across trials (Pearson’s r ± SD, radial: 0.71 ± 0.06, angular: 0.87 ± 0.05). To test whether performance was accurate, we regressed responses against target locations. The slope of the regression was close to unity both for radial distance (0.87 ± 0.04) and angle (0.94 ± 0.06), suggesting that the monkeys were nearly unbiased. Non-parametric regression yielded qualitatively similar results (Fig. S1c), but additionally revealed modest undershooting for the most distant targets, an effect that is likely due to growing position uncertainty described in previous work34.

Although the above results suggest that the behavior was appropriately modulated by task demands, they do not satisfactorily capture the performance for two reasons. First, they ignore differences in task difficulty associated with varying target distance. Second, they do not account for the errors arising from intrinsic variability in motor commands. Therefore, we used an approach that is conceptually similar to receiver operating characteristic (ROC) analysis to objectively evaluate the performance by accounting for both sources of variability. For each behavioral session, we constructed a “psychometric function” by computing reward probability as a function of a hypothetical reward window size (Fig. 1f—left; “Methods”). By plotting the true psychometric function against one obtained by shuffling target locations across trials, we obtained the monkey’s ROC curve (Fig. 1f—right). Chance-level performance would correspond to an area under the ROC curve (AUC) of 0.5, while perfectly accurate responses (zero error) would yield an AUC of one. The AUCs were well above chance (mean ± SD, 0.88 ± 0.03; Fig. 1f—right inset) and stable across target distances and angles (Fig. S1d). Nonetheless, the AUC was significantly lower than that for the subset (10%) of interleaved trials in which the target was visible throughout (0.94 ± 0.04, p = 0.002, two-sample t test; Figs. 1g and S1e). This suggests that monkeys found this task quite challenging, and performance was not limited simply by motor variability.
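
A compact sketch of this ROC-style analysis is given below, assuming Euclidean stopping errors and a shuffle of target locations across trials; the function name and defaults are our own.

import numpy as np

def behavioral_auc(stop_xy, target_xy, n_windows=200, seed=0):
    # stop_xy, target_xy: (n_trials, 2) arrays of stopping and target positions.
    rng = np.random.default_rng(seed)
    err_true = np.linalg.norm(stop_xy - target_xy, axis=1)
    # Chance level: pair each response with the target from a random trial.
    err_shuf = np.linalg.norm(stop_xy - target_xy[rng.permutation(len(target_xy))], axis=1)
    # "Psychometric functions": hit rate as a function of a hypothetical reward-window size.
    windows = np.linspace(0, max(err_true.max(), err_shuf.max()), n_windows)
    hit_true = np.array([(err_true <= w).mean() for w in windows])
    hit_shuf = np.array([(err_shuf <= w).mean() for w in windows])
    # ROC: true hit rate vs shuffled hit rate; AUC of 0.5 = chance, 1 = perfectly accurate.
    return np.trapz(hit_true, hit_shuf)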

In principle, it is possible to avoid integrating optic flow by learning the precise transformation implemented by the joystick controller. However, as we will show later and as described in previous work, monkeys are sensitive to multiple task manipulations in a manner that strongly suggests that they perform the task by integrating optic flow to update beliefs about their spatial position35. In the following sections, we examine the neural dynamics during this task, including the neural representation of dynamically evolving latent state estimates about position and other task variables.

Neural dynamics

We recorded neural activity from the PPC (area 7a) using chronically implanted multi-electrode arrays, while monkeys performed the task (Methods, Fig. 2a). A total of 1612 units were recorded across 32 sessions (44 ± 12 units/session). To avoid double counting neurons, we restricted our focus to a subset of 244 neurons obtained from three sessions with the highest yield, one from each monkey. Data from the remaining sessions are analyzed and presented in Supplementary Material for comparison. Because this task challenged monkeys to integrate self-motion and update beliefs about their position relative to a remembered target throughout the trial, it placed a significant strain on working memory. Classic working memory paradigms have revealed either persistent activity of single neurons or activation of many neurons in sequence. Few neurons in our data exhibited persistent activity during the trial. Instead, neurons seemed to be more active at certain periods of the trial, with some neurons being active earlier than others (Fig. 2b—compare #1, #2 vs #3, #4). Therefore, we tested whether neurons instead exhibited sequential activity dynamics at the population level.

Fig. 2: Latent states dictate population dynamics.
figure 2

a Anatomical location of the multi-electrode arrays (Red—monkey B, Green—monkey Q, Blue—monkey S) superimposed on the 3D reconstructed brain of monkey S (IPS—intraparietal sulcus, STS—superior temporal sulcus, LF—lateral fissure). b Spike rasters of four example neurons from one of the recording sessions. Red bar denotes the 300-ms period during which the target appeared on the screen, and rasters are shown until the end of the stopping period while monkeys waited for feedback. c Peak-normalized response of neurons calculated by averaging across the set of trials with nearby (left panels) or distant (right panels) targets. Neurons are sorted according to the timing of their peak response observed in the set of trials with nearby (top panels) or distant (bottom panels) targets. Spike times were rescaled based on the trial duration before trial-averaging and the resulting response profile of each neuron was subsequently normalized by the peak activity observed in the condition used for sorting. Neurons from all three monkeys are combined before sorting (see Fig. S2a, b for individual monkeys). d Similar to c, but with trials grouped by target angle. e Left: Comparison of the pattern similarity of the population dynamics between trials within the same (abscissa) or different (ordinate) groups, shown separately for each monkey. Pattern similarity was defined as the correlation coefficient between the firing rate maps taken from either the same trial group (odd vs even trials) or different trial groups (nearby vs distant targets). Right: Time course of the pattern similarity, computed as the correlation between population activity vectors (columns of the rate maps) taken from the same trial group (odd vs even trials) or different trial groups (nearby vs distant targets). f Similar to e, but with trials grouped by target angle. In e, f, n = 112, n = 68, n = 64 neurons in monkey B, S and Q, respectively. Error bars denote ±1 SEM. Source data are provided as a Source data file.

Since changes to the latent state are restricted to the time between target onset and the end of movement, we estimated population firing rate maps by rescaling time over this period and computing the trial-averaged response of each neuron. We sorted the neurons according to the timing of peak activity (“Methods”). We found strong sequential activation of neurons in all three monkeys, as quantified by a standard index of sequentiality (Sql) that ranges from 0 (random) to 1 (sequential) (“Methods,” Mean Sql ± 95% CI—Monkey B: 0.34 ± 0.1, Monkey S: 0.23 ± 0.12, Monkey Q: 0.28 ± 0.1). Furthermore, the degree of sequentiality was robust to task demands: Sql was similar across groups of trials corresponding to different target distances (Mean Sql ± 95% CI, Nearby targets: 0.21 ± 0.1, Distant targets: 0.19 ± 0.08) and target angles (Leftward targets: 0.26 ± 0.07, Rightward targets: 0.22 ± 0.11). However, the overall activity level and the precise sequence in which neurons were activated were not preserved across trial groups: normalizing the activity and sorting the neurons according to one group of trials did not yield identical sequential activity in the complementary trial group (Figs. 2c, d and S2a, b). To quantify this, we computed the pattern similarity of population rate maps between task-relevant trial groups, and found that it was significantly lower than the similarity between odd and even trials within each trial group (Pattern similarity, Odd vs Even trials: 0.92 ± 0.03, Nearby vs Distant: 0.65 ± 0.04, Leftward vs Rightward: 0.72 ± 0.03; Fig. 2e, f—left). This suggests that the variability in sequencing is not due to neural noise, but rather reflects systematic differences in representation across trial groups. The pattern similarity was low throughout the trial and not just at the beginning or the end of the trial (Fig. 2e, f—right, Fig. S2c, d), suggesting that task demands alter the population dynamics and neurons are not merely keeping time.
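
The pattern-similarity computation can be sketched as follows; the array shapes, helper names, and the small numerical guard are our assumptions, and the sequentiality index itself (defined in “Methods”) is not reproduced here.

import numpy as np

def rate_map(rates, order=None):
    # rates: (n_trials, n_neurons, n_timebins) binned firing rates, time already rescaled.
    m = rates.mean(axis=0)                            # trial-average
    m = m / (m.max(axis=1, keepdims=True) + 1e-12)    # peak-normalize each neuron
    if order is None:
        order = np.argsort(m.argmax(axis=1))          # sort neurons by time of peak response
    return m[order], order

def pattern_similarity(map_a, map_b):
    # Correlation between two rate maps (neurons x time) of identical shape.
    return np.corrcoef(map_a.ravel(), map_b.ravel())[0, 1]

# Usage sketch: within-group (odd vs even trials) vs across-group (nearby vs distant targets).
# m_odd, order = rate_map(near_trials[::2])
# m_even, _ = rate_map(near_trials[1::2], order)
# m_far, _ = rate_map(far_trials, order)
# within, across = pattern_similarity(m_odd, m_even), pattern_similarity(m_odd, m_far)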

While sequential neural dynamics could partly be a signature of the temporal integration process by which monkeys update their position estimates, sensory cues (optic flow) and motor commands (hand motion) are trajectory-dependent and thus also differ across trial groups. Consequently, to test whether PPC contains all signals relevant to this task, we need an explicit encoding model to relate neural activity to dynamical latent state estimates as well as other variables that influence neural activity, such as sensory and motor variables. Furthermore, to the extent that the activity is not driven solely by external inputs, the presence of sequential dynamics points to a potentially important role for neural interactions within PPC. We next fit a model with these features.

Encoding model

Because the causal structure of the task involves an action-perception loop, all task variables change dynamically during the course of the trial (Fig. 3a). To simultaneously account for how neural activity is influenced by dynamic sensory (linear and angular velocity), latent (target distance and angle), and motor predictors (hand speed along the two principal components of hand position) in addition to the monkey’s gaze and discrete events (target onset and reward delivery), we fit a generalized additive model (GAM) with Poisson-distributed spike counts (“Methods”; Fig. 3b). Consistent with previous work in the monkey PPC, we observed robust ~15 Hz oscillations in the local field potential (LFP, Fig. S3a). Therefore, we included the LFP phase as a predictor to capture temporal structure in the spike trains associated with these global rhythms. Finally, the model also incorporated temporal filters that explicitly captured causal, directional functional coupling between neurons, and autoregressive effects (spike-history filter). The model differs from a traditional encoding model in that it fits arbitrary nonlinear mappings (tuning functions) from predictors to neuronal response rather than linear kernels. However, this approach is closely related to generalized linear models (GLMs) in neuroscience36, and is in fact identical to a GLM if the task predictors are first expressed in terms of an appropriate non-linear basis. An iterative pruning procedure (backward elimination) identified task variables to which individual neurons were significantly tuned.
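
As a minimal illustration of the GAM-to-GLM equivalence noted above, the sketch below expands each continuous predictor in a Gaussian-bump basis and fits a Poisson regression; the basis choice, regularization, and the omission of the event, gaze, LFP-phase, spike-history, and coupling terms are simplifications of ours, not the fitting procedure used here.

import numpy as np
from sklearn.linear_model import PoissonRegressor

def bump_basis(x, n_bumps=10):
    # Expand a 1-D predictor in Gaussian bumps so the GLM can fit an arbitrary smooth tuning curve.
    centers = np.linspace(np.min(x), np.max(x), n_bumps)
    width = centers[1] - centers[0]
    return np.exp(-0.5 * ((x[:, None] - centers[None, :]) / width) ** 2)

def fit_encoding_model(spike_counts, lin_vel, ang_vel, dist, angle, alpha=1e-2):
    # spike_counts: (n_timebins,) counts for one neuron; predictors sampled at the same bins.
    X = np.hstack([bump_basis(v) for v in (lin_vel, ang_vel, dist, angle)])
    model = PoissonRegressor(alpha=alpha, max_iter=1000).fit(X, spike_counts)
    return model, X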

Fig. 3: Neurons represent sensory variables, latent world states, and motor variables.
figure 3

a Causal structure of the task, illustrating the recurrent nature of the interaction between sensory inputs, latent states, and motor outputs. b Schematic of the generalized additive model (GAM) used to fit spike trains of single neurons. c Activity of simultaneously recorded neurons during a random thirty second epoch during the experiment (left) and the corresponding prediction reconstructed using the model (right). Neurons are arranged according to their contribution to the leading principal component (PC) (bottom—lowest; top—highest) for visualization. d Proportion of neurons tuned to different variables. Error bars denote ± 1 standard error of binomial proportions (n = 244). e Cumulative distribution of the contribution of different predictors, calculated as the reduction in variance explained by the model after removing those predictors. Black shows the distribution of the variance explained by the full model. f Top: Example tuning functions showing sensitivity of neurons to different task variables and other explanatory variables. Shaded regions denote ± 1 SEM across validation sets (n = 10). Bottom: Peak-normalized tuning functions of all significantly tuned neurons, sorted according to the peak feature. Across the population, tuning to individual task variables had peak responses that tiled all values of the feature space. In contrast, most neurons produced stereotyped responses to Local field potential (LFP) phase. Source data are provided as a Source data file.

The best-fit GAM captured 58 ± 6% of variance in the structure of population response (Fig. 3c) and 15 ± 4% of the temporal variability in single neuron responses (Fig. S3b) suggesting that the model had good predictive power. A significant fraction of neurons was tuned to linear and angular velocity (52 ± 8%), target distance and angle (37 ± 5%) and motor output (58 ± 4%) (Fig. 3d). The majority of neurons (53 ± 4%) were driven by target presentation while few neurons showed sensitivity to reward delivery (7 ± 4%). Because different task signals were mixed at the level of single neurons (Fig. S3c), the variance explained by individual task variables was typically low (Figs. 3e and S3d). Examining the model parameters, we found that single neurons were often tuned to sensory and motor variables, as well as to latent world states i.e., target distance and angle (Figs. 3f and S3e). Across the population of neurons, we found near uniform tiling of the sensory space with a significant fraction of neurons tuned for low, intermediate, and high linear and angular velocities (Fig. 3f—green). A similar trend was seen in the temporal kernels fit to motor variables (hand speed) where we found a good mix of neurons that responded before and after hand motions (Fig. 3f—blue). On the other hand, tuning to target distance and angle was relatively skewed, with the majority of neurons tuned to small target distances and extreme target angles (Fig. 3f—orange). Greater preference for small target distances could be a reflection of the fact that those states carry the highest value in the context of the task. Tuning to LFP phase was highly stereotyped: the spiking probability of all neurons peaked just before (18 ± 8°) the trough of the LFP signal. Tuning to target onset was somewhat variable, with the vast majority of neurons (71%) exhibiting an ON response with a latency of 122 ± 15 ms, and a smaller group of neurons (14%) which responded exclusively after the target turned OFF with a latency of 64 ± 32 ms (Fig. S3f). Tuning to gaze position was also diverse (Fig. S3g), consistent with the diversity of ‘gain fields’ discovered by classic studies investigating the role of posterior parietal cortex in transforming visual input from retinal co-ordinates into actionable co-ordinates31.

Next, we tested the extent to which coupling and spike-history filters contributed to neuronal responses. To do this, we compared likelihoods of the model with neither coupling nor spike-history filters against the model that included them both (Fig. 4a). We found that the likelihood was substantially greater for the coupled model, and marginally better for the model with just the spike-history factor (Mean Likelihoods ± SD across neurons, Uncoupled model: 0.36 ± 0.1, Spike-history model: 0.39 ± 0.1, Coupled model: 0.65 ± 0.14). The amplitude of these filters captures the modulation (‘gain’) in the probability of spiking as a function of the time since last spike from either the same neuron or another neuron (Fig. 4b—top). By doing so, they are able to capture features of spiking that are distinct from features captured by the task variables. In particular, the spike-history filter and coupling filter capture autocovariance and cross-covariance between neurons respectively that are not due to fluctuations in task variables (Fig. 4b—bottom). As a result, the coupled model is able to recapitulate the spatiotemporal covariance structure of the population activity very well (Pearson’s r ± SD, Data vs Coupled model prediction: 0.86 ± 0.08, Data vs Uncoupled: 0.06 ± 0.1; Figs. 4c and S4a), yielding substantially better predictions than the uncoupled model. Note that the uncoupled model can still capture covariance between neurons induced by the fluctuations in task variables, but not shared fluctuations at the millisecond timescale. Moreover, due to the directed nature of coupling filters, the coupled model can capture millisecond timescale, asymmetric interactions between neurons that may arise from recurrent connectivity. We found that the structure of coupling filters was sparse yet diverse: the same neuron had both net excitatory and inhibitory effects on different target neurons (Figs. 4d and S4b). This diversity should not be mistaken for a violation of Dale’s law since coupling filters capture effective interaction between neurons, rather than synaptic transmission properties. Across the population of all neuronal pairs, the mean gain was 1.04 ± 0.05 suggesting that excitatory and inhibitory effects were balanced (Fig. 4e—top). The timescale of coupling followed a power-law decay, with fast timescales contributing substantially more power to the coupling filter (Fig. 4e—bottom, Fig. S4c). The gain of both excitatory and inhibitory couplings decreased with distance between neurons (Fig. 4f), mirroring widely documented trends in anatomical connectivity and correlated variability.
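
The coupling and spike-history terms can be illustrated with the exponential basis listed in the Fig. 4 legend. The sketch below builds the corresponding regressors and converts fitted weights into a multiplicative gain; the 1-ms binning, the 1-s window, and the reading of “filter strength” as the mean gain are our assumptions.

import numpy as np

TAUS_MS = np.array([6, 12, 24, 48, 96, 192, 384])  # decay constants of the basis functions

def exp_basis(t_ms=np.arange(0, 1000)):
    # Seven causal exponential basis functions evaluated over a 1-s window (1-ms bins).
    return np.exp(-t_ms[None, :] / TAUS_MS[:, None])

def coupling_regressors(presyn_spikes):
    # Convolve a binary presynaptic spike train (1-ms bins) with each basis function;
    # these columns enter the Poisson model alongside the task predictors.
    B = exp_basis()
    return np.stack([np.convolve(presyn_spikes, b)[:len(presyn_spikes)] for b in B], axis=1)

def coupling_gain(weights):
    # With a log link, the fitted filter exponentiates into a multiplicative gain on the
    # target neuron's rate as a function of time since a presynaptic spike.
    return np.exp(weights @ exp_basis())

def coupling_strength(weights):
    # Mean gain over the window: >1 net excitatory, <1 net inhibitory (cf. Fig. 4d).
    return coupling_gain(weights).mean()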

Fig. 4: Coupling encapsulates population structure.
figure 4

a Comparison of the log likelihoods of the model constructed using only external inputs as predictors against those of a model that also included effects of spike-history (brown), against a model with both spike-history and inter-neuronal coupling (maroon). Each circle denotes an individual neuron. Inset shows the cumulative distribution of the contribution of spike history and neuron-neuron coupling, calculated as the reduction in variance explained by the model after removing those filters. Black curve shows the distribution of the variance explained by the full model. b Top left: Spike history filter of an example neuron. Bottom left: Model with spike history filter accurately captures the autocorrelation function of the neuron. Top right: Bidirectional coupling filters between an example pair of neurons. Bottom right: Coupling filters capture the cross-correlation between the pair, whereas spike-history filters alone do not. c Top: Structure of the peak-to-trough amplitude of cross-correlation between the activity of all pairs of simultaneously recorded neurons from an example monkey. Neurons are ordered according to the weight of their contribution to the first principal component of the population activity. Coupled model (bottom), but not the uncoupled model (middle), captures the structure of cross-correlation of the full population. d Left: The strength of the coupling filters between all pairs of neurons in the population shown in c. Strength of the filter was computed by taking the total area under the filter. A strength greater than one corresponds to excitatory coupling whereas less than one corresponds to inhibitory coupling. The diagonal elements correspond to the strength of the spike-history filter (self-coupling) as a special case. Right: Details of the coupling (off-diagonal) and spike-history filters of a subset of the neural population, highlighting the diversity in the filter profiles across neuronal pairs. e Top: Frequency distribution over the coupling strengths between all pairs of neurons, pooled across monkeys. A vast majority of the neurons were weakly coupled (note the log scale of the frequency axis). Bottom: Each coupling filter was expressed as a weighted sum of seven exponential basis functions with different decay constants ([6, 12, 24, 48, 96, 192, 384] milliseconds). Data points show the average magnitude of weighting of the different basis functions across all coupling filters, pooled across monkeys. Black line denotes the best fit power-law relationship. Error bars denote ± 1 SEM (n = 244). f Strength of coupling decreased as a function of distance between the electrodes from which the neurons were recorded. Points above and below the black line correspond to excitatory and inhibitory couplings respectively. Source data are provided as a Source data file.

Population decoding

We have seen that single neurons in the macaque PPC encode task-relevant variables, in particular, the latent state, i.e., the spatial position of the monkey. Because successful performance in this task depends on tracking the dynamical latent state, the above finding indicates that PPC might be critically involved in implementing the underlying sensorimotor transformation. If this is the case, then we can make the following predictions. First, we should be able to dynamically decode sensory, latent, and motor variables with good precision from the population activity. Second, trial-by-trial fluctuations in the error in decoding the latent state from neural activity should propagate to the motor plan and thus should be correlated with behavioral error (latent → motor). Likewise, trial-by-trial fluctuations in the error in decoding the sensory input should be correlated with the error in decoding the latent state (sensory → latent) that depends on those sensory inputs. We tested all three predictions.

To test whether neural activity was informative about task variables, we trained a linear decoder of the population response separately for each task variable, regressing each variable against the activity of all simultaneously recorded neurons (“Methods”). Figure 5a shows the time course of different task variables estimated using the corresponding decoding weights, on six example trials. The estimates were remarkably well aligned with the ground truth (gray). The correlation between the true and the decoded values was high, demonstrating good decoder performance (Pearson’s r ± SD, Sensory: 0.77 ± 0.04, Latent: 0.66 ± 0.06, Motor: 0.74 ± 0.04; Fig. 5b). While decoder performance initially increased with population size, the performance began to level off, suggesting that the assessment of the decoders was not dramatically limited by the recording size (Fig. S5a).
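
A minimal sketch of the decoding analysis is shown below, assuming time-binned population rates and the ordinary least-squares fit mentioned in the Fig. 5 legend; cross-validation splits (done by trial in practice) are only indicated in the comments.

import numpy as np

def fit_linear_decoder(rates, variable):
    # rates: (n_timebins, n_neurons) population activity; variable: (n_timebins,) task variable.
    X = np.hstack([rates, np.ones((rates.shape[0], 1))])  # append an intercept column
    w, *_ = np.linalg.lstsq(X, variable, rcond=None)      # ordinary least squares
    return w

def decode(rates, w):
    return np.hstack([rates, np.ones((rates.shape[0], 1))]) @ w

# Usage sketch (fit on training trials, evaluate on held-out trials):
# w = fit_linear_decoder(rates_train, dist_train)
# performance = np.corrcoef(dist_test, decode(rates_test, w))[0, 1]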

Fig. 5: Population activity predicts behavior on individual trials.
figure 5

a We trained a linear decoder using ordinary least-squares to decode task variables from population activity (Methods). Columns correspond to six representative trials and rows correspond to different variables (linear velocity, angular velocity, target distance, target angle, and hand velocity along the two leading principal components of hand position). Gray traces correspond to ground truth. b Decoder performance for each variable, quantified as the correlation between observed and decoded estimates shown separately for individual monkeys. c Average performance of the decoder for distance-to-target plotted against the mean behavioral accuracy (fraction of correct trials) of different monkeys. d Left: Error in decoding distance to target is correlated with the behavioral error (difference between target distance and stopping distance) across trials. Middle: The cumulative distribution of decoding error shown separately for trials in which the monkey undershoots (purple) or overshoots (gold). Right: ROC curves constructed by plotting the cumulative probabilities for the two sets of trials against each other, and the associated classification accuracy quantified as area under the curve (AUC, inset). e Left: Across trials (gray dots), the time-averaged error in decoding target distance is correlated with the time-averaged error in decoding linear velocity. Ellipses map out contours corresponding to 1 SD (assuming Gaussianity) for individual monkeys. Middle: Similar to left panel, but for the angular domain. Right: Linear (open symbols) and angular (closed symbols) correlation coefficients for individual monkeys computed from data, plotted against correlations computed using surrogate data. For each pair of decoders, surrogate correlations were computed by taking pairs of random projections of the original data along directions that overlap by the same angle as the pair of decoders. Error bars and shaded regions in ce denote ± 1 SEM estimated by bootstrapping (n = 1000 trials). Source data are provided as a Source data file.

Next, we assessed whether the decoding performance was correlated with behavior. Because the monkey’s decision to stop moving ultimately depended on their (internal) estimate of target distance, we restricted our focus to the decoder of this particular latent variable for this analysis. Both raw estimates of the decoder performance and estimates extrapolated to an infinite population predicted monkeys’ behavioral accuracy, taken to be the fraction of rewarded trials (Figs. 5c and S5b). As a stronger test, we evaluated the correlation between the decoding error and behavioral error across trials for each monkey. Due to the fine-grained nature of this analysis, we took behavioral error to be the stopping distance to target rather than a binary variable (rewarded/unrewarded). The correlation was significantly greater than chance in all three monkeys (Fig. 5d—left, Fig. S5c). Monkeys tended to undershoot when the decoder underestimated the target distance, and overshoot when there was overestimation (Median decoding errors, Undershot trials: −13 ± 3 cm, Overshot trials: 11 ± 2 cm; Fig. 5d—middle). Consequently, we were able to successfully classify undershot/overshot trials with 69 ± 3% accuracy based on an ROC analysis on the distribution of decoding errors (Fig. 5d—right).
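
This trial-by-trial comparison can be sketched as follows; the sign conventions (negative = undershoot), the time-averaging of decoding error within each trial, and the use of roc_auc_score are our assumptions about the analysis.

import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import roc_auc_score

def decode_vs_behavior(decoded_dist, true_dist, stop_dist, target_dist):
    # decoded_dist, true_dist: lists of per-trial time series of decoded and actual
    # distance to target; stop_dist, target_dist: (n_trials,) radial distances.
    dec_err = np.array([np.mean(d - t) for d, t in zip(decoded_dist, true_dist)])
    beh_err = np.asarray(stop_dist) - np.asarray(target_dist)   # <0 undershoot, >0 overshoot
    r, p = pearsonr(dec_err, beh_err)                           # decoding vs behavioral error
    auc = roc_auc_score((beh_err > 0).astype(int), dec_err)     # classify under- vs overshoot
    return r, p, auc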

Finally, we tested whether error in decoding sensory inputs propagates to the latent state representation. Across trials, we found that sensory decoding error was significantly correlated with the error in decoding latent states (Pearson’s r, Linear velocity vs Target distance: 0.15 ± 0.1, Angular velocity vs Target angle: 0.26 ± 0.1; Fig. 5e—left and middle, Fig. S5d). Although the weights of sensory and latent decoders were very different, they were not perfectly orthogonal (Fig. S5e). We controlled for this by computing a null distribution of correlations obtained by projecting population activity onto pairs of random surrogate modes that were separated by the same angle as the decoders. Correlation between surrogate responses was significantly less than the correlation between error in sensory and latent decoders (p = 0.008, paired t test; Fig. 5e—right). This suggests that neural computations include a significant interaction between activity subspaces representing sensory and latent variables. Such an interaction is consistent with a role of PPC in computing latent world states from sensory inputs.
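
One way to construct the surrogate modes is sketched below: draw a random unit vector, then a second unit vector at exactly the angle separating the two decoders; the construction and function name are ours.

import numpy as np

def random_pair_with_angle(w1, w2, seed=0):
    # Returns two random unit vectors separated by the same angle as decoder weights w1, w2.
    rng = np.random.default_rng(seed)
    cos_phi = np.dot(w1, w2) / (np.linalg.norm(w1) * np.linalg.norm(w2))
    phi = np.arccos(np.clip(cos_phi, -1.0, 1.0))
    u = rng.standard_normal(len(w1))
    u /= np.linalg.norm(u)
    v = rng.standard_normal(len(w1))
    v -= (v @ u) * u                   # component of v orthogonal to u
    v /= np.linalg.norm(v)
    return u, np.cos(phi) * u + np.sin(phi) * v

# Usage sketch: project population activity onto (u, v) in place of the decoders,
# recompute the per-trial error correlation, and repeat to build the null distribution.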

Effect of task manipulations

To better understand how neural computations for this task are implemented in monkey PPC, we tested how manipulating task variables affects neural representations. In separate sessions, two of the monkeys (S and Q) performed three variations of the baseline task in which we manipulated either the reliability of optic flow (sensory input) by changing the density of ground plane elements, the consequence of actions (motor output) by changing the gain of the joystick, or the animal’s position in the virtual world (latent state) by briefly displacing them from their intended trajectory (details in Methods; Fig. 6a). Crucially, these sessions also included trials from the common, unmanipulated task such that we could record and contrast the activity of the same population of neurons under both conditions.

Fig. 6: Behavioral manipulations trigger changes in coupling and latent representation.
figure 6

a In separate experimental sessions, we manipulated sensory input (by altering the density of optic flow), motor output (by altering the sensitivity of the joystick), or latent state (by perturbing the monkey off his intended trajectory while traveling) (“Methods”). b These manipulations produced modest behavioral effects as shown by the ROC curves for a subset of experimental sessions, quantified as area under the curve (AUC, inset). c Coupling filters (shown for a representative subset of all directed pairs of five neurons) fit to data from baseline trials (gray) and manipulation trials (colored). d Top left: Cumulative distribution across all pairs of neurons of the correlation coefficient between coupling filters fit to baseline trials and manipulated trials. Black curve shows the baseline distribution computed using odd and even baseline trials. Gray curves show the null distribution constructed by shuffling the neuronal pairs. Shaded regions denote ±1 SEM estimated by bootstrapping (n = 25,805/20,798/22,506 neuron pairs for sensory/latent/motor manipulation). Top right: Cumulative distribution of the correlation coefficient between tuning function to sensory variables (the two sensory variables - linear and angular velocity - were concatenated) fit to baseline trials and manipulated trials across all neurons. Bottom left: Similar to top right, but computed using tuning to latent variables. Bottom right: Similar to top right, but computed using motor tuning. e Contribution of coupling to neuronal response, quantified as the improvement in the model log likelihood (LL) over uncoupled model, was comparable during baseline and manipulated trials. f Baseline-corrected stability to manipulated task (stability index) computed using cumulative distributions in d (“Methods”) for coupling filters, latent tuning, sensory tuning, and motor tuning under all three manipulations. Both sensory and motor tunings were robust to manipulations, whereas coupling and latent tunings were not. Source data are provided as a Source data file.

Behavior was robust to all three manipulations, but there was a small drop in performance (mean AUC ± SD across six sessions of each manipulation—baseline: 0.88 ± 0.03, average across manipulations: 0.84 ± 0.03, p = 0.002, paired t test; Figs. 6b and S6a). Because activity of each neuron was greatly influenced by the activity of other neurons, we first tested whether the shape of the coupling filters was affected by task manipulations. We observed that all three manipulations altered coupling, albeit to varying degrees (Figs. 6c and S6b). For each pair of neurons, we quantified the degree to which coupling was altered by computing the correlation between filters fit to data recorded with and without task manipulation (Fig. 6d—top left in color). Under all manipulations, the median correlation dropped significantly below the noise ceiling defined as the correlation between couplings fit to odd/even trials of the baseline task (color vs black), but remained significantly above chance level defined as the correlation between couplings between random pairs of neurons (color vs gray). Notably, manipulations did not change the extent to which coupling explained the activity of single neurons (paired t test—sensory manip: p = 0.81, latent manip: p = 0.68, motor manip: p = 0.23; Fig. 6e). We quantified the stability of coupling filters to task manipulations by measuring the stability index (“Methods”) that ranged from 0 (highly unstable) to 1 (perfectly stable). According to the causal structure of the task, all three manipulations push the latent belief state about the world away from what the animal has come to expect based on his experience during the baseline task. However, for a given action, the latent world states in the sensory manipulation condition remain closer to the unmanipulated distribution. Strikingly, of the three manipulations, sensory manipulation produced the least change in coupling (Stability Index (SI) of coupling—sensory manip: 0.68 ± 0.01, latent manip: 0.36 ± 0.01, motor manip: 0.49 ± 0.01; Fig. 6f). This suggests that an internal model of the latent state dynamics may be embedded in the network connectivity, such that manipulations that defy the learned model are more likely to alter the interactions between neurons, either directly by re-calibrating the recurrent connections or by recruiting additional neural pathways, in order to maintain good behavioral performance.
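
The comparison of coupling filters across conditions can be sketched as below; the noise ceiling (odd vs even baseline trials), the shuffled-pair chance level, and the baseline-corrected normalization are our reading of the “Methods” description and may differ in detail from the stability index used here.

import numpy as np

def pairwise_filter_corr(filters_a, filters_b):
    # filters_a, filters_b: (n_pairs, n_lags) coupling filters fit under two conditions.
    return np.array([np.corrcoef(a, b)[0, 1] for a, b in zip(filters_a, filters_b)])

def stability_summary(base, manip, base_odd, base_even, seed=0):
    rng = np.random.default_rng(seed)
    r_manip = pairwise_filter_corr(base, manip)                              # baseline vs manipulation
    r_ceiling = pairwise_filter_corr(base_odd, base_even)                    # noise ceiling
    r_chance = pairwise_filter_corr(base, base[rng.permutation(len(base))])  # shuffled pairs
    # One plausible baseline-corrected index: 0 = chance level, 1 = at the noise ceiling.
    si = (np.median(r_manip) - np.median(r_chance)) / (np.median(r_ceiling) - np.median(r_chance))
    return si, r_manip, r_ceiling, r_chance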

If coupling filters reflect neural interactions that support the computation used to track the latent state dynamics by PPC neurons, then changes in coupling should produce changes in how single neurons represent the latent state. We tested this by computing the robustness of tuning functions to the task manipulations, and found that tuning to latent state (position) was indeed greatly affected (SI of latent tuning—sensory manip: 0.72 ± 0.05, latent manip: 0.53 ± 0.03, motor manip: 0.47 ± 0.03; Fig. 6d, f). In contrast, neuronal tuning to sensory and motor variables was relatively more stable (Figs. 6d, f and S6c). Thus, both sensory and motor representations were robust to task manipulations whereas latent state representation was not.

Taken together, results from the manipulation experiments suggest that the functional role of PPC in this task might be primarily to integrate sensory input (velocity) inherited from upstream brain areas in order to track the latent world state (position), which could then be used by downstream circuits to generate appropriate motor commands. Concretely, integration could be implemented locally via recurrent interactions within PPC which embody the world model, resulting in a dynamically changing representation of the latent world state in response to task manipulations. In contrast, sensory and motor signals may flow through relatively static input and output connections of PPC yielding more stable representations of those variables. To test the plausibility of this hypothesis, we trained a recurrent neural network (RNN) model to perform the same task as monkeys and compared the representation learned by the network to that of monkey PPC. The objective was to produce a model that was functionally equivalent to monkeys without explicitly constraining the representations it learned, such that any similarity between the representation of the RNN and that of the monkey PPC could be attributed to the computational constraints satisfied by the RNN.

RNN model

We trained a fully connected RNN model (Fig. 7a) to solve a task similar to the one solved by monkeys. The network comprised 100 recurrently connected, nonlinear neurons whose activity ranged from −1 to +1. The network received four inputs (2D self-motion velocity and 2D target position) and the network activity was linearly read out by two controller output neurons (2D hand velocity). The control outputs affected the latent world states as well as the sensory inputs which were fed back via the two input channels conveying self-motion velocity (World model block; Methods). At the start of each trial, the network received transient pulses whose amplitude encoded the target position. We trained the network to produce controller outputs such that the resulting trajectory ended atop the target center (“Methods”). The network learned to generate qualitatively good trajectories (Fig. 7b—left), and the training was halted when the performance matched that of the monkeys (Fig. 7b—right, Fig. S7a).
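
A minimal sketch of such a closed-loop network is shown below, assuming a discretized rate equation, simplified two-dimensional kinematics in the “World” block (no explicit heading), a three-step target pulse, and PyTorch/BPTT for training; these choices are ours and are not meant to reproduce the exact architecture or hyperparameters.

import torch
import torch.nn as nn

class NavRNN(nn.Module):
    # 100 recurrently connected tanh units; 4 inputs (2 velocity-feedback + 2 transient
    # target channels); 2 motor outputs that the "World" block integrates into velocity.
    def __init__(self, n=100, dt=0.1, tau=0.5):
        super().__init__()
        self.W_in = nn.Linear(4, n)
        self.W_rec = nn.Linear(n, n)
        self.W_out = nn.Linear(n, 2)
        self.n, self.dt, self.tau = n, dt, tau

    def forward(self, target, T=60):
        B = target.shape[0]
        x = torch.zeros(B, self.n)        # network state (bounded by the tanh nonlinearity)
        vel = torch.zeros(B, 2)           # world state: current velocities
        pos = torch.zeros(B, 2)           # world state: position (simplified, Cartesian)
        for t in range(T):
            tgt_in = target if t < 3 else torch.zeros_like(target)   # transient target pulse
            inp = torch.cat([vel, tgt_in], dim=1)
            x = x + (self.dt / self.tau) * (-x + torch.tanh(self.W_rec(x) + self.W_in(inp)))
            accel = self.W_out(x)
            vel = vel + self.dt * accel   # "World" block: integrate motor output into velocity
            pos = pos + self.dt * vel     # ... and velocity into position (fed back as input)
        return pos

# Training sketch (BPTT via autograd): penalize the distance between stop and target.
# net = NavRNN()
# opt = torch.optim.Adam(net.parameters(), lr=1e-3)
# targets = 3 * torch.rand(64, 2) + 1                    # random goal locations
# loss = ((net(targets) - targets) ** 2).mean()
# opt.zero_grad(); loss.backward(); opt.step()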

Fig. 7: A recurrent neural network model operating in closed loop recapitulates experimental findings.
figure 7

a Schematic of the recurrent neural network architecture. The network comprised 100 fully-connected neurons receiving a transient signal (target location) on two input channels, and two motor output channels that control linear and angular acceleration which then determined the signal received by two sensory input channels (velocity). Analogous to the function of the virtual-reality setup in our experiments that converts the joystick output into visual input, the “World” block integrates the motor output to generate subsequent velocity input. This architecture thus mimics the interaction between the monkey and the virtual reality. b The recurrent weights were trained using a standard supervised learning algorithm (Backpropagation-through-time (BPTT), “Methods”) to generate appropriate outputs to random target locations. Left: The network’s trajectories in response to 16 different targets are shown (red—target, black—starting location). Right: Learning was stopped when the network performance matched the average monkey as measured by the respective ROC curves. c Neurons in the network exhibited sequential dynamics in a target location dependent manner (compare with Fig. 2). d We fit a generalized additive model (GAM) to the model neurons. Left: Model neurons were tuned to different task variables including latents (despite not being explicitly trained to learn them) (compare with Fig. 3). Right: Coupling was weak and concentrated around zero (top) but nonetheless affected the goodness of fit (bottom-right) (compare with Fig. 4). Model neurons exhibited mixed selectivity to a degree that was comparable to PPC neurons (bottom-left, “Methods”). Error bars denote ± 1 SEM (n = 244/n = 100 for data/model). e We trained linear readouts on the model population response to decode task variables. Left: Across trials (dots), the error in decoding target distance was correlated with the performance (quantified by stopping distance) of the network (compare with Fig. 5d). Right: Errors in decoding target distance and angles were correlated with errors in decoding linear (gray) and angular velocity (brown), respectively (compare with Fig. 5e). f Simulated manipulations. We added different amounts of noise to the sensory input channels (sensory manipulation), multiplied the transformation implemented by the “World” block by a non-unity gain factor (motor manipulation), or added a randomly timed Gaussian pulse to the sensory input channels to displace the model off its intended trajectory (latent manipulation). Top: The network readily adapted to sensory manipulation without additional training, but motor and latent manipulations required a small amount of additional training of recurrent weights to elicit comparable behavioral performance. Bottom: Sensory and motor tunings were robust to all three manipulations largely because the input and output weights did not change during the additional training. However, similar to PPC neurons (compare with Fig. 6f), the coupling and latent tunings were affected because the additional training modified the recurrent weights and thus also the latent state representation. Source data are provided as a Source data file.

Similar to monkey PPC neurons, model neurons exhibited sequential activity, with a precise sequence that was modulated by the goal location (Fig. 7c). However, model neurons were active for much longer periods and consequently exhibited lower sequentiality than monkeys’ neurons (mean Sql ± 95% CI—Model: 0.08 ± 0.05, Monkeys: 0.29 ± 0.1). Introducing a metabolic constraint into the training objective by penalizing the average activity substantially improved the degree of sequentiality (“Methods”; Fig. S7b), suggesting that such constraints might be operating in PPC. Although the model was not explicitly trained to track the latent state (position), model neurons nonetheless exhibited tuning to target distance and angle (Fig. 7d—left) because this was needed to perform the task. There was no evidence for functional specialization and the model neurons exhibited a high degree of mixed selectivity to sensory, latent, and motor variables (Fig. S7c). Furthermore, we found that recurrent connectivity explained a large fraction of the activity (Mean R2 ± 95% CI—without coupling: 0.39 ± 0.04, with coupling: 0.92 ± 0.1). The distribution of couplings between neurons in the network reflected a balance between inhibitory and excitatory interactions (Fig. 7d—right, Fig. S7d). Similar to monkey PPC, the error in decoding the target distance from model neurons predicted the error in stopping position (Pearson’s r: Model: 0.55 ± 0.1, Monkeys: 0.29 ± 0.2; Fig. 7e—left). This suggests that the latent state representation learned by the network propagates to the readout neurons that drive motor output. Likewise, the error in decoding the sensory input was correlated with the error in decoding the latent world states (Pearson’s r—Linear velocity vs Target distance: 0.23 ± 0.2, Angular velocity vs Target angle: 0.45 ± 0.2; Fig. 7e—right) suggesting that the latent representations were derived, at least in part, by integrating sensory inputs. Finally, we tested the network on the three manipulations described in the previous section. Although the network was robust to adding more sensory noise, it generalized less well to latent state and motor manipulations (Fig. S7e). Network performance in the latter two manipulations could be restored by retraining the recurrent weights (Fig. 7f—top left, “Methods”). Notably, comparable performance in all three manipulations could be achieved without retraining the input weights or readout weights (Fig. 7f—top right). Consequently, tuning to both sensory and motor variables was stable under all manipulations (Fig. 7f—bottom). In contrast, tuning to latent state and the coupling between model neurons were only robust to sensory manipulation where the recurrent weights did not have to be retrained.
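
The simulated manipulations can be expressed as a small modification of the closed loop in the network sketch above; the specific noise level, gain factor, and pulse parameters below are placeholders of ours.

import torch

def apply_manipulation(vel, accel, t, mode, noise_sd=0.2, gain=1.5, pulse_t=25):
    # vel: velocity feedback the network receives this step; accel: motor command the
    # "World" block integrates this step. Returns the (possibly manipulated) pair.
    if mode == "sensory":                       # degrade the reliability of velocity feedback
        vel = vel + noise_sd * torch.randn(vel.shape)
    elif mode == "motor":                       # non-unity gain on the consequence of commands
        accel = gain * accel
    elif mode == "latent" and t == pulse_t:     # one-off pulse displacing the trajectory
        vel = vel + torch.randn(vel.shape)
    return vel, accel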

The striking similarity between the model and monkey neural representations suggests that the neural mechanisms by which PPC contributes to this task may be dictated largely by simple architectural constraints that were incorporated into the RNN model. Specifically, like the model neurons, PPC neurons might inherit self-motion information from upstream brain regions via stable pathways established during development. This could explain why tuning to self-motion velocity remains largely invariant under task manipulations. Analogous to the controller units in the model, downstream motor areas responsible for driving the muscles might decode the activity in PPC using fixed readout weights, yielding a stable relationship between PPC activity and hand movements. Finally, recurrent interactions within PPC might reflect knowledge of the world model, such that the stream of sensory inputs is filtered through those interactions to dynamically infer the latent world state. Critically, changes to the world model, such as those introduced here through task manipulations, would effectively modify the neural interactions and change the relationship between neural activity and the latent world state, as we see in the data.

Discussion

To test whether neural circuits can track continuous and dynamical latent state variables, we designed a naturalistic paradigm in which monkeys navigated to a remembered goal location in a VR environment lacking explicit position cues. We recorded the activity of neurons in area 7a of the PPC and found that neurons exhibited sequential activity at the population level, were strongly influenced by other neurons, and encoded latent states at the neuronal level. Latent states decoded from population activity on individual trials were correlated with the monkeys’ stopping position relative to the goal. Finally, manipulating sensory reliability, latent state, or motor gain altered functional coupling between neurons and also affected the latent state representation, but spared sensory and motor representations.

A large body of work in primates and rodents has found evidence that PPC neurons encode a dynamic decision variable when inferring latent causes from sensory inputs8,11, and that the neural activity predicts behavior9. We highlight three ways in which the paradigm used here differs from past studies and why it matters. First, in standard paradigms such as motion discrimination or the towers task8,37, the animal integrates evidence in favor of a categorical proposition. The latent state itself is discrete and time-invariant, and the integration process serves to average away the noise in sensory input thereby improving decision confidence. In contrast, the latent state in the paradigm used here (relative goal location) is both continuous and time-varying, such that integration is needed to continually track this dynamical state. This does not obviate the need to gather momentary evidence about self-motion from noisy sense data. Instead, monkeys had to perform both computations - infer movement velocity from optic flow, and then integrate that to track the latent world state34. Second, in contrast to binary decision-making paradigms, the task used here allows for a continuous behavioral readout via joystick movements and a greater decoupling of latent state variables and motor output. Third, due to the interactive nature of this task, the spatiotemporal statistics of the sensory input are not predetermined by the experimenter, but are generated online by the monkeys’ own actions, mimicking the closed-loop nature of real-world behaviors. This paves the way for a more direct comparison with neural activity in rodents, where such interactive behaviors are more commonly studied. At the same time, this did not entail sacrificing experimental control as we could independently manipulate sensory reliability (optic flow density), latent state (position), and the motor plant (joystick gain) and examine their consequences on behavior and neural response. For these reasons, the paradigm used here helps generalize past findings toward the domain of natural behavior, and sheds new light on the computational role of PPC.

At a very high level, the task required storing and manipulating latent states in working memory. Classic working memory tasks have reported two qualitatively different types of neural activity dynamics depending on the task: persistent activity38,39 and sequential activity9,40. Few neurons exhibited activity that persisted throughout the trial in our task, but there was robust sequential activity at the population level. A recent study in rodent PPC proposed that sequential activity observed during virtual navigation might simply be inherited from the input due to the presence of spatial cues that signaled location41. Because there were no explicit landmarks in our task, our results demonstrate that structured inputs are not needed for sequential dynamics in parietal cortex. Our findings agree with a recent computational study on task-optimized neural networks which suggests that working memory tasks with greater complexity favor sequential dynamics, whereas simpler tasks such as delayed recall favor persistent activity dynamics42. That same study also demonstrated that the strength of coupling between neurons was greater in networks with sequential activity. Indeed, we found that coupling between neurons contributed substantially to the activity of individual neurons (>80% increase in information about precise spike times predicted with versus without coupling). In contrast, previous work in monkeys showed that coupling had a more modest effect (~20% increase) during a memory-guided saccade task that elicited persistent activity in lateral intraparietal area (LIP) neurons43. This difference persisted even when we subsampled our population and thus could not be attributed to the larger population size of our recordings. Our results are quantitatively more compatible with the strong coupling reported in PPC of mice performing an evidence accumulation task in VR where trial lengths were much longer44, suggesting that task complexity might be the primary determinant of neural activity dynamics in PPC. One possibility is that persistent activity can support holding information in memory for a brief period (i.e., short-term memory45), but manipulating the information content in working memory (e.g., updating the latent state) might require richer dynamics such as sequential activity.

At the neuronal level, we found that about one-third of the neurons were tuned to the latent state, which here was the position of the monkey relative to the goal. This is similar to a recent finding that PPC neurons in cats trained to step over obstacles on a treadmill encoded the distance to those obstacles46. We did not observe ramping activity dynamics such as those observed in monkey area 7a and LIP during motion discrimination tasks, which can alternatively be interpreted as tuning to net evidence. As noted by others, the tuning need not be monotonic to support evidence accumulation47,48 and so ramping dynamics might emerge only under restricted settings. Precisely which settings favor ramping over other solutions should become clearer once the link between neural dynamics and computation is fully elucidated49. Additionally, roughly half of the PPC neurons were tuned to sensory (linear and angular velocity) and motor (hand movement) variables. Mixed selectivity to task variables has been widely documented in many cortical areas50,51 including rodent PPC52. Non-linear mixed selectivity has been argued to increase neural dimensionality, thereby allowing linear readout mechanisms to solve arbitrary classification problems53. This computational benefit might also extend to regression problems, such as ours, where a representation with high expressivity would allow PPC to generalize better to new task variations54.

Decoding analysis revealed that all task-relevant variables could be dynamically decoded from the population activity to a high degree of precision. Of note is the finding that trials in which the decoder underestimated (or overestimated) the target distance tended to result in undershooting (or overshooting) by the monkey. Recall that monkeys stop where they believe the target is located. Therefore, this result directly links PPC neural activity to the monkeys’ belief about a continuous-valued, dynamical latent state. Relationships linking neural activity and continuous variables have been found for several sensory (e.g., orientation, motion direction) and motor (e.g., hand position, running speed) variables, but for latent states, such relationships are almost always quantified by measuring the correlation between neural activity and a binary choice55,56,57. This measure is an artifact of using experimental paradigms that do not allow for a direct readout of the animal’s estimate of the latent state, but only an indirect readout after that estimate undergoes further nonlinear processing that affects decision-making, such as thresholding. In contrast, the analysis used here shows that trial-by-trial fluctuations in the neural state of PPC are in fact correlated with the monkeys’ continuous internal estimate, and reveals the value of this experimental paradigm in establishing a tighter link between neural activity and latent beliefs. Since latent beliefs are formed by integrating past sensory inputs, a role in latent state computation also provides a parsimonious account of recent findings that PPC guides behavior based on sensory-history58,59.

The above finding suggests that the latent state representation in PPC propagates to behavior, but is the latent state information inherited from elsewhere in the brain? We found signatures of latent state computation within PPC. Specifically, the error in decoding sensory variables (linear and angular velocity) was correlated with the error in decoding latent state variables (target distance and target angle). Such a correlation would not arise if the information about latent states and sensory inputs were inherited from independent sources, but is entirely expected if latent states were computed by integrating noisy sensory input within a network that includes PPC. It is theoretically possible that both sensory and latent state variables are acquired from a common brain area. Testing this would require causal intervention experiments and brain-wide recordings, which we hope to perform in the future. Meanwhile, we assumed that PPC contributed to the computation of the latent states, and probed the underlying neural mechanisms using experimental manipulations.

All three experimental manipulations yielded strikingly similar effects. They altered both neuronal coupling and the latent state representation, but left sensory and motor tuning unaltered. While previous studies have characterized how functional connectivity between brain-wide networks changes across tasks60,61 and across epochs within a task43, how task manipulations affect coupling between neurons locally within a brain area is not known. Our results suggest that coupling can vary with task manipulations and identify a potential mechanism by which behavioral performance in this task can be maintained in the face of external perturbations to task variables—namely, by reconfiguring the neural interactions within PPC. This reconfiguration could either result from synaptic modifications of recurrent connections within PPC or from contextual inputs to PPC that alter the effective interactions62. Either way, the concomitant changes observed in latent tuning suggest that latent states are likely computed via recurrent mechanisms within PPC, an interpretation that was further validated by the striking resemblance to the representations learned by a recurrent neural network (RNN) model trained to perform the same task. Finally, the robustness of sensory and motor tuning to task manipulations was also readily explained by the same model after freezing the input and output weights and retraining only the recurrent weights. This suggests that sensory/motor variables inhabit a stable subspace in PPC, and could be transmitted from/to other brain areas via stable communication channels. It follows that the communication subspace between PPC and sensory cortex, as well as between PPC and motor cortex, should remain relatively invariant to context, a prediction that future studies can test using available tools63. PPC has been variously described as a brain area that is involved in sensory processing, working memory, and motor planning. Our results suggest broadening existing views: PPC computes and continuously monitors the dynamical latent state of the organism in naturalistic behaviors involving action/perception loops. This is in line with the perspective of PPC as a state estimator for optimal feedback control to flexibly interface sensory information with actions64,65.

A limitation of our treatment of the neural computations in this task is that we have overlooked the contribution of autonomous strategies. In principle, it is possible to use the learned world model to plan joystick movements ahead of time without any sensory feedback. Although we know from manipulation experiments that monkeys relied on sensory feedback (optic flow), the contribution of autonomous strategies to navigation is likely non-negligible35,66,67. Planning-based computations, which require reasoning about the consequences of action sequences via mental simulation, are thought to be performed in the prefrontal cortex (PFC)68,69. Predictive signals that enable planning in dynamic environments have also been found recently in the dorsal anterior cingulate cortex70. A promising direction for future work could be to compare the causal contributions of PPC against frontal brain regions in this task, by combining inactivation experiments with statistical tools that characterize behavior on the spectrum between purely sensory feedback-based and purely autonomous strategies. Another open question concerns the stability of representations across time. Previous studies have demonstrated that neural representations in the rodent PPC drift over time71. While we have analyzed how representations are affected by task manipulations, knowing whether they are stable across long timescales would complement the insights gained from this study and help further constrain the underlying mechanisms.

Methods

Experimental model

Three rhesus macaques (Macaca mulatta) (all male, 7–8 years old)—referred to as B, S, and Q for simplicity—participated in the experiments. All surgeries and experimental procedures were approved by the Institutional Review Board at Baylor College of Medicine, and were in accordance with National Institutes of Health guidelines.

Experimental setup

Monkeys were chronically implanted with a lightweight polyacetal ring for head restraint, and scleral coils for monitoring eye movements (CNC Engineering, Seattle WA, USA). Utah arrays were chronically implanted in area 7a of the left hemisphere of all three monkeys via a craniotomy. Prior to the surgery, the brain area was identified using structural MRI to guide the location of the craniotomy. After craniotomy, the array was pneumatically inserted after confirming the co-ordinates of the target area using known anatomical landmarks. The electrode arrays implanted in monkeys Q and B were composed of a 10 × 10 grid of 96 silicon microelectrodes, each 1 mm long and spaced 400 μm apart. The array implanted in monkey S had identical electrode lengths and spacing, except that it was composed of a 6 × 8 grid of 48 microelectrodes. At the beginning of each experimental session, monkeys were head-fixed and secured in a primate chair placed on top of a platform (Kollmorgen, Radford, VA, USA). A 3-chip DLP projector (Christie Digital Mirage 2000, Cypress, CA, USA) was mounted on top of the platform and rear-projected images onto a 60 × 60 cm tangent screen that was attached to the front of the field coil frame, ~30 cm in front of the monkey. The projector was capable of rendering stereoscopic images generated by an OpenGL accelerator board (Nvidia Quadro FX 3000G).

Virtual reality

Monkeys used an analog joystick (M20U9T-N82, CTI electronics) with two degrees of freedom and a circular displacement boundary to control their linear and angular speeds in a virtual environment. Fore-aft and sideways movement of the joystick controlled linear and angular velocity respectively. The virtual world comprised a circular ground plane with a radius of 70 m (near and far clipping planes at 0.05 m and 40 m respectively), with the subject positioned at its center at the beginning of each trial. The ground plane was textured with small isosceles triangles (base × height: 0.85 cm × 1.85 cm) that were each randomly repositioned and reoriented anywhere in the arena at the end of its limited lifetime (~250 ms), making them impossible to use as landmarks. The maximum linear and angular speeds were fixed at 2 m/s and 90°/s respectively, and the density of the ground plane was fixed at 2.5 elements/m2. The stimulus was rendered as a red-green anaglyph and projected onto the screen in front of the subject’s eyes. Monkeys wore goggles fitted with Kodak Wratten filters (red #29 and green #61) to view the stimulus. The binocular crosstalk for the green and red channels was 1.7% and 2.3%, respectively.

Behavioral task

In each session, monkeys performed a series of trials in which they had to steer to a random target location that was cued briefly at the beginning of the trial. Each trial was programmed to start after a variable random delay (truncated exponential distribution, range: 0.2–2.0 s; mean: 0.5 s) following the end of the previous trial. The target, a circular disc of radius 20 cm whose luminance was matched to the texture elements, appeared at a random location between θ = ±40° of visual angle at a distance of r = 0.7–4 m relative to where the subject was stationed at the beginning of the trial. The target only appeared transiently on the screen for 300 ms, but the joystick was always active so monkeys were free to start moving before the target vanished. Monkeys were teleported to the origin of the environment at the time of target onset. Since the environment was composed of flickering triangles, teleportation was seamless and did not interfere with the continuous nature of the task. Trials were aborted after a maximum duration of 7 seconds (5 seconds in a subset of sessions). Monkeys typically performed a block of ~ 1500 trials in each experimental session, and received binary feedback following a variable waiting period after stopping (truncated exponential distribution, range: 0.1–0.6 s; mean: 0.25 s). They received a drop of juice if their stopping position was within 0.6 m of the center of the target. No juice was provided otherwise. Monkeys were first trained extensively, with the size of the reward zone gradually reduced until their performance stopped improving. In this study, we focus only on their post-training behavior. At this point, the radius of the reward zone was fixed across trials. The fixed reward boundary of 0.6 m was determined during pilot experiments using a staircase procedure to ensure that monkeys received reward in approximately two-thirds of the trials. We collected behavioral data from 110 recording sessions (27 from monkey B, 18 from monkey S, and 65 from monkey Q) yielding a total of 121,930 trials for behavioral analyses.

Task manipulations

We performed three different task manipulations on monkeys S and Q. One of these involved manipulating the reliability of the sensory observations (optic flow) by changing the density of the ground plane elements. Trials with two densities that differed by a factor of 25 (2.5 elements/m2 and 0.1 elements/m2) were randomly interleaved in these sessions. In a second version, we manipulated the effect that actions (hand movements) had on the latent state by altering the gain of the joystick controller. In these sessions, we interleaved trials in which the gain of the joystick controller was switched between 1× (identical to baseline), 1.5×, and 2×. Within each trial, both linear and angular velocities were scaled by the same gain factor in order to avoid inducing different effects on linear and angular responses. Finally, we disrupted the transitions between the latent states by adding a brief external passive displacement that moved the subjects away from their expected path, at a random time during the trial. The perturbations had a fixed duration of 1 s and their velocity had a Gaussian profile with a standard deviation of 0.2 s and an amplitude that, on each trial, was drawn randomly from a uniform distribution bounded between −2 and 2 m/s and between −120°/s and 120°/s for the linear and angular velocity, respectively. The perturbation onset time was randomly varied from 0 to 1 s after movement onset.

Behavioral recording and acquisition

All stimuli were generated and rendered using C++ Open Graphics Library (OpenGL) by continuously repositioning the camera based on joystick inputs to update the visual scene at 60 Hz. The camera was positioned at a height of 0.1 m above the ground plane. Spike2 software (Cambridge Electronic Design Ltd., Cambridge, UK) was used to record and store the time series of target locations as well as the animal’s location in the virtual environment for offline analysis. All behavioral data were recorded along with the event markers at a sampling rate of 833\(\frac{1}{3}\) Hz.

Tracking of eye and hand movements

We recorded the horizontal and vertical positions of both eyes using chronically implanted scleral search coils in monkeys Q and B. Eye-tracking in monkey S was performed using a video-based eye-tracking system (ISCAN Inc., Woburn, MA, USA). Additionally, a video of the monkeys’ hand movements was captured at 30 frames/s using a 1280 × 960 (1.2 Megapixels) industrial-grade monochrome CCD camera (DMK 23U445, The Imaging Source LLC, Charlotte, NC, USA). The start and end of the video recording were synchronized with other behavioral data using a trigger pulse sent by the stimulus acquisition software (Spike2). We used DeepLabCut72, a Python toolbox, to extract the trajectory of hand movements from the above videos. To do this, we first labeled the same set of identifiable features (fingers and wrist) in a random subset of 200 frames from one randomly chosen video recording. We then trained a deep neural network model using DeepLabCut on an NVIDIA Quadro P5000 GPU until the training error for the set of labeled frames saturated (typically around 500,000 iterations). Finally, we analyzed all the videos using the trained network to extract the time course of the spatial location of the features of interest.

Neural recording and acquisition

We recorded extracellularly using multi-electrode arrays (Blackrock Microsystems, Salt Lake City, UT, USA) from area 7a. Broadband neural signals were amplified and digitized at 30 kHz using a digital headstage (Cereplex E, Blackrock Microsystems, Salt Lake City, UT, USA), processed using the data acquisition system (Cereplex Direct, Blackrock Microsystems) and stored for offline analysis. Additionally, for each channel, we also stored low-pass filtered (−6 dB at 250 Hz) local-field potential (LFP) signals sampled at 500 Hz. Finally, copies of event markers were received online from the stimulus acquisition software (Spike2) and saved alongside the neural data.

Spike detection and sorting

Spike detection and sorting were initially performed on the raw (broadband) neural signals using MATLAB KiloSort73 software on an NVIDIA Quadro P5000 GPU. The software uses template-matching both for detection and clustering of spike waveforms. The spike clusters produced by KiloSort were visualized with a Python package called Phy and manually refined by a human observer using standard heuristics. A typical recording session yielded 70–100 neurons across electrodes.

Models

Generalized additive model

To test whether task variables modulate neural activity, we fit a Poisson generalized additive model (GAM) to the responses of individual neurons. The model relates the spike counts \({\mathbf{r}}_{t}\in {\mathbb{Z}}_{+}^{N}\) of the neural population to continuous-valued input variables \({\mathbf{x}}_{t}\in {\mathbb{R}}^{{N}_{\mathrm{C}}}\), binary events \({\mathbf{z}}_{t}\in {\{0,\,1\}}^{{N}_{\mathrm{E}}}\) and past neural activity \({\mathbf{r}}_{1:t-1}\) according to:

$$\log ({\mu }_{t}^{i})=\sum _{k=1}^{{N}_{\mathrm{C}}}{f}_{k}^{i}({x}_{t}^{k})+\sum _{l=1}^{{N}_{\mathrm{E}}}({g}_{l}^{i}*{z}_{1:t-1}^{l})+({h}^{i}*{r}_{1:t-1}^{i})+{b}^{i}$$
(1)

where \({r}_{t}^{i} \sim \,\mathrm{Poisson}({\mu }_{t}^{i})\) denotes the Poisson-distributed response of neuron i at time t, \({x}_{t}^{k}\) is the magnitude of the kth continuous-valued input variable at time t, \({f}_{k}^{i}(\cdot )\) is any generic nonlinear function operating on xk, \({z}_{t}^{l}\) is the value of the lth binary event at time t, \({g}_{l}^{i}\) is the temporal filter operating on zl, hi is the causal spike-history filter that accounts for the refractory period and other autoregressive effects, NC and NE denote the total number of continuous-valued inputs and binary events respectively, ‘*’ denotes the convolution operator, and bi is an additive constant to capture tonic firing. This model did not take recurrent interactions between neurons into account, so we refer to it as the uncoupled model. We also fit an extension of the above model that included coupling between neurons as follows:

$$\log ({\mu }_{t}^{i})=\sum _{k=1}^{{N}_{\mathrm{C}}}{f}_{k}^{i}({x}_{t}^{k})+\sum _{l=1}^{{N}_{\mathrm{E}}}({g}_{l}^{i}*{z}_{1:t-1}^{l})+({h}^{i}*{r}_{1:t-1}^{i})+\sum _{\begin{array}{c}j=1 \\ j\ne i\end{array}}^{N}({p}_{j}^{i}*{r}_{1:t-1}^{j})+{b}^{i}$$
(2)

where \({p}_{j}^{i}\) is the causal coupling filter that captures the directional interaction from neuron j to neuron i and N denotes the total number of neurons in the recording. We refer to this as the coupled model. Details about model parameters are stated in the Model fitting section.
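To make the structure of Eqs. (1)–(2) concrete, the Python sketch below assembles the linear predictor for one neuron from hypothetical tuning functions and filters. The function names, array shapes, and binning conventions are illustrative assumptions for exposition, not the fitting code used in this study.

```python
import numpy as np

def log_rate(x, z, r_pop, i, f_list, g_list, h, coupling, b):
    """Linear predictor of Eqs. (1)-(2) for neuron i at every time bin.

    x        : (T, N_C) continuous task variables
    z        : (T, N_E) binary event indicators
    r_pop    : (T, N)   binned spike counts of all recorded neurons
    f_list   : list of N_C callables (tuning functions, applied elementwise)
    g_list   : list of N_E causal event filters (1D arrays)
    h        : causal spike-history filter (1D array)
    coupling : dict {j: filter from neuron j to neuron i}; empty dict = uncoupled model
    b        : scalar baseline (tonic firing)
    """
    T = x.shape[0]

    def causal_conv(sig, filt):
        # convolve the filter with the signal's past only (bins up to t-1)
        past = np.concatenate(([0.0], sig[:-1]))
        return np.convolve(past, filt, mode="full")[:T]

    log_mu = np.full(T, float(b))
    for k, f in enumerate(f_list):
        log_mu += f(x[:, k])                    # tuning to continuous variables
    for l, g in enumerate(g_list):
        log_mu += causal_conv(z[:, l], g)       # event filters
    log_mu += causal_conv(r_pop[:, i], h)       # spike-history term
    for j, p in coupling.items():
        log_mu += causal_conv(r_pop[:, j], p)   # coupling filters (Eq. 2 only)
    return log_mu                               # Poisson mean is exp(log_mu)
```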

Recurrent neural network model

We trained a fully connected recurrent neural network (RNN) comprising N = 100 nonlinear firing rate units to solve the same task as the monkeys. The network contained M = 4 input channels, two for conveying the 2D target location (x) encoded in the amplitude of a transient pulse delivered at the beginning of the trial and two for conveying sensory feedback about the 2D self-motion velocity (z) throughout the trial. There were P = 2 output channels, one each for controlling the velocity of the ‘hand’ along the linear and angular axes of the joystick (y). The network was similar to those commonly trained to solve standard neuroscience tasks, but with one key architectural modification: the output channels were temporally integrated and fed back to the network through the input channels conveying movement velocity (i.e., \({\mathbf{z}}_{t}=\int _{0}^{t}{\mathbf{y}}_{s}\,ds\)), thereby closing the sensorimotor loop. This feedback mimics the functionality of the virtual reality simulator that uses the joystick output to render real-time sensory feedback in the form of optic flow in our experiments. To mimic noise in the motor periphery, we added a small amount of process noise to the output channels before integrating. The equation governing the network dynamics was:

$$\tau \dot{\mathbf{r}}=-\mathbf{r}+\tanh ({W}^{\mathrm{rec}}\mathbf{r}+{W}^{\mathrm{in}}\tilde{\mathbf{x}})\quad \mathrm{and}\quad \mathbf{y}={W}^{\mathrm{out}}\mathbf{r}$$
(3)

where r is the population activity, \(\dot{\mathbf{r}}\) denotes its time-derivative, y is the network output representing hand velocity, \(\tilde{\mathbf{x}}=(\mathbf{x},\,\mathbf{z})\) denotes the input to the network obtained by concatenating the target location and the sensory feedback obtained by integrating the network output, τ is the cell-intrinsic time constant, and \(\tanh (\cdot )\) is the neuronal nonlinearity applied elementwise. Matrices \({W}^{\mathrm{rec}}\in {\mathbb{R}}^{N\times N}\), \({W}^{\mathrm{in}}\in {\mathbb{R}}^{N\times M}\) and \({W}^{\mathrm{out}}\in {\mathbb{R}}^{P\times N}\) correspond to recurrent, input, and output weights respectively. Details about inputs, outputs, and the training procedure used to learn the network parameters are stated in the Model fitting section.
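As an illustration of the closed sensorimotor loop, the sketch below integrates Eq. (3) with a simple Euler scheme in Python. The time step, noise levels, pulse duration, and the treatment of the output as an acceleration that is integrated into velocity and position are simplifying assumptions made for exposition, not the exact training setup.

```python
import numpy as np

def simulate_closed_loop_rnn(W_rec, W_in, W_out, target, n_steps,
                             dt=0.01, tau=0.02, pulse_steps=30,
                             motor_noise=0.01, sensory_noise=0.01, rng=None):
    """Euler integration of Eq. (3) with the loop closed through the 'simulator'.

    W_rec: (N, N), W_in: (N, 4), W_out: (2, N); target: (2,) goal location.
    Input channels: [target_x, target_y, feedback_linear_vel, feedback_angular_vel].
    The output is treated as acceleration: integrated once to give the velocity
    that is fed back (z), and once more to give position.
    """
    rng = np.random.default_rng() if rng is None else rng
    r = np.zeros(W_rec.shape[0])
    vel = np.zeros(2)
    pos = np.zeros(2)
    for t in range(n_steps):
        x = target if t < pulse_steps else np.zeros(2)        # transient target pulse
        z = vel + sensory_noise * rng.standard_normal(2)      # noisy velocity feedback
        u = np.concatenate([x, z])                            # concatenated input (x, z)
        r = r + (dt / tau) * (-r + np.tanh(W_rec @ r + W_in @ u))
        y = W_out @ r + motor_noise * rng.standard_normal(2)  # noisy motor output
        vel = vel + dt * y                                    # first integration
        pos = pos + dt * vel                                  # second integration
    return pos, r
```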

Linear decoder

For each recording session, we regressed the time course of the population pattern of instantaneous firing rates \(R\in {\mathbb{R}}^{T\times N}\) (where N is the size of the neural population and T is the total number of time bins) separately against each continuous-valued variable x1:T to obtain weights \(\mathbf{w}={({R}^{\mathrm{T}}R)}^{-1}{R}^{\mathrm{T}}\mathbf{x}\). Firing rates were estimated by convolving the spike train with an exponential filter with time constant η as a hyper-parameter. For each target variable, we obtained the regression weights w using data from the training set (80% of trials) and decoded that variable from the population activity observed in a validation set (10% of trials) to estimate the decoding error \(\epsilon=\sqrt{{\sum }_{t}{({\mathbf{w}}^{\mathrm{T}}{R}_{t}-{x}_{t})}^{2}}\) where t denotes the time bin. For each task variable, we determined the optimal timescale of the filter η within a range between ~25 and 250 ms as the timescale that minimized the decoding error in the validation set. Finally, decoding performance was evaluated by decoding the population activity observed in an independent test set (remaining 10% of trials) using the regression weights corresponding to the optimal timescale.
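A minimal Python sketch of this decoding pipeline is shown below, assuming spike counts binned at a fixed resolution. The recursive exponential filter and the least-squares solver are numerically convenient stand-ins for the convolution and the closed-form normal equations written above.

```python
import numpy as np

def exp_smooth(spikes, eta, dt):
    """Causal exponential filter (time constant eta, in s) applied to each column
    of a (T, N) array of binned spike counts."""
    a = np.exp(-dt / eta)
    rates = np.zeros(spikes.shape, dtype=float)
    acc = np.zeros(spikes.shape[1])
    for t in range(spikes.shape[0]):
        acc = a * acc + (1.0 - a) * spikes[t]
        rates[t] = acc
    return rates

def fit_decoder(R, x):
    """Least-squares weights; numerically equivalent to w = (R^T R)^-1 R^T x."""
    return np.linalg.lstsq(R, x, rcond=None)[0]

def decoding_error(w, R, x):
    """Root-sum-squared error between decoded and true variable over time bins."""
    return np.sqrt(np.sum((R @ w - x) ** 2))
```

In this sketch, one would call exp_smooth with each candidate timescale, fit the weights on the training split, and keep the timescale with the lowest decoding_error on the validation split before scoring the held-out test split.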

Model fitting and evaluation

Generalized additive model

Given a time series of neuronal responses r of a population of neurons, and inputs x and z, the goal is to recover the set of all tuning functions fi, temporal filters gi, hi, coupling filters pi and the additive constant bi for each neuron i. We solve this by computing the maximum a posteriori (MAP) estimate:

$$\{{\hat{\mathbf{f}}}^{i},\,{\hat{\mathbf{g}}}^{i},\,{\hat{\mathbf{h}}}^{i},\,{\hat{\mathbf{p}}}^{i},\,{b}^{i}\}=\mathop{\arg \max }\limits_{{\mathbf{f}}^{i},\,{\mathbf{g}}^{i},\,{\mathbf{h}}^{i},\,{\mathbf{p}}^{i},\,{b}^{i}}\,P(\mathbf{r},\,\mathbf{x},\,\mathbf{z}|{\mathbf{f}}^{i},\,{\mathbf{g}}^{i},\,{\mathbf{h}}^{i},\,{\mathbf{p}}^{i},\,{b}^{i})\,P({\mathbf{f}}^{i},\,{\mathbf{g}}^{i},\,{\mathbf{h}}^{i},\,{\mathbf{p}}^{i},\,{b}^{i})$$
(4)

where \(P(\mathbf{r},\,\mathbf{x},\,\mathbf{z}|{\mathbf{f}}^{i},\,{\mathbf{g}}^{i},\,{\mathbf{h}}^{i},\,{\mathbf{p}}^{i},\,{b}^{i})={\prod }_{t}{e}^{-{\mu }_{t}^{i}}{({\mu }_{t}^{i})}^{{r}_{t}^{i}}/{r}_{t}^{i}!\) is the model likelihood for the ith neuron where \({\mu }_{t}^{i}\) is given by Eq. (1), and P(fi, gi, hi, pi, bi) is the prior over model parameters. We chose a factorizable Gaussian prior on the curvature of the tuning functions fi, the temporal filters gi and the spike-history filter hi to encourage smoothness, a Laplace prior on coupling filters pi to encourage sparseness, with no prior constraints on bi:

$$P({\mathbf{f}}^{i},\,{\mathbf{g}}^{i},\,{\mathbf{h}}^{i},\,{\mathbf{p}}^{i},\,{b}^{i})=\prod _{j=1}^{N}\prod _{l=1}^{{N}_{\mathrm{E}}}\prod _{k=1}^{{N}_{\mathrm{C}}}P({f}_{k}^{i})P({g}_{l}^{i})P({h}^{i})P({p}_{j}^{i})=\prod _{j=1}^{N}\prod _{l=1}^{{N}_{\mathrm{E}}}\prod _{k=1}^{{N}_{\mathrm{C}}}\exp \left\{-{\lambda }_{k}{\left|\frac{\partial {f}_{k}^{i}}{\partial {x}^{k}}\right|}^{2}-{\gamma }_{l}{\left|\frac{\partial {g}_{l}^{i}}{\partial t}\right|}^{2}-\alpha {\left|\frac{\partial {h}^{i}}{\partial t}\right|}^{2}-\beta |{p}_{j}^{i}|\right\}$$

where λk, γl, α, and β are the hyperparameters that penalize rough tuning functions, rough temporal kernels, and dense coupling, and \(|\cdot {|}_{2}\) and \(|\cdot {|}_{1}\) denote the ℓ2 norm and the ℓ1 norm respectively. After fitting the model parameters, we estimated the marginal tuning functions \({\mathbb{E}}[{\hat{r}}^{i}|{x}^{k}]\) to each variable xk by computing the conditional expectation of the model-predicted response \({\hat{r}}^{i}\) given variable xk, marginalizing over the remaining variables:

$${\mathbb{E}}[{\hat{r}}^{i}|{x}^{k}]={e}^{{\hat{f}}_{k}^{i}}\left(\prod _{\begin{array}{c}j=1 \\ j\ne k\end{array}}^{{N}_{\mathrm{C}}}\int {e}^{\,{f}_{j}^{i}({x}^{j})}P({x}^{j})\,d{x}^{j}\right)\left(\prod _{l=1}^{{N}_{\mathrm{E}}}\int \frac{1}{T}{e}^{\,{g}_{l}^{i}(t)*{z}^{l}(t)}\,dt\right)$$
(5)

where we have assumed that the joint probability density function over task variables can be factorized into a product of marginal densities. Under this assumption, tuning to each task variable is multiplicatively modulated by the remaining task variables without affecting its shape. Furthermore, we ignored the effect of spike-history and coupling filters because these filters did not substantially affect the average firing rate of the neuron predicted by the model i.e., their multiplicative modulation was close to unity. Marginal temporal responses to events zl were determined by computing \({\mathbb{E}}[{\hat{r}}^{i}|{z}^{l}]\) in an analogous fashion. To fit the model using experimental data, we used different combinations of NC = 9 continuous-valued variables—two sensory variables (linear velocity and angular velocity), two internal estimates (distance to target and target angle), two motor variables (hand speed along the first two principal components of hand position), the instantaneous phase of the local field potential (LFP), and the two components of eye position (horizontal and vertical)—and NE = 2 discrete events (target onset and reward onset). Although the motor variables (hand speeds) were continuous-valued, they changed in a phasic manner with most changes concentrated around the onset of navigation and end of navigation. Preliminary analyses indicated that the associated neural changes were better captured by (acausal) temporal filtering of hand speed than tuning functions to hand speed. Therefore, we fit temporal kernels to capture the relationship between the motor variables and neuronal activity.

To fit the functions f, g, h, p, we expressed each of them as a linear combination of basis functions. Tuning functions f were parameterized using a basis of ten boxcar functions, where each function spanned an equal range of the predictor variable. Temporal filters g were parameterized using a basis of ten raised cosine filters spanning a range of 600 ms. The filter associated with target onset was causal ([0, 600] ms), while the remaining filters were non-causal ([-300, 300] ms). Both the spike-history filter h and the coupling filters p were expressed using a basis of ten causal raised cosine filters on a logarithmic time scale. Spike-history filters spanned 350 ms, while coupling filters spanned 1.375 s. For each category of filter, the time duration was set to be the largest value beyond which the filters did not substantially improve the model likelihoods in preliminary analyses.
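For readers unfamiliar with this parameterization, the sketch below builds a bank of raised-cosine bumps, optionally log-spaced in time as used here for the spike-history and coupling filters. The exact spacing, offset, and normalization conventions are assumptions, since they are not specified above.

```python
import numpy as np

def raised_cosine_basis(n_basis, duration, dt, log_spaced=True, offset=0.01):
    """Bank of raised-cosine bumps tiling [0, duration); columns are basis functions.

    If log_spaced is True, bump centers are equally spaced in log(t + offset),
    which concentrates temporal resolution near t = 0.
    """
    t = np.arange(0.0, duration, dt)
    warp = np.log(t + offset) if log_spaced else t
    centers = np.linspace(warp.min(), warp.max(), n_basis)
    width = centers[1] - centers[0]
    B = np.zeros((len(t), n_basis))
    for j, c in enumerate(centers):
        arg = np.clip((warp - c) * np.pi / (2.0 * width), -np.pi, np.pi)
        B[:, j] = 0.5 * (1.0 + np.cos(arg))
    return t, B
```

For example, raised_cosine_basis(10, 0.35, 0.006) would tile a 350 ms spike-history window with ten log-spaced bumps at a hypothetical 6 ms bin width; a fitted filter is then the basis matrix multiplied by ten learned coefficients.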

Regularization hyperparameters were first determined using a cross validation procedure on a subset of neurons. In this procedure, we varied the hyperparameter values on a logarithmic scale from 0.001 to 1000 and fit the model by including all task variables for each hyperparameter setting using 90% of the data, and chose the hyperparameter combination with the highest model likelihood in the remaining 10% of the data. To reduce the complexity of this procedure, we assumed a three-dimensional hyperparameter space with one hyperparameter each for all tuning functions (f), all event-related and spike-history temporal filters (g, h), and all coupling filters (p). The value of the hyperparameters dictates the bias-variance trade-off in the model: whereas large values yield flat tuning functions and predict responses that lack task-specificity, small values will lead to poor test performance due to over-fitting. The optimal setting was found to be identical ([λk=100, γl=α=10, β=10]) for the vast majority of neurons in the subset. Therefore, we used these values for fitting all neurons in the data as described below.

We fit several models by choosing different combinations of variables, performed 10-fold cross-validation to compute model likelihoods in each case, and selected the combination with the highest likelihood by the method of Backward Elimination, which removed variables that did not contribute to improving the model likelihood. Because the model contained a large number of coupling filters (equal to population size), these filters were selected as one group in the elimination process to minimize the computational complexity of the fitting procedure. Each fold of cross-validation comprised 9% of the trials, such that model selection was done using 90% of the data. The remaining 10% was used to evaluate the variance explained by the best-fit model. We estimated the variance explained (Pseudo-R2) of a model \({\mathcal{M}}\) as \(\left[1-\frac{({L}_{\infty }\,-\,{L}_{{\mathcal{M}}})}{({L}_{\infty }\,-\,{L}_{0})}\right]\) where \({L}_{{\mathcal{M}}}\) is the log-likelihood of the model obtained by setting the mean of the Poisson spiking process \({\mu }_{t}^{i}=\hat{r}(t)\), \({L}_{\infty }\) is the log-likelihood of a model with \({\mu }_{t}^{i}=r(t)\), and L0 is the log-likelihood of a model with constant firing rate \({\mu }_{t}^{i}={b}^{i}\). Note that \({L}_{\infty }\) is the maximum possible log-likelihood achievable by any Poisson spiking model, while L0 is the maximum possible log-likelihood achievable by a model with constant firing rate. The variance explained by any particular variable was estimated as the reduction in variance explained when that variable is removed from the model containing the set of all variables. We also estimated the fraction of variance explained in a more conventional way as the coefficient of determination (R2) by comparing the raw firing rate (obtained by smoothing the observed spike train with a 60 ms wide Gaussian) and model-estimated firing rates, and found qualitatively very similar results. We therefore used this latter measure, \({R}^{2}=1-\frac{\mathrm{Var}({\hat{r}}^{i}-{r}^{i})}{\mathrm{Var}({r}^{i})}\), for reporting variance explained in single neurons throughout the text. Variance explained in the structure of the population response was computed using an expression similar to the coefficient of determination, except the numerator and denominator were both summed across neurons, \({R}_{\mathrm{pop}}^{2}=1-\frac{{\sum }_{i}\mathrm{Var}({\hat{r}}^{i}-{r}^{i})}{{\sum }_{i}\mathrm{Var}({r}^{i})}\). This measure is influenced more by the model’s ability to explain responses of neurons with larger intrinsic variability. This is motivated by the fact that if most of the fluctuations in population activity are driven by a tiny fraction of neurons, then capturing the responses of those neurons is more critical to explaining the structure of the population response.
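The pseudo-R2 and population R2 defined above reduce to a few lines of Python. The sketch below assumes binned spike counts and model-predicted rates as inputs, and drops the log(r!) term of the Poisson likelihood, which is identical for every model and cancels in the differences.

```python
import numpy as np

def poisson_loglik(r, mu, eps=1e-9):
    """Poisson log-likelihood summed over bins, without the constant log(r!) term."""
    mu = np.maximum(mu, eps)
    return np.sum(r * np.log(mu) - mu)

def pseudo_r2(r, mu_model, mu_const):
    """Pseudo-R2 = 1 - (L_inf - L_model) / (L_inf - L_0), per the definition above."""
    L_inf = poisson_loglik(r, r)         # saturated model: mu_t = r_t
    L_mod = poisson_loglik(r, mu_model)  # fitted model rate
    L_0 = poisson_loglik(r, mu_const)    # constant-rate model
    return 1.0 - (L_inf - L_mod) / (L_inf - L_0)

def population_r2(r_hat, r):
    """R2_pop: residual and total variances summed across neurons (columns)."""
    return 1.0 - np.sum(np.var(r_hat - r, axis=0)) / np.sum(np.var(r, axis=0))
```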

Recurrent neural network model

We trained the RNN model defined in Eq. (3) by learning the recurrent weights Wrec using backpropagation through time (BPTT). On each trial, the 2D target location was encoded by the amplitudes of a 300 ms pulse arriving at two of the input channels (x). Target locations were drawn from the possible locations spanning the same range of distances and angles as in the monkey experiments, and varied randomly across trials. The network output (y) corresponded to the 2D hand velocity, such that an output of zero is akin to holding the joystick at a fixed position and would produce no change in the velocity of self-motion. Non-zero output, on the other hand, would result in a change in motion velocity. In this sense, the output of the network encoded acceleration and was therefore integrated twice to compute the 2D position s. The network was trained to reach the target location (x) within a certain time t* and stay there for 0.6 s (maximum stopping duration for monkeys). t* corresponded to the time taken when traveling along an idealized circular trajectory from the starting location to the target location at maximum speed. To simulate sensory feedback in the form of optic flow, we integrated the network output once to compute the 2D self-motion velocity, and fed it back to the remaining two input channels (z) with a small amount of sensory noise. The time-constant τ was set to 20 ms and each training trial lasted between 2 and 3 s depending on the target location. Weights were updated at the end of each trial by computing their gradients with respect to the loss function, \({\mathcal{L}}={\sum }_{k}{\sum }_{t \,>\, {t}^{*}}|{s}_{k}(t)-{x}_{k}(t){|}^{2}\), using BPTT. We also trained a variant of this model where the loss function contained additional terms that penalized high amplitudes and fast fluctuations in the network output and activity (\(||\mathbf{y}|{|}^{2}\), \(||\dot{\mathbf{y}}|{|}^{2}\), \(||\mathbf{r}|{|}^{2}\), \(||\dot{\mathbf{r}}|{|}^{2}\)). This variant had smoother and sparser activity profiles, and exhibited sequential dynamics that were more comparable with the neural data. In all cases, a small amount of process noise was added to the motor output channels (i.e., noisy plant) during training, to prevent the network from learning a purely autonomous control policy. Training was halted to probe the resulting neural representation once the performance reached the level of the average monkey.

The network was then retrained to be robust to task manipulations by fixing the input and output weights, and updating only the recurrent weights Wrec. Sensory reliability was manipulated by increasing the amount of noise added to the sensory feedback channels (z). Motor gain was manipulated by multiplying the network output (y) by a gain factor before feeding it to the plant. To perturb the latent state dynamics, we added Gaussian temporal pulses to the sensory feedback channels (z) at a random time after the target onset. Because adding sensory noise did not adversely affect performance, the network was tested on this manipulation without retraining. Since we do not know the precise change in signal-to-noise ratio that corresponds to the density manipulation in the monkey experiments, we added the amount of noise that caused the performance level to fall off to the same extent as the monkeys. For the remaining manipulations, the network was retrained until the performance reached the same level as the sensory manipulation condition.

Statistical analysis

Data exclusion

Since we were interested in understanding latent state computation, we wanted to exclude data where the monkey was clearly not performing this computation. From each experimental recording session, we therefore excluded a small minority (~15%) of trials where the monkey appeared to clearly disengage from the task. Such trials were objectively identified as those in which the monkey either remained stationary throughout or failed to stop moving before the trial timed out. This is analogous to the standard practice of excluding trials in which monkeys break fixation in more controlled experiments.

Behavior

In a co-ordinate system where the monkey’s starting position was taken to be the origin, we evaluated behavioral performance by regressing each monkey’s response positions (r,θ) against target positions (r*,θ*) separately for the radial (r vs r*) and angular (θ vs θ*) co-ordinates. The precision of the responses depended on the target location. To quantify the performance across all target locations in a concise manner, we pooled all trials and performed ROC analysis as follows. For each session, we first constructed a psychometric function by calculating the proportion of correct trials as a function of a (hypothetical) reward boundary, which was varied between 0 and 4 m. Whereas an infinitesimally small boundary will result in all trials being classified as incorrect, a large enough reward boundary will yield near-perfect accuracy. To define a chance-level psychometric function, we repeated the above procedure but now by shuffling the target locations across trials, thereby destroying the relationship between target and response locations. Finally, we obtained the ROC curve by plotting the proportion of correct trials in the original dataset (true positives) against the shuffled dataset (false positives) for each value of the hypothetical reward boundary. We used the area under this ROC curve to obtain an accuracy measure as a single scalar value for each recording session.
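The ROC construction can be sketched as follows in Python, assuming per-trial stopping and target positions stored as 2D arrays; the boundary grid and the single shuffle are illustrative assumptions rather than the exact analysis settings.

```python
import numpy as np

def behavioral_auc(stop_xy, target_xy, boundaries=None, rng=None):
    """Area under the curve of true-positive rate (real targets) vs false-positive
    rate (targets shuffled across trials) as the hypothetical reward boundary grows."""
    rng = np.random.default_rng() if rng is None else rng
    if boundaries is None:
        boundaries = np.linspace(0.01, 4.0, 200)      # hypothetical reward radii (m)
    err = np.linalg.norm(stop_xy - target_xy, axis=1)
    shuffled = target_xy[rng.permutation(len(target_xy))]
    err_shuf = np.linalg.norm(stop_xy - shuffled, axis=1)
    tpr = np.array([(err < b).mean() for b in boundaries])
    fpr = np.array([(err_shuf < b).mean() for b in boundaries])
    return np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0)   # trapezoidal AUC
```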

Neural sequences

Peak-normalized responses of neurons were first calculated by grouping trials according to target distance (nearby vs distant) and target angle (leftward vs rightward) and averaging responses within each group. Spike times were re-scaled based on the trial duration before trial-averaging, and the response profile of each neuron was subsequently normalized by the peak activity. Neurons were sorted according to the timing of their peak response observed within each trial group to construct firing rate maps of sequential activity. Pattern similarity was defined as the correlation coefficient between the firing rate maps taken from either the same trial group (odd vs even trials) or different trial groups (nearby vs distant targets, or leftward vs rightward targets). The time course of the pattern similarity was computed as the correlation between population activity vectors (columns of the rate maps) taken from the same trial group (odd vs even trials) or different trial groups (nearby vs distant targets, or leftward vs rightward targets). Following ref. 74, the Sequentiality index (SqI) was defined as the geometric mean of peak sparseness (fpeak) and temporal sparseness (ftemp), \(\mathrm{SqI}=\sqrt{{f}_{\mathrm{temp}}\cdot {f}_{\mathrm{peak}}}\) where:

$${f}_{\mathrm{peak}}=\sum _{t=1}^{M}-{p}_{t}\log ({p}_{t})/\log (M)$$
(6.1)
$${f}_{\mathrm{temp}}=1-{{\mathbb{E}}}_{t}\left[\sum _{i=1}^{N}-{r}_{i}^{t}\log ({r}_{i}^{t})/\log (N)\right]$$
(6.2)

where M and N denote the number of time bins and neurons respectively, pt is the fraction of neurons whose activity peaked in time bin t, \({r}_{i}^{t}\) denotes the activity of neuron i in time bin t, normalized by the sum of activities of all neurons in that bin. Peak sparseness is high if the distribution of the time of peak activity across the population is roughly uniform. Temporal sparseness is high if only a few neurons are active in each time bin.
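A direct transcription of Eqs. (6.1)–(6.2) into Python is given below, assuming a trial-averaged rate map with neurons as rows and time bins as columns; the small epsilon guarding against log(0) is an implementation detail, not part of the definition.

```python
import numpy as np

def sequentiality_index(rate_map, eps=1e-12):
    """SqI = sqrt(f_temp * f_peak) for a (neurons x time bins) trial-averaged rate map."""
    N, M = rate_map.shape
    # peak sparseness (Eq. 6.1): entropy of the distribution of peak times across neurons
    p = np.bincount(rate_map.argmax(axis=1), minlength=M) / N
    f_peak = -np.sum(p[p > 0] * np.log(p[p > 0])) / np.log(M)
    # temporal sparseness (Eq. 6.2): 1 - mean entropy of normalized activity per time bin
    q = rate_map / (rate_map.sum(axis=0, keepdims=True) + eps)
    q = np.clip(q, eps, None)                       # avoid log(0); negligible bias
    entropy = -np.sum(q * np.log(q), axis=0) / np.log(N)
    f_temp = 1.0 - entropy.mean()
    return np.sqrt(f_temp * f_peak)
```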

Cross-correlation function

The cross-correlation between spike trains ri(t) and rj(t) was computed as \({R}_{ij}(\tau )=\frac{1}{N{\bar{r}}_{i}}\left({\sum }_{t}{r}_{i}(t)\,{r}_{j}(t-\tau )\right)-{\bar{r}}_{i}\) where \({\bar{r}}_{i}\) denotes the time-averaged firing rate of neuron i. This can be interpreted as the excess spike rate in neuron i due to neuron j36. The auto-correlation function was a special case corresponding to i = j.
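The sketch below evaluates this formula for binned spike trains, assuming that N in the expression above is the number of time bins (an assumption, since N is not defined in this subsection); bins where the lagged trains do not fully overlap are simply dropped from the sum.

```python
import numpy as np

def cross_correlation(r_i, r_j, max_lag):
    """R_ij(tau) for binned spike trains r_i, r_j (1D arrays of equal length T)."""
    T = len(r_i)
    r_i_bar = r_i.mean()
    lags = np.arange(-max_lag, max_lag + 1)
    R = np.empty(len(lags))
    for idx, tau in enumerate(lags):
        if tau >= 0:
            prod = r_i[tau:] * r_j[:T - tau]      # pairs (t, t - tau), tau >= 0
        else:
            prod = r_i[:T + tau] * r_j[-tau:]     # pairs (t, t - tau), tau < 0
        R[idx] = prod.sum() / (T * r_i_bar) - r_i_bar
    return lags, R
```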

Stability index

The stability of coupling filters and tuning functions to task manipulations was quantified using a Stability index (SI). The Stability index of coupling filters was given by \(\frac{\rho -{\rho }_{0}}{{\rho }^{*}-{\rho }_{0}}\) where ρ is the median correlation between coupling filters fit to data with and without task manipulation, ρ* is the median correlation between coupling filters fit to data in odd and even trials of the baseline task, and ρ0 is the median of the null distribution constructed by shuffling neuronal pairs. A value of 0 corresponds to highly unstable coupling where the degree of match to the baseline condition is no better than chance, and 1 corresponds to perfectly stable coupling where the filter did not change shape. The SI of tuning functions was determined in an analogous manner where ρ denoted the correlation between tuning functions of a neuron in data with or without task manipulations. Note that depending on the type of task manipulation, the distribution of some of the task variables can change substantially. For example, the sensory input (velocity) is scaled by 2× during the gain manipulation. To keep the analyses consistent, we fixed the domain over which tuning functions were computed to be identical to the domain used when fitting the model to baseline data.
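A possible implementation of the Stability index for coupling filters is sketched below, assuming each filter is stored as a row of a matrix and that the shuffle of neuronal pairs is realized by permuting rows; these are assumptions about data layout rather than a description of the original analysis code.

```python
import numpy as np

def stability_index(filt_base, filt_manip, filt_odd, filt_even, rng=None):
    """SI = (rho - rho_0) / (rho* - rho_0) for filters stored row-wise."""
    rng = np.random.default_rng() if rng is None else rng

    def median_corr(A, B):
        return np.median([np.corrcoef(a, b)[0, 1] for a, b in zip(A, B)])

    rho = median_corr(filt_base, filt_manip)           # baseline vs manipulation
    rho_star = median_corr(filt_odd, filt_even)        # odd vs even baseline trials
    shuffled = filt_manip[rng.permutation(len(filt_manip))]
    rho_0 = median_corr(filt_base, shuffled)           # shuffled-pair null
    return (rho - rho_0) / (rho_star - rho_0)
```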

Mixed selectivity index

Mixed selectivity index was used to estimate the uniformity of variance explained by different task variables in the neuronal response (Fig. 7). It was quantified as the participation ratio, \({\left[{\sum }_{i=1}^{K}{v}_{i}\right]}^{2}/{\sum }_{i=1}^{K}{v}_{i}^{2}\), where vi denotes the variance explained by task variable i, and K = 6 is the number of task variables. This index is bounded between 1 (no mixing, where only one variable contributes to predicting neural activity) and K (uniform mixing, where all variables contribute equally to predicting neural activity).
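The participation ratio itself is a one-liner; the sketch below is included because the same form reappears in the task-relevant dimensionality defined in the next section. The example values are hypothetical.

```python
import numpy as np

def participation_ratio(v):
    """(sum v_i)^2 / sum v_i^2: equals 1 when one term dominates, K when all K terms are equal."""
    v = np.asarray(v, dtype=float)
    return v.sum() ** 2 / np.sum(v ** 2)

# Example with hypothetical variance-explained values for K = 6 task variables:
# participation_ratio([0.30, 0.05, 0.02, 0.01, 0.01, 0.01])  # ~1.7, i.e., weak mixing
```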

Canonical correlation analysis

We used canonical correlation analysis (CCA), an iterative technique, to assess the task-relevant linear dimensionality of the population response (Fig. S4d). We considered the set of all continuous-valued variables except for LFP phase, resulting in a total of N = 6 task-relevant variables. We considered the set of all simultaneously recorded neurons, resulting in an M-dimensional vector of neural activity at each time step (M > N). If X and R denote the time-course of the set of all task variables and the population response respectively, we first identify a pair of vectors \(a\in {\mathbb{R}}^{N}\) and \(b\in {\mathbb{R}}^{M}\) that maximizes the correlation, \(\mathrm{Corr}({a}^{\mathrm{T}}X,\,{b}^{\mathrm{T}}R)\), between the pair of canonical variables obtained by projecting the task and neural response variables onto the directions specified by those vectors. Then, we identify a second pair of vectors in the same way but with the additional constraint that the resulting canonical variables are uncorrelated with the first pair of canonical variables. We continue this procedure N times to identify up to N task-relevant dimensions of the neural response. The dimensionality of canonical correlations is classically defined simply as the number of canonical pairs with significant correlations. However, this measure of dimensionality fails to account for differences in the actual fraction of covariance captured by those dimensions. To capture the spectrum of covariance between task variables and neural response, we instead defined “task-relevant neural dimensionality” analogously to the standard measure of participation ratio used to measure the flatness of eigenspectra.

$$D=\frac{{\left[{\sum }_{i=1}^{P}\mathrm{Cov}({a}_{i}^{\mathrm{T}}X,\,{b}_{i}^{\mathrm{T}}R)\right]}^{2}}{{\sum }_{i=1}^{P}{\left[\mathrm{Cov}({a}_{i}^{\mathrm{T}}X,\,{b}_{i}^{\mathrm{T}}R)\right]}^{2}}$$
(6.3)

where ai and bi correspond to the ith canonical pair of unit-norm vectors, M and N denote the number of neurons and task variables respectively, \(P=\min (M,\,N)\), and 1 ≤ D ≤ P.
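The sketch below estimates this quantity with scikit-learn's CCA, which is assumed to be an acceptable stand-in for the iterative procedure described above; its internal scaling conventions differ slightly from unit-norm canonical vectors, so the resulting D should be treated as approximate.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def task_relevant_dimensionality(X, R):
    """Participation ratio over covariances of canonical pairs (Eq. 6.3).

    X: (T, n_task) task variables, R: (T, n_neurons) population activity.
    """
    P = min(X.shape[1], R.shape[1])
    U, V = CCA(n_components=P, scale=True).fit(X, R).transform(X, R)
    cov = np.array([np.cov(U[:, i], V[:, i])[0, 1] for i in range(P)])
    return cov.sum() ** 2 / np.sum(cov ** 2)
```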

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.