## Abstract

Relating neural activity to behavior requires an understanding of how neural computations arise from the coordinated dynamics of distributed, recurrently connected neural populations. However, inferring the nature of recurrent dynamics from partial recordings of a neural circuit presents considerable challenges. Here we show that some of these challenges can be overcome by a fine-grained analysis of the dynamics of neural residuals—that is, trial-by-trial variability around the mean neural population trajectory for a given task condition. Residual dynamics in macaque prefrontal cortex (PFC) in a saccade-based perceptual decision-making task reveals recurrent dynamics that is time dependent, but consistently stable, and suggests that pronounced rotational structure in PFC trajectories during saccades is driven by inputs from upstream areas. The properties of residual dynamics restrict the possible contributions of PFC to decision-making and saccade generation and suggest a path toward fully characterizing distributed neural computations with large-scale neural recordings and targeted causal perturbations.


## Data availability

All neural data used in the manuscript are available at https://doi.org/10.5281/zenodo.7378387.

## Code availability

The data analysis pipeline and code to generate simulations presented in the paper are available at https://github.com/anirgalgali/residual-dynamics.


## Acknowledgements

We thank J. Reppas and W. Newsome for the data collection. We thank K. Martin and all members of the Mante Lab for their valuable feedback as well as N. Meirhaeghe, L. Duncker and M. Jazayeri for discussions and comments on the manuscript. This work was funded by the Swiss National Science Foundation (award PP00P3-157539, V.M.), the Simons Foundation (SCGB 328189 and 543013, V.M., and SCGB 543039 and 323228, M.S.), the Swiss Primate Competence Center in Research (V.M.), the Gatsby Charitable Foundation (M.S.), the Howard Hughes Medical Institute (W. Newsome) and the Air Force Research Laboratory (W. Newsome).

## Author information

### Contributions

A.R.G. and V.M. conceived and designed the study. A.R.G. developed the methods and performed the analyses, with input from M.S. and V.M. A.R.G. and V.M. wrote the manuscript. All authors were involved in discussing the results and the manuscript.

## Ethics declarations

### Competing interests

The authors have no competing interests to disclose.

## Peer review

### Peer review information

*Nature Neuroscience* thanks Matthew Perich and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

## Additional information

**Publisher’s note** Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Extended data

### Extended Data Fig. 1 Residual and effective dynamics in models of decisions and movement.

**a**, Variability in responses across trials from the same task condition is interpreted as perturbations away from the condition-averaged trajectory. The evolution of these perturbations reflects the properties of the underlying recurrent dynamics (flow field, same conventions as in Fig. 1c). Inset on right shows a magnified view of the condition-averaged trajectory (red, choice 2) and corresponding single trials (dark gray) simulated from the saddle point model. Residual vectors at each time (shown in purple for a single trial and time) are computed by subtracting the condition-averaged response at that time from the corresponding single-trial response (purple equation). Time-varying dynamics matrices (**A**_{t}) of a linear time-varying, autonomous state-space model (black equations, top-right) are fit to the residuals. These matrices approximate the dynamics in distinct ‘local’ regions of state space (for example, dashed boxes) and are indexed according to time and condition. **b**-**c**, Components of the dynamics for the models of decisions (**b**) and movement (**c**) for an example reference time (blue dot) along the condition-averaged trajectory for choice 1. Same conventions as in Fig. 2a. Dynamics are shown for a local state-space region close to the corresponding initial condition (boxes in Fig. 1c, d; left). For all models, the estimated effective and residual dynamics (columns 5 and 6) closely match the true effective and residual dynamics (columns 3 and 4). In these models, the residual dynamics (column 4) reflects only the recurrent dynamics (column 1), but is not identical to it. For one, the fixed point of the residual dynamics is by definition located at the reference state (the blue dot), which in general does not match the position of fixed points of the recurrent dynamics (for example, the red circle in the first row and first column, corresponding to the position of the unstable fixed point in the saddle point model).
The position of fixed points of the recurrent dynamics can only be inferred if the inputs are known, a requirement that is not fulfilled in many experimental settings. For another, consistent drifts resulting from the recurrent dynamics (for example, the drift along the channel in the dynamic attractor model) are not reflected in the residual dynamics. Such drifts are ‘subtracted’ from the variability in the computation of residuals. Differences in the underlying recurrent dynamics are more apparent in the residual than in the effective dynamics in cases where the input drive is strong. For example, the average cosine similarity between flow fields is 0.27/0.99 (saddle vs. line-attractor), 0.02/0.94 (saddle vs. point-attractor) and 0.58/0.95 (line-attractor vs. point-attractor) for the residual/effective dynamics.
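The residual computation and the fit of the matrices **A**_{t} described in **a** can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the 2D system `A_true`, the ramping `shared_input` and all numerical values are hypothetical, and observations are assumed noise-free so that plain least squares suffices.

```python
import numpy as np

def compute_residuals(trials):
    """Subtract the condition-averaged trajectory from every single trial.

    trials: array (n_trials, n_times, n_dims) for ONE task condition.
    """
    return trials - trials.mean(axis=0, keepdims=True)

def fit_time_varying_dynamics(residuals):
    """Least-squares fit of A_t in r_{t+1} = A_t r_t + noise, one matrix per time step."""
    n_trials, n_times, n_dims = residuals.shape
    A = np.empty((n_times - 1, n_dims, n_dims))
    for t in range(n_times - 1):
        # lstsq solves r_t @ B = r_{t+1}; the dynamics matrix is B transposed.
        B = np.linalg.lstsq(residuals[:, t], residuals[:, t + 1], rcond=None)[0]
        A[t] = B.T
    return A

rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.0], [0.0, 0.5]])   # hypothetical recurrent dynamics
n_trials, n_times = 5000, 12
# Hypothetical input that is identical on every trial (ramps up over time).
shared_input = np.linspace(0.0, 1.0, n_times)[:, None] * np.array([2.0, -1.0])

x = np.zeros((n_trials, n_times, 2))
x[:, 0] = rng.normal(size=(n_trials, 2))
for t in range(n_times - 1):
    noise = 0.3 * rng.normal(size=(n_trials, 2))
    x[:, t + 1] = x[:, t] @ A_true.T + shared_input[t] + noise

residuals = compute_residuals(x)
A_hat = fit_time_varying_dynamics(residuals)
print(np.round(A_hat.mean(axis=0), 2))
```

Because the input is the same on every trial, it cancels when the condition average is subtracted, so in this toy setting the fitted matrices approximate `A_true` even though the input dominates the average trajectory.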

### Extended Data Fig. 2 Schematic of analysis pipeline.

Schematic depicting the complete data analysis pipeline for inferring residual dynamics from noisy neural population recordings (see Methods). The pipeline involves four sequential steps. Step 1: session alignment; involves pooling single trials from different recording sessions to increase the statistical power of the analyses. Step 2: dynamics subspace estimation; involves using ‘aligned’ single-trial neural residuals to obtain estimates of a dynamics subspace (**U**_{dyn}) that effectively contains the residual dynamics. Step 3: residual latent state estimation; involves using the first stage of a two-stage least squares (2SLS) approach to estimate a ‘denoised’ latent residual state. Step 4: time-varying dynamics estimation; uses the denoised residual latent states (obtained in step 3) for the second stage of the 2SLS, to estimate the time-varying residual dynamics matrices (**A**_{t}).
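Steps 3 and 4 can be illustrated with a generic 2SLS estimator. The sketch below rests on assumptions that are not taken from the paper: the instrument is the observed residual one time step in the past, observation noise is independent across time, and the names `ols`, `two_stage_least_squares` and all parameter values are hypothetical. The choice of instruments and lags in the actual pipeline is described in the Methods.

```python
import numpy as np

def ols(x, y):
    """Return A minimizing ||y - x @ A.T||; x and y are (n_trials, n_dims)."""
    return np.linalg.lstsq(x, y, rcond=None)[0].T

def two_stage_least_squares(z, x, y):
    """Stage 1: predict regressors x from instruments z (denoising).
    Stage 2: regress y on the stage-1 prediction."""
    x_hat = z @ np.linalg.lstsq(z, x, rcond=None)[0]
    return ols(x_hat, y)

rng = np.random.default_rng(1)
n_dims, n_trials, n_times = 2, 20000, 6
A_true = np.array([[0.9, -0.2], [0.2, 0.9]])   # hypothetical stable, rotational dynamics

# Latent residuals evolve under A_true with temporally uncorrelated latent noise.
latent = np.zeros((n_trials, n_times, n_dims))
latent[:, 0] = rng.normal(size=(n_trials, n_dims))
for t in range(n_times - 1):
    latent[:, t + 1] = latent[:, t] @ A_true.T + 0.5 * rng.normal(size=(n_trials, n_dims))

# Observed residuals are corrupted by independent observation noise.
observed = latent + 1.0 * rng.normal(size=latent.shape)

t = 4
z, x, y = observed[:, t - 1], observed[:, t], observed[:, t + 1]
A_ols = ols(x, y)                             # attenuated by noise in the regressors
A_2sls = two_stage_least_squares(z, x, y)     # instrument removes the attenuation

print("max |eig| OLS :", np.abs(np.linalg.eigvals(A_ols)).max())
print("max |eig| 2SLS:", np.abs(np.linalg.eigvals(A_2sls)).max())
print("max |eig| true:", np.abs(np.linalg.eigvals(A_true)).max())
```

The past residual is a valid instrument here because it is correlated with the current latent state but not with the observation noise at the current time step; that property fails if the observation noise is temporally correlated.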

### Extended Data Fig. 3 Residual dynamics of simulated, time-varying, linear dynamical systems.

**a-c**, Validation of the estimation procedure on simulations of time-varying, linear dynamical systems with temporally uncorrelated latent noise (see Methods; Supplementary Methods). Simulations are based on a latent variable dynamical system with 3 latent dimensions and 20 observed dimensions. Residual responses are generated using a Gaussian (circle markers: fixed latent noise variance; square markers: latent noise variance switches midway through the trial) or Poisson (triangle markers) observation process. In all simulations, the properties of the dynamics switch midway through the simulated time window, from slowly decaying to quickly decaying (**a**); from normal to non-normal (**b**); or from non-rotational to rotational (**c**). As in Fig. 4b–d, we characterize dynamics with the magnitude of the eigenvalues (left), the rotational frequency (middle), and the singular values (right). Markers correspond to the estimated residual dynamics, black curves to the ground-truth values. The estimated residual dynamics accurately matches the ground-truth for all types of dynamics and observation models, before and after the switch, and also reveals the time of the switch. We observed this match even when the latent noise variance of Gaussian observations was switched at the same time as the eigenvalues/eigenvectors of the dynamics (square markers), demonstrating that estimates of residual dynamics are robust to changes in latent noise variance (see also Extended Data Fig. 5a-b vs. e-f). **d**, Analogous to **c**, but for residual dynamics (circles) estimated using ordinary least squares (OLS) instead of two-stage least squares (2SLS) as in **c**. Results are only shown for data simulated using a Gaussian observation process. Unlike the 2SLS estimates, the OLS estimates are strongly biased, that is, the magnitude of the eigenvalues and the singular values are consistently underestimated.
These biases are expected: they arise because both the regressors and the dependent variables are corrupted by observation noise (see Methods). The 2SLS instead produces unbiased estimates, as the first stage of 2SLS results in a denoising of the regressors (Methods; see also Extended Data Fig. 9). **e**, Parameters of the latent noise and observation noise for the simulations in **a-d** were chosen to approximately match the variability in the measured PFC responses. The variability in the measured responses was quantified in terms of four statistics (l0, l1, l1/l0 and pvar, x-axis; see Supplementary Methods). Histograms indicate the respective values of these statistics in the neural data (one data point per task configuration, choice condition and monkey; see legend in Extended Data Fig. 6a). The open markers (top, same conventions as **a-c**) indicate the values of the statistics in the simulations for each of the three models.
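The three summaries used in **a-c** (eigenvalue magnitude, rotation frequency, singular values) can be computed from any discrete-time dynamics matrix as sketched below. The helper name `characterize_dynamics`, the bin width `dt` and the two example matrices are hypothetical and chosen only for illustration.

```python
import numpy as np

def characterize_dynamics(A, dt):
    """Summarize a discrete-time dynamics matrix A (state advanced by one bin of width dt seconds)."""
    eigvals = np.linalg.eigvals(A)
    magnitude = np.abs(eigvals)                            # < 1 means decaying (stable) modes
    rot_freq_hz = np.abs(np.angle(eigvals)) / (2 * np.pi * dt)
    singular_values = np.linalg.svd(A, compute_uv=False)   # > 1 allows transient amplification
    return magnitude, rot_freq_hz, singular_values

dt = 0.05                                  # hypothetical bin width in seconds
theta = 2 * np.pi * 1.0 * dt               # rotation angle per bin for a 1 Hz rotation
A_rot = 0.9 * np.array([[np.cos(theta), -np.sin(theta)],
                        [np.sin(theta),  np.cos(theta)]])   # normal, rotational
A_nonnormal = np.array([[0.9, 2.0],
                        [0.0, 0.9]])                        # non-normal, non-rotational

mag_rot, freq_rot, sv_rot = characterize_dynamics(A_rot, dt)
mag_nn, freq_nn, sv_nn = characterize_dynamics(A_nonnormal, dt)
print(mag_rot, freq_rot, sv_rot)
print(mag_nn, freq_nn, sv_nn)
```

For a normal matrix such as `A_rot` the singular values equal the eigenvalue magnitudes, whereas for `A_nonnormal` the largest singular value exceeds 1 although both eigenvalues are stable, which is why the singular values are reported separately.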

### Extended Data Fig. 4 Inflation of local residual dynamics in a linear two-area dynamical system.

We systematically explored the effect of correlated input variability on estimates of residual dynamics in a two-area, linear dynamical system (see Methods and Supplementary Methods). The input area implements 2D isotropic recurrent dynamics characterized by parameters *λ*_{1}, *τ*_{1}, and *ω*_{1} (eigenvalue, time constant, rotation frequency). Activity in the input area is externally driven by uncorrelated noise. Values of *λ*_{1} closer to 1 result in longer auto-correlation times in the variability of activity in the input area. This activity provides the input into the recorded area, which implements 2D isotropic recurrent dynamics with parameters *λ*_{2}, *τ*_{2}, *ω*_{2}. Residual dynamics at steady state is estimated from activity of the recorded area. At steady state, estimates can be derived analytically (see Supplementary Math Note B). Because of temporally correlated input variability, the properties of the residual dynamics (*λ*_{res}, *τ*_{res}, *ω*_{res}) in general do not match those of the recurrent dynamics in the recorded area. **a**-**b**, Inflation of eigenvalues. **a**, Schematic of the model (top) and recurrent dynamics in each area (bottom, flow fields). Recurrent dynamics is stable and non-rotational in both areas. **b**, Residual dynamics (*λ*_{res}) in the recorded area as a function of recurrent dynamics in the recorded area (*λ*_{2}, x-axis) and in the input area (*λ*_{1}, gray lines). The eigenvalues of the residual dynamics are inflated, that is, *λ*_{res} is larger than *λ*_{2} (all gray lines above the green line). Larger *λ*_{1} (longer input auto-correlations) lead to stronger inflation. For *λ*_{2} = 0 (no recurrent dynamics in the recorded area) *λ*_{res} = *λ*_{1} (gray circles). **c**-**d**, Inflation of rotation frequency. **c**, Recurrent dynamics is rotational in the input area, but stable and non-rotational in the recorded area.
**d**, Residual dynamics in the recorded area, expressed as the magnitude of the eigenvalue (*λ*_{res}, top) and the rotation frequency (*ω*_{res}, bottom). The eigenvalues of the residual dynamics are generally inflated (top), but the relation with *λ*_{2} is non-monotonic and depends on *ω*_{1}. The residual dynamics is rotational (bottom, *ω*_{res} > 0) even though the recurrent dynamics in the recorded area is not (*ω*_{2} = 0). The inflation of rotation frequency is reduced for increasing *λ*_{2}. **e**-**f**, Equivalence of upstream and local recurrent dynamics. **e**, Analogous to **c**, but dynamics is switched between input and recorded area. **f**, Analogous to **d**, but for the dynamics in **e**. The residual dynamics is identical to that in **d**. In general, residual dynamics in the recorded area reflects the combined effect of local and upstream recurrent dynamics.
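For intuition about the eigenvalue inflation in **b**, the steady-state result can be worked out in closed form in a simplified setting. The following is a sketch under assumptions that are not taken from the paper: each area is one-dimensional and non-rotational, the feedforward coupling has unit strength, and the input area is driven by white noise ε(t). The general treatment is the one in Supplementary Math Note B, which is not reproduced here.

$$
\begin{aligned}
x_1(t+1) &= \lambda_1\, x_1(t) + \varepsilon(t), \\
x_2(t+1) &= \lambda_2\, x_2(t) + x_1(t), \\
\Rightarrow\quad x_2(t+1) &= (\lambda_1 + \lambda_2)\, x_2(t) - \lambda_1 \lambda_2\, x_2(t-1) + \varepsilon(t-1).
\end{aligned}
$$

Activity in the recorded area is therefore a second-order autoregressive process. Writing γ(k) for its steady-state autocovariance at lag k and multiplying the last line by x_2(t) before taking expectations gives γ(1) = (λ_1 + λ_2) γ(0) − λ_1 λ_2 γ(1), so that the regression slope of x_2(t+1) on x_2(t) is

$$
\lambda_{\mathrm{res}} = \frac{\gamma(1)}{\gamma(0)} = \frac{\lambda_1 + \lambda_2}{1 + \lambda_1 \lambda_2},
\qquad
\lambda_{\mathrm{res}} - \lambda_2 = \frac{\lambda_1 \left(1 - \lambda_2^{2}\right)}{1 + \lambda_1 \lambda_2}.
$$

Under these assumptions λ_res = λ_1 when λ_2 = 0, and λ_res exceeds λ_2 whenever 0 < λ_1 < 1 and |λ_2| < 1, with larger λ_1 giving stronger inflation.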

### Extended Data Fig. 5 Explanation of input driven inflation in residual dynamics.

To gain an intuitive understanding of inflation of eigenvalue magnitude, we consider simulations of two-area linear dynamical systems similar to those in Extended Data Fig. 4a. For simplicity, here we simulate stable 1D dynamics in each area, whereby variability of the input into the recorded area is either temporally correlated (**c-d**) or uncorrelated (**a-b**, **e-f**), and has fixed (**a-b**, **c-d**) or time-dependent latent noise variance (**e-f**). The variability injected into the input area is always temporally uncorrelated. Recurrent dynamics in the recorded area is identical in all simulations. **a**, Model parameters for the case of temporally uncorrelated input (*λ*_{1} = 0). **b**, Contributions to activity x in the recorded area at steady state. Activity x(t) (x-axis) is propagated through the recurrent dynamics (left, y-axis) and added to the noise e(t) (middle, y-axis) to obtain activity x(t+1) at time t+1 (right, y-axis). The noise e(t) corresponds to activity/output of the input area, and is shaped by dynamics determined by *λ*_{1}. Points in the scatter plots correspond to different simulated trials. Estimating the eigenvalue of the residual dynamics in the absence of observation noise amounts to measuring the slope of the regression line relating x(t) to x(t+1) (right, gray line). In this case, this slope is identical to that obtained if the latent noise had not been added to the activity (left, gray line), meaning that residual dynamics correctly reflects the effect of the recurrent dynamics in the recorded area (slope < 0, reflecting *λ*_{2} < 0; left). **c**, Model parameters for the case of correlated input (*λ*_{1} > 0 for t > 0; *λ*_{1} = 0 at other times). **d**, Analogous to **b**, but for the model in **c**. Here activity and noise are shown at two times in the trial: early, when steady state is not yet reached (top) and late, at steady state (bottom).
At both times, residual dynamics is inflated, that is, the regression slope between x(t) and x(t+1) (right) is larger than that obtained by applying only the recurrent dynamics (left), indicating inflation of the eigenvalues. Inflation occurs because the noise itself is correlated with activity in the recorded area (middle, slope > 0), an effect that results indirectly from the correlation between e(t) and e(t-1). At steady state, even the inflated residual dynamics is still stable (bottom-right, slope < 1; see also Extended Data Fig. 4b). However, immediately after the onset of the temporally correlated input, residual dynamics erroneously reveals an instability (top-right, slope > 1). **e**, Parameters for the case of temporally uncorrelated noise but time-varying noise variance. The variance of the noise injected into the input area is increased at time t = 0, from *σ*_{latent} = 10^{−6} to 10^{−5}. **f**, A change in noise variance does not result in inflation of the residual dynamics, neither early nor late after the change (right, top and bottom; same slope as on the left; see also Extended Data Fig. 3a-c, squares).
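The regression-slope argument in **b** and **d** can be reproduced with a short simulation. The sketch below is an illustration under assumed conventions (discrete time with 0 < λ < 1 for stable dynamics, unit feedforward coupling, unit-variance white noise); the function `lagged_slopes` and the parameter values are hypothetical and do not correspond to those used for the figure.

```python
import numpy as np

def lagged_slopes(lam1_after, lam2, n_trials=50000, n_pre=100, n_post=100, seed=0):
    """Simulate input area x1 -> recorded area x2 and return, for every step after
    input onset, the regression slope of x2(t+1) on x2(t) across trials."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n_trials)       # input-area activity (white before onset)
    x2 = np.zeros(n_trials)              # recorded-area activity
    slopes = []
    for step in range(n_pre + n_post):
        lam1 = lam1_after if step >= n_pre else 0.0   # input becomes correlated at onset
        x2_next = lam2 * x2 + x1
        if step >= n_pre:
            slopes.append(np.dot(x2, x2_next) / np.dot(x2, x2))
        x1 = lam1 * x1 + rng.normal(size=n_trials)
        x2 = x2_next
    return np.array(slopes)

lam2 = 0.5
slopes_uncorrelated = lagged_slopes(lam1_after=0.0, lam2=lam2)
slopes_correlated = lagged_slopes(lam1_after=0.9, lam2=lam2)

print("uncorrelated input, late slope:", slopes_uncorrelated[-1])
print("correlated input, peak slope  :", slopes_correlated.max())
print("correlated input, late slope  :", slopes_correlated[-1])
```

In this toy version the slope stays close to `lam2` when the input is uncorrelated, whereas with correlated input it transiently exceeds 1 shortly after onset and settles at an inflated but stable value below 1.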

### Extended Data Fig. 6 Alignment of neural population responses from different experiments.

Validation of the session alignment procedure of the analysis pipeline (Extended Data Fig. 2, Step 1; see Methods). We aligned neural population responses of all experiments belonging to the same task configuration and then pooled the aligned single trial responses across experiments before computing the residuals used in estimating the dynamics. The outcome of the session alignment procedure is a set of 20 ‘aligned’ modes for each experiment, defined such that the activity of each mode has the same dependency on time and choice across experiments. **a**, Definition of task configurations. We assigned each experiment to one of four target configurations (distinguished by markers, indicated on top of each panel along with number of experiments) based on the angular position of the targets (blue: choice 1; red: choice 2). The position of the targets was similar, but not identical, across experiments within the same task configuration (left: Monkey T, right: Monkey V). **b**, Psychometric curves for all experiments in both monkeys (left: Monkey T, right: Monkey V), showing the fraction of saccades to choice 1 as a function of the signed motion coherency. Each gray data point is computed from trials belonging to a single experiment. The employed values of signed coherency varied slightly across experiments, in an attempt to achieve a comparable overall performance in each experiment. Black curves show logistic functions fitted separately to data points from a given task configuration (different markers; see legends in **c**) and evaluated at logarithmically spaced levels of coherency (positions of the white markers along the x-axis). **c**, Cumulative variance explained in condition-averaged population responses (mean ± 2 s.e.m. across experiments; symbols as in **a**, n = number of experiments in each task configuration: see **a**) as a function of the number of aligned modes in both monkeys (left: Monkey T, right: Monkey V).
The cumulative variance explained by the first 20 aligned modes for all 164 experiments in Monkey T and 80 experiments in Monkey V showed a strong positive trend with number of trials (inset, bottom) and a weak negative trend with the number of units (inset, top). **d**, Activity of the first 20 aligned modes (numbered from top-left to bottom-right) for config-3 in monkey T (15,524 trials across 41 experiments) ordered according to the amount of variance explained. Activity is defined as the projection of the population condition averages onto each mode. The projection was computed separately across experiments for choice 1 and choice 2 (blue and red) with responses aligned either to stimulus onset or saccade onset (black arrows). The resulting projections were then averaged across experiments (line: mean; shading: 2 s.e.m. across 41 experiments). **e**, Same data as in **d**, but showing the time-course of each aligned mode (numbered from 1 to 20) for each individual experiment (y-axis) separately for the two choice conditions (choice 1 and choice 2, top and bottom sub-panels). Differences in the activation of a given mode across experiments (that is, across rows in each sub-panel) are much smaller than the differences in the activations across modes (that is, across sub-panels), demonstrating the success of the alignment procedure. **f**, Absolute value of the projection (y-axis) of the 8 basis vectors (dim-1 through dim-8; red to blue) that span the dynamics subspace (*U*_{dyn}, estimated in Step 2 of the analysis pipeline; Extended Data Fig. 2) onto the 20 aligned modes, indicating the relative alignment of the aligned and dynamics subspaces. The dynamics subspace is computed separately for each task configuration (symbols as in **a**) in each monkey (left: Monkey T, right: Monkey V), and projects most strongly onto the first few aligned components (that is, large projection values for smaller aligned mode numbers).
The dynamics subspace thus largely overlaps with the subspace of activity that captures most of the task-related variance in the responses (see also Extended Data Fig. 7c). **g**, Evaluation of the alignment procedure for all task configurations (columns) in both animals (rows). Each element of the matrix is obtained from the correlation coefficient between the time-courses of two aligned modes (that is, positions along horizontal and vertical axes). We show the median correlation coefficient across all pairs of dissimilar experiments. Values close to 1 along the diagonal and close to 0 in the off-diagonal indicate that the time-courses are much more similar across experiments than across modes, indicating successful alignment.
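The evaluation in **g** amounts to correlating mode time-courses across pairs of different experiments and taking the median. A minimal sketch on synthetic data follows; the function `cross_experiment_mode_similarity`, the array layout and the sinusoidal templates are hypothetical and stand in for the aligned mode activations.

```python
import numpy as np

def cross_experiment_mode_similarity(modes):
    """modes: array (n_experiments, n_modes, n_times) of aligned-mode time-courses.

    Returns an (n_modes, n_modes) matrix whose entry (i, j) is the median, over all
    ordered pairs of different experiments (e, f), of the Pearson correlation between
    mode i in experiment e and mode j in experiment f.
    """
    n_exp = modes.shape[0]
    # Center and normalize each time-course so that dot products are correlations.
    z = modes - modes.mean(axis=2, keepdims=True)
    z = z / np.linalg.norm(z, axis=2, keepdims=True)
    corr = np.einsum('eit,fjt->efij', z, z)
    different = ~np.eye(n_exp, dtype=bool)      # exclude same-experiment pairs
    return np.median(corr[different], axis=0)

rng = np.random.default_rng(2)
n_exp, n_modes, n_times = 6, 4, 100
time = np.linspace(0.0, 1.0, n_times, endpoint=False)
# Synthetic "shared" time-courses: mutually uncorrelated sinusoids plus per-experiment noise.
templates = np.stack([np.sin(2 * np.pi * (k + 1) * time) for k in range(n_modes)])
modes = templates[None] + 0.1 * rng.normal(size=(n_exp, n_modes, n_times))

similarity = cross_experiment_mode_similarity(modes)
print(np.round(similarity, 2))
```

With well-aligned modes the matrix is close to the identity, which is the pattern the caption describes as indicating successful alignment.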

### Extended Data Fig. 7 Single session and single unit results.

**a**, Residual dynamics estimated using neural data for a single choice condition (choice-1, 875 trials) from a single experiment in monkey T. This experiment has the largest number of trials among all experiments in monkey T. Conventions as in Fig. 4b-d. We estimated the residual dynamics directly from high-dimensional residual observations that corresponded to square-root transformed, binned spike-count vectors (dimensionality = number of units; 170 for this session), without performing the session alignment (step 1 in Extended Data Fig. 2). Overall, the properties of the residual dynamics estimated from this single session are similar to those obtained after pooling trials across sessions (Fig. 4b-d, 8-dimensional), suggesting that the main features of the residual dynamics (Fig. 4) are not affected by the alignment procedure. The lower dimensionality of the estimated residual dynamics (4 dimensions, blue to cyan; compared to 8 dimensions in Fig. 4a-d) is most likely a consequence of the smaller number of available trials in the single session compared to the aligned sessions. The resulting smaller statistical power makes it harder to estimate the faster-decaying eigenmodes of the dynamics in particular. **b**, Trial-by-trial variability in single neurons is transiently reduced at the onset of specific task events. We quantified single-neuron variability as the time-varying, mean-matched Fano factor computed by pooling units/neurons across all experiments in a monkey (empty circles: mean; dashed curve: 95% normal confidence intervals obtained by resampling datapoints; left: Monkey T, n = 218,856 datapoints; right: Monkey V, n = 118,629 datapoints; each datapoint corresponds to a single neuron-condition pairing within an experiment). In both monkeys, the mean-matched Fano factor undergoes a transient reduction locked to the onset of the stimulus and the onset of the saccade. 
The reduction in variability around the time of saccade onset coincides with a contraction of the eigenvalues of the residual dynamics (Fig. 4b,e), suggesting that more quickly decaying dynamics may underlie variability quenching at that time. A contraction of eigenvalues, however, does not appear necessary to explain variability quenching, as an analogous contraction is not observed at the time of stimulus onset, despite the consistent reduction in variability at stimulus onset. **c**, Overall fraction of variance explained by the dynamics subspace. We quantified what fraction of the variance of the condition-averaged trajectories in the high-dimensional neural space (state space defined by the individual units) is contained in the dynamics subspace (*U*_{dyn}, estimated in Step 2 of the analysis pipeline; Extended Data Fig. 2). Data from all 164 experiments in monkey T. On average, the 8-dimensional dynamics subspace explains 68% of the variance in the average neural trajectories (dashed vertical line, n = 164 experiments).
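The quantity in panel **c** can be illustrated with a minimal sketch. It assumes that the condition-averaged responses are stored as a units-by-samples array, that each unit is mean-centered, and that the basis vectors spanning the subspace are orthonormal; the function name and toy data are hypothetical, and the exact normalization used in the paper may differ.

```python
def fraction_variance_in_subspace(X, U):
    """Fraction of the variance of X that lies in the span of U.

    X: list of n_units rows, each a list of n_samples condition-averaged
       responses (hypothetical layout).
    U: list of d basis vectors of length n_units, assumed orthonormal.
    """
    n_units = len(X)
    n_samples = len(X[0])
    # Center each unit so that sums of squares measure variance.
    means = [sum(row) / n_samples for row in X]
    Xc = [[x - m for x in row] for row, m in zip(X, means)]
    total = sum(x * x for row in Xc for x in row)
    captured = 0.0
    for u in U:
        for t in range(n_samples):
            # Projection of the population state at sample t onto u.
            proj = sum(u[i] * Xc[i][t] for i in range(n_units))
            captured += proj * proj
    return captured / total


# Toy data (not from the paper): 3 units, 4 samples.
X = [
    [1.0, -1.0, 2.0, -2.0],
    [0.5, -0.5, 1.0, -1.0],
    [0.1, -0.1, -0.1, 0.1],
]
U = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # 2-dimensional subspace
U_full = U + [[0.0, 0.0, 1.0]]          # basis of the full space

frac = fraction_variance_in_subspace(X, U)
frac_full = fraction_variance_in_subspace(X, U_full)
print(frac, frac_full)
```

In this toy case nearly all of the variance falls in the two-dimensional subspace, and a basis spanning the full space recovers a fraction of 1.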

### Extended Data Fig. 8 Cross-validation of hyper-parameters used for estimating residual dynamics.

**a-c**, Representative results of the cross-validation procedure used to determine the various hyper-parameters of the analysis pipeline (Extended Data Fig. 2; see Methods) for neural data from a single task configuration in monkey T (config-3, see Extended Data Fig. 6a). **a**, Cross-validated Hankel matrix reconstruction error (E_{hankel}; circle: mean over n = 20 repeats of hold-out cross-validation; error bars: 1 s.e.m.) plotted as a function of the rank of the Hankel matrix (r, step 2 in Extended Data Fig. 2; see Methods) for residuals from the two epochs (left: decision; right: movement) and two choices (blue: choice 1; red: choice 2). The reconstruction error for each of the 20 repeats was computed by assigning a random 50% of the trials as a "training" set and the rest as a "test" set. **b**, 5-fold cross-validated mean squared error (E_{fs}; circles: mean over n = 5 folds; error bars: 1 s.e.m.) of the denoised residual predictions obtained from the first stage of the two-stage least squares regression (2SLS; step 3 in Extended Data Fig. 2), plotted as a function of the hyper-parameters d (dimensionality of dynamics subspace) and l (number of past lags). For each cross-validation fold, a single mean squared error measure was computed by pooling the denoised predictions across time points in both epochs (left: choice 1; right: choice 2). **c**, Cross-validated mean squared error (circle: mean across n = 5 ‘repeats’ of the average mean squared error across 5 folds; error bars: 2 s.d. across repeats) of the residual predictions obtained from the second stage of the 2SLS regression (step 4 in Extended Data Fig. 2), plotted as a function of the smoothness hyper-parameter *α* for different epochs (left: decision; right: movement) and choices (choice 1 and 2). Both the train (orange) and test (gray) errors are shown. **d**, Summary showing the optimal value for the dimensionality d and lag l (step 3 in Extended Data Fig. 2) for all task configurations and monkeys (symbols as in Extended Data Fig. 6a). A dimensionality of 8 and a lag of 3 were deemed optimal for both monkeys and all task configurations (used in Fig. 4). **e**, Summary showing the optimal smoothness hyper-parameter *α* (step 4 in Extended Data Fig. 2) for all task configurations and monkeys. Final values of *α* were chosen to be the same across monkeys in Fig. 4 (decision epoch: *α* = 200; movement epoch: *α* = 50) despite a small degree of variability across the two monkeys. Same conventions as in **d**.

### Extended Data Fig. 9 Assessing statistical bias of eigenvalue estimates.

We estimated the residual dynamics for different choices of bin size, to identify the smallest bin size resulting in unbiased estimates. In the discrete time formulation of a linear dynamical system, like the one we use here, re-binning of the responses trivially results in a scaling of the estimated eigenvalues of the residual dynamics. To compensate for this rescaling, here we ‘mapped’ the estimated eigenvalues onto a common, reference bin size (see Methods). In the absence of statistical biases, the resulting ‘re-binned eigenvalue’ would be independent of bin size. **a**, Re-binned eigenvalues for simulations of a time-invariant, latent-variable (3 latent dimensions), LDS model (reference bin size = 40 ms) as a function of bin-size (dashed line: ground truth). Different gray lines correspond to models with different levels of latent noise (legend). When latent noise is large, estimates of the residual dynamics are biased for small bin sizes, but become unbiased when bin size is sufficiently large (light gray). When latent noise is too small, estimates are biased for any choice of bin size (black). **b**, Estimated, re-binned eigenvalues (reference bin size = 15 ms) as a function of bin size for all configurations in monkey T. Columns correspond to the 8 distinct eigenmodes of the estimated 8-dimensional residual dynamics (left to right, largest to smallest EV), rows correspond to task configurations (top to bottom, config-1 to 4; see Extended Data Fig. 6a). Here the re-binned eigenvalues were computed separately for each choice (red vs blue) and averaged in small temporal windows specific to each epoch: 0.2-0.4 s relative to stimulus onset (solid lines) and −0.15 to 0.25 s relative to saccade onset (dashed lines). All main analyses of recorded neural responses are based on a bin size of 45 ms, for which eigenvalue estimates have converged to an asymptote, suggesting that our estimates are not biased. 
Note that the re-binned eigenvalues for a bin size of 45 ms are larger than the corresponding eigenvalues reported in other figures (for example, Fig. 4b), because the former were mapped onto a reference bin size of 15 ms.
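The rescaling described above can be sketched as follows. This is an illustration only, assuming the standard relation for a time-invariant linear system sampled at two different intervals, in which an eigenvalue λ estimated at bin size Δ corresponds to λ^(Δ_ref/Δ) at the reference bin size Δ_ref. The exact mapping used in the paper is given in its Methods and may differ in detail; the function names and the example eigenvalue are hypothetical.

```python
import math


def rebin_eigenvalue(lam, bin_ms, ref_bin_ms):
    """Map a discrete-time eigenvalue estimated at bin_ms onto ref_bin_ms.

    Assumes time-invariant dynamics, so that one step of bin_ms equals
    bin_ms / ref_bin_ms steps of the reference bin size.
    """
    return lam ** (ref_bin_ms / bin_ms)


def time_constant_ms(lam, bin_ms):
    """Decay time constant implied by a stable eigenvalue magnitude."""
    return -bin_ms / math.log(abs(lam))


# Hypothetical eigenvalue estimated with 45 ms bins, mapped onto 15 ms bins.
lam_45 = 0.9
lam_15 = rebin_eigenvalue(lam_45, bin_ms=45.0, ref_bin_ms=15.0)

tau_45 = time_constant_ms(lam_45, 45.0)
tau_15 = time_constant_ms(lam_15, 15.0)
print(lam_15, tau_45, tau_15)
```

For a stable eigenvalue between 0 and 1, mapping onto a smaller reference bin yields a value closer to 1, while the implied decay time constant is unchanged. This is consistent with the note that eigenvalues mapped onto a 15 ms reference are larger than those reported for 45 ms bins.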

### Extended Data Fig. 10 Unidirectional and bidirectional communication between areas.

A population level mechanism explaining unidirectional and bidirectional communication between areas, incorporating key properties of the global residual dynamics in the feedforward (**a**, **b**) and feedback networks (**c**, **d**) in Fig. 7. We simulated time-independent, two-dimensional, linear dynamics, whereby the two cardinal dimensions (left panels in **a**-**d**) represent the choice modes in PPC and PFC (Fig. 6a,d). The time modes in each area are ignored here. We simulated a local perturbation (right panels in **a**-**d**) either in PPC (**a**, **c**) or PFC (**b**, **d**) by initializing activity along the corresponding choice mode (black circles, left panels) and then letting activity evolve (white points) based on the linear dynamics determined by the respective EVs (Fig. 7a; see Supplementary Methods). **a**, Perturbation in PPC in the feedforward model. Left: evolution of activity in the two-dimensional, global state-space spanned by PPC and PFC. Right: time-course of the norm of the population activity. The PPC perturbation causes expanding activity in PPC that propagates to PFC. **b**, Perturbation of PFC in the feedforward model in Fig. 6a. The PFC perturbation decays in PFC and does not propagate to PPC. This unidirectional communication results from non-normal dynamics, as EV_{1} is shared, while EV_{3} is private to PFC (EV_{1} not orthogonal to EV_{3}). **c**, Perturbation of PPC in the feedback model. The PPC perturbation causes a dip in PPC and expanding activity in PFC. **d**, Perturbation of PFC in the feedback model in Fig. 6d. The PFC perturbation causes a dip in PFC and expanding activity in PPC. In the feedback model, perturbations in one area thus propagate to the other area. This bidirectional communication arises because both EV_{1} and EV_{4} are shared equally between PPC and PFC. 
Somewhat counter-intuitively, the existence of bidirectional communication in these models can be inferred when considering activity in the perturbed area alone. Activity in the perturbed area initially decays, and expands only later; activity in the unperturbed area does not show this dip. The dip occurs because any local perturbation is only partially aligned with the shared, unstable direction (EV_{1}). Initially, activity in the perturbed area therefore mostly reflects the rapidly decaying component of activity along the second, global eigenvector (EV_{4}).
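The qualitative behavior described in this legend can be reproduced with a minimal sketch of two-dimensional, time-independent linear dynamics. The matrices below are hypothetical and are not taken from the Supplementary Methods. Both are chosen to have one unstable eigenvalue (1.05) and one stable eigenvalue (0.5): the feedforward matrix is non-normal, with a shared unstable eigenvector and a stable eigenvector private to PFC, whereas the feedback matrix is symmetric, with both eigenvectors shared equally between the areas.

```python
def simulate(A, x0, n_steps):
    """Iterate x[t+1] = A x[t] for a 2x2 matrix A; returns the trajectory."""
    x = tuple(x0)
    traj = [x]
    for _ in range(n_steps):
        x = (A[0][0] * x[0] + A[0][1] * x[1],
             A[1][0] * x[0] + A[1][1] * x[1])
        traj.append(x)
    return traj


# State is (PPC choice mode, PFC choice mode). All values are assumptions.
A_FF = ((1.05, 0.0),      # PPC evolves on its own and is unstable
        (0.30, 0.5))      # PFC receives PPC input and decays locally
A_FB = ((0.775, 0.275),   # symmetric coupling: eigenvalues 1.05 and 0.5,
        (0.275, 0.775))   # eigenvectors (1, 1) and (1, -1)

n = 20
ff_ppc = simulate(A_FF, (1.0, 0.0), n)  # perturb PPC, feedforward model
ff_pfc = simulate(A_FF, (0.0, 1.0), n)  # perturb PFC, feedforward model
fb_ppc = simulate(A_FB, (1.0, 0.0), n)  # perturb PPC, feedback model

# With one dimension per area, the norm of each area is the absolute value.
perturbed = [abs(x[0]) for x in fb_ppc]
unperturbed = [abs(x[1]) for x in fb_ppc]
print(perturbed[:6])
print(unperturbed[:6])
```

In the feedforward case the PPC perturbation grows and spreads to PFC, while the PFC perturbation decays without reaching PPC. In the feedback case the perturbed area first dips and then expands, while the unperturbed area grows monotonically, matching the description above.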

## Supplementary information

### Supplementary Information

Supplementary Methods, Supplementary Text A, Supplementary Math Note A, Supplementary Math Note B and Supplementary References

## Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

## About this article

### Cite this article

Galgali, A.R., Sahani, M. & Mante, V. Residual dynamics resolves recurrent contributions to neural computation.
*Nat Neurosci* **26**, 326–338 (2023). https://doi.org/10.1038/s41593-022-01230-2

DOI: https://doi.org/10.1038/s41593-022-01230-2