
# Learning by neural reassociation

## Abstract

Behavior is driven by coordinated activity across a population of neurons. Learning requires the brain to change the neural population activity produced to achieve a given behavioral goal. How does population activity reorganize during learning? We studied intracortical population activity in the primary motor cortex of rhesus macaques during short-term learning in a brain–computer interface (BCI) task. In a BCI, the mapping between neural activity and behavior is exactly known, enabling us to rigorously define hypotheses about neural reorganization during learning. We found that changes in population activity followed a suboptimal neural strategy of reassociation: animals relied on a fixed repertoire of activity patterns and associated those patterns with different movements after learning. These results indicate that the activity patterns that a neural population can generate are even more constrained than previously thought and might explain why it is often difficult to quickly learn to a high level of proficiency.

## Main

Studies of the neurophysiological changes during learning have largely focused on individual-neuron tuning properties1,2,3,4,5,6,7,8,9,10 and correlations between the activities of pairs of simultaneously recorded neurons11,12. However, neurons operate within large networks, and to fully understand learning, we may need to understand how neural activity reorganizes at the population level. Recent studies have discovered tantalizing evidence of population-level mechanisms by considering the joint activity across many neurons13,14,15,16,17,18,19,20. Population-level studies of learning have only recently begun to emerge21,22,23,24, and our understanding of how neural population activity reorganizes during learning is far from complete.

A major challenge to understanding the neural basis of learning is that, in many experiments, it can be difficult to determine the behavioral relevance of observed changes in neural activity. Interpreting the behavioral implications of such changes requires knowledge of the causal mapping from neural activity to behavior, which is not precisely known in most behavioral paradigms. BCIs have emerged as a powerful experimental tool25 because the experimenter explicitly defines this causal mapping and can readily manipulate the mapping to induce learning7,8,9,10,22,23,24,26,27,28,29,30. Exact knowledge of the mapping enables the experimenter to interpret the behavioral relevance of observed changes in neural activity and to characterize the set of activity patterns that would achieve any particular behavioral goal.

Using a BCI, we recently found that animals can readily learn to generate certain population activity patterns22. Consider a population activity space in which each axis represents the activity of one neuron and a point represents the simultaneous activity across all recorded neurons at a given time (termed a ‘population activity pattern’). We and others have observed that population activity patterns do not occupy this space uniformly13,14,16,18,31. Rather, activity patterns tend to reside within a low-dimensional subspace31, which we refer to as the ‘intrinsic manifold’. By changing the BCI mapping mid-experiment, we found that animals could more readily learn to produce population activity patterns within the intrinsic manifold than outside it22. Precisely how the activity patterns reorganize within the intrinsic manifold is not yet understood and is the primary focus of this study.

There are many ways population activity could reorganize within the intrinsic manifold to drive behavioral improvements during learning, and observations of behavior alone are not sufficient to deduce the neural strategies guiding these changes. To begin, we consider three possible neural strategies of learning, which make differential, testable predictions about how population activity patterns might change during learning to improve behavior. The optimal strategy is for activity patterns to realign with the BCI mapping in a manner that maximizes behavioral performance. Perhaps surprisingly, the data are inconsistent with this hypothesis. Alternatively, in analogy to visuomotor gain adaptation32,33, neural variability might rescale along each dimension to restore the influence that each dimension of population activity had on movements before the perturbation. The data are also inconsistent with this hypothesis. Rather, we found that the overall repertoire of population activity patterns is preserved during learning. Specifically, the activity patterns produced after learning when intending a particular movement are remarkably similar to patterns produced before learning when intending a potentially different movement. These findings suggest that neural populations are constrained to generate activity patterns from a fixed repertoire within the intrinsic manifold, which may ultimately dictate the amount of behavioral improvement possible during learning.

## Results

We recorded neural population activity from the primary motor cortex (M1) in three rhesus macaques (monkeys J, L and N) while they performed a BCI learning task (Fig. 1a). We detailed the experiment and behavioral findings for two of the animals (monkeys J and L) in a previous report22. Briefly, animals modulated their neural activity to drive cursor movements to visual targets in a 2D center-out task. We applied factor analysis23,34,35,36,37 to the recorded spike counts to identify the intrinsic manifold and summarize the neural population activity at each moment in terms of a set of 10D factors, $\mathbf{z}_t$. The causal relationship between neural activity at time $t$ and 2D cursor velocity, $\mathbf{v}_t$, was defined by the BCI mapping

$$\mathbf{v}_{t}=\mathbf{A}\mathbf{v}_{t-1}+\mathbf{B}\mathbf{z}_{t}+\mathbf{c} \tag{1}$$

where $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{c}$ are the parameters of the BCI mapping. In this work, behavior is defined by BCI cursor movements. We exclusively studied the factors $\mathbf{z}_t$ because they capture the largest shared cofluctuations across the neural population, and because only aspects of the spike counts that are reflected in the factors can directly affect behavior (as a result of equation (1)). Henceforth we refer to these factors as ‘population activity patterns’.
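
As a concrete illustration of equation (1), the following minimal Python sketch applies the mapping one time step at a time. The parameter values are hypothetical toy numbers (three factors rather than ten), not the decoder parameters used in the experiments, and plain lists stand in for a numerical library.

```python
def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def bci_velocity(A, B, c, v_prev, z):
    """One step of equation (1): v_t = A v_{t-1} + B z_t + c."""
    Av = matvec(A, v_prev)
    Bz = matvec(B, z)
    return [a + b + ci for a, b, ci in zip(Av, Bz, c)]

# Hypothetical toy mapping: 2D velocity, 3 factors (the study used 10).
A = [[0.5, 0.0], [0.0, 0.5]]               # smoothing of the previous velocity
B = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]    # each column is one factor's "push"
c = [0.0, 0.0]

v = [0.0, 0.0]
for z in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.5]):
    v = bci_velocity(A, B, c, v, z)
    print(v)
```

Because the experimenter chooses A, B and c, the velocity that any candidate activity pattern would produce can be computed exactly, which is what makes the hypotheses below testable.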

At the beginning of each experiment, the animal proficiently controlled the cursor using an ‘intuitive BCI mapping’ (Fig. 1b, black line), which was designed to be consistent with the intrinsic manifold (Fig. 1b; black line lies within yellow plane). To induce learning, we then switched to a ‘perturbed BCI mapping’ (Fig. 1b, red line), which abruptly decreased the animal’s behavioral performance. Performance recovered over several hundred trials as the animal learned (Fig. 1c and Supplementary Fig. 1). In this work, we focus exclusively on within-manifold perturbations (Fig. 1b; red line lies within yellow plane), which altered the relationship between the factors $\mathbf{z}_t$ and cursor velocity $\mathbf{v}_t$ through changes to $\mathbf{B}$ in equation (1) (Supplementary Fig. 2). We have shown that these perturbations can be consistently learned within a single experimental session lasting 1–2 h (ref. 22). Here, we seek to understand the learning-related changes in neural population activity that underlie this behavioral improvement.

### Neural strategies of learning

Using the intuitive BCI mapping, the animal generated population activity patterns that produced the ‘intended movement’ (Fig. 2a), which we define to be straight from the current cursor position to the target20. However, a given activity pattern typically produces different movements through the intuitive (Fig. 2a) and perturbed (Fig. 2b) mappings. Because behavior improved under the perturbed mapping (Fig. 1c and Supplementary Fig. 1), there must have been changes to the set of activity patterns produced for each intended movement (termed the ‘movement-specific cloud’ of activity). There are many ways these activity patterns could have reorganized to improve behavior. We begin by considering three specific neural strategies of learning, which predict qualitatively different changes to movement-specific clouds of activity, along with the accompanying changes (or lack thereof) to the set of activity patterns taken across all intended movements (termed the ‘overall neural repertoire’). Importantly, none of these hypotheses predict novel activity patterns outside of the intrinsic manifold, as we found that, on the timescale of these experiments, the intrinsic manifold remains stable (Supplementary Fig. 3) and animals do not readily learn to produce outside-manifold activity patterns22.

#### Hypothesis 1: learning by ‘realignment’

The behaviorally optimal neural strategy is to realign the overall neural repertoire relative to the perturbed BCI mapping in the manner that maximizes behavioral performance (Fig. 2c). The key neural signature of realignment is the emergence of novel activity patterns that produce high-speed movements through the perturbed BCI mapping (for example, activity patterns beyond the outer dotted lines in Fig. 2c). These novel activity patterns represent a targeted expansion of the overall neural repertoire along the dimensions spanned by the perturbed BCI mapping.

#### Hypothesis 2: learning by ‘rescaling’

A major effect of the perturbations is a change in how strongly each factor (i.e., each element of $\mathbf{z}_t$ from equation (1)) influences movement velocity. Humans32 and monkeys33 can learn to rescale the extent of arm movements when experiencing a change in the influence that their movements have on visual feedback of those movements. In analogy to this behavioral phenomenon of rescaling movements along dimensions in kinematics space, we tested for rescaling along dimensions in population activity space. Perhaps the animal learns to rescale the variance of population activity along each neural dimension to compensate for the change in that dimension’s influence on movement due to the perturbation (Fig. 2d). Under rescaling, the animal would learn to ‘push harder’ along neural dimensions whose influence was attenuated by the perturbation and to ‘push softer’ along dimensions whose influence was amplified by the perturbation.

#### Hypothesis 3: learning by ‘reassociation’

Perhaps the neural population can only generate certain patterns within the intrinsic manifold (for example, due to underlying network constraints) such that the overall neural repertoire does not change with learning. Under reassociation, the animal flexibly reassociates existing activity patterns with different intended movements to improve behavior (Fig. 2e). This strategy limits movements to those that can be generated by a fixed neural repertoire, and as a result some high-speed movements that were possible through the intuitive mapping (for example, those corresponding to activity patterns beyond the outer dotted lines in Fig. 2a) might not be possible through the perturbed mapping (for example, there are no activity patterns beyond the outer dotted lines in Fig. 2e). In this sense, reassociation is behaviorally suboptimal.

The key distinction between these strategies is that realignment and rescaling predict a change to the overall neural repertoire, whereas reassociation predicts that the overall neural repertoire is preserved throughout learning. As such, under realignment and rescaling, we would expect to see novel activity patterns within the intrinsic manifold after learning. By contrast, under reassociation, we would expect that each pattern produced after learning is similar to some pattern produced before learning.

To ground these hypotheses quantitatively, we predicted the movement-specific clouds of population activity patterns that would result from learning according to each strategy. Importantly, we ensured that all predicted activity patterns respect the intrinsic manifold, physiological limitations on the firing rates of individual neural units, and realistic levels of neural variability. We formulated these predictions using convex optimization problems38 whose solutions provided the population activity patterns that would produce the maximum behavioral performance attainable subject to particular constraints (see Methods). The constraints on realignment were only those mentioned above. Rescaling was further constrained to rely on a rescaled neural repertoire, and reassociation was constrained to rely only on the before-learning neural repertoire. Based on these concrete predictions, we then asked how well each hypothesis explained the empirically observed changes in population activity and behavior.

### Population-level signatures of learning strategy

To build intuition about the population-level changes in neural activity during learning, we visualized the overall neural repertoire (i.e., across all movements) during the last 50 trials under the intuitive BCI mapping (referred to as ‘before learning’) and during the 50 trials once peak performance had been achieved under the perturbed BCI mapping (referred to as ‘after learning’). The population activity patterns are defined in terms of the 10D factors $\mathbf{z}_t$, but we can only visualize two of those dimensions at a time. We chose to visualize the 2D outputs of the BCI mappings so that each activity pattern can be readily interpreted relative to task goals. After building qualitative intuitions using 2D visualizations, we will quantify effects in the full 10D population activity space.

We found that the after-learning overall neural repertoire showed a nearly complete visual overlap with the before-learning repertoire, whether activity patterns are viewed through the perturbed BCI mapping (Fig. 3, center panel) or through the intuitive BCI mapping (Supplementary Fig. 4, center panel). This visual similarity is consistent with repertoire preservation, the key hallmark of learning by reassociation. The after-learning repertoire predicted by reassociation shows a high degree of visual overlap with the empirical before-learning repertoire, whereas realignment and rescaling predict systematic repertoire changes (Supplementary Fig. 5).

Separating the overall neural repertoire into its movement-specific clouds revealed changes indicative of behavioral improvements (Fig. 3, outer panels). Consistent with the visually minimal changes to the overall neural repertoire (Fig. 3, center panel, and Supplementary Fig. 4, center panel), these movement-specific changes (Fig. 3, outer panels, and Supplementary Fig. 4, outer panels) were predominantly characterized by dropping before-learning activity patterns from a movement-specific cloud (for example, Fig. 3, panel at 45°) and/or incorporating patterns that were contained within the before-learning cloud for other movements (for example, Fig. 3, panel at 225°).

To quantify the degree of similarity between the neural repertoire before and after learning, we devised a metric based on distances between activity patterns in the 10D population activity space (Fig. 4a). First, we computed distances between each after-learning pattern and its nearest neighbors in the before-learning overall repertoire. Then we normalized these distances by the spread of the before-learning repertoire so that repertoire preservation is indicated by values near zero, values above zero imply repertoire shift or expansion, and values below zero imply repertoire contraction (Supplementary Fig. 6). The empirically observed activity did not show substantial repertoire change (Fig. 4b), which is consistent with reassociation and the intuition conveyed by Fig. 3 and Supplementary Fig. 4. Realignment and rescaling predict substantial repertoire change, which was not consistent with the data.
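
The idea behind this metric can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a single nearest neighbor, takes the leave-one-out nearest-neighbor distance within the before-learning repertoire as the measure of "spread", and draws hypothetical 2D Gaussian clouds in place of recorded 10D factors. The normalization actually used in the study is specified in its Methods and may differ.

```python
import math
import random

def nearest_distance(p, cloud, skip=None):
    """Distance from p to its nearest neighbor in cloud, optionally skipping one index."""
    return min(math.dist(p, q) for j, q in enumerate(cloud) if j != skip)

def repertoire_change(before, after):
    """Near 0: preservation; above 0: shift or expansion; below 0: contraction."""
    # Baseline: typical spacing of the before-learning repertoire (leave-one-out).
    baseline = sum(nearest_distance(p, before, skip=i)
                   for i, p in enumerate(before)) / len(before)
    # Observed: how far after-learning patterns sit from the before-learning repertoire.
    observed = sum(nearest_distance(p, before) for p in after) / len(after)
    return (observed - baseline) / baseline

def gaussian_cloud(n, scale, rng, dim=2):
    """Hypothetical stand-in for a set of activity patterns."""
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(n)]

rng = random.Random(0)
before = gaussian_cloud(400, 1.0, rng)
preserved = repertoire_change(before, gaussian_cloud(400, 1.0, rng))   # same repertoire
expanded = repertoire_change(before, gaussian_cloud(400, 3.0, rng))    # wider repertoire
contracted = repertoire_change(before, gaussian_cloud(400, 0.3, rng))  # narrower repertoire
print(preserved, expanded, contracted)
```

Note that the after-learning patterns are treated as fresh samples rather than copies of the before-learning patterns, which is why a preserved repertoire yields a value near zero rather than a strongly negative one.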

To corroborate this reassociation-like finding of repertoire preservation and to further contrast with the predictions of realignment and rescaling, we analyzed the shared variability in the overall neural repertoire (Figs. 5 and 6). First, we looked for changes in population covariability along the dimensions of the BCI mappings, which measures the extent to which changes in population activity would be reflected as changes in cursor velocities. Visually, the covariability of the activity along the dimensions of the perturbed BCI mapping corresponds to the spread of the activity patterns as shown in Fig. 3, and the covariability along the dimensions of the intuitive BCI mapping corresponds to the spread of patterns in Supplementary Fig. 4.

The data did not show substantial changes in the amount of covariance projected along the intuitive (Fig. 5a) or perturbed (Fig. 5b) BCI mappings, which is again consistent with learning by reassociation (Fig. 2e and Fig. 5c, blue). By contrast, realignment and rescaling, which both predict repertoire change (Fig. 4b, red and yellow), make differential predictions about the structure of those changes. Realignment predicts repertoire expansion due to the addition of novel activity patterns that have large outputs through the perturbed BCI mapping relative to patterns produced before learning (Fig. 2c). This expansion is detected as an increase in covariability along the dimensions of the perturbed mapping (Fig. 5c, red). Rescaling predicts repertoire expansion due to the addition of novel activity patterns that have large outputs through the intuitive BCI mapping (Fig. 2d). This is seen as an increase in covariability along the dimensions of the intuitive mapping (Fig. 5c, yellow). The data (Fig. 5c, black) were not consistent with these predictions of realignment or rescaling.
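
One simple way to express "covariance projected along a BCI mapping" is to pass each activity pattern through the mapping's output matrix and sum the variances of the resulting 2D outputs. The Python sketch below does this for hypothetical two-factor toy patterns; the function names, the toy matrix and the use of total output variance as the summary are assumptions made for illustration, not the study's exact procedure.

```python
def project(B, z):
    """Output of activity pattern z through the columns of a mapping matrix B."""
    return [sum(b * zi for b, zi in zip(row, z)) for row in B]

def projected_variance(B, patterns):
    """Total (sample) variance of the 2D outputs of the patterns through B."""
    outputs = [project(B, z) for z in patterns]
    n = len(outputs)
    total = 0.0
    for k in range(len(B)):
        values = [o[k] for o in outputs]
        mean = sum(values) / n
        total += sum((x - mean) ** 2 for x in values) / (n - 1)
    return total

# Hypothetical toy mapping and repertoires (2 factors instead of 10).
B_perturbed = [[2.0, 0.0], [0.0, 1.0]]
before = [[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0]]
after_preserved = list(before)                                        # reassociation-like
after_expanded = [[2.0, 0.0], [-2.0, 0.0], [0.0, 2.0], [0.0, -2.0]]   # realignment-like

base = projected_variance(B_perturbed, before)
ratio_preserved = projected_variance(B_perturbed, after_preserved) / base
ratio_expanded = projected_variance(B_perturbed, after_expanded) / base
print(ratio_preserved, ratio_expanded)
```

In this toy case a preserved repertoire leaves the ratio at 1, whereas adding patterns with larger outputs through the perturbed mapping raises it, which is the qualitative signature attributed to realignment above.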

Next, we searched for changes in shared variability across the ten dimensions of the population activity space that might be related to the particular perturbation. Recall that each factor (i.e., each element in $\mathbf{z}_t$ from equation (1)) represents population activity fluctuations along a particular dimension of the population activity space. Each perturbation effectively changes both the direction that each factor’s activity pushes the cursor (the direction represented by each 2 × 1 column of $\mathbf{B}$ in equation (1); Fig. 6a,b) and its ‘pushing magnitude’ (the norm of that column of $\mathbf{B}$; Fig. 6c–e). The learning strategies we have presented make differential predictions about how each factor’s variance should change in response to the change in that factor’s pushing magnitude due to the perturbation (Fig. 6f,g).
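
The two column-wise quantities defined here are straightforward to compute. The following Python sketch uses hypothetical 2 × 2 toy matrices (two factors rather than ten) to show the pushing direction and pushing magnitude of each factor, and the change in magnitude induced by a perturbation.

```python
import math

def column(B, k):
    """The 2 x 1 column of B associated with factor k."""
    return [row[k] for row in B]

def pushing_magnitude(B, k):
    """Norm of factor k's column of B."""
    return math.hypot(*column(B, k))

def pushing_direction_deg(B, k):
    """Direction (degrees) in which factor k pushes the cursor."""
    x, y = column(B, k)
    return math.degrees(math.atan2(y, x))

# Hypothetical toy mappings, not taken from the experiments.
B_intuitive = [[3.0, 0.0],
               [4.0, 1.0]]
B_perturbed = [[0.0, 6.0],
               [1.0, 8.0]]

for k in range(2):
    change = pushing_magnitude(B_perturbed, k) - pushing_magnitude(B_intuitive, k)
    print(k, pushing_direction_deg(B_intuitive, k),
          pushing_direction_deg(B_perturbed, k), change)
```

In this toy example the perturbation attenuates the first factor's influence and amplifies the second's, the two situations that the hypotheses treat differently.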

Realignment predicts an increasing trend between changes in pushing magnitude and changes in factor variance (Fig. 6f,g, red). Under realignment, the movement-specific clouds of activity migrate into and spread along the dimensions spanned by the perturbed mapping (Fig. 2c). As a result, variability should increase for factors that contribute more to movement under the perturbed mapping than they did under the intuitive mapping. Rescaling predicts the opposite trend (Fig. 6f,g, yellow). If a perturbation increases (or decreases) the contribution of a particular factor toward movement, variance should decrease (or increase) for that factor to restore the influence that factor had on movement before the perturbation (Fig. 2d).

These predictions of realignment and rescaling contrast with those of reassociation. Because reassociation predicts that the same overall repertoire of activity patterns is used before and after learning, reassociation predicts that the variance for each factor should not change, regardless of how each factor’s pushing magnitude has changed (Fig. 6f,g, blue). The data did not show a trend between changes in pushing magnitude and changes in factor variance (Fig. 6f,g, gray), which closely matches the predictions of reassociation.
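
The three predicted trends can be summarized by the sign of a regression slope relating the change in each factor's pushing magnitude to the change in that factor's variance. The sketch below fits an ordinary least-squares slope to hypothetical toy values chosen only to illustrate the three qualitative patterns; the numbers are not data from the study, and the study's own trend analysis may be computed differently.

```python
def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

# Hypothetical per-factor changes in pushing magnitude (perturbed minus intuitive).
delta_push = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Illustrative (made-up) changes in factor variance under each hypothesis.
delta_var = {
    "realignment": [-1.0, -0.5, 0.0, 0.5, 1.0],    # increasing trend
    "rescaling": [1.0, 0.5, 0.0, -0.5, -1.0],      # decreasing trend
    "reassociation": [0.0, 0.0, 0.0, 0.0, 0.0],    # no trend
}

slopes = {name: least_squares_slope(delta_push, dv) for name, dv in delta_var.items()}
print(slopes)
```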

### Behavioral consequences of learning strategy

As expected, behavioral performance dropped abruptly when the BCI mapping was perturbed (Fig. 1c). This performance drop is predicted by the before-learning population activity (Fig. 7). After learning, behavioral performance improved substantially. Notably, after-learning behavioral performance did not completely recover to intuitive levels.

Realignment, rescaling and reassociation all predict behavioral improvements due to learning. However, the extents of these predicted improvements vary (Fig. 7). The behaviorally optimal solution, realignment, predicts substantially more behavioral improvement than shown by the animals. Rescaling predicts slightly more behavioral improvement than shown by the animals. Reassociation predicts behavioral improvement closely matched to that shown by the animals and, in doing so, also predicts the incomplete recovery of behavioral performance demonstrated by the animals. These behavioral results, taken together with the repertoire preservation demonstrated in Fig. 4b, suggest that a fixed neural repertoire represents a fundamental constraint on the amount of behavioral improvement that is possible during short-term learning.

### Variants and mixtures of learning strategies

There is a continuum of neural strategies that could subserve learning, and of these we have thus far only considered three distinct strategies. Now we consider the possibility that learning involves variants or mixtures of the strategies presented thus far. First, we consider an attenuated variant of realignment in which behavioral predictions are matched to empirical after-learning behavioral performance. Second, we consider subselection, a variant of reassociation in which the activity patterns produced for a given movement after learning are a subset of the patterns produced for that same movement before learning. Finally, we consider the possibility that learning involves a combination of reassociation and realignment.

The first variant we explore is ‘partial realignment’. Realignment, as previously defined, predicts substantially better behavioral performance than animals showed empirically after learning (Fig. 7). Might it be that the animals’ population activity did change in a manner akin to realignment, but each movement-specific cloud of activity migrated only partially toward the cloud predicted by complete realignment (Fig. 8a)? To address this possibility, we refined the realignment predictions to match the animals’ empirical levels of behavioral performance after learning. We found that the before-learning movement-specific clouds only needed to migrate about 15% toward the complete-realignment clouds to match these empirical levels of behavioral performance (see Supplementary Math Note).
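
A partial migration of this kind can be pictured as a linear interpolation between a before-learning pattern and its complete-realignment counterpart. The sketch below assumes exactly that; the interpolation rule, the function name and the toy vectors are illustrative assumptions, and the actual construction is given in the Supplementary Math Note.

```python
def partial_realignment(before_pattern, realigned_pattern, fraction=0.15):
    """Move a pattern a given fraction of the way toward its realigned counterpart."""
    return [b + fraction * (r - b) for b, r in zip(before_pattern, realigned_pattern)]

# Hypothetical 3D toy patterns (the study's factors are 10D).
before_pattern = [0.0, 0.0, 0.0]
realigned_pattern = [10.0, -20.0, 4.0]

partial = partial_realignment(before_pattern, realigned_pattern)
print(partial)
```

With `fraction=0.0` the before-learning pattern is returned unchanged and with `fraction=1.0` the complete-realignment pattern is recovered, so the single parameter spans the range between the two.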

Given that the changes in population activity predicted by partial realignment are subtler than those predicted by complete realignment, we might not be able to disambiguate reassociation and partial realignment when considering the overall repertoire of activity patterns across movements, as in Figs. 4–6. However, we can clearly disambiguate these strategies by analyzing changes in the movement-specific clouds of activity. Reassociation predicts that the movement-specific clouds shift substantially more (Fig. 2e) than predicted by partial realignment (Fig. 8a). To quantify these changes, we measured movement-specific repertoire change using the same distance-based metric as in Fig. 4, but applied to the movement-specific clouds rather than to the overall neural repertoire. The data showed movement-specific repertoire change that was consistent with reassociation and was substantially greater than that predicted by partial realignment (Fig. 8c).

The second variant we explore is ‘subselection’39 (Fig. 8b). Subselection predicts that, for a given movement after learning, the animal produces only the activity patterns from that movement’s before-learning cloud that remain appropriate for that movement under the perturbed BCI mapping (filled points in Fig. 8b). Patterns that are no longer appropriate for that movement are no longer produced (open points in Fig. 8b). Subselection is like reassociation in that, across all movements, the animal does not produce novel patterns after learning. However, for a particular movement, reassociation may recruit activity patterns that were associated with other movements before learning, whereas subselection cannot.
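
The difference between the two strategies can be made concrete with a deliberately simplified Python sketch. Here a pattern counts as "appropriate" for a movement if its output through the perturbed mapping points within a tolerance of the intended direction. That criterion, the 45° tolerance, the toy 90° rotation used as the perturbation and all function names are assumptions for illustration only; the study's predictions come from constrained optimization (see Methods), not from this filter.

```python
import math

def output_angle(B, z):
    """Direction (degrees, 0-360) of the velocity that pattern z produces through B."""
    vx, vy = (sum(b * zi for b, zi in zip(row, z)) for row in B)
    return math.degrees(math.atan2(vy, vx)) % 360.0

def angular_error(a, b):
    """Smallest absolute difference between two angles in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def appropriate(z, B, intended_deg, tol_deg):
    return angular_error(output_angle(B, z), intended_deg) <= tol_deg

def subselect(cloud_for_movement, B_pert, intended_deg, tol_deg=45.0):
    """Keep only patterns from this movement's own before-learning cloud."""
    return [z for z in cloud_for_movement if appropriate(z, B_pert, intended_deg, tol_deg)]

def reassociate(clouds, B_pert, intended_deg, tol_deg=45.0):
    """Draw from the whole before-learning repertoire, across all movements."""
    repertoire = [z for cloud in clouds.values() for z in cloud]
    return [z for z in repertoire if appropriate(z, B_pert, intended_deg, tol_deg)]

# Hypothetical before-learning clouds, keyed by intended movement direction (degrees).
clouds = {0.0: [[1.0, 0.0], [1.0, 0.2]],
          90.0: [[0.0, 1.0], [0.2, 1.0]]}
B_pert = [[0.0, -1.0], [1.0, 0.0]]  # toy perturbation: rotates outputs by 90 degrees

print(subselect(clouds[90.0], B_pert, 90.0))
print(reassociate(clouds, B_pert, 90.0))
```

In this toy case subselection finds no usable patterns for the 90° movement, whereas reassociation recruits the patterns that were previously used for the 0° movement, without producing any pattern outside the before-learning repertoire.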

In the example experiment shown in Fig. 3 and Supplementary Fig. 4, the after-learning movement-specific clouds (outer panels, red) contained a substantial number of activity patterns outside the before-learning cloud for the same movement (corresponding gray regions). This finding is inconsistent with subselection. To test for subselection quantitatively, we again looked at movement-specific repertoire change. Subselection predicts a substantial contraction within the movement-specific clouds (Fig. 8c, light blue bars). The data (Fig. 8c, gray bars) were not consistent with this key prediction of subselection, but rather were consistent with the movement-specific repertoire shifts predicted by reassociation (Fig. 8c, dark blue bars). Taken together, these analyses indicate that the animals learned by co-opting existing population activity patterns (Figs. 3–6) to subserve new movement intents after learning (Fig. 8), as predicted by reassociation.

Finally, we explore the possibility that learning engages multiple learning processes simultaneously40,41,42,43. Our analyses have revealed that reassociation explains the population activity (Figs. 3–6) and behavioral improvements (Fig. 7) we observed during learning. This included showing that, consistent with reassociation, the amount of population covariability along the perturbed mapping did not change substantially as a result of learning (Fig. 5b,c). Upon closer inspection, we found that subtle experiment-by-experiment fluctuations in this covariability metric correlated positively with levels of behavioral learning, which is consistent with partial realignment (Supplementary Fig. 8). Although our analyses have already ruled out the possibility that behavioral improvements are primarily due to realignment or partial realignment (Figs. 3–8), this subtle effect suggests that an element of realignment might play a minor role, alongside reassociation, during short-term learning.

### Potential influences on learning strategy

Finally, we asked whether the design of our experiments influenced the neural strategy of learning demonstrated by the animals. One possibility is that accumulated experience controlling intuitive mappings (i.e., across many experiments) might make it progressively more difficult for a neural population to change its neural repertoire, perhaps owing to a consolidation of activity patterns that are most effective at driving the intuitive mapping. If this were the case, we would expect evidence of reassociation to become progressively stronger throughout the course of these experiments, while evidence of another learning strategy (for example, realignment or rescaling) becomes progressively weaker. This was not the case. Rather, the data were consistent with learning by reassociation throughout the entire course of the experiments (Supplementary Fig. 9).

Another possibility is that the within-manifold perturbations might not apply enough pressure to change the neural repertoire. Two pieces of evidence suggest that this is not the case. First, even after learning, animals showed a substantial performance deficit relative to intuitive-level control (Fig. 7 and Supplementary Fig. 7). Thus, there is likely pressure to continue improving behavior beyond the levels of performance we observed after learning, and yet we did not observe changes to the neural repertoire (for example, realignment or rescaling) that would have driven such additional behavioral improvement. Second, when there was more pressure to change the neural repertoire, we did not observe larger changes to the neural repertoire (Supplementary Fig. 10). These two pieces of evidence indicate that the finding that animals largely learned by reassociation is not due to a lack of pressure to show activity patterns outside the neural repertoire.

## Discussion

In this work, we investigated the population-level changes in neural activity that drive behavioral improvements during short-term learning. We found that repertoire preservation was the guiding constraint underlying the reorganization of population activity. After learning, animals produced roughly the same set of activity patterns across all movements as they produced before learning. What had changed was the association between movement intents and activity patterns within the neural repertoire. We found that a neural strategy of reassociation predicts this repertoire preservation and the extent of behavioral learning demonstrated by the animals. These levels of behavioral performance are considerably suboptimal relative to those possible via a strategy of neural realignment, which is not constrained by repertoire preservation. Taken together, these findings indicate that, on the timescale of these experiments (1–2 h), changes in neural activity during learning are even more constrained than previously believed.

In previous work, we found that animals can readily reorganize neural activity within the intrinsic manifold but not outside of it22. However, it remained an open question specifically how neural activity changes within the intrinsic manifold to support the behavioral learning we observed. In this work, we addressed this question by considering a range of hypotheses, all of which operate exclusively within the intrinsic manifold (i.e., they do not predict outside-manifold activity patterns, nor do they predict changes to the intrinsic manifold). Thus, the changes we considered here are fundamentally different from those that might be required for learning a BCI mapping that lies outside of the intrinsic manifold.

Several previous BCI learning studies have addressed the related question of whether behavioral improvements are driven by changes that are independent across neurons or by changes that reflect shared constraints across neurons8,9,10,22,23,27,29,44. For short-term learning (i.e., within 1–2 h), studies have found evidence of such shared constraints in M1 (refs. 8,10,22,27) and the parietal reach region44 and have suggested that independent-neuron learning does not play a dominant role. Informed by these studies, in this work we only considered population-level learning strategies that reflect shared constraints across neurons.

An important contribution beyond these previous studies is that our investigation into these shared constraints was performed at the level of 10D factors ($\mathbf{z}_t$ in equation (1)), which provide a more comprehensive characterization of the population activity than the 1D or 2D kinematics-based quantities previously used to describe those constraints8,10,27,44. In addition to capturing variables that relate directly to task kinematics, the factors we identified can also capture variables that are internal to the animal and do not directly relate to task kinematics or objectives. Together, these factors more fully describe the degrees of freedom in the population activity that the animal can exploit to improve behavior during learning. There are many ways that these factors could reorganize during learning while respecting shared constraints across neurons (for example, the intrinsic manifold), and analyses of behavior alone22 are not sufficient to deduce the neural strategies guiding this reorganization. Here, we rigorously defined a range of hypotheses about how these factors might reorganize during learning and presented an analysis framework that enabled us to disambiguate between these hypotheses. Because these analyses were based on 10D factors, they have the power to identify learning-related changes that might not be apparent in one or two kinematics-based factors.

The hypotheses we considered lie along a continuum describing the flexibility of the neural repertoire, which ranges from realignment (most flexible) to subselection (most constrained). Realignment can flexibly change the neural repertoire to maximize behavioral performance. Subselection constrains the activity patterns for each movement to be a subset of the patterns used for that same movement before learning. Reassociation has an intermediate flexibility because it cannot change the neural repertoire, but it can change how activity patterns within the repertoire are used. The fact that reassociation predicts the data well and lies between the most and least flexible strategies we considered suggests that the breadth of hypotheses we considered was adequate.

Given further exposure to a perturbed BCI mapping, might there be further reorganization of neural activity, with corresponding improvements in behavioral performance? Further behavioral improvements would require neural changes beyond those predicted by reassociation. One such possibility is that the animal learns to decrease neural variability in a manner that improves the ability to precisely generate activity patterns that drive high-performance movements45. Another possibility is that the animal learns to produce novel activity patterns. For the within-manifold learning tasks considered in this work, substantial behavioral improvements would be possible if novel activity patterns could be generated within the intrinsic manifold, such as those activity patterns predicted by realignment. We did see subtle hints of realignment (Supplementary Fig. 8), but this was not the dominant process driving behavioral improvements on the 1–2 h timescale of our experiments (Figs. 3–7). We cannot rule out the possibility that different task demands might accelerate learning of novel activity patterns. However, animals did not show more realignment when there was more behavioral incentive to do so (Supplementary Fig. 10), and BCI mappings outside of the intrinsic manifold are not readily learned on this same 1–2 h timescale22. Thus, reassociation and realignment might operate in parallel but with vastly different timescales. When learning over longer timescales, the cumulative effect of realignment-like changes could become a substantial driver of behavioral improvement. Such a combination of learning processes would allow an initial reduction in errors that is largely due to reassociation (i.e., on a timescale of hours), with further error reduction driven by realignment (i.e., on a timescale of days to weeks).

It is unclear what neural mechanisms underlie the reassociation-like reorganization we found in the population activity. Sensorimotor learning requires changes to the output signals (in our case, M1 activity) generated in response to a given sensory input. Changes to M1 activity could arise from connectivity changes between M1 neurons or from changes to the inputs of M1 for a given sensory input. While we cannot definitively distinguish these two possibilities, our finding of neural repertoire preservation seems more consistent with changes to the inputs of M1, since connectivity changes within M1 would likely lead to changes in the repertoire. The driver of these learning-related changes could be cortical or subcortical46, and more experiments are needed to make these distinctions.

In this work, we took a population-level approach to studying BCI learning in M1 and found that a strategy of learning by neural reassociation predicted key features in the data. The hypotheses and analysis framework presented here in the context of a BCI task can also be used to ask whether similar population-level strategies and constraints govern learning in other contexts, such as arm movements (for example, in M13,6), perceptual learning (for example, in visual cortex11), rule learning (for example, in prefrontal cortex21) or associative learning (for example, in auditory cortex12).

## Methods

### Experimental procedures

Experimental procedures for monkeys J and L are described in detail in ref. 22. Procedures for monkey N were nearly identical. Here we briefly summarize the procedures and highlight any differences in the procedures for monkey N. All animal procedures were approved by the Institutional Animal Care and Use Committee of the University of Pittsburgh.

#### Neural recordings

Three adult male rhesus macaques (Macaca mulatta; age, monkey J: 7 years; monkey L: 8 years; monkey N: 7 years) were each chronically implanted with a 96-channel multielectrode array targeting the proximal arm area of M1. Spikes on a given channel were identified as threshold crossings and were counted in non-overlapping 45-ms bins. We refer to each channel as a ‘neural unit’, and we refer to the set of spike counts recorded simultaneously across all channels during a single 45-ms time bin as a ‘spike count vector’. We recorded from 86.5 ± 1.40 units (mean ± 1 s.d.) across 27 analyzed experiments for monkey J, 88.4 ± 0.88 units across 11 analyzed experiments for monkey L, and 93.5 ± 0.81 units across 10 analyzed experiments for monkey N.
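As a rough illustration, the binning of threshold crossings into spike count vectors could be sketched in Python with NumPy as follows. The function name `spike_count_vectors` and the toy spike times are hypothetical and are not taken from the experimental software.

```python
import numpy as np

BIN_MS = 45  # bin width stated in the text

def spike_count_vectors(spike_times_ms, n_bins):
    """Count threshold crossings per channel in non-overlapping 45-ms bins.

    spike_times_ms: list with one array of spike times (in ms) per channel.
    Returns an array of shape (n_bins, n_channels); row t is the
    spike count vector for time bin t.
    """
    edges = np.arange(n_bins + 1) * BIN_MS
    counts = [np.histogram(times, bins=edges)[0] for times in spike_times_ms]
    return np.stack(counts, axis=1)

# Toy example with two hypothetical channels.
spikes = [np.array([3.0, 10.0, 50.0, 91.0]), np.array([44.9, 45.0, 130.0])]
U = spike_count_vectors(spikes, n_bins=3)
```

Each row of `U` plays the role of one spike count vector, with one entry per neural unit.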

#### Behavioral task

Animals performed an eight-target center-out BCI task. Each trial began with a 300-ms freeze period, during which the cursor (circle, radius 18 mm) remained at the center of the workspace. A peripheral target (circle, radius 20 mm) was displayed at the beginning of this freeze period. Animals then moved the cursor by modulating their neural activity. A water reward was delivered if the target was acquired within 7.5 s following the end of the freeze period, and the next trial was initiated 200 ms after target acquisition. If the target was not acquired within 7.5 s, there was a 1.5 s timeout before the next trial was initiated. Target locations were selected from a set of eight uniformly spaced locations around a circle (radius, monkey J: 150 mm; monkeys L and N: 125 mm). For monkeys J and L, targets were presented in a pseudorandom order to equalize the number of successful trials for each target. For monkey N, targets were presented in a random order independent of target acquisition history. Each animal’s arms were loosely restrained during the BCI task, and animals showed little to no arm movement22.

Each experiment began with 80 calibration trials used to identify the intrinsic manifold and to define the intuitive BCI mapping. The intuitive mapping was then used during a block of intuitive trials (monkey J: 382 ± 66.7 trials; monkey L: 269 ± 52.3 trials; monkey N: 193 ± 3.68 trials). The mapping was then changed to a perturbed BCI mapping for a block of perturbed trials (monkey J: 871 ± 66.3 trials; monkey L: 360 ± 84.9 trials; monkey N: 620 ± 60.0 trials). After the perturbed trials, the intuitive BCI mapping was reinstated for a block of washout trials (not analyzed in this work).

#### Identifying the intrinsic manifold and extracting population activity patterns

We used factor analysis23,34,35,36,37 to identify the intrinsic manifold and to summarize each high-dimensional spike count vector, $${{\bf{u}}}_{t}\in {{\mathscr{R}}}^{q}$$, in terms of a low-dimensional set of factors, $${{\bf{z}}}_{t}\in {{\mathscr{R}}}^{p}$$, where q is the number of simultaneously recorded neural units, p is the number of factors (i.e., the dimensionality of the intrinsic manifold), and p < q. All references to “population activity patterns” refer to these factors $${{\bf{z}}}_{t}$$. A new factor analysis model was fit for each experiment on the basis of the recorded neural activity from the calibration trials. For all analyses, factors were extracted such that dimension 1 (i.e., the first element in $${{\bf{z}}}_{t}$$) explains the most shared covariance across the population, dimension 2 is orthogonal to dimension 1 and explains the next most shared covariance, and so on (see Supplementary Math Note).

For consistency, we used p = 10 across all experiments. We used 10 factors (or dimensions) because that was the average dimensionality identified by factor analysis via cross-validation over the experiments on monkeys J and L, and because when higher dimensionalities were identified, they did not offer substantially better accounts of the data relative to using 10 factors22. We found that animals’ after-learning neural activity remained consistent with these descriptions of the intrinsic manifold (Supplementary Fig. 3).

#### Intuitive BCI mappings

BCI mappings translated the factors $${{\bf{z}}}_{t}$$ into 2D cursor velocities $${{\bf{v}}}_{t}$$ using a Kalman filter47,48. Intuitive BCI mappings took the form

$${{\bf{v}}}_{t}={{\bf{Av}}}_{t-1}+{{\bf{Bz}}}_{t}+{\bf{c}}$$
(2)

where $${\bf{A}}\in {{\mathscr{R}}}^{2\times 2}$$ temporally smooths the velocities, $${\bf{B}}\in {{\mathscr{R}}}^{2\times 10}$$ defines the dimensions within the intrinsic manifold that directly influence cursor movements (termed the ‘control space’), and $${\bf{c}}\in {{\mathscr{R}}}^{2}$$ is a constant offset.

Each experiment began with 80 trials used to calibrate an intuitive BCI mapping. Trials involved either closed-loop BCI cursor control, passive observation of center-out cursor movements, or a combination of the two (see details below). Population activity was recorded during these trials and was paired with estimates of the animal’s intended cursor velocities. We then determined the parameters of the intuitive mapping (A, B and c from equation (2)) on the basis of these paired data (see Supplementary Math Note).

For monkey J, two different calibration procedures were used. In early experiments, calibration consisted of closed-loop center-out trials under the previous day’s intuitive BCI mapping. Intended cursor velocity at each timestep was taken to be in the current cursor-to-target direction with a speed equal to the current cursor speed20,49. For monkey J’s later experiments, we used an observation-based calibration procedure, which did not depend on the previous day’s mapping. This change was made to reduce the likelihood of carry-over effects on the neural population across days. During these calibration trials, we recorded neural activity as the animal passively observed automatic center-out cursor movements straight to the target at a constant speed (0.15 m/s). Here, intended cursor velocity at each timestep was taken to be the observed cursor velocity (0.15 m/s in the center-to-target direction).

For monkeys L and N we used a hybrid of these closed-loop and observation-based approaches. These calibrations began with 16 trials (2 to each target) of the observation-based procedure. For the next 8 trials, the animal controlled cursor movements using a mapping calibrated using the data from the previous 16 trials, but the cursor was restricted to move only along the center-to-target direction (velocity components perpendicular to the center-to-target direction were scaled by a factor of 0). The next 8 trials used a mapping calibrated from the previous 24 trials, and perpendicular velocity components were scaled by a factor of 0.125. We repeated this procedure for a total of 80 trials until the animal was given complete control of the cursor (perpendicular scale factor = 1). All calibrations performed within this procedure defined intended cursor velocities to be in the center-to-target direction with speeds taken from the corresponding cursor movements that were displayed to the animal.
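The hybrid calibration schedule can be sketched as follows. The text gives only the first two perpendicular scale factors (0 and 0.125), so the linear ramp of 0.125 per 8-trial block used here is an assumption chosen because it reaches full control (scale factor 1) immediately after trial 80. The function names are hypothetical.

```python
import math

def perpendicular_scale(trial_index):
    """Scale applied to perpendicular velocity on calibration trial 0..79.

    Returns None for the first 16 observation-only trials. The ramp in
    steps of 0.125 per 8-trial block is an assumption that extrapolates
    the two values (0 and 0.125) given in the text.
    """
    if trial_index < 16:
        return None
    block = (trial_index - 16) // 8
    return 0.125 * block

def restrict_velocity(v, target_dir, scale):
    """Keep the center-to-target component and scale the perpendicular one."""
    norm = math.hypot(*target_dir)
    ux, uy = target_dir[0] / norm, target_dir[1] / norm
    along = v[0] * ux + v[1] * uy
    perp_x, perp_y = v[0] - along * ux, v[1] - along * uy
    return (along * ux + scale * perp_x, along * uy + scale * perp_y)

# Scale factor for each of the 80 calibration trials.
schedule = [perpendicular_scale(i) for i in range(80)]
```

Under this assumed ramp, the last calibration block uses a scale factor of 0.875, and the first trial after calibration would use 1.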

Animals demonstrated proficient cursor control using the intuitive BCI mapping from the very first intuitive trial of each experiment, as evidenced by success rates and acquisition times (Fig. 1). Acquisition times from the last 50 intuitive trials of each experiment are described in Fig. 7 (bars labeled “before learning,” “intuitive mapping”; median acquisition time, monkey J: 885 ms; monkey L: 974 ms; monkey N: 636 ms).

#### Perturbed BCI mappings

Perturbed BCI mappings altered the relationship between recorded population activity patterns and cursor movements. In this work, we studied within-manifold perturbations, which altered the relationship between the factors $${{\bf{z}}}_{t}$$ and the cursor velocity $${{\bf{v}}}_{t}$$. Specifically, we permuted the ordering of the factors (i.e., the elements of $${{\bf{z}}}_{t}$$), which is equivalent to permuting the columns of B in equation (2) while preserving the ordering of the factors. Accordingly, perturbed BCI mappings took the form

$${{\bf{v}}}_{t}={{\bf{Av}}}_{t-1}+{{\bf{B}}}^{{\rm{pert}}}{{\bf{z}}}_{t}+{\bf{c}}$$
(3)

where A and c are unchanged from equation (2) and $${{\bf{B}}}^{{\rm{pert}}}$$ contains the permuted columns of B from equation (2). Geometrically, a within-manifold perturbation corresponds to reorienting the control space within the intrinsic manifold (Fig. 2b). With each experiment, our aim was to select a candidate perturbation that would be difficult enough that substantial learning would be required to restore proficient control, but not so difficult as to deter the animal. Our procedure for selecting such a perturbed BCI mapping is detailed in the Supplementary Math Note.
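The equivalence between permuting the factors and permuting the columns of B can be checked numerically. The sketch below uses randomly generated stand-ins for A, B, c and the permutation. None of these values come from the experiments, and the actual perturbations were chosen by the procedure in the Supplementary Math Note rather than at random.

```python
import numpy as np

def cursor_velocity(v_prev, z_t, A, B, c):
    """One update of equations (2) and (3): v_t = A v_{t-1} + B z_t + c."""
    return A @ v_prev + B @ z_t + c

rng = np.random.default_rng(0)
A = 0.5 * np.eye(2)              # hypothetical smoothing matrix (2 x 2)
B = rng.normal(size=(2, 10))     # hypothetical intuitive mapping (2 x 10)
c = np.zeros(2)                  # hypothetical offset

perm = rng.permutation(10)       # hypothetical permutation of the 10 factors
z_t = rng.normal(size=10)        # made-up 10D factor vector
v_prev = np.zeros(2)

# Option 1: reorder the factors and keep B.
v_from_permuted_factors = cursor_velocity(v_prev, z_t[perm], A, B, c)

# Option 2: keep the factors and move column j of B to position perm[j].
B_pert = np.empty_like(B)
B_pert[:, perm] = B
v_from_permuted_columns = cursor_velocity(v_prev, z_t, A, B_pert, c)
```

Both options give the same cursor velocity, which is why the perturbation can be written entirely in terms of the mapping, as in equation (3).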

Behaviorally, these perturbations had complex effects on cursor movements, which cannot be replicated by pure visuomotor rotations or gains (Supplementary Fig. 2). Before learning, the effects of a typical perturbation can be approximately summarized by a combination of per-target velocity rotations and speed scalings. Because the perturbations were implemented in 10D, these rotations and scalings need not be consistent across movement directions and speeds (as they would be in the case of a pure visuomotor rotation or gain). Perturbations often affected movement speeds more profoundly along one movement direction than along the perpendicular direction. Angular errors (i.e., deviations between movement direction and target direction) were also often larger for some targets than for others. For some perturbations these angular errors had a consistent sign across targets, but this was not always the case.

### Animal training history

Animals were initially trained to perform cursor movements that were tied to arm movements. Once an animal demonstrated understanding of the task goals (for example, move the cursor to the target), we transitioned the animal into the BCI task by loosely restraining the animal’s arms and determining cursor movements from neural activity through a BCI mapping. Prior to the experiments analyzed in this work, animals accrued experience controlling intuitive BCI mappings through this training and through other experiments not analyzed in this work (monkey J: 19.2 months; monkey L: 1.9 months; monkey N: 2.5 months).

The experiments analyzed in this work involved within-manifold perturbations of the BCI mapping. In additional experiments not analyzed in this work, the perturbed BCI mapping was outside the intrinsic manifold. These outside-manifold perturbation experiments were interleaved with the within-manifold perturbation experiments, and the perturbation type was selected pseudorandomly each day. The experiments analyzed spanned several months (monkey J: 4.6 months; monkey L: 6.8 months; monkey N: 4.6 months). In this work, we exclusively analyzed the within-manifold perturbations because animals showed more behavioral learning in those experiments22, and our primary goal in this work was to understand the neural underpinnings of this behavioral learning.

### Selecting experiments and trials for analysis

Because our goal is to characterize changes in neural activity due to learning, we focused on the experiments in which the animals showed the most behavioral learning (i.e., improvements in cursor movements). We included an experiment for analysis if we detected significant improvements in both success rate (P < 0.05, two-sided unpaired Wilcoxon rank-sum test) and acquisition time (P < 0.05, two-sided unpaired t-test) between the first 50 perturbed trials and any subsequent block of 50 perturbed trials. For monkeys J, L and N, 27 of 28, 11 of 14, and 10 of 11 experiments met these criteria, respectively.

In experiments that met these criteria, we analyzed the last 50 successful intuitive trials (‘before learning’) and the successful trials from the 50 consecutive perturbation trials that showed the best behavioral performance (‘after learning’). Here, behavioral performance was measured using a composite statistic that combines normalized success rate and normalized acquisition time (“amount of learning” from ref. 22). Failed trials interspersed within those 50 successful trials were not analyzed because it is difficult to determine whether the animal was actively engaged in the task during failed trials.

### Selecting and grouping activity patterns for analysis

We composed movement-specific clouds of activity for intended movements in each of eight uniformly spaced directions, which correspond to the eight target directions in the center-out task. At timestep t, we defined the intended movement direction to be the nearest of these eight directions to the straight-to-target direction from the current cursor position. We did not introduce a lag between intended movement direction and cursor position because we have previously shown that, during BCI control, animals compensate for natural visuomotor latencies such that M1 reflects the animal’s movement intent relative to the current cursor position (rather than an outdated cursor position)20. To account for visuomotor latencies at the start of each trial, we excluded data from analysis for the first 135 ms (i.e., 3 time steps in the BCI system) following target onset20.
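A minimal sketch of the intended-direction assignment is given below. It assumes the eight directions are multiples of 45° starting at 0°, which the text does not state explicitly, and the function name is hypothetical.

```python
import math

# Eight uniformly spaced directions; the 0-degree starting angle is an assumption.
TARGET_ANGLES_DEG = [0, 45, 90, 135, 180, 225, 270, 315]

def intended_direction(cursor_xy, target_xy):
    """Return the nearest of the eight directions (in degrees) to the
    straight-to-target direction from the current cursor position."""
    dx = target_xy[0] - cursor_xy[0]
    dy = target_xy[1] - cursor_xy[1]
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    return TARGET_ANGLES_DEG[int(round(angle / 45.0)) % 8]
```

Because the direction is recomputed from the current cursor position at every timestep, a single trial can contribute activity patterns to several movement-specific clouds.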

An important goal in this work was to characterize learning-related changes in the overall neural repertoire. To ensure that our findings were not biased by differences in the number of activity patterns in each movement-specific cloud (for example, due to asymmetric cursor kinematics), we matched the number of activity patterns used to define each movement-specific cloud (i.e., before-learning and after-learning clouds for each of the eight movement directions). To achieve this matching for a given experiment, we identified the movement-specific cloud with the fewest activity patterns and subsampled all other movement-specific clouds to match that number of patterns, N. We performed this subsampling by progressively dropping activity patterns that corresponded to the largest within-trial time elapsed since target onset. For each experiment, this procedure produced size-matched, movement-specific clouds of before- and after-learning activity patterns.
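The size-matching step might look like the following sketch, in which the dictionary layout, the cloud labels and the toy data are all hypothetical.

```python
import numpy as np

def size_match(clouds, elapsed):
    """Subsample every cloud to the size of the smallest one.

    clouds:  dict mapping a cloud label to an (n_i, 10) array of patterns.
    elapsed: dict mapping the same labels to (n_i,) arrays holding each
             pattern's within-trial time since target onset.
    Patterns with the largest elapsed times are dropped first.
    """
    n_keep = min(len(z) for z in clouds.values())
    matched = {}
    for label, z in clouds.items():
        order = np.argsort(elapsed[label], kind="stable")  # earliest first
        keep = np.sort(order[:n_keep])                     # restore original order
        matched[label] = z[keep]
    return matched, n_keep

# Toy example with two hypothetical clouds of unequal size.
rng = np.random.default_rng(0)
clouds = {"before_0": rng.normal(size=(5, 10)), "after_0": rng.normal(size=(3, 10))}
elapsed = {"before_0": np.array([90, 45, 180, 135, 225]),
           "after_0": np.array([45, 90, 135])}
matched, N = size_match(clouds, elapsed)
```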

### Predicting population activity after learning

To interpret the empirically observed changes in animals’ population activity during learning, we compared the observed after-learning activity patterns to activity patterns predicted by realignment, partial realignment, rescaling, reassociation and subselection. These predictions shared four important constraints. First, none of the predictions were informed by after-learning neural activity. Predictions were based on the before-learning movement-specific clouds and the perturbed BCI mapping. Partial realignment and subselection were designed to match after-learning behavioral performance and hence were also informed by after-learning cursor velocities and target positions. Second, we ensured that predicted activity patterns from all hypotheses did not correspond to firing rates beyond each unit’s physiological range, as defined by the minimum and maximum spike counts observed for each unit during the before-learning trials. Third, all hypotheses’ predictions were defined within the 10D space defined by the intrinsic manifold, meaning that none of the hypotheses predict activity patterns that are outside the intrinsic manifold. Finally, all hypotheses predict realistic levels of neural variability across activity patterns produced for the same intended movement direction, and these levels of variability were matched to the before-learning data in a hypothesis-specific manner. We did not include variability that was independent to each individual neural unit (for example, Poisson-like variability) because all predictions were based on the factors extracted by factor analysis, which represent variance that is shared across units34,36,50. In post hoc analyses we confirmed that including Poisson-like variability in predicted activity patterns does not violate the physiological plausibility of any of the hypotheses we considered (Supplementary Fig. 11).

Detailed prediction procedures can be found in the Supplementary Math Note. Briefly, predictions for realignment, rescaling and reassociation involved solving convex optimization problems38 to find the movement-specific clouds that maximize behavioral performance subject to the constraints of each strategy. For partial realignment, predicted movement-specific clouds were intermediate between the empirical before-learning clouds and the realignment-predicted clouds. Subselection-predicted movement-specific clouds were fit to subsets of the corresponding empirical before-learning clouds.

### Visualizing population activity patterns

In Fig. 3 and Supplementary Figs. 4 and 5, we visualized population activity patterns. In Fig. 3, each point indicates a 2D single-timestep cursor velocity, $${{\bf{v}}}_{t}^{{\rm{single}}\mbox{-}{\rm{timestep}}}$$, which represents the contribution of a single population activity pattern, $${{\bf{z}}}_{t}$$, to cursor velocity according to the perturbed BCI mapping:

$${{\bf{v}}}_{t}^{{\rm{single}}\mbox{-}{\rm{timestep}}}={{\bf{B}}}^{{\rm{pert}}}{{\bf{z}}}_{t}+{\bf{c}}$$
(4)

where $${{\bf{B}}}^{{\rm{pert}}}$$ and c are from equation (3). Because the after-learning patterns were recorded when the perturbed BCI mapping was in place, each red point represents a cursor velocity that was used in closed loop to move the cursor during a perturbed trial. The before-learning activity patterns were recorded while the intuitive BCI mapping was used for control, and thus each black point represents a cursor velocity that would have resulted from each before-learning activity pattern had the perturbed BCI mapping been in place.

The outlines in the center panel of Fig. 3 were designed to convey the domain spanned by the activity patterns while being robust to outliers. These outlines enclose the central 98% of the before-learning (black) and after-learning (red) activity patterns. To determine the 2% of patterns to exclude from each of these outlines (i.e., the patterns that might be outliers) we successively dropped the outermost points until 2% of all points had been dropped. To determine the order in which points were dropped, we began by computing the convex hull of all of the 2D points, which represents the smallest polygon enclosing all of the 2D points such that the polygon also encloses all possible line segments between any two points within the polygon. Next, we successively dropped the points that lay along the boundary of this convex hull in order from largest to smallest Mahalanobis distance from the centroid of all 2D points. Mahalanobis distances were computed relative to the covariance across all 2D points. If the number of points dropped reached 2% of all points, the procedure terminated. If the points along the boundary of the convex hull were fewer than 2% of all points, all of those boundary points were dropped, a new convex hull was computed over the remaining points, and the dropping procedure repeated until 2% of all points had been dropped. This procedure was performed independently for the before-learning (black) and after-learning (red) activity patterns.
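The outlier-peeling procedure can be sketched as below. The convex hull is computed with Andrew's monotone chain, which is one standard choice and not necessarily the routine used for the figures. The function names and the synthetic points are hypothetical.

```python
import numpy as np

def convex_hull_indices(points):
    """Indices of the convex-hull vertices of 2D points (monotone chain)."""
    order = np.lexsort((points[:, 1], points[:, 0]))  # sort by x, then by y

    def half_hull(indices):
        hull = []
        for i in indices:
            while len(hull) >= 2:
                o, a = points[hull[-2]], points[hull[-1]]
                cross = ((a[0] - o[0]) * (points[i][1] - o[1])
                         - (a[1] - o[1]) * (points[i][0] - o[0]))
                if cross > 0:      # strict left turn: keep the last vertex
                    break
                hull.pop()
            hull.append(i)
        return hull

    return np.unique(half_hull(order) + half_hull(order[::-1]))

def central_points(points, drop_fraction=0.02):
    """Boolean mask of the points kept after peeling off the outermost ones."""
    n = len(points)
    n_drop = int(round(drop_fraction * n))
    # Mahalanobis distance from the centroid, using the covariance of all points.
    centered = points - points.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(points, rowvar=False))
    maha = np.sqrt(np.einsum("ij,jk,ik->i", centered, inv_cov, centered))
    keep = np.ones(n, dtype=bool)
    while n_drop > 0:
        remaining = np.flatnonzero(keep)
        hull = remaining[convex_hull_indices(points[remaining])]
        # Drop hull points from largest to smallest Mahalanobis distance.
        hull = hull[np.argsort(-maha[hull])][:n_drop]
        keep[hull] = False
        n_drop -= len(hull)
    return keep

# Synthetic 2D velocities with one obvious outlier appended as the last point.
rng = np.random.default_rng(1)
pts = np.vstack([rng.normal(size=(499, 2)), [[50.0, 50.0]]])
mask = central_points(pts)
```

The outline would then be drawn around `pts[mask]`, separately for the before-learning and after-learning patterns.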

In Supplementary Fig. 4, we took the same population activity patterns $${{\bf{z}}}_{t}$$ as in Fig. 3 and plotted their outputs through the intuitive BCI mapping (i.e., replaced $${{\bf{B}}}^{{\rm{pert}}}$$ in equation (4) with B from equation (2)). Thus, each black point represents a cursor velocity that was used in closed loop during an intuitive trial, and each red point represents a cursor velocity that would have resulted from an after-learning activity pattern had the intuitive BCI mapping been in place. The outlines and filled regions in Supplementary Fig. 4 were created using the methods described above for Fig. 3. In Supplementary Fig. 5 we compare the population activity patterns visualized in Fig. 3 and Supplementary Fig. 4 with patterns predicted by realignment, rescaling and reassociation.

### Measuring changes to the neural repertoire

Repertoire change in Fig. 4 was assessed by computing, for each after-learning activity pattern $${{\bf{z}}}_{t}$$, a normalized distance $${d}_{t}$$ to the before-learning neural repertoire. Normalization was necessary to interpret distances in the population activity space relative to the empirical variability in the before-learning population activity patterns. If the before- and after-learning patterns come from the same underlying neural repertoire, the after-learning activity patterns should be as close to the before-learning activity patterns as those before-learning activity patterns are to each other. Such repertoire preservation is indicated by normalized distances near zero. Normalized distances greater than zero imply that after-learning activity patterns are (relatively) far from the before-learning activity patterns, which would indicate an expansion or a shift (i.e., translation) of the neural repertoire. Values less than zero imply that the after-learning patterns are closer to some set of the before-learning activity patterns than all of the before-learning patterns are to each other, which would indicate a contraction of the neural repertoire (Supplementary Fig. 6).

Normalized distances were computed as

$${d}_{t}={\rm{\lambda }}\frac{{\rho }_{t}}{\nu }-1$$
(5)

where $${\rho }_{t}$$ is the distance (in 10D population activity space) between activity pattern $${{\bf{z}}}_{t}$$ and its Kth nearest neighbor (KNN) among all before-learning activity patterns across all intended movement directions, ν is the mean KNN distance from each before-learning activity pattern to all other before-learning activity patterns, $${\rm{\lambda }}=(8N-1)/8N$$ is a scale factor to account for the fact that ν and $${\rho }_{t}$$ are assessed relative to different numbers of activity patterns, and N is the number of activity patterns in each of the eight movement-specific clouds. Each distance contributing to ν is assessed relative to $$8N-1$$ before-learning patterns (i.e., not including self distances, which are trivially 0), whereas $${\rho }_{t}$$ is assessed relative to all 8N before-learning patterns. For all distance measurements we used Mahalanobis distance relative to the overall before-learning covariance ($${{\bf{S}}}^{{\rm{before}}}$$, to be defined in equation (6)). This ensures that each distance measurement reflects all dimensions of the population activity patterns, rather than being dominated by the dimensions that contain the most shared variance. We chose to assess K = 5 nearest neighbors, although results were qualitatively similar across a range of values for K (1, 2, 5, 10, 20). In Fig. 4b we show these normalized distances as ‘repertoire change’. Repertoire change from the observed data (gray bars) is compared to that predicted by each neural strategy (colored bars), where predicted repertoire change was obtained by computing the distances $${d}_{t}$$ for each predicted activity pattern (for example, colored points in Fig. 4a) relative to the observed before-learning neural repertoire (for example, black points in Fig. 4a).
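Equation (5) can be illustrated with a brute-force nearest-neighbor search. This is a sketch on synthetic Gaussian data rather than the analysis code, and the function names are hypothetical.

```python
import numpy as np

def knn_distance(queries, reference, inv_cov, k, exclude_self=False):
    """Mahalanobis distance from each query to its k-th nearest reference point."""
    diff = queries[:, None, :] - reference[None, :, :]
    sq = np.einsum("qri,ij,qrj->qr", diff, inv_cov, diff)
    dist = np.sort(np.sqrt(np.maximum(sq, 0.0)), axis=1)
    # When the queries are the reference points, column 0 is the zero self-distance.
    return dist[:, k] if exclude_self else dist[:, k - 1]

def repertoire_change(z_after, z_before, k=5):
    """Normalized distances d_t from equation (5)."""
    n_before = len(z_before)  # 8N in the text
    # bias=True divides by the number of patterns, as in equation (6).
    inv_cov = np.linalg.inv(np.cov(z_before, rowvar=False, bias=True))
    rho = knn_distance(z_after, z_before, inv_cov, k)
    nu = knn_distance(z_before, z_before, inv_cov, k, exclude_self=True).mean()
    lam = (n_before - 1) / n_before
    return lam * rho / nu - 1.0

# Synthetic check: same distribution versus a shifted repertoire.
rng = np.random.default_rng(0)
z_before = rng.normal(size=(400, 10))
d_same = repertoire_change(rng.normal(size=(400, 10)), z_before)
d_shift = repertoire_change(rng.normal(size=(400, 10)) + 5.0, z_before)
```

In this toy setting, `d_same` should average close to zero (repertoire preservation), whereas `d_shift` should be clearly positive (a shifted repertoire).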

We used a similar metric in Fig. 8c to quantify movement-specific repertoire change (i.e., changes to each movement-specific cloud). For intended movement direction Θ, we measured distances from each activity pattern $${{\bf{z}}}_{t}$$ in the after-learning movement-Θ cloud to all patterns in the before-learning movement-Θ cloud. Normalized distances, $${d}_{t}$$, were then computed as in equation (5), but with $${\rho }_{t}$$ being the distance between $${{\bf{z}}}_{t}$$ and its KNN among the N before-learning activity patterns for movement Θ, and ν being the mean KNN distance from each before-learning movement-Θ activity pattern to the $$N-1$$ other patterns in the before-learning movement-Θ cloud. Correspondingly, the scale factor λ was taken to be $$(N-1)/N$$. In Fig. 8c we report the percentage of movement-specific clouds showing repertoire shifts or expansions (indicated by positive movement-specific normalized distances) versus the percentage showing repertoire contraction (indicated by negative distances). Here, we treated the sign of the repertoire change measurement as a Bernoulli random variable. We then linearly mapped the probability of measuring a positive normalized distance onto the scale running from 100% contractions (which indicates that all distances were negative) to 100% shifts/expansions (which indicates that all distances were positive). A value of zero in Fig. 8c indicates that 50% of distances were positive and 50% were negative.

### Measuring changes in population covariability

In Fig. 5 and Supplementary Fig. 8 we quantified the amount of population covariability along the dimensions spanned by the BCI mappings. We summarized before-learning overall covariability by computing the covariance matrix, $${{\bf{S}}}^{{\rm{before}}}$$:

$${{\bf{S}}}^{{\rm{before}}}=\frac{1}{8N}\sum _{t\in {{\mathscr{T}}}^{{\rm{before}}}}({{\bf{z}}}_{t}-{\bar{{\bf{z}}}}^{{\rm{before}}}){({{\bf{z}}}_{t}-{\bar{{\bf{z}}}}^{{\rm{before}}})}^{{\rm{T}}}$$
(6)

where $${{\mathscr{T}}}^{{\rm{before}}}$$ is the set of all analyzed before-learning timesteps and $${\bar{{\bf{z}}}}^{{\rm{before}}}$$ is the empirical overall mean population activity pattern before learning (i.e., across all movements):

$${\bar{{\bf{z}}}}^{{\rm{before}}}=\frac{1}{8N}\sum _{t\in {{\mathscr{T}}}^{{\rm{before}}}}{{\bf{z}}}_{t}$$
(7)

Similarly, we defined the after-learning overall covariance, $${{\bf{S}}}^{{\rm{after}}}$$, using equations (6) and (7), but replacing $${\bar{{\bf{z}}}}^{{\rm{before}}}$$ and $${{\mathscr{T}}}^{{\rm{before}}}$$ with $${\bar{{\bf{z}}}}^{{\rm{after}}}$$ and $${{\mathscr{T}}}^{{\rm{after}}}$$, respectively. The covariance projected along the dimensions spanned by a BCI mapping (for example, equation (1)) with parameter B is $${{\bf{S}}}^{{\rm{proj}}}={{\bf{V}}}^{{\rm{T}}}{\bf{SV}}\in {{\mathscr{R}}}^{2\times 2}$$, where $${\bf{V}}\in {{\mathscr{R}}}^{10\times 2}$$ has orthonormal columns spanning the row space of B and can be obtained from the singular value decomposition $${\bf{B}}={\bf{UD}}{{\bf{V}}}^{{\rm{T}}}$$. We summarized the amount of covariability projected along the BCI mappings as $${\rm{trace}}({{\bf{S}}}^{{\rm{proj}}})$$. In Fig. 5c and Supplementary Fig. 8 we show the percentage change in these amounts of projected covariability after learning relative to before learning, such that positive changes correspond to an expansion of projected covariability during learning. Changes in projected covariability from the observed data (black; from Fig. 5a,b) are compared to those predicted by each neural strategy (colors), where predicted covariances were computed over each strategy’s predicted activity patterns following equations (6) and (7), and changes were assessed relative to the empirical before-learning covariance, $${{\bf{S}}}^{{\rm{before}}}$$.
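The projected covariability and its percentage change can be sketched with NumPy as follows. The mapping `B` and the activity patterns are synthetic stand-ins, and the function names are hypothetical.

```python
import numpy as np

def projected_covariability(S, B):
    """trace(V^T S V), where the columns of V span the row space of B (2 x 10)."""
    _, _, Vt = np.linalg.svd(B, full_matrices=False)  # B = U D V^T
    V = Vt.T                                          # 10 x 2, orthonormal columns
    return np.trace(V.T @ S @ V)

def percent_change(z_before, z_after, B):
    """Percentage change in projected covariability after versus before learning."""
    S_before = np.cov(z_before, rowvar=False, bias=True)  # equation (6)
    S_after = np.cov(z_after, rowvar=False, bias=True)
    before = projected_covariability(S_before, B)
    after = projected_covariability(S_after, B)
    return 100.0 * (after - before) / before

# Synthetic example: a mapping that reads out only the first two factors.
rng = np.random.default_rng(0)
B = np.zeros((2, 10))
B[0, 0] = 1.0
B[1, 1] = 1.0
z_before = rng.normal(size=(1000, 10))
z_after = z_before.copy()
z_after[:, :2] *= 2.0   # double the spread along the two control dimensions
change = percent_change(z_before, z_after, B)
```

Doubling the spread along the control dimensions quadruples the projected variance, so `change` is 300% in this toy case.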

In Fig. 6 we related changes in variability along each of the ten dimensions of the population activity to changes in the pushing magnitudes between the intuitive and perturbed BCI mappings. The pushing magnitude for dimension i in an intuitive BCI mapping is defined by

$${\rm{pushing}}\,{{\rm{magnitude}}}_{i}={\left\Vert {{\bf{b}}}_{i}\right\Vert }_{2}=\sqrt{{{\bf{b}}}_{i}^{{\rm{T}}}{{\bf{b}}}_{i}}$$
(8)

where $${{\bf{b}}}_{i}$$ is the ith column of B from the intuitive BCI mapping (equation (2); Fig. 6c). Each dimension’s pushing magnitude changed when the mapping was changed to the perturbed BCI mapping (replacing $${{\bf{b}}}_{i}$$ in equation (8) with $${{\bf{b}}}_{i}^{{\rm{pert}}}$$, the ith column of $${{\bf{B}}}^{{\rm{pert}}}$$ from equation (3); Fig. 6d). Changes in pushing magnitudes (Fig. 6e and horizontal axis in Fig. 6f) were computed by subtracting the intuitive pushing magnitudes (using B in equation (2)) from the perturbed pushing magnitudes (using $${{\bf{B}}}^{{\rm{pert}}}$$ in equation (3)). The change in covariability (vertical axis in Fig. 6f) along dimension i was obtained by comparing the ith element along the diagonal of the after-learning overall covariance matrix, $${{\bf{S}}}^{{\rm{after}}}$$, to the corresponding element in $${{\bf{S}}}^{{\rm{before}}}$$ (see equation (6)). For each experiment we summarized the relationship between changes in covariability and changes to the BCI mapping by finding the slope of a line fit via linear regression to the scatter of these changes across all ten dimensions (Fig. 6g).
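The per-dimension comparison can be sketched as follows. The text does not specify how the diagonal elements of the two covariance matrices were compared, so the simple difference used here is an assumption, and all matrices are synthetic.

```python
import numpy as np

def pushing_magnitudes(B):
    """Equation (8): Euclidean norm of each column of a 2 x 10 mapping."""
    return np.linalg.norm(B, axis=0)

def covariability_slope(B, B_pert, S_before, S_after):
    """Slope of per-dimension variance change versus pushing-magnitude change.

    The variance change is taken as a simple difference (an assumption).
    """
    d_push = pushing_magnitudes(B_pert) - pushing_magnitudes(B)
    d_var = np.diag(S_after) - np.diag(S_before)
    slope, _intercept = np.polyfit(d_push, d_var, deg=1)
    return slope

# Synthetic example in which variance changes are exactly 2 x the push changes.
rng = np.random.default_rng(0)
B = rng.normal(size=(2, 10))
B_pert = B[:, ::-1]                      # hypothetical column permutation
d_push = pushing_magnitudes(B_pert) - pushing_magnitudes(B)
S_before = np.eye(10)
S_after = np.diag(1.0 + 2.0 * d_push)
slope = covariability_slope(B, B_pert, S_before, S_after)
```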

### Assessing behavioral performance

In Fig. 1c behavioral performance was assessed using success rate and acquisition time. Both metrics were computed in non-overlapping 50-trial windows. In a given window, success rate is the percentage of trials during which the animal successfully acquired the target, and acquisition time is the time elapsed between the end of the freeze period (see “Behavioral task”) and target acquisition, averaged across successful trials only.
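A minimal sketch of the windowed metrics is shown below. Dropping trailing trials that do not fill a complete window is an assumption, as is the function name.

```python
def windowed_performance(successes, acquisition_times_s, window=50):
    """Success rate (%) and mean acquisition time (s) per non-overlapping window.

    successes:           list of booleans, one per trial.
    acquisition_times_s: list of acquisition times in seconds; entries for
                         failed trials are ignored.
    Trailing trials that do not fill a whole window are dropped (an assumption).
    """
    results = []
    for start in range(0, len(successes) - window + 1, window):
        ok = successes[start:start + window]
        times = [t for t, s in zip(acquisition_times_s[start:start + window], ok) if s]
        rate = 100.0 * sum(ok) / window
        mean_time = sum(times) / len(times) if times else float("nan")
        results.append((rate, mean_time))
    return results

# Toy session: 107 trials, so only two complete 50-trial windows are reported.
successes = [True] * 40 + [False] * 10 + [True] * 50 + [True] * 7
times = [1.0] * 50 + [2.0] * 50 + [3.0] * 7
performance = windowed_performance(successes, times)
```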

In Fig. 7 we evaluated the animals’ empirical behavioral performance, as measured by acquisition time, and compared it to that predicted by each neural strategy of learning. For the closed-loop trials (‘before-learning, intuitive mapping’; ‘after-learning, perturbed mapping’), we can directly measure acquisition time. We refer to the empirical average acquisition time from the before-learning and after-learning trials as $${T}_{{\rm{intuitive}}}^{{\rm{before}}}$$ and $${T}_{{\rm{pert}}}^{{\rm{after}}}$$, respectively. To enable fair comparisons between this empirical closed-loop behavior and predicted behavior, which cannot be directly measured in closed-loop (‘before-learning, perturbed mapping’; realignment; rescaling; reassociation), we predicted acquisition times according to

$${\hat{T}}_{{\rm{pert}}}^{{\rm{after}}}={{\rm{\lambda }}}_{{\rm{pert}}}^{{\rm{after}}}\frac{D}{\frac{1}{8}\sum _{i=1}^{8}{P}_{{\rm{pert}}}\left({\bar{{\bf{z}}}}_{{\rm{\Theta }}_{i}},{\rm{\Theta }}_{i}\right)}$$
(9)

which is computed according to the following four steps. First, we identified the contribution to cursor velocity of each predicted population activity pattern $${{\bf{z}}}_{t}$$ using equation (4). Second, for all predicted activity patterns in the movement-Θ cloud, we asked how much movement each pattern would produce in direction Θ under the perturbed BCI mapping. We term this metric ‘cursor progress’, $$P({{\bf{z}}}_{t},{\rm{\Theta }})$$:

$$\begin{array}{*{20}{l}}{P}_{{\rm{pert}}}\left({{\bf{z}}}_{t},{\rm{\Theta}} \right)& =& \left[\begin{array}{c}{\rm{cos}}{\rm{\Theta}} \\ {\rm{sin}}{\rm{\Theta}} \end{array}\right]\cdot \left({{\bf{B}}}^{{\rm{pert}}}{{\bf{z}}}_{t}+{\bf{c}}\right)\\& =& \left[\begin{array}{c}{\rm{cos}}{\rm{\Theta}} \\ {\rm{sin}}{\rm{\Theta}} \end{array}\right]\cdot {{\bf{v}}}_{t}^{{\rm{single}}\mbox{-}{\rm{timestep}}} \end{array}$$
(10)

which is the projection of the single-timestep cursor velocity (equation (4)) onto a unit vector in direction Θ. The average cursor progress across all patterns in the movement-Θ cloud through the perturbed BCI mapping is $${P}_{{\rm{pert}}}({\bar{{\bf{z}}}}_{\rm{\Theta }},{\rm{\Theta }})$$, where $${\bar{{\bf{z}}}}_{\rm{\Theta }}$$ is the vector-mean of all activity patterns in the movement-Θ cloud; because equation (10) is affine in $${{\bf{z}}}_{t}$$, the cursor progress of the mean pattern equals the mean cursor progress across patterns. The denominator in equation (9) is the average cursor progress across the eight movement-specific clouds. Third, we translated these average cursor progress values into predicted acquisition times. Cursor progress is a measure of speed (in units of mm/s), and as such we can compute predicted acquisition time (in units of seconds) as the center-to-target distance, D (that is, the distance along the center-to-target direction that the cursor must traverse to acquire the target; in units of mm), divided by cursor progress (in units of mm/s). Finally, we used the empirical closed-loop measurements of acquisition time, $${T}_{{\rm{pert}}}^{{\rm{after}}}$$, to correct the scale of the predicted acquisition times. Because the single-timestep velocities, $${{\bf{v}}}_{t}^{{\rm{single}}\mbox{-}{\rm{timestep}}}$$, that determine cursor progress (equation (4)) do not include the $${\bf{A}}{{\bf{v}}}_{t-1}$$ term from equation (3), the magnitudes of the $${{\bf{v}}}_{t}^{{\rm{single}}\mbox{-}{\rm{timestep}}}$$ are typically smaller than the magnitudes of the closed-loop velocities, $${{\bf{v}}}_{t}$$, from equation (3). We thus scaled the predicted acquisition times using the scalar multiplier $${{\rm{\lambda }}}_{{\rm{pert}}}^{{\rm{after}}}$$ required to make $${\hat{T}}_{{\rm{pert}}}^{{\rm{after}}}={T}_{{\rm{pert}}}^{{\rm{after}}}$$ when using empirical after-learning activity patterns for the $${\bar{{\bf{z}}}}_{{\rm{\Theta }}_{i}}$$ in equation (9). We used this $${{\rm{\lambda }}}_{{\rm{pert}}}^{{\rm{after}}}$$ to scale the predicted acquisition times for ‘before learning, perturbed mapping,’ realignment, rescaling and reassociation.
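The four steps above can be sketched in a few lines. This is an illustrative sketch rather than the authors' code: the function names are hypothetical, and the toy mapping, the two-dimensional activity patterns, the distance D = 125 mm and the empirical time of 2.5 s are made-up values chosen so that the arithmetic is easy to follow.

```python
import math

def cursor_progress(B_pert, c, z, theta):
    """Equation (10): projection of B_pert z + c onto the unit vector at angle theta (radians)."""
    v = [sum(b * zj for b, zj in zip(row, z)) + ci for row, ci in zip(B_pert, c)]
    return math.cos(theta) * v[0] + math.sin(theta) * v[1]

def predicted_acquisition_time(B_pert, c, mean_patterns, thetas, D, lam=1.0):
    """Equation (9): lam * D / (cursor progress averaged over target directions)."""
    progress = [cursor_progress(B_pert, c, z, th)
                for z, th in zip(mean_patterns, thetas)]
    return lam * D / (sum(progress) / len(progress))

def fit_lambda(T_empirical, B_pert, c, after_patterns, thetas, D):
    """Scale factor that makes the prediction from after-learning patterns equal T_empirical."""
    return T_empirical / predicted_acquisition_time(B_pert, c, after_patterns, thetas, D)

# Hypothetical toy setup: identity mapping, zero offset, eight targets.
B_pert = [[1.0, 0.0], [0.0, 1.0]]
c = [0.0, 0.0]
thetas = [k * math.pi / 4 for k in range(8)]
D = 125.0  # mm, assumed value

# Mean activity pattern per target: 100 mm/s progress after learning,
# 50 mm/s when before-learning patterns pass through the same mapping.
after_patterns = [[100.0 * math.cos(t), 100.0 * math.sin(t)] for t in thetas]
before_patterns = [[50.0 * math.cos(t), 50.0 * math.sin(t)] for t in thetas]

lam = fit_lambda(2.5, B_pert, c, after_patterns, thetas, D)
T_hat_before = predicted_acquisition_time(B_pert, c, before_patterns, thetas, D, lam)
```

With these numbers the unscaled after-learning prediction is 1.25 s, so the fitted multiplier is 2, and halving cursor progress doubles the predicted acquisition time to 5 s.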

In Fig. 7, the ‘after learning, perturbed mapping’ bars indicate the closed-loop empirical acquisition times, $${T}_{{\rm{pert}}}^{{\rm{after}}}$$, which by construction exactly match the times predicted by equation (9) when using empirical after-learning activity patterns for the $${\bar{{\bf{z}}}}_{{\rm{\Theta }}_{i}}$$. ‘Before learning, perturbed mapping’ bars indicate predicted acquisition times from equation (9), using empirical before-learning activity patterns for the $${\bar{{\bf{z}}}}_{{\rm{\Theta }}_{i}}$$. Realignment, rescaling and reassociation bars indicate predicted acquisition times from equation (9), using predicted activity patterns for the $${\bar{{\bf{z}}}}_{{\rm{\Theta }}_{i}}$$. ‘Before learning, intuitive mapping’ bars indicate the empirical closed-loop acquisition times, $${T}_{{\rm{intuitive}}}^{{\rm{before}}}$$, which by construction exactly match the times that would be predicted by an updated equation (9) that reflects the intuitive BCI mapping. Note that theoretically it is possible for cursor progress (equation (10)) to yield negative values. However, in practice the average cursor progress values in the denominator of equation (9) were always substantially greater than zero, and as such all predicted acquisition times were positive and well defined.

### Statistics

To test for statistical significance, we used nonparametric tests (for example, Wilcoxon signed-rank test, sign test), which do not assume normality. We used a parametric test (t-test) in one instance to select experiments with significant behavioral learning. Here, the data distribution was assumed to be normal, but this was not formally tested. No statistical methods were used to predetermine sample sizes, but our sample sizes (48 experiments across 3 monkeys) are similar to those reported in previous publications1,2,3,4,5,6,7,8,9,10,11,17,18,19,20,22,23,24,26,27,30,33,34,35,43,44,48,49. The experiments described in this work were not grouped, and thus no blinding or group randomization procedures were required.
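As an illustration of one of the nonparametric tests named above, the following is a minimal sketch of a two-sided exact sign test on paired differences. The function name and the example data are hypothetical, and the convention of discarding zero differences is an assumption of this sketch, not a detail reported in the text.

```python
from math import comb

def sign_test_p(differences):
    """Two-sided exact sign test p-value for paired differences.

    Under the null hypothesis, positive and negative differences are
    equally likely. Zero differences are dropped (assumed convention).
    """
    pos = sum(d > 0 for d in differences)
    neg = sum(d < 0 for d in differences)
    n = pos + neg
    k = min(pos, neg)
    # Probability of k or fewer signs of the rarer kind under Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical example: nine of ten experiments change in the same direction.
p_value = sign_test_p([1.0] * 9 + [-1.0])
```

Because the test uses only the signs of the differences, it makes no assumption about the shape of their distribution.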

### Life Sciences Reporting Summary

Further information is available in the Life Sciences Reporting Summary.

### Code availability

Matlab code that supports the modeling and analyses of this study is available at https://github.com/mattgolub/bci_learning.

### Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Change history

### 05 July 2018

In the version of this article initially published, equation (10) contained cos Θ instead of sin Θ as the bottom element of the right-hand vector. The error has been corrected in the HTML and PDF versions of the article.

## References

1. Mitz, A. R., Godschalk, M. & Wise, S. P. Learning-dependent neuronal activity in the premotor cortex: activity during the acquisition of conditional motor associations. J. Neurosci. 11, 1855–1872 (1991).
2. Asaad, W. F., Rainer, G. & Miller, E. K. Neural activity in the primate prefrontal cortex during associative learning. Neuron 21, 1399–1407 (1998).
3. Li, C.-S. R., Padoa-Schioppa, C. & Bizzi, E. Neuronal correlates of motor performance and motor learning in the primary motor cortex of monkeys adapting to an external force field. Neuron 30, 593–607 (2001).
4. Paz, R., Boraud, T., Natan, C., Bergman, H. & Vaadia, E. Preparatory activity in motor cortex reflects learning of local visuomotor skills. Nat. Neurosci. 6, 882–890 (2003).
5. Rokni, U., Richardson, A. G., Bizzi, E. & Seung, H. S. Motor learning with unstable neural representations. Neuron 54, 653–666 (2007).
6. Mandelblat-Cerf, Y. et al. The neuronal basis of long-term sensorimotor learning. J. Neurosci. 31, 300–313 (2011).
7. Ganguly, K. & Carmena, J. M. Emergence of a stable cortical map for neuroprosthetic control. PLoS Biol. 7, e1000153 (2009).
8. Chase, S. M., Schwartz, A. B. & Kass, R. E. Latent inputs improve estimates of neural encoding in motor cortex. J. Neurosci. 30, 13873–13882 (2010).
9. Ganguly, K., Dimitrov, D. F., Wallis, J. D. & Carmena, J. M. Reversible large-scale modification of cortical networks during neuroprosthetic control. Nat. Neurosci. 14, 662–667 (2011).
10. Chase, S. M., Kass, R. E. & Schwartz, A. B. Behavioral and neural correlates of visuomotor adaptation observed through a brain-computer interface in primary motor cortex. J. Neurophysiol. 108, 624–644 (2012).
11. Gu, Y. et al. Perceptual learning reduces interneuronal correlations in macaque visual cortex. Neuron 71, 750–761 (2011).
12. Jeanne, J. M., Sharpee, T. O. & Gentner, T. Q. Associative learning enhances population coding by inverting interneuronal correlation patterns. Neuron 78, 352–363 (2013).
13. Mazor, O. & Laurent, G. Transient dynamics versus fixed points in odor representations by locust antennal lobe projection neurons. Neuron 48, 661–673 (2005).
14. Luczak, A., Barthó, P. & Harris, K. D. Spontaneous events outline the realm of possible sensory responses in neocortical populations. Neuron 62, 413–425 (2009).
15. Berkes, P., Orbán, G., Lengyel, M. & Fiser, J. Spontaneous cortical activity reveals hallmarks of an optimal internal model of the environment. Science 331, 83–87 (2011).
16. Churchland, M. M. et al. Neural population dynamics during reaching. Nature 487, 51–56 (2012).
17. Rigotti, M. et al. The importance of mixed selectivity in complex cognitive tasks. Nature 497, 585–590 (2013).
18. Mante, V., Sussillo, D., Shenoy, K. V. & Newsome, W. T. Context-dependent computation by recurrent dynamics in prefrontal cortex. Nature 503, 78–84 (2013).
19. Kaufman, M. T., Churchland, M. M., Ryu, S. I. & Shenoy, K. V. Cortical activity in the null space: permitting preparation without movement. Nat. Neurosci. 17, 440–448 (2014).
20. Golub, M. D., Yu, B. M. & Chase, S. M. Internal models for interpreting neural population activity during sensorimotor control. Elife 4, e10015 (2015).
21. Durstewitz, D., Vittoz, N. M., Floresco, S. B. & Seamans, J. K. Abrupt transitions between prefrontal neural ensemble states accompany behavioral transitions during rule learning. Neuron 66, 438–448 (2010).
22. Sadtler, P. T. et al. Neural constraints on learning. Nature 512, 423–426 (2014).
23. Athalye, V. R., Ganguly, K., Costa, R. M. & Carmena, J. M. Emergence of coordinated neural dynamics underlies neuroprosthetic learning and skillful control. Neuron 93, 955–970 (2017).
24. Vyas, S. et al. Neural population dynamics underlying motor learning transfer. Neuron https://doi.org/10.1016/j.neuron.2018.01.040 (2018).
25. Golub, M. D., Chase, S. M., Batista, A. P. & Yu, B. M. Brain-computer interfaces for dissecting cognitive processes underlying sensorimotor control. Curr. Opin. Neurobiol. 37, 53–58 (2016).
26. Taylor, D. M., Tillery, S. I. H. & Schwartz, A. B. Direct cortical control of 3D neuroprosthetic devices. Science 296, 1829–1832 (2002).
27. Jarosiewicz, B. et al. Functional network reorganization during learning in a brain-computer interface paradigm. Proc. Natl. Acad. Sci. USA 105, 19486–19491 (2008).
28. Koralek, A. C., Jin, X., Long, J. D. II, Costa, R. M. & Carmena, J. M. Corticostriatal plasticity is necessary for learning intentional neuroprosthetic skills. Nature 483, 331–335 (2012).
29. Clancy, K. B., Koralek, A. C., Costa, R. M., Feldman, D. E. & Carmena, J. M. Volitional modulation of optically recorded calcium signals during neuroprosthetic learning. Nat. Neurosci. 17, 807–809 (2014).
30. Armenta Salas, M. & Helms Tillery, S. I. Uniform and non-uniform perturbations in brain-machine interface task elicit similar neural strategies. Front. Syst. Neurosci. 10, 70 (2016).
31. Cunningham, J. P. & Yu, B. M. Dimensionality reduction for large-scale neural recordings. Nat. Neurosci. 17, 1500–1509 (2014).
32. Krakauer, J. W., Pine, Z. M., Ghilardi, M.-F. & Ghez, C. Learning of visuomotor transformations for vectorial planning of reaching trajectories. J. Neurosci. 20, 8916–8924 (2000).
33. Paz, R., Nathan, C., Boraud, T., Bergman, H. & Vaadia, E. Acquisition and generalization of visuomotor transformations by nonhuman primates. Exp. Brain Res. 161, 209–219 (2005).
34. Yu, B. M. et al. Gaussian-process factor analysis for low-dimensional single-trial analysis of neural population activity. J. Neurophysiol. 102, 614–635 (2009).
35. Santhanam, G. et al. Factor-analysis methods for higher-performance neural prostheses. J. Neurophysiol. 102, 1315–1330 (2009).
36. Churchland, M. M. et al. Stimulus onset quenches neural variability: a widespread cortical phenomenon. Nat. Neurosci. 13, 369–378 (2010).
37. Harvey, C. D., Coen, P. & Tank, D. W. Choice-specific sequences in parietal cortex during a virtual-navigation decision task. Nature 484, 62–68 (2012).
38. Boyd, S. & Vandenberghe, L. Convex Optimization (Cambridge University Press, Cambridge, 2004).
39. Charlesworth, J. D., Tumer, E. C., Warren, T. L. & Brainard, M. S. Learning the microstructure of successful behavior. Nat. Neurosci. 14, 373–380 (2011).
40. Smith, M. A., Ghazizadeh, A. & Shadmehr, R. Interacting adaptive processes with different timescales underlie short-term motor learning. PLoS Biol. 4, e179 (2006).
41. Kording, K. P., Tenenbaum, J. B. & Shadmehr, R. The dynamics of memory as a consequence of optimal adaptation to a changing body. Nat. Neurosci. 10, 779–786 (2007).
42. Joiner, W. M. & Smith, M. A. Long-term retention explained by a model of short-term learning in the adaptive control of reaching. J. Neurophysiol. 100, 2948–2955 (2008).
43. Yang, Y. & Lisberger, S. G. Learning on multiple timescales in smooth pursuit eye movements. J. Neurophysiol. 104, 2850–2862 (2010).
44. Hwang, E. J., Bailey, P. M. & Andersen, R. A. Volitional control of neural activity relies on the natural motor repertoire. Curr. Biol. 23, 353–361 (2013).
45. Cohen, R. G. & Sternad, D. Variability in motor learning: relocating, channeling and reducing noise. Exp. Brain Res. 193, 69–83 (2009).
46. Shadmehr, R. & Krakauer, J. W. A computational neuroanatomy for motor control. Exp. Brain Res. 185, 359–381 (2008).
47. Kalman, R. E. A new approach to linear filtering and prediction problems. J. Basic Eng. 82, 35–45 (1960).
48. Wu, W., Gao, Y., Bienenstock, E., Donoghue, J. P. & Black, M. J. Bayesian population decoding of motor cortical activity using a Kalman filter. Neural Comput. 18, 80–118 (2006).
49. Gilja, V. et al. A high-performance neural prosthesis enabled by control algorithm design. Nat. Neurosci. 15, 1752–1757 (2012).
50. Churchland, M. M. & Abbott, L. F. Two layers of neural variability. Nat. Neurosci. 15, 1472–1474 (2012).

## Acknowledgements

This work was supported by NIH R01 HD071686 (A.P.B., B.M.Y. and S.M.C.), NSF NCS BCS1533672 (S.M.C., B.M.Y. and A.P.B.), NSF CAREER award IOS1553252 (S.M.C.), NIH CRCNS R01 NS105318 (B.M.Y. and A.P.B.), Craig H. Neilsen Foundation 280028 (B.M.Y., S.M.C. and A.P.B.), Pennsylvania Department of Health Research Formula Grant SAP 4100077048 under the Commonwealth Universal Research Enhancement program (S.M.C. and B.M.Y.) and Simons Foundation 364994 (B.M.Y.).

## Author information

### Author notes

1. These authors contributed equally: Steven M. Chase, Byron M. Yu.

### Affiliations

1. Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA

   Matthew D. Golub & Byron M. Yu

2. Center for the Neural Basis of Cognition, Carnegie Mellon University, Pittsburgh, PA, USA

   Matthew D. Golub, Emily R. Oby, Kristin M. Quick, Aaron P. Batista, Steven M. Chase & Byron M. Yu

3. Department of Electrical Engineering, Stanford University, Stanford, CA, USA

   Matthew D. Golub & Stephen I. Ryu

4. Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA, USA

   Emily R. Oby, Kristin M. Quick, Elizabeth C. Tyler-Kabara & Aaron P. Batista

5. Systems Neuroscience Institute, University of Pittsburgh, Pittsburgh, PA, USA

   Emily R. Oby, Kristin M. Quick & Aaron P. Batista

6. Department of Neurosurgery, Palo Alto Medical Foundation, Palo Alto, CA, USA

   Stephen I. Ryu

7. Department of Physical Medicine and Rehabilitation, University of Pittsburgh, Pittsburgh, PA, USA

   Elizabeth C. Tyler-Kabara

8. Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA, USA

   Elizabeth C. Tyler-Kabara

9. Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA, USA

   Steven M. Chase & Byron M. Yu

### Contributions

M.D.G., B.M.Y., S.M.C. and A.P.B. designed the analyses and discussed the results. M.D.G. performed all analyses and wrote the paper. P.T.S., K.M.Q., M.D.G., S.M.C., B.M.Y. and A.P.B. designed the animal experiments. P.T.S. and E.R.O. performed the animal experiments. S.I.R., E.C.T.-K. and E.R.O. performed the animal surgeries. All authors commented on the manuscript. B.M.Y. and S.M.C. contributed equally to this work.

### Competing interests

The authors declare no competing interests.

### Corresponding authors

Correspondence to Steven M. Chase or Byron M. Yu.

## Supplementary information

### Supplementary Text and Figures

Supplementary Figures 1–11 and Supplementary Math Note

### DOI

https://doi.org/10.1038/s41593-018-0095-3


Nature Neuroscience (2018)
