Sensorimotor transformation via sparse coding

Sensorimotor transformation is indispensable to the accurate motion of the human body in daily life. For instance, when we grasp an object, the distance from our hands to the object needs to be calculated by integrating multisensory inputs, and our motor system needs to appropriately activate the arm and hand muscles to minimize the distance. The sensorimotor transformation is implemented in our neural systems, and recent advances in measurement techniques have revealed an important property of neural systems: a small percentage of neurons exhibits extensive activity while a large percentage shows little activity, i.e., sparse coding. However, we do not yet know the functional role of sparse coding in sensorimotor transformation. In this paper, I show that sparse coding enables complete and robust learning in sensorimotor transformation. In general, if a neural network is trained to maximize the performance on training data, the network shows poor performance on test data. Nevertheless, sparse coding makes the performance of the network on training data and on test data compatible. Furthermore, sparse coding can reproduce reported neural activities. Thus, I conclude that sparse coding is a necessary and biologically plausible factor in sensorimotor transformation.

model 18,19, which can control the sparseness of neural activities using a single parameter. Computer simulations were conducted of visually guided wrist movements in various postures 8,20,21. The inputs in this task consisted of the visual targets and posture information, whereas the outputs were the motor commands used to activate nonlinear muscle units. This means that the task required multisensory integration as well as the computation of complicated motor commands, i.e., sensorimotor transformation. Under constant metabolic cost regardless of the sparsity of neural activities, I show the following results: 1) sensorimotor transformation cannot be learned when a large portion of neurons exhibit extensive activity (dense coding) but can be learned in sparse coding conditions; 2) there is an optimal sparseness required to attain sensorimotor transformation; 3) the learning performance on training data as well as on test data (generalization performance) is better in sparse coding than in dense coding; and 4) neural activity in sparse coding conditions is similar to previously reported neural activities 21 or the multiplicative interaction form.

Results
Following previous studies 8,20,21, this study focuses on visually guided wrist movements in various postures (see Figure 1a for a schematic diagram of the following computational model and the Methods section for the summarized procedures of the following computer simulations). In the assumed task, subjects were required to move a cursor toward a target $v_t = (\cos H_{v,k(t)}, \sin H_{v,k(t)})$ on a computer screen at the t-th trial, where $H_{v,k(t)} = -\pi + 2\pi \frac{k(t)}{K}$. The index k(t) was randomly sampled from 1,..., K in each trial and K is the number of targets (I assumed $K = 8$). The cursor movements were determined by subjects' wrist movements (e.g., the cursor moved up if a subject moved his/her wrist upward). Subjects thus needed to appropriately move their wrists to carry out the task with wrist posture $p_t = (\cos H_{p,k'(t)}, \sin H_{p,k'(t)})$, where I assumed $(\cos H_{p,k'}, \sin H_{p,k'})$ to be $(1,3)/\sqrt{1^2+3^2}$ in pronation ($k' = 1$), $(2,2)/\sqrt{2^2+2^2}$ in midrange ($k' = 2$), and $(3,1)/\sqrt{3^2+1^2}$ in supination ($k' = 3$). Notably, the results were independent of these values. The target position $v_t$ was treated as visual information and the posture $p_t$ was taken to be proprioceptive information.
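The input patterns described above can be enumerated in a few lines. The following is a minimal sketch rather than the simulation code used in the paper; the function names `target` and `posture` and the list `patterns` are hypothetical, and only K = 8, the target angle -pi + 2*pi*k/K, and the three posture vectors come from the text.

```python
import math

K = 8  # number of visual targets (value assumed in the text)

def target(k, K=K):
    """Unit vector of the k-th target (k = 1..K) at angle -pi + 2*pi*k/K."""
    angle = -math.pi + 2.0 * math.pi * k / K
    return (math.cos(angle), math.sin(angle))

def posture(k_prime):
    """Normalized posture vector: 1 = pronation, 2 = midrange, 3 = supination."""
    raw = {1: (1.0, 3.0), 2: (2.0, 2.0), 3: (3.0, 1.0)}[k_prime]
    norm = math.hypot(raw[0], raw[1])
    return (raw[0] / norm, raw[1] / norm)

# Every combination of target and posture, concatenated into a 4-dimensional
# input (visual part first, proprioceptive part second): 8 x 3 = 24 patterns.
patterns = [target(k) + posture(kp) for k in range(1, K + 1) for kp in (1, 2, 3)]
```

Each entry of `patterns` corresponds to one (target, posture) pair that a single trial can present.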
The task was thus to determine an executed movement $x = P_{k'(t)} M$ that minimizes movement error $e = v - x$, where $P_{k'(t)}$ is a posture-dependent transformation matrix from muscle activity M to executed movement x. Muscle activity was determined nonlinearly by neural activities A: $M = [WA]_+$, where W represents adaptable connectivities between neurons and muscles, and $[\cdot]_+$ is a linear threshold function ($[y]_+ = y$ when $y > 0$ and $[y]_+ = 0$ otherwise). The product of $W_t$ and $A_t$ was a motor command in this case because it determined muscle activity. When movement error $e = v - x$ was observed, W was modified to minimize the squared movement error $E = \frac{1}{2} e^T e$ (detailed descriptions regarding the learning rule and each parameter are provided in the Methods section).
To investigate the functional roles of sparse coding, a linear threshold function of the visual and proprioceptive inputs was assumed to represent neural activity, following previous models of visual or parietal cortices 18,19 (Equation (1)), where Z is a normalization constant and $J_v$ and $J_p$ are fixed random matrices. The threshold $h = (h_1, \ldots, h_N)$ was a crucial factor because it controls the sparseness of neural activity: when a small value of $h_i$ was chosen, a large portion of neurons showed activity (dense coding), but only a small portion of neurons exhibited activity (sparse coding) when $h_i$ was large. Although a homogeneous threshold value, $h_i = h$, was mainly assumed, heterogeneous threshold values did not affect any results (Figures 2d and e, see below). The above function was used because it allows the sparsity of neural activities to be controlled using only the parameter h (Figure 1b; sparseness in this figure was defined as $N_{NA}/N$, where $N_{NA}$ is the number of neurons whose activities are always 0). The normalization constant Z was determined to satisfy $\sum_{i=1}^{N} (A_i)^2 = 1$, i.e., the metabolic cost was fixed across all values of h. Notably, some previous studies have suggested that a functional role of sparse coding is to reduce metabolic cost 17. The normalization constant thus permitted the investigation of functional roles of sparse coding other than reducing metabolic cost.
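How the threshold trades dense for sparse activity under a fixed metabolic cost can be illustrated numerically. The sketch below is written under a stated assumption: the activity is taken to have the form A = [J_v v + J_p p - h]_+ / Z, which matches the verbal description here but is not copied from the paper's Equation (1). The function name `neural_activity`, the random seed, and the example input are hypothetical. The fractions computed are for a single input, whereas the sparseness in Figure 1b counts neurons that are silent for all inputs.

```python
import numpy as np

def neural_activity(J, inputs, h):
    """Assumed form: A = [J @ inputs - h]_+ / Z, with Z chosen so sum(A**2) = 1.

    J stacks [J_v, J_p] column-wise (N x 4); h may be a scalar (homogeneous
    threshold) or a length-N vector (heterogeneous thresholds).
    """
    rectified = np.maximum(J @ inputs - h, 0.0)
    Z = np.linalg.norm(rectified)
    return rectified / Z if Z > 0 else rectified  # all-silent case: nothing to scale

rng = np.random.default_rng(0)          # hypothetical seed
N = 2000                                # number of neurons (Methods)
J = rng.standard_normal((N, 4))         # fixed random matrix, mean 0, s.d. 1

# Example input: target (1, 0) and the midrange posture (2, 2) / sqrt(8).
x = np.array([1.0, 0.0, 2.0 / np.sqrt(8.0), 2.0 / np.sqrt(8.0)])

A_dense = neural_activity(J, x, h=-3.0)   # low threshold: most neurons active
A_sparse = neural_activity(J, x, h=3.0)   # high threshold: few neurons active
frac_dense = float(np.mean(A_dense > 0))
frac_sparse = float(np.mean(A_sparse > 0))
```

Because both activity vectors are scaled to unit squared sum, any difference in downstream learning cannot be attributed to a difference in total activity.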
Sensorimotor transformation via sparse coding. Figure 2a shows representative learning curves in dense (blue line) and sparse coding (red line). The above-mentioned sensorimotor transformation task can be achieved only by sparse coding (Figure 2a). The task cannot be achieved by dense coding, which suggests that a linear integration of visual and proprioceptive inputs without any threshold operation is not sufficient to achieve the task. By contrast, a sparse coding network permits the learning of sensorimotor transformations, possibly because of the nonlinear thresholding operation. Figure 2b shows the average movement error across 10 simulation runs, and Figure 2c shows the standard deviation of the movement error. Sparse coding yielded a lower average movement error as well as a lower standard deviation than dense coding. Thus, sparse coding enables outstanding and robust learning in sensorimotor transformation. It is important to note that there was an optimal sparsity for completing the task: $h = 3$ was the best threshold value for this threshold linear network model and sensorimotor transformation task.
Neurons show diverse features, and it is not plausible that the threshold $h_i$ has the same value across all neurons. Therefore, I analyzed the cases where $h_i$ was different in each neuron (i.e., $h_i = h + j_i$, where $j_i$ was sampled from a Gaussian distribution whose mean was 0 and standard deviation was 0.1). Although the optimal value of h shifted from 3 to 3.4 because of the heterogeneity of the threshold value, my conclusion was not affected by this manipulation (Figures 2d and e): sparse coding enables outstanding and robust learning in sensorimotor transformation, even when each neuron has a different threshold value.
Although the significance of sparse coding in sensorimotor transformation could be found by using the threshold linear function, it remains unclear whether the significance of sparse coding is valid when the neural activities are modeled by other nonlinear functions (see the Methods section for a detailed description). The significance of sparse coding was found to be invariant given that neural activities were determined by a hyperbolic tangential function or a sigmoidal function (Figures 2f and g). Taken together, the significance of sparse coding in the sensorimotor transformation is a solid result. Hereafter, neural activities were determined by a threshold linear function, and the threshold values were homogeneous across all the neurons.
Sensorimotor transformation was realized only in the sparse coding network, but the reason for this remains unclear. Since some previous studies have suggested that sparse coding can decorrelate neural activities, which can facilitate the decoding of information from the neural population 11,22, I expected similar effects in sensorimotor transformation. I applied principal component analysis (PCA) 23 to the covariance matrix of neural activities. In Equation (1), no correlation between neural activities was explicitly defined (no noise correlation was defined, and each component of J was randomly and independently sampled from a Gaussian distribution), but visual and proprioceptive information were common inputs to all neurons, which led to considerable correlation between neural activities. PCA makes it possible to determine the extent of correlation between neural activities by calculating the number of dimensions of the subspace spanned by the neural activity patterns. Since there are 24 input patterns in the current sensorimotor transformation task (combinations of eight visual inputs and three proprioceptive inputs), the maximal number of dimensions is 24. If the activities of all neurons are completely independent, there is no constraint on neural activities, and the number of dimensions can reach its maximal value. When the subspace consists of 24 dimensions, all 24 inputs can be encoded independently in each dimension, so that each input is easily distinguishable, and the corresponding neural network can be expected to generate an appropriate motor command for each input. On the other hand, if the neural activities are strongly correlated, there are strong constraints on neural activities, and the number of dimensions is small.
When the subspace consists of one dimension, all 24 inputs are mixed in that single dimension, such that each input is not easily distinguishable, and it can be expected that the neural network cannot generate an appropriate motor command for each input. As expected, the number of dimensions showing a large percentage of contributions was greater in sparse coding than in dense coding (Figure 3), thus suggesting that sparse coding decorrelated neural activities and facilitated learning. Hence, decorrelation was one reason why a sparse coding network enabled the learning of the sensorimotor transformation task.
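The dimension count described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the analysis behind Figure 3: the activity is taken to be a rectified random projection of the 24 input patterns, the dense case is represented by its purely linear limit (no threshold, no normalization), the normalization by Z is omitted, and the 99% cutoff, the seed, and the helper name `n_components` are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)            # hypothetical seed
N, K = 2000, 8
J = rng.standard_normal((N, 4))           # fixed random input weights

# The 24 input patterns: 8 target directions x 3 postures.
angles = -np.pi + 2.0 * np.pi * np.arange(1, K + 1) / K
postures = np.array([[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]])
postures /= np.linalg.norm(postures, axis=1, keepdims=True)
inputs = np.array([[np.cos(a), np.sin(a), p[0], p[1]]
                   for a in angles for p in postures])     # shape (24, 4)

def n_components(activity, fraction=0.99):
    """Number of principal components needed to explain `fraction` of variance."""
    centered = activity - activity.mean(axis=0)
    variances = np.linalg.svd(centered, compute_uv=False) ** 2
    cumulative = np.cumsum(variances) / variances.sum()
    return int(np.searchsorted(cumulative, fraction) + 1)

dense = inputs @ J.T                           # linear limit of dense coding
sparse = np.maximum(inputs @ J.T - 3.0, 0.0)   # thresholded at h = 3
dense_dims = n_components(dense)
sparse_dims = n_components(sparse)
```

In the linear limit the activity patterns cannot span more dimensions than the four-dimensional input, whereas thresholding lets the same 24 patterns occupy a larger subspace.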
Advantages of sparse coding. I also investigated functional roles of sparse coding in the sensorimotor transformation task other than complete and robust learning. In the subsection Sensorimotor transformation via sparse coding, I investigated learning ability using sparse coding when $K = 8$, and sparse coding proved superior to dense coding with regard to learning ability. In general, the performance of a network on training data and that on test data cannot be compatible 23: when trained neural networks overfit to training data, the network shows poor generalization performance on test data. I refer to this generally accepted fact about machine learning as the ''incompatibility of specialization and generalization'' because overfitting to training data can be regarded as a specialization of the network to the data. Based on this knowledge, one can predict that the use of sparse coding leads to worse generalization performance than that of dense coding.
I investigated generalization performance using dense and sparse coding with the weight matrix W, which was fixed after learning, and a new target sequence $v_l = (\cos H_{v,l}, \sin H_{v,l})$ with $H_{v,l} = 2\pi \frac{l}{L}$ ($l = 1, \ldots, L$, where $L = 500$). As shown in Figure 4, the sparse coding network showed better generalization performance than the dense coding network, which contrasted with the incompatibility of specialization and generalization. Thus, sparse coding can achieve compatibility between specialization and generalization in sensorimotor tasks.
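The generalization test can be sketched as an evaluation loop over the 500 new targets with W held fixed. The code below is a hedged illustration, not the paper's implementation: the activity form [J I - h]_+ / Z, the evenly spaced pulling directions in `P`, the small network size, and the names `make_test_targets` and `mean_test_error` are all assumptions made for this example.

```python
import numpy as np

def make_test_targets(L=500):
    """Targets (cos H_l, sin H_l) with H_l = 2*pi*l/L for l = 1..L."""
    angles = 2.0 * np.pi * np.arange(1, L + 1) / L
    return np.stack([np.cos(angles), np.sin(angles)], axis=1)

def mean_test_error(W, J, P, p, h, L=500):
    """Mean norm of e = v - P [W A]_+ over the test targets, without updating W."""
    errors = []
    for v in make_test_targets(L):
        rectified = np.maximum(J @ np.concatenate([v, p]) - h, 0.0)
        Z = np.linalg.norm(rectified)
        A = rectified / Z if Z > 0 else rectified   # assumed normalization
        x = P @ np.maximum(W @ A, 0.0)              # executed movement
        errors.append(np.linalg.norm(v - x))
    return float(np.mean(errors))

# Hypothetical small setup: 100 neurons, 5 muscles with evenly spaced pulls.
rng = np.random.default_rng(0)
J = rng.standard_normal((100, 4))
Q = 2.0 * np.pi * np.arange(5) / 5
P = np.stack([np.cos(Q), np.sin(Q)])                # shape (2, 5)
p = np.array([2.0, 2.0]) / np.sqrt(8.0)             # midrange posture

# An untrained network (W = 0) produces no movement, so the error is the
# distance to the unit-length target.
baseline_error = mean_test_error(np.zeros((5, 100)), J, P, p, h=3.0)
```

Passing a trained W instead of the zero matrix gives the quantity that would be compared between dense and sparse thresholds.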
Multiplicatively modulated neural activities. I also investigated whether sparse coding can reproduce reported neural activities. Figures 5a and 5b show representative neural activities in sparse and dense coding, respectively. In dense coding, the neural activity was additively modulated when proprioceptive information changed (i.e., $A_i = f(v) + g(p)$), which contrasted with reported neural activities 21. This is natural because when the threshold h was low, no thresholding operation was necessary and neural activities were linearly affected by visual and proprioceptive inputs. In contrast, in the sparse coding network, neural activity seemed to be multiplicatively modulated as $A_i = f(v) g(p)$, which corresponded to previously reported neural activities. When the threshold h was high, a nonlinear thresholding operation was required, which could reproduce a multiplicative interaction of visual and proprioceptive information as reported by previous neurophysiological experiments. Hence, not only did a sparse coding network attain complete and robust learning and the compatibility between specialization and generalization, it also reproduced previously reported neural activities in the sensorimotor transformation task.
Comparisons with nonlinear network models. In addition to the above, I investigated whether sparse coding has advantages over other nonlinear network models. Although a linear network (dense coding) cannot facilitate learning in the sensorimotor task, some nonlinear network models are expected to do so. To examine this, I first simulated a nonlinear network model where neural activities were determined by a hyperbolic tangential function, $h_i = 0$ (dense coding), W was fixed, and J was modified to minimize the squared movement error. These settings contrasted with the threshold linear network, in which J was fixed and W was modified. The hyperbolic tangential network model failed to learn the sensorimotor transformation task, which suggests that nonlinearity alone in neural activities is not sufficient to learn the task (Figure 6a). Following this, I simulated a nonlinear network model in which neural activities were also determined by a hyperbolic tangential function, $h_i = 0$, and both J and W were modified to minimize the movement error. This network model could learn the sensorimotor transformation task; however, in two of the 10 simulation runs, the network model failed to learn the task (Figure 6b). These failures were possibly due to a high degree of parameter sensitivity because the initial values, including the value of each parameter, significantly affected learning performance in this network model 24. In contrast, in sparse coding, robust learning was possible because only W was adaptable, which led to low parameter sensitivity. Furthermore, the nonlinear network models failed to reproduce multiplicatively modulated neural activities (Figure 6c). Thus, sparse coding has advantages over other nonlinear network models in that it has lower parameter sensitivity and can reproduce reported neural activities.

Discussion
In this paper, by assuming a threshold linear network and studying visually guided wrist movements in various postures, I revealed that sparse coding is superior to dense coding and other nonlinear networks for learning a sensorimotor transformation task. When the sparsity of the neural firing was optimal, the neural network could completely and robustly learn the sensorimotor transformation task (Figure 2) by decorrelating neural activities (Figure 3). The significance of sparse coding was invariant when neural activities were modeled by a threshold linear function with the same threshold value across all neurons (Figures 2a-c), by a function with heterogeneous thresholds (Figures 2d and e), or by a hyperbolic tangential nonlinear function (Figures 2f and g). Although overfitting to training data (specialization) and performance on test data (generalization) are incompatible in general, sparse coding successfully rendered the two compatible (Figure 4). Previous studies suggested that a functional role of sparse coding is to reduce metabolic cost 17. However, by keeping the cost constant for both dense and sparse coding, this study revealed that sparse coding also has other significant roles in sensorimotor transformation: better learning ability, robust learning, and rendering specialization and generalization compatible. Furthermore, sparse coding enabled the reproduction of previously reported neural activities (Figure 5a), which was not possible with dense coding (Figure 5b) or other nonlinear neural networks (Figure 6). The compatibility of specialization and generalization was also previously reported for a similar network model by assuming a binary neuron model and a discrimination task 25. That study also reported that sparse coding facilitates the decorrelation of neural activities, resulting in better discrimination performance in sparse coding than in dense coding.
My research here facilitated a comparison between neural activities in a sparse coding network and previously reported neural activities in sensorimotor transformation, and showed that a sparse coding network can reproduce previously reported neural activities. Hence, my work here showed another advantage of sparse coding networks, and revealed the biological plausibility of sparse coding from a different aspect: the reproduction of actual neural activities. Furthermore, although the previous study and this study assumed different tasks, a discrimination task and a sensorimotor transformation task, respectively, both studies found an optimal sparsity value of 0.9 (the definition of sparsity is different between the two studies). At first glance, 0.9 seems to be a magic number of sparsity; however, further analyses are necessary to rigorously discuss the optimal sparsity in various tasks. Notably, the compatibility of specialization and generalization appears to be an attractive feature of an artificial and analog neural network model, on which I focused in the current study, but further investigation is required for a competent discussion of its compatibility in more biologically plausible neural network models.
Although previous research in the area assumed that multiplicative modulation in neural activities was the crucial factor in sensorimotor transformation tasks 6,8,26, my research here showed that sparse coding is critical to sensorimotor transformation. In fact, this network can even learn a sensorimotor transformation task after eliminating neurons whose activities are multiplicatively modulated (data not shown). Furthermore, while some studies have suggested neural implementation of multiplicative modulation 19,27, I have in this paper proposed a different framework to reproduce modulation that can seamlessly connect sparse coding, the achievement of sensorimotor transformation, and reported neural activities. Taken together, my study suggests that multiplicative modulation is a sub-phenomenon of sparse coding, and that sparse coding is an essential factor in sensorimotor transformation.

Methods
Learning rule. When movement error $e_t = v_t - x_t$ is observed at the t-th trial, the connectivities from neurons to muscles, $W_t \in \mathbb{R}^{N_M \times N}$, are modified to minimize the cost $E_t = \frac{1}{2} e_t^T e_t + \frac{l}{2} M_t^T M_t$, which consists of the squared movement error and the squared sum of muscle activities: $W_{t+1} = W_t - g \frac{\partial E_t}{\partial W_t}$, or $W_{t+1} = W_t + g (P_{k'(t),+})^T e_t (A_t)^T - g l M_t (A_t)^T$, where l is a regularization parameter, g is the learning rate, and $P_{k'(t),+}$ is a matrix whose m-th column was set to 0 when the m-th muscle activity was 0. The minimization of the squared sum of muscle activities is an important factor when discussing muscle activity 28. On the other hand, this minimization did not play an important role in this study because I focused on the effect of the sparseness of neural activities on movement errors following the convergence of learning. l was thus set to 0 and g was set to 0.4. The number of trials was set to 1,000,000, but movement error had converged to a certain value by 20,000 trials. Hence, I showed movement error up to 20,000 trials in Figure 2a. The average movement errors (Figures 2b-g) were calculated by averaging the movement error from the 900,000-th to the 1,000,000-th trial.
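One trial of this learning rule can be sketched as below. This is a minimal illustration rather than the paper's code. Two points are assumptions of the sketch: the pulling directions in `P` are evenly spaced placeholder values, and a muscle whose pre-threshold drive is exactly zero is treated as active in the mask, so that the update can leave the initial state W_0 = 0. The names `learning_step`, `rate` (for g), and `reg` (for l) are hypothetical.

```python
import numpy as np

def learning_step(W, A, P, v, rate=0.4, reg=0.0):
    """One update W <- W + g P_+^T e A^T - g l M A^T; returns (new W, error e)."""
    drive = W @ A                          # motor command W A
    M = np.maximum(drive, 0.0)             # muscle activity M = [W A]_+
    e = v - P @ M                          # movement error e = v - x
    P_plus = P * (drive >= 0)              # zero the columns of silent muscles
    W_new = W + rate * np.outer(P_plus.T @ e, A) - rate * reg * np.outer(M, A)
    return W_new, e

# Toy setting: 5 muscles, 4 neurons, one fixed input pattern.
Q = 2.0 * np.pi * np.arange(5) / 5         # placeholder pulling directions
P = np.stack([np.cos(Q), np.sin(Q)])       # shape (2, 5)
A = np.full(4, 0.5)                        # neural activity with sum(A**2) = 1
v = np.array([1.0, 0.0])                   # target
W = np.zeros((5, 4))                       # W_0 = 0, as in the Methods

errors = []
for _ in range(50):
    W, e = learning_step(W, A, P, v)
    errors.append(float(np.linalg.norm(e)))
```

With `reg` set to 0, as in the simulations reported here, the step is plain gradient descent on the squared movement error, and the recorded `errors` shrink toward zero for this single repeated pattern.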
Parameters. Each element of $J \in \mathbb{R}^{N \times 4}$ was randomly sampled from a Gaussian distribution whose mean was 0 and standard deviation was 1. This sampling was repeated 10 times, i.e., I ran 10 simulations to calculate the average movement error shown in Figures 2b-g. The number of neurons N was set to 2,000. The number of neurons did not significantly affect the results because it mainly affects the learning speed 29. The threshold value h was sampled at 18 linearly spaced points from -3 to 3.8. More precisely, h was sampled at 21 linearly spaced points from -3 to 5, but no neuron was active when h was greater than 3.8; thus, no learning was possible for those values of h. The initial weight value $W_0$ was set to 0. The number of muscles $N_M$ was set to 5 based on previous studies 8,20,21. The fixed muscle-to-movement transformation matrices $P_{k'(t)} \in \mathbb{R}^{2 \times N_M}$ depend on posture. The (1,i)-th and the (2,i)-th components of $P_{k'(t)}$ are defined as $\cos(Q_{i,k'(t)})$ and $\sin(Q_{i,k'(t)})$, respectively, where $Q_{i,k'(t)}$ represents the pulling direction of the i-th muscle at the k'-th posture. The pulling direction is the direction of motion induced by muscle activation and depends upon posture. Each pulling direction was determined based on previous studies 8,20,21.
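The threshold grid and the structure of the pulling-direction matrix can be written out directly. The sketch below uses only the values stated above for the grid; the pulling directions `Q` are evenly spaced placeholders, since the actual posture-dependent directions come from the cited studies and are not listed here. The names `h_all`, `h_used`, and `pulling_matrix` are hypothetical.

```python
import math

# 21 linearly spaced thresholds from -3 to 5 (step 0.4).
h_all = [-3.0 + 8.0 * i / 20 for i in range(21)]
# The 18 thresholds up to 3.8; above this no neuron was reported active.
h_used = [h for h in h_all if h <= 3.8 + 1e-9]

def pulling_matrix(directions):
    """Build the 2 x N_M matrix whose i-th column is (cos Q_i, sin Q_i)."""
    return [[math.cos(q) for q in directions],
            [math.sin(q) for q in directions]]

N_M = 5                                             # number of muscles
Q = [2.0 * math.pi * i / N_M for i in range(N_M)]   # placeholder directions
P = pulling_matrix(Q)                               # one such matrix per posture
```

In a full simulation, one matrix of this shape would be built for each of the three postures from the corresponding pulling directions.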
Summary of computer simulation. The procedures of my computer simulations can be summarized as follows. Setting the parameters J, l, g, N to certain values, $W_0 = 0$, and $H_{v,k(t)}$ and $H_{p,k'(t)}$ to a certain value at the t-th trial yields the following:
(Determining input pattern) $I_t = (v_t, p_t)^T = (\cos H_{v,k(t)}, \sin H_{v,k(t)}, \cos H_{p,k'(t)}, \sin H_{p,k'(t)})^T$.
(Update of connectivity matrix) $W_{t+1} = W_t + g (P_{k'(t),+})^T e_t (A_t)^T - g l M_t (A_t)^T$. (9)
Compared network model. In the subsection Sensorimotor transformation via sparse coding and Comparisons with other nonlinear network models, I reported the results of simulations from three hyperbolic tangential network models. In the subsection Sensorimotor transformation via sparse coding, neural activities were determined by