Abstract
Humans are capable of achieving complex tasks with redundant degrees of freedom. Much attention has been paid to task relevant variance modulation as an indication of online feedback control strategies to cope with motor variability. Meanwhile, it has been discussed that the brain learns internal models of environments to realize feedforward control with nominal trajectories. Here we examined trajectory variance in both spatial and temporal domains to elucidate the relative contribution of these control schemas. We asked subjects to learn reaching movements with multiple viapoints and found that hand trajectories converged to stereotyped trajectories with the reduction of task relevant variance modulation as learning proceeded. Furthermore, variance reduction was not always associated with task constraints but was highly correlated with the velocity profile. A model assuming noise both on the nominal trajectory and motor command was able to reproduce the observed variance modulation, supporting an expression of nominal trajectories in the brain. The learningrelated decrease in taskrelevant modulation revealed a reduction in the influence of optimal feedback around the task constraints. After practice, the major part of computation seems to be taken over by the feedforward controller around the nominal trajectory with feedback added only when it becomes necessary.
Introduction
Biological motor control problems involve considerable redundancy, neural noise and substantial sensorimotor delay. The brain solves these problems with limited resources and time by learning to perform given tasks. In particular, the brain has to cope with the variance caused by external perturbation as well as internally generated noise. Recent studies suggested that the variance was reduced only around the task constraints and remained large at locations irrelevant to the task in complex tasks, which is called the minimal intervention principle. At the same time, there is much experimental evidence showing a reduction of movement variance after learning and convergence to nominal trajectories in simple reaching tasks.
These two different observations of movement variance are closely related with old and new arguments of whether biological motor systems mainly depend on feedback control or feedforward control. In the optimal feedback control approach, the brain does not plan movement beforehand, but solves the problems online using all possible feedback information available at that time without the need for feedforward control^{1}. This idea originated in traditional psychological theory such as dynamical system theory^{2,3} and the theory of uncontrolled manifolds^{4,5,6} and was formulated in the context of biological motor control as the optimal feedback control theory by Todorov and Jordan^{1}. This approach solves all problems simultaneously at the feedback level and therefore assumes a heterarchical implementation in the brain. There is a large amount of behavioral evidence that humans reduce variability only in the direction relevant to the task^{4,6,7}. Such taskrelevant variance modulation is called the minimal intervention principle and has been regarded as evidence of the absence of a plan^{1,8,9}. Feedforward control, on the other hand, solves the problem sequentially by dividing a complicated problem into several simple problems (divide and conquer). To divide the problem, this approach normally requires an intermediate representation between the task and motor commands, e.g., a desired trajectory, that are not directly specified by the task constraint. Such a strategy assumes hierarchical implementation in the brain. Complicated problems are partly solved at the planning level before the start of movement^{10,11,12,13,14}. This type of approach stems from control theory in robotics and physiological and imaging data suggest hierarchical information processing and modular characteristics of the brain^{15,16,17}. If the role of feedback control (such as the control of impedance) is to reduce the deviation from the planned (desired) trajectory, the actual trajectories are likely to be spatially and temporally uniformly distributed around the fixed planned trajectory, rather than reducing their variance only around task constraints.
The question we address in this paper is whether task relevant modulation (optimal online feedback control) is the major contribution of the trajectory variance, or whether the assumption of nominal trajectories (feedforward control) is required for explaining trajectory variance. As a behavioral investigation, we focused on the modulation of variance in reaching tasks with multiple intermediate targets.
We first focus on the learningrelated change of taskrelevant modulation in variance. An important prediction to be tested is if the minimal intervention principle is optimal behavior in the sense of the optimal feedback control schema, then taskrelevant modulation should be observed in skilled movements even after learning. Our behavioral experiments demonstrate that this taskrelevant modulation decreases or disappears after practice, suggesting the significance of planbased control for skilled movements.
We then examine the possibility that movement planning is expressed in the form of trajectories (i.e., kinematic variables as a function of time) in the brain. In such a case, we suppose that neural noise could be added to the time indices (timejitter noise), which predicts velocitydependent modulation of variability. The observed velocitydependent modulation of the trajectory variability would support the existence of nominal trajectory expression in the brain with the proposed timejitter noise model.
We finally consider the effect of dynamics of the body (e.g., arm dynamics) on the observed variance modulation. The experimental paradigm suggests that trajectory optimization should take into account the dynamics of the nonlinear musculoskeletal system, which makes the problem of online optimization complex.
Results
Experiment 1: taskrelevant variance modulation disappears after learning
We asked participants to perform reaching tasks with multiple targets and examined the practicerelated changes in the variance structure.
The participants performed four different types of multipletarget movements until 50 successful trials were acquired (see Fig. 1 and Methods). To focus on variance modulation in the task space, we computed the variance normalized by path length^{1} (see Eq. 2). This ‘path variance’ evaluates solely spatial variance by selecting the nearest points, without taking into account the temporal information^{1}. To quantify the variance modulation, we computed the modulation index (MI) from path variance. The MI will be large if the variances between targets are larger than the variance at adjacent targets. Therefore, a larger MI indicates better modulation of the variance. Figure 2 shows the evolution of the MI over normalized trial numbers. In general, the taskrelevant modulation of the path variance was observed at the early stage of practice and reduced at the later stage of practice. In all tasks, the MI gradually decreased as practice proceeded. In five of six tasks, the MI of the final 5% of total trials was significantly smaller than that of the initial 5% (Todorov’s task: t(8) = 7.11, p = 0.00010, parabola task: t(7) = 2.83, p = 0.02544, fast zigzag task: t(7) = 2.12, p = 0.07137, slow zigzag task: t(8) = 5.20, p = 0.00082, fast threeviapoint task: t(6) = 2.53, p = 0.04440, slow threeviapoint task: t(8) = 3.14, p = 0.01383). The path variance at each target did not change significantly while it significantly decreased after learning (paired ttests) at some midpoints between the targets. Therefore, the observed decrease in the modulation index was due not to the increase in variance at the targets but to the decrease in variance at the midpoints.
We then examined whether there was still significant modulation of path variance after learning (inset of Fig. 2). Although we observed significant differences among targets and midpoints in Todorov’s task, the parabola task and the threeviapoint tasks (ANOVA), the posthoc comparison revealed that there were differences relevant to the minimum intervention principle (i.e., an increase in variance at a midpoint in comparison with its neighboring targets) for midpoints 2 and 3 of Todorov’s task and for midpoint 2 of the threeviapoint task, but not for others. These results show that taskrelevant modulation of path variance does not necessarily increase but tends to disappear after practice.
If online correction towards the original task constraints is the major contribution of the variance modulation, it is hard to explain why the variance between the targets has reduced, where no task constraints were given. Reduced task relevant modulation after learning suggests that the brain may have been lead to converge the trajectory between the targets through practice, by learning a nominal trajectory toward which the variance is reduced.
Temporal aspect of variance modulation
We then computed the ‘trajectory variance’ taking into account the spatiotemporal characteristics of the variance. Here the trajectory variance is defined as the spatial variance along normalized time and the effect of movement duration was removed by resampling the position for each trajectory so that the duration was evenly divided into 100 pieces (time normalization, as detailed in Methods). This method normalizes the total movement duration without removing the local time warp. While the path variance is expressed as a function of path length, the trajectory variance is expressed as a function of normalized time. Figure 3a–d show the temporal profiles of the trajectory variance and mean squared velocity for representative participants. In all tasks, there was a general tendency that the trajectory variance increased with an increase in velocity and vice versa. When the minima of the velocity and target did not match (e.g., targets 1 and 2 in the parabola task, targets 2 and 3 in the multitarget task), the minima of the trajectory variances were located not around the targets (blue arrows) but around the midpoints where the velocity minima were observed (red arrows). These results suggest that the trajectory variance did not follow the minimal intervention principle but changed in parallel with the velocity profile. As expected, the linear quadratic Gaussian (LQG) framework^{1,18} that computes optimal feedback controllers predicted the minimal intervention principle even for the parabola task and the multitarget task, in which the variance increased at the midpoint of the two viapoints where the velocity was minimal (red arrows in Fig. 3f and h, in comparison with those in Fig. 3b,d).
Experiment 2: noise in the nominal trajectory explains movement variance
Because the trajectory variance tends to be more affected by the movement velocity than by the task constraints as seen in the above experiments, we investigated the temporal aspects of the variance in more detail. If a representation of a nominal trajectory exists in the brain for skilled and rapid movements, we hypothesize that noise can be added to that trajectory representation in a manner similar to noise added to the control command. If we assume that the motor command is computed sequentially via a nominal trajectory (Fig. 4a), the motor noise at the level of a nominal trajectory affects the actual trajectory without time delay because it does not yield to the integration effect of dynamics (Fig. 4b). In contrast, motor noise added to the motor command, known as signal dependent noise, plays through dynamics, resulting in an incremental increase of variability in the actual trajectory (Fig. 4c). Therefore, we may be able to distinguish the source of noise by analyzing the temporal aspect of variability in the actual trajectory either at the planning level or at the motor command level.
We propose a novel noise model assuming the time jitter noise in a desired trajectory expressed as a position sequence as a function of time as well as signaldependent noise. The time jitter noise here means local advance or delay of time in reading out a desired trajectory owing to speed changes of planner dynamics. Employing the proposed model, we separated the variability caused by noise at the planning level from that caused by noise at the motor command level. Specifically, we succeeded in predicting the time course of the trajectory variability T_Var(t) during reaching movements using a linear summation of incremental variability coming from the signaldependent noise and velocitydependent variability coming from the planning noise (see Methods):
where, the trajectory variance T_Var(t) corresponds to the variability of the actual hand position at time t from the desired hand position at the same time t assuming the representation of a desired trajectory in the brain. Because information of the actual desired trajectory in the experiments with human subjects is not available, we used the mean trajectory as its approximation. In Eq. 1, the first term represents the effect of timejitter noise on the desired trajectory and the second term represents the effect of signaldependent noise playing through the dynamics, which is proportional to the double integral of the sum of square of the motor command τ. For simplicity, τ(t)^{2} was approximated by the summation of the square of the shoulder torque and the square of the elbow torque. E includes both the spatial noise of the planned trajectory and the modeling error. Thus, the hierarchical schema predicts that the trajectory variance T_Var(t) can be reproduced by the linear summation of terms proportional to 1) the square of velocity, 2) the incremental term that is proportional to the integrated motor commands and 3) the error term.
The proposed model predicts that the trajectory variance normal to the movement direction has less contribution from the velocitydependent term because there is a small component of velocity in the normal direction. The trajectory variance tangential to the movement direction, however, should have a significant velocitydependent term because there is a substantial increase in velocity in the tangential direction. Therefore, β_{1} should be different when reconstructing either a normal or tangential variance. To confirm this, we tested simple pointtopoint reaching movements with nearly straight trajectories (Fig. 5a).
We computed the trajectory variances normal to and tangential to the mean trajectory from 40 movements after enough practice (Methods). For each variance, the parameters β_{1} and β_{2} in Eq. 1 were estimated using the least square error method. Figure 5b,c show the time courses of the observed variance (solid curves) and mean square velocity weighted by the parameter β_{1} (dotted curves) as well as the reconstructed variance (dashed curves) for forward movements. Both the tangential and normal variances were well reconstructed (Table 1). The contribution of the velocitydependent term was significantly smaller for the normal than for the tangential variance (ANOVA F(3, 8) = 169.25, p < 0.000001 [forward], ANOVA F(3, 8) = 75.50, p < 0.00001 [rightward]). The normal variance was mainly explained by the incremental term relating to the signaldependent noise while the tangential variance required both incremental and velocitydependent terms. The LQG simulation (see Methods) predicted a bellshaped trajectory variance in both the normal and tangential directions (Fig. 5d). To reproduce the observed dissociation between the normal and tangential trajectory variance in the LQG simulation, signaldependent noise has to be larger in the tangential direction than in the normal direction and at the same time, the task constraint has to be smaller in the tangential direction than in the normal direction (Fig. 5e).
Our noise model assumes that the time at each position on the movement path across many trials has a Gaussian distribution with a mean of zero and that the standard deviation of this distribution is constant throughout the movement duration (Methods). Independent of the model fittings, we computed the actual distribution of time at each position in the trajectories produced in Experiment 2 to examine the properties of the timejitter noise. Although the movement paths are nearly identical to each other, the time when the hand reached the same position on the path is slightly different (Fig. 6a,b). For example, the time at the 50% point of one path shown in (b) (dashed curve) was advanced of that of the mean path (solid curve). In contrast, the time at the 50% point of another path (dashdotted curve) was delayed compared with that of the mean path. The timejitter noise in the produced trajectories was approximated by the Gaussian distribution with a mean of zero, which is consistent with the model assumption (Fig. 6c,d). From this, we computed the standard deviation of the timejitter noise for each task and compared it with the standard deviation of the timejitter noise predicted from the model (the square root of the parameter β_{1}). The standard deviation of the timejitter noise computed from the data of Experiments 1, 2 and 3 (see below for the data of Experiment 3) correlated with that estimated from the model fitting (r = 0.79, Fig. 6e). The results show that temporal fluctuations in the trajectory had a Gaussian distribution with an approximately constant standard deviation throughout the movement.
Experiment 3: effect of dynamics on trajectory variance
It is known that human hand trajectories are affected by the dynamics of the muscle skeletal system of the body^{11}. For example, when the location of an intermediate target (viapoint) is closer to the body than the horizontal start–end line, the hand velocity tends to exhibit a doublepeaked profile^{12}. In contrast, it has a singlepeaked profile when an intermediate target is located away from the body (Fig. 7a,b). The minimal intervention principle in the Cartesian task space with linear dynamics would predict a symmetric variance modulation with respect to the location of the target around the horizontal line as predicted by the LQG simulation (Fig. 7e,f) where the nonlinear dynamics of the body were not taken into account. However, we observed asymmetric modulation of the trajectory variance similar to the velocity profile (Fig. 7c).
The proposed noise model in Eq. 1 successfully reconstructed the trajectory variance and the contribution of the velocitydependent term was significantly positive for all participants (p < 0.00001) and sufficiently large (Fig. 7d, Table 1).
Discussion
Characteristics of variance modulation
The present study focused on variance modulation during human reaching movements in both spatial and temporal domains. We demonstrated (1) learningrelated reduction of variance modulation and (2) velocitydependent variance modulation. These results demonstrate convergence towards nominal trajectory after learning and suggest an expression of nominal trajectories in the brain. We identified two different noise sources that could contribute to the time course of movement variability: timejitter noise in the desired trajectory and signaldependent noise in the motor commands. This simple model was able to reproduce the variance modulation observed in behavioral experiments. The observed learningrelated decrease in taskrelevant modulation revealed that movement became more stereotyped, suggesting a reduction in the weight on online optimal feedback in movement. After practice, the feedforward controller around nominal trajectory seems to take over the major part of computation for these movements^{19}.
Explanation of the reduction in variance modulation
It has been argued that the taskrelevant modulation of variability (the ‘minimal intervention principle’) is compatible with the optimal feedback control hypothesis but incompatible with the planbased control hypothesis. However, it is in fact compatible with the planbased control hypothesis, particularly when considering the process of practice and trajectory optimization. Especially for complex and inexperienced movements, computation of the desired trajectory may not have converged yet at first. Taskrelevant modulation observed at the beginning of learning could be partly explained by the exploration of a desired trajectory in addition to the results of optimal feedback control. Suppose that participants encounter a new reaching task with constraints (e.g., the novel allocation of intermediate targets) and practice it several tens of times to meet the constraints. In the planbased control strategy, the brain first produces a desired trajectory by offline optimization computation. In simulations, a number of iterations in the computation are typically required to determine the optimal trajectory from suboptimal trajectories, especially when the cost function is complex. It is also probable for a biological system that sufficiently long duration is needed to converge to the optimal trajectory^{20}. In such a case, a participant may start reaching with suboptimally planned trajectories that satisfy the target constraints while exploring the trajectory between targets^{21}. Given the intermediate targets as hard constraints, the optimal trajectory planner may try to solve redundancy problems that lie between the targets by optimizing soft constraints such as jerk or energy. Then, at the beginning of practice, the path variance should be smaller at the task constraints and larger between them. After the brain obtains the optimal solution, the trajectory should become a more stereotyped pattern with less modulation around the task constraints. Therefore, the planbased control hypothesis predicts a practicerelated decrease in taskrelevant variance modulation.
Optimal feedback control may also be able to explain stereotyped movement after learning if we consider changes of the weights in the criteria in the brain. For example, after learning, the brain might have had smaller weights with respect to the constraint and larger weights on, for example, the effort leading to convergence of the trajectory between targets.
Feedback control or not
Many attempts have already been made to investigate whether the exclusive use of feedback control is sufficient to explain human arm movements. Theoretical work has demonstrated that rapid and coordinated arm movements cannot be executed solely under feedback control because biological feedback loops are slow and have small gains^{22}. Miall et al. suggested an idea of combining the forward model and feedback controller as an inverse model to compensate time delay^{23}. However, Mehta and Schaal showed that this strategy could not stabilize an unstable system^{24}. Therefore, the exclusive use of a feedback control law does not seem to be practically useful in a biological system with time delay, whereas a feedforward impedance controller can stabilize unstable dynamics by learning appropriate impedance^{25}. In addition, as illustrated in Experiment 3, optimal feedback control with linear dynamics was not able to predict asymmetric trajectory and variance profiles and it is nontrivial to derive a general optimal feedback controller for nonlinear plant dynamics^{26}. To our knowledge, there is still no effective demonstration of dealing with nonlinear dynamics or unstable dynamics through the exclusive use of optimal feedback control in the literature on biological motor control. The iterative linear quadratic Gaussian (iLQG) scheme, as one solution for the nonlinear dynamics, includes both feedforward commands and local optimal feedback around the optimal trajectory^{27}. In this case, the feedforward motor command is computed without hierarchical computation, as a result of optimization using an internal model.
Several recent studies reported physiological and behavioral results supporting optimal feedback control^{28,29,30,31,32}. For example, representations of MI neurons do not always remain constant across behavioral contexts but change their sensitivity according to the task^{33}. There exists a flexible and sophisticated long latency reflex that is similar to the later voluntary response^{34} and has an internal model of limb dynamics^{35}. Diedrichsen successfully demonstrated that both feedback control and the adaptation of two hands change optimally according to the current bimanual task requirements^{36}.
The feedback controller itself is consistent with the desired trajectory hypothesis. Some recent models have progressively incorporated the feedback controller into the feedforward controller with a trajectory planner^{37,38}. These models can deal with interactive movements with objects as well as target shifts during the movements. Much evidence suggests the existence of sophisticated longlatency feedback and it is possible that such highlevel feedback systems are dedicated to an optimal feedback control law. However, it is still unclear whether the feedback gain is in fact effectively modified online so that it can optimize the performance. The gain may be sophisticated but suboptimal^{39} and it may be already preplanned and executed in a feedforward manner^{25}. By controlling stiffness (i.e., the feedback gain using a feedforward mechanism), the effect of signaldependent noise can be reduced without complicated online computation of feedback gains^{40}. This schema (i.e., desired trajectories combined with feedforward impedance control) allows us to explain the decrease in variance at task constraints^{5,41}. The variance at the targets can be decreased because the trajectories are corrected towards the desired trajectory with a preplanned gain that is higher than that at midpoints. Thus, even if the path variance were to decrease near viapoints, we could not conclude that the motor control system employed the optimal feedback control law. Further investigations will be performed to demonstrate the existence of realtime and optimal modulation of feedback gain.
There is also recent physiological evidence for feedforward control. Subcortical structures can contribute to prepared motor responses in humans through reticulospinal tract, although its fundamental role is limited to coordinated movement of the whole hand, rather than dexterous individual finger movements^{42,43}. Subcortical structure of rats, without motor cortex, has recently been shown to execute skilled, but not dexterous motor tasks, although motor cortex was necessary during the process of learning the tasks^{44}. These results demonstrate that subcortical structures may play an important role in feedforward control of fundamental motor repertories. Better understanding of these interactions between cortical and subcortical structures may help elucidate conditions and tasks for which feedforward and feedback control strategies are most relevant.
Noise in the nominal trajectory
Previously, Gordon et al. measured variability at the end of a movement to determine the nature and origin of the coordinate system in which the movements were planned^{45}. McIntyre et al. further examined different types of errors to identify properties of the internal representation and coordinate transformations in the brain^{46}. Churchland et al. showed that at least 30% of behavioral variability can be accounted for by the variability of preparatory neural activity in the dorsal premotor cortex^{20}. While these studies assume that observed variability mainly or partly arises in the planning process, van Beers et al. successfully explained the variability at the end of movement by noise in movement execution (i.e., noise in the motor commands (signaldependent noise^{47})) rather than by noise in the desired trajectories^{48}.
In our noise model, we hypothesized that timejitter noise exists at the planning level but not at the motor command level. If the time series of motor commands were temporally stored before being issued, or expressed in such a way as a table lookup method, timejitter noise could also appear when the stored motor commands were read out at each moment. However, because the effect of noise on the motor commands plays through the integrating effect of arm dynamics, it will also result in the incremental increase of path variance, mainly expressed as the global extension and contraction of movement duration and overshoot/undershoot at the end of the movement. Therefore, time jitter noise on the motor command will not correlate with velocity and should be removed by time normalization and/or be included in the second incremental term of Eq. 1.
Combination of the preplanned trajectory and local optimal feedback
Our overall results suggest that online feedback control around task constraints is not the exclusive strategy of motor control and, at least for rapid skilled movements, desired trajectories are planned in the brain. The most possible and feasible solution would be the combination of a desired trajectory and local optimal feedback control. Using a preplanned trajectory, the brain can solve complex problems with nonlinear dynamics before starting the movement through the learned internal model. The brain makes the best use of offline computation to reduce the cost of online computation while online computation concentrates on further fine tuning and dealing with unpredictable perturbations using redundant degrees of freedom^{19,32}. The principles of biological motor control will be further investigated in future work.
Methods
Participants and experimental setting
Twentyfour male and one female participants, aged 21–32 years, who were righthanded except for one, participated in at least one of the three experiments. Seventeen participated in the multiviapoint tasks (Exp. 1), nine in the pointtopoint movement tasks (Exp. 2) and seven in the mirrorplacement singleviapoint tasks (Exp. 3). Seven participated in both Experiments 1 and 3 and another four participated in both Experiments 1 and 2. All experiments were conducted in accordance with the principles and the guidelines in the Declaration of Helsinki and were approved by the ATR Human Subject Review Committee. The participants provided informed consent before participation.
Participants were seated on a chair and their shoulders fixed to the back of the chair with a harness. The height of the table was adjusted to lift the participant’s arm to shoulder level. The participant’s right wrist was braced so that movement was constrained to allow only two degrees of freedom of the elbow and shoulder. To reduce friction between the arm and table, the arm was attached to a board that was levitated above the table by an air sled. The participants performed all tasks with their right hand. An OPTOTRAK 3020 device (Northern Digital Inc., Canada) was used to measure the position of a marker placed on the end of a 9cm vertical bar that was grasped by the participant. The marker position was sampled at 500 Hz (Experiment 2) and 400 Hz (Experiment 1 and 3) and projected as a cross mark on a highresolution monitor placed in front of the participants to represent the current hand position. The participants performed the experiment while looking only at the monitor. The room was darkened to eliminate visual information and the participant wore noisecanceling headphones (Bose QuietComfort Acoustic Noise Canceling headphones, Bose Corporation, USA) to eliminate auditory noise and to allow him/her to concentrate on the individual experiment. We used a beep sound to indicate the beginning and the end of the movement task.
Experiment 1: multiple target tasks
Nine participants performed Todorov’s task (three intermediate target conditions of Experiment 1 in ref. 1), the slow zigzag task and the slow three viapoint task while the other eight participants performed the parabola task, the fast zigzag task and the fast three viapoint task (Fig. 1). The participants performed each task until there were 50 successful trials. One participant could not complete the fast threeviapoint task, which was excluded from the analysis (while other tasks of this participant were included in the analysis). The average numbers of trials were 345 ± 85 for Todorov’s task, 132 ± 31 for the parabola task, 109 ± 24 and 106 ± 20 for the fast and slow zigzag tasks and 133 ± 34 and 148 ± 35 for the fast and slow threeviapoint tasks. The peak velocity after learning (average of the last 20% of the total trials) averaged across participants was 43.67 ± 4.66 cm/s for Todorov’s task, 57.71 ± 9.26 cm/s for the parabola task, 49.87 ± 7.66 and 30.38 ± 2.01 cm/s for the fast and slow zigzag tasks and 61.51 ± 13.89 and 36.65 ± 3.04 cm/s for the fast and slow three viapoint tasks respectively.
Experiment 2: pointtopoint reaching task
Experiment 2 consists of two sessions of pointtopoint reaching tasks along different directions. In one session, participants performed movements in the forward direction and in the other session, they moved in the rightward direction (Fig. 5). Movements ending in the target circle within the specified duration were regarded as successful trials. The participants were randomly placed into one of the two groups. On the first day, one group practiced forward movements until they reached 50 successful trials followed by rightward movements. The second group started with rightward movements followed by forward movements. On the second day, the first group performed forward movements until they reached 40 successful trials followed by rightward movements. The second group started with rightward movements. The average success rates on the second day were 78.72 ± 12.37% for forward movements and 79.65 ± 5.52% for rightward movements.
Experiment 3: single viapoint tasks (mirrorplacement tasks)
Eight intermediate targets were selected and equally arranged on the perpendicular bisector of the start–goal straight line (Fig. 7). The combination of the start point, end point and via point was regarded as a set. One of the eight sets was randomly presented for each task. First, the participants performed eight sets of 20 tasks for a total of 160 tasks (training). Second, the participants were trained for four viapoints that were selected on the basis of the failure rate. Forty trials were performed for each viapoint, amounting to 160 tasks (training). Finally, the participants performed the same task as in the first step. Equation 1 was applied to the trajectories in the last step.
Data analysis
Position data were digitally filtered using a thirdorder Butterworth lowpass filter with a cutoff frequency of 12 Hz. The velocity was computed by applying a threepoint derivative of the measured position data. The start and end points of each movement were determined using a curvature threshold of 100 m^{−1} ^{49}. Trials that did not stop within the final target circle and trials whose movement duration or path length was more than 3 standard deviations from the mean were excluded from further analyses.
For the purpose of examining the taskrelevant variance, we applied the methods used by Todorov and Jordan^{1} where data were normalized on the basis of path length. This method removes the temporal effects. First, all trajectories for one participant and condition were resampled at 100 equally spaced points along the path. Second, the average trajectory was computed from the resampled data. Third, for each average point, the nearest point from each trial was found and the path variance was calculated from that point. To avoid artifacts of realignment, 5% of the path at each end was eliminated from the analysis.
The modulation index was computed from the path variance as follows.
here, P_Var_{tgt}(i) denotes the variance at the ith target and P_Var_{mid}(i) denotes the variance at the midpoint between the (i1)th and ith targets. N denotes the number of intermediate targets. The variance at a certain target was defined as the minimum value observed within 5% in front of and behind the target along the path. The variance at a certain midpoint was defined as the maximum value observed within 5% in front of and behind the midpoint along the path.
For the purpose of examining the temporal characteristics of the variance and the effect of timejitter noise, the trajectory variance of each participant was computed for a set of trajectories as follows. First, the data were resampled between the start and end times so that the duration was evenly divided into 100 pieces to remove the effect of movement duration (time normalization). Therefore, each trajectory has 100 data points with different sampling intervals depending on the movement duration. Second, the resampled position was ensemble averaged to compute the mean position for each 100 time steps. Trajectories whose movement duration and path length were within 2 standard deviations of the mean were included in the regression analysis.
We assume that a trajectory pattern (i.e., a position sequence as a function of time) of the kth trial is generated by the trajectory planner. The position at each time step t of this planned trajectory pattern is expressed as
Here x_{desired} denotes the desired trajectory of that task; i.e., a position sequence as a function of time in the absence of noise. Each planned trajectory spatiotemporally deviates from the desired trajectory because of the noise in planner dynamics and computational limitations. δ(t) represents the local advance or delay of time (time jitter noise) and ω(t) represents spatial noise.
This time series of the planned position is read out at each moment. The motor command at each time step is computed using an inverse model of the controlled object such as the arm. For notational simplicity, we use G to denote the dynamics of the controlled object (i.e., the arm) and G^{−1} to denote its inverse model. Assuming that signaldependent noise is added to motor commands when they are issued to the controlled object, the actual produced trajectory can be approximately expressed as
Here, δ(t), G(ε(t)), ω(t) are assumed to have a normal distribution with zero mean and standard deviation a(t), b(t), c(t), respectively (see below). The produced trajectory is approximated by the Taylor expansion to the second degree:
Assuming that are mutually uncorrelated, the mean and variance of the produced trajectory are
Because b(t) is the effect of signaldependent noise playing through the dynamics, it should be proportional to the double integral of the sum of the square of the motor command τ. Here we used the joint torque to approximately estimate the magnitude of the motor commands. a(t) and c(t) are the time jitter and spatial noise of the planned trajectory, respectively, both of which are independent of the signal and dynamics. Assuming that a(t) is constant throughout the movement, the trajectory variance is modeled as in Eq. 1.
The total trajectory variance is defined as the sum of the x and y variances:
The trajectory variances in tangential and normal directions are computed from the variance in the direction tangential and normal to the mean trajectory respectively:
where y′ and x′ denote the tangential and normal components of position with respect to the mean trajectory, respectively.
For the purpose of regression analysis of Eq. 1, dynamic torques were calculated using the dynamics equations of a twolink arm model and the position data and link parameters estimated from the link length for each participant (with the data of an adult male arm measured with a threedimensional scanner to provide a standard). The mass of links was adjusted for each participant by making the standard value proportional to the link length of the participant. The moment of inertia of the links was adjusted by making the standard value proportional to the third power of the link length of the participant. Viscosity coefficients were estimated from the absolute average torque for each movement using the equation (6) in ref. 3.
Optimal feedback control simulation
In the comparative LQG simulations for the reaching tasks in Experiments, the LQG formulation described in the supplementary information in ref. 1 was used (refer to Section 2 for details). The same parameters for the point mass (m = 1 kg), the time constant for the filters (τ = 40 ms) and the sampling time for discretization (Δt = 10 ms) were used as in ref. 1. The sensory noise parameter σ_{S} and the control noise parameter σ_{u} were adjusted so that simulated variability approximately matched the experimental data. The weight parameters in the cost defining the relative importance of stopping for the velocity and force terms respectively (w_{v}, w_{f}) were also adjusted for the experimental data. The weight for the effort penalty r was adjusted for each task (either r = 0.002 or r = 0.00002 similarly as in ref. 1). See the figure legend for the parameter settings for each simulation.
Additional Information
How to cite this article: Osu, R. et al. Practice reduces task relevant variance modulation and forms nominal trajectory. Sci. Rep. 5, 17659; doi: 10.1038/srep17659 (2015).
References
Todorov, E. & Jordan, M. I. Optimal feedback control as a theory of motor coordination. Nat Neurosci 5, 1226–1235 (2002).
Saltzman, E. L. & Kelso, J. A. S. Skilled actions: A taskdynamic approach. Psychological Review 94, 84–106 (1987).
Kelso, J. A., Southard, D. L. & Goodman, D. On the nature of human interlimb coordination. Science 203, 1029–1031 (1979).
Scholz, J. P. & Schoner, G. The uncontrolled manifold concept: identifying control variables for a functional task. Exp Brain Res 126, 289–306 (1999).
Scholz, J. P., Schoner, G. & Latash, M. L. Identifying the control structure of multijoint coordination during pistol shooting. Exp Brain Res 135, 382–404 (2000).
Domkin, D., Laczko, J., Djupsjobacka, M., Jaric, S. & Latash, M. L. Joint angle variability in 3D bimanual pointing: uncontrolled manifold analysis. Exp Brain Res 163, 44–57 (2005).
Robertson, E. M. & Miall, R. C. Multijoint limbs permit a flexible response to unpredictable events. Exp Brain Res 117, 148–152 (1997).
Todorov, E. Optimality principles in sensorimotor control. Nat Neurosci 7, 907–915 (2004).
Liu, D. & Todorov, E. Evidence for the flexible sensorimotor strategies predicted by optimal feedback control. J Neurosci 27, 9354–9368 (2007).
Flash, T. & Hogan, N. The coordination of arm movements: an experimentally confirmed mathematical model. J Neurosci 5, 1688–1703 (1985).
Uno, Y., Kawato, M. & Suzuki, R. Formation and control of optimal trajectory in human multijoint arm movement. Minimum torquechange model. Biol Cybern 61, 89–101 (1989).
Nakano, E. et al. Quantitative examinations of internal representations for arm trajectory planning: minimum commanded torque change model. J Neurophysiol 81, 2140–2155 (1999).
Harris, C. M. & Wolpert, D. M. Signaldependent noise determines motor planning. Nature 394, 780–784 (1998).
Miyamoto, H., Nakano, E., Wolpert, D. M. & Kawato, M. TOPS (Task Optimizaztion in the Presence of Signaldependent noise) model. Systems and Computers in Japan 35, 940–949 (2004).
Kalaska, J. F. & Crammond, D. J. Cerebral cortical mechanisms of reaching movements. Science 255, 1517–1523 (1992).
PadoaSchioppa, C., Li, C. S. & Bizzi, E. Neuronal correlates of kinematicstodynamics transformation in the supplementary motor area. Neuron 36, 751–765 (2002).
PadoaSchioppa, C., Li, C. S. & Bizzi, E. Neuronal activity in the supplementary motor area of monkeys adapting to a new dynamic environment. J Neurophysiol 91, 449–473 (2004).
Todorov, E. Stochastic optimal control and estimation methods adapted to the noise characteristics of the sensorimotor system. Neural Comput 17, 1084–1108 (2005).
Shmuelof, L., Krakauer, J. W. & Mazzoni, P. How is a motor skill learned? Change and invariance at the levels of task success and trajectory control. J Neurophysiol 108, 578–594 (2012).
Churchland, M. M., Afshar, A. & Shenoy, K. V. A central source of movement variability. Neuron 52, 1085–1096 (2006).
Wu, H. G., Miyamoto, Y. R., Gonzalez Castro, L. N., Olveczky, B. P. & Smith, M. A. Temporal structure of motor variability is dynamically regulated and predicts motor learning ability. Nat Neurosci 17, 312–321 (2014).
Kawato, M. Internal models for motor control and trajectory planning. Curr Opin Neurobiol 9, 718–727 (1999).
Miall, R. C., Weir, D. J., Wolpert, D. M. & Stein, J. F. Is the Cerebellum a Smith Predictor? J Mot Behav 25, 203–216 (1993).
Mehta, B. & Schaal, S. Forward models in visuomotor control. J Neurophysiol 88, 942–953 (2002).
Burdet, E., Osu, R., Franklin, D. W., Milner, T. E. & Kawato, M. The central nervous system stabilizes unstable dynamics by learning optimal impedance. Nature 414, 446–449 (2001).
Kirk, D. E. Optimal control theory : an introduction, (Dover Publications, Mineola, N.Y., 2004).
Li, W. & Todorov, E. Iterative linearization methods for approximately optimal control and estimation of nonlinear stochastic system. International Journal of Control 80, 1439–1453 (2007).
Scott, S. H. Optimal feedback control and the neural basis of volitional motor control. Nat Rev Neurosci 5, 532–546 (2004).
Scott, S. H. Inconvenient truths about neural processing in primary motor cortex. J Physiol 586, 1217–1224 (2008).
Diedrichsen, J., Shadmehr, R. & Ivry, R. B. The coordination of movement: optimal feedback control and beyond. Trends Cogn Sci 14, 31–39 (2010).
Scott, S. H. The computational and neural basis of voluntary motor control and planning. Trends Cogn Sci 16, 541–549 (2012).
Nashed, J. Y., Crevecoeur, F. & Scott, S. H. Rapid online selection between multiple motor plans. J Neurosci 34, 1769–1780 (2014).
Kurtzer, I., Herter, T. M. & Scott, S. H. Random change in cortical load representation suggests distinct control of posture and movement. Nat Neurosci 8, 498–504 (2005).
Pruszynski, J. A., Kurtzer, I. & Scott, S. H. Rapid motor responses are appropriately tuned to the metrics of a visuospatial task. J Neurophysiol 100, 224–238 (2008).
Kurtzer, I. L., Pruszynski, J. A. & Scott, S. H. Longlatency reflexes of the human arm reflect an internal model of limb dynamics. Curr Biol 18, 449–453 (2008).
Diedrichsen, J. Optimal taskdependent changes of bimanual feedback control and adaptation. Curr Biol 17, 1675–1679 (2007).
Hoff, B. & Arbib, M. A. A model of the effects of speed, accuracy and perturbaton on visually guided reaching. in Control of arm movement in space: Neurophysiological and computational approaches (ed. Caminiti, R. ) 285–306 (SpringerVerlag, New York, 1991).
Hirayama, M., Kawato, M. & Jordan, M. I. The Cascade Neural Network Model and a SpeedAccuracy TradeOff of Arm Movement. J Mot Behav 25, 162–174 (1993).
Shadmehr, R. & Krakauer, J. W. A computational neuroanatomy for motor control. Exp Brain Res 185, 359–381 (2008).
Osu, R. et al. Optimal impedance control for task achievement in the presence of signaldependent noise. J Neurophysiol 92, 1199–1215 (2004).
Osu, R., Morishige, K., Miyamoto, H. & Kawato, M. Feedforward impedance control efficiently reduce motor variability. Neurosci Res 65, 6–10 (2009).
Honeycutt, C. F., Kharouta, M. & Perreault, E. J. Evidence for reticulospinal contributions to coordinated finger movements in humans. J Neurophysiol 110, 1476–1483 (2013).
Nonnekes, J. et al. StartReact restores reaction time in HSP: evidence for subcortical release of a motor program. J Neurosci 34, 275–281 (2014).
Kawai, R. et al. Motor cortex is required for learning but not for executing a motor skill. Neuron 86, 800–812 (2015).
Gordon, J., Ghilardi, M. F. & Ghez, C. Accuracy of planar reaching movements. I. Independence of direction and extent variability. Exp Brain Res 99, 97–111 (1994).
McIntyre, J., Stratta, F., Droulez, J. & Lacquaniti, F. Analysis of pointing errors reveals properties of data representations and coordinate transformations within the central nervous system. Neural Comput 12, 2823–2855 (2000).
Jones, K. E., Hamilton, A. F. & Wolpert, D. M. Sources of signaldependent noise during isometric force production. J Neurophysiol 88, 1533–1544 (2002).
van Beers, R. J., Haggard, P. & Wolpert, D. M. The role of execution noise in movement variability. J Neurophysiol 91, 1050–1063 (2004).
Pollick, F. E. & Ishimura, G. The threedimensional curvature of straightahead movements. Journal of Motor Behavior 28, 271–279 (1996).
Acknowledgements
This work was supported by Funding Program for Next Generation WorldLeading Researchers and a contract with the National Institute of Information and Communications Technology entitled, ‘Development of network dynamics modeling methods for human brain data simulation systems’, Japan. We thank Dr Hirokazu Tanaka for his technical supports and helpful comments.
Author information
Authors and Affiliations
Contributions
R.O. and K.M. ran the experiments and wrote the manuscript. R.O., H.M. and M.K. planned experiments and simulations. K.M. executed simulations and prepared figures. J.N. and M.K. supervised the computational model. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Rights and permissions
This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
About this article
Cite this article
Osu, R., Morishige, Ki., Nakanishi, J. et al. Practice reduces task relevant variance modulation and forms nominal trajectory. Sci Rep 5, 17659 (2016). https://doi.org/10.1038/srep17659
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/srep17659
This article is cited by

Visuospatial information foraging describes search behavior in learning latent environmental features
Scientific Reports (2023)

From internal models toward metacognitive AI
Biological Cybernetics (2021)
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.