Costs of position, velocity, and force requirements in optimal control induce triphasic muscle activation during reaching movement

The nervous system activates a pair of agonist and antagonist muscles to determine the muscle activation pattern for a desired movement. Although there is a problem with redundancy, it is solved immediately, and movements are generated with characteristic muscle activation patterns in which antagonistic muscle pairs show alternate bursts with a triphasic shape. To investigate the requirements for deriving this pattern, this study simulated arm movement numerically by adopting a musculoskeletal arm model and an optimal control. The simulation reproduced the triphasic electromyogram (EMG) pattern observed in a reaching movement using a cost function that considered three terms: end-point position, velocity, and force required; the function minimised neural input. The first, second, and third bursts of muscle activity were generated by the cost terms of position, velocity, and force, respectively. Thus, we concluded that the costs of position, velocity, and force requirements in optimal control can induce triphasic EMG patterns. Therefore, we suggest that the nervous system may control the body by using an optimal control mechanism that adopts the costs of position, velocity, and force required; these costs serve to initiate, decelerate, and stabilise movement, respectively.


Results
We simulated movement using an approximate OFC 35 and a two-joint six-muscle arm model 19 [BF], and biarticular extensor [BX] muscles). The simulation reproduced a centre-out reaching task, which required moving the hand to targets aligned 8 cm from the initial hand position. For base setting, we fixed the simulation duration at 500 ms and fixed the movement duration at 400 ms to fit the experimental data 2,19 . Then, we assumed four situations involving the various cost weights in Eq. (10) as Cases 1-4. In Case 1, the positional cost was active, and movement was only constrained in the terminal position. In Case 2, the cost terms of the position and velocity were active. This required movement under kinematic constraints. In Case 3, the force actively inhibited the end-point force at the target position. Finally, in Case 4, cost terms of the position, velocity, and force were active, which required regulation of the end-point position, velocity, and force states at the movement end. The task required stopping at the target without force generation.
Our model showed that the hand pathways varied slightly, forming curves or almost-straight lines in all cases, according to direction (Fig. 2, left column). However, the movement parameters and muscle activation patterns (Fig. 3) showed distinct specifications according to the cost functions. Although the muscles were selectively activated according to the target direction, which was defined as a counter-clockwise direction from the right (i.e. the x-direction), regardless of the case following the first muscle activation (AG1), the second (ANT) and third (AG2) bursts of the muscles varied temporally among cases.

Movement parameters.
In Case 1, the hand passed through on the targets and the speed did not converge on zero at the end of movement (i.e. 400 ms); the joint torques peaked at the onset of movement to generate the hand force triggering the movement (Figs. 2a, 4a). Then, the hand force is shown as the root mean square (RMS) computed as f 2 x + f 2 y , where f x and f y are the lateral and longitudinal forces generated at the hand in the horizontal plane, respectively (Fig. 2). The joint torques and hand force then diverged slightly at the movement end www.nature.com/scientificreports/ because a passive tensile force was generated by stretching the muscles. In Case 2, the hand stopped on the targets and the speed was almost set to zero at the movement end, although slight motions were observed after the movement end (Figs. 2b, 4b). The hand force and joint torques showed biphasic peaks. The second peak of the joint torques was generated in the direction opposite to the movement, thus decelerating the movement and setting the hand speed to zero at the movement end. However, the hand force and joint torques did not converge at zero at the movement end; instead, they gradually decreased. In Case 3, the hand passed through the targets and the speed was not zero at the movement end time, similar to the findings in Case 1; furthermore, the hand force and joint torques converged on zero at this time in contrast to Case 1 (Figs. 2c, 4c). In Case 4, the hand stopped on the targets and the speed was close to zero with gradual curves, forming clear bell-shaped profiles (Figs. 2d, 4d). Although the hand force and joint torques showed biphasic peaks similar to the findings in Case 2, they converged on zero at the movement end.
Muscle activation patterns. In Case 1, agonist muscles were activated once to initiate movement (Figs. 3a, 4e). Then, the monoarticular elbow muscles EF and EX showed no activation. In Case 2, activation of the agonist and antagonist muscles alternated similarly to AG1 and ANT burst, occurring immediately after the beginning and at the end of movement, respectively (Figs. 3b, 4f). The antagonist muscles might contribute torque in the direction opposite to the movement to decelerate the movement. However, EF and EX were activated only as  www.nature.com/scientificreports/ antagonist muscles synchronising with the biarticular muscles BF or BX. In Case 3, the agonist muscles were activated twice, similar to bursts of AG1 and AG2 (Figs. 3c, 4g). The second activations might suppress the development of hand force and joint torques at the end of movement. Muscles EF and EX were weakly activated to supplement BF or BX. In Case 4, muscle activations showed a triphasic pattern resembling the superposition of activations in Cases 2 and 3, such that the agonist and antagonist muscles were activated in an alternating manner; however, the agonist muscles were occasionally activated twice (Fig. 3d). Notably, when the SF muscle acted as an agonist (e.g. 90° movement direction; Fig. 4h: left), it was activated twice, at the start and end of movement; it was activated only once in the middle of movement when it was an antagonist muscle (e.g. 270° movement direction; Fig. 4h: right). The later activation of agonist muscles may contribute to movement stabilisation because the later torque observed in Case 2, which the antagonist muscles generated, may induce unnecessary movement after reaching the target. Thus, the additional torque generated by the agonist muscles was required to counteract opposite torques and stabilise movement after completing the task. The costs of the position, velocity, and force corresponded to AG1, ANT, and AG2, respectively. Thus, Case 4 could only reproduce the triphasic muscle activation pattern.

Stabilisation control.
Our results suggest that the force cost reactivates the agonist muscle to counteract the breaking torques for stabilisation. An alternate hypothesis is that to achieve a reaching movement within a certain timeframe, the stability of the terminal kinematic state (i.e. position, or position and velocity) must be optimised after the movement phase, but not to minimise the terminal cost solely at the movement end. To investigate this hypothesis, we defined two extra cost functions for stabilisation control as shown in Eq. (11).
Although the cost function in Eq. (11) showed the biphasic muscle pattern present in Case 2 (Fig. 5c), the cost function in Eq. (12) reproduced the triphasic muscle pattern present in Case 4 (Fig. 5d). Accordingly, we could not reject the hypothesis of stabilisation control represented by Eq. (12). However, stabilisation control may regulate a cost term differential because the stabilisation of Eq. (11), which adopts only the position cost, decreased the hand speed in a manner similar to Case 2 (Fig. 5a). Therefore, we presume that the stabilisation control of Eq. (12) suppressed hand acceleration during the stabilisation period (Fig. 5b), and it played a role similar to the force cost in Case 4. www.nature.com/scientificreports/

Effects of movement duration on muscle activation patterns. As the movement duration was
shorter and the velocity of movement increased, the muscle activities were strengthened while the activation duration decreased. Slower movements were associated with prolonged activation from the agonist muscle with little or no antagonist activity, although the agonist bursts twice 38 . We evaluated the effects in Case 4 and the stabilisation control adopting the position and velocity costs to change their movement durations to 0.2 s, 0.8 s, and 1.0 s under the simulation time at 1.1 s (Fig. 6). The triphasic pattern in Case 4 was consistent, regardless of movement duration, although the bursts were weakened under prolonged duration when the movement duration was long ( Fig. 6a-c). In the stabilisation control, the muscle activation pattern changed from triphasic to biphasic for the shoulder muscles (SF and SX) at the movement end 1.0 s (Fig. 6f), although it was a similar effect to Case 4 ( Fig. 6d-f). Deformation of the muscle activation pattern occurred because of a decrease in the antagonist muscle burst at long movement durations. The agonist muscle SF and SX (i.e. AG2) were not required to negate the breaking torques induced by the antagonist muscle ANT. However, in Case 4, the agonist muscles were required to compensate the passive tensile force of the stretched muscles according to force cost, regardless of movement duration.

Discussion
This study investigated the requirements for deriving muscle activation patterns, in which antagonistic muscle pairs show alternating bursts with triphasic shapes. To achieve this aim, we carried out simulations of arm movements and applied four types of cost function, considering the end-point position, velocity, and force www.nature.com/scientificreports/ requirements under minimisation of the neural input. To summarise the results, the position, velocity, and force costs were found to play following roles: (1) the position cost led to the activation of the first agonist muscle, triggering movement; (2) the velocity cost activated the antagonist muscles and generated braking torques to decelerate the movement; and (3) the force cost reactivated the agonist muscle to negate the breaking torques and the passive tensile force of the stretched muscles, providing stabilisation. Other stabilising cost functions of the terminal position and velocity after the movement phase also produced the triphasic muscle pattern. However, the pattern was deformed at long movement duration, while the cost function adopting the position, velocity, and force exhibited a consistent pattern, regardless of movement duration. Taken together, our results suggest that triphasic muscle activation is induced by the costs of the position, velocity, and force at the movement end. Our control methodology was only approximately optimal, given the challenge of solving the optimal control problem analytically in nonlinear systems such as the musculoskeletal system. According to the original OFC model for a linear dynamics plant whose state variable x(t) comprises the position, velocity, and force 16 , the motor command is represented as·u(t) = K(t) ·x(t) , where K(t) is the feedback gain and x(t) is the estimated state integrating the prediction of the internal model (i.e. feedforward model) with the sensory feedback similar to a Kalman filter. Because the Kalman filter allowed a delay in the feedback depending on the physiological systems, the feedback delays did not directly affect feedback gains. The components of the feedback gain for the position,  www.nature.com/scientificreports/ movements that generated positive motor command, and the velocity and force gains contributed to decelerate and stabilise the movement (Fig. 7b), thus supporting our hypothesis regarding the roles of the cost terms. This approach may close the gap between ideal arm movements in the horizontal plane and actual nonlinear dynamics in neurophysiological systems. Indeed, it has been difficult to acquire optimal solutions for more complex and realistic motor control, such as multi-dimensional motions that recruit numerous muscles; however, OFC may be provided by a neural substrate in M1, integrating joint motion information into joint torque for fast feedback control 39,40 . Thus, trajectory planning in kinematics and the muscle activation pattern may be generated in separate steps 41,42 . Based on this idea, a hierarchical control framework has been suggested, which divides the control problem into high-level and low-level controllers 42,43 . The high-level controller is designed to capture the main features of the complex high-dimensional plant dynamics, although it exhibits reduced dimensionality, similar to a mass-point dynamics system. The low-level controller generates arm configurations and muscle activations to exactly match the high-level controls, while concurrently satisfying biological constraints. Although the solutions of this framework are essentially identical to the solutions acquired by the approach used in this work, the optimal solution for this system was acquired and we did not apply any external load; therefore, it could potentially be used to understand how neural systems handle natural biological motions.
Several studies in patients with motor disorders have demonstrated that the basal ganglia may have a role in scaling the size of AG1, reinforcing voluntary command and inhibiting inappropriate EMG activity 6 . Therefore, we assumed that in the basal ganglia module, the positional cost weight w p is related to AG1. The cerebrum may regulate the other cost weights w v and w f to balance w p , because the cerebellum plays a role in timing the voluntary bursts of ANT and AG2. Indeed, cerebellar patients are unable to perform accurate movements 44 . This deficit is known as dysmetria, which is a lack of coordination of movement that results in overshooting or undershooting a target during reaching tasks.
Dehghani and Bahrami suggested that arm movements are planned with some principal patterns of muscle synergies; the plans can be divided into relatively few phases to reduce the dimension of the control space 45,46 . Furthermore, Sakaguchi et al. proposed a computational model in which the brain adaptively divides the continuous-time axis into discrete segments and executes feedforward control in each segment to allow sensorimotor delays 47 . However, the OFC realised control of the muscle synergies through feedback gains, and the segmentation of motor execution may have been identified as steady-state feedback gains computed by a model predictive control under the framework of the OFC 23 .
In previous studies that used similar muscle models 25,35 , the tensile force of the passive elastic component was excluded from the muscle model (i.e. F PE2 in Eq. (8)), despite the size of the force. We assume that this was www.nature.com/scientificreports/ excluded because of parameter selection sensitivity for the muscle model, such as the moment arms, optimal muscle lengths, and optimal joint angles, which affect the muscle length and generate a tensile force without muscle activation. In our model, the tensile force requires initial muscle activation to maintain the initial position before movement onset; this affects the hand trajectories, causing marked distortion. Thus, we set the initial hand position to an equilibrium point balancing the forces, ensuring that the initial muscle activations remain at zero. With such sensitive effects, it is important to investigate how movement can be generated and controlled. Previous studies suggested that the neurons are optimised for physical mechanics [23][24][25] . However, if individuals have learned fine motor tasks once, such learnings tend to persist, regardless of whether they are clearly suboptimal 48 . Subsequently, individuals learn to overcome real and virtual changes in their biomechanics, but prefer to rescale their prior motor habits, rather than recomputing to optimise the control policy 49 . The prior motor habits may be learned from muscle activation patterns generated by lower sensorimotor circuitry that is functionally suboptimal. However, we ignored the lower sensorimotor circuitry, which is a limitation of our model. Recently, a model of the spinal circuitry and musculoskeletal system was developed 50 , which may be useful for studying habitual motor control systems.
The OFC framework has another potential limitation. The model cannot predict movement duration, but requires this duration as an input. Indeed, the amplitudes and timings of the muscle activities and feedback gains are directly related to movement time and velocity. According to the findings in a previous study 51 , the movement duration affects the cost function in optimal control, through a mechanism equivalent to reward delay; this is because the passage of time reduces the value of a reward in the human brain 52 . Shadmehr et al. suggested a cost function introducing the cost of time as a hyperbolic function to the OFC-like cost function 51 . However, the cost of time is dependent on parameters in the hyperbolic function of the model, and the cost appears to be influenced by several factors (e.g. life span 53 ).
In summary, the results of this study imply that a triphasic muscle activation pattern can be predicted by an optimal control mechanism (e.g. by adopting an OFC-like cost function). Furthermore, the costs of position, velocity, and force requirements may be critical parameters for the physiological control of movements; they correspond to the triggering, braking, and stabilising of movement, respectively.

Methods
We considered the monkey's arm to be a two-joint arm composed of the shoulder and elbow joints. The joint angles were defined as vector θ = [θ 1 , θ 2 ] T , where θ 1 and θ 2 indicate the shoulder and elbow variables, respectively. Suppose joint torque τ ∈ R 2 , the dynamics of the monkey's arm in horizontal plane are denoted by where M(·) ∊ R 2×2 , C(·) ∊ R 2 , and D ∊ R 2×2 are the inertia matrix, Coriolis force vector, and viscosity matrix, respectively, and are given by They are represented by the link parameters: mass m i , length l i , distance from the joint centre of mass l gi , moment of inertia I i , joint friction d i1 , d i2 (i = 1, upper arm; i = 2, forearm). The parameters are shown in Table 1, and were estimated from our measurements of a Japanese macaque. Because many muscles act on the arm in the horizontal plane, we modelled only two degrees of freedom actuated by six muscle groups: SF, shoulder flexor; SX, shoulder extensor; BF, biarticulate flexor; BX, biarticulate extensor; EF, elbow flexor; and EX, elbow extensor (Fig. 8a). The joint torque is a function of its moment arms A ∈ R 2×6 and the muscle tension vector T = [T 1 , T 2 , ˑˑˑ, T 6 ] T , and it is given by τ = A·T. The moment arm is defined as the perpendicular distance from the muscle line of action to the joint centre of rotation, given by (1) τ = M(θ) ·θ + C(θ,θ) + D ·θ, M(θ) = s 1 + 2s 2 cos θ 2 s 3 + s 2 cos θ 2 s 3 + s 2 cos θ 2 s 3 , (2) A = 1.5 −1.   www.nature.com/scientificreports/ The jth muscle activation a j (j = 1, 2, …, 6) is not equal to the instantaneous neural input u j ; instead, it is generated by passing u j through a filter that describes the calcium dynamics modelled with a first-order nonlinear filter 54 : where The time constant parameters were set as t act = 0.05 [s] and t deacct = 0.066 [s] because the input-dependent activation dynamics are faster than the constant deactivation dynamics (Fig. 8b).
Mammalian muscles have remarkable scaling properties, meaning that all are similar after proper normalisation: length is expressed in units of L 0 j the length at which the maximum isometric force is generated 55 , and velocity is expressed in units L 0 j per second 56 . Thus we assume that the units are unified across all muscles as L 0 = 0.08 [m], and we denote a normalised muscle length L j as follows: where A j is the jth row vector of A, θ 0 j ∈ R 2 is the optimal joint angle vector of the jth muscle for generating the maximal torque in the shoulder and elbow, respectively. Then the muscle tension T j is given to scale the unitless tension by the absolute muscle force of the physiological cross-sectional area (PCSA) to yield the physical tension 57 : where F a is the absolute muscle force and is set to F a = 32 [N/cm 2 ] based on measurements in monkeys 58 , and P is the PCSA that is assumed to be uniform across muscles as P = 10 [cm 2 ]. According to a model of mammalian skeletal muscle 56,59 , the unit-less muscle tension is produced by a nonlinear muscle model composed of the functions of contractile element F CE (·) and passive elastic element F PE (·): where Here, A f (·), F L (·), F V (·) are the functions of the activation-frequency relationship, the tetanic force-length relationship, and the tetanic force-velocity relationship, respectively. The passive elastic force is represented by two separate functions, F PE1 (·) and F PE2 (·), which exert a tensile force and resist compression force, respectively. They are defined as follows: (4) www.nature.com/scientificreports/ The dependence of force on the length and velocity of a muscle are often referred to as the force-length and force-velocity curves, respectively (Fig. 8c). The muscle model parameters are shown in Table 2; these were assigned based on the results of anatomical measurements of macaque monkeys 58,60 . Approximately optimal feedback control. We transformed the two-joint six-muscle model into a state-space model. The control object was denoted by the state vector x ∊ R 10 as where a = [a 1 , a 2 , …, a 6 ] T is the muscle activation vector. Here, we define the state vector at time t as x(t). Then, the dynamics of the musculoskeletal arm model can be written as a state-space equation described as.
The nonlinear functions F(·) and G(·) are defined for descriptive purposes to represent the dynamics in an affine form. In practice, they are given as locally linearised forms around each state at time t to obtain an approximately OFC law. The motor command u(t) is given by u(t) = u(t) + K(t)•(x(t) − x(t)), where u(t) is an open-loop control component, K(t) is a feedback gain, and x(t) is a nominal trajectory. The parameters are computed iteratively to optimise u(t) using the Levenberg-Marquardt algorithm to minimise the following cost function: where T s and T are the terminal time of movement and simulation duration, respectively. The iterative optimisation process finally brings an outcome converging K(t) ≈ 0 and x(t) ≈ x(t). Note that we set the times T s = 0. where J(t) ∈ R 2×2 is the Jacobian matrix In addition, p* is a target position in Cartesian space, and w p , w v , and w f are cost weights of the position, velocity, and force requirements at the movement end, respectively. On the right-hand side of Eq. (10), the first, x(t) = F(x(t)) + G(x(t)) · u(t).
l 1 cos θ 1 (t) + l 2 cos(θ 1 (t) + θ 2 (t)) l 1 sin θ 1 (t) + l 2 sin(θ 1 (t) + θ 2 (t)) , v(t) = J(t) ·θ(t), J(t) = −l 1 sin θ 1 (t) − l 2 sin(θ 1 (t) + θ 2 (t)) −l 2 sin(θ 1 (t) + θ 2 (t)) l 1 cos θ 1 (t) + l 2 cos(θ 1 (t) + θ 2 (t)) l 2 cos(θ 1 (t) + θ 2 (t)) . www.nature.com/scientificreports/ second, and third terms evaluate the requirements of the end-point position, velocity, and force, respectively, at the end of movement in achieving the desired state, whereby the end-point is close to the target position with zero values for the end-point velocity and force. The fourth term, which is the sum of the squares of the neural inputs during the movement, evaluates the metabolic cost of the neural input. This plays a role in minimising the end-point variance at the movement end under the condition of motor noise, known as signal-dependent noise 61 . However, the noise was not incorporated into this model for simplification, enabling us to focus on the muscle activities excluding external effects.
Cost weight parameters. We simulated four situations requiring movement under minimised neural input by balancing the cost weights in Eq. (10) ( Table 3). These were determined heuristically to achieve the task. We confirmed that the results of this model were robust under parameter changes by performing a sensitivity analysis (see Supplementary Note 1).

Stabilisation control.
We examined an alternate hypothesis that stabilises the terminal position, or position and velocity, after the movement phase; however, it does not minimise only the terminal cost at the movement end. Thus, we additionally defined following two cost functions for stabilisation control: