Sub-optimality in motor planning is retained throughout 9 days practice of 2250 trials

Optimality in motor planning, as well as accuracy in motor execution, is required to maximize expected gain under risk. In this study, we tested whether humans are able to update their motor planning. Participants performed a coincident timing task with an asymmetric gain function, in which optimal response timing to gain the highest total score depends on response variability. Their behaviours were then compared using a Bayesian optimal decision model. After 9 days of practicing 2250 trials, the total score increased, and temporal variance decreased. On the other hand, the participants showed consistent risk-seeking or risk-averse behaviour, preserving suboptimal motor planning. These results suggest that a human’s computational ability to calculate an optimal motor plan is limited, and it is difficult to improve it through repeated practice with a score feedback.


Inter-personal differences in strategy
Supplementary Figure 2 shows the observed mean response time against the SD of the response time for 12 of 15 participants. The result for the remaining 3 participants is presented in Figure 6. For clarity, we arranged participants in each column (risk-averse, risk-neutral, and risk-seeking) based on the difference between the observed and optimal mean response time in the last 10 blocks (i.e., day 8 & day 9).
of the measurements. Black curves indicate the optimal mean response time calculated using the Bayesian model (Equation 2). Grey curves indicate the 95% confidence intervals of the optimal mean response times obtained using a bootstrapping algorithm.

Consistency of motor planning under risk
We performed a regression analysis on the difference between the observed and optimal mean response times across days, from day 1 to day 9. Supplementary Table 1 shows the regression matrix. Each entry reports the slope of the regression line, the 95% confidence interval (CI) of the slope, the coefficient of determination ($R^2$), and the P value.
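As an illustration of the kind of analysis summarized in such a regression matrix, the following Python sketch regresses the later-day difference on the earlier-day difference across participants and returns the slope and $R^2$. The function names (`ols`, `bootstrap_slope_ci`) and the example data are hypothetical. The percentile bootstrap CI is a stand-in chosen so that the sketch needs only the standard library; it is not necessarily the method used for the table, and the P value is left out.

```python
import random


def ols(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r2).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    r2 = 1.0 - ss_res / syy if syy > 0 else float("nan")
    return slope, intercept, r2


def bootstrap_slope_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI of the slope, resampling (x, y) pairs.

    Assumption: this replaces the parametric CI that a statistics
    package would normally report.
    """
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if max(xs) == min(xs):  # degenerate resample: slope undefined
            continue
        ys = [y[i] for i in idx]
        slopes.append(ols(xs, ys)[0])
    slopes.sort()
    lo = slopes[int((alpha / 2) * n_boot)]
    hi = slopes[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


if __name__ == "__main__":
    # Hypothetical observed-minus-optimal differences (seconds), one per participant.
    day1 = [-0.30, -0.22, -0.15, -0.08, -0.02, 0.03, 0.07, 0.12, 0.18, 0.25]
    day9 = [-0.19, -0.17, -0.08, -0.07, 0.01, 0.00, 0.07, 0.06, 0.15, 0.16]
    slope, intercept, r2 = ols(day1, day9)
    print(slope, r2, bootstrap_slope_ci(day1, day9))
```

A slope near 1 with a high $R^2$ would indicate that each participant's deviation from the optimum is largely preserved between the two days.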

Consistency of distortion in utility function
We performed a regression analysis on the values of the exponential parameter of the utility function across days, from day 1 to day 9. Supplementary Table 2 shows the regression matrix. Each entry reports the slope of the regression line, the 95% confidence interval (CI) of the slope, the coefficient of determination ($R^2$), and the P value.

Model assumption based on Weber's law
We calculated the optimal mean response time based on a model that takes Weber's law into account. Here, response time refers to the time of the button press measured from the onset of the start signal (visual cue). In this model, the probability distribution of the response time $t$ was defined as a Gaussian distribution with mean $T$ and standard deviation $\sigma$, where $\sigma$ scales linearly with the planned response time $T$ with a constant coefficient of variation $k$ (ref. 1), as follows:

$$\sigma = kT, \qquad P(t \mid T) = \frac{1}{\sqrt{2\pi}\,kT} \exp\left(-\frac{(t - T)^2}{2 (kT)^2}\right).$$

Supplementary Figure 3a shows an example of the distributions when $k$ is 0.05.
The expected gain $EG(T)$ can be calculated by integrating the gain function under the Risk condition, $G(t)$, over the probability distribution $P(t \mid T)$:

$$EG(T) = \int_{-\infty}^{\infty} G(t)\, P(t \mid T)\, dt.$$
Supplementary Figure 3b shows the expected gain as a function of the planned response time $T$ when the Weber fraction $k$ is 0.05. We calculated the optimal mean response time under this model, $T'_{opt}$, by maximizing the expected gain. The deviation between $T_{opt}$, the optimal mean response time from the Bayesian model, and $T'_{opt}$ grows as the Weber fraction or the SD of the response time increases. The gray region indicates the 95% confidence interval (CI) of the SD of the response time obtained in the Risk condition. The deviation between $T_{opt}$ and $T'_{opt}$ was 19 ms (1951 ms - …), and the deviation that we found in the experiment was clearly larger than this (see Fig. 3e).
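The maximization described here can be sketched numerically. The Python example below integrates a gain function over a Gaussian whose SD is proportional to the planned response time (`T` in the code, with Weber fraction `k`) and then searches a grid of planned times for the maximum. The gain function, the 2.3 s target, the search range, and all function names are assumptions made for illustration; the actual gain function of the Risk condition is not given in this section, so the numbers produced will not match those reported above.

```python
import math

TARGET = 2.3  # seconds; hypothetical time after which the gain drops to zero


def gain(t):
    """Hypothetical asymmetric gain: rises linearly to 100 points at TARGET,
    then falls to zero for any later response."""
    return 100.0 * t / TARGET if t <= TARGET else 0.0


def gaussian_pdf(t, mean, sd):
    """Density of a Gaussian with the given mean and SD, evaluated at t."""
    z = (t - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def expected_gain(T, k, n=400):
    """Expected gain for planned time T when the SD equals k * T.

    Midpoint-rule integration of gain(t) * pdf(t | T) over T +/- 6 SD,
    truncated at TARGET because the gain is zero beyond it.
    """
    sd = k * T
    lo = T - 6.0 * sd
    hi = min(T + 6.0 * sd, TARGET)
    if hi <= lo:
        return 0.0
    dt = (hi - lo) / n
    total = 0.0
    for i in range(n):
        t = lo + (i + 0.5) * dt
        total += gain(t) * gaussian_pdf(t, T, sd)
    return total * dt


def optimal_planned_time(k, t_min=1.5, t_max=2.3, step=0.001):
    """Grid search for the planned time that maximizes the expected gain."""
    n_steps = int(round((t_max - t_min) / step))
    candidates = [t_min + i * step for i in range(n_steps + 1)]
    return max(candidates, key=lambda T: expected_gain(T, k))


if __name__ == "__main__":
    for k in (0.05, 0.10):
        print(k, optimal_planned_time(k))
```

Under this assumed gain function, a larger Weber fraction pushes the optimal planned time further ahead of the target, because a wider response distribution puts more probability mass in the zero-gain region.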
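Furthermore, based on the proportional variance model, we calculated the slope of the regression line between the difference between the observed mean response time $T_{obs}$ and the model's optimal mean response time $T'_{opt}$ on day 1 and that on day 9. The slope was 0.69 (Supplementary Fig. 3d). The corresponding regression slope for the difference between $T_{obs}$ and $T_{opt}$, the optimal mean response time from the Bayesian model, on day 1 and that on day 9 was 0.70 (Fig. 4a).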
We also conducted an additional experiment to measure each participant's Weber fraction. In this experiment, three participants (P2, P5, and P6) performed the task with four different timing intervals (800 ms, 2300 ms, 3800 ms, and 5300 ms) for 50 trials each. They were instructed to press a button aiming at these intervals. We assumed that the SD of a participant's response time, $\sigma$, is a linear function of the planned response time $T$, so that the response variance is $\sigma^2 = (kT + c)^2$. From the response variance and the mean response time data, we estimated $k$ and $c$. The estimated $k$ were 0.028, 0.033, and 0.037, and the estimated $c$ were 0.064, 0.066, and 0.032 for P2, P5, and P6, respectively. For these values of $k$, the deviations between $T_{opt}$ and $T'_{opt}$ were 1 ms (2160 ms - 2159 ms), 3 ms (2140 ms - 2137 ms), and 3 ms (2125 ms - 2122 ms).
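One way such a slope and intercept could be estimated is an ordinary least-squares line through the per-interval SD plotted against the per-interval mean response time. The sketch below assumes that approach, which the text does not confirm; the function names, the use of seconds as the unit, and the simulated data are all hypothetical.

```python
import random
import statistics


def summarize(trials_by_interval):
    """Return (means, sds) of response times, one value per target interval."""
    means = [statistics.mean(trials) for trials in trials_by_interval]
    sds = [statistics.stdev(trials) for trials in trials_by_interval]
    return means, sds


def fit_weber(means, sds):
    """Least-squares fit of sd = k * mean + c; returns (k, c).

    k plays the role of the Weber fraction and c of the constant term.
    """
    n = len(means)
    mx = sum(means) / n
    my = sum(sds) / n
    sxx = sum((m - mx) ** 2 for m in means)
    sxy = sum((m - mx) * (s - my) for m, s in zip(means, sds))
    k = sxy / sxx
    c = my - k * mx
    return k, c


if __name__ == "__main__":
    # Simulated participant: 50 trials at each of the four intervals (seconds),
    # generated with hypothetical true values k = 0.03 and c = 0.05.
    rng = random.Random(0)
    intervals = [0.8, 2.3, 3.8, 5.3]
    trials = [[rng.gauss(T, 0.03 * T + 0.05) for _ in range(50)] for T in intervals]
    means, sds = summarize(trials)
    print(fit_weber(means, sds))
```

With only four intervals and 50 trials each, the estimates from a fit like this are noisy, so the recovered values will scatter around the ones used to generate the data.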