The online and offline effects of changing movement timing variability during training on a finger-opposition task

In motor learning tasks, there is mixed evidence for whether increased task-relevant variability in early learning stages leads to improved outcomes. One problem is that there may be a connection between skill level and motor variability, such that participants who initially have more variability may also perform worse on the task, so will have more room to improve. To avoid this confound, we experimentally manipulated the amount of movement timing variability (MTV) during training to test whether it improves performance. Based on previous studies showing that most of the improvement in finger-opposition tasks comes from optimizing the relative onset time of the finger movements, we used auditory cues (beeps) to guide the onset times of sequential movements during a training session, and then assessed motor performance after the intervention. Participants were assigned to three groups that either: (a) followed a prescribed random rhythm for their finger touches (Variable MTV), (b) followed a fixed rhythm (Fixed control MTV), or (c) produced the entire sequence following a single beep (Unsupervised control MTV). While the intervention was successful in increasing MTV during training for the Variable group, it did not lead to improved outcomes post-training compared to either control group, and the use of fixed timing led to significantly worse performance compared to the Unsupervised control group. These results suggest that manipulating MTV through auditory cues does not produce greater learning than unconstrained training in motor sequence tasks.

Jason Friedman 1,2*, Assaf Amiaz 2 & Maria Korman 3
Skilled motor performance is obtained through repetitive practice that drives improvements in smoothness, speed and accuracy of movement execution [1][2][3] . However, in terms of their kinematic 4 and kinetic 5 characteristics, repetitions of the same movements are never identical; this phenomenon is called motor or movement-to-movement variability. Several different processes have been suggested to explain the omnipresence of motor variability in simple and complex movements 6 . First, the motor commands originate from inherently noisy sensory-motor processes integrating central and peripheral neural signals into muscle force [7][8][9][10] . Motor variability is a consequence of this noise and, thus, an error marker of the system, i.e., the mechanisms responsible for muscle activations are inherently inaccurate. According to this view, motor variability is an often undesirable characteristic of motor performance 7,10,11 . Nevertheless, a complementary and not mutually exclusive notion regards the functional role of motor variability, which suggests that it is generated by the central nervous system to foster the exploration of the large number of possible solutions of motor control in a given task 3,12 . One widely accepted notion is that variability decreases with motor learning 13 .
Motor redundancy, that is, the many degrees of freedom in the human body (joints and muscles), is a key adaptive characteristic of the motor system enabling multiple movement solutions for a given task 14,15 . Motor variability subserves this flexibility in reaching an optimal solution among many possible alternatives in a cost-effective way [16][17][18] , a notion that is conceptualized as structure learning of a motor task that limits movements to a subspace of all possible movements 19 . Accordingly, amplification of variability may promote motor learning through action exploration 12,20 . It was proposed that motor variability is actively modified through learning, leading to decreased task-relevant variability while task-irrelevant variability remains high, resulting in gains in performance accuracy 16,21 .
The experimental evidence on the role of variability in predicting inter-individual differences in motor learning is rather inconclusive. Since the publication of "schema theory" by Schmidt 22 26 . We hypothesized that high MTV, both imposed (Variable group) and natural (Unsupervised control group), would lead to larger learning gains (i.e., relatively more sequences performed) compared to low MTV (Fixed control group), immediately after training and also offline at 24 h retest. Limited processing capacity, i.e., load on working memory (active storage and manipulation of information 45 ), may constrain motor learning and performance 46,47 . Thus, we predicted that the inclusion of supervised MTV in training in the Variable group may benefit some learning processes but concurrently may have costs due to increasing the cognitive load. Specifically, we expected that reaction times to the auditory cues during imposed variable training will be slower compared to fixed imposed or self-controlled rate of movement execution during training.
To examine the effect of training-related differences in MTV, we examined how MTV changes throughout the phases of learning across the groups. We decomposed the observed variability into its underlying kinematic components, and then tested the effect of these differences in variability on the standard outcome measures: improvement in the number of correct sequences performed and the number of errors.

Results
Reaction time differences between the groups. We first explored the differences in the training session induced by the three different protocols. We observed a difference in the "reaction time" (RT), defined as the time between the last beep and the first movement made, during the training blocks, see Fig. 1. An ANOVA found a main effect of group (F(2,54) = 18.960, p < 0.001), with post-hoc t-tests showing that the RT was the fastest in the Unsupervised control group (0.28 ± 0.01 s), which was significantly faster than the Fixed control group (0.49 ± 0.04 s, t(36) = 5.01, p < 0.001), which was significantly faster than the Variable group (0.78 ± 0.05 s, t(35) = 4.52, p < 0.001).
Movement Timing Variability (MTV) induced by the training. Next, we examined how the variability of the differences between touch times (i.e., when the thumb and other finger touched) changed during the different sessions, quantified using the coefficient of variation (CV, the standard deviation divided by the mean), shown in Fig. 2. The CV was used to avoid effects caused by the correlation of mean and standard deviation. A mixed-design ANOVA showed a main effect of session (F(3,162) = 6.906, p < 0.001): the CV was lower at 24 h (0.178 ± 0.006) compared to both the Pretest (0.203 ± 0.007, t(112) = 2.64, p = 0.01) and the immediate Posttest (0.208 ± 0.009, t(112) = 2.85, p = 0.005), indicating overnight consolidation of the CV. In addition, an interaction of session and group (F(6,162) = 5.610, p < 0.001) was observed. Post-hoc tests showed that during training, the Variable group showed higher CV (0.254 ± 0.012) than the Unsupervised control group (0.172 ± 0.013, t(36) = 4.51, p < 0.001), which showed higher CV than the Fixed control group (0.137 ± 0.011, t(36) = 2.05, p = 0.048). However, this effect was transient: no significant difference was observed between groups in any other session (t-tests showed all p > 0.05). Importantly, at baseline (Pretest), there were no significant differences in CV between the groups. There are three possible ways that an increase in CV may be achieved during training: (a) by increasing variability in the time taken to move the finger, (b) by increasing variability in the time the fingers touch, and (c) by increasing variability in the pauses between the movements. These three potential explanations are explored in the next section.
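The CV measure described above can be illustrated with a minimal Python sketch. This is not the authors' Matlab analysis: the function name, the example touch times, and the use of the sample (n - 1) standard deviation are assumptions made for illustration, and the averaging across movements and blocks reported in Fig. 2 is omitted.

```python
from statistics import mean, stdev

def coefficient_of_variation(touch_times):
    """CV (SD / mean) of the intervals between successive touch onsets."""
    intervals = [b - a for a, b in zip(touch_times, touch_times[1:])]
    if len(intervals) < 2:
        raise ValueError("need at least three touch times")
    # Assumption: sample SD (n - 1 denominator); the text does not specify.
    return stdev(intervals) / mean(intervals)

# Hypothetical touch onset times (s) for one 5-touch sequence each
even_rhythm = [0.0, 0.30, 0.60, 0.90, 1.20]
uneven_rhythm = [0.0, 0.20, 0.60, 0.85, 1.20]

cv_even = coefficient_of_variation(even_rhythm)
cv_uneven = coefficient_of_variation(uneven_rhythm)
print(f"even: {cv_even:.3f}, uneven: {cv_uneven:.3f}")
```

Both example sequences take the same total time (1.2 s), so the difference in CV reflects only how unevenly the touches are spaced, which is the quantity the Variable training was intended to raise.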
Variance of the decomposed sequence. To further examine the differences in movement timing variability induced in the three groups, we decomposed the time taken to perform each sequence in the training session and calculated the standard deviation of each of the three parts, shown in Fig. 3. It is important to examine the components individually because as we previously showed 2,44 , the time course of learning, as well as the overall contribution to improvement in performance, is not the same across different components. Specifically, the total time for a sequence was decomposed into the movement time (the time taken for the appropriate finger to move towards the thumb), the touch time (how long the finger and thumb touch), and the inter-movement interval (the time between releasing the touch, and when the next finger starts moving). Full definitions are given in the methods section. For the decomposition into parts, we used standard deviation rather than CV because the values for the inter-movement interval approach zero for some participants, which leads to very large values for the CV. To examine the effects of training type, we performed a one-way ANOVA on the three components during training.
Parallel analyses were performed for the main outcome measures of performance speed and accuracy themselves (and not their variability), as described below.
Improvement in number of correct sequences performed. The relative online (Posttest) and offline, overnight (24 h test) improvements in performance speed (the increase in the mean number of correct sequences performed in 30 s test blocks relative to the mean number performed in the first two blocks of the Pretest) are shown in Fig. 4. A main effect of session was observed (F(1,53) = 5.886, p = 0.019), with higher performance at 24 h (6.34 ± 0.36 sequences) than at Posttest (5.05 ± 0.37 sequences). A main effect of group was also observed (F(2,53) = 3.694, p = 0.031), but no interaction. There was also no main effect of reaction time (F(1,53) = 0.473, p = 0.494). Post-hoc tests showed that the Unsupervised control group showed significantly more improvement (6.75 ± 0.56 sequences) than the Fixed control group (4.62 ± 0.49 sequences, p = 0.032), but the differences between the Fixed control and the Variable (5.72 ± 0.65 sequences, p = 0.353) and between the Unsupervised control and the Variable groups were not significant (p = 0.353). These results were supported by a Bayesian repeated measures ANOVA, see Table 1. The best model for describing the data was the same as found in the non-Bayesian ANOVA (i.e., a main effect of session and group, but no interaction or main effect of RT). Post-hoc Bayesian t-tests (see Table 2) similarly showed strong evidence for differences between Fixed control and Unsupervised control groups (BF 10 = 58), but only anecdotal evidence for differences between the other combinations.
Figure 2. Coefficient of variation (CV) for movement timing variability between touching the thumb and the other finger, for the three groups, averaged across movements in the different blocks. The faint circles are data for all subjects, dark dots are the mean, error bars are the standard error for each group. The black bars indicate significant between-group differences, the red bars indicate significant differences between sessions. *p ≤ 0.05, **p ≤ 0.01, ***p ≤ 0.001. N = 19 for all groups.
The results in terms of number of correct sequences (rather than relative improvement) are presented in the Supplementary Material (SM-1). We note that nearly all subjects improved as a result of the training, and at Posttest the improvement was significantly greater than 0 (i.e., all groups improved during training), as shown by a one-sample t-test (t(56) = 13.619, p < 0.001).
We compared the errors at each of the three time points using the Kruskal-Wallis test (as many subjects made no errors, the distributions cannot be normal). We found a significant difference only at Posttest (H(2) = 7.209, p = 0.027). Post-hoc Mann-Whitney U-tests found that the Unsupervised control group (median 0.00, IQR 0.44) showed significantly fewer errors than the Variable group (median 0.75, IQR 0.75; U = 459, p = 0.0083).
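For readers unfamiliar with the rank-based test used here, the following Python sketch computes the Kruskal-Wallis H statistic with the standard tie correction, which matters when many participants share the same error count. The function name and the example data are hypothetical, and this is an illustration rather than the authors' analysis code. With three groups, H is referred to a chi-square distribution with 2 degrees of freedom, whose upper tail is exactly exp(-H/2); applying this to the reported H recovers the reported p value.

```python
import math
from collections import Counter

def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic with the standard tie correction."""
    pooled = sorted(x for g in groups for x in g)
    n_total = len(pooled)
    # Average rank for each distinct value (tied values share the mean rank)
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)
    ties = Counter(pooled).values()
    correction = 1 - sum(t**3 - t for t in ties) / (n_total**3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction

# Hypothetical mean errors per test block for three small groups
unsupervised = [0.0, 0.0, 0.25, 0.0, 0.0, 0.5]
fixed = [0.0, 0.5, 0.25, 0.0, 1.0, 0.25]
variable = [0.75, 0.5, 1.0, 0.25, 1.5, 0.75]
h_example = kruskal_wallis_h(unsupervised, fixed, variable)
print(f"H = {h_example:.3f}")

# Chi-square tail with 2 df for the reported statistic (about 0.027)
p_reported = math.exp(-7.209 / 2)
```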
Decomposition of time taken to complete sequence. We decomposed the time taken to complete each sequence into four parts (see Fig. 5): movement time (the time the finger closes towards the thumb), touch time (when the finger is touching the thumb), and the inter-movement time (the time between finishing moving the previous finger and starting to move the next finger), which was further divided into the time between movements in the same sequence and the time between sequences.
No main effect of group was observed for any of the four measures, while an interaction of session and group was observed only for the inter-movement interval within sequence (F(4,108) = 2.437, p = 0.05). Post-hoc tests showed that all groups showed a significant improvement (a decrease) in inter-movement interval from Pretest to Posttest (Fixed control: 0.129 ± 0.200 s, p = 0.011; Variable: 0.315 ± 0.220 s, p < 0.001; Unsupervised control: 0.230 ± 0.193 s, p < 0.001). During consolidation, a significant improvement (i.e., greater than zero) was shown for the Fixed control (0.086 ± 0.113 s, p = 0.004) and Unsupervised control groups (0.066 ± 0.105 s, p = 0.013), but not for the Variable group (0.015 ± 0.120 s, p = 0.609), although the effect sizes were small.

Discussion
Acquisition of a new motor skill is often measured as a more fluent and faster execution of a task without sacrificing accuracy. The role of movement timing variability (MTV) in within-training and post-training motor sequence learning is largely unknown. The current study provides several novel empirical observations on the matter through comparison between training outcomes under conditions of supervised and unsupervised MTV in different groups of young adult participants practicing a novel 5-element finger-opposition sequence of movements. Two supervised MTV training conditions relied on the affordance of an instructional 5-element auditory rhythmic sequence, fixed or variable, heard before each movement sequence repetition. The Unsupervised control group received a single auditory cue indicating the start of the whole sequence without specification of the rhythm of finger movements. Our findings suggest that supervised MTV induced high cognitive load, evident in significantly slower RTs (time from the last beep to the onset of the first sequence movement) in the Fixed control and Variable groups compared to the Unsupervised control group. Moreover, the supervised groups also differed from each other: the Fixed control group had significantly faster RTs than the Variable group, likely due to the difference in predictability of the sequence's pace (Fig. 1). Training in supervised conditions in the current protocol essentially imposes dual-tasking (a classical paradigm to assess the impact of cognitive load on performance) by requiring participants to keep the mental representation of the auditory sequence in working memory and to apply it to the execution of finger movements. Slower RTs in supervised MTV conditions are consistent with prior findings, e.g., dual-tasking during sequence execution leading to increased sequence element duration 48 . A recent study, however, found that the learning gains in performance did not deteriorate under dual-tasking 49 .
In the light of this finding, we assume that the effects of supervised MTV and the effects of dual-tasking in the current study could be independent. However, this assumption should be empirically tested in future studies.
Auditory instruction for the three MTV training conditions differentially impacted the variability during the training session, as expected (Fig. 2): the coefficient of variation (CV) of the timing of finger touching events during the training session was higher in the Variable group than in the Unsupervised control group, which showed higher CV than the Fixed control group. However, this effect was transient: no significant differences in the CV were observed between groups even immediately post-training. Over the consolidation period, all three groups showed a significant reduction in CV, but without between-group differences. Previous studies have also shown that variability reduces over time 21 . These results suggest that differences in MTV during training had no direct impact on the MTV of consolidated skill performance.
The imposed instructions in the Variable group robustly caused an increase in variability in a task-relevant parameter (inter-movement intervals, or gaps within-sequences), which has been previously shown to be responsible for most of the observed improvements in performance with training 2,44 . Despite this increase in task-relevant variability, participants did not improve more than self-paced participants post-training. Rather, at the level of sequence performance, the Unsupervised control group showed the highest gains in speed at 24 h post-training, while the differences between the Fixed control and the Variable groups and between the Unsupervised control and the Variable groups were not significant. When we consider the Fixed condition as a control to the Variable group, we did not observe a significant difference in improvement following the training. In contrast, there was a significant difference between the Unsupervised control and Fixed groups, suggesting that these two groups define two extremes of magnitude of MTV in this protocol, as intended. Decomposition of the movements into four kinematic components (finger movement time, touch time, and the inter-movement time between movements in the same sequence, and between sequences) 44 , showed that all the components contributed to the improvement, with the largest improvement in inter-movement interval within a sequence (see the Supplementary Material SM-2). This component also improved further in overnight consolidation, except for the Variable group. After 24 h, all groups showed a reduction in variability, as has been observed previously in learning tasks.
As reaction time may be a proxy for cognitive load, we included reaction time as a covariate in analysis of the improvement and number of sequences performed, but we did not find a significant effect of reaction time on the outcomes. Thus while the reaction times differed across groups, within groups reaction time did not predict performance, although to test this more thoroughly a larger sample size would be preferable.
Previous studies have performed manipulations that may not have been described as variability manipulations but in fact were (e.g., self-paced vs. evenly-paced), and showed that the long-term learning outcomes may differ 2 as a function of movement timing control during training. In contrast to this and other studies that intentionally targeted movement variability 3,25 , here we manipulated the variability through instruction rather than comparing the effect of baseline variability on performance. In this way, we attempted to avoid the confound that subjects with high variability are likely to have low baselines 6,50 and thus more room for improvement. An unavoidable limitation of the current protocol is that this intervention changed the nature of the task between groups, which may have partially been responsible for the relative lack of improvement for the Variable group. Additionally, the same relative timing was used for all subjects in the Variable group. This may explain the relative disadvantage of imposing variable MTV: such training could be very different from the pre-existing coordination strategy and also could interfere with the process of gradual evolution of the innate, pre-existing coordination strategy into new motor patterns, as normally happens in the Unsupervised control condition (see Fig. 5). An additional limitation of the study is that we did not test the baseline cognitive or motor differences between the participants.
There is mounting evidence that the role of motor variability depends on the way the variability is induced (e.g., in the task space, affecting task performance, or in the null space, not affecting performance; naturally occurring or imposed) 27,[51][52][53] . Current findings on the imposed temporal variability are in line with the notion that variability is a complex construct 26,53 and that changing the task demands via extrinsically induced movement timing variability should be carefully considered in terms of both the immediate and delayed effects on motor learning and memory.
The broader practical implications of the current study are that development of motor training programs for typical and special populations should take into account the temporal structure of the instructions given, as the temporal organization type of the instructions can impact the learning outcomes even when otherwise the same amount of practice is performed. This was demonstrated in this and other studies 2,54-56 .
Altogether, our findings provide empirical evidence for the important role of unconstrained and unsupervised temporal structure of action exploration during novel motor skill acquisition. Higher MTV during training may impact performance improvement across the consolidation phase, but imposed temporal variance in the finger-opposition sequence (FOS) task does not necessarily improve performance. Rather, it seems that allowing self-exploration of the sequence is preferable for improved learning in this task 57 . It may be that by using natural variability, there will be a gradual change in how the sequences are performed, which starkly contrasts with the abrupt changes in variability used in this study, which did not seem to help improve motor learning. Evenly paced temporal instructions during sequence training seem to be undesirable.

Methods
Participants. Fifty-eight right-handed participants from the Tel Aviv University student population took part in the experiment. Right-handedness was confirmed using the Edinburgh Handedness Inventory 58 . Ethics approval was received from the Tel Aviv University Institutional Review Board, and all experiments were performed in accordance with the relevant guidelines and regulations. The participants signed an informed consent form before beginning the experiments, and were paid 70 shekels (approximately $20) for their participation.
As a result of failures in the sensor measurements (sensors falling off the hand), one participant was removed from the Fixed control group. Therefore, only 57 subjects (18 males and 39 females, average age 25.6 ± 2.5 years, range 22-33 years) were included in the final analyses. The number of participants recruited was based on previous, similar studies 43,44 comparing groups in sequence-learning tasks.

Experiment protocol. The protocol is shown in Fig. 6. Participants were randomly assigned to one of three different groups (Variable, Fixed control or Unsupervised control), which differed only during the training session.
The participants were required to perform a finger-opposition task 43 , in which the thumb and another finger of the left, non-dominant hand were required to touch in a given sequence (4-1-3-2-4): 1 corresponds to the index finger touching the thumb, 2 to the middle finger touching the thumb, etc., see Fig. 7. The primary outcome measure in this task is the improvement in the number of correct sequences that can be performed in a 30 s test (from before to after training). The experimenter demonstrated touching the thumb and finger but did not demonstrate the sequence. The participants were instructed not to look at their fingers while performing the task. After successfully performing the sequence 3 times, the recordings started. In the first session, the participants first performed four test trials (Pretest). Each trial was 30 s long, and the participants were instructed to perform accurately as many sequences as possible during this time. Beeps indicated the start and end of the trial, and a 30 s rest period was provided between trials. Before each trial, the sequence was shown on the screen as text (e.g., 4-1-3-2-4). During the tests the screen was blank. Following a 5-min break, the participants performed ten training trials, with each trial consisting of sixteen repetitions of the sequence. Participants in the Variable and Fixed control groups heard a rhythmic sequence of 5 beeps. Immediately after hearing the beeps, they needed to produce the sequence with their fingers touching the thumb in the same rhythm as the beeps they just heard (but without hearing the beeps again). As the duration for performing a sequence at baseline varies greatly across participants (see Fig. S1 in Supplementary Materials), the total duration of the sequence was selected to be identical to the mean duration of the sequences produced by the participant in the fourth block of the Pretest.
This was selected so that the speed would be feasible for the subject, but still somewhat challenging. For the Variable group, the durations between the first four beeps were pseudorandomly selected from a normal distribution with a mean of 25% of the sequence time, and a standard deviation of 5% of the sequence time. The duration between the fourth and fifth beeps was then the remaining time left in the sequence, unless it was below 10% or above 40% of the total sequence time, in which case it was rejected. The same relative timing was used for all subjects in the Variable group. For the Fixed control group, the duration between the beeps was fixed (25% of the sequence time). The participants were instructed to copy the rhythm and focus on accuracy. For the Unsupervised control group, a single beep was played, after which they needed to produce the entire sequence. For all groups, the same type of beep was used (500 Hz beep for 50 ms).
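The rhythm generation for the Variable group can be sketched in Python as follows. This is an illustrative reading of the procedure, not the authors' code: the function name and parameters are hypothetical, rejection is interpreted as redrawing all intervals, and the check that the drawn intervals are positive is an added assumption. Since the same relative timing was used for all Variable participants, the fractions would in practice be drawn once and then scaled by each participant's sequence time.

```python
import random

def variable_beep_intervals(sequence_time, rng, mean_frac=0.25, sd_frac=0.05,
                            min_last=0.10, max_last=0.40):
    """Return the four inter-beep intervals (s) of one 5-beep rhythm."""
    while True:
        # Intervals between the first four beeps: normal draws, as fractions
        first = [rng.gauss(mean_frac, sd_frac) for _ in range(3)]
        last = 1.0 - sum(first)  # the final interval takes the remaining time
        # Reject rhythms whose last interval is out of bounds. The positivity
        # check on the drawn intervals is an assumption, not stated in the text.
        if min_last <= last <= max_last and all(f > 0 for f in first):
            return [f * sequence_time for f in first + [last]]

rng = random.Random(0)  # fixed seed so the example rhythm is reproducible
sequence_time = 2.0     # hypothetical mean Pretest sequence duration (s)
intervals = variable_beep_intervals(sequence_time, rng)
fixed_intervals = [0.25 * sequence_time] * 4  # Fixed control: equal spacing
print([round(t, 3) for t in intervals])
```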
Following the training (after a 2 min break), each participant performed four test trials (Posttest), with the same instructions as for the Pretest. The session on the first day took approximately 30 min in total. Approximately 24 h after the initial training session, participants were re-tested on performing the same trained sequence (24 h). The session on the second day took approximately 10 min.
Measurement and data pre-processing. The finger movements of the participants were recorded using an Ascension trakSTAR magnetic motion capture system, sampling at 240 Hz. Six sensors were used: one taped on each fingertip and one on the palm of the left hand. The sensors were taped on the fingertips such that no tape was on the finger pads, in order to preserve full touch sensation at the fingertips. The experiments were run using the "Repeated Measures" software 59 .
Figure 7. The sequence of finger-to-thumb opposition movements (4-1-3-2-4) that participants were required to perform (from left to right) throughout the experiment. During tests the participants were instructed to perform the sequence repeatedly as "fast and as accurate" as they can during a 30 s interval. During training the participants of the Fixed control and Variable groups were following an auditory guide (set of rhythmic beeps) for the timing of the within-sequence movements, while the participants of the Unsupervised control group had an auditory cue only for the start of the sequence. The wires shown are the Ascension trakSTAR sensors used to measure the 3D location of the fingertips.
Data were analyzed offline using custom Matlab software. The raw data were low-pass filtered with a 4th order two-way Butterworth filter, with a cutoff frequency of 20 Hz. Finger touches (of the thumb and other fingers) were identified automatically based on the minimal distances between the thumb and other fingers. The timing of these touches was manually corrected (using a custom graphical user interface developed in Matlab) so that the number of sequences and errors performed matched the number recorded by the experimenter (from observation during the experiment).
Each sequence was then decomposed to determine the relative contribution of the different temporal parts, following the technique used in Friedman and Korman 2,44 , based on the distances between the thumb and the relevant finger. The movement time was defined as the time from the last trough in the derivative of the finger distance (i.e., when the finger and thumb start moving closer together) to the moment they touch. The touch times (when the thumb was touching the other finger) were defined as the time adjacent to the touch where the magnitude of the derivative of the finger-thumb distance is below 5% of the maximum magnitude of the derivative of the finger-thumb distance. The remaining time (the inter-movement intervals) was divided into the times between movements within a sequence, and between sequences (i.e., between the end of one sequence and the start of the next sequence). Normalized data (relative improvement) were calculated relative to baseline performance in the first two trials of the pre-test. We performed this normalization on both the total time to complete a sequence correctly, and on the components described above (movement times, touch times, inter-movement intervals). We subtracted the duration of the appropriate component from the baseline value (for one sequence).
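The touch-time criterion above (derivative magnitude below 5% of its maximum, adjacent to the touch) can be sketched in Python on a synthetic thumb-finger distance trace. The authors' analysis was done in custom Matlab software, so the function name, the central-difference derivative, and the tie-breaking rule used to pick the touch sample are all assumptions made for illustration.

```python
def touch_window(distance, fs=240.0, threshold_frac=0.05):
    """Return (start, end) sample indices of the touch around minimum distance."""
    n = len(distance)
    # Central-difference derivative of the thumb-finger distance (units/s)
    deriv = [0.0] * n
    for i in range(1, n - 1):
        deriv[i] = (distance[i + 1] - distance[i - 1]) * fs / 2.0
    deriv[0] = (distance[1] - distance[0]) * fs
    deriv[-1] = (distance[-1] - distance[-2]) * fs
    limit = threshold_frac * max(abs(d) for d in deriv)
    # Touch sample: closest approach; ties broken by the smallest |derivative|
    touch = min(range(n), key=lambda i: (distance[i], abs(deriv[i])))
    start = end = touch
    # Grow the window while neighbouring samples stay below the threshold
    while start > 0 and abs(deriv[start - 1]) < limit:
        start -= 1
    while end < n - 1 and abs(deriv[end + 1]) < limit:
        end += 1
    return start, end

# Synthetic trace (mm): finger closes, rests on the thumb, then opens again
closing = [50.0 - 5.0 * i for i in range(10)]
touching = [0.0] * 12
opening = [5.0 * i for i in range(1, 11)]
trace = closing + touching + opening

start, end = touch_window(trace)
touch_time = (end - start + 1) / 240.0  # duration in seconds at 240 Hz
print(start, end, round(touch_time, 4))
```

The relative improvement for a component would then be the difference between its baseline duration (first two Pretest trials) and its later duration, as described in the normalization step.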

Statistical analysis.
We used a one-way between-subjects ANOVA to compare the cognitive load (as reflected in the reaction times) across the groups. Then, to determine whether the intervention changed the variability during training, we calculated the coefficient of variation between the finger-to-thumb touch times. The coefficient of variation (CV-the standard deviation divided by the mean) was used rather than the standard deviation to avoid effects caused by the often-observed correlation of mean and standard deviation. We compared the three groups and sessions (Pretest, Training, Posttest and 24 h) using a mixed-design ANOVA.
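As an illustration of the one-way between-subjects ANOVA applied to the reaction times, the Python sketch below computes the F statistic and its degrees of freedom from first principles. The function name and the example values are hypothetical, and the sketch does not reproduce the mixed-design ANOVA or the post-hoc tests.

```python
from statistics import mean

def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way between-subjects ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variability of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical reaction times (s) for three small groups
unsupervised_rt = [0.25, 0.28, 0.31]
fixed_rt = [0.45, 0.49, 0.53]
variable_rt = [0.70, 0.78, 0.86]
f_stat, df_b, df_w = one_way_anova_f(unsupervised_rt, fixed_rt, variable_rt)
print(f"F({df_b},{df_w}) = {f_stat:.2f}")
```

With 57 participants in three groups, the same formula gives the degrees of freedom (2, 54) reported for the reaction time analysis.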
One-way ANOVAs were used to compare the differences between the three groups in standard deviation of the decomposed quantities (movement time, touch time and inter-movement interval within a sequence). For this analysis, standard deviations were used rather than coefficient of variation because the inter-movement interval approaches 0, which makes the coefficient of variation unstable.
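The instability of the CV for quantities near zero can be seen in a small numerical example. The Python sketch below uses hypothetical inter-movement intervals with identical spread but different means; the values and the function name are illustrative assumptions only.

```python
from statistics import mean, stdev

def sd_and_cv(values):
    """Return (SD, CV) for a list of durations in seconds."""
    sd = stdev(values)
    return sd, sd / mean(values)

# Hypothetical inter-movement intervals (s): same spread, different means
typical = [0.20, 0.22, 0.18, 0.21, 0.19]         # mean 0.200 s
near_zero = [0.025, 0.045, 0.005, 0.035, 0.015]  # mean 0.025 s

sd_typical, cv_typical = sd_and_cv(typical)
sd_near_zero, cv_near_zero = sd_and_cv(near_zero)
print(f"SD: {sd_typical:.4f} vs {sd_near_zero:.4f}")
print(f"CV: {cv_typical:.3f} vs {cv_near_zero:.3f}")
```

The two sets have the same standard deviation, yet the CV of the near-zero set is several times larger purely because its mean is small, which is why the standard deviation is the safer measure for the inter-movement interval.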