Introduction

Mixed reality (MR) technologies (based on the virtuality continuum framework proposed by Milgram and Kishino1), such as virtual (VR) and augmented (AR) reality, merge the physical (i.e., unmediated reality; UR) with a digitally mediated virtual space. The digitized space can simulate interactions with objects that are commonplace, or highly specific and difficult or even impossible to provide in the physical environment. The flexibility and adaptability of MR systems, therefore, make it possible to design and develop immersive and interactive learning and training experiences without the restrictions imposed by physical space or resource availability2,3,4. Nevertheless, mediating visual perception with a digital device does not come without consequences. Notably, certain aspects of MR devices (e.g., vergence-accommodation conflict5, field of view, resolution, and weight; see6 for a review) have been shown to elicit egocentric distance distortion in VR7,8 and AR9, which also affects the accuracy and consistency of visually guided actions in VR and AR, such as locomotion9 and manual pointing10,11. For instance, when fixating on an object at a distance, the eyes rotate inwards or outwards so that the image of the object falls at the center of each retina (vergence), while the shape of the eyes' lenses adjusts so that the retinal images remain in focus (accommodation). In a natural viewing condition, vergence and accommodation are tightly coupled as they both specify the same fixation distance (Fig. 1a). However, with MR devices, such as a head-mounted display (HMD), accommodation is commonly restricted to a fixed distance while the vergence angle constantly updates based on the observer's fixation (Fig. 1b). This discrepancy results in the vergence-accommodation conflict, which perturbs the stereoscopic viewing geometry and affects distance perception and targeted movements11.
These differences between sensorimotor processing in UR and MR might lead to adaptations to movement planning and execution in MR, which, in turn, might have an after-effect that impacts subsequent movement when returning to UR.
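To make the coupling concrete, the vergence angle for a symmetrically fixated target follows directly from the viewing geometry. A minimal sketch (Python), assuming an illustrative interpupillary distance of 6.3 cm:

```python
import math

def vergence_angle(fixation_dist_m: float, ipd_m: float = 0.063) -> float:
    """Vergence angle (radians) for a target at fixation_dist_m, assuming
    symmetric fixation and an interpupillary distance of ipd_m."""
    return 2.0 * math.atan(ipd_m / (2.0 * fixation_dist_m))

# In natural viewing, vergence and accommodation specify the same distance,
# so a nearer target yields both a larger vergence angle and stronger
# accommodation. (The 6.3 cm IPD is an illustrative assumption.)
near = math.degrees(vergence_angle(0.25))  # target at 25 cm -> ~14.4 deg
far = math.degrees(vergence_angle(0.50))   # target at 50 cm -> ~7.2 deg
```

With an HMD, accommodation stays pinned to the screen distance while this angle continues to vary with fixation, which is the discrepancy Fig. 1b depicts.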

Figure 1

A demonstration of the vergence-accommodation conflict. (a) Without mediation by a screen, the distance at which the eyes converge (vergence distance; black solid lines) coincides with the distance to which the eyes accommodate (orange shades). The black, dashed lines represent the visual angle subtended by the target, which specifies the target's binocular disparity and thereby the target's distance. (b) While wearing a VR/AR headset, the eyes accommodate to the location of the screen, which is closer than the fixation point (i.e., vergence). Due to this discrepancy, the crosslink between vergence and accommodation drives the vergence angle inward (black, solid lines), resulting in a larger vergence angle compared to the original (black, dotted lines) and, effectively, a shorter fixation distance. With the same target location (black, dashed lines), the larger vergence angle also increases the binocular disparity corresponding to the target, which would lead to a shorter perceived distance of the target.

In UR, movements involve both a pre-planned ballistic component that brings the limb toward the goal and online correction mechanisms that use feedback from the ongoing movement to detect errors and adjust the movement to ensure accuracy12,13. Detected errors in the endpoint of the action can also be used offline to change the ballistic phase of the next movement to enhance accuracy. Despite the perceptual distortions, studies have shown that visual feedback in MR can correct errors in visually guided actions through sensorimotor recalibration10,14,15,16,17. For instance, Kohm et al.16 examined the long-term (12 weeks) repeated calibration of manual reaching in VR. Results revealed that the depth compression in VR gradually diminished over time and performance improved within VR as the participants gained more experience in the virtual environment. Although movement adaptation and improvement were observed, the improved accuracy was only tested within the same modality (VR) in which the participants were trained. The potential transfer of these adaptations to other modalities remains unknown.

In the context of education and training, transfer between modalities is critical because the goal of training is improvement in perceptual-motor skills in one modality after learning in another (e.g., transfer of training from VR to UR). This transfer between modalities is understudied. For example, Kohm and colleagues16 showed that participants could overcome the distance distortions in VR via practice. However, the distance distortions in MR are attributed to the unique characteristics of the headsets, and the improved sensorimotor performance simply indicates that observers can recalibrate their movements to the distorted perceptual space in MR. Such adaptations within an altered environment are not trivial and have been widely researched18. Because the perceptual spaces are intrinsically different between MR and UR19,20, the transfer from adaptation in MR to UR may entail an adaptation aftereffect21 that may lead to a decrement in the performance of similar tasks in UR.

Studies have shown that explicitly induced sensorimotor recalibration in MR could be transferred to the physical environment22,23,24. For example, Mohler et al.23 manipulated the perceptual features of a virtual environment and evaluated the effect of this manipulation on locomotion in the physical environment. During locomotion, optic flow provides relevant visual information about the speed and direction of the movement25,26. Changing the magnitude of the optical flow of walking would result in a different perceived speed of visual motion than that specified by the biomechanical speed. After adapting to the perturbed visual speed in the virtual environment, the participants were asked to blind walk to a target in the physical environment. Results showed that after adapting to a faster visual speed in the virtual environment, the participants would underestimate distance in the physical environment during the blind walk. This change occurred because the participants recalibrated their walking with the faster visual speed in VR such that they were only perceptually, not biomechanically, faster in UR. The opposite also applies to the adaptation to a slower visual speed. These findings suggest that sensorimotor recalibration in locomotion could transfer from MR to UR.

The previous studies on transfer between MR and UR all explicitly perturbed the visual environment in MR. Training in MR may lead to adaptation to a distorted perceptual space and improved performance accuracy in MR. Because of the distinct perceptual geometries of MR and UR, adaptation to these geometries may in turn negatively impact the performance of similar tasks in UR27. The present study was designed to examine the potential adaptation aftereffect of moving in MR on movement in UR using a manual pointing task. Participants performed a series of targeted pointing movements in UR before (pre-test) and after (post-test) completing a series of similar pointing movements in VR (HTC VIVE Pro Eye) and AR (Microsoft HoloLens 2). If the adaptation that occurs in MR negatively transfers to UR, then the pointing errors in the post-test should deviate from those in the pre-test in the same direction as the errors observed in the MR task. However, if the adaptation that occurs in MR does not transfer to UR, then there should not be any difference in pointing errors in UR between the pre- and post-tests.

Results

Participants sat at a table and executed a series of rapid manual aiming movements with their right index finger in 3 phases. In the first (pre-test) and third (post-test) phases, participants performed a series of 30 right-handed pointing movements in UR. There were 10 movements to each target at distances of 20, 25, and 30 cm set in 5 blocks of 6 trials. In the second phase, they performed 288 similar pointing movements (with the same target distances as in UR) in 16 repetitions of 18 trials. These trials were analyzed as 5 blocks of 54 trials (the first repetition was discarded as practice). Participants were randomly assigned to separate groups who performed these movements in either VR or AR. The movements were captured using an infrared-emitting diode (IRED) attached to their index finger. All dependent variables were derived using a human kinematic analysis toolbox (TAT-HUM28). Constant error (CE) was calculated as the difference between the actual pointing distance and the target distance, which measured the accuracy of each movement. While mean CE can be used to evaluate any potential overall bias in distance perception and movement execution, CE can also be subject to a speed-accuracy tradeoff wherein large CE could be associated with faster and more variable movements and lower CE could be associated with slower and less variable movements. To assess this possibility, additional dependent measures were evaluated. Variable error (VE) was calculated as the standard deviation of CE for each participant, mediation, and block. VE measures the consistency of the movement. Movement time (MT) was derived as the time interval between movement onset and termination, which were defined as the moments when the velocity along the primary movement axis (in depth) first exceeded and subsequently fell below a 50 mm/s threshold, respectively.
MT reflects the overall time taken to execute the movement, with longer MTs being associated with slower movements that have been subject to more online error-reducing correction processes, and shorter MTs being associated with faster movements that are largely ballistic and not subject to online correction processes. If, for instance, a mediated condition yields a larger CE compared to the baseline, this inaccuracy could only be considered the result of differing sensorimotor processing in the mediated condition if the movement was similarly variable (comparable VE) and was completed within a similar amount of time (comparable MT). Mixed-factor analysis of variance (ANOVA) was used to evaluate the effect of modality (between-subjects factor) and block (within-subjects factor) on mean CE, VE, and MT in the different stages of the study, with separate ANOVAs being calculated to assess performance in MR and to assess differences in performance pre-/post-MR intervention.
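The dependent measures described above can be sketched as follows. This is a simplified illustration of the definitions in the text (velocity-threshold onset/termination, CE, VE), not the TAT-HUM implementation itself:

```python
import numpy as np

THRESHOLD = 50.0  # mm/s, velocity criterion along the primary (depth) axis

def movement_window(velocity_mm_s: np.ndarray) -> tuple:
    """Return (onset, offset) sample indices: onset is the first sample whose
    velocity exceeds the threshold; offset is the first later sample where it
    falls back below. A simplified sketch of the criterion in the text."""
    above = velocity_mm_s > THRESHOLD
    onset = int(np.argmax(above))
    offset = onset + int(np.argmax(~above[onset:]))
    return onset, offset

def constant_variable_error(endpoints_cm, target_cm):
    """CE: mean signed endpoint error; VE: SD of the signed errors."""
    errors = np.asarray(endpoints_cm, dtype=float) - target_cm
    return float(errors.mean()), float(errors.std(ddof=1))

# MT in seconds is then (offset - onset) / sampling_rate, e.g. at 250 Hz:
v = np.array([0.0, 20.0, 60.0, 120.0, 60.0, 20.0, 0.0])
onset, offset = movement_window(v)
mt_s = (offset - onset) / 250.0
```

A positive CE indicates overshooting the target distance, a negative CE undershooting, mirroring the sign convention used in the Results.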

Constant error

For the pointing task in MR, a mixed factor ANOVA on CE with one between-subject factor (mediation: VR, AR) and one within-subject factor (block: 1–5) was used. The ANOVA revealed a significant main effect of mediation, \(\text{F}\left(1, 38\right)=4.84, p<0.05, {\eta }_{p}^{2}=0.11\). Neither block, \(\text{F}\left(2.09, 79.36\right)=1.78, p=0.17, {\eta }_{p}^{2}=0.045\), nor the interaction, \(\text{F}\left(2.09, 79.36\right)=0.31, p=0.74, {\eta }_{p}^{2}=0.008\), were significant. As Fig. 2a (middle) shows, the 95% confidence intervals do not overlap with 0 indicating that participants in VR (mean = -0.23, SE = 0.05) generally undershot whereas those in AR (mean = 0.29, SE = 0.11) overshot the target distance.

Figure 2

(a) Constant errors as a function of experimental blocks for the pre- (left) and post-tests (right), and as a function of mediation for the mixed-reality (MR) task (middle). (b) The deviations of constant errors between pre- and post-test results. Error bars represent 95% confidence intervals.

For the pre- and post-MR movements executed in UR, a mixed-factor ANOVA with one between-subject factor (mediation: VR or AR) and two within-subject factors (session: pre- and post-test; block: 1–5) revealed a significant main effect of mediation, \(\text{F}\left(1, 38\right)=16.56, p<0.001, {\eta }_{p}^{2}=0.30\). CE for the group who had just executed movements in VR (mean = -0.22, SE = 0.032) was significantly smaller than CE for the group who previously executed movements in AR (mean = -0.041, SE = 0.032). The main effect of block was also significant, \(\text{F}\left(4, 152\right)=2.88, p<0.05, {\eta }_{p}^{2}=0.071\). More importantly, although the main effect of session was not significant, \(\text{F}\left(1, 38\right)=0.70, p=0.41, {\eta }_{p}^{2}=0.018\), the interaction between session and mediation was significant, \(\text{F}\left(1, 38\right)=33.90, p<0.001, {\eta }_{p}^{2}=0.47\). Finally, none of the remaining interactions reached conventional levels of statistical significance (ps > 0.05).

For the interaction between session and mediation, Tukey's HSD with mediation as a simple effect showed that there was no difference between VR and AR (mean difference = 0.022, SE = 0.049, \(p=0.66\)) in the pre-test, but that AR and VR were significantly different in the post-test, where CE in VR was smaller than in AR (mean difference = -0.39, SE = 0.064, \(p<0.001\)). The direction of the difference is congruent with the CE observed in the MR task. As shown in Fig. 2a (left and right panels), the CE of the VR and AR groups did not differ across blocks in the pre-test; the UR movements of participants in both groups slightly undershot the target distance. This pattern of undershooting could be attributed to the general tendency to undershoot in 3D aiming movements to reduce energy expenditure29,30. Strikingly, the difference between the movements in UR executed by the groups who previously moved in VR and AR became remarkably large after the MR task in the post-test. In the first block, there was a noticeable undershoot in VR and an overshoot in AR. The undershoot in VR gradually diminished towards the end of the session, whereas AR's overshoot dramatically fell to the baseline level after the first block. This pattern indicates the presence of an adaptation aftereffect from moving in MR that dissipated with additional movement in UR.

To further investigate the aftereffect, a post-hoc analysis comparing the CE deviations between the pre- and post-test was conducted. The grand mean of CE for VR and AR in the pre-test was considered the baseline and was subtracted from the CE in the post-test. The direction of the deviations relative to 0, measured using 95% confidence intervals with Šidák correction, indicates the direction of change between the pre- and post-test and serves as a measure of the adaptation aftereffect (Fig. 2b). Participants in the VR condition consistently undershot for the first three blocks and accuracy gradually improved in the last two blocks. The direction of the undershoot was consistent with that in the VR task, suggesting an adaptation aftereffect in the negative direction. In contrast, the overshoot in the AR condition only lasted for the first block, and the deviations in the following blocks were at around 0. This initial overshoot was also consistent with the bias in the AR task. Interestingly, the rate at which the adaptation aftereffect diminished differed between VR and AR: performance in VR returned to the pre-test baseline level at a much slower rate than in AR.
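The Šidák correction applied to these confidence intervals adjusts the per-comparison alpha so that the familywise error rate stays at the nominal level. A minimal sketch, using the five post-test blocks per group as the family size for illustration:

```python
def sidak_alpha(alpha: float, m: int) -> float:
    """Sidak-adjusted per-comparison alpha for m simultaneous confidence
    intervals, keeping the familywise error rate at alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)

# Five blocks per family -> each interval is built at the adjusted level,
# i.e., roughly a 98.98% CI instead of a 95% CI.
adj = sidak_alpha(0.05, 5)
```

An interval built at this adjusted level that excludes 0 then indicates a reliable pre-to-post deviation even after accounting for the multiple blocks tested.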

Movement time and variable error

Although the stark contrast between the pre- and post-test pointing performance could be attributed to the adaptation aftereffect of MR’s perturbed perceptual environment, it was also possible that the deviation in CE was a result of a speed-accuracy tradeoff31,32. Specifically, the increased error in the post-test could result from a change in the strategic approach to movements in which the participants chose to move more rapidly, at the expense of accuracy, and so the movements with higher CE would be associated with shorter MT and/or increased variability at the end point. To address this possibility, the same ANOVAs were performed on MT and VE in the pre- and post-test trials (Fig. 3). For MT, there was a significant main effect of mediation (\(\text{F}\left(1, 38\right)=4.72, p<0.05, {\eta }_{p}^{2}=0.11\)), where MTs of the VR group in UR (mean = 495 ms, SE = 4.90) were longer than the MTs of movements of the AR group in UR (mean = 441 ms, SE = 7.09). The interaction between block and session was significant, \(\text{F}\left(2.69, 102.06\right)=4.51, p<0.01\). Post-hoc pairwise comparison with session as a simple effect only showed a significant difference between the pre- and post-test MT for the first block (mean difference = 28 ms, SE = 11.41, \(t\left(38\right)=2.45, p<0.05\)) whereas none of the other blocks yielded a significant difference between the sessions (Block 2: mean difference = 13 ms, SE = 9.69, \(t\left(38\right)=1.34, p=0.19\); Block 3: mean difference = 13 ms, SE = 9.82, \(t\left(38\right)=1.34, p=0.19\); Block 4: mean difference = 3 ms, SE = 9.90, \(t\left(38\right)=0.28, p=0.78\); Block 5: mean difference = 1 ms, SE = 10.21, \(t\left(38\right)=0.11, p=0.91\)). As Fig. 3a shows, while there was a significant interaction between block and session, MT remained relatively stable between pre- and post-tests across blocks and the only significant difference from the post-hoc analysis was negligible (28 ms). 
Importantly, none of the other effects were significant, including the crucial interaction between mediation and session (\(\text{F}\left(1, 38\right)=2.14, p=0.15, {\eta }_{p}^{2}=0.053\)) that was observed in CE. Overall, this pattern of findings suggests that the increased CE in the post-test (Fig. 2a) was not attributable to faster movements.

Figure 3

Movement time (a) and variable error (b) as a function of experimental blocks for the pre- (left) and post-tests (right).

For VE, none of the effects were statistically significant (mediation: \(\text{F}\left(1, 38\right)=2.48, p=0.12,{\eta }_{p}^{2}=0.061\); block: \(\text{F}\left(4, 152\right)=0.22, p=0.93, {\eta }_{p}^{2}=0.006\); session: \(\text{F}\left(1, 38\right)=1.01, p=0.32, {\eta }_{p}^{2}=0.026\); mediation × block: \(\text{F}\left(4, 152\right)=1.23, p=0.30, {\eta }_{p}^{2}=0.031\); mediation × session: \(\text{F}\left(1, 38\right)=1.85, p=0.18, {\eta }_{p}^{2}=0.046\); block × session: \(\text{F}\left(4, 152\right)=0.96, p=0.43, {\eta }_{p}^{2}=0.025\); mediation × block × session: \(\text{F}\left(4, 152\right)=0.87, p=0.48, {\eta }_{p}^{2}=0.022\)). Fig. 3b shows stable levels of VE for both groups across blocks within the two sessions. Importantly, the critical mediation by session interaction was not statistically significant. Thus, the reduced accuracy in the post-test was not the result of more variable targeting at the end point. Overall, the analyses of MT and VE did not provide evidence that the CE observed in the post-test was associated with strategic changes in the speed-accuracy trade-off. The more likely account for the change in performance is an adaptation aftereffect, whereby the participants adapted to the distorted sensorimotor coupling in MR and this adaptation was subsequently carried over to task performance in UR.

Discussion

The current study was designed to examine whether the execution of sensorimotor behavior in MR would elicit adaptation aftereffects in UR. Prior to executing actions in MR, participants in the AR and VR groups accurately pointed at targets in UR. After spending approximately one hour in MR, the same movements performed accurately in the UR pre-test were now biased in a manner consistent with performance in the MR: Participants in the VR condition undershot whereas participants in the AR condition overshot the target. Additional analyses on MT and VE did not show any significant difference between the pre- and post-test performance for the VR and AR conditions, confirming the difference in CE in the pre- and post-test performance could not be explained by a change in the speed-accuracy trade-off. Thus, the more likely explanation is that there was an adaptation aftereffect after being in MR, that is, a transfer of the sensorimotor recalibration based on the incorrect viewing geometry specific to the MR conditions.

The overall direction of the movement bias is congruent with the predictions based on different headsets’ optical structure that dictates the vergence-accommodation conflict in MR headsets11. As mentioned in the Introduction, in a natural viewing condition in real-world environments, vergence and accommodation are tightly coupled with both processes specifying the same distance between the observer and the target. Because vergence and accommodation are tightly coupled, the lens’s accommodative state would drive the vergence response and vice versa33,34,35,36. However, because perceptual information in MR is conveyed through a pair of screens at a fixed distance from the eyes, the accommodative state of the lenses of the eyes would remain constant for this fixed distance while the orientation of the eyes converges and diverges based on the different virtual locations of the targets. Thus, in MR, the breakdown of the vergence-accommodation coupling leads to conflict wherein the vergence and accommodation specify different distances. For manual pointing movements in MR, the accommodation distance is either shorter or longer than the fixation distance to the target (i.e., distance specified by vergence), which would differentially drive the vergence angle inward or outward37,38. Using this insight, Wang and colleagues11 conceptualized the vergence-accommodation conflict as a positive angular offset to the vergence angle (i.e., larger vergence angle), resulting in an effectively shorter fixation distance and, consequently, compression in perceived depth (Fig. 1). A thorough analysis of the performance of the movements in the MR conditions revealed that the vergence-accommodation conflict modeling could predict up to 66 percent of the variance in behavioral results.
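The angular-offset conceptualization can be sketched numerically: adding a positive offset to the geometrically correct vergence angle shortens the distance that vergence specifies. The 0.5° offset below is an arbitrary illustrative value, not a parameter reported by Wang and colleagues11:

```python
import math

def perceived_distance(true_dist_m: float, offset_rad: float,
                       ipd_m: float = 0.063) -> float:
    """Distance specified by vergence after adding a positive angular offset
    to the geometrically correct vergence angle (a sketch of the angular-
    offset conceptualization; IPD and offset values are illustrative)."""
    theta = 2.0 * math.atan(ipd_m / (2.0 * true_dist_m))
    return ipd_m / (2.0 * math.tan((theta + offset_rad) / 2.0))

# A target at 30 cm with a 0.5 deg offset is specified at roughly 28.8 cm,
# i.e., the depth compression illustrated in Fig. 1b.
d_perceived = perceived_distance(0.30, math.radians(0.5))
```

Larger offsets, or nearer accommodation distances driving a stronger crosslink response, would compress perceived depth further under this model.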

The present study focuses on the adaptation aftereffects observed in the post-test. Results imply that the observers could have adapted to the fixed accommodation distance of the HMDs, which temporarily altered the coupling between vergence and accommodation in AR/VR, subsequently affecting performance in UR. Eadie and colleagues33 demonstrated that the two systems that govern the vergence-accommodation coupling (i.e., accommodative vergence and vergence accommodation) are capable of adaptation. Therefore, prolonged exposure to MR using an HMD could lead the observer's visual system to adapt to the altered accommodation distance, alongside potential adaptations to other sensorimotor perturbations imposed by the HMD (e.g., weight, restricted field of view, etc.; see6), and these adaptations persist after exiting MR and affect visually guided actions in UR.

Additionally, the dynamics of the recovery of the adaptation aftereffects demonstrate an interesting contrast between VR and AR. For VR, the change was gradual with the CE in the post-test steadily returning to the pre-test level at block 3 (after 9–12 trials). Conversely, the AR condition had a more rapid return where the post-test biases diminished by block 2 (after 3–6 trials). This contrast in de-adaptation could be attributed to the amount of access to the physical environment with a natural viewing geometry during the different MR experiences. For VR, the participants were completely isolated from the physical world, and their entire visual field, including both the target and moving hand, was occupied by the virtual world conveyed through a pair of fixed-distance screens. Because of these sensorimotor perturbations, movement in VR was significantly slower than in AR while showing no differences in the levels of endpoint variability and online modulation, suggesting that the observers adopted a compensation strategy of slowing the movement down to engage in more online control to overcome the perturbations (see also39 for additional comparisons between movements in VR, AR, and UR). For AR, the virtual objects were displayed as a hologram on an otherwise transparent holographic frame, giving the participants access to the unmediated physical environment and the natural viewing geometry therein. In other words, the participants in the AR condition only experienced the distorted viewing geometry due to the vergence-accommodation conflict when fixating on the virtual target, but had an unperturbed vision of the moving hand as well as the action surface of the table and other surrounding objects. Due to the unmediated access to the physical environment, movement in AR was executed more rapidly than in VR, suggesting less strategic change for online control. 
Note that additional comparisons with movements in UR reported in other work are consistent with this suggestion, indicating a comparable level of control in UR and AR39. Overall, these findings indicate that having access to the correct viewing geometry of the moving effector and surrounds may affect the observer's strategy for overcoming the sensorimotor perturbations imposed by MR and the sensorimotor recalibration process, resulting in a more transient adaptation aftereffect.

The findings of the current study call for a deeper consideration of the use scenarios of MR in training and skill development. Due to the intrinsic discrepancy in perceptual geometry, successful sensorimotor skill transfer from MR to UR may be questionable, especially for tasks that require fine sensorimotor skills. Note, however, that this suggestion does not diminish the value of MR for the development of explicit knowledge, such as procedural or strategic knowledge components of tasks, or for skill development for tasks that will be performed solely in MR (and hence do not require transfer). Indeed, the findings of Kohm et al.16 indicate that practice in VR could improve the accuracy and consistency of manual reaching in VR. In this context, skills training focusing on using MR as an interface for the teleoperation of high-risk tasks could be extremely valuable40,41. For example, research on VR-based teleoperation of high-risk machinery has shown that using robotic hand avatars, instead of human-like hands, could help to reduce workers' risk perception and improve task performance42. Further, it is also possible that individual differences in MR experience (e.g., video game experience) could influence performance in VR/AR and the resulting adaptation and transfer between modalities. This factor was not explored in the present study but should be considered in future work. Nevertheless, given the findings of the current study, researchers and MR developers should prioritize MR training tools aimed at skill development in the same MR modality. Just like any other digital invention that transforms how humans interact with the world around them, MR technologies should be considered a tool that can help to improve the efficiency and accuracy of completing some tasks, rather than a replacement for physical reality.

Materials and methods

Participants

Forty adults completed the experiment, including 20 in the VR (8 males and 12 females, age range = [18, 32]) and 20 in the AR (8 males and 12 females, age range = [18, 30]) conditions. Participants provided their full and informed consent prior to the experiment and received monetary compensation. All participants were right-handed and had normal or corrected-to-normal vision. All procedures were approved and were consistent with the standards of the University of Toronto Research Ethics Board (Protocol #: 00042432).

Stimuli and apparatus

For VR, the environment was presented on an HTC VIVE Pro Eye HMD with a resolution of 1440 × 1600 pixels per eye, a combined 110° field of view, and a refresh rate of 90 Hz. For AR, the stimuli were presented through Microsoft HoloLens (2nd gen) holographic mixed reality glasses with a 43° × 29° field of view, a holographic density of 2500 radiants (light points per radian), and a refresh rate of 60 Hz. Being an optical see-through device, the HoloLens allows users to see their physical surroundings under natural viewing conditions (i.e., not mediated by a screen) together with holographic objects. Therefore, the effective field of view of the AR headset is essentially the participant's natural field of view, which is slightly over 210° with binocular viewing. For both modalities, the participants performed pointing movements with the right index finger of their own physical hand, and the movements were recorded using an optoelectronic motion tracking system (Optotrak, Northern Digital Inc., Waterloo, Ontario, Canada) at a 250 Hz sampling frequency. For AR, the participants could see and interact with the environment using their physical hand, whereas for VR, the motion tracking data were used to animate a virtual hand for interactions (Fig. 4a). Finally, the motion capture system was calibrated so that the primary movement direction (the direction in which the participants pointed) coincided with the motion capture system's z-axis.

Figure 4

(a) Examples of the pointing tasks performed in VR (left) and AR (right). (b) A schematic illustration of the setup for the pre- and post-test pointing task. The bottom square is the home position, whereas the other squares are the targets. Note that the depicted dimensions are not to scale.

Procedures and design

Before and after experiencing the MR conditions, all participants performed pointing movements in UR. A board with four rectangles was placed in front of the participants on a table (Fig. 4b). The closest rectangle was the home position, 5 cm away from the edge of the table. The other rectangles were targets located 20, 25, and 30 cm from the home position. The locations of the home position and the targets coincided with those used in the MR tasks to minimize the differences between the tasks. For each trial, the participants placed their right index finger at the home position. After the experimenter announced the target number (1, 2, or 3), a beep was played. Participants were instructed to point to the indicated target as accurately and as quickly as possible after the sound. There were two movements to each of the three target distances in a block, and there were 5 blocks of trials in each of the pre- and post-test sessions. Target distances within each block were randomized. Therefore, each session contained 30 trials (5 blocks × 6 trials per block).

The MR conditions presented here are part of a larger study that examined the effect of MR on motor planning and control11,39. The original study was intended to evaluate the effect of visual illusions on targeted movements. For both the VR and AR groups, participants performed manual pointing movements to a series of Müller-Lyer stimuli (a long central line with shorter lines connected at different orientations to the end of each line) with three different end-point configurations (fin-in, fin-flat, fin-out) and three stimulus lengths (20, 25, and 30 cm). Depending on the illusion's configuration, the stimulus could be perceived to be either shorter (fin-in) or longer (fin-out) than it actually is. Moreover, to manipulate the illusion's effect on motor planning and control, the participants performed the pointing movement with either full or no vision of the stimuli, in which case a perceptual mask was shown for 1.5 s prior to movement onset (Fig. 4a). Although the target stimuli in VR and AR in the intervening task differed from the target stimuli in the pre/post-test sessions (different configurations of lines vs. rectangles, respectively), the target stimuli in the intervening tasks were consistent across conditions and, importantly, the veridical distances of the target stimuli and movements were consistent across the VR, AR, and UR tasks. Hence, the differences in stimuli across the phases of the study should not drive any patterns of outcomes. There were a total of 288 trials, consisting of 16 repetitions of 18 trials, combining the 2 visual conditions, 3 movement lengths, and 3 end-point configurations. The trial order within each repetition was randomized. The first repetition (trials 1–18) was considered practice and was not included in the data analysis. Every three repetitions were treated as a block in the analysis. This portion of the experiment took approximately one hour to complete.
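The trial structure of the MR task can be sketched as follows. This is a hypothetical reconstruction of the randomization scheme described above (condition labels are illustrative), showing how the 2 × 3 × 3 design yields 288 trials in 16 shuffled repetitions:

```python
import itertools
import random

def build_trial_sequence(seed: int = 0):
    """Sketch of the MR design: 2 vision conditions x 3 stimulus lengths x
    3 Mueller-Lyer configurations = 18 unique trials, repeated 16 times with
    order randomized within each repetition (288 trials total)."""
    rng = random.Random(seed)
    cells = list(itertools.product(
        ("full-vision", "no-vision"),
        (20, 25, 30),                       # stimulus length, cm
        ("fin-in", "fin-flat", "fin-out"),  # end-point configuration
    ))
    sequence = []
    for _ in range(16):
        repetition = cells[:]
        rng.shuffle(repetition)             # randomize within repetition
        sequence.extend(repetition)
    return sequence

trials = build_trial_sequence()
analysed = trials[18:]  # first repetition treated as practice (270 trials)
```

Grouping every three repetitions then gives the 5 analysis blocks of 54 trials each.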
For a more detailed description of the manipulations in the MR tasks, readers should refer to39.
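As a sketch of this design (illustrative names and labels, not the original experiment code), the 288-trial sequence with its practice and block assignments can be expressed as:

```python
import itertools
import random

def make_mr_trials(n_reps=16, seed=0):
    """Generate the MR task sequence: 16 repetitions of the 18 crossed
    conditions (2 visual x 3 lengths x 3 configurations), shuffled within
    each repetition. Repetition 1 is practice; every three subsequent
    repetitions form one analysis block (1-5)."""
    rng = random.Random(seed)
    conditions = list(itertools.product(
        ("full-vision", "no-vision"),       # visual condition
        (20, 25, 30),                       # stimulus length (cm)
        ("fin-in", "fin-flat", "fin-out"),  # Müller-Lyer configuration
    ))
    trials = []
    for rep in range(1, n_reps + 1):
        order = conditions[:]
        rng.shuffle(order)
        for cond in order:
            trials.append({
                "rep": rep,
                "practice": rep == 1,
                # reps 2-4 -> block 1, 5-7 -> block 2, ..., 14-16 -> block 5
                "block": None if rep == 1 else (rep - 2) // 3 + 1,
                "condition": cond,
            })
    return trials
```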

Data reduction

Raw movement trajectories were processed using TAT-HUM, an open-source toolkit for human kinematic analysis28. Missing data due to marker occlusion within a trial were replaced using linear interpolation. Then, a second-order low-pass Butterworth filter (250 Hz sampling frequency, 10 Hz cutoff frequency) was applied to the trajectory. A 3-point central difference method was used to derive the velocity, which was also smoothed using the same Butterworth filter. A velocity threshold of 50 mm/s was used to identify movement onset and termination. MT is the temporal duration between movement onset and termination. Given the motion capture’s calibration setup, the reach distance is the difference in z-coordinates (distance along a transverse plane in a direction perpendicular to the mid-sagittal plane) between the start and end positions in real space. Using the distance along a single dimension, instead of the Euclidean distance on the horizontal plane, helped to minimize the effect of extraneous lateral movement on the derived distance. See Wang and Welsh28 for more details on the data reduction procedure. Finally, CE was computed as the difference between the real reach distance of the limb and the stimulus length, whereas VE was computed as the standard deviation of CE for each participant, mediation, and block. CE was also used to identify outlier trials, which were defined as individual values that were more than three standard deviations away from the mean for each participant, mediation, block, and session (pre- and post-tests only). A total of 71 trials (0.70% of the total trials) were removed from the MR tasks, and 17 trials (0.75%) were removed from the pre- and post-tests.
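A minimal sketch of the filtering, differentiation, and onset-detection steps described above (using SciPy directly rather than TAT-HUM itself; the interpolation of occluded samples is omitted):

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 250.0  # motion-capture sampling frequency (Hz)

def smooth(signal, cutoff=10.0, fs=FS):
    """Second-order low-pass Butterworth filter (10 Hz cutoff),
    applied forward and backward for zero phase shift."""
    b, a = butter(2, cutoff / (fs / 2), btype="low")
    return filtfilt(b, a, signal)

def velocity(position, fs=FS):
    """Central-difference velocity (mm/s), smoothed with the same filter."""
    v = np.gradient(position) * fs  # central differences on interior points
    return smooth(v)

def movement_window(v, threshold=50.0):
    """First and last samples where |velocity| exceeds 50 mm/s,
    taken as movement onset and termination."""
    above = np.flatnonzero(np.abs(v) > threshold)
    return (above[0], above[-1]) if above.size else None
```

MT would then follow as `(offset - onset) / FS` seconds from the returned sample indices.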

Power analysis

Because the current study aims to identify any potential adaptation aftereffect in the different mediation conditions, the theoretically relevant effect is the interaction between mediation (VR and AR, between-subject) and session (pre- and post-test, within-subject) for CE. Therefore, a power analysis was performed for a mixed-design ANOVA focusing on the within-between interaction using G*Power43. To the best of the author’s knowledge, no study has yet specifically examined this interaction, making it challenging to identify a reasonable effect size. Therefore, the current study used an effect size sensitivity analysis44 instead. With a desired power of 0.95, a moderate correlation among repeated measures (rmcorr = 0.5), sphericity correction (\(\epsilon\) = 0.5), two between-subject groups (mediation), 20 participants per group (total sample size = 40), and 10 measurement points across the pre- and post-test sessions, the smallest detectable effect was f = 0.22 or, equivalently, \({\eta }_{p}^{2}\) = 0.046.
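The equivalence between the two effect-size metrics follows from the standard conversion \({\eta }_{p}^{2}\) = f²/(1 + f²), which can be verified directly:

```python
def f_to_eta_p2(f):
    """Convert Cohen's f to partial eta squared: eta_p^2 = f^2 / (1 + f^2)."""
    return f ** 2 / (1 + f ** 2)

# f = 0.22 corresponds to eta_p^2 of roughly 0.046, as reported
eta_p2 = round(f_to_eta_p2(0.22), 3)
```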

Statistical analysis

To investigate the adaptation aftereffect, CE, VE, and MT were compared between the pre- and post-tests as a function of trial order for VR and AR. For the pre/post-test sessions, mean CE, VE, and MT were submitted to a mixed-factor analysis of variance (ANOVA) with a between-subject variable of mediation (VR, AR) and within-subject variables of session (pre-test, post-test) and block (1–5). To further understand the potential difference between the pre- and post-test results, performance in the MR tasks was also evaluated. As mentioned previously, the data from the MR tasks are reported in detail in another paper39 to address a different theoretical question. For the current study, emphasis is placed on the effects of mediation and block, whereas the effects of stimulus length, illusion configuration, and visual feedback were averaged across conditions. This mixed-factor ANOVA contained one between-subject variable of mediation (VR, AR) and one within-subject variable of block (1–5). Greenhouse–Geisser corrections, as indicated by the decimal values in the reported degrees of freedom, were applied to effects that violated the sphericity assumption. Significant effects were further evaluated using Tukey’s honestly significant difference (HSD) test.
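The per-cell aggregation feeding these ANOVAs — mean CE per cell, with VE as the within-cell standard deviation, after the 3-SD outlier exclusion described in the data-reduction section — can be sketched with pandas (column names are illustrative, not from the original analysis scripts):

```python
import pandas as pd

KEYS = ["participant", "mediation", "session", "block"]

def aggregate_cells(trials: pd.DataFrame) -> pd.DataFrame:
    """Collapse trial-level constant error (CE, mm) into the cell means (CE)
    and within-cell SDs (VE) submitted to the mixed-design ANOVA.
    Expects a tidy table with one row per trial and columns:
    participant, mediation, session, block, ce."""
    # z-score each trial within its cell and drop |z| > 3 outliers;
    # fillna(0) keeps cells whose CE values have zero variance
    z = trials.groupby(KEYS)["ce"].transform(
        lambda s: (s - s.mean()) / s.std(ddof=1))
    trimmed = trials[z.abs().fillna(0) <= 3]
    return (trimmed.groupby(KEYS, as_index=False)["ce"]
            .agg(CE="mean", VE="std"))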