In one of his interviews, Ichiro Suzuki, one of the most consistent batters in Major League Baseball in recent times, disclosed that he refrains from closely watching poor batters on his team before going out to bat because it affects his own batting performance (2007, June 19, Yukan Fuji). While his comments were branded by sections of the mass media as arrogant and inconsiderate, they relate to a major controversial question in the cognitive and social neuroscience community over the last decade: are there common neural processes that contribute to both “closely watching” (or understanding) an action and action production, such that one can affect the other?

The discovery of ‘mirror neurons’1,2,3, which activate during both the execution of an action and the observation of the same action performed by others, has made the hypothesis that the motor system is involved in action understanding very popular recently. However, the conclusions from the monkey electrophysiology4,5, human brain imaging6 and child development studies7,8 in this regard have been controversial since the evidence is generally correlative and not causative in nature9,10,11. On the other hand, the results from lesion and patient12 and brain stimulation13,14,15 studies are inconclusive because it is difficult to concretely assess the functional extent of a lesion or stimulus, which may include both motor system and action understanding neurons2. Indeed, no previous study has exhibited a causal relation between action understanding and production, where a purposely induced change in the action understanding system affects action production, or vice versa.

To demonstrate a causal relation between these two functions, we utilized a novel behavioral paradigm and examined the behavioral interaction between action production and outcome prediction7,16,17, which is considered as a component of action understanding18,19. Specifically, we asked expert dart throwers to predict the outcome of throws made by an unfamiliar darts novice by watching the novice darts player's throwing action. We regulated the relevant prediction error feedbacks available to the experts, controlled the improvement in their prediction ability19,20 and observed if this affects the accuracy of the expert's own dart throws.

Behavioral paradigms examining interference and transfer of learning between tasks have been previously utilized to investigate the neural processes behind human motor learning21,22,23,24,25,26,27. Here we use a similar procedure for outcome prediction learning. Behavioral paradigms cannot measure or identify the spatial characteristics of neural activity related to a behavior. However, their advantage lies in the fact that, with proper control, they can ensure changes in neural processes specific to a behavioral function wherever they lie in the brain. For our purpose, the outcome prediction learning paradigm enabled us to induce targeted changes in the outcome prediction system of individuals while avoiding the spatio-functional ambiguities characteristic of changes induced by lesions12 and neural interventions13,14,15. We chose to use darts experts as subjects for several reasons: i) experts in a sport are known to possess an excellent ability to predict the outcome of observed actions16, ii) arguably, the observing expert will not explicitly imitate the novice and iii) an expert's motor performance is expected to be stable with time and resistant to fatigue. We could thus exclude any major contribution of explicit strategy changes28 and fatigue in our results.

To anticipate our results, we observed that an improvement in outcome prediction by the experts causes a corresponding and proportional deterioration in their own darts performance, but only when they watch novice dart throwers and not when they watch novice ten-pin bowlers. These results exhibit a causal relation between action production and outcome prediction, giving behavioral evidence for the involvement of the motor system in outcome prediction of actions observed in others.


Experiment-1: Watching darts vs watching bowling

Experiment-1 extended over two days. 16 darts experts threw 70 darts (aimed for the center of the darts board) each day over two visual feedback (VF) blocks, where they could see where their darts landed on the board, and five blocks without visual feedback (nVF), where the room light was switched off when they released their dart so they could not see where their darts landed (Fig. 1a). The nVF blocks were interleaved with observation-prediction (OP) blocks, where the experts watched the video of a novice darts thrower (on one day) or a novice ten-pin bowler (on another day) as control. Parts of the videos in both cases were masked such that the dart flight trajectory and darts board, and the bowling ball trajectory and bowling pins, were not visible to the viewers (see snapshots of watched novice videos in Fig. 1b). Novice subjects were asked not to show any expressions after their throws and the recorded video was further checked and edited to remove any throws that still contained expressions. The experts were informed of the ‘goal’ of the novice actions (hitting the board center or ‘bull’ in case of dart throwers and felling all ten pins or a ‘strike’ for bowlers) and asked to predict the outcome of the throws (in terms of either the location on a lower resolution darts board or the number of bowling pins felled) by watching the action kinematics in the videos. They were informed of the correct outcome orally after each prediction. The experiment time line is presented in Fig. 1c. The nVF blocks were used to prevent visual correction by the experts and helped in magnifying the effects of the OP task on their darts performance.

Figure 1

The Experiment:

Our experiment consisted of (a)- two motor action tasks, one in which the subjects threw darts in the presence of visual feedback (VF) of where their darts land on the darts board and second, in the absence of visual feedback (nVF) and (b)- observation-prediction (OP) tasks in which the subjects observed the video of either a novice darts thrower or a ten-pin bowler (snap shots shown), made a prediction of the outcome of each throw and were given the feedback of the correct outcome orally by the experimenter. The chance level for both OP tasks was 9.09% (= 1/11 × 100). Each experiment session followed the sequence of blocks as shown in (c). T.I. and G.G. took all photographs and created all drawings except for the bowling ball with skittles (Saelynriel/ link -

The outcome prediction in the experts improved significantly through the OP task of Experiment-1 both when they watched the video of a novice ten-pin bowler (8.57 ± 4.15SD% correct predictions above chance, t(15) = 8.25, p<0.001) and when they watched a dart thrower (11.69 ± 7.15SD%, t(15) = 6.54, p<0.001). Fig. 2 shows the performance error of the experts through Experiment-1, measured as the distance from the center of the board. In the first VF1 block, the experts made an average error of 2.35 cm (left red plot). This error increased marginally in the immediately following nVF block when the visual feedback was switched off. In the subsequent nVF blocks the error steadily increased after each OP block when they watched the video of a novice darts thrower (red plots in Fig. 2, t(15) = 2.60, p = 0.02) but not when they watched the video of a novice bowler (blue plots in Fig. 2, t(15) = −1.09, p = 0.29). The performance error remained high even after returning to the natural darts throwing environment, i.e., in the presence of visual feedback (VF2). A two-way ANOVA revealed a significant interaction (F(1,15) = 9.23, p<0.01) between the darts performance across VF blocks (VF1, VF2) and the observed video in the OP task (darts and bowling). Though the initial performance of experts was similar in the VF1 block (F(1,30) = 3.07, p>0.05), a significant increase in the performance error was observed in experts when they watched a darts novice (F(1,30) = 15.55, p<0.001) but not when they viewed a bowling novice. The darts performance deterioration, defined as the increase of performance error between VF1 and VF2, was therefore significantly positive in Experiment-1 (Fig. 3b, red plot; t(15) = 5.10, p<0.001).
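The two quantities defined above, performance error (distance from the board center) and performance deterioration (increase in error from VF1 to VF2), can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names, the board coordinate convention and the sample numbers are all assumptions, and the data are synthetic.

```python
import math
from statistics import mean, stdev

def radial_error(x_cm, y_cm):
    """Distance of a dart from the board center, taking the center as (0, 0)."""
    return math.hypot(x_cm, y_cm)

def block_error(throws):
    """Mean radial error over the (x, y) landing positions of one block."""
    return mean(radial_error(x, y) for x, y in throws)

def one_sample_t(values, mu=0.0):
    """t statistic and degrees of freedom for H0: mean(values) == mu."""
    n = len(values)
    t = (mean(values) - mu) / (stdev(values) / math.sqrt(n))
    return t, n - 1

# Synthetic per-expert block errors in cm (illustrative only, not the study's data).
vf1 = [2.0, 2.5, 2.2, 2.8]
vf2 = [2.6, 2.9, 3.0, 3.1]

# Deterioration per expert: error in VF2 minus error in VF1.
deterioration = [after - before for before, after in zip(vf1, vf2)]
t_stat, df = one_sample_t(deterioration)
```

With 16 experts the same test would have 15 degrees of freedom, matching the t(15) values reported in the text.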

Figure 2

Watching darts vs watching ten-pin bowling (Experiment-1):

The figure plots the average (and SE) performance error, measured as the distance of the darts from the board center, observed across darts experts in each VF block (solid plots) and nVF blocks (open plots) when they observed darts video (red plots) and when they observed bowling videos (blue plots) in the OP blocks. The solid lines represent the linear fit of the data in the nVF blocks averaged across subjects. Significant increase in performance error was observed during/after watching the darts novice in both the nVF blocks (p<0.05; significant slope only after darts observation) and VF blocks (p<0.001).
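One plausible way to obtain the nVF slope described in this caption is an ordinary least-squares fit of error against nVF block index, computed per expert and then tested against zero across experts. The per-expert fitting step is an assumption (the caption only states that the fit is averaged across subjects), and the numbers below are synthetic.

```python
from statistics import mean

def ls_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

# Indices of the five nVF blocks in a session.
nvf_blocks = [1, 2, 3, 4, 5]

# Synthetic mean error (cm) per nVF block for two hypothetical experts.
errors_by_expert = [
    [2.4, 2.6, 2.8, 3.0, 3.2],
    [2.5, 2.5, 2.7, 2.6, 2.9],
]

# One slope per expert (cm per block); a positive slope means growing error.
slopes = [ls_slope(nvf_blocks, errors) for errors in errors_by_expert]
```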

Figure 3

Modulation of outcome prediction and its effect on performance deterioration:

(a) The experts could utilize two types of prediction errors to improve their outcome prediction in our task. First, the outcome prediction error between the expert predicted outcome and the correct outcome feedback from the experimenter. Second, the kinematic prediction error between the action kinematics predicted by the expert corresponding to his goal belief (of where the novice aims his throws for) and the kinematics the expert actually observes in the video. We modulated the outcome feedbacks and goal belief provided to the expert subjects across our three experiments. (b) In the presence of both the prediction errors in Experiment-1 (red plot) the outcome prediction change (abscissa of red plot) was significant, leading to darts performance deterioration (ordinate of red plot). When the outcome feedbacks and goal belief were both removed in Experiment-2, the outcome prediction change (abscissa of black plot) as well as the performance deterioration (ordinate of black plot) were prevented. In Experiment-3, subjects were divided into two groups and provided with either only outcome feedbacks (Group GOE, pink plot) or only a goal belief (Group GKE, orange plot) to modulate their outcome prediction change. Error bars show SE. (c) Summary of the availability of outcome feedbacks and goal belief in the three experiments. T.I. and G.G. took the photograph in a).

Experiment-1 thus exhibited two results. First, predicting a novice's action leads to a progressive increase in the performance error in the expert dart throwers. Second, the performance change is task specific: darts performance error increases on predicting outcomes of darts throws but not on predicting outcomes of ten-pin bowling, critically even when the improvement in outcome prediction was similar between darts and bowling OP task conditions (t(15) = 1.22, p = 0.24). The absence of performance changes in the bowling sessions (blue plots, Fig. 2) shows that the increase in performance error is not due to fatigue, loss of attention, motivation or lack of visual feedback.

However, while Experiment-1 shows that watching novice dart throwers deteriorates the performance of experts, it does not conclusively show that the deterioration is due to changes in the outcome prediction system. The outcome prediction did significantly improve in Experiment-1, but the performance deterioration may have been unrelated to this prediction change and may have resulted simply from unconscious mimicry (related to the so-called Chameleon effect29) of the observed novice's darts action, which differed from the expert's in both style and variability. To show that the improvement in outcome prediction is indeed the cause of the performance deterioration, we conducted two additional experiments (Experiment-2 and 3) and examined how the performance deterioration is affected when we modulate the improvement in outcome prediction of the observed darts action.

Prediction errors for outcome prediction

The experts could utilize two prediction errors to improve their outcome prediction19,20,30,31 in the OP blocks of Experiment-1. The first was the outcome prediction error - the difference between the outcome predicted by the expert from the observed novice action and the correct outcome provided to him orally by the experimenter (Fig. 3a). Second was the kinematics prediction error - the difference between the kinematics expected by the expert corresponding to the goal he believed the novice aimed for (the center of the board) and the novice kinematics he actually observed (Fig. 3a).

Experiment-2: Watching darts without prediction errors

In Experiment-2 we removed both these prediction errors, expecting this to completely suppress the improvement of outcome prediction in the darts experts. The outcome prediction error was removed by removing the feedback of the correct outcome provided to the experts. The kinematics prediction error, on the other hand, was suppressed by removing the expert's goal belief. We misinformed the expert at the start of the experiment that “the novice does not always aim for the center but aims for unknown targets provided by us and that we display only those trials in which he was successful”. We expected the misinformation to remove any prior goal belief that the expert may have had. As expected, in the absence of prediction errors, the outcome prediction in Experiment-2 (black plot in Fig. 3b) was significantly lower than in Experiment-1 (t(29) = 3.82, p<0.001) and not different from chance. The outcome prediction system was thus little affected in Experiment-2. Importantly, in contrast to Experiment-1, there was no evidence of performance deterioration in Experiment-2 (Fig. 3b, black plot; t(15) = 0.11, p = 0.92). Note that except for the removal of the prediction errors in the OP task, all other conditions, including the observed darts novice videos and the level of darts performance (evaluated as the darts error in VF1; t(29) = 1.91, p = 0.19), were the same between Experiment-1 and Experiment-2. Therefore, the improvement in outcome prediction was the cause of the performance deterioration in Experiment-1.

Experiment-3: Watching darts with either prediction error

Next, through Experiment-3, we investigated the individual contributions of the outcome and kinematics prediction errors to the outcome prediction and performance deterioration. 21 experts participated in this study and were divided into two groups of 11 each (one expert belonged to both groups). The experts in group ‘GOE’ were provided with only the outcome prediction error while the experts in group ‘GKE’ were provided with only the kinematics prediction error (Fig. 3c). While we observed no significant improvement in the experts' ability to predict the novice's action in group GOE (Fig. 3b, pink plot; t(24) = 1.21, p = 0.24) in comparison to Experiment-2, the outcome prediction in group GKE was significantly improved (Fig. 3b, orange plot; t(24) = 2.70, p = 0.013). Correspondingly, we also observed substantial performance deterioration in group GKE, but not in group GOE (Fig. 3b, GOE: t(10) = 0.63, p = 0.54, GKE: t(10) = 2.02, p = 0.07). These observations again show that performance deterioration is regulated by the change in outcome prediction of the observed actions.

Outcome prediction change correlates with performance deterioration across individuals

Our observations across the three experiments exhibited a monotonic relation between outcome prediction change and performance deterioration (Fig. 3b). We next examined whether such a relation was also observed across individuals. The individual data from the experiments where there was motor performance deterioration (Experiment-1 and group GKE of Experiment-3) were pooled because no significant difference was observed in either the outcome prediction change (t(25) = 0.91, p = 0.37) or performance deterioration (t(25) = 0.52, p = 0.60) between the two groups. An individual's performance deterioration was observed to be significantly correlated with his improvement in the outcome prediction of the observed darts action (p<0.033, Fig. 4).
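The across-individual analysis described above can be illustrated with a correlation over pooled per-subject values. The text does not name the correlation statistic, so the Pearson coefficient used here is an assumption, as are the function names; the data are synthetic and only show the shape of the computation.

```python
import math
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    """t statistic (n - 2 degrees of freedom) for H0: no linear correlation."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Synthetic pooled data, one value per expert (illustrative only).
prediction_change = [2.0, 5.0, 8.0, 11.0, 14.0]  # % correct above chance
deterioration = [0.1, 0.3, 0.2, 0.6, 0.5]        # cm, VF2 error minus VF1 error

r = pearson_r(prediction_change, deterioration)
t_stat = r_to_t(r, len(prediction_change))
```

The t statistic can then be compared against a t distribution with n − 2 degrees of freedom to obtain a p-value.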

Figure 4

Across-individual correlation between outcome prediction change and performance deterioration.

Across the subjects of Experiment-1 and the GKE group of Experiment-3 the deterioration was significantly correlated to their individual changes in outcome prediction. The data were pooled across the two experiments because no significant difference was observed in either the outcome prediction change or performance deterioration between the two groups.


Together, these data clearly demonstrate a causal relation between action production and outcome prediction of observed actions by humans. We started by asking darts experts to watch and predict the outcome of the novice throws. We provided them with prediction errors during the prediction task19,20,30,31 to improve their outcome prediction ability. Experiment-1 showed that observation of novice's action does deteriorate the experts' motor performance and that the deterioration is task specific (Fig. 2). In Experiment-2, we then removed the available prediction errors to exhibit that the performance deterioration is present only when the outcome prediction of the observed action is improved, clearly exhibiting that the change in outcome prediction is the cause of the performance deterioration observed in our task.

Next, in Experiment-3, we re-introduced the two prediction errors one at a time (groups GKE and GOE) to show that each affects outcome prediction and the corresponding performance deterioration differently (Fig. 3). In addition, the GOE group of Experiment-3 provided us with an important control to check that the outcome prediction change and the related performance deterioration in Experiment-1 were not a result of attention/motivational factors related to the online outcome feedbacks, which were present in the OP blocks of Experiment-1 and absent in Experiment-2. The subjects in the OP task in Experiment-1 and Experiment-3 (GOE group) are expected to have similar attention/motivation levels, because the GOE group in Experiment-3 had the same task as the experts in Experiment-1: they observed the same novice video, made a prediction of the novice accuracy and got the same outcome feedback as in Experiment-1. However, while darts performance deteriorated in Experiment-1, it did not deteriorate in Experiment-3 (GOE group). This shows that the performance deterioration does not correspond to the attention level during the OP task. Instead, it corresponds to changes in outcome prediction in these two experiments (Fig. 3b).

Overall, across our three experiments we observed a monotonic relation between outcome prediction change and performance deterioration (Fig. 3b) and, across the experiments where there was motor performance deterioration (Experiment-1 and group GKE of Experiment-3), the deterioration was significantly correlated with the individual outcome prediction change exhibited by the subjects (Fig. 4). Therefore, our results show both a crucial causal relation between outcome prediction and performance deterioration and a monotonic relation between the two.

These results have several important ramifications. Primarily, they provide insights into the debate on the involvement of the motor system in action understanding3,11,18. Previous neurophysiological monkey4,5,32,33 and human6 studies that have exhibited motor neural activities correlated with action understanding have been unable to concretely prove that the observed activations are functionally involved in the action understanding process and not a consequence of some correlated epiphenomena. In contrast, our study directly modulated one function by inducing learning and revealed its effect on the other: changes in a component of action understanding, that is outcome prediction, affect motor action after observation. On the other hand, the behavioral nature of the study prevents us from making conclusions about the spatiotemporal characteristics of the neural processes involved in our task. For this reason we are unable to comment on the role of the mirror neuron system in our results. Furthermore, while we observed effects on the motor system, our results do not clarify which specific processes, from motor planning to motor execution34,35, are affected. Nonetheless, what our results do provide is behavioral evidence for the involvement of at least parts of the motor system in one component of action understanding by humans.

Moreover, our results give behavioral support for the presence of hierarchical action understanding in humans. The novice darts action that the experts watched during our study can be understood at multiple levels18,19,36: in terms of the task it corresponds to (darts or bowling in our case), in terms of the goal aimed for (bull or other targets), in terms of the kinematics (e.g. arm trajectory) and finally in terms of the outcome the action would lead to (landing position of the darts on the board). We asked the experts to understand the mapping between the kinematics and the corresponding outcomes in the observed novice. Therefore, while the outcome prediction was expected to change in the presence of outcome feedback, it was interesting to note that the outcome prediction is affected more in the presence of the goal belief (orange plot in Fig. 3b). This effect of goal belief on outcome prediction supports the idea that outcome prediction7,16,17 utilizes multiple layers of the action understanding system18,19. Furthermore, we can observe some interesting differences in the understanding dynamics due to each of the prediction errors (Supplementary Fig. S1 online). While the presence of a goal belief leads to a fast increase in outcome prediction in the very first OP block (orange trace), outcome prediction improvement due to the outcome feedback (pink trace) is slower and continues through the experiment. On the other hand, in the presence of both the prediction errors (red trace), the outcome prediction increases faster than when either feedback is available alone. These observations corroborate a hierarchical process of action understanding in humans where a representation of the action at each level is updated to reduce a difference between both top-down and bottom-up predictions, ensuring appropriate action understanding19.

Finally, our result that watching a novice's action deteriorates an expert's motor performance provides interesting insight into the action-observation dynamics in human behaviors. Our social skills, from gestures for communication and sports to driving a car safely, are critically dependent on our ability to understand observed actions performed by others and take appropriate actions ourselves37. Our results caution that watching an action can distort one's own action. The distortion is probably small in many cases but may be significant enough for professional sports like darts, golf and baseball, where each throw or hit significantly influences results. Our data therefore suggest that sports professionals should avoid watching fellow players during games to maintain optimal performance, especially when the performance of the fellow player is worse than their own. This is something that Ichiro Suzuki may have realized by experience over his playing career, and it may have been the true reason behind his seemingly inconsiderate comments.



27 subjects- 22 darts ‘experts’ (22 males, aged 22–48), 3 novice darts throwers (3 males, aged 30–35) and 3 novice bowlers (3 males, aged 30–34) took part in our study (one person took part in our study as both a novice darts thrower and a novice bowler). The darts ‘experts’ were defined as players with a rank of A or above on the International online darts game scale (VSPHOENIX: The novice darts players were all individuals who threw darts for the first time. The novice bowlers were individuals who had bowled less than 5 times in their life. All experiments were conducted according to the principles in the Declaration of Helsinki. The subjects gave informed consent prior to the experiment and the experiments were approved by the local ethics committee in National Institute of Information and Communications Technology.

Experimental apparatus and design

The throws made by the novice dart throwers and novice bowlers were video recorded from behind (and to the right of) the subjects. Subjects made 36 throws each in which they targeted either the darts board center (in case of darts) or a strike (in case of bowling). The video recorded throws from each novice were shuffled and used to create a series of 120 throws for each darts novice and novice bowler. Furthermore, part of the video was masked such that the viewer could see all the moving limbs of the novice action and the ball/darts release but was not able to see either the ball/darts trajectory or the outcome on the darts board/bowling pins (see snapshot of video in Fig. 1b). The novices in both cases were asked not to show any expressions after their throws. Furthermore, the recorded novice video was checked and edited to remove any throws that still contained some expressions, such that the expert could not judge the novice performance based on any information other than their action kinematics.

We conducted three experiments in the study. Each experiment consisted of three tasks- Dart throwing without visual feedback (nVF), darts throwing with visual feedback (VF) and the observation-prediction task (OP). The experiment tasks are described in detail in the next section.

The experts used their own darts in the experiments. Individuals who forgot their set were provided with darts tips from Bottleson darts company ( A ‘hard dart’ board (SB 3001 from Puma darts products Ltd.; was utilized in the study.

Experiment conditions

Darts throwing without visual feedback (nVF)

In this condition, the experts threw darts aimed at the center of a darts board but were prevented from seeing where each of their throws landed on the board (Fig. 1a). This was achieved by switching off the room light during each throw. The room light was switched off with a switch that the experts held between the palm and fingers of their left hand and operated while throwing the darts. They practiced using the switch before starting the experiment.

Each throw trial started with the room light ‘on’, which allowed the expert to take aim at the darts board. The experts held the light switch in their closed left hand such that the switch was kept on and were asked to open their left hand at the moment they released the dart with their right hand. This switched off the room light and prevented them from viewing where their dart landed. A specialized experiment room with black curtains ensured that the room was dark when the light was switched off and that the dart flight trajectory and darts board were not visible to the subjects. They were asked to turn around immediately after they threw a dart, while the room light was still off. The light was then turned on and the expert was asked to ‘self-judge’ his performance, that is, to mark the position where he expected his dart to have landed on a similar darts board pasted on the back wall of the room. During this same period, one of the experimenters measured the real accuracy of the expert's throw and removed the dart from the board. The expert then turned back, was provided with a new dart and went on to make the next throw. Each nVF block included 10 dart throws. Note that the data from the self-judgment task are not utilized in this manuscript.

Darts throwing with full visual feedback (VF)

The experts made dart throws in this condition (similar to nVF) but with the room light kept on throughout the condition (Fig. 1a). However, to equalize the conditions they were asked to operate the light switch similar to the nVF condition, even though this did not make any change in the lighting conditions. Each VF block included 10 dart throws and there was no ‘self-judgment’ during the VF condition.

Novice action observation-prediction (OP)

In the OP conditions, the experts sat comfortably on a chair and viewed the video of either a novice darts thrower, or a novice bowler on the computer monitor kept on the table in front of them (Fig. 1b). Each expert viewed either bowling or darts from a single novice throughout any one experiment session. They were shown the video of one novice trial (either darts or bowling) at a time and asked to predict the outcome of the throw by watching the novice action. They were instructed to write down the prediction on a sheet of paper provided to them. There were 11 possible outcomes in the bowling task, from 0 pins to 10 (or ‘strike’). In the darts task, to equalize the difficulty, we divided the darts board into 11 parts as well. The divisions are shown in Fig. 1b. There was no time limit on the OP task. Once the expert announced that he had written his prediction, he was provided with the correct answer orally by an experimenter. They were asked to write the correct answer besides their prediction. Following this, they went on to see the video of the next throw. This process was repeated for each throw. Each OP block included either 15, 30 or 45 throws to view. In total, each expert viewed 120 trials through four OP blocks in each experiment session.
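The scoring implied by the 11-outcome design can be sketched as below. The chance level (1/11, about 9.09%) comes from the text; the function names are hypothetical, and treating "correct predictions above chance" as percent correct minus the chance level is an assumption about how the reported measure was computed.

```python
N_OUTCOMES = 11  # 0 to 10 pins in bowling; 11 board divisions in darts

def chance_level_percent(n_outcomes=N_OUTCOMES):
    """Percent correct expected from uniform random guessing."""
    return 100.0 / n_outcomes

def percent_correct_above_chance(predictions, outcomes, n_outcomes=N_OUTCOMES):
    """Percent of exact matches minus the chance level (assumed definition)."""
    if not predictions or len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must be non-empty and equal in length")
    hits = sum(p == o for p, o in zip(predictions, outcomes))
    return 100.0 * hits / len(predictions) - chance_level_percent(n_outcomes)

# Hypothetical example: 2 of 4 predictions match the announced outcome.
score = percent_correct_above_chance([1, 2, 3, 4], [1, 2, 0, 0])
```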

Experiment time-line

Practice session

All experiments started with a practice period in which the experts were first allowed to take their time and throw darts to acclimatize themselves to our experimental environment. This was followed by instructions on the use of the light switch. Following this, the expert was again allowed to practice their throws (but while operating the switch) until they felt that the light switch did not interfere with their concentration. The operation of the light switch during practice switched off one light in the room while the other remained on, allowing the experts to have visual feedback throughout the practice period. All experts felt comfortable with the switch after less than 15 minutes of practice. Finally, the experts were instructed on the OP task and also allowed to practice 3 throws in which all the lights in the room were switched off by their switch.

Experiment session

Each experiment session required a dart expert to throw seventy darts (aimed for the center of the darts board) over a VF block, followed by five nVF blocks which were interleaved with 4 OP blocks and finally ending with a second VF (see Fig. 1c). The nVF blocks were used to prevent visual correction by the experts and helped in magnifying the effects of the OP task on their darts performance.
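The block order of a session can be written out explicitly to check the throw count. This is a small sketch: the labels nVF1 to nVF5 and OP1 to OP4 are naming assumptions introduced here, and the sizes of individual OP blocks are omitted because the text only states that they contained 15, 30 or 45 throws each.

```python
THROWS_PER_BLOCK = 10  # each VF and nVF block included 10 dart throws
N_NVF_BLOCKS = 5

def session_blocks():
    """Block order: VF1, five nVF blocks interleaved with four OP blocks, VF2."""
    blocks = ["VF1"]
    for i in range(1, N_NVF_BLOCKS + 1):
        blocks.append(f"nVF{i}")
        if i < N_NVF_BLOCKS:  # an OP block sits between consecutive nVF blocks
            blocks.append(f"OP{i}")
    blocks.append("VF2")
    return blocks

schedule = session_blocks()
throwing_blocks = [b for b in schedule if not b.startswith("OP")]
total_throws = THROWS_PER_BLOCK * len(throwing_blocks)
```

Seven throwing blocks of 10 darts give the seventy throws per session stated above.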

We conducted three experiments with two sessions in each. In each session and in all the experiments, the order of blocks remained as mentioned above with changes only in the OP tasks (either in the observed video (darts/bowling) and/or in the instructions). The two sessions on the same day were separated by a 20 minute break followed by a practice period and light switch training similar to that at the beginning of the first session.


Experiment-1 extended over two days with two sessions each day. Half of the 16 subjects who participated in this experiment had a session of darts observation, followed by a session of bowling observation on the first day and then the vice-versa on the second day (see Supplementary Table S1 online). The remaining subjects had the opposite order of session on each day. In the beginning of each session, the expert subject was clearly instructed that the novice in the video they observed aims for the center of the board (if it was a session of darts observation) or a strike (if it was a session of bowling observation). Furthermore, during all OP blocks the experts were provided with the feedback of the correct answers.

The two experiment days for any subject were separated by at least 4 days. Note that only data from the first session of each day are reported in this manuscript. The second-session data on each day of Experiment-1 were collected to investigate interference effects and are not utilized in this manuscript.


Experiment-2 extended over a single session and involved 16 subjects (11 of them had participated in Experiment-1 and the other 5 subjects were new recruits) (see Supplementary Table S1 online). The OP blocks in Experiment-2 utilized only darts observation, and each expert saw the video of a different novice from the one he had observed in Experiment-1. Experiment-2 differed from Experiment-1 in two critical aspects. First, even though each expert in Experiment-2 saw the darts video of a novice trying to hit the center (same as in Experiment-1), he was explicitly given the false instruction that “the novice do not always aim for the center but aim for unknown targets provided by us and we display only those trials in which they were successful”. This misinformation helped us remove any prior belief an expert may have had about the novice's aim. Second, no oral feedback was provided to the experts after each prediction in the OP blocks. These two changes helped us remove the prediction error feedbacks (see Results section) available to the subject.


Experiment-3 involved 21 experts. All experts except one (expert #22 in Supplementary Table S1 online) had previous experience in at least one of Experiment-1 or Experiment-2 before participating in this experiment. The experts were divided into two groups of 11 each: groups GOE and GKE (expert #22 belonged to both groups). The experts in group GOE observed the darts video without a prior goal belief (they were given the same false instruction as in Experiment-2) but DID get outcome feedback after each prediction in the OP blocks. The experts in group GKE observed the darts video and were told that it showed novice subjects aiming for the center (they were not misinformed) but were NOT provided with outcome feedback after each prediction (see Fig. 3c and Supplementary Table S1 online). The second-session data from 4 subjects (#12-15) in Experiment-3 were used to study the effects of viewing another expert (instead of a novice) and are not discussed in this study.
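The darts OP conditions across the three experiments differ in which of the two prediction error sources (prior goal belief and outcome feedback) was available. A minimal sketch of that design as a lookup table follows; the class name, field names, and dictionary keys are hypothetical labels chosen for illustration, while the true/false values are taken from the experiment descriptions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OPCondition:
    """Information available to the expert during the darts OP blocks."""
    goal_belief: bool       # expert was told the novice aims for the center
    outcome_feedback: bool  # expert received the correct answer after each prediction


# Hypothetical labels; values follow the descriptions of each experiment.
CONDITIONS = {
    "Experiment-1 (darts)": OPCondition(goal_belief=True, outcome_feedback=True),
    "Experiment-2": OPCondition(goal_belief=False, outcome_feedback=False),
    "Experiment-3 GOE": OPCondition(goal_belief=False, outcome_feedback=True),
    "Experiment-3 GKE": OPCondition(goal_belief=True, outcome_feedback=False),
}
```

Laid out this way, the four conditions cover every combination of the two factors exactly once.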

Data Analysis

Dart performance analysis

The performance of the experts in each VF and nVF block was measured as the unsigned distance of each throw from the center of the board, where they were asked to aim. The performance of a subject in an nVF block was evaluated as the distance from the board center averaged over the ten throws. The performance was averaged across the subjects for each block and plotted in Fig. 2 (with the standard error across subjects plotted as the error bars). In addition, the mean data from the nVF blocks of each subject were fitted with a regression line to indicate the performance trend. The average trend over the subjects was plotted as the solid line in Fig. 2.
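The block-level measure and the trend fit described above can be illustrated with a small sketch. It assumes that each throw is recorded as an (x, y) landing position relative to the board center and that the regression line is an ordinary least-squares fit against the nVF block index; neither detail is specified in the text, and all function names are hypothetical.

```python
import math


def unsigned_distance(x, y):
    """Unsigned (Euclidean) distance of a landing position from the board center."""
    return math.hypot(x, y)


def block_mean_error(hits):
    """Mean unsigned distance over the throws of one block; hits is a list of (x, y)."""
    return sum(unsigned_distance(x, y) for x, y in hits) / len(hits)


def linear_trend(block_means):
    """Least-squares slope and intercept of block means against block index 1..n."""
    n = len(block_means)
    xs = list(range(1, n + 1))
    x_mean = sum(xs) / n
    y_mean = sum(block_means) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, block_means))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean
```

With error measured as distance from the center, a positive slope over the five nVF block means would indicate that throws drifted further from the center as the session progressed.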

The performance deterioration was defined as the change in performance between the two VF blocks, before and after the OP blocks. We used the VF blocks because, with visual feedback, they were closest to real-life darts throwing. The performance in the first VF block (the baseline performance) was evaluated as the distance averaged only over the last three throws of the block, to ensure that the expert's performance had stabilized (we observed a lot of variance in the first trials of the experiment, probably due to anxiety). By the last three VF trials, the subject performance was similar across all the experiments (one-way ANOVA, F(4,65) = 0.74, p = 0.57). The performance in the second VF block, on the other hand, was evaluated as the distance averaged over the first five throws (half of the throws) to minimize artifacts due to the trial-by-trial feedback corrections performed by the experts in the presence of visual feedback. The change in performance between the VF blocks of each subject was averaged across the subjects of an experiment to represent the performance deterioration in that experiment. These values were plotted in Fig. 3b. Error bars represent standard error.
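As a sketch of this calculation, the functions below take the per-throw distances of the two VF blocks and return the change between them, then summarize the changes across subjects. The function names are hypothetical, the sign convention (positive means a larger error in the second VF block) is an assumption, and the standard error is computed here from the sample standard deviation, which the text does not specify.

```python
import math
import statistics


def performance_deterioration(first_vf, second_vf):
    """Change in mean error between the two VF blocks of one subject.

    first_vf and second_vf are lists of per-throw distances from the center.
    Assumed sign convention: positive means performance got worse.
    """
    baseline = statistics.mean(first_vf[-3:])  # last three throws of the first VF block
    post = statistics.mean(second_vf[:5])      # first five throws of the second VF block
    return post - baseline


def mean_and_standard_error(changes):
    """Across-subject mean and standard error (sample SD / sqrt(n), an assumption)."""
    mean_change = statistics.mean(changes)
    sem = statistics.stdev(changes) / math.sqrt(len(changes))
    return mean_change, sem
```

For example, a subject whose last three baseline throws averaged 2 units from the center and whose first five throws in the second VF block averaged 4 units would have a deterioration of 2 units under this convention.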

Observation-prediction performance analysis

Both the darts OP task and the bowling OP task required the experts to choose their prediction from 11 possible outcomes in each trial. A prediction was deemed successful only if the predicted darts zone or the predicted number of bowling pins felled matched the correct outcome. Therefore the chance level for both OP tasks was 1/11 × 100 = 9.09%. The outcome prediction change was evaluated as the percentage of total correct predictions above chance (see supplementary methods in the Supplementary Material online). This value was averaged across the subjects for each experiment. One subject (sub #4 in Table S1 in the Supplementary Material online) was excluded from the analysis of the OP task in Experiment-2 because he missed writing down his predictions in some trials, leading to a mismatch in the answer sheet between the presented videos and his answers.
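The chance-level arithmetic can be checked with a few lines of code. The second function is only one plausible reading of "percentage of correct predictions above chance" (percent correct minus the chance level); the exact definition is given in the supplementary methods and may differ. Function names are hypothetical.

```python
N_OUTCOMES = 11  # number of possible answers in both OP tasks


def chance_level_percent(n_outcomes=N_OUTCOMES):
    """Chance level, in percent, when guessing uniformly among n_outcomes."""
    return 100.0 / n_outcomes


def percent_correct_above_chance(n_correct, n_trials, n_outcomes=N_OUTCOMES):
    """Percent correct minus chance level (assumed definition, see lead-in)."""
    return 100.0 * n_correct / n_trials - chance_level_percent(n_outcomes)
```

Under this reading, a subject answering exactly at chance scores zero, and any positive value reflects prediction ability beyond guessing.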