Introduction

Sense of agency (SoA) is the registration1 that the self initiates actions to influence its external environment2. It therefore accompanies voluntary actions3,4,5,6, allows one to feel distinct from others7,8,9, and allows one to feel responsible for one’s own actions2,6,10,11. Studies show that SoA emerges from, and is particularly sensitive to any disruption in, the congruous flow from intentional actions to expected sensory outcomes12. Crucially, the degradation of this experience characterizes certain psychiatric and neurological disorders13,14,15. For example, studies show that schizophrenic patients tend to attribute someone else’s actions to themselves. Despite its significance16,17,18, the literature still lacks the computational principles that can elucidate SoA.

We theorize SoA as the confidence in one’s perception of the action-outcome effect being consistent (e.g., spatially or temporally) with the hypothesis that the action caused the outcome. We adapted the model of Sato et al.19, which was originally used to explain the ventriloquism effect as a Bayesian estimate of a common source behind the consistency of the audiovisual stimuli, akin to a common cause20 of the audiovisual integration. Formalizing SoA by this Bayesian psychophysics principle distinguishes our theory from existing works.

We compared the predictions of our model with the results of two pertinent intentional binding studies. Intentional binding, the perceived compression of the time interval between a voluntary action and its outcome, has been reported as a reliable implicit measure of SoA and has been used in a large number of studies that analyze the temporal perception of action-outcome effects and the nature of SoA21. The seminal experiment of Haggard et al.3 investigated perceived action-outcome timing in three conditions: voluntary, wherein the subject intentionally presses a button; involuntary, wherein muscle twitches of the subject’s hand are induced by transcranial magnetic stimulation (TMS) applied to the motor cortex; and sham TMS, wherein TMS applied to the parietal cortex produces audible clicks but no movement (hereafter, voluntary, involuntary, and sham conditions, respectively). Haggard and colleagues computed the time interval between the perceived timing of the first event (a voluntary action, a muscle twitch, or, as a control, an audible TMS click) and the perceived timing of the subsequent tone. They showed that voluntary actions produced intentional binding, involuntary muscle twitches produced repulsion, i.e., a perceived lengthening of the action-outcome interval, and audible TMS clicks produced neither binding nor repulsion. Hence, they posited that intentionality is necessary to achieve action-outcome binding.

The second study is that of Wolpe et al.22, which investigated the contribution of cue integration to intentional binding by manipulating the reliability of the consequent tone relative to a background white noise. This manipulation resulted in three tone uncertainty conditions, namely low, intermediate, and high uncertainty. Their analyses showed that when tone reliability was reduced, the perceptual shift in tone timing towards the action increased.

Although Bayesian integration was proposed as a general principle behind SoA14,23, it was unknown whether the observed action-outcome temporal compression and repulsion effects are consistent with Bayesian principles and, if so, how. Our Bayesian model reproduces the above empirical results on intentional binding based on a computational principle. Further, it goes beyond timing estimations by exposing the underlying Bayesian mechanisms that possibly drove the temporal binding. Our Bayesian model explains that the perceived compressed action-outcome time interval is more consistent with the prior belief in the causal role of one’s action in producing the immediate outcome and thus increases the confidence in the Bayesian estimate assuming the causal case, which we model as SoA. Moreover, our model explains intentional binding as a specific class of the more general notion of causal binding. Our Bayesian model predicts that intentional binding generally happens on a per-trial basis, yielding a bimodal distribution of the perceived action-outcome interval. Lastly, the model also predicts that if the sensory input signals are perceived as reliable (precise), SoA may arise even for unintended actions, which serves as a testable prediction for future SoA experiments.

Results

Bayesian inference model of action-outcome temporal binding

We considered the experimental setup of intentional binding where a subject presses a button (i.e., the action) and a tone (i.e., the outcome) sounds 250 ms after the button press. The true action and outcome timings are thus described by \(t_{\mathrm{{A}}}^ \ast\) = 0 ms and \(t_{\mathrm{{O}}}^ \ast\) = 250 ms, respectively, but they are unknown to the subject. The task for the subject is to accurately report her perceived timings of the button press and tone. We assume the arrival of relevant sensory input informing the timing of each of these physical events involves sensory delay d and jitter of variance \(\sigma ^2\) due to sensory noise. Thus, the arrival time τA of sensory input that signals the action timing is assumed to be generated from a Gaussian distribution, \({\cal{N}}\left( {t_{\mathrm{{A}}}^ \ast + d_{\mathrm{{A}}},\sigma _{\mathrm{{A}}}^2} \right)\), with mean \(t_{\mathrm{{A}}}^ \ast + d_{\mathrm{{A}}}\) and variance \(\sigma _{\mathrm{{A}}}^2\). Similarly, the arrival time τO of sensory input that signals the outcome timing is generated from \({\cal{N}}\left( {t_{\mathrm{{O}}}^ \ast + d_{\mathrm{{O}}},\sigma _{\mathrm{{O}}}^2} \right)\).

The brain often resolves such ambiguity in sensory inputs by integrating multiple sensory cues akin to the Bayesian “ideal observer”24. Hence, we model a Bayesian observer who estimates action timing tA and outcome timing tO based on the corresponding noisy sensory inputs arriving at time τA for the action and τO for the outcome. The conditional probability distributions of τA and τO that the Bayesian observer uses are modeled as Gaussian distributions

$$\begin{array}{l} P\left( {\tau _{\mathrm{{A}}}|t_{\mathrm{{A}}}} \right) \propto \exp \left( { - \frac{{\left( {\tau _{\mathrm{{A}}} - t_{\mathrm{{A}}}} \right)^2}}{{2\sigma _{\mathrm{{A}}}^2}}} \right), \\ P\left( {\tau _{\mathrm{{O}}}|t_{\mathrm{{O}}}} \right) \propto \exp \left( { - \frac{{\left( {\tau _{\mathrm{{O}}} - t_{\mathrm{{O}}}} \right)^2}}{{2\sigma _{\mathrm{{O}}}^2}}} \right), \end{array}$$
(1)

with mean tA and tO, and variance \(\sigma _{\mathrm{{A}}}^2\) and \(\sigma _{\mathrm{{O}}}^2\) for action and outcome, respectively. It is noteworthy that sensory delays dA and dO are not included in Eq. (1) for the reason we describe in the next paragraph.

Before studying the binding effect, let us consider two simple baseline conditions. In the first, the subject reports the action timing without the presentation of an outcome tone. If no prior knowledge is available, the Bayesian observer reports the action timing that maximizes the conditional probability distribution in Eq. (1). Hence, the estimated action timing \(\hat t_{\mathrm{{A}}} = \tau _{\mathrm{{A}}}\) is solely determined by the noisy sensory input informing the action timing. In this case, the model predicts that the distribution of \(\hat t_{\mathrm{{A}}}\) is \({\cal{N}}\left( {t_{\mathrm{{A}}}^ \ast + d_{\mathrm{{A}}},\sigma _{\mathrm{{A}}}^2} \right)\). The mean and SD of \(\hat t_{\mathrm{{A}}}\) in the baseline condition were experimentally reported, e.g., Haggard’s results in the voluntary condition suggest dA = 6 ms and σA = 66 ms (refer to Table 1 in Methods for all condition-based dA and σA values). Importantly, we assume that the observer is unable to take into account the sensory delay dA in Eq. (1): if the Bayesian observer included its effect, it could compensate for this delay and report unbiased timing, which was not the case in the experiment. In the second baseline condition, the subject passively listens to a tone and reports its timing. This case parallels the one above, and the model predicts that the estimated tone timing is \(\hat t_{\mathrm{{O}}} = \tau _{\mathrm{{O}}}\), which is distributed according to \({\cal{N}}\left( {t_{\mathrm{{O}}}^ \ast + d_{\mathrm{{O}}},\sigma _{\mathrm{{O}}}^2} \right)\). Comparing this prediction with Haggard’s experiment suggests, e.g., dO = 15 ms and σO = 72 ms (refer to Table 1).
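The baseline prediction can be illustrated with a minimal simulation sketch. It assumes only the generative model stated above and the voluntary-condition values quoted in the text; the function name `sample_baseline`, the trial count, and the random seed are illustrative choices, not part of the original study.

```python
import numpy as np

# Values quoted in the text for the voluntary condition (Haggard et al.).
T_A_TRUE, T_O_TRUE = 0.0, 250.0   # true action and tone times (ms)
D_A, SIGMA_A = 6.0, 66.0          # action sensory delay and jitter SD (ms)
D_O, SIGMA_O = 15.0, 72.0         # tone sensory delay and jitter SD (ms)

def sample_baseline(n_trials, seed=0):
    """Draw sensory arrival times; without a prior, the estimate equals the input."""
    rng = np.random.default_rng(seed)
    tau_a = rng.normal(T_A_TRUE + D_A, SIGMA_A, size=n_trials)
    tau_o = rng.normal(T_O_TRUE + D_O, SIGMA_O, size=n_trials)
    return tau_a, tau_o           # t_hat_A = tau_A and t_hat_O = tau_O

tau_a, tau_o = sample_baseline(100_000)   # hypothetical number of trials
err_a = tau_a - T_A_TRUE                  # baseline judgment error, action
err_o = tau_o - T_O_TRUE                  # baseline judgment error, tone
print(f"action error: mean={err_a.mean():.1f} ms, SD={err_a.std():.1f} ms")
print(f"tone error:   mean={err_o.mean():.1f} ms, SD={err_o.std():.1f} ms")
```

With enough trials, the sample means and SDs of the judgment errors should approach dA, σA, dO, and σO, which is why these four parameters can be read off the reported baseline errors.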

Table 1 List of Bayesian model parameters and their values

Next we study the effect of binding when the subject makes an action and then listens to the outcome tone, commonly referred to as the operant condition. In this case, the Bayesian observer makes an inference not only based on the conditional probability distribution in Eq. (1) but also based on the prior distribution of tA and tO. Adapting the Bayesian model of the ventriloquism effect19, we assume the prior distribution depends on the observer’s belief whether the action caused the outcome, i.e., the causal case: ξ = 1, or the action and the outcome are unrelated, i.e., the acausal case: ξ = 0:

$$P\left( {t_{\mathrm{{A}}},t_{\mathrm{{O}}}|\xi } \right) \propto \left\{ {\begin{array}{*{20}{l}} {\exp \left( { - \frac{{\left( {t_{\mathrm{{O}}} - t_{\mathrm{{A}}} - \mu _{{\mathrm{{AO}}}}} \right)^2}}{{2\sigma _{{\mathrm{{AO}}}}^2}}} \right),} \hfill & {(\xi = 1)} \hfill \\ {1.} \hfill & {(\xi = 0)} \hfill \end{array}} \right.$$
(2)

The action causes the outcome in the causal case (ξ = 1) so that the outcome timing involves a typical delay μAO with respect to the action timing and a Gaussian-distributed jitter of SD σAO. The outcome is caused by something other than the action in the acausal case (ξ = 0) so that tA and tO are independent. Lastly, we define P(ξ) as the prior for each belief: P(ξ = 1) for the causal case and P(ξ = 0) = 1 − P(ξ = 1) for the acausal case. We hypothesize the estimation of ξ to be essential for the perception of causality and SoA (explained below).

Given a pair of sensory inputs at τA and τO, the Bayesian observer estimates the most probable timing for the action and the outcome, and whether these observations are consistent with the causal case. According to Bayesian estimation theory, the maximum-a-posteriori (MAP) estimate (\(\hat t_{\mathrm{{A}}},\hat t_{\mathrm{{O}}},\hat \xi\)) of the pair of physical event timings (tA, tO) and the causal variable ξ is given by

$$\left( {\hat t_{\mathrm{{A}}},\hat t_{\mathrm{{O}}},\hat \xi } \right) = \arg \mathop {{{\mathrm{max}}}}\limits_{t_{\mathrm{{A}}},t_{\mathrm{{O}}},\xi } P\left( {t_{\mathrm{{A}}},t_{\mathrm{{O}}},\xi |\tau _{\mathrm{{A}}},\tau _{\mathrm{{O}}}} \right),$$
(3)

where P(tA, tO, ξ|τA, τO) is the posterior probability distribution of (tA, tO, ξ) given the sensory inputs (τA, τO). Hence, whether the Bayesian observer estimates the action-outcome effect to be causal or not depends on the posterior ratio comparing the causal case (ξ = 1) and the acausal case (ξ = 0), namely

$$r \equiv \mathop {{{\mathrm{max}}}}\limits_{t_{\mathrm{{A}}},t_{\mathrm{{O}}}} P\left( {t_{\mathrm{{A}}},t_{\mathrm{{O}}},\xi = 1|\tau _{\mathrm{{A}}},\tau _{\mathrm{{O}}}} \right){\mathrm{/}}\mathop {{{\mathrm{max}}}}\limits_{t_{\mathrm{{A}}},t_{\mathrm{{O}}}} P\left( {t_{\mathrm{{A}}},t_{\mathrm{{O}}},\xi = 0|\tau _{\mathrm{{A}}},\tau _{\mathrm{{O}}}} \right).$$
(4)

Causality is detected if the confidence in the causal estimate is greater than that in the acausal case, i.e., r > 1. The MAP estimate of Eq. (3) is then given by (see Methods for the derivation)

$$\left( {\hat t_{\mathrm{{A}}},\hat t_{\mathrm{{O}}},\hat \xi } \right) = \left\{ {\begin{array}{*{20}{c}} {\left( {\tau _{\mathrm{{A}}} + \frac{{\sigma _{\mathrm{{A}}}^2}}{{\sigma _{{\mathrm{{tot}}}}^2}}\left( {\tau _{\mathrm{{O}}} - \tau _{\mathrm{{A}}} - \mu _{{\mathrm{{AO}}}}} \right),\tau _{\mathrm{{O}}} - \frac{{\sigma _{\mathrm{{O}}}^2}}{{\sigma _{{\mathrm{{tot}}}}^2}}\left( {\tau _{\mathrm{{O}}} - \tau _{\mathrm{{A}}} - \mu _{{\mathrm{{AO}}}}} \right),1} \right),} & {(r > 1)} \\ {\left( {\tau _{\mathrm{{A}}},\tau _{\mathrm{{O}}},0} \right),} & {(r < 1)} \end{array}} \right.$$
(5)

with \(\sigma _{{\mathrm{{tot}}}}^2 \equiv \sigma _{\mathrm{{A}}}^2 + \sigma _{\mathrm{{O}}}^2 + \sigma _{{\mathrm{{AO}}}}^2\). This indicates, on the one hand, that no perceptual shift happens if causality is not detected (\(\hat \xi = 0\)): the time estimates for action and outcome simply reflect the corresponding sensory signals in this case. On the other hand, a perceptual shift happens if causality is detected (\(\hat \xi = 1\)): the action and outcome timings attract each other in the form of binding if τO − τA > μAO and repel each other in the form of repulsion if τO − τA < μAO. The magnitude of the perceptual shift for the action and outcome timing depends on the coefficients \(\sigma _{\mathrm{{A}}}^2/\sigma _{{\mathrm{{tot}}}}^2\) and \(\sigma _{\mathrm{{O}}}^2/\sigma _{{\mathrm{{tot}}}}^2\), respectively, implying that the perceptual shift is greater for a more unreliable stimulus. This model predicts that the occurrence of binding, repulsion, or no perceptual shift is trial-dependent, influenced by the noisy sensory signal τO − τA informing the action-outcome interval. We denote the probability of detecting causality (i.e., \(\hat \xi\) = 1) by Pc (see Methods for its analytical expression). Pc increases with larger P(ξ = 1) and smaller σAO if \(\sigma _{{\mathrm{{AO}}}} \ll \sigma _{\mathrm{{A}}},\sigma _{\mathrm{{O}}}\).
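The timing part of Eq. (5) can be sketched in a few lines. This is only an illustration: the causal decision (whether r > 1) is passed in as an argument because its analytical form is given in Methods and is not reproduced here, and the names `map_estimate` and `map_by_linear_solve` are hypothetical. The second function independently recovers the causal-case estimate by setting the gradient of the exponent built from Eqs. (1) and (2) to zero, as a check on the closed form.

```python
import numpy as np

def map_estimate(tau_a, tau_o, causal, sigma_a, sigma_o, sigma_ao, mu_ao):
    """Timing estimates of Eq. (5), given the causal decision xi_hat."""
    if not causal:                       # r < 1: estimates follow the inputs
        return tau_a, tau_o
    var_tot = sigma_a**2 + sigma_o**2 + sigma_ao**2
    residual = tau_o - tau_a - mu_ao     # mismatch with the expected interval
    t_a = tau_a + sigma_a**2 / var_tot * residual
    t_o = tau_o - sigma_o**2 / var_tot * residual
    return t_a, t_o

def map_by_linear_solve(tau_a, tau_o, sigma_a, sigma_o, sigma_ao, mu_ao):
    """Check: stationary point of the causal log-posterior (a 2x2 linear system)."""
    wa, wo, wao = 1 / sigma_a**2, 1 / sigma_o**2, 1 / sigma_ao**2
    lhs = np.array([[wa + wao, -wao],
                    [-wao, wo + wao]])
    rhs = np.array([wa * tau_a - wao * mu_ao,
                    wo * tau_o + wao * mu_ao])
    return np.linalg.solve(lhs, rhs)

# Illustrative inputs: sensory arrivals at their voluntary-condition means.
params = dict(sigma_a=66.0, sigma_o=72.0, sigma_ao=10.0, mu_ao=230.0)
t_a, t_o = map_estimate(6.0, 265.0, True, **params)
print(t_a, t_o, t_o - t_a)
print(map_by_linear_solve(6.0, 265.0, **params))
```

For these inputs the sensory interval (259 ms) exceeds μAO, so the action estimate moves later and the tone estimate moves earlier (binding), and the estimated interval lands close to μAO because σAO is small relative to σA and σO.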

Proposed measure of SoA

Separate from the judgement of causality described above, we also directly quantify the confidence in the causal MAP estimate

$${\mathrm{{CCE}}} = \mathop {{{\mathrm{max}}}}\limits_{t_{\mathrm{{A}}},t_{\mathrm{{O}}}} P(t_{\mathrm{{A}}},t_{\mathrm{{O}}},\xi = 1|\tau _{\mathrm{{A}}},\tau _{\mathrm{{O}}})$$
(6)

and we postulate this quantity to be a possible indication of the pre-reflective feeling of agency (FoA; see Discussion). The analytical expression of the confidence in causal estimate (CCE) in Methods yields the following requirements for a high CCE: (i) the timing of sensory signals must be consistent with the causation of the outcome by the action, namely τO − τA ≈ μAO; (ii) the causal prior probability P(ξ = 1) must be high; (iii) the sensory inputs must be precise, i.e., the amplitudes σA and σO of sensory jitter must be small enough. Furthermore, because it is computed as the peak of the posterior probability distribution rather than by integrating over tA and tO, CCE not only indicates the causation of the outcome by the action but is also sensitive to the accuracy of the action and outcome timing estimates. We therefore posit SoA as an encapsulation and manifestation of several pertinent aspects, which include temporal consistency in the action-outcome effect, the prior belief of an action causing the outcome, and the reliability of the perceived sensory signals. Hence, our Bayesian model coherently explains not just SoA that arises from the causation of the outcome by the action but also one that is influenced by the reliability of the different agency cues, that is, a precision-dependent causal agency.
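The step behind requirements (i) and (ii) can be made explicit with a short worked sketch. It uses only Eqs. (1), (2), and (5) and drops all normalization constants, which is an assumption of this sketch; the omitted factors may themselves depend on σA and σO, and the complete expression is the one given in Methods. Substituting the causal estimate of Eq. (5) into the exponents of Eqs. (1) and (2) gives

$$\frac{{\left( {\tau _{\mathrm{{A}}} - \hat t_{\mathrm{{A}}}} \right)^2}}{{2\sigma _{\mathrm{{A}}}^2}} + \frac{{\left( {\tau _{\mathrm{{O}}} - \hat t_{\mathrm{{O}}}} \right)^2}}{{2\sigma _{\mathrm{{O}}}^2}} + \frac{{\left( {\hat t_{\mathrm{{O}}} - \hat t_{\mathrm{{A}}} - \mu _{{\mathrm{{AO}}}}} \right)^2}}{{2\sigma _{{\mathrm{{AO}}}}^2}} = \frac{{\left( {\tau _{\mathrm{{O}}} - \tau _{\mathrm{{A}}} - \mu _{{\mathrm{{AO}}}}} \right)^2}}{{2\sigma _{{\mathrm{{tot}}}}^2}},$$

so that, up to those constants,

$$\mathop {{{\mathrm{max}}}}\limits_{t_{\mathrm{{A}}},t_{\mathrm{{O}}}} P\left( {t_{\mathrm{{A}}},t_{\mathrm{{O}}},\xi = 1|\tau _{\mathrm{{A}}},\tau _{\mathrm{{O}}}} \right) \propto P\left( {\xi = 1} \right)\exp \left( { - \frac{{\left( {\tau _{\mathrm{{O}}} - \tau _{\mathrm{{A}}} - \mu _{{\mathrm{{AO}}}}} \right)^2}}{{2\sigma _{{\mathrm{{tot}}}}^2}}} \right).$$

In this reduced form, the peak is largest when τO − τA ≈ μAO and grows with P(ξ = 1); the dependence on input precision in requirement (iii) is not captured here.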

Simulation results and model predictions

Here we briefly describe how we obtained the parameter values used in our simulation (see Methods for more details about the model fitting and simulation). Fitting of dA, dO, σA, and σO is straightforward; they are suggested by the means and SDs of the reported subjects’ baseline estimation errors (Table 1-Sets A and B in Methods). After fixing these parameters, the model is left with three free parameters, μAO, σAO, and P(ξ = 1). As described in Eq. (5), μAO has an important role in determining whether binding or repulsion happens in each experimental condition. A fixed value of μAO = 230 ms successfully accounts for this qualitative behavior in all six experimental conditions (three from Haggard et al.3 and three from Wolpe et al.22) that we study. The analytical expressions in Methods suggest that σAO and P(ξ = 1) have a largely overlapping role in detecting causality. Causality is more likely to be detected if σAO is small or P(ξ = 1) is large, although the exact mechanisms are slightly different. At least one of these two parameters needs to be adjusted according to the conditions to account for the experimental observations. For simplicity, we fix σAO = 10 ms, a constant small enough to permit a noticeable perceptual shift, and adjust P(ξ = 1) (see Table 1 for the parameter values in the six experimental conditions) to account for two observations in each condition, namely the perceptual shifts in the action timing and the outcome timing.

Our results show that our simple Bayesian model qualitatively reproduces the perceptual shifts that were reported in the study by Haggard et al.3 (Fig. 1). Consistent with their findings, our Bayesian observer inferred the perceived action and outcome timings to shift towards each other in the voluntary condition, resulting in compressed temporal intervals between the action and outcome perceptual shifts. However, reversed and prolonged perceptual shifts were observed in the involuntary condition. The model also reproduced no appreciable perceptual shifts in the sham condition.

Fig. 1

Qualitative replication of the empirical results reported by Haggard et al.3. Each subject’s mean judgment error in the single-event baseline condition was subtracted from the mean judgment error for the corresponding event in the operant condition. This resulted in the values underneath the blocks that indicate the magnitude and direction to which the temporal perceptions shifted. A positive perceptual shift informs delayed awareness and a negative shift informs anticipated awareness. The action and outcome timings are perceived to shift towards each other in the voluntary condition. In contrast, they are perceived to repulse in the involuntary condition. There is no discernible perceptual shift in the sham condition

Our Bayesian model predicts binding and repulsion to increase with stronger causal prior (Fig. 2). From Eq. (5), the amount of binding or repulsion is given by \(\left( {\tau _{\mathrm{{O}}} - \tau _{\mathrm{{A}}}} \right) - \left( {\hat t_{\mathrm{{O}}} - \hat t_{\mathrm{{A}}}} \right)\), which is \(\left( {\tau _{\mathrm{{O}}} - \tau _{\mathrm{{A}}} - \mu _{{\mathrm{{AO}}}}} \right)(\sigma _{\mathrm{{A}}}^2 + \sigma _{\mathrm{{O}}}^2){\mathrm{/}}\sigma _{{\mathrm{{tot}}}}^2\) in the causal case \((\hat \xi = 1)\) and none otherwise \((\hat \xi = 0)\). As the sensory signals are distributed according to \(\tau _{\mathrm{{A}}}\sim {\cal{N}}\left( {t_{\mathrm{{A}}}^ \ast + d_{\mathrm{{A}}},\sigma _{\mathrm{{A}}}^2} \right)\) and \(\tau _{\mathrm{{O}}}\sim {\cal{N}}\left( {t_{\mathrm{{O}}}^ \ast + d_{\mathrm{{O}}},\sigma _{\mathrm{{O}}}^2} \right)\), the average of τO − τA − μAO factor is \(m = t_{\mathrm{{O}}}^ \ast - t_{\mathrm{{A}}}^ \ast + d_{\mathrm{{O}}} - d_{\mathrm{{A}}} - \mu _{{\mathrm{{AO}}}}\). Hence, the sign of m determines whether binding or repulsion is predicted on average. With the current set of parameters, m is positive in the voluntary condition, yielding binding, and negative in the involuntary condition, yielding repulsion (schematically drawn in Fig. 3a). Perceptual shift is almost zero regardless of the causal prior P(ξ = 1) in the sham condition, because m ≈ 0. We chose P(ξ = 1) = 0.1 for this under-constrained sham condition, assuming that causality would not be frequently detected.
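As a worked example of this arithmetic, the sketch below evaluates m and the shrink factor for the voluntary condition, using only values quoted in the text (dA = 6 ms, dO = 15 ms, σA = 66 ms, σO = 72 ms, μAO = 230 ms, σAO = 10 ms). The variable names are illustrative.

```python
# Voluntary-condition values quoted in the text (all in ms).
T_A_TRUE, T_O_TRUE = 0.0, 250.0
D_A, D_O = 6.0, 15.0
SIGMA_A, SIGMA_O = 66.0, 72.0
MU_AO, SIGMA_AO = 230.0, 10.0

# Average of the factor tau_O - tau_A - mu_AO.
m = (T_O_TRUE - T_A_TRUE) + (D_O - D_A) - MU_AO

# Fraction of the mismatch that is removed when causality is detected.
var_tot = SIGMA_A**2 + SIGMA_O**2 + SIGMA_AO**2
shrink = (SIGMA_A**2 + SIGMA_O**2) / var_tot

print(m)                     # 29.0
print(round(shrink, 3))      # 0.99
print(round(m * shrink, 1))  # 28.7
```

Here m is positive, so binding is predicted on average. For a causal trial whose sensory interval sits exactly at its mean, the interval would compress by roughly 28.7 ms; the average over detected trials can differ, because detection itself depends on τO − τA.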

Fig. 2

Bayesian model predictions of the influence of causal prior strength on action-outcome perceptual shifts. The best estimates of the Bayesian model (in Fig. 1) were obtained from different causal priors, specifically, P(ξ = 1) is 0.9, 0.9, and 0.1 (marked by the colored dots) for the voluntary, involuntary, and sham conditions, respectively. The intervals between the action and outcome perceptual shifts shrink in the voluntary, but widen in the involuntary, condition with a strong causal prior. Minimal changes in perceptual shifts are predicted for the sham condition even with a strong causal prior

Fig. 3

Bayesian model predictions of trial-to-trial perceptual shifts and timing intervals. a Our Bayesian model predicts (shown schematically) that action-outcome binding will happen if τO − τA > μAO. Otherwise, i.e., if τO − τA < μAO, action-outcome repulsion will occur. In both cases, the perceived timings in the baseline move (compress or stretch) towards the temporal consistency \(\hat t_{\mathrm{{O}}} - \hat t_{\mathrm{{A}}} \approx \mu _{{\mathrm{{AO}}}}\) in the operant condition. b, c When τO − τA > μAO, there is a positive perceptual shift in action awareness (\(\hat t_{\mathrm{{A}}} - \tau _{\mathrm{{A}}} > 0\)) and a negative perceptual shift in outcome awareness (\(\hat t_{\mathrm{{O}}} - \tau _{\mathrm{{O}}} < 0\)). The opposite happens when τO − τA < μAO. Both binding and repulsion occur in the voluntary and involuntary conditions, but there is very little effect in the sham condition. d The Bayesian estimates follow the sensory inputs in the baseline condition, i.e., \(\tau _{\mathrm{{O}}} - \tau _{\mathrm{{A}}} \approx \hat t_{\mathrm{{O}}} - \hat t_{\mathrm{{A}}}\), where all trials are acausal (\(\hat \xi\) = 0) by definition. e The Bayesian estimate shifts towards the prior assumption, \(\hat t_{\mathrm{{O}}} - \hat t_{\mathrm{{A}}} \approx \mu _{{\mathrm{{AO}}}}\), when the sensory inputs are highly consistent with the prior, τO − τA ≈ μAO, and therefore when causality is detected (\(\hat \xi\) = 1). Otherwise, the estimates of action and outcome timings follow the sensory inputs. The fitted causal prior P(ξ = 1) is 0.9, 0.9, and 0.1 for the voluntary, involuntary, and sham conditions, respectively (as in Fig. 2). The per-trial results are grouped accordingly into bins of width 200 (randomly chosen), and the mean and SD for each bin are plotted. This format is followed each time a quantity of interest is plotted as a function of τO − τA

Our Bayesian model provides insights into what possibly drives the perceived action-outcome temporal compression and repulsion effects. Empirically, the sensory delay d increases with the SD σ of the Gaussian-distributed jitter (observed in the data of both Haggard et al.3 and Wolpe et al.22; see Table 1 in Methods). This may imply that, as the action or outcome becomes more ambiguous due to noise (greater σ), more time is needed (greater d) for a sensory input to reach the subject’s perceptual threshold for temporal awareness in the baseline condition. Thus, because m depends on dO − dA, binding is more likely when the outcome is unreliable (i.e., with large dO) and repulsion is more likely when the action is unreliable (i.e., with large dA).

To further illustrate the model prediction from our simulations, we plotted separately the action and outcome perceptual shifts for the three conditions as functions of the temporal disparity τO − τA (cf. Eq. (5)). Indeed, our simulation results show that for instances in which τO − τA > μAO, action awareness is delayed (positive action shift, Fig. 3b) and the outcome tone is anticipated (negative outcome shift, Fig. 3c), thereby demonstrating binding. The opposite happens when τO − τA < μAO, thereby demonstrating repulsion in both action and outcome awareness (Fig. 3b, c, respectively). We then plotted how the model’s MAP estimates of the action-outcome interval are affected by the sensory time difference τO − τA in the baseline (here, \(\hat \xi\) = 0 is forced; Fig. 3d) and operant (Fig. 3e) conditions. We observe from the baseline condition that the MAP estimates follow sensory inputs, \(\hat t_{\mathrm{{O}}} - \hat t_{\mathrm{{A}}} = \tau _{\mathrm{{O}}} - \tau _{\mathrm{{A}}}\), whereas the perception of action and outcome timings shifted towards the prior mean, \(\hat t_{\mathrm{{O}}} - \hat t_{\mathrm{{A}}} \approx \mu _{{\mathrm{{AO}}}}\), in the voluntary and involuntary conditions but not so much in the sham condition with its weak causal prior. Therefore, our model is agnostic as to whether the action is self-intended or unintended. A shift towards \(\hat t_{\mathrm{{O}}} - \hat t_{\mathrm{{A}}} \approx \mu _{{\mathrm{{AO}}}}\) will happen, even if in the opposite direction, as long as the action is believed to have caused the outcome. This suggests that causality is the phenomenon that underlies intentional binding, and likely SoA, with self-intended causality being a specific case. The temporal window of τO − τA for detecting causality is wider in the voluntary and involuntary conditions than in the sham condition.

We then examined how the prior belief in causation affects our proposed measure for SoA in Haggard’s experimental setup. Our model predicts that CCE strengthens together with the causal prior, but its strength differs across conditions even at the same strength of the prior (Fig. 4a). Interestingly, dA and σA are the only parameters of our Bayesian model that differentiate the three conditions in this figure. As we described above, these two parameters are empirically correlated such that the delay dA increases with larger σA. Hence, the difference in CCE in the three conditions can be attributed to the inequalities in the SDs of the subjects’ action timing estimation errors in the three conditions: \(\sigma _{\mathrm{{A}}}^{{\mathrm{{Vol}}}} < \sigma _{\mathrm{{A}}}^{{\mathrm{{Sham}}}} < \sigma _{\mathrm{{A}}}^{{\mathrm{{Invol}}}}\) as per the data of Haggard et al.3. Haggard et al. speculated that the unexpected and surprising quality of the TMS-induced movement could account for the repulsion effect in the involuntary condition. We suggest that this surprise might have introduced uncertainty in the perception of action input signals. Hence, although subjects were certain of the nature of their voluntary actions, they could be less certain of the proprioceptive signals induced by TMS, which could explain the inequalities in σA. As a result, the model gives \({\mathrm{{CCE}}}^{{\mathrm{{Vol}}}} > {\mathrm{{CCE}}}^{{\mathrm{{Sham}}}} > {\mathrm{{CCE}}}^{{\mathrm{{Invol}}}}\) according to requirement (iii), i.e., reliable sensory inputs, for having high CCE when compared at the same strength of the causal prior.

Fig. 4

Bayesian model predictions of the confidence in causal estimate (CCE), which is our proposed measure for SoA. a Our Bayesian model predicts CCE to increase with a stronger causal prior. Furthermore, CCE differs for each condition even with equal prior strengths. This can be attributed to the difference in the amplitude of the jitter in the self-generated vs TMS-induced movement (muscle twitches) and audible clicks. b When plotted as functions of the trial-to-trial temporal disparity τO − τA, with the specific causal priors obtained for each condition (marked in a), CCE has a higher peak in the voluntary condition than in the involuntary condition, and much lower values in the sham condition. Furthermore, CCE diminishes as the disparity of the sensory inputs from the prior mean, |τO − τA − μAO|, grows. This fall-off is faster when the causal prior is weaker and the uncertainty in the action input signal is higher

The relation between CCE and SoA becomes clear when we analyze them with the fitted values of the causal prior (P(ξ = 1) = 0.9 for the voluntary and involuntary conditions and P(ξ = 1) = 0.1 for the sham condition as indicated in Table 1). Figure 4b plots CCE on a per-trial basis as a function of the temporal disparity τO − τA (cf. the analytical expression for CCE in Methods). CCE in the voluntary condition has a higher peak than in the involuntary condition, as we described above (due to the small σA in the voluntary condition, per requirement (iii)). In both voluntary and involuntary conditions, CCE diminishes as τO − τA moves farther from μAO because of requirement (i), i.e., small |τO − τA − μAO|, for having high CCE. Finally, CCE for the sham condition takes much lower values than for the voluntary or involuntary conditions because of requirement (ii), i.e., large P(ξ = 1), for having high CCE.

In a similar fashion, we then examined the underlying psychophysical mechanisms that could account for the temporal binding observed by Wolpe et al.22, in which three uncertainty levels (high, intermediate, and low uncertainty) of the outcome stimulus were tested. We used the same Bayesian model that reproduced Haggard’s experiments, with the same values of μAO and σAO, but adjusted the strength of the causal prior P(ξ = 1) to fit the reported action timing and outcome timing in each condition. We used P(ξ = 1) = 0.9, 0.6, and 0.5 for the low, intermediate, and high tone uncertainty conditions, respectively (see Table 1 and Methods). This means that the prior belief in causation decreases as the tone uncertainty increases, which is plausible. (Alternatively, we could increase σAO, which produces similar results; see the above discussion on model fitting.)

Our model reproduces the experiments of Wolpe et al.22 (Fig. 5a), qualitatively explaining the temporal binding they observed in terms of a single, coherent cue integration formulation. The Bayesian estimate of the action-outcome interval shifts towards \(\hat t_{\mathrm{{O}}} - \hat t_{\mathrm{{A}}} \approx \mu _{{\mathrm{{AO}}}}\), as per the causal temporal prior in Eq. (2), when causality is detected. On the one hand, the magnitude of the shift is greater when the outcome uncertainty is high (cf. Eq. (5)). On the other hand, causality is less frequently detected when the outcome uncertainty is high with the reduced causal prior. These two opposing effects are summarized in Fig. 5b. The model can qualitatively reproduce the experiments if the former effect is more dominant. Quantitatively, however, the latter effect is necessary to mitigate the former effect.
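The first of these two effects follows directly from the coefficients in Eq. (5) and can be sketched numerically. The σO values below are hypothetical placeholders chosen only to span a range (the fitted values for the tone uncertainty conditions are listed in Table 1 and are not reproduced here), and σA = 66 ms is borrowed from the voluntary condition purely for illustration. The second effect, the reduced detection of causality, is not modeled in this sketch.

```python
def shift_coefficients(sigma_a, sigma_o, sigma_ao):
    """Coefficients of Eq. (5) that scale the action and outcome shifts."""
    var_tot = sigma_a**2 + sigma_o**2 + sigma_ao**2
    return sigma_a**2 / var_tot, sigma_o**2 / var_tot

SIGMA_A, SIGMA_AO = 66.0, 10.0            # ms; sigma_A is illustrative here
HYPOTHETICAL_SIGMA_O = (40.0, 72.0, 120.0)  # ms; placeholders, not fitted values

coeffs = [shift_coefficients(SIGMA_A, s, SIGMA_AO) for s in HYPOTHETICAL_SIGMA_O]
for sigma_o, (c_action, c_outcome) in zip(HYPOTHETICAL_SIGMA_O, coeffs):
    print(f"sigma_O={sigma_o:5.1f} ms  action coeff={c_action:.3f}  "
          f"outcome coeff={c_outcome:.3f}")
```

As σO grows, the outcome coefficient rises and the action coefficient falls, so on trials where causality is detected a less reliable tone is pulled more strongly towards the action.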

Fig. 5

Qualitative replication of, as well as predictions related to, the results reported by Wolpe et al.22. a Qualitative replication of the experimental results (left panel) by our Bayesian model (right panel). b The action-outcome binding increases under heightened uncertainty. However, causality is less frequently detected when the causal prior is lower, which decreases the action-outcome binding effect. The best estimates of the Bayesian model in a were obtained from different causal prior strengths, specifically P(ξ = 1) is 0.9, 0.6, and 0.5 (marked by the colored dots) for the low, intermediate, and high tone uncertainty conditions, respectively. c, d The causal prior strengths that correspond to each condition were used for the Bayesian estimate of the action-outcome timing interval \(\hat t_{\mathrm{{O}}} - \hat t_{\mathrm{{A}}}\) in the baseline and operant conditions. The Bayesian estimate follows the sensory inputs in the baseline condition where all trials are acausal, but shifts towards the prior assumption, \(\hat t_{\mathrm{{O}}} - \hat t_{\mathrm{{A}}} \approx \mu _{{\mathrm{{AO}}}}\), when causality is detected. The temporal window of τO − τA for detecting causality is wider when the outcome uncertainty is lower, which means more instances demonstrate binding

Next, we plot how the Bayesian estimate of the action-outcome interval, \(\hat t_{\mathrm{{O}}} - \hat t_{\mathrm{{A}}}\), depends on the sensory inputs τO − τA. The perceived intervals faithfully follow the sensory inputs in the baseline condition (Fig. 5c), where all trials are acausal (\(\hat \xi = 0\)) by definition. In the operant condition (Fig. 5d), the Bayesian estimate shifts towards the prior assumption \(\hat t_{\mathrm{{O}}} - \hat t_{\mathrm{{A}}} \approx \mu _{{\mathrm{{AO}}}}\) when the sensory inputs are highly consistent with the prior τO − τA ≈ μAO and, thus, when the causality is detected (\(\hat \xi = 1\)). Otherwise, the estimate of action-outcome intervals follows sensory inputs. The temporal window of τO − τA for detecting causality is wider when the outcome uncertainty is lower.

Next, we again quantify CCE as a possible measure of SoA. CCE diminishes with outcome uncertainty even when compared at the same level of causal prior (Fig. 6a). Hence, CCE explicitly depends on the outcome uncertainty. When plotted as functions of temporal disparity, with the specific causal priors obtained for each outcome uncertainty condition, the peak values of CCE noticeably differ across the uncertainty conditions (Fig. 6b). This is because of the different values of the outcome uncertainty σO and also partly because of the different values of the causal prior. In all conditions, CCE falls off with the disparity of sensory inputs from the prior mean, |τO − τA − μAO|. This fall-off is milder when the uncertainty is lower. These results again clearly show the three basic requirements for high CCE: (a) consistency of the sensory inputs with the causal prior; (b) a strong prior belief in causality; and (c) reliable sensory inputs.

Fig. 6

Bayesian model predictions of CCE as a function of the causal prior and the temporal disparity τO − τA. a The different effects of the causal prior on CCE across the three conditions are evident even with equal causal priors, which means that CCE depends on outcome uncertainty. b When plotted as functions of the temporal disparity τO − τA, given the condition-dependent causal priors (marked by the colored dots in a), CCE falls off with the disparity of sensory inputs from the prior mean, |τO − τA − μAO|, and does so faster when the outcome uncertainty is higher

Discussion

We formalize SoA by drawing parallels from a Bayesian inference of the ventriloquism effect that estimates a common cause behind its multisensory integration. Understanding causality has been viewed to facilitate predictive, adaptable, and goal-directed actions25,26,27; hence, this may bring about SoA. Our Bayesian model integrates the action-outcome signals, compares them with the prior expectation, and infers the causality between them as well as the timing of these sensory signals. Our model could concisely reproduce the intentional binding experiments by Haggard et al.3 and Wolpe et al.22. Whether intentional binding effects indeed follow Bayesian principles has remained unclear. Specifically, this was raised as an open question by Moore and Fletcher14, who pointed out that only indirect empirical evidence existed in support of Bayesian cue integration, and Wolpe et al.22 even posited that Bayesian cue integration does not explain outcome binding. Our model explains the temporal binding and repulsion phenomena as a compromise between the noisy sensory observations and the prior belief of the action-outcome timing. Importantly, our Bayesian model predicts that the perceptual binding is generally trial-dependent and that it must be correlated with the estimated causality \(\hat \xi\) between the action and outcome. This prediction can be tested when the probability Pc for detecting causality is not close to 0 or 1, by examining whether the distribution of action-outcome intervals is bimodal and whether the intervals correlate with the reported causality between the action and outcome. We have therefore shown how a Bayesian mechanism may underlie intentional binding. This is a significant contribution, as no previous Bayesian proposals accounted for experimental data on intentional binding and repulsion.

In addition, we theorize SoA as the CCE. CCE is high when the action-outcome timing is consistent with the causal prior, the causal prior is strong, and the action and outcome signals are reliable. This notion is consistent with what has been propounded as demonstrations of SoA: SoA arises from the causal relation between performed actions and their consequences1,21,27,28, and from the integration of different agency cues whose individual influences are determined by their reliability14,15,29,30,31,32. Hence, we posit CCE to be a plausible measure of SoA. Here, Bayesian cue integration in terms of CCE is derived from the computational principle of optimal inference, in contrast to empirical observations that causality and reliability are involved. Further, CCE can explain outcome binding in terms of cue reliability, which was previously considered non-Bayesian22. CCE is not an indicator of intention or a simple estimate of whether the action caused an outcome, but a new proposal of how SoA may emerge from the confidence in the estimate of the causality and timing (see discussion below on CCE against intention-based temporal binding).

Specifically, we postulate CCE fits the notion of a pre-reflective, implicit FoA. Synofzik et al.1,30 provide a compelling account of such feeling: FoA is best accounted for by multimodal weighting and integration of different agency cues, and consists of an automatic registration of whether an action or sensory event is caused by the self or not. They posit FoA is nothing other than first-person in that the self is implied; hence, no external attribution (e.g., to TMS that caused the action) is possible. In the event that there is a feeling of exogenous causation, this will be overwritten by an explicit, interpretative judgment of agency (JoA) based on contextual beliefs or rationalizations. Similarly, the analytical expression of CCE shows that it is a multimodal weighting and integration process that lies at the center of obtaining a Bayesian causality inference. Furthermore, CCE itself does not attribute causality to any external agent, such as in the case of a strong causal prior for TMS-induced movements. The judgment of the causality, \(\hat \xi\), is then made based on the posterior ratio r that compares CCE with the confidence in the acausal estimate. Perceptual timings in our model simply reflect the sensory signals if the causality is not detected (\(\hat \xi\) = 0), whereas they are overwritten by the influence of the prior if the causality is detected (\(\hat \xi\) = 1). For example, in the involuntary condition of Haggard et al.3, the action and outcome timings estimated by the model repel each other, reflecting the judgment of the causality. A compelling speculation in the paper by Haggard et al.3 suggests this notion: the repulsion in the involuntary condition “reflects a mental operation to segregate, and thus to discriminate, pairs of events that cannot plausibly be linked by our own causal agency” (p. 384). We suggest such mental operation fits the notion of JoA, as quantified by the time shifts in Eq. (5) with the detected causality, and the peculiar feeling of causation by the involuntary movement to be FoA, quantified by CCE.

Following the above explanation, our theory has a different take on the binding effect of Haggard et al.3, which they argued requires intentionality. Although intentional binding has been repeatedly observed in the context of voluntary action, it remains contentious in the literature whether it is indeed specific to voluntary action, or whether causality contributes to this effect33. Our model argues that the judgment of the causality is central to the perceived temporal action-outcome binding, consistent with current evidence that competes with the intentional account: the temporal binding is actually causal, not intentional21,27,34. For example, our model judges the causation of the tone even by the TMS-induced action in the involuntary condition. Hence, our Bayesian model predicts this unintended causality. Furthermore, our Bayesian model predicts that the action-outcome timing shifts toward the prior belief, \(\hat t_{\mathrm{{O}}} - \hat t_{\mathrm{{A}}} \approx \mu _{{\mathrm{{AO}}}}\), when the causality is perceived irrespective of the nature of the action, whether self-generated (i.e., the voluntary condition) or unintended (i.e., the involuntary condition). Interestingly, this temporal binding toward the same prior belief produces the compression and repulsion effects if the perceptual delay in the action timing (dA) is small and large, respectively. What causes this difference in the perceptual delay? We found that unreliable senses (with large σA or σO) tend to involve long perceptual delays (with large dA or dO). Hence, the observed large perceptual delay in the TMS-induced action timing may be caused by the internal prediction error due to the absence of efference copy35,36,37 and artificially perturbed neural activity. In this sense, intentionality is not strictly necessary for the sense of causality but influences the precision-dependent action-outcome timing shifts in our model. This is consistent with a recent empirical finding of intentional binding-like effects that emerged without intentional actions33. We predict that experimental manipulations that reduce σA would increase perceived SoA even for unintended artificial actions. This prediction is distinct from what was previously considered and can therefore serve as a testable prediction for future experiments on causal agency.

Our theory also has a different take on the binding effect of Wolpe et al.22, who studied intentional binding as cue integration under uncertainty in the outcome signals. They speculated that action and outcome bindings are driven by two distinct mechanisms: action binding is predicted by cue integration, but outcome binding supports the predictive pre-activation hypothesis38, i.e., the neural representation of the sensory outcome is activated prior to it. Hence, the outcome signals are perceived faster and with less jitter than when they are not predicted to occur after the action. This could explain why the subjects’ timing estimations are largely erroneous in the baseline condition and why the outcome binding is greater than the action binding. Our theory, although qualitative, explains both action and outcome bindings by a single Bayesian cue integration mechanism. Our model explains that the magnitudes of the action and outcome perceptual shifts, τO − τA − μAO, are influenced primarily by the ambiguity of the outcome sensory signals, \((\sigma _{\mathrm{{A}}}^2 + \sigma _{\mathrm{{O}}}^2)/\sigma _{{\mathrm{{tot}}}}^2\), and also in part by the strength of the causal prior that diminishes with outcome uncertainty.

The intentional binding paradigm has also been used to study pathological SoA39,40,41. Patients with schizophrenia tend to have much stronger temporal binding than healthy volunteers. Moreover, unlike healthy volunteers, their temporal binding of action timing does not depend on the probability of the outcome tone presentation41. These results are explained by our Bayesian model by assuming that schizophrenia patients cannot easily adapt their abnormally strong belief in causality (i.e., too large P(ξ = 1)) and the uncertainty in the outcome (i.e., σO). Another important point is that, unlike healthy volunteers, patients with schizophrenia exhibit temporal binding of action timing that depends on the presence or absence of the outcome. It will be an interesting future study to model this result by explicitly incorporating the probabilistic occurrence of the outcome in our Bayesian model.

In summary, we posit that, just as Bayesian cue integration is primarily precision-dependent, so is our theory of SoA. Our model predicts, pending experimental confirmation, that if the uncertainty of the sensory input signals could be kept small, even an unintended causal action may give rise to high CCE (hence, strong SoA); this is our notion of precision-dependent causal agency. We posit that the precise estimation that gives rise to SoA encapsulates consistency in the perceived action-outcome effect, the prior belief that the action causes the outcome, and the reliability of the perceived sensory signals. This theory may shed light on the mechanism of reduced SoA in psychosis, the understanding of the difference between FoA and JoA, and the design of prosthetic devices that heighten SoA. Furthermore, the challenge for future experiments that aim to link intentional binding to SoA is to demonstrate effects beyond what our model has already predicted: namely, whether intentionality alone is sufficient for strong intentional binding to emerge when the reliability of sensory inputs and the strength of the causal prior are diminished.

Methods

Analytical expressions for the Bayesian estimates

The MAP estimate (Eq. (3)) of the Bayesian observer has a simple analytical expression. The MAP estimation is computed based on the posterior probability P(tA, tO, ξ|τA, τO) = P(τA, τO, tA, tO, ξ)/P(τA, τO), where the peak location only depends on the joint distribution P(τA, τO, tA, tO, ξ) in the numerator. The joint distribution is decomposed as P(τA, τO, tA, tO, ξ) = P(τA|tA)P(τO|tO)P(tA, tO|ξ)P(ξ), where the conditional distributions for action and outcome are \(P\left( {\tau _{\mathrm{A}}{\mathrm{|}}t_{\mathrm{A}}} \right) = \exp \left[ { - \left( {t_{\mathrm{A}} - \tau _{\mathrm{A}}} \right)^2/(2\sigma _{\mathrm{A}}^2)} \right]{\mathrm{/}}\sqrt {2\pi \sigma _{\mathrm{A}}^2}\) and \(P\left( {\tau _{\mathrm{O}}{\mathrm{|}}t_{\mathrm{O}}} \right) = \exp \left[ { - \left( {t_{\mathrm{O}} - \tau _{\mathrm{O}}} \right)^2{\mathrm{/}}(2\sigma _{\mathrm{O}}^2)} \right]{\mathrm{/}}\sqrt {2\pi \sigma _{\mathrm{O}}^2}\), respectively, and the prior distribution is

$$P\left( {t_{\mathrm{A}},t_{\mathrm{O}}{\mathrm{|}}\xi } \right) = \left\{ {\begin{array}{*{20}{l}} {\exp \left( { - \frac{{\left( {t_{\mathrm{O}} - t_{\mathrm{A}} - \mu _{{\mathrm{AO}}}} \right)^2}}{{2\sigma _{{\mathrm{AO}}}^2}}} \right){\mathrm{/}}Z_1} \hfill & {(\xi = 1)} \hfill \\ {1{\mathrm{/}}Z_{\mathrm{0}}} \hfill & {(\xi = 0)} \hfill \end{array}} \right.$$

with normalization constants \(Z_1 \equiv {\int}_{\mathrm{R}} {\exp } \left( { - \frac{{\left( {t_{\mathrm{O}} - t_{\mathrm{A}} - \mu _{{\mathrm{AO}}}} \right)^2}}{{2\sigma _{{\mathrm{AO}}}^2}}} \right)dt_{\mathrm{A}}dt_{\mathrm{O}} \approx \sqrt {2\pi } \sigma _{{\mathrm{AO}}}T\) and \(Z_{\mathrm{0}} \equiv {\int}_{\mathrm{R}} {dt_{\mathrm{A}}} dt_{\mathrm{O}} = T^2\).

The prior probability distribution P(tA, tO|ξ) cannot be normalized unless a finite range of (tA, tO) is defined. Therefore, we only consider it in the range \(R = \left\{ {t_{\mathrm{A}},t_{\mathrm{O}}|t_{\mathrm{A}} \in \left( {t_{\mathrm{A}}^ \ast - T{\mathrm{/}}2,t_{\mathrm{A}}^ \ast + T{\mathrm{/}}2} \right),t_{\mathrm{O}} \in \left( {t_{\mathrm{O}}^ \ast - T{\mathrm{/}}2,t_{\mathrm{O}}^ \ast + T{\mathrm{/}}2} \right)} \right\}\) and assume that it is zero outside R, where again \(t_{\mathrm{A}}^ \ast = 0\) ms and \(t_{\mathrm{O}}^ \ast = 250\) ms are the true action and outcome timings, unknown to the observer, and T = 250 ms is a large enough but finite constant that specifies the interval lengths under consideration. Hence, the prior probability distribution P(tA, tO|ξ) must be normalized within R. Our results are robust to a shift in the center of R.
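As a rough numerical illustration of the normalization constant, the sketch below integrates the unnormalized causal prior over R and compares it with the approximation \(\sqrt{2\pi}\sigma_{\mathrm{AO}}T\). The function name `z1_numeric` and the midpoint-rule step count are assumptions made for this example; μAO = 230 ms and σAO = 10 ms are the values used later in the Methods. With these values the ratio comes out slightly below 1, because a small part of the Gaussian ridge tO − tA = μAO falls outside R; this changes θ only by the logarithm of that ratio.

```python
import math

def z1_numeric(mu_ao, sigma_ao, T, t_a_star=0.0, t_o_star=250.0, steps=20000):
    """Integrate exp(-(t_O - t_A - mu_AO)^2 / (2 sigma_AO^2)) over the square R.

    The inner integral over t_O is done in closed form with erf; the outer
    integral over t_A uses a midpoint rule (an illustrative choice).
    """
    a_lo = t_a_star - T / 2.0
    o_lo, o_hi = t_o_star - T / 2.0, t_o_star + T / 2.0
    h = T / steps
    scale = math.sqrt(2.0) * sigma_ao
    total = 0.0
    for i in range(steps):
        t_a = a_lo + (i + 0.5) * h
        centre = t_a + mu_ao  # location of the ridge in t_O for this t_A
        inner = math.sqrt(math.pi / 2.0) * sigma_ao * (
            math.erf((o_hi - centre) / scale) - math.erf((o_lo - centre) / scale)
        )
        total += inner * h
    return total

T, SIGMA_AO, MU_AO = 250.0, 10.0, 230.0
z1_approx = math.sqrt(2.0 * math.pi) * SIGMA_AO * T
ratio = z1_numeric(MU_AO, SIGMA_AO, T) / z1_approx
print(f"numeric / approximate Z1 = {ratio:.3f}")
```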

We separately compute the peak location \((\hat t_{\mathrm{A}},\hat t_{\mathrm{O}})\) for the causal case ξ = 1 and the acausal case ξ = 0 and, then, compare these two peaks. In the acausal case, because P(τA|tA) and P(τO|tO) take the maximum values at tA = τA and tO = τO, respectively, the location of the acausal peak is \(\left( {\hat t_{\mathrm{A}},\hat t_{\mathrm{O}}} \right)\left. \right|_{\xi = 0} = (\tau _{\mathrm{A}},\tau _{\mathrm{O}})\) and the peak value is \(\mathop {{{\mathrm{max}}}}\limits_{t_{\mathrm{A}},t_{\mathrm{O}}} P\left( {\tau _{\mathrm{A}},\tau _{\mathrm{O}},t_{\mathrm{A}},t_{\mathrm{O}},\xi = 0} \right) = \frac{{P(\xi = 0)}}{{2\pi \sigma _{\mathrm{A}}\sigma _{\mathrm{O}}Z_0}}\). In the causal case, the peak of the joint distribution is found by minimizing a quadratic function. The peak location is given by \(\left( {\hat t_{\mathrm{A}},\hat t_{\mathrm{O}}} \right)\left. \right|_{\xi = 1} = \left( {\tau _{\mathrm{A}} + \frac{{\sigma _{\mathrm{A}}^2}}{{\sigma _{{\mathrm{tot}}}^2}}\left( {\tau _{\mathrm{O}} - \tau _{\mathrm{A}} - \mu _{{\mathrm{AO}}}} \right),\tau _{\mathrm{O}} - \frac{{\sigma _{\mathrm{O}}^2}}{{\sigma _{{\mathrm{tot}}}^2}}\left( {\tau _{\mathrm{O}} - \tau _{\mathrm{A}} - \mu _{{\mathrm{AO}}}} \right)} \right)\), where \(\sigma _{{\mathrm{tot}}}^2 \equiv \sigma _{\mathrm{A}}^2 + \sigma _{\mathrm{O}}^2 + \sigma _{{\mathrm{AO}}}^2\) is the total variance, and the peak value is computed as \(\mathop {{{\mathrm{max}}}}\limits_{t_{\mathrm{A}},t_{\mathrm{O}}} P\left( {\tau _{\mathrm{A}},\tau _{\mathrm{O}},t_{\mathrm{A}},t_{\mathrm{O}},\xi = 1} \right) = \frac{{P(\xi = 1)}}{{2\pi \sigma _{\mathrm{A}}\sigma _{\mathrm{O}}Z_1}}\exp \left( { - \frac{{\left( {\tau _{\mathrm{O}} - \tau _{\mathrm{A}} - \mu _{{\mathrm{AO}}}} \right)^2}}{{2\sigma _{{\mathrm{tot}}}^2}}} \right)\). We define the ratio of the posterior peaks for ξ = 1 and ξ = 0 by

$$r \equiv \frac{{\mathop {{{\mathrm{max}}}}\limits_{t_{\mathrm{A}},t_{\mathrm{O}}} P\left( {t_{\mathrm{A}},t_{\mathrm{O}},\xi = 1|\tau _{\mathrm{A}},\tau _{\mathrm{O}}} \right)}}{{\mathop {{{\mathrm{max}}}}\limits_{t_{\mathrm{A}},t_{\mathrm{O}}} P\left( {t_{\mathrm{A}},t_{\mathrm{O}},\xi = 0|\tau _{\mathrm{A}},\tau _{\mathrm{O}}} \right)}} = \exp \left( {\theta - \frac{{\left( {\tau _{\mathrm{O}} - \tau _{\mathrm{A}} - \mu _{{\mathrm{AO}}}} \right)^2}}{{2\sigma _{{\mathrm{tot}}}^2}}} \right)$$

with \(\theta \equiv \log \left[ {\frac{{P\left( {\xi = 1} \right)Z_0}}{{P\left( {\xi = 0} \right)Z_1}}} \right]\). If r > 1, the MAP estimate is given by \(\left( {\hat t_{\mathrm{A}},\hat t_{\mathrm{O}}} \right)\left. \right|_{\hat \xi = 1}\) and \(\hat \xi = 1\), which predicts perceptual shifts. If r < 1, the MAP estimate is given by \(\left( {\hat t_{\mathrm{A}},\hat t_{\mathrm{O}}} \right)\left. \right|_{\hat \xi = 0}\) and \(\hat \xi\) = 0, which predicts no perceptual shifts. The probability for detecting causality (i.e., \(\hat \xi\) = 1) is also easily computable, because τO − τA − μAO is distributed according to the Gaussian distribution \({\cal{N}}(m,\sigma _{\mathrm{A}}^2 + \sigma _{\mathrm{O}}^2)\) with \(m \equiv t_{\mathrm{O}}^ \ast - t_{\mathrm{A}}^ \ast + d_{\mathrm{O}} - d_{\mathrm{A}} - \mu _{{\mathrm{AO}}}\). Hence, the causality is detected if \(|\tau _{\mathrm{O}} - \tau _{\mathrm{A}} - \mu _{{\mathrm{AO}}}| < \sqrt {2\theta } \sigma _{{\mathrm{tot}}}\) and this happens with probability

$$P_{\mathrm{c}} = \frac{1}{2}\left[ {{\mathrm{erf}}\left( {\frac{{\sqrt {2\theta } \sigma _{{\mathrm{tot}}} - m}}{{\sqrt {2\left( {\sigma _{\mathrm{A}}^2 + \sigma _{\mathrm{O}}^2} \right)} }}} \right) + {\mathrm{erf}}\left( {\frac{{\sqrt {2\theta } \sigma _{{\mathrm{tot}}} + m}}{{\sqrt {2\left( {\sigma _{\mathrm{A}}^2 + \sigma _{\mathrm{O}}^2} \right)} }}} \right)} \right].$$
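The decision rule and the detection probability above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are invented here, and the sensory parameters used in the usage lines (σA = 60 ms, σO = 70 ms, m = 30 ms, P(ξ = 1) = 0.5) are hypothetical stand-ins for the values in Table 1. Only T = 250 ms, σAO = 10 ms, and μAO = 230 ms come from the text.

```python
import math
import random

def theta(p_causal, sigma_ao, T):
    """theta = log[P(xi=1) Z0 / (P(xi=0) Z1)], Z0 = T^2, Z1 ~ sqrt(2 pi) sigma_AO T."""
    z0 = T ** 2
    z1 = math.sqrt(2.0 * math.pi) * sigma_ao * T
    return math.log(p_causal * z0 / ((1.0 - p_causal) * z1))

def map_estimate(tau_a, tau_o, mu_ao, sigma_a, sigma_o, sigma_ao, th):
    """Return (t_A_hat, t_O_hat, xi_hat) by comparing the causal and acausal peaks."""
    var_tot = sigma_a ** 2 + sigma_o ** 2 + sigma_ao ** 2
    disparity = tau_o - tau_a - mu_ao
    log_r = th - disparity ** 2 / (2.0 * var_tot)  # log of the ratio r
    if log_r > 0.0:  # r > 1: causality detected, timings shift toward the prior
        return (tau_a + sigma_a ** 2 / var_tot * disparity,
                tau_o - sigma_o ** 2 / var_tot * disparity, 1)
    return (tau_a, tau_o, 0)  # r < 1: estimates follow the sensory inputs

def detection_probability(m, sigma_a, sigma_o, sigma_ao, th):
    """P_c: probability that |tau_O - tau_A - mu_AO| < sqrt(2 theta) sigma_tot."""
    if th <= 0.0:
        return 0.0  # r can never exceed 1
    var_tot = sigma_a ** 2 + sigma_o ** 2 + sigma_ao ** 2
    c = math.sqrt(2.0 * th * var_tot)
    s = math.sqrt(2.0 * (sigma_a ** 2 + sigma_o ** 2))
    return 0.5 * (math.erf((c - m) / s) + math.erf((c + m) / s))

# Hypothetical sensory parameters (ms); the fitted ones are in Table 1.
SIGMA_A, SIGMA_O, SIGMA_AO, MU_AO, T = 60.0, 70.0, 10.0, 230.0, 250.0
th = theta(0.5, SIGMA_AO, T)
m = 30.0  # mean of tau_O - tau_A - mu_AO, chosen for illustration

rng = random.Random(0)
n = 20000
hits = 0
for _ in range(n):
    disparity = rng.gauss(m, math.sqrt(SIGMA_A ** 2 + SIGMA_O ** 2))
    hits += map_estimate(0.0, disparity + MU_AO, MU_AO,
                         SIGMA_A, SIGMA_O, SIGMA_AO, th)[2]
print(hits / n, detection_probability(m, SIGMA_A, SIGMA_O, SIGMA_AO, th))
```

The Monte Carlo frequency of detected causality should agree with the closed-form Pc up to sampling noise, which is a quick way to check an implementation of the rule.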

Next, we evaluate the confidence in the causal MAP estimation \({\mathrm{CCE}} \equiv \mathop {{{\mathrm{max}}}}\limits_{t_{\mathrm{A}},t_{\mathrm{O}}} P\left( {t_{\mathrm{A}},t_{\mathrm{O}},\xi = 1|\tau _{\mathrm{A}},\tau _{\mathrm{O}}} \right)\), which comprises the numerator of the ratio r. To quantify this confidence, we need to first evaluate P(τA, τO) = P(τA, τO, ξ = 1) + P(τA, τO, ξ = 0) with

$$\begin{array}{c}P\left( {\tau _{\mathrm{A}},\tau _{\mathrm{O}},\xi = 1} \right) = \mathop {\iint}\limits_R P \left( {t_{\mathrm{A}},t_{\mathrm{O}},\xi = 1,\tau _{\mathrm{A}},\tau _{\mathrm{O}}} \right)dt_{\mathrm{A}}dt_{\mathrm{O}}\\ = \frac{{P\left( {\xi = 1} \right)\sigma _{{\mathrm{AO}}}}}{{Z_1\sigma _{{\mathrm{tot}}}}}\exp \left( { - \frac{{\left( {\tau _{\mathrm{O}} - \tau _{\mathrm{A}} - \mu _{{\mathrm{AO}}}} \right)^2}}{{2\sigma _{{\mathrm{tot}}}^2}}} \right)\end{array}$$

and

$$\begin{array}{c}P\left( {\tau _{\mathrm{A}},\tau _{\mathrm{O}},\xi = 0} \right) = \mathop {\iint}\limits_{\mathrm{R}} P \left( {t_{\mathrm{A}},t_{\mathrm{O}},\xi = 0,\tau _{\mathrm{A}},\tau _{\mathrm{O}}} \right)dt_{\mathrm{A}}dt_{\mathrm{O}}\\ = \frac{{P\left( {\xi = 0} \right)}}{{Z_0}}\end{array}$$

Combining these expressions together, we obtain

$$\begin{array}{c}CCE = \mathop {{{\mathrm{max}}}}\limits_{t_A,t_O} P\left( {\tau _A,\tau _O,t_A,t_O,\xi = 1} \right){\mathrm{/}}P\left( {\tau _A,\tau _O} \right)\\ = \frac{{\sigma _{tot}}}{{2\pi \sigma _A\sigma _O\sigma _{AO}}}{\mathrm{Sigmoid}}\left( {\theta - \frac{{\left( {\tau _O - \tau _A - \mu _{AO}} \right)^2}}{{2\sigma _{tot}^2}} + \log \frac{{\sigma _{AO}}}{{\sigma _{tot}}}} \right)\end{array}$$

where \({\mathrm{Sigmoid}}(x) = 1/(1 + e^{ - x})\) is the sigmoid function.
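The closed-form expression for CCE can be checked against its definition as the causal peak divided by P(τA, τO). The sketch below does this, taking the sigmoid to be the logistic function 1/(1 + e^{−x}), which is what dividing the two preceding expressions yields. Function names and the numerical parameter values are assumptions for illustration only.

```python
import math

def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def _constants(sigma_a, sigma_o, sigma_ao, T):
    var_tot = sigma_a ** 2 + sigma_o ** 2 + sigma_ao ** 2
    z0 = T ** 2
    z1 = math.sqrt(2.0 * math.pi) * sigma_ao * T
    return var_tot, z0, z1

def cce_closed_form(tau_a, tau_o, mu_ao, sigma_a, sigma_o, sigma_ao, p_causal, T):
    """CCE via the sigmoid expression."""
    var_tot, z0, z1 = _constants(sigma_a, sigma_o, sigma_ao, T)
    sigma_tot = math.sqrt(var_tot)
    th = math.log(p_causal * z0 / ((1.0 - p_causal) * z1))
    d2 = (tau_o - tau_a - mu_ao) ** 2 / (2.0 * var_tot)
    prefactor = sigma_tot / (2.0 * math.pi * sigma_a * sigma_o * sigma_ao)
    return prefactor * sigmoid(th - d2 + math.log(sigma_ao / sigma_tot))

def cce_direct(tau_a, tau_o, mu_ao, sigma_a, sigma_o, sigma_ao, p_causal, T):
    """CCE as (causal peak of the joint) / [P(tau, xi=1) + P(tau, xi=0)]."""
    var_tot, z0, z1 = _constants(sigma_a, sigma_o, sigma_ao, T)
    sigma_tot = math.sqrt(var_tot)
    gauss = math.exp(-(tau_o - tau_a - mu_ao) ** 2 / (2.0 * var_tot))
    peak = p_causal / (2.0 * math.pi * sigma_a * sigma_o * z1) * gauss
    evidence = (p_causal * sigma_ao / (z1 * sigma_tot) * gauss
                + (1.0 - p_causal) / z0)
    return peak / evidence

# Hypothetical sensory noise (ms); mu_AO, sigma_AO and T follow the text.
params = dict(mu_ao=230.0, sigma_a=60.0, sigma_o=70.0, sigma_ao=10.0,
              p_causal=0.9, T=250.0)
for tau_o in (230.0, 280.0, 400.0):
    print(tau_o, cce_closed_form(0.0, tau_o, **params),
          cce_direct(0.0, tau_o, **params))
```

Both routes should agree to numerical precision, and CCE should decrease as the disparity |τO − τA − μAO| grows.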

In this work, we focus on the timing to investigate the intentional binding effects but the mathematical elucidations above can permit other modalities (e.g., visual or haptic) and structural properties (e.g., inter alia, location, size, shape, and texture).

Model fitting

The simple analytical expression for the Bayesian timing estimate has an intuitive form and exposes all parameter dependencies explicitly. This allowed us to perform a theoretically guided parameter search to reproduce the experiments. We posit that the perceptual delay d and the jitter of SD σ due to sensory noise explain the reported means and SDs of the baseline event timing. Hence, we could immediately fix the values of parameters dA, σA, dO, and σO (Table 1-Sets A and B). This leaves us with three free parameters, μAO, σAO, and P(ξ = 1), for which fitting is not direct. Equation (5) shows that μAO alone can determine the qualitative difference between action-outcome binding (τO − τA > μAO) and repulsion (τO − τA < μAO). This immediately gives us a possible range of μAO that could account for both binding and repulsion, which is 182 ms < μAO < 259 ms, because \(\left( {t_{\mathrm{O}}^ \ast + d_{\mathrm{O}}} \right) - \left( {t_{\mathrm{A}}^ \ast + d_{\mathrm{A}}} \right) - \mu _{{\mathrm{AO}}}\) must be positive in the voluntary condition and negative in the involuntary condition, from Eq. (5). We therefore tested \(\mu _{{\mathrm{AO}}} \in \left\{ {190,200, \ldots ,240,250} \right\}\) ms, in 10 ms increments. Our model also explains that both σAO and P(ξ = 1) can similarly influence the magnitude of binding and repulsion (cf. Eq. (5) and the formula for Pc). To obtain discernible perceptual shifts, σAO should be small and P(ξ = 1) should be large. As their effects are similar, we fixed σAO = 10 ms and varied P(ξ = 1) later on, and observed how different causal prior strengths influenced action-outcome binding and repulsion.

The principal measure of intentional binding is the mean perceptual shift of temporal awareness of action and sensory outcome. A perceptual shift is the change in the subjective estimation of action or outcome timing from the baseline to the operant condition. This can be computed as \(E\left[ {\hat t_{\mathrm{A}}} \right] - E\left[ {\tau _{\mathrm{A}}} \right]\) and \(E\left[ {\hat t_{\mathrm{O}}} \right] - E\left[ {\tau _{\mathrm{O}}} \right]\) (cf. Eq. (5)) for action and outcome timings, respectively. A positive shift therefore indicates that the perceived timing is shifted later in time, and a negative shift indicates that it is shifted earlier in time. We then computed the model estimation error as the absolute difference between our Bayesian model’s estimates of the mean action and outcome perceptual shifts and the corresponding perceptual shifts reported in the experiments. Finally, we selected the parameter values that minimized the model estimation error.

Simulation details

Table 1 lists all the parameters of our Bayesian model. We performed different simulations to reproduce the action and outcome perceptual shifts reported by Haggard et al.3 and Wolpe et al.22, and to explain their underlying psychophysical mechanisms in Bayesian terms.

In the first simulation, our objective was to determine μAO so as to reproduce the perceptual shifts reported by Haggard et al.3. We generated 35,000 instances of τA and τO pairs for each experimental condition using the baseline parameters in Table 1-Set A. Testing each value in the set of possible values for μAO, and with σAO = 10 ms, we obtained the model estimation errors for the reported action and outcome perceptual shifts listed in Table 2-Set A. We took the average of the model estimation errors for the voluntary, involuntary, and sham conditions to obtain a single model estimation error. We looked at the model estimation errors for (a) action perceptual shifts only, (b) outcome perceptual shifts only, and (c) action-outcome perceptual shifts. Our results showed the best estimates of the model to be at μAO = 230 ms. Furthermore, we observed that our Bayesian model’s estimates of the perceptual shift in action timing alone were sufficient to indicate the optimal parameters of the model.
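The grid search over μAO can be sketched as follows. This is an illustrative parameter-recovery exercise under stated assumptions, not a reproduction of the reported fit: the baseline delays and SDs in `CONDITIONS` are hypothetical (the real ones are in Table 1-Set A), and because Table 2 is not reproduced here, the target shifts are generated by the model itself at μAO = 230 ms rather than taken from experimental data. The trial count, σAO, T, and the grid follow the text.

```python
import math
import random

T = 250.0          # width of the range R (ms), from the text
SIGMA_AO = 10.0    # fixed prior SD (ms), from the text
N_TRIALS = 35_000  # simulated (tau_A, tau_O) pairs per condition, from the text

# HYPOTHETICAL baseline parameters (ms); placeholders for Table 1-Set A.
CONDITIONS = {
    "voluntary-like":   dict(d_a=5.0,  sigma_a=30.0, d_o=15.0, sigma_o=30.0),
    "involuntary-like": dict(d_a=80.0, sigma_a=30.0, d_o=15.0, sigma_o=30.0),
}

def mean_shifts(mu_ao, p_causal, cond, seed):
    """Mean perceptual shifts E[t_hat] - E[tau] for action and outcome."""
    rng = random.Random(seed)
    th = math.log(p_causal * T ** 2
                  / ((1.0 - p_causal) * math.sqrt(2.0 * math.pi) * SIGMA_AO * T))
    var_tot = cond["sigma_a"] ** 2 + cond["sigma_o"] ** 2 + SIGMA_AO ** 2
    shift_a = shift_o = 0.0
    for _ in range(N_TRIALS):
        tau_a = rng.gauss(0.0 + cond["d_a"], cond["sigma_a"])    # t_A* = 0 ms
        tau_o = rng.gauss(250.0 + cond["d_o"], cond["sigma_o"])  # t_O* = 250 ms
        disparity = tau_o - tau_a - mu_ao
        if th - disparity ** 2 / (2.0 * var_tot) > 0.0:  # r > 1: causal trial
            shift_a += cond["sigma_a"] ** 2 / var_tot * disparity
            shift_o -= cond["sigma_o"] ** 2 / var_tot * disparity
    return shift_a / N_TRIALS, shift_o / N_TRIALS

def estimation_error(mu_ao, p_causal, targets, seed):
    """Average over conditions of |model shift - target shift| (action + outcome)."""
    errors = []
    for name, cond in CONDITIONS.items():
        s_a, s_o = mean_shifts(mu_ao, p_causal, cond, seed)
        t_a, t_o = targets[name]
        errors.append(abs(s_a - t_a) + abs(s_o - t_o))
    return sum(errors) / len(errors)

# Synthetic targets generated at a known mu_AO (stand-in for Table 2-Set A).
targets = {name: mean_shifts(230.0, 0.9, cond, seed=1)
           for name, cond in CONDITIONS.items()}

grid = [float(mu) for mu in range(190, 251, 10)]
best_mu = min(grid, key=lambda mu: estimation_error(mu, 0.9, targets, seed=2))
print("best mu_AO on the grid:", best_mu)
```

With these placeholder parameters, the voluntary-like condition yields a later action and an earlier outcome (binding), whereas the involuntary-like condition, with its longer action delay, yields the opposite signs (repulsion), mirroring the qualitative role of μAO described above.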

Table 2 Reported perceptual shifts in action and outcome temporal awareness

Our objective in the second simulation was to obtain the specific strength of the causal prior that reproduces Haggard et al.’s results. With μAO = 230 ms and σAO = 10 ms, we tested for P(ξ = 1) in the range 0 to 1 with increments of 0.1. We used the same pairs of τA and τO from the first simulation, and we computed once again the model estimation errors for the empirical results listed in Table 2-Set A. We selected the P(ξ = 1) that best minimized the model estimation errors for the voluntary, involuntary, and sham conditions, and fit the experimental data. Table 1-Set C includes the parameters that yielded the best model estimates. Figure 1 shows the action and outcome perceptual shifts, as well as the intervals between perceptual shifts, which were obtained by our Bayesian model using these parameters.

In the third simulation, we aimed to reproduce the perceptual shifts reported by Wolpe et al.22, listed in Table 2-Set B. We generated another set of 35,000 τA and τO pairs, this time using the baseline parameters listed in Table 1-Set B. We performed simulations with μAO = 230 ms, σAO = 10 ms, and P(ξ = 1) in the range 0 to 1 with increments of 0.1. We did not perform additional simulations to redetermine μAO, as our aim is to qualitatively reproduce all the experiments with the same μAO and σAO as far as possible, in order to have simple yet consistent explanations by our Bayesian model. Although we did not modify μAO and σAO here, their effects can be predicted and explained by our model. The model estimation errors once again indicate that the estimates of action perceptual shifts led to the best estimates of the model. We list under Table 1-Set D the P(ξ = 1) that yielded the best estimates of the model for the low, intermediate, and high uncertainty tone conditions. Figure 5a shows the action and outcome perceptual shifts, and intervals between shifts, predicted by our Bayesian model for this experimental setup.

In the fourth simulation, our objective was to determine the influence of the causal prior and the temporal difference τO − τA (which varies in every trial) on the various predictions of our Bayesian model for Haggard et al.’s experimental setup. We used the model parameters and τA and τO pairs from the first and second simulations. We obtained our Bayesian model’s predictions of the intervals between action and outcome perceptual shifts, binding and repulsion effects, action-outcome timing interval, \(\hat t_{\mathrm{O}} - \hat t_{\mathrm{A}}\), in the baseline and operant conditions, and CCE. The results are shown in Figs. 2–4.

Our objective and target results in the final simulation were the same as the fourth simulation, but we used the model parameters and τA and τO pairs from the third simulation to account for the experimental setup of Wolpe et al.22. The resulting plots are shown in Figs. 5 and 6.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.