Introduction

Rapid and accurate evaluation of the consequences of one’s own behavior and appropriate adjustment of subsequent behaviors are essential processes for human survival. Humans not only learn from their own mistakes but can also learn from others’ experiences1,2. Observational learning has been investigated not only in humans but also in other animals, and single-neuron recording in non-human primates has provided interesting findings regarding the representation of self and others in the brain3,4,5,6. For example, there are neurons that respond only to the errors of others in the medial frontal cortex6. Given that humans do not necessarily use only one source of learning when they adjust to an environment, it is important to clarify how they use information accumulated from multiple learning sources in an integrated manner. Examining such processes is crucial for understanding adaptive behaviors in a social environment. In the reinforcement learning theory of action monitoring7, prediction of an action’s outcome is the requisite element. In the present study, we investigated how predictions from one’s own and another’s performance history affect each other. To address this issue, we specifically evaluated (1) the effect of one’s own performance history on the evaluation of another’s action outcomes, and (2) the effect of another’s performance history on the evaluation of one’s own action outcomes using event-related brain potentials (ERPs).

Feedback-related negativity (FRN) is an electrophysiological signal used to study outcome evaluation processes. It is a negative deflection of the ERP associated with negative outcomes such as monetary loss and performance error. It peaks 200–300 ms after the presentation of an action’s outcome and has a frontocentral scalp distribution8,9,10. FRN amplitude is usually evaluated via the difference waveform created by subtracting the ERP for a positive outcome from the ERP for a negative outcome11,12. In addition, studies have reported that the source of this ERP is the anterior cingulate cortex (ACC)8,9,13. Several studies suggest, however, that the difference in ERP amplitudes between desirable and undesirable outcomes is due not to an FRN for undesirable outcomes but to a reward positivity for desirable outcomes that corresponds to activation of reward-related areas14,15,16,17,18. The reinforcement learning theory of action monitoring states that the FRN reflects a reward prediction error (RPE) signal, that is, the difference between a predicted and an actual outcome. According to this theory, unexpected negative outcomes elicit a larger FRN than expected negative outcomes. One study tested this theory by manipulating outcome frequency through task difficulty19. When a task is easy, participants succeed on many trials and should expect to be correct. In line with the theory, unpredicted error feedback in the easy condition elicited a larger FRN than error feedback in the hard condition, and was followed by greater behavioral adjustments. Furthermore, recent studies using a modeling approach and single-trial analysis have also demonstrated that the FRN reflects prediction error signals20,21. Hence, the perceived likelihood of an outcome is a determining factor for FRN amplitude.
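The expectancy logic of the RPE account can be illustrated with a minimal sketch. It assumes a simple delta-rule learner, which is an illustrative choice and not necessarily the model used in the cited studies. The outcome coding (+1 correct, -1 error), the learning rate, and the success probabilities are all hypothetical.

```python
import random

def simulate_error_rpe(p_correct, n_trials=2000, alpha=0.1, seed=0):
    """Return the mean RPE on error trials for a delta-rule learner.

    Hypothetical settings: outcomes coded +1 (correct) / -1 (error),
    learning rate `alpha`, and a fixed probability of success `p_correct`.
    """
    rng = random.Random(seed)
    expected = 0.0              # running prediction of the outcome
    error_rpes = []
    for _ in range(n_trials):
        outcome = 1.0 if rng.random() < p_correct else -1.0
        rpe = outcome - expected        # actual minus predicted outcome
        if outcome < 0:
            error_rpes.append(rpe)
        expected += alpha * rpe         # move the prediction toward the outcome
    return sum(error_rpes) / len(error_rpes)

easy = simulate_error_rpe(p_correct=0.8)   # errors are rare, so unexpected
hard = simulate_error_rpe(p_correct=0.2)   # errors are common, so expected
print(f"mean error RPE, easy task: {easy:.2f}")
print(f"mean error RPE, hard task: {hard:.2f}")
```

Under these assumptions the mean RPE on error trials is more negative when errors are rare, which is the pattern the theory maps onto a larger FRN in the easy condition.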

A series of studies have revealed that the mechanism for calculating the RPE from one’s own action outcomes also operates when perceiving others’ outcomes, resulting in a negative deflection for negative outcomes22,23,24. This negative deflection is called the observer FRN (oFRN) because its latency, scalp distribution, and source are similar to those of the FRN25. Typically, the amplitude of the oFRN is much smaller than that of the FRN22,23,24,26. Findings on whether the oFRN is sensitive to the expectancy of action outcomes are currently mixed: at least one study showed that the oFRN is less sensitive to the expectancy of action outcomes than the FRN27, whereas another demonstrated that unexpected outcomes elicited a larger oFRN28. The latter effect of outcome expectedness on oFRN amplitude suggests that the system for generating the oFRN also stores the history of others’ performances and makes predictions regarding others’ action outcomes. However, it remains unclear whether the prediction of an action outcome distinguishes between different sources of experience.

In the present study, we examined the effect of the history of another’s performance on the RPE calculation derived from one’s own action outcomes by evaluating the FRN, and we evaluated the effect of one’s own performance history on the RPE derived from another’s action outcomes by evaluating the oFRN. We manipulated the task difficulty for pairs of participants independently. One participant always performed a time estimation task with either easy or hard difficulty. The other participant performed the same task with medium difficulty, making correct and erroneous responses equiprobably. If the prediction of action outcomes does not distinguish between sources, we would expect to see the same frequency effect in both the FRN and the oFRN, because the easy/hard difficulty of the one participant skews the overall distribution of correct and erroneous responses. That is, the history of one’s own and another’s performance would affect each other’s outcome predictions. In contrast, if the prediction of action outcomes does distinguish between sources, we would see no frequency effect or a different frequency effect between the FRN and the oFRN. We did not have a prediction as to which case would be upheld. We also explored how one’s own performance history would affect prediction of another’s outcomes, and how another’s performance history would affect the prediction of one’s own outcomes, if the latter case occurred.
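The two competing hypotheses can be made concrete with a small arithmetic sketch. The error rates below (0.2, 0.5, 0.8) and the equal trial counts for both participants are hypothetical values chosen for illustration; the actual rates are not stated in this section.

```python
# Hypothetical error rates; the actual rates are not given in this section.
ERROR_RATE = {"easy": 0.2, "medium": 0.5, "hard": 0.8}

def predicted_error_prob(target, partner, pooled):
    """Predicted probability that `target`'s next outcome is an error.

    pooled=True : both participants' outcomes feed one shared history
                  (equal trial counts assumed), so the prediction is the
                  average of the two error rates.
    pooled=False: each source has its own history, so only the target's
                  own error rate matters.
    """
    if pooled:
        return (ERROR_RATE[target] + ERROR_RATE[partner]) / 2
    return ERROR_RATE[target]

for pooled in (True, False):
    label = "pooled" if pooled else "separate"
    for partner in ("easy", "hard"):
        p = predicted_error_prob("medium", partner, pooled)
        print(f"{label:8s} partner={partner:6s} P(error | medium) = {p:.2f}")
```

With a pooled history, the predicted error probability for a medium-difficulty outcome shifts with the partner's difficulty, so a frequency effect would be expected for both the FRN and the oFRN. With fully separate histories, the prediction does not change across sessions. The sketch does not cover the intermediate case in which the two histories interact in some other way.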

Methods

Participants

Twelve gender-matched pairs (14 females and 10 males) participated in this experiment. Participants in each pair were independently recruited. Participants were 18–25 years old (M = 20.8; SD = 2.0) and reported having normal or corrected-to-normal vision. Prior to the experiment, we obtained written informed consent from participants. This experiment was approved by the University Research Ethics Review Board and was conducted in accordance with the Declaration of Helsinki. The data from one participant were not used in the analysis because the number of available trials for the computation of the ERPs did not reach the criteria.

Electrophysiological data

Figure 3 shows the ERPs for one's own and another's correct and error feedback, their difference waveforms at FCz, and their scalp distributions. A repeated-measures ANOVA for the peak latency of the difference FRN showed a main effect of agency (Fcorrected(1,22) = 23.3, p < 0.001, ηp2 = 0.514), indicating that the peak latency of the oFRN (M = 202 ms) was shorter than that of the FRN (M = 249 ms). The main effect of frequency and the interaction were not significant.
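As a rough illustration of how a difference waveform and its peak latency can be obtained, the following sketch subtracts a correct-feedback ERP from an error-feedback ERP and finds the most negative point in a search window. The waveforms are synthetic, and the sampling step, search window, and function names are assumptions made for this example, not the study's actual settings.

```python
import math

def difference_wave(error_erp, correct_erp):
    """Error-minus-correct difference waveform, sample by sample."""
    return [e - c for e, c in zip(error_erp, correct_erp)]

def negative_peak(times_ms, wave, window=(150, 350)):
    """Latency (ms) and amplitude of the most negative point in `window`.

    The search window is a hypothetical choice, not the study's setting.
    """
    lo, hi = window
    in_window = [(v, t) for t, v in zip(times_ms, wave) if lo <= t <= hi]
    amplitude, latency = min(in_window)   # most negative value wins
    return latency, amplitude

# Synthetic single-channel ERPs sampled every 2 ms (made-up data).
times = list(range(0, 600, 2))
correct = [0.0 for _ in times]
error = [-6.0 * math.exp(-((t - 250) / 40) ** 2) for t in times]

diff = difference_wave(error, correct)
latency, amplitude = negative_peak(times, diff)
print(f"peak latency = {latency} ms, amplitude = {amplitude:.2f} uV")
```

In practice the same two steps would be applied to each participant's averaged ERPs at the electrode of interest before the latencies are entered into the ANOVA.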

Significant elicitation of the FRN in both the frequent (−6.72 μV) and infrequent (−8.17 μV) conditions was confirmed by one-sample t tests (t(22) = 4.7, p < 0.001, d = 1.380; t(22) = 7.8, p < 0.001, d = 2.312). A paired-sample t test comparing the difference FRN amplitude between the frequent and infrequent conditions was not significant. Next, a one-sample t test confirmed significant elicitation of the difference oFRN in the frequent condition (−2.76 μV; t(22) = 3.3, p = 0.003, d = 0.986) but not in the infrequent condition (−0.66 μV). The amplitude of the difference oFRN in the frequent condition was significantly larger than that in the infrequent condition (t(22) = 2.4, p = 0.023, d = 0.575). To identify the source of this difference between the frequent and infrequent conditions, we conducted an additional ANOVA on the original oFRN amplitude with Frequency and Outcome (correct, error) as factors. This test indicated a significant main effect of outcome (F(1,22) = 7.3, p = 0.013, ηp2 = 0.249) and a significant interaction (F(1,22) = 6.0, p = 0.023, ηp2 = 0.214). That is, the amplitude of the oFRN for the frequent error outcome was larger than those for the frequent correct outcome (p = 0.003) and the infrequent error outcome (p = 0.024).
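The one-sample and paired-sample tests reported above can be sketched as follows. The amplitudes are made-up values for five hypothetical participants, not the study's data. The effect size here is the mean divided by the standard deviation, which is one common convention and may differ from the formula the authors used. The p value is omitted because it requires the t distribution, which the Python standard library does not provide.

```python
import math
import statistics

def one_sample_t(values, mu=0.0):
    """Return (t, df, d) for a one-sample t test against `mu`.

    d is the mean difference divided by the sample SD (an assumed
    convention; the section does not specify the formula used).
    """
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)             # sample SD (n - 1 denominator)
    t = (mean - mu) / (sd / math.sqrt(n))
    d = (mean - mu) / sd
    return t, n - 1, d

def paired_t(a, b):
    """Paired-sample t test as a one-sample test on the differences."""
    diffs = [x - y for x, y in zip(a, b)]
    return one_sample_t(diffs)

# Made-up difference-wave amplitudes (uV), NOT the study's data.
frequent = [-3.0, -2.0, -4.0, -1.0, -2.5]
infrequent = [-1.0, -0.5, -1.5, 0.5, -1.0]

t1, df1, d1 = one_sample_t(frequent)          # is the deflection nonzero?
t2, df2, d2 = paired_t(frequent, infrequent)  # do the conditions differ?
print(f"one-sample: t({df1}) = {t1:.2f}, d = {d1:.2f}")
print(f"paired:     t({df2}) = {t2:.2f}, d = {d2:.2f}")
```

A negative t in the one-sample test indicates a mean amplitude below zero, that is, a negative-going deflection.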

Discussion

To investigate how the history of performance in oneself and others affects the RPE calculation derived from each other's action outcomes, we manipulated the task difficulty for self and other independently. The behavioral results showed that the task difficulty appropriately changed the frequency of the outcomes. The frequencies of correct and error trials were similar to those of a previous study19. Consistent with previous studies8,9,10,11,12,19,22,23,24,26,27,31, the typical FRN effect was found for one's own action outcomes, regardless of the partner’s outcome frequency. That is, the amplitude of the FRN for the error feedback was larger than that for the correct feedback, suggesting that feedback on one's own performance was processed appropriately. However, the difference FRN did not differ between the frequent and infrequent conditions. In contrast, the difference oFRN amplitude was larger in the frequent condition than in the infrequent condition. If the responses and outcomes of self and other were tracked indistinguishably, then the same frequency effect should have emerged in the FRN and the oFRN. Thus, the presence of this effect in the difference oFRN amplitude but not in the FRN suggests that the histories of one’s own and another’s action outcomes are tracked separately.

We found a frequency effect in the amplitude of the difference oFRN. That is, the amplitude of the difference oFRN, derived from the other’s action outcomes, was larger in the frequent condition than in the infrequent condition. Moreover, the additional ANOVA for the original (i.e., pre-subtraction) oFRN amplitude indicated that this frequency effect stems from differences in the evaluation of the partner’s error outcomes rather than correct outcomes. A possible explanation of this phenomenon is that the monitoring system predicts the other’s action outcome relative to the history of one’s own performance. When a participant performed the hard difficulty task and had a negatively biased performance history, a moderate number of the other’s errors in the medium difficulty task were processed as relatively unexpected events compared to one’s own erroneous responses. In the same manner, when a participant performed the easy difficulty task and had a positively biased performance history, a moderate number of the other’s errors in the medium difficulty task were subjectively recognized as expected events. Thus, the results of this study are consistent with previous studies indicating that unexpected outcomes lead to larger difference oFRN amplitudes than expected ones28. Several lines of research have suggested that the monitoring system uses a reference point to evaluate the consequence of an action35,36. For example, the absence of a reward is perceived as a bad event when the alternative is a reward, but as a good event when the alternative is monetary loss35. Taken together, the present study suggests that the monitoring system calculating the RPE signal refers to the prediction for one’s own performance when it evaluates the other’s action outcomes.

Finally, one recent study using single-trial EEG analysis indicates that the oFRN does not reflect reward prediction errors21. However, in that study, the actor’s performance had no effect on the observer’s monetary gain or loss, and thus the other’s outcomes had only low significance. In the current study, by contrast, the partner’s outcomes also affected the monetary consequences for the observer, so the reward prediction error may have been calculated for the partner’s outcomes as well. Apart from the findings from single-trial EEG analysis, the sensitivity of the oFRN to the expectedness of outcomes has been reported to be lower than that of the FRN27. Thus, the frequency effect on the oFRN, but not on the FRN, in this experiment may seem surprising at first glance. However, given that the person causing the bias in performance history in the session where we observed the frequency effect on the oFRN was oneself rather than the other, this result makes sense. These results suggest that information related to one’s own outcomes plays an important role even when predicting others’ outcomes. This implication extends the finding from previous studies that one’s own action outcomes are more motivationally significant than those of others22,23,24,26.

In conclusion, the present study revealed that the monitoring system tracked histories of one’s own and others’ outcomes separately. In addition, the information related to one’s own outcomes played a crucial role even when predicting the other’s action outcomes.