## Introduction

The term altruistic punishment describes the sacrifice of self-interest to punish violations of social norms like fairness or reciprocity1,2. This behaviour is often modelled using the Ultimatum Game (UG). In the typical UG3, an anonymous proposer makes an offer regarding how to split a monetary sum, which the responder may accept or reject. Accepted splits are carried out as proposed. If the split is rejected, both receive nothing. Rejection of any non-zero offer is widely considered irrational1,4, but players typically reject a proposed monetary offer from another player if it represents less than 20–25% of the total3,4. In one-shot interactions, such rejections are often described as altruistic because others may benefit from greater equity in future interactions with the punished defector while the player is left objectively worse off2,5. Further, it is commonly suggested that the purpose of such responses is to encourage fairness and cooperation at the group level2,6 and that this benefit outweighs the cost to the individual responder5.
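The payoff rule of the one-shot UG described above can be stated compactly in code. The following is a minimal illustrative sketch, not the task software used in this study; the function name `ultimatum_payoffs` and its arguments are hypothetical.

```python
def ultimatum_payoffs(total, offer, accepted):
    """Return (proposer_payoff, responder_payoff) for one Ultimatum Game round.

    `total` is the sum to be split and `offer` is the amount proposed to the
    responder. These names are illustrative assumptions, not from the study.
    """
    if not 0 <= offer <= total:
        raise ValueError("offer must lie between 0 and total")
    if accepted:
        # Accepted splits are carried out as proposed.
        return total - offer, offer
    # A rejected split leaves both players with nothing.
    return 0, 0


print(ultimatum_payoffs(10, 2, accepted=True))   # (8, 2)
print(ultimatum_payoffs(10, 2, accepted=False))  # (0, 0)
```

In the second call the responder gives up 2 units in order to deny the proposer 8, which is the costly rejection at issue here.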

However, evidence linking costly punishment in the UG to altruism is limited. Rejections of unfair offers in the UG appear to be unrelated or inversely related to prosocial behaviour in other economic paradigms despite the fact that cooperative tendencies across a variety of economic paradigms tend to be correlated, stable over time and associated with actual helping behaviour7,8. And unfair offers are usually rejected even in private impunity games in which punishment and the potential for benefiting others have been eliminated9,10. Consequently, alternate interpretations of costly punishment in the UG have been suggested.

One alternative is that rejection of inequity represents self-interested retaliation, motivated by immediate negative affect2,10,11,12,13,14,15,16. A second proposal (which is not inconsistent with the first) is that, in keeping with the idea that costly punishment is aimed at punishing violations of social norms like reciprocity, such punishment reflects sensitivity to prosocial group norms. The disposition to punish people who violate group-beneficial norms at a cost yields a reflexive tendency to respond to cooperation with cooperation and to defection with defection, without regard to future payoffs6. Such dispositions toward strong reciprocity can sustain high levels of cooperation within groups17. Computational models have revealed that strong reciprocity is an evolutionarily stable strategy and that it supports forms of cooperation that are observed in human populations that cannot be sustained by kin selection or direct or indirect reciprocity18,19. Strong reciprocity is also consistent with the dual inheritance model of cooperative behaviour suggested by Henrich and colleagues20,21. According to this model, the adaptive value of strongly reciprocal cooperation, which may be individually costly and without personal benefit, is bolstered by conformist biases that could allow this group-relevant adaptive strategy to be perpetuated through genetic inheritance as well as the social learning of local norms.

The variation in punishment behaviour observed across cultures supports the role of such norms in costly punishment in the UG and other economic paradigms. Some form of costly punishment in the UG is relatively consistent across a wide range of cultures, with extreme inequity being nearly universally punished. But thresholds vary, such that some societies are more punishment-averse, whereas others reject both unfair and hyper-fair offers22. This variation is best explained by a tendency toward strong reciprocity, coupled with fairness norms that are dictated by the local economic system, rather than individual-level economic and demographic variables. This suggests that humans may possess a predisposition toward strong reciprocity that supports and sustains cooperation and that can be modulated by local norms22,23. Importantly, these processes likely operate at implicit rather than explicit levels in many cases.

We aimed to test two competing hypotheses: that rejection of unfair offers in the UG reflects altruism, or that it reflects normative prosociality. We assessed non-normative altruism to dissociate altruism from norm sensitivity. Specifically, we examined UG task performance in a sample of extraordinarily altruistic individuals who had all donated a kidney to a stranger, a behaviour that is simultaneously strongly altruistic and strongly non-normative. Altruistic donors undergo surgery to donate a kidney to an unknown recipient. They receive no compensation and incur various non-monetary costs, including extensive pre-surgical screening and post-surgical pain24,25, such that altruistic kidney donation satisfies the most stringent definitions of altruism26,27,28. Accordingly, donors typically cite concern for the well-being of the recipient as their top motivation for donating24. Altruistic donors typically engage in high levels of other altruistic behaviours, including blood donation and volunteering24, consistent with the notion that prosocial tendencies are relatively stable across metrics and across time7,29,30,31. They also exhibit patterns of brain structure and activation consistent with heightened socio-emotional sensitivity32. But unlike many other forms of prosocial behaviour, altruistic kidney donation is strongly counter-normative and is often met with scepticism and even derision24,25.

We assessed normative altruism using the Self Report Altruism (SRA) scale30. This is a 20-item scale developed to assess self-reports of everyday behaviours such as ceding to others in line and holding doors open. Scores on this scale tend to correspond with other self-reports of prosociality and with peer-reported prosociality30. Because public displays of selflessness are an effective means of increasing social status33, SRA responses may index prosociality driven by norm conformity34, which sustains cooperative interactions but may not be altruistic in nature. Responses gathered using a separate sample of participants support the characterization of the SRA as an index of prescriptively and descriptively normative prosocial behaviours (see Materials and Methods). We hypothesised that if costly punishment in the UG stems from altruistic motivations, altruistic kidney donors would engage in increased rejection of inequity relative to controls. Conversely, we hypothesised that if costly punishment in the UG stems from cooperation-sustaining norm conformity2, rejection of inequity would correspond more closely to scores on the SRA.

To test these hypotheses, 16 altruistic kidney donors and 28 matched controls played the UG (Table 1). All participants completed preliminary online screening, which inquired about basic demographic information, including age, sex, education and income; kidney donor status; self-reported normative altruism using the SRA; and self-reported empathy using the Interpersonal Reactivity Index (IRI)35. Qualified participants were invited to complete laboratory testing, which included the UG as well as an assessment of IQ and a measure of economic prosociality, the Triple Dominance measure of social value orientation36 (see Materials and Methods). We included this measure to confirm prior findings that a prosocial social value orientation is not associated with increased costly rejection in the UG8,37.

### Analysis

Total scores for each participant were calculated for the SRA and total and subscale scores were calculated for the IRI. Participants were classified into one of the three categories of social value orientation if they selected at least six prosocial, individualistic, or competitive choices out of the nine-item Triple Dominance measure. A proself classification was also computed, based on at least six selections of either individualistic or competitive choices. Group differences in these measures were examined through independent sample t tests and chi-squared tests for independence.
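The classification rule for social value orientation can be illustrated with a short sketch. This is not the scoring script used in the study. The function name `classify_svo` and the string labels are hypothetical, and the sketch assumes that the proself classification counts individualistic and competitive choices together, which is one reading of the rule above.

```python
from collections import Counter

CATEGORIES = ("prosocial", "individualistic", "competitive")


def classify_svo(choices, threshold=6):
    """Classify one participant from nine Triple Dominance choices.

    Returns (primary, proself). `primary` is the category chosen at least
    `threshold` times, or None if no category reaches the threshold.
    `proself` is True when individualistic and competitive choices combined
    reach the threshold (an assumed reading of the proself rule).
    """
    if len(choices) != 9:
        raise ValueError("the Triple Dominance measure has nine items")
    counts = Counter(choices)
    primary = None
    for label in CATEGORIES:
        # With nine items and a threshold of six, at most one label qualifies.
        if counts[label] >= threshold:
            primary = label
    proself = counts["individualistic"] + counts["competitive"] >= threshold
    return primary, proself


print(classify_svo(["prosocial"] * 7 + ["individualistic"] * 2))
print(classify_svo(["individualistic"] * 4 + ["competitive"] * 3 + ["prosocial"] * 2))
```

The second participant has no primary classification under the six-choice rule but still qualifies as proself, which is why the two classifications are computed separately.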

For analysis of the UG, only valid responses made during the offer period of each trial were counted. If participants made multiple responses, only the first valid response was counted. The average response rate across the 24 trials was 92.8% (range: 75–100%). In keeping with other recent studies of the UG39, responses were analysed using the generalised estimating equations (GEE) method of logistic regression in SPSS 22. GEE is a semiparametric analysis method that uses generalised linear models while accounting for correlated repeated measurements, thus allowing multiple responses within each condition for each participant. With the response to each offer as the binomial response variable, we first examined a model in which group, fairness and a group × fairness interaction predicted rejection. Education and its interaction with fairness were then added to this model because education level differed between kidney donors and controls. Four alternate models were examined in which group was replaced by self-reported altruism, total empathy on the IRI, empathic concern on the IRI and prosocial social value orientation. Continuous measures were dichotomised via median split. Fairness was examined as a within-subjects variable, while group, education level and dichotomised self-report measures were examined as between-subjects variables. An exchangeable working correlation matrix was specified, as correlations between repeated trials were expected to be equivalent. A model-based estimator was used for the covariance matrix because a subject variable was also specified, which accounts for the repeated nature of within-subject measurements, and because of the relatively small sample size. Finally, correlations between average response latencies and rejection rates, both computed for unfair offers only, were examined as a function of SRA group.
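The response-inclusion rule (first valid response within the offer period) can be sketched as follows. This is an illustrative sketch under stated assumptions and not the preprocessing code used in the study: the event format, the key labels `"accept"` and `"reject"`, the function names and the 4-second offer period in the example are all hypothetical.

```python
def first_valid_response(events, offer_duration, valid_keys=("accept", "reject")):
    """Return the first valid response made during the offer period, or None.

    `events` is a list of (seconds_since_offer_onset, key) tuples for one
    trial. The format and key labels are assumptions for illustration.
    """
    for t, key in sorted(events):
        if 0 <= t <= offer_duration and key in valid_keys:
            return key
    return None


def response_rate(trials, offer_duration):
    """Proportion of trials that contain a counted response."""
    answered = sum(
        first_valid_response(events, offer_duration) is not None
        for events in trials
    )
    return answered / len(trials)


# Hypothetical data for three trials with an assumed 4 s offer period.
trials = [
    [(1.2, "reject"), (2.0, "accept")],  # two responses: only the first counts
    [(5.5, "accept")],                   # too late: no counted response
    [(0.8, "x"), (1.5, "accept")],       # invalid key is skipped
]
print([first_valid_response(ev, 4.0) for ev in trials])
print(response_rate(trials, 4.0))
```

Trials without a counted response simply contribute no observation, which GEE tolerates because it does not require the same number of observations per participant.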

### Online Survey

To confirm our characterization of behaviours listed on the SRA and costly punishment, but not altruistic kidney donation, as normative acts, we conducted an online survey on Qualtrics using Amazon Mechanical Turk. One hundred adult respondents across the United States were presented with each item on the SRA, a description of costly punishment (“Refuse to accept an unfair deal, even if it means both parties get nothing”) and a description of altruistic kidney donation (“Undergo surgery to donate a kidney to a stranger”) and were asked to rate the descriptive and prescriptive frequency of each behaviour. Specifically, on a five-point scale [(1) almost no one, (2) few people, (3) some people, (4) many people, (5) almost everyone] participants rated first what portion of the population ever engages in each behaviour and then, regardless of real world frequency, what portion of the population should engage in each behaviour, should the opportunity arise, in an ideal world. A free-response manipulation check inquiring how answers were selected for the descriptive and prescriptive sections confirmed that participants were using descriptive and prescriptive frames of reference, respectively. Rating differences were tested with paired sample t tests.
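For readers unfamiliar with the test, the paired-samples t statistic used to compare ratings can be computed as below. This is a minimal sketch using only the Python standard library. The function name `paired_t` and the ratings are hypothetical, and no data from the survey are reproduced here.

```python
import math
import statistics


def paired_t(x, y):
    """Return (t, degrees_of_freedom) for a paired-samples t test.

    `x` and `y` hold two ratings from the same respondents, in the same
    order. The statistic is the mean difference divided by its standard
    error. Undefined when all differences are identical (zero variance).
    """
    if len(x) != len(y):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1


# Hypothetical five-point ratings from five respondents for two behaviours.
behaviour_a = [4, 5, 3, 4, 5]
behaviour_b = [2, 3, 2, 3, 2]
t, df = paired_t(behaviour_a, behaviour_b)
print(round(t, 2), df)
```

The test is paired because each respondent rated every behaviour, so the ratings being compared are not independent.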