The Evolution of Generosity in the Ultimatum Game

When humans fail to make optimal decisions in strategic games and economic gambles, researchers typically try to explain why that behaviour is biased. To this end, they search for mechanisms that cause human behaviour to deviate from what seems to be the rational optimum. But perhaps human behaviour is not biased; perhaps research assumptions about the optimality of strategies are incomplete. In the one-shot anonymous symmetric ultimatum game (UG), humans fail to play optimally as defined by the Nash equilibrium. However, the distinction between kin and non-kin, with kin detection being a key evolutionary adaptation, is often neglected when deriving the “optimal” strategy. We computationally evolved strategies in the UG that were equipped with an evolvable probability to discern kin from non-kin. When an opponent was not kin, agents evolved strategies that were similar to those used by humans. We therefore conclude that the strategy humans play is not irrational. The deviation between behaviour and the Nash equilibrium may rather be attributable to key evolutionary adaptations, such as kin detection. Our findings further suggest that social preference models are likely to capture mechanisms that permit people to play optimally in an evolutionary context. Once this context is taken into account, human behaviour no longer appears irrational.


Supplementary Information
In order to understand why the detection of kinship allows strategies to become generous, we have to understand under which conditions strategies become evolutionarily successful. Let us consider two strategies i and j playing the one-shot anonymous ultimatum game. In this game, the first player has to make an offer on how to split a resource. This resource is set to 1.0. The offer is the fraction that the first player (proposer) offers to the receiving second player (responder). If the offer meets the responder's demands, the resource is split accordingly. If the offer is rejected, neither player receives anything. The game is played so that each player interacts with every other player, once as proposer and once as responder. The player's strategy is defined by a pair of values (p, q), where p is the fraction offered, and q is the demand to be met.
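The rules above can be sketched in a few lines of C++. This is a minimal illustration rather than the authors' implementation: the names Strategy, playOnce and payoff are hypothetical, and the sketch assumes that an offer exactly equal to the demand is accepted, which the text does not specify.

```cpp
#include <cassert>
#include <utility>

// A strategy is a pair (p, q): p is the fraction offered as proposer,
// q is the demand that an incoming offer has to meet as responder.
struct Strategy {
    double p;
    double q;
};

// Payoffs (proposer, responder) of one interaction over a resource of 1.0.
// Assumption: an offer equal to the demand counts as meeting it.
std::pair<double, double> playOnce(const Strategy& proposer,
                                   const Strategy& responder) {
    assert(proposer.p >= 0.0 && proposer.p <= 1.0);
    assert(responder.q >= 0.0 && responder.q <= 1.0);
    if (proposer.p >= responder.q) {
        return {1.0 - proposer.p, proposer.p};  // resource is split
    }
    return {0.0, 0.0};  // offer rejected, neither player receives anything
}

// Payoff E(a, b) that strategy a receives against strategy b when it acts
// once as proposer and once as responder.
double payoff(const Strategy& a, const Strategy& b) {
    const double asProposer = playOnce(a, b).first;
    const double asResponder = playOnce(b, a).second;
    return asProposer + asResponder;
}
```

Calling payoff for every ordered pair of strategies in a population would fill a payoff matrix E of the kind discussed next.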
The success of a strategy depends on its ability to optimize the offer (p) in such a way that the smallest acceptable fraction is handed over. At the same time, the demand (q) has to be optimized such that the responder receives as much as possible without forgoing too much by rejecting offers. The Nash equilibrium (NE) for this game is p ≈ ε > 0.0 and q ≈ ε > 0.0 (where ε is the smallest value one can offer that is not 0.0; in the following, we refer to this as p = 0.0 or q = 0.0) and can be derived directly from considerations about the evolutionarily stable strategy (ESS) [1].
The evolutionary stability of a strategy i or j depends on the payoffs players receive playing each other, E(i, j) and E(j, i), as well as playing against themselves, E(i, i) and E(j, j) (see Table S1). Strategy i is an ESS if:
E(i, i) > E(j, i) (0.1)
or if:
E(i, i) = E(j, i) (0.2)
and
E(i, j) > E(j, j) (0.3)
The above statement can be rephrased as: A strategy is an ESS if the payoff it receives playing itself, E(i, i), is larger than the payoff any other strategy receives playing it, E(i, i) > E(j, i). Thus, in order to find ESSs, one needs to look at the diagonal, and every strategy whose payoff on the diagonal is larger than any other payoff in the same column is an ESS (see Table S2).
The evolutionary stability of strategy i does not exclude other strategies (j, k, l, ...) from also being evolutionarily stable. To simplify the payoff matrix E, one can subtract any constant from each column without changing the evolutionary stability of the strategies. Therefore, we arrange the matrix in such a way that we have only 0 on the diagonal. The outcome of a game is now defined only by the sign of each value off the diagonal [2]. Ideally, for strategy i to become an ESS, it should maximize the payoff it receives against j and minimize the payoff strategy j receives playing i (see Table S2). Maximizing the payoff a strategy receives against itself increases its value on the diagonal, which increases its chance of being the highest value in the column. Conversely, increasing the payoff a strategy receives against another strategy reduces the likelihood that the opponent strategy has a higher payoff playing against itself than playing against the first strategy.
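The diagonal test and the column normalization can be illustrated with a short C++ sketch. It is only an illustration under stated assumptions: the names Matrix, isStrictEss and zeroDiagonal are hypothetical, E[r][c] is taken to be the payoff strategy r receives when playing strategy c, the matrix is assumed to be square, and only the strict condition E(i, i) > E(j, i) is checked.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// True if the diagonal entry of column i is strictly larger than every
// other entry in that column, i.e. E(i, i) > E(j, i) for all j != i.
bool isStrictEss(const Matrix& E, std::size_t i) {
    assert(i < E.size());
    for (std::size_t j = 0; j < E.size(); ++j) {
        if (j != i && E[j][i] >= E[i][i]) {
            return false;
        }
    }
    return true;
}

// Subtract each column's diagonal entry from the whole column, so that
// the diagonal becomes 0 and only the signs off the diagonal matter.
Matrix zeroDiagonal(Matrix E) {
    for (std::size_t c = 0; c < E.size(); ++c) {
        const double diagonal = E[c][c];  // read before it is overwritten
        for (std::size_t r = 0; r < E.size(); ++r) {
            E[r][c] -= diagonal;
        }
    }
    return E;
}
```

Because the same constant is removed from every entry of a column, the comparisons inside that column are unchanged, so isStrictEss gives the same answer before and after zeroDiagonal.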
In order to optimize the payoff while remaining an ESS (and preferably the only strategy to be an ESS), p and q have to change. Table S3 shows in green that for strategy i to receive more than j, q should be larger than 0.5 and p should be smaller than 0.5. At the same time, strategy i must play against itself, and fairness dictates that p > q [3]. Thus, the optimal strategy to play this game is p = 0.0 and q = 0.0 (while p > q), which is the ESS and the NE. However, this case considers only strategies that play unconditionally regardless of the opponent. If a strategy could lift the fairness constraint (p > q) by playing differently against itself than against other strategies, q could become larger, as Table S3 suggests. The ability to detect kin would allow this to happen.

Table S2: Payoff E, and what strategy i has to change in order to become the only strategy to be an ESS (row i, column j: maximize; row j, column i: minimize).

Table S3: Necessary condition for strategy i to become an ESS in red, and possible optimizations for strategy i to become the only strategy to be an ESS in green.

To test this hypothesis, we introduce the ability for strategies to detect kinship. Each agent receives a tag and a kin recognition threshold (t), both of which are kept between 0.0 and 1.0 [4,5]. A strategy i recognizes another strategy j as kin if |tag_i − tag_j| < t_i. Note that strategy j could come to a different conclusion than i. For illustration, let tag_i = 0.5 and tag_j = 0.7; t_i = 0.3 and t_j = 0.1. In this situation, i will recognize j to be kin, because |0.5 − 0.7| < t_i. Strategy j, on the other hand, will not recognize strategy i to be kin, because |0.5 − 0.7| > t_j. To permit strategies to play conditionally on the kin tag, we had strategies use one pair of values (p_kin, q_kin) when recognizing an opponent as kin and another pair of values (p_other, q_other) when recognizing an opponent as non-kin.
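The asymmetric recognition rule and the conditional choice of values can be written as a small C++ sketch. The struct layout and the names Agent, recognizesAsKin, offerTo and demandFrom are hypothetical and chosen only for illustration; the rule itself is the one stated above.

```cpp
#include <cassert>
#include <cmath>

// Six values per agent: two (offer, demand) pairs, a tag and a threshold.
struct Agent {
    double pKin, qKin;      // used when the opponent is recognized as kin
    double pOther, qOther;  // used when the opponent is not recognized as kin
    double tag;
    double threshold;       // kin recognition threshold t
};

// self recognizes other as kin if |tag_self - tag_other| < t_self.
// The test uses only self's threshold, so it need not be mutual.
bool recognizesAsKin(const Agent& self, const Agent& other) {
    assert(self.threshold >= 0.0 && self.threshold <= 1.0);
    return std::fabs(self.tag - other.tag) < self.threshold;
}

// Offer that self makes to other, conditional on kin recognition.
double offerTo(const Agent& self, const Agent& other) {
    return recognizesAsKin(self, other) ? self.pKin : self.pOther;
}

// Demand that self applies to an offer coming from other.
double demandFrom(const Agent& self, const Agent& other) {
    return recognizesAsKin(self, other) ? self.qKin : self.qOther;
}
```

With the values from the illustration (tags 0.5 and 0.7, thresholds 0.3 and 0.1), recognizesAsKin is true from i's point of view and false from j's, so i would play its kin pair against j while j plays its non-kin pair against i.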
Strategies still have to obey the basic rules in order to become an ESS (see Eq. 0.1 and 0.3), and strategy i can still try to prevent other strategies from being ESSs by maximizing E(i, j) while minimizing E(j, i) (see Table S3).
But now that an agent can potentially play differently against agents who are kin, we obtain a different payoff table (see Table S4). This new payoff matrix allows strategies to be fair, because p_kin > q_kin, while at the same time having a higher demand q_other > 0.5 and a lower offer p_other < 0.5.
This explains how introducing the ability to detect kinship relations can result in a strategy becoming generous toward non-kin players while still remaining an ESS.

Limiting Kin Detection
As a control to test the effect of kin detection, we manipulated the minimal kin recognition threshold. Instead of having the lowest threshold set to 0.0, we systematically increased it in steps of 0.1 and repeated the evolutionary experiment. The results (see Figure S1) confirm the intuition that, when the effect of kin selection is removed, the evolved strategies played against kin become more and more similar to the Nash equilibrium, while the part of the strategy that is used against non-kin starts to drift until it becomes completely neutral at very high threshold levels, because at high minimal thresholds no opponent is recognized as non-kin. A value not under selection will drift randomly, and since we sample from a uniform random distribution ([0.0, 1.0]), we can identify a completely neutral value as having a variance of 1/12, since that is the variance of a uniform random distribution on that interval. This also confirms that the effects on offer_other and demand_other are not due to drift but to actual selection pressures.
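The value 1/12 follows from the standard moments of a uniform distribution on [0, 1]. Writing X for a neutral value (a label introduced here only for the derivation):

```latex
\begin{align}
\mathbb{E}[X]   &= \int_0^1 x \, dx = \tfrac{1}{2}, \\
\mathbb{E}[X^2] &= \int_0^1 x^2 \, dx = \tfrac{1}{3}, \\
\operatorname{Var}(X) &= \mathbb{E}[X^2] - \mathbb{E}[X]^2
                       = \tfrac{1}{3} - \tfrac{1}{4} = \tfrac{1}{12} \approx 0.083 .
\end{align}
```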

Computational Model Implementation
The computational model is written in C++; the code can be obtained here: (A link to the code will be provided upon acceptance of the manuscript). The code is structured into three sections: setup, evolution, and data output. While specific experiments might change certain parameters, here we describe the default behaviour. At program start a population of agents is created, where each agent receives a genome vector that contains four values drawn from a uniform random distribution [0,1]. The first two values describe the offer and demand used when the opponent is recognized as kin; the other two values specify the offer and demand used when the opponent is not recognized as kin. Additionally, the tag is drawn from a uniform random distribution [0,1], and the recognition threshold is drawn from a distribution [limit,1]. The limit is used to test limits in kin detection as described above. In each generation, one agent is replaced: a random agent is removed and has to be replaced by another one chosen proportionally to fitness. Thus, we compute the likelihood for an agent to propagate offspring into the next generation as the agent's score divided by the sum of all scores. We then randomly select an agent, draw a random number ([0.0, 1.0]), and test whether this random number is smaller than the agent's relative score. If so, this agent is used to replace the one that was removed; if not, we continue to randomly draw agents. This mechanism ensures that agents are replaced proportionally to their score. After 5,000,000 of these generations, a random agent is picked, and we record the values of all its ancestors, following its line of descent back to an individual in the first generation. These data are then saved to a file and constitute one experiment. Each experiment uses a new and independent random seed.
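The replacement step described above amounts to rejection sampling, which can be sketched as follows. This is not the authors' code: the function name selectParent is hypothetical, std::mt19937 is an assumed choice of random number generator, and scores are assumed to be non-negative with a positive sum.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Returns the index of the agent chosen to replace the removed one.
// An agent is drawn uniformly at random and accepted with probability
// score / (sum of all scores); otherwise another agent is drawn.
std::size_t selectParent(const std::vector<double>& scores,
                         std::mt19937& rng) {
    const double total = std::accumulate(scores.begin(), scores.end(), 0.0);
    assert(!scores.empty() && total > 0.0);
    std::uniform_int_distribution<std::size_t> pickAgent(0, scores.size() - 1);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    while (true) {
        const std::size_t candidate = pickAgent(rng);
        if (unit(rng) < scores[candidate] / total) {
            return candidate;  // accepted proportionally to relative score
        }
    }
}
```

Because every agent is proposed with equal probability and accepted with a probability equal to its relative score, the accepted agent is distributed proportionally to score. The expected number of draws per replacement equals the population size, which is a cost of this simple scheme.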

Starting Condition Independence
At program start, agents are initialized with random values. To exclude the possibility that the starting point affected the outcome of the experiment, we tried various starting conditions and found that the evolutionary trajectories converged on the same point, regardless of start conditions. Because each organism is initialized with six values, we started independent experiments with every combination of these six values being either 0.0 or 1.0, resulting in 64 possible start conditions. All other experimental conditions were identical to the evolutionary experiment described for Figure 2 (main text) for w = 10.0 and allowing for kin detection to evolve. Each experiment was repeated 100 times; lines of descent were constructed, and data points after 2.5 million updates were averaged to find the convergence point. We found that, within the expected slight variation, the starting point had no effect on the point of convergence (see Figure S2). At the same time, we again found that, independently of the start condition, p_self and q_self evolved to be fair (p > q), as indicated by the dashed line in Figure S2.
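The 64 start conditions are the 2^6 corner points of the six-dimensional unit cube, which can be enumerated with a bit mask. The sketch below is only an illustration: the function name allStartConditions and the order of the six values within each array are assumptions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Enumerate every assignment of 0.0 or 1.0 to six values (2^6 = 64).
// Bit k of the mask decides whether value k starts at 1.0.
std::vector<std::array<double, 6>> allStartConditions() {
    std::vector<std::array<double, 6>> conditions;
    for (unsigned mask = 0; mask < 64; ++mask) {
        std::array<double, 6> values{};
        for (std::size_t bit = 0; bit < 6; ++bit) {
            values[bit] = ((mask >> bit) & 1u) ? 1.0 : 0.0;
        }
        conditions.push_back(values);
    }
    assert(conditions.size() == 64);
    return conditions;
}
```

Each returned array could then seed every agent of one experiment, with the 100 replicates per condition differing only in their random seed.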

Population Size and Mutation Rate
To rule out the possibility that the mutation rate μ (0.01) or the population size (100) affected the outcome of the experiment, we explored two more mutation rates (0.001 and 0.02) and two more population sizes (500 and 1000) in combination with each other. For each of the nine parameter combinations, 100 replicate runs were performed with selection strength w = 10.0. As in the starting condition experiment, the averages of p_self and q_self after 2.5 million updates were computed, as well as the respective values of p_other and q_other. As shown in Figure S3, there were no noticeable differences. Implementing mutation rates and population sizes even larger or even smaller than the ones tested would be increasingly unrealistic.