Abstract
All organisms descend from populations with limited resources, so it is clear why evolution should select strategies that win resources at the expense of competitors. Less obvious is how altruistic behaviours evolve, whereby an individual helps others despite expense to itself. Modelling simple agents using evolutionary game theory, it is shown that steady states of extreme altruism can evolve when payoffs are very rare compared with death. In these states, agents give away most of their wealth. A new theorem for general evolutionary models shows that, when payoffs are rare, evolution no longer selects strategies to maximize income (average payoff), but to minimize the risk of missing out entirely on a rare resource. Principles revealed by the model are widely applicable where the game represents rare life-changing events: disasters or gluts.
Introduction
Altruism exists in many species^{1,2,3,4}, even microbes^{5,6,7}. Famously, a bird’s alarm call^{1} is an altruistic trait, as it benefits others at the cost of alerting predators to the calling bird. Here (in common with refs^{3,8}), we shall distinguish between “cooperation”^{9}, which may benefit both parties involved, and “altruism”, which benefits only others, not the altruist.
In evolutionary game theory, the complex competitive processes of life are modelled by simple agents playing a simple game. They reproduce and die according to the game’s outcome, and offspring inherit (imperfect) copies of parents’ strategies. A number of mechanisms have been identified^{10} that can promote altruism in such models, including social compensation^{11,12,13}, group selection^{8}, repeated fragmentation into colonies^{7}, or kin selection with low cost compared to conferred benefit^{1,8,14,15}. These and similar mechanisms can arise spontaneously in spatially structured and fluctuating populations^{4,7,16,17,18}. Another generic mechanism is identified in the present investigation.
Insight into altruism is gained from a standard evolutionary model, the spatial ultimatum game (UG)^{19}, in which one player, the proposer, must decide how to apportion some beneficial resource between itself and the other player, the responder, who then decides whether to accept the offered portion. If the responder accepts, it receives that portion and the proposer keeps the remainder. If the offer is rejected, neither player receives anything. The proposer gains no direct benefit from the portion given away. Thus increasing that portion constitutes altruism.
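The rules just described can be summarised by a minimal payoff function. This is an illustrative sketch, assuming a unit resource and acceptance whenever the offer meets the responder's threshold (the convention used in the rest of this article, where p denotes the offered fraction and q the acceptance threshold):

```python
def ultimatum_payoffs(p_offer, q_threshold):
    # One round of the ultimatum game over a unit resource.
    # The proposer offers the fraction p_offer; the responder accepts
    # if and only if the offer meets its threshold q_threshold.
    if p_offer >= q_threshold:
        return (1.0 - p_offer, p_offer)  # (proposer, responder) payoffs
    return (0.0, 0.0)                    # rejection: neither gains anything
```

For example, an offer of 0.4 against a threshold of 0.3 yields payoffs (0.6, 0.4), whereas the same offer against a threshold of 0.5 yields nothing for either player.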
The ultimatum game (UG) is a paradigm for the trading or squandering of any resource. It was originally studied in experiments on human players^{19}, but the current study is not specific to humans. While simple games can successfully represent some aspects of human behaviour^{4}, it should not be inferred that those same games influenced the human evolution responsible for those behaviours. Thus, when used in evolutionary models, the UG should be understood as a proxy for resource allocation during the evolution of species with simple behaviours.
The game was originally modelled^{13} in mean field (all agents interacting with all others). It has since been studied spatially^{17,20,21} and subject to noise^{18,22}, yielding steady states with average offers above the Nash-equilibrium (“rational” self-interest) value of zero and, in some cases^{18,21}, close to the “fair” value of 50%. Higher average offers are seen in mini-game versions of the UG, where only a discrete subset of strategies is allowed^{23}, or if other constraints are imposed on the strategies^{13,24} or rules of the UG. See ref.^{18} for a recent review.
We shall see that, if gameplay is very rare compared with death (occurring on average once in many lifetimes per individual), steady states evolve with average offers of 75% for some parameter values, without the introduction of constraints or alteration of the rules of the game.
Here (Box 1), a version of the spatial UG^{20} is used, with the order of trading and competition between agents determined stochastically (as in ref.^{22}). Their strategies evolve freely by natural selection. The model has two parameters: mutation scale μ and death rate R (equal to the birth rate, and exceeding the unit gaming rate). To check model-independence of the findings, other versions, with non-cumulative payoffs and with competitive births (rather than deaths), have also been simulated (to be published elsewhere).
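Box 1 is not reproduced in this extract, so the following sketch is a guess at rules consistent with the description above: games occur at unit rate per site and deaths at rate R; in a competition, the poorest of a focal agent's four neighbours is replaced by a mutated copy of the focal strategy, born with zero wealth. The actual Box 1 rules may differ in detail.

```python
import random

def neighbours(i, j, L):
    # Periodic boundary conditions on an L x L square lattice.
    return [((i + 1) % L, j), ((i - 1) % L, j), (i, (j + 1) % L), (i, (j - 1) % L)]

def step(lattice, L, R, mu, rng):
    # One stochastic event: a game (rate 1) or a birth/death competition (rate R).
    i, j = rng.randrange(L), rng.randrange(L)
    if rng.random() < 1.0 / (1.0 + R):
        # Ultimatum game: focal site proposes to a random neighbour.
        ni, nj = rng.choice(neighbours(i, j, L))
        p1, q1, w1 = lattice[(i, j)]
        p2, q2, w2 = lattice[(ni, nj)]
        if p1 >= q2:                                  # offer accepted
            lattice[(i, j)] = (p1, q1, w1 + 1.0 - p1)
            lattice[(ni, nj)] = (p2, q2, w2 + p1)
    else:
        # Competition: the poorest of the focal agent's four neighbours
        # (ties broken at random) is replaced by a mutated copy of the
        # focal agent's strategy, with zero wealth.
        poorest = min(neighbours(i, j, L), key=lambda s: (lattice[s][2], rng.random()))
        p, q, _ = lattice[(i, j)]
        p = min(1.0, max(0.0, p + rng.gauss(0.0, mu)))
        q = min(1.0, max(0.0, q + rng.gauss(0.0, mu)))
        lattice[poorest] = (p, q, 0.0)

def simulate(L=32, R=20.0, mu=0.005, events=20000, seed=1):
    # Random initialization: (p, q) uniform on the unit square, zero wealth.
    rng = random.Random(seed)
    lattice = {(i, j): (rng.random(), rng.random(), 0.0)
               for i in range(L) for j in range(L)}
    for _ in range(events):
        step(lattice, L, R, mu, rng)
    return lattice
```

At R much larger than 1, most sites experience many replacements between games, reproducing the payoff-scarcity regime studied here.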
Results
For the cases reported in this section, the system is initialized with the strategy (p_{i}, q_{i}) at each site (see Box 1) drawn independently and uniformly from the unit square in strategy space, p_{i}, q_{i} ∈ [0, 1]. The dynamical rules are iterated while population statistics are measured as functions of time, until they asymptote. The asymptotic steady states were found to be independent of the initial conditions (see Methods).
Timescales and starting transient
On the shortest timescales, comparable to or smaller than R^{−1} (the reciprocal of the death, equivalently birth, rate), the distribution of ages in the young population varies with time t. When \(t\ll {R}^{-1}\), the mean age is approximately equal to t. The mean and standard deviation of the age distribution are plotted against time in Fig. 1 for a typical simulation, with death rate R = 100 and mutation strength μ = 0.00258. The age distribution remains almost constant for times \(t\gg {R}^{-1}\), with just small variations visible in Fig. 1 at later times, due to changes in the distribution of strategies, which take place on longer timescales. We next consider those strategic dynamics.
For large mutation scale \(\mu \gtrsim 0.1\), strategies continue to fill the unit square, so that mean offer and acceptance values remain at \(\langle p\rangle =\langle q\rangle =\frac{1}{2}\).
For small μ (henceforth assumed), typical intermediate-time evolution (occurring on the timescale of the mean time between games at a site, \(t \sim 1\)) is shown in Fig. 2. Strategies with p_{i} < q_{i} quickly die out, leaving only the triangular region 0 ≤ p ≤ 1, 0 ≤ q ≤ p of strategy space (henceforth called the dominant triangle) occupied for the remainder of the simulation. At the end of this intermediate stage, strategies fill the dominant triangle approximately uniformly, so the mean offer (at its centre of mass) is very generous, \(\langle p\rangle \approx \frac{2}{3}\).
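The value 2/3 is simply the p-coordinate of the centroid of the triangle 0 ≤ q ≤ p ≤ 1, which a quick Monte Carlo estimate confirms (illustrative only):

```python
import random

def mean_offer_in_dominant_triangle(n=200000, seed=0):
    # Sample strategies uniformly from the dominant triangle 0 <= q <= p <= 1
    # by rejection from the unit square, and return the mean offer <p>.
    rng = random.Random(seed)
    total, count = 0.0, 0
    while count < n:
        p, q = rng.random(), rng.random()
        if q <= p:
            total += p
            count += 1
    return total / count
```

The estimate converges to ⟨p⟩ = 2/3, the centre of mass quoted above.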
Subsequently, the distribution of strategies slowly evolves towards the final steady state, but remains confined to the dominant triangle. Within that region, generous \((p > \frac{1}{2})\) or selfish \((p < \frac{1}{2})\) strategies may dominate, depending on the parameter values R and μ.
The typical timescale of this final approach to equilibrium is set by the diffusive motion of the population through strategy space, due to inheritance with random mutation. The typical number of generations per unit time per lattice site is given by the death (and birth) rate R. Each reproduction is accompanied by a mutation: a random displacement in the (p, q) strategy space with variance approximately μ^{2} (neglecting selection and boundaries). Hence, in the absence of selection, lineages make excursions of variance Rμ^{2}t (in each direction, p and q) in time t. Under selection pressure, the random walks in strategy space are modified, so corrections to this formula arise, but it continues to provide a useful order-of-magnitude estimate.
To reach a statistically steady state, the population requires time to explore the unit-square strategy space. That is,
$$R{\mu }^{2}{\tau }_{{\rm{eq}}}=1,$$(1)
which defines τ_{eq}, the characteristic timescale for equilibration of the steady-state distribution of strategies that coexist in the population.
The evolution of the mean offer and acceptance values (〈p〉, 〈q〉), spanning all timescales, is plotted in Fig. 3 for the cases (R, μ) = (15, 0.0015) and (100, 0.00258), for which τ_{eq} = 29600 and τ_{eq} = 1500 respectively.
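These equilibration times follow directly from the diffusive estimate above: setting the excursion variance Rμ²τ equal to the unit-square scale gives τ_eq ≈ 1/(Rμ²), reproducing the quoted values to within rounding:

```python
def tau_eq(R, mu):
    # Diffusive equilibration estimate: R * mu^2 * tau_eq ~ 1, i.e. the time
    # for lineages to explore the unit square of strategy space.
    return 1.0 / (R * mu * mu)
```

tau_eq(15, 0.0015) ≈ 2.96 × 10⁴ and tau_eq(100, 0.00258) ≈ 1.5 × 10³, matching the values quoted above.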
Steady states
The final statistically steady states are ascertained (see Methods) for a range of (R, μ) parameter values. Figure 4 shows the steadystate mean strategies (〈p〉, 〈q〉). At very high death rate R, extreme altruism (very generous offers) persists.
A typical instantaneous configuration of agents on the 2D lattice is shown in Fig. 5 for a population in a statistically steady state at (R, μ) = (80, 0.00462), where agents are very generous on average, (〈p〉, 〈q〉) = (0.70 ± 0.02, 0.24 ± 0.01).
As we shall discuss in the next section, the reasons for the stability of the generous strategies depend on the stochastic nature of gameplay. Because the order of gameplay and choice of opponent is stochastic, the payoff received for any given strategy is not uniquely determined. Instead, the payoff for a given strategy has a distribution with a nonzero variance, so that its mean is not the only important property.
This can be seen in the scatter graph of the wealth w of each agent, plotted against its offer value p, shown for a typical steady-state configuration in Fig. 6. At the death rate of R = 80, the UG is played at each site typically only once in every eighty lifetimes. Hence most agents in the system have never played the ultimatum game in their lives, and therefore have no wealth, w = 0. They are clustered along the horizontal axis in Fig. 6. A small minority of the population has nonzero values of w.
Consider, for example, an agent with strategy p = 0.7 (and some typical value of q). It might or might not play the UG as proposer and/or responder and, if it does, will partner an agent with an asyet unknown strategy. Hence the net payoff for strategy (p, q) is not uniquely determined but (from Fig. 6 at p = 0.7), might be zero (most likely), or close to 0.3 or 0.7 or, with lower probability, some other value.
Analysis
Payoff advantage of dominanttriangle strategies
Let us first consider why the population self-organises into the dominant triangle (Fig. 2). Neighbouring sites are usually closely related, so differ in strategy by few mutations. Hence, agents play mostly against approximately their own strategy, so effective kin selection emerges from relatedness correlations (“assortment”^{3}). Thus an agent i with p_{i} < q_{i} is more likely to receive zero payoff, because the offers of its kindred neighbours fall below its own acceptance threshold. We shall next argue that such agents cannot exploit those with p > q.
Stability against invasion
Figure 7 illustrates an interface between regions of differing strategies. Within each region, agents are assumed to be locally similar. Agents shaded grey have strategies ≈(p_{0}, q_{0}) where p_{0} < q_{0}, so reject offers from similar agents. Unshaded agents have strategies ≈(p_{1}, q_{1}) in the dominant triangle, p_{1} > q_{1}, so accept kindred offers.
Each agent can play with four neighbours, as proposer or responder, giving it eight possible distinct games, all with equal probability. Of those eight, the number resulting in success (i.e. nonzero payoff) is labelled for each agent. Those numbers therefore give the agents’ relative probabilities of receiving a nonzero payoff. The payoff magnitude remains uncertain, but is irrelevant because (as we shall see) survival rate depends only on whether the payoff is zero or nonzero, not on its mean value. The reason is that, when payoffs are much rarer than death (high R), any agent with nonzero wealth is inevitably the richest in its neighbourhood, because most agents have never played.
So a strategy’s survival depends only on its proficiency in avoiding zero payoff, as shown by the theorem in the next section (which is not confined to the UG). In all cases in Fig. 7, we see that unshaded agents have higher success rate, so outcompete those (shaded) outside the dominant triangle.
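The success counts of Fig. 7 can be reproduced by direct enumeration. The helper below is hypothetical but follows the acceptance rule (an offer succeeds when it meets the responder's threshold) assumed throughout:

```python
def success_count(agent, nbrs):
    # agent and nbrs are (p, q) strategy pairs. Count, out of the eight
    # possible games (proposer or responder with each of four neighbours),
    # those that yield a nonzero payoff for the focal agent.
    p, q = agent
    n = 0
    for (pn, qn) in nbrs:
        if p >= qn:   # as proposer: own offer accepted by the neighbour
            n += 1
        if pn >= q:   # as responder: neighbour's offer meets own threshold
            n += 1
    return n
```

An agent with p₁ > q₁ surrounded by kindred strategies scores 8 out of 8, whereas one with p₀ < q₀ among its own kind scores 0, in line with the interface argument above.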
Away from the interface, similar strategies have identical success rates and hence, by the theorem below, equal fitness. So lineages diffuse through a flat fitness landscape in the dominant triangle, filling it uniformly. This explains the approximate result 〈p〉 ≈ 2/3.
If we relax the assumption of locally similar strategies and consider instead the opposite limit of well-mixed agents, then a generous population again remains stochastically stable^{25} by virtue of the high-death-rate theorem (in the next section). In that case, a population of agents with a distribution of strategies uniformly filling the region (p > 1/2, q < 1/2) all have the maximum success rate (no offers rejected), because p > q in every case. In the presence of such a population, a cheat with p < 1/2 would have a higher payoff whenever its offer is accepted, but a lower probability of acceptance (and thus of nonzero payoff), and hence a lower fitness, by the theorem. So the cheat cannot invade.
Such a population has mean 〈p〉 = 3/4. There is some evidence of states close to this in Figs 3b and 4a.
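The acceptance-risk argument can be made concrete. Against residents whose thresholds q are uniform on [0, 1/2], any resident offer p > 1/2 is always accepted, while a cheat's offer p < 1/2 is accepted only with probability 2p. A sketch under those assumptions:

```python
import random

def acceptance_probability(p_offer, n=100000, seed=0):
    # Probability that an offer p_offer is accepted by a responder drawn
    # from the generous resident population, whose acceptance thresholds q
    # are uniform on [0, 1/2].
    rng = random.Random(seed)
    accepted = sum(1 for _ in range(n) if rng.uniform(0.0, 0.5) <= p_offer)
    return accepted / n
```

A resident offer of 0.6 is accepted with probability 1, whereas a cheat offering 0.2 is accepted only about 40% of the time, so the cheat runs a higher risk of zero payoff despite its higher payoff when successful.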
Theorem: Evolution at high death rate
Much of the behaviour discussed above is attributable to a general feature of evolution at high death rate (compared with the rate of game play), not confined to a specific evolutionary game.
All agentbased evolutionary models include two processes:

(i)
Agents are assigned wealth depending on their strategies and on some rules of gameplay, which may have a stochastic element arising from the game itself or from the order of play or from the environment of each agent and its neighbours.

(ii)
Agents are replaced/reproduced according to some rules that depend on their wealth. This stage involves comparing the wealths of some local neighbourhood of z competing agents (where z → ∞ would be the well-mixed mean-field case). In particular, when death is wealth-dependent then, in competitions amongst a local neighbourhood of z agents, one is killed. Survival probability for each agent is some function K(w, {w_{i}}) of its wealth w and those {w_{i}} of competing agents. (An example, defined by the competition in Box 1 in the absence of any ties, would be \(K=\tfrac{1}{2}(1-1/z)\) if \(w\, < \,{w}_{i}\,\forall \,i\) and \(K=1-1/(2z)\) if \(w\, > \,{w}_{i}\) for some i.)
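As a consistency check, assuming the example survival function takes the value ½(1 − 1/z) for the unique poorest agent and 1 − 1/(2z) otherwise, the requirement that exactly one agent dies (i.e. that the kill probabilities sum to unity, property (a) below) is easy to verify:

```python
def K_example(w, others):
    # Example survival probability (no ties): the unique poorest agent in a
    # neighbourhood of z agents survives with probability (1/2)(1 - 1/z);
    # every other agent survives with probability 1 - 1/(2z).
    z = len(others) + 1
    if all(w < wi for wi in others):
        return 0.5 * (1.0 - 1.0 / z)
    return 1.0 - 1.0 / (2.0 * z)

def death_sum(wealths):
    # Sum of kill probabilities 1 - K over the neighbourhood; should be 1.
    return sum(1.0 - K_example(w, wealths[:j] + wealths[j + 1:])
               for j, w in enumerate(wealths))
```

For z = 4 the poorest is killed with probability 1/2 + 1/(2z) = 5/8 and each of the others with probability 1/(2z) = 1/8, summing to one death per competition.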
Let us consider some general properties of K(w, {w_{i}}).

(a)
In any neighbourhood, the agents’ probabilities of not surviving must sum to unity, because exactly one agent will be killed. That is,
$$\sum _{j}^{z}\,[1-K({w}_{j},\{{w}_{i}:i\ne j\})]=1.$$(2) 
(b)
In the special case of a neighbourhood in which all z of the agents have equal wealth a, all must have equal survival probability. Hence, from Eq. 2,
$$K(a,\{a,a,\ldots \})=1-1/z.$$(3) 
(c)
In the special case where all agents have equal wealth a except for one with wealth w, from Eq. 2, we have
$$1-K(w,\{a,a,a\ldots \})+(z-1)\,(1-K(a,\{w,a,a\ldots \}))=1,$$(4) because their probabilities of not surviving must sum to unity (one agent will be killed).

(d)
In any reasonable model, K(w, {w_{i}}) is a nondecreasing function of w, because being richer is never a disadvantage.

(e)
Some updating schemes have an in-built wealth scale defining the rate at which survival probability K rises with increasing wealth w. For instance, in the “smoothed imitation” scheme^{26}, the probability of an agent with wealth w_{a} replacing one with wealth w_{b} is proportional to 1/{1 + exp[(w_{b} − w_{a})/α]}, with α defining a wealth scale. Many other common updating schemes are scale-free, so that the relative importance of different wealths is determined only by those values present in the neighbourhood. Examples include the “imitate if better” rule^{26,27}, the “replicator rule”^{17,28,29} and ranked schemes where survival depends only on the order of wealth (wealthiest to poorest), as well as the linear scheme (\(K=\lambda w/(w+\sum \,{w}_{i})\)), and also the scheme in Box 1. For scale-free schemes, in the special case where all agents in a neighbourhood have zero wealth except for one with wealth w > 0, its survival probability is independent of the magnitude of w, that is
$$K(w,\{0,0,\ldots \})=\lambda \quad {\rm{for}}\,{\rm{all}}\,w > 0,$$(5)
where λ is a constant.
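The meaning of “scale-free” can be illustrated with the linear scheme quoted in property (e): multiplying every wealth in the neighbourhood by a common factor leaves K unchanged, and a lone nonzero wealth yields K = λ whatever its magnitude (a sketch, with λ as the constant of property (e)):

```python
def K_linear(w, others, lam=1.0):
    # Linear scale-free scheme: K = lam * w / (w + sum of competitors' wealth).
    # With a single nonzero wealth w in the neighbourhood, K = lam,
    # independent of the magnitude of w.
    total = w + sum(others)
    return lam * w / total if total > 0 else 0.0
```

Rescaling all wealths by 10 leaves K unchanged, and a lone nonzero wealth gives K = λ regardless of its size, as in the special case above.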
We shall consider the case where the above processes (i) and (ii) do not involve the same sets of agents; i.e. each agent does not compete for survival with the same agents with which it has played the game. This is usually the case since, in a well-mixed population, the same pair of agents is unlikely to meet twice, while, on a square lattice, games are played by nearest neighbours and competition is between next-nearest neighbours (the four neighbours of a focal agent that will replace one of them). In this case, the wealths w and {w_{i}} of competing agents are not the result of the same instance of game play, and hence are correlated with each other only weakly, via the correlations between strategies in the neighbourhood.
Consider a large, structured population of agents, with a variety of strategies and wealths, subject to the processes (i) and (ii) and the conditions specified above. Within that population, let us consider only that subset of agents that have a particular strategy s. Those agents have various values of wealth w (as illustrated by a scatter of points at fixed p in Fig. 6), with some emergent distribution f_{s}(w). (A meanfield analysis would use only the strategydependent mean wealth 〈w〉_{s}, instead of the full distribution).
Next, let us consider all those agents that can compete against any agent with strategy s, because they belong to the same neighbourhood. Those competitors against strategy s have various values of wealth w, with some emergent distribution g_{s}(w).
Given that survival probability is K(w, {w_{i}}) and that the wealths of an agent with strategy s and its competitors are drawn independently (by the assumption below Eq. 5) from the distributions f_{s}(w) and g_{s}(w) respectively, the survival rate of strategy s is
$$P(s)=\sum _{w}\,{f}_{s}(w)\,\sum _{{w}_{1}}\cdots \sum _{{w}_{z-1}}\,[\prod _{i=1}^{z-1}\,{g}_{s}({w}_{i})]\,K(w,\{{w}_{i}\}).$$(6)
Some agents have exactly zero wealth, because they have either never played the game or played it unsuccessfully. Let us define the total probabilities of any nonzero wealth for an agent with strategy s and its competitor as ε_{s} and ε′_{s}, respectively, and \({\hat{f}}_{s}\) and \({\hat{g}}_{s}\) as normalized distributions of nonzero wealth, such that \({\hat{f}}_{s}\)(0) = \({\hat{g}}_{s}\)(0) ≡ 0. Then
$${f}_{s}(w)=(1-{\varepsilon }_{s})\,\delta (w)+{\varepsilon }_{s}\,{\hat{f}}_{s}(w),$$(7)
$${g}_{s}(w)=(1-{\varepsilon^{\prime} }_{s})\,\delta (w)+{\varepsilon^{\prime} }_{s}\,{\hat{g}}_{s}(w).$$(8)
If w takes discrete values, δ(w) is the Kronecker delta δ_{w,0}. For continuous w, δ(w) is the Dirac delta and all summations are read as integrations.
If nonzero payoffs are rare then ε_{s} and ε′_{s} are small so, substituting Eqs 7 and 8 into Eq. 6 and expanding to first order gives
$$P(s)\approx [1-{\varepsilon }_{s}-(z-1)\,{\varepsilon^{\prime} }_{s}]\,K(0,\{0,0,\ldots \})+{\varepsilon }_{s}\,\sum _{w}\,{\hat{f}}_{s}(w)\,K(w,\{0,0,\ldots \})+(z-1)\,{\varepsilon^{\prime} }_{s}\,\sum _{w}\,{\hat{g}}_{s}(w)\,K(0,\{w,0,\ldots \}).$$
Now, substituting from Eqs 3 and 4 with a = 0 gives
$$P(s)\approx 1-\frac{1}{z}+{\varepsilon }_{s}\,\sum _{w}\,{\hat{f}}_{s}(w)\,[K(w,\{0,0,\ldots \})-1+\frac{1}{z}]-{\varepsilon^{\prime} }_{s}\,\sum _{w}\,{\hat{g}}_{s}(w)\,[K(w,\{0,0,\ldots \})-1+\frac{1}{z}].$$
Irrespective of the selection rule (characterized by the function K), nonzero wealth (w > 0) is favourable, so 1 − 1/z ≤ K(w, {0}) ≤ 1. Hence, for any model, P(s) lies between P_{1} = 1 − 1/z and P_{2} = 1 − 1/z + (ε_{s} − ε′_{s})/z to first order in ε_{s} and ε′_{s}. Finally, from Eq. 5, for scale-free updating schemes we have
$$P(s)=1-\frac{1}{z}+(\lambda -1+\frac{1}{z})\,({\varepsilon }_{s}-{\varepsilon^{\prime} }_{s}),$$
which depends on ε_{s} and ε′_{s} but not on \({\hat{f}}_{s}\) or \({\hat{g}}_{s}\).
So the leading-order dependence of survival probability on strategy is independent of any features of the strategy-dependent wealth distribution f_{s}(w) (e.g. its mean) except the total probability ε_{s} of any nonzero wealth. Thus, strategies that yield a higher average payoff 〈w〉_{s} carry no benefit (to dominant order) and will actually be suppressed if they enhance, even by a little, the risk (1 − ε_{s}) of zero payoff.
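The theorem can be checked numerically for a concrete ranked scheme in which the poorest agent is killed (ties broken at random), for which the constant λ of property (e) equals 1: survival probability then depends on ε_{s} and ε′_{s} but not on the shape of the nonzero-wealth distribution, approaching 1 − 1/z + (ε_{s} − ε′_{s})/z. This sketch assumes independent wealth draws, as in the derivation:

```python
import random

def survival_probability(eps_s, eps_c, draw_w, z=5, trials=50000, seed=0):
    # Monte Carlo estimate of a focal agent's survival probability in a
    # neighbourhood of z agents where the poorest (ties broken at random)
    # is killed. eps_s / eps_c: probability of nonzero wealth for the focal
    # agent / each competitor; draw_w: sampler for nonzero wealth values.
    rng = random.Random(seed)
    survived = 0
    for _ in range(trials):
        w = draw_w(rng) if rng.random() < eps_s else 0.0
        others = [draw_w(rng) if rng.random() < eps_c else 0.0
                  for _ in range(z - 1)]
        pool = [(w, 0)] + [(wi, k) for k, wi in enumerate(others, 1)]
        low = min(wi for wi, _ in pool)
        victims = [k for wi, k in pool if wi == low]
        if rng.choice(victims) != 0:     # index 0 is the focal agent
            survived += 1
    return survived / trials

# Two nonzero-wealth distributions with very different means but identical eps:
rich = lambda rng: rng.uniform(10.0, 20.0)
poor = lambda rng: rng.uniform(0.1, 0.2)
```

With z = 5, ε_{s} = 0.05 and ε′_{s} = 0.02, both distributions give a survival probability close to the prediction 1 − 1/5 + 0.03/5 ≈ 0.806, illustrating that only the risk of zero payoff matters.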
Discussion and Conclusions
In summary, two different but related results have been demonstrated. The first is very general — that, when payoffs are rare, strategies evolve to minimize the risk of receiving no payoff, instead of maximizing revenue. Any strategy that enhances the risk is suppressed, irrespective of its average payoff.
The second result follows from applying that general principle to the UG. At high death rate (compared to gaming rate), a strategy of greed cannot prevail in the UG, even though it enhances the mean payoff (because of the possibility of a big win), because it also increases the risk of receiving nothing.
The reason is that, when payoffs are rare, most agents have never played the game in their lives, so have zero wealth (the amount they were born with). Hence any agent that has received a nonzero payoff is wealthier than its neighbours. Increasing that payoff would have no effect: it would still be the wealthiest.
In the UG, this riskminimization creates a flat fitness landscape (since success rate in Fig. 7 is independent of p within the dominant triangle), favouring altruism in stochastic steady states. Thus most agents carry a predisposition for generosity, without ever encountering an opportunity to exercise it (gaming being rare).
Realworld applications are ubiquitous if the game represents rare lifechanging events (disasters or gluts) requiring decisive action to avoid losing out. Altruistic traits thus engendered could meanwhile manifest in small acts of generosity with lesser consequences for reproductive success.
These results might be tested experimentally if very many generations of a microbial population are cultured whilst, at very low rate density, pairs of individuals are given the opportunity to share a highly beneficial resource. Such a scenario presents a significant experimental challenge.
Methods
All simulations were performed on a two-dimensional square lattice of L^{2} sites with periodic boundary conditions. For a representative sample of (R, μ) parameter values, calibration was performed by varying the system size L, the duration t of the simulations, and the initial conditions, and observing the effects on a set of statistical properties of the system: the mean and standard deviation of the age distribution, of the wealth distribution and of the acceptance thresholds, and the first four cumulants of the offer distribution.
To avoid finitesize effects, the system size L was increased until all results became independent of L. Some of the calibration data are shown in Fig. 8. Results presented above are for sizes ranging from L^{2} = 128^{2} to 512^{2} = 262144 sites.
To establish that the long-time limit (the steady state) had been reached for all cases reported in Fig. 4, convergence of all statistics was checked in every case by analysing the time-dependence of the full set of calibration statistics. By using a logarithmic time axis (as, e.g., in Figs 1 and 3), relaxation processes were observed, some of them on very long timescales (as discussed above). All simulations were run for at least one order of magnitude beyond the timescales of any systematic change in the measurements. It is perhaps of interest to note that, for the randomized initializations, the time taken to asymptote was always less than (but of the order of) τ_{eq}, defined in Eq. 1.
Furthermore, the steady-state results were established to be independent of the initial conditions for a representative subset of the simulated parameter sets (including the full range of parameters used in Fig. 4b) by comparing the late-time results following two very different initial conditions. For a large system with a continuous set of strategies, it is impossible to establish unequivocally the absolute stability of the late-time state, since that would require an infinite set of initial conditions to be tested, each simulated until t → ∞. It is therefore necessary to be selective in the initializations employed, but important (as noted in ref.^{30}) to use more than one.
One of the initializations, already discussed, was fully randomized, with the values of p and q at each site drawn independently from a uniform distribution in the interval [0, 1]. This initialization was chosen as it contains no preconceptions about the possible final states.
In the other initialization, the archetypal selfish and generous states each filled half of the lattice, meeting at a straight interface down the middle of the lattice and another at the x = 0 periodic boundary. In the selfish half, (p, q) = (0, 0) for every agent. In the generous half, p = 1 while q is drawn randomly from a uniform distribution in the interval [0, 1] for each agent. Hence all agents, on both sides of the interface, have strategies within the dominant triangle. This half-and-half initialization is designed both to be very different from the randomized initialization and to overcome metastability near any phase transitions between selfish and generous states. Figure 9 shows late-time results following a half-and-half initialization, yielding results consistent with Fig. 4b, where the randomized initialization was used.
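The half-and-half initialization is straightforward to express in code (a sketch; the lattice representation is illustrative):

```python
import random

def half_and_half(L, seed=0):
    # Left half of the lattice: archetypal selfish agents, (p, q) = (0, 0).
    # Right half: generous agents with p = 1 and q uniform on [0, 1].
    # Every strategy on both sides lies in the dominant triangle 0 <= q <= p.
    rng = random.Random(seed)
    return {(i, j): (0.0, 0.0) if i < L // 2 else (1.0, rng.random())
            for i in range(L) for j in range(L)}
```

The periodic boundary ensures two straight interfaces between the phases, one mid-lattice and one at x = 0.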
Snapshots of a system following a half-and-half initialization are shown in Fig. 10, for comparison with Fig. 5, which followed a randomized initialization at the same parameter values (and a larger system).
These initialization protocols are similar to the “stability of subsystem solutions” procedure introduced by Perc^{30} for models with discrete sets of strategies and no mutation. However, in the present study, the selfish and generous phases are not individually time-stepped prior to being brought into contact because (a) the timescale allowed for such a procedure must anyway remain arbitrary, (b) if the system size is sufficient, each phase will equilibrate locally before the other phase can significantly invade, and (c) all simulations here are run until fully equilibrated (asymptoted), instead of observing only the initial direction of interfacial movement (which could be non-monotonic).
Uncertainties σ_{M} quoted in the Results are one standard deviation of the mean, rounded to one significant figure. That is, \({\sigma }_{M}=\sigma /\sqrt{N-1}\), where σ is the standard deviation of a sample of N independent values, sampled for different random-number seeds and/or different times separated by a duration of at least the diffusive relaxation time τ_{eq}.
Code Availability
The simulation code used to generate data for this study is available at https://github.com/RMLEvans/UltimatumGame.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
 1.
Dugatkin, L. A. Cooperation among animals: an evolutionary perspective. (Oxford University Press, Oxford, 1997).
 2.
Maynard Smith, J. & Price, G. R. The logic of animal conflict. Nature 246, 15 (1973).
 3.
Tarnita, C. E. The ecology and evolution of social behavior in microbes. J. Exp. Biol. 220, 18 (2017).
 4.
Perc, M. et al. Statistical physics of human cooperation. Phys. Rep. 687, 1 (2017).
 5.
Harrison, F. & Buckling, A. Hypermutability impedes cooperation in pathogenic bacteria. Current Biol. 15, 1968 (2005).
 6.
Xavier, J. B. Social interaction in synthetic and natural microbial communities. Molecular Systems Biol. 7, 1 (2011).
 7.
Cremer, J., Melbinger, A. & Frey, E. Growth dynamics and the evolution of cooperation in microbial populations. Sci. Rep. 2, 281 (2012).
 8.
Drossel, B. Biological evolution and statistical physics. Adv. Phys. 50, 209 (2001).
 9.
Nowak, M. A., Sasaki, A., Taylor, C. & Fudenberg, D. Emergence of cooperation and evolutionary stability in finite populations. Nature 428, 646 (2004).
 10.
Lehmann, L. & Keller, L. The evolution of cooperation and altruism – a general framework and a classification of models. J. Evolutionary Biol. 19, 1365 (2006).
 11.
Nowak, M. A. & Sigmund, K. Evolution of indirect reciprocity. Nature 437, 1291 (2005).
 12.
Nowak, M. A. & Sigmund, K. Evolution of indirect reciprocity by image scoring. Nature 393, 573 (1998).
 13.
Nowak, M. A., Page, M. & Sigmund, K. Fairness versus reason in the ultimatum game. Science 289, 1773 (2000).
 14.
Hamilton, W. D. The genetical evolution of social behaviour. J. Theor. Biol. 7, 1 (1964).
 15.
Ohtsuki, H., Hauert, C., Lieberman, E. & Nowak, M. A. A simple rule for the evolution of cooperation on graphs and social networks. Nature 441, 502 (2006).
 16.
Du, W.B., Cao, X.B., Hu, M.B., Yang, H.X. & Zhou, H. Effects of expectation and noise on evolutionary games. Physica A 388, 2215 (2009).
 17.
Iranzo, J., Román, J. & Sánchez, A. The spatial Ultimatum game revisited. J. Theor. Biol. 278, 1 (2011).
 18.
Debove, S., Baumard, N. & André, J.B. Models of the evolution of fairness in the ultimatum game: a review and classification. Evol. Human Behav. 37, 245 (2016).
 19.
Güth, W., Schmittberger, R. & Schwarze, B. An experimental analysis of ultimatum bargaining. J. Econ. Behav. Organization 3, 367 (1982).
 20.
Page, K. M., Nowak, M. A. & Sigmund, K. The spatial ultimatum game. Proc. R. Soc. Lond. B 267, 2177 (2000).
 21.
Szolnoki, A., Perc, M. & Szabó, G. Accuracy in strategy imitations promotes the evolution of fairness in the spatial ultimatum game. EPL 100, 28005 (2012).
 22.
Sánchez, A. & Cuesta, J. A. Altruism may arise from individual selection. J. Theor. Biol. 235, 233 (2005).
 23.
Forber, P. & Smead, R. The evolution of fairness through spite. Proc. Roy. Soc. B 281, 20132439 (2014).
 24.
Szolnoki, A., Perc, M. & Szabó, G. Defense mechanisms of empathetic players in the spatial ultimatum game. Phys. Rev. Lett. 109, 078701 (2012).
 25.
Foster, D. & Young, P. Stochastic evolutionary game dynamics. Theor. Population Biol. 38, 219 (1990).
 26.
Szabó, G. & Fáth, G. Evolutionary games on graphs. Phys. Rep. 446, 97 (2007).
 27.
Nowak, M. A. & May, R. M. Evolutionary games and spatial chaos. Nature 359, 826 (1992).
 28.
Helbing, D. Interrelations between stochastic equations for systems with pair interactions. Physica A 181, 29 (1992).
 29.
Schlag, K. H. Why imitate, and if so, how? A boundedly rational approach to multiarmed bandits. J. Econ. Theory 78, 130 (1998).
 30.
Perc, M. Stability of subsystem solutions in agentbased models. Eur. J. Phys. 39, 014001 (2018).
Acknowledgements
Thanks to Katherine Evans, Bhavin Khatri, Leonardo Miele, Mike Ries, Manlio Tassieri, Nigel Wilding and John Williamson for helpful discussions.
Author information
Affiliations
School of Mathematics, University of Leeds, Leeds, LS2 9JT, UK
 R. M. L. Evans
Competing Interests
The authors declare no competing interests.
Corresponding author
Correspondence to R. M. L. Evans.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.