Evolution of cooperation and consistent personalities in public goods games

The evolution of cooperation has remained an important problem in evolutionary theory and social sciences. In this regard, a curious question is why consistent cooperative and defective personalities exist and if they serve a role in the evolution of cooperation? To shed light on these questions, here, I consider a population of individuals who possibly play two consecutive rounds of public goods game, with different enhancement factors. Importantly, individuals have independent strategies in the two rounds. However, their strategy in the first round affects the game they play in the second round. I consider two different scenarios where either only first-round cooperators play a second public goods game, or both first-round cooperators and first-round defectors play a second public goods game, but in different groups. The first scenario can be considered a reward dilemma, and the second can be considered an assortative public goods game but with independent strategies of the individuals in the two rounds. Both models show cooperators can survive either in a fixed point or through different periodic orbits. Interestingly, due to the emergence of a correlation between the strategies of the individuals in the two rounds, individuals develop consistent personalities during the evolution. This, in turn, helps cooperation to flourish. These findings shed new light on the evolution of cooperation and show how consistent cooperative and defective personalities can evolve and play a positive role in solving social dilemmas.


The model
In our model, in a well-mixed population of N individuals, groups of size g are randomly formed to play a PGG with enhancement factor r 1 , possibly followed by a second PGG with enhancement factor r 2 . That is, at each time step, the whole population is divided into N/g randomly formed groups. Strategies of the individuals in the second PGG are independent of the first PGG. Thus, there are four possible strategies: cooperation in both games ( C 1 C 2 ), cooperation in the first game and defection in the second one ( C 1 D 2 ), defection in the first and cooperation in the second game ( D 1 C 2 ), and finally, defection in both games ( D 1 D 2 ). I consider two different scenarios. In the first scenario, called a (prosocial) reward dilemma, while defection entails no more round of PGG, cooperation in the first game leads to the entrance to a second PGG. This scenario is consistent with a situation where cooperation serves as a signal of merit 9,37,38,47 or offers a high social status [48][49][50] which can increase others' willingness to interact with a cooperator, and thus, the individual is permitted to enter an elite PGG. In the second scenario, called the assortative public goods game, all the individuals in the group proceed to play a second PGG. However, motivated by many pieces of evidence of assortative behavior 51,52 , such as breaking or forming ties 53,54 , I assume individuals are sorted based on their strategies in the first round, such that all the individuals who cooperate in the first round form a subgroup to play a PGG (which is called the cooperative or prosocial PGG), and all those who defect in the first round form a different subgroup to play their second PGG (which is called the defective or anti-social PGG). In this way, the corresponding PGG can be considered an assortative PGG, in which individuals are sorted based on their strategy in the first round.
Individuals gather payoff according to the outcome of the games. Besides, I assume individuals receive a base payoff π 0 from other activities not related to the PGG. After playing the games, individuals reproduce with a probability proportional to their payoff, such that the population size remains constant. That is, the whole population is updated synchronously such that each individual in the next generation is offspring to an individual i in the past generation with a probability proportional to its payoff π i . Offspring inherit the strategy of their parent. However, mutations can occur. Mutations in the strategy of the individuals in each round ( s 1 and s 2 , where s i can be C or D) occur independently, and each with probability ν . When a mutation occurs, the corresponding variable's value switches to its opposite value (C to D and vice versa).

Results
The first scenario: a reward dilemma solves the social dilemma. As shown in the "Methods" section, both models can be described in terms of the replicator dynamics, which gives an exact description of the model in infinite population limit. Beginning with the reward dilemma model, I note that when r 2 > g , the second PGG is no longer a dilemma. Cooperation becomes the most rational strategy, which guarantees for the second PGG to yield a positive reward to those who enter it by cooperating in the first PGG. As it is known [20][21][22] , and as our model confirms, such a certain reward promotes cooperation. The situation becomes more interesting when r 2 < g . In this case, the second PGG can even yield a negative outcome. For a promise of having future interaction to promote cooperation, the community needs to solve a second dilemma. As I show below, such coupled dilemmas can be solved through evolution.
I begin by plotting the phase diagram of the model for two different mutation rates, ν = 10 −3 (top), and ν = 10 −5 (bottom), in Fig. 1a. Here and in the following, g = 10 , c = 1 , π 0 = 2 . For too large values of r 1 , the dynamics settle in a fixed point (denoted by FP in the figures). On the other hand, for smaller values of r 1 , cyclic behavior occurs for intermediate values of r 2 (indicated by PO in the figures). For both too large and too small r 2 , a transition to a phase where the dynamics settle in a fixed point is observed. Comparison of the phase diagram for two different mutation rates shows that lower mutation rates increase the size of the region where the dynamics settle in a periodic orbit.
To see how the cooperation changes with r 1 and r 2 , in Fig. 1b-d, I plot, respectively, the time average of ρ C 1 C 2 , ρ C 1 D 2 and ρ D 1 = ρ D 1 C 2 + ρ D 1 D 2 . For small r 2 , such that the second PGG is not profitable enough to motivate cooperation in the first PGG, the dynamics settle into a defective fixed point, where the majority of the individuals defect in the first game and do not play the second game. Consequently ρ D 1 takes a large value approximately equal to 1 − ν , and both ρ C 1 C 2 and ρ C 1 D 2 take small values maintained by mutations. As r 2 increases, cooperation in both rounds evolve. To see how this happens, I note that the cost of cooperation in the first round is equal to c(r 1 /g − 1) . As a single mutant C 1 C 2 receives a payoff of c(r 2 − 1) from the second round, cooperation evolves when the payoff of a mutant C 1 C 2 from the second round becomes larger than the cost of cooperation in the first round. That is when r 2 > 2 − r 1 /g . By increasing r 2 beyond this point, ρ C 1 C 2 rapidly increases. This increases the effective group size in the second round PGG, and thus the expected payoff of second-round defectors, which is an increasing function in ρ C 1 C 2 , increases. To see why this is the case, I note that the payoff of a C 1 D 2 individual from the second round in a group composed of n C 1 C 2 C 1 C 2 group-mates and n C 1 D 2 C 1 D 2 group-mates is equal to cr 2 ρ C 1 C 2 /(1 + n C 1 C 2 + n C 1 D 2 ) , which is larger in groups with a higher number of C 1 C 2 individuals. The probability of formation of groups with a higher number of C 1 C 2 individuals, in turn, increases with increasing ρ C 1 C 2 (see "Methods"), and thus, the expected payoff of a second-round defector increases by increasing ρ C 1 C 2 . Consequently ρ C 1 D 2 increases by further increasing r 2 beyond 2 − r 1 /g . While this leads to enhanced cooperation in the first round, it also reduces the frequency of second-round cooperators due to the exploitation by second-round defectors. Consequently, the profitability of the second-round PGG decreases and fewer individuals cooperate in Top panels result from solutions of the replicator dynamics, and bottom panels result from simulations. Cooperation evolves in the second game (c) and is maximized for moderate values of r 2 . This renders entering the second game an incentive to cooperate in the first game, which promotes cooperation in the first game. Here, g = 10 , c = 1 , and π 0 = 2 . In (b)-(d), ν = 10 −3 . Simulations are performed in a population of size N = 5000 . The simulation is performed for T = 3000 time steps starting from an initial condition with random assignment of strategies, and the time averages are taken over the last 2000 time steps. The replicator dynamics are solved for T = 5000 time steps starting from homogeneous initial conditions. Time averages are taken over the last 2000 time steps. As the model is mono-stable in the entire phase diagrams, the results are independent of the initial conditions. Scientific Reports | (2021) 11:23708 | https://doi.org/10.1038/s41598-021-03045-w www.nature.com/scientificreports/ the first round to enter the second round PGG. Thus the density of first-round defectors shows a local maximum at a moderate value of r 2 at the transition between the cyclic orbit and partially cooperative fixed point at large r 2 .
The time evolution of the system for different strength of reward dilemma r 2 is presented in Fig. 2. Here, the result of the numerical solution of the replicator dynamics (top) and a simulation in a population of size N = 5000 (bottom) are presented. Here, g = 10 , ν = 10 −3 , c = 1 , π 0 = 2 , and r 1 = 1.8 . In Fig. 2a, r 2 = 2 . This is slightly larger than 2 − r 1 /g = 1.82 , and thus C 1 C 2 strategy survive in the population. For larger values of r 2 , the fixed point becomes unstable, and the dynamics settle in the cyclic orbit. An example of the cyclic orbit for r 2 = 3.8 is presented in Fig. 2b. In the cyclic phase, when the density of individuals who cooperate in the first round and thus enter the second PGG is small, individuals can reach a high payoff by entering and cooperating in the second PGG. Thus ρ C 1 C 2 increases. When ρ C 1 C 2 increases enough, individuals can reach a higher payoff by defecting in the second PGG. At this point, ρ C 1 D 2 begins to increase, while ρ C 1 C 2 decreases. As the density of defectors in the second PGG increases, its profitability decreases, and thus individuals have no incentive to cooperate in the first round. Consequently, both ρ D 1 D 2 and ρ D 1 C 2 , as well as ρ D = ρ D 1 D 2 + ρ D 1 C 2 increase, while other strategies decrease (I note that, since those who defect in the first round do not enter the second game, the two strategies D 1 D 2 and D 1 C 2 are degenerate as they lead to the same payoff and are found in the same densities).
The time evolution of the system in the partially cooperative fixed point in large r 2 is presented in Fig. 2c,d. In Fig. 2c, r 2 = 8.8 . This corresponds to just above the transition line from the periodic solution to the fixed point. Consequently, the replicator dynamics settle in the fixed point after showing transient damped osculations around the fixed point. The simulation results show small fluctuations around the stationary state due to finitesize effects. Comparison of the case of r 2 = 3.8 in Fig. 2b and r 2 = 8.8 in Fig. 2c shows that ρ C 1 D 2 increases for larger r 2 due to the higher profitability of the second-round PGG, which motivates more individuals to cooperate in the first round to enter this PGG. This, in turn, can have an adverse effect on ρ C 1 C 2 due to the larger effective group size of the second round PGG. For larger r 2 , as in Fig. 2d where r 2 = 9.8 , the dynamics settles in the fixed point without showing damped osculations. Furthermore, ρ C 1 C 2 increases and ρ C 1 D 2 decrease as r 2 approaches g.
The second scenario: the evolution of cooperation and consistent personalities in assortative public goods. As we have seen so far, a reward dilemma solves the social dilemma. An interesting question is whether such a mechanism can be competitive if defectors have the chance of forming a PGG of their own? This brings us to an assortative public goods game where both first-round cooperators and first-round defectors are rewarded by a second round of interaction, but in separate groups.
The phase diagram of the assortative public goods game is presented in Fig. 3a, top panel. For both too small and too large r 2 , the system settles into a fixed point, and cyclic behavior emerges in between. However, there are two qualitatively different periodic orbits, each stable in some region of the parameter space. For smaller r 2 , the dynamics settle into a defective periodic orbit (DPO). In this orbit, while cooperation in the cooperative PGG evolves, cooperation does not evolve in the defective PGG. I note that this is the same periodic orbit observed in the reward dilemma model. That such a periodic orbit endures in the second scenario shows its competitive stability. In other words, just as prosocial reward is stable in the presence of an antisocial reward 21 , a prosocial reward dilemma is stable in the presence of an anti-social reward dilemma. On the other hand, for large r 2 , the dynamics settle into the cooperative periodic orbit (CPO), where cooperation in both cooperative and defective PGGs evolves.
Interestingly, for small fixed r 1 , by increasing r 2 the system shows a cross-over from the DPO to CPO without passing any singularity. However, for large r 1 , starting from a uniform initial condition, in which the initial In all the cases, r 1 = 1.8 . In (a), r 2 = 2 , which is slightly larger than the threshold necessary for the evolution of cooperation, 2 − r 1 /g = 1.82 , and thus a small fraction of C 1 C 2 strategies evolve. In (b) r 2 = 3.8 , corresponding to the cyclic phase where different strategies cyclically dominate the population. In (c) and (d), respectively, r 2 = 8.8 and r 2 = 9.8 , both corresponding to the cooperative fixed point. Parameter values: ν = 10 −3 , g = 10 , c = 1 , and π 0 = 2 . The replicator dynamics is solved starting from homogeneous initial conditions, and simulations are performed starting from random assignment of strategies. www.nature.com/scientificreports/ densities of all the strategies are equal, as r 2 increases, the equilibrium state of the system changes singularly in a certain value of r 2 . This indicates the transition between DPO and CPO resembles a discontinuous transition for larger r 1 . As a discontinuous transition is usually accompanied by bistability 60 , this raises the question of whether the system possesses a bistable region as well? To address this question, I present the boundaries of bistability in Fig. 3b (black lines). The phase boundaries, which result from a homogeneous initial condition, are superimposed in this figure as well. The boundaries of bistability are derived by solving the replicator dynamics starting from different initial conditions and checking for hysteresis (see "Methods"). The system is monostable outside of the bistable region, indicated by black lines. In the monostable regions, the dynamics settle in the same stationary state, starting from all the initial conditions. In the bistable region (inside the black line), two different stable states, indicated in the figure, are possible depending on the initial conditions. I note that, while the cooperative periodic orbit is stable in the bistable region, it has a very small basin of attraction, such that the replicator dynamics does not settle into this orbit starting from most randomly generated initial conditions (see the Supplementary Information, figures SI.4 and SI.5). For this reason, to derive the boundaries of bistability, a hysteresis analysis is used (see "Methods").
As can be seen in the figure, the transition from a fixed point to the DPO in small r 2 does not show any bistability and occurs at the same value for all the initial conditions. Similarly, the transition from CPO to a fixed point in large r 2 does not show any bistability. In contrast, the transition to CPO by increasing r 2 shows bistability: For medium r 2 , there is a region of the phase diagram where both CPO and DPO are stable. Similarly, the model shows a bistability region for large r 1 where both the fixed point and the cooperative periodic orbit are stable.
In Fig. 4, I present the time evolution of different strategies. An example of different periodic orbits is presented in Fig. 4a (CPO) and Fig. 4b (DPO). Top panels present the result of the replicator dynamics, and the bottom panels present the result of a simulation in a population of size N = 5000 . In Fig. 4a, r 1 = 2.8 and r 2 = 4.8 , and in Fig. 4b, r 1 = 2.8 and r 2 = 2.4 . For larger r 2 , as in Fig. 4a, defective and cooperative PGGs perform competitively, and competition between these two maintains cooperation in the system. When ρ C 1 D 1 ( ρ D 1 D 2 ) is small, the cooperative (defective) PGG is profitable, and thus, it motivates individuals to cooperate (defect) in the first game in order to enter to this PGG. Consequently, ρ C 1 C 2 ( ρ D 1 C 2 ) increases. As ρ C 1 C 2 ( ρ D 1 C 2 ) increases enough, the cooperative (defective) PGG becomes vulnerable to defection, due to high frequency of cooperators in this PGG which increase the expected payoff of a second round defector, ρ C 1 D 2 ( ρ D 1 D 2 ). At this point, ρ C 1 D 1 ( ρ D 1 D 2 ) starts to increase. This in turn decreases the profitability of the cooperative (defective) PGG, and individuals are better off by switching to defection (cooperation) in the first round to enter the defective (cooperative) PGG. Consequently, both ρ C 1 C 2 and ρ C 1 D 2 ( ρ D 1 C 2 and ρ D 1 D 2 ) decrease, while ρ D 1 C 2 and ρ D 1 D 2 ( ρ C 1 C 2 and ρ C 1 D 2 ) increase. In this way, competition between cooperative and defective PGGs maintain cooperation in the population. Interestingly, individuals tend to have compatible strategies in the two rounds. That is, those who cooperate in the first round are more likely to cooperate in the second round. This can be seen by noting that on average ρ C 1 D 2 is much smaller than ρ D 1 D 2 , even though defection in cooperative and defective PGGs leads to, on average, similar payoffs as the density of cooperators in these two games ( ρ C 1 C 2 and ρ D 1 C 2 ) are similar.
As mentioned before, the situation is different for smaller r 2 . For smaller r 2 , as can be seen in Fig. 4b, while cooperation in the cooperative PGG evolves, cooperation in the defective PGG does not evolve. This again hints at the evolution of compatible strategies in the two rounds. Consequently, for smaller r 2 , while the cooperative PGG can become profitable due to the evolution of fully cooperative C 1 C 2 strategies and motivates individuals to cooperate in the first game, the defective PGG does not perform competitively and can not attract individuals. As mentioned before, this periodic orbit is the same periodic orbit observed in the reward dilemma model.  Fig. 4c, for small r 2 , r 2 = 2 and Fig. 4d, for large r 2 , r 2 = 5.6 . As for small r 2 , cooperation evolves only in the cooperative PGG, the situation is similar to the reward dilemma model: C 1 C 2 individuals are found in the population (beyond that maintained by mutations) if r 2 > 2 − r 1 /g . This is what we observe in Fig. 4c. This again hints at the evolution of consistent personalities, as only fist round cooperators cooperate in the second round. The fixed point for large r 2 is presented in Fig. 4d. As r 2 is chosen just slightly beyond the transition line between cooperative periodic orbit and the fixed point, the dynamics shows damped oscillations around the stationary state before settling in the fixed point. Simulations in a finite population, on the other hand, show that small fluctuations around the stationary state occur in this region. I note that the evolution of consistent personalities can be observed in this regime as well, as the frequency of defectors in the defective PGG is larger than that in the cooperative PGG. That is, first-round defectors are more likely to defect in the second round compared to first-round cooperators.
The fact that individuals' strategies in the two games tend to be compatible can be studied in more depth. To do this, I consider two measures of consistency of the strategies in the two rounds. As a first measure of compatibility of the strategies of the individuals in the two rounds, I define γ = [P(C 2 |C 1 ) + P(D 2 |D 1 ) − P(D 2 |C 1 ) − P(C 2 |D 1 )]/2 . Here, P(s 2 |s 1 ) is the conditional probability that an individual has strategy s 2 in the second game, given its strategy in the first game s 1 . γ takes a value between one and minus one, and the more positive γ , the more individuals' strategies in the two rounds are consistent. As a second measure of personality consistency, I consider the connected correlation function of the strategies of the individuals in the two rounds �s 1 s 2 � c = �s 1 s 2 � − �s 1 ��s 2 � . To calculate this, I assign −1 to the strategy D, and +1 to the strategy C. These are plotted in Fig. 5a (top). For comparison, the density of different strategies is plotted as well (bottom panel). Both personality measures show cyclic behavior and always remain non-negative. Importantly, the latter holds in all the phases. This can be seen in Fig. 5b, where the color plot of the time average of γ , for numerical solutions of the replicator dynamics (top) and a simulation in a population of size N = 10,000 (bottom) is presented. Interestingly, for fixed r 2 , personality consistency measures show a maximum close to (but not exactly on) the transition from the fixed point to the defective periodic orbit. This corresponds to r 2 = 2 − r 1 /g , where the benefit of cooperation in the cooperative PGG starts to become large enough to compensate the cost of cooperation in the first round necessary to enter this PGG. Our results thus show that individuals develop consistent cooperative and defective personalities [see the Supplementary Information for the validity of this result for other parameter values]. This, in turn, plays a positive role in promoting cooperation, as individuals behaving consistently in the two rounds, together with the assortative nature of the public goods, allows first-round cooperators to be more likely to reap the benefit of cooperation in the second round, compared to first-round defectors.
I begin the study of the cooperation by plotting the average cooperation in the first game in Fig. 5c, and the average cooperation in the second game in Fig. 5d. Here, the phase transition lines are indicated in the figure as well. As can be seen, cooperation in the first game is maximized for a moderate value of r 2 , and it drops for both too small and too large values of r 2 . Increasing r 2 beyond this value has a detrimental effect on cooperation in the first round. This is due to the fact that for larger values of r 2 , cooperation in the second round increases in both the defective and the cooperative PGGs. This decreases individuals' incentive to cooperate in the first round to be sorted with fellow cooperators in the second round. On the other hand, for too small r 2 , any potential benefit from the second round can be too small to promote high cooperation in the first round. Interestingly, the maximum cooperation level is achieved exactly on the transition line between the cooperative and the defective In the fixed point of the dynamics for small r 2 but larger than 2 − r 1 /g , as in (c) where r 2 = 2 , only C 1 C 2 cooperators evolve, and for large r 2 , r 2 = 5.6 in (d), both C 1 C 2 and D 1 C 2 evolve. Here, g = 10 , ν = 10 −3 , c = 1 , π 0 = 2 , and r 1 = 2.8 . The replicator dynamics is solved starting from homogeneous initial conditions and simulations are performed starting from random assignment of strategies.

Scientific Reports
| (2021) 11:23708 | https://doi.org/10.1038/s41598-021-03045-w www.nature.com/scientificreports/ periodic orbits, which coincides with the edge of bistability. This aligns with some arguments that being on the edge of bistability can be beneficial for biological systems 61 . Finally, the level of cooperation in the second PGG is plotted in Fig. 5d. Here, it can be seen the cooperation increases by increasing r 2 . On the other hand, as we have seen in Fig. 1, in the reward dilemma model, cooperation in the second game is maximized in an intermediate value of r 2 . This contrast can be argued to result from competition between the two cooperative and defective PGGs, such that evolution favors the more cooperative one. Consequently, if individuals in one of the two cooperative or defective PGGs start to defect, the other more cooperative one is favored by evolution. This, in turn, shows that a potential reward to defection, by promoting competition, can have a surprisingly positive impact on the evolution of cooperation in an assortative context.

Discussion
The fact that individuals consistently show cooperative or defective strategies had been noticed in public goods experiments [41][42][43][44] , and in many animal populations 45 . It is argued this persistent personality differences can partly explain why cooperation is observed in laboratory experiments and many human and animal societies [41][42][43][44][45] . However, while some theoretical work had shed light on the evolution of some aspects of personality differences, such as the evolution of responsive and unresponsive personalities 57 , risk-averse and risk-taking personalities 55 , or personality differences in leadership 59 , the evolution of cooperative and defective personalities had alluded theoretical understanding. Our findings address this gap by showing how such consistent personality differences can evolve naturally in an evolutionary process when individuals need to work collectively to solve a social dilemma. Importantly, as the analysis of the model shows, the evolution of consistent personalities, in turn, helps solving social dilemmas by increasing the likelihood that the benefit of a cooperative act is reaped by those who behave cooperatively. In this regard, the evolution and maintenance of consistent cooperative and defective personalities can be regarded as an important mechanism at work in promoting cooperation in biological populations.
An interesting question is that why consistent personality differences evolve in an assortative context? The key to the question is that entering the cooperative PGG in the second round is costly, while entering the defective PGG is costless. As it has been shown recently, an entrance cost for a PGG can promote cooperation in a costly PGG, due to the smaller effective size of a costly PGG 26 . For this reason, cooperation is more frequent and defection less frequent in the cooperative, costly PGG, compared to the defective, cost-less PGG. This phenomenon naturally leads to the evolution of consistent cooperative and defective personalities in an assortative context.
The phase boundaries are plotted in the top panel as well. Interestingly, the cooperation level in the first round is maximized on the singular transition between the two periodic orbits. (e) The contour plot of the time average cooperation level in the second round �ρ www.nature.com/scientificreports/ Our analysis also reveals new roads to the evolution of cooperation. In this regard, as the analysis of the reward dilemma model shows, a population of self-interested agents can successfully solve a reward dilemma and this, in turn, helps to solve the social dilemma. This mechanism can be at work to promote cooperation in a context where cooperation increases an individual's chance of having more social interaction, even if actually benefiting from those interactions requires solving another social dilemma. The second scenario model, on the other hand, shows cooperation still evolves when defection is rewarded by a promise of future interaction as well, provided an assortative mechanism is at work. In other words, just as prosocial reward is stable against antisocial reward 21 , a prosocial reward dilemma is competitively stable in promoting cooperation in the presence of an anti-social reward dilemma. In the presence of both prosocial and antisocial reward dilemmas, competition between the prosocial and anti-social public goods maintains cooperation in the system, and moreover, surprisingly increase cooperation in the second round, compared to a case where such competition is lacking. This shows a potential reward to defection, by fostering competition, can have a surprisingly positive impact in promoting cooperation.

Methods
Replicator dynamics. The model can be described in terms of replicator-mutation equations 62 , which provide an exact description of the model in infinite population limit. These equations can be written as follows: Here, xy (as well as x ′ y ′ and x ′′ y ′′ ) refer to the strategies of the individuals, such that x is the strategy of an individual in the first round, and y is its strategy in the second round. x, x ′ etc. can be either cooperation C or defection D. ν x ′ y ′ xy is the mutation rate from the strategy x ′ y ′ to the strategy xy. These can be written in terms of mutation rate ν as follows: In Eq. (1), π x ′ y ′ is the expected payoff of an individuals with strategy x ′ y ′ . In the case of the first scenario, these are given by the following equations: Here, we have n C 1 = n C 1 C 2 + n C 1 D 2 . To write these equations, I used the fact that in a group with n C 1 C 2 individuals with strategy C 1 C 2 , and n C 1 D 2 individuals with strategy C 1 D 2 , r 1 1+n C 1 g − c and r 1 n C 1 g are, respectively, the expected payoff of an individual who cooperates, defects, in the first game. Those who defect in the first game do not gather payoff from the second game. On the other hand, those who cooperate in the first game, obtain a payoff from the second game as well (which can be negative or positive). This is r 2 − c for an individual with strategy C 1 C 2 , and r 2 , is the probability that a focal individual finds itself in a group with n C 1 C 2 , n C 1 D 2 , n D 1 C 2 , and n D 1 D 2 individuals with, respectively, strategies if (x � = x ′ and y = y ′ ) or (x = x ′ and y � = y ′ ), ν if (x � = x ′ and y � = y ′ ). (3) Scientific Reports | (2021) 11:23708 | https://doi.org/10.1038/s41598-021-03045-w www.nature.com/scientificreports/ C 1 C 2 , C 1 D 2 , D 1 C 2 , and D 1 D 2 . Here, g − 1 n C 1 C 2 , n C 1 D 2 , n D 1 D 2 , g − 1 − n C 1 C 2 − n C 1 D 2 − n D 1 C 2 is the multinational coefficients (that is the number of ways that among the g − 1 group mates of a focal individual, n C 1 C 2 , n C 1 D 2 , n D 1 C 2 , g − 1 − n C 1 C 2 − n C 1 D 2 − n D 1 C 2 individuals have strategies, respectively, C 1 C 2 , C 1 D 2 , D 1 C 2 , and D 1 D 2 ). Summation over all the possible configurations gives the expected payoff of the focal individual with the given strategy from the games. Finally, as all the individuals receive a base payoff π 0 , this is added to the total payoff. Using the expressions in Eq. (3) for the expected payoff of differentstrategies in Eq. (1), I have a set of four equations which gives an analytical description of the model, in the limit of infinite population size.
In the same way, it is possible to write down equations for the expected payoffs of individuals with different strategies in the second scenario. The difference with the preceding scenario is that, in the second scenario those who defect in the first round proceed to a second PGG as well. Thus, under the same notation and conventions as before, the individuals with strategies D 1 C 2 and D 1 D 2 , obtain a payoff of, respectively, r 2 1+n D 1 C 2 1+n D 1 − c and r 2 n D 1 C 2 1+n D 1 , from their second game. Here, n D 1 = n D 1 C 1 + n D 1 D 2 . Thus, we have for the expected payoffs of different strategies in the second scenario: Using these expressions for the expected payoffs of individuals with different strategies in Eq. (1), we have the analytical description of the second scenario model, in the limit of infinite population size.
The simulations and numerical solutions. Numerical solutions of the replicator dynamics result from numerically solving the replicator dynamics of the models derived in the "Methods" section. Simulations of the models are performed according to the model definition. Unless otherwise stated, both the simulations and numerical solutions of the replicator dynamics are performed with an initial condition in which all the strategies are found in similar frequencies in the population pool. For the solutions of the replicator dynamics, this is assured by setting the initial frequency of all the four strategies equal to 1/4. For simulations, this is assured by a random assignment of the strategies. The phase diagram presented in Fig. 5a is derived by locating the parameter values where a transition between different attractors occurs starting from a homogeneous initial condition. The boundary of bistability in Fig. 5a is derived by examining history dependence and checking for the existence of hysteresis in the evolution of the system. That is, the replicator dynamics are solved starting from parameter values belonging to different phases. Then, the parameter values are changed in small steps, using the stationary state of the preceding steps as the initial condition for the solution of the replicator dynamics in the next step. In this way, the boundary of bistability beyond which a solution becomes unstable is found. See the Supplementary Information for more details. (4) cr 1 n C 1 g + cr 2 n D 1 C 2 1 + n D 1 ρ C 1 C 2 n C 1 C 2 ρ C 1 D 2 n C 1 D 2 ρ D 1 C 2 n D 1 C 2 ρ D 1 D 2 g−1−n C 1 C 2 −n C 1 D 2 −n D 1 C 2 g − 1 n C 1 C 2 , n C 1 D 2 , n D 1 C 2 , g − 1 − n C 1 C 2 − n C 1 D 2 − n D 1 C 2 + π 0 .