Introduction

The evolution of cooperation is still a puzzle in evolutionary biology and social science. The prisoner's dilemma (PD) game can be considered the gold standard for modeling the evolution of cooperation, and the public goods game (PGG) is its multi-player version1,2,3,4,5,6. In a single-round PGG, each player in a group of size n is given a fixed endowment and contributes as much of it as he or she wants to a common pool. The total amount in the pool is multiplied by a factor r with 1 < r < n and then redistributed evenly among the players in the group. It is to the group's advantage if all players contribute their total endowment since r > 1, but each player, given the contributions of the others, does best by contributing nothing since r < n. That is, the only Nash equilibrium (NE) of the PGG based on monetary considerations is for all players to free ride. However, experiments on the PGG do not agree with this NE prediction. Typically, individuals contribute about half their endowment on average to the common pool, and this average tends to decrease as the individuals play the single-round PGG more often among randomly formed groups, starting as high as 70% and decreasing to no lower than 20%1,2,7,8,9. This outcome is also observed for the repeated PGG, where the same group plays the game over and over1,7,8,9.
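The tension between group and individual interest described above can be illustrated with a minimal sketch. The function name `pgg_payoffs` is hypothetical, and the default values (endowment of 20, r = 1.6, groups of four) are borrowed from the experimental design described in the Methods rather than being part of the general definition.

```python
def pgg_payoffs(contributions, endowment=20.0, r=1.6):
    """Single-round public goods game payoffs for one group (illustrative sketch)."""
    n = len(contributions)
    if not 1 < r < n:
        raise ValueError("the social dilemma requires 1 < r < n")
    # The pool is multiplied by r and split evenly among all n players.
    share = r * sum(contributions) / n
    return [endowment - c + share for c in contributions]


# Everyone contributes everything: each player ends up with r * endowment.
all_in = pgg_payoffs([20, 20, 20, 20])
# Everyone free-rides: each player simply keeps the endowment.
nobody = pgg_payoffs([0, 0, 0, 0])
# One player deviates from full cooperation and earns more than the cooperators.
one_defector = pgg_payoffs([0, 20, 20, 20])
```

Full contribution beats universal free-riding for every player, yet the lone defector in the last example earns more than the three cooperators, which is why free-riding is the individually optimal choice.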

Theoretical research10,11,12,13,14,15,16 has provided an explanation based on NE behavior (through such effects as reputation, reciprocity and revenge) for the evolution of cooperative behavior in the repeated PGG when individuals are assumed to act solely in their own monetary best interests. Experiments with peer-incentive1,2,3,4,5,6 and with institutional-incentive17,18,19,20,21 have also been conducted to test whether reward and/or punishment increases cooperative behavior. Other studies argue that observed behavior in repeated PGG can be described through a combination of these self interests and other intrinsic motivations such as inequity aversion22,23 or shame/honour24. However, for the repeated PGG experiments, it is still not clear whether individual behaviors only depend on their own best interests; how individuals avoid (or dilute) the risk under the reward/punishment incentives; how individuals' reactions differ when rewarded compared to punished; and whether individuals being more cooperative increases the cooperation level of other group members.

To reveal mechanisms behind individual behavior in PGG, we designed experiments with institutional incentives. Here, it is not individuals who reward or punish; rather each individual is rewarded (or punished) with known probabilities that increase (or decrease) as the amount he/she contributed goes up25. In our experiments that include an institutional reward (respectively, punishment), exactly one member of the group is selected in this way to be rewarded (respectively, punished). We base our experiments on the repeated multi-player PGG to reflect that an institution includes several members and uses incentives on an ongoing basis. In fact, the use of institutional incentives is a common feature in nature (e.g. the maintenance of cooperation among subordinates through punishment in hierarchical species26) and many parts of human society such as in businesses and in government institutions20,21,27,28,29.

In this study, we emphasize the prominent roles that imitative and reactive behaviors play in explaining observed outcomes in our repeated PGG experiments. In particular, we study the effects of conformists who base their behavior on contributions rather than income by changing their contribution in the next round in the direction of the average group contribution in the current round30. On the other hand, we also consider the difference between the reward and punishment incentives in promoting cooperation based on individual reactive behavior to these incentives. Finally, some individuals in our experiments can be regarded as the exemplars (leaders) in cooperation since they display more cooperation than other individuals and the role of these exemplars in promoting cooperation (i.e. the effect these exemplars have on the average group contribution level) is considered.

Methods

A total of 792 university students participated in our PGG experiments at the School of Mathematical Sciences Computer Lab, Beijing Normal University. Subjects interacted anonymously via computer screens for 50 rounds of the repeated game among the same four players (Supplementary Information (SI), Methods). The control experiment (Control) is a standard four-player repeated PGG1,6. In each round, every subject receives an endowment of 20 monetary units and decides how much to contribute to the public pool, keeping the rest for himself. The contributions are increased by 60% and split evenly among the four group members.

To reveal how individuals respond to institutional punishment/reward in PGG, we designed three different incentive schemes (SI, Methods), called Const, Up and Down, respectively. The amount of punishment/reward is fixed at 20 monetary units in Const and increases linearly from 16 to 25.6 (or decreases from 25.6 to 16) monetary units per round as a function of the group's total contribution in Up (or Down). For each of Const, Up and Down, there are three treatments. In these three treatments, each round of PGG is followed by a second stage, which corresponds respectively to an institutional punishment (IP), an institutional reward (IR), or both institutional reward and punishment (IRP). As described in the following paragraph (see also SI, Methods), after each round of the IP (respectively, IR) treatment, one of the four subjects is selected to be punished (respectively rewarded). The probability a particular subject is selected decreases (respectively, increases) as the contribution the subject makes to the common pool increases15. Examples of these probabilities relevant for their experiment as well as the amount of the punishment or reward are given to the participants as part of their instructions (SI, Methods). From these examples they gain experience with the monetary consequences of different single-round contribution levels. In the IRP treatment, one of the four subjects is selected to be punished with the IP probabilities above and one subject is independently selected to be rewarded with the IR probabilities. The Control and the nine treatments had approximately 20 groups apiece resulting in almost 800 participants.

In a single round of our repeated PGG experiment with institutional incentives, an individual's expected payoff in Control is $\pi_C = 20 - c + 1.6\bar{c}$ when he/she contributes $c$ and the average contribution of a group member is $\bar{c}$. For the treatments, an individual's expected payoff is $\pi_{IR} = \pi_C + P_{IR}A$ in IR; $\pi_{IP} = \pi_C - P_{IP}A$ in IP; and $\pi_{IRP} = \pi_C + (P_{IR} - P_{IP})A$ in IRP, where $A$ denotes the amount of the incentive ($A = 20$ in Const, $A = 16 + 0.48\bar{c}$ in Up and $A = 25.6 - 0.48\bar{c}$ in Down) and $P_{IR}$ (respectively, $P_{IP}$) is the probability that the individual is rewarded in IR (respectively, punished in IP). The explicit expressions for $P_{IR}$ and $P_{IP}$ are given in SI (Analysis 2.2). In particular, an individual who contributes the group average receives the incentive 25% of the time. Furthermore, an individual who cooperates by contributing 20 units when the rest of the group free-rides has a high chance of being rewarded (87.5%) whereas the others are rewarded only about 4% of the time. This latter small probability reflects the possibility that the institution makes a mistake as to who the cooperator is. In particular, no participant can guarantee he/she will receive the reward in IR (or prevent punishment in IP). Due to these institutional “mistakes”, it is reasonable that the same participant can be rewarded and punished in the same round of IRP (i.e. the reward and punishment mechanisms are independent). In fact, many peer-incentive experiments3,4,6 allow the same individual to be both rewarded and punished. We also mention here that a previous study used a different exogenous mechanism to examine efficient probabilistic punishment25.
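The expected payoffs in this paragraph can be sketched in a few lines of Python. Two parts of the sketch are assumptions rather than quotations from the paper. First, the selection probabilities are taken to be proportional to 1 + c for reward and to 21 − c for punishment; this form is chosen only because it reproduces the three figures quoted above (25%, 87.5% and about 4%), and the authoritative expressions are those in SI, Analysis 2.2. Second, the incentive amount in Up and Down is interpolated linearly on the assumption that 16 and 25.6 are reached at total group contributions of 0 and 80. All function names are hypothetical.

```python
def p_reward(c, group):
    """ASSUMED form: chance of reward proportional to 1 + contribution."""
    return (1 + c) / sum(1 + x for x in group)


def p_punish(c, group, endowment=20):
    """ASSUMED mirror image: chance of punishment proportional to 1 + (endowment - contribution)."""
    return (1 + endowment - c) / sum(1 + endowment - x for x in group)


def incentive_amount(group, scheme="Const"):
    """Incentive A; the Up/Down lines ASSUME endpoints at total contributions 0 and 80."""
    avg = sum(group) / len(group)
    if scheme == "Const":
        return 20.0
    if scheme == "Up":
        return 16.0 + 0.48 * avg
    if scheme == "Down":
        return 25.6 - 0.48 * avg
    raise ValueError(f"unknown scheme: {scheme}")


def expected_payoff(i, group, treatment="Control", scheme="Const", endowment=20, r=1.6):
    """Expected single-round payoff of group member i."""
    c = group[i]
    base = endowment - c + r * sum(group) / len(group)  # Control payoff
    a = incentive_amount(group, scheme)
    if treatment == "Control":
        return base
    if treatment == "IR":
        return base + p_reward(c, group) * a
    if treatment == "IP":
        return base - p_punish(c, group) * a
    if treatment == "IRP":
        return base + (p_reward(c, group) - p_punish(c, group)) * a
    raise ValueError(f"unknown treatment: {treatment}")


lone_cooperator = p_reward(20, [20, 0, 0, 0])  # 21/24 = 0.875
free_rider = p_reward(0, [20, 0, 0, 0])        # 1/24, about 4%
at_average = p_reward(10, [10, 10, 10, 10])    # 11/44 = 0.25
```

Because each set of probabilities sums to one over the group, the expected reward and punishment in IRP cancel at the group level under these assumptions.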

Following the methods of a recent theoretical study15, the predicted contribution of a rational player in a single round (i.e. his/her NE behavior) for all three incentive schemes can be easily calculated (SI, Analysis 2.3). As noted above, these single-round PGG results are also relevant for our repeated game since outcomes are similar whether the game is played among randomly chosen groups or the same group plays over and over1,2,7,8,9. If an individual contributes $c$ and the average contribution of his/her other group members is $c^{*}$, then the contribution level $c^{*}$ is called an NE if the expected payoff of this individual is highest when $c = c^{*}$. It is well known that the NE for Control is to free-ride1,2,3,4,5,6. From SI (Analysis 2.3), there is a unique NE in IR at about 5, IP has two NE, at 0 and 20, and IRP has a unique NE at 20. Specifically, for each of Const, Up and Down, the NE behaviors in Control and in treatments IR, IP and IRP are shown in Table 1. Thus, single-round NE behavior predicts that our rewards provide a powerful incentive to get cooperation started but are less well suited to maintaining cooperation, and that the converse holds for our punishments12,15,16,20,31 (SI, Analysis 2.3 and Figure S1). Furthermore, we expect that the NE outcomes of Table 1 will appear in the corresponding theoretical studies since all these NE are stable under evolutionary dynamics that model how myopic individuals change their contribution in the direction of increasing payoff15.

Table 1 Nash equilibrium behaviors in a single round when all group members contribute the same amount. The entries give the NE in Control and in each of the nine treatments
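The NE reasoning above can be explored numerically with a rough sketch. It is not the calculation in SI, Analysis 2.3. It assumes the same hypothetical selection probabilities as an illustration (reward proportional to 1 + c, punishment proportional to 21 − c, chosen only because they match the 25%, 87.5% and 4% figures in the Methods) and a linear incentive amount in Up and Down. It then looks for symmetric contribution levels at which a myopic player has no incentive to make a small unilateral change, which is a local check only.

```python
def payoff(c, others, treatment, scheme="Const"):
    """Expected payoff of a player contributing c when the other three each contribute `others`."""
    total = c + 3 * others
    avg = total / 4
    a = {"Const": 20.0, "Up": 16.0 + 0.48 * avg, "Down": 25.6 - 0.48 * avg}[scheme]
    p_r = (1 + c) / (4 + total)    # ASSUMED reward probability
    p_p = (21 - c) / (84 - total)  # ASSUMED punishment probability
    base = 20 - c + 1.6 * avg
    bonus = {"Control": 0.0, "IR": p_r * a, "IP": -p_p * a, "IRP": (p_r - p_p) * a}[treatment]
    return base + bonus


def marginal(x, treatment, scheme="Const", h=1e-6):
    """Central-difference payoff gradient for a unilateral deviation from the symmetric profile x."""
    return (payoff(x + h, x, treatment, scheme) - payoff(x - h, x, treatment, scheme)) / (2 * h)


def symmetric_ne(treatment, scheme="Const", steps=2000):
    """Symmetric rest points that are stable under myopic payoff-increasing adjustment."""
    xs = [20 * k / steps for k in range(steps + 1)]
    g = [marginal(x, treatment, scheme) for x in xs]
    found = []
    if g[0] <= 0:  # no gain from raising a zero contribution
        found.append(0.0)
    for k in range(steps):
        if g[k] > 0 >= g[k + 1]:  # gradient switches from positive to negative
            lo, hi = xs[k], xs[k + 1]
            for _ in range(40):  # bisection inside the bracket
                mid = (lo + hi) / 2
                if marginal(mid, treatment, scheme) > 0:
                    lo = mid
                else:
                    hi = mid
            found.append((lo + hi) / 2)
    if g[-1] >= 0:  # no gain from lowering a full contribution
        found.append(20.0)
    return found


results = {t: symmetric_ne(t) for t in ("Control", "IR", "IP", "IRP")}
```

Under these assumptions and the Const scheme, the sketch gives free-riding in Control, an interior point near 5.25 in IR, the two boundary points 0 and 20 in IP, and full contribution in IRP, in line with the NE quoted in the text.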

Results

The primary experimental result (Figure 1) is that, for all three incentive schemes, IRP is significantly better than either IP or IR in promoting cooperative behavior. Furthermore, IP has contribution levels significantly above Control, while the levels in IR are above Control on average but never significantly so. The statistical analysis of Figure 1 is given in SI (Tables S2–4 in Analysis 2.1).

Figure 1

Average contribution per round for institutional incentives compared to Control (Ctrl).

Panels (a), (b) and (c) show the results in the three schemes Const, Up and Down, respectively. The data are analyzed at the level of the group (i.e. average group contribution per capita) to avoid interdependence of outcomes for members of a given group. The average contributions per session (i.e. over all 50 rounds) are: 7.55 ± 4.65 in Control; IRP 15.23 ± 2.08, IP 10.79 ± 4.65 and IR 8.95 ± 2.08 in Const; IRP 17.69 ± 1.64, IP 12.55 ± 4.25 and IR 9.18 ± 2.85 in Up; IRP 15.52 ± 2.20, IP 11.80 ± 4.35 and IR 8.32 ± 2.34 in Down.

From Table 1, the evolutionary outcome in the incentive schemes will also be more cooperative than Control (this is true in IP as well since some groups will free-ride while others will become fully cooperative and contribute 20). Although the single-round NE predictions do not exactly match the experimental results (Figure 1), this evolutionary behavior is consistent with the experimental results where the trend in IRP (respectively, Control and IR) is to increase (respectively, decrease) their contributions toward the single NE as they play more rounds (Figure 1). However, although motivations based on single-round payoffs correctly predict the evolutionary direction, individuals overall do not play these NE strategies and so we now consider behaviors based on other intrinsic motivations.

Group selection cannot be used to explain our results. For example, IRP is a self-financing incentive scheme32 (i.e. the reward and punishment amounts cancel themselves out) and so groups with the same contribution have the same total payoff in one round of Control as they do in IRP. Thus, group imperatives cannot explain why these two protocols always display the largest difference in contributions. Unlike peer-incentive PGG experiments1,2,3,4,5,6 where group payoffs are typically examined in conjunction with group contribution to judge the effectiveness of different protocols in promoting cooperation (e.g. group selection in favor of higher contribution becomes a bigger factor when such contributions produce larger effects on group payoffs), group selection arguments are inconsistent with our results. Instead, we turn to other individual motivations as contributing causes for the observed differences in contribution levels among our experimental protocols.
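The self-financing property of IRP can be checked directly from the rules stated in the Methods: in every round exactly one member receives the amount A and exactly one member loses A. Writing $\pi_i^{C}$ and $c_i$ for the Control payoff and contribution of member i (notation introduced here only for illustration), the group total is

```latex
\sum_{i=1}^{4} \pi_i^{IRP}
  = \sum_{i=1}^{4} \pi_i^{C} + A - A
  = \sum_{i=1}^{4} \left( 20 - c_i + 1.6\,\bar{c} \right)
  = 80 + 0.6 \sum_{i=1}^{4} c_i ,
\qquad \bar{c} = \frac{1}{4}\sum_{i=1}^{4} c_i .
```

The total therefore depends only on the contributions and not on who is rewarded or punished, so it coincides with the Control total for the same contributions.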

Conforming behavior has been used to explain observed outcomes in PGG experiments and other social dilemmas30,33. In the theoretical analysis of a sequential PGG30, a conformist (called an imitator there) contributes the average contribution of all individuals who have already played. In the analysis of a repeated PGG experiment34, a conformist moves his contribution in the next round in the direction of the group average for the current round. In particular, conformists base their behavior on contributions rather than income. This latter approach is used in Table 2 to define conforming behavior as well as defecting and cooperating behavioral types that partition the participants in each round of the experiment into three behavioral types.

Table 2 Definitions for the three behavioral types; conforming, cooperating and defecting. For instance, if an individual contributes less than the group average in round k-1 (first row of table), he is conforming between this round and the next if he increases his contribution (last column of table) and defecting otherwise (other two columns)
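The classification rule can be written as a short function. Only the below-average case is spelled out in the caption above; the above-average and equal-to-average cases in this sketch are assumptions filled in by symmetry and may differ from the full Table 2. The function name is hypothetical.

```python
def behavior_type(prev_contribution, prev_group_average, next_contribution):
    """Classify one between-round move as conforming, cooperating or defecting."""
    change = next_contribution - prev_contribution
    if prev_contribution < prev_group_average:
        # Case stated in the caption: moving up toward the average is conforming.
        return "conforming" if change > 0 else "defecting"
    if prev_contribution > prev_group_average:
        # ASSUMED mirror image: moving down toward the average is conforming.
        return "conforming" if change < 0 else "cooperating"
    # ASSUMED rule when the contribution equals the group average.
    if change > 0:
        return "cooperating"
    if change < 0:
        return "defecting"
    return "conforming"


# A below-average contributor who raises the contribution is conforming.
example = behavior_type(prev_contribution=5, prev_group_average=10, next_contribution=8)
```

Applying the function to each participant in rounds 2 to 50 yields 49 labels per participant, the quantity summarized in the behavioral analysis.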

From Figure 2a, the proportion of participants who display conforming behavior is essentially constant at about 52.5% in all our experimental protocols (SI, Figure S4 and Table S5 in Analysis 2.4), supporting the conclusion that some, if not all, decisions are indeed based on contributions. Furthermore, the analysis of Figure 2b (SI, Analysis 2.5) shows participants are not randomly choosing their contributions without regard to payoffs or the contributions of other group members. It is also clear from Figure 2a that cooperating behavior is more pronounced in treatments that include punishment. From SI (Analysis 2.6), there is a higher between-round transition rate to cooperating behavior in these treatments as well. In fact, the limiting distribution of the Markov process generated by the transitions among behavioral types between rounds can be used to predict the outcomes in all our experimental protocols (SI, Analysis 2.6).

Figure 2

Individual behavior based on contribution and individual reactive behavior to reward/punishment incentives.

(a) For Control and treatments IR, IP and IRP in each of Const, Up and Down, the average proportions of participants who display conforming, defecting and cooperating behaviors over all 49 rounds, respectively, are plotted. (b) The distribution of the number of times (of a possible 49) that an individual displays conforming behavior from one round to the next. (c) The reaction to being rewarded in IR in the current round (where the symbols “+”, “0” and “−” represent “increase contribution”, “no change in contribution” and “decrease contribution”, respectively, in the following round). The proportions of these three reactions over all 49 rounds are shown for a given contribution in the current round. (d) The reaction to being punished in IP in the current round.

For our incentive experiments, a more challenging question is how participants' reactions differ when rewarded compared to punished. These reactions, which are now a combination of decisions based on payoffs and on contributions, can best be explained by comparing them in the IR and IP protocols. Figure 2c (respectively, Figure 2d) shows the proportions of individuals in all three IR (respectively, IP) protocols who increase (+), decrease (−) and do not change (0) their contribution in the round that follows receiving a reward (respectively, punishment), given their current contribution is between 1 and 19. Players who currently contribute 0 (respectively, 20) are not included since they cannot decrease (respectively, increase) their contribution. Similarly, the reactions of participants who are not rewarded in IR (not punished in IP) are plotted in Figure S6a (Figure S6b) (see SI, Analysis 2.7). From Figure 2c–d (SI, Tables S8a–c in Analysis 2.7), there are significantly more individuals who increase their contribution after being punished in IP than after being rewarded in IR. In fact, this result can be attributed to the difference in reaction to reward or punishment among those who are currently contributing between 11 and 19 (i.e. among high contributors) since there is no significant difference among those who contributed less than 10. Moreover, there are the following differences in behavior between individuals who receive an incentive and those who do not. From Figure 2c–d and Figure S6a–b, among high contributors, individuals who are punished in IP (respectively, rewarded in IR) are significantly more likely to increase (respectively, decrease) their contribution than those who are not punished in IP (respectively, those not rewarded in IR). These reactions to being the recipient of an incentive are consistent with single-round NE behavior. Specifically, Table 1 shows that high contributors in IR should decrease their contribution in order to move in the direction of the unique NE whereas high contributors in IP move toward their nearest NE by increasing their contribution. It is also interesting to note that Table 1 implies that punishment is more efficient in promoting cooperation among high contributors than low contributors since low contributors in IP decrease their contribution to move toward their nearest NE (see also the Discussion). This argument is in line with experimental results questioning the effectiveness of punishment21,35.

To show how individual behavior based on contribution affects the average contribution, we define a participant as a conforming-individual (cooperating-individual, defecting-individual) if the frequency with which he/she displays conforming behavior (cooperating behavior, defecting behavior) over all 49 rounds (i.e. from round 2 to round 50) is larger than 50%. This definition concerns the holistic characteristic of each individual's behavior. According to this definition, the proportions of conforming-individuals, cooperating-individuals and defecting-individuals in all our experimental protocols are shown in Figure 3a along with the proportion who fit none of these three categories. The proportion of conforming-individuals is about 60% in all treatments. For Control and treatments IR, IP and IRP, the proportion of cooperating-individuals (defecting-individuals) follows the pattern Control < IR < IP < IRP (Control ≈ IR > IP ≈ IRP) (Figure 3a and SI, Table S9). This result strongly suggests that only the proportion of cooperating-individuals (or defecting-individuals) responds sensitively to the incentives, since the proportion of conforming-individuals seems to be independent of the incentives. This result matches the observations in Figure 2a as well. Finally, Figure 3b shows that the average group contribution is proportional to the proportion of cooperating-individuals. This means that the cooperating-individuals can be regarded as “exemplars” (or “leaders”) whose behaviors are tracked by conformists who follow the resultant change in group average contribution. In particular, since there is on average one cooperating-individual in each IRP group (i.e. the proportion of cooperating-individuals in all three IRP protocols is about 25% compared to less than 18% in all other treatments in Figure 3b), this exemplar leads his/her group to a higher group average contribution. This conclusion is also consistent with the observation of a previous study36, in which cooperating-individuals are particularly good at leading by example.
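The holistic classification defined in this paragraph amounts to a majority rule over the 49 per-round labels. A minimal sketch, with a hypothetical function name and made-up label sequences used purely for illustration:

```python
from collections import Counter


def individual_type(round_behaviors, threshold=0.5):
    """Label a participant from the per-round behavior labels (rounds 2 to 50)."""
    if not round_behaviors:
        raise ValueError("at least one between-round label is required")
    counts = Counter(round_behaviors)
    total = len(round_behaviors)
    for label in ("conforming", "cooperating", "defecting"):
        # A participant belongs to a type only if it appears in MORE than 50% of rounds.
        if counts[label] / total > threshold:
            return label + "-individual"
    return "other"  # no behavior reaches a majority (the "Others" group of Figure 3a)


# Made-up sequences of 49 labels, not experimental data.
mostly_conforming = ["conforming"] * 30 + ["cooperating"] * 10 + ["defecting"] * 9
mixed = ["conforming"] * 20 + ["cooperating"] * 20 + ["defecting"] * 9
```

At most one label can exceed the 50% threshold, so the order in which the three types are checked does not affect the result.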

Figure 3

Conforming-, cooperating- and defecting-individuals and their effects on the average contribution.

(a) The blue, green and red bars denote the proportions of conforming-, cooperating- and defecting-individuals, respectively. The yellow bar (Others) denotes the proportion of individuals who display each of conforming, cooperating and defecting behaviors in less than 50% of the rounds. (b) The average group contribution versus the proportion of cooperating-individuals. This shows clearly that an increase in the proportion of cooperating-individuals leads to an increase in the average group contribution.

Discussion

Although we mostly attributed the reaction of a participant who receives an incentive to its effect on payoffs, these reactions may also be a reflection of society's different attitudes to rewards and punishments. Participants who are punished in IP even though they are contributing at a high level want to avoid the shame of being punished again and so contribute more24. Shame is probably not an important factor for low contributors. Punishment will not be effective at increasing their contribution if they decide it is unlikely they will be punished two rounds in a row when there are other low contributors in their group. On the other hand, both low and high contributors who are rewarded in IR are satisfied with their reward and the status quo37,38 and so see no reason to increase their chance of being rewarded.

Whatever the ultimate reasons behind these reactions to rewards and punishments, it is also important to understand why individuals would consider conforming behavior in our experiments. Conforming is a good choice for individuals who want to avoid risk in a complex game since it can guarantee that their income level will be at the average (SI, Figure S7 in Analysis 2.8). That is, even though the conformist cannot win the game, he/she does not lose either and so it is ‘not a bad choice’ to follow group behavior, especially if his/her NE behavior is difficult to analyze39.

Most previous theoretical and empirical studies on the repeated PGG assumed that an individual's behavioral decision mainly depends on benefits (or incomes)10,11,12,13,14,15,16. However, in our experiments, observed outcomes can be better explained by dividing individuals into conforming, cooperating and defecting behavioral types34 based on contributions. The prevalence of conforming in all experimental protocols agrees with the fact that this behavior is common in nature and human society39,40. Conformists in our experiments are able to avoid (or dilute) risk by guaranteeing their income will be close to average. We also found that participants who are the recipient of an institutional incentive react differently from those who are not, and that reactions to receiving a reward differ from reactions to being punished. Our analysis shows that these different reactions predict the effect of incentives in our experiments.

Finally, based on the holistic characterization of an individual's behavior over all 50 rounds of the repeated PGG, we found that “cooperating-individuals” can be regarded as the exemplars (or leaders) and that their behaviors are tracked by conformists who follow the resultant change in group average contribution. This view is strongly supported by the experimental evidence. In particular, the proportion of “conforming-individuals” is almost the same for all incentive treatments, and the proportion of “cooperating-individuals” (respectively, “defecting-individuals”) across Control and treatments IR, IP and IRP follows the pattern Control < IR < IP < IRP (respectively, Control ≈ IR > IP ≈ IRP). That is, this experimental evidence can be used to explain why IRP is significantly better than either IP or IR in promoting cooperation and why IP is better than IR (see the primary results in Figure 1). The role of these exemplars in promoting cooperation provides an important insight into the mechanisms behind cooperation in our repeated PGG experiments that can be used to predict the effectiveness of institutional incentives at promoting desirable societal behavior.