The dynamics of human behavior in the public goods game with institutional incentives

The empirical research on the public goods game (PGG) indicates that both institutional rewards and institutional punishment can curb free-riding and that the punishment effect is stronger than the reward effect. Self-regarding models that are based on Nash equilibrium (NE) strategies or evolutionary game dynamics correctly predict which incentives are best at promoting cooperation, but individuals do not play these rational strategies overall. The goal of our study is to investigate the dynamics of human decision making in the repeated PGG with institutional incentives. We consider that an individual’s contribution is affected by four factors, which are self-interest, the behavior of others, the reaction to rewards, and the reaction to punishment. We find that people on average do not react to rewards and punishment, and that self-interest and the behavior of others sufficiently explain the dynamics of human behavior. Further analysis suggests that institutional incentives promote cooperation by affecting the self-regarding preference and that the other-regarding preference seems to be independent of incentive schemes. Because individuals do not change their behavioral patterns even if they were not rewarded or punished, the mere potential to punish defectors and reward cooperators can lead to considerable increases in the level of cooperation.

the effect of punishment 7,18,[24][25][26][27][28][29]31,[34][35][36][37][38][40][41][42][43][44] . Specifically, punishment can eliminate selfish behaviors in a cooperative population and stabilize cooperation 18,24,25,27,31,[34][35][36][37][38]41,42 . By contrast, rewards play an important role in leaving the selfish state but are relatively ineffective in maintaining a high cooperation level 18,25,26,28,29,35,37,38,[42][43][44] . The asymmetry between reward and punishment has also been observed in laboratory experiments 15,[19][20][21][22]32 . A meta-analysis found that the punishment effect was slightly stronger than the reward effect, and the centralization of incentives did not moderate the effect size 32 . Although a self-regarding model that is based on NE analysis or evolutionary game dynamics correctly predicts which incentives are better at promoting cooperation 15,18,25,28,35,37,42 , individuals do not play these rational strategies overall 15,21 . In fact, most subjects in PGG with incentives tend to lower (or raise) their contributions if they contributed more (or less) than others in the previous round 2, 15,21 . Furthermore, the subjects exhibited different reactions to peer incentives and institutional incentives. In the peer incentive scenario, both reward and punishment encouraged the recipients to increase contributions 21 . However, in the institutional incentive scenario, only punishment successfully caused the receivers to increase their contributions, and the subjects who were rewarded by the institution often decreased their contributions in the next round 15 . As a result, the contribution levels in institutional reward experiments were not significantly above the standard PGG 15 .
The above discussion suggests that it is important to recognize that actual people are not perfectly rational, and models that are based on self-interest may fail to predict the effectiveness of incentives at promoting cooperation. However, it is unclear how people make decisions when confronted with institutional incentives and why subjects have different attitudes on reward and punishment. In this study, we analyze the observed outcomes in PGG experiments with institutional incentives 15 and provide an explicit description of human decision making. As an extension of Fischbacher and Gächter's model 17 , we emphasize that the subjects in the experiments considered not only self-interests but also the behavior of others and whether they were rewarded or punished in the previous round. Our main goal is to answer how the self-regarding preference and the other-regarding preference affect contributions in PGG with institutional incentives and why the reactions to rewards and punishment are different. Interestingly, we find that people on average do not change their behavioral patterns after they are rewarded or punished. In fact, the people who received punishment (or reward) generally contributed less (or more) than their group members, and they increase (or decrease) their contribution in the next round because of the other-regarding preference. Furthermore, the other-regarding preference seems to be independent of the incentive schemes, and institutional incentives promote cooperation by affecting the self-regarding preference.

Experimental setups and primary results. The experimental setups and primary experimental results
were reported in our previous study 15 , and here, we briefly review them. In the experiments, the subjects interacted anonymously for 50 rounds of the repeated game among the same four players. The control experiment (Control) is a standard four-player repeated PGG. In each round, every subject receives 20 monetary units and decides how much to contribute to the public pool. The total contributions in the pool are then multiplied by 1.6 and split evenly among the four group members. In the nine treatment experiments, each round of PGG is followed by a second stage, which corresponds, respectively to an institutional punishment (IP), an institutional reward (IR), or both institutional reward and punishment (IRP). In a round, exactly one player will be chosen to be rewarded or punished according to his/her contribution (see Methods). For each IR, IP and IRP, there are three different types of incentive intensities, which are called Const, Up and Down, where the amount of punishment/ reward is fixed at 20 monetary units in Const, and increases linearly from 16 to 25.6 (or decreases from 25.6 to 16) monetary units per round as a function of the group's total contribution in Up (or Down).
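The first-stage payoff rule described above can be sketched in a few lines. This is a minimal illustration, not the software used in the experiments; the function name `pgg_payoffs` and the constants are hypothetical labels for the quantities stated in the text (endowment of 20, multiplier of 1.6, groups of four).

```python
from typing import List

ENDOWMENT = 20.0   # monetary units each subject receives per round
MULTIPLIER = 1.6   # factor applied to the public pool
GROUP_SIZE = 4     # players per group


def pgg_payoffs(contributions: List[float]) -> List[float]:
    """Return each player's payoff for one round of the standard PGG stage."""
    if len(contributions) != GROUP_SIZE:
        raise ValueError("expected exactly four contributions")
    if any(c < 0 or c > ENDOWMENT for c in contributions):
        raise ValueError("contributions must lie between 0 and the endowment")
    # The pool is multiplied and split evenly among the four group members.
    share = MULTIPLIER * sum(contributions) / GROUP_SIZE
    return [ENDOWMENT - c + share for c in contributions]
```

Because each contributed unit returns only 1.6/4 = 0.4 units to the contributor, a single free-rider among three full contributors earns more than the others in this sketch, which is the social dilemma the incentive stage is meant to address.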
The primary experimental results are shown in Fig. 1 (the related statistics can be found in Wu et al. 15 ). For all three types of incentive intensities, IRP is significantly better than either IP or IR in promoting cooperation, and IP has contribution levels significantly above Control, whereas the levels in IR are not significantly above Control (see Fig. 1a). Furthermore, there is a significant increase in contribution levels from the first to the last round in IRP, whereas there is a significant decrease in IR and Control, and a slight decrease in IP (see Fig. 1b). Our previous study showed that although the motivations that are based on (single-round) Nash equilibria correctly predict the evolutionary direction, individuals overall do not play rational strategies 15 . In contrast, in all ten experiments, the proportion of conforming behaviors (i.e., changing the contribution in the next round in the direction of the average group contribution in the current round) is higher than one-half. Furthermore, significantly more individuals increase their contribution after being punished than after being rewarded. However, the correlation between conforming behaviors and reactions to incentives is unknown, and an explicit description of the subjects' decision making is lacking. Thus, the findings in Wu et al. 15 cannot be applied to evaluate the efficiencies of different incentive schemes on promoting cooperation.

Modeling human behavior in PGG with institutional incentives. Following Fischbacher and
Gächter 17 , we consider that the contribution of a player in PGG with institutional incentives is affected by four factors, which are his/her own behavior, the behavior of others, and whether he/she was rewarded and/or punished in the previous round. Write the contribution of a player in round t as
$$C(t) = b_1 C(t-1) + b_2 OC(t-1) + b_3 R(t-1) + b_4 P(t-1). \tag{1}$$
In Eq. (1), C(t − 1) and OC(t − 1) denote the contribution of the player and the average contribution of his/her three other group members, respectively, in round t − 1. Thus, b 1 and b 2 measure the effects of the self-regarding preference and the other-regarding preference, respectively (see SI Section 2 for a detailed discussion), where b 2 = 0 represents individuals who consider only their self-interests, and b 1 = 0 represents individuals who care only about others' behavior. In contrast, R(t − 1) and P(t − 1) are the amounts of reward and punishment, respectively, that are received in round t − 1, and R(t − 1) (or P(t − 1)) equals 0 if the player was not rewarded (or punished). Therefore, b 3 and b 4 describe the reactions of being rewarded or punished, respectively, where a player tends to increase his/her contribution after being rewarded (or punished) if b 3 > 0 (or b 4 > 0). Following Eq. (1), the behavioral patterns of the players in the repeated PGG with institutional incentives can be characterized by a 4-dimensional vector (b 1 , b 2 , b 3 , b 4 ), where free-riders, unconditional contributors and conditional cooperators (who move toward the average contribution of others 10,15,16 In particular, the imperfect conditional cooperators who are defined by Fischbacher and Gächter 17 satisfy b 1 + b 2 < 1 and b 2 > 0. We then calculate the behavioral patterns of the 792 participants based on the regression equation, Eq. (1). The regression results are significant for 79.5% of the players (F-test, P-value < 0.01); furthermore, the results are significant for 91.3% of the players at the 5% significance level.
These results imply that the behavior of most individuals in PGG with institutional incentives can be described by our model.
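As an illustration of how such per-player coefficients could be estimated, the sketch below fits the four coefficients by ordinary least squares, regressing C(t) on C(t − 1), OC(t − 1), R(t − 1) and P(t − 1). It rests on assumptions that the text does not confirm: a linear model without an intercept, plain least squares via the normal equations, and the hypothetical helper names `solve_linear` and `fit_player`.

```python
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            # Happens, e.g., if a regressor is identically zero (no punishment
            # column in a reward-only treatment); drop that column in such cases.
            raise ValueError("singular design: regressors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_player(own: Sequence[float], others: Sequence[float],
               reward: Sequence[float], punish: Sequence[float]) -> List[float]:
    """Least-squares estimate of (b1, b2, b3, b4) for one player.

    own[t] is the player's contribution in round t, others[t] the average
    contribution of the three other members, reward[t] and punish[t] the
    amounts received in round t (0 if none). Assumes no intercept term.
    """
    rows = list(zip(own[:-1], others[:-1], reward[:-1], punish[:-1]))
    y = list(own[1:])
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)
```

In practice a statistics package would also be needed to obtain the F-test reported in the text; this sketch returns only the point estimates.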

The effects of reward and punishment.
Based on the regression results, we first examine the reactions to reward and punishment. Surprisingly, Table 1 shows that in all nine treatment experiments, the mean values of b 3 and b 4 are very small. Notice that the absolute values are less than 0.05 in all treatments, and the resulting change in the contribution is less than 1 monetary unit. Furthermore, in almost all of the treatments, b 3 and b 4 are not significantly different from zero (the only exception is b 3 in the IRP Const experiment). This finding means that people on average do not react to reward or punishment.
Because b 3 and b 4 are small and their impacts on contributions are not significant, we drop them from Eq. (1) and consider the following simplified regression equation
$$C(t) = b_1 C(t-1) + b_2 OC(t-1). \tag{2}$$
In SI Table S1, we show that excluding b 3 and b 4 from Eq. (1) does not significantly affect the self-regarding parameter b 1 and the other-regarding parameter b 2 . In addition, the regression results that are based on Eq. (2) are even better than the results of Eq. (1), i.e., the results are significant for 87.2% of all players (F-test, P-value < 0.01); furthermore, the results are significant for 95.8% of the players at the 5% significance level. This outcome further demonstrates that people on average do not change their behavioral patterns after they are rewarded or punished, and a combination of the self-regarding preference and the other-regarding preference sufficiently explains the dynamics of human behavior in PGG with institutional incentives.
Table 1. Reactions to reward and punishment. The mean values of b 3 and b 4 in the nine treatment experiments. The symbol "*" denotes that the mean value of b 3 is significantly different from zero (Mann-Whitney U-test, P-value < 0.01). The data are analyzed at the group level to avoid the interdependence of outcomes for members of a given group.
incentive intensities, i.e., small changes in the amount of reward or punishment do not affect b 1 and b 2. We therefore combine the data in Up, Const and Down and investigate b 1 and b 2 in the four schemes of Control, IR, IP and IRP. As shown in Table 2, the mean values of b 2 are between 0.52 and 0.55 in all four schemes, and the differences among them are not significant (see SI Table S3 for a statistical test). In contrast, we observe that b 1 in IRP and IP is larger than in IR and Control, and b 1 in IRP is slightly larger than in IP, whereas the difference between b 1 in IR and Control is not significant (see SI Table S3 for a statistical test). We also estimated b 1 and b 2 in the four schemes separately for rounds 1 to 25 and rounds 26 to 50. In IR, IP and IRP, the estimated coefficients are very similar in both halves of the experiments (see SI Table S4). This result means that individual behavioral patterns do not change over rounds in PGG with incentives. However, in Control, b 2 in the first 25 rounds is significantly larger than in the last 25 rounds, whereas b 1 in the first 25 rounds is (insignificantly) smaller than in the last 25 rounds. These findings are consistent with the observation of Fischbacher and Gächter 17 that belief in others plays a major role in early periods, and self-interest becomes more important later.
These regression results raise two additional questions, namely, why b 1 in IRP and IP is larger than in IR and Control and why b 2 is similar across incentive schemes. We answer these questions by investigating the correlations between the group average payoff and the group averages of b 1, b 2 and b 1 + b 2. The main results are shown in Table 3. In the three incentive schemes of IR, IP and IRP, the correlation between b 2 and the payoff is not significant, whereas we observe a strong positive correlation between b 1 and the payoff in IP and IRP. Thus, a larger b 1 is preferred in IRP and IP because it can lead to a higher payoff. In Control, there is a strong positive correlation between b 2 and the group average payoff, i.e., the group average contribution increases with b 2. This finding is consistent with the previous finding that conformity promotes the evolution of cooperation in PGG 16 . Now, we explain why the average contributions in IRP and IP are higher than in IR and Control. The analysis in SI Section 2 shows that the group average contribution in a repeated PGG increases with the group average of b 1 + b 2. Furthermore, a high level of contributions can be maintained in a PGG if the group average of b 1 + b 2 is approximately 1 and b 2 is not too small (i.e., the group consists of conditional cooperators 16 ). Thus, IRP and IP are better at promoting cooperation than IR and Control because they have a larger b 1 + b 2. Notice that most groups in IRP and IP satisfy b 1 + b 2 ≈ 1, and cooperation increases or does not change significantly in these two incentive schemes (see Fig. 2a). However, b 1 + b 2 < 1 in most groups of IR and Control. As a result, contributions drop significantly in these two schemes.
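The role of b 1 + b 2 can be illustrated with a small simulation. The sketch below is not the analysis of SI Section 2; it assumes a hypothetical homogeneous group in which all four players share the same b 1 and b 2 and update according to C(t) = b 1 C(t − 1) + b 2 OC(t − 1), with contributions clipped to the range 0 to 20. Under that assumption the group total is multiplied by b 1 + b 2 each round, so the average is stable when the sum is 1 and decays geometrically when it is below 1. The function name and the parameter values in the usage note are illustrative only.

```python
from typing import List, Sequence


def simulate_group(initial: Sequence[float], b1: float, b2: float,
                   rounds: int) -> List[float]:
    """Return the group average contribution in rounds 0..rounds.

    Assumes every player uses the same (b1, b2); this homogeneity is an
    illustrative simplification, not a property of the experimental data.
    """
    contrib = [float(c) for c in initial]
    n = len(contrib)
    averages = [sum(contrib) / n]
    for _ in range(rounds):
        total = sum(contrib)
        # OC for player i is the mean of the other n - 1 contributions.
        contrib = [
            min(20.0, max(0.0, b1 * c + b2 * (total - c) / (n - 1)))
            for c in contrib
        ]
        averages.append(sum(contrib) / n)
    return averages
```

For example, `simulate_group([20, 10, 5, 5], 0.45, 0.55, 10)` keeps the group average at 10 in every round, whereas lowering b 1 to 0.30 (so that b 1 + b 2 = 0.85) shrinks the average by a factor of 0.85 per round.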
We also examine b 1 + b 2 at an individual level (see Fig. 2b). Table 2 shows that b 1 + b 2 is significantly smaller than 1 in IR and IP, which means that many subjects in IR and IP are imperfect conditional cooperators. This result explains why cooperation slightly decreases in IP although punishment successfully eliminates the free-riders from the population. Interestingly, individual b 1 + b 2 is insignificantly different from one in Control, which implies that many people in Control are conditional cooperators. As shown in Fig. 2b, individual behavioral patterns in Control have a large degree of heterogeneity, where free-riders (i.e., b 1 + b 2 ≪ 1) and conditional cooperators (i.e., b 1 + b 2 ≈ 1,b 2 > 0) are the two largest types. Thus, cooperation can be maintained in groups that consist of conditional cooperators, and contributions drop if there are free-riders or imperfect conditional cooperators in the group (as shown in Fig. 2a, approximately 30% of the groups in Control comprise conditional cooperators).

Discussion
Recent empirical research has shown that conditional cooperation or conforming behaviors are common in PGG experiments [8][9][10][11][12][13][14][15] , and people may have different attitudes concerning reward and punishment 15,[19][20][21][22]32,45 . The goal of our study is to explore how the self-regarding preference and the other-regarding preference affect contributions in repeated PGG with institutional incentives and why some incentive schemes promote cooperation better than others. To achieve this goal, we consider that individual contributions are affected by four factors, which are self-interest, the behavior of others, and the reactions to reward and punishment. The regression results show that people on average do not react to reward or punishment and that two factors, namely, self-interest and the behavior of others, sufficiently explain the dynamics of human behavior. Furthermore, institutional incentives promote contributions by affecting the self-regarding preference, b 1 , and the other-regarding preference, b 2 , seems to be independent of incentive schemes. Our conclusion questions the applicability of many theoretical models of PGG with institutional incentives 15,31,[34][35][36][37][38][39][40][41][42][43][44] . The evolutionary/learning dynamics that are considered in these models are based on the assumptions of perfect rationality, such as Nash equilibrium strategies (or best response strategies) 15,35,39 , or preferential imitation of better performing players, such as replicator dynamics 35,37,[40][41][42][43] , pairwise comparison updating 34,36,38,44 , and exploration dynamics 31,39 . However, our experiments did not provide significant evidence that the subjects choose payoff maximizing strategies or imitate their group members with the best payoff. Instead, the contributions of the players mainly depend on their own previous action and the actions of their group members. Accordingly, we suggest that subsequent theoretical research on PGG with institutional incentives should consider our findings that players seem to not care about the payoffs of their group members in updating their actions, but they care about their group members' contributions.
Table 2. Mean values of b 1 and b 2 in Control, IR, IP and IRP. The symbol "*" denotes that the individual b 1 + b 2 is significantly smaller than one (Mann-Whitney U-test, P-value < 0.01). Most people in IR and IP are imperfect conditional cooperators (i.e., b 1 + b 2 < 1), whereas most people in IRP are conditional cooperators (i.e., b 1 + b 2 ≈ 1).
Table 3. Correlation coefficients between the group average payoff and the group average, b 1 , b 2 and b 1 + b 2 in Control, IR, IP and IRP. The symbol "*" denotes that the correlation is strong, i.e., P-value < 0.01.
Our analysis also reveals that individuals display different behavioral patterns when confronted with institutional incentives and peer incentives. The experiments on peer incentives have shown that people behave more cooperatively after being rewarded or punished 21,22,39 . These observations indicate a problem with peer incentives: the maintenance of cooperation by peer incentives relies on whether defectors are punished and cooperators are rewarded in time, and the cooperation level will decline if the reward or punishment level declines. It has been shown that the average number of peer punishers (or rewarders) decreases with the number of defectors (or cooperators) 21,39 . This result implies that peer incentives are inherently fragile because reward levels are difficult to maintain in a cooperative population and because punishment levels are difficult to maintain in a selfish population. In contrast, in PGG with institutional incentives, because individuals do not change their behavioral patterns regardless of whether they received incentives, the mere potential to punish defectors and to reward cooperators can lead to considerable increases in the level of cooperation. In our experiments, only one player will be rewarded or punished in each round. Therefore, institutional incentives are more powerful than peer incentives in promoting cooperation because the incentive institutions work although not all defectors are punished.
Finally, it is well known that human populations are in general highly structured, where different individuals interact with different subsets of the entire population. Theoretical studies that are based on evolutionary game methods have indicated that structures play a major role in the evolution of cooperation in social dilemma games, such as prisoner's dilemma (PD) and PGG [46][47][48][49][50][51][52][53][54][55] . However, cooperative outcomes have been rarely observed in experiments on many static networks [56][57][58][59][60] . A possible explanation for this inconsistency is that most theoretical studies have considered payoff-based updating rules, whereas the people in the experiments did not adopt these rules [61][62][63] . In particular, a behavioral rule called "moody conditional cooperation" was observed in several spatial PD game experiments, where moody conditional cooperators make decisions based on their own previous action and the actions of their neighbors (specifically, they cooperate more when they themselves cooperated in the previous round and more of their neighbors cooperated) 61,63 . There is a direct connection between moody conditional cooperation and our model 16 , and our method can also be applied to describe the dynamics of human behavior on networked PD games. By introducing network parameters into the model (e.g., the number of neighbors), we can expect to quantitatively evaluate the effect of network structures on human decision making.

Methods
A total of 792 university students participated in our PGG experiments at the School of Mathematical Sciences Computer Lab at Beijing Normal University. All participants provided written informed consent after the nature and possible consequences of the studies were explained. All experimental methods were conducted according to the approved guidelines. All experimental protocols were approved by the Ethics Review Committee of the Institute of Zoology.
The subjects interacted anonymously through computer screens for 50 rounds of the repeated game among the same four players. The control experiment (Control, 76 subjects, 19 groups) is a standard four-player repeated PGG. In the treatment experiments, each round of PGG is followed by an incentive stage, which corresponds to an institutional punishment (IP), an institutional reward (IR), or both institutional reward and punishment (IRP). Furthermore, for each IR, IP and IRP, there are three different types of incentive intensities that are called Const, Up and Down. In IP, there are 80 subjects (20 groups) in Const, 76 subjects (19 groups) in Up and 84 subjects (21 groups) in Down. In IR, there are 80 subjects (20 groups) in Const, 72 subjects (18 groups) in Up and 84 subjects (21 groups) in Down. In IRP, there are 80 subjects (20 groups) in Const, 80 subjects (20 groups) in Up and 80 subjects (20 groups) in Down.
In Control, an individual's single-round expected payoff is $\pi_C \equiv 20 - C + 1.6\bar{C}$ when he/she contributes $C$ and the average contribution of the group is $\bar{C}$. In the treatments, exactly one player will be chosen to be rewarded or punished according to his/her contribution in the incentive stage. An individual's expected payoffs are $\pi_{IR} = \pi_C + P_{IR} A$ in IR, $\pi_{IP} = \pi_C - P_{IP} A$ in IP, and $\pi_{IRP} = \pi_C + (P_{IR} - P_{IP}) A$ in IRP, where $A$ denotes the amount of incentives (which is $A = 20$ in Const, $A = 16 + 0.48\bar{C}$ in Up and $A = 16 + 0.48(20 - \bar{C})$ in Down), and $P_{IR}$ (or $P_{IP}$) is the probability that the individual is rewarded in IR (or punished in IP). Specifically, $P_{IR} = (C + 1)/(4(\bar{C} + 1))$ and $P_{IP} = (21 - C)/(4(21 - \bar{C}))$. Thus, the probability that an individual is rewarded (or punished) increases (or decreases) as the amount that he/she contributed increases. More methodological details and sample instructions can be found in Wu et al. 15 .
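The expected-payoff rules in this section can be written out as a short sketch. It encodes our reading of the Methods formulas: reward probability (C + 1)/(4(C̄ + 1)), punishment probability (21 − C)/(4(21 − C̄)), and incentive amount A equal to 20 in Const, 16 + 0.48 C̄ in Up and 16 + 0.48 (20 − C̄) in Down, where C̄ is the group average contribution including the player. The function names and the string labels for schemes and intensities are hypothetical.

```python
def incentive_amount(intensity: str, avg: float) -> float:
    """Amount A of the reward or punishment for a given group average."""
    if intensity == "Const":
        return 20.0
    if intensity == "Up":
        return 16.0 + 0.48 * avg
    if intensity == "Down":
        return 16.0 + 0.48 * (20.0 - avg)
    raise ValueError(f"unknown intensity: {intensity}")


def reward_probability(c: float, avg: float) -> float:
    """Probability of being the rewarded player; rises with own contribution."""
    return (c + 1.0) / (4.0 * (avg + 1.0))


def punish_probability(c: float, avg: float) -> float:
    """Probability of being the punished player; falls with own contribution."""
    return (21.0 - c) / (4.0 * (21.0 - avg))


def expected_payoff(scheme: str, intensity: str, c: float, avg: float) -> float:
    """Single-round expected payoff under Control, IR, IP or IRP."""
    base = 20.0 - c + 1.6 * avg  # payoff of the standard PGG stage
    if scheme == "Control":
        return base
    a = incentive_amount(intensity, avg)
    p_r = reward_probability(c, avg)
    p_p = punish_probability(c, avg)
    if scheme == "IR":
        return base + p_r * a
    if scheme == "IP":
        return base - p_p * a
    if scheme == "IRP":
        return base + (p_r - p_p) * a
    raise ValueError(f"unknown scheme: {scheme}")
```

With these forms the four selection probabilities in a group sum to one, which matches the statement that exactly one player is chosen in each round.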