Introduction

Overconfidence is a well-established bias in which a person’s subjective confidence in his or her judgments is greater than the objective accuracy of those judgments, especially when confidence is relatively high1. In human societies overconfidence has been recognized in many different forms, such as overestimation of one’s actual performance, over-ranking of personal achievement relative to others and excessive certainty regarding the accuracy of individual beliefs2. Although it is often blamed for hubris, market bubbles, financial collapses, policy failures and costly wars, overconfidence remains prevalent in our daily experience3,4,5. Such a bias can evolve due to the competition of alternative strategies and may contribute significantly to the increase of morale, ambition, resolve and persistence6,7,8,9. Very high levels of core self-evaluations, a stable personality trait composed of locus of control, neuroticism, self-efficacy and self-esteem, may also be related to the overconfidence effect10,11.

As a concomitant bias, bluffing, also named boasting or exaggeration, is a representation of something in an excessive manner12,13. The boaster is regarded as one who pretends to have distinguished qualities, but has them not at all or only to a lesser degree6,12,14. Usually bluffing is not reliably distinguished from true ability15 and exists in different forms, such as amplifying achievements or deceiving others’ expectations by magnifying emotional expressions12. It is important to stress that the deception profile, including the appropriate levels of overconfidence and bluffing intensities, plays a decisive role in determining what an individual gets in resource competitions. Our specific interest here is to explore how such profiles develop due to an evolutionary process.

The application of a realistic evolutionary rule, however, requires sanctioning of uncovered bluffing, which represents a sort of social norm of the population. In fact, the ability to develop and enforce social norms is probably one of the distinguishing features of the human species16. Several experiments and theoretical investigations have revealed that sanctions are able to create a sufficiently strong selective pressure to prevent cheating, which is necessary to stabilize human cooperation17,18,19,20,21,22,23. Similarly, deceptive behavior, whether self-deception (overconfidence) or other-deception (bluffing), might be controlled by centralized sanctions. Although third-party punishment can rectify peer biases caused by deception in cognition, system bias, which represents an inclination of the whole group, is beyond its reach. This system bias can be brought about by social comparison bias24, for instance, where most of the members of a group believe that they are better (or worse) than the group average in certain respects, which clearly contradicts basic mathematical principles25,26. Mandatory rules and many other factors can also bring about such system bias27.

Taking all the factors above into consideration, we explore how overconfidence and bluffing evolve within the framework of a spatial resource competition model. Here we follow the successful method of evolutionary game theory, which has proved particularly efficient in explaining the emergence and maintenance of cooperation28,29,30,31,32,33,34,35,36,37,38,39,40. We suppose that, instead of strategies, players imitate personal profiles during the evolutionary process41. More specifically, to adapt this concept to the present model, overconfidence and bluffing, considered as peer biases, can be the subject of imitation14. The key element of the proposed model is that both overconfidence and bluffing can evolve simultaneously, and both influence a player’s success in obtaining the desired resource. Furthermore, as another crucial point of our model, we suppose that uncovered bluffing destroys the reputation of the player concerned. Accordingly, that player’s overconfidence and bluffing intensities fall to the minimal levels available in the actual population.

By using this simple concept, we find that the general bluffing level always evolves to a higher level than overconfidence. The application of sanctions, when positive values of system bias provoke more conflicts between competitors, lowers the overconfidence and bluffing levels remarkably. Beyond these observations we pay special attention to the possible consequences of interaction topology. It is well known that the spatial structure of the interaction graph can significantly influence the evolutionary outcome of competing strategies in social dilemmas41,42,43,44,45,46,47,48,49,50,51,52,53. Motivated by this fact, we test different representative topologies and explore their consequences for the evolution of overconfidence and bluffing. We find that heterogeneity may boost bluffing and facilitate punishment of individuals’ overconfidence, while increasing the number of available neighbors of each player on homogeneous networks triggers a similar effect.

Results

We start by presenting the stationary overconfidence level fO and bluffing level fB as a function of the resource-to-cost ratio r/c, obtained on a square lattice, as shown in Fig. 1(a). The figure suggests that increasing r/c does not noticeably change fO, but decreases fB, especially when the system bias δ is relatively large. Raising δ also significantly reduces the overconfidence level fO for a moderate punishment probability (p = 0.5). Note that positive values of δ induce extra conflicts and thus boost the chances of centralized sanctions. It seems that the value of r/c has little impact on the stabilization of overconfidence, regardless of whether punishment is rare or frequent. Meanwhile, boasting behavior (fB) slightly decreases as r/c increases. Importantly, the results for the regular random graph with k = 4 are in accordance with those for the translation-invariant square lattice. Thus it seems that the structure of interactions does not play a prominent role as long as the average degrees k are identical. The value of k, however, could play a decisive role in the evolution of the deception profile. To explore this effect, we investigate the impact of r/c on fO and fB under extreme conditions (p = 1, δ = 1) on homogeneous networks with different k values (k = 4, 8, 16). As shown in Fig. 1(b), overconfidence almost goes extinct irrespective of the values of r/c and k, showing that sufficient sanctions can effectively reduce the general level of overconfidence. Meanwhile, fB drops sharply to a minimum value as r/c increases when k = 4, in contrast to the larger degrees k = 8 and k = 16. In other words, having more available neighbors partially offsets the effect of punishment on boasters.

Figure 1

Stable overconfidence level fO and bluffing level fB as a function of resource-to-cost ratio r/c for different values of p and δ on homogeneous networks as indicated in the legends.

Data presented in Panel (a) are obtained on square lattices with periodic boundary conditions, while results depicted in Panel (b) are obtained on regular random graphs with different values of degree (k = 4, 8, 16) when optimal punishment is applied (p = 1, δ = 1). Other parameters: .

We next evaluate the impact of the punishment probability p and the system bias δ on the general overconfidence level fO and bluffing level fB (see Fig. 2). Besides homogeneous networks, summarized in Fig. 2(a,b), we also explored the possible impact of interaction heterogeneity by considering BA scale-free networks, shown in Fig. 2(c). To avoid additional effects we used the same average degree as for the random graph in Fig. 2(a). It can be observed that, at any given value of δ, increasing the punishment rate p slightly reduces both fO and fB. Meanwhile, for any given p, both fO and fB drop monotonically with δ, signalling that δ plays a decisive role in restraining deceptive behaviour (both overconfidence and bluffing). This behaviour is based on the fact that a large δ ensures frequent conflicts between competing players, which reveal their real abilities. In the other extreme case, negative values of δ inhibit conflicts, which results in prompt fixation in a deception profile of high overconfidence and high bluffing (this case is not shown in the figures). Moreover, another common trait of the color maps, independent of the applied topology, is that fB always evolves to a higher level than the corresponding fO, highlighting that natural selection favors a higher level of bluffing than of overconfidence, other factors being equal. Furthermore, the comparison of Fig. 2(a,c) illustrates that network heterogeneity can noticeably elevate the average bluffing intensity fB. It also illustrates that the heterogeneity of the interaction topology helps to restrain overconfidence for relatively large δ values. Interestingly, increasing k on homogeneous networks is able to lift the bluffing level fB while the overconfidence level fO is slightly reduced (see also Fig. 1(b)).

Figure 2

Color maps depicting the overconfidence level fO (left column) and the bluffing level fB (right column) on the punishment probability (p) - system bias (δ) plane.

Data presented in Panel (a) are obtained on a regular random graph with k = 4 and those in Panel (b) on a regular random graph with k = 8, while results depicted in Panel (c) are obtained on a BA scale-free network with the same average degree as in Panel (a). Note that δ < 0 immediately leads to fO → 1 and fB → 1 regardless of the applied topology (not shown). Other parameters: r/c = 2.5, .

To better understand the possible influence of the sanctioning mechanism on the evolution of the deception profile (α, β), we monitor the time evolution of α and β values on a square lattice without and with punishment (shown in Fig. 3(a) and in Fig. 3(b), respectively). Figure 3(a) shows how the probability distribution f(α, β) of (α, β) pairs evolves in time in the absence of punishment, when only imitation of deception profiles is possible. It can be observed that the small β values die out first, signalling that boasting is most favored by natural selection. Later, when only large β values are present, those players who apply higher α values become more successful. As a result, the whole population becomes trapped in a large (α, β) pair after sufficiently long relaxation (t = 100000 MC steps). In fact, once fixation occurs the evolutionary process stops. Here, fO and fB can be determined by averaging over the final states that emerge from different initial conditions. We conclude that a high α–high β combination survives when only imitation exists, which is in accordance with our previous observations14. However, fixation never happens when sanctions shape the evolution (see Fig. 3(b)). In the early stage almost half of the population is punished, hence low α–low β combinations form the majority of the f(α, β) distribution. Later, as time passes, a dynamic balance emerges between low α–low β combinations and moderate α–high β pairs. The specific position of the latter depends on the actual values of the δ and p parameters. In general, the punishment plays a “shunting” role here, undermining the stabilization of overconfidence and bluffing in the whole population. Importantly, these results hold for any homogeneous network besides the square lattice. For strongly heterogeneous networks, sometimes more than one α − β pair can survive around strong hubs even without punishment, which is in agreement with related works where other player-specific profiles evolved41,48.

Figure 3

Time evolution of the α − β profile, as obtained on square lattices (a) without punishment (p = 0) and (b) with punishment (p = 1). From top left to bottom right we present the distribution of (α, β) pairs at different MC steps, as indicated. The comparison illustrates that punishment undermines the fixation of overconfidence and bluffing. Other parameters: , (a) r/c = 3, δ = 0; (b) r/c = 3, δ = 1.

After realizing the significant impact of sanctions on the evolution of the deception profile, we are next interested in the targets of such punishment. More precisely, we ask whether it is the deception profiles of the truly inferior players that are minimized on homogeneous networks with different k values. For this reason we measure separately the average real capability of those players who are punished and of those who are not. The ratio of these averages is denoted by Rability. Similarly, we measure the average payoff of the two subclasses, and their ratio is denoted by Rpayoff. These ratios are depicted in Fig. 4(a) for different random regular networks, where we gradually increase the degree k. Rability < 1 indicates that, on average, players having lower real abilities γ are punished more frequently. At the same time, Rpayoff < 1 highlights that this small-γ group benefits less than its higher-ability opponents. As the degree of nodes increases, both Rability and Rpayoff rise unambiguously, showing that additional connections narrow the real-capability gap and the payoff gap between the punished players and those who are not punished. In other words, punishment is directed principally towards those who are really weak, but this selective impact gradually weakens as each player has more neighbors. Furthermore, for a deeper insight, it is worth studying the influence of real capability on the evolution of overconfidence and bluffing. Note that the real ability γ of each player remains unchanged during updating. We denote by RO([a, b]) the ratio of the average overconfidence level of those individuals whose γ values are in the [a, b] interval to that of the whole population. For simplicity, we divide the [0, γmax] interval into 10 subclasses. Similarly, RB([a, b]) denotes the ratio of the bluffing level for the same subpopulation. Our results for the k = 4 random regular graph are summarized in Fig. 4(b). The plot clearly suggests that both RO and RB ascend with γ and exceed 1 once γ > 0.5. Note that homogeneous networks with other k values show a similar tendency. Thus we conclude that players with high ability are inclined to evolve to a higher state of both overconfidence and bluffing because they have a higher chance to collect the resource without conflict. Furthermore, if conflict is inevitable and competitors have to reveal their real abilities, then these players still have a higher chance to win.
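The ratios used in this analysis can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the code used for the figures: the function names `punished_ratio` and `binned_ratio`, the list-based data layout and the handling of empty bins are all hypothetical choices.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def punished_ratio(values, punished):
    """Ratio of the average of `values` over punished players to the
    average over non-punished players (R_ability or R_payoff).

    Assumes both groups are non-empty and the second average is non-zero.
    """
    hit = [v for v, flag in zip(values, punished) if flag]
    spared = [v for v, flag in zip(values, punished) if not flag]
    return mean(hit) / mean(spared)


def binned_ratio(gamma, level, gamma_max=1.0, bins=10):
    """R_O or R_B per ability interval: the average of `level` among players
    whose gamma falls in each of `bins` equal intervals of [0, gamma_max],
    divided by the population average. Empty intervals give None
    (an assumption made for this sketch).
    """
    overall = mean(level)
    groups = [[] for _ in range(bins)]
    for g, x in zip(gamma, level):
        # gamma == gamma_max is placed in the last interval
        index = min(int(g / gamma_max * bins), bins - 1)
        groups[index].append(x)
    return [mean(group) / overall if group else None for group in groups]


# Toy data: the two weakest players were punished.
abilities = [0.2, 0.4, 0.6, 0.8]
was_punished = [True, True, False, False]
r_ability = punished_ratio(abilities, was_punished)  # below 1 for this data
```

A value of `punished_ratio` below 1 corresponds to the situation described above, where punished players are on average weaker than those who escape punishment.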

Figure 4

Some representative ratios plotted in histogram forms.

Panel (a) depicts the Rability and Rpayoff ratios as a function of degree k on regular random networks. Here Rability (Rpayoff) represents the ratio of the average real capability (the average payoff) of those individuals who are punished to that of those who are not punished. Panel (b) depicts RO and RB as a function of the real ability interval on regular random networks with k = 4. Here RO denotes the ratio of the average overconfidence level of individuals whose real capability falls in the corresponding interval to fO. In a similar fashion, RB represents the ratio of the average bluffing level of individuals whose real capability falls in the corresponding interval to fB. Other parameters: p = 1, δ = 0.8, .

Lastly, it is instructive to investigate the impact of the upper limits αmax and γmax on the evolution of fO and fB. With γmax = βmax = 1, αmax > 1 means that excessive overconfidence intensity is allowed for competitors. In contrast, γmax > 1 with αmax = βmax = 1 implies that the real abilities of players are significantly higher than the evolving α or β values. We note that βmax > 1 is not taken into consideration, because extravagant boasting could easily be recognized from real facts. For appropriate comparison, fO is normalized when αmax > 1 is applied. As demonstrated in Fig. 5(a), the possibility of sanctioning results in a drastic reduction of the normalized overconfidence level as αmax is increased. This suggests that punishment can effectively restrain excessive overconfidence, but is unable to decrease the bluffing level significantly. However, without punishment (p = 0), raising αmax gives rise to intensive conflicts that help competitors to recognize others’ real capabilities. Therefore, the normalized fO and fB decrease monotonically with αmax and finally converge to 0.5, which equals the initial value of the average bluffing intensity. We stress that the results presented in Fig. 5(a) are robust and remain valid if we use other interaction topologies. Increasing γmax drives the evolution toward “neutral drift” because peer biases, such as overconfidence and bluffing, become of secondary importance in resource competitions when real abilities dominate. Importantly, however, fO and fB may fluctuate heavily in heterogeneous networks, showing that the existence of strong hubs might significantly influence the evolution of both overconfidence and bluffing.

Figure 5

Stationary overconfidence level fO and bluffing level fB as a function of the upper limit of overconfidence intensity αmax (panel (a)) and of the upper limit of real capability γmax (panel (b)). Panel (a) depicts the normalized overconfidence level and the bluffing level fB as a function of αmax without punishment (p = 0) and with punishment (p = 0.3) on square lattices, where . Panel (b) depicts fO and fB as a function of γmax on the square lattice and the BA network when p = 0.9. Error bars indicate the standard deviations, which are almost invisible for the homogeneous network. Other parameters: δ = 0.2, r/c = 2.

Discussion

In summary, we have investigated how overconfidence and bluffing coevolve within the framework of a resource competition game. It is a well-recognized fact that when confidence is relatively high the whole population easily falls victim to overconfidence, which is considered by some psychologists to be the most “pervasive and potentially catastrophic” of all the cognitive biases1,10. Counterintuitively, this “erroneous” psychology can maximize individual fitness in many situations, leading to its prosperity in human society. Meanwhile, the existence of bluffing behavior, which sometimes cannot be detected, usually leads to ambiguity in one’s perception of others’ real ability. Our previous study highlighted that bluffing promotes overconfidence and that both stabilize at a high level when evolution is limited to imitation, without the chance to reveal competitors’ real abilities14. However, the ability to develop and enforce social norms is probably one of the most characteristic features of the human species16. Motivated by this fact, we propose an evolutionary model which combines a sanction mechanism with the celebrated rule of “imitating the better”54. Punishment here, instead of reducing individuals’ real income, is applied only to their deceptive behaviors, including both self-deception (overconfidence) and other-deception (bluffing). It is a key point of our model that these two mechanisms, which may determine a player’s success, can coevolve. Furthermore, besides the deception profile, the system bias describing the group inclination towards extra conflicts is also considered. Accordingly, system bias can be treated as an integral effect, caused by all the other factors, that stimulates conflicts (δ > 0) or inhibits conflicts (δ < 0) between competitors. In addition, punishment is not certain to occur, but happens with probability p. Lastly, we stress that we have tested different interaction topologies to explore the possible consequences of a structured population. All these details make our model more realistic.

Our extended model gives deeper insight into previous findings14. As shown in Fig. 2, overconfidence and bluffing follow essentially the same trend irrespective of p, δ and topological properties. This is in accordance with the previous observation that bluffing promotes overconfidence. There is, however, a significant difference when both sides of deception can coevolve. Namely, boasting seems more stable than the fatal psychology of overconfidence because individuals can take advantage of bluffing immediately. As a consequence, eliminating boasting behavior requires a more intensive sanction mechanism. We also find that increasing the heterogeneity or the average degree of the interaction network significantly promotes bluffing and simultaneously increases the efficiency of adequate punishment (when p and δ are large) against overconfident behavior. More importantly, this third-party punishment prominently limits overconfidence of excessive intensity. Intriguingly, the high capability of an elite individual might induce a high level of deception profile, which stems from the fact that elites hardly ever fail in conflicts.

In conclusion, to better understand the intricate relation between overconfidence and bluffing, we have proposed a more realistic model in which both components of the individual deception profile coevolve. Overall, both social norms and topological properties of interaction networks have a substantial influence on the evolution of these “peer biases”. We hope that these observations will motivate further research aimed at promoting our comprehension of the evolution of these “erroneous” but sometimes meaningful inclinations.

Methods

The traditional setup of an evolutionary game assumes N players occupying the vertices of an interaction graph. Our basic model is a resource competition game (RCG) in which neighbors compete for resources and their success is based on how convincingly they claim them. Without loss of generality, an individual i is characterized by a time-independent real capability γi and by an evolving overconfidence intensity αi and bluffing intensity βi. Here γmax, αmax and βmax represent the upper limits of the corresponding properties in the whole population. Unless stated otherwise, γmax = αmax = βmax = 1. While the real capability γi is a fixed and inalienable feature of each player, αi represents the actual overconfidence state (OS), a perception error about self-ability. Similarly, βi characterizes the bluffing state (BS) of the player, which helps to over-represent abilities towards competitors. In particular, i believes he/she owns a “self-perceived capability” ki as:

while his/her “displaying capability” mi is observed as:

Suppose a resource r is potentially available to neighboring individuals who may claim it. If neither of them claims it, the resource remains unused. If only one individual makes a claim, then it acquires the resource and gains fitness r while the other gains nothing. When players i and j both claim this resource, an RCG takes place. In the latter case each individual pays a cost c due to the conflict between them, and the one who has the higher real capability acquires the resource. In this model, the recognition ability of each player is also influenced by a uniform system bias δ, which allows us to control the intensity of conflicts between competitors. Summing up, a player i facing player j gains a payoff Pij that can be calculated as follows:

  1. If ki > mj − δ and kj < mi − δ, player i claims but player j does not, thus Pij = r.

  2. If ki < mj − δ, player i will not claim and remains empty handed, Pij = 0.

  3. If ki > mj − δ and kj > mi − δ, a conflict emerges between players i and j when they have to reveal their real capabilities which determine what they get: If γi > γj, Pij = r − c; If γi < γj, Pij = −c.

Here the parameter δ represents a uniform group inclination regarding how to handle possible conflicts: for δ > 0 group members are motivated to “open their cards” impulsively and bravely, and thus more conflicts take place. For δ < 0, however, conflicts are avoided because all players in the group are excessively cautious.
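The three payoff rules and the role of δ can be sketched in Python as follows. This is a minimal illustration rather than the authors’ code: the function name `pair_payoff` and its argument names are hypothetical, the self-perceived capabilities (k_i, k_j) and displayed capabilities (m_i, m_j) are passed in as plain numbers, and ties, which the model excludes, are treated here as “no claim” or “loss” purely by assumption.

```python
def pair_payoff(k_i, m_i, gamma_i, k_j, m_j, gamma_j, r, c, delta):
    """Payoff P_ij of player i when facing neighbor j.

    k_*     : self-perceived capabilities
    m_*     : displayed capabilities
    gamma_* : real capabilities, revealed only in a conflict
    r, c    : resource value and cost of conflict
    delta   : uniform system bias (delta > 0 encourages claims)
    """
    i_claims = k_i > m_j - delta
    j_claims = k_j > m_i - delta

    if not i_claims:
        # Rule 2: player i does not claim and gets nothing.
        return 0.0
    if not j_claims:
        # Rule 1: only player i claims and takes the resource.
        return r
    # Rule 3: both claim, a conflict reveals the real capabilities.
    # Equal gammas are excluded by the model; treated as a loss here.
    return r - c if gamma_i > gamma_j else -c


# Same pair of players, without and with a positive system bias.
no_bias = pair_payoff(0.8, 0.8, 0.6, 0.7, 0.7, 0.5, r=2.0, c=1.0, delta=0.0)
with_bias = pair_payoff(0.8, 0.8, 0.6, 0.7, 0.7, 0.5, r=2.0, c=1.0, delta=0.2)
```

In this toy pair, raising `delta` lowers the threshold that the weaker player must exceed, so an encounter that is settled without conflict at δ = 0 turns into a conflict at δ = 0.2.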

Initially each player i is assigned random γi, αi and βi values. The case in which two values are equal is not taken into consideration. In stark contrast to our preliminary work14, in the extended model both αi and βi can coevolve, and both dramatically influence a player’s success in resource competition. During an elementary Monte Carlo (MC) step a randomly selected player i collects its payoff Pi by playing the RCG with all of its neighbors, whose number is given by the degree of player i in the interaction graph. The total payoff of player i is

$$P_i = \sum_{j \in \Omega(i)} P_{ij},$$

where Ω(i) represents all players in i’s neighborhood. Subsequently, a randomly chosen neighbor j acquires its payoff Pj in a similar way.
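As a small illustration of this accumulation step, the total payoff is simply the sum of the pairwise payoffs over the neighborhood. The sketch below assumes a hypothetical adjacency mapping `neighbors` and a hypothetical callable `pair_payoff(i, j)` that returns Pij; neither name comes from the original work.

```python
from typing import Callable, Dict, List


def total_payoff(i: int,
                 neighbors: Dict[int, List[int]],
                 pair_payoff: Callable[[int, int], float]) -> float:
    """Sum the pairwise payoffs P_ij of player i over all j in Omega(i)."""
    return sum(pair_payoff(i, j) for j in neighbors[i])


# Toy example: player 0 has two neighbors, with pairwise payoffs given by a table.
toy_neighbors = {0: [1, 2], 1: [0], 2: [0]}
toy_table = {(0, 1): 2.0, (0, 2): -1.0, (1, 0): 0.0, (2, 0): 1.0}
toy_total = total_payoff(0, toy_neighbors, lambda i, j: toy_table[(i, j)])
```

Keeping the pairwise rule behind a callable lets the same accumulation be reused on any interaction graph, since only the adjacency mapping changes.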

As we noted, a crucial point of the evolution is that players may change their deception profile to collect more resources. In particular, if a player i loses a conflict against player j, then his/her extreme overconfidence and bluffing levels are revealed, hence player i is punished with probability p. As a result, the (αi, βi) values are reduced to the minimum levels of the whole population. Otherwise, player i adopts the deception profile of a randomly selected neighbor j with the probability W = W(Pj − Pi). And thus

$$(\alpha_i, \beta_i) = \begin{cases} (\varepsilon_\alpha, \varepsilon_\beta), & \text{if player } i \text{ is punished}, \\ (\alpha_j, \beta_j), & \text{with probability } W = \dfrac{1}{1 + \exp[(P_i - P_j)/K]}, \end{cases}$$

where εα and εβ represent the minimum overconfidence and bluffing intensity respectively. Parameter K characterizes the level of uncertainty in deception profile adoption55. Without loss of generality we use K = 0.1, but qualitatively similar results can be obtained for other K values. Importantly, since the profile consists of two parameters, two independent random numbers are drawn to enable uncorrelated imitation of αi and βi values, as it was suggested in ref. 48.
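The update of a deception profile can be sketched as below. This is a hedged illustration, not the authors’ implementation: it assumes that W is the Fermi function commonly used for “imitating the better”, and the names `fermi`, `update_profile`, `lost_conflict` and `eps` are hypothetical. How the outcome of a conflict is tracked is also an assumption of this sketch.

```python
import math
import random


def fermi(p_i, p_j, K=0.1):
    """Assumed adoption probability W = 1 / (1 + exp((P_i - P_j) / K))."""
    x = (p_i - p_j) / K
    if x > 700.0:
        # exp(x) would overflow; the probability is effectively zero.
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


def update_profile(profile_i, profile_j, p_i, p_j, lost_conflict,
                   p, eps, K=0.1, rng=random):
    """Return the new (alpha, beta) of player i.

    profile_* : (alpha, beta) tuples of player i and of neighbor j
    p_i, p_j  : total payoffs of the two players
    lost_conflict : True if i lost a conflict (assumed to be tracked elsewhere)
    p         : punishment probability
    eps       : (eps_alpha, eps_beta), the minimum intensities
    """
    if lost_conflict and rng.random() < p:
        # Punishment: the profile drops to the minimum levels.
        return eps

    alpha_i, beta_i = profile_i
    w = fermi(p_i, p_j, K)
    # Two independent random numbers give uncorrelated imitation
    # of the alpha and beta values.
    if rng.random() < w:
        alpha_i = profile_j[0]
    if rng.random() < w:
        beta_i = profile_j[1]
    return (alpha_i, beta_i)
```

With a small K the imitation is nearly deterministic: a neighbor with a clearly higher payoff is almost always copied, while a clearly less successful neighbor is almost never copied.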

The presented simulation results were obtained using different interaction graphs, such as the square lattice with periodic boundary conditions, regular random graphs with different degrees and the Barabási-Albert (BA) scale-free graph56. The latter serves to explore the possible consequences of heterogeneity. In accordance with the random sequential update, each full MC step, which consists of N elementary steps, gives every player a chance once on average to update its deception profile. The typical system size is N = 10^4 to 10^5 nodes, and the stationary frequencies are determined by averaging over 10^4 MC generations in the stationary state after sufficiently long relaxation times. The stationary state is considered to be reached when the averages of the overconfidence level fO (the stable average value of α) and of the bluffing level fB (the stable average value of β) no longer change in time. We have averaged the final outcome over 50 independent initial conditions.
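As an illustration of the simplest of these topologies, the following sketch builds the neighbor lists of a square lattice with periodic boundary conditions and computes the population averages that define fO and fB. The function names and the site-indexing scheme are assumptions made for this example; the relaxation and averaging over MC steps are not shown.

```python
def square_lattice_neighbors(L):
    """Neighbor lists of an L x L square lattice with periodic boundaries.

    Sites are indexed as site = x * L + y; every site has degree k = 4
    (distinct neighbors require L >= 3).
    """
    neighbors = []
    for site in range(L * L):
        x, y = divmod(site, L)
        neighbors.append([
            ((x + 1) % L) * L + y,   # next row, wrapping around
            ((x - 1) % L) * L + y,   # previous row
            x * L + (y + 1) % L,     # next column
            x * L + (y - 1) % L,     # previous column
        ])
    return neighbors


def deception_levels(alpha, beta):
    """Population averages of alpha and beta, i.e. f_O and f_B at one instant."""
    return sum(alpha) / len(alpha), sum(beta) / len(beta)


lattice = square_lattice_neighbors(3)
f_O, f_B = deception_levels([0.25, 0.75], [0.5, 1.0])
```

Because of the wrap-around, every site has exactly four neighbors, so there are no boundary sites with a reduced degree.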

Additional Information

How to cite this article: Li, K. et al. The coevolution of overconfidence and bluffing in the resource competition game. Sci. Rep. 6, 21104; doi: 10.1038/srep21104 (2016).