Fast deliberation is related to unconditional behaviour in iterated Prisoners’ Dilemma experiments

Montero-Porras, Eladio; Lenaerts, Tom; Gallotti, Riccardo; Grujic, Jelena

doi:10.1038/s41598-022-24849-4

Article
Open access
Published: 24 November 2022

Scientific Reports volume 12, Article number: 20287 (2022)

Abstract

People have different preferences for what they allocate for themselves and what they allocate to others in social dilemmas. These differences result from contextual reasons, intrinsic values, and social expectations. What is still an area of debate is whether these differences can be estimated from differences in each individual’s deliberation process. In this work, we analyse the participants’ reaction times in three different experiments of the Iterated Prisoner’s Dilemma with the Drift Diffusion Model, which links response times to the perceived difficulty of the decision task, the rate of accumulation of information (deliberation), and the intuitive attitudes towards the choices. The correlation between these results and the attitude of the participants towards the allocation of resources is then determined. We observe that individuals who allocated resources equally are correlated with more deliberation than highly cooperative or highly defective participants, who accumulate evidence more quickly to reach a decision. Also, the evidence collection is faster in fixed neighbour settings than in shuffled ones. Consequently, fast decisions do not distinguish cooperators from defectors in these experiments, but appear to separate those that are more reactive to the behaviour of others from those that act categorically.

Introduction

Experiments have shown that people differ in their motivations to cooperate (or not) in social settings^1,2,3 and that these differences are shaped by, on the one hand, their inherent characteristics such as gender and age^4,5 and, on the other hand, the context they find themselves in, such as the presence of certain social norms or institutions^6,7,8, their network of contacts^9,10 and repetition of those interactions^11,12, among many other factors. It has furthermore been shown that people have different perceptions of what is fair and what is not: people may be inequality averse, hence interested in the alignment between the others’ actions and their payoffs with respect to their own^1,13, while fairness differs also in how people value social welfare and concerns for efficiency in resource distribution¹⁴.

Why these differences appear among humans, and what they mean, remains an area of active research. One mechanism proposed by researchers to describe individual heterogeneity in cooperation is Social Value Orientation (SVO)^15,16, also known as other-regarding preferences^2,17 or interdependent preferences for reciprocity¹⁸, among other names¹⁹. What these key concepts aim to measure is how humans perceive the importance of their own gains in relation to what others receive^1,20. This way, there may be individuals that strongly think in terms of others, require the group as a whole to benefit, or prefer equality^1,19. Others might prefer to receive more than their opponents or even out-compete them, i.e. individualistic/competitive SVO, selfish preferences in the context of other-regarding preferences, or materialists in interdependent preferences.

These individual differences in cooperation have been associated with differences in the time people need to act or make a choice, i.e. their response time (RT). RT has, for instance, been analysed in the Ultimatum Game^21,22 and the Prisoner’s Dilemma (PD) game²³. Previous works on the relationship between RT and SVO specifically, focused on the Public Goods Game (PGG)^24,25, Dictator Game, Ultimatum Game, and Trust Games (playing the role of the trustee)²⁶, concluding that highly cooperative and highly individualistic participants are faster to decide. The explanation for this result was associated with the level of conflict a person perceives when making a decision^{23,27,28,29,30,31}, rather than the competition between contrasting cognitive processes like deliberation or intuition³². Indeed, the use of behavioural or physical observations (such as RT) directly to explain mental processes remains unclear because this relationship might miss sources of variability in both the experiment and the data²⁹.

Nonetheless, in the field of psychology, researchers have tried to understand and model the cognitive processes behind decision-making and RT for decades³³. Progress has been made by linking neurosciences and behavioural economics^34,35,36, which provides new insights into how different variables are related to the deliberation process in value-based decision-making, such as RT³⁷. In traditional economic theory, it is assumed that people know exactly their preferences when choosing between two options when in reality, people deliberate over (noisy) subjective representations of these two options, which are encoded in neuronal firing rates³⁴. One of the models that has tried to explain the relationship between the cognitive process of discrete-choice tasks and RT is called Drift Diffusion Model (DDM)^33,38,39. DDM uses RT to model the cognitive processes of the participants in terms of initial bias (i.e. their preference for cooperation or defection), deliberation speed (i.e. how fast they collect evidence towards their preferred choice) and decision difficulty or carefulness³³.

DDM has been used ever since in experimental settings in behavioural economics to study human deliberation processes and the psychological mechanisms of loss-aversion⁴⁰, moral judgements⁴¹, inferring others’ preferences in binary decisions⁴², altruistic behaviour⁴³ and food choices⁴⁴, to cite a few. Mathematical decision models such as DDM have shown an accurate degree of generalization from one choice context to the other⁴⁴. Furthermore, it was shown in a networked iterated PD (IPD) context how the participants’ perceived difficulty to make the decision decreased while their speed to collect evidence to make the decision increased at each iteration⁴⁵. This work associated RT with the underlying cognitive process of deliberation and intuition, without making any assumptions about how fast the participants responded.

Here we show, through the use of DDM, that the cognitive processes of participants in a game-theoretical experiment are correlated to the heterogeneity in their predispositions and expectations to cooperate (or not). By linking the reaction times as modelled by the DDM with this heterogeneity, we are able to dig deeper into the individual cognitive differences in making decisions under different conditions. Specifically, the following two questions are addressed: (1) Are the predispositions to cooperate visible through the lens of DDM? (2) How do different decision contexts and game structures of the IPD play a role in our deliberation process? To answer these questions, we selected data from three different experiments with different network structures and payoff matrices: one where subjects play IPD in a pairwise setting, a second where they play the same game with multiple opponents in a network with a Moore neighbourhood setting (i.e., a square lattice with eight direct neighbours), and a third where participants play in a Von Neumann neighbourhood (i.e., a square lattice with four direct neighbours). These three experiments were selected because they all recorded RT, required the participants to make a binary choice, and allow us to study the impact of increasing interaction complexity on the RT of the participants. We inferred the participants’ cautiousness and speed to collect evidence throughout the games based on their DDM parameters. This way, the deliberation processes used to play these games are made explicit and can be associated with individual behavioural heterogeneity.

As these experiments were not designed to directly collect data about individual SVO, other-regarding, or interdependent preferences, we propose here a measure that may serve as a proxy, which we will refer to as the Relative Allocation angle ($RA^{\circ }$) measure. $RA^{\circ }$ represents the individual’s desired allocation towards self and others given the behaviour of their opponents in previous rounds (see “Methods”). $RA^{\circ }$ allows us to examine the individual preferences in the IPD. This measure assigns each individual an angle: mostly cooperative participants fall near $90^{\circ }$, mostly defective participants fall around $0^{\circ }$, and participants who behave conditionally fall close to $45^{\circ }$. We correlate these preferences with the cognitive parameters generated by the DDM, allowing us to investigate whether knowing $RA^{\circ }$ provides insights into the cognitive processes of a participant. We offer an interpretation of the results and discuss how they relate to fMRI and eye-tracking studies on human cooperation.

Results

Participants reciprocated on pairwise settings, defected unconditionally in network games

Through the $RA^{\circ }$ measure, each participant’s preference towards their own gains relative to their co-players’ gains can be visualised, as shown in Figs. 1, 2 and 3. In the pairwise experiments (PIPD), the participants cluster around the $45^{\circ }$ mark in the fixed partners experiment, as the chart at the right shows. This distribution indicates that in the fixed-partner setting (PIPD$_f$) subjects’ earnings appear to cluster around the equality marker. In shuffled partners (PIPD$_s$), the $RA^{\circ }$ distribution is much more spread out, revealing a higher heterogeneity in allocations; this difference is statistically significant (KS statistic $D = 0.25$, p-value = 0.02). Nonetheless, in both cases, there appears to be a trend towards equal allocations between self and others.

For the network experiment mNIPD, the distribution in both the polar plane and the cumulative density plot is shown in Fig. 2. There is a statistically significant difference (KS statistic $D = 0.21$, p-value = 0.001) between the distributions of the $RA^{\circ }$ scores in the fixed network setting mNIPD$_f$ and the shuffled network setting mNIPD$_s$. In both settings, many participants defected unconditionally, which places them at the bottom of the $RA^{\circ }$ distribution (near the zero degrees mark).

In the vnNIPD experiment, the density of $RA^{\circ }$ on the zero degrees mark (purely individualistic) appears to be 0. This may be due to the differences in the IPD payoff matrices described in the Methods (see Table 1). Both treatments in the vnNIPD experiments averaged $RA^{\circ }$ scores again near the equality marker. No significant difference is observed between the $RA^{\circ }$ distributions of both treatments (KS statistic $D = 0.12$, p-value = 0.62). This means that while the subjects knew the payoffs of their opponents in the vnNIPD$_i$ treatment, the knowledge of their actions was enough to make their decisions, as discussed in⁴⁸.
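The treatment comparisons above rely on the two-sample Kolmogorov-Smirnov (KS) statistic $D$, the largest gap between two empirical cumulative distributions. The sketch below shows how $D$ can be computed in plain Python. The function name and the sample angles are hypothetical and purely illustrative; they are not the experimental data, and the p-value computation is omitted.

```python
from bisect import bisect_right


def ks_two_sample_statistic(sample_a, sample_b):
    """Return D = max |F_a(x) - F_b(x)| over all observed values x."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    for x in a + b:
        # Empirical CDF: fraction of observations less than or equal to x.
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical RA angles in degrees, for illustration only.
fixed_angles = [40.0, 44.0, 45.0, 46.0, 50.0]
shuffled_angles = [5.0, 20.0, 45.0, 70.0, 85.0]
d_stat = ks_two_sample_statistic(fixed_angles, shuffled_angles)
```

A tightly clustered sample and a widely spread sample with the same centre still produce a non-zero $D$, which is why the test can separate the fixed and shuffled treatments even when both are centred near $45^{\circ }$.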

Evidence collection and perceived difficulty correlate with relative allocation

To examine the correlation between a participant’s deliberation process and their social preferences, we fitted the parameters a, v and z (decision threshold, drift speed and decision bias, respectively) of the subjects against their $RA^{\circ }$ angle. The next sections detail the results for each of these parameters.

Decision bias: intuitive position before deliberation matters more in pairwise experiments

In the pairwise experiments, as shown in Fig. 4A, there is a significant relationship between the decision bias and subjects’ $RA^{\circ }$ only in the PIPD$_s$ treatment (Spearman’s $\rho = 0.49$, p-value $< 0.001$): subjects whose deliberation process started below 0.5 also had a small $RA^{\circ }$, while subjects with a high $RA^{\circ }$ had a higher decision bias. This is the relation one would expect, suggesting that players who intuitively tend towards cooperation also allocate more resources to others. However, the same observation cannot be made for the fixed network experiment mNIPD$_f$, where there is a significant anti-correlation with decision bias (Spearman’s $\rho = -\,0.24$, p-value $= 0.001$), see Fig. 4B. A similar relationship was found in the mNIPD$_s$ treatment (Spearman’s $\rho = -\,0.22$, p-value $= 0.004$). This means that some of those with low $RA^{\circ }$ had a decision bias higher than 0.5, nearer to the cooperation boundary, but ended up defecting. This might be because, despite the initial intuition of each individual to cooperate, they had to deliberate more since they were in a setting with more opponents.

In the vnNIPD experiments, as shown in Fig. 4C, no significant correlation was found between $RA^{\circ }$ and decision bias in the vnNIPD$_i$ treatment (Spearman’s $\rho = 0.12$, p-value $= 0.91$). The same holds for the vnNIPD$_o$ treatment (Spearman’s $\rho = 0.19$, p-value $= 0.12$).
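The correlations reported in these sections are Spearman rank correlations, i.e. a Pearson correlation computed on ranks, which captures monotonic rather than strictly linear association. A minimal plain-Python sketch follows; the helper names are illustrative, ties receive average ranks, and no p-value is computed.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are zero-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))
```

For example, `spearman_rho([1, 2, 3, 4], [1, 8, 27, 64])` evaluates to 1 (up to floating-point error) even though the relationship is not linear, because only the ordering matters.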

Decision threshold: more individualistic participants consider the game as more difficult

As shown in Fig. 5A, there is a significant negative relationship between $RA^{\circ }$ and the decision threshold in the PIPD$_s$ experiment (Spearman’s $\rho = -\,0.40$, p-value $< 0.001$). This means that more selfish subjects perceived the experiment as more difficult: the lower their $RA^{\circ }$, the more careful they tended to be, gathering more evidence before committing to a decision. In PIPD$_f$, subjects around $45^{\circ }$ experienced the experiments in all ranges of difficulty, while those with low or high values of $RA^{\circ }$ stayed around the same level. No significant correlation between the threshold and $RA^{\circ }$ was found in PIPD$_f$ (Spearman’s $\rho = -\,0.15$, p-value $=0.29$).

In the mNIPD experiments, there is a significant negative relationship between $RA^{\circ }$ and their decision threshold in the mNIPD$_f$ (Spearman’s $\rho = -\,0.40$, p-value $< 0.001$) and mNIPD$_s$ (Spearman’s $\rho = -\,0.40$, p-value $< 0.001$), see Fig. 5B. This means that the subjects with low $RA^{\circ }$ (tendency for individualism) also experienced the experiments as more difficult than those with more cooperative tendencies.

Moreover, there is a significant difference in the decision threshold between the two treatments in mNIPD, where mNIPD$_f$ has a higher threshold than mNIPD$_s$ (KS statistic $D = 0.17$, p-value = 0.02). This difference could arise because participants first played mNIPD$_f$ in a fixed network setting, and were only afterwards put in a shuffled network for the mNIPD$_s$ treatment. It has been shown that participants use the first rounds to explore and learn the game⁴⁶, which increases the perceived difficulty. No other significant differences in decision threshold were found in other experiments.

The vnNIPD experiments (see Fig. 5C) also showed a negative relationship between the decision threshold and $RA^{\circ }$. The treatment with no information about opponents’ payoffs, vnNIPD$_o$, showed a negative correlation with the decision threshold (Spearman’s $\rho = -\,0.24$, p-value $= 0.03$), and the treatment with information, vnNIPD$_i$, showed a stronger one (Spearman’s $\rho = -\,0.30$, p-value $= 0.006$).

Drift speed: fast evidence collection is related to unconditional behaviour

As shown in Fig. 6A, the drift speed in the PIPD$_s$ experiment showed a strong positive relationship with $RA^{\circ }$ (Spearman’s $\rho = 0.84$, p-value $< 0.001$), and a weaker positive relationship was found in the PIPD$_f$ experiment (Spearman’s $\rho = 0.29$, p-value = 0.03).

A significant relationship is also shown in Fig. 6B, where in both treatments of mNIPD there is a high correlation between $RA^{\circ }$ and drift speed (Spearman’s $\rho = 0.77$, p-value $<0.001$ for the mNIPD$_f$ and Spearman’s $\rho = 0.81$, p-value $<0.001$ for the mNIPD$_s$ experiment).

Fig. 6C shows a similar situation. In the vnNIPD, both the information treatment vnNIPD$_i$ (Spearman’s $\rho = 0.69$, p-value $<0.001$) and the no-information treatment vnNIPD$_o$ (Spearman’s $\rho = 0.85$, p-value $<0.001$) showed a high positive correlation.

Also, there is a significant difference in average drift speed (v) between treatments in both PIPD and mNIPD. In PIPD$_f$, there is a positive average v ($\mu = 0.21, \sigma = 0.61$) when the partner is always the same and a negative average v ($\mu = -\,0.28, \sigma = 0.44$) for PIPD$_s$, meaning that subjects in these treatments collected evidence and drifted towards different directions, i.e. cooperation and defection, respectively (KS statistic $D = 0.43$, p-value $< 0.001$). In mNIPD, both treatments drift towards defection (negative average v), but the drift in the last part of the experiment with a shuffled network ($\mu = -\,0.39, \sigma = 0.43$) is significantly faster (KS statistic $D = 0.31$, p-value $< 0.001$) than in the first part of the experiment with a fixed network ($\mu = -\,0.24, \sigma = 0.28$). No significant difference was found between the treatments in vnNIPD.

This shows that for those with either low or high scores of $RA^{\circ }$, the drift speed is larger in magnitude towards their preferred strategy: defection, represented by negative drift speed; or cooperation, represented by positive drift speed. In contrast, subjects around $RA^{\circ } = 45^{\circ }$, i.e. subjects who responded more conditionally to their past interactions, had a drift speed near zero. This means that evidence collection for participants closer to the equality marker is slower than for those who are further from $RA^{\circ } = 45^{\circ }$, i.e. participants who mostly defect or mostly cooperate. Moreover, having a fixed partner or network resulted in faster evidence collection compared with having a random co-player or network of co-players.

To validate the results in general, we fitted a linear regression model per treatment and DDM parameter. The details of these models can be found in Supplementary Table S2. In general, the drift speed v is the variable that best explains the variance in $RA^{\circ }$, as it has a significant coefficient in the linear regression and the highest adjusted R-squared (last column of the table). In contrast, the decision bias z was not a significant coefficient in some treatments such as PIPD$_f$ and vnNIPD$_i$. The decision threshold a was significant but less able to explain the variance in $RA^{\circ }$.
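As an illustration of this validation step, the sketch below fits an ordinary least squares line with a single predictor and reports the adjusted R-squared, $1 - (1 - R^2)(n - 1)/(n - p - 1)$ with $p = 1$. It assumes that $RA^{\circ }$ is the response and one DDM parameter is the predictor; the exact model specification is given in Supplementary Table S2, so this is a sketch under that assumption and not the authors' code. The function name is hypothetical.

```python
def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x and return (slope, intercept, r2, adj_r2)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need equal-length sequences with at least 3 points")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("predictor and response must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares of the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R-squared with one predictor (p = 1).
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
    return slope, intercept, r2, adj_r2
```

Comparing the adjusted R-squared across separate fits for a, v and z (one fit per parameter and treatment) is one way to see which parameter accounts for most of the variance in $RA^{\circ }$.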

Lastly, Gallotti et al.⁴⁵ developed a measure of how much a decision is influenced by the a-priori intuition (represented in the DDM by the parameter z), where zero represents a decision with no deliberation, based solely on intuition, and one represents a decision based solely on deliberation. This measure was referred to as “rationality” (R). Supplementary Fig. S2 shows a significant negative correlation between $RA^{\circ }$ and the rationality measure R, except for the fixed partners treatment PIPD$_f$. This result means that while there are some participants with high $RA^{\circ }$ and high rationality R scores (upper right corner of each figure), most of the individualistic participants relied more on their deliberation than on their intuition. This also aligns with the result in the previous section regarding the decision threshold, where the individualistic participants perceived the task as more difficult.

Discussion

Our findings show that individual preferences in resource allocation, as captured by the $RA^{\circ }$ proxy, are related to the way subjects arrive at and make decisions in multiple contexts and game structures, demonstrating again that DDM offers a richer analysis of response times.

The DDM decision bias, i.e. the starting point in the decision process that reflects the predisposition of participants, reveals a significant difference between the two pairwise treatments, suggesting that participants that played in the fixed-partners setting were on average more predisposed to cooperate than those in the shuffled partners’ treatment. Moreover, a positive correlation between the average decision bias and the $RA^{\circ }$ score was found in shuffled pairwise experiments, which contrasts with the negative one in the network experiments. In addition, in all experiments (except for PIPD$_f$) there is a significant negative correlation between $RA^{\circ }$ and the rationality score R. This could mean that individualistic behaviour (lower $RA^{\circ }$) was related to a more deliberative decision-making process, as opposed to relying on the intuitive position.

Concerning the decision threshold, which measures the carefulness and perceived difficulty of the decision, participants in the mNIPD experiment were more careful in the first treatment (mNIPD$_f$) than in the one that took place later in the game (mNIPD$_s$). This might be due to the so-called “learning phase” in iterated games, where the first rounds are used to explore and learn the game⁴⁶. Furthermore, we found that in settings such as PIPD$_s$ and the network experiments (mNIPD and vnNIPD), there is a significant negative correlation between $RA^{\circ }$ and the decision threshold, meaning that the lower a participant’s $RA^{\circ }$, the more difficult they perceive the game. A higher decision threshold makes the decision boundaries wider, hence the decision could potentially take longer.

Finally, there is also a significant positive correlation between the $RA^{\circ }$ and the drift speed in the shuffled partners and network experiments. Those with more extreme strategies (unconditionally cooperative, all-C, or unconditionally defective, all-D) apparently accumulated evidence towards their preference much faster than those in the middle of the $RA^{\circ }$ spectrum, whose allocated gains are almost equal to those of the others and who accumulated evidence more slowly. This suggests that those with unconditional strategies have a stronger preference for those options, although in some experiments, those who defected found the decision to do so harder than those that cooperated unconditionally. Also, we found differences between treatments in PIPD and mNIPD. Specifically, those in the shuffled partners treatment drifted faster towards defection, while the fixed-partners players drifted towards cooperation. In mNIPD, subjects drifted faster in the shuffled network at the end of the game: even though they started with a fixed network, they ended up with a stronger preference for defection.

As mentioned in the introduction, previous works have linked neural activity with social values using techniques like fMRI^49,50. Here, it is possible to look at the same problem through the different lens provided by the DDM analysis. According to Emonds et al., pro-self (individualistic) strategies are driven by calculation, which can be compared with our finding in the section related to the decision threshold: in both PIPD and mNIPD experiments the threshold was negatively correlated with $RA^{\circ }$ scores, meaning that participants with low $RA^{\circ }$ had a higher difficulty in arriving at a decision.

Our work also agrees with Fiedler et al. when it comes to the relationship between $RA^{\circ }$ and evidence collection. In their work, eye movement was tracked to show how SVO affects subjects’ strategies. As shown in Fig. 6, drift speed was higher for subjects who mostly defected or mostly cooperated, meaning that evidence to arrive at a decision was accumulated faster⁵¹. In a similar work, Bieleke et al. tested information acquisition and its relationship with SVO and deliberation, and showed that less selfish individuals gathered more information about others and their payoffs⁵², reaching a conclusion similar to our work and Fiedler et al. Although we also found fast evidence collection (high absolute value of v) in individualistic participants, we nonetheless observed that they relied more on deliberation and less on their intuitive a-priori position than their non-selfish counterparts. Our results are also related to Capraro et al., who measured participants’ cognitive style with the Cognitive Reflection Test and showed that deliberation promoted individuals’ concern for social efficiency (getting the most payoff for the group overall), while egalitarian motivations were linked to intuitive responses⁵³.

The structures and strategic nature of the IPD game might have affected the $RA^{\circ }$ scores. For example, in PIPD$_f$, only the drift speed showed a significant correlation with the $RA^{\circ }$, meaning that the fixed opponent setting was a stronger influence on RT than the decisions under different conditions. Nevertheless, in other settings such as PIPD$_s$ and the network experiments, there were significant correlations between the $RA^{\circ }$ scores and the DDM parameters. Our findings are also in line with those of Andrighetto et al., who found that the relationship between response time and cooperation is moderated by SVO: highly pro-social participants were faster to cooperate and highly individualistic participants were faster to defect²⁵. The difference is that our work offers a much richer use of reaction times as a proxy to study the deliberation process. Since our results agree with other works on deliberation, RT and heterogeneity in cooperation, they suggest that those findings are valid and useful for future research in human cooperation.

For future work, it is important to test these hypotheses with other games and settings, and perhaps with more than the two options presented in the IPD. This is crucial since real-life human interactions involve more than one setup and more than two options. It is also of value to start applying these concepts to different games and types of agents, such as artificial agents that can react faster than any human could, enriching our understanding of how response times affect our own (and also others’) deliberation process.

One has to acknowledge that, to the best of our knowledge, DDM is still a new technique to combine with Game Theory experiments. Still, it presents inviting opportunities to analyse individual differences in decision-making: we can go from measuring reaction times to arguing about social values and deliberation processes, which benefits any researcher that desires to look deeper into the underlying processes of deciding in these games (arguably without the need of an fMRI machine or an eye-tracking sensor). Moreover, this work adds to the theory of how different we are, and to what extent, and complements the techniques reviewed here by showing how DDM can be useful for future research.

Methods

Experimental IPD data

The analysis performed in this work relies on three experimental datasets wherein participants played different setups of the IPD. In the first, participants played a pairwise IPD (PIPD)⁴⁶, whereas in the second they played the IPD in a network setting with a Moore neighbourhood (mNIPD), which is a square lattice with eight direct neighbours⁴⁷. In the third experiment, participants played in a network with a Von Neumann neighbourhood setting (vnNIPD)⁴⁸, which consists of a square lattice with four direct neighbours. We include different network topologies and IPD experiments with different payoff matrices to validate our hypotheses. Specific information related to the datasets used here is provided in Supplementary Table S1. All the data used in this article are publicly available, see “Data availability” section.
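The two lattice neighbourhoods can be made concrete with a short sketch. The Von Neumann neighbourhood contains the four orthogonally adjacent cells and the Moore neighbourhood adds the four diagonals. The function names are hypothetical, and the periodic (wrap-around) boundary is an assumption made for this illustration; the source does not state how lattice edges were handled.

```python
def neighbourhood_offsets(kind):
    """Return (row, column) offsets of the direct neighbours of a lattice cell."""
    if kind == "von_neumann":
        # Up, down, left, right: four direct neighbours.
        return [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if kind == "moore":
        # All surrounding cells including diagonals: eight direct neighbours.
        return [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)]
    raise ValueError(f"unknown neighbourhood kind: {kind!r}")


def neighbours(row, col, n_rows, n_cols, kind):
    """List neighbour coordinates, assuming periodic boundaries (an assumption)."""
    return [((row + dr) % n_rows, (col + dc) % n_cols)
            for dr, dc in neighbourhood_offsets(kind)]
```

With wrap-around, every player has the same number of co-players regardless of position, provided the lattice is at least three cells wide in each direction.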

We refer to the first experiment in⁴⁶ as fixed partners or PIPD$_f$ ($n = 58$) and the second as shuffled partners or PIPD$_s$ ($n = 96$) throughout the rest of the paper. Note that, from the mNIPD experiment⁴⁷, we consider the first fixed treatment (mNIPD$_f$ from now on, $n = 169$) and the following shuffled treatment (mNIPD$_s$, $n=169$). The second fixed treatment was not included because, being a fixed network just like mNIPD$_f$, it did not add anything substantial to the results shown here. Lastly, in the third experiment⁴⁸, participants played vnNIPD in two treatments: one where they were informed of their opponents’ payoffs in the previous round (vnNIPD$_i$, $n = 50$) and one where they were not informed (vnNIPD$_o$, $n = 59$). In all experiments used in this paper, subjects knew the outcome of their opponents’ actions in the previous round.

An additional difference between the datasets is that in the PIPD and vnNIPD the payoffs at each round were produced by a “strong” IPD where the following relations hold: $T>R>P>S$ and $2R>T+S$ (see Table 1). In the mNIPD, as can also be seen in Table 1, a weak dilemma was used, where the two relations mentioned above no longer hold.
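The two conditions that define the strong dilemma can be checked mechanically. The sketch below encodes them directly; the function name and the payoff values in the usage note are hypothetical and are not the entries of Table 1.

```python
def is_strong_dilemma(T, R, P, S):
    """Check the two strong Prisoner's Dilemma conditions from the text.

    T: temptation, R: reward, P: punishment, S: sucker's payoff.
    """
    ordered = T > R > P > S          # strict payoff ordering
    mutual_coop_best = 2 * R > T + S  # mutual cooperation beats alternating
    return ordered and mutual_coop_best
```

For instance, the hypothetical values `T=5, R=3, P=1, S=0` satisfy both conditions, whereas a matrix with `P == S` fails the strict ordering and so would not count as a strong dilemma under this definition.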

Table 1 Payoff matrices used in the three experiments. The top payoff matrix shows the payoff used in the PIPD. Participants were confronted with a strong PD where $T>R>P>S$ and $2R>T+S$. The middle payoff matrix used in the vnNIPD experiment is also a strong dilemma, while the bottom payoff matrix corresponds to a weak one, as both conditions $T>R>P>S$ and $2R>T+S$ are no longer satisfied.

Defining a proxy for social preferences

To measure the resource allocation of individuals, we took the response of the participants given the outcome of the previous round (which was known by participants) and averaged this over multiple rounds. We call this measure Relative Allocation angle or $RA^{\circ }$. The assumption that is made here is that a person’s allocated gains (as defined by the payoffs) to self and others, captured by $RA^{\circ }$, is correlated to a person’s social preferences.

To measure $RA^{\circ }$, we use the subjects’ planned allocation for themselves ($a_{self}$), and for others ($a_{other}$), i.e. whether they cooperate or not, given the context c of the decision, which is defined here as the number of co-players that cooperated in the last round played. This way, if we visualize them in a Cartesian plane as in Supplementary Fig. S1A, subjects end up situated at an angle from the origin of the plane. To account for different payoff matrices, we normalized the payoffs in all matrices, preserving the proportion of the R, S, T and P parameters, but limiting the range from 0 to 1. For more information about the $RA^{\circ }$ measure, motivation and examples, see the Supplementary Information.
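One way to bring different payoff matrices onto a common 0 to 1 scale is min-max rescaling, sketched below. This is an assumption about the normalisation: the text only states that the proportions of R, S, T and P are preserved and the range is limited to 0 to 1, and the exact procedure is in the Supplementary Information. The function name and example values are hypothetical.

```python
def normalise_payoffs(payoffs):
    """Rescale a payoff dictionary linearly so its values span [0, 1].

    Min-max rescaling is an assumed choice for this illustration; it keeps
    the ordering and the relative spacing of the T, R, P and S entries.
    """
    low = min(payoffs.values())
    high = max(payoffs.values())
    if high == low:
        raise ValueError("payoffs must not all be equal")
    return {name: (value - low) / (high - low)
            for name, value in payoffs.items()}


# Hypothetical payoff values, not taken from Table 1.
example = normalise_payoffs({"T": 5, "R": 3, "P": 1, "S": 0})
```

After rescaling, allocations computed from different experiments are expressed in the same units, so angles derived from them can be compared across treatments.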

We took the first 20 rounds of each experiment to compute the $RA^{\circ }$ metric, in order to have enough occurrences of each context and to be able to measure the participants’ preferences for cooperation (the results remain robust for a higher and lower number of rounds). By taking the mean over these rounds of ${\bar{a}}_{other}$ and ${\bar{a}}_{self}$, we can visualize where subjects ended up in a Cartesian plane, as shown in Supplementary Fig. S1. Equation (1) generates the angle from the origin (hence the $RA^{\circ }$ notation). This approach is equivalent to the one used in Murphy et al.’s Slider Measure for Social Value Orientation¹⁶.

$$RA^{\circ } = \arctan \left( \frac{{\bar{a}}_{other}}{{\bar{a}}_{self}} \right) \tag{1}$$

In terms of the IPD payoffs (see Table 1), the higher the $RA^{\circ }$, i.e. the nearer to $90^{\circ }$, the more cooperative the person was when confronted with defectors, consequently allocating more to others than to themselves. Cooperating when exposed to defectors leads to the biggest payoff advantage for the others over self. This corresponds to behaviour like “All-C”. An $RA^{\circ }$ near zero means that the person allocated more to themselves than to others, i.e. played defect when confronted with persistent cooperation. This behaviour can be considered “individualistic” and is similar to “All-D”. Those around $45^{\circ }$ acted conditionally on their opponents’ past behaviour, responding with defect or cooperate depending on the observed context and their own intentions. The net effect of the conditional response is an equal allocation of payoffs to self and others.
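As a minimal sketch of Eq. (1), the angle can be computed from the two mean allocations as follows. The function name and arguments are hypothetical, and the inputs are assumed to be the already averaged, normalized allocations ${\bar{a}}_{self}$ and ${\bar{a}}_{other}$. The sketch uses `math.atan2` rather than the arctangent of a ratio so that ${\bar{a}}_{self}=0$ yields $90^{\circ }$ instead of a division by zero; for non-negative inputs with ${\bar{a}}_{self}>0$ the two forms agree.

```python
import math


def ra_angle(mean_self, mean_other):
    """Return the Relative Allocation angle in degrees for non-negative mean allocations."""
    if mean_self < 0 or mean_other < 0:
        raise ValueError("mean allocations are assumed to be non-negative")
    if mean_self == 0 and mean_other == 0:
        raise ValueError("the angle is undefined when both allocations are zero")
    # atan2(y, x) equals arctan(y / x) for x > 0 and gives 90 degrees when x == 0.
    return math.degrees(math.atan2(mean_other, mean_self))
```

Equal allocations give an angle of $45^{\circ }$, an allocation only to others gives $90^{\circ }$, and an allocation only to self gives $0^{\circ }$, matching the three behavioural regions described above.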

Deliberation process measure

The three experiments, PIPD, mNIPD and vnNIPD, all recorded the response times of the participants for each decision made in the experiment. This allows us, following the work of Gallotti et al.⁴⁵, to use the DDM to gain insight into the cognitive effort made by the participants. As shown in the abstract representation of the DDM in Supplementary Fig. S1B, the deliberation between two options occurs as the accumulation of evidence (with drift speed v) towards one of the two options (cooperation or defection in our case) until a decision threshold a is reached. The deliberation process starts at a point z (representing the initial motivations), and there is a time $t_0$ during which no deliberation occurs⁵⁴.

In this study, we analyse the values of the DDM parameters (a, v, z): a indicates how difficult the subjects perceive the decision to be, and v measures how fast each subject accumulates evidence to make their decision. These two measures describe a person’s cognitive/deliberation process, while z reflects their initial bias towards one action or the other, i.e. the intuitive position before the deliberation process occurs. The models were fitted using the hddm package for Python⁵⁴. The output of these models is a distribution of each parameter (a, v and z), and we report the average of these distributions.
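To make the roles of a, v, z and $t_0$ concrete, the following Python sketch simulates a single DDM trial with a simple Euler scheme. It is an illustration of the generative process only, not the hierarchical Bayesian fitting performed with hddm. The function name, the step size, the `noise` and `max_time` arguments, and the convention that z is a fraction of a between 0 and 1 are assumptions made for this sketch.

```python
import math
import random


def simulate_ddm(v, a, z, t0, noise=1.0, dt=0.001, max_time=10.0, rng=None):
    """Simulate one trial; return (boundary reached, response time in seconds)."""
    rng = rng or random.Random()
    x = z * a  # starting point, with z assumed to be a fraction of the threshold a
    t = 0.0
    while 0.0 < x < a and t < max_time:
        # Evidence drifts at rate v and is perturbed by Gaussian noise at each step.
        x += v * dt + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
    if x >= a:
        choice = "upper"
    elif x <= 0.0:
        choice = "lower"
    else:
        choice = None  # no boundary reached within max_time
    # The non-decision time t0 is added to the time spent accumulating evidence.
    return choice, t0 + t
```

In this sketch a larger a lengthens the path to either boundary, a larger absolute v shortens it, and moving z away from 0.5 favours the nearer boundary. Which boundary stands for cooperation and which for defection is a modelling choice that the sketch leaves open.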

To make sure these distributions are stable, these models were tested for convergence, as hddm provides different methods to test the stability of the parameters over the rounds. A lack of convergence would mean that the distribution of the parameters is noisy, and therefore, not useful for analysis. To assess the models’ convergence, we used the Gelman–Rubin ${\hat{R}}$ statistic as recommended by the package’s creators and drew 10,000 posterior samples, discarding the first 200 as also recommended⁵⁴.
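For readers unfamiliar with the statistic, the classic Gelman–Rubin ${\hat{R}}$ for one parameter can be sketched in a few lines of Python. This is a generic textbook formulation with hypothetical function and argument names, not the routine shipped with hddm, and it assumes several chains of equal length are available for the same parameter.

```python
import math
from statistics import mean, variance


def gelman_rubin(chains, burn=0):
    """Return R-hat for one parameter from several equal-length chains of samples."""
    kept = [list(chain)[burn:] for chain in chains]  # drop the burn-in samples
    n = len(kept[0])
    if len(kept) < 2 or n < 2 or any(len(chain) != n for chain in kept):
        raise ValueError("need at least two chains of equal length (two or more samples)")
    within = mean(variance(chain) for chain in kept)           # W: mean within-chain variance
    between = n * variance([mean(chain) for chain in kept])    # B: between-chain variance
    if within == 0:
        raise ValueError("within-chain variance is zero; R-hat is undefined")
    pooled = (n - 1) / n * within + between / n                # pooled variance estimate
    return math.sqrt(pooled / within)
```

Values close to 1 indicate that the chains agree with each other. A common rule of thumb, which is not taken from this paper, treats values well above 1 (for example above 1.1) as a sign that the chains have not converged.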

To test the differences between samples in both average $RA^{\circ }$ and DDM parameters, we use a two-sample Kolmogorov–Smirnov (KS) test.
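The quantity behind the two-sample KS test is the largest vertical distance between the empirical distribution functions of the two samples. The sketch below computes only that statistic in plain Python, with a hypothetical function name; in practice a library routine such as `scipy.stats.ks_2samp` also returns the associated p-value.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Return the maximum distance between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    distance = 0.0
    for x in a + b:
        # Fraction of each sample that is less than or equal to x.
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        distance = max(distance, abs(cdf_a - cdf_b))
    return distance
```

The statistic is 0 when the two empirical distributions coincide and 1 when the samples do not overlap at all.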

Code and data availability

Both the code and the data used in this work are publicly available on the Zenodo repository: https://zenodo.org/record/6868396.

References

Fehr, E. & Schmidt, K. M. A theory of fairness, competition, and cooperation. Q. J. Econ. 114, 817–868. https://doi.org/10.1162/003355399556151 (1999).
Kagel, J. H. & Roth, A. E. The Handbook of Experimental Economics, vol. 2 (Princeton University Press, 2016). Google-Books-ID: y4LRDAAAQBAJ.
Fischbacher, U., Gächter, S. & Fehr, E. Are people conditionally cooperative? Evidence from a public goods experiment. Econ. Lett. 71, 397–404. https://doi.org/10.1016/S0165-1765(01)00394-9 (2001).
Fehr, E., Naef, M. & Schmidt, K. M. Inequality aversion, efficiency, and maximin preferences in simple distribution experiments: Comment. Am. Econ. Rev. 96, 1912–1917 (2006).
Molina, J. A. et al. Gender differences in cooperation: Experimental evidence on high school students. PLoS One 8, e83700. https://doi.org/10.1371/journal.pone.0083700 (2013).
Fehr, E. & Fischbacher, U. Social norms and human cooperation. Trends Cogn. Sci. 8, 185–190. https://doi.org/10.1016/j.tics.2004.02.007 (2004).
Sun, W., Liu, L., Chen, X., Szolnoki, A. & Vasconcelos, V. V. Combination of institutional incentives for cooperative governance of risky commons. iScience 24, 102844. https://doi.org/10.1016/j.isci.2021.102844 (2021).
Capraro, V., Halpern, J. Y. & Perc, M. From outcome-based to language-based preferences. J. Econ. Lit. (2022) (forthcoming).
Santos, F. C., Pacheco, J. M. & Lenaerts, T. Evolutionary dynamics of social dilemmas in structured heterogeneous populations. Proc. Natl. Acad. Sci. 103, 3490–3494. https://doi.org/10.1073/pnas.0508201103 (2006).
Santos, F. C. & Pacheco, J. M. A new route to the evolution of cooperation. J. Evol. Biol. 19, 726–733. https://doi.org/10.1111/j.1420-9101.2005.01063.x (2006).
Fleiß, J. & Leopold-Wildburger, U. Once nice, always nice? Results on factors influencing nice behavior from an iterated prisoner’s dilemma experiment. Syst. Res. Behav. Sci. 31, 327–334. https://doi.org/10.1002/sres.2194 (2014).
Reuben, E. & Suetens, S. Revisiting strategic versus non-strategic cooperation. Exp. Econ. 15, 24–43. https://doi.org/10.1007/s10683-011-9286-4 (2012).
Van Segbroeck, S., Pacheco, J. M., Lenaerts, T. & Santos, F. C. Emergence of fairness in repeated group interactions. Phys. Rev. Lett. 108, 158104. https://doi.org/10.1103/PhysRevLett.108.158104 (2012).
Charness, G. & Rabin, M. Understanding social preferences with simple tests. Q. J. Econ. 117, 817–869 (2002).
Lange, P., Otten, W., Bruin, E. & Joireman, J. Development of prosocial, individualistic, and competitive orientations: Theory and preliminary evidence. J. Pers. Soc. Psychol. 73, 733–46. https://doi.org/10.1037//0022-3514.73.4.733 (1997).
Murphy, R. O., Ackermann, K. A. & Handgraaf, M. Measuring Social Value Orientation. SSRN Scholarly Paper ID 1804189 (Social Science Research Network, 2011). https://doi.org/10.2139/ssrn.1804189.
Fehr, E., Epper, T. & Senn, J. Other-regarding preferences and redistributive politics. Tech. Rep. 339 (Department of Economics-University of Zurich, 2021).
Sethi, R. & Somanathan, E. Preference evolution and reciprocity. J. Econ. Theory 97, 273–297. https://doi.org/10.1006/jeth.2000.2683 (2001).
Ahn, T. K., Ostrom, E. & Walker, J. M. Heterogeneous preferences and collective action. Public Choice 117, 295–314 (2003).
Lange, P. V. The pursuit of joint outcomes and equality in outcomes: An integrative model of social value orientation. J. Pers. Soc. Psychol. 77, 337–349. https://doi.org/10.1037/0022-3514.77.2.337 (1999).
Brañas-Garza, P., Meloso, D. & Miller, L. Strategic risk and response time across games. Int. J. Game Theory 46, 511–523. https://doi.org/10.1007/s00182-016-0541-y (2017).
Cappelletti, D., Güth, W. & Ploner, M. Being of two minds: Ultimatum offers under cognitive constraints. J. Econ. Psychol. 32, 940–950. https://doi.org/10.1016/j.joep.2011.08.001 (2011).
Evans, A. & Dillon, K. Reaction times and reflection in social dilemmas: Extreme responses are fast, but not intuitive. SSRN Electron. J. 1, 1. https://doi.org/10.2139/ssrn.2436750 (2014).
Mischkowski, D. & Glöckner, A. Spontaneous cooperation for prosocials, but not for proselfs: Social value orientation moderates spontaneous cooperation behavior. Sci. Rep. 6, 21555. https://doi.org/10.1038/srep21555 (2016).
Andrighetto, G., Capraro, V., Guido, A. & Szekely, A. Cooperation, Response Time, and Social Value Orientation: A Meta-Analysis. Tech. Rep., PsyArXiv (2020). https://doi.org/10.31234/osf.io/cbakz.
Yamagishi, T. et al. Response time in economic games reflects different types of decision conflict for prosocial and proself individuals. Proc. Natl. Acad. Sci. 114, 6394–6399. https://doi.org/10.1073/pnas.1608877114 (2017).
Evans, A. M. & Rand, D. G. Cooperation and decision time. Curr. Opin. Psychol. 26, 67–71. https://doi.org/10.1016/j.copsyc.2018.05.007 (2019).
Evans, A. M., Dillon, K. D. & Rand, D. G. Fast but not intuitive, slow but not reflective: Decision conflict drives reaction times in social dilemmas. J. Exp. Psychol. Gen. 144, 951–966. https://doi.org/10.1037/xge0000107 (2015).
Krajbich, I., Bartling, B., Hare, T. & Fehr, E. Rethinking fast and slow based on a critique of reaction-time reverse inference. Nat. Commun. 6, 1–9. https://doi.org/10.1038/ncomms8455 (2015).
Tinghög, G. et al. Intuition and cooperation reconsidered. Nature 498, E1–E2. https://doi.org/10.1038/nature12194 (2013).
Recalde, M. P., Riedl, A. & Vesterlund, L. Error Prone Inference from Response Time: The Case of Intuitive Generosity. SSRN Scholarly Paper ID 2507723 (Social Science Research Network, 2017).
Bear, A. & Rand, D. G. Intuition, deliberation, and the evolution of cooperation. Proc. Natl. Acad. Sci. 113, 936–941. https://doi.org/10.1073/pnas.1517780113 (2016).
Clithero, J. A. Response times in economics: Looking through the lens of sequential sampling models. J. Econ. Psychol. 69, 61–86. https://doi.org/10.1016/j.joep.2018.09.008 (2018).
Krajbich, I. & Dean, M. How can neuroscience inform economics?. Curr. Opin. Behav. Sci. 5, 51–57. https://doi.org/10.1016/j.cobeha.2015.07.005 (2015).
Camerer, C., Loewenstein, G. & Prelec, D. Neuroeconomics: How neuroscience can inform economics. J. Econ. Lit. 43, 9–64. https://doi.org/10.1257/0022051053737843 (2005).
Rangel, A., Camerer, C. & Montague, P. R. A framework for studying the neurobiology of value-based decision making. Nat. Rev. Neurosci. 9, 545–556. https://doi.org/10.1038/nrn2357 (2008).
Konovalov, A. & Krajbich, I. On the strategic use of response times. SSRN Electron. J. https://doi.org/10.2139/ssrn.3023640 (2017).
Ratcliff, R. & Rouder, J. N. Modeling response times for two-choice decisions. Psychol. Sci. https://doi.org/10.1111/1467-9280.00067 (2016).
Ratcliff, R. A theory of memory retrieval. Psychol. Rev. 85, 59–108. https://doi.org/10.1037/0033-295X.85.2.59 (1978).
Zhao, W. J., Walasek, L. & Bhatia, S. Psychological mechanisms of loss aversion: A drift-diffusion decomposition. Cogn. Psychol. 123, 101331. https://doi.org/10.1016/j.cogpsych.2020.101331 (2020).
Andrejević, M., White, J. P., Feuerriegel, D., Laham, S. & Bode, S. Response time modelling reveals evidence for multiple, distinct sources of moral decision caution. Cognition 223, 105026. https://doi.org/10.1016/j.cognition.2022.105026 (2022).
Gates, V., Callaway, F., Ho, M. K. & Griffiths, T. L. A rational model of people’s inferences about others’ preferences based on response times. Cognition 217, 104885. https://doi.org/10.1016/j.cognition.2021.104885 (2021).
Hutcherson, C., Bushong, B. & Rangel, A. A neurocomputational model of altruistic choice and its implications. Neuron 87, 451–462. https://doi.org/10.1016/j.neuron.2015.06.031 (2015).
Krajbich, I., Hare, T., Bartling, B., Morishima, Y. & Fehr, E. A common mechanism underlying food choice and social decisions. PLoS Comput. Biol. 11, e1004371. https://doi.org/10.1371/journal.pcbi.1004371 (2015).
Gallotti, R. & Grujić, J. A quantitative description of the transition between intuitive altruism and rational deliberation in iterated Prisoner’s Dilemma experiments. Sci. Rep. 9, 1–11. https://doi.org/10.1038/s41598-019-52359-3 (2019).
Montero-Porras, E., Grujić, J., Fernández Domingos, E. & Lenaerts, T. Inferring strategies from observations in long iterated Prisoner’s dilemma experiments. Sci. Rep. 12, 7589. https://doi.org/10.1038/s41598-022-11654-2 (2022).
Grujić, J., Fosco, C., Araujo, L., Cuesta, J. A. & Sánchez, A. Social experiments in the mesoscale: Humans playing a spatial prisoner’s dilemma. PLoS One 5, e13749. https://doi.org/10.1371/journal.pone.0013749 (2010).
Grujić, J. & Lenaerts, T. Do people imitate when making decisions? Evidence from a spatial Prisoner’s Dilemma experiment. R. Soc. Open Sci. 7, 200618. https://doi.org/10.1098/rsos.200618 (2020).
Emonds, G., Declerck, C., Boone, C., Vandervliet, E. & Parizel, P. Comparing the neural basis of decision making in social dilemmas of people with different social value orientations, a fMRI study. J. Neurosci. Psychol. Econ. 4, 11–24. https://doi.org/10.1037/A0020151 (2011).
Lambert, B., Declerck, C. H., Emonds, G. & Boone, C. Trust as commodity: Social value orientation affects the neural substrates of learning to cooperate. Soc. Cogn. Affect. Neurosci. 12, 609–617. https://doi.org/10.1093/scan/nsw170 (2017).
Fiedler, S., Glöckner, A., Nicklisch, A. & Dickert, S. Social value orientation and information search in social dilemmas: An eye-tracking analysis. Organ. Behav. Hum. Decis. Process. 120, 272–284. https://doi.org/10.1016/j.obhdp.2012.07.002 (2013).
Bieleke, M., Dohmen, D. & Gollwitzer, P. M. Effects of social value orientation (SVO) and decision mode on controlled information acquisition—A Mouselab perspective. J. Exp. Soc. Psychol. 86, 103896. https://doi.org/10.1016/j.jesp.2019.103896 (2020).
Capraro, V., Corgnet, B., Espín, A. M. & Hernán-González, R. Deliberation favours social efficiency by making people disregard their relative shares: Evidence from USA and India. R. Soc. Open Sci. 4, 160605. https://doi.org/10.1098/rsos.160605 (2017).
Wiecki, T. V., Sofer, I. & Frank, M. J. HDDM: Hierarchical Bayesian estimation of the Drift-Diffusion Model in Python. Front. Neuroinform. 7, 14. https://doi.org/10.3389/fninf.2013.00014 (2013).

Funding

E.M. and T.L. benefit from the support of the Flemish Government through the AI Research Program and of TAILOR, a project funded by the EU Horizon 2020 research and innovation program under GA No 952215. J.G. is funded by FWO (Research Foundation Flanders). T.L. is furthermore supported by the F.N.R.S. projects with grant numbers 31257234 and 40007793, the F.W.O. project with grant no. G.0391.13N, and the Service Public de Wallonie Recherche under grant n° 2010235-ARIAC by DigitalWallonia4.ai. R.G. is partly supported by the project AI@TN funded by the Autonomous Province of Trento.

Author information

Authors and Affiliations

AI Lab, Vrije Universiteit Brussel, Brussels, Belgium
Eladio Montero-Porras, Tom Lenaerts & Jelena Grujic
MLG, Université Libre de Bruxelles, Brussels, Belgium
Tom Lenaerts & Jelena Grujic
Center for Human-Compatible AI, UC Berkeley, Berkeley, 94702, USA
Tom Lenaerts
FARI Institute, Université Libre de Bruxelles-Vrije Universiteit Brussel, 1050, Brussels, Belgium
Tom Lenaerts
Fondazione Bruno Kessler, Trento, Italy
Riccardo Gallotti

Authors

Eladio Montero-Porras
Tom Lenaerts
Riccardo Gallotti
Jelena Grujic

Contributions

E.M., R.G., J.G. and T.L. analysed the results and discussed them; E.M., R.G., J.G. and T.L. wrote the manuscript. All authors reviewed the manuscript.

Corresponding author

Correspondence to Eladio Montero-Porras.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Montero-Porras, E., Lenaerts, T., Gallotti, R. et al. Fast deliberation is related to unconditional behaviour in iterated Prisoners’ Dilemma experiments. Sci Rep 12, 20287 (2022). https://doi.org/10.1038/s41598-022-24849-4

Received: 19 September 2022
Accepted: 21 November 2022
Published: 24 November 2022
DOI: https://doi.org/10.1038/s41598-022-24849-4
