Examining learning coherence in group decision-making: triads vs. tetrads


This study examined whether three heads are better than four in terms of performance and learning properties in group decision-making. We predicted that learning incoherence would arise in tetrads because the majority rule cannot be applied when two equal-sized subgroups emerge, and that tetrads would therefore underperform triads. To examine this hypothesis, we adopted a reinforcement learning framework using simple Q-learning and estimated learning parameters. Overall, the results were consistent with the hypothesis. Further, this study is one of only a few attempts to apply a computational approach to learning behavior in small groups. This approach enables the identification of underlying learning parameters in group decision-making.


Division of labor and specialization have significantly increased in modern society, and most of the tasks involved in these processes are carried out by groups1,2. Management of groups plays a critical role in achieving greater performance, which in turn hinges on understanding the underlying group dynamics. Lewin3 coined the term “group dynamics” to refer to the way groups and individuals act and react to changing circumstances. In the related literature, group performance has been found to be related to a combination of personality traits4, member ability4,5, team familiarity, team roles6 or leadership styles7,8, identity, conformity, psychological safety, and cohesiveness9,10,11.

Although these psychological and sociological factors indeed account for group performance, little attention has been paid to empirically measuring and characterizing the learning properties in group decision-making. Thus, this study does so, aiming to estimate relevant learning parameters by taking a computational approach to group decision-making.

Group decision-making has some advantages over individual decision-making. The former induces more employee involvement and satisfaction12, leading to higher performance13. However, a number of related studies on group decision-making have supported the proposition that groups rarely outperform their best members14,15. Nevertheless, the majority of related literature on group decision-making, under the assumption that group members cooperate and share information voluntarily15,16,17, has shown that groups outperform individuals in decision-making, enabling knowledge transfer from group to individual contexts18,19, more accurate information recall20, better negotiation outcomes21, more creative ideas22, and more accuracy23,24.

Although the single detection approach highlights the performance problems that can occur with individuals vs. dyads, the effects of larger group size on performance remain to be examined. This study examined whether three heads perform better than four or one. Comparing groups of three, groups of four, and individuals raises new issues in group decision-making that do not arise when comparing groups of two with individuals, namely the contrast between even-sized and odd-sized groups14,25. Small groups are likely to break into two coalitions. If a group has an even number of members, the two subgroups can be equal in size. In this case, since the majority rule cannot be applied, subgroup dynamics might lead to deadlock26,27,28,29. In contrast, if a small group has an odd number of members, a minority and a majority subgroup emerge, and majority influence provides a clear direction and group cohesion14,25,30,31. Thus, it is predicted that odd-sized small groups have higher cohesion and more consistent decision-making, leading to performance superior to that of even-sized groups.

This hypothesis could be reformulated in terms of learning coherence and incoherence. That is, triads (more generally, odd-sized groups) perform better than tetrads (more generally, even-sized groups) because the former maintain learning coherence due to the majority rule, whereas the latter suffer from learning incoherence due to conflict among group members. To formalize learning coherence in triads and learning incoherence in tetrads, assume two learning strategies exist, \(S_h\) and \(S_l\), and that the proportion of \(S_h\) in the population is \(p\). The strategies \(S_h\) and \(S_l\) generate expected rewards of \(R_h\) and \(R_l\), where \(R_h > R_l\). In addition, a randomized strategy that mixes \(S_h\) and \(S_l\) could exist, which underperforms both because of its learning incoherence. The expected reward for this randomized strategy is \(R_w\), where \(R_w < R_l < R_h\). In tetrads, when two subgroups of two members hold different learning strategies, conflicts arise, leading to a situation in which both strategies are randomly adopted. The probability of such a two-against-two split is \(6{p}^{2}{\left(1-p\right)}^{2}\). In contrast, triads never encounter this situation because a majority, which decides the preferred strategy, always exists. Thus, learning coherence can be achieved in triads. The difference between the expected rewards for tetrads and triads is \(-3{p}^{2}{\left(1-p\right)}^{2}\left[R_h+R_l-2R_w\right]<0\), indicating that tetrads underperform triads.
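The closed-form difference stated above can be checked numerically. The following is a minimal Python sketch, assuming that each member independently holds the high-reward strategy with probability p, that the majority strategy is adopted whenever a majority exists, and that a two-against-two split yields the randomized reward. The function names and the numerical values of p and the rewards are illustrative assumptions and do not come from the study.

```python
from math import comb


def triad_reward(p, r_h, r_l):
    """Expected reward of a triad under majority rule (a majority always exists)."""
    p_major_h = sum(comb(3, k) * p**k * (1 - p) ** (3 - k) for k in (2, 3))
    return p_major_h * r_h + (1 - p_major_h) * r_l


def tetrad_reward(p, r_h, r_l, r_w):
    """Expected reward of a tetrad; a 2-2 split yields the randomized reward r_w."""
    def prob(k):
        # Probability that exactly k of the four members hold the high strategy.
        return comb(4, k) * p**k * (1 - p) ** (4 - k)

    p_major_h = prob(3) + prob(4)
    p_major_l = prob(0) + prob(1)
    p_tie = prob(2)  # equals 6 p^2 (1 - p)^2
    return p_major_h * r_h + p_major_l * r_l + p_tie * r_w


def closed_form_gap(p, r_h, r_l, r_w):
    """Tetrad minus triad expected reward, as given in the text."""
    return -3 * p**2 * (1 - p) ** 2 * (r_h + r_l - 2 * r_w)


# Illustrative values only (assumed, not estimated from data).
p, r_h, r_l, r_w = 0.4, 7.0, 5.0, 3.0
gap = tetrad_reward(p, r_h, r_l, r_w) - triad_reward(p, r_h, r_l)
print(gap, closed_form_gap(p, r_h, r_l, r_w))
```

Under these assumptions the enumerated gap and the closed-form expression agree, and the gap is negative whenever the randomized reward is below both strategy rewards.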

The purpose of this study was to examine the hypothesis that learning coherence emerges in individuals and triads and learning incoherence occurs in tetrads by estimating and comparing learning parameters.



A total of 343 healthy undergraduate students at Kobe University (103 women, age range = 19–25 years, SD = 1.21) participated in the study for course credit. All experimental protocols in this study were approved by the Ethics Committee, Graduate School of Business Administration, Kobe University, and the study was carried out in accordance with the relevant guidelines and regulations. All participants signed an informed consent form before the experiment.


In test 1, participants undertook cognitive tasks (two-armed bandit [TAB] problems) individually. In test 2, they formed groups of three and performed the same tasks as groups. In test 3, they formed groups of four and performed the same cognitive tasks as groups. There were seven rounds of tests. To control for learning effects, the three tests were randomly assigned in each round, that is, some participants performed as triads, others as tetrads, and the remainder as individuals. All tests were performed with PsyToolkit32,33, and when participants performed the TAB tasks as a group, they communicated with each other via a breakout session in Zoom to decide their choices in the TAB.

All participants undertook test 1, and most of them took part in tests 2 and 3. In each of the retained triad and tetrad groups, at least one member participated in both tests 2 and 3. Because no member of 8 triads participated in test 3 and no member of 4 tetrads participated in test 2, these groups were dropped from the sample. As a result, the total numbers of triads and tetrads examined in this study were 100 and 104, respectively.

Q-learning model

In this study, a simple Q-learning reinforcement learning algorithm34 was adopted to account for asymmetric learning rates (learning biases). Participants played a TAB problem, in which they chose either the right or the left box on the screen. After each selection, the participants were awarded either 10 or 0 points, and they were instructed to try to achieve the highest score over a series of 100 choices. One of the boxes had a higher probability (70%) of being worth 10 points, and the corresponding probability for the other box was set at 30%. However, we switched these probabilities twice over the 100 choices. For example, the right and left boxes had a respective 70% and 30% probability of being worth 10 points for the first 30 choices; from the 31st to the 70th choice, the probabilities switched so that the right and left boxes paid out with probabilities of 30% and 70%, respectively. Then, for the last 30 choices, the probabilities of the right and left boxes returned to their initial levels of 70% and 30%. Thus, in each round of tests 1 and 2, these changes in probability took place twice over 100 choices. Moreover, the probabilities were randomized for every round of tests 1 and 2 so that even in the same test, the probability for each round differed. Therefore, participants could not transfer learning obtained in one round of the test to other rounds.
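The example schedule described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the task code used in the study: the switch points (after the 30th and 70th choices) follow the example in the text, while the function names, the fixed starting side, and the random seed are hypothetical.

```python
import random


def high_prob_side(trial, first_high="right"):
    """Return the box with the 70% reward probability on a 1-based trial."""
    other = "left" if first_high == "right" else "right"
    # Choices 31-70 use the reversed probabilities; all others use the initial ones.
    return other if 31 <= trial <= 70 else first_high


def play(choice, trial, rng, first_high="right"):
    """Return 10 or 0 points for choosing a box on the given trial."""
    p_reward = 0.7 if choice == high_prob_side(trial, first_high) else 0.3
    return 10 if rng.random() < p_reward else 0


# A player who always picks the right box over 100 choices (seed is arbitrary).
rng = random.Random(0)
total = sum(play("right", t, rng) for t in range(1, 101))
print(total)
```

Under this schedule, always choosing the initially favorable box earns 540 points in expectation (60 choices at 70% plus 40 choices at 30%, each worth 10 points), so tracking the reversals is needed to do better.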

In the Q-learning framework, a decision-maker is assumed to calculate the action value for each choice (i.e., the right and left boxes). The action value of option i at trial t is denoted by \({Q}_{i}\left(t\right)\), calculated as follows:

$$ Q_{i}\left(t+1\right)=\begin{cases} Q_{i}\left(t\right)+\alpha^{+}\delta\left(t\right)+\phi & \text{if } \delta\left(t\right)\ge 0, \\ Q_{i}\left(t\right)+\alpha^{-}\delta\left(t\right)+\phi & \text{if } \delta\left(t\right)<0, \end{cases} $$


$$ \delta\left(t\right)=R_{i}\left(t\right)-Q_{i}\left(t\right), $$

where \({\mathrm{R}}_{i}\left(t\right)\) is the reward associated with option \(i\) at trial \(t\), either 10 or 0 points, and \(\delta \left(t\right)\) is the reward prediction error. \({\alpha }^{\pm }\) indicates the learning rate so that the learning biases are measured by \({\alpha }^{+}-{\alpha }^{-}\). If this is positive (negative), positivity (negativity) biases exist. \(\phi \) is added in Eq. 1 as the choice trace to account for autocorrelation of choice, which could affect learning biases35.
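The asymmetric update for the chosen option can be sketched in Python as follows. This is a minimal illustration, assuming that the choice trace enters additively exactly as written in the update rule; the function name `update_q` and the numerical values are hypothetical.

```python
def update_q(q_chosen, reward, alpha_pos, alpha_neg, phi=0.0):
    """Apply one asymmetric Q-learning update to the chosen option.

    Returns the updated action value and the reward prediction error.
    """
    delta = reward - q_chosen  # reward prediction error
    # Use the positive learning rate for non-negative errors, else the negative one.
    alpha = alpha_pos if delta >= 0 else alpha_neg
    return q_chosen + alpha * delta + phi, delta


# Illustrative values (assumed): a positivity bias means alpha_pos > alpha_neg.
q_after_win, delta_win = update_q(5.0, 10, alpha_pos=0.4, alpha_neg=0.2)
q_after_loss, delta_loss = update_q(5.0, 0, alpha_pos=0.4, alpha_neg=0.2)
print(q_after_win, q_after_loss)
```

With these values a reward of 10 moves the action value up by 2 points while a reward of 0 moves it down by only 1 point, which is what a positivity bias amounts to in this model.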

As one of the characteristics of learning, this study compared positivity biases. Positivity and confirmation biases refer, respectively, to the tendency to respond more sensitively to positive news than to negative news, and the tendency to favor outcomes consistent with one’s hypothesis36. Related studies examined the existence of these biases in individual reinforcement learning and reported that learning rates tend to be positively biased37,38,39,40,41. Katahira35 suggested that the autocorrelation of choices itself tends to generate pseudo-positivity biases. Harada42 controlled for this autocorrelation by incorporating the effects of past choices into the learning model and demonstrated that positivity biases were indeed confirmed in a simple Q-learning model. However, once a more dynamic model was introduced, the positivity biases disappeared. Therefore, learning biases depended not only on the autocorrelation of choices but also on the autocorrelation of learning parameters in the model. While previous studies examined learning biases for individuals, this study investigated the existence of positivity biases in group learning of triads and tetrads. As related studies indicated, it could be inferred that either positivity biases existed or no biases existed for both triads and tetrads. According to our hypothesis, we speculated that learning coherence in triads would lead to positivity biases, because individual learning was reported to generate positivity biases in related studies, whereas tetrads would generate no biases due to learning incoherence.

If option \(j\) \(\left(j\ne i\right)\) is not chosen, its action value remains unchanged:

$$ Q_{j}\left(t+1\right)=Q_{j}\left(t\right). $$

Given these action values of the two options, the decision-maker chooses one of the two options according to the softmax decision rule:

$$ P\left(a\left(t\right)=i\right)=\frac{\exp\left(\beta {Q}_{i}\left(t\right)\right)}{\sum_{j=1}^{2}\exp\left(\beta {Q}_{j}\left(t\right)\right)}, $$

where \(P\left(a\left(t\right)=i\right)\) is the probability of choosing the action \(a\left(t\right)=i\) at trial \(t\). The parameter \(\upbeta \) is the inverse temperature, which measures the relative strength of exploitation vs. exploration (exploitation/exploration ratio). Exploitation is related to optimization under current contexts, implying the choice of the option with the highest action value \({Q}_{i}\left(t\right)\). Exploration, on the other hand, refers to deviation from optimization, so that an option without the highest action value is selected. If \(\upbeta \) is high, the probability of choosing the option with the highest action value increases, leading to exploitation. In contrast, if \(\upbeta \) is low, the probability of choosing an option without the highest action value increases, leading to exploration.
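The effect of the inverse temperature on choice probabilities can be illustrated with a short Python sketch of the softmax rule. The function name and the action values below are hypothetical, and subtracting the maximum before exponentiating is a standard numerical safeguard that does not change the probabilities.

```python
import math


def softmax_choice_probs(q_values, beta):
    """Return softmax choice probabilities for the given action values."""
    scaled = [beta * q for q in q_values]
    m = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative action values (assumed): the first option is currently better.
q_values = [7.0, 4.0]
for beta in (0.0, 0.1, 1.0):
    print(beta, softmax_choice_probs(q_values, beta))
```

With an inverse temperature of zero both options are chosen with equal probability (pure exploration), and as the inverse temperature grows the probability of choosing the higher-valued option approaches one (exploitation).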

Estimation method

The parameters specified in Eqs. (1)–(5) were estimated by optimizing the maximum a posteriori objective function:

$$ {\widehat{\theta }}_{s}=\underset{{\theta }_{s}}{\arg\max}\; p\left({D}_{s}|{\theta }_{s}\right)p\left({\theta }_{s}\right), $$

where \(p\left({D}_{s}|{\theta }_{s}\right)\) is the likelihood of data \({D}_{s}\) for a subject \(s\) conditional on parameters \({\theta }_{s}=\left\{{\alpha }_{s}^{+},{\alpha }_{s}^{-},{\phi }_{s},{\beta }_{s}\right\}\), and \(p\left({\theta }_{s}\right)\) is the prior probability of \({\theta }_{s}\). Note that \({\alpha }^{\pm }\) should be bounded between 0 and 1, and \(\upbeta \) takes non-negative values. Therefore, the priors were assumed to follow beta distributions for \({\alpha }^{\pm }\) with shape parameters of 2 and 2, and a gamma distribution for \(\upbeta \) with a shape parameter of 2 and a scale parameter of 3. In addition, \({\phi }_{s}\) is assumed to follow a standard normal distribution with mean 0 and variance 1.
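The maximum a posteriori objective can be sketched in Python by adding the log-likelihood of the observed choices to the log-priors described above. This is a sketch under stated assumptions, not the estimation code used in the study: initial action values of zero, options coded as 0 and 1, and all function names are assumptions introduced here. An optimizer would then maximize `log_posterior` over the four parameters.

```python
import math


def log_beta_pdf(x, a=2.0, b=2.0):
    """Log density of a beta(a, b) distribution on (0, 1)."""
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (a - 1) * math.log(x) + (b - 1) * math.log(1 - x) - log_norm


def log_gamma_pdf(x, shape=2.0, scale=3.0):
    """Log density of a gamma distribution with the given shape and scale."""
    return ((shape - 1) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))


def log_std_normal_pdf(x):
    """Log density of the standard normal distribution."""
    return -0.5 * x * x - 0.5 * math.log(2 * math.pi)


def log_posterior(params, choices, rewards):
    """Unnormalized log posterior of (alpha_pos, alpha_neg, phi, beta)."""
    alpha_pos, alpha_neg, phi, beta = params
    if not (0 < alpha_pos < 1 and 0 < alpha_neg < 1 and beta > 0):
        return -math.inf  # outside the support of the priors
    q = [0.0, 0.0]  # assumed initial action values
    log_lik = 0.0
    for choice, reward in zip(choices, rewards):
        # Log softmax probability of the observed choice (numerically stable).
        m = max(beta * q[0], beta * q[1])
        log_z = m + math.log(math.exp(beta * q[0] - m) + math.exp(beta * q[1] - m))
        log_lik += beta * q[choice] - log_z
        # Asymmetric update of the chosen option; the other option is unchanged.
        delta = reward - q[choice]
        alpha = alpha_pos if delta >= 0 else alpha_neg
        q[choice] += alpha * delta + phi
    log_prior = (log_beta_pdf(alpha_pos) + log_beta_pdf(alpha_neg)
                 + log_gamma_pdf(beta) + log_std_normal_pdf(phi))
    return log_lik + log_prior


# Illustrative data (assumed): four choices with rewards of 10 or 0 points.
value = log_posterior((0.5, 0.5, 0.0, 3.0), [0, 0, 1, 0], [10, 0, 10, 10])
print(value)
```

Working with log densities keeps the objective numerically stable over 100 choices, where the raw likelihood would be a product of many probabilities smaller than one.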


This study investigated the underlying learning mechanisms of triads and tetrads from two perspectives: (1) group differences and (2) within-group effects. The descriptive statistics for relevant variables are reported in Table 1. Because the data rejected either homogeneity of variance (Bartlett test) or normality (Shapiro–Wilk test) in the comparisons across and within groups, the Kruskal–Wallis test was applied in the subsequent analyses. Owing to space limitations, the results of the Bartlett and Shapiro–Wilk tests are not reported.
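For a comparison of two groups, the Kruskal–Wallis statistic is referred to a chi-square distribution with one degree of freedom. The following Python sketch assumes the standard tie-corrected H statistic and the chi-square approximation; the source does not state which software was used, and the function names and sample values here are hypothetical.

```python
import math


def chi2_sf_1df(h):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(max(h, 0.0) / 2))


def kruskal_wallis_two_groups(x, y):
    """Return the tie-corrected H statistic and its approximate p-value.

    Assumes the pooled sample is not entirely tied.
    """
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    n = len(pooled)
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1
    rank_sums = [0.0, 0.0]
    for (_, g), r in zip(pooled, ranks):
        rank_sums[g] += r
    h = (12 / (n * (n + 1))
         * (rank_sums[0] ** 2 / len(x) + rank_sums[1] ** 2 / len(y))
         - 3 * (n + 1))
    h /= 1 - tie_term / (n**3 - n)  # tie correction
    return h, chi2_sf_1df(h)


# Illustrative samples (assumed, not data from the study).
h_stat, p_value = kruskal_wallis_two_groups([1, 2, 3], [4, 5, 6])
print(h_stat, p_value)
```

As a consistency check on the reporting, `chi2_sf_1df(4.12)` is roughly 0.04, in line with the p-value reported for the performance comparison between triads and tetrads.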

Table 1 Descriptive statistics.

Group differences


First, the performance difference between triads and tetrads was examined. The result indicated that triads outperformed tetrads (\({\chi }^{2}\)=4.12, p = 0.04). Thus, triads achieved slightly higher performance than tetrads (see Fig. 1).

Figure 1

Comparison of average performance of triads and tetrads. Error bars represent standard errors of means. The Kruskal–Wallis test was applied. **p < .05.

Inverse temperature

As the first characteristic of learning, the magnitude of the inverse temperature was compared between triads and tetrads. Inverse temperature measures the degree of exploitation vis-à-vis exploration. Exploitation adopts the optimal choices, given existing information, whereas exploration makes random choices. Inverse temperature was significantly higher for triads than for tetrads (\({\chi }^{2}\)=42.88, p = 5.8e−11) (see Fig. 2). It follows that tetrads were more likely to make random choices, regardless of past records. It could be inferred that this result arose because the majority rule was harder to apply in tetrads than in triads. This implied learning coherence in triads and incoherence in tetrads.

Figure 2

Comparison of average inverse temperatures (\(\upbeta\)) of triads and tetrads. The Kruskal–Wallis test was applied. ***p < .01.

Positivity biases

As the second characteristic of learning, this study compared positivity biases. While previous studies examined learning biases for individuals, this study investigated the existence of positivity biases in group learning of triads and tetrads. As related studies indicated, it could be inferred that either positivity biases existed or no biases existed for both triads and tetrads. For triads, positivity biases were supported (\({\chi }^{2}\)=13.39, p = 2.5e−04). However, for tetrads, we found negativity biases (\({\chi }^{2}\)=24.05, p = 9.4e−07). This study also investigated learning biases for individuals, revealing that positivity biases existed (\({\chi }^{2}\)=22.08, p = 2.6e−06). Thus, while individuals and triads exhibited positivity biases, tetrads exhibited negativity biases (see Fig. 3). In light of related studies, this result suggested learning coherence for triads and learning incoherence for tetrads.

Figure 3

Comparison of average positivity biases (\({\alpha }^{+}-{\alpha }^{-}\)) of triads and tetrads. The Kruskal–Wallis test was applied. ***p < .01.

Within-group effects

To assess within-group effects, the maximum, minimum, and average of group members’ individual performances and learning parameters were compared with the corresponding group variables.


In triads, group performance exceeded the minimum of the group members’ individual performances (\({\chi }^{2}\)=45.7, p = 1.4e-11), but fell below their maximum (\({\chi }^{2}\)=23.91, p = 1.1e-06). However, group performance and the average of individual performances were not differentiated (\({\chi }^{2}\)=0.89, p = 0.34). Similarly, in tetrads, group performance exceeded the minimum of individual members (\({\chi }^{2}\)=47.94, p = 4.4e-12), but fell below their maximum (\({\chi }^{2}\)=59.46, p = 1.3e-14). Group performance and the average of individual performances were not differentiated (\({\chi }^{2}\)=0.03, p = 0.87).

Inverse temperature

In triads, group inverse temperature was greater than the minimum (\({\chi }^{2}\)=58.41, p = 2.1e−14) and the average (\({\chi }^{2}\)=7.12, p = 0.01) of individual group members, but was marginally smaller than their maximum (\({\chi }^{2}\)=3.50, p = 0.06). In tetrads, group inverse temperature was greater than the minimum of individual members (\({\chi }^{2}\)=26.68, p = 2.4e-07), but was smaller than both their maximum (\({\chi }^{2}\)=137.48, p = 2.2e-16) and their average (\({\chi }^{2}\)=87.80, p = 2.2e-16).

Thus, group effects raised inverse temperature in triads, whose group inverse temperature exceeded the average of their members, whereas the group inverse temperature of tetrads fell below the average of their members.

Positivity biases

In triads, positivity biases were greater than the minimum among individual group members (\({\chi }^{2}\)=40.02, p = 2.5e-10), but were smaller than their maximum (\({\chi }^{2}\)=26.05, p = 3.3e-07). However, positivity biases and the average of individual members were not differentiated (\({\chi }^{2}\)=1.03, p = 0.31). In tetrads, negativity biases were greater than the minimum (\({\chi }^{2}\)=34.33, p = 4.7e-09), maximum (\({\chi }^{2}\)=142, p = 2.2e-16), and average (\({\chi }^{2}\)=46.45, p = 9.4e-12) of individual members.

Thus, group effects in triads were strong in generating positivity biases, whereas those in tetrads were significant in giving rise to negativity biases.


Overall, our statistical analysis revealed that triads had higher performance, higher inverse temperature, and more positivity biases than tetrads. Since inverse temperature and positivity biases were indicated to be positively related to performance, these results implied that triads achieved learning coherence, but tetrads experienced learning incoherence. On the one hand, it can be inferred that the tendency of triads to break into majority and minority subgroups enabled them to achieve consistent and efficient learning over 100 choices, as indicated by high performance, inverse temperature, and positivity biases. On the other hand, tetrads, which might be constrained by two equal subgroups, encountered dispute and confrontation, sometimes leading to deadlock, and this resulted in lower performance and inconsistent learning behavior, reflected in low inverse temperature and high negativity biases. These results were consistent with related studies14,25,26,27,28,29,31.

Of course, dispute and confrontation do not necessarily impair group performance. For example, in more creative tasks that require insight and experimentation, the high exploration observed in tetrads might be more efficient than the behavior of triads subjected to majority influence. However, in the TAB problems, insight and experimentation were not required. Instead, utilizing past information and efficiently guessing the advantageous box played a critical role in achieving higher performance, which in turn hinged on consistency in learning strategies. Thus, while this study confirmed that odd-sized groups (i.e., triads) performed better than even-sized groups (i.e., tetrads) in learning tasks that do not require creativity or insight, tetrads might be superior to triads in creative tasks and insight problem-solving. This could be an interesting research topic in the future.

In contrast to performance, inverse temperature, and positivity biases, the risk parameters, \(\upmu \) and \(\upnu \), did not account for the difference between triads and tetrads. Note that risk-seeking behavior also entails a tendency to diverge from current learning strategies. In this sense, risk-seeking has some similarity to exploration. However, in our model, exploration corresponded to divergence from the optimal Q value, which already incorporated risk-seeking behavior. Hence, risk-seeking and exploration differ subtly. The finding that inverse temperature differed between triads and tetrads implies that the divergence from a consistent learning strategy was reflected in the inverse temperature but not in risk attitudes.

In addition to these results, this paper contributes a novel methodology for the study of small groups. To the best of our knowledge, this is one of the first attempts to take a computational approach to the study of small-group dynamics. Of course, a large body of literature on group dynamics has empirically investigated the properties of the dynamics of small groups. However, most of these studies did not explicitly model the underlying mechanism of group decision-making or estimate parameters that characterize group dynamics. The computational approach proposed in this paper articulates the algorithm of group decision-making and enables the underlying learning parameters to be estimated, allowing for rigorous comparison among small groups in terms of learning parameters such as inverse temperature and risk attitudes. We hope this computational approach sheds new light on group dynamics and group decision-making.

In this respect, it should also be noted that a simple Q-learning model, or reinforcement learning in general, closely corresponds to the actual working of neural networks in the brain. The key variables are the actual rewards and reward prediction errors. The Q value is the expected reward, which is updated by feedback from a reward prediction error. This reinforcement learning framework is supported by a number of empirical studies, including studies of neural signals in various cortical and subcortical structures that behave as predicted43,44,45,46. For example, it is now commonly accepted that dopamine neurons in the midbrain of humans and monkeys encode reward prediction errors46,47,48. Thus, the reinforcement learning model class is typically matched by brain activity. Since the simple Q-learning model considered in this paper belongs to this model class, the model matches brain activity, unlike abstract and unrealistic models without an empirical foundation.

One of the managerial implications derived from this study is that group size is crucial to the management of small groups. In particular, when groups undertake learning under uncertainty without the burden of creativity and insight, triads, rather than tetrads, should be selected. However, when tasks require much creativity and insight, tetrads, rather than triads, might be preferred, although this idea was not examined in this study. In broader contexts, odd-sized groups are favored for learning tasks without creativity and even-sized groups for creative problem-solving25. This rule is clear and straightforward to implement, but, of course, diversity in knowledge, skill, working experiences, cultural backgrounds, and personalities also account for group performance. However, unless managers have sufficient time to take these factors into account, this simple rule should be implemented.

Finally, we would like to point out the limitations of this study. First, while we confirmed that triads outperformed tetrads, learning coherence of triads and learning incoherence of tetrads were inferred from the results on inverse temperature and positivity biases, rather than derived from a strong theoretical background. In this sense, these learning characteristics were exploratory in our hypothesis. In future studies, more detailed learning mechanisms generating learning coherence and incoherence should be specified and empirically tested. Second, learning tasks (TAB) are fundamental to the results of this study. If different kinds of tasks are assigned, the relative performance of triads and tetrads would differ. In particular, as described above, insight problem-solving or creative tasks might have opposite results regarding the relative performance of triads and tetrads. This constitutes one of our future research challenges.


This study focused on the relative performance and learning characteristics of triads and tetrads, extending Simmel’s49 research on dyads vs. triads to triads vs. tetrads, and it also serves as a specific investigation of odd- vs. even-sized group dynamics14,25. Generally, our study confirmed that the odd-sized groups performed better than the even-sized groups. Moreover, learning coherence and incoherence were observed in triads and tetrads, respectively. In addition to confirming the theoretical predictions, this study developed a new computational model that enables the estimation of the underlying learning properties of small groups. In related work, Harada50 also showed that individuals and triads performed better than dyads due to the learning coherence of individuals and triads and the learning incoherence of dyads. This study was consistent with this result in that the odd-sized groups (triads) performed better than even-sized groups (tetrads). To the best of our knowledge, this study is one of only a few attempts to apply the reinforcement learning framework to group decision-making.


  1. Wuchty, S., Jones, B. F. & Uzzi, B. The increasing dominance of teams in production of knowledge. Science 316, 1036–1039. (2007).

    ADS  CAS  Article  PubMed  Google Scholar 

  2. Gowers, T. & Nielsen, M. Massively collaborative mathematics. Nature 461, 879–881. (2009).

    ADS  CAS  Article  PubMed  Google Scholar 

  3. Lewin, K. In Field theory in social science: selected theoretical papers (ed. Dorwin, C.) (Harpers, 1951).

    Google Scholar 

  4. van Vianen, A. E. M. & De Dreu, C. K. W. Personality in teams: Its relationship to social cohesion, task cohesion, and team performance. Eur. J. Work Organ. Psy. 10, 97–120. (2001).

    Article  Google Scholar 

  5. Barrick, M. R., Stewart, G. L., Neubert, M. J. & Mount, M. K. Relating member ability and personality to work-team processes and team effectiveness. J. Appl. Psychol. 83, 377–391. (1998).

    Article  Google Scholar 

  6. Fisher, S., Hunter, T. A. & Macrosson, W. D. K. Belbin’s team role theory: for non-managers also?. J. Manag. Psychol. 17, 14–20. (2002).

    Article  Google Scholar 

  7. De Church, L. A. & Marks, M. A. Leadership in multiteam systems. J Appl Psychol 91, 311–329. (2006).

    Article  Google Scholar 

  8. Gerstner, C. R. & Day, D. V. Meta-Analytic review of leader–member exchange theory: Correlates and construct issues. J. Appl. Psychol. 82, 827–844. (1997).

    Article  Google Scholar 

  9. Beal, D. J., Cohen, R. R., Burke, M. J. & McLendon, C. L. Cohesion and performance in groups: A meta-analytic clarification of construct relations. J. Appl. Psychol. 88, 989–1004. (2003).

    Article  PubMed  Google Scholar 

  10. Chiocchio, F. & Essiembre, H. Cohesion and performance: A meta-analytic review of disparities between project teams, production teams, and service teams. Small Group Res. 40, 382–420. (2009).

    Article  Google Scholar 

  11. Mullen, B. & Copper, C. The relation between group cohesiveness and performance: An integration. Psychol. Bull. 115, 210–227. (1994).

    Article  Google Scholar 

  12. Wellins, R. S., Byham, W. C. & Dixon, G. R. Inside Teams (Jossey-Bass, 1994).

    Google Scholar 

  13. Salas, E., Cooke, N. J. & Rosen, M. A. On teams, teamwork, and team performance: Discoveries and developments. Hum. Factors 50, 540–547. (2008).

    Article  PubMed  Google Scholar 

  14. Hastie, R. & Kameda, T. The robust beauty of majority rules in group decisions. Psychol. Rev. 112, 494–508. (2005).

    Article  PubMed  Google Scholar 

  15. Kerr, N. L. & Tindale, R. S. Group performance and decision making. Annu. Rev. Psychol. 55, 623–655. (2004).

    Article  PubMed  Google Scholar 

  16. Adamowicz, W. et al. Decision strategy and structure in households: A “groups” perspective. Mark. Lett. 16, 387–399. (2005).

    Article  Google Scholar 

  17. Tindale, R. S. & Kluwe, K. In The Wiley Blackwell Handbook of Judgment and Decision Making Vol. 2 (eds Gideon, K. & George, W.) 849–874 (John Wiley & Sons, 2015).

    Chapter  Google Scholar 

  18. Maciejovsky, B. & Budescu, D. V. Collective induction without cooperation? Learning and knowledge transfer in cooperative groups and competitive auctions. J. Pers. Soc. Psychol. 92, 854–870. (2007).

    Article  PubMed  Google Scholar 

  19. Laughlin, P. R. Group Problem Solving (Princeton University Press, 2011).

    Book  Google Scholar 

  20. Hinsz, V. B. Cognitive and consensus processes in group recognition memory performance. J. Pers. Soc. Psychol. 59, 705–718. (1990).

    ADS  Article  Google Scholar 

  21. Morgan, P. M. & Tindale, R. S. Group vs individual performance in mixed-motive situations: Exploring an inconsistency. Organ. Behav. Hum. Decis. Process. 87, 44–65. (2002).

    Article  Google Scholar 

  22. Nijstad, B. A. & Paulus, P. B. In Group Creativity: Innovation Through Collaboration (eds Paulus, P. B. & Nijstad, B. A.) 326–339 (Oxford University Press, 2003).

    Chapter  Google Scholar 

  23. Kerr, N. L. & Tindale, R. S. Group-based forecasting?: A social psychological analysis. Int. J. Forecast. 27, 14–40. (2011).

    Article  Google Scholar 

  24. Mellers, B. et al. Psychological strategies for winning a geopolitical forecasting tournament. Psychol. Sci. 25, 1106–1115. (2014).

    Article  PubMed  Google Scholar 

  25. Menon, T. & Phillips, K. W. Getting even or being at odds? Cohesion in even- and odd-sized small groups. Organ. Sci. 22, 738–753. (2011).

    Article  Google Scholar 

  26. Murnighan, J. K. Models of coalition behavior: Game theoretic, social psychological, and political perspectives. Psychol. Bull. 85, 1130–1153. (1978).

    Article  Google Scholar 

  27. O’Leary, M. B. & Mortensen, M. Go (con)figure: Subgroups, imbalance, and isolates in geographically dispersed teams. Organ. Sci. 21, 115–131. (2010).

    Article  Google Scholar 

  28. Polzer, J. T., Crisp, C. B., Jarvenpaa, S. L. & Kim, J. W. Extending the faultline model to geographically dispersed teams: How colocated subgroups can impair group functioning. Acad. Manag. J. 49, 679–692 (2006).

  29. Shears, L. M. Patterns of coalition formation in two games played by male tetrads. Behav. Sci. 12, 130–137 (1967).

  30. Asch, S. E. In Groups, Leadership and Men; Research in Human Relations (ed. Guetzkow, H.) 177–190 (Carnegie Press, 1951).

  31. Wittenbaum, G. M., Stasser, G. & Merry, C. J. Tacit coordination in anticipation of small group task completion. J. Exp. Soc. Psychol. 32, 129–152 (1996).

  32. Stoet, G. PsyToolkit—A software package for programming psychological experiments using Linux. Behav. Res. Methods 42, 1096–1104 (2010).

  33. Stoet, G. PsyToolkit: A novel web-based method for running online questionnaires and reaction-time experiments. Teach. Psychol. 44, 24–31 (2017).

  34. Watkins, C. J. C. H. & Dayan, P. Q-learning. Mach. Learn. 8, 279–292 (1992).

  35. Katahira, K. The statistical structures of reinforcement learning with asymmetric value updates. J. Math. Psychol. 87, 31–45 (2018).

  36. Palminteri, S., Lefebvre, G., Kilford, E. J. & Blakemore, S.-J. Confirmation bias in human reinforcement learning: Evidence from counterfactual feedback processing. PLoS Comput. Biol. 13, 1–22 (2017).

  37. Aberg, K. C., Doell, K. C. & Schwartz, S. Hemispheric asymmetries in striatal reward responses relate to approach–avoidance learning and encoding of positive–negative prediction errors in dopaminergic midbrain regions. J. Neurosci. 35, 14491–14500 (2015).

  38. den Ouden, H. E. M. et al. Dissociable effects of dopamine and serotonin on reversal learning. Neuron 80, 1090–1100 (2013).

  39. Frank, M. J., Moustafa, A. A., Haughey, H. M., Curran, T. & Hutchison, K. E. Genetic triple dissociation reveals multiple roles for dopamine in reinforcement learning. Proc. Natl. Acad. Sci. U.S.A. 104, 16311–16316 (2007).

  40. Lefebvre, G., Lebreton, M., Meyniel, F., Bourgeois-Gironde, S. & Palminteri, S. Behavioural and neural characterization of optimistic reinforcement learning. Nat. Hum. Behav. 1, 1–9 (2017).

  41. van den Bos, W., Cohen, M. X., Kahnt, T. & Crone, E. A. Striatum-medial prefrontal cortex connectivity predicts developmental changes in reinforcement learning. Cereb. Cortex 22, 1247–1255 (2012).

  42. Harada, T. Learning from success or failure?—Positivity biases revisited. Front. Psychol. 11, 1–11 (2020).

  43. Glimcher, P. W. & Rustichini, A. Neuroeconomics: The consilience of brain and decision. Science 306, 447–452 (2004).

  44. Hikosaka, O., Nakamura, K. & Nakahara, H. Basal ganglia orient eyes to reward. J. Neurophysiol. 95, 567–584 (2006).

  45. Rangel, A., Camerer, C. & Montague, P. R. A framework for studying the neurobiology of value-based decision making. Nat. Rev. Neurosci. 9, 545–556 (2008).

  46. Schultz, W., Dayan, P. & Montague, P. R. A neural substrate of prediction and reward. Science 275, 1593–1599 (1997).

  47. Bayer, H. M. & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47, 129–141 (2005).

  48. Cohen, J. Y., Haesler, S., Vong, L., Lowell, B. B. & Uchida, N. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature 482, 85–88 (2012).

  49. Simmel, G. The Sociology of Georg Simmel (The Free Press, 1964).

  50. Harada, T. Three heads are better than two: Comparing learning properties and performances across individuals, dyads, and triads through a computational approach. PLoS ONE 16, 1–16 (2021).

Author information

T.H. wrote the whole manuscript and prepared the figures and tables.

Corresponding author

Correspondence to Tsutomu Harada.

Ethics declarations

Competing interests

The author declares no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Harada, T. Examining learning coherence in group decision-making: triads vs. tetrads. Sci Rep 11, 20461 (2021).
