We know that groups of animals are capable of marvelous feats of collective action that emerge from the aggregation of small contributions from a large number of individuals, often through elegantly simple local mechanisms1. Many computational models have been proposed that demonstrate how a group may, for example, forage or navigate accurately without signaling between group members or without group members needing to recognize which individual is the best informed. Many such models of leadership and collective action in small animals do not even consider groups of fewer than ten or so interacting agents2. These models are silent when it comes to the most basic unit of social interaction: the dyad. Whether pre-social animals such as fish can engage in any useful one-to-one collaborative effort to solve complex cognitive problems is unknown and not accounted for by existing theories of collective action in animals.

To address this question, we compared the performance of individuals and dyads of guppies (Poecilia reticulata) in two different numerosity discrimination tasks. Guppies and other small fish that risk predation can discriminate numerosities and use this ability to reduce that risk by spontaneously joining the larger shoal of conspecifics3,4,5,6. Furthermore, they can also learn to select the more or less numerous display of abstract objects even when other confounds of magnitude, such as size and density, are controlled for. Typically, guppies and other small fish can readily discriminate up to 1:2 ratios when numerosities larger than 4 are involved, although they exhibit a better numerical acuity within a limited range of numerosities (≤4)3,5. However, it is not known whether two guppies shoaling together would be better at numerosity discrimination than they would be as individuals.


In Experiment 1, we used a two-choice apparatus to measure the ability of 150 female guppies to discriminate shoal size. Forty-six fish were tested individually while the other 104 fish were matched for size and tested in dyads (n = 52). To investigate the role of familiarity, half of the dyads (n = 26) were composed of familiar individuals (fish that had lived in the same tank for at least 20 days) and half of non-familiar individuals. Singletons and dyads were presented with the numerical contrast 4 vs. 6, just above the threshold ratio reported for fishes in experiments using this procedure3,7 (Figure 1a).

Figure 1

Experimental apparatus and results of experiment 1.

Subjects were inserted in the middle one of three adjacent tanks (a). Two groups of social companions differing in numerosity (4 vs. 6) were presented at the two ends and the proportion of time spent near the larger shoal was taken as a measure of the subjects' numerical acuity. Dyads were significantly better than single fish (b). Real dyads were also more accurate than the “average of two” simulated data set but did not differ from the “better of two” simulated data set [Figure drawn by CA].

The larger shoal was significantly preferred by dyads (one sample t-test, t(49) = 2.741, p = 0.009) but not by individually tested fish (t(43) = 0.128, p = 0.898), which spent an equal amount of time with the two shoals (see Figure 1b). No difference was found between familiar and unfamiliar pairs (independent t-test, t(48) = 0.324, p = 0.747). Dyads spent a significantly larger proportion of their time with the larger shoal than did the single fish (independent t-test, t(92) = 1.727, p = 0.043).
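The one sample t-tests reported here compare the proportion of time spent near the larger shoal with the chance level of 0.5. The following is a minimal sketch of that calculation; the function name `one_sample_t` and the example values are hypothetical and are not the study's data, and the sketch returns only the t statistic and its degrees of freedom, not a p value.

```python
import math
import statistics


def one_sample_t(values, mu0=0.5):
    """Return (t, df) for a one-sample t-test of `values` against `mu0`."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD with an n - 1 denominator
    if sd == 0:
        raise ValueError("all observations are identical; t is undefined")
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical proportions of time near the larger shoal (illustration only).
example = [0.55, 0.60, 0.45, 0.70, 0.50]
t_value, df = one_sample_t(example)
```

A positive t indicates that the mean proportion lies above chance; the degrees of freedom equal the number of tested units (singletons or dyads) minus one.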

Two models of collective decision making have been proposed to explain the better performance of groups in humans and animals. The first model refers to the so-called “many wrongs” principle (MW)8. When each individual makes an estimate that is a close approximation to the correct one but with some error, then, if these errors are randomly distributed around the true mean, they will cancel each other out and the whole crowd will be more accurate than most, if not all, of its single members. The MW principle has been suggested as an explanation for the advantage of group navigation in pigeons9. To be effective, this mechanism depends on large numbers of individuals in the crowd. For dyads, it predicts that the group accuracy will be the average of its members' accuracies. The second model might be called “Meritocratic Leadership” (ML) and applies if some members are more accurate than others at accomplishing the task. In this scenario the group would enjoy an advantage provided that collective decisions are guided by its best performing members. This mechanism is thought to underlie collective decisions in honeybees, where a few informed individuals can bias the decision of the whole group10.

To test these two hypotheses quantitatively, we generated two sets of “simulated dyads”: we sampled the data from individually tested fish by randomly selecting two individuals and assigning them to a dyad. For one set of simulated dyads, we calculated the average accuracy of the two members as the dyadic accuracy; for the other set, we assigned the more accurate member's accuracy to the simulated dyad. Real dyads were more accurate than the “average of two” simulated data set (t(98) = 2.136, p = 0.035, Figure 1b, grey bars) but did not differ from the “better of two” simulated data set (t(98) = 0.730, p = 0.467), thus providing indirect evidence against the MW hypothesis. This conclusion is supported by a Bayesian model selection procedure. The Bayes factor11, computed from the BIC (Bayesian Information Criterion), shows that the ML model is 4.64 times more likely than the MW model to explain the performance of the dyads in our experiment (by convention12 there is evidence for a model against an alternative when this value is greater than 3). The performance of the dyads fits better with the “better of two” than with the “average of two” data set. This finding suggests that, within a dyad, the better individuals emerge spontaneously as leaders.
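The construction of the two simulated data sets can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in the study: the function name `simulate_dyads`, the fixed seed and the choice to draw each pair independently (so that the same fish may appear in more than one simulated dyad) are all hypothetical, since the text does not specify these details.

```python
import random


def simulate_dyads(individual_scores, n_dyads, seed=0):
    """Pair individually tested fish at random and score each pair two ways.

    Returns two lists of length `n_dyads`:
    the "average of two" scores (MW-style prediction) and
    the "better of two" scores (ML-style prediction).
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    average_of_two, better_of_two = [], []
    for _ in range(n_dyads):
        # Assumption: two distinct individuals per pair, pairs drawn independently.
        a, b = rng.sample(individual_scores, 2)
        average_of_two.append((a + b) / 2)
        better_of_two.append(max(a, b))
    return average_of_two, better_of_two


# Hypothetical singleton accuracies (proportion of time near the larger shoal).
singleton_scores = [0.40, 0.50, 0.60, 0.70]
avg_set, best_set = simulate_dyads(singleton_scores, n_dyads=50)
```

By construction, every “better of two” score is at least as large as the matching “average of two” score, which is why the two simulated sets bracket the MW and ML predictions.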

Experiment 1 demonstrated that dyadic interaction enhances numerosity discrimination in a socially and ecologically relevant context, which is directionally specific (i.e. choose the larger quantity). One may argue that these features render the collective benefit observed in Experiment 1 extremely specific. In Experiment 2, we sought more general evidence for dyadic collective benefit, by successfully training thirty adult male guppies to discriminate between two numerosities presented in a nonsocial context (Figure 2). These stimuli consisted of different numbers of black dots on a white background with a 1:2 ratio (5 vs. 10 or 6 vs. 12 dots). Stimuli were controlled for non-numerical cues such as surface area, density and the overall space occupied by the dots. Following a recently developed procedure, each fish was housed in a rectangular tank13. Two stimuli were introduced at the opposite ends of the tank and food was delivered near the stimulus to be reinforced. Half of the fish were reinforced for the larger and half for the smaller number. Fish received 12 reinforced trials divided equally across three consecutive days. Learning was then assessed by measuring the proportion of time spent near the rewarded stimulus in four probe trials with no reinforcement.

Figure 2

Experimental apparatus used in experiment 2.

Subjects were housed in an experimental tank for the duration of the experiment. Stimuli (groups of dots differing in numerosity) were presented at the two ends of the tank and food was provided only in correspondence to the reinforced numerosity [Figure drawn by MEMP].

After the training, 10 subjects were randomly assigned to the “individual” condition and 20 subjects were paired to form 10 dyads and assigned to the “collective” condition. Single fish and dyads completed 12 non-reinforced probe trials with two novel ratios: six trials with a 2:3 ratio (8 vs. 12 dots) and six with a 3:4 ratio (9 vs. 12 dots), distributed across 4 days and alternated with reinforced trials presenting the 1:2 ratio stimuli of the learning phase. The 2:3 ratio (8 vs. 12 dots) corresponds to the upper limit in the ability of fish to discriminate quantities beyond 4 and 3:4 is just above this threshold5,13.

Dyads did significantly better than singletons (ANOVA, F(1,18) = 6.492, p = 0.020) and there was a significant effect of numerosity ratio (F(1,18) = 10.282, p = 0.005; Interaction F(1,18) = 0.011, p = 0.917, Figure 3a). Discrimination was above chance for 2:3 ratio both by singletons (one sample t-test, t(9) = 2.782, p = 0.021) and by dyads (t(9) = 5.920, p < 0.001), while the 3:4 ratio was only discriminated better than chance by dyads (t(9) = 2.459, p = 0.036) but not by singletons (t(9) = 0.422, p = 0.683, Figure 3a).

Figure 3

Results of experiment 2.

Both singletons and dyads were able to distinguish a 2:3 ratio; by contrast, only dyads were able to distinguish the 3:4 ratio reliably (a). The accuracy of the dyads was superior to the average accuracy of the two members when performing individually, but did not differ from the accuracy of the better members (b), in agreement with the ML model [Figure drawn by MEMP].

The simulation results based on data from Experiment 1 provided only indirect evidence in favour of the ML model. After the dyadic test phase, all 30 fish were singly housed and their individual performance assessed for the 2:3 and 3:4 ratios with the same procedure described above. This added step allowed us to test the MW and ML hypotheses directly. The MW hypothesis predicted that dyadic accuracy should be close to the average accuracy of the two members, whereas the ML hypothesis predicted that dyads should match the accuracy of the better member. The accuracy of the dyad was superior to the average accuracy of the two members when performing individually (repeated measures ANOVA F(2, 18) = 5.65, p = 0.012; post-hoc LSD p = 0.045; Figure 3b) but did not differ from that of the better member (p = 0.95, Figure 3b). The Bayes factor (BIC) indicates that the ML model is 5.57 times more likely than the MW model.
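A Bayes factor is commonly approximated from the BIC values of two competing models as exp(ΔBIC / 2), where ΔBIC is the BIC of the alternative model minus the BIC of the favoured model. The sketch below illustrates this standard approximation; whether exactly this formula was used here is an assumption, and the function name `bayes_factor_from_bic` and the BIC values are hypothetical.

```python
import math


def bayes_factor_from_bic(bic_alternative, bic_favoured):
    """Approximate the Bayes factor in favour of the model with the lower BIC.

    Uses the standard approximation BF = exp((BIC_alternative - BIC_favoured) / 2).
    """
    return math.exp((bic_alternative - bic_favoured) / 2)


# Hypothetical BIC values chosen so that their difference is 2 * ln(5.57).
bic_mw = 100.0 + 2 * math.log(5.57)  # alternative model (MW), illustration only
bic_ml = 100.0                       # favoured model (ML), illustration only
bf_ml_over_mw = bayes_factor_from_bic(bic_mw, bic_ml)
```

Under this approximation, a Bayes factor of 5.57 corresponds to a BIC difference of about 3.43 in favour of the ML model, and the conventional threshold of 3 corresponds to a BIC difference of about 2.2.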

Thus, Experiment 2 replicated the finding that guppy dyads perform better in numerosity discrimination than singleton guppies. We found that dyads had a reliably higher numerical acuity beyond what is typically found in equivalent experiments with singletons both in this study and previously3,7,14. Moreover, the second experiment deployed a training paradigm which carefully balanced stimuli and task (choosing the larger or the smaller number) similar to what has been classically employed to study numerical abilities in birds15 and primates16,17.


Numerosity discrimination has been repeatedly demonstrated in many animal species including invertebrates18, amphibians and mammals19, as well as primates and fish. It has been shown to be beneficial in several ways. For example, bees enumerate the number of landmarks encountered during flight to relocate a food source20; fish use it to select the larger, safer shoal3; lions use it to decide whether to fight or to flee21. In this study we found that guppies tested singly show the same average discrimination limit observed in other teleost species, i.e. a 1:2 ratio14, though singleton guppies have been found previously to manage a 2:3 ratio5. However, when tested in pairs they were able to discriminate numerosities with a 3:4 ratio, a numerical acuity that mammals17,22 and birds23 exhibit only after extensive training.

The results of Experiment 2, together with the evidence from the simulation based on data from Experiment 1, clearly reject the MW hypothesis as the mechanism underlying collective benefit at the dyadic level24. The critical point that renders this mechanism inadequate for explaining these results is the dyadic nature of the collective decisions studied here. The larger the number of agents in the collective, the more successful this mechanism is expected to be. With only two agents involved in collective decisions, this model predicts that the dyadic performance will be determined by average member accuracy, which was not the case in either experiment. The results are consistent with the idea that dyad performance is determined by the better member taking a leadership role. We know that leadership can emerge spontaneously in the shoaling behaviour of teleost fish25 but, once again, this type of emergent phenomenon also depends critically on group size and the computational models explaining effective leadership2 have not been tested in group sizes of N < 10. Indeed, social learning in guppies mediates meritocratic leadership where younger shoal mates learn to follow older or more successful foragers26,27. Exactly how the pairs of guppies tested here assigned the leadership role between them cannot be determined from the data presented here.

By verbally exchanging decision confidences, human decision makers with similar competence levels achieve a collective benefit over and above their best individual28,29. But whether the behaviour of guppy dyads implies social confidence sharing is unknown and beyond the scope of this study. Computationally, the confidence sharing model29 does not offer a fixed prediction for dyadic collective benefit but instead suggests that collective benefit is proportional to the similarity of competence between dyad members. But a test of the role of similarity in confidence sharing requires far more test trials than were administered in our paradigms. Whether special physical and/or social cues are employed in determining the dyadic leader is an intriguing and important question for future research. The counterintuitive fact, now well supported, is that collaboration among pre-social animals is observable with the minimum possible social group size of two agents.


Ethics Statement

The experiments comply with all laws of the country (Italy) in which they were performed (D.M. 116192) and were approved by the ‘Ministero della Salute’ (permit number: 6726-2011).

Experiment 1. Spontaneous discrimination of shoals differing in numerosity


Adult female guppies (Poecilia reticulata) were stocked at the Laboratory of Comparative Psychology (University of Padova) and maintained for one month in 150-l stock aquaria containing mixed-sex groups (15 individuals with approximately a 1:1 sex ratio). Subjects were descendants of wild-caught fish from the Lower Tacarigua River (Trinidad). Aquaria were provided with natural gravel, an air filter and live plants. Both stock aquaria and experimental tanks were maintained at a constant temperature of 25 ± 1°C and a 14:10 h light:dark (L:D) photoperiod with an 18-W fluorescent light. Before the experiment, fish were fed twice daily to satiation with commercial food flakes and live brine shrimp (Artemia salina nauplii).

We tested a total of 150 subjects (ranging from 3 to 6 cm in length). Forty-six subjects were tested singly, while 104 fish were tested in pairs (n = 52). In order to investigate the potential role of familiarity, half of the pairs (n = 26) were composed of familiar individuals (fish living in the same tank for at least 20 days), while the other half were composed of non-familiar fish. As no difference between familiar pairs and non-familiar pairs was found, the two groups were pooled together in the main analysis. All stimulus shoals were composed of non-familiar individuals.


The experimental apparatus was one we have used previously to study numerical competence in adult guppies and was composed of three adjacent tanks5 (see Figure 1a). The central one, called the “subject tank”, was 36 × 60 × 35 cm. At the two ends, facing the subject tank, there were two “stimulus tanks” (36 × 10 × 35 cm) into which two shoals differing in numerosity were placed. The water level was 10 cm and the walls were covered with green plastic to prevent subjects from seeing outside. Stimulus and subject tanks were lit by one fluorescent lamp, with water maintained at a temperature of 25 ± 2°C.


In most social fish, single individuals that happen to be in an unknown environment tend to join other conspecifics and, if choosing between two shoals, they exhibit a preference for the larger one to reduce predation risk3,5,6. We used this spontaneous tendency to go to the larger shoal to study quantity discrimination in guppies.

Fish were presented with the same numerical contrast (4 vs. 6) and tested in two different conditions: singletons vs. dyads. In the singleton condition the subject was introduced into a hollow cylinder in the center of the subject tank. After 2 min the cylinder was carefully raised and the subject was allowed to acclimate for 2 min. After this period the subject was observed for 15 min. Shoal preference was calculated as the time spent by the subject shoaling within a distance of 11 cm from the glass facing the stimulus tanks (preference area). Subjects that did not visit each stimulus sector at least three times (2 singletons and 2 dyads) were discarded.

The testing procedure for dyads was the same. Two subjects were simultaneously inserted in the subject tank and their behavior was observed for 15 min. Shoal preference was calculated as the time spent by both subjects together in the same preference area. When the subjects were in opposite preference areas, or when only one subject was in a preference area, that time was not included in the analysis.
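The scoring rule for dyads can be sketched as follows, assuming that the position of each fish is coded at regular time steps as "large" (preference area of the larger shoal), "small" (preference area of the smaller shoal) or None (neither area). This coding scheme and the function name `dyad_preference` are hypothetical and serve only to illustrate the rule.

```python
def dyad_preference(zones_a, zones_b):
    """Proportion of jointly scored time that a dyad spent near the larger shoal.

    A time step counts only when both fish are in the same preference area;
    steps with the fish in opposite areas, or with only one fish in an area,
    are excluded. Returns None if no time step qualifies.
    """
    if len(zones_a) != len(zones_b):
        raise ValueError("both fish need the same number of time steps")
    joint_large = joint_small = 0
    for a, b in zip(zones_a, zones_b):
        if a == "large" and b == "large":
            joint_large += 1
        elif a == "small" and b == "small":
            joint_small += 1
        # All other combinations are left out of the analysis.
    total = joint_large + joint_small
    return joint_large / total if total else None


# Hypothetical coding of five time steps for the two members of a dyad.
fish_a = ["large", "large", "small", None, "small"]
fish_b = ["large", "small", "small", "large", "small"]
preference = dyad_preference(fish_a, fish_b)  # one joint "large" step, two joint "small" steps
```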

Experiment 2. Trained abilities to discriminate between sets of dots differing in numerosities


Subjects were 50 adult male guppies.

Apparatus and stimuli

The experimental apparatus was previously used to study numerical competence in a closely related species13 (see Figure 2). It was composed of a 50 × 19 × 32 cm tank filled with gravel and 24 cm of water. The long walls were covered with green plastic material and the short walls were covered with white plastic material. To reduce the potential effects of social isolation30, two mirrors (29 × 5 cm) were placed in the middle of the tank. An artificial leaf (9 × 8 cm) was placed between the mirrors to provide some shelter for the subject. At the two ends where stimuli were presented, two ‘choice areas’ were defined by white rectangles (14 × 12 cm) covered by a green net.

Stimuli were inserted in a 6 × 6 cm square and were presented at the bottom of a 6 × 32 cm transparent plexiglass panel. They were groups of black dots differing in size on a white background. Different numerical contrasts were presented: 5 vs. 10 and 6 vs. 12 (1:2 ratio) in the training phase; 8 vs. 12 and 9 vs. 12 (2:3 and 3:4 ratios, respectively) in the test phase. Stimuli were extracted from a pool of 24 different pairs for each numerical contrast. The size and position of the dots were changed across sets. Numerosity usually co-varies with several other attributes such as the cumulative surface area, the overall space occupied by the sets, or the density of the elements and human and non-human animals can use the relative magnitude of these non-numerical cues to estimate which group is larger/smaller31,32. The two numerosities were equated for cumulative surface area by using TpsDig software33. In addition, since density and overall space encompassed by the stimuli are inversely correlated, half of the set was controlled for the overall space, whereas the second half was controlled for the density of the dots.

Eight identical experimental tanks were lit by two fluorescent lamps (36 W). Four suspended camcorders recorded the position of the subjects during the tests.


The experiment was divided into three different phases: 1) individual training, 2) test and 3) control of individual performance of dyads. During the training phase, we presented an easy numerical ratio (1:2) with the purpose of training the fish on the new task and selecting those fish that successfully accomplished it. In the test phase, we assessed whether the accuracy of fish in discriminating novel numerical ratios (2:3 and 3:4) varied when subjects were tested singly or in dyads. In the control of individual performance of dyads, subjects previously tested in dyads were observed individually in their capacity to discriminate the 2:3 and 3:4 numerical ratios, in order to assess whether the performance of dyads might be explained by a simultaneous increase in accuracy of both individuals, or by the presence of only one individual having a better performance.

Individual training

In the two days preceding the start of the training, eight fish were singly inserted into the experimental tanks in order to familiarize them with the tank. During this period, fish were fed twice a day by inserting brine shrimps near the two short walls. On days 1 to 3, fish received four trials per day (three consecutive days, for a total of 12 trials). Each trial consisted of inserting the two stimuli, hung on the short walls. Two numerical contrasts were presented in a pseudo-random sequence: 5 vs. 10 and 6 vs. 12. Half of the fish were reinforced for the larger numerosities and half for the smaller numerosities. Soon after the stimuli were inserted in the tank, the experimenter used a Pasteur pipette to release the food reward (brine shrimps) next to the reinforced numerosity; an identical pipette was used to simultaneously insert pure water close to the non-reinforced numerosity. Subjects were left free to feed for seven minutes. After this time, stimuli were removed from the tank. The inter-trial interval lasted three hours. The left-right positions of the stimuli were counterbalanced over trials.

On days 4 and 5, two probe trials were alternated each day with two reinforced trials (four probe trials in total). In probe trials (two trials with 5 vs. 10 and two trials with 6 vs. 12, presented in a pseudo-random sequence), stimuli were inserted in the tank for four minutes; no reinforcement was provided (extinction procedure) and the time spent by guppies in the ‘choice areas’ was recorded as a measure of their capacity to discriminate the two numerosities. Reinforced trials were identical to those described for days 1 to 3. To avoid the possibility of fish using the local/spatial cues of their tank, each subject was moved from one tank to another at the end of each day.

Only fish that met the learning criterion (defined as at least 60% of the time spent near the reinforced numerosity in probe trials) were selected for the test phase. Thirty fish out of 50 (60%) reached the criterion and hence started the test phase. Ten subjects were included in the singleton condition, 20 in the dyad condition.
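The learning criterion can be illustrated with a short sketch. It assumes that the criterion is computed as the time spent in the choice area of the reinforced numerosity divided by the total time spent in the two choice areas, pooled over the probe trials, and that a fish passes when this proportion is at least 0.60. These details, the function name `meets_criterion` and the example durations are hypothetical.

```python
def meets_criterion(time_reinforced, time_other, threshold=0.60):
    """Check whether a fish reaches the learning criterion.

    `time_reinforced` and `time_other` are the seconds spent in the choice
    areas of the reinforced and non-reinforced numerosity, respectively
    (assumed to be pooled over the probe trials).
    """
    total = time_reinforced + time_other
    if total <= 0:
        return False  # the fish never entered a choice area
    return time_reinforced / total >= threshold


# Hypothetical pooled durations in seconds (illustration only).
passes = meets_criterion(150, 90)   # 150 / 240 = 0.625
fails = meets_criterion(100, 100)   # 100 / 200 = 0.50
```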


After a two-day interval (in which fish received a total of four reinforced trials, two each day), fish were divided into two groups: singletons vs. dyads. Fish included in the former group were observed individually; fish included in the latter group were tested in pairs. Dyads were assembled each morning, one hour before the beginning of the test; each fish was returned singly to its tank only in the evening (after the test). Each pair always comprised the same individuals.

Three probe trials were presented each day for four consecutive days (days 8 to 11). Fish were presented with two novel numerical ratios, 2:3 (8 vs. 12) and 3:4 (9 vs. 12), with six presentations of each ratio in a pseudo-random sequence. The inter-trial interval lasted three hours. Two reinforced trials presenting the numerical contrasts of the training (5 vs. 10 and 6 vs. 12) were alternated with the probe trials. As in Experiment 1, time spent in the choice area in the dyad condition was counted only when both subjects were simultaneously in the same choice area.

As no difference in accuracy was found between fish trained with the larger numerosity as positive (mean ± s.d.: 0.648 ± 0.118) and those trained with the smaller numerosity as positive (0.564 ± 0.069, independent t-test t(18) = 1.98, p = 0.063), the two groups were pooled together in the main analyses.

Control of individual performance of pairs

On day 12, fish tested in pairs were separated and observed individually in their ability to discriminate 2:3 and 3:4 numerical ratios. Four probe trials were presented, two of each numerical ratio; two reinforced trials presenting the numerical contrasts of the training (5 vs. 10 and 6 vs. 12) were alternated with the probe trials.

To compare the accuracy of the dyads with the average accuracy of the two members and with the accuracy of the better member, we averaged each accuracy score across the 2:3 and 3:4 ratios.