Convergence to consensus in heterogeneous groups and the emergence of informal leadership

When group cohesion is essential, groups must have efficient strategies in place for consensus decision-making. Recent theoretical work suggests that shared decision-making is often the most efficient way of dealing with both information uncertainty and individual variation in preferences. However, some animal and most human groups make collective decisions through particular individuals, leaders, who have a disproportionate influence on group decision-making. To address this discrepancy between theory and data, we study a simple but general model that explicitly focuses on the dynamics of consensus building in groups composed of individuals who are heterogeneous in preferences, certain personality traits (agreeability and persuasiveness), reputation, and social networks. We show that within-group heterogeneity can significantly delay democratic consensus building and can give rise to informal leaders, i.e., individuals with a disproportionately large impact on group decisions. Our results thus imply strong benefits of leadership, particularly when groups experience time pressure and significant conflict of interest between members (due to various between-individual differences). Overall, our models shed light on why leadership and decision-making hierarchies are widespread, especially in human groups.

At each event, a listener $i$ moves its preference $x_i$ toward the preference $x_j$ of a speaker $j$:

$$x_i' = x_i + \alpha_i \beta_j (x_j - x_i),$$

where $\alpha_i$ measures the listener's agreeability and $\beta_j$ the speaker's persuasiveness. Let $\psi_{ij}$ be a random variable equal to 1 if the event includes individual $i$ as a listener and individual $j$ as a speaker, and let $\psi_{ij} = 0$ otherwise. Let $\Psi$ be a stochastic (i.e., random) square matrix with elements $\Psi_{ij} = \alpha_i \beta_j \psi_{ij}$. Let matrix $S = I + \Psi - \mathrm{diag}(\Psi J)$, where $I$ is the identity matrix, $J$ is the column vector of 1's, and $\mathrm{diag}(\Psi J)$ is the diagonal matrix with the elements of vector $\Psi J$ on its diagonal. Note that $S$ is a stochastic matrix (each of its rows sums to 1). Then the dynamics of the vector $x = (x_1, \ldots, x_n)^T$ can be described by a system of stochastic linear equations

$$x_{t+1} = S_t x_t,$$

where $t$ specifies the event number. Let $E(S)$ be the expectation of $S$. The largest eigenvalue of matrix $E(S)$ is equal to 1. If the eigenvalue of $E(S)$ with the second largest modulus, $\lambda_2$, is smaller than 1 in absolute value, then all elements of $E(x)$ converge almost surely to a consensus value

$$x^* = \frac{\sum_i v_i x_i(0)}{\sum_i v_i},$$

where $x_i(0)$ are the initial preferences and $v$ is the left eigenvector of matrix $E(S)$ corresponding to eigenvalue 1 [1].

Assume first that each possible pair of individuals has an equal probability $\frac{1}{n(n-1)}$ of being chosen as a listener-speaker pair. In this case

$$E(S) = I + \frac{1}{n(n-1)}\left[\alpha\beta^T - (\beta^T J)\,\mathrm{diag}(\alpha)\right], \qquad \text{(S3)}$$

where $\alpha$ and $\beta$ are the vectors of $\alpha_i$ and $\beta_i$, respectively. The left eigenvector then has elements $v_i \propto \beta_i/\alpha_i$, so the expectation of the consensus value is

$$E(x^*) = \frac{\sum_i (\beta_i/\alpha_i)\, x_i(0)}{\sum_i \beta_i/\alpha_i}.$$

Next assume that at each event the listener is chosen with equal probability out of all group members, while the speaker is chosen out of the remaining $n - 1$ group members with probability proportional to their status $\rho_i$ ($\sum_i \rho_i = 1$). In this case $\psi_{ij} = 1$ with probability $\frac{1}{n}\frac{\rho_j}{\sum_{k \neq i} \rho_k} = \frac{1}{n}\frac{\rho_j}{1-\rho_i}$. Now matrix $E(S)$ is given by equation (S3) with $\alpha_i$ substituted by $\alpha_i/(1-\rho_i)$, $\beta_i$ substituted by $\beta_i\rho_i$, and the prefactor $\frac{1}{n(n-1)}$ replaced by $\frac{1}{n}$. Correspondingly, the expectation of the consensus value becomes

$$E(x^*) = \frac{\sum_i \left[\beta_i\rho_i(1-\rho_i)/\alpha_i\right] x_i(0)}{\sum_i \beta_i\rho_i(1-\rho_i)/\alpha_i}.$$

Assume that interactions happen on a connected social network specified by an adjacency matrix $C$ with element $c_{ij} = 1$ if a pair with listener $i$ and speaker $j$ is feasible. Assume that a listener is chosen randomly with a uniform probability, while the speaker is chosen randomly from the $d_i = \sum_j c_{ij}$ potential speakers (e.g., neighbors or friends). In this case, for each feasible pair, $\psi_{ij} = 1$ with probability $\frac{1}{n d_i}$. Then one can find $E(S)$ and $x^*$ for any given adjacency matrix $C$.
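The expected consensus values can be checked numerically. The following Python sketch is only an illustration under stated assumptions, not the authors' code: the function names (`expected_S`, `iterate_mean`, `consensus`, `simulate`) and all parameter values are hypothetical. It builds $E(S)$ from the pair-selection probabilities, iterates the deterministic system $x_{t+1} = E(S)x_t$, and compares the limit with a weighted average of initial preferences, using the weights obtained by solving for the left eigenvector of $E(S)$: $\beta_i/\alpha_i$ for uniformly chosen pairs and $\beta_i\rho_i(1-\rho_i)/\alpha_i$ for speakers chosen in proportion to status.

```python
import random

def expected_S(alpha, beta, pair_prob):
    """Expected update matrix E(S); pair_prob(i, j) = P(listener i, speaker j)."""
    n = len(alpha)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                S[i][j] = pair_prob(i, j) * alpha[i] * beta[j]
        S[i][i] = 1.0 - sum(S[i])  # each row of S sums to 1
    return S

def iterate_mean(S, x0, steps=5000):
    """Iterate the deterministic approximation x_{t+1} = E(S) x_t."""
    x = list(x0)
    for _ in range(steps):
        x = [sum(s_ij * x_j for s_ij, x_j in zip(row, x)) for row in S]
    return x

def consensus(weights, x0):
    """Weighted average of the initial preferences."""
    return sum(w * x for w, x in zip(weights, x0)) / sum(weights)

def simulate(alpha, beta, x0, events, rng):
    """One stochastic run with uniformly chosen listener-speaker pairs."""
    x = list(x0)
    for _ in range(events):
        i, j = rng.sample(range(len(x)), 2)  # listener i, speaker j
        x[i] += alpha[i] * beta[j] * (x[j] - x[i])
    return x

# Illustrative (made-up) parameters for a group of four.
alpha = [0.2, 0.5, 0.8, 0.4]   # agreeability
beta = [0.9, 0.3, 0.6, 0.5]    # persuasiveness
rho = [0.1, 0.2, 0.3, 0.4]     # status values, summing to 1
x0 = [0.1, 0.4, 0.7, 1.0]      # initial preferences
n = len(x0)

uniform = lambda i, j: 1.0 / (n * (n - 1))
by_status = lambda i, j: rho[j] / (n * (1.0 - rho[i]))

limit_uniform = iterate_mean(expected_S(alpha, beta, uniform), x0)
limit_status = iterate_mean(expected_S(alpha, beta, by_status), x0)

predicted_uniform = consensus([b / a for a, b in zip(alpha, beta)], x0)
predicted_status = consensus(
    [b * r * (1 - r) / a for a, b, r in zip(alpha, beta, rho)], x0)

print(limit_uniform[0], predicted_uniform)
print(limit_status[0], predicted_status)
print(simulate(alpha, beta, x0, 5000, random.Random(1)))
```

A single stochastic run also ends in consensus, but its value varies from run to run; only its expectation is given by the weighted averages above.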
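In general, finding or approximating the rate of convergence, and hence the time to consensus $\tau$, is nontrivial (e.g., [2, 3]). Assume that there are only $n = 2$ individuals, $i$ and $j$, and that each individual has an equal chance of being the speaker or the listener. Let $d = x_i - x_j$. Then $d_{t+1} = \xi_t d_t$, where $\xi_t$ takes the values $1 - \alpha_i\beta_j$ and $1 - \alpha_j\beta_i$ with equal probabilities. In this case, the asymptotic rate of convergence towards consensus is

$$\lim_{t\to\infty} \frac{1}{t}\ln\left|\frac{d_t}{d_0}\right| = \frac{1}{2}\left[\ln(1-\alpha_i\beta_j) + \ln(1-\alpha_j\beta_i)\right].$$

This is based on the fact that, as $t \to \infty$, the product $\prod_{k=1}^{t}\xi_k$ will have roughly $t/2$ terms $1 - \alpha_i\beta_j$ and $t/2$ terms $1 - \alpha_j\beta_i$, or

$$\prod_{k=1}^{t}\xi_k \approx \left[(1-\alpha_i\beta_j)(1-\alpha_j\beta_i)\right]^{t/2}.$$

Note that because $E(d_t) = E(\xi)^t d_0$, the rate of convergence can be approximated as

$$\ln E(\xi) = \ln\left[1 - \tfrac{1}{2}\left(\alpha_i\beta_j + \alpha_j\beta_i\right)\right].$$

The two expressions above are close if $\alpha\beta \ll 1$. With arbitrary $n$, perhaps the simplest approximation is to use the second largest eigenvalue $\lambda_2$ of matrix $E(S)$. This corresponds to approximating the dynamics of the stochastic system $x_{t+1} = S_t x_t$ with the deterministic system $x_{t+1} = E(S_t)x_t$.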
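To make the two-individual comparison concrete, the sketch below evaluates both rates: the asymptotic rate $E[\ln\xi]$ of a single stochastic run and the rate $\ln E[\xi]$ at which the expectation $E(d_t)$ decays. It is a minimal illustration; the function names and parameter values are hypothetical, and the simulation accumulates logarithms rather than the product itself so that $d_t$ does not underflow.

```python
import math
import random

def lyapunov_rate(alpha_i, beta_i, alpha_j, beta_j):
    """E[ln xi]: asymptotic per-event log rate of change of |d| in one run."""
    return 0.5 * (math.log(1 - alpha_i * beta_j) + math.log(1 - alpha_j * beta_i))

def mean_rate(alpha_i, beta_i, alpha_j, beta_j):
    """ln E[xi]: per-event log rate of change of the expectation E(d_t)."""
    return math.log(1 - 0.5 * (alpha_i * beta_j + alpha_j * beta_i))

def simulated_rate(alpha_i, beta_i, alpha_j, beta_j, events, rng):
    """Empirical (1/t) ln|d_t / d_0| from one run, summed in log space."""
    factors = (1 - alpha_i * beta_j, 1 - alpha_j * beta_i)
    return sum(math.log(rng.choice(factors)) for _ in range(events)) / events

# Hypothetical parameter sets: (alpha_i, beta_i, alpha_j, beta_j).
weak = (0.1, 0.1, 0.2, 0.05)    # alpha * beta << 1
strong = (0.9, 0.2, 0.5, 1.0)   # alpha_i * beta_j = 0.9, alpha_j * beta_i = 0.1

for name, params in (("weak", weak), ("strong", strong)):
    print(name, lyapunov_rate(*params), mean_rate(*params))

print(simulated_rate(*strong, 20000, random.Random(0)))
```

By Jensen's inequality $E[\ln\xi] \le \ln E[\xi]$, so individual runs converge at least as fast as the expectation suggests; the gap is negligible for the weak-influence parameters and substantial for the strong-influence ones.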
[From the above, we know that there is a bias even with $n = 2$.] Assume that individuals have equal agreeability $\alpha_i = \alpha$ but differ in persuasiveness $\beta_i$. In this case,

$$E(S) = I + \frac{\alpha}{n(n-1)}\left[J\beta^T - n\bar{\beta} I\right],$$

where $\bar{\beta}$ is the average of $\beta_i$. The eigenvalues of $E(S)$ are $\lambda = 1$ with multiplicity one and $\lambda = 1 - \frac{\alpha\bar{\beta}}{n-1}$ with multiplicity $n - 1$. Correspondingly, the characteristic time scale of convergence to the consensus value can be approximated as

$$\tau \approx \frac{1}{n}\,\frac{1}{1-\lambda_2} = \frac{n-1}{n\,\alpha\bar{\beta}}.$$

Note that the factor $1/n$ is present because our unit of time corresponds to $n$ events. Figure S2 shows that this approximation works reasonably well in providing a lower bound on $\tau$.
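The eigenvalue claim for equal agreeability can be verified directly. In the sketch below (hypothetical function names and parameter values), $E(S)$ is built for uniformly chosen listener-speaker pairs, and the per-event factor by which deviations from the consensus value shrink under $x_{t+1} = E(S)x_t$ is compared with the closed form $\lambda_2 = 1 - \alpha\bar{\beta}/(n-1)$. Because all non-unit eigenvalues coincide in this case, the shrinkage factor should equal $\lambda_2$ at every step. The time scale is computed as $\tau = 1/[n(1-\lambda_2)]$.

```python
def expected_S_equal_alpha(alpha, beta):
    """E(S) = I + c (J beta^T - (sum beta) I) with c = alpha / (n (n - 1))."""
    n = len(beta)
    c = alpha / (n * (n - 1))
    total = sum(beta)
    return [[(1.0 - c * total if i == j else 0.0) + c * beta[j]
             for j in range(n)] for i in range(n)]

def decay_factor(S, beta, x0, steps=50):
    """Per-event factor by which the largest deviation from consensus shrinks."""
    # With equal agreeability the consensus weights are proportional to beta_i.
    x_star = sum(b * x for b, x in zip(beta, x0)) / sum(beta)
    x = list(x0)
    prev = cur = max(abs(xi - x_star) for xi in x)
    for _ in range(steps):
        prev = cur
        x = [sum(s * xj for s, xj in zip(row, x)) for row in S]
        cur = max(abs(xi - x_star) for xi in x)
    return cur / prev

# Illustrative (made-up) parameters.
alpha = 0.5
beta = [0.2, 0.4, 0.5, 0.6, 0.8]
x0 = [0.0, 0.3, 0.5, 0.9, 1.0]
n = len(beta)
beta_bar = sum(beta) / n

lambda_2 = 1.0 - alpha * beta_bar / (n - 1)   # closed form
tau = 1.0 / (n * (1.0 - lambda_2))            # in units of n events
measured = decay_factor(expected_S_equal_alpha(alpha, beta), beta, x0)

print(lambda_2, measured, tau)
```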
If individuals also differ in agreeability, one has to find the eigenvalues of matrix (S3). The characteristic equation of this matrix is known (e.g., [4]), which allows one to approximate $\lambda_2$. However, we find that a simple approximation based on $a_h$, the harmonic mean of $\alpha_i$, does a pretty good job of providing a lower bound on $\tau$ (see Figure S2 and Figure S4). If individuals differ only in their status $\rho_i$ but not in agreeability or persuasiveness ($\alpha_i = \alpha$, $\beta_i = \beta$), our approximation based on $\lambda_2$ of $E(S)$ predicts $\lambda = 1 - \frac{\alpha\beta}{n-1}$ and thus does not capture the dependence of $\tau$ on the distribution of $\rho_i$. Nevertheless, the approximation provides a reasonable lower bound on $\tau$ (see Figure S6). Figure S8 numerically explores the dependence of $\tau$ on several parameters in more detail.
If individuals differ with respect to agreeability $\alpha_i$, persuasiveness $\beta_i$, and reputation $\rho_i$, and speakers are chosen with a probability proportional to their reputation, then the time to convergence can be approximated in terms of the harmonic mean of the $\alpha_i/(1-\rho_i)$ values and $\beta_\rho$, the average of the $\beta_i\rho_i$ values over the group.

Multiple listeners
Assume that each speaker is listened to not by one but by $\ell$ listeners. Figure S10 shows that with $\ell$ listeners ($\ell = 2, 4, 8$) the time to convergence decreases approximately in proportion to $1/\ell$.

Interactions on social networks
These can be incorporated by postulating that the probability that listener $i$ chooses speaker $j$ depends on $i$: $\psi_{ij} = 1$ with probability $\frac{c_{ij}}{n d_i}$, where $C = (c_{ij})$ is the adjacency matrix specifying the social network and $d_i = \sum_j c_{ij}$. For example, in the simple case of a star network, in which individual #1 is connected to all other individuals and there are no other connections, . . .
Assuming no variation in agreeability and persuasiveness, the characteristic time scale is $\tau \approx \frac{1}{\alpha\beta}$, which is slightly longer than that in the basic case of a fully connected network (eq. S6). In the case of a circular network, where each individual is connected only to its left and right neighbors, $E(S)$ is a circulant matrix, which allows one to find its eigenvalues analytically. For example, with $n = 4$, 6 and 8, $\tau = 1/(\alpha\beta)$, $2/(\alpha\beta)$ and $(2 + \sqrt{2})/(\alpha\beta)$, respectively. In the case of the star network, the central individual has an $(n - 1)$ times larger impact on the consensus value than each peripheral group member. In the case of a circular network, individual impacts on the consensus value are naturally the same.
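The star and circular cases can be reproduced with a few lines of code. The sketch below is an illustration only: the function names are hypothetical, it assumes (as stated for networks earlier) a uniformly chosen listener and a speaker chosen uniformly among the listener's $d_i$ neighbors, and it takes $\tau = 1/[n(1-\lambda_2)]$. To stay within the standard library it uses plain power iteration rather than a linear-algebra package: the left eigenvector for eigenvalue 1 gives the individual impacts on the consensus value, and $\lambda_2$ is found by power iteration after removing the consensus component.

```python
import math

def expected_S(C, alpha, beta):
    """E(S) on a network with adjacency matrix C and equal alpha, beta."""
    n = len(C)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        d_i = sum(C[i])  # number of potential speakers for listener i
        for j in range(n):
            if C[i][j]:
                S[i][j] = alpha * beta / (n * d_i)
        S[i][i] = 1.0 - sum(S[i])
    return S

def impact_weights(S, iters=5000):
    """Left eigenvector of S for eigenvalue 1, normalised to sum to 1."""
    n = len(S)
    v = [1.0 / n] * n
    for _ in range(iters):
        v = [sum(v[i] * S[i][j] for i in range(n)) for j in range(n)]
    return v

def time_scale(S, v, iters=5000):
    """tau = 1 / (n (1 - lambda_2)), with lambda_2 from deflated power iteration."""
    n = len(S)
    d = [math.sin(1.0 + i) for i in range(n)]  # arbitrary starting vector
    lam2 = 0.0
    for _ in range(iters):
        shift = sum(vi * di for vi, di in zip(v, d))  # consensus component
        d = [di - shift for di in d]
        norm = math.sqrt(sum(di * di for di in d))
        d = [di / norm for di in d]
        d = [sum(s * dj for s, dj in zip(row, d)) for row in S]
        lam2 = math.sqrt(sum(di * di for di in d))
    return 1.0 / (n * (1.0 - lam2))

def star(n):
    """Individual 0 is connected to everyone else; no other connections."""
    return [[1 if (i == 0) != (j == 0) else 0 for j in range(n)] for i in range(n)]

def ring(n):
    """Each individual is connected to its left and right neighbours."""
    return [[1 if (i - j) % n in (1, n - 1) else 0 for j in range(n)] for i in range(n)]

alpha, beta = 0.5, 0.5  # illustrative values
S_star = expected_S(star(6), alpha, beta)
v_star = impact_weights(S_star)
print("star: tau * alpha * beta =", time_scale(S_star, v_star) * alpha * beta)
print("star: centre / periphery impact =", v_star[0] / v_star[1])
for n in (4, 6, 8):
    S_ring = expected_S(ring(n), alpha, beta)
    print("ring", n, time_scale(S_ring, impact_weights(S_ring)) * alpha * beta)
```

The same three functions apply to any adjacency matrix, which is how the more irregular networks considered below could be explored under the same assumptions.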
In more general cases, the matrix $E(S)$ loses this simple structure, but the system remains linear. Figures S11, S12, and S13 show the dependence of $\tau$ on parameters in different networks.

Figure S8. Comparison of approximation (S7) with numerical results. All $\alpha_i$ and $\beta_i$ are drawn from uniform distributions on $0.5 \pm \Delta_\alpha$ and $0.5 \pm \Delta_\beta$, respectively. Initial preferences are drawn randomly from a uniform distribution on the interval (0,1), and reputation values $\rho_i$ are chosen randomly from the symmetric Dirichlet distribution with different concentration parameters $\gamma$. When $\gamma$ is small, most individuals have very low reputation while a few individuals have a large reputation. The dashed line shows the diagonal. 200 parameter combinations; 200 runs for each parameter combination. Different group sizes correspond to different symbols: green circles ($n = 4$), blue squares ($n = 8$), purple diamonds ($n = 16$), red crosses ($n = 32$), orange asterisks ($n = 64$). Simulated results were scaled by $n$; $\mu = 1$.

… circles correspond to $\gamma = \infty$ (equal reputations $\rho_i = 1/n$), the lines with diamonds $\gamma = 10$ (uniform distribution of $\rho$), crosses $\gamma = 1$, and squares $\gamma = 0.1$ (a few highly reputable individuals). 10,000 simulations were run for each parameter combination and initial preferences were drawn from a uniform distribution on the interval (0,1).

The lines with crosses correspond to $\ell = 8$, the lines with circles $\ell = 4$, diamonds $\ell = 2$, and the squares one listener per interaction. Results were scaled by $\ell/n$. 10,000 simulations were run for each parameter combination and initial preferences were drawn from a uniform distribution on the interval (0,1).

Figure S12. The average time to reach consensus for Erdős–Rényi networks with different probabilities of connectedness and no variation in personality traits ($\alpha$ and $\beta$). The squares represent the average time to reach consensus for a complete network, the diamonds denote networks with a probability of connectedness of $8/n$, and the circles denote networks with a probability of $4/n$. A lower probability of connectedness increases the time to reach consensus dramatically but appears to have a reduced effect for larger group sizes. This could be due to networks having stubborn individuals with few connections, an effect that is mitigated in larger groups because they have more interactions per time step. 5,000 simulations were run for each parameter combination with random formation of dyads and initial preferences were drawn from a uniform distribution on the interval (0,1).

Reducing the size of the initial complete network $m_0$, and therefore reducing the number of highly connected individuals, dramatically increases the time to reach consensus, similar to the effect of reducing the number of connections in the other models. 5,000 simulations were run for each parameter combination with random formation of dyads and initial preferences were drawn from a uniform distribution on the interval (0,1).

Figure S14. Average sampling errors, $(x_i - x_{opt})^2$, for individuals attempting to estimate the average initial preference $x_{opt}$ by randomly sampling $n'$ group mates in a group of size $n$. Individuals are ranked by their relative agreeability within the group from the lowest on the left to the highest on the right. Individual agreeability was chosen with a uniform probability from the interval (0,1) and persuasiveness was fixed at 1. 5,000 runs for each parameter combination with random formation of dyads and initial preferences drawn from a uniform distribution on the interval (0,1).