Introduction

Several mechanisms and models have been implemented to explain the collective social behavior that arises from the interactions among individuals. The own experience and the experiences of others play an important role in determining the people choices in almost all human interactions. Imitation has been a widespread mechanism of human decision-making. Imitation of of common behavior reflects social influence in the individual, while imitation by others of a successful individual is of strategic nature1,2,3,4. Strategic interactions are often modeled by Game Theory. A relevant game theoretical model that describes many real-life interactions in which the best course of action is to conform to a consensus is the coordination game. The challenge of such model is how to coordinate among its multiple Nash equilibria5. This issue has been addressed in several works focusing on coordination games in a network framework6,7,8,9. However, two relevant aspects of this context have been largely unexplored.

First, the study of a kind of interactions in which individuals distinguish according to their roles between people with whom they play to obtain a payoff and those from whom they learn to update their strategies. An appropriate framework is needed to deal with the possibility that people may identify the kind of interaction they have with their partners. Such situations are very common and pertinent in real-life interactions. For example, the interactions between and within firms and consumers, employers and employees, governments and citizens, teachers and students, parents and children, medical doctors and patients. There individuals interact across groups and receive a payoff for such interactions (for instance parents with children) and look inside their group to learn and update their strategies (for instance parents learn from other parents and children learn from other children). What we have are the situations in which two populations are differentiated by the role that their individuals perform. In simple models of social networks individuals are unable to encompass different types of relationships. They play with and learn from the same set of neighbors. A different class of networks that have layers in addition to nodes and links, has been growing in popularity because of being a better description of a real networked society. The study and analysis of multilayer networks is relatively recent even though layered systems were examined decades ago in disciplines like sociology and engineering10,11,12, for a complete review see13. Here we propose a two-layer network in which inside each layer, individuals update their strategies by a rule of learning and across layers individuals receive an aggregate payoff by playing a coordination game. Most previous studies of games in multilayer networks14,15,16,17 consider playing the game inside the layers while we consider game theory interactions across layers. In a recent work18 the authors consider a two-layer network wherein one network layer is used for the accumulation of payoffs playing a social dilemma game and the other is used for strategy updating. There, each agent is simultaneously located on both layers. In contrast, in our two-layer network, each agent is located in just one layer. Therefore, there are two learning networks, one in each layer and a playing network across the two layers.

The second aspect refers to elucidate what happens when people make decisions heeding simultaneously social and strategic motivations4. In situations that call for accomplishing social efficiency and consensus two forces influence agent's choices: the strategic reasoning and the social pressure of the environment. In the sociological context, Granovetter19 proposed a model in which a certain amount of social pressure is necessary for a person to adopt a new idea, product or technology. Opinion, innovation spreading and social learning models have been dealing with this issue measuring the social pressure as the number of contacts that have already adopted the newness19,20,21,22. Here, we consider that the influence of social pressure is related with the degree of doubts about the strategies currently being played. Traditionally, the degree of doubts is measured as the subjective belief about the consequences of a certain action23. However, we assume doubts as a social factor influencing choices in strategic environments. Then, the doubts of an agent about how well she is playing depend on the popularity of her current strategy in her learning network. Our approach of doubts is inspired by the work of24. They introduce an evolutionary model of doubt-based selection dynamics. As well as24, we assume that the agents measure their doubts by observing the choices made by their fellow agents. Real-life interactions and laboratory experiments25,26,27 provide clear evidence of the importance of analyzing evolutionary dynamics based on social and strategic factors. For instance, in4,28 the authors explore the interplay between strategic and social imitative behaviors in a coordination problem on a social network and in a networked Prisoners' Dilemma respectively. In these works agents can evolve by a mixed dynamics of the voter model22,29 and the unconditional imitation. One of the main results in coordination games on complex networks is that the interplay of social and strategic imitation drives the system towards global consensus while neither social or strategic imitation alone does. Our approach aims to deal with these two important aspects mentioned above and verify the circumstances in which the complexity of such social and strategic behavior leads to the consensus on the whole society.

Results

Model description

In this paper we consider a two-layer network in which each individual is connected to two different social networks, the interlayer network or playing network and the intralayer network or learning network, see Fig. 1. In the playing network, each player interacts according to a coordination game with each of her neighbors using the same action for all those games. A normal form representation of this two-person, two-strategy coordination game is shown in Table 1. We focus our analysis in two parametric settings, a pure or symmetric coordination game in which a = d = 1 and b = 0 and a general or asymmetric coordination game in which a = 1, d = 2 and b > 0. The profiles (L, L) and (R, R) are the two Nash equilibria in pure strategies in both settings. Now, in the general coordination game the agents get a higher payoff by playing (R, R), the Pareto (payoff) dominant equilibrium while for b > 1 they risk less by coordinating on (L, L), called the risk dominant equilibrium. Games of this type are more interesting than their fully symmetrical versions as it is added a confidence problem when the socially efficient solution is also the riskier one.

Table 1 Pay-off matrix for a general two-person, two-strategy coordination game
Figure 1
figure 1

Sketch of a multilayer network that we consider in this paper.

The nodes are connected to each other in a pairwise manner both inside of the layers and between the layers for two populations A and B. Dotted lines describe the playing network (i.e. interlayer edges) and the solid lines describe the learning network (intralayer edges). The black nodes describe the agents playing strategy L and white nodes the agents playing strategy R in a coordination game.

Doubts and the parameter T

In the learning network, we propose an evolutionary update rule that heeds strategic thinking and the doubts that are generated by the popularity of the strategies. In order to describe this aspect in detail we provide some definitions. As Cabrales and Uriarte24, we assume that the doubts felt by an agent are related to the proportion of individuals with whom they interact who are equally using the same strategy. Our approach differs from24 since while those authors assume that the agents are endowed with a doubt function, we assume that they are endowed with a quantity T that calibrates their level of doubts about the collective wisdom of crowd, T [0, 1]. This parameter T is in the same line of the threshold value in19. Just as in24, we may distinguish two broad types of population, each corresponding to a doubtful behavior. A herding population, for T < 0.5, is a population in which agents rely on the wisdom of crowd. As a consequence, they are strongly influenced by the popularity of the current strategies of their partners. A skeptical population, for T ≥ 0.5, is a population in which agents are very suspicious of the wisdom of crowd: they are slightly influenced by the popularity of the current strategies of their partners. In the updating process, each player i observes the proportion of agents, di, who are playing the opposite strategy to hers in her learning neighborhood. Then, she measures how popular her strategy is, comparing di with T. For instance, when di > T, player i has doubts about the popularity of the strategy she is currently playing.

The degree of dissatisfaction

The evolution on time of the strategies derives from the levels of dissatisfaction felt by the agents. The criterion that defines the level of satisfaction of an agent is based on two key points: how well she is doing in terms of the payoff obtained in her playing network and how popular her current strategy is in her learning network. Our approach of satisfaction is quite different from24 where they justify the choice of an index of dissatisfied agent via a model of (correlated) similarities relations and from9 that define a quantity called satisfaction based on the strengths of the links. In our approach we distinguish four categories of agents as described in Table 2, where πi is the aggregate payoff of agent i and ni is her degree in the playing network.:

Table 2 Degrees of satisfaction

The value of β is derived from the parametric setting of the class of coordination game played. Since in the pairwise interaction of pure coordination games each player gets a payoff of 1 by coordinating and 0 otherwise, then β = 1 for such game. The equality πi = ni means that the player i coordinates with all her neighbors in the playing network: then we say that agent i is strategically satisfied. In the case of a general coordination game, β = 2 and an agent is strategically unsatisfied when she fails to coordinate with all her neighbors on the socially efficient solution, i.e. the Pareto dominant strategy. This will happen when in a time step πi < 2ni. When di < T the proportion of neighbors in her learning network who play the same strategy as she does is high enough so that player i feels socially satisfied with her current strategy. Then, the level of satisfaction of an agent i is: S (satisfied) when she is both socially (di < T) and strategically (πi = βni) satisfied, is P1 or P2 (partially satisfied) when she is either socially (di > T) or strategically (πi < βni) unsatisfied and is U (unsatisfied) when she is both socially (di > T) and strategically (πi < βni) unsatisfied.

The strategic update rule

We propose a synchronous update rule in which each player can change her current strategy according to her level of satisfaction. Namely,

  1. 1

    If her level of satisfaction is S, she remains with the same strategy.

  2. 2

    If her level of satisfaction is P1 or P2, she imitates the strategy of her best performing neighbor in her learning network when such neighbor has received a larger payoff than the player herself, otherwise she remains with the same strategy.

  3. 3

    If her level of satisfaction is U, she changes her current strategy.

This rule might resemble the well-known unconditional imitation (UI) update rule introduced in30. When agents follow the (UI) update rule, they seek to maximize their payoffs imitating the most successful individuals. However, the first important difference in our update rule is that individuals change their strategies conditional to their social or strategic dissatisfaction. Some experimental results show evidence of the use of the (UI) rule by individuals but also provide evidence that other social factors are influencing the updating process25,26,27. Other important difference is the environment in which learning takes place. Since individuals discriminate from whom they learn and with whom they play, this update rule only takes place in the learning networks. The proposed update rule aims to capture the individual behavior in a complex real life situation. Having setting out our strategic and social framework, we now turn to describe the evolutionary dynamics. At each elementary time step, each player plays the coordination game with each one of her interlayer neighbors. Once the game is over and a payoff is assigned to each player, each agent, observing her intralayer neighbor,s might change her strategy according to her level of dissatisfaction. The process is repeated setting payoffs to zero.

Simulation settings

The size of the populations A and B during simulations is NA = NB = 1000. The numerical results are obtained for random (Erdös- Rényi, ER) networks and fully connected networks. In the learning networks, kAA and kBB represent the mean degree (average number of links per node) for population A and B respectively. In the playing network, the mean degree kAB corresponds to the average number of links per node across populations A and B. The two strategies of the coordination games are L and R which are initially uniformly randomly distributed with proportion 0.5.

Results for pure coordination games

As a benchmark, it is helpful to remind the final configuration of a structured population playing a pure coordination game with the (UI) as update rule. The topology will define the outcome of such population. For instance, for a complete network, referred also as a fully connected network, in which each agent interact with every other agent, full coordination is reached in one time step, while for a social network displaying local connectivity, such as the random (ER) network, the system evolves to a non-coordinated frozen state. For the study of our model we focus on these two network topologies. Our simulation results show that the combination of strategic and social factors in a multilayer network drives the system to quite different outcomes than those ones. Before displaying the results, we need to clarify what a complete network means in our context of multilayer network. A complete network here implies that every agent plays with every other agent in the playing network and learns from every other agent in the learning network. Agents still discriminate between with whom they play and from whom they learn. Moreover an absorbing state in this framework is a state of intralayer coordination. In this state the agents are socially satisfied since inside each layer the same strategy is spreading all over the network. A state of interlayer coordination is a state of intralayer coordination in which the strategy displayed in one layer coincides with the strategy reached in the other layer: agents are socially and strategically satisfied. However, when the strategy in one layer is the opposite to the one in the other layer, the social satisfaction of agents makes the strategies to remain unchanged and the configuration of a polarized two-layer network is an absorbing state of the dynamics. In summary, a state of interlayer coordination implies a state of intralayer coordination. but the reciprocal is not necessarily fulfilled. Both interlayer coordination (or full coordination) and intralayer coordination are absorbing states of the dynamics.

The final configurations of the system can be described by the intra (inter) active links defined as the number of links connecting agents with different choices in the playing (learning) network. Figure 2 shows the average of the proportions of active links nA(T), inter layers between populations A and B and intra layer for each population, A and B, for T [0.4, 1] in the fully connected network (left panel) and in the random (ER) network (right panel). We find that for herding populations, T < 0.5, the final configuration of the system is a state of non-coordination in both the learning network (intralayer) and the playing network (interlayer) for the fully connected network and the random (ER) network. Too much sensitivity to the social pressure plays against the intralayer and therefore, the interlayer coordination in any of these two network topologies. Such non-coordination state is the one in which the proportions of the strategies in population A and B fluctuate over 0.5, see the left panel of Fig. 4. However, in the case of skeptical populations, T ≥ 0.5, the system always reaches intralayer coordination both in the fully connected and in the random (ER) networks. However, for interlayer coordinations, we observe coordination on all realizations of the process in the case of random (ER) networks, while interlayer coordination is only reached in half of the realizations in the fully connected network. Figure 3 shows the number of realizations in which the system reaches a state of interlayer coordination on the strategy L and R and a interlayer non-coordination state for T [0.4, 1]. For T > 0.5, we observe that in the fully connected network (left panel), agents fully coordinate either in L or R half of the realizations. The steady state of non interlayer coordination is a completely polarized multilayer network in which all agents in population A play the opposite strategy of all agents in B, see the right panel of Fig. 4. In the case of random (ER) networks (right panel of Fig. 3) a state of interlayer coordination either in L or R is always reached for T > 0.5. Comparison of this result with the one for fully connected networks highlights the role of local interactions to reach consensus or full coordination: While with all-to-all interactions (fully connected networks) interlayer coordination is only reached in half of the realizations, the presence of local interactions (ER networks) leads always to full (interlayer) coordination for skeptical populations (T > 0.5).

Figure 2
figure 2

Average over 500 realizations of the densities of intralayer active links for each population A and B and interlayer active links between A and B for each T [0.4, 1] with NA = NB = 1000 in a fully connected network (left panel) and in a random (ER) network with kAA = kBB = kAB = 10 (right panel).

Figure 3
figure 3

Number of realizations, out of a total of 500 realizations, that the populations A and B reach coordination using strategy R (blue), using strategy L (green) and are not able to coordinate (red) as function of T for a fully connected network(left panel) and kAA = kBB = kAB = 10 (right panel).

Figure 4
figure 4

Time series of the proportion of agents playing L in A (blue) and L in B (red) in a random (ER) network with kAA = kBB = kAB = 1000 and T = 0.3 (left panel) and T = 0.8 (right panel).

Results for general coordination games

To cover a better understanding of this multilayer model, we extend our analysis to a general coordination game setup whose normal form representation is shown in Table 1 with a = 1, d = 2 and b > 0. Due to their social and strategic implications this class of games has been studied analytically in an evolutionary framework31,32 and by the numerical simulations on several network topologies6,7,8. Previous numerical results have shown that in a fully connected network, the agents using the (UI) update rule tend to coordinate on (L, L), the risk dominant equilibrium whenever b > 1 and in the case of a complex network, the (UI) update rule leads to frozen disordered configurations. In our multilayer model with the dynamic update rule based on social and strategic implications, our numerical results are again quite different from these previous results and also are determined by the doubtful behavior of the populations. The same analysis made in the last section for pure coordination games leads too to the same conclusion that states of intralayer coordination are absorbing states of the dynamics. The state of interlayer coordination is another absorbing state that implies intralayer coordination.

As already seen in the previous section of pure coordination games, also in the general coordination games the herding populations are not able to reach intralayer coordination neither for fully connected nor for random (ER) networks. In contrast to Ref. 33 where the “wisdom of groups” promotes cooperative behavior in social dilemmas, in coordination games the sensitivity to the social pressure is a detrimental factor in any of the two network topologies. Similarly, for skeptical populations, the final configuration of intralayer coordination is always reached and depending on the network topology the state of interlayer coordination is also accomplished. As an example, Fig. 5 shows the densities of intralayer and interlayer active links for a general coordination game with b = 1.1. For T > 0.5, the system reaches interlayer coordination almost 70% of the realizations in the fully connected network (left panel of Fig. 5). This proportion is higher than the 50% observed in the case of the pure coordination games. In the random (ER) network, the final configuration of the system is always of interlayer coordination, see right panel of Fig. 5.

Figure 5
figure 5

For a general coordination game with b = 1.1, average over 500 realizations of the densities of intralayer active links for each population A and B and interlayer active links between A and B for T [0.4, 1] in a fully connected network (left panel) and a random (ER) network with kAA = kBB = kAB = 10 (right panel).

The main point at issue here is whether Pareto-dominant equilibrium can be coordinated by the agents. In the game theoretical approach, the coordination on the risk-dominant equilibrium (L,L) is unavoidable whenever b > 1. In our framework, skeptical individuals are those able to reach intralayer or interlayer coordination, however the key point is to find out whether such coordination favors the desirable socially efficient outcome, that is the (R,R) Pareto dominant coordination. First, let us analyze what happens in the complete multilayer network. As the initial strategies are uniformly randomly distributed with proportion 0.5, almost all individuals are at least strategically unsatisfied and willing to change their strategies. According to the update rule, an unsatisfied agent who is playing L in a fully connected network will change her strategy to R only when where pL is the proportion of agents playing L in her learning network. Due to the initial conditions pL ≈ 0.5 the parameter b must be approximately lesser than 1 to make agents who are playing L change to R. Panel (a) of Fig. 6 shows, for a fully connected network, the number of realizations that the system reaches interlayer coordination on L, on R and intra but not interlayer coordination as function of b. We observe that as b increases, the number of realizations reaching interlayer coordination on L increases. As a consequence, the rate of coordination on the Pareto dominant equilibrium (R, R) decreases with b, with the most likely coordination shifting from Pareto dominance to risk dominance around b* = 1, as expected. Noteworthy that the range of values of b in which the state of polarized layers can be reached is also around b = 1, where the two Nash equilibria have the same expected payoff. In panels (b) and (c) of Fig. 6 for ER networks, we show that such threshold b* in which the chance of coordination on R starts to decrease is higher the lower the average number of links per node is. The effect of locality not only favors interlayer coordination over only intralayer coordination (polarized layers) but also favors Pareto dominant coordination. In our numerical simulations (not shown) we find that already for kAA = kBB = kAB = 10 the agents manage to coordinate on the Pareto dominant equilibrium (R, R) for any value of b [0.5, 2], overcoming the frozen disordered configurations reported in previous works. The strong effect of locality is due to the possibility that pR > T for an agent who is playing L. In such case she will be totally unsatisfied and will switch her strategy to R. In our multilayer model, the locality for skeptical populations is the driving force that favors interlayer coordination on the socially efficient outcome, that is the Pareto dominant strategy.

Figure 6
figure 6

Number of realizations reaching interlayer coordination on L (blue), on R (green) and polarized networks state with only intralayer coordination (red) as function of b for 500 realizations for skeptical populations with T = 0.7, in a fully connected network (a), a random (ER) network with kAA = kBB = kAB = 500 (b) and kAA = kBB = kAB = 100 (c).

Discussion

In this paper we have introduced a multilayer network model in which agents of two populations play and learn in two disaggregated networks and update their strategies heeding social and strategic motivations. A network between the two populations is for playing according to a coordination game. There each agent receives an aggregate payoff as a result of her interaction with each of her playing neighbors. The other network is for learning in which each agent can update her game strategy motivated by a feeling of social or strategic dissatisfaction. When an agent is unsatisfied either socially or strategically, she can update her strategy imitating the strategy of the most successful neighbor. The agent searches for such neighbor looking inside her own population. We have shown that the degree of social pressure calibrated by the level of doubts plays an important role in the networks topologies considered. The skepticism about the wisdom of crowd and the locality of interactions are the driving forces for collaboration and social efficiency in both pure and general coordination games.

For pure coordination games in a skeptical environment, each population evolves towards a coordinated state in both fully connected and random (ER) networks. However, in fully connected networks (non-local interactions) the populations eventually may coordinate each other in the opposite strategy leading to a polarized multilayer network. In the case of general coordination games the challenge is to elucidate whether the Pareto-dominant strategy, the socially efficient outcome, can be established in the populations. Previous results in well-mixed and structured populations tend to favor the risk-dominant equilibrium in the parametric setting in which the Pareto dominant equilibrium is also the riskier one. In contrast, our simulation results show that the skepticism and the local connectivity allow the populations to coordinate on the Pareto-dominant equilibrium even in the riskier setting.