Balance and fragmentation in societies with homophily and social balance

Recent attempts to understand the origin of social fragmentation on the basis of spin models include terms accounting for two social phenomena: homophily (the tendency for people with similar opinions to establish positive relations) and social balance (the tendency for people to establish balanced triadic relations). Spins represent attribute vectors that encode $G$ different opinions of individuals whose social interactions can be positive or negative. Here we present a co-evolutionary Hamiltonian model of societies where people minimise their individual social stresses. We show that societies always reach stationary, balanced, and fragmented states if, in addition to homophily, individuals take into account a significant fraction, $q$, of their triadic relations. Above a critical value, $q_c$, balanced and fragmented states exist for any number of opinions.


The concept of so-called filter bubbles captures the fragmentation of society into isolated groups of people who trust each other, but clearly distinguish themselves from "other". Opinions tend to align within groups and diverge between them. Interest in this process of social disintegration, started by Durkheim 1 , has experienced a recent boost, fuelled by the availability of modern communication technologies. The extent to which societies fragment depends largely on the interplay of two basic mechanisms that drive social interactions: homophily and structural balance. Homophily is the "principle" that "similarity breeds connection" 2 . In particular, for those individuals who can be characterised by some social traits, such as opinions on a range of issues, homophily appears as the tendency of like-minded individuals to become friends 3 . The concept of structural balance, first described by Heider 4 , can be translated into a tendency of unbalanced triads to become balanced over time. A triad of individuals is balanced if all three individuals are mutual friends (friend of my friend is my friend) or if two friends have a mutual enemy (enemy of my enemy is my friend). Structural balance has been investigated by social scientists for a long time [5][6][7] and, more recently, by physicists and network scientists [8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23][24] . Recent contributions study the dynamics on balanced networks [25][26][27] , the co-evolution of opinions and signed networks [28][29][30][31][32][33][34][35][36][37][38][39][40] , and generalized measures of structural balance 41,42 . For an overview, see 43,44 . A general survey of statistical physics methods applied to opinion dynamics is found in 45,46 .
Previous works studying social fragmentation under the joint effects of homophily and social balance have been in only partial agreement with Heider's theory. For example, in an attribute-based local triad dynamics model (ABLTD) 47 each agent has binary opinions on $G$ attributes. If two agents agree on more attributes than they disagree on, they become friends (positive link). Agents tend to change their attributes to reduce stress in unbalanced triads. The paper showed that, given a system of $N$ agents, as $N \to \infty$ the so-called "paradise state", where all agents are friends of each other, is never reached unless the number of attributes $G$ scales as $O(N^{\gamma})$, with $\gamma \ge 2$. Instead, the society remains in a stationary unbalanced state with an equal number of balanced and unbalanced triads. Realistic social networks, where $N$ is typically large and $G$ remains relatively small, are hence unlikely to reach social balance, let alone the paradise state. This statement is to some extent contrary to empirical findings that societies are balanced to a high degree; see e.g. recent work on large scale networks 16,17 .
In another, so-called global social stress Hamiltonian framework 12,40,[48][49][50][51] , the opinion of individual $i$ is denoted by $s_i$ and the relation between $i$ and $j$ by $J_{ij}$ (positive or negative). Defining a social stress, $H$, as the sum of a homophily-related term, $-\sum_{(i,j)} J_{ij} s_i s_j$, and a term reflecting social balance, $-\sum_{(i,j,k)} J_{ij} J_{jk} J_{ki}$, it can be shown that societies in which social balance is present necessarily become fragmented at some critical level of interconnectedness 40 . This result, however, is restricted to the case where the reduction of $H$ can be realised by either an opinion update or a flip of a link's sign, with the latter being independent of the former. Yet social relations that are subject to a homophily effect depend essentially on the agents' similarity in opinions, and hence necessarily evolve as opinions are updated. In this paper, motivated by the lack of a consistent theory of balance and fragmentation in societies of agents with multidimensional opinions and homophilic interactions, we propose an individual-stress-based model that takes into account the homophily effect between adjacent individuals and structural balance within a time-varying local neighborhood. The latter consists of the subset of triads most relevant to an individual at a given moment in time, i.e. those that involve the relationships that are currently in their field of attention when considering their social stress. The ratio of relevant triads to the total number of triads an individual belongs to determines whether society fragments or remains cohesive. With the help of simulations on a regular network, we show that there exists a critical size of the local neighborhood above which society fragments, yet stays balanced. We discuss the relation of the presented model to both the ABLTD model 47 and the social stress Hamiltonian approach 40 .
www.nature.com/scientificreports/

Results
Local social stress model. Consider a society of $N$ individuals. Each individual $i$ has binary opinions on $G$ issues, characterized by an attribute vector,

$$\mathbf{A}_i = (a_i^1, a_i^2, \ldots, a_i^G), \qquad a_i^{\ell} \in \{-1, 1\}.$$

Further, $i$ has relations to $k_i$ other individuals in a social network. Network topology does not change over time. Following 47 , the relation between two agents $i$ and $j$ is determined by the sign of their distance in attribute space:

$$J_{ij} = \mathrm{sign}(\mathbf{A}_i \cdot \mathbf{A}_j),$$

where the dot denotes the scalar product. $J_{ij} = 1$ indicates friendship, $J_{ij} = -1$ enmity. Each agent $i$ has a social stress level, $H^{(i)}$, defined as

$$H^{(i)} = -\frac{1}{G} \sum_{j} J_{ij}\, \mathbf{A}_i \cdot \mathbf{A}_j - \sum_{(j,k)_{Q_i}} J_{ij} J_{jk} J_{ki}. \tag{1}$$

The first sum extends over all $k_i$ neighbours of $i$, while the second is restricted to $Q_i$ out of the $N_i^{\Delta}$ triads that node $i$ belongs to (by definition, $N_i^{\Delta} \equiv c_i k_i (k_i - 1)/2$, where $c_i$ is the local clustering coefficient). The relevance of this term in the model dynamics is discussed in the Supplemental Material. The notation $(j,k)_{Q_i}$ means to sum over all pairs of $j$ and $k$ which, together with $i$, form the $Q_i$ triads. These are chosen at each step of the dynamics (see below). $Q_i$ represents the number of triads that $i$ would like to have socially balanced, i.e. $i$'s relevant neighborhood for the current stress calculation. The existence of this neighborhood limits the extent to which the social network can change at any given update. Specifically, edges that do not belong to the $Q_i$ triads are not updated. The idea behind this is that such links preserve a memory of $i$'s relationships (at a previous time) with those who are currently not in the field of attention of $i$. As such, these links do not change instantaneously as $i$ updates their opinion. For example, you may have an outdated relation to an old school friend until you two meet again at a class reunion and find out whether you still like each other. The factor $1/G$ ensures that contributions from any link towards $H^{(i)}$ do not diverge in the limit $G \to \infty$. Assuming agents try to minimize their individual social stress over time, we implement the following dynamics:

1. Initialize. Each node is assigned an opinion vector, $\mathbf{A}_i$, whose components are randomly chosen to be $1$ or $-1$ with equal probability. Every node has the same degree, $k_i = K$, and is connected to its neighbours in a regular way, forming a ring topology. The topology is fixed over time. For any pair of connected agents, $i$ and $j$, we set $J_{ij} = \mathrm{sign}(\mathbf{A}_i \cdot \mathbf{A}_j)$.
2. Update. (i) Pick a node $i$ randomly and choose $Q_i$ of its triads, also randomly. Compute $H^{(i)}$; denote its value in the current state by $H$. (ii) Flip one of $i$'s attributes at random. Let $\tilde{\mathbf{A}}_i$ be its new opinion vector. For each of the chosen triads, the weights of the two links adjacent to $i$ are recomputed as $\tilde{J}_{ij} = \mathrm{sign}(\tilde{\mathbf{A}}_i \cdot \mathbf{A}_j)$. $\tilde{J}$ is the new matrix. Compute the new stress $\tilde{H}$ using $\tilde{J}$. The change in stress is $\Delta H^{(i)} \equiv \tilde{H} - H$. (iii) Update the system, $\mathbf{A}_i \to \tilde{\mathbf{A}}_i$ and $J_{ij} \to \tilde{J}_{ij}$, with probability $\min\{e^{-\Delta H^{(i)}}, 1\}$; otherwise leave it unchanged. This stochastic rule means that agents are not always rational and might choose to increase their stress.
3. Continue with the next update of opinions and links by returning to step 2.

Figure 1 illustrates an update where, by changing one opinion, agent $i$ becomes an enemy of $j$, but the chosen triad, $(ijk)$, becomes balanced. If $Q = 1$, this decreases $i$'s social stress from $H^{(i)} = -5/3$ to $H^{(i)} = -3$. If more triads are chosen, $Q > 1$, this flip leads to a stress increase and is less likely to be accepted.
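The three steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes that the individual stress is the homophily sum over all neighbours scaled by 1/G minus the sum of sign products over the chosen triads, that G is odd (so a scalar product is never zero), and that all agents use the same Q. The function names (`ring_neighbours`, `triads_of`, `stress`, `initialise`, `update`) are hypothetical.

```python
import math
import random
from itertools import combinations

def sign(x):
    return 1 if x > 0 else -1  # G is assumed odd, so x is never 0

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def ring_neighbours(n, k):
    """Ring lattice: every node is linked to its k/2 nearest nodes on each side."""
    half = k // 2
    return [{(i + d) % n for d in range(-half, half + 1) if d != 0} for i in range(n)]

def triads_of(i, nbrs):
    """Pairs (j, k) of neighbours of i that are themselves connected."""
    return [(j, k) for j, k in combinations(sorted(nbrs[i]), 2) if k in nbrs[j]]

def stress(i, A, J, nbrs, chosen, G):
    """Individual stress: homophily over all neighbours, balance over chosen triads."""
    homophily = -sum(J[frozenset((i, j))] * dot(A[i], A[j]) for j in nbrs[i]) / G
    balance = -sum(J[frozenset((i, j))] * J[frozenset((j, k))] * J[frozenset((k, i))]
                   for j, k in chosen)
    return homophily + balance

def initialise(n, k, G, rng):
    """Step 1: random opinions, fixed ring topology, links set by sign(A_i . A_j)."""
    nbrs = ring_neighbours(n, k)
    A = [[rng.choice((-1, 1)) for _ in range(G)] for _ in range(n)]
    J = {frozenset((i, j)): sign(dot(A[i], A[j])) for i in range(n) for j in nbrs[i]}
    triads = [triads_of(i, nbrs) for i in range(n)]
    return A, J, nbrs, triads

def update(A, J, nbrs, triads, Q, G, rng):
    """Step 2: one Metropolis-style update; returns True if the flip is accepted."""
    i = rng.randrange(len(A))
    chosen = rng.sample(triads[i], Q)
    h_old = stress(i, A, J, nbrs, chosen, G)
    attr = rng.randrange(G)
    A[i][attr] *= -1  # tentative opinion flip
    touched = {j for pair in chosen for j in pair}  # only links in chosen triads move
    old_links = {j: J[frozenset((i, j))] for j in touched}
    for j in touched:
        J[frozenset((i, j))] = sign(dot(A[i], A[j]))
    d_h = stress(i, A, J, nbrs, chosen, G) - h_old
    if d_h <= 0 or rng.random() < math.exp(-d_h):
        return True
    A[i][attr] *= -1  # rejected: restore the opinion and the links
    for j, s in old_links.items():
        J[frozenset((i, j))] = s
    return False
```

Step 3 is then a plain loop over `update`. On this lattice each node belongs to 3K(K - 2)/8 triads as long as N exceeds 3K/2, which gives 360 triads per node for K = 32 and serves as a quick sanity check on `triads_of`.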
The change in stress for agent $i$, $\Delta H^{(i)}$, given that attribute $a_i^{\ell^*}$ flips, can be written as

$$\Delta H^{(i)} = \frac{2}{G}\, a_i^{\ell^*} \sum_{j \mid \tilde{J}_{ij} = J_{ij}} J_{ij}\, a_j^{\ell^*} + \sum_{(j,k)_{Q_i}} \Delta_{jk}, \tag{2}$$

where $a_i^{\ell^*}$ is taken before the flip, $j \mid \tilde{J}_{ij} = J_{ij}$ means to sum over those $j$ (neighbours of $i$) for whom the sign of the edge $J_{ij}$ remains unchanged, and $\Delta_{jk} = (J_{ij} J_{ki} - \tilde{J}_{ij} \tilde{J}_{ki}) J_{jk}$. Obviously, $\Delta_{jk} \in \{-2, 0, 2\}$. According to the dynamical rule, the maximum number of links that may change their signs due to an opinion update depends on $Q$. Since most links are kept frozen for small $Q$, the dynamics is mainly driven by the first term, which makes friends more similar and enemies more dissimilar. Note the similarity to the Hebbian rule 34,52 . Because of the random assignment of the opinions at the start, there are approximately as many balanced as unbalanced triads in the stationary state. For large $Q$, the ...

Order parameter. To measure the level of social balance within a society, we define an order parameter, $f$, as the difference between the proportions of balanced and unbalanced triangles:

$$f = \frac{n_+ - n_-}{n_+ + n_-}, \tag{3}$$

where $n_+$ and $n_-$ are the number of balanced and unbalanced triangles, respectively. $f = 1$ means that all triangles are balanced, $f < 1$ signals the presence of unbalanced triangles. A network is called balanced 53 if and only if all cycles (including triangles as cycles of length 3) contain an even number of negative edges. In our study, rather than following this strict mathematical definition of balanced graphs, we propose to call a society balanced if all of its constituent triads are balanced. Fully-connected balanced networks are two-clusterable, i.e., they can be partitioned into two clusters of friends, within which all links are positive and between which links are exclusively negative 53 . For these networks, $f = 1$ is a necessary and sufficient condition for such a bipartition, since for them having all triads balanced is equivalent to having cycles of any length balanced.
However, since we call a network fragmented ($k$-clusterable in the signed network literature) only if it can be decomposed into $k \ge 2$ clusters of friends, it is worth noting that balanced states are generally different from fragmented ones in incomplete networks. This is because, in an incomplete network, all triangles may be balanced while some cycles of larger length remain unbalanced. Therefore, being triad-wise balanced ($f = 1$) is a necessary, but not sufficient, condition for being fragmented ($k$-clusterable) 53 .
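The distinction between triad-wise balance and balance on all cycles can be made concrete with a small sketch. It assumes a signed network stored as a dictionary from `frozenset({i, j})` to +1 or -1 (a representation chosen here for illustration); `order_parameter` counts triangles by brute force, so it is meant for small examples only, and `is_cycle_balanced` uses the standard two-colouring test for signed graphs. Both function names are hypothetical.

```python
from collections import deque
from itertools import combinations

def order_parameter(J, nodes):
    """f = (n_plus - n_minus) / (n_plus + n_minus) over all triangles of the network."""
    n_plus = n_minus = 0
    for i, j, k in combinations(sorted(nodes), 3):
        edges = (frozenset((i, j)), frozenset((j, k)), frozenset((k, i)))
        if all(e in J for e in edges):
            if J[edges[0]] * J[edges[1]] * J[edges[2]] > 0:
                n_plus += 1
            else:
                n_minus += 1
    total = n_plus + n_minus
    return (n_plus - n_minus) / total if total else float("nan")

def is_cycle_balanced(J, nodes):
    """True if every cycle has an even number of negative edges (two-colouring test)."""
    adj = {v: [] for v in nodes}
    for edge, s in J.items():
        a, b = tuple(edge)
        adj[a].append((b, s))
        adj[b].append((a, s))
    side = {}
    for start in nodes:
        if start in side:
            continue
        side[start] = 1
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, s in adj[u]:
                if v not in side:
                    side[v] = side[u] * s  # friends share a side, enemies do not
                    queue.append(v)
                elif side[v] != side[u] * s:
                    return False  # a cycle with an odd number of negative edges
    return True

# A 4-cycle 0-1-2-3 with one negative edge, plus an all-positive triangle 2-3-4.
example = {
    frozenset((0, 1)): -1, frozenset((1, 2)): 1, frozenset((2, 3)): 1,
    frozenset((3, 0)): 1, frozenset((2, 4)): 1, frozenset((3, 4)): 1,
}
print(order_parameter(example, range(5)))    # 1.0: the only triangle is balanced
print(is_cycle_balanced(example, range(5)))  # False: the 4-cycle is unbalanced
```

The example network has all of its triangles balanced, yet it is not balanced in the strict cycle-based sense, which is the situation that can arise only in incomplete networks.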

Results.
We first run the simulation on a regular ring network with $N = 400$ nodes, where every node has $K = 32$ neighbors. Figure 2a shows a phase diagram of the order parameter, $f$, which indicates a transition from an unbalanced to a balanced society. For any given $G$, this transition occurs as $q \equiv Q/N^{\Delta}$ passes a threshold $q_c$. The existence of a critical $q_c$ demonstrates the importance of Heider's balance term in driving a society towards social balance: if a sufficiently large number of triangles is taken into account, society becomes balanced. For a wide range of $G$, $q > 1/5$ clearly suffices to be in the balanced phase. Also, $q_c$ increases with growing $G$, indicating that when more issues become relevant for homophily, the chance of achieving balance decreases. This can be understood as follows. The probability that a link incident with $i$ switches its sign if $a_i^{\ell^*}$ flips is proportional to $1/\sqrt{G}$ as $G \to \infty$ (see the Supplemental Material for the derivation of this asymptotic formula). Therefore, links are less likely to change as $G$ increases, making it harder for the dynamics to proceed and for the society to become balanced. Note that the situation resembles non-equilibrium in the sense that the quasi-stationary unbalanced states in the cohesive phase, due to fluctuations in finite-sized systems, eventually become highly balanced after a very long time. As the presented model is stochastic, these final states are not necessarily frozen (absorbing). This means that a small number of unbalanced triads still fluctuates over time. The transition is presumably first-order, as a region of bi-stability is numerically observed where the order parameter can be $f \sim 0$ or $f \sim 1$; see Supplemental Material for examples of the transition at the critical value of $Q$. Since the total number of triads per agent, $N^{\Delta}$, grows with $K$, $Q$ must also grow with $K$ as long as $q = Q/N^{\Delta}$ is fixed.
Therefore, the balanced phase is expected to be reached if the network degree exceeds a critical value, K c . We verify this hypothesis in Fig. 2b for q = 1/3 . Interestingly, the transition becomes sharper at higher K.
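The $1/\sqrt{G}$ scaling of the link-flip probability used in the argument above can be checked with a short calculation. The sketch below rests on a simplifying assumption that is not taken from the Supplemental Material: the two opinion vectors are independent and uniformly random, and G is odd. Under that assumption the sign of the scalar product changes only if the product equals +1 or -1 and the flipped attribute pushes it across zero, which gives a closed-form probability. The function name `flip_probability` is hypothetical.

```python
import math

def flip_probability(G):
    """Probability that sign(A_i . A_j) changes when one random attribute of i flips.

    Assumes independent, uniformly random opinion vectors and odd G.
    The product is +1 or -1 with probability C(G, (G+1)/2) / 2**G each, and the
    flipped attribute then crosses zero with probability (G + 1) / (2 * G).
    """
    if G % 2 == 0:
        raise ValueError("G is assumed to be odd")
    return math.comb(G, (G + 1) // 2) * (G + 1) / (G * 2 ** G)

if __name__ == "__main__":
    for G in (3, 23, 99, 999):
        p = flip_probability(G)
        # The rescaled value p * sqrt(G) should approach sqrt(2 / pi) for large G.
        print(G, p, p * math.sqrt(G), math.sqrt(2 / math.pi))
```

For G = 3 the function returns 0.5, which can be confirmed by enumerating the eight sign patterns by hand. The stationary state of the model need not have uniformly random opinions, so this is only a plausibility check of the scaling, not a substitute for the derivation the authors refer to.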
We next study how the time to reach a balanced steady state, and the number of clusters of friends in this state, scale with the system size. The latter is investigated in order to check whether the balanced states are also fragmented. For $N = 50, 100, 200, 400, 800$, the results in Fig. 3a demonstrate that the number of clusters grows with $N$ for $K = 8$, but remains small for higher degrees $K = 16, 32$. In both cases, the fragmented state tends to persist in the thermodynamic limit as long as the average number of clusters remains larger than or equal to two. Further, the time to reach balance grows as $t_r \propto N^{\alpha}$, with $\alpha < 1$ for $K = 8$, and $t_r$ appears to be a convex function of $N$ for $K = 16, 32$, suggesting that it may saturate at some point; see inset in Fig. 3a. This means that the balanced phase should always be reached, even though the time it takes may be quite long for very large systems. It would also be interesting to understand the temporal evolution of the number of clusters and to establish whether this number can converge to two at long times for $K = O(N)$. Finally, we find that the distribution of cluster sizes follows an exponential for networks with $K = 8$, but is bimodal for those with $K = 32$; see inset in Fig. 3b. We can intuitively understand this observation as follows: a balanced network is expected to have cluster statistics similar to those of the unsigned graph obtained from it by removing all negative edges. For small $K$, this unsigned network has an expected number of connected components proportional to $N$ (Fig. 3a main). Therefore, its average degree of positive edges must lie below the percolation threshold for the emergence of a giant component. Below this threshold, unsigned networks exhibit, with high probability, an exponential distribution of component sizes, fully consistent with the distribution observed in the original balanced network.
This situation changes as the degree, K, of the original network increases. In the limit of K → N − 1 , only two clusters of friends can emerge. While in general these two clusters can have different sizes, in the most probable configuration they are of almost equal size. This gives rise to a single peak at N/2 in the cluster distribution of fully-connected networks. In the intermediate range of K, a bimodal distribution necessarily occurs as a crossover between the exponential and the singly-peaked ones.
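The cluster statistics discussed above rely on viewing clusters of friends as the connected components that remain once all negative edges are removed. A minimal sketch of that bookkeeping, assuming the signed network is stored as a dictionary from `frozenset({i, j})` to +1 or -1 (a representation chosen here for illustration, with the hypothetical function name `friend_clusters`), is:

```python
def friend_clusters(J, nodes):
    """Sizes of the connected components of the positive-edge subgraph, largest first."""
    parent = {v: v for v in nodes}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for edge, s in J.items():
        if s > 0:  # negative edges are dropped, positive ones merge clusters
            a, b = tuple(edge)
            parent[find(a)] = find(b)

    sizes = {}
    for v in nodes:
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)

# Five agents on a ring: friends 0-1-2 and 3-4, enemies across the two groups.
ring = {
    frozenset((0, 1)): 1, frozenset((1, 2)): 1, frozenset((2, 3)): -1,
    frozenset((3, 4)): 1, frozenset((4, 0)): -1,
}
print(friend_clusters(ring, range(5)))  # [3, 2]
```

A histogram of the returned sizes over many balanced final states would give the kind of cluster-size distribution described in the text; the number of entries in the list is the number of clusters.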
Limit of small Q. In the limit $Q = 1$ and $G \to \infty$, the society is expected to reach an unbalanced stationary state, where the order parameter $f$ is close to zero. We show this by a mean-field approach for fully-connected networks; see Supplemental Material. Here we assume that the two links of a chosen triad are not likely to be flipped simultaneously as $G \to \infty$. Instead, only one of them would be flipped at every update. We then derive a set of rate equations for triads of different types whose steady-state solution is $f^{(\mathrm{st})} = 0$ and $\rho$ ...

Limit of large Q. Another interesting limit is when $Q \to N^{\Delta}$. In this case, one can compare the model with the Hamiltonian approach used in 40 , in which the contribution of all $N \times N^{\Delta}/3$ triangles, weighted by a coupling $g$, is taken into account:

$$H = -\frac{1}{G} \sum_{(i,j)} J_{ij}\, \mathbf{A}_i \cdot \mathbf{A}_j - g \sum_{(i,j,k)} J_{ij} J_{jk} J_{ki}. \tag{4}$$

Here the first sum extends over all connected pairs, the second over all triangles. In Eq. (4), in contrast to the model presented here, $J_{ij}$ are random dynamical variables that co-evolve with, but are not strictly determined by, the opinion vectors. The detailed updating procedure of 40 , which aims at minimizing $H$, is described in the Supplemental Material. Despite the differences in the concrete update dynamics, for a large enough $Q \ge Q_{\mathrm{MF}}$, the two models are expected to yield similar results if $g$ is related to $Q$ by $g = \alpha Q/N^{\Delta}$, for some constant $\alpha$. Here the main idea is that, for sufficiently large $Q$, individuals' actions have a similar outcome regardless of their knowledge of the total stress $H$ in the society. Figure 4 shows the comparison for $\alpha = 1$. The curve of the presented model indeed crosses that of the model given by Eq. (4) at $q_1 \ge Q_{\mathrm{MF}}/N^{\Delta} \simeq 0.133$ for $G = 23$ in (a), and at $q_2 \ge 1/6$ for $G = 99$ in (b), where the coupling, $g$, is chosen to be equal to $q_1$ in (a) and $q_2$ in (b), respectively.

Discussion
We showed that under the simultaneous effects of homophily and structural balance society can achieve structurally balanced states if individuals' opinions co-evolve with their social links so as to minimize their individual social stress. The parameter G controls the dimension of the opinion vectors relevant for homophily, and Q i specifies how many triangles individuals actually consider for their local social balance. The interplay between homophily and structural balance results in a nontrivial phase diagram showing an abrupt change in patterns of social structure. We find two regimes: fragmentation and cohesion. In the former, society is fragmented into locally cooperative clusters of agents who are linked positively within and negatively between clusters. In the latter, globally percolating cooperation is realized by the existence of a large connected component of positively linked agents. The transition between the regimes is numerically observed at a critical fraction of the considered triangles, q c , illustrating the main message of the paper: The more people try to balance their social neighborhoods, the more likely society is to become fragmented. Because of the relation between Q i and K, this message is robust with respect to the change of the network connectivity; for a fixed value of q, we see that the higher the degree, the more likely the society fragments.
The fragmented phase with most of the triads balanced agrees with the result of 47 . However, a crucial difference between the two models is that increasing $G$ leads to a balanced society in 47 , but to the destruction of social balance in ours. While the reason for this difference is not fully clear to us, it seems that the probabilities of link updates in the two models depend on $G$ in different ways. While fewer link updates can happen as $G$ increases in our model, a large $G$ seems to retain a constant high activity of link updates in 47 due to their local instant updating rule. Nevertheless, for small $Q$, the unbalanced steady states of both models have the same stationary values in the network observables. Note that if a term equivalent to that of the p term is introduced to Eq. (1), e.g. of the form $h \sum_{(i,j)} (1 - J_{ij})$ for some external field strength $h$, then the "paradise state" can be reached for sufficiently large $h$; see 40 . Beyond a value $Q_{\mathrm{MF}}$, the model produces results very similar to those obtained by minimising the Hamiltonian in Eq. (4) 40 . The existence of $Q_{\mathrm{MF}}$ suggests that if individuals keep a large fraction of their local triads balanced, then locally minimising an individual stress can become equivalent to reducing an overall stress. In comparison with these two approaches, the model presented here shows the possibility of social fragmentation being fully consistent with both homophily and social balance theory.
The model, however, has a number of limitations that may be interesting to address in future work. The first is the choice of binary symmetric interaction coupling, $J_{ij} \in \{-1, 1\}$, which captures neither non-reciprocal and weighted links nor higher-order interactions between individuals. Further, the actual relations between agents may be poorly captured by such a binary definition. For instance, people who are extreme about only one particular issue can become enemies despite their similar opinions on all other topics. The second limitation comes from the use of unrealistic networks with a fixed ring topology. While this special case is chosen to highlight the key effect that balancing a sufficiently large number of triads has on the social balance of the entire network, more general cases with heterogeneity and/or adaptive changes in the topology, such as link rewiring, may not guarantee this effect. Nevertheless, since our model becomes equivalent to that of 40 in the limit of $Q \to N^{\Delta}$, which does exhibit a balanced phase on both time-varying and small-world topologies, we conjecture that the main result of our model will still hold true for these topologies.
In the current implementation of the model all agents have the same fraction $Q/N^{\Delta}$. If $Q_i/N^{\Delta}$ varies from one individual to another, then, for some agents $i$, $Q_i/N^{\Delta}$ may become smaller than the $q_c$ required to make all their $N_i^{\Delta}$ triads balanced. As a consequence, the whole network appears to be partly, but not perfectly, balanced (indeed, $f \simeq 0.69$ to $0.88$ in online societies 16,17 ). Finally, one can generalise our treatment to the case of interdependent and continuous opinions on correlated topics, where, interestingly, an emergence of polarised ideological opinions has been observed 54 .