Abstract
Despite their entertainment-oriented purpose, social media changed the way users access information, debate, and form their opinions. Recent studies, indeed, showed that users online tend to promote their favored narratives and thus to form polarized groups around a common system of beliefs. Confirmation bias helps to account for users’ decisions about whether to spread content, thus creating informational cascades within identifiable communities. At the same time, aggregation of favored information within those communities reinforces selective exposure and group polarization. Along this path, through a thorough quantitative analysis we approach connectivity patterns of 1.2 M Facebook users engaged with two very conflicting narratives: scientific and conspiracy news. Analyzing such data, we quantitatively investigate the effect of two mechanisms (namely challenge avoidance and reinforcement seeking) behind confirmation bias, one of the major drivers of human behavior in social media. We find that the challenge avoidance mechanism triggers the emergence of two distinct and polarized groups of users (i.e., echo chambers) who also tend to be surrounded by friends having similar systems of beliefs. Through a network-based approach, we show how the reinforcement seeking mechanism limits the influence of neighbors and primarily drives the selection and diffusion of contents even among like-minded users, thus fostering the formation of highly polarized subclusters within the same echo chamber. Finally, we show that polarized users reinforce their preexisting beliefs by leveraging the activity of their like-minded neighbors, and this trend grows with the user engagement, suggesting how peer influence acts as a support for reinforcement seeking.
Introduction
Social media facilitated global communications, allowing information to spread faster and more intensively. These changes led to the formation of a disintermediated scenario, where contents flow directly from producers to consumers, without the mediation of journalists or experts in the field. Beyond its undoubted benefits, a hyperconnected world can foster confusion about causation, and thus encourage speculation, rumors, and mistrust^{1,2,3,4}. Since 2013, indeed, the World Economic Forum (WEF) has been placing the global threat of massive digital misinformation at the core of other technological and geopolitical risks, ranging from terrorism, to cyberattacks, up to the failure of global governance^{5}. People are misinformed when they hold beliefs neglecting factual evidence, and misinformation may influence public opinion negatively. Empirical investigations have shown that, in general, people tend to resist facts, holding inaccurate factual beliefs confidently^{6}. Moreover, corrections frequently fail to reduce misperceptions^{7} and often produce a backfire effect^{8}.
Confirmation bias, i.e., the tendency to seek, select, and interpret information coherently with one’s system of beliefs^{9}, helps, indeed, to account for users’ decisions about whether to promote content^{2,10,11,12}. The action of this cognitive bias may lead to the emergence of homogeneous and polarized communities, i.e., echo chambers^{13,14,15}, thus facilitating fake news and, more generally, misinformation cascades^{3}.
According to^{16}, two primary cognitive mechanisms are used to explain why people experience the confirmation bias^{17}:

Challenge avoidance, i.e., the fact that people do not want to find out that they are wrong,

Reinforcement seeking, i.e., the fact that people want to find out that they are right.
Though the two are strongly related, and though both behaviors revolve around people’s attempt to minimize their cognitive dissonance (i.e., the psychological stress that people experience when they hold two or more contradictory beliefs simultaneously), challenge avoidance and reinforcement seeking are not inherently linked to each other, and they do not have to occur at the same time^{18}. This distinction is important because the consequences of challenge avoidance are significantly more harmful to democratic deliberation than those of reinforcement seeking^{17}. Additionally, group membership interacts with the aforementioned cognitive biases. When individuals belong to a certain group, those outside the group are far less likely to influence them on both easy and hard questions^{19}.
In this work, by exploiting the social network of 1.2 M Facebook users engaged with very polarizing contents, we investigate the role of challenge avoidance and reinforcement seeking on the selection and spread of information, and the connection of such cognitive mechanisms with peer influence.
To this aim, with the help of very active debunking groups, we identified all the Italian Facebook pages supporting scientific and conspiracy news, and over a time span of five years (2010–2014) we downloaded all their public posts (with the related lists of likes and comments). On the one hand, conspiracy news simplifies causation, reduces the complexity of reality, and is formulated in a way that is able to tolerate a certain level of uncertainty^{20,21,22}. On the other hand, scientific news disseminates scientific advances and exhibits the process of scientific thinking. Notice that we do not focus on the quality of the information but rather on the possibility of verification. Indeed, the main difference between the two is content verifiability. The generators of scientific information and their data, methods, and outcomes are readily identifiable and available. The origins of conspiracy theories are often unknown and their content is strongly disengaged from mainstream society and sharply divergent from recommended practices^{8}, e.g., the belief that vaccines cause autism^{23}.
Our analyses show how the challenge avoidance mechanism triggers the emergence, around the selected narratives, of two well-separated and polarized groups of users who also tend to surround themselves with friends having similar systems of beliefs.
Through a network-based approach, we also show that polarized users spread their attention over a higher number of pages (and topics) supporting their beliefs (hereafter referred to as community pages) as their engagement grows, but they tend to remain confined within groups of very few pages even when the corresponding neighborhoods are active on several news sources. This suggests that the reinforcement seeking mechanism limits the influence of neighbors and primarily drives the selection and the diffusion of contents even among like-minded users, fostering the formation of highly polarized subclusters within the same echo chamber.
Finally, we investigate the effects of the joint action of confirmation bias and peer influence when the latter does not conflict with the cognitive mechanisms of challenge avoidance and reinforcement seeking. Namely, we compare the liking activity of polarized users with the liking activity of the likewise polarized part of their neighborhood, both with respect to size and time. Our findings reveal that polarized users reinforce their preexisting beliefs by leveraging the activity of their like-minded neighbors. Such a trend grows with the user engagement and suggests how peer influence acts as a support for reinforcement seeking. In such a context, even the positive role played by social influence (e.g., by enabling social learning^{24,25,26}) seems to lose its effectiveness in smoothing polarization and reducing both the risk and the consequences of misinformation. This makes it even more difficult to design efficient communication strategies to prevent rumors and mistrust. Individual choices more than algorithms^{10} seem to characterize the consumption patterns of users and their friends. Therefore, working towards long-term solutions to polarization and misinformation online cannot be separated from a deep understanding of users’ cognitive determinants behind these mechanisms.
Methods
Ethics statement
Approval and informed consent were not needed because the data collection process has been carried out using the Facebook Graph application program interface (API), which is publicly available. For the analysis (according to the specification settings of the API) we only used publicly available data (thus users with privacy restrictions are not included in the dataset). The pages from which we download data are public Facebook entities and can be accessed by anyone. User content contributing to these pages is also public unless the user’s privacy settings specify otherwise, and in that case it is not available to us.
Data collection
Debate about social issues continues to expand across the Web, and unprecedented social phenomena such as the massive recruitment of people around common interests, ideas, and political visions are emerging. For our analysis, we identified two main categories of pages: conspiracy news (i.e., pages promoting contents neglected by mainstream media) and science news. We defined the space of our investigation with the support of diverse Facebook groups that are very active in debunking conspiracy theses. As an additional control, we used the self-description of a page to determine its focus. The resulting dataset is composed of all the pages supporting the two distinct narratives in the Italian Facebook scenario: 39 about conspiracy theories and 33 about science news. For the two sets of pages we downloaded all of the posts (and their respective user interactions) across a 5-year time span (2010–2014). We performed the data collection process by using the Facebook Graph API, which is publicly available and accessible through any personal Facebook user account. The exact breakdown of the data is presented in Table 1. Likes and comments have a different meaning from the user viewpoint. Most of the time, a like stands for positive feedback to the post, and a comment is the way in which online collective debates take form. Comments may contain negative or positive feedback with respect to the post.
Ego networks
In addition, we collected the ego networks of users who liked at least one post on science or conspiracy pages, i.e., for each user we collected her list of friends and the links between them (we used publicly available data, so we collected only data for which the users had the corresponding permissions open).
Preliminaries and definitions
Let \({\mathscr P}\) be the set of all the pages in our collection, and \({{\mathscr P}}_{{\rm{science}}}\) (\({{\mathscr P}}_{{\rm{conspir}}}\)) be the set of the 33 (39) Facebook pages about science (conspiracy) news. Let V be the set of all the 1.2 M users and E the edges representing their Facebook friendship connections; these sets define a graph \({G}=(V,E)\). Hence, the graph of likes on a post, \({{G}}^{L}=({V}^{L},{E}^{L})\), is the subgraph of G whose users have liked a post. Thus, V^{L} is the set of users of V who have liked at least one post, and we set \({E}^{L}=\{(u,v)\in E;u,v\in {V}^{L}\}\). Following previous works^{2,3,27}, we study the polarization of users, i.e., the tendency of users to interact with only a single type of information; in particular, we study the polarization towards science and conspiracy. Formally, we define the polarization \(\rho (u)\in [-1,1]\) of user \(u\in {V}^{L}\) from the balance of likes that u has performed on conspiracy and science posts: assuming that u has performed x and y likes on conspiracy and science posts, respectively, we let \(\rho (u)=(x-y)/(x+y)\). Thus, a user u for whom \(\rho (u)=-1\) is totally polarized towards science, whereas a user with \(\rho (u)=1\) is totally polarized towards conspiracy. Note that we ignore the commenting activity since a comment may be an endorsement, a criticism, or even a response to a previous comment. Furthermore, we define the engagement \(\psi (u)\) of a polarized user u as her liking activity normalized with respect to the number of likes of the most active user of her community. Denoting by \(\theta (u)\) the total number of likes that the user u has expressed on posts of \({\mathscr{P}}\), the following condition holds: \(\psi (u)=\frac{\theta (u)}{{{\rm{\max }}}_{v}\,\theta (v)}\).
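As an illustration of the two definitions above, the following minimal Python sketch computes \(\rho (u)\) and \(\psi (u)\) from like counts. The function names and the dictionary-based input layout are assumptions made for this example and are not part of the original analysis pipeline.

```python
def polarization(conspiracy_likes, science_likes):
    """Return rho(u) = (x - y) / (x + y), a value in [-1, 1].

    x = likes on conspiracy posts, y = likes on science posts.
    -1 means fully polarized towards science, +1 towards conspiracy.
    """
    total = conspiracy_likes + science_likes
    if total == 0:
        raise ValueError("polarization is undefined for a user with no likes")
    return (conspiracy_likes - science_likes) / total


def engagement(likes_by_user):
    """Return psi(u) = theta(u) / max_v theta(v) for the users of one community.

    likes_by_user maps a (hypothetical) user id to theta(u), the user's
    total number of likes.
    """
    most_active = max(likes_by_user.values())
    return {user: theta / most_active for user, theta in likes_by_user.items()}
```

For instance, a user with 3 conspiracy likes and 1 science like has \(\rho =(3-1)/4=0.5\).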
The degree of a node (here, user) u, deg(u), is the number of neighbors (here, friends) of u. For any user u, we consider the partition \({\deg }(u)={N}_{c}(u)+{N}_{ne}(u)+{N}_{np}(u)+{N}_{s}(u)\) where \({N}_{c}(u)({N}_{s}(u))\) denotes the neighborhood of u polarized towards conspiracy (science), \({N}_{ne}(u)\) denotes the neighborhood of u not engaged with science or conspiracy contents, \({N}_{np}(u)\) denotes the set of not polarized friends of u  i.e., friends who liked the same number of contents from science and conspiracy, respectively.
To understand the relationship between pages and user liking activity, we measure the polarization of users with respect to the pages of their own community. For a polarized user (or, more in general, a group of polarized users) u with \({\sum }_{i}\,{\theta }_{i}(u)=\theta (u)\) likes, where \({\theta }_{i}(u)\) counts the contents liked by u on the i^{th} community page (\(i=1,\ldots ,N\), where N equals the number of community pages), the probability \({{\phi }}_{i}(u)\) that u belongs to the i^{th} page of the community will then be \({{\phi }}_{i}(u)={\theta }_{i}(u)/\theta (u)\). We can define the localization order parameter L as:
Thus, in the case in which u only has likes in one page, \(L(u)=1\). If u, on the other hand, interacts equally with all the community pages (\({{\phi }}_{i}(u)=1/N\)) then \(L(u)=N\); hence, \(L(u)\) counts the community pages where u fairly equally distributes her liking activity.
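A minimal Python sketch of a localization measure with the two limiting cases described above. It assumes the inverse participation ratio form \(L(u)={[{\sum }_{i}{\phi }_{i}^{2}(u)]}^{-1}\); this specific form is an assumption, chosen because it gives \(L(u)=1\) for a single page and \(L(u)=N\) for uniform activity over N pages. The function name is hypothetical.

```python
def localization(likes_per_page):
    """Inverse participation ratio of a user's likes over community pages.

    likes_per_page[i] is theta_i(u), the number of likes on the i-th page.
    ASSUMPTION: L(u) = 1 / sum_i phi_i(u)^2 with phi_i(u) = theta_i(u) / theta(u).
    """
    total = sum(likes_per_page)
    if total == 0:
        raise ValueError("localization is undefined for a user with no likes")
    phi = [count / total for count in likes_per_page]
    return 1.0 / sum(p * p for p in phi)
```

Under this assumption, a user with likes split 3 to 1 over two pages obtains \(L=1/(0.75^{2}+0.25^{2})=1.6\), i.e., between one and two effective pages.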
List of pages
This section lists the pages of our dataset. Table 2 shows the list of science news pages and Table 3 shows the list of conspiracy pages.
Augmented Dickey–Fuller test
An augmented Dickey–Fuller test (ADF) tests the null hypothesis that a unit root is present in a time series^{28,29}. The alternative hypothesis is stationarity. If we obtain a p-value less than the threshold value \(\bar{\alpha }=0.05\), the null hypothesis is rejected in favor of the alternative one. ADF is an augmented version of the Dickey–Fuller test^{30} for a larger set of time series models. We use this test to investigate the stationarity of the time series given by the number of posts per day published by a community page during its lifetime. The general regression equation which incorporates a constant and a linear trend is used. The number of lags used in the regression corresponds to the upper bound on the rate at which the number of lags should be made to grow with the time series size T for the general ARMA(p, q) setup^{31}, and equals T^{1/3}.
Cosine similarity
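Cosine similarity is a measure of similarity between two nonzero vectors \({\bf{u}}=({u}_{1},\ldots ,{u}_{k})\) and \({\bf{v}}=({v}_{1},\ldots ,{v}_{k})\) of a k-dimensional inner product space expressed by the cosine of the angle between them^{32}. By means of the Euclidean dot product formula we obtain
\[\cos ({\bf{u}},{\bf{v}})=\frac{{\bf{u}}\cdot {\bf{v}}}{\Vert {\bf{u}}\Vert \,\Vert {\bf{v}}\Vert }=\frac{{\sum }_{i=1}^{k}{u}_{i}{v}_{i}}{\sqrt{{\sum }_{i=1}^{k}{u}_{i}^{2}}\,\sqrt{{\sum }_{i=1}^{k}{v}_{i}^{2}}}.\]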
We use cosine similarity to evaluate whether a polarized user u and the part of her neighborhood with likewise polarization distribute their liking activity proportionally across her preferred community pages. Namely, for any user u polarized towards science (conspiracy), denoting by \(\{{P}_{{i}_{1}},\ldots ,{P}_{{i}_{k}}\}={{\mathscr P}}_{{\rm{science}}}^{u}\,\,({{\mathscr P}}_{{\rm{conspir}}}^{u})\) the set of k science (conspiracy) pages where u distributes her liking activity, we compute the cosine between the vectors \(({\theta }_{{i}_{1}}(u),\ldots ,{\theta }_{{i}_{k}}(u))\) and \(({\theta }_{{i}_{1}}({N}_{s}(u)),\ldots ,{\theta }_{{i}_{k}}({N}_{s}(u)))\) (with \({N}_{c}(u)\) in place of \({N}_{s}(u)\) for conspiracy users), both normalized with respect to the infinity norm. Since all components of such versors are non-negative, the cosine is bounded in \([0,1]\): two versors are maximally similar if they are parallel and maximally dissimilar if they are orthogonal.
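The comparison described above can be sketched in a few lines of Python. The function name and the list-based input are assumptions made for illustration; note that rescaling either vector (for example by its infinity norm) leaves the cosine unchanged, so the sketch works directly on raw like counts.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, nonzero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)
```

Here `u` would hold the likes of a user on each of her k preferred pages and `v` the likes of her likewise polarized neighborhood on the same pages; proportional distributions such as `[1, 2]` and `[2, 4]` give a cosine of 1 up to rounding.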
Akaike information criterion
The Akaike Information Criterion (AIC)^{33,34,35} is an asymptotically unbiased estimator of the expected relative Kullback-Leibler distance (KL)^{36}, which represents the amount of information lost when we use model g to approximate model f:
\[I(f,g)=\int f(x)\,\log \frac{f(x)}{g(x|\mu )}\,dx,\]
where \(\mu =({\mu }_{1},\ldots ,{\mu }_{k})\) is the vector of k model parameters. The AIC for a given model is a function of its maximized log-likelihood (\(\ell \)) and k:
\[{\rm{AIC}}=-2\ell +2k.\]
We use the AIC for selecting the optimal lag structure of a Granger causality test.
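A minimal Python sketch of AIC-based lag selection, assuming the standard form \({\rm{AIC}}=2k-2\ell \). The helper names and the dictionary of candidate fits are hypothetical; in practice the log-likelihoods would come from fitting the regression once per candidate lag.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: AIC = 2k - 2*loglik (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def select_lag(fits):
    """Return the lag whose fitted model has the smallest AIC.

    fits maps a candidate lag to a (maximized log-likelihood, parameter count)
    pair. ASSUMPTION: these values are supplied by a model-fitting step that
    is not shown here.
    """
    return min(fits, key=lambda lag: aic(*fits[lag]))
```

With illustrative values `{1: (-120.0, 3), 2: (-115.0, 5), 3: (-114.5, 7)}` the AIC scores are 246, 240, and 243, so lag 2 is selected: the third lag improves the fit too little to justify its extra parameters.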
Granger causality and peer influence probability
The Granger causality test is a statistical hypothesis test for determining whether one time series is useful in forecasting another^{37}. Roughly speaking, a time series X is said to Granger-cause (briefly, G-cause) the time series Y if the prediction of Y is improved when X is included in the prediction model of Y. Denoting by \({ {\mathcal I} }^{\ast }(\tau )\) the set of all information in the universe up to time \(\tau \) and by \({ {\mathcal I} }_{X}^{\ast }(\tau )\) the same information set except for the values of series X up to time \(\tau \), we write
for indicating that X does not cause Y.
Let \(t(u)\) be the time series given by the number of likes expressed by a user u polarized towards science on \({{\mathscr P}}_{{\rm{science}}}^{u}\) every day of her lifetime, i.e., the temporal distance between her first and her last like. Let \(t({N}_{s}(u))\) be the time series of the number of likes expressed by \({N}_{s}(u)\) on the same pages every day in the same time window. We investigate a causal effect of \(t({N}_{s}(u))\) on \(t(u)\) by testing the null hypothesis that the former does not Granger-cause the latter:
\[{{\mathbb{H}}}_{0}:\,t({N}_{s}(u))\,\,{\rm{does}}\,{\rm{not}}\,{\rm{G}}\mbox{-}{\rm{cause}}\,\,t(u)\]
through a series of F-tests on lagged values of \(t(u)\). The alternative hypothesis \({{\mathbb{H}}}_{1}\) is that \(t({N}_{s}(u))\) G-causes \(t(u)\). The number of lags to be included is chosen using AIC. If we obtain a p-value α(u) less than the threshold value \(\bar{\alpha }=0.05\), the null hypothesis \({{\mathbb{H}}}_{0}\) is rejected in favor of \({{\mathbb{H}}}_{1}\). The same analysis is carried out for testing a causal effect of \(t({N}_{c}(u))\) on \(t(u)\) for any user u polarized towards conspiracy.
Furthermore, we define the peer influence probability \({{\rm{PIP}}}_{{\rm{science}}}^{u}\) of \({N}_{s}(u)\) on u as the number in the range \([0,1]\) given by the complement of \(\alpha (u)\) in the positive space of p-values, that is: \({{\rm{PIP}}}_{{\rm{science}}}^{u}=1-\alpha (u)\). Values close to 0 indicate low probability of peer influence, whereas values close to 1 suggest high probability of peer influence. Analogously, we define the peer influence probability \({{\rm{PIP}}}_{{\rm{conspir}}}^{u}\) of \({N}_{c}(u)\) on u, for any user u polarized towards conspiracy.
Dynamic time warping
Dynamic time warping (DTW) is an algorithm for measuring similarity between two time series X and Y which computes the optimal (least cumulative distance) alignment between points of X (also called the query vector) and Y (also called the reference vector). If X has size n and Y has size m, DTW produces an n × m cost matrix D whose \((i,j)\)-th element is the Euclidean distance \(d({\bar{X}}_{i},{\bar{Y}}_{j})\), where \({\bar{X}}_{i}\) and \({\bar{Y}}_{j}\) are obtained by stretching in time the vectors \((X[1],\ldots ,X[i])\) and \((Y[1],\ldots ,Y[j])\) to obtain the best alignment. The value \(D(n,m)\), i.e., the DTW distance between X and Y, is returned^{38}.
We use DTW distance for measuring the similarity between \(t(u)\) and \(t({N}_{s}(u))\) (\(t({N}_{c}(u))\)) for any user u polarized towards science (conspiracy).
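A minimal Python sketch of the DTW distance via the classic dynamic-programming recursion. It assumes an absolute-difference local cost between daily like counts and the standard symmetric step pattern (match, insertion, deletion); both choices are assumptions made for illustration, since the exact configuration is not specified here.

```python
def dtw_distance(x, y):
    """Cumulative cost of the best monotone alignment between series x and y.

    ASSUMPTIONS: local cost |x[i] - y[j]| and steps (i-1, j), (i, j-1),
    (i-1, j-1); no windowing constraint and no normalization.
    """
    n, m = len(x), len(y)
    if n == 0 or m == 0:
        raise ValueError("both series must be non-empty")
    inf = float("inf")
    # cost[i][j] = DTW distance between x[:i] and y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # stretch y
                                     cost[i][j - 1],      # stretch x
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]
```

Two series that differ only by a repeated value, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, are at distance 0, which is the property that makes DTW suitable for comparing activity patterns that are shifted or stretched in time.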
Results and Discussion
Anatomy of science and conspiracy pages
To ensure the robustness of our analysis of the online behavior of polarized users (i.e., to check that likes are not trivially distributed across pages and that the data respect the assumptions of the tests described in Methods), we verify the eligibility of the space of our investigation. Namely, we study how likers and their activity are distributed over pages and how pages’ activity is distributed over time. Figure 1 shows the distribution of likes and likers across scientific and conspiracy news sources, respectively. The plots show the ratio likers/likes for every science (left panel) and conspiracy (right panel) page. Points are colored according to the number of users who liked contents published by the corresponding page (see Tables 2 and 3 for the list of scientific and conspiracy news sources, respectively).
Points are mostly localized near the center of the radar chart and, in general, represent the pages with more likers (and more likes). Moreover, points far from the center correspond to pages with the lowest number of likers and likes. This ensures that a comparison between the normalized distributions of likes of two likeminded users (or groups of users) across the community pages is an unbiased estimator of their behavioral difference in terms of liking activity.
Furthermore, in order to investigate how scientific and conspiracy news sources distribute their posting activity over time, we compute the fraction of days with activity of any page with respect to its lifetime, i.e., the temporal distance between its first and its last post. Then we perform an augmented Dickey–Fuller (ADF) test of the null hypothesis that a unit root is present in the time series given by the number of posts per day published by a community page during its lifetime. The alternative hypothesis is stationarity (see Methods for further details). Figure 2 shows the PDF of the fraction of days with activity per page and the PDF of the p-values obtained by performing the ADF test for all the pages of the science community (left panel) and all the pages of the conspiracy community (right panel), respectively.
The plots indicate that most pages of both communities are active with a nearly constant number of posts almost every day of their lifetime.
Experiencing the confirmation bias: polarization and homophily
Users’ liking activity across contents of the different categories^{2,3,27} may be interpreted as a preferential attitude towards one or the other type of information (documented or not). In Fig. 3 we show that the probability density function (PDF) for the polarization of all the users in V^{L} is sharply peaked and bimodal, with the vast majority of users polarized either towards science (\(\rho (u)\sim -1\)) or conspiracy (\(\rho (u)\sim 1\)). Hence, Fig. 3 shows that most likers can be divided into two groups of users, those polarized towards science and those polarized towards conspiracy. To better define the properties of these groups, we define the set \({V}_{{\rm{science}}}^{L}\) of users with polarization more than 95% towards science
\[{V}_{{\rm{science}}}^{L}=\{u\in {V}^{L};\,\rho (u) < -0.95\}\]
and the set \({V}_{{\rm{conspir}}}^{L}\) of users with polarization more than 95% towards conspiracy
\[{V}_{{\rm{conspir}}}^{L}=\{u\in {V}^{L};\,\rho (u) > 0.95\};\]
such sets correspond to the two peaks of the bimodal distribution and show that most users are highly polarized: \(|{V}_{{\rm{science}}}^{L}|=243,977\) and \(|{V}_{{\rm{conspir}}}^{L}|=758,673\).
Moreover, for a polarized user \(u\in {V}_{{\rm{science}}}^{L}\), in the left panel of Fig. 4, we show the log-linear plot of the average fraction of science pages where u is present with liking activity, as a function of the number of likes θ of the user u. In the right panel, we show the same quantities for polarized users in \({V}_{{\rm{conspir}}}^{L}\). Figure 4 suggests in both cases a quadratic correlation among the variables; thus, we check whether, for a polarized user u, the fraction of community pages over which u spreads her liking activity, \(y(u)\), can be predicted by means of a quadratic regression model where the explanatory variable is a logarithmic transformation of the number of likes θ(u), i.e. \(y(u)={\beta }_{0}+{\beta }_{1}\,\log \,\theta (u)+{\beta }_{2}\,{\log }^{2}\,\theta (u)\). Using the notation introduced in Methods, it is \(y(u)=|{{\mathscr P}}_{{\rm{science}}}^{u}|/|{{\mathscr P}}_{{\rm{science}}}|\) for \(u\in {V}_{{\rm{science}}}^{L}\) and \(y(u)=|{{\mathscr P}}_{{\rm{conspir}}}^{u}|/|{{\mathscr P}}_{{\rm{conspir}}}|\) for \(u\in {V}_{{\rm{conspir}}}^{L}\). Coefficients are estimated using weighted least squares with weights given by the total number of users per engagement value and they are (with the corresponding standard errors in round brackets) \({\beta }_{0}=0.0669(0.0011)\), \({\beta }_{1}=0.2719(0.0137)\) and \({\beta }_{2}=0.0419(0.0040)\), with \({r}^{2}=0.7133\), for users polarized towards science, and \({\beta }_{0}=0.1229(0.0014)\), \({\beta }_{1}=0.9023(0.0195)\) and \({\beta }_{2}=0.1629(0.0054)\), with \({r}^{2}=0.8876\), for users polarized towards conspiracy. All the p-values are close to zero.
Summarizing, we find that the consumption of polarizing contents is dominated by confirmation bias through the mechanism of challenge avoidance: users polarized towards a narrative tend to consume nearly exclusively content adhering to their system of beliefs, thereby minimizing their cognitive dissonance. Indeed, as their engagement grows, polarized users spread their attention over a higher number of pages (and topics) while remaining consistent with their behavioral attitude.
By exploiting the social network of polarized users and their friends, we investigate the role of the reinforcement seeking mechanism in the homophily-driven choice of friends on Facebook, i.e., the tendency of users to aggregate around common interests. Figure 5 shows the fraction of friends of polarized users as a function of their engagement \(\psi (\,\cdot \,)\) both in the case of users in \({V}_{{\rm{science}}}^{L}\) and in the case of users in \({V}_{{\rm{conspir}}}^{L}\). The plots suggest that users not only tend to be very polarized, but they also tend to be linked to users with similar preferences. This is more evident among conspiracists where, for a polarized user u, the fraction of friends v with likewise polarization is very high (\(\gtrsim \)0.62) and grows with the engagement \(\psi \) up to \(\gtrsim \)0.87. The neighborhood of a polarized scientific user u tends to be more heterogeneous, but the fraction of friends with the same polarization as u grows more strongly with the engagement \(\psi \) (from \(\gtrsim \)0.30 up to \(\gtrsim \)0.66). Furthermore, Fig. 5 clearly indicates that the neighborhood of users engaged with polarizing contents (verified or not) is almost completely polarized as well (74–80% for science users and 72–90% for conspiracy users). The fact that highly polarized users have friends exhibiting an opposite polarization is direct evidence of the challenge avoidance mechanism: contents promoted by friends which contrast with one’s worldview are ignored.
Summarizing, we find that the activity of a user on polarizing content increases the probability of having friends with similar characteristics. Such information is a precious insight toward the understanding of information diffusion. Indeed, a previous work has shown that users usually exposed to undocumented claims (e.g., conspiracy stories) are the most likely to mistake intentionally false information for ordinary conspiracy stories^{3}.
Engagement, friends and shared news sources
Looking at the self-description of the news sources, several distinct targets emerge both among science pages and among conspiracy pages (see Tables 2 and 3, respectively). This calls for a distinction between friends of a polarized user u who share with u a similar polarization resulting from liking contents of the same community and friends of u who actually like contents promoted by the same pages supported by u. In other words, in the first case the user u and her neighborhood are grouped together at community level (they have the same or similar polarization but they like different pages); in the second case the user u and her neighborhood are grouped together at page level (they not only like pages in the same community but are also somewhat active on the same set of pages).
For a polarized scientific user \(u\in {V}_{{\rm{science}}}^{L}\), in the left panel of Fig. 6, we show the log-linear plot of the average fraction y of friends \(v\in {V}_{{\rm{science}}}^{L}\) with liking activity on the community pages liked by u, as a function of the number of likes θ of the user u. In the right panel, we show the same quantities for polarized conspiracy users in \({V}_{{\rm{conspir}}}^{L}\). Figure 6 suggests in both cases a linear correlation among the variables; thus, we check whether, for a polarized user u, the fraction of friends in her category who like contents from the community pages preferred by u, \(y(u)\), can be predicted by means of a linear regression model where the explanatory variable is a logarithmic transformation of the number of likes \(\theta (u)\), i.e. \(y(u)={\beta }_{0}+{\beta }_{1}\,\log \,\theta (u)\). Coefficients are estimated using weighted least squares with weights given by the total number of users per engagement value and they are (with the corresponding standard errors in round brackets) \({\beta }_{0}=0.4062\,(0.0007)\) and \({\beta }_{1}=0.0869\,(0.0012)\), with \({r}^{2}=0.8744\), for users polarized towards science; \({\beta }_{0}=0.3582\,(0.0007)\) and \({\beta }_{1}=0.1501\,(0.0012)\), with \({r}^{2}=0.9413\), for users polarized towards conspiracy. All the p-values are close to zero. This suggests that polarized users not only tend to surround themselves with friends having similar systems of beliefs, but they actually share with them the involvement within the same community pages.
Confirmation bias as a filter to peer influence
Here we study the liking activity of polarized users in more detail by measuring how they span such activity across the various community pages. For science (conspiracy) community, Fig. 7 shows the probability distribution of the localization L along the user set and along the neighborhood set, and the relationship between \(L(u)\) and \(L({N}_{s}(u))\,(L({N}_{c}(u)))\) for each science (conspiracy) user u.
For each polarized user u, we observe a positive correlation between these two order parameters: Pearson’s correlation coefficient \({r}_{L(u),L({N}_{s}(u))}\sim 0.5962\) with p-value ~10^{−7} for the science community, and Pearson’s correlation coefficient \({r}_{L(u),L({N}_{c}(u))}\sim 0.5935\) with p-value ~10^{−9} for the conspiracy community. Nevertheless, most users remain confined within groups of very few pages even when their neighborhoods are fairly active on several news sources. Moreover, the inset plots of Fig. 7 show on a logarithmic x scale the relation of \(\theta (u)\) with \(L(u)\) and \(L({N}_{s}(u))\,(L({N}_{c}(u)))\), respectively, for each \(u\in {V}_{{\rm{science}}}^{L}\) (\({V}_{{\rm{conspir}}}^{L}\)). Full lines are the results of a linear regression model whose coefficients are estimated using weighted least squares with weights given by the total number of users per engagement value.
By investigating the self-description of the news sources, we also find that most users who spread their liking activity over a higher number of pages choose pages dealing with closely interlinked topics (\(\gtrsim \)76% of science users and \(\gtrsim \)69% of conspiracy users). Such evidence suggests that the reinforcement seeking mechanism limits the influence of neighbors and primarily drives the selection and the diffusion of contents even within groups of like-minded people.
Peer support and reinforcement of preexisting beliefs
So far we have shown how confirmation bias acts as a filter to peer influence. In this section, we investigate the effects of the joint action of confirmation bias and peer influence when the latter does not conflict with the cognitive mechanisms of challenge avoidance and reinforcement seeking. Namely, we first compare the liking activity of each polarized user across her preferred community pages with the liking activity expressed on the same pages by the part of her neighborhood with likewise polarization. Then we compare the daily time series given by the number of likes expressed by a polarized user and her like-minded neighborhood, respectively, and we investigate the existence of a causal effect of the latter on the former.
For any polarized user \(u\in {V}_{{\rm{science}}}^{L}\) we compute the cosine between the unit vectors (versors) \(\hat{{\bf{u}}}=\frac{{\bf{u}}}{|{\bf{u}}|}\) and \(\widehat{{{\bf{N}}}_{{\bf{s}}}}({\bf{u}})=\frac{{{\bf{N}}}_{{\bf{s}}}({\bf{u}})}{|{{\bf{N}}}_{{\bf{s}}}({\bf{u}})|}\), where u and N_{s}(u) are the vectors whose \({k}^{{\rm{th}}}\) component is the number of likes expressed by u and \({N}_{s}(u)\) on the k^{th} page of \({{\mathscr P}}_{science}^{u}\), respectively (see Methods for further details). The same quantities are calculated for any polarized user \(u\in {V}_{{\rm{conspir}}}^{L}\). Figure 8 shows the level of proportionality between the distributions of liking activity of u and \({N}_{s}(u)\) (\({N}_{c}(u)\)) across the pages of \({{\mathscr P}}_{science}^{u}\) (\({{\mathscr P}}_{conspir}^{u}\)), respectively, versus the number of likes \({\log }_{2}(\theta (u))\) of user u. Segments represent the average of the cosine measurements over users with a liking activity in the range of the corresponding bin (one of \(1,2,(2,4],(4,8],(8,16],\ldots \)), and they are colored according to the total number of users belonging to that range.
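The cosine measurement and the binning described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the names cosine_similarity, engagement_bin and mean_cosine_per_bin are hypothetical, and the engagement \(\theta (u)\) is approximated here by the sum of the user's like vector, which may differ from the definition given in Methods.

```python
import math
from collections import defaultdict


def cosine_similarity(u, n):
    """Cosine of the angle between two like-count vectors over the same pages."""
    dot = sum(a * b for a, b in zip(u, n))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_n = math.sqrt(sum(b * b for b in n))
    if norm_u == 0 or norm_n == 0:
        raise ValueError("cosine is undefined for an all-zero vector")
    return dot / (norm_u * norm_n)


def engagement_bin(theta):
    """Index of the bin 1, 2, (2,4], (4,8], ... containing theta (an integer >= 1)."""
    if theta < 1:
        raise ValueError("theta must be at least 1")
    # Integer arithmetic avoids rounding issues of log2 at bin edges.
    return (theta - 1).bit_length()


def mean_cosine_per_bin(users):
    """Average cosine per engagement bin.

    `users` is an iterable of (likes_of_u, likes_of_neighborhood) pairs.
    Assumption: theta(u) is taken as the sum of likes_of_u.
    """
    acc = defaultdict(lambda: [0.0, 0])
    for likes_u, likes_n in users:
        b = engagement_bin(sum(likes_u))
        acc[b][0] += cosine_similarity(likes_u, likes_n)
        acc[b][1] += 1
    return {b: total / count for b, (total, count) in sorted(acc.items())}


# Hypothetical toy data: likes of a user and of her neighborhood on three pages.
toy_users = [
    ([1, 0, 0], [4, 1, 0]),
    ([3, 1, 0], [30, 12, 1]),
    ([10, 5, 1], [90, 50, 8]),
]
print(mean_cosine_per_bin(toy_users))
```

A cosine close to 1 means that the two like distributions are nearly proportional, regardless of how many likes each side has expressed in total.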
The plots show that a polarized user and her likewise polarized neighborhood distribute their likes across her community pages in a similar way, in both the science (left panel) and the conspiracy (right panel) community. Moreover, except for a nearly constant early pattern for conspiracy users, this trend grows with user engagement, suggesting that peer influence acts as a support for reinforcement seeking. This interpretation emerges more clearly when we compare the temporal evolution of the liking activity of a polarized user with that of her likewise polarized neighborhood.
In order to carry out such an analysis we restrict the observations to those polarized users u who exhibit a liking activity large enough to allow the comparison between the time series of likes per day expressed by u and her likewise polarized neighborhood, respectively. Namely we define
where \({\bar{\theta }}_{{\rm{science}}}=13\) is the average number of total likes expressed by a user of \({V}_{{\rm{science}}}^{L}\), and
where \({\bar{\theta }}_{{\rm{conspir}}}=12\) is the average number of total likes expressed by a user of \({V}_{{\rm{conspir}}}^{L}\). Furthermore, let \(t(u)\) and \(t({N}_{s}(u))\) (\(t({N}_{c}(u))\)) be the time series of likes per day expressed over \({{\mathscr P}}_{{\rm{science}}}^{u}\) (\({{\mathscr P}}_{{\rm{conspir}}}^{u}\)) by a user \(u\in {\bar{V}}_{{\rm{science}}}^{L}\) (\(u\in {\bar{V}}_{{\rm{conspir}}}^{L}\)) and her likewise polarized neighborhood, respectively. We estimate the temporal similarity between the liking activity of u and \({N}_{s}(u)\) (\({N}_{c}(u)\)) by measuring the DTW distance \(d(t({N}_{s}(u)),t(u))\) (\(d(t({N}_{c}(u)),t(u))\)) (see Methods for further details). Figure 9 shows the PDF of such distances for science users (left panel) and conspiracy users (right panel). In both cases we observe that most users produce a daily time series of likes very similar to that produced by their likewise polarized neighborhood. Moreover, the inset plots show a strong positive correlation (Pearson’s coefficient \(\gtrsim \)0.9887 and \(\gtrsim \)0.9886 for science and conspiracy, respectively, with both p-values close to zero) between the DTW distance and the difference in size between the liking activity of u and that of \({N}_{s}(u)\) (\({N}_{c}(u)\)). This suggests that extreme DTW distances stem from this almost perfect increasing linear relationship, i.e., from differences in activity volume, rather than from an actual temporal dissimilarity between liking activities.
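For readers unfamiliar with dynamic time warping, the sketch below implements the textbook DTW recurrence in Python. It assumes an absolute-difference local cost and no warping-window constraint; the variant actually used in the paper is the one specified in Methods, and the function name dtw_distance and the toy series are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences.

    Assumptions: local cost |a_i - b_j|, steps (i-1, j), (i, j-1), (i-1, j-1),
    and no window constraint.
    """
    if not a or not b:
        raise ValueError("both series must be non-empty")
    inf = float("inf")
    m = len(b)
    # prev[j] holds the best cumulative cost of aligning a[:i-1] with b[:j].
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, len(a) + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


# Hypothetical daily like counts for a user and for her neighborhood.
user_series = [0, 1, 0, 2, 0]
same_shape_shifted = [0, 0, 1, 0, 2, 0]             # same pattern, delayed by one day
same_shape_scaled = [10 * v for v in user_series]   # same timing, larger volume

print(dtw_distance(user_series, same_shape_shifted))
print(dtw_distance(user_series, same_shape_scaled))
```

The second comparison illustrates the point made with the inset plots: two series with identical timing but very different volumes can still be far apart in DTW distance.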
For each science user in \({\bar{V}}_{{\rm{science}}}^{L}\), we also investigate a causal effect of \(t({N}_{s}(u))\) on \(t(u)\) by testing the null hypothesis that the former is Granger-noncausal for the latter, namely \({{\mathbb{H}}}_{0}:\,=t{(u)}_{\tau +1}{\mathrel{{\perp\mkern-10mu\perp}}}{ {\mathcal I} }^{\ast }(\tau )\,|\,{ {\mathcal I} }_{-t({N}_{s}(u))}^{\ast }(\tau )\). The alternative hypothesis \({{\mathbb{H}}}_{1}\) is predictive causality. The same analysis is repeated for each conspiracy user in \({\bar{V}}_{{\rm{conspir}}}^{L}\) (see Methods for further details). In both panels of Fig. 10 we show the PDF of the p-values obtained by performing such Granger causality tests. The inset plots show the cumulative distribution function (CDF) of the same quantities. The plots show that p-values below the threshold \(\bar{\alpha }=0.05\), for which the null hypothesis is rejected, are more likely than the others in both communities and represent ~29% and ~34% of the total in science and conspiracy, respectively.
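The idea behind the test can be illustrated with a simplified Python sketch that compares two autoregressive fits of \(t(u)\): one using only its own past, and one that also uses the past of the neighborhood series. The sketch fixes the lag by hand and returns only the F statistic; the lag selection, stationarity checks and exact test statistic used in the paper are those described in Methods. The names ols_rss and granger_f and the synthetic series are hypothetical.

```python
import random


def ols_rss(rows, y):
    """Residual sum of squares of an ordinary least squares fit of y on rows."""
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def granger_f(x, y, lag=1):
    """F statistic for H0: past values of x do not help predict y.

    Restricted model: y_t on a constant and its own `lag` past values.
    Full model: the same regressors plus `lag` past values of x.
    """
    n = len(y)
    if len(x) != n:
        raise ValueError("series must have the same length")
    times = range(lag, n)
    target = [y[t] for t in times]
    restricted = [[1.0] + [y[t - l] for l in range(1, lag + 1)] for t in times]
    full = [row + [x[t - l] for l in range(1, lag + 1)]
            for row, t in zip(restricted, times)]
    df2 = len(target) - (2 * lag + 1)
    if df2 <= 0:
        raise ValueError("series too short for this lag")
    rss_r = ols_rss(restricted, target)
    rss_u = ols_rss(full, target)
    return ((rss_r - rss_u) / lag) / (rss_u / df2)


# Synthetic example: y follows x with a one-day delay plus small noise.
rng = random.Random(0)
x_toy = [rng.gauss(0, 1) for _ in range(200)]
y_toy = [0.0] + [0.8 * x_toy[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 200)]
print("F (x -> y):", granger_f(x_toy, y_toy))
print("F (y -> x):", granger_f(y_toy, x_toy))
```

Under the null hypothesis the statistic follows an F distribution with (lag, df2) degrees of freedom, so a p-value can be obtained from its upper tail, for instance with scipy.stats.f.sf if SciPy is available.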
As an example, Fig. 11 shows the daily time series of a selected user \(u\in {\bar{V}}_{{\rm{science}}}^{L}\) with \(\theta (u)=767\) (left panel) and a selected user \(v\in {\bar{V}}_{{\rm{conspir}}}^{L}\) with \(\theta (v)=488\) (right panel), compared with the daily time series of the likewise polarized parts of their neighborhoods, \({N}_{s}(u)\) and \({N}_{c}(v)\), which have expressed 779 and 919 likes, respectively. For the pair of time series (\(t({N}_{s}(u)),t(u)\)), DTW returns a distance equal to 407 and the Granger causality test a p-value ~10^{−4}. For the pair of time series (\(t({N}_{c}(v)),t(v)\)), DTW returns a distance equal to 463 and the Granger causality test a p-value ~10^{−5}.
Finally, for each polarized user \(u\in {\bar{V}}_{{\rm{science}}}^{L}\), we study the relationship between predictive causality of \(t({N}_{s}(u))\) on \(t(u)\) and the engagement of u. To this aim we use the peer influence probability \({{\rm{PIP}}}_{{\rm{science}}}^{u}\) (see Methods for further details), which measures how effective the influence of neighbors is in reinforcing the system of beliefs of u. The same analysis is carried out for any polarized user \(u\in {\bar{V}}_{{\rm{conspir}}}^{L}\). Figure 12 shows the peer influence probability of u versus the number of likes \({\log }_{2}(\theta (u))\) of u in both the science (left panel) and the conspiracy (right panel) community. Segments represent the average of the peer influence probabilities over users with a liking activity in the range of the corresponding bin, and they are colored according to the total number of users in that range.
The plots show that, in both communities, polarized users reinforce their preexisting beliefs by leveraging the activity of their likeminded neighbors, and that this trend grows with user engagement, suggesting that peer influence acts as a support for reinforcement seeking.
Conclusions
In this paper we studied the effects of confirmation bias on the spreading of information in a social network of 1.2 M users engaged with two very distinct and conflicting narratives on Facebook.
Our analyses showed the action of the challenge avoidance mechanism in the emergence, around the selected narratives, of two well-separated and polarized groups of users (i.e., echo chambers) who also tend to be surrounded by friends having similar systems of beliefs.
Furthermore, we explored the hypothesis that such a pattern is recursive within a single echo chamber. Despite a shared way of thinking, we showed that during social interactions confirmation bias is stronger than one might expect: it bends the action of peer influence to its own service and fosters the formation of highly polarized subclusters within the same echo chamber. The fact that polarized users tend to remain confined within groups of very few pages, even when the corresponding neighborhoods are active on several news sources, suggests that the reinforcement seeking mechanism limits the influence of neighbors and primarily drives the selection and the diffusion of contents even within groups of likeminded people.
Finally, we investigated the effects of the joint action of confirmation bias and peer influence when the latter does not conflict with the cognitive mechanisms of challenge avoidance and reinforcement seeking. Namely, we compared the liking activity of polarized users with the liking activity of their likewise polarized neighborhood, and we tested for a causal effect of the latter on the former. Our findings revealed that polarized users reinforce their preexisting beliefs by leveraging the activity of their likeminded neighbors, and that this trend grows with user engagement, suggesting that peer influence acts as a support for reinforcement seeking.
In such a context, even the positive role played by social influence, e.g., in enabling social learning, seems to lose its effectiveness in smoothing polarization and reducing the risk of misinformation and its consequences. This makes it even more difficult to design efficient communication strategies to prevent rumors and mistrust.
The Internet and social media are ideal ground for speeding up the spread of misinformation, but individual choices, more than algorithms, characterise the consumption patterns of users and their friends. Therefore, working towards long-term solutions for these challenges cannot be separated from a deep understanding of the cognitive determinants of users’ behavior behind these phenomena.
Change history
09 March 2021
A Correction to this paper has been published: https://doi.org/10.1038/s41598-021-83758-0
Acknowledgements
A.S., M.C. and E.B. acknowledge the support from CNR-PNR National Project DFM.AD004.027 “CrisisLab” and P0000326 project AMOFI (Analysis and Models OF social medIa). Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding parties.
Author information
Contributions
E.B., M.C. and A.S. conceived the experiments. E.B. conducted the experiments. E.B., M.C., W.Q. and A.S. analysed the results, wrote, reviewed and approved the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Brugnoli, E., Cinelli, M., Quattrociocchi, W. et al. Recursive patterns in online echo chambers. Sci Rep 9, 20118 (2019). https://doi.org/10.1038/s41598-019-56191-7