## Introduction

The 2018 Global Digital Report stated that ‘more than 3 billion people around the world now use social media each month’ (https://digitalreport.wearesocial.com). Even traditional newspapers and news agencies have moved to social networks to cope with this societal change.

Researchers have made considerable efforts to fight the never-ending plague of malicious bots populating social networks. The literature offers a plethora of successful approaches based, e.g., on the profile14,15, network16,17,18 and posting19,20,21 characteristics of the accounts. In particular, the supervised approach proposed by Cresci et al. in ref. 14 tested a series of classification rules proposed by bloggers, and feature sets proposed by academia, on a reference dataset of genuine and fake accounts, leading to the implementation of a classifier that significantly reduces the cost of data gathering.

Actually, the studies regarding detection of automated accounts rarely analyse their effective contribution in the social networks panorama. Indeed, while messages exchanged on social platforms contain a great amount of data, just a fraction of them carries crucial information for the description of the system, while the rest contributes to random noise. Thus, detecting the relevant (i.e., those not compatible with users’ random activity) communication and interaction patterns is of utmost importance in order to understand which accounts, including bots, contribute to the effective dissemination of messages. In this sense, it is necessary to compare the properties of the real network with a proper null model.

Entropy-based null-models are a natural choice, since they are general and, being based on Shannon entropy, unbiased by construction. In a nutshell, starting from the real network, their definition relies on three steps: (1) the definition of an ensemble of graphs; (2) the definition of the entropy for this ensemble and its maximization up to some (local or global) constraints22; (3) the maximization of the likelihood of the real network23,24. Entropy-based null-models have been successfully used in the last years for the analysis of complex networks25,26. The fields of application are the most varied, from reconstructing a network from partial information27, to detecting early signals of structural changes28,29, to assessing the systemic risk of a financial system30,31. Recently, this approach has been applied by Becatti et al.32 to the Twitter traffic during the 2018 Italian election campaign. The study was able to infer political standings directly from data. Moreover, the analysis of the exchanged messages showed a signal of communication between opposite political forces during the election campaign, which anticipated an unexpected post-elections political agreement.
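As a compact illustration of these three steps, the construction can be written in the standard form used in the literature on maximum-entropy graph ensembles. The notation is chosen here for illustration and is not taken from this paper: $$\mathcal{G}$$ is the ensemble of graphs, $$\vec{C}$$ the vector of constrained quantities, $$\vec{\theta}$$ the associated Lagrange multipliers and $$G^{*}$$ the real network. The Shannon entropy of the ensemble is maximized subject to normalization and to the constraints being fixed on average:

$$S=-\sum_{G\in \mathcal{G}}P(G)\ln P(G),\qquad \sum_{G\in \mathcal{G}}P(G)=1,\qquad \sum_{G\in \mathcal{G}}P(G)\,\vec{C}(G)=\langle \vec{C}\rangle .$$

The solution is an exponential family,

$$P(G|\vec{\theta})=\frac{e^{-\vec{\theta}\cdot \vec{C}(G)}}{Z(\vec{\theta})},\qquad Z(\vec{\theta})=\sum_{G\in \mathcal{G}}e^{-\vec{\theta}\cdot \vec{C}(G)},$$

and maximizing the likelihood of the real network fixes the multipliers so that the ensemble averages of the constraints match the observed values:

$$\vec{\theta}^{*}=\arg \max_{\vec{\theta}}\,\ln P(G^{*}|\vec{\theta})\quad \Longrightarrow \quad \langle \vec{C}\rangle_{\vec{\theta}^{*}}=\vec{C}(G^{*}).$$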

In the present paper, we merge the application of the lightweight classifier for bot detection proposed by Cresci et al. in ref. 14 with the analysis of complex networks via entropy-based null-models. Once we have cleaned the system from the random noise via the application of the null-model, we study the effects of social bots in retweeting a significant amount of messages on Twitter. The analysis is applied to a tweet corpus about migration in the Mediterranean Sea from North Africa to Italy.

This study has two main results: firstly, after cleaning the system from the random activity of users, we detect the main hubs of the network, i.e., the most effective accounts in significantly propagating their messages. We observe that those accounts have a number of bots among their followers (in the cleaned network) higher than average. Secondly, the strongest hubs in the network share a relatively high number of bots as followers, which most probably aim at further increasing the visibility of the hubs’ messages via following and retweeting. Hereafter, we will refer to groups of bots that follow and retweet the same group of hubs with the term bot squads. To the best of our knowledge, the existence of formations of bots shared by a group of human-operated accounts has never been reported in the literature before.

## Results

### User polarization

On Twitter, users are strongly clustered in communities sharing similar ideas (evidence of this, and discussions of its implications, can be found in many papers, see, e.g., refs. 33,34,35,36,37,38,39,40,41). Our assumption is that, if two users interact with the same followers and followees, they probably share similar viewpoints, including those regarding politics. We thus build clusters of politically homogeneous groups by starting from those accounts for which we have the most information available. We exploit the fact that Twitter offers the possibility (upon request of the account owner) to obtain an official certification of the account’s authenticity. The procedure is mostly adopted by VIPs, official political parties, newspapers, radio stations and TV channels, to reduce interference from fake users. Accounts that pass the procedure are tagged as verified and, on the official portal, have a blue circle with a white tick at the center next to their name. Verified users have proved to be a solid starting point for accurate analyses. In fact, not only do they lead to valuable information about the number of bots that follow them (see work by Varol et al. in ref. 42), but, by following the communication patterns of a set of verified accounts, it is also possible to obtain a very large set of trusted (i.e., non-bot) accounts43.

To infer the political orientation of a user from the available data, we focus on the bipartite network of verified (on one layer) and unverified (on the other layer) accounts, as in Fig. 1. A link between two users belonging to different layers is present if one of them retweeted the other at least once. In our representation, the network is undirected: we do not consider who retweeted whom, but only the presence of at least one retweet.
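The construction of this undirected bipartite network can be sketched as follows. This is a minimal illustration, not the code used for the study: the input format (a list of `(retweeter, author)` pairs plus a set of verified account names) and the function name `build_bipartite` are assumptions introduced here.

```python
from collections import defaultdict

def build_bipartite(retweets, verified):
    """Return {verified_user: set(unverified_users)} for an undirected bipartite graph.

    retweets: iterable of (retweeter, author) pairs (hypothetical input format).
    verified: set of verified account names.
    """
    neighbours = defaultdict(set)
    for retweeter, author in retweets:
        # Keep only interactions that cross the two layers.
        if (retweeter in verified) == (author in verified):
            continue
        v, u = (retweeter, author) if retweeter in verified else (author, retweeter)
        # A set ignores both the direction and the multiplicity of retweets.
        neighbours[v].add(u)
    return dict(neighbours)

retweets = [("alice", "V_news"), ("V_news", "alice"), ("bob", "V_news"),
            ("alice", "bob"), ("V_party", "V_news"), ("bob", "V_party")]
graph = build_bipartite(retweets, verified={"V_news", "V_party"})
```

Retweets between two verified or two unverified accounts are discarded, and repeated or reciprocated retweets collapse into a single link, which matches the "at least one retweet" rule stated above.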

It is worth noting that other types of interactions, such as replies, mentions and quoted tweets, are present on Twitter. We consider retweeting activity only, since it represents the preferred way through which users spread messages they agree with34. If we had used, for instance, replies, the network would have been much harder to interpret: replies can be used either to support the ideas of the original tweet or to express disagreement with them. A similar reasoning holds for the other possible interactions. By inserting mentions in a tweet, a user either invites the mentioned accounts to participate in the discussion or points out that the tweet somehow affects them; however, this can be done in either a provocative or a constructive way. Analogously, in the case of quoted tweets, the sender may intend to comment on a tweet, but either to support or to deplore it. Thus, since the intention behind mentions, quotes and replies can be of different kinds, we focus on retweeting, the type of interaction that is not amenable to multiple interpretations.

We project the bipartite network of verified and unverified users on the layer of the former. To do that, we consider the statistically significant amounts of interactions shared by pairs of verified accounts. The steps of this projection are sketched in Fig. 1. For every pair of verified users, we count the number of common unverified users interacting with them and compare this number with its expected probability distribution according to an entropy-based null-model constraining the degree sequence of both layers of the bipartite network. If the p-value of the observation on the real network is statistically significant, we project a link on the layer of verified users. In this way, we can focus on the statistically significant groups of verified users that are similarly perceived by the unverified users. The technical details about this validated projection can be found in the “Methods” section.
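The logic of the validated projection can be sketched in a few lines. The sketch below is deliberately simplified and is not the procedure of the “Methods” section: it replaces the maximum-likelihood link probabilities of the entropy-based null-model with the sparse-network approximation p(v,u) ≈ k_v k_u / L, approximates the distribution of common neighbours with a Poisson distribution, and uses a plain significance threshold `alpha` instead of a multiple-testing correction. All names (`validated_projection`, `poisson_sf`, `alpha`) are hypothetical.

```python
import math
from itertools import combinations

def poisson_sf(k, mu):
    """P(X >= k) for X ~ Poisson(mu)."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-mu) * mu**n / math.factorial(n) for n in range(k))
    return max(0.0, 1.0 - cdf)

def validated_projection(neighbours, alpha=0.05):
    """neighbours: {verified: set(unverified)}. Returns the validated verified-verified links."""
    # Degrees on both layers and total number of links L.
    k_ver = {v: len(us) for v, us in neighbours.items()}
    k_unv = {}
    for us in neighbours.values():
        for u in us:
            k_unv[u] = k_unv.get(u, 0) + 1
    total = sum(k_ver.values())
    # Assumption: p(v,u) ~ k_v * k_u / L, so the expected number of common
    # neighbours of (v, w) is k_v * k_w * sum_u k_u^2 / L^2.
    s2 = sum(k * k for k in k_unv.values())
    links = []
    for v, w in combinations(sorted(neighbours), 2):
        observed = len(neighbours[v] & neighbours[w])
        mu = k_ver[v] * k_ver[w] * s2 / total**2
        # Project a link only if the observed overlap is unexpectedly large.
        if poisson_sf(observed, mu) < alpha:
            links.append((v, w))
    return links

neighbours = {
    "A": {f"u{i}" for i in range(5)},
    "B": {f"u{i}" for i in range(5)},
    "C": {f"u{i}" for i in range(5, 25)},
}
links = validated_projection(neighbours)
```

In this toy input, A and B share all five of their neighbours while about one common neighbour is expected, so the pair is validated; C shares no neighbour with either and receives no link.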

The presence of a strong community structure in the bipartite network composed of verified and unverified layers of users has already been observed by Adamic et al. in ref. 33. Here, we repeat the analysis of Becatti et al.32 and check the results on the layer of verified users, for which we have reliable information. In this case too, we find a strong community structure.

We would like to remark that we assign the terms ‘validated’ and ‘verified’ two different meanings: the former indicates a node that passes the filter of the projection while the latter refers to accounts that pass an authenticity check by Twitter.

There are other approaches to inferring the political orientation of users from data. Del Vicario et al.37, for instance, consider several groups of Facebook pages and divide users depending on the frequency they interact with each group of pages. Instead, to decide whether a user is a Democrat or a Republican, Conover et al. in refs. 34,35,36 use network properties of retweets and a machine learning algorithm to determine the topic of a tweet. However, in both approaches, the communities are somehow decided a priori, either by data selection or by defining the groups. On the contrary, with our method, communities naturally arise from data.

Political orientation: in the following, we will often refer to Italian parties and representatives of the Italian government in office during the period in which the data were collected (23 January to 22 February 2019). For the sake of clarity, Supplementary Note 1 briefly explains the high-level characteristics of such parties.

The bipartite network describing the retweets between verified and unverified users involves nearly one half of the unverified users in our dataset. Nevertheless, the network obtained by following the projection procedure described in the “Methods” section shows a strong community structure, see Fig. 2.

To quantify the presence of clusters, we use the Louvain algorithm44, one of the most effective community detection algorithms. To avoid problems related to node ordering45, on a network with N nodes we apply the algorithm N times, each time after reshuffling the order of the nodes. Among the N partitions resulting from the application of the algorithm, we then select the configuration with the largest modularity45. Such a configuration, reported in Fig. 2, displays three main communities: one tied to the current government (right-wing parties and Movimento 5 Stelle, the Mediterranean blue community), one tied to the Italian Democratic Party (PD), its representatives, and some representatives of smaller parties on the left of PD (the tomato red community), and one tied to several NGOs, politicians on the left of PD, and different online and offline newspapers (the eggplant purple community). Smaller communities, including one with Joseph Muscat, Malta’s Prime Minister, and some of his ministers, have been involved in the discussion on the aid of migrants and castaways. The composition of the communities in Fig. 2 is detailed in the Supplementary Note 1.
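The "reshuffle, re-run, keep the best modularity" loop can be sketched as follows. The community detection step itself is passed in as a callable `detect`, a hypothetical stand-in for a Louvain implementation; the `toy_detect` function below is only an order-dependent placeholder used to make the example run and is not the Louvain algorithm.

```python
import random

def modularity(edges, partition):
    """Newman modularity of an undirected, unweighted graph.

    edges: list of (a, b) pairs; partition: {node: community label}.
    """
    m = len(edges)
    internal, degree = {}, {}
    for a, b in edges:
        ca, cb = partition[a], partition[b]
        degree[ca] = degree.get(ca, 0) + 1
        degree[cb] = degree.get(cb, 0) + 1
        if ca == cb:
            internal[ca] = internal.get(ca, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items())

def best_of_n_runs(nodes, edges, detect, seed=0):
    """Run `detect` once per node on reshuffled node orders; keep the best partition."""
    rng = random.Random(seed)
    best, best_q = None, float("-inf")
    for _ in range(len(nodes)):
        order = list(nodes)
        rng.shuffle(order)
        partition = detect(order, edges)
        q = modularity(edges, partition)
        if q > best_q:
            best, best_q = partition, q
    return best, best_q

def toy_detect(order, edges):
    """Placeholder (NOT Louvain): each node joins its first already-placed neighbour."""
    adj = {n: set() for n in order}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    partition = {}
    for n in order:
        placed = [x for x in order if x in partition and x in adj[n]]
        partition[n] = partition[placed[0]] if placed else n
    return partition

# Two triangles joined by one bridge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
single_block = {n: "a" for n in range(6)}
best, best_q = best_of_n_runs(range(6), edges, toy_detect)
```

Because the result of an order-dependent heuristic changes with the node order, comparing the modularity of the N outcomes and keeping the largest one removes the arbitrariness of any single run.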

If we compare the emerged communities with those observed by Becatti et al.32 during the 2018 electoral campaign, we notice several differences. The outcome of ref. 32 was that the Movimento 5 Stelle (hereafter M5S) group and their supporters were clearly distinguishable from the right wing, while, on issues concerning the Mediterranean migration, they are not. Instead, on the same issues, the left-wing representatives, inside and outside the Democratic Party, are much closer than during the electoral campaign.

Polarization of unverified accounts: verified accounts of politicians can be easily associated with a political party; the membership of unverified users can then be guessed by considering their interactions with the communities of verified ones. To do this, we use the polarization index $${\rho }_{i}$$ defined by Bessi et al. in refs. 46,47:

$${\rho }_{i}=\frac{{\max }_{c\in {\mathcal{C}}}{k}_{i}^{c}}{{k}_{i}},\tag{1}$$

where $${k}_{i}$$ is the degree of node i, $${k}_{i}^{c}$$ is the number of links towards the community c and $${\mathcal{C}}$$ is the set of communities. The distribution of $${\rho }_{i}$$ is extremely peaked on values close to 1, see Fig. 3a. Given such a strong polarization, we can safely assign unverified users the polarization of the community they mostly interact with.
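As a small worked example of the polarization index, the sketch below computes it from the community labels of the verified accounts a user is linked to. The input format (one label per link) and the function name `polarization` are assumptions made for illustration.

```python
from collections import Counter

def polarization(neighbour_communities):
    """Return (rho_i, dominant community) from the community labels of a node's links."""
    counts = Counter(neighbour_communities)       # k_i^c for every community c
    community, k_max = counts.most_common(1)[0]   # community with the most links
    return k_max / len(neighbour_communities), community

rho, label = polarization(["blue", "blue", "blue", "red"])
```

Here the user has four links, three of which point to the blue community, so the index is 3/4 and the user would be assigned to the blue community. By construction the index cannot fall below one over the number of communities the user touches, and it equals 1 when all links point to a single community.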

We also find a small amount (with respect to the size of verified–unverified network) of unverified nodes whose polarization is not strong enough to be uniquely assigned to a specific cluster: they are part of the grey group in Fig. 3b.

As noted at the beginning of the previous paragraph, this polarization procedure does not consider almost one half of the unverified users, since they do not interact, on the whole observation period, with a single verified user. This may be due to several reasons. Differently from Becatti et al.32, where a corpus of tweets exchanged during the election campaign was analyzed, here we focus on a set of tweets concerned with a specific topic of the political propaganda. We conjecture that, in the former case, the amount of unverified accounts interacting with the verified ones was much higher because it was of interest of the verified accounts (mostly, candidates in the elections) to involve ‘standard’ users.

In order to know more about unverified users not directly interacting with verified ones, we use what we call a contagion of polarization, namely a label propagation procedure in which the labels are those assigned in the previous steps. Even if unverified users do not retweet verified users, they may retweet other unverified ones and reveal their political orientation through those. As seeds for the propagation, we use the tags obtained in the previous step, i.e., the polarization of verified and unverified users of the previously defined bipartite network. If the majority of the unverified accounts retweeted by the account we want to tag is polarized towards a certain group, we assign that account to the same group. The procedure is iterated and terminates when no further unverified user can be assigned a polarization, i.e., when there is no clear agreement among the neighbours of any remaining node. In this way, after 10 rounds, we increase the fraction of users for which we determine a clear polarization by 27%. Even if this percentage looks small, we will see that the label propagation process is effective for the set of validated unverified accounts considered in the following sections: there, the increase of polarized users is almost 58%. Additional details are reported in the Supplementary Note 4.
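The contagion of polarization can be sketched as an iterative majority vote. The details below are assumptions made for illustration and may differ from the rules given in Supplementary Note 4: "majority" is read as a strict majority over all the accounts a user retweeted, labels are updated synchronously at the end of each round, and the names `propagate_labels`, `retweeted` and `seeds` are hypothetical.

```python
from collections import Counter

def propagate_labels(retweeted, seeds, max_rounds=10):
    """retweeted: {user: set of accounts the user retweeted}; seeds: {user: label}."""
    labels = dict(seeds)
    for _ in range(max_rounds):
        new = {}
        for user, targets in retweeted.items():
            if user in labels or not targets:
                continue
            counts = Counter(labels[t] for t in targets if t in labels)
            if not counts:
                continue
            label, votes = counts.most_common(1)[0]
            # Assumption: a strict majority of all retweeted accounts is required.
            if 2 * votes > len(targets):
                new[user] = label
        if not new:          # nothing left to assign: stop early
            break
        labels.update(new)
    return labels

seeds = {"a": "blue", "b": "blue", "c": "red"}
retweeted = {
    "x": {"a", "b", "c"},   # two blue votes out of three -> blue
    "y": {"x"},             # labelled one round after x
    "z": {"a", "c"},        # tie -> stays unpolarized
    "w": {"z"},             # depends on an unpolarized account
}
labels = propagate_labels(retweeted, seeds)
```

The example shows the two behaviours described above: labels spread over successive rounds (y is tagged only after x), and the procedure stops assigning when the neighbours of a node do not agree (z, and consequently w).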

Supplementary Figure 1 illustrates the density of all users (left panel) and bots (right panel) in the three biggest communities, after the polarization by contagion procedure. Interestingly enough, assigning a polarization to bots turns out to be much harder than for genuine users. If we neglect the contribution of the grey bars, we can notice that the relative fractions of the different communities are more or less similar, but for a slight increase of the abundance of ‘eggplant purple’ bots.

### The backbone of the content exchange on Twitter

In the analysis of a complex system, one of the main issues is to skim relevant information from noise. Of course, the definition of noise itself depends on the system. In the previous section, we obtained the political affiliation of verified users by projecting the information in the bipartite network describing the interactions between verified and unverified users. Now, we will apply the entropy-based null model, in its directed variant—Bipartite Directed Configuration Model (BiDCM)—proposed by van Lidth de Jeude et al.48, to filter the total exchange of content in our dataset, after discounting the information regarding the activity of users and the virality of messages, as in ref. 32. We sketch the procedure in Fig. 4.

We start by considering the directed bipartite network of users (on one layer) and tweets (on the other layer): an arrow from user u to tweet t indicates that u wrote t. Analogously, an arrow from t to u represents u retweeting t. Thus, the bipartite directed network is intended to describe the retweeting activity of users, considering the information about the virality of the messages. We then construct the BiDCM. In the present case, the constraints describe the node activities, i.e., the number of original tweets posted by every user, the number of retweets of every message and the number of retweets made by every account. As for the case of accounts’ polarization, we project the information contained in the directed bipartite network on one of the two layers, in order to obtain a monopartite directed network of accounts. The resulting network represents the significant flow of information among users. In more detail, for each (ordered) pair of users $$(u,u^{\prime} )$$ we consider the number of tweets written by u that are retweeted by $$u^{\prime}$$. Subsequently, we assess the statistical significance of this retweeting activity by comparing the real value in the network under investigation with the theoretical distribution of the BiDCM. In other words, if the number of tweets written by u and retweeted by $$u^{\prime}$$ is greater than expected and is not compatible with the theoretical distribution, a link from u to $$u^{\prime}$$ is projected. Thus, by comparing the real system with the null model, we can highlight all the contributions that cannot be explained by the fixed constraints alone. Technical details can be found in the “Methods” section and in the Supplementary Notes 5 and 6.
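The quantities entering this step can be sketched as follows: the three families of constraints and, for every ordered pair of users, the number of tweets written by the first and retweeted by the second. The statistical test against the BiDCM is not reproduced here. The input format (a `{tweet_id: author}` mapping and a list of `(retweeter, tweet_id)` pairs) and the function names are assumptions made for illustration.

```python
from collections import Counter

def constraints(authors, retweets):
    """Return the three degree sequences that the null model is said to constrain."""
    pairs = set(retweets)                                 # binary links: count each once
    tweets_per_user = Counter(authors.values())           # original tweets by each user
    retweets_per_tweet = Counter(t for _, t in pairs)     # retweets of each message
    retweets_per_user = Counter(u for u, _ in pairs)      # retweets made by each account
    return tweets_per_user, retweets_per_tweet, retweets_per_user

def coretweet_counts(authors, retweets):
    """Count, for each ordered pair (author, retweeter), the tweets linking them."""
    counts = Counter()
    for retweeter, tweet in set(retweets):
        counts[(authors[tweet], retweeter)] += 1
    return counts

authors = {"t1": "u", "t2": "u", "t3": "w"}
retweets = [("v", "t1"), ("v", "t2"), ("v", "t1"), ("u", "t3"), ("u", "t1")]
per_user, per_tweet, made = constraints(authors, retweets)
counts = coretweet_counts(authors, retweets)
```

In this toy input, user v retweeted two distinct tweets by u, so the pair (u, v) has count 2, while the pair (u, u) is a loop: a user retweeting their own tweet. Each count would then be compared with its distribution under the null model to decide whether the corresponding link is projected.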

Summing up, the filtering procedure returns a directed network in which the arrows go from the authors to the retweeters, and it reduces the number of nodes to 14,883 users and the number of links to 34,302. The connectance, i.e., the density of links, of the network is $$\rho \simeq 3\times {10}^{-5}$$. This network is hereafter referred to as the directed validated network or, simply, the validated network. Figure 5 shows the structure of the validated network in terms of communities.

On top of this, we have analyzed the presence of automated accounts by using the bot detection method described in the “Methods” section. The incidence of bots in the validated network is about 2.5%, against almost 7% of nodes in the original network. The number of loops, i.e., users that retweet their own tweets (significantly with respect to their activity), is around 1.2% of the total number of links of the validated network, thus relatively high. This effect is also reflected in the number of validated nodes that significantly retweet themselves (slightly less than 3%). For the subsequent analyses, we discard the contribution of loops, since we are interested in analysing the source of the shared contents on Twitter.

Hubs and bots: as mentioned in the previous section, the validated links go from the authors to the retweeters. The effectiveness of an author can be derived from its ability to reach a high number of relevant nodes: this principle is neatly implemented in the Hubs-Authorities algorithm, originally introduced by Kleinberg in ref. 49 to rate web pages. In the original version, the paradigm assigns two scores to each web page: its authority, which estimates the value of the content of the page, and its hub value, which estimates the value of its links to other pages. In the scenario currently under investigation, hubs and authorities are Twitter accounts. The authorities act as sinks of the content exchange. In the following, we will focus on hubs, because they represent the driving force of the discussion and are relatively popular users; even if they are not verified by Twitter, we often have reliable information about their accounts.
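For readers unfamiliar with the Hubs-Authorities paradigm, the mutual-reinforcement iteration can be sketched in pure Python. This is a textbook power iteration written for illustration, not the implementation used for Table 1; the fixed number of iterations and the Euclidean normalization are choices made here.

```python
import math

def hits(edges, iterations=100):
    """edges: list of (author, retweeter) pairs. Returns (hub, authority) score dicts."""
    nodes = sorted({n for e in edges for n in e})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of a node: sum of the hub scores of the nodes pointing to it.
        auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        # Hub score of a node: sum of the authority scores of the nodes it points to.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += auth[dst]
        norm_h = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        norm_a = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        hub = {n: v / norm_h for n, v in new_hub.items()}
        auth = {n: v / norm_a for n, v in auth.items()}
    return hub, auth

# Authors A, B, C and their (validated) retweeters r1..r4.
edges = [("A", "r1"), ("A", "r2"), ("A", "r3"), ("B", "r1"), ("C", "r4")]
hub, auth = hits(edges)
```

In this toy network, A obtains the highest hub score; B ranks second because its only retweeter also retweets A, whereas C, whose single retweeter is disconnected from the rest, ends up with a negligible score. The hub score therefore rewards reaching retweeters that are themselves central in the exchange, not merely having many of them.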

Table 1 shows the values for the top 20 nodes, in terms of hub scores. The first account is the one of Mr. Matteo Salvini. The second and the third ones refer to two journalists of a news website supported by Casa Pound, a neo-fascist Italian party. The fourth is the account of Ms. Giorgia Meloni, former ally, during the 2018 Italian electoral campaign, of Lega, the party of Mr. Salvini. We remind the reader that Mr. Salvini and Ms. Meloni are the leaders of the two major conservative Italian parties. Their account names have not been anonymized in Table 1 since they are verified accounts. The two leaders have similar opinions on how to deal with the Mediterranean migration. The fifth and sixth accounts are, respectively, a journalist of ‘Il Fatto Quotidiano’ (a newspaper close to M5S) and an unverified user with opinions in line with those of the two above-mentioned politicians. Notably, all the accounts in Table 1 belong to the blue community. The first account with a different membership (‘TgLa7’, a popular newscast by a private TV channel, whose account belongs to the purple community) ranks 176th in the hub score ranking.

The case of the Italian chapter of an NGO assisting migrants in the Mediterranean Sea is worthy of note: while it has the fifth highest value of out-degree ($${k}_{{\rm{out}}}=1104$$), it has an extremely low hub score ($$4\times {10}^{-4}$$), ranking 452nd. This is striking since, on several occasions, the Italian government (in the person of Salvini, the Minister of Internal Affairs) and the NGO have been opponents on the issue of disembarking the migrants rescued during the NGO activities.

Remarkably, we observe a non-zero overlap among the bots in the lists of the validated followers of human-operated users. To the best of our knowledge, this is the first time that such a phenomenon is detected. In our opinion, the use of bot squads, retweeting the messages of two or more strong hubs, aims at increasing the visibility of their tweets. We have detected two main groups of such accounts, the others being composed of at most two common bots. The first one includes 22 genuine accounts (9 of which are in the top 10 hubs), sharing 22 bots. In this set, some users share a relatively high fraction of bots; there is one right-wing account that shares all its automated followers with both Meloni and Salvini, see Fig. 6. In Figs. 7 and 8, we represent two subgraphs of the validated network in Fig. 5. The subgraph in Fig. 7 shows the first group of genuine accounts sharing bots and all their bot followers. Such accounts belong almost exclusively to the blue community. The hub scores, represented as the dimensions of the nodes, are nearly homogeneous among the hubs. This does not happen in the subgraph referring to the second group (see Fig. 8): besides the presence of a strong hub, the hub score distribution is much more skewed than for the previous group. Moreover, in absolute terms, the hub scores are much smaller than in the previous case, since the strongest hub is the aforementioned account of the ‘TgLa7’ newscast. The accounts in the subgraph of Fig. 8 belong almost exclusively to the purple community.
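The overlap between the bot followers of two hubs can be quantified with a simple set operation. The normalization used below (the fraction of the first account's bots that also follow the second) is an assumption made for illustration; the text does not spell out the normalization behind Fig. 6, and a symmetric measure would be an equally plausible choice. The names `bot_overlap` and `bot_followers` are hypothetical.

```python
def bot_overlap(bot_followers, a, b):
    """Fraction of a's validated bot followers that also follow b (asymmetric)."""
    bots_a, bots_b = bot_followers[a], bot_followers[b]
    if not bots_a:
        return 0.0
    return len(bots_a & bots_b) / len(bots_a)

bot_followers = {
    "hub1": {"b1", "b2", "b3", "b4"},
    "hub2": {"b1", "b2"},
    "hub3": {"b9"},
}
shared_small = bot_overlap(bot_followers, "hub2", "hub1")
shared_large = bot_overlap(bot_followers, "hub1", "hub2")
unrelated = bot_overlap(bot_followers, "hub3", "hub1")
```

With this asymmetric choice, an account with few bots can share all of them with a larger hub (as hub2 does with hub1) while the larger hub shares only part of its own, which is the kind of situation described above for the account sharing all its automated followers with the two leaders.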

Figure 9a shows that the main activity of the bots in the first bot squad is retweeting. As expected, they mostly retweet the human-operated accounts connected to them (Fig. 9b). The same cannot be said for mentions, which may be used either to provoke or to involve the target. Accounts from different political sides are mentioned by bot squads; in fact, the bot accounts with more than 30 mentions point to members of the blue community as well as to the official account of the Democratic Party (‘pdnetwork’). It is worth noting that other ‘not-sided’ verified accounts, such as the one of the President of the Republic (‘Quirinale’) and the one of the President of the Chamber of Deputies (‘Roberto_Fico’), are mentioned there and that, in most cases, the messages containing those mentions are a sort of invitation, addressed to the institutional figures, to manage immigrants and migration (Fig. 9c).

The most striking outcome of our content analysis, however, concerns the sources cited by the bots in the blue squads: 89% of their original tweets (i.e., not replies, retweets or quoted tweets) contain a URL, and 97% of those URLs refer to www.voxnews.info, a website blacklisted as a source of political disinformation by two popular fact-checking websites, namely www.butac.it and www.bufale.net. Additional details about our study on the bot squads can be found in the Supplementary Note 7.

## Discussion

The 2018 Eurobarometer report on news consumption presents a clear increasing trend in the popularity of online sources with respect to traditional ones50. Despite this widespread favour, online media are not trusted as much as their offline counterparts: in a survey conducted in autumn 2017, 59% of respondents said they trusted radio content, while only 20% said they trusted information available on online social networks. Beyond the perception of common users, the presence of fake content has indeed been revealed in several research works, both at the level of news per se, as done by, e.g., Quattrociocchi et al. in ref. 39 and Shao et al. in ref. 8, and at the level of fake accounts contributing to spreading it (see, for example, the overview on the rise of social bots by Ferrara et al.51).

Twitter is one of the most studied social media, due to the openness of its data through the available public APIs. Also, it is heavily used by professionals for news distribution: a 2017 survey by AGCOM, the Italian guaranteeing agency for communications52, showed that Italian journalists appear on Twitter much more frequently than common users. Therefore, Twitter has been used for many analyses of communication in political propaganda, see, e.g., refs. 4,8,9,10,32,38,53,54,55,56,57,58,59,60. Obviously, a major issue when performing this kind of analysis is the reliability of the results, which is closely connected to the reliability of the users involved: in this sense, a rich stream of research is devoted to finding means for detecting automated accounts (even anticipating their future evolution, as done by Cresci et al. via genetic algorithms in ref. 61) and their interactions with human-operated accounts9,10,62.

Remarkably, the previous analyses rarely tackle the effect of random noise, which is indeed of utmost importance when studying complex systems. In ref. 63, Jaynes showed how Statistical Physics could be derived from Information Theory through an entropy maximization principle. Following Jaynes’ work, in recent years the same approach has been extended to complex networks22,23,24,25,26, to provide an unbiased benchmark for the analysis by filtering out random noise. Such a framework proved to be extremely flexible and adaptable to the analysis of different phenomena, in trade networks24,28,29,64,65,66, financial networks27,30,31 or online social networks32,67. In the present study, we jointly use a bot detection technique and an entropy-based null-model for the analysis of the content exchange on Twitter in the Italian discussions on migrant flows from Northern Africa. The analysed corpus has been extremely useful for highlighting the mechanisms used for disseminating information in political debates.

To get the political affiliation of users, we focused on the bipartite network in which the two layers represent verified and unverified users, respectively, and the (undirected) links correspond to the retweeting interactions between the two classes. The main idea is to infer the inclination of users towards a political point of view from (a proxy of) their contacts: users who share a large number of followers and followees probably have similar opinions. The bipartite network is then projected on the layer of verified users using an entropy-based null-model, by following the procedure introduced by Saracco et al. in ref. 29.

Verified users have been clustered into three main groups, see Fig. 2: one group includes Italian government representatives, the right wing and the Movimento 5 Stelle party; a second group includes the Italian Democratic Party; a third one includes NGOs, online and offline media, journalists and some VIPs (like actors, singers, movie directors). Confirming results presented in other studies38,39,68,69, the polarization of unverified users is particularly strong: they interact almost exclusively with accounts of a single community, see Fig. 3.

Starting from verified users, we use a label propagation algorithm to iteratively assign a group membership to unverified users, by considering the political inclination of the majority of all their followers and followees. This procedure reduces the number of unpolarized accounts by more than 35%. Oddly, the fraction of bot accounts that remain unpolarized after the ‘political contagion’ is higher than the analogous fraction for all users, see Supplementary Fig. 1. However, we have seen that users, be they human-operated or bots, who take part in a significant and effective way in the discussion, are mostly polarized.

Finally, we extract the non-trivial content exchange by adopting the validated projection developed by Becatti et al. in ref. 32, in order to detect the significant flow of messages among users while discounting the virality of messages, the retweeting activity of users and their productivity in writing tweets.

The network represented in Fig. 5 is extremely informative for different reasons. The validated network contains only 14,883 validated users out of the 127,275 users in the dataset. This highlights the fact that just a minority of all users effectively contributes to the online propaganda on the migration flow. Interestingly, the incidence of bots in the validated network is almost one third of the analogous measure on the entire dataset, signaling that the bots whose retweets are not compatible with a random activity are a minority. Since one of the targets of a social bot is to increase the audience of the online content of a specific (group of) user(s), such a reduction shows that, in our scenario, the number of bots significantly affecting the political discussion is limited.

The accounts in the validated network are much more polarized than the whole set of users in the original network, see Fig. 10. In fact, in the original network, the overall fraction of unpolarized accounts represents more than 40% of all the accounts and more than 50% of all the automated ones. Instead, when considering the validated network, the same ratio is around 10% for the former and around 5% for the latter. Otherwise stated, the polarized bots pass the validation process more easily than their unpolarized counterparts and their contribution in spreading messages is more significant.

All the accounts that are most effective in delivering their messages (i.e., the hubs, following the paradigm defined by Kleinberg in ref. 49) refer to the blue area in Fig. 5, where we can find representatives of the Italian government in office at the time of data collection, and the right wing. The first account referring to a community different from the blue one is the official account of the newscast ‘TgLa7’, ranked 176th in the hub ranking.

Regarding the contribution of bots to the visibility of the various accounts, the fraction of bots that significantly retweet the content of two right wing political leaders (Mr. Salvini and Ms. Meloni) is greater than the incidence of bots in the whole validated network. Interestingly enough, other hubs show a smaller presence of bots among their followers, even if their hub score is not that different from the two political leaders.

Finally, we found that some hubs do share their bots: Fig. 6 describes the normalized overlap between the lists of bots of each pair of users in the list of the top 20 hubs. As mentioned before, those accounts are from the right-wing political area. To the best of our knowledge, this is the first time that such a behaviour is reported: in analyses tackling the same problem, i.e., the interactions among human-operated accounts and bots9,10,12, only star-like sub-graphs were observed, with a large number of bots among the followers of a (presumably) human-operated user. We have found bunches of bots attached to bunches of hubs and, although we cannot make any claim about the mind and the strategy behind this organization, we find the result noteworthy and consider it interesting to look for the phenomenon in other datasets.

We underline that the considered shared bots are particularly effective, since they are validated by the entropy-based projection. Actually, the group of right wing bots, each supporting more than one human-operated account, is not the only one in the set, but it is the largest: if we consider the subgraphs of human-operated accounts sharing their bots, see Figs. 7 and 8, the former has 172 nodes against 58 of the latter. Moreover, the first subgraph is by far more efficient; indeed, in the second one the greatest hub score ranks only 176th.

It is well known that bots aim at increasing the popularity of users by retweeting their messages (see, e.g., the work by Cresci et al.13, which reveals how bots retweet celebrities’ accounts in a coordinated fashion). The projection procedures followed in this paper reveal such coordinated activity in a robust way. In fact, we argue that the emergence of statistically significant communication patterns could hardly be hidden by an attacker, because the latter would have to employ more automated accounts to ‘hide’ their activities within the expectations of the probability distribution obtained by the BiDCM.

To the best of our knowledge, our study is the first investigation that merges bot detection and entropy-based analysis of Twitter traffic. Moreover, the obtained results are in line with the previous work by Shao et al.8, where the authors showed how bots massively support the spread of (low credibility) content. At the same time, the present investigation contributes in a different way, not being specifically focused on fake news, whereas ref. 8 concentrates on the way fake news becomes viral. Interestingly enough, among the many studies about the 2016 US presidential election, Grinberg et al.11 analyzed the proliferation of fake news on Twitter and identified both fake news spreaders and exposed users. The role of bots in effectively conveying a message (highlighted here for the first time even in a ‘shared fashion’) and the spreading of fake news in online discussions of great importance, e.g., about elections and news reports (see refs. 8,11), call for future studies, which include a deeper analysis of the exchanged messages.

## Methods

### Data collection and processing

Our study is based on a large corpus of Twitter data, generated by collecting tweets about migrations, and focusing on the case of migrant flows from Northern Africa to Italy. For data collection, we developed a crawler based on the Twitter public Filter API, which provides real-time tweet delivery, filtered according to specified keywords. We selected a set of keywords compatible with recent chronicles. Table 2 lists the selected keywords. The filtering procedure was not case-sensitive. The keywords have been selected because they are commonly used in Italy when talking and writing about immigration flows from Northern Africa to the Italian coasts, including the dispute about the holder of jurisdiction for handling emergencies, involving European countries and NGOs (https://en.wikipedia.org/wiki/African_immigration_to_Europe).

We collected 1,082,029 tweets, posted by 127,275 unique account IDs, over a period of one month (from 23 January 2019 to 22 February 2019). By relying on the bot detection classifier developed by Cresci et al. in ref. 14 and recapped in the following section, all the accounts have been classified either as human-operated or as bots. This classification led to 117,879 genuine accounts and 9,396 social bots. All the collected tweets were stored in Elasticsearch (https://www.elastic.co) for fast and efficient retrieval.

It may be worth noting that the period over which the data have been collected was characterized by a lively political debate in Italy about the landing of one ship operated by NGOs rescuing migrants fleeing from North Africa to Italy. On 16 August 2018, the Italian coastguard boat ‘Diciotti’ rescued almost 200 migrants off Lampedusa island and initially received a veto to land from the Italian government; it was allowed to do so only after 10 days. Mr. Matteo Salvini, at that time Minister of Internal Affairs, was afterwards investigated for kidnapping and abuse of office; the case was stopped on 19 February 2019, when the Italian Senate did not grant judges the possibility to prosecute him. Right before and after the Senate’s decision there was an intense debate on social networks about migrants and NGOs, and about the role of Italian Government and of the European Union.

### Bot detection classifier

To assess the nature of the accounts in the dataset about migration from Northern Africa, we rely on a slightly modified version of the supervised classification model proposed in ref. 14.

The bot detector uses all the features of an account profile, and has been developed after testing known detection methodologies on a baseline dataset of fake and genuine accounts. The former were bought from three online markets, while the latter were certified as genuine by tech-savvy social media analysts. The authors tested Twitter accounts in the reference set against algorithms based on: (i) single classification rules proposed by Media and bloggers, and (ii) feature sets proposed in the literature for detecting spammers. The results of such preliminary analysis suggested that fake accounts detection needs specialized mechanisms. They classified rules and features according to the cost required for gathering the data needed to compute them and showed how the best performing features are also the most costly ones. Then, building on the cost of crawling analysis, they implemented a series of lightweight classifiers using less costly features, while still being able to correctly classify more than 95% of the accounts of the baseline dataset. They also validated performances of the classifiers over two other sets of human and fake accounts, disjoint from the original training dataset.

For the present paper, we reconstruct the model of the classifier of Cresci et al.14 and test its performances with J48, the Weka (https://www.cs.waikato.ac.nz/ml/weka/) implementation of C4.5 algorithm, on the same training set, publicly available at http://mib.projects.iit.cnr.it/dataset.html, obtaining the same classification performance results. The used features are listed in Table 3.

As a final note, it seems fair to underline how the war against automated, malicious accounts is constantly in progress. In fact, social bot developers have become so smart that their creations are very similar to genuine accounts. An interesting line of research is that of ‘adversarial social bot detection’, where, instead of taking countermeasures only after having collected evidence of new bot mischief, detection techniques are proactive and able to anticipate attacks and next generations of bots. The first seeds for this research were planted by Yang et al. back in 2011–13, who provided the first evidence of social bot evolution70. While the first wave of social bots, populating OSNs until around 2011, was rather simplistic, the second wave featured characteristics that were quite advanced for the time. Differently from the previous ones, the social bots studied by Yang et al. were used to purchase or exchange followers among each other, in order to look more popular and credible. Bot evolution thus leads us to 2016, when Ferrara et al. documented a third generation of social bots71. Needless to say, Yang’s classifier was no longer successful at detecting this third wave of social bots, as experimentally demonstrated in ref. 13 by Cresci et al.

After Yang et al.’s first adversarial work, many years passed before further studies were carried out. Indeed, only recently Cresci et al.61 and Grimme et al.72 proposed new adversarial studies in social bot detection. The continuous research around the theme of social bot detection leads to more and more accurate techniques, with increasingly reduced errors in classification.

### Validated projection of the bipartite network and users polarization

Because of the official certification released by Twitter about the authenticity of an account, users can be divided into two sets, the verified and unverified ones. Becatti et al., in ref. 32, used this feature to infer the accounts’ inclination towards a specific political area, directly from data. This has been possible by implementing the method of Saracco et al. proposed in ref. 66. The underlying idea is that unverified users follow and interact with verified users sharing their political ideals. In this sense, if two verified users have a high number of common followers and followees, they probably have a similar political affiliation. The a posteriori analysis of the results of the validated projection confirms the previous hypothesis. Due to the Twitter verification procedure, only the information provided by verified users is fact-checkable, thus our check is restricted to this class of users.

We have to pay attention to the contribution of remarkably active users. For example, if a verified user is extremely engaged in the political propaganda, such user may interact with a huge number of unverified ones and may thus share a great amount of contacts with almost all the other verified users, even those with an opposite political inclination. In this case, the contribution should be considered spurious, being just due to the popularity of the user. Analogously, the role of an unverified user that retweets all messages from her/his contacts should be discounted.

We obtain the political affiliation of the accounts by considering the undirected bipartite network of interactions (i.e., retweets) between verified and unverified users, aggregated over the whole period: we disregard the information about the direction of the retweets, since we are just interested in groups of users sharing content. The previous intuition leads us to compare the overlap of connections (literally, the number of common followers and followees) in the real network with the expectations of a null-model able to account for the degree sequence of both layers. In this way, we are able to discount the random noise due to the activity of users and extract the statistically significant information from the data. The entropy-based BiCM (ref. 66) provides the correct benchmark for this analysis. While we describe the theoretical construction more extensively in the Supplementary Notes 2 and 3, here we outline the main intuitions behind the Bipartite Configuration Model and its monopartite validated projection.
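To make the objects of this comparison concrete, the following minimal sketch builds a binary biadjacency matrix from a list of interactions and counts, for every pair of nodes on the first layer, the neighbours they have in common on the other layer. The function names and the toy account labels (`v1`, `u1`, ...) are hypothetical and chosen only for illustration; this is not the code used for the analysis.

```python
from itertools import combinations


def biadjacency(edges):
    """Build a binary biadjacency matrix from (verified, unverified) pairs.

    Direction and multiplicity of the interactions are discarded:
    repeated pairs collapse into a single undirected link.
    """
    rows = sorted({v for v, _ in edges})
    cols = sorted({u for _, u in edges})
    row_index = {v: i for i, v in enumerate(rows)}
    col_index = {u: a for a, u in enumerate(cols)}
    M = [[0] * len(cols) for _ in rows]
    for v, u in edges:
        M[row_index[v]][col_index[u]] = 1
    return rows, cols, M


def observed_overlaps(M):
    """Number of common neighbours for every pair of row nodes."""
    return {
        (i, j): sum(a * b for a, b in zip(M[i], M[j]))
        for i, j in combinations(range(len(M)), 2)
    }


# Toy, made-up interactions (the last pair is a deliberate repetition).
edges = [("v1", "u1"), ("v1", "u2"), ("v2", "u1"), ("v2", "u2"),
         ("v2", "u3"), ("v3", "u3"), ("v1", "u1")]
rows, cols, M = biadjacency(edges)
row_degrees = [sum(r) for r in M]        # degrees of the verified layer
col_degrees = [sum(c) for c in zip(*M)]  # degrees of the unverified layer
overlaps = observed_overlaps(M)
```

The two degree sequences are the only information the null model is asked to preserve, while the overlaps are the quantities whose statistical significance is assessed.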

The Bipartite Configuration Model: let us start from a real bipartite network and call the two layers L and Γ, with dimensions $${N}_{L}$$ and $${N}_{\Gamma }$$ respectively; we label the nodes on those layers with Latin and Greek indices, respectively. We represent the connections via the biadjacency matrix, i.e., the rectangular ($${N}_{L}\times {N}_{\Gamma }$$) matrix M whose generic entry $${m}_{i\alpha }$$ is 1 if there is a link connecting node i ∈ L and node α ∈ Γ, and 0 otherwise. We then consider the ensemble $${{\mathcal{G}}}_{\text{Bi}}$$ of all possible graphs with the same number of nodes on the two layers as in the real network. If we assign a (formal) probability to each graph, we can maximize the (Shannon) entropy,

$$S=-\sum _{{G}_{\text{Bi}}\in {{\mathcal{G}}}_{\text{Bi}}}P({G}_{\text{Bi}}){\mathrm{ln}}\,P({G}_{\text{Bi}}),$$

constraining the average value of some quantities of interest on the entire ensemble. If, as it is the case of the present article, we impose the ensemble to have fixed average for the number of links per node (i.e., the degree), the probability per graph factorizes in independent probabilities per link:

$${p}_{i\alpha }=\frac{{x}_{i}{y}_{\alpha }}{1+{x}_{i}{y}_{\alpha }},\tag{2}$$

where $${p}_{i\alpha }$$ is the probability of finding a link between i and α, and $${x}_{i}$$ and $${y}_{\alpha }$$ (the fitnesses, as defined by Caldarelli et al. in ref. 73) are quantities that encode the propensity of the nodes to form links22. At this level, the previous definition is formal, since we have only required the ensemble average of the degree sequence to be fixed, without specifying its value. It can be shown (see Supplementary Notes 2 and 3) that maximizing the likelihood of the real network is equivalent to fixing the average of the degree sequence to the one measured on the real network (proofs by Garlaschelli and Squartini, see refs. 23,24).
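In practice, the likelihood condition amounts to solving, for the fitnesses, the system in which the expected degree of each node (the sum of its link probabilities from Eq. (2)) equals its observed degree. The sketch below solves this system with a simple fixed-point iteration on a toy degree sequence. The solver choice, the function names and the toy degrees are our assumptions for illustration, not the procedure described in the Supplementary Notes; the sketch also assumes that no node has zero degree or is connected to the whole opposite layer.

```python
def link_probabilities(x, y):
    """BiCM link probabilities p_ia = x_i * y_a / (1 + x_i * y_a)."""
    return [[a * b / (1.0 + a * b) for b in y] for a in x]


def fit_bicm(row_deg, col_deg, tol=1e-10, max_iter=10_000):
    """Find fitnesses whose expected degrees match the observed ones.

    Fixed-point iteration (an illustrative choice of solver):
        x_i <- k_i / sum_a [ y_a / (1 + x_i * y_a) ]
        y_a <- h_a / sum_i [ x_i / (1 + x_i * y_a) ]
    """
    x = [1.0] * len(row_deg)
    y = [1.0] * len(col_deg)
    for _ in range(max_iter):
        x = [k / sum(b / (1.0 + a * b) for b in y)
             for k, a in zip(row_deg, x)]
        y = [h / sum(a / (1.0 + a * b) for a in x)
             for h, b in zip(col_deg, y)]
        p = link_probabilities(x, y)
        # Largest mismatch between expected and observed degrees.
        err = max(
            max(abs(sum(r) - k) for r, k in zip(p, row_deg)),
            max(abs(sum(c) - h) for c, h in zip(zip(*p), col_deg)),
        )
        if err < tol:
            break
    return x, y


# Toy degree sequences (made up); both sum to the same number of links.
row_deg = [2, 1, 1]
col_deg = [1, 2, 1]
x, y = fit_bicm(row_deg, col_deg)
p = link_probabilities(x, y)
expected_row_deg = [sum(r) for r in p]
expected_col_deg = [sum(c) for c in zip(*p)]
```

Only the products of the fitnesses enter Eq. (2), so the individual values of x and y are defined up to a common rescaling; the link probabilities are not affected by it.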

Monopartite validated projection: we can now highlight all those contributions that cannot be related to the degree sequence only, comparing the real network with the expectations of the BiCM. Following this line, Saracco et al.66 proposed a validated projection on top of the BiCM. The main idea is to consider the number of common links of two nodes on the same layer and compare it with the theoretical distribution of the BiCM: if the real system shows a commonality of links that cannot be explained only by the activity of the users, then we project a link between the two nodes under analysis. Using the formalism of Saracco et al.65, we call this overlap a V-motif.

In formulas, by using the independence of the probabilities per link in Eq. (2), the probability that both nodes i and j link to the same node α is simply

$$p({V}_{\alpha }^{ij})={p}_{i\alpha }{p}_{j\alpha },$$

where $${V}_{\alpha }^{ij}$$ is the above mentioned V-motif among i, j and α. The total overlap between i and j is simply $${V}^{ij}={\sum }_{\alpha }{V}_{\alpha }^{ij}$$ and, according to the BiCM, is distributed as a Poisson-binomial, i.e., the extension of a binomial distribution in which all the events have a different probability (see Hong74). We can further associate p-values to the observed V-motifs, i.e., the probabilities of finding a number of V-motifs greater than or equal to the one measured on the real network. In order to state the statistical significance of several p-values at the same time, we relied on a multiple hypothesis testing procedure. The false discovery rate (FDR) is generally considered the most effective one, since it controls the expected fraction of false positives among the validated links without being too conservative (see Benjamini and Hochberg75). The result of the projection is a binary undirected monopartite network of nodes from the same layer, which are linked if their similarity cannot be explained only by their degree. We then apply the Louvain community detection algorithm (by Blondel et al.44) to the projected network. Since this method is known to be order dependent, as shown by Fortunato45, we apply it several times after reshuffling the node order and keep the partition with the maximum value of the modularity, i.e., the algorithm's objective function.
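The validation step can be sketched as follows: the Poisson-binomial distribution of the overlap is built by dynamic programming from the per-node probabilities of the V-motifs, the p-value is its upper tail, and the Benjamini-Hochberg procedure turns a set of p-values into an effective threshold. Function names and all numerical values are hypothetical and only illustrate the mechanics; for large layers one would use a faster evaluation of the Poisson-binomial than this quadratic recursion.

```python
def v_motif_probs(p_i, p_j):
    """Per-alpha probability that i and j both link to alpha: p_ia * p_ja."""
    return [a * b for a, b in zip(p_i, p_j)]


def poisson_binomial_pmf(probs):
    """PMF of a sum of independent Bernoulli variables (dynamic programming)."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # event does not occur
            nxt[k + 1] += mass * p       # event occurs
        pmf = nxt
    return pmf


def p_value(observed, probs):
    """Probability of an overlap greater than or equal to the observed one."""
    return sum(poisson_binomial_pmf(probs)[observed:])


def fdr_threshold(p_values, alpha):
    """Benjamini-Hochberg effective threshold k_hat * alpha / m.

    k_hat is the largest rank k such that the k-th smallest p-value is
    at most k * alpha / m; the threshold is 0 if no p-value qualifies.
    """
    m = len(p_values)
    k_hat = 0
    for k, p in enumerate(sorted(p_values), start=1):
        if p <= k * alpha / m:
            k_hat = k
    return k_hat * alpha / m


# Made-up link probabilities of two nodes i and j towards three nodes alpha.
probs = v_motif_probs([0.9, 0.5, 0.1], [0.8, 0.5, 0.2])
pv = p_value(3, probs)  # all three V-motifs observed

# Made-up p-values for five pairs of nodes.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
threshold = fdr_threshold(p_values, alpha=0.05)
validated = [p for p in p_values if p <= threshold]
```

A pair of nodes is linked in the projection only if its p-value does not exceed the effective threshold; in the toy example two of the five pairs pass.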

The interested reader can find in the Supplementary Note 8 an alternative approach to assign a political inclination to users and a comparison with the approach followed in this paper (see also Supplementary Fig. 2).

### Extraction of the backbone of tweeting activity

As done when evaluating the statistical significance of the common links of two nodes on the same layer, also when studying content exchange we are interested in the flow of information that cannot be explained by users’ activity only. Differently from other studies4,9,10,53, we take into account the virality of tweets. Methodologically, the approach is similar to the one adopted for the extraction of the users’ political affiliation. The difference consists in (1) substituting the BiCM with its directed analogue, the BiDCM (proposed by van Lidth de Jeude et al.48) and in (2) considering layers of a different kind. While in the previous section layers represent verified and unverified users, here they represent tweets on one layer and users (both verified and unverified) on the other. The validated projection procedure returns a directed monopartite network of significant exchange of messages, in which the arrow goes from the message author to the retweeters. As mentioned in the Results section, the connectance of the network is $$\rho \simeq 3\times {10}^{-5}$$. The effective FDR-threshold for p-values is $${\text{FDR}}_{\text{th}}\simeq 3.0\times {10}^{-7}$$ for α = 0.01; the analogous Bonferroni effective threshold is much stricter, $${\text{Bonferroni}}_{\text{th}}\simeq 8.8\times {10}^{-12}$$. Additional details about the BiDCM procedure can be found in the Supplementary Notes 5 and 6.
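As a rough consistency check on these figures, one can relate them under assumptions that are ours and are not stated in the text: that the Bonferroni threshold equals α/N for N tested pairs, that the effective FDR threshold equals $$\hat{k}\alpha /N$$ with $$\hat{k}$$ the number of validated links, and that the connectance is measured over the same N candidate pairs, i.e., $$\rho =\hat{k}/N$$. Under these assumptions,

$${\text{FDR}}_{\text{th}}=\frac{\hat{k}\,\alpha }{N}=\alpha \rho \simeq 0.01\times 3\times {10}^{-5}=3\times {10}^{-7},\qquad N\simeq \frac{\alpha }{{\text{Bonferroni}}_{\text{th}}}=\frac{0.01}{8.8\times {10}^{-12}}\simeq 1.1\times {10}^{9},$$

which agrees with the reported thresholds to the quoted precision. The value of N obtained here is only an order-of-magnitude estimate inferred from the quoted numbers.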