Neutral bots probe political bias on social media

Social media platforms attempting to curb abuse and misinformation have been accused of political bias. We deploy neutral social bots who start following different news sources on Twitter, and track them to probe distinct biases emerging from platform mechanisms versus user interactions. We find no strong or consistent evidence of political bias in the news feed. Despite this, the news and information to which U.S. Twitter users are exposed depend strongly on the political leaning of their early connections. The interactions of conservative accounts are skewed toward the right, whereas liberal accounts are exposed to moderate content shifting their experience toward the political center. Partisan accounts, especially conservative ones, tend to receive more followers and follow more automated accounts. Conservative accounts also find themselves in denser communities and are exposed to more low-credibility content.

media-bias/media-bias-ratings). Note that The Wall Street Journal is categorized as both center and center-right by the list. We assign it to the center-right category. The selected accounts (in bold) are among the most popular news sources on Twitter.
We also wish to verify that the seed accounts are popular among active Twitter users who are politically aligned with those sources. To this end, we started with a 10% random sample of public tweets on August 1, 2019, comprising about 36M tweets from 14M unique accounts. We sampled 500k of these accounts; for each of them, we calculated the bot score using the BotometerLite tool (Yang et al. 2020). We removed likely bots (those with bot score above 0.5) as well as non-English accounts, leaving 151,570 accounts. We extracted tweets by those accounts from the 10% random sample during one week around August 1, 2019. We used the links shared in those tweets to assign a political score to each of those accounts (see Methods). We filtered out accounts for which we could not assign a score. That left 26,304 accounts with a political score. We grouped these accounts into five political bins using thresholds -1, -0.5, -0.1, +0.1, +0.5, and +1, yielding groups of 712, 5,280, 11,422, 8,237, and 653 accounts in the left, center-left, center, center-right, and right bins, respectively. By examining the friends of these accounts, we eliminated those who did not follow any of the news sources in the AllSides list. This left 4,884 accounts: 187, 1,362, 1,623, 1,444, and 268 in the five groups, respectively. Finally, Supplementary Table 1 reports the proportions of accounts in each of these groups who follow the top sources in the AllSides list. This confirms that the seed sources are among the most followed by active Twitter users who are politically aligned with those sources.
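The binning step can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `political_bin` is hypothetical, and the treatment of scores that fall exactly on a threshold is an assumption, since the text does not say which side of each edge is inclusive.

```python
from bisect import bisect_right
from collections import Counter

# Bin edges quoted in the text; labels follow the five groups named there.
EDGES = [-1.0, -0.5, -0.1, 0.1, 0.5, 1.0]
LABELS = ["left", "center-left", "center", "center-right", "right"]


def political_bin(score: float) -> str:
    """Map a political score in [-1, +1] to one of the five bins.

    Assumption: each bin includes its lower edge and excludes its upper
    edge, except the last bin, which also includes +1.
    """
    if not EDGES[0] <= score <= EDGES[-1]:
        raise ValueError(f"score {score} is outside [-1, +1]")
    # bisect_right counts how many interior edges are <= score.
    index = bisect_right(EDGES[1:-1], score)
    return LABELS[index]


# Made-up scores, one per bin, only to show the call pattern.
example_scores = [-0.8, -0.3, 0.0, 0.25, 0.9]
print(Counter(political_bin(s) for s in example_scores))
```

Accounts without a score are assumed to have been filtered out before this step, as described above.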

Drifter actions and probabilities
An action is performed upon a sentence, an existing tweet, or a user. These inputs are selected from sources that are described below. A drifter can perform the following actions:
• Tweet - post a sentence from Random Quotes, Trends, or Home Timeline. For Trends, the sentence is the text of the selected tweet. For Home Timeline, the sentence is obtained by concatenating a short phrase from a manually compiled list (e.g., "Wow!" or "Maybe so.") with the link of the selected tweet. This emulates a quoted tweet.
• Retweet - select a tweet from Trends, Home Timeline, or a list of Tweets Liked by Friends, and retweet it.
• Like - like a tweet selected in the same way as for a Retweet.
• Reply - reply to a tweet from the Mention Timeline. The reply is generated using the ChatterBot library (chatterbot.readthedocs.io). In case of failure, the reply is a random phrase from the precompiled list described above.
• Follow - select a user to follow from the list of Followers, Friends of Friends, users who posted Tweets Liked by Friends, or users who posted tweets in the Home Timeline.
• Unfollow - select a user to unfollow from the latest 200 in the list of Friends.
Input elements for actions are selected from candidate lists that we call sources. The selection is random with uniform probability distribution unless otherwise explained below. Due to limitations of the Twitter APIs, we imitate some basic mechanisms offered by the platform, such as suggestions to follow friends of friends. Sources are defined as follows:
• Random Quotes - sentences obtained from a random quote API (api.quotable.io/random).
• Mention Timeline - the latest 10 tweets in the mention timeline. If the drifter replied to any mentions in the past, this source only considers subsequent tweets.
• Friends of Friends - the model randomly selects three friends of the drifter and requests their latest 5,000 friends, ignoring those that are already friends of the drifter. A new friend is selected from the combined list with probability proportional to the occurrences in the list, to favor friends of multiple friends.
• Friends - most recent 200 friends. The user is selected from this list at random, but older friends are more likely to be unfollowed. We implement this mechanism by ranking friends chronologically; the latest friend has rank one. The unfollow probability is proportional to the rank. The initial friend can never be unfollowed.
• Trends - list obtained by randomly selecting three trending topics in the U.S. and fetching the top five tweets in each topic by the default ranking.
• Tweets Liked by Friends - start from the latest 15 tweets from the home timeline. Select a random subset of at most ten friends who posted these tweets. Select the three latest tweets liked by each of the selected friends, excluding any by the drifter itself. Select one tweet at random from this combined list. Depending on the selected action, the source can return the tweet itself (for Retweet or Like) or its author (for Follow).
• Home Timeline - the latest 15 tweets in the home timeline.
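The two non-uniform selections in the list above (Friends of Friends and Friends) can be sketched as weighted random draws. This is an illustration under assumptions, not the drifter implementation: the function names and argument layout are hypothetical, the friend lists are assumed to be fetched already, and the initial friend is assumed to be excluded after ranks are assigned, which the text does not specify.

```python
import random
from collections import Counter


def pick_friend_of_friend(friend_lists, current_friends, rng=random):
    """Pick an account to follow from the friends of a few sampled friends.

    friend_lists: one list of account IDs per sampled friend (the text
    samples three friends and requests up to 5,000 friends for each).
    current_friends: accounts the drifter already follows; they are skipped.
    """
    counts = Counter(
        account
        for friends in friend_lists
        for account in friends
        if account not in current_friends
    )
    if not counts:
        return None
    candidates = list(counts)
    # Weight by number of occurrences, favoring friends of multiple friends.
    weights = [counts[c] for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


def pick_friend_to_unfollow(friends_newest_first, initial_friend, rng=random):
    """Pick a friend to unfollow with probability proportional to rank.

    friends_newest_first: friends ordered newest to oldest, so the first
    entry has rank 1. Only the latest 200 are considered, and the initial
    friend is never returned.
    """
    recent = friends_newest_first[:200]
    ranked = [
        (rank, friend)
        for rank, friend in enumerate(recent, start=1)
        if friend != initial_friend
    ]
    if not ranked:
        return None
    ranks, candidates = zip(*ranked)
    return rng.choices(candidates, weights=ranks, k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes the draws reproducible, which helps when checking that older friends are indeed unfollowed more often.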
We list the probabilities used in the bot behavior model in Supplementary Table 2. The numbers are inferred from a random sample of Twitter users. If the Follow or Unfollow action is selected, a precondition check is triggered. If the Follow precondition is not met, the Unfollow action is performed, and vice versa; the two checks cannot both fail. A new friend can only be followed if the number of friends is sufficiently small compared to the number of followers: less than the number of followers plus 113. A friend can only be unfollowed if the drifter has at least 50 friends.
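The Follow/Unfollow precondition logic can be sketched as below. The constants 113 and 50 come from the text; the function names are hypothetical. The sketch also makes it easy to see why the two checks cannot both fail: if Follow is blocked, the drifter has at least 113 friends, which is more than the 50 needed to unfollow.

```python
FOLLOW_MARGIN = 113           # follow only if friends < followers + 113
MIN_FRIENDS_TO_UNFOLLOW = 50  # unfollow only if the drifter has >= 50 friends


def can_follow(n_friends: int, n_followers: int) -> bool:
    """Follow precondition: friend count is small relative to followers."""
    return n_friends < n_followers + FOLLOW_MARGIN


def can_unfollow(n_friends: int) -> bool:
    """Unfollow precondition: the drifter has enough friends."""
    return n_friends >= MIN_FRIENDS_TO_UNFOLLOW


def resolve_action(action: str, n_friends: int, n_followers: int) -> str:
    """Swap Follow and Unfollow when the selected one fails its check.

    Any other action is returned unchanged.
    """
    if action == "follow" and not can_follow(n_friends, n_followers):
        return "unfollow"
    if action == "unfollow" and not can_unfollow(n_friends):
        return "follow"
    return action
```

For example, `resolve_action("follow", 200, 5)` returns `"unfollow"`, because 200 is not less than 5 + 113.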

Extraction of hashtags and links
We accessed tweets using the Twitter API. Links (URLs) and hashtags were extracted from entities metadata. Tweets longer than 140 characters are truncated; in these cases, we extracted links and hashtags from the extended_entities metadata, except for 4% of the tweets, for which this retrieval process failed. Many links are compressed using URL-shortening services. We expanded shortened links via HTTP HEAD requests, using a heuristic based on the length of the URL (20 characters or less) to identify them and allowing multiple redirects with a 10-second timeout.
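The link-expansion step can be sketched with the Python standard library. This is an illustration under assumptions rather than the authors' code: the function names are hypothetical, and measuring the URL length without its scheme is an assumption, since the text does not say how the 20 characters are counted.

```python
from urllib.parse import urlsplit
from urllib.request import Request, urlopen

MAX_SHORT_LENGTH = 20  # length threshold quoted in the text
TIMEOUT_SECONDS = 10   # timeout quoted in the text


def looks_shortened(url: str) -> bool:
    """Heuristic: treat short URLs as candidates for expansion.

    Assumption: the length is measured on host plus path, without the scheme.
    """
    parts = urlsplit(url)
    return len(parts.netloc + parts.path) <= MAX_SHORT_LENGTH


def expand_url(url: str) -> str:
    """Return the final URL after redirects, or the input URL on failure."""
    if not looks_shortened(url):
        return url
    try:
        # The initial request is a HEAD request; urlopen follows redirects
        # automatically, and the timeout applies to each network operation.
        request = Request(url, method="HEAD")
        with urlopen(request, timeout=TIMEOUT_SECONDS) as response:
            return response.geturl()
    except (OSError, ValueError):
        # Network errors, HTTP errors, and malformed URLs leave the link as is.
        return url
```

Long URLs are returned unchanged without any network access, so only links that pass the length heuristic incur a request.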

Calibration of political alignment scores
We calibrated alignment scores so that positive scores mean right-leaning hashtags/links and negative scores mean left-leaning hashtags/links. To this end, we selected the news source account @USATODAY to have a zero alignment score. We used the 200 most recent tweets by @USATODAY in early June to calculate the raw center alignment score $s_c$. We obtained $s_c = 0.058$ and $s_c = 0.246$ for the link-based and hashtag-based approach, respectively. The political alignment scores are then calibrated by

$$ s = \frac{1}{N} \sum_{i=1}^{N} \begin{cases} \dfrac{t_i - s_c}{1 + s_c}, & t_i \le s_c \\[6pt] \dfrac{t_i - s_c}{1 - s_c}, & t_i > s_c \end{cases} $$

where $t_i$ is the score for tweet $i$ and $N$ is the number of tweets across which the score is aggregated.
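A calibration of this kind can be sketched as follows. The piecewise-linear rescaling used here is an assumption: it is one mapping that sends the raw center score to zero while keeping -1 and +1 fixed, and it may differ from the exact formula used in the study. The function names are hypothetical.

```python
def calibrate(raw_score: float, center: float) -> float:
    """Rescale one raw tweet score so that `center` maps to 0.

    Assumption: scores below the center are stretched over [-1, 0] and
    scores above it over [0, +1], so the range [-1, +1] is preserved.
    Requires -1 < center < +1.
    """
    if not -1.0 < center < 1.0:
        raise ValueError("center must lie strictly between -1 and +1")
    if raw_score <= center:
        return (raw_score - center) / (1.0 + center)
    return (raw_score - center) / (1.0 - center)


def calibrated_alignment(tweet_scores, center: float) -> float:
    """Average the calibrated scores over the N tweets being aggregated."""
    if not tweet_scores:
        raise ValueError("at least one tweet score is required")
    return sum(calibrate(t, center) for t in tweet_scores) / len(tweet_scores)


# Raw center scores reported in the text for the two approaches.
LINK_CENTER = 0.058
HASHTAG_CENTER = 0.246
print(calibrated_alignment([LINK_CENTER], LINK_CENTER))
```

With this mapping, a timeline whose raw score equals that of the center account gets a calibrated score of zero under either approach.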

Statistical analyses
All t-tests in our analyses are two-sided. The main results in the paper are significant at the 0.05 level. In cases where we compare five groups of drifters, a Bonferroni correction for multiple comparisons can be applied by dividing the significance level by $\binom{5}{2} = 10$.
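The corrected threshold follows directly from the number of pairwise comparisons. A minimal sketch, with a hypothetical function name:

```python
from math import comb


def bonferroni_alpha(alpha: float, n_groups: int) -> float:
    """Divide the significance level by the number of pairwise comparisons."""
    n_comparisons = comb(n_groups, 2)  # 5 groups give 10 pairs
    return alpha / n_comparisons


print(bonferroni_alpha(0.05, 5))
```

With five groups and a 0.05 level, each pairwise test is therefore evaluated against a threshold of about 0.005.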

Supplementary notes
Comparisons of follower growth rates and confounding factors
There are multiple ways to test whether drifters in one group gain followers significantly faster than those in another group (Fig. 1). The method reported in the main text focuses on the daily follower growth for each drifter bot. We record the follower count on a daily basis in our experiment, with a few exceptions due to technical issues. We calculate the daily growth rate for any two consecutive observations of the follower count. We then combine the data points from each group and use t-tests to compare different groups (n between 373 and 389).
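The daily growth computation can be sketched as below. Several details are assumptions: the growth rate is taken as followers gained per elapsed day (so gaps from missing observations are averaged out), and the test statistic shown is Welch's unequal-variance form, since the text does not say which t-test variant was used. The function names are hypothetical and the sketch stops at the statistic, without computing a p-value.

```python
from math import sqrt
from statistics import mean, variance


def daily_growth_rates(observations):
    """Followers gained per day between consecutive observations.

    observations: list of (day_number, follower_count) pairs sorted by day.
    """
    rates = []
    for (day0, count0), (day1, count1) in zip(observations, observations[1:]):
        rates.append((count1 - count0) / (day1 - day0))
    return rates


def welch_t_statistic(sample_a, sample_b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Made-up follower counts for one drifter; day 2 is missing.
example = [(0, 10), (1, 12), (3, 18)]
print(daily_growth_rates(example))
```

Rates from all drifters in a group would be pooled into one sample before two groups are compared.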
Here we report on analyses based on two additional methods. In the first, we combine the raw observations (follower-date pairs) from the drifters within the same group and then combine them across the two groups to be compared, using a dummy variable to distinguish them. Finally, we apply linear regression to this combined data set with an extra interaction term between elapsed time and the dummy variable. The coefficient of the interaction term indicates whether the growth rates of the two groups are significantly different (n = 782, p < 0.001 comparing Left vs. Center and n = 779, p < 0.001 comparing Right vs. Left).
In the last method, we use linear regression to estimate the follower growth rate for each drifter. We then use a two-sided t-test to compare the estimated growth rates of two different groups.
The differences in influence among drifters could be affected by the popularity of the seed accounts. Supplementary Fig. 1 shows the correlation between drifter influence and two measures of popularity of their respective seed accounts. We find no significant correlation between the numbers of followers of drifters and seed accounts (Pearson's r = 0.05, p = 0.850). However, the drifter influence is correlated with the popularity of the seeds among active accounts with similar political alignment (Pearson's r = 0.52, p = 0.049).
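The per-drifter regression used in the last method can be sketched with an ordinary least-squares slope. The function name and the example data are hypothetical; follower counts are regressed on elapsed days.

```python
from statistics import mean


def ols_slope(days, followers):
    """Least-squares slope of followers against elapsed days."""
    if len(days) != len(followers) or len(days) < 2:
        raise ValueError("need at least two paired observations")
    mean_day = mean(days)
    mean_followers = mean(followers)
    sxx = sum((d - mean_day) ** 2 for d in days)
    if sxx == 0:
        raise ValueError("all observations fall on the same day")
    sxy = sum(
        (d - mean_day) * (f - mean_followers) for d, f in zip(days, followers)
    )
    return sxy / sxx


# Made-up follower counts for three drifters in one group.
group = {
    "drifter_1": ([0, 1, 2, 3], [5, 7, 9, 11]),
    "drifter_2": ([0, 1, 2, 3], [2, 3, 5, 6]),
    "drifter_3": ([0, 2, 3], [1, 4, 6]),
}
slopes = {name: ols_slope(days, counts) for name, (days, counts) in group.items()}
print(slopes)
```

The estimated slopes of two groups are then compared with a t-test. For the pooled regression with a dummy variable and an interaction term, the interaction coefficient equals the difference between the slopes fitted separately to the two pooled groups, so the same helper can be used to check that estimate.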

Individual political trajectories
Fig. 5 in the main text shows the aggregated political alignment scores. Here we provide the individual trajectory of the political alignment for each drifter. Full-resolution vector images of the plots are available in the data repository (github.com/IUNetSci/DrifterBot). Supplementary Fig. 2 shows the results from the link-based approach and Supplementary Fig. 3 shows the results from the hashtag-based approach. We observe that the trajectories diverge in several examples, suggesting that the initial conditions do not limit the variability of the evolution. Supplementary Fig. 4 shows the news feed bias computed for each drifter with both methods. Note that two of the Right drifters were temporarily suspended by Twitter in mid-November 2019, and we neglected to reactivate them until the end of the experiment.

Political bias of news feed algorithm
Supplementary Table 3 shows the results of the analysis of bias in the platform's news feed. As discussed in the main text, most effects are small and are not consistent across the link- and hashtag-based methods. Supplementary Fig. 5 plots the relative overlap between friends and followers of each drifter to examine the reciprocity of the links. We observe a higher follow-back rate for partisan drifters, especially conservative ones. Supplementary Table 4 provides descriptive statistics of the drifters, including the number of friends and followers, number of tweets liked, number of tweets posted (including retweets), number of hashtags and links with alignment scores in posted tweets, and total number of actions taken.

[...] Supplementary Table 1; the methods to calculate overall and within-group popularity are documented in supplementary methods above. Shaded areas highlight the 95% confidence intervals around least-squares linear fits (solid lines). Source data are provided as a Source Data file.

[...] for all fifteen bots. A tweet is assigned a score between -1 (liberal) and +1 (conservative) based on the shared hashtags. (A) Home timeline: daily average score of the last 50 tweets in the home timeline. (B) User timeline: daily average score of the last 20 tweets in the user timeline. The summary represents the average for each group. Colored confidence intervals indicate ±1 standard error. Source data are provided as a Source Data file.

Supplementary Figure 4: News feed bias for all fifteen bots. Bias is measured by the difference in alignment between the account's home timeline and its friends' user timelines, based on (A) links and (B) hashtags. The summary represents the average for each group. Colored confidence intervals indicate ±1 standard error. Source data are provided as a Source Data file.

Supplementary Figure 5: Relative overlap between friends and followers of the drifters in each group. Error bars indicate standard errors (n = 3 drifters in each group). Source data are provided as a Source Data file.

Supplementary references
Yang, K. C., Varol, O., Hui, P. M., & Menczer, F. (2020). Scalable and generalizable social bot detection through data selection. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34, No. 01, pp. 1096-1103.