Sentinel node approach to monitoring online COVID-19 misinformation

Understanding how different online communities engage with COVID-19 misinformation is critical for public health response. For example, misinformation confined to a small, isolated community of users poses a different public health risk than misinformation being consumed by a large population spanning many diverse communities. Here we take a longitudinal approach that leverages tools from network science to study COVID-19 misinformation on Twitter. Our approach provides a means to examine the breadth of misinformation engagement using modest data needs and computational resources. We identify a subset of accounts from different Twitter communities discussing COVID-19, and follow these ‘sentinel nodes’ longitudinally from July 2020 to January 2021. We characterize sentinel nodes in terms of a linked domain preference score, and use a standardized similarity score to examine alignment of tweets within and between communities. We find that media preference is strongly correlated with the amount of misinformation propagated by sentinel nodes. Engagement with sensationalist misinformation topics is largely confined to a cluster of sentinel nodes that includes influential conspiracy theorist accounts. By contrast, misinformation relating to COVID-19 severity generated widespread engagement across multiple communities. Our findings indicate that misinformation downplaying COVID-19 severity is of particular concern for public health response. We conclude that the sentinel node approach can be an effective way to assess breadth and depth of online misinformation penetration.


Hierarchical clustering
Below we present the sentinel communities arranged along the linked domain score axis, colored by the cluster assignments used in the main text, together with the dendrogram that resulted from hierarchical clustering.

Figure 1: The linked domain score for each sentinel community colored by cluster assignment (left), together with the dendrogram produced by mean linkage clustering (right). Colors correspond to cluster assignment (Left / Right / Far Right) as described in the main text.

Figure 2: The linked domain score for each sentinel community (left), together with the dendrogram produced by mean linkage clustering (right), following removal of two domains that were disproportionately shared by a single sentinel node each. Colors correspond to cluster assignment (Left / Right / Far Right) as described in the main text.
The clustering we chose produced the second-highest silhouette score of all possible clusterings that could result from the dendrogram in Figure 1 (0.737, in comparison to 0.894 for two clusters). We made this choice after noticing that the positions of the two 'Right' communities were skewed leftward by their disproportionate sharing of a domain connected to a single sentinel node (we check the robustness of our chosen clustering to such link-sharing behavior in section 1.1.3). Removing those two domains from consideration and repeating the PCA and clustering process yields the linked domain scores and dendrogram shown in Figure 2.
In the dendrogram of Figure 2, the cut point producing three clusters has the largest silhouette score of all possible clusterings (0.814). We therefore selected the cut in the original dendrogram (Figure 1) that resulted in three clusters, even though it did not correspond to the highest silhouette score there.
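The pipeline described above (domain frequency matrix, PCA, mean linkage clustering, silhouette-scored dendrogram cuts) can be sketched as follows. This is a minimal illustration on randomly generated stand-in data; the matrix dimensions and component count are assumptions, not the study's actual values.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Stand-in for the sentinel-community x linked-domain frequency matrix.
domain_freq = rng.random((15, 40))

# Project the frequency matrix onto its leading principal components.
coords = PCA(n_components=2).fit_transform(domain_freq)

# Mean (average) linkage hierarchical clustering over the PCA coordinates.
Z = linkage(coords, method="average")

# Score each candidate number of clusters obtainable from dendrogram cuts.
for k in range(2, 6):
    labels = fcluster(Z, t=k, criterion="maxclust")
    print(k, round(silhouette_score(coords, labels), 3))
```

As in the analysis above, the cut yielding the highest silhouette score need not be the one ultimately chosen; substantive considerations (here, the skew induced by two domains) can justify a lower-scoring cut.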

Clustering robustness
To examine the robustness of the clustering results to linked domains stemming from a small number of nodes (for example, self-promoting posts from a handful of accounts), we eliminated domains that were linked to by less than a threshold fraction of individual sentinels. Threshold values varied from 0.0 to 0.10 in steps of 0.005 (one step corresponds to slightly more than two unique sentinels). For each value we repeated the PCA clustering process and calculated an adjusted Rand index [1] comparing the resulting clusters with the Left / Right / Far Right grouping used in our analysis. In particular, after running the thresholded domain frequency matrix through PCA, we performed clustering by finding a dendrogram cut point that yielded three clusters. The adjusted Rand index measures the similarity of two data clusterings after adjusting for chance in group assignment; it ranges from −1 (completely dissimilar) to 1 (identical clusterings). In Figure 3 we present the adjusted Rand index as a function of the threshold value. As Figure 3 demonstrates, the adjusted Rand index stays at or near 1 for threshold values up to 0.03 (which represents at least 12 unique sentinels), suggesting that the clustering used in our analysis is not driven by linked domains corresponding to a handful of accounts.
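The threshold sweep can be sketched roughly as below. All data here are randomly generated stand-ins (the sentinel counts, link probabilities, and reference labels are assumptions), so the resulting index values are illustrative only.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(1)
n_sent, n_comm, n_dom = 400, 15, 40
community = rng.integers(0, n_comm, size=n_sent)  # hypothetical community membership
links = rng.random((n_sent, n_dom)) < 0.15        # hypothetical sentinel-to-domain links
reference = rng.integers(0, 3, size=n_comm)       # stand-in for the Left/Right/Far Right grouping

def cluster_labels(threshold):
    # Drop domains linked to by less than `threshold` of all sentinels.
    keep = links.mean(axis=0) >= threshold
    freq = np.array([links[community == c][:, keep].sum(axis=0) for c in range(n_comm)])
    coords = PCA(n_components=2).fit_transform(freq.astype(float))
    Z = linkage(coords, method="average")
    return fcluster(Z, t=3, criterion="maxclust")  # dendrogram cut yielding three clusters

# Sweep thresholds from 0.0 to 0.10 in steps of 0.005.
for thr in np.arange(0.0, 0.105, 0.005):
    ari = adjusted_rand_score(reference, cluster_labels(thr))
    print(f"{thr:.3f} {ari:+.3f}")
```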

Sentinel coverage of known COVID-19 misinformation
The sentinel monitoring approach we have described will not have captured every piece of COVID-19 misinformation circulating on the platform. However, we believe this approach did allow for the detection of the most notable misinformation: that which reached prominent Twitter users within a diverse array of digital social circles.
We performed a series of substring searches of our sentinel tweets in order to examine the topical coverage of our sentinel monitoring approach. Specifically, we examined the extent of coverage by our sentinel nodes concerning virus origins, COVID-19 treatments and preventatives, three COVID-19 conspiracy theories, and non-pharmaceutical interventions (NPIs). The number of tweets returned from each cluster for every topical substring search is presented in Table 3. The search strings that produced these counts were variations on the given sub-topic name. Note that we do not include topics that were extensively examined in the main text, namely vaccinations and perceived COVID-19 severity.

Table 3: Number of tweets posted by each cluster that contain strings related to each subtopic. Search strings were formed by taking variations on the sub-topic labeling; e.g., "Created in a Lab" counts were found with the search strings "government lab", "laboratory", "made in a lab" and "man-made".
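The substring counting behind Table 3 amounts to a case-insensitive membership test per tweet. A minimal sketch, using made-up tweets and the "Created in a Lab" search strings quoted in the caption:

```python
from collections import Counter

# Hypothetical (cluster, tweet text) pairs; real tweets are not reproduced here.
tweets = [
    ("Left", "New study asks whether the virus was made in a lab"),
    ("Far Right", "They say it came from a government lab!"),
    ("Right", "Mask mandates extended again"),
]
# Variations on the "Created in a Lab" sub-topic, per the Table 3 caption.
search_strings = ["government lab", "laboratory", "made in a lab", "man-made"]

counts = Counter()
for cluster, text in tweets:
    lowered = text.lower()
    # A tweet counts toward its cluster if any search string appears in it.
    if any(s in lowered for s in search_strings):
        counts[cluster] += 1

print(dict(counts))  # {'Left': 1, 'Far Right': 1}
```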

QAnon posts
To quantify the association of clusters with the QAnon conspiracy theory [8], we performed a substring search of all tweets posted by sentinel nodes over the observation period for those containing at least one of the following strings: "qanon", "qarmy", "wwg", "wga", "greatawakening", "great awakening", and "new world order". These strings have been found to be affiliated with QAnon posts across various social media platforms [8,9]. In Figure 4 we present the total QAnon-related tweets by sentinel account for each cluster. While all three clusters contain at least one account that posted more than 200 tweets featuring one of the substrings, the Far Right cluster has many such accounts, as well as a handful of users near or exceeding 1,000 matching tweets.

Sentinel drop-off
An important question for a longitudinal cohort study is the effect of subject drop-off over the observation period. In our study, drop-off may have occurred because a sentinel left Twitter of their own volition or was suspended by Twitter for violating its terms of service. Over our observation period Twitter adopted suspension policies that actively pursued those posting COVID-19 misinformation [10], as well as accounts determined to be affiliated with the QAnon conspiracy theory [11]. Twitter additionally removed over 70,000 accounts on January 7, 2021 related to the storming of the United States Capitol Building [12]. These policies may explain why we observed differential attrition between the Left, Right and Far Right clusters, as demonstrated in Figure 5. Attrition in this setting can be problematic because the accounts that drop off may disproportionately come from the clusters most likely to post misinformation.

Figure 5: Sentinel attrition over the observation period by cluster. We consider a sentinel account to have been lost to follow-up at the time of their last tweet within our data set.

Figure 5 shows that all clusters retained more than 80% of their initial members for all but a handful of days over our observation period, suggesting that our comparisons between clusters over time are unlikely to be skewed by drop-off.
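Under the lost-to-follow-up definition used here (a sentinel drops out at the time of its last tweet in the data set), a retention curve like Figure 5 can be computed as follows. The account names and dates are made up for illustration.

```python
from datetime import date, timedelta

# Hypothetical last-tweet date for each sentinel in one cluster.
last_tweet = {
    "acct_a": date(2020, 9, 1),
    "acct_b": date(2021, 1, 15),
    "acct_c": date(2021, 1, 30),
    "acct_d": date(2020, 12, 25),
}

start, end = date(2020, 7, 1), date(2021, 1, 31)
n = len(last_tweet)

# Fraction of accounts still active (last tweet on or after the given day).
retention = []
day = start
while day <= end:
    active = sum(1 for last in last_tweet.values() if last >= day)
    retention.append((day, active / n))
    day += timedelta(days=1)
```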

Search phrase tables
Below we present the search strings used to subset the COVID-19 tweets for our analysis of vaccination and perceived COVID-19 severity content.

Table 4: Search strings used to identify COVID-19 tweets pertaining to COVID-19 severity, arranged in alphabetical order.

Augmented Dickey-Fuller test results
The burst score metric in equation (1) of the main text implicitly assumes that the similarity between two clusters is stationary in time and does not exhibit a trend. To assess the validity of this assumption we performed an augmented Dickey-Fuller test [13] with no trend and no lag using the adfuller function in the statsmodels Python package [14]. For the Right–Far Right between-cluster similarity this test produced a test statistic of −11.165 with a corresponding p-value of 2.722 × 10⁻²⁰, indicating significant evidence to reject the null hypothesis that the time series has a unit root in favor of the alternative that the time series is stationary.

Flagged day table
In the main text we provided a figure with all of the flagged days over the observation period. We considered a flagged day to be any day on which the Far Right and Right clusters exhibited a burst score of at least two. For each flagged day we gave a short description of the "topic" driving increased similarity on that day. Here we elaborate further on each flagged day, giving the date, a longer description of that day's topic, the original Right–Far Right between-cluster similarity score on that day, and a recalculated score found by computing the average between-cluster similarity after removing the topical tweets (as determined by latent semantic analysis). This information is provided in Table 8.

Table 8: Descriptions of the flagged days from Figure 6 of the main text, listed by date and topic. Topics were determined by running the tweets sent by each cluster on that date through latent semantic analysis and reading the tweets with the highest singular vector values. The original score refers to the Right–Far Right inter-similarity score on that day, and the removed score is the recalculated inter-similarity score after topical tweets were removed.
The topics driving increased similarity tend to fall into two categories: topics related to political discourse involving COVID-19, and topics downplaying COVID-19 severity.
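The latent-semantic-analysis step used to isolate topical tweets on a flagged day can be sketched with a TF-IDF matrix and a truncated SVD: tweets with the largest loadings on the leading singular vector are flagged as topical and removed before recomputing similarity. The tweets and the cutoff of three below are invented for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

# Hypothetical tweets from one flagged day.
tweets = [
    "rapid antigen tests are unreliable",
    "contradictory antigen tests results reported today",
    "new mask guidance from the cdc today",
    "stimulus bill vote delayed again",
    "antigen tests giving contradictory results",
    "the weather is nice today",
]

# TF-IDF term-document matrix followed by a rank-2 truncated SVD (LSA).
X = TfidfVectorizer().fit_transform(tweets)
svd = TruncatedSVD(n_components=2, random_state=0)
# Magnitude of each tweet's loading on the leading singular vector.
doc_scores = np.abs(svd.fit_transform(X)[:, 0])

# Flag the top-loading tweets as topical and keep the rest.
topical = set(np.argsort(doc_scores)[::-1][:3])
remaining = [t for i, t in enumerate(tweets) if i not in topical]
```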

Flagged day analysis
November 13, 2020 - Elon Musk tweets about his COVID tests

High between-cluster similarity on November 13, 2020 was driven by content related to a tweet sent by Tesla CEO Elon Musk on that day that harshly criticized the reliability of rapid antigen tests. The observed similarity between the Right and Far Right on this day was 0.11, roughly 3 standard deviations from the average similarity up to that time. In Figure 6 we plot the cumulative per-community tweets containing Elon Musk's name, along with a vertical dashed line denoting the time of Mr. Musk's original tweet.

Introduction
This file contains instructions for undergraduate coders. Below we describe the data, the job the coders will perform, and provide guidance as to how they should accomplish that job.

The Data Set
The data you will be coding contains 800 tweets spanning four distinct topics (200 tweets per topic) regarding COVID-19. You will be presented (in a Qualtrics survey) with the tweet as well as the topic associated with it. Note that it is possible for you to see the same tweet appear under different topics.

Coding Online
Each coder will have their own online survey link they will use to code each tweet.
The steps are:
1. Click on the link.
2. Paste in the tweet ID # and tweet text from the Excel sheet.
3. Answer the questions about the tweet content as noted on the survey and the coding sheet below.
4. Hit submit.
5. Re-open the link and start again with the next tweet.
6. Complete this process for all 800 tweets.
7. Keep track of your progress on the Excel sheet so you do not code a tweet more than once.

Coding
For each tweet you view you will be asked to answer the following four questions:
• Is this tweet about the topic it is associated with? (Yes / No / Unsure)
• Does this tweet include a "Call to Action"? This is defined as mobilizing information: either (A) calling for in-person collective action (protest, march, sit-in, etc.), (B) calling for online collective action (signing an online petition, asking people to share or repost a video, meme, article, etc.), or (C) sharing information about how to take collective action (e.g., a link, contact information of an official, or the time/place of a protest). (Yes / No; if yes, code the type of mobilizing information: in person, online, sharing information)
• For tweets related to hydroxychloroquine and mask wearing you will also be asked: Does this tweet downplay or play up the effectiveness of hydroxychloroquine (or face masks)? (More effective / Less effective / Neutral)
• For tweets about mortality rate, you will be asked: Does this tweet claim that the mortality rate is lower or higher than health experts suggest, or is it neutral? (Mortality Rate is Lower / Mortality Rate is Higher / Neutral / Unsure)

Coding Instructions
In this section we provide context and list the misinformation you are likely to see, along with the facts corresponding to that misinformation. Again, we define misinformation as false or inaccurate information, in accordance with the facts we present below.
We will first present a few pieces of misinformation that you may encounter across all four topics. We will then dive into topic-specific misinformation. For each of the four topics we will provide a brief background, references, and a list of common pieces of misinformation that you may encounter.

General Misinformation
In this section we will briefly describe some general misinformation that could appear in any of the four topics. For each item below the misinformation will be presented and then followed by the truth.

Misinformation Truth
The pandemic was planned.
There is no evidence of this.
The pandemic is fake or a hoax.
The COVID-19 pandemic is a real pandemic.
Hydroxychloroquine is a COVID cure.
The scientific consensus is that hydroxychloroquine is not effective for treating or preventing COVID-19.

Misinformation Truth
Health and Human Services colluded with the Department of Justice and FBI to destroy Dr. Mikovits' reputation.
There is no evidence of this.
Dr. Anthony Fauci directed aforementioned efforts.
There is no evidence of this.
Dr. Fauci delayed previous publications of Dr. Mikovits on HIV which benefitted Fauci and his friends while leading to the death of millions.
There is no evidence of this.
COVID-19 was manipulated in a lab and released into the world, either by accident or as a bioweapon.
Scientific consensus supports the theory that COVID-19 jumped from animals into human hosts in the wild.
Hospitals make money from Medicare if they label a death as being due to COVID-19.
Medicare does give money to hospitals that treat coronavirus patients. However, this is a standard practice for all diseases, and there is no indication that hospitals are over-identifying patients as having COVID-19 in an attempt to make money.
Getting a flu shot increases the odds that you'll contract COVID-19.
There is no evidence of this.
Hydroxychloroquine is an effective treatment against coronaviruses.
Evidence from various scientific studies suggests that hydroxychloroquine is no more effective than any other treatments that have been considered.

Hydroxychloroquine
Summary of Hydroxychloroquine
In March 2020 Dr. Didier Raoult announced a study in which he claimed that the use of hydroxychloroquine and azithromycin was effective in treating COVID-19. His study was deemed "irresponsible" by peer reviewers. Shortly after Raoult's announcement, President Trump began promoting hydroxychloroquine as a COVID-19 treatment, which led to a surge in off-label prescriptions of the drug. On March 28, 2020 the FDA issued an emergency use authorization allowing hydroxychloroquine to be prescribed to patients hospitalized with COVID-19. In May 2020 President Trump stated he was taking hydroxychloroquine combined with zinc and an initial dose of azithromycin. On May 22, 2020 a study published in the Lancet raised concerns about the safety of prescribing hydroxychloroquine to treat COVID-19; that study was retracted in June. On June 15 the FDA revoked the emergency use authorization. June also marked a period in which more large-scale studies began to suggest that hydroxychloroquine was not an effective treatment for COVID-19.

Masks
Summary of Mask Wearing Recommendations During COVID-19
CDC - Prior to April 3, 2020 the CDC did not recommend wearing face masks to prevent the transmission of COVID-19. As of April 3, 2020 the CDC updated its recommendations to say that people should wear a cloth face covering in public. This was further updated on June 28, 2020 to recommend that people wear cloth face coverings in public settings and when around people who don't live in their household, especially when other social distancing measures are difficult to maintain. It is the position of the CDC (backed by various scientific studies) that widespread proper use of face masks is likely to reduce the spread of COVID-19 in public settings.
The World Health Organization (WHO) - Prior to June 5, 2020 the WHO did not actively recommend that people wear face masks in public. On June 5, 2020 WHO guidance was updated to recommend that the general public wear non-medical fabric masks where social distancing is not possible, and that vulnerable people wear medical masks in such settings.
Efficacy of Masks - Early in the pandemic there was little to no research on the effectiveness of face masks in decreasing COVID-19 transmission. By the summer of 2020 the scientific consensus was that widespread proper face mask use is effective in reducing COVID-19 transmission.
Correct Way to Wear a Mask -In order to be effective, face masks should cover both the nose and mouth.

Misinformation Truth
Face masks are worse than doing nothing.
While the effectiveness of a particular mask depends on a number of factors (for example, an N95 mask is more effective than a cloth mask) it is the current scientific consensus that properly using a face mask is better than nothing.

Duke University Study shows Many Face Coverings INCREASE Transmission of COVID-19.
This refers to a misinterpreted study conducted by Duke University researchers that proposed a new method for testing the effectiveness of different face coverings. The study says nothing statistically significant about the effectiveness of different coverings.
Face masks are a way for the government (or others) to infringe on individual rights and freedoms.
Face mask recommendations are made in the interest of preventing the spread of COVID-19. There is no evidence of a plot to control the freedoms of American citizens using face masks.
Anthony Fauci said masks don't work.
While Dr. Fauci did say not to wear face masks in a 60 Minutes interview on March 8, this statement was made before the CDC altered its guidelines and before there was a better understanding of the mechanisms of COVID-19 spread. Dr. Fauci has since changed his position as more evidence has illuminated how the virus spreads from person to person.
Face masks reduce oxygen intake and increase the amount of carbon dioxide you breathe in.
This is false. Masks may be uncomfortable and increase anxiety, but they do not reduce oxygen intake or increase your carbon dioxide levels.