Harnessing Twitter data to survey public attention and attitudes towards COVID-19 vaccines in the UK

Attitudes to COVID-19 vaccination vary considerably within and between countries. Although the contribution of socio-demographic factors to these attitudes has been studied, the role of social media and how it interacts with news about vaccine development and efficacy is uncertain. We examined around 2 million tweets from 522,893 persons in the UK from November 2020 to January 2021 to evaluate links between Twitter content about vaccines and major scientific news announcements about vaccines. The proportion of tweets with negative vaccine content varied, with reductions of 20–24% on the same day as major news announcements. However, the proportion of negative tweets reverted to an average of around 40% within a few days. Engagement rates were higher for negative tweets. Public health messaging could consider the dynamics of Twitter-related traffic and the potential contribution of more targeted social media campaigns to address vaccine hesitancy.

Public attitudes towards COVID-19 vaccines. The percentage of negative sentiment tweets varied from 20.7% to 51.1% (excluding the reference week). The percentage of negative sentiment dropped with every new announcement (e.g., 24.4% with Phase 3 trial results from Pfizer/BioNTech, 20.9% with Phase 3 trial results from Oxford/AstraZeneca, 21.5% with UK starting its vaccine campaign), but then reverted to a higher level (36.4%) after each news announcement.
Negative tweets were posted by a smaller number of unique individuals during the study period than tweets presenting positive views (38 vs 45 unique authors per 100 tweets). The engagement pattern for positive and negative tweets changed over time (Fig. 2). When the first trial results were released in November, negative tweets had higher engagement rates than positive tweets, but engagement rates for negative tweets declined after the release of the Oxford/AstraZeneca Phase 3 results (week 3). This was primarily driven by a decreasing number of likes for negative tweets: the mean number of likes for negative tweets fell from 4.6 in weeks 1 and 2 to 3.1 in weeks 3–11, whereas the mean number of likes for positive tweets increased from 4.5 in weeks 1 and 2 to 5.0 in weeks 3–11.
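As an illustration of the "unique authors per 100 tweets" measure reported above, the following minimal Python sketch shows one way such a figure could be computed. The record layout (dictionaries with hypothetical `author` and `sentiment` keys) is an assumption made for this example and is not taken from the study's data format.

```python
def unique_authors_per_100(tweets, sentiment):
    """Number of distinct authors per 100 tweets carrying the given sentiment label.

    `tweets` is assumed to be a list of dicts with hypothetical keys
    "author" and "sentiment"; this layout is illustrative only.
    """
    selected = [t for t in tweets if t["sentiment"] == sentiment]
    if not selected:
        raise ValueError(f"no tweets labelled {sentiment!r}")
    authors = {t["author"] for t in selected}
    return 100.0 * len(authors) / len(selected)


# Made-up example records, not study data.
example = [
    {"author": "a", "sentiment": "negative"},
    {"author": "a", "sentiment": "negative"},
    {"author": "b", "sentiment": "negative"},
    {"author": "b", "sentiment": "negative"},
    {"author": "c", "sentiment": "positive"},
    {"author": "d", "sentiment": "positive"},
]
negative_rate = unique_authors_per_100(example, "negative")
positive_rate = unique_authors_per_100(example, "positive")
```

A lower value means the same accounts post repeatedly, which is the pattern the text describes for negative tweets.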

Discussion
Using Twitter data from the UK during November 2020 to January 2021, we investigated the ecological associations between major vaccine-related news announcements and attitudes towards vaccines. This period coincided with news on the major vaccine trials being announced or published, and approvals by the UK Medicines and Healthcare products Regulatory Agency (MHRA).
We report two main findings. First, each major news announcement related to vaccines was associated with a large decrease in negative sentiment on the same day, dropping from around 40% to 20% of all daily tweets. However, this was short-lived, and the proportion of negative tweets reverted to the background average within a few days. A similar decrease in negative sentiment when Pfizer/BioNTech announced its phase III vaccine trial results has been reported 19. That study also found fluctuations in public sentiment during the same period, but did not investigate other major news announcements. Another study analyzing UK public sentiment toward COVID-19 vaccines on Twitter and Facebook also found that public sentiment was potentially associated with news on vaccine development, although its study period ended in November 2020 15 (whereas ours started in November 2020). Second, tweets with negative sentiment towards vaccines were posted by a smaller number of unique individuals than tweets presenting positive views. Negative tweets were more likely to be liked and retweeted when the trial results were initially released in November, but their popularity gradually decreased once the vaccination campaign was underway.
Our data are limited by the lack of information on the demographic factors associated with the tweets, and by the sentiment algorithm, which is necessarily limited by the nature of Twitter's free-text interface. In addition, we examined the absolute change in the number of COVID-19 vaccine-related tweets during the study period, but were not able to examine the trend relative to total Twitter traffic, as the background volume of tweets on the platform was not available. This approach is consistent with previous work using social media to examine health-related content over time 20,21. Furthermore, we did not take bot messages into consideration when including COVID-19 related tweets. However, as we examined the trend in the number and engagement of tweets by day or week, this might not bias the results if the publishing of bot messages is non-differential over time. Finally, this investigation is based in one country and only harvests information in English, despite many languages being spoken in the UK and possible differential attitudes across them.
Our results can inform public campaigns aimed at promoting vaccine take-up. They suggest that information campaigns need to be sustained beyond major news announcements. In addition, public education could consider the dynamics of Twitter-related traffic on this issue by spacing out news announcements and repeating news stories about the vaccination programme, beyond simply publishing vaccination numbers. One possibility is regular news releases, specifically allied to tweets and other social media posts with more scientific content, as a way to develop a more informed discourse in the public sphere. This view is supported by research on attitudes before vaccines were available in Israel, which found an overall vaccine hesitancy rate of 25% 22, whereas Israel has subsequently achieved very high rates of vaccine take-up 23. This suggests that attitudes to vaccination may, in a proportion of people, be malleable and permeable to public health messaging.
Using a hybrid algorithm combining machine learning and rule-based approaches, each tweet was classified as expressing positive, negative, or neutral sentiment. Several pre-processing steps, including part-of-speech tagging, lemmatization, and the handling of prior polarity, negations, amplifiers, and other grammatical constructs, were carried out before the machine learning model was applied. The machine learning model used by Sprout Social was built on a dataset of 50,000 tweets drawn randomly from Twitter. A further 10,000 tweets, none of which were used for building the algorithm, were used to test and tune it. As the tweets are not domain-specific, the sentiment analysis could be performed on a wide range of domains 25. Combining machine learning and rule-based approaches has been widely used to estimate public sentiment 26,27, and such methods have been shown to improve the effectiveness of sentiment analysis 25,27.
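To make the idea of combining rule-based cues with a model score more concrete, the following toy Python sketch may help. It is not Sprout Social's algorithm, whose details are not described here: the word lists, the averaging rule, the `threshold` value, and the optional `model_score` callable are all hypothetical choices made for illustration, and part-of-speech tagging and lemmatization are omitted.

```python
# Hypothetical, deliberately tiny lexicons for illustration only.
PRIOR_POLARITY = {"safe": 1.0, "effective": 1.0, "good": 1.0,
                  "dangerous": -1.0, "unsafe": -1.0, "bad": -1.0}
NEGATIONS = {"not", "no", "never"}
AMPLIFIERS = {"very": 1.5, "extremely": 2.0}


def rule_score(text):
    """Sum prior polarities, flipping after a negation and scaling after an amplifier."""
    score = 0.0
    negate = False
    boost = 1.0
    for token in text.lower().split():
        word = token.strip(".,!?")
        if word in NEGATIONS:
            negate = True
            continue
        if word in AMPLIFIERS:
            boost = AMPLIFIERS[word]
            continue
        polarity = PRIOR_POLARITY.get(word)
        if polarity is not None:
            score += (-polarity if negate else polarity) * boost
            # Reset modifiers once they have been applied to a polar word.
            negate = False
            boost = 1.0
    return score


def classify(text, model_score=None, threshold=0.5):
    """Label text as positive, negative, or neutral.

    `model_score` stands in for a trained classifier returning a value in
    [-1, 1]; averaging it with the rule score is an assumption of this sketch.
    """
    combined = rule_score(text)
    if model_score is not None:
        combined = (combined + model_score(text)) / 2
    if combined > threshold:
        return "positive"
    if combined < -threshold:
        return "negative"
    return "neutral"
```

In this sketch a negation stays active until the next polar word, which is a crude simplification; a production system would scope negations using the grammatical structure of the sentence.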
The daily proportion (%) of negative tweets among the COVID-19 vaccine-related tweets dataset was calculated during the study period. Engagement with COVID-19 vaccine-related tweets was measured as average number of comments, shares and likes of the original tweet. We compared the average number of engagements with negative and positive tweets and present the figures week by week (reference week: Nov 2 to Nov 8, 2020; week 1: Nov 9 to Nov 15, 2020; week 2: Nov 16 to Nov 22, 2020 … week 11: Jan 18 to Jan 24, 2021).
www.nature.com/scientificreports/
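The daily proportion of negative tweets and the week-by-week engagement averages described in this section could be computed along the following lines. This is a minimal sketch under stated assumptions: the record layout (dictionaries with hypothetical `day`, `sentiment`, `comments`, `shares`, and `likes` keys) is invented for the example, and engagement per tweet is assumed to be the sum of comments, shares, and likes, which the text does not specify.

```python
from collections import defaultdict
from datetime import date

# First day of the reference week (Nov 2 to Nov 8, 2020), as given in the text.
REFERENCE_WEEK_START = date(2020, 11, 2)


def week_index(day):
    """Return 0 for the reference week, 1 for Nov 9 to 15, ..., 11 for Jan 18 to 24."""
    return (day - REFERENCE_WEEK_START).days // 7


def daily_negative_share(tweets):
    """Map each day to the percentage of that day's tweets labelled negative."""
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for t in tweets:
        totals[t["day"]] += 1
        if t["sentiment"] == "negative":
            negatives[t["day"]] += 1
    return {d: 100.0 * negatives[d] / totals[d] for d in totals}


def weekly_mean_engagement(tweets, sentiment):
    """Mean engagements per tweet, by week, for one sentiment label.

    Assumption: engagements = comments + shares + likes for each tweet.
    """
    sums = defaultdict(int)
    counts = defaultdict(int)
    for t in tweets:
        if t["sentiment"] != sentiment:
            continue
        w = week_index(t["day"])
        sums[w] += t["comments"] + t["shares"] + t["likes"]
        counts[w] += 1
    return {w: sums[w] / counts[w] for w in counts}


# Made-up example records, not study data.
sample = [
    {"day": date(2020, 11, 9), "sentiment": "negative", "comments": 1, "shares": 2, "likes": 3},
    {"day": date(2020, 11, 9), "sentiment": "positive", "comments": 0, "shares": 1, "likes": 4},
    {"day": date(2020, 11, 23), "sentiment": "positive", "comments": 2, "shares": 2, "likes": 5},
]
shares_by_day = daily_negative_share(sample)
negative_by_week = weekly_mean_engagement(sample, "negative")
positive_by_week = weekly_mean_engagement(sample, "positive")
```

Integer division by seven days from the start of the reference week reproduces the week numbering given in the text, so no calendar-week lookup is needed.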