## Introduction

The coronavirus disease 2019 (COVID-19) pandemic, the first severe pandemic since the Spanish flu a century ago, had infected over 132 million people and killed over 2.8 million as of April 8, 2021 (https://covid19.who.int/). It has caused colossal economic and employment losses globally (Fernandes, 2020) and confined ~58% of the world’s population (Bates et al., 2020), forcing them to work from home while parenting their children, who have been unable to attend school. The convergence of economic losses, job insecurity, health risks, and confinement represents an extraordinary pressure on citizens worldwide, which has raised concerns about their well-being and the psychological impacts of confinement and concurrent difficulties. Social media has been instrumental in keeping people informed and in touch during their confinement, thereby providing a record of their impressions, interests, and sentiments. This record is an important resource to assess which events among the multiple disruptions affected citizens and how these events and the ensuing sentiments varied across cultures. We conducted sentiment analysis to evaluate the global rise and fall of sentiments during the COVID-19 pandemic. Using deep learning classification, we performed a mass examination of sentiments expressed in millions of social media posts (Twitter and Weibo) about COVID-19 in English, Spanish, Arabic, French, Italian, and Chinese. The data extracted and analyzed in this study are available in Yang et al. (2020).

In addition, most contemporary sentiment analysis applications focus on a single language or a single country or region. In two previous studies (Barkur and Vibha, 2020; Venigalla et al., 2020), lexicon-based methods or emotion dictionaries were used to analyze people’s emotions about India’s nationwide lockdown during the COVID-19 outbreak. In a previous analysis (Alhajji et al., 2020), naive Bayes models categorized Saudis’ attitudes toward COVID-19 preventive measures as positive, negative, or neutral. Pastor (2020) used a questionnaire survey and a sentiment package to determine Philippine students’ sentiments on the synchronous online delivery of instruction as a result of the country’s extreme community quarantine during the pandemic. Ziems et al. (2020) used a simple logistic regression classifier with linguistic features, hashtags, and embedded tweets to identify anti-Asian hate and counter-hate text. However, the nature of the pandemic since late February 2020 has broadened the interest and focus on COVID-19 to a global audience. We therefore present an analysis based on six languages (English, Spanish, Arabic, French, Italian, and Chinese) that represent ~2.4 billion citizens globally. We examined over 105 million tweets collected between March 1 and May 15, 2020, and Weibo messages collected between January 20 and May 15, 2020, with a dominance of English posts (64%). After excluding neutral reports (e.g., media and government reports), we classified the content to reflect three sets of sentiments, each with several expressions: positive (optimistic, thankful, and empathetic), negative (pessimistic, anxious, sad, annoyed, and denial), and a complicated expression (joking). We built the classification models using deep learning language models such as Bert (Devlin et al., 2018) for Spanish, French, and Italian; XLNet (Yang et al., 2019) for English; AraBert (Antoun et al., 2020) for Arabic; and ERNIE (Zhang et al., 2019) for Chinese.

## Methods

### Data collection

We collected the tweets using Twint (https://github.com/twintproject/twint), an open-source Twitter crawler that forms requests with specific parameters and scrapes the resulting responses into JSON documents. We used a unified query across the target languages: “COVID-19 OR coronavirus OR covid OR corona OR كورونا (Arabic for corona).” We launched 12 instances on 24 cores to download daily updates. Data rates varied slightly throughout the period, with an average of a little over a million tweets per day. We saved the tweets as JSON files and pooled them into a shared medium so that the language models could preprocess and consume them for sentiment analysis, which was conducted on a GPU server (GTX 1080ti GPU and 20 CPUs). Because of the limited number of Chinese posts on Twitter, COVID-19 sentiment analysis for China was performed through Sina Weibo, the largest social media platform in China. Weibo records were collected through the Sina Weibo API by first gathering hashtags about COVID-19 and then extracting Weibo records that include these hashtags.

### Sentiment annotation

Because of the lack of a sentiment benchmark to support our fine-grained emotion analysis, we randomly selected 10,000 English and 10,000 Arabic tweets for sentiment annotation. We chose these languages because they are two of the top five most popular languages worldwide. Additionally, English can be effectively translated into other languages when needed. Domain experts determined the sentiment categories after a review of a tweet subset and several rounds of discussion. The final set of ten labels reflected complicated sentiments during the pandemic; these labels covered auxiliary emotions and comprised optimistic (representing hopeful, proud, and trusting), thankful (for efforts to combat the virus), empathetic (including praying), pessimistic (hopeless), anxious (scared, fearful, etc.), sad, annoyed (angry), denial (toward conspiracy theories), official report, and joking (ironic). We recruited more than 50 experienced annotators, at least three of whom labeled each tweet. We provided them with example tweets and suggested categories in advance. Each tweet could be assigned multiple labels, in line with convention and in support of the analysis of the complicated emotions during the pandemic. To measure the reliability of these sentiment annotations, we conducted a verification study on the annotated tweets and determined the final labels via majority voting. We acquired the annotated data for Spanish, French, and Italian by translating the labeled English tweets using Google Translate (https://translate.google.com/), which has been considered a reliable strategy in multilingual studies (Balahur and Turchi, 2013). We then evaluated the quality of these translations by calculating the BLEU score between an original tweet A and its back-translation A’, obtained through a round trip such as A(En)→B(Es)→A’(En). The BLEU score in our study was 0.33 (note that the state-of-the-art machine translation model has a BLEU score of 0.39 using a tied transformer), indicating a good translation quality. A manual check of a subset of the translated tweets likewise revealed a surprisingly high translation quality. For Weibo, we examined the COVID-19 posts and annotated 21,173 posts under 7 sentiment categories: optimistic, thankful, surprised, fearful, sad, angry, and disgusted. The detailed statistical analysis of the annotated data can be found in Yang et al. (2020).
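The round-trip check described above can be illustrated with a minimal sketch. It is not the study’s evaluation code: it assumes a simplified, sentence-level, single-reference BLEU with uniform weights over 1- to 4-grams, whitespace tokenization, and no smoothing, whereas the tokenization, smoothing, and aggregation actually used are not stated here. The function names and the example sentences are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) occurring in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, candidate, max_n=4):
    """Simplified BLEU of `candidate` against one `reference` sentence.

    Assumptions of this sketch: whitespace tokenization, uniform n-gram
    weights, no smoothing (any zero n-gram precision yields a score of 0).
    """
    ref, cand = reference.split(), candidate.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Clipped matches: a candidate n-gram is credited at most as many
        # times as it appears in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matches == 0:
            return 0.0
        log_precisions.append(math.log(matches / total))
    # Brevity penalty punishes candidates shorter than the reference.
    if len(cand) > len(ref):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)


# Hypothetical example: A is the original English tweet, A' its back-translation.
original = "stay home and stay safe today"
round_trip = "stay home and stay safe now"
score = sentence_bleu(original, round_trip)
```

In this setup the original tweet plays the role of the reference and the back-translated tweet that of the candidate, so a score close to 1 means the round trip preserved most of the wording.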

### Tweets/weibo preprocessing

We preprocessed the tweet data by removing user information, interactions (e.g., retweets and likes), emojis, and emoticons; although emojis and emoticons express emotions well, we focused only on textual information. Next, we filtered noisy symbols and texts that cannot convey any meaningful semantic or lexical information and may even prevent the model from learning, such as user names with the mark “@”; the retweet symbol “RT”; hyperlinks; and some special symbols including line breaks, tabs, and redundant spaces. Unlike previous methods, which also removed hashtags from tweets, we retained these hashtags because they have meaningful semantics, such as in “Proud to be one of the few people who hasn’t texted their ex #COVID-19 #Quarantine #lockdown.” Moreover, we performed word tokenization, stemming, and tagging using the NLTK tool (https://www.nltk.org/) for English, Spanish, French, and Italian and Pyarabic for Arabic (https://github.com/linuxscout/pyarabic). For Weibo segmentation, we employed Jieba (https://github.com/fxsjy/jieba).
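The noise-filtering step can be sketched with a few regular expressions. This is an illustration under assumptions, not the pipeline used in the study: the patterns and the function name `clean_tweet` are hypothetical, emoji removal and the NLTK/Pyarabic/Jieba steps are left out, and hashtags are deliberately kept, as described above.

```python
import re

# Hypothetical patterns for the noise categories named in the text.
URL = re.compile(r"https?://\S+|www\.\S+")        # hyperlinks
RETWEET = re.compile(r"\bRT\b\s*(?:@\w+:?)?")      # "RT" marker, optionally with "@user:"
MENTION = re.compile(r"@\w+")                      # user names marked with "@"
WHITESPACE = re.compile(r"\s+")                    # line breaks, tabs, repeated spaces


def clean_tweet(text):
    """Strip links, retweet markers, and mentions; keep hashtags and words."""
    text = URL.sub(" ", text)
    text = RETWEET.sub(" ", text)
    text = MENTION.sub(" ", text)
    # Collapse whitespace last so gaps left by removed tokens are closed too.
    return WHITESPACE.sub(" ", text).strip()


raw = "RT @who: Stay home!\n#COVID-19 https://t.co/abc  #lockdown @friend"
cleaned = clean_tweet(raw)
```

Because `\w` matches Unicode word characters in Python 3, the same patterns would also apply to non-English user names; whether this is sufficient for every target language is not verified here.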

### Multilabel sentiment classifiers

For each language, we built a multilabel sentiment classifier based on deep neural network language models because of their success in diverse natural language processing tasks (Yang et al., 2019; Safaya et al., 2020). An integration framework called Simple Transformers (https://simpletransformers.ai/) helped us fine-tune these pretrained models and train a customized classifier. We used the pretrained models Bert (Devlin et al., 2018) for Spanish, French, and Italian; XLNet (Yang et al., 2019) for English; AraBert (Antoun et al., 2020) for Arabic; and ERNIE (Zhang et al., 2019) for Chinese and added a final fully connected layer with the sigmoid activation function. These pretrained models represent a tweet or a Weibo post as a 768-dimensional vector that is sent to the final fully connected layer to predict a probability between 0 and 1 for each label. We applied a threshold (e.g., 0.5) to determine whether a label should be assigned to the input tweet/Weibo post. Therefore, we were able to assign multiple labels to one tweet when they were all predicted with high probabilities. Before reporting the classification findings, we evaluated the classification models via 5-fold cross-validation. Table 1 presents the accuracy results for the different languages. With a validated performance (accuracy > 0.82), each tweet sentiment classifier was then trained on 10,000 labeled tweets, whereas the Chinese sentiment classifier was trained on 21,173 Weibo posts. We then used the trained models to predict the sentiments of millions of COVID-19 tweets (March 1–May 15, 2020) for our analysis. The implementation details of these classifiers can be found in Yang et al. (2020).
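The decision rule applied to the output layer can be illustrated in a few lines. The sketch below assumes that the final layer produces one raw score (logit) per label and that a label is assigned when its sigmoid probability is at least the threshold; whether the comparison used in the study was strict is not stated, and the label subset and logit values are made up for illustration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function mapping a logit to (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def assign_labels(logits, labels, threshold=0.5):
    """Return every label whose predicted probability reaches the threshold.

    Each label is scored independently, so a post can receive several
    labels, or none at all.
    """
    if len(logits) != len(labels):
        raise ValueError("one logit per label is required")
    return [label for label, z in zip(labels, logits) if sigmoid(z) >= threshold]


# Hypothetical logits for three of the sentiment labels.
labels = ["optimistic", "sad", "joking"]
predicted = assign_labels([2.0, -1.0, 0.3], labels)
```

Unlike a softmax output, the per-label sigmoid does not force the probabilities to sum to 1, which is what allows one tweet to be, for example, both optimistic and joking.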

## Results

### Rise and fall in the COVID-19-related conversations

We first examined people’s attention to COVID-19, which we quantified as the volume of conversations observed on social media such as Twitter and Sina Weibo. Figure 1 shows the volume of COVID-19 tweets (normalized to the maximum number, with volume statistics in the insert) from March 1 to May 15, 2020, in English, Spanish, Arabic, French, and Italian, as well as the normalized volume of COVID-19 Weibo posts in Chinese from January 20 to May 15, 2020. The figure also shows the volumes of English tweets by topic reflecting major citizen concerns during the pandemic, such as oil/stock price, economic stimulus, employment, herd immunity, medicine/vaccine, and working/studying from home.

Our results across languages show similar patterns of a rapid increase followed by a gradual decline in global citizen conversations on COVID-19; however, Chinese posts reached a peak on January 22, 2 months earlier than the peak in all other languages, which was attained between March 12 and 21 (Fig. 1A). This lag reflects the development of the epidemic, which was first detected in Wuhan and only later spread to pandemic status. While the rise and fall in the volume of COVID-19 posts were remarkably correlated for English, French, Italian, Arabic, and Spanish (Fig. 1B), the increase occurred earlier in Italian, as Italy was the first Western nation to suffer from the pandemic (Fig. 1A). Remarkably, the time series of COVID-19 conversations on Twitter revealed a clear weekly signal with reduced volumes on weekends and higher activity during weekdays (Fig. 1A), which is consistent with past Twitter activity reports (Golbeck et al., 2010). This strong weekly signal showed that differences in habits between weekends and weekdays, which also include reported variations in topics and sentiments (Golbeck et al., 2010), were maintained even though confinement (e.g., the lack of workplace transport) somewhat blurred these differences.

Two key factors drove the surge in tweet volume: government enforcement of confinement measures and the collapse of the stock market and the global economy. First, the effect of the government’s confinement rulings on the volume of social media messages was clearly reflected in the peak in Chinese posts (January 22, 2020) when the decision to lock Wuhan down was made (Figs. 1A and S1) and Italian tweets (March 12, 2020) when the lockdown was tightened in Italy, closing all commercial and retail businesses (Figs. 1A and S2). English, French, and Spanish posts peaked on March 13 (Figs. 1A and S3–S5), with the U.K.’s initial announcement of the consideration of relaxed social distancing to promote herd immunity, Spain’s imposition of a national lockdown and declaration of a state of alarm, and France’s closure of all museums and culture and leisure areas. From that point, Europe became the center of the pandemic. In North America, on that day, the Canadian prime minister’s wife tested positive, the U.S. declared a national state of emergency, and the U.S. president also tested positive. Meanwhile, the peak volume of Arabic tweets was reached a week later, with the closure of the two holy cities (Mecca and Medina) (March 21, 2020) (Figs. 1A and S6), the speech of Saudi Arabia’s king on April 19, and the introduction of a popular hashtag (#كلناـمسؤول، البقاء في المنزل سلاحنا الأقوى لمواجهة الفايروس, # We are all responsible, staying at home is our most powerful weapon against the virus).

Second, the sharp uptick in tweet volume across languages, except Chinese, started on March 9, 2020. This date preceded lockdown and other measures in Europe and coincided with the stock market collapse and the oil price crash (Fig. 1C, D), which foretold the true catastrophic impact of COVID-19 and which we identified as the main driver of the rise in social media conversations. The breakdown of the stock market was followed by increased discussions on unemployment and the need for economic stimulus packages (peaking on March 26, 2020) (Fig. 1C, D, and E). As the cost in human lives of pursuing herd immunity became clear, the early surge in herd immunity interest quickly dissipated and was replaced by a focus on vaccines, which remained high throughout the study period (Fig. 1C, F, and G). The promotion of confinement measures increased the volume of conversations regarding the challenge of working and studying from home, peaking on March 17, 2020 (Fig. 1C, H).

The decline in the volume of conversations started earlier in China, reaching lower levels between February 25 and March 25 when the infection was controlled and decreasing further after March 25 when China started reopening (Figs. 1A and S1). The decrease in tweet volume after the March 13 peak followed a remarkably similar slope across non-Chinese languages (Fig. 1A). The volume of English tweets was dominated by discussions on therapies and vaccines, followed by employment and working and studying from home, and then conversations on economic stimulus programs and economic impacts (Fig. 1C).

### Emotional imprint of the COVID-19 pandemic

Figure 2 presents the sentiment analysis results, which include variations in the sentiment distribution in six languages over time and significant sentiments, identified as sentiment percentage values above the 95% confidence interval, i.e., above the 0.95 quantile of a normal distribution fitted to the daily values ($$> \mathcal{N}^{-1}(0.95; \mu, \sigma)$$, where μ and σ are the mean and standard deviation of the daily values of one sentiment, respectively). Figure 3 statistically compares the nine sentiment categories (denial, annoyed, sad, anxious, pessimistic, joking, empathetic, thankful, and optimistic) across different languages with respect to the mean, standard deviation, and the increase/decrease slope in the daily sentiment sequences.
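The significance rule can be made concrete with the standard library. The sketch below is an illustration, not the study’s code: it assumes that μ and σ are computed over the whole daily series of one sentiment and that σ is the population standard deviation (the text does not say whether the sample or population form was used). The function name and the daily values are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def significant_days(daily_share, level=0.95):
    """Flag days on which a sentiment's share exceeds N^{-1}(level; mu, sigma).

    Returns the cutoff and the indices of the days above it.
    """
    mu = mean(daily_share)
    sigma = pstdev(daily_share)  # assumption: population standard deviation
    if sigma == 0:
        # A constant series has no day that stands out.
        return mu, []
    cutoff = NormalDist(mu, sigma).inv_cdf(level)
    flagged = [day for day, share in enumerate(daily_share) if share > cutoff]
    return cutoff, flagged


# Made-up daily shares of one sentiment: nine ordinary days and one spike.
shares = [0.10] * 9 + [0.30]
cutoff, days = significant_days(shares)
```

For a normal distribution the cutoff equals μ + 1.645σ, so in this made-up series only the last day is flagged.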

Consistent with their correlated trajectories in the conversation volume, English, Spanish, French, and Italian tweets expressed remarkably similar emotional states as the epidemic worsened to pandemic status, including feelings dominated by a mix of joking and anxious/pessimistic/annoyed as the conversation volume surged, the pandemic spread, and the stock market collapsed between March 1 and 15, 2020 (Fig. 2A–D). In contrast, anxious states, alongside denial and empathetic (praying), prevailed in Arabic during that period (Fig. 2E). The drop in tweet volume after March 15 was followed by a general trend toward an increase in positive states (optimistic, thankful, and empathetic), which was the strongest in Arabic tweets (Fig. 2E(b)) but also present in English tweets (Fig. 2A(b)) and was conveyed with a combination of annoyed, sad, and pessimistic in Spanish, French, and Italian tweets (Fig. 2B(b)–D(b)). When Europe began its reopening plan in May, anxious states emerged in Spanish, French, and Italian tweets (Fig. 2B(b)–D(b)).

According to a previous study (Yigitcanlar et al., 2020), social media data (e.g., tweets) can guide authorities’ interventions and decisions during a pandemic, and the effective use of government social media channels (e.g., Twitter) can facilitate the public’s compliance with measures and restrictions. Previous research (Chen et al., 2020) showed that stronger negative emotions lead to higher citizen engagement through government social media. The emotion analysis results of this study in different languages also supported these conclusions. Our fine-grained sentiment classification can help us understand people’s reactions and concerns, which would benefit public health institutes when designing interventions.

More than 50% of Weibo posts in Chinese showed the prevalence of a fearful state when human-to-human transmission was confirmed on January 20, 2020 (Figs. 2F and S12), continuing until January 22, when Wuhan was locked down. This aspect, along with the Chinese New Year celebration, led to a growth in positive sentiment (optimistic and thankful) on January 23–25. Sad states were particularly significant on February 4, when the total confirmed cases reached 20,000; on February 7, when Dr. Wenliang Li (the coronavirus whistleblower) died; and April 4, when China held a nationwide memorial for COVID-19 victims. Meanwhile, an increase in optimistic states was observed when Wuhan eased its restrictions on April 8. The ensuing sentiments expressed a mix of thanks to health-care workers and fear of the spread from a few new cases, with spikes of angry states on specific events, followed by a rise in optimistic and surprise states after the country reopened (Fig. 2F(b) and S12).

The overall prevalence of different emotional states was similar among languages (Fig. 3), with an important component of positive states, such as optimistic and joking (about 20% of tweets each, Fig. 3F, I), but also of negative ones, such as annoyed and anxious (Fig. 3B, D). The increases and declines of sentiments in Arabic and Chinese were more pronounced than those in other languages, driven by various cultural/religious events, holidays, and government announcements. Across all languages, the optimistic and sad states tended to increase over time, whereas joking tended to decrease (Fig. 3C, F, I). Remarkably, Arabic speakers revealed the highest empathetic sentiments (Fig. 3G), possibly because the pandemic overlapped with Ramadan, a period that particularly promotes empathy toward those in need. Another large portion of the sentiment is annoyed (constituting 20% in all languages) (Fig. 3B). Although it seems stable over time, it showed sudden spikes, as shown in Fig. 2A(b)–C(b), caused by individual COVID-19-related events such as deaths, economic and political effects, and others (with detailed annotation and discussion in Figs. S7–S9).

To quantitatively analyze the social patterns shared by the rise and fall of global sentiment, we measured the correlation coefficients of the average daily sentiment distribution among the five languages (English, Spanish, French, Italian, and Arabic), as their results were all based on tweets. These correlation coefficients (Fig. 4F) are presented in the same form as the volume correlation in Fig. 1B. We observed that the Arabic sentiment was less correlated with the others, while Spanish, French, and Italian sentiments were highly similar (with correlation coefficients of 0.97–0.99). English sentiment was closer to Spanish and Italian than to French sentiment. To further understand such a similarity, we applied principal component analysis and then t-distributed stochastic neighbor embedding (t-SNE) to visualize the daily sentiments of these different languages in a two-dimensional space, as shown in Fig. 4A–E. In each visualization, several samples were annotated with their corresponding dates. Several interesting findings emerged. First, Arabic sentiments (Fig. 4E) show two clusters, one mostly with date samples in March 2020 and the other with all other days. Notably, sentiments on March 21 were extremely close to those on May 15 rather than those on other days in March. March 21 is when the volume peaked, followed by a strong increase in positive states, hence the similarity with dates in May (Figs. 2E and S11). Second, English, Spanish, and French sentiments revealed a small cluster for days before the volume peak on March 12–13, which is consistent with our results in Fig. 2. The sentiments when the pandemic began were different from those after confinement measures were imposed on human mobility. Finally, the overall visualized sentiments in English, Spanish, French, and Arabic shared similar distributions, as quantitatively verified with correlation coefficients >0.89.
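The pairwise comparison of languages can be sketched as follows. The example assumes that each language is summarized by one vector of average daily shares over the nine sentiment categories and that the Pearson coefficient is the correlation used; the text does not name the coefficient. The vectors are invented for illustration and are not results of the study, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sum(a * b for a, b in zip(dx, dy)) / denominator


def correlation_matrix(profiles):
    """Map every ordered pair of languages to the correlation of their profiles."""
    languages = list(profiles)
    return {(a, b): pearson(profiles[a], profiles[b])
            for a in languages for b in languages}


# Invented average shares for the nine sentiment categories (illustration only).
profiles = {
    "English": [0.05, 0.20, 0.08, 0.12, 0.07, 0.20, 0.06, 0.07, 0.15],
    "Spanish": [0.06, 0.21, 0.09, 0.11, 0.08, 0.18, 0.05, 0.07, 0.15],
    "Arabic":  [0.10, 0.12, 0.06, 0.14, 0.05, 0.08, 0.20, 0.10, 0.15],
}
matrix = correlation_matrix(profiles)
```

The resulting dictionary is symmetric with ones on the diagonal, which corresponds to the square layout of a correlation plot.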

We also analyzed the sentiments of English tweets by topics reflecting major citizen concerns during COVID-19, such as oil/stock price, economic stimulus, employment, herd immunity, medicine/vaccine, and working/studying from home. Figure 5 summarizes the results, and Figs. S13–S19 in the supplementary information present details for each topic. The results showed that in relation to COVID-19, working/studying from home led with expressions of optimism and thankfulness in contrast to herd immunity, which elicited negative states of denial and anxiety (Figs. 5 and S13–S14). Conversations around economic collapse led to the strongest annoyed (stock market collapse), pessimistic (collapse of oil prices), and sad (high unemployment rate) states (Figs. 5 and S15–S17). The sentiment in conversations about economic stimulus packages showed stronger joking and annoyed states than other emotions (Figs. 5 and S18). Discussions on drugs/vaccines showed the second strongest expressions of optimism, with significant denial and annoyed sentiments on events such as the poaching of vaccines from other countries and the hyped usage of unproven drugs (Figs. 5 and S19).

## Discussion

Our analysis, based on over 105 million social media posts in six languages spoken by 2.4 billion citizens worldwide, revealed that the global conversation on COVID-19 followed the spread of the pandemic, with the rise of the volume in China preceding those in Western and Arabic languages by 2 months. These conversations, regardless of language, showed a remarkably similar pattern of a rapid rise and then a slow decline over time across all nations, with the surge being affected by economic collapse and confinement decisions in Western nations. The global conversation had strongly negative reactions to specific topics, such as herd immunity strategies, which involve the acceptance of massive losses of lives among the elderly, and fake news designed to scare the public or spread conspiracy theories. Optimistic and positive sentiments increased over time and in parallel to a decrease in the volume of conversations, as policies were implemented to release the population from confinement and reactivate the economy as a result of lower infections and deaths.

Our investigation provided an unprecedented application of sentiment analysis to multiple languages, including English, Spanish, French, Italian, Arabic, and Chinese, as opposed to most studies that conducted monolingual research (Alamoodi et al., 2020), as they addressed sentiments related to events in a particular region and cultural setting. However, the COVID-19 pandemic has affected the entire world nearly synchronously except for the earlier effects in China, therefore providing an unprecedented opportunity to examine and compare sentiments across languages and over time.

Social psychology has shown that negative emotions such as anger, sadness, and fear often arise during stressful events, including those involving health risks (Ferrer et al., 2017). Hence, the pandemic, which caused stress in both health and the economy, was poised to generate an increase in overwhelmingly negative emotions. In contrast, positive emotions such as empathy and optimism can help overcome negative emotions and cope with stressful situations (Folkman, 2018). Indeed, positive emotions provide defense mechanisms against stressful events and help manage negative emotions, producing resilient individuals (Armstrong et al., 2011; Cohn et al., 2009; Tugade and Fredrickson, 2004). Our findings showed a rapid rise in negative emotions, signaling growing stress, followed by a rebalancing and, at times, a rise in positive emotions, often associated with specific encouraging events. Additionally, the extent of positive responses was greater in Arabic Twitter conversations about COVID-19, which may reflect a cultural difference in coping with stress and crises and/or a more resilient society resulting from a more recurrent exposure to adverse situations. For instance, the Middle East, particularly the Arabian Peninsula, experienced a recent SARS epidemic (the Middle East respiratory syndrome coronavirus, MERS-CoV), which it successfully overcame and which may have helped build greater resilience in the society when facing new pandemics (Algaissi et al., 2020).

Sentiment analysis provides a previously unconsidered tool for policymakers to track societal states that can guide interventions on health and other types of issues, as well as emergency responses. Sentiment analysis applied to public health crises supports action in monitoring, discovery, news sharing, and policy formulation and evaluation (Alamoodi et al., 2020; Jang et al., 2021). Large-scale analysis of emotions expressed via social media provides a tool for business and policy decisions and responses to certain events, which is beyond the scope of this paper. However, we can assume that the sentiments expressed on social media, well above the “background” noise, somehow reached policymakers, whether formally or informally, and stimulated a discussion of where and how the pandemic may have occurred.

COVID-19 is the first pandemic in a globalized, online world, which both relays information to citizens and allows for the tracking of the volume of sentiments expressed through social media. The three-pronged impact on health, movement, and the economy affected citizens across languages. We found remarkable similarities in the global COVID-19 conversation in terms of volume, sentiments, and their triggers despite broad social and cultural differences inherent to the different languages used by the citizens monitored in this study. Global media provides a vehicle for the spread of fake news, but our analysis showed users’ ability to identify these and respond with strong negative sentiments. Our results provided evidence of citizens’ globalized reactions to the constraints and risks imposed by the first pandemic in the internet era, with the growth in optimism over time foretelling a desire to seek together a reset for an improved COVID-19 world.