The impact of social ties and SARS memory on the public awareness of 2019 novel coronavirus (SARS-CoV-2) outbreak

This study examines publicly available online search data in China to investigate the spread of public awareness of the 2019 novel coronavirus (SARS-CoV-2) outbreak. We found that cities that had previously suffered from SARS (in 2003–04) and have greater migration ties to Wuhan had earlier, stronger and more durable public awareness of the outbreak. Our data indicate that 48 such cities developed awareness up to 19 days earlier than 255 comparable cities, giving them an opportunity to better prepare. This study suggests that it is important to consider memory of prior catastrophic events as they will influence the public response to emerging threats.

Public awareness is important in managing the spread of infectious diseases. Here, public awareness is defined as knowledge and understanding among the population about the risk of infectious diseases. Individual actions, such as increased attention to hygiene and avoiding crowds, can reduce disease spread. Awareness also supports rapid identification and treatment of new cases and facilitates collective responses, such as closures of schools or transit systems 1 . In the modern world, diseases can move faster than ever due to the growing movement of people between cities, regions and countries. However, digital technology means information can move even faster, providing an opportunity for individuals and communities to protect themselves ahead of the disease itself arriving 2 . This study considers the spread, and persistence, of public awareness of the novel coronavirus which emerged in late 2019. During the first few weeks of this outbreak, there was little coverage from mainstream media outlets, providing an unusual opportunity to study the spread of awareness of an emerging disease via other channels.
Previous studies show that the spread of awareness is strongly related to the physical locations of individuals in a social network in relation to the unfolding events [2][3][4], termed the social ties effect. In online social networks, people with more connections tend to receive earlier warnings of catastrophic events. For example, during Hurricane Sandy in the USA, Twitter users with more followers had an awareness lead-time of up to 26 h over less connected users 4. Moreover, the magnitude of awareness increases as the distance to the epidemic center decreases. For example, public awareness on Weibo, a Chinese social media platform, was two orders of magnitude stronger for the H7N9 influenza outbreak that occurred in China than for the Middle East Respiratory Syndrome Coronavirus (MERS-CoV) outbreak that occurred elsewhere 5.
Experience of similar events, such as outbreaks of H5N1 influenza in 2001, SARS (Severe Acute Respiratory Syndrome) in 2003, H1N1 influenza in 2009 and Ebola in 2014, is also likely to influence awareness. In China, the outbreak of SARS between 2003 and 2004 caused a total of 7,429 reported cases and 685 deaths 6, and had a lasting traumatic impact on survivors and communities 7,8. In this work, we set out to test whether public awareness of the new disease outbreak is related to social ties distance from the place impacted by the epidemic and to past experience of the SARS epidemic in 2003. The SARS outbreak was 17 years ago, but its horror might still condition public awareness of lethal infectious diseases. To the best of our knowledge, few studies have been carried out to understand how past severe outbreaks affect public awareness when a new outbreak occurs. This study estimates the post-SARS effect, called the SARS memory effect, on the current outbreak.
We use the continuing coronavirus outbreak as our case study to estimate the effects of social ties and SARS memory on the spread of public awareness. Chinese authorities officially announced human-to-human transmission of the 2019 novel coronavirus (SARS-CoV-2) on January 20th, 2020 9 . The outbreak originated in Wuhan City, a major transportation hub in central China long known as the "Nine Provinces" thoroughfare transiting more than 120 million passengers every year 10 . The massive numbers of transits provided a perfect opportunity for the virus to spread. Another feature is the timing of the outbreak, close to the Spring Festival travel season,

Data and methods
Public awareness measurement. Seeking epidemic-related information online can provide an indicator of public awareness of this new disease. In this study, we use the Baidu Search Index (BSI), available publicly at https://index.baidu.com, to measure public awareness over time and across locations (e.g., cities). Currently, Baidu releases online search information across 364 Chinese cities, covering all four direct-administered municipalities, two special administrative regions (Macau and Hong Kong), 293 prefectural-level cities, Taiwan (released as a city in Baidu) and an additional 64 county-level cities. The total number of internet users using the Baidu search engine reached 649 million in 2014, accounting for 47.9% of China's population 11. BSI has been used to predict epidemic outbreaks 12, HIV/AIDS incidence 13 and tourism flows 14, suggesting BSI can provide a representative proxy for public awareness. BSI provides a weighted index for each search term. In this study, we used a search term in Chinese that combines "Wuhan" and "pneumonia" (called "pneumonia" hereafter), as Chinese internet users widely used it during the time period of this study. We also tried another search term in Chinese, "novel coronavirus", but it did not exhibit a search surge on Baidu. Due to privacy concerns, Baidu masks daily readings that are below 57 as zero. Therefore, we used the maximum BSI value of the search term "common cold" (in Chinese) between Dec 10th and 31st, 2019, to control the size effect for each city. We use the Ljung-Box test 15 to test whether or not the daily readings of "common cold" are stationary. The daily readings of 18 out of 364 cities were found to be non-stationary, meaning that the "common cold" readings in those 18 cities exhibit seasonality or a trend, so they were excluded from this study.
The magnitude of public awareness of the Wuhan outbreak at time $t$ for city $i \in \{1, \ldots, 346\}$ is represented as $O^{COVID-19}_{t,i}$, defined as the ratio of $S^{COVID-19}_{t,i}$ to the maximum value of $S^{cold}_{t,i}$ for city $i$ between Dec 10th and 31st, 2019. $S^{COVID-19}_{t,i}$ and $S^{cold}_{t,i}$ represent the BSI values of the search terms "pneumonia" and "common cold" respectively. The BSI raw data is provided in S1 of the supplementary materials (SM).
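The normalisation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name, the dictionary layout and the toy readings are hypothetical, and the baseline window is taken to be inclusive at both ends.

```python
from datetime import date


def awareness_magnitude(pneumonia_bsi, cold_bsi, base_start, base_end):
    """Return {day: "pneumonia" BSI / max "common cold" BSI in the baseline window}.

    pneumonia_bsi, cold_bsi: dicts mapping datetime.date -> daily BSI reading
    for a single city. The baseline window is inclusive at both ends (assumption).
    """
    baseline = [v for d, v in cold_bsi.items() if base_start <= d <= base_end]
    cold_max = max(baseline, default=0)
    if cold_max <= 0:
        # Masked (zero) readings give no usable denominator for this city.
        raise ValueError("no usable 'common cold' readings in the baseline window")
    return {d: v / cold_max for d, v in pneumonia_bsi.items()}


# Toy readings for one hypothetical city (not real BSI data).
cold = {date(2019, 12, 10): 120, date(2019, 12, 20): 150, date(2019, 12, 31): 140}
pneumonia = {date(2019, 12, 30): 0, date(2019, 12, 31): 450, date(2020, 1, 1): 300}

magnitude = awareness_magnitude(pneumonia, cold, date(2019, 12, 10), date(2019, 12, 31))
```

Dividing by the city's own "common cold" maximum makes magnitudes comparable between large and small cities, which is the size effect the text refers to.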
The earliest day on which the magnitude of public awareness exceeds a threshold $C \in \{1.5, 2, 3, 4\}$ is defined as the earliest warning day, $t_{warning(i)}$, for city $i$; the threshold values are arbitrary. We also define the starting day of Chunyun (the Spring Festival travel season) as $t_{chunyun}$, the day from which the virus is likely to reach all cities. As Chunyun carried approximately 3 billion passengers in 40 days in 2019 16, its crowded transport hubs create perfect opportunities for the virus to spread. Therefore, the longer the lead-time of awareness, the better for infection control. The lead-time of awareness for city $i$ is thus defined as the number of days by which $t_{warning(i)}$ precedes $t_{chunyun}$.
Awareness typically follows a cyclical process, called the unaware-aware-unaware (UAU) process, as time passes. Keeping the public at a high level of awareness could help mitigate the virus transmission process. Therefore, we also measure the awareness retention rate as the average magnitude from the day after $t_{warning(i)}$ to the day of Chunyun, divided by the magnitude at $t_{warning(i)}$.
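The three measures described here (earliest warning day, lead-time and retention rate) can be sketched as below. This is an illustration rather than the authors' implementation: the function names and toy magnitudes are hypothetical, and the retention window is assumed to include the Chunyun start day itself.

```python
from datetime import date


def earliest_warning_day(magnitude, threshold):
    """First day on which the awareness magnitude exceeds the threshold C, or None."""
    days = [d for d in sorted(magnitude) if magnitude[d] > threshold]
    return days[0] if days else None


def lead_time_days(t_warning, t_chunyun):
    """Days by which the warning day precedes the start of Chunyun (negative if later)."""
    return (t_chunyun - t_warning).days


def retention_rate(magnitude, t_warning, t_chunyun):
    """Mean magnitude from the day after t_warning up to t_chunyun, over the magnitude at t_warning."""
    later = [v for d, v in magnitude.items() if t_warning < d <= t_chunyun]
    if not later:
        return None  # no observations between the warning day and Chunyun
    return (sum(later) / len(later)) / magnitude[t_warning]


# Toy magnitudes for one hypothetical city (not real data).
t_chunyun = date(2020, 1, 10)
magnitude = {
    date(2019, 12, 30): 0.0,
    date(2019, 12, 31): 4.0,
    date(2020, 1, 1): 3.0,
    date(2020, 1, 2): 2.0,
    date(2020, 1, 10): 1.0,
}

t_warning = earliest_warning_day(magnitude, threshold=3)
lead = lead_time_days(t_warning, t_chunyun)
retention = retention_rate(magnitude, t_warning, t_chunyun)
```

In this toy case the city crosses C = 3 on Dec 31st, 2019, which gives a lead-time of 10 days and a retention rate of 50%.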

Measuring social ties.
Microblogs (e.g., Weibo) and private social media (e.g., WeChat) are the primary communication tools used by most Chinese people 11. While information flows cannot be observed directly, empirical studies show that social networks are influenced by long-distance travel 17. We therefore use migration flows as a proxy for long-distance information flows. To be more specific, if workers born and raised in city A now work in city B, they are likely to relay information about an epidemic in city B back to friends and family in city A. This is particularly relevant in the Chinese context, where migrant workers account for more than one-third of the working population 18.
We use the migration flows extracted from the Baidu Migration Matrix (BMM) to build a migration network (Fig. 1). We then compute the number of steps on the shortest path from each city to Wuhan, deriving the social ties variable for city $i$ as $D_i$, with values between 1 and 8. Wuhan and the cities located in Hubei province have $D_i = 1$, while cities located far away from Wuhan tend to have larger values, e.g., $D_{Lhasa} = 6$.
Moreover, we set $D_{HongKong} = 1$, based on the rationale that Hong Kong has more airline traffic flows to Wuhan than Shanghai 19, which has $D_{Shanghai} = 1$. Macau and Taiwan both have frequent traffic flows to Hong Kong 19, so we set $D_{Macau} = 2$ and $D_{Taiwan} = 2$.
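Counting shortest-path steps in a network is commonly done with a breadth-first search, sketched below. The edge list and the single-letter city names are placeholders, and how links are derived from the BMM is not specified here, so this is only an assumption-laden illustration. Note also that the sketch returns raw hop counts with the source at 0, whereas the coding in the text assigns 1 to Wuhan and to cities in Hubei province.

```python
from collections import deque


def steps_to_source(edges, source):
    """Breadth-first search: hop count of the shortest path from `source` to each reachable city.

    edges: iterable of (city_a, city_b) pairs, treated as undirected links (assumption).
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    steps = {source: 0}
    queue = deque([source])
    while queue:
        city = queue.popleft()
        for nxt in neighbours.get(city, ()):
            if nxt not in steps:  # first visit is the shortest route in an unweighted graph
                steps[nxt] = steps[city] + 1
                queue.append(nxt)
    return steps


# Hypothetical migration links; "A" to "D" are placeholder cities.
edges = [("Wuhan", "A"), ("A", "B"), ("Wuhan", "C"), ("C", "B"), ("B", "D")]
steps = steps_to_source(edges, "Wuhan")
```

Cities that cannot be reached from the source are simply absent from the returned dictionary.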
Measuring SARS memory. We collected all reported SARS cases in mainland China, Hong Kong, Taiwan and Macau, and assigned the numbers of cases to each city as $SARS_i$, $i \in \{1, \ldots, 346\}$. The values range between zero (no reported cases) and 2,521 ($i$ = Beijing), with an average of 70.8 and a median of 4 cases in cities with at least one case reported. Given the long-tailed distribution of reported SARS cases, we use the logarithm. As most cities reported zero cases of SARS, we use the equation below to transform the variable $SARS_i$. The log-transformation alters the distribution of the variable, but it is still appropriate for testing the hypothesis that cities with more reported SARS cases would exhibit stronger, earlier and more durable public awareness.
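One common way to log-transform a count variable that is zero for most observations is log(1 + x), which maps zero cases to zero. The sketch below uses that form purely as an assumption; the transformation and the base of the logarithm actually used in the study may differ, and the function name is hypothetical.

```python
import math


def log_sars(cases):
    """log(1 + cases): an assumed transformation that keeps zero-case cities at 0."""
    if cases < 0:
        raise ValueError("case counts cannot be negative")
    return math.log1p(cases)


# Case counts quoted in the text: no cases, the median among affected cities, and Beijing.
transformed = {n: log_sars(n) for n in (0, 4, 2521)}
```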
Estimating social ties and SARS memory effects. We build three groups of regression models to estimate the effects of social ties and SARS memory on the three public awareness measures (lead-time, magnitude and retention rate) respectively. $GDP\_per\_capita_i$ represents the gross domestic product (GDP) per capita for city $i$. $SubProvincial_i$ indicates whether or not city $i$ has sub-provincial or greater administrative power. Sub-provincial cities are mostly capitals of the provinces in which they are located, or important cities designated by the central government. Four cities (Beijing, Shanghai, Tianjin and Chongqing), which are under direct control of the central government, are also labelled as $SubProvincial_i = 1$. Sub-provincial and higher-level cities have much better facilities and expertise for infection control than other cities 20, so we assume their residents could be more alert. $SubProvincial_i$ is used to control for the effects of administrative level. We also introduce Euclidean distance as a control variable, denoted as $d_i$, which represents the straight-line distance between Wuhan and each city. The introduction of $d_i$ controls for the fact that cities physically closer to Wuhan are likely to get more news from there regardless of the strength of social ties.
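As a sketch of the general form such a model could take, the fullest specification would combine the variables named above with the interaction between social ties and SARS memory. The response symbol $Y_i$ (standing for any one of the three awareness measures), the coefficients $\beta_k$ and the error term $\varepsilon_i$ are illustrative notation rather than the paper's own, and the individual models in Tables 1 to 3 include only subsets of these terms.

```latex
Y_i = \beta_0 + \beta_1 D_i + \beta_2\, SARS_i + \beta_3\, (D_i \times SARS_i)
      + \beta_4\, d_i + \beta_5\, GDP\_per\_capita_i + \beta_6\, SubProvincial_i + \varepsilon_i
```

Here $SARS_i$ denotes the log-transformed case count.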
All data necessary to replicate the analysis is attached as S2 of the SM.

Results
The early warnings of the outbreak. As early as Dec 31st, 2019, when the Wuhan Municipal Health Commission first informed the public about the emerging pneumonia cases 21, most of the cities (326 out of 346) exhibited at least some awareness of the emerging SARS-CoV-2 outbreak (Fig. 2b). However, awareness then declined and remained low as the epidemic spread, falling close to its lowest point on the starting day of Chunyun (Jan 10th, 2020). It stayed low until Jan 19th, 2020, one day before the Chinese Centre for Disease Control and Prevention confirmed human-to-human transmission of the novel coronavirus 9. From Jan 20th, 2020, overall awareness increased by a magnitude of at least five, demonstrating significant awareness across all cities (Fig. 2b).
Considering cities that showed initial novel coronavirus awareness levels at least 1.5 times that of the search term "common cold", we found a total of 166 alert cities as early as Dec 31st, 2019 (48 cities at a tighter threshold of C = 3.0 times, illustrated in Fig. 2a). However, awareness decreased significantly during Chunyun. The evolution of public awareness over time followed an unusual pattern. In a typical UAU process, people are unaware of emerging catastrophic events until they are told by their social contacts. They remain aware during the event, and awareness then fades 2,22. However, during the Wuhan outbreak, the public experienced an aware-unaware-aware process, with public awareness declining during the early phase of the outbreak.
Dividing cities into two groups according to whether or not they had reported SARS cases in 2003-04, we found the cities that had been struck by SARS to be more alert during onset (Fig. 2b). Therefore, we believe the SARS memory still conditions public awareness. We provide evidence of its effects at the end of this section.

Figure 2. (a) The frequency distribution of cities by the day on which they exhibit the first significant signal of awareness. The number of cities for which searches for the combined term "Wuhan" and "pneumonia" exceed C = 3 times the search term "common cold" is reported every day. (b) Public awareness on the topic of "pneumonia" over time. All 346 cities exhibit at least some searches of the term "pneumonia" during the initial outbreak period. Of these, 326 cities recorded searches about it as early as Dec 31st, 2019. Cities are divided into two groups according to whether or not they had reported SARS cases in 2003-04. The mean values of awareness magnitude were computed on a daily basis for the two groups of cities respectively. A paired t-test was then performed on those two time series, and we found the cities that had reported SARS cases had greater awareness (t-statistic: 3.56; degrees of freedom: 23; p < 0.005).
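The paired t-test on the two daily mean series can be sketched with the standard library alone. The function name and the toy series are hypothetical; the reported 23 degrees of freedom correspond to 24 paired daily values, since a paired test has n - 1 degrees of freedom.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Paired t-test statistic and degrees of freedom for two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with at least two points")
    diffs = [a - b for a, b in zip(x, y)]
    sd = stdev(diffs)  # sample standard deviation of the daily differences
    if sd == 0:
        raise ValueError("differences are constant; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Toy daily means for two hypothetical groups of cities (not real data).
sars_cities = [2.0, 3.0, 5.0, 4.0]
other_cities = [1.0, 1.0, 2.0, 3.0]
t_stat, dof = paired_t(sars_cities, other_cities)
```

Pairing by day removes the shared day-to-day swings in search volume, so the test concerns only the gap between the two groups.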

Retention of awareness.
Even though most of the cities exhibited at least some awareness as early as Dec 31st, 2019, only a few retained it over the following weeks as the virus began to spread. The retention rates, $O_i$, range between zero and 137%, with an average of 54% and a median of 55%. Eight cities lost awareness before Chunyun, while four cities developed greater awareness. Xilingol League in Inner Mongolia ranked 4th, with a retention rate of 103%. Xilingol is far away from Wuhan in terms of social ties distance, but it was struck by SARS. It is worth noting that a confirmed case of plague was reported in Xilingol on Nov 16th, 2019, only 45 days before the Wuhan authority confirmed the emerging pneumonia cases 21.

Estimation of the social ties and SARS memory effects. The effects of social ties and SARS memory on the lead-time advantage are estimated according to Eq. 4, controlling for Euclidean distances, GDP per capita and the city's administrative level (Table 1). In model (3) in Table 1, $SARS_i$ exhibits positive effects, while $D_i$ shows a negative association with awareness. That means cities with strong SARS memory that are closer to Wuhan in terms of social ties develop earlier awareness. Moreover, the interaction term $D_i \times SARS_i$ exhibits negative effects, indicating that the SARS memory effect becomes stronger where cities are closer to Wuhan in terms of social ties distance. When controlling the model for Euclidean distances (model (5) in Table 1), we found that the SARS memory effect becomes non-significant, but social ties and its interaction with SARS memory hold. Meanwhile, Euclidean distance is non-significant, even though it exhibits a negative effect on its own in model (4) in Table 1.
We further control the model for GDP per capita and administrative level (models (6) & (7) in Table 1). Using the Akaike information criterion (AIC) to select the best model 23, we found the performance of models (6) and (7) to be very similar. However, because model (6) achieves a slightly lower AIC score 24 of 1893.62 with fewer degrees of freedom (df = 8) than model (7) (AIC = 1894.77, df = 9), model (6) is very slightly preferred. For more information about the model selection for Eq. 4, 5 and 6, see File S3 in the SM. In model (6) in Table 1, we found that both social ties and Euclidean distances exhibit negative effects, but the social ties effects decrease by almost half compared to model (5) in Table 1. The SARS memory effects hold. Also, the interaction term $D_i \times SARS_i$ is still significant, which means cities with stronger SARS memory will develop more lead-time advantage, particularly when they are closer to Wuhan. For example, Changchun in Jilin province, with 34 SARS cases and far away from Wuhan, still achieved a ten-day lead-time advantage. The cities that did not exhibit awareness, such as Qaramay and Heihe, are mainly located far away from Wuhan and did not suffer from the SARS outbreak. GDP per capita and the binary variable $SubProvincial_i$ exhibit significant positive effects on the lead-time advantage.
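To make "very slightly preferred" concrete, the AIC gap between the two models can be converted into a relative likelihood, exp(-ΔAIC / 2), a standard way to read AIC differences. The short calculation below uses the two AIC scores quoted above; the function and variable names are illustrative, and the calculation is an addition for illustration rather than part of the reported analysis.

```python
import math


def relative_likelihood(aic_candidate, aic_best):
    """exp(-(AIC_candidate - AIC_best) / 2): plausibility of the candidate relative to the best model."""
    return math.exp(-(aic_candidate - aic_best) / 2.0)


aic_model6 = 1893.62  # model (6), df = 8
aic_model7 = 1894.77  # model (7), df = 9

delta = aic_model7 - aic_model6                    # about 1.15
rel = relative_likelihood(aic_model7, aic_model6)  # about 0.56
```

A difference of about 1.15 AIC units means model (7) is roughly 0.56 times as plausible as model (6) under this measure, which fits the description of the two models as very similar.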
The effects of social ties and SARS memory on the magnitude of awareness are estimated according to Eq. 5 (Table 2). Similar to the findings in Table 1, SARS memory ($SARS_i$) positively affects public awareness in all models. Social ties $D_i$ show a significant negative effect only in the models without control variables (models (2) & (3) in Table 2). However, the interaction term between social ties and SARS memory shows a significant negative effect. Using the AIC-based model selection method, we found that model (6), which controls for Euclidean distances, GDP per capita and the administrative level, is the best model. Administrative level and development level both exhibit positive effects on the magnitude of awareness. We hypothesize that residents with better education (proxied by GDP per capita) better understand the danger of deadly infectious diseases and, accordingly, tend to seek up-to-date information online.
The effects of social ties and SARS memory on retention of awareness are estimated according to Eq. 6. Model (6) in Table 3 is the best model based on the model selection using AIC. Unlike the results in Tables 1 and 2, we observe no effects from SARS memory (model (6) in Table 3). Even when we control for Euclidean distances, development level and administrative level, the explanatory power of the model is still relatively weak (Adj. $R^2 = 0.104$). It seems the decreasing awareness is a collective behavior that occurred simultaneously. Interestingly, social ties have a significant effect while the Euclidean distances do not. Development level exhibits positive effects, which suggests residents of better educated cities could be more alert during the epidemic onset. However, administrative level shows a negative effect. It seems residents living in important cities (in terms of administrative power) lost interest in the disease before Chunyun.

Discussion
The novel coronavirus outbreak is still spreading, with a growing death toll around the world. From this study we found that the spread of public awareness varied markedly across Chinese cities. Controlling for development, administrative levels and Euclidean distances, we observe that cities that were struck by SARS and have more migration to Wuhan showed earlier, stronger and more durable public awareness of the outbreak. These cities will have been better prepared to respond to the virus if and when it arrived. Specifically, 48 cities had developed public awareness as early as Dec 31st, 2019, with up to 19 days of lead-time advantage compared to some 255 other cities. The study suggests that memory of previous events, as well as social links to an emerging threat, may influence public behaviour. Greater awareness could help slow the spread of a disease, for example through increased attention to hygiene, mask-wearing and reduced interpersonal contact. It might also facilitate collective responses such as enforced quarantine measures. However, in some circumstances enhanced awareness could have negative impacts, such as unnecessary panic or ostracism of groups perceived as being at greater risk of infection.
Due to the lack of infection statistics, we cannot yet statistically estimate the effect of public awareness on the subsequent seriousness of the outbreak. We note that Xilingol League in Inner Mongolia, which had relatively stronger and more durable public awareness, had fewer cases (two cases as of Feb 7th, 2020 25) than other cities in the same province (50 cases in total, with an average of 4.55 cases per city 25).
To the best of our knowledge, this study is the first to investigate how memory of previous catastrophic events, e.g., SARS, and social ties could affect the spread of public awareness. Further studies will be needed to understand whether this holds in other contexts, beyond the unusual circumstances of the novel coronavirus in Wuhan. As much of the world has subsequently suffered from this virus, it at least gives cause to hope that we will be more aware, and so respond more rapidly to, a future pandemic.

Data availability
All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials.
Received: 15 March 2020; Accepted: 28 September 2020
www.nature.com/scientificreports/

Table 3. Estimate social ties and SARS memory effects on the awareness retention rate. *p **p ***p < 0.01.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.