Effects of social distancing on the spreading of COVID-19 inferred from mobile phone data

A better understanding of how the COVID-19 pandemic responds to social distancing efforts is required for the control of future outbreaks and to calibrate partial lock-downs. We present quantitative relationships between key parameters characterizing the COVID-19 epidemiology and social distancing efforts of nine selected European countries. Epidemiological parameters were extracted from the number of daily deaths data, while mitigation efforts are estimated from mobile phone tracking data. The decrease of the basic reproductive number (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$R_0$$\end{document}R0) as well as the duration of the initial exponential expansion phase of the epidemic strongly correlates with the magnitude of mobility reduction. Utilizing these relationships we decipher the relative impact of the timing and the extent of social distancing on the total death burden of the pandemic.

Effects of social distancing on the spreading of COVID-19 inferred from mobile phone data Hamid Khataee 1 , Istvan Scheuring 2,3 , Andras Czirok 4,5 & Zoltan Neufeld 1* A better understanding of how the COVID-19 pandemic responds to social distancing efforts is required for the control of future outbreaks and to calibrate partial lock-downs. We present quantitative relationships between key parameters characterizing the COVID-19 epidemiology and social distancing efforts of nine selected European countries. Epidemiological parameters were extracted from the number of daily deaths data, while mitigation efforts are estimated from mobile phone tracking data. The decrease of the basic reproductive number ( R 0 ) as well as the duration of the initial exponential expansion phase of the epidemic strongly correlates with the magnitude of mobility reduction. Utilizing these relationships we decipher the relative impact of the timing and the extent of social distancing on the total death burden of the pandemic.
The COVID-19 pandemic started in late 2019 and within a few months it spread around the World infecting 9 million people, out of which half a million succumbed to the disease. As of June 2020, the transmission of the disease is still progressing in many countries, especially in the American continent. While there have been big regional differences in the extent of the pandemic, in most countries of Europe and Asia the initial exponential growth has gradually transitioned into a decaying phase 8 . An epidemic outbreak can recede either due to reduction of the transmission probability across contacts, or due to a gradual build up of immunity within the population. According to currently available immunological data 1,2 at most locations only a relatively small fraction of the population was infected, typically well below 10%, thus the receding disease mostly reflects changes in social behavior and the associated reduction in disease transmission. Changes in social behavior can include state-mandated control measures as well as voluntary reduction of social interactions. However, especially with the view of potential future outbreaks, it is important to better understand how the timing and extent of social distancing impacted the dynamics of the COVID-19 pandemic.
Previous studies estimated the effect of social distancing on the dynamics of COVID-19 pandemic either by direct data analysis or by modeling methods 7,[15][16][17] . A statistical analysis of the number of diagnosed cases, deaths and patients in intensive care units (ICU) in Italy and Spain have indicated that the epidemic started to decrease only after the introduction of strict lock-down action 3 . This was especially visible in Italy, where the final strict social distancing has been reached in a number of consecutive steps. A statistically more comprehensive analysis of hospitalized and ICU patients in France identified a 77% decrease in the growth rate of these numbers after the introduction of the lock-down 4 . The comparison of social distancing efforts in China, South Korea, Italy, France, Iran and USA 5 revealed that the initial doubling time of identified cases was about 2 days which was prolonged substantially by the various restrictions introduced in these countries. Epidemic models are widely used to estimate how various intervention strategies affect the transmission rate, however, the quantitative relationship between social interventions and epidemic parameters are hardly known 6,7 .
In this paper, we quantitatively characterise the time course of the COVID-19 pandemic using daily death data from nine selected European countries 8 . Statistical data on COVID-19-related deaths are considered to be more robust than that of daily cases of new infections. The latter is affected by the number of tests performed as well as by the testing strategy -e.g. its restriction to symptomatic patients -which may be highly variable across countries and often changes during the course of the epidemic. The time course of daily deaths can be considered www.nature.com/scientificreports/ as a more reliable indirect delayed indicator of daily infections. We thus do not address apparent differences in the case fatality ratio, and restrict our focus to the recorded COVID-19-associated death toll. Our choice of countries was motivated by the requirements that each of these countries (i) had a relatively large disease-associated death toll (i.e. typically above 10/day and more than 2000 overall) so we can assume that the deterministic component of the epidemic dynamics dominates over random fluctuations. (ii) The selected countries spent a suitably long time in the decaying phase of the epidemic, thus allowing its precise characterisation. Based on these two criteria, we analysed data from the following, socio-economically similar countries, each implementing a somewhat distinct social distancing response: Italy, Spain, France, UK, Germany, Switzerland, Netherlands, Belgium and Sweden.
To characterise social distancing responses we used mobile phone mobility trend data from Apple Inc. 9 . Our aim is to quantitatively determine the characteristic features of the progression of the epidemic, such as initial growth rate, timing of the peak, and final decay rate and investigate how these parameters are determined by the timing and strictness of social distancing measures. We demonstrate that the overall death burden of the epidemic can be well explained by both the timing and extent of social distancing, often organized voluntarily well in advance of the state-mandated lock-down, and present quantitative relationships between changes in epidemiological parameters and mobile phone mobility data.

Characterization of the epidemic
Daily COVID-19 death data, D(t), is presented in Fig. 1 for the nine countries analysed. We selected the t = 0 reference time point as the date when the daily death rate first exceeded 5 deaths. In each country D(t) indicates the presence of a well defined exponential growth phase, followed by a crossover region and later an exponential decay stage. The initial growth and final decay regions are characterized by fitting the exponential D (t) as where the fitted functions are distinguished from the actual time series using a tilde. R 0 is the basic reproductive number (i.e. number of new secondary infections caused by a single infected in a fully susceptible population) based on a simple SIR dynamics 10 and γ is the inverse average duration of the infectious period. For calculating the basic reproductive number we use γ = 0.1 day −1 . The exact value of γ is somewhat uncertain at present 11,12 , however our analysis and results do not rely on the value of this parameter or on any modelling assumption regarding the disease dynamics.
To characterize the growth and decay phases, the values of R 0 1 and R 0 2 are approximated, respectively, by fitting Eq. (1) to the death data points (see Fig. 1). Least-squares fitting of Eq. (1) was performed using Mathematica (version 11, Wolfram Research, Inc.) routine NonlinearModelFit; see Supplementary Tables 1 and 2. The initial growth rate R 0 1 is similar in the selected countries except in Sweden where it is substantially lower. This difference may reflect weaker social mixing, different cultural habits or a somewhat lower population density -an interesting problem outside of the scope of this report. The reproductive number in the decaying phase R 0 2 is more variable across the different countries: in particular the decay is significantly slower in Sweden and somewhat slower in the UK.
A third parameter characterizes the transition from the growth to decay phases, i.e., the peak of the epidemic in terms of deaths. We define (t c ,D c ) as the intersection point of the two fitted exponential functions. This is a more robust estimator than the actual maximum in daily deaths D c as at the time of transition daily deaths can exhibit a plateau, hence the value and location of the maximum is sensitive to stochastic fluctuations. The gradual transition from exponential growth to decay in the actual data D(t) can be due to gradual implementation of social distancing as well as to the case-to-case variability of the time elapsed from infection to death.
The total death toll of the disease can be estimated analytically using the three parameters Approximations by Eq. (2) are compared with the actual total death toll, D tot , calculated as the area under death data points added to the area under the decaying curve (blue curve in Fig. 1) approaching zero. As Fig. 2 demonstrates, despite the substantial variation in death toll among the nine countries, it can be fairly well estimated by the expression (2). Specifically, details of the cross-over region, which do not fit well to the two exponential functions, contribute only around 10% to the overall death toll, while variations in the three epidemiological parameters ( R 0 1 , R 0 2 , t c ) can change the death burden by an order of magnitude.

Characterization of social distancing
The timing and strictness of the often voluntary social distancing is quantified from the average mobility data (green circles in Fig. 1), by fitting the following piece-wise linear function: where M 1 and M 2 are the average mobility levels before and after social distancing (i.e., before t 1 and after t 2 ), respectively. To fit Eq. (3) to the mobility data, we used the data recorded over 90 days starting from 13-January-2020. The mobility data show a relative daily volume of requests made to Apple Maps for directions by (3) www.nature.com/scientificreports/ transportation type per country compared to a baseline volume on 13-January-2020. Data is sent from users' devices to the Apple Maps service and is associated with randomised rotating identifiers so that Apple does not have a profile of individual movements and searches. The availability of data in a particular country is subject to a number of factors, including minimum thresholds for direction requests made per day. A day is defined as midnight-to-midnight, US Pacific time 9 . Although the mobility data may have bias in the mobility signal, it may have relatively little direct effect on data reported for the countries studied here. In the 90-day period considered in this study, the data indicates the mobility levels before and after implementing the social distancing. For clarity, to set a unique time-scale for the mobility data for all the countries, we added more data points after t 2 in Fig. 1. These additional data points were not used in the fitting. The walking, driving and transit data are first averaged, then Eq. (3) was fitted to the average mobility level using Mathematica routine NonlinearModelFit. The fitted values of t 1 , M 1 , t 2 , and M 2 are listed in Supplementary Table 3. The mobility data with average pre-pandemic value scaled to unity, M(t)/M 1 , is shown in Fig. 1. We characterize the extent of social distancing by the ratio µ = M 2 /M 1 < 1 . A strict restriction of social interaction is expected to be reflected as µ ≈ 0.
Using the fitted parameters t 1 and t 2 obtained from Eq. (3), an effective social distancing date is approximated. The mobility data shows a transition period in dropping the mobility level due to the social distancing. We define the midpoint of this transition period as an approximation for the average date when the social distancing becomes effective, given by: The Supplementary Table 3 summaries the approximated t eff . It is noteworthy that t eff often preceded the official national lock-down t NL .

The effect of social distancing on epidemic parameters
Next, we investigate how the timing and extent of the social distancing changes the epidemic parameters, and as a consequence, the total death toll. First, we note that surprisingly neither the date of the official national lock-down t NL nor the effective date of social distancing predicts well -in itself -the time of the epidemic peak t c (Fig. 3). As the average time between infection and succumbing to COVID-19 is 18 days 13 , a delay of approximately 18 days is expected (dashed line in Fig. 3) between the introduction of social distancing and the change in the trend of the daily death count. Instead, we find that the time to the peak from the official national lock-down t c − t NL varies in a range from 10 days in Italy to more than 3 weeks in case of Switzerland, and the time from the change in mobility to the peak, t c − t eff , ranges from 19 days (Italy and Spain) up to 34 days (Sweden) as shown in Table. 1. Furthermore, neither the time of the peak t c nor the parameters characterising the time and strength of social distancing t eff , t NL , µ correlates-as a single parameter-well with the total number of deaths (Fig. 4). For example, Belgium registers a very high per capita death toll despite an early official lock-down, 5 days before the reference time t = 0 , and even earlier change is observed in mobility data ( t eff = −10 days). On the other hand, a moderate social distancing in Germany, indicated by the mobility ratio µ = 0.4 , led to the lowest death toll within this group of countries. Thus, we decided to investigate more carefully how social distancing affects the three epidemic parameters R 0 1 , R 0 2 and D c -with the hypothesis that the non-linear, multi-factor relationship (2) effectively masks correlations between the death toll and any single control parameter.
As Fig. 5 indicates, we found two strong relationships between the epidemiological parameters and measures of social distancing. Figure 5a indicates a strong positive correlation between the drop in basic reproductive number, R 0 1 − R 0 2 and the restriction of mobility µ . The relationship can be well approximated by the quantitative formula: with ρ = 0.56 ± 0.09 and ζ = 3.18 ± 0.11 . Furthermore, the time elapsed between the peak and the social distancing, t c − t eff , correlates negatively with the severity of the mobility restrictions (Fig. 5b) as where τ 0 = 17.04 ± 1.62 days is comparable with the mean value of the time from infection to death in fatal COVID-19 disease 13 , and η = 1.50 ± 0.40 is the factor characterizing the lengthening of the delay for less severe reduction of mobility. Thus, the peak follows strict lock-downs (small µ as in Italy and Spain) by around 18 days. For less restrictive social distancing ( µ ≈ 0.6 as in Sweden), however, the peak can be delayed by as much as 5 weeks. This delay is of crucial importance, as the peak of the daily death toll D c is determined as  www.nature.com/scientificreports/ where D (t eff ) is a natural measure of how early the social distancing took place relative to the dynamics of the epidemic. As D (t) is the exponential fit equation (1) instead of the actual daily death count D(t), unfortunately it is difficult to know in real-time. The country-specific values of D (t eff ) are listed in Table 2.
Our analysis thus indicates that social distancing has two effects: it reduces the basic reproduction number of the infection as expected, and shortens the time required for the epidemic to peak. This latter effect is unexpected, as changes in behavior should reduce transmission immediately, which, after a fixed delay -involving manifestation of symptoms and in a fraction of the patients death -should also appear in the daily death toll D(t). We propose that the increased time between the peak and the time of the social distancing may indicate the presence of subpopulations in which the disease continues to propagate with the initial reproduction number R 0 1 . In these populations the transmission is eventually blocked, not by the overall social distancing efforts within the society, but by some other means. As a potential mechanism, we suggest that the local outbreak can reach such a magnitude that it either triggers an intervention or allows the establishment of herd immunity. Prominent examples of such events that collectively expand the duration of the initial growth phase are outbreaks in nursing homes, meat processing plants, warehouses and prisons -which become more likely when the overall social distancing is weak. Furthermore, weak overall social distancing could also fail to protect and segregate these vulnerable subpopulations specifically, thus increase the effective size of such subpopulations.

Discussion
In this paper we analysed the interdependence of epidemic and mobility data and identified a quantitative relation between parameters of social distancing and key characteristics of the COVID-19 pandemic. Our sample consisted of 9 European countries where suitable data was available at current time. We found that the total death toll does not correlate well with any single parameter such as the timing of the official lock-down or the strictness of social distancing extracted from mobile phone location data. The total death toll, however, could be well estimated by a non-linear combination of three parameters: exponents characterising (i) the initial exponential growth rate (or reproductive number, R 0 1 ) and (ii) the final decay rate of the epidemic, R 0 2 , and (iii) the peak death rate D c which separates the two stages. The initial growth rate is an intrinsic parameter which may vary somewhat across different countries, but is not affected by control measures or social responses to the pandemic. Based on our data analysis we find that the two remaining parameters, R 0 2 and D c can be related to the timing ( t eff ) and strength ( µ ) of social distancing. www.nature.com/scientificreports/ Since the estimated daily death toll at the peak increases exponentially with the difference between the effective lock-down time ( t eff ) and time of the peak ( t c ), small changes in t c − t eff can yield substantial differences in the total death toll. The per capita values of D (t eff ) are fairly similar: within the range 0.1-0.16 for 7 of the countries analysed, suggesting that social distancing started at similar stages of the epidemic. The important exception is Germany, where a much smaller value of 0.02 corresponds to roughly a week earlier response. According to the analysis presented here, this explains the substantially lower German death toll, in spite of the relatively moderate social distancing. The slightly higher value of D (t eff )/population in Spain (0.23) was compensated by a strict lock-down (lower µ).
The higher death toll in Belgium cannot be explained by the proposed set of parameters ( R 0 1 , D(t eff ), µ ) as all three values are close to the average of the sample. This anomaly can be traced to Fig. 5(b), where the deviation of the Belgian data point from the fitted curve indicates that the peak of the epidemic was delayed by approximately  www.nature.com/scientificreports/ 4 days compared to what could be expected based on the proxy measure for social distancing µ . While the high Belgian death toll is often attributed to the different methodology of recording COVID-19 related fatalities, including suspected deaths which were not confirmed by lab analysis, such a difference in methodology cannot explain the markedly high time distance between the effective date of social distancing ( t eff ) and the epidemic peak ( t c ); see Fig. 5(b). As we propose that t c − t eff reflects the presence of subpopulations in which the disease can spread unmitigated by social distancing efforts, we suggest that such groups were relatively larger in Belgium than in the other countries within our sample. While in the case of Sweden the initial growth rate was the slowest and even the timing ( D(t eff ) ) was similar to other countries the weakness of the mobility restriction (i.e. large µ ) led to high death toll resulting from a strongly delayed peak. We would like to emphasize, that this delay was unexpected at the time of social distancing efforts, and still unaccounted by the typical SIR-based epidemiological models 10 .
In the case of the UK the timing of the social distancing ( D(t eff ) ) is similar to other countries, however the mobility restriction appears to be weaker (i.e. higher µ ) compared to similar countries (Spain, Italy and France) which results in a slower decay of the epidemic.
The phone mobility data is an indirect measure of the social distancing and disease transmission probabilities. However, it can be compared to more traditional measures in the Netherlands, where a study directly determined the number of daily personal interactions before and after the lock-down. Using questionnaires, the study identified a 71% decrease in interactions on average, which is fairly close to the 65% drop estimated from the phone mobility data 14 . While mobile phone tracking data thus can characterize the relatively early stages of social distancing efforts, it may be less useful to detect more subtle efforts like wearing masks or staggered working shifts at later stages of the pandemic.
In conclusion, we demonstrated a quantitative relation between social distancing efforts and statistical parameters describing the current COVID-19 pandemic. We identified an unexpected extension of the exponential growth phase when social distancing efforts are weak, which can substantially increase the death toll of the disease. Future studies are required to extend this analysis to countries with a markedly different socio-economical arrangements.