Public mobility data enables COVID-19 forecasting and management at local and global scales

Policymakers everywhere are working to determine the set of restrictions that will effectively contain the spread of COVID-19 without excessively stifling economic activity. We show that publicly available data on human mobility, collected by Google, Facebook, and other providers, can be used to evaluate the effectiveness of non-pharmaceutical interventions (NPIs) and forecast the spread of COVID-19. This approach uses simple and transparent statistical models to estimate the effect of NPIs on mobility, and basic machine learning methods to generate 10-day forecasts of COVID-19 cases. An advantage of the approach is that it involves minimal assumptions about disease dynamics and requires only publicly available data. We evaluate this approach using local and regional data from China, France, Italy, South Korea, and the United States, as well as national data from 80 countries around the world. We find that NPIs are associated with significant reductions in human mobility, and that changes in mobility can be used to forecast COVID-19 infections.

includes other policies that close cultural institutions (e.g., museums or libraries), or encourage establishments to reduce density, such as limiting restaurant hours.

• no gathering: Represents a policy that prohibits any type of public or private gathering (whether cultural, sporting, recreational, or religious). Depending on the country, the policy can prohibit a gathering above a certain size, in which case the number of people is specified by the no gathering size variable.

• event cancel: Represents a policy that cancels a specific pre-scheduled large event (e.g., parade, sporting event, etc.). This is different from prohibiting all events over a certain ...

Table S1 contains a summary of the data used for each country.

Google mobility data summarizes time spent by their users each day after Feb 6, 2020 in various types of places, such as residential, workplaces and grocery stores [32]. Specifically, it provides the percentage change in number of visits and length of stay in each type of place, compared to a baseline value. The baseline is the value on the corresponding day of the week during the 5-week period between Jan 3, 2020 and Feb 6, 2020. The metrics are available starting Feb 15, 2020 at the country (Administrative Level 0) and state level (Administrative Level 1) for over 135 countries. We also access county-level metrics (Administrative Level 2) for the US.

Facebook summarizes and anonymizes its user data into useful metrics that can be used to evaluate the movement of people [33]. Our analysis uses data beginning March 5, Feb 23 and Feb 24, 2020 for France, Italy and South Korea respectively. Specifically, Facebook aggregates the number of trips between tiles of up to a resolution of 360 square meters. We aggregate these data to the level of administrative regions, constructing metrics for number of trips between as well as within these regions. We use the following variables from the data provided by Facebook: ...

... (1)). The model is a commonly used reduced-form approach in econometrics. Details on the model and model estimation are presented below.
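As an illustration of the baseline-relative metric described for the Google mobility data, the following minimal sketch computes a percentage change against a day-of-week baseline over the Jan 3 to Feb 6, 2020 window. It is not Google's pipeline: the use of the median as "the value" for each weekday, the function names, and the synthetic visit counts are all assumptions made for this example.

```python
from datetime import date, timedelta
from statistics import median

# Baseline window taken from the text: Jan 3, 2020 to Feb 6, 2020 (5 weeks).
BASELINE_START = date(2020, 1, 3)
BASELINE_END = date(2020, 2, 6)


def weekday_baseline(visits):
    """Return one baseline value per weekday (0 = Monday).

    Assumption: the baseline is the median of the five values observed
    on that weekday inside the baseline window.
    """
    by_weekday = {}
    for day, value in visits.items():
        if BASELINE_START <= day <= BASELINE_END:
            by_weekday.setdefault(day.weekday(), []).append(value)
    return {wd: median(vals) for wd, vals in by_weekday.items()}


def percent_change_from_baseline(visits, day):
    """Percentage change of `day` relative to its weekday baseline."""
    base = weekday_baseline(visits)[day.weekday()]
    return 100.0 * (visits[day] - base) / base


# Synthetic data (hypothetical): 100 visits on every baseline day,
# then 60 visits on a later day.
visits = {BASELINE_START + timedelta(days=i): 100.0 for i in range(35)}
visits[date(2020, 3, 16)] = 60.0
change = percent_change_from_baseline(visits, date(2020, 3, 16))
```

Comparing each day with the same weekday in the baseline period keeps ordinary weekly rhythms (for example, quieter workplaces on weekends) from being read as a behavioral change.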

Model details: ... where $m_t$ is a measure of mobility behavior at time $t$, $X_t$ represents control variables, and $\epsilon_t$ is the error. We use a linear ... in each model using the model described above, and ordinary least squares. The time period that we consider is the "first wave" of infections; specific dates are in Table S1.

Similar to the behavior model, the infection model is also a reduced-form approach, used to describe the relationship between infections and mobility behavior ($\frac{\Delta \text{infections}}{\Delta \text{behavior}}$ in equation (1)). Model details, as well as steps for model estimation, forecasting and cross-validation are outlined below.

Also included are steps for data selection.

Model details:

1. The model used is $\log(I_t / I_{t-1}) = g(\text{mobility}_t, X_t) + \epsilon_t$, where $\log(I_t / I_{t-1})$ is the first-difference of log confirmed infections at time $t$, $X_t$ represents control variables, and $\epsilon_t$ is the error. We use ...

3. The mobility variable is a vector with mobility rates specific to each country, for each location and day. It includes mobility measures averaged over lags 1-7, 8-14 and 15-21, respectively.

We use Google mobility data in its original form (percentage points), and take logs for the Facebook and Baidu mobility data.

where $i$ is the unit of analysis, $\gamma_i$ are unit-level fixed effects, $\delta_t$ are day-of-week fixed effects, and $\phi_{it}$ are indicators for changes in testing regimes.
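The two ingredients named in the model details, the first-difference of log confirmed infections and mobility averaged over lags 1-7, 8-14 and 15-21, can be sketched as follows. This is only an illustration for a single location: the function names and the synthetic series are hypothetical, and the controls, fixed effects and testing-regime indicators are left out.

```python
import math

# Lag windows taken from the text: days 1-7, 8-14 and 15-21 before day t.
LAG_WINDOWS = ((1, 7), (8, 14), (15, 21))


def log_growth_rate(cases, t):
    """First-difference of log confirmed infections: log(I_t / I_{t-1})."""
    return math.log(cases[t] / cases[t - 1])


def lagged_mean(series, t, first_lag, last_lag):
    """Average of series[t - lag] for lag in first_lag..last_lag."""
    if t < last_lag:
        raise ValueError("not enough history before day t")
    window = [series[t - lag] for lag in range(first_lag, last_lag + 1)]
    return sum(window) / len(window)


def mobility_features(mobility, t):
    """Three regressors for day t, one per lag window."""
    return [lagged_mean(mobility, t, a, b) for a, b in LAG_WINDOWS]


# Synthetic single-location series (hypothetical values).
mobility = [float(d) for d in range(30)]       # mobility equals the day index
cases = [100.0 * 1.1 ** d for d in range(30)]  # 10% daily growth
features = mobility_features(mobility, 25)
growth = log_growth_rate(cases, 25)
```

Each day with at least 21 days of history then contributes one row, with `growth` as the dependent variable and `features` as the mobility regressors.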

The model is robust to systematic differences in infection tracking across locations, since the dependent variable is a growth rate. Regional fixed effects allow for location-specific underreporting; estimates are unbiased as long as the location-specific reporting rate remains constant over time.
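A short derivation may help show why a growth-rate dependent variable is insensitive to constant underreporting. It assumes, purely for illustration, that reported infections $\tilde{I}_{it}$ in location $i$ are a fixed share $\rho_i$ of true infections $I_{it}$; neither symbol appears in the original text.

```latex
\begin{align*}
\tilde{I}_{it} &= \rho_i \, I_{it}, \qquad 0 < \rho_i \le 1, \\
\log\frac{\tilde{I}_{it}}{\tilde{I}_{i,t-1}}
  &= \log\frac{\rho_i \, I_{it}}{\rho_i \, I_{i,t-1}}
   = \log\frac{I_{it}}{I_{i,t-1}}.
\end{align*}
```

Under this assumption the reporting share cancels from the dependent variable. If the share instead varied over time, an extra term $\log(\rho_{it}/\rho_{i,t-1})$ would remain, which is the kind of shift that indicators for changes in testing regimes are meant to absorb.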

Major changes in testing regimes within a location are included in our model via unit-day specific dummy variables.

Steps for model estimation: The following steps are used to generate estimates of the average effect of each mobility variable on the growth rate of infections (see Figure S2). These are then used to estimate how a novel policy affecting mobility would alter future infections (Table 2).

1. Estimate the average effect of each mobility variable on the growth rate of infections, $\boldsymbol{\beta} = \{\beta_1, \beta_2, \beta_3\}$, using the model described above, and ordinary least squares.

... $\frac{I_{new}}{I_{original}} = e^{k \left( \sum_{l=1}^{3} \beta_l \Delta m_l \right)}$, where $\Delta m_l$ is the change in residential time over baseline for the $l$th mobility variable (e.g., $\Delta m_l = .05$ means a 5 percentage point increase, say from 20% to 25% residential time over baseline, for all lags in the $l$th variable).
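The ratio formula can be evaluated directly once coefficient estimates are available. The sketch below is only an arithmetic illustration: the coefficient values are made up, and $k$ is treated here as the number of days over which the altered growth rate compounds, which is an assumption because its definition does not survive in the text above.

```python
import math


def infection_ratio(betas, delta_m, k):
    """Evaluate I_new / I_original = exp(k * sum_l beta_l * delta_m_l).

    betas:   coefficients for the three lag windows (hypothetical here)
    delta_m: change in each mobility variable (0.05 = 5 percentage points)
    k:       assumed number of days over which the change compounds
    """
    if len(betas) != len(delta_m):
        raise ValueError("betas and delta_m must have the same length")
    exponent = k * sum(b * d for b, d in zip(betas, delta_m))
    return math.exp(exponent)


# Hypothetical coefficients, not estimates from the paper.
betas = [-0.2, -0.3, -0.1]
# A 5 percentage point rise in residential time in every lag window.
delta_m = [0.05, 0.05, 0.05]
ratio = infection_ratio(betas, delta_m, k=10)
```

With these made-up inputs the exponent is 10 × (−0.03) = −0.3, so `ratio` is about 0.74, meaning infections roughly 26% below the no-change path. A ratio of exactly 1 corresponds to no change in mobility.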

Steps for forecasting and cross-validation:

Steps for data selection:

The time period that we consider (see Table S1) is the "first wave" of infections, and to demon- ...

Figure S1: Spatial and temporal spillover of policies. (a-b) Solid markers indicate the direct impact of large policies on mobility. Hollow markers show the estimated effect of a policy on neighboring regions. Policies are jointly estimated at the local level for each country. In China (b), we also separately estimate the effect of each policy for each time period after the policy's implementation. (c) The impact of lockdown on the time spent at home is estimated using a country-level regression with 80 countries. We report the cumulative effect over time.

Figure S2: Impact of mobility on the growth rate of COVID-19 cases. Estimated impact of mobility on COVID-19 infection growth rate over time. Effects are estimated for each of the preceding three weeks (lags of 1 to 21 days), where the measure of mobility is either the number of trips between administrative units (left) or the amount of time spent at home (right). The impact of mobility is gradually increasing over time and is highest after 2 weeks.