
# The effect of large-scale anti-contagion policies on the COVID-19 pandemic

A Publisher Correction to this article was published on 22 August 2020

## Abstract

Governments around the world are responding to the coronavirus disease 2019 (COVID-19) pandemic1, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), with unprecedented policies designed to slow the growth rate of infections. Many policies, such as closing schools and restricting populations to their homes, impose large and visible costs on society; however, their benefits cannot be directly observed and are currently understood only through process-based simulations2,3,4. Here we compile data on 1,700 local, regional and national non-pharmaceutical interventions that were deployed in the ongoing pandemic across localities in China, South Korea, Italy, Iran, France and the United States. We then apply reduced-form econometric methods, commonly used to measure the effect of policies on economic growth5,6, to empirically evaluate the effect that these anti-contagion policies have had on the growth rate of infections. In the absence of policy actions, we estimate that early infections of COVID-19 exhibit exponential growth rates of approximately 38% per day. We find that anti-contagion policies have significantly and substantially slowed this growth. Some policies have different effects on different populations, but we obtain consistent evidence that the policy packages that were deployed to reduce the rate of transmission achieved large, beneficial and measurable health outcomes. We estimate that across these 6 countries, interventions prevented or delayed on the order of 61 million confirmed cases, corresponding to averting approximately 495 million total infections. These findings may help to inform decisions regarding whether or when these policies should be deployed, intensified or lifted, and they can support policy-making in the more than 180 other countries in which COVID-19 has been reported7.

## Main

The COVID-19 pandemic is forcing societies worldwide to make consequential policy decisions with limited information. After containment of the initial outbreak failed, attention turned to implementing non-pharmaceutical interventions that are designed to slow the contagion of the virus. In general, these policies aim to decrease virus transmission by reducing contact among individuals within or between populations, such as by closing restaurants or restricting travel, thereby slowing the spread of COVID-19 to a manageable rate. These large-scale anti-contagion policies are informed by epidemiological simulations2,4,8,9 and a small number of natural experiments during past epidemics10. However, the actual effects of these policies on infection rates in the ongoing pandemic are unknown. Because the modern world has never confronted this pathogen, nor deployed anti-contagion policies of such scale and scope, it is crucial that direct measurements of the effects of policies are used together with numerical simulations in current decision-making.

Societies around the world are considering whether the health benefits of anti-contagion policies are worth their social and economic costs. Many of these costs are clearly observed; for example, business restrictions increase unemployment and school closures affect educational outcomes. It is therefore not surprising that some populations have hesitated before implementing such policies, especially when their costs are visible while their health benefits—infections and deaths that would have occurred but are instead avoided or delayed—are unseen. Our objective is to measure the direct health benefits of these policies; specifically, how much these policies slowed the growth rate of infections. To do this, we compare the growth rate of infections within hundreds of subnational regions before and after each of these policies is implemented locally. Intuitively, each administrative unit observed immediately before a policy deployment serves as the ‘control’ for the same unit in the days after it receives a policy ‘treatment’ (see Supplementary Information for accounts of these deployments). Our hope is to learn from the recent experience of six countries in which the early spread of the virus triggered large-scale policy actions, in part so that societies and decision-makers everywhere can access this information.

Here we directly estimate the effects of 1,700 local, regional and national policies on the growth rate of infections across localities within China, South Korea, Italy, Iran, France and the United States (Fig. 1 and Supplementary Table 1). We compile subnational data on daily infection rates, changes in case definitions and the timing of policy deployments, including (1) travel restrictions, (2) social distancing through the cancellations of events and suspensions of educational, commercial and religious activities, (3) quarantines and lockdowns, and (4) additional policies such as emergency declarations and expansions of paid sick leave, from the earliest available dates to 6 April 2020 (Extended Data Fig. 1 and Supplementary Notes). During this period, populations remained almost entirely susceptible to COVID-19, causing the natural spread of infections to exhibit almost perfect exponential growth11,12,13. The rate of this exponential growth could change daily, determined by epidemiological factors, such as disease infectivity, as well as policies that alter behaviour9,11. Because policies were deployed while the epidemic unfolded, we can estimate their effects empirically. We examine how the daily growth rate of infections in each locality changed in response to the collection of ongoing policies applied to that locality on that day.

We use well-established reduced-form econometric techniques5,14 that are commonly used to measure the effects of events6,15 on economic growth rates. Similar to early COVID-19 infections, economic output generally increases exponentially with a variable rate that can be affected by policies and other conditions. Here, this technique aims to measure the total magnitude of the effect of changes in policy, without requiring explicit prior information about fundamental epidemiological parameters or mechanisms, many of which remain uncertain in the current pandemic. Instead, the collective influence of these factors is empirically recovered from the data without modelling their individual effects explicitly (see Methods). Previous research on influenza16, for example, has shown that such statistical approaches can provide important complementary information to process-based models.

To construct the dependent variable, we transform location-specific, subnational time-series data on infections into first differences of their natural logarithm, which is the per-day growth rate of infections (see Methods). We use data from first- or second-level administrative units and data on active or cumulative cases, depending on availability (Supplementary Notes). We employ widely used panel regression models5,14 to estimate how the daily growth rate of infections changes over time within a location when different combinations of large-scale policies are enacted (see Methods). Our econometric approach accounts for differences in the baseline growth rate of infections across subnational locations, which may be affected by time-invariant characteristics, such as demographics, socioeconomic status, culture and health systems; it accounts for systematic patterns in growth rates within countries unrelated to policy, such as the effect of the workweek; it is robust to systematic undersurveillance specific to each subnational unit; and it accounts for changes in procedures to diagnose positive cases (Methods and Supplementary Methods).
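The transformation described above can be sketched in a few lines. This is a minimal illustration only: the function name `log_growth_rates` and the case counts are hypothetical, and days with zero cases are marked as missing here, which is an assumption and not necessarily how the authors handled them.

```python
import math

def log_growth_rates(cases):
    """Return per-day growth rates as first differences of log(cases).

    Pairs of days where either count is zero yield NaN, because the
    logarithm is undefined there (an assumption made for this sketch).
    """
    rates = []
    for prev, curr in zip(cases, cases[1:]):
        if prev > 0 and curr > 0:
            rates.append(math.log(curr) - math.log(prev))
        else:
            rates.append(float("nan"))
    return rates

# Hypothetical case counts for one administrative unit on four consecutive days.
example_cases = [100, 146, 213, 311]
example_rates = log_growth_rates(example_cases)
```

A series of n daily counts gives n − 1 growth rates, one for each pair of consecutive days.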

We estimate that in the absence of policies, early infection rates of COVID-19 grow 43% per day on average across these six countries (s.e.m. = 5%), implying a doubling time of approximately 2 days. Country-specific estimates range from 34% per day in the United States (s.e.m. = 7%) to 68% per day in Iran (s.e.m. = 9%). We cannot determine whether the high estimate for Iran results from true epidemiological differences, data-quality issues (see Methods), the concurrence of the initial outbreak with a major religious holiday and pilgrimage (Supplementary Notes) or sampling variability. Excluding Iran, the average growth rate is 38% per day (s.e.m. = 5%). Growth rates in all five other countries are independently estimated to be very near this value (Fig. 2a). These estimated values differ from observed average growth rates because the latter are confounded by the effects of policies. These growth rates are not driven by the expansion of testing or increasing rates of case detection (Methods and Extended Data Fig. 2) nor by data from individual regions (Extended Data Fig. 3).
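As a rough check on the doubling times quoted above, and assuming the reported rates are read as continuous (log-difference) growth rates $g$ per day, the doubling time $t_{2}$ follows from $e^{g t_{2}}=2$. The symbols $g$ and $t_{2}$ are introduced here for illustration only.

$$t_{2}=\frac{\ln 2}{g},\qquad g=0.43\ \Rightarrow\ t_{2}\approx 1.6\ \text{days},\qquad g=0.38\ \Rightarrow\ t_{2}\approx 1.8\ \text{days}$$

Both values round to the approximately 2 days stated in the text.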

Some previous analyses of pre-intervention infections in Wuhan have suggested that the growth rates were slower (doubling every 5–7 days)17,18 using data collected before national standards for diagnosis and case definitions were first issued by the Chinese government on 15 January 202019. However, case data in Wuhan from before this date contain multiple irregularities: the cumulative case count decreased on 9 January 2020; no new cases were reported for 9–15 January; and there were concerns that information about the outbreak was suppressed20 (Supplementary Table 2). When we remove these data, using a shorter but more reliable pre-intervention time series from Wuhan (16–21 January), we recover a growth rate of 43% per day (s.e.m. = 3%), which corresponds to a doubling time of 2 days, consistent with results from all other countries except Iran (Fig. 2a).

During the early stages of an epidemic, a large proportion of the population remains susceptible to the virus, and if the spread of the virus is left uninhibited by changes in policies or behaviour, exponential growth continues until the fraction of the susceptible population decreases meaningfully11,13,21,22. After correcting for estimated rates of case detection23, we compute that the minimum susceptible fraction across administrative units in our sample is 72% of the total population (Cremona, Italy) and 87% of administrative units would be likely to be in a regime of uninhibited exponential growth (that is, more than 95% of the population remains susceptible) if policies were removed on the last date of our sample.

Consistent with predictions from epidemiological models2,10,24, we find that the combined effect of policies within each country reduces the growth rate of infections by a substantial and statistically significant amount (Fig. 2b and Supplementary Table 3). For example, a locality in France with a baseline growth rate of 0.33 (national average) that fully deployed all policy actions used in France would be expected to lower its daily growth rate by −0.17 to a growth rate of 0.16. In general, the estimated total effects of policy packages are large enough that they can in principle offset a large fraction of, or even eliminate, the baseline growth rate of infections—although in several countries, many localities have not deployed the full set of policies. Overall, the estimated effects of all policies combined are generally insensitive to withholding regional (that is, state- or province-level) blocks of data from the sample (Extended Data Fig. 3).

In China, only three policies were enacted across 115 cities early in a 7-week period, providing us with sufficient data to empirically estimate how the effects of these policies evolved over time without making assumptions about the timing of these effects (Fig. 2b and Methods). We estimate that the combined effect of these policies reduced the growth rate of infections by −0.026 (s.e.m. = 0.046) in the first week after they came into effect, increasing substantially in the second week to −0.20 (s.e.m. = 0.049), and essentially stabilizing in the third week around −0.28 (s.e.m. = 0.047). In other countries, we lack sufficient data to estimate these temporal dynamics explicitly and only report the average pooled effect of policies across all days after their deployment (Methods). If other countries have transient responses similar to China, we would expect that the effects in the first week after deployment are smaller in magnitude than the average effect that we report. We also explore how our estimates would change if we impose the assumption that policies cannot affect infection growth rates until after a fixed number of days (Extended Data Fig. 5a and Supplementary Methods section 3); however, we do not find evidence that this improves model fit.

The estimates described above (Fig. 2b) capture the superposition of all policies deployed in each country; that is, they represent the average effect of policies that we would expect to observe if all policies enacted anywhere in each country were implemented simultaneously in a single region of that country. We also estimate the effects of individual policies or clusters of policies (Fig. 2c) that are grouped based on either their similarity in goal (for example, library and museum closures) or timing (for example, policies deployed simultaneously). Our estimates for these individual effects tend to be statistically noisier than the estimates for all policies combined. Some estimates for the same policy differ between countries, perhaps because policies are not implemented identically or because populations behave differently. Nonetheless, 22 out of 29 point estimates indicate that individual policies are probably contributing to the reduction of the growth rate of infections. Seven policies (one in South Korea, two in Italy and four in the United States) have point estimates that are positive, six of which are small in magnitude (less than 0.1) and not statistically different from zero (5% level). Consistent with greater overall uncertainty in these disaggregated estimates, some of the estimates in China, South Korea, Italy and France are moderately more sensitive to withholding regional blocks of data (Extended Data Fig. 4), but remain broadly robust to the assumption of a constant delayed effect of all policies (Extended Data Fig. 5b).

On the basis of these results, we find that the deployment of anti-contagion policies in all six countries significantly slowed the pandemic. We combine the estimates above with our data on the timing of the 1,700 policy deployments to estimate the total effect of all policies across the dates in our sample. To do this, we use our estimates to predict the growth rate of infections in each locality on each day, given the actual policies in effect at that location on that date (Fig. 3). We then use the same model to predict what the counterfactual growth rates would be on that date if the effects of all policies were removed (Fig. 3), which we call the no-policy scenario. The difference between these two predictions is our estimate of the effect that all deployed policies had on the growth rate of infections. During our sample, we estimate that all policies combined slowed the average growth rate of infections by −0.252 per day (s.e.m. = 0.045, P < 0.001) in China, −0.248 (s.e.m. = 0.089, P < 0.01) in South Korea, −0.24 (s.e.m. = 0.068, P < 0.001) in Italy, −0.355 (s.e.m. = 0.063, P < 0.001) in Iran, −0.123 (s.e.m. = 0.019, P < 0.001) in France and −0.084 (s.e.m. = 0.03, P < 0.01) in the United States. These results are robust to modelling the effects of policies without grouping them (Extended Data Fig. 6a and Supplementary Table 4) or assuming a delayed effect of policy on infection growth rates (Supplementary Table 5).

The number of COVID-19 infections on a date depends on the growth rate of infections on all previous days. Thus, persistent reductions in growth rates have a compounding effect on infections, until growth is slowed by a shrinking susceptible population. To provide a sense of scale for our results, we integrate the growth rate of infections in each locality from Fig. 3 to estimate cumulative infections, both with actual anti-contagion policies and in the no-policy scenario. To account for the declining susceptible population in each administrative unit, we couple our econometric estimates of the effects of policies with a susceptible–infected–removed model11,13 that adjusts the susceptible population in each administrative unit based on estimated case-detection rates23,25 (see Methods). This allows us to extend our projections beyond the initial exponential growth phase of infections, a threshold that many localities cross in our no-policy scenario.
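The compounding effect, and the way a shrinking susceptible population eventually limits growth, can be illustrated with a small discrete-time susceptible–infected–removed sketch. This is not the authors' implementation: the function name, the removal rate `gamma = 0.05`, the population size and the growth-rate series are all hypothetical, and the one-day Euler step is only an approximation.

```python
def simulate_cumulative(growth_rates, population, initial_infected=1.0, gamma=0.05):
    """Discrete-time SIR sketch driven by a daily series of growth rates.

    Each growth rate g is read as the rate that would hold if everyone were
    susceptible, so the implied transmission rate is beta = g + gamma.
    Returns cumulative infections after the last day.
    """
    susceptible = population - initial_infected
    infected = initial_infected
    cumulative = initial_infected
    for g in growth_rates:
        beta = g + gamma
        # New infections scale with the susceptible share of the population.
        new_infections = beta * infected * susceptible / population
        new_infections = min(new_infections, susceptible)
        removals = gamma * infected
        susceptible -= new_infections
        infected += new_infections - removals
        cumulative += new_infections
    return cumulative

# Hypothetical scenarios over 30 days in a population of one million:
# a constant no-policy growth rate versus a rate lowered after day 10.
no_policy = simulate_cumulative([0.38] * 30, population=1_000_000)
with_policy = simulate_cumulative([0.38] * 10 + [0.13] * 20, population=1_000_000)
```

Comparing `no_policy` with `with_policy` shows how a sustained reduction in the daily growth rate compounds into a much larger gap in cumulative infections.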

Our results suggest that anti-contagion policies have already substantially reduced the number of COVID-19 infections observed in the world at present (Fig. 4). Our central estimates suggest that there would be approximately 37 million more cumulative confirmed cases (corresponding to 285 million more total infections, including the confirmed cases by 5 March 2020) in China, 11.5 million more confirmed cases (38 million total infections by 6 April 2020) in South Korea, 2.1 million more confirmed cases (49 million total infections by 6 April 2020) in Italy, 4.9 million more confirmed cases (54 million total infections by 22 March 2020) in Iran, 280,000 more confirmed cases (9 million total infections by 25 March 2020) in France and 4.8 million more confirmed cases (60 million total infections by 6 April 2020) in the United States had these countries never enacted any anti-contagion policies since the start of the pandemic. The magnitudes of these impacts partially reflect the timing, intensity and extent of policy deployment (for example, how many localities deployed policies) and the duration for which they have been applied. Several of these estimates are subject to large statistical uncertainties (see intervals in Fig. 4). Sensitivity tests (Extended Data Fig. 7) that assume a range of plausible alternative parameter values relating to disease dynamics, such as incorporating a susceptible–exposed–infected–removed model, suggest that interventions may have reduced the severity of the outbreak by a total of 54–65 million confirmed cases over the dates in our sample (central estimates). Sensitivity tests in which the assumed infection–fatality ratio is varied (Supplementary Table 6) suggest a corresponding range of 46–77 million confirmed cases (490–580 million total infections).

Our empirical results indicate that large-scale anti-contagion policies have slowed the COVID-19 pandemic. Because infection rates in the countries that we studied would have initially followed rapid exponential growth had no policies been applied, our results suggest that these policies have provided large health benefits. For example, we estimate that there would be approximately 465× the observed number of confirmed cases in China, 17× the number in Italy and 14× the number in the United States by the end of our analysis if large-scale anti-contagion policies had not been deployed. Consistent with process-based simulations of COVID-19 infections2,4,8,9,22,26, our analysis of existing policies indicates that seemingly small delays in policy deployment are likely to have produced markedly different health outcomes.

Although the limitations of available data pose challenges to our analysis, our aim is to use what data exist to estimate the first-order effects of unprecedented policy actions in an ongoing global crisis. As more data become available, related findings will become more precise and may capture more complex interactions. Furthermore, this analysis does not account for interactions between populations in nearby localities13, nor mobility networks3,4,8,9. Nonetheless, we hope that these results can support critical decision-making, both in the countries that we study and in the more than 180 other countries in which COVID-19 infections have been reported7.

A key advantage of our reduced-form top-down statistical approach is that it captures the real-world behaviour of affected populations without requiring that we explicitly model the underlying mechanisms and processes. This is useful in the current pandemic, for which many process-related parameters remain uncertain. However, our results cannot and should not be interpreted as a substitute for bottom-up process-based epidemiological models that are specifically designed to provide guidance in public health crises. Rather, our results complement existing models, for example, by helping to calibrate key model parameters. We believe both forward-looking simulations and backward-looking empirical evaluations should be used to inform decision-making.

Our analysis measures changes in local infection growth rates associated with changes in anti-contagion policies. A necessary condition for this association to be interpreted as the plausibly causal effect of these policies is that the timing of policy deployment is independent of infection growth rates14. This assumption is supported by established epidemiological theory11,13,27 and evidence28,29, which indicate that infections in the absence of policy will grow exponentially early in the epidemic, implying that pre-policy infection growth rates should be constant over time and therefore uncorrelated with the timing of policy deployment. Furthermore, scientific guidance to decision-makers early in the current epidemic explicitly projected constant growth rates in the absence of anti-contagion measures, limiting the possibility that anticipated changes in natural growth rates affected decision-making2,22,30,31. In practice, policies tended to be deployed in response to the high total numbers of cases (for example, in France)32, in response to outbreaks in other regions (for example, in China, South Korea and Iran)33, after delays due to political constraints (for example, in the United States and Italy) and often with timings that coincided with arbitrary events, such as weekends or holidays (see Supplementary Notes for detailed chronologies).

Our analysis accounts for documented changes in COVID-19 testing procedures and availability, as well as differences in case detection across locations; however, unobserved trends in case detection could affect our results (see Methods). We analyse estimated case-detection trends23 (Extended Data Fig. 2) and find that this potential bias is small—possibly elevating our estimated no-policy growth rates by 0.026 (7%) on average.

It is also possible that changing public knowledge during the period of our study affects our results. If individuals alter their behaviour in response to new information unrelated to anti-contagion policies, such as seeking out online resources, this could alter the growth rate of infections and thus affect our estimates. If increasing availability of information reduces infection growth rates, it would cause us to overstate the effectiveness of anti-contagion policies. We note, however, that if public knowledge is increasing in response to policy actions, such as through news reports, then it should be considered a pathway through which policies alter infection growth, not a form of bias. Investigating these potential effects is beyond the scope of this analysis, but it is an important topic for future investigations.

Finally, our analysis focuses on confirmed infections, but other outcomes, such as hospitalizations or deaths, are also of policy interest. Future studies on these outcomes may require additional modelling approaches because they are relatively more context- and state-dependent. Nonetheless, we experimentally implement our approach on the daily growth rate of hospitalizations in France, the only country in our sample for which hospitalization data are available at the granularity of this study. We find that the total estimated effect of anti-contagion policies on the growth rate of hospitalizations is similar to our estimates for infection growth rates (Extended Data Fig. 6c).

## Methods

### Data reporting

No statistical methods were used to predetermine sample size. The experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment.

### Data collection and processing

We provide a brief summary of our data collection processes here; further details, including access dates, are provided in the Supplementary Notes. Epidemiological data, case definitions/testing regimes and policy data for each of the six countries in our sample were collected from a variety of in-country data sources, including government public health websites, regional newspaper articles and crowd-sourced information on Wikipedia. The availability of epidemiological and policy data varied across the six countries, and preference was given to the collection of data at the most granular administrative unit level. The country-specific panel datasets are at the regional level in France, the state level in the United States, the province level in South Korea, Italy and Iran, and the city level in China. Owing to data availability, the sample dates differ across countries: in China we use data from 16 January to 5 March 2020; in South Korea from 17 February to 6 April 2020; in Italy from 26 February to 6 April 2020; in Iran from 27 February to 22 March 2020; in France from 29 February to 25 March 2020; and in the United States from 3 March to 6 April 2020. Our data sources are described in more detail below.

#### China

We acquired epidemiological data from an open-source GitHub project34 that scrapes time series data from Ding Xiang Yuan, a Chinese website that integrates COVID-19 epidemiological data from various local governments. We extended this dataset back in time to 10 January 2020 by manually collecting official daily statistics from the central and provincial (Hubei, Guangdong and Zhejiang) Chinese government websites. We compiled policies by collecting data on the start dates of emergency declarations, travel bans and lockdowns at the city level from the ‘2020 Hubei lockdowns’ Wikipedia page35 and various other news reports. We suspect that most Chinese cities have implemented at least one anti-contagion policy due to their reported trends in infections; as such, we dropped cities for which we could not identify a policy deployment date to avoid miscategorizing the policy status of these cities. Thus our results are only representative for the sample of 115 cities for which we obtained policy data.

#### South Korea

We manually collected and compiled the epidemiological dataset for South Korea, based on provincial government reports, policy briefings and news articles. We compiled policy actions from news articles and press releases from the Korean Centers for Disease Control and Prevention, the Ministry of Foreign Affairs and websites of local governments.

#### Iran

We used epidemiological data from the table ‘New COVID-19 cases in Iran by province’36 in the ‘2020 coronavirus pandemic in Iran’ Wikipedia article, which were compiled from data provided by the Iranian Ministry of Health website (in Persian). We relied on news reporting and two timelines of pandemic events in Iran36,37 to collate policy data. From 2 March to 3 March 2020, Iran did not report subnational cases. Around this period, the country implemented three national policies: a recommendation against local travel (1 March), work from home for government employees (3 March) and school closure (5 March). As the effects of these policies cannot be distinguished from each other due to the data gap, we group them together for the purpose of this analysis.

#### Italy

We used epidemiological data from the GitHub repository38 maintained by the Italian Department of Civil Protection (Dipartimento della Protezione Civile). For policies, we primarily relied on the English version of the COVID-19 dossier ‘Chronology of main steps and legal acts taken by the Italian Government for the containment of the COVID-19 epidemiological emergency’ written by the Dipartimento della Protezione Civile39, and Wikipedia40.

#### France

We used the region-level epidemiological dataset provided by the government website of France41 and supplemented it with the number of confirmed cases by region on the public health website of France, which was updated daily until 25 March42. We obtained data on the policy response to the COVID-19 pandemic from the French government website, press releases from each regional public health site43 and Wikipedia44.

#### United States

We used state-level epidemiological data from usafacts.org45, which are compiled from multiple sources. For policy responses, we relied on a number of sources, including the US Centers for Disease Control and the National Governors Association, as well as various executive orders from county- and city-level governments, and press releases from media outlets.

#### Policy data

Policies in administrative units were coded as binary variables, for which the policy was coded as either 1 (after the date that the policy was implemented and before it was removed) or 0 (otherwise) for the affected administrative units. When a policy only affected a fraction of an administrative unit (for example, half of the counties within a state), policy variables were weighted by the percentage of people within the administrative unit who were treated by the policy. We used the most recent population estimates we could find for the administrative units of countries (see the ‘Population Data’ section in the Supplementary Information). To standardize policy types across countries, we mapped each country-specific policy to one of the broader policy category variables in our analysis. In this exercise, we collected 168 policies for China, 59 for South Korea, 214 for Italy, 23 for Iran, 59 for France and 1,177 for the United States (Supplementary Table 1).

There are some cases for which we encode policies that are necessarily in effect whenever another policy is in place, owing in particular to the far-reaching implications of home-isolation policies. In China, wherever home isolation is documented, we assume a local travel ban is enacted on the same day if we have not found an explicit local travel ban policy for a given locality. In France, we assume home isolation is accompanied by event cancellations, social distancing and no-gathering policies; in Italy, we assume home isolation entails no-gathering, local travel ban, work from home and social distancing policies; in the United States, we assume shelter-in-place orders indicate that non-essential business closures, work from home policies and no-gathering policies are in effect.

For policy types that are enacted multiple times at increasing degrees of intensity within a locality, we add weights to the variable by escalating the intensity from 0 pre-policy in steps up to 1 for the final version of the policy (see the ‘Policy Data’ section in the Supplementary Information).
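The binary coding and population weighting described above can be sketched as follows. The function `policy_value` and its parameters are hypothetical names chosen for illustration, and treating the start date as the first day in effect is an assumption of this sketch; intensity steps for escalating policies are not modelled.

```python
from datetime import date

def policy_value(day, start, end=None, treated_population=None, total_population=None):
    """Return the policy variable for one administrative unit on one day.

    The value is 0 outside the period in which the policy is in effect.
    Inside that period it is 1, or the treated share of the population
    when the policy covers only part of the unit.
    """
    active = day >= start and (end is None or day < end)
    if not active:
        return 0.0
    if treated_population is None or total_population is None:
        return 1.0
    return treated_population / total_population

# Hypothetical policy starting on 15 March 2020 that covers half of a
# unit's one million residents.
before = policy_value(date(2020, 3, 10), start=date(2020, 3, 15))
full = policy_value(date(2020, 3, 20), start=date(2020, 3, 15))
partial = policy_value(date(2020, 3, 20), start=date(2020, 3, 15),
                       treated_population=500_000, total_population=1_000_000)
```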

#### Epidemiological data

We collected information on cumulative confirmed cases, cumulative recoveries, cumulative deaths, active cases and any changes to domestic COVID-19-testing regimes, such as case definitions or testing methodology. For our regression analysis (Fig. 2), we use active cases when they are available (China and South Korea) and cumulative confirmed cases otherwise. We document quality-control steps in the Supplementary Information. For China and South Korea, we acquired more granular data than the data hosted on the Johns Hopkins University (JHU) interactive dashboard46; we confirm that the numbers of confirmed cases closely match between the two data sources (see Extended Data Fig. 1). To conduct the econometric analysis, we merge the epidemiological and policy data to form a single dataset for each country.

### Econometric analysis

#### Reduced-form approach

The reduced-form econometric approach that we apply here is a ‘top-down’ approach that describes the behaviour of aggregate outcomes y in data (in this case, infection rates). This approach can identify plausibly causal effects5,14 induced by exogenous changes in independent policy variables z (for example, school closure) without explicitly describing all underlying mechanisms that link z to y, without observing intermediary variables x (for example, behaviour) that might link z to y, and without modelling other determinants of y unrelated to z (for example, demographics), denoted w. Let f(·) describe a complex and unobserved process that generates infection rates y:

$$y=f({x}_{1}({z}_{1},\,\ldots ,{z}_{K}),\,\ldots ,{x}_{N}({z}_{1},\,\ldots ,{z}_{K}),{w}_{1},\,\ldots ,{w}_{M})$$
(1)

Process-based epidemiological models aim to capture elements of f(·) explicitly, and then simulate how changes in z, x or w affect y. This approach is particularly important and useful in forward-looking simulations in which future conditions are likely to be different than historical conditions. However, a challenge faced by this approach is that we may not know the full structure of f(·), for example, if a pathogen is new and many key biological and societal parameters remain uncertain. We may not know the effect that large-scale policy (z) will have on behaviour (x(z)) or how this behaviour change will affect infection rates (f(·)).

Alternatively, one can differentiate equation (1) with respect to the kth policy zk:

$$\frac{\partial y}{\partial {z}_{k}}=\sum _{j=1}^{N}\frac{\partial y}{\partial {x}_{j}}\frac{\partial {x}_{j}}{\partial {z}_{k}}$$
(2)

which describes how changes in the policy affect infections through all N potential pathways mediated by x1, ..., xN. Usefully, for a fixed population observed over time, empirically estimating an average value of the local derivative on the left side in equation (2) does not depend on explicit knowledge of w. If we can observe y and z directly and use their changes over time to estimate $$\frac{\partial y}{\partial {z}_{k}}$$ from data, then intermediate variables x need be neither observed nor modelled. The reduced-form econometric approach5,14 thus attempts to measure $$\frac{\partial y}{\partial {z}_{k}}$$ directly, exploiting exogenous variation in policies z.

#### Model

Active infections grow exponentially during the initial phase of an epidemic, when the proportion of immune individuals in a population is near zero. Assuming a simple susceptible–infected–recovered (SIR) disease model11, the growth in infections during the early period is

$$\frac{{\rm{d}}{I}_{t}}{{\rm{d}}t}=({S}_{t}\beta -\gamma ){I}_{t}\mathop{=}\limits_{{S}_{t}\to 1}(\beta -\gamma ){I}_{t},$$
(3)

where It is the number of infected individuals at time t, β is the transmission rate (new infections per day per infected individual), γ is the removal rate (proportion of infected individuals recovering or dying each day) and S is the fraction of the population susceptible to the disease. The second equality holds in the limit S → 1, which describes conditions during the beginning of the COVID-19 pandemic. The solution to this ordinary differential equation is the exponential function

$$\frac{{I}_{{t}_{2}}}{{I}_{{t}_{1}}}={{\rm{e}}}^{g({t}_{2}-{t}_{1})},$$
(4)

where $${I}_{{t}_{1}}$$ is the initial condition and g = β − γ is the exponential growth rate. Taking the natural logarithm and rearranging, we have

$$\log ({I}_{{t}_{2}})-\log ({I}_{{t}_{1}})=g({t}_{2}-{t}_{1}).$$
(5)

Anti-contagion policies are designed to alter g, through changes to β, by reducing contact between susceptible and infected individuals. Holding the time step between observations fixed at one day (t2 − t1 = 1), we thus model g as a time-varying outcome that is a linear function of a time-varying policy

$${g}_{t}=\log ({I}_{t})-\log ({I}_{t-1})={\theta }_{0}+\theta \,{{\rm{policy}}}_{t}+{\varepsilon }_{t},$$
(6)

where θ0 is the average growth rate without a policy, policyt is a binary variable describing whether a policy is deployed at time t, and θ is the average effect of the policy on growth rate g over all periods subsequent to the introduction of the policy, thereby encompassing any lagged effects of policies. εt is a mean-zero disturbance term that captures interperiod changes not described by policyt. Using this approach, infections each day are treated as the initial conditions for integrating equation (4) through to the following day.

We compute the first differences $$\log ({I}_{t})-\log ({I}_{t-1})$$ using active infections in countries for which they are available; otherwise we use cumulative infections, noting that the two are almost identical during this early period (except in China, where we use active infections). We then match these data to policy variables that we construct using the novel datasets that we assembled and apply a reduced-form approach to estimate a version of equation (6), although the actual expression has additional terms detailed below.
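The single-policy version of equation (6) can be sketched as follows, assuming NumPy and a made-up, noise-free infection series. The variable names and growth rates are illustrative assumptions, not the study's data or code.

```python
import numpy as np

# Synthetic (made-up) series: growth of 0.4 per day for 10 days,
# then 0.1 per day once a hypothetical policy is in place.
days = 20
policy = np.array([0] * 10 + [1] * 10, dtype=float)   # policy_t for t = 1..20
true_g = 0.4 - 0.3 * policy
log_I = np.log(10.0) + np.concatenate([[0.0], np.cumsum(true_g)])  # log(I_0..I_20)
infections = np.exp(log_I)

# Dependent variable: g_t = log(I_t) - log(I_{t-1})
g = np.diff(np.log(infections))

# Ordinary least squares on an intercept (theta_0) and the policy dummy (theta)
X = np.column_stack([np.ones(days), policy])
theta0_hat, theta_hat = np.linalg.lstsq(X, g, rcond=None)[0]
```

Because the synthetic series contains no noise, the fit recovers the no-policy growth rate and the policy effect used to generate it.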

#### Estimation

To estimate a multi-variable version of equation (6), we estimate a separate regression for each country c. Observations are for subnational units indexed by i observed for each day t. Because not all localities began testing for COVID-19 on the same date, these samples are unbalanced panels. To ensure data quality, we restrict our analysis to localities after they have reported at least ten cumulative infections.

A necessary condition for unbiased estimates is that the timing of policy deployment is independent of natural infection growth rates14, a mathematical condition that should be true in the context of a new epidemic. In established epidemiological models, including the standard SIR model above, early rates of infection within a susceptible population are characterized by constant exponential growth. This phenomenon is well understood theoretically13,27,47, has been repeatedly documented in past epidemics28,29,48 as well as the current COVID-19 pandemic11,12, and implies constant infection growth rates in the absence of policy intervention. Thus, we treat changes in infection growth rates as conditionally independent of policy deployments since the correlation between a constant variable and any other variable is zero in expectation.

We estimate a multiple regression version of equation (6) using ordinary least squares. We include a vector of subnational unit fixed effects θ0 (that is, varying intercepts captured as coefficients to dummy variables) to account for all time-invariant factors that affect the local growth rate of infections, such as differences in demographics, socioeconomic status, culture and health systems5. We include a vector of day-of-week fixed effects δ to account for weekly patterns in the growth rate of infections that are common across locations within a country; however, in China, we omit day-of-week effects because we find no evidence they are present in the data, perhaps because the outbreak of COVID-19 began during a national holiday and workers never returned to work. We also include a separate single-day dummy variable each time there is an abrupt change in the availability of COVID-19 testing or a change in the procedure to diagnose positive cases. Such changes generally manifest as a discontinuous jump in infections and a re-scaling of subsequent infection rates (for example, see China in Fig. 1), effects that are flexibly absorbed by a single-day dummy variable because the dependent variable is the first difference of the logarithm of infections. We denote the vector of these effects μ.

Lastly, we include a vector (length Pc) of country-specific policy variables (policy) for each location and day. These policy variables take on values between 0 and 1 (inclusive) where 0 indicates no policy action and 1 indicates a policy is fully enacted. In cases in which a policy variable captures the effects of collections of policies (for example, museum closures and library closures), a policy variable is computed for each, then they are averaged, so the coefficient on this type of variable is interpreted as the effect if all policies in the collection are fully enacted. There are also instances in which multiple policies are deployed on the same date in numerous locations, in which case we group policies that have similar objectives (for example, suspension of transit and travel ban, or cancelling of events and no gathering) and keep other policies separate (that is, business closure and school closure). The grouping of policies is useful for reducing the number of estimated parameters in our limited sample of data, allowing us to examine the impact of subsets of policies (Fig. 2c). However, policy grouping does not make a substantial difference to the estimated effect of all policies combined nor to the effect of actual policies, which we demonstrate by estimating a regression model in which no policies are grouped and these values are recalculated (Extended Data Fig. 6a and Supplementary Table 4).

In some cases (for Italy and the United States), policy data are available at a more spatially granular level than infection data (for example, city policies and state-level infections in the United States). In these cases, we code binary policy variables at the more granular level and use population weights to aggregate them to the level of the infection data. Thus, policy variables may take on continuous values between 0 and 1, with a value of 1 indicating that the policy is fully enacted for the entire population. Given the limited quantity of data currently available, we use a parsimonious model that assumes the effects of policies on infection growth rates are approximately linear and additively separable. However, future studies that comprise more data may be able to identify important nonlinearities or interactions between policies.

For each country, our general multiple regression model is thus

$${g}_{cit}=\log ({I}_{cit})-\log ({I}_{ci,t-1})={\theta }_{0,ci}+{\delta }_{ct}+{\mu }_{cit}+\sum _{p=1}^{{P}_{c}}({\theta }_{pc}\,{{\rm{policy}}}_{pcit})+{\varepsilon }_{cit}$$
(7)

where observations are indexed by country c, subnational unit i and day t. The parameters of interest are the country-by-policy specific coefficients θpc. We display the estimated residuals εcit in Extended Data Fig. 10, which are mean zero but not strictly normal (normality is not a requirement of our modelling and inference strategy), and we estimate uncertainty over all parameters by calculating our standard errors robust to error clustering at the day level14. This approach allows the covariance in εcit across different locations within a country, observed on the same day, to be non-zero. Such clustering is important in this context because idiosyncratic events within a country, such as a holiday or a backlog in testing laboratories, could generate nonuniform country-wide changes in infection growth for individual days that are not explicitly captured in our model. Thus, this approach nonparametrically accounts for both arbitrary forms of spatial autocorrelation and systematic misreporting in regions of a country on any given day (we note that it generates larger estimates for uncertainty than clustering by i). When we report the effect of all policies combined (Fig. 2b), we are reporting the sum of coefficient estimates for all policies $$\sum _{p=1}^{{P}_{c}}{\theta }_{pc}$$, accounting for the covariance of errors in these estimates when computing the uncertainty of this sum.
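A simplified version of equation (7) with day-level clustering can be sketched with NumPy alone. This is a sketch under stated assumptions rather than the authors' implementation: the panel is synthetic, there is a single policy, the day-of-week effects δ and testing-regime dummies μ are omitted, and no small-sample correction is applied to the cluster-robust covariance.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_days = 5, 30

unit = np.repeat(np.arange(n_units), n_days)     # subnational unit index i
day = np.tile(np.arange(n_days), n_units)        # day index t
# Hypothetical staggered policy: unit i adopts on day 10 + 2*i
policy = (day >= 10 + 2 * unit).astype(float)

unit_effect = np.linspace(0.30, 0.40, n_units)   # theta_{0,i}
day_shock = rng.normal(0.0, 0.02, n_days)        # shock shared by all units on a day
g = (unit_effect[unit] - 0.20 * policy + day_shock[day]
     + rng.normal(0.0, 0.01, unit.size))

# Design matrix: one dummy per unit (fixed effects) plus the policy variable
X = np.column_stack([(unit[:, None] == np.arange(n_units)).astype(float), policy])
beta = np.linalg.solve(X.T @ X, X.T @ g)
resid = g - X @ beta

# Cluster-robust (sandwich) covariance with clusters defined by day
bread = np.linalg.inv(X.T @ X)
meat = np.zeros_like(bread)
for d in range(n_days):
    score = X[day == d].T @ resid[day == d]      # sum of x * residual within the cluster
    meat += np.outer(score, score)
cov = bread @ meat @ bread
theta_hat, theta_se = beta[-1], np.sqrt(cov[-1, -1])
```

Summing the scores within each day before forming the outer product is what allows residuals from different locations on the same day to be correlated.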

Note that our estimates of θ and θ0 in equation (7) are robust to systematic underreporting of infections, a major concern in the ongoing pandemic, due to the construction of our dependent variable. This remains true even if different localities have different rates of underreporting, so long as the rate of underreporting is relatively constant. To see this, note that if each locality i has a medical system that reports only a fraction ψi of infections such that we observe $${\tilde{I}}_{it}={\psi }_{i}{I}_{it}$$ rather than actual infections Iit, then the left side of equation (7) will be

$$\begin{array}{c}\log ({\tilde{I}}_{it})-\log ({\tilde{I}}_{i,t-1})=\log ({\psi }_{i}{I}_{it})-\log ({\psi }_{i}{I}_{i,t-1})\\ =\log ({\psi }_{i})-\log ({\psi }_{i})+\log ({I}_{it})-\log ({I}_{i,t-1})\\ =\log ({I}_{it})-\log ({I}_{i,t-1})={g}_{t}\end{array}$$

and is therefore unaffected by location-specific and time-invariant underreporting. Thus systematic underreporting does not affect our estimates for the effects of policy θ. As discussed above, potential biases associated with non-systematic underreporting that results from documented changes in testing regimes over space and time are absorbed by region–day-specific effects μ.

However, if the rate of underreporting within a locality is changing day-to-day, this could bias infection growth rates. We estimate the magnitude of this bias (Extended Data Fig. 2), and verify that it is quantitatively small. Specifically, if $${\tilde{I}}_{it}={\psi }_{it}{I}_{it}$$ where ψit changes day-to-day, then

$$\log ({\tilde{I}}_{it})-\log ({\tilde{I}}_{i,t-1})=\log ({\psi }_{it})-\log ({\psi }_{i,t-1})+{g}_{t}$$
(8)

where log(ψit) − log(ψi,t − 1) is the day-over-day growth rate of the case-detection probability. Disease surveillance has evolved slowly in some locations as governments gradually expand testing, which would cause ψit to change over time, but these changes in testing capacity do not appear to significantly alter our estimates of infection growth rates. In Extended Data Fig. 2, we show one set of epidemiological estimates23 for log(ψit) − log(ψi,t − 1). Despite random day-to-day variations, which do not cause systematic biases in our point estimates, the mean of log(ψit) − log(ψi,t − 1) is consistently small across the different countries: 0.05 in China, 0.064 in Iran, 0.019 in South Korea, −0.058 in France, 0.031 in Italy and 0.049 in the United States. The average of these estimates is 0.026, potentially accounting for 7.3% of our global average estimate for the no-policy infection growth rate (0.36). These estimates of log(ψit) − log(ψi,t − 1) also do not display strong temporal trends, alleviating concerns that time-varying underreporting generates sizable biases in our estimated effects of anti-contagion policies.
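The two underreporting cases can be checked numerically. The sketch below uses made-up infection counts and hypothetical reporting fractions; it only verifies the algebra of the constant-ψ derivation and of equation (8).

```python
import numpy as np

true_infections = np.array([10.0, 15.0, 22.0, 35.0, 50.0])   # made-up I_t
g_true = np.diff(np.log(true_infections))

# Case 1: constant reporting fraction psi_i (hypothetical value)
observed_const = 0.2 * true_infections
g_const = np.diff(np.log(observed_const))
unchanged = np.allclose(g_const, g_true)       # log(psi_i) cancels in the difference

# Case 2: reporting fraction that changes day to day (hypothetical values)
psi_t = np.array([0.10, 0.11, 0.12, 0.13, 0.14])
observed_varying = psi_t * true_infections
g_varying = np.diff(np.log(observed_varying))
detection_growth = np.diff(np.log(psi_t))      # log(psi_t) - log(psi_{t-1})
bias = g_varying - g_true                      # equals detection_growth by equation (8)
```

In the first case the measured growth rates match the true ones; in the second, the gap between them is the growth rate of the case-detection probability.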

#### Transient dynamics

In China, we are able to examine the transient response of infection growth rates following policy deployment because only three policies were deployed early in a seven-week sample period during which we observe many cities simultaneously. This provides us with sufficient data to estimate the temporal structure of policy effects without imposing assumptions regarding this structure. To do this, we estimate a distributed-lag model that encodes policy parameters using weekly lags based on the date that each policy is first implemented in locality i. This means the effect of a policy implemented one week ago is allowed to differ arbitrarily from the effect of that same policy in the following week, and so on. These effects are then estimated simultaneously and are displayed in Fig. 2b, c (see also Supplementary Table 3). Such a distributed lag approach did not provide statistically meaningful insights in other countries using the currently available data because there were fewer administrative units and shorter periods of observation (that is, smaller samples), and more policies (that is, more parameters to estimate) in all other countries. Future studies may be able to successfully explore these dynamics outside of China.

As a robustness check, we examine whether excluding the transient response from the estimated effects of policy substantially alters our results. We do this by estimating a ‘fixed lag’ model, in which we assume that policies cannot influence infection growth rates for L days, recoding a policy variable at time t as zero if a policy was implemented fewer than L days before t. We reestimate equation (7) for each value of L and present results in Extended Data Fig. 5 and Supplementary Table 5.
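The recoding used in the fixed-lag model can be sketched as below for a binary policy. The helper `fixed_lag_policy` and the example dates are hypothetical and are shown only to make the recoding rule concrete.

```python
import numpy as np


def fixed_lag_policy(days, start_day, lag):
    """Binary policy variable that is zero until `lag` days after implementation."""
    days = np.asarray(days)
    # Coded as 0 when the policy began fewer than `lag` days before day t
    return (days >= start_day + lag).astype(float)


days = np.arange(10)
no_lag = fixed_lag_policy(days, start_day=3, lag=0)   # switches on at day 3
lag_2 = fixed_lag_policy(days, start_day=3, lag=2)    # switches on at day 5
```

Re-estimating the regression with the recoded variable for each value of L then shows how sensitive the estimated effect is to excluding the first L days after deployment.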

#### Alternative disease models

Our main empirical specification is motivated with an SIR model of disease contagion, which assumes zero latent period between exposure to COVID-19 and infectiousness. If we relax this assumption to allow for a latent period of infection, as in a susceptible–exposed–infected–recovered (SEIR) model, the growth of the outbreak is only asymptotically exponential11. Nonetheless, we demonstrate that SEIR dynamics have only a minor potential impact on the coefficients recovered by using our empirical approach in this context. In Extended Data Figs. 8, 9 we present results from a simulation exercise which uses equations (9)–(11), along with a generalization to the SEIR model11 to generate synthetic outbreaks (see Supplementary Methods section 2). We use these simulated data to test the ability of our statistical model (equation (7)) to recover both the unimpeded growth rate (Extended Data Fig. 8) as well as the impact of simulated policies on growth rates (Extended Data Fig. 9) when applied to data generated by SIR or SEIR dynamics over a wide range of epidemiological conditions.

### Projections

#### Daily growth rates of infections

To estimate the instantaneous daily growth rate of infections if policies were absent, we obtain fitted values from equation (7) and compute a predicted value for the dependent variable when all Pc policy variables are set to 0. Thus, these estimated growth rates $${\hat{g}}_{cit}^{{\rm{no}}\,{\rm{policy}}}$$ capture the effect of all locality-specific factors on the growth rate of infections (for example, demographics), day-of-week effects, and adjustments based on the way in which infection cases are reported. This counterfactual does not account for changes in information that are triggered by policy deployment, as those should be considered a pathway through which policies affect outcomes, as discussed in the main text. Additionally, the ‘no policy’ counterfactual does not model previously unobserved changes in behaviour that might occur if fundamentally new behaviours emerge even in the absence of government intervention. When we report an average no-policy growth rate of infections (Fig. 2a), it is the average value of these predictions for all observations in the original sample. Location-and-day-specific counterfactual predictions $$({\hat{g}}_{cit}^{{\rm{no}}\,{\rm{policy}}})$$, accounting for the covariance of errors in estimated parameters, are shown as red markers in Fig. 3.

#### Cumulative infections

To provide a sense of scale for the estimated cumulative benefits of effects shown in Fig. 3, we link our reduced-form empirical estimates to the key structures in a simple SIR system and simulate this dynamical system over the course of our sample. The system is defined as the following:

$$\frac{{\rm{d}}{S}_{t}}{{\rm{d}}t}=-{\beta }_{t}{S}_{t}{I}_{t}$$
(9)
$$\frac{{\rm{d}}{I}_{t}}{{\rm{d}}t}=({\beta }_{t}{S}_{t}-\gamma ){I}_{t}$$
(10)
$$\frac{{\rm{d}}{R}_{t}}{{\rm{d}}t}=\gamma {I}_{t}$$
(11)

where St is the susceptible population and Rt is the removed population. Here βt is a time-evolving parameter, determined by our empirical estimates as described below. Accounting for changes in S becomes increasingly important as the size of cumulative infections (It + Rt) becomes a substantial fraction of the local subnational population, which occurs in some no-policy scenarios. Our reduced-form analysis provides estimates for the growth rate of active infections $$(\hat{g})$$ for each locality and day, in a regime where St ≈ 1. Thus we know

$$\frac{{\rm{d}}{I}_{t}}{{\rm{d}}t}/{I}_{t}{{\textstyle |}}_{S\approx 1}={\hat{g}}_{t}={\beta }_{t}-\gamma$$
(12)

but we do not know the values of either of the two right-side terms, which are required to simulate equations (9)–(11). To estimate γ, we note that the left-side term of equation (11) is

$$\frac{{\rm{d}}{R}_{t}}{{\rm{d}}t}\approx \frac{{\rm{d}}}{{\rm{d}}t}({\rm{cumulative\; recoveries}}+{\rm{cumulative\; deaths}})$$

which we can observe in our data for China and South Korea. Computing first differences in these two variables (to differentiate with respect to time), summing them, and then dividing by active cases gives us estimates of γ (medians: China = 0.11, South Korea = 0.05). These values differ slightly from the classical SIR interpretation of γ, because in the public data that we are able to obtain, individuals are coded as ‘recovered’ when they no longer test positive for COVID-19, whereas in the classical SIR model this occurs when they are no longer infectious. We adopt the average of these two medians, setting γ = 0.08. We use medians rather than simple averages because low values for It induce a long right tail in daily estimates of γ and medians are less vulnerable to this distortion. We then use our empirically based reduced-form estimates of $$\hat{g}$$ (both with and without policy) combined with equations (9)–(11) to project total cumulative cases in all countries (Fig. 4). We simulate infections and cases for each administrative unit in our sample beginning on the first day for which we observe 10 or more cases (for that unit) using a time step of 4 h. Because we observe confirmed cases rather than total infections, we seed each simulation by adjusting observed It on the first day using country-specific estimates of case detection rates. We adjust existing estimates of case underreporting23 to further account for asymptomatic infections assuming an infection–fatality ratio of 0.75%25. We assume Rt = 0 on the first day. To maintain consistency with the reported data, we report our output in confirmed cases by multiplying our simulated It + Rt values by the aforementioned proportion of infections confirmed. We estimate uncertainty by resampling from the estimated variance–covariance matrix of all regression parameters. In Extended Data Fig. 7, we show sensitivity of this simulation to the estimated value of γ as well as to the use of an SEIR framework. 
In Supplementary Table 6, we show sensitivity of this simulation to the assumed infection–fatality ratio (see Supplementary Methods section 1).
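The simulation step can be sketched as follows, using γ = 0.08, a 4 h time step and βt recovered from equation (12). The forward-Euler scheme, the function name `simulate_sir`, the initial infected fraction and the two growth-rate paths are assumptions made for illustration; the integration scheme used in the study is not specified here, and the adjustment for case-detection rates is omitted.

```python
import numpy as np

GAMMA = 0.08          # removal rate adopted in the text
STEPS_PER_DAY = 6     # 4-hour time step


def simulate_sir(g_hat_by_day, i0, gamma=GAMMA, steps_per_day=STEPS_PER_DAY):
    """Forward-Euler integration of equations (9)-(11), with S, I, R as fractions.

    g_hat_by_day: daily growth rates; beta_t = g_hat_t + gamma (equation (12)).
    i0: initial infected fraction; R starts at 0.
    Returns the cumulative infected fraction (I + R) at the end of each day.
    """
    dt = 1.0 / steps_per_day
    s, i, r = 1.0 - i0, i0, 0.0
    cumulative = []
    for g_hat in g_hat_by_day:
        beta = g_hat + gamma
        for _ in range(steps_per_day):
            new_inf = beta * s * i * dt      # flow from S to I
            removed = gamma * i * dt         # flow from I to R
            s, i, r = s - new_inf, i + new_inf - removed, r + removed
        cumulative.append(i + r)
    return np.array(cumulative)


# Hypothetical growth paths: constant 0.35 per day versus a drop to 0.05 after day 10.
days = 30
no_policy = simulate_sir(np.full(days, 0.35), i0=1e-6)
with_policy = simulate_sir(np.where(np.arange(days) < 10, 0.35, 0.05), i0=1e-6)
```

The difference between the two cumulative curves corresponds to the infections prevented or delayed in this toy scenario.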

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.

## Data availability

The datasets generated and/or analysed during the current study are available at https://github.com/bolliger32/gpl-covid. Future updates and/or extensions to data or code will be listed at http://www.globalpolicy.science/covid19.

## Code availability

For easier replication, we have created a CodeOcean ‘capsule’, which contains a pre-built computing environment in addition to the source code and data. This is available at https://codeocean.com/capsule/1887579/tree/v1. Future updates and/or extensions to data or code will be listed at http://www.globalpolicy.science/covid19.

## References

1. Wu, F. et al. A new coronavirus associated with human respiratory disease in China. Nature 579, 265–269 (2020).

2. Ferguson, N. M. et al. Report 9: Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand. Technical Report (Imperial College London, 2020).

3. Chinazzi, M. et al. The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak. Science 368, 395–400 (2020).

4. Kraemer, M. U. G. et al. The effect of human mobility and control measures on the COVID-19 epidemic in China. Science 368, 493–497 (2020).

5. Greene, W. H. Econometric Analysis (Prentice Hall, 2003).

6. Romer, C. D. & Romer, D. H. The macroeconomic effects of tax changes: estimates based on a new measure of fiscal shocks. Am. Econ. Rev. 100, 763–801 (2010).

7. WHO. WHO Coronavirus Disease (COVID-19) Dashboard. https://covid19.who.int (accessed 13 April 2020).

8. Li, R. et al. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science 368, 489–493 (2020).

9. Tang, B. et al. Estimation of the transmission risk of the 2019-nCoV and its implication for public health interventions. J. Clin. Med. 9, 462 (2020).

10. Hatchett, R. J., Mecher, C. E. & Lipsitch, M. Public health interventions and epidemic intensity during the 1918 influenza pandemic. Proc. Natl Acad. Sci. USA 104, 7582–7587 (2007).

11. Ma, J. Estimating epidemic exponential growth rate and basic reproduction number. Infect. Dis. Model. 5, 129–141 (2020).

12. Muniz-Rodriguez, K. et al. Doubling time of the COVID-19 epidemic by province, China. Emerg. Infect. Dis. 26, https://doi.org/10.3201/eid2608.200219 (2020).

13. Chowell, G., Sattenspiel, L., Bansal, S. & Viboud, C. Mathematical models to characterize early epidemic growth: a review. Phys. Life Rev. 18, 66–97 (2016).

14. Angrist, J. D. & Pischke, J.-S. Mostly Harmless Econometrics: An Empiricist’s Companion (Princeton Univ. Press, 2008).

15. Burke, M., Hsiang, S. M. & Miguel, E. Global non-linear effect of temperature on economic production. Nature 527, 235–239 (2015).

16. Kandula, S. et al. Evaluation of mechanistic and statistical methods in forecasting influenza-like illness. J. R. Soc. Interface 15, 20180174 (2018).

17. Wu, J. T. et al. Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan, China. Nat. Med. 26, 506–510 (2020).

18. Li, Q. et al. Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia. N. Engl. J. Med. 382, 1199–1207 (2020).

19. Tsang, T. K. et al. Impact of changing case definitions for COVID-19 on the epidemic curve and transmission parameters in mainland China. Preprint at medRxiv https://doi.org/10.1101/2020.03.23.20041319 (2020).

20. Wuhan pneumonia: 30 days from outbreak to out of control [in Chinese]. BBC News https://www.bbc.com/zhongwen/simp/chinese-news-51290945 (2020).

21. Fisman, D., Khoo, E. & Tuite, A. Early epidemic dynamics of the West African 2014 Ebola outbreak: estimates derived with a simple two-parameter model. PLoS Curr. 6 https://doi.org/10.1371/currents.outbreaks.89c0d3783f36958d96ebbae97348d571 (2014).

22. Maier, B. F. & Brockmann, D. Effective containment explains subexponential growth in recent confirmed COVID-19 cases in China. Science 368, 742–746 (2020).

23. Russell, T. W. et al. Using a delay-adjusted case fatality ratio to estimate under-reporting. Technical Report (Centre for the Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, 2020).

24. Bootsma, M. C. J. & Ferguson, N. M. The effect of public health measures on the 1918 influenza pandemic in U.S. cities. Proc. Natl Acad. Sci. USA 104, 7588–7593 (2007).

25. Meyerowitz-Katz, G. & Merone, L. A systematic review and meta-analysis of published research data on COVID-19 infection-fatality rates. Preprint at medRxiv https://doi.org/10.1101/2020.05.03.20089854 (2020).

26. Kucharski, A. J. et al. Early dynamics of transmission and control of COVID-19: a mathematical modelling study. Lancet Infect. Dis. 20, 553–558 (2020).

27. Anderson, R. M. & May, R. M. Infectious Diseases of Humans: Dynamics and Control (Oxford Univ. Press, 1992).

28. Nishiura, H., Chowell, G., Safan, M. & Castillo-Chavez, C. Pros and cons of estimating the reproduction number from early epidemic growth rate of influenza A (H1N1) 2009. Theor. Biol. Med. Model. 7, 1 (2010).

29. WHO Ebola Response Team. Ebola virus disease in West Africa—the first 9 months of the epidemic and forward projections. N. Engl. J. Med. 371, 1481–1495 (2014).

30. Flaxman, S. et al. Report 13: Estimating the number of infections and the impact of non-pharmaceutical interventions on COVID-19 in 11 European countries. Technical Report (Imperial College London, 2020).

31. Lourenço, J. et al. Fundamental principles of epidemic spread highlight the immediate need for large-scale serological surveys to assess the stage of the SARS-CoV-2 epidemic. Preprint at medRxiv https://doi.org/10.1101/2020.03.24.20042291 (2020).

32. Préparation au Risque épidémique COVID-19 [in French]. https://solidarites-sante.gouv.fr/IMG/pdf/guide_methodologique_covid-19-2.pdf (2020).

33. Tian, H. et al. An investigation of transmission control measures during the first 50 days of the COVID-19 epidemic in China. Science 368, 638–642 (2020).

34. Lin, J. COVID-19/2019-nCoV Time Series Infection Data Warehouse. https://github.com/BlankerL/DXY-COVID-19-Data (2020).

35. COVID-19 Pandemic Lockdown in Hubei. https://en.wikipedia.org/w/index.php?title=COVID-19_pandemic_lockdown_in_Hubei (Wikipedia, 2020).

36. COVID-19 Pandemic in Iran. https://en.wikipedia.org/w/index.php?title=COVID-19_pandemic_in_Iran (Wikipedia, 2020).

37. Kantis, C., Keirnan, S. & Bardi, J. S. Timeline of the coronavirus. Think Global Health https://www.thinkglobalhealth.org/article/updated-timeline-coronavirus (2020).

38. Presidenza del Consiglio dei Ministri Dipartimento della Protezione Civile. Dati COVID-19 Italia. https://github.com/pcm-dpc/COVID-19 (2020).

39. Presidenza del Consiglio dei Ministri Dipartimento della Protezione Civile. Coronavirus Emergency [in Italian]. http://www.protezionecivile.it/web/guest/home (Governo Italiano, 2020).

40. COVID-19 Pandemic Lockdown in Italy. https://en.wikipedia.org/w/index.php?title=COVID-19_pandemic_lockdown_in_Italy (Wikipedia, 2020).

41. Roussel, O. Fr-SARS-CoV-2. https://www.data.gouv.fr/en/datasets/fr-sars-cov-2 (2020).

42. Sante Publique France. Coronavirus (COVID-19). https://www.santepubliquefrance.fr/ (2020).

43. Agence Régionale de Santé. Agir pour la santé de tous. https://www.ars.sante.fr/ (2020).

44. COVID-19 Pandemic in France. https://en.wikipedia.org/w/index.php?title=COVID-19_pandemic_in_France (Wikipedia, 2020).

45. Coronavirus Locations: COVID-19 Map by County and State. https://usafacts.org/visualizations/coronavirus-covid-19-spread-map/ (USA FACTS, 2020).

46. JHU CSSE. COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. https://github.com/CSSEGISandData/COVID-19 (2020).

47. Kermack, W. O. & McKendrick, A. G. A contribution to the mathematical theory of epidemics. Proc. R. Soc. Lond. A 115, 700–721 (1927).

48. Mills, C. E., Robins, J. M. & Lipsitch, M. Transmissibility of 1918 pandemic influenza. Nature 432, 904–906 (2004).

## Acknowledgements

We thank B. Chen for her role in initiating this work and A. Feller for his feedback. S.A.-P., E.K., P.L. and J.T. are supported by a gift from the Tuaropaki Trust. T.C. is supported by an AI for Earth grant from National Geographic and Microsoft. D.A., A.H. and I.B. are supported through joint collaborations with the Climate Impact Lab. K.B. is supported by the Royal Society Te Apārangi Rutherford Postdoctoral Fellowship. H.D. and E.R. are supported by the National Science Foundation Graduate Research Fellowship under grants DGE 1106400 and 1752814, respectively. Opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not reflect the views of supporting organizations.

## Author information

### Contributions

S.H. conceived and led the study. All authors designed analysis, interpreted results, designed figures and wrote the paper. D.A., S.A.-P., K.B., I.B., T.C., H.D., L.Y.H., A.H., E.K., P.L., J.L., E.R., J.T. and T.W. contributed equally and are listed alphabetically. China: L.Y.H. and T.W. collected health data, L.Y.H., T.W. and J.T. collected policy data, L.Y.H. cleaned data. South Korea: J.L. collected health data, T.C. and J.L. collected policy data, T.C. cleaned data. Italy: D.A. collected health data, P.L. collected policy data, D.A. cleaned data. France: S.A.-P. collected health data, S.A.-P., J.T. and H.D. collected policy data, S.A.-P. cleaned data. Iran: A.H. collected health data and policy data, A.H. and D.A. cleaned data. United States: E.R. and K.B. collected health data, E.K. collected policy data, E.R., D.A. and K.B. cleaned data. I.B. collected geographical and population data for all countries. S.H. designed the econometric model. S.H., S.A.-P. and J.T. conducted econometric analysis for all countries. K.B., I.B., A.H., E.R. and E.K. designed and implemented epidemiological models and projections. S.A.-P., K.B., I.B., J.T., A.H. and E.K. designed and implemented robustness checks. H.D. created Fig. 1, T.C. created Fig. 2, J.T. created Fig. 3, E.R. created Fig. 4, D.A. created Supplementary Table 1, L.Y.H. and J.L. created Supplementary Table 2, J.T. created Supplementary Tables 3, 4, S.A.-P. and J.T. created Supplementary Table 5, K.B. created Supplementary Table 6, L.Y.H. created Extended Data Figs. 1, 2, S.A.-P. created Extended Data Figs. 3–5, J.T. created Extended Data Fig. 6, K.B. created Extended Data Fig. 7, I.B. created Extended Data Figs. 8, 9, J.T. created Extended Data Fig. 10. D.A., I.B. and P.L. managed policy data collection and quality control. I.B. and T.C. managed the code repository. I.B. and P.L. ran project management. E.K., T.W., J.T. and P.L. managed literature review. L.Y.H., E.K. and T.W. managed references. P.L. managed Extended Data Figs. 1–10 and Supplementary Information.

### Corresponding author

Correspondence to Solomon Hsiang.

## Ethics declarations

### Competing interests

The authors declare no competing interests.

Peer review information: Nature thanks Andrew Jones, Jeffrey Shaman and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Extended data figures and tables

### Extended Data Fig. 1 Validating disaggregated epidemiological data against aggregated data from the JHU Center for Systems Science and Engineering.

Comparison of cumulative confirmed cases from a subset of regions in our collated epidemiological dataset to the same statistics from the 2019 Novel Coronavirus COVID-19 (2019-nCoV) Data Repository by the Johns Hopkins Center for Systems Science and Engineering (JHU CSSE)46. We conducted this comparison for Chinese provinces and South Korea, for which the data we collected were from local administrative units that are more spatially granular than the data in the JHU CSSE database. a, In China, we aggregated our city-level data to the province level. b, In South Korea, we aggregated province-level data up to the country level. Small discrepancies, especially in later periods of the outbreak, are generally due to imported cases (international or domestic) that are present in national statistics but that we do not assign to particular cities (in China) or provinces (in Korea).
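The aggregation check described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the key layout `(province, city, date)` and all case counts are hypothetical.

```python
from collections import defaultdict

def aggregate_to_province(city_cases):
    """Sum city-level cumulative cases up to (province, date) totals."""
    totals = defaultdict(int)
    for (province, _city, date), cases in city_cases.items():
        totals[(province, date)] += cases
    return dict(totals)

def discrepancies(aggregated, reference):
    """Reference total minus aggregated total, per (province, date)."""
    return {key: reference[key] - aggregated.get(key, 0) for key in reference}

# Hypothetical toy data: two cities in one province on one date.
city_cases = {
    ("Province A", "City 1", "2020-02-01"): 10,
    ("Province A", "City 2", "2020-02-01"): 5,
}
# Hypothetical province-level reference count (for example, from an aggregated source).
reference = {("Province A", "2020-02-01"): 17}

aggregated = aggregate_to_province(city_cases)
gaps = discrepancies(aggregated, reference)
```

In this toy example the positive gap stands in for cases, such as imported cases, that appear in the aggregated statistics but are not assigned to any city.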

### Extended Data Fig. 2 Estimated trends in case detection over time within each country.

Systematic trends in case detection may potentially bias estimates of no-policy infection growth rates (see equation (8)). We estimate the potential magnitude of this bias using data from the Centre for Mathematical Modelling of Infectious Diseases23. Markers indicate daily first differences in the logarithm of the fraction of estimated symptomatic cases reported for each country over time. The average value over time (solid line and value denoted in panel title) is the average growth rate of case detection, equal to the magnitude of the potential bias. For example, in the main text we estimate that the infection growth rate in the United States is 0.29 (Fig. 2a), of which growth in case detection might contribute 0.049 (this figure). Sample sizes are 75 in China, 41 in Iran, 40 in South Korea, 29 in France, 40 in Italy and 32 in the United States.
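As a rough sketch of the calculation in this caption, the average daily first difference of the logarithm of the reported fraction can be computed as below. The function name and the synthetic detection series are assumptions made for illustration; only the values 0.29 and 0.049 come from the caption.

```python
import math

def detection_growth_rate(reported_fraction):
    """Mean daily first difference of log(fraction of symptomatic cases reported)."""
    logs = [math.log(f) for f in reported_fraction]
    diffs = [later - earlier for earlier, later in zip(logs, logs[1:])]
    return sum(diffs) / len(diffs)

# Synthetic series (assumption): detection improving at exactly 0.049 per day.
fractions = [0.10 * math.exp(0.049 * day) for day in range(32)]
bias = detection_growth_rate(fractions)

# Illustrative reading of the United States example in the caption:
# an estimated growth rate of 0.29, of which detection growth might contribute 0.049.
growth_net_of_detection = 0.29 - bias
```

Because first differences telescope, the average depends only on the first and last values of the series, so a short or noisy series can move this estimate considerably.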

### Extended Data Fig. 3 Robustness of the estimated no-policy growth rate of infections and the combined effect of policies to withholding blocks of data from entire regions.

a, b, For each country, we reestimated equation (7) using real data k times, each time withholding one of the k first-level administrative regions (‘Adm1’, that is, state or province) in that country. Each grey circle is either the estimated no-policy growth rate (a) or the total effect of all policies combined (b), from one of these k regressions. Red and blue circles show estimates from the full sample, identical to the results presented in Fig. 2a, b, respectively. For each country panel, if a single region is influential, the estimated value when it is withheld from the sample will appear as an outlier. Samples that omit an influential region are highlighted with an open pink circle. As in Fig. 2b, we estimate a distributed lag model for China and display each of the estimated weekly lag effects (where the pink circle is the same ‘without Hubei’ sample for lags). The full sample includes 3,669 observations in China, 595 in South Korea, 2,898 in Italy, 548 in Iran, 270 in France and 1,238 in the United States.
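The leave-one-region-out procedure can be sketched as below. Equation (7) is not reproduced here; as a loudly stated simplification, the stand-in model is an ordinary least squares regression of daily growth on a single 0/1 policy indicator, and the regions and growth values are hypothetical.

```python
def fit_growth_model(rows):
    """OLS of growth on a 0/1 policy indicator.

    Returns (no_policy_growth, policy_effect), i.e. intercept and slope.
    This is a simplified stand-in for the full regression, not equation (7).
    """
    n = len(rows)
    mean_p = sum(p for p, _ in rows) / n
    mean_g = sum(g for _, g in rows) / n
    sxx = sum((p - mean_p) ** 2 for p, _ in rows)
    sxy = sum((p - mean_p) * (g - mean_g) for p, g in rows)
    slope = sxy / sxx
    return mean_g - slope * mean_p, slope

def leave_one_region_out(data):
    """Re-fit the model k times, each time withholding one of the k regions."""
    results = {}
    for held_out in data:
        rows = [row for region, obs in data.items()
                if region != held_out for row in obs]
        results[held_out] = fit_growth_model(rows)
    return results

# Hypothetical data: (policy indicator, daily growth rate) per region-day.
data = {
    "A": [(0, 0.30), (0, 0.30), (1, 0.10), (1, 0.10)],
    "B": [(0, 0.30), (0, 0.30), (1, 0.10), (1, 0.10)],
    "C": [(0, 0.60), (0, 0.60), (1, 0.10), (1, 0.10)],  # influential region
}
full_sample = fit_growth_model([row for obs in data.values() for row in obs])
withheld = leave_one_region_out(data)
```

In this toy setup, withholding region C moves the estimated no-policy growth rate furthest from the full-sample value, which is how an influential region would show up as an outlier in the figure.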

### Extended Data Fig. 4 Robustness of the estimated effects of individual policies to withholding blocks of data from entire regions.

Same as Extended Data Fig. 3, but for individual policies (analogous to Fig. 2c). WFH denotes work from home policies; opt denotes optional policies. In cases in which two regions are influential, a second region is highlighted with an open green circle. The full sample includes 3,669 observations in China, 595 in South Korea, 2,898 in Italy, 548 in Iran, 270 in France and 1,238 in the United States.

### Extended Data Fig. 5 Evidence to support models in which policies affect infection growth rates in the days following deployment.

Existing evidence has not demonstrated whether policies should affect infection growth rates in the days immediately after deployment. It is therefore not clear ex ante whether the policy variables in equation (7) should be encoded as ‘on’ immediately following a policy deployment. We estimate ‘fixed-lag’ models in which a fixed delay between the deployment of a policy and its effect is assumed (see Supplementary Methods section 3). If a delay model is more consistent with real-world infection dynamics, these fixed-lag models should recover larger estimates for the impact of policies and exhibit better model fit. a, R² values associated with fixed-lag lengths varying from 0 to 15 days. Centre values represent the R² value in our sample; whiskers are 95% confidence intervals computed through resampling with replacement. In-sample fit generally declines or remains unchanged if policies are assumed to have a delay longer than 4 days. b, Estimated effects for no lag (the model reported in the main text) and for fixed lags between 1 and 5 days. Centre values represent the point estimate; error bars are 95% confidence intervals. Estimates generally are unchanged or shrink towards zero (for example, home isolation in Iran), consistent with mis-coding of post-policy days as no-policy days. The sample size is 595 in South Korea, 2,898 in Italy, 548 in Iran, 270 in France and 1,238 in the United States.
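The fixed-lag encoding and the shrinkage it can produce may be illustrated with a small sketch. The helper names and the synthetic growth series are assumptions; the effect is estimated here as a simple difference in mean growth between 'on' and 'off' days, not with equation (7).

```python
def encode_policy(n_days, deploy_day, lag):
    """Policy indicator that switches on `lag` days after deployment."""
    return [1 if day >= deploy_day + lag else 0 for day in range(n_days)]

def estimated_effect(growth, policy):
    """Difference in mean growth between policy-on and policy-off days."""
    on = [g for g, p in zip(growth, policy) if p == 1]
    off = [g for g, p in zip(growth, policy) if p == 0]
    return sum(on) / len(on) - sum(off) / len(off)

# Synthetic series (assumption): the policy is deployed on day 10 and acts
# immediately, cutting daily growth from 0.30 to 0.10.
growth = [0.30] * 10 + [0.10] * 10
effects = {
    lag: estimated_effect(growth, encode_policy(len(growth), 10, lag))
    for lag in range(6)
}
```

When the true effect is immediate, each added day of assumed lag labels one more post-policy day as a no-policy day, so the estimated effect shrinks towards zero as the lag grows.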

### Extended Data Fig. 6 Estimated infection or hospitalization growth rates with actual anti-contagion policies and in a no-policy counterfactual scenario.

a, The estimated daily growth rates of active (China and South Korea) or cumulative (all others) infections based on the observed timing of all policy deployments within each subnational unit (blue) and in a scenario in which no policies were deployed (red). Identical to Fig. 3, but using an alternative disaggregated encoding of policies that does not group any policies into policy packages. The sample size is 3,669 in China, 595 in South Korea, 2,898 in Italy, 548 in Iran, 270 in France and 1,238 in the United States. b, Same as Fig. 3, but equation (7) is implemented for a single example administrative unit: Wuhan, China. The sample size is 46 observations. c, Same as Fig. 3, but using hospitalization data from France rather than cumulative cases (the French government stopped reporting cumulative cases after 25 March 2020). The sample size is 424 observations. For all panels, the difference between the with- and no-policy predictions is our estimated effect of actual anti-contagion policies on the growth rate of infections (or hospitalizations). The markers are daily estimates for each subnational administrative unit (vertical lines are 95% confidence intervals). Black circles are observed changes in log(infections) (or diamonds for log(hospitalizations)), averaged across observed administrative units.

### Extended Data Fig. 7 Sensitivity of estimated averted/delayed infections to the choice of γ and σ in an SIR/SEIR framework.

The sensitivity of total averted/delayed cases presented in Fig. 4 to alternative modelling assumptions. We compute total cases across the respective final days in our samples for the six countries presented in our analysis. The figure displays how these totals vary with eight values of γ (0.05–0.4) and four values of σ (0.2, 0.33, 0.5, ∞), where the final value of σ (∞) corresponds to the SIR model. a, The simulated total number of infections under no policy. b, Same as in a, but using actual policies. c, The difference between a and b, which is the total number of averted/delayed infections. d, Same as c, but on a logarithmic scale similar to Fig. 4 (a–c are on a linear scale, trimmed to show details). Figure 4 uses γ = 0.079, which we calculate using empirical recovery/death rates in countries for which we observed them (China and South Korea; see Methods). If we assume a 14-day delay between infected individuals becoming non-infectious and being reported as ‘recovered’ in the data, we would calculate γ = 0.18. Figure 4 assumes σ = ∞.
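A minimal sketch of this kind of sensitivity exercise is given below. It is a deterministic Euler-stepped SEIR model in which σ = ∞ collapses to SIR; the transmission rates, population size, horizon and step size are hypothetical choices for illustration, and only γ = 0.079 and the four σ values are taken from the caption.

```python
import math

def simulate_total_infections(beta, gamma, sigma, days=60,
                              population=1_000_000, initial_infected=10, dt=0.1):
    """Cumulative infections from a simple SEIR model; sigma = inf gives SIR."""
    s = float(population - initial_infected)
    e = 0.0
    i = float(initial_infected)
    for _ in range(int(round(days / dt))):
        new_exposed = beta * s * i / population * dt
        if math.isinf(sigma):
            new_infectious = new_exposed      # no latent period (SIR)
        else:
            new_infectious = sigma * e * dt   # exposed become infectious
        recoveries = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - recoveries
    return population - s

GAMMA = 0.079           # value used for Fig. 4, per the caption
BETA_NO_POLICY = 0.46   # hypothetical transmission rate without policies
BETA_POLICY = 0.20      # hypothetical transmission rate with policies

averted = {}
for sigma in (0.2, 0.33, 0.5, math.inf):
    no_policy = simulate_total_infections(BETA_NO_POLICY, GAMMA, sigma)
    with_policy = simulate_total_infections(BETA_POLICY, GAMMA, sigma)
    averted[sigma] = no_policy - with_policy
```

The same loop could be extended over a grid of γ values, although how the transmission rate should be adjusted as γ changes is a modelling choice that this sketch does not settle.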

### Extended Data Fig. 8 Simulating reduced-form estimates for the no-policy growth rate of infections for different population regimes and disease dynamics.

We examine the performance of reduced-form econometric estimators through simulations in which different underlying disease dynamics are assumed (see Supplementary Methods section 2). Each histogram shows the distribution of econometrically estimated values across 1,000 simulated outbreaks. Estimates are for the no-policy infection growth rate (analogous to Fig. 2a) when three different policies are deployed at random moments in time. The black line shows the correct value imposed on the simulation and the red histogram shows the distribution of estimates using the regression in equation (7), applied to data output from the simulation. The grey dashed line shows the mean of this distribution. The 12 subpanels describe the results when various values are assigned to the mean infectious period (γ⁻¹) and mean latency period (σ⁻¹) of the disease. σ = ∞ is equivalent to SIR disease dynamics. In each panel, Smin is the minimum susceptible fraction observed across all 1,000 45-day simulations shown in each panel. In the real datasets used in the main text, after correcting for country-specific underreporting, Smin across all units analysed is 0.72 and 95% of the analysed units finish with Smin > 0.91. Bias refers to the distance between the dashed grey and black line as a percentage of the true value. a, Simulations in near-ideal data conditions in which we observe active infections within a large population (such that the susceptible fraction of the population remains high during the sample period, similar to those in our data for Chongqing, China). b, Simulations in a non-ideal data scenario in which we are only able to observe cumulative infections in a small population (similar to those in our sample for Cremona, Italy).
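The logic of checking a reduced-form estimate against a known simulated truth can be sketched in a much simpler setting than the one in the figure. The example below is a single deterministic SIR run with no policies and no noise, so it is only an assumption-laden stand-in for the 1,000 stochastic simulations; the parameter values are hypothetical, and the growth rate is estimated as the mean daily first difference of log active infections rather than with equation (7).

```python
import math

def simulate_active_infections(beta, gamma, days=45, population=10_000_000,
                               initial_infected=10, steps_per_day=100):
    """Daily active infections and minimum susceptible fraction from an SIR run."""
    dt = 1.0 / steps_per_day
    s = float(population - initial_infected)
    i = float(initial_infected)
    active = [i]
    s_min = s / population
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - recoveries
        active.append(i)
        s_min = min(s_min, s / population)
    return active, s_min

def estimate_growth_rate(active):
    """Mean daily first difference of log(active infections)."""
    diffs = [math.log(b) - math.log(a) for a, b in zip(active, active[1:])]
    return sum(diffs) / len(diffs)

# Hypothetical parameters: early growth in an SIR model is roughly beta - gamma.
true_growth = 0.30 - 0.10
active, s_min = simulate_active_infections(beta=0.30, gamma=0.10)
estimate = estimate_growth_rate(active)
bias_percent = 100 * (estimate - true_growth) / true_growth
```

With a large population the susceptible fraction stays close to one over the 45 days, so the estimate stays close to the imposed value; shrinking `population` in this sketch is one way to see the bias grow as susceptibles are depleted.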

### Extended Data Fig. 9 Simulating reduced-form estimates for anti-contagion policy effects for different population regimes and assumed disease dynamics.

Same as Extended Data Fig. 8, but estimates are for the combined effect of three different policies (analogous to Fig. 2b) that are deployed at random moments in time. a, Simulations in near-ideal data conditions in which we observe active infections within a large population (such that the susceptible fraction of the population remains high during the sample period, similar to those in our data for Chongqing, China). b, Simulations in a non-ideal data scenario in which we are only able to observe cumulative infections in a small population (similar to those in our sample for Cremona, Italy).

### Extended Data Fig. 10 Regression residuals for the growth rates of COVID-19 by country.

These plots show the estimated residuals from equation (7) for each country-specific econometric model. Histograms (left) show the estimated unconditional probability density function. Quantile plots (right) show quantiles of the cumulative density function (y axis) plotted against the same quantiles for a normal distribution. For additional details, see Fig. 3 and the ‘Econometric analysis’ section of the Methods.
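The quantile plots described here pair sample quantiles of the residuals with the matching quantiles of a normal distribution. A minimal sketch of how such points could be computed is shown below; the function name, the standardization step, the plotting positions (k + 0.5)/n and the residual values are all assumptions, not details taken from the paper.

```python
from statistics import NormalDist, mean, stdev

def qq_points(residuals):
    """Pairs of (normal quantile, standardized sample quantile) for a Q-Q plot."""
    n = len(residuals)
    mu, sd = mean(residuals), stdev(residuals)
    sample = sorted((r - mu) / sd for r in residuals)
    standard_normal = NormalDist()
    theoretical = [standard_normal.inv_cdf((k + 0.5) / n) for k in range(n)]
    return list(zip(theoretical, sample))

# Hypothetical residuals for illustration only.
residuals = [-0.2, -0.1, 0.0, 0.1, 0.2]
points = qq_points(residuals)
```

If the residuals were close to normally distributed, these points would fall near the 45-degree line when plotted.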

## Supplementary information

### Supplementary Information

This file contains Supplementary Notes, Supplementary Methods and Supplementary Tables 1–6. Supplementary Notes: Detail policy deployment decisions in each of the countries analysed and describe the data acquisition and processing for the epidemiological and policy data. Both types of data are gathered from a variety of in-country data sources, including government public health websites, regional newspaper articles and Wikipedia crowd-sourced information. Supplementary Methods: Describe sensitivity analyses and simulations performed to verify the robustness of our model. These include the sensitivity of our regression model and counterfactual projections to varying epidemiological parameters, as well as the sensitivity of our estimates to alternative lag structures, withholding of data and differing policy groupings. Supplementary Tables: Detail 1) the number of anti-contagion policies; 2) Wuhan pre-intervention epidemiological data; 3) the main results estimating the effect of policy on growth rates; 4–5) estimates of policy effects using disaggregated and lagged versions of our main model; and 6) estimates of the initial infection growth rate and case doubling times.

## Rights and permissions

Hsiang, S., Allen, D., Annan-Phan, S. et al. The effect of large-scale anti-contagion policies on the COVID-19 pandemic. Nature 584, 262–267 (2020). https://doi.org/10.1038/s41586-020-2404-8

