Abstract
Governments around the world are responding to the coronavirus disease 2019 (COVID-19) pandemic^{1}, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), with unprecedented policies designed to slow the growth rate of infections. Many policies, such as closing schools and restricting populations to their homes, impose large and visible costs on society; however, their benefits cannot be directly observed and are currently understood only through process-based simulations^{2,3,4}. Here we compile data on 1,700 local, regional and national non-pharmaceutical interventions that were deployed in the ongoing pandemic across localities in China, South Korea, Italy, Iran, France and the United States. We then apply reduced-form econometric methods, commonly used to measure the effect of policies on economic growth^{5,6}, to empirically evaluate the effect that these anti-contagion policies have had on the growth rate of infections. In the absence of policy actions, we estimate that early infections of COVID-19 exhibit exponential growth rates of approximately 38% per day. We find that anti-contagion policies have significantly and substantially slowed this growth. Some policies have different effects on different populations, but we obtain consistent evidence that the policy packages that were deployed to reduce the rate of transmission achieved large, beneficial and measurable health outcomes. We estimate that across these 6 countries, interventions prevented or delayed on the order of 61 million confirmed cases, corresponding to averting approximately 495 million total infections. These findings may help to inform decisions regarding whether or when these policies should be deployed, intensified or lifted, and they can support policy-making in the more than 180 other countries in which COVID-19 has been reported^{7}.
Main
The COVID-19 pandemic is forcing societies worldwide to make consequential policy decisions with limited information. After containment of the initial outbreak failed, attention turned to implementing non-pharmaceutical interventions that are designed to slow the contagion of the virus. In general, these policies aim to decrease virus transmission by reducing contact among individuals within or between populations, such as by closing restaurants or restricting travel, thereby slowing the spread of COVID-19 to a manageable rate. These large-scale anti-contagion policies are informed by epidemiological simulations^{2,4,8,9} and a small number of natural experiments during past epidemics^{10}. However, the actual effects of these policies on infection rates in the ongoing pandemic are unknown. Because the modern world has never confronted this pathogen, nor deployed anti-contagion policies of such scale and scope, it is crucial that direct measurements of the effects of policies are used together with numerical simulations in current decision-making.
Societies around the world are considering whether the health benefits of anti-contagion policies are worth their social and economic costs. Many of these costs are clearly observed; for example, business restrictions increase unemployment and school closures affect educational outcomes. It is therefore not surprising that some populations have hesitated before implementing such policies, especially when their costs are visible while their health benefits (infections and deaths that would have occurred but are instead avoided or delayed) are unseen. Our objective is to measure the direct health benefits of these policies; specifically, how much these policies slowed the growth rate of infections. To do this, we compare the growth rate of infections within hundreds of subnational regions before and after each of these policies is implemented locally. Intuitively, each administrative unit observed immediately before a policy deployment serves as the ‘control’ for the same unit in the days after it receives a policy ‘treatment’ (see Supplementary Information for accounts of these deployments). Our hope is to learn from the recent experience of six countries in which the early spread of the virus triggered large-scale policy actions, in part so that societies and decision-makers everywhere can access this information.
Here we directly estimate the effects of 1,700 local, regional and national policies on the growth rate of infections across localities within China, South Korea, Italy, Iran, France and the United States (Fig. 1 and Supplementary Table 1). We compile subnational data on daily infection rates, changes in case definitions and the timing of policy deployments, including (1) travel restrictions, (2) social distancing through the cancellations of events and suspensions of educational, commercial and religious activities, (3) quarantines and lockdowns, and (4) additional policies such as emergency declarations and expansions of paid sick leave, from the earliest available dates to 6 April 2020 (Extended Data Fig. 1 and Supplementary Notes). During this period, populations remained almost entirely susceptible to COVID-19, causing the natural spread of infections to exhibit almost perfect exponential growth^{11,12,13}. The rate of this exponential growth could change daily, determined by epidemiological factors, such as disease infectivity, as well as policies that alter behaviour^{9,11}. Because policies were deployed while the epidemic unfolded, we can estimate their effects empirically. We examine how the daily growth rate of infections in each locality changed in response to the collection of ongoing policies applied to that locality on that day.
We use well-established reduced-form econometric techniques^{5,14} that are commonly used to measure the effects of events^{6,15} on economic growth rates. Similar to early COVID-19 infections, economic output generally increases exponentially with a variable rate that can be affected by policies and other conditions. Here, this technique aims to measure the total magnitude of the effect of changes in policy, without requiring explicit prior information about fundamental epidemiological parameters or mechanisms, many of which remain uncertain in the current pandemic. Instead, the collective influence of these factors is empirically recovered from the data without modelling their individual effects explicitly (see Methods). Previous research on influenza^{16}, for example, has shown that such statistical approaches can provide important complementary information to process-based models.
To construct the dependent variable, we transform location-specific, subnational time-series data on infections into first differences of their natural logarithm, which is the per-day growth rate of infections (see Methods). We use data from first- or second-level administrative units and data on active or cumulative cases, depending on availability (Supplementary Notes). We employ widely used panel regression models^{5,14} to estimate how the daily growth rate of infections changes over time within a location when different combinations of large-scale policies are enacted (see Methods). Our econometric approach accounts for differences in the baseline growth rate of infections across subnational locations, which may be affected by time-invariant characteristics, such as demographics, socio-economic status, culture and health systems; it accounts for systematic patterns in growth rates within countries unrelated to policy, such as the effect of the working week; it is robust to systematic under-surveillance specific to each subnational unit; and it accounts for changes in procedures to diagnose positive cases (Methods and Supplementary Methods).
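The construction of the dependent variable can be illustrated with a minimal Python sketch. The function name, the treatment of zero counts and the sample series are assumptions made for illustration; they are not taken from the analysis code.

```python
import math

def daily_growth_rates(cases):
    """First differences of log(cases): g_t = ln(I_t) - ln(I_{t-1})."""
    rates = []
    for prev, curr in zip(cases, cases[1:]):
        if prev > 0 and curr > 0:
            rates.append(math.log(curr) - math.log(prev))
        else:
            # Assumption: the log is undefined for zero counts, so mark as missing.
            rates.append(None)
    return rates

# Hypothetical series that grows at a constant exponential rate of 0.4 per day.
cases = [10 * math.exp(0.4 * t) for t in range(5)]
rates = daily_growth_rates(cases)
```

For a series that grows at a constant exponential rate, every log difference recovers that rate, which is why the transformed variable can be read as a per-day growth rate.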
We estimate that in the absence of policies, early infection rates of COVID-19 grow 43% per day on average across these six countries (s.e.m. = 5%), implying a doubling time of approximately 2 days. Country-specific estimates range from 34% per day in the United States (s.e.m. = 7%) to 68% per day in Iran (s.e.m. = 9%). We cannot determine whether the high estimate for Iran results from true epidemiological differences, data-quality issues (see Methods), the concurrence of the initial outbreak with a major religious holiday and pilgrimage (Supplementary Notes) or sampling variability. Excluding Iran, the average growth rate is 38% per day (s.e.m. = 5%). Growth rates in all five other countries are independently estimated to be very near this value (Fig. 2a). These estimated values differ from observed average growth rates because the latter are confounded by the effects of policies. These growth rates are not driven by the expansion of testing or increasing rates of case detection (Methods and Extended Data Fig. 2) nor by data from individual regions (Extended Data Fig. 3).
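As a check on the arithmetic, and assuming that the reported rate g is a continuous-time exponential rate (which matches the log-difference construction of the dependent variable), the doubling time follows from setting \({e}^{g{t}_{{\rm{double}}}}=2\):

\[{t}_{{\rm{double}}}=\frac{\mathrm{ln}\,2}{g},\qquad \frac{\mathrm{ln}\,2}{0.43}\approx 1.6\,{\rm{days}},\qquad \frac{\mathrm{ln}\,2}{0.38}\approx 1.8\,{\rm{days}}\]

Both values round to the doubling time of approximately 2 days quoted above.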
Some previous analyses of pre-intervention infections in Wuhan have suggested that the growth rates were slower (doubling every 5–7 days)^{17,18} using data collected before national standards for diagnosis and case definitions were first issued by the Chinese government on 15 January 2020^{19}. However, case data in Wuhan from before this date contain multiple irregularities: the cumulative case count decreased on 9 January 2020; no new cases were reported for 9–15 January; and there were concerns that information about the outbreak was suppressed^{20} (Supplementary Table 2). When we remove these data, using a shorter but more reliable pre-intervention time series from Wuhan (16–21 January), we recover a growth rate of 43% per day (s.e.m. = 3%), which corresponds to a doubling time of 2 days, consistent with results from all other countries except Iran (Fig. 2a).
During the early stages of an epidemic, a large proportion of the population remains susceptible to the virus, and if the spread of the virus is left uninhibited by changes in policies or behaviour, exponential growth continues until the fraction of the susceptible population decreases meaningfully^{11,13,21,22}. After correcting for estimated rates of case detection^{23}, we compute that the minimum susceptible fraction across administrative units in our sample is 72% of the total population (Cremona, Italy) and 87% of administrative units would be likely to be in a regime of uninhibited exponential growth (that is, more than 95% of the population remains susceptible) if policies were removed on the last date of our sample.
Consistent with predictions from epidemiological models^{2,10,24}, we find that the combined effect of policies within each country reduces the growth rate of infections by a substantial and statistically significant amount (Fig. 2b and Supplementary Table 3). For example, a locality in France with a baseline growth rate of 0.33 (national average) that fully deployed all policy actions used in France would be expected to change its daily growth rate by −0.17, to a growth rate of 0.16. In general, the estimated total effects of policy packages are large enough that they can in principle offset a large fraction of, or even eliminate, the baseline growth rate of infections, although in several countries, many localities have not deployed the full set of policies. Overall, the estimated effects of all policies combined are generally insensitive to withholding regional (that is, state- or province-level) blocks of data from the sample (Extended Data Fig. 3).
In China, only three policies were enacted across 115 cities early in a 7-week period, providing us with sufficient data to empirically estimate how the effects of these policies evolved over time without making assumptions about the timing of these effects (Fig. 2b and Methods). We estimate that the combined effect of these policies reduced the growth rate of infections by −0.026 (s.e.m. = 0.046) in the first week after they came into effect, increasing substantially in the second week to −0.20 (s.e.m. = 0.049), and essentially stabilizing in the third week around −0.28 (s.e.m. = 0.047). In other countries, we lack sufficient data to estimate these temporal dynamics explicitly and only report the average pooled effect of policies across all days after their deployment (Methods). If other countries have transient responses similar to China, we would expect that the effects in the first week after deployment are smaller in magnitude than the average effect that we report. We also explore how our estimates would change if we impose the assumption that policies cannot affect infection growth rates until after a fixed number of days (Extended Data Fig. 5a and Supplementary Methods section 3); however, we do not find evidence that this improves model fit.
The estimates described above (Fig. 2b) capture the superposition of all policies deployed in each country; that is, they represent the average effect of policies that we would expect to observe if all policies enacted anywhere in each country were implemented simultaneously in a single region of that country. We also estimate the effects of individual policies or clusters of policies (Fig. 2c) that are grouped based on either their similarity in goal (for example, library and museum closures) or timing (for example, policies deployed simultaneously). Our estimates for these individual effects tend to be statistically noisier than the estimates for all policies combined. Some estimates for the same policy differ between countries, perhaps because policies are not implemented identically or because populations behave differently. Nonetheless, 22 out of 29 point estimates indicate that individual policies are probably contributing to the reduction of the growth rate of infections. Seven policies (one in South Korea, two in Italy and four in the United States) have point estimates that are positive, six of which are small in magnitude (less than 0.1) and not statistically different from zero (5% level). Consistent with greater overall uncertainty in these disaggregated estimates, some of the estimates in China, South Korea, Italy and France are moderately more sensitive to withholding regional blocks of data (Extended Data Fig. 4), but remain broadly robust to the assumption of a constant delayed effect of all policies (Extended Data Fig. 5b).
On the basis of these results, we find that the deployment of anti-contagion policies in all six countries significantly slowed the pandemic. We combine the estimates above with our data on the timing of the 1,700 policy deployments to estimate the total effect of all policies across the dates in our sample. To do this, we use our estimates to predict the growth rate of infections in each locality on each day, given the actual policies in effect at that location on that date (Fig. 3). We then use the same model to predict what the counterfactual growth rates would be on that date if the effects of all policies were removed (Fig. 3), which we call the no-policy scenario. The difference between these two predictions is our estimate of the effect that all deployed policies had on the growth rate of infections. During our sample, we estimate that all policies combined slowed the average growth rate of infections by −0.252 per day (s.e.m. = 0.045, P < 0.001) in China, −0.248 (s.e.m. = 0.089, P < 0.01) in South Korea, −0.24 (s.e.m. = 0.068, P < 0.001) in Italy, −0.355 (s.e.m. = 0.063, P < 0.001) in Iran, −0.123 (s.e.m. = 0.019, P < 0.001) in France and −0.084 (s.e.m. = 0.03, P < 0.01) in the United States. These results are robust to modelling the effects of policies without grouping them (Extended Data Fig. 6a and Supplementary Table 4) or assuming a delayed effect of policy on infection growth rates (Supplementary Table 5).
The number of COVID-19 infections on a date depends on the growth rate of infections on all previous days. Thus, persistent reductions in growth rates have a compounding effect on infections, until growth is slowed by a shrinking susceptible population. To provide a sense of scale for our results, we integrate the growth rate of infections in each locality from Fig. 3 to estimate cumulative infections, both with actual anti-contagion policies and in the no-policy scenario. To account for the declining susceptible population in each administrative unit, we couple our econometric estimates of the effects of policies with a susceptible–infected–removed model^{11,13} that adjusts the susceptible population in each administrative unit based on estimated case-detection rates^{23,25} (see Methods). This allows us to extend our projections beyond the initial exponential growth phase of infections, a threshold that many localities cross in our no-policy scenario.
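The compounding described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the projection procedure used in the analysis: it takes a daily Euler step of a basic susceptible–infected–removed model, and the removal rate `gamma`, the population size, the initial count and both growth-rate paths are hypothetical values. The adjustment for case-detection rates is omitted.

```python
def project_active_infections(growth_rates, gamma, population, initial_infected):
    """Daily Euler steps of an SIR model driven by a path of growth rates.

    Each growth rate g is read as beta - gamma in the limit S -> 1,
    so the implied transmission rate is beta = g + gamma.
    """
    susceptible = float(population - initial_infected)
    infected = float(initial_infected)
    path = [infected]
    for g in growth_rates:
        beta = max(g + gamma, 0.0)  # assumption: transmission cannot be negative
        new_infections = min(beta * infected * susceptible / population, susceptible)
        removed = gamma * infected
        susceptible -= new_infections
        infected += new_infections - removed
        path.append(infected)
    return path

# Hypothetical locality: no policy versus a policy that lowers growth after day 7.
GAMMA = 0.08          # hypothetical removal rate per day
POPULATION = 1_000_000
no_policy = project_active_infections([0.38] * 30, GAMMA, POPULATION, 10)
with_policy = project_active_infections([0.38] * 7 + [0.10] * 23, GAMMA, POPULATION, 10)
```

Comparing `no_policy` with `with_policy` shows how a sustained reduction in the daily growth rate separates the two trajectories by orders of magnitude within a few weeks, while the shrinking susceptible pool keeps the no-policy path below the population size.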
Our results suggest that anti-contagion policies have already substantially reduced the number of COVID-19 infections observed in the world at present (Fig. 4). Our central estimates suggest that there would be approximately 37 million more cumulative confirmed cases (corresponding to 285 million more total infections, including the confirmed cases by 5 March 2020) in China, 11.5 million more confirmed cases (38 million total infections by 6 April 2020) in South Korea, 2.1 million more confirmed cases (49 million total infections by 6 April 2020) in Italy, 4.9 million more confirmed cases (54 million total infections by 22 March 2020) in Iran, 280,000 more confirmed cases (9 million total infections by 25 March 2020) in France and 4.8 million more confirmed cases (60 million total infections by 6 April 2020) in the United States had these countries never enacted any anti-contagion policies since the start of the pandemic. The magnitudes of these impacts partially reflect the timing, intensity and extent of policy deployment (for example, how many localities deployed policies) and the duration for which they have been applied. Several of these estimates are subject to large statistical uncertainties (see intervals in Fig. 4). Sensitivity tests (Extended Data Fig. 7) that assume a range of plausible alternative parameter values relating to disease dynamics, such as incorporating a susceptible–exposed–infected–removed model, suggest that interventions may have reduced the severity of the outbreak by a total of 54–65 million confirmed cases over the dates in our sample (central estimates). Sensitivity tests in which the assumed infection–fatality ratio is varied (Supplementary Table 6) suggest a corresponding range of 46–77 million confirmed cases (490–580 million total infections).
Our empirical results indicate that large-scale anti-contagion policies have slowed the COVID-19 pandemic. Because infection rates in the countries that we studied would have initially followed rapid exponential growth had no policies been applied, our results suggest that these policies have provided large health benefits. For example, we estimate that there would be approximately 465× the observed number of confirmed cases in China, 17× the number in Italy and 14× the number in the United States by the end of our analysis if large-scale anti-contagion policies had not been deployed. Consistent with process-based simulations of COVID-19 infections^{2,4,8,9,22,26}, our analysis of existing policies indicates that seemingly small delays in policy deployment are likely to have produced markedly different health outcomes.
Although the limitations of available data pose challenges to our analysis, our aim is to use what data exist to estimate the first-order effects of unprecedented policy actions in an ongoing global crisis. As more data become available, related findings will become more precise and may capture more complex interactions. Furthermore, this analysis does not account for interactions between populations in nearby localities^{13}, nor mobility networks^{3,4,8,9}. Nonetheless, we hope that these results can support critical decision-making, both in the countries that we study and in the more than 180 other countries in which COVID-19 infections have been reported^{7}.
A key advantage of our reduced-form top-down statistical approach is that it captures the real-world behaviour of affected populations without requiring that we explicitly model the underlying mechanisms and processes. This is useful in the current pandemic, for which many process-related parameters remain uncertain. However, our results cannot and should not be interpreted as a substitute for bottom-up process-based epidemiological models that are specifically designed to provide guidance in public health crises. Rather, our results complement existing models, for example, by helping to calibrate key model parameters. We believe both forward-looking simulations and backward-looking empirical evaluations should be used to inform decision-making.
Our analysis measures changes in local infection growth rates associated with changes in anti-contagion policies. A necessary condition for this association to be interpreted as the plausibly causal effect of these policies is that the timing of policy deployment is independent of infection growth rates^{14}. This assumption is supported by established epidemiological theory^{11,13,27} and evidence^{28,29}, which indicate that infections in the absence of policy will grow exponentially early in the epidemic, implying that pre-policy infection growth rates should be constant over time and therefore uncorrelated with the timing of policy deployment. Furthermore, scientific guidance to decision-makers early in the current epidemic explicitly projected constant growth rates in the absence of anti-contagion measures, limiting the possibility that anticipated changes in natural growth rates affected decision-making^{2,22,30,31}. In practice, policies tended to be deployed in response to the high total numbers of cases (for example, in France)^{32}, in response to outbreaks in other regions (for example, in China, South Korea and Iran)^{33}, after delays due to political constraints (for example, in the United States and Italy) and often with timings that coincided with arbitrary events, such as weekends or holidays (see Supplementary Notes for detailed chronologies).
Our analysis accounts for documented changes in COVID-19 testing procedures and availability, as well as differences in case detection across locations; however, unobserved trends in case detection could affect our results (see Methods). We analyse estimated case-detection trends^{23} (Extended Data Fig. 2) and find that this potential bias is small, possibly elevating our estimated no-policy growth rates by 0.026 (7%) on average.
It is also possible that changing public knowledge during the period of our study affects our results. If individuals alter their behaviour in response to new information unrelated to anti-contagion policies, such as seeking out online resources, this could alter the growth rate of infections and thus affect our estimates. If increasing availability of information reduces infection growth rates, it would cause us to overstate the effectiveness of anti-contagion policies. We note, however, that if public knowledge is increasing in response to policy actions, such as through news reports, then it should be considered a pathway through which policies alter infection growth, not a form of bias. Investigating these potential effects is beyond the scope of this analysis, but it is an important topic for future investigations.
Finally, our analysis focuses on confirmed infections, but other outcomes, such as hospitalizations or deaths, are also of policy interest. Future studies on these outcomes may require additional modelling approaches because they are relatively more context- and state-dependent. Nonetheless, we experimentally implement our approach on the daily growth rate of hospitalizations in France, the only country in our sample for which hospitalization data are available at the granularity of this study. We find that the total estimated effect of anti-contagion policies on the growth rate of hospitalizations is similar to our estimates for infection growth rates (Extended Data Fig. 6c).
Methods
Data reporting
No statistical methods were used to predetermine sample size. The experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment.
Data collection and processing
We provide a brief summary of our data collection processes here; further details, including access dates, are provided in the Supplementary Notes. Epidemiological data, case definitions/testing regimes and policy data for each of the six countries in our sample were collected from a variety of in-country data sources, including government public health websites, regional newspaper articles and crowd-sourced information on Wikipedia. The availability of epidemiological and policy data varied across the six countries, and preference was given to the collection of data at the most granular administrative unit level. The country-specific panel datasets are at the regional level in France, the state level in the United States, the province level in South Korea, Italy and Iran, and the city level in China. Owing to data availability, the sample dates differ across countries: in China we use data from 16 January to 5 March 2020; in South Korea from 17 February to 6 April 2020; in Italy from 26 February to 6 April 2020; in Iran from 27 February to 22 March 2020; in France from 29 February to 25 March 2020; and in the United States from 3 March to 6 April 2020. Our data sources are described in more detail below.
China
We acquired epidemiological data from an open-source GitHub project^{34} that scrapes time series data from Ding Xiang Yuan, a Chinese website that integrates COVID-19 epidemiological data from various local governments. We extended this dataset back in time to 10 January 2020 by manually collecting official daily statistics from the central and provincial (Hubei, Guangdong and Zhejiang) Chinese government websites. We compiled policies by collecting data on the start dates of emergency declarations, travel bans and lockdowns at the city level from the ‘2020 Hubei lockdowns’ Wikipedia page^{35} and various other news reports. We suspect that most Chinese cities have implemented at least one anti-contagion policy due to their reported trends in infections; as such, we dropped cities for which we could not identify a policy deployment date to avoid miscategorizing the policy status of these cities. Thus our results are only representative for the sample of 115 cities for which we obtained policy data.
South Korea
We manually collected and compiled the epidemiological dataset for South Korea, based on provincial government reports, policy briefings and news articles. We compiled policy actions from news articles and press releases from the Korean Centers for Disease Control and Prevention, the Ministry of Foreign Affairs and websites of local governments.
Iran
We used epidemiological data from the table ‘New COVID-19 cases in Iran by province’^{36} in the ‘2020 coronavirus pandemic in Iran’ Wikipedia article, which were compiled from data provided by the Iranian Ministry of Health website (in Persian). We relied on news reporting and two timelines of pandemic events in Iran^{36,37} to collate policy data. From 2 March to 3 March 2020, Iran did not report subnational cases. Around this period, the country implemented three national policies: a recommendation against local travel (1 March), work from home for government employees (3 March) and school closure (5 March). As the effects of these policies cannot be distinguished from each other due to the data gap, we group them together for the purpose of this analysis.
Italy
We used epidemiological data from the GitHub repository^{38} maintained by the Italian Department of Civil Protection (Dipartimento della Protezione Civile). For policies, we primarily relied on the English version of the COVID-19 dossier ‘Chronology of main steps and legal acts taken by the Italian Government for the containment of the COVID-19 epidemiological emergency’ written by the Dipartimento della Protezione Civile^{39}, and Wikipedia^{40}.
France
We used the region-level epidemiological dataset provided by the government website of France^{41} and supplemented it with the number of confirmed cases by region on the public health website of France, which was previously updated daily until 25 March^{42}. We obtained data on the policy response to the COVID-19 pandemic from the French government website, press releases from each regional public health site^{43} and Wikipedia^{44}.
United States
We used state-level epidemiological data from usafacts.org^{45}, which are compiled from multiple sources. For policy responses, we relied on a number of sources, including the US Centers for Disease Control and the National Governors Association, as well as various executive orders from county- and city-level governments, and press releases from media outlets.
Policy data
Policies in administrative units were coded as binary variables, for which the policy was coded as either 1 (after the date that the policy was implemented and before it was removed) or 0 (otherwise) for the affected administrative units. When a policy only affected a fraction of an administrative unit (for example, half of the counties within a state), policy variables were weighted by the percentage of people within the administrative unit who were treated by the policy. We used the most recent population estimates we could find for the administrative units of countries (see the ‘Population Data’ section in the Supplementary Information). To standardize policy types across countries, we mapped each country-specific policy to one of the broader policy category variables in our analysis. In this exercise, we collected 168 policies for China, 59 for South Korea, 214 for Italy, 23 for Iran, 59 for France and 1,177 for the United States (Supplementary Table 1). There are some cases for which we encode policies that are necessarily in effect whenever another policy is in place, owing in particular to the far-reaching implications of home-isolation policies. In China, wherever home isolation is documented, we assume a local travel ban is enacted on the same day if we have not found an explicit local travel ban policy for a given locality. In France, we assume home isolation is accompanied by event cancellations, social distancing and no-gathering policies; in Italy, we assume home isolation entails no-gathering, local travel ban, work from home and social distancing policies; in the United States, we assume shelter-in-place orders indicate that non-essential business closures, work from home policies and no-gathering policies are in effect.
For policy types that are enacted multiple times at increasing degrees of intensity within a locality, we add weights to the variable by escalating the intensity from 0 pre-policy in steps up to 1 for the final version of the policy (see the ‘Policy Data’ section in the Supplementary Information).
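The binary coding and population weighting described in this section can be sketched in Python as follows. The function name, its parameters, the dates and the population figures are hypothetical, and treating the start date itself as the first day in effect is an assumption; the stepped intensity weights are not modelled here.

```python
from datetime import date

def policy_value(day, start, end=None, treated_pop=None, total_pop=None):
    """Value of one policy variable for one administrative unit on one day."""
    # Assumption: the policy counts as in effect from its start date onwards,
    # and stops counting on the day it is removed.
    in_effect = day >= start and (end is None or day < end)
    if not in_effect:
        return 0.0
    if treated_pop is None or total_pop is None:
        return 1.0  # the policy covers the whole administrative unit
    return treated_pop / total_pop  # share of the population that is treated

# Hypothetical state in which the order covers counties holding half the population.
start = date(2020, 3, 20)
before = policy_value(date(2020, 3, 19), start, treated_pop=500_000, total_pop=1_000_000)
during = policy_value(date(2020, 3, 25), start, treated_pop=500_000, total_pop=1_000_000)
statewide = policy_value(date(2020, 3, 25), start)
```

In this sketch a partially applied policy enters the data as a fraction between 0 and 1, so its estimated coefficient is read as the effect of treating the entire unit.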
Epidemiological data
We collected information on cumulative confirmed cases, cumulative recoveries, cumulative deaths, active cases and any changes to domestic COVID-19-testing regimes, such as case definitions or testing methodology. For our regression analysis (Fig. 2), we use active cases when they are available (China and South Korea) and cumulative confirmed cases otherwise. We document quality-control steps in the Supplementary Information. For China and South Korea, we acquired more granular data than the data hosted on the Johns Hopkins University (JHU) interactive dashboard^{46}; we confirm that the number of confirmed cases closely matches between the two data sources (see Extended Data Fig. 1). To conduct the econometric analysis, we merge the epidemiological and policy data to form a single dataset for each country.
Econometric analysis
Reducedform approach
The reduced-form econometric approach that we apply here is a ‘top-down’ approach that describes the behaviour of aggregate outcomes y in data (in this case, infection rates). This approach can identify plausibly causal effects^{5,14} induced by exogenous changes in independent policy variables z (for example, school closure) without explicitly describing all underlying mechanisms that link z to y, without observing intermediary variables x (for example, behaviour) that might link z to y, and without observing other determinants of y unrelated to z (for example, demographics), denoted w. Let f(·) describe a complex and unobserved process that generates infection rates y:
\[y=f({x}_{1}({z}_{1},\ldots ,{z}_{K}),\ldots ,{x}_{N}({z}_{1},\ldots ,{z}_{K}),{w}_{1},\ldots ,{w}_{M})\quad (1)\]
Processbased epidemiological models aim to capture elements of f(·) explicitly, and then simulate how changes in z, x or w affect y. This approach is particularly important and useful in forwardlooking simulations in which future conditions are likely to be different than historical conditions. However, a challenge faced by this approach is that we may not know the full structure of f(·), for example, if a pathogen is new and many key biological and societal parameters remain uncertain. We may not know the effect that largescale policy (z) will have on behaviour (x(z)) or how this behaviour change will affect infection rates (f(·)).
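Alternatively, one can differentiate equation (1) with respect to the kth policy z_{k}:
\[\frac{\partial y}{\partial {z}_{k}}=\sum _{j=1}^{N}\frac{\partial y}{\partial {x}_{j}}\frac{\partial {x}_{j}}{\partial {z}_{k}}\qquad (2)\]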
which describes how changes in the policy affect infections through all N potential pathways mediated by x_{1}, ..., x_{N}. Usefully, for a fixed population observed over time, empirically estimating an average value of the local derivative on the left side in equation (2) does not depend on explicit knowledge of w. If we can observe y and z directly and estimate changes over time \(\frac{\partial y}{\partial {z}_{k}}\) with data, then intermediate variables x also need not be observed nor modelled. The reducedform econometric approach^{5,14} thus attempts to measure \(\frac{\partial y}{\partial {z}_{k}}\) directly, exploiting exogenous variation in policies z.
Model
Active infections grow exponentially during the initial phase of an epidemic, when the proportion of immune individuals in a population is near zero. Assuming a simple susceptible–infected–recovered (SIR) disease model^{11}, the growth in infections during the early period is
\[\frac{{\rm{d}}{I}_{t}}{{\rm{d}}t}=(S\beta -\gamma ){I}_{t}\mathop{=}\limits_{S\to 1}(\beta -\gamma ){I}_{t}\qquad (3)\]
where I_{t} is the number of infected individuals at time t, β is the transmission rate (new infections per day per infected individual), γ is the removal rate (proportion of infected individuals recovering or dying each day) and S is the fraction of the population susceptible to the disease. The second equality holds in the limit S → 1, which describes conditions during the beginning of the COVID19 pandemic. Writing g = β − γ for the resulting growth rate, the solution to this ordinary differential equation is the exponential function
\[\frac{{I}_{{t}_{2}}}{{I}_{{t}_{1}}}={e}^{g\cdot ({t}_{2}-{t}_{1})}\qquad (4)\]
where \({I}_{{t}_{1}}\) is the initial condition. Taking the natural logarithm and rearranging, we have
\[\log ({I}_{{t}_{2}})-\log ({I}_{{t}_{1}})=g\cdot ({t}_{2}-{t}_{1})\qquad (5)\]
Anticontagion policies are designed to alter g, through changes to β, by reducing contact between susceptible and infected individuals. Holding the time step between observations fixed at one day (t_{2} − t_{1} = 1), we thus model g as a timevarying outcome that is a linear function of a timevarying policy
\[{g}_{t}=\log ({I}_{t})-\log ({I}_{t-1})={\theta }_{0}+\theta \cdot {{\rm{policy}}}_{t}+{\varepsilon }_{t}\qquad (6)\]
where θ_{0} is the average growth rate without a policy, policy_{t} is a binary variable describing whether a policy is deployed at time t, and θ is the average effect of the policy on growth rate g over all periods subsequent to the introduction of the policy, thereby encompassing any lagged effects of policies. ε_{t} is a meanzero disturbance term that captures interperiod changes not described by policy_{t}. Using this approach, infections each day are treated as the initial conditions for integrating equation (4) through to the following day.
We compute the first differences log(I_{t}) − log(I_{t − 1}) using active infections in countries for which they are available, otherwise we use cumulative infections, noting that they are almost identical during this early period (except in China, where we use active infections). We then match these data to policy variables that we construct using the novel datasets that we assembled and apply a reducedform approach to estimate a version of equation (6), although the actual expression has additional terms detailed below.
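The single-policy case of this estimation step can be illustrated on synthetic data. The following is a minimal sketch, not the authors' code: it assumes the linear form described above (daily growth rate equal to θ_{0} plus θ times a binary policy indicator plus noise), and the function names, sample length and parameter values are hypothetical choices made only for illustration.

```python
import numpy as np

def growth_rates(infections):
    """First differences of log infections: g_t = log(I_t) - log(I_{t-1})."""
    infections = np.asarray(infections, dtype=float)
    return np.diff(np.log(infections))

def fit_policy_effect(g, policy):
    """Ordinary least squares of g_t on an intercept and a policy indicator.

    Returns (theta0, theta): the no-policy growth rate and the policy effect.
    """
    X = np.column_stack([np.ones_like(g), policy])
    coef, *_ = np.linalg.lstsq(X, g, rcond=None)
    return coef[0], coef[1]

# Synthetic example; all numbers are illustrative, not estimates from the paper.
rng = np.random.default_rng(0)
days = 40
policy = (np.arange(days) >= 20).astype(float)   # policy switched on at day 20
true_theta0, true_theta = 0.35, -0.20
g_true = true_theta0 + true_theta * policy + rng.normal(0, 0.02, days)

# Infection series implied by these daily growth rates, starting from 10 cases.
infections = 10.0 * np.exp(np.concatenate([[0.0], np.cumsum(g_true)]))

g_obs = growth_rates(infections)                 # equals g_true up to rounding
theta0_hat, theta_hat = fit_policy_effect(g_obs, policy)
print(f"theta0 = {theta0_hat:.3f}, theta = {theta_hat:.3f}")
```

Because the dependent variable is a first difference of logarithms, the level of the initial condition (10 cases here) has no influence on the estimated coefficients.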
Estimation
To estimate a multivariable version of equation (6), we estimate a separate regression for each country c. Observations are for subnational units indexed by i observed for each day t. Because not all localities began testing for COVID19 on the same date, these samples are unbalanced panels. To ensure data quality, we restrict our analysis to localities after they have reported at least ten cumulative infections.
A necessary condition for unbiased estimates is that the timing of policy deployment is independent of natural infection growth rates^{14}, a mathematical condition that should be true in the context of a new epidemic. In established epidemiological models, including the standard SIR model above, early rates of infection within a susceptible population are characterized by constant exponential growth. This phenomenon is well understood theoretically^{13,27,47}, has been repeatedly documented in past epidemics^{28,29,48} as well as the current COVID19 pandemic^{11,12}, and implies constant infection growth rates in the absence of policy intervention. Thus, we treat changes in infection growth rates as conditionally independent of policy deployments since the correlation between a constant variable and any other variable is zero in expectation.
We estimate a multiple regression version of equation (6) using ordinary least squares. We include a vector of subnational unit fixed effects θ_{0} (that is, varying intercepts captured as coefficients to dummy variables) to account for all timeinvariant factors that affect the local growth rate of infections, such as differences in demographics, socioeconomic status, culture and health systems^{5}. We include a vector of dayofweek fixed effects δ to account for weekly patterns in the growth rate of infections that are common across locations within a country; however, in China, we omit dayofweek effects because we find no evidence they are present in the data—perhaps because of the fact that the outbreak of COVID19 began during a national holiday and workers never returned to work. We also include a separate singleday dummy variable each time there is an abrupt change in the availability of COVID19 testing or a change in the procedure to diagnose positive cases. Such changes generally manifest as a discontinuous jump in infections and a rescaling of subsequent infection rates (for example, see China in Fig. 1), effects that are flexibly absorbed by a singleday dummy variable because the dependent variable is the first difference of the logarithm of infections. We denote the vector of these effects μ.
Lastly, we include a vector (length P_{c}) of countryspecific policy variables (policy) for each location and day. These policy variables take on values between 0 and 1 (inclusive) where 0 indicates no policy action and 1 indicates a policy is fully enacted. In cases in which a policy variable captures the effects of collections of policies (for example, museum closures and library closures), a policy variable is computed for each, then they are averaged, so the coefficient on this type of variable is interpreted as the effect if all policies in the collection are fully enacted. There are also instances in which multiple policies are deployed on the same date in numerous locations, in which case we group policies that have similar objectives (for example, suspension of transit and travel ban, or cancelling of events and no gathering) and keep other policies separate (that is, business closure and school closure). The grouping of policies is useful for reducing the number of estimated parameters in our limited sample of data, allowing us to examine the impact of subsets of policies (Fig. 2c). However, policy grouping does not make a substantial difference to the estimated effect of all policies combined nor to the effect of actual policies, which we demonstrate by estimating a regression model in which no policies are grouped and these values are recalculated (Extended Data Fig. 6a and Supplementary Table 4).
In some cases (for Italy and the United States), policy data are available at a more spatially granular level than infection data (for example, city policies and statelevel infections in the United States). In these cases, we code binary policy variables at the more granular level and use population weights to aggregate them to the level of the infection data. Thus, policy variables may take on continuous values between 0 and 1, with a value of 1 indicating that the policy is fully enacted for the entire population. Given the limited quantity of data currently available, we use a parsimonious model that assumes the effects of policies on infection growth rates are approximately linear and additively separable. However, future studies that comprise more data may be able to identify important nonlinearities or interactions between policies.
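The population-weighted aggregation described above can be sketched as follows. This is an illustrative example under stated assumptions: the function name and the county populations are hypothetical, and the sketch covers only the weighting step, not the full policy-coding procedure.

```python
def population_weighted_policy(sub_units):
    """Aggregate policy values from granular units to a larger unit.

    sub_units: iterable of (population, policy_value) pairs, where
    policy_value lies in [0, 1]. Returns the share of the population
    treated by the policy, which also lies in [0, 1].
    """
    sub_units = list(sub_units)
    total_pop = sum(pop for pop, _ in sub_units)
    if total_pop <= 0:
        raise ValueError("total population must be positive")
    return sum(pop * value for pop, value in sub_units) / total_pop

# Hypothetical state with three counties; two have enacted the policy.
counties = [(600_000, 1), (300_000, 0), (100_000, 1)]
state_value = population_weighted_policy(counties)
print(state_value)
```

In this example 700,000 of 1,000,000 residents are covered, so the state-level policy variable takes a value of 0.7 rather than 0 or 1.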
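For each country, our general multiple regression model is thus
\[{g}_{cit}=\log ({I}_{cit})-\log ({I}_{ci,t-1})={\theta }_{0,ci}+{\delta }_{ct}+{\mu }_{cit}+\sum _{p=1}^{{P}_{c}}({\theta }_{cp}\cdot {{\rm{policy}}}_{cpit})+{\varepsilon }_{cit}\qquad (7)\]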
where observations are indexed by country c, subnational unit i and day t. The parameters of interest are the countrybypolicy specific coefficients θ_{cp}. We display the estimated residuals ε_{cit} in Extended Data Fig. 10, which are mean zero but not strictly normal (normality is not a requirement of our modelling and inference strategy), and we estimate uncertainty over all parameters by calculating our standard errors robust to error clustering at the day level^{14}. This approach allows the covariance in ε_{cit} across different locations within a country, observed on the same day, to be nonzero. Such clustering is important in this context because idiosyncratic events within a country, such as a holiday or a backlog in testing laboratories, could generate nonuniform countrywide changes in infection growth for individual days that are not explicitly captured in our model. Thus, this approach nonparametrically accounts for both arbitrary forms of spatial autocorrelation and systematic misreporting in regions of a country on any given day (we note that it generates larger estimates for uncertainty than clustering by i). When we report the effect of all policies combined (Fig. 2b), we are reporting the sum of coefficient estimates for all policies \({\sum }_{p\mathrm{=1}}^{{P}_{c}}{\theta }_{cp}\), accounting for the covariance of errors in these estimates when computing the uncertainty of this sum.
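The combination of unit fixed effects, day-level clustering and a summed policy effect can be sketched on a synthetic panel. This is a simplified illustration rather than the published implementation: it omits dayofweek and testing-regime dummies, applies no small-sample correction to the sandwich covariance, and uses hypothetical names (`ols_cluster`, `policy_a`, `policy_b`) and invented parameter values.

```python
import numpy as np

def ols_cluster(X, y, clusters):
    """OLS coefficients with a cluster-robust (sandwich) covariance matrix.

    No small-sample correction is applied; this is a simplification.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    clusters = np.asarray(clusters)
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ y)
    resid = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for c in np.unique(clusters):
        rows = clusters == c
        score = X[rows].T @ resid[rows]   # sum of x_i * e_i within one cluster
        meat += np.outer(score, score)
    return beta, XtX_inv @ meat @ XtX_inv

# Synthetic panel; every number below is illustrative only.
rng = np.random.default_rng(1)
n_units, n_days = 8, 30
unit = np.repeat(np.arange(n_units), n_days)
day = np.tile(np.arange(n_days), n_units)

# Two hypothetical policies with staggered start dates across units.
policy_a = (day >= 8 + unit).astype(float)
policy_b = (day >= 15 + unit).astype(float)

unit_effects = 0.30 + 0.01 * np.arange(n_units)   # no-policy growth by unit
day_shock = rng.normal(0, 0.01, n_days)           # shared by all units on a day
g = (unit_effects[unit] - 0.10 * policy_a - 0.08 * policy_b
     + day_shock[day] + rng.normal(0, 0.02, unit.size))

# Design matrix: one dummy per unit (fixed effects) plus the policy columns.
unit_dummies = (unit[:, None] == np.arange(n_units)[None, :]).astype(float)
X = np.column_stack([unit_dummies, policy_a, policy_b])

beta, cov = ols_cluster(X, g, clusters=day)
theta_hat = beta[n_units:]                # policy coefficients
theta_cov = cov[n_units:, n_units:]

# Combined effect of all policies and its standard error, using the full
# covariance of the coefficient estimates rather than only the variances.
ones = np.ones(theta_hat.size)
combined = float(ones @ theta_hat)
combined_se = float(np.sqrt(ones @ theta_cov @ ones))
print(f"combined effect = {combined:.3f} (s.e. {combined_se:.3f})")
```

Clustering by day lets the shared `day_shock` term induce correlated residuals across units without invalidating the standard errors, which is the motivation given in the text for this choice.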
Note that our estimates of θ and θ_{0} in equation (7) are robust to systematic underreporting of infections, a major concern in the ongoing pandemic, due to the construction of our dependent variable. This remains true even if different localities have different rates of underreporting, so long as the rate of underreporting is relatively constant. To see this, note that if each locality i has a medical system that reports only a fraction ψ_{i} of infections such that we observe \({\tilde{I}}_{it}={\psi }_{i}{I}_{it}\) rather than actual infections I_{it}, then the left side of equation (7) will be
\[\begin{array}{rcl}\log ({\tilde{I}}_{it})-\log ({\tilde{I}}_{i,t-1}) & = & \log ({\psi }_{i}{I}_{it})-\log ({\psi }_{i}{I}_{i,t-1})\\ & = & \log ({\psi }_{i})-\log ({\psi }_{i})+\log ({I}_{it})-\log ({I}_{i,t-1})\\ & = & {g}_{it}\end{array}\]
and is therefore unaffected by locationspecific and timeinvariant underreporting. Thus systematic underreporting does not affect our estimates for the effects of policy θ. As discussed above, potential biases associated with nonsystematic underreporting that results from documented changes in testing regimes over space and time are absorbed by region–dayspecific effects μ.
However, if the rate of underreporting within a locality is changing daytoday, this could bias infection growth rates. We estimate the magnitude of this bias (Extended Data Fig. 2), and verify that it is quantitatively small. Specifically, if \({\tilde{I}}_{it}={\psi }_{it}{I}_{it}\) where ψ_{it} changes daytoday, then
\[\log ({\tilde{I}}_{it})-\log ({\tilde{I}}_{i,t-1})=\log ({\psi }_{it})-\log ({\psi }_{i,t-1})+{g}_{it}\qquad (8)\]
where log(ψ_{it}) − log(ψ_{i,t − 1}) is the dayoverday growth rate of the casedetection probability. Disease surveillance has evolved slowly in some locations as governments gradually expand testing, which would cause ψ_{it} to change over time, but these changes in testing capacity do not appear to significantly alter our estimates of infection growth rates. In Extended Data Fig. 2, we show one set of epidemiological estimates^{23} for log(ψ_{it}) − log(ψ_{i,t − 1}). Despite random daytoday variations, which do not cause systematic biases in our point estimates, the mean of log(ψ_{it}) − log(ψ_{i,t − 1}) is consistently small across the different countries: 0.05 in China, 0.064 in Iran, 0.019 in South Korea, −0.058 in France, 0.031 in Italy and 0.049 in the United States. The average of these estimates is 0.026, potentially accounting for 7.3% of our global average estimate for the nopolicy infection growth rate (0.36). These estimates of log(ψ_{it}) − log(ψ_{i,t − 1}) also do not display strong temporal trends, alleviating concerns that timevarying underreporting generates sizable biases in our estimated effects of anticontagion policies.
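Both underreporting cases discussed above can be checked numerically. The sketch below uses invented values (a true growth rate of 0.30 per day, a constant detection fraction of 0.25, and a detection fraction that rises by 3% per day); none of these numbers come from the paper's data.

```python
import numpy as np

def growth_rates(infections):
    """First differences of log infections."""
    return np.diff(np.log(np.asarray(infections, dtype=float)))

days = np.arange(15)
true_infections = 10.0 * np.exp(0.30 * days)   # illustrative growth of 0.30 per day
g_true = growth_rates(true_infections)

# Case 1: a constant detection fraction cancels in the log difference.
g_const = growth_rates(0.25 * true_infections)
constant_case_unbiased = np.allclose(g_const, g_true)

# Case 2: a detection fraction that changes over time (hypothetical path)
# shifts the measured growth rate by the growth rate of detection.
psi_t = 0.10 * np.exp(0.03 * days)
g_varying = growth_rates(psi_t * true_infections)
bias = g_varying - g_true
bias_matches_detection_growth = np.allclose(bias, np.diff(np.log(psi_t)))

print(constant_case_unbiased, bias_matches_detection_growth)
```

The second check shows why the mean of log(ψ_{it}) − log(ψ_{i,t − 1}) is the relevant quantity for bounding the bias in the estimated growth rate.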
Transient dynamics
In China, we are able to examine the transient response of infection growth rates following policy deployment because only three policies were deployed early in a sevenweek sample period during which we observe many cities simultaneously. This provides us with sufficient data to estimate the temporal structure of policy effects without imposing assumptions regarding this structure. To do this, we estimate a distributedlag model that encodes policy parameters using weekly lags based on the date that each policy is first implemented in locality i. This means the effect of a policy implemented one week ago is allowed to differ arbitrarily from the effect of that same policy in the following week, and so on. These effects are then estimated simultaneously and are displayed in Fig. 2b, c (see also Supplementary Table 3). Such a distributed lag approach did not provide statistically meaningful insights in other countries using the currently available data because there were fewer administrative units and shorter periods of observation (that is, smaller samples), and more policies (that is, more parameters to estimate) in all other countries. Future studies may be able to successfully explore these dynamics outside of China.
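One way to encode weekly lags is with a set of indicator columns, one per week since the policy was first implemented. The sketch below is an assumption about how such an encoding could look, not a description of the authors' code; in particular, letting the final column absorb all later weeks is a hypothetical design choice, as are the function and argument names.

```python
import numpy as np

def weekly_lag_dummies(days, policy_start, n_weeks):
    """Return 0/1 indicators of shape (len(days), n_weeks).

    Column k is 1 when the policy has been in place for k full weeks,
    that is, when days since the start fall in [7k, 7k + 6]. The last
    column also absorbs all later days (an assumption of this sketch).
    Rows before the policy start are all zero.
    """
    days = np.asarray(days)
    since = days - policy_start
    week = np.minimum(np.floor_divide(since, 7), n_weeks - 1)
    out = np.zeros((len(days), n_weeks))
    active = since >= 0
    out[np.nonzero(active)[0], week[active]] = 1.0
    return out

# Hypothetical locality observed for 30 days, policy first enacted on day 5.
lags = weekly_lag_dummies(np.arange(30), policy_start=5, n_weeks=3)
print(lags[3:14])
```

Including these columns in the regression in place of a single policy variable gives each week since implementation its own coefficient, so no temporal structure is imposed on the response.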
As a robustness check, we examine whether excluding the transient response from the estimated effects of policy substantially alters our results. We do this by estimating a ‘fixed lag’ model, in which we assume that policies cannot influence infection growth rates for L days, recoding a policy variable at time t as zero if a policy was implemented fewer than L days before t. We reestimate equation (7) for each value of L and present results in Extended Data Fig. 5 and Supplementary Table 5.
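The recoding used in the ‘fixed lag’ check can be sketched as a simple shift of the policy series. This is an illustrative helper with a hypothetical name; it assumes the policy variable is observed daily without gaps and that the policy is not removed within the sample, so that delaying the series by L days matches the rule stated above.

```python
import numpy as np

def fixed_lag_policy(policy, lag):
    """Delay a daily policy series by `lag` days.

    The value at day t becomes the original value at day t - lag, and the
    first `lag` days are set to zero, so a policy has no coded effect until
    it has been in place for at least `lag` days.
    """
    policy = np.asarray(policy, dtype=float)
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return policy.copy()
    shifted = np.zeros_like(policy)
    shifted[lag:] = policy[:-lag]
    return shifted

# Policy enacted on day 2 of a six-day sample, recoded with L = 2.
original = np.array([0, 0, 1, 1, 1, 1])
recoded = fixed_lag_policy(original, lag=2)
print(recoded)
```

Re-estimating the regression with `recoded` in place of `original` for each value of L reproduces the structure of the robustness check, under the assumptions noted above.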
Alternative disease models
Our main empirical specification is motivated with an SIR model of disease contagion, which assumes zero latent period between exposure to COVID19 and infectiousness. If we relax this assumption to allow for a latent period of infection, as in a susceptible–exposed–infected–recovered (SEIR) model, the growth of the outbreak is only asymptotically exponential^{11}. Nonetheless, we demonstrate that SEIR dynamics have only a minor potential impact on the coefficients recovered by using our empirical approach in this context. In Extended Data Figs. 8, 9 we present results from a simulation exercise which uses equations (9)–(11), along with a generalization to the SEIR model^{11} to generate synthetic outbreaks (see Supplementary Methods section 2). We use these simulated data to test the ability of our statistical model (equation (7)) to recover both the unimpeded growth rate (Extended Data Fig. 8) as well as the impact of simulated policies on growth rates (Extended Data Fig. 9) when applied to data generated by SIR or SEIR dynamics over a wide range of epidemiological conditions.
Projections
Daily growth rates of infections
To estimate the instantaneous daily growth rate of infections if policies were absent, we obtain fitted values from equation (7) and compute a predicted value for the dependent variable when all P_{c} policy variables are set to 0. Thus, these estimated growth rates \({\hat{g}}_{cit}^{{\rm{no}}\,{\rm{policy}}}\) capture the effect of all localityspecific factors on the growth rate of infections (for example, demographics), dayofweek effects, and adjustments based on the way in which infection cases are reported. This counterfactual does not account for changes in information that are triggered by policy deployment, as those should be considered a pathway through which policies affect outcomes, as discussed in the main text. Additionally, the ‘no policy’ counterfactual does not model previously unobserved changes in behaviour that might occur if fundamentally new behaviours emerge even in the absence of government intervention. When we report an average nopolicy growth rate of infections (Fig. 2a), it is the average value of these predictions for all observations in the original sample. Locationanddayspecific counterfactual predictions \(({\hat{g}}_{cit}^{{\rm{no}}\,{\rm{policy}}})\), accounting for the covariance of errors in estimated parameters, are shown as red markers in Fig. 3.
Cumulative infections
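To provide a sense of scale for the estimated cumulative benefits of effects shown in Fig. 3, we link our reducedform empirical estimates to the key structures in a simple SIR system and simulate this dynamical system over the course of our sample. The system is defined as the following:
\[\frac{{\rm{d}}{S}_{t}}{{\rm{d}}t}=-{\beta }_{t}{S}_{t}{I}_{t}\qquad (9)\]
\[\frac{{\rm{d}}{I}_{t}}{{\rm{d}}t}=({\beta }_{t}{S}_{t}-\gamma ){I}_{t}\qquad (10)\]
\[\frac{{\rm{d}}{R}_{t}}{{\rm{d}}t}=\gamma {I}_{t}\qquad (11)\]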
where S_{t} is the susceptible population and R_{t} is the removed population. Here β_{t} is a timeevolving parameter, determined by our empirical estimates as described below. Accounting for changes in S becomes increasingly important as the size of cumulative infections (I_{t} + R_{t}) becomes a substantial fraction of the local subnational population, which occurs in some nopolicy scenarios. Our reducedform analysis provides estimates for the growth rate of active infections \((\hat{g})\) for each locality and day, in a regime where S_{t} ≈ 1. Thus we know
\[\hat{g}=\beta -\gamma \qquad (12)\]
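but we do not know the values of either of the two rightside terms, which are required to simulate equations (9)–(11). To estimate γ, we note that the leftside term of equation (11) is
\[\frac{{\rm{d}}{R}_{t}}{{\rm{d}}t}=\frac{{\rm{d}}}{{\rm{d}}t}({\rm{cumulative}}\,{\rm{recoveries}}+{\rm{cumulative}}\,{\rm{deaths}})\qquad (13)\]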
which we can observe in our data for China and South Korea. Computing first differences in these two variables (to differentiate with respect to time), summing them, and then dividing by active cases gives us estimates of γ (medians: China = 0.11, South Korea = 0.05). These values differ slightly from the classical SIR interpretation of γ, because in the public data that we are able to obtain, individuals are coded as ‘recovered’ when they no longer test positive for COVID19, whereas in the classical SIR model this occurs when they are no longer infectious. We adopt the average of these two medians, setting γ = 0.08. We use medians rather than simple averages because low values for I_{t} induce a long right tail in daily estimates of γ and medians are less vulnerable to this distortion. We then use our empirically based reducedform estimates of \(\hat{g}\) (both with and without policy) combined with equations (9)–(11) to project total cumulative cases in all countries (Fig. 4). We simulate infections and cases for each administrative unit in our sample beginning on the first day for which we observe 10 or more cases (for that unit) using a time step of 4 h. Because we observe confirmed cases rather than total infections, we seed each simulation by adjusting observed I_{t} on the first day using countryspecific estimates of case detection rates. We adjust existing estimates of case underreporting^{23} to further account for asymptomatic infections assuming an infection–fatality ratio of 0.75%^{25}. We assume R_{t} = 0 on the first day. To maintain consistency with the reported data, we report our output in confirmed cases by multiplying our simulated I_{t} + R_{t} values by the aforementioned proportion of infections confirmed. We estimate uncertainty by resampling from the estimated variance–covariance matrix of all regression parameters. In Extended Data Fig. 7, we show sensitivity of this simulation to the estimated value of γ as well as to the use of an SEIR framework. In Supplementary Table 6, we show sensitivity of this simulation to the assumed infection–fatality ratio (see Supplementary Methods section 1).
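The two steps described above, estimating γ from recoveries and deaths and then simulating the SIR system with a 4 h time step, can be sketched as follows. This is a simplified illustration under stated assumptions: it uses forward-Euler integration, divides by the previous day's active cases when estimating γ, sets β_{t} equal to the daily growth rate plus γ (the S → 1 relationship), and omits the case-detection adjustment and the uncertainty resampling. Function names, the population size and the growth-rate paths are hypothetical.

```python
import numpy as np

def estimate_gamma(recoveries, deaths, active):
    """Median of daily (change in recoveries + change in deaths) / active cases.

    Inputs are cumulative recoveries, cumulative deaths and active cases by
    day. Dividing by the previous day's active cases is an assumption here.
    """
    removed = np.asarray(recoveries, dtype=float) + np.asarray(deaths, dtype=float)
    active = np.asarray(active, dtype=float)
    return float(np.median(np.diff(removed) / active[:-1]))

def simulate_sir(g_daily, gamma, i0, population, steps_per_day=6):
    """Forward-Euler SIR simulation (4 h steps when steps_per_day is 6).

    g_daily: growth rate of active infections for each day; the transmission
    rate is taken as beta_t = g_t + gamma. i0 is the initial number infected.
    Returns cumulative infections (I + R), in persons, at the end of each day.
    """
    dt = 1.0 / steps_per_day
    i = i0 / population
    s = 1.0 - i
    r = 0.0                                # removed fraction is zero on day one
    cumulative = []
    for g in g_daily:
        beta = g + gamma
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt
            new_removed = gamma * i * dt
            s -= new_infections
            i += new_infections - new_removed
            r += new_removed
        cumulative.append((i + r) * population)
    return np.array(cumulative)

# Illustrative inputs only; none of these values are taken from the data.
gamma_hat = estimate_gamma(recoveries=[0, 6, 12, 18],
                           deaths=[0, 2, 4, 6],
                           active=[100, 100, 100, 100])

no_policy = simulate_sir([0.35] * 30, gamma=0.08, i0=10, population=1_000_000)
with_policy = simulate_sir([0.35] * 10 + [0.10] * 20, gamma=0.08,
                           i0=10, population=1_000_000)
print(f"gamma = {gamma_hat:.2f}")
print(f"day-30 cumulative infections: {no_policy[-1]:.0f} vs {with_policy[-1]:.0f}")
```

The difference between the two simulated paths is the analogue, in this toy setting, of the cases prevented or delayed by policy; tracking S explicitly keeps the no-policy path from exceeding the population.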
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this paper.
Data availability
The datasets generated and/or analysed during the current study are available at https://github.com/bolliger32/gpl-covid. Future updates and/or extensions to data or code will be listed at http://www.globalpolicy.science/covid19.
Code availability
For easier replication, we have created a CodeOcean ‘capsule’, which contains a prebuilt computing environment in addition to the source code and data. This is available at https://codeocean.com/capsule/1887579/tree/v1. Future updates and/or extensions to data or code will be listed at http://www.globalpolicy.science/covid19.
Change history
22 August 2020
A Correction to this paper has been published: https://doi.org/10.1038/s41586-020-2691-0
References
Wu, F. et al. A new coronavirus associated with human respiratory disease in China. Nature 579, 265–269 (2020).
Ferguson, N. M. et al. Report 9: Impact of nonpharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand. Technical Report (Imperial College London, 2020).
Chinazzi, M. et al. The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID19) outbreak. Science 368, 395–400 (2020).
Kraemer, M. U. G. et al. The effect of human mobility and control measures on the COVID19 epidemic in China. Science 368, 493–497 (2020).
Greene, W. H. Econometric Analysis (Prentice Hall, 2003).
Romer, C. D. & Romer, D. H. The macroeconomic effects of tax changes: estimates based on a new measure of fiscal shocks. Am. Econ. Rev. 100, 763–801 (2010).
WHO. WHO Coronavirus Disease (COVID19) Dashboard. https://covid19.who.int (accessed 13 April 2020).
Li, R. et al. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARSCoV2). Science 368, 489–493 (2020).
Tang, B. et al. Estimation of the transmission risk of the 2019nCoV and its implication for public health interventions. J. Clin. Med. 9, 462 (2020).
Hatchett, R. J., Mecher, C. E. & Lipsitch, M. Public health interventions and epidemic intensity during the 1918 influenza pandemic. Proc. Natl Acad. Sci. USA 104, 7582–7587 (2007).
Ma, J. Estimating epidemic exponential growth rate and basic reproduction number. Infect. Dis. Model. 5, 129–141 (2020).
MunizRodriguez, K. et al. Doubling time of the COVID19 epidemic by province, China. Emerg. Infect. Dis. 26, https://doi.org/10.3201/eid2608.200219 (2020).
Chowell, G., Sattenspiel, L., Bansal, S. & Viboud, C. Mathematical models to characterize early epidemic growth: a review. Phys. Life Rev. 18, 66–97 (2016).
Angrist, J. D. & Pischke, J.S. Mostly Harmless Econometrics: An Empiricist’s Companion (Princeton Univ. Press, 2008).
Burke, M., Hsiang, S. M. & Miguel, E. Global nonlinear effect of temperature on economic production. Nature 527, 235–239 (2015).
Kandula, S. et al. Evaluation of mechanistic and statistical methods in forecasting influenzalike illness. J. R. Soc. Interface 15, 20180174 (2018).
Wu, J. T. et al. Estimating clinical severity of COVID19 from the transmission dynamics in Wuhan, China. Nat. Med. 26, 506–510 (2020).
Li, Q. et al. Early transmission dynamics in Wuhan, China, of novel coronavirusinfected pneumonia. N. Engl. J. Med. 382, 1199–1207 (2020).
Tsang, T. K. et al. Impact of changing case definitions for COVID19 on the epidemic curve and transmission parameters in mainland China. Preprint at medRxiv https://doi.org/10.1101/2020.03.23.20041319 (2020).
Wuhan pneumonia: 30 days from outbreak to out of control [in Chinese]. BBC News https://www.bbc.com/zhongwen/simp/chinesenews51290945 (2020).
Fisman, D., Khoo, E. & Tuite, A. Early epidemic dynamics of the West African 2014 Ebola outbreak: estimates derived with a simple twoparameter model. PLoS Curr. 6 https://doi.org/10.1371/currents.outbreaks.89c0d3783f36958d96ebbae97348d571 (2014).
Maier, B. F. & Brockmann, D. Effective containment explains subexponential growth in recent confirmed COVID19 cases in China. Science 368, 742–746 (2020).
Russell, T. W. et al. Using a delayadjusted case fatality ratio to estimate underreporting. Technical Report (Centre for the Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, 2020).
Bootsma, M. C. J. & Ferguson, N. M. The effect of public health measures on the 1918 influenza pandemic in U.S. cities. Proc. Natl Acad. Sci. USA 104, 7588–7593 (2007).
MeyerowitzKatz, G. & Merone, L. A systematic review and metaanalysis of published research data on COVID19 infectionfatality rates. Preprint at medRxiv https://doi.org/10.1101/2020.05.03.20089854 (2020).
Kucharski, A. J. et al. Early dynamics of transmission and control of COVID19: a mathematical modelling study. Lancet Infect. Dis. 20, 553–558 (2020).
Anderson, R. M. & May, R. M. Infectious Diseases of Humans: Dynamics and Control (Oxford Univ. Press, 1992).
Nishiura, H., Chowell, G., Safan, M. & CastilloChavez, C. Pros and cons of estimating the reproduction number from early epidemic growth rate of influenza A (H1N1) 2009. Theor. Biol. Med. Model. 7, 1 (2010).
WHO Ebola Response Team. Ebola virus disease in West Africa—the first 9 months of the epidemic and forward projections. N. Engl. J. Med. 371, 1481–1495 (2014).
Flaxman, S. et al. Report 13: Estimating the number of infections and the impact of nonpharmaceutical interventions on COVID19 in 11 European countries. Technical Report (Imperial College London, 2020).
Lourenço, J. et al. Fundamental principles of epidemic spread highlight the immediate need for largescale serological surveys to assess the stage of the SARSCoV2 epidemic. Preprint at medRxiv https://doi.org/10.1101/2020.03.24.20042291 (2020).
Préparation au Risque épidémique COVID19 [in French]. https://solidaritessante.gouv.fr/IMG/pdf/guide_methodologique_covid192.pdf (2020).
Tian, H. et al. An investigation of transmission control measures during the first 50 days of the COVID19 epidemic in China. Science 368, 638–642 (2020).
Lin, J. COVID19/2019nCoV Time Series Infection Data Warehouse. https://github.com/BlankerL/DXYCOVID19Data (2020).
COVID19 Pandemic Lockdown in Hubei. https://en.wikipedia.org/w/index.php?title=COVID19_pandemic_lockdown_in_Hubei (Wikipedia, 2020).
COVID19 Pandemic in Iran. https://en.wikipedia.org/w/index.php?title=COVID19_pandemic_in_Iran (Wikipedia, 2020).
Kantis, C., Keirnan, S. & Bardi, J. S. Timeline of the coronavirus. Think Global Health https://www.thinkglobalhealth.org/article/updatedtimelinecoronavirus (2020).
Presidenza del Consiglio dei Ministri Dipartimento della Protezione Civile. Dati COVID19 Italia. https://github.com/pcmdpc/COVID19 (2020).
Presidenza del Consiglio dei Ministri Dipartimento della Protezione Civile. Coronavirus Emergency [in Italian]. http://www.protezionecivile.it/web/guest/home (Governo Italiano, 2020).
COVID19 Pandemic Lockdown in Italy. https://en.wikipedia.org/w/index.php?title=COVID19_pandemic_lockdown_in_Italy (Wikipedia, 2020).
Roussel, O. FrSARSCoV2. https://www.data.gouv.fr/en/datasets/frsarscov2 (2020).
Sante Publique France. Coronavirus (COVID19). https://www.santepubliquefrance.fr/ (2020).
Agence Régionale de Santé. Agir pour la santé de tous. https://www.ars.sante.fr/ (2020).
COVID19 Pandemic in France. https://en.wikipedia.org/w/index.php?title=COVID19_pandemic_in_France (Wikipedia, 2020).
Coronavirus Locations: COVID19 Map by County and State. https://usafacts.org/visualizations/coronaviruscovid19spreadmap/ (USA FACTS, 2020).
JHU CSSE. COVID19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. https://github.com/CSSEGISandData/COVID19 (2020).
Kermack, W. O. & McKendrick, A. G. A contribution to the mathematical theory of epidemics. Proc. R. Soc. Lond. A 115, 700–721 (1927).
Mills, C. E., Robins, J. M. & Lipsitch, M. Transmissibility of 1918 pandemic influenza. Nature 432, 904–906 (2004).
Acknowledgements
We thank B. Chen for her role in initiating this work and A. Feller for his feedback. S.A.P., E.K., P.L. and J.T. are supported by a gift from the Tuaropaki Trust. T.C. is supported by an AI for Earth grant from National Geographic and Microsoft. D.A., A.H. and I.B. are supported through joint collaborations with the Climate Impact Lab. K.B. is supported by the Royal Society Te Apārangi Rutherford Postdoctoral Fellowship. H.D. and E.R. are supported by the National Science Foundation Graduate Research Fellowship under grants DGE 1106400 and 1752814, respectively. Opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not reflect the views of supporting organizations.
Author information
Contributions
S.H. conceived and led the study. All authors designed analysis, interpreted results, designed figures and wrote the paper. D.A., S.A.P., K.B., I.B., T.C., H.D., L.Y.H., A.H., E.K., P.L., J.L., E.R., J.T. and T.W. contributed equally and are listed alphabetically. China: L.Y.H. and T.W. collected health data, L.Y.H., T.W. and J.T. collected policy data, L.Y.H. cleaned data. South Korea: J.L. collected health data, T.C. and J.L. collected policy data, T.C. cleaned data. Italy: D.A. collected health data, P.L. collected policy data, D.A. cleaned data. France: S.A.P. collected health data, S.A.P., J.T. and H.D. collected policy data, S.A.P. cleaned data. Iran: A.H. collected health data and policy data, A.H. and D.A. cleaned data. United States: E.R. and K.B. collected health data, E.K. collected policy data, E.R., D.A. and K.B. cleaned data. I.B. collected geographical and population data for all countries. S.H. designed the econometric model. S.H., S.A.P. and J.T. conducted econometric analysis for all countries. K.B., I.B., A.H., E.R. and E.K. designed and implemented epidemiological models and projections. S.A.P., K.B., I.B., J.T., A.H. and E.K. designed and implemented robustness checks. H.D. created Fig. 1, T.C. created Fig. 2, J.T. created Fig. 3, E.R. created Fig. 4, D.A. created Supplementary Table 1, L.Y.H. and J.L. created Supplementary Table 2, J.T. created Supplementary Tables 3, 4, S.A.P. and J.T. created Supplementary Table 5, K.B. created Supplementary Table 6, L.Y.H. created Extended Data Figs. 1, 2, S.A.P. created Extended Data Figs. 3–5, J.T. created Extended Data Fig. 6, K.B. created Extended Data Fig. 7, I.B. created Extended Data Figs. 8, 9, J.T. created Extended Data Fig. 10. D.A., I.B. and P.L. managed policy data collection and quality control. I.B. and T.C. managed the code repository. I.B. and P.L. ran project management. E.K., T.W., J.T. and P.L. managed literature review. L.Y.H., E.K. and T.W. managed references. P.L. managed Extended Data Figs. 1–10 and Supplementary Information.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature thanks Andrew Jones, Jeffrey Shaman and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Validating disaggregated epidemiological data against aggregated data from the JHU Center for Systems Science and Engineering.
Comparison of cumulative confirmed cases from a subset of regions in our collated epidemiological dataset to the same statistics from the 2019 Novel Coronavirus COVID19 (2019nCoV) Data Repository by the Johns Hopkins Center for Systems Science and Engineering (JHU CSSE)^{46}. We conducted this comparison for Chinese provinces and South Korea, for which the data we collected were from local administrative units that are more spatially granular than the data in the JHU CSSE database. a, In China, we aggregated our citylevel data to the province level. b, In South Korea, we aggregated provincelevel data up to the country level. Small discrepancies, especially in later periods of the outbreak, are generally due to imported cases (international or domestic) that are present in national statistics but that we do not assign to particular cities (in China) or provinces (in Korea).
Extended Data Fig. 2 Estimated trends in case detection over time within each country.
Systematic trends in case detection may potentially bias estimates of no-policy infection growth rates (see equation (8)). We estimate the potential magnitude of this bias using data from the Centre for Mathematical Modelling of Infectious Diseases^{23}. Markers indicate daily first differences in the logarithm of the fraction of estimated symptomatic cases reported for each country over time. The average value over time (solid line and value denoted in panel title) is the average growth rate of case detection, equal to the magnitude of the potential bias. For example, in the main text we estimate that the infection growth rate in the United States is 0.29 (Fig. 2a), of which growth in case detection might contribute 0.049 (this figure). Sample sizes are 75 in China, 41 in Iran, 40 in South Korea, 29 in France, 40 in Italy and 32 in the United States.
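The calculation described in this caption can be illustrated with a minimal sketch. The detection-fraction series below is hypothetical (it is not the data from ref. 23), and the function name `mean_log_growth` is our own; only the two United States values (0.29 and 0.049) are taken from the caption. The subtraction at the end is simple arithmetic on those two numbers, not a result reported in the paper.

```python
import math

def mean_log_growth(fractions):
    """Average of daily first differences in log(fraction of cases reported)."""
    diffs = [math.log(b) - math.log(a) for a, b in zip(fractions, fractions[1:])]
    return sum(diffs) / len(diffs)

# Hypothetical detection fractions that rise by 5% per day (illustrative only).
fractions = [0.10 * math.exp(0.05 * t) for t in range(30)]
detection_growth = mean_log_growth(fractions)

# Values quoted in the caption for the United States.
estimated_growth = 0.29        # estimated infection growth rate (Fig. 2a)
detection_contribution = 0.049  # average growth rate of case detection

# Growth remaining if the full detection trend were attributed to bias.
adjusted_growth = estimated_growth - detection_contribution
```

Under these assumptions, `detection_growth` recovers the 0.05 trend built into the hypothetical series, and `adjusted_growth` is the portion of the 0.29 estimate that is left once the 0.049 detection trend is set aside.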
Extended Data Fig. 3 Robustness of the estimated no-policy growth rate of infections and the combined effect of policies to withholding blocks of data from entire regions.
a, b, For each country, we re-estimated equation (7) using real data k times, each time withholding one of the k first-level administrative regions (‘Adm1’, that is, state or province) in that country. Each grey circle is either the estimated no-policy growth rate (a) or the total effect of all policies combined (b), from one of these k regressions. Red and blue circles show estimates from the full sample, identical to the results presented in Fig. 2a, b, respectively. For each country panel, if a single region is influential, the estimated value when it is withheld from the sample will appear as an outlier. Samples that omit an influential region are highlighted with an open pink circle. As in Fig. 2b, we estimate a distributed lag model for China and display each of the estimated weekly lag effects (where the pink circle is the same ‘without Hubei’ sample for lags). The full sample includes 3,669 observations in China, 595 in South Korea, 2,898 in Italy, 548 in Iran, 270 in France and 1,238 in the United States.
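The withholding procedure can be sketched as follows. This is an illustration under strong simplifying assumptions: equation (7) is replaced by a stand-in regression of the daily growth rate on a single binary policy indicator (for which ordinary least squares reduces to a difference in group means), and the regions `A`, `B`, `C` and their values are hypothetical.

```python
def fit(rows):
    """OLS of growth on one binary policy dummy.

    Returns (no_policy_rate, policy_effect). With a single 0/1 regressor,
    OLS is the mean on policy-off days and the difference in means.
    """
    off = [g for g, p in rows if p == 0]
    on = [g for g, p in rows if p == 1]
    base = sum(off) / len(off)
    return base, sum(on) / len(on) - base

def leave_one_region_out(data):
    """Re-estimate k times, each time withholding one of the k regions."""
    results = {}
    for held_out in data:
        rows = [r for region, rs in data.items() if region != held_out for r in rs]
        results[held_out] = fit(rows)
    return results

# Hypothetical (growth rate, policy on/off) observations per region.
data = {
    "A": [(0.30, 0), (0.31, 0), (0.10, 1), (0.11, 1)],
    "B": [(0.29, 0), (0.30, 0), (0.12, 1), (0.09, 1)],
    "C": [(0.60, 0), (0.62, 0), (0.11, 1), (0.10, 1)],  # unusually fast early growth
}

full = fit([r for rs in data.values() for r in rs])
loo = leave_one_region_out(data)

# The region whose removal moves the no-policy estimate the most.
influential = max(loo, key=lambda region: abs(loo[region][0] - full[0]))
```

In this toy sample, withholding region `C` shifts the no-policy growth estimate far more than withholding `A` or `B`, which is the pattern the caption describes as an outlier among the grey circles.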
Extended Data Fig. 4 Robustness of the estimated effects of individual policies to withholding blocks of data from entire regions.
Same as Extended Data Fig. 3, but for individual policies (analogous to Fig. 2c). WFH denotes work from home policies; opt denotes optional policies. In cases in which two regions are influential, a second region is highlighted with an open green circle. The full sample includes 3,669 observations in China, 595 in South Korea, 2,898 in Italy, 548 in Iran, 270 in France and 1,238 in the United States.
Extended Data Fig. 5 Evidence to support models in which policies affect infection growth rates in the days following deployment.
Existing evidence has not demonstrated whether policies should affect infection growth rates in the days immediately after deployment. It is therefore not clear ex ante whether the policy variables in equation (7) should be encoded as ‘on’ immediately following a policy deployment. We estimate ‘fixed-lag’ models in which a fixed delay between the deployment of a policy and its effect is assumed (see Supplementary Methods section 3). If a delay model is more consistent with real-world infection dynamics, these fixed-lag models should recover larger estimates for the impact of policies and exhibit better model fit. a, R^{2} values associated with fixed-lag lengths varying from 0 to 15 days. Centre values represent the R^{2} value in our sample, whiskers are 95% confidence intervals computed through resampling with replacement. In-sample fit generally declines or remains unchanged if policies are assumed to have a delay longer than 4 days. b, Estimated effects for no lag (the model reported in the main text) and for fixed lags between 1 and 5 days. Centre values represent the point estimate, error bars are 95% confidence intervals. Estimates generally are unchanged or shrink towards zero (for example, home isolation in Iran), consistent with miscoding of post-policy days as no-policy days. The sample size is 595 in South Korea, 2,898 in Italy, 548 in Iran, 270 in France and 1,238 in the United States.
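The fixed-lag encoding can be sketched in a few lines. This is a toy illustration, not the specification in Supplementary Methods section 3: a single hypothetical policy is regressed on a synthetic growth series in which the true effect is immediate, and the helper names `lagged_indicator` and `r_squared` are our own. Resampled confidence intervals are omitted.

```python
def lagged_indicator(policy, lag):
    """Shift a 0/1 policy series so it switches 'on' `lag` days after deployment."""
    return [0] * lag + policy[:len(policy) - lag]

def r_squared(y, x):
    """R^2 of an OLS fit of y on x with an intercept (squared correlation)."""
    n = len(y)
    mean_y, mean_x = sum(y) / n, sum(x) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Hypothetical series: policy deployed on day 10, immediate effect of -0.20
# on a no-policy growth rate of 0.30, plus a small alternating disturbance.
days = 20
policy = [1 if t >= 10 else 0 for t in range(days)]
growth = [0.30 - 0.20 * p + 0.01 * (-1) ** t for t, p in enumerate(policy)]

# Fit for each assumed delay between deployment and effect.
fit_by_lag = {lag: r_squared(growth, lagged_indicator(policy, lag)) for lag in range(6)}
best_lag = max(fit_by_lag, key=fit_by_lag.get)
```

Because the synthetic effect is immediate, longer assumed lags miscode post-policy days as no-policy days and the fit deteriorates, which mirrors the reasoning given in the caption.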
Extended Data Fig. 6 Estimated infection or hospitalization growth rates with actual anti-contagion policies and in a no-policy counterfactual scenario.
a, The estimated daily growth rates of active (China and South Korea) or cumulative (all others) infections based on the observed timing of all policy deployments within each subnational unit (blue) and in a scenario in which no policies were deployed (red). Identical to Fig. 3, but using an alternative disaggregated encoding of policies that does not group any policies into policy packages. The sample size is 3,669 in China, 595 in South Korea, 2,898 in Italy, 548 in Iran, 270 in France and 1,238 in the United States. b, Same as Fig. 3, but equation (7) is implemented for a single example administrative unit: Wuhan, China. The sample size is 46 observations. c, Same as Fig. 3, but using hospitalization data from France rather than cumulative cases (the French government stopped reporting cumulative cases after 25 March 2020). The sample size is 424 observations. For all panels, the difference between the with- and no-policy predictions is our estimated effect of actual anti-contagion policies on the growth rate of infections (or hospitalizations). The markers are daily estimates for each subnational administrative unit (vertical lines are 95% confidence intervals). Black circles are observed changes in log(infections) (or diamonds for log(hospitalizations)), averaged across observed administrative units.
Extended Data Fig. 7 Sensitivity of estimated averted/delayed infections to the choice of γ and σ in an SIR/SEIR framework.
The sensitivity of total averted/delayed cases presented in Fig. 4 to alternative modelling assumptions. We compute total cases across the respective final days in our samples for the six countries presented in our analysis. The figure displays how these totals vary with eight values of γ (0.05–0.4) and four values of σ (0.2, 0.33, 0.5, ∞), where the final value of σ (∞) corresponds to the SIR model. a, The simulated total number of infections under no policy. b, Same as in a, but using actual policies. c, The difference between a and b, which is the total number of averted/delayed infections. d, Same as c, but on a logarithmic scale similar to Fig. 4 (a–c are on a linear scale, trimmed to show details). Figure 4 uses γ = 0.079, which we calculate using empirical recovery/death rates in countries for which we observed them (China and South Korea; see Methods). If we assume a 14-day delay between infected individuals becoming non-infectious and being reported as ‘recovered’ in the data, we would calculate γ = 0.18. Figure 4 assumes σ = ∞.
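A sensitivity sweep of this kind can be sketched with a simple deterministic SEIR model in which σ = ∞ collapses to SIR. This is a minimal sketch, not the projection code used for Fig. 4: the transmission rates `beta_no_policy` and `beta_policy`, the 30-day horizon, the initial infected fraction and the Euler step are all hypothetical, and only four of the eight γ values are shown (0.079 and 0.18 are quoted in the caption; 0.05 and 0.4 are the endpoints of the stated range).

```python
import math

def cumulative_infections(beta, gamma, sigma, days=30, i0=1e-6, steps_per_day=10):
    """Euler integration of SEIR; returns the cumulative infected fraction.

    sigma = math.inf means no latent period, which is the SIR model.
    """
    dt = 1.0 / steps_per_day
    s, e, i = 1.0 - i0, 0.0, i0
    cumulative = i0
    for _ in range(days * steps_per_day):
        new_exposed = beta * s * i * dt
        # Without latency, newly exposed individuals are infectious at once.
        new_infectious = new_exposed if math.isinf(sigma) else sigma * e * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - gamma * i * dt
        cumulative += new_infectious
    return cumulative

# Hypothetical transmission rates with and without policy (assumptions).
beta_no_policy, beta_policy = 0.46, 0.15

gammas = (0.05, 0.079, 0.18, 0.4)
sigmas = (0.2, 0.33, 0.5, math.inf)

# Averted/delayed infections (as a population fraction) for each (gamma, sigma).
averted = {
    (g, s): cumulative_infections(beta_no_policy, g, s)
    - cumulative_infections(beta_policy, g, s)
    for g in gammas
    for s in sigmas
}
```

With a fixed transmission rate, adding a latent period (finite σ) slows early growth, so the SIR case yields the largest simulated totals over a fixed horizon in this sketch. How the paper maps its estimated growth rates to model parameters for each (γ, σ) pair is described in its Methods and is not reproduced here.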
Extended Data Fig. 8 Simulating reduced-form estimates for the no-policy growth rate of infections for different population regimes and disease dynamics.
We examine the performance of reduced-form econometric estimators through simulations in which different underlying disease dynamics are assumed (see Supplementary Methods section 2). Each histogram shows the distribution of econometrically estimated values across 1,000 simulated outbreaks. Estimates are for the no-policy infection growth rate (analogous to Fig. 2a) when three different policies are deployed at random moments in time. The black line shows the correct value imposed on the simulation and the red histogram shows the distribution of estimates using the regression in equation (7), applied to data output from the simulation. The grey dashed line shows the mean of this distribution. The 12 subpanels describe the results when various values are assigned to the mean infectious period (γ^{−1}) and mean latency period (σ^{−1}) of the disease. σ = ∞ is equivalent to SIR disease dynamics. In each panel, S_{min} is the minimum susceptible fraction observed across all 1,000 45-day simulations shown in each panel. In the real datasets used in the main text, after correcting for country-specific underreporting, S_{min} across all units analysed is 0.72 and 95% of the analysed units finish with S_{min} > 0.91. Bias refers to the distance between the dashed grey and black lines as a percentage of the true value. a, Simulations in near-ideal data conditions in which we observe active infections within a large population (such that the susceptible fraction of the population remains high during the sample period, similar to those in our data for Chongqing, China). b, Simulations in a non-ideal data scenario in which we are only able to observe cumulative infections in a small population (similar to those in our sample for Cremona, Italy).
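The logic of this check can be sketched with a much simpler stand-in. The sketch below is deterministic, includes no policies and runs a single outbreak per scenario, so it is not the stochastic 1,000-outbreak design of Supplementary Methods section 2. It assumes SIR dynamics, hypothetical parameters (β = 0.38, γ = 0.08, so the imposed early growth rate is β − γ = 0.30) and two hypothetical population sizes, and it replaces equation (7) with the mean daily change in log(infections).

```python
import math

def simulate_sir(beta, gamma, population, days=45, steps_per_day=100):
    """Euler SIR started from one infection; returns daily series and S_min."""
    dt = 1.0 / steps_per_day
    i = 1.0 / population
    s, c = 1.0 - i, i
    active, cumulative, s_min = [i], [c], s
    for _ in range(days):
        for _ in range(steps_per_day):
            new = beta * s * i * dt
            s -= new
            i += new - gamma * i * dt
            c += new
        active.append(i)
        cumulative.append(c)
        s_min = min(s_min, s)
    return active, cumulative, s_min

def estimated_growth(series):
    """Mean daily first difference of log(series): a stand-in for equation (7)."""
    diffs = [math.log(b) - math.log(a) for a, b in zip(series, series[1:])]
    return sum(diffs) / len(diffs)

def percent_bias(estimate, truth):
    return 100.0 * (estimate - truth) / truth

beta, gamma = 0.38, 0.08   # hypothetical parameters
true_growth = beta - gamma  # early exponential growth rate under SIR

# Near-ideal: active infections observed in a large (hypothetical) population.
active_large, _, s_min_large = simulate_sir(beta, gamma, population=3e7)
bias_large = percent_bias(estimated_growth(active_large), true_growth)

# Non-ideal: only cumulative infections observed in a small (hypothetical) population.
_, cumulative_small, s_min_small = simulate_sir(beta, gamma, population=3.6e5)
bias_small = percent_bias(estimated_growth(cumulative_small), true_growth)
```

In this sketch the large-population estimate stays close to the imposed growth rate because the susceptible fraction remains high, whereas the small-population estimate is biased downward as susceptibles are depleted within the 45-day window.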
Extended Data Fig. 9 Simulating reduced-form estimates for anti-contagion policy effects for different population regimes and assumed disease dynamics.
Same as Extended Data Fig. 8, but estimates are for the combined effect of three different policies (analogous to Fig. 2b) that are deployed at random moments in time. a, Simulations in near-ideal data conditions in which we observe active infections within a large population (such that the susceptible fraction of the population remains high during the sample period, similar to those in our data for Chongqing, China). b, Simulations in a non-ideal data scenario in which we are only able to observe cumulative infections in a small population (similar to those in our sample for Cremona, Italy).
Extended Data Fig. 10 Regression residuals for the growth rates of COVID-19 by country.
These plots show the estimated residuals from equation (7) for each country-specific econometric model. Histograms (left) show the estimated unconditional probability density function. Quantile plots (right) show quantiles of the cumulative distribution function (y axis) plotted against the same quantiles for a normal distribution. For additional details, see Fig. 3 and the ‘Econometric analysis’ section of the Methods.
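The coordinates of a quantile plot of this kind can be computed as in the sketch below. The residual values are hypothetical, the function name `qq_points` is our own, and the plotting positions (k + 0.5)/n are one common convention that the paper does not specify.

```python
from statistics import NormalDist, mean, stdev

def qq_points(residuals):
    """Pair each ordered residual with the matching normal quantile.

    The reference normal distribution uses the sample mean and standard
    deviation of the residuals.
    """
    n = len(residuals)
    reference = NormalDist(mean(residuals), stdev(residuals))
    ordered = sorted(residuals)
    return [(reference.inv_cdf((k + 0.5) / n), r) for k, r in enumerate(ordered)]

# Hypothetical regression residuals (illustrative only).
residuals = [0.1, -0.2, 0.0, 0.2, -0.1]
points = qq_points(residuals)
```

If the residuals were approximately normal, the resulting pairs would lie close to the 45-degree line; systematic departures in the tails would indicate heavier or lighter tails than a normal distribution.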
Supplementary information
Supplementary Information
This file contains Supplementary Notes, Supplementary Methods, and Supplementary Tables 1–6. Supplementary Notes: Details policy deployment decisions in each of the countries analyzed and describes the data acquisition and processing for the epidemiological and policy data. Both types of data are gathered from a variety of in-country data sources, including government public health websites, regional newspaper articles, and Wikipedia crowdsourced information. Supplementary Methods: Describes sensitivity analyses and simulations performed to verify the robustness of our model. These include the sensitivity of our regression model and counterfactual projections to varying epidemiological parameters, as well as the sensitivity of our estimates to alternative lag structures, withholding of data, and differing policy groupings. Supplementary Tables: Details: 1) the number of anti-contagion policies; 2) Wuhan pre-intervention epidemiological data; 3) the main results estimating the effect of policy on growth rates; 4–5) estimates of policy effects using a disaggregated and lagged version of our main model; and 6) estimates of the initial infection growth rate and case doubling times.
Cite this article
Hsiang, S., Allen, D., Annan-Phan, S. et al. The effect of large-scale anti-contagion policies on the COVID-19 pandemic. Nature 584, 262–267 (2020). https://doi.org/10.1038/s41586-020-2404-8