Modeling COVID-19 scenarios for the United States

IHME COVID-19 Forecasting Team*

We use COVID-19 case and mortality data from 1 February 2020 to 21 September 2020 and a deterministic SEIR (susceptible, exposed, infectious and recovered) compartmental framework to model possible trajectories of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections and the effects of non-pharmaceutical interventions in the United States at the state level from 22 September 2020 through 28 February 2021. Using this SEIR model, and projections of critical driving covariates (pneumonia seasonality, mobility, testing rates and mask use per capita), we assessed scenarios of social distancing mandates and levels of mask use. Projections of current non-pharmaceutical intervention strategies by state, with social distancing mandates reinstated when a threshold of 8 deaths per million population is exceeded (reference scenario), suggest that, cumulatively, 511,373 (469,578–578,347) lives could be lost to COVID-19 across the United States by 28 February 2021. We find that achieving universal mask use (95% mask use in public) could be sufficient to ameliorate the worst effects of epidemic resurgences in many states. Universal mask use could save an additional 129,574 (85,284–170,867) lives from 22 September 2020 through the end of February 2021, or an additional 95,814 (60,731–133,077) lives assuming a lesser adoption of mask wearing (85%), when compared to the reference scenario.

There remains no approved vaccine for the prevention of SARS-CoV-2 infection, and few pharmaceutical options for the treatment of COVID-19 are available [9][10][11]. The most optimistic scientists do not predict the availability of new vaccines or therapeutics before 2021 (refs. [12][13][14][15]). Non-pharmaceutical interventions (NPIs) are, therefore, the only available policy levers to reduce transmission 16. Several NPIs have been put in place across the United States in response to the epidemic (Fig. 1), including the dampening of transmission through the wearing of face masks and social distancing mandates (SDMs) aimed at reducing contacts through school closures, restrictions of gatherings, stay-at-home orders and the partial or full closure of nonessential businesses. Increased testing and isolation of infected individuals and their contacts will also have had an impact 17. These NPIs are credited with a reduction in viral transmission 18,19, along with a host of other environmental, behavioral and social determinants postulated to affect the course of the epidemic at the state level.
In the United States, decisions to implement SDMs or require mask use are generally made at the state level by government officials. These executives need to balance net losses from the societal turmoil, economic damage and indirect effects on health caused by NPIs with the direct benefits to human health of controlling the epidemic. Disease control has often been operationally defined in this pandemic context as the restriction of infections to below a specified level at which health services are not overwhelmed by demand and the loss of human health and life is consequently minimized 20.
In the first months of the SARS-CoV-2 outbreak in the United States, states enacted restrictive SDMs intended to reduce transmission (by limiting human-to-human contact) 5, while there was conflicting advice on the use of masks (https://www.npr.org/sections/goatsandsoda/2020/04/10/829890635/why-there-so-many-different-guidelines-for-face-masks-for-the-public/). At that early stage, relatively simple statistical models of future risk were sufficient to capture the general patterns of transmission 21. As different behavioral responses to SDMs emerged and, more importantly, as some states began to relax SDMs (Fig. 1), a modeling approach that directly quantified transmission and could be used to explore these developing scenarios was necessary. As states varied in their actions to remove and reinstate SDMs (Fig. 1) or began to issue mandatory mask-use orders (https://www.cnn.com/2020/06/19/us/states-face-mask-coronavirus-trnd/index.html) amid resurgences of COVID-19 (https://www.nytimes.com/2020/07/01/world/coronavirus-updates.html), a clear need for evidence-based assessments of the possible effect of the NPI options available to decision-makers became apparent.
There is now growing evidence that face masks can considerably reduce the transmission of respiratory viruses like SARS-CoV-2, thereby limiting the spread of COVID-19 (refs. [22][23][24]). We updated a recently published review 24 to generate a new meta-analysis (Supplementary Information) of peer-reviewed studies and preprints to assess the effectiveness of masks at preventing respiratory viral infections in humans 25. This analysis indicated a reduction in infection (from all respiratory viruses) for mask wearers by 40% (relative risk = 0.60, 95% uncertainty interval (UI) = 0.46-0.80) relative to controls 25. This is suggestive of a considerable population health benefit to mask use with great potential for uptake in the United States, where the national average for self-reported mask wearing was 49% as of 21 September 2020 (https://covid19.healthdata.org/; Supplementary Information).
Here we provide a state-level descriptive epidemiological analysis of the introduction of SARS-CoV-2 infection across the United States, from the first recorded case through to 21 September 2020. We use these observations to learn about epidemic progression and thereby model the first wave of transmission using a deterministic SEIR compartmental framework 26,27. This observed, process-based understanding of how NPIs affect epidemiological processes is then used to make inferences about the future trajectory of COVID-19 and how different combinations of existing NPIs might affect this course. Five SEIR-driven scenarios, along with covariates that inform them, were then projected through to 28 February 2021 (Methods). We use these scenarios as a sequence of experiments to describe a range of model outputs, including R effective (the change over time in the average number of secondary cases per infectious case in a population where not everyone is susceptible [26][27][28]), infections, deaths and hospital demand outcomes, which might be expected from plausible boundaries of the policy options available in the fall and winter of 2020 (see Methods and Supplementary Information for an extended rationale on scenario construction).
We established three boundary scenarios. First, we forecast the expected outcomes if states continue to remove SDMs at the current pace of 'mandate easing', with resulting increases in population mobility and number of person-to-person contacts. This is an alternative scenario to the more probable situation where states are expected to respond to an impending health crisis by reinstating some SDMs. In the second, 'plausible reference' scenario, we model the future progress of the pandemic assuming that states would once again shut down social interaction and some economic activity at a threshold for the daily death rate of 8 deaths per million population, the 90th percentile of the observed distribution of when states previously implemented SDMs (Fig. 1 and Supplementary Information). This scenario assumes reinstatement of SDMs for 6 weeks. In addition, newly available data on mask efficacy enabled the exploration of a third, 'universal mask-use' scenario to investigate the potential population-level benefits of increased mask use in addition to the same threshold-driven reinstatement of SDMs. In this best-case scenario model, 'universal' was defined as 95% of people wearing masks in public, based on the highest observed coverage of mask use globally (in Singapore) during the COVID-19 pandemic to date (Supplementary Information). Two derivative scenarios were also included to assist understanding, nuance and policy resolution around the three boundary scenarios. The first scenario, termed 'plausible reference + 85% mask use', modeled less than universal mask use in public (85%) in the presence of reinstatement of SDMs. The second was a scenario of universal mask use (95%) in the absence of any other NPIs (termed 'mandate easing + universal mask use'). Details and results for these additional scenarios are in the Supplementary Information.
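The threshold rule behind the plausible reference scenario can be illustrated with a minimal sketch. The 8 deaths per million trigger and the 6-week duration come from the text; the function name, the choice to start the mandate on the day the threshold is exceeded, and the fact that deaths are taken as a fixed input (rather than fed back through the transmission model) are simplifying assumptions made here for illustration only.

```python
from typing import List

THRESHOLD_PER_MILLION = 8.0  # daily deaths per million that triggers SDMs (from the text)
MANDATE_DAYS = 6 * 7         # SDMs are reinstated for 6 weeks (from the text)


def mandate_schedule(daily_deaths_per_million: List[float]) -> List[bool]:
    """Return, for each day, whether SDMs are in force under the threshold rule.

    Assumption: the mandate begins on the day the threshold is exceeded and
    lasts a fixed 42 days; it can be retriggered once it has expired.
    """
    active = [False] * len(daily_deaths_per_million)
    days_left = 0
    for day, rate in enumerate(daily_deaths_per_million):
        # Trigger only when no mandate is running and the rate exceeds the threshold.
        if days_left == 0 and rate > THRESHOLD_PER_MILLION:
            days_left = MANDATE_DAYS
        if days_left > 0:
            active[day] = True
            days_left -= 1
    return active
```

In the scenarios themselves the mandate would lower mobility and hence transmission, so a full implementation would sit inside the projection loop rather than run on a fixed death series.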
In addition, sensitivity analyses and detailed diagnostics are provided to help users calibrate the effects of the covariates used in the models on the scenarios discussed (Supplementary Information).

Results
Observed COVID-19 patterns. The COVID-19 epidemic has progressed unevenly across states. Between the first recorded death in the United States in early February 2020 and 21 September 2020, a cumulative total of 199,213 deaths from COVID-19 were reported (Fig. 2); a sixth of those (16.6%) occurred in New York alone. Washington and California issued the first sets of state-level mandates on 11 March 2020, prohibiting gatherings of 250 people or more in certain counties, and by 23 March 2020, all 50 states had initiated some combination of SDMs (Fig. 1). The highest levels of daily deaths at the state level between February and September of 2020 occurred in New York, New Jersey and Texas at 998, 311 and 220 deaths per day, respectively (Fig. 3 and Extended Data Fig. 1). On 21 September 2020, the highest level of daily deaths was in Florida at 101 deaths per day. A critical policy need at this stage of the modeling was the forecasting of hospital resource demands in the US states with the worst effective transmission rates (Virginia, New York and Missouri; Fig. 4). The highest peak demand was observed as 8,380 hospital intensive care unit (ICU) beds in New York (estimated initial hospital ICU bed availability of 718) on 10 April and 2,786 hospital ICU beds in New Jersey (estimated initial hospital ICU bed availability of 466) on 21 April; demand for hospital ICU beds had receded to within initial capacity levels across the United States by 21 September 2020 (Extended Data Figs. 3 and 4 and Table 1).
Under the mandate-easing scenario, by the US national election on 3 November 2020, a total of five states are predicted to exceed a threshold of daily deaths of 8 deaths per million (Fig. 3), and a total of 40 states would have an R effective greater than one (Fig. 4). By 28 February 2021, a total of 45 states are predicted to exceed that threshold under this scenario, and all states would reach an R effective of greater than one before the end of February 2021 (Table 1 and Fig. 4).
This scenario results in an estimated total of 152,775,751 (115,305,817-199,130,145) infections across the United States by the end of February 2021 (Extended Data Fig. 5).
When we modeled the future course of the epidemic assuming that states will once again shut down social interaction and economic activity when daily deaths reach a threshold of 8 deaths per million (plausible reference scenario), the projected cumulative death toll across the United States is forecast to be lower than that under the mandate-easing scenario, with 511,373 (469,578-578,347) deaths by 28 February 2021 (Fig. 2; Fig. 6). As with the previous scenario, even with the reinstatement of SDMs when daily deaths exceed 8 per million population, all states would reach an R effective greater than one before the end of February 2021 (Fig. 4).
The universal mask-use scenario, in which the population of each state was assumed to adopt and maintain a 95% level of mask use in public (Methods) in addition to states reinstating SDMs if a threshold daily death rate of 8 deaths per million population was exceeded, resulted in the lowest projected cumulative death toll across US states, with a total of 381,798 (336,479-421,953) cumulative deaths by 28 February 2021 (Fig. 2 and Table 1). Under this scenario, on 3 November 2020, no states will have exceeded a daily death rate of 8 deaths per million (Fig. 3), although 47 states are still estimated to exceed an R effective of 1.0 at some point in the projected period, and three states would have an R effective greater than 1.0 on 28 February 2021 (Fig. 4).
The plausible reference + 85% mask-use scenario saves a considerable number of lives at the national level (95,814 (60,731-133,077)) over and above the reference scenario, but is not as effective as the plausible reference + universal mask-use scenario. Although not surprising, this does help to confirm that any additional coverage that can be achieved through mask use will save lives.
The mandate-easing + universal mask-use scenario reveals substantial lives saved (20,936 (0-102,507)) over the plausible reference scenario, even in the absence of reinstatement of SDMs at the daily threshold of 8 deaths per one million population, underscoring the potential effects that increased levels of mask adoption could have while minimizing the deleterious economic repercussions of other NPIs.
Two out-of-sample (OOS) model assessments were conducted for two different time intervals of the modeling period to investigate the strength of evidence behind each of the covariate drivers of SARS-CoV-2 transmission intensity. Full details of these sensitivity analyses are shown at the national level in the Supplementary Information. These analyses indicate that care needs to be taken in interpreting the strength of these relationships, which show variability in time and space. For example, our OOS tests indicate that over some time frames, pneumonia mortality seasonality was either the most or least useful covariate, despite in-sample tests having consistently shown this to be an important predictor. Since pneumonia seasonality is one of the leading covariates driving expected increases in COVID-19 deaths in the fall and winter, it is important to be aware of this uncertainty when assessing the forecasts. It is critical to note, however, that even when we completely remove this covariate from our model, sensitivity analyses show a forecast of over 100,000 deaths from COVID-19 by the end of winter (101,615 (81,479-126,295) additional deaths; Supplementary Information). Since this covariate complexity makes it difficult to generalize the effects of this uncertainty, we provide extensive diagnostics for the covariate relationships in each of the states with examples of how to interpret these findings (Supplementary Information).
Model performance. The models presented here have been evaluated for OOS predictive validity using standard tests and metrics in an ongoing fashion and in a publicly available framework 21 . These SEIR models have consistently produced among the most accurate forecasts observed across models compared 21 . For example, for models released in June, the Institute for Health Metrics and Evaluation (IHME) SEIR model had the lowest median absolute percentage error (MAPE) at 10 weeks of forecasting at 20.2%, compared to 32.6% across models. We have included new sets of model and covariate diagnostics with worked descriptions for the most populous states (Supplementary Information and Supplementary Data 1-4) for transparent evaluation of our model performance. We emphasize that these are forecasts of possible futures, which are subject to many model assumptions and sources of data variability.
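For readers unfamiliar with the metric, the median absolute percentage error can be computed as in the sketch below. The function name and the input layout (one forecast and one observed value per location at a fixed horizon) are assumptions for illustration; the exact evaluation protocol is described in the cited comparison framework, not here.

```python
from statistics import median
from typing import Sequence


def median_absolute_percentage_error(
    predicted: Sequence[float], observed: Sequence[float]
) -> float:
    """Median of |predicted - observed| / |observed|, expressed in percent.

    Locations with an observed value of zero are skipped, since the
    percentage error is undefined for them (an assumption of this sketch).
    """
    errors = [
        100.0 * abs(p - o) / abs(o)
        for p, o in zip(predicted, observed)
        if o != 0
    ]
    if not errors:
        raise ValueError("no locations with a nonzero observed value")
    return median(errors)
```

Using the median rather than the mean keeps a few badly missed locations from dominating the summary.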

Discussion
We have delimited three possible future scenarios of the course of the COVID-19 epidemic in the United States at the state level (the mandate-easing, plausible reference and universal mask-use scenarios) to help frame and inform a national discussion on what actions could be taken during the fall of 2020 and the public health, economic and political influences that these decisions will have for the rest of the winter (here defined as the end of February 2021). To help us understand the policy nuances of these boundary scenarios, two derivative scenarios (plausible reference + 85% mask use and mandate easing + universal mask use) were also explored. In addition, selected sensitivity analyses were conducted for the covariates used in the models, so that their influence could be better understood.
Under all scenarios evaluated here, the United States is likely to face a continued public health challenge from the COVID-19 pandemic through 28 February 2021 and beyond, with populous states in particular potentially facing high levels of illness, deaths and ICU demands as a result of the disease. The implementation of SDMs as soon as individual states reach a threshold of 8 daily deaths per million could dramatically ameliorate the effects of the disease; achieving near-universal mask use could delay, or in many states, possibly prevent, this threshold from being reached and has the potential to save the most lives while minimizing damage to the economy. National and state-level decision-makers can use these forecasts of the potential health benefits of available NPIs, alongside considerations of economic and other social costs, to make more informed decisions on how to confront the COVID-19 pandemic at the local level. Our findings indicate that universal mask use, a relatively affordable and low-impact intervention, has the potential to serve as a priority life-saving strategy in all US states. Our derivative scenarios suggest that this likely remains true at sub-universal levels of mask coverage and at universal mask coverage in the absence of any other NPIs.
New epidemics, resurgences and second waves are not inevitable. Several countries, such as South Korea, Germany and New Zealand, have sustained reductions in COVID-19 cases over time (https://covid19.healthdata.org/). Early indications that seasonality may play a role in transmission, with increased spread during colder winter months as is seen with other respiratory viruses [29][30][31][32], highlight the importance of taking action both before and during the pneumonia season in the United States. While it is yet unclear if COVID-19 seasonality will follow the pattern of related coronaviruses 32 and parallel that of pneumonia seasonality, the sometimes strong associations observed in these forecasts indicate that increased government vigilance is prudent. Moreover, given the potential sensitivity of the model to effects of seasonality, a substantial winter effect cannot be ruled out. This effect would be against a background of more widespread and prevalent COVID-19 infection than experienced in the first wave.
Mask use has emerged as a contentious issue in the United States with only 49% of US residents reporting that they 'always' wear a mask in public as of 21 September 2020 (https://covid19.healthdata.org/). Regardless, toward the end of 2020, masks could help to contain a second wave of resurgence while reducing the need for frequent and widespread implementation of SDMs. Although 95% mask use across the population may seem a high threshold to achieve and maintain, on a neighborhood scale this level has already been observed in areas of New York (https://www.nytimes.com/2020/08/20/nyregion/nyc-face-masks.html); and on a state level, reported mask use has exceeded 60% in Virginia, Florida and California (see Supplementary Information for related methods). In countries where mask use has been widely adopted, such as Singapore, South Korea, Hong Kong, Japan and Iceland among others, transmission has declined and, in some cases, halted (https://covid19.healthdata.org/). These examples serve as additional natural experiments 33 of the likely effects of masks and support the assumptions and findings from the universal mask-use scenario in our study. The potential life-saving benefit of increasing mask use in the coming fall and winter cannot be overstated. It is likely that US residents will need to choose between higher levels of mask use or risk the frequent redeployment of more stringent and economically damaging SDMs; or, in the absence of either measure, face a reality of a rising death toll 34. Longer term, the future of COVID-19 in the United States will be determined by the deployment of an efficacious vaccine and the evolution of herd immunity 35.
This work represents the outputs of a class of models that aim to abstract the disease transmission process in populations to a level that is tractable for understanding, and, in this case, that can be used for prediction. A clear limitation of any such modeling exercise is that it will be constrained by data (disease and relevant covariates), the model of understanding developed and the length of time available to the model to learn/train the important dynamics. We have therefore tried to benchmark our model against alternative models of the COVID-19 pandemic and fully document our predictive performance with a range of measures 21. In addition, we have provided all the data and model code to enable full reproducibility and increased transparency, provided sensitivity analyses for some of our core assumptions, and presented a range of likely futures 36 in the form of mandate-easing, plausible reference and universal mask-use scenarios (as well as two derivative scenarios thereof) for decision-makers to review. Other outputs of the SEIR model, such as the proportion of the population that is affected, are also provided and triangulated against independent data, in this case seroprevalence surveys (Extended Data Fig. 9). Finally, because uncertainty compounds with increased distance into the future predicted, the data, model and its assumptions will be iteratively updated as the pandemic continues to unfold (https://www.latimes.com/opinion/story/2020-07-10/covidforecast-deaths-ihme-washington/).
We wish to reiterate to decision-makers that there are a multitude of limitations in any modeling study of this type 26,27 ; an extended description of the limitations specific to this study is provided (Methods). Specifically, (1) these models are approximations of real-world scenarios, and we have simplified many aspects of the epidemiological process of disease transmission to make these models computationally feasible; (2) these models are driven strongly by mortality data with all of its fidelity and recording imperfections; (3) these models are also informed by a wealth of other data types that each have differential availability, as well as detection and measurement bias issues for which we can never fully calibrate; (4) these models make particular assumptions about covariates, including seasonality, that while evidence-based and explicitly stated, are subject to sensitivity analyses because their effects could be substantial; and (5) our knowledge of this dynamic pandemic improves daily, so there should be no expectation that this modeling framework is final or that the data that drive it are fixed. While acknowledging all of these policy-relevant limitations, we take care to note that our publicly released model comparison framework 21 supports the robust, iterative and objective evaluation of our modeling approach. This is especially valuable as the complexities of the pandemic response require that our modeling efforts remain agile to epidemiological and societal developments and that we continue to reevaluate and post updates weekly (https://covid19.healthdata.org/). Finally, it is especially important for decision-makers that we emphasize that we are not forecasting a future, but rather a range of outcomes that we believe are more probable given the scenarios tested, based on the data observed so far and our model assumptions. These forecasts are best considered as helpful guides, rather than definitive maps.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41591-020-1132-9.

Methods
Our analysis strategy supports two main and interconnected objectives: (1) to generate forecasts of COVID-19 deaths, infections and hospital resource needs for all US states; and (2) to explore alternative scenarios on the basis of changes in state-enforced SDMs or population-level mask use. The modeling approach to achieve this is summarized in the Supplementary Information and can be divided into four stages: (1) identification and processing of COVID-19 data, (2) exploration and selection of key drivers or covariates, (3) modeling deaths and cases across three boundary scenarios of SDMs in US states using an SEIR framework and (4) modeling health service utilization as a function of forecast infections and deaths within those scenarios. This study complies with the Guidelines for Accurate and Transparent Health Estimates Reporting statement (Supplementary Information).
Data identification and processing. IHME forecasts include data from local and national governments, hospital networks and associations, the World Health Organization, third-party aggregators and a range of other sources. Data sources and corrections are described in detail in the Supplementary Information and in the data availability statement. Briefly, daily confirmed case and death numbers due to COVID-19 are collated from the Johns Hopkins University data repository; we supplement and correct this dataset as needed to improve the accuracy of our projections and adjust for reporting-day biases (Supplementary Information). Before modeling, observed cumulative deaths are smoothed using a spline-based smoothing algorithm with randomly placed knots 37. Uncertainty is derived from bootstrapping and resampling of the observed deaths. The time series of case data is used as a leading indicator of death based on an infection fatality ratio (IFR) and a lag from infection to death. These smoothed estimates of observed deaths by location are then used to create estimated infections based on an age distribution of infections and on age-specific IFRs. The age-specific infections were collapsed into total infections by day and state and used as data inputs in the SEIR model. Detailed descriptions of data smoothing and transformation steps are provided in the Supplementary Information.
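The back-calculation from deaths to infections can be sketched roughly as follows. This is a simplified illustration, not the procedure in the Supplementary Information: it assumes a single fixed lag from infection to death, collapses the age structure into one population-level IFR before dividing, and uses hypothetical names and inputs throughout.

```python
from typing import Dict, List


def infections_from_deaths(
    daily_deaths: List[float],
    infection_age_shares: Dict[str, float],
    age_specific_ifr: Dict[str, float],
    lag_days: int,
) -> List[float]:
    """Estimate daily infections from (smoothed) daily deaths.

    Assumptions of this sketch: the age shares of infections sum to 1, the
    IFR is constant over time, and every death occurs exactly `lag_days`
    after infection. Element k of the result is the estimate for day k.
    """
    # Population-level IFR as the infection-weighted average of age-specific IFRs.
    overall_ifr = sum(
        share * age_specific_ifr[age] for age, share in infection_age_shares.items()
    )
    # Deaths on day t are attributed to infections on day t - lag_days.
    return [deaths / overall_ifr for deaths in daily_deaths[lag_days:]]
```

Because of the lag, the most recent `lag_days` days have no death-based infection estimate, which is one reason case data are useful as a leading indicator.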

Covariate selection. Covariates for the compartmental transmission SEIR model are predictors of the β parameter in the model that affects the transition from the susceptible to exposed state; specifically, β represents the contact rate multiplied by the probability of transmission per contact. Covariates were evaluated on the basis of biological plausibility and on the impact on the results of the SEIR model. Given limited empirical evidence of population-level predictors of SARS-CoV-2 transmission, biologically plausible predictors of pneumonia such as population density (percentage of the population living in areas with more than 1,000 individuals per square kilometer), tobacco smoking prevalence, population-weighted elevation, lower respiratory infection mortality rate and particulate matter air pollution were considered. These covariates are representative at a population level and are time invariant. Location-specific estimates for these covariates are derived from the Global Burden of Disease Study 2019 (refs. 38-40). Time-varying covariates include pneumonia excess mortality seasonality, diagnostic tests administered per capita, population-level mobility and personal mask use. These are described below.
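To make the role of β concrete, the sketch below steps a textbook deterministic SEIR model forward one day at a time with a time-varying β, and computes R effective from the current state. It is an illustration under stated assumptions, not the model used in this study: the compartment structure, the Euler time step, the parameter names `sigma` (rate of leaving the exposed state) and `gamma` (rate of recovery), and the formula R effective = (β/γ) × S/N are the standard textbook forms.

```python
from typing import List, Sequence, Tuple

State = Tuple[float, float, float, float]  # (S, E, I, R)


def seir_step(state: State, beta: float, sigma: float, gamma: float, dt: float = 1.0) -> State:
    """Advance a textbook SEIR model by one Euler step of length dt (days)."""
    s, e, i, r = state
    n = s + e + i + r
    new_exposed = beta * s * i / n * dt   # S -> E, governed by beta
    new_infectious = sigma * e * dt       # E -> I
    new_recovered = gamma * i * dt        # I -> R
    return (
        s - new_exposed,
        e + new_exposed - new_infectious,
        i + new_infectious - new_recovered,
        r + new_recovered,
    )


def simulate(state: State, betas: Sequence[float], sigma: float, gamma: float) -> List[State]:
    """Run the model with one beta per day, e.g. beta predicted from covariates."""
    trajectory = [state]
    for beta in betas:
        state = seir_step(state, beta, sigma, gamma)
        trajectory.append(state)
    return trajectory


def r_effective(state: State, beta: float, gamma: float) -> float:
    """Textbook R effective: basic reproduction number scaled by the susceptible share."""
    s, e, i, r = state
    return beta / gamma * s / (s + e + i + r)
```

In this framing, covariates such as mobility or mask use act only through the daily β values passed to `simulate`.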
Pneumonia seasonality. We used weekly pneumonia mortality data from the National Center for Health Statistics Mortality Surveillance System (https://gis.cdc.gov/grasp/fluview/mortality.html) from 2013 to 2019 by US state. Pneumonia deaths included all deaths classified by the full range of the International Classification of Diseases codes in J12-J18.9. We pooled data over available years for each state and found the weekly deviation from the annual, state-specific mean mortality due to pneumonia. We then fit a seasonal pattern using a Bayesian meta-regression model with a flexible spline and assumed annual periodicity (Supplementary Information). For locations outside the United States, we used vital registration data where available. Locations without vital registration data had weekly pneumonia seasonality predicted based on latitude from a model pooling all available data (Supplementary Information).
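The pooling step that precedes the spline fit can be sketched as below for a single state. The record layout and function name are hypothetical, and expressing the deviation as a relative difference from each year's mean is an assumption of this sketch; the text specifies only that a weekly deviation from the annual, state-specific mean was used.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple


def weekly_deviation(records: List[Tuple[int, int, float]]) -> Dict[int, float]:
    """Average relative deviation of weekly pneumonia deaths from the annual mean.

    `records` holds (year, week_of_year, deaths) tuples for one state.
    A value of 0.2 for a week means deaths run 20% above that year's mean.
    """
    deaths_by_year = defaultdict(list)
    for year, _week, deaths in records:
        deaths_by_year[year].append(deaths)
    annual_mean = {year: mean(values) for year, values in deaths_by_year.items()}

    deviations_by_week = defaultdict(list)
    for year, week, deaths in records:
        deviations_by_week[week].append(deaths / annual_mean[year] - 1.0)

    # Pool over years: one averaged deviation per week of the year.
    return {week: mean(devs) for week, devs in sorted(deviations_by_week.items())}
```

The resulting week-indexed values are the kind of input a periodic spline would then smooth into a seasonal curve.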
Testing per capita. We considered diagnostic testing for active SARS-CoV-2 infections as a predictor of the ability for a state to identify and isolate active infections. We assumed that higher rates of testing were negatively associated with SARS-CoV-2 transmission. Our primary sources for US testing data were compiled by the COVID Tracking Project (Supplementary Information). Unless testing data existed before the first confirmed case in a state, we assumed that testing was nonzero after the date of the first confirmed case. Before producing predictions of testing per capita, we smoothed the input data by using the same smoothing algorithm used for smoothing daily death data before modeling (previously described). Testing per capita projections for unobserved future days were based on linearly extrapolating the mean day-over-day difference in daily tests per capita for each location. We put an upper limit on diagnostic tests per capita of 500 per 100,000 based on the highest observed rates in June 2020.
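The extrapolation rule just described is simple enough to sketch directly. The cap of 500 per 100,000 is taken from the text; the function name, the use of the full observed series to compute the mean difference, and the floor at zero are assumptions added here for illustration.

```python
from typing import List

CAP_PER_100K = 500.0  # upper limit on daily tests per 100,000 (from the text)


def project_testing(observed_per_100k: List[float], horizon_days: int) -> List[float]:
    """Linearly extrapolate daily tests per 100,000 for one location.

    The slope is the mean day-over-day difference of the (already smoothed)
    observed series. Projections are capped at CAP_PER_100K and, as an
    assumption of this sketch, floored at zero.
    """
    if len(observed_per_100k) < 2:
        raise ValueError("need at least two observed days to compute a difference")
    differences = [b - a for a, b in zip(observed_per_100k, observed_per_100k[1:])]
    slope = sum(differences) / len(differences)
    last = observed_per_100k[-1]
    return [
        min(CAP_PER_100K, max(0.0, last + slope * k))
        for k in range(1, horizon_days + 1)
    ]
```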
Social distancing mandates. SDMs were not used as direct covariates in the transmission model. Rather, SDMs were used to predict population mobility (see below), which was subsequently used as a covariate in the transmission model. We collected the dates of state-issued mandates enforcing social distancing, as well as the planned or actual removal of these mandates. The measures that we included in our model were: (1) severe travel restrictions, (2) closing of public educational facilities, (3) closure of nonessential businesses, (4) stay-at-home orders and (5) restrictions on gathering size. Generally, these came from state government official orders or press releases.
To determine the expected change in mobility due to SDMs, we used a Bayesian, hierarchical meta-regression model with random effects by location on the composite mobility indicator to estimate the effects of social distancing policies on changes in mobility (Supplementary Information).
Mobility. We used four data sources on human mobility to construct a composite mobility indicator. Those sources were Facebook, Google, SafeGraph and Descartes Labs (Supplementary Information). Each source takes a slightly different approach to capturing mobility, so before constructing a composite mobility indicator, we standardized these different data sources (Supplementary Information). Briefly, this first involved determining the change in a baseline level of mobility for each location by data source. Then, we determined a location-specific median ratio of change in mobility for each pairwise comparison of mobility sources, using Google as a reference and adjusting the other sources by that ratio. The time series for mobility was estimated using a Gaussian process regression model using the standardized data sources to get a composite indicator for change in mobility for each location day.
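The median-ratio adjustment can be sketched as follows for one location and one non-reference source. The dictionary layout (day mapped to percent change from baseline) and the function name are hypothetical, and the Gaussian process step that follows is not shown.

```python
from statistics import median
from typing import Dict


def standardize_to_reference(
    reference: Dict[int, float], other: Dict[int, float]
) -> Dict[int, float]:
    """Rescale `other` so it is comparable to `reference` (Google in the text).

    Both inputs map a day index to the change in mobility from baseline.
    The scaling factor is the median, over days present in both sources, of
    reference / other. Days where `other` is zero are skipped (an assumption
    of this sketch, to avoid division by zero).
    """
    shared_days = [day for day in reference if day in other and other[day] != 0]
    ratio = median(reference[day] / other[day] for day in shared_days)
    return {day: value * ratio for day, value in other.items()}
```

A median is less sensitive than a mean to individual days on which the two sources disagree sharply, such as holidays captured differently by each provider.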
We calculated the residuals between our predicted composite mobility time series and input composite time series, and then applied a first-order random walk to the residuals. The random walk was used to predict residuals from 1 January 2020 to 1 January 2021, which were then added to the mobility predictions to produce a final time series with uncertainty: 'past' changes in mobility from 1 January 2020 to 28 September 2020 and projected mobility from 28 September 2020 to 1 January 2021.

Masks. We performed a meta-analysis of 40 peer-reviewed scientific studies in an assessment of mask effectiveness for preventing respiratory viral infections (Supplementary Information). The studies were extracted from a preprint publication 24 . In addition, we considered all articles from a second meta-analysis 23 and one supplemental publication 41 . These studies included both persons working in health care and the general population, especially family members of those with known infections. The studies indicate overall reductions in infections due to masks preventing exhalation of respiratory droplets containing viruses, as well as some prevention of inhalation by those uninfected. The resulting meta-regression calculated log-transformed relative risks and corresponding log-transformed standard errors based on raw counts and used a continuity correction for studies with zero counts in the raw data (0.001). We included additional specifications and characteristics to account for differences in the characteristics of individual studies and to identify important factors impacting mask effectiveness (Supplementary Information).
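The per-study inputs to such a meta-regression can be sketched as follows. This is a generic calculation of a log relative risk and its standard error from raw counts, not the authors' code; the function and argument names are hypothetical, and adding the correction to every cell of the 2x2 table is an assumed convention.

```python
import math


def log_relative_risk(events_mask, n_mask, events_no_mask, n_no_mask, correction=0.001):
    """Log relative risk (mask versus no mask) and its standard error.

    When either arm has zero events, `correction` is added to every cell of
    the 2x2 table (so each arm total grows by 2 * correction). That
    convention is an assumption made for this sketch.
    """
    a, n1 = float(events_mask), float(n_mask)
    b, n2 = float(events_no_mask), float(n_no_mask)
    if a == 0 or b == 0:
        a += correction
        b += correction
        n1 += 2 * correction
        n2 += 2 * correction
    log_rr = math.log((a / n1) / (b / n2))
    # Standard large-sample (delta-method) standard error of log(RR).
    se = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n2)
    return log_rr, se
```

With a correction as small as 0.001, a zero-event arm yields a very large standard error, so such studies carry little weight in an inverse-variance analysis.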
We used MR-BRT (meta-regression, Bayesian, regularized and trimmed), a meta-regression tool developed at the Institute for Health Metrics and Evaluation (Supplementary Information), to perform a meta-analysis that considered the various characteristics of each study. We accounted for between-study heterogeneity and incorporated the remaining between-study heterogeneity into the width of the uncertainty interval (UI). We also performed sensitivity analyses to verify the robustness of the modeled estimates and found that the estimate of the effectiveness of mask use did not change significantly under four alternative analyses: changing the continuity correction assumption, using odds ratios versus relative risks from published studies, using a fixed-effects versus a mixed-effects model and including studies without information on covariates.
We estimated the proportion of people who self-reported always wearing a face mask when outside in public for both US and global locations using data from PREMISE (US), the Kaiser Family Foundation (US), YouGov (non-US) and Facebook (non-US) surveys (Supplementary Information). We used the same smoothing model as for COVID-19 deaths and testing per capita to produce estimates of observed mask use. This smoothing process averaged each data point with its neighbors. The level of mask use starting on 21 September 2020 (the last day of processed and analyzed data) was assumed to be flat. Among states without state-specific data, a within-the-US regional average was used.
Deterministic modeling framework. Model specification is summarized in a schematic with additional details provided in the Supplementary Information. To fit and predict disease transmission dynamics, we include a SEIR component in our multistage model. In particular, the population of each location is tracked through the following system of differential equations:

$$
\begin{aligned}
\frac{dS}{dt} &= -\beta(t)\,\frac{S\,(I_1 + I_2)^{\alpha}}{N},\\
\frac{dE}{dt} &= \beta(t)\,\frac{S\,(I_1 + I_2)^{\alpha}}{N} - \sigma E,\\
\frac{dI_1}{dt} &= \sigma E - \gamma_1 I_1,\\
\frac{dI_2}{dt} &= \gamma_1 I_1 - \gamma_2 I_2,\\
\frac{dR}{dt} &= \gamma_2 I_2,
\end{aligned}
$$

where $N$ is the total population of the location, $\alpha$ represents a mixing coefficient to account for imperfect mixing within each location, $\sigma$ is the rate at which infected individuals become infectious, $\gamma_1$ is the rate at which infectious people transition out of the presymptomatic phase and $\gamma_2$ is the rate at which individuals recover. This model does not distinguish between symptomatic and asymptomatic infections but has two infectious compartments ($I_1$ and $I_2$) to allow for interventions that would avoid focus on those who could not be symptomatic; $I_1$ is thus the presymptomatic compartment.
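To make the compartmental structure concrete, the following Python sketch integrates an S-E-I1-I2-R system with a simple forward Euler scheme. It assumes that new infections occur at rate β(t) S (I1 + I2)^α / N, with flows E to I1 at rate σ, I1 to I2 at rate γ1 and I2 to R at rate γ2, consistent with the parameter definitions above. The function names, the Euler integrator and the step size are illustrative choices, not the fitting pipeline described in the Supplementary Information.

```python
def seir_derivatives(state, beta, alpha, sigma, gamma1, gamma2):
    """Right-hand side of the assumed S-E-I1-I2-R system."""
    s, e, i1, i2, r = state
    n = s + e + i1 + i2 + r  # closed population, so N stays constant
    new_infections = beta * s * (i1 + i2) ** alpha / n
    return (
        -new_infections,               # dS/dt
        new_infections - sigma * e,    # dE/dt
        sigma * e - gamma1 * i1,       # dI1/dt
        gamma1 * i1 - gamma2 * i2,     # dI2/dt
        gamma2 * i2,                   # dR/dt
    )


def simulate(state, beta_fn, days, alpha, sigma, gamma1, gamma2, steps_per_day=10):
    """Forward Euler integration; returns the state at the start of each day."""
    dt = 1.0 / steps_per_day
    trajectory = [tuple(state)]
    for day in range(days):
        for step in range(steps_per_day):
            t = day + step * dt
            rates = seir_derivatives(state, beta_fn(t), alpha, sigma, gamma1, gamma2)
            state = tuple(x + dt * dx for x, dx in zip(state, rates))
        trajectory.append(state)
    return trajectory
```

For example, `simulate((999_990.0, 0.0, 10.0, 0.0, 0.0), lambda t: 0.5, 60, alpha=0.95, sigma=0.25, gamma1=0.5, gamma2=0.33)` returns 61 daily states. The parameter values in that call are placeholders rather than fitted estimates.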
Using the next-generation matrix approach, we can directly calculate both the basic reproductive number under control ($R_c(t)$) and the effective reproductive number ($R_{\text{effective}}(t)$) as (Supplementary Information):

$$
R_c(t) = \alpha\,\beta(t)\,\bigl(I_1(t) + I_2(t)\bigr)^{\alpha - 1}\left(\frac{1}{\gamma_1} + \frac{1}{\gamma_2}\right),
\qquad
R_{\text{effective}}(t) = R_c(t)\,\frac{S(t)}{N},
$$

where $S(t)$ is the susceptible population and $N$ is the total population of the location. By allowing β(t) to vary in time, our model is able to account for changes in transmission intensity as human behavior shifts over time (for example, changes in mobility, adding or removing SDMs and changes in population mask use). Briefly, we combine data on cases (correcting for trends in testing), hospitalizations and deaths into a distribution of trends in daily deaths.
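The two reproductive numbers can be computed directly from the compartment sizes. The sketch below assumes R_c(t) = α β(t) (I1 + I2)^(α − 1) (1/γ1 + 1/γ2) and R_effective(t) = R_c(t) S(t)/N, which is what a next-generation calculation gives when the force of infection has the form β(t) S (I1 + I2)^α / N. The function name and argument layout are hypothetical.

```python
def reproduction_numbers(beta, alpha, s, i1, i2, n, gamma1, gamma2):
    """Return (R_c, R_effective) under the assumed closed-form expressions.

    Requires i1 + i2 > 0 whenever alpha < 1, because the infectious pool
    is raised to the power (alpha - 1).
    """
    infectious = i1 + i2
    if infectious <= 0:
        raise ValueError("i1 + i2 must be positive")
    # Mean time spent infectious is 1/gamma1 (presymptomatic) plus 1/gamma2.
    r_c = alpha * beta * infectious ** (alpha - 1.0) * (1.0 / gamma1 + 1.0 / gamma2)
    r_effective = r_c * s / n
    return r_c, r_effective
```

When α = 1 (perfect mixing) the expression reduces to the familiar β multiplied by the mean infectious duration.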
To fit this model, we resampled 1,000 draws of daily deaths from this distribution for each state (Supplementary Information). Using an estimated IFR by age and the distribution of time from infection to death (Supplementary Information), we then used the daily deaths to generate 1,000 distributions of estimated infections by day from 10 January to 21 September 2020. We then fit the rates at which infectious individuals may come into contact and infect susceptible individuals (denoted as β(t)) as a function of a number of predictors that affect transmission. Our modeling approach acts across the overall population (that is, no assumed age structure for transmission dynamics), and each location is modeled independently of the others (that is, we do not account for potential movement between locations).
We detail the SEIR fitting algorithm in the Supplementary Information. Briefly, for each draw, we first fit a smooth curve to our estimates of daily new infections. Then, sampling $\gamma_2$, $\sigma$ and $\alpha$ from defined ranges from the literature (Supplementary Information) and using $\gamma_1 = 1/2$, we sequentially fit the $E$, $I_1$, $I_2$ and $R$ components in the past. We then algebraically solve the above system of differential equations for β(t).
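The final algebraic step can be sketched as follows. If daily new infections equal β(t) S (I1 + I2)^α / N (an assumed form of the force of infection, consistent with the mixing coefficient α defined earlier), then β(t) follows by rearrangement once the compartments have been fitted. The function name and input layout are hypothetical.

```python
def solve_beta(new_infections, s, i1, i2, n, alpha):
    """Recover beta(t) from fitted daily new infections and compartment sizes.

    All inputs except n and alpha are equal-length lists indexed by day.
    Rearranges new_infections = beta * S * (I1 + I2)**alpha / N for beta.
    """
    betas = []
    for infections, s_t, i1_t, i2_t in zip(new_infections, s, i1, i2):
        infectious = i1_t + i2_t
        if s_t <= 0 or infectious <= 0:
            # beta is not identifiable without susceptibles and infectious people.
            betas.append(None)
        else:
            betas.append(infections * n / (s_t * infectious ** alpha))
    return betas
```

Days on which the susceptible or infectious pool is empty are returned as `None` here, since transmission intensity cannot be inferred from them.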
The next stage of our model fits relationships between past changes in β(t) and the covariates described above: mobility, testing, masks, pneumonia seasonality and others. The time-varying covariates were forecast from 28 September 2020 to 28 February 2021 (Supplementary Information). The fitted regression was then used to estimate future transmission intensity $\beta_{\text{pred}}(t)$. The final future transmission intensity is then an adjusted version of $\beta_{\text{pred}}(t)$ based on the average fit over the recent past (where the window of averaging varies by draw from 2 to 4 weeks; Supplementary Information).
Finally, we used the future estimated transmission intensity to predict future transmission (using the same parameter values for all other SEIR parameters for each draw). In a reversal of the translation of deaths into infections, we then used the estimated daily new infections to calculate estimated daily deaths (again using the location-specific IFR). We also used the estimated trajectories of each SEIR compartment to calculate $R_c$ and $R_{\text{effective}}$.
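The translation of infections into deaths can be pictured as a convolution of daily new infections with an infection fatality ratio (IFR) and a distribution of time from infection to death. The sketch below uses a single scalar IFR and a discrete lag distribution. Both simplifications, and the function name, are assumptions for illustration; the model itself uses an age-based IFR and the delay distribution described in the Supplementary Information.

```python
def infections_to_deaths(infections, ifr, delay_pmf):
    """Expected daily deaths from daily new infections.

    infections: daily new infections.
    ifr:        infection fatality ratio (scalar here; an assumed simplification).
    delay_pmf:  delay_pmf[k] is the probability that a fatal infection
                results in death k days after infection (hypothetical input).
    Deaths that would fall beyond the end of the series are dropped.
    """
    deaths = [0.0] * len(infections)
    for day, infected in enumerate(infections):
        for lag, prob in enumerate(delay_pmf):
            if day + lag < len(deaths):
                deaths[day + lag] += infected * ifr * prob
    return deaths
```

Because deaths lag infections, the last days of any projected death series are incomplete unless the infection series extends far enough back to cover the full delay distribution.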
A final step, described in the Supplementary Information, takes predicted infections and deaths and applies a hospital-use microsimulation to estimate hospital resource need for each US state; the results are presented online (https://covid19.healthdata.org/).
Forecasts/scenarios. Policy responses to COVID-19 can be supported by the evaluation of the impacts of various scenarios of those options, against a background of a business-as-usual assumption, to explore fully the potential impact of policy levers available. Additional details are available in the Supplementary Information.
We estimate the trajectory of the epidemic by state under a mandate-easing scenario that models what would happen in each state if the current pattern of easing SDMs continues and new mandates are not implemented. This should be thought of as a worst-case scenario where, regardless of how high the daily death rate becomes, SDMs will not be reintroduced and behavior (including population mobility and mask use) will not vary before 28 February 2021. In locations where the number of cases is rising, this leads to very high numbers of cases by the end of the year.
As a more plausible scenario, we use the observed experience from the first phase of the pandemic to predict the likely response of state and local governments during the second phase. This plausible reference scenario assumes that in each location the trend of easing SDMs will continue at its current trajectory until the daily death rate reaches a threshold of 8 deaths per million. If the daily death rate in a location exceeds that threshold, we assume that SDMs will be reintroduced for a 6-week period. The choice of threshold (of a daily rate of 8 deaths per million) represents the 90th percentile of the distribution of daily death rate at which US states implemented their mandates during the first months of the COVID-19 pandemic. We selected the 90th percentile rather than the 50th percentile to capture an anticipated increased reluctance from governments to reinstate mandates because of the economic effects of the first set of mandates. In locations that do not exceed the threshold of a daily death rate of 8 per million, the projection is based on the covariates in the model and the forecasts for these to 28 February 2021. In locations where the daily death rate exceeded 8 per million at the time of running our final model (21 September 2020), we assumed that mandates would be introduced within 7 days.
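The reinstatement rule in the reference scenario can be sketched as a simple trigger. The function below flags the days on which mandates would be active, given a series of daily death rates per million. It is deliberately simplified: in the full model, reinstated mandates feed back into mobility and hence into later death rates, which this sketch ignores. The function name and the choice to count the trigger day as the first day of the 6-week period are assumptions.

```python
def mandate_schedule(daily_deaths_per_million, threshold=8.0, duration_days=42):
    """Flag days on which SDMs are assumed active.

    A mandate period of `duration_days` (6 weeks) starts on any day the rate
    exceeds `threshold` while no mandate is already in force. Feedback of
    mandates on subsequent death rates is not modeled here.
    """
    active_until = -1  # index of the last day of the current mandate period
    active = []
    for day, rate in enumerate(daily_deaths_per_million):
        if day > active_until and rate > threshold:
            active_until = day + duration_days - 1
        active.append(day <= active_until)
    return active
```

A period already in force is not extended by further exceedances, but a new period can begin immediately after the previous one ends if the rate is still above the threshold.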
The scenario of universal mask use models what would happen if 95% of the population in each state always wore a mask when they were in public. This value was chosen to represent the highest observed rate of mask use in the world so far during the COVID-19 pandemic (Supplementary Information). In this scenario, we also assumed that if the daily death rate in a state exceeds 8 deaths per million, SDMs will be reintroduced for a 6-week period.
Two additional, derivative scenarios were included to assist understanding and policy resolution of these main framework scenarios: a less comprehensive mask-wearing scenario of 85% public use of masks and a scenario of universal mask use in the absence of any additional NPIs. The less comprehensive mask-wearing scenario evaluated what would happen if 85% of the population in each state always wore a mask when they were in public. As with the universal mask-use scenario, we also assumed that if the daily death rate in a state exceeds 8 deaths per million, SDMs will be reintroduced for a 6-week period. For completeness, we also evaluated universal mask use by 95% of the population in a scenario that assumes no implementation of other NPIs at any threshold value of daily deaths. The results from this scenario, which did not differ notably from the more probable version where states respond to rising numbers of daily COVID-19 deaths by reinstating SDMs, are provided in the Supplementary Information.
Model validation. Out-of-sample (OOS) predictive performance for IHME SEIR models has been assessed against subsequently observed trends in an ongoing fashion and compared to other publicly available COVID-19 mortality forecasting models in a publicly available framework 21 . The IHME SEIR model described here has consistently demonstrated high accuracy, as measured by a low MAPE, when compared to models from other groups. For example, among models released in June, at 10 weeks of extrapolation, the IHME SEIR model had the lowest MAPE of any observed forecasting group at 20.2%, compared to an average of 32.6% across groups. Numerous other aspects of predictive performance are assessed in our publicly available framework 21 .
The increasing number of population-based serology surveys conducted also provides a unique opportunity to cross-validate our forecasts with modeled epidemiological outcomes. In Extended Data Fig. 9, we compare these serology surveys (such as the Spanish ENE-COVID study 42 ) to our estimated population seropositivity, time indexed to the date that the survey was conducted. In general, across the varied locations that have been reported globally, we note a high degree of agreement between the estimated and surveyed seropositivity. As more serology studies are conducted and published, especially in the United States, this will allow an ongoing and iterative assessment of model validity. Two sensitivity analyses were conducted; the first assessed the importance of specific model assumptions on OOS predictive validity, while the second assessed the robustness of our conclusions to these same model assumptions (Supplementary Information).
Limitations. Epidemics progress based on complex nonlinear and dynamic biological and social processes that are difficult to observe directly and at scale. Mechanistic models of epidemics, formulated either as ordinary differential equations or as individual-based simulation models, are a useful tool for conceptualizing, analyzing or forecasting the time course of epidemics. In the COVID-19 epidemic, effective policies and the responses to those policies have changed the conditions supporting transmission from one week to the next, with the effects of policies realized typically after a variable time lag. Each model approximates an epidemic, and whether used to understand, forecast or advise, there are limitations on the quality and availability of the data used to inform it and the simplifications chosen in model specification. It is unreasonable to expect any model to do everything well, so each model makes compromises to serve a purpose, while maintaining computational tractability.
One of the largest determinants of the quality of a model is the corresponding quality of the input data. Our model is anchored to daily COVID-19-related deaths, as opposed to daily COVID-19 case counts, due to the assumption that death counts are a less biased estimate of true COVID-19-related deaths than COVID-19 case counts are of the true number of SARS-CoV-2 infections. Numerous biases such as treatment-seeking behavior, testing protocols (such as only testing those who have traveled abroad) and differential access to care greatly influence the utility of case count data. Moreover, there is growing evidence that inapparent and asymptomatic individuals are infectious, as well as individuals who eventually become symptomatic and are infectious before the onset of any symptoms. As such, our primary input data for our model are counts of deaths; death data can likewise be fallible, however, and where available, we combine death data, case data and hospitalization data to estimate COVID-19 deaths.
Beyond the basic input data, a large number of other data sources with their own potential biases are incorporated into our model. Testing, mobility and mask use are all imperfectly measured and may or may not be representative of the practices of those that are susceptible and/or infectious. Moreover, any forecast of the patterns of these covariates is associated with a large number of assumptions (Supplementary Information), and as such, care must be taken in the interpretation of estimates farther into the future, as the uncertainty associated with the numerous submodels that go into these estimates increases in time. Moreover, although our time-invariant covariates are simpler to estimate, some of them may be more associated with disease outcome than transmission potential, and thus their impact on the model may be more muted.
For practical purposes, our transmission model has made a large number of simplifying assumptions. Key among these is the exclusion of movement between locations (for example, importation) and the absence of age structure and mixing within location (for example, we assume a well-mixed population). It is clear that there are large, super-spreader-like events that have occurred throughout the COVID-19 pandemic, and our current model is unable to fully capture these dynamics. Another important assumption to note is that of the relationship between pneumonia seasonality and SARS-CoV-2 seasonality. To date, across both the Northern and Southern Hemispheres, there is a strong association between COVID-19 cases and deaths and general seasonal patterns of pneumonia deaths (Supplementary Information). Our forecasts to the end of February 2021 are strongly influenced by the assumption that this relationship will hold throughout the year and that SARS-CoV-2 seasonality will be well approximated by pneumonia seasonality. While we assess this assumption to the extent possible (Supplementary Information), we have not yet experienced a full year of SARS-CoV-2 transmission, and as such cannot yet know if this assumption is valid. Additionally, our model attempts to account for some of the associated uncertainties in the process but does not fully capture all levels of uncertainty. Future iterations should track uncertainties that arise from more complex processes such as demographic stochasticity. There is also uncertainty (and unidentifiability) surrounding a number of the parameters of the transmission model. Here we have chosen to incorporate this lack of knowledge by drawing key transmission parameters from plausible distributions and then presenting the average result across these potential realities. As more information becomes available, we hope to tune these parameters to each location in turn.
Finally, the model presented herein is not the first model our team has developed to predict current and future transmission of SARS-CoV-2. As the outbreak has progressed, we have attempted to adapt our modeling framework to both the changing epidemiological landscape and the increase in data that could be useful to inform a model. Changes in the dynamics of the outbreak overwhelmed both the initial purpose and some key assumptions of our first model, requiring evolution in our approach. While the current SEIR formulation is a more flexible framework (and thus less likely to need complete reconfiguration as the outbreak progresses further), we fully expect the need to adapt our model to accommodate future shifts in patterns of SARS-CoV-2 transmission. Incorporating movement within and between locations is one example; resolving our model at finer spatial scales and accounting for differential exposure and treatment rates across sexes and races are other dimensions of transmission modeling that we currently do not account for but expect will be necessary additions in the coming months. As we have done before, we will continually adapt, update and improve our model based on need and predictive validity.
Reporting Summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
Extended Data Fig. 1 | Estimated daily COVID-19 death rate (per 100,000 population) by state for all five scenarios. The inset map displays the estimated daily deaths from COVID-19 per 100,000 population by state on 28 February 2021. The light yellow background separates the observed and predicted part of the time series, before and after 21 September 2020. The dashed vertical line identifies 3 November 2020. Numbers are the means and uncertainty interval (UI) for the plausible reference scenario on dates highlighted.

Reporting Summary
Corresponding author(s): Simon I Hay
Last updated by author(s): Oct 9, 2020
Nature Research wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency in reporting. For further information on Nature Research policies, see our Editorial Policies and the Editorial Policy Checklist.


Software and code
Policy information about availability of computer code.
Data collection. No primary data collection was carried out for this analysis. No software was used for primary data collection purposes.

Data analysis. All code used for these analyses was custom created for this study and is publicly available online (https://github.com/ihmeuw/covid-model-seiir-pipeline; https://github.com/ihmeuw/covid-model-deaths-spline). Analyses were carried out using R version 3.6.1, Python 3.8 and R-INLA v20.01.29.9000. All maps presented in this study were generated by the authors using RStudio (R version 3.6.3) and ArcGIS Desktop 10.6, and no permissions are required to publish them.


Life sciences study design
All studies must disclose on these points even when the disclosure is negative.

Sample size. Sample size was calculated as the number of unique data source-location pairs with observations of COVID-19 epidemiological data. This sample size is reported in the Supplementary Information, section 2; we use 10,984 location-days of COVID-19 epidemiological data.
Data exclusions. Reasons for data exclusion were pre-established and are described in the Supplementary Information, section 2. Principal epidemiological data streams were excluded and replaced with human-curated alternatives based on the pre-established verification process described in the Supplementary Information, section 2.
Replication. This is an observational study using many months of reported COVID-19 epidemiological data, combined with covariate data, and could be replicated using publicly available code and data.
Randomization. There was no experimental design or group allocation involved in this study.
Blinding. There was no experimental design or group allocation involved in this study.

Reporting for specific materials, systems and methods
We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response.