State-level needs for social distancing and contact tracing to contain COVID-19 in the United States

Starting in mid-May 2020, many US states began relaxing social distancing measures that were put in place to mitigate the spread of COVID-19. To evaluate the impact of relaxation of restrictions on COVID-19 dynamics and control, we developed a transmission dynamic model and calibrated it to US state-level COVID-19 cases and deaths. We used this model to evaluate the impact of social distancing, testing and contact tracing on the COVID-19 epidemic in each state. As of July 22, 2020, we found only three states were on track to curtail their epidemic curve. Thirty-nine states and the District of Columbia may have to double their testing and/or tracing rates and/or roll back reopening by 25%, while eight states require an even greater measure of combined testing, tracing, and distancing. Increased testing and contact tracing capacity is paramount for mitigating the recent large-scale increases in US cases and deaths.

tracing capacities 12. Mathematical modeling is a unique tool to help answer these important and timely questions. Models can contribute valuable insight for public health decision-makers by providing an evaluation of the effectiveness of ongoing control strategies along with predictions of the potential impact of alternative policy scenarios 13.
To address these needs, we developed and validated a data-driven transmission dynamic model to evaluate the impact of social distancing, state reopening, testing, and contact tracing on the state-level dynamics of COVID-19 infections and mortality in the US, shown schematically in Figure 1. Like many other COVID-19 transmission models [14][15][16][17], we used an extended SEIR (susceptible, exposed, infectious, removed) compartmental model. The model divides the population into several disease compartments and tracks movements of individuals between the compartments through different transition rates. The main model compartments are: S, susceptible; E, exposed; A, infectious and asymptomatic; I, infectious and symptomatic; R, recovered; and F, dead. In addition to disease progression stages, our model incorporates social distancing informed by several public sources of mobility data, case identification via testing, isolation of detected cases, and contact tracing. This is a mean-field epidemiological modeling approach that captures the average disease dynamics within a population 18,19. We used Bayesian inference methods to calibrate our model to state-level daily reported COVID-19 case and fatality data and to validate its predictions against them. Model parameters, prior distributions, and their sources are shown in Table 1. We used the calibrated model to evaluate the transmissibility of COVID-19 in each state from March 2020 to late July 2020, and to estimate the state-level impact of shelter-in-place and reopening on COVID-19 transmission. Finally, we evaluated the degree to which increasing testing efforts (rate of identification of infected cases) and/or contact tracing could curtail the spread of the disease and enable greater relaxation of social distancing restrictions while preventing a resurgence of infections and deaths. A detailed description of the model considerations, parameterization, and analysis is provided in Methods.
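As a rough illustration of how a compartmental model of this kind moves individuals between compartments, the following minimal sketch integrates a simplified S, E, A, I, R, F system with forward-Euler steps. It is not the calibrated model described here: it omits testing, isolation, contact tracing, and time-varying contact rates, and every name and numeric value (`simulate_seiarf`, `transmission_rate`, `symptomatic_death_frac`, and so on) is a hypothetical placeholder chosen for illustration.

```python
def simulate_seiarf(days, population=1_000_000.0, initial_exposed=10.0,
                    transmission_rate=0.4, latent_days=4.0, infectious_days=10.0,
                    frac_asymptomatic=0.4, symptomatic_death_frac=0.02, dt=0.1):
    """Forward-Euler sketch of a simplified S-E-A-I-R-F model.

    All parameter values are hypothetical placeholders, not fitted estimates.
    Returns one dict of compartment sizes per day (day 0 included).
    """
    s = population - initial_exposed
    e, a, i, r, f = initial_exposed, 0.0, 0.0, 0.0, 0.0
    sigma = 1.0 / latent_days      # rate of leaving the exposed compartment
    gamma = 1.0 / infectious_days  # rate of leaving either infectious compartment
    steps_per_day = round(1.0 / dt)
    history = []
    for _day in range(days + 1):
        history.append({"S": s, "E": e, "A": a, "I": i, "R": r, "F": f})
        for _step in range(steps_per_day):
            # Closed population of constant size: deaths stay in the denominator.
            new_infections = transmission_rate * s * (a + i) / population
            leave_e = sigma * e
            leave_a = gamma * a
            leave_i = gamma * i
            s -= new_infections * dt
            e += (new_infections - leave_e) * dt
            a += (frac_asymptomatic * leave_e - leave_a) * dt
            i += ((1.0 - frac_asymptomatic) * leave_e - leave_i) * dt
            r += (leave_a + (1.0 - symptomatic_death_frac) * leave_i) * dt
            f += symptomatic_death_frac * leave_i * dt
    return history


if __name__ == "__main__":
    trajectory = simulate_seiarf(days=90)
    print({name: round(value) for name, value in trajectory[-1].items()})
```

Because every outflow from one compartment is an inflow to another, the total across compartments stays equal to the population, which is a quick sanity check on any implementation of this structure.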

Estimations of effective reproduction number
The effective reproduction number, $R_{eff}$, is the average number of secondary infection cases generated by a single infectious individual during their infectious period 18. When $R_{eff} > 1$ the epidemic curve is increasing, and when $R_{eff} < 1$, the epidemic curve is decreasing 18. Using the posterior distribution of our model parameters we estimated the effective reproduction number from March 19th to late July, 2020 and identified the minimum level of transmission achieved in each state (Figure 2A). We found that for all except five states (Alabama, Arkansas, North Carolina, Wisconsin, and Utah), the interquartile range for the minimum $R_{eff}$ value was less than 1, and these values were mainly achieved during the state shelter-in-place (Figure 2A). Following states' relaxations of social distancing measures, disease transmission started to increase again. By July 22nd, 2020, 42 states and the District of Columbia had at least a 75% probability that $R_{eff} > 1$. Thus, the model predicts that as states are reopening, a majority of states are at risk of continued increases in the scale of the outbreak and require additional mitigation to contain the spread of the disease.
We conducted an analysis of variance to evaluate the contribution of each parameter to the variation in the $R_{eff}$ value (Supplementary Table 1). Across states, we found that the largest drivers of variation in $R_{eff}$ are the power parameter relating social distancing to hygiene-associated reduction in transmission, the degree of mitigation during shelter-in-place, the maximum relative increase in contact after shelter-in-place orders, and the fraction of contacts traced, which together contribute over 50% of variance (Extended Data Figure 1). This observation is consistent with mobility data alone being insufficient to account for the combined effect of multiple control measures, and suggests that the degree of adoption of non-mobility-related measures, such as enhanced hygiene practices and contact tracing, plays a large role in the extent to which a state may reduce disease transmission.
For each state, we defined the level of reopening/rebound in disease transmission (0% at minimum, 100% at full reopening) relative to its lowest transmission rate observed during shelter-in-place, and estimated the current level of reopening/rebound (Figure 2B). We found that 24 states had a 50% or more rebound in COVID-19 transmission by July 22nd, 2020, while no states had a 25% or less rebound in transmission (Figure 2B).

Impact of testing and contact tracing on easing of social distancing
Bringing and keeping the effective reproduction number, $R_{eff}$, below 1 is necessary to curtail the spread of an outbreak. We evaluated the probability of keeping $R_{eff} < 1$ for different levels of testing and contact tracing under the July 22nd, 2020 level of state reopening. We found that for most states bringing and keeping $R_{eff} < 1$ may not be possible without increased contact tracing efforts, as increasing testing and isolation alone would be insufficient or require extremely high coverage to curtail the epidemic curve with a 0.975 probability (Extended Data Figures 2 & 3, and Supplementary Table 2).
The challenges are even greater to ensure continued control of the epidemic with full reopening, which requires much larger increases in tracing and testing (Extended Data Figure 4, Supplementary Table 3).
To evaluate the impact of scaling up testing and contact tracing on the epidemic dynamics in each state, we assumed a linear "ramp-up" of testing and/or contact tracing from August 1st to 14th, 2020, after which both parameters remain constant. We then predicted the daily number of reported cases and deaths. We found that reported cases increase during the "ramp-up" period (Figure 3). We also found that in most states additional relaxation of restrictions without simultaneously increasing contact tracing may exacerbate disease dynamics and result in large-scale outbreaks (Supplementary Figure 10).
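The linear "ramp-up" can be pictured as a multiplier applied to a baseline testing or tracing rate. The sketch below assumes the multiplier equals 1 up to August 1st, 2020, rises linearly, and reaches its target on August 14th, 2020, after which it stays constant. The function name `ramp_multiplier` and the choice to express the ramp as a multiplier are illustrative assumptions, not taken from the study code.

```python
from datetime import date


def ramp_multiplier(day, target, start=date(2020, 8, 1), end=date(2020, 8, 14)):
    """Hypothetical linear ramp: 1.0 up to `start`, `target` from `end` onwards."""
    if day <= start:
        return 1.0
    if day >= end:
        return float(target)
    # Fraction of the ramp-up window that has elapsed, between 0 and 1.
    elapsed = (day - start).days / (end - start).days
    return 1.0 + (target - 1.0) * elapsed


if __name__ == "__main__":
    for d in (date(2020, 7, 22), date(2020, 8, 7), date(2020, 8, 14), date(2020, 9, 30)):
        print(d.isoformat(), ramp_multiplier(d, target=2))
```

For example, a doubling scenario corresponds to `target=2`, and a scenario with no change corresponds to `target=1`, for which the multiplier is 1 on every date.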
We next evaluated the maximal degree of rebound in transmission (i.e., level of reopening) permitted while keeping $R_{eff} < 1$ under different testing and contact tracing scenarios (Figure 4). We found that under the current testing and contact tracing rates, 27 states cannot keep their $R_{eff} < 1$ (at 75% confidence) even with only 25% reopening/rebound in transmission (Figure 4A). By doubling the current testing rate, eight states could keep their $R_{eff} < 1$ even with a 50% level of reopening (Figure 4B). By doubling contact tracing, nine states could remove all mobility restrictions while keeping $R_{eff} < 1$ (Figure 4C). By doubling both testing rate and contact tracing, ten states could remove all mobility restrictions while keeping $R_{eff} < 1$ (Figure 4D).
We categorized states by the additional amount of mitigation effort needed to keep $R(t) < 1$ with at least 75% confidence. At the most demanding level, a 50% reversal of current reopening in addition to increased testing and/or contact tracing is needed to reduce and keep $R(t) < 1$ ("Very High" category).

Discussion
There is a delicate and continuous balance to strike between the use of social distancing measures to mitigate the spread of an emerging and deadly disease such as COVID-19 and the need for re/opening various sectors of activities for the social, economic, mental, and physical well-being of a community. To address this issue, it is imperative to design measurable, data-driven, and flexible milestones for identifying when to make specific transitions with regard to easing or retightening specific social distancing measures. We developed a data-driven SARS-CoV-2 transmission dynamic model not only to make short-term predictions on COVID-19 incidence and mortality in the US, but more importantly to evaluate the impact that relaxing social distancing measures and increasing testing and contact tracing would have on the epidemic in each state.
We showed that in most states, control strategies implemented during their "shelter-in-place" period were sufficient to contain the outbreak, defined as reducing and ultimately maintaining the effective reproduction number below 1 ($R_{eff} < 1$). However, for the majority of states, our modelling suggests that "reopening" has proceeded too rapidly and/or without adequate testing and contact tracing to prevent a resurgence of the epidemic. Our model suggests that for some states, a substantial fraction of the population may have already been infected such that even without additional intervention, $R_{eff}(t)$ is declining towards (or below) 1 even as $R(t) > 1$. The most extreme example is Arizona, where $R_{eff}(t)$ is estimated to have declined below the previous minimum value achieved during shelter-in-place.
However, accurate estimation of the susceptible fraction of the population is difficult due to the uncertain degree of undercounting in the reported case data. Thus, we used $R(t)$ to categorize the mitigation needs in each state and evaluate the level of control effort needed to curtail the spread of the epidemic in each state.

Moreover, even in states with currently decreasing incidence and mortality, such as Maine and New Jersey, additional relaxation of restrictions is likely to "bend the epidemic curve upwards" in the absence of increased testing or tracing. However, our model predicts that a combination of increased testing, increased contact tracing, and/or scaling back reopening will be sufficient for curtailing the spread of COVID-19 in most states. Specifically, doubling of current testing and contact tracing rates would enable the majority of states to either maintain or increase the easing of social distancing restrictions in a "safe" manner in the short term. Scaling back the current level of reopening by 25% in combination with doubling of testing and tracing will be sufficient to control the epidemic in the long term in all but eight states. It should also be noted that increased testing and contact tracing will lead to a short-term increase in reported cases because a larger fraction of the infected population is being observed, and that several weeks may pass before these rates begin to show a decline. Therefore, it is imperative that policymakers and the public recognize that such a surge is actually a sign that testing and tracing efforts are succeeding, and exercise the patience to wait several weeks before these successes are reflected as declining rates of reported cases.
Other modeling studies have used SEIR-type compartmental models to assess the impact of social distancing, testing and contact tracing to curb the epidemic curve in Italy and the United Kingdom [14][15][16][17].
Consistent with our results, these studies have shown that rapid reopening of the economy without adequate testing and contact tracing could lead to a resurgence of the epidemic [14][15][16][17]. Specifically, they show that high testing and contact tracing rates may make it possible to maintain or increase the easing of social distancing restrictions without an increased rate of COVID-19 transmission 14.
Our study has several limitations due to modelling assumptions and the quality of available data. Like most COVID-19 transmission models 14-17, we used a compartmental SEIR-type model to model the spread of SARS-CoV-2 because of its simplicity and ability to capture population average dynamics. This modeling approach does not account for heterogeneity in individual-level behavior, overdispersion due to "super-spreaders," social contact networks, and inherent stochasticity, which may play an important role in SARS-CoV-2 transmission dynamics. These factors can be modeled through the use of individual-based models [20][21][22]. However, individual-based modeling is a more complex modeling framework and may require a substantial amount of individual-level data for model parameterization, calibration, and validation.
To characterize the limitations of using cell phone-based mobility data to infer (prior distributions for) contact rates, we compared the state-to-state variation in mobility data with the corresponding posterior distributions for each mobility-related parameter (see Supplementary Figure 9). Three parameters of particular interest are the minimum relative contact rate, the duration of the shelter-in-place phase, and the maximum amount of reopening. For the minimum relative contact rate, none of the $r^2$ values were consistently less than 0.2, although the slope and intercept of the regression line for the Unacast Visitation metric were within 15% of 1 and 0, respectively. Similarly, for the duration of the shelter-in-place phase, the highest $r^2$ value was 0.37, for OpenTable Bookings data, which also had a relatively accurate regression line (again within 15%). For the maximum amount of reopening, the highest $r^2$ values were for the Google retail and recreation (0.49) and Unacast Visitation (0.52) metrics, but the Google data were much more accurate, with a slope close to 1 and intercept close to 0. Overall, these results suggest that cell phone-based mobility data vary substantially in their accuracy (slope and intercept near 1 and 0, respectively) and overall have low precision (no $r^2$ more than about 0.5), and support our use of the range across multiple sources in developing prior distributions, rather than using such data directly for modeling contact rates.
The initiation of social distancing measures, such as stay-at-home orders in the US, for mitigating the spread of COVID-19 has occurred concurrently with increased promotion and application of other NPIs, such as hygiene practices (e.g. hand hygiene, surface cleaning, cough etiquette, and wearing of face masks). These hygiene practices coupled with the avoidance of physical contact whenever possible (keeping six feet apart) could impact the spread of COVID-19 by reducing both the risk of exposure and the risk of transmission of SARS-CoV-2 from infected patients 23,24. Though our model explicitly accounts for the differential contribution of social distancing (mobility reduction) versus hygiene practices and physical distancing to reducing COVID-19 transmission, we assume that the impact of hygiene practices and physical distancing was a function of social distancing (mobility reduction). While cell phone mobility data may continue to be informative as to contact rates, at least in aggregate, the impact of enhanced hygiene practices is more difficult to measure independently. As several states have eased their social distancing requirements, especially their stay-at-home orders, compliance with hygiene practices will become even more important for reducing individuals' risk of getting or transmitting the pathogen. However, keeping a high population-level adherence to these measures is required to mitigate the spread of the COVID-19 epidemic 25. As states are reopening various aspects of their economy, data on compliance with enhanced hygiene practices and physical distancing are needed to improve the estimation of these measures' population-level impact on reducing disease transmission.

Model formulation
We modified the standard SEIR model to address testing and contact tracing, as well as asymptomatic individuals. A fraction of those exposed (E) enter the asymptomatic class A (divided into $A_U$ for untested and $A_C$ for contact traced) instead of the infected class I, which in our model formulation also includes infectious pre-symptomatic individuals. With respect to testing, separate compartments were added for untested, "freely roaming" infected individuals ($I_U$), tested/isolated cases ($I_T$), and fatalities ($F_T$).
Upon recovery, untested infected individuals ($I_U$) and all asymptomatic individuals move to the untested recovered compartment ($R_U$), and tested infected individuals move to the tested recovered compartment ($R_T$). In balancing considerations of model fidelity and parameter identifiability, we made the reasonably conservative assumptions that all tested cases are effectively isolated (through self-quarantine or hospitalization) and thus unavailable for transmission, and that all COVID-related deaths are identified/tested.
With respect to contact tracing, the additional compartment $S_C$ represents unexposed contacts, who undergo a period of isolation during which they are not susceptible before returning to S, while $E_C$, $A_C$, and $I_C$ represent contacts who were exposed. Again, the reasonably conservative assumption was made that all exposed contacts undergo testing, with an accelerated testing rate compared to the general population. We assume a closed population of constant size for each state.
The ordinary differential equations governing our model are parameterized by the contact rate between individuals, the transmission probability per infected contact, the fraction of contacts identified through contact tracing, the duration of self-isolation after contact tracing, the latent period, the fraction of exposed who are asymptomatic, the testing rate, the fatality rate, the recovery rate, and the testing rate and recovery rate of contact-traced individuals. The testing rates for the general population and for traced contacts, the fatality rate, and the recovery rate of traced contacts are each composites of several underlying parameters.
The testing rate is defined in terms of the current testing coverage (fraction of infected individuals tested), the test sensitivity (true positive rate), and the rate of testing for those tested, whose inverse is the typical time-to-test. A time-dependence term models the "ramp-up" of testing using a logistic function, centered on the time at which 50% of the current testing rate is achieved. Similarly, for testing of traced contacts, the same definition is used with the assumption that all identified contacts are tested (a coverage of 1) and at a faster assumed testing rate. Because all contacts are assumed to be tested, the rate at which they enter the "recovered" compartment is simply the rate of false negative test results.
The fatality rate δ is adjusted to maintain consistency with the assumption that all COVID-19 deaths are identified, assuming a constant infection fatality rate (IFR). Specifically, we first calculated the fraction of infected that are tested and positive. The case fatality rate (CFR) is then the IFR divided by this fraction. Because the CFR also equals δ divided by the sum of δ and the recovery rate, this relationship determines δ.
The model is "seeded" with initial cases on February 29th, 2020. Because in the early stages of the outbreak there may be multiple "imported" cases, we only fit to data from March 19th, 2020 onwards, one week after the U.S. travel ban was put in place 31.
Our model is fit to daily case and death data (cumulative data are not used for fitting because of autocorrelation). To adequately fit the case and mortality data, we accounted for two lag times.

Incorporating social distancing, enhanced hygiene practices, and reopening
The impact of social distancing, hygiene practices, and reopening was modeled through time dependence in the contact rate and the transmission probability. We selected the functional form for the time-dependent contact rate because it was found to be able to represent a wide variety of social distancing data, including cell phone mobility data from Unacast 33 and Google 34, as well as restaurant booking data from OpenTable 35. We used these different mobility sources to derive state-specific prior distributions because different social distancing datasets had different values for the parameters of this functional form (Figure S1).
With respect to the reduction in transmission probability, we assumed that during the "shelter-in-place" phase, hygiene-based mitigation paralleled the decline in contact rate with an effectiveness power parameter, and that this mitigation continued through reopening.
Finally, we define an overall "reopening" parameter that measures the "rebound" in disease transmission (the product of the contact rate and the transmission probability) relative to its minimum, defined to be 0 during shelter-in-place (i.e., when transmission is at its minimum) and 1 when all restrictions are removed (i.e., when transmission returns to its initial, unrestricted value). Our model is illustrated in Figure 1, with parameters and prior distributions listed in Table 1.
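One simple way to realize a reopening measure with these two anchor points is a linear rescaling between the shelter-in-place minimum and the unrestricted value. The sketch below assumes exactly that; the linear form and the names `reopening_level`, `minimum`, and `unrestricted` are illustrative assumptions rather than the expression used in the study.

```python
def reopening_level(current, minimum, unrestricted):
    """Hypothetical linear rebound measure.

    Returns 0.0 when transmission is at its shelter-in-place minimum and
    1.0 when it is back at the unrestricted value.
    """
    if unrestricted <= minimum:
        raise ValueError("unrestricted transmission must exceed the minimum")
    return (current - minimum) / (unrestricted - minimum)


if __name__ == "__main__":
    # Illustrative numbers: transmission fell to 40% of its unrestricted value
    # and has since rebounded to 70% of it.
    print(f"{reopening_level(current=0.7, minimum=0.4, unrestricted=1.0):.0%}")
```

Under this convention a state whose transmission has recovered half of the reduction achieved during shelter-in-place has a reopening level of 50%.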

Scenario evaluation
We then conducted scenario-based prospective predictions using our model's parameters as estimated through July 22nd, 2020. We asked the following questions: (a) Assuming current levels of reopening, what increases in general testing and/or contact tracing would be necessary to bring the reproduction number below 1? We then fixed the testing and contact tracing scaling factors at 1 or 2, and solved for the level of reopening such that the reproduction number is below 1. Finally, for (c), we additionally evaluated changes in reopening of +25% (+50%) or -25% (-50%), for a total of 20 scenarios (4 different levels of testing and tracing, and 5 different levels of reopening). We then ran the SEIR model forward in time until September 30th, 2020.
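The count of 20 scenarios follows from crossing the four testing and tracing combinations (each scaling factor at 1 or 2) with the five reopening changes. A minimal sketch of that enumeration is shown below; the function and dictionary key names are hypothetical.

```python
from itertools import product


def build_scenarios():
    """Enumerate testing/tracing multipliers crossed with reopening changes."""
    testing_tracing = [(1, 1), (2, 1), (1, 2), (2, 2)]  # (testing x, tracing x)
    reopening_changes = [-0.50, -0.25, 0.0, 0.25, 0.50]  # change in reopening level
    return [
        {"testing_x": testing, "tracing_x": tracing, "reopening_change": change}
        for (testing, tracing), change in product(testing_tracing, reopening_changes)
    ]


if __name__ == "__main__":
    scenarios = build_scenarios()
    print(len(scenarios))
    print(scenarios[0])
```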
For all three intervention parameters (testing, contact tracing, and reopening), we assumed a "ramp-up" period of 2 weeks from August 1st to 14th, 2020.
To summarize the relative need for mitigation in each state, we categorized states based on which scenarios resulted in the IQR of $R(t)$ being < 1 on August 15th, 2020. The categories were defined as follows:
• Very Low: Can reopen further by >25% while keeping $R(t)$ < 1;
• Low: Can reopen further by <25% with up to 2X increase in testing while keeping $R(t)$ < 1;
• Moderate: Requires 2X contact tracing or reversal of reopening by 25% to bring and keep $R(t)$ < 1;
• High: Requires multiple interventions (2X testing, 2X contact tracing, reversal of reopening by 25%) to bring and keep $R(t)$ < 1;
• Very High: Combining 2X testing, 2X contact tracing, and reversal of reopening by 50% is needed to bring and keep $R(t)$ < 1.
We use $R(t)$ instead of $R_{eff}(t)$ to minimize the impact of heterogeneity and uncertainty in the value of $S(t)/N$ on our results. Thus, requiring $R(t)$ < 1 provides greater assurance of state-wide control of the epidemic.
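The criterion of the interquartile range lying below 1 can be checked directly on posterior samples of the reproduction number, since it amounts to the upper quartile being below 1. The sketch below uses Python's standard `statistics.quantiles`; the function names and the sample values are hypothetical and only illustrate the check.

```python
import statistics


def prob_below_one(samples):
    """Fraction of posterior samples of the reproduction number that are below 1."""
    return sum(1 for value in samples if value < 1.0) / len(samples)


def iqr_below_one(samples):
    """True when the upper quartile, and hence the whole IQR, is below 1."""
    _q1, _median, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    return q3 < 1.0


if __name__ == "__main__":
    # Made-up samples for illustration, not model output.
    controlled = [0.70, 0.80, 0.90, 0.95, 1.20]
    uncontrolled = [0.90, 1.10, 1.20, 1.30]
    print(prob_below_one(controlled), iqr_below_one(controlled))
    print(prob_below_one(uncontrolled), iqr_below_one(uncontrolled))
```

With many posterior samples, the upper quartile falling below 1 corresponds to roughly a 75% posterior probability that the reproduction number is below 1.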

Software and code:
Posterior distributions were sampled using Markov chain Monte Carlo simulation performed using
• Case and death data were sourced from The COVID Tracking Project

Time (t) is measured from t=1, which corresponds to 2020-01-01. ¶ Assumed, non-informative prior wide enough to have adequate validation coverage. ϯ Standard contact tracing guidance is to self-isolate for 2 weeks. Ƣ For calibration to 6/20/20, state-specific priors were derived by fitting to different social distancing data sets, with each parameter's mean, standard deviation, and range used to define a normal distribution prior. * See Methods for relationship between IFR and δ.

Figures

Figure 1
SEIR model structure, parameters, data sources, and fitting/validation methods. We used mobility data to constrain the time-dependence of the contact rate. We fitted the model to daily reported cases and confirmed deaths from March 19th to April 30th and validated its projections against data from May 1st to June 20th. On the model projections, the black solid line is the median, the pink band is the 95% credible interval (CrI) and the orange band is the interquartile range (IQR). We show model fitting and validation for four states: New York (NY), Ohio (OH), Texas (TX), and Washington (WA).

Figure 2
Estimated effective reproduction number and the level of reopening/rebound in transmission as of July 22nd, 2020 for all states. (A) shows estimated $R_{eff}$ (median, IQR, and 95% CrI) across states. The figure shows the value of $R_{eff}$ on July 22nd, 2020, as well as the "minimum" value of $R_{eff}$ between March 19th, 2020 and July 22nd, 2020, in lighter shades of each color. It also includes the date of the minimum $R_{eff}$. (B) shows the level of reopening/rebound in disease transmission in each state relative to its minimum value during state shelter-in-place (median, IQR, and 95% CrI).

Figure 4
Reopening/rebound in transmission permitted (0% = minimum shelter-in-place value, 100% = return to no restrictions) to keep $R_{eff}$ < 1 (A) if testing and contact tracing rates are unchanged, (B) if the testing rate is doubled, (C) if contact tracing is doubled, or (D) if both testing and contact tracing are doubled. The level of reopening/rebound in transmission on July 22nd, 2020 is shown by the circle. All boxplots show median, IQR, and 95% CrI.