Robust estimation of SARS-CoV-2 epidemic in US counties

The COVID-19 outbreak is asynchronous in US counties. Mitigating the COVID-19 transmission requires not only state and federal level orders of protective measures such as social distancing and testing, but also public awareness of time-dependent risk and reactions at county and community levels. We propose a robust approach to estimate the heterogeneous progression of SARS-CoV-2 at all US counties having no less than 2 COVID-19 associated deaths, and we use the daily probability of contracting (PoC) SARS-CoV-2 for a susceptible individual to quantify the risk of SARS-CoV-2 transmission in a community. We found that shortening the infectious period of SARS-CoV-2 by 5% can reduce around 39% (or 78 K, 95% CI: [66 K, 89 K]) of the COVID-19 associated deaths in the US as of 20 September 2020. Our findings also indicate that the reduction of infections and deaths by a shortened infectious period is more pronounced for areas with an effective reproduction number close to 1, suggesting that testing should be used along with other mitigation measures, such as social distancing and facial mask-wearing, to reduce the transmission rate.
Our deliverable includes a dynamic county-level map for local officials to determine optimal policy responses and for the public to better understand the risk of contracting SARS-CoV-2 on each day.


Results
We first verify our model performance by forecasting at the county level. The 7-day and 21-day death projections for 2277 US counties using data up to 20 September 2020, for instance, are close to the held-out test death toll in these counties, as shown in parts (b) and (c) of Fig. 1. The Pearson correlation coefficient ($\rho$) is larger than 0.999 for both the 7-day and the 21-day forecasts. We also calculate the weighted average of the Pearson correlation coefficients for counties ($\rho_{\mathrm{county}}$), which treats each county as a different population and uses the population size as the weight. The 21-day forecast of each considered county in Florida and California using observations up to 20 September 2020 is provided in Figs. 2 and 3, respectively. The death toll forecast based on our model is accurate for most US counties, and around 95% of the held-out test data is covered by the nominal 95% predictive intervals (Supplementary Table S1 in the Supplementary Information), indicating that the uncertainty assessment is accurate. To further test the predictive performance of our model, we use data up to 1 December 2020 to make 21-day and 90-day predictions of deaths in the 10 largest counties in Florida and California. The forecast results are shown in Figs. 7 and 8, respectively. While this is a challenging scenario, as confirmed cases and deaths increased dramatically across the US during the winter, we found that our 21-day predictions are reasonably accurate for all 20 counties. Thus, our models can be used reliably for the short-term projection of COVID-19 related deaths at the county level during different periods of the epidemic. Furthermore, an accurate 90-day forecast for US counties before the winter may be an almost impossible task, and indeed we underestimate death counts for a few counties due to a rapid increase in death counts during the winter.
On the other hand, our model that fuses test data and death toll correctly projects the rapid increase in death counts for most counties during the winter, even if death counts do not increase dramatically during the training period.
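The two correlation summaries described above can be illustrated with a minimal sketch. It assumes that $\rho$ is computed on all county-day pairs pooled together and that $\rho_{\mathrm{county}}$ is the population-weighted mean of the per-county coefficients; the function names and the toy numbers are hypothetical and are not taken from the study.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

def weighted_county_correlation(forecast, observed, population):
    """Population-weighted average of per-county Pearson correlations.

    forecast, observed: arrays of shape (n_counties, n_days)
    population: array of shape (n_counties,), used as weights
    """
    forecast = np.asarray(forecast, dtype=float)
    observed = np.asarray(observed, dtype=float)
    weights = np.asarray(population, dtype=float)
    rho = np.array([pearson(f, o) for f, o in zip(forecast, observed)])
    return float(np.sum(weights * rho) / np.sum(weights))

# Toy, made-up cumulative death tolls for two counties over four held-out days.
observed = np.array([[10, 12, 15, 19], [3, 3, 4, 6]])
forecast = np.array([[10, 13, 15, 18], [3, 4, 4, 5]])
population = np.array([2.0e6, 1.0e5])

rho_pooled = pearson(forecast.ravel(), observed.ravel())   # all county-days pooled
rho_county = weighted_county_correlation(forecast, observed, population)
```

A county whose held-out death toll is constant has an undefined per-county coefficient, so a practical version would need to decide how to handle such counties.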
Based on the robust estimation of transmission rates, we derived county-level estimates of the daily PoC SARS-CoV-2. We classify the daily PoC SARS-CoV-2 in a community into five levels listed in Table 2. On 20 September 2020, out of 2277 US counties, only 60 counties were at the controllable level and 311 counties were at the moderate level, whereas 1906 counties were at either the alarming, strongly alarming, or hazardous level. The daily PoC SARS-CoV-2 measures the average probability of contracting SARS-CoV-2 for a susceptible individual in a community, and the risk varies from individual to individual. Nonetheless, the PoC SARS-CoV-2 is an interpretable measure for public understanding of the average risk of contracting SARS-CoV-2 in a community on a given day.

Figure 1 (caption, continued). Each dot is the cumulative death toll for one county on one held-out day. Counties from the same state are graphed using the same color. The Pearson correlation coefficient ($\rho$) of the nation and the weighted average of the Pearson correlation coefficients for counties ($\rho_{\mathrm{county}}$) are recorded. (c) 21-day death toll forecasts in the 10 counties with the largest population in Florida, where the red line represents the observed death toll and the blue line the forecast. The forecast starts from 21 September 2020, marked by the vertical black dashed line. The grey shaded area is the 95% confidence interval of the forecast. Numbers in parentheses after the county name are the population in millions. Figs. 2 and 3 show the 21-day death toll forecasts for all counties in Florida and California.

Table 1. Policy summary.

Background
The transmission of SARS-CoV-2 is heterogeneous and asynchronous in US counties. It is thus important to assess the risk before lifting or replacing any mitigation measure in the community. We have developed a novel approach to integrate test data and death toll to estimate the probability of contracting COVID-19, as well as the time-dependent transmission rate and number of active infectious individuals at the county level in the US.

Main findings and limitations
National-level orders of protective measures reduced the transmission rate and the number of active infectious individuals for most US counties in April, whereas the risk of contracting SARS-CoV-2 rebounded between late June and early July, as the protective measures were relaxed. We found that when the infectious period of SARS-CoV-2 is shortened by 5% and 10%, the number of deaths can be reduced from 199 K to 120 K (95% CI [109 K, 132 K]) and 80 K (95% CI [72 K, 89 K]) as of 20 September 2020, respectively, when other protective measures are kept the same. The reduction of the infectious period can be achieved by extra testing in addition to ongoing protective measures. Our model relies on the existing knowledge of COVID-19 and on model assumptions. Other information, such as demographic profiles, mobility, and serology test data, can be used to calibrate the model parameters and assumptions at the community level.

Policy implications
Our model indicates that extra testing, along with the current NPIs, can significantly reduce the number of deaths associated with COVID-19. The estimated probability of contracting COVID-19 can be used as an interpretable risk factor to guide community policy responses.

Figure 2. The 21-day forecast in 67 Florida counties with a death toll of no less than 2 as of 20 September 2020. The training period is from 21 March 2020 to 20 September 2020, whereas the forecast starts from 21 September 2020. The red curves are the cumulative observed death toll from 21 September 2020 to 11 October 2020 and the blue lines indicate the forecast for the same period. The shaded area represents the 95% predictive intervals of the forecast for each analyzed county in Florida.

We graph the estimated PoC SARS-CoV-2 of an individual in US counties on 20 April 2020 and 20 September 2020 in Fig. 9. On 20 April 2020, the PoC SARS-CoV-2 is large in northeastern regions and some southern areas such as Arizona, New Mexico, and New Orleans. On 20 September 2020, the PoC SARS-CoV-2 is large in many inland states, for instance, Montana, North Dakota, Mississippi, and Alabama. Although the PoC SARS-CoV-2 on 20 September in northeastern regions is substantially lower than that on 20 April, the PoC SARS-CoV-2 for an individual is large in most other states on 20 September, suggesting that the relaxation of protective measures can lead to more of the population contracting COVID-19, and consequently more deaths at a rate no slower than that in late April.
Officials can use the daily PoC SARS-CoV-2 to determine whether the mitigation policies can be lifted or replaced by other measures for different regions. The probability of contracting COVID-19 in many counties in Texas on 20 September 2020, for example, is larger than that in Washington [parts (a) and (d) in Fig. 4], indicating that Texas should undertake more protective measures to reduce the risk. The nationwide lockdown order and social distancing in spring effectively reduced the PoC SARS-CoV-2 in 4 out of 5 counties in Washington, while the PoC SARS-CoV-2 of all counties increased in late June and early July, as some of the nonpharmaceutical interventions (NPIs) were lifted [part (b) in Fig. 4]. Part (c) shows that the model fits the death toll. With only two parameters estimated numerically for each county, the fit is reasonably good for these counties over a wide range of dates. In comparison, though the outbreak in the 5 Texas counties started in early summer, the PoC SARS-CoV-2 in these Texas counties is much higher than that in the Washington counties on 20 September [part (e) in Fig. 4]. Our model also fits the death toll of the counties in Texas relatively well [part (f) in Fig. 4]. The county-level estimation and forecast are updated regularly on the COVID-19 US Dashboard: https://covid19-study.pstat.ucsb.edu/.
The effectiveness of protective measures in reducing the transmission rate has been studied 7,8,11,12,14,19, whereas the efficacy of these measures depends on the reactions of the public, which are likely to vary from region to region. Another simultaneous effort to mitigate the spread of the COVID-19 outbreak is testing and contact tracing, which reduces the infectious period and, consequently, the number of active infectious individuals. For Washington and Texas, we simulate the model output with the infectious period reduced by 5% (equivalently, to 4.75 days in total), while the transmission rate ($\beta_t$ in the SIRDC model) is held the same. We found that the PoC SARS-CoV-2 is reduced by a factor of 5 for 12 out of 28 considered counties in Washington and 6 out of 209 considered counties in Texas, as shown in Fig. 10. Furthermore, when we reduce the infectious period by 10% (equivalently, to 4.5 days in total), while the transmission rate ($\beta_t$ in the SIRDC model) is held the same, the PoC SARS-CoV-2 is reduced by a factor of 5 for 26 out of 28 counties in Washington and 146 out of 209 counties in Texas, as shown in Fig. 11.
In Fig. 12, we graph the estimated effective reproduction number, the number of active infectious individuals, and the cumulative death toll in the US, along with the simulated values when the average infectious period is reduced from 5 days to 4.75 days and 4.5 days. First, we found that mitigation measures in March effectively reduced the effective reproduction number to below 1, whereas the value rebounded in summer after some of these measures were relaxed in different regions. Consequently, the US experienced two waves of the outbreak in terms of the number of active infectious individuals [part (b) in Fig. 12]. The high test positive rate at the beginning of the epidemic (Fig. 13) indicates that a substantial number of active infectious individuals were not diagnosed in April due to the lack of diagnostic tests. According to our estimates, the peak of the first wave in April is larger than that of the second wave in July in terms of the number of active infectious individuals, whereas the peak of the daily observed confirmed cases in April is smaller than that of the second wave in July (Fig. 13).
Second, the simulated results suggest that shortening the infectious period of SARS-CoV-2 by 5% and 10% can reduce the total deaths from 199 K to 120 K (95% CI [109 K, 132 K]) and 80 K (95% CI [72 K, 89 K]), respectively, as of 20 September 2020, when other protective measures are held the same [part (c) in Fig. 12]. Note that since we hold the transmission rate parameter ($\beta_t$) the same (a scenario in which the public adheres to the protective measures as in reality), the effective reproduction number barely changes [part (a) in Fig. 12]. However, the slightly shortened infectious period of SARS-CoV-2 can reduce the death toll substantially [part (c) in Fig. 12], as the number of active infectious individuals decreases [part (b) in Fig. 12].
We found that a shortened infectious period substantially reduces the number of active infectious individuals and fatalities in the second wave. However, the changes are smaller in the first wave, since the effective reproduction number in the second wave is smaller than that in the first wave (Fig. 12). The county-level estimation also validates this point (Figs. 10 and 11). This finding indicates that efforts to shorten the infectious period of SARS-CoV-2 should not replace other protective measures that reduce the transmission rate, such as social distancing and facial mask-wearing.
Diagnostic tests can be used to shorten the infectious period of an active infectious individual. Drastically reducing the infectious period may not be possible without contact tracing, which is challenging when there is a large number of active infectious cases. Reducing the infectious period by around 5%, in comparison, may be achieved by periodic diagnostic testing every 20 days for each susceptible individual. More frequent testing or contact tracing may be needed to achieve this goal, as the infection is most likely to happen between days 2 and 6 after exposure due to the high viral load of SARS-CoV-2 20. Another efficient way is to test susceptible individuals with a high risk of contracting or spreading SARS-CoV-2, such as individuals who have more daily contacts or who have contact with vulnerable populations, e.g., workers in senior living facilities. Our estimation of the PoC SARS-CoV-2 can be used as a response variable in regression models with covariates including demographic information and mobility to elicit the personalized risk of contracting SARS-CoV-2 for susceptible individuals.
Finally, efforts on reducing the length of the infectious period should not replace other protective measures for reducing transmission rates of SARS-CoV-2, as the number of active infectious individuals and death toll can be effectively reduced only if the effective reproduction number is not substantially larger than 1.

Discussion
Our study has several limitations. First, our findings are based on the available knowledge and model assumptions, as with all other studies. One critical parameter is the death rate, assumed to be 0.66% on average 21, whereas this parameter can vary across regions due to the demographic profile of the population and available medical resources. Studies of the prevalence of SARS-CoV-2 antibodies based on serology tests 17 can be used to determine the size of the population who have contracted SARS-CoV-2, and thus provide estimates of the death rate, as the death toll is observed. Second, we assume that the infected population develops immunity that lasts for a few months after recovery, an assumption commonly used in other models. The exact duration of immunity post-infection, however, remains unverified scientifically. Third, we assume that the number of susceptible individuals and, consequently, the number of individuals who have contracted SARS-CoV-2 can be written as a function of the number of observed confirmed cases and test positive rates, calibrated based on the death toll. More information, such as the proportion of the population that adheres to the mitigation measures, mobility, and demographic profile, can be used to improve the estimation of susceptible individuals in a region.
Our results can be used to mitigate the ongoing pandemic of SARS-CoV-2 and other infectious disease outbreaks in the future. The estimated daily PoC SARS-CoV-2 at the county level, for example, is an interpretable measure to understand the risk of contracting COVID-19 on a daily basis and a surveillance marker to determine appropriate policy responses. In addition, our method can be extended when an effective vaccine becomes available 10. Finally, further studies of this measure relative to different mobility, demographic information, and socioeconomic status can provide more precise guidance for local officials to protect vulnerable populations from contracting SARS-CoV-2 when an effective vaccine is not available.

Methods
We introduce our methods in this section. The main symbols used in this section and their definitions are provided in Table 3.
SIRDC compartmental models. The SIRDC model for the jth county in the ith state in the US is described below:

$$\begin{aligned} \dot{S}_{i,j}(t) &= -\frac{\beta_{i,j}(t) S_{i,j}(t) I_{i,j}(t)}{N_{i,j}}, \\ \dot{I}_{i,j}(t) &= \frac{\beta_{i,j}(t) S_{i,j}(t) I_{i,j}(t)}{N_{i,j}} - \gamma I_{i,j}(t), \\ \dot{R}_{i,j}(t) &= \gamma I_{i,j}(t) - \theta R_{i,j}(t), \\ \dot{D}_{i,j}(t) &= \delta \theta R_{i,j}(t), \\ \dot{C}_{i,j}(t) &= (1-\delta) \theta R_{i,j}(t), \end{aligned} \tag{1}$$

where $S_{i,j}(t)$, $I_{i,j}(t)$, $R_{i,j}(t)$, $D_{i,j}(t)$ and $C_{i,j}(t)$ denote the number of individuals in the 5 compartments (susceptible, infectious, resolving, deceased and recovered) on day t, respectively, and $N_{i,j}$ denotes the number of individuals in county j from state i, for $i = 1, 2, \ldots, k$ and $j = 1, 2, \ldots, n_i$, with $n_i$ being the number of counties of the ith state considered in the analysis and $t = 1, 2, \ldots, T_{i,j}$. The time-dependent transmission rate parameter is denoted by $\beta_{i,j}(t)$, and the inverse of the average number of days an infectious individual can transmit COVID-19 is denoted by $\gamma$. The inverse of the average number of days for a case to get resolved (i.e. deceased or recovered) is denoted by $\theta$, and the proportion of deceased cases (i.e. death rate) is denoted by $\delta$. The parameters $(\gamma, \theta, \delta)$ are invariant over time and held fixed in this study. Following ref. 19, we assume the infectious period to be 5 days on average, and a case is expected to resolve after 10 days. The average death rate is assumed to be 0.66% 21. Additional verification of these assumptions and sensitivity analysis of these parameters are provided in the Supplementary Information.
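As an illustration of how the five compartments interact, the following is a minimal simulation sketch. It assumes the standard SIRDC form in which new infections occur at rate $\beta S I / N$, infectious individuals move to the resolving compartment at rate $\gamma$, and resolving cases leave at rate $\theta$, a fraction $\delta$ of them as deaths. The fixed values $\gamma = 1/5$, $\theta = 1/10$ and $\delta = 0.0066$ come from the text; the function names, the constant transmission rate, the initial state and the use of a fourth-order Runge-Kutta step are illustrative assumptions only.

```python
import numpy as np

GAMMA = 1 / 5     # inverse of the 5-day average infectious period
THETA = 1 / 10    # inverse of the 10-day average resolving time
DELTA = 0.0066    # average death rate (0.66%)

def sirdc_rhs(state, beta, n):
    """Right-hand side of the SIRDC equations for state = (S, I, R, D, C)."""
    s, i, r, d, c = state
    new_infections = beta * s * i / n
    return np.array([
        -new_infections,               # susceptible
        new_infections - GAMMA * i,    # infectious
        GAMMA * i - THETA * r,         # resolving
        DELTA * THETA * r,             # deceased
        (1 - DELTA) * THETA * r,       # recovered
    ])

def simulate(state0, beta_fn, n, days, h=0.1):
    """Integrate with classical RK4 (step h) and record the state once per day."""
    steps_per_day = int(round(1 / h))
    state = np.array(state0, dtype=float)
    t = 0.0
    daily = [state.copy()]
    for _ in range(days):
        for _ in range(steps_per_day):
            k1 = sirdc_rhs(state, beta_fn(t), n)
            k2 = sirdc_rhs(state + 0.5 * h * k1, beta_fn(t + 0.5 * h), n)
            k3 = sirdc_rhs(state + 0.5 * h * k2, beta_fn(t + 0.5 * h), n)
            k4 = sirdc_rhs(state + h * k3, beta_fn(t + h), n)
            state = state + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
            t += h
        daily.append(state.copy())
    return np.array(daily)

# Hypothetical county of one million people with 100 initial infectious cases.
n_pop = 1.0e6
traj = simulate([n_pop - 100, 100, 0, 0, 0], lambda t: 0.4, n_pop, days=60)
```

Because the five right-hand sides sum to zero, the total population is conserved along the trajectory, which is a convenient sanity check for any implementation.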
To determine the characteristics of the SARS-CoV-2 epidemic in US counties, we define the time-dependent effective reproduction number, i.e. the average number of secondary cases per primary case, as

$$R^{\mathrm{eff}}_{i,j}(t) = \frac{\beta_{i,j}(t) S_{i,j}(t)}{\gamma N_{i,j}}.$$

When $R^{\mathrm{eff}}_{i,j}(t) < 1$, the number of active infectious individuals will decrease (and vice versa if $R^{\mathrm{eff}}_{i,j}(t) > 1$). The effective reproduction number is often used to quantify whether or not the disease is under control 22. However, the effective reproduction number does not directly quantify the risk of contracting SARS-CoV-2 for a susceptible individual, as the number of active infectious individuals in a region is not taken into consideration. We compute the average probability of contracting (PoC) SARS-CoV-2, denoted as $P_{i,j}(t) = R^{\mathrm{eff}}_{i,j}(t) I_{i,j}(t) \gamma / S_{i,j}(t) = \beta_{i,j}(t) I_{i,j}(t) / N_{i,j}$, which quantifies the risk of a susceptible individual in county j from state i catching SARS-CoV-2 on day t. Here the risk is understood in an average sense among all susceptible individuals in a region.
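The two risk summaries can be computed directly from the transmission rate and the compartment sizes. The sketch below uses the PoC formula stated above, $P = \beta I / N$, and the effective reproduction number implied by rearranging the stated identity $P = R^{\mathrm{eff}} I \gamma / S$, namely $R^{\mathrm{eff}} = \beta S / (\gamma N)$. The function names and the example numbers are hypothetical.

```python
import numpy as np

def effective_reproduction_number(beta, s, n, gamma=1 / 5):
    """R_eff(t) = beta(t) * S(t) / (gamma * N)."""
    return np.asarray(beta, dtype=float) * np.asarray(s, dtype=float) / (gamma * n)

def prob_contracting(beta, i, n):
    """Daily PoC for a susceptible individual: beta(t) * I(t) / N."""
    return np.asarray(beta, dtype=float) * np.asarray(i, dtype=float) / n

# Made-up county: 100,000 residents, 90,000 susceptible, 500 active infectious.
beta, s, i, n = 0.3, 90_000, 500, 100_000
r_eff = effective_reproduction_number(beta, s, n)   # about 1.35
poc = prob_contracting(beta, i, n)                  # about 0.0015
```

In this toy case the epidemic is growing ($R^{\mathrm{eff}} > 1$), while the daily risk for one susceptible individual is about 0.15%, which shows why the two quantities carry different information.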
The most critical parameter of the SIRDC model is the transmission rate parameter, $\beta_{i,j}(t)$, as a function of time, based on which we obtain the reproduction number on day t. To estimate the time-dependent transmission rates for communities with small population sizes, we derive a more robust estimation of the transmission rate of each county based on the death toll and testing data, discussed below.

Closed-form expressions of the time-dependent transmission rates.
Since observations such as the death toll and confirmed cases are generally updated daily, we solve the ordinary differential equations (ODEs) in the SIRDC model (Eq. 1) approximately by the midpoint rule of the integral with a step size of 1 day. For day $t \in \mathbb{N}^+$, the approximation is given in Eqs. (2)-(6). Further, by assuming that the transmission rate parameter $\beta_{i,j}(t)$ is day-to-day invariant (i.e. a step function with step size 1), based on Eqs. (2) and (3), we obtain $\beta_{i,j}(t + 0.5)$ from $t = 1$ to $T_{i,j} - 1$ iteratively, based on the sequence of susceptible individuals $\{S_{i,j}(t)\}_{t=1}^{T_{i,j}}$ and the initial number of active infectious individuals $I_{i,j}(1)$, as described in Algorithm 1. After we get the number of active infectious individuals $I_{i,j}(t)$ on each day, the sequences of the resolving, deceased and recovered compartments can be solved subsequently in the same manner using Eqs. (4)-(6), after specifying their initial values. Expressing the time-dependent transmission rate by the number of susceptible and infectious individuals is the key to integrating death toll and testing data for estimation.
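The iterative solution described above can be sketched as follows. This is not the authors' Algorithm 1 verbatim: it assumes that the midpoint rule is applied with each midpoint value approximated by the average of the two adjacent days, i.e. $S(t+1) - S(t) \approx -\beta(t+0.5)\,\bar{S}\,\bar{I}/N$ and $I(t+1) - I(t) \approx \beta(t+0.5)\,\bar{S}\,\bar{I}/N - \gamma \bar{I}$ with $\bar{S} = (S(t) + S(t+1))/2$ and $\bar{I} = (I(t) + I(t+1))/2$, and analogously for the resolving, deceased and recovered compartments. Adding the first two relations eliminates $\beta$ and gives $I(t+1)$ in closed form, after which $\beta(t+0.5)$ follows from the first relation. All function names and the toy inputs are hypothetical.

```python
import numpy as np

def transmission_rates_from_susceptibles(s, i1, n, gamma=1 / 5):
    """Recover I(1..T) and beta(t + 0.5) from the sequence S(1..T) and I(1)."""
    s = np.asarray(s, dtype=float)
    n_days = len(s)
    i = np.empty(n_days)
    i[0] = i1
    beta = np.empty(n_days - 1)
    for t in range(n_days - 1):
        drop = s[t] - s[t + 1]                  # new infections over the day
        # Sum of the S and I relations: I(t+1) - I(t) = drop - gamma * i_bar.
        i[t + 1] = ((1 - gamma / 2) * i[t] + drop) / (1 + gamma / 2)
        s_bar = 0.5 * (s[t] + s[t + 1])
        i_bar = 0.5 * (i[t] + i[t + 1])
        beta[t] = n * drop / (s_bar * i_bar)    # closed-form transmission rate
    return i, beta

def resolve_compartments(i, r1, d1, c1, gamma=1 / 5, theta=1 / 10, delta=0.0066):
    """Resolving, deceased and recovered sequences under the same discretisation."""
    n_days = len(i)
    r, d, c = np.empty(n_days), np.empty(n_days), np.empty(n_days)
    r[0], d[0], c[0] = r1, d1, c1
    for t in range(n_days - 1):
        i_bar = 0.5 * (i[t] + i[t + 1])
        r[t + 1] = ((1 - theta / 2) * r[t] + gamma * i_bar) / (1 + theta / 2)
        r_bar = 0.5 * (r[t] + r[t + 1])
        d[t + 1] = d[t] + delta * theta * r_bar
        c[t + 1] = c[t] + (1 - delta) * theta * r_bar
    return r, d, c

# Toy inputs: 1000 residents, 10 infectious on day 1, made-up susceptible counts.
n_pop = 1000.0
s_seq = np.array([990.0, 980.0, 965.0, 945.0])
i_seq, beta_seq = transmission_rates_from_susceptibles(s_seq, i1=10.0, n=n_pop)
r_seq, d_seq, c_seq = resolve_compartments(i_seq, r1=0.0, d1=0.0, c1=0.0)
```

Under these assumptions no numerical ODE solver or optimisation is needed to obtain the transmission rate once the susceptible sequence is available. The sketch does not guard against a zero average number of infectious individuals, which would make the rate undefined.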
In Figs. 5 and 6, we demonstrate that, for solving the ODEs in the SIRDC model, our approach is more accurate and robust than the F&J method in Ref. 9 under both simulated and real scenarios. Other more accurate methods (such as the Runge-Kutta method) can also solve the ODEs of the SIRDC model, but the time-dependent [...]

Figure caption (fragment). [...] results of the simulation with noisy observations, which have the same interpretation as (a-c). In this simulation, we set the transmission rate $\beta(t) = \exp\left\{-0.7\left(\frac{9}{T-1}(t-1) + 1\right)\right\} + \epsilon$, for $1 \le t \le T$ and $\epsilon \sim N(0, 0.04)$, and the other parameters are held the same as in the noise-free simulation. The transmission rates estimated from the F&J method are truncated to be within [0, 10]. The solutions from our robust estimation approach, the lsoda method, and the 4th-order Runge-Kutta method with step size 0.1 overlap for both scenarios.

Estimation of the number of susceptible individuals.
Note that we have $S_{i,j}(t) + c^{o}_{i,j}(t) + c^{u}_{i,j}(t) = N_{i,j}$ for any t, where $c^{o}_{i,j}(t)$ and $c^{u}_{i,j}(t)$ are the numbers of cumulative observed confirmed cases and unobserved confirmed cases, respectively. Estimating the number of susceptible individuals is equivalent to estimating the number of unobserved confirmed cases $c^{u}_{i,j}(t)$, because the number of observed confirmed cases $c^{o}_{i,j}(t)$ and the population $N_{i,j}$ are known. Here we combine them with the positive test rates to estimate $c^{u}_{i,j}(t)$, as large positive test rates typically indicate a large number of unobserved confirmed cases. We assume that the total number of confirmed cases is equal to the observed confirmed cases, adjusted by the state-level test positive rate $p_i(t)$, a power parameter $\alpha_i$ and a weight parameter $\omega_{i,j}$, leading to the formula of the susceptible population in Eq. (7), where $\Delta c^{o}_{i,j}(t)$ is the observed daily confirmed cases on day t, for $t = 1, 2, \ldots, T_{i,j}$, $i = 1, 2, \ldots, k$ and $j = 1, 2, \ldots, n_i$.
Since the positive test rates are only available at the state level, the power parameter $\alpha_i \in [0, 2]$ is estimated by the state-level observations. According to Eq. (7), the time-invariant weight $\omega_{i,j}$ can be expressed through $I_{i,j}(1)$, $R_{i,j}(1)$, $D_{i,j}(1)$ and $C_{i,j}(1)$, the numbers of active infectious, resolving, deceased and recovered cases on day 1, respectively.

Estimation of initial values of infectious and resolving cases.
We define day 1 of a county as the later of 21 March 2020 and the date on which the county has 5 observed confirmed cases for the first time. Since all counties were at an early stage of the epidemic on the starting day, we let the initial value of the death toll $D_{i,j}(1)$ be the observed death toll on day 1, and the initial value of the recovered cases be 0. This assumption is not likely to strongly influence our analysis, as the number of recovered cases is only a negligible proportion of the susceptible individuals on the starting day, if not zero. The only parameters to estimate are the number of infectious individuals $I_{i,j}(1)$ and the number of resolving cases $R_{i,j}(1)$ on day 1 for county j from state i, after the power parameter $\alpha_i$ is estimated using the state-level observations to minimize the same loss function. The minimization is subject to the constraints $0 \le I_{i,j}(1) + R_{i,j}(1) \le U_{i,j}$, $I_{i,j}(1) \ge 0$ and $R_{i,j}(1) \ge 0$, where the upper bound $U_{i,j}$ is chosen to guarantee that the estimated number of susceptible individuals $S_{i,j}(t)$ is larger than 0 for $t = 1, 2, \ldots, T_{i,j}$. After the initial values of infectious and resolving cases are estimated, we obtain the estimation of the susceptible cases from Eq. (7), and the infectious cases and transmission rates on each date for each county from Algorithm 1. The resolving cases, deaths, and recovered cases can be derived subsequently from Eqs. (4)-(6), respectively. The estimated basic and effective reproduction numbers can be derived from the fitted time-dependent transmission rate, and the estimated probability of contracting SARS-CoV-2 for an individual can be computed based on the transmission rate and the number of infectious individuals for each county on each day.

Figure 11. (a-f) The simulated results of COVID-19 progression characteristics in Washington (the first row) and in Texas (the second row), which have the same interpretation as (a-f) in Fig. 4, with the infectious period changed from 5 to 4.5 days, whereas other parameters are held the same.
Forecast and uncertainty assessment. Our method can also be used as a tool for forecasting compartments (e.g., death toll), reproduction numbers, and the probability of contracting SARS-CoV-2 at each county for a short period. We extrapolate the transmission rate based on Gaussian processes implemented in the RobustGaSP R package 23 with robust parameter estimation 24,25. Based on the extrapolated transmission rates, the compartments can be solved iteratively based on Eqs. (2)-(6).
We also found that the forecast is generally improved by modeling the residuals between observed deaths and modeled deaths by a zero-mean Gaussian process (GP). One advantage of a GP model is the internal assessment of the uncertainty of the forecast from the predictive distribution, which is of crucial importance. The aggregated model that combines the SIRDC model and the GP model for county j from state i in the US is described as follows:

$$D_{i,j}(t) = F_{i,j}(t) + z_{i,j}(t) + \epsilon_{i,j,t},$$

where $D_{i,j}(t)$ and $F_{i,j}(t)$ denote the observed death toll and the death toll estimated via the SIRDC model, respectively. The noise independently follows a Gaussian distribution $\epsilon_{i,j,t} \sim N(0, \sigma^2_{i,j,0})$ with variance parameter $\sigma^2_{i,j,0}$. The latent temporal process $z_{i,j}(t)$ is modeled by a zero-mean GP, meaning that for time points $\{1, 2, \ldots, T_{i,j}\}$, $\mathbf{z}_{i,j} = (z_{i,j}(1), \ldots, z_{i,j}(T_{i,j}))^T$ follows a multivariate normal distribution:

$$\mathbf{z}_{i,j} \sim \mathcal{MN}(\mathbf{0}, \boldsymbol{\Sigma}_{i,j}),$$

where the $(l, m)$ entry of $\boldsymbol{\Sigma}_{i,j}$ is parameterized by a covariance function $\sigma^2_{i,j} K_{i,j}(l, m)$ for $1 \le l, m \le T_{i,j}$. Here $\sigma^2_{i,j}$ is the variance parameter and $K_{i,j}(\cdot, \cdot)$ is a one-dimensional correlation function. We use the power exponential correlation function:

$$K_{i,j}(l, m) = \exp\left\{-\left(\frac{|l - m|}{b_{i,j}}\right)^{a}\right\},$$

where a is the roughness parameter, fixed to be 1.9 as in other studies 26,27 to avoid possible singularity in the inversion of the covariance matrix under the Gaussian correlation ($a = 2$), and $b_{i,j}$ is a range parameter for each county estimated from the data. We define the nugget parameter $\eta_{i,j} = \sigma^2_{i,j,0} / \sigma^2_{i,j}$. The range parameter $b_{i,j}$ and the nugget parameter $\eta_{i,j}$ in Eq. (10) are estimated by marginal posterior mode estimation using the rgasp function in the package RobustGaSP available on CRAN 24.
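The residual model can be illustrated with a short sketch of the power exponential correlation and the usual plug-in predictive mean of a zero-mean GP with a nugget, $\hat{z}(t^*) = \mathbf{r}(t^*)^T (\mathbf{R} + \eta \mathbf{I})^{-1} (\mathbf{D} - \mathbf{F})$. This is a generic NumPy illustration under the assumption that the range and nugget parameters are already given; it is not the RobustGaSP implementation used in the study, which estimates those parameters and also provides predictive intervals. The function names and the toy residuals are hypothetical.

```python
import numpy as np

def pow_exp_corr(t1, t2, b, a=1.9):
    """Power exponential correlation exp(-(|l - m| / b)^a) between time points."""
    dist = np.abs(np.subtract.outer(np.asarray(t1, float), np.asarray(t2, float)))
    return np.exp(-(dist / b) ** a)

def predict_residual_mean(t_train, resid, t_new, b, eta):
    """Plug-in predictive mean of the zero-mean residual process at t_new."""
    r_tilde = pow_exp_corr(t_train, t_train, b) + eta * np.eye(len(t_train))
    r_new = pow_exp_corr(t_new, t_train, b)       # shape (n_new, n_train)
    return r_new @ np.linalg.solve(r_tilde, np.asarray(resid, dtype=float))

# Made-up residuals (observed minus SIRDC-modeled deaths) on 8 training days.
t_train = np.arange(1, 9)
resid = np.array([0.5, 1.0, 0.2, -0.4, -1.1, -0.3, 0.6, 0.9])

fitted = predict_residual_mean(t_train, resid, t_train, b=1.5, eta=1e-10)
forecast = predict_residual_mean(t_train, resid, np.array([9, 10, 11]), b=1.5, eta=1e-10)
far_ahead = predict_residual_mean(t_train, resid, np.array([100]), b=1.5, eta=1e-10)
```

With a negligible nugget the predictive mean interpolates the training residuals, and far beyond the training window it shrinks to zero, so the long-range death toll forecast reverts to the SIRDC output alone.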
Denote $\mathbf{D}_{i,j} = (D_{i,j}(1), \ldots, D_{i,j}(T_{i,j}))^T$ and $\mathbf{F}_{i,j} = (F_{i,j}(1), \ldots, F_{i,j}(T_{i,j}))^T$. After marginalizing out the variance parameter by the reference prior $p(\sigma^2_{i,j}) \propto 1/\sigma^2_{i,j}$, for any $t^*$, the predictive distribution of $z_{i,j}(t^*)$, conditional on the observations, the range parameter $b_{i,j}$ and the nugget parameter $\eta_{i,j}$, follows a non-central Student's t-distribution with $T_{i,j}$ degrees of freedom 24. Its parameters are computed with $\tilde{\mathbf{R}}_{i,j} = \mathbf{R}_{i,j} + \eta_{i,j} \mathbf{I}_{T_{i,j}}$, the $(l, m)$th term of $\mathbf{R}_{i,j}$ being $K_{i,j}(l, m)$ for $1 \le l, m \le T_{i,j}$, and $\mathbf{r}_{i,j}(t^*) = (K_{i,j}(t^*, 1), \ldots, K_{i,j}(t^*, T_{i,j}))^T$, by plugging in the estimated range parameter $b_{i,j}$ and nugget $\eta_{i,j}$. The predictive mean $\hat{z}_{i,j}(t^*)$ for forecasting the death toll of the jth county in the ith state at a future day $t^*$ and the predictive interval can be computed based on the Student's t-distribution. An overview of the forecast algorithm and the numerical comparison of different forecasting approaches is given in the Supplementary Information.

[...] graphed based on the publicly available R package urbnmapr. The code used in this paper is publicly available: https://github.com/HanmoLi/Robust-estimation-of-SARS-CoV-2-epidemic-in-US-counties/.