Supplementary Material for : Modeling serological testing to inform relaxation of social distancing for COVID-19 control

Serological testing remains a passive component of the public health response to the COVID-19 pandemic. Using a transmission model, we examine how serological testing could have enabled seropositive individuals to increase their relative levels of social interaction while offsetting transmission risks. We simulate widespread serological testing in New York City, South Florida, and Washington Puget Sound and assume seropositive individuals partially restore their social contacts. Compared to no intervention, our model suggests that widespread serological testing starting in late 2020 would have averted approximately 3300 deaths in New York City, 1400 deaths in South Florida and 11,000 deaths in Washington State by June 2021. In all sites, serological testing blunted subsequent waves of transmission. Findings demonstrate the potential benefit of widespread serological testing, had it been implemented in the pre-vaccine era, and remain relevant now amid the potential for emergence of new variants.

s × m 2 (t)γ hs H s,i + (1 − H i )γ s I + s,i + γ a I + a,i m 1 (t) and m 2 (t) are defined as follows (t start is when testing starts, on November 1, 2020): The force of infection λ(t) for the i-th group is a function of the number of social contacts for age group i with each subgroup j at time t (x i,j (t)), the probability of infection given contact (q), the number of infections in each group at time t (infec j (t)), and the population size of each group at time t (n j (t)). The overall equation for λ i (t) is shown below: λ i (t) = q x i,ch (t)i ch (t) n ch (t) + x i,ch + (t)i ch + (t) n ch + (t) + x i,ad (t)i ad (t) n ad (t) + x i,ad + (t)i ad + (t) n ad + (t) + x i,rc (t)i rc (t) n rc (t) + x i,rc + (t)i rc + (t) n rc + (t) + x i,f c (t)i f c (t) n f c (t) + x i,f c + (t)i f c + (t) n f c + (t) + x i,el (t)i el (t) n el (t) + x i,el + (t)i el + (t) n el + (t) i and j take values of ch (children age 0-19 years), ad (adults age 20-65 years who are not working or working from home), el (older adults 65+ years of age), rc (adults at work in 'reduced-contact' occupations, where they have fewer contacts than pre-pandemic), f c (adults as work in full contact occupations, where they have the same number of contacts as pre-pandemic).
The number of infectious individuals by age group and test status is equal to the sum of documented (symptomatic) cases and a fraction of the undocumented (asymptomatic) cases, where this fraction η corresponds to the relative infectiousness of undocumented cases. This is shown below for children: i ch (t) = ηI a,ch + I s,ch

S2. Contact matrices Baseline contacts
Baseline contact matrices for 'work' contacts, 'school' contacts, 'home' contacts, and 'other' contacts were taken from [1]. To expand these baseline matrices to the 5 population groups in our model (separating the adult population into f c, rc, and h classes), we multiplied all contacts with adults x i,ad by the proportion of the adult population falling into each class. We define the fraction of the population falling in each working group as follows: f home = n home n home + n reduced + n full f reduced = n reduced n home + n reduced + n full f f ull = n full n home + n reduced + n full For baseline contact matrix values based on [1], we have: x ch,ch x ch,ad x ch,el x ad,ch x ad,ad x ad,el x el,ch x el,ad x el,el   

S3
For simplicity, we assume that baseline interactions between worker subgroups are only assortative with respect to age (and not with respect to occupation type). To expand this matrix to a 5x5 matrix we use the following notation, where rows 2, 3, and 4 correspond to the work from home, reduced contact, and full contact occupation groups, respectively: Based on these proportions, we define x i,h , x i,rc , and x i,f c as follows:

Contacts under social distancing
After social distancing has begun, we assume that: • Home contacts remain the same.
• Only working age adults continue to work. Baseline workplace contacts for children and young adults under 20 years of age are nearly zero (average 0.84 contacts/day) and the average workplace contacts for the elderly is 0, so this does not appreciably impact our results.
• All working adults who are able work from home.
• Adults continuing to work outside the home reduce their workplace contacts by constant p reduced .
• Other contacts are reduced by scalar constant θ o .
The revised contact matrix for work contacts then becomes: The revised contact matrix for other contacts becomes:

Contacts during initial relaxing of social distancing
When stay at home orders are initially lifted, we assume that: • Home contacts remain the same.
• Adults who were working from home continue to work from home.
• Workers in reduced contact occupations increase their workplace contacts based on the intensity of social distancing maintained.
• Schools remain closed until September 1, 2020 in South Florida and until October 1, 2020 in New York City and Washington, after which time they reopen at 50% capacity.
• Other contacts continue to be reduced based on the intensity of social distancing maintained.

Contacts after testing begins
When testing begins, we assume that: • Test-positive individuals move to the test positive group.
• Home contacts remain the same, but their distribution by test status is driven by the proportion of test-positives in the general population.
• Adults who were working from home may return to work if they test positive. Upon returning to work, their workplace contacts are assortative with respect to test status (but not with respect to occupation type).
• Workers in reduced contact occupations increase their workplace contacts based on the intensity of social distancing maintained. Work contacts are preferentially with test-positive individuals, as determined by α, or shielding strength.
• Other contacts are increased for test positive individuals to their pre-pandemic levels. Other contacts continue to be reduced for test negative/untested individuals based on the intensity of social distancing maintained. Other contacts are preferentially with test-positive individuals, as determined by α, or shielding strength.
After testing has begun, all contact matrices are dependent on the proportion of the population that has tested positive and been released from social distancing at time t. We define this proportion as r i (t), where (1 − r i (t)) is the fraction of the population who has not yet tested positive.
We assume that social distancing parameters are relaxed from their initial values as follows: For contact matrices of work and 'other' contacts, we implement shielding factor α, which increases the probability of contacting a test-positive individual according to their prevalence in the population (achieved by multiplying expected contact rates due to prevalence by scaling factor α + 1). To account for the fact that, when prevalence is high, (α + 1)r i (t) may exceed 1, we introduce a variable s i (t): This shielding structure is similar to 'fixed shielding' [2] in that it preserves the baseline number of contacts and increases contacts for test positive individuals by 1 + α, as shown below: The structure of all three matrices (home, work, and other) is given by M:

S3. Model parameters
We defined the three metropolitan areas in the same way as Havers et al [3]. For Washington Puget Sound metropolitan region, we included death data from King, Snohomish, Pierce, Kitsap and Grays Harbor counties. For the New York City metropolitan region, we included data from Manhattan, Bronx, Queens, Kings, and Nassau Counties. For the South Florida metropolitan region, we included data from Miami-Dade, Broward, Palm Beach, and Martin counties. These same counties were also used to derive age-specific population sizes for each region.
Parameters used in the model simulations are shown in Table 1 and Table 2. We assume that the size of the working population is stable over the duration of the simulation. Although this may not be the case as unemployment increases throughout the pandemic, the rate at which unemployment has changed so far has been time-varying and its future trajectory is unknown.
Where possible, parameters were taken from prior literature. We fixed all parameters in Table 1 as well as the site-specific population age structure and the timing of public health interventions. We did not vary these parameters as part of our fitting procedure-only fitted parameters were considered to be uncertain as part of our fitting procedure and it is the uncertainty in these parameters that is captured by the credible intervals. We fitted the following parameters: the probability of infection per contact (q), the strength of social distancing (c), the fraction of infections symptomatic (p), the initial intensity of social distancing for workplace contacts (θ w ) and for other contacts (θ o ). These parameters were fitted to initial outbreak We used the dates that stay-at-home orders were enacted, and later lifted, in each location and the dates corresponding to when local schools opened for at least partial in-person instruction in each location to inform dates of reopening in the model.

S4. Model fitting Initial Conditions
We first calculate the number of weeks between the first death and the first week where the cumulative death toll exceeds 10. For example, in the South Florida region, the first death was reported the week of March 18, 2020. The two subsequent weeks saw the death toll rise to 6 and 42. Using region-specific conditions (population demographics, stay-at-home order and lift dates, and deaths data), we initialize an epidemic consisting of a single exposed adult (a0). We use baseline estimates for the parameters we aim to fit later (q = 0.0451, c = 1, p = 0.14, θ o = 0.25, θ w = 0.1). We forward simulate a single-origin epidemic until the modeled number of deaths exceeds the number of deaths reported in the second week after the first death (in South Florida, 42 deaths), then use the distribution two weeks prior as our initial conditions.

S8
Parameter Code Value Units Source(s)

MCMC
Using Markov Chain Monte Carlo (MCMC) we estimate the five model parameters listed in Table 2 with uniform priors on a fixed range (q bounded from 0 to 0.07; c, p, θ o , and θ w bounded from 0 to 1).
The penalized log-likelihood function is defined as follows: let d o (t) be the observed cumulative number of deaths at time t and r o (t) be the observed weekly death rate. Let d s (t) be the model simulated cumulative number of deaths and r s (t) be the weekly death rates at time t. Let t m and t f denote a midpoint (11 weeks) and final time point of our simulation window. Let σ s and σ o denote the simulated and observed populationlevel seroprevalence estimates for each location by Havers et al [3]. Let a = 100000 be a scaling factor and F lp denote the Poisson log-likelihood function. The log-likelihood function (LL)is defined as: We checked for chain convergence using the Gelman-Rubin diagnostic (Supplementary Figure 1). We also considered fitting the relative transmissibility of asymptomatic cases (η) and the latent period (1/γ e ) (7 parameter fits in Supplementary Figure 1) but these fits did not converge and were therefore not considered further. Supplementary Figures 2, 5,  and 100 random parameter set draws from the posterior distribution (gray). Simulated data are presented as mean (black line) +/-1.96 sd (grey bands). Cumulative deaths are plotted (blue squares), in addition to seroprevalence estimates (red squares) in each location. Each column corresponds to an MCMC chain. Each chain is seeded randomly.

Credible intervals width
The overall width of credible intervals (as shown in the main text) was determined by the value of c (Supplementary Figure 14). As shown in the trace plots, the value of c was not identifiable from our model simulations, particularly for Washington and New York City, with a wide variety of initial values matching the initial dynamics. In South Florida, the fitted value of c was more constrained, yet model predictions still in large part depended on the ongoing level of social distancing. In part, this issue arose because the relaxation of social distancing in these two locations began after the first wave of the epidemic was largely complete (main text Figure 3), and thus there were few cases shortly after reopening with which to calibrate the dynamics. However, as the outbreak continued, this parameter was crucial for determining the number

S5. R 0 Estimation
The dynamics of the system and R 0 are determined by how the outbreak would proceed at time zero in the absence of any interventions, and therefore depends on the fitted values of p and q, but not on any of the other fitted parameters. Therefore, we assume no testing at time 0, no social distancing, and no differences in worker contact levels (i.e., all groups mix at the population-average level prior to the outbreak). In this situation, there are only 3 population subgroups at time zero: The λ i for each group is defined as follows.
λ ch = qx ch,ch (ηI a,ch + I s,ch ) n ch + qx ch,ad (ηI a,ad + I s,ad ) n ad + qx ch,el (ηI a,el + I s,el ) n el λ ad = qx ad,ch (ηI a,ch + I s,ch ) n ch + qx ad,ad (ηI a,ad + I s,ad ) n ad + qx ad,el (ηI a,el + I s,el ) n el λ el = qx el,ch (ηI a,ch + I s,ch ) n ch + qx el,ad (ηI a,ad + I s,ad ) n ad + qx el,el (ηI a,el + I s,el ) n el The matrices F and V corresponding to these equations are: To derive an expression for R 0 we inverted the V matrix and multiplied by F. The dominant eigenvalues of this matrix can be computed, but are very complex and are therefore not shown here. It is notable that because hospitalized cases do not contribute to the force of infection, the value of R 0 does not depend on γ hs or γ hc . We calculated the value of R 0 for each of the three locations based on the fitted values of p and q and the other relevant parameters.

S7. Sensitivity of model results to fixed parameters
Overall PRCC results are shown in Supplementary Figures 12-14. In general, increased relative transmissibility for asymptomatic cases tended to increase the potential for deaths averted through testing/shielding, whereas faster recovery rates for both symptomatic and asymptomatic cases tended to decrease the potential impact of shielding. These patterns were most pronounced for South Florida and Washington. In New York City, a shorter latent period was associated with decreased potential impact of shielding when monthly testing with a highly specific test was used, but not in any of the other intervention scenarios considered.
These patterns were reversed when testing was employed without shielding. The duration of hospitalization did not alter the ultimate impact of a shielding strategy in any of the sites.  shown for key fixed parameters in the model, including the relative infectiousness of asymptomatic cases (η), the latent (incubation) period (γ e ), recovery rates for both symptomatic and asymptomatic infections (γ a , γ s ), and duration of hospitalization (γ hs , γ hc ).We used Latin Hypercube Sampling to generate 300 random parameter sets based on probable ranges for each parameter, assuming a uniform distribution within each range. Error bars represent the 95% bootstrap confidence intervals (1000 bootstrap replicates) for the Pearson correlation coefficient accounting for multiple (6, one for each parameter tested) comparisons using a Bonferroni correction.

S8. Adding vaccination to the model
In this extension of our main model, we assume a constant complete vaccination rate of 0.5% of the population per day (similar to the US peak distribution of 3 million doses/day when accounting for lower rates due to the need to administer 2 doses to complete the series) starting on January 1, 2021. We assume a vaccine efficacy of 95%: vaccinated, protected individuals (95% of vaccinees) are moved from the untested compartments to the tested, recovered (R) compartment and vaccinated, unprotected persons (5% of vaccinees), are placed in the tested, susceptible (S) compartment. This represents an "all-or-nothing" vaccine, whereby vaccinated persons receive complete protection from SARS-CoV-2 infection and disease. Interestingly, we find that the impact on overall deaths is not large relative to shielding if vaccination is implemented at a rate similar to that seen currently (and surmise that less effective vaccines will also lead to modest reductions in overall fatalities). While vaccines are undoubtedly highly effective at mitigating pandemic effects, this result underscores the large effect of implementing a testing and shielding intervention on a population-level. On top of the impact of a shielding strategy (assuming both high levels of shielding and rapid testing rates), vaccination provides little additional benefit. Of note, in New York City, where the benefit of shielding is very sensitive to small changes in test specificity (Figure 4), our scenarios with vaccination lead to slightly more deaths than those with shielding alone, since a vaccine with < 100% efficacy leads to 'false positives', and thus increased risk because those who are vaccinated but unprotected resume pre-pandemic levels of social interaction but can still be infected and transmit).   shown for key fixed parameters in the model, including the relative infectiousness of asymptomatic cases (η), the latent (incubation) period (γ e ), recovery rates for both symptomatic and asymptomatic infections (γ a , γ s ), and duration of hospitalization (γ hs , γ hc ). We used Latin Hypercube Sampling to generate 300 random parameter sets based on probable ranges for each parameter, assuming a uniform distribution within each range.Error bars represent the 95% bootstrap confidence intervals (1000 bootstrap replicates) for the Pearson correlation coefficient accounting for multiple (6, one for each parameter tested) comparisons using a Bonferroni correction.  for key fixed parameters in the model, including the relative infectiousness of asymptomatic cases (η), the latent (incubation) period (γ e ), recovery rates for both symptomatic and asymptomatic infections (γ a , γ s ), and duration of hospitalization (γ hs , γ hc ). We used Latin Hypercube Sampling to generate 300 random parameter sets based on probable ranges for each parameter, assuming a uniform distribution within each range. Error bars represent the 95% bootstrap confidence intervals (1000 bootstrap replicates) for the Pearson correlation coefficient accounting for multiple (6, one for each parameter tested) comparisons using a Bonferroni correction. S30