Main

The first report of a new infectious disease, later called COVID-19, appeared on 31 December 20191. As of 15 July 2020, the disease has spread to 188 countries, with more than 13.3 million confirmed cases, and has killed more than 579,500 people2. As the number of confirmed COVID-19 cases increased and the expansion of the disease entered into a global exponential growth phase, a large number of affected countries were forced to adopt non-pharmaceutical interventions at an unprecedented scale. Given the absence of specific antiviral prophylaxis, therapeutics or a vaccine, non-pharmaceutical interventions ranging from case isolation and quarantine of contacts, to the lockdown of entire populations have been implemented with the aim of suppressing and mitigating the epidemic before it could overwhelm the healthcare system. Although these aggressive measures appear to be successful in reducing the number of deaths and hospitalizations3,4, and in reducing the transmission of the SARS-CoV-2 virus, the absence of herd immunity after the first wave of the epidemic points to a high risk of resurgence when interventions are relaxed and societies return to a ‘business as usual’ lifestyle5,6,7. It is therefore of paramount importance to analyse different mitigation and containment strategies aimed at minimizing the risk of potential additional waves of the COVID-19 epidemic, while providing an acceptable trade-off between economic and public health objectives.

In this study, through the integration of anonymized and privacy-enhanced data from mobile devices and census data, we build a detailed sample of the synthetic population of the Boston metropolitan area (BMA) in the United States (see Fig. 1a,b). This synthetic population (Fig. 1a) is used to define a data-driven agent-based model of SARS-CoV-2 transmission and to provide a quantitative analysis of the evolution of the epidemic and the effectiveness of social-distancing interventions. The model allows us to explore strategies concerning the lifting of social-distancing interventions in conjunction with testing and isolation of cases and tracing and quarantine of exposed contacts. Our results indicate that, after the abatement of the epidemic through the ‘stay-at-home’ orders and halt to all non-essential activities, a proactive policy of testing, contact tracing and household quarantine of contacts enables the gradual reopening of economic activities and workplaces, with a low COVID-19 incidence in the population and a manageable impact on the healthcare system.

To provide a quantitative estimate of the contact patterns for the population of agents and to build the synthetic population of the BMA, we used detailed mobility and socio-demographic data and generated a network that encodes the contact patterns of around 85,000 agents in the area during a period of six months (see Supplementary Information). Agents are chosen to be representative of the different census areas in the Boston area following the methodology used in ref. 8. This defines a weighted multilayer network consisting of three layers representing the network of social interactions at (1) workplace and community level, (2) households, and (3) schools, as shown in Fig. 1a. Connections between two agents in the workplace and community layer are estimated from the data by the probability of both being present in a specific place (for example, a restaurant, workplace or shop) weighted according to the time they have spent in the same place. A second layer represents the households of each anonymous individual. Using the home census block group of each anonymous user, we associate each individual with a specific household profile on the basis of socio-demographic data at US census block group level9. Families are generated by randomly mixing nodes from the community living in the same census block group, following the statistical features of family types and sizes. Finally, a third layer represents the contacts in schools (that is, every node represents one synthetic student and has contacts only with other individuals attending the same school). To study the evolving dynamics of the infection, we implemented a stochastic, discrete-time compartmental model in which individuals transition from one state to the other according to key time-to-event intervals (for example, incubation period, serial interval and time from symptom onset to hospital admission), as from available data on SARS-CoV-2 transmission. The natural history of the disease is captured by the epidemiological model represented in Fig. 1c, where we also show the transition rates among compartments8,10,11,12. The model considers that susceptible individuals (S) become infected through contact with any of the infectious categories (infectious symptomatic (IS), infectious asymptomatic (IA) and pre-symptomatic (PS)), transitioning to latent compartments (LS and LA), where they are infected but not yet infectious. Latent individuals branch out in two paths, according to whether the infection will be symptomatic or not. We also consider that symptomatic individuals experience a pre-symptomatic phase and that once they develop symptoms, they can experience diverse degrees of illness severity, from mild symptoms to hospitalization (H) or in need of an intensive care unit (ICU)13. Finally, individuals transition in the removed compartment (identifying recovered or dead individuals). The model assumes a basic reproductive number, R0 = 2.5, which together with the other parameters (Supplementary Table 1) yields a generation time Tg = 6.6 d. We consider a 25% fraction of asymptomatic individuals. We report the full set of parameters used in the model and an extensive sensitivity analysis in the Supplementary Information. The model is not calibrated to account for the specific evolution of the COVID-19 epidemic in Boston, as it is aimed at showing the effect of different non-pharmaceutical interventions rather than providing a forensic analysis of the outbreak in the BMA. Furthermore, current data on the detailed reopening policies in the BMA during the summer are still under evaluation pending the future evaluation of the epidemic trajectory. Details on the generation of the synthetic population network and the infection-transmission model are provided in the Supplementary Information.

Results

To provide a baseline of the impact of COVID-19 in the BMA, we have first investigated an unmitigated scenario in which no interventions are implemented. Results for the unmitigated scenario are shown in Fig. 2a–c. An unmitigated COVID-19 would have a peak of daily incidence of 25.2 (95% confidence interval (CI): 23.8–26.4) newly infected individuals per 1,000 people. The epidemic follows a typical trajectory; namely, when the effective reproduction number Rt as a function of time (Fig. 2c) becomes smaller than 1, the transmission dynamics slow down and eventually vanish after having infected about 75% of the population (Fig. 2b). Figure 3a shows the evolution of the estimated number of new severely affected individuals who require hospitalization and admission into ICUs. At the peak of the unmitigated epidemic, the number of ICU beds needed exceeds the available capacity (dashed horizontal line in Fig. 3a) by a factor of more than 10, indicating that the healthcare system would suffer large service disruptions, resulting in additional deaths due to hospitals overcrowded with patients with COVID-19 (https://www.ahd.com/data_sources.html). It is worth noting that current estimated fatality rates consider the general availability of ICU beds and critical care capacity. If this is not possible, the fatality rate may increase substantially. We do not report fatality estimates here, as this goes beyond the scope of our analysis and should consider specific data for the BMA, as well as changing medical treatment and therapeutics in future months.

To avoid the harmful effects of an unmitigated COVID-19 epidemic, governments and policy makers across the world have relied on the introduction of aggressive social-distancing measures. In the United States, as of April 15 2020, it was estimated that more than 95% of the population was under a stay-at-home or ‘shelter-in-place’ order14,15. To model the social-distancing policies implemented in the whole BMA, we have considered March 17 2020 as the average starting date of social-distancing policies that include school closures, the shut down of all non-essential work activities as well as mobility restrictions (details in Supplementary Information). This scenario mimics the social-distancing intervention implemented in most of the high income countries, in Europe and across states in the United States. Such extreme social-distancing policies come with very large economic costs and social disruption effects16, prompting the question of what exit strategy can be devised to restart economic activities and normal societal functions17. For this reason, we explore two different scenarios for lifting social-distancing interventions:

• Lift scenario (LIFT): the stay-at-home order is lifted after eight weeks by reopening all work and community places, except for mass-gathering locations such as restaurants, theatres and similar locations (see Supplementary Information). The latter partial reopening is enforced for another four weeks, which is followed by a full lifting of all the remaining restrictions. We consider that schools will remain closed, given the impending summer break in July and August 2020. Indeed, some school systems, such as the Boston public schools, have announced that they will remain closed for the rest of the 2019–2020 school year.

• Lift and enhanced tracing (LET) scenario: The stay-at-home order is lifted as in the previous scenario. Once partial reopening is implemented, we assume that 50% of symptomatic COVID-19 cases can be diagnosed with SARS-CoV-2 infection, on average, within 2 d after the onset of symptoms and that they are isolated at home and their household members are quarantined successfully for 2 weeks (a sensitivity analysis for lower rates of isolation and quarantine is presented in the Supplementary Information). Although COVID-19 tests are highly specific, our 50% detection rate also accounts for imperfect testing. We also assume that a fraction of the non-household contacts (we show results for 20% and 40%) of the symptomatic infections can be traced and quarantined along with their household—note that we consider that the contacts are identified with a rate proportional to the duration of the interaction with the symptomatic individual.

The above scenarios are mechanistically simulated on the multilayer network of Fig. 1a by allowing different interactions (between effective contacts) according to the simulated strategy. As a result, the average number of interactions in the workplace and community layer changes from 10.86 (95% CI: 1.51–42.39) under the unmitigated scenario, to 4.10 (95% CI: 0–23.79) for the partial lockdown and 0.89 (95% CI: 0–8.39) contacts for the stay-at-home policy (further details in Methods and Supplementary Information). This result is in agreement with previously published work18 and recent reports in the New York City area19. It is worth remarking that the fluctuations in the number of contacts under the stay-at-home policy is due to a large extent to contacts that take place in grocery stores and other public venues.

The numerical results show that the LIFT scenario, while able to temporarily abate the epidemic incidence, does not prevent the resurgence of the epidemic and a second COVID-19 wave when the social-distancing measures are relaxed. In Fig. 2d, we show that following the lifting of social-distancing, the infection incidence starts to increase again, and the effective reproductive number, which dropped by around 75% and reached values below 1 with the intervention, increases to values up to 2.05 (95%CI: 1.73–2.47) (Fig. 2f). Indeed, at the time of lifting the social-distancing intervention, the population has not achieved the level of herd immunity that would protect it from a resurgence of the epidemic. It is important to stress that here we do not consider additional mitigation measures such as behavioural changes in the population, such as mask wearing and similar measures (see the Supplementary Information). We also estimate that a second wave of the epidemic still has the potential to infect a large fraction of the population (Fig. 2e) and to overwhelm the healthcare system, as shown in Fig. 3b. The number of ICU beds needed, although it half of the number of the unmitigated scenario, still far exceeds the estimated availability, as pointed out in similar scenario analysis5,6,7,20. This suggests that lifting social distancing without the support of additional containment strategies is not a viable option.

In the case of the LET scenario, the lifting of the social-distancing intervention is accompanied by a substantial amount of contact tracing and precautionary quarantine of potentially exposed individuals. The quarantine is not limited to the contacts of the identified symptomatic COVID-19 sufferer, but extends to their households. This strategy amounts to a simplified tracing of contacts of contacts, which would not require extensive investigations. In other words, this strategy does not require the tracking of a large number of single contacts, but instead leverages the contacts’ households as the basic unit21. Households could, however, be monitored with daily calls or messages to determine the onset of symptomatic infections and provide medical support as needed.

Figure 2g shows results obtained for different levels of tracing (no tracing, and 20% and 40% tracing) of the contacts of the symptomatic isolated COVID-19 cases. By comparing Fig. 2d with Fig. 2g (for no tracing), we find that quarantining households of symptomatic individuals alone is not sufficient to substantially change the course of the epidemic and the conclusions reached for the first of these scenarios. When 40% or more of the contacts of the detected symptomatic infections are traced and they and their households are quarantined, the ensuing reduction in transmission leads to a noticeable flattening of the epidemic curve and appears to effectively limit the possible resurgence of a second epidemic wave. It is also worth noticing that we assume the absence of other additional and minimally disruptive social-distancing policies such as crowd control, smart working, wearing of masks, and others that could lead to a further reduction of the transmissibility of the virus with respect to our estimates. In the Supplementary Information, we investigate the positive synergistic effect of the combination of the LET strategy in the presence of different levels of transmissibility reduction due to additional mitigation policies. It is important to stress that the contact tracing proposed here works at the level of household unit, simplifying the monitoring and follow up process, by contacting only one member of the household to monitor the onset of symptoms among all members (we further explore other isolation and quarantine strategies in the Supplementary Information). Figure 3c and Table 1 show the burden on hospitalization and ICU demand in the unmitigated situation and the two mitigation scenarios. The LET scenario allows relaxation of the social-distancing interventions while maintaining the hospital and ICU demand at levels close to the healthcare availability and surge capacity. For the sake of completeness, the SM file includes analysis of a LIFT scenario including schools and universities reopening in the autumn. The results show that in the absence of additional containment policies, the tracing effort would need to be raised to about 50% to cope with the increased number of infections.

Discussion

The efforts in the suppression and mitigation of COVID-19 are pursuing the objectives of protecting the healthcare system from disruptive failures due to overwhelming stress imposed by the large number of severe cases and of minimizing the morbidity and mortality related to the epidemic. The aggressive social-distancing interventions implemented by many countries in response to the COVID-19 pandemic appear to have achieved the interruption of transmission and the abatement of the epidemic, but this has been at the price of huge societal disruption and economic costs. In such a context, the identification of ‘exit strategies’ that allow restarting economic and social activities while still protecting healthcare systems and minimizing the burden of the epidemic is of primary importance. Several modelling studies have already pointed out that resuming economic activities and social life is likely to lead to a resurgence of the COVID-19 epidemic, and combined social-distancing interventions of different degrees and intensity have been proposed to substantially delay and mitigate the epidemic16,20. These interventions still generate economic loss and widespread disruption to social life. Here we show how testing, contact-tracing strategies at scale, based on home isolation of symptomatic individuals with COVID-19, and the quarantine of a fraction of their households’ contacts, have the potential to provide a viable course of action to manage and mitigate the epidemic when social-distancing interventions are progressively lifted22,23. These strategies present us with logistic challenges that include large-scale and rapid diagnostic capacity, and a large surge in the number of contact tracers. We have investigated what fraction of the population would be isolated or quarantined under the proposed contact-tracing and isolation strategy. Figure 4a shows the fraction of households that needs to be quarantined. Assuming the identification of 50% of the symptomatic infections, and tracing of 40% of their contacts and households, only about 9% of the population would be quarantined at any time. While this is certainly a relevant fraction of the population, it is a much better option compared with massive social-distancing policies affecting the entire population that last for many months.

In Table 1, we report the number of symptomatic infections for which the contact-tracing investigation should be performed in the basic scenarios. This number provides an estimate of the number of contact tracers per 1,000 people. It is important to note that the more effective the contact tracing starting from each individual is, the smaller the number of generally traced households will be, because the epidemic has lower incidence rates. In addition, as illustrated in Fig. 4b, the health status of the vast majority of quarantined individuals is unknown, as contact tracing does not imply testing. The curves in Fig. 4a constitute the upper bounds for each simulated case. If we assume that the capacity to do massive testing will probably ramp up in the near future, then it is expected that the actual number of people in quarantine could be significantly lowered by testing the quarantined household. This would also alleviate the burden on household members that were unable to go to work and increase compliance of quarantine for the positive cases. It is also worth remarking that many of the logistic challenges faced with massive contact tracing might be eased by digital technologies that are currently being investigated across the world, following the examples of COVID-19 response in Asian countries23. Also, it may be difficult to quarantine the entire household of individuals who are potentially exposed, since this is a hardship suffered with high uncertainty about their risk of infection. Offering other logistic quarantine solutions (quarantine centres or hotel rooms) might substantially increase the rate of compliance.

These results were obtained under several assumptions. There are very large uncertainties around the transmission of SARS-CoV-2; in particular, about the fraction of subclinical and asymptomatic cases and their transmission. Estimates of age-specific severity are informed by the analysis on individual-level data from China and other countries, and subject to change as more US data become available. We also do not include specific co-morbidities or pre-existing conditions of the specific BMA population. For this reason, we perform an extensive sensitivity analysis showing that the modelling results discussed here are robust to the plausible range of parameter values for the key time-to-event intervals of COVID-19 (for example, incubation period, serial interval, time from symptom onset to hospital admission, among other) as well as the fraction of pre-symptomatic and asymptomatic transmission (Supplementary Information). We are also not considering here potential changes to the virus transmissibility due to environmental factors; in particular, seasonal drivers such as temperature and humidity. The modelling does not consider possible reintroduction of SARS-CoV-2 in the population from infected travellers. Strategies based on testing, isolation and contact tracing might be hampered by the importation through travel of a large number of infections, thus travel restrictions and screening may need to be introduced to and from places that show sustained local transmission. Finally, we report in the Supplementary Information the effect of the widespread use in the population of masks or other personal protective equipment that lead to a reduction of the transmissibility of SARS-Cov-2. These active protection measures improve the effectiveness of the exit strategies modelled here.

The modelling of the impact of testing, contact tracing and isolation on second-wave scenarios of the COVID-19 epidemic could be instrumental for national and international agencies for public health response planning. While we show that contact tracing and household quarantine at scale may be effective, even assuming a complete lifting of the social-distancing measures, future decisions on when and for how long to relax policies will need to be informed by ongoing surveillance. For instance, smart working from home for people who can adhere to it without serious disruptions should be encouraged. This, as well as other minimally disturbing policies, together with efficient and large-scale testing, contact tracing and monitoring of the epidemic, should be considered in the definition of exit strategies from large-scale social-distancing policies.

Methods

Weighted synthetic population

Our synthetic population constitutes around 85,000 nodes, of which 64,000 are adults and 21,000 are children (defined as individuals aged 17 or less), as shown in Fig. 1b. The total number of interactions among these individuals before social distancing is given by more than five million edges, see Supplementary Information for a more detailed description.

Community-weighted contact network

The community network is approximated using six months of data observation in the Boston area from anonymized users who have opted-in to provide access to their location data anonymously through a GDPR-compliant framework provided by Cuebiq. Individuals performing the analysis were legally required to never single out identifiable individuals and not make attempts to link these data to third-party data about an individual. In this layer, each agent in our synthetic population represents an anonymous individual of the real population. The data allow us to understand how infection can propagate in each layer by estimating colocation of two individuals in the same setting. We use a large database of 83,000 places from Foursquare application programming interface in the BMA. Specifically, the weight, $${\omega }_{{C}_{ij}}$$, of a link between individuals i and j within the workplace plus community layer is computed according to the expression:

$${\omega }_{{C}_{ij}}=\mathop{\sum }\limits_{p}^{n}\frac{{T}_{ip}}{{T}_{i}}\frac{{T}_{jp}}{{T}_{j}},\qquad \forall i,j$$

where Tip is the total time that individual i was observed at place p and Ti is the total time that individual i has been observed at any place set within the workplace plus community layer. Note that while the mobility dataset we use is large, colocation events between individuals are still quite sparse. Because of this sparsity, and to protect individual privacy in our analysis, we have adopted this probabilistic approach to measure co-presence in all locations mapped in the dataset. Since agents are representative of the different census areas and groups of the Boston area, our probabilistic approach is a good proxy for the real probability of co-presence between those groups and areas when networks are scaled up to the total population of the Boston area, which is approximately 4,628,910 inhabitants. Finally, for robustness and computational reasons, we have included only links for which $${\omega }_{{C}_{ij}}>0.01$$.

Household-weighted contact network

We first localize individuals’ approximate home place according to the US census block group. Then we assign a type of household based on Table B11016: Household Type by Household Size from US Census 2018 (ref. 9), and mix individuals that live in the same block according to statistics of household type and size. Finally, children are assigned to households (see the Supplementary Information for a more detailed description). We also assign individuals to an age group, on the basis of Table B01001: Sex and age from the US Census 2018. To assign weights, we assume that the probability of interaction within a household is proportional to the number of people living in the same household (well-mixing). Therefore, the weight, $${\omega }_{{H}_{ij}}$$, of a link between individuals i and j within the same household is given by:

$${\omega }_{{H}_{ij}}=\frac{1}{({n}_{h}-1)}$$

where nh is the number of household members. This fraction is assumed to be the same for all individuals in the population.

School-weighted contact network

To calculate the weights of the links at the school layer, we mix together all children that live in the same school catchment area. Interactions are considered well-mixed; hence, the probability of interaction at a school is proportional to the number of children at the same school. Therefore, the weight, $${\omega }_{{S}_{ij}}$$, of a link between children i and j within the same school is given by:

$${\omega }_{{S}_{ij}}=\frac{1}{({n}_{s}-1)}$$

where ns is the number of school members.

Within each connected component of the network in each layer (for example, a household or a school), the links between nodes are weighted to account for the effective daily number of contacts. For example, if we consider a school, while a student can potentially contact all her or his schoolmates, she or he only meet a relatively small fraction of them on a daily basis, as estimated in empirical studies on mixing patterns24,25. To account for this, we calibrate the weight of the links in each layer of the synthetic network26 so that the mean number of daily contacts matches the estimation provided in Mistry et al.27 (details in Supplementary Information). On the basis of the analysis of contact survey data from 9 countries24,25,28,29, this study estimated the mean number of daily contacts at 10.86, 4.11 and 11.41 in the community and workplace, household and school layers, respectively.

Stochastic simulations of COVID-19 dynamics

We describe the SARS-CoV-2 transmission process using a discrete-time susceptible–latent–infected–removed (SLIR) stochastic model, with some additional compartments to incorporate the special characteristics of SARS-CoV-2 infection (Fig. 1c). In particular, at each time-step t (1 d), the infectious asymptomatic (IA), infectious symptomatic (IS) and pre-symptomatic (PS) individuals can transmit the disease to susceptible (S) individuals with probability rβ, β and βS, respectively. If the transmission is successful, the susceptible node will move to the latent asymptomatic state (LA) with probability p or to the latent symptomatic state (LS) with probability (1 − p). A latent asymptomatic individual becomes infectious asymptomatic after a period (ϵʹ)−1, whereas latent symptomatic individuals transition, after a period ϵ−1, to the pre-symptomatic (PS) compartment. The average period to develop the disease and move to the infectious symptomatic state is γ−1. Infectious asymptomatic nodes are removed (R) after an average of μ steps. Conversely, infectious symptomatic nodes can either recover after that period with probability (1 − α) or, with probability α, these nodes will need hospitalization. It is considered that due to their symptoms they will self-isolate at home after an average period of μ−1. Then, depending on the severity of the symptoms, after a period δ−1 the individual will end in hospitalization with probability (1 − χ) or require hospitalization and ICU care with probability χ. Finally, individuals that are either hospitalized or in ICU become removed with probability μH or μICU, respectively. We initialize the model in the city of Boston by selecting an attack rate on 17 March 2020 of 1.5% (a sensitivity analysis of this quantity is provided in the Supplementary Information).

Social-distancing strategies

To simulate social-distancing measures, we modify the synthetic population such that:

• School closures are simulated by removing all the schools from the system simultaneously.

• Partial ‘stay-at-home’ assumes that all places are open except restaurants, nightlife and cultural places. Closures of these places are simulated by removing the interactions that occur in any place that falls into that category according to Foursquare’s taxonomy of places. This is the situation after the first reopening.

• Full lockdown and confinement assume that schools and all non-essential workplaces are closed. Here we close all workplaces except essential ones and remove interactions that occur there. Essential workplaces are: hospitals, salons, barbershops, grocery stores, dispensaries, supermarkets, pet stores, pharmacies, urgent care centres, dry cleaners, drugstores, maternity clinics, medical supplies and petrol stations.

The connectivity distributions for each of the scenarios simulated as well as other statistics related to the effects of the lockdown are shown in the Supplementary Information.

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.