Since COVID-19 was first identified in December 2019 in Wuhan, China1, the entire world has been adversely affected by the ensuing pandemic2,3,4. As infections decreased during the summer months of 2020, many countries relaxed their lockdown and physical distancing measures in the course of reopening their economies and societies. Owing to increasing mobility and social contact rates, together with the large numbers of susceptible people in the population, countries worldwide have been experiencing COVID-19 resurgences5,6.

In the absence of an effective vaccine, physical distancing interventions (for example, closure of schools and workplaces) are critical to contain the resurgences, even though such interventions have caused substantial disruptions to societies and economies7,8. As of January 2021, ten vaccines for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of COVID-19, have received either full approval or limited approval for emergency use9. Policymakers are also looking forward to the arrival of other vaccines in the coming months so that physical distancing restrictions can be alleviated, as vaccination can reduce the portion of susceptible contacts that may result in transmission10,11. However, the extent of potential alleviation remains unclear, especially given that the supply of vaccines will probably not be sufficient to achieve herd immunity in the immediate future. Therefore, more comprehensive interventions including both physical distancing and vaccine-implementation strategies should be developed to avoid COVID-19 resurgences. To this end, there is an urgent need to understand the interrelationships among mobility, social contacts, physical distancing, vaccination and virus transmission for tailoring and adjusting preventive interventions.

Human movement and contact rates have fundamental roles in shaping the transmission patterns of infectious diseases12,13. Their impact on COVID-19 inter-city spread has been extensively investigated using anonymized mobile phone data14,15,16, and previous studies have attempted to assess the effects of travel and physical distancing measures on the first wave of the COVID-19 pandemic17,18,19,20,21,22. In addition, mobility data have been recognized to effectively reflect the resumed human activities after lifting lockdown measures23,24, and have also been used as a proxy for measuring the effectiveness of interventions, such as stay-at-home orders, to mitigate or contain the transmission of COVID-1925,26. However, most studies have primarily used existing publicly available datasets27,28,29,30,31 to derive and provide coarse information on population mobility to measure changes in inter-city travel flow, check-in intensity or trip length under interventions. More refined data with anonymized geolocation information have been under-utilized for directly informing social contact rates32,33. The use of only mobility data is also unable to help assess the impact of vaccination measures, which requires deriving information on safe social contacts in modelling transmission dynamics.

Because mobility data cannot directly inform physical distancing and reductions in contact rates, studies examining the effect of lockdown policies on transmission have generally assumed that when people reduce their mobility, they proportionally reduce their contacts23,24. However, in reality, similar levels of social contact have been observed at both high and low levels of mobility in various studies32,33, and using mobility or social contact data alone may not be sufficient for precisely measuring physical distancing. The paired relationship between reductions or restorations in mobility and social contacts is more suited to serve this purpose. Thus, it is crucial to understand the interaction between mobility and social contact rates over time under COVID-19 interventions.

Future studies on epidemic spread should be able to consider the strong interaction between physical distancing and mobility to reduce contact rates with or without vaccination in case of resurgences. Ideally, investigation of the reduction in contact rates would involve creating a singular index that encompasses the combined impact of reduced mobility and physical distancing on contact rates so as to directly inform the estimates of contact levels among populations over time. This index should also be able to account for the impact of vaccination on reducing susceptible contact rates with minimal adaptations. In this Article, we propose a social contact index (SCI) to represent the daily average potential social contact (or spatiotemporal co-presence) per person, which was derived from an anonymized mobile geolocation dataset. The index associates a series of mobility levels (for example, 10% to 100%, with increments of 10%) with their corresponding social contact rates. On the basis of this association, an empirical relationship was established using allometric equations34,35 to compute the contact rates under a given level of physical distancing and population density for future scenario simulations. As a proxy for daily social contact rates per person, this index was incorporated into a modified susceptible–exposed–infectious–recovered (SEIR) model to quantify the impact of physical distancing interventions in light of the COVID-19 outbreak across China. The risks of COVID-19 resurgence without vaccination were then assessed under varying mobility, physical distancing and population density scenarios. Subsequently, the effect of administering vaccines to relax physical distancing interventions and reduce unsafe social contacts in preventing a resurgence of infections and curtailing the pandemic was assessed.


Deriving a social contact metric in populations

The proposed SCI used to measure the level of contacts per person in a city was created using the total number of potential contact events (or the total social contacts index (TSCI)), which was directly determined using a large, near-real-time anonymized mobile device positioning dataset from Tencent30 that covers more than 70% of the population in mainland China31. In this dataset, the potential contact events are detected in the context of the spatiotemporal co-presence of people36,37,38 or when their mobile devices request positioning services within a specified space–time bin (that is, 250 m and 10 min, in our case) (Extended Data Fig. 1a).

The SCI is an average per person TSCI. It is paired with mobility in a non-linear form (Methods), but is also influenced by physical distancing (Extended Data Fig. 1b) and population density. Here, mobility refers to the number of trips people make outside their homes. During the COVID-19 outbreak, people might have been ordered to, encouraged to, or volunteered to quarantine and work from home to reduce social contacts, thereby leading to a decrease in the SCI. However, after the lockdown measures were lifted, mobility was gradually restored. Physical distancing measures are applied to keep people who are outside their homes away from crowded places by closing parts of facilities or limiting the maximum number of people gathering in facilities through, for example, staggered entry and exit. Generally, high mobility leads to a high SCI, as does a high population density. However, strong physical distancing measures lead to a low SCI. Vaccination can protect people by greatly enhancing immunity and thus reduce the SCI of COVID-19 transmission between susceptible populations and infectors. The population density of a city may change owing to its population inflow and outflow, which in turn affects the SCI and disease transmission (Methods).
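The co-presence idea described above can be illustrated with a minimal sketch. It assumes that each unordered pair of distinct devices seen in the same 250 m, 10 min bin counts as one potential contact event and that the SCI is the total divided by the number of distinct devices; neither counting rule is specified in the text, and the record layout `(device_id, x_m, y_m, minute)` and all function names are hypothetical.

```python
from collections import defaultdict
from math import floor

BIN_METRES = 250   # spatial bin size given in the text
BIN_MINUTES = 10   # temporal bin size given in the text


def total_contacts(pings):
    """Count potential contact events (a TSCI-like total).

    pings: list of (device_id, x_m, y_m, minute) positioning requests,
    with planar coordinates in metres (hypothetical record layout).
    """
    devices_in_bin = defaultdict(set)
    for device, x, y, minute in pings:
        key = (floor(x / BIN_METRES), floor(y / BIN_METRES),
               floor(minute / BIN_MINUTES))
        devices_in_bin[key].add(device)  # repeated pings are deduplicated
    # Assumption: every unordered pair of co-present devices is one event.
    return sum(len(d) * (len(d) - 1) // 2 for d in devices_in_bin.values())


def social_contact_index(pings):
    """Average potential contacts per device (an SCI-like per-person value)."""
    devices = {p[0] for p in pings}
    return total_contacts(pings) / len(devices) if devices else 0.0
```

For example, three devices that share one bin produce three potential contact events, while a fourth device in a neighbouring bin adds none.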

Taking Wuhan as an example, we provide an overview of the overall social contact rates, as reflected by the TSCI of the entire city, together with the changes in the TSCI before, during and after the lockdown (23 January to 7 April 2020) due to the implementation of physical distancing measures of varying intensities (Fig. 1a–c). The TSCI was categorized into five types on the basis of the areas of interest30 where contact occurred, namely residential communities, workplaces, schools, shopping or recreation facilities, and other facilities. Before the lockdown (normal period) in December 2019, more than 80% of the contact occurred in the first four types of places. Figure 1d shows the changes in the TSCI of Wuhan and those of four other major cities in China (Beijing, Shanghai, Guangzhou and Shenzhen) against their normal patterns (the average levels in December 2019). After nationwide interventions were implemented from 23 January 2020, 2 days before the start of Chinese New Year (25 January 2020), the TSCI of the five cities markedly decreased; the TSCI of Wuhan dropped to only 2% of its normal level (that is, 1.016) on 27 January, 4 days after the lockdown measures were imposed. On 8 April, the lockdown was lifted in Wuhan and the TSCI slowly recovered, reaching 50% on 31 May, while the TSCI values of other Chinese cities returned close to their respective normal levels (Fig. 1d).

Fig. 1: Change in TSCI in Wuhan and four other major Chinese cities.

a, Change of TSCI in Wuhan from December 2019 to May 2020 as a percentage of the average pre-lockdown level in December 2019. Categories are indicated in c. b, Enlarged view of a at TSCI values between 0% and 2%. c, The proportions of TSCI values that occurred in the indicated categories of locations in December 2019. d, Change of TSCI in Wuhan, Beijing, Shanghai, Guangzhou and Shenzhen (r denotes the pre-lockdown TSCI that the city returned to post-lockdown and l denotes the pre-lockdown TSCI that the city decreased to during the lockdown). Lines in a denoting the lockdown and lockdown-lifting dates are applicable to Wuhan only. The pink band indicates the Chinese New Year holiday period.

As revealed by the variations in the SCI values shown in Fig. 1, physical distancing restrictions of varying intensities were imposed over different periods in Wuhan, namely pre-lockdown (for example, December 2019)39,40, during lockdown (from 23 January to 7 April 2020)39,40, shortly after lockdown lifting (SALDL) (from 8 April to 20 May 2020), and longer after lockdown lifting (LALDL), after nucleic acid testing of all citizens (from 20 May to 30 May 2020)41. Various interventions were implemented together, which made it difficult to quantify their effects on the contact rate. To solve this problem, we derived four explicit sets of physical distancing interventions using the mobility dataset, namely ‘no’, ‘mild’, ‘moderate’ and ‘strong’ (Methods and Supplementary Table 1). Each set of these interventions at a certain intensity was found to cause similar impacts on the SCI values as those imposed together during the same period in Wuhan. This outcome enabled us to inform transmission dynamics under explicit sets of physical distancing measures.

The relationship between mobility and the SCI was modelled as an allometric growth curve34,35, but it varied over the three levels of population density and the four intensities of physical distancing (or the four explicit sets of physical distancing measures) (Fig. 2). The SCI values for fitting each curve under a series of randomly sampled mobility levels were determined using the mobility dataset during the normal period (December 2019) because there was no significant population migration or physical distancing intervention during that period. The modelled SCI values were validated against the actual SCI values extracted from the original dataset under each mobility level (Methods). Thus, a mobility–SCI-coupled metric was formulated in the form of these equations to determine the SCI. This metric can also be adapted to evaluate the effect of vaccination on the SCI. Because people with immunity to SARS-CoV-2 cannot infect or be infected by others, their contacts would not affect the transmission process. The presence of this group can therefore be treated in the same way as a reduction in mobility. Consequently, the empirical relationships were used in scenario-based simulations to derive the resulting contact rates in accordance with mobility restoration (representing the levels of reopening economies) and mobility reduction, physical distancing and/or vaccination interventions under a certain level of population density.
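To make the curve-fitting step concrete, the following sketch fits a power law of the form SCI = a × mobility^b, which is the usual form of an allometric equation. The exact functional form and fitting procedure are given in Methods and may differ; the log-log least-squares approach and the function names here are assumptions for illustration.

```python
import math


def fit_allometric(mobility, sci):
    """Fit sci = a * mobility**b by least squares on log-transformed values.

    Both inputs must be strictly positive (a mobility level of 0 cannot be
    log-transformed and should be excluded before fitting).
    """
    xs = [math.log(m) for m in mobility]
    ys = [math.log(s) for s in sci]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # scaling exponent
    a = math.exp(mean_y - b * mean_x)    # scaling coefficient
    return a, b


def predict_sci(a, b, mobility):
    """Evaluate the fitted curve at a given mobility level."""
    return a * mobility ** b
```

One curve of this kind would be fitted for each combination of population density and physical distancing intensity, and then evaluated at the mobility level of a given scenario.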

Fig. 2: Change in the SCI in Wuhan with different levels of population density and physical distancing.

a, Each chart shows changes in SCI (as a percentage of the average pre-lockdown SCI) over mobility levels from 0% (no population movement) to 25%, 50% and 100% (no stay-at-home order) under different population densities (low, medium or high) and no, mild, moderate or strong physical distancing measures. The charts correspond to varying intensities of physical distancing imposed during pre-lockdown (no), LALDL (mild), SALDL (moderate) and lockdown (strong) in Wuhan. b, The daily change of SCI (as a percentage of the average pre-lockdown SCI) in Wuhan between December 2019 and May 2020.

Reconstruction of COVID-19 transmissions using a mobility and contact-based SEIR model

A classical SEIR model was modified to accommodate both intra-city and inter-city mobility and social contacts—forming a mobility and contact-based SEIR (MC-SEIR)—to recapitulate the transmission process of COVID-19 in Wuhan from 2 December 2019 to 31 March 2020. Because the social contact rate directly affects the transmission process, to more precisely assess the effect of physical distancing measures on transmission, a dynamic daily transmission rate derived from SCI was used to replace the fixed transmission rate in the conventional SEIR model.
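The replacement of a fixed transmission rate with a daily, SCI-driven one can be sketched as follows. This is a simplified single-city, discrete-time SEIR under stated assumptions: the transmission rate is taken to scale linearly with SCI relative to its baseline, inter-city flows are omitted, and the incubation and infectious periods are placeholder values rather than the calibrated parameters of the MC-SEIR model.

```python
def beta_from_sci(sci_series, baseline_sci, baseline_beta):
    """Assumed linear scaling of the transmission rate with relative SCI."""
    return [baseline_beta * sci / baseline_sci for sci in sci_series]


def simulate_seir(beta_by_day, population, initial_infectious=1.0,
                  incubation_days=5.0, infectious_days=5.0):
    """Discrete-time SEIR with a different transmission rate on each day.

    incubation_days and infectious_days are illustrative placeholders.
    Returns the daily number of newly infectious people and the final
    (S, E, I, R) state.
    """
    sigma = 1.0 / incubation_days
    gamma = 1.0 / infectious_days
    s = population - initial_infectious
    e, i, r = 0.0, initial_infectious, 0.0
    daily_new_infectious = []
    for beta in beta_by_day:
        newly_exposed = min(s, beta * s * i / population)
        newly_infectious = sigma * e
        newly_removed = gamma * i
        s -= newly_exposed
        e += newly_exposed - newly_infectious
        i += newly_infectious - newly_removed
        r += newly_removed
        daily_new_infectious.append(newly_infectious)
    return daily_new_infectious, (s, e, i, r)
```

Feeding in a lower SCI series (for example, during lockdown) lowers the daily transmission rate and therefore the number of new infections, which is the mechanism the dynamic rate is meant to capture.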

Specifically, the mobility and SCI data were used to estimate the instantaneous effective reproduction number (Rt) via a generalized linear model. The transmission dynamic was calibrated using a Bayesian optimization method42 with the reported case data from Wuhan (Methods). The model predicted daily new cases over the period from December 2019 to March 2020 with relatively high accuracy (R² = 0.95; Fig. 3). Under a non-linearity assumption, the Kendall’s tau correlation between SCI and Rt was examined, with a coefficient of 0.59 (95% confidence interval (CI): 0.47 to 0.72; P < 0.001; n = 61). When the mobility data were used alone, the correlation coefficient decreased substantially to 0.41 (95% CI: 0.30 to 0.54; P < 0.001; n = 61). This result suggested that SCI was more strongly rank-correlated with Rt than mobility was.
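For readers unfamiliar with the rank correlation used above, a small self-contained sketch of Kendall's tau is given below. It implements the tau-b variant, which corrects for ties; the text does not state which variant or software was used, so this choice is an assumption.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation between two equal-length sequences."""
    n = len(x)
    concordant = discordant = ties_x = ties_y = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0:
                ties_x += 1
            if dy == 0:
                ties_y += 1
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    n_pairs = n * (n - 1) // 2
    denom = math.sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    # Undefined when one sequence is constant.
    return (concordant - discordant) / denom if denom else float("nan")
```

Applied to paired daily series of SCI and Rt, and again to mobility and Rt, a function of this kind yields coefficients that can be compared in the way described above.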

Fig. 3: Fitted curves and Rt as predicted by the mobility and contact-based SEIR model.

Top: estimates of the daily exposed and infectious individuals and numbers of new infections from December 2019 to March 2020. Comparison of the number of new infections with the daily reported cases yields an R² of 0.95 (n = 92) at a statistical significance level of 0.001. Bottom: the corresponding daily Rt over the same period. The pink band indicates the Chinese New Year holiday period.

Effect of physical distancing interventions on future resurgences without vaccination

The effectiveness of interventions in preventing a COVID-19 resurgence was assessed by factoring in mobility, physical distancing and population density under a no-vaccination scenario. The median duration required to contain a resurgence was estimated for each scenario, as shown in Fig. 4. Specifically, physical distancing measures were applied under a certain level of mobility and a selected population density scenario when the new cases exceeded ten per day. The measures were lifted after no new cases were registered for 14 d. The scenarios and corresponding simulation results are expected to be useful in designing preventive interventions against COVID-19 for other cities worldwide with similar variations in population density (Methods).
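The trigger and lifting rule described above can be written as a short sketch. It operates on a fixed series of daily new cases for clarity, whereas in the simulations the case counts themselves respond to the measures; the function name and the choice to lift measures on the day after the fourteenth consecutive case-free day are assumptions.

```python
def intervention_schedule(daily_new_cases, trigger=10, clear_days=14):
    """Return one boolean per day: True while distancing measures are active.

    Measures start on a day when new cases exceed `trigger` and are lifted
    once `clear_days` consecutive days with zero new cases have passed.
    """
    active = False
    zero_streak = 0
    schedule = []
    for cases in daily_new_cases:
        if active and zero_streak >= clear_days:
            active = False            # lift after the case-free period
        if not active and cases > trigger:
            active = True             # resurgence detected
            zero_streak = 0
        schedule.append(active)
        if active:
            zero_streak = zero_streak + 1 if cases == 0 else 0
    return schedule
```

Summing the returned schedule gives the number of days on which measures were in place, which corresponds to the duration reported for each scenario.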

Fig. 4: Estimated effects of control measures on containing a resurgence of infections under different population densities.

a–d, The median duration required to contain a resurgence (DCR, in days) for four intensities of physical distancing measures (no, a; mild, b; moderate, c; and strong, d) in Wuhan with four levels of pre-lockdown mobility (25%, 50%, 75% and 100%) under different population densities. The value in each cell denotes the DCR with respect to its associated level of mobility, intensity of physical distancing and population density. Population densities are expressed as percentages of the pre-lockdown population density in Wuhan.

Physical distancing is deemed necessary for cities with a low (30% of the pre-lockdown population density in Wuhan), medium (50% of the pre-lockdown population density in Wuhan) or high (100% of the pre-lockdown population density in Wuhan; 1,282 people per km²) population density to curb resurgences of coronavirus infections (Fig. 4). Moderate and strong physical distancing measures could help a city with a low population density to relax interventions within three months if mobility reduction (for example, to 50% of the pre-lockdown level) was simultaneously applied. However, for a city with a high population density, a combination of mobility reduction and physical distancing measures would be needed to contain the resurgence within nine months. We found that the duration of intervention implementation (259 d, interquartile range (IQR): 162–345 d) for a city with high population density would be almost triple that for a city with a population density of 50% or lower (86 d, IQR: 71–104 d) if both moderate physical distancing interventions and mobility reduction to 50% of the pre-lockdown level were adopted. Thus, to shorten the duration of intervention implementation for a city with a high population density, strong physical distancing measures are suggested.

Joint effects of vaccination and physical distancing in avoiding resurgences

Vaccines against SARS-CoV-2 have become available since late 2020, and it is imperative to assess the potential combined effects of vaccination and physical distancing, especially as there will be a period when only limited supplies of vaccines are available or only a limited proportion of people (lower than the theoretical herd immunity threshold) are vaccinated. Thus, we designed a set of scenarios using physical distancing measures to end the epidemic for a city with an increasing proportion of vaccinated population. In total, 64.2% of the population (that is, the herd immunity threshold derived from the median of a set of reported R0 values43) would be vaccinated within one year. The effectiveness of vaccines (the seroprotection rate) was set at 75% (neutral scenario) (Methods). According to the simulation results (Table 1), the combination of physical distancing and vaccination was predicted to further reduce the number of infected cases compared with vaccination alone. The reduction effects were predicted to be more significant for cities with a high population density, such as Wuhan. Specifically, 97.72%, 99.99% and 99.99% of the infections were predicted to be avoided under mild, moderate and strong physical distancing intensities, respectively. Strong and moderate physical distancing together with vaccination were predicted to suppress the infections to low levels, that is, 213 (95% CI: 122–347) and 1,800 (95% CI: 991–2,821) cases in one year, respectively, and thus were recommended for adoption. Meanwhile, the total duration of physical distancing to end the resurgences decreased gradually from 350 d (95% CI: 338–354 d; mild) to 234 d (95% CI: 166–395 d; moderate) and then 43 d (95% CI: 33–64 d; strong). The results implied that physical distancing with strong intensity and short duration would be a better solution for curtailing resurgences in terms of both the number of cases and the duration of intervention.

Table 1 Simulated joint effects of vaccination and physical distancing measures

Compared with the no-vaccination scenario, vaccination combined with physical distancing was predicted to contain the resurgence without relying on mobility reduction, whereas a gradual vaccination process alone could not achieve this. Specifically, for cities with low population density, physical distancing would not be required. For cities with medium population density, vaccination could shorten the duration of physical distancing measures required to end the resurgence by 36%–78% and limit the number of infected cases to 298 (95% CI: 130–438) and 776 (95% CI: 368–1,064) under moderate and mild physical distancing, respectively. For cities with high population density, vaccination enabled strong physical distancing measures to be replaced by moderate physical distancing. Consequently, it is of value to apply joint physical distancing and vaccination interventions while approaching herd immunity, especially in large cities with high population densities, such as Wuhan.

To evaluate uncertainties, the above simulations were replicated under pessimistic and optimistic scenarios with the effectiveness set at 50% and 100%, respectively. Without loss of generality, the population density was set as 100% of the pre-lockdown population density of Wuhan for the optimistic, neutral and pessimistic scenarios. The results showed that the uncertainty (IQR of daily new cases) would be high when applying limited physical distancing measures. In the pessimistic scenario, the peak of daily new infections was 1.25 to 8.08 times that in the neutral scenario when only no or mild physical distancing measures were applied, whereas the peak of daily new cases was similar (1.08 and 0.97 times) when moderate or strong physical distancing measures were imposed (Fig. 5). This further confirmed that the joint implementation of physical distancing and vaccination could reduce uncertainty in ending the epidemic.

Fig. 5: Combined effects of vaccination and physical distancing.

The numbers of daily new cases were simulated under scenarios with varying vaccine effectiveness (optimistic, pessimistic and neutral) and physical distancing measures (no, mild, moderate and strong). The y-axis represents the number of daily new cases and the x-axis shows the number of days since the start of the simulation. The shaded area indicates the IQR for the neutral scenario.

To test the generalizability of our proposed SCI models under varying population densities, we examined six cities in China (low density: Zhuzhou and Qiqihar; medium density: Hefei and Hangzhou; and high density: Beijing and Chengdu), with each density set forming a group under one population-density scenario. The four sets of physical distancing measures (for example, ‘the closure of schools and 20% reduction in contact in all other categories’ for mild intensity) were directly applied to these cities, but an SCI–mobility curve was derived for each city (Methods). According to the results (Supplementary Table 4), all of the low-density scenarios required no physical distancing measures when vaccination was applied. The medium-density cities (Hefei and Hangzhou) had fewer than 1,500 cases in one year, even with mild physical distancing measures. Thus, we could further infer that the mild, moderate and strong physical distancing measures would all be acceptable for medium-density cities according to the number of potential cases in one year. This result is similar to the result for the 50% population density scenario in Wuhan (Supplementary Table 4). In all three high-density scenarios, the no and mild physical distancing measures did not stop the resurgence or reduce the high number of infections. Thus, moderate and strong physical distancing measures would be needed for high-density cities when vaccines became available. Strong physical distancing measures should be applied first, because this intensity would probably end the need for interventions within two months. The results in all six cities were similar to those in the corresponding population density scenarios in Wuhan. This evidence suggests that our proposed mobility–SCI model and the explicit sets of physical distancing measures could be used to inform the combined effects of interventions in other cities with similar population densities.


Our study assessed the effects of physical distancing interventions on the resurgence of COVID-19 with increased mobility and social contacts following the lifting of lockdown measures under both vaccination and no-vaccination scenarios. The effectiveness and duration of physical distancing interventions in containing future resurgences are highly dependent on the intensity of the measures, the population density and the availability of vaccines across space and time. Large cities with a high population density, such as Wuhan (1,282 people per km²), are more vulnerable to resurgence when reopening the economy and society. On the one hand, the naturally high social contact rates in these cities would require more effort (longer and stronger physical distancing measures and stronger mobility reduction measures) than would be necessary for less dense cities. On the other hand, containing a resurgence in the absence of an effective vaccine is a costly process, especially for cities with high population density; the containment would require two months even with strong physical distancing measures. Thus, it is important to take precautionary measures against future resurgences in the upcoming few months before effective vaccines become widely available. Socioeconomic activities can be safely restored to normal levels if the vaccinated population reaches the herd immunity threshold (64.2%) derived from the R0 value (2.79). Before achieving herd immunity, however, if cases are imported into regions where outbreaks have been fully contained, the cities with high population density would still require moderate or strong physical distancing measures, whereas the cities with low population density could curtail the transmission solely by vaccination.
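As a quick check on the threshold quoted above, the standard relation between the herd immunity threshold and the basic reproduction number reproduces the reported figure. This worked calculation assumes the usual homogeneous-mixing formula; the symbol $H$, denoting the fraction of the population that must be immune, is introduced here for illustration only.

$$H = 1 - \frac{1}{R_0} = 1 - \frac{1}{2.79} \approx 0.642$$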

Effective vaccination with a high coverage of the population can greatly ease the reliance on physical distancing interventions with respect to implementation intensity and duration. Specifically, stay-at-home orders (mobility and contact reductions used in our simulation) would no longer be needed, even with a gradual vaccination process (reaching 64.2% of the population in one year). However, only cities with low population density can fully interrupt transmission without implementing any physical distancing measures. For cities with medium or high population density, physical distancing would still be required to lower the infection number. Meanwhile, the joint implementation of vaccination and physical distancing can limit the uncertainty of transmissions caused by the ineffectiveness of vaccines or short-term immunity. This result suggests that strong physical distancing interventions with short durations might be more effective than mild interventions with long durations while the long-term effectiveness of vaccines is unconfirmed.

Although this study shows that non-pharmaceutical interventions are likely to substantially reduce COVID-19 transmission, it is difficult for the public to adhere to travel and physical distancing measures for a long period6,44,45, and subsequent waves of resurgence may emerge if interventions are relaxed and normal levels and patterns of travel are resumed6,23,24 before achieving herd immunity through vaccination. Our research provides a framework and set of outputs that can be used across a wide range of settings, and includes (1) more precise estimates of COVID-19 outbreaks and the efficacy of interventions under both vaccination and no-vaccination scenarios; (2) identification of the most effective combinations of physical distancing and vaccination interventions and their intensities for preventing or suppressing resurgences; and (3) supporting disease control strategy design through improved understanding of interventions and their effects across space and time in regions with different population densities.

Our findings should be considered in the context of several assumptions and data limitations. First, we did not derive individual-level mobility and social contacts to estimate COVID-19 transmission owing to data availability and privacy issues. If demographic attributes of individuals are available in the future32,33,46, then we could extend the methodology and analyses to detect potential social disparities in the vulnerability to COVID-19 and assess potential heterogeneities in the efficacy of intervention. Second, the mobile phone-derived data in China for parameterizing travel and physical distancing interventions in our SCI model might not be representative of populations in other regions or countries owing to variations in user coverage, population dynamics across regions and human behaviour patterns. However, the data used here cover over 70% of the population in China31, and thus represent the Chinese population reasonably well. Third, the accuracy of our model relies on the accuracy of epidemiological parameters derived from reported case data, the quality of which might be constrained by case definitions, the capacity for diagnosis and surveillance, and other factors that vary across countries, regions and time46,47,48. Fourth, other factors and interventions, such as hand washing and wearing of face masks, may also contribute to mitigating COVID-19 spread across space and time46,49,50, but our simulations did not specify their contributions to transmission. Fifth, the exemplified intervention measures and vaccination settings in our simulation scenarios might not be complete, especially given the ongoing and rapidly changing challenges of vaccine supply and logistics, as well as the potential impact of new virus variants on the vaccine effectiveness11,51. 
Although the heterogeneity in human behaviour and interventions across regions or countries may limit the generalizability of our model and findings, this study provides an evidence-based assessment of the joint effect of physical distancing and vaccination interventions on COVID-19 in other cities worldwide with similar levels of population density, settings and human behaviour patterns. Additionally, our methods integrate near-real-time mobility and social contact data as well as vaccination scenarios, thereby suggesting that our approaches can be adapted to address emergent needs given the rapid changes in the COVID-19 transmission dynamics post-lockdown.


No statistical methods were used to predetermine the sample size in fitting the relationship between contact rates and mobility. However, our sample was sufficiently large, as the original data cover more than 70% of the population in China (for example, 19,728 spatiotemporal bins in Wuhan were used as samples). The individual characteristics (for example, sex and age) and the exact number of smartphone users were not available as aggregated data were used. We were also aware that users opted in to provide their location information. The aggregated data included all available users. In other words, there was no randomization in collecting the data of social contact events.

Ethics declaration

The collection and analysis of COVID-19 case data were determined by the National Health Commission of China to investigate and control the outbreak. Ethical clearance for collecting and using secondary data in this study was granted by the institutional review board of the University of Southampton (no. 61865). All data included in models were supplied and analysed in an anonymous and aggregated format, without access to personal identifying information.

Case incidence data

The daily numbers of COVID-19 cases by date of illness onset in Wuhan as of 17 April 2020, as obtained from the national information reporting system for notifiable infectious diseases in China, were used to further evaluate the performance of the baseline model. There was an abnormal increase in the number of cases in Wuhan on 1 February based on the date of illness onset. We interpolated the number on 1 February as the mean number of cases reported on 31 January and 2 February in the epi curve.

Estimation of population migration and associated population density variations

Population migration had an important role in our assessment of the population in the study area (that is, Wuhan) and subsequently impacted the transmission dynamics. Population migration between cities was detected, and the corresponding data were acquired from the mobile device dataset from Tencent. For a given day (t), the relative values of the population inflow (Ir,t) and outflow (Or,t) were calculated by detecting the number of move-in and move-out mobile device users. These values were used to estimate the actual inflow (It) and outflow (Ot) together with the relative value of the population (Ut) (number of mobile device users after deduplication) and the ambient population (Pt) (the average total population in a particular location, for example, a grid cell or a city throughout a period, for example, 1 h or 1 d) on day t (that is, daily ambient population). Finally, the Pt was updated on the basis of the estimated inflow and outflow on that day, as follows:

$$I_t = I_{r,t} \times \frac{{P_t}}{{U_t}},$$
$$O_t = O_{r,t} \times \frac{{P_t}}{{U_t}}$$
$$P_t = P_{t - 1} + I_t - O_t$$

The population density of a city is influenced by population migration, and thus varies over time. Population density may directly affect the base contact rate between individuals. In our simulations, the Pt of Wuhan at the beginning of the study period was obtained from the government statistical data (that is, 11.21 million, Wuhan’s permanent population on 1 December 2019)52 and updated on a daily basis using the migration data. Consequently, the population density also changed daily during the study period, but it was assumed to be homogeneous within the city.
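The daily population update can be sketched in Python as follows. This is a minimal illustration rather than the code used in the study: the function name and the sample device counts are hypothetical, and because Pt appears on both sides of the equations, the sketch assumes that the previous day's ambient population is used to scale the relative flows.

```python
def update_population(p_prev, inflow_rel, outflow_rel, users):
    """Return (inflow, outflow, p_t) for one day.

    Assumption: relative flows are scaled by p_prev / users, that is, the
    previous day's ambient population stands in for P_t in the scaling.
    """
    scale = p_prev / users
    inflow = inflow_rel * scale
    outflow = outflow_rel * scale
    return inflow, outflow, p_prev + inflow - outflow


# Only the 11.21 million starting value comes from the text; the
# (move-in users, move-out users, deduplicated users) records are made up.
population = 11_210_000
daily_records = [(20_000, 25_000, 8_000_000), (18_000, 30_000, 7_990_000)]
for inflow_rel, outflow_rel, users in daily_records:
    _, _, population = update_population(population, inflow_rel, outflow_rel, users)
```

Dividing the updated population by the city area would then give the daily, spatially homogeneous density.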

In addition, the cases that were registered in the city, regardless of whether they were from the permanent population or the migrant population, would be included in the total number of cases in a simulation scenario. Thus, to better evaluate the risk of COVID-19 to the population under this statistical calibre30, we used an equivalent population base that we designated ‘potentially exposed population’, which was calculated by summing the city’s permanent population and its annualized inflow population53. The annualized inflow population refers to the weighted sum of daily inflow population over one year. For each daily inflow, its weight is the proportion of the number of days between the inflow date and the end of the year in a whole year. For example, the weight of the inflow population on the first day is 364/365.
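The weighting of the annualized inflow can be illustrated with a short Python sketch. The function names are hypothetical; the weights follow the rule stated above, under which the inflow on day d of an n-day year is weighted by (n − d)/n, so that the first day of a 365-day year receives 364/365.

```python
def annualized_inflow(daily_inflow):
    """Weighted sum of daily inflows over one year.

    The inflow on day d (1-based) is weighted by the share of the year
    remaining after that day, (n_days - d) / n_days.
    """
    n_days = len(daily_inflow)
    return sum(
        count * (n_days - day) / n_days
        for day, count in enumerate(daily_inflow, start=1)
    )


def potentially_exposed_population(permanent_population, daily_inflow):
    """Permanent population plus the annualized inflow population."""
    return permanent_population + annualized_inflow(daily_inflow)
```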

Social contact measurement

The numbers of total potential social contact events (that is, TSCI) and outside-home trips (mobility) were derived from Tencent’s mobile device geolocation dataset (Extended Data Fig. 2). By using the government statistical data54 and considering the population coverage of this dataset, the time-varying gridded population was then obtained and aggregated to estimate the population stock number in any specified space–time bin according to the user coverage rate. Each bin was indexed with a spatial coordinate l and a temporal coordinate j, and labelled with a land use type k (for example, residential community, work, school, shopping or recreation, or others). TSCI denotes the overall scale of potential social contact (or contact events) occurring in a city within a given time period (for example, 1 d). Each contact event refers to a one-time co-presence within a space–time bin (Extended Data Fig. 1). The number of such events can be calculated as follows:

$${\mathrm{TSCI}}_t = \sum_{l,j,k} {G_{l,j,k}\left( {G_{l,j,k} - 1} \right)}$$
$$G_{l,j,k} = \frac{{U_{l,j,k}}}{C}$$

where Gl,j,k denotes the ambient population in bin (l, j, k), Ul,j,k is the number of mobile device users in the same bin, and C is the conversion ratio, which is constant.

On a per-person level, the SCI on day t can subsequently be calculated (Extended Data Fig. 2) as follows:

$${{\mathrm{SCI}}_t} = \frac{{{\mathrm{TSCI}}_t}}{{P_t}},$$

where Pt is the total ambient population of the study area on day t (aggregated from all the bins over all data-collection time intervals).
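As a minimal sketch of the two definitions above, the following Python functions compute TSCI from per-bin user counts and then divide by the ambient population to obtain the SCI. The function names, the user counts and the conversion ratio of 0.5 are illustrative assumptions, not values from the dataset.

```python
def tsci(users_per_bin, conversion_ratio):
    """Total potential contact events: sum of G * (G - 1) over bins, G = U / C."""
    total = 0.0
    for users in users_per_bin:
        ambient = users / conversion_ratio  # G_{l,j,k}
        total += ambient * (ambient - 1)
    return total


def sci(users_per_bin, conversion_ratio, ambient_population):
    """Per-person social contact index: TSCI divided by P_t."""
    return tsci(users_per_bin, conversion_ratio) / ambient_population


# Three hypothetical bins with 5, 10 and 0 observed users and C = 0.5,
# giving ambient populations of 10, 20 and 0.
example_tsci = tsci([5, 10, 0], 0.5)
example_sci = sci([5, 10, 0], 0.5, ambient_population=47)
```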

The SCI computed using mobile geolocation data can help us to understand the transmission process with a higher spatiotemporal resolution by providing the number of potential social contacts on a per-person basis. To explore the non-linear relationship between SCI and mobility, 10% to 100% of mobile device users (in increments of 10%) were randomly selected, and their corresponding contact events were detected in the form of spatiotemporal co-presence (Extended Data Fig. 1). Thus, a series of mobility–SCI pairs was created; an empirical relationship was then built on the basis of these pairs to estimate SCI values in scenario-based simulations, given an assumed restoration or reduction in mobility levels.
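The subsampling procedure can be sketched on synthetic data as follows. This is a simplified illustration under stated assumptions: each synthetic user is observed once in a single space–time bin, the conversion ratio is taken as 1, and the per-person value is normalized by the full user count; all names and sizes are hypothetical.

```python
import random
from collections import Counter


def contact_events(bin_of_user):
    """Count co-presence events as n * (n - 1) per space-time bin."""
    counts = Counter(bin_of_user)
    return sum(n * (n - 1) for n in counts.values())


def mobility_sci_pairs(bin_of_user, fractions, rng):
    """Sample a fraction of users and return (fraction, per-person contacts)."""
    n_users = len(bin_of_user)
    pairs = []
    for fraction in fractions:
        sample = rng.sample(bin_of_user, round(fraction * n_users))
        pairs.append((fraction, contact_events(sample) / n_users))
    return pairs


rng = random.Random(0)
# Synthetic stand-in for the geolocation data: 2,000 users in 50 bins.
bin_of_user = [rng.randrange(50) for _ in range(2000)]
fractions = [k / 10 for k in range(1, 11)]  # 10% to 100% in steps of 10%
pairs = mobility_sci_pairs(bin_of_user, fractions, rng)
```

Because contact events grow roughly with the square of the number of co-present users, the resulting pairs trace a non-linear curve even in this toy setting.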

Quantifying the relationship between mobility and social contacts

To quantify the association between mobility and the SCI, we used an allometric curve34,35, as follows:

$${{\mathrm{SCI}}_t} = \alpha \times \left( {h \times P_t/A} \right)^\beta$$

where A is the size of the study area and Pt/A is the population density. h denotes a ratio for quantifying the mobility level of the entire population (that is, Pt), which can also reflect the level of mobility reduction or restoration. For instance, a stay-at-home order followed by approximately 50% of the population can be simulated by setting h at 0.5. Thus, the mobility is calculated as h × Pt and the mobility per unit area is calculated as h × Pt/A. α and β together define a power law equation representing the influence of physical distancing measures. Higher values of α and β indicate weaker physical distancing, thereby implying more social contact under a given population density and mobility level. In addition, an inflow of population can cause an increase in Pt and an increase in population density and SCIt.
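As a worked illustration of the stay-at-home example, with α, β and Pt/A held fixed, the ratio of the SCI at h = 0.5 to the SCI at full mobility (h = 1) depends only on β:

$$\frac{{{\mathrm{SCI}}_t\left( {h = 0.5} \right)}}{{{\mathrm{SCI}}_t\left( {h = 1} \right)}} = \frac{{\alpha \times \left( {0.5 \times P_t/A} \right)^\beta }}{{\alpha \times \left( {P_t/A} \right)^\beta }} = 0.5^\beta$$

For a hypothetical β of 2 (an illustrative value, not a fitted one), halving mobility would reduce the per-person contact index to one quarter of its full-mobility value.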

To fit this model, the social contact data of Wuhan for one week during the pre-lockdown period (1 December to 7 December) were extracted and used to generate a set of mobility–SCI observations. The social contacts were categorized into five types according to the places where they occurred, namely the residential community (Cr), work (Cw), school (Csch), shopping or recreation (Cs) and others (Co).

$${\mathrm{TSCI}} = {C_{\mathrm{r}}} + {C_{\mathrm{w}}} + {C_{\mathrm{sch}}} + {C_{\mathrm{s}}} + {C_{\mathrm{o}}}$$

The values of each of these types of contact under a given mobility level (from 1% to 100%) were also recorded and denoted as Ck,i, where k ∈ {r,w,sch,s,o} and i ∈ {1%,2%,3%,...,100%}. The mobility level was obtained by randomly sampling the overall mobile device users (represented by i) and calculating the contact events belonging to the given category k. The contact value (SCIt,i) with respect to a given mobility level i and a physical distancing intervention was then derived (Extended Data Fig. 2), where pk,i, denoting a ratio between 0 and 1, was used to quantify the contacts of a given category k under the mobility level i on day t, as follows:

$${\mathrm{SCI}}_{t,i} = \left( {\sum_{k} {p_{k,i} \times C_{k,i,t}} } \right)/P_t$$

A set of SCIt,i and i values were generated under a given physical distancing intervention specified by α and β and a population density (Pt/A). These observations were used to quantify the relationship between a specified mobility level and an SCI value for a given population density. Therefore, the α and β values for a given physical distancing intervention could be obtained as follows:

$${\mathrm{SCI}}_{t,i} = \alpha \times \left( {i \times h \times P_t/{\mathrm{A}}} \right)^\beta$$
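One common way to estimate α and β from such observations is ordinary least squares on log-transformed values. The text does not state which fitting procedure was used, so the following Python sketch is an assumption for illustration only; the function name and the synthetic data are hypothetical.

```python
import math


def fit_allometric(mobility_densities, sci_values):
    """Fit SCI = alpha * x**beta by least squares on log-log data.

    All inputs must be strictly positive because logarithms are taken.
    """
    xs = [math.log(x) for x in mobility_densities]
    ys = [math.log(y) for y in sci_values]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope_num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope_den = sum((x - mean_x) ** 2 for x in xs)
    beta = slope_num / slope_den
    alpha = math.exp(mean_y - beta * mean_x)
    return alpha, beta


# Synthetic check: generate noise-free data from known parameters.
true_alpha, true_beta = 0.002, 1.5
densities = [i / 10 * 1000 for i in range(1, 11)]  # i * h * P_t / A
sci_values = [true_alpha * x ** true_beta for x in densities]
alpha, beta = fit_allometric(densities, sci_values)
```

With real observations the points would scatter around the curve, and a non-linear least-squares fit on the original scale could give slightly different estimates.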

The above process was conducted four times to determine the parameters α and β under varying intensities of physical distancing interventions, namely no, mild, moderate and strong. Using the mobility data of Wuhan from December 2019 to May 2020, we were able to derive the social contact rates at different locations and infer their opening and closure during the periods in which physical distancing interventions of varying intensities were implemented. As shown in Supplementary Table 1, the four intensities were consistent with those imposed in the following periods: pre-lockdown, LALDL, SALDL and lockdown. Each set of measures was as follows. No physical distancing refers to the normal status pre-lockdown; mild physical distancing measures include the closure of schools and a 20% reduction in contact in all other categories; moderate physical distancing measures include the closure of schools, an 80% reduction in contact in shopping or recreation, and a 50% reduction in contact in all other categories; and strong physical distancing measures include only 50% of the pre-lockdown-level contact in residential communities and the closure of all other non-essential facilities.
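The four sets of measures can be encoded as the fraction of pre-lockdown contact retained in each category (the ratios pk). The Python sketch below is illustrative only: the dictionary layout and names are hypothetical, and representing the 'closure of all other non-essential facilities' under strong distancing as zero retained contact outside residential communities is an assumption.

```python
CATEGORIES = ("residential", "work", "school", "shopping", "other")

# Fraction of pre-lockdown contact retained per category (p_k).
RETAINED = {
    "none": dict.fromkeys(CATEGORIES, 1.0),
    "mild": {"residential": 0.8, "work": 0.8, "school": 0.0,
             "shopping": 0.8, "other": 0.8},
    "moderate": {"residential": 0.5, "work": 0.5, "school": 0.0,
                 "shopping": 0.2, "other": 0.5},
    # Assumption: closure is modelled as zero retained contact.
    "strong": {"residential": 0.5, "work": 0.0, "school": 0.0,
               "shopping": 0.0, "other": 0.0},
}


def adjusted_tsci(contacts, intensity):
    """Sum category contacts after applying the retained fractions."""
    retained = RETAINED[intensity]
    return sum(retained[k] * contacts[k] for k in CATEGORIES)


# Hypothetical example: 100 contact events in every category.
contacts = dict.fromkeys(CATEGORIES, 100.0)
totals = {name: adjusted_tsci(contacts, name) for name in RETAINED}
```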

The above considerations indicate that the four intensities of physical distancing can be implemented explicitly because each intensity is linked to a set of control measures (for example, closure of schools and 20% closure of other services), which can then inform the changes in the SCI. These changes then indicate the potential variations in disease transmission.

To validate the proposed mobility–SCI relationships under the four intensities of physical distancing or the four sets of explicit physical distancing measures (Fig. 2), the relationships were examined against the observed mobility–SCI values in Wuhan over the corresponding periods of pre-lockdown, LALDL, SALDL and lockdown under the same population density (low, medium or high). The low population density was set to approximately 30% of the population of Wuhan, medium population density was set to 50% of the population, and high population density was set to 100% of the population. Under each physical distancing intensity, the modelled mobility–SCI relationships for the three population densities followed the same allometric function34,35. Thus, only four relationships required validation (Extended Data Fig. 3).

The results indicated that the modelled SCI values under each mobility level (10% to 100% with increments of 10%; that is, for each relationship), were significantly correlated with the observed SCI values (Pearson correlation coefficients ranging from 0.95 to 0.97 for all of the relationships, each with P < 0.001; mean absolute error ranging from 0.03 to 1.42; n = 10). Therefore, the proposed mobility–SCI metric was considered useful for measuring the actual intensities of physical distancing that existed over the aforementioned periods for different population densities. Accordingly, this metric was used to quantify the intensities of physical distancing in our simulations.
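The two agreement measures reported above, the Pearson correlation coefficient and the mean absolute error, can be computed as in the following Python sketch. The function names are hypothetical and the sample values are placeholders, not the study's modelled or observed SCI values.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(xs, ys):
    """Average absolute difference between paired values."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


# Placeholder values standing in for modelled and observed SCI.
modelled = [1.0, 2.0, 3.0, 4.0]
observed = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(modelled, observed)
mae = mean_absolute_error(modelled, observed)
```

The placeholder series is perfectly correlated yet has a non-zero error, which shows why both measures are reported together.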

SEIR computation

For the purpose of this computation, the population (N) was subdivided into five groups, namely susceptible (S), exposed (E), infectious (I), recovered/removed (R) and vaccinated (V). During each time step, five sub-steps were performed sequentially, as follows:

$$N = S + E + I + R + V$$
$$\frac{{{\mathrm{d}}S}}{{{\mathrm{d}}t}} = S - E_I\frac{{SI}}{N} + \xi _RR + \xi _VV$$
$$\frac{{{\mathrm{d}}E}}{{{\mathrm{d}}t}} = E_I\frac{{SI}}{N} - E_c$$
$$\frac{{{\mathrm{d}}I}}{{{\mathrm{d}}t}} = E_c - r_tI$$
$$\frac{{{\mathrm{d}}R}}{{{\mathrm{d}}t}} = r_tI - \xi _RR$$
$$\frac{{{\mathrm{d}}V}}{{{\mathrm{d}}t}} = V_{e,t} - \xi _VV$$

First, the disease transmission coefficient EI was calculated based on a transmission process following a Poisson (λ = Rs) distribution55. The theoretical number of newly exposed people EI × I was then multiplied by the percentage of susceptible people (S/N) in the city. Here, the SCI-adjusted transmission rate (Rs) was obtained from the basic reproduction number (R0) (2.2, 95% CI: 1.4–3.9), divided by the average number of days (tg) (5.8, 95% CI: 4.3–7.5) between symptom onset and the first medical visit and isolation, and weighted using the level of social contact (Sc) determined using the mobility data. Sc was computed using a generalized linear model consisting of the parameters β0, β1 and β2. The observed SCI values were derived from the TSCI divided by the total ambient population of the city when fitting the models. In the simulation processes, the SCI values were derived using our proposed physical distancing and mobility reduction measures. We assumed that the mean incubation period for exposed people was 4 d (IQR: 2–7 d)11. A set of onset dates (that is, current time point plus a stochastic incubation period) was stochastically generated and recorded together with the onset dates of the previously exposed population. Rs and Sc were calculated as follows:

$$R_{\mathrm{s}} = S_c \times R_0/t_g$$
$$S_c = \beta _0 \times {\mathrm{SCI}}^{\beta _1} + \beta _2$$
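The two expressions translate directly into code. In the Python sketch below, the default R0 and tg are the point estimates quoted in the text, whereas the function names and the β values used in the example are hypothetical placeholders (the fitted values are not given in this section).

```python
def social_contact_weight(sci, beta0, beta1, beta2):
    """S_c = beta0 * SCI**beta1 + beta2."""
    return beta0 * sci ** beta1 + beta2


def transmission_rate(sci, beta0, beta1, beta2, r0=2.2, t_g=5.8):
    """SCI-adjusted transmission rate R_s = S_c * R0 / t_g."""
    return social_contact_weight(sci, beta0, beta1, beta2) * r0 / t_g


# Placeholder coefficients chosen only to make the example concrete.
low_contact = transmission_rate(sci=0.5, beta0=1.0, beta1=1.0, beta2=0.0)
high_contact = transmission_rate(sci=2.0, beta0=1.0, beta1=1.0, beta2=0.0)
```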

Second, the exposed people were considered infectious if their onset dates (specified in the previous sub-step) were equal to the current time step (that is, day t). The number of such people is denoted by Ec. Typically, a direct estimation based on the total number of existing exposed people and the conversion rate (σ) derived from the delay in symptom onset distribution was applied. For instance, 20% of the exposed population on day t − 1 would be converted to the infected population if σ is equal to 0.2. However, such an approach can cause a premature conversion of the exposed population to the infectious population; for example, the abrupt peak in mass social contact before the Chinese New Year holidays might have resulted in an immediate increase in the exposed population but a delayed increase in the infectious population. In contrast, a zero-delay peak of new infectious people would appear on the next day if only the number of people exposed in the current time step were considered instead of their potential onset days.
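The onset-date bookkeeping described in this sub-step can be sketched as a calendar that maps each future day to the number of people due to become infectious. The Python code below is a simplified illustration: the names are hypothetical, and drawing incubation periods from an exponential distribution with a 4-d mean, rounded up to whole days, is an assumption made only to keep the example short.

```python
import math
import random
from collections import defaultdict


def schedule_onsets(day, n_exposed, onset_calendar, rng, mean_incubation=4.0):
    """Assign each newly exposed person an onset day and record it."""
    for _ in range(n_exposed):
        # Assumed distribution; a delay of at least 1 d is enforced.
        delay = max(1, math.ceil(rng.expovariate(1.0 / mean_incubation)))
        onset_calendar[day + delay] += 1


def newly_infectious(day, onset_calendar):
    """E_c: exposed people whose recorded onset day is the current day."""
    return onset_calendar.pop(day, 0)


rng = random.Random(1)
calendar = defaultdict(int)
schedule_onsets(0, 1000, calendar, rng)  # a burst of exposures on day 0
last_day = max(calendar)
e_c = [newly_infectious(day, calendar) for day in range(last_day + 1)]
```

In this sketch a burst of exposures on day 0 produces new infectious cases spread over the following days rather than a fixed fraction on day 1, which is the behaviour the onset-date approach is intended to capture.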

Third, the infected people were later removed or recovered at an average rate of rt (recovery/removal rate) where t ∈ {1,2,3,4,5}, which represents the five periods identified in Wuhan39,40. A dynamic rt was applied in light of the significant changes in the diagnosis and isolation strategies implemented from December 2019 to March 2020. rt was modelled as an optimizable parameter and was determined using a Bayesian optimization method42. Everyone in the recovered/removed group lost immunity at a rate of ξR every day during the period.

Fourth, when fitting against the real situation in Wuhan, the initial model did not include the vaccination group and vaccination process (the vaccinated population was 0). However, both were included in the SEIR model when simulating the combined effects of vaccination and physical distancing. On day t, part of the vaccinated population (Ve,t) was immunized, and everyone in the vaccinated group lost immunity at a rate of ξV every day.

Finally, the exported and imported population were processed to update the total population in the city using the migration data. The exported population followed the same fractions of the susceptible, exposed, infectious and recovered/removed population in the current time step in the city, while the imported population was considered to join the susceptible population. This indicated a limitation that the exposed and infectious populations could only be input at the start time step of simulations.
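A minimal Python sketch of this migration sub-step is given below. The function name and the compartment counts are hypothetical; the sketch applies the export proportionally to whichever compartments are passed in, so including a vaccinated compartment would be an additional assumption beyond the four groups named above.

```python
def apply_migration(state, exported, imported):
    """Remove exports proportionally from all compartments; add imports to S."""
    total = sum(state.values())
    export_share = exported / total
    new_state = {name: count * (1 - export_share) for name, count in state.items()}
    new_state["S"] += imported  # imported people are treated as susceptible
    return new_state


# Hypothetical compartment sizes: 10% of the population leaves, 40 arrive.
state = {"S": 900.0, "E": 50.0, "I": 30.0, "R": 20.0}
after = apply_migration(state, exported=100, imported=40)
```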

Optimization of parameters for SEIR modelling

We developed the MC-SEIR model by modifying the classical SEIR model with mobility and social contact data to reconstruct the transmission dynamics of COVID-19 in Wuhan between December 2019 and March 2020. The model was calibrated within a Bayesian optimization framework by using a tree-structured Parzen estimator42, in which the relationship between social contact and transmission rate as specified by the generalized linear model, the removal rate rt, and the initial cases were optimizable. These parameters were estimated by minimizing the squared error between the model-estimated daily new cases and the actual case report data. To deal with the uncertainty in the stochastic SEIR model, each parameter set was evaluated 150 times. The mean squared errors were finally used as the objective function values by the tree-structured Parzen estimator.
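The objective evaluated for each candidate parameter set can be sketched as the mean squared error averaged over repeated stochastic runs. In the Python sketch below, the optimizer itself is omitted, and `toy_simulate` is a hypothetical stand-in for the MC-SEIR simulation used only to make the example runnable.

```python
import random


def mean_squared_error(simulated, observed):
    """Mean squared difference between simulated and observed daily cases."""
    return sum((s - o) ** 2 for s, o in zip(simulated, observed)) / len(observed)


def objective(params, observed, simulate, n_runs=150, seed=0):
    """Average the MSE over repeated stochastic runs of one parameter set."""
    rng = random.Random(seed)
    errors = [
        mean_squared_error(simulate(params, rng), observed) for _ in range(n_runs)
    ]
    return sum(errors) / n_runs


observed_cases = [10.0, 20.0, 30.0]  # placeholder daily case counts


def toy_simulate(params, rng):
    """Hypothetical stand-in for one stochastic run of the epidemic model."""
    return [params["scale"] * cases + rng.gauss(0, 1) for cases in observed_cases]


good = objective({"scale": 1.0}, observed_cases, toy_simulate)
bad = objective({"scale": 2.0}, observed_cases, toy_simulate)
```

A function of this shape could be passed to any optimizer that proposes parameter sets and minimizes a scalar score.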

We also tested the non-linear correlation between the SCI (4 days ahead) and Rt by using the case report data of Wuhan for the period of 1 December 2019 to 31 March 2020. A four-day (median value of the incubation period) time lag was applied to examine the lagged correlation between Rt and SCI.

Simulation for containing resurgences through physical distancing without vaccination

A resurgence of COVID-19 remains a high risk in the near future. Therefore, it is imperative to devise appropriate physical distancing interventions that can help to effectively contain potential resurgences. We simulated resurgence under different intervention strategies and levels of mobility and evaluated the effectiveness of the strategies using the median duration required to contain the resurgence. A physical distancing intervention was commenced under a certain level of mobility when the number of daily new cases exceeded a threshold (that is, ten people in the simulation). Thus, the SCI could be controlled under the proposed intervention, along with the level of mobility, to decelerate the transmission process. The SCI value was acquired from the SCI curve defined earlier with respect to the level of mobility, intensity of physical distancing and population density.

Ideally, the number of new cases would decrease owing to significant reductions in the TSCI and SCI. The interventions were lifted when there were no new cases for 14 consecutive days. Otherwise, the measures were continuously implemented for one year (the remainder of the total simulation period). In other words, an estimated duration of more than 300 d indicated that the conducted interventions could not contain the resurgence effectively.
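The start and lift rules can be expressed as a small state machine. The Python sketch below applies them to a fixed, hypothetical series of daily new cases; in the actual simulations the case counts would respond to the intervention, and counting the triggering day as the first day of the intervention is an assumption.

```python
def intervention_days(daily_new_cases, start_threshold=10, clear_days=14):
    """Count the days on which the physical distancing intervention is active."""
    active = False
    zero_streak = 0
    days_active = 0
    for cases in daily_new_cases:
        if not active:
            if cases <= start_threshold:
                continue
            active = True  # daily new cases exceeded the threshold
            zero_streak = 0
        days_active += 1
        zero_streak = zero_streak + 1 if cases == 0 else 0
        if zero_streak >= clear_days:
            active = False  # lifted after 14 consecutive days without cases
    return days_active


# Hypothetical series: a small outbreak that is contained.
contained = intervention_days([5, 12, 8, 3] + [0] * 24)
# Hypothetical series: cases never reach zero during the year.
uncontained = intervention_days([20] * 365)
```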

Simulation of the joint effects of vaccination and physical distancing

To understand the joint effects of vaccination and physical distancing in the cities with varying population densities, a set of scenarios differentiated by vaccination, physical distancing and population density were designed. During the simulation period (that is, 365 d), the same number of people would be vaccinated every day (approximately 0.18% of the total population). By the end of the simulation period, 64.2% of the population would be vaccinated. The proportion of the vaccinated population (1 − 1/R0) was derived from R0 (2.79), that is, the median R0 value as reported in a set of previous studies43 (Supplementary Table 2).
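The two percentages quoted above follow directly from the stated R0 and the length of the simulation period:

$$1 - \frac{1}{{R_0}} = 1 - \frac{1}{{2.79}} \approx 0.642,\qquad \frac{{64.2\% }}{{365}} \approx 0.18\% \;{\mathrm{per}}\;{\mathrm{day}}$$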

The vaccinated population was assumed to receive two doses and to gradually acquire immunity to SARS-CoV-2. Specifically, six types of COVID-19 vaccines56,57,58,59,60,61 that had finished phase II trials (Supplementary Table 3) were reviewed. The probability of inducing an immune response (probability of seroconversion) was recorded on different observation days (for example, 75%; day 14). These records were later grouped by date and used to calculate the quantiles (that is, 25%, 50% and 75%) of seroconversion on each date (for example, 14, 28 or 42 d since the first shot) (Extended Data Fig. 4). In our simulations, the median value of seroconversion was used. Under pessimistic, neutral, and optimistic scenarios, 50%, 75% and 100% of the population that had experienced seroconversion would acquire immunity, respectively. The population with immunity could not infect or be infected by other people.
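The way the median seroconversion and the scenario factor combine can be sketched as follows. The scenario factors are those stated above; the seroconversion values in `example_curve`, the step-wise lookup between observation days and all names are hypothetical assumptions for illustration.

```python
SCENARIO_FACTOR = {"pessimistic": 0.50, "neutral": 0.75, "optimistic": 1.00}


def median_seroconversion(days_since_first_dose, curve):
    """Return the median seroconversion at the latest observation day reached."""
    reached = [day for day in sorted(curve) if day <= days_since_first_dose]
    return curve[reached[-1]] if reached else 0.0


def immune_fraction(days_since_first_dose, curve, scenario):
    """Fraction of a vaccinated cohort counted as immune on a given day."""
    return median_seroconversion(days_since_first_dose, curve) * SCENARIO_FACTOR[scenario]


# Placeholder medians by observation day (not the values in Extended Data Fig. 4).
example_curve = {14: 0.40, 28: 0.80, 42: 0.95}
neutral_day_30 = immune_fraction(30, example_curve, "neutral")
```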

The long-term immunity loss was also considered. The population can obtain immunity by either recovery from infection or vaccination, but the achieved immunity will fade at different speeds. For the recovered group, we assumed that their immunity would follow a similar decreasing curve to the one for severe acute respiratory syndrome (SARS), which is also caused by a coronavirus, because there was no available systematic review or report on the immunity-fading rate of COVID-19 recovered patients.

Specifically, we assumed that 6.12% of the recovered population would lose their immunity to COVID-19 in the first year after their recovery62. For the vaccinated group, we assumed that their immunity would fade at a faster rate. Thus, a unique fading curve was adopted to simulate the fading of immunity in the vaccinated population. However, there were also no data on the long-term effects of vaccines for COVID-19 or for other coronaviruses (for example, SARS and Middle East respiratory syndrome coronavirus). Therefore, the fading trend of influenza vaccination was used instead. We assumed that 53.05% (95% CI: 45.79%–60.29%)63 of the vaccinated population would lose their immunity in the first year.
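For intuition about the scale of the daily loss rates ξR and ξV, the first-year loss fractions can be converted to constant daily rates. This conversion is an assumption made for illustration: the text describes fading curves, which need not correspond to a constant rate, and the function name is hypothetical.

```python
def daily_loss_rate(annual_loss_fraction, days=365):
    """Constant daily rate xi such that 1 - (1 - xi)**days equals the annual loss."""
    return 1 - (1 - annual_loss_fraction) ** (1 / days)


xi_recovered = daily_loss_rate(0.0612)   # 6.12% lost in the first year
xi_vaccinated = daily_loss_rate(0.5305)  # 53.05% lost in the first year
```

Under this simplification the daily rate for the vaccinated group is roughly an order of magnitude larger than that for the recovered group.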

Simultaneously, physical distancing measures (with mild, moderate or strong intensity) would commence when daily new cases exceeded a threshold (that is, 10), and would later be lifted if there were 14 consecutive days with no new cases. It was considered that vaccination and mobility reduction could achieve a similar effect, that is, a reduction in the exposures that could lead to infection. However, compared with vaccination measures, travel restrictions have serious adverse socioeconomic effects. Moreover, travel restrictions may be difficult to enforce in some countries. Therefore, the physical distancing measures that we adopted excluded mobility reduction in this scenario; that is, the mobility was 100% in the simulations.

The scenarios were simulated under 3 population densities (that is, 30% (low), 50% (medium) and 100% (high) of the population density of Wuhan) and 4 physical distancing intensities. Each of the 12 scenarios was run 200 times to evaluate the uncertainty. In the simulations, contacts caused by people belonging to the vaccinated and removed or recovered groups were removed from the TSCI because these people could not infect or be infected by other people. Finally, the cumulative infected population and duration of physical distancing measures were reported with 95% CIs in Table 1.

Application of the joint vaccination and physical distancing interventions to other cities

To examine the extensibility of our proposed SCI models grouped by population density, we replicated the simulations of the combined effects of vaccination and physical distancing measures in six other cities, namely Zhuzhou and Qiqihar (low density), Hefei and Hangzhou (medium density), and Beijing and Chengdu (high density), and compared them with Wuhan’s population density scenarios. For each city, the mobility–SCI relationships were first extracted under no, mild, moderate and strong physical distancing intensities. In this process, the set of measures under a physical distancing intensity in Wuhan remained unchanged to examine if the measures could be feasibly applied to a city with a similar population density. The extracted SCI curves were later used to derive the SCI value of a given city under various physical distancing and vaccination scenarios. The simulations, which were the same as those for Wuhan, were then replicated with all of the conditions unchanged; the major differences in cities were in their different SCI–mobility curves. Finally, the estimated number of cases (in a one-year period) and duration of physical distancing were reported as indicators for the assessment.

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.