## Main

A few weeks after the initial announcement of a cluster of atypical pneumonia cases in Wuhan, China, the first confirmed cases of coronavirus disease 2019 (COVID-19) in the USA and Europe were detected (on 21 January 2020 in WA, USA1 and on 24 January 2020 in France2). Although many more states and countries began to report initial introductions in the following weeks, only a few cases were detected daily during this time period (Fig. 1a), and most countries adopted a testing policy that targeted symptomatic individuals with a travel history linked to China. Several reports suggest that the introduction of SARS-CoV-2 occurred earlier than initially recognized3,4,5,6,7,8, raising questions about the effectiveness of the initial testing policies and travel-related restrictions, as well as the extent to which the SARS-CoV-2 virus spread through cryptic transmission in January and February 2020. To address these questions, we use the global epidemic and mobility (GLEAM) model, a data-driven, stochastic, spatial, age-structured metapopulation epidemic model11,12, to study the global dynamic underlying the evolution of the COVID-19 pandemic in Europe and the USA. Our model maps the plausible pathways of the pandemic using information available at the early stages of the outbreak and provides a global picture of the cryptic phase as well as the ensuing first wave of the COVID-19 pandemic.

We consider data concerning the continental USA and 30 European countries (the full list is reported in Extended Data Table 1). The model integrates real-time human mobility and population data with a mechanistic epidemic model at a global scale, incorporating changes in contact patterns and mobility according to the non-pharmaceutical interventions (NPIs) implemented in each region. It is calibrated on international case introductions out of mainland China at the early stage of the pandemic using an approximate Bayesian computation (ABC) methodology13. The model returns an ensemble of stochastic realizations of the global epidemic spread including international and domestic infection importations, incidence of infections and deaths at a daily resolution (see Methods). In the following text, we provide a detailed discussion of the analyses and results concerning European countries and the US states; however, to further test and validate our approach, in the Supplementary Information, we report the modelling results for 24 additional countries that are globally representative, including countries of world regions such as Latin America, the Middle East, Africa, East Asia and Oceania.

In Fig. 1b we show the model estimates of the median daily incidence of new infections up to 21 February 2020, for both the USA and Europe. These values are much larger than the number of officially reported cases (see Fig. 1a), highlighting the substantial number of potential transmission events that may have already occurred before many states and countries had implemented testing strategies independent of travel history. As validation we compare our model’s estimates of the number of infections during the week of 8 March 2020 to the number of cases reported during that week within each US state and European country with at least one reported case (shown in Fig. 1b inset). While we see a strong correlation between the reported cases and our model’s estimated number of infections (Pearson’s correlation coefficient on log values, USA: 0.79, P < 0.001; Europe: 0.80, P < 0.001), far fewer cases had actually been reported by that time. If we assume that the number of reported cases and simulated infections are related through a simple binomial sampling process, we find that on average 9 in 1,000 infections (90% confidence interval (CI) 1–35 per 1,000) and 35 in 1,000 infections (90% CI 4–90 per 1,000) were detected by 8 March 2020 in the USA and Europe, respectively. As testing capacity increased, the ascertainment rate grows and our estimates increase to detecting 17 in 1,000 infections (90% CI 2–55 per 1,000) by 14 March 2020 in the USA and 77 in 1,000 infections (90% CI 5–166 per 1,000) in Europe. The estimated ascertainment rates are in agreement with independent results based on different statistical methodologies14,15,16. In Fig. 1c we show the probability that a city in the USA or Europe had generated at least 100 infections by 21 February 2020. We see that the progression of the virus through the USA and Europe was both temporally and spatially heterogeneous. While many cities had not yet experienced much community transmission by late February, a few areas such as New York City and London are very likely to have already had local outbreaks.

## Onset of local transmission

The model’s ensemble of realizations provides a statistical description of all the potential pandemic histories compatible with the initial evolution of the pandemic in China. Rather than describing a specific, causal chain of events, we can estimate possible time windows pertaining to the initial chains of transmission in different geographical regions. We define the onset of local transmission for a country or state as the earliest date when at least 10 new infections are generated per day. This number is chosen because at this threshold the likelihood of stochastic extinction is extremely small17,18. As detailed in the Supplementary Information, further calibration on the US states and European countries suggests posterior values of R0 ranging from 2.4 to 2.8. These values are consistent with many other (country-dependent) estimates19,20,21,22,23,24. At the same time, given the doubling time of the number of COVID-19 cases before the implementation of public health measures, any variation of a factor 2 around the 10 infections per day threshold corresponds to a small adjustment of 3−5 days to the presented timelines.

In Fig. 2, we show the posterior probability distribution, p(t), of the week, t, of the onset of local transmission for 15 US states (Fig. 2a) and European countries (Fig. 2b) (see Supplementary Information for all states and countries). We also calculate, for each country or state, the median date, T, that identifies the first week in which the cumulative distribution function is larger than 50%. Among the US states, CA and NY have the earliest dates, T, by the week of 19 January (CA) and 2 February (NY) 2020. In Europe, Italy, the UK, Germany and France are the first countries with T close to the end of January 2020. However, it is worth noting that each distribution, p(t), has a support spanning several weeks. In Italy, the 5th and 95th percentiles of the p(t) distribution are the week of 6 January and the week of 30 January 2020, respectively. These dates also suggest that it is not possible to rule out introductions and transmission events as early as December 2019, although the probability of this is very small.

For each state in the USA and each country in Europe, we compared the order in which they surpassed 100 cumulative infections in the model and in the surveillance data (gathered from the John Hopkins University Coronavirus Resource Center25). In Extended Data Fig. 1a we plot the ordering for states and compute the Kendall rank correlation coefficient τ (see Supplementary Information). The correlation is positive ($${\tau }_{{\rm{EU}}}$$= 0.71, P < 0.001 and $${\tau }_{{\rm{US}}}$$ = 0.68, P < 0.001) indicating that, despite the detection and testing issues, the expected patterns of epidemic diffusion are largely described by the model in both regions.

## SARS-CoV-2 introductions

As the model allows the recording of the origin and destination of travellers carrying SARS-CoV-2 at the global scale, we can study the possible sources of SARS-CoV-2 introductions for each US state and European country. More specifically, we record the cumulative number of introductions in each stochastic realization of the model until 30 April 2020. In Fig. 3 we visualize the origin of the introductions considering some key geographical regions (for example, Europe and Asia) while keeping the USA and China separate and aggregating all of the other countries (Others). For both the USA and Europe, the contribution from mainland China is barely visible and the local share (that is, sources within Europe and the USA) becomes significantly higher across the board. Hence, while introduction events in the early phases of the outbreak were key to start local spreading (see details in the Supplementary Information), the cryptic transmission phase was sustained largely by internal flows. Domestic SARS-CoV-2 introductions to 30 April 2020 account for 69% (interquartile range (IQR) 60%−81%) of the introductions in CA, 78% (IQR 71%−87%) in TX and 69% (IQR 60%−80%) in MA, which is supported by phylogenetic analysis26. European origins account for 69% (IQR 60%−80%), 84% (IQR 79%−91%) and 58% (IQR 48%−68%) of the introductions in Italy, Spain and the UK, respectively. In the Supplementary Information, we report the full breakdown for all states and countries.

It is also necessary to distinguish between the full volume of SARS-CoV-2 introductions and the introduction events that could be relevant to the early onset of local transmission in each stochastic realization of the model. To this point, it is worth stressing that seeding introductions are different from the actual number of times the virus has been introduced to each location with subsequent onward transmission. Even after a local outbreak has started, future importation events may give rise to additional onward transmission forming independently introduced transmission lineages of the virus27. In the model, we can investigate seeding events by recording introduction events before the local transmission chains were established. We report the results of this analysis in the Supplementary Information, showing that importations from mainland China may be relevant in seeding the epidemic in January, but then play a comparatively small role in the expansion in the number of COVID-19 cases in the USA and Europe owing to the travel restrictions imposed to/from mainland China after 23 January 2020.

The early timing of the initial introductions and diffusion pattern of SARS-CoV-2 were driven by air travel. We find a positive correlation ($${\tau }_{{\rm{EU}}}$$ = 0.66, P < 0.001 and $${\tau }_{{\rm{US}}}$$ = 0.66, P < 0.001) comparing the ordering of states according to when they surpassed 100 cumulative, reported cases (referred to as the epidemic order) and their domestic and international air travel volume rank (Extended Data Fig. 1B). Similar observations have been reported in China, where the initial spreading of the virus outside Hubei was strongly correlated with the traffic to/from the province28. Other factors such as population size are also correlated with both the travel flows ($${\tau }_{{\rm{EU}}}$$ = 0.59, P < 0.001 and $${\tau }_{{\rm{US}}}$$ = 0.7, P < 0.001) and the epidemic order ($${\tau }_{{\rm{EU}}}$$ = 0.46, P < 0.001 and $${\tau }_{{\rm{US}}}$$ = 0.68, P < 0.001), which are discussed in detail in the Supplementary Information. In our model, it is not possible to exclude increased contacts in highly populated places before social distancing interventions and disentangle this effect from increased seeding due to the correlation between travel volume and population size.

## COVID-19 burden

Starting in March 2020, the establishment and timing of NPIs as well as other epidemiological drivers (that is, population size and density, age structure and so on) determined the disease burden in the USA and Europe29,30,31,32. We account for these features by calibrating the model results, individually, for each US state and European country. More precisely, we estimate the posterior distribution of the infection fatality ratio (IFR) and infection attack rate in each US state and European country. To this end, we adopt the ABC approach using as evidence the number of new deaths reported from 22 March 2020 to 27 June 2020. We consider a uniform prior for the average IFR in the range from 0.4% to 2% that is age stratified proportional to the IFR values reported in ref. 33. We also consider a uniform prior for reporting delays between the date of death and reporting ranging from 2 to 22 days in both Europe and the USA34. Details are provided in the Supplementary Information.

In Fig. 4a–d, f–i, we report the model fit of the estimated weekly deaths of the first wave for selected states and countries. Additional model results for all investigated regions including a sensitivity analysis of different calibration methods can be found in the Supplementary Information. We find a strong correlation between the weekly model-estimated deaths and the reported values with a Pearson correlation coefficient of 0.99 (P < 0.001) for both Europe and the USA (see Supplementary Fig. 6). As the data suggest, many European countries and US states saw peaks in April and May with various decreasing trajectories that depend on the mitigation strategies in place. Additionally, we report the estimated posteriors for the cumulative infection attack rates and IFRs as of 4 July 2020 in European countries experiencing more than 100 total deaths and the top 20 states ranked by infection attack rate in the USA.

Within Europe, Belgium has the highest estimated infection attack rate of 13.2% (90% CI [8.5%28.3%]) by 4 July 2020, in agreement with the results in ref. 14. Furthermore, by that time Belgium reported the highest COVID-19 mortality rate out of the European countries investigated with 8.5 deaths per 10,000 individuals. However, Italy is estimated to have the highest median IFR of 1.4% (90% CI [0.6%–1.8%]), which aligns with other ranges reported in the literature35,36. The US states with the highest infection attack rates are located within the northeast and experienced a significant first wave during March–April 2020. NY and NJ are the top two states with infection attack rates of 13.4% (90% CI [9.1%−26.7%]) and 15.2% (90% CI [10.2%−31.3%]), respectively. These numbers are aligned with estimates from New York City reported in ref. 37. In the Supplementary Information, we report summary tables with estimated IFRs, infection attack rates and the reproductive number in the absence of mitigation measures for all calibrated US states and European countries. Additionally, we compare our attack rate estimates to the prevalence of individuals with SARS-CoV-2 antibodies from serological studies across the USA and Europe (Extended Data Fig. 1D). The seroprevalence estimates are compared to the model estimates during the same time window the studies were performed (details on the seroprevalence data from this figure can be found in Supplementary Table 8 and Supplementary Section 9.3).

## Discussion

The model presented here captures the spatial and temporal heterogeneity of the early stage of the pandemic, going beyond the single-country-level reconstruction. It provides a mechanistic understanding of the underlying dynamics of the pandemic’s interconnected evolution. Furthermore, rather than showing specific evidence for early infection in a few locations, our study aims at providing a statistical characterization and quantification of the initial transmission pathways at a global scale. Our results can be compared to and complement analyses based on gene sequencing and travel volumes. We find that 72% of the early introductions to Italy, before the local outbreak, are linked to China, which is in agreement with ref. 38 highlighting the key role of importations between these regions at the beginning of the pandemic. Additionally, similar to our findings, ref. 27 estimates that the majority of importation events to April 2020, associated with onward transmission in the UK, came from Europe. The contributions from China are quantified below 1% and limited to the very early phase. Furthermore, seeding events from the USA are estimated to be below 3%, which aligns with our estimate (8%; IQR 3%−9%). However, their results point to a larger share from Europe (90%) compared to ours (58%; IQR 48%−68%), and conversely, we estimate a larger contribution from Asia (27%; IQR 19%−35%). As our analysis is a statistical description of the possible introduction pathways, differences could arise due to our model design, and also from genomic sampling biases39.

The sources of introduction of SARS-CoV-2 infections in Europe and the USA changed substantially and rapidly through time. This caused reactive response strategies, such as issuing travel restrictions targeting countries only after local transmission is confirmed, ineffective at preventing local outbreaks. Our results suggest that many regions in the USA and Europe experienced an onset of local transmission in January and February 2020, during the time when testing capacity was limited. If testing had been more widespread and not restricted to individuals with a travel history from China, there would have been more opportunities for earlier detection and interventions. In the Supplementary Information, we report a counterfactual scenario where we assume broader testing specifications not based on the individual travel history and find that the epidemic progression is considerably delayed (see Supplementary Section 8).

As testing capacity increased and more cases were detected, many governments began to issue social distancing guidelines to mitigate the spread of SARS-CoV-2. The first European country to implement a cordon sanitaire was Italy on 23 February 2020, for a few northern cities40. Many other countries followed suit and implemented national lockdowns in March 2020 (refs.30,41); however, this was weeks after our model estimates that SARS-CoV-2 was introduced and locally spreading. We find a strong correlation between the number of cases reported by the date of a lockdown/social distancing measure and the cumulative infections projected by 4 July 2020 (Extended Data Fig. 1C), indicating that the earlier NPIs had been issued, the smaller the COVID-19 burden experienced during the first wave. This is in agreement with other analyses showing that the timing of NPIs is crucial in limiting the burden of COVID-19 (refs.19,29,42,43,44,45,46,47,48). Overall, our results strengthen the case for preparedness plans with broader indication for testing that are able to detect local transmission earlier.

As with all modelling analyses, results are subject to biases from the limitations and assumptions within the model as well as the data used in its calibration. The model’s parameters, such as generation time, incubation period and the proportion of asymptomatic infections, are chosen according to the current knowledge of SARS-CoV-2. Although the model is robust to variations in these parameters (see the Supplementary Information for the sensitivity analysis), more information on the key characteristics of the disease would considerably reduce uncertainties. The model calibration does not consider correlations among importations (that is, family travel) and assumes that travel probabilities are age specific across all individuals in the catchment area of each transportation hub.

In light of the assumptions and limitations inherent to this modelling approach, the results are able to complement the SARS-CoV-2 genome sequencing analyses used to reconstruct the early epidemic history of the COVID-19 pandemic38. The methods used in this analysis offer a blueprint to identify the most likely early spreading dynamics of emerging viruses, and they can be used as a real-time risk assessment tool. Anticipating the locations where a virus is most likely to spread to next could be instrumental in guiding enhanced testing and surveillance activities. The estimated SARS-CoV-2 importation patterns and the cryptic transmission phase dynamics are of potential use when planning and developing public health policies in relation to international travelling, and they could provide important insights into assessing the potential risk and impact of emerging SARS-CoV-2 variants in regions of the world with limited testing and genomic surveillance resources.

## Methods

### The GLEAM model

The GLEAM model is a stochastic, spatial, age-structured metapopulation model. Previously this model was used to characterize the early stage of the COVID-19 epidemic in mainland China to estimate the effectiveness of travel bans and restrictions49. The GLEAM model divides the global population into more than 3,200 subpopulations in roughly 200 different countries and territories interconnected by realistic air-travel and commuting mobility networks. A subpopulation is defined as the catchment area around major transportation hubs. The airline transportation data encompass daily travel data in the origin–destination format from the Official Aviation Guide database50 reflecting actual traffic changes that occurred during the pandemic. Ground mobility and commuting flows are derived from the analysis and modelling of data collected from the statistics offices of 30 countries on 5 continents11,12. The international travel data account for travel restrictions and government-issued policies. Furthermore, the model accounts for the reduction of internal, country-wide mobility and changes in contact patterns in each country and state in 2020. Specific model details are reported in the Supplementary Information.

### SARS-CoV-2 transmission dynamics

The transmission dynamics take place within each subpopulation and assume a classic compartmentalization scheme for disease progression similar to those used in several large-scale models of SARS-CoV-2 transmission15,51,52,53,54,55. Each individual, at any given point in time, is assigned to a compartment corresponding to their particular disease-related state (specifically, one could be susceptible, latent, infectious or removed)49. This state also controls the individual’s ability to travel (details in the Supplementary Information). Individuals transition between compartments through stochastic chain binomial processes. Susceptible individuals can acquire the virus through contact with individuals in the infectious category and can subsequently become latent (that is, infected but not yet able to transmit the infection). The process of infection is modelled using age-stratified contact patterns at the state and country level56,57. Latent individuals progress to the infectious stage at a rate inversely proportional to the latent period, and infectious individuals progress to the removed stage at a rate inversely proportional to the infectious period. The sum of the mean latent and infectious periods defines the generation time. Removed individuals are those who can no longer infect others. To estimate the number of deaths, we consider a uniformly distributed prior of the IFRs (ranging from 0.4% to 2%) that is age stratified proportional to the values estimated by ref. 33 and incorporates reporting delays. The transmission model does not assume heterogeneities due to age differences in susceptibility to the SARS-CoV-2 infection for younger children (1–10 years old). This is an intense area of discussion58,59,60. The transmission dynamic and the offspring distribution of infectious individuals in the model will depend on the specific details of each population, local and global mobility, NPIs and so on. While overdispersion in transmission varies by location in our model, we find that overall, it is consistent with 25% of primary infections causing 75% of transmission in our simulations (Supplementary Fig. 9). Additional simulations considering a fixed level of dispersion, informed by past studies, result in differences of less than 3 days in onset times (Supplementary Fig. 10; see also the Supplementary Information for further discussion).

### Model calibration

We assume a start date of the epidemic in Wuhan, China, that falls between 15 November 2019 and 1 December 2019, with 20 initial infections49,51,61,62,63. This considers that our model has a posterior distribution for the emergence of the outbreak in China that includes the possibility of transmission starting in October, 2019 (refs. 64,65). The model generates an ensemble of possible epidemic realizations and is initially calibrated using an approximate Bayesian computation (ABC) rejection approach13 based on the observed international importations from mainland China up to 21 January 2020 (ref. 49). Only a fraction of imported cases is generally detected at the destination10,66. According to the estimates proposed in ref. 67, we stratify the detection capacity of countries into three groups: high, medium, and low surveillance capacity according to the Global Health Security Index68, and assume that asymptomatic infections are never detected. The model calibration does not consider correlated importations (for example, family travel) and assumes that travel probabilities are homogeneous across all individuals in each subpopulation. We further calibrate our model using the temporal ordering of the onset of local transmission (as defined in the section ‘Onset of local transmission’) of the countries investigated. If we consider the epidemiological evidence38,69,70, Italy was the first European country to experience substantial community transmission. Therefore, throughout the paper, we constrain the ensemble of simulations focusing only on stochastic realizations in which Italy is the first country, in the group under examination, to experience sustained local transmission (see Supplementary Information for details and further analyses of unconstrained simulations). Furthermore, we perform for each state and country an additional ABC rejection analysis using as evidence the weekly reported deaths in the time window starting on 22 March 2020 and ending on 27 June 2020. A full description of the model calibration is provided in the Supplementary Information.

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.