Main

SARS-CoV-2, the causative agent of COVID-19, spreads efficiently, with a basic reproductive number of 2.2 to 2.5 determined in Wuhan1,2. The effectiveness of control measures depends on several key epidemiological parameters (Fig. 1a), including the serial interval (duration between symptom onsets of successive cases in a transmission chain) and the incubation period (time between infection and onset of symptoms). Variation between individuals and transmission chains is summarized by the incubation period distribution and the serial interval distribution, respectively. If the observed mean serial interval is shorter than the observed mean incubation period, this indicates that a significant portion of transmission may have occurred before infected persons have developed symptoms. Significant presymptomatic transmission would probably reduce the effectiveness of control measures that are initiated by symptom onset, such as isolation, contact tracing and enhanced hygiene or use of face masks for symptomatic persons.

Fig. 1: Transmission of infectious diseases.

a, Schematic of the relation between different time periods in the transmission of infectious disease. b, Human-to-human transmission pairs of SARS-CoV-2 virus (N = 77). We assumed a maximum exposure window of 21 days prior to symptom onset of the secondary cases. Detailed information on the transmission pairs and the source of information is summarized in Supplementary Tables 2 and 3. c, Estimated serial interval distribution (top), inferred infectiousness profile (middle) and assumed incubation period (bottom) of COVID-19.

SARS (severe acute respiratory syndrome) was notable, because infectiousness increased around 7–10 days after symptom onset3,4. Onward transmission can be substantially reduced by containment measures such as isolation and quarantine (Fig. 1a)5. In contrast, influenza is characterized by increased infectiousness shortly around or even before symptom onset6.

In this study, we compared clinical data on virus shedding with separate epidemiologic data on incubation periods and serial intervals between cases in transmission chains, to draw inferences on infectiousness profiles.

Among 94 patients with laboratory-confirmed COVID-19 admitted to Guangzhou Eighth People’s Hospital, 47/94 (50%) were male, the median age was 47 years and 61/93 (66%) were moderately ill (with fever and/or respiratory symptoms and radiographic evidence of pneumonia), but none were classified as ‘severe’ or ‘critical’ on hospital admission (Supplementary Table 1).

A total of 414 throat swabs were collected from these 94 patients, from symptom onset up to 32 days after onset. We detected high viral loads soon after symptom onset, which then gradually decreased towards the detection limit at about day 21. There was no obvious difference in viral loads across sex, age groups and disease severity (Fig. 2).

Fig. 2: Temporal patterns of viral shedding.

Viral load (threshold cycle (Ct) values) detected by RT–PCR (PCR with reverse transcription) in throat swabs from patients infected with SARS-CoV-2 (N = 94), overall and stratified by disease severity, sex, age group and link to Hubei province. The detection limit was Ct = 40, which was used to indicate negative samples. The thick lines show the trend in viral load, using smoothing splines. We added some noise to the data points to avoid overlaps.

Separately, based on 77 transmission pairs obtained from publicly available sources within and outside mainland China (Fig. 1b and Supplementary Table 2), the serial interval was estimated to have a mean of 5.8 days (95% confidence interval (CI), 4.8–6.8 days) and a median of 5.2 days (95% CI, 4.1–6.4 days) based on a fitted gamma distribution, with 7.6% negative serial intervals (Fig. 1c). Assuming an incubation period distribution of mean 5.2 days from a separate study of early COVID-19 cases1, we inferred that infectiousness started from 12.3 days (95% CI, 5.9–17.0 days) before symptom onset and peaked at symptom onset (95% CI, −0.9 to 0.9 days) (Fig. 1c). We further estimated that <0.1% of transmission would occur more than 7 days, 1% more than 5 days and 9% more than 3 days before symptom onset. The estimated proportion of presymptomatic transmission (area under the curve) was 44% (95% CI, 30–57%). Infectiousness was estimated to decline quickly within 7 days. Viral load data were not used in the estimation but showed a similar monotonic decreasing pattern.
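The "area under the curve" definition of presymptomatic transmission can be illustrated with a minimal Python sketch (the analyses in this study were run in R; this is not the authors' code). It assumes the shifted-gamma infectiousness profile described in Methods, for which the presymptomatic proportion equals the gamma cumulative distribution evaluated at the shift c. The function names and all parameter values are hypothetical and are not the fitted estimates.

```python
import math

def gamma_pdf(x, shape, rate):
    """Density of a gamma(shape, rate) distribution; zero for x <= 0."""
    if x <= 0:
        return 0.0
    log_density = (shape * math.log(rate) + (shape - 1) * math.log(x)
                   - rate * x - math.lgamma(shape))
    return math.exp(log_density)

def presymptomatic_fraction(shape, rate, c, n_steps=20000):
    """Area under beta_c(t) = beta(t + c) for t < 0.

    This is the gamma CDF at c, approximated here with the midpoint rule
    on [0, c] (an assumption of this sketch, chosen to avoid extra libraries).
    """
    if c <= 0:
        return 0.0  # infectiousness starts at or after symptom onset
    h = c / n_steps
    return h * sum(gamma_pdf((i + 0.5) * h, shape, rate) for i in range(n_steps))

# Hypothetical profile: shape 2, rate 0.7 per day, starting 2.5 days before onset.
print(round(presymptomatic_fraction(2.0, 0.7, 2.5), 3))
```

Under this parameterization, moving the start of infectiousness earlier (larger c) increases the presymptomatic proportion, all else being equal.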

In sensitivity analysis, using the same estimating procedure but holding constant the start of infectiousness from 5, 8 and 11 days before symptom onset, infectiousness was shown to peak at 2 days before to 1 day after symptom onset, and the proportion of presymptomatic transmission ranged from 37% to 48% (Extended Data Fig. 1).

Finally, simulation showed that the proportion of short serial intervals (for example, <2 days) would be larger if infectiousness were assumed to start before symptom onset (Extended Data Fig. 2). Given the 7.6% negative serial intervals estimated from the infector–infectee paired data, start of infectiousness at least 2 days before onset and peak infectiousness at 2 days before to 1 day after onset would be most consistent with this observed proportion (Extended Data Fig. 3).

Here, we used detailed information on the timing of symptom onsets in transmission pairs to infer the infectiousness profile of COVID-19. We showed substantial transmission potential before symptom onset. Of note, most cases were isolated after symptom onset, preventing some post-symptomatic transmission. Even higher proportions of presymptomatic transmission of 48% and 62% have been estimated for Singapore and Tianjin, where active case finding was implemented7. Places with active case finding would tend to have a higher proportion of presymptomatic transmission, mainly due to quick quarantine of close contacts and isolation, thus reducing the probability of secondary spread later on in the course of illness. In a rapidly expanding epidemic wherein contact tracing/quarantine and perhaps even isolation are no longer feasible, or in locations where cases are not isolated outside the home, we should therefore observe a lower proportion of presymptomatic transmission.

Our analysis suggests that viral shedding may begin 5 to 6 days before the appearance of the first symptoms. After symptom onset, viral loads decreased monotonically, consistent with two recent studies8,9. Another study from Wuhan reported that virus was detected for a median of 20 days (up to 37 days among survivors) after symptom onset10, but infectiousness may decline significantly 8 days after symptom onset, as live virus could no longer be cultured (according to Wölfel and colleagues11). Together, these results support our findings that the infectiousness profile may more closely resemble that of influenza than of SARS (Fig. 1a), although we did not have data on viral shedding before symptom onset6,12. Our results are also supported by reports of asymptomatic and presymptomatic transmission13,14.

For a reproductive number of 2.5 (ref. 2), contact tracing and isolation alone are less likely to be successful if more than 30% of transmission occurred before symptom onset, unless >90% of the contacts can be traced15. This is more likely achievable if the definition of contacts covers 2 to 3 days prior to symptom onset of the index case, as has been done in Hong Kong and mainland China since late February. Even when the control strategy is shifting away from containment to mitigation, contact tracing would still be an important measure, such as when there are super-spreading events that may occur in high-risk settings including nursing homes or hospitals. With a substantial proportion of presymptomatic transmission, measures such as enhanced personal hygiene and social distancing for all would likely be the key instruments for community disease control.

Our study has several limitations. First, symptom onset relies on patient recall after confirmation of COVID-19. The potential recall bias would probably have tended toward the direction of under-ascertainment, that is, delay in recognizing first symptoms. As long as these biases did not differ systematically between infector and infectee, the serial interval estimate would not be substantially affected. However, the incubation period would have been overestimated, and thus the proportion of presymptomatic transmission artifactually inflated. Second, shorter serial intervals than those reported here have been reported, but such estimates lengthened when restricted to infector–infectee pairs with more certain transmission links16. Finally, the viral shedding dynamics were based on data for patients who received treatment according to nationally promulgated protocols, including combinations of antivirals, antibiotics, corticosteroids, immunomodulatory agents and Chinese medicine preparations, which could have modified the shedding dynamical patterns.

In conclusion, we have estimated that viral shedding of patients with laboratory-confirmed COVID-19 peaked on or before symptom onset, and a substantial proportion of transmission probably occurred before first symptoms in the index case. More inclusive criteria for contact tracing to capture potential transmission events 2 to 3 days before symptom onset should be urgently considered for effective control of the outbreak.

Methods

Sources of data

Guangzhou Eighth People’s Hospital in Guangdong, China was designated as one of the specialized hospitals for treating patients with COVID-19 at both city and provincial levels on 20 January 2020. After that, many people with COVID-19 were admitted via fever clinics, the hospital emergency room or after confirmation of cases from community epidemiological investigation carried out by the Guangzhou Center for Disease Control and Prevention, or transferred from other hospitals. The first confirmed patient with COVID-19 was admitted on 21 January 2020, but in the initial phase, patients suspected to have COVID-19 were also admitted. We identified all suspected and confirmed COVID-19 cases admitted from 21 January 2020 to 14 February 2020 and collected throat swabs in each case. Patients included those who traveled from Wuhan or Hubei to Guangzhou as well as locals, with cases ranging from asymptomatic through mild to moderate at admission.

The samples were tested by N-gene-specific quantitative RT–PCR assay as previously described17. To understand the temporal dynamics of viral shedding and exclude non-confirmed COVID-19 cases, we selected 94 patients who had at least one positive result (cycle threshold (Ct) value < 40) in their throat samples. Serial samples were collected from some but not all patients for clinical monitoring purposes.

We collected information reported on possible human-to-human transmission pairs of patients with laboratory-confirmed COVID-19 from publicly available sources, including announcements made by government health agencies and media reports in mainland China and countries/regions outside China. A transmission pair was defined as two confirmed COVID-19 cases identified in the epidemiologic investigation by showing a clear epidemiologic link with each other, such that one case (infectee) was highly likely to have been infected by the other (infector), by fulfilling the following criteria: (1) the infectee did not report a travel history to an area affected by COVID-19 or any contact with other confirmed or suspected COVID-19 cases except for the infector within 14 days before symptom onset; (2) the infector and infectee were not identified in a patient cluster where other COVID-19 cases had also been confirmed; and (3) the infector and infectee pair did not share a common source of exposure to a COVID-19 case or a place where there were COVID-19 case(s) reported. We excluded possible transmission pairs without a clear exposure history reported prior to symptom onset. Data of possible transmission pairs of COVID-19 were extracted, including age, sex, location, date of symptom onset, type or relationship between the pair cases and time of contact of the cases.

Statistical analysis

We analyzed two separate data sets—clinical and epidemiologic—to assess presymptomatic infectiousness. First, we assessed longitudinal viral shedding data from patients with laboratory-confirmed COVID-19 starting from symptom onset, where viral shedding during the first few days after illness onset could be compared with the inferred infectiousness. Second, the serial intervals from clear transmission chains, combined with information on the incubation period distribution, were used to infer the infectiousness profile, as described in the following.

We present SARS-CoV-2 viral loads in the throat swabs of each patient by day after symptom onset. To aid visualization, a smoothing spline was fitted to the Ct values to summarize the overall trend. Specifically, a generalized additive model, E(Y) = β0 + s(t), with an identity link was fitted, where Y is the Ct value, β0 is the intercept and s(t) is a cubic spline evaluated at t days after symptom onset. We also compared the viral load by disease severity, age, sex and travel history from Hubei.

We fitted a gamma distribution to the transmission pairs data to estimate the serial interval distribution. To infer infectiousness with respect to symptom onset, we used a published estimate of the incubation period distribution from the first 425 patients with COVID-19 in Wuhan with detailed exposure history1. We considered that infected cases would become infectious at a certain time point before or after illness onset (tS1). Infectiousness (that is, transmission probability to a secondary case) would then increase until reaching its peak (Fig. 1). The transmission event would occur at time tI with a probability described by the infectiousness profile βc(tI − tS1) relative to the illness onset date, assuming a gamma distribution β(t) with a time shift c to allow for start of infectiousness c days prior to symptom onset; that is, βc(t) = β(t + c). The secondary case would then show symptoms at time tS2, after the incubation period that is assumed to follow a lognormal distribution g(tS2 − tI). Hence the observed serial interval distribution f(tS2 − tS1) would be the convolution between the infectiousness profile and incubation period distribution. We constructed a likelihood function based on the convolution, which was fitted to the observed serial intervals, allowing for the start of infectiousness around symptom onset and for a window of symptom onset of the infector (tS1l, tS1u), given by

$$L\left(t_{\mathrm{S1u}}, t_{\mathrm{S1l}}, t_{\mathrm{S2}} \mid \theta\right) = \int_{t_{\mathrm{S1l}}}^{t_{\mathrm{S1u}}} \int_{-\infty}^{t_{\mathrm{S2}}} \beta_c\left(t_{\mathrm{I}} - t_{\mathrm{S1}}\right) g\left(t_{\mathrm{S2}} - t_{\mathrm{I}}\right) \mathrm{d}t_{\mathrm{I}}\, \mathrm{d}t_{\mathrm{S1}}$$

A normalization factor can be added to account for the uncertainty in the symptom-onset dates of the index cases. Assuming a uniform distribution, the likelihood would differ only by a multiplicative constant and give the same estimates.
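The convolution and likelihood above can be sketched numerically. The following is a minimal Python illustration (the original analysis was done in R, and this is not the authors' code), assuming midpoint-rule integration and hypothetical parameter values. In particular, the lognormal `sdlog` is an assumed value, and `meanlog` is chosen only so that the incubation mean is 5.2 days. All function names are illustrative.

```python
import math

def gamma_pdf(x, shape, rate):
    """Density beta(x) of a gamma(shape, rate) distribution; zero for x <= 0."""
    if x <= 0:
        return 0.0
    return math.exp(shape * math.log(rate) + (shape - 1) * math.log(x)
                    - rate * x - math.lgamma(shape))

def lognormal_pdf(x, meanlog, sdlog):
    """Density g(x) of the incubation period; zero for x <= 0."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - meanlog) / sdlog
    return math.exp(-0.5 * z * z) / (x * sdlog * math.sqrt(2 * math.pi))

def serial_interval_density(s, shape, rate, c, meanlog, sdlog, step=0.05):
    """f(s) = integral of beta_c(t) * g(s - t) dt, with beta_c(t) = beta(t + c).

    The integrand is non-zero only for -c < t < s, so we integrate there.
    """
    lo, hi = -c, s
    if hi <= lo:
        return 0.0
    n = max(1, math.ceil((hi - lo) / step))
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        t = lo + (i + 0.5) * h  # midpoint of the i-th sub-interval
        total += gamma_pdf(t + c, shape, rate) * lognormal_pdf(s - t, meanlog, sdlog)
    return total * h

def pair_likelihood(ts1_lower, ts1_upper, ts2, shape, rate, c, meanlog, sdlog,
                    n_steps=20):
    """Likelihood of one pair, integrating over the infector's onset window."""
    if ts1_upper <= ts1_lower:
        # Exact onset date: use the density directly (constant factors per
        # pair do not change the maximum likelihood estimates).
        return serial_interval_density(ts2 - ts1_lower, shape, rate, c, meanlog, sdlog)
    h = (ts1_upper - ts1_lower) / n_steps
    return h * sum(
        serial_interval_density(ts2 - (ts1_lower + (i + 0.5) * h),
                                shape, rate, c, meanlog, sdlog)
        for i in range(n_steps))

def negative_log_likelihood(pairs, shape, rate, c, meanlog, sdlog):
    """Sum of -log L over (ts1_lower, ts1_upper, ts2) tuples."""
    return -sum(math.log(max(pair_likelihood(*p, shape, rate, c, meanlog, sdlog), 1e-300))
                for p in pairs)

# Hypothetical inputs: two pairs, one with a 2-day onset window for the infector.
SDLOG = 0.66                                 # assumed, not from the source
MEANLOG = math.log(5.2) - 0.5 * SDLOG ** 2   # gives an incubation mean of 5.2 days
pairs = [(0.0, 0.0, 5.0), (0.0, 2.0, 4.0)]
print(negative_log_likelihood(pairs, 2.0, 0.7, 2.5, MEANLOG, SDLOG))
```

Minimizing `negative_log_likelihood` over the gamma parameters and c with any general-purpose optimizer would correspond to the maximum likelihood step; no particular optimizer is implied here.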

Parameters θ, including the gamma distribution parameters and the start of infectiousness, were estimated using maximum likelihood. The 95% CIs were obtained by bootstrapping with 1,000 replications. We also performed sensitivity analyses by fixing the start of infectiousness from days 5, 8 and 11 before symptom onset and inferred the infectiousness profile.
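The bootstrap step can be sketched as follows in Python (again, not the authors' R code). This minimal example assumes a percentile bootstrap, which the text does not specify, and uses the sample mean as a stand-in statistic; in the actual procedure the statistic would be the full maximum likelihood fit repeated on each resample. The data values are hypothetical.

```python
import random
import statistics

def bootstrap_ci(data, statistic, n_boot=1000, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for statistic(data).

    Resamples the observations with replacement n_boot times and returns
    the empirical alpha/2 and 1 - alpha/2 quantiles of the re-estimates.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical serial intervals in days (not the study data).
serial_intervals = [4, 6, 3, 7, 5, -1, 9, 2, 8, 5, 6, 4]
print(bootstrap_ci(serial_intervals, statistics.mean))
```

Resampling whole transmission pairs, rather than individual dates, keeps each infector and infectee linked in every replicate.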

As an additional check, we simulated the expected serial intervals assuming the same aforementioned incubation period but two different infectiousness profiles, in which infectiousness started on the day of symptom onset and 2 days before symptom onset, respectively. A recent study isolated live infectious SARS-CoV-2 virus from patients with COVID-19 up to 8 days after symptom onset11; we therefore assumed the same duration of infectiousness. We also assumed that infectiousness peaked on the day of symptom onset. The timing of transmission to secondary cases was simulated according to the infectiousness profile using a lognormal and exponential distribution, respectively, where the serial intervals were estimated as the sum of the onset to transmission interval and the incubation period. We drew random samples for the transmission time relative to symptom onset of the infector, TI ~ βc(t), and for the incubation period, Tinc ~ g(t); the simulated serial interval was then TI + Tinc. We also performed simulation considering combinations of different infectiousness profiles, with start of infectiousness 7 days before to 3 days after symptom onset, and peak infectiousness also 7 days before to 3 days after symptom onset. We present the distribution of the serial intervals and proportion of negative serial intervals over 10,000 simulations.
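The sampling step (serial interval as the sum of transmission time and incubation period) can be illustrated with a short Python simulation. This is a simplified sketch, not the authors' R code: it draws the transmission time from a shifted gamma profile rather than the lognormal and exponential profiles described above, and all parameter values are hypothetical.

```python
import math
import random
import statistics

def simulate_serial_intervals(n, shape, rate, c, meanlog, sdlog, seed=1):
    """Draw n serial intervals as T_I + T_inc.

    T_I   : gamma(shape, rate) draw minus c, so transmission can occur up
            to c days before the infector's symptom onset.
    T_inc : lognormal(meanlog, sdlog) incubation period of the infectee.
    """
    rng = random.Random(seed)
    intervals = []
    for _ in range(n):
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        t_i = rng.gammavariate(shape, 1.0 / rate) - c
        t_inc = rng.lognormvariate(meanlog, sdlog)
        intervals.append(t_i + t_inc)
    return intervals

def proportion_negative(intervals):
    """Share of simulated pairs in which the infectee shows symptoms first."""
    return sum(1 for s in intervals if s < 0) / len(intervals)

# Hypothetical values: sdlog is assumed; meanlog gives an incubation mean of 5.2 days.
SDLOG = 0.66
MEANLOG = math.log(5.2) - 0.5 * SDLOG ** 2

for start in (0.0, 2.0):  # infectiousness starts 0 or 2 days before onset
    sims = simulate_serial_intervals(10000, 2.0, 0.7, start, MEANLOG, SDLOG)
    print(start, round(statistics.mean(sims), 2), proportion_negative(sims))
```

With no shift (c = 0), both components are positive and no negative serial intervals can arise, so negative intervals appear in this sketch only when infectiousness is allowed to start before symptom onset.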

All statistical analyses were conducted in R version 3.6.3 (R Development Core Team).

Ethics approval

Data collection and analysis were required by the National Health Commission of the People’s Republic of China to be part of a continuing public health outbreak investigation.

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.