Main

The novel human coronavirus SARS-CoV-2 is a highly contagious virus, and its disease, COVID-19, can lead to significant morbidity and mortality in a proportion of patients1,2,3. On 12 March 2020, the World Health Organization declared it a global pandemic4. As of 12 May 2020, there were more than 4.2 million confirmed infections globally in more than 180 countries with over 290,000 deaths (https://www.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6).

Detection of SARS-CoV-2 in asymptomatic individuals5,6 suggests that subclinical active infection might be an important contributor to this outbreak. Currently, reported cases of COVID-19 are mainly limited to symptomatic individuals, those having close contact with confirmed patients and those with a history of travel to the epidemic regions. Diagnosis is usually based on a viral RNA test by reverse transcription polymerase chain reaction (RT–PCR), the results of which appear to be sensitive to the assay method and to the timing of specimen collection, transportation and storage7,8. As such, a large number of subclinical and asymptomatic infections might have gone undetected.

Assessing the cumulative prevalence of SARS-CoV-2 infection will help our understanding of the epidemiology of the outbreak, the contagiousness of SARS-CoV-2 and the immunity to COVID-19 in both the vulnerable and general populations9,10. The serological test for the presence of antibodies (IgM or IgG) against SARS-CoV-2 might provide a more accurate estimate of the cumulative prevalence of SARS-CoV-2 infection in a population compared to the viral test, as the antibodies against the virus, in particular IgG, are likely to persist for a longer period of time after the viral infection is cleared. In this study, we conducted a serological survey using a validated assay for IgM and IgG antibodies against the recombinant antigens containing the nucleoprotein and a peptide from the spike protein of SARS-CoV-2 (ref. 11), in a total of 17,368 individuals from four different geographic locations and different populations in China. The survey was conducted between 9 March 2020 and 10 April 2020.

First, we internally validated our serological assay11 with serum samples that had been collected for a different study from 447 patients with end-stage kidney disease before June 2019, that is, before the pandemic. Of these, 50.9% of patients were male, and the median age was 50.0 years. In the validation test, the serological test showed specificity of 99.3% (444 of 447) and 100% (447 of 447) for IgG and IgM antibodies, respectively. In this validation set, three samples gave borderline positive results for IgG antibody in the initial test but were negative upon re-testing; these were counted as false positives. Second, we conducted the assay in serum samples collected from 242 patients with COVID-19 confirmed by a viral nucleic acid RT–PCR test (SARS-CoV-2 ORF1ab and N gene) on nasopharyngeal swabs7. In this testing set, 136 patients were male (56.2%), 32 (13.2%) had severe cases (defined as the need for intensive care and/or mechanical ventilation) and the median age was 47 years. The estimated cumulative seroprevalence for IgM and IgG antibodies is presented in Fig. 1. The cumulative seroprevalence of IgM and IgG was 44% (37–50%, 95% CI) and 56% (49–62%, 95% CI), respectively, on day 7 after the onset of symptoms and exceeded 95% by day 20 and day 16, respectively. The levels of both IgM and IgG antibodies appeared to remain above the cutoff value on day 28. It was surprising to see an earlier and higher level of IgG response to the same SARS-CoV-2 antigen, suggesting that the value of IgM as an early marker for the acute phase of SARS-CoV-2 infection might not be on par with that in other viral infection diagnostics.

Fig. 1: The cumulative seropositive rates of IgM and IgG antibodies to SARS-CoV-2 antigens in 242 patients with COVID-19.

All patients were confirmed by the RT–PCR test for SARS-CoV-2 RNA. One blood sample within 28 d after the onset of symptoms was collected from each patient and tested for IgM and IgG antibodies against SARS-CoV-2. Each serological observation was treated as interval censored, and the cumulative seropositive rates were calculated using Turnbull’s method. The dotted lines indicate 95% CIs of the estimates.
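The estimation described in this legend can be illustrated with a minimal sketch. It assumes that each patient contributes exactly one (sampling day, positive/negative) observation, so the data are "current status" data. For such data, the nonparametric maximum-likelihood estimate that Turnbull's self-consistency algorithm targets can be obtained by isotonic regression of the positive fraction on sampling day, computed here with the pool-adjacent-violators algorithm instead of Turnbull's iteration. The function name and the toy data are hypothetical, this is not the authors' code, and the 95% CIs shown in the figure are not reproduced.

```python
from collections import defaultdict

def current_status_npmle(days, positive):
    """Monotone estimate of the cumulative seropositive fraction.

    days: sampling day (after symptom onset) of each patient's single sample.
    positive: True if that sample was seropositive (seroconversion <= day).
    Returns a list of (day, estimate) sorted by day.
    """
    # Pool patients sampled on the same day: day -> [positives, total].
    counts = defaultdict(lambda: [0, 0])
    for day, is_pos in zip(days, positive):
        counts[day][0] += int(is_pos)
        counts[day][1] += 1

    # Pool-adjacent-violators: merge neighbouring blocks until the
    # block-wise positive fractions are non-decreasing in time.
    blocks = []  # each block: [positives, total, days_in_block]
    for day in sorted(counts):
        pos, tot = counts[day]
        blocks.append([pos, tot, [day]])
        # Cross-multiplied comparison of the two block fractions.
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            pos2, tot2, days2 = blocks.pop()
            blocks[-1][0] += pos2
            blocks[-1][1] += tot2
            blocks[-1][2].extend(days2)

    return [(d, pos / tot) for pos, tot, ds in blocks for d in ds]

# Toy data (hypothetical, not from the study).
toy_days = [2, 5, 5, 8, 8, 12, 15]
toy_positive = [False, True, True, False, True, True, True]
for day, estimate in current_status_npmle(toy_days, toy_positive):
    print(day, round(estimate, 2))
```

In the toy data the raw positive fraction drops from day 5 to day 8, so the two days are pooled into one block, which keeps the estimated cumulative curve non-decreasing.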

After validation of the test, we then evaluated the seroprevalence in different populations in Wuhan, the epicenter of the COVID-19 pandemic in China, including healthcare workers and their family members as well as staff members from the hotels designated for accommodation of the healthcare workers responsible for COVID-19 management. The survey was conducted between 30 March 2020 and 10 April 2020. The seroprevalence was 3.8% (2.6–5.4%, 95% CI) in the cohort of 714 healthcare workers, 3.8% (2.2–6.3%, 95% CI) in the cohort of 346 hotel staff members and 3.2% (1.6–6.4%, 95% CI) in the cohort of 219 family members of the healthcare workers (Table 1).

Table 1 Demographic characteristics and positive rates of the serological and RT–PCR tests in different study populations

To understand the level of exposure to the virus in areas outside of the outbreak epicenter, we surveyed cohorts in two nearby cities to the west of Wuhan, Jingzhou and Honghu of Hubei Province, and estimated a seroprevalence of 1.3% (1.0–1.8%, 95% CI) in 3,091 healthcare workers and 3.6% (2.6–4.9%, 95% CI) in 979 patients who visited the hospital for routine maintenance hemodialysis but did not present symptoms of COVID-19. Moving further west of Wuhan, we found seroprevalence of 3.1% (1.7–5.7%, 95% CI) and 3.8% (2.8–5.2%, 95% CI), respectively, in 319 healthcare workers and 993 hospital outpatients in the city of Chongqing and 0.58% (0.45–0.76%, 95% CI) in 9,442 community residents in Chengdu, the capital of Sichuan Province. Finally, in Guangzhou and Foshan, two major neighboring cities far south of Wuhan, the seroprevalence was 2.8% (1.8–4.6%, 95% CI), 1.2% (0.4–3.3%, 95% CI) and 1.4% (0.6–2.9%, 95% CI), respectively, in cohorts of 563 patients undergoing maintenance hemodialysis, 260 healthcare workers and 442 factory workers.

These results show that seropositive rates in different geographic areas were consistent with the early spread of the SARS-CoV-2 coronavirus in China. Although the highest seropositive rates were observed for IgG, it is important to note that there were individuals with IgM or IgG positivity alone. Therefore, for serologic surveys, both IgM and IgG should be measured (Extended Data Figs. 1 and 2). We also observed that, among the subset of individuals who had been tested for SARS-CoV-2 infection by RT–PCR, very few individuals had detectable SARS-CoV-2 viral nucleic acid sequences in their nasal swabs at the time of the collection. Of the 23 individuals who did test positive in the viral RT–PCR test, only 19 were IgG positive and two were IgM positive. In the entire study population, we did not observe significant differences in seropositive rates between genders (1.6% in males and 1.3% in females), but we did find that seropositivity was significantly higher in individuals aged 65 years or older (2.0% in those ≥65 years versus 1.3% in those <65 years, P < 0.01).

Our study had several limitations. Because our study population was not drawn by random sampling, the estimation of the seroprevalence was subject to potential sampling bias. The sensitivity of the serological test also depends on the time elapsed between the onset of disease and testing. Samples collected from infected individuals outside the time window of antibody response could produce false negatives, and therefore the observed seroprevalence in our study could potentially underestimate the true prevalence rate of the disease. Owing to the cross-sectional design of the current study, the dynamic changes of antibody titer in infected individuals over time were not evaluated, and long-term follow-up will be important to define the value of these serology markers in the estimation of the cumulative attack rate in the future.

Methods

Study design and participants

We enrolled individuals from different populations in seven cities in Hubei Province (Wuhan, Honghu and Jingzhou), Guangdong Province (Guangzhou and Foshan), Sichuan Province (Chengdu) and Chongqing in China. A total of 17,368 individuals, including 2,535 patients, 4,384 healthcare workers, 219 family members of healthcare workers, 346 staff members from hotels designated for accommodation of healthcare workers responsible for COVID-19 management, 442 factory workers and 9,442 community residents, were enrolled. Healthcare workers, family members, outpatients and community residents were recruited as volunteers through a public call. Patients undergoing hemodialysis, hotel staff members and factory workers were required to take the serological test at the participating centers after implementation of regulations for COVID-19 surveillance of these populations during the epidemic in China.

Demographic data, including age, gender, residential region and occupation of each participant, were collected. The participants were screened for SARS-CoV-2 infection by a serological test for IgG and IgM antibodies against a recombinant antigen of the virus. For some study populations, RT–PCR tests for viral RNA from nasopharyngeal swabs were also conducted.

Ethics approval

The Medical Ethics Committees of Nanfang Hospital, Sichuan Provincial People’s Hospital and Chongqing Medical University approved this study, and all participants signed a consent form.

Laboratory measurements

Nasopharyngeal swabs were collected in each local facility and tested for SARS-CoV-2 RNA in a designated virology laboratory in the region of collection using the real-time RT–PCR assay as previously reported7.

Serum samples were collected in clinical laboratories in local outpatient clinics or hospitals. All samples were inactivated at 56 °C for 30 min and stored at −20 °C before testing. The IgG and IgM antibodies against SARS-CoV-2 were measured using a commercially available Magnetic Chemiluminescence Enzyme Immunoassay Kit (Bioscience) according to the manufacturer’s instructions. Antibody levels were expressed as the ratio of the chemiluminescence signal over the cutoff (S/CO) value. An S/CO value higher than 1.0 for either IgG or IgM was regarded as positive.
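The positivity rule stated above can be written as a short sketch. The function and parameter names are hypothetical; only the cutoff of 1.0 and the "either IgG or IgM" rule come from the text.

```python
def is_seropositive(igg_sco: float, igm_sco: float, cutoff: float = 1.0) -> bool:
    """Return True if either antibody's S/CO value is strictly above the cutoff."""
    return igg_sco > cutoff or igm_sco > cutoff

# Illustrative S/CO values (hypothetical, not from the study).
print(is_seropositive(igg_sco=1.8, igm_sco=0.4))  # IgG alone exceeds the cutoff
print(is_seropositive(igg_sco=0.6, igm_sco=0.9))  # neither exceeds the cutoff
```

The comparison is strict, so a sample with an S/CO value of exactly 1.0 is treated as negative under this reading of "higher than 1.0".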

We independently validated the serological assay using sera of 447 patients with end-stage kidney disease that were collected before June 2019 as a negative control and sera of 242 patients with COVID-19 confirmed by the viral RT–PCR test as a positive control and calculated the sensitivity and specificity of the assay.
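As a small worked example of the specificity calculation, the sketch below uses the counts reported in the main text for the 447 pre-pandemic negative-control sera (444 IgG-negative and 447 IgM-negative). The function name is hypothetical.

```python
def specificity(true_negatives: int, total_negative_controls: int) -> float:
    """Fraction of known-negative samples that the assay reports as negative."""
    if total_negative_controls <= 0:
        raise ValueError("total_negative_controls must be positive")
    return true_negatives / total_negative_controls

igg_specificity = specificity(444, 447)  # three IgG false positives
igm_specificity = specificity(447, 447)  # no IgM false positives
print(f"IgG: {100 * igg_specificity:.1f}%  IgM: {100 * igm_specificity:.1f}%")
```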

Statistical analysis

Continuous variables were presented as median (25th and 75th percentiles) and categorical variables as counts and percentages. The 95% CI of the seroprevalence was calculated from binomial probabilities using Wilson’s method. Each serological observation in patients with COVID-19 was treated as interval censored (left censored for positive observations and right censored for negative observations), and the cumulative incidence was calculated using Turnbull’s method12.
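For readers who wish to check a reported interval, the following is a minimal sketch of the Wilson score interval without continuity correction. Whether a continuity correction was applied in the study is not stated, and the count of 27 positives among the 714 Wuhan healthcare workers is an assumption, chosen only because it rounds to the reported 3.8%.

```python
import math

def wilson_interval(positives: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (default z gives 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = positives / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width

# Assumed count: 27 of 714 (about 3.8%); the exact count is not given in the text.
low, high = wilson_interval(27, 714)
print(f"{100 * low:.1f}% to {100 * high:.1f}%")
```

Unlike the simple normal-approximation interval, the Wilson interval is not symmetric around the observed proportion, which is why the reported upper limits lie further from the point estimates than the lower limits at these low prevalences.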

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.