Main

The first case of COVID-19 in New York City (NYC) was identified at Mount Sinai Hospital on 29 February 2020 and confirmed on 1 March 2020 (ref. 1). A sharp rise in infections occurred shortly afterwards during the week ending on 8 March, followed by a strong increase in the number of COVID-19-related deaths during the week ending on 15 March (Fig. 1). New York State (NYS) implemented a stay-at-home order on 22 March 2020, after which daily case numbers in NYC started to plateau and then decrease in April and May 2020. There was little capacity for nucleic acid amplification testing (NAAT) at the beginning of the local epidemic in early March, and many asymptomatic and mild-to-moderate cases probably went undetected.

Fig. 1: Seroprevalence, antibody titres against the full-length spike protein and the number of confirmed cases in NYC.

a, Serum antibody prevalence in the urgent care group between the weeks ending 9 February (first two weeks combined) to 5 July 2020. b, Serum antibody prevalence during that same timeframe in the routine care group. c, Antibody titres against the full-length spike protein in the urgent care group in the sampled time period. d, Spike end-point titres in the routine care group. e, Numbers of confirmed SARS-CoV-2 cases and deaths in NYC for the same time frame. f, Comparison of end-point titres in the routine care (n = 733) and urgent care (n = 1,067) groups. Red lines indicate the geometric mean. The P value was obtained using a two-tailed Mann–Whitney U-test. Data for confirmed COVID-19 cases and deaths in NYC were retrieved from https://www1.nyc.gov/site/doh/covid/covid-19-data.page. Each plasma sample was tested once. Sample numbers and percentages per week are indicated in a, b. a, b, Lines show the mean and error bars represent the 95% confidence intervals. c, d, Bars show the geometric mean titres. c, d, f, The dashed grey line shows the limit of detection.


Although it is currently unknown whether previous infection with SARS-CoV-2 can protect against reinfection, data from animal models as well as from studies with other human coronaviruses suggest that infection may confer immunity2,3. It is therefore important to determine the true infection rates in a population to assess how close this population is to potential ‘community immunity’4. Knowing the true infection rate also enables the calculation of the infection fatality rate, which is probably much lower than the case fatality rate. To estimate the true infection rates, sero-surveys can be used that measure the presence of antibodies to past virus infections, rather than the presence of the virus5. Serological assays that measure antibodies against SARS-CoV-2 rely on the viral nucleoprotein, the spike protein on the virus surface or the receptor binding domain (RBD) of the spike protein, which is an important part of this protein that interacts with angiotensin-converting enzyme 2 (ACE2), the host cellular receptor that mediates SARS-CoV-2 entry into cells6. A two-step enzyme-linked immunosorbent assay (ELISA) has recently been established in which serum or plasma samples are pre-screened at a set dilution for reactivity to the RBD of the spike. Positive samples in this first step are confirmed and the antibody titre is assessed in a second ELISA against the full-length spike protein7,8. The use of two sequential assays reduces the false-positive rate and favours high specificity, which results in a sensitivity of 95% and a specificity of 100% (Extended Data Table 1).

Sero-survey strategy

In the week of 9 February 2020, we started to collect residual, random, de-identified, cross-sectional plasma samples that were originally obtained for standard-of-care medical purposes. These samples were divided into two distinct groups. The first group included samples from patients seen in the emergency department at Mount Sinai Hospital and from patients that were admitted to the hospital for urgent care during the period starting from the week ending on 9 February to the week ending on 5 July 2020. This group, termed the urgent care group, served as a positive-control group designed to detect the increasing number of SARS-CoV-2 infections as we assumed that individuals with moderate-to-severe COVID-19 would seek care at the emergency department and be admitted to the hospital at increasing rates as the local epidemic progressed. The second group of samples, termed the routine care group, were obtained from patients who visited the obstetrics and gynaecology department and visits for labour and deliveries, oncology-related visits, as well as hospitalizations owing to elective surgeries, transplant surgeries, pre-operative medical assessments and related outpatient visits, cardiology office visits and other regular office and/or treatment visits. We reasoned that these samples might resemble the general population more closely, because the purposes for these scheduled visits were unrelated to COVID-19. The urgent care group comprised 45.5% female patients whereas the routine care group included 67.6% female patients (Table 1). The majority of individuals in the urgent care group were at least 61 years of age, whereas the routine care group had a more balanced age distribution that more closely resembled the general population, although the age group of 0–20 year olds was severely underrepresented (Table 1). 
Except for the weeks ending 9 February and 16 February, for which only 16 samples were obtained across the 2 weeks (3 in the urgent care group and 13 in the routine care group), the urgent care group size ranged between 168 and 274 samples per week and the routine care group included 231–493 samples per week (Extended Data Tables 2, 3). A total of 10,691 samples obtained between the weeks ending 9 February and 5 July 2020 were tested: 6,590 samples in the routine care group and 4,101 in the urgent care group.

Table 1 Demographic characteristics, seroprevalence and COVID-19 diagnosis in different study populations.

Seroprevalence in the urgent care group

In the urgent care group, no sero-positive samples were detected in the weeks ending on 9 and 16 February, and low seroprevalence was found during the weeks ending on 23 February–15 March (seroprevalence ranged between 1.4 and 3.2%) (Fig. 1a). A sharp increase to 6.2% was detected in the week ending on 22 March, concurrent with the increase in new confirmed SARS-CoV-2 infections and related hospitalizations. This increase continued in the weeks ending on 29 March (17.4%), 5 April (46.7%) and 12 April (56.7%). Seroprevalence in this group peaked at 61.7% in the week ending on 19 April, then declined to 21.9% by the week ending on 24 May and remained flat thereafter, matching the decrease in new COVID-19 diagnoses. For many samples in the urgent care group, NAAT results for SARS-CoV-2 infection were available and the rate of positivity in nasopharyngeal swab specimens tracked well with seropositivity (Extended Data Fig. 1). The seroprevalence values that we report reflect hospital admissions due to COVID-19, although the increase in positive serology results lagged approximately 1–2 weeks behind the increased number of SARS-CoV-2 infections confirmed by NAAT. This is expected as there is usually a delay between acute infection and seroconversion.

Seroprevalence in the routine care group

Similar to the urgent care group, the seroprevalence found in the routine care group was very low during the weeks ending on 9 February–29 March (ranging from 0% to 2%) (Fig. 1b). Notably, some samples during that time had moderately high reactivity (endpoint titres of 1:150–1:400) (Fig. 1d). An increase in seroprevalence from 1.6% to 2.2% was detected in the week ending on 29 March, and to 10.1% and 11.7% in the following weeks, reaching 19.1% seroprevalence in the week ending on 19 April. From 19 April to 5 July, the seroprevalence stabilized at the 20% level in the routine care group, whereas the number of confirmed cases of COVID-19 in NYC rapidly declined (Extended Data Fig. 2a). Notably, the delay between the sharp increase in SARS-CoV-2 detection by NAAT in NYC (Fig. 1e) and the increase in seroprevalence in the routine care group is longer than the delay between the increase in the number of confirmed cases and the increase in seroprevalence in the urgent care group (Fig. 1a, b). This may be attributed to different antibody kinetics in mild cases, which were recently shown to be delayed9,10,11,12,13 and which probably constitute the majority of infections in the routine care group. We also analysed the increase in seroprevalence over the observation period compared with the official cumulative numbers of cases. Although there is an overall good agreement, the curve fit for the increase in seroprevalence seems to have a steeper slope than the curves of the cumulative number of cases (Extended Data Fig. 2c).

The antibody titres detected in both groups were initially lower and gradually increased to titres as high as 1:51,200 (Fig. 1c, d). Overall, the titres in the urgent care group were significantly higher than in the routine care group (Fig. 1f), which is probably a function of the disease severity in individuals in the urgent care group. Notably, SARS-CoV-2 spike antibody titres detected in the routine care specimens were stable from 5 April onwards (Extended Data Fig. 2b) although the seroprevalence did not increase further after 19 April and the recorded number of cases decreased (Extended Data Fig. 2a), indicating that there was no appreciable waning of SARS-CoV-2 spike-specific IgG in the community during this time frame.

Seroprevalence in different subgroups

To determine which patient subgroup(s) were driving the increase in seroprevalence, we further divided the routine care group into four subgroups: (1) obstetrics and gynaecology visits and labour and deliveries (n = 2,643 samples); (2) oncology visits and treatment hospitalizations (n = 2,592); (3) surgery, including various elective surgeries, transplant surgeries, pre-operative medical assessments and related visits (n = 1,072); and (4) cardiology, including cardiology office visits and other regular office or treatment visits (n = 283). The rise in seroprevalence was mostly driven by obstetrics and gynaecology visits and labour and deliveries, which showed an early increase in seroprevalence in the week ending on 29 March (2.8%) followed by a continued increase to 9.6%, 15.6% and 26.8% in the weeks ending on 5, 12 and 19 April, respectively. Seroprevalence in this subgroup then stabilized at 24.5% in the week ending on 24 May and remained at about 20% until the week ending on 5 July (Fig. 2a). Seroprevalence in patients of the oncology subgroup increased during the same time frame, but more slowly and steadily, reaching 19.0% in the week ending on 5 July (Fig. 2b). Similar seroprevalence trends were observed in the surgery and cardiology and other office visits subgroups, although the small number of specimens—which was driven in part by a pause in elective hospital procedures and surgeries due to the pandemic in April and May—limited conclusions for these subgroups individually as the confidence intervals were very wide (Fig. 2c, d).

Fig. 2: Seroprevalence in the different screening subclasses over time.

a–d, Seroprevalence for obstetrics and gynaecology (a), oncology (b), surgery (c) and cardiology and related office visits (d) groups in the weeks ending 9 February (first two weeks combined) to 5 July 2020. Sample numbers and percentages per week are indicated. Error bars represent the 95% confidence intervals. Each plasma sample was tested once.


Discussion

Our study provides a window into the extent of seroprevalence in NYC throughout the first wave of the COVID-19 pandemic when the city emerged as an early epicentre. Although our specimen sampling approach results in a biased representation of the entire metropolitan area, it provides valuable insights on the dynamic nature of SARS-CoV-2 antibody responses at the population level.

The urgent care group was designed as a positive-control group that would become enriched in seropositive individuals during the epidemic as individuals with an acute SARS-CoV-2 infection would seek care in the emergency department and would be admitted to the hospital in large numbers. As anticipated, a very high seroprevalence and a rapid increase in titres were detected in this group owing to the marked surge in SARS-CoV-2-related visits to the emergency department and hospital admissions between February and April 2020. During that phase of the epidemic in NYC, the seroprevalence and percentage of NAAT-positive individuals tracked well in the group. If no or very few new cases of SARS-CoV-2 occur, the seroprevalence in this group should drop and track closely with the seroprevalence in the general population, which is what we observed from the end of May to beginning of July. Although this group cannot be used to determine the general seroprevalence during a pandemic, it confirmed that our sero-surveillance was working and increased our confidence in the results of the routine care group. Including similar enriched, positive-control groups into sero-surveillance studies, especially at the beginning of an outbreak, could be valuable to validate serological methods. On its own, monitoring seropositivity of emergency department and admitted patients might also be a valuable tool to track an epidemic if NAAT is not available.

By contrast, the routine care group was designed to resemble the NYC population more closely, albeit with several caveats: 0–20-year-old individuals were underrepresented (5.4% versus 25.9%), above-61-year-old individuals were overrepresented (31.0% versus 16.1%) (Table 1) and female patients were also overrepresented (67.6% versus 52.5%) compared with the population of NYC. Additional biases include that vulnerable individuals (for example, pregnant women and patients with cancer) were probably more cautious to avoid exposure to the virus. Conversely, owing to their need to access medical care during the lockdown period, their risk for virus exposure may also have increased. The seroprevalence in this group consequently may be either an underrepresentation or overrepresentation of the seroprevalence in the general population.

The seroprevalence that we found in the routine care group is consistent with data from Columbia University that showed that 15.4% of pregnant women who delivered infants at their facilities between 22 March and 4 April 2020 were infected with SARS-CoV-2 (ref. 14). This tracks well with seroprevalence in the routine care group, which was between 10.1 and 19.1% in the weeks following 5 April. A sero-survey conducted by the NYS Department of Health determined that between 19 and 28 April, the seroprevalence for SARS-CoV-2 in the NYC metropolitan region15 was 22.7%, which matches very well with the data for our routine care group from the week ending on 19 April. Finally, a study16 by the US Centers for Disease Control and Prevention found a seroprevalence of 6.9% in NYC in samples obtained between 23 March and 1 April 2020. During this time frame (weeks ending on 22 March–5 April), we saw an increase from 1.6% to 10.1%. Although all three of these single-time-point studies used different methodologies and samples, the seroprevalence in our routine care group agrees very well with their results and indicates that this group, despite the strong biases, probably reflects seroprevalence in NYC relatively well.

The first seropositive samples in our study were already detected during the week of 23 February, one week before the first confirmed case of SARS-CoV-2 in NYC was identified, which suggests that SARS-CoV-2 was probably introduced to the NYC area several weeks earlier than has previously been assumed. This would not be unexpected given the unique diversity and connectivity of NYC and the large numbers of travellers that were arriving from SARS-CoV-2-affected regions of the world in January and February 2020. The antibody titres of initial positive individuals were low, which is consistent with slower seroconversion of perhaps mild cases9,10,11,12,13. Of course, we cannot exclude with absolute certainty that some of the lower positive titres are false positives as the initially low seroprevalence falls within the confidence intervals of the positive predictive value.

Of note, the seroprevalence in the routine care group (as well as the urgent care group at the end of May, after the peak) falls significantly below the threshold for potential community immunity, which has been estimated by one study to require at least a seropositivity rate of 67% for SARS-CoV-2 (ref. 4). On the basis of the population of NYC (8.4 million), we estimate that by the week ending 24 May, approximately 1.7 million individuals had been infected with SARS-CoV-2. Taking into account the cumulative number of deaths in the city by 19 May (16,674—this number includes only officially confirmed, not suspected, COVID-19-related deaths), this suggests a preliminary infection fatality rate of 0.97% (with the assumption that both seroconversion and death occur with similar delays). This is in stark contrast to the infection fatality rate of the 2009 H1N1 pandemic17, which was estimated to be 0.01–0.001%.
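
The arithmetic behind this estimate can be retraced with a short sketch. It uses only the rounded figures quoted in the paragraph above; the function and constant names are illustrative and are not part of the study's analysis. Because the infection count is rounded to 1.7 million here, the result comes out at roughly 0.98% rather than the reported 0.97%, which was presumably computed from an unrounded estimate.

```python
# Rounded figures quoted in the text; names are illustrative only.
NYC_POPULATION = 8_400_000
ESTIMATED_INFECTED = 1_700_000   # "approximately 1.7 million"
CONFIRMED_DEATHS = 16_674        # cumulative confirmed deaths by 19 May


def infection_fatality_rate(deaths: int, infections: float) -> float:
    """Return deaths per infection, expressed as a percentage."""
    if infections <= 0:
        raise ValueError("infections must be positive")
    return 100.0 * deaths / infections


# Seroprevalence implied by the two population figures (about 20%).
implied_seroprevalence = 100.0 * ESTIMATED_INFECTED / NYC_POPULATION
ifr = infection_fatality_rate(CONFIRMED_DEATHS, ESTIMATED_INFECTED)

print(f"implied seroprevalence: {implied_seroprevalence:.1f}%")
print(f"infection fatality rate: {ifr:.2f}%")
```

The implied seroprevalence of about 20% agrees with the plateau reported for the routine care group.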

A unique aspect of our study is that it provides important information about the dynamics of SARS-CoV-2 seroconversion and the stability of antibody responses before, during and after the first epidemic peak in an early and major epicentre of the COVID-19 pandemic. Our dataset will be useful to generate models that can then more accurately predict the dynamics of seroprevalence in other geographical locations, even if sero-surveys in these locations are based on few, widely spaced points in time. Our study further underscores the value of serological testing, even when done retrospectively, to capture the onset and full extent of a pandemic wave when there is limited capacity for NAAT. Although the phylogenetic analysis of later isolates can provide estimated times of introductions, these estimates do have large confidence intervals. A combination of both serological and phylogenetic analyses using precision surveillance is probably the most accurate and useful tool to determine the true onset of an epidemic.

We will continue this repeated cross-sectional sero-survey for at least one year, and expect that, if antibody titres remain stable, the seroprevalence will probably not change significantly unless the number of new infections increases again or vaccines become available.

Methods

Study participants and human samples

Residual EDTA-anticoagulated blood specimens remaining after standard-of-care testing were obtained from the MSH Blood Bank and sorted by collection date, location and by patient practice category (OB/GYN visits and labour and deliveries; oncology visits; surgeries and related visits; cardiology office or treatment visits; emergency department visits; and other related hospital admissions). To select samples in an unbiased manner, plasma samples were selected randomly from up to the first 152 specimens for each location per collection week and these were aliquoted for testing. All test samples were logged into a de-identified database and matched to anonymous patient identifiers and verified patient categories, after which duplicate samples obtained from the same patient within one week were retrospectively removed from further analyses. Samples from the same patient were included in the analysis if they were collected at least 7 days apart. If 7 days or more apart, seroconversion could potentially occur and we therefore consider this small number of specimens as an independent sample of the population. Collection started in the week ending 9 February 2020. Between 231 and 493 plasma samples per week (starting from the week ending on 23 February 2020) were selected from the routine care patient setting from patients that went for OB/GYN visits and labour and deliveries, oncology visits, surgeries and related visits as well as cardiology office visits and other regular office visits (see Extended Data Tables 2, 3 for detailed numbers and breakdown per cohort). Approximately 200 plasma samples per week were selected from an inpatient cohort setting, consisting of plasma samples from patients that were admitted to the emergency department and other related hospital admissions (urgent care). For some individuals, a PCR test for viral RNA was performed to diagnose infection with SARS-CoV-2.
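
The 7-day rule for repeat samples can be illustrated with a minimal sketch. The text does not say which of several closely spaced samples was retained, so this sketch assumes the earliest sample is kept and each later sample is compared with the most recently kept one; the record layout (patient identifier, collection date) and all names are hypothetical.

```python
from datetime import date, timedelta

MIN_GAP = timedelta(days=7)  # samples at least 7 days apart are kept


def drop_close_repeats(samples):
    """Filter (patient_id, collection_date) records.

    Assumption: the earliest sample per patient is kept, and a later
    sample is kept only if it was collected at least MIN_GAP after the
    most recently kept sample from the same patient.
    """
    kept = []
    last_kept = {}  # patient_id -> date of the most recently kept sample
    for patient_id, collected in sorted(samples):
        previous = last_kept.get(patient_id)
        if previous is None or collected - previous >= MIN_GAP:
            kept.append((patient_id, collected))
            last_kept[patient_id] = collected
    return kept


# Hypothetical records, for illustration only.
records = [
    ("P1", date(2020, 4, 1)),
    ("P1", date(2020, 4, 3)),   # 2 days after the kept sample: dropped
    ("P1", date(2020, 4, 8)),   # 7 days after the kept sample: kept
    ("P2", date(2020, 4, 2)),
]
filtered = drop_close_repeats(records)
print(filtered)
```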

This study (protocol HS 20-00308) was reviewed by the Mount Sinai Health System Institutional Review Board, Icahn School of Medicine at Mount Sinai, and determined to be exempt from human research as defined by Department of Health and Human Services regulations (45 CFR 46.104). The collection and testing of residual plasma specimens were performed in a blinded manner, and all data used in this study were anonymized following local and reporting regulations through the use of an honest broker.

Recombinant proteins

The recombinant RBD and spike protein of SARS-CoV-2 were generated and expressed as previously described7,8. In brief, the mammalian-cell codon-optimized nucleotide sequences for RBD (amino acids 319–541) including a signal peptide and hexahistidine tag or the soluble version of the spike protein (amino acids 1–1,213) including a signal peptide, C-terminal thrombin cleavage site, T4 fold-on trimerization domain and hexahistidine tag were cloned into the mammalian expression vector pCAGGS. The nucleotide sequence of the spike protein was additionally modified to remove the polybasic cleavage site and two stabilizing mutations (K986P and V987P) were introduced. The expression plasmids are available at BEI Resources Repository (https://www.beiresources.org/).

Recombinant proteins were produced in Expi293F cells (Thermo Fisher) using the ExpiFectamine 293 Transfection Kit (Thermo Fisher) according to manufacturer’s instructions. Expi293F cells were not authenticated and tested negative for mycoplasma. Proteins were purified by gravity flow using Ni-NTA Agarose (Qiagen) and concentrated in Amicon centrifugal units (EMD Millipore). Purified proteins were analysed by reducing sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS–PAGE) and correct folding was confirmed by performing ELISAs with RBD-specific monoclonal antibody CR3022 (refs. 18,19) or 2B3E5.

ELISA

The serological assays were performed as previously described in detail following a two-step ELISA protocol7,8. The assay used here has a workflow that closely resembles an assay established in the Mount Sinai Health System CLIA-certified Clinical Pathology Laboratory, which received an FDA Emergency Use Authorization in April 2020. However, the assay used in this study was performed in a research laboratory setting with a sensitivity of 95% and a specificity of 100% (Extended Data Table 1), resulting in a positive predictive value of 1 (95% confidence interval, 0.908–1.000) and a negative predictive value of 0.97 (95% confidence interval, 0.909–0.995).

In the first step, plasma samples were screened in a high-throughput assay using the recombinant RBD protein. In brief, 96-well microtitre plates (Thermo Fisher) were coated with 50 μl recombinant RBD protein at a concentration of 2 μg ml−1 overnight at 4 °C. The next day, the plates were washed three times with phosphate-buffered saline (PBS; Gibco) supplemented with 0.1% Tween-20 (PBS-T; Fisher Scientific) using an automatic plate washer (BioTek). The plates were blocked with 200 μl blocking solution consisting of PBS-T with 3% (w/v) milk powder (American Bio) and incubated for 1 h at room temperature. As a general safety precaution, plasma samples were heat-inactivated for 1 h at 56 °C. The blocking solution was taken off the plates and 100 μl of the plasma samples diluted 1:50 in PBS-T containing 1% (w/v) milk powder was added to the respective wells of the microtitre plates. After 2 h, the plates were washed three times with PBS-T and 50 μl anti-human IgG (Fab-specific) horseradish peroxidase antibody (produced in goat; Sigma, A0293) diluted 1:3,000 in PBS-T containing 1% milk powder was added to all wells and incubated for 1 h at room temperature. The microtitre plates were washed three times with PBS-T and 100 μl SigmaFast o-phenylenediamine dihydrochloride (Sigma) was added to all wells. The reaction was stopped after 10 min with 50 μl per well 3 M hydrochloric acid (Thermo Fisher) and the plates were read at a wavelength of 490 nm with a plate reader (BioTek). Plasma samples that exceeded an optical density at 490 nm (OD490) cut-off value of 0.15 were categorized as presumptive positive and were tested in a second step in confirmatory ELISAs using the full-length, recombinant spike protein.

To perform the confirmatory ELISAs, the plates were coated and blocked as described above except full-length spike protein at a concentration of 2 μg ml−1 was added to the plates. After 1 h, the blocking solution was removed, presumptive positive plasma samples that were serially diluted in 1% milk prepared in PBS-T were added and the plates were incubated for 2 h at room temperature. The remainder of the assay was performed as described above. The data were analysed in Microsoft Excel and GraphPad Prism 7. The cut-off value was set to an OD490 value of 0.15 and true-positive samples were defined as samples that exceeded an OD490 value of 0.15 at a 1:80 plasma dilution. The end-point titre was calculated and defined as the last dilution before the signal dropped below an OD490 of 0.15. For samples that exceeded an OD490 of 0.15 at the last dilution (1:12,800 for samples of weeks ending on 29 March and 5 April; 1:6,480 for samples of weeks ending 12 April and 24 May), a four-parameter curve fit (variable slope) was applied and the end-point titre was determined by interpolation.
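
The cut-off logic of the two steps can be sketched as follows. This is an illustration, not the analysis workflow used in the study (which relied on Microsoft Excel and GraphPad Prism): the only values taken from the text are the OD490 cut-off of 0.15 and the 1:80 starting dilution. The twofold dilution series, the example readings and all function names are assumptions, and the four-parameter curve fit for saturated samples is only flagged, not implemented.

```python
from typing import Optional, Sequence, Tuple

CUTOFF_OD490 = 0.15  # cut-off stated in the text for both ELISA steps


def is_presumptive_positive(rbd_od490: float) -> bool:
    """Step 1: RBD screen at a single dilution."""
    return rbd_od490 > CUTOFF_OD490


def end_point_titre(
    reciprocal_dilutions: Sequence[int], od490: Sequence[float]
) -> Tuple[Optional[int], bool]:
    """Step 2: spike ELISA on a serial dilution (lowest dilution first).

    Returns (titre, needs_curve_fit). The titre is the last reciprocal
    dilution before the signal drops below the cut-off, or None if the
    sample does not exceed the cut-off at the first dilution.
    needs_curve_fit is True when the signal never drops below the
    cut-off, in which case the text describes interpolation instead.
    """
    if not od490 or len(reciprocal_dilutions) != len(od490):
        raise ValueError("need one OD reading per dilution")
    if od490[0] <= CUTOFF_OD490:
        return None, False  # not confirmed at the starting dilution
    titre = reciprocal_dilutions[0]
    for dilution, signal in zip(reciprocal_dilutions, od490):
        if signal < CUTOFF_OD490:
            return titre, False
        titre = dilution
    return titre, True


# Hypothetical twofold series starting at 1:80, with made-up readings.
HYPOTHETICAL_SERIES = [80, 160, 320, 640, 1280]
example = end_point_titre(HYPOTHETICAL_SERIES, [1.20, 0.70, 0.30, 0.12, 0.05])
print(example)
```

Returning a flag rather than extrapolating keeps the sketch honest about samples that are still above the cut-off at the last dilution.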

The sensitivity and specificity of the assay were determined using a panel of serum and/or plasma samples of 40 patients that had a PCR-confirmed SARS-CoV-2 infection (true positives) and 74 negative-control samples (56 samples that were taken before the pandemic and 18 samples without confirmed SARS-CoV-2 infection; true negatives). The positive and negative predictive values were determined taking into account the ratio of true positives and true negatives (seroprevalence of 35%) in the panel. Notably, using the 100% specificity determined using the panel and assuming a low (for example, 1%) true seroprevalence in the test group would not change the positive predictive value.
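
The dependence of the predictive values on seroprevalence can be checked with a short sketch based on Bayes' rule. The sensitivity (95%), specificity (100%) and panel composition (40 positives, 74 negatives) come from the text; the function name and the handling of the degenerate zero-denominator case are assumptions made for illustration.

```python
from typing import Tuple


def predictive_values(
    sensitivity: float, specificity: float, prevalence: float
) -> Tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given true prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    # Guard against empty denominators (e.g. prevalence of exactly 0 or 1).
    ppv = true_pos / (true_pos + false_pos) if true_pos + false_pos else float("nan")
    npv = true_neg / (true_neg + false_neg) if true_neg + false_neg else float("nan")
    return ppv, npv


PANEL_PREVALENCE = 40 / (40 + 74)  # about 35%, as stated in the text

ppv_panel, npv_panel = predictive_values(0.95, 1.0, PANEL_PREVALENCE)
ppv_low, npv_low = predictive_values(0.95, 1.0, 0.01)

print(f"panel:  PPV={ppv_panel:.3f}  NPV={npv_panel:.3f}")
print(f"1% prevalence:  PPV={ppv_low:.3f}  NPV={npv_low:.4f}")
```

With 100% specificity there are no false positives, so the positive predictive value stays at 1 regardless of prevalence, whereas the negative predictive value rises as prevalence falls.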

Statistical analysis

The 95% confidence intervals of the seroprevalence were calculated assuming binomial data based on methods by Wilson–Brown20. Significant differences in end-point titres between the urgent care and routine care groups were identified using two-tailed Mann–Whitney U-tests. The 95% confidence intervals for assay sensitivity, specificity and the positive predictive and negative predictive values were determined using methods by Wilson and Brown.
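
For orientation, a minimal sketch of a binomial confidence interval follows. It implements the plain Wilson score interval without continuity correction, which is an assumption about the method named above; statistical packages that label their method "Wilson/Brown" may apply additional adjustments near 0% or 100%, so bounds close to those limits can differ slightly from the published values. The function name is illustrative.

```python
import math
from typing import Tuple

Z_95 = 1.959963984540054  # standard normal quantile for a 95% interval


def wilson_interval(successes: int, n: int, z: float = Z_95) -> Tuple[float, float]:
    """Plain Wilson score interval for a binomial proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("require n > 0 and 0 <= successes <= n")
    p = successes / n
    z2 = z * z
    denom = 1.0 + z2 / n
    centre = (p + z2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z2 / (4 * n * n)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# 95% sensitivity on 40 true positives implies 38 detected; with no false
# positives, all 38 positive calls are correct (positive predictive value).
low, high = wilson_interval(38, 38)
print(f"38/38: {low:.3f} to {high:.3f}")
```

For 38 correct calls out of 38, the lower bound of about 0.908 matches the lower limit reported for the positive predictive value.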

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.