Main

Antibody responses to SARS-CoV-2 were initially characterized in a cohort of individuals convalescing from COVID-19 at approximately 40 days (1.3 months) after infection1. Between 31 August and 16 October 2020, 100 participants returned for a 6-month follow-up study visit. Although the initial criteria allowed enrolment of the close contacts of individuals with SARS-CoV-2 infection confirmed by reverse-transcription PCR (RT–PCR)1, 13 of the contacts did not seroconvert and were excluded from further analyses. The remaining 87 participants with RT–PCR-confirmed SARS-CoV-2 infection and/or seroconversion returned for analysis at approximately 191 days (6.2 months; range of 165 to 223 days) after the onset of symptoms. In this cohort, symptoms lasted for a median of 12 days (range of 0 to 44 days) during the acute phase, and 10 (11%) of the participants were hospitalized. Consistent with other reports3,4, 38 (44%) of the participants reported persistent long-term symptoms attributable to COVID-19 (Methods, Supplementary Tables 1, 2). The duration and severity of symptoms during acute disease were significantly greater among participants with persistent post-acute symptoms at the second study visit (Extended Data Fig. 1m–q). Importantly, all 87 participants tested negative for SARS-CoV-2 at the 6-month follow-up study visit using an approved saliva-based PCR assay (Methods). Participant demographics and clinical characteristics are shown in Supplementary Tables 1, 2.

Plasma SARS-CoV-2 antibody reactivity

Antibody reactivity in plasma to the RBD and nucleoprotein (N) of SARS-CoV-2 was measured by enzyme-linked immunosorbent assay (ELISA) and automated serological assays1,5,6. Anti-RBD assays were strongly correlated (anti-RBD IgG ELISA and Pylon–IgG, anti-RBD IgM ELISA and Pylon–IgM at 1.3 months, r = 0.9200 and r = 0.7543, P < 0.0001, respectively) (Extended Data Fig. 2a–d), and anti-N assays showed a moderate correlation (anti-N IgG ELISA and Roche anti-N total immunoglobulin at 1.3 months, r = 0.3596, P = 0.0012) (Extended Data Fig. 2e, f). The anti-RBD and ELISA anti-N antibodies in plasma decreased significantly between 1.3 and 6.2 months (Fig. 1a–d). Notably, the decreased binding activity differed substantially by isotype and target. IgM showed the greatest decrease in anti-RBD reactivity (53%), followed by IgG (32%); anti-RBD IgA decreased by only 15% and anti-N IgG levels by 22% (Fig. 1e). By contrast, the Roche anti-N assay6 showed a small but significant increase (19%) in reactivity between the two time points (Extended Data Fig. 2g), which might be explained by the use of an antigen bridging approach7. In all cases, the magnitude of the decrease was inversely proportional to and directly correlated with the initial antibody levels, such that individuals with higher initial levels showed greater relative changes (Fig. 1f–i, Extended Data Fig. 2h). All measurements were strongly correlated between the two time points (Extended Data Fig. 2i–m) and anti-N IgG correlated with respective IgM, IgG and IgA anti-RBD reactivity at 1.3 months (Extended Data Fig. 2n–p). Notably, individuals with persistent post-acute symptoms had significantly higher levels of anti-RBD IgG and anti-N total antibody at both study visits (Extended Data Fig. 1a–l).
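
The relationship between initial antibody levels and their subsequent change can be illustrated with a minimal Python sketch. It computes a per-participant relative change and its Spearman rank correlation with the 1.3-month value. The definition of relative change used here (percentage difference from the 1.3-month value), the function names and the example AUC values are assumptions made for illustration and are not taken from the study.

```python
def relative_change_percent(early, late):
    """Percentage change from the early to the late value (assumed definition)."""
    return [(l - e) / e * 100.0 for e, l in zip(early, late)]


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical normalized AUC values for four participants.
early = [1.0, 2.0, 4.0, 8.0]
late = [0.9, 1.6, 2.8, 4.8]
change = relative_change_percent(early, late)
print(spearman_r(early, change))
```

In this toy data set the participants with the highest initial values lose the largest fraction of their signal, so the rank correlation is strongly negative, which is the same direction as the r values reported for Fig. 1f–i.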

Fig. 1: Plasma antibody dynamics against SARS-CoV-2.

a–d, Results of ELISAs measuring plasma reactivity to RBD (a, b, c) and N protein (d) at the initial 1.3-month visit and the 6.2-month follow-up visit. a, Anti-RBD IgM. b, Anti-RBD IgG. c, Anti-RBD IgA. d, Anti-N IgG. The normalized area under the curve (AUC) values for 87 individuals are shown in a–d for both time points. Positive and negative controls were included for validation1. e, Relative change in plasma antibody levels between 1.3 and 6.2 months for anti-RBD IgM, IgG, IgA and anti-N IgG in all 87 individuals. f–i, Relative change in antibody levels between 1.3 and 6.2 months plotted against the corresponding antibody levels at 1.3 months. f, Anti-RBD IgM. r = −0.83, P < 0.0001. g, Anti-RBD IgG. r = −0.76, P < 0.0001. h, Anti-RBD IgA. r = −0.67, P < 0.0001. i, Anti-N IgG. r = −0.87, P < 0.0001. j, Ranked average NT50 at 1.3 months (blue) and 6.2 months (red) for the 87 individuals studied. k, Graph shows NT50 for plasma from all 87 individuals collected at 1.3 and 6.2 months. P < 0.0001. l, Relative change in plasma neutralizing titres between 1.3 and 6.2 months plotted against the corresponding titres at 1.3 months. For a–e, k, plotted values and horizontal bars indicate geometric mean. Statistical significance was determined using two-tailed Wilcoxon matched-pairs signed-rank test in a–d, k, and Friedman with Dunn’s multiple comparison test in e. The r and P values in f–i, l were determined by two-tailed Spearman’s correlations.

We measured plasma neutralizing activity using an HIV-1 virus pseudotyped with the SARS-CoV-2 spike protein1,8. Consistent with other reports9,10, the geometric mean half-maximal neutralizing titre (NT50) in this group of 87 participants was 401 and 78 at 1.3 and 6.2 months, respectively—representing a fivefold decrease (Fig. 1j, k). Neutralizing activity was directly correlated with the IgG anti-RBD ELISA measurements (Extended Data Fig. 2q, r). Moreover, the absolute magnitude of the decrease in neutralizing activity was inversely proportional to and directly correlated with the neutralizing activity at the earlier time point (Fig. 1l). We conclude that antibodies to RBD and plasma neutralizing activity decrease significantly, but remain detectable, 6 months after infection with SARS-CoV-2 in the majority of individuals.
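
The fold decrease quoted above follows directly from the two geometric means. The short Python sketch below reproduces that arithmetic and shows how the same quantity could be computed from per-participant titres. The function name and the example titres are hypothetical; only the values 401 and 78 come from the text.

```python
from statistics import geometric_mean


def fold_decrease(early_titres, late_titres):
    """Ratio of the geometric mean titre at the early and late time points."""
    return geometric_mean(early_titres) / geometric_mean(late_titres)


# Geometric mean NT50 values reported in the text.
reported_fold = 401 / 78
print(round(reported_fold, 1))  # 5.1

# Hypothetical per-participant NT50 values, for illustration only.
example_early = [100, 400, 1600]
example_late = [20, 80, 320]
print(fold_decrease(example_early, example_late))
```

The geometric mean is the natural summary here because titres come from serial dilutions and are spread roughly evenly on a logarithmic scale.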

To examine the phenotypic landscape of circulating B cells, we performed high-dimensional flow cytometry on 41 randomly selected individuals at both time points and compared the results to pre-COVID-19 samples from healthy individuals (n = 20). Global high-dimensional mapping with t-distributed stochastic neighbour embedding revealed significant persistent alterations in individuals who had recovered from COVID-19 (Extended Data Fig. 3a). The relative representation of clusters 2, 7, 8 and 10 (corresponding to naive, memory, plasmablast and plasma cells, respectively) was decreased at 1.3 months and remained so at the later time point. Metacluster 15—which corresponds to immature B cells that are recent immigrants from the bone marrow—was increased at the early time point but returned to control levels at the end of the observation period (Extended Data Fig. 3b–d).

SARS-CoV-2 memory B cell repertoire

Whereas plasma cells are the source of circulating antibodies, memory B cells contribute to recall responses. To identify and enumerate the circulating SARS-CoV-2-specific memory B cell compartment, we used flow cytometry to isolate individual B lymphocytes with receptors that bound to RBD1 (Fig. 2a, Extended Data Fig. 4). Notably, the percentage of RBD-binding memory B cells increased marginally between 1.3 and 6.2 months in 21 randomly selected individuals (Fig. 2b).

Fig. 2: Sequences of anti-SARS-CoV-2 RBD antibodies.

a, Representative flow cytometry plots showing dual AlexaFluor-647–RBD- and PE–RBD-binding B cells for six study individuals (designated COV21, COV47, COV57, COV72, COV96 and COV107) (the gating strategy is shown in Extended Data Fig. 4). The percentage of antigen-specific B cells is indicated. b, As in a. Graph summarizes the percentage of RBD-binding memory B cells in samples obtained at 1.3 and 6.2 months from 21 randomly selected individuals. Red horizontal bars indicate geometric mean values. Statistical significance was determined using two-tailed Wilcoxon matched-pairs signed-rank test. c, Number of somatic nucleotide mutations in the IGVH (top) and IGVL (bottom) genes in antibodies obtained after 1.3 or 6.2 months from the indicated individual (left) or all six donors (right). Statistical significance was determined using two-tailed Mann–Whitney U-tests. Horizontal bars indicate median values. d, Pie charts show the distribution of antibody sequences from six individuals after 1.3 months1 (top) or 6.2 months (bottom). The number in the inner circle indicates the number of sequences analysed for the individual denoted above the circle. Pie-slice size is proportional to the number of clonally related sequences. The black outline indicates the frequency of clonally expanded sequences detected in each participant. Coloured slices indicate persisting clones (same IGHV and IGLV genes and highly similar CDR3) found at both time points in the same participant. Grey slices indicate clones unique to the time point. White slices indicate singlets found at both time points, and the remaining white area indicates sequences that were isolated once. e, Graph shows relative clonality at both time points for all six donors. Red horizontal bars indicate mean values. Statistical significance was determined using two-tailed t-test.

To determine whether there were changes in the antibodies produced by memory B cells after 6.2 months, we obtained 532 paired antibody heavy and light chains from the same 6 individuals who were examined at the earlier time point1 (Supplementary Table 3). There was no significant difference in the representation of IGV genes at the two time points, including the over-representation of the IGHV3-30 and IGHV3-53 gene segments1,11,12,13,14,15,16 (Extended Data Fig. 5a, b). In keeping with this observation (and similar to the earlier time point), antibodies that shared the same IGHV and IGLV genes comprised 8.6% of all sequences in different individuals (Extended Data Fig. 5c). As might be expected, there was a small—but significant—overall increase in the percentage of IgG-expressing anti-RBD memory cells, from 49% to 58% (P = 0.011) (Extended Data Fig. 5d–f). Consistent with the fractional increase in IgG memory cells, the extent of somatic hypermutation for both IGH and IGL differed significantly in all six individuals between the two time points. Whereas the average number of nucleotide mutations in IGH and IGL was only 4.2 and 2.8, respectively, at the first time point, these values were increased to 11.7 and 6.5, respectively, at the second time point (P < 0.0001) (Fig. 2c, Extended Data Fig. 6a–f). By contrast, the overall average length and hydrophobicity of complementarity-determining region 3 (CDR3) of IGH and IGL were unchanged (Extended Data Fig. 6g, h).

Similar to the earlier time point, we found expanded clones of memory B cells at 6.2 months (including 23 clones that appeared at both time points). However, expanded clones accounted for only 12.4% of all antibody sequences after 6.2 months, compared to 32% after 1.3 months (P = 0.0225) (Fig. 2d, e). In addition, the overall clonal composition of the memory compartment differed at the two time points in all of the individuals we examined (Fig. 2d). Forty-three expanded clones that were present at the earlier time point were not detectable after 6.2 months, and 22 new expanded clones appeared. The relative distribution of clones that appeared at both time points also varied. For example, the dominant clones in the individuals designated COV21 and COV57, which represented 9.0% and 16.7% of all sequences, respectively, after 1.3 months, were reduced to 1.1% and 1.9%, respectively, of all sequences after 6.2 months (Fig. 2d, Supplementary Table 3). We conclude that, although the magnitude of the RBD-specific memory B cell compartment is conserved between 1.3 and 6.2 months after infection with SARS-CoV-2, there is extensive clonal turnover and antibody sequence evolution that is consistent with prolonged germinal centre reactions.

SARS-CoV-2 monoclonal antibodies

We tested 122 representative antibodies from the 6.2-month time point for reactivity to the RBD (Supplementary Table 4). The antibodies that we evaluated included: (1) 49 antibodies that were randomly selected from those antibodies that appeared only once; (2) 23 antibodies that appeared as singles at both 1.3 and 6.2 months; (3) 23 representatives of newly appearing expanded clones; and (4) 27 representatives of expanded clones appearing at both time points. One hundred and fifteen out of the 122 antibodies bound to RBD, which indicates that flow cytometry efficiently identified B cells that produce anti-RBD antibodies (Fig. 3a, Supplementary Tables 4, 5). Taking all antibodies together, the mean ELISA half-maximal effective concentration (EC50) was not significantly different at the two time points1 (Fig. 3a, Supplementary Table 4). However, comparison of the antibodies that were present at both time points revealed a significant improvement of the EC50 after 6.2 months (P = 0.0227) (Fig. 3b, Extended Data Fig. 7a).

Fig. 3: Reactivity of anti-SARS-CoV-2 RBD monoclonal antibodies.

a, Graph shows anti-SARS-CoV-2 RBD antibody reactivity. ELISA EC50 values for all antibodies measured at 1.3 months1, and 122 selected monoclonal antibodies measured at 6.2 months. Horizontal bars indicate geometric mean. Statistical significance was determined using two-tailed Mann–Whitney U-test. b, EC50 values for all 52 antibodies that appear at 1.3 and 6.2 months. Average of two or more experiments. Horizontal bars indicate geometric mean. Statistical significance was determined using two-tailed Wilcoxon matched-pairs signed-rank test. c, Surface representation of the RBD with the ACE2-binding footprint indicated as a dotted line and selected residues found in circulating strains (grey) and residues that mediate resistance to class-2 (red) (C144) and -3 (green) (C135) antibodies highlighted as sticks. d, Graphs show ELISA binding curves for C144 (black dashed line) and its clonal relatives obtained after 6.2 months (C050, C051, C052, C053 and C054) (solid lines) binding to wild-type (WT), Q493R, R346S and E484K RBDs. e, Heat map shows log2-transformed relative fold change in EC50 against indicated RBD mutants for 26 antibody clonal pairs obtained at 1.3 and 6.2 months with the most pronounced changes in reactivity. The participant of origin for each antibody pair is indicated above. All experiments were performed at least twice.

To determine whether the antibodies expressed by memory B cells at the late time point also showed altered breadth, we compared them to earlier clonal relatives in binding assays using control and mutant RBDs. The substitutions E484K and Q493R17 were selected for resistance to class-2 antibodies (such as C144 and C121) that bind directly to the ACE2-interaction ridge in the RBD1,18,19,20, and R346S, N439K, and N440K were selected for resistance to class-3 antibodies (such as C135) that do not directly interfere with ACE2 binding1,17,18,19,20 (Fig. 3c). In addition, we also tested the V367F, A475V, S477N and V483A mutants of the RBD, which represent circulating variants that confer complete or partial resistance to class-1 and -2 antibodies17,18,21 (Fig. 3c). Out of 52 antibody clonal pairs that appeared at both time points, 43 (83%) showed overall increased binding to mutant RBDs at the 6.2-month time point (Extended Data Fig. 7b–k, Supplementary Table 5). For example, C144 (an antibody recovered at the 1.3-month time point) was unable to bind to RBD(Q493R) or RBD(E484K), but all five of its clonal derivatives collected at 6.2 months bound to RBD(Q493R) and one also showed binding to RBD(E484K) (Fig. 3d). Overall, the most pronounced increase in binding occurred for mutations affecting the RBD in amino acid positions such as E484, Q493, N439, N440 and R346, which are critical for the binding of class-2 and -3 antibodies17,18 (Fig. 3e, Extended Data Fig. 7b–k, Supplementary Table 5).

Next, we tested all 122 antibodies from the 6.2-month time point for activity in a pseudotyped SARS-CoV-2 neutralization assay1,8 (Fig. 4a, Supplementary Table 6). Consistent with RBD-binding assays, the mean neutralization half-maximal inhibitory concentration (IC50) values were not significantly different at the two time points when all antibodies were compared1 (Fig. 4a). However, comparison of the antibodies that were present at both time points revealed a significant improvement of the IC50 values at 6.2 months (P = 0.0003) (Fig. 4b, Extended Data Fig. 8a).

Fig. 4: Neutralizing activity of anti-SARS-CoV-2 RBD monoclonal antibodies.

a, SARS-CoV-2 pseudovirus neutralization assay. IC50 values for all antibodies measured at 1.3 months1, and 122 selected antibodies measured at 6.2 months. Antibodies with IC50 values above 1 μg ml−1 were plotted at 1 μg ml−1. Mean of two independent experiments. Red bar indicates geometric mean. Statistical significance was determined using two-tailed Mann–Whitney U-test. b, IC50 values for 52 antibodies appearing at 1.3 and 6.2 months. Red bar indicates geometric mean. Statistical significance was determined using two-tailed Wilcoxon matched-pairs signed-rank test. c, IC50 values for 5 pairs of monoclonal antibody clonal relatives obtained after 1.3 or 6.2 months for neutralization of wild-type and mutant SARS-CoV-2 pseudovirus. Antibody identifiers of the 1.3-month–6.2-month monoclonal antibody pairs as indicated. Bold styling denotes antibody pairs with substantial increase in neutralizing activity after 6.2 months. d, Graph shows the normalized relative luminescence units (RLU) for cell lysates of 293T cells expressing ACE2, 48 h after infection with SARS-CoV-2 pseudovirus containing wild-type RBD or one of three mutant RBDs (Q493R, E484G and R346S) in the presence of increasing concentrations of one of two monoclonal antibodies C144 (1.3 months) (dashed lines) or C051 (6.2 months) (solid lines). Experiments were performed at least twice. e, C051 binding model. Surface representation of two adjacent ‘down’ RBDs (RBDA and RBDB) on a spike trimer, with the C144 epitope on the RBDs highlighted in cyan and positions of amino acid mutations that accumulated in C051 compared to the parent antibody C144 highlighted as stick side chains on a Cα atom representation of C051 VHVL binding to adjacent RBDs. The C051 interaction with two RBDs was modelled on the basis of a cryo-electron microscopy structure of C144 Fab bound to spike trimer18. HC, heavy chain; LC, light chain.

To determine whether the antibodies exhibiting altered RBD binding also show increased neutralizing breadth, we tested five representative antibody pairs recovered at the two time points against HIV-1 viruses pseudotyped with E484G, Q493R, and R346S mutant spike proteins (Fig. 4c, Supplementary Table 6). Notably, the Q493R and E484G pseudotyped viruses were resistant to neutralization by C144; by contrast, C051 (a 6.2-month clonal derivative of C144) neutralized both variants, with IC50 values of 4.7 and 3.1 ng ml−1, respectively (Fig. 4c, d). Similarly, R346S pseudotyped viruses were resistant to C032, but C080 (a 6.2-month clonal derivative of C032) neutralized this variant with an IC50 of 5.3 ng ml−1 (Fig. 4c, Extended Data Fig. 8b–f). Consistent with the observed changes in binding and neutralizing activity, several late-appearing antibodies (for example, C051) had acquired mutations directly in or adjacent to the RBD-binding paratope (Fig. 4e, Extended Data Fig. 8g–j). We conclude that memory B cells that evolved during the observation period express antibodies with increased neutralizing potency and breadth.

SARS-CoV-2 antigen persistence

Antibody evolution occurs by somatic mutation and selection in germinal centres in which antigen can be retained in the form of immune complexes on the surface of follicular dendritic cells for prolonged periods of time. Residual protein in tissues represents another potential source of antigen. SARS-CoV-2 replicates in ACE2-expressing cells in the lungs, nasopharynx and small intestine22,23,24,25, and viral RNA has been detected in stool samples even after the virus is cleared from the nasopharynx26,27,28. To determine whether there might be antigen persistence in the intestine after resolution of clinical illness, we obtained biopsies from the upper and lower gastrointestinal tract of 14 individuals at an average of 4 months (range of 2.8 to 5.7 months) after initial COVID-19 diagnosis (Supplementary Table 7). Immunostaining was performed to determine whether viral protein was also detectable in the upper and lower gastrointestinal tract, with de-identified biopsies from individuals pre-dating the pandemic (n = 10) serving as controls. ACE2 and SARS-CoV-2 N protein were detected in intestinal enterocytes in 5 of 14 individuals (Fig. 5a–d, Extended Data Figs. 9a–h, 10a, b, Supplementary Table 7) but not in control samples (Extended Data Fig. 9i–l). When detected, immunostaining was sporadic, patchy, exclusive to the intestinal epithelium and not associated with inflammatory infiltrates (Extended Data Figs. 9a–h, 10a, b). Clinically approved nasopharyngeal-swab PCR assays were negative in all 14 individuals at the time of biopsy. However, biopsy samples from 3 of the 14 participants produced PCR amplicons that were sequence-verified as SARS-CoV-2 (Methods, Supplementary Table 7). In addition, viral RNA was detected by in situ hybridization in biopsy samples from the two participants who were tested (Extended Data Fig. 10c, d) but not in control samples (Extended Data Fig. 10e). Sampling variability during the endoscopic procedure probably contributed to the incomplete concordance between the viral RNA and protein assays.

Fig. 5: Immunofluorescence imaging of intestinal biopsies.

a–d, Immunofluorescence images of human enterocytes stained for EPCAM (red), DAPI (blue) and either ACE2 (green in a, c) or SARS-CoV-2 N (green in b, d) in intestinal biopsies taken 92 d after onset of COVID-19 symptoms in participant CGI-088, in the terminal ileum (a, b) or duodenum (c, d). Regions in white boxes in the right panels of a, c are shown expanded in b, d, respectively. Arrows indicate enterocytes with detectable SARS-CoV-2 antigen. Scale bars, 100 μm. The experiments were repeated independently at least twice with similar results.

Neutralizing antibodies to SARS-CoV-2 develop in most individuals after infection but decay with time7,9,10,29,30,31,32. These antibodies are effective in prevention and therapy in animal models and are likely to have a role in protection from reinfection in humans2. Although there is a significant decrease in plasma neutralizing activity between 1.3 and 6.2 months, antibody titres remain measurable in most individuals9,10,29,30,31,32,33,34,35,36.

Neutralizing monoclonal antibodies obtained from individuals during the early convalescence period showed notably low levels of somatic mutations, which some previous reports have attributed to defects in germinal centre formation11,12,14,37,38,39,40. Our data indicate that the anti-SARS-CoV-2 memory B cell response evolves during the first six months after infection, with accumulation of immunoglobulin somatic mutations and production of antibodies with increased neutralizing breadth and potency. Persistent antibody evolution occurs in germinal centres and requires that B cells are exposed to antigen trapped in the form of immune complexes on follicular dendritic cells41. This form of antigen can be long-lived, because follicular dendritic cells do not internalize immune complexes. In addition, even small amounts of persistent viral antigen could fuel antibody evolution. The observation that SARS-CoV-2 mRNA and protein remain detectable in the small intestinal epithelium in some individuals months after infection is consistent with the relative persistence of anti-RBD IgA antibodies and continued antibody evolution34,35,36.

Memory responses are responsible for protection from reinfection and are essential for effective vaccination. The observation that memory B cell responses do not decay after 6.2 months34,35,42, but instead continue to evolve, strongly suggests that individuals who are infected with SARS-CoV-2 could mount a rapid and effective response to the virus upon re-exposure.

Methods

Data reporting

No statistical methods were used to predetermine sample size. The experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment.

Study participants

Previously enrolled study participants1 were asked to return for a 6-month follow-up visit at the Rockefeller University Hospital from 31 August through to 16 October 2020. Eligible participants were adults aged 18–76 years and were either diagnosed with SARS-CoV-2 infection by RT–PCR (cases) or were close contacts (for example, members of the same household, coworkers or members of the same religious community) of someone who had been diagnosed with SARS-CoV-2 infection by RT–PCR (contacts). Close contacts without seroconversion against SARS-CoV-2 as assessed by serological assays (described in ‘High-throughput automated serology assays’) were not included in the subsequent analysis. Most study participants were residents of the greater New York City tri-state region and were asked to return approximately six months after the time of onset of COVID-19 symptoms. Participants presented to the Rockefeller University Hospital for blood sample collection and were asked to recall the symptoms and severity of clinical presentation during the acute (first six weeks) and the convalescent (seven weeks until second study visit) phase of COVID-19, respectively. The severity of acute infection was assessed by the World Health Organization (WHO) ‘Ordinal Clinical Progression/Improvement Scale’ (https://www.who.int/publications/i/item/covid-19-therapeutic-trial-synopsis). Shortness of breath was assessed through the modified Medical Research Council dyspnoea scale43. Participants who presented with persistent symptoms attributable to COVID-19 were identified on the basis of chronic shortness of breath or fatigue, deficit in athletic ability and/or three or more additional long-term symptoms such as persistent unexplained fevers, chest pain, new-onset cardiac sequelae, arthralgias, impairment of concentration or mental acuity, impairment of sense of smell or taste, neuropathy or cutaneous findings3,4. All participants at Rockefeller University provided written informed consent before participation in the study and the study was conducted in accordance with good clinical practice. Clinical data collection and management were carried out using the software iRIS by iMedRIS. The study was performed in compliance with all relevant ethical regulations and the protocol for studies with human participants was approved by the Institutional Review Board (IRB) of the Rockefeller University.

Gastrointestinal biopsy cohort

To determine whether SARS-CoV-2 can persist in the gastrointestinal tract, we recruited a cohort of 14 individuals with prior diagnosis of and recovery from COVID-19 illness. Eligible participants included adults, 18–76 years of age who were previously diagnosed with SARS-CoV-2 by RT–PCR or through a combination of clinical symptoms consistent with COVID-19 plus evidence of seroconversion, and presented to the gastroenterology clinics of Mount Sinai Hospital. Endoscopic procedures were performed for clinically indicated conditions as detailed in Supplementary Table 7. All participants were asymptomatic at the time of the endoscopic procedures and negative for SARS-CoV-2 by nasal-swab PCR (cycle threshold (Ct) cut-off < 38).

The CLIA-certified laboratory of the Mount Sinai Health System validated the laboratory-developed nasopharyngeal-swab real-time RT–PCR test according to the New York State Department of Health Wadsworth Center validation procedure for SARS-CoV-244. Informed consent was obtained from all participants. The biopsy-related studies were approved by the Mount Sinai Ethics Committee/IRB (IRB 16-0583, ‘The impact of viral infections and their treatment on gastrointestinal immune cells’).

SARS-CoV-2 saliva PCR test

The SARS-CoV-2 PCR method for saliva samples was developed and its performance characteristics determined by the Rockefeller University Clinical Genomics Laboratory. This laboratory-developed test has been authorized by New York state under an emergency use authorization for use by authorized laboratories. Saliva was collected into guanidine thiocyanate buffer as previously described45. RNA was extracted using either a column-based (Qiagen QIAamp DSP Viral RNA Mini Kit, 61904) or a magnetic-bead-based method as previously described46. Reverse-transcribed cDNA was amplified using primers and probes validated by the Centers for Disease Control and Prevention or by Columbia University Personalized Medicine Genomics Laboratory, respectively, and approved by the US Food and Drug Administration under the emergency use authorization. Viral RNA was considered detected if the Ct for two viral primer and probe combinations was < 40.

Blood sample processing and storage

Peripheral blood mononuclear cells were obtained by gradient centrifugation and stored in liquid nitrogen in the presence of FCS and DMSO. Heparinized plasma and serum samples were aliquoted and stored at −20 °C or below. Before experiments, aliquots of plasma samples were heat-inactivated (56 °C for 1 h) and then stored at 4 °C.

High-throughput automated serology assays

Plasma samples from 80 out of 87 participants were tested by high-throughput automated serology assays. The Roche Elecsys anti-SARS-CoV-2 assay was performed on Roche Cobas e411 (Roche Diagnostics). The Elecsys anti‐SARS‐CoV-2 assay uses a recombinant protein representing the N antigen for the determination of antibodies against SARS‐CoV‐2. This assay received emergency use authorization approval from the US Food and Drug Administration6. The Pylon COVID-19 IgG and IgM assays were used to measure plasma IgG and IgM antibodies against SARS-CoV-2, respectively. Plasma samples were assayed on the Pylon 3D analyser (ET HealthCare) as previously described5. This assay was implemented clinically as a laboratory-developed test under New York State Department of Health regulations. In brief, the assay was performed using a unitized test strip containing wells with predispensed reagents. The COVID-19 reagent contains biotinylated recombinant versions of the SARS-CoV-2 S-protein RBD and trace amounts of N protein as antigens that bind IgG and IgM, respectively. The cut-off values for both Pylon assays were determined using the mean of non-COVID-19 samples plus 6 s.d. The results of a sample are reported in the form of a cut-off index or an index value, which were determined by the instrument readout of the test sample divided by instrument readout at cut-off.
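
The cut-off index described above reduces to two small calculations, sketched below in Python. The use of the sample standard deviation, the function names and the readout values are assumptions for illustration; the source states only that the cut-off is the mean of non-COVID-19 samples plus 6 s.d. and that the index is the sample readout divided by the readout at cut-off.

```python
from statistics import mean, stdev


def cutoff_value(negative_readouts, n_sd=6):
    """Mean of the negative (non-COVID-19) readouts plus n_sd standard deviations.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    return mean(negative_readouts) + n_sd * stdev(negative_readouts)


def cutoff_index(sample_readout, cutoff):
    """Instrument readout of the test sample divided by the readout at cut-off."""
    return sample_readout / cutoff


# Hypothetical instrument readouts from non-COVID-19 samples.
negatives = [8.0, 10.0, 12.0]
cutoff = cutoff_value(negatives)
print(cutoff, cutoff_index(44.0, cutoff))
```

Expressing results as an index rather than a raw readout makes values comparable across runs, because each value is scaled to the same assay-specific threshold.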

ELISAs

ELISAs47,48 to evaluate antibodies binding to SARS-CoV-2 N (Sino Biological 40588-V08B), RBD and additional RBDs were performed by coating of high-binding 96-half-well plates (Corning 3690) with 50 μl per well of a 1 μg ml−1 protein solution in phosphate-buffered saline (PBS) overnight at 4 °C. Plates were washed 6 times with washing buffer (1× PBS with 0.05% Tween-20 (Sigma-Aldrich)) and incubated with 170 μl per well blocking buffer (1× PBS with 2% BSA and 0.05% Tween-20 (Sigma)) for 1 h at room temperature. Immediately after blocking, monoclonal antibodies or plasma samples were added in PBS and incubated for 1 h at room temperature. Plasma samples were assayed at a 1:67 starting dilution and 7 additional threefold serial dilutions. Monoclonal antibodies were tested at 10 μg ml−1 starting concentration and 10 additional fourfold serial dilutions. Plates were washed 6 times with washing buffer and then incubated with anti-human IgG, IgM or IgA secondary antibody conjugated to horseradish peroxidase (HRP) (Jackson ImmunoResearch 109-036-088 and 109-035-129, and Sigma A0295) in blocking buffer at a 1:5,000 dilution (IgM and IgG) or 1:3,000 dilution (IgA). Plates were developed by addition of the HRP substrate, TMB (Thermo Fisher) for 10 min (plasma samples) or 4 min (monoclonal antibodies), then the developing reaction was stopped by adding 50 μl 1 M H2SO4 and absorbance was measured at 450 nm with an ELISA microplate reader (FluoStar Omega, BMG Labtech) with Omega and Omega MARS software for analysis. For plasma samples, a positive control (plasma from participant COV72, diluted 66.6-fold and 7 additional threefold serial dilutions in PBS) was added to every assay plate for validation. The average of its signal was used for normalization of all of the other values on the same plate with Excel software before calculating the AUC using Prism v.8.4 (GraphPad). For monoclonal antibodies, the EC50 was determined using four-parameter nonlinear regression (GraphPad Prism v.8.4).
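As an illustration of the plasma ELISA read-out, the following sketch builds the dilution series, normalizes a sample to the plate's positive control and integrates the curve. It is not the Excel and Prism workflow itself: the absorbance values are invented, the control mean is assumed to be taken across all of its dilutions, and the trapezoidal area over log10(dilution) is an assumed stand-in for the AUC calculation.

```python
import math

def dilution_series(start, fold, n_additional):
    """Dilution factors: the starting dilution plus n_additional serial dilutions."""
    return [start * fold ** i for i in range(n_additional + 1)]

def normalize(sample_od, control_od):
    """Divide each sample absorbance by the mean absorbance of the positive control."""
    control_mean = sum(control_od) / len(control_od)
    return [od / control_mean for od in sample_od]

def trapezoid_auc(x, y):
    """Area under the curve by the trapezoidal rule."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2 for i in range(len(x) - 1))

# Plasma: 1:67 starting dilution and 7 additional threefold dilutions (8 points).
dilutions = dilution_series(67, 3, 7)
log_dilutions = [math.log10(d) for d in dilutions]

# Hypothetical absorbance values at 450 nm, for illustration only.
sample_od = [2.9, 2.4, 1.6, 0.9, 0.45, 0.2, 0.1, 0.06]
control_od = [3.0, 2.7, 2.0, 1.2, 0.6, 0.3, 0.15, 0.08]

auc = trapezoid_auc(log_dilutions, normalize(sample_od, control_od))
```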

Expression of RBD proteins

Mammalian expression vectors encoding the RBDs of SARS-CoV-2 (GenBank MN985325.1; S protein residues 319–539) and nine additional mutant RBD proteins (E484K, Q493R, R346S, N493K, N440K, V367F, A475V, S477N and V483A) with an N-terminal human IL-2 or Mu phosphatase signal peptide were previously described49.

SARS-CoV-2 pseudotyped reporter virus

SARS-CoV-2 pseudotyped particles were generated as previously described1,8. In brief, 293T cells were transfected with pNL4-3ΔEnv-nanoluc and pSARS-CoV-2-SΔ19. For generation of RBD-mutant pseudoviruses, pSARS-CoV-2-SΔ19 carrying one of the following spike mutations was used instead of its wild-type counterpart: Q493R, R346S or E484G50. Particles were collected at 48 h after transfection, filtered and stored at −80 °C.

Pseudotyped virus neutralization assay

Fourfold serially diluted plasma from individuals convalescent from COVID-19, or monoclonal antibodies, were incubated with SARS-CoV-2 pseudotyped virus for 1 h at 37 °C. The mixture was subsequently incubated with 293TACE2 cells for 48 h, after which cells were washed with PBS and lysed with Luciferase Cell Culture Lysis 5× reagent (Promega). Nanoluc luciferase activity in lysates was measured using the Nano-Glo Luciferase Assay System (Promega) with the Glomax Navigator (Promega). The obtained relative luminescence units were normalized to those derived from cells infected with SARS-CoV-2 pseudotyped virus in the absence of plasma or monoclonal antibodies. The half-maximal inhibitory concentration for plasma (NT50) or monoclonal antibodies (IC50) was determined using four-parameter nonlinear regression (least squares regression method without weighting; constraints: top = 1, bottom = 0) (GraphPad Prism).
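The constrained curve fit can be sketched as below. With top = 1 and bottom = 0 the four-parameter logistic reduces to two free parameters, the IC50 and the Hill slope. The shrinking grid search is an assumed stand-in for Prism's optimizer, chosen only to keep the example free of dependencies, and the concentrations and responses are synthetic. For plasma, x would be the relative plasma concentration and the fitted midpoint would be reported as a reciprocal dilution (NT50); that mapping is also an assumption here.

```python
import math

def logistic(x, ic50, hill):
    """Four-parameter logistic with top fixed at 1 and bottom fixed at 0."""
    return 1.0 / (1.0 + (x / ic50) ** hill)

def sse(params, xs, ys):
    """Unweighted sum of squared residuals for (log10 IC50, Hill slope)."""
    log_ic50, hill = params
    ic50 = 10.0 ** log_ic50
    return sum((y - logistic(x, ic50, hill)) ** 2 for x, y in zip(xs, ys))

def fit_ic50(xs, ys, rounds=40):
    """Least-squares fit by a shrinking 21 x 21 grid search (illustrative only)."""
    lo = math.log10(min(xs)) - 2.0
    hi = math.log10(max(xs)) + 2.0
    centre = ((lo + hi) / 2.0, 1.0)
    span = ((hi - lo) / 2.0, 0.9)
    for _ in range(rounds):
        candidates = [
            (centre[0] + span[0] * i / 10.0, centre[1] + span[1] * j / 10.0)
            for i in range(-10, 11)
            for j in range(-10, 11)
        ]
        # The Hill slope must stay positive for an inhibition curve.
        candidates = [c for c in candidates if c[1] > 0]
        centre = min(candidates, key=lambda c: sse(c, xs, ys))
        span = (span[0] * 0.7, span[1] * 0.7)
    return 10.0 ** centre[0], centre[1]

# Synthetic example: hypothetical fourfold dilution series of an antibody (ug/ml)
# with normalized luminescence generated from known parameters.
concentrations = [10.0 / 4 ** i for i in range(8)]
responses = [logistic(x, 0.05, 1.2) for x in concentrations]
ic50_fit, hill_fit = fit_ic50(concentrations, responses)
```

Fixing the plateaus at 1 and 0 is reasonable when the relative luminescence units have already been normalized to the no-antibody control, as described above.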

High-dimensional data analysis of flow cytometry data

High-dimensional viSNE and FlowSOM data analysis and visualization of flow cytometry data were performed on B cells using the Cytobank platform (https://cytobank.org). viSNE analysis was performed using equal sampling of 4,893 cells from each FCS file, with 7,500 iterations, a perplexity of 30 and a theta of 0.5. The following markers were used to generate viSNE maps: IgA, CD305, TGFb-RII, CD138, CD10, CD272, IgD, CD24, CD21, CD95, HLA-DR, IgG, CD279, CD38, IgM, CD274, CD27, CD23, CXCR5, CD32, CD86, CD40, CD85j, CD11c and CXCR3. Resulting viSNE maps were fed into the FlowSOM clustering algorithm51. The self-organizing map was generated using hierarchical consensus clustering on the t-distributed stochastic neighbour embedding axes.

Heat map visualization

Heat maps to display column-scaled z-scores of mean fluorescence intensity for individual FlowSOM clusters according to marker expression were created using the R function pheatmap.
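For reference, column scaling to z-scores can be sketched as follows. The matrix layout (rows as FlowSOM clusters, columns as markers) and the values are hypothetical; the sample standard deviation is used, which matches R's sd, and constant columns are set to 0 here as a simplifying assumption.

```python
from statistics import mean, stdev

def column_zscores(matrix):
    """Scale each column to mean 0 and sample s.d. 1 across rows.

    Rows are assumed to be clusters and columns markers. A constant
    column is returned as zeros (a simplification for this sketch).
    """
    scaled_columns = []
    for column in zip(*matrix):
        m, s = mean(column), stdev(column)
        scaled_columns.append([(v - m) / s if s > 0 else 0.0 for v in column])
    # Transpose back to the original row-by-column layout.
    return [list(row) for row in zip(*scaled_columns)]

# Hypothetical mean fluorescence intensities: 3 clusters x 2 markers.
mfi = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
z = column_zscores(mfi)
```

Scaling by column puts markers with very different intensity ranges on a common scale, so the heat map shows which clusters are relatively high or low for each marker.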

Biotinylation of viral protein for use in flow cytometry

Purified and Avi-tagged SARS-CoV-2 RBD was biotinylated using the Biotin-Protein Ligase-BIRA kit according to the manufacturer’s instructions (Avidity) as previously described1. Ovalbumin (Sigma, A5503-1G) was biotinylated using the EZ-Link Sulfo-NHS-LC-Biotinylation kit according to the manufacturer’s instructions (Thermo Scientific). Biotinylated ovalbumin was conjugated to streptavidin–BV711 (BD Biosciences, 563262) and RBD to streptavidin–PE (BD Biosciences, 554061) and streptavidin–AF647 (Biolegend, 405237)1.

Single-cell sorting by flow cytometry

Single-cell sorting by flow cytometry was previously described1. In brief, peripheral blood mononuclear cells were enriched for B cells by negative selection using a pan-B-cell isolation kit according to the manufacturer’s instructions (Miltenyi Biotec, 130-101-638). The enriched B cells were incubated in FACS buffer (1× PBS, 2% FCS, 1 mM EDTA) with the following anti-human antibodies (all at 1:200 dilution): anti-CD20–PECy7 (BD Biosciences, 335793), anti-CD3–APC–eFluor 780 (Invitrogen, 47-0037-41), anti-CD8–APC–eFluor 780 (Invitrogen, 47-0086-42), anti-CD16–APC–eFluor 780 (Invitrogen, 47-0168-41), anti-CD14–APC–eFluor 780 (Invitrogen, 47-0149-42), as well as Zombie NIR (BioLegend, 423105) and fluorophore-labelled RBD and ovalbumin (Ova) for 30 min on ice. Single CD3−CD8−CD14−CD16−CD20+Ova−RBD–PE+RBD–AF647+ B cells were sorted into individual wells of 96-well plates containing 4 μl of lysis buffer (0.5× PBS, 10 mM DTT, 3,000 units per ml RNasin ribonuclease inhibitors (Promega, N2615)) per well using a FACS Aria III and FACSDiva software (Becton Dickinson) for acquisition and FlowJo for analysis. The sorted cells were frozen on dry ice, and then stored at −80 °C or immediately used for subsequent RNA reverse transcription.

Antibody sequencing, cloning and expression

Antibodies were identified and sequenced as previously described1. In brief, RNA from single cells was reverse-transcribed (SuperScript III Reverse Transcriptase, Invitrogen, 18080-044) and the cDNA stored at −20 °C or used for subsequent amplification of the variable IGH, IGL and IGK genes by nested PCR and Sanger sequencing. Sequence analysis was performed using MacVector. Amplicons from the first PCR reaction were used as templates for sequence- and ligation-independent cloning into antibody expression vectors. Recombinant monoclonal antibodies and Fabs were produced and purified as previously described1.

Computational analyses of antibody sequences

Antibody sequences were trimmed on the basis of quality and annotated using Igblastn v.1.14. with the IMGT domain delineation system. Annotation was performed systematically using the Change-O toolkit v.0.4.54052. Heavy and light chains derived from the same cell were paired, and clonotypes were assigned on the basis of their V and J genes using in-house R and Perl scripts (Extended Data Fig. 5). All scripts and the data used to process antibody sequences are publicly available on GitHub (https://github.com/stratust/igpipeline).
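The pairing and grouping step can be illustrated with a short sketch. This is not the in-house R and Perl code from the repository above: the record fields (cell_id, chain, v_gene, j_gene) and the example annotations are hypothetical, and grouping on identical heavy- and light-chain V and J genes is a simplified reading of the description.

```python
from collections import defaultdict

def pair_chains(records):
    """Keep cells for which both a heavy and a light chain were recovered."""
    cells = defaultdict(dict)
    for rec in records:
        cells[rec["cell_id"]][rec["chain"]] = rec
    return {cid: ch for cid, ch in cells.items() if "heavy" in ch and "light" in ch}

def assign_clonotypes(paired):
    """Group paired cells by heavy- and light-chain V and J gene usage."""
    groups = defaultdict(list)
    for cell_id, chains in paired.items():
        key = (
            chains["heavy"]["v_gene"], chains["heavy"]["j_gene"],
            chains["light"]["v_gene"], chains["light"]["j_gene"],
        )
        groups[key].append(cell_id)
    return dict(groups)

# Hypothetical annotated sequences, for illustration only.
records = [
    {"cell_id": "c1", "chain": "heavy", "v_gene": "IGHV3-53", "j_gene": "IGHJ6"},
    {"cell_id": "c1", "chain": "light", "v_gene": "IGKV1-9", "j_gene": "IGKJ3"},
    {"cell_id": "c2", "chain": "heavy", "v_gene": "IGHV3-53", "j_gene": "IGHJ6"},
    {"cell_id": "c2", "chain": "light", "v_gene": "IGKV1-9", "j_gene": "IGKJ3"},
    {"cell_id": "c3", "chain": "heavy", "v_gene": "IGHV1-69", "j_gene": "IGHJ4"},
]
clonotypes = assign_clonotypes(pair_chains(records))
```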

The frequency distributions of human V genes in anti-SARS-CoV-2 antibodies from this study were compared to 131,284,220 previously generated IgH and IgL sequences53 and downloaded from cAb-Rep54 (a database of human shared BCR clonotypes available at https://cab-rep.c2b2.columbia.edu/). On the basis of the 82 distinct V genes that make up the 1,703 analysed sequences from the immunoglobulin repertoire of the three participants present in this study, we selected the IgH and IgL sequences from the database that are partially coded by the same V genes and counted them according to the constant region. The frequencies shown in Extended Data Fig. 5 are relative to the source and isotype analysed. We used the two-sided binomial test to check whether the number of sequences belonging to a specific IGHV or IGLV gene in the repertoire differs from that expected from the frequency of the same IGV gene in the database. Adjusted P values were calculated using the false-discovery rate correction. In Extended Data Figs. 5, 6, significant differences are denoted with asterisks (*P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001).
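The statistical comparison can be sketched as an exact two-sided binomial test followed by a false-discovery rate adjustment. Two assumptions are made: the two-sided P value sums the probabilities of all outcomes no more likely than the observed count (the convention used by R's binom.test), and the correction is Benjamini–Hochberg, which the text does not name explicitly. The counts and frequencies are invented.

```python
import math

def binom_pmf(k, n, p):
    """Binomial probability mass, computed in log space to avoid overflow."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
    )
    return math.exp(log_pmf)

def binom_test_two_sided(k, n, p):
    """Exact two-sided test: sum outcomes no more likely than the observed one."""
    threshold = binom_pmf(k, n, p) * (1 + 1e-7)  # small tolerance for ties
    total = sum(
        pm for pm in (binom_pmf(i, n, p) for i in range(n + 1)) if pm <= threshold
    )
    return min(1.0, total)

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical example: 40 of 500 sequences use a V gene whose
# database frequency is 4%.
p_single = binom_test_two_sided(40, 500, 0.04)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```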

Nucleotide somatic hypermutation and CDR3 length were determined using in-house R and Perl scripts. For somatic hypermutations, IGHV and IGLV nucleotide sequences were aligned against their closest germlines using Igblastn and the number of differences was considered to be the number of nucleotide mutations. The average number of mutations for V genes was calculated by dividing the sum of all nucleotide mutations across all participants by the number of sequences used for the analysis. To calculate the GRAVY scores of hydrophobicity55 we used the Guy H. R. Hydrophobicity scale based on free energy of transfer (kcal per mole)56 implemented by the R package Peptides (https://CRAN.R-project.org/package=Peptides). We used 532 heavy-chain CDR3 amino acid sequences from this study and 22,654,256 IGH CDR3 sequences from the public database of memory B cell receptor sequences57. The Shapiro–Wilk test was used to determine whether the GRAVY scores were normally distributed. The GRAVY scores from all 532 IGH CDR3 amino acid sequences from this study were used to perform the test and 5,000 GRAVY scores of the sequences from the public database were randomly selected. The Shapiro–Wilk P values were 6.896 × 10−3 and 2.217 × 10−6 for sequences from this study and the public database, respectively, indicating that the data were not normally distributed. Therefore, we used the two-tailed Wilcoxon nonparametric test to compare the samples, which indicated a difference in hydrophobicity distribution (P = 5 × 10−6) (Extended Data Fig. 6h).
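The mutation count and its average can be illustrated as follows. The sketch assumes the query and germline sequences are already aligned to equal length and that every differing position counts as one mutation; how gaps or ambiguous bases are handled in the in-house scripts is not stated, and the sequences shown are invented.

```python
def count_mutations(query, germline):
    """Number of positions at which an aligned query differs from its germline."""
    if len(query) != len(germline):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for q, g in zip(query.upper(), germline.upper()) if q != g)

def average_mutations(aligned_pairs):
    """Sum of all nucleotide mutations divided by the number of sequences."""
    total = sum(count_mutations(q, g) for q, g in aligned_pairs)
    return total / len(aligned_pairs)

# Hypothetical aligned (query, germline) fragments, for illustration only.
pairs = [
    ("CAGGTGCAGCTGGTG", "CAGGTGCAGCTGGTG"),  # 0 mutations
    ("CAGGTACAGCTGGTA", "CAGGTGCAGCTGGTG"),  # 2 mutations
    ("GAGGTGCAGTTGGTG", "CAGGTGCAGCTGGTG"),  # 2 mutations
]
mean_mutations = average_mutations(pairs)
```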

A heat map of log2-transformed relative fold change in EC50 against the indicated RBD mutants for antibody clonal pairs obtained at 1.3 and 6.2 months (Fig. 3e, Extended Data Fig. 7k) was created with the R pheatmap package (https://github.com/raivokolde/pheatmap) using Euclidean distance and the ward.D2 clustering method.
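The transformation behind such a heat map can be sketched in a few lines. The direction of the ratio (EC50 against the mutant divided by EC50 against a reference RBD) is an assumption made for illustration, as are the EC50 values and the mutant labels used as dictionary keys.

```python
import math

def log2_fold_change(ec50, reference_ec50):
    """log2 of the EC50 ratio; positive values mean weaker binding than the reference."""
    return math.log2(ec50 / reference_ec50)

# Hypothetical EC50 values (ng/ml) for one antibody, for illustration only.
reference = 2.0
mutant_ec50 = {"E484K": 16.0, "Q493R": 2.0, "R346S": 0.5}
row = {name: log2_fold_change(value, reference) for name, value in mutant_ec50.items()}
```

The log2 scale makes an eightfold loss and an eightfold gain in binding symmetric around zero, which suits a diverging colour scale.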

Biopsies and immunofluorescence

Endoscopically obtained mucosal biopsies were formalin-fixed and paraffin-embedded. Sections (5 μm) were cut, dewaxed in xylene, and rehydrated in graded alcohol and PBS. Heat-induced epitope retrieval was performed in target retrieval solution (DAKO, S1699) using a commercial pressure cooker. Slides were then cooled to room temperature, washed in PBS and permeabilized for 30 min in 0.1% Triton X-100 in PBS. Nonspecific binding was blocked with 10% goat serum (Invitrogen, 50062Z) for 1 h at room temperature. Sections were then incubated with a combination of primary antibodies diluted in blocking solution overnight at 4 °C. Slides were washed 3 times in PBS and then incubated in secondary antibody and DAPI (1 μg ml−1) for 1 h at room temperature. Sections were washed in PBS three times and then mounted with Fluoromount-G (Electron Microscopy Sciences, 1798425). Controls included omitting primary antibody (no primary control) or substituting primary antibodies with nonreactive antibodies of the same isotype (isotype control). A Nikon Eclipse Ni microscope and digital SLR camera (Nikon, DS-Qi2) were used to visualize and image the tissue.

The antibody used to stain sections for N protein was raised in rabbits against SARS-CoV N and is cross-reactive with SARS-CoV-2 N protein58 (Supplementary Table 8).

SARS-CoV-2 PCR from intestinal biopsies

To determine whether SARS-CoV-2 RNA is present in the gastrointestinal tract, we isolated RNA from endoscopically obtained mucosal biopsies using the Direct-zol miniprep kit (Zymo Research, R2050). Reverse-transcribed cDNA was amplified using the 2019-nCoV RUO Kit (IDT) to detect viral nucleocapsid genomic RNA. Amplification of subgenomic nucleocapsid RNA was performed using the following primers and probe: sgLeadSARSCov2_F 5′-CGATCTCTTGTAGATCTGTTCTC-3′28, wtN_R4 5′-GGTGAACCAAGACGCAGTAT-3′, wtN_P4 5′-/56-FAM/TAACCAGAA/ZEN/TGGAGAACGCAGTGGG/3IABkFQ/-3′.

Quantitative PCR was performed using the QuantiTect probe PCR kit (Qiagen, 204345) under the following conditions: 95 °C 15 min, 95 °C 15 s and 60 °C 1 min using the Applied Biosystems QuantStudio 6 Flex Real-Time PCR System. Viral RNA was considered detected if the Ct for viral primer and probe combinations was < 40. Samples from positive wells were column-purified and the presence of N1 sequences was additionally verified by Sanger sequencing.

SARS-CoV-2 RNA detection by probe proximity ligation

Probes were designed with a 20–25 nucleotide homology to SARS-CoV-2 genomic RNA. Probes were assessed by NCBI BLAST to exclude off-target binding to other cellular transcripts. IDT OligoAnalyzer (Integrated DNA Technologies) was used to identify probe pairs with similar thermodynamic properties: melting temperature 45–60 °C, GC content of 40–55% and low self-complementarity. The 3′ end of each of the probes used for proximity ligation signal amplification was designed with a sequence partially complementary to the 61-bp-long backbone and partially to the 21-bp insert (Supplementary Table 8). Single-molecule fluorescence in situ hybridization (smFISH) probes were designed with a complementary 3′ end to the biotin detection probe (Supplementary Table 8).
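A first-pass screen of candidate probes on length and GC content can be sketched as below. This is only a partial illustration of the criteria listed above: melting temperature and self-complementarity were evaluated with IDT OligoAnalyzer and are not reproduced here, and the candidate sequences are invented rather than taken from Supplementary Table 8.

```python
def gc_content(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def passes_basic_filters(seq, min_len=20, max_len=25, min_gc=40.0, max_gc=55.0):
    """Check homology length (20-25 nt) and GC content (40-55%) only."""
    return min_len <= len(seq) <= max_len and min_gc <= gc_content(seq) <= max_gc

# Hypothetical candidate homology regions, for illustration only.
candidates = [
    "ATGCATGCATGCATGCATGC",  # 20 nt, 50% GC
    "ATATATATATATATATATAT",  # 20 nt, 0% GC
    "GCGCATAT",              # too short
]
selected = [seq for seq in candidates if passes_basic_filters(seq)]
```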

Paraffin-embedded samples were sectioned at 10 μm. Sections were deparaffinized using 100% xylene, 5 min at room temperature, repeated twice. Slides were rinsed in 100% ethanol, 1 min at room temperature, twice, and air-dried. Endogenous peroxidase activity was eliminated by treating the samples with 0.3% hydrogen peroxide, 10 min at room temperature followed by washing with DEPC-treated water. Samples were incubated 15 min at 95–100 °C in antigen retrieval solution (ACDBio), rinsed in DEPC-treated water and dehydrated in 100% ethanol, 3 min at room temperature and air-dried. Tissue sections were permeabilized 30 min at 40 °C using RNAscope protease plus solution (ACDBio) and rinsed in DEPC-treated water.

Hybridization was performed overnight at 40 °C in a buffer based on DEPC-treated water containing 2× SSC, 20% formamide (Thermo Fisher Scientific), 2.5% (v/v) polyvinylsulfonic acid, 20 mM ribonucleoside vanadyl complex (New England Biolabs), 40 U ml−1 RNasin (Promega), 0.1% (v/v) Tween 20 (Sigma Aldrich), 100 μg ml−1 salmon sperm DNA (Thermo Fisher Scientific), 100 μg ml−1 yeast RNA (Thermo Fisher Scientific). DNA probes dissolved in DEPC-treated water were added at a final concentration of 100 nM (Integrated DNA Technologies). Samples were washed briefly and incubated in a buffer containing 2× SSC, 20% formamide, 40 U ml−1 RNasin at 40 °C and then washed four times (5 min each) in wash buffer, PBS, 0.1% (v/v) Tween 20, and 4 U ml−1 RNasin (Promega). Slides were then incubated with 100 nM insert and backbone oligonucleotides in PBS, 1× SSC, 0.1% (v/v) Tween 20, 100 μg ml−1 salmon sperm DNA (Thermo Fisher Scientific), 100 μg ml−1 yeast RNA (Thermo Fisher Scientific), 40 U ml−1 RNasin at 37 °C. After four washes, tissues were incubated at 37 °C with 0.1 U μl−1 T4 DNA ligase (New England Biolabs) in 50 mM Tris-HCl, 10 mM MgCl2, 1 mM ATP, 1 mM DTT, 250 μg ml−1 BSA, 0.05% Tween 20, 40 U ml−1 RNasin, followed by incubation with 0.1 U μl−1 phi29 DNA polymerase in 50 mM Tris–HCl, 10 mM MgCl2, 10 mM (NH4)2SO4, 250 μM dNTPs, 1 mM DTT, 0.05% Tween 20, 40 U ml−1 RNasin pH 7.5 at 30 °C. Slides were washed and endogenous biotin was blocked using Avidin/Biotin blocking kit (Vector Laboratories) according to the manufacturer’s instructions. Rolling circle amplicons were identified using a biotin-labelled DNA probe at a concentration of 5 nM at 37 °C in PBS, 1× SSC, 0.1% Tween 20, 100 μg ml−1 salmon sperm DNA, 100 μg ml−1 yeast RNA. After washing, samples were incubated with 1:100 diluted streptavidin–HRP (Thermo Fisher Scientific) in PBS, 60 min at room temperature followed by washing. Fluorescent labelling was accomplished using Alexa Fluor 647 Tyramide SuperBoost Kit (Thermo Fisher Scientific) according to the manufacturer’s instructions. Hoechst 33342 was used for nuclear counterstaining (Thermo Fisher Scientific) and samples were mounted in ProLong gold antifade (Thermo Fisher Scientific).

SARS-CoV-2 RNA detection by smFISH

Hybridization was performed overnight at 40 °C in a buffer based on DEPC-treated water containing 2× SSC, 20% formamide (Thermo Fisher Scientific), 2.5% (v/v) polyvinylsulfonic acid, 20 mM ribonucleoside vanadyl complex (New England Biolabs), 40 U ml−1 RNasin (Promega), 0.1% (v/v) Tween 20 (Sigma Aldrich), 100 μg ml−1 salmon sperm DNA (Thermo Fisher Scientific), 100 μg ml−1 yeast RNA (Thermo Fisher Scientific). DNA probes dissolved in DEPC-treated water were added at a final concentration of 10 nM (Integrated DNA Technologies). Samples were washed briefly and incubated in a buffer containing 2× SSC, 20% formamide, 40 U ml−1 RNasin at 40 °C and then washed four times in wash buffer, PBS, 0.1% (v/v) Tween 20, and 4 U ml−1 RNasin (Promega). Samples were washed and endogenous biotin was blocked using Avidin/Biotin blocking kit (Vector Laboratories) according to the manufacturer’s instructions. Slides were incubated with a biotin-labelled DNA probe at a concentration of 10 nM at 37 °C in PBS, 1× SSC, 0.1% Tween 20, 100 μg ml−1 salmon sperm DNA, 100 μg ml−1 yeast RNA. After washing, samples were incubated with 1:100 diluted streptavidin–HRP (Thermo Fisher Scientific) in PBS, 60 min at room temperature followed by washing. Samples were labelled using ImmPACT-DAB substrate, counterstained using haematoxylin QS and embedded in VectaMount AQ mounting medium (Vector Laboratories) according to the manufacturer’s instructions.

Data presentation

Extended Data Figures were arranged in Adobe Illustrator 2020.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.