BNT162b2 is one of the two first vaccines that are based on lipid nanoparticle delivery of modified mRNA and are dependent on the host cells for translation and expression of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike (S) protein, which consists of subunits S1 and S2 (ref. 4). S1 contains the receptor binding domain (RBD), which binds the host entry receptor angiotensin-converting enzyme 2 (ACE2) and initiates viral cell entry, whereas S2 mediates virus–cell membrane fusion5,6. RBD is the target of most neutralizing antibodies found in patients with coronavirus disease 2019 (COVID-19)7. The cellular processes that generate potent neutralizing antibodies in response to mRNA vaccines are not fully characterized and to what degree these antibodies protect against new variants remains unclear.

Nine healthy individuals without previous SARS-CoV-2 infection were included in this study (Supplementary Table 1). One individual contracted COVID-19 8 weeks after the second dose. Peripheral blood B cells were investigated by droplet-based single-cell sequencing before vaccination (day 0), as well as 7–9 days (day 7), 21–23 days (day 21) and 28 days (day 28) after the first dose (Fig. 1a). The second dose was given on day 21. Additionally, SARS-CoV-2 S-specific B cells were labeled with S1, S2 and RBD tetramers conjugated to fluorochromes and DNA barcodes and sorted by fluorescence-activated cell sorting (FACS) before sequencing (Fig. 1a). A total of 131,138 B cells were included in the global transcriptomic analysis. Dimensionality reduction by uniform manifold approximation and projection (UMAP)8 and graph-based clustering distinguished nine B cell populations present in all individuals at all time points, including naive B cells (Bnaive cells), unswitched and switched memory B cells (U-Bmem cells and S-Bmem cells, respectively) and plasmablast clusters (Fig. 1b, Extended Data Fig. 1a–c and Supplementary Table 2). Bnaive cells were further sub-categorized into CCR7loPTRCAPhiJUNBhi Bnaive cells (Bnaive1 cells) and CCR7hiPTRCAPloJUNBlo Bnaive cells (Bnaive2 cells) (Extended Data Fig. 1b). S-Bmem cells were divided into PTPN6loBLKloCD86lo resting switched Bmem cells (SR-Bmem cells) and PTPN6hiBLKhiCD86hi activated switched Bmem cells (SA-Bmem cells) (Extended Data Fig. 1b)9. Consistent with the cluster assignments, Bnaive cells mainly expressed low-mutation IgM, whereas Bmem cells and plasmablasts expressed other isotypes with higher frequencies of somatic hypermutation (SHM) (Extended Data Fig. 1d,e). Single-cell transcriptome sequencing on days 0, 7, 21 and 28 showed decreased frequencies of U-Bmem cells (day 0, 35.46%; day 7, 13.88%) and increased frequencies of SR-Bmem cells (day 0, 9.38%; day 7, 23.42%) on day 7, which was sustained until day 28 (Fig. 1c,d).
To understand the fate of U-Bmem cells, we tracked them from day 0 to day 28 using B cell receptor (BCR) sequencing. Clonally related BCR sequences were defined by shared heavy chain and light-chain variable genes and >70% overlap in both CDR3 regions. B cells clonally related to day 0 U-Bmem cells were identified in day 7, 21 and 28 datasets (Fig. 2a). The majority of clonal day 0 U-Bmem cells differentiated into SR-Bmem cells on days 7–28 (day 7, 46.91%; day 21, 63.21%; day 28, 66.34%) (Fig. 2a,b and Extended Data Fig. 2a,b). SHM frequencies of day 0 U-Bmem cells did not differ from clonally related days 7–28 SR-Bmem cells (Extended Data Fig. 2c), suggesting differentiation without germinal center (GC) maturation. Only 1.98% of B cells clonally related to day 0 U-Bmem cells developed into plasmablasts on day 28 (Fig. 2a,b and Extended Data Fig. 2a,b), suggesting that the differentiation of day 0 U-Bmem cells was separate from the plasmablast response to SARS-CoV-2. Pathway enrichment analysis revealed increased expression of B cell activation genes in U-Bmem cells between day 0 and days 7–28 (Fig. 2b and Extended Data Fig. 2f,j). In contrast, the same genes were downregulated in SR-Bmem cells between day 0 and days 7–28 (Fig. 2c and Extended Data Fig. 2g), although upregulation of CD83 and CD69 in SR-Bmem cells indicated that they were recently activated (Extended Data Fig. 2k)10,11. Together, these observations suggest that BNT162b2 activated U-Bmem cells and induced class-switching and differentiation into SR-Bmem cells independent of the GC.

Fig. 1: Single-cell transcriptomic analysis of B cell response to BNT162b2 vaccination.

a, Outline of experimental approach. b–d, Single-cell transcriptome analysis. UMAP visualization of B cell clusters for all individuals at four time points (b). Cluster assignments are based on gene expression and cell surface expression (CITE-seq) of canonical B cell markers. Mean percentages of B cell clusters shown in b (c). Individual and mean percentages of B cell clusters in b at four time points (d). n = 9 individuals at all time points. Individual values, means and s.d. are shown. Exact P values were obtained by two-tailed one-way analysis of variance (ANOVA), followed by Dunnett’s multiple comparison test. Bdn cells, double-negative B cells.

Fig. 2: Unswitched memory B cell differentiation and expansion of IgG+ plasmablasts in response to vaccination.

a, Single-cell BCR repertoire data, showing trajectories of B cell clonal families across time points as alluvial plot (left) and UMAP projection of clonal B cells (right) that are related to U-Bmem cells at day 0 of n = 9 participants. b,c, Single-cell transcriptomic sequencing data showing expression of genes involved in B cell activation and BCR signaling in U-Bmem cells (b) and SR-Bmem cells (c). d, Single-cell BCR repertoire data, showing trajectories of B cell clonal families across time points as alluvial plot (left) and UMAP projection of clonal B cells (right) that are related to plasmablasts at day 28 of n = 9 participants. e–g, Single-cell transcriptomic sequencing data showing expression of genes involved in B cell activation in plasmablasts (e), percentage of IgG+ plasmablasts in all plasmablasts per individual (n = 9) at four time points (f) and average SHM frequency of IgG+ plasmablasts per individual (n = 9) at four time points (g). Clonal families included have ≥3 members at ≥2 time points (a,d). Line thickness represents the number of clonal families. Individual values, means and s.d. are shown (f,g). Exact P values obtained using two-tailed one-way ANOVA, followed by Dunnett’s multiple comparison test. IGHV, immunoglobulin heavy V gene; IGLV, immunoglobulin light V gene.

U-Bmem cells are thought to develop independently of the GC and possess a polyreactive repertoire for rapid B cell responses12. The observed separation of clonal U-Bmem cells from the plasmablast lineages indicates that U-Bmem cells were likely not a major source of S-antigen-specific antibodies. Plasmablasts are short-lived activated B cells that expand in response to antigen stimulation and secrete large amounts of antibodies13. To identify B cell populations directly involved in the antigen-specific response, we focused on clonally related B cell expansions that included plasmablasts at day 28. Based on shared BCR sequences, the majority of B cells clonally related to day 28 plasmablasts were found in the plasmablast compartment at days 0–21 (day 0, 53.57%; day 7, 52.75%; day 21, 66.67%) (Fig. 2d and Extended Data Fig. 2d,e). Considering the short life span of plasmablasts14, this indicates continued recruitment from either Bnaive cell or Bmem cell pools. Genes associated with leukocyte activation and protein processing were enriched in plasmablasts on days 7–28 over day 0 (Fig. 2e and Extended Data Fig. 2h), consistent with their function as antibody producers. Genes associated with SARS-CoV-2 infection were enriched in plasmablasts on day 21 over day 0 (Extended Data Fig. 2h), suggesting that recognition of viral proteins triggers similar pathways in vaccination and infection.

IgA+ plasmablasts are prevalent in peripheral blood during health, while IgG+ plasmablasts increase during systemic infection and vaccination15,16. Frequencies of IgG+ plasmablasts, and in particular IgG1+ plasmablasts, increased between days 0 and 7 and between days 21 and 28 in response to each dose (Fig. 2f and Extended Data Fig. 3a–c). The average SHM frequency in IgG+ plasmablasts decreased from day 0 to day 28 (Fig. 2g and Extended Data Fig. 3d–k). These observations show an influx of minimally mutated IgG+ plasmablasts that developed in response to vaccination, likely originating from the Bnaive cell pool.

To further characterize the S-specific B cell response to vaccination, we labeled B cells with fluorescent S1, S2 and RBD tetramers, each with a unique barcode, and FACS-sorted antigen-tetramer-specific and nonspecific IgA+, IgG+ and IgM+ plasmablasts and S-Bmem cells (Extended Data Fig. 4a). Antigen specificities were determined by their barcode after demultiplexing (Extended Data Fig. 5a–e). S-specific IgA+ and IgG+ plasmablasts expanded at days 7 and 28 (day 0, 0.22%; day 7, 4.58%; day 28, 1.37%) (Fig. 3a and Extended Data Fig. 4b,c). The S-specific IgA+ and IgG+ S-Bmem cell response increased from day 0 to day 28 (day 0, 0.08%; day 28, 0.395%) (Fig. 3a and Extended Data Fig. 4b,e). After day 7, the S-specific B cell response shifted from an IgA+ to an IgG+ response (Fig. 3a and Extended Data Fig. 4d,f).

Fig. 3: Vaccination induces an IgA+ anti-S2 response on day 7 followed by an IgG+ anti-RBD response on days 21 and 28.

a, Flow cytometry data, showing percentages of S-specific plasmablasts in all IgG+ and IgA+ plasmablasts (left), S-specific S-Bmem cells in all IgG+ and IgA+ S-Bmem cells (center) and IgG+ S-specific S-Bmem cells in all S-specific IgG+ and IgA+ S-Bmem cells (right). Individual data points are averages of two independent experiments, including n = 9 participants at four time points. b–f, Single-cell transcriptome and BCR repertoire sequencing data of S-specific and nonspecific sorted B cells showing UMAP visualization with cluster assignments of SR-Bmem cells, SA-Bmem cells and plasmablasts (b, left) and antigen-specificity to S2, RBD and S1n (b, right). Proportions of cells shown in b, separated by antigen and cluster (c), day and antigen (d), antigen and clonality (e) and antigen and immunoglobulin class (f). g, Plasma IgG levels against S2 (left), RBD (center) and S1 (right) for n = 9 individuals at five time points. Area under the curve (AUC) for plasma dilutions are shown as individual data points. h, Ratio of IgG levels of day 120 AUC to day 28 AUC for antigens S2, RBD and S1. Gray dots represent the vaccinated participant who contracted COVID-19 2 weeks before day 120 and is excluded from statistics. Individual data points and medians for n = 9 participants are shown. i, Plasma neutralization of SARS-CoV-2 pseudovirus shown as dilution curves (left) and quantification (NT50) of neutralization titers (right) of n = 9 individuals at five time points. j, Single-cell BCR repertoire sequencing data showing IGHV and IGLV gene SHM frequencies of B cells specific for S2 (n = 210 cells, left), RBD (n = 124 cells, center) and S1n (n = 24 cells, right) at three time points after vaccination. Dashed line indicates the average mutation frequency of sorted nonspecific B cells. Individual data points represent single B cells from n = 9 individuals. Individual values, means and s.d. are shown (a,j). Exact P values according to two-tailed one-way ANOVA, followed by Dunnett’s multiple comparison test (a,g–i), chi-squared test (c–f) and two-tailed Kruskal–Wallis test followed by Dunnett’s multiple comparison test (j). PB, plasmablast.

Antibody responses against distinct epitopes contribute differently to immune protection17. We investigated how the antibody response to each S subunit varies within B cell subsets and over time. UMAP analysis indicated that S2-specific B cells were predominantly plasmablasts and accounted for 82.14% of all S-specific B cells at day 7 (Fig. 3b–d). In contrast, the majority of B cells specific to RBD and specific to S1, but not RBD (S1n), were SA-Bmem cells (Fig. 3b,c and Extended Data Fig. 4g). RBD-specific and S1n-specific B cells increased from <20% on day 7 to >50% on day 28 of all S-specific B cells (Fig. 3b–d). The S2-specific response was highly clonal and dominated by IgA1+ plasmablasts, whereas RBD-specific and S1n-specific SA-Bmem cells used predominantly IgG1 and were less clonally expanded than S2 plasmablasts (Fig. 3e,f). The rapid onset of the S2-specific response and the delayed S1n- and RBD-specific response indicated that the S2-specific B cell response is a secondary response, whereas the S1n and RBD-specific B cell response is a primary response.

The early S2-specific plasmablast response corresponded with S2-specific plasma antibody titers, which rose quickly from day 0 to day 7 and plateaued around day 21. RBD-specific and S1-specific IgG and IgA levels were low until day 21 and increased at day 28 in response to the second dose (Fig. 3g and Extended Data Fig. 6a,b). Accordingly, day 28 plasma neutralized the SARS-CoV-2 Wuhan-Hu-1 pseudovirus more potently than day 0–21 plasma (Fig. 3i). Notably, by day 120, RBD-specific and S1-specific IgG levels decreased by 52%, whereas S2-specific IgG levels decreased only by 23.61% (excluding the participant who contracted COVID-19 2 weeks before day 120) (Fig. 3g,h). This suggests less-effective induction of RBD-specific long-lived plasma cells upon recruitment from Bnaive cells.

The participant with the lowest RBD-specific IgG and IgA levels contracted COVID-19 8 weeks after the second vaccination (Extended Data Fig. 6a–d). While this participant produced high-affinity neutralizing B cells (Fig. 4f,g), antibody levels were not protective. In addition, the participant’s antibody levels did not increase by day 120, 2 weeks after COVID-19 infection (Extended Data Fig. 6a–c), suggesting that the low antibody levels were not due to insufficient effectiveness of the mRNA vaccine.

Fig. 4: Neutralization of SARS-CoV-2 pseudovirus and variants by BNT162b2-induced antibodies.

a,b, Single-cell sequencing data indicating abundance of barcoded and fluorochrome-tagged antigen tetramers for each sorted B cell (two-column heat maps, one barcode corresponding to PE and APC, respectively), paired with ELISA data of the corresponding expressed monoclonal antibodies (bar graphs). Reactivity against S2, RBD and S1 is shown. Barcoded tetramer binding data are represented as centered log ratio-transformed (CLR) counts. ELISA data showing reactivity of the same monoclonal antibodies as in a against OC43 and HKU1 spike (b). ELISA data are shown as AUC of plasma dilutions. Threshold at 0 was set to the average binding to bovine serum albumin (BSA) plus its threefold s.d. c,d, Single-cell BCR repertoire sequencing data showing IGHV and IGLV SHM frequencies of RBD- and S1n-specific B cells (high, n = 17 cells; low, n = 18 cells) (c) and S2-specific B cells (high, n = 8 cells; low, n = 6 cells) (d). High-binding monoclonal antibodies are defined by data shown in a as positive for both PE and APC barcode tetramer binding and ELISA AUC > 3. Individual values, means and s.d. are shown. Exact P values were obtained using unpaired two-tailed Student’s t-test. e–g, Neutralization of SARS-CoV-2 pseudovirus Wuhan-Hu-1 and variants Alpha to Epsilon are shown as dilution curves for antibodies RBD-11 (e, left) and RBD-4 (e, right) and as heat maps indicating half-maximum inhibitory concentration (IC50) (ng ml−1) for each monoclonal antibody (f) and 50% neutralization titer (NT50) (reciprocal dilution) of day 28 plasma for each of the nine participants (g).

Neutralizing antibodies from patients with COVID-19 show characteristically low mutation rates18, indicating recruitment of Bnaive cells to the GC in response to a novel antigen. We found that on day 7, when the majority of antigen-specific Bmem cells and plasmablasts were S2-specific (Fig. 3d), mutation frequencies of antigen-specific Bmem cells and plasmablasts were similar to those of control-sorted Bmem cells and plasmablasts without specificity to S-antigens (Fig. 3j and Extended Data Fig. 7a–c). In contrast, mutation frequencies of antigen-specific Bmem cells and plasmablasts on days 21 and 28, when the majority of B cells reacted to S1 and RBD, were significantly lower than on day 7 (Fig. 3d,j and Extended Data Fig. 7a–c). Thus, the SHM analysis indicated that the S2-specific response was a recall response from Bmem cells, whereas the delayed S1n- and RBD-specific response was a primary response from Bnaive cells.

To further characterize the primary and secondary B cell response, we expressed 50 representative BCR sequences from S-antigen-specific B cells (14 S2-specific, 30 S1- and RBD-specific and 6 S1n-specific) as recombinant monoclonal antibodies (Supplementary Table 3). ELISA and biolayer interferometry indicated that of the selected monoclonal antibodies, 8 potently bound S2, 15 bound RBD and 3 bound S1n (Fig. 4a and Extended Data Fig. 8). When tested against S-protein of four other human-pathogenic coronaviruses (229E, NL63, OC43 and HKU1), 9 and 8 of the 12 S2-reactive monoclonal antibodies cross-reacted with betacoronaviruses OC43 and HKU1 S-protein, respectively, but none cross-reacted with the alphacoronaviruses 229E and NL63 (Fig. 4b). Of the RBD-specific monoclonal antibodies, only the polyreactive monoclonal antibody RBD-16 (Extended Data Fig. 9) showed cross-reactivity to OC43 and HKU1 S-protein (Fig. 4b). These observations are in line with a higher conservation of S2 than S1 across coronavirus species19,20.

Monoclonal antibodies that bound strongly to S1 by ELISA and biolayer interferometry (high binders) harbored low SHM frequencies in comparison to monoclonal antibodies that bound S1-specific antigen tetramers but showed little or no reactivity to S-antigens by ELISA (low binders) (Fig. 4c and Extended Data Fig. 7d). In contrast, SHM frequencies in S2-specific monoclonal antibodies did not differ significantly between high and low binders (Fig. 4d and Extended Data Fig. 7e). Monoclonal antibodies with the highest affinity for S2 were derived from day 7 plasmablasts, whereas monoclonal antibodies with the highest affinity for S1 (including RBD) originated from S-Bmem cells from days 21 and 28 (Fig. 4a). While a correlation between increased affinity and low SHM frequency is counterintuitive, it is characteristic of the B cell repertoire in patients with COVID-19 (refs. 21,22) and is likely caused by the reduced structural overlap of the S1 protein with other pathogens, requiring recruitment from the Bnaive cell pool.

The emergence of novel SARS-CoV-2 variants could jeopardize vaccine efficacy. RBD mutations contribute significantly to immune escape23. After testing neutralization efficacy of all RBD-specific monoclonal antibodies to the Wuhan-Hu-1 pseudovirus (Extended Data Fig. 10), we selected the ten best Wuhan-Hu-1-neutralizing RBD-specific monoclonal antibodies and tested their neutralization of five SARS-CoV-2 variants, Alpha+E484K (B.1.1.7 + E484K; N501Y, E484K), Beta (B.1.351; N501Y, K417N), Gamma (P.1; N501Y, K417T), Delta (B.1.617.2; L452R, T478K) and Epsilon (B.1.429; L452R) (Supplementary Table 4). Each variant was neutralized by at least four of the ten monoclonal antibodies (Fig. 4e,f and Extended Data Fig. 10), indicating at least partial protection. Most antibodies neutralized either Alpha, Beta and Gamma (N501Y) or Delta and Epsilon (L452R) (Fig. 4f), indicating that neutralization was dependent on RBD mutations shared between the variants. Two antibodies were broadly neutralizing (Fig. 4f and Extended Data Fig. 10). Additionally, we tested neutralization efficacy in plasma of the nine participants (Fig. 4g). Overall, plasma neutralization potency was highest against Wuhan-Hu-1, followed by Epsilon and Delta and lowest against Alpha, Beta and Gamma. Together, despite low plasma neutralization potency against variants in some participants, we detected B cells with potently neutralizing BCRs against variants in the same participants.

Here we provided a detailed characterization of the B cell response to the BNT162b2 mRNA vaccine at a single-cell level. Parsing the S1-specific and S2-specific responses provides important insights into why a second dose is vital for protection. Our results indicate that the first dose activated a non-neutralizing recall response that initially targets epitopes in the S2-protein subunit, conserved across human-pathogenic betacoronaviruses19,20, while the second dose boosted the neutralizing B cell responses to S1 including RBD.

The first dose provides an IgA-dominant plasmablast response against S2 with high SHM, which is cross-reactive to the betacoronaviruses OC43 and HKU1. This is consistent with a recall response of mucosal Bmem cells to previous pulmonary coronavirus infections. The first dose conveys a degree of protection against COVID-19 (ref. 3) and this initial S2-specific response likely contributes to it. S2-specific antibodies can neutralize SARS-CoV-2 by inhibiting virus–cell membrane fusion and boosting antiviral T cell immunity24,25,26.

After the S2-specific response on day 7, the frequency of minimally mutated SA-Bmem cells increased on days 21 and 28. The SA-Bmem cells at these later time points mostly targeted S1, including RBD, and their low SHM frequency indicates a primary B cell response. As shown for B cell responses in influenza vaccine settings, blocking of a dominant epitope by early antibodies (anti-S2) can facilitate the development of later antibodies against another epitope (anti-S1)27. We observed that low mutation frequency corresponded to high affinity against RBD. High BCR affinity and a naive phenotype foster preferential recruitment into GC28,29,30 and high affinity also promotes release from the GC as plasmablasts, plasma cells or Bmem cells31. High affinity in minimally mutated BCRs could therefore limit GC maturation to a relatively short time frame. mRNA vaccines against SARS-CoV-2 induce robust and prolonged GC reactions, with plasmablasts and S-Bmem cells persisting in GC for over 3 months32,33. However, despite the prolonged GC reaction, the more rapid decrease of antibody titers against RBD and S1 than against S2 suggests less-efficient development of long-lived plasma cells upon recruitment from Bnaive cells34.

Plasma neutralization assays indicated a marked degree of immune escape of SARS-CoV-2 variants. However, several selected monoclonal antibodies from the same participants were neutralizing, indicating that even individuals with low neutralizing titers can raise a recall memory response after infection with a SARS-CoV-2 variant. Together, our study provides a detailed characterization of the blood B cell response to the BNT162b2 mRNA vaccine. Our data emphasize the importance of the second dose in inducing generation of RBD-specific antibodies that contribute to neutralization of SARS-CoV-2 variants.


Study design, sample collection and storage

All studies were approved by the Institutional Review Board of Stanford University (IRB-3780) and the studies complied with relevant ethical regulations. All participants provided written informed consent before participating in the study. Nine healthy individuals were enrolled in the study (Supplementary Table 1). All individuals had undergone routine PCR with reverse transcription (RT–PCR) testing for SARS-CoV-2 infection before the study. None of the participants had been previously diagnosed with SARS-CoV-2 infection. No statistical methods were used to predetermine sample sizes but our sample sizes are similar to those reported in previous publications35,36,37. Blood samples were collected in heparin tubes (BD) at four different time points: pre-vaccination (day 0), 7–9 d after initial vaccination (day 7), on the day of and before the second dose, 21–23 d after initial vaccination (day 21), and 28–30 d after initial and 7–9 d after second vaccination (day 28). Plasma samples were obtained after centrifugation and stored at −80 °C. Peripheral blood mononuclear cells (PBMCs) were obtained by density gradient centrifugation over Ficoll PLUS medium (Cytiva) and stored in cell-freezing medium (Thermo Fisher Scientific). PBMCs were aliquoted and stored until use at −80 °C.

Generation of barcoded fluorescent antigen tetramers

Recombinant Avi-tag biotinylated SARS-CoV-2 S2 protein (Acro Biosystems, S2N-C52E8-25ug), SARS-CoV-2 RBD (Acro, SPD-C82E9-25ug) and SARS-CoV-2 S1 (Acro, S1N-C82E8-25ug) were mixed with barcoded, fluorescently labeled streptavidin (BioLegend) at a 4:1 molar ratio for 45 min while rotating. Excess biotin was added to saturate all streptavidin binding sites.

Flow cytometry, cell sorting and 10X sample preparation

PBMCs were thawed at 37 °C, treated for 15 min with DNase and washed in complete RPMI. PBMCs were enriched for B cells using the EasySep Human Pan-B Cell Enrichment kit (Stem Cell Technologies) according to the manufacturer’s instructions. B cell samples without antigen enrichment were stained with CD19, IgD, CD27 and CD38 TotalSeq-C antibodies (0.5 µg per 1,000,000 cells) (all BioLegend). For antigen-sorted B cell samples, cells were stained with the following fluorescently labeled antibodies according to standard protocols: CD19 (1:100 dilution), CD20 (1:300 dilution), CD38 (1:100 dilution) (all BD Biosciences), CD3 (1:60 dilution), CD27 (1:100 dilution), IgM (1:100 dilution), IgD (1:100 dilution) (all BioLegend), IgA (1:100 dilution) (Miltenyi Biotec), Sytox blue (1:1,000 dilution) (Thermo Fisher Scientific) and S-antigen tetramers. Additionally, sorted samples were labeled with TotalSeq-C hashtag 1–9 antibodies (0.5 µg per 1,000,000 cells) (BioLegend) for demultiplexing individual samples in downstream analysis. Single cells were sorted with a FACSAria II cell sorter (BD Biosciences) into cooled 1.5-ml tubes (BioRad). FACS data were collected with the BD FACSDiva (v.8.0) software. FlowJo v.10.7.1 (BD Biosciences) was used for flow cytometry data analysis. Flow cytometry experiments were performed with n = 9 biological replicates (study participants) and the experiment was performed twice.

Droplet-based single-cell sequencing

Using a Single-Cell 5′ Library and Gel Bead kit v.1.1 (10X Genomics, 1000165) and Next GEM Chip G Single-Cell kit (10X Genomics, 1000120), the cell suspension was loaded onto a Chromium single-cell controller (10X Genomics) to generate single-cell gel beads in the emulsion (GEMs) according to the manufacturer’s protocol. Briefly, approximately 8,000 cells were added to each channel and approximately 4,000 target cells were recovered. Captured cells were lysed and released RNA was barcoded through reverse transcription in individual GEMs. The 5′ gene expression (GEX) libraries, single-cell V(D)J libraries (1000016) and cell surface protein libraries were constructed according to manufacturer protocols. Library quality was assessed using a 2200 TapeStation (Agilent). The libraries were sequenced using an Illumina Novaseq6000 sequencer with a paired-end 150-bp (PE150) reading strategy (Novogene).

Single-cell RNA-seq data processing

Raw gene expression and cell surface matrices were generated for each sample by the Cell Ranger Pipeline (v.6.0.1) coupled with human reference version GRCh38. Briefly, gene expression analyses of single cells were conducted using the R package Seurat (v.4.0.2) to perform data scaling, transformation, clustering, dimensionality reduction, differential expression analyses and most visualization38. The count matrix was filtered to remove cells with >10% of mitochondrial genes or low gene counts (<600 for enriched B cells and <200 for sorted B cells). The normalized data were integrated into one Seurat data file using the IntegrateData function. Principal component analysis was performed using variable genes. We compared the ranking of principal components (PCs) with the percentage of variance to determine the number of first-ranked PCs to use for UMAP8 to reduce the integrated dataset into two dimensions. Afterwards, the same number of first-ranked PCs were used to construct a shared nearest-neighbor graph, which was used to cluster the cells. For sorted cells, all sorted cells (antigen-specific and nonspecific) were used in the UMAP. Contaminant cells (non-B cells) were removed and the data were normalized, integrated and clustered with only B cells. Specific B cell clusters were identified using canonical B cell markers9 (Extended Data Fig. 1).
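The cell-level quality filter described above can be sketched in a few lines of Python. This is an illustration only and not the Seurat code used in the study; the `CellQC` record and function names are hypothetical, and "gene counts" is interpreted here as the number of genes detected per cell, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    barcode: str
    n_genes: int     # genes detected in the cell (assumed meaning of "gene counts")
    pct_mito: float  # percentage of counts from mitochondrial genes


# Thresholds as stated in the text; the dictionary keys are illustrative labels.
MAX_PCT_MITO = 10.0
MIN_GENES = {"enriched": 600, "sorted": 200}


def passes_qc(cell: CellQC, sample_type: str) -> bool:
    """Remove cells with >10% mitochondrial counts or too few detected genes."""
    return cell.pct_mito <= MAX_PCT_MITO and cell.n_genes >= MIN_GENES[sample_type]


def filter_cells(cells, sample_type):
    """Return only the cells that pass the filter for the given sample type."""
    return [cell for cell in cells if passes_qc(cell, sample_type)]
```

Under these assumptions, a cell with 450 detected genes would be removed from an enriched B cell sample but retained in a sorted sample.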

Identification of differentially expressed genes and functional enrichment

We performed differential gene expression testing using the FindMarkers function in Seurat with Wilcoxon rank-sum test and the Benjamini–Hochberg method was used to adjust the P values for multiple hypothesis testing. Differentially expressed genes were filtered using a minimum log2(fold change) of 0.25 and a maximum false discovery rate value of 0.05. Pathway analysis for differentially expressed genes was conducted using Metascape39.
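As a minimal sketch of the filtering step, the following Python code applies a Benjamini–Hochberg adjustment to a list of P values and keeps genes that meet both thresholds. It does not reproduce the Seurat workflow; the function names are hypothetical, and applying the fold-change cutoff to the absolute value of log2(fold change) is an assumption.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that adjusted
    # values never increase as the rank decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def filter_degs(genes, log2_fcs, p_values, min_abs_log2fc=0.25, max_fdr=0.05):
    """Keep genes with |log2FC| >= 0.25 and adjusted P value <= 0.05."""
    fdr = benjamini_hochberg(p_values)
    return [
        gene
        for gene, lfc, q in zip(genes, log2_fcs, fdr)
        if abs(lfc) >= min_abs_log2fc and q <= max_fdr
    ]
```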

VDJ sequence analysis

BCR VDJ regions were generated for each sample using the Cell Ranger Pipeline (v.6.0.1). BCR sequences were then filtered to include cells that have one light and one heavy chain per cell. Consensus sequences were aligned to germline variable-chain immunoglobulin sequences with IMGT HighV-QUEST v.1.8.3 (ref. 40). Clonal families were defined based on sharing the same heavy and light-chain V and J genes with >70% amino acid identity in heavy and light-chain CDR3s. Mutations were identified by aligning the nucleotide sequence to germline variable-chain immunoglobulin sequences with IMGT HighV-QUEST. To calculate the mutation frequency, we divided the number of mutations (silent and nonsilent) by the length of the V gene.
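The clonal family definition and the mutation frequency calculation can be illustrated with a short Python sketch. Several choices here are assumptions that the text does not specify: CDR3 identity is computed position by position on sequences of equal length (unequal lengths count as non-identical), and families are built by single-linkage grouping within each set of shared V and J genes. The `Bcr` record and all function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Bcr:
    cell_id: str
    heavy_v: str
    heavy_j: str
    light_v: str
    light_j: str
    heavy_cdr3: str  # amino acid sequence
    light_cdr3: str  # amino acid sequence


def cdr3_identity(a: str, b: str) -> float:
    """Fraction of identical positions; 0.0 if lengths differ (assumption)."""
    if not a or len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def find_root(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def clonal_families(bcrs, threshold=0.70):
    """Group cells sharing V and J genes and >70% identity in both CDR3s."""
    by_genes = defaultdict(list)
    for bcr in bcrs:
        by_genes[(bcr.heavy_v, bcr.heavy_j, bcr.light_v, bcr.light_j)].append(bcr)

    families = []
    for members in by_genes.values():
        parent = list(range(len(members)))
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                a, b = members[i], members[j]
                if (cdr3_identity(a.heavy_cdr3, b.heavy_cdr3) > threshold
                        and cdr3_identity(a.light_cdr3, b.light_cdr3) > threshold):
                    parent[find_root(parent, i)] = find_root(parent, j)
        clusters = defaultdict(list)
        for i, member in enumerate(members):
            clusters[find_root(parent, i)].append(member.cell_id)
        families.extend(sorted(ids) for ids in clusters.values())
    return sorted(families)


def mutation_frequency(n_mutations: int, v_gene_length: int) -> float:
    """Silent plus nonsilent mutations divided by the V gene length."""
    return n_mutations / v_gene_length
```

The pairwise comparison is quadratic in the number of cells per gene combination, which is acceptable for a sketch but would likely need a faster strategy for large repertoires.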

Recombinant monoclonal antibody production

Heavy chain and light-chain variable sequences were codon optimized and cloned into in-house vectors, containing human IgG1 constant region or κ or λ constant regions, respectively. Expi293F cells were transfected with heavy chain and light-chain plasmids using FectoPro (Polyplus transfection). Medium was collected after 7 days and monoclonal antibodies were purified with AmMag Protein A magnetic beads (Genscript). Antibody concentrations were measured with a nanodrop spectrophotometer (Thermo Fisher Scientific) and human IgG quantitation ELISAs (Bethyl Laboratories).


ELISA

For protein ELISAs, MaxiSorp 384-well plates (Thermo Fisher Scientific) were coated with 1 µg ml−1 recombinant SARS-CoV-2 S2 protein (Acro, S2N-C52H5), SARS-CoV-2 RBD (Acro, SPD-C52H3) or SARS-CoV-2 S1 (Acro, S1N-C52H3), HCoV-OC43 spike protein (Sino Biological, 40607-V08B), HCoV-HKU1 spike protein (Sino Biological, 40606-V08B), HCoV-229E spike protein (Sino Biological, 40605-V08B), HCoV-NL63 spike protein (Sino Biological, 40604-V08B), 10 µg ml−1 LPS (Sigma), 10 µg ml−1 calf thymus DNA (Invitrogen), 5 µg ml−1 of insulin (Sigma) or 2 µg ml−1 of flagellin (Invivogen) in carbonate-bicarbonate buffer at 4 °C overnight. Plates were washed six times with PBST (PBS + 0.1% Tween20) after each step. The plates were blocked with blocking buffer (PBS + 1% BSA) for 1 h at room temperature. Human plasma was serially diluted and added for 1 h at room temperature. Human monoclonal antibodies were added at concentrations of 10 µg ml−1 and three tenfold serial dilutions and incubated overnight at 4 °C. HRP-conjugated goat anti-human IgG (Bethyl Laboratories) or HRP-conjugated goat anti-human IgA (Bethyl Laboratories) secondary antibodies were applied for 1 h at room temperature and plates were developed with TMB substrate (Thermo Fisher Scientific) and stopped with 2 N sulfuric acid. On each plate, four dilutions of positive-control plasma and secondary-only controls were run. Additionally, a BSA-only plate was run in parallel to each antigen. Plates were read on a GloMax Explorer Microplate Reader (Promega). ELISAs were performed at least twice, in duplicate or triplicate.
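The ELISA readouts reported in the figures (AUC of a dilution series, and a zero threshold set to the mean BSA binding plus three times its s.d.) can be sketched in Python as follows. The details are assumptions, since the text does not state them: the AUC is computed by the trapezoidal rule over log10 of the dilution factor, the s.d. is the sample standard deviation, and the threshold is subtracted from the antigen value. All function names are hypothetical.

```python
import math
import statistics


def bsa_threshold(bsa_values):
    """Mean BSA binding plus three times its sample s.d."""
    return statistics.mean(bsa_values) + 3 * statistics.stdev(bsa_values)


def dilution_auc(dilution_factors, signals):
    """Trapezoidal area under the signal curve over log10(dilution factor)."""
    points = sorted(zip((math.log10(d) for d in dilution_factors), signals))
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


def relative_to_threshold(value, threshold):
    """Shift a value so that 0 corresponds to the BSA-derived threshold."""
    return value - threshold
```

For example, `dilution_auc([10, 100, 1000], [2.0, 1.0, 0.0])` integrates two unit-width trapezoids on the log10 axis.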

Biolayer interferometry

Monoclonal antibody interactions with S2, RBD and S1 protein were measured on an Octet Red96e (Fortebio/Sartorius). Association and dissociation curves were measured with monoclonal antibodies bound to anti-human IgG Fc Capture (AHC) sensors at 20 nM and antigens in solution at 0, 16.7, 50, 150 and 450 nM in 1× kinetic buffer (Fortebio/Sartorius). BLI analysis software (Fortebio/Sartorius, v.7.1) was used for data processing and analysis. Buffer controls were subtracted from antigen values and curves were fitted globally for each group, consisting of all concentrations of the same ligand. Association and dissociation curves and constants as well as KD values for each antibody were reported and graphed with GraphPad Prism. Biolayer interferometry assays were performed 1–2 times, in triplicate.
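For orientation, the relationship between the fitted rate constants and KD can be sketched in Python. The text does not state which binding model the vendor software applied, so the 1:1 (Langmuir) model below is an assumption, and the function and parameter names are hypothetical.

```python
import math


def dissociation_constant(k_on, k_off):
    """KD (M) from association rate k_on (1/(M s)) and dissociation rate k_off (1/s)."""
    return k_off / k_on


def association_response(t, conc, k_on, k_off, r_max):
    """Response at time t (s) during association for a 1:1 binding model.

    conc is the analyte concentration in M and r_max the response at saturation.
    """
    k_obs = k_on * conc + k_off
    r_eq = r_max * conc / (conc + k_off / k_on)
    return r_eq * (1 - math.exp(-k_obs * t))


def dissociation_response(t, r_start, k_off):
    """Response at time t (s) after the start of dissociation from r_start."""
    return r_start * math.exp(-k_off * t)
```

In a global fit, a single pair of rate constants is shared across all analyte concentrations of the same ligand (here 16.7, 50, 150 and 450 nM), which is why the concentration appears as an explicit argument.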

Cell culture

Expi293F cells were cultured in 33% Expi293 Expression Medium (Gibco) and 67% Freestyle293 Expression Medium (Gibco). HeLa-ACE2 were kindly provided by D. Burton41 and were cultured in Eagle’s minimum essential medium (ATCC, 30-2003) with 10% heat-inactivated FBS (Corning) and 100 U ml−1 penicillin/streptomycin (Gibco). LentiX 293T cells (Takara Bio) were cultured in DMEM (ATCC, 30-2002) with 10% heat-inactivated FBS (Corning) and 100 U ml−1 penicillin/streptomycin (Gibco).

Generation of SARS-CoV-2 spike pseudotyped lentiviral particles

Pseudotyped lentiviral particles were generated as previously described42. Briefly, LentiX 293T cells were seeded in 10-cm plates. After 24 h, cells were transfected using Fugene transfection reagent (Promega) with pHAGE-CMV-Luc2-IRES-ZsGreen-W, lentiviral helper plasmids (HDM-Hgpm2, HDM-tat1b and pRC-CMV-Rev1b) and wild-type or variant SARS-CoV-2 spike plasmids (parent plasmids publicly available from J. Bloom laboratory). After 48–60 h, viral supernatants were collected and spun at 1,000g for 10 min to remove cell debris. Lentiviral supernatants were concentrated using LentiX concentrator (Takara) according to the manufacturer’s instructions. Lentiviral pellets were resuspended in EMEM at a 20-fold viral concentration and stored at −80 °C. Virus was titrated on HeLa-ACE2 cells.

Viral inhibition assays

Neutralization assays were performed as previously described42. Briefly, eightfold serially diluted plasma starting at 1:80 from vaccinated individuals or fivefold serially diluted monoclonal antibodies starting at a concentration of 10 µg ml−1 were incubated with SARS-CoV-2 pseudotyped virus for 1 h at 37 °C. The mixture was added to HeLa-ACE2 cells plated the previous day. Approximately 50 h post-infection, luciferase activity was measured on a GloMax Explorer Microplate Reader (Promega). Neutralization assays were performed 1–2 times in triplicate.
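The dilution scheme and the derivation of a 50% neutralization titer can be sketched in Python. The normalization to a virus-only control and the log-linear interpolation used to estimate NT50 are assumptions, because the text does not describe how titers were calculated; the number of dilution steps and all function names are likewise hypothetical.

```python
import math


def serial_dilutions(start, fold, n_steps):
    """Reciprocal dilutions, e.g. start=80, fold=8 gives 80, 640, 5120, ..."""
    return [start * fold ** i for i in range(n_steps)]


def percent_neutralization(signal, virus_only, cells_only=0.0):
    """Percentage reduction in luciferase signal relative to a virus-only control."""
    return 100.0 * (1 - (signal - cells_only) / (virus_only - cells_only))


def nt50(dilutions, neutralization):
    """Reciprocal dilution at which neutralization falls to 50%.

    Uses linear interpolation on log10(dilution) between the two bracketing
    points; returns None if the curve never crosses 50%.
    """
    points = list(zip(dilutions, neutralization))
    for (d0, n0), (d1, n1) in zip(points, points[1:]):
        if n0 >= 50.0 > n1:
            fraction = (n0 - 50.0) / (n0 - n1)
            log_d = math.log10(d0) + fraction * (math.log10(d1) - math.log10(d0))
            return 10 ** log_d
    return None
```

A four-parameter logistic fit is a common alternative to interpolation; the interpolation is used here only to keep the sketch free of external dependencies.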

Statistics and software

GraphPad Prism v.9.1.0 and R v.4.0.3 were used for statistical analyses. Statistical tests used and significance levels are indicated in the respective methods section or in the figure legends. Normal distribution was assumed for the nine biological replicates where Student’s t-test and ANOVA were performed. Graphical illustrations were created with BioRender.

Materials availability

Materials generated in this study will be made available on request and may require a material transfer agreement.

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.