COVID-19, caused by SARS-CoV-2, is a worldwide pandemic.1 By January 28, 2021, more than 100 million cases had been diagnosed, and more than 2 million deaths had been reported (https://coronavirus.jhu.edu/map.html). A comprehensive and in-depth elucidation of SARS-CoV-2-specific IgG responses will help us to better understand COVID-19 immunity and facilitate the precise development of neutralizing antibodies and vaccines. To date, IgG responses have been at least partially characterized at the protein and peptide levels,2,3,4 but not at the amino acid level. Herein, we aimed to go one step further and dissect SARS-CoV-2-specific IgG responses systematically at the amino acid level. We adopted AbMap,5 a recently developed method for high-throughput epitope mapping. A set of 55 convalescent sera and 226 protein/peptide-enriched antibodies from these sera were analyzed. A map of SARS-CoV-2-specific IgG epitopes at amino acid resolution was generated for the first time.

We collected sera from 55 convalescent patients and 25 controls (Table S1). The patients were hospitalized at Foshan Fourth Hospital in China for various durations between January 25 and March 8, 2020. The 55 convalescent sera were subjected to AbMap analysis directly. We identified 418 motifs (Table S2), 275 of which could be matched to 27 of the 28 known SARS-CoV-2 proteins (Table S3, Fig. S1A). Viewed vertically (Fig. S1A), the data clearly showed significant differences among the patients. The number of patients was inversely proportional to the frequency (Fig. S1B). Viewed horizontally, the data showed large variations in sum_epitopes and in the ratio of sum_epitopes to protein length (Fig. S1C).

To reveal more SARS-CoV-2 epitopes of high confidence, we enriched the protein-specific antibodies from each sample in a sequential manner, i.e., RBD, S1, S2, N (Fig. S2) and Orf9b. A variety of epitopes were identified at high confidence and could be matched to the sequences of the corresponding proteins (Tables S2 and S3).

We plotted the epitopes and their frequencies alongside the linear sequence and domains of the S protein (Fig. 1A). Two areas that were rich in significant epitopes were identified. The first area covers almost the entire CTD (C-terminal domain), and the second area covers the S2’ protease cleavage site and the fusion peptide (FP). Significant epitopes were also identified at the cytoplasmic C-terminal end of the S protein. These results are highly consistent with a peptide microarray-based study, in which ~4000 samples were analyzed against a set of 197 peptides that covers the entire S protein.6,7 In addition, we took the S protein as an example to demonstrate the necessity of enriching SARS-CoV-2-specific antibodies from sera (Fig. S3). We defined epitopes with a frequency ≥ 3 as significant epitopes and obtained 28 epitopes (Fig. S4A); we mapped them to the 3D structure of the S protein8 monomer (Fig. 1B) and trimer (Fig. S4B). Most of these epitopes are located on the surface of the S protein monomer and trimer (Fig. 1C, D and Table S4). High homology was observed among SARS-CoV-2, SARS-CoV and BtCoV-RaTG13, especially between SARS-CoV-2 and SARS-CoV (Fig. 1E and Table S4).
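The significance cutoff described above can be illustrated with a minimal sketch. It assumes that the frequency of an epitope is the number of samples in which that epitope was identified; the function name, the sample identifiers and the epitope labels are hypothetical placeholders, not data or code from this study.

```python
from collections import Counter


def significant_epitopes(per_sample_epitopes, min_frequency=3):
    """Return {epitope: frequency} for epitopes found in >= min_frequency samples.

    Assumption: frequency = number of samples in which the epitope was identified.
    """
    counts = Counter()
    for epitopes in per_sample_epitopes.values():
        # Count each epitope at most once per sample.
        counts.update(set(epitopes))
    return {epitope: n for epitope, n in counts.items() if n >= min_frequency}


# Hypothetical toy input: sample ID -> epitopes identified in that sample.
toy = {
    "sample_1": ["EPI_A", "EPI_B"],
    "sample_2": ["EPI_A"],
    "sample_3": ["EPI_A", "EPI_B", "EPI_C"],
    "sample_4": ["EPI_B", "EPI_B"],
}
result = significant_epitopes(toy)
print(result)
```

In this toy input, EPI_A and EPI_B each occur in three samples and pass the cutoff, whereas EPI_C occurs in only one sample and is excluded.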

Fig. 1

Significant epitopes on the S protein identified by AbMap and validated by peptide microarray. A The distribution of the sequence-matched epitopes on the S protein. B The distribution of the significant epitopes (frequency ≥ 3) on the 3D structure of the S protein monomer. Red amino acids represent key residues of epitopes. C A representative significant epitope. D The epitope was matched to the S protein, and critical residues were labeled yellow. E Homology analysis of the epitope among coronaviruses. F Validation of epitope identification procedures by a peptide microarray: comparison of sera and protein-enriched antibodies. Peptide 1, which corresponds to Epitope-S1-8 (KLFPFQQF), was selected as an example. The ratio of signal intensity (S1 enriched)/signal intensity (serum) for each residue was plotted (right)

Biotinylated RBD was also applied to enrich specific antibodies from COVID-19 convalescent sera (Fig. S1). One relatively significant epitope was identified (Fig. S5A). To visualize the conformational location of this epitope, we mapped the critical epitope residues (yellow) to the 3D structure of the RBD (Fig. S5B). It is clear that the critical epitope residues are not on the exact binding interface of RBD and ACE2 but adjacent to it (Fig. S5C), which was similar to the results reported for several well-studied antibodies (Fig. S5D–J). Antibodies targeting this epitope may cause conformational changes or spatial hindrance, thus interfering with the binding of ACE2 and these neutralizing antibodies to the RBD.

To explore the roles of the critical epitope residues, we matched the mutation sites with the epitope map of the S protein (Fig. 1A) and found that six mutation sites are also critical epitope residues (Fig. S6A and Table S5), of which D936Y and P1263L could cause 60–70% and 98–99% infectivity loss, respectively.9 These mutation sites were also of high frequency (Fig. S6B). Mutations with high frequency, e.g., D614G,10 were not identified as critical epitope residues. Furthermore, most of the identified critical residues have a low frequency of mutation. Overall, it is reasonable to argue that these critical residues, especially D936 and P1263, may be useful for vaccine development.
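The matching of mutation sites to critical epitope residues can be sketched as a simple position lookup. This is a minimal illustration, not the procedure used in this study: the helper names are hypothetical, and only the three mutations and the two critical positions named in the text above are used as input.

```python
import re

# Mutation labels of the form <reference residue><position><substituted residue>.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label such as 'D936Y' into ('D', 936, 'Y')."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def mutations_at_critical_residues(mutations, critical_positions):
    """Keep the mutations whose position is a critical epitope residue."""
    return [m for m in mutations if parse_mutation(m)[1] in critical_positions]


# Inputs taken from the text: D936 and P1263 are critical residues, D614 is not.
mutations = ["D614G", "D936Y", "P1263L"]
critical_positions = {936, 1263}
hits = mutations_at_critical_residues(mutations, critical_positions)
print(hits)
```

With a full list of S protein mutations and the complete set of critical positions, the same lookup would return every mutation that coincides with a critical epitope residue.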

To present the distribution of the epitopes on the N protein, we also matched the epitopes to the N protein sequence. The epitopes were distributed unevenly across the N protein. One area rich in significant epitopes was readily identified (Fig. S7A), and 6 significant epitopes were identified when the cutoff was set as frequency ≥ 3 (Fig. S7B). High homology for the epitopes and the critical epitope residues was observed among SARS-CoV-2, SARS-CoV and BtCoV-RaTG13 for all 6 significant epitopes (Fig. S7C).

To validate the key residues of the significant epitopes, we selected four epitopes, i.e., Epitope-S1-8 and Epitope-S1-12, which are of the highest frequency, Epitope-RBD-1 from the RBD, and an epitope of low frequency, Epitope-S2-15, as a control. We then mutated the amino acids of these epitopes to alanine one by one, synthesized the mutated peptides, conjugated them to BSA, and fabricated a peptide microarray (Fig. S8). To validate the epitopes on the peptide microarray, we selected 5 sera, from which at least one of the significant epitopes among Epitope-S1-8, Epitope-S1-12 and Epitope-RBD-1 was identified by AbMap (Fig. S9A). For all three epitopes, when the corresponding samples were tested, significant binding signal loss was observed when any of the critical residues was mutated to alanine (Figs. 1F and S9B). In addition, the results in Figs. 1F and S10 further confirmed the necessity for the enrichment of antibodies from sera by specific proteins for epitope mapping.
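The one-by-one alanine substitution described above can be illustrated with a short sketch that enumerates the single-residue variants of a peptide. The function name is hypothetical, and skipping positions that are already alanine is an assumption of this sketch rather than a detail stated in the text; the example sequence is Epitope-S1-8 (KLFPFQQF) from Fig. 1F.

```python
def alanine_scan(peptide):
    """Return (position, original residue, variant) for each single alanine substitution.

    Positions are 1-based. Residues that are already alanine are skipped
    (an assumption of this sketch).
    """
    variants = []
    for i, residue in enumerate(peptide):
        if residue == "A":
            continue
        variant = peptide[:i] + "A" + peptide[i + 1:]
        variants.append((i + 1, residue, variant))
    return variants


scan = alanine_scan("KLFPFQQF")
for position, residue, variant in scan:
    print(f"{residue}{position}A\t{variant}")
```

For this 8-residue peptide the scan yields 8 variants, from ALFPFQQF (K1A) to KLFPFQQA (F8A); a loss of binding signal for a given variant would mark the substituted position as a critical residue.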

The aim of this study was to advance our understanding of the SARS-CoV-2-specific IgG response with greater precision. Currently, no traditional platforms can provide epitope information at the amino acid level. Taking advantage of AbMap, we constructed the first map of SARS-CoV-2-specific IgG binding epitopes at amino acid resolution. We identified several critical epitope residues on the S protein that are highly correlated with the infectivity of SARS-CoV-2 and the binding of ACE2/neutralizing antibodies to the S protein.

There are a variety of possible applications for the significant epitopes/critical residues identified in this study. For example, researchers can select one or a few epitopes based on neutralization activity to generate site-specific antibodies or vaccines on demand for S protein as well as other proteins. Since the COVID-19 pandemic is still unfolding and the sequenced genomes of SARS-CoV-2 are accumulating, undoubtedly, more functionally important amino acid sites will be identified, and we can match those sites to the identified critical epitope residues, thus linking mutation or SARS-CoV-2 evolution with immune responses.

Taken together, these results facilitate the in-depth understanding of SARS-CoV-2-specific IgG responses and provide a basis for the precise development of diagnostic reagents, therapeutic antibodies and even vaccines.