Main

Over the past 4 years of SARS-CoV-2 evolution, the virus has accumulated mutations throughout its genome. The most rapid evolution has occurred in the viral spike, for instance, the XBB-descended variants that dominated in 2023 have 45–48 spike protein mutations relative to the earliest known strains from Wuhan in late 2019. The reason for this rapid evolution is that spike mutations can strongly affect both the virus’s inherent transmissibility and ability to escape pre-existing immunity1,3. A crucial aspect of interpreting and forecasting SARS-CoV-2 evolution is therefore understanding the impact of current and potential future mutations on the spike.

Here we measure how thousands of mutations to the spike glycoprotein of the XBB.1.5 and BA.2 SARS-CoV-2 strains affect three molecular phenotypes critical to viral evolution: cell entry, ACE2 binding and neutralization by human polyclonal serum (Fig. 1a). To do this, we extend a recently described pseudotyped lentivirus deep mutational scanning system4 that enables safe experimental characterization of mutations throughout the spike5. We demonstrate that mutations outside the RBD can substantially affect spike binding to ACE2. We also define the mutations that escape neutralization by sera from humans who have been multiply vaccinated and also recently infected by XBB or one of its descendant lineages (XBB*), and show there is appreciable heterogeneity in the antigenic impact of mutations across individuals. Finally, we show that the spike phenotypes we measure explain much of the changes in viral growth rate among different SARS-CoV-2 clades that have emerged in humans over the past few years.

Fig. 1: Deep mutational scanning to measure phenotypes of the XBB.1.5 and BA.2 spikes.
figure 1

a, We measure the effects of mutations in spike on cell entry, receptor binding and serum escape using deep mutational scanning (DMS). We then use these measurements to predict the evolutionary success of human SARS-CoV-2 clades. b, Distribution of effects of mutations in XBB.1.5 and BA.2 spikes on entry into 293T-ACE2 cells for all mutations in the deep mutational scanning libraries, stratified by the type of mutation and the domain in spike. Negative values indicate worse cell entry than the unmutated parental spike. Note that the library design favoured introduction of substitutions and deletions that are well tolerated by spike, explaining why many mutations of these types have neutral to only modestly deleterious effects on cell entry. c, Cell entry effects of mutations F456L, P1143L and deletion of V483 relative to the distribution of effects of all substitution and deletion mutations in the libraries. Interactive heat maps with effects of individual mutations across the whole spike on cell entry are at https://dms-vep.github.io/SARS-CoV-2_XBB.1.5_spike_DMS/htmls/293T_high_ACE2_entry_func_effects.html and https://dms-vep.org/SARS-CoV-2_Omicron_BA.2_spike_ACE2_binding/htmls/293T_high_ACE2_entry_func_effects.html. The boxes in b and c span the interquartile range, with the horizontal white line indicating the median. Whiskers in b indicate 0.75 of the interquartile range plotted from the smallest value of the first and highest value of the third quartile. For c, the effect of deleting V483 was not measured in the BA.2 spike. The effects of mutations are the mean of two biological replicate measurements made with different deep mutational scanning libraries.

Design of spike mutant libraries

We created mutant libraries of the spikes from the XBB.1.5 and BA.2 strains. We chose these strains because nearly all human SARS-CoV-2 circulating at present descends from either BA.2 or XBB.1.5’s parent lineage XBB6, and because XBB.1.5 is the sole component of the COVID-19 booster vaccine recommended by the WHO in 2023 (ref. 7). We wanted the libraries to contain all evolutionary accessible amino-acid mutations tolerable for spike function. We therefore included all mutations observed an appreciable number of times among the millions of SARS-CoV-2 sequences in Global Initiative on Sharing All Influenza Data (GISAID). In addition, we included all possible mutations at sites that change often during SARS-CoV-2 evolution or are antigenically important1,8, and deletions at key N-terminal domain (NTD) and RBD sites. These criteria led us to target roughly 7,000 amino-acid mutations in each of the XBB.1.5 and BA.2 libraries (Extended Data Fig. 1a). We created two independent libraries for each spike so we could perform all deep mutational scanning in full biological duplicate. The actual libraries contained between 69,000 and 102,000 barcoded spike variants with an average of two mutations per variant, and covered 99% of the targeted mutations, as well as some extra mutations (Extended Data Fig. 1a). To retrospectively validate that this library design covered most evolutionarily important mutations, we confirmed that our XBB.1.5 libraries provided adequate coverage for high-confidence experimental measurements of nearly all spike mutations now present in XBB, BA.2 and BA.2.86-descended Pango clades—despite the fact that BA.2.86 had not even emerged yet at the time we designed the library (Extended Data Fig. 1b). So although our libraries do not contain all spike mutations, they cover nearly all mutations that are relevant in the near- to mid-term evolution of SARS-CoV-2. Because the RBD is an especially important determinant of ACE2 binding and serum antibody escape9, we also made duplicate XBB.1.5 libraries that saturated all amino-acid mutations in only the RBD (Extended Data Fig. 1a).

Effects of spike mutations on cell entry

We measured the effects of all library mutations on spike-mediated cell entry in 293T-ACE2 cells (Extended Data Fig. 1c,d and interactive heat maps at https://dms-vep.github.io/SARS-CoV-2_XBB.1.5_spike_DMS/htmls/293T_high_ACE2_entry_func_effects.html and https://dms-vep.github.io/SARS-CoV-2_Omicron_BA.2_spike_ACE2_binding/htmls/293T_high_ACE2_entry_func_effects.html). These measurements were highly correlated between the replicate libraries for each spike, indicating the experiments have good repeatability (Extended Data Fig. 1e). The effects of mutations were also well correlated between the XBB.1.5 and BA.2 spikes (Extended Data Fig. 1f), consistent with previous reports that most but not all mutations have similar effects on the spikes of different SARS-CoV-2 variants10,11. As expected, stop codons were highly deleterious for cell entry (Fig. 1b). Because our full-spike library design strategy favours functionally tolerated mutations in spike, most amino-acid mutations in our libraries just slightly impaired cell entry and some but not all single-residue deletions were also well tolerated (Fig. 1b). SARS-CoV-2 has acquired numerous deletions in the NTD’s flexible loops during its evolution12,13, and consistent with that evolution we find that the flexible loops but not the core β sheets of the NTD are relatively tolerant of deletions (Extended Data Fig. 1g). Overall, the effects of mutations on cell entry were fairly well correlated with the effects of amino-acid mutations on viral fitness estimated from millions of natural human SARS-CoV-2 sequences14 (Extended Data Fig. 1h).

No individual mutation in either the XBB.1.5 or BA.2 spikes notably increased pseudovirus cell entry, although some mutations did marginally improve entry (Fig. 1b and interactive heat maps linked in figure legend). One mutation that slightly improves pseudovirus entry in both XBB.1.5 and BA.2 is P1143L (Fig. 1c), which is found in the recently emerged BA.2.86 lineage15. We previously reported that mutations to P1143 also improve cell entry for BA.1 and Delta pseudoviruses4. The deletion mutations in our libraries are usually more deleterious for cell entry than substitutions (Fig. 1b); however, deletion of V483 in the RBD is well tolerated for cell entry, consistent with emergence of this mutation in the BA.2.86 variant15. The F456L mutation, which has emerged repeatedly in XBB clades after being rare in earlier BA.2-derived clades, is well tolerated for cell entry in XBB.1.5 but substantially deleterious in BA.2 (Fig. 1c).

Non-RBD mutations affect ACE2 binding

To measure how mutations in spike affect receptor binding, we leveraged the fact that the soluble ACE2 ectodomain neutralizes spike-mediated infection with a potency proportional to the strength of spike binding to ACE2 (refs. 1,16). To validate this fact, we made pseudoviruses with six different spike variants and quantified their neutralization by monomeric ACE2 (Fig. 2a). Compared to the BA.2 spike, the Wuhan-Hu-1+D614G spike is neutralized less potently by soluble ACE2 consistent with its weaker ACE2 binding17,18, whereas four mutants of BA.2 known to have higher ACE2 binding2 (N417K, N417F, R493Q and Y453F) were all neutralized more potently by soluble ACE2 (Fig. 2a). The quantitative neutralization by soluble ACE2 was highly correlated with previously measured monomeric RBD-ACE2 affinities2,18,19 (Fig. 2b).

Fig. 2: Effects of mutations on full-spike ACE2 binding measured using pseudovirus deep mutational scanning.
figure 2

a, Neutralization of pseudoviruses with the indicated spikes by soluble monomeric ACE2. Viruses with spikes that have stronger binding to ACE2 are neutralized more efficiently by soluble ACE2 (lower half-maximal neutralizing titers(NT50)), whereas viruses with spikes with worse binding are neutralized more weakly. Error bars indicate standard error between two replicates. ACE2 affinity values measured by surface plasmon resonance for BA.2 and Wu-1+D614G are shown in brackets18. b, Correlation between neutralization NT50 by soluble ACE2 versus the RBD affinity for ACE2 as measured by titrations using yeast-displayed RBD2. c, Correlations between the effects of RBD mutations on ACE2 binding measured using the pseudovirus-based approach (this study) and yeast-based RBD display2,20. d, Distribution of effects of individual mutations on full-spike ACE2 binding for all functionally tolerated mutations in our libraries, stratified by RBD versus non-RBD mutations. Note that effects of magnitude greater than two are clamped to the limits of the plots’ x axes. The effects of individual XBB.1.5 spike mutations on ACE2 binding are shown in Extended Data Fig. 3.

Using this approach, we measured how mutations across both the XBB.1.5 and BA.2 spikes affect apparent ACE2 binding (Extended Data Fig. 2 and interactive heat maps of all mutation effects at https://dms-vep.github.io/SARS-CoV-2_XBB.1.5_spike_DMS/htmls/monomeric_ACE2_mut_effect.html and https://dms-vep.github.io/SARS-CoV-2_Omicron_BA.2_spike_ACE2_binding/htmls/monomeric_ACE2_mut_effect.html). Because our assay measures ACE2 neutralization rather than 1:1 ACE2-RBD affinity there are several distinct mechanisms that could affect what we refer to as ACE2 binding: direct changes in 1:1 RBD-ACE2 binding affinity2,20, changes in spike that modulate the conformation of the RBDs (such as up and down movements)21,22 and ACE2-induced shedding of the S1 subunit23,24.

The effects of RBD mutations on ACE2 binding to the spike measured using pseudovirus deep mutational scanning correlate well with previously reported measurements from RBD yeast display for both XBB.1.5 and BA.2 (ref. 20) (Fig. 2c). We also measured ACE2 binding for the XBB.1.5 pseudovirus libraries with saturating RBD mutations using both monomeric and dimeric soluble ACE2. The RBD-only pseudovirus measurements were highly correlated with the full-spike library measurements (Extended Data Fig. 3a), and the measured values were highly similar for monomeric versus dimeric soluble ACE2 (Extended Data Fig. 3b). ACE2 binding and pseudovirus cell entry are distinct properties, with no strong correlation between these properties among tolerated mutations (Extended Data Fig. 3c), probably reflecting the fact that cell entry can be limited by factors unrelated to receptor binding, especially in target cells expressing moderate to high levels of ACE2, such as those used in our experiments.

A striking observation from the deep mutational scanning is that some mutations outside the RBD appreciably affect binding to ACE2 (Fig. 2d and Extended Data Figs. 2 and 3). To validate these findings, we used mass photometry to measure binding of the soluble native ACE2 dimer to the spike ectodomain trimer (Fig. 3a). Mass photometry measures protein-protein interactions in solution by detecting changes in light scattering that are proportional to protein molecular mass25, which allows us to detect binding of one or more ACE2 molecules to the spike (Fig. 3a). We produced prefusion-stabilized HexaPro26 BA.2 and XBB.1.5 spikes, along with mutants that our deep mutational scanning experiments showed to modulate ACE2 binding, and performed mass photometry in the presence of a series of ACE2 concentrations (Fig. 3a,b, Extended Data Fig. 4 and Supplementary Figs. 13). As expected, we observed better and worse ACE2 binding for RBD mutations that have been previously identified to either increase (R493Q) or abrogate (R498V) ACE2 engagement, respectively2 (Fig. 3b, left panels). Furthermore, we detected increased ACE2 binding for all but one of the BA.2 and XBB.1.5 spike trimers harbouring S1 subunit mutations (in NTD, RBD and SD1 domains) that our deep mutational scanning indicated had better binding (Fig. 3b middle panel, Extended Data Fig. 4 and Supplementary Figs. 2 and 3), as well as decreased ACE2 binding for S1 mutations that our deep mutational scanning indicated had worse binding (Fig. 3b). However, mutations to the BA.2 and XBB1.5 S2 subunit found to increase binding to ACE2 in our deep mutational scanning did not lead to increased ACE2 binding detectable by mass photometry (Fig. 3b right panel, Extended Data Fig. 4b,c and Supplementary Figs. 2 and 3). Notably, some of these S2 mutations were previously reported to affect spike fusion27,28,29 suggesting that they may indeed affect S1 shedding and in turn affect ACE2 binding consistent with our deep mutational scanning. However, unlike the spikes in deep mutational scanning experiments, the spikes used in mass photometry experiments are prefusion stabilized by introduction of the HexaPro mutations in the fusion machinery26. These modifications to spike may limit the propagation of long-range allosteric changes induced by S2 subunit mutations, possibly explaining the discrepancy between deep mutational scanning and mass photometry. Concurring with this hypothesis, we previously showed that ACE2-induced allosteric conformational changes that drive fusion peptide exposure were inhibited by the prefusion-stabilizing 2P mutations30.

Fig. 3: Non-RBD mutations affect ACE2 binding.
figure 3

a, ACE2 binding measurements using mass photometry. The histogram on the left shows distribution of spike molecular mass when no (S0xACE2), one (S1xACE2), two (S2xACE2) or three (S3xACE2) ACE2 molecules are bound. We measure how this mass distribution changes as spike is incubated with increasing concentrations of soluble dimeric ACE2. RBD occupancy is the fraction of RBDs bound to ACE2, calculated using Gaussian components for S0xACE, S1xACE2, S2xACE2 and S3xACE2 at each ACE2 concentration. b, RBD occupancy measured using mass photometry for different BA.2 and XBB.1.5 spike variants. Top left panel shows that a BA.2 spike mutation known to increase ACE2 binding (R493Q/blue) has greater RBD occupancy relative to unmutated BA.2 (black) spike, by contrast a mutation known to decrease ACE2 binding (R498V/green) has lower RBD occupancy in both BA.2 (top left panel) and XBB.1.5 (bottom left panel) backgrounds. Panels on the right show RBD occupancy for BA.2 (top right) and XBB.1.5 (bottom right) spike variants with mutations in S1 or S2 subunits measured to increase ACE2 binding in the deep mutational scanning. Values shown in parentheses after the mutation in the legend are the effect on ACE2 binding measured by deep mutational scanning. Error bars in plots a and b indicate standard error between two replicates. c, Non-RBD mutations measured to increase ACE2 binding in deep mutational scanning experiments that have arisen independently as defining mutations in at least four XBB-descended clades.

Non-RBD mutations that enhance ACE2 binding have played an important role in SARS-CoV-2 evolution. The following non-RBD mutations that enhance ACE2 binding occurred in the main pre-Omicron variants of concern: A570D (Alpha), A222V (several moderate-frequency Delta sublineages), T1027I (Gamma) and D950N (Delta) (Extended Data Fig. 2d). In addition, the following non-RBD mutations that occurred in Omicron variants, all of which represent reversions to pre-Omicron residue identities, increase ACE2 binding: K969N, K764N and Y655H. Consistent with previous studies showing that the original D614G mutation increased the proportion of RBDs in the up conformation21, we find that G614D decreases full-spike ACE2 binding (Fig. 3b and Extended Data Fig. 2d).

To systematically examine the recent evolutionary role of non-RBD-ACE2 binding-enhancing mutations, we tabulated non-RBD mutations that enhance binding and are new mutations in at least four XBB-descended Pango clades (Fig. 3c). Some of these mutations may explain why certain clades had a growth advantage. For example, the NTD mutation Q52H provided the EG.5.1 lineage with a clear growth advantage over EG.5 (ref. 6), despite not measurably affecting serum neutralization31. Our deep mutational scanning provides an explanation for the success of EG.5.1 by showing that Q52H enhances ACE2 binding. Similarly, T572I is now appearing convergently in JN.1-descended lineages6, and our results show that mutation enhances ACE2 binding.

Heterogeneous sera escape

We next mapped how mutations in spike affect neutralization by the polyclonal antibodies in sera from ten vaccinated individuals who either had a confirmed XBB* infection or whose last infection was during a period when XBB lineages were the dominant circulating variants (Supplementary Table 1). We performed these measurements with the full-spike XBB.1.5 libraries using 293T cells expressing moderate levels of ACE2 that better capture the activities of non-RBD antibodies32,33, although the key sites of escape were mostly similar if we used 293T cells expressing high levels of ACE2 or the RBD-only libraries (Extended Data Fig. 5). The sites of greatest serum escape were mainly in the RBD (Fig. 4a–c and interactive plot at https://dms-vep.github.io/SARS-CoV-2_XBB.1.5_spike_DMS/htmls/summary_overlaid.html). These sites include 357, 371, 420, the 440–447 loop, 455–456 and 473, as well as a few sites in the NTD, such as positions 200 and 234. At some sites, the escape mutations are strongly deleterious to ACE2 binding (Fig. 4c). For instance, mutations at Y473 cause strong neutralization escape but greatly reduce ACE2 binding, probably explaining their low frequency among circulating SARS-CoV-2 variants. In addition, only some of the antibody escape mutations mapped in our experiments are accessible by single-nucleotide mutations to XBB.1.5 (Fig. 4c). Several escape mutations that are single-nucleotide accessible and do not strongly impair ACE2 binding are found in recent variants, including mutations at site 456 in EG.5.1 and many other XBB variants, mutations at 455 in HK.3.1 and JN.1, mutations at 420 in GL.1 and mutations at 200 in XBB.1.22 (ref. 6).

Fig. 4: Serum antibody escape mutations for individuals with previous XBB* infections.
figure 4

a, Escape at each site in the XBB.1.5 spike averaged across ten sera collected from individuals with previous XBB* infections. The points indicate the total positive escape caused by all mutations at each site. See https://dms-vep.github.io/SARS-CoV-2_XBB.1.5_spike_DMS/htmls/summary_overlaid.html for an interactive version of this plot with extra mutation-level data. b, Enlarged view of the escape at each site in RBD with each line representing one of the ten sera. Key sites are labelled with red circles indicating escape for each of the ten sera. Red data points indicate escape for each individual at select RBD positions. c, Logo plots showing the 16 sites of greatest total escape after averaging across the sera. Letter heights indicate escape caused by mutation to that amino acid, and letters are coloured light yellow to dark brown depending on the impact of that mutation on ACE2 binding (see colour key). The top plot shows all amino-acid mutations measured, and the bottom plot shows only amino acids accessible by a single-nucleotide mutation to the XBB.1.5 spike. d, The left shows a correlation between DMS escape scores and pseudovirus neutralization assay IC90 values for three sera. The right is a logo plot showing escape for all sites with mutations validated in the neutralization assays, with the specific validated mutations in red.

Whereas the same mutations often escape many sera, there is also heterogeneity such that the sera-average is not fully reflective of the effects of mutations on any individual serum (Fig. 4b,d and Extended Data Fig. 6). For example, whereas mutations to site Y473 strongly escape neutralization by most sera, two sera we analysed (493C and 501C) are largely unaffected by mutations at that site. Other key sites of escape, including 420 and 456, show similar heterogeneity across sera. To validate that escape mutations can have very different effects across sera, we performed standard pseudovirus neutralization assays5 against a panel of point mutants to the XBB.1.5 spike (Fig. 4d). The changes in neutralization in these validation assays were highly correlated with the escape measured by deep mutational scanning, and confirmed the serum-to-serum heterogeneity. For example, Y473S strongly reduces neutralization by sera 287C and 500C, but actually slightly increases neutralization by serum 501C. Similarly, F456L substantially reduces neutralization by only some sera (Fig. 4d).

The deep mutational scanning identifies mutations that increase, as well as escape, neutralization (Extended Data Fig. 7). Sensitizing mutations often occur at sites that are mutated in XBB.1.5 relative to earlier variants, such as sites 373, 405, 417, 460, 486 and 505 (Extended Data Fig. 7). Presumably in many cases, reverting mutations at these sites restores neutralization by antibodies elicited by infection or vaccination with earlier viral strains. To confirm that the sensitizing mutations identified in the deep mutational scanning actually increased neutralization, we validated the sensitizing effects of R403K and N405K in standard pseudovirus neutralization assay (Fig. 4d). In addition, some sensitizing mutations seem to act by placing the RBD in a more up conformation as discussed in the next subsection.

RBD conformation affects serum escape

Most sites of strong escape described in the previous section are proximal to the ACE2-binding motif in the RBD that is the target of many potent neutralizing antibodies34,35. However, the deep mutational scanning also reveals individual mutations at non-RBD or ACE2-distal RBD sites that strongly escape neutralization. Some of these sites, such as 42, 200 and 234 in the NTD, 572 in SD1 and 852 in S2 have mutations that cause as much escape as ACE2-proximal RBD mutations, decreasing serum neutralization by as much as sixfold (Fig. 4a,d). Whereas most mutations at any given site have similar effects on escape (that is, either promoting or sensitizing) at many ACE2-proximal RBD sites, different mutations at the same non-RBD or ACE2-distal RBD site can have opposing effects on neutralization (Fig. 5a–c). Furthermore, there is a strong correlation between mutational effects on neutralization and ACE2 binding at these sites: mutations that reduce neutralization also reduce ACE2 binding, and mutations that increase neutralization also increase ACE2 binding (Fig. 5a,b). No such consistent correlation exists between neutralization and ACE2 binding for RBD escape sites in close proximity of ACE2 binding interface (Fig. 5c).

Fig. 5: Sera escape and ACE2 binding are inversely correlated for non-RBD and ACE2-distal RBD sites.
figure 5

a, The left shows a correlation between ACE2 binding and sera escape for amino-acid mutations at non-RBD sites with the highest mutation-level sera escape (each point is a distinct amino-acid mutation). Average escape for each mutation across all sera is shown. The right shows a logo plot for the same sites, with letter heights proportional to escape caused by that mutation (negative heights mean more neutralization), and letter colours indicating effect on ACE2 binding (green means better binding). b, A similar plot for RBD sites that are distal (at least 15 Å) from ACE2. c, A similar plot for RBD sites proximal (within 15 Å) to ACE2. Only sites with at least seven different mutations measured are included in the logo plots. d, Top-down view of XBB spike (Protein Data Bank ID 8IOT) with the non-RBD and ACE2-distal sites shown in a and b highlighted as spheres. The RBD is pink, the NTD is blue and sites in SD1 are green.

We propose that non-RBD and ACE2-distal RBD mutations that increase both neutralization and ACE2 binding do so by shifting the RBD to a more upright position, whereas those that decrease neutralization and ACE2 binding do so by shifting the RBD to a more downwards position36,37,38. Previous work has shown that mutations that put the RBD in a down position reduce neutralization by antibodies that target RBD residues only accessible in the up position, whereas antibodies that can bind both the up and down RBD are unaffected by such mutations15,39. Consistent with this previous work, we confirmed that the mutations at ACE2-distal sites identified in our full-spike deep mutational scanning as probably affecting RBD conformation only affect neutralization by monoclonal antibodies that bind only to the up conformation of the RBD (Extended Data Fig. 8).

Our results show that mutations that affect neutralization and ACE2 binding by modulating RBD conformation are common in certain regions of spike: a result that makes structural sense, because most of these mutations are located near the interfaces between the RBD and other spike domains (Fig. 5d and Extended Data Fig. 9). Furthermore, many of these strong escape sites, including N234, F371, P373, F375, A376, S408, A570 and T572, have been previously shown by structural methods to affect RBD conformation22,36,37,38,40,41,42,43 or the conformation of key RBD epitopes19,44.

Spike phenotypes and clade growth

SARS-CoV-2 evolution in humans is characterized by the repeated emergence of new viral clades, which often possess extra amino-acid mutations in spike relative to their predecessors. To test whether our deep mutational scanning measurements could help explain which clades are evolutionarily successful, we estimated the relative growth rates in humans of sufficiently-sampled SARS-CoV-2 clades using multinomial logistic regression45 (Extended Data Fig. 10a–c). As expected, more recent clades generally had higher growth rates, consistent with evolution selecting for viral clades that are more fit (Extended Data Fig. 10a), presumably in part due to further mutations in spike46.

We sought to determine whether the growth of clades could be predicted from how their mutations affect the spike phenotypes measured by deep mutational scanning. Note that almost any mutation-based measurement (such as just counting mutations) trivially correlates with clade growth because newer clades typically have both better growth rates and more spike mutations (Extended Data Fig. 10a,d). For instance, clade growth rates strongly correlate with the number of spike mutations relative to the early Wuhan-Hu-1 sequence (Extended Data Fig. 10e). But this correlation is not informative because the question of evolutionary interest is not whether SARS-CoV-2’s spike will acquire more mutations over time (we already know this will happen), but rather which of the various mutant viruses present at any given time will spread. Furthermore, phylogenetic correlations can exaggerate associations between mutations and clade growth47. Therefore, we focused on predicting changes in clade growth for each pair of parent–descendant clades separated by at least one spike mutation (Extended Data Fig. 10b). This approach eliminates the confounding effects of phylogenetic relatedness and the accumulation of mutations over time (Extended Data Fig. 10e,f), and better answers the question of how specific mutations affect clade growth.

Changes in growth between parent–descendant clade pairs were positively correlated with all three experimentally measured spike phenotypes both among just XBB-descended clades (Fig. 6a and Extended Data Fig. 11) and among all BA.2, BA.5 and XBB-descended clades (Extended Data Fig. 12). The correlations were statistically significant for sera escape and cell entry as assessed by randomization of the measurements among mutations. However, these univariate correlations do not fully capture the information in the experiments, as the effects of mutations on the spike phenotypes are themselves correlated (for example, mutations that cause sera escape sometimes decrease ACE2 binding). We therefore performed ordinary least-squares multiple linear regression of changes in clade growth versus all three phenotypes. The predictions of this regression correlated with changes in clade growth better than any individual phenotype, and were highly statistically significant as assessed by randomization of the measurements among mutations (Fig. 6b and Extended Data Fig. 12). Sera escape uniquely explained the largest fraction of the variance in changes in clade growth, but ACE2 binding and cell entry effects also explained some variance. By contrast, neither RBD yeast-display deep mutational scanning of antibody escape8,48 and ACE2 affinity20 nor the EVEscape deep learning model49 were consistently better than randomized data at predicting changes in clade growth at a significance level of P = 0.05 (Extended Data Figs. 11 and 12).

Fig. 6: Spike phenotypes measured by deep mutational scanning partially predict the evolutionary success of SARS-CoV-2 clades.
figure 6

a, Correlation between the changes in growth rate for parent–descendant clade pairs versus the change in each spike phenotype measured in the XBB.1.5 full-spike deep mutational scanning (several mutations are assumed to have additive effects). The text above each plot shows the Pearson correlation (r) and a P value computed by comparing the actual correlation to that for 100 randomizations of the experimental data among mutations. b, Ordinary least-squares multiple linear regression of changes in growth rate versus all three measured spike phenotypes. The small text indicates the unique variance explained by each variable, as well as the coefficients (coef.) in the regression. See https://dms-vep.github.io/SARS-CoV-2_XBB.1.5_spike_DMS/htmls/current_dms_clade_pair_growth.html and https://dms-vep.github.io/SARS-CoV-2_XBB.1.5_spike_DMS/htmls/current_dms_ols_clade_pair_growth.html for interactive versions of both panels in which points can be hovered over for details on clades and their mutations. P values are for one-sided tests of the hypothesis that the tested predictor outperforms randomizations, and are reported individually for each comparison. See Extended Data Fig. 12 for a similar analysis that also includes BA.2 and BA.5 descended clades.

We also sought to test the ability of full-spike deep mutational scanning to explain the high fitness of BA.2.86 and its descendant clades (for example, JN.1), which were identified after the completion of the experiments described in this study50. Because there are not yet sufficient distinct BA.2.86-descended clades to make meaningful comparisons with clade growth, instead we performed a different test inspired by Thadani et al.49: we generated sequences with random sets of naturally observed spike amino-acid mutations that had the same number of differences relative to BA.2 as did BA.2.86, or relative to BA.2.86 as all designated BA.2.86-descended clades. Our XBB.1.5-based full-spike deep mutational scanning could distinguish the true BA.2.86 and BA.2.86-descended clades from sequences with the same number of mutations with high statistical significance, and did so better than RBD yeast-display deep mutational scanning or EVEscape (Supplementary Fig. 4).

Discussion

More than 16 million human SARS-CoV-2 genomes have been sequenced to date, enabling rapid identification of variants with new mutations at the sequence level. However, interpreting the consequences of these mutations on viral spread in a partially immune population remains a major challenge. Here we show how pseudovirus-based deep mutational scanning can characterize the effects of mutations throughout spike on three distinct phenotypes critical to viral fitness: cell entry, ACE2 binding and serum antibody escape.

The full-spike pseudovirus data we generate enables several key insights that were not apparent from previous yeast-display RBD deep mutational scanning approaches1,2,48. Most obviously, the data encompass all spike domains, not just the RBD. These data show that non-RBD mutations can affect ACE2 binding, probably by altering the conformation of the RBD in the context of the spike trimer (for example, in up versus down position). Such mutations are highly relevant for SARS-CoV-2 evolution—for instance, enhancement of ACE2 binding by non-RBD mutations appears to explain why EG.5.1 spread so rapidly after it acquired Q52H, why A222V subvariants of Delta spread widely, why A570D was selected in Alpha, and why T572I is now arising so frequently in BA.2.86-descended variants.

Pseudovirus deep mutational scanning also enables us to directly measure how mutations affect neutralization by polyclonal sera. By contrast, previous RBD-display deep mutational scanning could only measure how mutations affect antibody binding51, and so to estimate mutational effects on serum neutralization escape it was necessary to characterize hundreds of individual antibodies assumed to represent the polyclonal neutralizing repertoire of humans1,8. The ability to directly map how mutations affect serum neutralization leads to two new insights. First, it reveals the heterogeneity in how mutations affect neutralization by sera from different individuals. For instance, we characterize sera from XBB* infected individuals that are both strongly affected and almost completely unaffected by mutations at key sites such as 456 or 473. The sera examined in this study came from individuals with varied immunization and infection histories, which probably contributes to observed escape heterogeneity, although individual-to-individual variation in humoral response may also play a role. This person-to-person heterogeneity in the antigenic effects of spike mutations will increase as individuals accumulate increasingly distinct exposure histories, and could come to play an important role in shaping SARS-CoV-2 evolution and disease susceptibility as it does for influenza virus52,53,54.

The second major insight from direct mapping of serum escape is that mutations outside the RBD can have marked effects on neutralization. For instance, NTD mutations such as Y42F and N234T decrease neutralization by some sera by nearly sixfold. The existence of such strong non-RBD escape mutations may seem surprising given that most neutralizing activity in human sera come from antibodies that bind the RBD9,32,51,55. However, our data indicate that the strongest non-RBD serum escape mutations act primarily by shifting the RBD to the down conformation, thereby indirectly escaping class 1 and 4 antibodies that bind to RBD surfaces only accessible in the up conformation15,39. Of course, such mutations come at a cost to ACE2 binding, because the RBD cannot bind receptor in the down conformation56,57. Nonetheless, the ubiquity of such mutations suggests that this mechanism of escape merits monitoring and is in line with previous observations made with endemic human coronaviruses58,59,60. For instance, the RBD of the CoV-229E spike has never been observed in the up conformation61,62 despite the fact that this spike somehow manages to bind its receptor during infection. Whether SARS-CoV-2’s spike could eventually evolve to also far more strongly favour a down RBD conformation is unknown.

The most important indication of the relevance of our work is that our measurements of spike phenotypes partially explain the evolutionary success of different SARS-CoV-2 clades in humans. A longstanding goal of evolutionary biology is to understand the molecular phenotypes that contribute to fitness63, and then measure them with sufficient accuracy to predict which mutants will actually spread in the real world. We have taken a real step towards this goal, because the spike phenotypes measured by our deep mutational scanning explain a substantial amount of the changes in growth rates of recent SARS-CoV-2 clades. Of course, pseudovirus spike deep mutational scanning will never perfectly predict SARS-CoV-2 evolution: evolution itself is partially stochastic64, pseudovirus experiments do not capture all phenotypes of spike relevant to transmission or multicycle replication and our experiments completely ignore mutations to non-spike genes that contribute to fitness14,65. Furthermore, it remains technically challenging for deep mutational scanning to account for epistatic interactions among mutations66, and we need modelling approaches that better account for how person-to-person heterogeneity in immune-escape mutations shape viral evolution52. However, the fact that our deep mutational scanning has substantial power to explain clade growth shows that we have reached the point at which experiments can enable useful predictions about SARS-CoV-2 evolution. An important area of future work will be integrating these highly informative experimental measurements into even more sophisticated models of viral evolution49,67,68.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.