Introduction

Since its initial emergence in Wuhan in December 2019, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused more than 641 million cases of COVID-19 and more than 6.6 million deaths as of December 2022 (ref. 1). SARS-CoV-2 (along with SARS-CoV, the cause of SARS) is a member of the species Severe acute respiratory syndrome-related coronavirus, the sole member of a subgenus of viruses, Sarbecovirus, primarily found in horseshoe bats2. Like other coronaviruses, SARS-CoV-2 possesses a large RNA genome, comprising ~30,000 nucleotides, whose replication is mediated by RNA-dependent RNA polymerase (RdRP) and an associated proofreading enzyme exoribonuclease (ExoN). This, combined with the discontinuous nature of coronavirus transcription, has resulted in coronaviruses with high rates of recombination, insertions and deletions, and point mutations (although the rates are lower than for other RNA viruses due to the proofreading), as previously reviewed3. The success of novel genetic variants generated, although prone to stochastic sampling processes, will be very dependent on natural selection; in particular, positive selection associated with mutations that are beneficial to the virus in which they occur.

SARS-CoV-2 has proven to be a highly capable human pathogen, but also a generalist in terms of host tropism, establishing infections in a variety of mammalian species, including infections in farmed mink4, a stable reservoir in white-tailed deer5,6 and incidental infections of many other animal species7. Once SARS-CoV-2 was in humans, the first months of SARS-CoV-2 evolution were characterized by limited adaptation and phenotypic change relative to its later evolution8. The first notable change, a single spike substitution (D614G), arose early in the pandemic and conferred an ~20% growth advantage relative to preceding variants9. A lineage defined by D614G (PANGO lineage10 B.1) quickly became dominant in Europe, giving an early indication of the potential for SARS-CoV-2 to increase its transmissibility in humans. As we described previously3,11, from October 2020 onwards, novel, more heavily mutated SARS-CoV-2 variants began to emerge. These variants were distinguished by higher numbers of non-synonymous mutations principally in the spike protein — particularly the case for Omicron — and distinct phenotypic properties, including altered transmissibility and antigenicity. To date, five SARS-CoV-2 variants have been declared variants of concern (VOCs) by the World Health Organization (and national public health agencies) on the basis that they exhibit substantially altered transmissibility or immune escape, warranting close monitoring. Each VOC showed transmission advantages over preceding variants and became dominant, either regionally in the cases of Alpha (PANGO lineage10 B.1.1.7), Beta (B.1.351) and Gamma (P.1) — in Europe, southern Africa and South America, respectively — or globally, in the cases of Delta (B.1.617.2/AY sublineages) and the many Omicron sublineages (B.1.1.529/BA sublineages, such as BA.1, BA.2 and BA.5).
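
To give a sense of scale for a growth advantage such as the ~20% estimated for D614G, the following minimal sketch shows how a variant's share of infections could rise under a deliberately simple model. The model is an illustrative assumption and is not taken from the studies cited above: it considers only two competing lineages, treats the advantage as a constant multiplicative factor applied to the variant's odds in each generation of transmission, and uses a hypothetical starting frequency of 1%. The function names are likewise hypothetical.

```python
def variant_frequency(p0: float, advantage: float, generations: int) -> float:
    """Frequency of a variant after a number of transmission generations.

    Assumes a two-lineage model in which the variant's odds, p / (1 - p),
    are multiplied by (1 + advantage) in every generation.
    """
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must be strictly between 0 and 1")
    odds = p0 / (1.0 - p0) * (1.0 + advantage) ** generations
    return odds / (1.0 + odds)


def generations_to_reach(p0: float, advantage: float, target: float) -> int:
    """Smallest number of generations at which the frequency reaches target."""
    if advantage <= 0.0:
        raise ValueError("advantage must be positive for the variant to spread")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    generations = 0
    while variant_frequency(p0, advantage, generations) < target:
        generations += 1
    return generations


if __name__ == "__main__":
    # Hypothetical scenario: variant starts at 1% with a 20% advantage.
    for g in (0, 10, 20, 30, 40):
        print(g, round(variant_frequency(0.01, 0.20, g), 3))
    print("generations to majority:", generations_to_reach(0.01, 0.20, 0.5))
```

Under these assumptions, a variant starting at 1% frequency with a 20% per-generation advantage becomes the majority lineage after 26 generations. Real dynamics also depend on importations, population structure and changing immunity, which this sketch ignores.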

In contrast to the expectation that viruses undergo rapid host adaptation following spillover12,13, selection analysis indicates that SARS-CoV-2 lacked notable levels of observable adaptation early in the pandemic14. It subsequently became clear that SARS-CoV-2 is a generalist virus capable of using a variety of mammalian angiotensin-converting enzyme 2 (ACE2) membrane proteins for cell entry15, enabling infection of a wide range of mammals14,16. The sarbecoviruses are transmitted frequently between different horseshoe bat species17 and non-bat species with ACE2-binding capability (the inferred ancestral trait in sarbecoviruses18), which happens to include humans. The SARS-CoV-2 spike protein contains important properties that are responsible and required for efficient human-to-human transmission, in particular: human ACE2 binding and the polybasic furin cleavage site (FCS) at the S1–S2 junction19,20. At present the SARS-CoV-2 S1–S2 FCS is unique among sarbecoviruses, although analogous sequences are observed in other betacoronaviruses.

The entry of SARS-CoV-2 into airway cells requires furin-mediated cleavage at the FCS, enabling membrane fusion. The FCS is therefore a key determinant of the high transmission rates of SARS-CoV-2, contributing to its efficient spread in humans19,21. Further optimization of the wild-type FCS during the course of the pandemic has resulted in enhanced furin cleavage of the Alpha and Delta spike proteins22,23,24,25. In concert with other mutations, notably those enhancing ACE2 binding26,27, mutations optimizing furin cleavage are thought to have contributed to enhanced transmissibility, and thus fitness, of the Alpha and Delta VOCs, with 65% and 55% higher relative transmissibility compared with the variants they replaced, respectively28,29,30. In contrast to Alpha and Delta, the evolutionary success of the Omicron variant is not linked to optimization of furin cleavage. Rather, Omicron is characterized by an altered entry phenotype31,32, coupled with significant immune escape31,33,34, enabling efficient infection of vaccinated or previously infected individuals. Although transmissibility in a naive population is largely determined by intrinsic viral properties, the increasingly complex immune landscape in which SARS-CoV-2 now circulates means that antibody escape (as opposed to transmissibility being enhanced by virus biology alone, a trait that might be difficult to optimize further than that achieved by Omicron) is becoming the prime driver of variant success. Before Omicron’s emergence (Box 1), each of the dominant variants had evolved from pre-VOC progenitors, rather than evolving from one another. By contrast, successive waves are now being caused by Omicron sublineages (for example, BA.5, one of its sublineages BQ.1 and BA.2.75, a sublineage of BA.2). Of note, it is possible that an undetected variant, potentially a recombinant (Box 2), could emerge with high transmissibility linked to intrinsic biology and novel antigenic properties.

Whether an entirely novel variant emerges, or future viruses evolve from the Omicron sublineages with novel antigenic changes (increasingly probable as previous VOCs are no longer circulating), it is clear that novel SARS-CoV-2 variants possessing unique combinations of mutations will continue to emerge and those with a fitness advantage will dominate relative to previous variants. To date, successful variants have also exhibited variation in clinically relevant traits, including disease severity, immune evasion and sensitivity to therapeutics (particularly monoclonal antibodies). It is therefore of public health and clinical importance to understand the drivers of SARS-CoV-2 fitness. Variant fitness — a virus’s reproductive success — depends on a variety of factors that determine its ability to infect, replicate within and spread between hosts. In this Review, we provide an overview of observed mutations that are described as impacting SARS-CoV-2 infectivity and transmissibility, and discuss the viral capacity to escape T cell-mediated, innate or humoral immunity11. See our previous review11 for more details on spike-mediated humoral immunity.

Antigenic escape and SARS-CoV-2 variants

An early concern regarding SARS-CoV-2 evolution was the potential emergence of antigenically distinct variants with the ability to evade vaccine- or infection-acquired immunity, as exemplified by the N439K spike substitution35. Before their update in late 2022, all widely used COVID-19 vaccines were based on the spike antigen of early variants, most using the reference sequence Wuhan-Hu-1, sampled from a lineage B infection at the Huanan Seafood Market, often with mutations that stabilize the spike protein in a prefusion conformation36. Although limited antigenic change was reported for Alpha25,37,38, moderate escape from vaccine-derived antibodies and convalescent sera was observed for Beta, Gamma and Delta in laboratory experiments23,37,38,39. Nonetheless, epidemiological studies provided evidence that vaccine effectiveness against Delta and Beta was largely preserved40,41,42. Thus, despite a design based on a very early spike sequence, first-generation SARS-CoV-2 vaccines conferred remarkable protection against severe disease and allowed much of the world to return to a semblance of normality.

The Omicron ‘complex’, comprising the distinct sublineages BA.1, BA.2, BA.3, BA.4 and BA.5 (Box 1), is capable of infecting the vaccinated and previously infected, bringing the challenge of vaccine-sequence selection and universal vaccines to the fore of SARS-CoV-2 control strategy discussions. With more than 15 spike receptor-binding domain (RBD) mutations and a number of antigenic deletions and substitutions in the amino-terminal domain (NTD)43,44, BA.1, BA.2, BA.4 and BA.5 are very poorly neutralized by antibodies elicited by first-generation vaccines or by pre-Omicron infection (Fig. 1). In addition, escape from the vast majority of current therapeutic monoclonal antibodies has been demonstrated; at present, only bebtelovimab, a monoclonal antibody targeting the RBD of the spike protein, has been reported to retain its efficacy against all SARS-CoV-2 variants31,33,34,38,45,46,47,48,49,50. This large antigenic ‘shift’ has led some to propose that Omicron lineages should be considered a separate strain or serotype compared with pre-Omicron lineages51,52. Importantly, the magnitude of this antigenic change is reflected in data on real-world vaccine effectiveness against infections and symptomatic disease41,53,54,55,56,57. Booster doses are required to maintain any vaccine effectiveness against Omicron, which reduces as antibody titres wane41,55. Indeed, vaccine effectiveness against severe disease for Omicron remained high 4 months after a booster dose and then decreased quickly, although the reduction was less rapid than that seen after primary vaccination58. Due to the short duration of protective immunity against Omicron infection with current vaccines, many vaccine manufacturers and academics are focusing on second-generation vaccines, such as monovalent or bivalent Omicron-specific boosters59 (which are being deployed at present), nasal vaccine delivery to stimulate greater mucosal immunity60 or universal vaccine approaches61.

In common with seasonal human coronaviruses, the extent to which long-term acquired immunity can prevent reinfection by SARS-CoV-2 is limited due to a combination of antibody waning and virus antigenic drift, the incremental acquisition of mutations that permit immune evasion62,63. In addition to step changes in antigenicity, virus evolution during persistent infections has enabled SARS-CoV-2 to accumulate multiple mutations in the context of a single or a few long-term infections, contributing to antigenic shift events when these variants go on to infect others (Box 1).

Fig. 1: Properties of amino acid substitutions or deletions in selected SARS-CoV-2 variants of concern.

Black boxes denote the presence of each mutation in the variant of concern. Epitope residues are coloured to indicate the amino-terminal domain (NTD) supersite187 or the receptor-binding domain (RBD) class188. For RBD residues, the results of deep mutational scanning (DMS) studies show the escape fraction (that is, a quantitative measure of the extent to which a mutation reduced polyclonal antibody binding) for each mutant averaged across plasma (‘plasma avg’) and for the most sensitive plasma (‘plasma max’)189, illustrating consistency or variation in the effect of a mutation depending on differences in the antibody repertoire of individuals. Mutations in the furin cleavage site are highlighted. Orange shading indicates the distance to angiotensin-converting enzyme 2 (ACE2)-contacting residues that form the receptor-binding site (RBS). Note that the RBS is defined as residues with an atom <4 Å from an ACE2 atom in the structure of the RBD bound to ACE2 (RCSB Protein Data Bank ID 6M0J190). Finally, ACE2-binding scores representing the binding constant (Δlog10 KD) relative to the wild-type reference amino acid from DMS experiments are shown in shades of red or blue26.
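
The distance-based definition of the RBS used in the figure can be illustrated with a short sketch. It flags a residue as part of the binding site when any of its atoms lies less than 4 Å from any atom of the partner molecule. The `Atom` type, the function name, the chain labels and all coordinates below are hypothetical and made up for illustration; they are not taken from structure 6M0J, and parsing of a real structure file is deliberately left out.

```python
import math
from typing import Iterable, List, NamedTuple


class Atom(NamedTuple):
    chain: str     # hypothetical chain label, e.g. "R" for RBD, "A" for ACE2
    residue: int   # residue number the atom belongs to
    x: float       # coordinates in angstroms
    y: float
    z: float


def contact_residues(
    rbd_atoms: Iterable[Atom],
    ace2_atoms: Iterable[Atom],
    cutoff: float = 4.0,
) -> List[int]:
    """Return RBD residue numbers with any atom closer than cutoff to ACE2."""
    partner = [(a.x, a.y, a.z) for a in ace2_atoms]
    contacts = set()
    for atom in rbd_atoms:
        if atom.residue in contacts:
            continue  # residue already flagged by another of its atoms
        position = (atom.x, atom.y, atom.z)
        # Strict inequality, matching the "<4 angstrom" definition.
        if any(math.dist(position, other) < cutoff for other in partner):
            contacts.add(atom.residue)
    return sorted(contacts)


# Made-up coordinates purely to exercise the function.
toy_rbd = [
    Atom("R", 1, 0.0, 0.0, 0.0),    # 3.0 angstroms from the first ACE2 atom
    Atom("R", 2, 10.0, 0.0, 0.0),   # 3.9 angstroms from the second ACE2 atom
    Atom("R", 3, 7.0, 0.0, 0.0),    # exactly 4.0 angstroms away: excluded
    Atom("R", 4, 30.0, 0.0, 0.0),   # far from both ACE2 atoms
]
toy_ace2 = [
    Atom("A", 100, 3.0, 0.0, 0.0),
    Atom("A", 101, 10.0, 3.9, 0.0),
]

if __name__ == "__main__":
    print(contact_residues(toy_rbd, toy_ace2))
```

The all-against-all comparison is adequate for a single interface; for larger structures a spatial index would be the usual choice.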

The FCS in SARS-CoV-2 variant emergence

Compared with other known sarbecoviruses, a unique feature of SARS-CoV-2 is the presence of an FCS within its spike protein that can be cleaved by furin. FCSs are nonetheless present in many other betacoronaviruses of the subgenera Embecovirus and Merbecovirus, such as human coronavirus OC43, human coronavirus HKU1 and Middle East respiratory syndrome coronavirus (Fig. 2a). The SARS-CoV-2 FCS has been shown to be vital for optimal virus replication in human airway cells64, transmissibility19,21 and pathogenicity65. Furin, a host protease, is most abundant in the Golgi apparatus, allowing cleavage during trafficking of the virus to the cell surface66. It is now clear, however, that the FCS of early SARS-CoV-2 variants was suboptimal and not efficiently cleaved by furin19,24,65. Interestingly, one of the roles described for the early substitution mutation, spike D614G, was modestly enhanced spike cleavage (reviewed in detail previously3). A number of subsequent SARS-CoV-2 variants harbour mutations adjacent to the FCS, which increase the number of basic amino acid residues — the known recognition site for furin; for example, Alpha, Mu and Omicron contain the FCS mutation P681H44,67,68, which is predicted to increase cleavage activity. Moreover, a different mutation at the same position, P681R, enhances the replication and pathogenicity of the Delta VOC23,30,69 (Fig. 2b). Of note, Omicron contains P681H, as well as the further mutation N679K44, which together result in an optimized FCS19,23,24,25,32,70. Importantly, however, furin site optimization alone does not enhance the transmissibility or replication of SARS-CoV-2 and might be detrimental to efficient virus transmission27, indicating that additional mutations seen in these variants are required for optimized replication and transmissibility.
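
As a simple illustration of how the substitutions named above change the basic-residue content next to the cleavage site, the sketch below applies them to a short stretch of spike sequence and counts lysine, arginine and histidine residues. The reference segment (residues 675–685, written here as QTQTNSPRRAR in Wuhan-Hu-1 numbering) is an assumption supplied for illustration, as are the function names and the choice to count histidine as basic. The count is only a descriptive tally and should not be read as a predictor of cleavage efficiency.

```python
from typing import Dict, List

BASIC_RESIDUES = set("KRH")  # lysine, arginine, histidine (assumed definition)

# Assumed reference spike residues 675-685 (Wuhan-Hu-1 numbering), ending at
# the arginine immediately before the S1-S2 cleavage point.
REFERENCE_SEGMENT = "QTQTNSPRRAR"
SEGMENT_START = 675

# Substitutions near the FCS as named in the text.
VARIANT_SUBSTITUTIONS: Dict[str, List[str]] = {
    "wild type": [],
    "Alpha": ["P681H"],
    "Delta": ["P681R"],
    "Omicron": ["N679K", "P681H"],
}


def apply_substitutions(segment: str, start: int, substitutions: List[str]) -> str:
    """Apply substitutions written like 'P681H' to a numbered segment."""
    residues = list(segment)
    for substitution in substitutions:
        ref, position, alt = substitution[0], int(substitution[1:-1]), substitution[-1]
        index = position - start
        if not 0 <= index < len(residues):
            raise ValueError(f"{substitution} lies outside the segment")
        if residues[index] != ref:
            raise ValueError(f"{substitution}: reference residue is {residues[index]}")
        residues[index] = alt
    return "".join(residues)


def count_basic(segment: str) -> int:
    """Number of basic residues (K, R, H) in the segment."""
    return sum(residue in BASIC_RESIDUES for residue in segment)


if __name__ == "__main__":
    for name, substitutions in VARIANT_SUBSTITUTIONS.items():
        mutated = apply_substitutions(REFERENCE_SEGMENT, SEGMENT_START, substitutions)
        print(f"{name:10s} {mutated} basic residues: {count_basic(mutated)}")
```

With the assumed reference segment, the tally rises from three basic residues in the wild type to four in Alpha and Delta and five in Omicron, consistent with the description of these mutations as adding basic residues near the FCS.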

Fig. 2: Furin cleavage site and variant success.

a, Comparison of S1–S2 cleavage site sequences in wild-type (WT) severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and the Alpha, Delta and Omicron variants of concern alongside those of other coronaviruses: SARS-CoV, Middle East respiratory syndrome coronavirus (MERS-CoV), human coronavirus OC43 (HCoV-OC43), HCoV-HKU1, HCoV-NL63 and HCoV-229E. A slash indicates the putative furin/serine protease cleavage site. Amino acids contributing to monobasic or polybasic cleavage sites are shaded. b, Diagrammatic representation of estimated relative optimization of the S1–S2 furin cleavage site (FCS) for the variants of concern. The mutations affecting FCS function are indicated. Note that non-concordant results have been observed for Omicron, as indicated. The level of FCS optimization for future variants is uncertain. Data on Alpha from refs. 19,24,25, data on Delta from refs. 22,23,69 and data on Omicron from refs. 32,75,76.

The exact mechanism by which the observed FCS mutations enhance furin cleavage remains a topic of debate. Although there is fairly strong evidence that P681R directly enhances furin engagement and cleavage of the S1–S2 site19,24,69, the functional consequence of P681H71 is less clear. Mechanistically, it is likely relevant that position T678 of the spike protein, located close to residue P681, can be post-translationally modified by O-linked glycosylation72,73,74. Downstream prolines are known to promote O-linked glycosylation; therefore, an alternative explanation could be that the removal of P681, rather than the addition of the histidine per se, leads to loss of the potentially furin-obstructing glycosylation site and associated enhancement of cleavage.

Recent insights into Omicron biology have challenged the hypothesis that enhancing spike cleavage is essential for increased viral transmission. All Omicron lineages contain P681H and N679K43,44, which alone, or together, enhance cleavage of the S1–S2 site in the wild-type spike protein32,70. However, in the context of the full Omicron spike protein, evidence of such improved cleavage is less clear, with some studies finding that the S1–S2 junction appears more poorly cleaved than in previous VOCs50,75, whereas others show cleavage efficiency comparable to that of Delta32,76,77. Regardless of the cleavage phenotype, many groups have described how Omicron is able to efficiently utilize an alternative cell entry pathway. Indeed, whereas previous VOCs such as Delta are highly reliant on fusion priming by transmembrane protease serine 2 (TMPRSS2) at the cell surface, Omicron is also able to be efficiently primed by endosomal proteases, such as cathepsins, in a manner similar to SARS-CoV31,32,50,75,76,78,79. This alternative mechanism is hypothesized to be partly responsible for the reduced severity of Omicron, at least in rodent models32,75,78,80,81, due to lower fusogenicity and potentially altered tissue tropism, with a bias towards infection of the upper respiratory tract over the lower respiratory tract. Several studies have suggested molecular mechanisms for this reduced fusogenicity and altered entry route, including mutations in the spike RBD of Omicron31,82, H655Y83 or mutations in the S2 domain31,32,76, specifically N969K32, although it is contentious whether this trait is conserved across all Omicron lineages32,84. Nonetheless, Omicron lineages continue to also show high transmissibility, at least equivalent to that of Delta85,86, implying a potential decoupling between efficiency of furin cleavage and fusogenicity and their contributions to virus transmissibility.

Beyond the S1–S2 site, betacoronaviruses also require cleavage of a second protease cleavage site, known as the S2′ site. Following cleavage of the S1–S2 site and cognate receptor binding, the S2′ site becomes exposed in the S2 domain of the spike protein87. Cleavage of S2′ directly liberates the fusion peptide, leading to virus–host membrane fusion88,89,90. For SARS-CoV-2 entry into airway cells, this site is preferentially cleaved by host serine proteases, such as TMPRSS2, but it can alternatively be cleaved by endolysosomal cathepsins19,91. A number of recent articles have suggested that variation in the NTD of SARS-CoV-2, particularly via remodelling of external loops through the acquisition of deletions or insertions, can allosterically influence both S1–S2 cleavage and S2′ cleavage, and therefore fusion92,93,94.

Overall, the relationship between protease usage, S1–S2 cleavage efficiency, tropism, pathogenicity and transmissibility of SARS-CoV-2 is highly complex and, at times, results among studies have been inconsistent. Further work is, thus, required to bridge these gaps in knowledge and to fully understand this system.

Other structural and non-structural proteins and infectivity

Several recent studies have investigated the consequence of mutations in structural proteins other than the spike protein, including membrane (M), envelope (E) and nucleocapsid (N) proteins. The B.1.1 lineage is defined by a pair of substitutions in the N protein: R203K and G204R. Variants derived from B.1.1 (such as Alpha, Gamma and Omicron) inherited the same mutations, whereas Delta and Beta independently evolved R203M and T205I, respectively, with convergent functional properties. Specifically, these mutations have been shown to increase viral infectivity95,96,97, although the exact mechanism of action remains disputed. On the one hand, data from a virus-like particle-based reporter assay95 suggested that these mutations directly increase virus particle formation, whereas another report suggested a role for phosphorylation of N protein, allowing escape from restriction by the kinase GSK3 (ref. 97). An alternative explanation is that the R203K and G204R mutations introduce a novel transcriptional regulation site into the middle of the N gene, allowing the expression of a truncated form of N protein (termed ‘N*’ or ‘N.iORF3’), which might enhance virus infectivity through enhanced interferon antagonism98,99. Interestingly, there are several further examples of SARS-CoV-2 lineages evolving novel transcriptional regulation site sequences that could result in expression of truncated protein products, either in frame or out of frame, most prominently in non-structural protein 16 (NSP16)98.

Alongside N protein, mutations in M and E proteins have also been implicated in modulating SARS-CoV-2 infectivity. Substitutions in the M and E proteins of BA.1 (Omicron) have been shown to reduce cell entry of virus-like particles, although these mutations are compensated for by further substitutions in S and N proteins100. Coronavirus E proteins have several functions, one of which is to act as a cation channel, potentially within the endoplasmic reticulum (ER) and Golgi compartments to regulate multiple stages of the viral life cycle101. The T9I mutation found in Omicron E protein has been shown to attenuate this ion channel activity in vitro102, although its functional consequences are unclear.

Although ORF1ab makes up two-thirds of the SARS-CoV-2 genome, it remains the region where the impact of variant mutations is least understood. One exception is a deletion at positions 106–108 within NSP6, a mutation that is conserved among all the VOCs except Delta. NSP6 is a multipass transmembrane protein associated with the formation of the coronavirus replication organelle — the ER-derived membranous structure that is produced during infection, providing a compartment for viral RNA replication shielded from innate immunity103. A recent study showed that NSP6 forms homodimers and mediates the formation of a ‘zippered ER’ — narrow and exclusive membranous channels that connect ‘double-membrane vesicles’, which are the primary site of viral genome replication103. It was found that the 106–108 deletion in NSP6 specifically enhances the formation of this zippered ER, indicating a potential host-specific adaptation. The exact mechanism of this enhancement remains to be elucidated, although the study authors postulated that the deletion removes a putative O-linked glycosylation site. At present it remains unclear why Delta and several other variants were never observed to have gained this adaptation despite the ease with which this deletion can occur.

Beyond the structural and ORF1ab proteins, there is some very limited evidence showing adaptive changes in accessory proteins, which is covered later in this Review. Unfortunately, experimental characterization of non-spike mutations and any associated adaptations to humans in SARS-CoV-2 variants remains far behind that of the spike protein. This is due to a number of factors, including the ubiquity of pseudovirus technology used to study spike phenotypes compared with the technical complexity of reverse genetics (usually required for virological studies of non-spike mutations) and the scarcity of in vitro systems for investigating non-spike proteins. It is clear from current work that non-spike adaptations contribute substantially to virus fitness and pathogenicity, and the continued development of systems to study these regions is vital to ongoing research.

T cell responses and antigenic escape

T cells are a major part of the adaptive immune response to SARS-CoV-2 infection, with a profound CD4+ T cell and CD8+ T cell response observed in most infected individuals104. Several studies suggest an important role for T cell immunity in protection from severe COVID-19, although this is likely to be more nuanced and complex than the well-characterized correlate of protection from infection of neutralizing antibody responses105,106. Although CD4+ helper T cell responses are likely to be broadly important for antibody generation, the role of T cells in reducing disease severity might be relatively more important in scenarios where neutralizing antibody responses are diminished or not yet detectable. Early induction of SARS-CoV-2-specific T cells is seen more frequently in mild infections than in severe infections107, and CD8+ T cells may be particularly important in reducing severe outcomes in patients with B cell deficiency108. Furthermore, a fully functional CD8+ T cell response is mobilized 1 week after the first dose of mRNA BNT162b2 vaccine (Pfizer–BioNTech), at a time when neutralizing antibodies are not fully induced, raising the possibility that early vaccine-induced protection may be mostly reliant on T cells109.

Given the integral role of T cells in SARS-CoV-2 immunity, the potential exists for selective pressure to lead to T cell escape, although the extent to which SARS-CoV-2 mutations affect T cells is currently poorly understood. Functional T cell responses are directed against multiple virus proteins, with the magnitude of response correlating with viral protein expression levels. Responses to spike protein, N protein and M protein dominate, with appreciable responses also seen against ORF3a and the non-structural proteins NSP3 and NSP12 (ref. 110). As the T cell response targets epitopes across the SARS-CoV-2 genome, the footprints of T cell escape are more broadly distributed than antibody-driven changes, which are concentrated within dominant epitopes of the spike protein (Fig. 3). Few studies have documented intrahost evolution within T cell epitopes, which would serve as direct evidence of T cell escape. Mutations within CD8+ epitopes in N protein (M322I and L331F), M protein (L90F) and the spike protein (L270F) were noted within minority variants in one study during the course of acute infections, resulting in loss of epitope-specific responses111. Prolonged SARS-CoV-2 infections in immunocompromised hosts may offer greater opportunities for T cell escape, akin to the extensively described examples in HIV-1 infection112. Emergence of the NSP3 T504P mutation resulting in loss of a CD8+ epitope response has been reported in multiple individuals with impaired humoral immunity but preserved T cell responses in the context of chronic SARS-CoV-2 infection113,114. These findings are limited to a few cases, demonstrating the need for more prospective cohort studies systematically evaluating the risk of T cell escape in certain patient populations.

Fig. 3: Potential impact of SARS-CoV-2 variants on T cell responses and innate immunity.

a, Following severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, CD4+ T cell and CD8+ T cell responses are generated against 30–40 epitopes across the virus genome110 (epitopes are shown in red and blue). b, An example of how an amino acid change within one epitope might impact epitope-specific cytotoxic T cell responses, thereby inhibiting the elimination of virus-infected cells115. T cell evasion of SARS-CoV-2 has been shown to be the consequence of impaired peptide binding to the major histocompatibility complex (MHC) or poor binding of the T cell receptor (TCR) to the peptide–MHC complex. c, Although the T cell response to vaccination is focused on the spike protein alone, even the multiple spike mutations in the Omicron variant of concern reduce the vaccine-induced spike-specific T cell response by less than 30%, with considerable interindividual variability120,191. M, membrane protein; N, nucleocapsid protein.

Several mutations within immunodominant ORF3a and N protein CD8+ T cell epitopes that result in complete loss of recognition have arisen independently in multiple SARS-CoV-2 lineages115. Among these is N protein P13L, present in Omicron within a B*27:05 restricted CD8+ epitope. Given the hypothesis that VOCs emerge in chronic infections, it is tempting to speculate that the presence of P13L in the Omicron VOC reflects selection due to T cell pressure during a chronic infection, in addition to the constellation of spike mutations that are likely driven by antibody pressure. L452R found in Delta, Epsilon, Kappa and the BA.4/BA.5 variant spike proteins results in loss of an A*24:02-restricted CD8+ response116. The spike P272L substitution has arisen in multiple lineages worldwide, and leads to loss of a dominant HLA A*02:01 restricted CD8+ epitope117. The role that T cells have in driving this change, in addition to antibody evasion and enhanced ACE2-binding affinity, is uncertain. Other spike mutations in VOCs that have been associated with loss of specific CD4+ responses include L18F, D80A and D215G in Beta, and D1118H in Alpha118,119. The degree to which these observations also represent incidental effects on T cell responses with mutations driven by other pressures is currently unknown.

Despite the loss of these specific responses, several studies show that the overall T cell response induced by infections and first-generation vaccines is preserved against most VOCs105,118,119,120. Even the extensive mutations in the Omicron spike protein result in only a modest reduction (less than 30%) in total CD4+ and CD8+ responses, with considerable interindividual variability119,120. Most high-frequency spike CD4+ epitope responses are focused on discrete regions of the NTD, carboxy terminus and fusion protein regions, with very few in the RBD110. There are no clear hotspots of spike CD8+ epitopes110. Mutations concentrated in the spike RBD and NTD found in many VOCs that are thought to be driven by antibody evasion and increased ACE2-binding affinity might therefore have limited impact on the overall T cell response. Thus, most T cell epitopes are conserved in different VOCs, and this is likely to contribute to the preserved vaccine effectiveness against hospitalization and death with Omicron seen after a second dose and after a third dose when compared with no vaccination121.

The other key reason for the modest impact of variants on T cell immunity is the breadth of response generated, with each individual mounting responses to 30–40 epitopes following infection110. At a population level, there is also far greater heterogeneity in the T cell response than in antibody immunity due to a number of polymorphisms present in human HLA genes. At an individual level, however, significant reductions in spike-specific CD8+ responses to Omicron have been reported in ~15% of convalescent and vaccinated donors117, with another study noting more than 50% loss of CD4+ T cell and CD8+ T cell responses in ~20% of individuals tested122. Although the generalizability of these results is limited by small sample sizes and the HLA distribution of the populations studied, the findings nonetheless highlight the potential impact of VOCs on T cell responses in certain individuals who have spike-specific immunity generated only through vaccination.

It seems likely that antibody evasion and enhanced transmissibility will continue to be greater drivers of emerging VOCs than significant T cell escape. Whether we will see slow and sequential loss of CD8+ epitopes over time, similar to long-term adaptation in H3N2 influenza, is difficult to predict97. T cell escape can occur through several mechanisms98. Amino acid changes within epitopes or flanking regions can disrupt antigen processing, and changes to anchor residues can interfere with major histocompatibility complex (MHC) binding to epitopes123. Both these mechanisms can result in irreversible loss of T cell responsiveness to a particular epitope. Changes that impair T cell receptor binding to the peptide–MHC complex can, by contrast, result in partial or complete escape. This latter scenario might also be overcome by de novo T cell responses using alternative T cell receptor repertoires, as previously described in HIV-1 infection124.

As well as potential T cell escape, SARS-CoV-2, like many other viruses, directly downregulates MHC class I (MHC-I) expression on infected cells to evade T cell recognition, best described for the accessory protein ORF8 (refs. 125,126). ORF7a (as well as ORF3a and ORF6 (refs. 127,128)) has also been reported to downregulate MHC-I (refs. 128,129), although it is unclear whether this is a specific effect or merely the result of nonspecific Golgi fragmentation130. Recent work has shown that common substitutions seen in VOCs do not alter the ability of ORF8 to suppress MHC-I expression, with the exception of a premature stop codon at amino acid 27 in Alpha, resulting in the expression of a truncated, non-functional form of ORF8 (refs. 126,131). Despite the truncation of ORF8, in the context of infection, Alpha still downregulates MHC-I expression, implying that this variant, or possibly SARS-CoV-2 more generally, has evolved redundant mechanisms to inhibit this pathway126.

Innate immunity and SARS-CoV-2 variants

Innate immunity is an integral part of host defence against pathogens, with a key role in early viral control and tuning of adaptive immune responses132. Innate immune responses are particularly critical against novel viruses, such as zoonotic pathogens, for which there is usually no pre-existing adaptive immunity. It is surprising when novel zoonotic viruses are effective at spreading between humans through effective antagonism of innate host defences, despite recent successful replication and transmission in a distantly related host species, with its typically divergent innate immune system. A consequence of the robustness of the human innate immune system is that the proportion of zoonotic viruses that go on to cause pandemics is extremely low, and properties that evolved in the SARS-CoV-2 reservoir species, specifically generalist host tropism (which happened to include humans) coupled with acquisition of its FCS in the spike protein, possibly in an intermediate host species, were critical features for starting the COVID-19 pandemic. It is noteworthy that the related SARS virus SARS-CoV did not establish itself in the human population, despite also being highly transmissible, its eradication being attributed to less asymptomatic spread compared with SARS-CoV-2, making it easier to identify infections. Importantly, as SARS-CoV-2 VOCs have emerged, it has become clear that they are acquiring adaptations to more efficiently infect humans, and this in part is through enhanced innate immune evasion. Indeed, Alpha133,134 and more recent VOCs77,135,136 have evolved reduced sensitivity to interferons, consistent with this being a key selective pressure for virus transmission in humans. Surprisingly, rather than adapting to specifically antagonize human proteins, Alpha and Omicron sublineages BA.4 and BA.5 have in part achieved this by upregulating expression of innate immunity suppressing viral proteins, particularly the protein ORF6 (ref. 77).
ORF6 inhibits nuclear transport of transcription factors, including STAT1 and IRF3 (ref. 137), which control the expression of antiviral proteins and soluble pro-inflammatory mediators. Alpha has also enhanced expression of ORF9b (which inhibits signalling downstream of RNA sensing138) and N protein (which sequesters viral RNA to prevent activation of sensing mechanisms98), as well as de novo expression of N*/N.iORF3 – the amino-terminally truncated form of N protein, which displays some interferon antagonism but is expressed at low levels98. Increased levels of these proteins are likely the result of mutations in regulatory regions that modulate subgenomic RNA synthesis and protein expression. This highlights the importance of changes outside the spike protein in determining VOC properties, particularly in regulatory regions. Critically, mutations in the N protein Kozak sequence that are expected to influence expression of Alpha N protein and the overlapping protein ORF9b also appear in the dominant VOCs Delta and Omicron, but their full impact on innate immune antagonism in these VOCs remains to be determined. The relationship between SARS-CoV-2 and innate immunity is highly complex. For example, the viral antagonist ORF9b appears to be negatively regulated through phosphorylation by host kinases, suggesting its innate immune inhibition could be switched off at some point during infection, perhaps once host responses are triggered in infected cells133. Such an inflammatory switch mechanism, which essentially regulates host responses to infection, could drive symptomology and subsequent viral spread by altering cellular activation.

At least 15 SARS-CoV-2 proteins have been suggested to contribute to antagonism of innate responses to date (reviewed in detail elsewhere139,140). These proteins have typically been identified via reporter screens in which the SARS-CoV-2 protein of interest is expressed during a simulated in vitro innate immune response and its capacity to antagonize the response is evaluated. Such experiments do not typically provide mechanistic insight but are effective in discovering novel protein functions. Several of the VOCs exhibit amino acid changes in many of the proteins implicated as innate immune system antagonists, including NSP1, NSP3, NSP6, ORF3a, ORF6, ORF7b, ORF8 and N protein. Whether these coding mutations reflect adaptations to better antagonize human innate immunity is yet to be established in most cases. Moreover, the contribution of these proteins to the VOC phenotype is poorly understood, as this requires the painstaking generation of isogenic mutants using reverse genetics and assessment of their impact on replication, interferon production and sensitivity.

As well as reducing the induction of interferon, the Alpha and Omicron VOCs are more resistant to inhibition by its antiviral effects126,133,134,135,141. This has been best described as being related to spike adaptations which reduce sensitivity to interferon-induced transmembrane (IFITM) restriction factors76,134,142. The exact effect of IFITM proteins during SARS-CoV-2 replication is contentious, with some studies showing that IFITM proteins inhibit cellular entry of SARS-CoV-2, in a similar manner to influenza19,76,134,142, whereas others suggest that in some contexts they enhance infection, in a manner similar to that described for OC43 and HKU1 (refs. 76,134,143,144,145). The exact role of IFITM proteins is likely to be context specific, such as the particular entry pathway used by a particular virus in a given cell type or cell line, the use of live virus versus pseudovirus and the level of IFITM protein expressed. Members of the IFITM family are small, interferon-stimulated transmembrane proteins that are associated with different cell membranes; human IFITM1 is generally associated with cell surface membranes, whereas IFITM2 and IFITM3 are more localized to late and early endosomes, respectively76,146. The exact mechanism by which IFITM proteins impact viral entry is not fully resolved, but inhibition of viral glycoprotein fusion with host membranes is thought to be involved147,148. Several VOCs have been shown to have different degrees of sensitivity to IFITM protein inhibition or enhancement, quite often associated with specific entry pathways or furin cleavage phenotypes. For example, Omicron, which is more efficient at endosomal entry than early variants and other VOCs, appears to show greater inhibition (or in some cases enhancement) by endosomal IFITM proteins, although this does appear to be highly dependent on the cell system used32,76,145. 
This may reflect Omicron either having to compromise innate immune evasion in order to avoid adaptive immunity, particularly in the context of vaccine-induced spike antibodies, or having to adapt to use IFITM proteins as cofactors for entry.

It is not yet clear how enhanced antagonism of innate immunity might influence virus transmission. We hypothesize that the capacity of SARS-CoV-2 to spread efficiently is strongly linked to its ability to evade and antagonize innate immune responses in the first cells that encounter the virus in the airway. Indeed, the efficiency of infection events is expected to influence viral spread through the airway and thus the likelihood of seeding a productive infection at all. Type I interferon responses have been shown to be important in determining infection efficiency and outcome for other viruses, and have been well characterized for lentiviral infection of macaques149.

Finally, the fact that at present each VOC has evolved independently from an ancestral virus circulating early during the pandemic means that each VOC has taken a different mutational pathway to acquire distinct adaptations to humans. As a consequence, VOCs, or indeed other variants of SARS-CoV-2, could potentially recombine to unite independently encoded adaptations and thus phenotypic advantages from different variant genomes (Box 2).

Antigenic distance in determining transmission and fitness

Ever-changing population immunity creates a dynamic fitness landscape for virus variants because their fitness is as much dependent on acquired immunity as on their set of unique mutations. Although complexities such as the breadth and duration of immunity are important considerations150,151, the cumulative number of humans exposed to SARS-CoV-2, via infection and/or vaccination, results in a population much less susceptible to most circulating (and past) variants with an ever-decreasing number of immunologically naive, fully susceptible hosts (Fig. 4).

Fig. 4: Dominant SARS-CoV-2 variants, vaccinations, infections and deaths since early 2020 in the UK.

a, Waves of dominant variants in the UK (B.1, B.1.1.7/Alpha, B.1.617.2/Delta, AY.4.2/Delta and Omicron sublineages BA.1, BA.2, BA.4 and BA.5), proportion of the UK population with one or two doses and with one booster vaccination, number of COVID-19 cases and number of reported COVID-19-related deaths. Data from COG-UK Mutation Explorer192 and GOV.UK. b, Diagrammatic visualization of the dynamic relationship between variant transmissibility, antigenicity, virulence and fitness. As population immunity derived from infection and vaccination increases, the fraction of completely immunologically naive hosts declines (gradient blue lines). Consequently, the importance of antigenic novelty in determining variant fitness increases. Antigenic distance to previously circulating variants becomes an increasingly key determinant of variant transmissibility, increasing the potential for intrinsic and real-world transmissibility to diverge. Similarly, antigenic distance influences a variant’s potential to infect and cause disease in immune hosts, increasing the potential for a variant’s intrinsic virulence to diverge from its real-world clinical impact. VOC, variant of concern.

Early in the COVID-19 pandemic, when the fraction of naive hosts was greatest, there was little evolutionary benefit of antigenic novelty relative to wild-type variants. Instead, selection favoured variants capable of maximizing reproductive success through adaptation of intrinsic biological features such as the conformational change in the spike protein caused by D614G, the defining mutation of PANGO lineage B.1, or the enhanced furin cleavage phenotype coupled with increased ACE2 binding exhibited by Alpha27,29. As immunity in the host population increased, a variant’s antigenic novelty played an increasingly important role in its reproductive success, relative to intrinsic biological changes152. Subsequently, the Delta VOC became dominant globally, displacing previous variants in countries with partially immune populations with moderate-to-high vaccination coverage23,153. Virus neutralization data indicated moderate immune escape from neutralizing antibodies by the Delta VOC23,37,38,39, and vaccine effectiveness data indicated that antigenic novelty was not the primary driver of increased transmissibility40,41,42,154,155, indicating the high fitness of Delta was more the result of intrinsic viral properties such as optimization of spike furin cleavage22,23,69.
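
The shift in what determines variant fitness can be illustrated with a deliberately simple calculation. The sketch below is not a model from the cited literature: the formula, the function name `effective_r` and every parameter value are assumptions chosen for illustration. It treats a variant's effective reproduction number as its intrinsic transmissibility scaled by the share of hosts it can infect, where immune hosts remain infectable only to the extent that the variant escapes existing immunity.

```python
def effective_r(intrinsic_r0: float, naive_fraction: float, escape: float) -> float:
    """Toy effective reproduction number for a variant.

    intrinsic_r0   : transmissibility in a fully naive population (hypothetical)
    naive_fraction : share of hosts with no prior immunity, in [0, 1]
    escape         : share of immune hosts the variant can still infect, in [0, 1];
                     a crude stand-in for antigenic distance
    """
    for name, value in (("naive_fraction", naive_fraction), ("escape", escape)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    infectable = naive_fraction + (1.0 - naive_fraction) * escape
    return intrinsic_r0 * infectable


# Two hypothetical variants: one intrinsically fitter, one antigenically novel.
VARIANTS = {
    "intrinsically fitter": {"intrinsic_r0": 6.0, "escape": 0.1},
    "antigenically novel": {"intrinsic_r0": 4.0, "escape": 0.6},
}

for naive in (0.9, 0.5, 0.1):
    scores = {
        name: effective_r(naive_fraction=naive, **params)
        for name, params in VARIANTS.items()
    }
    winner = max(scores, key=scores.get)
    summary = ", ".join(f"{name}: {value:.2f}" for name, value in scores.items())
    print(f"naive fraction {naive:.1f} -> {summary} (fitter: {winner})")
```

With these assumed values the intrinsically fitter variant has the advantage while most hosts are naive, and the antigenically novel variant overtakes it once most hosts are immune. Real populations have waning, heterogeneous and partially cross-protective immunity, none of which this sketch attempts to capture.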

Compared with previous variants, Omicron showed an unprecedented degree of antigenic novelty31,33,34, arguably comparable to an influenza-like antigenic shift event51. The ‘shift’ here, the accumulation of mutations contributing to antigenic distance, probably occurs at least partly in the context of a chronic infection or chronic infections (Box 1). Comparison of transmission dynamics within vaccinated and unvaccinated households has indicated that immune escape was a critical component of the increased transmissibility of Omicron (BA.1) relative to Delta during their period of co-circulation86. The Omicron sublineage BA.2 has proved even more capable of infecting both unvaccinated and vaccinated individuals, potentially driven by immune evasiveness properties similar to those of BA.1 but with higher intrinsic transmissibility156. Most recently BA.4, BA.5, BA.2.75 and their sublineages have shown not only even greater evasion of immunity than pre-Omicron variants, but also escape from immunity generated from prior infection with Omicron, particularly BA.1 (refs. 46,47,48,49,157,158,159). This has been largely attributed to mutations at antigenically potent RBD positions, particularly L452R and F486V in the example of BA.4/BA.5 (ref. 160). Although there may remain the potential for further optimization of SARS-CoV-2 for transmission within humans, it now seems likely that the antigenic novelty and immune evasiveness of emerging variants will be the primary determinant of variant fitness and evolutionary success going forward. Consequently, understanding the complexities of cross-protection between variants is a major research priority.

Relative severity of SARS-CoV-2 variants

There is an imperative to understand how the virulence of SARS-CoV-2 variants might evolve in response to changing selection pressures. Pathogen virulence, alongside immunity, individual susceptibility, disease predisposition and other host factors, is a major contributor to disease severity and is defined in the evolutionary literature as the increased morbidity and mortality of individuals due to infection. Virulence does not necessarily decrease with time in the host population; rather, modelling data generally show a trade-off between transmission rate and virulence161,162. However, the predictability of virulence evolution is complicated by several mechanisms, including within-host competition, changing transmission routes and tropism, and interactions with the immune system162. For instance, repeated epidemic waves, characteristic of an antigenically evolving pathogen, can select for higher pathogen virulence163. Assessing the relative virulence of SARS-CoV-2 variants through disease severity in humans is challenging due to changing immune status and developments in medical interventions throughout the pandemic, although it is possible to compare disease severity of variants that infected the same population within a given period164. This approach suggests inconsistency in the direction of change in disease severity between successively dominant SARS-CoV-2 variants: the successful variants exhibited increased disease severity as Alpha replaced B.1.177, and as Delta replaced Alpha, correlating with relative changes in transmissibility164. By contrast, Omicron exhibited reduced disease severity in the period in which it co-existed with Delta, a decrease which appears to reflect a complex combination of factors including both higher infection rates of those with some degree of prior infection and an intrinsically lower virulence164,165,166,167,168.
Proposed explanations for the lower disease severity of Omicron infections include reduced fusogenicity of the spike protein, leading to less tissue damage, and altered tropism restricted more to the upper respiratory tract (due to altered TMPRSS2 use)32,75,78,79. These previous studies compared the virulence of variants existing in the same population, although they could not determine the ‘intrinsic’ virulence of non-overlapping variants that circulated in populations with different immune statuses, such as Alpha and Omicron. Given that the immune status of an individual influences the severity of symptoms in addition to the likelihood of infection, the antigenic ‘distance’ between past exposure and a diverged variant has the potential to facilitate onset of disease in an otherwise protected individual, representing the potential for divergence between inherent potential to cause harm and actual virulence in infected people.
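
The transmission and virulence trade-off mentioned above is often introduced through a textbook-style model, sketched here to show why such models do not predict an inevitable decline in virulence. This is a generic formulation rather than the specific models in the cited references, and all parameter values are hypothetical. It assumes that the transmission rate rises with virulence α with diminishing returns, β(α) = c·α^k with 0 < k < 1, and that R0(α) = β(α) / (μ + α + γ), where μ is background host mortality and γ is the recovery rate. Setting the derivative of R0 to zero gives an intermediate optimum, α* = k(μ + γ) / (1 − k).

```python
def r0(alpha: float, c: float = 1.0, k: float = 0.5,
       mu: float = 0.01, gamma: float = 0.1) -> float:
    """Basic reproduction number under an assumed saturating trade-off.

    alpha : virulence (extra host mortality rate due to infection)
    c, k  : scale and shape of the assumed curve beta(alpha) = c * alpha**k
    mu    : background host mortality rate (hypothetical value)
    gamma : recovery rate (hypothetical value)
    """
    return c * alpha ** k / (mu + alpha + gamma)


def optimal_virulence(k: float = 0.5, mu: float = 0.01, gamma: float = 0.1) -> float:
    """Analytic optimum alpha* = k * (mu + gamma) / (1 - k), valid for 0 < k < 1."""
    if not 0.0 < k < 1.0:
        raise ValueError("k must lie strictly between 0 and 1")
    return k * (mu + gamma) / (1.0 - k)


# Cross-check the analytic optimum against a brute-force search over a grid.
grid = [i / 10000 for i in range(1, 10001)]
numeric_optimum = max(grid, key=r0)

print(f"analytic optimum: {optimal_virulence():.4f}")
print(f"grid optimum:     {numeric_optimum:.4f}")
```

In this formulation selection favours an intermediate level of virulence rather than the lowest possible one, and the optimum moves if the shape of the trade-off or the duration of infection changes. The complicating mechanisms listed in the text, such as within-host competition and immune interactions, are outside the scope of this sketch.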

A complementary approach to measure severity of SARS-CoV-2 variants involves the use of animal models (reviewed in more depth previously169). Common animal models have included naive rodents, such as transgenic mice expressing human ACE2 under the control of the keratin 18 promoter (expressed abundantly in epithelial cells170) or hamsters, whose natively expressed ACE2 proteins are efficiently used by all current SARS-CoV-2 VOCs32,171. Pathogenicity in these rodent models is most commonly measured as a function of percentage weight loss, sometimes alongside use of survival curves and measures of pulmonary function172. These rodent models have largely recapitulated equivalent severity data from epidemiological studies in humans, such as Delta being more pathogenic than earlier variants and Omicron being less pathogenic than Delta32,69,75,173,174. However, the models have several limitations, as exemplified by recent epidemiological evidence from Hong Kong suggesting that Omicron sublineage BA.2 displays a disease severity similar to that of first-wave variants175, whereas rodent models generally show that Omicron is less severe than previous variants32,75,173,174,176. This inconsistency could potentially be explained by ongoing adaptation of SARS-CoV-2 to human hosts, resulting in concomitant adaptation away from rodents and other animal models32,80,177,178.

Disease severity following SARS-CoV-2 infection is correlated with several risk factors, including advanced age, being male and clinical comorbidities such as obesity and immunodeficiency, and several inflammatory markers179. As reviewed elsewhere, several recent genetic studies have focused on characteristics that might explain why some individuals are more susceptible to SARS-CoV-2 infections and others develop more severe symptoms180. However, there is an urgent need to combine these findings with data from specific variants and interventions (that is, vaccines, drugs and monoclonal antibodies) that might directly affect the observed phenotypes179. Furthermore, although the most common and easily measurable outcomes from acute infections are hospitalization or death, outcomes that are more difficult to measure also differ widely among variants, such as primary symptomatology181 or post-acute COVID-19 syndrome, although these likely differ greatly by prior immune status as well.

Conclusion

SARS-CoV-2 has been circulating for 3 years within the human population, infecting hundreds of millions of individuals. It remains, however, a relatively new human virus that continues to evolve and acquire adaptations to its new host species. Unprecedented SARS-CoV-2 genome sequence datasets generated globally have revealed evidence of beneficial mutations arising in real time and have guided laboratory experiments to better understand intrinsic properties of the interactions between the virus and the host. Although we have this extraordinary understanding of SARS-CoV-2 biology, virus fitness is highly dynamic and the ability of SARS-CoV-2 to infect, replicate within and spread among the human population has depended explicitly on the specific immune context at different periods of the pandemic. At present, Omicron is dominant worldwide, with infections being driven by emergent BA.2 and BA.5 sublineages. Although our understanding of SARS-CoV-2 is improving, virus evolution is inherently unpredictable, and a likely future scenario is the emergence of a new VOC that is antigenically and, potentially, phenotypically distinct from the early forms of Omicron. At the same time, population immunity against SARS-CoV-2 continues to accumulate and may well compensate in the case of a future variant emerging with higher severity, leading to milder acute disease.

All of the VOC progenitors have evolved from an ancestral pre-VOC virus present during the first wave of the pandemic, taking different but often convergent pathways to more efficiently infect and spread among humans, and to resist antibodies, T cell-driven immunity and innate immunity. The required adaptations, as we have discussed throughout this Review, are a mixture of changes to the host via intrinsic virus properties and escape of innate or adaptive immunity (Box 3). The prevailing hypothesis is that variants originate from chronic infections in immunocompromised individuals, in which the virus is able to establish a persistent infection due to impaired immune function67,113,182. This hypothesis explains the step change of seemingly rapid evolution seen before the emergence of new variants67,183,184. However, it should be noted that future variants will probably be directly derived from prior or contemporary VOCs, most recently exemplified by the spate of ‘second-generation’ Omicron variants derived from BA.2, such as BA.2.75, BJ.1 and BA.2.10.4 (ref. 185). While interlineage recombination serves as an opportunity for the virus to gain additive adaptations and phenotypic advantages from distantly related circulating variants, recombinants had exerted only a minor impact on the course of the pandemic before the emergence of XBB (Box 2). Furthermore, although there is currently very limited evidence for the establishment of long-term circulation and evolution in animal reservoir species, intensive and active surveillance of susceptible species is needed as reverse zoonosis is being documented4,5,186. There are many countries with low sequencing capacity, or places with previously good surveillance that are decreasing or phasing out sequencing altogether. This is troublesome as a lack of genomic surveillance will mean future variants will be detected much later or could be circulating at low levels before eventual detection.
There is, thus, a need for widespread and equitable surveillance coverage to rapidly detect potential new VOCs among these individuals and communities before they spread more widely.