Pathogen evolution in response to public health interventions such as vaccines or antimicrobials is often complex, involving multiple genetic mutations affecting distinct aspects of pathogen life history. For instance, a mutation at one locus may facilitate the escape of vaccine protections or render an antibiotic ineffective but carry a fitness cost, while a second, compensatory mutation at a different locus may reduce this cost. Crucially, any such fitness interaction between mutations (that is, epistasis) means that the transient evolutionary dynamics of each individual mutation will hinge upon their non-random associations with the others (that is, linkage disequilibrium) and so cannot be easily disentangled. Such multi-locus dynamics are especially relevant for vaccination; although vaccines are a comparatively evolutionarily robust public health measure1,2,3, pathogen evolution to vaccination can occur, with two key adaptive routes identified.

First, pathogens can undergo rapid antigenic evolution, allowing them to escape the vaccine-induced immune response (vaccine escape; for example, refs. 4,5,6,7,8). Second, if vaccines are imperfect and so do not completely block infection, the pathogen population experiences two selective environments, vaccinated and unvaccinated hosts. Pathogen fitness in vaccinated hosts is typically reduced and so increased pathogen transmissibility (or decreased clearance) is selected for9,10,11,12,13,14,15. As increasing transmission (or decreasing clearance) often requires an increase in virulence16,17,18,19, more virulent pathogens will be favoured. The evolution of higher virulence as a consequence of vaccination has been documented empirically in Marek’s disease20,21,22, experimentally in rodent malaria23,24 and has been implicated as a contributing factor to the re-emergence of pertussis25.

Although both the vaccine-induced evolution of escape (for example, refs. 4,5,6,7,8) and virulence8,9,10,11,12 are well studied, these adaptations have been considered in isolation from one another. When they have been considered in combination15, the focus was their long-term evolution and so the possibility of transient multi-locus dynamics was ignored. Here we develop a model at the interface of epidemiology and population genetics to study both the short- and long-term dynamics of multi-locus adaptation of pathogens to vaccination. Our goal is to understand the role of (deterministic) selection, and so we neglect the influence of stochasticity and chance mutations (for example, refs. 26,27,28,29). We focus upon the protective effects of vaccination acting at different stages of the pathogen’s life cycle and show that these have important consequences for the sign of epistasis between virulence and vaccine escape. We then examine the evolutionary role played by recombination on the evolutionary outcome.


Consider a standard SIR model (Fig. 1). Transmission of the pathogen occurs via mass action with rate constant β, while pathogen infection causes virulence-related mortality at a per capita rate α; thus, in what follows, the term ‘virulence’ refers specifically to infection-related mortality rather than as a measure of morbidity or symptomatic disease. A positive relationship between virulence and transmission, potentially mediated by within-host pathogen growth, has been observed for many infectious diseases16,17,19,30,31. To indicate the possibility of such a relationship, we will write β as β[α]. Infections are cleared at rate γ, and recovered hosts are fully protected from future infections through naturally acquired immunity. Although we focus primarily upon a link between virulence and transmission, we later consider a link between virulence and clearance9,17. Hosts enter the population at a fixed rate λ and are removed due to natural causes at a per capita rate d.

Fig. 1: Schematic of the epidemiological model and the evolutionary model.
figure 1

a, We highlight how each of the possible vaccine protections affect the epidemiological dynamics, where S and I are the density of susceptible and infected hosts, respectively, and the \(\hat{\cdot }\) indicates a vaccinated host. In particular, vaccines may reduce the risk of infection (ρ1), reduce within-host pathogen growth (ρ2), reduce infectiousness (ρ3), reduce virulence (ρ4) and/or hasten infection clearance (ρ5). b, We highlight the two-locus, diallelic evolutionary model. Each locus undergoes mutation (μV, μE), while recombination occurs between loci (σIT). Selection occurs through the additive selection coefficients (sE, sV) and epistasis (sEV); each of these is defined in terms of the per capita growth rates of the different strains, rij35,43. In turn, the per capita growth rates, rij, depend upon the epidemiological model (and so are not constant) and are the average growth rate of strain ij across the two selective environments, vaccinated and unvaccinated hosts (captured by vij, which denotes the fraction of strain ij infections in vaccinated hosts). Note that δEi is the Kronecker delta and is equal to one if E = i and zero otherwise.

A fraction p of the hosts entering the population are vaccinated. The vaccine has five protective effects altering distinct steps of the pathogen’s life cycle (Fig. 1a):

  1. (1)

    it reduces the risk of infection by a factor 0≤ρ1≤1;

  2. (2)

    it reduces within-host pathogen growth by a factor 0≤ρ2≤1, affecting both virulence and transmission;

  3. (3)

    it reduces infectiousness (without affecting virulence) by a factor 0≤ρ3≤1;

  4. (4)

    it reduces virulence (without affecting transmission) by a factor 0≤ρ4≤1 and

  5. (5)

    it reduces the duration of infection by increasing the recovery rate by a factor 0≤ρ5.

Vaccine protection is assumed to be lifelong.

The pathogen has two diallelic loci of interest (Fig. 1b). The first locus controls the pathogen’s ability to escape (allele E) vaccine protection or not (allele N) by reducing each of the ρi by a factor e. Carrying allele E, however, may come with fitness costs reducing transmission (cβ) or increasing the rate of clearance (cγ). Although we will primarily treat cβ and cγ as costs, our modelling framework is robust to the possibility that the escape allele increases transmission and/or clearance (that is, cβ<0 and/or cγ<0) and so is advantageous in both unvaccinated and vaccinated populations. The second locus controls the pathogen’s virulence. Strains carrying allele A express virulence αA, whereas strains carrying allele V express virulence αV and are more virulent and transmissible (that is, αV>αA and β[αV]>β[αA]). There are thus four possible pathogen strains (or variants), ij  {NA, NV, EA, EV}. Mutations occur at the escape and virulence locus at per capita rates μE and μV, respectively, while recombination (or genetic re-assortment) between strains occurs at a per capita rate σIT, where IT is the total infection density and σ is a rate constant (Fig. 1b).

Let fij and fk denote the frequency of strain ij and allele k, respectively, and D the linkage disequilibrium (LD) between alleles E and V, that is, D ≡ fEV − fEfV (ref. 32). Further, denote the additive selection coefficients for vaccine escape and (increased) virulence as sE and sV, respectively, and the epistasis in per capita growth between alleles E and V as sEV (Fig. 1). Then the evolutionary dynamics are

$$\begin{array}{rcl}\frac{\,{{\mbox{d}}}{f}_\mathrm{E}}{{{\mbox{d}}}\,t}&=&{s}_\mathrm{E}{f}_\mathrm{E}(1-{f}_\mathrm{E})+{s}_\mathrm{V}D+{s}_\mathrm{EV}(1-{f}_\mathrm{E}){f}_\mathrm{EV}+{\mu }_\mathrm{E}(1-2{f}_\mathrm{E}),\\ \frac{\,{{\mbox{d}}}{f}_\mathrm{V}}{{{\mbox{d}}}\,t}&=&{s}_\mathrm{V}{f}_\mathrm{V}(1-{f}_\mathrm{V})+{s}_\mathrm{E}D+{s}_\mathrm{EV}(1-{f}_\mathrm{V}){f}_\mathrm{EV}+{\mu }_\mathrm{V}(1-2{f}_\mathrm{V}),\\ \frac{\,{{\mbox{d}}}D}{{{\mbox{d}}}\,t}&=&\left({s}_\mathrm{E}+{s}_\mathrm{V}-2s\right)D+{s}_\mathrm{EV}{f}_\mathrm{EV}{f}_\mathrm{NA}-(2{\mu }_\mathrm{E}+2{\mu }_\mathrm{V}+\sigma {I}_\mathrm{T})D,\end{array}$$

where t denotes time and s = sEfE + sVfV + sEVfEV. System (1) has the standard interpretation (for example, refs. 33,34,35). The dynamics of allele E (mutatis mutandis allele V) depend upon: (1) the action of direct selection, sE, proportional to the genetic variance, (2) the influence of indirect selection, sV, mediated through LD, (3) epistasis, sEV, and (4) unbiased mutations. Thus the two key emergent quantities when considering the multi-locus dynamics are the non-random assortment of alleles (LD32) and any gene interactions (epistasis36). Moreover, from (1), these two quantities are linked: LD is generated by epistasis (sEV ≠ 0), which produces same sign LD37,38. Once LD is present in the population, it can be amplified by directional selection and be removed by mutations and recombination39,40.

In addition to the evolutionary dynamics described by (1), there is a system of differential equations describing the epidemiological dynamics (Supplementary Information 1). Importantly, the selection coefficients and epistasis can temporally vary in both magnitude and sign due to the epidemiological dynamics (Fig. 1).

Single locus evolution

First, consider evolution restricted to a single locus due to an absence of accessible mutations at the other locus. For example, smallpox was unable to escape vaccine immunity, leading to its eradication41,42, while for other pathogens, evolutionary changes to virulence may not be possible due to biological constraints.

If evolution is restricted to the escape locus (so μV = 0 and fV = D = 0), system (1) reduces to

$$\frac{\,{{\mbox{d}}}{f}_\mathrm{E}}{{{\mbox{d}}}\,t}={s}_\mathrm{E}{f}_\mathrm{E}(1-{f}_\mathrm{E})+{\mu }_\mathrm{E}(1-2{f}_\mathrm{E}).$$

Thus the single locus dynamics are simply governed by the flux of mutation, μE, and selection for the escape allele, sE. As the mutation rate is expected to be small, the dynamics of the escape allele will be primarily shaped by selection: if sE>0, allele E is favoured. From consideration of sE (Fig. 1b), increasing the vaccine protection against infection (ρ1), growth (ρ2), transmission (ρ3) and clearance (ρ5) reduces strain NA fitness and so favours allele E. Likewise, decreasing the costs of escape increases transmissibility and duration of carriage of strain EA, favouring allele E (Supplementary Information 3). Vaccine protection against virulence (ρ4) is an exception as it increases the duration of carriage without a concomitant reduction in transmission and so increases strain NA fitness. Generally speaking, however, if the costs of escape are not too high relative to the degree of escape and vaccine protection and coverage are not too low, allele E is favoured over allele N on the genetic background A (sE>0).

If instead evolution is restricted to the virulence locus (so μE = 0 and fE = D = 0), system (1) reduces to

$$\frac{\,{{\mbox{d}}}{f}_\mathrm{V}}{{{\mbox{d}}}\,t}={s}_\mathrm{V}{f}_\mathrm{V}(1-{f}_\mathrm{V})+{\mu }_\mathrm{V}(1-2{f}_\mathrm{V}).$$

Thus if the flux of mutations is small (μVsV), the more virulent strain will be favoured by selection if sV>0. Depending upon how much allele V increases transmissibility relative to increasing virulence, it can be unconditionally favoured or disfavoured regardless of vaccination rates. As vaccination plays a limited role in either case, suppose allele V is neither unconditionally favoured nor disfavoured. Then, selection for virulence on the genetic background N is a weighted average of conflicting selective pressures across two environments, unvaccinated and vaccinated hosts; the weights are the fraction of infections of a given strain that are found in either environment (Fig. 1b and Supplementary Information 4). If the vaccine affects virulence mortality (ρ2, ρ4) and/or clearance (ρ5), higher virulence and transmission will be selected for in vaccinated hosts to compensate for the reduced duration of infection9,10,11,13. Whether higher virulence is favoured in the population as a whole, however, is determined by the balance between the two selective environments: increasing coverage (p) and decreasing protection from infection (ρ1) and transmission (ρ3) increases the importance of vaccinated hosts and favours higher virulence9,10,11,13.

Epistasis and the production of linkage disequilibrium

Next, suppose that segregation is occurring at both loci and so both adaptations are mutationally accessible. Here a key quantity is epistasis, sEV, which occurs whenever the fitness of an allele depends upon its genetic background. Epistasis shapes the dynamics of adaptation through the build-up of LD and in our model emerges in two ways. First, whenever vaccines reduce virulence or increase clearance, higher virulence will disproportionately reduce the expected duration of infection in unvaccinated hosts relative to vaccinated hosts. Consequently, a greater proportion of strain iV infections will be found in vaccinated hosts than strain iA infections and so vaccination disproportionately affects allele V relative to allele A. Because escape reduces the effects of vaccination, this difference is magnified on the genetic background N relative to E. The second way epistasis is generated is through any non-additive interactions between parameters associated with the virulence locus (β, α) and those associated with the escape locus (vaccine protections, costs of escape35). Because αV>αA and β[αV]>β[αA], any such interactions will disproportionately affect infections carrying allele V over A for a given genetic background (E or N).

As epistasis produces same sign LD32,37,38, which in turn shapes the evolutionary dynamics, we are particularly interested in the sign of the epistatic contribution of the different types of vaccine protection. As escape reduces the effects of vaccination, manipulating the vaccine protections disproportionately affects the difference in per capita growth rate (fitness) between strains NA and NV as compared with strains EA and EV. Because epistasis is defined as sEV ≡ rEV − rEA + rNA − rNV (ref. 35,43), where rij is the fitness of strain ij, and as the difference rNA − rNV will be larger in magnitude than the difference rEA − rEV, the sign of the epistatic contribution of a given vaccine protection can be determined from whether interactions between vaccine protections and virulence disproportionately increase or decrease the fitness of strain NA relative to strain NV (this assumes escape is not too ineffective; Supplementary Information 5). If strain NA experiences an increase (respectively decrease) in fitness relative to strain NV, then positive (respectively negative) epistasis will be produced. With this in mind, the predictions are as follows (Table 1):

  1. (1)

    Vaccines that block infection (ρ1) and reduce transmission (ρ2, ρ3) produce positive epistasis. Because allele V is more transmissible than allele A and as a larger fraction of NV infections are in vaccinated hosts than NA infections, vaccination will disproportionately reduce strain NV transmissibility as compared with strain NA. Thus vaccines blocking infection or reducing transmission produce positive epistasis.

  2. (2)

    Vaccines that reduce virulence mortality (ρ2, ρ4) produce negative epistasis. Ignoring any concomitant effects to transmission, reducing virulence mortality increases pathogen fitness. Because allele V corresponds to more virulent infections and as a larger fraction of NV infections are in vaccinated hosts as compared with NA infections, allele V will disproportionately benefit from any reductions in virulence. Thus vaccines reducing virulence mortality produce negative epistasis.

  3. (3)

    Vaccines that increase clearance (ρ5) produce positive epistasis. Because a larger fraction of NV infections are in vaccinated hosts than NA infections, on the genetic background N, allele V will be disproportionately negatively affected by increased clearance due to vaccination. Consequently, vaccines increasing clearance produce positive epistasis.

Note that vaccines which block growth (ρ2) have two opposing contributions: by reducing transmission, ρ2 produces positive epistasis and by reducing virulence mortality, ρ2 produces negative epistasis. Thus, depending upon the epidemiology and the difference between αA and αV and/or β[αA] and β[αV], the epistatic contribution of ρ2 can be positive or negative. More generally, the relative importance of the different factors contributing to epistasis can temporally vary; for example, if susceptible hosts are abundant, the contribution of transmission to epistasis is likely to be more important (Supplementary Information 5).

Table 1 Summary of key results

In the next two sections, we consider the dynamic implications of the sign of epistasis by focusing upon a scenario in which the epidemiological dynamics have reached an endemic equilibrium before mutations are introduced. However, our theoretical framework allows the exploration of alternative scenarios in which transient epidemiological dynamics feedback on pathogen evolution44 (Supplementary Information 2).

Negative epistasis and evolutionary bistability

First, suppose the vaccine protections produce negative epistasis (that is, vaccines reduce virulence mortality, ρ2, ρ4; Fig. 2a) and there is positive directional selection (that is, sE>0 and sV>0) at the strain NA endemic equilibrium. In this circumstance, strains EA and NV will initially increase at a rate dictated by the strength of selection for allele E and V, respectively. As negative epistasis disfavours strain EV, the increase in strains EA and NV will cause an increase in negative LD until fE + fV = 1 and D = −fEfV. At this point and assuming mutations and recombination are sufficiently infrequent, system (1) reduces to

$$\frac{\,{{\mbox{d}}}{f}_\mathrm{E}}{{{\mbox{d}}}\,t}\approx ({s}_\mathrm{E}-{s}_\mathrm{V}){f}_\mathrm{E}(1-{f}_\mathrm{E}).$$

From equation (4), epistasis no longer directly impacts the dynamics of allele frequencies. Instead, whichever of alleles E or V is more strongly selected for will increase in frequency at the expense of the other (clonal interference), causing the population to evolve along the curve Γ(fE) in (fE, fV, D)-space (Fig. 3).

Fig. 2: Fitness landscape at the strain NA equilibrium.
figure 2

a,b, Each bar represents the per capita growth rate per unit time of strain ij at the strain NA equilibrium (so rNA = 0); the colours indicate the contribution of allele E (blue), allele V (red) and epistasis (magenta), as indicated by the labelled arrows, relative to baseline fitness (black is the fitness of strain NA). For visual clarity, the contribution of the additive selection coefficients to the fitness of strain EV when epistasis is negative are translucent grey, while the dashed lines show strain NA fitness for reference. Negative epistasis (a) occurs when vaccines reduce virulence mortality (ρ2, ρ4), while positive epistasis (b) occurs when vaccines block infection (ρ1) and transmission (ρ2, ρ3) and increase clearance (ρ5). Parameter values used were p = 0.6, e = 1, \(\beta [\alpha ]=\sqrt{\alpha }\), λ = d = 0.05, γ = 0.1; while for a, ρ1 = ρ3 = ρ5 = 0, ρ2 = 0.3, ρ4 = 0.01, cβ = 0, cγ = 0.01, and for b, ρ2 = ρ4 = 0, ρ1 = 0.025, ρ3 = 0.02, ρ5 = 1, cβ = 0.03, cγ = 0.45. Finally, αA and αV were chosen to be the optimal virulence on genetic background N in an entirely unvaccinated and vaccinated population, respectively (Supplementary Information 4).

Fig. 3: Negative epistasis and recombination can lead to evolutionary bistability.
figure 3

a,b, If both vaccine escape and virulence are advantageous (sE>0 and sV>0) but epistasis is negative (sEV<0), alleles E and V are in competition. Starting from the strain NA equilibrium (solid black circle), strains NV and EA will increase in frequency, producing negative LD (D < 0; note that D is dimensionless) until the population is in the vicinity of the curve, Γ(fE) (dashed black curve). Along Γ(fE), strains EA and NV are in direct competition, and so, whichever allele (E or V) has the larger selection coefficient will go to fixation (a) (σ = 0); blue curves correspond to fixation of strain EA and red curves to fixation of strain NV. Recombination can create an evolutionary bistability such that ‘faster’ growing strains tend to reach fixation, even if they are competitively inferior (b) (σ = 0.05); here the colours indicate which strain would fix in the absence of recombination. The magenta circle indicates the strain EV equilibrium, while the dashed lines indicate D = 0 for visual reference. Each simulation starts at the strain NA endemic equilibrium following vaccination; mutation introduces genetic variation and allows pathogen adaptation to vaccination. One hundred simulations are shown using the parameter set: p = 0.6, e = 1, \(\beta [\alpha ]=\sqrt{\alpha }\), λ = d = 0.05, γ = 0.1, μE = 10−4, μV = 10−4, ρ1 = ρ3 = ρ5 = 0 and with ρ2, ρ4, cβ, cγ chosen uniformly at random on the intervals [0, 0.95], [0, 0.95], [0, 0.05], [0, 0.25], respectively, subject to the constraints at the strain NA equilibrium, sE>0, sV>0 and αVαA−1>1.25. αA and αV were chosen to be the optimal virulence in an entirely unvaccinated and vaccinated population, respectively (Supplementary Information 4). Although equal mutation rates were chosen here (μE = μV), in general, unequal mutation rates across the loci distort the dynamics in favour of the locus with the higher mutation rate (Supplementary Information 9).

Recombination affects these dynamics; by removing LD from the population, recombination prevents LD from increasing in magnitude39,40. Consequently, the combination of strain competition and recombination means that it no longer need be true that the ‘fitter’ strain ultimately reaches fixation if doing so requires the population to first pass through an intermediate state with a large amount of LD. As such, recombination can create a bistability between the NV- and EA-strain equilibria (Fig. 3b and Supplementary Information 10). The evolutionary implications of this bistability is that being ‘first’ is more important than being ‘fitter’. Specifically, if either allele E or V is more mutationally accessible or is associated with more rapid growth initially, even if this allele is competitively inferior to the other in the long term (in the absence of recombination), it is likely to reach fixation and exclude the other (Fig. 3 and Table 1).

Positive epistasis and the evolution of virulence

Next, suppose epistasis is positive (that is, vaccines reduce infection/transmission, ρ1, ρ3 and/or increase clearance, ρ5; Fig. 2b) and sE>0 and sV>0 at the strain NA equilibrium. When does strain EV (and so positive LD) transiently increase in the population? If the selection coefficients are of comparable magnitude, sE ≈ sV, alleles E and V increase in the population at a similar speed and so positive epistasis transiently favours strain EV (Fig. 4a). On the other hand, if one selection coefficient is much larger than the other (say, sEsV), then allele E will rapidly increase, favouring whichever virulence allele it is initially associated with (that is, the virulence allele hitchhikes45). Consequently, if mutations at each locus are independent, strain EA will transiently dominate as it is more mutationally accessible from strain NA (Fig. 4a). If instead double mutations (for example, NA mutates to EV) occur with comparable frequency to single mutations, or strains EA and EV are initially equally abundant, strain EV will transiently dominate as it benefits from both the positive epistasis and selection on allele V. One key determinant of the relative magnitude of sE and sV will be the costs and degree of vaccine escape. If costs are low and vaccine escape is easily accessible, typically sEsV, whereas costly and/or limited escape can lead to sVsE.

Fig. 4: Positive epistasis and the evolution of virulence.
figure 4

a, Positive epistasis favours strains that are both more virulent and evasive in the short and long term leading to the build-up of LD (D > 0; note that D is dimensionless) in the absence of recombination (σ = 0). b, Frequent recombination breaks up LD (σ = 0.05), favouring the sequential fixation of traits; in this case, allele E reaches quasi-fixation first followed by allele V. Colours indicate which strain fixes: blue is strain EA and magenta is strain EV. Each simulation starts at the strain NA endemic equilibrium following vaccination (solid black circle); mutation introduces genetic variation and allows pathogen adaptation to vaccination. The red circle indicates the strain NV equilibrium, while the dashed lines indicate D = 0 for visual reference. a and b show 100 simulations using the parameter set: p = 0.6, e = 1, \(\beta [\alpha ]=\sqrt{\alpha }\), λ = d = 0.05, γ = 0.1, μE = 10−4, μV = 10−4, ρ2 = ρ4 = cβ = 0, and with ρ1, ρ3, ρ5, cγ chosen uniformly at random on the intervals [0, 0.9], [0, 0.9], [0, 10], [0, 5], respectively, subject to the constraints that at the strain NA equilibrium, sE>0, sV>0 and αV αA−1>1.25. αA and αV were chosen to be the optimal virulence in an entirely unvaccinated and vaccinated population, respectively (Supplementary Information 4). Although equal mutation rates were chosen here (μE = μV), in general, unequal mutation rates across the loci distort the dynamics in favour of the locus with the higher mutation rate (Supplementary Information 9).

By breaking up LD, recombination can prevent the transient selection for strain EV. Moreover, if strain EV is favoured both transiently and in the long term, since recombination breaks up positive LD evolution occurs sequentially such that either allele E or V will fix first, depending upon which allele is more strongly selected for, before evolution proceeds at the other locus (Fig. 4b and Table 1).

Epistasis and alternative life history trade-offs

The previous results assumed a link between virulence and transmission (that is, a virulence–transmission trade-off or VTT). Therefore are the epistatic contributions of the different vaccine protections contingent upon the specific life history trade-off?

To answer this, suppose higher virulence is associated with lower clearance αV>αA and γA>γV (a virulence–clearance trade-off, or VCT9,17). Under a VCT, vaccines that reduce transmission (ρ2, ρ3) and virulence mortality (ρ2, ρ4) produce positive and negative epistasis, respectively, in agreement with the predictions obtained under a VTT (Supplementary Information 8). Vaccines reducing infection (ρ1), however, do not generate epistasis under a VCT; instead, reducing infection (increasing ρ1) dampens the epistatic contribution of vaccines reducing transmission (ρ2, ρ3) by decreasing the influence of the selective environment of vaccinated hosts (Supplementary Information 8). More importantly, vaccines hastening clearance (ρ5) can generate either positive or negative epistasis under a VCT. This is because, on the one hand, a greater fraction of NV infections are found in vaccinated hosts as compared with NA infections, producing positive epistasis (this was also observed under a VTT). On the other hand, the non-additive interaction between clearance rate and ρ5 disproportionately impacts infections with a higher clearance rate (strain NV over NA), producing negative epistasis. The balance between these two forces determines the sign of epistasis produced by vaccines increasing the clearance rate (under a VCT; Table 1). Note the dynamical consequences of the sign of epistasis are unchanged under a VCT.


Unlike antimicrobials, vaccines typically provide evolutionarily robust protection and have had some remarkable successes1,2,3. However, although less common than antimicrobial resistance, pathogen evolution in response to vaccination does occur1,3,22,46,47,48. Two primary adaptive routes have been identified along which pathogens adapt to vaccination22: by evading the vaccine-induced immune response or by adjusting life history traits, particularly virulence. Here we show how different vaccine protections affect the joint evolution of vaccine escape and virulence. When virulence is linked to transmission, vaccines blocking infection, reducing transmission and/or increasing clearance generate positive epistasis and so strains that carry both vaccine-escape and virulence mutations can be favoured transiently and in the long term. In contrast, vaccines reducing virulence mortality generate negative epistasis, favouring one adaptive route or the other, but not both (Table 1). If instead virulence is linked to clearance rate, these predictions change slightly: vaccines blocking infection do not create epistasis, whereas vaccines increasing the clearance rate can create either positive or negative epistasis depending upon the epidemiology (Table 1).

From a public health standpoint, the least desirable situation is the evolution of strains that escape vaccine immunity and are more virulent; this occurs when epistasis is positive. In this case, the dynamics are very sensitive to the initial conditions, and substantial positive LD (strain EV over-representation) can build-up transiently (Supplementary Information 7). This suggests stochasticity and chance are far more likely to play important evolutionary roles if epistasis is positive. By breaking up LD, recombination can prevent the transient evolution of more virulent vaccine-escape strains, while if such strains are favoured in the long term, recombination leads to the sequential fixation of mutations. Often, vaccine-escape mutations will reach quasi-fixation first, followed by the evolution of higher virulence. In this circumstance, if vaccines can be ‘tweaked’ to restore coverage once their efficacy begins to wane due to vaccine escape (for example, as for seasonal influenza), this will force the pathogen to continually evolve vaccine escape rather than become more virulent. This would suggest that proactive updates to vaccines may be advisable; although this may be impractical with conventional technology (for example, updating/producing seasonal influenza vaccines takes months49), recent technological advances allow vaccines to be rapidly updated and manufactured (for example, mRNA vaccines50).

The competition between vaccine escape and virulence (negative epistasis) may have implications for ‘universal’ vaccines51,52,53,54. Universal vaccines target more conserved viral components than ‘conventional’ vaccines and so are expected to provide broader cross protection against a range of pathogen subtypes, thereby preventing and/or slowing antigenic escape51,53. However, the evolutionary pressures imposed by universal vaccines are not fully understood52. To date, models have primarily focused upon the potential of universal vaccines to reduce antigenic escape51,53 and have neglected other adaptive responses (for example, virulence evolution). Reducing the evolutionary likelihood of vaccine escape (by reducing μE) could induce collateral evolutionary damage by tilting the competitive balance in favour of virulence (assuming μV is also not small). Importantly, the consequences of virulence evolution will depend upon the mechanism of protection offered by universal vaccines. For example, in influenza there are two broad categories of universal vaccine candidates: those that target haemagglutinin proteins55,56 and those that induce a broadly protective T cell immune response52,53,55. Vaccines in the former category are expected to block infection55, while vaccines in the latter are not and instead are likely to reduce growth, disease severity and duration of infection52,53,57. Although both categories of vaccines should reduce the likelihood of escape51,53, our analysis suggests that by only weakly blocking infection and by decreasing disease severity, vaccines inducing a broadly protective T cell immune response will also favour selection for increased virulence. More generally, understanding the evolutionary response to universal vaccines should not focus solely upon antigenic drift.

The sequencing of SARS-CoV-2 is unveiling the complex evolutionary dynamics taking place at multiple sites in the virus genome. Current attempts to understand the dynamics of the multitude of SARS-CoV-2 variants is overwhelming. Yet, different variants often share the same mutations affecting, for example, transmission (for example, D614G58,59) or vaccine escape (for example, E484K60,61). Hence, studying the dynamics of the mutations, rather than the variants, can provide greater insight. For example, as recombination is common in coronaviruses62, including SARS-CoV-2 (for example, refs. 63,64,65), one pressing question is whether it will produce a variant that is both highly transmissible/virulent and exhibits vaccine escape63,66. Although this is a worrisome possibility, our mutation-centred analysis emphasizes that such a variant need not be ‘fitter’; depending upon the vaccine protections, these two adaptations (mutations) may produce negative epistasis and so be disfavoured in combination. Indeed, the SARS-CoV-2 vaccines provide multifaceted protection, reducing infection67,68, infectiousness69,70, virulence68 and infection duration71. The relative strengths of these protections will determine the sign and magnitude of epistasis, and it is epistasis that determines the evolutionary fate of recombinant variants, not the fitness of the ‘parental’ variants. Indeed, even if the escape allele is not costly and instead has increased transmission and/or decreased clearance as compared with the wild type, negative epistasis can still prevent the escape allele from evolving (Supplementary Information 6). Furthermore, in addition to combining different mutations, recombination also acts to break up LD between mutations. As the rate of recombination depends upon the likelihood of co-infection (and so the density of infections), non-pharmaceutical interventions such as social distancing, travel restrictions or mask wearing,will reduce its rate. Previous work suggests that the usage of non-pharmaceutical interventions in response to the SARS-CoV-2 pandemic may affect the epidemiological dynamics of seasonal influenza and respiratory syncytial virus72; our work indicates that limiting recombination may have other evolutionary consequences beyond restricting the generation of novel variants.

More generally, the framework we have developed here could equally be applied to other situations involving multiple interacting loci such as distinct loci controlling different aspects of immune-escape or compensatory mutations, while incorporating additional aspects of population structure such as a spatially heterogeneous vaccination rate or other heterogeneities among hosts are straightforward extensions35,73. Further, our multi-locus model is particularly well suited to understand how temporally variable perturbations of the epidemiological dynamics such as non-pharmaceutical interventions or other immunological effects (for example, vaccination of adults, waning immunity) will feedback on the transient evolutionary dynamics. Thus, this framework helps to account for the joint evolution of multiple mutations and can provide a complementary perspective to the variant-centred view that is currently being used to understand the ongoing pandemic of SARS-CoV-2.