“Every day sadder and sadder news of its increase. In the City died this week 7496; and of them, 6102 of the plague. But it is feared that the true number of the dead this week is near 10,000 ....” —Samuel Pepys, 1665

In the next few articles, we will discuss mathematical and statistical models that are commonly used to study the spread of infectious diseases. Such models are used to inform decisions on disease prevention, surveillance, control and treatment and can be applied to new epidemics, such as the ongoing COVID-19 outbreak.

During a so-called ‘virgin epidemic’, we initially have a population of *N* individuals — all at risk for a disease — who form the susceptible group. If an individual with a transmissible pathogen is introduced into this population, over time some of the susceptible individuals will become infected and become part of the infectious group whose members will contribute to onwards transmission. Depending on the pathogen, individuals may recover and acquire immunity, which can be for life (for example, in the case of measles) or transient (for example, in the case of influenza).

The simplest model for the spread of an infection is the SIR model^{1,2}, which tracks the fraction of a population in each of three groups: susceptible, infectious and recovered (Fig. 1a). The sizes of these groups are functions of time *t* — we will write them as *S*, *I* and *R*, with dependence on *t* implied. During a virgin epidemic, we often assume that spread is so rapid that we can ignore any change in the population due to births and deaths — this is a so-called ‘closed epidemic’ with a fixed population size *S* + *I* + *R* = *N*. The SIR model has no probabilistic component, except for the assumption that the population can mix at random and is large enough that predictions based on average rates can be used.

The model is initialized with the entire population being in the susceptible group except for a single infectious individual: *I*(0) = 1, *S*(0) = *N* – 1, *R*(0) = 0. At each time unit (for example, a day), any infected individual can come into contact with *k* other individuals in the population. On average, per unit time they will come into contact with *kS*/*N* susceptible individuals and infect *kπS*/*N* of them, if *π* is the probability of infection on contact. Conventionally, *k* and *π* are combined into a transmission rate, *β* = *πk*, which is the average rate at which an infected individual can infect a susceptible. Infected individuals recover at a constant rate *γ* and 1/*γ* is the infectious period (or average recovery time). The rates of change of the three groups can be written as a set of three coupled differential equations:

\[\frac{{\mathrm{d}}S}{{\mathrm{d}}t} = - \frac{\beta SI}{N},\qquad \frac{{\mathrm{d}}I}{{\mathrm{d}}t} = \frac{\beta SI}{N} - \gamma I,\qquad \frac{{\mathrm{d}}R}{{\mathrm{d}}t} = \gamma I\]

The equations are typically solved numerically^{2}. For now, there are several key observations based on their forms. An outbreak will take off if initially d*I*/d*t* > 0, which implies *N*/*S*(0) < *β*/*γ* or, if *S*(0) ≈ *N*, that *β*/*γ* > 1. In this case, the initial increase in *I* will be exponential with a rate *r* = log(*β*/*γ*)/*G* and a doubling time of log(2)/*r*. Here, *G* is called the serial interval and is the average time between successive cases in a chain of transmission. For example, for influenza with a *β*/*γ* ratio of around 2.5 and a serial interval of 3.5 days, cases will initially double every 2.6 days. Eventually, however, *I* will drop to zero (if *γ* > 0) because the *S* group is depleted during the course of the epidemic as a result of the immunity of the *R* group. Many important aspects of the dynamics of an epidemic are influenced by the ratio *β*/*γ*, called the basic reproduction number (*R*_{0}), which represents the expected number of secondary cases caused by a single primary infected individual introduced into a population with no prior immunity.
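The influenza arithmetic above can be checked with a few lines of Python. This is only a sketch of the two formulas quoted in the text, *r* = log(*β*/*γ*)/*G* and doubling time log(2)/*r*; the function names are illustrative and not part of any published tool.

```python
import math


def initial_growth_rate(r0: float, serial_interval: float) -> float:
    """Early exponential growth rate r = log(R0) / G (per unit of G's time scale)."""
    if r0 <= 1:
        raise ValueError("the outbreak only takes off when R0 > 1")
    return math.log(r0) / serial_interval


def doubling_time(r0: float, serial_interval: float) -> float:
    """Time for the number of cases to double while growth is still exponential."""
    return math.log(2) / initial_growth_rate(r0, serial_interval)


# Influenza-like values quoted in the text: beta/gamma of about 2.5, G = 3.5 days.
flu_doubling = doubling_time(2.5, 3.5)
print(f"doubling time: {flu_doubling:.2f} days")
```

The result is close to 2.65 days, consistent with the roughly 2.6 days quoted above.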

Figure 1b tracks an outbreak with *R*_{0} = 2 and 1/*γ* = 14 days (*β* = *R*_{0}*γ* ≈ 0.14 per day). Per convention, we express *S*, *I* and *R* as percentages of the total population size, *N*. A key feature of *I* is the location and size of the peak infection, which occurs when *S* = 1/*R*_{0} and is given by *I*_{max} = 1 – (1 + log*R*_{0})/*R*_{0} (ref. ^{3}). We find *I*_{max} = 15% at *t* = 95 days. A second important feature is the total number of people infected, the cumulative epidemic size *R*(∞), which is 80%. Regardless of the value of *R*_{0}, the epidemic will self-extinguish (*I*(*t*) → 0 as *t* → ∞) if no new susceptible individuals are added into the population (either through births or through loss of immunity). This happens because, with time, recovery begins to outpace infection before all the remaining susceptible individuals are infected. Thus, the model predicts that a fraction of people, *S*(∞), will escape infection. This fraction is given by the implicit equation \(S\left( \infty \right) = {\mathrm{e}}^{-R_0\left( {1 - S\left( \infty \right)} \right)}\), which can be approximated by \(S\left( \infty \right) \approx {\mathrm{e}}^{-R_0}\) as long as *R*_{0} ≳ 2.5.
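A trajectory of this kind can be reproduced, at least approximately, by integrating the SIR equations numerically. The sketch below works with population fractions (so d*S*/d*t* = –*βSI*, d*I*/d*t* = *βSI* – *γI*, d*R*/d*t* = *γI*) and uses a hand-written fourth-order Runge–Kutta step so that it needs only the standard library. The initial infectious fraction of 0.1%, the step size and the 500-day horizon are assumptions made for illustration; the peak height and final size depend only weakly on the seed, but the peak time shifts with it, so it need not match the 95 days of Fig. 1b.

```python
def rk4_step(state, dt, beta, gamma):
    """Advance (S, I, R), expressed as population fractions, by one RK4 step."""

    def rates(y):
        s, i, _ = y
        infections = beta * s * i  # flow S -> I
        recoveries = gamma * i     # flow I -> R
        return (-infections, infections - recoveries, recoveries)

    def shifted(y, k, h):
        return tuple(a + h * b for a, b in zip(y, k))

    k1 = rates(state)
    k2 = rates(shifted(state, k1, dt / 2))
    k3 = rates(shifted(state, k2, dt / 2))
    k4 = rates(shifted(state, k3, dt))
    return tuple(
        y + dt / 6 * (a + 2 * b + 2 * c + d)
        for y, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_sir(r0, infectious_period, i0=1e-3, days=500.0, dt=0.1):
    """Return a list of (t, S, I, R) rows; i0, days and dt are illustrative defaults."""
    gamma = 1.0 / infectious_period
    beta = r0 * gamma
    state = (1.0 - i0, i0, 0.0)
    trajectory = [(0.0, *state)]
    for step in range(1, int(round(days / dt)) + 1):
        state = rk4_step(state, dt, beta, gamma)
        trajectory.append((step * dt, *state))
    return trajectory


# Parameters of Fig. 1b: R0 = 2, infectious period of 14 days.
trajectory = simulate_sir(r0=2.0, infectious_period=14.0)
peak_t, _, peak_i, _ = max(trajectory, key=lambda row: row[2])
final_r = trajectory[-1][3]
print(f"peak I = {peak_i:.1%} at t = {peak_t:.0f} days, R at end = {final_r:.1%}")
```

Under these assumptions the peak is close to 15% and the cumulative epidemic size close to 80%, in line with the values quoted above. A library integrator could be swapped in for `rk4_step` without changing the rest of the sketch.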

If the number of secondary cases is increased by 50% (*R*_{0} = 3), the trajectories change: now, *I*_{max} = 30% at *t* = 54 days, *R*(∞) = 94% and only *S*(∞) = 6% is predicted to escape infection (Fig. 1c).
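The peak and final-size figures for *R*_{0} = 2 and *R*_{0} = 3 can also be recovered without simulating, directly from the two closed-form relations quoted earlier. The sketch below evaluates *I*_{max} = 1 – (1 + log*R*_{0})/*R*_{0} and solves the implicit final-size equation by simple fixed-point iteration. The choice of solver and the function names are assumptions for illustration, and the iteration is only expected to behave well when *R*_{0} is comfortably above 1.

```python
import math


def peak_infectious_fraction(r0):
    """I_max = 1 - (1 + log R0) / R0, for R0 > 1 and a negligible initial seed."""
    return 1.0 - (1.0 + math.log(r0)) / r0


def escaping_fraction(r0, tol=1e-12, max_iter=10_000):
    """Solve S = exp(-R0 * (1 - S)) for S(inf) by fixed-point iteration."""
    s = 0.5  # any start below 1 avoids the trivial root S = 1
    for _ in range(max_iter):
        s_next = math.exp(-r0 * (1.0 - s))
        if abs(s_next - s) < tol:
            return s_next
        s = s_next
    raise RuntimeError("fixed-point iteration did not converge")


for r0 in (2.0, 3.0):
    s_inf = escaping_fraction(r0)
    print(
        f"R0 = {r0}: I_max = {peak_infectious_fraction(r0):.1%}, "
        f"R(inf) = {1 - s_inf:.1%}, S(inf) = {s_inf:.1%}, "
        f"exp(-R0) = {math.exp(-r0):.1%}"
    )
```

The last column shows how the approximation *S*(∞) ≈ e^{–*R*_{0}} compares with the exact root: it is noticeably too small at *R*_{0} = 2 and closer at *R*_{0} = 3.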

The spread of the infection can be mitigated by reducing *R*_{0}. This can be accomplished by shortening the infectious period 1/*γ* (for example, by therapeutics) or by reducing *β* = *πk*. Hygiene measures (sewage systems, hand-washing, air filters, and so on) lower *π* by reducing the number of contagious particles that are exchanged among individuals. Other measures, like quarantines, social distancing and travel restrictions, reduce the contact rate *k*.

When *R*_{0} is reduced, the infected fraction peak is delayed and lowered (the ‘flattening of the curve’ effect, recently popularized in the context of the COVID-19 outbreak), which is particularly valuable because it reduces the pressure on the health care system (Fig. 2a). Even small decreases in *R*_{0} can have substantial public health benefits. For example, decreasing *R*_{0} by 10% from 2.0 to 1.8 decreases *I*_{max} by a fifth (15% to 12%), delays the peak time by a fifth (95 to 113 days) and lowers the total epidemic size from 80% to 73% (Fig. 2b). The relationship between *I*_{max} and the time at which it occurs is nonlinear: each further reduction in the peak buys a progressively longer delay (Fig. 2b). For example, reducing *I*_{max} from 30% to 20% delays peak infection time by 24 days (54 to 78) but reducing it to 10% delays it by 71 days (from 54 to 125).
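One way to make the flattening targets concrete is to ask which *R*_{0} corresponds to a desired peak. The sketch below inverts the peak formula *I*_{max} = 1 – (1 + log*R*_{0})/*R*_{0} by bisection, which works because the predicted peak grows monotonically with *R*_{0} above 1. The function names, the search bracket and the baseline of *R*_{0} = 3 are illustrative assumptions, and the calculation says nothing about peak timing, which requires integrating the equations.

```python
import math


def peak_infectious_fraction(r0):
    """I_max = 1 - (1 + log R0) / R0, for R0 >= 1 and a negligible initial seed."""
    return 1.0 - (1.0 + math.log(r0)) / r0


def r0_for_peak(target_peak, lo=1.0, hi=50.0, tol=1e-10):
    """Bisection for the R0 whose predicted peak equals target_peak."""
    if not 0.0 < target_peak < peak_infectious_fraction(hi):
        raise ValueError("target peak is outside the bracketed range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if peak_infectious_fraction(mid) < target_peak:
            lo = mid  # peak too small: the required R0 is larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


baseline_r0 = 3.0  # peak of about 30%, as in Fig. 1c
for target in (0.20, 0.10):
    needed = r0_for_peak(target)
    cut = 1.0 - needed / baseline_r0
    print(f"peak of {target:.0%} needs R0 of about {needed:.2f} ({cut:.0%} below baseline)")
```

Under these assumptions, a 20% peak corresponds to an *R*_{0} of roughly 2.3 and a 10% peak to roughly 1.7.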

A celebrated insight from the SIR model is the profound value of vaccination, which moves individuals from the *S* group directly to the *R* group. Since spread will self-limit when *S* < 1/*R*_{0}, vaccinating at least a fraction *p*_{c} = 1 – 1/*R*_{0} of the population will prevent an outbreak^{4}. This is the ‘herd immunity’ threshold. Practically, there will still be susceptible individuals in the population, but a pathogen will result in only a short and stuttering chain of transmission because infectious individuals are unlikely to encounter enough susceptible ones.

For smallpox, the only human disease eradicated by vaccination, herd immunity is achieved at *p*_{c} ≈ 80% (*R*_{0} ≈ 5; ref. ^{1}). For the seasonal flu it is achieved at *p*_{c} = 50–75% (*R*_{0} ≈ 2–4). However, current influenza vaccines typically confer immunity to only a portion of vaccinated individuals, so higher coverage is needed to establish herd immunity, and vaccines may need to be updated as new strains emerge due to viral evolution. This is in contrast to the 90–99% efficacy of the smallpox and other childhood vaccines. Many zoonotic viruses, such as MERS, can be deadly, but they do not pose major risks for sustained human-to-human chains of transmission resulting in epidemics because *R*_{0} < 1.
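The thresholds above follow directly from *p*_{c} = 1 – 1/*R*_{0}. The sketch below also shows the standard adjustment for an imperfect vaccine, in which the required coverage is *p*_{c} divided by the fraction of vaccinated individuals who actually become immune. The efficacy of 0.6 used for influenza is a hypothetical value chosen only to illustrate the calculation; it is not taken from the text.

```python
def herd_immunity_threshold(r0):
    """Critical immune fraction p_c = 1 - 1/R0 (zero when R0 <= 1)."""
    return max(0.0, 1.0 - 1.0 / r0)


def required_coverage(r0, efficacy):
    """Coverage needed when only a fraction `efficacy` of vaccinees become immune.

    A result above 1 means herd immunity cannot be reached with this vaccine alone.
    """
    if not 0.0 < efficacy <= 1.0:
        raise ValueError("efficacy must be in (0, 1]")
    return herd_immunity_threshold(r0) / efficacy


print(f"smallpox, R0 = 5: p_c = {herd_immunity_threshold(5):.0%}")
for r0 in (2, 4):
    print(f"influenza, R0 = {r0}: p_c = {herd_immunity_threshold(r0):.0%}")

# Hypothetical efficacy of 0.6, for illustration only.
print(f"influenza, R0 = 2, efficacy 0.6: coverage = {required_coverage(2, 0.6):.0%}")
```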

An important caveat for these calculations is our original assumption of random mixing among individuals. Assortative mixing can break these predictions, as evidenced by recent outbreaks of measles (*R*_{0} = 12–20; ref. ^{1}) in the United States, where overall vaccination coverage is generally sufficient but vaccine-refusing communities are clustered, so that local coverage drops below the *p*_{c} ≈ 93% estimated to achieve herd immunity.

In Fig. 3 we show the trajectories of *I* and *R* for a disease with *R*_{0} = 3 at various vaccination fractions *p* = 0 to 0.5. As *p* is increased, the rate of infections decreases, the point in time at which infections peak is delayed, and the cumulative epidemic size decreases. For example, if we vaccinate *p* = 0.5 of the population, we decrease peak infection from *I*_{max} = 30% to 3.2% and the size of the epidemic from *R*(∞) = 94% to 29%. An interactive tool to explore infection spread trajectories (Figs. 1–3) is available at https://github.com/martinkrz/posepi1.
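The effect of vaccination on the peak and on the epidemic size can be approximated in closed form by starting the epidemic with a susceptible fraction *S*(0) = 1 – *p* instead of 1. The sketch below assumes a perfectly effective vaccine and a negligible initial seed, and uses the standard generalizations *I*_{max} = *S*(0) – (1 + log(*R*_{0}*S*(0)))/*R*_{0} and *S*(∞) = *S*(0)e^{–*R*_{0}(*S*(0) – *S*(∞))}. These two formulas are not stated in the text, and the function names are illustrative.

```python
import math


def peak_infectious(r0, p=0.0):
    """Peak infectious fraction when a fraction p is vaccinated before the outbreak."""
    s0 = 1.0 - p
    if r0 * s0 <= 1.0:
        return 0.0  # at or above the herd immunity threshold: no outbreak
    return s0 - (1.0 + math.log(r0 * s0)) / r0


def epidemic_size(r0, p=0.0, tol=1e-12):
    """Fraction of the population ever infected, S(0) - S(inf), found by bisection."""
    s0 = 1.0 - p
    if r0 * s0 <= 1.0:
        return 0.0
    lo, hi = 0.0, 1.0 / r0  # the non-trivial root lies below 1/R0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid - s0 * math.exp(-r0 * (s0 - mid)) < 0.0:
            lo = mid
        else:
            hi = mid
    return s0 - 0.5 * (lo + hi)


for p in (0.0, 0.25, 0.5):
    print(
        f"p = {p:.2f}: I_max = {peak_infectious(3.0, p):.1%}, "
        f"epidemic size = {epidemic_size(3.0, p):.1%}"
    )
```

With *R*_{0} = 3 and *p* = 0.5 this gives a peak a little above 3% and an epidemic size of about 29%, close to the simulated values quoted above. Small differences from the simulated figures are expected because the closed form ignores the initial infectious seed.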

Shortly after a new infectious disease appears, it is often possible to estimate the parameters of the basic SIR model. This gives valuable insight to predict the disease trajectory and the needed reductions in *R*_{0} for control. The basic SIR framework introduced here is also readily extended to realistically model more complex population and disease dynamics: births and deaths in the general population, age structure, non-random mixing and spatial heterogeneities, asymptomatic carriers, latent periods (when individuals are infected but not yet infectious), loss of immunity, diseases that are transmitted by vectors (such as ticks and mosquitos) and diseases that require specific types of contacts (such as sexually transmitted diseases). We will discuss many of these in the next column.

## References

1. Anderson, R. M., Anderson, B. & May, R. M. *Infectious Diseases of Humans: Dynamics and Control* (Oxford Univ. Press, 1992).
2. Bjørnstad, O. N. *Epidemics: Models and Data Using R* (Springer, 2018).
3. Weiss, H. The SIR model and the foundations of public health. *Materials Matemàtics* (2013); http://mat.uab.cat/matmat/PDFv2013/v2013n03.pdf
4. Anderson, R. M. & May, R. M. *Nature* **318**, 323–329 (1985).

## Ethics declarations

### Competing interests

The authors declare no competing interests.

## Additional information

*Editor’s note: Points of Significance are commissioned and are not peer-reviewed.*

## About this article

### Cite this article

Bjørnstad, O.N., Shea, K., Krzywinski, M. *et al.* Modeling infectious epidemics. *Nat Methods* **17**, 455–456 (2020). https://doi.org/10.1038/s41592-020-0822-z
