In the next few articles, we will discuss mathematical and statistical models that are commonly used to study the spread of infectious diseases. Such models are used to inform decisions on disease prevention, surveillance, control and treatment and can be applied to new epidemics, such as the ongoing COVID-19 outbreak.

During a so-called ‘virgin epidemic’, we initially have a population of N individuals — all at risk for a disease — who form the susceptible group. If an individual with a transmissible pathogen is introduced into this population, over time some of the susceptible individuals will become infected and become part of the infectious group whose members will contribute to onwards transmission. Depending on the pathogen, individuals may recover and acquire immunity, which can be for life (for example, in the case of measles) or transient (for example, in the case of influenza).

The simplest model for the spread of an infection is the SIR model (refs. 1,2), which tracks the fraction of a population in each of three groups: susceptible, infectious and recovered (Fig. 1a). The sizes of these groups are functions of time t; we will write them as S, I and R, with dependence on t implied. During a virgin epidemic, we often assume that spread is so rapid that we can ignore any change in the population due to births and deaths. This is a so-called ‘closed epidemic’ with a fixed population size S + I + R = N. The SIR model has no probabilistic component, except for the assumption that the population can mix at random and is large enough that predictions based on average rates can be used.

The model is initialized with the entire population being in the susceptible group except for a single infectious individual: I(0) = 1, S(0) = N – 1, R(0) = 0. In each unit of time (for example, a day), any infected individual can come into contact with k other individuals in the population. On average, per unit time they will come into contact with kS/N susceptible individuals and infect kπS/N of them, if π is the probability of infection on contact. Conventionally, k and π are combined into a transmission rate, β = πk, which is the average rate at which an infected individual can infect a susceptible individual. Infected individuals recover at a constant rate γ, and 1/γ is the infectious period (or average recovery time). The rates of change of the three groups can be written as a set of three coupled differential equations:

$$\frac{{\mathrm{d}}S}{{\mathrm{d}}t} = - \frac{\beta SI}{N}$$
$$\frac{{\mathrm{d}}I}{{\mathrm{d}}t} = \frac{\beta SI}{N} - \gamma I$$
$$\frac{{\mathrm{d}}R}{{\mathrm{d}}t} = \gamma I$$
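The three equations above can be integrated step by step. The following is a minimal sketch in plain Python using the classical fourth-order Runge-Kutta scheme; the function names, the step size `dt` and the population size `n = 1000` are illustrative assumptions, not values taken from the text. Note that the timing of the peak depends on the assumed N and I(0), so it need not match any figure.

```python
def sir_rates(s, i, beta, gamma, n):
    """Return (dS/dt, dI/dt, dR/dt) for the closed SIR model."""
    new_infections = beta * s * i / n
    recoveries = gamma * i
    return (-new_infections, new_infections - recoveries, recoveries)


def simulate_sir(beta, gamma, n, i0=1.0, days=400.0, dt=0.1):
    """Integrate the SIR equations with classical RK4.

    Returns a list of (t, S, I, R) tuples, one per time step.
    """
    s, i, r = n - i0, i0, 0.0
    trajectory = [(0.0, s, i, r)]
    for step in range(1, int(round(days / dt)) + 1):
        k1 = sir_rates(s, i, beta, gamma, n)
        k2 = sir_rates(s + 0.5 * dt * k1[0], i + 0.5 * dt * k1[1], beta, gamma, n)
        k3 = sir_rates(s + 0.5 * dt * k2[0], i + 0.5 * dt * k2[1], beta, gamma, n)
        k4 = sir_rates(s + dt * k3[0], i + dt * k3[1], beta, gamma, n)
        s += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        i += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        r += dt * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2]) / 6.0
        trajectory.append((step * dt, s, i, r))
    return trajectory


# Illustrative run: beta/gamma = 2, 1/gamma = 14 days, assumed N = 1000.
n = 1000.0
trajectory = simulate_sir(beta=2.0 / 14.0, gamma=1.0 / 14.0, n=n)
t_peak, _, i_peak, _ = max(trajectory, key=lambda row: row[2])
print(f"peak: {i_peak / n:.1%} infectious on day {t_peak:.0f}")
print(f"final size: {trajectory[-1][3] / n:.1%}")
```

Because the three rates sum to zero, S + I + R stays equal to N throughout the run, which is a convenient sanity check on any implementation.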

The equations are typically solved numerically (ref. 2). For now, there are several key observations based on their forms. An outbreak will take off if initially dI/dt > 0, which implies N/S(0) < β/γ or, if S(0) ≈ N, that β/γ > 1. In this case, the initial increase in I will be exponential with a rate r = log(β/γ)/G and a doubling time of log(2)/r, where log denotes the natural logarithm. Here, G is called the serial interval and is the average time between successive cases in a chain of transmission. For example, for influenza with a β/γ ratio of around 2.5 and a serial interval of 3.5 days, cases will initially double every 2.6 days. Eventually, however, I will drop to zero (if γ > 0) because the S group is depleted during the course of the epidemic as individuals move into the immune R group. Many important aspects of the dynamics of an epidemic are influenced by the ratio β/γ, called the basic reproduction number (R0), which represents the expected number of secondary cases caused by a single primary infected individual introduced into a population with no prior immunity.
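The doubling-time arithmetic in the influenza example can be checked with a few lines of Python. This is only a sketch of the formula r = log(β/γ)/G quoted above; the function name `doubling_time` is an illustrative choice.

```python
import math


def doubling_time(r0, serial_interval):
    """Initial doubling time from r = log(R0) / G and T = log(2) / r."""
    if r0 <= 1.0:
        # With beta/gamma <= 1 the outbreak does not take off, so no doubling.
        raise ValueError("doubling time is only defined for r0 > 1")
    growth_rate = math.log(r0) / serial_interval
    return math.log(2.0) / growth_rate


# Influenza example from the text: beta/gamma of about 2.5, G = 3.5 days.
print(round(doubling_time(2.5, 3.5), 1))  # 2.6
```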

Figure 1b tracks an outbreak with R0 = 2 and 1/γ = 14 days (β = R0γ = 0.14). Per convention, we express S, I and R as percentages of the total population size, N. A key feature of I is the location and size of the peak infection, which occurs when S = 1/R0 and is given by Imax = 1 – (1 + log R0)/R0 (ref. 3). We find Imax = 15% at t = 95 days. A second important feature is the total number of people infected, that is, the cumulative epidemic size, R(∞), which is 80%. Regardless of the value of R0, the epidemic will self-extinguish (I(∞) → 0) if no new susceptible individuals are added into the population (either through births or through loss of immunity). This happens because, with time, recovery begins to outpace infection before all the remaining susceptible individuals are infected. Thus, the model predicts there will be a fraction of people, S(∞), who escape infection given by the implicit equation $$S\left( \infty \right) = {\mathrm{e}}^{-R_0\left( {1 - S\left( \infty \right)} \right)}$$, which can be approximated by $$S\left( \infty \right) \approx {\mathrm{e}}^{-R_0}$$ as long as R0 ≥ 2.5.
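The peak size and the escaped fraction can be evaluated directly from these expressions. The sketch below solves the implicit equation for S(∞) by fixed-point iteration started from zero, which is one simple choice among several possible root-finding methods; the function names and the tolerance are illustrative assumptions.

```python
import math


def peak_infectious_fraction(r0):
    """Imax = 1 - (1 + log R0) / R0, assuming almost everyone starts susceptible."""
    return 1.0 - (1.0 + math.log(r0)) / r0


def escaped_fraction(r0, tol=1e-12, max_iter=10_000):
    """Solve s = exp(-R0 * (1 - s)) for the non-trivial root with s < 1."""
    if r0 <= 1.0:
        raise ValueError("a non-trivial solution requires r0 > 1")
    s = 0.0  # starting at 0 avoids the trivial fixed point s = 1
    for _ in range(max_iter):
        s_next = math.exp(-r0 * (1.0 - s))
        if abs(s_next - s) < tol:
            return s_next
        s = s_next
    raise RuntimeError("fixed-point iteration did not converge")


for r0 in (2.0, 3.0):
    s_inf = escaped_fraction(r0)
    print(
        f"R0 = {r0}: Imax = {peak_infectious_fraction(r0):.0%}, "
        f"epidemic size = {1.0 - s_inf:.0%}, escaped = {s_inf:.0%}, "
        f"exp(-R0) = {math.exp(-r0):.0%}"
    )
```

Printing exp(-R0) next to the iterated value shows how the approximation improves as R0 grows.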

If the number of secondary cases is increased by 50% (R0 = 3), the trajectories change: now, Imax = 30% at t = 54 days, R(∞) = 94% and only S(∞) = 6% is predicted to escape infection (Fig. 1c).

The spread of the infection can be mitigated by reducing R0. This can be accomplished by reducing the infectious period 1/γ (for example, by therapeutics) or by reducing β = πk. Hygiene measures (sewage systems, hand-washing, air filters, and so on) reduce π by lowering the number of contagious particles that are exchanged among individuals. Other measures, like quarantines, social distancing and travel restrictions, reduce the contact rate k.

When R0 is reduced, the peak of the infected fraction is delayed and lowered. This ‘flattening of the curve’ effect, recently popularized in the context of the COVID-19 outbreak, is particularly valuable because it reduces the pressure on the health care system (Fig. 2a). Even small decreases in R0 can have substantial public health benefits. For example, decreasing R0 by 10% from 2.0 to 1.8 decreases Imax by a fifth (15% to 12%), delays the peak time by a fifth (95 to 113 days) and lowers the total epidemic size from 80% to 73% (Fig. 2b). It is worth noticing the change in the relationship between Imax and the time at which it occurs when R0 is decreased (Fig. 2b). For example, reducing Imax from 30% to 20% delays peak infection time by 24 days (54 to 78) but reducing it to 10% delays it by 71 days (from 54 to 125).
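The flattening effect can be explored with a small simulation sweep over R0. The sketch below uses a simple forward Euler integration with a small time step and assumes N = 1000 with a single initial case; both are illustrative assumptions, and because the peak time depends on N and I(0), the days printed here need not match the values quoted from the figures.

```python
def epidemic_peak(r0, infectious_period=14.0, n=1000.0, dt=0.01, days=600.0):
    """Return (peak day, peak infectious fraction) for a closed SIR epidemic.

    Forward Euler with a small step; n and dt are assumed, not from the text.
    """
    gamma = 1.0 / infectious_period
    beta = r0 * gamma
    s, i = n - 1.0, 1.0  # one initial case, everyone else susceptible
    peak_day, peak_i = 0.0, i
    for step in range(1, int(round(days / dt)) + 1):
        new_infections = beta * s * i / n * dt
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
        if i > peak_i:
            peak_day, peak_i = step * dt, i
    return peak_day, peak_i / n


for r0 in (3.0, 2.0, 1.8):
    day, fraction = epidemic_peak(r0)
    print(f"R0 = {r0}: peak of {fraction:.0%} on day {day:.0f}")
```

Lowering `r0` in the loop should move the peak later and make it smaller, which is the qualitative pattern described above.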