The emergence of a disease combines two elements: the introduction of the pathogen into the human population and its subsequent spread and maintenance within the population. Ecological factors such as human behaviour can influence both of these elements, and consequently ecology has been recognized to have an important role in the emergence of disease1,2,4. In contrast, evolutionary factors including the adaptation of the pathogen to growth within humans and the subsequent transmission of the pathogen between humans are mostly considered in terms of changes in the virulence of the pathogen, and are often thought to have a lesser role in the initial emergence of pathogens4. One exception5 suggests that immunocompromised individuals might provide “stepping stones” for the evolution of pathogens.

The successful emergence of a pathogen requires the basic reproductive number R0 (the average number of secondary infections caused by one infected individual in a wholly susceptible population) to exceed one in the new host. Only then can an introduction trigger emergence2. (Here we use R0 to refer to spread in human populations, not in the natural reservoir.) If R0 for a potential pathogen exceeds one, this scenario represents an epidemic waiting to happen. By contrast, when R0 is initially less than one, infections will inevitably die out and there will be no epidemic unless genetic or ecological changes drive R0 above one.

There are a number of ways in which R0 can increase. Ecological changes such as changes in host density or behaviour can increase R0, as can genetic changes in the pathogen population or in the population of its new host. Genetic changes in the pathogen can arise either through ‘coincidental’ processes such as neutral drift or coevolution of the pathogen and its reservoir host, or through adaptive evolution of the pathogen during chains of transmission in humans. Genetic changes of the new host might be more likely for domesticated or endangered species than for humans.

Here we show that factors, such as ecological changes, that increase the R0 value of the potential pathogen to a level not sufficient to cause an epidemic (that is, R0 remains less than one) can greatly increase the length of the stochastic chains of disease transmission. These long transmission chains provide an opportunity for the pathogen to adapt to human hosts, and thus for the disease to emerge.
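The link between R0 and chain length can be illustrated with a standard result for subcritical branching processes: if generation k contains R0^k cases on average, the mean total number of cases per introduction is the geometric sum 1/(1 − R0). The following is a minimal sketch of that arithmetic only; the function name is hypothetical and the R0 values are illustrative, not estimates for any pathogen.

```python
def expected_chain_size(r0: float) -> float:
    """Mean total number of cases (primary case included) for 0 <= R0 < 1."""
    if not 0.0 <= r0 < 1.0:
        raise ValueError("the geometric-series formula only holds for 0 <= R0 < 1")
    # Generation k contributes r0**k cases on average; summing over k gives 1/(1 - r0).
    return 1.0 / (1.0 - r0)


if __name__ == "__main__":
    # Illustrative values only: the mean chain size grows sharply as R0 approaches one.
    for r0 in (0.1, 0.5, 0.9, 0.99):
        print(f"R0 = {r0:4.2f}: mean chain size = {expected_chain_size(r0):6.1f}")
```

Under this result, raising R0 from 0.5 to 0.9 increases the mean chain size fivefold even though both values are below the epidemic threshold.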

Our model is illustrated in Fig. 1. Introductions occur stochastically from the natural reservoir of the pathogen, and each primary case is followed by stochastic transmission that generates a variable number of subsequent infections in the human population. We assume that the number of secondary cases follows a Poisson distribution with a mean equal to R0. Each introduction thus forms branched chains of transmission, which stutter to extinction if R0 < 1 and the pathogen cannot evolve (µ = 0). The probability that the pathogen evolves and a secondary infection is caused by the mutant is equal to µ for each of the secondary infections (we note that µ incorporates not only the pathogen's mutation rate, but also its dynamics within the host and its transmissibility). We use multi-type branching processes6,7,8,9 to describe the initial spread of the infection, incorporating the evolution of the pathogen.
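The stochastic process described in this paragraph can be sketched as a small Monte Carlo simulation. This is an illustrative reading of the stated assumptions (Poisson-distributed secondary cases with mean R0, each secondary infection being a mutant with probability µ), not the authors' simulation code. All function names, the `max_cases` safety cap and the default seed are hypothetical.

```python
import math
import random


def poisson(mean: float, rng: random.Random) -> int:
    """Draw a Poisson variate by Knuth's multiplication method (adequate for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def introduction_evolves(r0: float, mu: float, rng: random.Random,
                         max_cases: int = 100_000) -> bool:
    """Simulate one introduction; True if any secondary infection is a mutant."""
    active = 1  # introduced-strain infections that have not yet transmitted
    cases = 1   # total introduced-strain infections so far (safety cap only)
    while active > 0 and cases < max_cases:
        active -= 1
        for _ in range(poisson(r0, rng)):
            if rng.random() < mu:
                return True  # first infection with the evolved strain
            active += 1
            cases += 1
    return False  # the chain went extinct (or hit the cap) without evolving


def estimate_evolution_probability(r0: float, mu: float,
                                   n_introductions: int = 100_000,
                                   seed: int = 1) -> float:
    """Fraction of simulated introductions that produce at least one mutant infection."""
    rng = random.Random(seed)
    hits = sum(introduction_evolves(r0, mu, rng) for _ in range(n_introductions))
    return hits / n_introductions


if __name__ == "__main__":
    print(estimate_evolution_probability(r0=0.5, mu=0.1, n_introductions=20_000))
```

The simulation stops at the first mutant infection, so it estimates the probability of evolution only. Whether the evolved strain then escapes stochastic extinction is a separate question.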

Figure 1: Schematic for the emergence of an infectious disease.

Introductions from the reservoir are followed by chains of transmission in the human population. Infections with the introduced strain (open circles) have a basic reproductive number R0 < 1. Pathogen evolution generates an evolved strain (filled circles) with R0 > 1. The infections caused by the evolved strain can go on to cause an epidemic. Daggers indicate no further transmission.

In the simplest case (see Fig. 2a) only one mutation is required for the R0 of the evolved pathogen to exceed one. The probability that a single introduction evolves, causing one (or more) infections with the evolved pathogen (a filled circle in Fig. 1) before it goes extinct, depends very strongly on the R0 of the introduced pathogen, particularly for low µ values as R0 approaches 1. This probability is approximately linearly dependent on the rate of evolution µ.
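The dependence on R0 and µ can be checked numerically under the stated Poisson assumption. Let p be the probability that a chain started by one introduced-strain infection never produces a mutant. Splitting the Poisson offspring into mutant (mean µR0) and non-mutant (mean (1 − µ)R0) infections gives p = exp(−µR0 − (1 − µ)R0(1 − p)), and the probability of evolution is 1 − p. The sketch below solves this by fixed-point iteration. The function name and tolerances are hypothetical, and the Supplementary Information may use a different method.

```python
import math


def evolution_probability(r0: float, mu: float,
                          tol: float = 1e-12, max_iter: int = 1_000_000) -> float:
    """Probability that one introduction yields at least one mutant infection."""
    if r0 < 0.0 or not 0.0 <= mu <= 1.0:
        raise ValueError("need r0 >= 0 and 0 <= mu <= 1")
    p = 0.0  # probability that the chain never produces a mutant
    for _ in range(max_iter):
        # No mutant offspring, and every non-mutant offspring's sub-chain is mutant-free.
        p_next = math.exp(-mu * r0 - (1.0 - mu) * r0 * (1.0 - p))
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    return 1.0 - p


if __name__ == "__main__":
    for r0 in (0.5, 0.9, 0.99):
        print(r0, [evolution_probability(r0, mu) for mu in (1e-4, 1e-3, 1e-2)])
```

For small µ the solution is close to µR0/(1 − R0), which shows both the approximately linear dependence on µ and the steep rise as R0 approaches 1.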

Figure 2: One-step evolution. A single change is required for the pathogen to evolve to R0 > 1.

a, The probability that an introduction leads to an infection with an evolved strain of the pathogen (filled circle in Fig. 1) is highly sensitive to R0, and is approximately linearly dependent on the mutation rate µ. Lines correspond to numerical solutions to the branching process model (see Supplementary Information) and symbols correspond to Monte Carlo simulations following 10^5 introductions. b, The probability of emergence per introduction depends on the R0 value of the introduced pathogen and of the evolved pathogen. The solid, dashed and dotted lines correspond to the evolved pathogen having an R0 of 1,000, 1.5 and 1.2, respectively.

The probability that the introduction leads to an epidemic (the ‘probability of emergence’) depends on the probability of evolution and the probability that the evolved infections do not go extinct due to stochastic effects. In Fig. 2b we plot the probability of emergence for three different R0 values of the evolved strain. The probability of emergence approaches the probability of evolution when R0 of the evolved strain is large, and is lower when the R0 of the evolved strain is close to 1. We find that the probability of emergence depends most strongly on the R0 of the introduced pathogen, increases approximately linearly with the mutation rate µ, and depends only modestly on the R0 of the evolved pathogen.
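The two ingredients of the probability of emergence can be combined in a short two-type calculation. The sketch assumes Poisson offspring and that the evolved strain does not mutate further. It first finds the extinction probability of a chain started by one evolved infection, then that of a chain started by one introduced-strain infection. The function names and tolerances are hypothetical, and the parameter values in the usage lines mirror those quoted for Fig. 2b purely for illustration.

```python
import math


def smallest_fixed_point(func, tol: float = 1e-13, max_iter: int = 1_000_000) -> float:
    """Iterate q -> func(q) from q = 0, which converges to the smallest fixed point."""
    q = 0.0
    for _ in range(max_iter):
        q_next = func(q)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


def emergence_probability(r0_intro: float, r0_evolved: float, mu: float) -> float:
    """Probability that one introduction leads to a non-extinct evolved lineage."""
    # Extinction probability of a chain started by one evolved-strain infection.
    q_evolved = smallest_fixed_point(lambda q: math.exp(-r0_evolved * (1.0 - q)))
    # Extinction probability of a chain started by one introduced-strain infection:
    # non-mutant offspring restart the same process, mutant offspring start an evolved chain.
    q_intro = smallest_fixed_point(
        lambda q: math.exp(-(1.0 - mu) * r0_intro * (1.0 - q)
                           - mu * r0_intro * (1.0 - q_evolved))
    )
    return 1.0 - q_intro


if __name__ == "__main__":
    for r0_evolved in (1000.0, 1.5, 1.2):
        print(r0_evolved, emergence_probability(0.5, r0_evolved, 0.1))
```

When the evolved R0 is very large, the evolved lineage almost never goes extinct, so the result approaches the probability of evolution.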

We extend the simple one-step mutation model to consider the situation in which multiple evolutionary changes are required for the pathogen to attain R0 > 1 in the human population. We begin with a simple scenario, which we call the jackpot model, where the R0 of the pathogen with the intermediate mutations is the same as that of the introduced pathogen, and where only the addition of the final mutation results in an increase in R0 to greater than one. As seen in Fig. 3, increasing the number of required evolutionary steps greatly reduces the probability of emergence and increases its sensitivity to changes in R0. The probability of emergence is approximately proportional to the mutation rate to the power of the number of evolutionary steps required (see Supplementary Information).
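The jackpot model can be sketched by chaining the same calculation backwards from the fully evolved type through the n intermediates. The code works with survival probabilities s = 1 − q and uses `math.expm1` so that very small probabilities are not lost to rounding. It assumes Poisson offspring, that mutation moves only one step forward, and that the final type does not mutate. The function names are hypothetical.

```python
import math


def survival_probability(same_rate: float, forward_term: float,
                         tol: float = 1e-12, max_iter: int = 1_000_000) -> float:
    """Largest s in [0, 1] solving s = 1 - exp(-(same_rate * s + forward_term))."""
    s = 1.0
    for _ in range(max_iter):
        s_new = -math.expm1(-(same_rate * s + forward_term))
        if abs(s_new - s) <= tol * s_new:
            return s_new
        s = s_new
    return s


def jackpot_emergence(r0_intro: float, r0_evolved: float, mu: float,
                      n_intermediate: int) -> float:
    """Emergence probability when n intermediates share the introduced strain's R0."""
    # Start from the fully evolved type, which is assumed not to mutate further.
    s = survival_probability(r0_evolved, 0.0)
    # Step back through the n intermediates and finally the introduced strain.
    for _ in range(n_intermediate + 1):
        s = survival_probability((1.0 - mu) * r0_intro, mu * r0_intro * s)
    return s


if __name__ == "__main__":
    for n in range(4):
        print(n, jackpot_emergence(r0_intro=0.5, r0_evolved=1.5, mu=0.1, n_intermediate=n))
```

With n intermediates the pathogen needs n + 1 mutational steps, so for small µ doubling µ should multiply the result by roughly 2^(n+1).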

Figure 3: Multiple-step evolution.

Here multiple evolutionary changes are required for evolution of the pathogen to have an R0 > 1. a, Jackpot model with µ = 0.1 and n intermediate changes each with R0 equal to that of the introduced pathogen: increasing the number of steps (n) greatly decreases the probability of evolution, and makes it more sensitive to the R0 of the introduced pathogen. b, Alternative multi-step models for the one-intermediate (n = 1) case. The jackpot model (solid line), additive model (dashed line) and fitness valley model (dotted line) are shown (see text for details).

The fitness landscape on which evolution occurs is important in determining the outcome10. Figure 3b illustrates this for the case of a single intermediate type. As should be expected, changing the jackpot model to an additive model, where the fitness of the intermediate is the average of the fitness of the introduced strain and fully evolved strain, increases the probability that the pathogen evolves to R0 > 1, whereas changing it to a fitness valley model, where the fitness of the intermediate is lower than the fitness of the introduced strain, decreases the probability of emergence.
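The three landscapes can be compared by giving each type its own R0. The sketch below takes R0 as the measure of fitness, which is an assumption on our part, so the additive intermediate has the mean of the introduced and evolved R0 values. The valley intermediate's R0 of 0.25 and the function name are hypothetical choices for illustration. As in the text, mutation moves one step forward with probability µ per secondary infection.

```python
import math


def emergence_probability(r0_by_type, mu: float,
                          tol: float = 1e-12, max_iter: int = 1_000_000) -> float:
    """Survival probability of a chain started by the first type in r0_by_type."""
    survival_next = None  # nothing lies beyond the final (fully evolved) type
    for r0 in reversed(list(r0_by_type)):
        if survival_next is None:
            same_rate, forward_term = r0, 0.0  # final type: no further mutation
        else:
            same_rate = (1.0 - mu) * r0
            forward_term = mu * r0 * survival_next
        s = 1.0
        for _ in range(max_iter):
            s_new = -math.expm1(-(same_rate * s + forward_term))
            done = abs(s_new - s) <= tol * s_new
            s = s_new
            if done:
                break
        survival_next = s
    return survival_next


if __name__ == "__main__":
    r0_intro, r0_evolved, mu = 0.5, 1.5, 0.1
    landscapes = {
        "jackpot": [r0_intro, r0_intro, r0_evolved],
        "additive": [r0_intro, (r0_intro + r0_evolved) / 2.0, r0_evolved],
        "valley": [r0_intro, 0.25, r0_evolved],  # 0.25 is a hypothetical valley depth
    }
    for name, r0_values in landscapes.items():
        print(name, emergence_probability(r0_values, mu))
```

Under these assumptions the ordering matches the text: the additive landscape gives the highest probability of emergence and the fitness valley the lowest.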

The key characteristic distinguishing our model from the conventional view is the R0 of the introduced pathogen. In the conventional view the R0 of these infections must be greater than one, whereas in the mechanism described here it is less than one, and evolution during the stochastic chains of transmission allows R0 to increase above one. In the case of human infections such as HIV1,11,12, SARS13,14,15,16,17 and (potentially) monkeypox18,19,20,21, seroprevalence studies among groups at high risk of infection from the reservoir would allow evaluation of whether crossover events are usually dead ends, as we expect in our model, or whether they are associated with a large number of secondary cases. In the case of diseases emerging into non-human populations, such as the Nipah virus, which moved from bats to pigs22,23,24, it may be possible to conduct additional tests involving controlled experimental infections to estimate the R0 (in this case of the bat Nipah virus in pigs) and to determine whether it evolves during the course of chains of transmission. Such studies may additionally help to identify pathogens that have an ability to evolve rapidly and thus have a high potential for emergence.

The framework presented here has special relevance for pathogens that have been driven to extinction by vaccination. In the case of smallpox there are probably reservoirs of related zoonoses (such as, but not restricted to, monkeypox) from which smallpox may have originated. Although the R0 of monkeypox in the human population is clearly less than one, there are occasional chains of transmission in the human population18,19,20,21. As the level of herd immunity to smallpox wanes in the absence of continued vaccination, we expect an increase in R0 of infections with monkeypox albeit to a level still less than one (the smallpox vaccine provides about 85% cross immunity against monkeypox). Our results suggest that this increase in the effective R0 of monkeypox in the human population could markedly increase the probability of evolution of monkeypox, allowing it to emerge into a successful human pathogen (which, depending on the evolutionary trajectory followed, may be similar to or differ from smallpox).
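The arithmetic behind this expectation can be sketched with a deliberately simple assumption: that R0 in a partly vaccinated population scales with the fraction of contacts not protected by cross-immunity. Only the 85% cross-immunity figure comes from the text. The scaling rule, the function name and the unvaccinated R0 of 0.8 are hypothetical and are not estimates for monkeypox.

```python
def effective_r0(r0_unvaccinated: float, vaccinated_fraction: float,
                 cross_immunity: float = 0.85) -> float:
    """Effective R0 when a fraction of hosts carries partial cross-immunity."""
    if not 0.0 <= vaccinated_fraction <= 1.0:
        raise ValueError("vaccinated_fraction must lie in [0, 1]")
    if not 0.0 <= cross_immunity <= 1.0:
        raise ValueError("cross_immunity must lie in [0, 1]")
    # Assumed rule: each vaccinated contact is protected with probability cross_immunity.
    return r0_unvaccinated * (1.0 - cross_immunity * vaccinated_fraction)


if __name__ == "__main__":
    r0_unvaccinated = 0.8  # hypothetical value below one, for illustration only
    for coverage in (1.0, 0.75, 0.5, 0.25, 0.0):
        print(f"coverage {coverage:4.2f}: effective R0 = "
              f"{effective_r0(r0_unvaccinated, coverage):.3f}")
```

In this sketch the effective R0 rises several-fold as coverage falls, yet it stays below one. That is the regime in which the probability of evolution is most sensitive to R0.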

The present study could be extended in a number of directions. These include explicitly incorporating the details of ecological interactions such as heterogeneity in the transmission in different areas and subpopulations2,25,26,27 and incorporating genetic diversity of the pathogen in its reservoir. Finally, we note that this framework can be applied to the more general problem of biological invasions28,29.


We describe the dynamics and evolution of emerging diseases as a multi-type branching process with the following probability-generating functions6,7,8,9:
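One form that follows from the assumptions stated in the text (Poisson-distributed secondary cases with mean R0, each secondary infection independently being the next type with probability µ) is sketched here. The indexing of types by i = 1, …, m, the symbols f_i, s_i and R_0^{(i)}, and the assumption that the final type does not mutate further are notational choices made for this sketch. The Supplementary Information remains the authoritative source.

```latex
% Type i < m: Poisson thinning splits the offspring into independent
% same-type (mean (1-\mu) R_0^{(i)}) and next-type (mean \mu R_0^{(i)}) counts.
f_i(s_1,\dots,s_m) = \exp\!\left[-(1-\mu)\,R_0^{(i)}\,(1-s_i)\right]
                     \exp\!\left[-\mu\,R_0^{(i)}\,(1-s_{i+1})\right],
\qquad i = 1,\dots,m-1,

% Final (fully evolved) type m, assumed not to mutate further.
f_m(s_1,\dots,s_m) = \exp\!\left[-R_0^{(m)}\,(1-s_m)\right].

% Extinction probabilities q_i are the smallest solution in [0,1]^m of
q_i = f_i(q_1,\dots,q_m), \qquad i = 1,\dots,m,
% and the probability of emergence per introduction is 1 - q_1.
```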

We calculated the extinction probabilities of the above process numerically. For details and definitions see Supplementary Information.