The approximately universal shapes of epidemic curves in the Susceptible-Exposed-Infectious-Recovered (SEIR) model

Compartmental transmission models have become an invaluable tool to study the dynamics of infectious diseases. The Susceptible-Infectious-Recovered (SIR) model is known to have an exact semi-analytical solution. In the current study, the approach of Harko et al. (Appl. Math. Comput. 236:184–194, 2014) is generalised to obtain an approximate semi-analytical solution of the Susceptible-Exposed-Infectious-Recovered (SEIR) model. The SEIR model curves have nearly the same shapes as the SIR ones, but with a stretch factor applied to them across time that is related to the ratio of the incubation to infectious periods. This finding implies an approximate characteristic timescale, scaled by this stretch factor, that is universal to all SEIR models, which only depends on the basic reproduction number and initial fraction of the population that is infectious.

Compartmental models provide a key tool in infectious disease epidemiology for studying the transmission dynamics of various pathogens [1–3]. The Susceptible-Infectious-Recovered (SIR) model is known to have an exact semi-analytical solution [4–6]. No such solution exists for the Susceptible-Exposed-Infectious-Recovered (SEIR) model, although some of its properties have been examined using an approximate analytical approach [7]. In the current study, the approach of [5] is generalised to demonstrate that, while no exact semi-analytical solution is possible, an approximate one does exist.
It will be demonstrated that this approximate solution of the SEIR model implies that the curves of all SEIR models are simply stretched or compressed relative to one another by the factor α ≡ (1 + γ/σ)⁻¹, where the incubation period is 1/σ, the infectious period is 1/γ and the generation time is 1/σ + 1/γ. The SIR model is a special case with α = 1. This property implies that the time taken for the infectious curve to peak is approximately universal for the SEIR model when scaled by α.
In "The SIR model" section, the SIR model is concisely reviewed and extended. In "The SEIR model" section, approximate solutions of the SEIR model and their implications are elucidated. A concise summary is provided in "Summary" section.

The SIR model
In the SIR model, the fraction of the population that is susceptible (S) becomes infected at a rate β = ℛ₀γ, where ℛ₀ is the basic reproduction number. There is no incubation period. The fraction of the population that is infected is immediately infectious (I) for a period of 1/γ, after which it joins the fraction of the population that has recovered (R). The SIR model is described by the following set of coupled ordinary differential equations [1,5],
dS/dt = −βSI, dI/dt = βSI − γI, dR/dt = γI. (2)
Review of Harko et al. [5]. As a starting point, the derivation of [5] is made more compact and cast in the mathematical notation of the current study. Taking the derivative of the first equation of (2) with respect to time yields equation (12) of [5], written with the shorthand notation S′ and S′′ for the first and second derivatives with respect to time. From Eq. (5), one obtains an expression that is equivalent, but not identical, to equation (24) of [5], because S (and not S/S₀) is used as the independent variable. This expression is recognised as a Bernoulli differential equation for φ(S), which may be solved to obtain an expression that is equivalent, but not identical, to equation (25) of [5], where the initial value of S is denoted as S₀. The constant of integration is set by demanding that S + I + R = 1.
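The chain of expressions referred to in this derivation can be reconstructed from the SIR equations themselves. The block below is a sketch rather than a transcription of [5] or of the original displays: it assumes the substitution φ ≡ dt/dS = 1/S′, writes φ₀ for the initial value of φ, and takes the initial recovered fraction to be zero, so the forms shown may differ from the published ones by simple rearrangements.

```latex
\begin{align}
% Use S'/S = -\beta I and differentiate once more with respect to time
\frac{S''}{S} - \left(\frac{S'}{S}\right)^{2} + \gamma\,\frac{S'}{S} - \beta S' &= 0 , \\
% Assumed substitution \phi \equiv dt/dS = 1/S', so that S'' = -\phi^{-3}\, d\phi/dS
\frac{d\phi}{dS} + \frac{\phi}{S} &= \left(\gamma - \beta S\right)\phi^{2} , \\
% Bernoulli equation: with u = 1/\phi one has d(u/S)/dS = \beta - \gamma/S
\frac{1}{\phi} &= S\left[\frac{1}{\phi_{0} S_{0}} + \beta\,(S - S_{0}) - \gamma \ln\frac{S}{S_{0}}\right] , \\
% Insert 1/\phi = S' = -\beta S I and S_0 + I_0 = 1
I &= 1 - S + \frac{1}{\mathcal{R}_{0}} \ln\frac{S}{S_{0}} .
\end{align}
```

The last line gives I as a function of S alone, which is what allows the time t to be written as a single integral over S.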
Recalling the definition of φ, an expression for t as an integral over S follows, Eq. (9), which is equivalent to equation (26) of [5] and in which t₀ is the initial time. The preceding integral has no exact analytical (closed-form) solution and needs to be evaluated numerically, which is why it is, strictly speaking, an exact semi-analytical solution of the SIR model. The first and third equations of (2) may be combined to obtain R in terms of S, where the initial fraction of the population that has recovered is chosen to be R₀ = 0, which in turn implies that the initial fraction of the population that is infectious is I₀ = 1 − S₀.
Extension of Harko et al. [5]. By setting I′ = 0 in Eq. (2), one realizes that the infectious curve I peaks at S = γ/β = 1/ℛ₀. Thus, Eq. (9) may be used to express the time taken for I to peak, Δt, as Eq. (11), where one assumes I₀ ≪ 1. The quantity γΔt is the time interval expressed in terms of the infectious period and depends only on two parameters: ℛ₀ and I₀. Variations in I₀ shift the S, I and R curves back and forth in time without changing their shapes. We emphasize a subtle choice of notation: R₀ is the initial fraction of the population that has recovered (and is always set to zero in the current study), whereas ℛ₀ is the basic reproduction number.
www.nature.com/scientificreports/
When the infectious curve I first starts to rise from its initial value, the logarithm term in the integral of Eq. (9) may be approximated as ln(S/S₀) ≈ S/S₀ − 1, which allows the integral to be evaluated analytically. The result is expressed in terms of the epidemic growth rate Λ, from which one obtains the known relationship between the basic reproduction number and the growth rate [1,8], Eq. (14), where D ≡ 1/γ is the infectious period.
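As a numerical illustration of the semi-analytical solution, the sketch below computes γΔt in two independent ways for ℛ₀ = 2 and I₀ = 10⁻⁴. It assumes that the peak-time integral takes the form γΔt = ∫ dS / {S [ℛ₀(1 − S) + ln(S/S₀)]} between S = 1/ℛ₀ and S₀ = 1 − I₀, which follows from dS/dt = −βSI with I = 1 − S + ln(S/S₀)/ℛ₀. The function names, the midpoint rule and the hand-written Runge-Kutta stepper are illustrative choices and are not taken from the original study.

```python
import math


def sir_peak_time_quadrature(R0, I0, n=400_000):
    """gamma*dt from the integral over S between 1/R0 and S0 = 1 - I0 (midpoint rule)."""
    S0 = 1.0 - I0
    lower = 1.0 / R0
    h = (S0 - lower) / n
    total = 0.0
    for k in range(n):
        S = lower + (k + 0.5) * h
        total += h / (S * (R0 * (1.0 - S) + math.log(S / S0)))
    return total


def sir_peak_time_ode(R0, I0, h=1e-3):
    """gamma*dt from RK4 integration of the SIR equations, with time in units of 1/gamma."""
    def rhs(S, I):
        return -R0 * S * I, R0 * S * I - I

    S, I, tau = 1.0 - I0, I0, 0.0
    # I peaks when S crosses 1/R0; this assumes R0 * S0 > 1 so the loop runs at least once
    while S > 1.0 / R0:
        S_prev, tau_prev = S, tau
        k1 = rhs(S, I)
        k2 = rhs(S + 0.5 * h * k1[0], I + 0.5 * h * k1[1])
        k3 = rhs(S + 0.5 * h * k2[0], I + 0.5 * h * k2[1])
        k4 = rhs(S + h * k3[0], I + h * k3[1])
        S += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        I += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        tau += h
    # Linear interpolation for the crossing of S = 1/R0 within the last step
    frac = (S_prev - 1.0 / R0) / (S_prev - S)
    return tau_prev + frac * h


dt_quad = sir_peak_time_quadrature(2.0, 1e-4)
dt_ode = sir_peak_time_ode(2.0, 1e-4)
print(f"gamma*dt (integral) = {dt_quad:.4f}, gamma*dt (ODE) = {dt_ode:.4f}")
```

Both routes depend only on ℛ₀ and I₀, since γ has been absorbed into the unit of time.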

The SEIR model
Seeking an approximate semi-analytical solution. The SEIR model builds on the SIR model by considering an additional compartment for the fraction of the population that is exposed (E): infected but not yet infectious. The incubation period is 1/σ. The SEIR model is described by the following set of coupled ordinary differential equations [1],
dS/dt = −βSI, dE/dt = βSI − σE, dI/dt = σE − γI, dR/dt = γI. (15)
Since this set of equations does not consider births or deaths, we have S + E + I + R = 1.
The first and fourth equations may be combined to obtain R in terms of S, which is identical to the SIR model. Again, the choice of R₀ = 0 is made with no loss of generality. By combining all four equations, one obtains Eq. (17), which contains the third time derivative R′′′. The approximation is taken that the rate of change of the acceleration of R is vanishingly small, R′′′ ≈ 0. Retaining the R′′′ term in Eq. (17) would lead to a second-order, non-linear ordinary differential equation of φ(S) with no known analytical solution.
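The effect of this approximation can be made explicit with a short worked derivation. It is a reconstruction under stated assumptions, not a copy of the original displays: primes denote time derivatives, the stretch factor is taken as 1/α = 1 + γ/σ (equivalently 1 + D′/D), and R = −ln(S/S₀)/ℛ₀ is used with zero initial recovered fraction.

```latex
\begin{align}
% I = R'/\gamma, then \sigma E = I' + \gamma I, then -S' = E' + \sigma E
S' + R' + \left(\frac{1}{\sigma} + \frac{1}{\gamma}\right) R'' + \frac{R'''}{\sigma\gamma} &= 0 , \\
% Drop R''', use 1/\sigma + 1/\gamma = 1/(\alpha\gamma) and R = -\mathcal{R}_0^{-1} \ln(S/S_0)
\frac{S''}{S} - \left(\frac{S'}{S}\right)^{2} + \alpha\gamma\,\frac{S'}{S} - \alpha\beta\, S' &= 0 .
\end{align}
```

The second line has the same form as the corresponding SIR equation, with β and γ replaced by αβ and αγ. Under the approximation, the SEIR dynamics of S are therefore those of an SIR model whose clock runs slower by the factor α.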
Solving for φ as in the "Review of Harko et al. [5]" section yields an expression in which φ₀ is the initial value of φ. The preceding expression leads to an expression for I, in terms of S, with a yet unspecified constant of integration (φ₀). Let the initial fraction of the population that is exposed be E₀. Demanding that S₀ + E₀ + I₀ + R₀ = 1 fixes this constant, and expressions for E and I, in terms of S, are obtained. Finally, S can be expressed in terms of t via an integral, Eq. (26). Since I₀ ≪ 1, the time taken for I to peak follows as Eq. (27). The preceding expression is identical to Eq. (11) of the SIR model, except for the extra factor of α. It should be noted that the upper limit of the integral (1/ℛ₀) assumes the approximation I′ = E′ = 0. However, Eq. (27) is not used to compute the peak times in Fig. 2. Its only purpose is to demonstrate that one may factor out αγ from the integral. Stating the upper limit of the integral in Eq. (27) more accurately does not alter the main conclusion of the current study.
The relationship between the growth rate and the basic reproduction number can again be derived. Using the same series expansion of the logarithm term in the integral of Eq. (26), one obtains an expression of the same form as for the SIR model, albeit with a different definition of the growth rate Λ. It follows that ℛ₀ may be expressed in terms of Λ, Eq. (30), where D′ ≡ 1/σ is the incubation period. When α = 1, the expression for the SIR model in Eq. (14) is recovered. If S₀ ≈ 1 and I₀ ≪ 1, then one obtains ℛ₀ ≈ 1 + Λ(D + D′).
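The size of the error in ℛ₀ ≈ 1 + Λ(D + D′) can be gauged with a small worked example. The sketch assumes S₀ ≈ 1 and takes Λ to be the positive eigenvalue of the linearised SEIR equations, which satisfies (1 + ΛD′)(1 + ΛD) = ℛ₀. The function name and the parameter values (ℛ₀ = 2, D′ = 3 days, D = 4 days) are illustrative only.

```python
import math


def seir_growth_rate(R0, D_inc, D_inf):
    """Positive root of (1 + L*D_inc) * (1 + L*D_inf) = R0 (linearised SEIR, S0 ~ 1)."""
    a = D_inc * D_inf
    b = D_inc + D_inf
    c = 1.0 - R0
    return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)


R0, D_inc, D_inf = 2.0, 3.0, 4.0                 # illustrative values, periods in days
growth = seir_growth_rate(R0, D_inc, D_inf)      # growth rate per day
R0_approx = 1.0 + growth * (D_inc + D_inf)       # approximate relationship
missing_term = growth**2 * D_inc * D_inf         # quadratic term absent from the approximation
print(f"growth rate = {growth:.4f} per day")
print(f"approximate R0 = {R0_approx:.4f}, plus quadratic term = {R0_approx + missing_term:.4f}")
```

For a fixed growth rate, the approximate relationship underestimates ℛ₀ by exactly the quadratic term, which becomes negligible when Λ is small compared with 1/D and 1/D′.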
The exact relationship between the growth rate and ℛ₀ has been derived in various ways [8] (and references therein) and is given by ℛ₀ = (1 + ΛD′)(1 + ΛD). This equation accounts for the characteristic generation time distribution of SEIR models, which is a convolution of the exponentially distributed incubation and infectious periods with mean durations of D′ and D, respectively. The approximate solution of Eq. (30) lacks the term Λ²D′D. Hence, it corresponds to the case of an exponentially distributed generation time with mean duration D′ + D, which is the same as the solution for the SIR model assuming an infectious period of D′ + D.
Implications. Equation (27) has non-trivial implications. It suggests that the susceptible, exposed, infectious and recovered curves of SEIR models, with different values of D′ and D, follow approximately universal shapes that are stretched by a factor of 1/α = 1 + D′/D relative to one another. To demonstrate this property, the full set of coupled equations in (15) is solved numerically using the solve_ivp routine of the Python programming language suite [9]. For illustration, one assumes ℛ₀ = 2 and I₀ = 10⁻⁴. Figure 1 shows the solution curves of 100 SEIR models, where the values of the incubation (D′ ≡ 1/σ) and infectious (D ≡ 1/γ) periods are randomly drawn from an interval between 2 and 5 days. When time is scaled by the factor αγ, the 100 susceptible, exposed, infectious and recovered curves lie approximately on top of one another.
The second implication is that the time taken for the infectious curve to peak is approximately universal for all SEIR models when scaled by α and expressed in terms of the infectious period. In other words, αγΔt should only depend on ℛ₀ and I₀. To demonstrate this property, the full set of equations in (15) is again solved numerically for 10,000 random draws of 1/σ and 1/γ and for ℛ₀ = 2 to 7. For each SEIR model, the time taken for the infectious curve to peak (Δt) is calculated numerically. All 10,000 values of Δt are multiplied by αγ; two sets of curves with different I₀ values are shown in Fig. 2 for illustration. For all 10,000 SEIR models, the αγΔt values lie approximately on the same curve across ℛ₀ for a given value of I₀, demonstrating that αγΔt is a dimensionless (with no physical units), approximately universal timescale of the SEIR model.
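A reduced version of this numerical experiment can be sketched as follows. It is not the code of the original study: the helper names, the choice E₀ = 0, the sample of 20 draws, the random seed, the integration window and the solver tolerances are all assumptions made for illustration. Only the use of `solve_ivp`, ℛ₀ = 2, I₀ = 10⁻⁴ and the 2 to 5 day interval come from the text.

```python
import numpy as np
from scipy.integrate import solve_ivp


def seir_rhs(t, y, beta, sigma, gamma):
    """Right-hand side of the SEIR equations for the fractions (S, E, I, R)."""
    S, E, I, R = y
    return [-beta * S * I,
            beta * S * I - sigma * E,
            sigma * E - gamma * I,
            gamma * I]


def scaled_peak_time(R0, D_inc, D_inf, I0=1e-4):
    """Return alpha*gamma*dt, where dt is the time at which I(t) is largest."""
    sigma, gamma = 1.0 / D_inc, 1.0 / D_inf
    beta = R0 * gamma
    alpha = 1.0 / (1.0 + gamma / sigma)      # stretch factor, 1/alpha = 1 + D_inc/D_inf
    t_end = 40.0 * (D_inc + D_inf)           # assumed long enough to contain the peak
    t_eval = np.linspace(0.0, t_end, 8001)
    y0 = [1.0 - I0, 0.0, I0, 0.0]            # assumption: nobody exposed or recovered at t = 0
    sol = solve_ivp(lambda t, y: seir_rhs(t, y, beta, sigma, gamma),
                    (0.0, t_end), y0, t_eval=t_eval, rtol=1e-8, atol=1e-12)
    dt = sol.t[np.argmax(sol.y[2])]          # grid time at which I is largest
    return alpha * gamma * dt


rng = np.random.default_rng(0)
draws = rng.uniform(2.0, 5.0, size=(20, 2))  # (incubation, infectious) periods in days
scaled = np.array([scaled_peak_time(2.0, d_inc, d_inf) for d_inc, d_inf in draws])
spread = (scaled.max() - scaled.min()) / scaled.mean()
print(f"alpha*gamma*dt: mean = {scaled.mean():.3f}, relative spread = {spread:.3f}")
```

A small relative spread across the draws is the sense in which αγΔt behaves as an approximately universal timescale. Repeating the loop over several values of ℛ₀ and I₀ would produce the kind of comparison described for Fig. 2.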

Summary
In the current study, approximate semi-analytical solutions of the SEIR model are found by generalising a previous approach for deriving an exact solution of the SIR model. This finding implies that the entire family of susceptible, exposed, infectious and recovered curves of the SEIR model follows approximately universal shapes that are stretched or compressed, relative to one another, by a factor that depends on the incubation and infectious periods. The time taken for the infectious curve to peak is the characteristic timescale of the system and depends only on the basic reproduction number and the initial fraction of the population that is infectious when scaled by the infectious period and this stretch factor.
Figure 1. Solution curves of the SEIR model. For illustration, the basic reproduction number has been set to ℛ₀ = 2 and the initial fraction of the population that is infectious has been set to I₀ = 10⁻⁴. Each set of curves is generated using 100 random realisations of the incubation and infectious periods, each drawn from an interval between 2 and 5 days.
Figure 2. Time until the infectious curve (I) peaks as a function of the basic reproduction number ℛ₀. In the SEIR model, the time to the epidemic peak (Δt) scales approximately with α and γ. For illustration, two values of the initial fraction of the population that is infectious (I₀) are considered. Each set of curves is generated using 10,000 random draws of the incubation and infectious periods from an interval between 2 and 5 days.