Abstract
The COVID-19 epidemic brought to the forefront the value of mathematical modelling for infectious diseases as a guide to help manage a formidable challenge for human health. A standard dynamic model widely used for a spreading epidemic separates a population into compartments (each comprising individuals at a similar stage before, during, or after infection) and keeps track of the population fraction in each compartment over time, by balancing compartment loading, discharge, and accumulation rates. The standard model provides valuable insight into when an epidemic spreads or what fraction of a population will have been infected by the epidemic’s end. A subtle issue with that model, however, is that it may misrepresent the peak of the infectious fraction of a population, the time to reach that peak, or the rate at which an epidemic spreads. This may compromise the model’s usability for tasks such as “Flattening the Curve” or other interventions for epidemic management. Here we develop an extension of the standard model’s structure, which retains the simplicity and insights of the standard model while avoiding the misrepresentation issues mentioned above. The proposed model relies on replacing a module of the standard model by a module resulting from Padé approximation in the Laplace domain. The Padé-approximation module would also be suitable for incorporation in the wide array of standard model variants used in epidemiology. This warrants a re-examination of the subject and could potentially impact model-based management of epidemics, development of software tools for practicing epidemiologists, and related educational resources.
Introduction
The global epidemic of COVID-19 has brought to the forefront the importance of mathematical modelling in the development of strategies for managing the spread of infectious diseases^{1,2,3,4,5,6,7}. Terms such as flattening the curve, \({R}_{0}\), or herd immunity, which entered public discourse^{8}, emerge from mathematical models that purport to provide useful predictions and thus to help guide effective management strategies^{9}. A basic class of such models separates a population into various compartments (each comprising individuals at a similar stage before, during, or after infection) and keeps track of the population fraction in each compartment over time, by balancing loading, discharge, and accumulation rates. The archetype for this modelling approach is the celebrated SIR model structure^{10,11,12,13,14,15,16,17}, which splits a population into three compartments: susceptible (S) to the infection, infectious (I), and the rest (R), being immune or removed from the infectious compartment by recovery or death. The dynamics of how individuals move from S to I to R were developed almost a century ago in a mathematical modelling tour de force by Kermack and McKendrick^{18}, who derived a general, if elaborate, model structure in Eqs. (11)–(15) of their landmark paper. In the same publication (Eq. (29) ibid.) these authors also presented a well-characterized special case in the form of the following three simple ordinary differential equations (ODEs) comprising the widely used standard SIR model:
\[{s}^{\prime}\left(t\right)=-\beta s\left(t\right)i\left(t\right) \tag{1}\]
\[{i}^{\prime}\left(t\right)=\beta s\left(t\right)i\left(t\right)-\gamma i\left(t\right) \tag{2}\]
\[{r}^{\prime}\left(t\right)=\gamma i\left(t\right) \tag{3}\]
where \(s,i,r\) are the susceptible, infectious, and removed fractions of a fixed-size population, respectively; \(\beta , \gamma\) are infectivity and discharge constants, respectively; and each of the Eqs. (1)–(3) can be derived from the remaining two using the compatibility condition
\[s\left(t\right)+i\left(t\right)+r\left(t\right)=1 \tag{4}\]
The great value of the SIR model is not merely that it can fit data (as already shown by Kermack and McKendrick in the same publication) but that it can also provide two deep and insightful conclusions about the dynamics governing the course of infectious disease epidemics. The first conclusion concerns the Threshold Theorem:
…there exists a critical or threshold density of population. If the actual population density be equal to (or below) this threshold value the introduction of one (or more) infected person does not give rise to an epidemic, whereas if the population be only slightly more dense a small epidemic occurs (ibid., p. 701).
The second conclusion concerns the long-term behavior of \(s, i, r\) at the asymptotic end of an epidemic:
… the course of an epidemic is not necessarily terminated by the exhaustion of the susceptible members of the community. … the termination of an epidemic may result from a particular relation between the population density, and the infectivity, recovery, and death rates. (ibid., pp. 701, 702, and Eq. (20))
These conclusions are fairly robust, whether the general or the simplified version (Eqs. (1)–(3)) of the Kermack-McKendrick model is considered^{15,18}. In fact, it immediately follows from stability analysis of Eqs. (1) and (2) that the threshold value for \(s\) implied by the SIR model is
\[s=\frac{\gamma }{\beta }\stackrel{\scriptscriptstyle\mathrm{def}}=\frac{1}{{R}_{0}} \tag{5}\]
where \({R}_{0}\) (introduced as such later^{15,19,20}) is the basic reproductive ratio, widely considered “one of the most critical epidemiological parameters”^{11,21}. It also follows^{18} from Eqs. (1)–(3) that the total fraction of individuals infected throughout an epidemic, \(r\left(\infty \right)\), is the real solution of the transcendental algebraic equation
as depicted in Fig. 1. That figure shows the rapid escalation of \(r\left(\infty \right)\) as \({R}_{0}\) rises above 1, given an initially susceptible population. (In fact, making time dimensionless as \(\eta \stackrel{\scriptscriptstyle\mathrm{def}}=\gamma t\) immediately transforms Eqs. (1) and (2) to \({s}^{{\prime}}\left(\eta \right)=-{R}_{0}s\left(\eta \right)i\left(\eta \right), {i}^{{\prime}}\left(\eta \right)=\left({R}_{0}s\left(\eta \right)-1\right)i\left(\eta \right)\), whose only parameter is \({R}_{0}\).)
The above two quantitative predictions by Eqs. (5) and (6) lend exceptional value to the SIR model, both conceptually and computationally. For instance, they can be used to assess herd immunity^{11} for a population, corresponding to an estimated value of \({R}_{0}\) achieved by non-pharmaceutical or pharmaceutical interventions^{22}. Or, conversely, for an epidemic that has run its course or is still developing, data can be used to gauge an overall or temporary value of \({R}_{0}\)^{21}.
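The two predictions can be checked numerically. The following is a minimal Python sketch, not the paper's code: it integrates the dimensionless SIR equations with a classical Runge-Kutta scheme and compares the simulated final removed fraction with the root of the final-size relation. The function names, the initial infectious fraction `i0`, the step size `h`, and the specific form \(r(\infty)=1-\exp(-R_0 r(\infty))\) (the standard final-size relation for \(s(0)\approx 1\), \(r(0)=0\)) are assumptions made for this illustration.

```python
import math

def sir_rhs(s, i, r0):
    """Dimensionless SIR right-hand side (time in units of 1/gamma)."""
    return -r0 * s * i, (r0 * s - 1.0) * i

def integrate_sir(r0, i0=1e-6, eta_end=200.0, h=0.01):
    """Integrate with classical 4th-order Runge-Kutta; return final (s, i)."""
    s, i = 1.0 - i0, i0
    for _ in range(int(eta_end / h)):
        k1 = sir_rhs(s, i, r0)
        k2 = sir_rhs(s + 0.5 * h * k1[0], i + 0.5 * h * k1[1], r0)
        k3 = sir_rhs(s + 0.5 * h * k2[0], i + 0.5 * h * k2[1], r0)
        k4 = sir_rhs(s + h * k3[0], i + h * k3[1], r0)
        s += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        i += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return s, i

def final_size(r0, tol=1e-12):
    """Solve r = 1 - exp(-r0 * r) by fixed-point iteration (assumes s(0) ~ 1)."""
    r = 0.5
    r_new = r
    for _ in range(10000):
        r_new = 1.0 - math.exp(-r0 * r)
        if abs(r_new - r) < tol:
            break
        r = r_new
    return r_new

if __name__ == "__main__":
    for r0 in (0.9, 1.5, 2.0, 3.0):
        s_end, i_end = integrate_sir(r0)
        print(f"R0={r0}: simulated r(inf)={1.0 - s_end - i_end:.4f}, "
              f"final-size root={final_size(r0):.4f}")
```

For \(R_0\) below 1 both numbers stay near zero (no outbreak), while for \(R_0\) above 1 they agree on a rapidly growing final fraction, which is the behavior summarized by the threshold and final-size results.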
However, as we will substantiate in the next section, there are another two important quantitative predictions of the standard SIR model that, we argue, can be problematic (see Fig. 1 for visualization):

(a) The peak value, \({i}^{*}\), of the infectious fraction, \(i\), which may be misrepresented by as much as a factor of about 2, and
(b) the exponential growth rate of the infectious fraction, \(i\), which is also misrepresented by as much as a factor of about 2, with corresponding misrepresentation of the time to peak, \({t}^{*}\).
The above two shortcomings are not confined to the standard SIR model but, as we elaborate in the next section, are far more pervasive and reach a wide area in compartment-based epidemiology modeling spanned by SIR variants.
To start with, good prediction of both the infectious peak, \({i}^{*}\), and the time to that peak, \({t}^{*}\), is of paramount importance when considering management strategies for an epidemic. This is because \({i}^{*}\) and \({t}^{*}\) significantly affect the resources needed for care of infectious patients. To wit, calls for Flattening the Curve^{22} during the COVID-19 epidemic aimed precisely at lowering \({i}^{*}\) and thus averting the overwhelming of medical care resources^{23}.
In addition, good estimates of \({R}_{0}\) from data of exponential growth during the spread of an epidemic are critical for assessing the situation and for designing effective interventions^{9,21}.
Furthermore, and more importantly, to the extent that predictions of the infectious peak and time-to-peak by the standard SIR model may be problematic, the problem is not confined to the I compartment of the standard SIR model. Rather, it may be endemic (no pun intended) in the numerous possible variants of compartment-based epidemiology models with loading and discharge terms. Such variants include a variety of compartments with corresponding arrangements and interactions (e.g., SEIR, SI, SIS, or similar^{24}), multiple subpopulations (e.g., of different age and/or social contact structure^{11,16,17,25}), spatial variation in addition to temporal (entailing partial differential equations^{11}), and any combinations thereof, which collectively lead to diverse stratification patterns^{26}. In the voluminous literature dealing with such models, the discharge rate from a compartment is virtually always represented by a term similar to the term \(\gamma i(t)\) of Eq. (2)^{27}. In fact, this practice is so widespread in the entire literature of epidemiology^{11,15,28,29,30} that it is selected, perhaps uncritically, even in advanced modeling efforts which employ sophisticated tools (e.g., automated algorithmic discovery^{31}) in attempts to uncover more realistic expressions for infection dynamics. It is plausible, therefore, to claim that peak and time-to-peak predictions for a related compartment in any of these models may be as problematic as the corresponding predictions of the simple SIR model, with similarly adverse consequences.
Of course, for more accurate predictions, one could forgo the simplifying assumptions leading to Eqs. (1)–(3) and their variants, in favor of the general time-varying integro-differential equation patterns introduced by Kermack and McKendrick^{12,29,32}. This, however, would significantly increase complexity of analysis and use^{32}, which partly explains the popularity and underscores the importance of simplified models such as SIR.
Consequently, a natural question arises: Given the aforementioned shortcomings of the standard SIR model, is there a mathematical model of comparable simplicity to Eqs. (1)–(3) that retains the two sound conclusions about the epidemic threshold (Eq. (5)) and long-term epidemic course (Eq. (6)) while avoiding the two issues mentioned above, namely misrepresentation of the infectious peak, \({i}^{*},\) and time to that peak, \({t}^{*}\)?
Here, we constructively answer this question in the positive. Using a combination of Laplace transforms and Padé approximations to describe compartment discharge dynamics, we develop in the “The Padé SIR model structure” section a model structure (Eqs. (12) and (13)) of similar simplicity to SIR. The Padé SIR structure produces the exact same threshold and long-term values (Eqs. (5) and (6)) as Eqs. (1)–(3), while predicting more realistic infectious peak and time to peak for a wide range of practically significant cases. More importantly, because the proposed structure relies on replacement of the discharge term \(\gamma i(t)\) in the I module of the SIR Eq. (2) without increasing complexity, it can be used widely in the large array of compartment-based epidemiology models to realistically represent the dynamics of compartment discharge. This immediately prompts a re-evaluation and possible revision of the wide literature on compartment-based modeling in epidemiology inspired by the SIR model. It is emphasized that the preceding prompt is not motivated by a mere intent for higher accuracy; rather, the aim is to offer higher utility, in the spirit of George Box’s dictum “all models are wrong, but some are useful”^{33}.
In the rest of the paper, we first present the significant merits and subtle issues of the SIR structure. Subsequently, we offer a remedy to these shortcomings, in the form of a new class of SIR variants (the Padé SIR model structure) whose properties and implications we explore for epidemiology modeling and epidemic management. Discussion and extensions follow, pointing to the usability of the proposed modeling approach and its applicability to the wide class of compartment-based epidemiology models.
Methods
To provide context and intuition for the developments that follow, we will rely on the basic schematic of Fig. 2. Directly inspired by the original Kermack-McKendrick ideas, Fig. 2 shows how the stacked fractions \(s,i,r\) of a fixed-size population change during an epidemic, as individuals move from compartment S to I to R over time. Infectious individuals in the I compartment are discharged (to enter the R compartment) at times \(T\ge 0\) after becoming infectious, where the infectious period \(T\) follows a cumulative distribution and corresponding density^{18,29} defined in the standard way as \(\mathcal{F}\left(\theta \right)\stackrel{\scriptscriptstyle\mathrm{def}}=P\left[T\le \theta \right]\) and \(\mathcal{f}\left(\theta \right)={\mathcal{F}}^{{\prime}}\left(\theta \right)\).
Simple balances around the boxed areas in Fig. 2 for a time-invariant cumulative distribution \(\mathcal{F}\left(\theta \right)\) (APPENDIX A) yield the equation
which, combined with the infectivity Eq. (1) and the consistency Eq. (4) forms a general representation of the SIR system dynamics^{29,32}.
Selecting \(\mathcal{F}\) in Eq. (7) to be the cumulative distribution function of the exponential distribution, \(\mathcal{F}\left(\theta \right)=1-\mathrm{exp}\left(-\gamma \theta \right)\stackrel{\scriptscriptstyle\mathrm{def}}=F\left(\gamma \theta \right)\), one immediately gets the SIR model, Eqs. (1)–(3) (APPENDIX A). For that model, the parameter \(\gamma\) is the inverse of
the average infectious period … estimated relatively precisely from epidemiological data^{11}.
As will be detailed in what follows, it is at this point where issues with peak and time-to-peak misrepresentations by the SIR model may originate:
Heeding the above suggestion to use epidemiological data for direct estimation of the average, \(1/\gamma\), of the infectious period, \(T\), is indeed sensible (in fact, necessary for a reasonable estimate). However, the associated distribution of \(T\) is typically far from exponential (because an exponential distribution would suggest, inter alia, that most infectious individuals leave the I compartment in zero time, an untenable assumption). Rather, \(T\) follows distributions with peak not near zero^{34} as shown in Fig. 3 by the curves indexed by \(n\gg 1\). The exact shape of these curves is not important; rather, these curves serve as examples of distributions \(f\left(\gamma \theta \right)\) with peak not near 0.
The assumption of exponential distribution for the infectious period “has appeared in many epidemic models but has seldom been questioned”^{27}, yet it would be conveniently acceptable if it did not lead to inadvertent outcomes. Unfortunately it does, in the following subtle yet important way: While the same threshold and long-term values (Eqs. (5) and (6), respectively) would result from Eqs. (1), (7), and (4) for practically any reasonable distribution of \(T\) with the same average, \(D\stackrel{\scriptscriptstyle\mathrm{def}}=1/\gamma\) (an insight already provided by Kermack and McKendrick^{18}), the estimated infectious peak and time to peak would be significantly affected by the kind of distribution considered, as demonstrated in Fig. 4. This figure shows the profiles of the infectious fraction, \(i\left(t\right)\), for the infectious period distributions in Fig. 3, with time in both dimensional and dimensionless form. The latter is in terms of dimensionless time \(t/D\), because this simple transformation makes the dynamics of all considered models dependent on \({R}_{0}\) alone and allows for meaningful comparisons without loss of generality. The dimensional time is in days, to provide some context for epidemics such as COVID-19 with related values^{34,35,36}
What is remarkable in Fig. 4 is that while different distributions of \(T\) sharing the same average, \(D\stackrel{\scriptscriptstyle\mathrm{def}}=1/\gamma\), yield different profiles of \(i(t)\), as expected^{27}, these profiles quickly approach the profile corresponding to the unit-impulse distribution shown in Fig. 3. For that distribution of \(T\), it immediately follows from Eqs. (1), (7), and (4) (APPENDIX A) that the resulting dynamic model, which we will term dSIR, comprises the delay differential equation (DDE)
and the delay algebraic equations
with \(D\stackrel{\scriptscriptstyle\mathrm{def}}=\frac{1}{\gamma }\). Therefore, the dSIR model of Eqs. (9)–(11) constitutes a more realistic representation of spreading epidemic dynamics than the standard SIR model.
Delay differential equations (DDEs) such as the above have been a classic subject of study in biology^{37,38}. DDEs are generally perceived as more difficult to analyze than ODEs^{32, p. 5}, perhaps because of infinite spectra (for linear DDEs) or discontinuities in the derivatives of DDE solutions, although the corresponding theory for DDEs such as the above “does not present substantial additional difficulties” compared to ODEs^{39, p. 6}. Nevertheless, even though Eqs. (9)–(11) have long been known^{29}, they are typically bypassed in favor of their ODE counterparts, Eqs. (1)–(3), along with their misrepresentations of the infectious peak and time to peak already discussed.
To address this issue, in the next section we derive novel approximations of the dSIR Eqs. (9)–(11) in the form of the Padé SIR ODEs, which have a number of advantages: While the Padé SIR model structure is as simple as that of the standard SIR model, Eqs. (1)–(3), and produces the same threshold and long-term values captured by Eqs. (5) and (6), it produces more realistic representations for the infectious peak and time to peak than the standard SIR ODEs. As such, the Padé SIR model structure not only creates an alternative to the standard SIR model but also provides a general module that can be immediately incorporated in the wide variety of compartment-based models used in epidemiology.
Main results
The Padé SIR model structure
Combining Laplace transforms with first-order Padé approximation (a popular approach for approximating transcendental transfer functions by polynomial rational fractions in automatic control^{40,41}), one can show (APPENDIX C) that Eqs. (9)–(11) of the dSIR model can be approximated by the first-order Padé SIR model, comprising Eqs. (1), (4), and the novel ODE
\[{i}^{\prime}\left(t\right)=\frac{2}{D}\left({R}_{0}s\left(t\right)-1\right)i\left(t\right) \tag{12}\]
where \(D\) is the value of the infectious period, \(T.\) Note that the only difference between the above Eq. (12) and its standard SIR counterpart, Eq. (2), is simply the factor 2. Yet this difference has significant implications, to be highlighted shortly.
For better approximation of Eqs. (9)–(11) one can use a second-order Padé approximation to obtain (APPENDIX C) the second-order Padé SIR model, which comprises Eqs. (1), (4), and the second-order ODE
in place of the SIR model’s Eq. (2).
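To make the effect of the factor 2 in the first-order structure concrete, the following minimal Python sketch integrates both models in dimensionless time \(\eta =t/D\) and compares the infectious peaks. It assumes that the first-order Padé SIR equation reads \({i}^{\prime}=\frac{2}{D}\left({R}_{0}s-1\right)i\), as implied by the remark that it differs from Eq. (2) only by the factor 2. The function name, the multiplier argument `k`, the initial fraction `i0`, and the step size are choices made for this illustration only.

```python
def peak_infectious(r0, k, i0=1e-6, eta_end=80.0, h=0.01):
    """Peak of i(eta) for s' = -r0*s*i, i' = k*(r0*s - 1)*i.

    k = 1 reproduces the standard SIR model; k = 2 is the assumed
    first-order Pade SIR structure. Time is eta = t / D.
    """
    def rhs(s, i):
        return -r0 * s * i, k * (r0 * s - 1.0) * i

    s, i, peak = 1.0 - i0, i0, i0
    for _ in range(int(eta_end / h)):
        # Classical 4th-order Runge-Kutta step.
        a1 = rhs(s, i)
        a2 = rhs(s + 0.5 * h * a1[0], i + 0.5 * h * a1[1])
        a3 = rhs(s + 0.5 * h * a2[0], i + 0.5 * h * a2[1])
        a4 = rhs(s + h * a3[0], i + h * a3[1])
        s += h * (a1[0] + 2 * a2[0] + 2 * a3[0] + a4[0]) / 6.0
        i += h * (a1[1] + 2 * a2[1] + 2 * a3[1] + a4[1]) / 6.0
        peak = max(peak, i)
    return peak

if __name__ == "__main__":
    for r0 in (1.2, 1.5, 2.0):
        p_sir = peak_infectious(r0, k=1)
        p_pade = peak_infectious(r0, k=2)
        print(f"R0={r0}: SIR peak={p_sir:.4f}, Pade1 peak={p_pade:.4f}, "
              f"ratio={p_pade / p_sir:.3f}")
```

In this sketch the ratio of the two peaks stays very close to 2 for a vanishingly small initial infectious fraction, consistent with the factor-of-2 discrepancy discussed in the text.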
Why Padé SIR?
A basic merit of the Padé SIR model is illustrated in Fig. 5, which shows that the profiles of \(i(t)\) obtained by numerically integrating the (first- or second-order) Padé SIR models are close to that produced by the dSIR model for a range of values of \({R}_{0}\), but far from the corresponding profile produced by the standard SIR model.
Note that the approximation in Fig. 5 depends on \({R}_{0}\stackrel{\scriptscriptstyle\mathrm{def}}=\beta D\) and deteriorates as \({R}_{0}\) takes values farther away from 1, as expected by the properties of Padé approximants. In fact, the first-order Padé SIR model should be used with caution for \({R}_{0}\ge 2\), because it would yield negative early values of \(r(t)\), as can be immediately deduced by linear analysis of the corresponding third ODE, \({r}^{{\prime}}\left(t\right)=-{s}^{{\prime}}\left(t\right)-{i}^{{\prime}}\left(t\right)=-\frac{{R}_{0}}{D}s\left(t\right)i\left(t\right)+\frac{2}{D}i\left(t\right)\), which implies \(r\left(t\right)-\overline{r }\approx \frac{2-{R}_{0}}{D}i\left(t\right)\); and the same model, for larger \({R}_{0}\ge 2\), would produce peak values of \(i\left(t\right)>1\), which is clearly meaningless. However, as shown in Fig. 5, the predictions of \({i}^{*}\) by the first-order Padé SIR remain remarkably close to those of the dSIR model, even for fairly large \({R}_{0}\) well above 2. This behavior of approximation accentuates the value of the Padé SIR model, as values of \({R}_{0}\) close to (or lower than) 1 would be far more desirable than values well above 1 (Fig. 1). Of course, one could easily extend the Padé SIR model to yield \(r(t)\) values in the interval \(\left[0,1\right]\) through the simple modification \({r}^{{\prime}}\left(t\right)=\mathrm{max}\left(-\frac{{R}_{0}}{D}s\left(t\right)i\left(t\right)+\frac{2}{D}i\left(t\right), 0\right)\), as indicated for \({R}_{0}=5, 6\) in Fig. 5.
Figure 5 also shows profiles of \(i\left(t\right)\) by the second-order Padé SIR model, and indicates that Padé approximations of third or higher order could be used in a similar way, but the point of diminishing returns would be quickly reached, as model complexity would increase much more quickly than quality of approximation.
Before discussing the important consequences implied by the Padé SIR model, relevant properties of that model are briefly summarized next, to better support the consequences established thereafter.
Comparative summary of important properties of the Padé SIR models
The models considered can be analyzed using standard ODE or DDE theory, as already mentioned. Therefore, only aspects that bear insight or novelty will be discussed and corresponding comparisons will be made.
Instability at equilibrium and epidemic outbreak
It can be shown (APPENDIX D) that an equilibrium point, \(\{s = \bar s,\;i = 0,\;r = 1 - \bar s\}\), of the dSIR or of the Padé SIR model is stable and an epidemic outbreak does not occur iff \(\overline{s }\) is below the threshold in Eq. (5). This result is in fact anticipated by the original Kermack and McKendrick analysis.
Final values of \(\left\{s,i,r\right\}\)
It can be shown (APPENDIX E) that at the end of an epidemic that started at \(s\left(0\right)\approx \overline{s },\) \(i\left(0\right)\approx 0,\) and \(r\left(0\right)\approx 1-\overline{s },\) the total fraction of infected throughout the epidemic is
for all four models, where \({R}_{0}\stackrel{\scriptscriptstyle\mathrm{def}}=\beta D=\beta /\gamma\) and \(W\) is the Lambert function^{42}, whose importance in epidemiology modeling appears to have been recognized only recently^{43} (note that \(\overline{s}\beta D\stackrel{\scriptscriptstyle\mathrm{def} }=\overline{s}{R }_{0}>1\) is required for the epidemic to spread). Equation (14) is the analytical solution of Eq. (6) and is precisely what is depicted in the graph of Fig. 1 for \(s\left(0\right)\approx 1\).
Exponential rate of epidemic spread
For the early part of a spreading epidemic, it can be shown (APPENDIX F) that the infectious fraction, \(i(t)\), follows the approximately exponential growth
according to the two Padé SIR models, or
according to the dSIR model, where the constants \(a\ll b\) in Eq. (17) are in terms of \({R}_{0}\stackrel{\scriptscriptstyle\mathrm{def}}=\beta D\) (APPENDIX F). By comparison, the early growth of \(i\left(t\right)\) according to the standard SIR model in Eqs. (1)–(3) is
where \({R}_{0}\stackrel{\scriptscriptstyle\mathrm{def}}=\beta /\gamma\). Note that in all four Eqs. (15)–(18) the rates \({p}_{0}\) depend on \({R}_{0}\overline{s }\) alone, as is anticipated by the corresponding dSIR, Padé SIR, and SIR models, for which introduction of the dimensionless time \(\eta =\gamma t\stackrel{\scriptscriptstyle\mathrm{def}}=t/D\) leaves \({R}_{0}\) as the only remaining parameter in the corresponding equations. Therefore, the above exponential rates \({p}_{0}\) are shown as functions of \(\overline{s}{R }_{0}\) in Fig. 6. Note that \(\overline{s }=1\) for an epidemic without prior immunity in the population.
It is evident in Fig. 6 that the rates (or doubling periods) corresponding to the Padé SIR and dSIR models differ from their SIR counterparts by a factor of about 2, for \(\overline{s}{R }_{0}\) not much higher than 1. This agrees well with the more rapid early rise of \(i(t)\) from numerical integration of the dSIR and Padé SIR models compared to that of the SIR model, as shown in Fig. 5. Note again that despite these rate differences shown in Fig. 6, all four models considered eventually reach the same steady-state values, as captured by Eq. (14).
The importance of these discrepancies for estimation of \({R}_{0}\) from early epidemic data will be made clear shortly.
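The factor of about 2 can be reproduced with a short calculation. The Python sketch below computes the dominant dimensionless growth rate \({p}_{0}\) (in \(i\propto \mathrm{exp}({p}_{0}t/D)\)) for each model by linearizing around \(s\approx \overline{s }\), \(i\approx 0\). The closed forms used here are derived for this illustration rather than copied from the paper's equations: \(x-1\) for SIR, \(2(x-1)\) for first-order Padé SIR, the positive root of \(1+p/2+{p}^{2}/12=x\) for second-order Padé SIR, and the positive root of \(p=x(1-{e}^{-p})\) for dSIR, where \(x=\overline{s}{R }_{0}\). The dSIR relation assumes a fixed infectious period \(D\), so that discharge at time \(t\) equals loading at time \(t-D\). All function names are hypothetical.

```python
import math

def p0_sir(x):
    """Growth rate for standard SIR; x = s_bar * R0."""
    return x - 1.0

def p0_pade1(x):
    """Growth rate for the assumed first-order Pade SIR structure."""
    return 2.0 * (x - 1.0)

def p0_pade2(x):
    """Positive root of 1 + p/2 + p**2/12 = x (second-order Pade)."""
    return -3.0 + math.sqrt(9.0 + 12.0 * (x - 1.0))

def p0_dsir(x, tol=1e-12):
    """Positive root of p = x * (1 - exp(-p)), found by bisection."""
    if x <= 1.0:
        raise ValueError("a growing epidemic requires s_bar * R0 > 1")

    def g(p):
        # expm1 keeps 1 - exp(-p) accurate for small p.
        return p + x * math.expm1(-p)

    lo, hi = 1e-9, x  # g(lo) < 0 and g(x) = x * exp(-x) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for x in (1.1, 1.2, 1.5, 2.0):
        print(f"s_bar*R0={x}: SIR={p0_sir(x):.3f}, Pade1={p0_pade1(x):.3f}, "
              f"Pade2={p0_pade2(x):.3f}, dSIR={p0_dsir(x):.3f}")
```

For \(\overline{s}{R }_{0}\) slightly above 1 the three non-SIR rates nearly coincide and are close to twice the SIR rate; the gap between them widens as \(\overline{s}{R }_{0}\) grows.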
Peak of infectious fraction
While an analytical solution for \({i}^{*}\) according to the dSIR model is not obvious to the author, a good approximation can be easily obtained (APPENDIX G) through the first-order Padé SIR model, following the same approach taken for the SIR model, to get
The above \({i}^{*}\), for the same \({R}_{0}\), is exactly double the \({i}^{*}\) of the standard SIR model, which is known to be
(APPENDIX G). This discrepancy accounts for the differences observed in Fig. 5 between the \({i}^{*}\) produced by the SIR and by the other three models considered. Obviously, this approximation breaks down for values of \({R}_{0}\) that yield \({i}_{\mathrm{Pad}\acute{{\rm e}}1\mathrm{ SIR}}^{*}>1\), a situation that would be expected for large values of \({R}_{0}\), as illustrated in the last plot \(\left({R}_{0}=6\right)\) of Fig. 5.
The discrepancy between the SIR and Padé SIR models also manifests itself in using them for model-based predictions that depend on parameter estimates driven by epidemiological data, as discussed in the next section.
Discussion and extensions
Model-based predictions from fitting epidemiological data
An immediate and important discrepancy for the models discussed is in the estimation of \({R}_{0}\) from epidemiological data on daily new cases during exponential growth, i.e. from \(i(t)\) or \(i{^{\prime}}(t)\), and from the average infectious period, \(D\stackrel{\scriptscriptstyle\mathrm{def}}=1/\gamma\). Figure 6 captures the relationship between the exponential growth rate \({p}_{0}\) and the corresponding \(\overline{s}{R }_{0}\). Therefore, for \(s\left(t\right)\approx \overline{s }=1\), it is standard to use a simple log-plot of daily new cases vs. time to estimate the slope \({p}_{0}=\mathrm{ln}\left(2\right)D/{t}_{d}\) of \(\mathrm{exp}\left({p}_{0}t/D\right)\) (where \({t}_{d}\) is the doubling period) and from that the resulting \({R}_{0}\). Following this procedure for \(\overline{s }=1\) (no prior immunity), the two Padé SIR models, Eqs. (15) and (16), yield the novel \({R}_{0}\) estimates
the dSIR model yields
whereas the standard SIR model, Eq. (2), yields the well known estimate^{9,21,44}
The above Eqs. (21)–(24) can be visualized in Fig. 6 with \({p}_{0}\) considered the independent variable. Note that \({R}_{0, \mathrm{ Pad}\acute{{\rm e}}1\mathrm{ SIR}}-1=\frac{{R}_{0, \mathrm{ SIR}}-1}{2}\) and \({R}_{0, \mathrm{ Pad}\acute{{\rm e}}2\mathrm{ SIR}}-1=\frac{{R}_{0, \mathrm{ SIR}}-1}{2}\left(1+\frac{{R}_{0, \mathrm{ SIR}}-1}{6}\right)\) and that \({R}_{0, \mathrm{dSIR}}\approx {R}_{0, \mathrm{ Pad}\acute{{\rm e}}1\mathrm{ SIR}}\) for small \({p}_{0}\).
The important message of Fig. 6 is that systematic error may arise in the estimation of \({R}_{0}\) when using the standard SIR model. For example, taking \(D=8.4\mathrm{ days}\) (Eq. (8) for COVID-19) and \({t}_{d}=2.3 \; \mathrm{days}\) (corresponding to early COVID-19 spread in the US^{45}) yields \({R}_{0,\mathrm{dSIR}}\approx {R}_{0,\mathrm{Pade}2\mathrm{ SIR}}=3.2\) vs. \({R}_{0,\mathrm{SIR}}=4\) for \(\frac{{t}_{d}}{D}=0.23\) in Fig. 6. As \({t}_{d}\) increases, the discrepancy between \({R}_{0,\mathrm{dSIR}}\) or \({R}_{0,\mathrm{Pade}2\mathrm{ SIR}}\) on one hand and \({R}_{0,\mathrm{SIR}}\) on the other becomes more pronounced.
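The estimation procedure described above can be sketched in a few lines of Python. The function below converts a doubling period into \({p}_{0}=\mathrm{ln}(2)D/{t}_{d}\) and then into an \({R}_{0}\) estimate under each model, for \(\overline{s }=1\). The SIR and Padé relations follow the identities stated in the text (\({R}_{0}=1+{p}_{0}\), \(1+{p}_{0}/2\), and \(1+{p}_{0}/2+{p}_{0}^{2}/12\)); the dSIR relation \({R}_{0}={p}_{0}/(1-{e}^{-{p}_{0}})\) is an assumption derived here from a fixed infectious period, and the function name and dictionary keys are hypothetical.

```python
import math

def r0_estimates(t_d, D):
    """R0 estimates from doubling period t_d and mean infectious period D.

    Both arguments must use the same time unit; only the ratio t_d / D matters.
    """
    p0 = math.log(2.0) * D / t_d  # dimensionless exponential growth rate
    return {
        "SIR": 1.0 + p0,
        "Pade1 SIR": 1.0 + p0 / 2.0,
        "Pade2 SIR": 1.0 + p0 / 2.0 + p0 ** 2 / 12.0,
        "dSIR": p0 / (1.0 - math.exp(-p0)),  # assumed fixed-delay relation
    }

if __name__ == "__main__":
    # Ratio t_d / D = 0.23, as used for the example in the text.
    for name, value in r0_estimates(t_d=0.23, D=1.0).items():
        print(f"{name}: R0 = {value:.2f}")
```

With \({t}_{d}/D=0.23\) this sketch gives about 4.0 for the standard SIR model and about 3.2 to 3.3 for the dSIR and second-order Padé SIR models, in line with the values quoted above.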
Systematic errors in estimates of \({R}_{0}\) have important implications. For example, the conceptual anticipation of total infected through the pandemic, as shown in Fig. 1, following Eq. (14), is going to be significantly affected. In addition, the infectious peak is also going to be affected in a nontrivial way, as shown in Fig. 7. In that figure, profiles of \({i}^{*}\) are plotted as functions of the exponential growth rate, \({p}_{0}\), through the following procedure: Given \({p}_{0}\), the corresponding values of \({R}_{0}\) are computed according to the dSIR, Padé1 SIR, Padé2 SIR, and SIR models (Eqs. (21)–(24)) and, subsequently, values of \({i}^{*}\) are computed using Eq. (19) (Padé1 SIR model) for the first three \({R}_{0}\) values and Eq. (20) for the fourth value of \({R}_{0}\). For calibration, the dots in Fig. 7 represent calculation of \({i}^{*}\) through direct numerical integration of the dSIR Eqs. (9)–(11) for values of \({R}_{0}\) computed using Eq. (23). There is remarkable closeness of \({i}^{*}\) values produced by the Padé SIR models to the ideal values produced by the dSIR model, contrasted to the distance of \({i}^{*}\) values produced by the standard SIR model.
The message from this exercise is that although adjusting the parameter \({R}_{0}\) of the standard SIR model can fit data from exponential epidemic growth well, there will remain two significant problems, namely neither the estimated \({R}_{0}\) nor the predicted \({i}^{*}\) will be represented well. The proposed model structures offer a better representation.
Analytical calculation of \({R}_{0}\) to observe an upper bound on \({i}^{*}\)
Of practical interest is the situation where an upper bound is placed on \({i}^{*},\) to avoid the overwhelming of hospitalization facilities during an epidemic. For that situation, Eq. (19) of the Padé1 SIR model has an explicit analytical solution for the corresponding \({R}_{0}\stackrel{\scriptscriptstyle\mathrm{def}}=\beta D\) as
where \({W}_{-1}\) is the Lambert function of order \(-1\) and typically \(s\left(0\right)\approx 1\) without prior immunity. By comparison, the standard SIR model yields
The values of \({R}_{0}\) indicated by Eqs. (25) and (26), with corresponding definitions, are shown in Fig. 8. It is evident that the Padé SIR model places twice as tight a restriction on \(\left({R}_{0}1\right)\) as the standard SIR model, if \(i\) is not to exceed the specified \({i}^{*}\) value. The implications of this result for tasks such as Flattening the Curve through interventions that adjust \({R}_{0}\) are clear.
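As a numerical counterpart to the closed-form solutions, the following minimal Python sketch finds the largest \({R}_{0}\) compatible with a specified bound on \({i}^{*}\) by bisection instead of the Lambert function. It assumes \(s(0)\approx 1\), the well-known SIR peak \({i}^{*}=1-(1+\mathrm{ln}{R}_{0})/{R}_{0}\), and a first-order Padé SIR peak equal to exactly twice that value, as stated earlier in the text. The function names, the search interval, and the tolerance are illustrative choices.

```python
import math

def peak_sir(r0):
    """SIR infectious peak for s(0) ~ 1 and i(0) ~ 0."""
    return 1.0 - (1.0 + math.log(r0)) / r0

def peak_pade1(r0):
    """First-order Pade SIR peak, taken as twice the SIR peak."""
    return 2.0 * peak_sir(r0)

def max_r0(peak_fn, i_max, hi=50.0, tol=1e-10):
    """Largest R0 in [1, hi] with peak_fn(R0) <= i_max, by bisection.

    Relies on peak_fn being increasing in R0 for R0 > 1.
    """
    if peak_fn(hi) < i_max:
        raise ValueError("bound is not reached within the search interval")
    lo = 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if peak_fn(mid) > i_max:
            hi = mid
        else:
            lo = mid
    return lo

if __name__ == "__main__":
    for i_max in (0.01, 0.05, 0.10):
        print(f"i* <= {i_max}: R0 (SIR) <= {max_r0(peak_sir, i_max):.3f}, "
              f"R0 (Pade1 SIR) <= {max_r0(peak_pade1, i_max):.3f}")
```

For every bound tried, the Padé SIR structure admits a smaller \({R}_{0}\) than the standard SIR model, so an intervention sized with the standard model would be less stringent than one sized with the Padé SIR model.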
How does the Padé SIR model work?
Underlying the Padé SIR model are constructs for approximating the unit-step cumulative distribution of the infectious time period, \(T\), shown in Fig. 3 \((n=\infty )\), as explained in APPENDIX C. Graphs of these approximations and their corresponding formulas are presented in Fig. 9, along with the exponential and unit-step distributions for comparison. Note that the two Padé SIR distributions in Fig. 9 might appear absurd, as they involve negative values. However, this pattern turns out to yield acceptable values for the fractions \(s,i,r\).
It should also be noted that Eq. (12) of the first-order Padé SIR suggests that the infectious loading rate remains \(\frac{{R}_{0}}{D}s\left(t\right)i\left(t\right)\), whereas the infectious discharge rate appears as \(\frac{2-{R}_{0}s\left(t\right)}{D}i(t)\) rather than \(i(t)/D\), suggested by Eq. (2). This is illustrated visually in Fig. 10 in two ways, both of which underscore the significant differences between the SIR model on one hand and dSIR and Padé SIR models on the other: First (top), a time-varying \(\gamma \left(t\right)\stackrel{\scriptscriptstyle\mathrm{def}}=r {^{\prime}}(t)/i(t)\) (following Eq. (3)) is shown, with the values of \({r}^{{\prime}}\left(t\right)\) and \(i(t)\) calculated by the first- or second-order Padé SIR model with a fixed \(D\). Note that the discrepancy between \(D\) and \(1/\gamma \left(t\right)\) (shown as values of \(\gamma \left(t\right)D\) in Fig. 10) remains appreciable even for values of \({R}_{0}\) close to 1. Second (middle and bottom), Fig. 10 shows in a stacked plot the differences between the fractions \(\left\{s\left(t\right),i\left(t\right),r\left(t\right)\right\}\) produced by the (first- or second-order) Padé SIR models and the SIR model. In addition to the clear difference in the time profiles and infectious fraction peaks, note that the horizontal slices of the orange segments, corresponding to the infectious period for each newly infected fraction (Fig. 2), remain constant (equal to \(D\)) over time for the Padé SIR model, in contrast to the SIR model, for which the infectious period increases (Fig. 10, top).
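A unit-step cumulative distribution at \(\theta =D\) corresponds to a pure delay, whose Laplace-domain representation is \({e}^{-D\sigma }\). The short Python sketch below shows how well the standard first- and second-order Padé approximants reproduce \({e}^{-x}\) for real \(x=D\sigma\). The specific rational forms are the textbook [1/1] and [2/2] approximants of the exponential; that these are the forms used in APPENDIX C is an assumption, as the appendix is not reproduced here, and the function names are hypothetical.

```python
import math

def pade1(x):
    """[1/1] Pade approximant of exp(-x)."""
    return (1.0 - x / 2.0) / (1.0 + x / 2.0)

def pade2(x):
    """[2/2] Pade approximant of exp(-x)."""
    num = 1.0 - x / 2.0 + x * x / 12.0
    den = 1.0 + x / 2.0 + x * x / 12.0
    return num / den

if __name__ == "__main__":
    for x in (0.1, 0.5, 1.0, 2.0, 4.0):
        exact = math.exp(-x)
        print(f"x={x}: exp(-x)={exact:.5f}, "
              f"Pade1 error={abs(pade1(x) - exact):.2e}, "
              f"Pade2 error={abs(pade2(x) - exact):.2e}")
```

Both approximants are accurate for small \(x\) and degrade as \(x\) grows, with the second-order form degrading more slowly, which mirrors the observation that the Padé SIR approximation deteriorates as \({R}_{0}\) moves away from 1.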
The proposed approach in the context of Kermack and McKendrick
In the sentence right before they present their SIR model in Eq. (29) of their paper, Kermack and McKendrick^{18} explain that this is a
special case in which \(\phi\) and \(\psi\) are constants \(\kappa\) and \(l\) respectively.
with \(\left(\kappa , l\right)\) referring to \(\left(\beta ,\gamma \right)\) of Eqs. (1)–(3), respectively. The assumption about constant \(\phi\) is plausible, as it refers to the rate of spread of the epidemic (cf. Eq. (1)). While that parameter might change over time as a result of interventions taken to curb an epidemic, such changes could easily be reflected in the SIR model by a time-varying \(\phi\) (cf. \(\beta\) in Eqs. (1) and (2)). The assumption about constant \(\psi ,\) however, widely as it may have been used, is chosen for mathematical convenience rather than for reasonableness of representation:
If \({\psi }_{\theta }\) denotes the rate of removal, …, then the number who are removed from each \(\theta\) group at the end of the interval \(t\) is \({\psi }_{\theta }{v}_{t,\theta }\), (ibid., p. 703).
where
\({v}_{t,\theta }\) shall denote the number of individuals in unit area at the time \(t\) who have been infected for \(\theta\) intervals (ibid., p. 702).
However, the rate of removal depends more on how long individuals have remained infected and less on the size of that group. It is the neglect of this simple fact that is critiqued here, and for which alternatives are proposed.
Finally, it is fitting to quote Kermack and McKendrick’s remarks on fitting field data from a plague outbreak: along with using the SIR model, thereby assuming an exponential distribution of infectious time after infection, these authors explicitly state five additional simplifying assumptions (p. 715, ibid.) and warn that
deductions as to the actual values of the various constants should not be drawn. It may be said, however, that the calculated curve, …, conforms roughly to the observed figures.
Indeed, all four models considered in this study (dSIR, first- and second-order Padé SIR, and SIR) fit the data mentioned well. Yet, were these models to be used for fitting the early exponential spread of the epidemic, their projections would be quite different, as already elaborated on.
Extensions
As already mentioned, the proposed approach to compartment-based epidemiological modeling is applicable to model structures with a variety of compartments and flows among them. For these structures, the corresponding ODE models resulting from compartment discharge rates proportional to the load of each corresponding compartment^{14} can be immediately translated (a) from ODEs to DDEs with each compartment delay equal to the average residence time of that compartment, and (b) from DDEs to (first- or second-order) Padé approximations, which retain an ODE structure.
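Step (b) can be sketched for a single compartment as follows. The notation is an assumption made for this illustration only: \(x(t)\) is the compartment load, \(u(t)\) its loading rate, \(D\) its residence time, and \(\sigma\) the Laplace variable (chosen to avoid a clash with the susceptible fraction \(s\)). Zero initial conditions are assumed, and only the first-order Padé approximant of the delay is used.

\[{x}^{\prime}(t)=u(t)-u(t-D)\quad \Longrightarrow \quad \sigma X(\sigma )=\left(1-{e}^{-D\sigma }\right)U(\sigma )\]
\[{e}^{-D\sigma }\approx \frac{1-D\sigma /2}{1+D\sigma /2}\quad \Longrightarrow \quad \sigma X(\sigma )\approx \frac{D\sigma }{1+D\sigma /2}\,U(\sigma )\]
\[\left(1+\frac{D\sigma }{2}\right)X(\sigma )\approx D\,U(\sigma )\quad \Longrightarrow \quad {x}^{\prime}(t)\approx \frac{2}{D}\left(D\,u(t)-x(t)\right)\]

With \(x=i\) and \(u=\frac{{R}_{0}}{D}s\,i\), the last expression reduces to \({i}^{\prime}(t)=\frac{2}{D}\left({R}_{0}s(t)-1\right)i(t)\), and the implied discharge rate \(u(t)-{x}^{\prime}(t)\) becomes \(\frac{2}{D}x(t)-u(t)\), which stays an ordinary (delay-free) function of the current state.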
To substantiate these claims by an example, we briefly discuss next an extension of the ideas developed for the SIR structure to the SPIR variant that includes a compartment P between S and I (APPENDIX H). Individuals in the P compartment (equivalent to the E compartment in the standard SEIR structure^{10,11,46,47}) are asymptomatic infectious, that is, they can transmit the disease before they enter the I compartment as symptomatic infectious^{48}, a trait observed on several occasions, notably in the current COVID-19 epidemic^{23,49,50}. Corresponding equations are shown in Table 1.
Figure 11 presents a comparison of profiles of \(i(t)\) which result from numerical solution of the dSPIR, Padé SPIR, and SPIR models. Parameter values relevant to COVID-19^{35,36} are used in all simulations, with \({R}_{0}\stackrel{\scriptscriptstyle\mathrm{def}}={\beta }_{i}\left({D}_{i}-{D}_{p}\right)+{\beta }_{p}{D}_{p}\).
Additional properties of the proposed SPIR models can be established in a similar manner^{51} and will be explored in more detail elsewhere.
Finally, in situations where there are data to warrant it, one can relax the basic premise of the preceding discussion, namely that the dynamics of a system with S, I, R compartments will likely be close to the dynamics of a system with a step function as cumulative distribution \(\mathcal{F}\) of infectious period (Figs. 3 and 4). In such situations (e.g. models by Anderson et al.^{52}) a corresponding SIR-like model structure can be developed that employs the ODE \({i}^{\prime}\left(t\right)=\frac{\alpha }{D}\left({R}_{0}s\left(t\right)-1\right)i(t)\) in place of Eq. (12), where the parameter \(\alpha\) (\(1\le \alpha \le 2\)) is associated with the sigmoidicity of \(\mathcal{F}.\) A full development of that case is presented in a separate publication^{53}.
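For illustration, the role of \(\alpha\) can be explored with a short Python sketch. It assumes, beyond what is stated above, that the \(s\)-equation keeps the form \({s}^{\prime}(t)=-\frac{{R}_{0}}{D}s(t)i(t)\); dividing the two ODEs and integrating then gives a closed-form peak of the infectious fraction, which the sketch cross-checks by numerical integration. Function names and parameter values are hypothetical.

```python
import math

def peak_infectious(alpha, R0, s0, i0):
    """Closed-form peak of i(t), from integrating
    di/ds = -alpha * (1 - 1/(R0*s)) between s0 and s = 1/R0."""
    if R0 * s0 <= 1.0:
        return i0                    # no growth: i(t) only decreases from i0
    return i0 + alpha * (s0 - 1.0 / R0 - math.log(R0 * s0) / R0)

def simulate_peak(alpha, R0, D, s0, i0, t_end=600.0, dt=0.05):
    """Numerical cross-check: RK4 on (s, i), returning the largest i on the grid."""
    def f(s, i):
        # Assumed s-equation, and the i-equation with parameter alpha
        return (-R0 / D * s * i, alpha / D * (R0 * s - 1.0) * i)
    s, i, best = s0, i0, i0
    for _ in range(int(round(t_end / dt))):
        k1 = f(s, i)
        k2 = f(s + 0.5 * dt * k1[0], i + 0.5 * dt * k1[1])
        k3 = f(s + 0.5 * dt * k2[0], i + 0.5 * dt * k2[1])
        k4 = f(s + dt * k3[0], i + dt * k3[1])
        s += dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        i += dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        best = max(best, i)
    return best

R0, D, i0 = 1.8, 10.0, 1e-4          # hypothetical values, D in days
for alpha in (1.0, 1.5, 2.0):
    print(alpha,
          peak_infectious(alpha, R0, 1.0 - i0, i0),
          simulate_peak(alpha, R0, D, 1.0 - i0, i0))
```

Under these assumptions the excess of the peak over the initial infectious fraction grows linearly with \(\alpha\), and the peak value itself does not depend on \(D\), which only sets the time scale.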
Conclusion
We have made a case for revisiting the standard SIR model that describes the spread of infectious disease epidemics. While that model features valuable insights, it also has fundamental shortcomings related to quantifying the spread of an epidemic, as detailed in the main text. Therefore, use of that model to manage an epidemic could have adverse consequences. A remedy to this problem is proposed in the form of the Padé SIR model structure, which retains all qualitative features of the standard SIR structure as well as its simplicity, yet mitigates its systematic errors. It is also noted that the remedy proposed is not confined to the standard SIR model, but is applicable to the numerous compartment-based epidemiological models that constitute SIR variants, a reexamination of which would be warranted. The tools developed here can be easily and transparently incorporated in related software for practitioners or researchers^{44,54,55}. Related formulas, derived in the main text, can be used both for epidemiological data processing to guide decision making and for theoretical analysis to advance the mathematical theory of epidemics.
References
Jewell, N. P., Lewnard, J. A. & Jewell, B. L. Predictive mathematical models of the COVID-19 pandemic: Underlying principles and value of projections. JAMA 323, 1893–1894. https://doi.org/10.1001/jama.2020.6585 (2020).
Giordano, G. et al. Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy. Nat. Med. https://doi.org/10.1038/s4159102008837 (2020).
Adam, D. The simulations driving the world’s response to COVID-19. Nature 580, 316–318 (2020).
Wang, C. et al. Evolving epidemiology and impact of non-pharmaceutical interventions on the outbreak of coronavirus disease 2019 in Wuhan, China. medRxiv https://doi.org/10.1101/2020.03.03.20030593 (2020).
Kucharski, A. J. et al. Early dynamics of transmission and control of COVID-19: A mathematical modelling study. Lancet. Infect. Dis 20, 553–558. https://doi.org/10.1016/S14733099(20)301444 (2020).
Rahimi, I., Chen, F. & Gandomi, A. H. A review on COVID-19 forecasting models. Neural Comput. Appl. https://doi.org/10.1007/s00521020056268 (2021).
Wynants, L. et al. Prediction models for diagnosis and prognosis of covid-19: Systematic review and critical appraisal. BMJ 369, m1328. https://doi.org/10.1136/bmj.m1328 (2020).
Tufekci, Z. This overlooked variable is the key to the pandemic. The Atlantic 30. https://www.theatlantic.com/health/archive/2020/09/koverlookedvariabledrivingpandemic/616548/ (2020).
Fraser, C., Riley, S., Anderson, R. & Ferguson, N. Factors that make an infectious disease outbreak controllable. Proc. Natl. Acad. Sci. USA. 101, 6146–6151. https://doi.org/10.1073/pnas.0307506101 (2004).
Murray, J. D. Mathematical Biology: I. An Introduction (Springer, 2002).
Keeling, M. J. & Rohani, P. Modeling Infectious Diseases in Humans and Animals (Princeton University Press, 2008).
Brauer, F. & Castillo-Chavez, C. Mathematical Models in Population Biology and Epidemiology (Springer, 2012).
Diekmann, O., Heesterbeek, H. & Metz, H. In Epidemic Models: Their Structure and Relation to Data (ed Mollison, D.) (1995).
Anderson, R. M. & May, R. M. Population biology of infectious diseases: Part I. Nature 280, 361–367. https://doi.org/10.1038/280361a0 (1979).
Brauer, F. Mathematical epidemiology: Past, present, and future. Infect. Dis. Model. 2, 113–127. https://doi.org/10.1016/j.idm.2017.02.001 (2017).
Hethcote, H. W. In Models for Infectious Human Diseases: Their Structure and Relation to Data (eds Isham, C. & Medley, G.) 215–238 (Publications of the Newton Institute, 1996).
Anderson, R. M. & May, R. M. Infectious Diseases of Humans. Dynamics and Control (Oxford University Press, 1991).
Kermack, W. O. & McKendrick, A. G. A contribution to the mathematical theory of epidemics. Proc. R. Soc. Lond. Ser. A 115, 700 (1927).
Heesterbeek, J. A. A brief history of R0 and a recipe for its calculation. Acta. Biotheor. 50, 189–204. https://doi.org/10.1023/a:1016599411804 (2002).
MacDonald, G. The Epidemiology and Control of Malaria (Oxford Univ. Pr., 1957).
Dietz, K. The estimation of the basic reproduction number for infectious diseases. Stat. Methods Med. Res. 2, 23–41. https://doi.org/10.1177/096228029300200103 (1993).
Qualls, N. L. et al. Community mitigation guidelines to prevent pandemic influenza—United States, 2017. MMWR Recomm. Rep. 66, 1 (2017).
Ferguson, N. M. et al. Report 9—Impact of Non-pharmaceutical Interventions (NPIs) to Reduce COVID-19 Mortality and Healthcare Demand (Imperial College, 2020). https://doi.org/10.25561/77482.
Hethcote, H. W. Frontiers in theoretical biology. In A Thousand and One Epidemic Models (ed. Levin, S. A.) 504–515 (Springer, 1994).
Ferguson, N. M. et al. Strategies for mitigating an influenza pandemic. Nature 442, 448–452. https://doi.org/10.1038/nature04795 (2006).
Kestenbaum, B. Epidemiology and Biostatistics (Springer, 2019).
Kemper, J. T. On the identification of superspreaders for infectious disease. Math. Biosci. 48, 111–127. https://doi.org/10.1016/00255564(80)900188 (1980).
Siettos, C. I. & Russo, L. Mathematical modeling of infectious disease dynamics. Virulence 4, 295–306. https://doi.org/10.4161/viru.24041 (2013).
Hethcote, H. W. The mathematics of infectious diseases. SIAM Rev. 42, 599–653. https://doi.org/10.1137/S0036144500371907 (2000).
Rock, K., Brand, S., Moir, J. & Keeling, M. J. Dynamics of infectious diseases. Rep. Prog. Phys. 77, 026602. https://doi.org/10.1088/00344885/77/2/026602 (2014).
Horrocks, J. & Bauch, C. T. Algorithmic discovery of dynamic models from infectious disease data. Sci. Rep. 10, 7061. https://doi.org/10.1038/s4159802063877w (2020).
Cushing, J. M. Integrodifferential Equations and Delay Models in Population Dynamics (Springer, 1977).
Box, G. E. P. Robustness in Statistics (eds Launer, R. L. & Wilkinson, G. N.) 201–236 (Academic Press, 1979).
Byrne, A. W. et al. Inferred duration of infectious period of SARS-CoV-2: Rapid scoping review and analysis of available evidence for asymptomatic and symptomatic COVID-19 cases. BMJ Open 10, e039856. https://doi.org/10.1136/bmjopen2020039856 (2020).
Lauer, S. A. et al. The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: Estimation and application. Ann. Intern. Med. https://doi.org/10.7326/M200504 (2019).
Boldog, P. et al. Risk assessment of novel coronavirus COVID-19 outbreaks outside China. J. Clin. Med. 9, 571 (2020).
Kuang, Y. Delay Differential Equations with Applications in Population Dynamics (Academic Press, 1993).
Gopalsamy, K. Stability and Oscillations in Delay Differential Equations of Population Dynamics (Springer, 1992).
Bellen, A. & Zennaro, M. Numerical Methods for Delay Differential Equations (Oxford, 2003).
Stephanopoulos, G. Chemical Process Control: An Introduction to Theory and Practice (Prentice Hall, 1984).
Baker, G. A. & Graves-Morris, P. Padé Approximants 2nd edn. (Cambridge University Press, 1996).
Corless, R. M., Gonnet, G. H., Hare, D. E., Jeffrey, D. J. & Knuth, D. E. On the Lambert W function. Adv. Comput. Math. 5, 329–359 (1996).
Kesisoglou, I., Singh, G. & Nikolaou, M. The Lambert function should be in the engineering mathematical toolbox. Comput. Chem. Eng. 148, 107259. https://doi.org/10.1016/j.compchemeng.2021.107259 (2021).
Obadia, T., Haneef, R. & Boëlle, P.-Y. The R0 package: A toolbox to estimate reproduction numbers for epidemic outbreaks. BMC Med. Inform. Decis. Mak. 12, 147. https://doi.org/10.1186/1472694712147 (2012).
Centers for Disease Control and Prevention—COVID-19 Response. COVID-19 Case Surveillance Public Data Access, Summary, and Limitations (version date: October 31, 2020). (2020).
Blackwood, J. C. & Childs, L. M. An introduction to compartmental modeling for the budding infectious disease modeler. Lett. Biomath. 5, 195–221. https://doi.org/10.1080/23737867.2018.1509026 (2018).
Anderson, R. M., Anderson, B. & May, R. M. Infectious Diseases of Humans: Dynamics and Control (Oxford University Press, 1992).
Zhang, J. et al. Evolving epidemiology and transmission dynamics of coronavirus disease 2019 outside Hubei province, China: A descriptive and modelling study. Lancet. Infect. Dis 20, 793–802. https://doi.org/10.1016/S14733099(20)302309 (2020).
Institute for Health Metrics and Evaluation (IHME). COVID-19 Projections Assuming Full Social Distancing Through May 2020 (2020).
Kucharski, A. J. et al. Early dynamics of transmission and control of COVID-19: A mathematical modelling study. Lancet Infect. Dis. https://doi.org/10.1016/S14733099(20)301444 (2020).
Nikolaou, M. Using feedback on symptomatic infections to contain the coronavirus epidemic: Insight from a SPIR model. medRxiv https://doi.org/10.1101/2020.04.14.20065698 (2020).
Anderson, R., Medley, G., May, R. & Johnson, A. A preliminary study of the transmission dynamics of the human immunodeficiency virus (HIV), the causative agent of AIDS. IMA J. Math. Appl. Med. Biol. 3, 229–263. https://doi.org/10.1093/imammb/3.4.229 (1986).
Nikolaou, M. Ziegler and Nichols meet Kermack and McKendrick: Parsimony in dynamic models for epidemiology. Comput. Chem. Eng. 157, 107615. https://doi.org/10.1016/j.compchemeng.2021.107615 (2022).
Jalali, M. S., DiGennaro, C. & Sridhar, D. Transparency assessment of COVID-19 models. Lancet Glob. Health 8, e1459–e1460. https://doi.org/10.1016/S2214109X(20)304472 (2020).
Sills, J. et al. Call for transparency of COVID-19 models. Science 368, 482–483. https://doi.org/10.1126/science.abb8637 (2020).
Acknowledgement
The author gratefully acknowledges constructive comments and encouragement by Professors Peter Vekilov and Navin Varadarajan during early stages of the manuscript.
Funding
The Institute of Allergy and Infectious Diseases of the National Institutes of Health under award (grant number R01AI140287) partially supported the research reported in this publication. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Funding sources had no involvement in study design; in the collection, analysis and interpretation of data, nor in the writing of the report nor in the decision to submit the article for publication.
Author information
Contributions
M.N. is solely responsible for all activity associated with creation and submission of this manuscript.
Ethics declarations
Competing interests
The author declares no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Nikolaou, M. Revisiting the standard for modeling the spread of infectious diseases. Sci Rep 12, 7077 (2022). https://doi.org/10.1038/s41598022101850
DOI: https://doi.org/10.1038/s41598022101850