## Abstract

The number of confirmed COVID-19 cases reached over 1.3 million in Ontario, Canada by June 4, 2022. The continued spread of the virus underlying COVID-19 has been spurred by the emergence of variants since the initial outbreak in December, 2019. Much attention has thus been devoted to tracking and modelling the transmission of COVID-19. Compartmental models are commonly used to mimic epidemic transmission mechanisms and are easy to understand. Their performance in real-world settings, however, needs to be more thoroughly assessed. In this comparative study, we examine five compartmental models—four existing ones and an extended model that we propose—and analyze their ability to describe COVID-19 transmission in Ontario from January 2022 to June 2022.

## Introduction

Humans have faced severe infectious diseases throughout history, some of which have been classified as worldwide pandemics^{1}, including the Spanish flu in 1918 and the Hong Kong flu (H3N2) in 1968. The most recent example is the spread of the coronavirus disease 2019 (COVID-19). COVID-19 is the infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and the first case was detected in the Wholesale Seafood Market in Wuhan City, Hubei province, China, on December 3, 2019^{2}. The disease then spread all over the world, such that in March 2020 the World Health Organization (WHO) declared COVID-19 to be a worldwide pandemic. Since then, COVID-19 and its associated public health policies have had serious impacts on human physical and mental health in many regions of the world. Up to June 2022, the cumulative number of confirmed COVID-19 cases worldwide reached 530 million, and case counts continue to increase rapidly due to the spread of variants.

Mutations in the virus underlying COVID-19 have led to a number of variants of concern, including Alpha (B.1.1.7), Beta (B.1.351), Gamma (P.1), Delta (B.1.617.2), and Omicron (B.1.1.529)^{3}. Omicron was the most recently detected variant in Ontario, Canada, first identified in a traveller by the Public Health Ontario laboratory on November 22, 2021^{4}. As of January 20, 2022, Omicron (and its subvariants) has become dominant and represents the majority of infections in Ontario.

Mathematical models are widely used to describe the evolution of epidemics. In particular, compartmental models are one of the most popular classes of models in epidemiology to mimic the transmission dynamics. They have played an instrumental role in tracking epidemiological trends, generating predictions, and informing decisions of policy-makers. For example, in May 2020 the British Columbia government released a management strategy for COVID-19 that heavily relied on the results of a fitted dynamic compartmental model^{5}. Their model predicted the number of people who would require critical care under different levels of social contacts, which in turn informed the level of lockdown restrictions implemented by the government to protect the health system from being overwhelmed.

Compartmental models divide the total population into a number of different compartments; the flow of the population through these compartments is then usually modelled via a system of differential equations. Names of compartmental models are usually given in acronym form, by taking the first letter of each compartment and arranging the letters according to how the population tends to flow through the different compartments^{6}. The earliest and simplest compartmental model was the SIR model^{7}. The SIR model consists of three compartments that divide the total population into susceptible (S), infected (I) and recovered (R) individuals, along with a system of three differential equations that describes the flow rate in and out of each compartment. Since then, researchers have devoted much attention towards developing extensions to the basic SIR model.

The choice of compartments, flow directions, and parameters included in these models depends mainly on the characteristics of the disease. We briefly overview the variety of compartmental models that have been used to describe the transmission of COVID-19. Starting from the SIR model^{8}, the SEIR^{9}, SEIRD^{10}, and SMEIHRDV^{11} are examples of models that divide the overall population into finer compartments. Other compartmental models stratify the population into different groups to describe the transmission dynamics of COVID-19 in a more targeted way: e.g., stratification by age groups, such as the young, adults and seniors^{12}; profession stratification, such as healthcare workers and others^{13}; gender stratification^{14}. Some researchers combine these two approaches by setting up compartments within strata. A SEAPIR model^{15}, which was used to model Omicron cases in British Columbia^{16}, stratified the susceptible population by vaccination status, i.e., vaccinated and unvaccinated. Moreover, it added an asymptomatic infection compartment to account for the high asymptomatic carriage rate of the Omicron variant^{17}. Finally, a \(\mathrm {SV^2(AIR)^3}\) model^{18} not only considered asymptomatic infections and vaccination status, but also added the impact of policy measures and competition between different variants.

For illustration, this paper also develops our proposed extension to the SEAPIR model. As in the SEAPIR model, we stratify the population by vaccination status and include compartments for asymptomatic (A) and pre-symptomatic (P) infections. We incorporate a time-dependent function that aggregates the policy measures that the government has imposed to reduce COVID-19 transmission. In addition, we introduce a new compartment (Q) for individuals in self-isolation, and consider the interaction between groups with different vaccination statuses in the disease transmission stage.

With the plethora of models available, the choice of a suitable one to describe epidemic transmission is therefore an important consideration. Simple models (e.g., SIR) rely on few parameters and assumptions, and thus tend to provide an oversimplified representation of reality. In contrast, a complex model will often have been designed to provide a more comprehensive description of transmission dynamics and population behaviour. However, a complex model requires a larger number of unknown parameters, which can significantly impact its performance. Unknown parameters either need to be calibrated or have their values assumed—the former can increase the variance and uncertainty of the model predictions, while the latter can introduce significant bias.

Conventional methods for parameter calibration, such as non-linear least squares (NLS) and maximum likelihood, may fail to adequately capture the uncertainties of the calibrated parameters. Their calibration results are largely dependent on the stability of the known parameters, such as recovery rate and disease incubation rate, which are often borrowed from the existing literature. Moreover, for the NLS method, the global optimum can be difficult to find when the parameter space is large, which may result in misleading inferences. Therefore, besides using NLS for our proposed model, we also adopt a Bayesian approach to inference and apply Markov Chain Monte Carlo (MCMC) methods for parameter calibration. A Bayesian framework allows us to incorporate prior information and coherently accounts for the uncertainty of the parameters via their posterior distributions; e.g., parameters that are not well-calibrated from data will tend to have wide credible intervals.

As a specific case study, this paper focuses on modeling confirmed COVID-19 infections in Ontario from January 2022 to June 2022. This task is potentially more challenging with Omicron’s prevalence, compared to the original wild-type strain. First, there are the effects of vaccination. As COVID-19 vaccines have become widely available in Ontario, most of the population of Ontario has taken a complete dose of vaccination (i.e., fully vaccinated with one or two doses of a Health Canada authorized COVID-19 vaccine), and furthermore, some have also taken a booster dose (i.e., fully vaccinated plus one additional booster dose). However, individuals who are vaccinated or have recovered from COVID-19 in the past are still likely to be infected: vaccine effectiveness against the Omicron variant exhibits a continuous and consistent decrease after injection, and vaccination provides more limited protection against symptomatic disease caused by the Omicron variant^{19}. As a result, national re-infection associated with Omicron emergence was observed in South Africa^{20}, the United States^{21}, and Canada^{22}. Second, the limited availability of COVID-19 testing in Ontario, especially as case loads increased due to Omicron’s highly transmissible nature, hinders estimation of true infection and re-infection rates. Third, Omicron is thought to have higher rates of asymptomatic infection^{17}, and thus detection is more elusive. Fourth, Ontario moved through a series of reopening phases during this period^{23}, which has impacts on the social behavior of the population.

To the best of our knowledge, few studies have investigated whether simple models perform worse or better than complex models for describing the recent transmission dynamics of COVID-19 in Ontario. Therefore, a comparative study between models can help address this research question. This paper considers five different models: SIR^{8}, vaccination-stratified SIR^{24}, SEIRD^{25}, \(\mathrm {SV^2(AIR)^3}\)^{18} and our SEAPIR-extended model. We calibrate their adjustable parameters according to their proposed methods and evaluate their fits to Ontario’s confirmed daily case counts. By examining their performance in this real-world setting, we gain insight into the relative strengths and shortcomings of compartmental models that range from simple to complex.

## Methods

### Data description

The COVID-19 data used in this paper are obtained from Public Health Ontario. We investigate the daily confirmed COVID-19 cases from January 6 to June 4, 2022, which spans five reopening phases as determined by the Ontario government^{23}. The first phase is from January 6 to January 30, when the province returned to a modified Step 2 of the reopening plan^{26} with restrictions on social activities. The second phase is from January 31 to February 16, when the Ontario government began the process of gradually easing restrictions while maintaining protective measures^{27}. The third phase is from February 17 to February 28, when the Ontario government further eased public health measures^{23}. The fourth phase is from March 1 to March 20, when the proof of vaccination requirement was lifted for all settings^{27}. The data ends with a portion of the fifth phase from March 21 to June 4, which corresponds to the time when the Ontario government scrapped most mask mandates^{28}. Daily vaccination counts were also available for this investigated time period.

As the Ontario government moved from one reopening phase to the next, the restrictions on indoor and outdoor public activities were relaxed, which led to out-of-home mobility increasing over this time period. These phase-to-phase changes in restrictions may be quantified via the ‘Oxford Stringency Index’ (denoted by \(\lambda (t)\)), which is an aggregate value (ranging from 0 to 100\(\%\)) that quantifies the “overall impact of policy measures on workplace closures, school closures, travel bans, and vaccination requirements”^{29}.

The daily COVID-19 cases stratified by vaccination status shown in Fig. 1 provide a more detailed look at the data^{30}. We note that the Ontario government changed its stratification rules for reporting cases during the investigated time period. From January 6 to March 10 (before the dashed line in Fig. 1), the Ontario government used three strata for reporting infections: unvaccinated, partially vaccinated, and fully vaccinated (which includes infections among both those with a completed primary series and those with an additional booster dose). From March 11 to June 4 (after the dashed line in Fig. 1), the Ontario government changed the compositions of the three strata it used for reporting infections: not fully vaccinated (which included partially vaccinated and unvaccinated), completely vaccinated, and vaccinated with booster dose. Table 1 shows these two different stratification rules in detail.

### Model descriptions

In this section, the SIR^{8}, vaccination-stratified SIR^{24}, SEIRD^{25}, \(\mathrm {SV^2(AIR)^3}\)^{18}, and a new model that we call vaccination-stratified SEPAIQRD, are introduced.

#### SIR model

Among others, Cooper et al.^{8} used the basic SIR model to track the spread of COVID-19. The total population size, *N*, is represented as \(N = S(t) + I(t) + R(t)\), where *S*(*t*), *I*(*t*), and *R*(*t*) respectively denote the number of susceptible, infected, and recovered individuals at time *t*. The SIR model is governed by the system of differential equations in Eq. (1):

\[\frac{dS(t)}{dt} = -\beta \frac{S(t) I(t)}{N}, \qquad \frac{dI(t)}{dt} = \beta \frac{S(t) I(t)}{N} - \gamma I(t), \qquad \frac{dR(t)}{dt} = \gamma I(t), \qquad (1)\]

where \(\gamma\) and \(\beta\) are the two parameters to be calibrated. The parameter \(\beta\), as seen in the first two equations, governs the rate at which individuals move from the *S* to the *I* compartment; it is called the disease transmission rate, representing the average number of susceptible individuals in the *S* compartment that a contagious individual in the *I* compartment infects in a day. The parameter \(\gamma\), as seen in the last two equations, governs the rate at which individuals move from the *I* to the *R* compartment; it is called the removal rate, representing the probability per day that an *I*-individual transits to *R* (which can encompass both recovered and deceased individuals). Thus, the average duration of infection under this model is \(1/\gamma\). Note that the last equation implies that *R* is an absorbing state, since individuals can no longer leave once they enter this compartment (e.g., reinfections are not possible in this model). The unknown parameters \(\beta\) and \(\gamma\) can be calibrated by minimizing the sum of squared errors (SSE) between the model-fitted daily case counts and actual case counts.
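The SSE-based calibration described above can be illustrated with a minimal, standard-library-only sketch. It assumes the frequency-dependent incidence term \(\beta S I / N\), treats the daily drop in *S* as the model-fitted daily case count, and replaces a proper optimizer with a coarse grid search; the function names, the Runge-Kutta integrator, and the grid values are illustrative assumptions rather than the procedure used in the original studies.

```python
from typing import List, Sequence, Tuple

State = Tuple[float, float, float]  # (S, I, R)


def sir_rhs(state: State, beta: float, gamma: float) -> State:
    """Time derivatives (dS/dt, dI/dt, dR/dt), assuming beta*S*I/N incidence."""
    s, i, r = state
    n = s + i + r
    infections = beta * s * i / n  # flow S -> I
    removals = gamma * i           # flow I -> R
    return (-infections, infections - removals, removals)


def rk4_step(state: State, beta: float, gamma: float, dt: float) -> State:
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def nudge(slope: State, h: float) -> State:
        return tuple(x + h * k for x, k in zip(state, slope))

    k1 = sir_rhs(state, beta, gamma)
    k2 = sir_rhs(nudge(k1, dt / 2), beta, gamma)
    k3 = sir_rhs(nudge(k2, dt / 2), beta, gamma)
    k4 = sir_rhs(nudge(k3, dt), beta, gamma)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def daily_new_cases(initial: State, beta: float, gamma: float,
                    days: int, steps_per_day: int = 10) -> List[float]:
    """Model-fitted daily case counts, taken as the drop in S over each day."""
    state, dt, cases = initial, 1.0 / steps_per_day, []
    for _ in range(days):
        s_start = state[0]
        for _ in range(steps_per_day):
            state = rk4_step(state, beta, gamma, dt)
        cases.append(s_start - state[0])
    return cases


def sse(fitted: Sequence[float], observed: Sequence[float]) -> float:
    """Sum of squared errors between fitted and observed daily counts."""
    return sum((f - o) ** 2 for f, o in zip(fitted, observed))


def calibrate_on_grid(observed: Sequence[float], initial: State,
                      betas: Sequence[float], gammas: Sequence[float]):
    """Return the (beta, gamma) pair on the grid with the smallest SSE."""
    candidates = [(b, g) for b in betas for g in gammas]
    return min(
        candidates,
        key=lambda p: sse(daily_new_cases(initial, p[0], p[1], len(observed)),
                          observed),
    )
```

In practice a continuous optimizer would replace the grid search; the grid is used here only to keep the sketch dependency-free and easy to check on synthetic data generated from known parameter values.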

#### Vaccination-stratified SIR model

Fisman et al.^{24} proposed a modified SIR model by stratifying the population into vaccinated and unvaccinated groups. They also considered the impact of interaction (or mixing) between the vaccinated and unvaccinated sub-populations on COVID-19 transmission, by introducing the parameters \(f_{ij}\) for the fraction of contacts among individuals in the i-th group [i.e., vaccinated or unvaccinated, denoted respectively by *V* and *U* in Eq. (2)] with those in the j-th group. Immunity from vaccination (when effective) is assumed to be permanent. The parameters \(\gamma\) and \(\beta\) have the same interpretation as in the basic SIR model.

The system of differential equations governing this model is shown in Eq. (2):

where \(N_i= S_i(t) + I_i(t) + R_i(t)\) is the subpopulation size of the i-th group.

The authors mainly obtained their parameters from other literature and numerically solved the differential equations with predetermined initial conditions. Their modeling approach does not involve parameter calibration from data, and for simplicity their chosen parameter values are also adopted in our implementation of the model. In principle, a procedure for better fitting the model to data could be developed, to handle situations where their original assumptions do not hold.

#### SEIRD model

Melo^{25} proposed an SEIRD model to provide a fuller description of COVID-19 progression, which divides the population into finer compartments. Susceptible individuals *S* first move to the exposed compartment *E* with disease transmission rate \(\beta\), rather than directly moving to *I*. After the disease incubation period (an average of \(1/\gamma\) days), exposed individuals will transit into the *I* compartment. Infected individuals will then either move to the recovered *R* compartment (with a rate of \(\mu\)) or the dead *D* compartment (with a rate of \(\rho\)). Their governing system of differential equations is shown in Eq. (3):

As in the SIR model, the unknown parameters \(\beta , \gamma , \rho ,\mu\) can be calibrated by minimizing the SSE between the model-fitted daily confirmed case counts and the actual ones.
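For concreteness, one standard way to write an SEIRD system that is consistent with the flows described above is sketched below. The frequency-dependent incidence term \(\beta S I / N\) and the convention \(N = S + E + I + R + D\) are assumptions carried over from the basic SIR form; the exact expressions used in the original study may differ.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{S I}{N}, \\
\frac{dE}{dt} &= \beta \frac{S I}{N} - \gamma E, \\
\frac{dI}{dt} &= \gamma E - (\mu + \rho) I, \\
\frac{dR}{dt} &= \mu I, \\
\frac{dD}{dt} &= \rho I .
\end{aligned}
```

Under this form, an individual spends an average of \(1/(\mu + \rho)\) days in the *I* compartment, and a fraction \(\rho /(\mu + \rho)\) of infected individuals eventually moves to *D*.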

#### \(\mathrm {SV^2(AIR)^3}\) model

Layton and Sadria^{18} introduced the \(\mathrm {SV^2(AIR)^3}\) model to provide a more comprehensive description of COVID-19 epidemic progression in Ontario. The authors included additional parameters that measured the impact of waning immunity, vaccine effectiveness, and policy measures that restrict public activities. The quantitative values of measuring the policy strictness are equivalent to the previously introduced Oxford Stringency Index. The compartmental setup considered two vaccine types (hence \(V^2\)), asymptomatic infections (A), and competition among the three main variant types as of Fall 2021, i.e., wild, Alpha, and Delta. The authors also modeled the potential spread of a hypothetical new-emerging variant.

The model parameters that describe the clinical characteristics of the COVID-19 variants are obtained from published studies. Other parameters related to the demographics and social behaviours of the Ontario population are obtained from published provincial statistics. In total, the model has 69 parameters. To calibrate the model, we update nine parameters pertaining to their new-emerging variant to mimic the characteristics of the actual Omicron variant, including higher values for Omicron’s transmission rate and fraction of asymptomatic infection. We also use the actual values of the Oxford Stringency Index during our investigated period, which serve as scaling factors in the model. Tables S1 and S2 in the Supplementary Information respectively show the model parameters with respect to wild-type, Alpha-type, Delta-type, and our updated Omicron-type variants.

#### Vaccination-stratified SEPAIQRD model

We introduce an extension of the SEAPIR model to describe the dynamic mechanisms of COVID-19 transmission in Ontario over the investigated period, which we call the vaccination-stratified SEPAIQRD model. A summary of its key features is as follows: more compartments are added to reflect the situation in Ontario; the population is stratified by the four vaccination statuses as defined by Ontario; migration between susceptible compartments of the different vaccination statuses occurs, according to the daily reported vaccination counts; time-varying parameters describe testing efficacy and asymptomatic infections.

The COVID-19 data released by Public Health Ontario of confirmed COVID-19 cases are split into three strata up to and including March 10, as presented in the Data Description section. During this period, the ‘fully vaccinated’ infections counted both completely vaccinated and vaccinated with booster dose infections. For simplicity, we further split these ‘fully vaccinated’ infections according to the daily-updated proportion of completely vaccinated and vaccinated with booster dose populations in Ontario. After March 10, we further split the ‘not fully vaccinated’ infections according to the daily-updated proportion of unvaccinated population and partially vaccinated population in Ontario. Figure S1 in the Supplementary Information plots the case counts stratified by the four vaccination statuses after this processing step. In the following description, the subscript index *i* for \(i=1,2,3,4\) will respectively denote the unvaccinated, partially vaccinated, completely vaccinated, and vaccinated with booster dose populations. We let \(N_i\) denote the size for each of these populations. The flow of susceptible individuals with these four vaccination statuses will be tracked in the model using parallel compartments.

Susceptible individuals (in the ‘\(S_i\)’ compartment) can move to the exposed compartment (denoted as ‘\(E_i\)’) when in contact with contagious individuals. We let \(\beta _{k,ij}\) denote the transmission rate from the contagious compartment *k* in the i-th group to the susceptible individuals in the j-th group. The construction of the disease transmission matrix that governs such interactions follows a previous approach^{12} and is described in Section B.2 of the Supplementary Information, with Table S3 showing the contact matrix given different vaccination statuses. Tables S4 and S5 in the Supplementary Information show the transmission matrix of different contagious compartments. These disease transmission rates will be scaled multiplicatively by \(1-\lambda (t)\), which quantifies the impact of policy measures via the Oxford Stringency Index.
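The multiplicative scaling by \(1-\lambda (t)\) can be sketched as follows, using the reopening phase dates given in the Data description section. The function names are hypothetical, and the stringency argument is assumed to be expressed as a fraction in [0, 1] rather than a percentage; no actual Oxford Stringency Index values are encoded here.

```python
from bisect import bisect_right
from datetime import date

# Start dates of the five reopening phases in the investigated period.
PHASE_STARTS = [
    date(2022, 1, 6),
    date(2022, 1, 31),
    date(2022, 2, 17),
    date(2022, 3, 1),
    date(2022, 3, 21),
]
STUDY_END = date(2022, 6, 4)


def phase_index(day: date) -> int:
    """Return the reopening phase (1 to 5) that contains the given day."""
    if not PHASE_STARTS[0] <= day <= STUDY_END:
        raise ValueError("day lies outside the investigated period")
    # Number of phase start dates that are on or before `day`.
    return bisect_right(PHASE_STARTS, day)


def scaled_transmission_rate(beta: float, stringency: float) -> float:
    """Scale a transmission rate by (1 - lambda(t)), with lambda(t) in [0, 1]."""
    if not 0.0 <= stringency <= 1.0:
        raise ValueError("stringency must be a fraction between 0 and 1")
    return beta * (1.0 - stringency)
```

For example, a stringency of 0.4 reduces a baseline rate of 0.5 to 0.3, and the same scaling would be applied to every entry \(\beta _{k,ij}\) of the transmission matrices.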

After exposure, asymptomatic individuals are assumed to follow the flow \(E_i \rightarrow A_i\) (asymptomatic) \(\rightarrow RA_i\) (recovered asymptomatic). Those with mild to severe symptoms follow the flow \(E_i \rightarrow P_i\) (pre-symptomatic); then after the disease incubation period, they either recover without testing (\(P_i \rightarrow R'_{i}\), e.g., mild symptoms) or are documented by the Ontario government as confirmed cases (\(P_i \rightarrow I_i\), e.g., more serious symptoms). Finally, individuals with confirmed cases follow one of three flows: \(I_i \rightarrow D_i\) (death); \(I_i \rightarrow Q_i\) (quarantined) \(\rightarrow R_i\) for those who self-isolate and then recover; \(I_i \rightarrow R_i\) for those who recover without self-isolation. These compartments and flows are all illustrated in the overall schematic of the model in Fig. 2.

The flows in Fig. 2 are governed by a number of fixed and time-varying parameters. The fixed \(\kappa\) parameters^{12,16} are various transition rates; e.g., \(\kappa _E\) governs the \(E_i \rightarrow P_i\) transition rate, with the interpretation that an individual spends an average of \(1/\kappa _E\) days in the \(E_i\) compartment. The death rate is \(\alpha _i\), whose value depends on vaccination status^{31}. The fixed \(\epsilon\) parameter is the proportion of infected individuals who comply with self-isolation after testing positive. Table S6 in the Supplementary Information lists the values of these fixed parameters. Next, the time-varying parameter \(f_i(t)\) is interpreted as the probability of asymptomatic infection; it is treated as unknown and will be calibrated from data for each vaccination status and reopening phase. Finally, the time-varying ‘case ascertainment rate’, denoted as *CAR*(*t*), is interpreted as the proportion of symptomatic infections that are documented by the Ontario government as confirmed cases; it is also treated as unknown and will be calibrated from data for each reopening phase. Note that \(T'_i\) and \(T_i\) in Fig. 2 are intermediate compartments set up so that the parameters *CAR*(*t*), \(f_i(t)\), and \(\epsilon\) can be interpreted as the proportion of flux-out from the preceding compartment. Table 2 summarizes all of the model parameters and their corresponding definitions.

The final element of the model is the flow of people who migrate between vaccination statuses during the investigated period. We let \(V_1\), \(V_2\), and \(V_3\) respectively denote the number of individuals taking first, second, and booster vaccine doses, represented as daily counts reported by the Ontario government. These daily counts of individuals who get vaccinated govern the flows \(S_1 \rightarrow S_2\), \(S_2 \rightarrow S_3\), and \(S_3 \rightarrow S_4\), as indicated in Fig. 2. The corresponding population sizes \(N_i\) are also updated daily based on these counts.

The overall model incorporates certain assumptions, which we now state explicitly. We assume that the cases reported by Public Health Ontario units are symptomatic cases. We expect this assumption to be reasonable, since the Ontario government decided to “limit eligibility for publicly funded PCR tests to high-risk individuals who are symptomatic beginning from December 31 (2021)”^{32}. This policy was maintained throughout the investigated time period. We assume that asymptomatic and mild cases are not tested and therefore do not self-isolate. We assume that the rate at which the Ontario government documents confirmed cases and the asymptomatic infection rate are both constant within each phase, but are allowed to change between phases. Moreover, the model allows the asymptomatic infection rate, which will be calibrated from data, to differ by vaccination status and by reopening phase.

The full system of differential equations, which corresponds to Fig. 2 and incorporates the above considerations, is provided in Section B.4 of the Supplementary Information.

### Parameter calibration for vaccination-stratified SEPAIQRD model

Our model has two sets of unknown phase-dependent parameters that need to be calibrated based on data, namely \(f_i(t)\) and *CAR*(*t*). Given initial conditions for each compartment and a set of values for the phase-dependent \(f_i(t)\) and *CAR*(*t*), running the numerical ODE solver produces deterministic trajectories of each compartment. Using the hat symbol to denote the numerical solution, the number of new daily confirmed cases in the i-th vaccination status on the t-th day can be expressed as \(\Delta \widehat{I_i}(t) := \widehat{I_i}(t)+\widehat{R_i}(t)+\widehat{D_i}(t)+\widehat{Q_i}(t)-(\widehat{I_i}(t-1)+\widehat{R_i}(t-1)+\widehat{D_i}(t-1)+\widehat{Q_i}(t-1))\). We denote the actual daily confirmed case counts on the t-th day as \(\Delta I_i(t)\).
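The differencing step in this definition can be sketched in a few lines. The idea is that everyone who has ever been a confirmed case sits in one of the *I*, *R*, *D*, or *Q* compartments, so the day-over-day change in their sum gives the new confirmed cases. The function name and the list-based trajectory format are illustrative assumptions.

```python
from typing import List, Sequence


def daily_confirmed_cases(i: Sequence[float], r: Sequence[float],
                          d: Sequence[float], q: Sequence[float]) -> List[float]:
    """New confirmed cases per day for one vaccination status.

    Each input is a compartment trajectory sampled once per day. The result
    has one fewer entry than the inputs, because it is a first difference of
    the cumulative confirmed total I + R + D + Q.
    """
    cumulative = [a + b + c + e for a, b, c, e in zip(i, r, d, q)]
    return [cumulative[t] - cumulative[t - 1] for t in range(1, len(cumulative))]
```

For instance, trajectories with cumulative totals of 10, 17, and 25 over three days yield daily counts of 7 and 8.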

To fit the model using a basic method that is similar to the other models considered, we may calibrate the values of \(f_i(t)\) and *CAR*(*t*) by minimizing the SSE between \(\Delta \widehat{I_i}(t)\) and \(\Delta I_i(t)\), where \(f_i(t)\) and *CAR*(*t*) may vary by reopening phase. Thus equivalently, we may denote the parameters to calibrate as \(f_{i,j}\) and \(CAR_j\), for vaccination status \(i \in \{ 1,2,3,4\}\) and reopening phase \(j \in \{1,2,3,4,5\}\). We refer to this method as the NLS model fit in the subsequent results.

It can be advantageous to take a Bayesian approach to inference and apply MCMC methods for parameter calibration, and this is the main method we consider for fitting our model. Briefly, the key idea of Bayesian inference is to combine prior information or beliefs concerning the unknown parameters with the likelihood of the observed data, to generate a posterior distribution for the unknown parameters. Mathematically, for parameters \(\theta\) and data *y*, along with the prior distribution \(\pi (\theta )\) and the likelihood function \(P(y|\theta )\), Bayes' theorem gives the posterior distribution \(P(\theta |y)\) (up to a multiplicative constant) as in Eq. (4):

\[P(\theta |y) \propto P(y|\theta )\,\pi (\theta ). \qquad (4)\]

When closed-form analysis of the posterior distribution is not possible, MCMC methods are often used to generate samples from \(P(\theta |y)\). To facilitate MCMC sampling, we convert *CAR*(*t*) and \(f_i(t)\) from their [0, 1] scale to the real numbers via a logit transformation. We let *CAR*(*t*) and \(f_i(t)\) on the logit scale be denoted as \(\mathscr {L}(CAR(t))\) and \(\mathscr {L}(f_i(t))\), where \(\mathscr {L}(x) = \log (\frac{x}{1-x})\).
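The logit transformation and its inverse can be sketched as below. The inverse is written in a numerically stable form so that extreme values on the unconstrained scale map back to [0, 1] without overflow; the function names are illustrative.

```python
import math


def logit(x: float) -> float:
    """Map a proportion in (0, 1) to the real line: log(x / (1 - x))."""
    if not 0.0 < x < 1.0:
        raise ValueError("logit is only defined for 0 < x < 1")
    return math.log(x / (1.0 - x))


def inv_logit(z: float) -> float:
    """Map a real number back to the interval [0, 1] (inverse of logit)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, exp(z) cannot overflow.
    ez = math.exp(z)
    return ez / (1.0 + ez)
```

A sampler can then explore \(\mathscr {L}(CAR(t))\) and \(\mathscr {L}(f_i(t))\) without boundary constraints, and each draw is mapped back to a proportion with the inverse transformation.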

Here, the likelihood provides the probabilistic link between the ODE solution \(\Delta \widehat{I_i}(t)\) and the observed data \(\Delta I_i(t)\). A model for the daily case counts is therefore needed; the Poisson and negative binomial are common probability distributions used to model count data^{38}. As the variance in daily case counts tends to exceed the mean (i.e., overdispersion)^{39,40}, we adopt a negative binomial model for the likelihood as shown in Eq. (5):

where \(\phi_i(t)\) is the phase-dependent parameter that accounts for overdispersion in the i-th group. Conditional on the ODE solution, we assume that the negative binomial observations are independent across different days^{40}.
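As an illustration of how such a likelihood can be evaluated, the sketch below uses the mean and dispersion parameterization of the negative binomial, in which the variance equals \(\mu + \mu ^2/\phi\). This parameterization (the one used by Stan's `neg_binomial_2`) is an assumption here, as are the function names; the model-fitted mean must be strictly positive for the logarithms to be defined.

```python
import math
from typing import Sequence


def neg_binomial_logpmf(y: int, mu: float, phi: float) -> float:
    """Log-probability of count y with mean mu > 0 and dispersion phi > 0.

    Smaller phi means more overdispersion; as phi grows, the distribution
    approaches a Poisson with mean mu.
    """
    return (
        math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
        + phi * math.log(phi / (phi + mu))
        + y * math.log(mu / (phi + mu))
    )


def log_likelihood(observed: Sequence[int], fitted: Sequence[float],
                   phi: float) -> float:
    """Sum of daily log-probabilities, assuming independence across days."""
    return sum(neg_binomial_logpmf(y, m, phi) for y, m in zip(observed, fitted))
```

Here `fitted` plays the role of the ODE-based counts \(\Delta \widehat{I_i}(t)\) and `observed` the reported counts \(\Delta I_i(t)\) for one vaccination status within one phase.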

To complete the Bayesian model, prior distributions need to be specified for the unknown phase-dependent parameters \(\mathscr {L}(f_i(t))\), \(\mathscr {L}(CAR(t))\), and the overdispersion parameter \(\phi _i(t)\). These are chosen to be weakly informative: we use them to encode some *a priori* beliefs and knowledge, while letting the data likelihood be the main contributor to the posterior. First, we believe *a priori* that vaccinated individuals are more likely to be asymptomatically infected, as it is commonly accepted that mild or severe COVID-19 symptoms are reduced by vaccinations and boosters^{41}. Second, the immunity conferred by vaccination wanes over time, especially against Omicron^{19}; with most booster doses in Ontario being administered in early 2022, their overall effect is expected to wane over the investigated period. These considerations are encoded by ordering the prior means for \(\mathscr {L}(f_i(t))\) according to vaccination status and phase. Third, the daily case counts are highest in January and decrease over our investigated period, and the Ontario government may more readily document cases when daily infections are lower. Thus, the prior means for \(\mathscr {L}(CAR(t))\) are set to increase with reopening phase. Due to our uncertainties about these unknown parameters, a relatively large standard deviation of the priors is chosen, so that the posterior distributions will be primarily informed by the data. The full list of priors is shown in Table S7 in the Supplementary Information for \(\mathscr {L}(f_{i,j})\), \(\mathscr {L}(CAR_j)\), and \((\phi _{i,j})^{-1}\), for vaccination status \(i \in \{ 1,2,3,4\}\) and reopening phase \(j \in \{1,2,3,4,5\}\). As seen in Fig. 6, the prior densities (blue) chosen are relatively flat, indicating they encode some knowledge without strongly contributing to the posterior.

Multiplying the likelihood and prior (Eq. 4) yields the posterior distribution of the unknown parameters \(CAR_j,\phi _{i,j},f_{i,j}\) in Eq. (6):

where \(\varvec{CAR}, \varvec{\phi }, \varvec{f},\) and \(\Delta \varvec{I}\) are the concatenated vector forms of \(CAR_j, \phi_{i,j}, f_{i,j},\) and \(\Delta I_i(t)\), with \(i \in \left\{ 1,2,3,4\right\} , j \in \left\{ 1,2,3,4,5\right\}\).

To obtain samples from the posterior distribution of the parameters, we use Stan 2.21.5^{42} and R 4.1.1, running 2000 MCMC iterations and four chains. Let \(\Delta \varvec{I}\) be the concatenated vector form of the daily case counts \(\Delta I_i(t)\), and \(\Delta \varvec{\widehat{I}}_{pred}\) the corresponding model estimates. The posterior distribution of \(\Delta \varvec{\widehat{I}}_{pred}\) is then approximated using the MCMC samples via Eq. (7):

where \(\varvec{\theta } = (\mathscr {L}(\varvec{CAR}),(\varvec{\phi })^{-1},\mathscr {L}(\varvec{f}))\) and \(\varvec{\theta _{(1)}}, \ldots , \varvec{\theta _{(K)}}\) are the MCMC samples of \(\varvec{\theta }\). We treat the posterior mean as the model-fitted daily case counts. Further, credible intervals can be easily obtained in the Bayesian framework, e.g., we take the 0.025 and 0.975 quantiles of the posterior distribution for \(\Delta \varvec{\widehat{I}}_{pred}\) (based on the MCMC samples) to form 95\(\%\) credible intervals. We refer to this Bayesian approach as the MCMC model fit in the subsequent results.
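The summary step described above, taking the posterior mean and the 0.025 and 0.975 quantiles across MCMC draws, can be sketched as follows. The input layout (`draws[k][t]` holding the predicted count on day *t* for the *k*-th posterior sample), the function names, and the linear-interpolation quantile rule are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def quantile(sorted_values: Sequence[float], q: float) -> float:
    """Quantile of pre-sorted values using linear interpolation."""
    position = q * (len(sorted_values) - 1)
    lower = int(math.floor(position))
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1.0 - weight) + sorted_values[upper] * weight


def summarize_posterior(draws: Sequence[Sequence[float]]
                        ) -> List[Tuple[float, float, float]]:
    """Per-day (posterior mean, 2.5% quantile, 97.5% quantile)."""
    n_days = len(draws[0])
    summary = []
    for t in range(n_days):
        column = sorted(sample[t] for sample in draws)
        mean = sum(column) / len(column)
        summary.append((mean, quantile(column, 0.025), quantile(column, 0.975)))
    return summary
```

The first element of each tuple corresponds to the model-fitted daily case count, and the remaining two form the bounds of a 95\(\%\) credible interval for that day.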

We note that our investigated period is divided into several reopening phases. When the Ontario government moves from one reopening phase to another, the Oxford Stringency Index immediately reflects the change, whereas the actual social behaviour in response to the change is likely to be more gradual. Jump discontinuities in the piecewise function \(\lambda (t)\) and in the phase-dependent parameters (\(CAR_j\) and \(f_{i,j}\)) would lead to corresponding jumps in the estimated case counts. Locally weighted linear regression (LOESS)^{43} is a nonparametric technique that is useful for estimating a smoothed curve, e.g., in volatile time series data. This technique has been used to smooth SIR model predictions of the total number of deaths, in the presence of a time-varying case ascertainment rate^{44}. Here, we also apply LOESS with automatic bandwidth and span selection to our fitted daily case counts (for both NLS and MCMC fits) and credible boundaries (for the MCMC fit). An alternative technique would be to interpolate \(\lambda (t)\) with a smooth linear function over a 1-week period after the start of a new phase^{40}; however, it is less applicable here due to the presence of the other phase-dependent parameters (\(CAR_j,f_{i,j}\)).
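To make the smoothing step concrete, the following is a minimal sketch of locally weighted linear regression with tricube weights. It uses a fixed, user-supplied span instead of the automatic bandwidth and span selection mentioned above, and it omits robustness iterations; the function name and default span are illustrative assumptions.

```python
import math
from typing import List, Sequence


def loess(x: Sequence[float], y: Sequence[float], span: float = 0.3) -> List[float]:
    """Smooth y at each x by a local weighted straight-line fit.

    `span` is the fraction of points that define the local neighbourhood.
    """
    n = len(x)
    k = max(2, int(math.ceil(span * n)))
    smoothed = []
    for x0 in x:
        # Bandwidth: distance from x0 to its k-th nearest point (itself included).
        distances = sorted(abs(xi - x0) for xi in x)
        h = distances[k - 1] or 1.0
        # Tricube weights, zero at and beyond the bandwidth.
        w = [(1.0 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        u = [xi - x0 for xi in x]  # centre so the intercept is the fit at x0
        sw = sum(w)
        swu = sum(wi * ui for wi, ui in zip(w, u))
        swuu = sum(wi * ui * ui for wi, ui in zip(w, u))
        swy = sum(wi * yi for wi, yi in zip(w, y))
        swuy = sum(wi * ui * yi for wi, ui, yi in zip(w, u, y))
        denom = sw * swuu - swu * swu
        if abs(denom) < 1e-12:
            smoothed.append(swy / sw)  # fall back to a weighted average
        else:
            smoothed.append((swuu * swy - swu * swuy) / denom)
    return smoothed
```

Because each local fit is a straight line, exactly linear input is returned unchanged, while jumps at phase boundaries are spread over the neighbouring days.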

### Initial conditions for the models

The total population size is a key input to each of the compartmental models described. We set *N* = 14,051,980, according to the total population in Ontario as recorded in the Canadian census^{45}. The true initial conditions for the different compartments are generally unknown, so an estimation procedure is needed. The number of active infections on a given day (e.g., corresponding to compartment *I*) is estimated by the number of hospitalized patients with COVID-19 divided by the 1.9% hospitalization rate of COVID-19^{46}. As of January 5, 2022, the number of confirmed COVID-19 cases in Ontario was approximately 840,000^{47}. However, due to the high re-infection rate of the Omicron variant, previously infected individuals were still likely to be susceptible. Thus, for simplicity, we only used confirmed case counts in early January to set initial conditions for the remaining compartments. Specifically, with Omicron having an average recovery time of five days, we used the case count on January 1 to set the initial size of the recovered compartment. Since 840,000 represents only about 6% of the Ontario population, this modeling choice will have only a small effect on the total number of remaining susceptible individuals, and thus on the compartmental model dynamics. Similar reasoning is used to obtain initial conditions for the quarantined, exposed, and pre-symptomatic compartments (where applicable to the model), based on the infection counts recorded on January 5, January 11 and January 8, respectively.
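The arithmetic behind these initial conditions can be sketched as follows. The population size, the 1.9% hospitalization rate, and the 840,000 cumulative cases are taken from the text. The hospital census and the recovered count are hypothetical placeholders, and the function and variable names are introduced only for this example.

```python
N = 14_051_980      # Ontario population used in the models
HOSP_RATE = 0.019   # hospitalization rate quoted in the text


def initial_infected(hospitalized):
    """Estimate active infections from the number of hospitalized patients."""
    return hospitalized / HOSP_RATE


hospitalized_example = 2_000   # hypothetical hospital census on the start date
recovered_example = 15_000     # hypothetical case count five days earlier

I0 = initial_infected(hospitalized_example)
R0_count = float(recovered_example)   # initial size of the recovered compartment
S0 = N - I0 - R0_count                # remaining susceptible population (SIR case)

# Share of the population with a confirmed infection before the study period.
share_confirmed = 840_000 / N
```

With these placeholder inputs, about 105,000 active infections are implied, and `share_confirmed` is just under 0.06, matching the 6% figure quoted above.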

## Results and discussion

This section presents the results of our comparative study, where the five compartmental models are fitted to the Ontario COVID-19 data over our investigated period. After calibrating the relevant parameters in each model to these data as described in the “Methods” section, the numerical solution of each model is computed from January 6 to June 4, 2022. These model trajectories are compared to the actual daily case counts to assess their goodness-of-fit to the data. For models that encode fewer vaccination statuses than the stratification rules for case counts used by the Ontario government, the proportion of sub-populations with different vaccination statuses will be used to allocate the estimated case counts. This is to ensure that all of the model fits can be fairly compared to the ground truth provided by the Ontario government. Furthermore, we emphasize that the quality of fit for a given model will depend not only on the choice of its structure and compartments, but also on the choice of which of its parameters to calibrate from data and how that calibration is done. For example, we adopted a number of predetermined parameter values from the original studies for the vaccination-stratified SIR and \(\mathrm {SV^2(AIR)^3}\) models. If their methods were modified to allow some or all of these predetermined parameter values to instead be calibrated from data, the empirical performance of those models might be improved; however, such an investigation is beyond the scope of the current study.
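The allocation step for models with fewer vaccination statuses can be illustrated with a short Python sketch. The stratum labels and population sizes below are hypothetical placeholders chosen to sum to *N*; they are not the Ontario figures, and the function name is an assumption for this example.

```python
def allocate_by_population(total_cases, population_by_stratum):
    """Split an estimated case count across strata in proportion to population."""
    total_population = sum(population_by_stratum.values())
    return {
        stratum: total_cases * size / total_population
        for stratum, size in population_by_stratum.items()
    }


# Hypothetical sub-population sizes (they sum to N = 14,051,980).
population_by_stratum = {
    "unvaccinated": 2_000_000,
    "partially vaccinated": 500_000,
    "fully vaccinated": 5_000_000,
    "boosted": 6_551_980,
}

# Allocate 10,000 estimated daily cases from a model without vaccination strata.
allocated = allocate_by_population(10_000.0, population_by_stratum)
```

The allocated counts always sum to the model's total, so the comparison with stratified data does not change the overall fit of the unstratified model.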

### Assessing the model fits

We first present graphical summaries of the model fits, by overlaying the actual confirmed daily case counts on the fitted model trajectories. To account for the effect of weekends (which are associated with reduced testing activity), a centered 7-day moving average of the case counts is also shown to help visualize the case trends over time. The fits are shown according to the six strata definitions used by the Ontario government for reporting cases in Table 1. The trajectories of the calibrated SIR and SEIRD models are plotted in Fig. 3, while the trajectories of the vaccination-stratified SIR model and Omicron-calibrated \(\mathrm {SV^2(AIR)^3}\) model are plotted in Fig. 4. Finally, the trajectories of the calibrated vaccination-stratified SEPAIQRD model fitted by NLS and MCMC are plotted in Fig. 5, with the green bands representing the credible region with 95% probability under the Bayesian posterior.
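For reference, a centered 7-day moving average can be computed as in the Python sketch below. The handling of the first and last three days (left undefined here) and the synthetic weekly pattern are assumptions for illustration; the text does not state how the endpoints were treated in the figures.

```python
def centered_moving_average(values, window=7):
    """Centered moving average; days without a full window are set to None."""
    half = window // 2
    averaged = []
    for t in range(len(values)):
        if t < half or t + half >= len(values):
            averaged.append(None)
        else:
            averaged.append(sum(values[t - half:t + half + 1]) / window)
    return averaged


# Synthetic counts with a weekend dip: five weekdays at 100, two days at 60.
daily_cases = [100, 100, 100, 100, 100, 60, 60] * 3
trend = centered_moving_average(daily_cases)
```

Because every full window contains exactly one weekly cycle, the weekend dips in `daily_cases` are averaged out and `trend` is constant wherever it is defined.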

A corresponding quantitative measure of model performance can be provided by the root mean squared error (RMSE) between the model-fitted daily confirmed case counts and the actual case counts. We first compute the RMSEs for each model based on the total daily case counts (regardless of vaccination status) over the entire investigated period, as shown in Table 3. Then, we compute the RMSEs of the fitted case counts according to Ontario’s stratification rules: Table 4 shows the results for January 6 to March 10, and Table 5 shows the results for March 11 to June 4.
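The RMSE used in these tables is the square root of the mean squared difference between fitted and actual daily counts. A minimal Python sketch, with made-up numbers in the usage lines, is:

```python
import math


def rmse(fitted, actual):
    """Root mean squared error between fitted and actual daily case counts."""
    if len(fitted) != len(actual):
        raise ValueError("fitted and actual must have the same length")
    squared_errors = [(f - a) ** 2 for f, a in zip(fitted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up counts for illustration only.
example_rmse = rmse([1200.0, 900.0, 700.0], [1000.0, 950.0, 800.0])
```

Applying the same function to the total counts, or separately to each stratum and sub-period, gives quantities of the kind reported in Tables 3, 4 and 5.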

The graphical and RMSE summaries indicate that none of the compartmental models can fully capture the trends in Ontario’s COVID-19 case counts during the investigated period. Of the five models considered, the proposed vaccination-stratified SEPAIQRD provides the closest fit to the data, both for total and stratified daily case counts (lowest RMSE in each column of Tables 3, 4, 5), with the Bayesian approach providing a slightly better fit than NLS. Visually, it is the only model with estimated trajectories that can partially capture the resurgence of cases in late March, and almost all actual counts lie within the 95% credible bands in Fig. 5.

The simple SIR and SEIRD models provide similar RMSEs, performing relatively well among the models considered. However, they cannot capture the resurgence of cases that occurs in late March. Their limited number of model parameters allows them to fit only the general downward trend of case counts, and they lack the flexibility to model more complex scenarios, e.g., multiple waves of the epidemic within the investigated period. The additional compartments ‘E’ and ‘D’ introduced in the SEIRD model do not help in that regard. While these two models do not explicitly account for different vaccination statuses, simply allocating the estimated cases according to the proportion of the population in each stratum provides reasonable results.

The vaccination-stratified SIR model does not perform well on this dataset, having the largest RMSEs overall of the models considered. This might be attributed to the stringent assumptions employed by its authors. First, they encoded assumptions on the efficacy of the vaccine, such that 80\(\%\) of vaccinated individuals have permanent immunity, while 20\(\%\) of unvaccinated individuals are assumed to be immune. Second, the values of the model parameters were obtained by the authors from existing literature, without proposing a process for calibrating them from real data. Simply reusing their parameter values does not provide an adequate fit over our investigated period. Thus, while stratifying by vaccination status could potentially provide more granularity for predictions, the performance of the model is hindered by its fixed parameters.

The \(\mathrm {SV^2(AIR)^3}\) model incorporates the impact of vaccine efficacy, policy measures, and clinical characteristics of specific COVID-19 variants. We updated the model parameters according to the characteristics of the Omicron variant, and used the actual values of the Oxford Stringency Index over the investigated period. However, we were unable to calibrate model trajectories that fit the data well: unvaccinated cases show an increasing rate of growth from mid-March to the end of our investigated period (Fig. 4), which is opposite to the trend in the actual data. Furthermore, while we adjusted the vaccine efficacy against the Omicron variant to be only 30% when fully vaccinated (compared to the authors’ original assumption of 75% for the hypothetical variant), the model still vastly underestimates the number of vaccinated cases. Only towards the end of the fifth reopening phase, with the loosest restrictions, do the model trajectories start to show an uptick in vaccinated cases. Thus, while the \(\mathrm {SV^2(AIR)^3}\) model should theoretically have the flexibility to capture complex transmission dynamics, a more sophisticated method of calibrating its parameters would likely be needed to adapt it to the present setting.

The proposed vaccination-stratified SEPAIQRD model extended an existing SEAPIR model by incorporating four vaccination statuses and adding other relevant compartments. We used a mix of parameters from the existing literature, and used daily case counts to calibrate a selected set of time-varying parameters pertaining to asymptomatic infection and case ascertainment rates. For the investigated period, this approach provided a good balance between modeling flexibility and fixed parameter assumptions, with good empirical performance relative to the other models considered. We note that the overall empirical performance of the model did not depend strongly on the parameter calibration method used. Both the basic NLS method and the more sophisticated Bayesian approach with MCMC yielded reasonable fits to the data. While MCMC did perform slightly better, the NLS fit nonetheless lies within the 95% credible bands of the MCMC fit.

### Examining the calibrated parameters

Next, we discuss the fixed and calibrated parameter values in each model. The parameters of the SIR, SEIRD and vaccination-stratified SIR models are presented in Table 6. First, we find that calibrating parameters for the SIR and SEIRD models on case counts alone cannot accurately describe the clinical characteristics of COVID-19. In obtaining the parameters that provide the best fit to the data for the investigated period, these two models tend to underestimate the transmission rate \(\beta\) and the basic reproduction number \(R_0\). Interestingly, although the SEIRD model could not calibrate a reasonable value for \(\rho\) (as death counts were not used) and its estimated \(R_0\) differs significantly from that of the SIR model, both models effectively provided the same quality of fit to the data. This suggests that the calibrated parameter values of these models should be interpreted with caution, and do not necessarily correspond to the actual clinical characteristics of the disease. The low \(R_0\) values are a clear artifact of reasonably fitting the overall downward trend in case counts during the investigated period. Overall, the simplicity of these models is both a strength and a weakness. In contrast, the vaccination-stratified SIR model used entirely fixed parameters^{24}, as shown in the corresponding row of Table 6. While its fixed \(R_0\) value might more closely reflect the intrinsic spread of COVID-19, real-world factors during the investigated period violated that assumption.

The full list of model parameters in the \(\mathrm {SV^2(AIR)^3}\) model is presented in Tables S1 and S2 in the Supplementary Information. The parameters we used for the emerging variant were calibrated to values that reflect our best knowledge of the Omicron variant. These include a higher asymptomatic proportion (60%, vs. 50% for previous variants), a higher baseline transmission rate (4.5 times that of Delta in the unvaccinated population), a shorter recovery time (8 days), and setting the actual start date for its spread in Ontario to be November 22, 2021. We also greatly increased Omicron’s transmission rates in the fully vaccinated population to reflect lower vaccine efficacy: 70% of the baseline unvaccinated rate (compared to the authors’ 12% for Delta). Despite these calibrations, the model could not adequately describe the data during the investigated period, especially for the vaccinated population. This indicates that other assumptions used throughout the model may also require adjustment, such as the parameters related to waning immunity from vaccination.

In the vaccination-stratified SEPAIQRD model, we used both NLS and Bayesian inference with MCMC to obtain the calibrated parameters (i.e., \(f_{i,j}\) and \(CAR_j\)). For the NLS fit, the summary of the calibrated parameters is provided in Table 7. For the MCMC fit, all four MCMC chains are observed to have converged (Fig. S2 in the Supplementary Information). A comparison between the prior and posterior probability densities of \(f_{i,j}\) and \(CAR_j\) for the five phases is plotted in Fig. 6. A corresponding summary of the posterior mean and the 0.025 and 0.975 quantiles (i.e., the \(95\%\) credible bounds) for \(f_{i,j}\) and \(CAR_j\) is presented in Table S8 of the Supplementary Information.

In general, both the NLS and MCMC fits provide similar findings regarding the asymptomatic infection proportion (\(f_{i,j}\)). The MCMC posterior means and NLS calibrated values consistently indicate that the asymptomatic infection proportion is small among all four vaccination statuses. We might conclude it is highly likely that exposed populations become infected with at least mild symptoms, i.e., asymptomatic infection is a low-probability event according to the model. However, this cannot be fully tested against reality and could be an artifact of the model setup. While the NLS and MCMC fits align on this general finding, their calibrated \(f_{2,j}\) values (i.e., for the partially vaccinated group) do noticeably differ. This may explain the relatively poorer NLS fit seen in Fig. 5b.

The credible intervals of the \(f_{i,j}\)’s broadly overlap for the MCMC fit, indicating that there is no observable difference in their posterior distributions across different vaccination statuses. While the corresponding NLS fit does not directly yield confidence intervals, its calibrated values of the \(f_{i,j}\)’s likewise do not exhibit clear trends in relation to vaccination status. Together, this suggests the asymptomatic infection proportion is not directly associated with vaccination status or the change of reopening phase, even when prior beliefs (blue densities in Fig. 6) are encoded in the model. Other confounding factors might also significantly influence the asymptomatic infection proportion. For example, patients with higher-risk medical histories, such as hypertension and chronic obstructive pulmonary disease, were given priority for booster doses. Even with a booster dose, this group is more likely to be infected with symptoms^{48}. Furthermore, as the Ontario government continued to relax social restrictions, the overall increase in social interactions could have a more adverse effect on highly susceptible populations. Finally, case ascertainment rates may not be uniform across different vaccination statuses.

The calibrated values of \(CAR_j\) generally increase when the Ontario government shifts from one phase to another, which suggests that either the Ontario government became more efficient at documenting infections, or a larger proportion of infected people got tested, as daily infections declined. This trend in \(CAR_j\) is seen in both the MCMC and NLS fits, indicating good agreement overall with our prior beliefs, i.e., these parameters are associated with the changes in reopening phases. Since Ontario’s testing policies did not change during our investigated period, a possible reason is that with the very high daily infections in phase one, it might have been difficult for the Ontario government to handle the testing volume and people were less likely to get tested. At the same time, the magnitudes of \(CAR_j\) tend to differ between the NLS and MCMC fits, with NLS yielding smaller calibrated values. The MCMC fit exploits the information encoded in the priors, which may have led to more realistic \(CAR_j\) values and a slightly better fit to the data.

The need for parameter values to change over time, so that the dynamics of COVID-19 resurgences can be captured (particularly in the spring of 2022), is an important consideration from this analysis. The vaccination-stratified SEPAIQRD model accommodates such changes through piece-wise constant parameters (\(f_{i,j}\) and \(CAR_j\)), which is the key feature that contributes to its good empirical performance over the investigated period. Time-varying parameters introduce additional complexity to a modeling framework; however, SIR or SIR-like models will typically require such adjustments to their parameters over time to capture resurgences and fit data over multiple epidemic waves. Therefore, judiciously incorporating time-varying parameters in the other four models might likewise enhance their ability to fit the dataset considered in this paper.
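A piece-wise constant parameter of this kind amounts to a lookup of the current phase. The Python sketch below shows one way to express it. The phase start days and the \(CAR_j\) values are hypothetical placeholders rather than the calibrated values reported in this paper, and the function name is an assumption.

```python
from bisect import bisect_right

# Hypothetical phase start days (days since the start of the fit) and
# hypothetical case ascertainment rates for five phases.
phase_starts = [0, 25, 42, 54, 74]
car_by_phase = [0.05, 0.08, 0.10, 0.12, 0.15]


def piecewise_constant(t, starts, values):
    """Return the parameter value of the phase that contains day t."""
    if t < starts[0]:
        raise ValueError("t precedes the first phase")
    return values[bisect_right(starts, t) - 1]


# Reported cases on a given day = CAR_j * true new infections on that day.
true_infections_day_30 = 20_000.0   # hypothetical model output
reported_day_30 = piecewise_constant(30, phase_starts, car_by_phase) * true_infections_day_30
```

The parameter jumps at each phase boundary, which is the source of the discontinuities in the fitted counts that the LOESS step is intended to smooth.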

## Conclusion

It is necessary to collect, analyze and monitor pandemic data to assess strategies of intervention, management, and control^{49}. This paper aimed to provide insight into the data analysis step, by presenting a comparative study of five compartmental models and their ability to fit COVID-19 case data in Ontario, Canada from January 2022 to June 2022. In addition to four existing compartmental models, we presented an extension of the SEAPIR model to help provide a more comprehensive description of the recent COVID-19 dynamics in Ontario. Each model was found to have its strengths and weaknesses when applied to the investigated period. The SIR and SEIRD models had relatively few compartments and simple assumptions, which allowed them to fit the overall downward trend in cases—but not to reflect more complex situations involving multiple epidemic waves, nor necessarily have calibrated parameter values that reflect actual clinical characteristics of COVID-19. The trajectories of the vaccination-stratified SIR model and the \(\mathrm {SV^2(AIR)^3}\) model appeared to be implausible compared to the actual case counts, despite them being more sophisticated models. Their implausibility and underperformance might be due to having some fixed parameters borrowed from existing literature that were no longer appropriate. Due to the real-world complexities underlying the current Ontario data, more data-driven parameters would be needed to account for situations such as time-varying case ascertainment rates and vaccine efficacy. These results practically illustrate the potential tradeoffs between applying simple models versus more complex ones. 
For the proposed SEPAIQRD model, the results also illustrate some practical differences between parameter calibration methods: while the basic NLS fit provided a reasonable depiction of the dynamics, the Bayesian approach with MCMC was more favorable in terms of the fitted daily case counts and the interpretations of the calibrated parameter values. While more computationally intensive, the overall advantages of the Bayesian approach include its ability to account for both the observed data and prior beliefs, and to quantify uncertainty in both the parameters and fitted case counts.

Nonetheless, some factors that likely play a role in COVID-19 transmission dynamics and disease progression were largely excluded from the models considered, and we briefly discuss a few such factors. First, disease progression may vary by age group (e.g., the COVID-19 death rate tends to be higher for older groups), and these differences could be modeled by stratifying over age groups^{12}. This may involve adding parallel compartments, with specific parameters governing each age group (or combination of age and vaccination status, for models that also stratify by vaccination status). The disease transmission matrix would also need to be expanded to account for interactions between all the strata under consideration. Second, waning immunity and the possibility of re-infection were only handled in a simplified way (via assumed parameter values for the \(\mathrm {SV^2(AIR)^3}\) model, and via priors for our SEPAIQRD model) or not at all (for the SIR, vaccination-stratified SIR, and SEIRD models). In general, this could be modeled by adding flows for vaccinated or recovered individuals to return to the susceptible (S) state, as suggested in the \(\mathrm {SV^2(AIR)^3}\) model. Overall, the inclusion of these factors would introduce additional complexities to the models, along with parameters that may be difficult to calibrate plausibly from available data.
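To illustrate what a return flow to the susceptible state does to the dynamics, the Python sketch below adds a waning rate to a plain SIR model and integrates it with a simple Euler scheme. It is not one of the five models compared in this study. The parameter values (`beta`, `gamma`, `omega`) and the initial conditions are hypothetical, chosen only to make the qualitative effect visible.

```python
def simulate_sirs(beta, gamma, omega, s0, i0, r0, days, dt=0.1):
    """Euler integration of an SIR model with waning immunity (R -> S at rate omega)."""
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    steps_per_day = int(round(1 / dt))
    trajectory = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            recoveries = gamma * i * dt
            waned = omega * r * dt   # recovered individuals returning to S
            s += waned - new_infections
            i += new_infections - recoveries
            r += recoveries - waned
        trajectory.append((s, i, r))
    return trajectory


N = 14_051_980
# Hypothetical rates: R0 = beta / gamma = 2, immunity lasting 90 days on average.
with_waning = simulate_sirs(0.4, 0.2, 1 / 90, N - 1000, 1000, 0, days=365)
no_waning = simulate_sirs(0.4, 0.2, 0.0, N - 1000, 1000, 0, days=365)
```

Without waning, the infected compartment decays to essentially zero after a single wave; with waning, the replenished susceptible pool sustains ongoing infections, which is the mechanism that would allow such a model to produce resurgences.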

Several other limitations also exist in our work. First, all model estimates are of symptomatic infections. Although the assumption that Public Health Ontario only documents the number of symptomatic infections might be reasonable, asymptomatic infection is still worth consideration. At worst, the “infected” here is some combination of both symptomatic and asymptomatic infections, with the symptomatic very likely being the larger component. Had the infected been separated out in the data into symptomatic and asymptomatic components, this could have been incorporated into the model (though the asymptomatic would likely be under-represented in the data). Second, as with any statistical model, the predictive capacity has not yet been tested on future case counts. The model could very well perform poorly on future counts, especially should the dynamics of disease transmission and health policy change. What is clear from this study is that the demonstrable failures and inherent limitations of compartmental models suggest that they should not be relied on too heavily by decision-makers in forming public health policy on COVID-19. At the same time, this study, along with others that focus on compartmental models, can help provide insight into the mechanisms behind the spread of COVID-19. In particular, as models are designed to approximate reality, modeling is useful for identifying the most relevant aspects of the mechanisms that are necessary to explain the observed data. When our proposed model is compared to the others considered in this study, we find that time-varying parameters are crucial for fitting the data well, even though it is challenging to calibrate such parameters in real-time as the pandemic evolves; in contrast, stratification by vaccination status has only a limited impact on our ability to fit the data.

There are several extensions of our current work that can be considered for further studies. The literature on compartmental modeling for COVID-19 transmission dynamics is vast. Additional models, including time series models (e.g., ARIMA and SARIMA), might be compared with those examined in this study. Data from other time periods or jurisdictions could also be investigated. Finally, while Bayesian parameter calibration via MCMC methods is effective for obtaining credible bounds for parameters and estimated case counts, it comes with a relatively large computational cost. Faster computational methods for Bayesian inference would be useful for larger studies involving compartmental models.

## Data availability

The computer code produced in this study for the proposed vaccination-stratified SEPAIQRD model is available at https://github.com/YuxuanZhao1/Code-for-Vaccination-stratified-SEPAIQRD-model. The datasets analysed during the current study are available in the Public Health Ontario repository, https://data.ontario.ca/en/dataset/covid-19-vaccine-data-in-ontario/.

## References

1. Novel Swine-Origin Influenza A (H1N1) Virus Investigation Team. Emergence of a novel swine-origin influenza A (H1N1) virus in humans. *N. Engl. J. Med.* **360**, 2605–2615 (2009).
2. Lu, H., Stratton, C. W. & Tang, Y.-W. Outbreak of pneumonia of unknown etiology in Wuhan, China: The mystery and the miracle. *J. Med. Virol.* **92**, 401–402 (2020).
3. World Health Organization. COVID-19 weekly epidemiological update. https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports (2021) (Accessed 07 Oct 2022).
4. Ontario Public Health. Early Dynamics of Omicron in Ontario, November 1 to December 23, 2021. https://www.publichealthontario.ca/-/media/documents/ncov/epi/covid-19-early-dynamics-omicron-ontario-epi-summary.pdf (2022) (Accessed 07 Oct 2022).
5. British Columbia government. BC COVID-19 Go-Forward Management Strategy. http://www.bcmea.com/wp-content/uploads/2020/05/bc_covid-19_go-forward_management_strategy_web.pdf (2022) (Accessed 02 Oct 2022).
6. Garg, H., Nasir, A., Jan, N. & Khan, S. U. Mathematical analysis of COVID-19 pandemic by using the concept of SIR model. *Soft Comput.* **27**, 3477–3491 (2023).
7. Kermack, W. O., McKendrick, A. G. & Walker, G. T. A contribution to the mathematical theory of epidemics. *Proc. R. Soc. Lond. Ser. A Contain. Pap. Math. Phys. Charact.* **115**, 700–721. https://doi.org/10.1098/rspa.1927.0118 (1927).
8. Cooper, I., Mondal, A. & Antonopoulos, C. G. A SIR model assumption for the spread of COVID-19 in different communities. *Chaos Solitons Fract.* **139**, 110057 (2020).
9. He, S., Peng, Y. & Sun, K. SEIR modeling of the COVID-19 and its dynamics. *Nonlinear Dyn.* **101**, 1667–1680 (2020).
10. Carcione, J. M., Santos, J. E., Bagaini, C. & Ba, J. A simulation of a COVID-19 epidemic based on a deterministic SEIR model. *Front. Public Health* **8**, 230 (2020).
11. Dashtbali, M. & Mirzaie, M. A compartmental model that predicts the effect of social distancing and vaccination on controlling COVID-19. *Sci. Rep.* **11**, 8191. https://doi.org/10.1038/s41598-021-86873-0 (2021).
12. Fields, R. *et al.* Age-stratified transmission model of COVID-19 in Ontario with human mobility during pandemic’s first wave. *Heliyon* **7**, e07905. https://doi.org/10.1016/j.heliyon.2021.e07905 (2021).
13. Masandawa, L., Mirau, S. S. & Mbalawata, I. S. Mathematical modeling of COVID-19 transmission dynamics between healthcare workers and community. *Results Phys.* **29**, 104731 (2021).
14. Cartocci, A., Cevenini, G. & Barbini, P. A compartment modeling approach to reconstruct and analyze gender and age-grouped COVID-19 Italian data for decision-making strategies. *J. Biomed. Inform.* **118**, 103793 (2021).
15. Day, T., Gandon, S., Lion, S. & Otto, S. P. On the evolutionary epidemiology of SARS-CoV-2. *Curr. Biol.* **30**, R849–R857 (2020).
16. CoVaRR. Model projections for the spread of Omicron and the potential impact on hospital occupancy. https://covarrnet.ca/model-projections-for-the-spread-of-omicron-and-the-potential-impact-on-hospital-occupancy/ (Accessed 09 Oct 2022).
17. Garrett, N. *et al.* High asymptomatic carriage with the Omicron variant in South Africa. *Clin. Infect. Dis.* **75**, e289–e292. https://doi.org/10.1093/cid/ciac237 (2022).
18. Layton, A. T. & Sadria, M. Understanding the dynamics of SARS-CoV-2 variants of concern in Ontario, Canada: A modeling study. *Sci. Rep.* **12**, 2114 (2022).
19. Andrews, N. *et al.* Covid-19 vaccine effectiveness against the Omicron (B.1.1.529) variant. *N. Engl. J. Med.* **386**, 1532–1546 (2022).
20. Pulliam, J. R. *et al.* Increased risk of SARS-CoV-2 reinfection associated with emergence of Omicron in South Africa. *Science* **376**, eabn4947 (2022).
21. Vandegrift, K. J. *et al.* SARS-CoV-2 Omicron (B.1.1.529) infection of wild white-tailed deer in New York City. *Viruses* **14**, 2770 (2022).
22. Ribeiro Xavier, C., Sachetto Oliveira, R., da Fonseca Vieira, V. & Lobosco, M. Characterisation of omicron variant during COVID-19 pandemic and the impact of vaccination, transmission rate, mortality, and reinfection in South Africa, Germany, and Brazil. *BioTech* **11**, 12 (2022).
23. Ontario Public Health. Ontario Moving to Next Phase of Reopening on February 17. https://news.ontario.ca/en/release/1001600/ontario-moving-to-next-phase-of-reopening-on-february-17 (2021) (Accessed 15 June 2022).
24. Fisman, D. N., Amoako, A. & Tuite, A. R. Impact of population mixing between vaccinated and unvaccinated subpopulations on infectious disease dynamics: Implications for SARS-CoV-2 transmission. *CMAJ* **194**, E573–E580 (2022).
25. Melo, L. Application of the SEIRD epidemic model and optimal control to study the effect of quarantine and isolation on the spread of COVID-19. In *The Proceedings of GREAT Day* 170 (2022).
26. CTV News. Full list of Ontario COVID-19 restrictions for starting Jan. 5. https://toronto.ctvnews.ca/full-list-of-ontario-covid-19-restrictions-for-starting-jan-5-1.5726245 (2022) (Accessed 15 June 2022).
27. Ontario Public Health. Ontario Outlines Steps to Cautiously and Gradually Ease Public Health Measures. https://news.ontario.ca/en/release/1001451/ontario-outlines-steps-to-cautiously-and-gradually-ease-public-health-measures (2021) (Accessed 15 June 2022).
28. CBC News. Ontario lifts mask mandates in most spaces, but it’s no ’light switch’ for pre-pandemic life, expert says. https://www.cbc.ca/news/canada/toronto/covid19-ont-masks-march-21-2022-1.6385293 (2022) (Accessed 15 June 2022).
29. COVID-19 advisory for Ontario. ONTARIO DASHBOARD. Tracking Omicron. https://covid19-sciencetable.ca/ontario-dashboard/ (2022) (Accessed 15 June 2022).
30. Ontario Government. Cases and rates by vaccination status. https://data.ontario.ca/en/dataset/covid-19-vaccine-data-in-ontario (2022) (Accessed 15 June 2022).
31. Danza, P. *et al.* SARS-CoV-2 infection and hospitalization among adults aged \(\ge\) 18 years, by vaccination status, before and during SARS-CoV-2 B.1.1.529 (Omicron) variant predominance-Los Angeles County, California, November 7, 2021–January 8, 2022. *Morbidity. Mortal. Wkly. Rep.* **71**, 177 (2022).
32. CBC News. Ontario’s COVID-19 testing and isolation rules have changed: Here’s what you need to know. https://www.cbc.ca/news/canada/toronto/ontario-testing-isolation-guidance-1.6300831 (2021) (Accessed 15 June 2022).
33. Chen, S.L.-S. *et al.* A new approach to modeling pre-symptomatic incidence and transmission time of imported COVID-19 cases evolving with SARS-CoV-2 variants. *Stoch. Environ. Res. Risk Assess.* **37**, 441–452 (2023).
34. Wu, Y. *et al.* Incubation period of COVID-19 caused by unique SARS-CoV-2 strains: A systematic review and meta-analysis. *JAMA Netw. Open* **5**, e2228008–e2228008 (2022).
35. Pei, L. *et al.* Comorbidities prolonged viral shedding of patients infected with SARS-CoV-2 omicron variant in Shanghai: A multi-center, retrospective, observational study. *J. Infect. Public Health* **16**, 182–189 (2023).
36. Kearney, P. M. *et al.* Cross-sectional survey of compliance behaviour, knowledge and attitudes among cases and close contacts during COVID-19 pandemic. *Public Health Pract.* **5**, 100370 (2023).
37. Menni, C. *et al.* Symptom prevalence, duration, and risk of hospital admission in individuals infected with SARS-CoV-2 during periods of omicron and delta variant dominance: A prospective observational study from the ZOE COVID Study. *Lancet* **399**, 1618–1624 (2022).
38. Tang, W., He, H. & Tu, X. *Applied Categorical and Count Data Analysis*. Chapman & Hall/CRC Texts in Statistical Science (Taylor & Francis, 2012).
39. Anderson, S. C. *et al.* Quantifying the impact of COVID-19 control measures using a Bayesian model of physical distancing. *PLoS Comput. Biol.* **16**, e1008274 (2020).
40. McGregor, G., Tippett, J., Wan, A., Wang, M. & Wong, S. Comparing regional and provincial-wide COVID-19 models with physical distancing in British Columbia. *AIMS Math.* **7**, 6743–6778 (2021).
41. Andrews, N. *et al.* Duration of protection against mild and severe disease by Covid-19 vaccines. *N. Engl. J. Med.* **386**, 340–350 (2022).
42. Stan Development Team. RStan: the R interface to Stan (2022). R package version 2.21.5.
43. Little, T. D. *The Oxford Handbook of Quantitative Methods, Volume 1: Foundations* (Oxford University Press, 2013).
44. Deo, V. & Grover, G. A new extension of state-space SIR model to account for underreporting-an application to the COVID-19 transmission in California and Florida. *Results Phys.* **24**, 104182 (2021).
45. Canada Population. Ontario Population 2022. https://www.canadapopulation.net/ontario-population/ (2022) (Accessed 12 May 2022).
46. Public Health Ontario. COVID-19 Variant of Concern Omicron (B.1.1.529): Risk Assessment, January 26, 2022. https://www.publichealthontario.ca/-/media/documents/ncov/voc/2022/01/covid-19-omicron-b11529-risk-assessment-jan-26.pdf?sc_lang=en (2022) (Accessed 15 June 2022).
47. Public Health Ontario. COVID-19 in Ontario: January 15, 2020 to January 5, 2022. https://files.ontario.ca/moh-covid-19-report-en-2022-01-06.pdf (2022) (Accessed 15 June 2022).
48. Niu, S. *et al.* Clinical characteristics of older patients infected with COVID-19: A descriptive study. *Arch. Gerontol. Geriatr.* **89**, 104058 (2020).
49. Abolmaali, S. & Shirzaei, S. A comparative study of SIR Model, Linear Regression, Logistic Function and ARIMA Model for forecasting COVID-19 cases. *AIMS Public Health* **8**, 598 (2021).

## Acknowledgements

We thank Wayne Oldford for constructive comments on the manuscript. This work was partially supported by Discovery Grant RGPIN-2019-04771 from the Natural Sciences and Engineering Research Council of Canada.

## Author information

### Contributions

Y.X. and S.W.K.W. designed research; Y.X. and S.W.K.W. performed research; Y.X. curated data; Y.X. analyzed data; Y.X. and S.W.K.W. discussed the results; Y.X. and S.W.K.W. wrote and revised the paper.

## Ethics declarations

### Competing interests

The authors declare no competing interests.

## Additional information

### Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Rights and permissions

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

## About this article

### Cite this article

Zhao, Y., Wong, S.W.K. A comparative study of compartmental models for COVID-19 transmission in Ontario, Canada.
*Sci Rep* **13**, 15050 (2023). https://doi.org/10.1038/s41598-023-42043-y

DOI: https://doi.org/10.1038/s41598-023-42043-y
