A novel hybrid SEIQR model incorporating the effect of quarantine and lockdown regulations for COVID-19

Mitigating the devastating effect of COVID-19 is necessary to control the infectivity and mortality rates. Hence, several strategies, such as quarantine of exposed and infected individuals and restricting movement through lockdown of geographical regions, have been implemented in most countries. Meanwhile, standard SEIR-based mathematical models have been developed to understand the disease dynamics of COVID-19, and the proper inclusion of these restrictions is the rate-limiting step for the success of these models. In this work, we have developed a hybrid Susceptible-Exposed-Infected-Quarantined-Removed (SEIQR) model to explore the influence of quarantine and lockdown on disease propagation dynamics. The model is multi-compartmental, and it considers everyday variations in lockdown regulations, testing rate and quarantined individuals. Our model predicts a considerable difference between reported and actual recovered and deceased cases, in qualitative agreement with recent reports.


HySEIQR model, notations and assumptions
Considerations in adapting SEIR model to COVID-19. Adapting the standard SEIR model to the current scenario requires addressing the following key elements: (a) asymptomatic carriers, (b) effect of quarantine and lockdown, (c) multi-compartmental approach, (d) testing rate and efficiency, (e) varying viral strains and their virulence, and (f) availability of medical resources and efficacy of treatment. These elements are known to strongly influence the disease transmission rate (β), the spread of infected individuals to newer regions, and the recovery and mortality of patients. COVID-19 infected individuals can be symptomatic or asymptomatic, and in most cases, can develop symptoms over time. However, statistical studies have shown varying proportions of symptomatic and asymptomatic cases in different populations 17,18. In most countries, symptomatic and identified (tested) individuals are quarantined or advised to self-isolate. Isolation of infected cases removes them from the general population, thus reducing the spread of the disease. Further, identifying the infected cases depends directly on the testing rate in the region 19,20.

HySEIQR model. The schematic representation of the SEIQR model is shown in Fig. 1, and the set of equations (Eqs. 1-10) listed below describes the model. Table 1 lists all the variables, parameters and constants along with the notations used in this study.

$$\frac{dS}{dt} = -\beta(t)\, S\, (\eta I_a + I_m + I_s) \tag{1}$$

$$\frac{dE}{dt} = \beta(t)\, S\, (\eta I_a + I_m + I_s) - \sigma E \tag{2}$$
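As an illustration of how Eqs. (1) and (2) act on the susceptible and exposed compartments, the following minimal sketch advances S and E by one explicit Euler step. It assumes that the left-hand sides are time derivatives, that all compartments are expressed as population fractions, and that the infected compartments are held fixed during the step (their own equations are not reproduced here). The function name, the step size and all numerical values are hypothetical and are not taken from the study.

```python
def step_s_e(S, E, I_a, I_m, I_s, beta_t, eta, sigma, dt=1.0):
    """Advance S and E by one explicit Euler step of Eqs. (1) and (2)."""
    # New exposures per unit time; asymptomatic carriers are weighted by eta.
    new_exposures = beta_t * S * (eta * I_a + I_m + I_s)
    dS = -new_exposures
    dE = new_exposures - sigma * E
    return S + dt * dS, E + dt * dE


# Hypothetical state and parameter values (population fractions, per-day rates).
S0, E0 = 0.99, 0.005
S1, E1 = step_s_e(S0, E0, I_a=0.003, I_m=0.001, I_s=0.001,
                  beta_t=0.5, eta=0.5, sigma=0.2)
```

Because every individual leaving S enters E, the sum S + E changes only through the outflow σE, which gives a quick consistency check on any implementation of these two equations.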

Multi-compartment model and the stochastic nature. A typical epidemiological model assumes the region under study to be a single compartment with a homogeneous density of exposed and infected cases across the region and throughout time. One of the implications of lockdown restrictions is localization: sub-regions with a higher density of infected individuals are isolated. These restrictions create heterogeneity, which requires a multi-compartmental approach. In the HySEIQR model, a country/state is uniformly divided into multiple sub-regions with boundaries. These regions are placed on a square map. The size and number of sub-regions depend on the parameter N_e (number of compartments/sub-regions). The disease propagation dynamics are assumed to occur in each sub-region independently through Eqs. (1)-(10). The movement of exposed and infected individuals between neighbouring sub-regions depends on the transfer rate (Tr^0_rate). The inclusion of a multi-compartmental approach adds stochastic components to the model. The transfer of individuals from a sub-region to neighbouring sub-regions occurs through random selection of the neighbours. In addition, during the initialization of the simulation, the number of initially exposed individuals, E_0, is distributed among randomly selected sub-regions. These events vary between iterations because they depend on pseudo-random numbers.
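The two stochastic steps described above, random seeding of the initially exposed individuals and random selection of a neighbouring sub-region for transfer, can be sketched as follows. This is only an illustration under stated assumptions: sub-regions are taken to be cells of a square grid with four-connected neighbours, a fixed fraction of each cell moves to a single randomly chosen neighbour per step, and counts are rounded down to integers. None of these details, nor the function names, are specified in the text.

```python
import random


def neighbours(idx, side):
    """Indices of the 4-connected neighbours of cell idx on a side x side map."""
    r, c = divmod(idx, side)
    out = []
    if r > 0:
        out.append(idx - side)
    if r < side - 1:
        out.append(idx + side)
    if c > 0:
        out.append(idx - 1)
    if c < side - 1:
        out.append(idx + 1)
    return out


def seed_exposed(n_cells, e0, n_seeds, rng):
    """Distribute e0 initially exposed individuals over randomly chosen cells."""
    exposed = [0] * n_cells
    seeds = rng.sample(range(n_cells), n_seeds)
    for k in range(e0):
        exposed[seeds[k % n_seeds]] += 1
    return exposed


def transfer(counts, side, tr_rate, rng):
    """Move a fraction tr_rate of each cell to one randomly selected neighbour.

    All outflows are computed from the counts at the start of the step.
    """
    moved = list(counts)
    for idx, n in enumerate(counts):
        leaving = int(n * tr_rate)
        if leaving:
            target = rng.choice(neighbours(idx, side))
            moved[idx] -= leaving
            moved[target] += leaving
    return moved


rng = random.Random(0)              # fixed seed so that a run is reproducible
exposed = seed_exposed(25, 100, 3, rng)
after = transfer(exposed, 5, 0.1, rng)
```

Using a seeded `random.Random` instance makes a single iteration reproducible, while changing the seed yields the iteration-to-iteration variation mentioned in the text.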
Incorporating real-world data into the model. The HySEIQR model considers the day-to-day variations in government-imposed travel restrictions and lockdown conditions, the number of tested samples and the positivity rate. Real-world data from various sources were collected and integrated into the model as functions λ(t), β(t) and H(t). We collected the time-series data on the number of infections, deaths, recoveries, tested samples and positivity rate from the COVID19India.org GitHub repository (https://api.covid19india.org) for the Indian population till May 2021. Further, the data on change in people movement were collected from Google mobility reports (https://github.com/GoogleCloudPlatform/covid-19-open-data) and the Oxford stringency index 21 (http://www.bsg.ox.ac.uk/covidtracker) as a measure of the Quarantine and Lockdown stringency index (λ(t)). β(t) denotes the variation in transmission rate (Eq. 11). H(t) represents the actual number of positively tested cases in a day. The number of tested/identified cases predicted by the model on a day (t) depends directly on the number of infected cases (I_a, I_m and I_s) on day t but is limited by H(t). Seven-day window averages were used throughout our study to reduce non-specific day-to-day variations (Fig. S1).
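The seven-day window averaging mentioned above can be sketched as follows. The text does not state whether the window is trailing or centred, so a trailing window is assumed here, and the first days are averaged over the values available so far. The function name is hypothetical.

```python
def rolling_mean(series, window=7):
    """Trailing moving average; early days average over the values seen so far."""
    out = []
    running = 0.0
    for i, x in enumerate(series):
        running += x
        if i >= window:
            # Drop the value that has just left the window.
            running -= series[i - window]
        out.append(running / min(i + 1, window))
    return out


# A weekly reporting dip (hypothetical daily counts) is flattened by the average.
daily = [100, 100, 100, 100, 100, 40, 160] * 2
smoothed = rolling_mean(daily)
```

A window of seven days is a natural choice for daily case counts because it covers exactly one weekly reporting cycle.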
where W_q and W_T denote the weights associated with lockdown regulations, λ(t).

Interpreting the HySEIQR model
Parameter estimation and fit. The quantities in the model were either treated as parameters and estimated through non-linear data fitting, or treated as constants and derived from literature and published reports (Table 1). The quantities associated with disease dynamics, such as σ, β and γ, have been well studied and were declared constants in this work 22-24.
The rest of the parameters were estimated through a two-step approach. First, a systematic grid search was performed in the parameter space (Table 1). Second, for each point in the grid, the least-squares optimization algorithm (least_squares) was applied to minimize the squared error between the predicted and actual numbers of identified recovered and infected cases 25,26. The model with the lowest error was selected and further optimized.
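The two-step procedure, a local least-squares fit started from every grid point followed by selection of the lowest-error result, can be illustrated with the standard library alone. In this sketch a simple derivative-free compass search stands in for the least_squares routine named in the text, and a hypothetical linear toy model replaces the epidemic model. The data, the grid values and all function names are assumptions made for illustration only.

```python
import itertools
import math


def sse(params, t, y, model):
    """Sum of squared residuals between model predictions and data."""
    return sum((model(ti, *params) - yi) ** 2 for ti, yi in zip(t, y))


def compass_search(x0, cost, step=0.1, tol=1e-8, max_iter=10000):
    """Local minimiser: try +/- step on each coordinate, halve step on failure."""
    x, fx = list(x0), cost(x0)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                f_trial = cost(trial)
                if f_trial < fx:
                    x, fx, improved = trial, f_trial, True
        if not improved:
            step /= 2.0
    return x, fx


def grid_then_refine(grid_axes, cost):
    """Run a local fit from every grid point and keep the lowest-error result."""
    best_x, best_f = None, math.inf
    for start in itertools.product(*grid_axes):
        x, fx = compass_search(start, cost)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f


def line(t, a, b):
    """Hypothetical toy model y = a + b * t, used only to exercise the fit."""
    return a + b * t


t_obs = list(range(10))
y_obs = [line(ti, 2.0, 0.3) for ti in t_obs]
fit, err = grid_then_refine([(1.0, 3.0), (0.1, 0.5)],
                            lambda p: sse(p, t_obs, y_obs, line))
```

For this convex toy problem every starting point reaches the same minimum. Starting from several grid points matters when the error surface has multiple local minima, which is the usual motivation for combining a grid search with a local optimizer.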
Comparison between predicted and actual data. Figure 2 shows the predicted number of recovered cases and the change in infected cases over time, along with the actual data for the Indian population. As of 15th May 2021, the available data report approximately 20.82 million recovered cases and 270,000 deaths, with 3.54 million existing infected cases. Our model predicted 18.56 million identified and 772.28 million unidentified recovered cases. The number of unidentified recovered cases is nearly 37 [25-49] times higher than the reported number of recovered cases. Similarly, our model predicted 2.6 [1.7-3.5] times more deceased cases than the reported deaths due to COVID-19. In other words, on average only ~3 in 100 recovered cases and ~1 in 3 deaths are reported. Despite the variations in the prediction results over iterations and the high sensitivity to parameters, the model consistently indicates several times more recovered cases than reported. These results agree with earlier reports of undetected COVID-19 cases 27-29. Especially considering the asymptomatic cases and the low testing rate, the actual number of recovered cases can be several times higher than the reported numbers.
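The ratios quoted above follow directly from the reported and predicted totals. The short calculation below reproduces them, under the assumption that the reported share is read as the inverse of each ratio. The variable names are illustrative.

```python
reported_recovered = 20.82e6        # reported recovered cases, 15th May 2021
predicted_unidentified = 772.28e6   # unidentified recovered cases (model)

# Unidentified recovered cases relative to reported recovered cases.
ratio = predicted_unidentified / reported_recovered

# Inverse of the ratio: roughly 0.027, i.e. about 3 in 100.
recovered_reported_share = 1 / ratio

# Predicted deaths are 2.6 times the reported deaths, so the reported
# share is roughly 0.38, which the text rounds to about 1 in 3.
deaths_reported_share = 1 / 2.6
```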
To validate our simplified multi-compartmental design, we compared the distribution of recovered cases across the compartments/sub-regions with the relevant administrative sub-divisions in India. We chose districts for comparison since our model consists of 1000 sub-regions, which is in a similar range to the number of districts in India. Figure 3 shows the fraction of sub-regions and districts with the number of recovered cases greater than 1000 and 10,000. A threshold of 1000 was used to check the presence of COVID-19 infection in a sub-region. In addition, a threshold of 10,000 was chosen to test how widespread the disease was. The results indicate that our model underestimates the presence of COVID-19 infection at a threshold of 1000. However, the model agreed better with the spread of the disease in sub-regions at a threshold of 10,000. The difference between the predicted (sub-regions) and actual (districts) values is expected since the size and distribution of sub-regions are

Influence of lockdown regulations on disease-control. The lockdown-imposed restrictions (λ(t)) affect the disease transmission factor (β_0) as described in Eq. 11. W_q is the weight that determines the influence of λ(t) on β. A higher W_q indicates a more substantial influence of lockdown rules in reducing disease transmission. To study the influence of this parameter, we ran simulations by varying W_q from 0 to 1 at intervals of 0.2 (Fig. 4). Figure 4a shows the change in the total number of infected cases (identified and unidentified) over time. The results show that with strong adherence to government-imposed lockdown regulations (W_q ≥ 0.6), the spread of COVID-19 could have been controlled within six months. However, non-adherence (W_q ≤ 0.4) to the restrictions could lead to the rapid spread of disease with an average of million cases per day and spreading to almost every sub-region (Fig. 4b). We repeated the analysis by varying Tr^0_rate, the transfer rate of infected individuals between sub-regions (Fig. 5). With zero movement between the sub-regions (Tr^0_rate = 0), the disease is restricted to the initial compartments and its spread to other compartments is stopped. Increasing Tr^0_rate increases the rate of spread to other sub-regions, leading to a rapid increase in infection rate.

Discussion
Several methods have been developed using modified SEIR models to understand the spread of COVID-19. The most necessary adaptations are (i) identification of infected cases and quarantine of suspected cases, (ii) the role of lockdown on social interactions and movement of the population and (iii) inclusion of asymptomatic cases 32-34. Quarantine is one of the important strategies in controlling any contagious disease 32. Senapati et al. 35 developed a deterministic compartmental model incorporating quarantine and hospitalization for mild and severe symptomatic cases, respectively. The rates of quarantine and hospitalization are usually determined as part of the model parameters. In our model, the quarantine/hospitalization is determined by the number of actual tested cases in the region, obtained from real-world data and fed to the model. These tested cases are further distributed among symptomatic and asymptomatic cases based on testing sensitivity and specificity. These suspected cases, which include true and false positives, are quarantined during the expected recovery period (determined by γ). These assumptions are close to a real-world situation and are also easily adaptable to other geographical regions.

The effect of lockdown on disease transmission is time-dependent and complex, with various direct and indirect influences. For example, government-imposed regulations directly create barriers to the movement of people and indirectly generate awareness among the population to follow hygienic practices. Although these actions reduce the disease transmission rate (β), the influence of these measures changes over time. To accommodate this effect, studies have modelled β as a function of time or lockdown 36-38. Ianni and Rossi 36 represented β as a decreasing exponential function to accommodate the increasing awareness of the population and the decreasing disease transmission rate over 120 days since the disease outbreak.
On the other hand, awareness could gradually decline over an extended period, and governments can impose lockdown in phases, which creates waves of awareness. While such approaches are convenient and easily adapted to standard epidemiological models/equations, the disease transmission rate depends on complex social interactions among the population/community. Networks/graphs representing the interaction pattern among communities are often used to overcome these shortcomings 30,31,39,40. In our model, the mobility reports and the stringency index serve as the measure of lockdown stringency (λ(t)). These measures represent the dynamic changes in government norms and the associated population behaviour towards COVID-19. Thus, they provide a better reflection of the real-world situation for the model. λ(t) influences the disease transmission rate, β, and also affects the movement of people from one region to another, Tr_rate, in the model (Eqs. 11 and 12). In addition, the multi-compartmental design of the model raises a barrier to people moving from one sub-region to another. Thus, the model mimics the effect of lockdown in a large geographical region like India.
Consideration of asymptomatic cases is another essential criterion for a COVID-19 epidemiological model. Models that incorporate asymptomatic cases consider a part of the infected cases to be asymptomatic, with no identifiable symptoms, and probably undetected. This fraction of the infected patients undergoes natural recovery over a period of time 27,41,42. A similar approach is employed in our model. Infected cases are considered to belong to one of three classes: (i) asymptomatic, (ii) symptomatic with moderate symptoms and (iii) symptomatic with severe symptoms. A few models treat a constant fraction of infected cases as asymptomatic, which is determined by model optimization. These asymptomatic cases can remain asymptomatic until recovery or may show symptoms over time 42.

Conclusions
We have developed a hybrid SEIQR model by incorporating several adaptations for COVID-19 disease, testing protocols, current quarantine, and lockdown regulations. In our approach to the model, several assumptions and simplifications were made: (i) the government-imposed lockdown regulations were represented through over-simplified metrics from openly available reports, (ii) the role of hospitals in controlling mortality rate, the allocation and availability of equipment in hospitals, and the effect of viral strains on disease transmission and mortality rate were not factored into the model and (iii) only part of the parameters was optimized, and the rest were considered constants based on the literature to ease parameter optimization.
Despite the limitations, our model captured the essence of quarantine, lockdown and the movement of infected individuals between regions. The model was developed with minimal dependencies using core Python libraries and is available as a webserver at https://web.iitm.ac.in/bioinfo2/covid19hyseiqr/home. The model is highly customizable and can be adapted to further modifications. The inclusion of the network topology of administrative divisions in India and the effects of viral strains would benefit the community to a greater extent.

Ethics declarations
No experiments on humans or animals were conducted as part of the study. All data used in the study were collected from openly available repositories.

Data availability
All data used in the study were collected from openly available repositories. Sources are listed in the manuscript. In addition, the model is available as a webserver at https://web.iitm.ac.in/bioinfo2/covid19hyseiqr/home.