Introduction

The consequences of the COVID-19 pandemic for human life and the economy have been disastrous, and the propagation of infection has not yet been controlled1,2,3,4. Governments have devised several strategies and imposed regulations and restrictions to decelerate the spread, limit the loss of human lives and reduce the load on the health care industry5,6,7. While the development of several vaccines has been encouraging progress, the evolving variants of SARS-CoV-2 and their virulence remain a threat8,9,10. Vaccination and acquired immunity have progressively led to the relaxation of lockdown restrictions in a few geographical regions. However, considering the current vaccination rate and the bias in global distribution, most countries still rely primarily on quarantine and lockdown procedures.

The quarantine and lockdown regulations and restrictions imposed by governments aim to reduce disease transmission by confining the movement of the population and reducing the spread through human contacts5,11,12. Effective implementation of these procedures can slow down the spread and provide a window for governments to devise strategies and develop vaccines/drugs. Understanding the effect of lockdown on the dynamics of the pandemic is vital for planning and implementation11. The SEIR model and its variants can be used to parameterize and predict the disease dynamics13,14,15,16. However, the typical formulations of the SEIR model do not take into account the complex effects of lockdown restrictions.

In the current study, we have adapted the standard SEIR model to the current pandemic situation (COVID-19) by addressing specific essential observations such as the presence of asymptomatic carriers and the reduction in transmission rate due to lockdown, together with its effect on the infection rate of the disease. By incorporating these parameters, we have developed a model to provide robust estimates of asymptomatic carriers in the population. Apart from providing an estimate of the infected and recovered population, these data would elucidate the role of these external factors in COVID-19 transmission. In addition, we have incorporated real-world, day-to-day mobility data, the positivity rate and the number of tested samples into a Hybrid Susceptible-Exposed-Infected-Quarantined-Removed (HySEIQR) model. The model accounts for the effect of lockdown on disease transmission through inter-person contacts and the movement of people across geographic regions. Simulation of our detailed model showed a good correlation with the observed trend in the number of recovered cases.

HySEIQR model, notations and assumptions

Considerations in adapting SEIR model to COVID-19

Adapting the standard SEIR model to the current scenario requires addressing the following key elements: (a) asymptomatic carriers, (b) effect of quarantine and lockdown, (c) multi-compartmental approach, (d) testing rate and efficiency, (e) varying viral strains and their virulence, and (f) availability of medical resources and efficacy of treatment. These elements are known to strongly influence the disease transmission rate (β), the spread of infected individuals to newer regions, and the recovery and mortality of patients.

COVID-19 infected individuals can be symptomatic or asymptomatic and, in most cases, can develop symptoms over time. However, statistical studies have shown varying proportions of symptomatic and asymptomatic cases in different populations17,18. In most countries, symptomatic and identified (tested) individuals are quarantined or advised to self-isolate. Isolation of infected cases removes them from the general population, thus reducing the spread of the disease. Further, identifying the infected cases depends directly on the testing rate in the region19,20.

HySEIQR model

The schematic representation of the HySEIQR model is shown in Fig. 1, and the set of equations (Eqs. 1–10) listed below describes the model. Table 1 lists all the variables, parameters and constants along with the notations used in this study.

Figure 1

(a) Illustration of the multi-compartmental approach in HySEIQR. Circles represent sub-regions or compartments. The movement of infected individuals between the compartments/sub-regions is governed by the parameter TrRate. (b) Schematic representation of the Hybrid SEIQR model (refer to Eqs. 1–12).

Table 1 List of variables, constants and parameters used in the model.
$$\dot{S} = -\beta(t)\, S \left(\eta I_{a} + I_{m} + I_{s}\right)$$
(1)
$$\dot{E} = \beta(t)\, S \left(\eta I_{a} + I_{m} + I_{s}\right) - \sigma E$$
(2)
$$\dot{I}_{a} = f_{a}\, \sigma E - \left(\mu_{a} + \gamma_{a}\right) I_{a} - H\left(I_{a}, t\right)$$
(3)
$$\dot{I}_{m} = f_{m}\, \sigma E - \left(\mu_{m} + \gamma_{m}\right) I_{m} - H\left(I_{m}, t\right)$$
(4)
$$\dot{I}_{s} = f_{s}\, \sigma E - \left(\mu_{s} + \gamma_{s}\right) I_{s} - H\left(I_{s}, t\right)$$
(5)
$$\dot{H}_{a} = H\left(I_{a}, t\right) - \left(\mu_{a}^{H} + \gamma_{a}^{H}\right) H_{a}$$
(6)
$$\dot{H}_{m} = H\left(I_{m}, t\right) - \left(\mu_{m}^{H} + \gamma_{m}^{H}\right) H_{m}$$
(7)
$$\dot{H}_{s} = H\left(I_{s}, t\right) - \left(\mu_{s}^{H} + \gamma_{s}^{H}\right) H_{s}$$
(8)
$$\dot{R} = \gamma_{a} I_{a} + \gamma_{m} I_{m} + \gamma_{s} I_{s} + \gamma_{a}^{H} H_{a} + \gamma_{m}^{H} H_{m} + \gamma_{s}^{H} H_{s}$$
(9)
$$\dot{D} = \mu_{a} I_{a} + \mu_{m} I_{m} + \mu_{s} I_{s} + \mu_{a}^{H} H_{a} + \mu_{m}^{H} H_{m} + \mu_{s}^{H} H_{s}$$
(10)

where S, E, R and D denote the susceptible, exposed, recovered and deceased populations, and \(\sigma\), \(\gamma\) and \(\mu\) represent the incubation, recovery and mortality rates, respectively. Infected cases were grouped into three categories: asymptomatic (Ia), moderately symptomatic (Im) and severely symptomatic (Is). Each category has a different recovery rate (γ) and mortality rate (μ), as well as a different contribution (η) to disease transmission (β). Similarly, Ha, Hm and Hs represent the identified infected cases, including self-quarantined and hospitalized cases. The transmission of infection from Infected to Susceptible depends on the transmission factor, \(\beta(t)\), a time-dependent variable. The number of hospitalized cases was obtained from real-world data (www.covid19India.org).
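To make the structure of Eqs. 1–10 concrete, the following is a minimal sketch of their right-hand side in Python, not the authors' implementation. It assumes that all compartments are population fractions, that f_a, f_m and f_s are the fractions of exposed cases entering each infected class (and sum to one), and that the outflow from each identified compartment is proportional to H_x, which conserves the total population. The names `Rates`, `derivatives` and `euler_step` are hypothetical.

```python
from dataclasses import dataclass

CLASSES = ("a", "m", "s")  # asymptomatic, moderate, severe


@dataclass
class Rates:
    beta: float     # beta(t) evaluated at the current time step
    sigma: float    # incubation rate
    eta: float      # relative contribution of asymptomatic cases
    f: dict         # assumed: fraction of exposed entering each class
    gamma: dict     # recovery rate of unidentified cases, per class
    mu: dict        # mortality rate of unidentified cases, per class
    gamma_h: dict   # recovery rate of identified cases, per class
    mu_h: dict      # mortality rate of identified cases, per class


def derivatives(state, rates, identified):
    """Right-hand side of Eqs. 1-10.

    state      -- dict with keys S, E, Ia, Im, Is, Ha, Hm, Hs, R, D
    identified -- dict class -> H(I_x, t), cases identified at this step
    """
    force = rates.beta * state["S"] * (
        rates.eta * state["Ia"] + state["Im"] + state["Is"]
    )
    d = {"S": -force, "E": force - rates.sigma * state["E"], "R": 0.0, "D": 0.0}
    for c in CLASSES:
        infected, held = state["I" + c], state["H" + c]
        d["I" + c] = (
            rates.f[c] * rates.sigma * state["E"]
            - (rates.mu[c] + rates.gamma[c]) * infected
            - identified[c]
        )
        # Assumption: identified cases leave H_x in proportion to H_x.
        d["H" + c] = identified[c] - (rates.mu_h[c] + rates.gamma_h[c]) * held
        d["R"] += rates.gamma[c] * infected + rates.gamma_h[c] * held
        d["D"] += rates.mu[c] * infected + rates.mu_h[c] * held
    return d


def euler_step(state, rates, identified, dt=1.0):
    """Advance the state by one explicit Euler step of length dt (days)."""
    d = derivatives(state, rates, identified)
    return {key: state[key] + dt * d[key] for key in state}
```

Under these assumptions the ten derivatives sum to zero, so the total population is preserved at every step, which is a convenient sanity check when experimenting with parameter values.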

Multi-compartment model and the stochastic nature

A typical epidemiological model assumes the region under study to be a single compartment with a homogeneous density of exposed and infected cases across the region and over time. One of the implications of lockdown restrictions is localization, achieved by isolating sub-regions with a higher density of infected individuals. These restrictions create heterogeneity, which requires a multi-compartmental approach. In the HySEIQR model, a country/state is uniformly divided into multiple sub-regions with boundaries. These regions are placed on a square map. The size and number of sub-regions depend on the parameter Ne (number of compartments/sub-regions). The disease propagation dynamics are assumed to occur in each sub-region independently through Eqs. (1–10). The movement of exposed and infected individuals between neighbouring sub-regions depends on the transfer rate (TrRate0).

The inclusion of a multi-compartmental approach adds stochastic components to the model. The transfer of individuals from a sub-region to neighbouring sub-regions occurs through random selection of the neighbours. In addition, during the initialization of the simulation, the number of initially exposed individuals, E0, is distributed among randomly selected sub-regions. These events vary with iterations due to their dependency on pseudo-random numbers.
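The two stochastic steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the published code: sub-regions are indexed row by row on a square grid, each sub-region has up to four neighbours, and at every step the whole outgoing group moves to one randomly chosen neighbour. The function names and the `n_seeds` parameter are hypothetical.

```python
import random


def neighbours(idx, side):
    """Indices of the up/down/left/right neighbours on a side x side grid."""
    row, col = divmod(idx, side)
    out = []
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < side and 0 <= c < side:
            out.append(r * side + c)
    return out


def seed_exposed(n_regions, e0, n_seeds, rng):
    """Distribute e0 initially exposed individuals over random sub-regions."""
    exposed = [0] * n_regions
    seeds = rng.sample(range(n_regions), n_seeds)
    for k in range(e0):
        exposed[seeds[k % n_seeds]] += 1
    return exposed


def transfer(counts, tr_rate, side, rng):
    """Move a fraction tr_rate of each sub-region to one random neighbour."""
    moved = list(counts)
    for idx, n in enumerate(counts):
        leaving = int(n * tr_rate)  # assumption: whole individuals only
        if leaving == 0:
            continue
        target = rng.choice(neighbours(idx, side))
        moved[idx] -= leaving
        moved[target] += leaving
    return moved


rng = random.Random(2021)  # fixed seed so a run can be reproduced
exposed = seed_exposed(n_regions=9, e0=100, n_seeds=3, rng=rng)
after_one_day = transfer(exposed, tr_rate=0.1, side=3, rng=rng)
```

Passing an explicit `random.Random` instance keeps each replicate reproducible, while different seeds give the iteration-to-iteration variation mentioned in the text.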

Incorporating real-world data into the model

The Hybrid SEIQR model considers the day-to-day variations in government-imposed travel restrictions and lockdown conditions, the number of tested samples and the positivity rate. Real-world data from various sources were collected and integrated into the model as the functions λ(t), β(t) and H(t). We collected the time-series data on the number of infections, deaths, recoveries, tested samples and positivity rate from the COVID19India.org GitHub repository (https://api.covid19india.org) for the Indian population till May 2021. Further, the data on the change in the movement of people were collected from Google mobility reports (https://github.com/GoogleCloudPlatform/covid-19-open-data) and the Oxford stringency index21 (http://www.bsg.ox.ac.uk/covidtracker) as a measure of the Quarantine and Lockdown stringency index (λ(t)). β(t) denotes the variation in transmission rate (Eq. 11). H(t) represents the actual number of positively tested cases in a day. The number of tested/identified cases predicted by the model on a day t depends directly on the number of infected cases (Ia, Im and Is) on that day but is limited by H(t). Seven-day window averages were used throughout our study to reduce non-specific day-to-day variations (Fig. S1).
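The smoothing and the cap imposed by H(t) can be illustrated with a short sketch. It assumes a trailing seven-day window (the text does not state whether the window is trailing or centred) and a hypothetical per-class detection probability `detect_prob`. When the model would identify more cases than were actually tested positive, the per-class numbers are scaled down proportionally.

```python
def rolling_mean(values, window=7):
    """Trailing moving average; early days use the data available so far."""
    out, running = [], 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]
        out.append(running / min(i + 1, window))
    return out


def identified_cases(infected, h_t, detect_prob):
    """Cases identified per class on one day, limited by the observed H(t).

    infected    -- dict class -> number of infected cases (Ia, Im, Is)
    h_t         -- observed number of positive tests on that day
    detect_prob -- hypothetical dict class -> probability of being tested
    """
    wanted = {c: infected[c] * detect_prob[c] for c in infected}
    total = sum(wanted.values())
    if total == 0 or total <= h_t:
        return wanted
    scale = h_t / total  # shrink all classes so the total equals H(t)
    return {c: w * scale for c, w in wanted.items()}


daily_positives = [120, 90, 150, 80, 200, 60, 140, 130]
smoothed = rolling_mean(daily_positives)
today = identified_cases(
    {"a": 100, "m": 50, "s": 10},
    h_t=9,
    detect_prob={"a": 0.1, "m": 0.5, "s": 1.0},
)
```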

$$\beta(t) = \beta_{0} \left[ W_{q} \left(1 - \lambda(t)\right) + \left(1 - W_{q}\right) \right]$$
(11)
$$\mathrm{Tr}^{\mathrm{rate}}(t) = \mathrm{Tr}^{\mathrm{rate}}_{0} \left[ W_{T} \left(1 - \lambda(t)\right) + \left(1 - W_{T}\right) \right]$$
(12)

where Wq and WT denote the weights associated with the lockdown regulations, λ(t).
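Eqs. 11 and 12 share one form: a base value multiplied by a weighted mix of the lockdown-reduced term and the unrestricted term. Expanding the bracket gives W(1 − λ) + (1 − W) = 1 − Wλ, so the weight W sets what fraction of the stringency is passed on to the rate. The sketch below assumes λ(t) has been rescaled to the range 0 to 1; the function name is hypothetical.

```python
def lockdown_scaled(base, weight, stringency):
    """Apply the weighting of Eqs. 11-12 to a base rate.

    base       -- beta_0 or the base transfer rate
    weight     -- W_q or W_T, between 0 and 1
    stringency -- lambda(t), assumed rescaled to lie between 0 and 1
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie between 0 and 1")
    if not 0.0 <= stringency <= 1.0:
        raise ValueError("stringency must lie between 0 and 1")
    return base * (weight * (1.0 - stringency) + (1.0 - weight))


# With weight 0 the lockdown has no effect; with weight 1 it acts in full.
beta_ignored = lockdown_scaled(0.5, weight=0.0, stringency=0.8)
beta_full = lockdown_scaled(0.5, weight=1.0, stringency=0.8)
```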

Interpreting the HySEIQR model

Parameter estimation and fit

The quantities in the model were treated either as parameters, estimated through non-linear data fitting, or as constants, derived from the literature and published reports (Table 1). The quantities associated with disease dynamics, such as σ, β and γ, have been well studied and were treated as constants in this work22,23,24.

The rest of the parameters were estimated through a two-step approach. First, a systematic grid search was performed in the parameter space (Table 1). Then, for each point in the grid, the least-squares optimization algorithm (least_squares) was applied to minimize the least-squares error between the predicted and actual numbers of identified recovered and infected cases25,26. The model with the lowest error was selected and further optimized.
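The two-step idea, a coarse grid search followed by local refinement from the best grid point, can be sketched with the standard library alone. Note the substitution: the study applies least_squares at each grid point, whereas this sketch uses a simple coordinate-wise pattern search as a stand-in and refines only the best grid point. The toy model `saturating_curve`, the grid values and the synthetic data are all assumptions made for illustration.

```python
import itertools
import math


def sse(params, model, times, observed):
    """Sum of squared errors between model predictions and observations."""
    return sum((model(t, *params) - y) ** 2 for t, y in zip(times, observed))


def grid_search(model, times, observed, grid):
    """Return (error, point) for the best point of the Cartesian grid."""
    return min(
        ((sse(point, model, times, observed), point)
         for point in itertools.product(*grid)),
        key=lambda pair: pair[0],
    )


def refine(model, times, observed, start, steps, rounds=60):
    """Coordinate-wise pattern search; a stand-in for least_squares."""
    point, steps = list(start), list(steps)
    err = sse(point, model, times, observed)
    for _ in range(rounds):
        improved = False
        for i in range(len(point)):
            for delta in (steps[i], -steps[i]):
                trial = point.copy()
                trial[i] += delta
                trial_err = sse(trial, model, times, observed)
                if trial_err < err:
                    point, err, improved = trial, trial_err, True
        if not improved:
            steps = [s / 2 for s in steps]  # shrink the search radius
    return err, tuple(point)


def saturating_curve(t, rate, scale):
    """Toy model standing in for the simulated number of recovered cases."""
    return scale * (1.0 - math.exp(-rate * t))


times = list(range(30))
observed = [saturating_curve(t, 0.3, 10.0) for t in times]  # synthetic data
grid = [(0.1, 0.25, 0.5), (5.0, 9.0, 15.0)]
grid_err, grid_point = grid_search(saturating_curve, times, observed, grid)
fit_err, fit_point = refine(
    saturating_curve, times, observed, grid_point, steps=(0.05, 1.0)
)
```

The grid step guards against starting the local optimizer in a poor basin, which matters for a model whose predictions are highly sensitive to its parameters.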

Comparison between predicted and actual data

Figure 2 shows the predicted number of recovered cases and the change in infected cases over time, along with the actual data for the Indian population. As of 15th May 2021, the available data report approximately 20.82 million recovered cases, 270,000 deaths and 3.54 million active infected cases. Our model predicted 18.56 million identified and 772.28 million unidentified recovered cases. The number of unidentified recovered cases is nearly 37 [25–49] times higher than the reported number of recovered cases. Similarly, our model predicted 2.6 [1.7–3.5] times more deaths than the reported deaths due to COVID-19. In other words, on average only ~ 3 in 100 recovered cases and ~ 1 in 3 deaths are reported. Despite the variations in the prediction results over iterations and the high sensitivity of the model to its parameters, the model consistently indicates several times more recovered cases than reported. These results agree with earlier reports of undetected COVID-19 cases27,28,29. Especially considering the asymptomatic cases and the low testing rate, the actual number of recovered cases can be several times higher than the reported numbers.
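The ratios quoted above follow from the stated figures by simple arithmetic, shown here as a quick check. The interpretation is an assumption: the share of recoveries reported is taken as reported recoveries over reported plus unidentified recoveries, and the share of deaths reported as the reciprocal of the 2.6-fold factor.

```python
reported_recovered = 20.82        # million, reported as of 15 May 2021
unidentified_recovered = 772.28   # million, predicted by the model
death_factor = 2.6                # predicted deaths relative to reported

# How many unidentified recoveries per reported recovery.
ratio = unidentified_recovered / reported_recovered

# Assumed reading: share of all recoveries that appear in the reports.
share_recovered_reported = reported_recovered / (
    reported_recovered + unidentified_recovered
)

# Assumed reading: share of all deaths that appear in the reports.
share_deaths_reported = 1.0 / death_factor

print(f"unidentified / reported recoveries: {ratio:.1f}")
print(f"recoveries reported: {share_recovered_reported:.1%}")
print(f"deaths reported: {share_deaths_reported:.1%}")
```

These come to roughly 37, about 3 in 100 and a little over 1 in 3, matching the rounded values in the text.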

Figure 2

Actual and predicted number of (a) recovered and (b) change in infected COVID-19 cases in India. The shaded regions represent the standard error from 10 replicates.

To validate our simplified multi-compartmental design, we compared the distribution of recovered cases across the compartments/sub-regions with the relevant administrative sub-divisions in India. We chose districts for comparison since our model consists of 1000 sub-regions, which is in a similar range to the number of districts in India. Figure 3 shows the fraction of sub-regions and districts with more than 1000 and more than 10,000 recovered cases. A threshold of 1000 was used to check the presence of COVID-19 infection in a sub-region. In addition, a threshold of 10,000 was chosen to test how widely the disease had spread. The results indicate that our model underestimates the presence of COVID-19 infection at a threshold of 1000. However, the model showed a better correlation with the spread of the disease in sub-regions at a threshold of 10,000. The difference between the predicted (sub-regions) and actual (districts) values is expected since the size and distribution of sub-regions are uniformly modelled, whereas districts vary significantly in size and geographical location. The gaps can be overcome by using a network topology based on the distribution of district sizes and their connectivity30,31.
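The quantity plotted in Fig. 3 can be computed as in the sketch below: the fraction of sub-regions (or districts) whose recovered count exceeds a threshold, summarized over replicates by the mean and standard error. The function names and the sample numbers are illustrative assumptions, not data from the study.

```python
import math
import statistics


def fraction_above(counts, threshold):
    """Fraction of sub-regions with strictly more cases than the threshold."""
    return sum(1 for c in counts if c > threshold) / len(counts)


def mean_and_standard_error(values):
    """Mean and standard error of the mean across replicates."""
    mean = statistics.fmean(values)
    if len(values) < 2:
        return mean, 0.0
    return mean, statistics.stdev(values) / math.sqrt(len(values))


# Illustrative recovered counts for four sub-regions in two replicates.
replicates = [
    [500, 1500, 20000, 999],
    [1200, 1800, 9000, 400],
]
fractions_1k = [fraction_above(r, 1000) for r in replicates]
fractions_10k = [fraction_above(r, 10000) for r in replicates]
mean_1k, se_1k = mean_and_standard_error(fractions_1k)
```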

Figure 3

Comparing the predictions for the sub-regions in our model with the actual data from Indian districts. The y-axis represents the fraction of sub-regions (blue line)/districts (green line) with (a) more than 1000 recovered cases and (b) more than 10,000 recovered cases. The shaded regions represent the standard error from 10 replicates.

Influence of lockdown regulations on disease-control

The lockdown-imposed restrictions (λ(t)) affect the disease transmission factor (β0) as described in Eq. 11. Wq is the weight that determines the influence of λ(t) on β. A higher Wq indicates a more substantial influence of lockdown rules in reducing disease transmission. To study the influence of this parameter, we ran simulations by varying Wq from 0 to 1 at intervals of 0.2 (Fig. 4). Figure 4a shows the change in the total number of infected cases (identified and unidentified) over time. The results show that with strong adherence to government-imposed lockdown regulations (Wq ≥ 0.6), the spread of COVID-19 could have been controlled within six months. However, non-adherence (Wq ≤ 0.4) to the restrictions could lead to the rapid spread of disease, with an average of a million cases per day and spread to almost every sub-region (Fig. 4b). We repeated the analysis by varying TrRate0, the transfer rate of infected individuals between sub-regions (Fig. 5). With zero movement between the sub-regions (TrRate0 = 0), the disease is restricted to the initial compartments and its spread to other compartments is stopped. Increasing TrRate0 increases the rate of spread to other sub-regions, leading to a rapid increase in the infection rate.
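A sweep of this kind can be illustrated with a deliberately simplified stand-in: a single-compartment SEIR model with a constant stringency, stepped once per day. All numerical values below (β0, σ, γ, the stringency of 0.8 and the initial exposed fraction) are assumptions chosen for illustration and are not the fitted values of the study, so the output shows only the qualitative trend.

```python
def simulate_peak(w_q, stringency=0.8, beta0=0.4, sigma=0.2, gamma=0.1,
                  days=365, e0=1e-4):
    """Peak infected fraction of a toy SEIR run for a given weight W_q."""
    s, e, i, r = 1.0 - e0, e0, 0.0, 0.0
    # Eq. 11 with a constant stringency (assumption for this sketch).
    beta = beta0 * (w_q * (1.0 - stringency) + (1.0 - w_q))
    peak = i
    for _ in range(days):
        new_exposed = beta * s * i
        new_infected = sigma * e
        new_removed = gamma * i
        s -= new_exposed
        e += new_exposed - new_infected
        i += new_infected - new_removed
        r += new_removed
        peak = max(peak, i)
    return peak


# W_q from 0 to 1 at intervals of 0.2, as in the analysis above.
peaks = {w: simulate_peak(w) for w in [k / 5 for k in range(6)]}
for w, peak in peaks.items():
    print(f"W_q = {w:.1f}: peak infected fraction = {peak:.4f}")
```

In this toy setting a larger Wq lowers the effective transmission rate and hence the epidemic peak; the full model additionally couples the weight to time-varying λ(t) and to movement between sub-regions.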

Figure 4

Effect of lockdown and quarantine on the spread of COVID-19 infection. The simulated change in the (a) total number of infected individuals (identified and unidentified) and (b) the number of sub-regions/compartments with more than 1000 recovered cases as a function of the parameter, Wq.

Figure 5

Role of inter-compartment movement on the spread of COVID-19 infection. The simulated change in the (a) total number of infected individuals (identified and unidentified) and (b) the number of sub-regions/compartments with more than 1000 recovered cases as a function of the parameter, TrRate0, transfer rate between sub-regions.

Discussion

Several methods have been developed using modified SEIR models to understand the spread of COVID-19. The most necessary adaptations are (i) identification of infected cases and quarantine of suspected cases, (ii) the role of lockdown in social interactions and movement of the population and (iii) inclusion of asymptomatic cases32,33,34. Quarantine is one of the important strategies in controlling any contagious disease32. Senapati et al.35 developed a deterministic compartmental model incorporating quarantine and hospitalization for mild and severe symptomatic cases, respectively. The rates of quarantine and hospitalization are usually determined as part of the model parameters. In our model, quarantine/hospitalization is determined by the number of actual tested cases in the region, obtained from real-world data and fed to the model. These tested cases are further distributed among symptomatic and asymptomatic cases based on testing sensitivity and specificity. These suspected cases, which include true and false positives, are quarantined during the expected recovery period (determined by γ). These assumptions are close to the real-world situation and are also easily adaptable to other geographical regions.

The effect of lockdown on disease transmission is time-dependent and complex, with various direct and indirect influences. For example, government-imposed regulations directly create barriers to the movement of people and indirectly generate awareness among the population to follow hygienic practices. Although these actions reduce the disease transmission rate (β), the influence of these measures changes over time. To accommodate this effect, studies have modelled β as a function of time or lockdown36,37,38. Ianni and Rossi36 represented β as a decreasing exponential function to accommodate the increasing awareness of the population and the decreasing disease transmission rate over 120 days since the disease outbreak.

On the other hand, awareness could gradually decline over an extended period, and governments can impose lockdown in phases, which creates waves of awareness. While such approaches are convenient and easily adapted to standard epidemiological models/equations, the disease transmission rate depends on complex social interactions among the population/community. Networks/graphs representing the interaction pattern among communities are often used to overcome these shortcomings30,31,39,40. In our model, we have incorporated two measures, namely Google mobility reports (https://github.com/GoogleCloudPlatform/covid-19-open-data) and the Oxford stringency index21 (http://www.bsg.ox.ac.uk/covidtracker), as a measure of the Quarantine and Lockdown stringency index (λ(t)). These measures represent the dynamic changes in government norms and the associated population behaviour towards COVID-19. Thus, they provide the model with a better reflection of the real-world situation. λ(t) influences the disease transmission rate, β, and also affects the movement of people from one region to another, TrRate, in the model (Eqs. 11 and 12). In addition, the multi-compartmental design of the model raises a barrier to people moving from one sub-region to another. Thus, the model mimics the effect of lockdown in a large geographical region like India.

Consideration of asymptomatic cases is another essential criterion for a COVID-19 epidemiological model. Models that incorporate asymptomatic cases consider a part of the infected cases to be asymptomatic, with no identifiable symptoms, and probably undetected. This fraction of the infected patients undergoes natural recovery over a period of time27,41,42. A similar approach is employed in our model. Infected cases are considered to be part of one of three classes: (i) asymptomatic, (ii) symptomatic with moderate symptoms and (iii) symptomatic with severe symptoms. A few models treated a constant fraction of infected cases as asymptomatic, which is determined by model optimization. These asymptomatic cases can remain asymptomatic until recovery or may show symptoms over time42.

Conclusions

We have developed a hybrid SEIQR model by incorporating several adaptations for COVID-19 disease, testing protocols, and current quarantine and lockdown regulations. Our approach rests on several assumptions and simplifications: (i) the government-imposed lockdown regulations were represented through over-simplified metrics from openly available reports, (ii) the role of hospitals in controlling the mortality rate, the allocation and availability of equipment in hospitals, and the effect of viral strains on disease transmission and mortality rate were not factored into the model and (iii) only a subset of the parameters was optimized, while the rest were treated as constants based on the literature to ease parameter optimization.

Despite the limitations, our model captured the essence of quarantine, lockdown and the movement of infected individuals between regions. The model was developed with minimal dependencies using core Python libraries and is available as a webserver at https://web.iitm.ac.in/bioinfo2/covid19hyseiqr/home. The model is highly customizable and can be adapted to further modifications. The inclusion of the network topology of administrative divisions in India and the effects of viral strains would benefit the community to a greater extent.

Ethics declarations

No experiments on humans or animals were conducted as part of the study. All data used in the study were collected from openly available repositories.