Modelling, prediction and design of COVID-19 lockdowns by stringency and duration

The implementation of lockdowns has been a key policy to curb the spread of COVID-19 and to keep under control the number of infections. However, quantitatively predicting in advance the effects of lockdowns based on their stringency and duration is a complex task, in turn making it difficult for governments to design effective strategies to stop the disease. Leveraging a novel mathematical “hybrid” approach, we propose a new epidemic model that is able to predict the future number of active cases and deaths when lockdowns with different stringency levels or durations are enforced. The key observation is that lockdown-induced modifications of social habits may not be captured by traditional mean-field compartmental models because these models assume uniformity of social interactions among the population, which fails during lockdown. Our model is able to capture the abrupt social habit changes caused by lockdowns. The results are validated on the data of Israel and Germany by predicting past lockdowns and providing predictions in alternative lockdown scenarios (different stringency and duration). The findings show that our model can effectively support the design of lockdown strategies by stringency and duration, and quantitatively forecast the course of the epidemic during lockdown.


Model
A core simplifying assumption of mean-field compartmental models is that all individuals in the population have the same degree of interaction with everyone else. This assumption is violated when a lockdown is in place: while a minority of individuals (e.g. essential workers) has a high degree of social interaction, the majority limits its contacts to the members of its own household. In other words, on the day on which a lockdown is enforced, the population undergoes an abrupt change in its dynamic behaviour, which undermines the assumption on which traditional mean-field compartmental models are grounded [30]. The FL-Hybrid (Free-to-Lockdown Hybrid) epidemic model that we propose in this paper overcomes this fundamental limitation, in that it is able to model the instantaneous creation and destruction of these two categories of individuals characterised by different levels of social interaction. The FL-Hybrid model exploits a hybrid mathematical framework [31] that allows modelling the effect of discrete events (e.g. governments introducing lockdowns) on continuous-time dynamics (e.g. compartmental models).
The relevance of the FL-Hybrid model lies not only in addressing mathematically the sudden change of social behaviour caused by a lockdown, but also in providing policymakers with a tool to assess the impact of a past lockdown on the course of the epidemic and to plan for potential new lockdowns, should cases rise sharply in the future. In fact, one parameter that the model allows tuning is the stringency of the lockdown, i.e. the percentage of the population effectively confined to their own household. 
Different stringency levels are reflected in real life by, for instance, what type and how many workers are considered essential, thus being allowed to move freely, but also how strictly the rules are enforced and, therefore, how many individuals are expected to abide by them. A second parameter that the model allows tuning is the duration of the lockdown, which may be crucial to evaluate the optimal time to lift it, thus avoiding a new surge in cases as well as preventing an overly prolonged paralysis of the economy. Apart from stringency and duration, which are parameters for the policymakers to design, the rest of the model parameters can be estimated from analysis of the available data, specifically the number of active COVID-19 infections in a given country, the cumulative number of deaths, and the cumulative number of recoveries. The interactions between the free phase and the lockdown phase and, within these, among the sub-models and their inner compartments are shown in Fig. 1. The mathematical formulation of the FL-Hybrid model is provided in the following section. A detailed discussion of the model considerations, assumptions and parameters is provided in the "Methods".
Mathematical formulation. The FL-Hybrid (Free-to-Lockdown Hybrid) model is a hybrid dynamical model [31] switching between two modes, which we call phases: the free phase and the lockdown phase. Both the free phase and the lockdown phase of the model are described by sets of ordinary differential equations and represent the so-called flow of the hybrid model. The switching action (i.e. the so-called jump set of the hybrid model) corresponds to the government's action of enforcing or lifting a lockdown.
The free phase models the evolution of the epidemic when the entire population is allowed to move freely and no strict policy requiring the individuals to stay at home and avoid contacts with others is enforced. This phase typically occurs at the beginning of the epidemic or after a period of lockdown, when social distancing measures are relaxed as a result of a drop in number of diagnosed infections. In this phase the assumption of traditional compartmental models is verified, therefore the free phase is described by a sub-model that is a variation of a standard SIR model. We refer to this classical model as SUDER, in which the population is partitioned in five disease stages: S, susceptible; U, undetected or undiagnosed infected; D, detected or diagnosed infected; E, extinct (dead); R, recovered.
The lockdown phase models the evolution of the epidemic when a lockdown is imposed. The population is divided into two categories, which we call the free population and the lockdown population. The free population is a minority that maintains a high number of interactions with other individuals. This category includes, for example, key workers, who are partially exempt from isolation to carry out essential jobs, and also accounts for individuals who do not comply with the regulations. Thus, the dynamics of the epidemic in the free population is still accurately described by the SUDER sub-model. On the other hand, the lockdown population is assumed to have a drastically reduced number of social interactions. This population is split into households, which are for simplicity assumed to have a fixed size of three members (the rationale for this number is explained in the "Methods"). The dynamics of the epidemic among these individuals is described by the HP (Household Partitions) sub-model. In the HP sub-model the household, rather than the individual, is the fundamental unit and represents a small group of individuals who often come into contact with each other, but rarely interact with the rest of the population. Hence, a household with zero infected individuals at the start of the lockdown will most likely keep this status until the end of the lockdown. Susceptible members in a household might still be infected through external contacts, but the probability of this happening is low. Instead, if one or more infected individuals are present in the household, then the HP sub-model describes the spread of the disease among the household members, who then, if infected, go through the usual U, D, E, and R stages of infection. The detailed mathematical description of the free phase and lockdown phase of the model is provided in the forthcoming sections. 
The switching between these phases and the full discussion of the modelling assumptions is provided in the "Methods".
Free phase. In the free phase the epidemic evolves according to a standard compartmental sub-model which we call the SUDER sub-model. The SUDER sub-model is a dynamical model consisting of six ordinary differential equations. Each equation characterises the change over time of the proportion of the population experiencing a specific stage of the disease. This model describes the dynamics of the epidemic when no lockdown measures are introduced. The system of equations is given by
$$
\begin{aligned}
\dot{S} &= -\beta S U, \\
\dot{U} &= \beta S U - (\delta + \rho) U, \\
\dot{D} &= \delta U - (\sigma + \theta) D, \\
\dot{E} &= \theta D, \\
\dot{R}_u &= \rho U, \\
\dot{R}_d &= \sigma D,
\end{aligned} \tag{1}
$$
where S (Susceptible), U (infected Undetected), D (infected Detected), E (Extinct), $R_u$ (undetected Recovered), and $R_d$ (detected Recovered) are the proportions of the population at each stage of the disease, while the Greek letters represent the parameters of the model and are positive numbers. In particular,
• β is the disease transmission rate from an undetected infected person to a susceptible person. This is the probability that an undetected person transmits the infection to a susceptible person multiplied by the average number of contacts per person. This parameter depends both on the infectivity of the disease and on the number of close contacts between individuals. Therefore this parameter is reduced when social distancing measures are implemented.
• δ is the probability rate of detection, i.e. the rate at which an undetected infected person becomes detected after any form of diagnosis. This parameter increases as the scale and efficiency of mass testing and contact tracing policies are improved.
• ρ and σ are the probability rates of recovery of undetected and detected infected people, respectively. Undetected people are generally asymptomatic or develop very mild symptoms, compared to detected people who might develop life-threatening conditions. Therefore, ρ is generally higher than σ.
• θ is the mortality rate of detected people and is lowered by more effective treatments.
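To make the free-phase dynamics concrete, the following is a minimal sketch of a SUDER simulation. It assumes the mass-action form implied by the parameter definitions above and by inequality (9) (new infections at rate βSU, detection at rate δU, and so on); the function names, the fixed-step Runge-Kutta integrator and all numerical values are illustrative assumptions, not the authors' MATLAB implementation or their fitted parameters.

```python
def suder_rhs(state, beta, delta, rho, sigma, theta):
    """Right-hand side of the assumed SUDER equations (population proportions)."""
    S, U, D, E, Ru, Rd = state
    new_infections = beta * S * U
    return (
        -new_infections,                       # susceptible
        new_infections - (delta + rho) * U,    # undetected infected
        delta * U - (sigma + theta) * D,       # detected infected
        theta * D,                             # extinct (dead)
        rho * U,                               # recovered, never detected
        sigma * D,                             # recovered after detection
    )


def rk4_step(state, h, params):
    """One classical fourth-order Runge-Kutta step of size h (in days)."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = suder_rhs(state, **params)
    k2 = suder_rhs(shifted(state, k1, h / 2), **params)
    k3 = suder_rhs(shifted(state, k2, h / 2), **params)
    k4 = suder_rhs(shifted(state, k3, h), **params)
    return tuple(
        s + h / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_suder(state, params, days, steps_per_day=10):
    """Return the state at the end of each day, starting with day 0."""
    h = 1.0 / steps_per_day
    trajectory = [tuple(state)]
    for _ in range(days):
        for _ in range(steps_per_day):
            state = rk4_step(state, h, params)
        trajectory.append(state)
    return trajectory


# Illustrative values only (hypothetical, not estimated from any data set).
params = dict(beta=0.3, delta=0.05, rho=0.1, sigma=0.05, theta=0.002)
initial = (0.999, 0.001, 0.0, 0.0, 0.0, 0.0)  # S, U, D, E, Ru, Rd
trajectory = simulate_suder(initial, params, days=60)
```

Because the six right-hand sides sum to zero, the six proportions keep summing to one along the trajectory, which is a quick sanity check on any implementation of the sub-model.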
Lockdown phase. In the lockdown phase the epidemic evolves according to the interaction of two sub-models, one for the free population and one for the lockdown population. This interaction results in a system of twenty-two ordinary differential equations.
Among the free population the epidemic evolves according to the SUDER sub-model; therefore six of the twenty-two equations are analogous to the ones previously introduced and are given by (2), where both the Latin and Greek letters have the same interpretations as in (1) and the subscript f specifies that the quantities characterise the free population while a lockdown is enforced.
The remaining sixteen equations describe the evolution of the epidemic among the lockdown population and constitute the so-called HP (Household Partitions) sub-model. This sub-model aims at considerably simplifying the complex dynamics of people isolating in lockdown, while at the same time capturing the fundamental behaviour of the progression of the disease. The core of the HP sub-model is the household, which is a unit composed of 3 individuals. The assumption on the fixed number of individuals is justified in the "Methods". The equations model the spread of the disease among members of the same household who are observing the lockdown measure, but also the fact that households do not isolate perfectly and new infections can be introduced into a household when its members come into contact with infected people from the free population. Let $H_i$ denote the number of households with i undetected infected members, with i = 0, 1, 2, 3. Then the equations describing their dynamics are given by (3), where $\beta_h$ denotes the probability rate that an infected household member infects another susceptible person within the household. This parameter depends on the level of interaction between members of the same household, and increases as household members interact more. $\rho_h$ denotes the probability rate that an undetected household member recovers before infecting other people within the household. Again, this depends on the level of interaction between household members, but decreases as members interact more. $\delta_h$ denotes the probability rate that an undetected household member becomes detected before infecting other members. This rate depends on the efficiency of testing policies. $\beta_{fh}$ denotes the disease transmission rate from an undetected person in the free population to a susceptible person in an infection-free household. This parameter depends on the level of exposure of susceptible members in a household to the free population, and decreases as social distancing measures are tightened.
Let T be the total population, which is considered to be constant. We now define the quantities $U_i = iH_i/T$ for i = 1, 2, 3, which represent the proportions of the population that are undetected infected and live in households of type $H_i$. Analogously, we define the quantities $D_i$, $E_i$, $R_{u,i}$ and $R_{d,i}$. The set of sixteen equations representing the HP sub-model is then given by (4), where $\sigma_h$ denotes the probability rate of recovery of a detected infected household member and $\theta_h$ denotes the mortality rate of a detected infected household member.
Equations (2) and (4) together describe the lockdown phase of the FL-Hybrid model. The parameters of the model, both in its free and lockdown phases, are estimated using the official data provided by the national health authorities. In particular, we consider the time histories of the total number of COVID-19 diagnosed cases, deaths and recoveries. We then obtain the time history of the COVID-19 active cases by subtracting the deaths and recoveries from the diagnosed cases. The model parameters are then chosen so as to minimise the weighted mean squared errors between the model-predicted time histories of the active cases, deaths and recoveries and the real ones. The model parameters can be periodically updated to reflect the changes in factors like the stringency of social distancing measures, the effectiveness of testing and contact tracing regimes and the efficacy of treatments. These changes might alter the infection, detection, mortality and recovery rates, thus justifying this parameter updating procedure.
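The data preparation and the fitting criterion described above can be sketched as follows. This is a minimal illustration, assuming one weight per time series and a plain mean of squared errors within each series; the function names, the weights and the toy numbers are hypothetical, and the sketch does not reproduce the authors' MATLAB grey-box estimation.

```python
def active_cases(diagnosed, deaths, recoveries):
    """Active cases = cumulative diagnosed - cumulative deaths - cumulative recoveries."""
    return [c - d - r for c, d, r in zip(diagnosed, deaths, recoveries)]


def weighted_mse(predicted, observed, weights):
    """Weighted sum of per-series mean squared errors.

    predicted, observed: dict mapping a series name to a list of daily values.
    weights: dict mapping the same names to non-negative weights (assumed form).
    """
    total = 0.0
    for name, weight in weights.items():
        squared = [(p - o) ** 2 for p, o in zip(predicted[name], observed[name])]
        total += weight * sum(squared) / len(squared)
    return total


# Toy cumulative series over three days (made-up numbers).
diagnosed = [100, 150, 210]
deaths = [1, 2, 4]
recoveries = [10, 30, 60]

observed = {
    "active": active_cases(diagnosed, deaths, recoveries),
    "deaths": deaths,
    "recoveries": recoveries,
}
predicted = {
    "active": [90, 120, 140],
    "deaths": [1, 2, 5],
    "recoveries": [10, 30, 60],
}
weights = {"active": 1.0, "deaths": 10.0, "recoveries": 1.0}  # hypothetical weights
cost = weighted_mse(predicted, observed, weights)
```

A parameter estimator would minimise `cost` over the model parameters; giving deaths a larger weight, as in this toy example, is one way to stop the much larger active-case counts from dominating the fit.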

Results and discussion
To illustrate the effectiveness of our model we present two case studies, namely the evolution of the COVID-19 epidemics in Israel and Germany.
Stringency. We first illustrate the effect of changing the strength of lockdown policies on the number of active cases and deaths. We define the active cases of infection at any given time as the difference between the cumulative number of diagnosed infections and the sum of cumulative deaths and recoveries. Figure 2a,b show the change in the number of active cases and deaths as a result of changing the stringency of a lockdown in Israel. We focus on the nation-wide lockdown imposed for thirty days, between 18/09/2020 and 18/10/2020 (days 183 and 213 since the start of the epidemic). The data are compatible with our model prediction that about 65% of the population stayed at home. As shown in Figure 2a, a more stringent lockdown would have resulted in an earlier and lower peak of active infections. A less stringent lockdown would have resulted in a peak of active cases of about 0.9413% of the population. The beneficial effects of stricter lockdown policies are also evident in the number of deaths, as shown in Figure 2b. Figure 2c,d show an analogous behaviour when a change in lockdown stringency is considered in relation to the national lockdown imposed in Germany between 23/03/2020 and 12/05/2020 (days 23-73 since the beginning of the outbreak), lasting 50 days. Our model suggests that the government data on active cases and deaths are compatible with 80% of the population isolating within their own household. In Figure 2c a stricter lockdown produces a lower and flatter curve of active cases, while a milder one results in a much higher peak, just under double the actually diagnosed cases. Figure 2d shows that a milder lockdown would have caused five times as many deaths as a stricter one. While our model suggests that the lockdown is an extremely effective way to curb the spread of the infection among the population, thus confirming previous findings [7,9], it additionally allows two types of assessment to be made. 
Firstly, it is possible to infer how large the proportion of population actually isolating is. This enables policymakers to evaluate, for example, whether the isolation rules are enforced in an adequate way or if the number of workers considered essential can be increased or must be reduced. Secondly, our model helps to quantify the lockdown effectiveness by predicting potential outcomes, in terms of number of active cases and deaths, when a lockdown is implemented with different levels of strength. Figure 2 shows that even milder forms of lockdown (i.e. making sure that about half of the population is isolating) help to make the curve of active infections flatter, compared to the case of no restrictions, thus avoiding an extremely large number of infections and deaths.

Duration.
Having assessed the effects of lockdown stringency on the predicted number of infections and deaths, we now consider the implications of different lockdown durations. Figure 3a,b illustrate this for the case study of the lockdown imposed in Israel between 18/09/2020 and 18/10/2020. While the lockdown in Israel lasted thirty days in this instance, we consider alternative scenarios in which a lockdown with the same stringency is lifted later or earlier, and we show our model predictions over a window of thirty days after the end of the lockdown. As of day 260, a 50-day lockdown would have achieved a tenfold reduction in the number of active cases with respect to the actual thirty-day lockdown. In Fig. 3c,d we show the same for the German nation-wide lockdown.
Subsequent lockdown. Finally, Fig. 4 demonstrates that our model is able to predict the evolution of a new lockdown based only on data from previous lockdowns. This kind of prediction is different from the previous ones we presented, in that we no longer provide alternative scenarios in which different types of lockdown are implemented. Instead, here we place ourselves at the start of the second lockdown, and we assume that no data is available on the future course of the epidemic. The model is able to reasonably predict the number of active cases and infections in a future time window by leveraging the data collected in the first lockdown. This is a powerful tool that enables policymakers to design the lockdown in order to achieve, for instance, a desired peak of the active cases or to limit the total number of deaths. In Fig. 4a,b we consider the second Israeli national lockdown starting on 27/12/2020 (day 283 of the epidemic). In our model we set a 60% lockdown stringency (instead of the first lockdown's 65%) to predict the course of the epidemic in the first 15 days of the second lockdown. 
This is compatible with the fact that in the first days this lockdown was milder than the previous one [32]. Based on data from the previous lockdown, our model is able to predict that the future peak of the active cases of infection would be about 0.9% of the population. The future increase of the total number of deaths is also accurately reproduced. Figure 4c,d illustrate analogous findings for the second lockdown in Germany, starting on 02/11/2020 (day 247 of the epidemic). This second lockdown is less stringent than the first one [32], which is compatible with a 70% stringency in our model, instead of the first lockdown's 80%. Moreover, the drastically increased testing capacity also reduced the case fatality rate of coronavirus [33]. Using data collected in the first lockdown and adjusting for a smaller case fatality rate, we predict the course of the epidemic for the first 15 days of lockdown. Our model successfully predicts that the peak of the active cases is approximately 0.37% of the total population, and is able to provide consistent forecasts of the future trends of deaths. It is important to remark that these predictions of the future course of the epidemic are to be considered an initial and provisional estimate of the future trends. When real data become available, they should be used to assess the effectiveness of the lockdown and, by feeding them back to the model, to produce improved future predictions. The motivations, findings and implications of our model are summarised in Table 1.

Methods
In this section we describe the switching between the free and lockdown phases of the FL-Hybrid model and discuss the underlying modelling assumptions and justifications.

Switching between phases.
When the FL-Hybrid model switches between the free phase and the lockdown phase (and vice versa), some conditions on the variables need to be enforced in order to guarantee consistency. When a lockdown starts, the FL-Hybrid model switches from the free phase to the lockdown phase. To guarantee consistency, the population that is split among the SUDER sub-model compartments in the free phase needs to be distributed among the compartments of the lockdown phase. To do so, we first define the lockdown percentage L, which is the percentage of the susceptible and undetected population which will become the lockdown population. Denote by t the moment at which the switching between free phase and lockdown phase occurs. Then at the switching the lockdown population will be L(S(t) + U(t)). Note that we exclude that detected, extinct or recovered people are part of the lockdown population. This assumption makes the analysis of the model simpler without compromising its accuracy. In fact, the population in D is assumed to be perfectly isolated and the population in E, $R_u$, and $R_d$ is not infectious anymore. Therefore these variables do not play an active role in the dynamics of the epidemic, although they still play a fundamental role in tracking the impact of the disease. Distributing these portions of population among the households would have very little impact on the evolution of the epidemic among the lockdown population, and for simplicity they can be considered part of the free population. The free population is therefore $(1 - L)(S(t) + U(t)) + D(t) + E(t) + R_u(t) + R_d(t)$. Consequently, the initial conditions for the SUDER sub-model of the free population in the lockdown phase are
$$
S_f(t) = (1-L)\,S(t), \quad U_f(t) = (1-L)\,U(t), \quad D_f(t) = D(t), \quad E_f(t) = E(t), \quad R_{u,f}(t) = R_u(t), \quad R_{d,f}(t) = R_d(t).
$$
As far as the lockdown population is concerned, this has to be split appropriately into households. Since each household has three members, the total number of households is given by
$$
N = \frac{L\,\bigl(S(t) + U(t)\bigr)\,T}{3}.
$$
Let $a_i$ be the proportion of households with i undetected infected at the beginning of the lockdown, i.e. $H_i = a_i N$, for i = 0, 1, 2, 3. The initial conditions of the HP sub-model are then
$$
H_i(t) = a_i N, \quad i = 0, 1, 2, 3, \qquad U_i(t) = \frac{i\,a_i N}{T}, \quad D_i(t) = E_i(t) = R_{u,i}(t) = R_{d,i}(t) = 0, \quad i = 1, 2, 3.
$$
Note that the coefficients $a_i$ are constrained in [0, 1] and must satisfy the system of equations
$$
a_0 + a_1 + a_2 + a_3 = 1, \qquad a_1 + 2a_2 + 3a_3 = \frac{3\,U(t)}{S(t) + U(t)}, \tag{7}
$$
where the first equation comes from the fact that the coefficients $a_i$ are fractions which must sum to 1 and the second equation ensures consistency in the number of undetected people in lockdown. In fact the total number of undetected infected in the lockdown population is given by $H_1 + 2H_2 + 3H_3 = (a_1 + 2a_2 + 3a_3)N$ and this number must equal $L\,U(t)\,T$. By the definition of N and rearranging the terms, the second equation in (7) is obtained.
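The free-to-lockdown jump can be sketched in a few lines. The sketch assumes that the number of households is N = L(S(t) + U(t))T/3 (lockdown population divided by the household size of three), so that requiring H 1 + 2H 2 + 3H 3 = LU(t)T gives a 1 + 2a 2 + 3a 3 = 3U(t)/(S(t) + U(t)); with a 0 and a 1 chosen by the user, the two constraints then determine a 2 and a 3. All function names and numerical values are hypothetical.

```python
def household_fractions(a0, a1, S, U):
    """Solve the two linear constraints for a2 and a3, given chosen a0 and a1.

    Constraints (assumed form):
        a0 + a1 + a2 + a3 = 1
        a1 + 2*a2 + 3*a3  = 3*U / (S + U)
    """
    c = 3.0 * U / (S + U)
    a3 = c + 2.0 * a0 + a1 - 2.0
    a2 = 3.0 - 3.0 * a0 - 2.0 * a1 - c
    fractions = (a0, a1, a2, a3)
    if any(a < -1e-12 or a > 1.0 + 1e-12 for a in fractions):
        raise ValueError("chosen a0, a1 push a2 or a3 outside [0, 1]")
    return fractions


def enter_lockdown(S, U, L, T, a0, a1):
    """Split the susceptible and undetected population at the start of a lockdown.

    S, U: proportions at the switching time; L: lockdown fraction in [0, 1];
    T: total population. Returns the free-population initial conditions and
    the number of households H_0..H_3 with 0..3 undetected infected members.
    """
    n_households = L * (S + U) * T / 3.0  # households of three members
    fractions = household_fractions(a0, a1, S, U)
    households = tuple(a * n_households for a in fractions)
    free = {"S_f": (1.0 - L) * S, "U_f": (1.0 - L) * U}
    return free, households


# Illustrative values: 1% undetected, 65% stringency, a round population figure.
free, households = enter_lockdown(
    S=0.97, U=0.01, L=0.65, T=9_000_000, a0=0.975, a1=0.02
)
infected_in_lockdown = households[1] + 2 * households[2] + 3 * households[3]
```

With these numbers `infected_in_lockdown` matches L·U·T, the undetected infected assigned to the lockdown population. Choosing a 0 too small makes a 3 negative and raises the error, which illustrates why a 0 is confined to large values when few people are infected.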

Table 1. Motivations, findings and implications of the model.

Background
The introduction of a national lockdown is an effective means of containing the spread of COVID-19. The implementation of a lockdown requires a careful design aimed at limiting the number of infections and, consequently, deaths. We propose a model that is able to support this process by providing future predictions on the course of the epidemic for different stringency levels and durations of the lockdown.

Main findings and limitations
Stringency: Even mild lockdowns, i.e. isolating about 50% of the population, result in dramatic improvements compared to not implementing any form of lockdown. However, ensuring that around 80% of the population isolate within their household yields reasonably better performance, whilst more aggressive lockdowns (i.e. more than 80% of the population isolating) produce negligible additional benefits, which are potentially outweighed by the financial and social drawbacks caused by the restrictions.
Duration: Lifting a lockdown too early (e.g. before 20 days since it started) is likely to induce a sharp new increase of infections and deaths or, at best, to cause a halt in the decrease of cases. A lockdown duration of around 50 days produces substantial improvements in terms of limiting the number of infections and deaths. While enforcing a lockdown for longer periods (e.g. more than 50 days since it started) has been found to yield marginal improvements, in the long run the negative social and business implications of prolonged restrictions might outweigh such advantages.
Prediction: By leveraging data collected in a previous lockdown, it is possible to accurately predict the peak of active cases and deaths when a new lockdown is enforced. As customary in modelling studies, our model-based predictions are based on reasonable assumptions, but the real evolution of the epidemic depends on how prompt and effective the actual public health and safety policies are.

Policy implications
Our model can support policymakers in planning ahead of a lockdown by providing them with a tool to predict the course of the epidemic as a result of different lockdown stringency levels and/or durations. This is paramount to achieve a trade-off between the negative impact on business and society and the crucial goal of limiting infections and deaths.

At the end of the lockdown, the lockdown population mixes again with the free population, thus starting a new free phase, the dynamics of which is described only by the SUDER sub-model. If the switching between lockdown phase and free phase happens at time t, the initial conditions of this sub-model are easily obtained from the final values of the lockdown-phase variables (Eq. (8)). Besides the lockdown percentage, another feature that the model lets the user modify is the lockdown duration, i.e. the number of days between the day the lockdown is enforced and the day it is lifted. In terms of the mathematical model, the lockdown duration is the amount of time that the FL-Hybrid model spends in the lockdown phase.
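The role of the lockdown duration as a design parameter can be illustrated with a small scheduling sketch: each lockdown is a pair (start day, duration), and the hybrid model is in the lockdown phase exactly on the days covered by one of these intervals. The function names are hypothetical, and treating the lifting day as the first day of the new free phase is an assumed convention; the day numbers are those quoted for the first Israeli lockdown (days 183 to 213).

```python
def phase_on_day(day, lockdowns):
    """Return the active phase on a given day.

    lockdowns: list of (start_day, duration_in_days) pairs. The lockdown is
    assumed to cover start_day up to, but not including, start_day + duration.
    """
    for start_day, duration in lockdowns:
        if start_day <= day < start_day + duration:
            return "lockdown"
    return "free"


def switching_days(lockdowns):
    """List the days on which the hybrid model jumps between phases."""
    events = []
    for start_day, duration in lockdowns:
        events.append((start_day, "free -> lockdown"))
        events.append((start_day + duration, "lockdown -> free"))
    return sorted(events)


actual_schedule = [(183, 30)]   # thirty-day lockdown, days 183 to 213
longer_schedule = [(183, 50)]   # alternative scenario: same start, 50 days

print(switching_days(actual_schedule))
print(phase_on_day(220, actual_schedule), phase_on_day(220, longer_schedule))
```

Comparing duration scenarios then amounts to running the same model with different schedules while keeping the stringency L fixed.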
The parameters of the model have been identified using a nonlinear grey box model in MATLAB. The best fitting parameters have been calculated by solving a weighted nonlinear least-squares problem. This least-squares problem was solved using the MATLAB Optimization Toolbox based on the conditions and constraints we discussed above (e.g. an initial guess and a tuning range for each parameter, a matrix of weights and an estimation window length). Parameters are usually updated every 20-40 days. The initial guesses of parameters for each subsequent estimation were set to be equal to the values of the parameters before the update. See the section "Code availability" for details.
Discussion on modelling assumptions. Before we discuss the modelling assumptions, it is useful to remark that both the SUDER sub-model and the HP sub-model are mean-field models. This implies that the models themselves do not capture the infection transmission and evolution in a case-by-case fashion, but rather describe the averaged dynamics of the epidemic. In this sense, the parameters of the sub-models are to be meant as average rates over the fraction of population (i.e. free or lockdown population) described by the sub-models. Carefully choosing the parameters enables making high-precision predictions of the future trends.
In the SUDER and HP sub-models we make the following simplifying assumptions.
• The birth rate and mortality rate are assumed to be negligible and thus the total population T is considered to be constant in the model. This is a standard assumption for compartmental models.
• Detected people are properly isolated and do not transmit the infection to susceptible people, i.e. the number of contacts which detected people have is zero. Therefore, the disease transmission rate from detected infected to susceptible is assumed to be zero. In reality, perfect quarantining does not happen, therefore this rate, although very low, is not zero and depends on a country's specific policy on detecting and isolating infected individuals. The sub-models might be easily extended to have a non-zero transmission rate from detected infected to susceptible.
• No undetected people will die from the disease, as the development of life-threatening symptoms would lead to a diagnosis before death occurs. This is not always accurate, especially at the beginning of the epidemic when low detection rates and the inability of health services to cope with the high number of cases might lead to official sources under-reporting the number of casualties [34]. The sub-models might be extended by adding a compartment for deaths from the undetected stage.
• A recovered person will not become susceptible to re-infection. A prior SARS-CoV-2 infection has been found to be associated with an 83% lower risk of infection, thus justifying this simplifying assumption while still making highly accurate predictions over a period equal to the duration of the immunological memory [35,36]. Re-infections might be introduced in the sub-models by allowing a flow from the R compartments to the S compartment at a given immunity loss rate.
In the HP sub-model we make a number of additional assumptions aimed at considerably simplifying the intricate household dynamics, yet without compromising the sub-model prediction accuracy.
• The household size is fixed and consists of three members. While in most countries the average household size is generally a number between two and four, (3.1 for Israel and 2.1 for Germany in 2019 37 ), our HP submodel requires an integer size. We selected 3 for both countries in order to simplify the theoretical development, but the equations can be easily adapted to have an integer household size as close as possible to a (8) www.nature.com/scientificreports/ country's average household size. Note that, although the number of 3 is quite accurate for Israel, yet it also produces accurate results for Germany. The main limitation of using the average household size is that this simplification relies on a normal distribution which ensures that the approximation is valid. In the cases in which this assumption does not hold, several possible changes can be implemented to maintain the validity of the model. For instance, outliers can be excluded from the dataset (e.g., the percentage of household with 6 or more members is only 0.76% for Germany 37 ). Another approach is to blend multiple HP sub-models which have different household sizes according to the real distribution. Considering households of different sizes is allowed by our model, but it would cause an increase in the number of HP sub-model equations, with minimal benefits in terms of prediction accuracy. Another possible solution is still to use a fixed household size but evaluate the predictions of the model within a 95% confidence interval or a Bayesian credible interval. • The coefficients a i are selected arbitrarily, yet reasonably, in order to satisfy the constraints in (7) and the constraint a i ∈ [0, 1] . Note that since the constraints in (7) are a system of two equations in four unknowns, once two of the coefficients, for example a 0 and a 1 , are arbitrarily chosen, the remaining ones, i.e. a 2 and a 3 , are known functions of the first two. 
Also note that the bounds on a 2 ∈ [0, 1] and a 3 ∈ [0, 1] provide constraints on the values that a 0 and a 1 can assume. The combination of these constraints limits, in fact, the arbitrariness of the choice. For instance, the coefficient a 0 is constrained to a large value in the range [0, 1]. This is consistent with the fact that, as the infected individuals are a small percentage of the population, the vast majority of the households at the beginning of the lockdown will have no undetected members. Additionally, note that the HP sub-model has very low sensitivity over these coefficients. Changing these coefficients does not produce inaccurate predictions as long as the sub-model parameters-inferred from the official data-are suitably updated. • Within the same household, all undetected infected members are diagnosed at once. This assumption is motivated by the fact that if one member of the household tests positive for the infection, it is very likely that the rest of the household will get tested and diagnosed if infected. This process is based on the assumption that sufficient medical resources are available. Thus, it may not be an accurate assumption at the early stage of a vast pandemic when the health system may be overloaded. • If infected household members are diagnosed, to avoid spread of the infection the household as a whole will self isolate, as well as its members individually. This prevents new household members from being infected from the outside, and vice versa, and also stops the infection from spreading within the same household. • All the cases within the same household will evolve in the same way, i.e. they all either recover or die. Obviously this is not always the case in real circumstances, but, as previously mentioned, this mean-field assumption considerably simplifies the model-making it a simple yet powerful instrument-while still producing good averaged descriptions of the epidemic evolution. 
• The households of individuals belonging to the free population are assumed to be part of the free population as well. These include, for instance, the households of key workers. The rationale is that, although the families of key workers keep isolating at home, they are still virtually in contact with the rest of the free population through the key workers. This explains why official data is compatible with relatively high percentages of free population in the lockdown phase of our model.
• When a susceptible person in an isolating household is infected by an individual outside of the household, it is assumed that this happens only by contact with an infected individual in the free population and not with members of other households. Introducing inter-household infections is feasible, but the rate would be very low, because the household unit is defined as a group of individuals with a high degree of isolation from the free population. Early tests of the model showed that such a low inter-household rate does not produce significant differences in the predictions. Consequently, such a change would only over-complicate the HP sub-model while yielding negligible benefits for its dynamic properties.
To conclude this section, we discuss the meaning and practical implications of the lockdown percentage L in the lockdown phase. This number, which in our model is defined as the fraction of susceptible and undetected individuals isolating in their own household, should not be considered as an absolute measure of the fraction of population isolating in reality. Instead, once the baseline lockdown percentage is identified by fitting the model with the official data, the model allows the user to assess the outcome of more or less stringent lockdowns by increasing or reducing L relative to the baseline.
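As a rough illustration of how L is meant to be used, the sketch below builds a few alternative scenarios around a fitted baseline and splits the susceptible and undetected fractions into a free part and an isolating part, following the initial conditions S_f = (1 − L)S and U_f = (1 − L)U stated in the next section. The function names, the dictionary keys for the isolating part and all numerical values are hypothetical placeholders, not quantities fitted in this paper.

```python
def scenario_levels(baseline_L, relative_changes):
    """Return lockdown percentages obtained by scaling a fitted baseline.

    Each relative change r maps to baseline_L * (1 + r), clipped to [0, 1]
    because L is a fraction of the population.
    """
    return [min(1.0, max(0.0, baseline_L * (1.0 + r))) for r in relative_changes]


def split_population(S_bar, U_bar, L):
    """Split the susceptible and undetected fractions at the lockdown start.

    A fraction (1 - L) stays in the free population and a fraction L
    isolates in households.
    """
    free = {"S_f": (1.0 - L) * S_bar, "U_f": (1.0 - L) * U_bar}
    isolating = {"S_h": L * S_bar, "U_h": L * U_bar}
    return free, isolating


# Hypothetical baseline and state at the start of the lockdown.
baseline_L = 0.6
for L in scenario_levels(baseline_L, [-0.2, 0.0, 0.2]):
    free, isolating = split_population(S_bar=0.95, U_bar=0.01, L=L)
    print(f"L={L:.2f}  S_f={free['S_f']:.4f}  U_f={free['U_f']:.4f}")
```

Expressing scenarios as relative changes keeps the focus on "more or less stringent than the baseline", which is how the text recommends interpreting L.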

Effective reproduction number.
In both the free phase and the lockdown phase, the epidemic grows if the time derivative of the undetected population is greater than zero. For the free phase, this corresponds to

$$\beta S U - (\rho + \delta) U > 0, \tag{9}$$

which in turn is equivalent to

$$R^f_t := \frac{\beta S}{\rho + \delta} > 1. \tag{10}$$

The term $R^f_t$ is the effective reproduction number of the epidemic in the free phase provided by the model, while the basic reproduction number is given in (11). In the lockdown phase the epidemic grows when $\dot U_f + \dot U_1 + \dot U_2 + \dot U_3 > 0$. Using the expressions of these derivatives given in Eqs. (2) and (4) and rearranging, we obtain the condition (12), where $R^\ell_t$ is the effective reproduction number in the lockdown phase.

We now want to show that, at the start of the lockdown, the effective reproduction number can be expressed as a function of the lockdown percentage $L$. Let $\bar t$ be the time at which the lockdown is enforced and let $\bar S = S(\bar t)$ and $\bar U = U(\bar t)$ be the portions of susceptible and undetected population at the beginning of the lockdown. Then, using the initial conditions $S_f(\bar t) = (1 - L)\bar S$ and $U_f(\bar t) = (1 - L)\bar U$, defining $a = a_1 + 2a_2 + 3a_3$ and using (5), (6), and (7), the effective reproduction number resulting from the introduction of the lockdown can be expressed as a function of $L$ as in (13). By defining

$$\eta = \frac{\beta_h (a_1 + 2a_2)}{a}, \qquad \xi = \frac{\beta_{fh}\left(a_0 + \tfrac{2}{3} a_1 + \tfrac{1}{3} a_2\right)}{a}, \qquad \nu = \frac{\sum_{i=1}^{3} \left(\rho_{h_i} + i\,\delta_h\right) a_i}{a},$$

the previous expression reduces to (14). Note that if $L = 0$, i.e. no lockdown is imposed, the expression of the effective reproduction number in (14) becomes the same as in (10), i.e. $R^\ell_t(0) = R^f_t$, and the effective reproduction number of the free phase is retrieved.
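The quantities defined above are straightforward to evaluate numerically. The following minimal sketch computes the free-phase effective reproduction number βS/(ρ + δ), obtained by dividing the growth condition (9) by (ρ + δ)U, together with the auxiliary quantities a, η, ξ and ν. The grouping of terms in ξ and ν, i.e. β_fh(a_0 + 2a_1/3 + a_2/3)/a and Σ(ρ_h,i + i δ_h)a_i/a, is our reading of the definitions and should be treated as an assumption. All function names and parameter values are hypothetical placeholders rather than fitted values from this paper, and the sketch deliberately stops short of evaluating the lockdown-phase expression itself.

```python
def free_phase_reproduction_number(beta, rho, delta, S):
    """Effective reproduction number in the free phase: beta*S / (rho + delta)."""
    return beta * S / (rho + delta)


def undetected_growth_rate(beta, rho, delta, S, U):
    """Left-hand side of the growth condition: beta*S*U - (rho + delta)*U."""
    return beta * S * U - (rho + delta) * U


def lockdown_auxiliaries(a, beta_h, beta_fh, rho_h, delta_h):
    """Return (a_bar, eta, xi, nu) from the coefficients a = (a0, a1, a2, a3).

    rho_h maps i in {1, 2, 3} to the household-level rate rho for i
    undetected members (assumed reading of the definition of nu).
    """
    a0, a1, a2, a3 = a
    a_bar = a1 + 2 * a2 + 3 * a3
    eta = beta_h * (a1 + 2 * a2) / a_bar
    xi = beta_fh * (a0 + 2 / 3 * a1 + 1 / 3 * a2) / a_bar
    nu = sum((rho_h[i] + i * delta_h) * a[i] for i in (1, 2, 3)) / a_bar
    return a_bar, eta, xi, nu


# Hypothetical parameter values, for illustration only.
beta, rho, delta = 0.3, 0.05, 0.05
S, U = 0.8, 0.01
R_f = free_phase_reproduction_number(beta, rho, delta, S)
growing = undetected_growth_rate(beta, rho, delta, S, U) > 0
print(f"R_f = {R_f:.3f}, epidemic growing: {growing}")

a_bar, eta, xi, nu = lockdown_auxiliaries(
    a=(0.5, 0.2, 0.2, 0.1),
    beta_h=0.3,
    beta_fh=0.09,
    rho_h={1: 0.1, 2: 0.1, 3: 0.1},
    delta_h=0.1,
)
print(f"a = {a_bar:.3f}, eta = {eta:.3f}, xi = {xi:.3f}, nu = {nu:.3f}")
```

The growth check and the reproduction number are two views of the same condition: the time derivative of U is positive exactly when the free-phase reproduction number exceeds one.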

Data availability
Official epidemiological data was gathered from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. The repository is available at https://github.com/CSSEGISandData/COVID-19.