Simulation of the impact of people mobility, vaccination rate, and virus variants on the evolution of Covid-19 outbreak in Italy

We have further extended our compartmental model describing the spread of the infection in Italy. As in our previous work, the model assumes that the time evolution of the observable quantities (number of people still positive to the infection, hospitalized and fatalities cases, healed people, and total number of people that has contracted the infection) depends on average parameters, namely people diffusion coefficient, infection cross-section, and population density. The model provides information on the tight relationship between the variation of the reported infection cases and a well-defined observable physical quantity: the average number of people that lie within the daily displacement area of any single person. With respect to our previous paper, we have extended the analyses to several regions in Italy, characterized by different levels of restrictions and we have correlated them to the diffusion coefficient. Furthermore, the model now includes self-consistent evaluation of the reproduction index, effect of immunization due to vaccination, and potential impact of virus variants on the dynamical evolution of the outbreak. The model fits the epidemic data in Italy, and allows us to strictly relate the time evolution of the number of hospitalized cases and fatalities to the change of people mobility, vaccination rate, and appearance of an initial concentration of people positives for new variants of the virus.

1. To provide a model able to predict the evolution of the Covid-19 outbreak, especially in Italy, indicating a good forecast as a function of the level of the restrictions; 2. To suggest the best restriction strategies, as a trade-off between restraining the outbreak and facilitating social activities and economics; 3. To include and predict the influence of the vaccination on the previous points. 4. To simulate the impact of the variants in terms of disease spreading increase and of hospitalized persons and the relative incidence in the positive population; 5. To detect the onset of a new variant.

Methods
The model proposed to describe time evolution of the total number of infected people, positive cases, healed people, deaths, and hospitalized people, during the Covid-19 outbreak is based on a mean-field approximation. This consists in the assumption that the probability for an individual to contract the infection is proportional to the concentration p of positive circulating cases, to a diffusion coefficient D , equal to the surface area covered on average by each person in a day, and to an infection cross-section σ related to the probability of a single infection event ( σ = πR 2 , R being the average distance within which a healthy person can be infected by a positive one). This cross-section is a quantity specifically dependent on the virus infectiousness. We assume it is constant everywhere all over the examined geographic area and can change only in the presence of virus variants. In particular, the dimensionless probability η of a single infection event is related to the infection cross-section by the relationship η = ρ 0 σ , where ρ 0 is the density of inhabitants. Under these hypotheses, at any instant t , the increase dp of people positive for viral infection in the time interval dt can be expressed as: g is the concentration of healed people (infected people who are tested negative for the virus after a certain time interval from the infection), m is the concentration of fatalities, and c is the total concentration of those who have contracted the virus at the time t.
A fraction f of the new positive cases requires hospital care and, consequently, the concentration r of hospitalized people will change, in the time interval dt , by a quantity dr given by: where we further consider that r diminishes, in the same time interval dt , because a fraction q of hospitalized people dies in a characteristic time τ 1 , whilst the complementary fraction 1 − q heals in a characteristic time τ 2 . As a consequence, the concentration m of fatalities will vary with time according to the following equation: While the fraction f of positive people is hospitalized, the fraction 1 − f does not exhibit serious symptoms until complete healing. The relative concentration s , of people not exhibiting serious symptoms (i.e. not requiring hospital care), will vary with time according to the following relationship: τ 3 being the characteristic time toward healing for these individuals. As reported in Ref. 34 , this characteristic time is typically larger than τ 2 , i.e. the one used for describing time dependent healing of the most severe hospitalized cases (Eq. 2). As a consequence, the time dependence of the concentration g of healed people changes with time according to:  Our description is based on the assumption that the dynamics of all the observable variables, p , r , m,g , c , can be described in terms of the time dependence of the diffusion coefficient D(t) , while keeping constant the infection cross-section to the value of = 3.14 m 2 (corresponding to R = 1 m).

Results and discussion
Modelling the epidemic evolution in Italy before 2020 holiday season. The values of the diffusion coefficient D, from February the 24th, 2020 until December the 20th, 2020 (i.e. a few days before holiday season in Italy), are plotted in Fig. 2b (open lozenges). These values were extracted from the data of the hospitalized cases [open circles in Fig. 2a] by adopting the analytical procedure described in detail in Ref. 14 and briefly summarized in the following.
The function D = D(t) is determined by minimizing, point by point along the whole integration interval, the difference between the calculated concentration r (Eqs. [1][2][3][4][5][6] and the corresponding experimental value (we imposed that such a difference keeps lower than 1%). The result of this procedure is plotted in Fig. 2b, using a 5-day moving average filter.
We extracted the functional form of D and all the other relevant model parameters from a fit to the data of hospitalized cases since they are more reliable than the ones concerning the number of people tested positive www.nature.com/scientificreports/ for the virus. Indeed, the latters represent only a small fraction of the real corresponding concentration values, since they refer to the cases actually detected through the adopted testing procedure (swabs), restricted to a defined relatively small sample of the entire population. In Fig. 2b, the strong reduction of the diffusion coefficient to its minimum value D L = 2.7 × 10 5 m 2 day −1 is a consequence of the general lockdown in spring 2020, followed by a moderate increase during summer, when the mobility restriction rules were loosened. At the end of September 2020, people mobility increased at a higher rate, due to the resumption of school and work activity in more conventional modalities (compared to those based on work and school from home, experienced during the general spring lockdown). The fast increase of the diffusion coefficient in the first half of October 2020 has triggered the start of the second wave of the Covid-19 epidemic in Italy, accompanied by an exponential growth of the number of hospitalized people (Fig. 2a) in the following month. On October the 25th, 2020 [arrows in Fig. 2b] the Italian Government enacted further mobility restriction rules that induced a new decrease of the diffusion coefficient. These measures were significantly different Region to Region. A few of them, the so-called red Regions, experienced mobility restrictions very similar to the ones adopted during spring lockdown. For other Regions the measures were slightly less restrictive (orange Regions) up to a situation characterized by the persistence of a relatively high level of mobility with a limited number of restrictions (yellow Regions). The inhomogeneous intensity of the new mobility restriction rules reflects on the circumstance that the diffusion coefficient approached, at the beginning of December 2020, a constant value that was about 1.8 times higher than the one reached during the general, homogeneous spring lockdown, D L . Continuous line in Fig. 2b is a fit of the D values with a set of logistic functions of the following kind 14 : where t 0 is the time around which the diffusion coefficient changes from D 1 to D 2 and τ c is the characteristic duration of such variation. In order to follow the time-dependence of D , the parameters D 1 , D 2 , t 0 ,τ c where adjusted to their best fit values within four different time intervals: (i) from February the 24th to June the 4th, 2020 (start of the Covid-19 epidemic monitoring in Italy); (ii) from June the 4th to September the 22nd, 2020; (iii) from September the 22nd to October the 25th, 2020; (iv) for times beyond October the 25th, 2020. In particular, the last change of mobility (beyond October the 25th, 2020) was modeled by setting D 1 = 1.4 × 10 6 m 2 day −1 , D 2 = 4.8 × 10 5 m 2 day −1 (i.e., D 1 = 5.3D L and D 2 = 1.8D L ), t 0 = 260 days from the start of the Covid-19 epidemic monitoring in Italy, and τ c = 7.6 days.
We used the functional form describing the variation of D as a function of time for calculating the expected values r (hospitalized people) and m (total number of fatalities), through Eqs. (1)(2)(3)(4)(5)(6), and by adjusting the other model parameters by a fit to the data of Fig. 2a, with the exception of the characteristic times τ 2 (healing of hospitalized people) and τ 3 (healing of infected, but not hospitalized people) that were set to the values found in the literature ( τ 2 = 20 days, τ 3 = 14 days) 34 .
For all the calculations we also imposed the following initial conditions for Italy: r 0 = 127 /A (i.e. the experimental point at t = 0 ), Here A indicates the surface of the geographical area. From the best-fit we obtained f , q , τ 1 .
The results of such a procedure are the continuous lines plotted in Fig. 2a. The agreement of the theoretical curves with the experimental data is excellent. The best-fit values found for f (fraction of the new infected persons that require hospitalization), τ 1 (characteristic time for death), q (fraction of hospitalized people that die in the characteristic time τ 1 ) were: f = 0.35%, τ 1 = 7.2 days, q = 14% in the time interval February 24th ≤ t ≤ May the 13rd, q = 10% in the time interval May 13rd < t ≤ November the 14th, q = 15% for t > November the 14th. In the same plots, dashed lines simulated what would have occurred if the diffusion coefficient, after October the 20th, had approached the same value experienced in the occasion of the spring general lockdown.
The same analytical procedure was applied to model the evolution of epidemic data in three Italian Regions that, on November the 5th, 2020, were subjected to different mobility restrictions: Lombardia ("red zone": severe mobility restrictions), Sicilia ("orange zone": medium level of mobility restrictions), Lazio ("yellow zone": soft mobility restrictions). The results are shown in Fig We notice that, in the range October the 20th-December the 20th, the diffusion coefficient decreases to a plateau level that is different for the three examined Regions. Compared to the corresponding spring lockdown values D L , the ratio D/D L approaches 1.5 for Lombardia, 1.8 for Sicily, and 2.2 for Lazio, respectively. Thus, the model allows us to relate the temporal evolution of the epidemic data to the change of people mobility (diffusion coefficient) caused by different levels of restriction severity.
The model fitting parameters are listed in Table 1. Also in this case (as for Italy as whole) we need to use different values of the parameter q for the three time intervals (the first range is centered on the first wave of the outbreak, the second one corresponds to the summer characterized by a relatively small number of cases, and the third interval is around the peak of the second wave).
Effect of mobility increase occurred during 2020 holiday season. After December the 20th, 2020, the Italian Government decided to relax the mobility restriction measures in the occasion of holiday season. The corresponding increase of people mobility reflected in a significant slowdown of the decreasing rate of the hospitalized cases as shown in Fig. 4a. The proposed model interprets this effect in terms of variation of the diffusion coefficient, with respect to the spring lockdown value D L . In particular, Fig. 4c indicates that the diffusion coefficient (and the related quantity ρ 0 D , i.e. "average number of people encountered by each person in a day") increased, during holiday season, from 1.8 to 2.6 times D L . The reintroduction of more severe mobility restrictions on January the 7th caused a new decrease of D. However, these new actions (same in all the National territory) were significantly softer than the ones adopted before December the 20th. This circumstance reflects on the observation that the ratio D/D L  www.nature.com/scientificreports/ approached, after the holiday season peak, to a value, for Italy as a whole, of about 2.1, higher than the one observed in the pre-peak time interval (D/D L = 1.8). The corresponding fits of the model to the hospitalized and fatalities cases are plotted as continuous lines in Fig. 4a and b. The simulation of the scenario for a decrease of the diffusion coefficient, after October the 20th, 2020, to D L is also plotted in Fig. 4 (dashed lines), whilst dot-dashed lines simulate the scenario corresponding to a diffusion coefficient constant to the value experienced before holiday season. www.nature.com/scientificreports/ The variation of D/D L around the peak of holiday season has not been the same among the different Italian Regions. This is shown in Fig. 5 for Lombardia, Sicily, and Lazio. We notice that only Lazio has returned to a situation with a diffusion coefficient of the pre-holiday season value. Conversely, the ratio D/D L in Lombardia approached the value of 2.3, significantly larger than the pre-holiday season value (D/D L = 1.5). Sicily has experienced the most critical situation, with D/D L that has reached, during holiday season, a peak value as large as 3.6, thus triggering the start of a third wave of the outbreak, well visible in the plot of hospitalized cases shown in Fig. 5b. Indeed, beyond the holiday season peak, the ratio D/D L for Sicily has exhibited a relatively slow decreasing rate and, on January the 15th [see the arrow in Fig. 5h], it was still at a level of about 3.3. In particular, dotted lines in Fig. 5b, e, h show the results of our simulation under the hypothesis that D/D L for Sicily had approached, after the holiday season peak, a constant value equal to 3. With this assumption a new wave of the virus epidemic in the Region would occur. On January the 15th, however, the Italian Government imposed for Sicily more severe mobility restrictions ("red zone"), inducing a further decrease of D/D L and, as a consequence, the number of hospitalized people decreased as well (Fig. 5b).
In this first part of the discussion, we have presented the capability of the model to evaluate D and to compare its value to the different restriction levels in Italy: before, during and immediately after lockdown, during and after summer, during and after holidays. In brief, we have related all the restriction levels to corresponding D-values in a quite direct relation. This gives us, as shown in Figs. 2, 3, 4, 5, the possibility of forecast the dynamics of the disease holding the restrictions or modifying them. Some SIR-based models simulated the lockdown dynamics by   23 . Note that, in these cases, the outbreak spreading is limited to the first wave of the disease. There are some other few attempts in literature to describe the time-dependent parameters, in particular the transmission rate related to σ D , in a time-varying SIR-based model. Cartocci et al. 28 provided a conceptually similar approach. They adopted a simpler model in terms of positive compartments (considering only positives, not distinguished between hospitalized and non-severe positives) and they obtained quite similar behaviour for the parameter related to restriction, especially in the post lockdown time interval. For the period before and during lockdown, among others, our estimation shows quite reliable results. In fact, we have used the concentration of hospitalized and of fatalities to obtain D and q. These quantities are more reliable than the concentration of positives, since this last measure depends on many factors (number of swabs, tracking or statistical approach, congestion of the health-care system,…) that can change in each region and in different time periods. During the first outbreak, the few nasal swabs available were mainly used to monitor (elder) severe positives and symptomatics, neglecting mostly asymptomatics.
Modelling the impact of vaccine immunization. On December the 27th, 2020, Italy started its vaccination campaign. The investigation of the clinic effectiveness of the various vaccines is beyond the aim of this work. Our model, however, can include the influence of vaccination on the time evolution of the virus epidemic by assuming that immunization occurs, in general, about one week later from the inoculation of the second vaccine dose. Then, the concentration ρ i of people immunized by vaccination will increase with time t according to the following relationship: where t 0 is the immunization time onset, corresponding to the day first person has received the second vaccine dose (January the 16th, 2021, in our case), τ = 7 days is the time interval for getting immunization from the second dose inoculation, and v(t − τ ) is the concentration of people per day that was vaccinated (second dose) on the time corresponding to t − τ.
Under these hypotheses, the influence of vaccine on the time evolution of all the observable variables is taken into account, simply by substituting the term (ρ 0 − c) with (ρ 0 − ρ i − c) in Eqs. (1)(2)(3)(4)(5)(6). For the daily number of people, v , receiving the second vaccine dose, we used the data communicated by the Italian Health Ministry (Ministero della Salute) 32 . In order to simulate the impact of vaccination on the future time evolution of the virus outbreak, the function v was kept constant to the vaccination daily rate experienced the week prior to the date of the last available data, (January the 31st, 2021).
Vaccination is the best weapons we can use to strike virus outbreak and come back to highest levels of mobility in a relatively short time range. In order to investigate how vaccination can help us to increase people mobility, we have simulated several scenarios, shown in Fig. S1, consisting in a progressive increase of the diffusion coefficient to a level as high as the one reached at the end of summer 2020 (D/D L = 3.25).
These scenarios differ for the used time delays to increase the diffusion coefficient to D/D L = 3.25. Starting from January the 31st, 2021 such delay is set to (see The simulated number of hospitalized people and of fatalities are plotted in Fig. 6, in the absence (Fig. 6a, b) or in the presence of vaccination (Fig. 6c, d). Maintaining constant the diffusion coefficient to the present low level (D/D L = 2.1), the vaccine immunization produces only small effects on the decreasing rate of hospitalized people and fatalities, since these values are already decreased by the low mobility. Vaccination beneficial effect is evident when the mobility is higher. It strongly mitigates the amplitude of the third wave peak of the outbreak triggered by the increase of people mobility to the levels measured at the end of summer 2020 (D/D L = 3.25).
Reproduction number calculation. The model provides also a self-consistent method to evaluate the reproduction number R T , i.e. the number of primary infections produced by a single infected person during the time interval she/he remains still positive (we actually assume that a positive human transmits the infection with a constant probability, independently of the symptoms severity). We consider the concentration n of people infected by a positive concentration "probe" p(θ), initially equal to p t . The new positives for the virus at the time instant t , normalized to p t itself, increases, at a time instant θ ≥ t , by a quantity dn given by: The function c(θ) is determined, for assigned forms of ρ i (θ) and D(θ) , by solving Eqs. (1-6), whilst P(θ) = p(θ)/p t decays with time according to the rate equations that describe the process of healing or death of people positives for the virus, i.e.: The analytical solution of this system of differential equations is: The function P(θ) , calculated by setting f , q , τ 1 , τ 2 , τ 3 to the values used to fit the model to the data of hospitalized people and fatalities for Italy as a whole, is plotted in Fig. S2. P(θ ) is just but the probability that a single infected individual, contracting the viral infection at a given instant θ = 0 , is still positive and able, in turn, to infect susceptible people at a subsequent time θ > 0.
R T is defined as the number of people infected by a single individual, positive for the virus at a certain instant t , throughout her/his full lifetime (until healing or death), and then: Since R T is the result of a time integration, the instant, T, at which the determination of R T should be referred to is equal to the time average weighted on dn , i.e. the number of people that a single positive infects per unit of time throughout her/his lifetime:  17), is plotted in Fig. 7 for the different scenarios of people mobility variations described in Fig. S1. By comparison of Fig. 7 to Figs. S1 and 6, we notice that the increase of the diffusion coefficient to the level experienced at the end of summer 2020 in Italy, causes a reproduction number above 1, responsible then for the triggering of a new wave of the virus outbreak.
We finally observe that the integral term in Eq. (16) can be thought as the average number Ŵ of people, not yet infected ("susceptible people"), that a single positive individual meets throughout his lifetime, since the onset of his infection. Thus, we can conclude that R T is proportional to Ŵ , the proportionality constant being ρ 0 σ , i.e. the probability of a single infection event.
Nowadays, the Covid-19 outbreak in the Italian Regions we have analyzed (Lombardia, Sicily, and Lazio) is actually characterized by similar values of R T , all of them being below 1: 0.87 for Lombardia, 0.82 for Sicily, 0.79 for Lazio, 0.83 for Italy as a whole. However, people mobility levels corresponding to these similar R T values differ Region by Region. This situation is clearly illustrated in Fig. 8 Effect of virus variants. Our theoretical description can effectively provide simulation of the influence of virus variants on the time evolution of hospitalized cases and fatalities. A virus variant is expected to be more infective and/or or more severe in terms of the fraction of infected people requiring hospitalization. In the former case we should simply increase the value of the infection cross-section σ , in the latter case it is f that has to be changed. In particular, the effect of virus variants can be simulated by assuming that at, a certain date, the new form of the virus, characterized by different values of σ and/or f, is present in a small fraction of the active positive population. After that time, the concentration p ′ of people positive for the virus variant will vary with time according to the same set of equations, Eqs. (1)(2)(3)(4)(5)(6), that are simultaneously used to follow the time variation of the concentration p of people positive for the standard version of the virus. Of course, Eq. (6) has to be modi- since the variation f → f ′ is included in Eq. (2) describing the time evolution of fraction of p ′ that requires hospitalization. In all these approximations, we assume that all the other parameters (q, τ 1 , τ 1 , and τ 1 ) remain unchanged and that vaccine immunization is still effective for either standard or variant form of the virus. Figure 9 shows the results assuming the presence, on January the 15th, 2021, of people positives to virus variant at a concentration equal to 1% of the total active circulating positives. We performed these calculations by keeping constant the diffusion coefficient to its present value (D/D L = 2.1) and by considering the case of a virus variant with higher transmissibility ( σ ′ = 2σ , dashed line in Fig. 9) or producing more serious symptoms ( f ′ = 5f , dot-dashed line in Fig. 9). It is evident that the appearance of virus variant having a higher transmissibility is the most dangerous perspective, with respect to the hypothesis that the variant virus characteristics are only limited to the increase of symptoms severity. It should be also emphasized that the increase of cross-section by a factor two is obtained by the increase of the characteristic infection distance, R, ( σ = πR 2 ), by just 40%.
Since the model is very sensitive to variations of σ D , by this quantity it is possible to detect the onset of a new variant with a higher transmissibility. Indeed, if σ D increases in a time interval in which D is kept constant (e.g. when there are no new restrictions or openings) this variation can only be attributed to a higher σ . Since σ D is measured daily, the detection of a new variant is quite immediate. The other methods based on R T measurements rely instead on mathematical integrations on several days (Table 2).
Finally, since the model calculates self-consistently σ ′ , it also describes the incidence of the new variant, i.e. the ratio p ′ /p . Moreover, if the experimental data about each variant are separately available, e.g. through genomic sequencing, it is possible to perform a very precise fine-tuning of the value of σ ′ , of the order of 1‰.

Conclusions
In conclusion, we have shown that the spread of COVID-19 virus can be successfully described by a compartmental model,based on the assumption that the probability of a single infection event is given by the product between the density of inhabitants and a cross-section measuring the distance within which a person positive for the virus can infect a healthy one. Through the model, it is possible to relate the variation of observed hospitalized cases and fatalities to the modification of the mobility restriction measures, by comparing the present behavior to that already experienced during the first wave of the outbreak. The model includes the effect of vaccine immunization and the role of possible virus variants in propagating the infection. The possibility to simulate the time evolution of the observed cases as a function of a diffusion coefficient function is a powerful tool to investigate