Inter-provincial disparity of COVID-19 transmission and control in Nepal

Despite the global efforts to mitigate the ongoing COVID-19 pandemic, the disease transmission and the effective controls still remain uncertain as the outcome of the epidemic varies from place to place. In this regard, the province-wise data from Nepal provides a unique opportunity to study the effective control strategies. This is because (a) some provinces of Nepal share an open-border with India, resulting in a significantly high inflow of COVID-19 cases from India; (b) despite the inflow of a considerable number of cases, the local spread was quite controlled until mid-June of 2020, presumably due to control policies implemented; and (c) the relaxation of policies caused a rapid surge of the COVID-19 cases, providing a multi-phasic trend of disease dynamics. In this study, we used this unique data set to explore the inter-provincial disparities of the important indicators, such as epidemic trend, epidemic growth rate, and reproduction numbers. Furthermore, we extended our analysis to identify prevention and control policies that are effective in altering these indicators. Our analysis identified a noticeable inter-province variation in the epidemic trend (3 per day to 104 per day linear increase during third surge period), the median daily growth rate (1 to 4% per day exponential growth), the basic reproduction number (0.71 to 1.21), and the effective reproduction number (maximum values ranging from 1.20 to 2.86). Importantly, results from our modeling show that the type and number of control strategies that are effective in altering the indicators vary among provinces, underscoring the need for province-focused strategies along with the national-level strategy in order to ensure the control of a local spread.


Results
Epidemic trend. We present the general trend of the COVID-19 outbreak in Nepal and in its provinces using the compiled data (Fig. 1). In particular, we focused on the timing of surges in COVID-19 cases and the early linear growth rates of those surges. Despite the first COVID-19 case confirmed on January 23, 2020 and the second case detected on March 24, 2020 the cases remain negligible until May 20, 2020 due to timely implementation of interventions by the government of Nepal (March 24, 2020: closure of international flights, border closure, and the start of lockdown) 30,33,44 . The data shows that the country has faced a multi-phasic epidemic pattern with three surge periods of distinct magnitude: low (May 20 to June 25, 2020), medium (July 22 to September 20, 2020), and high (post-September 16, 2020). The timing for the beginning of these surges approximately corresponds to the border opening ( ∼ May 20, 2020), lockdown ending ( ∼ July 21, 2020), and countrywide travel opening ( ∼ September 20, 2020), respectively. Modeling the beginning of these surges by the linear growth function, we found that the average number of new infections grew linearly with a slope 15 per day, 24 per day, and 154 per day during the low, medium, and high surge periods, respectively (Table 1). A dramatic increase in the cases during the high surge period implies a significant impact of policies that were implemented prior to September 16, 2020 when all the policies were lifted.
We note that the first low-level surge can be attributable to the inflow of COVID-19 cases from India. In mid-May of 2020, the government opened the southern border partially to let seasonal labor migrants return Table 1. The slope of the linear model and the duration of the linear trend for three surge periods. www.nature.com/scientificreports/ home. They were screened and quarantined, and were isolated if tested positive. The inflow of the positive cases through the Indian border resulted in the first surge leading to as high as 671 cases per day (June 18, 2020). Because of border screening, quarantine, testing, and isolation, the potential local transmission was avoided, and the case number declined to about 100 per day by mid-July of 2020. However, after the Nepal government ended the lockdown policy on July 21, 2020, the second surge began primarily due to the community transmission, leading up to 1,460 new cases per day on September 15, 2020. Moreover, after the government resumed all activities on September 16, 2020, the third surge began, leading to 3,750 new cases per day on October 15, 2020. We also analyzed the province wise data to observe any disparities in the epidemic trend of COVID-19 among seven provinces. While the first surge of new cases began in all provinces after the government opened the Indian border (May 20, 2020), the subsequent surge patterns were quite different among the provinces, in both peak timing and magnitude (Fig. 1). Our analysis using linear model approximation on three surges (Table 1) shows that the different provinces have different time intervals during which the linear growth was observed, and the slope of these linear growths varies widely among provinces. Province 7 had the first linear growth pattern (June 1-July 1, 2020) with the highest slope (8 new cases per day), while Province 5 has the longest interval (May 12-June 27, 2020) with a slope of 3.6. Similarly, Province 2 and 5 have the earliest linear growth pattern starting on May 10 and 12, 2020, and ending on June 18 and 27, 2020, respectively. In general, the provinces with a higher inflow of migrant workers from India (Province 2, 6, 7) have a higher slope of the linear growth rate during first surge period, while the provinces with a lower inflow of migrant workers from India (Province 1 and 3) have higher slope during the later surge period, which occurred after the government started to lift the restraining policies. Note that Province 3, which does not have an Indian border entry point, has the smallest slope during the first surge period (June 01 to July 01, 2020). However, it has the highest slopes of 17 and 104 cases per day during the second (August 1 to September 25, 2020) and the third (September 27 to October 13, 2020) surge periods, respectively. These extremely high cases in Province 3, which contribute to about 2/3 of the total cases in the country during the third surge period, could be because this province includes the capital city of Nepal, making it the highest populated province with the maximum population contact network. Overall, our analysis on the disease transmission trend identifies that both geographical situation (mobility across the border with India), as well as internal demography (population size) and lifestyle (contact network), contribute to the provincial disparities on the epidemic trend of COVID-19 in Nepal.
Daily exponential growth rate. As revealed in the disease epidemic patterns (Fig. 1), while the linear growth can be observed for a short time, the disease pattern was nonlinear, indicating the time-dependency of the rate in which the number of cases changes. To better describe the time-dependent growth pattern, here we compute the daily exponential growth rates following the techniques implemented in Hsiang et al. 34 . Such techniques, often used in economic growth 47,48 , were also proved to be extremely useful in many diseases, such as COVID-19 34 and influenza 49 , to analyze time-varying epidemic growth rate that can be affected by policies and other conditions.
Our computations clearly show the temporal change of the epidemic growth rate (Fig. 2), which is also supported by the significantly large inter-quartile range (Fig. 3). Throughout the study period from April 4 to October 15, 2020, the country-wide daily growth rate was almost always positive except in two short intervals from July 4 to 28, 2020 and from September 7 to 14, 2020 (indicated by horizontal solid red line segments in the first plot of Fig. 2). We observed the sudden increase in the daily growth rate (from negative to positive values) during the end of July and the middle of September, 2020, which may be attributable to the major alteration of the policies, such as the lifting of the lockdown (July 21, 2020) and the opening of all business, including international flights (September 16, 2020), respectively. Furthermore, a drastic rise in the growth rate occurred following the relaxation of the phase-wise lockdown (June 18, 2020). During the study period, the number of active cases was changing with the median exponential growth rate of 4% per day (Quartile: [1%, 9%] per day; min-max: [− 5%, 29%] per day) (Fig. 3).
For each province, we observed the median value of the growth rate to be positive (1-4%), while the magnitude varied dynamically between negative and positive values throughout the study period (Fig. 3). Comparison of the growth rates among provinces reveals that there are some noticeable differences in the growth pattern. In Province 1, 2, 3, and 5, the epidemic growth began before the border opening with screening on May 20, 2020 (blue vertical lines in Fig. 3), but other provinces experienced the epidemic growth only after May 20, 2020. Also, unlike other provinces, Province 3 had a negative growth rate at the beginning of the pandemic (April 23-May 5, 2020), and then the growth rate remained positive almost all the time period considered except during a brief interval (July 13-25, 2020) ( Fig. 2). The absence of inflow of migrants through the border with India and controlled local transmission due to lockdown policy may have contributed to the negative growth rate of Province 3 during the early phase. However, Province 3 also suffered from one of the highest growth rates during the later part of the epidemic as shown in Figs. 2 and 3 (median growth: 4% per day; quartiles [ Q 1 = 2%, Q 3 = 10%]). Immediately after the border opening, the daily growth rate increased in the provinces having a higher inflow of migrants from India (Province 1, 2, 5, 7).

Reproduction numbers.
While the growth rate computed above is useful to observe the day-to-day microscopic trend of the epidemics, it may not be a good indicator to capture the general trend of the disease due to its inability to include the disease-state specific distribution of infectiousness. The general trends can be better described by the reproduction number of the disease [50][51][52][53] . Here we utilize the compiled data to estimate the basic reproduction number and the effective reproduction number for provinces as well as for the whole country.  www.nature.com/scientificreports/ Basic reproduction number, R 0 . The basic reproduction number is defined as the number of secondary infections produced by an infected individual in a pool of a completely susceptible population during its infectious period. In general, if the value of the basic reproduction number is greater than unity, then the disease spreads, and if the value is less than unity, the disease dies out [50][51][52] . In Table 2, we present the values of basic reproduction number along with their 95% confidence intervals for Nepal and its provinces. We estimated the basic reproduction number for the whole country to be 1.083 (95% CI 1.075-1.093 ) ( Table 2), which is comparable to R 0 estimated for other countries 33,36,39,40,43,54,55 . Among the seven provinces, the highest reproduction number was estimated in Province 3 ( R 0 = 1.205 , 95% CI: 1.191-1.219 ) followed by Province 1 ( R 0 = 1.137 , 95% CI: 1.107-1.166 ) and then by Province 4 ( R 0 = 1.047 , 95% CI: 1.009-1.085 ). Since Province 1, 3, and 4 are the ones with the highest populous cities, the higher likelihood of contacts may have contributed to their higher values of R 0 . Note that the other provinces (2,4,5,6) have values of R 0 less than 1 ( Table 2), indicating the local spread of the disease in these provinces was under control during the initial phase. The observed disparity of R 0 , not only on its magnitude but also on its level to be less or greater than unity, highlights that the province-level R 0 may be necessary to accurately predict whether a province-wise outbreak occurs or not. It's worth noting that while some provinces had the epidemic under control ( R 0 < 1 ) because of the strict on-time government policies, the disease eventually spread to those provinces once the control policies were lifted, as indicated by the data.  www.nature.com/scientificreports/ Effective reproduction number, R t . One of the main assumptions while calculating the basic reproduction number, R 0 , is that all individuals who are not infected are equally susceptible to the virus. While this is usually true at the beginning of an epidemic, the susceptible populations change over time as some individuals in the population may be recovered or immune to the disease or become protected due to intervention policies. This implies that the actual number of secondary infections per infected individual changes over time, partly due to the impact of the policy changes. The number of secondary cases per infected individual in a population made up of both susceptible and non-susceptible individuals is known as the effective reproduction number (denoted by R t ) 56 . In general, R t < 1 ( R t > 1 ) indicates that the epidemic trend at time t is decreasing (increasing). Our estimated profile of the effective reproduction number in Nepal and its provinces, along with their 95% Credible Interval (gray region), is presented in Fig. 4. Note that as required by our method 57 we started computing R t only after the time point when there are at least some consecutive non-zero incidences and no sequence of consecutive zeros afterward. Our estimates show that the effective reproduction number of Nepal stayed below unity at almost all time points before the government started to ease the phase-wise lockdown (on June 11, 2020), indicating a decreasing trend in this period, consistent with the controlled phase of the epidemic. The observed decreasing trend ( R t < 1 ) for the initial phase, despite the initial surge of COVID-19 cases, is consistent with the fact that the initial increase in cases is attributable to imported cases rather than local transmission. After June 11, 2020, the effective reproduction number exceeded 1 and reached 1.24 on June 23, 2020, and then remained above 1 for the next 2 weeks. After the end of lockdown (July 21, 2020), a further increase of R t took place, making it 1.40 on July 30, 2020, and then it stayed above 1 almost all the time throughout the study period. Note that the pattern of outgrowing cases after the relaxation of policies is well represented by our estimates of R t . The trend of R t observed in our study is consistent with the trend observed in a previous differential equation modeling study 33 , with our estimates being slightly smaller than the estimates from the differential equation modeling, similar to the finding in other places 35 . This discrepancy may be described by the fact that our data includes only the recorded cases, missing those asymptomatic cases which were not recorded.
The temporal pattern of the effective reproduction number in all provinces appeared to be more or less similar after July 21, 2020, showing that all provinces experienced the increasing trend of cases in the absence of fullfledged policies. However, the magnitude of R t varies from province to province. For example, in Province 3, the value of R t reached as high as 2.86 (July 3, 2020) and as low as 0.17 (June 1, 2020) with the highest variance ( s 2 = 0.20 ), while Province 4 had the lowest variance ( s 2 = 0.09 ) with the value of R t ranging from 0.16 to 1.5. More importantly, we observed that the pattern of R t is quite different among provinces while the policies were in place (before July 21, 2020). Because of the data having consecutive zeros at the beginning of pandemic in Province 3, 4, 6, and 7, the non-zero effective reproduction number appeared only after May 20, 2020 (blue vertical line in Fig. 4) when the government opened the border with screening policy, while Province 1, 2, and 5 had continuous non-zero cases, and thus non-zero values of R t before May 20, 2020. Province 2 is the first province that had the earliest value of R t greater than 1 ( R t = 1.04 on May 12, 2020), while Province 3 had the value of R t greater than 1 only after more than a month, on June 25, 2020 ( R t = 1.07 ), which corresponds to the timing of the relaxation of within province restriction. Also, Province 5 and 7 did not experience R t > 1 (increasing trend) until July 5 and June 25, 2020, respectively. Moreover, right after June 22, 2020, Province 3 experienced a dramatic spike of R t leading to its value of 2.86 on July 3, 2020, while most of the other provinces had the value of R t less than 2 during the entire study period. These results show that the effectiveness of policies might vary across the provinces even though universal policies were implemented throughout the whole country.

Impact of policies on the epidemic indicators.
In this section, we used multiple linear regression models to study the effects of nation-wide implemented government policies on the control of COVID-19 spread. In particular, we estimated the potential impact of four government policies (predictor variables), lockdown, testing, isolation, and quarantine, on each of the three epidemic indicators (response variables), number of reported cases, the daily growth rate, and effective reproduction number. For the whole country as well as each province and for each epidemic indicator, we considered all possible models with one, two, three, and four predictor variables taken at a time. Then in each case, we compared the models using Akaike Information Criteria (AIC) to identify the best models that describe the data. The lower the AIC value, the better the model to describe the data. For the best models identified, we also provided the regression coefficients to quantify the impact of the policies in epidemic indicators.
Identification of the best models. The models having the lowest AIC values for 1, 2, 3, and 4 policies taken at a time are presented in Table 3. Our analysis of the whole country's data indicates that each of the epidemic indicators is best described by the model containing all four policies. This shows that each of the policies implemented by the Nepal government had contributed to the successful control of the reported cases, daily growth rate, and effective reproduction number. To further analyze, we also observed the models restricted to 1, 2, and 3 policies. Interestingly, our model analysis identified that the policies that best describe epidemic indicators are not identical for all of these indicators. For example, in the case of one policy, the reported case or the growth rate as a response variable can be best described by the model with the isolation policy, while the effective reproduction number as a response variable can be best described by the lockdown policy. Similarly, for the models restricted to two policies, (testing, isolation), (testing, quarantine), and (lockdown, testing) were found to be the best predictors of reported case number, daily growth rate, and the effective reproduction number, respectively. Also, for the three-policy models, lockdown, testing, and isolation can best predict the reported cases and the effective reproduction number, while the daily growth rate can be best predicted by testing, isolation, and quarantine.  www.nature.com/scientificreports/ We performed a comparative analysis on the modeling of province-wise data to examine whether similar policies can predict the epidemic indicators at the province level. Our results show that the number of policies that predict the epidemic indicators are quite different at the province-level from those at the national level (Table 3). For example, except the reported case in Province 1, 5, 6, and 7 and the effective reproduction number in Province 1 and 7, all other indicators in all provinces are described by the model with less than four policies. For Province 2 and 4, the reported cases are predicted by three policies, while only two policies predict the reported case of Province 3. Similarly, three policies are required to best predict the daily growth rate of Province 1, 2, 3, 4, 5, and 7, and the effective reproduction number of Province 2, 5, and 6. The daily growth rate in Province 6 and the reproduction number in Province 3 can be best predicted by the model with two policies. Interestingly enough, only one policy (lockdown) was found to be the best predictor of the reproduction number in Province 4.
Not only the number of policies but also the type of policies as best predictors vary among provinces as well as among epidemic indicators. For example, even though the effective reproduction number is predicted by the same number of policies (3 policies) in Province 2, 5, and 6, the actual policies as predictors are (lockdown, testing, quarantine), (lockdown, testing, isolation) and (lockdown, isolation, quarantine) for Province 2, 5, and 6, respectively. Similar disparities were observed for the reported cases and the growth rates, as presented in Table 3. Such disparities among provinces as well as among indicators can be attributable to the province-wise heterogeneity in inflow of migrants through the border with India and the population contact network.
Regression coefficients of the best models. In Table 4, we present the regression coefficients, along with their 95% confidence interval, of the best models identified above for each of the three epidemic indicators. Here β 0 Table 3. The models having the lowest AIC values for three response variables (the reported cases, the daily growth rate, and the basic reproduction number), with four predictors (L-lockdown, T-testing, I-isolation, and Q-quarantine) taken 1, 2, 3, and 4 at a time. The bold-face indicates the best model identified according to AIC values for each province and the whole country.

Reported cases
Growth rate R t www.nature.com/scientificreports/ indicates the value of the epidemic indicators in the absence of policies, while the coefficients β L , β T , β I , and β Q represent the slope (the rate of change of the indicator value per policy change) corresponding to lockdown, testing, isolation, and quarantine, respectively. We identified lockdown to be consistently effective to impact all indicators of the whole country, reducing the reported cases by 428 per day, the growth rate by 0.9%, and the effective reproductive number by 0.28. Surprisingly, our results provide positive coefficients for some indicators corresponding to some policies. While we acknowledge uncertainty on these coefficients due to misinformation in the data and the models, we note that these obscure findings can be specific to the context of Nepal. For example, an increase in the reported cases due to an increase in isolation and quarantine can be possible because the reported cases in Nepal at the beginning of the pandemic constitute those returning migrants who were quarantined and isolated following border screening. Also, an increase in the effective reproduction number due to an increase in testing or isolation could be because of the focus of testing and isolation on only those migrant workers screened at the border, who were less likely to be disease spreader in the local community. Further information, detailed data, and in-depth modeling can help ascertain these unclear results. We also provided the coefficients for the best models identified for each province (Table 4). A widely varying magnitude of estimated coefficients implies that the net effects of policies on the epidemic indicators vary among provinces. For example, lockdown was found to reduce the effective reproduction number from the minimum of 0.14 in Province 6 ( β L = −0.14 ) to the maximum of 0.53 in Province 3 ( β L = −0.53 ) . Surprisingly, the impact of the same policy on the same epidemic indicator was found to be positive in one province and negative in another province. While lockdown is mostly effective in reducing these indicators in most of the provinces (negative values of β L ), our modeling resulted in the positive value of β L for the reported cases in Province 7 and the daily growth rate in province 4 and 7. Even though these results appear to be paradoxical, it may be quite possible to be true in the context of Nepal due to geographical heterogeneity. In particular, Province 7's border has one of the most in-and-out flows of migrants. As a result, the reported cases, as well as the growth rate, can increase due to inflow of cases from India even after lockdown is implemented. Table 4. Regression Coefficients of the best model for the reported cases, the daily growth rate and the effective reproduction number, R t , in Nepal and its provinces.  www.nature.com/scientificreports/

Discussion
Although multiple COVID-19 vaccines have recently been developed, proper implementation remains a major issue for many countries, especially developing countries like Nepal, where poor management and availability of resources are often obstacles in public health programs. Since the long-term efficacy of the vaccines are not well-known, and a considerable portion of the population may be reluctant to receive the vaccines during the early phase of vaccination, the disease spread has been expected to continue for a longer time period. Also, the emergence of new strains of SARS-CoV-2, such as UK, Brazil, and South African strains, has posed a continued threat of future COVID-19 surge. Therefore, non-pharmaceutical interventions are still the recommended strategies for the prevention of disease spread. The effectiveness of various non-pharmaceutical interventions on controlling the disease has not been fully understood, particularly in the context of local-level heterogeneity in terms of geographical background and population contact network. Here, we took advantage of unique province-wise data set from Nepal that allowed us to quantify the impact of national-level policies (lockdown, testing, isolation, quarantine) on the province-wise epidemic indicators (reported cases, daily growth rate, effective reproduction number).
Presumably because of unique geographical and demographic features, our analysis and modeling of data from Nepal revealed a number of interesting findings regarding COVID-19 transmission and control policies. For each of the widely used epidemic indicators (reported case number, daily growth rate, and effective reproduction number), both the geographical situation, such as the open border with India, and contact network, such as population size and lifestyle (urban or rural), could play significant role in causing province-wise disparity to these indicators. The movement of seasonal labor migrants across the Indian border was primarily responsible for the first surge of COVID-19 in Nepal. However, the peak of this initial surge and the slope of the early linear trend of reported cases remained quite low, highlighting the effectiveness of policies that the Nepal government implemented right at the beginning of the pandemic. The dramatic surge with a high slope of the linear trend after lifting the policies also asserts the successful impact of these policies in the context of Nepal, where mobility across the open border is a huge issue for disease control. While these policies were in place (i.e., until June 2020), the effective reproduction number remains quite low below 1 (decreasing trend of local transmission), which after the relaxation of policies frequently increases to above 1 (increasing trend). This pattern of the effective reproduction number identified with our approach is similar to the one computed using dynamical system model 33 . The maintenance of the effective reproduction number below 1 observed in our study due to the policies that control imported cases is consistent with the estimated low reproduction number in other places where inter-cities travel restrictions were imposed 36,39,41 . Despite a continuous flow of cases from India, one of the most COVID-19 affected countries in the world, the policies implemented by the Nepal government ideally controlled the local transmission, and thus deserve in-depth analysis to identify their local level impacts. The COVID-19 epidemic control through travel restrictions and non-pharmaceutical interventions have also been reported in other places 4,12,14,17,[35][36][37][39][40][41][42][43] .
More importantly, our results show that the magnitude and trend of the epidemic indicators widely vary among provinces, most likely due to heterogeneity of border-sharing with India and population contact network. The magnitude and timing of peaks, as well as the slope of the linear trend (at the beginning of each surge), is quite different from province to province (Table 1). The province-wise variation is clearly seen in the pattern of daily growth rate as well (Fig. 1). For example, Province 3, which does not share a border check-point with India, has a negative growth rate during the early period of the pandemic, as opposed to other provinces (2, 5, 6, 7), which experience high mobility of migrant workers across open borders. Disparities in the growth rate has also been reported in a study that examined the effects of human mobility and control measures in provinces of China 12 . The effective reproduction number, which is widely considered to observe disease trends impacted by policies, was also found to vary widely among provinces both in magnitude as well as timing to cross threshold value of 1 (Fig. 4). In line with our finding, the disparity in R t -trend has also been found in two cities of China where imported cases also contributed to the COVID-19 cases at the beginning of the epidemic 36 . Such wide disparities across provinces found by our study underscores that the analysis of country-level data might not be enough to implement policies for reducing province-level epidemic indicators.
Given the potential disparities of indicators among provinces, we further performed analysis through modeling to identify how the national-level policies implemented by the Nepal government were translated into province-level control. Our model comparison found that all four policies, lockdown, testing, isolation, quarantine, significantly contributed to each of the reported cases, the daily growth rate, and the effective reproduction number when the analysis is performed at the country level. However, the local level modeling indicates that the number and type of policy that was effective significantly vary among provinces as well as among epidemic indicators; only one policy (lockdown) was effective to reduce reproduction number for Province 4, while a combination of all four policies was effective for Province 1 and 7 (Table 3). On the other hand, for the same Province 4, three policies (testing, isolation, quarantine) were effective to alter the reported cases, while a different combination of three policies (lockdown, isolation, quarantine) was effective to alter the daily growth rate. In addition to the number and type of policies, the magnitude of effects of these policies is quite different from province to province (Table 4). In general, lockdown appeared to be the most effective policy to control most of the indicators, consistent with previous studies 33,36,37,39,40,42,43 , although there were some exceptions. Combining all these analyses, our results indicate that the net effect of the policies can be province-dependent despite the policy planning at national level, and thus province-level analysis is critical to proper implementation of government policies.
We acknowledge several limitations of our study. The data used in this study was limited and bears some uncertainties. In particular, the COVID-19 cases include only those that were tested and confirmed positive. However, several studies 6 www.nature.com/scientificreports/ who can spread the virus, even though their infectivity still remains uncertain. Therefore, our estimates may have underestimated the values of indicators, especially the reproduction numbers. The actual number of active cases was not available, thus there is some uncertainty on the computed daily growth rates. Despite the strict policy of the government of Nepal to perform border screens, there were cases that avoided the border screening, causing potential direct community transmission by infected migrants. These cases, while assumed to be negligible, may contribute to the uncertainty of our estimates of growth rate and reproduction number. In our modeling, the effects of policies were assumed to be linear. However, the infectious disease transmission mechanism, especially in the presence of multiple policies, can be nonlinear. We note that our models provided positive coefficients in some cases, showing the increase in policy level in those cases was associated with the increased epidemic indicator. As discussed earlier, while it appears to be paradoxical, this may be possible to be true in the context of Nepal. For example, higher testing and isolation can be associated with a higher reproduction number because the testing and isolation were primarily focused on seasonal migrants rather than individuals in the local community. It is worth mentioning that we obtained all coefficients negatives when we restricted our model to a single predictor. Uncertainty around these results needs more accurate data along with complex nonlinear models, such as dynamical system models.
In summary, our study on province-wise COVID-19 data from Nepal provides a baseline for evaluating locallevel epidemic indicators under the commonly practiced national-level policy implementation. As demonstrated by our analyses, the net effects of nationally implemented policies can widely vary on the province-level epidemic indicators due to heterogeneity on the geographical situation (open border with India) and the contact network (population size and urbanization). Therefore, local level analyses are important for estimating epidemic indicators and designing ideal non-pharmaceutical policies for the control of local COVID-19 outbreaks.

Methods
Data collection and analysis. The countrywide and province-wise data of COVID-19 outbreak and the government policies implemented were obtained from a wide range of publicly available sources, including official websites of different ministries of Nepal, public health information sources, and national and regional newspapers 30,44,[59][60][61] . Specifically, the numerical data include "reported case number", "recovered case number", "death case number", "number of tests performed", "number of people sent to quarantine and isolation" for each province as well as the whole country. In this study, we considered the number of new cases and active cases data from March 29 through October 15, 2020 and the available policy-related data from March 29 through September 27, 2020. Since the active case numbers were not available, we approximated them by subtracting the recovered case number and death number from the reported case number. A few missing data are estimated by taking the average between the value from the previous day and the following day.
As suggested by several government agencies, most of the new cases before mid-June were seen in the people who recently returned from India 62 . Thus we assumed that 80% of the new incidences before June 18, 2020 were contributed by imported cases, and after June 18, 2020, all the new cases were assumed to be local cases. We also considered the data set generated assuming 70% and 90% contributed by the imported cases and found that the results with these new data sets are similar to the ones with 80% imported cases. Therefore, all the results presented in this study are based on 80% of cases contributed by imported cases before June 18, 2020. All the compiled data sets that we used in this study are provided in the GitHub public repository (see Data Availability).
Estimation of the epidemic growth rate. The daily exponential growth rate has been considered to be a useful indicator of epidemic progress. As discussed in Hsiang et al. 34 , the number of infected cases for a short time period ( t 1 to t 2 ) can be approximated by an exponential growth model. Thus, we take Taking the natural logarithm on both sides we get ln (I t 2 ) − ln (I t 1 ) = g t 1 (t 2 − t 1 ) . For the daily growth rate, we take t 2 − t 1 = 1 , which implies the daily growth rate at time t 1 to be Estimation of the basic reproduction number, R 0 . The basic reproduction number, R 0 , is defined as the number of new infections produced by an infected individual during its infection lifetime 63 . There is a number of methods to estimate the basic reproduction number from the disease incidence data. Among them, the method proposed by White and Pagano 56,64 , known as the maximum likelihood method (MLM), is widely used in many studies. Therefore, we selected MLM to estimate R 0 for Nepal and its provinces. However, for the purpose of comparison, we also estimated R 0 using the exponential growth method (EGM), and obtained the values for R 0 similar to the ones obtained by MLM. To implement MLM, we require the incidence data, the number of new cases over consecutive time units represented by (N 0 , N 1 , . . . , N T ) , and the distribution of generation time, which is defined as the time duration from onset of infectiousness in a primary case to the onset of infectiousness in a secondary case that was infected by the primary case 65 . We note that at each time t, the total incident cases, N t , is the sum of the number of incident local cases ( N local t ) and imported cases ( N imported t ), i.e., Since the generation time is a nonobservable phenomenon, especially in COVID-19 due to the presence of incubation phase, we approximated it by the serial interval, which is defined as the duration of time between the onset of symptoms in a primary case (infector) and the onset of symptoms in a secondary case (infectee) that was infected by the primary case 56,65 . Following the previous studies 38, [66][67][68] , in which the data on the serial I t 2 = I t 1 e g t 1 (t 2 −t 1 ) . www.nature.com/scientificreports/ interval of COVID-19 was found to follow the gamma distribution, we used the gamma-distribution for the serial interval in our computation. In this study, we present the results with the gamma distributed serial interval with the mean of 3.96 days and the standard deviation of 4.75 days as reported by Du et al. 69 , while we note that some other studies have reported the mean varying from 1.77 to 5.30 38,[66][67][68] . We assumed that the secondary cases produced by an infected individual follow a Poisson process, with an expected value R 0 . If w s is a distribution of the serial interval, R 0 can be estimated by maximizing the loglikelihood function where µ t = R 0 t s=1 N t−s w s . This provides the following expression for R 0 We carried out the computation in R-software using ' R 0 ' package that implements MLM to estimate the basic reproduction number for Nepal and its provinces.
Estimation of the effective reproduction number, R t . The effective reproduction number, R t , is the real-time estimation of the reproduction number. We use a similar idea as in the case of R 0 and assume poisson process 56,57,70 . Therefore for the time-dependent R t , the likelihood function becomes where, k = k s=1 N k−s w s . Similar to the technique implemented in previous studies 57,70 , τ represents the length of the time window over which R t is assumed to be constant. In the Bayesian framework, it can be shown that a Gamma prior distribution for R t,τ with parameters (a, b) implies the posterior distribution of R t,τ to be Gamma distribution with parameters 57,70 As in the calculation of R 0 , we consider the serial interval following gamma distribution with the mean of 3.96 days and the standard deviation of 4.75 days 69 . All computations were performed in R-software using 'EpiEstim' package 57,70 . Modeling via multiple regression analysis. We considered four policies implemented by the government of Nepal: lockdown ( L t ), testing ( T t ), isolation ( I t ), and quarantine ( Q t ). Assuming the linear relationship between these policies and the epidemic indicators (the reported case number, N t , the daily growth rate, g t , and the effective reproduction number, R t ), we modeled each of the epidemic indicators using the multiple linear regression as follows 34,71,72 where y t represents one of the three epidemic indicators ( N t , g t , R t ), whose analysis is required. Here, β 0 , β L , β T , β I , and β Q represent regression coefficients corresponding to no policy, lockdown, testing, isolation, and quarantine, respectively. ε t is the residual term, which is assumed to have an expected value of zero with constant variance and to be independently and identically distributed. As per the available data, the lockdown is a categorical variable (taken as L t = 1 or 0 depending on whether or not lockdown is in effect), and the remaining other three policies are numerical.
For each epidemic indicator, we considered all possible models with 1, 2, 3, 4 policies taken at a time, i.e., the total of 15 possible models. We performed data fitting to these models, and chose the best model on the basis of their AIC (Akaike information criterion) values, which is given by 71,72 Here n is the number of days for which the data was recorded, σ 2 is an estimate of the variance of the residual ε t , RSS is the residual sum of squares, and d is the number of policies in the model. We repeated the process for each province and for the whole country.

Data availability
The compiled data that are analyzed are available in the github public repository. https:// github. com/ Subas-Achar ya/ COVID_ Proje ct.