Modeling the first wave of Covid-19 pandemic in the Republic of Cyprus

We present different data analytic methodologies that have been applied to understand the evolution of the first wave of the Coronavirus disease 2019 (COVID-19) in the Republic of Cyprus and the effect of the intervention measures taken by the government. Change-point detection has been used to estimate the number and locations of changes in the behaviour of the collected data. Count time series methods have been employed to provide short-term projections, and several compartmental models have been fitted to the data, providing long-term projections on the pandemic’s evolution and allowing for the estimation of the effective reproduction number.


A1 Isolate-Detect Methodology
The existing change-point detection techniques for the scenarios mentioned in the Change-point Analysis and Projections subsection of the main text are mainly split into two categories, based on whether the change-points are detected all at once or one at a time. The former category mainly includes optimization-based methods, in which the estimated signal is chosen by optimizing a least-squares or log-likelihood criterion penalized by a complexity term in order to avoid overfitting. The most common example of a penalty function is the Bayesian Information Criterion (BIC); see (8) and (9) for details. In the latter category, in which change-points are detected one at a time, a popular method is binary segmentation, which performs an iterative binary splitting of the data on intervals determined by the previously obtained splits. Even though binary segmentation is conceptually simple, it has the disadvantage that at each step of the algorithm it looks for a single change-point in possibly long intervals, which leads to its suboptimality in terms of accuracy, especially for signals with frequent change-points. The Isolate-Detect (ID) methodology of (1), which is used for the analysis carried out in our paper, works towards solving this issue.
The concept behind ID is simple and is split into two stages: first, the isolation of each of the true change-points within subintervals of the domain [1, 2, . . . , T], and second, their detection. The basic idea is that for an observed data sequence of length T and with λ_T a positive constant, ID first creates two ordered sets of Q = ⌈T/λ_T⌉ right- and left-expanding intervals as follows. The j-th right-expanding interval is R_j = [1, j λ_T], while the j-th left-expanding interval is L_j = [T − j λ_T + 1, T]. These intervals are collected in the ordered set S_RL = {R_1, L_1, R_2, L_2, . . . , R_Q, L_Q}. For a suitably chosen contrast function, ID identifies the point with the maximum contrast value in R_1. If its value exceeds a threshold, denoted by ζ_T, then it is taken as a change-point. If not, then the process tests the next interval in S_RL. Upon detection, the algorithm makes a new start from the end-point (or start-point) of the right- (or left-) expanding interval where the detection occurred.
For clarity of exposition, we give below a simple example. Figure A1 covers a specific case of two change-points, r_1 = 38 and r_2 = 77. We will be referring to Phases 1 and 2, involving six and four intervals, respectively. These are clearly indicated in the figure and are only related to this specific example; cases with more change-points entail more such phases. At the beginning, s = 1, e = T = 100, and we take the expansion parameter λ_T = 10. Then, r_2 gets detected in {X_s*, X_s*+1, . . . , X_e}, where s* = 71. After the detection, e is updated as the start-point of the interval where the detection occurred; therefore, e = 71. In Phase 2, indicated in the plot, ID is applied in [s, e] = [1, 71]. Intervals 1, 3 and 5 of Phase 1 will not be re-examined in Phase 2, and r_1 gets detected in {X_s, X_s+1, . . . , X_e*}, where e* = 40.
After the detection, s is updated as the end-point of the interval where the detection occurred; therefore, s = 40, and ID continues on [s, e] = [40, 71] until no further change-points are detected.
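The interval-expansion scheme described above can be sketched in a few lines of code. The following is a minimal illustration, not the authors' implementation: `id_intervals` builds the ordered set S_RL of right- and left-expanding intervals for a segment [s, e], and `cusum` is one common choice of contrast function for piecewise-constant mean changes (the actual ID method of (1) combines this with the threshold ζ_T and the restart logic described in the text).

```python
import numpy as np

def id_intervals(s, e, lam):
    """Build the ordered set S_RL = {R1, L1, R2, L2, ...} of right- and
    left-expanding intervals for the segment [s, e] (0-indexed), with
    expansion parameter lam, as in the Isolate-Detect construction."""
    intervals = []
    q = int(np.ceil((e - s + 1) / lam))
    for j in range(1, q + 1):
        r_end = min(s + j * lam, e)    # right-expanding: grows from s
        l_start = max(e - j * lam, s)  # left-expanding: grows from e
        intervals.append(("R", s, r_end))
        intervals.append(("L", l_start, e))
    return intervals

def cusum(x, s, e, b):
    """CUSUM contrast at candidate split b for a change in mean on x[s..e]."""
    n = e - s + 1
    n1 = b - s + 1
    n2 = n - n1
    left = x[s:b + 1].mean()
    right = x[b + 1:e + 1].mean()
    return np.sqrt(n1 * n2 / n) * abs(left - right)
```

For a segment of length 100 with lam = 10, as in the example above, the first intervals examined are R_1 = [s, s + 10] and L_1 = [e − 10, e]; a detected point restarts the procedure on the shortened segment.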

A2 More on Count Time Series Models and Interventions
Recall model (2) of the main text, whose parameters d, a_1, b_1 can be positive or negative but need to satisfy certain conditions so that we obtain stable behavior of the process. Note that the lagged observations of the response X_t are fed into the autoregressive equation for ν_t via the term log(X_{t−1} + 1).
This is a one-to-one transformation of X_{t−1} which avoids zero data values. Moreover, both λ_t and X_t are transformed to the same scale. Covariates can be easily accommodated by model (2) of the main text.
When a_1 = 0, we obtain an AR(1)-type model in terms of log(X_{t−1} + 1). In addition, the log-intensity process of (2) can be rewritten, after repeated substitution, as

ν_t = d Σ_{i=0}^{t−1} a_1^i + a_1^t ν_0 + b_1 Σ_{i=0}^{t−1} a_1^i log(X_{t−i−1} + 1).

Hence, we obtain again that the hidden process {ν_t} is determined by past functions of lagged responses, i.e. (2) belongs to the class of observation-driven models; see (4).
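The recursion just described is easy to simulate. The sketch below, assuming the log-linear specification ν_t = d + a_1 ν_{t−1} + b_1 log(X_{t−1} + 1) with X_t | past ∼ Poisson(exp(ν_t)) (the standard form of such models; the exact parametrization of (2) is in the main text), shows how the hidden log-intensity is driven by the lagged counts:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_loglinear_poisson(T, d, a1, b1, nu0=0.0):
    """Simulate an observation-driven log-linear Poisson autoregression:
    nu_t = d + a1 * nu_{t-1} + b1 * log(X_{t-1} + 1),
    X_t | past ~ Poisson(exp(nu_t)).
    The log(x + 1) link keeps zero counts in the feedback loop."""
    nu = np.empty(T)
    x = np.empty(T, dtype=np.int64)
    nu_prev, x_prev = nu0, 0
    for t in range(T):
        nu[t] = d + a1 * nu_prev + b1 * np.log(x_prev + 1)
        x[t] = rng.poisson(np.exp(nu[t]))
        nu_prev, x_prev = nu[t], x[t]
    return x, nu
```

Stability of the sampled paths requires the stationarity conditions on d, a_1, b_1 alluded to above (for the log-linear model, |a_1 + b_1| < 1 is a commonly stated sufficient condition); parameter values outside that region can make the simulated counts explode.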
Models like (2) of the main text can accommodate various types of interventions (or extraordinary observations) by suitable modification. Generally speaking, intervention effects on time series data are classified according to whether their impact is concentrated on a single or a few data points, or whether they affect the whole process from some specific time t = τ on. In classical linear time series methodology, an intervention effect is included in the observation equation by employing a sequence of deterministic covariates of the form ξ(B) I_t(τ), where ξ(B) is a polynomial operator, B is the shift operator such that B X_t = X_{t−1}, and I_t(τ) is an indicator sequence marking the intervention time τ. However, the intensity process of a count time series model like (2) is determined by a latent process. Therefore a formal linear structure, as in the case of the Gaussian linear time series model, does not hold any more, and interpretation of the interventions is a more complicated issue. Hence, a method which allows detection of interventions and estimation of their size is needed so that structural changes can be identified successfully.
Important steps to achieve this goal are the following; see (2):
1. A suitable model for accommodating interventions in count time series data.
2. Derivation of test procedures for their successful detection.
3. Implementation of joint maximum likelihood estimation of model parameters and outlier sizes.
4. Correction of the observed series for the detected interventions.
All these issues and possible directions for further developments of the methodology have been addressed by (6) under the Poisson and mixed Poisson distributional framework.

A3 Computational Details for Fitting Equations (6) of the main text
According to the official reports, the numbers of quarantined cases (Q), recovered cases (R) and deaths (D) due to COVID-19 are available. However, the recovered and death cases are directly related to the number of quarantined cases, which plays an important role in the analysis, especially since the numbers of exposed (E) and infectious (I) cases are very hard to determine. The latter two are therefore treated as hidden variables.
This implies that we need to estimate the four parameters ζ, β, γ−1, δ−1 and both the time-dependent cure rate λ(t) and mortality rate κ(t). This is an optimization problem that we solve as follows: first we allow the latent time γ−1 to vary between 1 and 7 days and, for each fixed γ−1, we explore its influence on the rest of the parameters. The system of differential equations (6) in the main text is solved numerically using the Runge–Kutta 45 (RK45) scheme. The left plot of Figure A2 shows that the protection rate ζ and the transmission rate β both attain their corresponding maximum value when γ−1 is equal to 3 days. Note that ζ takes values between 0.08 and 0.2, while β converges very fast to 1. The reciprocal of the quarantine time, δ−1, increases with the latent time γ−1. One would suspect that a longer latent time results in a higher transmission rate, and as the latent time increases almost every unprotected person will be infected after a direct contact with a COVID-19 patient. The right plot of Figure A2 shows the effect of the latent time on the total number of infected cases (exposed and infectious, E(t) + I(t)) not yet quarantined, for the corresponding estimated values of ζ, β and δ−1. After a small sensitivity analysis, the latent time was finally determined as 3 days. The mortality rate κ(t) is constantly very small and almost equal to zero; therefore we have not attempted to fit any function to it. For the cure rate λ(t) we have fitted the exponential function given in (9) in the main text, the idea being that with time the recovery should converge to a constant rate. For the parameter estimation we have used a modified version of the MATLAB code given by (3), because Cyprus is a small country and this fact needs to be taken properly into account.
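The numerical solution step can be illustrated as follows. This is a sketch, not the authors' MATLAB code: it assumes a generalized SEIR structure with compartments S, E, I, Q, R, D and protected P, constant rates in place of the time-dependent λ(t) and κ(t), an illustrative population size, and uses SciPy's RK45 integrator (the same scheme named in the text):

```python
import numpy as np
from scipy.integrate import solve_ivp

def seirqd(t, y, N, zeta, beta, gamma, delta, lam, kappa):
    """Right-hand side of a generalized SEIR model (a sketch of the
    structure of equations (6) of the main text; rates held constant)."""
    S, E, I, Q, R, D, P = y
    dS = -beta * S * I / N - zeta * S   # infection pressure + protection rate zeta
    dE = beta * S * I / N - gamma * E   # latency ends at rate gamma (latent time 1/gamma)
    dI = gamma * E - delta * I          # quarantined at rate delta (quarantine time 1/delta)
    dQ = delta * I - lam * Q - kappa * Q
    dR = lam * Q                        # cure rate lam
    dD = kappa * Q                      # mortality rate kappa
    dP = zeta * S                       # protected (insusceptible) population
    return [dS, dE, dI, dQ, dR, dD, dP]

N = 875_000                             # illustrative population (assumption)
y0 = [N - 20, 10, 10, 0, 0, 0, 0]       # illustrative initial condition
sol = solve_ivp(seirqd, (0, 120), y0, method="RK45", rtol=1e-8,
                args=(N, 0.1, 1.0, 1 / 3, 1 / 5, 0.05, 0.001),
                dense_output=True)
```

The parameter values here (ζ = 0.1, β = 1, γ−1 = 3 days, δ−1 = 5 days) are only placeholders in the ranges discussed in the text; in the actual analysis they are the quantities being optimized, with λ(t) replaced by the fitted exponential of (9).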

A4 Prior Modelling and Computational Details for the Estimation of the Effective Reproduction Number R_t Section of the Main Text
We present a Bayesian analysis for the model defined by (4) of the main text. Following (5), we assume that the daily numbers of reported cases are independent Gaussian random variables with an empirical variance (see (7)), and we sample from the posterior using an independence sampler. This is in contrast to the model defined by (5) in the Compartmental Model 2 subsection of the main text, see (5), where one has to use the Ensemble Adjustment Kalman Filter (EAKF), which introduces some approximations to the posterior distribution due to the more complex meta-population structure. Originally developed for use in weather prediction, the EAKF assumes that both the prior distribution and the likelihood are Gaussian, and thus fully characterized by their first two moments (mean and variance); it adjusts the prior ensemble to the posterior deterministically via Bayes rule (posterior ∝ prior × likelihood), computed through the convolution of the two Gaussian distributions (see (5) for the implementation).
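The deterministic Gaussian update at the heart of the EAKF can be illustrated in one dimension. This sketch (our own toy version, not the implementation of (5)) combines the ensemble's sample mean and variance with a Gaussian likelihood and then shifts and rescales the members, so that no random observation perturbation is needed:

```python
import numpy as np

def eakf_update(ensemble, obs, obs_var):
    """One-dimensional EAKF update. The prior is summarized by the ensemble
    mean and variance; the likelihood is Gaussian with mean `obs` and
    variance `obs_var`. Members are shifted and rescaled deterministically
    so the updated ensemble matches the posterior mean and variance."""
    prior_mean = ensemble.mean()
    prior_var = ensemble.var(ddof=1)
    # product of two Gaussians: precision-weighted combination
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    post_mean = post_var * (prior_mean / prior_var + obs / obs_var)
    # deterministic adjustment: translate to post_mean, shrink spread
    return post_mean + np.sqrt(post_var / prior_var) * (ensemble - prior_mean)
```

With a standard-normal prior ensemble and an observation of 2.0 with unit variance, the updated ensemble concentrates near the posterior mean 1.0 with variance 0.5, exactly halving the prior spread, which is the deterministic behaviour contrasted in the text with the independence sampler used for model (4).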

A5 Results of Change-Point Analysis for Piecewise-Constant Model
We report the results obtained after fitting a piecewise-constant signal plus noise model, as described in