Mathematical epidemiologic and simulation modelling of first wave COVID-19 in Malaysia

Since the first coronavirus disease 2019 (COVID-19) outbreak appeared in Wuhan, mainland China on December 31, 2019, the geographical spread of the epidemic has been swift. Malaysia is one of the countries that were hit substantially by the outbreak, particularly in the second wave. This study aims to simulate the infectious trend and trajectory of COVID-19 to understand the severity of the disease and determine the approximate number of days required for the trend to decline. The number of confirmed positive infectious cases [as reported by the Ministry of Health, Malaysia (MOH)] from January 25, 2020 to March 31, 2020 was used. This study simulated the infectious count for the same duration to assess the predictive capability of the Susceptible-Infectious-Recovered (SIR) model. The same model was used to project the simulation trajectory of confirmed positive infectious cases for 80 days from the beginning of the outbreak, and the trajectory was extended for another 30 days to obtain an overall picture of the severity of the disease in Malaysia. The transmission rate, β, was also utilized to predict the cumulative number of infectious individuals. Using the SIR model, the simulated infectious case count obtained was not far from the actual count. The simulated trend was able to mimic the actual count and capture the actual spikes approximately. The infectious trajectory simulation for 80 days and the extended trajectory for 110 days depict that the inclining trend has peaked and ended and will decline towards late April 2020. Furthermore, the predicted cumulative number of infectious individuals tallies with the preparations undertaken by the MOH. The simulation indicates the severity of COVID-19 disease in Malaysia, suggesting a peak of infectiousness in mid-March 2020 and a probable decline in late April 2020.
Overall, the study findings indicate that outbreak control measures such as the Movement Control Order (MCO), social distancing and increased hygienic awareness are needed to control the transmission of the outbreak in Malaysia.

www.nature.com/scientificreports/
these 23 days, only 22 confirmed cases were recorded, with eight recovered patients and zero deaths, while minimal outbreak control measures were exercised. Moreover, there were no new cases recorded for the following 10 days. However, on February 27, 2020, an incident involving two cases marked the beginning of the second outbreak wave in Malaysia. The first 18 days of the second wave did not see a tremendous increase in the number of new cases; however, on March 15, 2020, the number rose suddenly to 190 individuals from only 35 on the previous day. This alarming rise was due to the identification of large clusters of susceptible individuals in contact with infectious individual(s). Following that, the first deaths, involving two cases, occurred on March 17, 2020. The sudden hike and severe outbreak spread changed the Malaysian context of the epidemic from being in control to a dangerous state and triggered the implementation of the Movement Control Order (MCO) on March 18, 2020 16 . Since the first rise on March 15, 2020, the inclining trend of new cases each day continued with an average of 150 cases per day and was hit with a major spike of 235 cases on March 26, 2020. On top of that, the number of deaths started spiraling upwards 9 , with more than 60% of fatalities being patients over the age of 60 years, including those with underlying conditions such as hypertension and diabetes (as of March 31, 2020), in line with the findings of the clinical progress study by Guan et al. 10 . The Malaysian scenario of the COVID-19 outbreak from January 25, 2020 to March 31, 2020 can be visualized in Fig. 1.
A well-known contradiction in the early stage of a pandemic is that only minimal knowledge about the disease is available, yet the need for such information is extremely high 17 . This contradiction evidently holds for COVID-19. As such, research on modelling using available data plays a crucial role in times of a pandemic 18,19 . Accordingly, estimating the infectious count over time can provide a better understanding of the current epidemiological situation in many countries, including Malaysia, and can provide insights into the measurable effect of the outbreak control measures undertaken 20 . Analyses providing such estimates enable predictions of potential future growth, which assist in risk estimation for regional countries and in planning alternative interventions or increasing the intensity of existing interventions 20-22 . Given the perturbing situation in Malaysia, assessing the infectious trend of COVID-19 is crucial to measure the pandemic's severity, as evidenced by the enormous growth from three to 2766 cases within only 67 days (as of March 31, 2020).
Nevertheless, performing such analyses, especially in real time, is often difficult due to constraints such as the delay in symptom appearance resulting from the incubation period and the delay in confirmation of positive cases resulting from limitations in testing and detection capacity 20,23 . Mathematical modelling of infectious diseases can help overcome the constraints caused by these delays and uncertainty 24 . The most common modelling approach to simulate the probable trajectory and severity of a contagious disease outbreak is the Susceptible-Infectious-Recovered (SIR) model 9 . As anticipated, several studies have applied the SIR model 25,26 and its extensions, such as the Susceptible-Exposed-Infectious-Recovered (SEIR) model 27-32 , the Susceptible-Exposed-Infectious-Hospitalized-Recovered (SEIHR) model 29 and a model with the compartments Susceptible (S), Exposed/pre asymptomatic (E), Asymptomatic (A), Symptomatic (I), Recovery (R) and the Virus in the Environment, that is, on surfaces (V) 33 , to the current COVID-19 outbreak at global and national levels. Sun and Weng 34 constructed an adjusted model with two novel features: the asymptomatic population and recovery threshold behaviour. Meanwhile, auto-regressive integrated moving   38 provided a spatio-temporal approach for quantifying regional compliance with the US COVID-19 mitigation strategies.
In Malaysia, several studies have been conducted. Wong et al. 39 modified the SIR model under vaccine intervention in several localities of Malaysia by simulating the COVID-19 spread. Salman et al. 40 used a simple universality class of the SIR system and adaptations thereof to undertake scenario analysis for COVID-19 in Malaysia (i.e., the inclusion of temporary immunity through the reinfection problem and limited medical resources scenarios leads to a SIRS-type model). Furthermore, Abidemi et al. 41 developed a deterministic compartmental model to examine the impact of various pharmaceutical and nonpharmaceutical control measures on COVID-19 population dynamics in Malaysia. Using the SIR model, we simulated the infectious trend of COVID-19 in Malaysia to estimate the COVID-19 transmission pattern for a period of 67 days. The simulation is used to obtain an overall picture of COVID-19's potential severity in Malaysia.
We searched Google Scholar, medRxiv, and arXiv for peer-reviewed articles, preprints, and research reports on the modelling of coronavirus disease 2019 (COVID-19) severity using the search terms "COVID-19 modelling", "epidemic model COVID-19", "mathematical modelling COVID-19", "SIR COVID-19" and "SEIR COVID-19" up to March 30, 2020. No language restrictions were applied. We identified 12 papers that were relevant in the context of mathematical modelling for COVID-19 and that can be applied to the Malaysian scenario. Most of the papers focused on the outbreak at the global level, in Hubei, China, China in general, South Korea, Italy, and Iran. We also found several estimates of the transmission rates, recovery rates and the basic reproductive numbers used in these papers.
Until now, there has been scarce information on the changing severity and transmission dynamics of COVID-19 in the Southeast Asia region, particularly in Malaysia, which holds the highest number of cases (more than 2500 active cases) at the time of writing (March-April 2020).
In the absence of a complete study for Malaysia or the Southeast Asia region in particular, we inferred the severity of COVID-19 infectiousness in Malaysia by simulating the infectious count against the actual count. We used the same model to project the simulation trajectory into the future (up to 110 days since day zero of the outbreak) to estimate the approximate number of days for the inclining trend and sudden spikes to decline. Furthermore, we predicted the cumulative number of infectious individuals in order to assess the preparations undertaken by MOH, that is, whether MOH has prepared the minimal number of beds. We show that the severity dynamics of COVID-19 in Malaysia are rapidly changing and should be closely monitored. Our findings suggest that outbreak control measures such as stricter enforcement of the Movement Control Order, social distancing and increased hygienic awareness are needed in order to control the local transmission of the outbreak.

Results
Actual and simulated current infectious trend. The visualization of the first 67 days of the COVID-19 outbreak in Malaysia is shown in Fig. 1, with the four major timelines each marked with a dotted black line. The infectious count is represented by a blue line, the recovered count by a yellow line, and the death count by a red line. The SIR model is initialized (at time t = 0) with the initial conditions S(0) = S_0 = 0.999 (or 999 for computational simplicity) and I(0) = I_0 = 0.001 (or 1 for computational simplicity), while R(0) = R_0 = 0. We then set the basic reproductive number N_0 for this study to N_0 = 2.44, the average of the values estimated using stochastic methods in two previous studies 27,31 . It is consistent with the range estimated by WHO 1 and Soetaert et al. 42 . The average number of days to recovery is assumed to be D = 11 days, based on the first recovered case in Malaysia. It follows that the recovery rate is γ = 1/11 ≈ 0.09. Next, by using N_0 = β/γ 43 , we derive the value of the transmission rate, β = 2.44(0.09) ≈ 0.22. Finally, with the values of S_0, I_0, and R_0, the differential equations were solved to obtain the values of each compartment at each time point (in days) from day zero (January 25, 2020) to day 67 (March 31, 2020).
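The parameterization above can be sketched in a few lines of code. The sketch below is not the authors' R/deSolve script: it is a minimal pure-Python illustration that assumes the standard SIR equations with the infection term βSI/N, the scaled initial conditions S_0 = 999 and I_0 = 1 (so N = 1000), and a hand-written classical RK4 integrator. All function names are hypothetical.

```python
def sir_derivatives(state, beta, gamma, n):
    """Right-hand side of the standard SIR equations (assumed form)."""
    s, i, r = state
    new_infections = beta * s * i / n
    recoveries = gamma * i
    return (-new_infections, new_infections - recoveries, recoveries)


def rk4_step(state, dt, beta, gamma, n):
    """Advance the state by one classical Runge-Kutta (RK4) step."""
    def shifted(base, slope, h):
        return tuple(b + h * k for b, k in zip(base, slope))

    k1 = sir_derivatives(state, beta, gamma, n)
    k2 = sir_derivatives(shifted(state, k1, dt / 2), beta, gamma, n)
    k3 = sir_derivatives(shifted(state, k2, dt / 2), beta, gamma, n)
    k4 = sir_derivatives(shifted(state, k3, dt), beta, gamma, n)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_sir(s0, i0, r0, beta, gamma, days, steps_per_day=10):
    """Return one (S, I, R) tuple per day, from day 0 to `days` inclusive."""
    n = s0 + i0 + r0  # fixed population: no births or deaths
    dt = 1.0 / steps_per_day
    state = (float(s0), float(i0), float(r0))
    trajectory = [state]
    for _ in range(days):
        for _ in range(steps_per_day):
            state = rk4_step(state, dt, beta, gamma, n)
        trajectory.append(state)
    return trajectory


# Values quoted in the text: D = 11 days, N_0 = 2.44, S_0 = 999, I_0 = 1.
gamma = 1 / 11          # recovery rate, 1/D
beta = 2.44 * gamma     # transmission rate from N_0 = beta / gamma
trajectory = simulate_sir(999, 1, 0, beta, gamma, days=110)
peak_day = max(range(len(trajectory)), key=lambda d: trajectory[d][1])
print(f"simulated peak of about {trajectory[peak_day][1]:.0f} infectious on day {peak_day}")
```

Under these assumptions the simulated peak should land in the same range as the value reported in the text, although exact agreement with the original deSolve run is not guaranteed because the rounding of β and γ and the solver settings may differ.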
It can be seen that in the initial stage (day zero to day 15, Fig. 2a), the simulated counts were close to the actual counts. However, after day 20, the simulation showed an upward trend, with a peak value of 224 infectious individuals on day 56 (March 21, 2020). The simulated trend started declining after day 58. On the other hand, the actual infectious counts only started increasing drastically after day 50, and there were three major spikes between days 50 and 61. The actual peak value was 235 infectious individuals on day 61 (March 26, 2020). The standard SIR model was able to approximately mimic the actual trend and predict the actual spikes. The discrepancy in the SIR simulation between day 20 and day 50 is due to the nature of the actual counts (discrete values) when there is a sudden spike in the number of confirmed cases. Note that the simulated peak was also close to the actual peak. All three simulated compartments of the SIR model for the same period are shown in Fig. 2b. The simulated susceptible trend declines after day 30 and further after day 60. At the same time, the simulated recovered trend inclines after day 30, while the infectious trend steadily approaches a downward trend. The blue dots represent the simulated number of susceptible individuals, the red dots the simulated number of infectious individuals, and the yellow dots the simulated number of recovered individuals over the period of 67 days.
For the first trajectory, Fig. 2c shows that the simulated maximum count is 224 infectious individuals in a day. Thereafter, the simulated line dropped to below 200 infectious individuals in a day and further to below 110 infectious individuals at the end of the trajectory on April 13, 2020. For the second trajectory (Fig. 2d), which extends the previous one by 30 days, the infectious count declines steadily from day 80 (April 13, 2020) and reaches its lowest point, with fewer than 20 cases in a day, on May 13, 2020, which indicates that the severity of COVID-19 in Malaysia may reduce by mid-May 2020. The actual infectious count reported on April 13, 2020 was 134 new infectious individuals. Between April 14 and April 23, 2020, the actual data showed a downward trend. The actual infectious counts reported were 170, 85, 110, 69, 54, 84, 36, 57, 50 and 71 new infectious individuals, respectively, within that period. Except for the spike on April 14, 2020, the subsequent counts remained below the generated SIR curve. As such, the actual data on infectious individuals agree well with our SIR curve, which is based on β = 0.22. This supports the assumption that transmission from an infectious individual to a susceptible individual in Malaysia occurs at a conservative rate. That is, at the high end, an infectious individual in Malaysia will transmit the disease to a susceptible individual every 4 days, while at the low end, every 5 days.
Fig. 3 indicates that on average the difference between actual and simulated cases is about 10% (for cumulative cases) and about (for daily cases). Hence, the SIR model produces predicted daily values that are not far off from the actual daily values. As such, the SIR transmission rate, β = 0.22, which corresponds to a hypothesized scenario whereby an individual will infect another individual within a 4-day interval, should not be taken lightly. Furthermore, a one-to-one transmission at an interval of 4 days can be seen to be rather conservative. Nevertheless, as shown in Fig. 4, even at this rate, the exponential growth is visible.
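One way to read the interval of 4 to 5 days is as the reciprocal of the transmission rate. This reading is an interpretation rather than something the text states explicitly: in the SIR model, β is the rate at which one infectious individual transmits in a fully susceptible population, so 1/β is the mean time between transmissions.

```latex
\[
\frac{1}{\beta} = \frac{1}{0.22} \approx 4.5 \ \text{days},
\qquad\text{so}\qquad
4 \ \text{days} < \frac{1}{\beta} < 5 \ \text{days}.
\]
```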

Discussions
Based on the SIR model's simulation, we extend our discussion to predict the cumulative positive infectious cases of COVID-19 in Malaysia. This discussion is subject to further analysis since it is not conclusive in nature. However, we would like to highlight that the discussions provided here are based on definitions within the literature. We base our discussion on β = 0.22, which implies a one-to-one transmission after 4 days. This assumption is conservative.
We hypothesize that after an individual becomes infected, thus becoming an infectious individual, he will only display symptom(s) after 14 days, at which point he is confirmed positive for COVID-19 and is hospitalized thereafter. As such, after being hospitalized, he will not be able to be in contact with other susceptible individuals. This hypothesis is well supported by several studies, including Cruz et al. 46 , Del Rio and Malani 47 , Han and Yang 48 , and Walsh et al. 49 . Furthermore, we hypothesize that during the 14-day period, an infectious individual will be able to spread the disease to another individual within a 4-day interval. This hypothesis is supported by factors such as living conditions, social movements, exposure within known clusters, climate, and environmental conditions 12,13,50-53 . We assume that the first three infectious individuals in Malaysia (confirmed on January 25, 2020) were still at large within society at that point in time. With a 4-day transmission interval, the next three individuals would become infectious on January 29, 2020, and the next six individuals on February 2, 2020. On February 8, 2020, the first cohort of three individuals would display symptoms and be hospitalized. On February 12, 2020, the second cohort, consisting of the three individuals who became infectious on January 29, 2020, would display symptoms and be hospitalized. On February 16, 2020, the third cohort, consisting of the six individuals who became infectious on February 2, 2020, would display symptoms and be hospitalized. Before anybody from the first to the third cohort displays symptoms and is hospitalized, the fourth cohort of twelve individuals would become infectious on February 6, 2020. This cycle repeats itself ad infinitum until certain measures are able to halt the process. Figure 4 depicts this process.
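The cohort bookkeeping described above can be sketched as follows. This is a minimal illustration under explicit assumptions: every infectious individual still at large infects exactly one new individual every 4 days, and each cohort is hospitalized exactly 14 days after becoming infectious. The function name is hypothetical, and the sketch is only checked against the first four cohorts listed in the text; it is not expected to reproduce the cumulative totals read from Fig. 4, which may rest on bookkeeping details not spelled out here.

```python
from datetime import date, timedelta


def cohort_schedule(start, initial_cases, n_cohorts,
                    interval_days=4, delay_days=14):
    """Return (infection_date, size, hospitalization_date) per cohort."""
    delay = timedelta(days=delay_days)
    cohorts = [(start, initial_cases, start + delay)]
    for k in range(1, n_cohorts):
        day = start + timedelta(days=k * interval_days)
        # Individuals infected earlier and not yet hospitalized are still
        # at large, and each one infects exactly one new individual.
        at_large = sum(size for _, size, hospitalized in cohorts
                       if hospitalized > day)
        cohorts.append((day, at_large, day + delay))
    return cohorts


for infected, size, hospitalized in cohort_schedule(date(2020, 1, 25), 3, 4):
    print(infected.isoformat(), size, hospitalized.isoformat())
```

With the first three cases on January 25, 2020, the four cohorts have sizes 3, 3, 6 and 12, matching the dates given in the paragraph above.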
We note here that the actual cumulative number of positive cases (where we assumed they are hospitalized and not in contact with other susceptible individuals) reported on March 29, 2020 was 2470 individuals. Meanwhile, the predicted cumulative number of infectious individuals to be hospitalized on March 29, 2020 is 3063. Furthermore, we predicted that around 19,926 infectious individuals were still at large in society on March 29, 2020. Although the prediction depicted in Fig. 4 is of exponential growth of infectious individuals, we also take note of the following:
1. The above 'avalanche' effect is under the assumption that no remedial action has been taken to halt interaction (except for hospitalizing the infectious individuals, who are then no longer in contact with susceptible individuals).
2. The above 'avalanche' effect is under the assumption that the best-fit SIR curve for the actual infectious count (discrete data) produces the transmission rate β = 0.22, which translates into one infectious individual transmitting the disease to another individual within a 4-day interval.
3. On March 29, 2020, the MOH made an official announcement on the preparation of around 19,200 new beds 54 . This is close to our prediction of 19,926 infectious individuals still at large within society on March 29, 2020.
Even though the SIR model is a numerical simulation, the numbers do provide a highly possible scenario of the level to which COVID-19 infectious cases can surge. This gives us an overall picture of the infectious severity of COVID-19 in Malaysia. As COVID-19 is still an infectious pandemic with some unclear properties, accurate SIR predictions can only be obtained once the outbreak has been successfully contained 16 . These trajectories could serve as a dependable means for the Malaysian government, businesses, and citizens to plan for and mitigate such spikes in infectious cases. This study is believed to serve as one of the initial efforts for in-depth research on questions that revolve around this global pandemic within the Malaysian context. Our study is also a collective effort towards flattening the COVID-19 infectious curve in Malaysia and stopping the spread as per interventions set out by the government, namely the MCO. We note that this preliminary study is part of ongoing wider research, as more intensive studies regarding COVID-19 can be performed. An obvious future research direction for this study is to extend the current SIR model to the SEIR model by including the Exposed (E) compartment in the modelling procedure.
This study, which models data from the early stage of the pandemic in Malaysia, is intended to provide useful insights for the planning of outbreak control measures in the near future and is anticipated to remain relevant in the coming years for curbing the infectious rate. In line with Siam et al. and Johnson 55,56 , this study described the gaps found in public health preparedness, epidemiological characteristics, as well as mental preparedness of the Malaysian citizens to overcome this pandemic. The objectives of this study were to simulate the infectious trend and trajectory of COVID-19 in Malaysia and to determine the approximate number of days required for the trend to decline. The objectives were achieved by using the SIR model and data that were available at the time of writing.
However, the data used were collected over a period of only 110 days. Due to the limited availability of data and information at the time of writing, a standard SIR model was considered for this study. The major components considered in this study were limited to susceptible, infectious and recovered individuals. The exposed category was not considered here. Another limitation is that the model did not take into account age-dependent mixing patterns, mainly due to the short-term nature of this study and the aim of understanding the initial outbreak of the disease in a homogeneous mixing setting. Control strategies were also not considered, as this study was intended to simulate a disease outbreak without any interventions. This study can be extended further by including disease-induced death in the context of an SEIR model. Furthermore, the effects of control strategies such as the movement control order and vaccination can be incorporated into the model for analysis of the disease transmission dynamics.

Methodology
Data source. In this modelling study, we extracted the daily number of confirmed positive (infectious) cases from the official daily statistics of COVID-19 provided on the MOH web portal 57 . The extracted records were then collated as time-series data, beginning from day zero of the COVID-19 outbreak in Malaysia (January 25, 2020). In order to avoid any possibility of bias in using a single data source, we validated our figures with the Kini News Lab COVID-19 tracker 44 , a local website that provides real-time data and information on COVID-19. These data are collected through daily press conference statements by the Director-General of Health, Malaysia, where patients' data are not identifiable and remain anonymous. Hence, ethical approval is not required. Data were collected for a duration of only 110 days from day zero of the outbreak due to the limited data available at the time of writing. The first 67 days were chosen for simulation so as to cover a month from day zero until a month into the second wave, while the following 13 days (80 days in total) were used for the first projection for initial evaluation of the simulation, and finally, the additional 30 days from the first trajectory (110 days in total) were used for final evaluation of the simulation.
In the modelling procedure, we divided the population into three compartments, as follows: susceptible S(t), the number of not yet infectious and disease-free individuals at time t; infectious I(t), the number of confirmed or isolated individuals at time t; and recovered R(t), the number of no longer infectious individuals at time t. We used the standard SIR epidemic model (Fig. 5) to simulate the infectious severity of COVID-19 in Malaysia beginning from the first day of the outbreak.
This model is widely used as a first approach to analyze virus spreading and is reasonably predictive for infectious diseases which are transmitted from human to human, and where recovery confers lasting resistance, such as measles, mumps and rubella 59 .

Infectious trend simulation of COVID
The standard SIR model assumes no births or deaths, i.e. a fixed population, N = S(t) + I(t) + R(t). The primary components of this model are the parameters β, the transmission rate, which controls the rate of spread, and γ, the recovery rate. If the average duration of recovery is denoted D, then the recovery rate is given by γ = 1/D, since an individual experiences one recovery in D days. Individuals move from the susceptible compartment to the infectious compartment at the rate β and from the infectious compartment to the recovered compartment at the rate γ.
Apart from these parameters, another important measure in epidemiology is the basic reproductive number R_0, which estimates the speed at which a disease is capable of spreading in a specific population 60 . R_0 also indicates the number of secondary infections stemming directly from the first case in a susceptible population. When R_0 > 1, one infected individual will, on average, infect more than one person in total. When R_0 = 1, we are right at the threshold between an epidemic and no epidemic. Finally, when R_0 < 1, one infected individual will, on average, infect fewer than one person in total. Thus, the target is to have mechanisms to achieve R_0 < 1. As disclosed by Tan Sri Dr Noor Hisham Abdullah, Director General of Health, Malaysia, on April 10, 2020, Malaysia is approaching R_0 = 1 43 . This significant improvement is due to, among others, the Movement Control Order (MCO), better social distancing etiquette and hygienic practices.
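The threshold at R_0 = 1 can be motivated with a short derivation. It assumes the infectious-compartment equation of the standard SIR model with infection term βSI/N, together with the relation between the basic reproductive number and the two rates (written as N_0 = β/γ in the Results); the exact equation form is an assumption here.

```latex
\[
\frac{dI}{dt} = \frac{\beta S I}{N} - \gamma I
             = \gamma I \left( R_0 \, \frac{S}{N} - 1 \right),
\qquad R_0 = \frac{\beta}{\gamma}.
\]
```

At the start of an outbreak almost everyone is susceptible, so S/N ≈ 1 and the sign of dI/dt is the sign of R_0 − 1: the infectious count grows when R_0 > 1 and shrinks when R_0 < 1.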
The dynamics of the COVID-19 transmission can be described using the following nonlinear ordinary differential equations (ODEs) of the standard SIR model:

$$\frac{dS}{dt} = -\frac{\beta S I}{N}, \qquad \frac{dI}{dt} = \frac{\beta S I}{N} - \gamma I, \qquad \frac{dR}{dt} = \gamma I.$$

The differential equations were numerically solved in the R software environment (version 3.6.3) with the Runge-Kutta (RK4) method via the package deSolve (version 1.28) 42 . On the first day of the outbreak, there were three infectious cases reported with no recovered cases. To initialize each compartment, we use N = S(t) + I(t) + R(t) to obtain S(0) = N − I(0) − R(0), from which it follows that S(0) = 32,731,000 − 3 − 0 = 32,730,997 with I(0) = 3 and R(0) = 0. Although it may seem natural to work with the exact counts themselves, the more apt way to obtain the initial conditions is to use a fractional representation by dividing by N. This approach is reasonable for numerical simulations performed at the early stage of a pandemic since it avoids discrepancies that may be caused by the large population count alongside very small counts in the infectious and recovered compartments 61 .
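The fractional initial conditions described above can be sketched as follows. The function name is hypothetical; the population and case counts are the ones quoted in the text.

```python
def fractional_initial_conditions(population, infectious, recovered=0):
    """Return (s0, i0, r0) as fractions of the total population."""
    susceptible = population - infectious - recovered
    return (
        susceptible / population,
        infectious / population,
        recovered / population,
    )


# Day zero in Malaysia: N = 32,731,000 with three infectious, none recovered.
s0, i0, r0 = fractional_initial_conditions(32_731_000, 3)
print(s0, i0, r0)
```

Working with fractions keeps all three compartments on the same scale, between 0 and 1, regardless of the population size.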