key: cord-0792397-9ysefesi
authors: Arifin, W. N.; Chan, W. H.; Amaran, S.; Musa, K. I.
title: A Susceptible-Infected-Removed (SIR) model of COVID-19 epidemic trend in Malaysia under Movement Control Order (MCO) using a data fitting approach
date: 2020-05-05
journal: nan
DOI: 10.1101/2020.05.01.20084384
sha: 894cb8cb1d6f14b92b01791d113b75458c496cf2
doc_id: 792397
cord_uid: 9ysefesi

Background: In this work, we presented a Susceptible-Infected-Removed (SIR) epidemiological model of the COVID-19 epidemic in Malaysia pre- and post-Movement Control Order (MCO). The proposed SIR model was fitted to confirmed COVID-19 cases from the official press statements to closely reflect the observed epidemic trend in Malaysia. The proposed model aims to provide accurate predictive information for decision makers assessing the public health and social measures related to the COVID-19 epidemic.

Methods: The SIR model was fitted to the data by minimizing a weighted loss function: the sum of the residual sums of squares (RSS) of infected, removed and total cases. Optimized values of the beta (β) and gamma (γ) parameters and the starting number of susceptible individuals (N) were obtained.

Results: The post-MCO SIR model indicates a peak of infection on 10 April 2020, fewer than 100 active cases by 8 July 2020, fewer than 10 active cases by 29 August 2020, and close to zero daily new cases by 22 July 2020, with a total of 6562 infected cases. In the absence of the MCO, the model predicts a peak of infection on 1 May 2020, fewer than 100 active cases by 14 February 2021, fewer than 10 active cases by 26 April 2021 and close to zero daily new cases by 6 October 2020, with a total of 1.6 million infected cases.

Conclusion: The results suggest that the present MCO has significantly reduced the size of the susceptible population and the total number of infected cases. The method used to fit the SIR model in this study was found to be accurate in reflecting the observed data.
The method can be used to predict the epidemic trend of COVID-19 in other countries.

The novel coronavirus disease caused by SARS-CoV-2 was declared a pandemic on 11 March 2020 by the World Health Organization, less than 3 months after the epidemic started in Wuhan, China. As of 29 April 2020, the number of cases had climbed above 3 million, with a death toll of over 200,000 worldwide [1, 2]. In response to COVID-19, many countries implemented large-scale public health and social measures (PHSM), including movement restrictions, closure of schools and businesses, geographical area quarantine, and international travel restrictions. These measures are sometimes referred to as "lockdown" or "cordon sanitaire".

Similarly, in Malaysia a large-scale PHSM called the Movement Control Order (MCO) was announced for the whole country under the Prevention and Control of Infectious Diseases Act 1988 (Act 342) on 16 March 2020. Following the MCO, all gatherings, whether for religious, sports, recreational, social or cultural purposes, are banned. All universities, schools and non-essential sectors are closed. Journeys from one place to another within any infected local area are not allowed except for the following purposes: to perform any official duty; to make a journey to and from any premises providing essential services, non-essential services or food supply; to purchase, supply or deliver food or daily necessities; to seek [...]

A PHSM involving the whole country continuously for almost 8 weeks is not without a burden. It has major social consequences and economic costs. At the moment, the decision to implement the MCO is acceptable to many in view of the disease casualties, but the question is how long the MCO should remain in place. The Malaysian authorities need to assess and balance the benefits and potential harms of adjusting the MCO duration, so as not to trigger a resurgence of COVID-19 cases and jeopardize the health of the population.
In this study, we presented a Susceptible-Infected-Removed (SIR) epidemiological model of the COVID-19 epidemic in Malaysia prior to and following the MCO. To make sure the model closely reflects the observed epidemic trend, the proposed SIR model was fitted to confirmed COVID-19 cases from the official press statements by the Director General of Health, Malaysia, by optimizing a chosen loss function. The proposed model aims to provide accurate predictive information for assessing the public health and social measures for the COVID-19 epidemic and, in due course, for making timely future plans.

This study utilized the COVID-19 data set from https://wnarifin.github.io/covid-19malaysia/, which was updated daily by WNA until the date of writing. The data set contains [...]

The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license. This version posted May 5, 2020.

One of the mathematical models that describe the population-level dynamics of infectious diseases is the classical SIR compartmental disease model, which is based on the work of Kermack and McKendrick in the 1920s and 1930s [4]. Conceptually, the SIR model is presented schematically in Figure 1, where S = susceptible individuals, I = infectious individuals (active cases), R = removed individuals (recoveries and deaths), and N = S(t) + I(t) + R(t). 1/γ is the time until recovery and β is the transmission rate (contact rate × probability of transmission given contact). The basic reproductive number, R0, is defined as the "average number of secondary cases arising from a typical primary case in an entirely susceptible population" [4].
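For reference, the compartments defined above correspond to the standard SIR system of ordinary differential equations. The equations themselves are not present in this text, so the frequency-dependent form below (transmission term divided by N) is stated as an assumption rather than as the authors' exact formulation:

```latex
\begin{aligned}
\frac{dS}{dt} &= -\frac{\beta S I}{N}, \\
\frac{dI}{dt} &= \frac{\beta S I}{N} - \gamma I, \\
\frac{dR}{dt} &= \gamma I .
\end{aligned}
```

The three right-hand sides sum to zero, which is consistent with N = S(t) + I(t) + R(t) staying constant over each prediction period.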
The R0 is given as: R0 = β/γ.

To fit the model to the data, a weighted loss function was defined, taking the sum of the residual sums of squares (RSS) for Y = I, R and total cases (I + R). The model was fitted to the data using the methods described in [6, 7] in the R software environment [8], utilizing the deSolve R package [9]. For the optimization procedure, the [...] Step = MaxStep = 7 was found to be sufficient.

    Proportions ← sequence between P_start and P_end with step size P_step
    ...
    OptimumLocation ← index of the output with minimum loss in OptimumOutputs
    print N(OptimumLocation)  // for every step, to see progress

Once the optimal β, γ and N values that minimized the loss function were found, projected data were obtained from the SIR model using the optimized parameters. The SIR model was then plotted from these data using the ggplot2 R package [12]. The goodness of fit and error rate of the fitted model were reported in the form of R² and mean absolute percentage error (MAPE).

Figure 2 includes additional curves, assuming 5% (Beijing) and 50% (Iceland) asymptomatic cases [13].
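The fitting procedure described in the Methods can be sketched as follows. This is an illustrative Python sketch, not the authors' R/deSolve code. Several pieces are assumptions made for the example: equal weights in the loss, a fixed-step Euler integrator, an exhaustive grid search in place of the authors' optimizer, R² computed as the coefficient of determination, and synthetic observations generated from known parameters. All function names are hypothetical.

```python
import itertools

def simulate_sir(beta, gamma, n, i_init, r_init, days, substeps=10):
    """Integrate the SIR equations with a fixed-step Euler scheme.

    Returns two lists (I, R) with one value per day, starting at day 0.
    """
    s, i, r = n - i_init - r_init, float(i_init), float(r_init)
    dt = 1.0 / substeps
    infected, removed = [i], [r]
    for _ in range(days - 1):
        for _ in range(substeps):
            new_inf = beta * s * i / n * dt   # S -> I flow
            new_rem = gamma * i * dt          # I -> R flow
            s, i, r = s - new_inf, i + new_inf - new_rem, r + new_rem
        infected.append(i)
        removed.append(r)
    return infected, removed

def rss(obs, pred):
    """Residual sum of squares."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred))

def loss(params, n, obs_i, obs_r, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of RSS for I, R and total cases (weights are assumed)."""
    beta, gamma = params
    pred_i, pred_r = simulate_sir(beta, gamma, n, obs_i[0], obs_r[0], len(obs_i))
    obs_t = [a + b for a, b in zip(obs_i, obs_r)]
    pred_t = [a + b for a, b in zip(pred_i, pred_r)]
    w_i, w_r, w_t = weights
    return (w_i * rss(obs_i, pred_i) + w_r * rss(obs_r, pred_r)
            + w_t * rss(obs_t, pred_t))

def grid_search(n, obs_i, obs_r, betas, gammas):
    """Best (beta, gamma) on the grid for a fixed susceptible population n."""
    return min(itertools.product(betas, gammas),
               key=lambda p: loss(p, n, obs_i, obs_r))

def fit(obs_i, obs_r, populations, betas, gammas):
    """Return (loss, N, beta, gamma) with the smallest loss over all candidates."""
    candidates = []
    for n in populations:
        beta, gamma = grid_search(n, obs_i, obs_r, betas, gammas)
        candidates.append((loss((beta, gamma), n, obs_i, obs_r), n, beta, gamma))
    return min(candidates)

def r_squared(obs, pred):
    """Coefficient of determination (one common definition of R^2)."""
    mean = sum(obs) / len(obs)
    ss_tot = sum((o - mean) ** 2 for o in obs)
    return 1.0 - rss(obs, pred) / ss_tot

def mape(obs, pred):
    """Mean absolute percentage error; zero observations are skipped."""
    pairs = [(o, p) for o, p in zip(obs, pred) if o != 0]
    return 100.0 * sum(abs((o - p) / o) for o, p in pairs) / len(pairs)

# Synthetic "observations" generated from known parameters (not real data).
obs_i, obs_r = simulate_sir(0.30, 0.10, 10_000, i_init=10, r_init=2, days=40)

betas = [round(0.05 * k, 2) for k in range(1, 11)]    # 0.05 .. 0.50
gammas = [round(0.02 * k, 2) for k in range(1, 11)]   # 0.02 .. 0.20
populations = [5_000, 10_000, 20_000]                 # candidate values of N

best_loss, best_n, best_beta, best_gamma = fit(obs_i, obs_r, populations,
                                               betas, gammas)
pred_i, pred_r = simulate_sir(best_beta, best_gamma, best_n,
                              obs_i[0], obs_r[0], len(obs_i))
print(best_n, best_beta, best_gamma, best_beta / best_gamma)
print(r_squared(obs_i, pred_i), mape(obs_r, pred_r))
```

Because the synthetic data are produced by the same simulator and the true parameters lie on the grid, the search recovers them exactly. With real case counts the minimum loss would be positive, and a continuous optimizer would normally replace the grid.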
Figure 3 includes additional curves, assuming 5% (Beijing) and 50% (Iceland) asymptomatic cases [13].

For the pre-MCO period, MaxStep was set to 1, which means the loop ran only once. Here, we limited the algorithm to search for the optimal parameters between 5% and 100% of the initial susceptible population so as to reflect the absence of the MCO. In addition, this was because whenever the data do not contain the peak of active cases, the algorithm is unable to find the optimal N.

Figure 4: SIR model prediction for the pre-MCO period, assuming 5% and 50% asymptomatic cases.

The SIR model following the MCO indicates that the infection already peaked on 10 April 2020, after which the number of active cases started to decline. There will be fewer than 100 active cases by 8 July 2020, fewer than 10 active cases by 29 August 2020, and close to zero daily new cases (predicted new cases < 0.5) by 22 July 2020. The total number of infected cases was predicted to be between 6560 and 13895.
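The milestones reported above (peak date, first day with fewer than 100 and fewer than 10 active cases, and first day with predicted new cases below 0.5) can be read off a projected curve as in the sketch below. The thresholds come from the text. Searching only from the peak onward, the helper names, the start date and the toy numbers are assumptions for illustration, not model output.

```python
from datetime import date, timedelta

def first_day_below(series, threshold, start=0):
    """Index of the first day at or after `start` where series < threshold."""
    for day in range(start, len(series)):
        if series[day] < threshold:
            return day
    return None

def milestones(active, removed, start_date, new_case_cutoff=0.5):
    """Map each milestone name to a calendar date (or None if never reached)."""
    peak = max(range(len(active)), key=lambda d: active[d])
    total = [i + r for i, r in zip(active, removed)]
    # Daily new cases are the day-to-day increase in cumulative cases (I + R).
    new_cases = [0.0] + [b - a for a, b in zip(total, total[1:])]
    days = {
        "peak": peak,
        "active_below_100": first_day_below(active, 100, start=peak),
        "active_below_10": first_day_below(active, 10, start=peak),
        "new_cases_near_zero": first_day_below(new_cases, new_case_cutoff,
                                               start=peak),
    }
    return {name: (start_date + timedelta(days=d) if d is not None else None)
            for name, d in days.items()}

# Toy projection (hypothetical numbers, not the fitted Malaysian curves).
active = [50, 200, 400, 300, 150, 80, 40, 9, 3]
removed = [0, 10, 60, 200, 360, 431, 471.2, 502.3, 508.4]

result = milestones(active, removed, start_date=date(2020, 1, 1))
for name, when in result.items():
    print(name, when)
```

Starting each search at the peak avoids counting the early days of the outbreak, when active cases are also below the thresholds.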
In contrast, if the MCO had not been implemented, the infection would peak on 1 May 2020, with fewer than 100 active cases by 14 February 2021, fewer than 10 active cases by 26 April 2021 and close to zero daily new cases by 6 October 2020, and with a larger total of between 1.6 million and 3.3 million infected cases.

The fitted models for both periods maintained high goodness of fit, indicated by R² values of 85.2% to 98.7% for I and R. The error rates by MAPE were low for the post-MCO period, with MAPE (I) and MAPE (R) of 4.4% and 13.4% respectively. The error rates were higher for the pre-MCO period, with MAPE (I) and MAPE (R) of 33.2% and 15.4% respectively. This is because the data pattern during that period was less well explained by the SIR model, especially for I, and the number of data points was smaller than in the post-MCO period (24 vs 36). It could also be due to the following reasons: a) the Ministry of Health increased the test capacity, b) test results were delayed because laboratory capacity was overwhelmed, and c) the number of tests fluctuated on a daily basis [14, 15]. However, this high error rate was balanced by high R² for both I and R (87.9% and 87.8% respectively).

The R0 values for all presented figures were relatively high compared with other studies [16, 17, 18]. For the post-MCO period, the R0 is close to 3, which runs counter to the common assumption that the R0 should be low once health intervention measures kick in. This is also the case with the pre-MCO R0, which is more than 8. However, these R0 values for both periods were the ones that best explained the observed data.
We propose that this observation might reflect the real infectiveness in the susceptible population, which is the fuel of the epidemic. Although the infectiveness of COVID-19 remained high following the MCO, the size of the susceptible population was reduced significantly as a result of MCO implementation. At the same time, it was found that the MCO significantly reduced the transmission rate by 44.3%, from β = 0.273 to β = 0.152.

The model was unable to consider imported cases because the size of the susceptible population was assumed fixed at the start of each prediction period. Hence, another approach must be considered whenever imported cases are present. The performance measures provided in the form of R² and MAPE can be viewed as goodness of fit and error measures for the training set, which may not generalize to future observations, and there could be an issue with model overfitting [19]. However, given the limited number of available data points, all of them were used for the model optimization. Given the urgency of the present situation, it is not reasonable to wait for more data points to serve as a test set for estimating the test goodness of fit and error measures. In our opinion, cross-validation is also not reasonable in this situation given the small number of data points. Finally, our model used the available incidence data instead of data based on date of onset [20]. Despite these limitations, we feel that our SIR model, to some extent, reflects the true COVID-19 epidemic trend.

The results showed that the implementation of the MCO from 18 March 2020 until today has significantly reduced COVID-19 transmission in Malaysia.
The MCO effectively restricted the size of the susceptible population, resulting in a smaller number of infected cases. Based on the model, the epidemic curve is expected to flatten in July 2020. The method used in this study to fit the SIR model was shown to be accurate in reflecting the observed data. This method can be used to predict the epidemic trend of COVID-19 in other countries.

References

All states gazetted as Covid-19-hit area, ban on inter-state travelling
An introduction to compartmental modeling for the budding infectious disease modeler
Mathematical Models in Epidemiology
Analysing COVID-19 (2019-nCoV) outbreak data with R - part 1. ncov-outbreak-data-with-r-part-1/#estimating-changes-in-the-effective-reproductionnumber
The basic SIR model in R
R: A language and environment for statistical computing
Solving Differential Equations in R: Package deSolve
Department of Statistics Malaysia. Press release: Demographic statistics fourth quarter
Elegant Graphics for Data Analysis
COVID-19: What proportion are asymptomatic? Available online
The Novel Coronavirus Pneumonia Emergency Response Epidemiology Team. The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) - China. Zhonghua liu Xing Bing xue za zhi
Consequences of delays and imperfect implementation of isolation in epidemic control
The reproductive number of COVID-19 is higher compared to SARS coronavirus
Estimation of the reproductive number of novel coronavirus (COVID-19) and the probable outbreak size on the Diamond Princess cruise ship: A data-driven analysis
Estimating the basic reproduction number
An introduction to statistical learning with applications in R
The CDC Field Epidemiology Manual: Describing Epidemiologic Data

We would like to thank the contributions and support from our research groups: COVID-19