key: cord-0939760-o7krc485
authors: Abebe, T. H.
title: Forecasting the Number of Coronavirus (COVID-19) Cases in Ethiopia Using Exponential Smoothing Times Series Model
date: 2020-06-30
journal: nan
DOI: 10.1101/2020.06.29.20142489
sha: 259ad2798117fb1bcbd7c522fce6653c59176645
doc_id: 939760
cord_uid: o7krc485

The main objective of this study is to forecast COVID-19 cases in Ethiopia using the best-fitted model. Time series data on COVID-19 cases in Ethiopia from March 14, 2020 to June 05, 2020 were used. To this end, an exponential growth model, the single exponential smoothing method, and the double exponential smoothing method were fitted. The forecasting performance of the models was evaluated using the root mean sum of square error. The study showed that the double exponential smoothing method was the most appropriate for forecasting the future number of COVID-19 cases in Ethiopia, as indicated by the lowest root mean sum of square error. The forecasting model shows that the number of coronavirus cases in Ethiopia grows exponentially. The findings would help the concerned stakeholders make the right decisions based on the information given by the forecasts.

The daily data on confirmed cases of coronavirus disease (COVID-19) in Ethiopia were collected from the World Health Organization (WHO) database for the period between 14-03-2020 and 05-06-2020 (5). To develop an appropriate forecast, the data set should be divided into in-sample and out-of-sample portions. Thus, the COVID-19 dataset is divided into a training set (80%), on which the models are trained, and a testing set (20%), which is used to test the performance of the models.

A time series is a sequence of observations on a variable taken at discrete intervals in time. Moreover, the main objective of time series analysis is to forecast the future values or patterns of the series. In this study, we try to forecast the future values of COVID-19 cases in Ethiopia based on the nature of the series.
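The 80%/20% division described above can be sketched as a chronological split, in which the earliest observations form the training set and the latest form the testing set. This is only an illustration: the function name `chronological_split`, the rounding rule at the cut point, and the sample counts are assumptions, not the author's procedure or the WHO data.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered sequence into (train, test) without shuffling.

    The first `train_fraction` of the observations go to the training set;
    the remainder, which are the most recent observations, go to the test set.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(series) * train_fraction)  # assumed rounding rule: truncate
    return series[:cut], series[cut:]


# Hypothetical cumulative counts, for illustration only (not the WHO data).
cases = [1, 1, 3, 5, 6, 9, 12, 16, 21, 29]
train, test = chronological_split(cases)
print(len(train), len(test))
```

Keeping the split in time order matters for forecasting: the test set then plays the role of the "future" that the models have not seen.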
medRxiv preprint doi: https://doi.org/10.1101/2020.06.29.20142489; this version posted June 30, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.

In order to forecast the number of positive COVID-19 cases in Ethiopia, the following models are defined.

The single exponential growth model is one of the best time series models for forecasting data that grow (or decay) exponentially. If the series $Y_t$ is an exponential function of time $t$, then the model is given by:

$$Y_t = \beta_0 e^{\beta_1 t + u_t}$$

where $Y_t$ is the number of COVID-19 confirmed cases, $\beta_0$ and $\beta_1$ are the parameters to be estimated, and $u_t$ is the error term. In order to estimate the exponential growth model using OLS, we should linearize it using a log transformation. Therefore, the linearized trend model is given by:

$$\ln Y_t = \ln \beta_0 + \beta_1 t + u_t$$

The exponential growth model can be used to estimate the parameters as well as to forecast future cases of COVID-19. Since the exponential smoothing methods (single and double) are also appropriate for forecasting data that have an exponential pattern, we can compare the forecasting performance of the exponential growth model with that of the exponential smoothing methods. The single and double exponential smoothing techniques are therefore discussed below.

Exponential smoothing was introduced in the late 1950s. It is a method of smoothing time series data based on the exponential window function, in which exponential functions are used to assign exponentially decreasing weights over time. It is a method of data analysis that uses weights generated from the data according to an optimally chosen smoothing weight. Forecasts produced using exponential smoothing methods are weighted averages of past observations.
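To make the "weighted average" statement concrete, the usual single exponential smoothing recursion can be unrolled. The notation here is assumed for illustration: $S_t$ is the smoothed value, $\alpha \in (0,1)$ is the smoothing parameter, and $S_0$ is an initial value.

```latex
S_t = \alpha Y_t + (1-\alpha) S_{t-1}
    = \alpha \sum_{k=0}^{t-1} (1-\alpha)^{k} \, Y_{t-k} + (1-\alpha)^{t} S_0
```

The weight on $Y_{t-k}$ is $\alpha(1-\alpha)^k$, which shrinks geometrically with the age $k$ of the observation, and the weights together with the factor on $S_0$ sum to one. With $\alpha = 0.5$, for instance, the three most recent observations receive weights 0.5, 0.25, and 0.125.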
These methods give decreasing weights to past observations: the more recent the observation, the higher the associated weight. This framework enables reliable forecasts to be produced quickly in most applications.

The single exponential smoothing method is used when the time series data have no trend and no seasonality. The smoothing function for any time period $t$, as defined by (6), is given by:

$$S_t = \alpha Y_t + (1-\alpha) S_{t-1}$$

where $S_t$ denotes the current smoothed series obtained by applying simple exponential smoothing to the series $Y$, and $\alpha$ is the smoothing parameter. The general form of the single exponential smoothing forecast function is:

$$\hat{Y}_{t+h} = S_t$$

where $h$ is the number of periods in the forecast lead-time and $\hat{Y}_{t+h}$ is the forecast for $h$ periods ahead from origin $t$.

Single exponential smoothing cannot smooth well when there is a trend in the data. Thus, we should use the double exponential smoothing (second-order exponential smoothing) method when the time series data have a trend but no seasonal component. The smoothing functions for any time period $t$ are given by:

$$L_t = \alpha Y_t + (1-\alpha)(L_{t-1} + b_{t-1})$$
$$b_t = \beta (L_t - L_{t-1}) + (1-\beta) b_{t-1}$$

where $L_t$ is the smoothed level of the series, $b_t$ is the smoothed additive trend at the end of period $t$, $\alpha$ is the data smoothing parameter for the level of the series, and $\beta$ is the smoothing parameter for the trend; both parameters range between 0 and 1.

According to (7), forecasting is an important application of time series analysis. However, the forecasting performance of a given model should be evaluated using formal evaluation criteria. Among the common evaluation criteria is the root mean sum of square error (RMSSE). The model with the smallest root mean sum of square error will be used as the best forecasting model.
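The single and double exponential smoothing recursions described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the author's Excel workbook. The function names, the initial values (first observation as the initial level, first difference as the initial trend), and the $h$-step forecast "level plus $h$ times trend" (the standard form for Holt's linear method) are all assumptions.

```python
def single_exponential_smoothing(y, alpha):
    """Smoothed series S_t = alpha*y_t + (1 - alpha)*S_{t-1}.

    Assumed initialization: S_0 = y[0]. The h-step-ahead forecast from the
    end of the sample is simply the last smoothed value.
    """
    smoothed = [y[0]]
    for value in y[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


def holt_forecast(y, alpha, beta, horizon):
    """Double (Holt) exponential smoothing with an additive trend.

    Returns forecasts for 1..horizon periods ahead of the last observation.
    Assumed initialization: level = y[0], trend = y[1] - y[0].
    """
    if len(y) < 2:
        raise ValueError("need at least two observations")
    level, trend = y[0], y[1] - y[0]
    for value in y[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]


# Hypothetical, perfectly linear series used only to show the behavior.
series = [2, 4, 6, 8, 10]
print(single_exponential_smoothing(series, alpha=0.5))  # lags behind the trend
print(holt_forecast(series, alpha=0.5, beta=0.5, horizon=3))  # continues the trend
```

On a trending series the single-smoothing output stays below the latest observations, while the double-smoothing forecast extends the trend. This is the reason given above for preferring the double method when a trend is present.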
The root mean sum of square error is given by:

$$\mathrm{RMSSE} = \sqrt{\frac{1}{T}\sum_{t=1}^{T}\left(\hat{Y}_t - Y_t\right)^2}$$

where $\hat{Y}_t$ is the expected (forecasted) value for period $t$, $Y_t$ is the actual (observed) value for period $t$, and $T$ is the number of periods.

To analyze the dataset, the researcher used Stata 13 for the exponential growth analysis and Microsoft Excel 10 for the single and double exponential smoothing analyses, owing to the simplicity of each package for the respective models. Any differences in figure format are therefore due to the difference in software packages.

In Figure 1, the horizontal axis shows the number of days and the vertical axis shows the total number of COVID-19 cases in Ethiopia. From the figure, we observe that the series grows exponentially. In order to identify an appropriate model for forecasting the number of COVID-19 cases in Ethiopia, the exponential growth model and the single and double exponential smoothing methods are used as candidate models. After fitting the three models, the following results are obtained.

Table 1 summarizes the time series techniques used for analyzing and forecasting the number of confirmed cases in Ethiopia. When we compare the three candidate models, we find that the double exponential smoothing technique is better than the other models, as indicated by the highest coefficient of determination ($R^2$) and F-statistic. Therefore, according to Table 1, the double exponential smoothing technique is the best of the three time series models in terms of goodness-of-fit criteria.
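The exponential growth fit that enters the comparison above can be sketched as an ordinary least squares regression of the logged counts on time. This is a plain-Python illustration, not the author's Stata commands. The function names are hypothetical, the time index is assumed to run over $t = 1, \dots, n$, and $R^2$ is computed on the log scale, which is an assumption about how the reported value was obtained.

```python
import math


def fit_exponential_growth(y):
    """OLS fit of ln(y_t) = ln(beta0) + beta1*t for t = 1..n.

    Returns (beta0, beta1, r_squared), with r_squared measured on the
    log scale. All observations must be strictly positive.
    """
    n = len(y)
    t = list(range(1, n + 1))
    log_y = [math.log(v) for v in y]
    t_mean = sum(t) / n
    log_mean = sum(log_y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (li - log_mean) for ti, li in zip(t, log_y))
    slope = sxy / sxx
    intercept = log_mean - slope * t_mean
    fitted = [intercept + slope * ti for ti in t]
    ss_res = sum((li - fi) ** 2 for li, fi in zip(log_y, fitted))
    ss_tot = sum((li - log_mean) ** 2 for li in log_y)
    return math.exp(intercept), slope, 1 - ss_res / ss_tot


def predict_exponential_growth(beta0, beta1, t):
    """Point forecast beta0 * exp(beta1 * t) on the original scale."""
    return beta0 * math.exp(beta1 * t)


# Synthetic series generated from a known curve, for illustration only.
synthetic = [3 * math.exp(0.2 * t) for t in range(1, 11)]
beta0, beta1, r_squared = fit_exponential_growth(synthetic)
print(round(beta0, 6), round(beta1, 6), round(r_squared, 6))
```

Because the regression is run on logged counts, the fit minimizes relative rather than absolute errors, so its $R^2$ is not directly comparable with an $R^2$ computed on the original scale for the smoothing methods.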
From the figure, we observe that the difference between the actual values and the forecast values under the double exponential smoothing method is relatively smaller than under the exponential growth and simple exponential smoothing methods. Thus, the double exponential smoothing method is a better way to predict the number of COVID-19 cases in Ethiopia over the coming three weeks.

Besides the graphical evaluation of the performance of the fitted models, we can evaluate them using formal statistical methods. The performance of a model depends on how close the predicted values are to the actual values. In this case, the root mean sum of square error (RMSSE) is used to evaluate the performance of the three models (exponential growth, single exponential smoothing, and double exponential smoothing). Among the three models, the double exponential smoothing model was the best-fitted model for analyzing and forecasting the number of COVID-19 cases in Ethiopia, as indicated by its small RMSSE values. Therefore, it is selected as the best model for forecasting future COVID-19 cases. The diagnostic measures for the selection of the best forecasting model are given in Table 2.

Forecasting is the prediction of future values of events using past and present data. However, as suggested by Niels Bohr, making good predictions is not always easy (8). Moreover, most statistical time series methods are more valid for short-term forecasts (days, weeks, months) and medium-term forecasts (one to two years) than for long-term forecasts (more than two years), since most historical data exhibit inertia and do not change dramatically very quickly. From the model diagnostic results in Table 2, the best-fitted model based on the minimum root mean sum of square error (RMSSE) is the double exponential smoothing method.
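The selection rule described above (compute the error measure for each candidate on the same observations and keep the smallest) can be sketched as follows. The function names and the numbers are hypothetical and are not the values in Table 2. The quantity computed is the square root of the mean squared forecast error, which is what the paper calls RMSSE.

```python
import math


def rmsse(actual, forecast):
    """Square root of the mean squared difference between forecast and actual."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    squared_errors = [(f - a) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(actual))


def select_best_model(actual, forecasts_by_model):
    """Return (best_name, scores), where best_name has the smallest RMSSE."""
    scores = {name: rmsse(actual, fc) for name, fc in forecasts_by_model.items()}
    best_name = min(scores, key=scores.get)
    return best_name, scores


# Made-up test-set values and forecasts, for illustration only.
actual = [10, 12, 15]
candidates = {
    "model_a": [9, 12, 16],
    "model_b": [13, 16, 19],
}
best, scores = select_best_model(actual, candidates)
print(best, scores)
```

Computing the measure on the held-out testing set, rather than on the data used for fitting, keeps the comparison from favoring whichever model tracks the training data most closely.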
Therefore, the future COVID-19 cases are forecast using the double exponential smoothing method, as given in Figure 5.

In this study, an exponential growth model and the simple and double exponential smoothing methods were fitted to COVID-19 data for Ethiopia for the period 14-03-2020 to 05-06-2020. The forecasting performance of the models was evaluated using the Root Mean Sum of Square Error (RMSSE). The results show that the double exponential smoothing method is better than the other two models at forecasting future cases of COVID-19 in Ethiopia. According to the forecast, the number of people affected by COVID-19 in Ethiopia will increase exponentially over the next three weeks. Thus, the forecast can help the Ethiopian government, policy makers, and society at large to take preventive measures before transmission gets out of control, especially in rural areas, since until now most cases have been observed in urban areas.

Since Mr. Teshome is the only author of this paper, his contribution covers formulating the research idea as well as collecting, analyzing, and interpreting the data. The author read and approved the final manuscript. The author declares no conflict of interest. This research did not receive any funding.

Time series forecasting of COVID-19 transmission in Canada using LSTM networks
Modeling and Forecasting for the number of cases of the COVID-19 pandemic with the Curve Estimation Models