key: cord-0935601-ssa5rzd5 authors: Koczkodaj, W.W.; Mansournia, M.A.; Pedrycz, W.; Wolny-Dominiak, A.; Zabrodskii, P.F.; Strzałka, D.; Armstrong, T.; Zolfaghari, A.H.; Debski, M.; Mazurek, J. title: 1,000,000 cases of COVID-19 outside of China: The date predicted by a simple heuristic date: 2020-03-23 journal: Glob Epidemiol DOI: 10.1016/j.gloepi.2020.100023 sha: 136a52d8100f85d1baa7976838bfcace2b6c08d7 doc_id: 935601 cord_uid: ssa5rzd5

Abstract. We forecast 1,000,000 COVID-19 cases outside of China by March 30, 2020, based on a heuristic and WHO situation reports. We do not model the COVID-19 pandemic; we model only the number of cases. The proposed heuristic is based on the simple observation that the plot of the given data is well approximated by an exponential curve. The exponential curve is used for forecasting the growth of new cases. It has been tested against the most recent situation report. Its error was 1.29% for the last day added, as predicted from the 57 previous WHO situation reports (the date 18 March 2020).

Keywords: prediction, forecast, pandemic, COVID-19, coronavirus, exponential growth curve parameter, heuristic, epidemiology, extrapolation, abductive reasoning, WHO situation report.

Due to potentially overwhelming numbers of severe COVID-19 patients, medical resources need to be allocated wisely.
With hospital beds and life-saving machinery [...] Abductive reasoning seeks the simplest and most likely explanation for the observations. In our case, the most likely explanation is exponential growth. This process yields a plausible conclusion but may not always positively verify it. Abductive conclusions are heuristics (see [1]) and hence involve uncertainty, which bounded rationality expresses as satisficing. Satisficing is a decision-making process that factors the cost of optimization into the optimization itself, thereby producing an efficient but suboptimal result. This can be compared with maximizing, which produces an optimal result at the expense of suboptimal costs.

Extrapolation is a mathematical estimation that predicts unknown future values from existing values. Compared to interpolation, which determines unknown values between existing values, extrapolation is less accurate. The best method for extrapolation depends on the method used to acquire the data initially.

The WHO situation report #31 (see [7]) has been assumed as the starting data point since it shows, for the first time, over 1,000 cases outside China (see Fig. 1). Because data from any individual country may be biased or misreported for political reasons, we decided to use data from many countries; as such, any doctored data becomes statistically insignificant. In China, where COVID-19 originated, the situation seems to be under control, as Fig. 2 indicates. For this reason, including data about China would distort the results or at least make them difficult to obtain.

Visual inspection suggested exponential growth, but this could not simply be assumed. We therefore used R and its nls function to check it. For more details see [8]. We consider a non-linear model of the form:
$$y_i = f(x_i; a, b) + \varepsilon_i \qquad (1)$$
with an exponential function $f(\cdot)$
of the form:
$$f(x; a, b) = a \exp(b x) \qquad (2)$$
In order to estimate the parameters a and b, we apply the non-linear least squares method, in which the residual sum of squares is minimized, see [8]:
$$RSS(a, b) = \sum_{i} \left( y_i - a \exp(b x_i) \right)^2 \qquad (3)$$
where $y_i$ is the total number of COVID-19 cases outside China and $x_i$ is the corresponding day number. To estimate a and b we use the well-known nls function from R, obtaining a fit with residual standard error $S_u = 1827$. According to these results, we predict 1,000,000 COVID-19 cases outside of China by the WHO situation report day 70/71, which is 31 March / 01 April (see Fig. 3).

The lines of the plot, up to the last day of the WHO situation reports, are: (1) the blue line connecting the WHO data up to 18 March, (2) the red line standing for 1,000,000 cases, (3) the exponential curve computed by R to be as close as possible to the real data up to 18 March. The vertical blue bar (Fig. 3) shows where the WHO data ends and where the predicted results start. For this reason, on the right-hand side of the vertical bar there is only one line, which is the computed exponential curve. Evidently, we do not know for how many days such an exponential curve will remain an acceptable extrapolation; a million cases in 16 days, however, seems highly likely. Such a finding has considerable importance and should not be ignored.

To the best of our knowledge, this may be the first study proposing a heuristic for computing the parameters a and b of the approximating exponential curve a * exp(b * x), with x as the day number, for the COVID-19 situation. The more people know about our finding, the better the chance that they will regard self-care as a major contribution to preventing the spread of COVID-19. Our assumptions do not consider the complexity of a pandemic. In particular, we do not consider flattening of the approximating exponential curve. It is a short-term prediction model only, but it is very simple and we believe it is very accurate.
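The fitting and extrapolation steps described above can be sketched in a few lines. This is a minimal illustration in Python, not the authors' R code: it replaces the call to nls with a hand-written Gauss-Newton iteration started from a log-linear fit, and it runs on synthetic data rather than the WHO counts. The function names (`loglinear_start`, `fit_exponential`, `residual_standard_error`, `day_reaching`) and the synthetic parameters are assumptions made for illustration only. The residual standard error is taken as sqrt(RSS / (n - 2)), which assumes two fitted parameters.

```python
import math

def loglinear_start(xs, ys):
    """Starting guess for (a, b): least squares on log(y) = log(a) + b*x."""
    n = len(xs)
    logs = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxl / sxx
    a = math.exp(mean_l - b * mean_x)
    return a, b

def rss(xs, ys, a, b):
    """Residual sum of squares for the curve a * exp(b * x)."""
    return sum((y - a * math.exp(b * x)) ** 2 for x, y in zip(xs, ys))

def fit_exponential(xs, ys, max_iter=50, tol=1e-10):
    """Minimise the RSS by Gauss-Newton steps with step halving."""
    a, b = loglinear_start(xs, ys)
    for _ in range(max_iter):
        # Accumulate J^T J and J^T r, where J holds df/da and df/db.
        s_aa = s_ab = s_bb = g_a = g_b = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = y - a * e
            da, db = e, a * x * e
            s_aa += da * da
            s_ab += da * db
            s_bb += db * db
            g_a += da * r
            g_b += db * r
        det = s_aa * s_bb - s_ab * s_ab
        if det == 0.0:
            break
        # Solve the 2x2 normal equations for the update step.
        step_a = (s_bb * g_a - s_ab * g_b) / det
        step_b = (s_aa * g_b - s_ab * g_a) / det
        current = rss(xs, ys, a, b)
        scale = 1.0
        while scale > 1e-8:
            trial_a, trial_b = a + scale * step_a, b + scale * step_b
            if rss(xs, ys, trial_a, trial_b) <= current:
                break
            scale /= 2.0
        else:
            break  # no improving step found; keep the current estimate
        a, b = trial_a, trial_b
        if abs(scale * step_a) <= tol * abs(a) and abs(scale * step_b) <= tol * abs(b):
            break
    return a, b

def residual_standard_error(xs, ys, a, b):
    """sqrt(RSS / (n - 2)), assuming two fitted parameters."""
    return math.sqrt(rss(xs, ys, a, b) / (len(xs) - 2))

def day_reaching(target, a, b):
    """Solve a * exp(b * x) = target for x."""
    return math.log(target / a) / b

# Synthetic data only (NOT the WHO counts): a curve with a = 1000, b = 0.2
# and an alternating +/-1% perturbation, on day numbers 0..19.
days = list(range(20))
cases = [1000.0 * math.exp(0.2 * x) * (1.01 if x % 2 else 0.99) for x in days]

a_hat, b_hat = fit_exponential(days, cases)
print("a =", a_hat, "b =", b_hat)
print("residual standard error =", residual_standard_error(days, cases, a_hat, b_hat))
print("day reaching 1,000,000 =", day_reaching(1_000_000, a_hat, b_hat))
```

The crossing day follows directly from equation (2): setting a * exp(b * x) equal to the target and solving gives x = ln(target / a) / b. With real data, `days` would hold the day numbers counted from situation report #31 and `cases` the reported totals outside China.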
By the standards of prediction, a 1.29% error is more than acceptable for short-term predictions. We regard the WHO situation report #31 as the starting data point since it shows over 1,000 cases outside China for the first time. The presented approach is based on a heuristic solution and makes the realistic assumption that the current trend can continue for the next 17 days. Obviously, it is an abstract, mathematical model; the reality may be different and the COVID-19 situation may change in just a few days.

Consideration of the Origin of Herbert Simon's Theory of Satisficing (1933-1947)
COVID-19: What is next for public health? The Lancet
Care for Critically Ill Patients with COVID-19
Improving the Medical Scale Predictability by the Pairwise Comparisons Method: Evidence from a Clinical Data Study
Approaches to abductive reasoning: an overview
An Introduction to R, Notes on R: A Programming Environment for Data Analysis and Graphics
Asymptotic Theory of Nonlinear Least Squares Estimation

There is no conflict of interest and this study has been conducted pro bono publico.