key: cord-0930564-bo9omynv authors: Bhattacharyya, Arinjita; Chakraborty, Tanujit; Rai, Shesh N. title: Stochastic forecasting of COVID-19 daily new cases across countries with a novel hybrid time series model date: 2022-01-13 journal: Nonlinear Dyn DOI: 10.1007/s11071-021-07099-3 sha: fc59d14e1d9835b73559d6747bbc7b3fd8812a2b doc_id: 930564 cord_uid: bo9omynv An unprecedented outbreak of the novel coronavirus (COVID-19) in the form of peculiar pneumonia has spread globally since its first case in Wuhan province, China, in December 2019. Soon after, the infected cases and mortality increased rapidly. The future of the pandemic’s progress was uncertain, and thus, predicting it became crucial for public health researchers. These predictions help the effective allocation of health-care resources, stockpiling, and help in strategic planning for clinicians, government authorities, and public health policymakers after understanding the extent of the effect. The main objective of this paper is to develop a hybrid forecasting model that can generate real-time out-of-sample forecasts of COVID-19 outbreaks for five profoundly affected countries, namely the USA, Brazil, India, the UK, and Canada. A novel hybrid approach based on the Theta method and autoregressive neural network (ARNN) model, named Theta-ARNN (TARNN) model, is developed. Daily new cases of COVID-19 are nonlinear, non-stationary, and volatile; thus, a single specific model cannot be ideal for future prediction of the pandemic. However, the newly introduced hybrid forecasting model with an acceptable prediction error rate can help healthcare and government for effective planning and resource allocation. The proposed method outperforms traditional univariate and hybrid forecasting models for the test datasets on an average. In December 2019, clusters of pneumonia cases caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-Cov-2) were identified at Wuhan, Hubei province, in China [1] [2] [3] . Soon after the emer-gence of the novel beta coronavirus, World Health Organization (WHO) characterized this contagious disease as a pandemic in March 2020 due to its rapid spread within and outside the highly mobile population of China which got further aggravated by the densely populated location of the sea food market and time of Chinese new year [4] . It resulted in an exponential increase in the incidence rate (IR) and case fatality rate (CFR) [5] . As of March 16, 2021, a total of 120,512,041 confirmed global cases and 2,665,742 deaths have been reported worldwide [6] . Researchers have faced significant challenges to forecast real-time COVID-19 cases with traditional mathematical, statistical, and machine learning-based forecasting tools [7] [8] [9] [10] [11] [12] [13] . Studies in March 2020 with simple and efficient forecasting methods such as exponential smoothing model predicted cases ten days ahead, with large confidence intervals, that despite the positive bias, had reasonable forecast error [14] . Previously used linear and exponential model forecasts for better preparation regarding hospital beds, ICU admission estimation, resource allocation, emergency funding, and proposing strong containment measures were conducted [15] that projected ICU admissions in Italy for March 20, 2020 . ICU admission and mechanical ventilation for critically ill patients reached their peak shattering the health system of Lombardy, Italy, by end of March 2020 [16] . Health-care workers had to go through the immense mental stress left with a formidable choice of prioritizing young and healthy adults over the elderly for allocation of life support, especially unwanted ignoring of those who are extremely unlikely to survive [17, 18] . Real estimates of mortality with 14-day delay demonstrated underestimating the COVID-19 outbreak and indicated a grave future with a global CFR of 5.7% [19] . Some of the countries are eventually in a second wave with cases rising in India, and the UK. The uncertain future has several unanswered questions on the daily trajectory of the pandemic, infection rate, and number of deaths. Thus, real-time forecasting with time series models could respond to these debatable quires to reach a statistically validated conjecture in this current health crisis. Some of the impacting leading-edge research concerning real-time projections of COVID-19 confirmed cases, recovered cases, and mortality using statistical, epidemiological, and machine learning models are as follows. Short-term forecasting of cumulative confirmed cases was produced using autoregressive inte-grated moving average (ARIMA), Cubist regression, random forest, ridge, support vector regression (SVR), and stacking-ensemble learning (SEL) model in [20] , whereas exponential smoothing model produced ten days ahead forecasts of actual cases within 90% CI and forecasts reflect the significant increase in the trend of global cases with growing uncertainty [14] . Forecasting and nowcasting domestic and international COVID-19 cases are also done with epidemiological compartmental models, like susceptible-exposed-infectiousrecovered (SEIR) and SIR models [8, 9, 21] . The prediction and modeling of cumulative COVID-19 deaths in Mexico have utilized ANN and a Gompertz model with multiple optimization algorithms and emphasized on the performance of ANN with different parameter settings [22] . Parametric or data-driven modeling approaches for the virus infection and propagation and machine learning methods for predicting the virus spreading dynamics are previously used in the literature [23, 24] . Modern machine learning techniques such as neural networks and black-box deep learning models were also used for COVID-19 forecasting [25, 26] . Dynamics and control of COVID-19 pandemic with nonlinear incidence rates and building nonlinear models for COVID-19 forecasting have been found useful for European countries [27, 28] . There is a huge literature on mathematical modeling in epidemiology starting from the early work of [29] , going through the compartmental models of [30] laying down the foundations of modern epidemiology, up to the current data-driven approaches (see, e.g., [31] [32] [33] addressing COVID-19). However, forecasting COVID-19 cases is harder and this is primarily due to the following major factors [34] : (a) Very less amount of data is available; (b) less understanding of the factors that contribute to it; and (c) model accuracy is constrained by our knowledge of the virus. With an emerging disease such as COVID-19, many biologic features of transmission are hard to measure and remain unknown; (d) another source of uncertainty, affecting all models, is that we do not know how many people are, or have been, infected; and (e) we are certainly missing a substantial number of cases due to virologic testing, so models fitted to confirmed cases are likely to be highly uncertain [35] . Time-series forecasting models take the input of historical observations and extrapolate the patterns into the future. The past in no way will continue in the future for this pandemic, so it is challenging to model and produce predictions. There are essentially two general approaches to forecasting a time series: (a) generating forecasts from a single model and (b) combining forecasts from many models (hybrid). In classical time series forecasting, the ARIMA model is used predominantly for forecasting linear time series [36] , which has a significant strong assumption of linearity in the system and homoscedastic error distribution, typically without sudden jumps and bursts. Individual models such as ARIMA, wavelet ARIMA (WARIMA) [37] , and Theta method [38, 39] are inadequate to model such situations. Usage of nonlinear time series techniques in infectious disease modeling has successfully demonstrated the success of artificial neural networks (ANNs) and autoregressive neural networks (ARNNs) [40] [41] [42] [43] . The idea of hybridizing forecasting models is not a new concept, and their empirical applications resulted in superior forecasts to their counterparts. Most recently, some hybrid models combining linear and nonlinear time series models are proposed for COVID-19 [12, 44] and performed exceptionally well for predicting these epidemics. Motivated by this, this study considers the time series datasets of coronavirus confirmed cases that show nonlinearity, non-stationary, and non-Gaussian patterns, thus making decisions based on a discrete model critical and unreliable. Another difficulty in COVID-19 data from the modeling aspect is the unavailability of sufficient data points, which generates biased predictions and estimates, which can be well maneuvered by neural nets [45] . Most of the relevant studies focused on the outbreak's short-term and longterm forecasts of reported confirmed cases have a broad range of fluctuations, wide confidence intervals, poorly reported data and model specifics, and highly optimistic predictive performance [46] . On the other hand, this paper handles the problem of forecasting of the current pandemic cases by designing a hybrid decisionmaking method based on highly useful Theta forecasting method and machine learning-based ARNN model. Hybridization of two highly accurate and useful models can be regarded as the most handy solution [47] for optimizing forecasting performance, and efficiency with irregular data characteristics of COVID-19 cases [48] . The importance of hybrid methodology with a fusion of linear and nonlinear forecasting models became evident in tackling such dynamic/nonlinear time series due to its in-built time-changing variance, complex autocorrelation structure [12, 49] . With the growing need for advanced parsimonious hybrid forecasting methods accompanied by precise accuracy and accurate forecasts, this paper focuses on providing a plausible solution for COVID-19 forecasting using the newly proposed hybrid decision making system. The main objectives of this study are as follows: 1. To propose a simple and computationally efficient hybrid forecasting model which generates out-ofsample forecasts (two months) for five profoundly affected countries (the USA, Brazil, India, the UK, and Canada) with higher accuracy as compared to state-of-the-art forecasting methods. 2. To prove the model's stationarity and ergodicity properties from a statistical point of view. 3. To discuss the merits and future challenges that need attention while working with the proposal for epidemiological forecasting. We also recommend policy-making decisions and resource allocation based on these forecasts. Then, the ARNN model is employed to capture the nonlinear patterns in the data using residual values obtained from the base Theta model. The proposed TARNN model has easy interpretability, produces robust predictions, and is adaptable to seasonality. Through experimental evaluation, we have shown the excellent performance of the proposed hybrid model for the pandemics forecasting for several datasets. An RShiny application is also built for TARNN which can help practitioners reproducing and updating the forecasts as further data become available. The rest of the paper is organized as follows: Sect. 2 describes the formulation of the proposed hybrid TARNN model. The ergodicity and stationarity of the proposed hybrid model are discussed in Sect. 3. In Sect. 4, we discuss the country-wise COVID-19 confirmed case datasets and experimental results. The discussions about the results and practical implications are given in Sect. 5. Finally, we conclude the paper in Sect. 6. We start by discussing the single forecasting models to be used in the hybridization, followed by the detailed formulation of the proposed hybrid Theta-ARNN model. The 'Theta method' or 'Theta model' is a univariate time series forecasting technique that performs particularly well in M3 forecasting competition and of interest to forecasters [39] . The method decomposes the original data into two or more lines, called theta lines, and extrapolates them using forecasting models. Finally, the predictions are combined to obtain the final forecasts. The theta lines can be estimated by simply modifying the 'curvatures' of the original time series. This change is obtained from a coefficient, called θ coefficient, which is directly applied to the second differences of the time series: where . . , Y n } denote the observed univariate time series. In practice, the coefficient θ can be considered as a transformation parameter which creates a series of the same mean and slope with that of the original data but having different variances. Now, Eq. (1) is a second-order difference equation and has solutions of the following form [38] : where a θ and b θ are constants and t = 1, 2, . . . , n. Thus, Y new (θ ) is equivalent to a linear function of Y t with a linear trend added. The values of a θ and b θ are computed by minimizing the sum of squared differences: Forecasts from the Theta model are obtained by a weighted average of forecasts of Y new (θ ) for different values of θ . Furthermore, the prediction intervals and likelihood-based estimation of the parameters can be obtained based on a state space model which is demonstrated in [38] . The generalized version of the Theta method is suitable for automatic forecasting of time series [50] . Artificial neural network-based forecasting methods received increasing interest in various applied domains in the late 1990s. A wide variety of neural nets are popularly used for supervised classification, prediction, and nonlinear time series forecasting [51] . The architecture of a simple feedforward neural network can be described as a network of neurons arranged in an input layer, hidden layer, and output layer in a prescribed order. Each layer passes the information to the next layer using weights that are obtained using a learning algorithm [52] . ARNN model is a modification to the simple ANN model especially designed for prediction problems of time series datasets [52] . ARNN model uses a pre-specified number of lagged values of the time series as inputs, and the number of hidden neurons in its architecture is also fixed [53] . ARNN( p, k) model uses p lagged inputs of the time series data in a one hidden layered feedforward neural network with k hidden units in the hidden layer. Let x denote a p-lagged inputs and f be a neural network of the following architecture: where c 0 , a j , w j are connecting weights, b j is pdimensional weight vector, and φ is a bounded nonlinear sigmoidal function (e.g., logistic squasher function or tangent hyperbolic activation function). These weights are trained using a gradient descent backpropagation [54] . Standard ANN faces the dilemma to choose the number of hidden neurons in the hidden layer, and the optimal choice is unknown. However, for ARNN model, we adopt the formula k = [( p + 1)/2] for nonseasonal time series data where p is the number of lagged inputs in an autoregressive model [53] . In this section, we describe the proposed hybrid model based on Theta method and ARNN model and we name it TARNN model. The proposed TARNN model is based on an error remodeling approach, and there are broadly two types of error calculations popular in the literature which are given below [55] . In the additive error model, the forecaster treats the expert's estimate as a variable,Ŷ t , and thinks of it as the sum of two terms: where Y t is the true value and e t is the additive error term. In the multiplicative error model, the forecaster treats the expert's estimateŶ t as the product of two terms: where Y t is the true value and e t is the multiplicative error term. Now, even if the relationship is of product type, in the log-log scale, it becomes additive. Hence, without loss of generality, we may assume the relationship to be additive and expect the errors (additive) of a forecasting model to be random shocks. However, this is violated when there are complex correlation structures in the time series data and less amount of knowledge is available about the data generating process. A simple example is the daily confirmed cases of the COVID-19 cases for various countries where very little is known about the structural properties of the current pandemic. Thus, we need two-stage modeling approach to formulate this complex time series problem. The proposed TARNN model is a hybrid model based on the additive error remodeling approach. The hybrid TARNN approach consists of three basic steps: -In the first step of the TARNN model, the Theta method is applied to the time series data to model the linear components of a given time series dataset. -Theta model generates in-sample forecasts, and the error series is calculated. -In the next phase, the residuals (additive errors) generated by the Theta method are remodeled using a nonlinear ARNN model. Finally, both the forecasts obtained from the Theta and ARNN models are combined together to get the final forecasts for the given time series. The mathematical formulation of the proposed hybrid TARNN model (Z t ) is as follows: where L t is the linear part and N t is the nonlinear part of the hybrid model. We can estimate both L t and N t from the available time series data. LetL t be the forecast value of the Theta model at time t and t represent the error residuals at time t, obtained from the Theta model. Then, we write These left-out residuals are further modeled by ARNN model and can be represented as follows: where f is a nonlinear function and the modeling is done by the ARNN model as defined in Eq. (4) and ε t is supposed to be random shocks. Therefore, the combined forecast can be obtained as follows: whereN t is the forecasted value of the ARNN model. An overall flow diagram of the proposed TARNN model is given in Fig. 1 . In the proposed TARNN model, ARNN is applied to remodel the left-over autocorrelations in the residuals which Theta model individually could not model. Thus, the proposed TARNN model can be considered as an error remodeling approach. This is important because due to model mis-specification and disturbances in the pandemic time series, the linear Theta model may fail to generate white noise behavior for the forecast residuals. The TARNN approach eventually can improve the predictions for the epidemiological forecasting problem as shown in Sect. 4 . The idea of additive error modeling is useful for modeling complex time series for which achieving random shocks based on individual forecasting models is difficult. More precisely, the TARNN approach is developed for forecasting the COVID-19 confirmed cases for which the data generating process and the various characteristics of the epidemic are still unknown. The proposed TARNN model only assumes that the linear and nonlinear components of the epidemic time series can be separated individually. In this section, we derive the results for the ergodicity and asymptotic stationarity of the proposed TARNN model. The ergodicity and stationarity are of particular importance from a statistician's point of view in time series analysis since for such processes a single realization displays the whole probability of the data generating process. We use several previous results on nonlinear time series and Markov chains to find sufficient conditions for which the overall process is ergodic and stationary [56] [57] [58] [59] . To start, we write the underlying stochastic model of the Theta method using the state space approach. We initialize the model by setting Y 1 = l 1 and then for t = 2, 3, . . . ; and drift term b, let where { t } is assumed to follow Gaussian white noise with mean zero and variance σ 2 and α is the smoothing parameter for the simple exponential smoothing (SES) model. Now, Y t follows a state space model which gives forecasts equivalent to SES with drift [38] . Also, Y t can be written in the following form: The above is an ARIMA(0, 1, 1) process with drift term [38] . The left-out residuals of Theta model are further modeled by ARNN process. We consider the ARNN process generated by the additive noise of the ARIMA(0, 1, 1) process with drift. Let t be the process defined by the stochastic difference equation of the following form: where ε t is an i.i.d. noise process and f (·, Θ) is a feedforward neural network with weight (parameter) vector Θ and inputs t−1 , t−2 , . . . , t− p . The definition of f is given in Eq. (4). We start by defining the following notations: Then, Eq. (6) can be written as follows [56] : with z t , e t ∈ R p . In this section, we show the (strict) stationarity of the state space form, as defined in Eq. (7). The problem of showing {z t } to be stationary is closely related to the ergodicity of the process [60] . A Markov chain {z t } is called geometrically ergodic if there exists a probability measure π on the state space (R p , B) and a constant ρ > 1 such that for each z ∈ R p , B is the Borel σ -algebra on R p and · denotes the total variation norm. If Eq. (8) holds for ρ = 1, then {z t } is called ergodic. The definition for P n (z, A) can be given as the probability that {z n } moves from z to the set A ∈ B in n steps: P n (z, A) = P (z t+n ∈ A|z t = z) . Also, expression for π(A) is as follows: Thus, we call π as the stationary measure and the distribution of z t converges to π if {z t } is ergodic. Then, we say {z t } is asymptotically stationary [61] . To establish the ergodicity of TARNN processes, we need the concept of irreducibility and aperiodicity. A Markov chain {z t } is called irreducible if Hence, it is possible that the Markov chain returns to given sets only at specific time points for an aperiodic Markov chain. For most general time series models, irreducibility and aperiodicity cannot be assumed automatically. However, for a TARNN process, these conditions can be checked. In general, it is sufficient to assume the distribution of the noise process to be an absolutely continuous component with respect to Lebesgue measure and the support of the probability density function (PDF) is sufficiently large. It is clear from the above discussion that if the Markov chain is geometrically ergodic, then its distribution will converge to π and the corresponding time series will be called asymptotically stationary, see also [58] . Lemma 1 states that the state space of the Markov chain cannot be reduced depending on the starting point. (7), and let E|ε t | < ∞ and the PDF of ε t be positive everywhere in R. Then, if f is defined by (4), the Markov chain {z t } is ϕ-irreducible and aperiodic. Proof Since the support of the PDF of ε t is the whole real line, that is, the PDF is positive everywhere in R, then we can say that {z t } is ϕ-irreducible by using [56] . In our case, every non-null p-dimensional hypercube can be reached in p steps with positive probability (and hence every non-null Borel set A). A necessary and sufficient condition for {z t } to be aperiodic is to have a set A and positive integer n such that P n (z, A) > 0 and P n+1 (z, A) > 0 for all z ∈ A [58] . In this case, this is true for all n due to consideration of the unbounded additive noise. The theorem below states the necessary conditions for geometric ergodicity of a Markov chain. This can be obtained using the decomposition technique and ergodicity of stochastic difference equations [56, 59] . (6) and (7), F is a compact set that can be decomposed as F = F h + F d , and the following conditions hold: bounded; (ii) E|ε t | < ∞, and probability distribution function of ε t is positive everywhere in R; then, {z t } is geometrically ergodic. Proof {z t } satisfies the following equation: Let F h be continuous and homogeneous, viz. F h (cz) = cF h (z) for all c > 0, z t ∈ R p , and F d be bounded. Let the origin, O, be a fixed point of F h . It is important to note that ε t satisfies the condition (ii) in Theorem 1. We are going to show the existence of a continuous Lyapunov function, V , in a neighborhood of the origin which will ensure the geometric ergodicity of (7). To start with we let W ⊆ R p , the closure of W by W and its boundary by W . We also let V be defined over the closure of the unit ball. We let where · denotes the Euclidean norm. Also, let G be the maximal connected component of {z : V (z) < p 0 2 } that contains the origin. Then, we have F(G) ⊂ G. Let g(z) = inf{r ≥ 0, z ∈ r G}, z ∈ R p , where r G = {r z, z ∈ G}. Then, g(z) is well defined and it can easily be checked that g has the following properties: 4. There exists > 0, 0 < θ < 1 such that for all z ∈ G, y ∈ R p , we have y − F(z) < ⇒ y ∈ G and g(y) < θ. Now, for Eq. (7), ε t satisfies E|ε t | < ∞. Let A ∈ B and z ∈ R p , we define P(z, A) be the transition probability function as: Thus, it holds that g(y)P(z, dy) Here, |β(z)| < B < ∞ for all z and β(z) → 0 as z → ∞. We let h(z) = g(z) + 1 and r be such that Then, for r 0 = max r, 4(C+1) Using Theorem 4 of [60] , we can conclude that {z t } is geometric ergodic. The next theorem gives the main result for asymptotic stationary of the TARNN model. Theorem 2 Let E|ε t | < ∞ and the PDF of ε t be positive everywhere in R, and {ε t } and {z t } are defined as in (6) and (7), respectively. Then, if f is a nonlinear neural network as defined in (4), then {z t } is geometrically ergodic and {ε t } is asymptotically stationary. Proof The noise process ε t satisfies E|ε t | < ∞ by assumption (e.g., Gaussian noise). It is also important to note that neural network activation functions, more precisely logistics or tan-hyperbolic activation functions, are continuous compact functions and have a bounded range. Thus, {z t } satisfies all the criteria to be geometrically ergodic and using Theorem (1), one can write that for the ARNN process with F h ≡ 0 and F d ≡ F. Thus, the series {ε t } is asymptotically stationary. implies that the memory of TARNN process vanishes exponentially fast. This implies that the simplest version of the proposed model converges to a Wiener process. Five time series COVID-19 datasets are considered for assessing various forecasting models (individual and hybrid). The datasets are mostly nonlinear, nonstationary, and non-Gaussian in nature. We have used root-mean-square error (RMSE), mean absolute error (MAE), and mean absolute scaled error (MASE) [62] to evaluate the predictive performance of the models used in this study. Since the number of data points in all the datasets is limited, advanced deep learning techniques will overfit the datasets [63] . Data are collected from the 'Our World in Data' public repository (Link: https://ourworldindata.org/ coronavirus) on five countries: the USA, Brazil, India, the UK, and Canada that have the leading number of confirmed cases of COVID-19. These are univariate time series from inception-February 26, 2021, of confirmed cases that are analyzed for generating future out-break predictions. We have taken last 60 data points as test points on which the forecasting models are evaluated. A summary of the COVID-19 datasets of confirmed cases and their descriptions is presented in Table 1 . A pictorial view of the training datasets along with auto-correlation function (ACF) and partial ACF (PACF) plots is given in Table 2 . The performance of different forecasting models are evaluated based on RMSE, MAE, and MASE metrics for these five time series [53] : where y i is the actual value,ŷ i is the predicted value, and n denotes the number of data points. MASE is computed using 'mase' function available under the 'Metrics' package in R statistical software. By definition, the lower the value of these performance metrics, the better is the performance of the concerned forecasting model. A schematic diagram is presented in Fig. 2 to give an outline of the models to be used in this section. We start the experimental evaluation for all datasets with the classical ARIMA( p, d, q) using 'forecast' [64] statistical package in the R statistical software [65] . The nonlinear and non-seasonal time series of these countries were modeled with traditional and Daily Italy advanced methods such as ARIMA, Theta, WARIMA [66] , ETS [67, 68] , TBATS, ARNN, and two popularly used hybrid models, namely hybrid ARIMA-ARNN [69] , hybrid ARIMA-WARIMA [12] , and the newly introduced hybrid Theta-ARNN (TARNN) models. For the accuracy of prediction, the data were par-titioned into training and test sets, where the test set consisted of last 60 data points, and the training set included (inception -60+1) data points. Table 3 gives the essential details about the functions and packages required for the implementation of standard individual models. In the proposed TARNN model, linear modeling is conducted with Theta model using 'thetaf' function under the 'forecast' package in R statistical software. Nonlinear modeling with ARNN approach is carried out with 'caret' package using 'nnetar' function in R statistical software. Furthermore, Theta residuals are modeled with ARNN( p, k) model having a pre-defined Box-Cox transformation set λ = 0 to ensure the forecast values to stay positive. The values of p and k are obtained by training the network, which is a datadependent approach as discussed in Sect. 2.2. Further, both the linear and nonlinear forecasts are summed up to obtain the final forecasts. Theta model was fitted to five datasets, namely the USA, Brazil, India, the UK, and Canada. Theta model residuals for these five countries were trained using ARNN (15, 8) , ARNN (21, 11) , ARNN (25, 13) , ARNN (24, 12) , ARNN (12, 6) , and ARNN(9, 5) models with an average of 500 networks for all five datasets, each of which is a 15-8-1, 21-11-1, 25-13-1, 24-12-1, 12-6-1, and 9-5-1 networks with 137, 254, 352, 313, 85, and 56 weights and with 5,807,711, 309,816, 20,735, 883.6, 209,396 , and 59,477 estimated σ 2 , respectively. Finally, the predicted results of both Theta and ARNN models are added together to obtain the estimated forecasts of the proposed TARNN model. Rest of the hybrid models were compared in the similar fashion. We compared our proposed TARNN model with traditional single models (ARIMA, TBATS, Theta, ARNN, WARIMA, ETS) along with hybrid ARIMA-WARIMA model and hybrid ARIMA-ARNN model, and the experimental results are reported in Table 4 . The performance of the proposed hybrid Theta-ARNN (TARNN) model is superior compared to all traditional individual and hybrid models on average. The theoretically proven asymptotic stationarity of the proposed hybrid model also suggests that the model cannot have a growing variance over time. The consistency and adequacy in the experimental results empirically approve the same. Thus, the efficacy of the proposed methodology of the proposed hybrid model is experimentally validated. Nevertheless, no model can have dominant advantage and this is related to no free lunch theorem in statistical learning [63] . Forecast of COVID-19 cases for March 20-29, 2021 , is generated using proposed TARNN model and shown in Fig. 3 . We now provide a brief discussion on some time series characteristics that will guide a potential modeler to opt for their hybrid scheme. A time series is usually classified into discrete or continuous, deterministic or stochastic, stationary or non-stationary, and linear or nonlinear. In real-life epidemiological forecasting problems, it is often challenging to determine whether a time series under study is generated from a linear or nonlinear underlying process. Since all the available pandemic datasets are complex and often contain both linear and nonlinear patterns, a single model may be insufficient to meet the whole data characteristic adequately. Thus, a hybrid model combining both the linear and nonlinear components is indeed useful. The basic assumption in the hybrid methodology is the additive relationship between the linear and nonlinear components of the time series. Thus, the proposed hybrid decision-making system is best suited for both stationary or non-stationary time series and situations when the data exhibit both linear and nonlinear patterns. In the development of the hybrid methodology, we need the component model to be sub-optimal, and it will be useful to combine individual forecasts based on different information sets to produce superior forecasts. Based on our experience on COVID-19 forecasting in this work, we comment that our proposal is best suited in situations where the datasets will have enough nonlinearity and non-stationarity. The experimental finding also suggests that the presence of the concept shift in time series leads to inconsistency in the performance among different time series models (see, for example, all the hybrid models including our proposed one fail to capture the concept shift in the US data as reported in Table 4 ). Even in Fig. 3 , the out-of-sample forecasts and actual values are worse for the UK data, whereas for other countries, we could capture the data irregularities quite well. Experimental finding also suggests the presence of the concept shift in time series data which leads to inconsistency in the performance among different time series models for a few datasets. Many parameters associated with COVID-19 transmission are still poorly understood. The resulting model uncertainty is not always calculated or reported in a standardized way. Once we can incorporate these variables, we can improve our estimates and update the TARNN model accordingly. Since purely statistical approaches do not account for how transmission occurs, they are generally not well suited for long-term predictions about epidemiological dynamics (such as when the peak will occur and whether a resurgence will happen) or for inferences about intervention efficacy. Most forecasting models therefore limit their projections to one week or a few weeks ahead. Moreover, the problem of using confirmed cases to fit models is further complicated by the fact that the fraction of cases that are confirmed is spatially heterogeneous and time-varying. Amid enormous uncertainty about the future of the COVID-19 pandemic, the proposed TARNN model yields quantitative projections that policymakers may need in the short term to allocate resources or plan interventions. Guided by the short-term forecasts reported in this paper, the lockdown period can be adjusted accordingly and vaccination project can be enhanced. Since we presented a real-time forecast system unlike an ex post analysis, thus one can regularly update the actual confirmed cases and update the out-of-sample forecasts, just like it happens in temperature forecasting. To conclude, this model can further be easily extended for similar nonlinear and non-Gaussian forecasting problems arising 99in other applied domains. In this study, we proposed a novel hybrid Theta-ARNN (TARNN) model using residual modeling approach that performs considerably well for confirmed cases of COVID-19 forecasting for the countries that include the ones with the highest number of cases: the USA, followed by Brazil, India, the UK, and Canada. The proposed TARNN model filters linearity using the Theta model and can better explain the linear, nonlinear, and non-stationary tendencies present in the selected COVID-19 datasets compared to the traditional single and hybrid models. It also yields better forecast accuracy than various traditional single and hybrid models for three out of five countries. The proposal will be useful in decision and policy making for government officials and policymakers to allocate adequate healthcare resources for the coming days in responding to the crisis. Time series of epidemics can oscillate heavily due to various epidemiological factors, and these fluctuations are challenging to be captured adequately for precise forecasting. This newly developed model can still predict with better accuracy provided the condi-tions of asymptotic stationarity of the hybrid model are satisfied. This method can be used to update real-time forecasts as more data become available. The study covering multiple countries can be utilized without geographical borders and reflects the impact of social distancing, wearing masks, lock down, shutdown, quarantine, and sanitizing proper measures implemented by authorities. Prevalent techniques in the literature were unable to completely capture the nonlinear behavior of stochastic time series containing inherent random shock components. This new method has significant theoretical (established ergodicity and stationarity of the proposed TARNN process) as well as practical implications. Authorities and health care can modify their planning in stockpiles and hospital beds depending on these forecasts of the COVID-19 pandemic. Clinical features of patients infected with 2019 novel coronavirus in Clinical characteristics of coronavirus disease 2019 in china The 1918 Spanish flu in Spain Real-time forecasts of the COVID-19 epidemic in china from Who director-general's opening remarks at the media briefing on Geographical tracking and mapping of coronavirus disease COVID-19/severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) epidemic and associated events around the world: how 21st century GIS technologies are supporting the global fight against outbreaks and epidemics Trend and forecasting of the COVID-19 outbreak in China Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study Analysis and forecast of COVID-19 spreading in China, Italy and France Early dynamics of transmission and control of COVID-19: a mathematical modelling study Estimation of local novel coronavirus (COVID-19) cases in Wuhan, China from off-site reported cases and population flow data from different sources Real-time forecasts and risk assessment of novel coronavirus (COVID-19) cases: a datadriven analysis Forecasting COVID-19 Forecasting the novel coronavirus COVID-19 Critical care utilization for the COVID-19 outbreak in Lombardy, Italy: early experience and forecast during an emergency response Baseline characteristics and outcomes of 1591 patients infected with SARS-CoV-2 admitted to ICUs of the Lombardy region Fair allocation of scarce medical resources in the time of Facing COVID-19 in Italy-ethics, logistics, and therapeutics on the epidemic's front line Real estimates of mortality following covid-19 infection Short-term forecasting COVID-19 cumulative confirmed cases: perspectives for Brazil Predictions, role of interventions and effects of a historic national lockdown in India's response to the COVID-19 pandemic: data science call to arms Comparison of an artificial neural network and Gompertz model for predicting the dynamics of deaths from COVID-19 in México Nonlinear dynamics of COVID-19 pandemic: modeling, control, and future perspectives Mechanisms of recurrent outbreak of COVID-19: a model-based study A machine learning forecasting model for COVID-19 pandemic in India COVID-19 future forecasting using supervised machine learning models Dynamics and control of COVID-19 pandemic with nonlinear incidence rates Nonlinear model predictive control with logic constraints for COVID-19 management Essai d'une nouvelle analyse de la mortalité causée par la petite vérole, et des avantages de l'inoculation pour la prévenir. Histoire de l'Acad Contributions to the mathematical theory of epidemics. II.-The problem of endemicity Dynamics of COVID-19 transmission with comorbidity: a data driven modelling based approach A data-driven model for predicting the course of COVID-19 epidemic with applications for China Understanding COVID-19 nonlinear multiscale dynamic spreading in Italy Forecasting for COVID-19 has failed Wrong but useful-what COVID-19 epidemiologic models can and cannot tell us Time Series Analysis: Forecasting and Control Dayahead electricity price forecasting using the wavelet transform and ARIMA models Unmasking the theta method The theta model: a decomposition approach to forecasting Time series forecasting using a hybrid ARIMA and neural network model Wind speed forecasting in three different regions of Mexico, using a hybrid ARIMA-ANN model Application of artificial neural network with extreme learning machine for economic growth estimation Comparison of ARIMA and NNAR models for forecasting water treatment plant's influent characteristics An integrated deterministicstochastic approach for forecasting the long-term trajectories of COVID-19 Artificial neural networks for small dataset analysis Prediction models for diagnosis and prognosis of COVID-19 infection: systematic review and critical appraisal Ensembles for time series forecasting Combining Pattern Classifiers: Methods and Algorithms Spectral Analysis for Univariate Time Series Generalizing the theta method for automatic forecasting Forecasting with artificial neural networks: the state of the art Time series forecasting with neural networks: a comparative study using the air line data Forecasting: principles and practice Learning internal representations by error propagation The assessment of probability distributions from expert opinions with an application to seismic fragility curves On the use of the deterministic Lyapunov function for the ergodicity of stochastic difference equations Strictly stationary solutions of autoregressive moving average equations Stationary and integrated autoregressive neural network processes Unemployment rate forecasting: a hybrid approach The existence of moments for stationary Markov chains Markov Chains and Stochastic Stability Another look at measures of forecast accuracy The Elements of Statistical Learning: Data Mining, Inference, and Prediction Package 'forecast R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing Forecasting with Exponential Smoothing: The State Space Approach Forecasting time series with complex seasonal patterns using exponential smoothing Forecasting dengue epidemics using a hybrid methodology Package 'waveletarima The authors gratefully acknowledge input, advice, feedback, and encouragement from anonymous reviewers, Associate editor, and Editor-in-chief.Funding The authors had no funding for this work.Data availability The data are available at https://ourworldindata. org/coronavirus. All the results can be effortlessly updated with the help of R-shiny application available at https://github.com/ arinjita9/COVID-19-Forecasting-by-TARNN-. The authors declare that they have no conflict of interest.