key: cord-0180954-aa6e0i02
authors: Costa, Kleyton da; Silva, Felipe Leite Coelho da; Coelho, Josiane da Silva Cordeiro; Modenesi, André de Melo
title: A Systematic Comparison of Forecasting for Gross Domestic Product in an Emergent Economy
date: 2020-10-26
journal: nan
DOI: nan
sha: 377fd2a8289f5bbdbfb1c3071f160b4789366b97
doc_id: 180954
cord_uid: aa6e0i02

Gross domestic product (GDP) is an important economic indicator that aggregates useful information to assist economic agents and policymakers in their decision-making process. In this context, GDP forecasting becomes a powerful decision optimization tool in several areas. To contribute in this direction, we investigated the efficiency of classical time series models, state-space models, and neural network models applied to Brazilian gross domestic product. The models used were: the Seasonal Autoregressive Integrated Moving Average (SARIMA) model and the Holt-Winters method, which are classical time series models; the dynamic linear model, a state-space model; and neural network autoregression and the multilayer perceptron, which are artificial neural network models. Based on statistical metrics of model comparison, the multilayer perceptron presented the best in-sample and out-of-sample forecasting performance for the analyzed period, while also capturing the growth rate structure of the series.

The economic activity of a country can be influenced by several factors that lead economic agents to change their consumption and investment decisions, in addition to affecting other outcomes, such as inflation and unemployment. Such factors, or shocks, result from changes in economic policy, in the level of production technology, in weather conditions, and so on.
The gross domestic product (GDP) is one of the main indexes for measuring the level of economic activity, and the forecast of its trajectory provides useful information about the short-term economic trend, serving as a basis for expectations of economic behavior. Significant impacts on economic activity arise through crises, which are a dysfunction inherent in the free market system. With the development of information transmission technologies and the global integration of markets, the scope and frequency of these dysfunctions have expanded. The Brazilian economic crisis, which began in the second quarter of 2014, is still the subject of many analyses, with no consensus on its causes or its consequences. In the second quarter of 2016, the GDP growth rate accumulated over four quarters reached the lowest level of the last two decades (-4.6%). The data show that the recovery after this significant drop was not complete and was followed by a period of stagnation in the country's growth rate.

Paula and Pires (2017) analyzed the ineffectiveness of counter-cyclical policies between 2011 and 2014 as a result of problems in the coordination of macroeconomic policy and of exogenous shocks, such as the deterioration of the terms of trade and the water crisis that occurred in the period. Filho (2017) argues that the Brazilian economic crisis originated in a series of supply and demand shocks that were (mostly) caused by wrong public policies, contributing to the reduction of the growth potential of the Brazilian economy and to the increase in tax cost. According to Feijó and Ramos (2013), the most relevant aggregates that derive from the System of National Accounts are the measures of product, income and expenditure.
The macroeconomic aggregates are statistical constructions that synthesize the productive effort of a given country or region and its possible consequences for the generation of income and expenditure over a specific period of time. By definition, the GDP of a country or region represents the production [1] of all production units of the economy (government, self-employed workers, companies etc.) in a given period, usually a quarter or a year, at market prices [2]. Blanchard and Johnson (2017) present two ways of interpreting GDP. Nominal GDP is defined as the sum of the quantities of final goods multiplied by their current prices, that is, it includes the inflationary effect during the calculation period. Real GDP uses constant prices, setting a given year as the base, and therefore excludes the effect of price increases.

[1] Socially organized economic activity that aims to create goods and services to be traded on the market and/or that are obtained by means of factors of production (land, capital and labor) traded on the market (IBGE, 2016).
[2] Economic transactions with observed or imputed market value.

Relying solely on GDP as an instrument for measuring the quality of life of the population has theoretical and practical limitations. Growth in production is not a sufficient condition for improving well-being (education, health, culture, security etc.), because the quality of economic growth is not part of the scope defined for the calculation of GDP. An expansion may be sustained by war expenditures (production of supplies and weapons, construction of military installations etc.) or by the reconstruction of a region that has been affected by natural disasters (hurricanes, earthquakes, floods etc.), which can reasonably be understood as activities that neither promote economic and social well-being nor are motivated by it.
Angus Deaton, winner of the 2015 Nobel Prize in Economics, says: "If crime goes up, and we spend more on prisons, GDP will be higher. If we neglect climate change, and spend more and more on cleaning up and repairing after storms, GDP will go up, not down; we count the repairs but ignore the destruction." (Deaton, 2013)

Time series analysis has proven to be an effective tool for understanding the behavior patterns of data observed sequentially over time, offering a wide range of models for analyzing and predicting trends and seasonality. The Seasonal Autoregressive Integrated Moving Average (SARIMA) model and the Holt-Winters method are considered classical time series models, while the class of dynamic linear models is part of the Bayesian approach.

Among the contributions that use classical models, Abonazel and Abd-Elftah (2019) analyzed Egypt's annual GDP between 1965 and 2016, with forecasts ten years ahead (2017 to 2026), and their results pointed to growth in the country's GDP over the forecast period. Wabomba et al. (2016) estimated Kenya's GDP between 2013 and 2017 and obtained significant growth in the Kenyan economy in the period. Agrawal (2018) modeled the series of India's real GDP growth rate from 1996 to 2017; in the analyzed data, the ARIMA model did not perform significantly better than the other models. The author also used the Holt-Winters model and a linear trend, which showed results similar to each other. Finally, da Silva et al. (2020) found significant results using ARIMAX and SARIMAX models (which take exogenous variables into account) to forecast Brazilian annual and quarterly real GDP for the year 2019.

For the Bayesian approach and the class of state-space models, Piccoli (2015) analyzed four dynamic linear models to identify the one with the best forecasting capacity for nominal GDP in the United States.
The best results were obtained using a multivariate SUTSE (Seemingly Unrelated Time Series Equations) model whose variables were nominal GDP, the industrial production index, the consumer price index (inflation), and the quarterly interest rate for US Treasury bills. Rees et al. (2015) built new measures of Australia's GDP growth using state-space methods. Their results are highly correlated with the officially published figures for GDP growth, but the measures are less volatile, easier to predict, and achieved good results in nowcasting. Issler and Notini (2016) estimated Brazilian real monthly GDP with a state-space representation and also found good forecasting results when compared with the Central Bank Economic Activity Index (IBC-Br). Migon et al. (1993) studied the performance of Bayesian dynamic models applied to a set of Brazilian macroeconomic time series (industrial productivity index, the balance of trade, components of GDP and others) from 1970 to 1990. The dynamic models were compared with classical structured models, and the results indicate that the Bayesian approach performed similarly to the classical approach. Another applied study was developed by Baurle et al. (2020), with the aim of forecasting GDP in the euro area and Switzerland with a Bayesian vector autoregressive (BVAR) structure and a factor model structure. They found evidence that the factor model structure performs satisfactorily.

For neural network models, Safi (2016) proposed a comparative study between artificial neural network models and time series forecasting models, finding results that indicate that the neural network model is superior to the ARIMA and regression models for quarterly Palestinian GDP in the period from 2014 to 2016.
Jahn (2018) applied an artificial neural network regression model to predict the annual GDP growth of fifteen industrialized countries, finding evidence that the model outperforms a comparable linear model in the period from 1996 to 2016. Tkacz (2001) investigated the predictive capacity of artificial neural networks for forecasting Canadian GDP growth, concluding that the model slightly outperforms the classical models in the short term and significantly outperforms them in the long term.

Another accepted approach to GDP forecasting is macroeconomic projection based on leading indicators. Garnitz et al. (2019) applied this strategy to forecast GDP growth in forty-four countries, including Brazil. One of their results indicates that the forecasts can be improved by adding the World Economic Survey (WES) indicators of each country's three main trading partners.

The aim of this work is to investigate a suitable time series model to describe and forecast Brazilian GDP, and also to investigate how well these models fit the dynamics between periods of economic growth and recession. For this purpose, different classes of time series models are compared. The chosen models were the Holt-Winters method, SARIMA, the dynamic linear model, and the artificial neural network approach. The literature contains some applications of these models, but no comparative studies were found using the set of models adopted in this work.

This work is organized as follows: Section 2 describes the methodology, Section 3 presents the results and discussion, and the last section provides the main conclusions and some possibilities for future research.

In what follows we outline the data and the empirical approach used to fit and forecast the time series of Brazilian gross domestic product between the years 1996 and 2021, at 1995 prices. This section also defines the models that were investigated.
Care was also taken to ensure that the references used in the definition of models and metrics correspond to studies that are widely used and whose quality is recognized by the academic community.

The quality of the data used in empirical analysis is fundamental to the quality of the results. A factor that contributes to the empirical analysis of GDP is the vast documentation made available by government agencies. We obtained the time series from the IBGE Automatic Recovery System (IBGE, 2020). The statistical analyses and graphical representations were built using the open-source software R (R Core Team, 2020). Table 1 shows the descriptive statistics of the data. The value of the kurtosis coefficient indicates a platykurtic distribution for quarterly Brazilian GDP, i.e., the distribution is flatter than a normal distribution. The skewness coefficient indicates that the distribution of the data is highly skewed. According to the Phillips-Perron unit root test, we also accepted the hypothesis of stationarity for the first-differenced series.

Several factors can affect the behavior of gross domestic product, and economic and financial crises are among the most relevant. Figure 1 shows the time series of Brazilian quarterly GDP from 1Q1996 to 1Q2021. In this period, three events (or periods of crisis) stand out: the Subprime crisis (2008-2009), an external shock caused by a financial crisis in the United States financial market that spread to the global market, including Brazil; the Brazilian recession that started in the second quarter of 2014, with repercussions until the fourth quarter of 2017, followed by a period of stagnation; and the Covid-19 crisis, a pandemic that started in Wuhan, China, and was first reported at the beginning of 2020 (Wu et al., 2020; Zhou et al., 2020).

The United Nations (2010) states that GDP derives from the concept of value added.
Therefore, GDP is the sum of the gross value added of all resident producer units plus that part of taxes on products, less subsidies on products. GDP is also equal to the sum of the final uses of goods and services measured at purchasers' prices, less the value of imports of goods and services. Finally, GDP is also equal to the sum of primary incomes distributed by resident producer units. According to Feijó and Ramos (2013), GDP can be calculated in three different ways, all of which are part of the accounting identity (Production = Income = Expenditure) that guides the National Accounts.

From the production perspective, GDP is calculated by summing the values added by economic activities plus taxes, net of subsidies, on products. In this calculation, GVA is the gross value added, IC is the intermediate consumption, T are taxes on products and Sub are subsidies on products.

From the income perspective, GDP is obtained by adding the remunerations of the factors of production. Labor is remunerated by wages, loan capital is remunerated by interest, venture capital is remunerated by profit, and ownership of production goods ("land") is remunerated by rent. That is,

$$GDP = W + GOS + (T - Sub),$$

where W are wages, GOS is the gross operating surplus (the sum of interest, profit and rent), T are taxes on products and Sub are subsidies on products.

The time series used in this work was built from the expenditure perspective, in which GDP is calculated as the sum of household consumption, investment, government spending and net exports. That is,

$$GDP = C + I + G + NE,$$

where C is household consumption, I is investment (gross fixed capital formation plus stock variation), G is government consumption, and NE is net exports (exports less imports).

As described in Cowpertwait and Metcalfe (2009), the Holt-Winters method was proposed by Holt (1957) and Winters (1960) and uses exponentially weighted moving averages to update the estimates needed for seasonal adjustment of the mean (trend) and seasonality.
The method has two variations (additive and multiplicative), each with four equations: one forecast equation and three smoothing equations. Following Hyndman and Athanasopoulos (2018), the additive method is described by

$$\hat{y}_{t+h|t} = \ell_t + h b_t + s_{t+h-m(k+1)},$$
$$\ell_t = \alpha (y_t - s_{t-m}) + (1 - \alpha)(\ell_{t-1} + b_{t-1}),$$
$$b_t = \beta (\ell_t - \ell_{t-1}) + (1 - \beta) b_{t-1},$$
$$s_t = \gamma (y_t - \ell_{t-1} - b_{t-1}) + (1 - \gamma) s_{t-m},$$

where $\hat{y}_{t+h|t}$ is the forecast equation, and $\ell_t$, $b_t$ and $s_t$ are respectively the level, trend and seasonality equations, with corresponding smoothing parameters $\alpha$, $\beta$ and $\gamma$. The parameter $m$ denotes the frequency of seasonality; for quarterly data, $m = 4$. Finally, $k$ is the integer part of $(h-1)/m$, which ensures that the estimates of the seasonal indices used for forecasting come from the final year of the sample. For the multiplicative method the same equations $\ell_t$, $b_t$ and $s_t$ are defined, but the structure changes: instead of adding the components in $\hat{y}_{t+h|t}$, the sum of the level and trend equations is multiplied by the seasonality equation.

Box & Jenkins models determine the proper stochastic process to represent a given time series by passing white noise through a linear filter (Morettin and Toloi, 2018). The model used was SARIMA, which incorporates the seasonal component present in the data under analysis. The SARIMA model of order $(p, d, q) \times (P, D, Q)_s$ is defined by

$$\phi(B)\,\Phi(B^s)\,\nabla^d \nabla_s^D\, y_t = \theta(B)\,\Theta(B^s)\,\alpha_t,$$

where $\theta(B)$ is the moving average operator of order $q$, $\phi(B)$ is the autoregressive operator of order $p$, $\Phi(B^s)$ is the seasonal autoregressive operator of order $P$, $\Theta(B^s)$ is the seasonal moving average operator of order $Q$, $\nabla^d$ is the simple difference operator, $\nabla_s^D$ is the seasonal difference operator and $\alpha_t$ is the noise.

Artificial neural network models seek to model the relationship between a set of input signals and an output signal. We apply the multilayer perceptron (MLP) structure. By definition, the multilayer perceptron, or feedforward deep network, is a mathematical function mapping a set of input values to output values. Following the contribution of Goodfellow et al.
(2016), the main objective of a feedforward deep network is to approximate a function $f^*$ by defining a mapping $y = f(x; \theta)$ and learning the value of the parameters $\theta$ that results in the best function approximation. A feedforward neural network with one hidden layer and a layer of lagged inputs is a useful approach for forecasting univariate time series. When lagged values of the time series are used as inputs to a feedforward neural network, the result is called a neural network autoregression or NNAR model (Hyndman and Athanasopoulos, 2018). In the relationship between the output and the inputs of the neural network autoregression, $y_t$ is the output, $(y_{t-1}, \dots, y_{t-p})$ are the inputs, and the model parameters (weights) are $w_{ij}$ ($i = 1, 2, \dots, n$; $j = 1, 2, \dots, h$) and $w_j$ ($j = 1, 2, \dots, h$). The activation function usually used is the sigmoid function, given by

$$g(x) = \frac{1}{1 + e^{-x}}.$$

Figure 2 shows the graph representation of an MLP as described above, with $n$ inputs in the input layer, $L$ hidden layers with $m^{(L)}$ hidden units, and $k$ outputs in the output layer.

Dynamic linear models are an important class of state-space models. Widely used in recent decades, they are highly efficient for the analysis and forecasting of time series, providing flexibility and applicability through an elegant and robust probabilistic apparatus. The estimation and inference challenges are solved by recursive algorithms that follow the Bayesian approach, calculating conditional distributions of the quantities of interest given the observed information. They treat a series as subject to dynamic and random deformations over time and can incorporate seasonal or regressive components. This work draws on the contributions of West and Harrison (1997), Laine (2019), Petris et al. (2009) and Petris (2010).
For each time $t$, the general univariate DLM is defined by an observational equation, a system equation and initial information, given by

$$y_t = F_t \theta_t + v_t, \qquad v_t \sim N(0, V_t),$$
$$\theta_t = G_t \theta_{t-1} + w_t, \qquad w_t \sim N(0, W_t),$$
$$(\theta_0 \mid D_0) \sim N(m_0, C_0),$$

where $F_t$ and $G_t$ are known matrices; $v_t$ and $w_t$ are two sequences of independent noises, with mean zero and known covariance matrices $V_t$ and $W_t$, respectively. $D_t$ is the current information set, and $m_0$ and $C_0$ contain the relevant information about the future, in the usual statistical sense. To take growth and seasonality into account, the state vector is defined as $\theta_t = (\mu_t, \beta_t, \gamma_t, \gamma_{t-1}, \gamma_{t-2})$, where $\mu_t$ is the current level, $\beta_t$ is the slope of the trend, and $\gamma_t$, $\gamma_{t-1}$ and $\gamma_{t-2}$ are the seasonal components.

The most suitable forecasting model was selected, following the contributions of Hyndman and Koehler (2006), Armstrong (2001), Morettin and Toloi (2018) and Ahlburg (1984), using the mean absolute percentage error (MAPE). The MAPE has the advantage of being scale-independent, and so it is frequently used to compare forecast performance across different data sets. The metric is defined as

$$\mathrm{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|,$$

where $y_t$ is the observed value at time $t$, $\hat{y}_t$ is the corresponding predicted value, obtained $h$ steps ahead, and $n$ is the number of predicted values.

The empirical strategy and forecast workflow of this study (Figure 3) comprises three basic steps: the data tasks, the individual tasks, and the common tasks. In the data tasks step we first extract the variables of quarterly Brazilian gross domestic product, by the expenditure side, from the IBGE Automatic Recovery System (SIDRA). Next, we organize the data: (i) compute GDP as $Y = C + G + I + NE$; (ii) convert $Y$ to time series format ($Y_t$); (iii) split $Y_t$ into training and test sets. In the individual tasks step we define the structure used in the three classes of models that we consider (classical, state-space, and artificial neural networks). For the classical models, we first compute the sum of squared errors (SSE) for the additive and multiplicative Holt-Winters methods.
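As an illustration of this SSE comparison, the following Python sketch computes the one-step-ahead SSE of the additive and multiplicative Holt-Winters recursions on a synthetic quarterly series. It is not the authors' R code: the function name `holt_winters_sse`, the initialization from the first two seasonal cycles, the fixed smoothing parameters, and the synthetic data are all assumptions made for the example (in practice the smoothing parameters would be estimated).

```python
import numpy as np

def holt_winters_sse(y, m, alpha, beta, gamma, multiplicative=False):
    """One-step-ahead SSE of Holt-Winters with fixed smoothing parameters.

    Initial states come from the first two seasonal cycles (an assumption).
    """
    y = np.asarray(y, dtype=float)
    # Initial slope: change in cycle means divided by the cycle length.
    trend = (y[m:2 * m].mean() - y[:m].mean()) / m
    # Initial level, placed at the last point of the first cycle (time m-1).
    level = y[:m].mean() + trend * (m - 1) / 2.0
    # Trend line over the first cycle, used to extract initial seasonal terms.
    base = level - trend * np.arange(m - 1, -1, -1)
    season = list(y[:m] / base) if multiplicative else list(y[:m] - base)

    sse = 0.0
    for t in range(m, len(y)):
        s_prev = season[t - m]                     # seasonal term from one cycle ago
        if multiplicative:
            forecast = (level + trend) * s_prev
            new_level = alpha * (y[t] / s_prev) + (1 - alpha) * (level + trend)
            new_season = gamma * (y[t] / (level + trend)) + (1 - gamma) * s_prev
        else:
            forecast = level + trend + s_prev
            new_level = alpha * (y[t] - s_prev) + (1 - alpha) * (level + trend)
            new_season = gamma * (y[t] - level - trend) + (1 - gamma) * s_prev
        sse += (y[t] - forecast) ** 2
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
        season.append(new_season)
    return sse

# Synthetic quarterly data: linear trend plus a fixed additive seasonal pattern.
t = np.arange(40)
y = 100.0 + 2.0 * t + np.tile([5.0, -2.0, 1.0, -4.0], 10)

sse_add = holt_winters_sse(y, m=4, alpha=0.3, beta=0.1, gamma=0.2)
sse_mul = holt_winters_sse(y, m=4, alpha=0.3, beta=0.1, gamma=0.2,
                           multiplicative=True)
chosen = "additive" if sse_add <= sse_mul else "multiplicative"
```

Because the synthetic series is exactly additive, the additive recursion reproduces it and the additive variant is selected; with real GDP data either variant could win.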
The method with the lower SSE was considered for analysis; next, we apply the selection algorithm for the SARIMA structure (described in Algorithm 1). In the state-space approach we use the dynamic linear model with variance parameters estimated by a Markov chain Monte Carlo method, the Gibbs sampler. The matrices of the dynamic linear model were then constructed based on the Gibbs sampler results. For the artificial neural networks we consider the multilayer perceptron (MLP) with different numbers of layers and the neural network autoregression (NNAR). In the common tasks step we compute the predictions ($\hat{Y}_t$) of each model applied to the dataset under analysis and then calculate the prediction error for the training and test sets. Finally, the graphical representations and comparison tables of the metrics are created.

Figure 3: Flowchart of the empirical strategy adopted in the study. Authors' elaboration.

This section presents the results obtained using the Holt-Winters additive method, SARIMA, dynamic linear models, neural network autoregression, and the multilayer perceptron to fit the data of interest. For each model, the mean absolute error was plotted for a forecast horizon $h$ varying over $h = 1, \dots, 13$.

Compared to the multiplicative Holt-Winters method, the additive formulation was considered the most appropriate, based on the sum of squared errors. Figure 4 shows the trajectory of the MAPE for the Holt-Winters method. It can be seen that the MAPE increases linearly with the steps of the forecast horizon.

To apply the SARIMA model, the behavior of the autocorrelation function (ACF) and partial autocorrelation function (PACF) was verified. In Figure 5(a), it is possible to see a slow decay of the autocorrelation function to zero. This behavior indicates the non-stationarity of the series, which needs to be differenced in order to make it stationary.
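The effect of differencing can be illustrated with a minimal Python sketch on synthetic quarterly data. The helper `difference` and the synthetic series are assumptions made for the example and are not taken from the paper, whose analysis was done in R on the actual GDP series.

```python
import numpy as np

def difference(y, lag=1):
    """Return y[t] - y[t - lag] for t = lag, ..., len(y) - 1."""
    y = np.asarray(y, dtype=float)
    return y[lag:] - y[:-lag]

# Synthetic quarterly series: linear trend plus a fixed seasonal pattern.
t = np.arange(40)
y = 100.0 + 2.0 * t + np.tile([5.0, -2.0, 1.0, -4.0], 10)

d1 = difference(y, lag=1)        # simple difference: removes the linear trend
d1_d4 = difference(d1, lag=4)    # seasonal difference (s = 4): removes the seasonal pattern
```

Here the simple difference leaves only a repeating period-4 pattern, and the additional seasonal difference at lag 4 removes it, which corresponds to $d = 1$ and $D = 1$ with $s = 4$ in SARIMA notation.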
Figure 6(a) shows the autocorrelation function of the differenced series, with an exponential decay at lags that are multiples of 4, indicating possible stationarity of the series. Through the Phillips-Perron test (Dickey-Fuller $Z_\alpha$ = -62.816; p-value = 0.01), the alternative hypothesis of stationarity of the differenced series was accepted at a significance level of 1%. We used an algorithm (Algorithm 1) to generate sixteen SARIMA models following the principle of parsimony. Among the generated models, the structure with the best results was the SARIMA $(1, 1, 0) \times (1, 1, 1)_4$, with metrics MAPE (0.9823), RMSE (3216.698), and a Ljung-Box test p-value of 0.997, showing that the residuals are independently distributed. However, other structures also presented good results under the selection algorithm.

In this study, we consider the neural network autoregression model, NNAR$(p, P, k)_m$. The model takes seasonality into account, with the lags $(y_{t-1}, y_{t-2}, \dots, y_{t-p}, y_{t-m}, y_{t-2m}, \dots, y_{t-Pm})$ as the input layer and a hidden layer with $k$ nodes. Figure 8 shows the trajectory of the MAPE for the NNAR model. It can be seen in this figure that the MAPE is less than 1% up to the forecast horizon of 9 steps ahead. The same behavior (Figure 9) is observed for the MLP models.

In this work, the dynamic regression matrix $F_t$ and the evolution matrix $G_t$ of the model are

$$F_t = \begin{pmatrix} 1 & 0 & 1 & 0 & 0 \end{pmatrix}, \qquad
G_t = \begin{pmatrix}
1 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & -1 & -1 & -1 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0
\end{pmatrix}.$$

For the study, the observational variance was assumed to be $V_t = \sigma^2$, and the covariance matrix of the system $W_t$ was taken to be the diagonal matrix $W_t = \mathrm{diag}(\sigma^2_\mu, \sigma^2_\beta, \sigma^2_\gamma, 0, 0)$. These unknown variances were also estimated using Bayesian inference. Thus, to complete the specification of the model, independent inverse gamma prior distributions were assumed, with means $a, a_{\theta_1}, a_{\theta_2}, a_{\theta_3}$ and variances $b, b_{\theta_1}, b_{\theta_2}, b_{\theta_3}$, respectively, fixed at known values.
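The state-space structure described above can be made concrete with a short Python sketch. It assumes the standard linear-trend-plus-seasonal-factor form of $F_t$ and $G_t$ for the state $\theta_t = (\mu_t, \beta_t, \gamma_t, \gamma_{t-1}, \gamma_{t-2})$; whether this matches the authors' exact matrices is an assumption. The `simulate` helper, the initial state, and the variance values are illustrative only, and the paper itself relies on the R package dlm.

```python
import numpy as np

# State vector: theta_t = (mu_t, beta_t, gamma_t, gamma_{t-1}, gamma_{t-2}).
F = np.array([[1.0, 0.0, 1.0, 0.0, 0.0]])   # observation = level + current seasonal term
G = np.array([
    [1.0, 1.0,  0.0,  0.0,  0.0],   # mu_t    = mu_{t-1} + beta_{t-1}
    [0.0, 1.0,  0.0,  0.0,  0.0],   # beta_t  = beta_{t-1}
    [0.0, 0.0, -1.0, -1.0, -1.0],   # gamma_t = -(gamma_{t-1} + gamma_{t-2} + gamma_{t-3})
    [0.0, 0.0,  1.0,  0.0,  0.0],   # shift the seasonal terms down by one position
    [0.0, 0.0,  0.0,  1.0,  0.0],
])

def simulate(theta0, n, sig2_obs, sig2_state, seed=0):
    """Simulate n observations; W = diag(sig2_mu, sig2_beta, sig2_gamma, 0, 0)."""
    rng = np.random.default_rng(seed)
    w_sd = np.sqrt(np.array([*sig2_state, 0.0, 0.0]))
    theta = np.asarray(theta0, dtype=float)
    ys = []
    for _ in range(n):
        theta = G @ theta + w_sd * rng.standard_normal(5)                    # system equation
        ys.append((F @ theta)[0] + np.sqrt(sig2_obs) * rng.standard_normal())  # observation equation
    return np.array(ys)

# With all variances set to zero the model reduces to a deterministic
# linear trend plus a seasonal pattern that repeats every four quarters.
y_det = simulate([100.0, 2.0, 5.0, -4.0, 1.0], n=8,
                 sig2_obs=0.0, sig2_state=(0.0, 0.0, 0.0))
seasonal_block = G[2:, 2:]
```

Applying the seasonal block of `G` four times returns the seasonal terms to their starting values, which is what makes the pattern quarterly.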
Therefore, by using the unobservable states as latent variables, a Gibbs sampler can be run on the basis of the full conditional densities of the variances, which depend on the sums of squares $SS_y = \sum_{t=1}^{n} (y_t - F_t \theta_t)^2$ and $SS_{\theta_i} = \sum_{t=1}^{T} (\theta_{t,i} - (G_t \theta_{t-1})_i)^2$, for $i = 1, 2, 3$. The full conditional density of the states is a normal distribution, and it is handled by the dlm package used here (Petris, 2010). From the Gibbs sampler, 5000 iterations were generated for each parameter (the model variances), of which the first 1000 iterations were treated as the burn-in period and discarded. The remaining iterations were used to compose the posterior samples of the estimated variances. Posterior estimates of the four unknown variances, from the Gibbs sampler output, can be seen in Figure 10. Figure 10 shows the trajectory of the MAPE. It can be seen in this figure that the MAPE is less than 2% for forecast horizons of up to 9 steps ahead.

The metrics were used to assess the goodness of fit of the models to the Brazilian quarterly GDP data between the years 1996 and 2016, at 1995 prices, and the results are shown in Table 1. The best results were given by the multilayer perceptron models, which best fit the series of Brazilian GDP, at 1995 prices, having achieved the lowest values in all metrics for both fitted and forecast values. Table 3 presents the MAPE for each forecast horizon yielded by the proposed models. The dynamic linear model presented the best MAPE for forecasts up to two steps ahead, but the multilayer perceptron models were better for the other forecast horizons.

Understanding GDP behavior is a topic of study and discussion for society and the academic community. In the present work, we proposed the application of the Holt-Winters additive method, SARIMA, the dynamic linear model, and neural network models to the forecasting of Brazilian quarterly GDP, at 1995 prices.
The data comprise the period between the first quarter of 1996 and the fourth quarter of 2019. The results of the study indicate that the proposed models have a satisfactory ability to fit the Brazilian quarterly GDP data. The MAPE of the fit of the models to the training set was less than 2%. In addition, the models managed to capture the complex structure of the data involving the crises (peaks in the series) in the years 2009, 2015 and 2020. By the MAPE metric, the multilayer perceptron presented the best fit to the data and efficient forecast performance. The dynamic linear model, in turn, presented the second best fit and the best predictive ability for the forecast horizon of two steps ahead. We find evidence in this study that corroborates the observed stagnation in the Brazilian economy after the crisis period that started in the second quarter of 2014. Therefore, the multilayer perceptron model proved to be efficient for fitting and forecasting GDP data even in the presence of economic shocks. For future work, it would be interesting to compare the results obtained with forecasting models from the machine learning approach, for example Long Short-Term Memory (LSTM) networks, and to detect the concept drifts present in the time series in order to identify crisis periods.
Forecasting egyptian gdp using arima models
GDP modelling and forecasting using ARIMA: an empirical study from India
Forecast evaluation and improvement using theil's decomposition
Principles of forecasting: a handbook for researchers and practitioners
Forecasting the production side of gdp
Uso de ferramentas econométricas para modelar e estimar o pib do brasil
The great escape: health, wealth, and the origins of inequality
A crise econômica de
Forecasting gdp all over the world using leading indicators based on comprehensive survey data
Deep learning
Forecasting seasonals and trends by exponentially weighted moving averages
Forecasting: Principles and Practice
Another look at measures of forecast accuracy
Brasil : ano de referência 2010. IBGE
Sistema ibge de recuperação automática - sidra
Estimating brazilian monthly gdp: a state-space approach. Revista Brasileira de Economia
Artificial neural network regression models: Predicting gdp growth
Introduction to dynamic linear models for time series analysis
Modelos bayesianos univariados aplicados à previsão de séries econômicas
Análise de séries temporais: modelos lineares univariados
Crise e perspectivas para a economia brasileira
An r package for dynamic linear models
Dynamic Linear Models with R
Identification of a dynamic linear model for the american gdp
R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing
A state-space approach to australian gross domestic product measurement
A comparison of artificial neural network and time series models for forecasting gdp in palestine
Neural network forecasting of canadian gdp growth
System of National Accounts
Modeling and forecasting kenyan gdp using autoregressive integrated moving average (arima) models
Bayesian Forecasting and Dynamic Models
Forecasting sales by exponentially weighted moving averages
A new coronavirus associated with human respiratory disease in china
A pneumonia outbreak associated with a new coronavirus of probable bat origin