key: cord-0686609-fb11fm5d
authors: Durairaj, Dr. M.; Mohan, B. H. Krishna
title: A convolutional neural network based approach to financial time series prediction
date: 2022-03-23
journal: Neural Comput Appl
DOI: 10.1007/s00521-022-07143-2
sha: 79c121cb00ae8b7ffa78cfcf70517ae45c0dc87a
doc_id: 686609
cord_uid: fb11fm5d

Financial time series are chaotic, which makes their prediction complex and challenging. This paper presents a novel hybrid for financial time series prediction that involves Chaos Theory, a Convolutional Neural Network (CNN), and Polynomial Regression (PR). In this hybrid, the financial time series is first checked for the presence of chaos. The chaos in the time series is then modeled using Chaos Theory. The modeled time series is input to the CNN to obtain initial predictions. The error series obtained from the CNN predictions is fit by PR to get error predictions. The error predictions and the initial predictions from the CNN are added to obtain the final predictions of the hybrid model. The effectiveness of the proposed hybrid (Chaos+CNN+PR) is tested on three types of financial time series: foreign exchange rates (INR/USD, JPY/USD, SGD/USD), commodity prices (Gold, Crude Oil, Soya beans), and stock market indices (S&P 500, Nifty 50, Shanghai Composite). The proposed hybrid is superior to the Auto-Regressive Integrated Moving Average (ARIMA), Prophet, Classification and Regression Tree (CART), Random Forest (RF), CNN, Chaos+CART, Chaos+RF, and Chaos+CNN in terms of MSE, MAPE, Dstat, and Theil's U.

A financial time series is a collection of observations of one or more financial variables recorded at regular intervals. For example, daily exchange rates, daily stock market index values, and daily commodity prices are financial time series. In general, financial time series are chaotic and noisy [39]. A chaotic time series is nonlinear and sensitive to initial conditions [7]. Financial time series are also noisy, and their statistical properties vary with time, which makes prediction difficult [11, 19]. Building a prediction model that can capture the nonlinearity present in the time series is always challenging. The prediction of financial time series is therefore a difficult and complex task.

Several researchers have demonstrated that an ensemble or hybrid forecasting model for time series can perform better in comparison with stand-alone forecasting models [4, 32] . A hybrid combines two or more stand-alone forecasting models into a mixed model to improve prediction accuracy and overcome the deficiencies of stand-alone models.

Chaos theory [26, 38] models nonlinear financial time series by using lag and embedding dimension in which a lag is the time delay, and embedding dimension is the number of variables required to capture the nonlinear dynamics of financial time series.
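To make the roles of the lag and the embedding dimension concrete, the following minimal sketch builds delay vectors from a scalar series and pairs each vector with the value that follows it. The function name `delay_embed`, the column ordering, and the one-step-ahead target are assumptions made for illustration; the source does not specify how the reconstructed phase space is laid out.

```python
import numpy as np

def delay_embed(series, lag, dim):
    """Build delay vectors and next-value targets from a scalar series.

    Row i of X holds [y[t - dim*lag], ..., y[t - 2*lag], y[t - lag]]
    for t = dim*lag + i, and target[i] is y[t].
    (Hypothetical helper; layout is an assumption of this sketch.)
    """
    if lag < 1 or dim < 1:
        raise ValueError("lag and dim must be positive integers")
    y = np.asarray(series, dtype=float)
    start = lag * dim  # first index with a complete history
    if len(y) <= start:
        raise ValueError("series is too short for this lag and dimension")
    # One column per delay, oldest value first.
    columns = [y[start - j * lag : len(y) - j * lag] for j in range(dim, 0, -1)]
    return np.column_stack(columns), y[start:]

# Toy series 0, 1, ..., 9 with lag 2 and embedding dimension 3.
X, target = delay_embed(np.arange(10), lag=2, dim=3)
print(X.shape, X[0], target[0])
```

With lag 2 and embedding dimension 3, the first delay vector is (0, 2, 4) and its target is 6, so each row can serve as one supervised training example.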

Applying deep learning approaches can help achieve better prediction accuracy [3, 6]. Deep learning, a subset of machine learning, allows Artificial Neural Networks (ANNs) to learn data representations at multiple levels of abstraction (hierarchical learning) [10, 16]. ANNs can construct a complex nonlinear function that maps inputs to outputs. They are applied to various financial problems such as prediction of stock markets, optimization of portfolios, and processing and execution of trade information [43]. This field is still relatively unexplored, however.

A CNN [14] is a special type of neural network that consists of one or more convolutional layers, often with a subsampling layer, followed by one or more fully connected layers as in a standard neural network. CNNs were developed for two-dimensional image data. However, they can also be used for one-dimensional data such as sequences of text and time series [15].

This paper presents a hybrid model involving Chaos Theory, CNN, and PR to predict financial time series as follows. The financial time series is first checked for chaos. The time series, modeled using Chaos Theory, is input to the CNN to obtain initial predictions. The error series obtained from the CNN predictions is fit by PR to obtain error predictions. The final forecasts of the hybrid model are obtained by adding the PR error predictions to the initial CNN predictions. Our goal is to build a more accurate model to predict different financial time series such as exchange rates, commodity prices, and stock prices.

Although Chaos-based hybrids such as Chaos+MLP+PSO [27], Chaos+MLP+MOPSO, and Chaos+MLP+NSGA-II [31] are present in the literature (see Table 1), the second stage of these approaches, which models the error series, is complex and time-consuming because more parameters must be tuned. We therefore used a simple PR to model the error series, as it can capture the nonlinearity present in the error series very well. In addition, no existing approach has been comprehensively tested for its efficacy on three types of financial time series.

The contributions of this paper include:

- 

This section presents related CNN-based hybrids and chaos-based hybrids proposed for financial time series prediction. The CNN-based hybrids are as follows. Livieris et al. [18] proposed a CNN-LSTM model for gold price time series forecasting in which the CNN is used for learning an internal representation of the time series and Long […] [37] proposed another CNN-LSTM hybrid model, which can take images as input and thereby provides a wide variety of information associated with both static and dynamic characteristics of the series. The authors used this approach for predicting gold price volatility. Selvin et al. [33] proposed a new CNN-based hybrid, the CNN-Sliding Window model, in which a sliding window is used for predicting future values on a short-term basis. Table 1 presents the hybrids based on chaos theory found in the literature for predicting financial time series. All of these studies concluded that the proposed chaos-based hybrids outperformed stand-alone models.

In the proposed hybrid, a financial time series is first checked for the presence of chaos using the Lyapunov exponent [31]. Chaos theory is then employed to build the phase space of the scalar time series [23, 35]. Building the phase space requires the optimal lag and the optimal embedding dimension. The Akaike Information Criterion (AIC) [1] is used to select the optimal lag, and Cao's method [5] is used to select the optimal embedding dimension. Once the optimal lag and the optimal embedding dimension are obtained from the time series, the phase space can be reconstructed using Chaos Theory. The CNN is then used to obtain initial predictions, and finally PR is used to fine-tune the predictions. The proposed hybrid is compared with ARIMA [21], Prophet (https://facebook.github.io/prophet/), CNN, CART, RF, Chaos+CART [28], Chaos+RF [28], and Chaos+CNN. Table 2 presents the notations used in the proposed approach along with their interpretations.

The proposed hybrid approach is described as follows. Let $Y = \{y_1, y_2, y_3, \ldots, y_k, y_{k+1}, \ldots, y_N\}$ be a time series with $N$ observations recorded at times $t = \{1, 2, 3, \ldots, k, k+1, \ldots, N\}$.

Then perform the following:

1. Check $Y$ for the presence of chaos. If chaos is present, obtain the optimal lag ($l$) and the optimal embedding dimension ($m$) from $Y$.

2. Once the optimal lag and embedding dimension are obtained, reconstruct the phase space from $Y$.

3. […] $\ldots, N\}$.

4. Input $Y_{Train}$ to the CNN and train the CNN to obtain the initial predictions of the training set using Eq. 1.

5. Obtain the initial predictions of the test set by inputting $Y_{Test}$ to the trained CNN, replacing $t$ with $\{k+1, k+2, \ldots, N\}$ in Eq. 1.

6. Compute the training set prediction errors using Eq. 2, and the test set prediction errors by replacing $t$ with $\{k+1, k+2, \ldots, N\}$ in Eq. 2.

7. Fit Polynomial Regression to the training set errors and obtain the training set error predictions using Eq. 3. Similarly, fit PR to the test set errors and obtain the test set error predictions by replacing $t$ with $\{k+1, k+2, \ldots, N\}$ in Eq. 3.

8. Add the training set initial predictions and the training set error predictions to obtain the final training set predictions using Eq. 4. Similarly, add the test set initial predictions and the test set error predictions to obtain the final test set predictions by replacing $t$ with $\{k+1, k+2, \ldots, N\}$ in Eq. 4.

Table 3 presents these datasets along with the corresponding dates, number of observations, training set, and test set.
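The two-stage procedure in the numbered steps (initial predictions, error series, PR fit to the errors, and their sum) can be sketched as below. This is only an illustration under stated assumptions: a least-squares linear model stands in for the CNN, the error is regressed on the initial prediction (the source does not say which regressor PR uses), the series is synthetic, and the lag, embedding dimension, and helper names (`delay_embed`, `fit_base`, `predict_base`, `pr_correct`) are hypothetical.

```python
import numpy as np

def delay_embed(series, lag, dim):
    """Delay vectors (oldest value first) and next-value targets."""
    y = np.asarray(series, dtype=float)
    start = lag * dim
    cols = [y[start - j * lag : len(y) - j * lag] for j in range(dim, 0, -1)]
    return np.column_stack(cols), y[start:]

def fit_base(X, y):
    """ASSUMPTION: linear least squares used as a stand-in for the CNN."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_base(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def pr_correct(initial, actual, degree=2):
    """Fit a degree-2 polynomial to the errors and add the fitted errors back.

    ASSUMPTION: the error is regressed on the initial prediction.
    """
    errors = actual - initial                    # error series
    coeffs = np.polyfit(initial, errors, degree)
    error_pred = np.polyval(coeffs, initial)     # error predictions
    return initial + error_pred                  # final predictions

def mse(a, b):
    return float(np.mean((a - b) ** 2))

# Synthetic nonlinear series (illustrative data only).
rng = np.random.default_rng(0)
t = np.arange(400)
series = np.sin(0.07 * t) ** 3 + 0.02 * rng.standard_normal(t.size)

X, y = delay_embed(series, lag=1, dim=3)         # lag and dim chosen arbitrarily
k = int(0.8 * len(y))                            # 80% training, 20% test
X_train, y_train, X_test, y_test = X[:k], y[:k], X[k:], y[k:]

coef = fit_base(X_train, y_train)
init_train = predict_base(coef, X_train)         # initial training predictions
init_test = predict_base(coef, X_test)           # initial test predictions

# PR is fit separately to each set's errors, as the steps describe.
final_train = pr_correct(init_train, y_train)
final_test = pr_correct(init_test, y_test)

print("test MSE, initial:", mse(y_test, init_test))
print("test MSE, final:  ", mse(y_test, final_test))
```

Because the polynomial is fit by least squares to the same errors it then predicts, the corrected predictions can never have a larger squared error than the initial ones on that set; how much they improve depends on how much structure the first-stage model leaves in its errors.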

Here, the financial time series prediction problem is modeled as a supervised learning problem. Each dataset is therefore divided into a training set (80% of the observations) and a test set (20%). First, all the datasets are checked for chaos, and chaos is found to be present in each of them. The phase space is then reconstructed from each dataset with the corresponding optimal lag and optimal embedding dimension (Fig. 1).

Table 4 presents descriptive statistics of the datasets: minimum, mean, median, maximum, standard deviation, skewness, and kurtosis. Skewness measures the asymmetry of the data. A value of zero indicates that the data are perfectly symmetric. A positive value indicates that the tail of the distribution is more stretched on the side above the mean, and a negative value indicates that it is more stretched on the side below the mean. The distributions of all commodity prices, stock prices, and exchange rates have tails that are more stretched on the side above the mean.

Kurtosis characterizes the relative peakedness or flatness of a distribution compared with the normal distribution. Positive kurtosis indicates a relatively peaked distribution, and negative kurtosis indicates a relatively flat distribution. The datasets of all commodity prices, the Nifty 50 stock index, INR/USD, and SGD/USD have relatively flat distributions. The Shanghai Composite Index, the S&P 500, and JPY/USD have relatively peaked distributions.
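As a small illustration of how these two shape measures behave, the sketch below computes skewness and kurtosis from standardized moments. It assumes the population (uncorrected) estimators and reports kurtosis relative to the normal distribution (that is, minus 3), which matches the sign convention described above; the estimators actually used for Table 4 are not stated in the source, and the function names are hypothetical.

```python
import numpy as np

def skewness(x):
    """Third standardized moment (population form, an assumption)."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3))

def excess_kurtosis(x):
    """Fourth standardized moment minus 3, so a normal distribution scores 0."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 4) - 3.0)

symmetric = [1, 2, 3, 4, 5]       # evenly spread, no tail
right_tailed = [1, 1, 1, 1, 10]   # one large value stretches the upper tail

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_tailed), excess_kurtosis(right_tailed))
```

The evenly spread sample has zero skewness and negative kurtosis (relatively flat), while the sample with one large value has positive skewness, since its tail is stretched above the mean.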

Various tasks are carried out during the experimentation. These tasks, together with the tools used to conduct them, are presented in Table 5. As shown in Table 5, the Lyapunov exponent (λ) is used to check for chaos, the AIC is used to obtain the optimal lag, and Cao's method is used to obtain the optimal embedding dimension. For additional information on these tasks, readers are referred to [31]. While experimenting with the datasets, some parameters are obtained per dataset and others are used in common. Table 6 presents the optimal values obtained for the chaotic parameters. A positive value of λ denotes the presence of chaos. From the table, it is clear that all of the datasets are chaotic. The optimal chaotic parameters, namely the lag (l) and the embedding dimension (m), are also presented in Table 6. Cao's method [5] is implemented by the estimateEmbeddingDim(.) function of the "nonlinearTseries" package.

The optimal parameters for ARIMA (p, d, q) are presented in the respective sections. The optimal p, d, and q values of the ARIMA model are obtained using auto_arima(.) from the "pmdarima" module of Python. The parameters common to all datasets are as follows. The CNN architecture used here consists of a convolutional layer, a pooling layer, and one fully connected dense layer of 50 nodes, each with the ReLU activation function. The CNN is trained for 500 epochs using the Adam optimizer with MSE as the loss function. Values scaled using MinMaxScaler are input to CNN, Chaos+CNN, and Chaos+CNN+PR. The errors are modeled using second-degree polynomial regression.

The performance of the proposed hybrid is measured using four performance measures: Mean Squared Error (MSE), Mean Absolute Percentage Error (MAPE), Directional Change Statistic (Dstat), and Theil's Inequality Coefficient (Theil's U).

The MSE (see Eq. 5) is the average of the squared errors and indicates how well the model predicts the response [20]. The MAPE [20] expresses the absolute errors as percentages of the actual values. An MSE or MAPE value near 0 indicates that the predictions of the model are close to the observed data.

Yao and Tan [39] developed a measure, Dstat (see Eq. 7), expressed as a percentage, to measure how well the directional change of a time series is predicted. The higher the value of Dstat, the better the model captures the movements of the time series.

Theil's U indicates how close a predicted time series is to the actual time series [20, 36]. The value of U (see Eq. 8) usually lies between 0 and 1. $U = 0$ indicates that $y_t = \hat{y}_t$ for all observations, that is, a perfect fit, whereas $U = 1$ indicates poor performance. A Theil's U value closer to 0 indicates that the model produces more accurate predictions.

In all of the equations of these performance measures, $y_t$ is the actual value at time $t$, $\hat{y}_t$ is the predicted value obtained using the proposed approach at time $t$, and $N$ is the number of predicted values.
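For reference, the four measures can be computed as in the sketch below. Since Eqs. 5 to 8 are not reproduced in this text, the code assumes their commonly used forms: Dstat counts a hit when the actual move $y_{t+1} - y_t$ and the predicted move $\hat{y}_{t+1} - y_t$ have the same sign, and Theil's U is the bounded form whose denominator adds the root mean squares of the actual and predicted series. The function names are illustrative.

```python
import numpy as np

def mse(y, y_hat):
    """Mean squared error."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean((y - y_hat) ** 2))

def mape(y, y_hat):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(100.0 * np.mean(np.abs((y - y_hat) / y)))

def dstat(y, y_hat):
    """Share (in percent) of steps whose direction of change is predicted correctly."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    actual_move = y[1:] - y[:-1]
    predicted_move = y_hat[1:] - y[:-1]
    return float(100.0 * np.mean(actual_move * predicted_move >= 0))

def theil_u(y, y_hat):
    """Theil's inequality coefficient (assumed bounded form, 0 = perfect fit)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    rmse = np.sqrt(np.mean((y - y_hat) ** 2))
    return float(rmse / (np.sqrt(np.mean(y ** 2)) + np.sqrt(np.mean(y_hat ** 2))))

actual = np.array([2.0, 4.0, 3.0])       # toy values, not from the paper
predicted = np.array([1.0, 5.0, 4.5])

for name, fn in [("MSE", mse), ("MAPE", mape), ("Dstat", dstat), ("Theil's U", theil_u)]:
    print(name, fn(actual, predicted))
```

In the toy example the first move (up) is predicted correctly and the second (down) is not, so Dstat is 50%, even though the MSE alone would not reveal the missed direction.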

The results for each dataset are described as follows. It is important to note that, for each dataset, the proposed hybrid (Chaos+CNN+PR) is compared with ARIMA, Prophet, CNN, CART, RF, Chaos+CART [28], Chaos+RF [28], and Chaos+CNN in terms of MSE, MAPE, Dstat, and Theil's U.

The INR/USD test set results of the prediction approaches are presented in Table […]. Among the conventional prediction techniques (ARIMA (0,1,0), Prophet, CNN, CART, RF), CNN followed by RF could produce superior forecasts in terms of MSE, MAPE, and Theil's U. However, it fell short of capturing the change in direction. In this respect, CART could perform better.

Similarly, among the Chaos-based hybrids (Chaos+CART, Chaos+RF, Chaos+CNN), the proposed hybrid, Chaos+CNN, could provide superior forecasts in terms of MSE, MAPE, and Theil's U. However, it could not capture the direction change better than Chaos+CART.

The predictions of the test set of JPY/USD are shown in Fig. 3. The predictions are obtained from CNN, Chaos+CNN, and Chaos+CNN+PR. From the figure, it can be observed that the predictions obtained using Chaos+CNN+PR are much closer to the actual values. It is also worth noting that the predictions obtained using CNN are better than those of Chaos+CNN.

The SGD/USD test set results of the prediction approaches are presented in […] Theil's U. However, it could not capture the direction change better than Chaos+CART.

Figure 4 depicts the predictions of the test set of SGD/USD. The predictions are obtained from CNN, Chaos+CNN, and Chaos+CNN+PR. From the figure, it can be observed that the predictions obtained using Chaos+CNN+PR are much closer to the actual values. It is also worth noting that the predictions obtained using CNN are better than those of Chaos+CNN.

The S&P 500 Stock Index test set results of the prediction approaches are presented in […] Theil's U. However, it could not capture the direction change better than Chaos+CART.

Figure 5 depicts the predictions of the test set of the S&P 500 Stock Index. The predictions are obtained from CNN, Chaos+CNN, and Chaos+CNN+PR. From the figure, it can be observed that the predictions obtained using Chaos+CNN+PR are much closer to the actual values. It is also worth noting that the predictions obtained using Chaos+CNN are better than those of CNN.

[…] (ARIMA (0,1,1), Prophet, CNN, CART, RF), CNN followed by ARIMA could yield better predictions in terms of MSE, MAPE, and Theil's U. However, it could not capture the direction change better. In this respect, CART could perform better.

Similarly, among the Chaos-based hybrids (Chaos+CART, Chaos+RF, Chaos+CNN), the novel hybrid, Chaos+CNN, […]

Figure 6 depicts the predictions of the test set of the Nifty 50 Stock Index. The predictions are obtained from CNN, Chaos+CNN, and Chaos+CNN+PR. From the figure, it can be observed that the predictions obtained using Chaos+CNN+PR are much closer to the actual values. It is also worth noting that the predictions obtained using CNN are better than those of Chaos+CNN.

The Shanghai Composite Index test set results of the prediction approaches are presented in […]

The Crude Oil Price test set results of the prediction approaches are presented in […]

The Gold Price test set results of the prediction approaches are presented in […] Figure 9 depicts the predictions of the test set of Gold Price. The predictions are obtained from CNN, Chaos+CNN, and […]

The Gold Price test set results of the prediction approaches are presented in […] can be observed that the predictions obtained using Chaos+CNN+PR are much closer to the actual values. It is also worth noting that the predictions obtained using CNN are better than those of Chaos+CNN.

Finally, the Diebold and Mariano test [8] is used to formally test whether the forecast accuracy of Chaos+CNN+PR differs, on average, from that of the other forecast models. This test of statistical significance takes the predictions obtained from two approaches as inputs. Table 16 shows […]

[…] also possible to extend the proposed hybrid to various financial and non-financial time series. The regression problem solved here can also be converted into a classification problem. In this context, the approaches proposed in [40-42] are very helpful.
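The Diebold and Mariano comparison mentioned above takes the predictions of two approaches and tests whether their average losses differ. A minimal sketch follows, assuming squared-error loss, a one-step forecast horizon (so no autocovariance correction), and a normal approximation for the p-value; the settings used for Table 16 are not given in this text, and the synthetic data and function name are illustrative only.

```python
import math
import numpy as np

def diebold_mariano(actual, pred_a, pred_b):
    """DM statistic and two-sided p-value under squared-error loss.

    ASSUMPTIONS: one-step horizon and asymptotic normality.
    A negative statistic means model A has the smaller average loss.
    """
    actual = np.asarray(actual, dtype=float)
    loss_a = (actual - np.asarray(pred_a, dtype=float)) ** 2
    loss_b = (actual - np.asarray(pred_b, dtype=float)) ** 2
    d = loss_a - loss_b                      # loss differential series
    variance = d.var()
    if variance == 0.0:
        raise ValueError("loss differential has zero variance")
    stat = d.mean() / math.sqrt(variance / len(d))
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))  # 2 * (1 - Phi(|stat|))
    return stat, p_value

# Synthetic example: model A has small errors, model B has large errors.
rng = np.random.default_rng(1)
actual = rng.standard_normal(200)
pred_a = actual + 0.1 * rng.standard_normal(200)
pred_b = actual + 1.0 * rng.standard_normal(200)

stat, p_value = diebold_mariano(actual, pred_a, pred_b)
print(f"DM statistic = {stat:.3f}, p-value = {p_value:.3g}")
```

Swapping the two prediction series flips the sign of the statistic and leaves the p-value unchanged, so the sign indicates which model has the lower average loss.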


[1] A new look at the statistical model identification

[2] A comparative survey of artificial intelligence applications in finance: artificial neural networks, expert system and hybrid intelligent systems

[3] A deep learning framework for financial time series using stacked autoencoders and long-short term memory

[4] The combination of forecasts

[5] Practical method for determining the minimum embedding dimension of a scalar time series

[6] Computational intelligence and financial markets: a survey and future directions

[7] Nonlinear ensemble prediction of chaotic daily rainfall

[8] Comparing predictive accuracy

[9] A review of two decades of deep learning hybrids for financial time series prediction

[10] Deep learning in finance

[11] Predicting the stock market

[12] Chaos-based support vector regressions for exchange rate forecasting

[13] Forecasting foreign exchange rates with artificial neural networks: a review

[14] Backpropagation applied to handwritten zip code recognition

[15] Convolutional networks for images, speech, and time series

[16] Deep learning

[17] Applications of artificial neural networks in financial economics: a survey

[18] A CNN-LSTM model for gold price time-series forecasting

[19] Financial markets: very noisy information processing

[20] Evaluating accuracy (or error) measures

[21] ARMA models and the Box-Jenkins methodology

[22] Soft computing techniques applied to finance

[23] Geometry from a time series

[24] Financial forecasting through unsupervised clustering and evolutionary trained neural networks

[25] Applications of neural networks in training science

[26] Sur le problème des trois corps et les équations de la dynamique

[27] Forex rate prediction using chaos, neural network and particle swarm optimization

[28] Forex rate prediction using chaos and quantile regression random forest

[29] Forex rate prediction: a hybrid approach using chaos theory and multivariate adaptive regression splines

[30] Soft computing hybrids for forex rate prediction: a comprehensive review

[31] Financial time series prediction using hybrids of chaos theory, multi-layer perceptron and multi-objective evolutionary algorithms

[32] Combining three estimates of gross domestic product

[33] Stock price prediction using LSTM, RNN and CNN-sliding window model

[34] Ozbayoglu AM (2020) Financial time series forecasting with deep learning: a systematic literature review

[35] Detecting strange attractors in turbulence

[36] Applied economic forecasting

[37] Gold volatility prediction using a CNN-LSTM approach

[38] Chaos theory tamed

[39] A case study on using neural networks to perform technical forecasting of forex

[40] Training SVMs on a bound vectors set based on fisher projection

[41] SVMs classification based two-side cross domain collaborative filtering by inferring intrinsic user and item features

[42] A cross-domain collaborative filtering algorithm with expanding user and item features via the latent factor space of auxiliary domains

[43] Stock market prediction based on generative adversarial network


The authors declare that they have no conflict of interest with any author, or organization.