key: cord-0155001-yh9i13a4
authors: Han, Daniel; Orlando, Giuseppe; Fedotov, Sergei
title: Identification of the nature of dynamical systems with recurrence plots and convolution neural networks: A preliminary test
date: 2021-10-20
journal: nan
DOI: nan
sha: ec410ebc6fad8871a33532d47124f72b491ab4d0
doc_id: 155001
cord_uid: yh9i13a4

In this study, we present a method for classifying dynamical systems using a hybrid approach involving recurrence plots and a convolution neural network (CNN). This is performed by obtaining the recurrence matrix of a time series generated from a given dynamical system and then using a CNN to classify the related dynamics observed from the recurrence matrix. We consider three broad classes of dynamics: chaotic, periodic, and stochastic. Using a relatively simple CNN structure, we are able to obtain $\sim 90\%$ accuracy in classification. The confusion matrix and receiver operating characteristic curve of classification demonstrate the strength and viability of this hybrid approach.

Identifying the type of dynamics that is experimentally measured is critical to time series classification, forecasting and statistical inference. This is because different analyses are required for different types of dynamics. In fact, the very essence of a time series differs between a stochastic process, a periodic process and a chaotic process. There has been much work recently to identify different dynamical processes using recurrence plots [1]. Recurrence plots were first introduced to visualize the recurrences of dynamical systems [2]. If $\{x_i\}$, for $i = 1, \dots, N$, is a trajectory of a dynamical system with $N$ points, then the recurrence plot is a binary matrix defined as

$$R_{i,j} = \begin{cases} 1, & \|x_i - x_j\| \le \varepsilon, \\ 0, & \text{otherwise}, \end{cases} \qquad i, j = 1, \dots, N,$$

where $\varepsilon$ is a threshold distance [2, 1]. This concept has been widely applied in various fields such as astrophysics [3, 4, 5], damage detection in engineering [6], molecular dynamics [7, 8], economics [9, 10, 11], and medicine [12, 13]. While recurrence plots and the associated statistical tools, such as spectral analysis [14] and Lyapunov exponents [15, 16, 17], provide assistance in determining the nature of an unknown time series, they cannot be used to conclusively classify the dynamical system which generated the time series. In addition to being purely observational, many of the statistical tools require large amounts of computation or a large dataset in order to achieve high accuracy.
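As an illustration of the recurrence matrix described above, the following is a minimal NumPy sketch, not the toolbox used in this study. The function name `recurrence_matrix`, the Euclidean norm, the example sine series and the threshold value 0.1 are all assumptions made for the example.

```python
import numpy as np

def recurrence_matrix(x, eps):
    """Binary recurrence matrix: R[i, j] = 1 if ||x_i - x_j|| <= eps, else 0."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a scalar series as N points in one dimension
    # Pairwise Euclidean distances between all trajectory points.
    diff = x[:, None, :] - x[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return (dist <= eps).astype(np.uint8)

# Hypothetical example: a 50-point sine wave with an assumed threshold.
series = np.sin(np.linspace(0.0, 4.0 * np.pi, 50))
R = recurrence_matrix(series, eps=0.1)
```

The resulting matrix is symmetric with ones on the main diagonal, since every point recurs with itself.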

On the other hand, machine learning is a well established method for classification and forecasting, especially convolution neural networks (CNNs) for analysing images and matrices of measurements. Among the founders of the CNN we recall Hubel and Wiesel, who proposed a cascading neural model for pattern recognition [18], Fukushima, who introduced the "neocognitron", organized in two layers, the first for convolution and the second for downsampling [19], and Atlas et al., who proposed a "dynamic formal neuron" (i.e. a temporal generalization of the formal neuron) for learning dynamic patterns [20].

To fill the gap in classifying time series according to the dynamical systems that generated them, we propose using a CNN trained on recurrence plots. Recently, many applications combining recurrence plots with CNNs have been presented, such as recognition of physical activity [21], detection of Parkinson's disease [22], forecasting residential energy loads to optimize renewable energy resources [23] and emotion recognition from electroencephalograms [24]. While those applications combining recurrence plots and CNNs have provided more insights on the matter, there has been no study on how to use time series to classify the nature of dynamical systems. In this paper, we tackle the problem by presenting a hybrid method for classifying different types of dynamics, such as stochastic, periodic and chaotic, using both recurrence matrices and CNNs.

The main advantages of this hybrid approach are scalability, practicality and simplicity. Recurrence plots are not computationally expensive to generate, particularly for shorter time series, and CNNs are also easy and inexpensive to operate once trained. Even for training, there exist multiple software implementations in various programming languages that assist in quick prototyping. Furthermore, the ability to classify the underlying dynamical system using a small number of points is particularly useful for economic and historic data, which are limited to short time series. Finally, this hybrid method presents a simple and efficient way that is accessible to many non-specialist users, albeit there is a risk of misuse.

Time series classification (TSC) and, more generally, the problem of identifying the nature of a dynamical system (e.g. stochastic or deterministic) have attracted much attention. On TSC, among the great quantity of methods [25], one can mention the use of the nearest neighbor (NN) classifier coupled with a distance function such as Dynamic Time Warping (DTW). This constitutes a very strong baseline and, when compared to several distance measures, it has been shown that there is no single distance measure that significantly outperforms it [26].
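To make the NN-plus-DTW baseline concrete, here is a minimal sketch of the textbook dynamic-programming form of DTW with a 1-nearest-neighbour rule. It is not taken from the cited works: the function names, the absolute-difference local cost and the toy sine/ramp training set are assumptions chosen for illustration.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance with an absolute-difference local cost."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def nn_dtw_predict(query, train_series, train_labels):
    """Return the label of the training series closest to `query` under DTW."""
    distances = [dtw_distance(query, s) for s in train_series]
    return train_labels[int(np.argmin(distances))]

# Hypothetical toy data: one sine wave and one linear ramp.
t = np.linspace(0.0, 2.0 * np.pi, 30)
train_series = [np.sin(t), t / t.max()]
train_labels = ["sine", "ramp"]
prediction = nn_dtw_predict(np.sin(t + 0.3), train_series, train_labels)
```

The double loop costs time proportional to the product of the two series lengths, which is one reason this baseline becomes expensive on long series.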

Given this, the research has redirected the focus on ensembling methods to outperform the NN classifier coupled with DTW (NN-DTW). The idea is to pre-process time series so that the original data are transformed into a new feature space. After the pre-processing phase, ensemble methods are applied. Those techniques fall within three classes: bag-of-patterns, shape-based, and structure-based.

Bag-of-patterns (BOP) techniques extract substructures of a time series as higher-level features, transform them via a method called Symbolic Aggregate approXimation (SAX) [27] and use the Euclidean distance (ED) as a similarity metric. A variant is the so-called bag-of-SFA-Symbols (BOSS), which "combines the extraction of substructures with the tolerance to extraneous and erroneous data using a noise reducing representation of the time series" [28].

Shape-based techniques combine a similarity metric with 1-nearest-neighbour (1-NN) classification [29, 30] and are used as a benchmark [31]. However, "the problem with shape-based techniques is that they fail to classify noisy or long data containing characteristic substructures" [28].

Structure-based techniques either extract higher-level features or build a model with classical data mining algorithms such as support vector machines (SVMs), decision trees, or random forests [32, 33, 34]. This class also includes shapelets (i.e. classifiers that extract representative variable-length sequences from a time series) [35, 36] and DTW features [34]. Unlike the other techniques of this class, the latter do not transform the entire data set but instead identify a suitable partition, and they outperform NN-DTW [37]. Because of this promising result, research focussed on the development of an ensemble, the so-called Collective Of Transformation-based Ensembles (COTE) classifiers [38] "that does not only ensemble different classifiers over the same transformation, but instead ensembles different classifiers over different time series representations" [25]. A further development is the extended COTE with a Hierarchical Vote system (HIVE-COTE), which uses a hierarchical structure with probabilistic voting. The latter is currently considered the state-of-the-art algorithm with regard to time series classification [37, 25].

As mentioned, HIVE-COTE represents the state of the art, but it is very computationally intensive and therefore impractical [37]. For example, one of its classifiers is the shapelet transform [35], whose time complexity is $O(n^2 l^4)$, where $n$ is the number of time series in the dataset and $l$ is the length of a time series.
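To give a feel for this scaling, the short calculation below treats the bound as if it were tight, i.e. it assumes a cost of the form $T(n, l) = c\, n^2 l^4$ with a hypothetical constant $c$ that is not given in the source.

```latex
\[
T(n, l) = c\, n^{2} l^{4}
\quad\Longrightarrow\quad
\frac{T(n, 2l)}{T(n, l)} = 2^{4} = 16,
\qquad
\frac{T(2n, l)}{T(n, l)} = 2^{2} = 4 .
\]
```

Under this assumption, doubling the series length multiplies the cost by 16, whereas doubling the number of series multiplies it by 4.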

Moreover, "adding to the huge runtime of HIVE-COTE, the decision taken by 37 classifiers cannot be interpreted easily by domain experts, since researchers already struggle with understanding the decisions taken by an individual classifier" [25].

Given the impracticability of the HIVE-COTE algorithm, research has been redirected elsewhere. For example, following the idea of feature extraction and subsequent classification of time series, recurrence patterns have been identified by means of a recurrence plot (RP) and its quantification (RQA) [39]. This is because "evidence suggest that recurrences contain all relevant information about a system's behaviour" [1] and "the explicit representation of such regularities can reveal the underlining mechanisms that generated the data, and thus it is a potentially useful feature to classify time series" [40]. Another advantage of RP/RQA is the ability to identify segments of similar trajectories at arbitrary positions in multivariate time series as well as "the dynamical properties, such as determinism, which reflect the pairwise (dis)similarity" [41]. This approach has been shown to be superior to the classical dynamic time warping distance [41]. More recently, the spatial bag-of-features (SBoF) model and deep convolutional neural networks (CNNs) have been used for forecasting [42] and compared favourably with automated algorithms encompassing autoregressive integrated moving average (ARIMA), exponential smoothing algorithm (ETS), feed-forward neural network with autoregressive inputs (NNET-AR), exponential smoothing state space model with a Box-Cox transformation (TBATS), seasonal and trend decomposition using LOESS with AR modeling of the seasonally adjusted series (STLM-AR), random walk with drift (RW-DRIFT), theta method (THETA), naïve (NAIVE), and seasonal naïve (SNAIVE) [42, 43].

Along these lines, we focus on the nature of a dynamical system and ask whether feature extraction (RP) combined with a classifier (CNN) can determine if it is random, chaotic or periodic. The potential of the suggested approach was recently demonstrated by a comparative analysis of noisy time series [44].

Our data consist of 1,000 time series generated for each system under consideration (i.e. purely stochastic, periodic and chaotic). The systems considered are the Arnold tongue, dyadic transformation, Gaussian noise, logistic map, Rössler attractor and multi-frequency sine waves. These time series, of 50 data points each, are then processed with the CRP toolbox (R32.6) [45] to obtain recurrence plots (see Figure 1).
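The following sketch shows how 50-point series of this kind might be generated for three of the six systems (logistic map, multi-frequency sine waves and Gaussian noise). It is an illustration only: the logistic parameter `r = 4.0`, the sine frequencies, the random initial conditions and phases, and the helper names are assumptions, not the settings used in this study.

```python
import numpy as np

N_POINTS = 50  # series length stated in the text

def logistic_map(n, r=4.0, x0=0.2):
    """Iterate x_{k+1} = r * x_k * (1 - x_k); r = 4 is an assumed chaotic setting."""
    x = np.empty(n)
    x[0] = x0
    for k in range(1, n):
        x[k] = r * x[k - 1] * (1.0 - x[k - 1])
    return x

def gaussian_noise(n, rng):
    """Independent standard normal samples (purely stochastic series)."""
    return rng.standard_normal(n)

def multi_sine(n, rng, freqs=(1.0, 2.5, 4.0)):
    """Sum of sine waves with assumed frequencies and random phases."""
    t = np.linspace(0.0, 1.0, n)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=len(freqs))
    return sum(np.sin(2.0 * np.pi * f * t + p) for f, p in zip(freqs, phases))

def make_dataset(n_per_class, seed=0):
    """Stack n_per_class series per class with matching string labels."""
    rng = np.random.default_rng(seed)
    series, labels = [], []
    for _ in range(n_per_class):
        series.append(logistic_map(N_POINTS, x0=rng.uniform(0.05, 0.95)))
        labels.append("chaotic")
        series.append(multi_sine(N_POINTS, rng))
        labels.append("periodic")
        series.append(gaussian_noise(N_POINTS, rng))
        labels.append("stochastic")
    return np.stack(series), np.array(labels)

X, y = make_dataset(n_per_class=10)
```

Each row of `X` is one time series; converting a row into a recurrence matrix yields one input image for the classifier.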

To classify a given recurrence plot from time series data, we designed a CNN composed of three convolution layers, each with 12 filters and a 5 by 5 kernel and each followed by a max-pooling layer of stride 2, and finally a densely connected network section (see Figure 2). The CNN was trained and tested using TensorFlow in Python 3. Figure 3 shows the confusion or error matrix [46], which allows visualization of the performance of the CNN. The columns represent the instances in a predicted class and the rows represent the instances in an actual class. As shown, over the six types of time series the prediction is 77.5% correct at worst and 99.4% correct at best. A receiver operating characteristic (ROC) curve expresses the diagnostic ability of a binary classifier system [47]. The ROC curve displays the true positive rate (TPR) versus the false positive rate (FPR) as the threshold is varied. Figure 4 shows that the TPR is quite high for low threshold levels, thus confirming the capability of the suggested approach in classifying recurrence plots of time series. The 'Area under the ROC Curve' (AUC) is 0.994 for 'one-vs-one'.

Figure 2: A diagram showing the CNN structure consisting of three convolution layers with 12 filters using a 5 by 5 kernel followed by three max-pooling layers of stride 2 and finally a densely connected network section to produce categorical predictions of different types of trajectories. This figure was generated using the 'visualkeras' package.
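A network with the layer structure described above could be sketched in Keras as follows. This is a hedged illustration rather than the trained model: the 50 by 50 single-channel input, 'valid' padding, 2 by 2 pooling window, ReLU activations, the 64-unit dense layer, the six-class softmax output and the optimizer are all assumptions, as are the names `feature_map_sizes` and `build_cnn`. TensorFlow is imported inside `build_cnn` so that the shape helper runs without it.

```python
def feature_map_sizes(input_size=50, kernel=5, pool=2, pool_stride=2, n_blocks=3):
    """Spatial size after each block of 'valid' convolution + max-pooling."""
    sizes = []
    size = input_size
    for _ in range(n_blocks):
        size = size - kernel + 1                  # 'valid' convolution
        size = (size - pool) // pool_stride + 1   # max-pooling
        sizes.append(size)
    return sizes

def build_cnn(input_size=50, n_classes=6):
    """Three Conv2D(12, 5x5) + MaxPooling2D(stride 2) blocks, then dense layers."""
    import tensorflow as tf  # lazy import: only needed to build the model

    layers = tf.keras.layers
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(input_size, input_size, 1)),
        layers.Conv2D(12, (5, 5), activation="relu"),
        layers.MaxPooling2D(pool_size=2, strides=2),
        layers.Conv2D(12, (5, 5), activation="relu"),
        layers.MaxPooling2D(pool_size=2, strides=2),
        layers.Conv2D(12, (5, 5), activation="relu"),
        layers.MaxPooling2D(pool_size=2, strides=2),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),       # assumed width
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

sizes = feature_map_sizes()
```

Under these assumptions the feature maps shrink quickly, so the flattened input to the dense section is small, which keeps the model inexpensive to train and evaluate.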

In this work, we have demonstrated the effectiveness of using both RPs and CNNs in classifying the nature of dynamical systems from the generated time series. Our results show that this hybrid approach achieves very high accuracy even though the input is only 50 data points per time series, represented by the corresponding RP. The rather high accuracy of the proposed classification is in agreement with the recent literature aimed at comparing noisy time series [44]. Furthermore, the combination of RPs and a CNN is fully scalable in terms of both parallel computing and GPU computing. Moreover, feature extraction through RPs provides a visual classification that domain experts can understand.

This research can be applied in domains such as economics, particularly to business cycles, which are nonlinear and irregular [49]. However, most research has concluded that chaotic models, although sometimes theoretically fascinating, hold little to no practical use [11]. Nevertheless, some studies found that a chaotic model could fit real data well [50, 51, 52], including crashes such as those caused by the COVID-19 pandemic [52].

Thus, this research could provide the link between economic theory and identification of real dynamics based on machine learning classifiers and RQA/RP feature extraction, where the latter has been successfully used to discover hidden dynamics and structural changes in economics [9, 10].

Recurrence plots for the analysis of complex systems

Recurrence plots of dynamical systems

Testing for nonlinearity in radiocarbon data

Stability of terrestrial planets in the habitable zone of Gl 777 A, HD 72659, Gl 614, 47 Uma and HD 4208

Phase asynchrony of the north-south sunspot activity

Damage detection using multivariate recurrence quantification analysis. Mechanical systems and signal processing

Hidden peculiarities in the potential energy time series of a tripeptide highlighted by a recurrence plot analysis: a molecular dynamics simulation

Recurrence analysis of hydration effects on nonlinear protein dynamics: multiplicative scaling and additive processes

RQA correlations on real business cycles time series

Recurrence quantification analysis of business cycles

Business cycle modeling between financial crises and black swans: Ornstein-Uhlenbeck stochastic process vs Kaldor deterministic chaotic model

Use of recurrence plots in the analysis of heart beat intervals

Recurrence quantification in epileptic EEGs

Spectral analysis of signals

A robust method to estimate the maximal Lyapunov exponent of a time series

Distinguishing chaos from noise by scale-dependent Lyapunov exponent

Multiscale analysis of economic time series by scale-dependent Lyapunov exponent

Receptive fields of single neurones in the cat's striate cortex

Neocognitron: A self-organizing neural network model for a mechanism of visual pattern recognition

An artificial neural network for spatio-temporal bipolar patterns: Application to phoneme classification

Classification of recurrence plots' distance matrices with a convolutional neural network for activity recognition

A recurrence plot-based approach for Parkinson's disease identification

Single residential load forecasting using deep learning and image encoding techniques

A recurrence quantification analysis-based channel-frequency convolutional neural network for emotion recognition from EEG

Deep learning for time series classification: a review. Data mining and knowledge discovery

Time series classification with ensembles of elastic distance measures

Experiencing SAX: a novel symbolic representation of time series

The BOSS is concerned with time series classification in the presence of noise

Discovering similar multidimensional trajectories

Searching and mining trillions of time series subsequences under dynamic time warping

Querying and mining of time series data: experimental comparison of representations and distance measures

SFA: a symbolic fourier approximation and index for similarity search in high dimensional datasets

Binary shapelet transform for multiclass time series classification

Using dynamic time warping distances as features for improved time series classification

Classification of time series by shapelet transformation. Data mining and knowledge discovery

Tensorflow: A system for large-scale machine learning

The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances

Time-series classification with cote: the collective of transformation-based ensembles

Recurrence quantification analysis

Time series classification using compression distance of recurrence plots

A recurrence plot-based distance measure

Forecasting with time series imaging

Forecasting functions for time series and linear models

Time series classification based on visualization of recurrence plots

Selecting and interpreting measures of thematic classification accuracy. Remote sensing of Environment

An introduction to ROC analysis

A simple generalisation of the area under the ROC curve for multiple class classification problems

Is the business cycle characterized by deterministic chaos

A discrete mathematical model for chaotic dynamics in economics: Kaldor's model on business cycle

Chaotic Business Cycles within a Kaldor-Kalecki Framework. Nonlinear Dynamical Systems with Self-Excited and Hidden Attractors