key: cord-0058979-3dnlmizz
authors: Kim, Kyutae; Jeong, Jongpil
title: A Hydraulic Condition Monitoring System Based on Convolutional BiLSTM Model
date: 2020-08-19
journal: Computational Science and Its Applications - ICCSA 2020
DOI: 10.1007/978-3-030-58802-1_42
sha: 15462df89dbd02547c41355f21317b598a5a3f23
doc_id: 58979
cord_uid: 3dnlmizz

In this paper, a real-time method for monitoring the condition of a hydraulic system is proposed, based on the fusion of a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network. The CNN extracts features from the time-series data supplied as input, and the BiLSTM learns temporal information from those features. The learned representation is then sent to a sigmoid classifier, which determines whether the system is stable or unstable. The experimental results show that, compared with other deep learning models, this model predicts the condition of the hydraulic system more accurately from the data collected by the sensors.

Mechanical fault diagnosis is an important part of the smart factory and a key element of the fourth industrial revolution [1]. Numerous sensors are installed in a hydraulic system to detect failures, and the data collected from these sensors can be used to monitor whether the equipment is working properly [2, 3]. Recent years have seen growing interest in performing fault detection with the data collected from such sensors [6]. As data volumes increase and computing power advances, there are increasing attempts to take advantage of deep learning. In particular, convolutional neural networks (CNN) have been widely used in classification problems and have shown high performance [4, 5]. However, a CNN relies heavily on extracting high-dimensional features [7]: too many convolutional layers cause vanishing gradients, while too few convolutional layers cannot capture global features [8]. Therefore, this paper proposes a method to monitor the condition of a hydraulic system using a model that combines a CNN with a bidirectional long short-term memory (BiLSTM) network [9]. The data collected from the sensors first enter the CNN for feature extraction and then the BiLSTM network, which captures long-distance dependency information. Finally, the learned features are sent to a sigmoid classifier that identifies the condition of the hydraulic system [10]. This method can be useful for diagnosing the condition of a hydraulic system at actual industrial sites.

The remainder of this paper is organized as follows. Section 2 presents the proposed CBLSTM model. Section 3 introduces the data used in the experiment, the equipment, and the experimental process. Section 4 analyzes the experimental results on the condition of the hydraulic system. Section 5 concludes with a summary and a plan for future study.

In this paper, a model combining a CNN and a recurrent neural network (RNN) is applied to monitor the condition of a hydraulic system. The fusion model, convolutional bidirectional long short-term memory (CBLSTM), captures the correlations in the time-series data that are ignored when a CNN is used alone, and avoids the gradient vanishing and gradient explosion problems frequently encountered in RNNs [11]. The CNN is a model inspired by the way the brain's visual cortex works when it recognizes objects [12]. It has received great attention because of its excellent performance in image classification.
This has led to great strides in machine learning and computer vision applications. A one-dimensional CNN can be used to analyze time-series data collected from sensors. Several filters in the convolution layer filter the input data, and the obtained results are superimposed to produce feature maps [13]. The output of the convolutional layer can be expressed as

$$y_j = f\Big(\sum_i x_i * w_{ij} + b_j\Big),$$

where $y_j$ is the $j$-th output feature map, $x_i$ is the $i$-th feature map of the previous layer, $w_{ij}$ is the $j$-th kernel, $b_j$ is the bias of the $j$-th kernel, $*$ denotes the convolution operation, and $f$ is the activation function.

The pooling layer is usually located immediately after the convolution layer. In the pooling layer, the input data are divided into multiple zones and the maximum value of each zone is extracted to form new data; this is called max pooling. Another pooling method, average pooling, extracts the average value of each zone instead. Data that have passed through the convolution and pooling layers are called a feature map: after the input data undergo the convolution calculation, the activation function is applied, and the feature map is produced as the result passes through the pooling layer [7].

In order for all elements of the original data array to participate equally in the convolution operation, virtual elements must be added to both ends of the array. Using 0 as the virtual element is called zero padding; by adding the appropriate number of zero-padding elements, every element of the original array can participate equally in the operation [14]. The padding scheme in which all elements of the original array participate equally is called full padding (see Fig. 2).

LSTM was first introduced to overcome the gradient vanishing and gradient exploding problems encountered in RNNs. The basic building block of an LSTM is the memory cell, which essentially replaces the hidden layer [15, 16]. To address the gradient vanishing and exploding problems, each memory cell has a recurrent edge whose weight is kept at $w = 1$; the output of this recurrent edge is called the cell state. The structure of the LSTM is shown in Fig. 3. The cell state $C_{t-1}$ of the previous time step is carried forward directly, without being multiplied by any weight, to obtain the cell state $C_t$ of the current time step. The flow of information in a memory cell is controlled by the operations described below. In Fig. 3, $\odot$ denotes element-wise multiplication and $\oplus$ denotes element-wise addition. Also, $x_t$ is the input data at time step $t$, and $h_{t-1}$ is the output of the hidden unit at time step $t-1$. The four boxes each apply a sigmoid ($\sigma$) or tanh activation function to a linear combination obtained by matrix-vector multiplication of the input. A unit whose activation is computed with the sigmoid function is called a gate, and its output feeds an element-wise multiplication ($\odot$). There are three types of gates in an LSTM cell: the forget gate, the input gate, and the output gate.

The forget gate $f_t$ resets the cell state so that the memory cell does not grow indefinitely; in effect, it determines which information is passed on and which is suppressed. It is calculated as

$$f_t = \sigma(W_f[h_{t-1}, x_t] + b_f).$$

The input gate $i_t$ and input node $g_t$ are responsible for updating the cell state and are calculated as

$$i_t = \sigma(W_i[h_{t-1}, x_t] + b_i), \qquad g_t = \tanh(W_g[h_{t-1}, x_t] + b_g).$$

At time step $t$, the cell state is then

$$C_t = f_t \odot C_{t-1} + i_t \odot g_t.$$

The output gate $o_t$ updates the output value of the hidden unit:

$$o_t = \sigma(W_o[h_{t-1}, x_t] + b_o), \qquad h_t = o_t \odot \tanh(C_t).$$
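To make the gate equations above concrete, the following NumPy sketch performs a single LSTM cell step. The weight matrices, biases, and dimensions are arbitrary placeholders chosen for illustration and are not values from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step. W maps [h_(t-1), x_t] to each gate pre-activation."""
    z = np.concatenate([h_prev, x_t])      # combined input [h_(t-1), x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])     # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])     # input gate
    g_t = np.tanh(W["g"] @ z + b["g"])     # input node (candidate cell state)
    o_t = sigmoid(W["o"] @ z + b["o"])     # output gate
    c_t = f_t * c_prev + i_t * g_t         # updated cell state
    h_t = o_t * np.tanh(c_t)               # updated hidden state
    return h_t, c_t

# Example with arbitrary sizes: 4 input features, 8 hidden units.
rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
W = {k: rng.normal(size=(n_hid, n_hid + n_in)) for k in "figo"}
b = {k: np.zeros(n_hid) for k in "figo"}
h, c = np.zeros(n_hid), np.zeros(n_hid)
h, c = lstm_step(rng.normal(size=n_in), h, c, W, b)
```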
Invented in 1997 by Schuster and Paliwal, the bidirectional LSTM was introduced to enable networks to utilize larger amounts of information. BiLSTM connects two hidden layers running in opposite directions to the same output. With this type of structure, information can be obtained simultaneously from the preceding and the subsequent parts of the sequence [7]. BiLSTM does not need to modify the input data, and future input information can be reached from the current state [17]. The structure of the BiLSTM network is shown in Fig. 4. The forward LSTM network generates the feature $\overrightarrow{h}_t$ from the input data, and the backward LSTM network generates $\overleftarrow{h}_t$. The BiLSTM network then produces a vector $P_t$ at time step $t$ by combining the two:

$$P_t = \big[\overrightarrow{h}_t, \overleftarrow{h}_t\big].$$

The model used in this paper applies dropout with probability 0.2, which helps prevent overfitting. In addition, the ReLU function is used as the activation function, and the sigmoid function is used for the last fully connected layer. Binary cross-entropy is used as the loss function because this is a binary classification model [18]. The sigmoid function and the cross-entropy error are given by

$$y = \mathrm{Sigmoid}(x) = \frac{1}{1 + e^{-x}}, \qquad E = -\sum_n \big[t_n \log y_n + (1 - t_n)\log(1 - y_n)\big].$$

In the sigmoid equation, as $x$ moves away from 0 in the negative direction, $y$ approaches 0 because the denominator grows; as $x$ moves away from 0 in the positive direction, $y$ approaches 1 because $e^{-x}$ approaches 0. Here $t_n \in \{0, 1\}$ and $y_n \in [0, 1]$, with $y_n = \mathrm{Sigmoid}(\mathrm{net})$.

The optimization algorithm searches for the point at which the loss function is minimal, and a good optimizer is essential for reaching the global optimum efficiently and reliably. In this model, RMSProp is used; it overcomes Adagrad's drawback of stagnating learning caused by an ever-shrinking update amount [19].

This paper applies the fusion of a CNN and an RNN to monitor the condition of the hydraulic system, combining two networks: a CNN and a BiLSTM. CBLSTM effectively resolves the problem that the mutual relations within the input data are ignored by a CNN alone, and it prevents gradient vanishing and explosion [20, 21]. The CBLSTM model is proposed to further improve the accuracy of the predictions [22]. It consists of three parts. First, the input data are fed into the model, which extracts local features from the time-series data, primarily using one-dimensional convolutions for filtering. Second, the BiLSTM network captures long-distance dependency information. Finally, what has been learned in the previous layers is passed to a classifier that determines the condition of the hydraulic system using the sigmoid function. Figure 5 and Fig. 6 show the framework and diagram of the CBLSTM model.

In this experiment, time-series data collected by sensors installed in a hydraulic system test rig were used. The test rig consists of a working circuit, a cooling circuit, and a filtration circuit, and the installed sensors measure various physical quantities such as pressure, vibration, and temperature. Each sensor has its own sampling rate, which varies from 1 Hz to 100 Hz. The hardware platform used in this experiment was an Intel Core i7-8700K CPU @ 3.70 GHz with 32 GB of RAM, and the software was Python 3.7 and TensorFlow, as shown in Table 1.

The model used in this paper is CBLSTM, a fusion of a CNN and a BiLSTM. It consists of two one-dimensional convolution layers followed by one max pooling layer, two bidirectional LSTM layers, and a fully connected layer. The parameter settings and a diagram of this model are shown in Table 2; a minimal sketch of the layer stack is given below.
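The following Keras sketch reflects the structure stated in the text (two one-dimensional convolution layers, one max pooling layer, two bidirectional LSTM layers, dropout of 0.2, ReLU activations, a sigmoid output, binary cross-entropy, and RMSProp). The filter counts, kernel sizes, LSTM unit counts, input window length, and channel count are placeholder assumptions, since the exact values of Table 2 are not reproduced here.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cblstm(window_len=600, n_channels=17):
    """Sketch of the CBLSTM stack; window_len and n_channels are assumptions."""
    model = models.Sequential([
        layers.Input(shape=(window_len, n_channels)),
        # Local feature extraction with 1D convolutions (ReLU activations).
        layers.Conv1D(64, kernel_size=3, activation="relu", padding="same"),
        layers.Conv1D(64, kernel_size=3, activation="relu", padding="same"),
        layers.MaxPooling1D(pool_size=2),
        # Long-range temporal dependencies captured in both directions.
        layers.Bidirectional(layers.LSTM(64, return_sequences=True)),
        layers.Bidirectional(layers.LSTM(32)),
        layers.Dropout(0.2),
        # Binary classifier: stable vs. unstable condition.
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=tf.keras.optimizers.RMSprop(),
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

The dropout layer is placed before the final dense layer here; the paper does not state its exact position, so this is one plausible arrangement.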
Three models were additionally tested to compare the performance of the CBLSTM with other models: the basic LSTM model, a bidirectional LSTM, and a CLSTM model combining a CNN and an LSTM.

First, for the LSTM model, the loss decreased rapidly as the epochs progressed in both training and test. It started above 0.6 and fell to 0.5 as learning progressed. The accuracy, on the other hand, did not improve significantly: it was 0.6 at the beginning of training and remained around 0.7 after training was completed. The final test result was a loss of 0.512 and an accuracy of 0.680. These results were not satisfactory, although further training might have improved them (see Fig. 7 and 8).

The second experiment used the bidirectional LSTM. In this case, as with the LSTM, the loss dropped sharply in both training and test: it started at 0.65 and rapidly decreased to 0.45 as learning progressed. In the test set, the loss spiked at epoch 12, but this did not significantly affect performance. The accuracy was about 0.6 at first and increased with the epochs, finally reaching 0.708 on the test set (see Fig. 9 and 10).

The combination model of CNN and LSTM (CLSTM) was quite unstable. Its test loss fluctuated considerably, increasing significantly at epochs 3 and 6, and in the end it was reduced only to 0.52. Its accuracy improved as learning progressed, with no significant difference between the training and test sets, and was finally recorded as 0.742, better than the LSTM or the BiLSTM alone (see Fig. 11 and 12).

The training parameters were a dropout of 0.2, 15 epochs, and a batch size of 16, with 20% of the total dataset used for the training set and 25% for the validation set. Graphs of the loss and accuracy were recorded during training. Looking at the experimental results of the CBLSTM, the loss decreased stably on the training set, but on the test set it increased significantly at epoch 8. The test accuracy was lower than the training accuracy, but still reasonable (see Fig. 13 and 14).

To compare with the other models, the results of the three additional models and the CBLSTM are plotted together in Fig. 15 and 16. Looking at the loss first, all four models decreased as learning progressed; the LSTM and BiLSTM showed similar patterns, as did the CLSTM and CBLSTM. The simple LSTM model only reduced the loss to around 0.5, the BiLSTM to 0.45, the CLSTM to 0.43, and the CBLSTM to 0.35. Looking at the accuracy, the LSTM was the lowest and the CBLSTM the highest: the accuracy of the CBLSTM rose to 0.83, that of the CLSTM to 0.8, and those of the BiLSTM and LSTM to 0.75. The CLSTM was notably unstable; for example, its accuracy dropped sharply between epochs 12 and 14 (see Fig. 15 and 16). Table 3 shows the loss and accuracy of the training and test sets for the four models. A rough sketch of the training and evaluation setup described above follows.
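This sketch reuses the build_cblstm function from the earlier sketch. The epoch count, batch size, and validation fraction follow the values stated in the text, while the arrays X_train, y_train, X_test, and y_test are random placeholders standing in for the windowed sensor data and the stable/unstable labels.

```python
import numpy as np

# Random placeholder data standing in for windowed sensor readings and labels;
# shapes follow the assumptions used in the build_cblstm sketch above.
X_train = np.random.rand(200, 600, 17).astype("float32")
y_train = np.random.randint(0, 2, size=(200, 1)).astype("float32")
X_test = np.random.rand(50, 600, 17).astype("float32")
y_test = np.random.randint(0, 2, size=(50, 1)).astype("float32")

model = build_cblstm(window_len=600, n_channels=17)

# Training parameters as stated in the text: 15 epochs, batch size 16,
# 25% of the training data held out for validation.
history = model.fit(X_train, y_train,
                    epochs=15, batch_size=16,
                    validation_split=0.25, verbose=2)

# Final evaluation on the held-out test set (cf. Table 3).
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)
print(f"test loss: {test_loss:.3f}, test accuracy: {test_acc:.3f}")
```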
In this paper, we proposed a method to classify the condition of a hydraulic system as stable or unstable from data collected by its sensors, using a model combining a CNN and a BiLSTM. The accuracy of the CBLSTM reached 81%, higher than that of the other three models tested on the same dataset, and its loss was also lower, showing superior performance. In a future study, we will add attention modules to the CBLSTM network to further develop the model and apply it to other manufacturing domains.

References

[1] Information in conversion era: impact and influence from 4th industrial revolution
[2] Condition monitoring of a complex hydraulic system using multivariate statistics
[3] D8.1 - Detecting and compensating sensor faults in a hydraulic condition monitoring system
[4] Face recognition: a convolutional neural-network approach
[5] CNN architectures for large-scale audio classification
[6] A method of sensor fault detection and identification
[7] Research on a real-time monitoring method for the wear state of a tool based on a convolutional bidirectional LSTM model
[8] Densely connected convolutional networks
[9] Bidirectional LSTM-CRF models for sequence tagging
[10] The influence of the sigmoid function parameters on the speed of backpropagation learning
[11] Recurrent neural network based language model
[12] ImageNet classification with deep convolutional neural networks
[13] Deep convolutional neural networks on multichannel time series for human activity recognition
[14] Cyclic prefixing or zero padding for wireless multicarrier transmissions?
[15] What are the differences between long-term, short-term, and working memory?
[16] Activation, attention, and short-term memory
[17] Action recognition in video sequences using deep bi-directional LSTM with CNN features
[18] Minimum cross entropy thresholding
[19] Variants of RMSProp and Adagrad with logarithmic regret bounds
[20] Bidirectional recurrent convolutional networks for multi-frame super-resolution
[21] Bidirectional recurrent convolutional neural network for relation classification
[22] Bidirectional-convolutional LSTM based spectral-spatial feature learning for hyperspectral image classification