key: cord-0898181-e2gu3pf3
authors: Tavoosi, Jafar; Mohammadzadeh, Ardashir; Jermsittiparsert, Kittisak
title: A review on type-2 fuzzy neural networks for system identification
date: 2021-03-09
journal: Soft comput
DOI: 10.1007/s00500-021-05686-5
sha: 25c034fde59c8a34a1510bed43954fe916ca00d4
doc_id: 898181
cord_uid: e2gu3pf3

In many engineering problems, the system dynamics are uncertain, and accurate dynamic modeling is therefore required. Type-2 fuzzy neural networks (T2F-NNs) are extensively used in system identification problems because of their strong estimation capability. In this paper, the application of T2F-NNs is reviewed and classified. First, an introduction to the principles of system identification is presented, including how to extract data from a system, persistency of excitation, preprocessing of data, removal of outliers, and ordering of the data used to train the T2F-NNs. Then, various learning methods for the structure and parameters of T2F-NNs are reviewed and analyzed. A number of different T2F-NNs that have been used for system identification are reviewed, and their advantages and disadvantages are described. Their efficiency in different applications is also reviewed. Finally, the outlook for this field is discussed and its challenges are analyzed.

With mathematical relationships between the variables of a physical system, it is possible to improve performance, predict behavior, and control that system. One way to obtain these relationships is to use the basic laws of physics, chemistry, and so on. However, given the complexity of modern systems, the heavy computation involved, and the lack of detailed information about the system, this first-principles approach is of limited use. Instead, the mathematical relations of a (generally unknown) system can be obtained from an input-output dataset, which is called system identification.
By applying an input to a physical system, the corresponding output can be obtained; from such input-output pairs and various system identification methods, the mathematical relationship between the input and output of the system can then be derived. Serious and extensive work on system identification began in the early 1970s. These studies addressed basic issues such as identifiability, different identification strategies, and the convergence and uniqueness of the estimates. The algorithms proposed there are more efficient for linear systems and generally do not show the required efficiency for nonlinear systems (Nelles 2001). In system identification, the structure of the model (the so-called model framework) must first be determined, and then the unknown parameters of the model must be determined. Different model choices lead to different system identification methods; models ranging from classic polynomial models to newer fuzzy and neural models have been used to identify nonlinear dynamic systems and are discussed in various books and papers. In Nelles (2001), various linear and nonlinear structures are introduced and analyzed. That book ranges from the classic Kolmogorov-Gabor polynomial identification methods and the Volterra series to modern fuzzy, neural, and fuzzy neural models. In Thoma et al. (2010), Wiener and Hammerstein block models are discussed for system identification. In Ruano (2005), fuzzy, neural, and fuzzy neural models are discussed for system identification.

T2F-NNs have received much attention in the last ten years, offering more capabilities and flexibility than their type-1 counterparts (Eyoh et al. 2018; Tavoosi et al. 2011a, 2016, 2017a, 2017b; Tavoosi xxxx). T2F-NNs have high approximation accuracy, so these tools can be used wherever an accurate model is needed. Unlike in type-1 fuzzy neural networks (T1F-NNs), the secondary membership in T2F-NNs is not a crisp value.
The secondary membership in T2F-NNs is also a fuzzy set, so type-2 fuzzy logic (T2FL) systems have one more degree of freedom. In many problems, the application of T2F-NNs is more efficient. For example, the adaptive inverse control method requires an exact inverse model, for which T2F-NNs can be used (Kien et al. 2020; Tavoosi et al. 2011b; Zhao et al. 2017). In predicting the future of a dynamic system such as a stock market or the weather, a recurrent type-2 fuzzy system can be a good model (José Ángel Barrios 2020; Eyoh et al. 2020; Narges Shafaei Bajestani 2017). In data separation and classification, a precise type-2 fuzzy system can categorize and separate data more accurately, which is very common in telecommunications.

Recently, T2F-NNs have been used extensively in various applications. For example, in (Mohammadzadeh and Kayacan 2020), nonlinear system modeling with T2F-NNs is studied, and it is proved that T2F-NNs outperform their type-1 counterparts. In (Sabzalian et al. 2019), the application of T2F-NNs to a control problem is investigated; an adaptive controller is designed and its robustness against uncertainty is shown. In (Failed 2012), the application of fuzzy rough systems to the intrusion detection problem is studied, and it is shown that the use of fuzzy logic systems (FLSs) improves the accuracy and decreases the false alarm rate. In (Kanimozhi 2019), a fuzzy prediction system is designed and used for cancer prediction. In (Ganapathy et al. 2014), a pattern classification methodology is developed using FLSs, and it is concluded that FLSs improve the detection accuracy significantly. In (Sethukkarasi 2014), a temporal mining procedure is designed and its reliability is investigated. In (Nancy et al. 2020), a feature selection mechanism based on FLSs is introduced and its application in attack detection systems is studied.
In addition to the structure, the learning method also affects the estimation performance of fuzzy neural networks (FNNs). Various optimization methods have been applied to tune both the parameters and the rules, such as particle swarm optimization (Deng et al. 2020a; Kacimi et al. 2020), quantum-inspired differential evolution (Su and Yang 2011; Deng et al. xxxx; Deng et al. 2020b), differential evolution (Deng et al. 2020c), the extreme learning approach (He et al. 2019), fractional-order learning rules (Mohammadzadeh and Kumbasar 2020a), and consensus learning (Shi et al. 2020).

The contributions of this study are summarized as follows: (1) System identification methods are classified. (2) Various learning methods for the structure and parameters of T2F-NNs are reviewed and analyzed. (3) The applications of type-1 and type-2 FLSs in different problems are investigated, and their advantages and drawbacks are discussed.

System identification in one sentence means "finding the mathematical relationship between the input and output of a system, using input-output pairs of that system." For state-space equations, the input in the identification process is replaced by the current state and the output by the state at the next time step. Dynamic system identification is based on data observed from the real system and has wide applications in many fields. In systems control and engineering, system identification methods are used to extract appropriate models for control, design, prediction algorithms, or simulation. Figure 1 shows the schematic of the model together with the system for system identification. In Fig. 1, $y$ is the system output, $\hat{y}$ is the model output, and $u$ is the input signal. The error $e$ is the difference between the output of the model and the output of the system and is used to adjust the parameters of the model; different algorithms can be used for this purpose, which are described in detail in (Nelles 2001).
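The scheme of Fig. 1 can be illustrated with a minimal sketch. It assumes, purely for illustration, a first-order linear model $\hat{y}(k) = a\,y(k-1) + b\,u(k-1)$ and ordinary least squares as the adjustment algorithm; the data-generating "system", the function name `fit_first_order_model`, and the coefficient values are hypothetical and are not taken from the reviewed papers.

```python
import numpy as np

def fit_first_order_model(u, y):
    """Least-squares fit of y(k) = a*y(k-1) + b*u(k-1) (an assumed linear model)."""
    # Regressor matrix: one row [y(k-1), u(k-1)] for each sample k = 1..N-1
    phi = np.column_stack([y[:-1], u[:-1]])
    target = y[1:]
    theta, *_ = np.linalg.lstsq(phi, target, rcond=None)
    y_hat = phi @ theta      # model output
    e = target - y_hat       # identification error between system and model
    return theta, e

# Hypothetical "system", used only to generate input-output data
rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, size=200)
y = np.zeros(200)
for k in range(1, 200):
    y[k] = 0.8 * y[k - 1] + 0.5 * u[k - 1]

theta, e = fit_first_order_model(u, y)
rmse = np.sqrt(np.mean(e ** 2))
print("estimated parameters:", theta, "RMSE:", rmse)
```

For a nonlinear model such as a T2F-NN, the same error signal would instead drive an iterative learning rule, since the parameters no longer enter the model linearly.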
It should be noted that structural training can be used first and, after the structure is stabilized, parametric training can be used to tune it; alternatively, the structure can be considered fixed and only parametric training can be used.

Fig. 1 Schematic of system identification

The system identification process includes the steps described below.

Which combination of delayed input and output signals is used is very important in identifying a dynamic system. This part is usually determined by trial and error and with the help of prior knowledge of the system. Four methods have been introduced for the initial selection of inputs:
(A) Use of all inputs: In this case, the dimension of the problem grows dramatically, because a large number of inputs leads to a large number of membership functions (MFs); as a result, the number of model parameters becomes very large, more data are needed, and the identification time increases.
(B) Use of all combinations: This method is very difficult in practice, because the number of combinations is very large, and if more delays are used, the dimension of the problem becomes even larger.
(C) Unsupervised input selection: In this approach, irrelevant inputs are discarded to reduce computation. One of the methods proposed for this purpose is principal component analysis.
(D) Supervised input selection: In this method, the set of inputs is selected so that the difference between the model and the system is minimal. In linear models, this method is the same as correlation analysis, and in nonlinear models, it is a complex optimization problem (Mohammadzadeh and Kayacan 2020).

This step requires knowledge of the system and its performance, as well as the purpose of modeling. For black-box modeling, the input signals of the system are important sources of information, so accuracy at this stage is very important for this type of system (Sabzalian et al. 2019; Failed 2012).
The excitation signal must excite all system modes without damaging the system.

Model structure selection is one of the most important steps in identifying nonlinear systems. To choose the structure of the model, the following should be considered:
- The type of system (static or dynamic).
- The purpose of the problem (simulation, optimization, control, etc.).
- The number of inputs and outputs of the problem and their ranges.
- The quantity and quality of the data (for example, if the data are scattered and noisy, global optimization methods work better than local optimization).
- The constraints of the problem (for example, model training time can be important in some problems).
- Offline and online learning methods.
- The simplicity and applicability of the model (hardware and software).

In a dynamic system, the output can depend on the inputs and outputs at other time instants. For example, the NARX (nonlinear ARX) model is described by Eq. (1). The purpose of nonlinear dynamic system identification is to find an approximation of the unknown function $f(\cdot)$ in Eq. (1). The system in Eq. (1) is single-input single-output (SISO), which can be generalized to multi-input multi-output (MIMO) systems. In a SISO system, the input $u(k)$ is applied to the system and the output $y(k)$ is taken from it. If the system is dynamic, however, this dynamic behavior must be supplied to the model so that it can capture both the transient and the steady-state response. In other words, the inputs and outputs of previous time steps must also be used (Kanimozhi 2019). There are two ways to do this, known as external dynamics and internal dynamics. A nonlinear dynamic model can be used in two ways: (1) as a prediction model (series-parallel), or (2) as a simulation model (parallel).
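The two ways of using a NARX-type model can be sketched as follows. This is only an illustration under stated assumptions: the function `f_model`, its two output lags and one input lag, and its coefficients are hypothetical stand-ins for an identified nonlinear map $f(\cdot)$, not the model of any cited work.

```python
import numpy as np

def f_model(y1, y2, u1):
    """Hypothetical identified map: y(k) = f(y(k-1), y(k-2), u(k-1))."""
    return 0.6 * y1 - 0.1 * y2 + 0.3 * np.tanh(u1)

def predict_one_step(u, y_meas):
    """Prediction (series-parallel) mode: regressors use MEASURED past outputs."""
    y_hat = np.zeros(len(u))
    y_hat[:2] = y_meas[:2]
    for k in range(2, len(u)):
        y_hat[k] = f_model(y_meas[k - 1], y_meas[k - 2], u[k - 1])
    return y_hat

def simulate(u, y_init):
    """Simulation (parallel) mode: regressors use the MODEL'S OWN past outputs."""
    y_hat = np.zeros(len(u))
    y_hat[:2] = y_init
    for k in range(2, len(u)):
        y_hat[k] = f_model(y_hat[k - 1], y_hat[k - 2], u[k - 1])
    return y_hat

rng = np.random.default_rng(1)
u = rng.uniform(-1.0, 1.0, size=100)
y_meas = simulate(u, (0.0, 0.0))      # noise-free data generated by the model itself
y_pred = predict_one_step(u, y_meas)  # coincides with y_meas when the model is exact
```

When the model is exact and the data are noise-free, both modes give the same output. With model mismatch, the simulation mode feeds its own errors back into the regressors, which is why it is usually the harder test of an identified model.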
In the prediction mode, the measured inputs and outputs of the system at previous time steps are used to predict the output at each step, whereas in the simulation mode, only the system input and the model's own past outputs are used (Nelles 2001; Ganapathy et al. 2014). Figure 2 shows both model modes.

In computer control systems, sampling of continuous signals causes information to be lost. Therefore, the sampling frequency must be selected so that it causes no problem in controlling the system. Although higher sampling frequencies may look better, they can cause the following problems (Sethukkarasi 2014; Nancy et al. 2020; Deng et al. 2020a):
- The amount of learning data becomes very large, and the training time is long.
- Where the data do not change much, nearly identical learning data are repeated, which only costs extra memory and training time.

The sampling time can be optimized for simple problems, but for real systems this is not practical. Three commonly used methods for determining the sampling time are as follows (Kacimi et al. 2020).
Smallest time constant: If $\tau_{\min}$ is the smallest time constant of the system, then the sampling time is selected as follows:
Bandwidth: If $f_0$ is the cutoff frequency of the system, a good choice for the sampling frequency is:
Settling time: If $T_{st}$ is the settling time of the system, the sampling time can be selected as follows:

In one classification, a system model can be a state-space model or an input-output model. Each has advantages and disadvantages, and the preferred choice depends on the purpose of modeling (Su and Yang 2011). However, it should be noted that a wider range of nonlinear systems can be approximated with the input-output model. On the other hand, most nonlinear control theory methods are defined in terms of the state-space representation. In many cases, the state-space method is effective.
For example, in multivariable control problems, the state-space method usually performs better, and fewer parameters are needed to identify these systems with the state-space method (Deng et al. xxxx). It should also be noted that a state-space model can always be converted to an input-output model, but the reverse conversion is not always possible. Despite these advantages of the state-space model, state-space models are recurrent, and recurrent systems are very difficult and laborious to identify. Therefore, identifying the input-output model is usually preferred, because it covers a wide range of nonlinear dynamic systems and is easier to work with (Deng et al. 2020b). To identify the state-space model, consider the following system. In Eq. (5), $u(k)$ is the input of the system, $x(k)$ is the state of the system, and $x(k+1)$ is the state at the next time step, which is considered as the output. Therefore, this system can be identified like the input-output model, with $x(k)$ and $u(k)$ taken as inputs and $x(k+1)$ as the output.

Type-2 fuzzy logic (T2FL) has shown better performance than type-1 fuzzy logic (T1FL) (Deng et al. 2020c; He et al. 2019; Mohammadzadeh and Kumbasar 2020a; Shi et al. 2020; Son et al. 2020). T2F-NNs are classified as feedforward or recurrent, as Mamdani (linguistic) (Ayala et al. 2020) or TSK (Tim Oliver Heinz 2017), and as interval or general. Each of these categories has its own characteristics and features. Depending on the type of system to be identified, each of these categories can be used. For example, if the system has strong dynamics, that is, its output depends heavily on past time steps, a recurrent T2F-NN performs better (Hong et al. 2008). If more qualitative information about the system is available, the Mamdani model works better.
Finally, if the model must be very accurate (and time is less important), the general model is better (Prawin et al. 2020). In what follows, type-2 fuzzy sets (T2FSs) are described first, and then type-2 fuzzy neural structures are reviewed.

T2FSs have more parameters than type-1 fuzzy sets and therefore have a greater ability to handle uncertainties. These sets are divided into two types, interval (Fig. 3) and general (Fig. 4). Interval T2FSs are more commonly used because of their lower computational cost and ease of use. In interval T2FSs, the third dimension is equal to 1, so they can be displayed in two dimensions, whereas in general T2FSs, the third dimension is a fuzzy set, so their display is three-dimensional. A general T2FS formula is as follows: Interval T2FSs are special cases of general T2FSs that hold under the following condition: $f_x(\mu) = 1$.

Given the advantages of T2FL over T1FL, it is necessary to upgrade type-1 fuzzy neural networks (T1F-NNs) to T2F-NNs (Schoukens et al. 2017). T2F-NNs respond well under uncertain conditions and inaccurate data and are able to approximate a variety of systems. These networks can substantially reduce the effects of modeling uncertainty. Major work on T2F-NNs dates back to 2008 (Yukai et al. 2020). For example, in (Zhao et al. 2016) a T2F-NN has been used for nonlinear dynamical system identification. In that paper, asymmetric type-2 fuzzy MFs are used. In (Tsimbinos and Lever 1994), a T2F-NN has been used for function approximation, trained with a hybrid algorithm that combines PSO and recursive least squares. Also, in that paper, uncertainty in the degree of membership has been used for the MFs.

A class of T2F-NNs is named interval T2F-NNs because it uses interval T2FSs. Interval T2F-NNs are divided into two categories: simplified (singleton) T2F-NNs and TSK T2F-NNs.
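To make the notion of an interval T2FS concrete, the sketch below evaluates one commonly used form, a Gaussian primary MF with a fixed center and an uncertain width. The choice of this MF shape, the function name `it2_gaussian`, and the numerical values are assumptions for illustration; the reviewed papers use several different MF shapes.

```python
import numpy as np

def it2_gaussian(x, center, sigma_lower, sigma_upper):
    """Lower and upper membership grades of an interval type-2 Gaussian MF.

    Assumes a fixed center and an uncertain width in [sigma_lower, sigma_upper].
    The band between the two curves is the footprint of uncertainty; for an
    interval set the secondary grade equals 1 everywhere inside that band.
    """
    x = np.asarray(x, dtype=float)
    lower = np.exp(-0.5 * ((x - center) / sigma_lower) ** 2)
    upper = np.exp(-0.5 * ((x - center) / sigma_upper) ** 2)
    return lower, upper

x = np.linspace(-3.0, 3.0, 61)
lower, upper = it2_gaussian(x, center=0.0, sigma_lower=0.5, sigma_upper=1.0)
```

A type-1 Gaussian MF is recovered as the special case `sigma_lower == sigma_upper`, in which the two curves coincide.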
Because of the difficulties of type-2 fuzzy computation and of type reduction (from type-2 to type-1), Mamdani (general/linguistic) models have not received much attention. In the following, some of the most widely used T2F-NNs are introduced.

The singleton structure is a zero-order Takagi-Sugeno-Kang (TSK) T2F-NN. The output of this network is a single crisp number. In a fuzzy rule of this network, $u_1$ and $u_2$ are the inputs, $\tilde{A}_1^f$ and $\tilde{A}_2^f$ are interval T2FSs, the network output is $y_1$, and $w^f$ is a crisp number. Figure 3 shows a singleton T2F-NN.

The two-valued structure is a kind of TSK T2F-NN. The difference between this structure and the singleton (zero-order TSK) structure is that here the output of each fuzzy rule consists of two numeric values, whereas in zero-order TSK the output is a single crisp number. In a fuzzy rule of these networks, the symbols are the same as in Sect. 3.2.1, except that $w_L^f$ and $w_R^f$ are the left and right values of the output. Figure 4 shows a two-valued T2F-NN.

Various structures have been introduced for the TSK category. In a fuzzy rule of these networks, the antecedent parameters and variables are the same as in Sect. 3.2.1. The coefficients $C_{k,i}$ can be expressed in different ways: for example, as crisp numbers, interval numbers, or type-1 fuzzy sets (T1FSs). If interval numbers are chosen for the coefficients $C_{k,i}$, each is described by $c_{k,i}$ and $s_{k,i}$, the center and spread of the $i$th input coefficient in rule $k$, respectively. The coefficients $C_{k,i}$ have also been chosen as T1FSs. This improves network accuracy but also increases training time. If T1FSs are chosen for the coefficients, the fuzzy rules can be written as:
$R^k$: if $u_1$ is $\tilde{A}_1^k$ and $u_2$ is $\tilde{A}_2^k$ then $\tilde{y}_k = \tilde{r}_k + \tilde{p}_k u_1 + \tilde{q}_k u_2$
where $\tilde{r}_k$, $\tilde{p}_k$, and $\tilde{q}_k$ are T1FSs. The consequent part of the fuzzy rules can also be a nonlinear function.
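As an aside on the interval-valued coefficients mentioned above, the following sketch shows how the output interval of a single rule could be computed. It assumes that each coefficient is the interval $[c - s,\; c + s]$ built from its center and spread, and that the consequent is linear in the inputs with a bias term; under interval arithmetic, multiplying such a coefficient by a crisp input $u$ gives $[c\,u - s\,|u|,\; c\,u + s\,|u|]$. The function name and the numbers are hypothetical.

```python
import numpy as np

def interval_consequent(u, c, s):
    """Left and right end points of one rule output with interval coefficients.

    u : crisp inputs, length n
    c : coefficient centers, length n + 1 (bias term first)
    s : coefficient spreads, length n + 1 (non-negative, bias term first)
    Assumes each coefficient is the interval [c_i - s_i, c_i + s_i].
    """
    u_ext = np.concatenate(([1.0], np.asarray(u, dtype=float)))
    mid = np.dot(c, u_ext)             # output obtained with the centers only
    half = np.dot(s, np.abs(u_ext))    # half-width contributed by the spreads
    return mid - half, mid + half

y_left, y_right = interval_consequent(u=[2.0, -1.0],
                                      c=[1.0, 0.5, 2.0],
                                      s=[0.1, 0.2, 0.3])
```

With all spreads set to zero, the two end points coincide and the rule reduces to an ordinary crisp first-order TSK consequent.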
Nonlinear functions such as trigonometric functions, exponential functions, and the Volterra series can be used in the then part of the fuzzy rules; for example, type-2 fuzzy rules with a nonlinear then part can be written as follows:
$R^k$: if $u_1$ is $\tilde{A}_1^k$ and $u_2$ is $\tilde{A}_2^k$ then $y_k = r_k + p_k u_1 + q_k u_2 + s_k u_1 u_2 + t_k u_1^2 + \ldots$
$R^k$: if $u_1$ is $\tilde{A}_1^k$ and $u_2$ is $\tilde{A}_2^k$ then $y_k = r_k \cos(u_1) + q_k \sin(u_1) + s_k \cos(u_2) + t_k \sin(u_2) + \ldots$
It should be noted, however, that in this case the underlying philosophy of the fuzzy system is called into question, because the goal of fuzzy logic was to eliminate or reduce complex mathematical relationships. Therefore, these models should be avoided as far as possible, unless other models do not work properly and nonlinear consequents are unavoidable.

Unlike the TSK model, the Mamdani model contains no mathematical equation; both the if and then parts of the fuzzy rules are completely qualitative. This model is closer to the human way of thinking.
$R^k$: if $u_1$ is $\tilde{A}_1^k$ and $u_2$ is $\tilde{A}_2^k$ then $\tilde{y}_k$ is $\tilde{B}_k$
where $\tilde{B}_k$ is a T2FS. These networks are computationally very expensive and require type reduction algorithms (such as Karnik-Mendel), so they are not widely used.

Fig. 4 A two-valued T2F-NN

The following shows how the output is calculated in a Mamdani type-2 fuzzy system. Consider a Mamdani type-2 fuzzy system with the following two rules:
if $x$ is $\tilde{A}_1^k$ and $y$ is $\tilde{B}_1^k$ then $z$ is $\tilde{G}_1^k$
if $x$ is $\tilde{A}_2^k$ and $y$ is $\tilde{B}_2^k$ then $z$ is $\tilde{G}_2^k$
Suppose the above two rules are shown in Fig. 3, and suppose two inputs $x$ and $y$ are applied to the system. In this case, the calculation of the output is shown in Figs. 5, 6, 7, 8, 9, 10, 11. At this stage, the lowest membership degree is selected. In other words, the minimum of the upper membership values and the minimum of the lower membership values are selected and applied to the consequent. As can be seen in Fig. 5, the region between the lower minimum values and the upper minimum values in the then part of the rules, shown as the hatched region, is a type-2 fuzzy number. Then the maximum of the MFs of the then parts is calculated (Fig. 6). The final output of the Mamdani type-2 fuzzy system (Fig. 6) is a type-2 fuzzy number, and to use this number, it must be converted to a real number (type reduction). Figure 10 shows the KM algorithm for the type reduction process. In Fig. 10, the points $C_l$ and $C_r$ are called switching points. In this case, a type-2 fuzzy number is converted to two type-1 fuzzy numbers (Fig. 11). Then, using defuzzification methods, the crisp values equivalent to the T1FSs can be extracted.

Supervised learning methods are divided into three categories: linear, local nonlinear, and global nonlinear. Linear supervised learning methods, such as the simple least squares method, are well known and widely used. Local nonlinear supervised learning methods are mathematically based and are therefore widely used. The best known of these are gradient-based algorithms, which are used to train neural networks and fuzzy neural networks and to optimize nonlinear parameters such as the MFs of fuzzy systems. Global nonlinear supervised learning methods are fundamentally different from local ones in that they are not based on rigorous mathematics, yet they are very useful. Algorithms such as genetic algorithms and PSO are in this category. In what follows, some recent papers are reviewed.

Gradient-based methods account for a larger share of T2F-NN learning than other methods. An adaptive learning rate-based steepest descent gradient method has been applied to an interval T2F-NN.
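Returning briefly to the type reduction step described earlier, the following is a minimal sketch of an iterative Karnik-Mendel style computation for a discretized interval type-2 set. It assumes the set is given by sample points `x` with lower and upper membership grades; the function name `km_endpoint` and the example values are hypothetical, and the sketch is not the implementation used in any of the reviewed papers.

```python
import numpy as np

def km_endpoint(x, lower, upper, left=True, max_iter=100):
    """Iteratively compute the left or right end point of the type-reduced interval."""
    x = np.asarray(x, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    f = 0.5 * (lower + upper)              # start from the mean membership grades
    y = np.dot(x, f) / np.sum(f)
    for _ in range(max_iter):
        left_of_switch = x <= y            # samples on the left of the switching point
        if left:
            # Left end point: upper grades on the left, lower grades on the right
            f = np.where(left_of_switch, upper, lower)
        else:
            # Right end point: lower grades on the left, upper grades on the right
            f = np.where(left_of_switch, lower, upper)
        y_new = np.dot(x, f) / np.sum(f)
        converged = np.isclose(y_new, y)
        y = y_new
        if converged:
            break
    return y

x = [1.0, 2.0, 3.0, 4.0, 5.0]
upper = [0.5, 0.8, 1.0, 0.8, 0.5]
lower = [0.2, 0.5, 0.7, 0.5, 0.2]
y_l = km_endpoint(x, lower, upper, left=True)
y_r = km_endpoint(x, lower, upper, left=False)
y_crisp = 0.5 * (y_l + y_r)   # one common way to defuzzify the type-reduced interval
```

When the lower and upper grades coincide, both end points reduce to the ordinary type-1 centroid, which is a convenient sanity check for any implementation.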
In gradient-based training methods, the learning rate is very important, because a large value may cause training to diverge, while a small value may cause it to get stuck in a local minimum and also slows down training. An adaptive learning rate is therefore one solution to this problem. On the other hand, global nonlinear learning methods have recently been used for T2F-NNs (Tavoosi and Mohammadi 2019; Ahmadieh and Branson 2019). In (Tafti et al. 2020), dynamic group cooperative particle swarm optimization has been used to train an interval TSK T2F-NN. The authors used their proposed T2F-NN for experimental mobile robot control. Genetic algorithm-based learning applied to a Mamdani type-2 fuzzy system for blood pressure level classification is presented in (Shahparast and Mansoori 2019). However, because they lack a mathematical basis and a guarantee of convergence, global optimization methods are not recommended for applications that control critical systems.

Fig. 5 Two rules of a Mamdani type-2 fuzzy system

Other learning methods, such as sliding mode-based learning (Richa Sharma et al. 2020) and Lyapunov-based learning (Zhao et al. 2019a), have been presented. In some articles, reinforcement learning methods are used; for example, in (Mohammadzadeh and Kumbasar 2020b) the Q-learning algorithm has been used for an interval T2F-NN. Q-learning is a reinforcement learning technique that learns an action-value function and uses it to follow a policy for choosing actions in different situations. One of the strengths of this method is its ability to learn this function without a specific model of the environment.

In this section, the use of T2F-NNs for system identification is reviewed.
Before the review, it should be noted that the term "system identification" may not appear in the titles of some of these articles, but their function approximation capability has been used for control, classification, and clustering. In (Mohammadzadeh and Kayacan 2019), a new type-2 neuro-fuzzy network has been used for multivariable dynamic system identification. The article addresses sudden data changes as well as uncertainties. Both structure and parameter learning are carried out in the method of (Mohammadzadeh and Kayacan 2019). In the then part of each fuzzy rule, linear state-space equations and the Hankel matrix computation have been used. To identify a nonlinear system under uncertainties, a self-organized T2F-NN with asymmetric MFs has been developed (Bencherif and Chouireb 2019). In this T2F-NN, the then part of the fuzzy rules is a Mamdani model. First, the fuzzy c-means algorithm is employed to partition the input data and obtain the uncertain centers and widths of the antecedents of the fuzzy rules. Then, the number of rules is determined on the basis of a cluster validity criterion. Thus, the identification of the antecedent parameters and the structure is completed automatically. In (Ching-Hung Lee 2008), quantum-behaved particle swarm optimization (QPSO) has been used to design an interval type-2 TSK fuzzy logic system (FLS). The authors combined A1-C1, A2-C0, and A2-C1 interval type-2 TSK FLSs with a neural network to design fuzzy neural network systems, whose parameters were then tuned by the QPSO algorithm. Both QPSO and backpropagation (BP) algorithms were employed for learning the system model. Their results show that the QPSO-based method is more effective and yields better performance. Finally, they analyzed the efficiency of the four FLSs and concluded that the A2-C1 FLS performs better than the other three.

Fig. 6 How to calculate the output in a Mamdani type-2 fuzzy system
Fig. 7 How to calculate the output in a Mamdani type-2 fuzzy system
Fig. 8 How to calculate the output in a Mamdani type-2 fuzzy system
Fig. 9 How to calculate the output in a Mamdani type-2 fuzzy system

In (Wiktorowicz and Krzeszowski 2020), a novel self-organizing T2F-NN has been used for nonlinear system identification. In that paper, a new self-tuning recurrent radial basis function network (RBFN) has been presented. In (Mohammadzadeh and Zhang 2019), the theories of fuzzy systems and artificial neural networks are reviewed and supervised learning methods are investigated. That article gives a brief overview of fuzzy neural (neuro-fuzzy) networks. The origins of fuzzy neural networks and their types are well illustrated. Fuzzification and defuzzification techniques, as well as training algorithms for T1F-NNs, have been reviewed and compared. Many applications of T1F-NNs to industrial problems have been reviewed in (Mohammadzadeh and Zhang 2019). In (Zirkohi and Lin 2015), a new method based on long-term learning for interval T2F-NNs has been presented. In that paper, the principles of granular computing (GrC) have been used to extract knowledge from raw data and to build a computational mechanism that adapts to long-term learning and incorporates new information incrementally. As mentioned earlier, structural training is a very important step in the learning phase of a T2F-NN; in (Yeh et al. 2011), a modified density-based clustering, in which both density and membership degrees are involved, is implemented for structure learning to determine the fuzzy rules. Noisy environments are a challenging issue for system identification. This issue has been considered in (Wiktorowicz and Krzeszowski 2020), but unfortunately the parameter training method and how it is applied are somewhat vague in that paper.
In (Khankalantary et al. 2020), gravitational search algorithm-based fuzzy c-regression has been proposed for an evolving modified interval type-2 fuzzy model, and an extreme learning technique was used for parameter identification. The coefficients of the hyperplanes were determined by the type-2 fuzzy method using the gravitational search algorithm. Finally, a hyperplane-shaped MF was used to identify the antecedent parameters of the T-S fuzzy model, and WOS-ELM was employed to identify the consequent parameters. In (Kececioglu 2019), an improved particle swarm optimization (PSO) algorithm has been used for interval T2F-NN learning. To overcome the local minimum problem, the authors used dynamic group cooperative particle swarm optimization (DGCPSO) for nonlinear system identification.

In some articles, a T2F-NN is used for control, classification, or clustering, in which function approximation is generally performed. In the following, some of the latest research in this field is reviewed and analyzed. A general T2F-NN has been used for prediction of Mackey-Glass time series data (for $\tau = 17$). The disadvantage of that work is the training time of the proposed general T2F-NN, which takes longer to train and predict than the methods of others. A self-tuning TSK T2F-NN has been used for two-link flexible manipulator control (Camci 2018). In that paper, the control method performs appropriately on the robot's highly coupled system, but no structural learning is provided. An interval T2F-NN has been used for medical diagnosis classification based on the K-means clustering algorithm. In (Juan Carlos Guzmán 2019), a new control system design based on T2F-NNs has been presented. In that paper, the T2F-NN has been used for function approximation in the sliding mode control technique.

Fig. 11 Two T1FSs obtained from a T2FS
One of the uses of T2F-NNs is noise elimination, where the T2F-NN creates an opposite-phase anti-noise signal with the same magnitude as the unwanted noise. Any change in the noise leads to an increase in error, resulting in an update of the T2F-NN. In (Ghaemi et al. 2019), the noise reduction properties of an interval T2F-NN (A2-C0) have been presented.

Much data, such as speech, time series (for example, weather forecast data and financial data), sensor data, video, and text, is sequential in nature. Recurrent neural networks (RNNs) are a family of neural networks specifically designed to process sequential data. In a typical neural network, all inputs and outputs are assumed to be independent of each other, but in many cases this is a poor assumption, for example, when predicting the next value of a signal. Seen from a different perspective, these networks have a type of memory that records information they have seen before. In theory, these networks can record and use the information in a long sequence, but in practice they are very limited and only retain information from a few steps back (Yi et al. 2019). Figure 12 shows an example of a typical RNN. Unlike conventional networks, which use different parameters in each layer, an RNN shares the same parameters (U, V, W) across all time steps. This means that the same operation is performed at each time step, only on different inputs. With this technique, the total number of parameters that the network must learn is greatly reduced. The main feature of the RNN hidden state is that it stores information about the sequence. Also, an output or an input is not necessarily needed at every time step (Evangelista and Serra 2019).
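The parameter sharing described above can be illustrated with a minimal forward pass of a plain RNN. The roles assigned to the matrices here (U maps the input to the hidden state, W maps the previous hidden state to the current one, V maps the hidden state to the output), the tanh activation, and the layer sizes are assumptions for illustration; they follow a common textbook convention rather than any specific reviewed network.

```python
import numpy as np

def rnn_forward(x_seq, U, W, V, h0):
    """Run a plain RNN over a sequence, reusing U, W, V at every time step."""
    h = h0
    outputs = []
    for x_t in x_seq:
        h = np.tanh(U @ x_t + W @ h)   # hidden state carries memory of past inputs
        outputs.append(V @ h)          # output at this time step
    return np.array(outputs), h

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 2, 4, 1
U = rng.normal(scale=0.5, size=(n_hidden, n_in))
W = rng.normal(scale=0.5, size=(n_hidden, n_hidden))
V = rng.normal(scale=0.5, size=(n_out, n_hidden))
h0 = np.zeros(n_hidden)

# The same parameters process a short and a long sequence
short_out, _ = rnn_forward(rng.normal(size=(5, n_in)), U, W, V, h0)
long_out, _ = rnn_forward(rng.normal(size=(50, n_in)), U, W, V, h0)
n_params = U.size + W.size + V.size   # independent of the sequence length
```

The parameter count depends only on the layer sizes, not on how many time steps are processed, which is the reduction in learnable parameters that the text refers to.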
This diagram can be changed based on the intended task. RNNs are called recurrent because the output at each time step depends on the computations of the previous time steps. In other words, these networks have memory that stores information about the data seen so far. It may seem a little strange at first glance, but these networks are in effect multiple copies of an ordinary neural network chained together, each passing a message to the next (Zhao et al. 2019b). It should be noted that RNNs are trained using backpropagation through time, which again raises the vanishing gradient problem. In fact, the problem is worse for RNNs because each time step is equivalent to a layer in a network. So if the network is trained for 1000 time steps, the gradient will vanish as in a 1000-layer MLP. There are several approaches to this problem, the most popular of which is gating. A gate takes the output of the previous time step and the current input and transforms them before returning the result to the RNN. There are several types of gates; long short-term memory (LSTM) is the most popular. Other related techniques include gradient clipping, steeper gates, and better optimizers. In this article, we discuss recurrent T2F-NNs and how to use them for system identification. If the system is strongly dynamic, recurrent models must be used. Numerous recurrent type-2 fuzzy neural models have been proposed, some of the most recent of which are discussed below. A novel recurrent T2F-NN (eIT2FNN-LSTM) has been proposed in (Fan et al. 2018). In the proposed network, a long short-term memory mechanism provides the recurrent structure, which avoids the vanishing gradient problem of classic recurrent fuzzy neural systems and improves performance on sequential data with long-term dependencies.
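The vanishing gradient effect in backpropagation through time can be made concrete with a small numerical sketch. It assumes a one-unit recurrent model h_t = tanh(w h_{t-1} + x_t), which is a deliberate simplification and not one of the reviewed networks; the weight w = 0.9, the input signal, and the function name `gradient_through_time` are illustrative assumptions.

```python
import math


def gradient_through_time(w, xs, h0=0.0):
    """Track |d h_t / d h_0| for the scalar RNN h_t = tanh(w * h_{t-1} + x_t).

    By the chain rule, each time step multiplies the gradient by
    w * (1 - h_t**2), exactly as each layer would in a deep feedforward net.
    """
    h = h0
    grad = 1.0
    history = []
    for x in xs:
        h = math.tanh(w * h + x)
        grad *= w * (1.0 - h * h)     # local derivative d h_t / d h_{t-1}
        history.append(abs(grad))
    return history


# 1000 time steps, matching the example in the text.
xs = [0.5 * math.sin(0.1 * t) for t in range(1000)]
hist = gradient_through_time(w=0.9, xs=xs)

# Gradient magnitude after 10, 100, and 1000 steps.
print(hist[9], hist[99], hist[999])
```

Since every factor has magnitude at most |w| < 1, the product shrinks at least geometrically with the number of steps. Gated structures such as LSTM are designed to keep this product from collapsing.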
A recurrent interval TSK T2F-NN has been used for the trajectory tracking problem of an experimental mobile robot (Tavoosi 2016). That paper uses a simple T2F-NN structure and its innovation is hardware implementation on a robot, but unfortunately there is no discussion of how the system is implemented or operated in real time. In (Paulo Vitor de Campos Souza 2020), a novel structure combining recurrent neurons, a wavelet neural network, and type-2 fuzzy systems has been proposed for model predictive control.

Fig. 12 An example of a typical RNN

A common example that is widely used for system identification is the time-varying nonlinear dynamic system (7), where $y(k-1)$, $y(k-2)$, and $y(k-3)$ are the output $y(k)$ delayed by 1, 2, and 3 units, respectively, and $u(k)$ and $u(k-1)$ are the input and its one-unit delay, respectively. System (7) also includes the time-varying parameters $a(k)$, $b(k)$, and $c(k)$. The input signal applied to system (7) is chosen as follows:

$$u(k) = \begin{cases} \ldots, & k < 250 \\ 1, & 250 \le k < 500 \\ -1, & 500 \le k < 750 \\ 0.3\sin\left(\frac{\pi k}{25}\right) + 0.1\sin\left(\frac{\pi k}{32}\right) + 0.6\sin\left(\frac{\pi k}{10}\right), & 750 \le k < 1000 \end{cases}$$

The results of identifying system (7) with several classes of T2F-NNs are shown in Table 1. In Table 1, the references are sorted by RMSE. From Table 1, it can be seen that the fully fuzzy system (pseudo-Mamdani) has a better RMSE, a longer training time, fewer membership functions (MFs), and fewer rules (a simpler, more user-friendly structure). The performance of the recurrent T2F-NN is almost the same as that of the Mamdani networks. In contrast, feedforward networks are much simpler and need less training time, so they seem more appropriate for real-time and online applications. Looking at the future horizons of T2F-NNs as a subset of computational intelligence, there are both strengths and challenges.
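Returning to the benchmark of system (7), the piecewise input signal and the RMSE criterion used to rank the methods in Table 1 can be sketched as follows. The first segment (k < 250) is not recoverable from the text, so the sketch assumes sin(πk/25) there purely as a placeholder; the function names `benchmark_input` and `rmse` are hypothetical. The plant itself is not simulated, because its equation is not available here.

```python
import math


def benchmark_input(k):
    """Piecewise excitation signal u(k) for 0 <= k < 1000."""
    if k < 250:
        # ASSUMPTION: this segment is missing from the source text;
        # sin(pi*k/25) is only a placeholder for illustration.
        return math.sin(math.pi * k / 25)
    if k < 500:
        return 1.0
    if k < 750:
        return -1.0
    return (0.3 * math.sin(math.pi * k / 25)
            + 0.1 * math.sin(math.pi * k / 32)
            + 0.6 * math.sin(math.pi * k / 10))


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and identified outputs."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


u = [benchmark_input(k) for k in range(1000)]
print(u[300], u[600], rmse([0.0, 0.0], [3.0, 4.0]))
```

The mix of a sinusoid, two constant levels, and a multi-sine segment excites the system over different operating regimes, which is why signals of this shape are popular for comparing identifiers.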
By way of introduction, artificial intelligence has changed companies and led to increased productivity and, in turn, economic growth. This technology will change the nature of work and of the work environment, because machines will be able to complement the work done by humans, do more work in less time, and even do things that are beyond human ability. The recent coronavirus challenge made clear that artificial intelligence and telecommuting should become more widespread. Many believe that the advent of artificial intelligence technology will lead to the dismissal of workers. In the next 10 years, the technology is likely to take over about 40% of human occupations. But many do not know how this technology will benefit the employment sector. Examples include increased productivity, promotion of innovation, job creation (data scientists and roboticists), accuracy in complex operations, collaborative learning, telecommuting and traffic reduction, and more. Certainly, T2F-NNs will be used in the future for modeling, system identification, etc., provided that they offer high function approximation accuracy as well as high computational speed. Researchers in this field need to work harder on learning algorithms to reduce the training time and increase the accuracy. Supercomputers make it much easier to process big data, but researchers need to work hard on hardware implementations of T2F-NNs with suitable memory, so that they can be used in applications that lack access to supercomputers, such as aerial robots or robotic bees, or standalone and autonomous systems. Finally, if very high accuracy is required and time is not a priority, higher-order type-3 and type-2 FLSs can be used (Baraka and Panoutsos 2019; Luo et al. 2019; Wei et al. 2020; Lin et al. xxxx; Pal and Kar 2019).
With a detailed and analytical comparison of the type-2 fuzzy neural networks presented in [7, 9, 11, 36, 56, 60, 68 and 71], interesting points emerge. In general, if the accuracy of system identification is very important, the number of trainable network parameters should be increased, because this increases the degrees of freedom of the system and the flexibility of the type-2 fuzzy neural network. It is even possible to increase the type of the fuzzy system, for example to a type-3 fuzzy system (Baraka and Panoutsos 2019), which naturally increases the number of parameters and makes the identification more accurate. On the other hand, a large number of parameters requires more training time and is therefore problematic in online applications. Therefore, if the identified model is to be used offline but with high accuracy, as in surgical robots or painting robots, type-2 fuzzy neural networks or higher types with many parameters can be used. However, if the goal is to use type-2 fuzzy neural networks in online applications such as chemical processes or the control of various types of electric motors, low-parameter, fast-training networks should be used (such as (Khankalantary et al. 2020)). Fuzzy systems have been proposed and used for decades, from washing machines to satellite systems, from power systems to chemical processes, etc. Which kind of fuzzy system should be used for a specific system depends on the nature of the problem, the type of data, the basic knowledge of the system, the required accuracy, and the required speed. If mathematical equations of the system exist, TSK fuzzy models are the natural choice. If, on the other hand, only qualitative and imprecise information about the system is available, Mamdani models must be used. Moreover, today's human needs are becoming more complex and extensive, and therefore systems are becoming more complex.
Identifying a complex system requires a complex identifier, and this is where the overall ability and superiority of type-2 fuzzy systems over type-1 fuzzy systems is revealed. For the reasons above, no single type-2 fuzzy neural network can solve all problems. In other words, each problem calls for a specific class of type-2 fuzzy neural networks. The choice depends on whether a mathematical model of the system is available, the degree of nonlinearity and uncertainty of the system, the required accuracy, the required speed of operation, how much data are available and how scattered they are, and which type-2 fuzzy neural network can model the hidden dynamics of the system with that amount of data. All of these considerations confirm that a single version of the type-2 fuzzy neural network cannot be provided for all problems. In general, however, it can be said that the method of (Baraka and Panoutsos 2019) has very high accuracy and the method of (Khankalantary et al. 2020) has a shorter execution time. More recently, T3-FLSs have been introduced, which give better identification accuracy and can handle higher levels of fuzziness and uncertainty. The secondary membership in this kind of FLS is not a crisp value but a type-2 MF. Likewise, the upper and lower bounds of the footprint of uncertainty are not crisp values but type-2 MFs. In (Mohammadzadeh et al. xxxx), the good performance of T3-FLSs has been demonstrated in various applications and experimental examinations. The authors recommend the use of this kind of FLS in highly noisy environments.

In this paper, an overview of the field of system identification using T2F-NNs was presented. The steps of system identification, including preparing the system for data extraction, applying an input to excite the system and force it to respond, data preprocessing, T2F-NN design, and a variety of training methods, were discussed.
A closer look at recent work reveals that a special T2F-NN must be designed for each type of system, although self-organized networks with structural and parametric training can also be used. However, in sensitive and critical applications, as well as in hardware implementations, self-organized structures are not recommended, as they require more time and raise the risk of divergence. Whether gradient-based learning or evolutionary algorithms should be used depends on the type of data, their number, and their characteristics. Certainly, a bright future awaits computational intelligence and its subset, T2F-NNs, as the recent coronavirus has demonstrated the importance of artificial intelligence, telecommuting, and robotics. In the future, structural training will be very important, because ultimately an optimal structure with the fewest fuzzy rules and parameters should be achieved. Also, to develop the theory, research on type-3 (and higher) fuzzy systems can be pursued, but it seems that as long as T2FL has not found its place in industry and applications, higher-order fuzzy systems will fail.

Author contributions The authors contributed to this paper as follows. JT and AM contributed to writing-original draft, formal analysis, investigation, and methodology, and KJ contributed to writing-review and investigation.

Conflict of interest The authors declare that they have no conflict of interest.
Support Vector Regression for multi-objective parameter estimation of interval type-2 fuzzy systems
Backstepping-based recurrent type-2 fuzzy sliding mode control for MIMO Systems (MEMS Triaxial Gyroscope Case Study)
Nonlinear black-box system identification through coevolutionary algorithms and radial basis function artificial neural networks
A piecewise type-2 fuzzy regression model
Long-term learning for type-2 neural-fuzzy systems
Hybrid-learning type-2 Takagi-Sugeno-Kang fuzzy systems for temperature estimation in hot-rolling
A recurrent TSK interval type-2 fuzzy neural networks control with online structure and parameter learning for mobile robot trajectory tracking
An aerial robot for rice farm quality inspection with type-2 fuzzy neural networks tuned by particle swarm optimization-sliding mode control hybrid algorithm
Fuzzy neural networks and neuro-fuzzy networks: a review the main techniques and applications used in the literature
A recurrent interval type-2 fuzzy neural network with asymmetric membership functions for nonlinear system identification
An enhanced MSIQDE algorithm with novel multiple strategies for global optimization problems
A novel gate resource allocation method using improved PSO-based QEA
Differential evolution algorithm with wavelet basis function and optimal mutation strategy for complex optimization problem
Multivariable state-space recursive identification algorithm based on evolving type-2 neural-fuzzy inference system
Hybrid learning for interval type-2 intuitionistic fuzzy logic systems as applied to identification and prediction problems
Derivative-based learning of interval type-2 intuitionistic fuzzy logic systems for noisy regression problems
Design and application of interval type-2 TSK fuzzy logic system based on QPSO algorithm
An intelligent temporal pattern classification system using fuzzy temporal rules and particle swarm optimization
A recalling-enhanced recurrent neural network: conjugate gradient learning algorithm and its convergence analysis
Lyapunov-Krasovskii stable T2FNN controller for a class of nonlinear time-delay systems
Optimal genetic design of type-1 and interval type-2 fuzzy systems for blood pressure level classification
A fast learning algorithm based on extreme learning machine for regular fuzzy neural network
Model selection approaches for non-linear system identification: a review
An intelligent agent based intrusion detection system using fuzzy rough set based outlier detection
Soft computing techniques in vision science
An intelligent risk prediction system for breast cancer using fuzzy temporal rules
Noise reduction property of type-2 fuzzy neural networks https
Robust control of high gain DC-DC converter using type-2 fuzzy neural network controller for MPPT
An adaptive constrained type-2 fuzzy Hammerstein neural network data fusion scheme for low cost SINS/GNSS navigation system
Adaptive control of a two-link flexible manipulator using a type-2 neural fuzzy system
A K-means interval type-2 fuzzy neural network for medical diagnosis
Adaptive filter design for active noise cancellation using recurrent type-2 fuzzy brain emotional learning neural network
Design and verification of an interval type-2 fuzzy neural network based on improved particle swarm optimization
Design and verification of an interval type-2 fuzzy neural network based on improved particle swarm optimization
Nonlinear state-space system identification with robust Laplace model
Robust identification approach for nonlinear state-space models
Reliable dissipative interval type-2 fuzzy control for nonlinear systems with stochastic incomplete communication route and actuator failure
An evolving recurrent interval type-2 intuitionistic fuzzy neural network for online learning and time series prediction
Particle swarm optimization of interval type-2 fuzzy systems for FPGA applications
Feed-forward versus recurrent architecture and local versus cellular automata distributed representation in reservoir computing for sequence memory learning
An approach for parameterized shadowed type-2 fuzzy membership functions applied in control applications
A review on type-2 fuzzy logic applications in clustering, classification and pattern recognition
A non-singleton type-2 fuzzy neural network with adaptive secondary membership for high dimensional applications
A new fractional-order general type-2 fuzzy predictive control system and its application for glucose level regulation
An interval type-3 fuzzy system and a new online fractional-order learning algorithm: theory and practice
A novel fractional-order type-2 fuzzy control method for online frequency regulation in ac microgrid
A new fractional-order general type-2 fuzzy predictive control system and its application for glucose level regulation
Intrusion detection using dynamic feature selection and fuzzy temporal decision tree classification for wireless sensor networks
Heidelberg
Pal SS, Kar S (2019) A hybridized forecasting method based on weight adjustment of neural network using generalized type-2 fuzzy set
Parameter identification of systems with multiple disproportional local nonlinearities
An optimal interval type-2 fuzzy logic control based closed-loop drug administration to regulate the mean arterial blood pressure
Intelligent control systems using computational intelligence techniques, Institution of Engineering and Technology
Sabzalian MH et al (2019) Robust fuzzy control for fractional-order systems with estimated fraction-order
Discrete time approximation of continuous time nonlinear state space models
Deep recurrent neural networks for nonlinear system identification
An intelligent neuro fuzzy temporal knowledge representation model for mining temporal patterns
Developing an online general type-2 fuzzy classifier using evolving type-1 rules
Consensus learning for distributed fuzzy neural network in big data environment
Uncertain nonlinear system identification using Jaya-based adaptive neural network
Differential evolution and quantum-inquired differential evolution for evolving Takagi-Sugeno fuzzy models
Recurrent interval type-2 fuzzy wavelet neural network with stable learning algorithm: application to model-based predictive control
Amir Abolfazl Suratgar, Mohammad Bagher Menhaj, Nonlinear system identification based on a self-organizing type-2 fuzzy RBFN
Stable backstepping sliding mode control based on ANFIS2 for a class of nonlinear systems
A 3-PRS Parallel Robot Control Based on Fuzzy-PID Controller
A New Type-II Fuzzy System for Flexible-Joint Robot Arm Control
New Type-2 Fuzzy Systems for Flexible-Joint Robot Arm Control
Adaptive inverse control of nonlinear dynamical system using type-2 fuzzy neural networks
A novel intelligent control system design for water bath temperature control
Stable ANFIS2 for nonlinear system identification
Stability analysis of recurrent type-2 TSK fuzzy systems with nonlinear consequent part
Stability analysis of a class of MIMO recurrent type-2 fuzzy systems
Block-oriented nonlinear system identification
Oliver Nelles, Iterative excitation signal design for nonlinear dynamic black-box models
Sampling frequency requirements for identification and compensation of nonlinear systems
Inverse-adaptive multilayer T-S fuzzy controller for uncertain nonlinear system optimized by differential evolution algorithm
Synchronization and identification of nonlinear systems by using a novel self-evolving interval type-2 fuzzy LSTM-neural network
A T-S fuzzy model identification approach based on evolving MIT2-FCRM and WOS-ELM algorithm
Training high-order Takagi-Sugeno fuzzy systems using batch least squares and particle swarm optimization
An enhanced type-reduction algorithm for type-2 fuzzy sets
A navigation method for mobile robots using interval type-2 fuzzy neural network fitting Q-learning in unknown environments
Nonlinear system identification of an all movable fin with rotational freeplay by subspace-based method
Designing a general type-2 fuzzy expert system for diagnosis of depression
Design of type-2 fuzzy logic systems based on improved ant colony optimization
Nonlinear system identification with the use of describing functions - a case study
Interval type-2 fuzzy model based on inverse controller design for the outlet temperature control system of ethylene cracking furnace
Soft sensor modeling of chemical process based on self-organizing recurrent interval type-2 fuzzy neural network
Self-organizing interval type-2 fuzzy neural network with asymmetric membership functions and its application
Sliding-mode-control-theory-based adaptive general type-2 fuzzy neural network control for power-line inspection robots
Interval type-2 fuzzy-neural network indirect adaptive sliding mode control for an active suspension system