key: cord-0744545-02aqunju authors: Talpur, Noureen; Abdulkadir, Said Jadid; Alhussian, Hitham; Hasan, Mohd Hilmi; Aziz, Norshakirah; Bamhdi, Alwi title: Deep Neuro-Fuzzy System application trends, challenges, and future perspectives: a systematic survey date: 2022-04-13 journal: Artif Intell Rev DOI: 10.1007/s10462-022-10188-3 sha: 914b96ace792af57e14b1d42f2e4092acffea59e doc_id: 744545 cord_uid: 02aqunju Deep neural networks (DNN) have remarkably progressed in applications involving large and complex datasets but have been criticized as a black-box. This downside has recently become a motivation for the research community to pursue the ideas of hybrid approaches, resulting in novel hybrid systems classified as deep neuro-fuzzy systems (DNFS). Studies regarding the implementation of DNFS have rapidly increased in the domains of computing, healthcare, transportation, and finance with high interpretability and reasonable accuracy. However, relatively few survey studies have been found in the literature to provide a comprehensive insight into this domain. Therefore, this study aims to perform a systematic review to evaluate the current progress, trends, arising issues, research gaps, challenges, and future scope related to DNFS studies. A study mapping process was prepared to guide a systematic search for publications related to DNFS published between 2015 and 2020 using five established scientific directories. As a result, a total of 105 studies were identified and critically analyzed to address research questions with the objectives: (i) to understand the concept of DNFS; (ii) to find out DNFS optimization methods; (iii) to visualize the intensity of work carried out in DNFS domain; and (iv) to highlight DNFS application subjects and domains. We believe that this study provides up-to-date guidance for future research in the DNFS domain, allowing for more effective advancement in techniques and processes. 
The analysis made in this review shows that DNFS-based research is actively growing, with substantial scope for implementation and application in the future.

Over the past decades, significant progress has been made in artificial intelligence (AI), machine learning, and deep learning, and several real-world problems have been successfully solved. The success in these fields has resulted in the emergence of various methods, including fuzzy logic, swarm intelligence, genetic programming, and hybrid approaches such as neuro-fuzzy and genetic fuzzy systems, all of which have contributed to the design and analysis of complex intelligent systems. Among these methods, deep learning techniques such as deep neural networks (DNN) have made major advances in solving problems that have resisted the best attempts of the AI community for many years. The term "deep" is used because the depth of the network is greater than that of conventional neural networks, which are often referred to as shallow networks (Paul and Singh 2015). Conventional neural networks are limited in their ability to process natural data in their raw form. For decades, the construction of pattern-recognition or machine-learning systems required careful engineering and considerable domain expertise to design a feature extractor that transforms the raw data into a suitable internal representation or feature vector from which the learning system (often a classifier) can detect or classify patterns in the input (Ashraf et al. 2020). Compared to typical neural networks with a single hidden layer, a DNN applies representation learning, which allows a machine to be fed with raw data and to automatically discover the representations needed for detection or classification using multiple hidden layers (LeCun et al. 2015). Hence, DNNs have turned out to be very good at discovering intricate structures in high-dimensional data and are thus applicable to many domains of science, business, and engineering.
Although a DNN is an effective approach for handling big data problems, its superior accuracy comes at the cost of high complexity. It is therefore essential to note a few points before employing this type of network to solve certain problems. Because a DNN uses more than one hidden layer, it can provide a deeper analytical model; however, each added layer adds computational complexity (Sharma 2019). Further, such networks are derived from traditional neural networks that use gradient descent optimization for network training. Hence, a DNN frequently encounters the problem of becoming stuck in local minima. In addition to these challenges, the major disadvantage of a DNN is that the model is often criticized as being non-transparent, and its predictions are not traceable by humans owing to its black-box nature (Buhrmester et al. 2019). It is challenging to trust the findings generated by such deep networks. Hence, there is always the possibility of a communication gap occurring between analysts and DNNs (Bonanno et al. 2017; Hayashi 2020). This downside often limits the usability of such networks in the majority of real-world problems, where verification of the predicted results is a major concern. To cope with these problems, a few studies in the literature (Aviles et al. 2016; El Hatri and Boumhidi 2018; Zhang et al. 2020a, b, c) have combined a DNN with fuzzy systems to produce a novel deep neuro-fuzzy system (DNFS). Fuzzy systems are structures based on fuzzy techniques oriented toward information processing, and are mainly used in systems where the use of classical binary logic is impossible or difficult. Their main characteristic is a symbolic knowledge representation in the form of fuzzy conditional IF-THEN rules (Czabanski et al. 2017). Therefore, the novel hybridization of a DNN and fuzzy systems has proven to be an effective way to reduce uncertainty using fuzzy rules.
As an emerging hybrid approach, the use of DNFS has gained enormous popularity among research communities during the past 5 to 6 years in the field of AI. Hence, positive growth in the implementation of this model can be seen in distributed systems, cloud computing, healthcare, and various other areas. However, to the best of the authors' knowledge, no systematic review has been conducted with the sole focus on highlighting the current progress in the domain of DNFS with detailed facts and figures. This study, therefore, presents a systematic literature survey of the research work published between the year 2015 and 2020 with the following major contributions: First, a comprehensive methodology has been designed to perform an in-depth search in a systematical way by following a revised study mapping process comprised of seven phases (shown in Fig. 1 ). This paper contributes to deliver the basic concept of DNFS and highlights some of the open questions covering different variations of structural designs that have been introduced in the literature with a combination of deep neural networks and fuzzy systems. The study also covers the optimization methods and techniques that have been widely used to train and optimize the parameters of DNFS. In addition, this paper presents information regarding the intensity of the research conducted in this discipline by performing extensive searches in different scientific databases. One of the research questions included in this paper intends to highlight the applications of DNFS, which is one of the main focuses of this study. Finally, this survey highlights the research gaps, issues, and challenges that require further attention from researchers. It provides a comprehensive body of knowledge and delivers the current status of this particular field, while suggesting some potential future directions. The remainder of this paper proceeds as follows. 
Section 2 highlights related research based on the available survey studies from the literature, whereas Sect. 3 presents the methodology designed to conduct this systematic review. Section 4 answers the research questions set out in our systematic survey by analyzing the synthesized results of identified publications from available sources in the literature. The identified issues, gaps, challenges, and future areas of study are discussed in Sect. 5. Finally, the conclusions of this study are presented in Sect. 6.

Over the past 5 to 6 years, successful attempts to hybridize deep learning and fuzzy systems have attracted researchers to implement such methods in various real-world applications. An extensive body of literature has been published to date, focusing on experimenting with the model in domains where it had not previously been implemented. However, because DNFS is a novel approach, very few survey studies have so far provided overall insight into this domain. Therefore, the focus of this section is to highlight the survey studies conducted in the domain of DNFS. These survey studies were carefully selected and studied to form a general picture of the present state of DNFS research. The survey performed by Dorzhigulov and James (2020) mainly focuses on neuro-fuzzy and similar machine learning models from the perspective of functionality and architectures. In their study, the authors presented an overview of fuzzy systems and described the history of the hybridization of fuzzy systems and neural networks. Moreover, deep learning methods such as a DNN can be integrated with fuzzy systems to introduce automated optimization of neural architectures.
Therefore, this study described the DNFS architectures, including an adaptive neuro-fuzzy inference system (ANFIS), fuzzy neural networks, fuzzy trees, and overviews of neural architectures that use some fuzzy elements, such as radial basis function networks (RBFN) and a fuzzy adaptive resonant theory map (ARTMAP). The literature has shown significant interest in the domain of control systems and classification using neuro-fuzzy systems. However, most of the neuro-fuzzy systems presented in the literature are software-based solutions that provide improved training algorithms or mathematical and architectural modifications of the model. However, neuro-fuzzy systems still face challenges of slow training when dealing with big data, which affects its overall performance. There have been limited studies implementing neuro-fuzzy systems as dedicated high-performance hardware, including (Jhang et al. 2018; Khati et al. 2019; Mata-Carballeira et al. 2019) , that have proposed the use of field-programmable gate array (FPGA) devices. This hardware solution tends to be more efficient and faster, but with a trade-off in flexibility. Recently, only one study (Marlen and Dorzhigulov 2018) can be seen using memristive crossbar arrays with a fuzzy membership function that acts as a resistor, capacitor, and inductor. Hence, the study of Dorzhigulov and James (2020) suggests that in the near future, hardware solutions should be used to improve the performance and speed of these hybrid approaches. In the same vein, another recent and interesting study (Das et al. 2020) found in the literature explored the different ways in which deep learning is improved with fuzzy logic systems along with the utilization of the model in various real-life applications. It can be seen that using fuzzy theory along with deep learning can improve the performance of models in which the data are noisy, heterogeneous, incomplete, or vague. 
However, a problem of computational complexity may occur when utilizing fuzzy systems. The availability of software platforms such as the Compute Unified Device Architecture (CUDA) of Nvidia, the Radeon Open Compute (ROCm) ecosystem released by Advanced Micro Devices, Inc. (AMD), and the Math Kernel Library (MKL) by Intel further accelerates deep learning processes. By contrast, the computation of fuzzy parameters is time-consuming on the presently available architectures, even though fuzzy models provide resistance to noise and search over a wider space. Alternatively, fuzzy logic can be used alongside standard deep learning models to process the input or output. Models can make use of fuzzified inputs coupled with standard deep learning models such as deep belief networks (DBN) or convolutional neural networks (CNN). This allows the software platforms to be leveraged to accelerate DNN training with fuzzy systems. The study further suggests exploring better ways to improve the performance of fuzzy deep learning models in the future.

Taking a deeper look into deep learning-based neuro-fuzzy systems, an extensive and appealing treatment can be found in Singh and Lone (2020). This study builds up the basics for readers, from fuzzy sets to the concepts of fuzzy rules and reasoning, and explains membership functions with the help of real-world scenarios and case studies in simple mathematics. Furthermore, it describes the working of Mamdani fuzzy inference systems, Takagi-Sugeno-Kang (TSK) fuzzy inference systems, and Tsukamoto fuzzy inference systems, along with explanations of how these three models differ from each other. CNNs, natural language processing (NLP), and recurrent neural networks (RNN) are implemented in the subject areas of computer vision and time-series prediction. Different variations of the architectures are defined through the integration of fuzzy systems and deep learning.
Insight into these hybrid approaches as intelligent systems in the modern world is provided. In addition, the study simplifies the implementation of fuzzy logic, neural networks, DNFS, and related concepts using Python, which encourages readers to experiment with these machine learning and deep learning methods. It not only builds the fundamentals but also encourages newcomers in the field of AI to implement these methods in their respective research areas.

The fourth and last survey study was conducted by de Campos Souza (2020). This study describes proposed methodologies and existing and improved techniques, including the implementation of neuro-fuzzy systems in applications such as pattern classification, time-series prediction, fault detection, and various other approaches developed since the year 2000. Moreover, the author provided a well-defined model architecture, describing its problem-solving abilities, mechanisms, training algorithms, and different ways to extract information through fuzzy rules. The major emphasis of the study was to survey neuro-fuzzy systems from the literature that provide supervised learning. The central focus of this survey is to gain in-depth knowledge of neuro-fuzzy systems without the implementation of deep learning. However, a small section of the study focuses on the training of neuro-fuzzy models using deep learning methods to perform the tasks of data classification (Deng et al. 2017), traffic incident detection (El Hatri and Boumhidi 2018), and sentiment analysis (Nguyen et al. 2018). The survey also mentioned a few studies implementing a semi-supervised DNFS for image classification (Xiaowei and …) and remote sensing scene classification tasks. As future work, this study suggests not only exploring dynamic hybrid model architectures but also developing new learning algorithms.
From this section, it is clear that currently only four survey studies in the literature discuss novel DNFS approaches. It is important to note that these studies did not focus on DNFS only. Their main motivation is to build general concepts about DNFS models along with various similar techniques, such as neuro-fuzzy systems without a deep learning approach. Although these studies contain helpful knowledge for understanding the basic concept of a hybrid model, they do not explore the trends and developments regarding DNFS. Based on these observations, Table 1 provides a summarized comparative analysis of the above-mentioned four survey studies. The summary of the comparative analysis presented in Table 1 provides confirmatory evidence of a research gap. The current literature does not offer comprehensive, systematic, and, more importantly, quantitative research knowledge for researchers who wish to explore the scope and current research progress on DNFS in the area of AI, particularly in hybrid approaches. Therefore, our systematic survey aims to broaden the knowledge of readers by presenting not only a basic structural understanding but also information on the optimization methods and application domains through a quantitative analysis, facts, challenges, scope, and future suggestions. To fulfill the above-mentioned objectives, the next section (Sect. 3) provides the detailed methodology that has been followed to construct this systematic survey.

The purpose of this systematic literature survey is to provide a complete list of all possible studies related to DNFS reported in the recent literature between 2015 and 2020. Various attractive approaches to conducting survey studies have been implemented in (Yu and Pan 2021; Yu and Sheng 2020). The guidelines followed in this paper are taken from systematic literature reviews published by Baashar et al. (2020), Schön et al.
(2017), and Muhammed et al. (2018), whereas the revised study mapping process illustrated in Fig. 1 was designed for this review paper as a combination of the preferred reporting items for systematic reviews and meta-analyses (PRISMA) statement (Moher et al. 2009; Safdar et al. 2018) and the systematic mapping process presented by Hussain et al. (2019). It comprises seven phases (as shown in Fig. 1), i.e., a preliminary study, formulation of research questions, identification of the search criteria, all papers found in a literature search, the screening process, an eligibility and quality assessment of the selected studies, data extraction and compilation, and the final list of articles included in this systematic study after applying the exclusion criteria.

A preliminary study is an initial and significant phase of a systematic literature survey. In this phase, we narrowed down the search parameters by obtaining background information on studies related to deep neuro-fuzzy systems. A random search was performed on common Internet search engines using a single domain-specific keyword, such as deep neuro-fuzzy systems. Afterward, we filtered the search results and focused on those studies that helped us to choose search strings strictly related to the domain of our study.

In the next phase, we formulated the key research questions that direct us throughout the research and writing process. We determined the quality of the research questions based on the elements of constructiveness, focus, and relevance to a specific area or problem. Extensive research has been conducted on the development of neuro-fuzzy systems (NFS). However, it is challenging to find sufficient literature guidance regarding novel deep neuro-fuzzy systems (DNFS) that involve deep learning methods such as basic DNNs.
Therefore, the primary motivation behind this study is to prepare a systematic review regarding the development of DNFS, their optimization methods, and their application subjects. The following are the fundamental research questions:

RQ-1: What are the fundamental concepts related to deep neuro-fuzzy systems?
Motivation: Answering this question sets the basics and helps in understanding the primary knowledge regarding deep neuro-fuzzy systems.

RQ-2: What approaches have been widely employed for the optimization of deep neuro-fuzzy systems?
Motivation: This question aims to identify the optimization techniques involved in the training and learning of deep neuro-fuzzy systems.

RQ-3: What is the intensity of publications in the domain of deep neuro-fuzzy systems by year and by directory?
Motivation: In this question, the potential of deep neuro-fuzzy systems is investigated using conference proceedings, journals, articles, and book chapters.

RQ-4: What are the most promising and practical application subjects and domain areas where deep neuro-fuzzy systems have been implemented?
Motivation: This question investigates the studies that highlight the application of deep neuro-fuzzy systems in multiple potential subject and domain areas.

The preliminary search was conducted using the Google Scholar search engine, which helped to formulate the research questions for this study. Next, the research questions were used to identify the relevant keywords and search terms/strings related to the DNFS topic. The identification stage of this study further helped us to explore the topic from a broader perspective while employing specific search criteria. In this survey study, five scientific databases and venues (Table 2) were selected for a more advanced search using finalized keywords (Table 3) from the domain of DNFS.
The selected databases were chosen because they offer a wide variety of the most essential and highest-impact journals and conference proceedings. Because each scientific database uses different search features and filters to perform a systematic search, it is important to adjust the search string for every scientific database. Table 2 presents the search criteria used to apply the systematic search for this study using the selected scientific databases and search engines. The search was performed with the aim of obtaining the maximum number of relevant studies published in these databases based on the specific keywords and search strings using the logical operators OR and AND, as presented in Table 3 . However, it is important to modify the search strings to meet the unique specifications of each database. Moreover, the search strings highlighted in Table 3 were used to retrieve research papers for RQ1, RQ2, and RQ4. Meanwhile, RQ3 is specifically formulated to investigate the intensity and potential of DNFS based on the publications published between the year 2015 and 2020 using the keywords and search strings shown in Table 3 . After completing the database search using the identified keywords and search strings, all retrieved papers were screened based on the inclusion and exclusion criteria used to filter and select only the studies that were relevant to answering the research questions, while excluding studies that were not. Table 4 lists the criteria that were followed throughout the screening process to evaluate each paper and decide whether to include or exclude the paper in the systematic literature survey. 
Table 3 Keywords and search strings
Keywords: Deep neuro-fuzzy systems, Deep neuro-fuzzy optimization, Deep neuro-fuzzy applications subjects
Search string: ("fuzzy systems" OR "fuzzy logic" OR "fuzzy inference systems") AND ("deep neural network" OR "deep learning" OR "deep networks") OR ("fuzzy deep" OR "deep fuzzy") AND ("neural networks" OR "networks" OR "systems" OR "models" OR "approach") AND ("optimization methods" OR "optimization techniques" OR "optimization") AND ("applications" OR "utilization" OR "implementation") OR ("practices")

In addition to the inclusion and exclusion criteria, it is critical to assess the eligibility and relevance of the primary studies found in the previous stage. In this survey, three quality assessment scores applied to answer every question were adopted from Hordri et al. (2017) to examine the eligibility of each individual study based on factors such as the significance of the study, the quality of the results and analysis, and future research guidelines or findings. The scores are 1 (Yes), 0.5 (Partly), and 0 (No), and Table 5 describes the quality assessment scores employed to check the eligibility of each DNFS paper from the records.

After appropriately classifying the studies to be included in the systematic literature survey following the steps in Phases 4 and 5, we performed data extraction and compilation to examine and compare the relevant studies. Generally, data extraction and compilation are performed using a variety of available tools and software, including Microsoft Excel (spreadsheets), REDCap, and Google Sheets. We utilized Microsoft Excel to record data from the publications to answer the research questions and achieve the goals of the study.
The following information was extracted from each included study: title, abstract, keywords, authors, publication year, scientific databases/venue, publication type (e.g., journal, conference, or book chapter), the technique used (new method, modified, or hybrid approach), and study type (e.g., analysis, survey or a mixture of both). Figure 2 summarizes the entire revised study mapping process according to the PRISMA guidelines using a flow chart. As illustrated in Fig. 2 , a total of 252 studies were found using the keywords and search strings (as listed in Table 3 ) in scientific databases including ACM, IEEE Xplore, Scopus, ScienceDirect, and SpringerLink from the years 2015 to 2020 during the identification phase. The screenings of the 252 collected records were then conducted based on the inclusion and exclusion criteria mentioned in Table 4 . At this stage, 166 relevant studies were included. The preferences were given to the journal articles, conference proceedings, and book chapters relevant to the domains of DNFS, its optimization methods, and applications in various domains that are written in English language and published between the year 2015 and 2020. A total of 86 studies were excluded at this stage because they were not related to the DNFS domain, written in languages other than English, published prior to the year 2015, or were published as tutorials, short papers, interviews, blogs, or duplicated publications. The 166 relevant studies screened from the previous phase were further crosschecked for eligibility based on the eligibility criteria specified in Table 5 of phase 5 before being included in this systematic literature survey. The eligibility criteria were designed to assess whether the studies have clearly defined objectives, a clearly presented methodology, a clear experimental process, stated limitations, and findings based on scoring outputs of 1 (yes), 0.5 (partly), and 0 (no). 
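The screening and eligibility steps described above can be illustrated with a minimal Python sketch. It is not the authors' tooling (they report using Microsoft Excel); the `Record` fields, the criterion names, and in particular the `MIN_SCORE` cut-off are assumptions introduced only for illustration, since the text does not state the threshold used.

```python
from dataclasses import dataclass

# Publication types and year range taken from the screening description above.
ALLOWED_TYPES = {"journal", "conference", "book chapter"}
YEAR_RANGE = (2015, 2020)

# Quality factors named in the text; each is answered 1 (yes), 0.5 (partly), or 0 (no).
CRITERIA = ("objectives", "methodology", "experiments", "limitations", "findings")
VALID_ANSWERS = {0.0, 0.5, 1.0}

# ASSUMPTION: the cut-off is not stated in the text; 3.0 is purely illustrative.
MIN_SCORE = 3.0


@dataclass
class Record:
    """Hypothetical bibliographic record used only for this sketch."""
    title: str
    year: int
    language: str
    pub_type: str
    on_topic: bool
    duplicate: bool = False


def passes_screening(r: Record) -> bool:
    """Apply the inclusion/exclusion criteria of the screening phase."""
    return (
        r.on_topic
        and not r.duplicate
        and r.language == "English"
        and YEAR_RANGE[0] <= r.year <= YEAR_RANGE[1]
        and r.pub_type in ALLOWED_TYPES
    )


def quality_score(answers: dict) -> float:
    """Validate the per-criterion answers and return their sum."""
    for name in CRITERIA:
        if answers[name] not in VALID_ANSWERS:
            raise ValueError(f"invalid answer for {name}: {answers[name]}")
    return sum(answers[name] for name in CRITERIA)


def is_eligible(answers: dict) -> bool:
    """A study is kept when its total score reaches the assumed cut-off."""
    return quality_score(answers) >= MIN_SCORE


# Counts reported for the screening phase: 252 identified, 86 excluded.
identified, excluded_at_screening = 252, 86
screened_in = identified - excluded_at_screening  # matches the 166 reported
```

Keeping the screening predicate separate from the quality score mirrors the two-stage process in the text: a record is first filtered on hard criteria and only then scored.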
As a result of the quality assessment process, a total of 105 studies remained in the records, whereas 61 studies were excluded because they did not meet the eligibility criteria. Therefore, after following the mapping process, the final 105 collected studies were included in the systematic literature survey to answer the RQs of this study and highlight the research gaps, issues, and challenges of this particular domain. Furthermore, Fig. 3 shows the publication type of the 105 included studies based on their publication avenues, such as journals, conference proceedings, book chapters, and preprint servers.

This section answers the research questions of this study and provides a better understanding of the collected data. A deep neuro-fuzzy system (DNFS) is an advanced concept of hybridization in which deep learning approaches, such as deep neural networks, and fuzzy logic approaches are combined to solve various real-world complex problems involving high-dimensional data. Before going into the details of DNFS models, it is essential to build preliminary knowledge by providing an overview of deep neural networks and neuro-fuzzy systems, which makes it easier to comprehend the idea behind developing DNFS.

Deep neural network (DNN)

Deep learning enables multi-layer cognitive models to learn and interpret data with several levels of abstraction, replicating the brain's perception and knowledge representation; hence, it is indirectly capable of capturing complex large-scale data structures. In comparison to various existing state-of-the-art methods, the importance of deep learning approaches is growing rapidly owing to their extraordinary performance in several applications involving visual, audio, social, and medical data (Voulodimos et al. 2018).
In general, AI models are trained to perform data processing tasks based on hand-crafted features derived from raw data or features learned from other basic AI models. Using deep learning, computers can automatically learn useful representations and features directly from the raw data, avoiding the challenging step of manually crafting the features (Lundervold and Lundervold 2019). Deep learning techniques have achieved excellent performance in computer vision, automatic speech recognition, and natural language processing. As the term "deep learning" suggests, a DNN model involves a greater number of processing layers than a simple neural network, which has few layers and is regarded as a shallow learning model. The advancement from shallow to deep learning models has increased the possibility of dealing with complex and nonlinear functions (Shrestha and Mahmood 2019).

Neuro-Fuzzy Systems (NFS)

Several NFS have been presented in the literature. However, the ANFIS model is the most frequently used approach. ANFIS, presented by Jang in 1993, is an effective combination of neural networks and fuzzy logic (Hussain et al. 2015). The key benefit of a neural network is the ability to learn from data. However, such a network is considered a "black-box," because it does not clarify how the final outcome is achieved. With the help of IF-THEN rules in fuzzy systems, one can interpret the results generated by the model (Kruse and Nauck 1998). The following is a presentation of the standard fuzzy rule:

IF $x$ is $A$ AND $y$ is $B$ THEN $z = f(x, y)$    (1)

where $A$ and $B$ are fuzzy sets, and $z$ is a polynomial or a constant (Emad Hussen et al. 2020). Hence, a neuro-fuzzy model can incorporate human knowledge and self-learning competencies that can potentially approximate every situation (Mohd Salleh and Hussain 2016).
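To make the rule structure concrete, the following is a minimal Python sketch of how a small set of such IF-THEN rules with polynomial consequents could be evaluated. It is an illustration under stated assumptions rather than the formulation of any cited study: Gaussian membership functions, the product as the AND operator, weighted-average aggregation, and the two example rules in `RULES` are all choices made here for the example.

```python
import math


def gaussian_mf(x: float, center: float, width: float) -> float:
    """Gaussian membership degree in [0, 1] (assumed membership shape)."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


# Hypothetical rule base. Each rule reads:
#   IF x is A AND y is B THEN z = p*x + q*y + r
# where A and B are fuzzy sets given as (center, width) pairs.
RULES = [
    {"A": (0.0, 1.0), "B": (0.0, 1.0), "pqr": (1.0, 1.0, 0.0)},
    {"A": (2.0, 1.0), "B": (2.0, 1.0), "pqr": (0.5, -0.5, 1.0)},
]


def tsk_infer(x: float, y: float, rules=RULES) -> float:
    """Weighted average of rule consequents, weighted by firing strength."""
    strengths, outputs = [], []
    for rule in rules:
        # Firing strength: product of the antecedent memberships (assumed AND).
        w = gaussian_mf(x, *rule["A"]) * gaussian_mf(y, *rule["B"])
        p, q, r = rule["pqr"]
        strengths.append(w)
        outputs.append(p * x + q * y + r)  # first-order polynomial consequent
    total = sum(strengths)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    return sum(w * z for w, z in zip(strengths, outputs)) / total
```

Because every output is a blend of readable rules, one can inspect the firing strengths to see which rule dominated a given prediction, which is the interpretability benefit the text attributes to fuzzy rules.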
ANFIS has gained popularity among the research community compared with other variations of fuzzy and neuro-fuzzy systems because it has been successfully implemented in various classification, rule-based process control, and pattern recognition applications. It also embeds learning mechanisms to adapt and update all adjustable parameters of the model with a two-pass learning algorithm consisting of a forward pass and a backward pass. This algorithm is a hybrid of gradient descent (GD) and a least squares estimator (LSE), which adjusts the antecedent and consequent parameters of the ANFIS model to minimize the error between the actual output and the target output (Salleh et al. 2018).

Novel hybrid approach of deep neuro-fuzzy system (DNFS)

For many years, the scientific community has explored several ways to execute sophisticated algorithms that can learn from data, the main barriers to which were limited computational power and limited data. Such attempts and years of development to solve these problems have resulted in an exciting new subfield of machine learning called deep learning (DL). A popular algorithm within DL is the DNN (Bonanno et al. 2017). A DNN attempts to learn multiple levels of abstractions and representations to locate complex relations between data. This evolving subfield has been rising rapidly over the past few years owing to the ever-increasing computational power and unlimited access to data. Although DNNs have shown remarkable progress with respect to feature learning from big data, the model is unable to express the uncertainties in data because of its "black-box" nature (Shwartz-Ziv and Tishby 2017). Despite the usefulness of a DNN, a gap in communication exists between analysts and the DNN. Since the beginning of big data, massive amounts of data have been produced daily around the world within the domains of science and industry.
However, when the volume of data increases, the existence of noise and unpredictable uncertainties in large amounts cannot be ignored, which is another critical issue of data ambiguity. This issue again becomes challenging for many autonomous systems because the majority of these machine learning models are designed and trained using labeled data (Bonanno et al. 2017). The drawbacks of such methods have been addressed by introducing an additional machine learning process into neural networks, that is, fuzzy inference, to create an explainable rule-based structure known as a neuro-fuzzy system (NFS). The NFS allows experts to generate rule-based structures. Once the rules are generated, it is possible for the experts to bias features generated from the DL by providing feedback to the system. In addition, through these rule-based structures, an analyst can easily understand how a decision has been made by the system. Using fuzzy if-then rules, neuro-fuzzy systems such as ANFIS can approximate complex nonlinear problems. These systems can be applied in various applications that involve encoding both objective measurements and subjective information (Bonanno et al. 2017). Therefore, with the emergence of the deep learning concept and the DNN, some researchers have introduced fuzzy inference system elements and modules into such systems to address the possible uncertainties in the raw data, similar to ANFIS. Recently, a few studies have combined the explainable rule-based structure of fuzzy inference with a DNN, forming a DNFS, to overcome the black-box problem of the DNN. Figure 4 shows a DNFS, which combines the advantages of fuzzy systems and a DNN (Aviles et al. 2016).

A sequential DNFS is suitable for solving problems involving high linearities, such as time-series data, text documents, sentiment or video classification, and speech recognition.
In a sequential structural design, data processing in the fuzzy system and the DNN takes place one after the other, as presented in Fig. 5a, b. In fuzzy theory, a fuzzy set $A$ in a universe of discourse $X$ is represented by a membership function $\mu_A$ taking values in the unit interval, $\mu_A : X \to [0, 1]$. The membership function gives the degree of similarity of a data point $x \in X$ within the universe of discourse (Yazdanbakhsh and Dick 2020). The approximate reasoning and decision-making ability of fuzzy logic helps the fuzzy system describe real-world uncertainty effectively. It can work with data that are imprecise, ambiguous, and uncertain (Gallab et al. 2019; Vlamou and Papadopoulos 2019). The first sequential design (Fig. 5a) starts by taking the input features of the data and converting their values into fuzzy sets, which are then processed by the DNN. That is, the inputs enter the fuzzy system to obtain fuzzy linguistic values, after which the neural network generates the outputs of the sequential DNFS model. The design in Fig. 5b works in the opposite order (Vieira et al. 2004). The DNN model assists the fuzzy system in determining the desired parameters when the DNFS model cannot measure the input values directly from the data (Abraham 2001). To illustrate a sequential DNFS, this review presents the deep fuzzy neural network (DFNN) proposed by Sarabakha and Kayacan (2019), which uses one antecedent fuzzification layer for learning control of nonlinear systems. As illustrated in Fig. 6, the DFNN neurons are organized in an input layer with $n_{inp}$ neurons, a Gaussian fuzzification layer $\mu_{gF}(x)$ with $n_{gF}$ neurons, $n_{hL}$ hidden layers with ($n_H + 1$) neurons in each layer, and an output layer with $y_{out}$ neurons.
In fuzzy set theory, the degree of truth is defined by a membership function, a curve that assigns a membership degree to every point in a specified input space. The inputs $x_1, \ldots, x_{n_{inp}}$ to the DFNN are fuzzified using three Gaussian membership functions with centers $c_{gF,1} = -1$, $c_{gF,2} = 0$, $c_{gF,3} = 1$ and widths $\sigma_{gF,1} = \sigma_{gF,2} = \sigma_{gF,3} = 1$, as indicated in Eq. (2), written here in the standard Gaussian form. The two parameters $c$ and $\sigma$ define the center (mean) and the width (spread) of the curve, respectively. Figure 7 shows the Gaussian membership function.

$$\mu_{gF_l}(x_j) = \exp\left(-\frac{(x_j - c_{gF,l})^2}{2\sigma_{gF,l}^2}\right) \qquad (2)$$

The fuzzified inputs $\mu_{gF_1}(x_1), \mu_{gF_2}(x_1), \mu_{gF_3}(x_1), \ldots, \mu_{gF_1}(x_{n_{inp}}), \mu_{gF_2}(x_{n_{inp}}), \mu_{gF_3}(x_{n_{inp}})$ are forwarded to the first hidden layer of the DFNN through the network weights $W_1$. The DFNN hidden layers are fully connected through the network weights $W_i$, $i = 2, \ldots, n_{hL} - 1$. Finally, the outputs $y_1, \ldots, y_{out}$ are calculated from the output of the final hidden layer using the weights $W_{n_{hL}}$. The network weights $W_i$, $i = 1, \ldots, n_{hL}$, are bounded by positive constants $c_{W,i}$, $i = 1, \ldots, n_{hL}$.

The learning of this model is divided into two stages: offline pre-training and online training. During the offline pre-training process, a classical controller executes a series of trajectories and collects a batch of training samples. The DFNN-based controller, denoted DFNN$_0$, is then pre-trained on the collected samples to estimate the inverse dynamics of the system. Because DFNN$_0$ cannot adapt to new conditions, online training is then conducted. During this stage, the DFNN continuously monitors and updates the input of the controller to improve the performance. The DFNN adaptive information is generated by expert knowledge encoded in the rule-based method using fuzzy mapping.
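The Gaussian fuzzification layer described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the centers and widths are the values stated in the text, while the function names `gaussian_mf` and `fuzzify`, the factor of 2 in the exponent, and the input-major ordering of the result are assumptions made for the example.

```python
import math

# Centers and widths as stated in the text: c = (-1, 0, 1), sigma = (1, 1, 1).
CENTERS = (-1.0, 0.0, 1.0)
SIGMAS = (1.0, 1.0, 1.0)


def gaussian_mf(x, c, sigma):
    """Gaussian membership degree of x for a fuzzy set centered at c.

    The 2 * sigma**2 denominator is the common convention and is an
    assumption here; some papers omit the factor of 2.
    """
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def fuzzify(inputs):
    """Map each crisp input to three membership degrees.

    The result is ordered input by input, so n inputs yield 3 * n values
    that would be forwarded to the first hidden layer.
    """
    return [
        gaussian_mf(x, c, s)
        for x in inputs
        for c, s in zip(CENTERS, SIGMAS)
    ]


if __name__ == "__main__":
    # Two crisp inputs produce six fuzzified values.
    print(fuzzify([0.2, -0.7]))
```

An input that coincides with a center receives a membership degree of exactly 1 for that set and smaller degrees for the other two, which is how each crisp value is expressed as a graded mixture of the three fuzzy sets.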
The approximation of the inverse dynamics of the system is a typical problem of regression; thus, the mean square error was set as a cost function in this study for both offline and online training. In a parallel structural design, data are processed separately from the fuzzy systems and a DNN, and then fused to obtain the final output of the data, as shown in Fig. 8 . The parallel structure uses a fuzzy system with a hierarchical DNN that derives information from both fuzzy and neural representations. The knowledge learned from these two (2) and Fig. 7 to determine the degree of membership and to specify the input belonging to a certain fuzzy set. In this phase, if p is input, q is an output, and n is the layer number, p n i denotes the input of the i-th neuron of the n-th layer. Similarly, q n j represents the output of the jth neuron in the n-th layer. The output is processed through a Pythagorean fuzzification, which is defined with parameter r to indicate the non-membership function as follows: Phase 2 -Neural net phase of HPFDNN In this phase, the neural net is formed based on the perceptron. In the perceptron, input values are multiplied by the weight of each neuron, and when the degree of the entire input signal exceeds the defined threshold value, a neuron produces the output. This is achieved by computing the sum of the weighted inputs with the threshold neural function (NF) on the sum to produce the output. Subsequently, the DNN converts the inputs into high-level representations by activated neurons, and the neural net phase helps the model acquire neural features. In this phase, a sigmoid activation function (Fig. 10 ) was used as follows: Here, the activated weights and biases are presented as the output of the neural net. 
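The perceptron computation in the neural net phase (a weighted sum of the inputs passed through a sigmoid activation) can be sketched as follows. This is a generic illustration under stated assumptions, not the HPFDNN code: the names `sigmoid`, `neuron_output`, and `dense_layer`, as well as the example weights and biases, are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


def dense_layer(inputs, weight_rows, biases):
    """Apply one neuron per row of weights; the outputs feed the next layer."""
    return [neuron_output(inputs, w, b) for w, b in zip(weight_rows, biases)]


if __name__ == "__main__":
    # Hypothetical layer with two inputs and two neurons.
    x = [0.5, -1.0]
    weights = [[0.4, 0.1], [-0.3, 0.8]]
    biases = [0.0, 0.2]
    print(dense_layer(x, weights, biases))
```

The bias plays the role of the threshold mentioned in the text: a larger negative bias requires a larger weighted input before the neuron's output rises above 0.5.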
Phase 3 - Fusion phase of HPFDNN

In the proposed HPFDNN model, the fuzzy and neural nets are fused to obtain the output through a fusion operation on $q_f$ and $q_n$, which represent the outputs of the fuzzy and neural phases, respectively.

Phase 4 - Learning phase of HPFDNN

This final phase of the HPFDNN generates a trained DNN model in which the output of every layer is used as the input of the next layer. The weight matrix connecting layer $n - 1$ to layer $n$ is denoted by $w^{(n)}_{n-1}$, and the bias vector is represented by $b^{(n)}$. In addition, a sigmoid function is used in each hidden layer of this phase to transform the de-normalized, real-valued output $q$.

In cooperative deep neuro-fuzzy designs, there are two potential models of DNFS, as illustrated in Fig. 11a, b. In Fig. 11a, the fuzzy interface block converts the crisp input into fuzzy values to provide an input vector to a multi-layer neural network in response to linguistic statements. Then, the DNN is trained to generate the required outputs, and defuzzification of the outputs is performed to convert the fuzzy value into a crisp output value. As shown in Fig. 11b, the fuzzy inference mechanism is determined by a multilayered DNN. Fuzzy systems obtain the computational characteristics of learning offered by a DNN, and in return, the DNN receives the interpretation and clarity of the system representation (Phuong and Kreinovich 2001).

Fig. 11 Structural design of cooperative DNFS: (a) fuzzy deep neural network and (b) deep neuro-fuzzy network

A simple example of a cooperative DNFS was proposed by Yeganejou et al. (2020), using a CNN for feature extraction and transferring the outputs of the final convolutional layer to a fuzzy classifier, as depicted in Fig. 12. With the proposed model, modifications can be made to the CNN based on a dataset or individual needs. In Fig. 12, layers 1-3 are the same as those in the LeNet architecture.
Subsequently, the data path is divided into two parts. In the deep fuzzy branch, the feature maps of layer 3 are extracted and sent for fuzzy clustering, and Rocchio's algorithm is employed to generate the final classification results for the network. The other path leads to a more conventional deep network comprising subsampling layer 4, a fully connected layer with ReLU neurons, and finally a fully connected layer with softmax activation functions, where $O_i$ is the $i$-th network output, $\vec{O}_{flc}$ is the output of the preceding fully connected layer, $\vec{w}^{PM}_i$ is the modified weight vector of the $i$-th softmax neuron, and $n$ is the number of neurons in the softmax layer. Here, $\vec{O}_{flc}$ represents the logit value of the output layer. The outputs of the network lie in [0, 1] and sum to 1 for every input.

The fuzzy classifier unit of the proposed model includes a feature-map selection step in which either all elements of a feature map are selected or none are. This step omits redundant feature maps, which leads to better classification accuracy. Next, PCA-based dimension reduction is applied before Gustafson-Kessel (GK) clustering, which requires a matrix inversion step; each chosen feature map contributes 196 features (a 14 × 14 image). Fuzzy clustering was executed, and Rocchio's algorithm (Yeganejou et al. 2020), presented in Fig. 13, was employed as a classifier. The network was trained using a mini-batch variant of stochastic gradient descent with an additional momentum term, where $\Delta w_{ij}(t)$ represents the update of the $ij$-th weight after observing the $t$-th mini-batch of input patterns, $\eta$ is the learning rate, $\delta_j(t)$ is the error for neuron $j$, and $m$ is the momentum constant.
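The softmax output layer and the momentum-based weight update described above can be illustrated with a short sketch. It uses the textbook forms of both operations, which are assumed here because the original equations are not reproduced in the text; the function names, the default learning rate and momentum values, and the sign convention of the update are all illustrative choices rather than the settings used by Yeganejou et al. (2020).

```python
import math


def softmax(logits):
    """Convert logits into outputs that lie in [0, 1] and sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def momentum_step(weights, grads, prev_deltas, lr=0.1, momentum=0.9):
    """One mini-batch update with momentum (assumed textbook form).

    delta(t) = -lr * grad(t) + momentum * delta(t - 1)
    w(t + 1) = w(t) + delta(t)
    """
    deltas = [-lr * g + momentum * d for g, d in zip(grads, prev_deltas)]
    new_weights = [w + d for w, d in zip(weights, deltas)]
    return new_weights, deltas


if __name__ == "__main__":
    print(softmax([2.0, 1.0, 0.1]))
    # Two consecutive updates with the same hypothetical gradient.
    w, d = momentum_step([1.0], [2.0], [0.0])
    w, d = momentum_step(w, [2.0], d)
    print(w, d)
```

With a constant gradient, the second step is larger than the first because the previous update is carried forward, which is the accelerating effect that the momentum constant controls.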
Here, $\Delta w_{ij}(t - 1)$ is the update applied to the $ij$-th weight for the previous mini-batch. A cross-entropy loss function $L_{CE}$ is used, where $b_n$ indicates whether the ground-truth class of the current input $\vec{I}$ is $n$, and $O_n$ is the probability that the deep network assigns to class $n$ for input $\vec{I}$.

Optimization plays an important role in finding the best solution from a set of available options with minimal resources. Whether in computing, engineering, or a simple task such as online shopping, optimization methods help to identify the best solution from a wide range of possible options. Similarly, various optimization methods, such as gradient descent, stochastic gradient descent, and conjugate gradient methods, have been adopted by machine learning and deep learning techniques for parameter optimization. These methods are well known as exact methods. However, metaheuristic algorithms have gained more popularity than exact methods for solving optimization problems owing to their simplicity and the robustness of the results they generate when implemented in a wide range of fields, including engineering, industry, transport, and even social sciences (Hussain et al. 2019). The exact methods are suitable for delivering optimal solutions to smaller problems by following local search mechanisms, whereas metaheuristic-based methods have shown significant performance in finding optimal solutions to large-scale problems using their global search ability (Kolajo et al. 2019). Metaheuristics are further divided into single-solution-based and population-based methods. Population-based (PB) metaheuristics offer a wide range of algorithms, such as evolutionary algorithms (EA) and swarm intelligence (SI)-based metaheuristics.
EAs operate on a population of individuals, where each individual represents a search point in the space of possible solutions, and the population is subjected to a collective learning process that transmits information to the next generations. In SI, each member of a swarm works independently on the basis of its stochastic behavior and observations of its neighborhood or surrounding environment (Kurban et al. 2014). This section aims to provide insight into the optimization methods used for optimizing the DNFS in the included studies. A careful investigation of the optimization methods is presented in Table 6. The analysis of the final data revealed that most papers on the DNFS model employed exact methods for network optimization, as presented in Table 6. Meanwhile, only five studies employed population-based (PB) metaheuristic approaches to optimize DNFS, namely brain storm optimization (BSO) (Ravi 2020), elephant herd optimization (EHO) (Velliangiri and Pandey 2020), genetic algorithms (GA) (Lee 2020), the crow search algorithm (CSA) (Chandrasekar 2020), and the Jaya optimization algorithm (JOA) (Siva Raja and Rani 2020). In three further studies, the optimization was performed by combining a metaheuristic with another algorithm: the genetic algorithm (GA) with big bang-big crunch (BB-BC) (Chimatapu et al. 2018), biogeography-based optimization (BBO) with Hessian-free (HF) optimization (Zheng et al. 2017), and BBO with a greedy layer-wise training method (Zheng et al. 2016). A few studies in the records described the model without mentioning any optimization method. Figures 14 and 15 provide a clearer picture of the distribution of the records found in the scientific databases for each optimization method presented in Table 6. Based on Fig. 15, IEEE Xplore has the highest number of publications on the optimization of DNFS using different methods.
Scopus has the fewest studies on DNFS optimization, and SpringerLink contains no studies that use optimization techniques other than exact methods. Along with the type and intensity of each optimization method in the scientific databases, Fig. 16 identifies the trend of studies in the DNFS domain employing exact optimization methods. Similarly, a deeper analysis was carried out to determine when researchers began using population-based (PB) metaheuristics for DNFS optimization. This helps to identify the scope of metaheuristics-based methods within this domain. From Fig. 17, it is evident that metaheuristic methods were first implemented in the DNFS domain in 2017, and only eight studies were conducted from 2017 onward.

According to the primary data extracted from the scientific databases, research on the integration of deep learning techniques with fuzzy systems, such as a fuzzy-based DNN in the form of a DNFS, began in 2015 (Laleye et al. 2015) and has gained the attention of the research community ever since. Since then, the model has been employed to solve problems in various application areas. Although the implementation of such a system is still in its early stages, the rise in the number of publications, as indicated in Fig. 18, cannot be ignored. The figure shows that research on the integration of a DNN and fuzzy systems attracted growing interest from 2015 to 2016, and the number of publications grew steadily until 2017. After a decrease in 2018, the number of publications on DNFS increased again from 2019 to 2020, as can be seen in Fig. 18, along with application subjects and an analysis of DNFS optimization methods. Hence, by observing the intensity of the publications in Fig. 18, it can be concluded that the novel DNFS has a promising scope both at present and in the future. A total of 252 publications found in the literature were published over the past 6 years (from 2015 to 2020). This does not necessarily imply that all publications in the literature have been found, and there remains more to be explored. However, during the DNFS-related keyword search, most of the databases displayed related studies only on the first three pages. This shows that there is still much to be explored in the domain of DNFS and that the intensity of the research conducted in this domain has continued to increase over the past 6 years, as shown in Fig. 18. To be more careful, we extended our search to a maximum of five pages for each scientific database. Priority was given to studies appearing on the first five search result pages from ACM, IEEE Xplore, Scopus, ScienceDirect, and SpringerLink. Figure 19 shows the intensity of publications for DNFS in each scientific database. The most popular avenue for published literature related to the DNFS is IEEE Xplore (which also has the highest number of conference proceedings, journals, papers, and book chapters), followed by SpringerLink, ScienceDirect, Scopus, and ACM.

Combining fuzzy systems with a DNN enables the development of AI models that are not only accurate in prediction but also inherently interpretable and understandable to humans. Furthermore, the experimental results presented in (Yazdanbakhsh and Dick 2019) show that the DNFS approach is capable of achieving better accuracy than a DNN with the same level of abstraction/depth. Because basic neuro-fuzzy systems have been major research topics for over 27 years, various surveys and systematic review studies can be found in the literature. However, there is a lack of current research on the novel DNFS hybrid technique (Yeganejou and Dick 2018).
Hence, researchers have started exploring the potential of this new domain by implementing the model in various applications ranging from the computing domain to the healthcare, manufacturing, and aviation industries. Most breakthroughs regarding the implementation of DNFS in various application subject domains are highlighted in the following subsections.

Fig. 19 Publications of DNFS directory-wise

Techniques in the field of AI have made significant contributions to the solution of different real-world problems, including those in the computing domain. Likewise, the novel DNFS has also shown positive potential in solving multiple problems, mainly from this domain. In this subsection, we discuss the largest group of studies found in the records, which implement the DNFS in different subject domains of computing, including distributed systems, cloud computing, cybersecurity, Internet marketing, software testing, and the classification of image, speech, text, and video. Sentiment classification on Twitter is important for real-world situations such as decision-making and information systems, where customers might obtain relevant information through online reviews. Service ratings can serve as an excellent starting point for the decision-making process as they provide quick information on the online reviews (Uma 2020). Therefore, an optimization-based fuzzy deep learning classification was proposed in (Uma 2020) and (Bedi and Khurana 2020) for sentiment analysis. The proposed method was developed to solve the misclassification problem in social media reviews. Similarly, taking advantage of deep learning and fuzzy inference, the authors in (Nguyen et al. 2018) proposed a hybrid fuzzy convolutional neural network (FCNN) that integrates fuzzy logic and a CNN model for sentiment classification of Twitter posts and movie reviews.
The proposed model can resolve ambiguities in data with linguistic labels that are important for emotion detection in sentiment analysis tasks. A comparison of the results between the proposed FCNN and a conventional CNN showed that the proposed FCNN achieves better classification accuracies on emotional data. In another study (Zhou et al. 2014), the authors embedded prior knowledge into the learning structure, creating a two-step semi-supervised learning method called fuzzy deep belief networks (FDBN) for sentiment classification. In addition, DNFS has been implemented for congestion control in wireless sensor networks (WSNs) (Monisha and Ranganayaki 2018) and data stream processing (Mahardhika Pratama et al. 2018).

(ii) DNFS application on cloud computing

A few years ago, personal computers were not capable of tackling heavy workloads with vast amounts of data for processing. Hence, processing massive data was (and in some regards still is) a challenging problem until cloud computing was introduced to offer services as a solution to massive data storage. While attempting to reduce the service rates, the most important aspect was the capability to schedule an upscale of cloud system resources for future or on-demand use. To ensure that cloud services are affordable to customers, the authors in (Chen et al. 2018a, b) proposed a fuzzy deep neural network (FDNN) model to predict the demand for cloud computing resources. This model can assist customers in deciding the number of resources to be reserved for their computing needs, thereby reducing their operational costs. In another study, the same authors developed a hierarchical Pythagorean fuzzy deep neural network (HPFDNN) model by incorporating the properties of Pythagorean fuzzy logic and a DNN.

(iii) DNFS application on cybersecurity

Modern malware is an alarming threat to both individual and organizational security.
Over the last few decades, several malware families have been codified and differentiated based on their behavior and functionality. Most machine learning methods work well with the general benign-malicious classification, but are unable to distinguish new malware among many classes (Shalaginov and Franke 2017). Therefore, Shalaginov and Franke (2017) proposed a novel deep neuro-fuzzy architecture for multi-label malware classification and fuzzy rule extraction. In addition, smart grids (SGs) are critical and intelligent systems. To ensure their security, however, the network requires a cybersecurity method with advanced intrusion detection and prevention systems (IDPSs). Therefore, to provide the best possible security for SGs, a smart collaborative IDPS was introduced in (Patel et al. 2017) with a fully distributed management system to protect the network from attacks. In (Amosov et al. 2019), a hybrid model of fuzzy logic with convolutional layers was introduced to detect denial-of-service (DoS) attacks in a highly loaded corporate network by recognizing abnormal network traffic.

(iv) DNFS application on Internet marketing

Apart from sentiment analysis, Internet marketing and online advertising are seen as successful approaches for promotional engagement because of their solid and personalized communication capabilities. As a result, many researchers have shown interest in Internet advertisements, which have become an important source of revenue for online businesses. The click-through rate (CTR) is an effective factor for determining the effect of targeted advertising. Therefore, an FDNN was proposed in (Jiang et al. 2018) to predict the advertising CTR. In addition, fuzzy clustering and deep learning were combined in (Yin et al. 2020) to forecast the sales of new products.
(v) DNFS application in software testing

In the field of software engineering, human-based software testing consumes a lot of time and resources. However, software testing is an essential part of the process to validate the performance of a product under different circumstances before it is released for consumer usage. Therefore, to save cost and time during the testing process, various software testing tools related to test oracles have been reported in the literature. Among such studies, the authors in (Monsefi et al. 2019) implemented a novel deep neuro-fuzzy approach for a software test oracle. The proposed approach was validated using four different applications and produced better accuracy in detecting errors with correct data. By contrast, in (Liu et al. 2019), a deep-learning and fuzzy oversampling-based model called DeepBalance was used for software vulnerability detection.

(vi) DNFS application on image, speech, and text classification

Deep learning has previously been applied successfully to various tasks such as data, text, and image classification. Likewise, the majority of the articles in the literature on DNFS applications handle these challenges by combining fuzzy systems and a DNN. Korshunova (2018) proposed a convolutional fuzzy neural network approach that uses convolutional, pooling, fully connected, and fuzzy self-organization layers for the classification of real-world objects and image scenes. The approach combines the advantages of a CNN and fuzzy logic to tackle ambiguity in the interpretation of the input sequence. Inspired by the current advancements of CNNs, Greeshma and Bindu (2017) developed a fuzzy deep learning algorithm for single-image super-resolution. This novel approach uses a fuzzy rule layer along with a deep network to recreate a high-resolution image. In addition, the authors of (Deng et al. 2017) presented an FDNN for classification tasks, such as natural scene image categorization and stock trend prediction. Similar studies for image classification can be found in (Guan et al. 2020; Kunchala et al. 2020; Liu et al. 2020a, b; Manchanda et al. 2020; Tianyu and Xu 2020; Dick 2018, 2019; Yeganejou et al. 2020; Zhang et al. 2020a, b, c). In addition to image classification, DNFS has been successfully implemented to perform speech classification. A speech enhancement framework using a fuzzy deep belief network (FDBN) was reported in (Samui et al. 2019). In this model, the network performs pre-training with the help of multiple FDBNs to enhance the stability and speed of feature learning. Meanwhile, Xu and Xiao (2018) suggested a theoretical method that combines fuzzy optimization with deep learning to cope with the fuzziness of emotions. Since data are growing at an exponential rate, it is essential to summarize a text document in order to understand its key elements. Many studies have been conducted on summarization methods, and most of them are extractive summarizers. Accordingly, Chopade and Narvekar (2017) presented a hybrid approach of DNN and fuzzy logic systems. In this study, a restricted Boltzmann machine (RBM) is proposed with a DNN and fuzzy rules based on phrases to extract the features using a sentence matrix.

(vii) DNFS application on video classification and robotics

Nguyen et al. (2019) presented a novel convolutional neuro-fuzzy network, which incorporated a CNN into the fuzzy logic domain to derive high-level features of emotions from text, audio, and image data.
Alternatively, the authors in (Cunha Sergio and Lee 2020) proposed a novel hybrid DNN with ANFIS to interpret the emotions of a video from its visual features, together with a deep long short-term memory recurrent neural network to produce the related audio signals with an equal emotional impression. Likewise, in another study (Savchenko et al. 2018), the authors implemented a fuzzy analysis with a CNN in still-to-video recognition. Moreover, the model has also been implemented in human action recognition and robotics (Bendre et al. 2020; Chen et al. 2020a, b; Liao et al. 2020; Mohmed et al. 2020; Wu et al. 2020). Figure 20 shows the intensity of publications for various DNFS applications in the computing domain. Based on Fig. 20, most DNFS applications are focused on image, speech, and text classification, followed by video classification and robotics, distributed systems, cybersecurity, cloud computing, software testing, and Internet marketing.

Similar to the computing domain, DNFS has been successfully implemented in the healthcare sector to perform robotic surgeries and predict various diseases. A deep neuro-fuzzy approach was implemented in (Aviles et al. 2016) to estimate the interaction forces while performing a robotic surgery. A fuzzy hybridized FCNN model was used in (Ramasamy and Hameed 2019) to classify healthcare data. In (Davoodi and Moradi 2018), a modern fuzzy deep model was proposed for mortality prediction in intensive care units (ICUs). For this, a deep framework was developed based on the layered structure of the fuzzy rule base, which can address big data problems. In (Park et al. 2016), the researchers proposed fuzzy deep learning (FDL), a specific estimation method for intra- and inter-fractional variations in many patients. The proposed FDL combines breathing clustering with the prediction of precise movements while decreasing the computational cost. Sharma et al. (2020) employed a DNFS as a decision-making system to predict the risk and severity of diseases. The study presented a hybrid diagnosis strategy (HDS) using fuzzy inference and a DNN to detect COVID-19 patients. Moreover, the novel hybrid approach to DNFS has been implemented for tumor and cancer detection and segmentation (Banerjee et al. 2020; Lima et al. 2020; Mudiyanselage et al. 2020; Özyurt et al. 2019; Pitchai et al. 2020; Rahouma et al. 2019; Sengan et al. 2020; Shen et al. 2020; Yang et al. 2020; Zhang et al. 2020a, b, c).

As the field of financial engineering has expanded over the last few years from financial signal analysis to financial prediction methods, it has become one of the most important topics in the academic community and the financial world. Several hybrid intelligent financial prediction systems incorporating neural networks, fuzzy logic, and genetic algorithms have been proposed over the past 20 years. Likewise, to make a worldwide financial prediction, Lee (2020) introduced a chaotic type-2 transient-fuzzy deep neuro-oscillatory network (CT2TFDNN) with retrograde signaling. Other studies (Chandrasekar 2020; Chen et al. 2020a, b; Wang 2020; Xiao 2020) have implemented the DNFS method in Bitcoin price prediction, stock index prediction, and e-commerce platforms.

In the current era of AI, the use of intelligent software for travel assistance is growing rapidly. An intelligent transportation system (ITS) is an advanced transport management system that incorporates electronic information, AI, global positioning system (GPS) tracking, communications engineering, and other techniques. Generally, traffic flow data are limited by complexity and noise interaction. However, compared with the previous deterministic explanation, fuzzy theory is capable of generalizing the original data more logically. Hence, a fuzzy-based convolutional neural network (F-CNN) approach was implemented for predicting traffic flow.
This approach uses a fuzzy inference system (FIS) to produce uncertain knowledge about traffic incidents. In addition, a CNN training algorithm is used to learn the characteristics of internal traffic data, traffic accident information, and external information, thus forming an F-CNN prediction model to predict the traffic flow. A few similar studies have employed DNFS for traffic flow prediction and incident detection in (Chen et al. 2018a, b; El Hatri and Boumhidi 2018; Sumit and Akhter 2019; Usman et al. 2020) . At the same time, the authors in (Chai et al. 2020; Ivanov et al. 2019 ) proposed the same model for monitoring unmanned surface vehicles (USVs) and hypersonic vehicles. In the welding industry, methods and techniques must consider trends of robotic usage and a large multi-structural architecture to meet the criteria of current development projects in the market. In addition, the technologies in modern manufacturing have led to new developments in welding techniques. Drawing inspiration from the AI technique, the study in (Kesse et al. 2020) proposed the implementation of an AI-based tungsten inert gas (TIG) algorithm for welding to identify the control parameters and predict the optimal welding bead width using fuzzy deep learning. Similarly, for industrial accidents, it is important to prevent and control industrial accidents with an early warning. The existing approaches are time-consuming, unreliable, and incompetent in coping with uncertainty. Therefore, an FDNN was implemented in (Gobinath and Madheswaran 2020; Lin et al. 2020; Yun et al. 2020; Zhang et al. 2020a, b, c; Zheng et al. 2017) to diagnose the faults in machines, provide a forecast, and alert managers for possible industrial accidents in advance. In addition, the study of (Remya and Sasikala 2019) used hybridization of the back-propagation in deep learning and fuzzy logic decision tree for rubberized coir fiber classification. 
In the aviation sector, passenger profiling plays a vital role in maintaining commercial airline security. However, the conventional methods have become inefficient in handling the rapidly increasing amounts of electronic records. Hence, the researchers in (Zheng et al. 2016) proposed a deep neuro-fuzzy approach that integrates ordinary Pythagorean-type fuzzy sets with a deep Boltzmann machine (DBM), yielding a Pythagorean fuzzy deep Boltzmann machine (PFDBM). This study further proposed a hybrid learning algorithm that combines a biogeography-based optimization (BBO) metaheuristic algorithm, to improve the exploration search, with a gradient-based method, to enhance the exploitation search. The simulations performed on the Air China datasets indicate that the proposed solution offers high classification accuracy and a strong learning ability. In addition, many pattern-analysis tasks can be solved using this approach.

The incorporation of smart meters in energy management systems has made it easier for electrical companies to access the electricity usage data of their customers. However, extracting and analyzing enormous amounts of data is challenging for these companies. Therefore, researchers have started to utilize various AI techniques to analyze data retrieved from smart meters (Javaid et al. 2019). Efforts toward the energy management of residential buildings were made in (Javaid et al. 2019). That study focused on efficient load and cost optimization by proposing the use of a DNFS to handle the uncertain behavior of consumers with large amounts of data. The study reports that the proposed model is robust in terms of cost optimization and energy efficiency. Apart from this, one more study was found in the literature that combines fuzzy and deep learning methods to predict the hourly load of the next 7 days.
The proposed technique showed superior performance compared to traditional load forecasting schemes (Sideratos et al. 2020). Figure 21 summarizes and visualizes the DNFS research applications in various domains, together with the intensity of publications for each application subject domain. Fig. 22, in turn, visualizes the distribution of records found in each application domain for the papers included in this systematic literature survey.

As a comprehensive analysis of the studies found through the designed study mapping process, this systematic literature survey reports a total of 105 relevant studies addressing the research questions. This section is organized into two subsections. The first subsection (Sect. 5.1) highlights the research gaps, issues, and challenges found while answering the four research questions. In addition, a few recommendations are presented to help researchers find potential directions for future work. The limitations of this systematic literature survey are presented in the second subsection (Sect. 5.2).

Research Question (RQ1)

Based on the literature published during the past 5 to 6 years, several variations of DNFS have been proposed since its emergence, and the successful implementation of this emerging model is growing rapidly in a variety of application domains. Because adding a fuzzy layer to a DNN is extremely flexible, the layer can be placed anywhere in the network architecture, depending on its desired behavior. Hence, this study also covered the three structural designs that have been developed for deep neuro-fuzzy models, namely sequential, parallel, and cooperative structural designs.
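To make the distinction between structural designs more concrete, the following minimal Python sketch contrasts a sequential and a parallel arrangement of a fuzzy layer and a neural layer. It is an illustration under our own assumptions, not an implementation taken from any of the surveyed studies: the Gaussian membership functions, the layer sizes, the random weights, and all function names (gaussian_membership, dense, random_layer, sequential_forward, parallel_forward) are hypothetical, and the cooperative design is omitted.

```python
import math
import random


def gaussian_membership(x, centers, sigma=1.0):
    """Fuzzy layer (assumed Gaussian sets): degree of each feature in each set."""
    return [math.exp(-0.5 * ((v - c) / sigma) ** 2) for v in x for c in centers]


def dense(x, weights, bias):
    """Fully connected layer with ReLU; weights holds one row per output unit."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def random_layer(n_in, n_out, rng):
    """Untrained placeholder weights, used only to exercise the forward pass."""
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def sequential_forward(x, centers, layer):
    """Sequential design: the fuzzy layer's output is the neural layer's input."""
    return dense(gaussian_membership(x, centers), *layer)


def parallel_forward(x, centers, neural_layer, fusion_layer):
    """Parallel design: both branches see the raw input; outputs are fused."""
    fuzzy_branch = gaussian_membership(x, centers)
    neural_branch = dense(x, *neural_layer)
    return dense(fuzzy_branch + neural_branch, *fusion_layer)


rng = random.Random(0)
centers = [-1.0, 0.0, 1.0]   # three fuzzy sets per feature (illustrative)
x = [0.2, -0.7, 1.5]         # one sample with three features

seq_out = sequential_forward(x, centers, random_layer(9, 5, rng))
par_out = parallel_forward(x, centers,
                           random_layer(3, 6, rng), random_layer(15, 5, rng))
print(len(seq_out), len(par_out))
```

In the sequential variant every input must pass through the fuzzy layer before the neural layer can run, whereas the two branches of the parallel variant are independent until the fusion step and could, in principle, be evaluated concurrently.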
These novel deep neuro-fuzzy systems are deep networks that combine fuzzy rules and membership degrees with DNN parameters such as the learning rate, number of layers, number of nodes per layer, a huge number of weights, activation functions, and an optimizer. Despite the various choices of structural design, answering the first research question revealed that most of the examples found over the 6-year period used sequential structural designs, rather than parallel or cooperative ones, to keep the model simple. However, sequential structures are linear by design and are considered slow compared with the other two structural designs. This linearity becomes a challenge when implementing DNFS in the big data paradigm owing to its deep architecture. By contrast, parallel and cooperative structural designs could be more successful in solving complex real-world problems involving large-scale data because of their flexibility in learning and interpretability. Hence, future research on the development of DNFS could be oriented toward the efficient hybridization of fuzzy or neuro-fuzzy systems with a DNN to create parallel and cooperative models. Moreover, because these models have been proposed to tackle big data problems, the computational complexity increases when dealing with huge and complex data owing to the deep architecture. In addition to the suggested structural changes, a few studies have recommended hardware solutions, such as powerful GPUs, FPGAs, and memristors, to overcome the problem of computational complexity. In the same context of big data, it is often challenging to design suitable techniques and methods for streaming data that are continuously generated by different sources at high speeds.
In the past, various AI-based decision-making systems have been presented (Almuammar and Fasli 2019; Lobo et al. 2018; Pratama and Wang 2019; Ullah et al. 2019). However, when it comes to DNFS, only a single study (Pratama et al. 2020) was found that develops this model for the continual learning of nonstationary data streams. That study introduces a deep evolving fuzzy neural network (DEVFNN) with an elastic structure, which allows the fuzzy rules and the depth of the network to be modified dynamically. In the future, more research should be directed toward constructing DNFS models that can analyze streaming data dynamically to generate instant and reliable outcomes.

Research Question (RQ2)

The second research question covered the methods employed to optimize the parameters of the DNFS. Based on the data presented in Table 6 and Fig. 14, it is evident that most of the studies have used exact methods such as the gradient descent (GD) algorithm, which is iterative and prone to becoming stuck in local minima. Therefore, when facing large-scale data, deep neuro-fuzzy models often suffer from slow convergence and poor outcomes (Das et al. 2020). This problem ultimately affects the accuracy of the model in classification tasks. As a result, several modern optimization techniques known as metaheuristics have been introduced and implemented in the literature to optimize machine learning models efficiently. However, very limited efforts were made in the literature from 2017 to 2020 to solve the local minima problem of DNFS models with the help of metaheuristic techniques. Moreover, based on our findings for this research question, the majority of studies have used evolutionary-based metaheuristic optimization approaches, whereas only two studies have adopted the swarm intelligence approach. However, according to the research work presented in (Kurban et al.
2014), swarm-based algorithms are generally more accurate and reliable than evolutionary algorithms. In contrast, the analysis in (Janga Reddy and Nagesh Kumar 2020) states that evolutionary algorithms outperform swarm-based algorithms in terms of finding a near-optimal solution within a reasonable computational time. In addition, we cannot ignore the No Free Lunch theorem, which implies that no single metaheuristic algorithm outperforms all others across all real-world problems. Therefore, it is challenging to single out one metaheuristic optimization algorithm that can be used to solve classification, time-series, computer vision, natural language processing, and other tasks alike. To date, several attempts have been made to introduce new metaheuristic techniques in the literature. Among these methods, some of the popular algorithms are cuckoo search (CS) (Gandomi et al. 2013), bat algorithm (BA) (Yang and He 2013), grey wolf optimizer (GWO) (Mirjalili et al. 2014), animal migration optimization (AMO) (Li et al. 2014), whale optimization algorithm (WOA) (Mirjalili and Lewis 2016), emperor penguins colony (EPC) (Harifi et al. 2019), mayfly algorithm (MA) (Zervoudakis and Tsafarakis 2020), and equilibrium optimizer (EO) (Faramarzi et al. 2020). Given the issues identified above, research on optimizing DNFS with metaheuristic-based algorithms still needs significant work. Researchers may investigate the newly introduced metaheuristic optimization methods mentioned above to compare and further improve the performance of DNFS.

Research Question (RQ3)

The intensity of publications in the domain of DNFS was carefully examined and addressed in this research question. Based on our extensive search of the online databases, it can be concluded that research on the development of DNFS first took place in 2015. According to Fig.
18, the domain of DNFS was not explored much during the first four years (2015-2018) of its development. This may reflect the difficulties and challenges researchers faced in understanding the new concept of integrating deep learning with fuzzy systems. Nevertheless, a major rise in DNFS publications can be seen from 2019 to 2020. The DNFS model has started gaining attention among research communities while successfully solving various problems in the subject areas of computing, engineering, and industry. Since this domain is still a new area of interest, the majority of the research work has implemented the model in different application domains without any major effort to improve its performance by reducing the computational complexity. Hence, future research should be geared towards exploring efficient ways to improve the training mechanism, hyperparameter tuning (e.g., learning rate and number of hidden layers), fuzzy knowledge base, and structural modifications of the DNFS model.

Research Question (RQ4)

With this research question, this study tried to cover the majority of the applications in the domain of DNFS. Since the model has been introduced only recently, few studies were found in the literature that have used DNFS for different application subjects. Figure 21 of Sect. 4.4 shows that the use of DNFS models in the computing area is the leading trend, followed by healthcare, traffic management systems, and the manufacturing industry. Most studies in the computing domain have used the DNFS model for image, speech, and text classification. Video classification, robotics, and distributed systems follow, ahead of other fields such as cybersecurity, cloud computing, software testing, and Internet marketing, as shown in Fig. 20.
However, relatively little research using DNFS has been identified in the aviation industry, finance and economics, and energy management. Therefore, in answering this research question, it can be concluded that there is considerable scope for the future implementation of the model in technologies under the fourth industrial revolution (4IR), such as AI, blockchain, virtual and augmented reality, cybersecurity, biotechnology, the Internet of Things, digital signal processing, robotics, the manufacturing industry, and renewable energy.

A critical analysis of the records found in the literature revealed that only four survey studies have been published on DNFS. However, our study is the first initiative in this domain to present a systematic literature survey. The motivation behind conducting this systematic literature survey was to investigate and identify detailed statistics and figures by performing an in-depth analysis of the records obtained from the literature, and to cover all papers related to the area. Nevertheless, this systematic literature review has a few limitations. For instance, during the screening procedure, only studies published with detailed knowledge and written in English were included. As a result, some short papers or publications in other languages, which may have made a positive contribution in this domain, were not analyzed. Furthermore, to maintain the quality and reliability of this systematic literature survey, we had to exclude research papers that claimed their method was a combination of fuzzy systems and deep learning but had poorly defined methodologies. We are aware that these filters might have affected the final findings of the included studies. Nonetheless, the decision to exclude the aforementioned papers was not taken lightly and was based on the inclusion and exclusion criteria (see Table 4), as well as the eligibility criteria (see Table 5).
Thus, the goal of this systematic literature review was to identify and highlight the majority of work published in the DNFS domain.

This systematic literature survey aims to capture state-of-the-art research in the novel domain of DNFS by following the guidelines of well-written systematic reviews from the literature. As a result, a revised study mapping process comprising seven phases was introduced in this study. Four research questions were designed to lay the foundation of this study and help extract meaningful information from the database to draw a comprehensive picture of the current state of research related to DNFS. A total of 252 studies were retrieved during the first step of the primary search using the selected keywords and search strings. The in-depth analysis of the identified publications made it clear that DNFS-based systems are relatively new, with only a few relevant papers in the literature. Nevertheless, a total of 105 studies were retained after the quality assessment process, and these provide answers to the research questions of this systematic literature survey. Moreover, the well-defined answers to the research questions helped to identify the research gaps, issues, and challenges of this particular domain. In addition, this study addressed possible future directions, including potential structural designs (e.g., parallel and cooperative architectures) to further strengthen the outcomes for classification and prediction problems. This study also suggested the implementation of modern optimization methods such as metaheuristic techniques to optimize DNFS in the future. It further suggested improving the performance of the model through the training mechanism, hyperparameter tuning (e.g., learning rate and number of hidden layers), and the fuzzy knowledge base.
Along with that, this study also identified and recommended potential application areas where the DNFS has not yet been deployed, such as virtual and augmented reality, business, education, robotics, manufacturing, renewable energy, and engineering. Recommendations were made to address the limitations found in the literature to help both researchers and practitioners interested in this particular domain. Therefore, this comprehensive systematic literature survey aims not only to provide researchers with the maximum information about DNFS in a single paper but also to offer a platform for researchers who wish to commence their research and explore the potential of DNFS for future work. Finally, this study discussed the limitations of the systematic literature survey that affected the final results of the included studies, namely that a few studies were excluded because they had poorly defined methodologies, were short papers, or were published in languages other than English.
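As a small illustration of the optimization issue discussed under RQ2, the following Python sketch compares plain gradient descent with a basic particle swarm optimizer on a one-dimensional toy loss that has one local and one global minimum. Everything in it is an assumption made for illustration: the loss function, the step size, the swarm coefficients (0.7, 1.5, 1.5), and the function names are hypothetical and are not taken from any surveyed study, and particle swarm optimization is used only as one representative swarm intelligence method. The particles start on an even grid, rather than at random positions, purely to keep the sketch reproducible.

```python
import random


def f(x):
    """Toy loss: local minimum near x = 1.13, global minimum near x = -1.30."""
    return x**4 - 3 * x**2 + x


def grad_f(x):
    """Analytical derivative of the toy loss."""
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x0, lr=0.01, steps=500):
    """Follows the local slope only, so the result depends on the start point."""
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x


def particle_swarm(lo=-2.5, hi=2.5, n=21, iters=100, seed=0):
    """Basic particle swarm search over the interval [lo, hi]."""
    rng = random.Random(seed)
    pos = [lo + (hi - lo) * i / (n - 1) for i in range(n)]  # even grid start
    vel = [0.0] * n
    best_pos = pos[:]            # best position seen by each particle
    g_best = min(pos, key=f)     # best position seen by the whole swarm
    for _ in range(iters):
        for i in range(n):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (0.7 * vel[i]
                      + 1.5 * r1 * (best_pos[i] - pos[i])
                      + 1.5 * r2 * (g_best - pos[i]))
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))  # keep inside bounds
            if f(pos[i]) < f(best_pos[i]):
                best_pos[i] = pos[i]
            if f(pos[i]) < f(g_best):
                g_best = pos[i]
    return g_best


gd_x = gradient_descent(x0=2.0)
pso_x = particle_swarm()
print(f"gradient descent: x={gd_x:.3f}, loss={f(gd_x):.3f}")
print(f"particle swarm:   x={pso_x:.3f}, loss={f(pso_x):.3f}")
```

Started at x = 2, gradient descent settles in the nearer, shallower basin, whereas the swarm samples the whole interval and keeps the best point found. In line with the No Free Lunch theorem, this toy comparison says nothing about which method is preferable for a real DNFS training problem.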
In: Paper presented at the connectionist models of neurons, learning processes, and artificial intelligence Recognition of abnormal traffic using deep neural networks and fuzzy logic A novel fuzzy-based convolutional neural network method to traffic flow prediction with uncertain traffic accident information Deep rule-based classifier with human-level performance and characteristics Multi-biometric sustainable approach for human appellative A deep-neuro-fuzzy approach for estimating the interaction forces in robotic surgery Customer relationship management systems (CRMS) in the healthcare environment: a systematic literature review Bag R (2020) Melanoma diagnosis using deep learning and fuzzy logic Sentiment analysis using fuzzy-deep learning Human action performance using deep neuro-fuzzy recurrent attention model An approach to explainable deep learning using fuzzy inference Analysis of explainers of black box deep neural networks for computer vision: a survey Real-time reentry trajectory planning of hypersonic vehicles: a two-step strategy incorporating fuzzy multiobjective transcription and deep neural network Fuzzy crow search algorithm-based deep LSTM for bitcoin prediction Prediction of cloud resources demand based on fuzzy deep neural network Prediction of cloud resources demand based on hierarchical pythagorean fuzzy deep neural network A Fuzzy deep neural network with sparse autoencoder for emotional intention understanding in human-robot interaction A novel fuzzy deep-learning approach to traffic flow prediction with uncertain spatial-temporal data features A deep hybrid fuzzy neural hammerstein-wiener network for stock price prediction Interval type-2 fuzzy logic based stacked autoencoder deep neural network for generating explainable AI models in workforce optimization Hybrid auto text summarization using deep neural network and fuzzy logic system Emotional video to audio transformation using deep recurrent neural networks and a neuro-fuzzy system 
Theory and applications of ordered fuzzy numbers: a tribute to Professor Witold Kosiński Fuzzy deep neural network for classification of overlapped data Mortality prediction in intensive care units (ICUs) using a deep rulebased fuzzy classifier Fuzzy neural networks and neuro-fuzzy networks: a review the main techniques and applications used in the literature A hierarchical fused fuzzy deep neural network for data classification Deep learning classifiers with memristive networks: theory and applications Fuzzy deep learning based urban traffic incident detection Fuzzy based multi-line power outage control system Equilibrium optimizer: a novel optimization algorithm Risk assessment of maintenance activities using fuzzy logic Deep perceptron neural network with fuzzy PID controller for speed control and stability analysis of BLDC motor Single image super resolution using fuzzy deep convolutional networks Semi-supervised deep rule-based approach for image classification A massively parallel deep rule-based ensemble classifier for remote sensing scenes Lip image segmentation based on a fuzzy convolutional neural network Emperor penguins colony: a new metaheuristic algorithm for optimization Black box nature of deep learning for digital pathology: beyond quantitative to qualitative algorithmic performances A systematic literature review on features of deep learning in big data analytics Optimization of ANFIS using mine blast algorithm for predicting strength of malaysian small medium enterprises Metaheuristic research: a comprehensive survey Intelligent deep neuro-fuzzy system recognition of abnormal situations for unmanned surface vehicles Evolutionary algorithms, swarm intelligence methods, and their applications in water resources engineering: a state-of-the-art review Towards buildings energy management: using seasonal schedules under time of use pricing tariff via deep neuro-fuzzy optimizer FPGA implementation of a functional neurofuzzy network for nonlinear system 
control An improved advertising CTR prediction approach based on the fuzzy deep neural network Optimization based fuzzy deep learning classification for sentiment analysis Development of an artificial intelligence powered TIG welding algorithm for the prediction of bead geometry for TIG welding processes using hybrid deep learning Neuro-fuzzy control of a position-position teleoperation system using FPGA Big data stream analysis: a systematic literature review A convolutional fuzzy neural network for image classification Transfer learning based fuzzy deep neural networks for leaves detection from digital images Comparison of evolutionary and swarm based computational techniques for multilevel color image thresholding Adaptive decision-level fusion for Fongbe phoneme classification using fuzzy logic and Deep Belief Networks Deep learning Chaotic type-2 transient-fuzzy deep neuro-oscillatory network (CT2TFDNN) for worldwide financial prediction Animal migration optimization: an optimization algorithm inspired by animal migration behavior A fuzzy ensemble method with deep learning for multi-robot system A proposal for an explainable fuzzy-based deep learning system for skin cancer prediction Using fuzzy uncertainty quantization and hybrid RNN-LSTM deep learning model for wind turbine power Deep fuzzy graph convolutional networks for PolSAR imagery pixelwise classification Fuzzified image enhancement for deep learning in iris recognition DeepBalance: deep-learning and fuzzy oversampling for vulnerability detection Drift detection over non-stationary data streams using evolving spiking neural networks An improved multifocus image fusion algorithm using deep learning and adaptive fuzzy filter Fuzzy membership function implementation with memristor An FPGA-based neuro-fuzzy sensor for personalized driving assistance The whale optimization algorithm Grey Wolf optimizer Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement Convolutional 
neural network classifier with fuzzy feature representation for human activity modelling Congestion avoidance aware using modified weighted fairness guaranteed DRED-FDNNPID congestion control for MWSN Performing software test oracle based on deep neural network with fuzzy inference system Deep fuzzy neural networks for biomarker selection for accurate cancer detection Applying deep learning techniques for big data analytics: a systematic literature review A fuzzy convolutional neural network for text sentiment analysis A multimodal convolutional neuro-fuzzy network for emotion understanding of movie clips Brain tumor detection based on Convolutional Neural Network with neutrosophic expert maximum fuzzy sure entropy Intra-and inter-fractional variation prediction of lung tumors using fuzzy deep learning A nifty collaborative intrusion detection and prevention architecture for Smart Grid ecosystems A review on advances in deep learning Fuzzy logic and its applications in medicine Brain tumor segmentation using deep learning and fuzzy k-means clustering for magnetic resonance images An incremental construction of deep neuro fuzzy system for continual learning of non-stationary data streams Deep stacked stochastic configuration networks for lifelong learning of nonstationary data streams Brain cancer diagnosis and prediction based on neural gas network and adaptive neuro fuzzy Classification of healthcare data using hybridised fuzzy and convolutional neural network Image classification using deep learning and fuzzy systems Classification of rubberized coir fibres using deep learning-based neural fuzzy decision tree approach Machine learning based decision support systems (DSS) for heart disease diagnosis: a review A review of training methods of ANFIS for applications in business and economics A modified neuro-fuzzy system using metaheuristic approaches for data classification A novel Spatio-Temporal Fuzzy Inference System (SPATFIS) and its stability analysis 
Time-frequency masking based supervised speech enhancement framework using fuzzy deep belief network Online deep fuzzy learning for control of nonlinear systems using expert knowledge Fuzzy analysis and deep convolution neural networks in still-to-video recognition Agile requirements engineering: a systematic literature review A fuzzy based highresolution multi-view deep CNN for breast cancer diagnosis through SVM classifier on visual analysis A deep neuro-fuzzy method for multi-label malware classification and fuzzy rules extraction Deep neuro-fuzzy approach for risk and severity prediction using recommendation systems in connected health care Deep challenges associated with deep learning Hierarchical fused model with deep learning and type-2 fuzzy learning for breast cancer diagnosis Review of deep learning algorithms and architectures Opening the black box of deep neural networks via information A novel fuzzy-based ensemble model for load forecasting using hybrid deep neural networks Systematic review of spell-checkers for highly inflectional languages Brain tumor classification using a hybrid deep autoencoder with Bayesian fuzzy clustering-based segmentation approach C-means clustering and deep-neuro-fuzzy classification for road weight measurement in traffic management system Hyperspectral remote sensing image segmentation based on the fuzzy deep convolutional neural network Action recognition using optimized deep autoencoder and CNN for surveillance data streams of non-stationary environments A Human-in-the-loop probabilistic CNNfuzzy logic framework for accident prediction in vehicular networks Fuzzy-Taylor-elephant herd optimization inspired Deep Belief Network for DDoS attack detection and comparison with state-of-the-arts algorithms Neuro-fuzzy systems: a survey Fuzzy logic systems and medical applications Deep learning for computer vision: a brief review Fast Training algorithms for deep convolutional fuzzy systems with application to stock index 
prediction Two-stage fuzzy fusion based-convolution neural network for dynamic emotion recognition Information management of e-commerce platform based on neural networks and fuzzy deep learning models Speech emotion recognition based on deep learning and fuzzy optimization Applications of deep learning and fuzzy systems to detect cancer mortality in next-generation genomic data Bat algorithm: literature review and applications A deep neuro-fuzzy network for image classification A deep neuro-fuzzy network for image classification Classification via deep fuzzy c-means clustering Improved deep fuzzy clustering for accurate and interpretable classifiers Interpretable deep convolutional fuzzy classifier A hybrid method for forecasting new product sales based on fuzzy clustering and deep learning Tracing the main path of interdisciplinary research considering citation preference: a case from blockchain domain Knowledge diffusion paths of blockchain domain: the main path analysis Collapse moment estimation for wall-thinned pipe bends and elbows using deep fuzzy neural networks A mayfly optimization algorithm A situation assessment method with an improved fuzzy deep neural network for multiple UAVs Deep fuzzy echo state networks for machinery fault diagnosis Deep learning and unsupervised fuzzy c-means based level-set segmentation for liver tumor A pythagorean-type fuzzy deep denoising autoencoder for industrial accident early warning Airline passenger profiling based on fuzzy deep machine learning Fuzzy deep belief networks for semi-supervised sentiment classification