key: cord-0078949-7mwyfh45 authors: Elhoseny, Mohamed; Metawa, Noura; Sztano, Gabor; El-hasnony, Ibrahim M. title: Deep Learning-Based Model for Financial Distress Prediction date: 2022-05-25 journal: Ann Oper Res DOI: 10.1007/s10479-022-04766-5 sha: 2fac05e69f00155c5b293df2254971149e3e645b doc_id: 78949 cord_uid: 7mwyfh45 Predicting bankruptcies and assessing credit risk are two of the most pressing issues in finance. Therefore, financial distress prediction and credit scoring remain hot research topics in the finance sector. Earlier studies have focused on the design of statistical approaches and machine learning models to predict a company's financial distress. In this study, an adaptive whale optimization algorithm with deep learning (AWOA-DL) technique is used to create a new financial distress prediction model. The goal of the AWOA-DL approach is to determine whether a company is experiencing financial distress or not. A deep neural network (DNN) model called multilayer perceptron based predictive and AWOA-based hyperparameter tuning processes are used in the AWOA-DL method. Primarily, the DNN model receives the financial data as input and predicts financial distress. In addition, the AWOA is applied to tune the DNN model's hyperparameters, thereby raising the predictive outcome. The proposed model is applied in three stages: preprocessing, hyperparameter tuning using AWOA, and the prediction phase. A comprehensive simulation took place on four datasets, and the results pointed out the supremacy of the AWOA-DL method over other compared techniques by achieving an average accuracy of 95.8%, where the average accuracy equals 93.8%, 89.6%, 84.5%, and 78.2% for compared models. Because of the impacts of the COVID-19 outbreak, many businesses are facing financial difficulties. Usually, economic downturns are part of companies' longterm business activity, and problems may be discovered in time and solved adequately. 
Due to its nature, financial distress can be predicted with several types of models (Uthayakumar et al., 2020a). In general, businesses can perceive possible threats by analyzing country-, industry-, and firm-level information on their economic stance. Because they help companies spread and avoid financial threats effectively and in a timely manner, economic crisis warning systems are of great importance in corporate risk management. Several researchers have developed early warning economic models in various sectors; however, the precision of these methods still needs to be enhanced (Ghisellini & Ulgiati, 2020; Liu & Arunkumar, 2019; Sankhwar et al., 2020). Much of the related literature uses mathematical models to build early warning systems for economic crises or employs different techniques for assessment (Ashraf et al., 2019; Bluwstein et al., 2020; Shajalal et al., 2021). Other studies take a comparative approach and use two NN methods to establish early warning systems for economic crises (Metawa et al., 2021). These two models are the RPROP NN and the SVM. Such results can inform domestic economic sectors and offer a foundation for the financial management of public investments. ML algorithms are now extensively employed to predict corporate financial distress. However, many algorithms treat financial distress prediction as a simple binary problem and frequently neglect the timeline along which financial distress unfolds in real time. The possible relations among labels and features may also go unconsidered. Useful financial distress predictions should be efficient and realistic (Chatzis et al., 2018; Huang & Yen, 2019; Petropoulos et al., 2020). In this study, a new financial distress prediction model uses an adaptive whale optimization algorithm with deep learning (AWOA-DL) technique, which combines a multilayer perceptron (MLP) with an optimization algorithm.
Population-based WOA is capable of avoiding local optima and finding a globally optimal solution. With these advantages, WOA is well suited to a wide range of constrained and unconstrained optimization problems in practical applications, without the need for structural changes to the algorithm. Here, the WOA fine-tunes the parameters of the MLP, producing the parameters that maximize the MLP's accuracy. The goal of the AWOA-DL approach is to determine whether a company is experiencing financial distress or not. The proposed AWOA-DL technique predicts a company's financial distress in four stages, each of which is described in detail below. First, the data collection phase gathers the datasets that will be used to evaluate the performance of the proposed model. Second, the data preprocessing stage improves the quality of the collected data before the other steps of the model are applied. Preprocessing removes incomplete samples, missing data, and null values. Because all financial indicators are based on company characteristics, this method can be used globally. Third, the AWOA-based parameter optimization phase fine-tunes the hyperparameters of the MLP model in preparation for the final step. Sigmoid transfer functions are used to train the networks to predict distress. The AWOA-DL technique combines deep neural network (DNN) prediction with AWOA hyperparameter tuning. The DNN, here an MLP model, predicts financial distress using financial data. The last step uses the DNN to predict the class labels of the company data (FD or NFD). In this work, we examine the performance of real-time and dynamically changing structures on four different datasets.
The studies in this work primarily aim to give researchers clear guidance on constructing a reliable deep learning model for predicting future financial difficulties. FD is for the financially distressed firm, while NFD stands for the non-financially distressed corporation. The study's findings indicate that the proposed model has a significantly higher prediction accuracy than conventional machine learning models. This article is structured as follows. The next section discusses research into financial distress prediction. Section 3 discusses the tools and techniques employed in greater detail. Section 4 introduces and clarifies the model being discussed. Section 5 describes, optimizes, analyzes, and compares the prediction performance of all of the chosen deep models. Section 6 presents a robust model for predicting financial distress using deep neural networks and concludes with recommendations. The logistic regression models (Ohlson, 1980; Sun et al., 2014) , and discriminant analysis (Altman, 1968 ) are examples of early and conventional statistical models that have been used in distress prediction. As seen above, these basic linear processes are impractical since they are very simple when it comes to developing a good model for generating real-time predictions, as seen above. To better predict bank hardship, researchers in the Gulf Cooperation Council created a simple hazard model (Maghyereh & Awartani, 2014) . Other methods, such as machine learning utilizing data mining techniques like Logistic Regression (Min & Lee, 2005; Sun et al., 2017) and Support Vector Machines (Chandra et al., 2009; Cleofas-Sánchez et al., 2016; Hall & Holmes, 2003; Moula et al., 2017; Serre et al., 2007) were explored. An evaluation of the performance of machine learning algorithms for predicting financial hardship in publicly traded Chinese enterprises was completed in 2015 by Serre et al. (2007) . Yu et al. 
(2020) developed an advanced complicated network method for simulating an interbank network using general risks contagion, which considered the balance sheet of all banks, where they could find when the economic institution has an adequate capital reserve to avoid risks contagion. Cascade default is also made in the simulations based on the distinct crises triggering (target default) method. They utilize ML methods for identifying the synthetic feature of the networks. Appiahene et al. (2020) presented an integrated DEA using 3 ML methods in calculating bank performance and efficiency with 444 Ghanaian bank branches, DMUs. The outcomes are related to the respective efficacy rating attained from the DEA. Lastly, the predictive accuracy of the 3 ML algorithms is related. The result recommended that the DT and its C5.0 algorithms provide an optimal prediction method. Samitas et al. (2020) studied on EWS model by exploring potential contagion threats according to the structure economic network. Earlier warnings indicators enhance average crisis predictive model performances. The network analyses and ML algorithm found proof of contagion risks on the date in which they observed considerable expansion in centralities and correlations. Kim et al. (2020) aimed to forecast the directions of US stock price by incorporating time differing ETE and different ML methods. Examining the ETE depending on the three & six months moving window could be considered the explanatory market parameter by examining the associations among the economic crisis and Granger causal relations amongst the stocks. Zeng et al. (2020) proposed a method for FS classification predictions depending on data source grouping and feature attributes. The present economic distress predictions generally employ the information from financial statements and ignore the timeline of companies. 
Thus, they proposed a financial distress prediction method for companies that is improved in line with practice and combines grouped sparse PCA of financial information, business governance features, and market transaction information using SVM. Uthayakumar et al. (2020b) introduce a cluster-based classification model comprising two phases: enhanced K-means clustering and FSCGACA-based classification. First, an improved K-means algorithm is developed to eliminate incorrectly clustered data. Then, rule-based models are selected to fit the provided datasets. Previous research shows that predicting the failure of a business or organization is critical. As the failure of a company is clearly associated with several characteristics and with its financial history, modeling the probability of a firm's default is a feasible and essential task. Earlier research focused on developing statistical and machine learning models to predict financial distress. Researchers have developed early warning models for economic problems in various sectors, but their precision needs improvement. ML algorithms have been widely used to predict corporate financial distress. However, many algorithms treat financial distress prediction as a binary problem, ignoring the timeline along which distress unfolds in real time. Also, possible relations among labels and features may not be considered. In this section, a general description of the DNN and the MLP, the model used in this paper, is presented, and then the whale optimization algorithm is discussed. A deep neural network (DNN) is an artificial neural network with multiple layers. The term "deep" denotes the number of layers used to transform the data (Serre et al., 2007), and DNNs form a subclass of machine learning models. The primary distinction between classical and deep neural networks is the number of hidden layers and the training procedure.
A DNN can extract higher-order correlations by adding additional hidden layers (Freund & Haussler, 1994; Hinton et al., 2006; Weston et al., 2012). The basic concept behind a DNN is that it learns via repetition from a collection of samples, such as 100 images of various dogs, rather than from a set of hand-written criteria, such as "a dog has a black nose and floppy ears." In this manner, a DNN acquires knowledge in the same way the human brain does, through practice and error (Mobahi et al., 2009). A feedforward DNN architecture in its most basic form works as follows. The DNN is given an input vector $x$ of dimension $D$, which is processed by the hidden layers $h_j$ (each made up of $N_j$ hidden units) according to a function $g$ and the DNN's parameters (weight matrices $W$ and bias vectors $b$) (Erhan et al., 2009). Finally, the output layer $O$ contains the DNN's output for the specified task (in the case of classification, the probability that the input vector belongs to each class $C$). DNNs are a subset of machine learning modeled after how the human brain learns (Bengio, 2009). They have been used for several purposes, some of which may be familiar, such as language translation and image search tools, and others less so, such as medical diagnostics (Zhou et al., 2015). UCLA developed a DNN to identify cancer cells, and DNNs are now also being used to power new hearing devices. DNN training is a difficult task (Bengio, 2014). The MLP model is used in this study. The processing elements in feedforward neural networks (FFNNs) are referred to as "neurons." The neurons are arranged in several layers, and each layer is fully connected to the next one. The input layer is the first one, and it maps the input variables into the network. The output layer is the final layer. Layers between the input and output layers are referred to as hidden layers.
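To make the feedforward computation described above concrete, the following is a minimal sketch in plain Python rather than the authors' implementation. It assumes that the function $g$ is the sigmoid in every hidden layer and that the output layer uses a softmax to turn scores into class probabilities; both choices, along with all names and the toy weights, are illustrative assumptions.

```python
import math

def sigmoid(z):
    """Logistic activation, used here as the hidden-layer function g (assumption)."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Turn raw output scores into class probabilities (assumed output function)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dense(x, W, b):
    """Affine map of one layer: row j of W holds the weights feeding unit j."""
    return [sum(w * xi for w, xi in zip(row, x)) + bj for row, bj in zip(W, b)]

def forward(x, hidden_layers, output_layer):
    """Propagate input x through every hidden layer, then the output layer."""
    h = x
    for W, b in hidden_layers:
        h = [sigmoid(z) for z in dense(h, W, b)]
    W_out, b_out = output_layer
    return softmax(dense(h, W_out, b_out))

# Toy network: D = 3 inputs, one hidden layer with N_1 = 2 units, C = 2 classes.
x = [0.5, -1.0, 2.0]
hidden = [([[0.1, 0.2, -0.1], [0.0, -0.3, 0.4]], [0.0, 0.1])]
output = ([[0.5, -0.5], [-0.5, 0.5]], [0.0, 0.0])
probs = forward(x, hidden, output)
print(probs)
```

Adding more `(W, b)` pairs to `hidden` deepens the network without changing `forward`, which is the sense in which depth is a hyperparameter.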
Among FFNNs, the most commonly used architecture is the multilayer perceptron (MLP). In an MLP, the connections between neurons are one-way and one-directional. Connections are represented by weights, which are real numbers in the interval [−1, 1]. Fig. 2 depicts a basic MLP with only one hidden layer. Each node's output is calculated in two steps. First, the weighted sum of the inputs is calculated using Eq. 1, where $I_i$ is input variable $i$ and $w_{ij}$ is the connection weight between $I_i$ and hidden neuron $j$. Second, an activation function is applied to the weighted sum to produce the neuron's output. An MLP can use several types of activation function. The output of node $j$ in the hidden layer can be determined as stated in Eq. 1 using the sigmoid function, which is widely used. The network's final output is given in Eq. 2 after the hidden layer's neuron outputs have been calculated. Metaheuristic algorithms that replicate the collective behavior of creatures have seen increasing use, and they have had a substantial impact on optimization in particular. Metaheuristic algorithms can provide excellent solutions in an acceptable amount of time, which makes them a smart way to overcome the restrictions of a lengthy, time-consuming search. On the other hand, many metaheuristic algorithms suffer from a lack of search diversity and from an imbalance between exploration and exploitation, making them less effective than other algorithms. One of the newer population-based metaheuristic algorithms is the whale optimization algorithm (WOA). WOA employs a shrinking encircling mechanism, a spiral update, and random search to update the positions of the whales. WOA has a lot going for it in terms of ease of use and accuracy of computation. The AWOA is developed from the whales' hierarchy.
WOA is modeled primarily on the humpback whale and its hunting strategy. The current best candidate solution is treated as the target prey, that is, the optimum (Gharehchopogh & Gholizadeh, 2019). Small fish near the surface are the food source for humpback whales (Fig. 3). In bubble-net feeding, the humpback whale dives to a depth of 10-15 m and then releases bubbles in a spiral around the prey while rising toward the surface. The humpback whale notices the prey's location and encircles it. To locate a position in the search space, AWOA takes the current best candidate solution as the target prey. Once the best search agent has been identified, the other search agents try to update their positions toward it. In this update, $S = 2r$ and $Y = 2Ir - I$, where $I$ is decreased from 2 to 0 over the iterations, so that the new position of a search agent lies between the agent's current position and the position of the current best agent. A new solution is thus identified by its fitness value with minimal parameter dependence. The coefficient vector $y$ is defined according to the fitness value through an adaptive probability function. In that function, $f_{\min}$ and $f_{\max}$ denote the minimal and maximal fitness function values, and $C_1$ and $C_3$ take values between 0 and 1. Metaheuristic methods are therefore effective when combined with an adaptive scheme, which reaches good results with short processing time, minimal evasion, and effective convergence. The mathematical model of the bubble-net behavior of humpback whales uses two mechanisms. The first is the shrinking encircling mechanism, which is realized by decreasing the coefficient vector.
The second is the spiral function, which is applied between the position of the whale and that of the prey to model the helix-shaped movement of humpback whales. Humpback whales swim around the prey within a shrinking circle and along a spiral path at the same time. To model this simultaneous behavior, a probability decides between the shrinking encircling mechanism and the spiral model when the position is updated during optimization. In this formulation, $y$ is a random number between −1 and 1 that governs the choice between the shrinking encircling model and the spiral scheme in the course of optimization. The same approach, based on varying the coefficient vector, is used to search for prey. In this exploration phase, unlike in the exploitation phase, the position of a search agent is updated with respect to a randomly selected search agent rather than the best search agent found so far. Random values in [−1, 1] are therefore used repeatedly to steer the search operation. Accordingly, $|\vec{Y}| > 1$ emphasizes exploration and allows the AWOA to perform a global search, whereas $|\vec{Y}| < 1$ is used to update the position of the search agent. This rule is applied until the maximum number of iterations is reached. A new set of solutions is obtained and updated according to these rules until the termination condition is fulfilled. The proposed AWOA-DL technique predicts the financial distress of a company in four stages (as shown in Fig. 4). First, the data collection phase collects the datasets required to evaluate the proposed model's performance. Second, the data preprocessing phase enhances the collected data before the model's other steps are applied.
Third, the AWOA-based parameter optimization phase adjusts the MLP model's hyperparameters for the final step. Finally, the DNN predicts the class label of the company data (FD or NFD) based on the previous phases. A detailed discussion of these modules is offered in the succeeding sections.

In order to find the most accurate model for predicting financial distress, several datasets have been used. The samples are obtained from four sets of data: the Australian dataset (Dua & Graff, 2017), the Analcat dataset (Data archives, https://pages.stern.nyu.edu/~adamodar/New_Home_Page/dataarchived.html), the Polish companies dataset, and the Taiwanese dataset (Zięba et al., 2016).

• Australian dataset: contains 690 samples with 14 input financial attributes and is intended for binary classification tasks. It holds information on credit card applications. To ensure data confidentiality, all attribute names and values have been replaced with meaningless symbols. The dataset is worth examining because of its good mix of features: continuous, nominal with a small number of values, and nominal with a large number of values. A few values are missing as well. The Australian dataset contains 307 non-bankrupt samples and 383 bankrupt samples.
• Analcat dataset: contains 50 samples with about 5 attributes. The samples are split into two classes, bankrupt and non-bankrupt, with 25 samples in each.
• Polish company dataset: contains 10,503 samples with 64 input financial attributes and is intended for binary classification tasks. The information was gathered from EMIS, a database of information about emerging markets around the world. The insolvent enterprises were examined during the years 2000-2012, while the still operational firms …

This article aims to determine whether or not a company may be classified as being in a financial crisis.
For preprocessing, incomplete samples are removed as well as missing data and null values. Firm health can be determined by looking at a company's level of solvency. Because all financial indicators are based on characteristics within the company, this approach can be used for businesses in other nations as well. The samples come under two class labels, namely bankrupt (FD) and non-bankrupt (NFD). Table 1 shows the details of the dataset description. The paper addresses a binary classification problem: whether a business should be classified as financially distressed (FD) or not (NFD). The financial dataset's output/target attribute is organized into economically distressed companies and financially healthy companies. Except for the binary target attribute, all other features in the dataset may be of any type. Additionally, the initial datasets include incomplete samples or have missing data or null values, which are removed during preprocessing. Smoothing the missing values is required to handle the lost values of attributes. Normalization of different scales of features is necessary to make all attributes on a homogeneous scale. There may be redundant rows in the datasets so, reducing and removing redundant rows is required. All these tasks are required before modeling the problem into the workflow of the proposed model. All mentioned tasks are performed in the preprocessing phase (Hanbal & Metawa, 2019) . For tuning the hyperparameters of the DNN model, the WOA is applied to it. Different MLP layers had varying success forecasting performance in this phase. Our study in this area is focused on finding new ways to improve MLP by combining different architectural styles. The prediction performance of a deep neural network can be improved by adjusting a number of hyperparameters. We focus on fine-tuning the algorithm's critical hyperparameters, such as network width and depth, which might cause it to explode or converge. 
We want to find the network depth and width that yield a well-performing financial distress classifier. The models are trained using Python libraries such as scikit-learn and Keras, and the results are obtained using these same tools. Each search agent in the proposed model is encoded as a one-dimensional vector that represents a candidate neural network. Each vector can be divided into three parts: the weights between the input and hidden layers, the weights between the hidden and output layers, and the biases. The length of each vector is given by Eq. 10, where $n$ is the number of input variables and $m$ is the number of hidden layer neurons.

$$\text{Individual length} = (n \times m) + (2 \times m) + 1 \qquad (10)$$

To quantify the fitness value of the generated WOA agents, we use the mean square error (MSE) fitness function, which is calculated from the difference between the actual values and the values predicted by the created agents (MLPs) over all training samples. The MSE is given in Eq. 11, where $y$ denotes the actual value, $\hat{y}$ denotes the predicted value, and $n$ denotes the number of instances in the training dataset.

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2 \qquad (11)$$

Figure 5 represents the process by which WOA assigns a search agent vector to the MLP. The vector holds the parameters that WOA adapts. The following steps illustrate the workflow of the WOA-based technique used to train the MLP network in this study:

1. A predetermined number of search agents is randomly generated as part of the initialization process. Each search agent represents an MLP network.
2. The quality of the generated MLP networks is assessed using a fitness function. To do so, the weights and biases held by each search agent from the previous step are assigned to an MLP network, and that network is evaluated. The MSE is used as the fitness function in this study because it is extensively used in evolutionary neural networks.
Using the dataset as a training set, the purpose of the algorithm is to find the MLP network with the lowest MSE value.
3. Relocate the search agents to their new positions.
4. Repeat steps 2 and 3 until the desired number of iterations is reached.

Finally, the MLP network with the lowest MSE value is evaluated on an unseen portion of the dataset. Figure 6 summarizes the above steps of the workflow for applying WOA to the MLP as a DNN model. At this phase, the MLP, a widely preferred deep learning network model for financial distress prediction, is developed, trained, and tested. We use Python's scikit-learn library to train the model and generate the results. There are two inputs to the model: the financial data gathered by the firm and the binary labels showing whether the firm is in financial crisis or not. With scikit-learn, we created a classifier that uses backpropagation along with cross-entropy loss and has a learning rate of 0.1. The number of nodes in the input layer equals the number of financial variables in the dataset used to train the model. The output layer has two nodes, which indicate whether or not a data row is classed as financially distressed. The MLP model is created with the sigmoid activation function and the whale optimization algorithm. A total of three MLP models are developed and assessed during this stage, with one, two, and three hidden layers, respectively. Resampling techniques from Python's scikit-learn and Keras packages are used for training and testing. Keras, a powerful Python deep learning tool, was used to build and assess the deep learning models, and stratified k-fold cross-validation from scikit-learn was used to test them. This is why we use resampling to check the accuracy of our models.
With this strategy, the data is divided into k parts, the model is trained on k−1 parts with k = 10, and the remaining part is used as test data to assess how well the model performed. We repeated this procedure ten times, and the mean values over all the models developed were then used to measure the accuracy of the predictions. Accuracy, precision, recall, kappa, and F-score are common machine learning assessment metrics used to evaluate prediction performance. Performance is assessed on the test data. To obtain the mean and standard deviation of the metrics over all models, we use the k-fold cross-validation scores. Testing accuracy refers to the percentage of correct predictions made on the test dataset. Based on the confusion matrix in Table 2, precision and recall are used to determine the F-score, as shown below (Abdelaziz & Mahmoud, 2021):

$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}, \qquad F\text{-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

Kappa relies on the "random accuracy." The confusion matrix in Table 2 tells us that a randomly selected label from the dataset is positive with probability $p_1$ and negative with probability $1 - p_1$, where

$$p_1 = \frac{TP + FN}{TP + TN + FP + FN}$$

Our classifier produces a positive label with probability $p_2$ and a negative label with probability $1 - p_2$, where

$$p_2 = \frac{TP + FP}{TP + TN + FP + FN}$$

Assuming that the labels produced by these two processes are independent of one another, the random accuracy is

$$\text{Random accuracy} = p_1 p_2 + (1 - p_1)(1 - p_2)$$

In this study, negative firms are the ones in financial difficulty, while financially sound firms are the positive ones. The confusion matrix defines the true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). Table 3 and Figs. 7 and 8 present a brief comparison of the classification outcomes of the AWOA-DL technique with those of existing methods on the Australian Credit dataset. The results are inspected in terms of different evaluation parameters.
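Before turning to the reported numbers, the metric definitions above can be made concrete with a short sketch. It computes accuracy, precision, recall, F-score, random accuracy, and kappa from the four cells of a confusion matrix, taking kappa as Cohen's kappa, that is, (accuracy − random accuracy) / (1 − random accuracy). The function name and the counts in the example are made up for illustration and are not taken from the paper's tables; the sketch also assumes no denominator is zero.

```python
def classification_metrics(tp, fp, fn, tn):
    """Metrics from confusion-matrix counts (assumes no zero denominators)."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    p1 = (tp + fn) / total  # probability that a true label is positive
    p2 = (tp + fp) / total  # probability that the classifier predicts positive
    random_accuracy = p1 * p2 + (1 - p1) * (1 - p2)
    kappa = (accuracy - random_accuracy) / (1 - random_accuracy)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "random_accuracy": random_accuracy,
        "kappa": kappa,
    }

# Hypothetical counts: 40 TP, 10 FP, 5 FN, 45 TN.
m = classification_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts the accuracy is 0.85 while the random accuracy is 0.5, so kappa is 0.7. This illustrates why kappa is typically lower than accuracy: it only credits agreement beyond what independent labeling would produce.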
Concerning precision, the LR model gave the weakest outcome with the lowest precision of 0.6579, whereas the RBF Network model obtained a moderate precision of 0.8631. The DNN and TLBO-DL models showed reasonable precision of 0.9620 and 0.9243. However, the AWOA-DL technique achieved superior performance with a higher precision of 0.9765. In terms of recall, the RBF Network approach gave poor results with the lowest recall of 0.8153, whereas the LR technique reached a moderate recall of 0.8523. The DNN and TLBO-DL approaches showed good recall … Concerning F-score, the LR approach gave the weakest result with the lowest F-score of 0.7426, whereas the RBF Network approach obtained a moderate F-score of 0.8386. The DNN and TLBO-DL methods followed with reasonable F-scores of 0.8996 and 0.9315. The AWOA-DL technique achieved higher performance with an improved F-score of 0.9521. Finally, concerning kappa, the RBF Network algorithm gave the weakest outcome with the lowest kappa of 0.5797, whereas the LR model obtained a moderate kappa of 0.7024. The DNN and TLBO-DL approaches showed reasonable kappa values of 0.8230 and 0.8791. Nevertheless, the AWOA-DL methodology achieved the best performance with a kappa of 0.9364. Table 4 and Figs. 9 and 10 compare the AWOA-DL method with existing ones on the Analcat dataset. The outcomes are evaluated on several metrics. The LR model has a moderate precision of 0.9200, while the RBF Network model gives the weakest result with the lowest precision of 0.8000. The DNN and TLBO-DL models show a reasonable degree of accuracy, with precision values between 0.96 and 1.0000.
The AWOA-DL approach, on the other hand, has shown better results, with a precision of 1.0000. Regarding recall, the RBF Network technique has shown poor results with the lowest recall of 0.7142, while the LR method has obtained an average recall of 0.8518. TLBO-DL and DNN showed a fair recall of 0.8500 and 0.8518, respectively. The AWOA-DL approach, on the other hand, has a high recall of 0.9675. The RBF Network approach has a low accuracy of 0.7400, while the LR method has a middling accuracy of 0.8800. DNN and TLBO-DL reached 0.9000 and 0.9600 … Table 5 and Figs. 11 and 12 compare the AWOA-DL method's classification results with those of existing methods on the Polish dataset. The results are examined on several metrics. The RBF Network model has shown ineffective results, with a low accuracy of 0.7135, while the LR and DNN models have achieved moderate precision of 0.8010 and 0.8632, respectively. Owing largely to its reliance on deep learning techniques, the TLBO-DL model has shown an acceptable level of accuracy (0.95412). The AWOA-DL approach has a higher accuracy of 0.9823 due to the hyperparameter tuning and adaptation phases, which primarily improve the prediction phase. The RBF Network approach has shown low recall, with a minimum of 0.7022 compared to the LR method's 0.8010 … DNN and TLBO-DL show a reasonable F-score of 0.9123 and 0.9553. Finally, the AWOA-DL approach achieved the best performance, at 0.9887. For kappa, the AWOA-DL technique reached the maximum value of 0.9318 due to the use of the MLP prediction model, which proved superior to many DNN models in classification results. Table 6 and Figs. 13 and 14 compare the classification results of the AWOA-DL method with those of existing methods on the Taiwan dataset.
The RBF Network models give the lowest precision, 0.8000. The AWOA-DL approach, on the other hand, shows the best result, with a precision of 1.0000. The DNN and TLBO-DL approaches achieve good recall of 0.8500 and 0.8518. In contrast, the AWOA-DL approach has a maximum recall of 0.9675, which indicates outstanding performance. An improved F-score of 0.9521 was achieved using the AWOA-DL method. In terms of kappa, the RBF Network performed worst, with the lowest value of 0.5797, while LR reached a middling value of 0.7024. The AWOA-DL approach yielded the best result, with a kappa of 0.9364.
As shown in Table 7 and Fig. 15, the average accuracy of AWOA-DL over the four datasets used in this study shows that the proposed model outperformed the other techniques. Several factors contribute to the proposed model's superiority. First, the WOA optimization algorithm is used to adjust the classifier's parameters; WOA is known to yield high-performing solutions with the lowest MSE for the predicted values. Second, unlike earlier probability-based models, MLP neural networks make no assumptions about the underlying probability density functions or the pattern classes under consideration, which is a significant advantage. Additionally, the MLP demonstrated high accuracy when dealing with large datasets and input sizes.
For a more detailed evaluation of the results, we conducted the Friedman rank-sum test, shown in Table 8. For the Australian dataset, Table 8 reports the test statistic (19.04) and the degrees of freedom (4). The Friedman test was significant (p = 0.0007718 < 0.001), so the score distributions of the compared models differ. The same interpretation applies to all the compared datasets.
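The reported p-value can be checked from the reported statistic, because the Friedman statistic is referred to a chi-square distribution with k - 1 degrees of freedom for k compared models. The sketch below is illustrative only: it uses the textbook form of the statistic without a tie correction, the function names are hypothetical, and the `scores` table is made-up data rather than the values behind Table 8. The closed-form tail probability used here holds only for an even number of degrees of freedom.

```python
import math


def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(scores):
    """Friedman chi-square statistic (no tie correction).

    scores: list of n blocks, each a list of k model scores.
    """
    n = len(scores)
    k = len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 0.0
    for i in range(df // 2):
        total += term
        term *= half / (i + 1)
    return math.exp(-half) * total


# p-value implied by the reported statistic (19.04) and degrees of freedom (4).
print(f"p = {chi2_sf_even_df(19.04, 4):.7f}")

# Hypothetical scores: 4 blocks (rows) by 5 models (columns).
scores = [[0.78, 0.85, 0.90, 0.94, 0.96]] * 4
q = friedman_statistic(scores)
print(f"Q = {q:.2f}, p = {chi2_sf_even_df(q, 4):.5f}")
```

For 4 degrees of freedom the tail probability reduces to exp(-Q/2)(1 + Q/2), and Q = 19.04 gives approximately 0.00077, consistent with the value reported in the text.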
The ability to estimate or predict the failure of any business or organization is critical for financial institutions because bad decisions affect overall revenue. In this study, a new financial distress prediction model uses an adaptive whale optimization algorithm with deep learning (AWOA-DL) technique. The goal of the AWOA-DL approach is to determine whether a company is undergoing financial distress or not. The AWOA-DL method involves a deep neural network (DNN) based predictive process and an AWOA-based hyperparameter tuning process. Primarily, the DNN model receives the financial data as input and predicts financial distress. In addition, the AWOA is applied to tune the MLP model's hyperparameters, thereby improving the predictive outcome. To validate the improved performance of the AWOA-DL technique, a comprehensive simulation took place on the Australian Credit dataset, and the results pointed out the supremacy of the AWOA-DL technique. The proposed model outperformed the other techniques with an average accuracy of 95.8%, versus 93.8%, 89.6%, 84.5%, and 78% for the compared models. WOA has advantages in ease of use and accuracy, but its convergence speed is poor. In the future, the predictive outcomes can be further improved by using clustering and feature selection approaches. Moreover, a hybrid of metaheuristic algorithms could be used to improve the convergence rate.