key: cord-0951907-6svyti1e
authors: nan
title: Novel Feature Selection and Voting Classifier Algorithms for COVID-19 Classification in CT Images
date: 2020-09-30
journal: IEEE Access
DOI: 10.1109/access.2020.3028012
sha: 1fe10fab0344af37c69d8602cb4bfcf220c3cab9
doc_id: 951907
cord_uid: 6svyti1e

Diagnosis is a critical preventive step in coronavirus research, since the disease has manifestations similar to those of other types of pneumonia. CT scans and X-rays play an important role in that direction. However, processing chest CT images and using them to accurately diagnose COVID-19 is a computationally expensive task. Machine Learning techniques have the potential to overcome this challenge. This article proposes two optimization algorithms for feature selection and classification of COVID-19. The proposed framework has three cascaded phases. Firstly, features are extracted from the CT scans using a Convolutional Neural Network (CNN) named AlexNet. Secondly, a proposed feature selection algorithm, Guided Whale Optimization Algorithm (Guided WOA) based on Stochastic Fractal Search (SFS), is applied, followed by balancing the selected features. Finally, a proposed voting classifier, Guided WOA based on Particle Swarm Optimization (PSO), aggregates different classifiers’ predictions to choose the most voted class. This increases the chance that individual classifiers, e.g. Support Vector Machine (SVM), Neural Networks (NN), k-Nearest Neighbor (KNN), and Decision Trees (DT), show significant discrepancies. Two datasets are used to test the proposed model: CT images containing clinical findings of positive COVID-19 and CT images negative for COVID-19. The proposed feature selection algorithm (SFS-Guided WOA) is compared with other optimization algorithms widely used in recent literature to validate its efficiency. The proposed voting classifier (PSO-Guided-WOA) achieved an AUC (area under the curve) of 0.995, which is superior to other voting classifiers in terms of performance metrics.
Wilcoxon rank-sum, ANOVA, and T-test statistical tests are applied to statistically assess the quality of the proposed algorithms as well. hospitals around the world, X-ray images can be considered less sensitive than CT scans for the investigation of COVID-19 patients. [3] reported that X-ray was diagnosed to be normal in both early and mild stages. On the other hand, CT images enable the non-destructive 3D visualization of internal structures and are considered as a powerful analysis tool [5] , [6] that has been applied widely to clinical diagnosis [7] and biomedical imaging [8] . In addition, CT has always aimed to achieve improved scanning efficiency in both time and radiation dose [9] . The development of Multi-slice CT (MSCT) has been successful to improve the efficiency of scanning by simultaneously increasing the number of scanned slices [10] . Moreover, dual-source CT managed to achieve a larger temporal resolution improvement, [11] . Machine learning algorithms have been gaining momentum over the last decades for medical applications such as computer-aided diagnosis to help physicians for an early diagnosis, which can lead to better-personalized therapies and enhancement of the medical care offered to patients [12] , [13] . Convolutional neural networks (CNN), as a subset of machine learning algorithms, is a unique structure of synthetic neural networks used for image classification. There are several CNN models including AlexNet [14] , VGG-Net [15] , GoogLeNet [16] , and ResNet [17] . In the CNN models, classification accuracy correlates with the extended number of convolution layers [18] . Optimization is the process by which the best possible solution is found for a particular problem from all the available solutions [19] . One of the most powerful methods to solve applications in radiology problems are Meta-heuristic algorithms. The inspiration of most of these algorithms is from physical algorithms' logical behavior found in nature. 
The acceptable solutions found by these optimization techniques are typically obtained with less computational effort and in a reasonable time [20].

The early diagnosis of coronavirus can significantly limit its spread and therefore increase patients' recovery rates. For this reason, several artificial intelligence (AI) techniques have been proposed in the literature for the early detection of COVID-19. In this article, a framework for COVID-19 classification is proposed based on three cascaded phases. The first phase automatically extracts features from the training CT images with a CNN model named AlexNet. Then, a proposed feature selection algorithm, using Stochastic Fractal Search (SFS) and Guided Whale Optimization Algorithm (Guided WOA) techniques, is applied to select the valuable features. The LSH-SMOTE (Locality Sensitive Hashing Synthetic Minority Oversampling Technique) is used in the second phase to balance the selected features. The last phase classifies the selected features with a proposed voting classifier, using Particle Swarm Optimization (PSO) and Guided WOA techniques, which aggregates the Support Vector Machine (SVM) [21], Neural Networks (NN) [22], k-Nearest Neighbor (KNN) [23], and Decision Trees (DT) [24] classifiers to improve the ensemble's accuracy. Two kinds of CT datasets are used in the experiments to test the proposed framework. The first dataset has COVID-19 CT images, while the second dataset has extra CT images of clinical cases that have no COVID-19.
For feature selection, the proposed (SFS-Guided WOA) algorithm is compared in experiments with binary versions of the original WOA [25], Grey Wolf Optimizer (GWO) [26], Genetic Algorithm (GA) [27], PSO [28], hybrid of PSO and GWO (GWO-PSO) [29], hybrid of GA and GWO (GWO-GA), Bat Algorithm (BA) [30], Biogeography-Based Optimizer (BBO) [31], Multiverse Optimization (MVO) [32], Satin Bowerbird Optimizer (SBO) [33], and Firefly Algorithm (FA) [34] in terms of average error, average select size, average (mean) fitness, best fitness, worst fitness, and standard deviation fitness. Lastly, the proposed voting classifier (PSO-Guided WOA) result of 0.995 is compared with voting WOA, voting GWO, voting GA, and voting PSO in terms of Area Under the Curve (AUC) and Mean Square Error (MSE).

The main contributions of this article are as follows:
• A COVID-19 classification framework based on proposed algorithms for feature selection and classification is developed.
• A novel feature selection algorithm based on SFS and Guided WOA techniques is proposed.
• A novel voting classifier based on PSO and Guided WOA techniques is proposed.
• The proposed framework can classify the input CT images as COVID-19 or non-COVID-19 effectively.
• The proposed framework is evaluated using two datasets of COVID-19 CT images and non-COVID-19 CT images.
• Statistical tests of Wilcoxon rank-sum, ANOVA, and T-test are carried out to ensure the quality of the proposed algorithms.
• This framework can be generalized to the applications of biomedical imaging diagnoses.

This article contains the following sections. Related work and the problem definition are discussed in Section II. Section III introduces the materials and methods employed in this research. Section IV presents the model and the proposed algorithms in detail. Section V shows the designed scenarios and results. Section VI discusses the experimental results. The conclusions and future work are shown in Section VII.
See Table 1 for a list of abbreviations.

In this section, the recent literature utilizing CT scans for diagnosing COVID-19 patients is summarized. Then, the recent evaluation of Artificial Intelligence (AI) against COVID-19 based on CT scans is discussed as well.

Recent studies have proposed several COVID-19 detection paradigms. In [35], Li et al. proposed a methodology to recognize the infection rate using the coronal and axial views of lung CT scans. The proposed work achieved a specificity of 100%, an AUC of 0.918, and a sensitivity of 82.6%. Another study [36] evaluated COVID-19 disease using visual inspection. The authors claimed that visual inspection can help to correctly identify the infection. In [37], Panwar et al. proposed a scheme to evaluate lung CT scans and implemented visual inspection-based detection. Their scheme achieved a specificity of 94%, an AUC of 0.892, and a sensitivity of 83.3%. In [2], Wang et al. investigated the lung CT scans of 90 patients. Their investigation managed to detect the severity based on the time since the patient was infected. In addition, a diagnostic methodology based on CT image features was proposed in [38]. The authors concluded that combining the evaluation of image features with clinical findings can detect the presence of COVID-19 early. In [39], Bai et al. investigated the patient's information and considered the CT scans and RT-PCR for the examination. They achieved a specificity of 100% and a sensitivity of 93%. In a similar study [40], the authors clinically evaluated patients with both CT scans and real-time RT-PCR, with an early detection accuracy of 90%.

Recent works show that CT scans are mainly utilized to offer fast diagnostic methods to prevent and control the spread of COVID-19 and to assist physicians and radiologists in correctly managing patients under a high workload.
Authors in [41] developed a method based on deep learning to accurately assist radiologists to identify COVID-19 patients using CT images. They used deep learning to train a neural network to screen COVID-19 patients based on their CT images. The proposed method achieved a specificity of 61.5%, sensitivity of 81.1%, AUC of 0.819, and accuracy of 76%. In [42] , Ardakani et al. proposed a method to diagnose COVID-19 using an AI technique based on CT slices and ten convolutional neural network models to correctly diagnose COVID-19 from non-COVID-19 groups. The authors found that both ResNet-101 and Xception have achieved the best performance. Moreover, ResNet-101 managed to detect COVID-19 cases with a specificity of 99.02%, Sensitivity of 100%, AUC of 0.994, and Accuracy of 99.51%. On the other hand, Xception achieved a Specificity of 100%, Sensitivity of 98.04%, AUC of 0.994, and Accuracy of 99.02%. The authors recommended the use of ResNet-101 to characterize and diagnose COVID-19 infections due to its higher sensitivity. Another study in [43] used a large CT dataset to develop an AI method that can diagnose COVID-19 and differentiate it from normal controls and other types of pneumonia. The authors investigated the significance of identifying important clinical markers using the convolutional neural network ResNet-18 model. Their proposed method achieved a Specificity of 91.13%, Sensitivity of 94.93%, AUC of 0.981, and Accuracy of 92.49% for COVID-19. In [44] , the authors proposed a deep learning neural network-based method named nCOVnet for detecting the COVID-19 based on analyzing the patients' X-ray images. Their nCOVnet method achieved a Specificity of 89.13%, Sensitivity of 97.62%, AUC of 0.881, and Accuracy of 88.10% for COVID-19. Butt et al. [45] used a special type of CNN, namely ResNet-18 to classify CT samples with COVID-19, normal subjects, and Influenza viral pneumonia. 
They achieved an accuracy of 86.7% with 98.2% sensitivity, 92.2% specificity, and AUC value of 0.996. Chua et al. [46] proposed a model based on the CNN architecture model that was trained from scratch. Their model consisted of five convolution layers utilized as a deep feature extractor. K-nearest neighbor, SVM, and decision tree were fed using the extracted deep discriminative features. The superiority of the SVM classifier was demonstrated with an accuracy of 98.97%, a sensitivity of 89.39%, and a specificity of 99.75%. Another study by Wu et al. [47] proposed a weakly supervised CNN that could achieve an accuracy of 96.2% with 94.5% sensitivity, 95.3% specificity, and AUC value of 0.970. A ML-method is proposed in [48] to classify the chest x-ray images into COVID-19 or non-COVID-19 patients. A Fractional Multichannel Exponent Moments (FrMEMs) method is used for feature extraction. A modified Manta-Ray Foraging Optimization based on differential evolution is then used to select the most significant features. The authors' proposed method is evaluated using two COVID-19 x-ray datasets. The recent AI research for COVID-19 is summarized in Table 2 . The importance of the AI techniques in the early evaluation of COVID-19 and the areas where AI can contribute to the battle against COVID-19 are discussed in [50] . The authors concluded that AI is not fully utilized in COVID-19 because of the possible lack of data or excessive data. To overcome these constraints careful balance must be made between public health, data privacy, and the right utilization of the AI techniques. Furthermore, the need for an extensive gathering of diagnostic data will be extremely crucial to train AI, save lives, and limit the associated economic damages. Most of the above-discussed studies mainly applied statistical analysis and visual inspection techniques to correctly diagnose COVID-19 infection. 
Fewer applied studies used transfer learning and CNNs with CT datasets of coronavirus pneumonia patients, non-coronavirus pneumonia patients, and healthy subjects. Therefore, more studies need to be conducted that utilize AI with properly optimized performance metrics. As per the literature review of this work, it is recommended to use CT images as a fast method to diagnose patients with COVID-19. The proposed paradigms need to be both reproducible and easily validated so that they can be quickly integrated into the arsenal for battling the COVID-19 pandemic.

This section discusses the datasets and methodologies of this research. The datasets, dataset balancing, and the optimization methods of WOA, PSO, and SFS are discussed. The CNN models, classification methods, and ensemble learning techniques are also explained.

Data collection is considered the first and main step in COVID-19 applications. Recently, it has been reported that several data collection works were done on COVID-19. The authors have used two datasets to apply the proposed paradigm. The first is the COVID-19 dataset, which has 334 CT images containing clinical findings of COVID-19. The second is the non-COVID-19 dataset, which has an extra 794 CT images of clinical cases that have no COVID-19. Figure 1 shows samples of the COVID-19 and the non-COVID-19 cases. The images are collected from COVID-19-related articles from medRxiv, bioRxiv, NEJM, JAMA, and Lancet. CTs containing COVID-19 abnormalities were selected by reading through the papers' figure captions [49]. All patients' images in the datasets were high-resolution Multi-Detector Computerized Tomography (MDCT) axial images. The axial images show bilateral scattered ground-glass opacities with air space consolidation, mainly in the posterior segments of the lower lung lobes with peripheral and subpleural distribution; this is the picture of atypical pneumonia caused by COVID-19 that is clinically proved by Polymerase Chain Reaction (PCR).
PCR is a process that replicates a small segment of DNA a large number of times to create enough samples for analysis.

The features extracted from the utilized datasets may suffer from a class imbalance problem. Therefore, several algorithms were investigated to solve this type of problem. Some of the recent algorithms are SMOTE and LSH-SMOTE [51], [52]. For a minority class instance a selected at random, the SMOTE technique finds its k nearest minority class neighbors. It then randomly chooses one of these k nearest neighbors, b, and connects it with a to form a line segment in the feature space. Euclidean distance is used to sort the instances while selecting the k nearest neighbors. Finally, a list of the k nearest neighbor instances is returned to the main SMOTE class for generating the synthetic instances. LSH-SMOTE was first introduced by [52] to improve the performance of SMOTE-based optimization techniques for feature selection. The algorithm starts by hashing the dataset and dividing it into buckets, assigning similar items with similar hash codes to the same bucket. That, in turn, can increase the matching probability between similar items, leading to a simplified search for the k nearest neighbors.

CNN is one of the most well-regarded machine learning methods in the literature. One of the reasons for its popularity is the automatic hierarchical feature representation it uses to recognize objects and patterns in images [42]. CNNs reduce the parameters of a given problem using the spatial relationships between them. This makes them a more practical classifier, especially in image processing, where we deal with a large number of parameters (pixels) as well as rotation, translation, and scale of images. In fact, CNNs alleviate the drawbacks of Feed-Forward Neural Networks and Multi-Layer Perceptrons by using an alternative to matrix multiplication. We use this powerful method in this study because of the nature of COVID-19 diagnosis from CT images and its high dimensionality.
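The SMOTE interpolation step described earlier in this passage can be illustrated with a minimal sketch. It assumes an exhaustive Euclidean neighbor search and a uniformly random position on the segment between a and b, which is the usual SMOTE formulation; the function names (`k_nearest`, `smote_sample`) and the toy data are illustrative only and are not taken from the authors' implementation. LSH-SMOTE would replace the exhaustive search with a lookup restricted to the hash bucket of a.

```python
import math
import random


def k_nearest(point, candidates, k):
    """Return the k candidates closest to `point` by Euclidean distance."""
    return sorted(candidates, key=lambda c: math.dist(point, c))[:k]


def smote_sample(minority, k=5, rng=None):
    """Create one synthetic minority instance on the segment a -> b."""
    rng = rng or random.Random()
    a = rng.choice(minority)                      # random minority instance
    others = [p for p in minority if p is not a]  # exclude a itself
    b = rng.choice(k_nearest(a, others, k))       # one of its k neighbors
    gap = rng.random()                            # position along the segment
    return [ai + gap * (bi - ai) for ai, bi in zip(a, b)]


# Toy minority class: the four corners of the unit square (hypothetical data).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = [smote_sample(minority, k=2, rng=random.Random(seed))
             for seed in range(20)]
```

Because every synthetic point is a convex combination of two existing minority points, it always lies between them in feature space; in this toy case all generated points stay inside the unit square.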
In the WOA algorithm, the inspiration is the foraging behaviour of whales, in which bubbles are used to trap the prey by forcing it to the surface along a spiral-shaped path [25], [53]. Mathematically, the first mechanism of this optimizer is based on the following equation:

$$\vec{G}(t+1) = \vec{G}^*(t) - \vec{A} \cdot \vec{D}, \qquad \vec{D} = |\vec{C} \cdot \vec{G}^*(t) - \vec{G}(t)| \qquad (1)$$

where vector $\vec{G}(t)$ represents a solution at iteration $t$ and vector $\vec{G}^*(t)$ represents the position of the prey. The "." indicates pairwise multiplication and $\vec{G}(t+1)$ represents the updated position of the solution [54], [55]. The two vectors $\vec{A}$ and $\vec{C}$ are updated as $\vec{A} = 2\vec{a} \cdot \vec{r}_1 - \vec{a}$ and $\vec{C} = 2\vec{r}_2$, where vector $\vec{a}$ changes from 2 to 0 linearly and $\vec{r}_1$ and $\vec{r}_2$ are random values in [0, 1]. The second mechanism includes a shrinking encircling, which decreases the values of the $\vec{a}$ and $\vec{A}$ vectors, and a spiral process for updating the positions as follows

$$\vec{G}(t+1) = \vec{D}' \cdot e^{bl} \cdot \cos(2\pi l) + \vec{G}^*(t), \qquad \vec{D}' = |\vec{G}^*(t) - \vec{G}(t)| \qquad (2)$$

where $\vec{D}'$ represents the distance between the ith whale and the best one. Parameter $b$ is a constant that represents the spiral's shape, and $l$ is a random value in [−1, 1]. The WOA mechanism can be simulated by the following equation

$$\vec{G}(t+1) = \begin{cases} \vec{G}^*(t) - \vec{A} \cdot \vec{D} & \text{if } r_3 < 0.5 \\ \vec{D}' \cdot e^{bl} \cdot \cos(2\pi l) + \vec{G}^*(t) & \text{otherwise} \end{cases} \qquad (3)$$

where $r_3$ represents a random value in [0, 1]. The last mechanism can be achieved based on the $\vec{A}$ vector. The position of a search agent is updated based on a random whale $\vec{G}_{rand}$ to allow a global search by the following equation

$$\vec{G}(t+1) = \vec{G}_{rand} - \vec{A} \cdot \vec{D}, \qquad \vec{D} = |\vec{C} \cdot \vec{G}_{rand} - \vec{G}(t)| \qquad (4)$$

Thus, exploitation and exploration are controlled by $\vec{A}$, and the spiral or circular movement is controlled by $r_3$. The WOA algorithm is shown step by step in Algorithm 1.

    ...
    for (i = 1 : i < n + 1) do
        if (r_3 < 0.5) then
            if (|A| < 1) then
                Update current search agent position using Eq. 1
            else
                Select a random search agent G_rand
                Update current search agent position by Eq. 4
            end if
        else
            Update current search agent position by Eq. 2
        end if
    end for
    Calculate fitness function F_n for each G_i
    Find best individual G*
    Set t = t + 1 (increase counter)
end while
return G*

The Stochastic Fractal Search (SFS) technique was proposed by [56], in which the mathematical concept of a fractal is used as a property of objects' self-similarity. The Fractal Search (FS) algorithm depends on Diffusion Limited Aggregation (DLA), which generates fractal-shaped objects. Figure 2 presents a random fractal sample. The SFS technique uses diffusion and two kinds of updating processes to outperform the original FS technique. Figure 2 shows the diffusion process of the SFS technique in graphical form for a solution. For the best solution BP, a list of solutions BP_1, BP_2, BP_3, BP_4, and BP_5 can be listed around this best solution [57].

The PSO algorithm is based on the swarming pattern of flocks in nature [58], [59]. It simulates the social behavior of animals such as birds. The swarm searches for food by changing the particles' positions according to the updated velocity. PSO has several particles, and each particle has the following parameters:
• Position ($x_i \in R^n$), which indicates a point in the $R^n$ search space. The fitness function is used to evaluate the particles' current positions.
• Velocity or rate of position change ($v_i$).
• Last best positions ($p_i$), which store the best position values of the particles.

During the algorithm iterations, the positions and velocities of all particles change. The particles' positions are updated as follows:

$$x_i^{t+1} = x_i^t + v_i^{t+1}$$

where $x_i^{t+1}$ is the new particle position, and the updated velocity of each particle $v_i^{t+1}$ can be calculated as

$$v_i^{t+1} = \omega v_i^t + C_1 r_1 (p_i^t - x_i^t) + C_2 r_2 (G - x_i^t)$$

where $\omega$ is the inertia weight, and $C_1$ and $C_2$ represent the cognition learning factor and the social learning factor. Parameter $G$ is the global best position, and $r_1$ and $r_2$ are random numbers in [0, 1].

SVM can perform classification, regression, and outlier detection [21]. SVMs are suited for the classification of complex datasets.
The classification of the SVM technique is based on transforming the features dimension space that is nonlinearly separable into a higher dimension space in which a hyperplane can easily separate the different classes. That can be done using a kernel trick in which linear, polynomial, or Gaussian RBF kernel can be used to decrease the computational complexity associated with the calculations of added features. The margin between classes depends on dataset instances called support vectors. While the kernel hyperparameters are those parameters that determine the margin of separation between classes and the tolerance for permitting margin violation. Even though SVM is a binary classifier, it can be easily extended to be used in multiclass classification. KNN method can also be used for classification and regression [23] purposes. As a classifier, this algorithm considers k closest training examples in the feature space. The output in this algorithm is a class membership. DT [24] is also a machine learning capable of doing both classification and regression. MLP is a class of feedforward ANN [22] . There are three layers in MLP: input, hidden, and output layers. Such architecture with three layers is mostly suited to small or medium datasets. In addition, the dataset complexity can be accommodated using suitable activation functions and/or a suitable number of perceptrons in the hidden layers. However, large datasets can be more complex to be accommodated by only three layers of nodes. Therefore, architectures with more than three layers are common while suitable training techniques for them are usually called deep learning. That architecture can capture the complex relations associated with the large dataset they try to model or classify. The problem might arise when a small dataset with a large number of attributes needs to be used in MLP of complex architectures of many layers. 
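The two swarm update rules described earlier in this section, the WOA position update and the PSO velocity and position update, can be sketched for a single agent as follows. This is a sketch under stated assumptions rather than the authors' code: it follows the standard formulations of WOA [25] and PSO [58], draws one scalar random coefficient per agent instead of one per dimension, and uses hypothetical default values for `b`, `w`, `c1`, and `c2`.

```python
import math
import random


def woa_step(g, g_best, g_rand, a, b=1.0, rng=random):
    """One standard WOA update for a single agent (lists of floats)."""
    r1, r2, r3 = rng.random(), rng.random(), rng.random()
    A = 2 * a * r1 - a          # shrinks as `a` decreases from 2 to 0
    C = 2 * r2
    if r3 < 0.5:
        # |A| < 1: encircle the best solution; otherwise follow a random whale.
        target = g_best if abs(A) < 1 else g_rand
        return [t - A * abs(C * t - x) for t, x in zip(target, g)]
    # Spiral movement around the best solution.
    l = rng.uniform(-1, 1)
    spiral = math.exp(b * l) * math.cos(2 * math.pi * l)
    return [abs(gb - x) * spiral + gb for gb, x in zip(g_best, g)]


def pso_step(x, v, p_best, g_best, w=0.7, c1=1.5, c2=1.5, rng=random):
    """One PSO update: new velocity first, then new position."""
    r1, r2 = rng.random(), rng.random()
    v_new = [w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
             for xi, vi, pi, gi in zip(x, v, p_best, g_best)]
    x_new = [xi + vi for xi, vi in zip(x, v_new)]
    return x_new, v_new


# Hypothetical two-dimensional example.
rng = random.Random(0)
moved = woa_step([0.5, -1.0], [0.0, 0.0], [1.0, 1.0], a=1.0, rng=rng)
x_new, v_new = pso_step([0.5, -1.0], [0.1, 0.1], [0.2, -0.5], [0.0, 0.0], rng=rng)
```

A particle that already sits on both its personal best and the global best with zero velocity does not move under `pso_step`, which is a quick sanity check on the signs of the two attraction terms.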
Ensemble learning is the aggregation of a group of predictors (such as classifiers), which can often achieve better predictions than any single predictor. It is recommended to use diverse, independent classifiers in such methods to get the best outcome [60]. One way to achieve this is to use different learning algorithms. To create a better classifier, the predictions of each classifier can be aggregated and the class with the most votes selected. This is called the majority-vote classifier, which is considered a hard-voting classifier. Using this approach raises the chance that the individual classifiers make very different types of errors, which improves the ensemble's accuracy. Another way is to use the same algorithm with different data subsets, as in the Random Forest. In that ensemble classifier, "forest" is an analogy that refers to creating decision trees that are trained by the "bagging" method. In bagging, the same learning algorithm is used for all the predictors. To get the most reliable outcome, however, it is recommended to train them on different random subsets of the training set, where sampling is performed with replacement. The general idea of this method is to increase the overall accuracy of the result, given the soft-computing nature of all methods in this area. Another type of ensemble classification is AdaBoost [61], in which the outputs of the weak learners (other learning algorithms) are collected into a weighted sum that represents the final output of the boosted classifier.

The proposed framework has three phases. The first phase is a feature engineering process that includes the CNN training techniques. The second phase applies the proposed SFS-Guided WOA for feature selection and then the LSH-SMOTE method for balancing the selected features. The last phase, phase three, applies the proposed voting classifier algorithm (PSO-Guided WOA) to the features selected in the second phase to classify the infected cases.
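The majority-vote (hard-voting) idea described earlier in this section can be sketched in a few lines. This shows only plain, unweighted voting; it is not the proposed PSO-Guided WOA voting classifier, which additionally optimizes the classifiers' weights. The prediction lists are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list of voted labels.

    `predictions` holds one list of predicted labels per classifier,
    all of the same length (one entry per sample).
    """
    voted = []
    for sample_votes in zip(*predictions):
        # most_common(1) returns [(label, count)] for the top label.
        voted.append(Counter(sample_votes).most_common(1)[0][0])
    return voted


# Hypothetical predictions from three classifiers on four samples
# (1 = COVID-19, 0 = non-COVID-19).
svm_pred = [1, 0, 1, 1]
knn_pred = [1, 1, 0, 1]
dt_pred = [0, 0, 1, 1]
ensemble = majority_vote([svm_pred, knn_pred, dt_pred])
print(ensemble)  # [1, 0, 1, 1]
```

Each classifier here is wrong on a different sample relative to the voted result, which is the situation in which voting helps most: uncorrelated errors are outvoted.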
In the first phase of the proposed framework, a CNN is used. As discussed above, CNNs reduce the parameters of a given problem using the spatial relationships between them, which makes them a more practical classifier, especially in image processing, where we deal with a large number of parameters (pixels) as well as rotation, translation, and scale of images. Several CNN models including AlexNet [14], VGG-Net (VGG16Net and VGG19Net) [15], GoogLeNet [16], and ResNet-50 [17] are involved in this phase, as shown in Fig. 4. In the CNN models, classification accuracy correlates with the extended number of convolution layers. Pre-trained CNN models are employed in this phase. To understand the CT images in the datasets, the authors were helped by a Radiology Registrar at the Typical Medical complex in Riyadh and a Fellow of The Royal College of Radiologists in the UK. They guided the authors in dealing with the COVID-19 CT images of the infected cases and differentiating them from the non-infected cases.

The preprocessing step makes the data ready for the machine learning models. Based on the problem of COVID-19 and the available dataset, some data processing tasks are required before feeding the images to the learning model. To feed the current dataset of images to the convolutional network, the images must be resized to the same size. All the CT images have been resized to 224 × 224 by the Nearest Neighbour interpolation function, which is simple and commonly used. The learning model can be applied in this stage for salient feature extraction from CT images by altering the nodes in the fully connected layer and performing fine-tuning using the input dataset. Then, the Min-Max scaler is employed to normalize the ith input image $I_i$ to be within [0, 1], where $I_i$ is the resized image.

The data augmentation technique is applied in this research on the existing data to create new training data artificially.
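The resizing and normalization steps of the preprocessing described above can be sketched without any imaging library. The sketch assumes one common nearest-neighbour convention (each output pixel copies the source pixel at the floor of the scaled index) and min-max scaling computed per image; whether the authors scale per image or over the whole dataset is not stated, so that choice is an assumption. The function names and the tiny 2 × 2 image are illustrative only.

```python
def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize of a 2D list of pixel values."""
    h, w = len(img), len(img[0])
    return [[img[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]


def min_max_normalize(img):
    """Scale pixel values of one image into [0, 1] (per-image min and max)."""
    flat = [p for row in img for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Constant image: avoid division by zero, map everything to 0.
        return [[0.0 for _ in row] for row in img]
    return [[(p - lo) / (hi - lo) for p in row] for row in img]


# Hypothetical 2 x 2 grayscale image, enlarged to 4 x 4 for illustration;
# the paper resizes to 224 x 224 in the same way.
image = [[0, 128], [64, 255]]
resized = resize_nearest(image, 4, 4)
normalized = min_max_normalize(resized)
```

Nearest-neighbour interpolation introduces no new intensity values, so the minimum and maximum of the image are unchanged by resizing and the order of the two steps does not affect the normalized range.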
Image augmentation, as a type of data augmentation, creates transformed versions of the images in the training dataset. Image transformations including horizontal and vertical shift, horizontal and vertical flip, random rotation, and random zoom are applied to the input dataset. The shift augmentation moves all pixels of the CT image in the horizontal or vertical direction and keeps the image at the same dimensions. The flip process reverses the order of the pixel columns for a horizontal flip or the pixel rows for a vertical flip. The rotation augmentation rotates the CT image clockwise by a random angle from 0 to 360 degrees. Finally, the zoom augmentation zooms the CT image randomly by a factor in the range [0.9, 1.1]. The image augmentation algorithm is shown in Algorithm 2.

The features extracted in the first phase by the CNN model are the input to the second phase for the proposed algorithm, as shown in Fig. 4. The SMOTE and LSH-SMOTE methods are then applied to balance the selected features and improve the accuracy of COVID-19 classification in the last phase.

The Guided WOA is a modified version of the original WOA. To overcome the drawback of this method, the search strategy for one random whale can be replaced with an advanced strategy that can move the whales rapidly toward the best solution or prey. In the original WOA, Eq. 4 forces whales to move around each other randomly, which is similar to a global search. In the modified WOA (Guided WOA), to enhance exploration performance, a whale can follow three random whales instead of one. This forces whales toward more exploration, without being affected by the leader position, by replacing Eq. 4 with an update equation in which $\vec{G}_{rand1}$, $\vec{G}_{rand2}$, and $\vec{G}_{rand3}$ are three random solutions, $\vec{w}_1$ is a random value in [0, 0.5], and $\vec{w}_2$ and $\vec{w}_3$ are two random values in [0, 1].
$\vec{z}$ decreases exponentially instead of linearly, to change smoothly between exploitation and exploration, and is calculated from $t$ and $Max_{iter}$, where $t$ represents the iteration number and $Max_{iter}$ indicates the maximum number of iterations.

The proposed SFS-Guided WOA algorithm is shown in Algorithm 3. Based on the diffusion procedure of the SFS algorithm, a series of random walks around the best solution can be created. This diffusion process increases the exploration capability of the Guided WOA in getting the best solution. Gaussian random walks around the updated best position $\vec{G}^*$ are calculated as part of the diffusion process.

    ...
    for (i = 1 : i < n + 1) do
        if (r_3 < 0.5) then
            if (|A| < 1) then
                Update position of current search agent as ...
            ...
                Update position of current search agent as ...
        ...
    end for
    for (i = 1 : i < n + 1) do
        ...
    end for
    Convert updated solution to binary by Eq. 11
    Calculate fitness function F_n for each G_i
    Find best individual G*
    Set t = t + 1
end while
return G*

For feature selection, the solution is converted to a binary solution of 0 or 1. A sigmoid function is applied to convert the continuous solution to a binary one (Eq. 11), where $G_{Best}$ is the best position at iteration $t$. The role of the sigmoid function is to scale the continuous values between 0 and 1. The condition $Sigmoid(G_{Best}) \geq 0.5$ is used to decide whether the value of the dimension will be 0 or 1.

The LSH-SMOTE technique is employed in this research to balance the features selected by the proposed SFS-Guided WOA algorithm and so improve the performance of the classification algorithm. The LSH-SMOTE technique consists of the following steps:
1) LSH-SMOTE initialization,
2) converting the minority class instances into vectors,
3) creating hash codes by using hash functions, then creating hash tables,
4) creating the nearest neighbors list,
5) generating synthetic instances using the SMOTE algorithm.
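The binary conversion used for feature selection, described earlier in this passage, can be sketched as follows. The exact sigmoid used by the authors (Eq. 11) is not reproduced here; the sketch assumes the plain logistic function together with the stated threshold of 0.5, and the solution vector and feature names are hypothetical.

```python
import math


def sigmoid(value):
    """Plain logistic function, mapping a real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def to_binary(solution, threshold=0.5):
    """Turn a continuous solution into a 0/1 feature mask."""
    return [1 if sigmoid(g) >= threshold else 0 for g in solution]


# Hypothetical continuous solution over five candidate features.
continuous = [-2.0, -0.1, 0.0, 0.3, 4.0]
mask = to_binary(continuous)

feature_names = ["f0", "f1", "f2", "f3", "f4"]
selected = [name for name, keep in zip(feature_names, mask) if keep]
print(mask, selected)  # [0, 0, 1, 1, 1] ['f2', 'f3', 'f4']
```

Each 1 in the mask keeps the corresponding extracted feature and each 0 drops it, so the optimizer searches over feature subsets while still moving in a continuous space.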
The computational complexity of the SFS-Guided WOA algorithm is discussed here according to Algorithm 3. Let n be the population size and M_t the total number of iterations. The time complexity can be defined for each part of the algorithm. As per these complexities, the overall complexity of the proposed SFS-Guided WOA algorithm is O(M_t × n). Considering the number of variables as m, the final computational complexity of the algorithm will be O(M_t × n × m).

The third and last phase is the classification of infected patients. Figure 5 shows the third phase of the proposed framework for COVID-19 patient classification. In this section, a voting classifier is proposed based on the PSO and Guided WOA algorithms, as shown in Algorithm 4. The PSO-Guided WOA aggregates the SVM, NN, KNN, and DT classifiers to improve the ensemble's accuracy. After the selected features are balanced by the SMOTE or LSH-SMOTE algorithms, the classifiers are trained to get the optimal weights. The PSO-Guided WOA then optimizes these weights. For the proposed Algorithm 4, the Guided WOA of Section IV-B1 is employed in the algorithm development. After the WOA algorithm is initialized and the first best solution $\vec{G}^*$ is found (Lines 1 to 6), the parity of the iteration number t decides whether the fitness function is calculated from the Guided WOA or from the PSO algorithm. If t%2 == 0 (Line 8), the algorithm updates the positions and calculates the fitness function F_n for the updated solutions from the Guided WOA (Lines 9 to 22). Otherwise, the fitness function F_n is calculated based on the PSO algorithm (Line 24).

The computational complexity of the proposed PSO-Guided WOA algorithm is discussed here according to Algorithm 4. Let n be the population size and M_t the number of iterations. The time complexity can be defined for each part of the algorithm.

The experiments section in this article is divided into three scenarios.
The first scenario is based on the first phase of the proposed model. This experiment shows the effectiveness of different CNN models for classifying the COVID-19 cases and, in turn, shows the importance of extracting features for the next phase. In the second scenario, the proposed feature selection algorithm (SFS-Guided WOA) is tested and compared to other algorithms to show its performance. The third scenario is designed to investigate the ability of the proposed voting optimizer (PSO-Guided WOA) to improve the classification accuracy of the COVID-19 cases. Finally, Wilcoxon's rank-sum test and the t-test are performed to verify the superiority of the proposed algorithms statistically. In the experiment of the first scenario, the CT image datasets [49] are separated randomly into 60%, 20%, and 20% of the images for the training, validation, and testing processes, respectively.

Algorithm 4 (excerpt):
     8: if (t%2 == 0) then
     9:   for (i = 1 : i < n + 1) do
    10:     if ($\vec{r}_3$ < 0.5) then
    11:       if (|$\vec{A}$| < 1) then
    12:         Update position of current search agent as ...
              else
    14:         Select three random search agents $\vec{G}_{rand1}$, $\vec{G}_{rand2}$, and $\vec{G}_{rand3}$
                Update $\vec{z}$ by the exponential form of ...
          ...
          Calculate fitness function $F_n$ for each $\vec{G}_i$ from Guided WOA
    23: else
    24:   Calculate fitness function $F_n$ for each $\vec{G}_i$ from PSO
    25: end if
    26: Find best individual $\vec{G}^*$
        Set t = t + 1
    29: end while
    30: return

The first experiment is designed to investigate the classification accuracy of five CNN models, namely AlexNet [14], VGG-Net (VGG16Net and VGG19Net) [15], GoogLeNet [16], and ResNet-50 [17], on the tested dataset. In this scenario, several performance metrics are calculated to measure the performance of the different models for COVID-19 classification. Table 3 shows the CNN experimental setup employed in the first scenario.
The default parameters are employed in this experiment since the first stage is used to extract features of the CT images from the earlier layers of a CNN model, to be used in the next scenario for feature selection and balancing. The performance metrics calculated for the first phase are accuracy, sensitivity, specificity, precision (PPV), Negative Predictive Value (NPV), and F-score. Let TP represent the true-positive value and TN the true-negative value, while FN indicates the false-negative value and FP the false-positive value. The metrics are defined in the following equations.
• Accuracy: measures the model's ability to identify all cases correctly, regardless of whether the cases are positive or negative, and is calculated as $Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$ (13)
• Sensitivity: also called the true positive rate (TPR) or recall. It measures the capability of finding positive cases and is calculated as $Sensitivity = \frac{TP}{TP + FN}$
• Specificity: also called the true negative rate (TNR) or selectivity. It measures the capability of finding negative cases and is calculated as $Specificity = \frac{TN}{TN + FP}$
• F-score: measures the harmonic mean of precision and sensitivity and is calculated as $F\text{-}score = 2 \times \frac{PPV \times TPR}{PPV + TPR}$
The results of this scenario are shown in Table 4. The results show that the precision (Pvalue) of the GoogLeNet model, 84.75%, is better than that of the VGG19Net (83.78%), ResNet-50 (81.08%), AlexNet (75%), and VGG16Net (51.75%) models. The AlexNet model outperforms the other models with an F-score of 77.88%. However, the GoogLeNet model has a better specificity, 92.44%, than the other models. In terms of sensitivity, the VGG16Net rate of 95.08% is better than the sensitivity rates of the AlexNet (81%), ResNet-50 (62.5%), VGG19Net (62%), and GoogLeNet (50%) models. For the NPV (Nvalue), the VGG16Net model has the best percentage, 87.74%.
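The metrics defined for this scenario can be computed directly from the four confusion-matrix counts. The following is a minimal sketch; the function names are hypothetical, and the choice to return 0.0 when a denominator is zero is an assumption made here for robustness rather than something stated in the article.

```python
def _ratio(numerator, denominator):
    # Assumption: report 0.0 instead of failing when a denominator is zero
    return numerator / denominator if denominator else 0.0


def f_score(precision, sensitivity):
    """Harmonic mean of precision (PPV) and sensitivity (TPR)."""
    return _ratio(2 * precision * sensitivity, precision + sensitivity)


def classification_metrics(tp, tn, fp, fn):
    """Return the six first-phase metrics from confusion-matrix counts."""
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    npv = _ratio(tn, tn + fn)
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "f_score": f_score(precision, sensitivity),
    }
```

As a consistency check on the reported values, the harmonic mean of AlexNet's precision (75%) and sensitivity (81%) is 2 × 0.75 × 0.81 / 1.56 ≈ 0.7788, which agrees with its reported F-score of 77.88%.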
As an overall performance metric, the AlexNet model has an accuracy of 79%, whereas VGG19Net, ResNet-50, GoogLeNet, and VGG16Net have accuracies of 77.17%, 77.17%, 73.06%, and 58.21%, respectively, for the tested COVID-19 dataset. Based on this experiment, the highest accuracy achieved for the CT images from the COVID-19 dataset tested in this research is 79%, by the AlexNet model. Since this accuracy is not acceptable in such a critical endeavor, the features are extracted from the earlier layers of the AlexNet model, chosen for its promising performance, to be used in the next scenario for feature selection and balancing.

In this scenario, the importance and performance of the proposed feature selection algorithm (SFS-Guided WOA) are investigated. The proposed algorithm in the second phase is compared to the original WOA [25], Grey Wolf Optimizer (GWO) [26], Genetic Algorithm (GA) [27], PSO [28], hybrid of PSO and GWO (GWO-PSO) [29], hybrid of GA and GWO (GWO-GA), Bat Algorithm (BA) [30], Biogeography-Based Optimizer (BBO) [31], Multiverse Optimization (MVO) [32], Satin Bowerbird Optimizer (SBO) [33], and Firefly Algorithm (FA) [34] in terms of average error, average select size, average (mean) fitness, best fitness, worst fitness, and standard deviation of fitness. Table 5 shows the configuration of the proposed (SFS-Guided WOA) algorithm in the experiments. The parameters $h_1$ and $h_2$ in the fitness function are set to 0.99 and 0.01, respectively. Table 6 shows the configuration of the compared algorithms in the experiments. To evaluate the effectiveness of the proposed SFS-Guided WOA algorithm, the following metrics are employed. Let $M$ be the number of repeated runs of an optimizer for the feature selection problem, $g^*_j$ the best solution at run number $j$, and $N$ the number of tested points.
• Average Error shows the accuracy of the classifier given the selected feature set. It is calculated from $Match(C_i, L_i)$ over the $N$ tested points and the $M$ runs, where $C_i$ is the label of the classifier output for point $i$, $L_i$ is the label of the class for point $i$, and $Match$ calculates the matching between two inputs.
• Average Select Size is the average size of the selected feature set relative to the total number of features in the dataset ($D$). It is calculated as $\frac{1}{M} \sum_{j=1}^{M} \frac{size(g^*_j)}{D}$, where $size(g^*_j)$ is the size of the vector $g^*_j$.
• Mean is the average of the solutions obtained from running an optimizer $M$ times. It is calculated as $Mean = \frac{1}{M} \sum_{j=1}^{M} g^*_j$
• Best Fitness is the minimum fitness obtained by an optimizer over $M$ runs. It is calculated as $Best = \min_{j=1}^{M} g^*_j$
• Worst Fitness is the worst solution found by an optimizer over $M$ runs. It is calculated as $Worst = \max_{j=1}^{M} g^*_j$
• Standard Deviation (SD) is the variation of the best solutions obtained by running an optimizer $M$ times. It is computed from the deviations of $g^*_j$ from $Mean$, where $Mean$ is the average defined in Equation 21.
The results of the proposed SFS-Guided WOA algorithm in this experiment are shown in Table 7. A lower error indicates that the optimizer has selected a proper set of features for the next stage. The SFS-Guided WOA algorithm achieved the minimum average error (0.1381) in selecting the proper features. Ordered from best to worst according to the minimum error for the current problem, the feature selection algorithms are SFS-Guided WOA, PSO, GWO, GWO-GA, WOA, GA, BA, GWO-PSO, FA, BBO, MVO, and lastly SBO. Note that the proposed algorithm outperforms the original WOA algorithm. Table 7 also shows that the proposed algorithm finds the lowest fitness value (0.2013) for the selected features of the COVID-19 datasets, which is lower than the values of the compared algorithms. The proposed algorithm also finds the best fitness value (0.1031) among the optimization techniques across the runs.
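The run-level statistics defined in this scenario (mean, best, worst, SD, and average select size) can be sketched as below. The function name `summarize_runs` and its arguments are hypothetical. The sketch assumes the sample standard deviation (divisor M − 1), which is an assumption because the SD formula is not reproduced in the text, and it treats lower fitness as better, as in the minimization problem described here.

```python
import statistics


def summarize_runs(best_fitness_per_run, selected_sizes, total_features):
    """Summarize M optimizer runs for a feature selection problem.

    best_fitness_per_run: best fitness g*_j reached in each run j
    selected_sizes: number of features selected in each run
    total_features: D, the total number of features in the dataset
    """
    m = len(best_fitness_per_run)
    if m < 2 or len(selected_sizes) != m:
        raise ValueError("need at least two runs and one size per run")
    return {
        "mean": statistics.mean(best_fitness_per_run),
        "best": min(best_fitness_per_run),    # minimization: lowest is best
        "worst": max(best_fitness_per_run),
        "sd": statistics.stdev(best_fitness_per_run),  # assumes divisor M - 1
        "avg_select_size": sum(s / total_features for s in selected_sizes) / m,
    }
```

A small SD across runs is what the article interprets as stability and robustness of an optimizer.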
On the other hand, the SFS-Guided WOA does not produce the worst fitness among the compared algorithms, and it has the lowest standard deviation, which indicates the stability and robustness of the proposed algorithm. Based on this experiment, the selected features are then balanced using two methods, SMOTE and LSH-SMOTE, to prepare them for the classification scenario. For both algorithms, the nearest-neighbors parameter is k = 5, and the oversampling percentage is 50% of the feature distribution (majority class = minority class). For the SMOTE algorithm, the number of instances per leaf is equal to 2. For the LSH-SMOTE algorithm, the hashes parameter is H = 5 and the hash tables parameter is T = 4.

To obtain the p-values between the proposed SFS-Guided WOA algorithm and the other algorithms, Wilcoxon's rank-sum test is employed. This statistical test determines whether the results of the proposed algorithm and the other algorithms differ significantly: a p-value < 0.05 demonstrates significant superiority, whereas a p-value > 0.05 shows that the results have no significant difference. Hypothesis testing is formulated here in terms of two hypotheses: the null hypothesis ($H_0$: ..., $\mu_{SFS\text{-}Guided\ WOA} = \mu_{GA}$) and the alternate hypothesis ($H_1$: means are not all equal). Table 8 shows the resulting p-values; values below 0.05 are obtained between the proposed algorithm and the other algorithms, showing the superiority of the SFS-Guided WOA algorithm and indicating that the difference is statistically significant. Thus, the alternate hypothesis $H_1$ is accepted.

This scenario is divided into three experiments and statistical tests. The first experiment is designed to investigate the results for the single classifiers SVM, KNN, NN, and DT based on the balanced and unbalanced features selected in the second scenario.
The next experiment is performed to compare the proposed voting classifier (PSO-Guided WOA) with other ensemble learning techniques. In the last experiment, the proposed algorithm is compared with other voting classifier algorithms to check its effectiveness. The ANOVA and T-test statistical tests are performed between the compared algorithms to show the effectiveness of the proposed algorithm. Table 9 shows the configuration of the proposed (PSO-Guided WOA) algorithm in the experiments. The parameters $h_1$ and $h_2$ in the fitness function are set to 0.99 and 0.01, respectively. Table 10 shows the configuration of the compared algorithms in the experiments.

The performance metrics of this scenario are the Area Under the ROC Curve (AUC) and the Mean Square Error (MSE). AUC is a good indicator of classification performance because it is independent of the distribution of instances between classes; it is also referred to as balanced accuracy or macro-average [51]. In the current case of binary classification, the balanced accuracy is equal to the arithmetic mean of specificity and sensitivity, or AUC with binary predictions rather than scores. The AUC (balanced accuracy) value can be calculated as $AUC = \frac{1}{2} \left( \frac{TP}{TP + FN} + \frac{TN}{TN + FP} \right)$. The Mean Square Error (MSE) evaluates the classifiers' performance by measuring the difference between the required and the actual outputs of the classifiers, where $n$ indicates the number of outputs, $d^h_x$ indicates the optimal output of the $x$th input neuron when the $h$th training instance is applied, and $o^h_x$ indicates the actual output of the $x$th input neuron when the $h$th training instance appears at the input.

The results of the first experiment for SVM, KNN, NN, and DT as single classifiers are shown in Table 11. The classifier results are shown for three cases: no preprocessing, balancing the selected features by the SMOTE algorithm, and balancing the selected features by the LSH-SMOTE algorithm.
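The two metrics of this scenario can be sketched as follows. Balanced accuracy follows the definition given above (mean of sensitivity and specificity on binary predictions). The MSE function assumes a plain mean of squared differences between desired and actual outputs, which may differ in normalization from the article's equation. Function names are hypothetical.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for 0/1 labels.

    Assumes both classes occur in y_true, so neither denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 0.5 * (sensitivity + specificity)


def mean_square_error(desired, actual):
    """Average squared gap between desired and actual outputs (assumed form)."""
    return sum((d - o) ** 2 for d, o in zip(desired, actual)) / len(desired)
```

Because sensitivity and specificity are each normalized within their own class, the balanced accuracy does not change when one class is much larger than the other, which is why it suits imbalanced data.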
Note from Table 11 that the DT classifier achieved the highest AUC of 0.911 with the minimum MSE of 0.007932. This result shows the importance of balancing the selected features from the previous stage by the LSH-SMOTE algorithm. The results of the next experiment, which compares the proposed algorithm with the ensemble learning methods of Bagging, AdaBoost, and Majority voting, are shown in Table 12. Figure 6 shows the ROC curves of the proposed voting (PSO-Guided WOA) algorithm versus the compared voting algorithms. These curves show that the proposed algorithm is able to distinguish between the COVID-19 and non-COVID-19 cases with a high AUC value close to 1.0, as shown in Table 13.

To determine whether there is any statistical difference between the MSE of the proposed (PSO-Guided WOA) algorithm and the other compared algorithms, a one-way analysis of variance (ANOVA) test was applied. The hypothesis testing is formulated here in terms of a null hypothesis $H_0$ and an alternate hypothesis $H_1$. Figure 7 shows the ANOVA test for the proposed and the compared algorithms versus the objective function. Based on the results of this test, the alternate hypothesis $H_1$ is accepted. However, ANOVA cannot tell which algorithm is better, so another test is conducted between every two algorithms. A one-tailed T-test at the 0.05 significance level is performed. Hypothesis testing is formulated here in terms of two hypotheses: the null hypothesis ($H_0$: $\mu_{A1} = \mu_{B1}$, $\mu_{A1} = \mu_{C1}$, $\mu_{A1} = \mu_{D1}$, $\mu_{A1} = \mu_{E1}$) and the alternate hypothesis ($H_1$: means are not all equal). The results in Table 15, for 20 samples (the number of repeated runs, as mentioned in Table 9), show that the p-values are less than 0.05, which indicates a statistically significant difference between groups. Thus, the alternate hypothesis $H_1$ is accepted.
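The one-way ANOVA used above compares the variation between the algorithms' mean MSE values with the variation within each algorithm's repeated runs. The sketch below computes only the F statistic and its degrees of freedom; the function name is hypothetical. Converting F to a p-value requires the F distribution, for which a library routine such as SciPy's `scipy.stats.f_oneway` would normally be used.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of samples around their own group mean
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

In the setting described here, each group would hold the 20 MSE values recorded for one voting algorithm; a large F relative to the critical value at the 0.05 level leads to rejecting the null hypothesis of equal means.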
The experiments in this research are designed around three scenarios to assess the performance and accuracy of the proposed framework for COVID-19 classification. The first scenario shows that, among the compared CNN models, the AlexNet model achieves the highest classification accuracy on the CT images from the tested COVID-19 dataset. Based on these results, the features are extracted from the earlier layers of the AlexNet model to be used in the next scenario for feature selection and balancing. In the second scenario, the performance of the proposed feature selection algorithm (SFS-Guided WOA) is assessed. Results show that the proposed algorithm outperforms the compared algorithms, including the original WOA algorithm, and finds the lowest fitness value for the feature selection of the features extracted from the COVID-19 datasets. In addition, the proposed algorithm has the lowest standard deviation compared to the other algorithms, which indicates the stability and robustness of the proposed technique. Based on the results of the second scenario, the selected features are then balanced using the SMOTE and LSH-SMOTE methods to prepare them for the last stage, which includes the final classification. The third scenario shows the performance of the proposed classification algorithm (PSO-Guided WOA). Results show that the proposed voting classifier (PSO-Guided WOA) with LSH-SMOTE preprocessing achieves an AUC with binary predictions (balanced accuracy) of 0.995 and an MSE of 2.49569E-05, which outperforms other state-of-the-art ensemble learning techniques. This shows the importance of balancing the selected features from the previous stage by the LSH-SMOTE algorithm. The experimental results comparing the proposed voting classifier with other voting classifiers using WOA, GWO, GA, and PSO show the superiority of the proposed framework in identifying COVID-19 patients from CT images.
Thus, the efficacy of diagnosis can be improved while sparing radiologists the heavy workload associated with the initial screening of COVID-19 pneumonia.

This article proposes a framework for COVID-19 classification with three cascaded phases. In the first phase, the hierarchical feature representation is automatically extracted from the training CT images by the AlexNet CNN model. Afterward, the proposed feature selection algorithm, which uses the SFS and Guided WOA techniques, is applied to select features in the second phase. The selected features are then balanced by the LSH-SMOTE algorithm to improve the classification results. In the last phase, a voting classifier using the PSO and Guided WOA techniques is proposed to aggregate the predictions of four single classifiers (SVM, NN, KNN, and DT) and predict the most voted class. This increases the chance that the individual classifiers will make very different types of errors, which improves the ensemble's accuracy. Two datasets are used to test the proposed model. The first is the COVID-19 dataset, which contains CT images with clinical findings of COVID-19, and the second is the non-COVID-19 dataset, which contains additional CT images of clinical cases without COVID-19. For feature selection, the proposed SFS-Guided WOA algorithm is compared in experiments with the original WOA, GWO, GA, PSO, hybrid of PSO and GWO (GWO-PSO), hybrid of GA and GWO (GWO-GA), BA, BBO, MVO, SBO, and FA in terms of average error, average select size, average (mean) fitness, best fitness, worst fitness, and standard deviation of fitness. Finally, the result of the proposed voting classifier (PSO-Guided WOA) is compared with voting WOA, voting GWO, voting GA, and voting PSO in terms of AUC and MSE. The statistical analysis by the Wilcoxon rank-sum test, ANOVA, and T-test shows the superiority of the proposed algorithms.
Each successive phase aims to improve the overall accuracy, offering a viable and reliable paradigm in the battle against the spread of COVID-19. A future research direction is to tune the CNN parameters to increase the overall classification accuracy when other datasets are used on which satisfactory performance cannot be achieved. Moreover, the proposed algorithms can be applied to several medical image processing applications that use other imaging modalities.

EL-SAYED M. EL-KENAWY (Member, IEEE) is currently an Assistant Professor with the Delta Higher Institute for Engineering and Technology (DHIET), Mansoura, Egypt. He inspires and motivates students by providing a thorough understanding of a variety of computer concepts. He has pioneered and launched independent research programs. His research interests include computer science and machine learning. He is adept at explaining sometimes complex concepts in an easy-to-understand manner.

SEYEDALI MIRJALILI (Senior Member, IEEE) is currently the Director of the Centre for Artificial Intelligence Research and Optimization, Torrens University Australia at Brisbane. He is internationally recognized for his advances in swarm intelligence and optimization, including the first set of algorithms from a synthetic intelligence standpoint (a radical departure from how natural systems are typically understood) and a systematic design framework to reliably benchmark, evaluate, and propose computationally cheap robust optimization algorithms. He has published over 200 publications with over 20,000 citations and an H-index of 51. As the most cited researcher in robust optimization, he is in the list of 1% highly cited researchers and has been named one of the most influential researchers in the world by the Web of Science. He is also working on the applications of multi-objective and robust meta-heuristic optimization techniques.
His research interests include robust optimization, engineering optimization, multi-objective optimization, swarm intelligence, evolutionary algorithms, and artificial neural networks. He is an Associate Editor of several journals, including Neurocomputing, Applied Soft Computing, Advances in Engineering Software, Applied Intelligence, PLOS One, and IEEE ACCESS.

MARWA METWALLY EID received the Ph.D. degree in electronics and communications engineering from the Faculty of Engineering, Mansoura University, Egypt, in 2015. She has been an Assistant Professor with the Delta Higher Institute for Engineering and Technology since 2011. Her current research interests include image processing, encryption, wireless communication systems, and field programmable gate array (FPGA) applications.

He is a member of the ICMI and the Institute of Engineering and Technology (IET). His scientific research interests include cloud computing, data science, image and signal processing, modeling, and artificial intelligence.
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and coronavirus disease-2019 (COVID-19): The epidemic and the challenges
A novel coronavirus outbreak of global health concern
Frequency and distribution of chest radiographic findings in COVID-19 positive patients
Enhancement of soft-tissue contrast in cone-beam CT using an anti-scatter grid with a sparse sampling approach
Multi-mounted X-Ray cone-beam computed tomography
Breast cancer detection and classification using thermography: A review
A feasibility study of pulmonary nodule detection by ultralow-dose CT with adaptive statistical iterative reconstruction-V technique
3D algebraic iterative reconstruction for cone-beam X-Ray differential phase-contrast computed tomography
A beam optics study of a modular multi-source X-ray tube for novel computed tomography applications
Computed tomography X-Ray characterization: A Monte Carlo study
Imaging of proteoglycan and water contents in human articular cartilage with full-body CT using dual contrast technique
Deep learning workflow in radiology: A primer
Breast cancer segmentation from thermal images based on chaotic Salp swarm algorithm
Advanced deep-learning techniques for salient and category-specific object detection: A survey
Very deep convolutional networks for large-scale image recognition
Transfer deep learning along with binary support vector machine for abnormal behavior detection
Learning long-term temporal features with deep neural networks for human action recognition
Convolutional neural networks: An overview and application in radiology
Dynamic group-based cooperative optimization algorithm
Optimization method for forecasting confirmed cases of COVID-19 in China
Parameter investigation of support vector machine classifier with kernel functions
COVID-19 identification in chest X-ray images on flat and hierarchical classification scenarios
Input initialization for inversion of neural networks using k-nearest neighbor approach
Random forests
The whale optimization algorithm
Binary optimization using hybrid grey wolf optimization for feature selection
A new local search based hybrid genetic algorithm for feature selection
Two-step particle swarm optimization to solve the feature selection problem
A novel hybrid PSO GWO algorithm for optimization problems
Bat algorithm applied to continuous constrained optimization problems
Biogeography-based optimization
Multi-verse optimizer: A nature-inspired algorithm for global optimization
Satin bowerbird optimizer: A new optimization algorithm to optimize ANFIS for software development effort estimation
Memetic firefly algorithm for combinatorial optimization
CT image visual quantitative evaluation and clinical classification of coronavirus disease (COVID-19)
CT imaging features of 2019 novel coronavirus (2019-nCoV), Radiology
Chest CT severity score: An imaging tool for assessing severe COVID-19
Radiological findings from 81 patients with COVID-19 pneumonia in Wuhan, China: A descriptive study
Performance of radiologists in differentiating COVID-19 from viral pneumonia on chest CT
The role of CT in case ascertainment and management of COVID-19 pneumonia in the UK: Insights from high-incidence regions
Deep learning-based multi-view fusion model for screening 2019 novel coronavirus pneumonia: A multicentre study
Application of deep learning technique to manage COVID-19 in routine clinical practice using CT images: Results of 10 convolutional neural networks
Clinically applicable AI system for accurate diagnosis, quantitative measurements, and prognosis of COVID-19 pneumonia using computed tomography
Application of deep learning for fast detection of COVID-19 in X-rays using nCOVnet
Deep learning system to screen coronavirus disease 2019 pneumonia
A novel medical diagnosis model for COVID-19 infection detection based on deep features and Bayesian optimization
Weakly supervised deep learning for COVID-19 infection detection and classification from CT images
New machine learning method for image-based diagnosis of COVID-19
A CT Scan Dataset about COVID-19, 2020
Artificial intelligence vs COVID-19: Limitations, constraints and pitfalls
An imbalanced big data mining framework for improving optimization algorithms performance
An imbalanced big data classification framework using whale optimization and deep neural network
Algorithm: Theory, Literature Rev., Appl. Designing Photonic Crystal Filters
Metaheuristics and Swarm Methods: A Discussion on Their Performance and Applications
From ants to whales: Metaheuristics for all tastes
Stochastic fractal search: A powerful metaheuristic algorithm
MbGWO-SFS: Modified binary grey wolf optimizer based on stochastic fractal search for feature selection
Hybrid gray wolf and particle swarm optimization for feature selection
PAPSO: A power-aware VM placement technique based on particle swarm optimization
Human thermal face recognition based on random linear oracle (RLO) ensembles
Optimized superpixel and AdaBoost classifier for human thermal face recognition

The authors would like to thank Dr. Mohamed Elsayed Gawish, Radiology Registrar at Typical Medical complex in Riyadh, and Dr. Shaaban Omar, Fellow of The Royal College of Radiologists in the U.K., for their help in understanding the CT image datasets. They guided the authors in dealing with COVID-19 CT images of the infected cases and in differentiating them from other non-COVID-19 cases.