key: cord-0724116-29h8wdch
authors: Bozkurt, Ferhat
title: A deep and handcrafted features‐based framework for diagnosis of COVID‐19 from chest x‐ray images
date: 2021-11-19
journal: Concurr Comput
DOI: 10.1002/cpe.6725
sha: f23eed43a057700cc0566cea0de66bb995236088
doc_id: 724116
cord_uid: 29h8wdch

Automatic early diagnosis of COVID‐19 with computer‐aided tools is crucial for disease treatment and control. Radiology images of COVID‐19 and of other lung diseases, such as bacterial and viral pneumonia, share common features, and this similarity makes it difficult for radiologists to detect COVID‐19 cases. A reliable method for classifying non‐COVID‐19 and COVID‐19 chest x‐ray images could help shorten the triage process and support diagnosis. In this study, we develop an original framework (HANDEFU) that supports handcrafted, deep, and fusion‐based feature extraction techniques for feature engineering. The user interactively builds a model by selecting a feature extraction technique and a classification method through the framework. Further feature extraction techniques and models can be added dynamically to the software library at a later time upon request. The novelty of this study is that image preprocessing and diverse feature extraction and classification techniques are assembled under an original framework. The framework is used to diagnose COVID‐19 from chest x‐ray images on an open‐access dataset, and all experimental results and performance evaluations on this dataset are produced with this software. In the experimental studies, COVID‐19 prediction is performed by 27 different models through the software. The best performance, an accuracy of 99.36%, is obtained by the LBP+SVM model.

thickening, vascular thickening, and bronchial wall thickening. 5 In this context, chest x-ray images of COVID-19 and pneumonia are similar and contain similar features. This similarity makes it difficult for radiologists to diagnose COVID-19 cases.
A reliable method for classifying COVID-19 and non-COVID-19 chest x-ray images can accelerate the triage of non-COVID-19 cases and maximize the allocation of hospital resources to COVID-19 cases. Real-world data (such as chest x-ray images) are given to machine learning (ML) models in the form of feature vectors, which are extracted from the raw data. Feature engineering is the process of extracting relevant features from data for an ML model. 6 The number of features is as important as the features themselves. If the feature vectors representing the real-world data do not contain enough features, the model will not be able to perform its main task. If a feature vector contains more features than necessary, or unrelated features, the model will still not produce accurate results. In the ML process, not only the model but also the features that represent the data are selected. Well-selected features facilitate subsequent modeling steps and increase the resulting model's ability to complete the desired task. If the features are not selected properly, a much more complex model may be required to achieve the same level of performance. Thus, in this study we examine the performance of feature extraction techniques and feature vectors of chest x-ray images. The scope and contributions of this study can be summarized as follows:
- A general framework (HANDEFU) that can be used in different computer vision and ML problems is developed.
- This original framework supports handcrafted, deep, and fusion-based feature extraction techniques for feature engineering.
- Further feature extraction techniques and models can be added dynamically to the software library at a later time upon request.
- This framework is utilized for diagnosing COVID-19 from chest x-ray images.
- The user can select a feature extraction method and a classifier to train the model.
Then, all performance evaluations on test data are performed with this software.

The rest of this article is organized as follows. Section 2 presents the related work in the literature. Section 3 explains the methodology, which comprises the stages of the framework. Section 4 describes the dataset used for testing. Section 5 presents the experimental results and discusses the findings. Finally, Section 6 concludes the article and outlines future work.

In the literature, there exist numerous approaches to diagnose COVID-19 from chest x-ray images based on various ML, deep learning, and hybrid methods. Studies based on machine and deep learning techniques are gaining popularity for radiology images. Across the related studies, the most common and most serious problem is the lack of a sufficiently large dataset, and studies have generally tried to overcome it with various methods. Related studies that use an open-access dataset 7 similar to the one used in this study are summarized as follows. In some studies, differently sized sets of image samples per class may have been taken from the same dataset at different times, so similar techniques can show different performance. For instance, Chowdhury et al. 8

Other related studies can be summarized as follows. Rasheed et al. 14 proposed the use of logistic regression and CNN for diagnosis of COVID-19. They evaluated their method with and without PCA to obtain high accuracy, reporting an overall accuracy between 95.2% and 97.6% without PCA and between 97.6% and 100% with PCA. To detect COVID-19 from chest x-ray images, Ahmed et al. 15 proposed a COVID-Net that combines a residual network with parallel convolution and achieved an accuracy of 97.99%. Sharifrazia et al. 
16 The common shortcoming of state-of-the-art studies is that most deep learning-based models are trained on unbalanced datasets and may therefore lack robustness. Few studies address the creation of a scalable deep learning model using image enhancement/preprocessing techniques and efficient feature extraction. Because of unbalanced data and the inability to extract the necessary features from images, classification accuracy does not reach the desired level. Moreover, the models proposed in the literature cannot guarantee that their promising results will be reproduced on a larger dataset. Running a CNN architecture for many iterations on a small dataset also leads to overfitting. To overcome these limitations, this framework combines image preprocessing with handcrafted, deep, and fusion-based feature extraction techniques to create a scalable model.

In this study, a general framework (HANDEFU) that includes different feature extraction techniques and classifiers is built for use in computer vision and machine learning problems, as shown in Figure 1. Classifier, and softmax classifier exist in the framework. These options can be extended by dynamically adding them to the framework library. As shown in Figure 1, the HANDEFU framework is used for diagnosis of COVID-19 from chest x-ray images. The user loads the dataset as training and test data and is then given the opportunity to preprocess it. Next, the user selects a feature extraction method and a classifier to train the model through the framework. All performance evaluations on test data are also performed with this software. All of the stages are explained in detail as follows.

FIGURE 1 General overview of HANDEFU framework

Images may need to be preprocessed before feature extraction. The purpose of preprocessing at this stage is to facilitate the computation done in the later stages of the framework.
At this stage, the user can choose among several preprocessing options such as smoothing, noise reduction, normalization, and image resizing. The image can be enhanced in different ways. For instance, smoothing reduces undesired details or noise in the image and speeds up subsequent processing. Within a single image, some artifacts may cause an inhomogeneity problem. Normalization is the process in which multiple images are brought to a common statistical distribution in terms of size and pixel values. A single image can also be normalized within itself. Intensity normalization thus reduces the effect of noisy images. Brightness and contrast can be adjusted through a linear transformation of image intensities with dynamic windowing of each image. Figure 2 shows a sample chest x-ray image before and after preprocessing. Gaussian smoothing and median filtering significantly reduce the noise of the image. The two-dimensional (2D) Gaussian filter is often preferred for smoothing and noise reduction. Regions with a diagnosis of COVID-19 are marked by a radiologist as shown in Figure 2, and the radiologist indicates the changes visible after preprocessing. Thanks to these filters, ground-glass opacities and consolidations become clearer. Some reticular patterns related to COVID-19 are also sharpened after preprocessing, as shown in Figure 2. The 2D Gaussian filter is given in Equation (1). Here, $\sigma^2$ is the variance of the Gaussian and $x$ and $y$ are the Cartesian coordinates of the image. 24

$$ GF_{2D}(x, y) = \frac{1}{2\pi\sigma^{2}}\, e^{-\frac{x^{2} + y^{2}}{2\sigma^{2}}} \qquad (1) $$

The sizes of the images in the database used for image classification and pattern recognition may differ from each other. In some feature extraction functions, the number of extracted features depends on the size of the image (e.g., LBP, HOG, and Gabor).
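As a rough illustration of the Gaussian smoothing in Equation (1), the following Python sketch samples the 2D Gaussian on a small grid and applies it to an image. It is not the HANDEFU implementation: the function names `gaussian_kernel_2d` and `smooth`, the kernel size, the value of sigma, and the edge padding are all assumptions made for this example.

```python
import numpy as np


def gaussian_kernel_2d(size: int, sigma: float) -> np.ndarray:
    """Sample the 2D Gaussian of Equation (1) on a size x size grid."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a centre pixel")
    half = size // 2
    coords = np.arange(-half, half + 1)
    x, y = np.meshgrid(coords, coords)
    kernel = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    # Renormalise so the discrete weights sum to 1 (keeps overall brightness).
    return kernel / kernel.sum()


def smooth(image: np.ndarray, size: int = 5, sigma: float = 1.0) -> np.ndarray:
    """Smooth a grayscale image with the sampled Gaussian (edge padding assumed)."""
    kernel = gaussian_kernel_2d(size, sigma)
    half = size // 2
    rows, cols = image.shape
    padded = np.pad(image.astype(float), half, mode="edge")
    out = np.zeros((rows, cols), dtype=float)
    # Accumulate shifted copies of the image weighted by the kernel entries.
    for dy in range(size):
        for dx in range(size):
            out += kernel[dy, dx] * padded[dy:dy + rows, dx:dx + cols]
    return out
```

A larger sigma spreads the weights further from the centre and therefore smooths more strongly, at the cost of blurring fine structures.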
To apply these methods, the number of features obtained from the images must be the same. For this, all images must be resized to a fixed size. In addition, when the size and resolution of the images are reduced by resizing, the time required to process the images in the framework decreases and the performance of the framework improves. In this study, the images are resized from 1024×1024 to 250×250.

FIGURE 2 A sample of chest x-ray image (A) before preprocessing and (B) after preprocessing

The fundamental elements of the pattern recognition process are preprocessing, feature extraction, feature selection, training, and classification, as shown in Figure 3. A feature can be described as measurable or observable information about the pattern. Feature extraction obtains the characteristic features of the pattern by eliminating irrelevant and excessive information. This is essentially a kind of dimensionality reduction. Since very high-dimensional data are used, especially in image processing applications, using these data in their original form greatly increases the processing time. Hand-crafted feature extraction methods reduce the original data to a smaller size by eliminating unnecessary, high-dimensional information while preserving its characteristic features. 25 Feature extraction, which is one of the pattern recognition stages, plays an important role in the success of the system. Appropriately selected features represent the characteristics of the classes and positively affect recognition success. The purpose of feature extraction is to carry as much of the class information as possible into a smaller number of dimensions. 25 In other words, the purpose of feature extraction is to find a transform $T$ that expresses the $X$ space in the best possible way, mapping a $k$-dimensional observation space $X$ to a smaller $l$-dimensional feature space $Y$, as given in Equation (2).

$$ Y = T(X), \qquad T: \mathbb{R}^{k} \rightarrow \mathbb{R}^{l}, \qquad l < k \qquad (2) $$
Here, numerical features can take the form of a set of vectors that characterize the class of the pattern.

FIGURE 3 An overview of handcrafted-based feature extraction and classification

Figure 3 shows an overview of handcrafted feature extraction and classification. In this stage, several handcrafted feature extraction methods in the HANDEFU framework (e.g., LBP, HOG, Gabor) are explained in detail.

LBP feature extraction is used in applications such as texture classification, face recognition, facial expression recognition, and gender and age classification. 26 LBP is a pattern recognition method based on binary coding between each pixel in the image and its neighbors. It generates binary codes by comparing the 3×3 neighborhood of each pixel in the grayscale image. The 3×3 neighborhood around each pixel is treated as circular, and the central pixel is used as the threshold value. Each neighbor is encoded as 1 if its value is greater than or equal to the value of the center pixel, and as 0 otherwise. The resulting 8-bit binary code is converted to a decimal number, which is the LBP code of the center pixel. This process is repeated for each pixel of the image, as shown in Figure 4. Structures in an image (such as corners, edges, line regions, and bright or dark regions) represent the features of the image, and the LBP code makes it possible to extract these different textural features. In short, for the center pixel at position (x, y), the binary code is generated by thresholding the surrounding pixels against the center pixel.
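The thresholding procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch and not the framework's code: the function name `lbp_image`, the clockwise bit ordering starting at the top-left neighbor, and the decision to skip border pixels are assumptions (any fixed ordering yields a valid LBP code).

```python
import numpy as np

# Offsets (row, column) of the 8 neighbours, clockwise from the top-left pixel.
# The ordering is an assumption of this sketch.
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                     (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_image(gray) -> np.ndarray:
    """Return the 8-bit LBP code of every interior pixel of a grayscale image."""
    gray = np.asarray(gray, dtype=np.int32)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(NEIGHBOUR_OFFSETS):
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # A neighbour contributes 2**bit when it is >= the centre pixel.
        codes += (neighbour >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes


patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 6, 3]]
print(lbp_image(patch))  # [[42]]: neighbours 9, 7 and 6 set bits 1, 3 and 5
```

A histogram of the resulting codes, for example `np.bincount(codes.ravel(), minlength=256)`, is what would typically be passed on as a texture descriptor.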
The LBP codes are generated by Equation (3). In Equation (3), $(x_i, y_i)$ are the coordinates of the center pixel, $g_i$ is the gray value of the center pixel, and $g_n$ are the gray values of the pixels in the 3×3 neighborhood surrounding the center pixel. The function $f(k)$ in Equation (4) thresholds each of the 8 neighboring gray values against the center value. 27-29

$$ LBP(x_i, y_i) = \sum_{n=0}^{7} f(g_n - g_i)\, 2^{n} \qquad (3) $$

$$ f(k) = \begin{cases} 1, & k \geq 0 \\ 0, & k < 0 \end{cases} \qquad (4) $$

Regional histograms can be used to get more efficient results. For this purpose, the x-ray image is divided into 8×8 = 64 square regions and histograms are obtained by calculating the LBP codes separately for each region. 27 These histograms are concatenated and used in classification as feature vectors.

FIGURE 4 Implementation of the LBP operator

In recent years, the use of edge orientation histograms in image recognition studies has gained great momentum. Histograms of oriented gradients (HOG), that is, local gradient orientation histograms, were first suggested by Dalal and Triggs. 30 HOG characterizes an image by the orientation ($\theta$) and magnitude ($m$) of its gradients.
• Local histograms are created using gradient magnitude and orientations for each specified block.
• The created local histograms are normalized.
• All histograms are concatenated and used as feature vector.

The horizontal ($f_x$) and vertical ($f_y$) image gradients are computed by applying derivative masks to the gray-level image, using Equations (5) and (6). In these equations, $I(x, y)$ is the pixel intensity at the point $(x, y)$. 31 The gradient magnitude ($m$) and gradient orientation ($\theta$) are calculated from the horizontal and vertical image gradients as follows. 31

$$ m(x, y) = \sqrt{f_x(x, y)^{2} + f_y(x, y)^{2}} \qquad (7) $$

$$ \theta(x, y) = \arctan\!\left(\frac{f_y(x, y)}{f_x(x, y)}\right) \qquad (8) $$

in one block as shown in Figure 5. This histogram is then normalized using Equation (9). In this equation, $V_k$ is the large histogram vector obtained from a block and $v$ is the normalized HOG feature vector. The L2 norm is used for normalization. 31 At this stage, the computed HOG feature vector is ready to be given as input to a classifier.
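The HOG building blocks described above (gradients, magnitude and orientation, an orientation histogram, and L2 normalization) can be sketched as follows. The centered-difference derivative mask, the 9 unsigned orientation bins, the small constant `eps`, and all function names are assumptions of this sketch rather than details confirmed by the source.

```python
import numpy as np


def gradient_magnitude_orientation(image):
    """Gradient magnitude and unsigned orientation (degrees) per pixel."""
    img = np.asarray(image, dtype=float)
    fx = np.zeros_like(img)
    fy = np.zeros_like(img)
    # Centred differences, i.e. a [-1, 0, 1] derivative mask (assumed here).
    fx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    fy[1:-1, :] = img[2:, :] - img[:-2, :]
    magnitude = np.hypot(fx, fy)
    # Fold the angle into [0, 180) so opposite directions share a bin.
    orientation = np.degrees(np.arctan2(fy, fx)) % 180.0
    return magnitude, orientation


def cell_histogram(magnitude, orientation, bins=9):
    """Orientation histogram of one cell, weighted by gradient magnitude."""
    hist, _ = np.histogram(orientation, bins=bins, range=(0.0, 180.0),
                           weights=magnitude)
    return hist


def l2_normalize(v, eps=1e-6):
    """L2-normalise a block histogram; eps only guards against division by zero."""
    v = np.asarray(v, dtype=float)
    return v / np.sqrt(np.sum(v**2) + eps**2)
```

In a full descriptor these functions would be applied per cell and per block, and the normalized block vectors concatenated into the final feature vector.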
The Gabor wavelet technique shows great similarity with the human visual system in terms of its frequency and orientation characteristics. Gabor wavelets are used in computer vision applications and in modeling biological vision (particularly as texture descriptors for fingerprint, palm, and face recognition). 32 In the wavelet transform, spatial information is obtained in addition to the frequency information of the image, so the frequency content can be localized. For this reason, and in contrast to the Fourier transform, it transforms the function according to both frequency and scale parameters. This technique, which is especially preferred in texture analysis, uses Gaussian elements modulated by sine and cosine functions. A Gabor filter can be applied to 1D signals or 2D images. A Gabor filter is obtained by modulating a sine wave with a Gaussian function. 33 The 2D Gabor filter kernel matrix is calculated by Equation (10). In this equation, the first exponential term (the Gaussian function) decays with $x'$ and $y'$, while the second exponential term is a periodic term containing a complex number. 32, 33

$$ g(x, y; \lambda, \theta, \psi, \sigma, \gamma) = \exp\!\left(-\frac{x'^{2} + \gamma^{2} y'^{2}}{2\sigma^{2}}\right) \exp\!\left(i\left(2\pi \frac{x'}{\lambda} + \psi\right)\right) \qquad (10) $$

In the equation, $\lambda$ is the coefficient that determines the wavelength of the cosine factor. If this coefficient is 1, the cosine expression is always 1 ($\cos(2\pi x') = 1$), so the coefficient should be chosen as 2 or a larger integer. $\theta$ is used to calculate the values of $x'$ and $y'$ and is the orientation angle of the Gabor kernel to be created. The variables $x'$ and $y'$ are calculated by the following formula for a given value of $\theta$. 32

$$ x' = x\cos\theta + y\sin\theta, \qquad y' = -x\sin\theta + y\cos\theta \qquad (11) $$

$\psi$ is the phase angle of the kernel matrix to be formed. The filter can be shifted along the x-axis by changing this value. $\sigma$ is the coefficient that determines the standard deviation (SD) of the Gaussian function. $\gamma$ allows the SD value to be determined for $y'$.
If $\gamma$ is 1, the resulting kernel has equal extent in both directions, since it has the same SD for x and y. When a different ratio is chosen, the kernel takes an elongated shape. Applying the generated Gabor filter to an image is a 2D convolution and is performed with Equation (12). In the equation, $I(p, q)$ is the image to which the Gabor filter is applied and $g(x - p, y - q; \lambda, \theta, \psi, \sigma, \gamma)$ denotes the Gabor filter. 32, 33

$$ G(x, y; \lambda, \theta, \psi, \sigma, \gamma) = \sum_{p} \sum_{q} I(p, q)\, g(x - p, y - q; \lambda, \theta, \psi, \sigma, \gamma) \qquad (12) $$

In this study, Gabor feature extraction is performed by applying 40 Gabor filters obtained from 5 scales and 8 orientations, as shown in Figure 6.

FIGURE 6 General structure of Gabor (scales, orientations, and feature vector)

To define a model with classical ML techniques or to set up a detection (prediction) system, the feature vector must first be extracted. Extracting the feature vector requires experts in the field, takes much time, and keeps the expert very busy. For this reason, these techniques cannot process raw data without preprocessing and expert assistance (as supervised learning). Deep learning (DL) has made great progress by overcoming this problem, which those working in the field of ML had been dealing with for many years. Deep learning is a field of ML research that has emerged in recent years from studies on artificial neural networks. Unlike traditional ML and image processing techniques, deep networks learn directly from raw data. While processing the raw data, the network acquires the required information through the representations obtained in its different layers. 34 DL algorithms scale to large datasets and are effective in identifying complex patterns in feature-rich datasets. The main fields in which deep learning has been used successfully are computer vision, speech recognition, and natural language processing.
The reason for the widespread success of deep learning is that it predicts the output directly from the learned set of representations. An important advantage of deep learning over traditional techniques is that it does not require an explicit feature extraction stage. The network takes the raw input and maps it to the desired output, and features are learned automatically by the network without manual intervention. These models do, however, require high processing power.

A CNN, as shown in Figure 7, is a mathematical structure that typically consists of three types of layers: convolution, pooling, and fully connected layers. The convolution and pooling layers perform feature extraction, while the fully connected layer produces the final output by classifying the extracted features. 28, 35 In digital images, pixel values are stored in a 2D matrix, and a small convolution matrix called a feature extractor mask is applied at each point of the image. As each layer feeds its output to the next layer, the extracted features become hierarchically and increasingly complex. At this stage, learning takes place in the layers through the masks that are used, and the values of the layers are computed.

Convolutional layer: The convolution process, which forms the basis of CNNs, applies a filter matrix to the input and passes the results to the next layer. This layer is based on sliding small filters, such as 2 × 2, 3 × 3, and 5 × 5, over the entire image. A new image is thus obtained by extracting the more distinctive features of the image. The weights of the filter matrix used for the convolution are determined during the learning stage of the CNN. The filter matrix is shifted with the specified stride and the convolution operation is applied at each position. The result is given as the input to the next layer; if the layer is the last one, the result represents the output image.
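The sliding-filter operation of the convolutional layer can be illustrated with a minimal Python sketch. It assumes a single-channel image, no padding, and the unflipped (cross-correlation) form that deep learning libraries commonly call convolution; the function name `conv2d_valid` is hypothetical.

```python
import numpy as np


def conv2d_valid(image, kernel, stride=1):
    """Slide `kernel` over `image` with the given stride (no padding)."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            window = image[top:top + kh, left:left + kw]
            # Multiply the window element-wise by the filter and sum.
            out[i, j] = np.sum(window * kernel)
    return out


image = np.arange(16).reshape(4, 4)
print(conv2d_valid(image, np.ones((2, 2)), stride=2))
# [[10. 18.]
#  [42. 50.]]
```

In a trained CNN the kernel entries are learned weights; the all-ones kernel here is used only so that the output can be checked by hand.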
The filter coefficients ($f$) are multiplied element-wise by equal-sized windows ($w$) of the image and the products are summed, as in Equation (13). As a result, a new image based on distinctive high-level features is obtained. 36, 37

$$ w \ast f = \sum_{i} \sum_{j} w(i, j)\, f(i, j) \qquad (13) $$

Pooling layer: The pooling layer is used to reduce the computational load by reducing the number of parameters within the network. There are many types of pooling layers. Maximum, average, and L2 pooling options are available in the HANDEFU framework. In maximum pooling, the highest pixel value within the filter window is kept as a single value in the output pixel; in average pooling, the average of all pixel values within the filter window is kept. At the end of the pooling process, the spatial size of the image is reduced. Some pixel information is lost in this process, but the loss creates less computational load for subsequent layers.

FIGURE 7 Fundamental architecture of the CNN

Dropout layer: The dropout layer is the most commonly used regularization layer in deep learning methods. Dropout is the removal of nodes from the input and hidden layers according to certain rules. The aim is to reduce the parameters and computation without loss of performance and to prevent overfitting. The dropout layer is generally used after fully connected layers. 37 As a result, there are fewer information links between nodes, and the nodes are less affected by each other's weight changes, which yields more consistent network models.

Batch normalization: Batch normalization is a method used to regularize the CNN, like the dropout layer. Besides regularizing the neural network, it also helps the network resist vanishing gradients during training. This can reduce training time and provide better performance. For the batch normalization computation, the input is defined as $B = \{x_1, \ldots, x_m\}$ and the trainable parameters are denoted $\gamma$ and $\beta$. The operation in Equation (14) is the average of the mini-batch.

$$ \mu_B = \frac{1}{m} \sum_{i=1}^{m} x_i \qquad (14) $$
The average is calculated by summing all the inputs in the mini-batch and dividing by the total number of inputs. Equation (15) is the variance of the mini-batch. Together, these two quantities describe the general range in which the data in the mini-batch lie. The data are normalized with Equation (16), and the normalized data are then processed with $\gamma$ (scaling parameter) and $\beta$ (shifting parameter) in Equation (17), where $\epsilon$ is a small constant added for numerical stability.

$$ \sigma_B^{2} = \frac{1}{m} \sum_{i=1}^{m} (x_i - \mu_B)^{2} \qquad (15) $$

$$ \hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^{2} + \epsilon}} \qquad (16) $$

$$ y_i = \gamma\, \hat{x}_i + \beta \qquad (17) $$

As a result of these operations, a learnable linear equation is obtained. $\gamma$ and $\beta$ are learnable parameters of the designed network. 5 These parameters determine how strongly the batch normalization layer acts on the data.

Fully connected layer: The fully connected layer is commonly located towards the end of the CNN architecture, where it is used for classification and to optimize class scores. The number of these layers may vary between deep learning architectures.

Classification layer: The classification layer is the last layer of the CNN model, in which the classification is performed. The number of output values of this layer equals the number of classes, which depends on the number of objects to be recognized. Activation functions are used to learn and predict any continuous and complex relationship between network variables. In this framework, the ReLU activation function is available to avoid linearity in the network. Since multi-class classification is needed after the fully connected layer in the CNN architecture, the softmax function is used as in Equation (18), where $z_i$ is the score of class $i$ and $C$ is the number of classes.

$$ \mathrm{softmax}(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{C} e^{z_j}} \qquad (18) $$

The softmax classifier is commonly used in this layer of deep learning architectures. It produces probabilistic values between 0 and 1 for each class. 5 The class with the highest probability value is the class predicted by the model.
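The mini-batch normalization steps and the softmax function described above can be written out as a short NumPy sketch. It assumes inputs of shape (batch, features), a stability constant `eps`, and the usual max-subtraction trick for softmax; the function names are illustrative and not taken from the framework.

```python
import numpy as np


def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalise each feature over the mini-batch, then scale and shift."""
    x = np.asarray(x, dtype=float)
    mu = x.mean(axis=0)                     # mini-batch mean
    var = x.var(axis=0)                     # mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)   # normalisation
    return gamma * x_hat + beta             # learnable scale and shift


def softmax(z):
    """Map class scores to probabilities that sum to 1 along the last axis."""
    z = np.asarray(z, dtype=float)
    # Subtracting the maximum does not change the result but avoids overflow.
    shifted = z - z.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)


probs = softmax([1.0, 2.0, 3.0])
predicted_class = int(np.argmax(probs))  # index of the most probable class
```

During inference, deep learning libraries typically replace the mini-batch statistics with running averages collected during training; that detail is omitted here.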
As shown in Figure 8, the user can build a sample deep CNN model consisting of four sets of convolution-pooling layers, a flatten layer, and three fully connected layers through the HANDEFU framework. Moreover, the framework contains popular deep learning models: a shallow fully connected network, a deep fully connected network, a shallow convolutional network, a deep convolutional network, AlexNet, VGG16, VGG19, ResNet50, InceptionV3, Xception, and DenseNet121, as shown in Figure 9. Thus the user can build both shallow and deep neural network models through the HANDEFU framework. Because the framework has a dynamic structure, other popular deep learning techniques can be added to the library later.

Ensemble learning is the creation of a new model by using multiple learning algorithms together instead of a single base learner. At this stage, the aim is to obtain high performance with a robust model that combines handcrafted features and deep learning features. Feature fusion is achieved by combining multiple feature vectors. 17 In this context, previously obtained features are combined into a single vector. Here, a feature fusion is obtained by integrating the maximum feature vector. The fusion of features with entropy determines the projection of the method. For instance, deep feature extraction is done by InceptionV3 (Equation 19) and handcrafted feature extraction is done by LBP (Equation 20). The extracted features are then combined into a single feature vector as in Equation (21). In order to obtain better features depends on score, entropy is

In this study, an open-access dataset is used for classification of COVID-19 chest x-ray images. This dataset consists of chest x-ray images of COVID-19 positive cases along with normal and viral pneumonia images.
A team of researchers from Qatar University (Doha, Qatar) and the University of Dhaka (Bangladesh), together with their collaborators and doctors, compiled this database. 7, 8 As shown in Table 1, the dataset contains a total of 3886 chest x-ray images: 1200 COVID-19 (positive cases), 1345 viral pneumonia, and 1341 normal (negative cases) images. Figure 10 shows image samples of normal (negative), viral pneumonia, and COVID-19 cases from this dataset. Regions with a diagnosis of viral pneumonia and COVID-19 are marked by a radiologist as shown in Figure 10. In the experiments, this dataset is used for training and testing through the framework, and all predictions and classifications of COVID-19 chest x-ray images are performed on it.

The experiments in this study are conducted on a desktop computer with Intel (R) Core-i7 8700U CPU @ 3.20 GHz, 4GB NVIDIA GeForce

In these equations, True Positive (TP) means that the patient actually has COVID-19 and the system recognized the case as positive. True Negative (TN) means that the patient does not have COVID-19 and the system recognized the case as negative. False Positive (FP) means that the patient does not have COVID-19 but the system recognized the case as positive. False Negative (FN) means that the patient actually has COVID-19 but the system recognized the case as negative.

In this section, different feature extraction methods and models are compared through the HANDEFU framework. The performance comparisons (top 5 models are marked in bold according to Acc) are given in Table 2. The user can split the data in advance so that 70% is reserved for training and 20% for testing, or 80% for training and 20% for testing. In this section, we performed the performance evaluations for the 80%-20% split. The methods are categorized as hand-crafted, deep, and fusion-based feature extraction.
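The TP, TN, FP, and FN counts defined above feed into the usual classification metrics. Since the metric equations themselves are not preserved in this text, the sketch below uses the standard textbook definitions of accuracy, precision, recall (sensitivity), specificity, and F1-score; which of these the study actually reported, apart from accuracy, is an assumption, and the function name is hypothetical.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=90, tn=95, fp=5, fn=10)
print(metrics["accuracy"])  # 0.925
```

For a multi-class problem such as COVID-19 versus viral pneumonia versus normal, these counts would be taken per class in a one-versus-rest fashion.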
For instance, LBP, HOG, and Gabor-based texture descriptors are used as hand-crafted feature extraction techniques with kNN, NB, NN, and SVM classifiers. In this context, 12 different models exist, such as LBP+SVM, HOG+SVM, Gabor+SVM, and so forth. The user can select any model and run it on the loaded training and test data through the framework. Similarly, the user can build shallow and deep neural network models through the HANDEFU framework. Because the framework has a dynamic structure, other popular deep learning techniques can be added later. At present, the framework includes the most popular deep learning models, namely AlexNet, VGG16, VGG19, ResNet50, InceptionV3, Xception, and DenseNet121, as shown in Table 2. The different deep feature extraction models are compared using the obtained performance metrics. Furthermore, in the fusion-based category, the user can select and combine different deep and handcrafted feature extraction methods, such as InceptionV3+LBP, VGG19+GLCM, and VGG19+HOG, through the framework. The performance evaluation results of all the models on this dataset are given in Table 2, so the superior models can be identified by comparing their performance. The top 5 models among different feature extraction methods and their performance comparisons are presented in Figure 12 Table 3 . Furthermore, the user can obtain training/test accuracy and training/test loss graphs over the iterations, as shown in Figure 13. The graphs of each model are saved by the software.

TABLE 2 Performance comparisons of different feature extraction methods and models (top 5 models are marked in bold according to Acc)

In the experimental studies, it has been observed that the LBP-based hand-crafted feature extraction method is more successful in diagnosing COVID-19 cases for this dataset.
As shown in Table 2, the LBP+SVM model gives the highest accuracy, 99.36%. In addition, this model has the shortest execution time (approximately 5 min) of all the methods, because regional histograms in LBP give more efficient results. The x-ray image is divided into 8×8 = 64 square regions and histograms are obtained by calculating the LBP codes separately for each region. These histograms are concatenated and then given to the classifier as the feature vector. A feature vector of size 3776 is obtained for each image in the dataset. Similarly, HOG, another hand-crafted feature extraction method, is studied in the experiments. It is seen from the Table 1

In Figure 13, training/test accuracy and training/test loss graphs for 50 iterations of DenseNet121 are presented. The evaluation results, confusion matrices, training/test accuracy graphs, and training/test loss graphs of all the models can be obtained from this software. Both the training and test accuracy curves rise as the number of iterations increases, while the loss curve shows the error rate decreasing. This indicates that the network is training at a suitable learning rate. As shown in Figure 13, the loss value decreases over the iterations while the accuracy on the given training set increases, so learning occurs.

Performance comparison of the current study with other studies is given as shown in with small size and unbalanced data small amount of COVID-19 images. In addition, some studies evaluated classification success for a binary problem only. By its nature, binary classification has been observed to give higher accuracy than multi-class classification. Hence, in the current study, a balanced dataset was created and acceptably high results are obtained for multiple classes.

From past to present, infectious diseases have been one of the most important threats to human health.
COVID-19 is an infectious disease caused by severe acute respiratory syndrome coronavirus. The rapid spread of the disease and the increase in mortality rates in many countries reveal that an effective treatment method should be developed. At this point, artificial intelligence and ML can contribute to this problem. Especially, there is a serious shortage of experts and traditional pneumonia and COVID-19 have great similarities. ML aided diagnosing method could be an important milestone towards significantly reducing test time. This technique offers both a lower cost and a more accurate diagnosis and treatment for COVID-19 and similar diseases. Lung x-ray radiography is an effective tool for triaging non-COVID-19 patients with pneumonia to efficiently allocate hospital resources. However, there are many common features between x-ray images of COVID-19 and pneumonia caused by other viral infections, like common influenza. This similarity makes it difficult for radiologists to diagnose COVID-19 cases. Feature engineering is the process of extracting relevant features from data for a ML model. The count of features is as important as the features themselves. Well-selected features facilitate subsequent modeling steps and increase the resulting model's ability to complete the desired task (e.g., detection, classification). Thus, we study on performances of feature extraction techniques and feature vectors of chest x-ray images. In this study, we develop an original framework (HANDEFU) that supports handcrafted, deep, and fusion-based feature extraction techniques for feature engineering. The novelty of this study is that image preprocessing and diverse feature extraction and classification techniques are assembled under an original framework. This framework could be used in different computer vision and ML problems. The framework has a dynamic structure. 
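One way to picture such a dynamic structure is a plug-in registry in which feature extractors and classifiers are stored by name and combined on request. The sketch below is a hypothetical illustration, not the HANDEFU implementation: the registry dictionaries, the `register`, `fuse`, and `build_model` helpers, and the placeholder extractors and classifiers are all assumptions, and fusion is assumed here to mean concatenating feature vectors.

```python
from itertools import product

EXTRACTORS = {}   # name -> function(image) -> list of floats
CLASSIFIERS = {}  # name -> factory that returns a classifier object


def register(table, name):
    """Decorator that stores a callable in a registry under `name`."""
    def wrap(fn):
        table[name] = fn
        return fn
    return wrap


# Placeholder extractors: trivial stand-ins for real LBP/HOG/Gabor code.
@register(EXTRACTORS, "LBP")
def lbp_stub(image):
    return [float(sum(row)) for row in image]


@register(EXTRACTORS, "HOG")
def hog_stub(image):
    return [float(max(row) - min(row)) for row in image]


@register(EXTRACTORS, "Gabor")
def gabor_stub(image):
    return [float(row[0]) for row in image]


def placeholder_classifier(name):
    """Return a factory producing a dummy classifier description."""
    return lambda: f"<{name} classifier placeholder>"


for clf_name in ("kNN", "NB", "NN", "SVM"):
    CLASSIFIERS[clf_name] = placeholder_classifier(clf_name)


def fuse(*names):
    """Build a fused extractor by concatenating feature vectors (assumed)."""
    def extractor(image):
        out = []
        for n in names:
            out.extend(EXTRACTORS[n](image))
        return out
    return extractor


def build_model(extractor_name, classifier_name):
    """Resolve a model such as LBP+SVM into (extractor, classifier)."""
    return EXTRACTORS[extractor_name], CLASSIFIERS[classifier_name]()


# Every extractor/classifier pairing currently registered.
models = [f"{e}+{c}" for e, c in product(EXTRACTORS, CLASSIFIERS)]
print(len(models))  # 12 (3 extractors x 4 classifiers)
```

Adding a technique later then amounts to one more registry entry, for example `EXTRACTORS["HOG+LBP"] = fuse("HOG", "LBP")`, after which it can be paired with any registered classifier.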
The user is able to build any model with handcrafted, deep, and fusion-based feature extraction techniques and various classifiers through the HANDEFU framework. At present, the framework contains popular pretrained and deep learning techniques. Any feature extraction technique or model can be added dynamically to the software library at a later time upon request. Thus, the user can choose a feature extraction method and a classifier to train a model. All performance evaluations on test data are performed with this software. The framework is utilized for diagnosing COVID-19 from chest x-ray images. In the experimental studies, COVID-19 prediction is performed with 27 different models through the software. The best performance, an accuracy of 99.36%, is obtained by the LBP+SVM model. This model also has the shortest execution time among the methods, because regional histograms in LBP give more efficient results for feature extraction. In further studies, more handcrafted feature-based and deep learning-based techniques and more data will be studied and compared. Furthermore, by using this framework in the background, it is planned to develop a mobile and web-based system aimed at supporting health professionals in their efforts to detect cases of COVID-19 and other diseases.

The data that support the findings of this study are openly available in the COVID-19 Chest X-ray Database at https://www.kaggle.com/tawsifurrahman/covid19-radiography-database, reference number 7.

Ferhat Bozkurt https://orcid.org/0000-0003-0088-5825

CNN; 2971 images (285 COVID-19, 1345 viral pneumonia, and 1341 normal)
40; 2020; VGG19; 1428 images (224 COVID-19, 700 viral pneumonia, and 504 normal); Binary: 98.75%, Multiple: 93
SqueezeNet; 487 images (423 COVID-19, 1485 viral pneumonia, and 1579 normal); Multiple: 98
41; 2020; COVID-XNet; 6926 images (2589 COVID-19 and 4337 normal); Binary: 94
MRFODE+kNN; 1560 images (219 COVID-19 and 1341 non-COVID-19 images)
43; 2020; CoroNet; 251 images (284 COVID-19, 330 bacterial pneumonia, 327 viral pneumonia, and 310 normal); Binary: 89
CVDNet; 2905 images (219 COVID-19, 1345 viral pneumonia, and 1341 normal); Binary: 97.20%, Multiple: 96.69%
DarkCovidNet; 1127 images (127 COVID-19, 500 viral pneumonia, and 500 normal); Binary: 98.08%, Multiple: 87.02%
45; 2020; Convolutional CapsNet; 2331 images (231 COVID-19, 1050 viral pneumonia, and 1050 normal); Binary: 97.24%, Multiple: 84
Ensemble deep learning; 4200 images (1050 COVID-19, 1050 bacterial pneumonia, 1050 viral pneumonia, and 1050 normal); Multiple: 96
CNN+HOG+VGG19; 5090 images
CNN-BiLSTM; 2905 images (219 COVID-19, 1345 viral pneumonia, and 1341 normal)
Ensemble deep learning; 10,000 images (2161 COVID-19, 2022 viral pneumonia, and 5563 normal); Binary: 98.33%, Multiple: 92
1345 viral pneumonia, and 1341 normal); Binary: 99.52%, Multiple: 99.08%
Xception; 6432 images (576 COVID-19, 4273 viral pneumonia, and 1583 normal); Multiple: 97
non-COVID, and 8851 normal); Multiple: 95.11%
Current study; 2021; LBP+SVM; 3886 images (1200 COVID-19, 1345 viral pneumonia, and 1341 normal)

REFERENCES
1. WHO. Pneumonia of unknown cause-China. Emergencies preparedness, response. Disease outbreak news.
World Health Organization (WHO)
2. Prognostication of patients with COVID-19 using artificial intelligence based on chest x-rays and clinical data: a retrospective study
3. Radiological findings from 81 patients with COVID-19 pneumonia in Wuhan, China: a descriptive study
4. An automated machine learning model to assist in the diagnosis of COVID-19 infection in chest x-ray images. medRxiv
5. CVDNet: a novel deep learning architecture for detection of coronavirus (Covid-19) from chest x-ray images
6. Feature Engineering for Machine Learning: Principles and Techniques for Data Scientists
7. COVID-19 chest X-ray database
8. Can AI help in screening viral and COVID-19 pneumonia?
9. Early detection of coronavirus cases using chest X-ray images employing machine learning and deep learning approaches. medRxiv
10. CNN-based transfer learning-BiLSTM network: a novel approach for COVID-19 infection detection
11. InstaCovNet-19: a deep learning classification model for the detection of COVID-19 patients using Chest X-ray
12. Classification of covid-19 from chest x-ray images using deep convolutional neural networks
13. Texture analysis in the evaluation of Covid-19 pneumonia in chest X-ray images: a proof of concept study
14. A machine learning-based framework for diagnosis of COVID-19 from chest X-ray images
15. Convid-Net: an enhanced convolutional neural network framework for COVID-19 detection from X-ray images
16. Fusion of convolution neural network, support vector machine and Sobel filter for accurate detection of COVID-19 patients using X-ray images
17. A novel hand-crafted with deep learning features based fusion model for COVID-19 diagnosis and classification using chest X-ray images
18. A deep learning-based COVID-19 automatic diagnostic framework using chest X-ray images
19. Deep learning based detection and analysis of COVID-19 on chest X-ray images
20. Covid-densenet: a deep learning architecture to detect covid-19 from chest radiology images
21. COVIDetectioNet: COVID-19 diagnosis system based on X-ray images using features selected from pre-learned deep features ensemble
22. Exploring the effect of image enhancement techniques on COVID-19 detection using chest X-ray images
23. Towards an effective and efficient deep learning model for COVID-19 patterns detection in X-ray images
24. Effective Gaussian blurring process on graphics processing unit with CUDA
25. Handcrafted vs. non-handcrafted features for computer vision classification
26. A survey on local binary pattern and gabor filter as texture descriptors of smart profiling systems
27. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns
28. Feature extraction based on deep-convolutional neural network for face recognition
29. COVID-19 diagnosis from CT scans and chest X-ray images using low-cost raspberry Pi
30. Histograms of oriented gradients for human detection
31. Hardware architecture for HOG feature extraction
32. A new facial expression recognition method based on local Gabor filter bank and PCA plus LDA
33. 2D face recognition system based on selected Gabor filters and linear discriminant analysis LDA
34. Deep learning
35. An introduction to convolutional neural networks
36. Design of convolution neural network for facial emotion recognition
37. COVINet: a convolutional neural network approach for predicting COVID-19 from chest X-ray images
38. A texture-based 3D region growing approach for segmentation of ICA through the skull base in CTA
39. Real time wearable speech recognition system for deaf persons
40. Covid-19: automatic detection from x-ray images utilizing transfer learning with convolutional neural networks
41. COVID-XNet: a custom deep learning system to diagnose and locate COVID-19 in chest X-ray images
42. New machine learning method for image-based diagnosis of COVID-19
43. CoroNet: a deep neural network for detection and diagnosis of COVID-19 from chest x-ray images
44. Automated detection of COVID-19 cases using deep neural networks with X-ray images
45. Convolutional capsnet: a novel artificial neural network approach to detect COVID-19 disease from X-ray images using capsule networks
46. Deep ensemble model for classification of novel coronavirus in chest X-ray images
47. COVID-19 detection from chest X-ray images using feature fusion and deep learning

How to cite this article: Bozkurt F. A deep and handcrafted features-based framework for diagnosis of COVID-19 from chest x-ray images