key: cord-1037505-9xv3a5w7
authors: Ashour, Amira S.; Eissa, Merihan M.; Wahba, Maram A.; Elsawy, Radwa A.; Elgnainy, Hamada Fathy; Tolba, Mohamed Saeed; Mohamed, Waleed S.
title: Ensemble-based Bag of Features for Automated Classification of Normal and COVID-19 CXR Images
date: 2021-04-20
journal: Biomed Signal Process Control
DOI: 10.1016/j.bspc.2021.102656
sha: 0ca81fc6efdb5bb80905059c1203c977adac94f4
doc_id: 1037505
cord_uid: 9xv3a5w7

The medical and scientific communities are currently trying to treat infected patients and develop vaccines to prevent future outbreaks. In healthcare, machine learning has proven to be an efficient technology for helping to combat COVID-19. Hospitals are now overwhelmed by the increasing number of COVID-19 infections, and given patients' confidentiality and rights, it is hard to assemble quality medical image datasets in a timely manner. For COVID-19 diagnosis, several traditional computer-aided detection systems based on classification techniques have been proposed, and the bag-of-features (BoF) model has shown promising potential in this domain. Thus, this work developed an ensemble-based BoF classification system for COVID-19 detection, in which an ensemble is applied at the classification step of the BoF. The proposed system was evaluated and compared to different classification systems for different numbers of visual words to assess their effect on classification efficiency. The results proved the superiority of the proposed ensemble-based BoF for the classification of normal and COVID-19 chest X-ray (CXR) images compared to other classifiers.

The COVID-19 pandemic, as announced by the World Health Organization in 2020, is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which was first reported in Wuhan, China before affecting 218 countries and territories worldwide. Compared to the outbreaks of other coronavirus infections, COVID-19 is considered the most contagious and widespread coronavirus [1]. COVID-19, or coronavirus disease 2019, can spread via several means, primarily via droplets and excretions from an infected person while sneezing, coughing, speaking, or breathing. The reported symptoms range from mild ones, such as cough, fatigue, fever, difficulty breathing, and sudden loss of taste and smell, to severe complications, such as pneumonia and acute respiratory distress syndrome (ARDS). The molecular test is considered the most common diagnostic test for COVID-19 in comparison to antigen or antibody tests. However, molecular tests are complex, costly, prone to human error, and time consuming [2]. Thereby, medical imaging, such as chest X-ray imaging, has been adopted to assist in the detection of COVID-19 in addition to the clinical symptoms. Chest X-ray (CXR) images allow the chest pathology to be perceived via an acquired two-dimensional projection of the patient's chest, which has a pivotal role in the diagnosis of lung diseases and the detection of COVID-19 infection. Compared to the computed tomography (CT) scan, the wide availability and lower complexity of the X-ray scan promote the development of highly applicable computer-aided diagnosis (CAD) frameworks using the acquired CXR images in order to identify and confirm COVID-19 cases. Accordingly, several studies have adopted machine learning for diagnosing COVID-19 in CXR images. These techniques can be categorized as either deep learning or traditional machine learning (ML) techniques.
Several studies have developed deep-learning networks for the automated detection of COVID-19. For instance, an optimized convolutional neural network [3] was designed to classify COVID-19, normal, and pneumonia CXR images, while optimizing the hyperparameters of the convolutional neural network (CNN) using grey wolf optimization. The results showed 97.78% accuracy, 97.75% sensitivity, and 96.25% specificity. Also, five pre-trained CNN-based models [4], namely ResNet101, ResNet50, ResNet152, Inception-ResNetV2, and InceptionV3, were proposed for the classification of CXR radiographs into four classes, COVID-19, bacterial pneumonia (BN), viral pneumonia (VN), and normal, leading to classification accuracies ranging between 96.1% and 99.7% across three datasets. The deep CNN CoroNet model [5] was also proposed, targeting the same four classes of COVID-19 and pneumonia CXR images using a pre-trained Xception network, and achieved an overall accuracy of 89.6%, with 93% precision and 98.2% recall for COVID-19 detection among the four classes. Moreover, seven different deep convolutional and NN models were included in COVIDX-Net [6], which targeted the classification of CXR images into positive or negative COVID-19 cases. The results showed F1-scores of 0.91 and 0.89 for normal and COVID-19 cases, using the DenseNet and VGG19 models, respectively. Transfer learning was applied in the proposed decompose, transfer, and compose (DeTraC) CNN-based model [7], achieving 95.12% accuracy, 97.91% sensitivity, and 91.87% specificity in detecting COVID-19 CXR images among normal and severe acute respiratory syndrome cases. Also, adopting transfer learning with several CNN-based models [8] achieved the highest two-class accuracy of 96.87%, with 98.66% sensitivity and 96.46% specificity, using MobileNet v2 for classifying COVID-19 against non-COVID-19 cases including normal, VN, and BN. As CNNs are prone to losing spatial information between image instances, besides requiring large datasets, COVID-CAPS [9], a capsule network-based framework, was developed to handle relatively small datasets and achieved 98.3% accuracy, 80% sensitivity, and 98.6% specificity in the four-class classification task (i.e., normal, COVID-19, VN, and BN).

Thus, the main limitation of most of the proposed DL-based models is the need for large datasets that include several alterations of the input images, such as shifting and rotation. However, the availability of large CXR datasets of COVID-19 is still limited. Accordingly, researchers have also adopted traditional machine learning (ML) techniques. For example, a linear support vector machine-based model [10] was proposed for classifying CXR images as healthy or COVID-19. The CXR images were segmented using a multi-thresholding segmentation process into background and several objects of different intensities. Using a dataset of 40 contrast-enhanced CXR images, the suggested system achieved 97.84% accuracy, 99.7% specificity, and 95.76% sensitivity. Also, an ensemble-based support vector machine (SVM) model [11] was implemented for the automated identification of COVID-19, in which the segmentation threshold for the contrast-enhanced CXR images was estimated using Li's method and particle swarm optimization. Subsequently, the texture information was enhanced using Laws' filter masks, which highlight micro-structure characteristics, prior to extracting the texture-based feature vector using the gray-level co-occurrence matrix (GLCM).
Finally, an ensemble of SVMs using weighted voting was applied in the classification stage, yielding 98.04% accuracy in distinguishing COVID-19 from SARS, MERS, and ARDS pneumonia. From the preceding review, despite the few studies adopting traditional ML techniques, promising results have been reported. Also, to the best of our knowledge, the bag-of-features (BoF) ML models have not yet been adopted in the domain of CXR image-based COVID-19 diagnosis, despite their efficiency: the BoF is able to cope with changes in the object's position and orientation. Moreover, from [11], it was deduced that using ensembles in the classification process has led to high classification accuracy in distinguishing COVID-19 from other causes of pneumonia, which suggests an expected high performance in classifying CXR images as either COVID-19 or normal, as in our study. Accordingly, in this paper, we have proposed an automated ensemble-based BoF model with the speeded up robust features (SURF) descriptor for the detection of COVID-19 in CXR images using a balanced two-class dataset of normal and COVID-19 cases.

The organization of the paper is as follows. Section 2 reports significant related studies in automated COVID-19 detection based on CXR images. Then, Section 3 introduces the methodology of the proposed ensemble-based BoF framework. In Section 4, the experimental results are reported and interpreted. In Section 5, the proposed system performance is compared to state-of-the-art studies. Finally, the conclusions are presented in Section 6.

Bag of features (BoF), also known as the bag of visual words (BoVW) model, is a standalone ML model that is highly efficient in image classification due to its high resistance to variation in the orientation or position of the object-of-interest. The main advantage of the BoF is that no segmentation process is needed before the classification stage, as it aims to construct a set of visual codewords, also referred to as a codebook or a dictionary, which represents all the possible visual codewords that can be present in the dataset images. The obtained bag of visual words represents a vector of occurrence counts of a vocabulary of local features in an image, without the need for segmentation. Hence, given the codebook, an input image can be quantified and represented by a histogram that indicates the occurrence counts of the visual codewords present in the given image. The histograms obtained from the dataset are then used to classify the given images using trained classification models.

The feature extraction process in the BoF model is the initial process, at which a feature vector is obtained for each determined keypoint without segmentation, resulting in a large number of local features for each image. Therefore, the feature extraction process is a two-fold process in which: i) the interest points (i.e., keypoints), which represent the feature point locations in each input image, are detected; then, ii) feature descriptors are applied to extract the feature vector for each keypoint. In our proposed model, the grid method and the speeded up robust features (SURF) descriptor algorithm [12] were applied for determining the keypoints and extracting their feature vectors, respectively. The interest points were located using the grid method, in which a uniform grid with a predefined spacing (i.e., grid step) was applied to the image, such that the intersections of the grid lines determined the locations of the keypoints. In that process, the grid step was set to 8×8.
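To make this two-fold feature extraction concrete, the following MATLAB sketch places keypoints on a uniform 8×8 grid and extracts a 64-dimensional SURF descriptor at each location using the Computer Vision Toolbox. It is a minimal illustration under stated assumptions rather than the authors' implementation: the image file name, the use of SURFPoints with a single fixed scale, and the upright-descriptor setting are hypothetical choices, and the multi-scale block sizes described in the following paragraphs are not reproduced.

% Minimal sketch of grid-based keypoint placement and SURF description
% (illustrative only; the file name and descriptor scale are hypothetical).
I = im2gray(imread('cxr_example.png'));       % load a CXR image as grayscale

gridStep = [8 8];                              % grid spacing used in the proposed model
[x, y]   = meshgrid(1:gridStep(1):size(I,2), 1:gridStep(2):size(I,1));
points   = SURFPoints([x(:) y(:)], 'Scale', 1.6);   % keypoints at the grid intersections

% Extract a 64-dimensional SURF descriptor at every valid grid location.
[descriptors, validPoints] = extractFeatures(I, points, 'Upright', true);

fprintf('Extracted %d descriptors of length %d\n', size(descriptors,1), size(descriptors,2));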
For each keypoint, the SURF descriptor was computed from the Haar wavelet responses dx and dy within the sub-regions around the keypoint, forming for each sub-region the vector v = [Σdx, Σdy, Σ|dx|, Σ|dy|], where both the horizontal and vertical responses were summed over the sub-region in the first two entries, and the sums of the absolute responses were placed in the last two entries. The keypoint feature extraction process hence generated a vast number of features, with a 64-dimensional feature vector for each region at each scale. These features are initially reduced prior to constructing the codebook. For feature-space reduction, the variance of the extracted descriptors is computed, and then the 80% strongest descriptors (i.e., those having the highest score) are selected. Next, these features were quantized using the K-means clustering algorithm to construct the visual vocabulary, which comprises K visual words. Using K-means clustering, the obtained descriptors are grouped into K clusters, such that the cluster centers represent the K visual words. For obtaining the K visual words, first, K initial cluster centers are randomly selected from the N input descriptors. Then, the Euclidean distance between each of the N descriptors x_i and each of the K initial cluster centers is calculated, as expressed by the following equation:

d(x_i, c_j) = ||x_i - c_j|| = sqrt( Σ_m (x_i,m - c_j,m)^2 )

where c_j represents the j-th cluster center. Thus, each descriptor is assigned to its nearest cluster center. Subsequently, the new cluster centers are calculated, in addition to the Euclidean distances between the descriptors and the new cluster centers. This process is repeated iteratively, reducing the sum of the squared Euclidean distances, until the cluster centers become stable. The final cluster centers represent the K visual words (i.e., codewords) of the BoF codebook.

After constructing the codebook, input images are represented by a histogram that indicates the frequency of occurrence of the K visual words within the image. This vector quantization process is performed using the nearest neighbor algorithm based on the Euclidean distance measure, which assigns each extracted descriptor to its nearest codeword. Hence, the histograms of the input images represent the distribution of visual content using the constructed codebook. These histograms are then exploited by an ML algorithm to classify the input CXR images as either a normal or a COVID-19 case.

Ensemble-based models integrate a set of classifiers to produce a superior overall decision compared to a single classifier. In the subspace discriminant ensemble, the learners apply linear discriminant analysis (LDA) to determine a low-dimensional discriminant subspace [13]. In this study, ensemble-based classification models using bagged trees ensembles and subspace discriminant ensembles were investigated in the classification layer of the proposed BoF model. The proposed ensemble-based bag-of-features COVID-19 diagnosis model initially undergoes a training phase using the labeled training CXR images to construct the codebook of the BoF model in addition to training the classification model for setting its parameters. Figure 1(a) demonstrates the sequential processes carried out during the training phase, which encompassed keypoint detection using the grid method followed by SURF descriptor extraction. The descriptor extraction process was based on the calculated integral image, which was divided into spatial regions and sub-regions using multi-scale block sizes. Accordingly, the Haar wavelet responses were computed within these sub-regions to form the SURF descriptors, which were then reduced, quantized into the codebook using K-means clustering, and used to train the classification model.
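As a rough illustration of this codebook construction and vector quantization, the following MATLAB sketch pools the SURF descriptors of the training images, keeps the 80% strongest ones, clusters them into K visual words with k-means, and encodes each image as a normalized occurrence histogram. The variable names, the variance-based strength score, and the k-means settings are assumptions for demonstration, not the authors' exact implementation.

% Sketch of codebook construction and histogram encoding (hypothetical variables).
% allDescriptors: cell array with one M_i-by-64 SURF descriptor matrix per training image.
K = 200;                                           % number of visual words (codebook size)

pooled   = double(cell2mat(allDescriptors(:)));    % pool descriptors from all training images
strength = var(pooled, 0, 2);                      % variance of each descriptor as its strength score
[~, ord] = sort(strength, 'descend');
keep     = ord(1:round(0.8*numel(ord)));           % keep the 80% strongest descriptors

% K-means clustering: the K cluster centers become the visual words of the codebook.
[~, codebook] = kmeans(pooled(keep, :), K, 'MaxIter', 300, 'Replicates', 3);

% Encode an image: assign each descriptor to its nearest codeword (Euclidean distance)
% and count the occurrences of each visual word, normalized by the descriptor count.
encodeImage = @(D) histcounts(knnsearch(codebook, double(D)), 0.5:1:K+0.5) / size(D, 1);

% Training feature matrix: one K-bin histogram per training image.
X = cell2mat(cellfun(encodeImage, allDescriptors(:), 'UniformOutput', false));

The same encoding mapping would be applied to the testing images, so that training and testing histograms are expressed over the same codebook.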
In the testing phase, as illustrated in Fig. 1(b), the input CXR images were applied to the keypoint detection algorithm for extracting the SURF descriptors from the detected keypoints. The selected strongest features were quantized using the constructed codebook to obtain the frequency-of-occurrence histograms for the testing images. Finally, the pre-trained classification model produced the classification decisions for the input testing CXR images.

In this study, an open, publicly available dataset of CXR images was used for training and testing the proposed system [14]. A desktop with a 4 GB GPU and an Intel Core i7 processor was used to run the MATLAB software for evaluating the proposed system. At the access date, the dataset consisted of 400 CXR images, including 200 COVID-19 cases and 200 normal cases, which were collected from public sources in addition to hospitals and physicians. Figure 2 displays a sample of the dataset images. The CXR images were separated into training and testing sets using the five-fold cross-validation technique, splitting the dataset into five equal folds and reporting the classifier's overall performance as the average of the five runs. The training images followed the process demonstrated in Fig. 1(a). However, the sensitivity remained roughly steady at 98%. As the fine KNN considers fine, detailed distinctions between classes, better results were obtained using the fine KNN compared to the medium KNN in Fig. 9, in which coarser distinctions between classes are observed. Table 2 indicates the superiority of the proposed model against the deep learning techniques in [6, 8] and the SVM ensemble model in [11].

The occurrence of the COVID-19 pandemic has imposed major pressure on healthcare facilities, which hinders providing efficient healthcare services without the risk of infection. Computer-aided diagnostic systems present an automated, risk-free solution to diagnose COVID-19 using CXR images. Although several automated detection systems have been proposed in the literature, most of these systems relied on deep-learning techniques, which require large datasets for accurate performance. However, this condition was hardly achieved in several studies due to availability limitations. Accordingly, in our study, we have investigated the BoF classification model, which is one of the most promising traditional ML models. In our proposed model, the effect of the number of visual words on the classification performance was studied using K = 150 and K = 200. Accordingly, it was concluded that the increase in the number of visual words boosts the classification accuracy due to the presence of more distinctive features, while reducing the computational time. In the proposed BoF model, ensembles were employed in the classification process for the efficient classification of CXR images into normal or COVID-19 cases. Two ensembles were investigated, namely the ensemble subspace discriminant and the ensemble bagged trees, which were compared to other classifiers, including the fine and medium KNN classifiers.
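For completeness, a minimal MATLAB sketch of the two investigated ensembles, trained on the encoded histograms and evaluated with five-fold cross-validation, could look as follows. The feature matrix X and label vector Y are assumed to come from an encoding step such as the one sketched earlier; the number of learning cycles and other hyperparameters are illustrative and are not the settings behind the reported results.

% Sketch of the two ensemble classifiers with five-fold cross-validation.
% X: N-by-K histogram matrix, Y: labels ('normal' or 'COVID-19'); settings are hypothetical.
rng(1);                                        % reproducible cross-validation folds

% Ensemble of bagged decision trees.
baggedTrees = fitcensemble(X, Y, 'Method', 'Bag', ...
    'NumLearningCycles', 30, 'Learners', templateTree());

% Ensemble subspace discriminant (discriminant learners on random feature subspaces).
subspaceDisc = fitcensemble(X, Y, 'Method', 'Subspace', ...
    'NumLearningCycles', 30, 'Learners', 'discriminant');

% Five-fold cross-validated accuracy of both ensembles.
accBag  = 1 - kfoldLoss(crossval(baggedTrees,  'KFold', 5));
accDisc = 1 - kfoldLoss(crossval(subspaceDisc, 'KFold', 5));
fprintf('Bagged trees: %.2f%%, subspace discriminant: %.2f%%\n', 100*accBag, 100*accDisc);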
References

[1] COVID-19, SARS and MERS: are they closely related?
[2] Laboratory diagnosis of coronavirus disease-2019 (COVID-19).
[3] OptCoNet: an optimized convolutional neural network for an automatic diagnosis of COVID-19.
[4] Automatic detection of coronavirus disease (COVID-19) using X-ray images and deep convolutional neural networks.
[5] CoroNet: a deep neural network for detection and diagnosis of COVID-19 from chest X-ray images.
[6] COVIDX-Net: a framework of deep learning classifiers to diagnose COVID-19 in X-ray images.
[7] Classification of COVID-19 in chest X-ray images using DeTraC deep convolutional neural network.
[8] COVID-19: automatic detection from X-ray images utilizing transfer learning with convolutional neural networks.
[9] COVID-CAPS: a capsule network-based framework for identification of COVID-19 cases from X-ray images.
[10] Automatic X-ray COVID-19 lung image classification system based on multi-level thresholding and support vector machine.
[11] Automatic Computer Aided Diagnostic for COVID-19 Based on Chest X-Ray Image and Particle Swarm
[12] Speeded-up robust features (SURF).
[13] Two-dimensional linear discriminant analysis.
[14] COVID-19 image data collection: prospective predictions are the future.