key: cord-0866539-ld2vzl9d authors: Rajpal, Sheetal; Lakhyani, Navin; Singh, Ayush Kumar; Kohli, Rishav; Kumar, Naveen title: Using Handpicked Features in Conjunction with ResNet-50 for Improved Detection of COVID-19 from Chest X-Ray Images date: 2021-02-10 journal: Chaos Solitons Fractals DOI: 10.1016/j.chaos.2021.110749 sha: dabd91a905355475fc3d25bb7c374a2b2a8a74fb doc_id: 866539 cord_uid: ld2vzl9d Coronaviruses are a family of viruses that mainly cause respiratory disorders in humans. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a new strain of coronavirus that causes the coronavirus disease 2019 (COVID-19). The WHO has declared COVID-19 a pandemic, as it has spread across the globe due to its highly contagious nature. For early diagnosis of COVID-19, the reverse transcription-polymerase chain reaction (RT-PCR) test is commonly used. However, it suffers from a high false-negative rate of up to 67% if the test is done during the first five days of exposure. As an alternative, research on the efficacy of deep learning techniques for identifying COVID-19 from chest X-ray images is being intensely pursued. As pneumonia and COVID-19 exhibit similar or overlapping symptoms and both affect the human lungs, distinguishing between the chest X-ray images of pneumonia patients and COVID-19 patients is challenging. In this work, we have modeled COVID-19 detection as a multiclass classification problem involving three classes, namely COVID-19, pneumonia, and normal. We have proposed a novel classification framework which combines a set of handpicked features with those obtained from a deep convolutional neural network. The proposed framework comprises three modules. In the first module, we exploit the strength of transfer learning using ResNet-50 for training the network on a set of preprocessed images and obtain a vector of 2048 features.
In the second module, we construct a pool of 252 handpicked frequency- and texture-based features that are further reduced to a set of 64 features using PCA. Subsequently, these are passed to a feed-forward neural network to obtain a set of 16 features. The third module concatenates the features obtained from the first and second modules, and passes them to a dense layer followed by the softmax layer to yield the desired classification model. We have used chest X-ray images of COVID-19 patients from four independent publicly available repositories, in addition to images from the Mendeley and Kaggle Chest X-Ray Datasets for pneumonia and normal cases. To establish the efficacy of the proposed model, 10-fold cross-validation is carried out. The model generated an overall classification accuracy of 0.974 ± 0.02 and a sensitivity of 0.987 ± 0.05, 0.963 ± 0.05, and 0.973 ± 0.04 at the 95% confidence interval for COVID-19, normal, and pneumonia classes, respectively. To ensure the effectiveness of the proposed model, it was validated using an independent chest X-ray cohort and an overall classification accuracy of 0.979 was achieved. Comparison of the proposed framework with state-of-the-art methods reveals that the proposed framework outperforms others in terms of accuracy and sensitivity. Since interpretability of results is crucial in the medical domain, the gradient-based localizations are captured using Gradient-weighted Class Activation Mapping (Grad-CAM). In summary, the results obtained are stable over independent cohorts and interpretable using Grad-CAM localizations that serve as clinical evidence.
Keywords: COVID-19, Machine Learning, Classification, Chest X-Rays, Deep Learning, Grad-CAM.

Soon after the onset of coronavirus disease 2019 (COVID-19) in late 2019 in Wuhan, China, it started to spread globally at an alarming rate [1]. Consequently, on March 11, 2020, the WHO declared COVID-19 a pandemic. The coronavirus disease causes severe respiratory problems in humans [2] and it has already accounted for more than 1.8 million deaths. The reverse transcription-polymerase chain reaction (RT-PCR) test is popularly used to detect COVID-19 in humans. However, the test suffers from a high false-negative rate, as high as 67%, if it is done during the early days of the onset of the disease.
Moreover, a negative test report does not rule out COVID-19 even in patients showing strong symptoms [3]. Thus, the follow-up of COVID-19 patients becomes difficult. Researchers have therefore been seeking alternative ways to identify COVID-19. Chest X-ray (CXR) images are popularly used to diagnose, evaluate, and monitor common respiratory and lung infections. Thus, recently, chest X-ray images have been used for early and improved detection of COVID-19 using machine learning techniques [4, 5]. Apart from medical diagnostics, machine learning techniques have found applications in diverse domains such as speech recognition, weather forecasting, and forecasting market trends [6, 7, 8, 9]. For example, Altan et al. [9] combined a long short-term memory (LSTM) neural network with empirical wavelet transform (EWT) decomposition using the cuckoo search (CS) algorithm for forecasting the price of digital currency. Recently, Karasu et al. [8] applied support vector regression (SVR) for forecasting crude oil prices in a multi-objective setting using particle swarm optimization (PSO). In this paper, however, we focus on applying machine learning techniques to the detection of COVID-19. In order to automate the task of detecting various lung abnormalities, several researchers are applying deep learning techniques to identify the affected regions in a CXR image [10, 11, 12, 13]. Recently, the use of some well-known deep neural networks, namely AlexNet [14], VGGNet [15], GoogLeNet [16], ResNet [17], and their variations, has been explored for identification of COVID-19 using CXR images [18, 19, 20, 21, 22, 23]. Several researchers have approached COVID-19 detection as a binary classification problem to distinguish between COVID-19 and Non-COVID/Normal classes.
Based on experiments with VGG16, VGG19, ResNet18, ResNet50, and ResNet101, Ismael and Şengür [24] used a pretrained ResNet50 model for extracting features from CXR images in a data set containing images of 180 COVID-19 and 200 normal patients. Their classification model achieved its best accuracy of 92.63% using a support vector machine (SVM) with a linear kernel. Similarly, Afshar et al. [25] proposed a capsule-network (CapsNet) based framework which exploits spatial information to overcome the inability of CNNs to recognize the same object when it undergoes various transformations. They experimented with a dataset having 183 COVID-19 chest X-ray images and achieved an initial accuracy of 95.7% and sensitivity of 90%, which improved to 98.3% and 98.6%, respectively, after pre-training the model with another dataset. However, the above works do not attempt to distinguish pneumonia patients from the COVID-19 and normal classes. As pneumonia, a respiratory illness, shares several closely related symptoms with COVID-19 [26], it is important to distinguish COVID-19 cases from the several forms of pneumonia. In the multi-class classification domain, most researchers have used Convolutional Neural Networks (CNNs) to identify COVID-19 from CXR images. Bukhari et al. [27] used a ResNet-50 architecture [17] for assessment of COVID-19 using 278 CXR images that belonged to three classes, namely COVID-19 (89), normal (93), and pneumonia (96), and obtained an overall accuracy of 98.24% and an F1-score of 0.98. Apostolopoulos et al. [22] experimented with different CNNs on two datasets, each comprising 224 CXR images of COVID-19. They obtained the highest classification accuracies of 93.48% and 92.85% on one dataset using VGG19 and MobileNetv2, respectively, and 96.78% using MobileNet on the other dataset. Similarly, Khan et al.
[18] proposed a deep CNN architecture based on the Xception architecture [28] and achieved an overall accuracy of 90.21% on a dataset that comprises 284 COVID-19 CXR images. Mahmud et al. [20] proposed a CNN, CovXNet, in which the dilation rate is varied to identify significant features in CXR images, and obtained an overall accuracy of 90.2%. Rajaraman et al. [29] proposed an iteratively pruned deep learning ensemble that pruned up to 50% of the neurons in each convolutional layer to obtain an accuracy of 99.01%. They experimented on a dataset that comprises 313 COVID-19 instances. Basu et al. [30] based their model on domain extension transfer learning (DETL). Using a network pre-trained on the National Institutes of Health (NIH) CXR image dataset [10], they obtained an overall accuracy of 95.3% ± 0.02 by fine-tuning a 12-layer CNN. Wang et al. [23] proposed an evolving deep convolutional architecture, called COVID-Net, inspired by the design of FermiNets [31], a generative-synthesis approach that leads to the evolution of efficient neural network designs that satisfy the specified optimization criteria. Using a CXR dataset containing 358 COVID-19 images, they achieved an accuracy of 93.39%. Aslan et al. [12] proposed two deep learning architectures that take the segmented lung area as input. The better results were obtained with a hybrid architecture that comprises a pretrained AlexNet followed by an additional BiLSTM (Bidirectional Long Short-Term Memory) layer used to identify sequential and temporal properties. They experimented on a dataset that comprises only 219 samples of COVID-19 and achieved an accuracy of 98.70%. Loey et al. [13] used generative adversarial nets [32] for dataset augmentation, experimented with AlexNet, GoogLeNet, and ResNet, and reported a maximum accuracy of 85.19% using AlexNet. Soares et al. [19] used the VGG16 network [15] to obtain 97.30% accuracy on a set of 375 images.
Significant efforts have been made to explore the applicability of handcrafted features for the detection of COVID-19 [4, 33, 34, 5]. Khuzani et al. [4] used multilayer neural networks (MNN) to classify COVID-19 positives from CXR images. Using FFT, textural analysis, DWT, GLCM, and GLDM methods, they obtained a pool of 252 features. From this feature pool, they extracted a set of 64 features using Principal Component Analysis (PCA), which was input to a multi-layer neural network to attain an accuracy of 94.04%. In contrast, Altan et al. [33] used two-dimensional (2D) curvelet transformations to improve their model accuracy for classifying CXR images into three classes, namely COVID-19, normal, and viral pneumonia. The 2D curvelet transformation captures a high degree of directionality and a parabolic scaling relationship in the Fourier coefficients obtained from CXR images. Finally, the inverse 2D Fourier transform was applied to obtain the curvelet coefficients. Subsequently, the chaotic salp swarm algorithm (CSSA) was applied to minimize the number of features and maximize the classification accuracy. The model achieved an accuracy of 99.69% using the EfficientNet-B0 model. Sergio et al. [34] proposed a classifier that used texture features of CXR images with feed-forward neural networks and convolutional neural networks, and achieved the best accuracy of 96.83% using the former model on a set of 255 COVID-19 CXR images. The performance of a deep learning based model depends on the amount of data used for training it. However, most of the works cited above used a small number of COVID-19 images [30, 13, 27, 12, 18, 34]. Another important aspect, validation on an independent cohort to establish the efficacy of a model, is missing from several works [20, 13, 19, 27, 18, 34].
Again, interpretability of results based on localizations that mark the affected lung regions, a crucial issue in the medical domain, has not been properly addressed [33, 18, 12, 13]. To address the above shortcomings, the proposed framework uses a balanced data set that comprises 520 CXR images of COVID-19 patients and an equal number each of normal and pneumonia CXR images. This results in an improved classification model. To establish the stability and performance of the proposed framework, 10-fold cross-validation is done. Further, an independent cohort of 471 CXR images, with an equal number of images for each of the three classes, is used to validate the model. Finally, Grad-CAM based localizations are captured to serve as interpretable clinical evidence. To summarize, in this paper we have targeted the problem of improved detection of COVID-19 using CXR images, and towards this end we have: 1. proposed a three-module multi-class classification framework to distinguish between COVID-19, normal, and pneumonia CXR images; 2. combined a set of handpicked features obtained from conventional image processing methods with those obtained using ResNet-50 based transfer learning; 3. employed 10-fold cross-validation on our data set of 1560 images, with 520 images belonging to each of the three classes; 4. validated the proposed framework on an independent chest X-ray cohort of 471 images, with an equal number of images belonging to each of the classes; 5. applied Grad-CAM to obtain localizations that serve as interpretable clinical evidence.

The rest of the paper is organized as follows: in section 2, we describe the datasets, preprocessing steps, and the architectural details; in section 3, we discuss the outcome of the experiments; and in section 4, we present the conclusions and mention the scope for future work.

In this section, we describe our datasets, followed by the data preprocessing steps and a detailed description of the network architecture.
To begin with, we enumerate the class-wise data distribution of some of the popular publicly available COVID-19 datasets: 3. Figure1-COVID-chestxray-dataset [36]. It comprises 53 COVID-19 samples. 4. Actualmed-COVID-chestxray-dataset [37]. It comprises 150 COVID-19 samples. In order to maintain uniformity among the chest X-ray images obtained from different sources, we only consider images having frontal views, namely the posteroanterior (PA) and erect anteroposterior (AP) views. The first two databases mentioned above contain 520 COVID-19 images corresponding to PA and AP views. In order to construct a balanced data set for training the network, an equal number of pneumonia and normal chest X-ray images are included. We select an equal number of CXR images related to viral and bacterial pneumonia from the Kaggle database [35] and the Mendeley database [38]. Thus, our dataset comprises 260 images each of bacterial pneumonia and viral pneumonia, randomly selected from the Kaggle and Mendeley databases. Similarly, we randomly select 520 images for the normal class from the Kaggle and Mendeley databases. The final dataset of 1560 chest X-ray images is then shuffled and used for 10-fold cross-validation. Further, an independent cohort is created from data that was never used during 10-fold cross-validation. This cohort comprises 157 unique COVID-19 samples collected from the datasets at serial numbers 3 and 4 mentioned above, together with an equal number each of normal and pneumonia images taken from the Mendeley database. Since the CXR images are collected from various sources, they are resized and transformed from RGB to grayscale to ensure uniformity across different datasets. The resultant images are then subjected to min-max normalization [39] to speed up the convergence process.
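The preprocessing steps described above (resizing, RGB-to-grayscale conversion, and min-max normalization) can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: the function names, the BT.601 luma weights used for the grayscale conversion, the nearest-neighbour resize, and the 512 × 512 target size (borrowed from the second module) are all assumptions made for the example.

```python
import numpy as np

def rgb_to_gray(img):
    """Convert an H x W x 3 RGB array to grayscale (BT.601 weights, an assumption)."""
    weights = np.array([0.299, 0.587, 0.114])
    return img[..., :3].astype(np.float64) @ weights

def resize_nearest(img, size):
    """Nearest-neighbour resize of a 2-D array to (size, size).
    A stand-in for a library resize; the paper does not name the method."""
    rows = np.arange(size) * img.shape[0] // size
    cols = np.arange(size) * img.shape[1] // size
    return img[np.ix_(rows, cols)]

def min_max_normalize(img):
    """Scale pixel values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = img.min(), img.max()
    if hi == lo:  # constant image: avoid division by zero
        return np.zeros_like(img, dtype=np.float64)
    return (img - lo) / (hi - lo)

def preprocess(img, size=512):
    """Grayscale -> resize -> min-max normalize (hypothetical pipeline order)."""
    gray = rgb_to_gray(img) if img.ndim == 3 else img.astype(np.float64)
    return min_max_normalize(resize_nearest(gray, size))

# Synthetic stand-in for a chest X-ray loaded as an RGB array.
rng = np.random.default_rng(0)
demo = rng.integers(0, 256, size=(600, 480, 3), dtype=np.uint8)
out = preprocess(demo)
print(out.shape)
```

After this step every image has the same shape and a value range of [0, 1], regardless of which repository it came from.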
The proposed framework uses a set of handpicked image features in conjunction with the features obtained from a deep convolutional neural network (ResNet-50). In the second module, we incorporate a feature extraction unit which yields 252 features obtained from CXR images rescaled to 512 × 512. This set of 252 features is reduced to 64 features using PCA and is further passed to a feed-forward neural network having a dense layer and a dropout layer, followed by another dense layer and dropout layer, to finally obtain a vector of 16 features. Based on experiments, we have employed the ReLU activation function in the dense layers and a dropout rate of 0.50 to prevent the network from overfitting. The feature extraction unit extracts texture and frequency domain features (widely used in image processing works [40, 41, 42]). The texture feature set is generated using the CXR image in the spatial domain, the gray-level co-occurrence matrix (GLCM) [43, 44], and the gray-level difference matrix (GLDM) [45, 4]. For each category of texture features, 14 statistical values are computed: area, mean, standard deviation, skewness, kurtosis, energy, entropy, max, min, mean absolute deviation, median, range, root mean square, and uniformity. In this manner, we extract 14 spatial domain features, 56 GLCM features, and 56 GLDM features, generating a combined texture feature set of size 126. Drawing inspiration from Zargari et al. [46], who used statistical features for predicting chemotherapy response in ovarian cancer patients, we also obtain frequency features by computing the aforementioned statistical values on the transformed region that results from applying the Fast Fourier Transform (FFT) [42] and the two-level Discrete Wavelet Transform (DWT) [47]. We extract 14 FFT features and 112 DWT features. These features are concatenated with the texture feature pool constructed above. Thus, the texture and frequency features so obtained comprise 252 features.
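To make the 14 statistical values and the feature-count arithmetic concrete, the following sketch computes the descriptors on an arbitrary 2-D region and checks that the per-source counts add up to 252. The exact definitions are assumptions, since the text names the statistics without defining them: here "area" is taken as the number of non-zero pixels, kurtosis is the non-excess (Pearson) form, and entropy and uniformity are computed from a 256-bin histogram. The function and dictionary names are hypothetical.

```python
import numpy as np

def statistical_descriptors(region, bins=256):
    """Return the 14 named statistics for a 2-D region (definitions are assumptions)."""
    x = np.asarray(region, dtype=np.float64).ravel()
    mean, std = x.mean(), x.std()
    centered = x - mean
    # Standardized moments are undefined for constant regions; report 0 there.
    skewness = (centered ** 3).mean() / std ** 3 if std > 0 else 0.0
    kurtosis = (centered ** 4).mean() / std ** 4 if std > 0 else 0.0
    hist, _ = np.histogram(x, bins=bins)
    p = hist / hist.sum()          # empirical intensity distribution
    nz = p[p > 0]                  # drop empty bins before taking logs
    return {
        "area": float(np.count_nonzero(x)),
        "mean": float(mean),
        "standard_deviation": float(std),
        "skewness": float(skewness),
        "kurtosis": float(kurtosis),
        "energy": float((x ** 2).sum()),
        "entropy": float(-(nz * np.log2(nz)).sum()),
        "max": float(x.max()),
        "min": float(x.min()),
        "mean_absolute_deviation": float(np.abs(centered).mean()),
        "median": float(np.median(x)),
        "range": float(x.max() - x.min()),
        "root_mean_square": float(np.sqrt((x ** 2).mean())),
        "uniformity": float((p ** 2).sum()),
    }

# Feature counts per source, as stated in the text.
FEATURE_COUNTS = {"spatial": 14, "GLCM": 56, "GLDM": 56, "FFT": 14, "DWT": 112}

demo_region = np.array([[0, 1], [2, 3]])
desc = statistical_descriptors(demo_region)
print(len(desc), sum(FEATURE_COUNTS.values()))
```

The same function would be applied to each transformed representation (for example an FFT magnitude image or a DWT sub-band) to build up the full pool. How the GLCM, GLDM, and DWT outputs are split into the 56, 56, and 112 values is not specified in the text and is left out of this sketch.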
In the third module of the proposed network, the outputs from both modules are concatenated to form a vector of 2064 (2048 + 16) features, which is passed to a dense layer employing the ReLU activation function to obtain a reduced set of 1000 features. These features serve as input to a softmax layer that assigns the probabilities of the input image belonging to each of the three classes, i.e., COVID-19, pneumonia, and normal.

In this section, we present the classification results obtained using the proposed framework. We compare the performance of the proposed framework with other state-of-the-art methods. Further, we validate the performance of the proposed framework on an independent cohort. For clinical interpretation of the results, the gradient-based localizations are captured using Gradient-weighted Class Activation Mapping (Grad-CAM). We carried out all the experiments using Python 3.6.9 on an NVIDIA Tesla K80 GPU in the Google Colaboratory environment. For implementing the proposed framework, we have used the NumPy, SciPy, Scikit-learn, Keras, and Scikit-image libraries. Using the proposed framework described in the previous section, we classify the chest X-ray images into three classes (normal, pneumonia, and COVID-19) by training the network end-to-end. To assess the effectiveness of the proposed model in terms of variance over different test data, we perform 10-fold cross-validation. To avoid overfitting during model construction, out of the 90% of the data set used for training in a fold, we reserved 10% as the hold-out validation set. The results of the 10-fold cross-validation are summarized in a confusion matrix (see Figure 3(a)). The diagonal entries in the confusion matrix indicate the number of samples correctly classified for each class, while off-diagonal entries indicate the number of samples wrongly assigned to each class.
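As a shape-level illustration of the third module described above, the sketch below concatenates a 2048-dimensional deep feature vector with a 16-dimensional handpicked feature vector, applies a 1000-unit ReLU dense layer, and ends with a 3-way softmax. It is a plain NumPy forward pass with random weights, not the authors' Keras model: the parameter names, the weight initialization, and the class index order are assumptions.

```python
import numpy as np

CLASSES = ("COVID-19", "pneumonia", "normal")  # index order is an assumption

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fusion_head(deep_feats, hand_feats, params):
    """Concatenate (batch, 2048) and (batch, 16) features, then dense + softmax."""
    fused = np.concatenate([deep_feats, hand_feats], axis=-1)   # (batch, 2064)
    hidden = relu(fused @ params["W1"] + params["b1"])          # (batch, 1000)
    return softmax(hidden @ params["W2"] + params["b2"])        # (batch, 3)

rng = np.random.default_rng(0)
params = {
    "W1": rng.normal(0.0, 0.01, size=(2048 + 16, 1000)),  # untrained, illustrative
    "b1": np.zeros(1000),
    "W2": rng.normal(0.0, 0.01, size=(1000, len(CLASSES))),
    "b2": np.zeros(len(CLASSES)),
}

deep = rng.normal(size=(4, 2048))   # stand-in for ResNet-50 features (module 1)
hand = rng.normal(size=(4, 16))     # stand-in for handpicked features (module 2)
probs = fusion_head(deep, hand, params)
print(probs.shape)
```

In the actual framework the weights of this head would be learned jointly with the two feature branches, since the text states that the network is trained end-to-end. The sketch only shows how the dimensions fit together.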
We note that out of 520 COVID-19 patients, 513 are correctly identified, five are misclassified as pneumonia, and two are labeled as normal. Similarly, pneumonia and normal subjects are also labeled by the proposed classifier with high accuracy. Thus, we obtain an overall accuracy of 0.974 ± 0.02 and a high sensitivity of 0.987 ± 0.05, 0.963 ± 0.05, and 0.973 ± 0.04 at the 95% confidence interval for COVID-19, normal, and pneumonia classes, respectively. This is also evident from the heatmap in Figure 3(b), which summarizes the precision, recall, and F1-score metrics for 10-fold cross-validation for all three classes. Note that the proposed framework is able to label almost all COVID-19 patients correctly, thus achieving high average values of precision, recall, and F1-score (≥ 0.987) across the 10 folds (heatmap in Figure 3(b)). For the Normal subtype, the framework yields high values (greater than 0.963) of precision, recall, and F1-measure. Similarly, for the Pneumonia subtype, the framework yields precision, recall, and F1-measure scores greater than 0.962. To evaluate the relative effectiveness of the proposed framework, we compare its results with related work in the literature. Table 1 shows the accuracy and sensitivity of the proposed framework along with other state-of-the-art classifiers in detecting COVID-19 from chest X-ray images. It may be noted that the proposed system performs better than the other state-of-the-art approaches in terms of accuracy and sensitivity. Although Bukhari et al. [27] report a slightly higher accuracy (0.98), they experimented on a relatively small dataset and did not report the sensitivity of their classifier. Further, while the majority of the related works have modeled the problem as a binary classification problem (COVID-19 vs. Normal), we have included the closely related pneumonia class in our multi-class classification model.
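The COVID-19 sensitivity quoted above follows directly from the confusion-matrix counts given in the text (513 correct, five labeled pneumonia, two labeled normal). The short check below reproduces that figure. Only the COVID-19 row is used, because the remaining entries of the confusion matrix are not stated in the text. The function and variable names are hypothetical.

```python
def sensitivity(true_positives, false_negatives):
    """Sensitivity (recall) = TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)

# COVID-19 row of the cross-validation confusion matrix, as reported in the text.
covid_row = {"COVID-19": 513, "pneumonia": 5, "normal": 2}

missed = covid_row["pneumonia"] + covid_row["normal"]       # false negatives
covid_sensitivity = sensitivity(covid_row["COVID-19"], missed)

print(sum(covid_row.values()))        # 520 COVID-19 images in total
print(round(covid_sensitivity, 3))    # 0.987
```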
Also, the effect of variability in data sets has not been studied, as most of these works do not report cross-validation results. While validation on an independent cohort is essential to establish the effectiveness of a model, this aspect is missing in the majority of related works. Most importantly, interpretability of results in terms of localization is crucial in the medical domain. However, we could not find such interpretations in several works. Although the effectiveness of the proposed architecture in detecting COVID-19 from chest X-ray images is evident from the results presented in the previous section, to be useful in clinical practice it is necessary to relate the classification results to clinical evidence. For this purpose, we used the Gradient-weighted Class Activation Mapping (Grad-CAM) technique. Grad-CAM is a popular tool used to produce a localization map highlighting the most important regions which assist the proposed network in predicting a class. Using this tool, we obtained gradient-based localizations (Figure 4) for the images that we processed. The red-colored region marks the region of interest (RoI) responsible for activating the final convolutional layer of the proposed network. The blue-colored region acts as evidence [48] of the class identified by the proposed network. Radiological validation was done by a radiologist, who confirmed that the lung regions marked in red in the figure correspond to those regions of the lung which are predominantly affected in COVID-19. We also evaluate the effectiveness of the proposed framework using an independent cohort of CXR images that is not used during 10-fold cross-validation. For this purpose, we use 157 unique CXR images from two COVID-19 datasets ([36] and [37]). An equal number of CXRs each for pneumonia and normal are taken from the Mendeley dataset [38]. Classification results are depicted through the confusion matrix in Figure 5.
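For readers unfamiliar with how a Grad-CAM localization map of the kind discussed above is formed, the sketch below shows the core computation in NumPy. Each feature map of the final convolutional layer is weighted by the spatial average of the gradient of the class score with respect to that map, the weighted maps are summed, and a ReLU keeps only positive evidence. The activations and gradients here are toy arrays. In practice they would come from the trained network through the deep learning framework's automatic differentiation, and the map would be upsampled and overlaid on the X-ray. The function name and the normalization to [0, 1] are assumptions.

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Grad-CAM heat map from last-conv activations and class-score gradients.

    feature_maps: (H, W, K) activations of the final convolutional layer.
    gradients:    (H, W, K) gradient of the class score w.r.t. those activations.
    """
    # Channel weights: global average pooling of the gradients.
    alphas = gradients.mean(axis=(0, 1))               # shape (K,)
    # Weighted sum over channels, then ReLU to keep positive influence only.
    cam = np.maximum(feature_maps @ alphas, 0.0)       # shape (H, W)
    peak = cam.max()
    return cam / peak if peak > 0 else cam             # scale to [0, 1] for display

# Toy example with two 2 x 2 feature maps.
maps = np.zeros((2, 2, 2))
maps[0, 0, 0] = 1.0    # channel 0 fires at the top-left location
maps[1, 1, 1] = 1.0    # channel 1 fires at the bottom-right location
# Channel 0 supports the class (positive gradient); channel 1 opposes it.
grads = np.stack([np.ones((2, 2)), -np.ones((2, 2))], axis=-1)

heatmap = grad_cam(maps, grads)
print(heatmap)
```

In this toy case only the top-left location remains highlighted, because the channel active at the bottom-right has a negative weight and is removed by the ReLU.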
Given the low sensitivity of the RT-PCR test in detecting COVID-19, coupled with delayed availability of the results, there is an urgent need to devise alternative and more efficient methods for diagnosing COVID-19. Recently, chest X-ray images have been used for early and improved detection of COVID-19. As part of future work, we propose to evaluate the effect of segmentation on localizing the lung regions affected in COVID-19. Further, clinical data such as symptoms, age, and treatment history of patients may be integrated for improved detection of COVID-19.

WHO. Archived: WHO Timeline - COVID-19
Imaging profile of the COVID-19 infection: radiologic findings and literature review
Interpreting a covid-19 test result
COVID-Classifier: An automated machine learning model to assist in the diagnosis of COVID-19 infection in chest x-ray images. medRxiv
Cov-elm classifier: An extreme learning machine based identification of covid-19 using chest-ray images
Machine learning paradigms for speech recognition: An overview
Machine learning applied to weather forecasting
A new forecasting model with wrapper-based feature selection approach using multi-objective optimization technique for chaotic crude oil time series
Digital currency forecasting with chaotic meta-heuristic bio-inspired signal processing techniques
Chestx-ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases
Local binary patterns variants as texture descriptors for medical image analysis
CNN-based transfer learning-BiLSTM network: A novel approach for COVID-19 infection detection
Within the Lack of Chest COVID-19 X-ray Dataset: A Novel Detection Model Based on
Imagenet classification with deep convolutional neural networks
Very deep convolutional networks for large-scale image recognition
Proceedings of the IEEE conference on computer vision and pattern recognition
Deep residual learning for image recognition
Coronet: A deep neural network for detection and diagnosis of COVID-19 from chest x-ray images
Automatic Detection of COVID-19 Cases on X-ray images Using Convolutional Neural Networks
CovXNet: A multidilation convolutional neural network for automatic COVID-19 and other pneumonia detection from chest X-ray images with transferable multi-receptive feature optimization
COVID-19 Image Data Collection: Prospective Predictions Are the Future
Covid-19: automatic detection from x-ray images utilizing transfer learning with convolutional neural networks
COVID-Net: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest X-Ray Images
Deep learning approaches for COVID-19 detection based on chest X-ray images
Covid-caps: A capsule network-based framework for identification of covid-19 cases from x-ray images
A comparative study on the clinical features of covid-19 pneumonia to other pneumonias
The diagnostic evaluation of Convolutional Neural Network (CNN) for the assessment of chest X-ray of patients infected with COVID-19. medRxiv
Xception: Deep learning with depthwise separable convolutions
Iteratively Pruned Deep Learning Ensembles for COVID-19 Detection in Chest X-rays
Deep Learning for Screening COVID-19 using Chest X-Ray Images
Ferminets: Learning generative machines to generate efficient neural networks via generative synthesis
Generative adversarial nets
Recognition of COVID-19 disease from X-ray images by hybrid model consisting of 2D curvelet transform, chaotic salp swarm algorithm and deep learning technique
A new approach for classifying coronavirus COVID-19 based on its manifestation on chest X-rays using texture features and neural networks
GitHub - agchung/Figure1-COVID-chestxray-dataset: Figure 1 COVID-19 Chest X-ray Dataset Initiative
GitHub - agchung/Actualmed-COVID-chestxray-dataset: Actualmed COVID-19 Chest X-ray Dataset Initiative
Large dataset of labeled optical coherence tomography (oct) and chest x-ray images
Score normalization in multimodal biometric systems
Textural features for image classification
DWT-LBP Descriptors for Chest X-Ray View Classification
Detecting tuberculosis in chest radiographs using image processing techniques
Automatic classification of medical x-ray images
Texture Analysis Using the Gray-Level Co-Occurrence Matrix (GLCM) - MATLAB & Simulink
Statistical textural features for detection of microcalcifications in digitized mammograms
Bin Zheng, and Yuchen Qiu. Prediction of chemotherapy response in ovarian cancer patients using a new clustered quantitative image marker
Detection of pneumonia in chest X-ray images
Grad-cam: Visual explanations from deep networks via gradient-based localization

Data curation, Software
Rishav Kohli: Data curation, Software
Naveen Kumar: Supervision, Writing - review & editing

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.