key: cord-249065-6yt3uqyy authors: Kassani, Sara Hosseinzadeh; Kassasni, Peyman Hosseinzadeh; Wesolowski, Michal J.; Schneider, Kevin A.; Deters, Ralph title: Automatic Detection of Coronavirus Disease (COVID-19) in X-ray and CT Images: A Machine Learning-Based Approach date: 2020-04-22 journal: nan DOI: nan sha: doc_id: 249065 cord_uid: 6yt3uqyy

The newly identified Coronavirus pneumonia, subsequently termed COVID-19, is highly transmissible and pathogenic, with no clinically approved antiviral drug or vaccine available for treatment. The most common symptoms of COVID-19 are dry cough, sore throat, and fever. Symptoms can progress to a severe form of pneumonia with critical complications, including septic shock, pulmonary edema, acute respiratory distress syndrome and multi-organ failure. While medical imaging is not currently recommended in Canada for primary diagnosis of COVID-19, computer-aided diagnosis systems could assist in the early detection of COVID-19 abnormalities, help to monitor the progression of the disease, and potentially reduce mortality rates. In this study, we compare popular deep learning-based feature extraction frameworks for automatic COVID-19 classification. To obtain the most accurate features, which are an essential component of learning, MobileNet, DenseNet, Xception, ResNet, InceptionV3, InceptionResNetV2, VGGNet and NASNet were chosen from a pool of deep convolutional neural networks. The extracted features were then fed into several machine learning classifiers to classify subjects as either a case of COVID-19 or a control. This approach avoided task-specific data pre-processing methods to support a better generalization ability for unseen data. The performance of the proposed method was validated on a publicly available COVID-19 dataset of chest X-ray and CT images. The DenseNet121 feature extractor with a Bagging tree classifier achieved the best performance, with 99% classification accuracy.
The second-best learner was a ResNet50 feature extractor paired with a LightGBM classifier, with an accuracy of 98%.

A series of pneumonia cases of unknown etiology occurred in December 2019 in Wuhan, Hubei province, China. On December 31, 2019, 27 unexplained cases of pneumonia were identified and found to be associated with so-called "wet markets", which sell fresh meat and seafood from a variety of animals including bats and pangolins. The pneumonia was found to be caused by a virus identified as "severe acute respiratory syndrome coronavirus 2" (SARS-CoV-2), with the associated disease subsequently termed coronavirus disease 2019 (COVID-19).

Figure 1: The illustration of COVID-19, created at the Centers for Disease Control and Prevention (CDC) [10]. The protein particles E, S, and M are located on the outer surface of the virus particle. The spherical viral particles, colorized blue, contain cross-sections through the viral genome, seen as black dots [11].

processing techniques and deep learning algorithms could assist physicians as diagnostic aids for COVID-19 and help provide a better understanding of the progression of the disease.

Hemdan et al. [13] developed a deep learning framework, COVIDX-Net, to diagnose COVID-19 in X-ray images. The authors provide a comparative study of different deep learning architectures, including VGG19, DenseNet201, ResNetV2, InceptionV3, InceptionResNetV2, Xception and MobileNetV2. The public dataset of X-ray images was provided by Dr. Joseph Cohen [14] and Dr. Adrian Rosebrock [15]. The dataset included 50 X-ray images, divided into two classes: 25 normal cases and 25 positive COVID-19 cases. Their results showed that the VGG19 and DenseNet201 models achieved the best performance among the compared models, with 90.00% accuracy.

Barstugan et al. [16] proposed a machine learning approach for COVID-19 classification from CT images.
Patches of different sizes (16×16, 32×32, 48×48 and 64×64) were extracted from 150 CT images. Different hand-crafted feature extraction algorithms, such as Grey Level Co-occurrence Matrix (GLCM), Local Directional Pattern (LDP), Grey Level Run Length Matrix (GLRLM), Grey-Level Size Zone Matrix (GLSZM), and Discrete Wavelet Transform (DWT), were employed. The extracted features were fed into a Support Vector Machine (SVM) [17] classifier evaluated with 2-fold, 5-fold and 10-fold cross-validation. The best accuracy of 98.77% was obtained by the GLSZM feature extractor with 10-fold cross-validation.

Wang and Wong [18] designed a tailored deep learning-based framework, COVID-Net, for COVID-19 detection from chest X-ray images. The COVID-Net architecture was built from a combination of 1×1 convolutions, depth-wise convolutions and residual modules to enable a deeper architecture and avoid the vanishing gradient problem. The dataset combined the COVID chest X-ray dataset provided by Dr. Joseph Cohen [14] and the Kaggle chest X-ray images dataset [19] for a multi-class classification of normal, bacterial infection, viral infection (non-COVID) and COVID-19 infection. The study reported an accuracy of 83.5%.

In a study conducted by Maghdid et al. [20], a deep learning-based method and a transfer learning strategy were used for automatic diagnosis of COVID-19 pneumonia. The proposed architecture is a combination of a simple convolutional neural network (CNN) architecture (one convolutional layer with 16 filters followed by batch normalization, a rectified linear unit (ReLU) and two fully-connected layers) and a modified AlexNet [21] architecture with the feasibility of transfer learning. The proposed modified architecture achieved an accuracy of 94.00%.

Ghoshal and Tucker [22] investigated the diagnostic uncertainty and interpretability of deep learning-based methods for COVID-19 detection in X-ray images.
Dropweights-based Bayesian Convolutional Neural Networks (BCNN) were used to estimate uncertainty in deep learning solutions and to provide a level of confidence in a computer-based diagnosis for a trusted clinician setting. To measure the relationship between accuracy and uncertainty, 70 posteroanterior (PA) lung X-ray images of COVID-19 positive patients from the public dataset provided by Dr. Joseph Cohen [14] were selected and balanced with images from Kaggle's Chest X-Ray Images dataset [19]. To prepare the dataset, all images were resized to 512×512 pixels. A transfer learning strategy and real-time data augmentation were employed to overcome the limited size of the dataset. The proposed Bayesian inference approach obtained a detection accuracy of 92.86% on X-ray images using the VGG16 deep learning model.

Hall et al. [23] used a VGG16 architecture and a transfer learning strategy with 10-fold cross-validation, trained on the dataset from Dr. Joseph Cohen [14]. All images were rescaled to 224×224 pixels and a data augmentation strategy was employed to increase the size of the dataset. The proposed approach achieved an overall accuracy of 96.1% and an overall Area Under Curve (AUC) of 99.70% on the provided dataset.

Farooq and Hafeez [24] proposed a fine-tuned, pre-trained ResNet-50 architecture, COVID-ResNet, for COVID-19 pneumonia screening. To improve the generalization of the trained model, different data augmentation methods, including vertical flip and random rotation (with an angle of 15 degrees), were used along with model regularization. The proposed method achieved an accuracy of 96.23% on a multi-class classification dataset of normal, bacterial infection, viral infection (non-COVID-19) and COVID-19 infection.
The main motivation of this study is to present a generic feature extraction method using convolutional neural networks that does not require handcrafted or very complex features from input data while being easily applied to different modalities such as X-ray and CT images. Another primary goal is to reduce the generalization error while achieving a more accurate diagnosis. The contributions are summarized as follows:
• Deep convolutional feature representation [25, 26, 27] is used to extract highly representative features using state-of-the-art deep CNN descriptors. The employed approach is able to discriminate between COVID-19 and healthy subjects from chest X-ray and CT images and hence produces higher accuracy in comparison to other works presented in the literature. To the best of our knowledge, this research is the first comprehensive study of the application of machine learning (ML) algorithms (15 deep CNN visual feature extractors and 6 ML classifiers) for automatic diagnosis of COVID-19 from X-ray and CT images.
• To overcome the issue of over-fitting in deep learning due to the limited number of training images, a transfer-learning strategy is adopted, as training very deep CNN models from scratch requires a large amount of training data.
• No data augmentation or extensive pre-processing methods are applied to the dataset, in order to increase the generalization ability and also reduce bias in the model performance.
• The proposed approach reduces the detection time dramatically while achieving satisfactory accuracy, which is a clear advantage for real-time or near real-time inference in clinical applications.
• With extensive experiments, we show that the combination of a deep CNN with a Bagging trees classifier achieves very good classification performance on COVID-19 data despite the limited number of image samples.
• Finally, we developed an end-to-end web-based detection system to simulate a virtual clinical pipeline and facilitate the screening of suspicious cases.

The rest of this paper is organized as follows. The proposed methodology for automatically classifying COVID-19 and healthy cases is explained in Section 2. The dataset description, experimental settings and performance metrics are given in Section 3. A brief discussion and results analysis are provided in Section 4, and finally, the conclusion is presented in Section 5.

Few studies have been published on the application of deep CNN feature descriptors to X-ray and CT images. Each of the CNN architectures is constructed from different modules and convolution layers that aid in extracting fundamental and prominent features from a given input image. Briefly, in the first step, we collect available public chest X-ray and CT images. In the next step, we pre-process the dataset using standard image normalization techniques to improve the quality of the visual information in the input data. Once the input images are prepared, we feed them into the feature extraction phase, where state-of-the-art CNN descriptors extract deep features from each input image. For the training phase, the generated features are then fed into machine learning classifiers such as Decision Tree (DT) [28], Random Forest (RF) [29], XGBoost [30], AdaBoost [31], Bagging classifier [32] and LightGBM [33]. Finally, the performance of the proposed approach is evaluated on test images.

The concept of transfer learning has been introduced for solving deep learning problems arising from insufficient labeled data, or when the CNN model is too deep and complex. Aiming to tackle these challenges, studies in a variety of computer vision tasks have demonstrated the advantages of transfer learning strategies from an auxiliary domain in improving the detection rate and performance of a classifier [34] [35] [36].
In a transfer learning strategy, we transfer the weights already learned on a cross-domain dataset into the current deep learning task instead of training a model from scratch. With the transfer learning strategy, the deep CNN can obtain general features from the source dataset that could not be learned from the limited dataset of the current task. Transfer learning strategies have various advantages, such as avoiding the overfitting issue when the number of training samples is limited, reducing the computational resources, and also speeding up the convergence of the network [37] [38].

Effective feature extraction is one of the most important steps toward learning rich and informative representations from raw input data to provide accurate and robust results. The small or imbalanced size of the training samples poses a significant challenge for the training of a deep CNN, where data dimensionality is much larger than the number of samples, leading to over-fitting. Although various strategies, e.g. data augmentation [39], transfer learning [40] and fine-tuning [41], may reduce the problem of insufficient or imbalanced training data, the detection rate of the CNN model may still degrade due to over-fitting. Since the overall performance obtained by a fine-tuning method in the initial experiments for this study was not significant, we employed a different approach inspired by [25] [26] [27], known as deep convolutional feature representation. In this method, we used pre-trained, well-established CNN models as visual feature extractors to encode the input images into a feature vector of sparse descriptors of low dimensionality. The encoded feature vectors produced by the CNN architectures are then fed into different classifiers, i.e. machine learning algorithms, to yield the final prediction. This lower-dimensional vector significantly reduces the risk of over-fitting and also the training time.
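The second stage of this pipeline, a classifier trained on fixed feature vectors, can be illustrated with a minimal sketch of bagging (bootstrap aggregating). This is not the authors' implementation: the function names (`fit_stump`, `fit_bagging`, `predict_bagging`) are hypothetical, the two-dimensional toy vectors stand in for real CNN descriptors, and one-split decision stumps replace the full decision trees a Bagging tree classifier would normally use.

```python
import random
from collections import Counter

def fit_stump(X, y):
    """Fit a one-split decision stump by exhaustive search over
    (feature, threshold, polarity), keeping the fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for polarity in (0, 1):
                # polarity is the class predicted when the feature exceeds t
                preds = [polarity if row[j] > t else 1 - polarity for row in X]
                errors = sum(p != label for p, label in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, polarity)
    _, j, t, polarity = best
    return j, t, polarity

def predict_stump(stump, row):
    j, t, polarity = stump
    return polarity if row[j] > t else 1 - polarity

def fit_bagging(X, y, n_estimators=25, seed=0):
    """Train each base learner on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(X)
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return ensemble

def predict_bagging(ensemble, row):
    """Aggregate the base learners by majority vote."""
    votes = Counter(predict_stump(stump, row) for stump in ensemble)
    return votes.most_common(1)[0][0]

# Toy stand-in for extracted feature vectors: label 0 = control, 1 = COVID-19.
X = [[0.10, 0.90], [0.20, 0.80], [0.15, 0.70], [0.30, 0.95],
     [0.90, 0.20], [0.80, 0.10], [0.85, 0.30], [0.70, 0.15]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_bagging(X, y)
print(predict_bagging(model, [0.05, 0.90]), predict_bagging(model, [0.95, 0.10]))
```

Because every base learner sees a different bootstrap sample, the majority vote averages out the variance of the individual learners, which is the usual motivation for bagging on small datasets.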
Different robust CNN architectures such as MobileNet, DenseNet, Xception, InceptionV3, InceptionResNetV2, ResNet, VGGNet and NASNet were selected for feature extraction because of the transfer learning advantage they offer for limited datasets and their satisfying performance in different computer vision tasks [42, 43, 44, 45]. Figure 3 illustrates the visual features extracted by the VGGNet architecture from an X-ray image of a COVID-19 positive patient.

In order to evaluate the performance of our feature extraction and classification approach, we used the public dataset of X-ray images provided by Dr. Joseph Cohen, available from a GitHub repository [14]. We used the available 117 chest X-ray images and 20 CT images (137 images in total) of COVID-19 positive cases. We also included 117 X-ray images of healthy cases from the Kaggle Chest X-Ray Images (Pneumonia) dataset [19] and 20 CT images of healthy cases from the Kaggle RSNA Pneumonia Detection dataset [46] to balance the dataset between positive and normal cases. Figure 4 shows examples of confirmed COVID-19 images extracted from the provided dataset. The X-ray images of confirmed COVID-19 infection demonstrate different shapes of "pure ground glass", also known as hazy lung opacity, with irregular linear opacity depending on the disease progress [12].

The images within the dataset were collected from multiple imaging clinics with different equipment and image acquisition parameters; therefore, considerable variations exist in the images' intensity. The proposed method in this study avoids extensive pre-processing steps to improve the generalization ability of the CNN architecture. This helps to make the model more robust to noise, artifacts and variations in input images during the feature extraction phase. Hence, we only employed two standard pre-processing steps in training deep learning models to optimize the training process.
• Resizing: The images in this dataset vary in resolution and dimension, ranging from 365×465 to 1125×859 pixels; therefore, we re-scaled all images from their original size to 600×450 pixels to obtain a consistent dimension for all input images. The input images were also separately resized to 331×331 pixels and 224×224 pixels as required for the NASNetLarge and NASNetMobile architectures, respectively.
• Image normalization: For image normalization, we first re-scaled the intensity values of the pixels using ImageNet mean subtraction as a pre-processing step. The ImageNet mean is a pre-computed constant derived from the ImageNet database [21]. Another essential pre-processing step is intensity normalization. To accomplish this, we normalized the intensity values of all images from [0, 255] to the range [0, 1] by min-max normalization, which is computed as:

$$x_{norm} = \frac{x - x_{min}}{x_{max} - x_{min}} \tag{1}$$

where $x$ is the pixel intensity, and $x_{min}$ and $x_{max}$ are the minimum and maximum intensity values of the input image in equation (1). This operation helps to speed up the convergence of the model by removing the bias from the features and achieving a uniform distribution across the dataset.

To measure the prediction performance of the methods in this study, we utilized common evaluation metrics such as recall, precision, accuracy and f1-score. In equations (2)-(5), true positive (TP) is the number of positive instances that are correctly predicted; false negative (FN) is the number of positive instances that are incorrectly predicted. True negative (TN) is the number of negative instances that are correctly predicted, while false positive (FP) is the number of negative instances that are incorrectly predicted. Given TP, TN, FP and FN, all evaluation metrics were calculated as follows:

Recall or sensitivity is the measure of COVID-19 cases that are correctly classified.
Recall is critical, especially in the medical field, and is given by:

$$\text{Recall} = \frac{TP}{TP + FN} \tag{2}$$

Precision, or positive predictive value, is defined as the percentage of cases classified as positive that are truly positive patients, and is given as:

$$\text{Precision} = \frac{TP}{TP + FP} \tag{3}$$

Accuracy shows the number of correctly classified cases divided by the total number of test images, and is defined as:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{4}$$

F1-score, also known as F-measure, is defined as the harmonic mean of precision and recall, combining both into a single value. F-measure is expressed as:

$$\text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{5}$$

Diagnostic imaging modalities, such as chest radiography and CT, are playing an important role in confirming the primary diagnosis from the Polymerase Chain Reaction (PCR) test for COVID-19. Medical imaging is also playing a critical role in monitoring the progression of the disease and in patient care. Extracting features from radiology modalities is an essential step in training machine learning models, since the model performance directly depends on the quality of the extracted features. Motivated by the success of deep learning models in computer vision, the focus of this research is to provide an extensive, comprehensive study on the classification of COVID-19 pneumonia in chest X-ray and CT imaging using features extracted by state-of-the-art deep CNN architectures and classified by machine learning algorithms.

The 10-fold cross-validation technique was adopted to evaluate the average generalization performance of the classifiers in each experiment. For all CNNs, the network weights were initialized from the weights trained on ImageNet. The Windows-based computer system used for this work had an Intel(R) Core(TM) i7-8700K 3.7 GHz processor with 32 GB RAM. The training and testing process of the proposed architecture for this experiment was implemented in Python using the Keras package with TensorFlow as the deep learning backend, and run on an Nvidia GeForce GTX 1080 Ti GPU with 11 GB RAM.
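As an illustration of the evaluation protocol described above, the following sketch computes recall, precision, accuracy and f1-score from confusion-matrix counts, splits a dataset into 10 folds, and summarizes per-fold scores as a mean and standard deviation. The helper names (`classification_metrics`, `k_fold_indices`, `summarize`) are hypothetical, the folds are contiguous and unshuffled for simplicity, and the use of the population standard deviation is an assumption, since the paper does not state which estimator was used.

```python
import statistics

def classification_metrics(tp, tn, fp, fn):
    """Recall, precision, accuracy and f1-score from confusion-matrix counts.
    Assumes tp + fn > 0 and tp + fp > 0; zero denominators are not handled."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"recall": recall, "precision": precision,
            "accuracy": accuracy, "f1": f1}

def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) index lists for k contiguous folds.
    In practice the samples would be shuffled (or stratified) first."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

def summarize(fold_scores):
    """Mean and population standard deviation of per-fold scores."""
    return statistics.mean(fold_scores), statistics.pstdev(fold_scores)

# Example: a fold with 8 true positives, 9 true negatives,
# 1 false positive and 2 false negatives.
print(classification_metrics(tp=8, tn=9, fp=1, fn=2))

# 274 images (137 COVID-19 and 137 healthy) split into 10 folds.
print([len(test) for _, test in k_fold_indices(274, k=10)])
```

Each (train, test) index pair would be used to train one classifier on the extracted features and score it on the held-out fold; `summarize` then turns the ten fold scores into the mean and standard deviation format used for reporting.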
Table 1 and Figure 5 summarize the accuracy of six machine learning algorithms, namely DT, RF, XGBoost, AdaBoost, Bagging classifier and LightGBM, on the features extracted by deep CNNs. Each entry in Table 1 is in the format (µ ± σ), where µ is the average classification accuracy and σ is the standard deviation. Analyzing Table 1, the top result was obtained by the Bagging classifier, with a maximum of 99.00% ± 0.09 accuracy on features extracted by the DenseNet121 architecture (with a feature extraction time of 9.306 seconds and a training time of 30.748 seconds in Table 5), which is the highest result reported in the literature for COVID-19 classification on this dataset. It can also be inferred from Table 1 that the second-best result was obtained by the ResNet50 feature extractor and LightGBM classifier (with a feature extraction time of 0.960 seconds and a training time of 10.206 seconds in Table 5), with an overall accuracy of 98.00 ± 0.09. Comparing the first and second winners among all combinations, the classification accuracy of DenseNet121 with Bagging is slightly better (1%) than that of ResNet50 with LightGBM, while the training time of the second winner is tempting: about three times shorter than that of the first winner (10.206 versus 30.748 seconds). Although Bagging is a slow learner, it has the lowest standard deviation and hence is more stable than other learners. The results also demonstrate that the detection rate is worst on the features extracted by ResNet101V2 and classified by AdaBoost, with 76.00 ± 0.32 accuracy. Figure 5 and Figure 6 show box-plot distributions of the deep CNN feature extractors and classification accuracy from the 10-fold cross-validation. Circles in Figure 5 represent outliers. In Tables 2, 3 ...

Table 4: Comparison of the classification f1-score of different machine learning models. The bold value indicates the best result; the underlined value represents the second-best result of the respective category.

... trained visual feature extractor so far was DenseNet121, MobileNet and InceptionV3 rather than counterpart architectures for COVID-19 image classification. Although the approach presented here shows satisfying performance, it also has limitations in classifying more challenging instances with vague, low-contrast boundaries and the presence of artifacts. Some examples of these cases are illustrated in Figure 7. Finally, a comparison of the feature extraction time using deep CNN models and the training time with ML algorithms is shown in Table 5 and ...

After training a model, the pre-trained weights and models can be used as a predictive engine for CAD systems to allow automatic classification of new data. A web-based application was implemented using standard web development tools and techniques such as Python, JavaScript, HTML, and the Flask web framework. Figure 9 shows the output of our web-based application for COVID-19 pneumonia detection. This web application could help doctors benefit from our proposed method by providing an online tool that only requires uploading an X-ray or CT image. The application then provides the physician with a simple COVID-19 Positive or COVID-19 Negative observation. It should be noted that this application has yet to be clinically validated, is not yet approved for diagnostic use, and would simply serve as a diagnostic aid for the medical imaging specialist.

The proposed method is generic, as it does not need handcrafted features and can be easily adapted, requiring minimal pre-processing. The provided dataset was collected across multiple sources with different shapes, textures and morphological characteristics. The transfer learning strategy successfully transferred knowledge from the source to the target domain despite the limited size of the provided dataset. With the proposed approach, we observed no overfitting that adversely impacted the classification accuracy. However, our study has some limitations.
The training data samples are limited. Extending the dataset with additional data sources can provide a better understanding of the proposed approach. Also, employing pre-trained networks as feature extractors requires rescaling the input images to a certain dimension, which may discard valuable information. Although the proposed methodology achieved satisfying performance with an accuracy of 99.00%, the diagnostic performance of the deep learning visual feature extractor and machine learning classifier should be evaluated in real clinical trials.

The ongoing pandemic of COVID-19 has been declared a global health emergency due to the relatively high infection rate of the disease. As of the time of this writing, there is no clinically approved therapeutic drug or vaccine available to treat COVID-19. Early detection of COVID-19 is important for interrupting human-to-human transmission and for patient care. Currently, the isolation and quarantine of suspected patients is the most effective way to prevent the spread of COVID-19. Diagnostic modalities such as chest X-ray and CT are playing an important role in monitoring the progression and severity of the disease in COVID-19 positive patients. This paper presents an approach that combines a deep learning feature extractor with a machine learning classifier for computer-aided diagnosis of COVID-19 pneumonia. Several ML algorithms were trained on the features extracted by well-established CNN architectures to find the best combination of features and learners. Considering the high visual complexity of image data, proper deep feature extraction is considered a critical step in developing deep CNN models. The experimental results on the available chest X-ray and CT dataset demonstrate that the features extracted by the DenseNet121 architecture and classified by a Bagging tree classifier generate very accurate predictions, with 99.00% classification accuracy.
Covid-19 infection: Origin, transmission, and characteristics of human coronaviruses
Thrombocytopenia is associated with severe coronavirus disease 2019 (covid-19) infections: A meta-analysis
Probable pangolin origin of sars-cov-2 associated with the covid-19 outbreak
The impact of the covid-19 epidemic on the utilization of emergency dental services
Coronavirus disease (covid-19): A primer for emergency physicians
The epidemiology and pathogenesis of coronavirus disease (covid-19) outbreak
Clinical and ct imaging features of the covid-19 pneumonia: Focus on pregnant women and children
COVID-19) Pandemic
Transmission potential and severity of covid-19 in south korea
Coronavirus Infections -Transmission electron microscopic image
Temporal changes of ct findings in 90 patients with covid-19 pneumonia: a longitudinal study
Covidx-net: A framework of deep learning classifiers to diagnose covid-19 in x-ray images
Covid-19 image data collection
Detecting COVID-19 in X-ray images with Keras, TensorFlow, and Deep Learning
Coronavirus (covid-19) classification using ct images by machine learning methods
An introduction to support vector machines and other kernel-based learning methods
Covid-net: A tailored deep convolutional neural network design for detection of covid-19 cases from chest radiography images
Kaggle's Chest X-Ray Images (Pneumonia) dataset
Diagnosing covid-19 pneumonia from x-ray and ct images using deep learning and transfer learning algorithms
Imagenet classification with deep convolutional neural networks
Estimating uncertainty and interpretability in deep learning for coronavirus (covid-19) detection
Finding covid-19 from chest x-rays using deep learning on a small dataset
Covid-resnet: A deep learning framework for screening of covid19 from radiographs
A theoretical analysis of feature pooling in visual recognition
Deep convolutional neural networks for breast cancer histology image analysis
Deep learning for visual understanding: A review
Induction of decision trees
Random forests
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining -KDD '16
A decision-theoretic generalization of on-line learning and an application to boosting
Bagging predictors
Lightgbm: A highly efficient gradient boosting decision tree
Breast cancer diagnosis with transfer learning and global pooling
A novel deep learning based framework for the detection and classification of breast cancer using transfer learning
Breast cancer histology images classification: Training from scratch or transfer learning?
Pathological brain detection based on alexnet and transfer learning
Classification of histopathological biopsy images using ensemble of deep learning networks
Automatic diagnosis of fungal keratitis using data augmentation and image fusion with deep convolutional neural network
A novel scene classification model combining resnet based transfer learning and data augmentation with a filter
Decision fusion-based fetal ultrasound image plane classification using convolutional neural networks
Automated identification and grading system of diabetic retinopathy using deep neural networks
Deep learning iot system for online stroke detection in skull computed tomography images
Detection of tumors on brain mri images using the hybrid convolutional neural network architecture
Mapgi: Accurate identification of anatomical landmarks and diseased tissue in gastrointestinal tract using deep learning
RSNA Pneumonia Detection Challenge