key: cord-0717363-47sqvg54
authors: Jain, Rachna; Gupta, Meenu; Taneja, Soham; Hemanth, D. Jude
title: Deep learning based detection and analysis of COVID-19 on chest X-ray images
date: 2020-10-09
journal: Appl Intell
DOI: 10.1007/s10489-020-01902-1
sha: 0ff70010a63c7b74c42290f020c536464c64431b
doc_id: 717363
cord_uid: 47sqvg54

COVID-19 is a rapidly spreading viral disease that infects not only humans but also animals. It affects people's daily lives and health as well as national economies. At the time of writing, no country has been able to prepare a vaccine for COVID-19. Clinical studies of COVID-19 patients have shown that most of them develop a lung infection after contracting the disease. Chest X-ray (i.e., radiography) and chest CT are effective imaging techniques for diagnosing lung-related problems, and a chest X-ray is substantially cheaper than a chest CT. Deep learning is the most successful machine learning technique for analyzing large numbers of chest X-ray images, which can have a critical impact on COVID-19 screening. In this work, we take PA-view chest X-ray scans of COVID-19 patients as well as healthy patients. After cleaning the images and applying data augmentation, we train deep learning-based CNN models and compare their performance. We compare the Inception V3, Xception, and ResNeXt models and examine their accuracy. To analyze model performance, 6432 chest X-ray scans were collected from the Kaggle repository, of which 5467 were used for training and 965 for validation. In the result analysis, the Xception model gives the highest accuracy (97.97%) in detecting chest X-ray images compared with the other models.
This work focuses only on possible methods of classifying COVID-19 patients and does not claim any medical accuracy.

COVID-19 is a severe disease that claims a large number of lives every day. It affects not just a single country; the whole world has suffered because of this virus. In recent decades, several viral diseases (such as SARS [1], MERS, and flu) [2, 3] have emerged, but they lasted only a few days or a few months [4]. Many scientists work on such viruses, and a few of these diseases can be managed thanks to vaccines developed by these researchers. At present, the whole world is affected by COVID-19 [5], and, crucially, no country's scientists have yet been able to prepare a vaccine for it. Meanwhile, many approaches have been explored, such as plasma therapy and X-ray imaging [6, 7], but no definitive solution to this deadly disease has been found. People lose their lives to COVID-19 every day [8], and the cost of diagnosing the disease is very high for countries, states, and patients [9]. In March 2020, X-ray images of healthy people and COVID-19 patients [10] became available online for analysis in repositories such as GitHub and Kaggle. COVID-19 began as an epidemic, threatened humans at a global level, and turned into a pandemic. Distinguishing COVID-19 patients from healthy people is a critical task. Dialysis of COVID-19 patients needs extra precautions and must be carried out under very strict procedures to reduce the risk to patients who are not infected [11]. The novel coronavirus disease first appeared as a throat infection, after which patients suddenly had difficulty breathing. COVID-19 is a hidden enemy that no one is yet able to fight.
COVID-19 patients [12] must be isolated, properly screened, and given adequate protection in order to protect healthy people. The infection follows a chain process [13]: it passes from one person to another through contact with infected persons. Hospital staff, nurses, doctors, and clinical facilities play an essential role in the diagnosis of this epidemic. Many other strategies have been applied to reduce the impact of COVID-19. Medical imaging [14] is one method of analyzing and predicting the effects of COVID-19 on the human body. With it, healthy people and COVID-19 patients can be analyzed side by side using CT (computerised tomography) images [15, 16] and chest X-ray images. To contribute to the analysis of COVID-19, we collected publicly uploaded X-ray images of healthy people and COVID-19 patients from different sources and applied three different models (InceptionV3, Xception, and ResNeXt). The collected data is analyzed with convolutional neural networks (CNNs), a machine learning tool. This work mainly focuses on the use of CNN models for classifying chest X-ray images of coronavirus patients. We attempt to draw parallels to previous work in the field and look for candidate models for the task, which can be assessed further to prove their usefulness in practical scenarios.

The rest of this paper is organized as follows. Section 2 discusses various researchers' views on the impact of COVID-19 on countries and people. Section 3 discusses the dataset used, the model formulation, and the metrics and algorithms used. Section 4 evaluates the training and testing results, with confusion matrices for the models used. Section 5 concludes this work and outlines its future scope.
In [17], the authors proposed COVID-CAPS, a framework based on capsule networks for diagnosing COVID-19 from X-ray images. The proposed model uses several convolution layers and capsules to overcome the class-imbalance problem. In their experiments, COVID-CAPS showed satisfactory performance with a small number of trainable parameters. The authors note that the trained model is publicly available on GitHub [18] for open access. They concluded that the proposed model achieves an accuracy of 95.7%, a sensitivity of 90%, and a specificity of 95.80% with a small number of trainable parameters. In [19], the authors considered the first three COVID-19 cases in France. Of these three persons, two were diagnosed in Paris and one in Bordeaux. Before contracting COVID-19, they had been staying in Wuhan, China. In [20], the author proposed a hybrid system based on artificial intelligence that uses machine learning and deep learning algorithms (i.e., a convolutional neural network (CNN) with a softmax classifier). The proposed system is implemented specifically for detecting COVID-19 cases from chest X-ray images. In [21], the authors gave a radiologic analysis of Middle East respiratory syndrome (MERS) coronavirus infection. They considered the case of a 30-year-old male patient who suffered from diarrhea, fever, and abdominal pain. The authors analyzed the treatment of infected persons with chest X-rays [22]. Further, they applied this model to a collected dataset of chest X-ray and CT images and obtained improved results. In [11], the authors discussed which protocols hospital staff should follow to minimize the risk to healthy patients and which precautions are needed when caring for COVID-19 patients. In [23], the authors discussed the outbreak of pneumonia of unknown etiology in Wuhan, China.
They also raised the question of the exact cause of this epidemic. In this study, they evaluated the impact of travel (via commercial air travel) on COVID-19. In [24], the authors applied the support vector machine (SVM) technique to identify pneumothorax. They used local binary patterns (LBP) to extract features from lung images. In the proposed detection model, the authors used multi-scale texture segmentation, after removing impurities from the chest images, to segment the abnormal lung regions. This transformation was then applied to texture changes to find multiple overlapping blocks. Finally, the authors used rib boundaries (identified with Sobel edge detection) to find the whole disease region including the abnormal part. In [25], the authors considered the chest CT scans of 21 COVID-19 patients in Wuhan, China. They focus mainly on how COVID-19 manifests in human lungs and on its impact. Next, in [26], the authors proposed the COVID-RENet model, which extracts edge- and region-based features with a CNN for classification. In this work, the authors obtain features by applying a CNN and then use an SVM to improve classification performance. They used 5-fold cross-validation on a collected COVID-19 dataset. The proposed approach is mainly intended to help medical specialists diagnose COVID-19 patients early. In [27], the authors applied a deep learning model to a collected dataset of chest CT images to distinguish COVID-19 from community-acquired pneumonia and other lung diseases. Further, in [28], the author studied the impact of COVID-19 on the kidneys and on acute renal failure. In [29], the authors considered a dataset of 50 COVID-19 patients, divided into two recovery groups (good and poor). Serological changes and viral shedding were explored dynamically. The authors then identified the risk factors for poor recovery and lung infections.
They concluded that 58% of the patients had a poor recovery. In [30], the authors studied the total number of COVID-19 infections and deaths worldwide. In [31], the authors recommended a deep learning-based methodology (with a support vector machine classifier) for detecting COVID-19 patients from X-ray images. This method helps hospital doctors detect COVID-19 cases early. They report 97.48% accuracy for the proposed lung classification model, evaluated with several metrics. In [32], the authors discussed how the novel coronavirus emerged as a novel pneumonia in the city of Wuhan, China. The primary purpose of that paper was to present a new deep learning framework, COVIDX-Net, to help clinical practitioners automatically diagnose COVID-19 from X-ray images. Further, in [33], the authors discussed the different methodologies used for COVID-19 detection and the challenges faced. They also argued that an automatic method for detecting the COVID-19 virus should be developed to prevent the disease from spreading through contact. They then analyzed different chest X-rays for the detection of pneumonia and concluded that it is hard to tell whether COVID-19 causes the pneumonia or whether other symptoms are responsible for it. In [34], the authors discussed chest radiography (CXR) for identifying lung abnormalities. They argue that the medical community can rely on CXR because of its wide availability and reduced infection control issues. In [18], the authors used 123 frontal-view X-rays for the detection of COVID-19. Further, in [35], the authors discussed the role of AI tools in healthcare. They also discussed the challenges of applying AI tools to the small publicly available datasets of X-ray images.
The authors considered a dataset of X-ray and CT images from several sources and applied deep learning and transfer learning algorithms to detect COVID-19. A pre-trained AlexNet model and a modified CNN model were used on the collected dataset. They showed that the pre-trained model achieved 98% accuracy and the modified CNN achieved 94.1% accuracy. In [36], the authors extracted patches of two sizes (16 × 16 and 32 × 32) to generate two sub-datasets, derived from 150 CT images, and 3000 X-ray images were labeled for COVID-19. Fusion and ranking methods were then applied to enhance the performance of the proposed methodology. The authors used an SVM to classify the processed data, and CNN models were used for transfer learning. They showed that subset 2 achieved better accuracy than subset 1. In [37], the authors proposed a model that automatically detects COVID-19 from chest X-ray images. The proposed model provides diagnostics for two classification settings (binary and multi-class). They used the DarkNet model, which serves as the classifier in a real-time object detection system. In [38], the authors discussed the use of thermoplasmonics in the detection of COVID-19. In [39], the authors considered patients with confirmed COVID-19 pneumonia who were admitted to hospital in Wuhan, China. They divided the patients who underwent CT scans into different groups, and the image features and their distribution were then analyzed and compared for detecting COVID-19. In [40], the authors proposed a KE Sieve neural network architecture that helps detect COVID-19 from chest X-ray images. Their proposed model achieves 98% accuracy. In [41], the authors conducted a study of COVID-19 patients who had been isolated in the BBH hospital ward located in Rawalpindi.
In [42], the authors proposed a CNN-based algorithm for analyzing pneumonia using a chest X-ray dataset. They used two different models (VGG16 and InceptionV3) for transfer learning with the CNN. They further applied an SVM to obtain better results. In [43], the authors discussed the implementation of a deep anomaly detection method for reliable screening of COVID-19 patients. They collected 100 chest X-ray images, in which 70 persons were confirmed positive for COVID-19. Finally, in [44], the authors discussed the impact of COVID-19 on humans. They considered a dataset of 101 cases of COVID-19 pneumonia. The primary goal of that study was to compare the clinical condition of COVID-19 pneumonia patients with their CT images. The views and proposed work of these researchers show that COVID-19 is a viral disease that affects not only individuals but also entire countries. They discussed different methodologies for detecting COVID-19 cases early. We implemented three models (Inception V3, Xception, and ResNeXt) on a collected dataset of chest X-ray images. Sections 3 and 4 give a complete description of these models, and their comparison is shown in section 4.

The dataset and the methodology used are explained in the subsequent sections. The dataset for this work was collected from the Kaggle repository [45] and contains chest X-ray scans of COVID-19, normal, and pneumonia cases. This dataset is not meant to claim diagnostic ability for any deep learning model but to support research into possible ways of efficiently detecting coronavirus infections using computer vision techniques. The collected dataset consists of 6432 chest X-ray images in total. It is divided into a training set (5467 images) and a validation set (965 images), each covering the normal, COVID-19, and pneumonia classes. In the training set, 1345 images are normal, 490 are COVID-19, and 3632 are pneumonia.
The validation set contains 238 normal, 86 COVID-19, and 641 pneumonia samples. At the time of drafting this paper, we had 576 PA (posteroanterior) view scans of COVID-19 patients. The scans were scaled down to 128 × 128 to speed up the training of our model. The PA view scans were deemed to be consistent with our COVID-19 dataset. Table 1 displays the data distribution for training and testing. The data obtained from the Kaggle repository [45] was cleaned as needed.

A deep learning method [46] requires a large amount of data to produce reliable results. However, not every problem has enough data, especially in the medical domain, where collecting data can be time-consuming and expensive. Augmentation can help with these difficulties: it reduces over-fitting and can improve the accuracy of the proposed model. Augmentation was therefore applied to the collected dataset to prevent over-fitting. The augmentations included rotation, zoom, and shearing of images. The data was then shuffled to help the model generalize and to reduce over-fitting. After this, the prepared dataset was used to train the proposed model. For better analysis, three different models were implemented and their performance was compared in terms of accuracy. In these models, we used the LeakyReLU activation instead of the originally used ReLU activation function, which is the novel aspect of our method. This helps speed up training and also avoids the problem of dead neurons (i.e., ReLU neurons that become inactive because of their zero slope for negative inputs). Figure 1 shows the proposed model for chest X-ray image analysis.

Inception Net V3 is a CNN-based network for classification.
It is 48 layers deep and uses inception modules, each of which concatenates the outputs of 1 × 1, 3 × 3, and 5 × 5 convolutions. This decreases the number of parameters and increases the training speed. It is also referred to as the GoogLeNet architecture [47]. Figure 2 shows the abstracted form of the Inception Net V3 model.

Xception is a modification of the Inception net. In this model, the inception modules are replaced with depthwise separable convolutions. Its parameter count is similar to that of the Inception net, but it performs slightly better [48]. The abstracted form of the Xception model is shown in Fig. 3.

ResNeXt is an extension of the deep residual network. In this model, the standard residual blocks are replaced with blocks that leverage the split-transform-merge strategy used in the Inception models [49]. The ResNeXt architecture is shown in Fig. 4.

We used the categorical cross-entropy loss to train our models. It is used to optimize the values of the model parameters, and we aim to decrease it over successive epochs. We used the Adam optimizer with a learning rate of 0.001 for training. In its standard form, the loss for one sample is

$$L(y, \hat{y}) = -\sum_{i=1}^{C} y_i \log \hat{y}_i$$

where $L$ is the loss function, $y$ is the one-hot true label, $\hat{y}$ is the predicted class probability vector, and $C$ is the number of classes.

The approach used for implementing the proposed model is as follows.

Step 1: Pre-process the image, i.e., image = X. We used the Keras data generator for this purpose:
i.) Reshape image (X) to (128, 128, 3)
ii.) Random rotation range = 10°
iii.) Horizontal flip = True
iv.) Zoom range = 0.4
Note: shape = (128, 128, 3) for fast processing; shape = (256, 256, 3) for better performance.
Step 2: Apply the image to the input of the pre-trained model.
Step 3: Fetch the output of the last convolution layer of the given model.
Step 4: Flatten the output, reducing its n dimensions to n - 1.
Step 5: Apply a dense layer, $z = Wx + b$, where W = weights and b = bias, with units = 256 for Xception Net and Inception Net and units = 128 for ResNeXt.
Step 6: Apply the activation.
Step 7: Apply a dense layer for inference.
Step 8: Apply softmax for classification.

The proposed model was evaluated using precision, recall, F1 score [50], accuracy, sensitivity, and specificity [51], as shown in Eqs. (8) to (12). In terms of the true positive (TP), false positive (FP), true negative (TN), and false negative (FN) counts, these metrics have the standard definitions

$$\text{Precision} = \frac{TP}{TP + FP}, \quad \text{Recall (sensitivity)} = \frac{TP}{TP + FN}, \quad \text{Specificity} = \frac{TN}{TN + FP}$$

$$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}, \quad \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Xception is a modification of the Inception net in which the inception modules are replaced with depthwise separable convolutions. Its parameter count is similar to that of the Inception net, but it performs slightly better. Tables 2 and 3 report the F1-scores on the training and testing sets for the Xception model. Figure 5 shows the training and testing loss and accuracy of the Xception model. Figure 6 shows the confusion matrices for training and testing of the Xception model.

Inception V3 is a state-of-the-art CNN for classification. It is 48 layers deep and uses inception modules, each of which concatenates the outputs of 1 × 1, 3 × 3, and 5 × 5 convolutions. This decreases the number of parameters and increases the training speed. It is also referred to as the GoogLeNet architecture. Tables 4 and 5 report the F1-scores on the training and testing sets for the Inception V3 model. Figures 8a and 8b show the confusion matrices for the training and testing data of the Inception V3 model.

ResNeXt is an extension of the deep residual network. In this model, the standard residual blocks are replaced with blocks that leverage the split-transform-merge strategy used in the Inception models. Tables 6 and 7 show the F1-scores on the training and testing sets for the ResNeXt model. Figure 9a shows how the accuracy of the ResNeXt model improves over successive epochs, and Fig. 9b shows how its training loss decreases over successive epochs.
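The precision, recall, F1-score, specificity, and accuracy values reported in this section are all derived from a confusion matrix. As a minimal sketch of that derivation, the following plain-Python code computes one-vs-rest metrics for each class. The function names and the example matrix are assumptions made for illustration: the counts are invented so that the row totals match the validation split (86 COVID-19, 238 normal, 641 pneumonia) and are not the results of any model in this paper.

```python
from typing import Dict, List


def per_class_metrics(cm: List[List[int]], labels: List[str]) -> Dict[str, Dict[str, float]]:
    """One-vs-rest metrics; cm[i][j] counts samples of true class i predicted as class j."""
    total = sum(sum(row) for row in cm)
    metrics: Dict[str, Dict[str, float]] = {}
    for k, name in enumerate(labels):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                  # true class k, predicted as another class
        fp = sum(row[k] for row in cm) - tp   # other classes predicted as class k
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0  # recall equals sensitivity
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        metrics[name] = {
            "precision": precision,
            "recall": recall,
            "specificity": specificity,
            "f1": f1,
        }
    return metrics


def accuracy(cm: List[List[int]]) -> float:
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / sum(sum(row) for row in cm)


# Hypothetical counts for illustration only (rows: true class, columns: predicted class).
LABELS = ["covid", "normal", "pneumonia"]
EXAMPLE_CM = [
    [80, 2, 4],
    [1, 220, 17],
    [3, 15, 623],
]

if __name__ == "__main__":
    for name, values in per_class_metrics(EXAMPLE_CM, LABELS).items():
        print(name, {key: round(value, 4) for key, value in values.items()})
    print("accuracy", round(accuracy(EXAMPLE_CM), 4))
```

Treating each class one-vs-rest is one common way to extend the binary definitions of sensitivity and specificity to the three-class setting used here; the paper does not state which averaging scheme its tables use.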
Figure 10a shows the confusion matrices obtained for the training and test sets, respectively. Figure 10b shows that the misclassified samples are negligible. Figures 5, 7, and 8 above show the training and testing performance as the models train over successive epochs. We can see that our ResNeXt model is prone to over-fitting, but it still gives good accuracy. Moreover, Xception Net has the best performance among all the discussed models.

The COVID-19 pandemic is growing manifold daily. With the ever-increasing number of cases, rapid bulk testing may be required. In this work, we experimented with multiple CNN models in an attempt to classify COVID-19 patients using their chest X-ray scans. We concluded that, of these three models, Xception Net has the best performance and is best suited for use. We have successfully classified COVID-19 scans, which indicates the possible scope for applying such techniques in the near future to automate diagnosis tasks. The high accuracy obtained may be a cause for concern, since it may be a result of over-fitting. This can be verified by testing the models against new data as it is made public. In the future, a larger dataset of chest X-rays can be used to validate our proposed model. We also advise consulting medical professionals for any practical use of this project. We do not intend to develop a perfect detection mechanism but only to research possible economically feasible ways to combat this disease. Such methods may be pursued in further research to prove their real-world applicability.
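The classification head described in Steps 4 to 8 of the method, together with the LeakyReLU activation and the categorical cross-entropy loss, can be sketched in plain Python without any deep learning library. This is only an illustration under stated assumptions: the toy 2 × 2 × 2 feature map stands in for the output of a pre-trained backbone, the layer sizes (4 hidden units instead of 256 or 128), the random weights, and the LeakyReLU slope of 0.01 are all hypothetical, since the paper does not report the slope it used.

```python
import math
import random
from typing import List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weights, bias)


def flatten(tensor) -> List[float]:
    """Step 4: collapse a nested list (e.g. height x width x channels) into one vector."""
    if isinstance(tensor, (int, float)):
        return [float(tensor)]
    flat: List[float] = []
    for item in tensor:
        flat.extend(flatten(item))
    return flat


def dense(x: Sequence[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    """Steps 5 and 7: z = Wx + b, with one row of W per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def leaky_relu(z: Sequence[float], alpha: float = 0.01) -> List[float]:
    """Step 6: negative inputs keep a small slope alpha (assumed value) instead of zero."""
    return [v if v > 0 else alpha * v for v in z]


def softmax(z: Sequence[float]) -> List[float]:
    """Step 8: convert scores to class probabilities (max-shifted for stability)."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(y_true: Sequence[float], y_pred: Sequence[float],
                              eps: float = 1e-12) -> float:
    """L = -sum_i y_i * log(y_hat_i), with predictions clipped away from zero."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


def random_layer(rng: random.Random, n_in: int, n_out: int) -> Layer:
    """Hypothetical initialisation: small uniform weights and zero bias."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def classify(feature_map, hidden: Layer, output: Layer) -> List[float]:
    """Flatten -> dense -> LeakyReLU -> dense -> softmax."""
    x = flatten(feature_map)
    h = leaky_relu(dense(x, *hidden))
    return softmax(dense(h, *output))


rng = random.Random(0)
# Toy stand-in for the last convolution output of a pre-trained model (Step 3).
FEATURE_MAP = [[[0.2, -1.0], [0.5, 0.1]], [[-0.3, 0.8], [1.2, -0.7]]]
HIDDEN = random_layer(rng, 8, 4)   # 8 flattened inputs -> 4 hidden units
OUTPUT = random_layer(rng, 4, 3)   # 4 hidden units -> 3 classes
PROBS = classify(FEATURE_MAP, HIDDEN, OUTPUT)
LOSS = categorical_cross_entropy([1.0, 0.0, 0.0], PROBS)

if __name__ == "__main__":
    print("class probabilities:", [round(p, 4) for p in PROBS])
    print("loss for true class 0:", round(LOSS, 4))
```

Because `leaky_relu` never returns an exactly zero slope, a unit that receives negative inputs still passes a small gradient, which is the dead-neuron argument made for LeakyReLU in the method description.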
Severe acute respiratory syndrome: temporal lung changes at thin-section CT in 30 patients
Severe acute respiratory syndrome: radiographic appearances and pattern of progression in 138 patients
Mining x-ray images of SARS patients
Clinical features of patients infected with 2019 novel coronavirus in
Abnormal respiratory patterns classifier may contribute to large-scale screening of people infected with COVID-19 in an accurate and unobtrusive manner
Radiographic and CT features of viral pneumonia
Artificial intelligence distinguishes covid-19 from community acquired pneumonia on chest ct
Fleischner society: glossary of terms for thoracic imaging
Automatic detection of coronavirus disease (covid-19) using x-ray images and deep convolutional neural networks
Covid-19: automatic detection from x-ray images utilising transfer learning with convolutional neural networks
Recommendations for the prevention, mitigation and containment of the emerging SARS-CoV-2 (COVID-19) pandemic in haemodialysis centres
Real-time forecasts of the COVID-19 epidemic in China from
Prediction of criticality in patients with severe Covid-19 infection using three clinical features: a machine learning-based prognostic model with clinical data in Wuhan
A deep learning algorithm using CT images to screen for Corona Virus Disease
Middle East respiratory syndrome coronavirus (MERS-CoV) infection: chest CT findings
Chest CT findings in 2019 novel coronavirus (2019-nCoV) infections from Wuhan, China: key points for the radiologist
Covid-caps: a capsule network-based framework for identification of covid-19 cases from x-ray images
Reaz MBI (2020) Can AI help in screening viral and COVID-19 pneumonia
COVID-19) in France: surveillance, investigations and control measures
Automated Systems for Detection of COVID-19 using chest X-ray images and lightweight convolutional neural networks
Middle East respiratory syndrome-coronavirus infection: a case report of serial computed tomographic findings in a young male patient
Diagnosis of pneumonia from chest Xray images using deep learning
Pneumonia of unknown etiology in Wuhan, China: Potential for International Spread Via Commercial Air Travel
Effective pneumothorax detection for chest X-ray images using local binary pattern and support vector machine
CT imaging features of 2019 novel coronavirus
Regarding artificial intelligence distinguishes COVID-19 from community acquired pneumonia on chest CT
Human kidney is a target for novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection
Virologic and clinical characteristics for prognosis of severe COVID-19: a retrospective observational study in Wuhan
Le nouveau coronavirus Covid-19: quels risques ophtalmiques? J Francais D'Ophtalmologie
Automatic X-ray COVID-19 lung image classification system based on multi-level Thresholding and support vector machine
Covidx-net: a framework of deep learning classifiers to diagnose covid-19 in xray images
Detection of Covid-19 from chest X-ray images using artificial intelligence: an early review
Portable chest Xray in coronavirus disease-19 (COVID-19): a pictorial review
Diagnosing COVID-19 pneumonia from X-ray and CT images using deep learning and transfer learning algorithms
Coronavirus (COVID-19) classification using deep features fusion and ranking technique
Automated detection of COVID-19 cases using deep neural networks with X-ray images
Dual-Functional Plasmonic Photothermal Biosensors for Highly Accurate Severe Acute Respiratory Syndrome Coronavirus 2 Detection
Radiological findings from 81 patients with COVID-19 pneumonia in Wuhan, China: a descriptive study
Precise Prediction of COVID-19 in Chest X-Ray Images Using KE Sieve Algorithm
Chest X-rays findings in COVID 19 patients at a University Teaching Hospital-A descriptive study
Deep convolutional neural network based medical image classification for disease diagnosis
Covid-19 screening on chest x-ray images using deep learning based anomaly detection
Relation between chest CT findings and clinical conditions of coronavirus disease (COVID-19) pneumonia: a multicenter study
Chest X-ray (Covid-19 & Pneumonia)
A deep learning algorithm using CT images to screen for Corona virus disease
Rethinking the inception architecture for computer vision
Xception: deep learning with depthwise separable convolutions
Aggregated residual transformations for deep neural networks
Beyond accuracy, F-score and ROC: a family of discriminant measures for performance evaluation
Understanding and using sensitivity, specificity and predictive values

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.