key: cord-0928434-h9xl3nlh authors: Nigam, Bhawna; Nigam, Ayan; Jain, Rahul; Dodia, Shubham; Arora, Nidhi; B, Annappa title: COVID-19: Automatic Detection from X-ray images by utilizing Deep Learning Methods date: 2021-03-16 journal: Expert Syst Appl DOI: 10.1016/j.eswa.2021.114883 sha: bdfd503948c712dedda90cf5071bde43fa172e84 doc_id: 928434 cord_uid: h9xl3nlh In recent months, a novel virus named Coronavirus has emerged to become a pandemic. The virus is spreading not only among humans but is also affecting animals. The first ever case of Coronavirus was registered in the city of Wuhan, Hubei province of China, on 31st December 2019. Coronavirus-infected patients display symptoms very similar to pneumonia, and the virus attacks the respiratory organs of the body, causing difficulty in breathing. The disease is diagnosed using a Real-Time Reverse Transcriptase Polymerase Chain Reaction (RT-PCR) kit and requires laboratory time to confirm the presence of the virus. Due to insufficient availability of the kits, suspected patients cannot be tested in time, which in turn increases the chance of spreading the disease. To overcome this problem, radiologists observed the changes appearing in radiological images such as X-ray and CT scans. Using deep learning algorithms on the suspected patients' X-ray or Computed Tomography (CT) scans, a healthy person can be differentiated from a patient affected by Coronavirus. In this paper, popular deep learning architectures are used to develop a Coronavirus diagnostic system. The architectures used in this paper are VGG16, DenseNet121, Xception, NASNet, and EfficientNet. Multiclass classification is performed with three classes: COVID-19 positive patients, normal patients, and an other class. In the other class, chest X-ray images of pneumonia, influenza, and other illnesses related to the chest region are included.
The accuracies obtained for VGG16, DenseNet121, Xception, NASNet, and EfficientNet are 79.01%, 89.96%, 88.03%, 85.03%, and 93.48% respectively. Deep learning with radiologic images is necessary in this critical condition, as it provides the radiologists with a fast and accurate second opinion. These deep learning Coronavirus detection systems can also be useful in regions where expert physicians and well-equipped clinics are not easily accessible. Keywords: COVID-19, Deep Learning, Coronavirus, Pandemic. Coronavirus Disease 2019 (COVID-19) is one of the deadliest viruses found in the world, with a high death rate and spread rate. It has caused a pandemic across the world. Severe Acute Respiratory Syndrome CoronaVirus 2 (SARS-CoV-2) is the name coined by the International Committee on Taxonomy of Viruses (ICTV). The virus was first discovered in Wuhan, Hubei province of China, with reports of symptoms similar to pneumonia. The first case was registered on December 31, 2019. By January 3, 2020, the number of patients with similar symptoms had risen to 44, which was reported to the World Health Organization by the national authorities of China.
From the reported 44 cases, there were 11 patients with severe illness and 33 patients in stable condition (Wu et al., 2020). The virus spread to almost all of China within a span of 30 days. The spread of the virus affects both animals and humans. The number of cases registered in the United States of America was seven on January 20, 2020; the count quickly escalated to 300,000 by April 5, 2020 (Holshue et al., 2020). The virus spread via transportation to many countries, including some of the biggest nations such as the USA, Japan, and Germany. A standard and most common kit used for the diagnosis of the SARS-CoV-2 virus is RT-PCR (Lee et al., 2020). However, due to the fast spread of this virus, very few kits are available to diagnose it accurately in suspected patients. This shortage has posed one of the greatest threats to containing the infection. There seem to be no findings of the virus in the first two days when observed on a lung CT scan, and the early symptoms of SARS-CoV-2 are hard to identify, as the symptoms of the disease become visible only after ten days (Pan et al., 2020). In that time, the carrier of the virus may have contacted other people and infected many of them. To avoid this, there has to be a solution where suspected patients get faster confirmation of the presence of the disease. This can be provided by imaging modalities such as chest radiographs (X-rays) or Computed Tomography (CT) scan images. Radiologists can give an opinion by analyzing these images and applying image analysis methods to diagnose SARS-CoV-2 (Kanne et al., 2020). The radiologists state that X-ray and CT images contain vital information related to COVID-19 cases. Therefore, by combining radiologic images with Artificial Intelligence (AI) methods such as deep learning, an accurate and faster diagnosis can be performed to determine COVID-19 (Lee et al., 2020).
The most common symptoms found are usually cough, fever, breathlessness, fatigue, malaise, sore throat, etc. The elderly age group and people with respiratory issues are more prone to getting infected. The illness starts with pneumonia and Acute Respiratory Distress Syndrome (ARDS), leading to dysfunction of multiple organs. Some of the laboratory findings include a normal or low count of white cells, along with an increased amount of C-reactive protein (CRP). The initial step for preventing the spread of the disease is to keep the suspected patients in home quarantine. People with a higher rate of infection are kept under treatment at hospitals with strict infection control measures. In this paper, five popular deep learning architectures are used to develop the COVID-19 diagnostic system. Various sources are screened for collecting X-ray images. The raw input images usually contain unnecessary text information, poor-quality X-ray images, and different image dimensions. Before providing these images as input to the classifier, the images need to be preprocessed. Multiple challenges occurred in this step, such as text in the chest X-rays, dimension mismatch, data imbalance, etc., and multiple techniques are applied to handle these issues. After preprocessing, the images of the three classes are provided as input to five deep learning architectures, namely, VGG16, DenseNet121, Xception, NASNet, and EfficientNet. A detailed description of the study is discussed in the sections below. The contributions of this work are stated below:
• A deep learning COVID-19 diagnostic system is developed using state-of-the-art (SoTA) deep learning architectures.
• A detailed analysis of different methods used for detecting COVID-19 is presented in this paper.
• Chest X-rays from different sources are collected to build a robust classification model.
• The developed diagnostic model provided efficient results for a larger variety of input images.
• The results produced in this work are validated by an expert radiologist.
One related method exhibited very promising results (reported scores of 98.7 and an F-score of 99.6) using deep features extracted from an Inception model, with the decision provided by a tree-based classifier. However, the drawback of this method is the varying environments used for feature extraction and classification. Also, the method has been tested on very few images; the results may vary if a larger dataset is fed to the model. Another model provided better performance results (sensitivity of 97.91%); the limited-data issue in that work is handled by performing a data augmentation step. But augmenting the X-ray images may not be a proper solution for handling scarce data, as the location of the virus's spread may never be found correctly. To overcome this problem, only frontal images of the chest X-rays are selected and passed on for further processing in our work. In another method, abnormality localization is implemented along with COVID-19 detection; the results obtained are promising, and the use of CT images provided better visibility compared to X-ray images. The dataset in this work includes chest X-ray images collected from various private hospitals in the Maharashtra and Indore regions of India. The X-ray images are collected in the posteroanterior (PA) frontal chest view. The dataset is divided into three categories, namely, COVID, normal, and other. The COVID class consists of X-ray images of patients with COVID, the normal class consists of X-ray images of healthy patients, and the other class consists of patients with viral infections or diseases such as effusion, pneumonia, nodule masses, infiltration, hernia, etc. The number of X-ray images present in each class is given in Table 2. The model is trained on 70%, validated on 20%, and tested on 10% of the total chest X-ray images.
Preprocessing the input images is one of the important prerequisites in developing the diagnostic system. The classification is performed using five different deep learning architectures, namely, VGG16, DenseNet121, Xception, NASNet, and EfficientNet. These models are trained using transfer learning, each for 100 epochs. The models are pre-trained using ImageNet weights, which are also used as the starting point for fine-tuning. A detailed explanation of the models is given in the following sections. VGG16 is a CNN model developed by the Visual Geometry Group (VGG) at Oxford University (Simonyan & Zisserman, 2014). The network is a successor of AlexNet, which was developed in 2012. VGG16 consists of 16 weight layers: thirteen convolutional layers grouped into five convolution blocks and three fully connected layers, along with five max-pooling layers and one softmax layer. The architecture was developed for the ImageNet challenge. The convolution blocks' width starts at a small number (64 in the initial block) and is doubled after each max-pooling operation until it reaches 512. The model architecture is shown in Figure 1. The details of the architecture are as follows. The input image fed to VGG16 is of size 224 x 224. The kernel size is set to 3 x 3 with a stride of 1. By performing spatial padding, the spatial resolution of the image is preserved. The pool size of the max-pooling operation is set to 2 x 2 with a stride of 2. In the fully connected layers, the first two layers have size 4096 and the last fully connected layer has size 1000, because of the 1000 classes in the ImageNet classification. In this work, the last dense layer is set to 3, because the number of classes used in this work is 3. Finally, the last layer applies a softmax function. The VGG16 network is made publicly available so that similar tasks can be performed using this model.
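The transfer-learning setup described above, a pre-trained VGG16 backbone with its 1000-way head replaced by a 3-class softmax, could be sketched in Keras roughly as follows; the optimizer, loss, and head sizes are assumptions for illustration, not the authors' reported configuration:

```python
import tensorflow as tf

def build_vgg16_classifier(num_classes=3, weights="imagenet"):
    """VGG16 backbone with the original 1000-way head replaced by a
    num_classes-way softmax, as used for the 3-class COVID/normal/other task."""
    base = tf.keras.applications.VGG16(
        weights=weights,           # ImageNet weights for transfer learning
        include_top=False,         # drop the original 1000-class head
        input_shape=(224, 224, 3),
    )
    x = tf.keras.layers.Flatten()(base.output)
    x = tf.keras.layers.Dense(4096, activation="relu")(x)   # two 4096-unit FC layers
    x = tf.keras.layers.Dense(4096, activation="relu")(x)   # as in the VGG16 design
    out = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    model = tf.keras.Model(base.input, out)
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# weights=None here only to avoid downloading ImageNet weights in this demo
model = build_vgg16_classifier(weights=None)
print(model.output_shape)  # (None, 3)
```

The same pattern (swap the classification head, keep the pre-trained backbone) applies to the other four architectures via their `tf.keras.applications` counterparts.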
The model can also be used for transfer learning, as the pre-trained weights are available in frameworks such as Keras; these can be used to develop one's own models with slight modifications. Densely connected Convolutional Networks are called DenseNets (Huang et al., 2016). This is another way of increasing the depth of deep convolutional networks without suffering from issues such as exploding and vanishing gradients. These issues are solved by connecting every layer directly with every other layer, which allows maximum information and gradient flow. The main idea here is to exploit feature reuse instead of drawing representational power from extremely deep or wide architectures. One thing to be taken care of for the concatenation of the feature maps is that their sizes must be consistent, which means that the convolutional layer's output must be of the same spatial size as its input. The working of densely concatenated convolutions can be represented by the following equation:

x_l = H_l([x_0, x_1, ..., x_(l-1)])     (1)

where H_l represents the composite function of the l-th layer and x_l represents the output of the l-th layer. In the above equation, each layer's input is the concatenation of the outputs of all preceding layers. To normalize the input of a layer, a Batch Normalization (Ioffe & Szegedy, 2015) step is included in the network, which reduces the absolute differences between data points and takes the relative differences into consideration. The Xception network is a successor of the Inception network; the name Xception stands for "eXtreme Inception". The Xception network consists of depth-wise separable convolution layers instead of conventional convolution layers (Chollet, 2017). The schematic representation of a block in Xception is demonstrated in Figure 3. Xception maps spatial correlations and cross-channel correlations in CNN feature maps in an entirely decoupled manner, and it performed better than the underlying Inception architecture.
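The dense connectivity of equation (1), where each layer receives the concatenation of all earlier feature maps, can be illustrated with a toy NumPy sketch. The stand-in `h` function and the growth rate of 32 are illustrative assumptions; a real DenseNet layer applies Batch Normalization, ReLU, and a 3 x 3 convolution:

```python
import numpy as np

def h(x, out_channels=32):
    """Stand-in for the composite function H_l: a random channel-mixing
    linear map plus ReLU, keeping the spatial size fixed so that
    feature maps remain concatenable."""
    rng = np.random.default_rng(0)
    w = rng.standard_normal((x.shape[-1], out_channels))
    return np.maximum(x @ w, 0.0)  # (H, W, C_in) -> (H, W, out_channels)

# Dense connectivity: layer l sees the concatenation of all earlier outputs.
x0 = np.ones((8, 8, 16))           # initial feature map with 16 channels
features = [x0]
growth = 32                        # each layer adds `growth` channels
for l in range(3):
    x_l = h(np.concatenate(features, axis=-1), growth)  # x_l = H_l([x_0,...,x_{l-1}])
    features.append(x_l)

out = np.concatenate(features, axis=-1)
print(out.shape)  # (8, 8, 112): 16 initial + 3 layers * 32 channels
```

The channel count grows linearly with depth (16 + 3 x 32 = 112 here), which is exactly why DenseNet layers can stay narrow while still reusing all earlier features.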
The Xception model consists of 36 convolution layers, which can be divided into 14 modules. Except for the first and last modules, each module has a linear residual connection around it. In short, Xception is a linear stack of depth-wise separable convolution layers with residual connections. To capture the cross-channel correlations in an input image, the spatial correlations are first mapped for each channel separately, after which a point-wise 1 x 1 convolution operation is carried out. The correlations can thus be seen as a 2D+1D mapping instead of a 3D mapping: in Xception, 2D spatial correlations are handled first, followed by 1D cross-channel correlations. An alternative method of preventing the spread of the SARS-CoV-2 virus is essential in a pandemic situation like this. The models used in this paper can be useful in providing a second opinion to the radiologists to differentiate between the radiologic images obtained from suspected patients. This can lessen the patient-testing time compared to RT-PCR kits and provide results in a short duration of time. As mentioned earlier in the preprocessing section, only the X-ray image regions are detected in the raw input images, excluding the text and other unnecessary details. This operation is performed using YOLO, and the preprocessing is illustrated in Figure 6 (Illustration of X-ray image region selection from raw input images). The models were run on an NVIDIA Tesla K80 machine, each for 100 epochs. The parameters used for training the models are presented in Table 3, along with the accuracy achieved in training and testing. For each X-ray image, a probability score is given indicating the class to which the image belongs. In Figure 7, the heatmap of an X-ray image and its original image, extracted from the EfficientNet model, are provided.
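The parameter savings of the depth-wise separable factorization described above (a per-channel spatial convolution followed by a 1 x 1 cross-channel convolution) can be checked with simple arithmetic; the channel counts chosen below are illustrative:

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases omitted)."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Depth-wise separable convolution: one k x k filter per input channel
    (the 2D spatial part), then a 1 x 1 point-wise convolution across
    channels (the 1D cross-channel part)."""
    depthwise = k * k * c_in      # per-channel spatial filtering
    pointwise = c_in * c_out      # 1 x 1 cross-channel mixing
    return depthwise + pointwise

k, c_in, c_out = 3, 256, 256
print(conv_params(k, c_in, c_out))            # 589824
print(separable_conv_params(k, c_in, c_out))  # 67840
```

For this 3 x 3, 256-to-256-channel layer the factorized form uses roughly 8.7x fewer weights, which is why Xception can match Inception's parameter budget while using its capacity more efficiently.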
The heatmap is extracted using the Grad-CAM method. With respect to the activation maps in the model, the gradient of the most dominant logit is computed for the heatmap. A channel-wise pooling of these gradients is performed, and the activation channels are weighted with their corresponding pooled gradients, resulting in a collection of weighted activation channels. By inspecting these channels, it can be decided which of them play a significant role in the decision of the class. The probability values obtained for the example heatmap are: COVID class with a probability of 0.0%, normal class with a probability of 3.11%, and others class with a probability of 100.0%. The performance of the five architectures used in the study is presented in the confusion matrices illustrated in Figure 8. By observing the confusion matrices, it can be seen that detection of the virus's presence in the human body can be performed using the given models. The five architectures are evaluated based on the precision, recall, and F1-score performance metrics. The class-wise performance of the models is compared with Zhang et al. (2020) for COVID-19 positive and negative cases, and the models performed exceptionally well. Some of the misclassified samples are shown in Figure 9. Two misclassified samples are presented in the figure; both images belong to the COVID class but are misclassified as normal. The reason for the misclassification of the first image is the appearance of a breast shadow; the reason for the second is the over-exposure of the X-ray image. The loss curves show the depletion of the loss as the epochs increase, and Figure 11 illustrates the epoch-versus-accuracy graph for the training and testing accuracy of the five architectures mentioned above. In recent times, many COVID diagnosis systems have been developed; however, there are still some limitations in the current systems.
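The Grad-CAM procedure described above (channel-wise pooling of the gradients, then weighting the activation channels) can be sketched in NumPy; the array shapes and random inputs are placeholders for a real network's last convolutional layer activations and gradients:

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heatmap from the last conv layer's activations and the
    gradient of the dominant class logit w.r.t. those activations.
    Both arguments have shape (H, W, C)."""
    weights = gradients.mean(axis=(0, 1))        # channel-wise pooling of gradients
    cam = (activations * weights).sum(axis=-1)   # weight each activation channel
    cam = np.maximum(cam, 0.0)                   # keep positive class influence only
    if cam.max() > 0:
        cam /= cam.max()                         # normalize to [0, 1] for display
    return cam

# Placeholder 7 x 7 x 512 activations/gradients standing in for real ones
acts = np.random.default_rng(1).random((7, 7, 512))
grads = np.random.default_rng(2).random((7, 7, 512))
heatmap = grad_cam(acts, grads)
print(heatmap.shape)  # (7, 7)
```

The resulting low-resolution heatmap is then upsampled to the input size and overlaid on the X-ray, as in Figure 7.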
One of the main obstacles in developing a reliable model is the limited availability of chest X-ray datasets for public usage. Since deep learning models are data-driven, there is a need for larger datasets. To improve the time complexity of the deep learning models, model performance can also be improved by using pre-trained weights from previous SoTA medical chest X-ray diagnostic systems such as CheXpert, CheXNet, and CheXNeXt. Concentrating on the data collection aspect, the data used for training these models is not diverse, since the medical images for COVID are obtained from local hospitals. There is a need for a huge amount of data for all three classes, viz. COVID, normal, and others. Due to the limited data, the dataset size needs to be increased to keep the deep learning models from overfitting; therefore, a few augmentation techniques are applied to the data. Performing augmentation on medical images is not considered good practice in the field of medical imaging; however, some augmentation techniques such as affine transformations or noise addition can be used as per requirement. To improve the performance of the COVID diagnosis system, other SoTA models such as ResNet, MobileNet, and DenseNet169 can also be used. In addition, a classification-detection pipeline, which has provided good results in various Kaggle competitions, can be employed. In this work, chest X-ray images are considered; however, CT scans of the chest/thoracic region may be used to develop the COVID-19 diagnostic system, which can yield better performance. In this paper, popular and well-performing deep learning architectures are used for COVID-19 detection in suspected patients by analyzing X-ray images. Radiologic images such as X-rays or CT scans contain vital information. The models performed efficiently and provided results comparable to many SoTA Coronavirus detection systems. The virus has caused a pandemic and has affected the economy of the entire world.
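The conservative augmentations mentioned above (affine transformation, noise addition) might be sketched as follows; the shift range and noise level are illustrative assumptions, not the authors' settings:

```python
import numpy as np

def augment(image, rng, max_shift=10, noise_std=0.01):
    """Conservative augmentation for chest X-rays: a small integer-pixel
    translation (a simple affine transform) plus additive Gaussian noise.
    Heavy warps are avoided so anatomical structure is preserved."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    shifted = np.roll(image, (dy, dx), axis=(0, 1))   # translate by (dy, dx)
    noisy = shifted + rng.normal(0.0, noise_std, image.shape)
    return np.clip(noisy, 0.0, 1.0)                   # keep valid intensity range

rng = np.random.default_rng(0)
xray = rng.random((224, 224))        # placeholder grayscale image in [0, 1]
aug = augment(xray, rng)
print(aug.shape)  # (224, 224)
```

Flips are deliberately omitted: mirroring a PA chest X-ray swaps the apparent position of the heart and other lateralized anatomy, which can mislead a classifier.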
The global impact of the virus is still unknown. The spread rate of the virus is very high, and it affects both humans and animals. By analyzing the chest X-ray images, the models classify the healthy person, the Coronavirus-infected person, and the others class, i.e., non-COVID-19 illnesses. In this work, various SoTA deep learning architectures are used to perform COVID-19 classification on chest X-rays. The highest recognition accuracy, 93.48%, is achieved by the EfficientNet model. It is observed that deep learning models provide better and faster results by analysing the image data to identify the presence of COVID in a person. However, the performance of the system can still be improved using various deep learning architectures and also by increasing the size of the dataset. There is an urgent need to diagnose the presence of the virus in suspected patients, as this may prevent it from spreading further. Hence, the proposed system can be used as a tool that provides a faster and more accurate recognition of the virus's presence.

References
Classification of COVID-19 in chest X-ray images using DeTraC deep convolutional neural network
Deep transfer learning-based automated detection of COVID-19 from lung CT scan slices
COVID-19: automatic detection from X-ray images utilizing transfer learning with convolutional neural networks
Xception: Deep learning with depthwise separable convolutions
COVID-19 image data collection. arXiv
Mask R-CNN
COVIDX-Net: A framework of deep learning classifiers to diagnose COVID-19 in X-ray images
First case of 2019 novel coronavirus in the United States
Deep networks with stochastic depth
Batch normalization: Accelerating deep network training by reducing internal covariate shift
Essentials for radiologists on COVID-19: An update. Radiology scientific expert panel
COVID-19 pneumonia: what has CT taught us?
COVID-MobileXpert: On-device COVID-19 screening using snapshots of chest X-ray
Automatic detection of coronavirus disease (COVID-19) using X-ray images and deep convolutional neural networks
Deep learning COVID-19 features on CXR using limited training data sets
Automated detection of COVID-19 cases using deep neural networks with X-ray images
Time course of lung changes on chest CT during recovery from 2019 novel coronavirus
RSNA pneumonia detection challenge
Detection of coronavirus disease (COVID-19) based on deep features
Very deep convolutional networks for large-scale image recognition
COVID-19 image classification using deep features and fractional-order marine predators algorithm
EfficientNet: Rethinking model scaling for convolutional neural networks
COVID-Net: A tailored deep convolutional neural network design for detection of COVID-19 cases from chest radiography images
ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases
A new coronavirus associated with human respiratory disease in China
Deep learning system to screen coronavirus disease 2019 pneumonia
COVID-19 screening on chest X-ray images using deep learning based anomaly detection
Deep learning-based detection for COVID-19 from chest CT using weak label. medRxiv
Learning transferable architectures for scalable image recognition

Figure 11: Epochs versus accuracy graph for all the models