key: cord-0598390-9c8ktka5
authors: Abbas, Asmaa; Abdelsamea, Mohammed M.; Gaber, Mohamed Medhat
title: Classification of COVID-19 in chest X-ray images using DeTraC deep convolutional neural network
date: 2020-03-26
journal: nan
DOI: nan
sha: a15b28c5b9e2d3842fdb9893be668bc51d63dfbe
doc_id: 598390
cord_uid: 9c8ktka5

Chest X-ray is the first imaging technique that plays an important role in the diagnosis of COVID-19 disease. Thanks to the high availability of large-scale annotated image datasets, great success has been achieved using convolutional neural networks (CNNs) for image recognition and classification. However, due to the limited availability of annotated medical images, the classification of medical images remains the biggest challenge in medical diagnosis. Transfer learning is an effective mechanism that can provide a promising solution by transferring knowledge from generic object recognition tasks to domain-specific tasks. In this paper, we validate and adopt our previously developed CNN, called Decompose, Transfer, and Compose (DeTraC), for the classification of COVID-19 chest X-ray images. DeTraC can deal with any irregularities in the image dataset by investigating its class boundaries using a class decomposition mechanism. The experimental results showed the capability of DeTraC in the detection of COVID-19 cases from a comprehensive image dataset collected from several hospitals around the world. DeTraC achieved a high accuracy of 95.12% (with a sensitivity of 97.91%, a specificity of 91.87%, and a precision of 93.36%) in the detection of COVID-19 X-ray images from normal and severe acute respiratory syndrome cases.

Diagnosis of COVID-19 is typically associated with both the symptoms of pneumonia and chest X-ray tests. Chest X-ray is the first imaging technique that plays an important role in the diagnosis of COVID-19 disease. Fig.
1 shows a negative example of a normal chest X-ray, a positive one with COVID-19, and a positive one with severe acute respiratory syndrome (SARS). In the last few months, the World Health Organization (WHO) has declared that a new virus called COVID-19 has spread aggressively in several countries around the world [1]. Fast detection of COVID-19 can contribute to controlling the spread of the disease.

One of the most successful algorithms that has proved its ability to diagnose medical images with high accuracy is the convolutional neural network (CNN). For example, in [2], a CNN based on the Inception network was applied to detect COVID-19 disease in computed tomography (CT) images. In [3], a modified version of the pre-trained ResNet-50 network was used to classify CT images into three classes: healthy, COVID-19 and bacterial pneumonia. Chest X-ray (CXR) images were used in [4] by a CNN constructed from various ImageNet pre-trained models to extract high-level features. Those features were fed into a support vector machine (SVM) classifier in order to detect COVID-19 cases. Moreover, in [5], a CNN architecture called COVID-Net, based on transfer learning, was applied to classify CXR images into four classes: normal, bacterial infection, non-COVID and COVID-19 viral infection.

Several classical machine learning approaches have previously been used for automatic classification of digitised chest images [6, 7]. For instance, in [8], three statistical features were calculated from lung texture to discriminate between malignant and benign lung nodules using a support vector machine classifier. A grey-level co-occurrence matrix method was used with a backpropagation network [9] to classify images as normal or cancerous. With the availability of enough annotated images, deep learning approaches [10, 11] have demonstrated their superiority over the classical machine learning approaches.
CNN architecture is one of the most popular deep learning approaches, with superior achievements in the medical imaging domain [12]. The primary success of CNNs is due to their ability to learn features automatically from domain-specific images, unlike the classical machine learning methods. A popular strategy for training a CNN architecture is to transfer learned knowledge from a pre-trained network that fulfilled one task into a new task [13]. This method is fast and easy to apply without the need for a huge annotated dataset for training; therefore, many researchers tend to apply this strategy, especially in medical imaging. Class decomposition [14] has been proposed with the aim of enhancing low-variance classifiers by giving more flexibility to their decision boundaries.

In this paper, we adopt and validate DeTraC [15] for the classification of COVID-19 in chest X-ray images. This is done by adding a class decomposition layer to the pre-trained models. The class decomposition layer partitions each class within the image dataset into several sub-classes and assigns new labels to the new set, where each subset is treated as an independent class; those subsets are then assembled back to produce the final predictions. For the classification performance evaluation, we used chest X-ray images collected from several hospitals and institutions. The dataset poses challenging computer vision problems due to the intensity inhomogeneity in the images and irregularities in the data distribution.

DeTraC architecture overview
The DeTraC model consists of three phases. In the first phase, we train the backbone pre-trained CNN model of DeTraC to extract deep local features from each image. Then we apply the class-decomposition layer of DeTraC to simplify the local structure of the data distribution. In the second phase, the training is accomplished using a sophisticated gradient descent optimisation method.
Finally, we use the class-composition layer of DeTraC to refine the final classification of the images. As illustrated in Fig. 2, class decomposition and composition components are added respectively before and after knowledge transformation from an ImageNet pre-trained CNN model. The class decomposition component aims to partition each class within the image dataset into k sub-classes, where each sub-class is treated independently. Those sub-classes are then assembled back using the class-composition component to produce the final classification of the original image dataset.

A shallow-tuning mode was used during the adaptation and training of an ImageNet pre-trained CNN model using the collected chest X-ray image dataset. We used the off-the-shelf CNN features of models pre-trained on ImageNet (where the training is accomplished only on the final classification layer) to construct the image feature space. However, due to the high dimensionality associated with the images, we applied principal component analysis (PCA) to project the high-dimensional feature space into a lower-dimensional one, where highly correlated features were ignored. This step is important for the class decomposition to produce more homogeneous classes, reduce the memory requirements, and improve the efficiency of the framework.

Class decomposition
Now assume that our feature space (PCA's output) is represented by a 2-D matrix (denoted as dataset A): $A = \{a_1, a_2, \ldots, a_n\}$, where n is the number of images, $a_i = (a_{i1}, a_{i2}, \ldots, a_{im})$, and L is a class category. A and L can be rewritten as
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix}, \qquad L = \{l_1, l_2, \ldots, l_\kappa\},$$
where κ is the number of classes and m is the number of features. For class decomposition, we used k-means clustering [16] to further divide each class into homogeneous sub-classes, where each pattern in the original class L is assigned to a class label associated with the nearest centroid based on the squared Euclidean distance (SED):
$$\mathrm{SED}(a_i, c_j) = \lVert a_i - c_j \rVert_2^2 = \sum_{t=1}^{m} (a_{it} - c_{jt})^2,$$
where centroids are denoted as $c_j$.
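As an illustration of the decomposition step just described, the following is a minimal pure-Python sketch of per-class k-means using the squared Euclidean distance. It is not the authors' MATLAB implementation; the function names (`squared_euclidean`, `kmeans`, `decompose`, `compose`), the `class_index` label format, and the toy 2-D features are all assumptions made for illustration.

```python
import random

def squared_euclidean(a, c):
    """Squared Euclidean distance (SED) between a pattern a and a centroid c."""
    return sum((x - y) ** 2 for x, y in zip(a, c))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns the cluster index of every point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = None
    for _ in range(iters):
        # Assignment step: each pattern goes to its nearest centroid under SED.
        new_assign = [min(range(k), key=lambda j: squared_euclidean(p, centroids[j]))
                      for p in points]
        if new_assign == assign:  # assignments are stable, so we have converged
            break
        assign = new_assign
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def decompose(features, labels, k=2):
    """Split every class into k sub-classes, e.g. 'SARS' -> 'SARS_1', 'SARS_2'."""
    new_labels = [None] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        clusters = kmeans([features[i] for i in idx], k)
        for i, c in zip(idx, clusters):
            new_labels[i] = f"{cls}_{c + 1}"
    return new_labels

def compose(sub_label):
    """Map a sub-class label back to its parent class (the composition step)."""
    return sub_label.rsplit("_", 1)[0]

# Toy feature space standing in for the PCA output (hypothetical values).
features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0],
            [10.0, 0.0], [10.1, 0.0], [20.0, 0.0], [20.1, 0.0]]
labels = ["normal"] * 4 + ["COVID19"] * 4
sub_labels = decompose(features, labels, k=2)
print(sub_labels)
print([compose(s) for s in sub_labels])
```

Clustering is run separately inside each class, so a sub-class never mixes patterns from two original classes, and `compose` can recover the parent class from the sub-class label alone.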
Accordingly, class decomposition maps dataset A, labelled with L, to a new dataset B, labelled with C. The number of instances in A is equal to that in B, while C is the set of sub-class labels obtained by dividing each of the κ classes into k sub-classes:
$$C = \{l_{11}, l_{12}, \ldots, l_{1k}, l_{21}, l_{22}, \ldots, l_{2k}, \ldots, l_{\kappa 1}, l_{\kappa 2}, \ldots, l_{\kappa k}\}.$$
Datasets A and B are thus defined over the same instances and differ only in their label sets (L for A, C for B).

For transfer learning, we used the ImageNet pre-trained ResNet [17] model, which showed excellent performance on ImageNet with only 18 layers. Here we consider freezing the weights of the low-level layers and updating the weights of the high-level layers. For fine-tuning the parameters, the learning rate for all the CNN layers was fixed to 0.0001, except for the last fully connected layer (where it was 0.01); the mini-batch size was 64 with a minimum of 256 epochs; the weight decay was set to 0.001 to prevent overfitting during training; and the momentum value was 0.9. With the limited availability of training data, stochastic gradient descent (SGD) can cause the objective/loss function to fluctuate heavily, and hence overfitting can occur. To improve convergence and overcome overfitting, mini-batch stochastic gradient descent (mSGD) was used to minimise the cross-entropy objective function, E(·), computed between the ground-truth labels $y^j$ and the predictions $z(x^j)$, where $x^j$ is the set of input images in the training set and z(·) is the predicted output from a softmax function.

In the class decomposition layer of DeTraC, we divide each class within the image dataset into several sub-classes, where each sub-class is treated as a new independent class. In the composition phase, those sub-classes are assembled back to produce the final prediction based on the original image dataset. For performance evaluation, we adopted the accuracy (ACC), specificity (SP) and sensitivity (SN) metrics from the confusion matrix (as pointed out in [18]).

In our framework we used a combination of two datasets.
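The composition phase and the confusion-matrix metrics mentioned above can be sketched as follows. This is a hedged illustration rather than the authors' evaluation code: ACC, SN and SP use their standard definitions, but treating COVID-19 as the positive class in a one-vs-rest setting, the `class_index` sub-label format, and the toy predictions are assumptions.

```python
def compose(sub_label):
    """Map a sub-class prediction such as 'COVID19_2' back to 'COVID19'."""
    return sub_label.rsplit("_", 1)[0]

def binary_counts(y_true, y_pred, positive):
    """One-vs-rest confusion-matrix counts for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, tn, fn

def evaluate(tp, fp, tn, fn):
    """Accuracy (ACC), sensitivity (SN) and specificity (SP)."""
    return {
        "ACC": (tp + tn) / (tp + fp + tn + fn),
        "SN": tp / (tp + fn),  # true positive rate
        "SP": tn / (tn + fp),  # true negative rate
    }

# Hypothetical ground truth (original classes) and sub-class predictions.
y_true = ["COVID19", "COVID19", "normal", "SARS", "normal"]
sub_pred = ["COVID19_1", "COVID19_2", "normal_1", "COVID19_1", "normal_2"]

# Composition: errors between sub-classes of the same parent class vanish here.
y_pred = [compose(s) for s in sub_pred]
scores = evaluate(*binary_counts(y_true, y_pred, positive="COVID19"))
print(scores)
```

Because metrics are computed after composition, a sample predicted as the wrong sub-class of the right parent class still counts as correct.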
We used 80 samples of normal CXRs (with 4020 × 4892 pixels) from the Japanese Society of Radiological Technology (JSRT) [19, 20] and another image set containing 105 and 11 samples of COVID-19 and SARS (with 4248 × 3480 pixels), respectively, from [21]. We applied different data augmentation techniques to generate more samples, such as flipping up/down and right/left, translation, and rotation using five random angles. This process resulted in a total of 1764 samples, that is, 9 times the 196 original images (80 + 105 + 11). Also, a histogram modification technique was applied to enhance the contrast of each image.

First, we used the AlexNet [22] pre-trained network in shallow-tuning mode to extract discriminative features of the three original classes. AlexNet is composed of 5 convolutional layers to represent learned features and 3 fully connected layers for the classification task. AlexNet uses 3 × 3 max-pooling layers with ReLU activation functions and three different kernel filters. We adapted the last fully connected layer to three classes and initialised the weight parameters for our specific classification task. Second, we used k-means clustering [16] to apply the decomposition step and divide each class into two sub-classes (i.e. k = 2). Finally, we assigned the new labels to the new sets, where each subset is treated as an independent class. More precisely, we constructed a new dataset (which we call dataset B) with six classes ($\text{norm}_1$, $\text{norm}_2$, $\text{COVID19}_1$, $\text{COVID19}_2$, $\text{SARS}_1$, and $\text{SARS}_2$); see Table 1.

All the experiments in our work were carried out in MATLAB 2019a on a PC with the following configuration: a 3.70 GHz Intel(R) Core(TM) i3-6100 Duo processor, an NVIDIA Quadro P5000 GPU (donated by NVIDIA Corporation), and 8.00 GB of RAM. The dataset was divided into two groups: 70% for training the model and 30% for evaluating the classification performance. We used ResNet18 as an ImageNet pre-trained network in our experiment.
ResNet18 [23] consists of 18 layers with an input image size of 224 × 224, and it achieved an effective performance with an accuracy of 95.12%. The last fully connected layer was adapted to the new task to classify six classes. The learning rate for all the CNN layers was fixed to 0.0001, except for the last fully connected layer (where it was 0.01) to accelerate the learning. The mini-batch size was 64 with a minimum of 100 epochs, the weight decay was set to 0.0001 to prevent overfitting during training, and the momentum value was 0.95. The learning rate was dropped by a factor of 0.95 every 5 epochs. DeTraC-ResNet18 was trained in deep-tuning mode. For performance evaluation, we adopted some metrics from the confusion matrix, such as accuracy, sensitivity, specificity, and precision. The results are summarised in Table 2. We plot the accuracy and loss learning curves for training and test as shown in Fig. 3. Also, the area under the receiver operating characteristic curve (AUC) was computed as shown in Fig. 4. To demonstrate the robustness of DeTraC-ResNet18 in the classification of COVID-19 images, we compare it with ResNet18 using the same settings. ResNet18 achieved an accuracy of 92.5%, a sensitivity of 65.01%, a specificity of 94.3%, and a precision of 94.5%.

Training CNNs can be accomplished using two different strategies. They can be used as an end-to-end network, where an enormous number of annotated images must be provided (which is impractical in medical imaging). Alternatively, transfer learning usually provides an effective solution with the limited availability of annotated images by transferring knowledge from pre-trained CNNs (that have been learned from a benchmark large-scale image dataset) to the specific medical imaging task. Transfer learning can be further accomplished by three main scenarios: shallow-tuning, fine-tuning, or deep-tuning.
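To make the three scenarios concrete, the sketch below marks which layers of a pre-trained network would be updated in each mode. Shallow-tuning (training only the final classification layer) and fine-tuning (freezing low-level layers and updating high-level ones) follow the descriptions given earlier in the paper; reading deep-tuning as retraining every layer is an assumption here, as are the function name `trainable_mask`, the `n_frozen` parameter, and the layer names.

```python
def trainable_mask(layer_names, mode, n_frozen=0):
    """Return one boolean per layer: True if its weights are updated in training.

    mode:
      'shallow' - only the final classification layer is trained
      'fine'    - the first n_frozen (low-level) layers are frozen, the rest trained
      'deep'    - every layer is retrained (assumed meaning of deep-tuning)
    """
    n = len(layer_names)
    if mode == "shallow":
        return [i == n - 1 for i in range(n)]
    if mode == "fine":
        if not 0 <= n_frozen <= n:
            raise ValueError("n_frozen must be between 0 and the number of layers")
        return [i >= n_frozen for i in range(n)]
    if mode == "deep":
        return [True] * n
    raise ValueError(f"unknown tuning mode: {mode!r}")

# Hypothetical layer names for a small backbone followed by a classifier.
layers = ["conv1", "conv2", "conv3", "fc"]
for mode, kwargs in [("shallow", {}), ("fine", {"n_frozen": 2}), ("deep", {})]:
    mask = trainable_mask(layers, mode, **kwargs)
    print(mode, dict(zip(layers, mask)))
```

In a real framework the mask would be applied by disabling gradient updates, or setting the learning rate to zero, for the layers marked `False`.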
However, data irregularities, especially in medical imaging applications, remain a challenging problem that usually results in miscalibration between the different classes in the dataset. CNNs can provide an effective and robust solution for the detection of COVID-19 cases from chest X-ray (CXR) images, and this can contribute to controlling the spread of the disease. Here, we adopt and validate our previously developed deep convolutional neural network, called DeTraC, to deal with such a challenging problem by exploiting the advantages of class decomposition within CNNs for image classification. DeTraC achieved a high accuracy of 95.12% with ResNet on CXR images.

In this paper, we used the DeTraC deep CNN architecture, which relies on a class decomposition approach, for the classification of COVID-19 images in a comprehensive dataset of chest X-ray images. DeTraC provided an effective and robust solution for the classification of COVID-19 cases and showed its ability to cope with data irregularity and a limited number of training images.

References
[1] Coronavirus disease 2019 (COVID-19): situation report, 51
[2] A deep learning algorithm using CT images to screen for Corona Virus Disease (COVID-19). medRxiv
[3] Deep learning Enables Accurate Diagnosis of Novel Coronavirus (COVID-19) with CT images. medRxiv
[4] Detection of Coronavirus Disease (COVID-19) Based on Deep Features
[5] COVID-Net: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest Radiography Images
[6] Artificial neural network-based classification system for lung nodules on computed tomography scans
[7] Lung cancer classification using neural networks for CT images
[8] Lung cancer detection using fuzzy auto-seed cluster means morphological segmentation and SVM classifier
[9] Lung tumour detection and classification using EK-Mean clustering
[10] Lung pattern classification for interstitial lung diseases using a deep convolutional neural network
[11] Computer aided lung cancer diagnosis with deep learning algorithms
[12] Deep learning
[13] A survey on transfer learning
[14] Class decomposition via clustering: a new framework for low-variance classifiers
[15] DeTraC: Transfer Learning of Class Decomposed Medical Images in Convolutional Neural Networks. Under review
[16] Top 10 algorithms in data mining. Knowledge and Information Systems
[17] Deep residual learning for image recognition
[18] Selecting and interpreting measures of thematic classification accuracy. Remote Sensing of Environment
[19] Lung Segmentation in Chest Radiographs Using Anatomical Atlases With Nonrigid Registration
[20] Automatic Tuberculosis Screening Using Chest Radiographs
[21] COVID-19 image data collection
[22] ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems
[23] Inception-v4, Inception-ResNet and the impact of residual connections on learning