key: cord-0560421-krohs84t
authors: Khan, Saddam Hussain; Sohail, Anabia; Khan, Asifullah; Lee, Yeon Soo
title: Classification and Region Analysis of COVID-19 Infection using Lung CT Images and Deep Convolutional Neural Networks
date: 2020-09-16
journal: nan
DOI: nan
sha: 48dba8830df54540c432dae7f60774cf225b05cc
doc_id: 560421
cord_uid: krohs84t

COVID-19 is a global health problem. Consequently, early detection and analysis of the infection patterns are crucial for controlling the spread of infection as well as for devising a treatment plan. This work proposes a two-stage deep Convolutional Neural Network (CNN) based framework for delineation of COVID-19 infected regions in lung CT images. In the first stage, COVID-19 specific CT image features are enhanced using a two-level discrete wavelet transformation. These enhanced CT images are then classified using the proposed custom-made deep CoV-CTNet. In the second stage, the CT images classified as infectious are provided to the segmentation models for the identification and analysis of COVID-19 infectious regions. In this regard, we propose a novel semantic segmentation model, CoV-RASeg, which systematically uses average and max pooling operations in the encoder and decoder blocks. This systematic use of max and average pooling helps the proposed CoV-RASeg to learn both the boundaries and region homogeneity simultaneously. Moreover, the idea of attention is incorporated to deal with mildly infected regions. The proposed two-stage framework is evaluated on a standard lung CT image dataset, and its performance is compared with existing deep CNN models. The performance of the proposed CoV-CTNet is evaluated using the Matthews Correlation Coefficient (MCC) measure (0.98) and that of the proposed CoV-RASeg using the Dice Similarity (DS) score (0.95).
The promising results on an unseen test set suggest that the proposed framework has the potential to help radiologists in the identification and analysis of COVID-19 infected regions.

December 2019; however, in early 2020 it spread across the world [1], and to date it persists with devastating effects across all continents [2]. COVID-19 most commonly manifests as mild flu-like symptoms such as cough, fever, myalgia, and fatigue [3]. However, in severe cases, it causes alveolar damage, pneumonia, sepsis, and respiratory failure, which eventually lead to death [4]. The commonly used tests for the assessment of COVID-19 patients are gene sequencing, reverse transcription polymerase chain reaction (RT-PCR), and X-ray and computed tomography (CT) based imaging techniques [5], [6]. Of these assays, RT-PCR and gene sequencing are considered the gold standard. However, because these standard assays are expensive, many developing countries have a limited number of testing kits or lack sequencing facilities. RT-PCR usually takes up to 2 days for detection and often suffers from the inherent limitation of viral RNA instability. Thus, it requires serial testing to reduce the likelihood of false-negative results (the RT-PCR detection rate is ~30% to 60%) and necessitates additional supplementary tests [7]-[9]. Additional detection methods with high precision are therefore required for immediate treatment of infected persons and to halt the transmission of COVID-19 infection. CT-based imaging is not expensive, is available in hospitals, and is a reliable, practical, and efficient tool for detection, prognosis, and follow-up of COVID-19 patients.
Several examination setups have shown the reliable diagnostic power of CT imaging in capturing lung abnormalities in COVID-19 infected individuals, even when characteristic clinical symptoms are imperceptible and RT-PCR reports a false negative [10], [11]. The characteristic radiographic features of infected patients are bilateral patchy shadows, ground-glass opacification (GGO), consolidation, pleural effusion, rounded morphology, and peripheral lung distribution [4], [12]. In a public health emergency, especially during epidemics and pandemics, there is a significant burden on hospitals and physicians. Visual analysis of a large number of CT images is a huge burden on radiologists, many areas have no experienced radiologists, and the increased workload affects performance. Moreover, radiologists are less sensitive and more specific when identifying COVID-19 infection from CT images. There is therefore an urgent need for an automated tool that can help radiologists improve performance and deal with a large number of patients. Deep learning (DL) based diagnostic systems are valuable tools for assisting physicians. Previously, DL based automated systems have been employed to help radiologists identify lung-related anomalies [13], [14]. The advantage of such systems is that they are reproducible and can detect minute irregularities that cannot be located through visual examination. During the ongoing COVID-19 pandemic, several research groups have worked on automated systems that identify COVID-19 infected individuals by examining CT images [15]-[17]. In this work, we proposed a classification and segmentation based deep Convolutional

The layout of the paper is as follows: Section 2 gives an insight into COVID-19 related work.
Section 3 explains the methodology of the proposed framework, whereas Section 4 presents the implementation details. Section 5 analyzes the results and discusses the performance of the implemented experiments, and finally, Section 6 concludes the paper.

Nowadays, CT technology has been used for the analysis of COVID-19 [21]. COVID-Net was inspired by ResNet and was used to differentiate multiple types of COVID-19 infection from normal pneumonia. Although COVID-Net has good accuracy (92%), it has a low detection rate (87%) [22]. Similarly, COVID-CAPS, inspired by Capsule Net, also reported good accuracy (98%), but it is less sensitive (80%) towards COVID-19 infection [23]. Moreover, a novel classification model, COVID-RENet, inspired by the smoothness and boundary information of images, achieved 98% accuracy. All these pre-trained CNN models have been trained on natural images and fine-tuned on the COVID-19 dataset [24]. On the other hand, segmentation of the infected regions is usually performed to identify the location and severity of the disease. Initially, some classical segmentation techniques such as watershed were employed, but they failed to show satisfactory performance [25], [26]. Therefore, the DL based 'VB-Net' was introduced for the segmentation of COVID-19 infected regions in CT images and reported a dice similarity (DS) score of 91%. Moreover, the COVID-19 JCS system, based on classification and segmentation, has been developed to visualize and segment the infected region. The JCS system reported a 95.0% detection rate and 93.0% specificity on classification, but a low DS score (78.3%) on segmentation. However, most of the existing COVID-19 diagnostic systems have been trained on a small amount of CT data.
Usually, these diagnostic systems face two main challenges: 1) unavailability of a sufficient amount of training data, which is required to make deep CNN models robust towards diverse categories of COVID-19 infections; 2) detection is restricted to the classification of infected samples and lacks information about the infected region and the severity of the disease.

In this study, we proposed a classification and segmentation based deep CNN framework for automatic analysis of COVID-19 abnormalities in lung CT images. The proposed framework is

In this stage, at a coarse scale, COVID-19 infected CT samples are segregated from healthy samples by performing classification. Initially, the feature space of the dataset is transformed by employing wavelet-based decomposition (shown in Figure 2) to provide class-discriminating features to the deep classifiers. In the classification stage, three different experimental setups are implemented: (i) Proposed CoV-CTNet (the detailed explanation of this term will be given in

The CT image is subjected to discrete wavelet transformation (DWT) to transform the feature space by decomposing the image into discrete wavelet coefficients using the Haar mother wavelet. DWT has two main advantages: (i) reduction in computational complexity, and (ii) image enhancement by transforming the original image into information-rich feature-maps [27], [28]. In DWT, at each decomposition level, the input image is divided into four equal parts: low-low ($D^{LL}_{i,s=2}$), low-high ($D^{LH}_{i,s=2}$), high-low ($D^{HL}_{i,s=2}$), and high-high ($D^{HH}_{i,s=2}$) resolution. Here, 'i' represents the level of decomposition (D) and 's' is a scaling factor. In this work, we performed two-level decomposition (shown in Figure 2) to select the highly informative feature-map. For this, the output (LL and HH) from the 1st level decomposition is further processed through DWT to extract an information-rich feature-map for classification.
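The decomposition described above can be illustrated with a minimal pure-Python sketch. It assumes orthonormal Haar filters and even image dimensions; the function names `haar_dwt2` and `haar_dwt2_two_level` are illustrative and do not come from the paper. For brevity the second level is applied to the LL band only, whereas the text above also processes HH, and the LH/HL naming convention varies between libraries.

```python
def haar_dwt2(img):
    """One level of a 2-D orthonormal Haar DWT.

    img is a list of equal-length rows with even height and width.
    Returns four sub-bands (LL, LH, HL, HH), each half the input size.
    """
    rows, cols = len(img), len(img[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, cols, 2):
            a, b = img[r][c], img[r][c + 1]          # top pair of the 2x2 block
            d, e = img[r + 1][c], img[r + 1][c + 1]  # bottom pair of the block
            ll_row.append((a + b + d + e) / 2.0)  # approximation (low-low)
            lh_row.append((a - b + d - e) / 2.0)  # difference across columns
            hl_row.append((a + b - d - e) / 2.0)  # difference across rows
            hh_row.append((a - b - d + e) / 2.0)  # diagonal detail (high-high)
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


def haar_dwt2_two_level(img):
    """Decompose img once, then decompose its level-1 LL band again."""
    level1 = haar_dwt2(img)
    level2 = haar_dwt2(level1[0])
    return level1, level2


# Toy 4x4 "image" (illustrative values, not CT data).
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
(ll1, lh1, hl1, hh1), (ll2, lh2, hl2, hh2) = haar_dwt2_two_level(image)
print(ll1)  # [[7.0, 11.0], [23.0, 27.0]]
print(ll2)  # [[34.0]]
```

Because the filters are orthonormal, the sum of squared coefficients over the four sub-bands equals that of the input, which is a convenient sanity check for any implementation.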
The highly informative feature-maps are reconstructed back into images using the Inverse Discrete Wavelet Transform (IDWT). In this study, similar to the idea of the Laplacian of Gaussian, we have enhanced the image representation by fusing the images reconstructed from the LL and HH channels of the second-level DWT representations [29]. This transformed feature space is used to distinguish between healthy and COVID-19-infected images.

In this work, we proposed residual learning-based CNN CoV-CTNet that discriminates COVID-

Equation (5) shows the residual learning. The architectural design of the proposed CoV-CTNet is shown in Figure 3. To regulate model complexity and learn invariant features, a convolution operation with stride two is performed at the end of each block. In the proposed architecture, max pooling is used on top of the feature extraction stages to retain the most prominent class-specific feature information within the feature-maps. For classification, two fully connected layers (mathematically expressed in Equation (6)) are used to reduce the feature space by globally analyzing the contribution of the features to classification, and finally a fully connected layer with softmax is used to discriminate healthy lung CT samples from infected images. The proposed model is trained on IDWT enhanced images and optimized using the cross-entropy loss (represented in Equation (7)). In Equation (6), which represents the fully connected layer, $C_d^l$ is the convolved output having depth D, and $u_k$ is the k-th neuron of the l-th fully connected layer. In the cross-entropy loss (Equation (7)), $p_{CT}$ is the predicted class for the input CT image, and $y_{CT}$ is the actual class of the image.

Deep CNNs are a type of DL model that exploits the spatial correlation in images and has shown impressive results for detection, classification, and segmentation related tasks [31].
In recent years, CNNs have shown convincing results for biomedical images and have been successfully applied to classify and detect medical images [32].

connected layer that is used for classification is replaced with a target-specific layer consisting of two neurons. In contrast, the convolutional blocks of the feature extraction stage of the state-of-the-art models are kept unchanged. In the last layer, softmax is used to obtain the class-specific probabilities. These models are trained from scratch on CT images, and the weight space is optimized using the backpropagation algorithm by minimizing the cross-entropy based loss function.

Deep CNN architectures typically demand a sufficient amount of data for effective training. There are several types of transfer learning (TL); one effective approach is to initialize the parameters of deep architectures from pre-trained models and then fine-tune the learnable parameters to adapt them to the target domain. This strategy is employed when the dataset is small and the target domain shares a similarity with the source domain in terms of feature space or task [43]. In images, low-level features such as curves, edges, and gradients are usually common among different categories of images, whereas high-level features are class-specific. Based on this assumption, we adapted pre-trained deep architectures for the classification of lung CT images. TL based optimization of existing deep CNN models on the target data is performed by adding a new input convolutional layer that matches the size of the CT samples (i.e., 82x82x1). Similarly, the fully connected layers are replaced with target-domain specific classification layers, and the dimension of the last layer is aligned with the number of classes, i.e., two. The model is trained by fine-tuning the learnable layers of the feature extraction stage and by optimizing the weights of the classification layers from scratch.
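Both the from-scratch and the fine-tuned classifiers described above end in a two-neuron softmax layer trained with a cross-entropy loss. The following minimal sketch shows that final step in isolation. It uses the standard textbook definitions, which may differ in notation from the paper's Equation (7); the class indexing (0 = healthy, 1 = COVID-19 infected) and the function names are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_class):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[true_class])


def mean_cross_entropy(batch_logits, batch_labels):
    """Average cross-entropy over a batch of (logits, label) pairs."""
    losses = [cross_entropy(softmax(z), y)
              for z, y in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)


# Assumed class indexing: 0 = healthy, 1 = COVID-19 infected.
undecided = softmax([0.0, 0.0])
print(undecided)  # [0.5, 0.5]
```

A confident correct prediction yields a loss near zero, while an undecided output of [0.5, 0.5] costs ln 2 per sample; backpropagation adjusts the weights to reduce the batch average.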
Localization and quantification of the infectious region are crucial for analyzing the infection pattern and its extent in diagnostics. Therefore, after the discrimination of CT images at a coarse scale, semantic segmentation is performed to obtain a fine-grained delineation of the infectious regions in CT images. In this work, COVID-19 infected regions are segmented from the surrounding regions by treating the task as a binary semantic segmentation problem. Pixels of the infected regions are labelled as the positive class, whereas all other pixels are regarded as the background class. Semantic segmentation is a fine-scale pixel-based classification that labels each pixel with its corresponding object or region class [44], [45]. In this work, we implemented four different setups for segmentation: (i) the proposed COVID-19 based region approximation CNN CoV-RASeg for segmentation, (ii) target specific implementation of deep semantic segmentation models from scratch, (iii) pixel attention based implementation of deep semantic segmentation models, and (iv) TL based fine-tuning of deep semantic segmentation models. Details of the experimental setup are given below.

The proposed CNN based semantic segmentation architecture CoV-RASeg is similar to SegNet, consisting of two encoder and decoder blocks. The architectural design of the proposed CoV-RASeg is shown in Figure 4. In the proposed architecture, we redesigned the encoder and decoder blocks to enhance the network's feature learning capacity. For this purpose, we systematically incorporated average pooling with max pooling in the encoding stages (mathematically expressed in Equations (8) and (9)). In the decoder stage, we also implemented average pooling along with max un-pooling, in contrast to other deep CNN semantic segmentation models. Discriminating the COVID-19 infectious region from the background region is difficult because the border between the two regions is usually ill-defined and the infectious region overlaps with healthy lung sections.
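To make the pooling operations named above concrete, the sketch below implements 2x2 max pooling (recording the position of each maximum), 2x2 average pooling, and max un-pooling from the recorded positions. It is a pure-Python illustration on a toy feature-map, not the CoV-RASeg implementation; how the paper combines the two pooled outputs (Equations (8) and (9)) is not reproduced in this text, so the sketch simply computes them side by side. All function names are illustrative.

```python
def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2.

    Returns the pooled map and, for each output cell, the (row, col)
    position of the maximum in the input, which un-pooling needs later.
    """
    pooled, indices = [], []
    for r in range(0, len(fmap), 2):
        p_row, i_row = [], []
        for c in range(0, len(fmap[0]), 2):
            window = [(fmap[r + dr][c + dc], (r + dr, c + dc))
                      for dr in (0, 1) for dc in (0, 1)]
            value, pos = max(window, key=lambda t: t[0])  # first max wins ties
            p_row.append(value)
            i_row.append(pos)
        pooled.append(p_row)
        indices.append(i_row)
    return pooled, indices


def avg_pool_2x2(fmap):
    """2x2 average pooling with stride 2 (smooth regional summary)."""
    return [[(fmap[r][c] + fmap[r][c + 1]
              + fmap[r + 1][c] + fmap[r + 1][c + 1]) / 4.0
             for c in range(0, len(fmap[0]), 2)]
            for r in range(0, len(fmap), 2)]


def max_unpool_2x2(pooled, indices, shape):
    """Place each pooled value back at its recorded position; zeros elsewhere."""
    rows, cols = shape
    out = [[0.0] * cols for _ in range(rows)]
    for p_row, i_row in zip(pooled, indices):
        for value, (r, c) in zip(p_row, i_row):
            out[r][c] = value
    return out


# Toy 4x4 feature-map (illustrative values).
fmap = [[1.0, 2.0, 0.0, 0.0],
        [3.0, 4.0, 0.0, 8.0],
        [5.0, 5.0, 1.0, 1.0],
        [5.0, 5.0, 1.0, 1.0]]
max_map, max_idx = max_pool_2x2(fmap)
avg_map = avg_pool_2x2(fmap)
restored = max_unpool_2x2(max_map, max_idx, (4, 4))
print(max_map)  # [[4.0, 8.0], [5.0, 1.0]]
print(avg_map)  # [[2.5, 2.0], [5.0, 1.0]]
```

In the top-right window the isolated peak of 8.0 survives max pooling but is diluted to 2.0 by average pooling, while the two homogeneous bottom windows give identical results under both operations.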
Max pooling is used to learn the boundary information, whereas average pooling is used to determine characteristic COVID-19 infection patterns from the CT images. We used SegNet as a baseline model to evaluate the significance of systematically using max and average pooling in each encoder and decoder. We exploited an encoder-decoder architecture for fine-grained semantic segmentation because the encoding stages of such an architectural design are very good at learning semantically meaningful object-specific information. However, the feature encoding process loses spatial information that is required for object segmentation. Therefore, to localize infected regions on the original high-resolution image, we used the decoding stage to nonlinearly restore the spatial resolution of the encoder's feature-maps by utilizing max-pooling indices. In the last layer, a 2x2 convolutional operation with a sigmoid activation function is employed to classify each pixel as either infected or background region. The encoder and decoder are symmetrical in structure, with the difference that the max-pooling layers in the encoder are replaced by un-pooling layers in the corresponding decoder part (Figure 4).

Several DL models with different architectural designs are reported for semantic segmentation and are benchmarked against diverse categories of datasets [46].

In this work, we also incorporated the idea of assigning attention to each pixel during training, based on the representation of its class in the dataset, to deal with the scant representation of COVID-19 infected regions [51]. This is a type of static attention (SA), which enhances the foreground (COVID-19 infected region) by assigning this region a high weight while suppressing the background region pixels by associating a small weight with them. It also helps to learn the foreground region anatomies effectively.
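One common way to realize this kind of static, class-based pixel weighting is a class-weighted binary cross-entropy, sketched below. The inverse-frequency weighting formula is an assumption made for illustration, since the exact weighting scheme is not given in this text, and the function names are hypothetical.

```python
import math


def inverse_frequency_weights(masks):
    """Per-class weights from a list of 2-D binary masks (1 = infected).

    Assumed scheme: weight = total_pixels / (2 * class_pixels), so the
    rarer class receives the larger weight.
    """
    fg = sum(v for mask in masks for row in mask for v in row)
    total = sum(len(row) for mask in masks for row in mask)
    bg = total - fg
    if fg == 0 or bg == 0:
        raise ValueError("both classes must be present in the masks")
    return {0: total / (2.0 * bg), 1: total / (2.0 * fg)}


def weighted_pixel_bce(pred, mask, weights, eps=1e-7):
    """Mean class-weighted binary cross-entropy over all pixels."""
    loss, n = 0.0, 0
    for p_row, y_row in zip(pred, mask):
        for p, y in zip(p_row, y_row):
            p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
            pixel_loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
            loss += weights[y] * pixel_loss  # static weight chosen by true class
            n += 1
    return loss / n


# Toy mask with one infected pixel out of eight (illustrative only).
mask = [[0, 0, 0, 0],
        [0, 0, 0, 1]]
weights = inverse_frequency_weights([mask])
print(weights[1])  # 4.0
```

With these weights, an error on the single infected pixel contributes seven times as much to the loss as the same error on a background pixel, which is the effect the static attention is meant to achieve.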
This idea is incorporated in the proposed CoV-RASeg as well as in the existing deep CNN based semantic segmentation models.

For the development of the proposed framework, we used a standard CT image dataset provided by the SIRM [52]. As COVID-19 is a new disease, no other standard dataset is available. The dataset consists of 829 axial CT samples, of which 370 CT images show a COVID-19 infection pattern, whereas 459 images represent healthy samples. The dataset was examined by experienced radiologists, who also marked the infected lung sections. Each CT sample was paired with a radiologist-provided binary mask (ground truth), which is a fine-grained pixel-level binary label (shown in Figure 5). Moreover, the dataset covers different infection levels, namely mild, medium, and severe cases of COVID-19. All the images were resized from 512x512x3 to 304x304x3 for computational efficiency. The CT images for COVID-19 infected and healthy lung samples are shown in Figure 5. Cross-validation is employed during hyperparameter selection to improve the robustness and generalization of the models. We

The performance of the proposed two-stage CNN framework has been evaluated using standard metrics. The evaluation metrics, along with their abbreviations and mathematical definitions, are provided in Table 1. The classification metrics, which include accuracy (Acc), recall (R), specificity (S), precision (P), Matthews Correlation Coefficient (MCC), and F-score, are expressed in Equations (10)-(14). The segmentation models are evaluated in terms of segmentation accuracy (S-Acc), intersection over union (IoU), and the Dice Similarity (DS) coefficient, which are expressed in Equations (16) and (17), respectively.

In this work, we proposed a two-stage framework for the analysis of COVID-19 infected lungs.
The advantage of dividing the workflow into two stages is that the infected CT samples are screened first, so that the exploration of regions corresponding to the characteristic infection pattern need only be performed within the selected images. This staging process emulates the clinical workflow, in which patients are referred for further diagnostic tests based on initial screening. The results of the two stages are discussed below.

In this work, for initial screening, we proposed a deep CNN based classification model, CoV-CTNet, for the categorization of each sample as an infected or a healthy image. We optimized this stage to be highly sensitive in identifying COVID-19 symptoms with a minimum number of false positives. The learning potential of the proposed CoV-CTNet for COVID-19 specific CT features is evaluated by comparing its performance with existing models on the unseen test dataset. In the classification stage, we gain an improvement in detection rate (Figure 6, Table 2 & Table 3) by adding two enhancements. First, we enhanced the original image by fusing the high-frequency channel (HH) with the region approximation (LL) channel from the second-level decomposition of DWT. This fusion mimics the idea of the Laplacian of Gaussian and heightens the distinct characteristics of infected and healthy regions, and thus improves detection, which is evident from the MCC score and other performance metrics (Table 2 & Table 3). Secondly, we proposed a new CNN model, CoV-CTNet, in which we added fully connected layers with dropout to emphasize the learning of discriminatory CT based image features for COVID-19 infection. The proposed CoV-CTNet is evaluated on the test set using three performance metrics: Accuracy, F-score, and MCC. In addition, the sensitivity, specificity, and precision of the proposed classifier are also analyzed. The results of the proposed CoV-CTNet are presented in Table 2.
The proposed CoV-CTNet shows good generalization compared with the baseline ResNet-18 in terms of F-score (CoV-CTNet: 99%, ResNet-18: 97%), Accuracy (CoV-CTNet: 98.80%, ResNet-18: 97.17%), and MCC (CoV-CTNet: 98%, ResNet-18: 94%). Moreover, the good discrimination ability of the proposed CoV-CTNet is also evident from the decision function feature space (Figure 7). Performance of the proposed CoV-CTNet is compared with the nine different existing deep

Table 3 shows the comparison between the performance of TL-based fine-tuned models and models trained from scratch on the test dataset. Performance analysis suggests that TL-based fine-tuned models learn the COVID-19 specific features better than the deep CNN models that are trained from scratch on CT images. The proposed CoV-CTNet, in turn, shows better performance in terms of F-score, MCC, and accuracy (Table 3) than the existing deep CNNs, whether they are trained from scratch or fine-tuned. The gain in performance of the proposed CoV-CTNet compared with the best, average, and lowest MCC scores reported by the deep models is shown in Figure 8. Furthermore, the PR curve is used to quantitatively measure the discrimination power of the proposed model (shown in Figure 9). The PR curve is a performance measurement curve used for classification problems. It evaluates the generalization ability of the classifier by defining the degree of separability between two classes at different threshold values. Figure 9 shows the PR curves obtained for the proposed and baseline classification models on the test set. It shows that the proposed CoV-CTNet has a better learning capacity than the baseline and other existing deep CNN models. Although the AUC of the PR curve for CoV-CTNet is equal to that of DenseNet, overall CoV-CTNet achieved the highest F-score, Accuracy, and MCC compared with DenseNet and the other deep models (Table 3).
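For reference, the scores used in this section can all be derived from the four confusion counts. The sketch below uses the standard textbook definitions, which are assumed to match Table 1 since the table is not reproduced in this text. The same counts, taken over pixels rather than images, also give the Dice similarity and IoU used for the segmentation results. The counts in the example are illustrative and are not taken from the paper.

```python
import math


def _ratio(num, den):
    """Division that returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion counts.

    tp/fp/tn/fn are counts of images (classification) or of pixels
    (segmentation), with the infected class treated as positive.
    """
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "recall": recall,                      # sensitivity / detection rate
        "specificity": _ratio(tn, tn + fp),
        "precision": precision,
        "f_score": _ratio(2 * precision * recall, precision + recall),
        "mcc": _ratio(tp * tn - fp * fn, mcc_den),
        "dice": _ratio(2 * tp, 2 * tp + fp + fn),
        "iou": _ratio(tp, tp + fp + fn),
    }


# Illustrative counts only (not results from the paper).
scores = binary_metrics(tp=45, fp=5, tn=40, fn=10)
print(round(scores["accuracy"], 2))  # 0.85
print(round(scores["iou"], 2))       # 0.75
```

For a binary problem the Dice score and the F-score coincide, and MCC is the only one of these measures that uses all four counts symmetrically, which makes it informative when the two classes are unbalanced.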
CT images that are detected as COVID-19 infected in the classification stage using (Table 4). The convergence of the proposed CoV-RASeg on the training and validation datasets is shown in Figure 10, Table 5. Precise learning of the discriminating boundary is clear from the high value of BFS (97.92%). The proposed CoV-RASeg is benchmarked against SegNet. Table 5 Moreover, it can localize the infection precisely, whether it is located at a single location or spread across multiple distinct lobes or segments of the lungs. We have validated the proposed architecture CoV-RASeg by comparing its performance with seven popular semantic segmentation models (VGG16/19, FCN, SegNet, U-Net, U-SegNet, DeepLabV1/3). The comparison is shown in Table 5 & Table 6. The performance of the proposed CoV-RASeg is compared with the existing techniques in three different scenarios: training from scratch, training using attention, and TL-based fine-tuning of the architecture. The IoU and BFS bar plots show that the proposed model is better than or comparable to existing techniques when compared with their best, lowest, and average values (shown in Figure 12). (Table 5 & Table 6, Figure 13). The worst segmentation performance for the COVID-19 infected region is 49.69%, and for the background it is 61.06%. Our proposed model, which is small in size, shows performance comparable with the high-capacity fine-tuned DeepLabV3 models. In the provided dataset, the typical region of CT images or healthy samples dominates the COVID-19 infected areas. This under-representation usually affects the performance of segmentation models. To address this problem, we used a pixel attention strategy during training.
The incorporation of pixel weights consistently improves the segmentation for different categories of infections, which is evident from the visual quality of the segmentation output maps (Figure 11 & Figure 13) and the performance measures (Table 5); a significant improvement is noted for less severely infected lung sections. The gain in performance ranges from 0.05 to 0.20%, as shown in Table 5.

[1] Early COVID-19 pandemic: perspectives on an unfolding crisis
[2] Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan, China
[3] Evolution of the novel coronavirus from the ongoing Wuhan outbreak and modeling of its spike protein for risk of human transmission
[4] Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China
[5] Evaluating the accuracy of different respiratory specimens in the laboratory
[6] Laboratory diagnosis and monitoring the viral shedding of 2019-nCoV infections
[7] Artificial intelligence-enabled rapid diagnosis of patients with COVID-19
[8] Correlation of Chest CT and RT-PCR Testing in Coronavirus Disease 2019 (COVID-19) in China: A Report of 1014 Cases
[9] Sensitivity of Chest CT for COVID-19: Comparison to RT-PCR
[10] Coronavirus disease 2019 (COVID-19): A systematic review of imaging findings in 919 patients
[11] Diagnosis of the Coronavirus disease (COVID-19): rRT-PCR or CT?
[12] Radiological findings from 81 patients with COVID-19 pneumonia in Wuhan, China: a descriptive study
[13] Development and Validation of a Deep Learning-based Automatic Detection Algorithm for Active Pulmonary Tuberculosis on Chest Radiographs
[14] Deep learning-based detection system for multiclass lesions on chest radiographs: comparison with observer readings
[15] The role of imaging in the detection and management of COVID-19: a review
[16] Review of Artificial Intelligence Techniques in Imaging Data Acquisition, Segmentation and Diagnosis for COVID-19
[17] Rapid AI Development Cycle for the Coronavirus (COVID-19) Pandemic: Initial Results for Automated Detection & Patient Monitoring using Deep Learning CT Image Analysis
[18] Deep Learning in Medical Image Analysis
[19] Automatic Detection of Coronavirus Disease (COVID-19) Using X-ray Images and Deep Convolutional Neural Networks
[20] Lung Infection Quantification of COVID-19 in CT Images with Deep Learning
[21] A deep learning algorithm using CT images to screen for Corona Virus Disease (COVID-19), medRxiv
[22] COVID-Net: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest X-Ray Images
[23] COVID-CAPS: A Capsule Network-based Framework for Identification of COVID-19 cases from X-ray Images
[24] Coronavirus Disease Analysis using Chest X-ray Images and a Novel Deep Convolutional Neural Network
[25] Harmony-Search and Otsu based System for Coronavirus Disease (COVID-19) Detection using Lung CT Scan Images
[26] The watershed transform: definitions, algorithms and parallelization strategies
[27] Satellite Image Enhancement Using 2D Level DWT
[28] Internet of image things-discrete wavelet transform and Gabor wavelet transform based image enhancement resolution technique for IoT satellite applications
[29] Image contrast enhancement based on laplacian-of-gaussian filter combined with morphological reconstruction
[30] Residual Attention Network for Image Classification
[31] A survey of the recent architectures of deep convolutional neural networks
[32] Convolutional neural networks: an overview and application in radiology
[33] Very Deep Convolutional Networks for Large-Scale Image Recognition
[34] Deep Residual Learning for Image Recognition
[35] Going deeper with convolutions
[36] Densely connected convolutional networks
[37] ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
[38] Xception: Deep learning with depthwise separable convolutions
[39] Transfer learning based deep CNN for segmentation and detection of mitoses in breast cancer histopathological images
[40] Adaptive Transfer Learning in Deep Neural Networks: Wind Power Prediction using Knowledge Transfer from Region to Region and Between Different Task Domains
[41] Transfer Learning and Meta Classification Based Deep Churn Prediction System for Telecom Industry
[42] A Survey on Transfer Learning
[43] Transfer Learning for Visual Categorization: A Survey
[44] Passenger Detection and Counting for Public Transport System
[45] Computer vision based room interior design
[46] A survey on deep learning techniques for image and video semantic segmentation
[47] SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
[48] U-Segnet: Fully Convolutional Neural Network Based Automated Brain Tissue Segmentation Tool
[49] Fully convolutional networks for semantic segmentation
[50] DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
[51] Attention gated networks: Learning to leverage salient regions in medical images
[52] Towards Efficient COVID-19 CT Annotation: A Benchmark for Lung and Infection Segmentation
[53] Interval estimation for a binomial proportion
[54] Confidence intervals for the area under the ROC Curve

This work was conducted with the support of PIEAS IT endowment fund under the Pakistan Higher Education Commission (HEC).
This research was also supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (2014R1A1A2053780). We also thank the Pattern Recognition Lab (PR-Lab) and the Pakistan Institute of Engineering and Applied Sciences (PIEAS) for providing the necessary computational resources and a healthy research environment.