key: cord-0845033-ee4ufx71 authors: Serte, Sertan; Demirel, Hasan title: Deep Learning for Diagnosis of COVID-19 using 3D CT Scans date: 2021-03-10 journal: Comput Biol Med DOI: 10.1016/j.compbiomed.2021.104306 sha: 03e200ec15ac9b756e1f61ab14ce24c2afaa3d85 doc_id: 845033 cord_uid: ee4ufx71

A new pneumonia-type coronavirus, COVID-19, recently emerged in Wuhan, China. COVID-19 has subsequently infected many people and caused many deaths worldwide. Isolating infected people is one of the methods of preventing the spread of this virus. CT scans provide detailed imaging of the lungs and assist radiologists in diagnosing COVID-19 in hospitals. However, a person's CT scan contains hundreds of slices, and the diagnosis of COVID-19 using such scans can lead to delays in hospitals. Artificial intelligence techniques could assist radiologists with rapidly and accurately detecting COVID-19 infection from these scans. This paper proposes an artificial intelligence (AI) approach to classify COVID-19 and normal CT volumes. The proposed AI method uses the ResNet-50 deep learning model to predict COVID-19 on each CT image of a 3D CT scan. Then, this AI method fuses the image-level predictions to diagnose COVID-19 on a 3D CT volume. We show that the proposed deep learning model provides [Formula: see text] AUC value for detecting COVID-19 on CT scans.

Coronavirus first emerged in Wuhan, China, in 2019 [1]. Coronavirus [2] is known as a viral pneumonia, and this viral pneumonia can be grouped into COVID-19, SARS, and MERS. Currently, coronavirus is spread via human-to-human transmission, and vaccines for COVID-19 remain few and limited. Reports claim that the best way of preventing coronavirus spread is to perform rapid diagnosis on large populations and subsequently keep infected individuals in isolation. Therefore, regular COVID-19 tests are necessary to identify infected people so they can be isolated. Reverse transcription-polymerase chain reaction (RT-PCR) tests are mainly used to identify individuals with COVID-19 in hospitals.

Recently, Rubin et al. [3] described the role of chest imaging in patient management during the COVID-19 pandemic. The authors mentioned that using chest imaging as a diagnostic tool to detect infected people might be problematic: the application of imaging techniques requires a long time and poses a risk to personnel. Therefore, the study emphasises the use of real-time reverse-transcriptase polymerase chain reaction (RT-PCR) together with chest imaging. Three clinical scenarios illustrate the selection of diagnostic tools according to patients' situations. Our proposed deep learning method focuses on a fast and accurate diagnosis of COVID-19 on 3D CT scans. A person's CT scan might contain many CT images, and radiologists might not be capable of examining large numbers of patients during an outbreak. Our proposed technique assists radiologists with rapidly and accurately detecting COVID-19 infection from these scans.

Computed tomography (CT) scans and X-ray images are alternative diagnostic tools for detecting COVID-19. Doctors image the lungs and look for signs of COVID-19 deformations on the CT or X-ray images. This process requires a certain amount of time for correct pneumonia type classification. However, convolutional neural networks (CNNs) could be used instead of or in conjunction with doctors for faster and better diagnosis of COVID-19 on CT scans.
CNNs include AlexNet [4], GoogleNet [5], VGG [6], MobileNetV2 [7], ResNet [8], and DenseNet [9]. These models have provided the classification of 1,000 objects in the ImageNet dataset [10, 11]. The performance results show that these models can achieve close to human-level accuracy on object classification. These models also provide high classification performance on medical images. Recently, authors [12, 13] utilized CNNs for detecting COVID-19 on X-ray images. The studies [14, 15] also employed CNNs to recognize COVID-19 on CT scans. Other studies [16-18] show that these models can also be used for the classification of skin lesions. Furthermore, authors [19-23] have shown that CNN models provide accurate results for eye disease detection. Therefore, CNN models can be used on different medical images for the diagnosis of disease types. A recent medical review paper [24] summarized the applications of these models.

Recent works have also utilized 3D convolutional neural networks for classifying COVID-19 on CT volumes. These 3D CNNs provide spatiotemporal modelling of the CT volumes for COVID-19 classification. Tran et al. developed a 3D convolutional neural network, which they named C3D. Moreover, Zheng et al. [25] proposed DeCoVNet for 3D volume modelling. Hara et al. [26] proposed 3D ResNet models named ResNet-18, ResNet-50, and ResNet-101. The authors of [27] also introduced 3D-SqueezeNet, 3D-ShuffleNet, 3D-MobileNetV1, and 3D-MobileNetV2. The authors of [28, 29] utilized 3D CNN deep learning models for detecting COVID-19 on 3D CT scans. The method in [28] segments the CT images, and a group of CT images is used as input to the 3D CNN model. The model applies 3D convolution to the images, and then the output of the 3D convolution is used as the input to the AlexNet and ResNet models for the final COVID-19 classification. Similarly, Han et al. [29] learned COVID-19 patterns on 3D CT scans using 3D CNN models. The method convolves the CT volumes using 3D filters, and then the outputs of the 3D convolution are combined using the bag technique. The final COVID-19 predictions are then obtained through fully connected layers. Zhang et al. [25] employed 3D ResNet-18 [26] to detect COVID-19 on 3D CT scans. This study explained how the 3D-ResNet-18 model can be used for differentiating COVID-19 from common pneumonia.

The proposed method (Figure 1 and Figure 2) is different from other recent works [28, 29]. We propose a novel artificial intelligence (AI) system for the classification of COVID-19 using a person's 3D CT volume. The technique separates each of the 3D CT scans into images, and then each image is used as input to a ResNet-50 convolutional neural network (CNN) model. The ResNet-50 model provides estimates for each of the CT images. Then, these predictions are fused to classify COVID-19 images. The main novelty of the paper is the fusion of the decisions of parallel 2D CNNs. The results indicate that fusing the decisions of multiple 2D CNNs outperforms the single 3D CNN approach. The unique aspects of this work are:

1. We propose a new AI system to estimate COVID-19 from the images of a person's 3D CT volume. The proposed AI system employs ResNet-50 to obtain predictions on the CT images of a 3D CT volume.
2. The proposed AI system also employs the ResNet-18 model in conjunction with majority voting to provide a COVID-19 prediction on a person's 3D CT volume.

This paper is organized as follows. First, we summarize the related work on COVID-19 classification using 2D and 3D CT scans.
Then, we describe the proposed artificial intelligence system for recognizing COVID-19 on a 3D CT volume. Finally, the performance of the proposed method is evaluated and discussed.

The majority of previous works have proposed the detection of COVID-19 on 2D CT scans. Recent results have also reported the classification of COVID-19 in individuals using a captured 3D CT volume. A recent review [30] summarized the artificial intelligence techniques related to COVID-19 classification using both 2D and 3D CT scans.

Hu et al. [32] introduced a weakly supervised convolutional neural network for COVID-19 classification on CT scans. The authors classified COVID-19, community-acquired pneumonia, and non-pneumonia. Their CNN model builds on five sets of convolutional layers applied to the input CT scans. They utilized weak classifiers on the convolutional layers for class predictions. Then, they concatenated the output predictions using softmax for the final output. Mei et al. [33] proposed using a ResNet-18 convolutional neural network in conjunction with support vector machines for COVID-19 classification. In this work, the CNN model provides predictions on the CT images, while the SVM provides COVID-19 predictions on non-image data. The authors combined the outputs of ResNet-18 and the SVM for the classification of COVID-19. Harmon et al. [34] utilized the DenseNet-121 deep learning architecture for the classification of COVID-19 and pneumonia. The proposed method was trained and tested on multinational datasets for performance evaluation. Bhandary et al. [35] used AlexNet in conjunction with support vector machines to classify COVID-19 and cancer on X-ray images and CT scans. The authors also compared the SVM-based AlexNet with AlexNet, VGG16, VGG19, and ResNet-50. Butt et al. [36] used a ResNet-18 deep learning model for the classification of COVID-19, viral pneumonia, and normal CT scans. This method creates 3D volumes of the CT scans and then extracts patches from these regions. These patches are used as inputs to the ResNet-18 model for differentiating COVID-19, viral pneumonia, and normal CT scans. Yan et al. [37] proposed a multi-scale convolutional neural network for the classification of COVID-19 and common pneumonia. The method employs a Gaussian pyramid to generate three scales of the CT images. Then, three convolutional neural networks are trained, one on each scale of the CT images. The estimates of the models are fused for COVID-19 classification. This method works on both CT images and 3D CT scans, and it achieves higher COVID-19 classification performance on 2D scans than on 3D scans.

Wang et al. [28] developed 3D convolutional neural networks for detecting COVID-19 infections on 3D CT scans. The authors utilized a U-Net model to segment the lungs on CT images, and then a group of CT images was used as input to the proposed 3D convolutional neural network. The proposed 3D model architecture builds on 3D convolution filters, AlexNet, and ResNet models. Han et al. [29] proposed attention-based 3D multiple instance models for the classification of COVID-19, common pneumonia, and non-pneumonia cases. This method involves the generation of 3D features using 3D convolution on the 3D CT scans. The developed 3D features are combined with a bag model, and then these combinations are used to classify infections.

Figure 1 describes our proposed artificial intelligence approach. The method uses a 3D CT scan as input, and then it outputs the COVID-19 and normal class predictions.
We use 3D CT scans acquired with a computed tomography (CT) scanner. A CT scan is a medical imaging technique that provides a 3D CT volume of a patient's lungs. These 3D CT volumes can be viewed in the axial, coronal, and sagittal planes. Our proposed method employs the axial views (slices) of the patients' CT scans. Figure 1 shows these axial slices of the 3D CT volume. The slices show the human lungs from the upper body towards the lower body. The proposed approach groups these axial slices into three categories: upper lung slices, middle lung slices, and lower lung slices. Figure 1 clearly shows that the upper and lower lung slices contain bone structures and only a small lung region. In contrast, Figure 1 also indicates that the middle lung slices show the central regions of a patient's lungs. We developed our method based on this information; it utilizes the middle lung slices of the 3D CT scans.

The proposed method builds on the Mosmed-1110 dataset (Section 4). This dataset contains 3D CT scans of patients, and each CT scan comprises about 40 axial slices. These slices start from the upper lung and end in the lower lung. Our approach to determining the middle axial lung slices is as follows. We denote the first slice of the CT scan by 1 and the last slice by T, so the axial slices of a 3D CT volume are indexed from 1 to T, where T is the number of CT images in the volume. We divide T by two to determine the single middle CT image. Furthermore, we select the slices immediately before and after this central slice to determine additional middle axial lung images. For example, we define three middle images by choosing the centre image plus the slice before and the slice after it. As a result, our approach allows us to determine the CT images in the middle lung region of the 3D volume.

First, the method selects the middle axial lung slices (Section 3.1), and then each of these middle axial slices goes through a ResNet-50 convolutional neural network (CNN) model. The proposed approach applies a single ResNet-50 model to a single axial slice to provide COVID-19 and normal class estimates. The classes are indexed by i = 1, ..., n, where i denotes a class and n the number of classes; since there are two classes, n = 2. Furthermore, the ResNet-50 models are indexed by j = 1, ..., m; when a single model is used, m = 1. The proposed method also applies m ResNet-50 models to m axial slices. In this case, the m ResNet-50 models use the m axial slices to provide m COVID-19 and normal class predictions. We train a single ResNet-50 model and apply it m times to the m slices. These predictions are denoted by p_ij, where i = 1, ..., n and j = 1, ..., m. Then, these predictions are fused to classify COVID-19 and normal images. The main novelty of the paper is the fusion of the decisions of parallel 2D CNNs. The results indicate that fusing the decisions of multiple 2D CNNs outperforms the single 3D CNN approach. In creating a fusion of CNNs, we use the output probability values of the classification layers of each single CNN model to determine the confidence values p_i that an image belongs to each of the two classes. Here, p_i ∈ [0, 1] for i = 1, ..., n and Σ_{i=1}^{n} p_i = 1. The proposed method employs the majority voting technique to provide the COVID-19 estimate [38]. This technique is formulated as

c = argmax_i Σ_{j=1}^{m} Δ_{ij},

where Δ_{ij} = 1 if model j assigns its highest probability to class i, and Δ_{ij} = 0 otherwise.
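To make the slice selection and decision fusion concrete, the sketch below shows one plausible implementation in PyTorch. It is a minimal illustration rather than the authors' released code: the checkpoint name, the number of middle slices, the NIfTI file layout, and the helper functions are assumptions introduced here. The sketch reads a 3D CT volume, selects the central axial slices as described above, obtains per-slice softmax probabilities p_ij from a fine-tuned ResNet-50, and fuses them by majority voting; maximal-probability fusion, which is compared against majority voting in the results, is included as the alternative rule.

```python
# Minimal sketch (assumed names and file layout): per-slice ResNet-50 predictions
# on the middle axial slices of a 3D CT volume, fused by majority voting.
import nibabel as nib            # reads NIfTI CT volumes
import numpy as np
import torch
import torchvision.transforms as T
from torchvision.models import resnet50

N_CLASSES = 2                    # COVID-19 vs normal (n in the paper)
N_MIDDLE_SLICES = 11             # assumed number of middle slices (m in the paper)

# A ResNet-50 whose final layer was replaced with a 2-class head and fine-tuned.
model = resnet50(num_classes=N_CLASSES)
model.load_state_dict(torch.load("resnet50_covid_finetuned.pth"))  # hypothetical checkpoint
model.eval()

preprocess = T.Compose([
    T.ToPILImage(),
    T.Resize((256, 256)),        # slices resized to 256x256
    T.CenterCrop(224),           # centre crop at test time (random crops are for training)
    T.ToTensor(),
])

def middle_slices(volume, m=N_MIDDLE_SLICES):
    """Pick the central axial slice (index T//2) plus its neighbours."""
    t = volume.shape[-1]                         # number of axial slices (T)
    centre, half = t // 2, m // 2
    return [volume[..., k] for k in range(centre - half, centre + half + 1)]

def slice_to_rgb(ct_slice):
    """Min-max normalise a CT slice and replicate it to 3 channels for ResNet-50."""
    lo, hi = ct_slice.min(), ct_slice.max()
    x = (255 * (ct_slice - lo) / (hi - lo + 1e-8)).astype(np.uint8)
    return np.stack([x, x, x], axis=-1)

@torch.no_grad()
def predict_volume(path):
    volume = nib.load(path).get_fdata()
    probs = []                                   # p_ij: per-slice class probabilities
    for ct_slice in middle_slices(volume):
        img = preprocess(slice_to_rgb(ct_slice)).unsqueeze(0)
        probs.append(torch.softmax(model(img), dim=1).squeeze(0).numpy())
    probs = np.stack(probs)                      # shape (m, n)

    # Majority voting: each slice votes for its most probable class.
    votes = np.bincount(probs.argmax(axis=1), minlength=N_CLASSES)
    majority_label = int(votes.argmax())

    # Maximal-probability fusion (the alternative rule compared in the results):
    # the class holding the single highest probability over all slices wins.
    max_prob_label = int(np.unravel_index(probs.argmax(), probs.shape)[1])
    return majority_label, max_prob_label

print(predict_volume("study_0255.nii.gz"))       # hypothetical MosMed file name
```

With an odd number of middle slices (m = 11 here) and n = 2 classes, the vote cannot tie, so the majority label is always well defined.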
In the training part, the fine-tuned ResNet-50 model utilizes 3D CT scans to learn COVID-19 infection. First, the 3D scans are expanded into 2D CT images of COVID-19 and non-COVID-19 cases. Then, these 2D images are used to train the fine-tuned ResNet-50 model. All CT images are resized to 256x256 RGB images, and these images are used as inputs to the CNN models. Random 224x224 crops are extracted, and these crops are fed to the models. The models are pre-trained on the ImageNet dataset, and their parameters are then adapted to the CT images. The parameters of the convolutional layers are frozen, and only the fully connected layers are trained (a brief code sketch of this setup is given below).

In the Mosmed dataset, the CT4 category includes two 3D CT scans with more than 75% COVID-19 lung involvement. Furthermore, the mild (CT1) and moderate (CT2) levels indicate that the patient does not require intensive care in a hospital and can therefore remain at home. In contrast, patients in the severe (CT3) and critical (CT4) stages must stay in the hospital under intensive care. The proposed AI technique is developed and evaluated on two sets of the MosMed dataset. Tables 1 and 2 describe the system evaluation sets. We use the CT0 and CT2 categories for classifying COVID-19 and non-COVID-19. Table 1 reports the number of training and test CT scans in the first set. The proposed AI system is trained on 80 COVID-19 scans and 164 normal scans. Then, the AI system is tested on 50 normal and 25 COVID-19 infected CT scans. Table 2 reports the number of training and test CT scans in the second set. The proposed AI system is trained on 45 COVID-19 scans and 45 normal scans. Then, the AI system is tested on 45 normal and 5 COVID-19 infected CT scans. The CCAP dataset [37] includes healthy cases as well as COVID-19, bacterial, viral, and mycoplasma pneumonia cases. The proposed AI technique is also evaluated on the CCAP dataset. We used the CT volumes of normal patients and patients with COVID-19. Table 3 reports the number of training and test CT scans for this dataset. The proposed AI system is trained on 65 normal CT scans and 46 COVID-19 CT scans. Then, the AI system is tested on 25 normal and 3 COVID-19 infected CT scans.

The performance of the proposed deep learning method is evaluated using the Mosmed dataset. Performance evaluation is performed using the following metrics. The area under the receiver operating characteristic (ROC) curve (AUC), accuracy (ACC), sensitivity (SE), and specificity (SP) are used to test the accuracy of the methods. Accuracy, sensitivity, and specificity are defined as

ACC = (TP + TN) / (TP + TN + FP + FN),
SE = TP / (TP + FN),
SP = TN / (TN + FP),

where true positives, true negatives, false positives, and false negatives are denoted by TP, TN, FP, and FN, respectively.

The performance of the proposed AI method is first evaluated for image-level COVID-19 classification. We used the proposed technique to obtain the average AUC value for each CT image position of a person's 3D scan. These average AUC values are reported in Figure 3(a). The classification performance is low for the first 15 images. Then, the performance increases after the fifteenth image. Finally, the performance of COVID-19 classification decreases after 25 slices. The best COVID-19 prediction performance is achieved when the middle scans are utilized for classification. In particular, the AI system provides the highest AUC values for scan numbers between 15 and 25. The results show that COVID-19 classification can be achieved using the middle scans, and the classification performance increases towards the middle scan.
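The fine-tuning recipe described at the start of this section (ImageNet pre-trained ResNet-50, 256x256 resized slices, random 224x224 crops, frozen convolutional layers, trainable fully connected head) could be set up roughly as follows. This is a hedged sketch, not the authors' training code: the folder layout, batch size, optimizer, learning rate, and epoch count are assumptions, and the torchvision weights API shown requires torchvision 0.13 or later.

```python
# Minimal fine-tuning sketch (assumed folder layout and hyperparameters).
import torch
import torch.nn as nn
import torchvision.transforms as T
from torchvision import datasets, models

transform = T.Compose([
    T.Resize((256, 256)),   # all CT slices resized to 256x256 RGB images
    T.RandomCrop(224),      # 224x224 random crops extracted for training
    T.ToTensor(),
])

# Hypothetical layout: ct_slices/train/covid and ct_slices/train/normal hold exported 2D slices.
train_set = datasets.ImageFolder("ct_slices/train", transform=transform)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

# ImageNet pre-trained ResNet-50; freeze the convolutional layers.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
for param in model.parameters():
    param.requires_grad = False

# Replace the fully connected head with a 2-class layer; only this part is trained.
model.fc = nn.Linear(model.fc.in_features, 2)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-4)  # assumed optimizer and rate
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(10):                      # assumed number of epochs
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

torch.save(model.state_dict(), "resnet50_covid_finetuned.pth")  # checkpoint used at test time
```

Freezing the convolutional backbone keeps the ImageNet features intact and leaves only the two-class head to be estimated from the comparatively small CT training set.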
Similarly, Table 5 and Figure 3(b) report the performance of the proposed system using a training and testing ratio of 2:1. Table 5 reports the proposed AI system's performance for scan-level COVID-19 classification. The AI system utilizes the ResNet-50 deep network in conjunction with majority voting. Figure 3(b) also shows the average AUC values for the corresponding numbers of middle scans. The performance results show that the AUC values increase when the number of images is higher than fifteen. As a result, the system performance depends on the number of images: it increases as the number of images increases. The proposed system's performance is also reported on the Mosmed and CCAP datasets using a training and testing ratio of 9:1. Table 6 shows the performance of the maximal probability-based CNN model, and Table 7 reports the performance of the majority voting-based proposed model on the Mosmed dataset. Table 8 shows the performance of the majority voting-based CNN model, and Table 9 reports the performance of the majority voting-based proposed model on the CCAP dataset.

The proposed method enables COVID-19 classification at both the image level and the scan level. First, the system performance was evaluated for image-level COVID-19 classification. The proposed AI system provides the best COVID-19 classification accuracy when the middle images of the CT scan are utilized; the results can be seen in Figure 3(a). Second, the proposed system's performance was evaluated for scan-level COVID-19 classification. The performance results showed that the system performance increases as the total number of scans increases. Table 10 reports the performance comparison of the proposed AI system and other methods. The proposed ResNet-50 with majority voting provides a 0.90 AUC value, while the 3D-ResNet-50 model provides a 0.67 AUC value.

The main strength of the proposed method is fine-tuning. The proposed ResNet-18 and ResNet-50 deep networks were fine-tuned from ImageNet models [10, 11]. The ImageNet models are built on more than one million images. As a result, the proposed ResNet-18 and ResNet-50 models fine-tuned on ImageNet allow accurate modelling of COVID-19 on CT scans. In contrast, we fine-tuned the 3D-ResNet-18 and 3D-ResNet-50 models on the Kinetics action video dataset [40]. The Kinetics dataset is the biggest publicly available dataset for fine-tuning the 3D-ResNet-18 and 3D-ResNet-50 models, and it includes 300K action videos. However, the ImageNet dataset is much bigger than the Kinetics dataset, and it therefore provides more accurate modelling for the proposed ResNet-18 and ResNet-50 models. Another strength of the proposed method is that the modelling is based on majority voting. The majority voting receives COVID-19 estimates from the CNN models and uses these predictions for COVID-19 classification. The proposed majority voting considers good estimates while putting less emphasis on poor estimates from the CNN models. On the other hand, the 3D-ResNet-18 and 3D-ResNet-50 deep networks model all CT scan images for COVID-19 classification; in other words, these models put equal emphasis on all the images of the CT scans to classify COVID-19. In conclusion, the proposed method's performance results show that the proposed approach is more robust and accurate than the 3D-ResNet-18 and 3D-ResNet-50 models. The proposed method requires that several ResNet-50 architectures be run; however, the number of models that can be run in parallel depends on the available hardware memory capacity.
A high memory requirement might not be very problematic for a modern desktop computer or a laptop, but the performance on a tablet, a mobile phone, or a web server would be significantly impacted. This limitation would prevent the use of this model in applications such as mobile telemedicine networks.

A novel AI system is proposed for detecting COVID-19 infection on CT images and CT scans. This AI system builds on ResNet-50 and majority voting. The proposed method has been compared with other deep learning models and fusion techniques. The reported results show that the proposed ResNet-50 model, in conjunction with majority voting, outperforms all other models and fusion techniques.

References:
[1] CovidGAN: Data augmentation using auxiliary classifier GAN for improved COVID-19 detection.
[2] COVID-19 identification in chest X-ray images on flat and hierarchical classification scenarios.
[3] The role of chest imaging in patient management during the COVID-19 pandemic: A multinational consensus statement from the Fleischner Society.
[4] ImageNet classification with deep convolutional neural networks.
[5] Going deeper with convolutions.
[6] Very deep convolutional networks for large-scale image recognition.
[7] MobileNetV2: Inverted residuals and linear bottlenecks.
[8] Deep residual learning for image recognition.
[9] Densely connected convolutional networks.
[10] ImageNet Large Scale Visual Recognition Challenge.
[11] ImageNet: A large-scale hierarchical image database.
[12] Early pleural effusion detection from respiratory diseases including COVID-19 via deep learning.
[13] Deep learning to distinguish COVID-19 from other lung infections, pleural diseases, and lung tumors.
[14] Discerning COVID-19 from mycoplasma and viral pneumonia on CT images via deep learning.
[15] Deep learning for mycoplasma pneumonia discrimination from pneumonias like COVID-19.
[16] Gabor wavelet-based deep learning for skin lesion classification.
[17] Wavelet-based deep learning for skin lesion classification.
[18] Keratinocyte carcinoma detection via convolutional neural networks.
[19] Geographic variation and ethnicity in diabetic retinopathy detection via deep learning.
[20] Transfer learning for early and advanced glaucoma detection with convolutional neural networks.
[21] A generalized deep learning model for glaucoma detection.
[22] Dry and wet age-related macular degeneration classification using OCT images and deep learning.
[23] Graph-based saliency and ensembles of convolutional neural networks for glaucoma detection.
[24] Deep learning in medical imaging: A brief review.
[25] Deep learning-based detection for COVID-19 from chest CT using weak label. medRxiv.
[26] Can spatiotemporal 3D CNNs retrace the history of 2D CNNs and ImageNet?
[27] Resource efficient 3D convolutional neural networks. IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).
[28] A weakly-supervised framework for COVID-19 classification and lesion localization from chest CT.
[29] Accurate screening of COVID-19 using attention-based deep 3D multiple instance learning.
[30] Review of artificial intelligence techniques in imaging data acquisition, segmentation and diagnosis for COVID-19.
[31] Sample-efficient deep learning for COVID-19 diagnosis based on CT scans. medRxiv.
[32] Weakly supervised deep learning for COVID-19 infection detection and classification from CT images.
[33] Artificial intelligence-enabled rapid diagnosis of patients with COVID-19.
[34] Artificial intelligence for the detection of COVID-19 pneumonia on chest CT using multinational datasets.
[35] Deep learning framework to detect lung abnormality - a study with chest X-ray and lung CT scan images.
[36] Deep learning system to screen coronavirus disease 2019 pneumonia.
[37] CCAP: A chest CT dataset.
[38] Skin lesion classification with ensembles of deep convolutional neural networks.
[39] MosMedData: Chest CT scans with COVID-19 related findings dataset.
[40] The Kinetics human action video dataset.