key: cord-0996148-w2wx7y2v authors: Shibly, K. H.; Dey, S. K.; Islam, M. T. U.; Rahman, M. M. title: COVID Faster R-CNN: A Novel Framework to Diagnose Novel Coronavirus Disease (COVID-19) in X-Ray Images date: 2020-05-19 journal: nan DOI: 10.1101/2020.05.14.20101873 sha: 40feb27a116a0192b40421d2303e9f826ec1b4d9 doc_id: 996148 cord_uid: w2wx7y2v

COVID-19, the novel coronavirus disease, has been declared a worldwide pandemic; its first outbreak occurred in the city of Wuhan, China. More than two hundred countries have already been affected by the virus, which spreads through human interaction. Moreover, the symptoms of the novel coronavirus are quite similar to those of the general flu. Screening infected patients is therefore a critical step in the fight against COVID-19, and it is important to recognize positive cases as early as possible to avoid further spread of the epidemic. Several methods exist to detect COVID-19 positive patients. They are typically performed on respiratory samples, and radiological (X-Ray) imaging is treated as one of the critical approaches. Recent findings from X-Ray imaging techniques suggest that such images contain relevant information about the SARS-CoV-2 virus. In this article, we introduce a Deep Neural Network (DNN) based Faster Regions with Convolutional Neural Networks (Faster R-CNN) framework to detect COVID-19 patients from chest X-Ray images using an available open-source dataset. Our proposed approach provides a classification accuracy of 97.36%, a sensitivity of 97.65%, and a precision of 99.28%. We therefore believe that the proposed method may assist health professionals in validating their initial assessment of COVID-19 patients.

Deep learning is a popular area of research in the field of artificial intelligence.
It enables end-to-end modeling that delivers results directly from input data, without the need for manual feature extraction. The use of machine learning methods for diagnostics in the medical field has recently gained popularity as a complementary tool for doctors.

A molecular diagnosis method for the novel coronavirus was proposed [22]. For detecting COVID-19 patients from X-Ray images, the deep model of Ioannis et al. [27] reached a success rate of 98.75% for two classes and 93.48% for three classes. By combining multiple CNN models, Hemdan et al. [28] proposed the COVIDX-Net model, which is capable of detecting confirmed cases of COVID-19. A transfer learning-based framework was proposed by Kermany et al. [29] to identify medical diagnoses and treatable diseases using image-based deep learning.

medRxiv preprint, this version posted May 19, 2020. https://doi.org/10.1101/2020.05.14.20101873. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.

A Convolutional Neural Network (CNN) is a deep neural network-based learning architecture for processing massive amounts of data. Nowadays, it is widely applied to medical image analysis. CNNs are used extensively in preference to other machine learning methods because they need no manual feature extraction and require no specific segmentation.

In Layer. However, for selecting an anchor, the anchors are divided into two categories (positive and negative), with the intersection-over-union (IoU) overlap ratio between the ground-truth box and the anchor serving as the classification index. The IoU overlap ratio is defined in Equation (1). Using VGG-16, we minimize an objective function that follows the multi-task loss of Faster R-CNN; the loss function is given in Equation (2).
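The anchor labelling described above can be illustrated with a minimal sketch. IoU is the area of the intersection of an anchor and a ground-truth box divided by the area of their union. The box format `(x_min, y_min, x_max, y_max)`, the function names, and the 0.7/0.3 thresholds are assumptions for illustration: the thresholds are those of the original Faster R-CNN formulation, and this text does not state the values the authors used.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(anchor: Box, ground_truth: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = anchor
    gx1, gy1, gx2, gy2 = ground_truth
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, gx2) - max(ax1, gx1))
    inter_h = max(0.0, min(ay2, gy2) - max(ay1, gy1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (gx2 - gx1) * (gy2 - gy1) - intersection
    return intersection / union if union > 0 else 0.0


def label_anchor(anchor: Box, ground_truth: Box,
                 pos_thresh: float = 0.7, neg_thresh: float = 0.3) -> int:
    """Return 1 (positive), 0 (negative) or -1 (ignored during training).

    The default thresholds are assumed, not taken from this paper.
    """
    overlap = iou(anchor, ground_truth)
    if overlap > pos_thresh:
        return 1
    if overlap < neg_thresh:
        return 0
    return -1
```

Anchors whose overlap falls between the two thresholds are labelled neither positive nor negative and would typically not contribute to the training objective.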
L_cls is the classification loss function, and L_reg is the regression loss function. N_cls and N_reg are the normalization coefficients of L_cls and L_reg, respectively, and λ is the weight parameter that balances L_cls and L_reg. The classification loss function L_cls is the logarithmic loss over two categories (COVID-19 and non-COVID) and is defined in Equation (3). The regression loss function L_reg is defined in Equation (4), where R is a robust loss function defined in Equation (5).

Regarding model development, rather than creating a model from scratch, we built it according to our sample input requirements. We used layers and filters similar to those of the original Faster R-CNN architecture and gradually increased the number of filters. It is therefore essential to consider Faster R-CNN when analyzing our proposed model and algorithm. Our proposed framework consists of 24 convolutional layers, followed by two fully connected layers and six pooling layers. These are typical CNN layers with different filter numbers, sizes, and stride values.
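The loss terms described in this section (classification loss, regression loss, normalization coefficients, weight λ, and robust loss R) can be sketched as follows. The forms used here are the standard Faster R-CNN ones: a two-class logarithmic loss, a smooth L1 function as the robust loss, and a regression term that is counted only for positive anchors. Whether the authors' Equations (2) to (5) match these term by term cannot be confirmed from this text, and all function names and the default `lam=1.0` are placeholders chosen for illustration.

```python
import math
from typing import Sequence


def log_loss(p: float, p_star: int, eps: float = 1e-12) -> float:
    """Two-class logarithmic loss: -log(p) if p_star == 1, else -log(1 - p)."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -math.log(p if p_star == 1 else 1.0 - p)


def smooth_l1(x: float) -> float:
    """Robust loss R (standard smooth L1): 0.5*x^2 if |x| < 1, else |x| - 0.5."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def regression_loss(t: Sequence[float], t_star: Sequence[float]) -> float:
    """Sum of R over the box coordinate differences t - t_star."""
    return sum(smooth_l1(a - b) for a, b in zip(t, t_star))


def multi_task_loss(probs: Sequence[float], labels: Sequence[int],
                    boxes: Sequence[Sequence[float]],
                    gt_boxes: Sequence[Sequence[float]],
                    n_cls: float, n_reg: float, lam: float = 1.0) -> float:
    """Classification term / n_cls + lam * regression term / n_reg.

    labels[i] is 1 for a positive anchor and 0 for a negative one, so the
    regression loss is only accumulated for positive anchors.
    """
    cls_term = sum(log_loss(p, y) for p, y in zip(probs, labels)) / n_cls
    reg_term = sum(y * regression_loss(t, ts)
                   for y, t, ts in zip(labels, boxes, gt_boxes)) / n_reg
    return cls_term + lam * reg_term
```

Multiplying the regression term by the anchor label means that negative anchors influence only the classification part of the objective, which is the behaviour of the multi-task loss in the original Faster R-CNN formulation.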
For dataset design, we followed a two-step procedure for data preparation. Initially, we used the X-Ray images of COVID-19 patients that are available as open-source data. Furthermore, we developed a custom dataset for training and evaluation, which comprised a total of 5450 chest radiography images across 2500 patient cases. To prepare the custom dataset for use in the experiment, we combined and modified two different publicly available datasets (Table 1). For the custom dataset, we used 5450 sample images, whereas with the COVIDx dataset we developed our dataset with 13800 images.

Table 1
Dataset   Split   Normal   Non-COVID Pneumonia   COVID-19   Total
Custom    Train   1100     3000                  80         4180
Custom    Test    300      950                   20         1270
COVIDx    Train   7966     5451                  152        13569
COVIDx    Test    100      100                   31         231

The COVIDx dataset is updated continuously with images shared by researchers from different regions. As of May 7, 2020, it contains 183 X-Ray images of patients diagnosed with COVID-19, 8066 images of normal patients, and 5551 images of cases identified as non-COVID Pneumonia. By merging 'Normal' and 'non-COVID Pneumonia' into a single 'non-COVID' class, we designed it as a binary class dataset. Most notably, very few COVID-19 patients have associated X-Ray images, so X-Ray images of COVID-19 cases are scarce.

For model building and training, we used Google's TensorFlow library, which provides high-performance numerical computation, together with VGG-16. For cross-validation, our study used the K-fold cross-validation method (K=10) with the support of Leave-one-out cross-validation. Algorithm 1 outlines the K-fold cross-validation procedure.

1. Define sets of model parameter values
2. Randomly shuffle the dataset
3. Split the dataset into k=10 groups
4. for each parameter set, do
5.   for each resampling iteration do
6.     Hold out specific samples
7.     Fit the model on the remainder
8.     Predict the hold-out samples
9.

In this section, we discuss the loss observation, followed by the results of model validation. We performed experiments to detect and classify COVID-19 confirmed cases using X-Ray images and trained the models on two classes: non-COVID and COVID-19. The model was evaluated using the 10-fold cross-validation technique. We used 90% of the X-Ray images for training and the remaining 10% for testing or validation. Moreover, the loss is essential for judging the quality of the predictions. Figure 3 shows that the training loss and the validation loss decrease gradually after every 100 epochs.

It is also noticeable that both the training loss and the validation loss increased significantly in the initial epochs because of the number of COVID-19 samples in that specific class (Figure 3). The effectiveness of the model is assessed through testing, cross-validation, and direct image input testing. To evaluate the performance of the model, the completely trained model is validated on the same model dataset using K-fold cross-validation.

Figure 4: Illustration of different sample images.
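The 10-fold procedure of Algorithm 1 and the reported metrics (accuracy, sensitivity, precision) can be sketched in plain Python. This is an illustrative sketch, not the authors' TensorFlow code: `fit` is a hypothetical callable that trains on the given samples and returns a prediction function, the outer loop over parameter sets is omitted, and treating label 1 as the positive class is an assumption.

```python
import random
from typing import Callable, Dict, List, Sequence


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Shuffle the sample indices and split them into k disjoint groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and precision, with label 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
    }


def cross_validate(features: Sequence, labels: Sequence[int],
                   fit: Callable, k: int = 10, seed: int = 0) -> Dict[str, float]:
    """Hold out each fold once, fit on the rest, and pool the predictions."""
    y_true: List[int] = []
    y_pred: List[int] = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held]
        # `fit` (hypothetical) returns a function mapping one sample to a label.
        predict = fit([features[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        for i in held_out:
            y_true.append(labels[i])
            y_pred.append(predict(features[i]))
    return binary_metrics(y_true, y_pred)
```

With k=10, each iteration fits on roughly 90% of the samples and predicts the remaining 10%, which corresponds to the 90/10 split described above.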
Figure 4 (A, B, D, E, and F) shows samples that the model predicts as non-COVID and whose actual class is also non-COVID. Only Figure 4 (C) shows a sample that is predicted as non-COVID although its actual class is COVID-19. Figure 4 (G) depicts the confusion matrix generated with the 10-fold cross-validation method.

In this study, we have proposed a deep learning model to detect COVID-19 cases from chest X-Ray images. This automated system can perform binary classification without manual feature extraction, with an accuracy of 97.36%. Moreover, the model can be tested on larger datasets and can work with real-time systems. Furthermore, it can be helpful in areas where test kits are in short supply. Therefore, for the initial assessment of COVID-19 patients, this tool can act as a useful diagnostic aid under the supervision of radiologists and doctors. At this point, we are working to make our model more robust so that it can handle both CT and X-Ray images.

SKD and KHS had the idea for and designed the study, had full access to all the data in the study, and take responsibility for the data and the accuracy of the model generation. SKD, KHS, and MR contributed to the writing of the article. MR and TUI contributed to the critical revision of the report. All data preparation and models were developed by SKD and KHS. All authors contributed to data acquisition, data analysis, and result validation, and reviewed and approved the final version.
All authors declare no competing interest.

Not required.

Clinical features of patients infected with 2019 novel coronavirus in Wuhan
Reverse transcription-polymerase chain reaction
Sensitivity of Chest CT for COVID-19: Comparison to RT-PCR
Time course of lung changes on chest CT during recovery from 2019 novel coronavirus (COVID-19) pneumonia. Radiology
Chest CT findings in coronavirus disease-19 (COVID-19): relationship to duration of infection
Diagnosis of the Coronavirus disease (COVID-19): rRT-PCR or CT?
COVID-19 Pneumonia: what has CT taught us
China: a descriptive study. The Lancet Infectious Diseases
Relation between chest CT findings and clinical conditions of coronavirus disease (COVID-19) pneumonia: a multicenter study
Coronavirus Disease 2019 (COVID-19): Role of chest CT in diagnosis and management
A familial cluster of Pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster
Dermatologist-level classification of skin cancer with deep neural networks
Deep learning ensembles for melanoma recognition in dermoscopy images
Invasive Ductal Carcinoma Detection Based Using Deep Transfer Learning with Whole-Slide Images
Automatic detection of invasive ductal carcinoma in whole slide images with convolutional neural networks
Convolutional neural networks for multi-class brain disease detection using MRI images
Chexnet: Radiologist-level pneumonia detection on chest x-rays with deep learning
Attention U-Net Based Adversarial Architectures for Chest X-ray Lung Segmentation
An automatic method for lung segmentation and reconstruction in chest X-ray using deep neural networks. Computer methods and programs in biomedicine
COVID-Net: A tailored deep convolutional neural network design for detection of COVID-19 cases from chest radiography images
Molecular Diagnosis of a Novel Coronavirus (2019-nCoV) Causing an Outbreak of Pneumonia
Recent advances in the detection of respiratory virus infection in humans
Detection of coronavirus Disease (COVID-19) based on Deep Features
Automatic Detection of Coronavirus Disease (COVID-19) Using X-ray Images and Deep Convolutional Neural Networks
COVIDX-Net: A framework of deep learning classifiers to diagnose COVID-19 in x-ray images
Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning
COVID-19 image data collection
RSNA pneumonia detection challenge
Deep learning Enables Accurate Diagnosis of Novel Coronavirus (COVID-19) with CT images
A deep learning algorithm using CT images to screen for Corona Virus Disease (COVID-19)
Deep Learning-based Detection for COVID-19 from Chest CT using Weak Label
Deep Learning System to Screen Coronavirus Disease 2019 Pneumonia
Automated detection of COVID-19 cases using deep neural networks with X-ray images
[3] World Health Organization, "Emerging respiratory viruses, including COVID-19: methods for detection, prevention, response and control", Available at: https://openwho.org/courses/introduction-to-ncov (accessed on: March 22, 2020).