key: cord-0996148-w2wx7y2v
authors: Shibly, K. H.; Dey, S. K.; Islam, M. T. U.; Rahman, M. M.
title: COVID Faster R-CNN: A Novel Framework to Diagnose Novel Coronavirus Disease (COVID-19) in X-Ray Images
date: 2020-05-19
journal: nan
DOI: 10.1101/2020.05.14.20101873
sha: 40feb27a116a0192b40421d2303e9f826ec1b4d9
doc_id: 996148
cord_uid: w2wx7y2v

COVID-19, or novel coronavirus disease, which has already been declared a worldwide pandemic, first broke out in the city of Wuhan, China. More than two hundred countries around the world have already been affected by this severe virus, as it spreads through human interaction. Moreover, the symptoms of the novel coronavirus are quite similar to those of the general flu. Screening of infected patients is considered a critical step in the fight against COVID-19. Therefore, it is highly important to recognize positive cases as early as possible to avoid further spread of this epidemic. Several methods exist to detect COVID-19 positive patients. They are typically performed on respiratory samples, and radiology imaging, or X-Ray imaging, is treated as one of the critical approaches among them. Recent findings from X-Ray imaging techniques suggest that such images contain relevant information about the SARS-CoV-2 virus. In this article, we introduce a Deep Neural Network (DNN) based Faster Regions with Convolutional Neural Networks (Faster R-CNN) framework to detect COVID-19 patients from chest X-Ray images using an available open-source dataset. Our proposed approach provides a classification accuracy of 97.36%, a sensitivity of 97.65%, and a precision of 99.28%. Therefore, we believe this proposed method might assist health professionals in validating their initial assessment of COVID-19 patients.

Deep learning is a popular area of research in the field of artificial intelligence. It enables end-to-end modeling that delivers the promised results from input data without the need for manual feature extraction. The use of machine learning methods for diagnostics in the medical field has recently gained popularity as a complementary tool for doctors. A molecular diagnosis method for the novel coronavirus was proposed in [22]. In terms of detecting COVID-19 patients using X-Ray images, the deep model of Ioannis et al. [27] reached a success rate of 98.75% for two classes and 93.48% for three classes. By combining multiple CNN models, Hemdan et al. [28] proposed COVIDX-Net, a model that is capable of detecting confirmed cases of COVID-19. A transfer learning-based framework was proposed by Kermany et al. [29] to identify medical diagnoses and treatable diseases using image-based deep learning.

The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission. This version posted May 19, 2020. https://doi.org/10.1101/2020.05.14.20101873

Convolutional Neural Networks (CNN) are a deep neural network-based learning architecture for processing massive amounts of data. Nowadays, they are widely applied to medical image analysis. CNNs are used extensively over other machine learning methods because they need neither manual feature extraction nor specific segmentation. In Layer. However, for selecting an anchor, the anchors are divided into two categories (positive and negative) with an intersection-over-union (IoU) overlap ratio between ground-truth box and anchor as a classification index. The IoU overlap ratio is defined as in Equation (1).
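Equation (1) is not reproduced in this text, so the following is only a minimal sketch of the usual IoU definition (intersection area divided by union area) and of how anchors could be split into positive and negative groups. The box layout `(x_min, y_min, x_max, y_max)`, the helper names, and the thresholds 0.7 and 0.3 are assumptions taken from the common Faster R-CNN convention, not values stated in this paper.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_anchor(anchor: Box, ground_truth: Box,
                 pos_thresh: float = 0.7, neg_thresh: float = 0.3) -> int:
    """Return 1 (positive), 0 (negative), or -1 (ignored) for one anchor.

    The two thresholds are hypothetical defaults, not taken from the paper.
    """
    overlap = iou(anchor, ground_truth)
    if overlap >= pos_thresh:
        return 1
    if overlap < neg_thresh:
        return 0
    return -1
```

Anchors whose overlap falls between the two thresholds are marked as ignored here, a common design choice so that ambiguous anchors do not contribute to training.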

Using VGG-16, we minimize an objective function following the multi-task loss in Faster R-CNN; the loss function is given in Equation (2).

L_cls is the classification loss function, and L_reg is the regression loss function. N_cls and N_reg are the normalization coefficients of the classification loss function L_cls and the regression loss function L_reg, respectively. λ is the weight parameter between L_cls and L_reg. The classification loss function L_cls is the logarithmic loss over two categories (COVID-19 and non-COVID), and it is defined as in Equation (3).

The regression loss function L_reg is defined as in Equation (4).

Here, R is defined as a robust loss function in Equation (5).
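Equations (2) to (5) are not reproduced in this text. The sketch below therefore assumes the standard Faster R-CNN formulation: a two-class log loss for L_cls, a smooth L1 function for the robust loss R, and a regression term that is counted only for positive anchors. All function and parameter names are hypothetical, and the exact form used by the authors may differ.

```python
import math
from typing import Sequence


def smooth_l1(x: float) -> float:
    """Assumed robust loss R: quadratic near zero, linear elsewhere."""
    return 0.5 * x * x if abs(x) < 1.0 else abs(x) - 0.5


def cls_loss(p: float, p_star: int, eps: float = 1e-12) -> float:
    """Two-class log loss: -log[p* . p + (1 - p*) . (1 - p)]."""
    prob = p_star * p + (1 - p_star) * (1 - p)
    return -math.log(max(prob, eps))  # clamp to avoid log(0)


def reg_loss(t: Sequence[float], t_star: Sequence[float]) -> float:
    """Sum of R over the box coordinates (predicted minus ground truth)."""
    return sum(smooth_l1(a - b) for a, b in zip(t, t_star))


def multi_task_loss(probs: Sequence[float], labels: Sequence[int],
                    boxes: Sequence[Sequence[float]],
                    gt_boxes: Sequence[Sequence[float]],
                    n_cls: float, n_reg: float, lam: float = 1.0) -> float:
    """(1/N_cls) * sum L_cls + lambda * (1/N_reg) * sum p* . L_reg."""
    l_cls = sum(cls_loss(p, y) for p, y in zip(probs, labels)) / n_cls
    # The label p* gates the regression term, so negatives contribute nothing.
    l_reg = sum(y * reg_loss(t, g)
                for y, t, g in zip(labels, boxes, gt_boxes)) / n_reg
    return l_cls + lam * l_reg
```

With `lam` set to a larger value the box regression term dominates, while a smaller value emphasizes classification.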

Regarding model development, rather than creating a model from scratch, we built it according to our sample input requirements. We have used similar layers and filters as compared to the original Faster R-CNN architectures and gradually increased the number of filters. In addition, it is essential to consider the Faster R-CNN when analyzing our proposed model and algorithm. Our proposed framework consists of 24 convolutional layers, followed by two fully connected layers and six pooling layers. These layers are typical CNN layers with different filter numbers, sizes, and stride values.

For dataset design, we followed a two-step procedure for data preparation. Initially, we used the X-ray images of COVID-19 patients, which are available as open-source data.

Furthermore, we developed a custom dataset for training and evaluation, which comprised a total of 5450 chest radiography images across 2500 patient cases. To prepare the custom dataset for use in the experiment, we combined and modified two different publicly available datasets. For the custom dataset, we used 5450 sample images, whereas for the COVIDx dataset we used 13800 images.

Table 1
Dataset   Split   Normal   Non-COVID Pneumonia   COVID-19   Total
Custom    Train     1100                  3000         80    4180
Custom    Test       300                   950         20    1270
COVIDx    Train     7966                  5451        152   13569
COVIDx    Test       100                   100         31     231

The COVIDx dataset is updated continuously with images shared by researchers from different regions. As of May 7, 2020, it contains 183 X-Ray images diagnosed with COVID-19, 8066 labeled as normal, and 5551 cases identified as non-COVID Pneumonia. By merging 'Normal' and 'non-COVID Pneumonia' into a single 'non-COVID' class, we designed it as a binary class dataset.
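The class merge described above can be illustrated with a short sketch. The counts are the ones reported in Table 1; the assignment of columns to classes for the custom dataset is an assumption (it is taken to follow the same order as the COVIDx rows, which matches the stated totals), and the names `TABLE_1`, `to_binary`, and `dataset_total` are hypothetical.

```python
# Image counts per split as reported in Table 1 (column order assumed).
TABLE_1 = {
    "Custom": {
        "train": {"Normal": 1100, "non-COVID Pneumonia": 3000, "COVID-19": 80},
        "test": {"Normal": 300, "non-COVID Pneumonia": 950, "COVID-19": 20},
    },
    "COVIDx": {
        "train": {"Normal": 7966, "non-COVID Pneumonia": 5451, "COVID-19": 152},
        "test": {"Normal": 100, "non-COVID Pneumonia": 100, "COVID-19": 31},
    },
}


def to_binary(counts: dict) -> dict:
    """Merge every class except 'COVID-19' into a single 'non-COVID' class."""
    non_covid = sum(n for name, n in counts.items() if name != "COVID-19")
    return {"non-COVID": non_covid, "COVID-19": counts["COVID-19"]}


def dataset_total(dataset: str) -> int:
    """Total number of images over the train and test splits of one dataset."""
    return sum(sum(split.values()) for split in TABLE_1[dataset].values())


for name, splits in TABLE_1.items():
    for split_name, counts in splits.items():
        print(name, split_name, to_binary(counts))
```

The merged counts make the class imbalance explicit: the COVID-19 class is a small fraction of either dataset.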

The most noticeable fact is that very few COVID-19 patients have associated X-Ray images, which leads to a scarcity of available X-Ray images. For model building and training, we used Google's TensorFlow library and VGG-16 for high-performance numerical computation. Regarding cross-validation, our study used the K-fold cross-validation method (K=10) with the support of leave-one-out cross-validation. Algorithm 1 outlines the K-fold cross-validation procedure.

Algorithm 1: K-fold cross-validation
1. Define sets of model parameter values
2. Randomly shuffle the dataset
3. Split the dataset into k=10 groups
4. for each parameter set do
5.     for each resampling iteration do
6.         Hold-out specific samples
7.         Fit the model on the remainder
8.         Predict the hold-out samples
9.
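The inner loop of the procedure above (shuffle, split into k groups, hold out one group, fit on the rest, predict the held-out samples) can be sketched in plain Python. This is an illustration only: the outer loop over parameter sets is omitted, and the threshold classifier and toy data are hypothetical stand-ins for the actual Faster R-CNN model and X-Ray images.

```python
import random
from typing import Any, Callable, List, Sequence


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Shuffle the sample indices and split them into k disjoint groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(samples: Sequence[Any], labels: Sequence[int],
                   fit: Callable, predict: Callable,
                   k: int = 10, seed: int = 0) -> float:
    """Mean hold-out accuracy over k folds."""
    scores = []
    for held_out in k_fold_indices(len(samples), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(samples)) if i not in held]
        # Fit on the remainder, then predict the held-out samples.
        model = fit([samples[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        correct = sum(predict(model, samples[i]) == labels[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


def fit_threshold(xs: Sequence[float], ys: Sequence[int]) -> float:
    """Toy model: midpoint between the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_threshold(threshold: float, x: float) -> int:
    return int(x > threshold)


# Toy one-dimensional data with two well-separated classes.
samples = [float(i) for i in range(10)] + [float(100 + i) for i in range(10)]
labels = [0] * 10 + [1] * 10
mean_accuracy = cross_validate(samples, labels, fit_threshold, predict_threshold)
print(mean_accuracy)
```

With k=10, each iteration trains on roughly 90% of the samples and evaluates on the remaining 10%.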

In this section, we discuss the loss observation, followed by the results of model validation.

We performed experiments to detect and classify confirmed COVID-19 cases using X-Ray images and trained the models on two classes: non-COVID and COVID-19. The model was evaluated using the 10-fold cross-validation technique. We used 90% of the X-Ray images for training and the remaining 10% for testing or validation. Moreover, the loss function is essential for understanding the quality of the predictions. From Figure 3, we observe that the training loss and validation loss decrease gradually after every 100 epochs.

It is also noticeable that both the training loss and validation loss values increased significantly at the initial epoch because of the number of COVID-19 images in that specific class (Figure 3). The effectiveness of the model is assessed through testing, cross-validation, and direct image input testing. To evaluate the performance of the model, the completely trained model is validated on the same dataset using K-fold cross-validation.

Figure 4: Illustration of different sample images. Figure 4 (A, B, D, E, and F) shows samples that the model predicts as non-COVID and whose actual class is also non-COVID. However, in Figure 4 (C) the model predicts the sample as non-COVID, whereas the actual class is COVID-19. Figure 4 (G) depicts the confusion matrix generated using the 10-fold cross-validation method.
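Accuracy, sensitivity, and precision are all derived from the four cells of a binary confusion matrix such as the one in Figure 4 (G). The sketch below shows the standard formulas; the counts in the example are hypothetical and are not the paper's values, and which class is treated as "positive" is left to the caller because the text does not state it.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard metrics from a 2x2 confusion matrix.

    tp/fn: positive samples predicted correctly/incorrectly.
    tn/fp: negative samples predicted correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,  # also called recall
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=90, fp=5, tn=895, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because the COVID-19 class is rare in both datasets, accuracy alone can look high even when many COVID-19 cases are missed, which is why sensitivity and precision are reported alongside it.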

In this study, we have proposed a deep learning model to detect COVID-19 cases from chest X-Ray images. This automated system can perform binary classification without manual feature extraction with an accuracy of 97.36%. Moreover, the model can also be tested with a larger dataset and can work with real-time systems. Furthermore, it can be helpful in areas where test kits are in short supply. Therefore, for the initial assessment of COVID-19 patients, this tool can act as a useful diagnostic aid under the supervision of radiologists and doctors. At this point, we are working to make our model more robust so that it can handle both CT and X-Ray images.

SKD and KHS conceived and designed the study, had full access to all the data in the study, and take responsibility for the data and the accuracy of the model generation. SKD, KHS, and MR contributed to the writing of the article. MR and TUI contributed to the critical revision of the report. All data preparation and model development were carried out by SKD and KHS. All authors contributed to data acquisition, data analysis, and result validation, and reviewed and approved the final version.

All authors declare no competing interests.

Not required

Clinical features of patients infected with 2019 novel coronavirus in Wuhan

Reverse transcription-polymerase chain reaction

Sensitivity of Chest CT for COVID-19: Comparison to RT-PCR

Time course of lung changes on chest CT during recovery from 2019 novel coronavirus (COVID-19) pneumonia. Radiology

Chest CT findings in coronavirus disease-19 (COVID-19): relationship to duration of infection

Diagnosis of the Coronavirus disease (COVID-19): rRT-PCR or CT?

COVID-19 Pneumonia: what has CT taught us

China: a descriptive study. The Lancet Infectious Diseases

Relation between chest CT findings and clinical conditions of coronavirus disease (COVID-19) pneumonia: a multicenter study

Coronavirus Disease 2019 (COVID-19): Role of chest CT in diagnosis and management

A familial cluster of Pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster

Dermatologist-level classification of skin cancer with deep neural networks

Deep learning ensembles for melanoma recognition in dermoscopy images

Invasive Ductal Carcinoma Detection Based Using Deep Transfer Learning with Whole-Slide Images

Automatic detection of invasive ductal carcinoma in whole slide images with convolutional neural networks

Convolutional neural networks for multi-class brain disease detection using MRI images

Chexnet: Radiologist-level pneumonia detection on chest x-rays with deep learning

Attention U-Net Based Adversarial Architectures for Chest X-ray Lung Segmentation

An automatic method for lung segmentation and reconstruction in chest X-ray using deep neural networks. Computer methods and programs in biomedicine

COVID-Net: A tailored deep convolutional neural network design for detection of COVID-19 cases from chest radiography images

Molecular Diagnosis of a Novel Coronavirus (2019-nCoV) Causing an Outbreak of Pneumonia

Recent advances in the detection of respiratory virus infection in humans

Detection of coronavirus Disease (COVID-19) based on Deep Features

Automatic Detection of Coronavirus Disease (COVID-19) Using X-ray Images and Deep Convolutional Neural Networks

COVIDX-Net: A framework of deep learning classifiers to diagnose COVID-19 in x-ray images

Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning

COVID-19 image data collection

RSNA pneumonia detection challenge

Deep learning Enables Accurate Diagnosis of Novel Coronavirus (COVID-19) with CT images

A deep learning algorithm using CT images to screen for Corona Virus Disease (COVID-19)

Deep Learning-based Detection for COVID-19 from Chest CT using Weak Label

Deep Learning System to Screen Coronavirus Disease 2019 Pneumonia

Automated detection of COVID-19 cases using deep neural networks with X-ray images

[3] World Health Organization, "Emerging respiratory viruses, including COVID-19: methods for detection, prevention, response and control". Available at: https://openwho.org/courses/introduction-to-ncov (accessed on: March 22, 2020).