key: cord-0786967-43qxf408 authors: Elgendi, M.; Fletcher, R.; Howard, N.; Menon, C.; Ward, R. title: The Evaluation of Deep Neural Networks and X-Ray as a Practical Alternative for Diagnosis and Management of COVID-19 date: 2020-05-16 journal: nan DOI: 10.1101/2020.05.12.20099481 sha: 94fc692bac72f035e4ea12eaa522e28952a86524 doc_id: 786967 cord_uid: 43qxf408 High-resolution computed tomography radiology is a critical tool in the diagnosis and management of COVID-19 infection; however, in smaller clinics around the world, there is a shortage of radiologists available to analyze these images. In this paper, we compare the performance of 16 available deep learning algorithms to help identify COVID19. We utilize an already existing diagnostic technology (X-ray) and an already existing neural network (ResNet-50) to diagnose COVID-19. Our approach eliminates the extra time and resources needed to develop new technology and associated algorithm, thus aiding the front-line in the race against the COVID-19 pandemic. Results show that ResNet-50 is the optimal pretrained neural network for the detection of COVID-19, using three different cross-validation ratios, based on training time, accuracy, and network size. We also present a custom visualization of the results that can be used to highlight important visual biomarkers of the disease and disease progression. On March 11, 2020, the World Health Organization declared the COVID-19 virus as an international pandemic. 1 The virus spreads between people who are in close contact with one another through respiratory droplets produced by coughing or sneezing. 2 The current gold standard for COVID-19 detection is real time reverse transcription-polymerase chain reaction (RT-PCR). The test itself takes about 4 hours, however, the process before and after running the test, such as transporting the sample and sending the results, requires a significant amount of time. More importantly, the RT-PCR average turnaround time is 3-6 days, in addition to being relatively costly at an average of CA$4, 000 per test 3 . The need for a faster and relatively inexpensive technology for detecting COVID-19 is thus crucial to expedite universal testing. While the clinical presentation of COVID-19 is very diverse, a critical need is to identify the 20% of patients in which the virus has spread to the type II cells of the alveoli and bronchioles, which lead to hypoxia, organ failure, and death. 4 In order to meet this need, high-resolution computed tomography (HRCT) X-ray is commonly available worldwide, and the unique pattern of COVID-19 infection progression in the lungs has been identified. 5 However, despite the widespread availability of X-ray imaging, there is unfortunately a shortage of radiologist in most low-resource clinics and developing countries to help analyze and process these images. For this reason, computer algorithms, such as deep learning, that can automate the process of radiology have begun to attract great interest. 6 Note that X-ray costs about CA$40 per test 3 , making the use of it as a testing tool more feasible. Since the initial outbreak of the COVID-19, a few attempts have been made to apply deep learning to detect COVID-19. Narin et al. 7 reported an accuracy of 98% on a balanced dataset for detecting COVID-19 after investigating three pretrained neural networks. Sethy and Behera 8 explored 10 different pretrained neural networks, reporting an accuracy of 93% on a balanced dataset, for detecting COVID-19 on X-ray images. Zhang et al. 9 utilized only one pretrained neural network, scoring 93% on an unbalanced dataset. Hemdan et al. 10 looked into seven pretrained networks, reporting an accuracy of 90% on a balanced dataset. Apostolopoulos and Bessiana 11 evaluated five pretrained neural networks, scoring 98% of accuracy on an However, these attempts did not make clear which existing deep learning method would be the most efficient and robust for COVID-19 compared to many others. Moreover, some of these studies were carried out on unbalanced datasets. Our study aims to determine the optimal learning method, by investigating different types of pretrained networks on a balanced dataset, for COVID-19 testing. Additionally, we attempt to visualize the ResNet-50's weights, which were used for decision making, on-top of the original X-ray image to visually represent the output of the network. To determine the optimal existing pretrained neural network for the detection of COVID-19, we used the CoronaHack -Chest X-Ray-Dataset. The chest X-ray images dataset contains 85 images from patients diagnosed with COVID-19 and 1,576 images from healthy subjects. Five x-ray images collected from the Lateral position were deleted for consistency. Then, we balanced the dataset to include 53 COVID-19 and 53 healthy subjects. After creating a balanced dataset, which is important for producing solid findings, 16 pretrained networks were analyzed following the framework shown in Figure 1 . The 16 pretrained neural networks were trained on more than a million images to classify images into 1000 object categories, such as keyboard, mouse, pencil, and many animals. Each network has learned rich feature representations from a large spectrum of images. By replacing the last fully connect layer, as shown in Figure 1 , and retrain (fine-tune deeper layers) the neural network with the new dataset (53 COVID-19 and 53 healthy), the neural network can detect COVID-19 and healthy subjects. The performance of 16 pretrained neural networks using the same dataset (53 COVID-19 and 53 healthy), with different cross validation ratios, is shown in Table1. Interestingly, we found that the following five pretrained neural networks achieved a validation accuracy of 100% using 80-20% cross validation: ResNet-50, DarkNet-53, VGG-19, DenseNet-201, ResNet-18, ResNet-101, and GoogLeNet. The most consistent pretrained neural network in detecting COVID-19, regardless of the cross validation ratio, was ResNet-50, followed by DarkNet-53, followed by VGG-19. DenseNet-201, ResNet-18 and GoogLeNet achieved a validation accuracy below 90% for 50-50% cross validation suggesting that these neural networks are not robust enough for detecting COVID-19 compared to, for example, ResNet-50. In addition, ResNet-101 achieved 87.5% validation accuracy using 70-30% cross validation, suggesting that it is not robust for detecting COVID-19. Despite that the Inception-ReNet-v2 was pretrained on trained on more than a million images from the ImageNet database, 23 it was ranked the lowest in terms of the overall performance, suggesting it is not suitable to use for detecting COVID-19. Each pretrained network has a structure that is different from others, e.g., number of layers and size of input. The most important characteristics of a pretrained neural network are as follows: accuracy , speed, and size. 24 Greater accuracy increases 2/7 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 16, 2020. . the specificity and sensitivity for COVID-19 detection. Increased speed allows for faster processing. Smaller sized networks can be deployed on systems with less computational resources. Therefore, the optimal network is the network that increases accuracy, utilizes less training time, and that is relatively small in size. Typically, there is a tradeoff between the three characteristics, and not all can be satisfied at once. However, our results show that it is possible to satisfy all three requirements. ResNet-50 outperformed all other networks, while having increased speed and increased accuracy in a relatively small-sized network, as shown in Figure 2 . A visual comparison between all investigated pretrained neural networks is presented, with respect to the three characteristics. The x-axis is the training time (logarithmic scale) in seconds, the y-axis is the overall validation accuracy using three different cross validation ratios and the bubble size represents the network size. A comparison of optimal neural networks recommended in previous studies, along with the optimal neural network suggested by this work, is shown in Table 2 . Narin et al. 7 used a balanced sample size of 100 subjects (50 COVID-19 and 50 healthy). They investigated three pretrained neural networks: ResNet50, InceptionV3 and InceptionResNetV2, with a cross validation ratio of 80-20%. They found that ResNet50 outperformed the other two networks, scoring a validation accuracy of 98%. Sethy and Behera 8 used a balanced sample size of 50 subjects (25 COVID-19 and 25 healthy). They extracted features from All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 16, 2020. . Two studies reported results based on unbalanced datasets: Zhang et al. 9 and Apostolopoulos and Bessiana 11 . Zhang et al. 9 created a deep learnig network based on ResNet-50, which achieved an accuracy of 96% with a dataset of 70 COVID-19 and 30 Healthy subjects. Apostolopoulos and Bessiana 11 used a sample size of 224 COVID-19 and 504 healthy subjects. They tested five pretrained neural networks: VGG19, InceptionV3, InceptionResNetV2, Xception, and MobileNetV2. They found that VGG19 scored highest accuracy of 98.75% , with a cross validation ratio of 90-10%. Interestingly, ResNet-50 network achieved a validation accuracy of 100%, outperforming other studies that also evaluated ResNet-50 (note that these studies only compared ResNet-50 to a select few neural netowrks, whereas here we compared a total of 16). One possible reason for the difference in performace is that the dataset in our study differed from the datasets in other studies. Another reason is the network's parameter settings (e.g., learning rate). Note that our finding confirms that ResNet-50 is able to capture COVID-19 from chest X-ray images. While the ResNet-50 algorithm can distinguish COVID-19 patients from healthy individuals with 100% accuracy, we note the following limitations: 1. The sample size was relatively small, at 58 images for COVID-19 images. 2. The images were not annotated, which is an essential aspect that distinguishes the measurement position of the X-ray image. Thus, we were unable to differentiate between the measurement sites posteroanterior vs. anteroposterior of each X-ray image. Lateral positions were easily identifiable. 3. Our investigation compared COVID-19 patients to healthy individuals. As a next step in our investigation, the X-ray data from COVID-19 patients should also be compared against X-ray data from other respiratory infection patients in order to improve the specificity of the algorithm for detection of COVID-19. An important component to the automated analysis of the X-ray data is the visualization of the X-ray images, using colors to identify the critical visual biomarkers as well as indication of disease progression. This step can make disease identification more intuitive and easy to understand, especially for healthcare workers with minimal knowledge about COVID-19. The visualization can also expedite the diagnosis process. As shown in Figure 3 , COVID-19 and healthy subjects were identified based on the activation images and weights. In conclusion, fast, accurate and accessible tools are needed to help diagnose and manage COVID-19 testing infection. The current gold standard laboratory tests are time consuming and costly, adding delays to the testing process. Chest X-ray imaging for COVID-19 testing is widely available and computer-aided radiology can address existing shortages of radiologists. In this paper, we have reviewed and compared many deep learning techniques currently available in the market for detecting COVID-19 detection in chest X-ray images. After investigating 16 different pretrained neural networks, our results showed that ResNet-50 is the optimal pretrained deep learning network for detection of COVID-19 in HRCT chest X-ray images. Work to improve the specificity of these algorithms in the context of other respiratory infections is ongoing. 14 . All the experiments in our work were carried out in MATLAB 2020a on a PC with the following configuration: 3.70 GHz Intel(R) Core(TM) i7-6500U CPU 2.59 GHz, and 16.00 GB RAM. The dataset was divided into three cross validation sets: 1) 70% training and 30% testing, 2) 80% training and 20% testing, 3) 50% training and 50% testing. The last fully-connected layer was changed into the new task to classify two classes. The following parameters were fixed for the 16 pretrained neural networks: 1) learning rate was set to 0.0001, validation frequency was set to 5, max epochs was set to 8, and the min batch size was set to 64. The class activation mapping was carried by multiplying the image activations from the last ReLU layer by the weights of the last fully connected layer of the ResNet-50 network, called "activation 4 9 r elu, as f ollows : C(x, y) = ∑ W l=175 F l=173 (x, y) where C is the class activation map, l is the layer number, F is the image activations from ReLu layer (l = 173) with dimensions of 7 × 7 × 2048. Here, W referes to the weights at l = 175 with dimensions of 1 × 2048. Thus, the dimensions of C is 7 × 7. Then, we resized C to match the size of the original image and visualized it using a jet colormap. The CoronaHack -Chest X-Ray-Dataset can be downloaded from this link: https://www.kaggle.com/praveengovi/ coronahack-chest-xraydataset. Who declares covid-19 a pandemic Clinical features of patients infected with 2019 novel coronavirus in Wuhan Cost analysis of multiplex pcr testing for diagnosing respiratory virus infections Characteristics of and important lessons from the coronavirus disease 2019 (covid-19) outbreak in china: summary of a report of 72 314 cases from the chinese center for disease control and prevention Chest ct findings in patients with coronavirus disease 2019 and its relationship with clinical features Deep learning in radiology. Acad. radiology Automatic detection of coronavirus disease (covid-19) using x-ray images and deep convolutional neural networks Detection of coronavirus disease (covid-19) based on deep features Covid-19 screening on chest x-ray images using deep learning based anomaly detection Covidx-net: A framework of deep learning classifiers to diagnose covid-19 in x-ray images Covid-19: automatic detection from x-ray images utilizing transfer learning with convolutional neural networks Deep residual learning for image recognition Open source neural networks in c Very deep convolutional networks for large-scale image recognition Densely connected convolutional networks Squeezenet: Alexnet-level accuracy with 50x fewer parameters and< 0.5 mb model size Going deeper with convolutions An extremely efficient convolutional neural network for mobile devices Mobilenetv2: Inverted residuals and linear bottlenecks Inception-v3 for flower classification Xception: Deep learning with depthwise separable convolutions Learning transferable architectures for scalable image recognition Inception-v4, inception-resnet and the impact of residual connections on learning Imagenet large scale visual recognition challenge M.E. designed and led the investigation. R.F., N.H., C.M., and R.W. conceived the study and drafted the manuscript. All authors approved the final manuscript. The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.