key: cord-0953242-elhbfoyo
authors: Maharjan, Jenish; Calvert, Jacob; Meng, Emily Pellegrini; Green-Saxena, Abigail; Hoffman, Jana; McCoy, Andrea; Mao, Qingqing; Das, Ritankar
title: Application of deep learning to identify COVID-19 infection in posteroanterior chest X-rays
date: 2021-07-24
journal: Clin Imaging
DOI: 10.1016/j.clinimag.2021.07.004
sha: 27a04e8c47cf372c3cf302e2d8149b713d8dd1ed
doc_id: 953242
cord_uid: elhbfoyo

INTRODUCTION: Posteroanterior chest X-rays (CXRs) are recommended over computed tomography scans for COVID-19 diagnosis, as CXRs can be obtained with relatively low risk of facility contamination. The objective of this study was to assess seven configurations of six convolutional deep neural network architectures for classification of CXRs as COVID-19 positive or negative.

METHODS: The primary dataset consisted of 294 COVID-19 positive and 294 COVID-19 negative CXRs, the latter comprising roughly equal numbers of pneumonia, emphysema, fibrosis, and healthy images. We used six common convolutional neural network architectures: VGG16, DenseNet121, DenseNet201, MobileNet, NasNetMobile, and InceptionV3. We studied six models (one for each architecture) which were pre-trained on a vast repository of generic (non-CXR) images, as well as a seventh, a DenseNet121 model which was pre-trained on a repository of CXR images. For each model, we replaced the output layers with custom fully connected layers for the task of binary classification of images as COVID-19 positive or negative. Performance metrics were calculated on a hold-out test set with CXRs from patients who were not included in the training/validation set.

RESULTS: When pre-trained on generic images, the VGG16, DenseNet121, DenseNet201, MobileNet, NasNetMobile, and InceptionV3 architectures respectively produced hold-out test set areas under the receiver operating characteristic curve (AUROCs) of 0.98, 0.95, 0.97, 0.95, 0.99, and 0.96 for the COVID-19 classification of CXRs.
The X-ray pre-trained DenseNet121 model, in comparison, had a test set AUROC of 0.87.

DISCUSSION: Common convolutional neural network architectures with parameters pre-trained on generic images yield high-performance and well-calibrated COVID-19 CXR classification.

In December 2019, a cluster of pneumonia cases of unknown etiology emerged, rapidly evolving into a world-wide health crisis with significant social, health, and financial consequences [1]. Given the rapid spread of infection [2], the continued concern that asymptomatic carriers are contributing to community transmission [3-7], the depletion of hospital resources due to high influxes of patients [8], and the current absence of specific therapeutic drugs and widely available vaccines for treatment of COVID-19 infection [1, 9], it is essential to detect onset at its early stages.

Radiological examinations play an important role in the diagnosis and evaluation of this global health emergency [10-12]. Common radiological findings of the infection include multiple ground-glass opacities and interlobular septal thickening in the lungs, with significant correlations between the degree of pulmonary inflammation and main COVID-19 clinical symptoms [10]. Although reverse-transcription polymerase chain reaction (RT-PCR) remains the standard to diagnose COVID-19 infection [11, 13, 14], issues with the false negative rate [15], sensitivity [12], and limited supply [16] of RT-PCR assays have hindered prompt diagnosis. Complementary to RT-PCR assays, chest radiography can identify early-phase lung infection [17] and prompt larger surveillance efforts [18]. In particular, there has been a recent flurry of work concerning the use of chest X-rays (CXR) to detect COVID-19 [12, 16, 19-35].
All of these studies use models based on a number of convolutional neural network (CNN) architectures; in most instances, performance comparisons are limited to models derived from only one or a few architectures [12, 24-30, 32, 34]. Apart from detection, we mention that other studies use related methods to detect and predict the severity of pneumonia among patients already known to be COVID-19 positive [36-38]. Additionally, a great deal of effort has been devoted to the use of computed tomography (CT) for COVID-19 detection [14, 31, 34]. Because CT scanning presents a risk of contamination of radiology facilities, the American College of Radiology recommends the use of CXR over CT [39]. Additionally, X-ray imaging systems are cheaper and more widely available than CT systems [32].

[19]. We note that, while we reference the ImageNet dataset [43], we used parameters for the six CNN architectures that had already been derived from training on ImageNet.

Data preprocessing and labeling. Both datasets contained data obtained from single posteroanterior (or "front-on") X-rays as well as from CT scans composed of multiple concerted X-rays. We chose to exclusively use single CXR images, as they are the simplest and most common form of chest radiograph, and because the American College of Radiology has recommended them over CT scans due to the potential for CT scanners to spread the virus [39]. Due to the relative scarcity of COVID-19 positive images, we used all available images (294 images), despite the potential bias introduced by using multiple images from the same individual. For convenience, we selected from ChestX-ray14 as many COVID-19 negative images (294) as there were COVID-19 positive images in the COVID-19 dataset. We chose approximately equal numbers of pneumonia, emphysema, fibrosis, and healthy images.
We selected these conditions because they present with shortness of breath and cough, symptoms that overlap with the primary symptoms of COVID-19 [44, 45] and may therefore motivate a clinician to order a chest radiograph to determine COVID-19 status in those patients. We tabulate the two demographic pieces of information, sex and age, which were available for most of the 588 images and were used as inputs to the algorithm (Table 1). All images were resized to 224 × 224 pixels. We selected 220 COVID-19 positive negative, were allocated to a hold-out test set. The images for the hold-out test set were selected such that no images from patients in the hold-out test set were seen by the models during training. The prevalence of COVID-19 positive images in the training and testing sets was similar to that reported in another imaging analysis study for COVID-19 screening [46].

We built seven models from standard building blocks that we found to be effective for this classification task. We selected six of the most common configurations of CNN architectures: VGG16 (named for the Visual Geometry Group at the University of Oxford) [47], DenseNet121 and DenseNet201 [48], InceptionV3 [49], MobileNet [50], and NasNetMobile [51]. We chose these six architectures due to their popularity and the accessibility of ImageNet pre-trained parameters (available in the Python Keras library), as well as their contrasting depth and number of parameters. Among the six architectures, VGG16 has the highest number of parameters while MobileNet has the lowest. In terms of topological depth (including activation layers, normalization layers and so on), DenseNet201 is the deepest and VGG16 is the shallowest. We adapted each of the six CNN architectures pre-trained using the ImageNet dataset by removing the classifier blocks and adding a custom classifier block to each.
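The adaptation described above can be illustrated without a deep learning framework. The sketch below assumes that a pre-trained backbone has already reduced an image to a feature vector, and shows only a custom classifier block that ends in a two-way softmax (non-covid vs. covid). The layer sizes, the ReLU hidden layer, the random weights, and the names `dense`, `init_layer`, and `classifier_head` are illustrative assumptions, not the configuration used in the study.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Small random weights; a stand-in for parameters learned during training."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def classifier_head(features, hidden, output):
    """Hypothetical custom block: dense + ReLU, then dense + softmax over 2 classes."""
    h = relu(dense(features, *hidden))
    return softmax(dense(h, *output))


rng = random.Random(0)
n_features = 8                     # assumed size of the backbone's pooled features
hidden = init_layer(n_features, 4, rng)
output = init_layer(4, 2, rng)     # two outputs: [non-covid, covid]

features = [rng.random() for _ in range(n_features)]
probs = classifier_head(features, hidden, output)
print(probs)
```

In a transfer-learning setting of this kind, the backbone parameters come from pre-training and only the head needs to be defined for the new task; whether the backbone is then frozen or fine-tuned is a separate design choice that this sketch does not address.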
The final layer in each custom classifier block was designed with a softmax output for the two categorical outputs: covid and non-covid. We call these six models "off the shelf" (OTS) to indicate that they were pre-trained using ImageNet data.

Table 2). The metrics associated with particular operating points varied more between the models. The ROC curves obtained by all models on the hold-out test set are presented in Figure 1. The curves reflect the comparably sound validation performance across the models and a small decrease in test set performance relative to the validation average. This performance is not obtained at the expense of calibration, which we address using temperature scaling [52]. As demonstrated in Supplementary Figures 1 and 2, after temperature scaling, the expected difference between the accuracy and confidence of OTS VGG16 model classifications is small, indicating good calibration. Grad-CAM [29] heat maps roughly localize the regions of the X-rays which had greatest relevance to OTS VGG16 classifications (see supplementary methods, Supplementary Figure 3). Figure 2 exhibit the performance of the final models on the hold-out test set. Here, label 1 corresponds to COVID-19 positive images. The support for the two classes in the test set is 15 per class.

There is evidence from the rapidly expanding bank of infectious disease literature that COVID-19 patients can be diagnosed more efficiently, and at lesser expense, through analysis of radiographic image data than through RT-PCR assays [9, 13, 16, 23, 46, 53-55]. Deep learning-based detection of COVID-19 from chest radiographs has been studied toward this end [24-29, 31-34, 36, 56-58]. Aligned with the recommendation of the American College of Radiology, we characterized the retrospective performance of common deep learning models in the classification of COVID-19 from single CXR images.
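The AUROC used here to summarize retrospective performance can be computed directly from classifier scores. The following is a minimal sketch using the rank-based (Mann-Whitney) formulation, in which AUROC is the probability that a randomly chosen positive image receives a higher score than a randomly chosen negative one. The function name `auroc` and the example labels and scores are hypothetical and are not data from this study.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random negative.

    A tie between a positive and a negative score counts as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores; label 1 denotes a COVID-19 positive image.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(auroc(labels, scores))
```

The pairwise loop is quadratic in the number of images, which is acceptable for a test set of a few dozen images; a sort-based implementation would be preferable for large sets.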
Our results join the growing body of evidence that a variety of CNN architecture-based models trained on CXR images can be successfully used to distinguish COVID-19 infection from conditions with similar clinical presentation (Table 2, Figures 1 and 2) [33, 42, 43, 57], and also demonstrate that the performance need not come at the expense of calibration (Supplementary Figures 1 and 2). To better understand the regions of CXR images giving rise to a particular classification, rough localization can be performed using a standard tool like Grad-CAM (Supplementary Figure 3). In many cases, these architectures were not designed for COVID-19 detection in CXR and benefitted from pre-training on generic (non-CXR) image data instead of CXR data (likely due to the relative dearth of CXR data). This observation is reflected by the disparity in performance between the OTS DenseNet121 model and its XRT counterpart (Table 2). Moreover, strong performance can apparently be achieved with relatively few COVID-19 positive examples.

Previous works have applied machine learning models to COVID-19 identification with CXR images, but without differential diagnoses (e.g. COVID-19 positive vs. healthy or no-finding patients) [14, 25, 59] or distinguishing COVID-19 only from pneumonia patients [60-62]. Additionally, many studies applying machine learning to evaluate X-ray images for COVID-19 diagnosis only examined small or private datasets, or datasets with large class imbalances [24, 28, 29, 58].

Common CNN architectures, pre-trained on generic image data, produce high-performing and well-calibrated models for COVID-19 detection using CXR images. These results support future prospective validation for continued optimization of ML and X-rays for COVID-19 diagnosis.
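Temperature scaling, the calibration method cited above [52], divides a trained model's logits by a single positive scalar T fitted on held-out data; because every logit is divided by the same T, the predicted class does not change. The sketch below fits T by a simple grid search over negative log-likelihood and reports the expected calibration error (ECE), the weighted mean gap between accuracy and confidence. The grid, the number of bins, the function names, and the toy logits are assumptions made for illustration, not the procedure or data of this study.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    losses = [-math.log(softmax(row, temperature)[y]) for row, y in zip(logit_rows, labels)]
    return sum(losses) / len(losses)


def fit_temperature(logit_rows, labels, grid=None):
    """Pick the temperature on a grid (assumed: 0.05 to 10.0) that minimizes NLL."""
    grid = grid or [0.05 * k for k in range(1, 201)]
    return min(grid, key=lambda t: nll(logit_rows, labels, t))


def expected_calibration_error(logit_rows, labels, temperature=1.0, n_bins=10):
    """Weighted mean |accuracy - confidence| over equal-width confidence bins."""
    bins = [[] for _ in range(n_bins)]
    for row, y in zip(logit_rows, labels):
        probs = softmax(row, temperature)
        conf = max(probs)
        pred = probs.index(conf)
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if pred == y else 0.0))
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            acc = sum(a for _, a in b) / len(b)
            ece += len(b) / len(labels) * abs(acc - avg_conf)
    return ece


# Toy, overconfident validation logits (hypothetical): every row has the same
# margin, and 8 of the 10 predictions are correct.
val_logits = [[3.0, -3.0]] * 5 + [[-3.0, 3.0]] * 5
val_labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

T = fit_temperature(val_logits, val_labels)
ece_before = expected_calibration_error(val_logits, val_labels)
ece_after = expected_calibration_error(val_logits, val_labels, T)
print(T, ece_before, ece_after)
```

In this toy case the uncalibrated model is almost certain of every prediction while being right only 80% of the time, so the fitted temperature is greater than 1 and the ECE falls. On real data with varied confidences the improvement is typically smaller and depends on the binning.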
COVID-19: Epidemiology, Evolution, and Cross-Disciplinary Perspectives
Preparing for large-scale community transmission of COVID-19: guidance for countries and areas in the WHO Western Pacific Region
Responding to community spread of COVID-19: interim guidance
Presumed Asymptomatic Carrier Transmission of COVID-19
Clinical characteristics of 24 asymptomatic infections with COVID-19 screened among close contacts in Nanjing
Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship
First Case of 2019 Novel Coronavirus in the United States
What does the coronavirus mean for the U.S. health care system? Some simple math offers alarming answers
Correlation of Chest CT and RT-PCR Testing in Coronavirus Disease 2019 (COVID-19) in China: A Report of 1014 Cases
Chest CT Findings in Patients With Coronavirus Disease 2019 and Its Relationship With Clinical Features
COVID-19): A Perspective from China
Automatic Detection of Coronavirus Disease (COVID-19) Using X-ray Images and Deep Convolutional Neural Networks
Radiological diagnosis of new coronavirus infected pneumonitis: Expert recommendation from the Chinese Society of Radiology
Deep Learning-Based Quantitative Computed Tomography Model in Predicting the Severity of COVID-19: A Retrospective Study in 196 Patients
A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster
A Weakly-Supervised Framework for COVID-19 Classification and Lesion Localization From Chest CT
Chest CT findings in 2019 novel coronavirus (2019-nCoV) infections from Wuhan, China: key points for the radiologist
Imaging Profile of the COVID-19 Infection: Radiologic Findings and Literature Review
ChestNet: A Deep Neural Network for Classification of Thoracic Diseases on Chest Radiography
Clinical features of patients infected with 2019 novel coronavirus in Wuhan
CT Imaging of the 2019 Novel Coronavirus (2019-nCoV) Pneumonia. Radiology 2020
Emerging coronavirus 2019-ncov pneumonia
CT Imaging Features of 2019 Novel Coronavirus (2019-nCoV)
Automatically discriminating and localizing COVID-19 from community-acquired pneumonia on chest X-rays
COVID-Net: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images
Automated detection of COVID-19 cases using deep neural networks with X-ray images
CoroNet: A deep neural network for detection and diagnosis of COVID-19 from chest x-ray images
Deep Learning-Based Decision-Tree Classifier for COVID-19 Diagnosis From Chest X-ray Imaging
Can AI Help in Screening Viral and COVID-19 Pneumonia?
COVID-19 detection using deep learning models to exploit Social Mimic Optimization and structured chest X-ray images using fuzzy color and stacking approaches
AI-Driven Tools for Coronavirus Outbreak: Need of Active Learning and Cross-Population Train/Test Models on Multitudinal/Multimodal Data
Truncated inception net: COVID-19 outbreak screening using chest X-rays
Shallow Convolutional Neural Network for COVID-19 Outbreak Screening Using Chest X-rays
Deep neural network to detect COVID-19: one architecture for both CT Scans and Chest X-rays
COVID-19: Prediction, Decision-Making, and its Impacts
COVID-19 in CXR: from Detection and Severity Scoring to Patient Disease Monitoring
A Deep Learning Approach for COVID-19 & Viral Pneumonia Screening with X-ray Images
BS-Net: Learning COVID-19 pneumonia severity on a large chest X-ray dataset
ACR Recommendations for the use of Chest Radiography and Computed Tomography
COVID-19 Image Data Collection: Prospective Predictions are the Future
Chestx-ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases
Imaging of Community-acquired Pneumonia
A Review of Coronavirus Disease-2019 (COVID-19
AI-assisted CT imaging analysis for COVID-19 screening: Building and deploying a medical AI system
Very Deep Convolutional Networks for Large-Scale Image Recognition
Densely Connected Convolutional Networks
Rethinking the Inception Architecture for Computer Vision
MobileNetV2: Inverted Residuals and Linear Bottlenecks
Learning Transferable Architectures for Scalable Image Recognition
On calibration of modern neural networks
Grad-cam: Visual explanations from deep networks via gradient-based localization
Time course of lung changes on chest CT during recovery from 2019 novel coronavirus (COVID-19) pneumonia
COVID-19 pneumonia: what has CT taught us? The Lancet Infectious Diseases
PDCOVIDNet: a parallel-dilated convolutional neural network architecture for detecting COVID-19 from chest X-ray images
Unveiling COVID-19 from CHEST X-Ray with Deep Learning: A Hurdles Race with Small Data
A Deep Feature Learning Model for Pneumonia Detection Applying a Combination of mRMR Feature Selection and Machine Learning Models
Detection of Coronavirus Disease (COVID-19) Based on Deep Features
AI-assisted CT imaging analysis for COVID-19 screening: Building and deploying a medical AI system in four weeks
Deep learning Enables Accurate Diagnosis of Novel Coronavirus (COVID-19) with CT images
The diagnostic evaluation of Convolutional Neural Network (CNN) for the assessment of chest X-ray of patients infected with COVID-19
The role of chest imaging in patient management during the COVID-19 pandemic: a multinational consensus statement from the Fleischner Society
Covid-19 Imaging Tools: How Big Data is Big? Journal of Medical Systems