key: cord-266055-ki4gkoc8
authors: Kikkisetti, S.; Zhu, J.; Shen, B.; Li, H.; Duong, T.
title: Deep-learning convolutional neural networks with transfer learning accurately classify COVID19 lung infection on portable chest radiographs
date: 2020-09-02
journal: nan
DOI: 10.1101/2020.09.02.20186759
sha:
doc_id: 266055
cord_uid: ki4gkoc8
Portable chest x-ray (pCXR) has become an indispensable tool in the management of Coronavirus Disease 2019 (COVID-19) lung infection. This study employed deep-learning convolutional neural networks to classify COVID-19 lung infections on pCXR from normal and related lung infections to potentially enable more timely and accurate diagnosis. This retrospective study employed a deep-learning convolutional neural network (CNN) with transfer learning to classify pCXRs of COVID-19 pneumonia (N=455) from those of normal subjects (N=532), bacterial pneumonia (N=492), and non-COVID viral pneumonia (N=552). The data were split into 75% training and 25% testing. Five-fold cross-validation was used. Performance was evaluated using receiver operating characteristic (ROC) curve analysis. Performance was compared between the CNN operating on the whole pCXR and on segmented lungs. The CNN accurately classified COVID-19 pCXR from those of normal, bacterial pneumonia, and non-COVID-19 viral pneumonia patients in a multiclass model. The overall sensitivity, specificity, accuracy, and AUC were 0.79, 0.93, 0.79, and 0.85, respectively (whole pCXR), and 0.91, 0.93, 0.88, and 0.89, respectively (CXR of segmented lungs). The performance was generally better using segmented lungs. Heatmaps showed that the CNN accurately localized areas of hazy appearance, ground-glass opacity and/or consolidation on the pCXR. A deep-learning convolutional neural network with transfer learning accurately classifies COVID-19 on portable chest x-ray against normal, bacterial pneumonia or non-COVID viral pneumonia. This approach has the potential to help radiologists and frontline physicians by providing more timely and accurate diagnosis.
Coronavirus Disease 2019 (COVID-19) is a highly infectious disease that causes severe respiratory illness (1, 2). It was first reported in Wuhan, China in December 2019 (3) and was declared a pandemic on March 11, 2020 (4). The first confirmed case of COVID-19 in the United States was confirmed in Washington State on January 20, 2020 (5). Soon after, Washington, California and New York reported outbreaks. COVID-19 has already infected 10 million people and killed more than 0.5 million worldwide, and the United States has become the worst-affected country, with more than 2.4 million diagnosed cases and at least 122,796 deaths (https://coronavirus.jhu.edu, accessed June 28, 2020). There have been recent spikes of COVID-19 cases across many states and around the world, and second waves and recurrences are likely. A definitive test for COVID-19 infection is reverse transcription polymerase chain reaction (RT-PCR) of a nasopharyngeal or oropharyngeal swab specimen (6, 7). Although RT-PCR has high specificity, it has low sensitivity, a high false-negative rate, and a long turnaround time (6, 7) (currently ~4 days, although this is improving and other tests are becoming available (8)). By contrast, portable chest X-ray (pCXR) is convenient to perform, has a fast turnaround, and is well suited for imaging contagious patients and for longitudinal monitoring of critically ill patients in intensive care units because the equipment can be readily disinfected, preventing cross-infection. pCXR of COVID-19 infection has certain characteristic features, such as predominantly bilateral, peripheral, and lower-lobe involvement, with ground-glass opacities with or without airspace consolidations as the disease progresses. These characteristics generally differ from those of other lung pathologies, such as bacterial pneumonia or other viral (non-COVID-19) lung infections.
Based on CXR and laboratory findings, clinicians might start patients on empirical treatment before the RT-PCR results become available, or even if the RT-PCR result comes back negative, because of the high false-negative rate of RT-PCR. Early treatment in COVID-19 patients is associated with better clinical outcomes. Similarly, computed tomography (CT), which offers relatively more detailed features (such as subtle ground-glass opacity (9, 10)), has also been used in the context of COVID-19. However, the CT suite and equipment are more challenging to disinfect, so CT is much less suitable for examining patients suspected of or confirmed with contagious diseases in general and COVID-19 in particular. Longitudinal CT monitoring of critically ill patients in intensive care units is also challenging. In short, pCXR has become an indispensable imaging tool in the management of COVID-19 infection, is often one of the first examinations a patient suspected of COVID-19 infection receives in the emergency room, and is well suited for longitudinal monitoring of critically ill patients in intensive care units. The usage of pCXR under COVID-19 pandemic circumstances is unusual in many respects. For instance, pCXR is preferred because it can be used at the bedside without moving the patient, but the imaging quality is not as good as that of conventional CXR (11). In addition, COVID-19 patients may not be able to take full inspirations during the examination, obscuring possible pathology, especially in the lower lung fields. Many sicker patients may be positioned on their side, which compromises imaging quality. Consequently, pCXR data acquired under COVID-19 pandemic circumstances are suboptimal and may be more challenging to interpret. Moreover, pCXR is increasingly read by non-chest radiologists in some hospitals because of increasing demand, resulting in reduced accuracy and efficiency. pCXR images contain important clinical features that could easily be missed by the naked eye.
medRxiv preprint doi: https://doi.org/10.1101/2020.09.02.20186759; this version posted September 2, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Computer-aided methods can improve the efficiency and accuracy of pCXR interpretation, which in turn provides more timely and relevant information to frontline physicians. Deep-learning artificial intelligence (AI) has become increasingly popular for analyzing diagnostic images (12, 13). AI has the potential to facilitate disease diagnosis, staging of disease severity and longitudinal monitoring of disease progression. One common machine-learning algorithm is the convolutional neural network (CNN) (14, 15), which takes an input image, learns important features in the image such as size or intensity, and saves these parameters as weights and biases to differentiate types of images (16, 17). The CNN architecture is well suited for analyzing images. Moreover, the majority of machine-learning algorithms to date are trained to solve specific tasks, working in isolation. Models have to be rebuilt from scratch if the feature-space distribution changes. Transfer learning overcomes this isolated learning paradigm by using knowledge acquired for one task to solve related ones. Transfer learning is particularly important for data with small sample sizes because the pre-trained weights enable more efficient training and improved performance (18, 19). Many AI algorithms based on deep-learning convolutional neural networks have been deployed for pCXR applications (20-24), and these algorithms can be readily repurposed for COVID-19 pandemic circumstances.
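To make the transfer-learning idea concrete, the following is a minimal sketch in Python using the Keras API of TensorFlow. It is an illustration under stated assumptions, not the authors' code. It borrows the settings reported in the methods of this paper: VGG16 pre-trained on ImageNet, 224x224 RGB inputs, four classes, the Adam optimizer with a learning rate of 0.001, categorical cross-entropy, a batch size of 32 and 50 epochs. Freezing the convolutional base, the 256-unit dense layer, and the function name build_transfer_model are assumptions introduced here; the paper does not describe the classification head.

```python
# Settings taken from the methods section of this paper.
IMAGE_SIZE = (224, 224)          # VGG16 input resolution
INPUT_SHAPE = IMAGE_SIZE + (3,)  # RGB images
NUM_CLASSES = 4                  # COVID-19, normal, bacterial, non-COVID viral
BATCH_SIZE = 32
EPOCHS = 50
LEARNING_RATE = 0.001


def build_transfer_model(num_classes=NUM_CLASSES, learning_rate=LEARNING_RATE):
    """Build a VGG16-based classifier that reuses ImageNet weights.

    Hypothetical helper: the head layout and the frozen base are assumptions.
    """
    # Imported lazily so the settings above can be inspected without TensorFlow.
    import tensorflow as tf

    # Convolutional base with pre-trained ImageNet weights, without the
    # original 1000-class classification layers.
    base = tf.keras.applications.VGG16(
        weights="imagenet", include_top=False, input_shape=INPUT_SHAPE
    )
    base.trainable = False  # assumption: reuse the learned features unchanged

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(256, activation="relu"),  # assumed head size
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="categorical_crossentropy",  # expects one-hot encoded labels
        metrics=["accuracy"],
    )
    return model
```

With TensorFlow installed, a model built this way would be trained with something like model.fit(x_train, y_train, batch_size=BATCH_SIZE, epochs=EPOCHS), where x_train and y_train stand for the resized images and their one-hot labels. Only the new dense layers are trained in this sketch, which is what keeps training cheap on a small dataset.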
While there are already many papers describing the prevalence and radiographic features of COVID-19 lung infection on pCXR (see reviews (25, 26)), there are only a few peer-reviewed AI papers (27-32) and non-peer-reviewed papers (33-36) that classify CXRs of COVID-19 patients from CXRs of normal subjects or of patients with related lung infections. The full potential of AI applications of pCXR under COVID-19 pandemic circumstances is not yet realized. The goal of this pilot study is to employ deep-learning convolutional neural networks to classify normal, bacterial infection, and non-COVID-19 viral infection (such as influenza) against COVID-19 infection on pCXR. The performance was evaluated using receiver operating characteristic (ROC) curve analysis. Heatmaps were also generated to visualize and assess the performance of the AI algorithm.
We recognized that this dataset was a public, community-driven dataset and that there are potential selection biases. A radiologist (BS) evaluated all images for quality and relevance, and each case was COVID-19 positive based on available data. Thus, this dataset is useful and valid for the purpose of algorithm development. The other datasets were taken from the established Kaggle chest X-ray image (pneumonia) dataset (https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia). Although the Kaggle database has a large sample size, we randomly selected a subset comparable in size to the COVID-19 dataset to avoid asymmetric sample-size bias that could skew sensitivity and specificity. The sample sizes chosen for bacterial pneumonia, non-COVID-19 viral pneumonia, and normal pCXR were 492, 552 and 532 patients, respectively.
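The random selection of comparable class sizes, followed by the 75%/25% training/testing split reported in this paper, can be sketched as follows. This is a minimal illustration using only the Python standard library. The class sizes are those reported above; the file names, the pool size of 1500 images per class, the fixed random seed, the per-class (stratified) split, and the function names are assumptions made for the example.

```python
import random

# Class sizes reported in the paper.
TARGET_SIZES = {"covid19": 455, "normal": 532, "bacterial": 492, "viral": 552}


def subsample(items, n, seed=0):
    """Randomly draw n items without replacement (all items if fewer than n)."""
    rng = random.Random(seed)
    return rng.sample(items, min(n, len(items)))


def train_test_split(items, train_fraction=0.75, seed=0):
    """Shuffle a copy of items and cut it into training and testing parts."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical image pools: 1500 placeholder file names per class.
pools = {
    name: [f"{name}_{i:04d}.png" for i in range(1500)] for name in TARGET_SIZES
}

# Draw the reported number of images per class, then split each class 75/25.
selected = {name: subsample(pools[name], n) for name, n in TARGET_SIZES.items()}
splits = {name: train_test_split(files) for name, files in selected.items()}

for name, (train, test) in splits.items():
    print(name, len(train), len(test))
```

Splitting each class separately keeps the class proportions the same in the training and testing sets; whether the original study stratified the split in this way is not stated.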
Similarly, a chest radiologist evaluated all images for quality.
CNN: The CNN architecture was based on VGG16, a convolutional neural network (37). The VGG16 model was used because it is available pretrained on the ImageNet database, which enables transfer learning and makes the training process efficient. The data were first standardized by converting all files into RGB images and resizing them to 224x224 pixels to make them compatible with the VGG16 framework. Next, the class labels were one-hot encoded and the data were split into 75% training and 25% testing. A batch size of 32 was used to limit computational expense, and the network was trained for 50 epochs. Several optimizers were tested; the Adam optimizer gave the lowest validation loss. The learning rate was lowered from the recommended 0.01 to 0.001 to avoid overshooting the minimum of the loss. Categorical cross-entropy was used as the loss function because the loss value decreases as the predicted probability converges to the actual label. The VGG16 architecture was chosen for its computational efficiency and ease of implementation, which favor immediate translation.
Figure 2 shows examples of pCXR from a normal subject and from patients with different lung infections. COVID-19 is often characterized by ground-glass opacities with or without nodular consolidation, with predominantly bilateral, peripheral and lower-lobe involvement. Non-COVID-19 viral pneumonia is often characterized by diffuse interstitial opacities, usually bilaterally. Bacterial pneumonia is often characterized by confluent areas of focal airspace consolidation. Table 1. The precision, recall and F1 scores for the whole pCXR are shown in Table 2. The overall precision, recall and F1 scores showed good to excellent performance. For the CNN with transfer learning performed on the whole pCXR, the overall sensitivity, specificity, accuracy, and AUC were 0.79, 0.93, 0.79, and 0.84, respectively. For the CNN performed on segmented lungs, the overall sensitivity, specificity, accuracy, and AUC were 0.91, 0.93, 0.88, and 0.89, respectively. The performance was generally better using segmented lungs.
To visualize the spatial locations on the images that the CNN was paying attention to for classification, heatmaps of the COVID-19 versus normal pCXR are shown in the accompanying figure. For the CNN performed on the whole pCXR, the majority of the hot spots were reasonably localized to regions of ground-glass opacities and/or consolidations, but some hot spots were located outside the lungs. For the CNN performed on segmented lungs, the majority of the hot spots were reasonably localized to regions of ground-glass opacities and/or consolidations, mostly as expected.
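The sensitivity, specificity, precision, recall, F1 score and accuracy reported above can all be derived from a multiclass confusion matrix. The sketch below shows the arithmetic for one class treated one-versus-rest, using only the Python standard library. The confusion matrix is a made-up toy example, not data from this study, and the function names are hypothetical. How the paper aggregated the per-class values into "overall" figures is not stated, so no aggregation is shown.

```python
def one_vs_rest_metrics(confusion, class_index):
    """Metrics for one class; confusion[i][j] counts true class i predicted as j."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    tp = confusion[class_index][class_index]
    fn = sum(confusion[class_index]) - tp
    fp = sum(confusion[i][class_index] for i in range(n_classes)) - tp
    tn = total - tp - fn - fp

    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


def overall_accuracy(confusion):
    """Fraction of all cases that lie on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Toy confusion matrix (illustrative numbers only).
# Row and column order: COVID-19, normal, bacterial, non-COVID viral.
toy_confusion = [
    [90, 2, 3, 5],
    [1, 95, 2, 2],
    [4, 3, 80, 13],
    [5, 0, 15, 80],
]

covid_metrics = one_vs_rest_metrics(toy_confusion, class_index=0)
print(covid_metrics)
print(overall_accuracy(toy_confusion))
```

In this toy example the COVID-19 class has 90 true positives, 10 false negatives, 10 false positives and 290 true negatives. A class that is never predicted or never present would cause a division by zero, which a production implementation would need to guard against.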
This study developed and applied a deep-learning CNN algorithm with transfer learning to classify COVID-19 CXR from normal, bacterial pneumonia, and non-COVID viral pneumonia CXR in a multiclass model. Heatmaps showed reasonable localization of abnormalities in the lungs. The overall sensitivity, specificity, accuracy, and AUC were 0.91, 0.93, 0.88, and 0.89, respectively (segmented lungs).
There are a few AI studies to date using machine-learning methods to classify CXRs of COVID-19, normal and related lung infections. By the time this paper is reviewed many more [...] No-Findings (N=500) vs. Pneumonia (N=500) as well as a binary classification for COVID vs. No-Findings, which achieved 87.02% and 98.08% accuracies, respectively (31). Pereira et al. [...] pneumonia vs no-finding using resampling algorithms, texture descriptors, and CNN. This model achieved an F1 score of 0.65 for the multiclass approach and an F1 score of 0.89 for the hierarchical classification (32). AUC and accuracy were not reported. A few non-peer-reviewed preprints using AI to classify COVID-19 CXRs have also been reported (33-36). Our study had one of the larger cohorts, balanced sample sizes, and a multiclass model. Our approach is also among the simplest AI models with comparable performance, which is likely to facilitate immediate clinical translation. Together, these studies indicate that AI has the potential to assist frontline physicians in distinguishing COVID-19 infection based on CXRs.
Heatmaps are informative tools for visualizing the regions that the CNN algorithm pays attention to for detection. This is particularly important given that AI operates in a high-dimensional space.
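The text does not state how the heatmaps were computed. Assuming class activation mapping, in which the heatmap for a class is a weighted sum of the final convolutional feature maps using that class's output weights, the core computation can be sketched in plain Python as below. The feature maps and weights are tiny made-up values, the function name is hypothetical, and the rescaling to the range 0 to 1 is a display choice rather than part of the method. In practice the resulting map would be upsampled to the image size and overlaid on the pCXR.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W), rescaled to [0, 1].

    feature_maps: list of K maps, each a list of H rows with W values.
    class_weights: list of K weights linking each map to the target class.
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])

    # Accumulate weight * activation at every spatial location.
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for r in range(height):
            for c in range(width):
                cam[r][c] += weight * fmap[r][c]

    # Min-max rescaling so the map can be shown as a heatmap.
    low = min(min(row) for row in cam)
    high = max(max(row) for row in cam)
    if high > low:
        cam = [[(v - low) / (high - low) for v in row] for row in cam]
    return cam


# Toy example: two 2x2 feature maps and their weights for one class.
toy_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
toy_weights = [1.0, 0.5]

heatmap = class_activation_map(toy_maps, toy_weights)
print(heatmap)
```

Locations with values near 1 are those that contribute most to the class score, which is what makes such maps useful for checking whether the network attends to the lungs or to regions outside them.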
Such heatmaps enable reality checks and make AI interpretable with respect to clinical findings. Our algorithm showed that the majority of the hot spots were highly localized to abnormalities within the lungs, i.e., ground-glass opacity and/or consolidation, albeit imperfectly. The majority of the above-mentioned machine-learning studies to classify COVID-19 CXRs did not provide heatmaps. We also noted that the CNN on the whole pCXR image resulted in some hot spots located outside the lungs. Applying the CNN to segmented lungs resolved this problem. Another advantage of using segmented lungs is reduced computational cost during training. Transfer learning also reduced computational cost, making this algorithm practical. The performance is generally better using segmented lungs.
Most COVID-19 positive patients showed significant abnormalities on pCXR (39). Some early studies have even suggested that pCXR could be used as a primary tool for COVID-19 screening in epidemic areas (39, 40), which could complement swab testing, which still has a long turnaround time and a non-negligible false-negative rate. In some cases, imaging revealed chest abnormalities even before swab tests confirmed infection (41, 42). In addition, pCXR can detect superimposed bacterial pneumonia, which necessitates urgent antibiotic treatment. pCXR can also suggest acute respiratory distress syndrome, which is associated with severe negative outcomes and necessitates immediate treatment.
Together with the anticipated widespread shortage of intensive care units and mechanical ventilators in many hospitals, pCXR also has the potential to play a critical role in decision-making, especially with regard to which patients to admit to the ICU or put on mechanical ventilation, and when to safely extubate. A timely implementation of AI methods could help to realize the full potential of pCXR in this COVID-19 pandemic.
This pilot proof-of-principle study has several limitations. This is a retrospective study with a small sample size, and the datasets used for training had limited alternative diagnoses. Although the Kaggle database has a large sample size for non-COVID-19 CXR, we chose the sample sizes to be comparable to that of COVID-19 to avoid asymmetric sample sizes that could skew sensitivity and specificity. Future studies will need to increase the COVID-19 sample size and include additional lung pathologies. The spatiotemporal characteristics of COVID-19 infection on pCXR and their relation to clinical outcomes are unknown. Future endeavors could include developing AI algorithms to stage severity and to predict progression, treatment response, recurrence, and survival, in order to inform risk management and resource allocation associated with the COVID-19 pandemic.
In conclusion, deep-learning convolutional neural networks with transfer learning accurately classify COVID-19 pCXR from pCXR of normal, bacterial pneumonia, and non-COVID viral pneumonia patients in a multiclass model. This approach has the potential to help radiologists and frontline physicians by providing efficient and accurate diagnosis.
Table 2 shows the precision and recall rate and F1 score (whole CXR). [Table values not recovered; surviving column headings: Recall, F1-score.]
The continuing 2019-nCoV epidemic threat of novel coronaviruses to global health - The latest 2019 novel coronavirus outbreak in Wuhan, China
Outbreak of pneumonia of unknown etiology in Wuhan, China: The mystery and the miracle
Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia
First Case of 2019 Novel Coronavirus in the United States
The Laboratory Diagnosis of COVID-19 Infection: Current Issues and Challenges
Detection of SARS-CoV-2 in Different Types of Clinical
(10) Imaging and clinical features of patients with 2019 novel coronavirus SARS-CoV-2
Portable versus Fixed X-ray Equipment: A Review of the Clinical Effectiveness, Cost-effectiveness, and Guidelines. Ottawa (ON)
Deep learning
Using deep convolutional neural networks to identify and classify tumor-associated stroma in diagnostic breast biopsies
Imagenet classification with deep convolutional neural networks
Improving neural networks by preventing co-adaptation of feature detectors
Very deep convolutional networks for large-scale image recognition
Deep machine learning-a new frontier in artificial
(19) Transfer learning with deep convolutional neural network for liver steatosis assessment in ultrasound images
Artificial intelligence and machine learning in respiratory medicine
A systematic review of the diagnostic accuracy of artificial intelligence-based computer programs to analyze chest x-rays for pulmonary tuberculosis
Attention-Guided Convolutional Neural Network for Detecting Pneumonia on Chest X-Rays
Deep Learning Algorithms with Demographic Information Help to Detect Tuberculosis in Chest Radiographs in Annual Workers' Health Examination Data
Explainable COVID-19 Predictions Based on Chest X-ray Images
An automated machine learning model to assist in the diagnosis of COVID-19 infection in chest x-ray images
Automatic Detection of Coronavirus Disease (COVID-19) Using X-ray Images and Deep Convolutional Neural Networks
Very Deep Convolutional Networks for Large-Scale Image Recognition
Learning deep features for discriminative localization
Association of Inpatient Use of Angiotensin Converting Enzyme Inhibitors and Angiotensin II Receptor Blockers with Mortality Among Patients With Hypertension Hospitalized With COVID-19
Correlation of Chest CT and RT-PCR Testing in Coronavirus