key: cord-0185256-ibycqqes
authors: Tabik, S.; Gómez-Ríos, A.; Martín-Rodríguez, J. L.; Sevillano-García, I.; Rey-Area, M.; Charte, D.; Guirado, E.; Suárez, J. L.; Luengo, J.; Valero-González, M. A.; García-Villanova, P.; Olmedo-Sánchez, E.; Herrera, F.
title: COVIDGR dataset and COVID-SDNet methodology for predicting COVID-19 based on Chest X-Ray images
date: 2020-06-02
journal: nan
DOI: nan
sha: b2cd097b7e4a5425919b2c519c34cbe8c0049806
doc_id: 185256
cord_uid: ibycqqes

Currently, Coronavirus disease (COVID-19), one of the most infectious diseases of the 21st century, is diagnosed using RT-PCR testing, CT scans and/or Chest X-Ray (CXR) images. CT (Computed Tomography) scanners and RT-PCR testing are not available in most medical centers, and hence in many cases CXR images become the most time/cost effective tool for assisting clinicians in making decisions. Deep learning neural networks have a great potential for building triage systems that detect COVID-19 patients, especially patients with low severity. Unfortunately, current databases do not allow building such systems, as they are highly heterogeneous and biased towards severe cases. The contribution of this paper is three-fold: (i) we demystify the high sensitivities achieved by most recent COVID-19 classification models, (ii) in close collaboration with Hospital Universitario Clínico San Cecilio, Granada, Spain, we built COVIDGR-1.0, a homogeneous and balanced database that includes all levels of severity, from Normal with positive RT-PCR, through Mild and Moderate, to Severe; COVIDGR-1.0 contains 377 positive and 377 negative PA (PosteroAnterior) CXR views, and (iii) we propose the COVID Smart Data based Network (COVID-SDNet) methodology for improving the generalization capacity of COVID-19 classification models. Our approach reaches good and stable results, with accuracies of $97.37\% \pm 1.86\%$, $88.14\% \pm 2.02\%$ and $66.5\% \pm 8.04\%$ in the Severe, Moderate and Mild COVID-19 severity levels, respectively. Our approach could help in the early detection of COVID-19. The COVIDGR-1.0 dataset will be made available after the review process.

In the last months, the world has been witnessing how the COVID-19 pandemic infects large masses of people at great speed everywhere in the world. The trends are not clear yet, but some studies suggest that this problem may persist until 2024 [1]. Besides, prevalence studies conducted in several countries reveal that only a small proportion of the population has developed antibodies after exposure to the virus, e.g., 5% in Spain. This means that large numbers of patients will frequently need to be assessed in short time intervals by few clinicians and with very scarce resources. In general, COVID-19 diagnosis is carried out using at least one of these three tests:

• Computed Tomography (CT) scan based assessment: it consists of analyzing 3D radiographic images taken from different angles. The needed equipment is not available in most hospitals, and the assessment takes more than 15 minutes per patient, in addition to the time required for CT decontamination [2].

• Reverse Transcription Polymerase Chain Reaction (RT-PCR) test: it detects viral RNA from sputum or a nasopharyngeal swab [3]. It requires specific material and equipment, which are not easily accessible, and it takes at least 12 hours, which is undesirable as positive COVID-19 patients should be identified and tracked as soon as possible.
Some studies found that RT-PCR results from repeated tests of the same patients at different points during the course of the illness were variable, producing a high false-negative rate [4]. The authors suggested that the RT-PCR test should be combined with other clinical tests, such as CT.

• Chest X-Ray (CXR): the required equipment is less cumbersome and can be lightweight and transportable. In general, this type of equipment is more widely available than that required for RT-PCR and CT-scan tests. In addition, a CXR test takes about 15 seconds per patient [3], which makes CXR one of the most time/cost effective assessment tools.

A few recent studies provide estimates of expert radiologists' sensitivity in the diagnosis of COVID-19 based on CT scans, RT-PCR and CXR. A study on a set of 51 patients with chest CT and RT-PCR assays performed within 3 days reported a sensitivity of 98% for CT, compared with 71% for RT-PCR [5]. A different study on 64 patients (26 men, mean age 56 ± 19 years) reported a sensitivity of 69% for CXR, compared with 91% for initial RT-PCR [3]. According to an analysis of 636 ambulatory patients [6], most patients presenting to urgent care centers with confirmed COVID-19 have normal or mildly abnormal findings on CXR; only 58.3% of these patients are correctly diagnosed by the expert eye.

In a recent study [3], the authors proposed simplifying the quantification of the level of severity by adapting a previously defined Radiographic Assessment of Lung Edema (RALE) score [7] to COVID-19. This new score is calculated by assigning a value between 0 and 4 to each lung, depending on the extent of visual features such as consolidation and ground-glass opacities in the four parts of each lung, as depicted in Figure 1. Based on this score, experts can identify the level of severity of the infection among four stages: Normal (0), Mild (1-2), Moderate (3-5) and Severe (6-8); a minimal code sketch of this mapping is given below. In practice, a patient classified by an expert radiologist as Normal can have a positive RT-PCR; we refer to these cases as Normal-PCR+. The expert annotation adopted in this work is based on this score.

Deep Learning (DL) models have a great potential to optimize the role of CXR images for a fast diagnosis of COVID-19. A robust and accurate DL model could serve as a triage method and as a support for medical decision making. An increasing number of recent works claim to achieve impressive sensitivities above 95%, far higher than expert radiologists. These high sensitivities are due to bias in the most used COVID-19 dataset, the COVID-19 Image Data Collection [8]. This dataset includes a very small number of COVID-19 positive cases coming from highly heterogeneous sources (at least 15 countries), and most cases are severe patients, an issue that drastically reduces its clinical value. To populate Non-COVID and Healthy classes, AI researchers are using CXR images from diverse pulmonary disease repositories. The obtained models will have no clinical value either, since they will be unable to detect patients with low and moderate severity, who are the target of a clinical triage system. In view of this situation, there is still a huge need for higher quality datasets built under the same clinical protocol and in close collaboration with expert radiologists.
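To make the adapted RALE scoring rule concrete, the following minimal Python sketch maps the two per-lung scores to the four severity stages; the function name and input format are our own illustration, not part of the original study.

```python
def covid_severity(left_lung: int, right_lung: int) -> str:
    """Map adapted RALE scores (0-4 per lung, reflecting the extent of
    consolidation and ground-glass opacities) to a severity stage."""
    if not all(0 <= s <= 4 for s in (left_lung, right_lung)):
        raise ValueError("each lung score must be in the range 0-4")
    total = left_lung + right_lung   # global score in 0-8
    if total == 0:
        return "Normal"              # Normal-PCR+ if the RT-PCR is positive
    if total <= 2:
        return "Mild"
    if total <= 5:
        return "Moderate"
    return "Severe"                  # total score 6-8
```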
The concept of Smart Data refers to the process of converting raw data into higher quality data with a higher concentration of useful information [9]. Multiple studies have shown that higher quality data yields higher quality models. Smart Data encompasses all pre-processing methods that improve the value and veracity of data; examples include noise elimination, data augmentation [10] and data transformation [11], among other techniques.

In this work, we designed a dataset of high clinical quality, named COVIDGR-1.0, that includes four levels of severity: Normal-PCR+, Mild, Moderate and Severe. We identified these four severity levels from a recent COVID-19 radiological study [3]. We also propose the COVID Smart Data based Network (COVID-SDNet) methodology, which combines segmentation, data augmentation and data transformations together with an appropriate Convolutional Neural Network (CNN) for inference. The contributions of this paper can be summarized as follows:

• To analyze the reliability, potential and limitations of the most used COVID-19 CXR datasets and models.

• To provide a high quality dataset, called COVIDGR-1.0, for building triage systems with high clinical value.

• To design a novel methodology, named COVID-SDNet, with a high generalization capacity for COVID-19 classification based on CXR images. COVID-SDNet combines segmentation, a data transformation that increases the discrimination capacity of the classification model, data augmentation, and a suitable CNN model together with an inference approach to obtain the final class.

Experiments demonstrate that our approach reaches good and stable results, especially at the Severe and Moderate levels, with accuracies of 97.37% ± 1.86% and 88.14% ± 2.02% respectively. Lower accuracies were obtained at the Mild and Normal-PCR+ severity levels, with 66.5% ± 8.04% and 38.68% ± 2.44% respectively.

This paper is organized as follows: a review of the most used datasets and COVID-19 classification approaches is provided in Section 2; Section 3 describes how COVIDGR-1.0 was built and organized; our approach is presented in Section 4; experiments, comparisons and results are provided in Section 5; and, finally, conclusions are drawn in Section 6.

The last three months have seen an increasing number of works exploring the potential of deep learning models for automating COVID-19 diagnosis based on CXR images. The results are promising, but much work remains to be done in terms of both data and model design. Given the potential for bias in this type of problem, several studies add explanation methods to their models. This section analyzes the advantages and limitations of current datasets and models for building automatic COVID-19 diagnosis systems, with and without decision explanation.

There does not yet exist a high quality collection of CXR images for building COVID-19 diagnosis systems of high clinical value. Currently, the main source for the COVID-19 class is the COVID-19 Image Data Collection [8]. It contains 76 positive and 26 negative PA views, obtained with highly heterogeneous equipment from all around the world. To build Non-COVID classes, most studies use CXR images from one or multiple public pulmonary disease datasets. Examples of these repositories are:

• RSNA Pneumonia CXR challenge dataset on Kaggle [12].
• Figure 1 COVID-19 Chest X-ray Dataset Initiative [13].
• ChestX-ray8 dataset [14].
• MIMIC-CXR dataset [15].
• PadChest dataset [16].

For instance, COVIDx 1.0 [17] was built by combining three public datasets: (i) the COVID-19 Image Data Collection [8], (ii) the Figure 1 COVID-19 Chest X-ray Dataset Initiative [13] and (iii) the RSNA Pneumonia Detection Challenge dataset [12].
COVIDx 2.0 was built by reorganizing COVIDx 1.0 into three classes, Normal (healthy), Pneumonia and COVID-19, using 201 CXR images for the COVID-19 class, including both PA (PosteroAnterior) and AP (AnteroPosterior) views (see Table 1). Notice that, for correct learning, the front view (PA) and back view (AP) cannot be mixed in the same class.

Although the value of these datasets is unquestionable, as they have been useful for carrying out first studies and reformulations, they do not guarantee useful triage systems, for the following reasons. It is not clear what annotation protocol was followed to construct the positive class in the COVID-19 Image Data Collection. The included data is highly heterogeneous, and hence DL models can rely on aspects other than COVID-19 visual features to differentiate between the involved classes. In addition, this dataset does not provide a representative spectrum of COVID-19 severity levels; most positive cases are severe patients [18]. Our claim is that the design of a high quality dataset must be carried out in close collaboration between expert radiologists and AI experts. The annotations must follow the same protocol, and representative numbers of all severity levels, especially Mild and Moderate, must be included.

Existing related works are not directly comparable, as they consider different combinations of public datasets and different experimental setups. A brief summary of these works is provided in Table 2. The studies most related to ours, in that they propose models different from the typical ones, are [17] and [19]. In [17], the authors designed a deep network called COVIDNet. They affirm that COVIDNet reaches an overall accuracy of 92.6%, with a sensitivity of 97.0% for the Normal class, 90.0% for Non-COVID-19 and 87.1% for COVID-19. The authors of a smaller network, called COVID-CAPS [19], likewise claim that their model achieves an accuracy of 98.7%, a sensitivity of 90% and a specificity of 95.8%. These results look too impressive when compared to the expert radiologist sensitivity of 69%, which can be explained by the fact that the used dataset is biased towards severe COVID-19 cases [18]. In addition, the experiments in both cited works are not statistically reliable, as they were evaluated on one single partition; the stability of these models, in terms of standard deviation, has not been reported.

DL classification models with explanation approaches: several interesting explanation approaches have been proposed to help inspect the predictions of DL models [22, 21], although all their classification models were trained and validated on variations of COVIDx. The authors in [21] first use an ensemble of two CNNs to predict the class of the input image as Normal, Pneumonia or COVID-19, and then highlight class-discriminating regions in the input CXR image using gradient-guided class activation maps (Grad-CAM++) and layer-wise relevance propagation (LRP). In [22], the authors propose explaining the decision of the classification model to radiologists using different types of saliency maps, together with uncertainty estimates (i.e., how certain the model is about its prediction).

It is well known that the larger the database, the more effective the learning of ML algorithms; even when the data is of lower quality, algorithms can actually perform better, as long as useful information can be extracted by the model. Alternatively, instead of starting with an extremely large and noisy dataset, one can build a small and smart dataset and then augment it in a way that increases the performance of the model (an augmentation sketch is given below). This approach has proven effective in multiple studies, and it is particularly relevant in the medical field, where access to data is heavily protected due to privacy concerns and expert annotation is costly.
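As an illustration of such label-preserving augmentation, the following minimal sketch builds a pipeline with torchvision; the specific transforms and parameter values are our own assumptions, since the paper only states that a proper set of augmentation techniques was selected per experiment.

```python
import torchvision.transforms as T

# A minimal, label-preserving augmentation pipeline for CXR crops.
# Transforms and parameters are illustrative, not the paper's exact choice.
train_transforms = T.Compose([
    T.Resize((512, 512)),                        # homogenize input size
    T.RandomHorizontalFlip(p=0.5),               # lungs are roughly symmetric
    T.RandomRotation(degrees=5),                 # small positioning jitter
    T.ColorJitter(brightness=0.1, contrast=0.1), # mild exposure variation
    T.ToTensor(),
])
```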
In close collaboration with four highly trained radiologists from Hospital Universitario Clínico San Cecilio, Granada, Spain, we first established a protocol for how CXR images are selected and annotated for inclusion in the dataset. A CXR image is annotated as COVID-19 positive if both the RT-PCR test and the expert radiologist confirm that decision within less than 24 hours. CXR images in which the radiologist sees no signs of disease but the RT-PCR is positive are labeled as Normal-PCR+. The involved radiologists annotated the level of severity of positive cases, based on the RALE score, as Normal-PCR+, Mild, Moderate or Severe. Patients with a positive RT-PCR who were annotated by the expert radiologists as Normal are actually asymptomatic patients.

COVIDGR-1.0 is organized into two classes, positive and negative. It contains 754 images, distributed into 377 positive and 377 negative cases; more details are provided in Table 3. All images were obtained with the same equipment and under the same X-ray regime, and only the PosteroAnterior (PA) view is considered. COVIDGR-1.0 will be made available to the scientific community after review at https://github.com/ari-dasci/covidgr.

In this section, we describe the COVID-SDNet methodology in detail, from the pre-processing that produces smart data, including segmentation and a data transformation that increases the discrimination between positive and negative classes, to the deep CNN used for classification.

One of the components of COVID-SDNet is the CNN-based classifier. We selected ResNet-50 initialized with ImageNet weights, following a transfer learning approach. To adapt this CNN to our problem, we removed the last layer of the network and added a 512-neuron layer with ReLU activation, followed by a two- or four-neuron layer (according to the considered number of classes) with softmax activation. All layers of the network were fine-tuned. We used a batch size of 16 and SGD as the optimizer.

COVID-SDNet consists of three main stages: two pre-processing stages that produce quality data (smart data stages), and the learning and inference stage. A flowchart of COVID-SDNet is depicted in Figure 2.

Different brands of CXR equipment include different extra information about the patient at the sides and contour of CXR images. The position and size of the patient may also imply the inclusion of other parts of the body, e.g., arms, neck or stomach. As this information may alter the learning of the classification model, we first used the pre-trained U-Net segmentation model provided in [24] to extract the smallest rectangle that includes the left and right lungs. Then, to avoid eliminating useful information, we extend this rectangle by 2.5% on the left, right, top and bottom sides (a minimal sketch of this cropping step is given below). An example of this pre-processing is shown in Figure 3.
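The cropping step can be sketched as follows; the lung mask is assumed to come from the pre-trained U-Net of [24], the helper name is our own, and computing the margin relative to the bounding rectangle (rather than the full image) is our assumption.

```python
import numpy as np

def crop_lungs_with_margin(image: np.ndarray, lung_mask: np.ndarray,
                           margin: float = 0.025) -> np.ndarray:
    """Crop the smallest rectangle containing both lungs, extended by a
    relative margin (2.5% by default) on every side.

    `lung_mask` is a binary array produced by a lung segmentation model.
    """
    ys, xs = np.nonzero(lung_mask)
    y0, y1 = ys.min(), ys.max()
    x0, x1 = xs.min(), xs.max()
    dy = int(round((y1 - y0) * margin))   # vertical margin in pixels
    dx = int(round((x1 - x0) * margin))   # horizontal margin in pixels
    h, w = image.shape[:2]
    return image[max(0, y0 - dy):min(h, y1 + dy + 1),
                 max(0, x0 - dx):min(w, x1 + dx + 1)]
```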
To increase the discrimination capacity of the classification model, we use a Class-inherent Transformations (CiT) network inspired by GANs (Generative Adversarial Networks). This transformation method is actually an array of two generators, G_P and G_N. G_P learns the inherent transformations of the positive class P, and G_N learns the inherent transformations of the negative class N. In other words, G_P learns the transformations that bring an input image from its own class domain k, with k ∈ {P, N}, to the P class domain, while G_N learns the transformations that bring the input image from its class domain k to the N class domain. A classification loss is introduced in the generators to drive the learning of each specific k-class transformation. More details about these transformation networks can be found in [11].

The architecture of each generator consists of 5 identical residual blocks. Each block has two convolutional layers with 3 × 3 kernels and 64 feature maps, followed by batch-normalization layers and Parametric ReLU as the activation function. The last residual block is followed by a final convolutional layer that reduces the output to 3 channels, to match the input dimensions. The classifier is a ResNet-18, which consists of an initial convolutional layer with 7 × 7 kernels and 64 feature maps followed by a 3 × 3 max-pooling layer; then 4 blocks of two convolutional layers with 3 × 3 kernels and 64, 128, 256 and 512 feature maps, respectively; and finally a 7 × 7 average pooling and one fully connected layer that outputs a vector with one element per class. ReLU is used as the activation function.

Once the generators have learned the corresponding transformations, the dataset is processed with G_N and G_P. A pair of images (I+, I−) is obtained from each input image I, where I+ and I− are respectively the positively and negatively transformed versions of I. If I belongs to class P, G_P and G_N produce the positive transformation I+ ∈ P+ and the negative transformation I− ∈ P−. If I belongs to class N, G_P and G_N produce its positive transformation I+ ∈ N+ and its negative transformation I− ∈ N−. Figure 4 illustrates the transformations applied by G_N and G_P with an example. Notice that these transformations are not meant to be interpretable by the human eye, but rather to help the classification model better distinguish between the different classes.

The original binary problem is thus converted into a four-class problem, where the new classes are N+, N−, P+ and P−. The CNN classification model described above (ResNet-50) is trained to predict these four classes. For each of the two transformed images associated with an original image, the classifier outputs a tuple of four class probabilities. Herein, we propose an inference process to fuse these outputs so that, for each pair (I+, I−), the prediction for the original image I is either P or N. Let class(I+) = argmax θ = argmax(θ_{N+}, θ_{N−}, θ_{P+}, θ_{P−}) and class(I−) = argmax ψ = argmax(ψ_{N+}, ψ_{N−}, ψ_{P+}, ψ_{P−}) be the ResNet-50 predictions for I+ and I− respectively, where θ and ψ are the probability vectors over the four classes. Then:

class(I) = C(class(I+)) if max θ ≥ max ψ, and class(I) = C(class(I−)) otherwise,

where C(·) strips the transformation sign, mapping N+ and N− to N, and P+ and P− to P. That is, the more confident of the two predictions determines the final label (a minimal sketch of this fusion is given below). Experimentally, we again used a batch size of 16 and SGD as the optimizer.
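The fusion rule can be sketched as follows, assuming `probs_pos` and `probs_neg` are the softmax outputs of the four-class ResNet-50 for I+ and I−, in the class order N+, N−, P+, P−; the helper name and example values are our own.

```python
import numpy as np

CLASSES = ["N+", "N-", "P+", "P-"]

def fuse_prediction(probs_pos, probs_neg) -> str:
    """Fuse the four-class predictions for (I+, I-) into a final P/N label:
    the more confident prediction wins, and its transformation sign is
    stripped (N+/N- map to N, P+/P- map to P)."""
    theta, psi = np.asarray(probs_pos), np.asarray(probs_neg)
    winner = theta if theta.max() >= psi.max() else psi
    return CLASSES[int(winner.argmax())][0]  # "P" or "N"

# Example: I+ is confidently classified as P+, I- only weakly as N-.
print(fuse_prediction([0.05, 0.05, 0.80, 0.10],
                      [0.20, 0.40, 0.20, 0.20]))  # -> "P"
```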
In this section we (1) provide all the information about the experimental setup, (2) evaluate two state-of-the-art COVID-19 classification models on our dataset, and then analyze (3) the impact of data pre-processing and (4) the impact of the Normal-PCR+ severity level on our approach.

Due to the high variation between different executions, we performed 5 different 5-fold cross-validations in all experiments (a sketch of this protocol is given below). Each experiment uses 80% of COVIDGR-1.0 for training and the remaining 20% for testing. To decide when to stop the training process, we used a random 10% of each training set for validation. In each experiment, a proper set of data-augmentation techniques is carefully selected. All results are reported in terms of sensitivity, specificity, precision, F1 and accuracy. We compare our approach with the two approaches most related to ours, COVIDNet [17] and COVID-CAPS [19].

• COVIDNet: the authors of this network currently provide three versions, namely A, B and C, available at [25]; A has the largest number of trainable parameters, followed by B and C. We performed two evaluations of each network, in such a way that the results are comparable to ours.
- First, we tested COVIDNet-A, COVIDNet-B and COVIDNet-C, pre-trained on COVIDx, directly on our dataset, considering only two classes: Normal (negative) and COVID-19 (positive). The whole dataset (377 positive and 377 negative images) is evaluated. We report the recall and precision for the Normal and COVID-19 classes in Table 4.
- Second, we retrained COVIDNet on our dataset. It is important to note that, as only a checkpoint of each model is available, we could not remove the last layer of these networks, which has three neurons. We used 5 different 5-fold cross-validations. To be able to retrain the COVIDNet models, we had to add a third, Pneumonia, class to our dataset; we randomly selected 377 images from the Pneumonia class of the COVIDx dataset. We used the same hyper-parameters as those indicated in their training script, that is, 10 epochs, a batch size of 8 and a learning rate of 0.0002. We changed covid_weight to 1 and covid_percent to 0.33, since we had the same number of images in all classes. Again, we report in Table 4 the recall and precision of our two classes, Normal and COVID-19, and omit those of the Pneumonia class; the accuracy reported in the same table only takes into account the images from our two classes. As with our models, we report the mean and standard deviation of all metrics. Although we analyzed all three variations, A, B and C, of COVIDNet, for simplicity we only report the results of the best one.

• COVID-CAPS: this is a capsule network based model proposed in [19] and available at [26]. Its architecture is notably smaller than COVIDNet's, which implies a dramatically lower number of trainable parameters. Since the authors also provide a checkpoint with weights trained on the COVIDx dataset, we were able to follow a procedure similar to that used with COVIDNet:
- First, we tested the weights pre-trained on COVIDx on the COVIDGR-1.0 dataset. COVID-CAPS is designed to predict two classes, so we reused the same architecture with the new dataset and computed the evaluation metrics shown in Table 4.
- Second, the COVID-CAPS architecture was retrained on the COVIDGR-1.0 dataset, fine-tuning the weights to improve class separation. The retraining was performed using the same setup and hyper-parameters reported by the authors: the Adam optimizer across 100 epochs with a batch size of 16. Class weights were omitted, as with COVIDNet, since this dataset contains balanced classes in training as well as in test. Evaluation metrics were computed over the test subsets of the five 5-fold cross-validations and are summarized in Table 4.

The results in Table 4 show that COVIDNet and COVID-CAPS trained on COVIDx overestimate the COVID-19 class on our dataset, i.e., most images are classified as positive, resulting in very high sensitivities but at the cost of a low positive predictive value. When COVIDNet and COVID-CAPS are retrained on COVIDGR-1.0, they achieve slightly better overall accuracy and a better balance between sensitivity and specificity, although they seem to acquire a bias favoring the negative class. In general, none of these models performs adequately for detecting the disease from the CXR images in our dataset.
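The repeated cross-validation protocol can be sketched as follows; `train_and_evaluate` is a hypothetical callback standing in for any of the compared models, and the use of scikit-learn here is our own illustration.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def repeated_cv(X, y, train_and_evaluate, n_repeats=5, n_folds=5):
    """Run 5 different 5-fold cross-validations and aggregate a metric.

    `train_and_evaluate(train_idx, test_idx)` is assumed to train a model
    (holding out 10% of the training split for early stopping) and return
    an accuracy-like score on the test split.
    """
    scores = []
    for repeat in range(n_repeats):
        skf = StratifiedKFold(n_splits=n_folds, shuffle=True,
                              random_state=repeat)
        for train_idx, test_idx in skf.split(X, y):
            scores.append(train_and_evaluate(train_idx, test_idx))
    return np.mean(scores), np.std(scores)  # reported as mean +- std
```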
The results of the baseline COVID-19 classification model, with and without segmentation and considering all severity levels, together with those of COVID-SDNet, are shown in Table 5.

Table 5: Results of COVID-19 prediction using ResNet-50 with and without segmentation, COVID-SDNet, retrained COVIDNet-CXR A and retrained COVID-CAPS. All four severity levels of the positive class are taken into account.

In general, COVID-SDNet achieves better and more stable results than the other approaches. In particular, COVID-SDNet achieves the best balance between specificity and sensitivity, with an F1 of 77.67 ± 3.21 in the negative class and 76.82 ± 3.08 in the positive class. Most importantly, COVID-SDNet achieves the highest specificity, 79.20 ± 6.29, together with a sensitivity of 75.43 ± 5.91 and the highest accuracy, 77.31 ± 2.92. When comparing the results of the baseline classification model with and without segmentation, we observe that segmentation substantially improves sensitivity, which is the most important criterion for a triage system. This can be explained by the fact that segmentation allows the model to focus on the most important parts of the CXR image.

To determine which levels are the hardest to distinguish for the best approach, we analyzed the accuracy per severity level S, computed as

accuracy(S) = (number of correctly classified positive images of severity S) / (total number of positive images of severity S).

As can be seen from these results, COVID-SDNet correctly distinguishes the Moderate and Severe levels, with accuracies of 88.14% and 97.37% respectively. This is due to the fact that Moderate and Severe CXR images contain more salient visual features than Mild and Normal-PCR+ images, which eases the classification task. Normal-PCR+ and Mild cases are much more difficult to identify, as they contain few or no visual features. These results are coherent with the clinical studies in [6] and [3], which report that expert sensitivity is very low at the Normal-PCR+ and Mild infection levels. Recall that the expert eye does not see any visual signs in Normal-PCR+ cases although the RT-PCR is positive; those cases are actually considered asymptomatic patients.

To analyze the impact of the Normal-PCR+ class on COVID-19 classification, we trained and evaluated the baseline model, the COVID-SDNet classification stage, COVIDNet-CXR-A and COVID-CAPS on COVIDGR-1.0 after eliminating Normal-PCR+. The results are summarized in Table 7. Overall, all approaches systematically provide better results when Normal-PCR+ is eliminated from the training and test processes, including COVIDNet-CXR-A and COVID-CAPS; in particular, COVID-SDNet remains the best and most stable approach. A further analysis of the accuracy per severity level (see Table 8) shows that, although Normal-PCR+ is the hardest level to predict, its presence improves the accuracy at lower severity levels, especially the Mild level.

Automatic DL diagnosis systems alone are not yet mature enough to replace expert radiologists. To help clinicians make decisions, these tools must be interpretable, so that clinicians can decide whether or not to trust the model [27]. We inspect what led our model to make a decision by showing the regions of the input image that triggered that decision, along with its counterfactual explanation, i.e., the parts that support the opposite class. We adapted the Grad-CAM method [28] to explain the decisions for the negative and positive classes (a minimal sketch is given below).
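A minimal sketch of how such heat-maps can be produced for a fine-tuned ResNet-50 using forward/backward hooks; this is our own illustration of Grad-CAM, not the authors' exact implementation, and the usage example assumes a hypothetical fine-tuned model.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet50

def grad_cam(model, image, target_class, layer):
    """Grad-CAM heat-map for `target_class`: the predicted class gives the
    explanation, the opposite class gives the counterfactual explanation."""
    acts, grads = {}, {}
    h1 = layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    h2 = layer.register_full_backward_hook(
        lambda m, gi, go: grads.update(g=go[0]))
    try:
        logits = model(image)                    # image: (1, 3, H, W)
        model.zero_grad()
        logits[0, target_class].backward()
        weights = grads["g"].mean(dim=(2, 3), keepdim=True)  # GAP of grads
        cam = F.relu((weights * acts["a"]).sum(dim=1, keepdim=True))
        cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear",
                            align_corners=False)
        return (cam / (cam.max() + 1e-8)).squeeze().detach()
    finally:
        h1.remove(); h2.remove()

# Illustrative usage on a random input (a real model would be fine-tuned).
model = resnet50(weights=None).eval()
x = torch.randn(1, 3, 224, 224)                  # a pre-processed CXR crop
heat_actual = grad_cam(model, x, target_class=1, layer=model.layer4)
heat_counter = grad_cam(model, x, target_class=0, layer=model.layer4)
```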
Figures 5, 6 and 7 show (a) the original CXR image, (b) a visual explanation by means of a heat-map that highlights the regions/pixels which led the model to its actual prediction, and (c) the counterfactual explanation, a heat-map that highlights the regions/pixels with the highest impact on predicting the opposite class. The larger high-intensity areas in the heat-maps determine the final class. In Figure 8, however, (b) shows the counterfactual explanation and (c) the explanation of the actual decision. As expected, negative and positive interpretations are complementary, i.e., the areas that triggered the correct decision are, in most cases, opposite to the areas that pushed the decision towards the negative class.

In CXR images of different severity levels, the heat-maps correctly point out opaque regions due to different levels of infiltrates and consolidations, and also to osteoarthritis. In particular, in Figure 5(b) the red areas in the right lung point out a region with infiltrates, as well as osteoarthritis in the spine region. Figure 6(b) correctly shows moderate infiltrates in the right lower and lower-middle lung fields, in addition to a dilation of the ascending aorta and aortic arch (red color in the center). Figure 5(c) shows normal upper-middle fields of both lungs (less marked on the left due to the aortic dilation). Figure 7(b) indicates an important bilateral pulmonary involvement with consolidations. As can be observed in Figure 8(c), the explanation of the negative class correctly highlights a symmetric bilateral pattern that occupies a larger lung volume, especially in regions with high density. In fact, a very similar pattern is shown in the counterfactual explanations of the positive class in Figures 5(c), 6(c) and 7(c).

This paper introduced a dataset with high clinical value, named COVIDGR, that includes the four main COVID-19 severity levels identified by a recent radiological study [3]. We also proposed a methodology, called COVID-SDNet, that combines segmentation, data augmentation and data transformation. The obtained results show the high generalization capacity of COVID-SDNet, especially at the Severe and Moderate levels, as these include important visual features; the existence of few or no visual features at the Mild and Normal-PCR+ levels reduces the opportunities for improvement. As the main conclusion, we highlight that COVID-SDNet can be used in a triage system, especially to detect moderate and severe patients. Finally, a more robust and accurate triage system could be built by fusing our approach with others, such as the one proposed in [29]. As future work, we are enriching COVIDGR with more CXR images coming from different hospitals, and we plan to explore the use of additional clinical information along with CXR images to improve prediction performance. This project was approved by the Provincial Research Ethics Committee of Granada.
References

[1] Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period.
[2] American College of Radiology. ACR recommendations for the use of chest radiography and computed tomography (CT) for suspected COVID-19 infection.
[3] Frequency and distribution of chest radiographic findings in COVID-19 positive patients.
[4] Stability issues of RT-PCR testing of SARS-CoV-2 for hospitalized patients clinically diagnosed with COVID-19.
[5] Sensitivity of chest CT for COVID-19: comparison to RT-PCR.
[6] Chest X-ray findings in 636 ambulatory patients with COVID-19 presenting to an urgent care center: a normal chest X-ray is no guarantee.
[7] Severity scoring of lung oedema on the chest radiograph is associated with clinical outcomes in ARDS.
[8] COVID-19 Image Data Collection.
[9] Big Data Preprocessing: Enabling Smart Data.
[10] A snapshot of image pre-processing for convolutional neural networks: case study of MNIST.
[11] FuCiTNet: Improving the generalization of deep learning networks by the fusion of learned class-inherent transformations.
[12] Radiological Society of North America. RSNA Pneumonia Detection Challenge.
[13] Figure 1 COVID-19 Chest X-ray Dataset Initiative.
[14] ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases.
[15] MIMIC-CXR: A large publicly available database of labeled chest radiographs.
[16] PadChest: A large chest X-ray image dataset with multi-label annotated reports.
[17] COVID-Net: A tailored deep convolutional neural network design for detection of COVID-19 cases from chest radiography images.
[18] How might AI and chest imaging help unravel COVID-19's mysteries?
[19] COVID-CAPS: A capsule network-based framework for identification of COVID-19 cases from X-ray images.
[20] Automated detection of COVID-19 cases using deep neural networks with X-ray images.
[21] DeepCOVIDExplainer: Explainable COVID-19 predictions based on chest X-ray images.
[22] Estimating uncertainty and interpretability in deep learning for coronavirus (COVID-19) detection.
[23] COVID-19: automatic detection from X-ray images utilizing transfer learning with convolutional neural networks.
[24] U-Net lung segmentation.
[27] Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI.
[28] Grad-CAM: Visual explanations from deep networks via gradient-based localization.
[29] Predicting COVID-19 pneumonia severity on chest X-ray with deep learning.