COVID-19 image classification using deep learning: Advances, challenges and opportunities
Priya Aggarwal, Narendra Kumar Mishra, Binish Fatimah, Pushpendra Singh, Anubha Gupta, Shiv Dutt Joshi
Comput Biol Med, 2022-03-03. DOI: 10.1016/j.compbiomed.2022.105350

Abstract-Corona Virus Disease-2019 (COVID-19), caused by Severe Acute Respiratory Syndrome-Corona Virus-2 (SARS-CoV-2), is a highly contagious disease that has affected the lives of millions around the world. Chest X-Ray (CXR) and Computed Tomography (CT) imaging modalities are widely used to obtain a fast and accurate diagnosis of COVID-19. However, manual identification of the infection through radio images is extremely challenging because it is time-consuming and highly prone to human errors. Artificial Intelligence (AI) techniques have shown potential and are being exploited further in the development of automated and accurate solutions for COVID-19 detection. Among AI methodologies, Deep Learning (DL) algorithms, particularly Convolutional Neural Networks (CNN), have gained significant popularity for the classification of COVID-19. This paper summarizes and reviews a number of significant research publications on the DL-based classification of COVID-19 through CXR and CT images. We also present an outline of the current state-of-the-art advances and a critical discussion of open challenges. We conclude our study by enumerating some future directions of research in COVID-19 imaging classification.

Index Terms-COVID-19 detection, Deep Learning, Convolutional Neural Networks, X-ray and CT scan images

I. INTRODUCTION

Coronavirus disease (COVID-19) is a viral disease that was first identified in Wuhan, China, in December 2019 and later spread quickly worldwide [1], [2]. It is caused by Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) and has affected millions of people worldwide. COVID-19 infection starts in the throat's mucous membranes and spreads to the lungs through the respiratory tract.
COVID-19 is a highly contagious disease; therefore, it is vital to rapidly screen, diagnose, and isolate patients to prevent the spread of the disease and accelerate their proper treatment. Diagnosis of COVID-19 infection through medical imaging, such as CXR and CT scans, has been reported to yield accurate results and is being used widely in the screening of the disease [3]-[5]. However, successful interpretation of these images faces several challenges due to the recent emergence of the disease and its similarities with other pulmonary disorders such as pneumonia [6] (refer to Fig. 1). Due to the complex nature of COVID-19, its accurate diagnosis is a relatively complicated, time-consuming task that requires the expertise of radiologists to achieve acceptable diagnostic performance. Control and eradication of COVID-19 depend heavily on isolating the infected and vaccinating the susceptible. At present, the gold standard for COVID-19 detection is the RT-PCR (Reverse Transcription Polymerase Chain Reaction) test; however, it requires considerable time to process the specimen and generate the result. Also, it has been observed that many patients may test positive for COVID-19 after recovery [7]. Vaccination is known to immunize people against the virus; however, vaccinated individuals remain prone to infection. Developing an effective and safe vaccine with prolonged efficacy is still in progress and will take substantial time. Further, vaccination of the entire global population will also take time due to constraints on the availability of vaccines and the geographical spread of the population. At the same time, efforts are underway for devising quick diagnostic solutions for COVID-19 detection through CXR and CT images that are analyzed routinely by radiologists. The manual diagnosis of COVID-19 is time-consuming, prone to human errors, and needs the assistance of a qualified radiologist. An expert radiologist is also required because the abnormalities during the early stages of COVID-19 may appear similar to those of other pulmonary syndromes, such as Severe Acute Respiratory Syndrome (SARS) or Viral Pneumonia (VP), which can impede the timely diagnosis and treatment of COVID-19. As an example, some samples of CXR and CT images of COVID and non-COVID cases are shown in Fig. 2 and Fig. 3. The axial images show bilateral scattered ground-glass opacity with air-space consolidation in the posterior segments of the lower lung lobes with a peripheral and subpleural distribution. Since CXR and CT are recommended for various pulmonary abnormalities, any automated solution designed to diagnose COVID-19 should also consider other respiratory disorders to develop a more comprehensive and robust diagnostic system. The successful application of DL in computer vision and the biomedical domain has encouraged researchers to explore AI-based solutions for COVID-19 detection using CXR and CT-scan images. Though the research area is nascent, it has shown tremendous potential and is progressing fast with the ongoing outbreak of COVID-19. Several studies have been conducted for the automated diagnosis of COVID-19 using DL techniques [5], [8]. Typically, the DL-based model consists of a hierarchical structure with a Convolutional Neural Network (CNN) as an important block, where each layer extracts features pertinent to COVID-19 that can be used to distinguish COVID-19 images from non-COVID images.
Propelled by CNN's automatic feature learning capabilities, deep neural network-based COVID-19 classification is widely used. Of late, detection of COVID-19 using only CXR or CT images has shown potential in developing automated solutions. However, it is important to note that any automated solution intended for practical application needs a high detection rate and consistent performance over unseen datasets. Thus, it requires advanced methods that can yield universally acceptable performance. Multimodal data analysis has the potential to yield better performance compared to single-modal data analysis because a DL model can learn robust and accurate features from a large heterogeneous dataset of multiple modalities [9], [10]. Multimodal data analysis can be undertaken by considering CXR and CT images, thermal images, cough/speech, and blood samples. Due to the public availability of CXR and CT datasets, several single-modal and multimodal data analysis studies on COVID-19 detection have been published recently, each with its own advantages, limitations, and challenges. To further increase the pace of research in COVID-19 diagnosis using CXR and CT images, a systematic survey and a comprehensive review of the recent literature are required that can assist researchers in the near future. Motivated by the above, we present a review of single-modal and multimodal DL-based research studies of COVID-19 and introduce an overall pipeline. We also highlight various challenges and limitations in this area and briefly discuss the future scope. Since the development of DL-based methods has been facilitated by the public availability of many CXR and CT datasets, we also present a detailed description of each dataset, along with a summary of relevant information in tabular form to highlight its popularity in the COVID-19 literature, and provide links to each dataset. Since the research in this field started recently and is progressing fast, it is important to continuously review the developments to keep up with recent advances and push towards future ones. In the literature, a few survey papers on COVID-19 image classification have been published [11]-[14], but a majority of these have reviewed a relatively small number of research papers, mainly published in 2020. Our review includes a total of 71 research articles. Compared to the other survey papers, we only discuss studies that have used state-of-the-art DL techniques, have reported higher accuracy results, and were mainly published in 2021. In addition, our review is a comprehensive study that includes broad topics, such as the DL-based classification pipeline, popular databases for COVID-19 classification, and elaborate tables with details on preprocessing and online data and code availability. Finally, we present a discussion on unique challenges and future directions of DL-based COVID-19 classification. Furthermore, compared to the other review studies that have focused only on either CXR or CT images, we have also covered multimodal works using both CXR and CT images. A key objective of this review is to summarize notable DL-based studies that can help future researchers in overcoming the challenges to the successful realization of automated solutions for the quick and robust diagnosis of COVID-19.
The salient contributions of this study are as follows:
1) It outlines the pipeline of different popular DL-based methods employed in the related studies;
2) It provides details of the widely used COVID-19 datasets available publicly;
3) It presents an overview of the data augmentation, pre-processing methodology, and K-fold cross-validation used in DL approaches, along with their code and data availability for reproducing the results. This information is important because it can help researchers ascertain the reliability of the studies and can give the required push to further research; and
4) Finally, it suggests possible research directions by discussing unique challenges and future work based on:
• the contribution percentage of each CNN learning method in the studied papers, to find the most popular technique, and
• the contribution percentage of each COVID-19 dataset in the studied papers, to target the creation of a standard benchmark dataset.
This review paper is organized as follows. Section II presents an overview of a DL pipeline. Section III summarizes the publicly available imaging datasets for COVID-19 diagnosis. A literature review on CXR, CT, and multi-modality-based COVID-19 diagnosis is carried out in Section IV. Challenges with COVID-19 image analysis are presented in Section V. Opportunities and future work are discussed in Section VI. Finally, Section VII concludes the study.

II. OVERVIEW OF A DL PIPELINE

DL has advanced to a high level of maturity because of three primary factors: 1) the availability of high-performance Graphics Processing Units (GPUs), 2) advancements in machine learning algorithms, especially CNNs, and 3) access to a high volume of structured data. Consequently, DL methods have been very successful in COVID-19 detection using imaging data, whose details are presented next. Automated COVID-19 diagnosis with DL algorithms can be performed using data of various imaging modalities. The algorithm may include several steps: pre-processing, segmentation, feature extraction, classification, performance evaluation, and explainable model prediction. Fig. 4 depicts a generic pipeline of DL-based COVID-19 diagnosis, with the steps discussed below.
1) Data Pre-processing: Pre-processing involves the conversion of the raw images into an appropriate format for further processing. Medical images collected from different devices vary in size, slice thickness, and the number of scans (e.g., 60 or 70 slices in CT). Together, these factors generate a heterogeneous collection of imaging data, leading to non-uniformity across datasets. Thus, the pre-processing step largely involves resizing, normalization, and sometimes transformation from RGB to grayscale (see the sketch below). In CT data, the voxel dimension is also resampled to account for the variation across datasets, which is known as resampling to an isotropic resolution [15]. Furthermore, images are smoothed to improve the signal-to-noise ratio and remove noise. Another pre-processing step involves the extraction or segmentation of desired regions of interest from an image for the classification task. For example, the lungs are the organs primarily affected by COVID-19 infection. Therefore, for a successful diagnosis, lung regions are segmented from CXR and CT images and fed to the next processing step. It is laborious, tedious, and time-consuming to manually segment the lung area, and the result depends heavily on the knowledge and experience of the radiologists.
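A minimal sketch of the basic pre-processing operations described above (grayscale conversion, resizing to a common input size, and intensity normalization), assuming OpenCV and NumPy; the 224×224 target size is an illustrative choice, not one mandated by any reviewed study.

```python
import cv2
import numpy as np

def preprocess_image(image_path, target_size=(224, 224)):
    """Illustrative pre-processing for a CXR or CT slice image file:
    grayscale conversion, resizing to a uniform size, and intensity
    normalization to the [0, 1] range."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)  # RGB -> grayscale
    img = cv2.resize(img, target_size)                  # unify image size
    img = img.astype(np.float32) / 255.0                # normalize intensities
    return img
```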
DL-based segmentation techniques, such as few-shot segmentation [16], [17] and semantic segmentation [18]-[20], can automatically identify infected regions, providing rapid screening of COVID-19 images. For the segmentation task, there are widely used segmentation models such as fully convolutional networks (FCN) [21], U-Net [22], [23], V-Net [24], and 3D U-Net++ [25]. Sometimes, pixel values are thresholded to obtain a proper range of Hounsfield units (HU) covering the lung region. This step is particular to the dataset being used. The lung is an organ filled with air, and air is the least dense material. Hence, pixel values are thresholded to remove the other, non-lung tissue (e.g., skin, bone, or the scanner bed) that may negatively impact the analysis. Of all DL models, U-Net is the most famous architecture for segmentation. It consists of two parts. The first part, considered the encoder, consists of a sequence of two 3×3 convolutional layers followed by a 2×2 max-pooling layer to learn features at various levels. The second part, considered the decoder, performs upsampling, concatenation with the correspondingly cropped features from the encoder layers, and two 3×3 convolutional operations. Through the decoder operation, it tries to restore the learned feature maps to an image of the original input size. U-Net has 23 convolutional layers in total. Karthik et al. (2021) [26] utilized a repository-inspired U-Net architecture (https://github.com/imlab-uiip/lung-segmentation-2d) for segmenting lungs from CXR images. Oh et al. (2020) [27] utilized the FC-DenseNet103 architecture for the segmentation of lungs from CXR images and also compared its performance with U-Net. It was also shown that the segmentation algorithm could be used for small training datasets, and that the morphology of the segmentation mask can be used as a discriminatory biomarker. The segmentation scheme was tested on a cross-database to show a statistically significant improvement in segmentation accuracy. Another work, by Wang et al. [28], utilized a VGG-based network for lung segmentation from CXR images. One famous work, by Javaheri et al. (2021) [15], utilized BCDU-Net [29] to segment the lung area. This architecture was inspired by U-Net and utilized bi-directional ConvLSTM along with densely connected convolutions. U-Net has also been used in other studies for lung segmentation [30]-[34]. Ouyang et al. (2020) [35] considered VB-Net [36], a combination of V-Net [24] and a bottleneck structure, for segmentation.
2) Feature Extraction and Classification: The main step of DL-based COVID-19 diagnosis is feature extraction and classification. DL methods extract features automatically and carry out binary or multiclass classification. Feature extraction can be performed in two ways: using transfer learning with a pre-trained model, or using a custom CNN model developed from scratch. The CNN is the core block of many DL-based neural networks that performs feature extraction from the input images. It consists of several convolutional and pooling layers. Apart from these basic layers, it also includes batch normalization layers and Dropout. A schematic representation of a typical CNN is shown in Fig. 5 and is explained below.
(i) Convolutional Layer: It consists of learnable filters (or kernels) that are convolved with the input images. It performs an element-wise product and sum to provide each element of an output matrix called the feature map.
The convolution operation has two important properties: local connectivity, because the filter weights are multiplied with a local area of the input image at a time, and weight sharing, because the same filter weights are multiplied with every spatial location of the input image. Convolutional layers work in a hierarchical manner, where low-level features are extracted in the initial layers and high-level features are extracted in the deeper layers. The convolution operation is followed by an activation function (e.g., ReLU) that introduces non-linearity into the network.
(ii) Pooling Layer: This layer performs dimensionality reduction of the feature maps along the spatial dimension. It reduces the number of learnable parameters and thus reduces the computational complexity. Average-pooling and max-pooling are the two dominantly used pooling techniques.
(iii) Fully-connected Layer: It performs the actual classification task. It consists of several neural network layers. The number of layers and the number of nodes in each layer are hyperparameters that need to be tuned optimally. It is followed by a softmax layer that provides a class score for every class of an image, akin to the probabilities of belonging to the different classes. An input image is assigned to the class with the highest class score.
Pre-trained models are models that have already been trained on other datasets by researchers. Generally, these models are trained on large databases such as the ImageNet database of natural images [37]. Forcing models to first learn general image features is a preventive measure against overfitting to domain-specific features. After ImageNet pre-training, the final 1,000-node classification layer of the trained ImageNet model is removed and replaced by an n-node layer, corresponding to the n-class classification for COVID-19 detection. In transfer learning, the learned weights of the pre-trained DL architecture are used as the initial starting point for training on the new dataset. A schematic representation of the transfer learning approach is shown in Fig. 6. Transfer learning can be accomplished either by fine-tuning the weights of all the layers or by fine-tuning the weights of a few deeper layers (see the sketch below). Several pre-trained models have been used for COVID-19 diagnosis, such as AlexNet [38], different versions of Visual Geometry Group (VGG) [39] and ResNet [40], Inception [41], Xception [42], InceptionResNet [43], DenseNet [44], etc. Apart from these pre-trained models, custom models are also popular for COVID-19 classification, which implies training a model from scratch without utilizing any pre-trained model.
3) Performance Evaluation: The performance of the overall pipeline is assessed by evaluation metrics such as accuracy, sensitivity, specificity, precision, F1-score, Area Under the receiver operating characteristic Curve (AUC), and so on. Typically, the data is partitioned into training, validation, and testing sets for the experiment. The training data is used to develop a particular model, while overfitting or underfitting is monitored on the validation data to assess the appropriateness of the training and the model, respectively. Finally, the performance of the developed model is tested on the unseen test data. Deep learning models are trained as black-box classifiers with no evidence of the correctness of the features extracted.
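As a concrete illustration of the transfer-learning recipe described above (removing the 1,000-node ImageNet head and attaching an n-node classification layer), a minimal Keras sketch follows; the ResNet50 backbone, input size, and three-class setup are illustrative assumptions, not the method of any particular reviewed study.

```python
import tensorflow as tf

NUM_CLASSES = 3  # e.g., normal / pneumonia / COVID-19 (illustrative)

# Pre-trained convolutional base without the 1,000-node ImageNet head.
base = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3), pooling="avg"
)
base.trainable = False  # fixed feature extractor; set True to fine-tune all layers

# Replace the removed head with an n-node softmax layer for COVID-19 detection.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```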
Explainable AI is an emerging field that assigns importance values to the input image regions leading to the predicted outcome. This assists radiologists in locating abnormalities in the lungs and gives an insight into the important spatial areas that are responsible for distinguishing COVID-19 images from others. A few explainable models used for COVID-19 diagnosis, including Grad-CAM and Grad-CAM++, are described in Table I.

TABLE I: Explainability techniques used for COVID-19 diagnosis
CAM [45]: Class Activation Mapping is a visual explanation technique for deep convolutional neural networks that provides class-discriminative visualization. The CNN model must be re-trained because it is modified by removing all dense layers and adding a Global Average Pooling layer before the softmax layer.
Grad-CAM [46]: Gradient-weighted CAM is an upgrade of CAM that does not need any architectural change or re-training. It uses the gradient information flowing into the last convolutional layer to visualize the significance of each neuron. If the same class occurs multiple times in an image, it fails to localize the objects accurately. Also, it is not able to produce a heat map of the complete object.
Guided Grad-CAM: This technique upsamples the Grad-CAM maps and performs point-wise multiplication with the visualizations from Guided Backpropagation. It provides fine-grained and class-discriminative visualization.
Grad-CAM++ [47]: Grad-CAM++ uses a more sophisticated backpropagation to overcome the issues of the CAM and Grad-CAM techniques. It provides better visual explanations of CNN model predictions in terms of better object localization as well as explaining occurrences of multiple object instances in a single image.

CXR is the most easily accessible and the fastest form of imaging, with fewer side effects on the human body. CXR imaging has traditionally been used for the detection of pneumonia and cancer. Although it can detect COVID-19 infection, it fails to provide fine-order details of the infected lungs. CT scanning is a more sophisticated technique to evaluate the level of infection in the various lobes of the lungs and is used to calculate the CT severity score of a patient. In fact, CXR is a 2D imaging modality, whereas CT provides 3D scans of organs from various angles. CXR imaging can be used for COVID-19 detection; however, to evaluate the level of severity of the infection, a CT scan is compulsory. This is one of the reasons that multimodal detection of COVID-19 using both CXR and CT-scan images can give a better generalization ability to a neural network architecture.

III. PUBLIC IMAGING DATASETS FOR COVID-19 DETECTION

In all, about 35 public datasets (of CXR and CT images) have been referred to and used by researchers to validate the algorithms in the articles reviewed in this work. The details are listed in Table II. Some of these datasets contain CXR and CT-scan images of COVID-19, while others include those of normal subjects and different pulmonary diseases. The reason for using the latter type of dataset is to create more generalizable algorithms that can detect COVID-19 from a pool of more diverse radiography images. We briefly discuss some of these datasets and provide their download links in Table II for the ease of readers and further research. We have included 20 datasets for CXR images, among which Kermany et al. (2018) [48] is the most popular dataset for normal and pneumonia CXR images.
This dataset [48] consists of 5,856 images, where 2,780 images are of bacterial pneumonia, 1,493 are of viral pneumonia, and 1,583 belong to normal subjects. The participants were recruited at the Guangzhou Women and Children's Medical Center. This dataset was used to develop the Mooney dataset (2017) [49], a CXR dataset available as a Kaggle competition on viral and bacterial pneumonia classification. It consists of 5,247 CXR images of normal, bacterial pneumonia, and viral pneumonia cases.

TABLE II: Publicly available imaging datasets for COVID-19 detection
Dataset | Image type | Links | Reference papers
Ali (2020) [50] | CXR | https://www.kaggle.com/ahmedali2019/pneumonia-sample-xrays | [51]
BIMCV (2020) [52] | CXR | https://bimcv.cipf.es/bimcv-projects/padchest/ | [53]
CC-CCII database [54] | CT | http://ncov-ai.big.ac.cn/download?lang=en | [30], [55]
Chest Imaging (2020) [56] | CXR | https://threadreaderapp.com/thread/1243928581983670272.html | [5], [57]
Chung (2020) [58] | CXR | https://github.com/agchung/Actualmed-COVID-chestxray-dataset | [26], [57], [59]-[61]
Cohen et al. (2020) [62] | CXR and CT | https://github.com/ieee8023/covid-chestxray-dataset | [5], [9], [10], [23], [26]-[28], [51], [53], [55], [57], [59]-[61], …
Kermany (2018) [48] | CXR | … | [23], [69], [72], [75], [77], [81], [89], [90], [98], [99]
Khoong (2020) [100] | CXR | https://www.kaggle.com/khoongweihao/covid19-xray-dataset-train-test-sets | [59]
LIDC-IDRI database [101] | CT | https://wiki.cancerimagingarchive.net/display/Public/LIDC-IDRI | [30]
Montgomery tuberculosis [96] | CXR | https://www.kaggle.com/raddar/tuberculosis-chest-xrays-montgomery | [27], [23]
Mooney (2017) [49] | CXR | https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia/version/2 | [5], [10], [26], [57], [60], [63]-[65], [67], [72], [73], [76], [85], [87], …
Rahman (2020) [106] | CXR | https://www.kaggle.com/tawsifurrahman/covid19-radiography-database | [5], [51], [59], [61], [74], [75], [107]
Radiology Assistant | CXR and CT | https://radiologyassistant.nl/chest/covid-19/covid19-imaging-findings | [63]
Radiopaedia [108] | CXR and CT | https://radiopaedia.org/search?lang=us&q=covid&scope=cases | [5], [9], [26], [60], [79], [90], [109], [110]
RSNA (2020) [111] | CXR | https://www.kaggle.com/c/rsna-pneumonia-detection-challenge | [5], [26], [28], [80], [90], [109], [112]
… | CXR | … | [26], [57], [60], [65], [90], [109], [110]
SARS-COV-2 CT-Scan (2020) [116] | CT | https://www.kaggle.com/plameneduardo/sarscov2-ctscan-dataset | [9], [59], [117], [118]
Tianchi-Alibaba database [119] | CT | https://tianchi.aliyun.com/dataset/dataDetail?dataId=90014 | [30]
UCSD-AI4H [120] | CT | https://github.com/UCSD-AI4H/COVID-CT | [10], [59], [117], …
… | … | … | [53], [66], [68], [69], [79], [83], [84], …

The authors of [52] introduced a dataset of more than 160,000 CXR images collected from 67,000 subjects at Hospital San Juan (Spain) from 2009 to 2017. This dataset includes CXRs for COPD, pneumonia, heart insufficiency, pulmonary edema, pulmonary fibrosis, emphysema, tuberculosis, and other pulmonary syndromes. 27% of the data was manually labelled by physicians, and the rest was labelled using a recurrent neural network. The Japanese Society of Radiological Technology (JSRT) dataset [97] was marked by radiologists for the detection of lung cancer nodules. This dataset contains 247 CXRs from 14 institutions, out of which 154 cases contain nodule markings. In addition, lung masks are also provided that can be used to study the performance of lung segmentation. The dataset of Irvin et al.
(2019) [95] includes 224,316 CXRs of 65,240 patients divided into 14 classes, including no finding, enlarged cardiomediastinum, cardiomegaly, lung lesion, lung opacity, and edema; however, no COVID-19 cases were included in this study. The U.S. National Library of Medicine has made available two datasets of Postero-Anterior (PA) CXR images of various pulmonary diseases, with a majority of the cases being pulmonary tuberculosis (TB) [96]. These two datasets were collected from the Department of Health and Human Services, Montgomery County, Maryland, USA, and Shenzhen No. 3 People's Hospital in China. There are several publicly available datasets, such as the Mooney dataset [49], which do not include new original images but have been developed by collating the data of existing datasets. For example, the authors of [126] developed a dataset, COVIDx, consisting of 13,975 CXR images of 13,870 patients. The dataset was developed using five publicly available datasets, where the COVID-19 cases were acquired from Cohen [62], Chung [58], and Rahman [106]. Non-COVID-19 pneumonia cases were acquired from Cohen [62] and the RSNA pneumonia detection challenge dataset [111]. Finally, normal cases were collected from the RSNA pneumonia detection challenge dataset [111]. The Khoong dataset [100] was constructed using normal and COVID-19-manifested CXR images from the Cohen dataset [62] and from https://github.com/JordanMicahBennett/SMART-CT-SCAN_BASED-COVID19_VIRUS_DETECTOR/. Another such dataset, available on Kaggle, is Patel [104], which consists of 6,432 CXR images from normal, COVID-19, and pneumonia-infected subjects acquired from three datasets, namely the Cohen dataset [62], Mooney [49], and the Chung dataset [58]. To include other pulmonary syndromes, Praveen [105] constructed a dataset consisting of 5,800 CXR images from normal, COVID-19 pneumonia, SARS, Streptococcus, and ARDS (acute respiratory distress syndrome) cases. These images have been acquired from the Cohen dataset [62]. Sajid [113] consists of 10,000 CXR images created using data augmentation techniques. These images include normal and COVID-19 cases; however, the original source of the dataset has not been mentioned. This dataset has been used so far by [59], where eight different datasets were used to form a COVID-R dataset consisting of 2,843 COVID-19 CXR images, 3,108 normal, and 1,439 viral and bacterial pneumonia manifested CXR images. These eight datasets include Cohen [62], the UCSD-AI4H dataset [120], the Chung dataset [58], the SARS-COV-2 CT-scan dataset [116], the Khoong dataset [100], and Rahman [106]. The SARS-COV-2 CT-Scan dataset [116] has 1,252 CT scans of 60 patients infected by COVID-19 and 1,230 CT-scan images of 60 patients infected by other pulmonary diseases. The CC-CCII dataset [54] consists of CT images collected from cohorts of the China Consortium of Chest CT Image Investigation. Seven hundred fifty CT scans were collected from 150 COVID-19 subjects, and these slices were manually segmented. All CT images are classified into novel coronavirus pneumonia (NCP) due to SARS-CoV-2 virus infection, common pneumonia, and normal controls. The LIDC-IDRI dataset [101] includes CT-scan images of 1,018 lung cancer cases with labelled, annotated lesions. The dataset was collected in collaboration with seven academic centers and eight medical imaging companies. MosMedData [102] consists of 1,110 CT scans of COVID-19 patients collected between March 1, 2020 and April 25, 2020 at municipal hospitals in Moscow. Yan et al. [127] published a dataset on IEEE DataPort consisting of 416 CT-scan images of 206 COVID-19 patients from two hospitals.
The dataset also includes 412 CT-scan images of non-COVID-19 pneumonia patients. The UCSD-AI4H dataset [120] consists of 349 CT scans of 216 COVID-19 patients; the diagnoses have been confirmed by a radiologist from Tongji Hospital. The Tianchi-Alibaba database [119] consists of 20 CT scans of COVID-19 patients, along with segmentations of the lungs and infections. The above-mentioned datasets include CT-scan images collected in collaboration with hospitals. There are other publicly available datasets, sometimes available as Kaggle competitions, which have been developed by combining two or more of the original datasets. For example, the dataset of Gunraj et al. [93], also known as the COVIDx-CT dataset, is available on Kaggle. The first version was released in December 2020, and the second version was released in January 2021. The dataset includes three classes: normal, pneumonia, and COVID-19. It is presented in two subsets, "A" and "B", where the former includes cases with confirmed diagnoses and the latter includes all images from "A" and also those which are weakly verified. This dataset was constructed using publicly available datasets like Radiopaedia.org [108], MosMedData [102], the CNCB 2019 novel coronavirus resource (2019nCoVR) AI diagnosis dataset [128], the COVID-19 CT lung and infection segmentation dataset [129], LIDC-IDRI [101], and integrative CT Images and Clinical Features for COVID-19 (iCTCF) [130]. Cohen et al. [62] is a publicly available dataset consisting of CT-scan and CXR images from 468 COVID-19 patients, 46 bacterial pneumonia cases, 10 MERS, 5 Varicella, 4 Influenza, 3 Herpes, 16 SARS, 26 Fungal cases, and 59 unknown cases. The Italian Society of Medical and Interventional Radiology (SIRM) COVID-19 database [115] consists of COVID-19-positive radiographic images (CXR and CT) with varying resolution. This database is constantly updated with new images. Vaya et al. [124] introduced a multimodal dataset from the Valencian Region Medical Image Bank (BIMCV) containing 2,265 chest radiographs belonging to 1,311 COVID-19 patients. There are no normal cases in this dataset. The Radiopaedia [108] dataset consists of case studies of several diseases, including COVID-19. It provides both CXR and CT images and has been considered an authentic source of data for deep learning-based analysis.

IV. LITERATURE REVIEW

CXR images are generally used as a first-line imaging modality for patients under investigation for COVID-19 and have been analyzed in numerous studies of COVID-19 diagnosis. This imaging is comparatively inexpensive and is less hazardous to human health owing to being a low-radiation modality. Table III lists the most relevant state-of-the-art studies in this direction published in recent years. CT images are processed differently than CXR images. CT data is three-dimensional (3D), consisting of several slices (16, 32, 64, 128, etc.) acquired during the scan. The slice capturing the largest lung region is selected and is often treated as a separate image. Some publicly available datasets consist of only one CT slice per subject. In other cases, all the slices are treated as independent samples for diagnosis, which helps in increasing the number of images during training. In the testing phase, majority voting is done to map decisions on multiple slices of a subject to ascertain the class label (see the sketch below). In some recent studies, three-dimensional (3D) CT data is utilized with 3D segmentation models and 3D-CNN architectures.
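The slice-level-to-patient-level aggregation described above can be sketched as follows; the slice classifier itself is abstracted away, and the probabilities shown are illustrative.

```python
import numpy as np

def patient_level_prediction(slice_probs):
    """Aggregate per-slice class probabilities (n_slices x n_classes) into a
    single patient-level label by majority voting over slice predictions."""
    slice_labels = np.argmax(slice_probs, axis=1)      # per-slice decisions
    counts = np.bincount(slice_labels, minlength=slice_probs.shape[1])
    return int(np.argmax(counts))                      # most frequent class wins

# Example: 5 CT slices, 3 classes (normal / pneumonia / COVID-19)
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.2, 0.7],
                  [0.1, 0.1, 0.8],
                  [0.3, 0.2, 0.5],
                  [0.6, 0.3, 0.1]])
print(patient_level_prediction(probs))  # -> 2 (COVID-19 wins 3 of 5 slices)
```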
Deep learning is a data-driven approach where classification decisions are made based on the features learned by a model during the training process. At test time, the model assumes that the input has some features similar to those learned from the training dataset that can be used for decision making. However, if the patterns are dissimilar, the model will not be able to classify them accurately, which reduces its generalization ability. Data augmentation is a technique used to overcome this limitation. However, since the artificial images generated through data augmentation come from the same training dataset, its scope for improving the diversity or abundance of the features is limited. In such scenarios, a more effective approach towards improving the performance is the augmentation of the actual training dataset through multiple modalities. For the detection of COVID-19, a model can achieve superior performance when a multimodal dataset is utilized compared to single-modal analysis. For example, the performance of a COVID-19 detection model based on only CXR or CT-scan images can be further improved by incorporating both kinds of images into the model. Various models are used on CXR imaging, on CT imaging, and on both combined, as described next. These studies can be categorized based on the DL architectures used. Tables IV and V list the most relevant state-of-the-art CT-based and multimodal-based studies, respectively.

AlexNet: It is one of the first convolutional networks that performed a large-scale image classification task and revolutionized the application of deep learning. It was the winner of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012. It consists of five convolutional layers and three dense layers. It is similar to the famous LeNet architecture but incorporates the improved techniques of the Rectified Linear Unit (ReLU) activation function, dropout, data augmentation, and multiple GPUs. Though several improved neural network architectures have been introduced to date, most are based on or inspired by the AlexNet network. It has been used in many studies of COVID-19 detection [89], [131], which mainly differ in the feature selection and training of multiple classifiers in [131] and the image generation using a Generative Adversarial Network (GAN) in [89]. The GAN is used to tackle the lack of a sufficient dataset for COVID-19 cases, which is one of the most important contributions of that work. The authors in [89] also compared AlexNet's performance with two other deep transfer learning models, GoogleNet and ResNet18. More in-depth details of both these references are presented in Table III.

VGGNet: VGG stands for the Visual Geometry Group of Oxford University. This model was among the top performers of the ILSVRC 2014 challenge. It is simple in architecture but still very effective in performance. The VGG16 and VGG19 architectures consist of 16 and 19 weight layers (convolutional plus fully connected), respectively. VGGNet has a cascade of five convolutional blocks using fixed kernel sizes of 3×3, where the first two blocks consist of two convolutional operations each, and the last three blocks consist of three convolutional operations each. It is pertinent to mention that a new convolutional block, along with process improvement techniques (batch normalization and dropout), can easily be added to the standard model; this enables the learning of finer features, makes the model more suitable for newer tasks, and improves learning speed and stability (a minimal sketch follows below).
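The remark above, that a new convolutional block with batch normalization and dropout can be appended to the standard VGG model, can be sketched in Keras as follows; the filter count, input size, and class count are illustrative.

```python
import tensorflow as tf

# VGG16 convolutional base pre-trained on ImageNet, without its dense head.
base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))

# One extra VGG-style block (3x3 convolutions) with batch normalization
# and dropout appended for the new task; the filter count is illustrative.
x = tf.keras.layers.Conv2D(512, 3, padding="same", activation="relu")(base.output)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.MaxPooling2D(2)(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(3, activation="softmax")(x)  # 3 classes, illustrative

model = tf.keras.Model(base.input, outputs)
```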
A few initial studies utilized the VGGNet pre-trained model by adding or fine-tuning a few layers for COVID-19 classification using CXR (two-class [64], [78] and three-class classification [88], [90]), CT [123], and multimodal images [82], [109]. Improvements were made by tweaking the pipeline. For example, [70] proposed a decompose, transfer, and compose method for the classification of CXR images into normal, COVID-19, and SARS classes. First, deep local features are extracted using a pre-trained CNN model, and principal component analysis (PCA) is used to reduce the dimensionality of the obtained feature set. A class decomposition layer is then used on the feature matrix to form sub-classes within each class, and each sub-class is treated as an independent class. The final classification layer of the pre-trained VGG19 model is adapted to these sub-classes. The parameters of the adopted model are fine-tuned, and finally, the sub-classes are combined to give the predicted label. The final classification is refined using error-correction criteria. This method significantly improved results over the conventional VGG19 pre-trained model. Results were also compared against four different pre-trained models for the three-class classification problem. Another work, by Heidari et al. (2020) [75], combined the original CXR image with two pre-processed images to form a pseudo-color image, which is then fed as three input channels for VGG16 pre-training. In [132], features of the convolutional layers of VGG16 were combined with an attention module (a spatial attention module and a channel attention module), followed by fully connected layers and a softmax layer for COVID-19 classification based on CXR images. Brunese et al. (2020) [84] trained two models using VGG16: the first model discriminates healthy CXR images, and the second model detects COVID-19 from other generic pulmonary diseases.

ResNet: ResNet is the most famous pre-trained model and has been used widely for COVID-19 classification. Generally, it is assumed that the training performance of a model can be increased by adding more convolutional blocks. In practice, however, the performance of deeper models starts decreasing and often returns diminishing results compared to less deep models. This happens due to the problem of vanishing gradients. The ResNet model overcomes this limitation by incorporating skip connections (sketched below). ResNet consists of a cascade of several residual blocks, wherein the output of each convolutional block is added to the output of the convolutional blocks of the deeper stages. ResNet has been used by several authors for the detection of COVID-19 using CXR images [23], [27], [83], [85], [91] and CT images [30], [32], [117], [133]. Further details of these references are given in their respective tables. Further improvements were made in a few studies, such as the utilization of a feature pyramid network along with ResNet for COVID-19 classification using CXR images in [28], and a two-step classification algorithm using CXR images in [76], wherein ResNet50 was first used to classify CXR images into healthy and others, and ResNet101 was then used to separate COVID-19 from the other viral pneumonia class. The authors in [103] combined multiple image-level ResNet50 predictions to diagnose COVID-19 at the 3D CT volume level. The performance of the proposed method was shown to be better than a single 3D-ResNet model.
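A minimal sketch of the skip connection that defines these ResNet variants is given below: the output of two stacked convolutions is added back to the block's input before the final activation. The filter count and input shape are illustrative.

```python
import tensorflow as tf

def residual_block(x, filters=64):
    """Basic ResNet-style block: two 3x3 convolutions whose output is added
    to the input (skip connection), easing gradient flow in deep networks."""
    shortcut = x
    y = tf.keras.layers.Conv2D(filters, 3, padding="same")(x)
    y = tf.keras.layers.BatchNormalization()(y)
    y = tf.keras.layers.ReLU()(y)
    y = tf.keras.layers.Conv2D(filters, 3, padding="same")(y)
    y = tf.keras.layers.BatchNormalization()(y)
    y = tf.keras.layers.Add()([shortcut, y])   # skip connection
    return tf.keras.layers.ReLU()(y)

# Channel count of the input must match `filters` for the identity add.
inputs = tf.keras.Input(shape=(224, 224, 64))
outputs = residual_block(inputs)
```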
The authors in [33] proposed a local attention classification model using ResNet18 as the backbone architecture. Ismael and Sengur [63] used an SVM with linear, quadratic, cubic, and Gaussian kernels as a classifier on ResNet50 features for COVID-19 classification using CXR images. ResNet50 was also used in [107], along with detecting and removing noise from images using the top-2 smooth loss function. In [35], the authors considered an ensemble classifier using two 3D ResNet-34 architectures for CT-scan images. The prediction scores obtained from the two ResNets were linearly combined, where the weights were decided according to the ratios of the pneumonia infection regions and the lung. CT-scan images of community-acquired pneumonia (CAP) and COVID-19 patients were collected from 8 hospitals, and the images were segmented to obtain the lung regions using the VB-Net [134] with a refined attention module that provided interpretability and explainability to the model. VB-Net was designed by adding bottleneck layers to a V-Net to integrate feature map channels. The role of the attention module was twofold: first, it learned all the features important for classification, and second, it gave the 3D class activation mapping. The images were normalized voxel-wise, and window/level scaling was performed to enhance the contrast of the images. The ResNet architecture was trained using dual sampling to compensate for the unbalanced dataset. Li et al. [31] utilized a 3D ResNet50 model to differentiate COVID-19 from CAP. Before fine-tuning this model, the lung was segmented from the 3D CT images using a U-Net-based segmentation method. The framework could also extract both two-dimensional local and 3D global representative features. Wu et al. [135] used a joint classification and segmentation approach, termed JCS, using 144,167 chest CT scans, one of the largest CT-scan datasets used in the literature. The dataset includes scans from 400 COVID-19 patients and 350 non-COVID subjects. Of these, 3,855 chest CT images of 200 patients have been annotated with fine-grained pixel-level labels of opacifications, lesion counts, opacification areas, and locations, thus benefiting various aspects of diagnosis. A Res2Net was used in this work for classification, and image mixing was used to avoid over-fitting. Segmentation was performed using an encoder-decoder module, and an Enhanced Feature Module (EFM) was used with VGG-16 in the encoder. Feature maps acquired from different stages were fused to predict the side-output of each stage. An attention mechanism was used to filter relevant features. The output from the last stage, which gave the final prediction value, had the same resolution as the input CT image.

Inception: The idea of the Inception network [41] is to use several filter sizes instead of choosing a particular filter size. The feature maps are concatenated at the output so that the network learns about the combination of required filter sizes. It cascades several inception modules, where each module consists of a concatenation of the outputs from a 1×1 convolution, a 3×3 convolution, a 5×5 convolution, and a pooling operation. It also has additional side branches. The Inception network has three more versions with improved performance: InceptionV2 and InceptionV3 were proposed in the same paper [136], and InceptionV4 is explained in [43]. InceptionV2 replaces the 5×5 convolution operation with two 3×3 convolution operations to avoid information loss and uses factorization methods to achieve performance improvement.
InceptionV3 contains all the features of InceptionV2, in addition to the RMSprop optimizer, batch normalization, regularization, and 7×7 factorized convolutions. In [79], multiple texture-based features were extracted from the CXR images, such as the local binary pattern, elongated quinary pattern, local directional number, locally encoded transform feature histogram, binarized statistical image features, local phase quantization, and oriented basic image features. These features were combined with the features learned by the InceptionV3 network and were resampled to handle the problem of the unbalanced dataset. Five popular machine learning classifiers, including K-Nearest Neighbour (KNN), Support Vector Machine (SVM), Random Forest (RF), Multilayer Perceptrons (MLP), and Decision Trees (DT), were used to predict class labels. The predicted values were combined by considering the sum of the prediction probabilities obtained for each label by each learner, the product of the prediction probabilities obtained for each label by each learner, and also the majority vote. The authors in [137] modified the final layers of the Inception-V3 pre-trained model and used it for the classification of CT images.

DenseNet: DenseNet can be understood as an extension of the ResNet architecture, where each layer receives additional input from all the preceding layers rather than a skip connection from a single previous layer. Each layer transfers its output to all the subsequent convolutional layers for concatenation. Thus, each layer is said to obtain "collective knowledge" from all the preceding convolutional layers. DenseNet has been utilized in a few studies in the literature and has shown good performance compared to other pre-trained networks. For example, Alshazly et al. (2021) [117] tested its efficacy on two datasets, [116] and [120], and obtained better results with the DenseNet201 pre-trained model on one of the datasets compared to the other pre-trained models. Chowdhury et al. [5] showed the superiority of the DenseNet201 pre-trained model over six other pre-trained models for a three-class classification problem using CXR images. A comparison of the activation maps of different classes obtained from the convolutional layers provided insight into the image regions contributing to classification. In [53], the DenseNet-121 architecture is used to classify COVID-19 patients from controls. First, a set of 1,024 higher-level features is extracted using the DenseNet-121 trained on ImageNet, and then a logistic regression is fitted to these features. Finally, interpretation is done using Expected Gradients [138]. To further identify the features used by the ML model to differentiate between COVID-19-positive and COVID-19-negative datasets, a GAN is trained to transform COVID-19-negative radiographs to resemble COVID-19-positive radiographs and vice versa. This technique should capture a broader range of features than saliency maps because the GANs are optimized to identify all possible features that differentiate the datasets. In [73], the authors used DenseNet121 to classify CXR images. A gravitational search algorithm (GSA) was used to find the optimum hyperparameters for the network. This was compared with the performance of DenseNet121 and Inception-v3 with manual hyperparameter tuning, and the GSA performance was shown to be superior. The authors used gradient-weighted class activation mapping (Grad-CAM) for explainability.
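Grad-CAM, reported for explainability in [73] and several other reviewed studies (cf. Table I), admits a short generic implementation: the class score's gradients with respect to the last convolutional feature maps are globally averaged and used to weight those maps into a coarse heat map. The sketch below assumes a functional Keras model; the layer name argument is a placeholder to be set per architecture.

```python
import numpy as np
import tensorflow as tf

def grad_cam(model, image, class_index, conv_layer_name):
    """Generic Grad-CAM: weight the last conv layer's feature maps by the
    average gradient of the target class score, then ReLU the sum."""
    grad_model = tf.keras.Model(
        model.input,
        [model.get_layer(conv_layer_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_maps, preds = grad_model(image[np.newaxis, ...])
        class_score = preds[:, class_index]
    grads = tape.gradient(class_score, conv_maps)          # d(score)/d(maps)
    weights = tf.reduce_mean(grads, axis=(1, 2))           # global-average gradients
    cam = tf.einsum("bhwc,bc->bhw", conv_maps, weights)    # weighted sum of maps
    cam = tf.nn.relu(cam)[0]                               # keep positive evidence
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()     # normalize to [0, 1]
```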
The framework in [139] consists of a sequence of three separate parts for automatic lung segmentation, non-lung area suppression, and COVID-19 diagnostic and prognostic analysis. It proposes DenseNet121-FPN for lung segmentation in chest CT images and a DenseNet-like structure for the COVID-19 diagnostic and prognostic analysis. Turkoglu [122] used a multiple-kernel Extreme Learning Machine (ELM)-based DenseNet201 to detect COVID-19 cases from CT scans. The transfer learning approach was used because the available COVID-19 datasets were insufficient to train CNN models effectively. An ELM based on majority voting was used to generate the final prediction for the CT scan, and the results of ReLU-ELM, PReLU-ELM, and TanhReLU-ELM were compared.

XceptionNet: XceptionNet, developed by Google, stands for the extreme version of Inception. It achieves better performance than InceptionV3 by introducing depthwise convolution and pointwise convolution. Khan et al. [87] used XceptionNet to propose a COVID-19 detection algorithm. The performance learning curve for four classes is shown only for fold-4, which gives the best accuracy among all folds. However, the curves show randomness in the learning and overfitting; optimizing the model towards a good fit may reduce the achieved accuracy and the levels of the other metrics.

MobileNet: It uses the depthwise separable convolution of the Xception network with the aim of reducing model complexity and parameters, which compresses the network and improves speed. It has been developed considering mobile and embedded DL applications. Arora et al. (2020) [118] used the MobileNet architecture along with a residual dense neural network to detect COVID-19 from CT-scan images. Results were compared with multiple other pre-trained architectures, and MobileNet showed better performance.

SqueezeNet: It has been developed with the aim of smaller networks having fewer parameters that can easily fit into applications with low memory and bandwidth requirements. It achieves this goal by decreasing the number of input channels and replacing 3×3 filters with 1×1 filters. SqueezeNet is one of the light networks that has been used for COVID-19 classification in CXR-based studies [69]. In this study, the authors propose a SqueezeNet-based architecture using Bayesian optimization for embedded and mobile systems. It is shown that the model size of the proposed network is 77.31 times smaller than that of AlexNet. Pham [61] used AlexNet, GoogleNet, and SqueezeNet to classify CXR images into two classes and three classes to detect COVID-19 cases. Six datasets were constructed using publicly available images to consider balanced and unbalanced binary and multiclass scenarios with normal, COVID-19, and pneumonia cases. The algorithm's efficacy in distinguishing between COVID and non-COVID cases, and between COVID and normal cases, is illustrated. Also, different cases of train and test data splits are considered. Polsinelli et al. (2020) [121] proposed a light CNN architecture based on SqueezeNet to classify CT images into COVID and non-COVID. The authors used a publicly available dataset to train the network and utilized a separate dataset for testing. The proposed CNN architecture outperformed the conventional SqueezeNet.

EfficientNet: It is a neural network architecture that uniformly scales the three dimensions, viz., depth (number of hidden layers), width (number of channels), and resolution (input image resolution), using a compound coefficient for improved efficiency and accuracy. In one recent work, Luz et al.
(2021) [112] utilized EfficientNet for a three-class classification problem and also evaluated the results on cross-site datasets. Though the original dataset is highly imbalanced, consisting of 7,966 normal images and 152 COVID-19 images, the study adopted data augmentation to undertake the analysis on a balanced dataset. It provides a valuable analysis by proposing a model with 5 to 30 times fewer parameters and, hence, reduced memory requirements.

Hybrid models: A hybrid model consists of an ensemble of the aforementioned models. Several works, such as [57], [72], [74], can be found in this direction. The study in [72] used several pre-trained models as feature extractors and correlation-based feature selection using CXR images. The proposed model was validated on two separate public datasets and showed promising results. The authors of [57] separately trained three transfer learning architectures (DenseNet201, ResNet50V2, and InceptionV3) and combined them using weighted average ensembling to predict COVID-19. The work in [74] stacked many pre-trained models (ResNet101, Xception, InceptionV3, MobileNet, and NASNet), and the extracted features were concatenated before being fed to the dense layer for the classification task. However, these methods add the complexity of pre-training multiple models. To classify CXR images into three classes (normal, COVID-19, and pneumonia), the authors in [80] concatenated features obtained using XceptionNet and ResNet50V2. The concatenation was done to obtain features from both inception-based layers and residual-based layers. After concatenation, the feature set was passed through a convolutional layer and, further, given to a classifier. The network was trained with eight different subsets of a balanced dataset. The performance of the concatenated network was compared with XceptionNet and ResNet50V2 individually, and only a marginal improvement was observed with the proposed method. Togacar et al. (2020) [51] stacked original images with images pre-processed using the fuzzy color technique. Features were extracted from these stacked images using MobileNetV2 and SqueezeNet. The Social Mimic optimization method [140] was used to process the features, and a support vector machine classifier was used to classify the images into three classes. In [141], an ensemble DL model using three pre-trained models (AlexNet, GoogleNet, and ResNet) is proposed for CT-scan images. Ensembling is performed using majority voting. The performance of the proposed method was observed to be superior compared to the three individual pre-trained networks. Hilmizen et al. [9] fed CXR images to VGG16 and CT-scan images to ResNet50 for feature extraction; the features were concatenated before being provided to the dense layers for classification.

Custom CNNs: ImageNet pre-trained models are not always sufficient for classifying medical images because they are trained using natural images. Since medical and natural images differ in many aspects, some studies trained new, special deep CNNs from scratch. These studies mostly either adopted a simple stack of a few convolutional layers [10], [71], [77], [94] (a minimal sketch of this approach is given below) or adopted advanced layers such as residual blocks [98], [99].
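For contrast with the transfer-learning pipelines discussed earlier, a "few stacked convolutional layers" custom CNN of the kind reported in [10], [71], [77], [94] might look as follows; the depth, filter widths, and class count are illustrative assumptions rather than the architecture of any specific study.

```python
import tensorflow as tf

def build_small_cnn(input_shape=(224, 224, 1), num_classes=3):
    """A few stacked convolution + pooling blocks trained from scratch,
    in the spirit of the simple custom CNNs reviewed above (illustrative)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        tf.keras.layers.Conv2D(32, 3, activation="relu", padding="same"),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.Conv2D(64, 3, activation="relu", padding="same"),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.Conv2D(128, 3, activation="relu", padding="same"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```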
Further improvements were made by utilizing advanced architectures such as residual blocks with squeeze-excitation blocks in [65], a channel-shuffled dual-branched CNN in [26], new lightweight residual projection-expansion-projection-extension blocks in [126], Long Short-Term Memory (LSTM) with CNN in [55], [60], 3D CNN in [15], 3D CNN with residual blocks in [34], capsule-network-inspired architectures in [67], [68], and a Graph Convolutional Network (GCN) with CNN in [142]. More in-depth details of all these references are presented in the tables below.

Summary: In this paper, we have reviewed a total of 71 COVID-19 detection studies based on the imaging modality used, i.e., 23 CT image studies, 42 CXR image studies, and six studies using both CT and CXR images. We observed that transfer learning has been used efficiently to detect COVID-19 from chest CT and CXR images. Of all the studies, 57 (80% of the reviewed systems) used transfer learning with pre-trained weights, and only 14 used custom CNNs. Fig. 7b shows the number of published papers using various DL architectures. ResNet is the most popular architecture, used by 29% of the reviewed articles, followed by custom CNNs and VGGNet. Since the transfer learning approach offers several advantages, it is the preferred choice in many studies. In general, training a model from scratch requires high computational power and larger datasets. The primary issue in training deeper models from scratch is learning a large number of parameters using a limited number of (available) training samples, which leads to overfitting. Also, it is quite time-consuming to decide parameters and architectures from scratch. A pre-trained model with transfer learning facilitates faster convergence with network generalization. Thus, we observe that many studies on DL-based COVID-19 detection using CXR, CT, and multimodal data have used the transfer learning approach.

V. CHALLENGES

In the last section, we discussed several works on COVID-19 image analysis. Although the performance of the proposed algorithms seems promising, there are certain shortcomings that must be addressed. We now present a discussion on some of the challenges and gaps in this area. Reproducibility of DL-based models has emerged as one of the major challenges in the literature. Results can be ascertained only if the dataset and the details of the model architecture and training hyperparameters are made available. Also, the open-source availability of code helps in reproducing the results and in devising further improvements. Some of the works based on CXR and CT-scan image classification have provided their codes [15], [30]-[32], [34], [51], [57], [66], [67], [70], [80], [87], [99], [112], [132], [135], [139], [143]. However, none of the papers using multimodal architectures have provided code in the open-source domain. Almost all the papers that derived their dataset from multiple sources have provided details of the individual sources. However, most of them have not provided a link to their consolidated dataset, except a few studies such as [30], [32], [67], [79], [91]. It is important to note that many authors who have provided their codes have still not provided their dataset in the public domain. The authors in [94] have not provided details of their architecture, although it is based on a custom CNN. It is noted that most of the datasets used for binary or multiclass classification for COVID-19 diagnosis are highly unbalanced.
In the last section, we discussed several works on COVID-19 image analysis. Although the performance of the proposed algorithms seems promising, there are certain shortcomings that must be addressed. We now present a discussion of some of the challenges and gaps in this area. Reproducibility of DL-based models has emerged as one of the major challenges in the literature. Results can be ascertained only if the dataset and the details of the model architecture and training hyperparameters are made available. Also, the open-source availability of code helps in reproducing the results and in devising further improvements. Some of the works based on CXR and CT-scan image classification have provided their codes [15], [30]-[32], [34], [51], [57], [66], [67], [70], [80], [87], [99], [112], [132], [135], [139], [143]. However, none of the papers using multimodal architectures have provided code in the open-source domain. Almost all the papers that derived their dataset from multiple sources have provided details of the individual sources. However, most of them have not provided a link to their consolidated dataset, except for a few studies such as [30], [32], [67], [79], [91]. It is important to note that many authors who have provided their codes have still not released their dataset in the public domain. The study in [94] has not provided details of its architecture, although it is based on a custom CNN. It is noted that most of the datasets used for binary or multiclass COVID-19 classification are highly unbalanced. The skewness in the dataset can introduce bias in the performance of a trained model. These unbalanced datasets pose a major challenge to AI researchers because the collection of a sufficient number of quality images, especially at the initial stage of the pandemic, was indeed difficult. For example, as listed in Table V, the authors in [82] used only 23 CT and 172 CXR images as compared to 805 normal images, and the authors in [109] used 20,000 lung cancer images as compared to 3,500 normal images. In order to handle class imbalance with small COVID-19 data, larger penalties were associated with the misclassification error on COVID-19 cases in [67]. In [74], [87], random sampling was used to select a balanced multiclass dataset from an unbalanced larger dataset; however, this method reduced the size of the dataset. Pereira et al. [79] investigated the effect of different resampling methods, such as ADASYN, SMOTE, SMOTE-B1, SMOTE-B2, AllKNN, ENN, RENN, Tomek Links (TL), and SMOTE+TL, on the performance of the proposed classification algorithm. In [26], data augmentation, weighted-class batch sampling, and stratified data sampling were used to obtain an equal number of samples for each class in every training batch. In [131], dataset balancing was carried out using the SMOTE algorithm. Authors in [23] addressed the class-imbalance problem arising from the limited set of COVID-19 CXR images by proposing a novel transfer-to-transfer learning approach, where a highly imbalanced training dataset was broken into a group of balanced minisets, followed by transferring the learned (ResNet50) weights from one set to another for fine-tuning. Data augmentation is employed to increase the size of the training dataset by transforming the images of the original dataset in multiple ways. This enables the learning of diverse features during the training process and reduces overfitting of the model. Two important techniques of data augmentation have been observed in the reviewed literature. First, variations such as flipping, rotation, skewing, translation, random cropping, tilting, and scaling are applied to the original dataset, increasing the number of training samples. Second, inbuilt software libraries (e.g., the Keras ImageDataGenerator function) are utilized to introduce random variations in the training dataset during each iteration without increasing the number of samples; the range of random variations is a hyperparameter that needs to be tuned for a given problem. Authors in [28], [32], [63]-[65], [68], [70], [75]-[77], [83], [98], [109], [112], [118], [122], [142] have used the first method, while the authors in [9], [26], [33], [34], [74], [82], [110], [121] have used the second technique. A third method of data augmentation generates synthetic images using a Generative Adversarial Network (GAN); for example, authors in [71] used a GAN together with generic data augmentation techniques to increase the dataset size. Authors in [69], [117] added Gaussian noise and used brightness variations to augment the images, and one study also used gamma correction to augment images [142]. Authors in [30], [103], [107], [141], [143] did not incorporate data augmentation, which could have improved the performance of their proposed models.
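The second augmentation technique, combined with class weighting in the spirit of the penalty-based strategy of [67], can be sketched as follows; the augmentation ranges, directory layout, and class counts are illustrative assumptions.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# On-the-fly random variations; each range is a hyperparameter to tune.
train_gen = ImageDataGenerator(
    rotation_range=10,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    rescale=1.0 / 255,
).flow_from_directory("data/train", target_size=(224, 224),
                      batch_size=32, class_mode="categorical")

# Inverse-frequency class weights penalize errors on the rare class more,
# e.g., for 152 COVID-19 images vs. 7966 normal images.
counts = {0: 152, 1: 7966}
total = sum(counts.values())
class_weight = {c: total / (len(counts) * n) for c, n in counts.items()}
# model.fit(train_gen, epochs=..., class_weight=class_weight)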
Medical images generally have low contrast. Hence, efforts are made to increase the contrast of these images so that they are better transformed to the feature space as they traverse a DL model. Moreover, the broad heterogeneity in the quality of images captured at different sites using different imaging devices introduces potential bias into image analysis. This challenge emphasizes the need to improve image quality as a pre-processing step. Contrast enhancement techniques are generally used in the literature for enhancing the quality of images and making them visually more appealing. A few studies carried out histogram modification of images for contrast enhancement [27], [70], [75], [142]. Authors in [23] utilized local contrast enhancement on thresholded grayscale CXR images, which also removed any text present in the images. Authors in [75] removed the diaphragm region from the CXR images and applied bilateral lowpass filtering to the original images. Others normalized their images before feeding them to a neural network [15], [32], [57], [65], [83], [112]. Authors in [15], [33] applied Hounsfield Unit (HU)-based filtering on raw CT images to remove redundant parts. Some used gamma correction to control the brightness of the images [27]. Authors in [118] used a residual dense network (RDNN) to enhance the quality of CT-scan images through super-resolution. In [66], the performance of the model deteriorated for low-quality images. Large non-infected regions and background have been separated out in [31], [33]-[35] using a 3D CNN segmentation model based on U-Net [22].
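A typical pipeline of this kind, histogram-based contrast enhancement followed by intensity normalization, can be sketched with OpenCV as follows; the file name and parameter values are illustrative assumptions.

import cv2
import numpy as np

img = cv2.imread("cxr_sample.png", cv2.IMREAD_GRAYSCALE)

# Contrast Limited Adaptive Histogram Equalization (CLAHE): a local
# histogram modification that boosts contrast without saturating it.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(img)

# Standardize intensities before feeding the image to a network.
norm = (enhanced.astype(np.float32) - enhanced.mean()) / (enhanced.std() + 1e-8)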
Transfer learning has been used either as a fixed feature extractor (where the weights of the convolutional layers of the pre-trained architecture are used without alteration) or in a fine-tuning mode (where the weights of a few or all convolutional layers are retrained). The choice of approach depends upon the size of the training dataset for the given problem and its similarity to the dataset used to train the original transfer learning model. Since the weights of most standard DL models (used for transfer learning) are learned over the 1000 classes of the ImageNet dataset, consisting of natural images, these DL models may not be completely relevant for the classification of CT or CXR images. Hence, it is recommended to employ transfer learning by retraining the weights of at least a few convolutional layers. Several studies [23], [30]-[32], [35], [63], [69], [72], [75], [79], [83]-[85], [87], [103], [107], [112], [117], [122], [123], [131], [137], [139], [141], [143] have used transfer learning models as fixed feature extractors only. Also, it is important to note that very few studies, such as class decomposition with VGGNet [70], attention with VGGNet [132], and a feature pyramid network with ResNet [28], have proposed architectural changes, which are very much required not only to achieve better classification capability but also faster and more stable learning. In [67], the authors used a publicly available CXR dataset of common thorax diseases to pre-train their capsule network, unlike other works where natural images from the ImageNet dataset have been used, and demonstrated the superiority of their method over ImageNet-based pre-training. In [87], a pre-trained XceptionNet was retrained end-to-end, while the authors in [99] fine-tuned a custom stacked CNN architecture pre-trained on CXR images of normal and pneumonia cases. In [66], a 19-layer CNN was developed based on DarkNet, the classifier used in the YOLO real-time object detection system. Training and validation curves of accuracy and loss provide a visual assessment of three aspects of the trained model. First, they indicate how rapidly the model learns the objective function in terms of the number of iterations. Second, they indicate how well the problem has been learned in terms of underfitting, overfitting, or a good fit: underfitting is shown by low training accuracy, overfitting is indicated by a substantial gap between the training and validation curves, and a good fit is represented by high training accuracy and convergence between the training and validation curves. Third, there could be random fluctuations or noisy behavior in the training/validation loss curves, which could be due to a number of reasons, including a dataset that is small compared to the model capacity, or the need for regularization or feature normalization. Hence, the depiction of learning curves is important in research studies, as has been done in [10], [15], [26], [27], [51], [55], [60], [63]-[66], [68], [69], [75], [76], [78], [82]-[84], [87], [98], [110], [112], [118], [132], [137]. When the dataset is small, as is often the case in the medical imaging domain, cross-validation is an important technique to assess the robustness of a model: the complete dataset is divided into k folds, and every sample of the dataset is used exactly once as a test sample. In the reviewed literature, only a few studies incorporated K-fold cross-validation: authors in [35], [55], [57], [60], [65], [66], [117], [141] used 5-fold and authors in [10], [23], [121], [122] used 10-fold cross-validation. It is important to note that although the authors in [10], [35], [122] implemented K-fold cross-validation, details about the outcome of each fold have not been discussed. For a small dataset, this is a highly recommended training strategy.
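A stratified K-fold protocol that also reports per-fold outcomes, as recommended above, can be sketched as follows; build_model() is a placeholder classifier and the file names are illustrative assumptions.

import numpy as np
import tensorflow as tf
from sklearn.model_selection import StratifiedKFold

def build_model():
    # Placeholder: a tiny CNN standing in for any classifier reviewed above.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(224, 224, 1)),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(2, activation="softmax"),
    ])
    model.compile("adam", "sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

X = np.load("images.npy")   # illustrative file names
y = np.load("labels.npy")

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = []
for fold, (tr, te) in enumerate(skf.split(X, y), start=1):
    model = build_model()                       # fresh model per fold
    model.fit(X[tr], y[tr], epochs=10, verbose=0)
    _, acc = model.evaluate(X[te], y[te], verbose=0)
    scores.append(acc)
    print(f"Fold {fold}: accuracy = {acc:.3f}")  # report every fold's outcome
print(f"Mean accuracy = {np.mean(scores):.3f} +/- {np.std(scores):.3f}")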
During the COVID-19 pandemic, it has been observed that people were infected both symptomatically and asymptomatically, where the latter is less contagious. A CXR or CT scan is taken at a later stage to determine the degree of infection so that proper medication can be advised for a patient. In such scenarios, it becomes imperative to differentiate not only between COVID-19 and healthy cases but also between COVID-19 and other viral diseases, such as pneumonia, that affect human organs in a similar manner. The development of an efficient and optimal AI-based solution to specifically and exclusively detect COVID-19 is still a prime challenge. In one study [110], detection of COVID-19 from CT scans achieved an accuracy of more than 99% when classifying against normal images; however, the performance of the same model degraded considerably when multiclass classification, including pneumonia images, was undertaken. The same was observed in another study [66], with an accuracy drop of 11% after adding pneumonia samples. In [76], an accuracy of 97.2% was obtained for detecting COVID-19 cases from non-COVID cases, including healthy, bacterial pneumonia, and non-COVID viral pneumonia cases, using a two-step detection algorithm based on pre-trained ResNet architectures. The authors also investigated the performance of ResNet101 in detecting COVID-19 in the presence of other pulmonary conditions such as edema, cardiomegaly, atelectasis, consolidation, and effusion; in this case, the performance of ResNet101 was found to be inferior to the rest of the networks for the COVID-19 class. One study also considered SARS patient samples along with healthy ones for three-class classification using CXR images [70]. Authors in [27] developed an algorithm to detect healthy cases, bacterial pneumonia, viral pneumonia, and tuberculosis. Here, the COVID-19 cases were included in the viral pneumonia class and could be further confirmed using RT-PCR or CT scans. Pediatric cases were excluded from this study to prevent the network from learning age-related features. Authors in [79] considered seven classes, including COVID-19, SARS, pneumocystis, streptococcus, varicella, MERS, and normal cases. In another work by Wang et al. [139], COVID-19 cases are classified against several types of pneumonia, such as bacterial, viral, and mycoplasma pneumonia. Authors in [141] and [143] undertook three-class classification by adding lung tumor and bacterial pneumonia classes. In a few studies, such as [23], [57], [67], [117], CXR images are classified into COVID and non-COVID, where the latter includes normal cases, bacterial pneumonia cases, non-COVID viral pneumonia cases, and other pulmonary syndromes. Generalization is the ability of a DL model to perform well on an unseen dataset. For example, a model trained to classify dogs and cats using images of black cats only may not perform well when tested on white cats. This requires training the model on a diverse dataset. Apart from the dataset, generalization ability can also be improved through the choice of network hyperparameters that address high variance (overfitting) and high bias (underfitting); regularization, dropout, batch normalization, and early stopping are some techniques that can be incorporated to achieve better generalization. To demonstrate the generalization ability of the proposed network, a few works, such as [57], [72], [75], [117], have evaluated the performance of their proposed model on more than one dataset. The authors of [137] utilized a pre-trained Inception-V3 model for transfer learning and evaluated performance on CT datasets from two different sites. On the same-site test dataset, they achieved a total accuracy of 89.5% with a specificity of 0.88 and a sensitivity of 0.87; on the test dataset from a different site (trained and tested on data from different sites), the accuracy dropped to 79.3% with a specificity of 0.83 and a sensitivity of 0.67. Convolutional neural network-based architectures perform automatic feature extraction, which leads to DL models being regarded as black boxes. To achieve wider acceptability of automated solutions, it becomes imperative to have interpretability and a behavioral understanding of the model. The transparency and explainability of AI solutions are critical in the medical domain, especially when used for a life-threatening disease such as COVID-19. In the reviewed literature, some studies have utilized various interpretation methods, the most used being Grad-CAM [26]-[28], [30], [31], [74], [76], [84], [91], [99], [117], [142] and CAM [23], [69], [121]. Grad-CAM and CAM work in a similar manner, producing heat maps, as used in a few studies [35], [60], [66], while other methods highlight the affected area in a different manner.
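A minimal Grad-CAM sketch, following the general recipe of [46] rather than any specific reviewed implementation, is given below; last_conv_name is assumed to name the final convolutional layer of the trained Keras model.

import numpy as np
import tensorflow as tf

def grad_cam(model, image, last_conv_name, class_idx=None):
    # Model that outputs both the last conv feature map and the prediction.
    grad_model = tf.keras.Model(model.inputs,
                                [model.get_layer(last_conv_name).output,
                                 model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[None, ...])
        if class_idx is None:
            class_idx = int(tf.argmax(preds[0]))
        score = preds[:, class_idx]
    grads = tape.gradient(score, conv_out)        # d(score)/d(feature map)
    weights = tf.reduce_mean(grads, axis=(1, 2))  # global-average-pool the grads
    cam = tf.reduce_sum(weights[:, None, None, :] * conv_out, axis=-1)[0]
    cam = tf.nn.relu(cam)                         # keep positive evidence only
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()  # normalized heat map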
For example, Karthik et al. (2021) [26] visualized the infected areas detected by the proposed work using saliency maps, guided backpropagation, and Grad-CAM. Authors in [117] also used a t-distributed Stochastic Neighbor Embedding (t-SNE) plot to visualize the clustering of COVID and non-COVID cases. In [139], the suspicious lung area and the feature patterns extracted by the convolutional layers of the proposed architecture are visualized to understand the inference of the DL system.
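A t-SNE inspection of learned features, as used in [117], can be sketched as follows; the features are assumed to come from a pre-trained backbone, and the file names are illustrative assumptions.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

features = np.load("deep_features.npy")   # illustrative: (n_samples, n_feats)
labels = np.load("labels.npy")            # 0 = non-COVID, 1 = COVID-19

# Embed high-dimensional deep features in 2-D to check class clustering.
emb = TSNE(n_components=2, perplexity=30, random_state=42).fit_transform(features)
for cls, name in [(0, "non-COVID"), (1, "COVID-19")]:
    pts = emb[labels == cls]
    plt.scatter(pts[:, 0], pts[:, 1], s=8, label=name)
plt.legend()
plt.title("t-SNE of learned features")
plt.savefig("tsne_features.png")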
The literature lacks comparisons among methods on the same data [98]. Instead of evaluating and training each new model on a different dataset, methods should be trained on the same data for comparison. Again, this underscores the need to create larger and more heterogeneous datasets that can be used to train both large and small neural networks. It is pertinent to mention that a few authors, such as in [15], [26], [28], [57], [60], [65]-[69], [72], [74], [75], [77], [83], [84], [98], [99], [112], have compared the performance of various state-of-the-art algorithms using different datasets, which is not very informative, as the performance metrics obtained in each case may be data-dependent. Some works, such as [27], [63], [87], [117], [132], [142], used the code available for existing publications or the same dataset to present a comparison. However, the benchmarking is still very limited; for example, in [27], [63], [87], the authors compared the results of their proposed algorithm with only one existing methodology on the same data. Multimodal studies using both CXR and CT have shown great potential in learning diverse features and improving performance. Further, it has been observed that most of these studies used a single sequential architecture trained on a mix of CXR and CT datasets. It is expected that the model would perform better by employing two parallel feature extractors, one each for CT and CXR; these separately extracted features can be combined before being fed to the classification (dense) layer. In this regard, [9] uses two separate transfer learning models to extract features from CT and CXR images and achieves better performance than either individual model alone. Based on the literature review presented above, we provide some suggestions for future researchers. Some of these suggestions are apparent from the above discussion, while others arise from the existing scenarios in the COVID era. During the study of the literature, it was observed that a one-to-one performance comparison between two reference papers cannot be undertaken due to the lack of uniformity in the datasets and performance metrics used. It is worth noting that the current public datasets have a limited number of images for the training and testing of AI algorithms. This necessitates the creation of one or more big, authentic, publicly available quality datasets that future researchers can use, compare against, and evaluate on. For ease of research, we present the ten most popularly used datasets in Fig. 7a; Table II includes details of 37 COVID-19 datasets, from which we selected the ten most-cited datasets for this figure. From the literature, it is evident that the datasets used by researchers are highly unbalanced. This raises concerns about the generalizability of the trained models to prospective patients' data. Some studies utilized methods for combating the class-imbalance problem, such as dual sampling in [35] and SMOTE in [131]. However, the vast majority of works have suffered from the challenge of unbalanced data. Secondly, any model developed for detecting COVID-19 should deliver its claimed accuracy on unseen/prospective subjects' data or on data from a different hospital. Thus, we believe that a cross-dataset study is of paramount importance to ascertain the generalizability of a model with respect to variations in images across sites. To the best of our knowledge, cross-data evaluation has been conducted in only a few studies [53], [112]. For the successful classification of a new test image, it is assumed that this image will exhibit features similar to those learned during the training of the classification model. This necessitates the creation of a global training dataset that captures the major features. Furthermore, proper benchmarking of different methods (or cross-method analysis) on the same dataset should be carried out to ascertain the efficacy of the proposed methods. A viral infection affects different parts of the body with different severity, leading to multiple symptoms. The accuracy of detection or diagnosis of a disease depends on the effectiveness of identifying and measuring these symptoms or patterns. Different diagnostic tools are used to identify these symptoms, measured at varying degrees and levels. Accumulating patterns from various modalities can provide more diverse features than individual variables, which can be utilized to train a DL model better. For the detection of COVID-19, besides CXR and CT-scan images, cough recordings and thermal images can be used to augment the detection capabilities of a model. A model can have practical application only if it has a high degree of generalization ability, and multimodal data analysis provides a promising route to achieving it. An explanation of how a DL model has reached a certain conclusion is crucial for ensuring trust and transparency, especially when one deals with identifying a life-threatening disease such as COVID-19. In order to be confident in the decision, doctors would like to know how the AI decides whether someone is suffering from the disease by analyzing CXR and CT-scan images. In this paper, we have surveyed some existing techniques used for interpreting DL models trained for COVID-19 classification; there is a need to explore more Explainable AI methods for COVID-19 diagnosis, as used in other applications [147], [148]. Annotation of medical images is laborious owing to the shortage of radiologists and technicians who can label the images. Deep learning has great power to extract features from images, but its performance depends heavily on the amount of labeled training data. However, one can still train deep networks by utilizing semi-supervised and reinforcement learning methods that use a mixture of unlabeled data and limited labeled data. This can address the problem of highly imbalanced data, one of the major challenges in COVID-19 image analysis, when it arises from the difficulty of labeling/annotation.
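SMOTE-based balancing of the kind used in [131] can be sketched with the imbalanced-learn library as follows. Since SMOTE interpolates flattened vectors, it is typically applied to extracted features rather than raw images; the file names are illustrative assumptions.

import numpy as np
from imblearn.over_sampling import SMOTE

features = np.load("deep_features.npy")   # illustrative: (n_samples, n_feats)
labels = np.load("labels.npy")            # integer class labels

# Synthesize minority-class samples by interpolating between neighbors.
X_bal, y_bal = SMOTE(random_state=42).fit_resample(features, labels)
print(np.bincount(y_bal))                 # classes are now equally represented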
It is important to predict not only COVID-19 but also its degree of severity in a patient for deciding appropriate treatment. Tabik et al. [91] classified COVID-19-positive CXR images into severe, moderate, and mild classes based on the severity level of the disease. A more comprehensive understanding of the severity of the disease can aid doctors in treating patients appropriately. In all, future improvements will require the collection of hundreds or thousands of labeled images of severe COVID-19 and other pneumonia cases. The dataset should be collected with geographic diversity in mind, which will help increase its applicability worldwide. In addition, future work should also consider identifying the infected pulmonary region as belonging to the left lung, right lung, or bi-pulmonary region. One study has already been done in this direction by employing a residual attention network as a basic block [139]. Besides the above suggestions based on the AI/ML work on COVID, a few more suggestions are in order, as discussed below. It has been noted that the COVID-19 virus mutates rapidly, and several variants have evolved over time. Hence, quickly and widely scaling up the diagnostic capabilities of AI-based automated solutions will be critical for diagnosing new variants of COVID-19, supporting decision-making, and choosing the way ahead. Regional variations in the impact of the virus on human organs may also be studied; this can assist in identifying a robust global/local solution. A global solution needs global participation. As this pandemic has affected every corner of humanity, any strategy or measure to handle the crisis relies on wide public acceptance. In order to build public trust, decisions and information must be transparent and openly available, especially when they relate to people's health. Any vaccine or medicine development program needs a high degree of public confidence and should be in the common interest. At the same time, international legislation and regulatory bodies will play a crucial role in ensuring the needs of individuals, preserving intellectual rights, and resolving disputes. Accessibility, availability, and affordability for everyone must also be ensured. This study presents a comprehensive review of the work on COVID-19 diagnosis based on CXR and CT image analysis using deep learning. The state-of-the-art DL techniques for CXR, CT, and multimodal data diagnosis are presented in Tables III, IV, and V, respectively. Publicly available datasets used in the reviewed studies are summarized in Table II. We discussed the challenges associated with current DL approaches for COVID-19 diagnosis. It is important to note that each study in the literature has shown potential for automated detection of COVID-19 and, at the same time, faced challenges or fell short in the analysis and evaluation of the proposed solutions from several points of view. We are of the considered opinion that consolidating the important observations can serve as a benchmark and assist researchers in developing efficient DL-based automated COVID-19 detection solutions. Some of the important findings from this study are as follows. This review indicates significant utilization of DL methods for the automatic identification of COVID-19 cases from other pulmonary diseases and normal groups. Despite so many studies being undertaken, the majority of the research has been carried out on either CXR or CT image analysis alone. Further, most studies utilized smaller datasets and lacked comparative analysis with existing research.
It is further noted that codes and data are not available for many studies, posing challenges in ascertaining the utility of the methods in clinical settings. Although efforts are now being made to show the interpretability of DL models via visual saliency on CXR or CT images, these methods are still in the early stages. In order to assist clinicians in hospitals with COVID-19 diagnosis and care, the upcoming trends in this area require cross-data evaluation (i.e., testing on the unseen dataset of a different hospital) and cross-method comparison or benchmarking of the most recent methods. Availability of code and data in the public space should be required with any research paper so that future researchers/clinicians can deploy and test the methods in actual hospital settings. Efforts should be made to consolidate some bigger public, comprehensive, and diverse datasets having multimodal COVID-19 data collected from multiple sites. This would allow researchers to develop more reliable methods and also enable benchmarking of methods. Interpretability of AI methods should be demonstrated and validated with the help of expert radiologists. It is highly recommended that clinicians, radiologists, and AI engineers work together to evolve interpretable and reliable DL solutions that can also be deployed with ease in hospitals. Otherwise, despite numerous global efforts, it will take time to utilize these technologies in hospitals to assist mankind.
REFERENCES
A novel coronavirus from patients with pneumonia in China
A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster
The role of chest imaging in patient management during the COVID-19 pandemic: A multinational consensus statement from the Fleischner society
Sensitivity of chest CT for COVID-19: Comparison to RT-PCR
Can AI help in screening viral and COVID-19 pneumonia?
COVID-19 in diabetic patients: Related risks and specifics of management
New research reveals why some patients may test positive for COVID-19 long after recovery
Artificial intelligence-enabled rapid diagnosis of patients with COVID-19
The multimodal deep learning for diagnosing COVID-19 pneumonia from chest CT-scan and X-ray images
Deep neural network to detect COVID-19: one architecture for both CT scans and chest X-rays
Deep learning and medical image processing for coronavirus (COVID-19) pandemic: A survey
Role of deep learning in early detection of COVID-19: Scoping review
Deep neural networks for COVID-19 detection and diagnosis using images and acoustic-based techniques: a recent review
A comprehensive survey of COVID-19 detection using medical images
CovidCTNet: an open-source deep learning approach to diagnose covid-19 using small cohort of CT images
FSS-2019-nCov: A deep learning architecture for semi-supervised few-shot segmentation of COVID-19 infection
Deep learning models for COVID-19 infected area segmentation in CT images
Inf-Net: Automatic COVID-19 lung infection segmentation from CT images
An integrated feature framework for automated segmentation of COVID-19 infection from lung CT images
Robust chest CT image segmentation of COVID-19 lung infection based on limited data
Fully convolutional networks for semantic segmentation
U-Net: Convolutional networks for biomedical image segmentation
Transfer-to-transfer learning approach for computer aided detection of COVID-19 in Chest Radiographs
V-Net: Fully convolutional neural networks for volumetric medical image segmentation
UNet++: A nested U-Net architecture for medical image segmentation (in Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support)
Learning distinctive filters for COVID-19 detection from chest X-ray using shuffled residual CNN
Deep learning COVID-19 features on CXR using limited training data sets
Automatically discriminating and localizing COVID-19 from community-acquired pneumonia on chest X-rays
Bi-Directional ConvLSTM U-Net with densely connected convolutions
Development and evaluation of an artificial intelligence system for COVID-19 diagnosis
Using artificial intelligence to detect COVID-19 and community-acquired pneumonia based on pulmonary CT: Evaluation of the diagnostic accuracy
AI-assisted CT imaging analysis for COVID-19 screening: Building and deploying a medical AI system
A deep learning system to screen novel coronavirus disease 2019 pneumonia
Deep learning-based detection for COVID-19 from chest CT using weak label
Dual-sampling attention network for diagnosis of COVID-19 from community acquired pneumonia
Abnormal lung quantification in chest CT images of COVID-19 patients with deep learning and its application to severity prediction
ImageNet: A large-scale hierarchical image database
ImageNet classification with deep convolutional neural networks
Very deep convolutional networks for large-scale image recognition
Deep residual learning for image recognition
Going deeper with convolutions
Xception: Deep learning with depthwise separable convolutions
Inception-v4, Inception-ResNet and the impact of residual connections on learning
Densely connected convolutional networks
Learning deep features for discriminative localization
Grad-CAM: Visual explanations from deep networks via gradient-based localization
Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks
Labeled optical coherence tomography (OCT) and chest X-Ray images for classification
Chest X-Ray Images (Pneumonia)
Pneumonia sample xrays
COVID-19 detection using deep learning models to exploit social mimic optimization and structured chest X-ray images using fuzzy color and stacking approaches
Padchest: A large chest x-ray image dataset with multi-label annotated reports
AI for radiographic covid-19 detection selects shortcuts over signal
Computed Tomography, and Chest X-rays for the Detection of COVID-19
COVID-19 CXR (all SARS-CoV-2 PCR+)
Automatic COVID-19 detection from X-ray images using ensemble learning with convolutional neural network
Actualmed-covid-chestxray-dataset
CoroDet: A deep learning based classification for COVID-19 detection using chest X-ray images
A combined deep CNN-LSTM network for the detection of novel coronavirus (COVID-19) using X-ray images
Classification of COVID-19 chest X-rays with deep learning: new models or fine tuning
COVID-19 image data collection
Deep learning approaches for COVID-19 detection based on chest X-ray images
Application of deep learning for fast detection of COVID-19 in X-rays using nCOVnet
FocusCovid: automated COVID-19 detection using deep learning with chest X-ray images
Automated detection of COVID-19 cases using deep neural networks with X-ray images
COVID-CAPS: A capsule network-based framework for identification of COVID-19 cases from X-ray images
Convolutional capsnet: A novel artificial neural network approach to detect COVID-19 disease from X-ray images using capsule networks
COVIDiagnosis-net: Deep bayes-squeezenet based diagnosis of the coronavirus disease 2019 (COVID-19) from X-ray images
Classification of COVID-19 in chest X-ray images using DeTraC deep convolutional neural network
DL-CRC: Deep learning-based chest radiograph classification for COVID-19 detection: A novel approach
Computer-aided detection of COVID-19 from X-ray images using multi-CNN and Bayesnet classifier
An optimized deep learning architecture for the diagnosis of COVID-19 disease based on gravitational search optimization
InstaCovNet-19: A deep learning classification model for the detection of COVID-19 patients using Chest X-ray
Improving the performance of CNN to predict the likelihood of COVID-19 using chest X-ray images with preprocessing algorithms
A deep learning approach to detect COVID-19 coronavirus with X-ray images
XCOVNet: Chest X-ray image classification for COVID-19 early detection using convolutional neural networks
COVIDX-net: A framework of deep learning classifiers to diagnose COVID-19 in X-ray images
COVID-19 identification in chest X-ray images on flat and hierarchical classification scenarios
A modified deep convolutional neural network for detecting COVID-19 and pneumonia from chest X-ray images based on the concatenation of Xception and ResNet50V2
A new approach for classifying coronavirus COVID-19 based on its manifestation on chest X-rays using texture features and neural networks
A deep learning framework to detect COVID-19 disease via chest X-ray and CT scan images
Application of deep learning techniques for detection of COVID-19 cases using chest X-ray images: A comprehensive study
Explainable deep learning for pulmonary disease and coronavirus COVID-19 detection from X-rays
ADOPT: automatic deep learning and optimization-based approach for detection of novel coronavirus COVID-19 disease using X-ray images
Analysis of COVID-19 infections on a CT image using deepsense model
CoroNet: A deep neural network for detection and diagnosis of COVID-19 from chest X-ray images
CCBlock: an effective use of deep learning for automatic diagnosis of COVID-19 using X-ray images
Within the lack of chest COVID-19 X-ray dataset: A novel detection model based on GAN and deep transfer learning
COVID-19: automatic detection from X-ray images utilizing transfer learning with convolutional neural networks
COVIDGR dataset and COVID-SDNet methodology for predicting COVID-19 based on chest X-ray images
Covid cases
X-ray and CT-scan-based automated detection and classification of COVID-19 using convolutional neural networks (CNN)
Chexpert: A large chest radiograph dataset with uncertainty labels and expert comparison
Two public chest X-ray datasets for computer-aided screening of pulmonary diseases
COV19-CNNet and COV19-ResNet: Diagnostic inference engines for early detection of COVID-19
CovXNet: A multidilation convolutional neural network for automatic COVID-19 and other pneumonia detection from chest X-ray images with transferable multi-receptive feature optimization
MosMedData: COVID19_1000 dataset
Deep learning for diagnosis of COVID-19 using 3D CT scans
Dataset contains chest xray images of Covid-19, Pneumonia and normal patients
CoronaHack - chest X-Ray-dataset, classify the X Ray image which is having Corona
COVID-19 radiography database
Deep transfer learning based classification model for COVID-19 disease
Covid cases
Deep-chest: Multi-classification deep learning model for diagnosing COVID-19, pneumonia, and lung cancer chest diseases
Automated detection of COVID-19 from CT scan using convolutional neural network
RSNA pneumonia detection challenge
Towards an effective and efficient deep learning model for COVID-19 patterns detection in X-ray images
COVID-19 Patients Lungs X Ray Images 10000
Tuberculosis chest X-ray image data sets
COVID cases
SARS-CoV-2 CT-scan dataset: A large dataset of real patients CT scans for SARS-CoV-2 identification
Explainable COVID-19 detection using chest CT scans and deep learning
Transfer learning-based approach for detecting COVID-19 ailment in lung CT scan
Tianchi competition
Covid-CT-dataset: a CT scan dataset about COVID-19
A light CNN for detecting COVID-19 from CT scans of the chest
COVID-19 detection system using chest CT images and multiple kernels-extreme learning machine based on deep neural network
Diagnosis of COVID-19 using CT scan images and deep learning techniques
BIMCV COVID-19+: a large annotated dataset of RX and CT images from COVID-19 patients
ChestX-Ray8: Hospital-scale chest X-Ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases
COVID-Net: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images
Datasets & analysis
Clinically applicable AI system for accurate diagnosis, quantitative measurements and prognosis of COVID-19 pneumonia using Computed Tomography
COVID-19 CT Lung and Infection Segmentation Dataset
Open resource of clinical data from patients with pneumonia for the prediction of COVID-19 outcomes via deep learning
Novel feature selection and voting classifier algorithms for COVID-19 classification in CT images
Attention-based VGG-16 model for COVID-19 chest X-ray image classification
Application of deep learning technique to manage COVID-19 in routine clinical practice using CT images: Results of 10 convolutional neural networks
Abnormal lung quantification in chest CT images of COVID-19 patients with deep learning and its application to severity prediction
JCS: An explainable COVID-19 diagnosis system by joint classification and segmentation
Rethinking the inception architecture for computer vision
A deep learning algorithm using CT images to screen for corona virus disease (COVID-19)
Improving performance of deep learning models with axiomatic attribution priors and expected gradients
A fully automatic deep learning system for COVID-19 diagnostic and prognostic analysis
Social mimic optimization algorithm and engineering applications
The ensemble deep learning model for novel COVID-19 on CT images
COVID-19 classification by FGCNet with deep feature fusion from graph convolutional network and convolutional neural network
Deep learning enables accurate diagnosis of novel coronavirus (COVID-19) with CT images
FocusNet: An attention-based fully convolutional network for medical image segmentation
Deep learning on chest X-ray images to detect and evaluate pneumonia cases at the era of covid-19
Do explanations reflect decisions? a machine-centric strategy to quantify the performance of explainability algorithms
Explainable AI: A review of machine learning interpretability methods
Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
ACKNOWLEDGMENT
The authors would like to express their sincere appreciation to the editor and anonymous reviewers for their comments and valuable suggestions, which have helped in improving the manuscript significantly.
DECLARATION OF COMPETING INTEREST
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.