key: cord-0951965-u41hdci9
authors: Loey, Mohamed; El-Sappagh, Shaker; Mirjalili, Seyedali
title: Bayesian-based optimized deep learning model to detect COVID-19 patients using chest X-ray image data
date: 2022-01-05
journal: Comput Biol Med
DOI: 10.1016/j.compbiomed.2022.105213
sha: 470f3d447b46193dfb14598c94a3e96b075362b9
doc_id: 951965
cord_uid: u41hdci9

Coronavirus Disease 2019 (COVID-19) is extremely infectious and is spreading rapidly around the globe. As a result, rapid and precise identification of COVID-19 patients is critical. Deep learning has shown promising performance in a variety of domains and has emerged as a key technology in artificial intelligence. Recent advances in visual recognition are based on image classification and the detection of artefacts within images. The purpose of this study is to classify chest X-ray images showing COVID-19 artefacts under varying real-world conditions. A novel Bayesian optimization-based convolutional neural network (CNN) model is proposed for the recognition of chest X-ray images. The proposed model has two main components. The first uses a CNN to extract and learn deep features. The second is a Bayesian-based optimizer that tunes the CNN hyperparameters according to an objective function. The large-scale, balanced dataset used comprises 10,848 images (3616 COVID-19, 3616 normal, and 3616 pneumonia cases). In the first ablation investigation, we compared Bayesian optimization across three distinct ablation scenarios, using convergence charts and accuracy to compare them. The optimal architecture derived by Bayesian search achieved 96% accuracy. To assist qualitative researchers in addressing their research questions in a methodologically sound manner, a comparison of research methods and thematic analysis methods is provided. The suggested model is shown to be more trustworthy and accurate in real-world settings.
COVID-19, a novel form of coronavirus disease, has wreaked havoc on the global health system, claiming thousands of lives and affecting millions more [1], [2]. The coronavirus SARS-CoV-2 was first detected in humans in December 2019, and it spreads mostly via droplets produced by infected individuals when they talk or cough. Because the droplets cannot travel long distances, the virus is not transmitted from human to human without close contact [3], [4]. COVID-19 has been identified as a member of the coronavirus family [5], [6]. COVID-19 infection is spreading every day owing to a lack of rapid diagnosis technologies. This illness will claim a staggering number of lives worldwide. ... congested places [8]-[10]. Governments have enacted new regulations to address overcrowding, both locally and regionally, and governments and healthcare organizations have implemented infection control systems for this purpose [11]-[13]. Several nations are currently developing vaccines against COVID-19. Among these, the Pfizer, Moderna, Sputnik V, Sinovac, and AstraZeneca vaccines have been approved and are being used in many nations [14]-[16]. According to clinical data, the widely used vaccines are effective and safe to use without causing major adverse effects. Nonetheless, production on a vast industrial scale is necessary to make enough vaccine to cover the whole world's population. Additional study is needed to determine the duration of protection and the vaccines' efficacy, especially against newly discovered viral variants. Additional studies are also needed to develop an effective screening procedure for diagnosing and isolating viral cases.
Health professionals and scientists in numerous countries are seeking to strengthen their treatment plans and testing capability by introducing multifunctional testing, in order to halt the spread of the virus and to protect people from the fatal infection [17]. Most proposed models require chest X-ray or CT data from patients as the primary input, which can be obtained only from diagnostic centers [18], [19]. Thus, each patient must make an in-person visit to a diagnostic center to confirm the presence of COVID-19 in his or her body. Most households in underdeveloped nations lack access to private transport, and individuals living in rural regions must travel a considerable distance to reach a diagnostic center. As a result, individuals must use public transportation to reach the diagnostic center for COVID-19 testing, which increases the risk of spreading COVID-19, among other things [20]. ... [29].

The primary goal of this study is to characterize the COVID-19 features detected in chest X-ray images using a Bayesian-optimized deep learning (DL) model. The primary contributions of this study are as follows:
1) A novel DL model for recognizing COVID-19 from chest X-ray images is proposed.
2) Bayesian optimization is used in place of sweeping hyperparameters throughout an experiment.
3) By identifying suitable network hyperparameters and training options for the CNN, the proposed approach improves recognition efficiency.
4) The proposed model has been trained, optimized, and tested using a real dataset. This dataset is large compared with those in the literature and is balanced, which supports the trustworthiness of the results for domain experts.
The optimal DL network discovered during optimization is loaded together with its validation accuracy.

The remainder of this paper is organized as follows. Section 2 reviews related work. Section 3 describes the key characteristics of the dataset.
Section 4 presents the methodology of the proposed DL model. Section 5 discusses the testing results, and Section 6 presents the conclusion and future work.

In [30], the authors used a hybrid model on chest X-ray radiography (CXR) images that applies a DL-based decision-tree (DT) classifier to detect COVID-19. This classifier tested a set of three binary DTs made using the ... The results showed an accuracy of 99%. In [39], CNN and machine learning classifiers were used to build a model, and many tests were run with the CNN to identify COVID-19 from chest X-ray images. The best accuracy of the proposed DL model was above 98%, compared with the machine learning algorithms. The chest X-ray dataset contained 4292 pneumonia, 225 COVID-19, and 1583 healthy patients. The system achieved an accuracy of 98.5%. The authors ultimately determined that the proposed CNN system could identify COVID-19 patients from a small number of cases, without any preprocessing and with the least possible number of layers. A deep learning algorithm based on the ResNet CNN model was used to identify COVID-19 in [40]. In that technique, thousands of images were used in the pre-training phase to distinguish significant items, and a different number of images were used in the retraining phase to search for abnormalities in chest X-ray data. The COVID-19 chest X-ray dataset had 154 COVID-19 and 5828 healthy patients. The study achieved an accuracy of 72%.

Table 1. Number of chest X-ray images per class in the datasets of related studies.

Study                        COVID-19   Pneumonia   Normal
Abbas et al. [34]            105        SARS = 11   80
El-Rashidy et al. [35]       250        -           500
Minaee et al. [36]           184        -           5000
Wang et al. [37]             140        9576        8851
Khan and Aslam [38]          195        -           862
Sekeroglu and Ozsahin [39]   225        4292        1583
Che Azemin et al. [40]       154        -           5828

As shown in Table 1, most published research on COVID-19 diagnosis has employed chest X-ray data, which highlights the critical role of chest X-ray image analysis as an indispensable tool for physicians and radiographers. However, these studies used different and imbalanced datasets, and they extracted insufficient features from the images. As a result, the classification outcomes were neither accurate nor as intended [41], [42]. The majority of the studies discussed above relied heavily on mathematical analysis and transfer learning to diagnose COVID-19 infection reliably. There is little research on using a CNN with balanced data and an optimization technique to identify COVID-19 in X-ray imaging. As a result, more research on deep learning with simplified efficiency criteria may be conducted. Based on the literature review conducted for this study, it is recommended that chest scans with balanced data be used to diagnose COVID-19. The new paradigms are generally more effective and efficient in combating the COVID-19 epidemic.

COVID-19 patients are anticipated to undergo a variety of rigorous data gathering procedures. Not only the structure of the samples inside a collection, but also their distribution across classes, has a substantial impact on the model that will be created. Color, geometry, and pattern have a direct influence on the performance of intelligent computer-aided prototypes. Additionally, a consistent and robust model requires an equal number of samples for each class, covering all conceivable situations or occurrences. The experiments in this paper are based on two public X-ray datasets. The first is the COVID-19 Radiography dataset published by Rahman et al. [43], [44]. The collection includes 3616 COVID-19, 10,192 normal, 6012 lung opacity, and 1345 viral pneumonia cases. The second public X-ray dataset is the Chest X-Ray Images dataset.
The collection [45] contains 5,863 images of patients with pneumonia or normal lung function. We developed a new dataset by integrating the COVID-19 Radiography dataset with the Chest X-Ray Images dataset. After eliminating low-quality and redundant images, the combined dataset comprises 10,848 scans (3616 COVID-19, 3616 normal, and 3616 pneumonia). The resulting dataset is balanced, as illustrated in Fig. 1. We split our dataset into three sets, as shown in Fig. 2. To demonstrate the suggested model on a publicly available dataset, we created a model that performs X-ray classification. The diagnostic engine uses this X-ray classifier to determine whether an X-ray image is associated with COVID-19. To assess the classifier, we employ two datasets of COVID-19, normal, and pneumonia images. The new dataset is a large archive covering an unusually diverse population of COVID-19 patients.

Figure 4 illustrates the topology of the suggested recognition model. The testing data are used to assess the optimized model. After the CNN hyperparameters have been tuned, the Bayesian optimizer picks the optimum hyperparameters to be used in the testing stage. The test procedure is used to find the optimal hyperparameters of the CNN model. During the first iteration of the search, we train the CNN using the default hyperparameters. Then, the objective function over the CNN hyperparameters is approximated using the validation loss, which serves as our fitness function. Next, we obtain a fresh set of hyperparameters by using the expected improvement acquisition function. This acquisition function specifies whether the next set of hyperparameters is created randomly or using the fitness model. Once the hyperparameters are obtained, we update the CNN architecture to match them. The CNN is then trained with the training procedure and used to calculate the validation loss. After that, the Bayesian model is updated to provide a more precise estimate of the objective function.
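The search loop described above can be sketched in plain Python. This is only an illustrative sketch under stated assumptions, not the authors' MATLAB implementation: it tunes a single hyperparameter (the base-10 logarithm of the learning rate), uses a small hand-written Gaussian-process surrogate with an expected improvement acquisition function, and replaces "train the CNN and return the validation loss" with a hypothetical function `toy_validation_loss`. All function names, the kernel length scale, and the candidate grid are assumptions introduced for illustration. Unlike the procedure in the text, the sketch never falls back to random proposals after the initial probes.

```python
import math
import random


def rbf(a, b, length=0.5):
    """Squared-exponential kernel between two scalar points."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def forward_solve(L, b):
    """Solve L y = b for a lower-triangular matrix L."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def fit_gp(xs, ys, noise=1e-4):
    """Fit a GP surrogate to the observed (hyperparameter, loss) pairs."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = [[0.0] * n for _ in range(n)]  # Cholesky factor, K = L L^T
    for i in range(n):
        for j in range(i + 1):
            s = K[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    mean = sum(ys) / n
    z = forward_solve(L, [y - mean for y in ys])
    alpha = [0.0] * n  # back substitution: L^T alpha = z
    for i in reversed(range(n)):
        alpha[i] = (z[i] - sum(L[k][i] * alpha[k] for k in range(i + 1, n))) / L[i][i]
    return xs, L, alpha, mean


def predict(model, x):
    """Posterior mean and standard deviation of the surrogate at x."""
    xs, L, alpha, mean = model
    k_star = [rbf(x, xi) for xi in xs]
    mu = mean + sum(k * a for k, a in zip(k_star, alpha))
    v = forward_solve(L, k_star)
    return mu, math.sqrt(max(1.0 - sum(t * t for t in v), 1e-12))


def expected_improvement(mu, sigma, best):
    """Expected improvement over the best loss so far (minimisation)."""
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mu) * cdf + sigma * pdf


def bayesian_search(objective, low, high, default, n_iter=30, n_grid=121, seed=0):
    """Evaluate `objective` n_iter times, choosing points by expected improvement."""
    rng = random.Random(seed)
    grid = [low + (high - low) * i / (n_grid - 1) for i in range(n_grid)]
    # First evaluation uses the default value, then two random probes.
    xs = [default] + rng.sample([g for g in grid if g != default], 2)
    ys = [objective(x) for x in xs]
    while len(xs) < n_iter:
        model = fit_gp(xs, ys)  # update the surrogate with all results so far
        best = min(ys)
        candidates = [g for g in grid if g not in xs]
        x_next = max(candidates,
                     key=lambda g: expected_improvement(*predict(model, g), best))
        xs.append(x_next)
        ys.append(objective(x_next))
    i_best = ys.index(min(ys))
    return xs[i_best], ys[i_best], list(zip(xs, ys))


def toy_validation_loss(log10_lr):
    """Hypothetical stand-in for training the CNN and measuring validation loss."""
    return 0.15 + 0.3 * (log10_lr + 1.8) ** 2


best_x, best_loss, history = bayesian_search(toy_validation_loss, -3.0, 0.0, default=-2.0)
print(f"best learning rate ~ {10 ** best_x:.4f}, loss {best_loss:.4f}, "
      f"evaluations {len(history)}")
```

In a real run, `toy_validation_loss` would be replaced by a function that builds the CNN with the proposed hyperparameters, trains it, and returns the validation loss, and the search space would cover all four tuned hyperparameters rather than one.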
This procedure is repeated 30 times in total, and the model with the lowest loss is chosen after the 30 iterations. This is the procedure by which the Bayesian-optimized CNN model tunes its hyperparameters. It is necessary to decide which hyperparameters are optimized before commencing the optimization.

In the broadest sense, optimization is the process of identifying a point that minimizes a real-valued function known as the fitness function. Bayesian optimization is one such process. Table 2 shows the four tuning hyperparameters (initial learning rate, SGD with momentum, depth of the network, and L2 regularization) used in DL training and generated by the Bayesian tuning algorithm.

As seen in Figure 5, the proposed CNN architecture consists of two stages: feature extraction and classification (the learning classifier). The objective of the feature extraction stage is to extract significant characteristics from the data. Convolutional layers perform feature extraction in a CNN. The learning classifier stage trains the system to categorize data based on the characteristics extracted by the feature extraction layers. The learning classifier is composed of one or more fully connected layers. Each layer has a certain number of nodes.

Assume layer $l$ is convolutional, so that we have an $N \times N$ set of square neuron nodes followed by a convolutional layer. If we employ an $m \times m$ filter $\omega$, the output of the convolutional layer will be of size $(N - m + 1) \times (N - m + 1)$, which results in $k$ feature maps. The convolutional layer functions as a feature extractor, capturing characteristics from the inputs. Convolution retrieves picture features such as edges, lines, and corners. The output of the convolution function is calculated as in equation (2):

$$x_{ij}^{l} = \sum_{a=0}^{m-1} \sum_{b=0}^{m-1} \omega_{ab}\, y_{(i+a)(j+b)}^{l-1} + B^{l} \qquad (2)$$

where $B^{l}$ is a bias, $\omega$ is the mask of size $m \times m$, and $y^{l-1}$ is the input from layer $l - 1$. The convolution operation then applies its activation function as specified in equation (3):

$$y_{ij}^{l} = \Omega(x_{ij}^{l}) \qquad (3)$$

where $\Omega(\cdot)$ is referred to as the non-linearity. The function used to generate non-linearity in deep learning comes in a variety of flavors, including tanh, sigmoid, and Rectified Linear Units (ReLU). In our technique, we use ReLU as the activation function, as in equation (4), to facilitate the training phase:

$$\Omega(x) = \max(0, x) \qquad (4)$$

Bayesian optimization uses historical data to choose the optimal hyperparameters for assessment. Bayesian optimization has been applied in machine learning models and simulations [46], [47]. It eases the time-consuming job of optimizing a large number of parameters. It has been used in several trials to determine the optimal set of gait characteristics.

Our paper presents a unique CNN model that is trained entirely from scratch, rather than using the transfer learning (TL) technique. We trained our deep learning model on an Nvidia GPU using TensorFlow and MATLAB (2021a). We implement the proposed CNN model using the recommended training configuration (batch norm decay = 0.2, weight decay = 0.001, and dropout = 0.6). To avoid the overfitting concerns associated with deep nets, we use the dropout strategy [48]. Early stopping is applied if no improvement in accuracy is observed. The starting learning rate is chosen from the domain [0.001, 1] with a batch size of 64, and the learning rate is automatically reduced. This resulted in a shorter training time without sacrificing efficiency. The authors of [49] observed that model output increased as more samples were used in 10-fold cross-validation. SGDM [50] was selected as our optimizer strategy for enhancing CNN detection performance. Validation accuracy is a classification score used to evaluate the effectiveness of the learning approach throughout the procedure. It makes it possible to identify overfitting: if the validation accuracy falls well below the training accuracy, overfitting has already occurred. The proposed CNN model then updates the training configuration parameters.
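As an illustration of the convolution and ReLU steps in equations (2) to (4), the following minimal Python sketch applies one square filter to one square input without padding and then applies ReLU. The function names, the example image, and the example filter are assumptions introduced here for illustration; a real CNN layer would apply many filters over multi-channel inputs using an optimized library.

```python
def conv2d_valid(image, kernel, bias=0.0):
    """Slide an m x m kernel over an N x N image without padding.

    The result has (N - m + 1) x (N - m + 1) entries, one per kernel position.
    """
    n, m = len(image), len(kernel)
    out = n - m + 1
    return [
        [
            sum(kernel[a][b] * image[i + a][j + b]
                for a in range(m) for b in range(m)) + bias
            for j in range(out)
        ]
        for i in range(out)
    ]


def relu(feature_map):
    """Rectified Linear Unit: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Hypothetical 4 x 4 input and a 3 x 3 vertical-edge filter.
image = [
    [1, 2, 0, 1],
    [0, 1, 3, 1],
    [2, 1, 0, 0],
    [1, 0, 1, 2],
]
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

pre_activation = conv2d_valid(image, kernel)  # 2 x 2, since 4 - 3 + 1 = 2
feature_map = relu(pre_activation)
print(pre_activation)  # [[0.0, 2.0], [-1.0, -1.0]]
print(feature_map)     # [[0.0, 2.0], [0.0, 0.0]]
```

The 4 x 4 input and 3 x 3 filter give a 2 x 2 output, matching the "input side minus filter side plus one" rule, and ReLU removes the two negative responses.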
To achieve the highest degree of model efficiency, an efficient balance between classes must be found. The dataset was divided according to three scenarios: 1) Scenario 1: the data are split into 60% for training, 10% for validation, and 30% for testing; 2) Scenario 2: the data are split into 70% for training, 10% for validation, and 20% for testing; 3) Scenario 3: the data are split into 80% for training, 10% for validation, and 10% for testing. Our dataset is balanced, so it is sufficient to measure the model accuracy, i.e.,

$$\text{accuracy} = \frac{TP + TN}{(TP + FP) + (TN + FN)}$$

where $TP$ is the number of properly labelled instances, $FP$ is the number of incorrectly labelled instances, $TN$ is the number of instances of the remaining classes that are properly labelled, and $FN$ is the total number of incorrectly labelled instances in the remaining classes. The confusion matrices for the three groups of labels (COVID-19, normal, and pneumonia) are also reported.

Scenario 1: Table 3 shows the results of optimizing the CNN hyperparameters (depth of the network, initial learning rate, SGD with momentum, and L2 regularization) using the Bayesian technique. The maximum number of objective function evaluations is 30. The best estimated CNN hyperparameters are depth of the network = 2, initial learning rate = 0.010518, SGD with momentum = 0.83379, and L2 regularization = 1.606e-05 in all layers. Figure 6(a) shows an overall accuracy of 95.1% for the best CNN model in detecting COVID-19 patients. Figure 6(b) plots the minimum objective against the number of function evaluations, with the minimum objective on the y-axis and the number of function evaluations on the x-axis.

Scenario 2: Table 4 shows the results of optimizing the CNN hyperparameters (depth of the network, initial learning rate, SGD with momentum, and L2 regularization) using the Bayesian technique. The maximum number of objective function evaluations is 30.
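The three split scenarios and the accuracy measure defined above can be sketched in Python as follows. This is an illustrative sketch only: integer indices stand in for image files, the per-class split by rounding is an assumption (the source does not state how the splits were drawn), the confusion matrix is hypothetical, and all function names are introduced here for illustration.

```python
import random

CLASSES = ("COVID-19", "Normal", "Pneumonia")
# (train, validation, test) fractions for the three scenarios in the text.
SCENARIOS = {1: (0.6, 0.1, 0.3), 2: (0.7, 0.1, 0.2), 3: (0.8, 0.1, 0.1)}


def stratified_split(items_by_class, fractions, seed=0):
    """Split every class with the same fractions so each subset stays balanced."""
    rng = random.Random(seed)
    f_train, f_val, _ = fractions
    train, val, test = [], [], []
    for label, items in items_by_class.items():
        items = list(items)
        rng.shuffle(items)
        n_train = round(f_train * len(items))
        n_val = round(f_val * len(items))
        train += [(x, label) for x in items[:n_train]]
        val += [(x, label) for x in items[n_train:n_train + n_val]]
        test += [(x, label) for x in items[n_train + n_val:]]  # remainder
    return train, val, test


def class_accuracy(confusion, c):
    """(TP + TN) / ((TP + FP) + (TN + FN)) for class index c.

    Rows of `confusion` are true classes, columns are predicted classes.
    """
    total = sum(sum(row) for row in confusion)
    tp = confusion[c][c]
    fn = sum(confusion[c]) - tp
    fp = sum(row[c] for row in confusion) - tp
    tn = total - tp - fn - fp
    return (tp + tn) / ((tp + fp) + (tn + fn))


def overall_accuracy(confusion):
    """Fraction of all images whose predicted class equals the true class."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / sum(sum(row) for row in confusion)


# 3616 images per class, as in the balanced dataset; indices replace file names.
dataset = {label: range(3616) for label in CLASSES}
for scenario, fractions in SCENARIOS.items():
    train, val, test = stratified_split(dataset, fractions)
    print(f"Scenario {scenario}: train={len(train)} val={len(val)} test={len(test)}")

# Hypothetical confusion matrix, not a result from the paper.
example_confusion = [[90, 5, 5], [3, 95, 2], [4, 6, 90]]
print(round(overall_accuracy(example_confusion), 4))
print(round(class_accuracy(example_confusion, 0), 4))
```

Because every class is split with the same fractions, each of the training, validation, and test subsets remains balanced across the three classes.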
For Scenario 2, the best estimated CNN hyperparameters are depth of the network = 1, initial learning rate = 0.042721, SGD with momentum = 0.84845, and L2 regularization = 5.3403e-07. Figure 7(a) shows an overall accuracy of 95.2% for the best CNN model in classifying COVID-19 patients. Figure 7(b) plots the minimum objective against the number of function evaluations, with the minimum objective on the y-axis and the number of function evaluations on the x-axis.

Scenario 3: Table 5 ...

The comparative assessments of related work are given in Table 6. It is quite obvious that the suggested ...

Three ablation experiments were conducted to determine the effect of Bayesian optimization on our CNN model, as illustrated in Table 7. In the first ablation investigation, we compared Bayesian optimization across the three scenarios, using convergence plots and accuracy to compare them. We found that the best architecture obtained by Bayesian search had an accuracy of 96%. The issue with comparing COVID-related research is that most studies employed distinct datasets, and the split of each dataset into training, validation, and test sets is not publicly accessible. As a consequence, we trained and evaluated other researchers' techniques on our dataset in order to compare them with ours. Although the datasets were comparable in type, the distribution of data and the assessment process were distinct in each instance. Several studies use cross-validation, while others divide the whole dataset into training, validation, and test sets. Three kinds of X-rays (normal, pneumonia, and COVID-19) were included in the three-class datasets [32], [34], [37], [39]. Normal and COVID-19 X-rays were included in the two-class datasets [31], [35], [36], [38], [40]. Notably, the suggested model is assessed at the scale of the COVID-19 Radiography Database. Given the global prevalence of positive COVID-19 cases, one may argue that the database is insufficiently big.
However, we believe that this is a non-issue. Because the performance of CNN networks increases as the number of samples used in the training process increases, only computation time and physical hardware need to be considered in this scenario. Another critical point to remember is that by the time positive COVID-19 cases are found using X-ray images, the infection may have progressed dramatically. In other words, whereas X-ray images are a valuable tool for confirming positive COVID-19 cases, they may not be clinically meaningful for early diagnosis. In this regard, our paper presents a unique CNN model that was trained entirely from scratch, rather than using a transfer learning technique. Additionally, rather than employing pre-trained CNNs, the fully connected layers of the suggested architecture were investigated, analyzed, and employed for the COVID-19 infection detection task. Our research incorporates novel elements in this regard. Additionally, the suggested model is based on the end-to-end learning approach and does not use a bespoke feature extraction engine. Therefore, a model that is efficient, quick, and dependable was constructed, and encouraging results were obtained.

In this study, we offer a new classifier for chest X-ray images using convolutional neural network (CNN) models based on Bayesian optimization. The suggested model is composed of two distinct components. The first uses a CNN to extract features and perform classification. The second is a Bayesian optimizer that modifies the CNN hyperparameters in accordance with the objective function. The proposed COVID-19 dataset contains 10,848 images (3616 COVID-19, 3616 normal, and 3616 pneumonia). In the first ablation study, we compared Bayesian optimization across three different ablation scenarios. We compared the three scenarios using convergence charts and accuracy.
We observed that the optimum architecture obtained by Bayesian search was 96% accurate. The findings indicated that the best CNN model is the most successful in identifying balanced COVID-19 images when compared with other models assessed on a smaller dataset. This research was compared with previous research using COVID-19 X-ray images. The model outperformed the compared classifiers in terms of predictive power and significance. X-ray analysis is sufficiently promising to permit extrapolation and generalization. In the future, we want to contribute our findings to other machine learning and deep learning projects. Despite its high accuracy rates, the suggested study should be replicated on a larger scale, since it has the potential to be used in other medical applications.

Funding: This research received no external funding.

Conflict of interest: On behalf of all authors, the corresponding author states that there is no conflict of interest.

Ethical approval: This article does not contain any studies with human participants or animals performed by any of the authors.
[1] A new coronavirus associated with human respiratory disease in China
[2] Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study
[3] A Novel Coronavirus from Patients with Pneumonia in China
[4] Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia
[5] Organ-specific manifestations of COVID-19 infection
[6] Pandemic and the Role of IoT, Drones, AI, Blockchain, and 5G in Managing its Impact
[7] WHO Coronavirus (COVID-19) Dashboard
[8] A hybrid deep transfer learning model with machine learning methods for face mask detection in the era of the COVID-19 pandemic
[9] Artificial Intelligence Facing COVID-19 Pandemic for Decision Support in Algeria
[10] IoT in the Wake of COVID-19: A Survey on Contributions, Challenges and Evolution
[11] Fighting against COVID-19: A novel deep learning model based on YOLO-v2 with ResNet-50 for medical face mask detection
[12] Ranking the effectiveness of worldwide COVID-19 government interventions
[13] Model and Simulation to Reduce Covid-19 New Infectious Cases: A Survey
[14] Sentiment Analysis on Covid19 Vaccines in Indonesia: From The Perspective of Sinovac and Pfizer
[15] What scientists do and don't know about the Oxford-AstraZeneca COVID vaccine
[16] China COVID vaccine reports mixed results - what does that mean for the pandemic?
[17] Testing for SARS-CoV-2 (COVID-19): a systematic review and clinical guide to molecular and serological in-vitro diagnostic assays
[18] Chest X-ray image phase features for improved diagnosis of COVID-19 using convolutional neural network
[19] Deep learning based detection and analysis of COVID-19 on chest X-ray images
[20] A Novel Bayesian Optimization-Based Machine Learning Framework for COVID-19 Detection From Inpatient Facility Data
[21] Pneumonia Classification Using Deep Learning from Chest X-ray Images During COVID-19
[22] A deep transfer learning model with classical data augmentation and CGAN to detect COVID-19 from chest CT radiography digital images
[23] Prediction of COVID-19 Using Genetic Deep Learning Convolutional Neural Network (GDCNN)
[24] Detection of pneumonia infection in lungs from chest X-ray images using deep convolutional neural network and content-based image retrieval techniques
[25] Comparison and Validation of Deep Learning Models for the Diagnosis of Pneumonia
[26] Imagenet classification with deep convolutional neural networks
[27] Very deep convolutional neural network based image classification using small training sample size
[28] Going deeper with convolutions
[29] Deep Residual Learning for Image Recognition
[30] Deep Learning-Based Decision-Tree Classifier for COVID-19 Diagnosis From Chest X-ray Imaging
[31] An efficient mixture of deep and machine learning models for COVID-19 diagnosis in chest X-ray images
[32] PDCOVIDNet: a parallel-dilated convolutional neural network architecture for detecting COVID-19 from chest X-ray images
[33] Within the Lack of Chest COVID-19 X-ray Dataset: A Novel Detection Model Based on GAN and Deep Transfer Learning
[34] Classification of COVID-19 in chest X-ray images using DeTraC deep convolutional neural network
[35] End-To-End Deep Learning Framework for Coronavirus (COVID-19) Detection and Monitoring
[36] Deep-COVID: Predicting COVID-19 from chest X-ray images using deep transfer learning
[37] Deep Learning for The Detection of COVID-19 Using Transfer Learning and Model Integration
[38] A Deep-Learning-Based Framework for Automated Diagnosis Information
[39] Detection of COVID-19 from Chest X-Ray Images Using Convolutional Neural Networks
[40] COVID-19 Deep Learning Prediction Model Using Publicly Available Radiologist-Adjudicated Chest X-Ray Images as Training Data: Preliminary Findings
[41] Survey on deep learning with class imbalance
[42] Learning from imbalanced data: open challenges and future directions
[43] Exploring the effect of image enhancement techniques on COVID-19 detection using chest X-ray images
[44] Can AI Help in Screening Viral and COVID-19 Pneumonia?
[45] Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning
[46] Hyperparameter Optimization for Machine Learning Models Based on Bayesian Optimization
[47] Automatic tuning of hyperparameters using Bayesian optimization
[48] Dropout: A Simple Way to Prevent Neural Networks from Overfitting
[49] On Splitting Training and Validation Set: A Comparative Study of Cross-Validation, Bootstrap and Systematic Sampling for Estimating the Generalization Performance of Supervised Learning
[50] On the Importance of Initialization and Momentum in Deep Learning

• Bayesian optimization-based convolutional neural network model is proposed
• CNN is used to extract and learn deep features
• A Bayesian-based optimizer that is used to tune the CNN hyperparameters
• The proposed algorithm is used for the recognition of chest X-ray images
• The Bayesian search-derived optimal architecture achieved 96% accuracy

We wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome. We confirm that the manuscript has been read and approved by all named authors and that there are no other persons who satisfied the criteria for authorship but are not listed.