key: cord-0717754-16drqz13
title: COVID-19 cough sound symptoms classification from scalogram image representation using deep learning models
authors: Loey, Mohamed; Mirjalili, Seyedali
date: 2021-11-10
journal: Comput Biol Med
DOI: 10.1016/j.compbiomed.2021.105020
sha: 7a626998c2a3c071865f395e51d98d9981dd903e
doc_id: 717754
cord_uid: 16drqz13

Deep Learning shows promising performance in diverse fields and has become an emerging technology in Artificial Intelligence. Recent visual recognition work is based on ranking images and finding artefacts within them. The aim of this research is to classify COVID-19 cough sounds recorded in varied real-life environments. The introduced model consists of two major steps. The first step transforms the sound into an image using the scalogram technique. The second step performs feature extraction and classification based on six deep transfer models (GoogleNet, ResNet18, ResNet50, ResNet101, MobileNetv2, and NasNetmobile). The dataset used contains 1457 wave cough sounds (755 COVID-19 and 702 healthy). The best recognition model reaches an accuracy of 94.9% with the SGDM optimizer, which is promising enough to justify testing on a wider set of labeled cough data to assess the potential for generalization. The outcomes show that ResNet18 is the most stable model for classifying cough sounds from a limited dataset, with a sensitivity of 94.44% and a specificity of 95.37%. Finally, the research is compared with similar analyses, and the proposed model is observed to be more reliable and accurate than current models. The precision achieved is promising enough to test the model's ability for extrapolation and generalization.

The coronavirus disease continues to spread globally, with over 230 million confirmed cases and 4.7 million deaths worldwide as of October 2021 [1]. Coronaviruses are classified into the α, β, δ, and γ subgroups [2]. The COVID-19 outbreak was caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) in 2019 [3], which has been confirmed to belong to the β group of coronaviruses [2]. In 2020, the exponential spread of the disease led the World Health Organization to declare COVID-19 a pandemic. SARS-CoV-2 can spread in a number of ways, especially in polluted and overcrowded environments [4]. Governments and healthcare institutions have introduced new policies to cope with overcrowding by implementing infection control systems [5]. The COVID-19 outbreak has also triggered a drastic increase in international scientific collaboration. Deep Learning (DL) and Machine Learning have constructive impacts in the battle against the coronavirus [6,7]. DL uses datasets to help detect and prevent deadly illnesses. The healthcare industry requires support from emerging technologies, such as Artificial Intelligence (AI) [8], the Internet of Things (IoT) [9], and big data [10], to deter the emergence of new coronavirus diseases [4]. DL is well suited to diagnosing the virus and predicting its propagation. The primary signs of COVID-19 include cough and exhaustion. Coughing is also a consequence of several other illnesses, and the effects of each infection differ; hence, diseases of the lung may alter the acoustics of the cough [11]. The main objective of this research is to identify the observable characteristics of the COVID-19 cough, as demonstrated in Fig. 1.
The proposed model, when provided with a cough recording, identifies the sound as coming from either a sick or a healthy subject. To control the COVID-19 pandemic, wide-scale research is imperative in areas such as chest X-ray and CT scan imaging [2,12] and medical face mask detection [5]. Developing nations are experiencing a shortage of healthcare professionals and of devices such as personal protective equipment. This could cause significant problems in emerging countries compared to industrialized countries, especially if the disease continues to spread at a rate similar to that seen in the West. It is therefore important to develop effective means of early detection and diagnosis to reduce death rates. Cough sounds may play an important role in initial diagnostic tests [13-16]. Sound classification methods employ machine learning and DL. Machine learning classifiers include the support vector machine [17] and the decision tree [18], while DL classifiers include Convolutional Neural Network (CNN) models (AlexNet [19], VGGNet [20], GoogleNet [21], ResNet [22]). The CNN family is designed for high speed and performance in image classification.

The main contributions of this research are as follows: 1) A novel DL model that can identify COVID-19 from cough tones. 2) The proposed model increases sound detection efficiency by introducing the scalogram technique to convert sound to image. 3) Six DL models are trained and compared to obtain optimal efficiency.

The present study is organized as follows. Section 2 reviews the extant literature. Section 3 explains the main features of the dataset. Section 4 describes the proposed COVID-19 cough sound model. Section 5 reports the findings of the tests, and Section 6 presents the conclusions and potential future research.

This section surveys the latest literature on the diagnosis of COVID-19 using cough sounds and on the use of DL for cough sound analysis, and compiles existing knowledge on the use of machine learning and DL in sound classification. A sound classification pipeline may be subdivided into a pre-processing stage, a feature extraction stage, and a classification stage. Most sound detection studies focus on sound construction and sound recognition based on traditional machine learning techniques [23-25]. The present research focuses on the classification and recognition of cough sounds produced by those infected with the COVID-19 virus.

Schuller et al. [26] implemented a CNN-based DL method to classify raw breathing and coughing sounds of COVID-19 patients. They adapted the CNN to use breathing and coughing audio to recognize whether a patient is infected with COVID-19 or healthy. Their approach outperformed the conventional baseline, with the CNN model achieving an accuracy of 80.7% on the currently available data. In [27], Bansal et al. proposed a CNN model for COVID-19 audio recognition based on mel-frequency cepstral coefficients (MFCC). Two methods were discussed to identify audio sounds in this study. The first uses the spectrogram as the input to the MFCC algorithm; the second assesses an image processing pipeline via transfer learning using the VGG16 architecture.
Their model achieved a test accuracy of 70.58% with a sensitivity of 81%. In Ref. [28], the authors proposed a model to distinguish COVID-19 sounds from several types of non-COVID-19 sounds. They used 1838 cough and 3597 non-cough sounds categorized into 50 classes for training and testing (70 COVID-19 and 247 healthy sounds). The study showed that the overall accuracy of the DL-based multiclass classifier was 92.64%. Other studies conducted before the COVID-19 pandemic, such as [29], introduced a transfer learning model to classify cough sound events. The neural network models are constructed in two stages, pretraining and fine-tuning, and the decoded details are then collected by a Hidden Markov Model (HMM). In that research, three cough HMMs and one non-cough HMM were added to the proposed model. The tests were performed on a dataset obtained from 22 patients suffering from multiple respiratory disorders, and the trained deep model reached a precision of 90%. Hee et al. [30] proposed a machine learning classifier for asthmatic and healthy children. The dataset included 1192 cough samples from asthmatic children and 1240 cough samples from healthy children. Features such as MFCC were derived from the audio, and the classifier was built with a Gaussian Mixture Model-Universal Background Model. The study showed that the overall sensitivity and specificity of the machine learning classifier were 82.81% and 84.76%, respectively. In Ref. [31], Amrulloh et al. introduced a classification model for pneumonia and asthma. Their approach quantified the sound features using MFCC, Shannon entropy, and non-Gaussianity, and these characteristics formed the basis for artificial neural network classifiers. The approach reached 89% sensitivity and 100% precision, and the findings demonstrate how it could be used to discriminate between pneumonia and asthma in open settings.

Most of the above studies used mathematical analyses and machine learning to identify COVID-19 infection. Fewer studies utilize transfer learning and CNNs on cough sound datasets to distinguish coronavirus patients from healthy subjects. Therefore, further studies on DL with streamlined efficiency metrics are needed. As per the literature review presented here, cough sounds are a suitable signal for the diagnosis of COVID-19, and the new paradigms tend to be quicker and more successful in combatting the COVID-19 pandemic.

Collecting data from COVID-19 patients is difficult, and a wide range of collection methods is expected. A database of respiratory sounds documented during acute COVID-19 infection is presented in Ref. [32]. The aim of the Coswara project is to establish a diagnostic method for COVID-19 pertaining to respiratory, cough, and speech sounds [33]; that dataset contained accessible deep-cough recordings from 92 COVID-19 positive patients and 1079 healthy subjects. The Sarcos (SARS COVID-19 South Africa) dataset is extremely limited, with only 8 COVID-19 cases and 13 healthy cases. Both Coswara and Sarcos are imbalanced because positive subjects are far outnumbered by non-positive subjects. This study conducted its experiments on CoughDataset, which is based on Coughvid. The dataset is organized into five classes (COVID-19, Healthy, Lower, Upper, and Obstructive) and includes 3325 sound files at 16 kHz, mono, with 1 s duration.
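As a hedged sketch of how the two-class subset might be assembled from such a dataset, the following Python example assumes a hypothetical directory layout with one folder per class containing 1-s, 16 kHz mono WAV files; the folder names and root path are illustrative only and not part of the published dataset description.

```python
# Minimal sketch of selecting the two classes used for training (COVID-19 vs. healthy).
# Directory names and the "CoughDataset" root path are assumptions for illustration only.
from pathlib import Path

import numpy as np
from scipy.io import wavfile

DATA_ROOT = Path("CoughDataset")          # hypothetical root folder
CLASSES = {"covid": 1, "healthy": 0}      # the two classes kept out of the five

def load_two_class_subset(root: Path, classes: dict) -> tuple[np.ndarray, np.ndarray]:
    """Load 1-s, 16 kHz mono WAV files from per-class subfolders."""
    signals, labels = [], []
    for name, label in classes.items():
        for wav_path in sorted((root / name).glob("*.wav")):
            rate, samples = wavfile.read(wav_path)
            assert rate == 16000, f"unexpected sample rate in {wav_path}"
            signals.append(samples.astype(np.float32))
            labels.append(label)
    return np.stack(signals), np.array(labels)

if __name__ == "__main__":
    X, y = load_two_class_subset(DATA_ROOT, CLASSES)
    print(X.shape, np.bincount(y))        # expected roughly (1457, 16000) and [702 755]
```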
The proposed DL model was trained and tested on the COVID-19 (755 sound files) and healthy (702 sound files) classes, as shown in Fig. 2. To evaluate the proposed model on a public dataset, it was built to provide cough classification; this cough classifier is used by the diagnostic engine to assess whether or not a sound is related to COVID-19. To evaluate the classifier, we used the COVID-19 and non-COVID-19 sounds from CoughDataset. CoughDataset is a rich archive with data from a broad range of COVID-19 patients. The set comprises 3325 sounds divided into five main groups (755 COVID-19, 702 healthy, 1032 lower, 186 obstructive, and 650 upper). The current method used 1457 cough sounds for training and evaluation.

The architecture of the proposed DL cough classification model is shown in Fig. 3. The introduced model includes two main components: the first transforms the sound to an image based on the scalogram, while the second performs feature extraction and classification based on the DL models (GoogleNet, ResNet18, ResNet50, ResNet101, MobileNetv2, and NasNetmobile). GoogleNet, ResNet, MobileNet, and NasNet are amongst the most widely used DL transfer learning models [34-36]. The proposed model used these DL models for feature extraction and classification in the training, validation, and testing stages.

A scalogram is the absolute value of the Continuous Wavelet Transform (CWT) coefficients of a wave [37]. This study adopts the scalogram approach in two steps. First, the 1-D sound signals are preprocessed for noise reduction; low-frequency and high-frequency noise is removed by convolution with an averaging filter. Second, 2-D scalograms are computed from the preprocessed signals using the CWT, which transforms the signal from the time domain to the time-frequency domain, as shown in Fig. 4. The CWT uses inner products to measure the resemblance between a wave and an analysis function, much like the Fourier transform. The CWT of a function f(s) at a scale x > 0 and position y is calculated using equation (1):

CWT_f(x, y) = (1/√x) ∫ f(s) ϑ*((s − y)/x) ds    (1)

where ϑ(s) is a continuous function in both the time domain and the frequency domain called the mother wavelet, x is the continuously varying scale parameter, and y is the position parameter. The outcome of the CWT is a matrix of wavelet coefficients indexed by scale and position. The role of the mother wavelet is to generate the family of child wavelets, and the scalogram is thus determined by the scale parameter and the mother wavelet in the CWT [38,39].

Several efficient pre-trained CNNs are available for transfer learning; however, they require the dataset to be adapted to their input layer, and various combinations and methods are applied to build the networks. In 2014, a new CNN model was proposed by Szegedy et al. [21] at Google. The GoogleNet object classification network is composed of a feature extraction network and a classification network, as shown in Fig. 5, and it contains 22 convolutional layers [34,35]. GoogLeNet has inception layers, each of which conducts a particular set of convolutions and then concatenates the filter outputs for the next layer [40]. Nine inception modules are stacked in sequence, and their outputs feed into a global average pooling layer. Many visual recognition tasks have also gained significantly from such deep models.
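To illustrate the sound-to-image step described above, the following is a minimal sketch of generating a scalogram image from a 1-s, 16 kHz cough recording. It assumes the PyWavelets, SciPy, and Matplotlib libraries and the Morlet ('morl') wavelet; the averaging-filter length, scale range, colormap, and file names are illustrative assumptions rather than the authors' exact settings (the paper's implementation uses MATLAB/TensorFlow).

```python
# Minimal sketch: convert a 1-s, 16 kHz cough recording into a scalogram image
# using the continuous wavelet transform (CWT). Wavelet choice, scale range,
# and output size are illustrative assumptions, not the authors' exact settings.
import numpy as np
import pywt
import matplotlib.pyplot as plt
from scipy.io import wavfile

def save_scalogram(wav_path: str, png_path: str, num_scales: int = 128) -> None:
    rate, x = wavfile.read(wav_path)
    x = x.astype(np.float32)
    x = np.convolve(x, np.ones(5) / 5, mode="same")      # simple averaging filter for denoising
    scales = np.arange(1, num_scales + 1)
    coefs, _freqs = pywt.cwt(x, scales, "morl", sampling_period=1.0 / rate)
    scalogram = np.abs(coefs)                             # scalogram = |CWT coefficients|
    plt.figure(figsize=(2.24, 2.24), dpi=100)             # roughly 224x224 px for the CNN input
    plt.axis("off")
    plt.imshow(scalogram, aspect="auto", origin="lower", cmap="jet")
    plt.savefig(png_path, bbox_inches="tight", pad_inches=0)
    plt.close()

save_scalogram("cough_0001.wav", "cough_0001.png")        # hypothetical file names
```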
Over the years, deeper networks have been used to tackle increasingly difficult problems and to improve accuracy. However, training a neural network becomes complicated, and performance often degrades once accuracy begins to saturate. The Residual Network, also referred to as ResNet [22], and residual learning in general are structured to resolve many of these problems. A deep ResNet learns low-, mid-, and high-level features, with each part of the network trained to retrieve small fragments of knowledge. The idea of a 'residual' can be understood as learning only the difference with respect to the features passed on from the previous layer. ResNet was inspired by the VGGNet architecture and has several variants, such as ResNet18/50/101. ResNet18 has 18 convolution layers with 3 × 3 filters, as shown in Fig. 6. ResNet50 and ResNet101 have 50 and 101 layers, respectively, and each bottleneck block has 3 convolution layers with filter sizes of 1 × 1, 3 × 3, and 1 × 1; each 2-layer block is replaced with this 3-layer bottleneck block, as illustrated in Fig. 7.

Data augmentation is a methodology that diversifies databases to enhance recognition training and can improve classification efficiency [34,41,42]; in this study, however, the images were not changed at all during training. MobileNetV2 and NasNetMobile are DL models designed for mobile devices. MobileNetv2 comprises a total of 155 layers and 164 connections [43,44] and is based on a mobile architecture built from depth-wise separable convolutions. A Global Average Pooling layer [45] is applied, which helps reduce estimation error and overfitting, and the training weights are standardized using a batch normalization layer [46]. NasNet is a modular CNN that consists of simple building blocks (cells) optimized using reinforcement learning [47]. A cell is made up of just a few operations that are replicated repeatedly to reach the required network size. The mobile edition (NasNetMobile) comprises 12 cells with 5.3 million parameters and 564 million multiply-accumulates (MACs).

A scalogram converts a time-domain signal into a time-frequency representation and allows the signal to be analyzed at multiple resolutions. Nevertheless, the representation retains the morphological complexity of the signal, which means that basic machine learning classifiers can perform poorly on such complex signals. We therefore feed the scalogram images to CNNs, which exhibit optimal efficiency in recognizing visual morphology. To our knowledge, no attempt has been made to compare the outputs of DL models fed with the 2-D matrix produced by the wavelet transform. Thus, the present study concentrated on the most representative DL models (GoogleNet, ResNet18, ResNet50, ResNet101, MobileNetv2, and NasNetmobile), which are the most commonly used for image classification. The use of the scalogram to depict signal characteristics and its capacity to distinguish signals biometrically are the novelties of this study.

As described in Section 5, the proposed model was tested with the input signal in the form of an image, as expected by the DL models. The introduced DL models were trained in transfer mode with the suggested initial setup (batch normalization decay = 0.5, batch normalization epsilon = 1e-3, dropout = 0.5, weight decay = 1e-3). The learning rate started at 0.01 with a batch size of 8 and was automatically reduced until it reached 1e-5. This decreased the training time without lowering the efficiency levels.
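As a hedged illustration of this training setup, the sketch below builds one of the six backbones with a small classification head and the SGDM optimizer in Keras. ResNet50 is used because tf.keras.applications does not provide ResNet18 or GoogleNet; the input size, the added head, and the momentum value of 0.9 are assumptions, while the dropout of 0.5 and learning rate of 0.01 follow the values quoted above.

```python
# Minimal transfer learning sketch with one of the six backbones (ResNet50 here,
# since tf.keras.applications does not ship ResNet18 or GoogleNet).
# Image size, head design, and momentum value are illustrative assumptions.
import tensorflow as tf

def build_cough_classifier(input_shape=(224, 224, 3), num_classes=2) -> tf.keras.Model:
    base = tf.keras.applications.ResNet50(
        weights="imagenet", include_top=False, input_shape=input_shape
    )
    base.trainable = True                        # fine-tune the whole backbone
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dropout(0.5),            # dropout value quoted in the text
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),  # SGDM
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

model = build_cough_classifier()
model.summary()
```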
The DL models were trained for 20 h on a single NVIDIA RTX 2070 with CUDA and the CUDA Deep Neural Network library (cuDNN), using TensorFlow and MATLAB. The dataset was split into 70% training, 15% validation, and 15% testing images, and both the training and held-out evaluation data were labeled. The validation accuracy is a classification score used to check the learning process and enables overfitting to be identified: if the validation and training accuracies diverge, overfitting has occurred. The reliability of the test depends on what is learned from the training data. The split ratio depends on the volume of the dataset, and an effective balance must be struck between training and testing to ensure the highest model efficiency. Furthermore, there is no straightforward answer as to which procedure or parameter setting performs best; it was observed that model performance improved as more samples were used [48]. Stochastic Gradient Descent with momentum (SGDM) [49] was chosen as the optimizer in the current study to improve classifier performance.

The setup of the DL models is shown in Table 1, with an initial learning rate of 0.01 and 20 epochs for each DL transfer model. The batch size was set to 8, and early stopping was applied if no change in accuracy was observed. The best optimizer strategy was found to be SGDM, which updates the weight parameters using gradient momentum. To prevent over-fitting of the DL networks, we employed the dropout approach [50]. The loss function L(x, k, k*) was used as the training criterion and is defined as the sum of the binary classification and box regression loss functions, as given in equation (2):

L(x, k, k*) = L_cl(x, c) + δ[b > 0] L_re(k, k*)    (2)

where the bounding boxes k and k* are denoted by [k_a, k_b, k_w, k_h], w is the width and h is the height of the box, x_c represents the predicted score for class c, and δ[b > 0] is an indicator that equals 0 for background boxes. The classification loss L_cl is represented in equation (3):

L_cl(x, c) = −log(x_c)    (3)

The regression loss L_re is calculated as given in equations (4) and (5):

L_re(k, k*) = Σ_{i ∈ {a, b, w, h}} smooth_L1(k_i − k_i*)    (4)

smooth_L1(z) = 0.5 z² if |z| < 1, and |z| − 0.5 otherwise    (5)

Testing demonstrates the reliability of the DL models. The confusion matrix is a technique to calculate the statistical performance of the study. The six statistical measurements are accuracy, sensitivity, specificity, precision, F1 score, and the Matthews Correlation Coefficient (MCC). The confusion matrices for the two groups (COVID-19 and Healthy) are shown in Fig. 8 and Fig. 9. Accuracy measures how close the predictions are to the truth: Accuracy = (N_TP + N_TN) / (N_TP + N_FP + N_TN + N_FN), where N_TP is the number of correctly labeled instances of the target class, N_FN is the number of mislabeled instances of the target class, N_TN is the number of correctly labeled instances of the remaining class, and N_FP is the number of mislabeled instances of the remaining class. For the GoogleNet model, the confusion matrix of the test set is shown in Fig. 8(a), and the overall accuracy is 90.7%. The performance of the three ResNet models (ResNet18/50/101) is illustrated in Fig. 8(b,c,d), with overall accuracies of 94.9%, 90.7%, and 91.7%, respectively; ResNet18 achieved the highest accuracy on this small dataset. In Fig. 9, the DL mobile models show an overall accuracy of 88.9% and 89.9% for MobileNetv2 and NasNetMobile, respectively.
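To make the training procedure concrete, the following minimal sketch illustrates the 70%/15%/15% split, the quoted batch size and epoch count, and early stopping described at the start of this section. It assumes scalogram images `X` and labels `y` as NumPy arrays and the `model` built in the previous sketch; the stratified splitting, monitored metric, and patience setting are illustrative assumptions rather than the authors' exact configuration.

```python
# Sketch of the 70% / 15% / 15% split and early stopping described in the text.
# Assumes scalogram images `X` (N, 224, 224, 3) and labels `y` (N,) are already
# prepared, and `model` was built as in the previous sketch.
import tensorflow as tf
from sklearn.model_selection import train_test_split

def split_70_15_15(X, y, seed=0):
    X_train, X_rest, y_train, y_rest = train_test_split(
        X, y, test_size=0.30, stratify=y, random_state=seed
    )
    X_val, X_test, y_val, y_test = train_test_split(
        X_rest, y_rest, test_size=0.50, stratify=y_rest, random_state=seed
    )
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)

train, val, test = split_70_15_15(X, y)
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_accuracy", patience=3, restore_best_weights=True
)
model.fit(
    train[0], train[1],
    validation_data=val,
    epochs=20, batch_size=8,               # values quoted in Table 1
    callbacks=[early_stop],
)
```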
The accuracy of the DL models' predictions was measured quantitatively. Two widely employed classification performance indicators are sensitivity and specificity: Sensitivity = N_TP / (N_TP + N_FN) and Specificity = N_TN / (N_TN + N_FP). Fig. 10 presents the sensitivity and specificity for the six DL models. The highest sensitivity of 97.22% is achieved by ResNet101, which reflects the test's ability to recognize the cough sounds of patients with COVID-19. The high specificity of 95.37% for ResNet18 indicates that the test is able to recognize patients who do not have COVID-19. Fig. 11 presents the precision, F1 score, and MCC for the six DL models. The highest precision of 95.33% is achieved by ResNet18, which suggests that this model returns more pertinent outcomes than the others. The high F1 score of 94.88% for ResNet18 confirms the efficiency of this DL model. Finally, the MCC provides a more balanced statistical measure because it accounts for all four groups of the confusion matrix; the best MCC, 89.82%, is achieved by ResNet18. Further data collection is required to make the deep transfer models more reliable. Despite its positive precision rates, the proposed research needs to be repeated on a wider scale, since it may also be applied to other medical applications.

The outcomes of the introduced model, obtained by applying the DL models to CoughDataset with scalogram images of COVID-19 patients and healthy individuals, are shown in Fig. 12 (representative classifier outcomes of the DL models). Fig. 12 illustrates how our proposed model can classify the data with high accuracy. Furthermore, the use of the scalogram with DL models to depict signal characteristics and its capacity to distinguish signals biometrically are the novelties of the present study. Most related studies focus on the classification of cough sounds based on machine learning. A performance comparison of the different methods in terms of accuracy (AC) is presented in Table 2. Related studies presented in Refs. [26,27] used small datasets that included real COVID-19 cough sounds, and much of the above work is based on the classification of cough and non-cough sounds. By analyzing the performance of the DL transfer models in handling COVID-19 cough sounds with SGDM, we find that the performance of all the DL models increases sharply for cough signals with strong frequency content. Although our recognition model performs the best, its accuracy reaches only 94.9%, which depends on the SGDM optimizer, the accuracy of the training data, and the effort made to check the labelling of the data. Any flaw in the labelling of the data that slipped past our scrutiny is likely to affect the recorded results, and this effect is more pronounced when the amount of data is comparatively small.

The observations of the present study and those from the other studies listed in the Related Works section indicate that distinct latent features of cough sounds can be used for effective DL diagnosis of various respiratory disorders. The cough sound may be used as a preliminary diagnostic instrument since it differentiates healthy coughs from COVID-19 coughs. We investigated the possibility of using the scalogram of the sound as input to DL models to determine which model performs best at classifying these medical sound images.
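For reference, the six reported measures can be computed directly from the binary confusion matrix counts defined above. The sketch below is a minimal NumPy illustration; the example counts passed at the end are hypothetical and do not correspond to any figure in the paper.

```python
# Minimal sketch of the six reported measures computed from a binary confusion
# matrix (COVID-19 vs. healthy). The example counts below are hypothetical.
import numpy as np

def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)                       # recall for the COVID-19 class
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    mcc = (tp * tn - fp * fn) / np.sqrt(
        float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    )
    return dict(accuracy=accuracy, sensitivity=sensitivity, specificity=specificity,
                precision=precision, f1=f1, mcc=mcc)

print(binary_metrics(tp=102, fn=6, tn=103, fp=5))      # hypothetical test counts
```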
In this research, ResNet and GoogleNet, the deeper DL transfer models, are found to have high accuracy, while NasNetMobile and MobileNetv2 achieve high accuracy among the mobile models. The tests were conducted on a single dataset of audio wave files. ResNet18 was 4.17%, 4.17%, and 3.24% more accurate than GoogleNet, ResNet50, and ResNet101, respectively, and NasNetMobile had 0.92% higher accuracy than MobileNetv2. These experiments were used to determine the efficiency and consistency of the proposed classification model. The findings reveal that the maximum classification accuracy on the examined COVID-19 cough sound dataset is obtained by the ResNet18 model. The experimental findings demonstrate that this DL classifier fits the cough sounds of COVID-19 patients better than the other CNN classifiers. Therefore, it can support diagnosis by sparing doctors the intense workload involved in the initial screening of the COVID-19 cough.

The present study has introduced novel DL models for cough sound classification focused on tone, which could help curb the spread of COVID-19. The proposed model integrates two main components. The first transforms the sound wave into an image, which was implemented using the scalogram. The second generates universal features and performs classification using deep transfer models (GoogleNet, ResNet18, ResNet50, ResNet101, MobileNetv2, and NasNetmobile). The CoughDataset used includes 1457 wave cough sounds: 755 COVID-19 recordings and 702 recordings from healthy individuals. While our recognition model is the most reliable, its accuracy is still 94.9%, and this precision is promising enough to test the model for extrapolation and generalization. The results demonstrated that ResNet18 is the model most effective at classifying the cough sounds relative to the other models evaluated on this small dataset. This study has been compared with other studies on COVID-19 cough sounds, and the proposed model was found to be more predictive than the established classifiers considered. The outcomes of the current study signal important contributions for future work in machine learning and DL; in future work, our results can be compared with another common type of time-frequency representation, the spectrogram. Despite its positive precision rates, the proposed research needs to be repeated on a wider scale in order to be employed in other medical applications.

This research received no external funding. On behalf of all authors, the corresponding author states that there is no conflict of interest.
WHO Coronavirus (COVID-19) Dashboard
Within the lack of chest COVID-19 X-ray dataset: a novel detection model based on GAN and deep transfer learning
A Study of the Neutrosophic Set Significance on Deep Transfer Learning Models: an Experimental Case on a Limited COVID-19 Chest X-Ray Dataset
A hybrid deep transfer learning model with machine learning methods for face mask detection in the era of the COVID-19 pandemic
Fighting against COVID-19: a novel deep learning model based on YOLO-v2 with ResNet-50 for medical face mask detection
Deep learning and medical image processing for coronavirus (COVID-19) pandemic: a survey
A survey on deep transfer learning to edge computing for mitigating the COVID-19 pandemic
Artificial intelligence applications in the development of autonomous vehicles: a survey
2019 IEEE Pune Section International Conference (PuneCon)
Survey on Big Data Analytics in Health Care
COVID-19 Cough Classification Using Machine Learning and Global Smartphone Recordings
A Deep Transfer Learning Model with Classical Data Augmentation and CGAN to Detect COVID-19 from Chest CT Radiography Digital Images
A novel method for wet/dry cough classification in pediatric population
A comprehensive approach for classification of the cough type
Classification of human cough signals using spectro-temporal Gabor filterbank features
A real time cough monitor for classification of various pulmonary diseases
Pre-processing and classification of cough sounds in noisy environment using SVM
Analysis of cough detection index based on decision tree and support vector machine
Imagenet classification with deep convolutional neural networks
Very deep convolutional neural network based image classification using small training sample size
Going deeper with convolutions
Deep residual learning for image recognition
A cough-based algorithm for automatic diagnosis of pertussis
Covid symptom severity using decision tree
Automatic cough detection based on airflow signals for portable spirometry system
Detecting COVID-19 from Breathing and Coughing Sounds Using Deep Neural Networks
Cough Classification for COVID-19 based on audio mfcc features using Convolutional Neural Networks
AI4COVID-19: AI enabled preliminary diagnosis for COVID-19 from cough samples via an app
Cough event classification by pretrained deep neural network
Development of machine learning for asthmatic and healthy voluntary cough sounds: a proof of concept study
Cough Sound Analysis for Pneumonia and Asthma Classification in Pediatric Population
Novel coronavirus cough database: NoCoCoDa
Coswara - A Database of Breathing, Cough, and Voice Sounds for COVID-19 Diagnosis
Deep learning in plant diseases detection for agricultural crops: a survey
A survey on blood image diseases detection using deep learning
A Survey of Deep Learning: Platforms, Applications and Emerging Research Trends
Intelligent deep models based on scalograms of electrocardiogram signals for biometrics
ECG classification using wavelet packet entropy and random forests
A comparative study of DWT, CWT and DCT transformations in ECG arrhythmias classification
An analysis of convolutional neural networks for image classification
Deep transfer learning in diagnosing leukemia in blood cells
Breast and colon cancer classification from gene expression profiles using data mining techniques
A novel comparative study for detection of Covid-19 on CT lung images using texture analysis, machine learning, and deep learning methods
Extracting possibly representative COVID-19 biomarkers from X-ray images with deep learning approach and image data related to pulmonary diseases
Inception-v4, inception-ResNet and the impact of residual connections on learning
How does batch normalization help optimization?
Learning Transferable Architectures for Scalable Image Recognition
On splitting training and validation set: a comparative study of cross-validation, bootstrap and systematic sampling for estimating the generalization performance of supervised learning
On the importance of initialization and momentum in deep learning
Dropout: a simple way to prevent neural networks from overfitting