key: cord-0484521-1wp0gjz5 authors: Yadav, Samir S.; Bendre, Mininath R.; Vikhe, Pratap S.; Jadhav, Shivajirao M. title: Analysis of deep machine learning algorithms in COVID-19 disease diagnosis date: 2020-08-25 journal: nan DOI: nan sha: 69c8c71b5986efa1cdd8deac37af4e49a7937f1f doc_id: 484521 cord_uid: 1wp0gjz5

The aim of this work is to apply deep neural network models to an image recognition problem. People everywhere are currently threatened by coronavirus disease, also called COVID-19, and its spread has affected the economies of many countries. Finding COVID-19 patients early is essential to limit the spread and the harm to society. Pathological tests and computed tomography (CT) scans are helpful for diagnosing COVID-19. However, these tests have drawbacks, such as a large number of false positives and high cost. An easy, accurate and less expensive way of detecting COVID-19 is therefore needed. Chest X-rays can be useful for detecting this disease. Therefore, in this work, chest X-ray images are used to diagnose suspected COVID-19 patients using modern machine learning techniques. The results are analyzed and conclusions are drawn about the effectiveness of deep machine learning algorithms in image recognition problems.

The new COVID-19 disease, caused by a new strain of coronavirus, was first detected in Wuhan, China, in December 2019 [1]. The disease subsequently spread throughout China and around the world [2, 3]. A healthy person can become infected through close contact with an infected person. Reported signs and symptoms include fever, fatigue, dry cough, shortness of breath, and respiratory failure. Most patients have mild symptoms and a good prognosis. Deaths occur most often among the elderly and people with underlying medical conditions such as cardiovascular disease and diabetes [1].
As of August 15, 2020, 2,589,208 cases and 63,896 deaths had been reported in India [4]. The Indian government has used many social isolation and quarantine measures to reduce the spread of the disease in the community. Typically, people who have been in direct contact with the sick are traced, isolated and monitored for 14 days. From 24 March 2020, outbreak areas such as Mumbai and Pune have been completely sealed off. Schools and businesses have been closed to prevent the spread of the disease. People from all countries are not being issued visas to enter India. Many domestic and international flights were also canceled because of the epidemic. Testing for COVID-19 is a challenge for all, particularly the developed countries, because testing devices and testing kits are quite limited and not accessible worldwide [5]. Several investigators and research institutions are currently researching COVID-19 diagnosis [6, 7, 8, 9, 10]. Screening on early COVID-19 signs alone is not reliable, because in some instances patients show the signs but are not diagnosed with COVID-19 when verified by clinical examination or CT scan. Pathological examination is one of the important methods of diagnosing COVID-19 cases. Nonetheless, this procedure has drawbacks: it must be performed in clinical labs that are located only in city centers, and results take time to obtain. As a result, positive patients cannot be isolated early and can infect more people during the crucial period of unrestricted movement [11, 12]. A CT scan is also one of the popular methods for diagnosing COVID-19 [13]. However, a problem with this radiological imaging is the overlap with other diseases. When a COVID-19 patient also has another lung disease such as pneumonia, it is difficult for a medical professional or radiologist to distinguish between the similar-looking CT images [14].
Also, major disadvantages of COVID-19 diagnosis using a CT scan are its high radiation dose and its high cost, which make the procedure hard for ordinary people to access [15]. Traditional radiography, or chest X-ray (CXR) imaging, can overcome the cost problems of CT scans and pathological tests because CXR is less costly and has minimal harmful consequences. CXR is also capable of identifying various lung diseases early [16]. CXR imaging is a non-invasive procedure that takes 2-3 minutes to capture the image, and results can be obtained within thirty minutes. Modern radiographic machines are affordable for average-income or underdeveloped countries [17]. Hence, in this research work, CXR is used to identify the deadly COVID-19 disease.

Currently, owing to the rapid development of digital technologies, automated and robotic systems have spread to many areas of industry, science and everyday life. As a consequence, there is an increasing need for efficient processing of information presented, in particular, in video and image formats [18]. Images have become closely integrated into human life, and many automated systems therefore use them as the main source of information [19]. Finding, localizing, classifying and analyzing objects in an image by computer is a complex task in computer vision. Computer vision is a set of software and technical solutions in the field of artificial intelligence (AI) aimed at reading and extracting information from images in real time and without human intervention [20]. When analysing information received from the eyes, the human brain does a tremendous amount of work. A person can easily describe what is happening in a randomly chosen photograph. Images can carry a tremendous amount of detail and differ in many parameters, such as resolution, color, quality, brightness and noise.
Objects in images can also vary in many ways: scale, position, color, rotation, tilt, etc. However, in digital format, each image is just an array of numerical data. Teaching a computer to find and classify objects in an image while taking all of these factors into account is a very complex algorithmic task. Machine learning technologies are actively used to solve it. A person receives a large amount of information through sight, and images are capable of storing a huge amount of it. As a consequence, using images in computer systems increases the performance of these systems. However, such technologies require complex calculations. The challenge in computer vision is to develop efficient algorithms that extract and analyze data from images or videos. At present, such technologies are used to solve complex problems such as:
• Optical character recognition (OCR): converting text in an image to editable text.
• Photogrammetry: creating a three-dimensional model of an object from photographs taken from different angles.
• Motion capture: a technology widely used in the film industry that transforms the movements of real people into computer animation.
• Augmented reality (AR): projecting virtual objects onto an image of a real environment in real time.
• Medical diagnostics: early detection of cancer cells, improving the quality of MRI images, their analysis, etc.

Recently, many scientists have used machine learning algorithms to diagnose different diseases [21, 22, 23, 24, 25, 26]. In this work, deep machine learning algorithms are used to solve image recognition problems, and neural networks are designed and trained to diagnose COVID-19 from chest X-rays. The remainder of this paper is organized as follows. Section 2 gives basic information about machine learning techniques.
Section 2.5 describes the architecture of a convolutional neural network and the modern models based on it. In Section 3, modern convolutional neural network models are trained to solve the problem of diagnosing pneumonia and COVID-19 from X-ray images. Finally, conclusions are drawn in Section 5.

2 Technical Background

Machine learning (ML) is a branch of AI research based on methods for developing systems capable of learning. ML algorithms are effective in tasks that require determining common features from training data and using them to identify new data. Artificial neural networks are often used in the design of such learning systems. An artificial neural network (ANN) is a computer model based on the principles of a biological neural network: a set of interconnected nerve cells (neurons). Each neuron has a set of input connections, called synapses, through which it receives information in the form of impulses from other neurons. Based on the data received, the neuron forms its state and, via the axon, communicates it to other neurons, ensuring the functioning of the system. As the system forms, some neural connections are strengthened while others are weakened, which is how the network learns.

[Figure: artificial neuron with inputs x_0, x_1, ...]

The weighted sum of the inputs is a linear combination, which means that without a nonlinearity, regardless of the number of layers, the values of the output layer would be only a linear function of the inputs to the first layer. The activation function of the neuron normalizes the calculated sum and makes the neural network nonlinear. Many neural network models also require the activation function to be monotonic and continuously differentiable over its entire domain.

[Figure: neural network with a hidden layer and an output layer]

We should also mention the Softmax function.
This function is often used on the last layer of deep neural networks in classification problems. Let the last layer of the network contain N neurons, each of which corresponds to a certain class. Then the output of the i-th neuron is calculated by the formula:

$$\sigma(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{N} e^{z_j}},$$

where $z_i$ is the weighted input of the i-th neuron. Thus, the output of each neuron lies in the range [0, 1], and the outputs sum to 1. As a result, the network gives the probabilities that the input data belong to each of the given classes.

Training a neural network means selecting the values of the connection weights so that the task is solved effectively. Initially, the weights are set randomly. Then, as the training data are run through the network, the weights are adjusted so that the network eventually gives the correct answers. The learning process is cyclical. During one iteration, a batch containing a number of elements from the input data is fed to the network. A single pass of the entire training set through the network is called an epoch.

To control the learning process, the work of the network must be evaluated. For this, a loss function (cost function) is introduced, which calculates the difference between the correct and obtained results and produces a numerical value characterizing the magnitude of the network's error. The task of training the network is thus reduced to finding the global minimum of this function. Table 2 contains the most frequently used loss functions, where $y_i$ is the expected value of the i-th neuron, $x_i$ is the obtained value of the i-th neuron, and n is the number of output neurons.

One of the popular methods for training deep neural networks is the backpropagation algorithm. Let the network have L layers, and let $a^l$, $w^l$, $b^l$ be the vectors of values, weights and biases of the neurons on the l-th layer. There are also N training pairs (x, y). In the learning process, the following steps are repeated in cycles:

1. A vector x from the training set is fed to the network input, and for each layer the values $z^l = w^l a^{l-1} + b^l$ and $a^l = \sigma(z^l)$ are calculated, where $\sigma$ is the activation function.
2. Calculate the value of the cost function $C$ from the network output $a^L$ and the expected answer y.
3. Calculate the error values of the output layer: $\delta^L_j = \frac{\partial C}{\partial a^L_j} \sigma'(z^L_j)$.
4. Calculate the errors for each previous layer: $\delta^l_j = \sum_k w^{l+1}_{kj} \delta^{l+1}_k \sigma'(z^l_j)$.
5. Calculate the gradient of the cost function: $\frac{\partial C}{\partial w^l_{jk}} = a^{l-1}_k \delta^l_j$.
6. Update the connection weights: $w^l_{jk} \leftarrow w^l_{jk} - \eta \frac{\partial C}{\partial w^l_{jk}}$, where $\eta$ is the learning rate.

In addition to the method described above, other algorithms are often used for training, for example RMSprop and Adam. These methods belong to the supervised learning algorithms, the most common type of learning, in which the network learns from pre-labeled data where the correct answers are already known. There are other approaches to training neural networks:

Reinforcement learning: a method that assumes the presence of some environment in which the network operates. Such an environment reacts to the actions of the model and gives it certain signals.

Unsupervised learning: learning in which the network does not have the correct answers in advance and independently searches for common and distinctive features of the input data.

Genetic algorithms: algorithms that mimic the evolutionary mechanisms of a biological population and act as an alternative to the error backpropagation algorithm. The value of an arbitrary weight in a neural network is called a gene. Genes form chromosomes, and chromosomes form a population. Within one epoch, the following occur with certain probabilities:
• Crossover: the formation of a new chromosome from the genes of two others.
• Mutation: a random change of an arbitrary gene.
• Adaptation: the chromosomes showing the worst results are eliminated from the population.

In learning algorithms based on the backpropagation method, the error value depends on the derivative of the activation function, so when the sigmoid activation function is used, the error value decreases very quickly as it propagates from the last layer to the first, and the weights in the early layers are poorly corrected (the vanishing gradient problem). Similarly, the exploding gradient problem can occur when the error value becomes very large. A simple way to mitigate this problem is to use the ReLU function, whose derivative takes the value either 0 or 1. Preprocessing the input data can also help: it is often recommended to limit the input data to the range [0; 1]. For example, pixel values in images range from 0 to 255, and for better learning stability they can be divided by 255. As another method, values can be centered: the mean value of the input image is calculated and subtracted from each pixel, so that the mean of the data in the image equals 0. Together with this method, the standard deviation is often normalized to 1.

Overfitting (network overtraining) is a problem in which the network learns to analyze objects from the training set well but does not work well with new data. One method for solving this problem is Dropout, which works as follows: at each training iteration, neurons are turned off with some probability. The remaining neurons are trained by the error backpropagation method, after which the switched-off neurons return to the network.

Most modern neural networks for image analysis are based on the convolutional neural network architecture. Early neural networks consisted of fully connected layers, in which each neuron is connected to every neuron of the next layer, which significantly increased the computational complexity of the system as the number of neurons grew. Typical convolutional neural networks primarily use convolutional layers.
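As a rough illustration of the input preprocessing described above (scaling pixel values to [0; 1], and centering the mean at 0 with the standard deviation normalized to 1), the following minimal sketch operates on a flat list of pixel values. The function names `scale_pixels` and `standardize_pixels` are hypothetical, and returning zeros for a constant image is an assumption made here, not something the text specifies.

```python
import statistics


def scale_pixels(pixels, max_value=255):
    """Map raw pixel values from [0, max_value] into the range [0, 1]."""
    return [p / max_value for p in pixels]


def standardize_pixels(pixels):
    """Center pixels at mean 0 and normalize their standard deviation to 1."""
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels)  # population standard deviation of this image
    if std == 0:
        # Assumption: a constant image has no contrast, so return all zeros.
        return [0.0 for _ in pixels]
    return [(p - mean) / std for p in pixels]
```

Both functions work per image; in practice the same idea would be applied to whole arrays, but plain lists keep the sketch free of external dependencies.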
Convolutional layers are characterized by the use of weight matrices, called filters or kernels, that are smaller than the original data. Such a kernel moves across the input data (I) with a certain step (stride) and calculates the sums of the products of the corresponding cell values and weights, forming a feature map (I * K). One convolutional layer can contain several kernels and, accordingly, several feature maps. Once the features have been detected, the granularity of the data can be reduced to simplify further calculations. This is done by a downsampling (pooling) layer, which reduces the dimension of the input feature maps: from several neighboring neurons, the maximum or average value is taken, forming a neuron of a lower-dimensional feature map. This reduces the number of parameters used in further network calculations. A convolutional neural network can have multiple pairs of alternating convolutional and downsampling layers. Taking images as an example, on the initial layers the network finds simple features such as borders and corners. Then more and more complicated structures are detected deeper in the network: from simple shapes to whole categories, irrespective of where they are located. The network ends with standard fully connected layers, which map the resulting features to classes.

The VGG architecture was proposed in 2014 by [27]. The main feature of the network is the use of consecutive convolutional layers with 3x3 filters instead of the previously used convolutional layers with large 5x5, 7x7 and 11x11 filters. This made it possible to reduce the number of network parameters while maintaining efficiency.

The Inception model [28], developed by Google, took 1st place in 2014 in ILSVRC, the annual image classification competition.
A key innovation of this network was the use of nested modules as layers, each of which is a set of filters of different dimensions whose results are subsequently merged.

[Figure: Inception module with Conv 1x1, Conv 3x3 and Maxpool 3x3 branches whose filters are combined]

Also, Inception completely abandoned fully connected layers; instead, global average pooling is used, which converts each feature map to one number, forming a vector of averaged values. This innovation made it possible to significantly reduce the number of parameters and, as a consequence, the computational complexity of the network. Later, improved versions of Inception were developed in which the 5x5 layer was replaced with two successive 3x3 layers, and all layers with N × N filters were replaced with a stack of 1 × N and N × 1 filters, which also reduced the number of parameters.

ResNet [29], also known as the residual neural network, won the ILSVRC in 2015. Its distinguishing feature is skip connections that pass information unchanged to deeper parts of the network; this information is summed with the value calculated by the skipped layers and passed on. Fig. 8 shows the building block of such a network.

DenseNet [30] is a dense convolutional network, similar to ResNet, but with the difference that all blocks of the network are connected by direct connections, so each block receives information from all previous ones.

All of these models were tested on the ImageNet dataset, which includes 1000 classes, over 1.

the COVID-19 class, hence the confidence interval (CI) of all the performance metrics was calculated using the following formula 1: where z is the number of standard deviations, also called the CI level, acc is the performance metric being evaluated, and N is the number of samples for the class used. We have used a 95% CI and standard deviation as 1.
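A minimal sketch of a confidence interval calculation with the symbols z, acc and N is given here, assuming the common normal-approximation form $acc \pm z\sqrt{acc(1-acc)/N}$; whether this is exactly the authors' formula 1 is an assumption. The function names are hypothetical, and the default z = 1.96 is the conventional value for a 95% level rather than a value confirmed by the text.

```python
import math


def confidence_half_width(acc, n, z=1.96):
    """Half-width of a normal-approximation confidence interval for a proportion.

    Assumption: interval = acc +/- z * sqrt(acc * (1 - acc) / n).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= acc <= 1.0:
        raise ValueError("acc must lie in [0, 1]")
    return z * math.sqrt(acc * (1.0 - acc) / n)


def confidence_interval(acc, n, z=1.96):
    """Return (lower, upper) bounds, clipped to the valid range [0, 1]."""
    half = confidence_half_width(acc, n, z)
    return max(0.0, acc - half), min(1.0, acc + half)
```

Under this form the interval narrows as N grows, which is why a class with few samples yields wide intervals even when the metric itself is high.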
As can be seen in the table, VGG networks have the largest number of parameters, significantly surpassing the other models, while demonstrating the lowest accuracy. DenseNet shows the best ratio of accuracy to parameter count. However, Inception V3 has slightly more parameters and slightly higher accuracy. The best result was shown by ResNet-152 V2, but it has a significantly larger number of parameters, which affects the network training time.

3 Methodology

COVID-19 is a severe respiratory infection caused by the SARS-CoV-2 coronavirus. At the time of writing, the outbreak of the virus that originated in the Chinese city of Wuhan has escalated into a global pandemic. The main method for diagnosing COVID-19 is reverse transcription polymerase chain reaction (RT-PCR), but this is a laborious and labor-intensive process. An alternative is radiological imaging, such as a chest X-ray or computed tomography (CT). Research shows that most images of COVID-19 patients contain specific abnormalities, such as bilateral reticular nodular opacities or ground-glass opacities, that visually distinguish the disease from other types of pneumonia [31]. Although X-rays are less sensitive than CT or PCR, they are a much more affordable and quicker diagnostic method, which is an essential criterion during a pandemic, as discussed in the introduction. Solving the problem of automatic diagnosis of this disease will reduce the burden on doctors and increase the efficiency of their work.

Figure 9: Chest X-rays: a. Normal, b. Pneumonia, c. COVID-19

cuDNN is a deep neural network library from Nvidia that allows GPU power to be used for calculations. Python 3 is a flexible and powerful programming language well suited to data analysis and processing tasks. TensorFlow is a feature-rich open-source framework developed by Google for designing and training various neural network architectures.
Keras is a high-level deep machine learning API included with TensorFlow. All models were tested on a system with the following specifications:
• Operating system: Ubuntu 20.04
• CPU: Intel Core i5 9400F, 2.90 GHz
• RAM: 8 GB
• Video card: Nvidia GeForce GTX 1660

The following metrics were used to assess the quality of the algorithms:
• Precision: $P = \frac{TP}{TP + FP}$
• Recall: $R = \frac{TP}{TP + FN}$
• F1: $F_1 = \frac{2 \cdot P \cdot R}{P + R}$
• Accuracy: $A = \frac{TP + TN}{TP + TN + FP + FN}$
where TP is the number of true positive, FP false positive, FN false negative and TN true negative answers. Precision denotes the proportion of correctly identified objects of a class relative to all objects assigned to this class. Recall shows the proportion of elements of a class found by the network relative to all elements of this class. F1 combines precision and recall by calculating their harmonic mean. In addition, during network training, the values of the loss function were considered as a characteristic of training quality.

Collecting data for training neural networks in tasks of this type is a complex and time-consuming process that requires the participation of a large number of people. Therefore, ready-made, already labeled datasets were used as the source of training and validation data [32]. A total of 14,197 images were collected, of which 8,066 were of healthy patients, 5,558 of patients with pneumonia and 573 of patients with COVID-19. 100 images of each class were selected for training and validation. The total number of images is shown in Table 6.

Images from the training set have different resolutions, but neural networks require a predetermined number of input neurons. Therefore, as a preprocessing step, all images are scaled to a single resolution, 512x512 px, before being sent to the network. When deep neural networks are used in image classification problems, additional research is required to select the optimal parameters for various parts of the algorithms.
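The precision, recall and F1 definitions listed above can be sketched for a single class as follows. This is an illustrative sketch, not the evaluation code used in the study: the function names and the example labels in the usage note are hypothetical, accuracy is taken as the fraction of correctly classified samples, and returning 0.0 when a denominator is zero is an assumption.

```python
def per_class_metrics(y_true, y_pred, positive):
    """Precision, recall and F1 for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Assumption: report 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)
```

For a three-class problem such as normal, pneumonia and COVID-19, `per_class_metrics` would be called once per class, for example `per_class_metrics(y_true, y_pred, "covid")`, which matches the per-class reporting of these metrics.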
Pre-processing the input data can have a significant effect on model training. In this part of the study, the following image processing options were considered:
• Scaling: dividing all values in the image by 255.
• Centering the mean of the image at 0 and normalizing the standard deviation to 1.

The study tested the following neural network models with standard parameters:
• Inception V3, input layer dimension: 299x299
• ResNet-50, input layer dimension: 224x224
• DenseNet-201, input layer dimension: 224x224

All models were trained for 10 epochs with a batch size of 16 images. Categorical cross-entropy was used as the loss function, and Adam was used as the optimizer. Training was stopped early if the value of the loss function on the validation set did not decrease for five epochs. The values of the loss function (loss) and of the precision and recall metrics on the training and validation sets (prefix "val") for images processed with the scaling and centering methods are given in Tables 7 and 8, respectively.

The algorithm chosen to optimize the loss function can significantly affect the performance of neural networks. This part of the study analyzes several optimizers: Stochastic Gradient Descent (SGD), RMSProp, and Adam. Testing was carried out on the Inception V3, ResNet-50 and DenseNet-201 networks with pre-scaled input images of 500x500 pixels. All models were trained for 10 epochs with a batch size of 8 images. Tables 9, 10 and 11 give the Precision, Recall and F1 values obtained on validation for each class.
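The early-stopping rule mentioned in the training setup (stop when the validation loss has not decreased for five epochs) can be sketched in plain Python as follows. The function name `stopping_epoch` and the loss values in the usage note are hypothetical, and the exact counting convention is an assumption chosen to resemble a patience-based callback; it is not taken from the authors' code.

```python
import math


def stopping_epoch(val_losses, patience=5):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the validation loss has failed to drop below its
    best value so far for `patience` consecutive epochs.
    """
    best = math.inf
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None  # the loss kept improving often enough; no early stop
```

With a 10-epoch budget and a patience of five, the rule can only trigger when the best validation loss is reached in the first half of training, for example `stopping_epoch([1.0, 0.8, 0.9, 0.85, 0.81, 0.8, 0.82])` stops at epoch 7.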
Tables 9, 10 and 11 show that different optimizers are effective to different degrees for different networks: the ResNet-50 V2 network with the RMSProp optimization algorithm identified the largest number of COVID-19 images according to the Recall metric, while Inception V3 with the SGD method shows the highest results on average across classes according to the F1 metric. Thus, the study revealed that, for diagnosing COVID-19 from X-ray images, it is preferable to use the Inception network together with the SGD optimization algorithm. In this case, it is recommended that the values in the input images be reduced to the range [0; 1] by scaling. In the course of the experiments, models based on the following architectures were adapted to the problem, trained and tested: Inception, ResNet and DenseNet. The accuracy of the networks during training and testing is shown in Fig. 10. As seen from Fig. 10, the accuracy of the networks increases with the number of training epochs and tends to 1, which indicates the high quality of these algorithms in solving image classification problems.

In this paper, a study of deep machine learning algorithms in image recognition problems was carried out. Convolutional neural network architectures such as Inception, ResNet and DenseNet were considered. Models were also trained and tested to solve the problem of diagnosing pneumonia and COVID-19 from chest X-rays. The results of the study showed that the Inception V3 network together with the SGD optimization algorithm solves this problem best. Preliminary reduction of values to the range [0; 1] can improve the quality of network training. Expanding the training data and training for more epochs can significantly improve the quality of image recognition.
Disease outbreak news
Situation reports
Covid-19 coronavirus pandemic
World health organization declares global emergency: A review of the 2019 novel coronavirus (covid-19)
Epidemiological and clinical features of the 2019 novel coronavirus outbreak in china
Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in wuhan, china: a descriptive study
Clinical features of patients infected with 2019 novel coronavirus in wuhan, china
Molecular diagnosis of a novel coronavirus (2019-ncov) causing an outbreak of pneumonia
Chest x-ray scanning based detection of covid-19 using deep convolutional neural network
A deep learning algorithm using ct images to screen for corona virus disease
Detection of 2019 novel coronavirus (2019-ncov) by real-time rt-pcr
Imaging profile of the covid-19 infection: radiologic findings and literature review
Essentials for radiologists on covid-19: an update-radiology scientific expert panel
Drawbacks and limitations of computed tomography: views from a medical educator
Chest tuberculosis: Radiological review and imaging recommendations
Defining the diagnostic divide: an analysis of registered radiological equipment resources in a low-income african country
Efficient processing of deep neural networks: A tutorial and survey
Text information extraction in images and video: a survey
Ambient intelligence: A new multidisciplinary paradigm
Deep convolutional neural network based medical image classification for disease diagnosis
Machine learning algorithms for disease prediction using iot environment
Comparative analysis of ensemble classifier and single base classifier in medical disease diagnosis
Automated cardiac disease diagnosis using support vector machine
Application of machine learning for the detection of heart disease
Detection of common risk factors for diagnosis of cardiac arrhythmia using machine learning algorithm
Very Deep Convolutional Networks for Large-Scale Image Recognition
Rethinking the Inception Architecture for Computer Vision
Deep residual learning for image recognition
Densely Connected Convolutional Networks
COVID-Net: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest Radiography Images
Covid-19 radiography database

Funding: No funding was received for this study. The authors declare no conflict of interest, financial or otherwise. This research paper does not contain any studies with human participants or animals performed by any of the authors. Not needed. All authors agreed on the submitted version.