key: cord-0076753-ivtb3lea
authors: Sharma, Mayank; Narang, Kabir; Makhija, Shubham; Joy, Deborah T.; Gupta, Neeraj; Gupta, Rashmi
title: Data interpretation leading to image processing: a hybrid perspective to a global pandemic, COVID-19
date: 2022-01-14
journal: Data Science for COVID-19
DOI: 10.1016/b978-0-323-90769-9.00015-3
sha: 4a34190a693714717b86e6c72316361d5cb101e4
doc_id: 76753
cord_uid: ivtb3lea
When the coronavirus hit, it caught mankind unaware of its influence, and before that knowledge could be gained the world was facing another pandemic. Having overcome earlier epidemics, humanity now faces yet another virus that threatens the species: the coronavirus disease, also known as COVID-19 [World Health Organization]. From the first confirmed case in the city of Wuhan, China, to the first recorded coronavirus death, everyone has been looking for ways to tackle it. This chapter looks at how to deal with a disease that has confined people to their homes and may persist for another decade or so, and at how to turn the data it generates into an advantage. The chapter walks through the emergence and spread of COVID-19, statistically measures its movement using data analysis and processing, assesses its appearance via image processing, and then goes over some effective perspectives for keeping the virus in check. So much COVID-19 data is now available that it exceeds our ability to fathom it, so we attempt to process this data to find methods to nip the disease in the bud. This review first describes the history of and past research on coronaviruses, then moves on to COVID-19 in relation to them, and finally covers the image processing techniques used so far and their future scope for the same outbreak.
Coronaviruses have caused major outbreaks in past decades; as the history of events shows, with its heavy death tolls and public alarm, the scenario seems to repeat every decade. The first human coronavirus was characterized in the 1960s and is accountable for a considerable proportion of upper respiratory tract infections in children. Since then, many new coronaviruses have been detected: since 2003, at least five new human coronaviruses have been identified, including the SARS coronavirus. Coronavirology, including the study of the group I (NL63) and group II (HKU1) coronaviruses, has advanced significantly within this past decade. Since the outbreak of the SARS coronavirus, the animal coronaviruses have also been in the spotlight [4]. Coronavirus cases have since increased significantly, affecting the world with a large number of patients and deaths. On December 31, 2019, the Wuhan Health Commission in the People's Republic of China reported to the China CDC and the WHO 27 cases of pneumonia with symptoms such as fever, dry cough, and dyspnea, with radiologic findings showing bilateral ground-glass opacities in the lungs. Wuhan at this time was among the top five most populated cities in China and was becoming a flash point of the pandemic, as the 27 reported cases were traced to the Huanan Seafood Wholesale Market [5, 6]. With the increasing exposure of humans to animals, and with neither proper containment methods nor identification techniques available, the virus spread rapidly in Wuhan. This eventually led the WHO to declare the outbreak a Public Health Emergency of International Concern on January 30, 2020, and later, on March 11, as the number of cases kept growing, to declare it a pandemic. Animals provide a perfect reservoir in which the virus can build up, multiply, and even mutate into more harmful forms, as animal cells offer a suitable cellular environment for it to replicate and spread.
"The Wuhan coronavirus" was named 2019-nCoV (novel coronavirus 2019) by the China CDC on January 7; the WHO later renamed the disease COVID-19 (the virus itself being designated SARS-CoV-2), so that the name reflects the disease rather than a geographic location or nationality [7–9]. Since the outbreak of this pandemic, many organizations and institutions have been working on, and many new technologies have emerged for, the prevention, detection, and cure of this disease. For instance, the reverse transcription-polymerase chain reaction (RT-PCR) test has set the benchmark for the diagnosis of COVID-19, but it provides a diagnosis with an accuracy of only 66%–80%, which means that 20–34 of every 100 patients with COVID-19 test negative despite being infected. RT-PCR returns a positive result only for patients in whom the virus has accumulated to a detectable level; patients whose viral load is below that level test negative despite being infected. So rather than relying on one test, RT-PCR is repeated within 24–72 h of a negative result to confirm the case as positive or negative. Because RT-PCR can return false negatives in the early stages of infection, chest CT scanning is also being used, which reaches an accuracy of 95% even in the early stage of COVID-19. One case study, in which patients were imaged and assayed within 3 days of infection with the virus, reported 98% accuracy for CT against 71% for RT-PCR. This provides a clear advantage to society by preventing infected patients from being discharged into the community. As this is a time of extreme care and protection, a method approaching 100% accuracy is needed to help stop the further spread of the pandemic.
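The arithmetic behind the RT-PCR figures quoted above can be checked with a short sketch. It treats the quoted "accuracy" as test sensitivity and assumes that repeated tests are statistically independent; both are simplifying assumptions made here for illustration, not claims from the cited studies, and the function names are hypothetical.

```python
def false_negative_rate(sensitivity: float) -> float:
    """Share of infected patients that a single test reports as negative."""
    return 1.0 - sensitivity


def combined_sensitivity(sensitivity: float, n_tests: int) -> float:
    """Chance that at least one of n_tests is positive for an infected patient.

    Assumes independent tests (an illustrative simplification; in practice
    repeat results on the same patient are likely to be correlated).
    """
    return 1.0 - (1.0 - sensitivity) ** n_tests


if __name__ == "__main__":
    for sensitivity in (0.66, 0.80):
        missed = false_negative_rate(sensitivity) * 100
        repeat = combined_sensitivity(sensitivity, 2) * 100
        print(f"sensitivity {sensitivity:.0%}: about {missed:.0f} of 100 infected "
              f"patients missed by one test; two tests detect about {repeat:.1f}%")
```

Under these assumptions a second test narrows the gap considerably but does not close it, which is consistent with the chapter's point that repeat testing alone cannot reach 100% accuracy.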
The drawbacks of increased CT usage include a massive economic burden on healthcare resources and the potential for contamination of the CT scanners within the management algorithm [10–12].
Working our way toward data, we are quite familiar with the term: data refers to facts and information that lay the grounds for reasoning, reference, and/or analysis. As the meaning itself suggests, data is meant to be analyzed to yield a conclusion or inference. Formally, when we talk about analyzing, we mean an exploratory examination of some structure, the specimen in this case being the coronavirus disease. According to medics, the human body constantly gives out data; furthermore, when a virus attacks and subdues healthy cells, the data begins to mutate and react violently or compliantly depending on the potency of the attacking virus. So when the coronavirus attacks a human cell, the data begins to change. Capturing this data and processing it to make it fit for investigation and explanation becomes mandatory in order to predict the movement of the virus, measure its sway, or find ways to thwart it. From a global perspective, we face the war against coronavirus with the help of the data it provides us. Data in the form of symptoms, CT scans (as mentioned earlier), or even chronic diseases can be used to declare a person COVID-19 positive or COVID-19 negative. This data is easily retrieved from medical professionals, which assists the construction of datasets that can be fed into systems that learn from them and, given fresh data, predict whether a person is carrying the virus. This is tedious, and conventional minds may even regard it as risky, but as always we aspire to create something better and more efficient when human resources are not at our disposal. The next section discusses the interpretation of COVID-19 data collected from the country with the second largest population on the planet.
The data interpretation given in Fig. 35.1 is restricted to age range analysis, deaths, and recoveries, based on data received as of April 21, 2020. According to the data provided by the Health Ministry of India, although the disease does not discriminate by age, sex, gender, or religion, younger Indians seem to be at higher risk of contracting it, as 83% of the patients were found to be under the age of 50 years. Almost 14.4% of those reported dead were in the age group of 0–45 years, 10.3% in the age group of 45–60 years, 33.1% in the age group of 60–75 years, and 42.2% above 75 years. It follows that 75.3% of the dead were above 60 years of age. Furthermore, 83% of the deceased had some additional conditions as well [13]. According to the data provided by the Health Ministry of India, Maharashtra is the most affected state with a total of 6817 cases, of which 957 are cured/discharged/migrated and 301 are reported dead, as illustrated in Fig. 35
The data interpretation given in Fig. 35.4 is restricted to age range analysis, all deaths, and confirmed deaths, according to the data received as of April 25, 2020. About 97% of the people suffering from the disease worldwide are in mild condition, while 57,553 (3%) are in a serious or life-threatening condition. Of the cases that had an outcome, around 873,857 (81%) have recovered or been discharged and 205,928 (19%) are reported dead. See Appendix A for more details. The coronavirus disease is presently affecting 210 nations and territories around the globe and two international conveyances [15]. The present data analysis is carried out for the top seven affected countries, as depicted in Fig. 35.5. According to preliminary estimates by China's National Health Commission (NHC), about 80% of deaths occurred in people over the age of 60 years, and 75% of the deceased already had health conditions such as diabetes and heart ailments.
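The percentages quoted above can be reproduced from the raw figures. The following is a minimal sketch using only numbers stated in the text; the variable names and the helper `percent` are illustrative, and the "active cases" figure for Maharashtra is derived here rather than reported in the chapter.

```python
# Share of reported deaths in India by age group (percent), as quoted in the text.
death_share_by_age = {"0-45": 14.4, "45-60": 10.3, "60-75": 33.1, "75+": 42.2}

# Deaths above 60 years are the sum of the two oldest groups.
share_over_60 = death_share_by_age["60-75"] + death_share_by_age["75+"]

# Global closed cases (cases that had an outcome) as of April 25, 2020.
recovered = 873_857
dead = 205_928
closed_cases = recovered + dead

# Maharashtra figures; the remainder is taken to be active cases (derived value).
maharashtra_total, maharashtra_cured, maharashtra_dead = 6817, 957, 301
maharashtra_active = maharashtra_total - maharashtra_cured - maharashtra_dead


def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


if __name__ == "__main__":
    print(f"Deaths above 60 years: {share_over_60:.1f}%")
    print(f"Recovered among closed cases: {percent(recovered, closed_cases):.0f}%")
    print(f"Dead among closed cases: {percent(dead, closed_cases):.0f}%")
    print(f"Maharashtra active cases (derived): {maharashtra_active}")
```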
According to the report issued by the WHO, the median age of cases detected outside China is 45 years, with a range of 2–74 years. About 71% of these cases were male, possibly because men make up the larger share of the world's workforce. A study of 138 hospitalized patients with Novel Coronavirus (2019-nCoV)-Infected Pneumonia (NCIP) found that the median age was 56 years (interquartile range, 42–68 years; range, 22–92 years) and that 75 (54.3%) were men. People of any age or stature can be infected by the coronavirus disease, as it does not select a person according to age or sex. Older people and persons with preexisting medical conditions (such as asthma, cardiovascular disease, or diabetes) are at high risk of being badly affected by the disease if they contract it. Fig. 35.6 highlights the deaths that occurred across all ages in the pandemic. The death rate refers to the probability of dying from a disease if infected. In this case it depends strongly on age and may vary with preexisting conditions. For a person in any given age group, it represents the risk of dying if infected with the virus. Very few cases are seen among children [15].
The novel coronavirus (2019-nCoV) has taken several thousand lives worldwide and locked down many cities and countries, and its global consequences for the future are still unknown. Its arrival in late 2019 has gripped the headlines of all the news channels, as it has been declared a global pandemic with positive cases in practically every city in the world. Throughout the course of history, viral diseases such as SARS, H1N1 swine flu, Spanish flu, and many other pandemics have haunted humanity, sometimes changing the course of history and at times seeming to signal the end of human civilization.
The novel coronavirus outbreak, which emerged from a live animal market in the city of Wuhan, in China's Hubei Province, in December 2019, is continuing to haunt people around the world. The response to the COVID-19 outbreak has been substantial, as the infection is at its peak and the situation is critical. The published literature can trace the onset of symptoms in individuals back to early December 2019 [16, 17]. As the causative agent could not be identified, these first reported cases were classified as "pneumonia of unknown etiology." The outbreak was declared a Public Health Emergency of International Concern on January 30, 2020. The WHO is working round the clock to analyze data, provide advice, coordinate with partners, help countries prepare, increase supplies, and manage expert networks. On February 11, 2020, the WHO announced a name for the new coronavirus disease: COVID-19 [18]. The virus under scrutiny belongs to a large family of viruses that cause respiratory infections in humans, other mammals, and birds. Most family members cause only mild symptoms in healthy patients, and as they are the cause of about 15% of the cases of the common cold, it is likely that you have successfully fought off a coronavirus infection without even noticing. COVID-19 can also be accompanied by bacterial infections that cause similar types of respiratory disorders. Coronaviruses are single-stranded RNA viruses that can be isolated in different animal species [19]. The information provided by the China CDC on the genetic sequence of SARS-CoV-2 helped many countries to develop primers for this genome, and most countries are using this information and have started developing countermeasures. Some countries, such as the United States, are developing criteria for investigating persons under investigation for SARS-CoV-2.
If a person is classified as a PUI (Person Under Investigation) by the US CDC, i.e., they show symptoms of acute respiratory illness such as cough and difficulty in breathing and have developed a fever, practitioners are advised to put infection control and prevention measures in place immediately [20]. Various countries and most health organizations, including the WHO, have suggested extending research by collecting specimens from the upper and lower respiratory tracts, such as expectorated sputum, endotracheal aspirate, or bronchoalveolar lavage. For this research the samples must remain intact and therefore require storage at 4 °C. With the number of patients increasing daily in every country, the technology and methods to counter this pandemic have advanced as well. Early infection of human cells does not produce any visible symptoms or behavioral changes, but internally the virus attacks the white blood cells, and the resulting decrease in lymphocyte count may go undetected. Animals provide a perfect reservoir in which the virus can build up, multiply, and even mutate into more harmful forms, as animal cells offer a suitable cellular environment for viruses to replicate and spread. Many vaccines are under investigation, but no antiviral drug or vaccine is currently available that directly cures COVID-19. As the infection mainly affects the respiratory system, oxygen therapy is a major treatment for improving the condition of a severely infected person, and mechanical ventilation is used in cases of major respiratory failure. The only treatment available for now is symptomatic: only the symptoms of an infected person are treated with medication.
On March 13, 2020, the WHO published a document titled "Clinical management of severe acute respiratory infection (SARI) when COVID-19 disease is suspected." This document sets out guidelines that a healthcare system must follow to manage the treatment of COVID-19 patients. It advises screening and isolating, as soon as possible, all patients detected with the disease or who have had contact with an infected person, including doctors, nurses, and the rest of the medical staff (clinicians). It concludes with how a healthcare system should plan to address respiratory failure by providing oxygen therapy, mechanical ventilation, high-flow nasal oxygen or noninvasive ventilation, and symptomatic treatment [21]. To protect themselves from infection, healthcare providers must take certain precautions. Any treatment should be carried out by an expert wearing a proper personal protective equipment kit, including an N95 mask, protective goggles, and other protective equipment. Many patients with severe coronavirus infection develop respiratory failure or acute respiratory distress syndrome (ARDS) and are mechanically ventilated. Noninvasive ventilation is breathing support given through a face mask, a helmet, or similar equipment; a mixture of air and added oxygen is usually delivered through the mask under positive pressure. Both high-flow nasal cannula oxygen therapy and noninvasive ventilation are used in the management of acute hypoxemic respiratory failure. Many therapies are available to date, but some, including systemic corticosteroids, are not recommended for the treatment of viral pneumonia or ARDS. An appropriate and selective approach should be adopted; any indiscriminate or unsuitable use of antibiotics should be avoided, although some studies recommend their use. As of now, no antiviral treatments are approved or authorized.
According to certain studies, when the ailment results in complex clinical pictures of multiple organ dysfunction, organ function support, in addition to respiratory support, should be considered mandatory.
"Prevention is always better than cure," so rather than fighting the disease, why not prevent it before it affects the masses? As no proper vaccine or cure is available, preventive measures currently serve as the strategy for limiting mass spread or community transmission. Prevention strategies focus on mass testing and the isolation of patients with the disease. Healthcare workers are also advised to take proper precautions before collecting any kind of specimen for the testing, treatment, and diagnosis of patients. The WHO and certain other organizations have issued the following guidelines:
• Avoid any kind of contact with patients suffering from similar symptoms.
• Wash hands frequently and avoid going to public places.
• Always keep contact with farm or wild animals protected, as there is not yet clear evidence on whether they are prone to infection.
• People with any kind of symptoms should self-quarantine for a period of at least 14 days and should always cover their mouths while coughing or sneezing.
• Strict hygiene measures should be followed by healthcare systems providing emergency treatment, for the prevention and control of infections.
• Individuals with weak immunity should avoid any kind of public gathering [22].
Washing hands before and after going to any public place, using hand sanitizers, always wearing a mask, and not touching the face repeatedly are considered very good practices for maintaining proper hygiene and thus preventing the disease by killing the virus.
Healthcare workers who are in very close contact with patients, and thus at high risk of being infected, must use proper personal protective equipment kits that cover every part of the body, including N95 masks, protective goggles, disposable gloves, and more. For now, no vaccines are available, but scientific research is moving very close to developing one [16, 23].
Image processing is a method entailing a series of operations performed on a digital image to enhance it, analyze it, or extract information from it. It consists of a chronologically ordered set of steps, or a well-defined procedure, that an image must undergo to give an end result that is optimized enough for interpretation by any imaging system [24]. Now, with the emergence of AI, image processing is being taken to new heights. In this section, we play with the same idea and try to make it work in our favor, seeing as we are, ironically, facing an invisible virus. There are numerous tools for image processing, including CVIPtools, GNU Octave, MATLAB, OpenCV, Scikit-Image, SciLab, VGL, VNL, and so on [25–28]. In this chapter, we use Open Source Computer Vision, OpenCV for short, a library primarily used for real-time computer vision. As mentioned at the beginning of the chapter, CT scans can be used effectively to detect the coronavirus, but because we are restricted by the data that has been collected for machine analysis, another primary scan set, namely X-ray, is used here [29]. Figs. 35.9 and 35.10 depict the chest X-ray images of normal and affected patients, respectively. The following algorithm is used to ensure that the images loaded are those of lungs [34]. Here we test the image for the presence of the virus using the supine, frontal view of the lungs.
Step #1 Import the requisite libraries.
Step #2 Load the image dataset file.
Step #3 Set up a counter to calculate the number of images going into each dataset, Normal or Virus Infected.
Step #4 Apply the lung finding algorithm on the images and start the counter.
Step #5 Keep the loop going and be sure to destroy the windows before exiting each loop.
Next, the images are resized to a common format to make processing easier, without regard to the aspect ratio. The following algorithm is used to ensure that the images are uniformly sized [35].
Step #1 Import the necessary libraries to begin processing.
Step #2 Fetch the set of images from our dataset folder.
Step #3 Initialize the set of data (i.e., images) and class images.
Step #4 Start a loop over the image sources so as to extract the class label from the filename.
Step #5 Load the image, then swap the color channels and resize the image to a fixed 224 × 224 pixels while ignoring the aspect ratio.
Step #6 Update the data and label lists respectively, and convert the data and labels to NumPy arrays while confining the pixel intensities to the range [0, 255].
Step #7 Save the images back into the directory.
The data analyzed shows that, at the global level, India holds 0.91% of COVID-19 cases, 0.40% of total deaths, and 0.67% of total recoveries. Compared with the global high, India is 2.77% lower in total cases, 1.5% lower in total deaths, and 5.01% lower in total recoveries. Furthermore, according to the data, owing to the imbalance in the sex ratio in India, a majority of COVID-19 cases, deaths, and recoveries are among men. At the global level, though, the death rate due to comorbidity seems to be higher than that of deaths caused solely by the coronavirus infection [36]. From the case study on the coronavirus disease, we have found that there are so far no vaccines or even drugs to curb and overpower the virus.
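Returning to the image-resizing procedure listed earlier (Steps #1 to #7), its core can be sketched as follows. This is a minimal illustration and not the chapter's own code: it uses NumPy nearest-neighbour sampling in place of OpenCV's resize so that it runs without OpenCV installed, it operates on in-memory colour arrays rather than files, and the filename scheme `<label>_<id>.png` and all function names are hypothetical. In an OpenCV workflow, `cv2.imread`, `cv2.cvtColor` with `cv2.COLOR_BGR2RGB`, `cv2.resize`, and `cv2.imwrite` would take the corresponding roles.

```python
from collections import Counter

import numpy as np

TARGET_SIZE = 224  # side length in pixels, as stated in Step #5


def label_from_filename(filename: str) -> str:
    """Extract the class label, assuming a hypothetical '<label>_<id>.png' scheme."""
    return filename.split("_")[0]


def bgr_to_rgb(image: np.ndarray) -> np.ndarray:
    """Swap colour channels of an H x W x 3 image by reversing the last axis."""
    return image[..., ::-1]


def resize_nearest(image: np.ndarray, size: int = TARGET_SIZE) -> np.ndarray:
    """Resize to size x size by nearest-neighbour sampling, ignoring aspect ratio."""
    height, width = image.shape[:2]
    rows = np.arange(size) * height // size  # source row for each output row
    cols = np.arange(size) * width // size   # source column for each output column
    return image[rows[:, None], cols]


def preprocess(named_images):
    """Build data and label arrays plus a per-class counter from (filename, image) pairs."""
    data, labels = [], []
    for filename, image in named_images:
        labels.append(label_from_filename(filename))
        data.append(resize_nearest(bgr_to_rgb(image)))
    # Keep pixel intensities in [0, 255] and store them as 8-bit integers.
    data = np.clip(np.array(data), 0, 255).astype(np.uint8)
    return data, np.array(labels), Counter(labels)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for X-ray images of different sizes.
    samples = [
        ("normal_001.png", rng.integers(0, 256, (300, 400, 3), dtype=np.uint8)),
        ("covid_001.png", rng.integers(0, 256, (512, 512, 3), dtype=np.uint8)),
    ]
    data, labels, counts = preprocess(samples)
    print(data.shape, labels.tolist(), dict(counts))
```

The per-class counter mirrors the counting step of the lung-finding loop; the lung-finding algorithm itself is not specified in the text and is therefore not sketched here.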
The detection of coronavirus is based on symptoms, and CT scans or X-rays of the chest are the determinants used to declare a person COVID-19 positive. Therefore, given that technology can assist wherever images are concerned, we have introduced image processing. Under image processing, we have successfully classified the images and extracted the data required by our algorithm in the hope of optimizing it for further learning. With the availability of lung examination and the possibility of image manipulation, we have a final output of 50 X-ray images of normal patients and 50 X-ray images of patients affected with coronavirus. The aim of performing image processing is to feed the data into a learning algorithm as labeled data for supervised machine learning or even deep learning techniques, and to let the algorithm learn the differences between COVID-19-positive and COVID-19-negative scans so that it can classify a new data feed as a positive or a negative case of COVID-19.
To sum up, this chapter began with the coronavirus outbreak and the ideas it inspired in the technological and medical fields, moved on to the ways of detecting COVID-19 and traced its history, and then turned to diagnosis and prognosis, studying the role that data can play in the analysis of the virus. With inferences drawn and effects measured, the situation of the globe as a whole in fighting this disease was studied. A feasible approach to this pandemic problem, image processing, was then explored. Using code and algorithms, a glimpse was given of what the bigger picture may look like in the future. On a final note, technology can be neither limited to the present nor confined to a particular field. There is no telling where it will lead, and there is no imagining where it will go if it is fed with data and information.
What is encouraging is that a beautiful picture is being painted, a blend of technology and medicine of a kind humanity has not seen before, to provide a solution for a problem it has not faced before. Professionals are coming up with ideas to use deep learning to help detect COVID-19 and to save our environment, and, who knows, people may choose to work from home even after the pandemic dies down. Let humans keep their eyes peeled and their ears sensitive to the sound of change when providence chooses to relieve the world of this trial.
World Health Organization: WHO, Naming the Coronavirus Disease (COVID-19) and the Virus That Causes it Death Toll Due to COVID19 Crosses 192,000 Globally Deep Learning on the March Against the Novel Coronavirus History and recent advances in coronavirus discovery Symptoms of Coronavirus WHO Director-General's Remarks at the Media Briefing on 2019-nCoV on 11 A Virus With a Deadly Boring Name. 2019-nCoV isn't going to cut it long term Detecting COVID-19 in X-ray Images With Keras, TensorFlow, and Deep Learning Everything You Need to Know About Coronavirus Testing Ministry of Health and Family Welfare (Government of India World Health Organization COVID-19 Evaluation and Treatment Coronavirus (COVID-19 Centers for Disease Control and Prevention: CDC, Evaluating and Testing Persons for Coronavirus Disease World Health Organization, Clinical Management of Severe Acute Respiratory Infection When COVID-19 Is Suspected. www.who.int/publications-detail/clinical-management-of-severe-acute-respiratory-infection-when-novel-coronavirus-(ncov)-infection-is-suspected COVID-19) Technical Guidance: Infection Prevention and Control Therapeutic options for the treatment of 2019-novel coronavirus: an evidence-based approach Digital Image Processing Fundamental Steps of Digital Image Processing Image Recognition and Image Processing Techniques Enhancing the Quality and Transparency of Health Research Building a Public COVID-19 Dataset of X-ray and CT Scans COVID-2019) Situation Reports
1223 139 3 2357 15,026 115 503 4
Israel 15,398 100 199 6602 8597 132 1779 23
Austria 15,225 77 542 6 12,282 2401 145 1690 60
Mexico 13,842 970 1305 84 7149 5388 378 107 10
Singapore 13,624 931 12 1060 12,552 22 2329 2
Chile 13,331 473 189 8 7024 6118 418 697 10
Japan 13,231 360 1656 11,215 287 105 3
Pakistan 13,201 478 272 3 2936 9993 111 60 1
Poland 11,617 344 535 11 2265 8817 160 307 14
Romania 11,036 401 619 18 3054 7363 236 574 32
S. Korea 10,728 10 242 2 8717 1769 55 209 5
Belarus 10,463 873 72 5 1695 8696 92 1107 8