key: cord-0820807-slqd6sfb authors: Alanazi, Saad Awadh; Kamruzzaman, M. M.; Alruwaili, Madallah; Alshammari, Nasser; Alqahtani, Salman Ali; Karime, Ali title: Measuring and Preventing COVID-19 Using the SIR Model and Machine Learning in Smart Health Care date: 2020-10-29 journal: J Healthc Eng DOI: 10.1155/2020/8857346 sha: 888f4b467fd115ad21011dba87f28f3b6b78d3b0 doc_id: 820807 cord_uid: slqd6sfb
COVID-19 presents an urgent global challenge because of its contagious nature, frequently changing characteristics, and the lack of a vaccine or effective medicines. A model for measuring and preventing the continued spread of COVID-19 is urgently required to provide smart health care services. This requires using advanced intelligent computing such as artificial intelligence, machine learning, deep learning, cognitive computing, cloud computing, fog computing, and edge computing. This paper proposes a model for predicting COVID-19 using the SIR model and machine learning for smart health care and the well-being of the citizens of KSA. Knowing the number of susceptible, infected, and recovered cases each day is critical for mathematical modeling to be able to identify the behavioral effects of the pandemic. The model forecasts the situation for the upcoming 700 days. The proposed system predicts whether COVID-19 will spread in the population or die out in the long run. Mathematical analysis and simulation results are presented here as a means to forecast the progress of the outbreak and its possible end for three types of scenarios: “no actions,” “lockdown,” and “new medicines.” The effect of interventions like lockdown and new medicines is compared with the “no actions” scenario. The lockdown case delays the peak point by decreasing the infection and affects the area equality rule of the infected curves. On the other hand, new medicines have a significant impact on the infected curve by decreasing the number of infected people over time.
Available forecast data on COVID-19 using simulations predict that the highest level of cases might occur between 15 and 30 November 2020. Simulation data suggest that the virus might be fully under control only after June 2021. The reproductive rate shows that measures such as government lockdowns and isolation of individuals are not enough to stop the pandemic. This study recommends that authorities should, as soon as possible, apply a strict long-term containment strategy to reduce the epidemic size successfully.

The rapid growth of COVID-19 has forced scientists to develop urgent countermeasures to halt the outbreak. Scientists have proposed and implemented various technologies to reduce the negative consequences of the pandemic and to accelerate the recovery phase [1]. These technologies include artificial intelligence (AI), machine learning, deep learning, cloud-based collaboration tools, fog computing, the Internet of Things (IoT), cognitive computing, and wireless communication. There is great potential for these technologies to bring about a revolution in the healthcare industry [2-4]. AI, and in particular machine learning algorithms, has increasingly become an integral part of smart healthcare. These technologies are increasingly referred to as the brain of smart healthcare services [5, 6]. Deep learning, a subset of machine learning in AI, has networks capable of learning, unsupervised, from unstructured or unlabeled data, and has been intensively used in many applications including COVID-19 [7-10]. In deep learning, convolutional neural networks (CNN) are a class of deep neural networks, most commonly applied in the field of computer vision [11]. This method has been applied to many tasks, including super resolution, image classification, semantic segmentation, multimedia systems, healthcare, and emotion recognition [12-16]. On the other hand, cloud computing can provide the digital infrastructure needed for smart health care.
That is, a smart healthcare services cloud could provide data storage and data processing for all activities [17-20]. Fog computing is the latest computing paradigm and employs user devices or devices close to the user, also called the network edge or edge users, to perform data processing tasks. This network architecture enhances the flexibility of cloud computing as compared to more ubiquitous networks [21, 22]. The Internet of Things is a set of interrelated computing devices with unique identifiers that enables the network transfer of data without requiring human-to-human or human-to-computer interaction. It includes everything from desktops, laptops, and smartphones to coffee makers, washing machines, and wearable devices [23-28]. Smart healthcare services require formal processes for measuring, preventing, and managing the spread of COVID-19. The technologies mentioned can be used to reduce this pandemic's negative consequences and accelerate the recovery phase.

Various models have been used to study how the virus spreads across populations: the susceptible-infected (SI), susceptible-infected-recovered (SIR), susceptible-infected-susceptible (SIS), and susceptible-infected-recovered-susceptible (SIRS) models [29]. These models offer two possible outcomes. The first is that the disease, if new infections are not controlled, ends up being an epidemic. The second is that the virus dies off if the necessary measures are taken to protect susceptible individuals from infection. COVID-19 follows a similar pattern to that of other infectious diseases, and contact tracing is needed to help reduce new infections. The similarities between COVID-19 and other infectious diseases in terms of contact infections make it possible to predict its outcomes. Two models are potentially useful in managing the disease. These are the SIR and SIR-F models, in which F stands for "fatal with the confirmation."
The underlying assumption is that people who recover are not again susceptible to the virus. The challenge, however, is that human behavior does not follow one set of specific rules but rather tends to change from one community to another and from one person to another. It is thus difficult to use these models to determine the future outcomes of the disease accurately, but the models can help in developing real-life approaches to the virus. Machine learning (ML) algorithms for time-series forecasting combine statistics and computer science, with calculations derived from the data. Scikit-Learn is a machine-learning library for Python. It is also a community-driven project with powerful regression tools for fitting curves to the table of suspected and recovered cases. ML can assist in establishing overarching information on the pandemic and in forecasting the advancement of infections. The various epidemic models can be evaluated with COVID-19 data using the model parameter estimation package CovsirPhy. SIR-F is a customized SIR-derived ODE model [30].

A discrete-time stochastic compartment model was used in [31] to study the dynamics of the COVID-19 epidemic. This model forecasts the spread of the disease in the next period based on parameter estimates and numerical simulation. In [32], mathematical and numerical analyses were carried out using a time-based SIR model for COVID-19 with asymptomatic individuals. In another study, AI predictions and a modified SEIR (Susceptible-Exposed-Infectious-Recovered) model were used to study the COVID-19 epidemic trends in China. This is helpful in understanding the public health interventions applied [33]. The SEIR model was effective in predicting the COVID-19 trends and the infection rates. The AI-based model was developed based on a SARS dataset that suggests that there is hope for managing the pandemic.
To validate the data, an advanced SIR prediction model was applied to the epidemic data from Italy and compared against the results from China [34]. An extended SIR forecast of the COVID-19 epidemic trend in Italy is studied in [35], whereas population migration with an effective intervention approach for COVID-19 is introduced in [36]. The SIR model incorporating time-based parameters and AI algorithms was also used to study the spread of COVID-19 in South Korea [37]. However, we have not found related research for smart healthcare. This study presents a more accurate prediction model for smart healthcare services using a machine learning approach with the SIR model. The proposed model works with a stochastic model for analyzing the COVID-19 pandemic, and we then investigate time-series forecasting of COVID-19 for the next 700 days.

We began with an exploratory data analysis of the COVID-19 datasets maintained by Johns Hopkins University. These datasets include all countries, although for some there is no detailed information about the number of patients hospitalized or about government interventions to lock down institutions, schools, and markets. In this study, we modeled data for the Kingdom of Saudi Arabia (KSA) only, rather than including other countries as would be the case in a global model, so as to evaluate the effectiveness of the intervention in curbing the spread of COVID-19 [38]. Most countries have a COVID-19 dataset. Mortality, recovery, and infection rates are essential for the SIR model. The datasets present social behavior data but not for all countries. The population data are the starting point of our analysis. In the global model, we used population numbers for all countries. For each of the simple models, we made use of a general population pyramid across different age groups, employing the real population pyramid of Table 1. In Table 1, people are categorized into 5 age groups, separated by the total number of males and females.
Individuals in the various age groups spend their time in different places: children are at school, people of working age are mostly in offices or other workspaces, and retired persons are at home. The number of hours that people are awake on a daily basis is given in Table 2. The Johns Hopkins University (JHU) datasets are arguably the most popular COVID-19 data sources available. Data for about 181 countries are updated regularly. The numbers of confirmed, infected, fatal, and recovered cases for KSA as of 12 June 2020, obtained from the JHU, are shown in Table 3 [38]. The exploratory data analysis (EDA) of COVID-19 for KSA is summarized in Table 4 and Figure 1. For purposes of comparison, the total number of cases globally can be seen in Figure 2.

In this section, we construct a mathematical model derived from an SIR model. An SIR model is a basic statistical tool for analyzing infectious disease outbreaks. We will evaluate this in the upcoming sections. Unmonitored symptomatic cases that are a source of infection in the population are not considered. There is also a chance of being infected through touching objects that are infected. There are numerous random factors relevant to disease transmission. Among the many possibilities, we propose the stochastic SIR model detailed below. At the end of the initial modeling, we produced the SIR-F model, a custom ODE SIR-derived model. A parameter estimate for SIR-F will be applied to subsets of time-series data in each country to determine the impact of interventions.

In this basic model, the total population is divided into three subsets: susceptible, infected, and recovered. There are transitions between these parts. The number of susceptible persons is obtained by subtracting from the total population the persons who are confirmed to be virus carriers via testing in hospitals. For the transition from susceptible to infected, the contact rate, β, determines the velocity of the disease in the population.
In detail, the transition from the S state to the I state is not deterministic but always stochastic. Hence, β combines a rate, a probability, and the population number. The basic model can be defined as

$$S \xrightarrow{\;\beta I\;} I \xrightarrow{\;\gamma\;} R, \tag{1}$$

where S: susceptible = all − confirmed; I: infected = confirmed − recovered − deaths; R: recovered or fatal = recovered + deaths; β: effective contact rate (1/min); and γ: recovery (+mortality) rate (1/min). The number of infected persons decreases with recoveries and deaths. Recovered individuals can no longer return to the susceptible state. The ordinary differential equations (ODEs) of the basic SIR model are given below [39]:

$$\frac{dS}{dT} = -\frac{\beta S I}{N}, \qquad \frac{dI}{dT} = \frac{\beta S I}{N} - \gamma I, \qquad \frac{dR}{dT} = \gamma I, \tag{2}$$

where N = S + I + R is the total population and T is the elapsed time from the start date. The ODEs are used in the simulation to compute the variables after a short period of time. At any point in time, the ODEs give the rate of change of every variable. We can forecast the future by accumulating these small changes over time. Thus, the SIR model consists of three main differential equations. To implement the ODE functions mentioned, we simplify them as follows, with S, I, and R expressed as fractions of N:

$$\frac{dS}{dT} = -\beta S I, \qquad \frac{dI}{dT} = \beta S I - \gamma I = \gamma \left(R_0 S - 1\right) I, \qquad \frac{dR}{dT} = \gamma I, \tag{3}$$

where $R_0 = \beta / \gamma$ is the new term signifying the reproductive rate, that is, the number of new infections that one infection will cause. For an outbreak, the number of infections must be increasing, i.e.,

$$\frac{dI}{dT} > 0 \iff \beta S - \gamma > 0. \tag{4}$$

Let us assume S = 1 in the beginning; then

$$R_0 = \frac{\beta}{\gamma} > 1. \tag{5}$$

If $R_0$ is more than 1, it is highly probable that an outbreak will occur in the future. Equivalently, $R_0$ can be defined as the product of the contact rate β and the infection time 1/γ. $R_0$ is the most effective parameter for choosing interventions. These three parameters can be decreased individually by vaccines, isolation, and antibiotics.
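The basic SIR dynamics described above (susceptible persons become infected at rate β, infected persons are removed at rate γ) can be integrated numerically in a few lines. The following is a minimal sketch, not the authors' implementation: it works with population fractions, uses a hand-written fourth-order Runge-Kutta step so that no third-party package is needed, and the values of `BETA`, `GAMMA`, and the initial infected fraction are illustrative assumptions rather than the fitted KSA parameters.

```python
def sir_derivatives(state, beta, gamma):
    """Right-hand side of the basic SIR equations for population fractions."""
    s, i, r = state
    new_infections = beta * s * i
    removals = gamma * i
    return (-new_infections, new_infections - removals, removals)


def rk4_step(state, dt, beta, gamma):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(base, slope, h):
        return tuple(b + h * k for b, k in zip(base, slope))

    k1 = sir_derivatives(state, beta, gamma)
    k2 = sir_derivatives(shifted(state, k1, dt / 2), beta, gamma)
    k3 = sir_derivatives(shifted(state, k2, dt / 2), beta, gamma)
    k4 = sir_derivatives(shifted(state, k3, dt), beta, gamma)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_sir(beta, gamma, i0=1e-4, days=300, steps_per_day=10):
    """Return the list of (s, i, r) states, sampled every 1/steps_per_day days."""
    dt = 1.0 / steps_per_day
    state = (1.0 - i0, i0, 0.0)
    trajectory = [state]
    for _ in range(days * steps_per_day):
        state = rk4_step(state, dt, beta, gamma)
        trajectory.append(state)
    return trajectory


# Illustrative per-day rates only; these are assumptions, not fitted values.
BETA, GAMMA = 0.3, 0.1
R0 = BETA / GAMMA  # reproductive rate: contact rate times infection time

trajectory = simulate_sir(BETA, GAMMA)
peak_infected = max(i for _, i, _ in trajectory)
final_susceptible = trajectory[-1][0]
print(f"R0 = {R0:.2f}, peak infected fraction = {peak_infected:.3f}")
print(f"fraction never infected = {final_susceptible:.3f}")
```

Because the three derivatives sum to zero, S + I + R stays constant throughout the run, which is a quick sanity check on any implementation of these equations. With a reproductive rate above 1 the infected fraction first rises and then falls, and a nonzero fraction of the population is never infected.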
$R_0$ also provides five indications concerning an outbreak: whether or not it will turn out to be a pandemic; the initial rate of increase of the epidemic; the final fraction of the susceptible population that will be infected; the equilibrium fraction of susceptible individuals in a population; and the critical vaccination threshold. To obtain an optimized contact rate (β) and recovery + mortality rate (γ), we used the data in Table 3 and Optuna, a hyperparameter optimization package for Python. The resulting optimized parameters can be seen in Table 5. The root mean squared logarithmic error (RMSLE) score measures the difference between the observed values and the values estimated with the optimized parameters. The lower the RMSLE value, the more accurate the estimate. In this type of model, the main challenge is to estimate the parameters that best fit the model. If the model is time-based, differential equations are necessary for modeling the real behavior. Many parameter estimation methods are used in computational biology. The most common are Gauss-Newton, simulated annealing, genetic algorithms, and so on. The main aim is to fit the model with low cost and error.

We can get some idea of the ending time of the epidemic using susceptible (S)-recovered (R) analysis. An epidemic is said to stop not only when everyone has been infected but also when nobody is newly infected. R-S phase planes show that the decrease in the number of susceptible persons is exponentially related to recoveries. This means some groups of the population will always avoid infection [40]. In Figure 3, the regression analysis is fitted to real data. If we also consider those who died before going to the hospital (before being confirmed), then the model is expanded so that recovered cases, R, and deaths, F, are tracked separately.
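The role of RMSLE in parameter estimation can be illustrated with a small, self-contained sketch. It is not the authors' pipeline: a plain grid search is used here in place of Optuna's samplers, the "observed" series is synthetic (generated from a known β so that the right answer is unambiguous), and the names `infected_curve`, `candidates`, and all numeric values are hypothetical.

```python
import math


def rmsle(observed, predicted):
    """Root mean squared logarithmic error; 0 means a perfect match."""
    squared = [
        (math.log1p(p) - math.log1p(o)) ** 2
        for o, p in zip(observed, predicted)
    ]
    return math.sqrt(sum(squared) / len(squared))


def infected_curve(beta, gamma, population, i0, days):
    """Daily infected counts from a basic SIR model (Euler steps of 0.1 day)."""
    steps_per_day = 10
    dt = 1.0 / steps_per_day
    s, i = float(population - i0), float(i0)
    curve = [i]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            removals = gamma * i * dt
            s, i = s - new_infections, i + new_infections - removals
        curve.append(i)
    return curve


# Synthetic "observations" generated with a known beta (hypothetical values).
GAMMA, POPULATION, I0, DAYS = 0.1, 1_000_000, 10, 60
observed = infected_curve(0.25, GAMMA, POPULATION, I0, DAYS)

# Plain grid search standing in for a hyperparameter optimizer such as Optuna.
candidates = [0.15, 0.20, 0.25, 0.30, 0.35]
scores = {
    beta: rmsle(observed, infected_curve(beta, GAMMA, POPULATION, I0, DAYS))
    for beta in candidates
}
best_beta = min(scores, key=scores.get)
print("best beta:", best_beta, "RMSLE:", scores[best_beta])
```

The candidate with the smallest RMSLE is selected, which is why a low score, not a high one, indicates a good fit. The logarithm makes the metric sensitive to relative rather than absolute errors, which suits case counts that grow over several orders of magnitude.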
Therefore, the model can be written as

$$S \xrightarrow{\;\beta I\;} S^{*} \xrightarrow{\;\alpha_1\;} F, \qquad S^{*} \xrightarrow{\;1-\alpha_1\;} I \xrightarrow{\;\gamma\;} R, \qquad I \xrightarrow{\;\alpha_2\;} F,$$

where S* is confirmed and uncategorized; α1 is the mortality rate of S* cases; α2 is the mortality rate of I cases (1/min); β is the effective contact rate (1/min); and γ is the recovery rate (1/min). The corresponding ODEs are

$$\frac{dS}{dT} = -\frac{\beta S I}{N}, \qquad \frac{dI}{dT} = \frac{(1-\alpha_1)\,\beta S I}{N} - (\gamma + \alpha_2)\, I, \qquad \frac{dR}{dT} = \gamma I, \qquad \frac{dF}{dT} = \frac{\alpha_1 \beta S I}{N} + \alpha_2 I,$$

where N = S + I + R + F is the total population and T is the elapsed time from the start date. Table 6 shows the hyperparameter optimization of the SIR-F model. Prediction of the future with the final parameters of the SIR-F model is also possible with CovsirPhy, a Python package for COVID-19 data analysis with SIR-type models, which also performs S-R trend analysis.

In the worst-case scenario, it is assumed that there is no lockdown, medicine, or vaccine for the population. This baseline case is considered for the sake of comparison with the lockdown, medicine, and vaccine cases. As a result, the total number of infected people will be the highest in this case. If none of the interventions is in place, the pure SIR-F model can predict the course of the next 30 days. This case represents the natural spread of the pandemic without any retarding effect on it. The prediction results from 22 June onward are set out in Figure 4, which displays the infected, fatal, and recovered cases. The spread will apparently speed up in the near future, such that both the infected and the recovered numbers will rise fast, while the number of fatalities will remain comparatively low. Similarly, the prediction results for 700 days are presented in Figure 5, which shows that the epidemic will stop in mid-October 2021. The daily numbers of fatal, infected, recovered, and susceptible cases are listed in Table 7, where about 5 M people are infected with COVID-19, 25 M people have recovered, and there are about 700,000 fatalities. Besides that, the pandemic seems to lose its effect completely after July 2021. The rho, sigma, and $R_t$ ($R_0$) parameters for the worst-case scenario using the SIR-F model are shown in Figure 6.
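A compact numerical sketch may help make the SIR-F structure concrete. It assumes the formulation commonly associated with the CovsirPhy package, in which a fraction α1 of newly confirmed cases is fatal directly and infected cases die at rate α2 or recover at rate γ; whether this matches the authors' exact equations is an assumption. The parameter values are illustrative, not the fitted values of Table 6, and a simple Euler scheme on population fractions is used for brevity.

```python
def sirf_step(state, dt, beta, gamma, alpha1, alpha2):
    """One explicit Euler step of the assumed SIR-F equations (fractions)."""
    s, i, r, f = state
    confirmed = beta * s * i                      # flow S -> S*
    ds = -confirmed
    di = (1 - alpha1) * confirmed - (gamma + alpha2) * i
    dr = gamma * i                                # flow I -> R
    df = alpha1 * confirmed + alpha2 * i          # flows S* -> F and I -> F
    return (s + ds * dt, i + di * dt, r + dr * dt, f + df * dt)


def simulate_sirf(beta, gamma, alpha1, alpha2, i0=1e-4, days=400,
                  steps_per_day=10):
    """Return the daily list of (s, i, r, f) states."""
    dt = 1.0 / steps_per_day
    state = (1.0 - i0, i0, 0.0, 0.0)
    daily = [state]
    for _ in range(days):
        for _ in range(steps_per_day):
            state = sirf_step(state, dt, beta, gamma, alpha1, alpha2)
        daily.append(state)
    return daily


# Illustrative per-day parameters; assumptions, not the paper's estimates.
BETA, GAMMA, ALPHA1, ALPHA2 = 0.3, 0.1, 0.01, 0.005

# Reproductive rate for this formulation (assumed, following CovsirPhy).
r0_sirf = BETA * (1 - ALPHA1) / (GAMMA + ALPHA2)

daily = simulate_sirf(BETA, GAMMA, ALPHA1, ALPHA2)
s_end, i_end, r_end, f_end = daily[-1]
print(f"R0 = {r0_sirf:.2f}")
print(f"final recovered = {r_end:.3f}, final fatal = {f_end:.3f}")
```

As in the basic model, the four derivatives sum to zero, so S + I + R + F is conserved. Lowering α1 or α2 shifts removals from F to R without changing the contact process, which is the lever the medicine scenario acts on.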
The $R_t$ ($R_0$) parameter is explained in equations (4) and (5). The rho parameter shows that transitions to the infected state decrease after the 3rd stage and that the reproductive rate increases at the 4th stage. The 5th stage is for forecasting the epidemic.

In light of the rising numbers of cases and deaths, most governments have introduced interventions to reduce the spread of the virus causing COVID-19. In Europe and elsewhere, governments have implemented or are implementing measures to control the pandemic. These nonpharmaceutical interventions differ but generally incorporate social distancing (e.g., prohibiting large-scale gatherings and encouraging people not to associate outside their family units), border closures, school closures, and measures to isolate symptomatic people and their contacts [41]. Some control factors should be added to the SIR-F model to investigate the effects of interventions such as school or market closures and lockdowns. In the case of lockdowns, the gs parameter defines the number of days that susceptible persons go out. Each age group in a population will go out for a different number of days. In Table 1, the population pyramid is defined for KSA, and we will estimate the average number of days that each age group goes out. To estimate the gs value precisely, we will use the data listed in Table 2. It is assumed that all schools and offices are closed, that fifty per cent of people work remotely, and that people go out on only one day per week. Before the lockdown, the gs value was 6.07. After the start of the lockdown, the days-out data changed, as shown in Table 8. In Table 8, ten types of people in different age groups are listed according to who goes out during the pandemic. Ultimately, the new value of gs is 3.66. The number of people going out before the lockdown was necessary to estimate gs after the lockdown.
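One plausible reading of the gs estimate described above is a population-weighted average of the number of days per week that each group goes out. The sketch below illustrates that reading only; the group sizes and days-out figures are hypothetical placeholders (the values of Tables 1, 2, and 8 are not reproduced here), and `weighted_days_out` is a name introduced for illustration. The final ratio uses the two gs values reported in the text, 6.07 and 3.66.

```python
def weighted_days_out(groups):
    """Population-weighted average of days per week spent outside the home.

    `groups` is a list of (population, days_out_per_week) pairs.
    """
    total_population = sum(population for population, _ in groups)
    weighted_sum = sum(population * days for population, days in groups)
    return weighted_sum / total_population


# Hypothetical groups: (population in millions, days out per week).
before_lockdown = [(8.0, 6), (22.0, 7), (3.0, 3)]   # school, work, retired
during_lockdown = [(8.0, 1), (22.0, 4), (3.0, 1)]

gs_before_example = weighted_days_out(before_lockdown)
gs_during_example = weighted_days_out(during_lockdown)
print(f"example gs before = {gs_before_example:.2f}")
print(f"example gs during = {gs_during_example:.2f}")

# Ratio of the gs values reported in the text (6.07 before, 3.66 after).
GS_BEFORE, GS_AFTER = 6.07, 3.66
reduction_ratio = GS_AFTER / GS_BEFORE
print(f"reported gs ratio = {reduction_ratio:.3f}")
```

With the reported values, going out falls to roughly 60% of its pre-lockdown level, and it is this ratio, rather than either gs value on its own, that matters when gs is used to rescale the contact rate.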
We updated the population pyramid in Table 8 by updating the school, office, and other columns with the new gs value. The new gs value is applied to the go-out table. We also assumed that workers go to their office one day a week, but they also go out for shopping and other emergency activities. This number is taken to be 1 or 0, reflecting the adoption of a more rule-based way of life. The updated data are presented in Table 9, which will be used to implement the SIR-F model. In the SIR-F model, gs is used as a control parameter of β, so β should change in the same proportion as gs. The effect of the closedown begins at the start date of the third phase. The effect of the lockdown can be seen by controlling the rho parameter in Figure 7, which will be explained in the next section.

Lockdown. All the schools and offices have been closed since 26 July 2020. This lockdown precaution to counter the pandemic will not only decrease the maximum number of affected persons but will also shift the bell-shaped infected curve to later dates; ultimately, the pandemic period will be extended. Figure 7 confirms this prediction: the infected curve is lowered and its peak is delayed to April 2021. In Figure 7, the prediction results of three cases are shown for the next 700 days. The pandemic period ends in April 2022. In Table 10, three cases are given for the lockdown scenario; according to it, the number of infected persons is less than 1.5 M, the total number of recovered patients is about 14 M, and the total number of fatalities decreases to about four hundred thousand (about 392,000).

Table 7 (final rows): daily fatal, infected, recovered, and susceptible numbers.
Index  Date         Fatal   Infected  Recovered  Susceptible
737    14 May 2022  700004  16        25603793   6636187
738    15 May 2022  700004  16        25603794   6636187
739    16 May 2022  700004  15        25603795   6636187
740    17 May 2022  700004  15        25603795   6636186
741    18 May 2022  700004  15        25603796   6636186
742    19 May 2022  700004  14        25603797   6636186
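The lockdown mechanism described above, in which β is rescaled in proportion to the change in gs, can be checked qualitatively with a short simulation. This is a sketch under simplifying assumptions: it uses the basic SIR equations on population fractions rather than the full SIR-F model, applies the reduced β from day 0 instead of from the third phase, and takes illustrative values for `BETA` and `GAMMA`. Only the gs values 6.07 and 3.66 come from the text.

```python
def infected_peak(beta, gamma, i0=1e-4, days=600, steps_per_day=10):
    """Return (peak infected fraction, day of peak) for a basic SIR run."""
    dt = 1.0 / steps_per_day
    s, i = 1.0 - i0, i0
    peak_value, peak_day = i, 0
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt
            s, i = s - new_infections, i + new_infections - gamma * i * dt
        if i > peak_value:
            peak_value, peak_day = i, day
    return peak_value, peak_day


# Illustrative per-day rates (assumptions, not the fitted parameters).
BETA, GAMMA = 0.3, 0.1

# gs values reported in the text: 6.07 before and 3.66 after the lockdown.
GS_BEFORE, GS_AFTER = 6.07, 3.66
beta_lockdown = BETA * GS_AFTER / GS_BEFORE

base_peak, base_day = infected_peak(BETA, GAMMA)
lock_peak, lock_day = infected_peak(beta_lockdown, GAMMA)

print(f"no action: peak {base_peak:.3f} of population on day {base_day}")
print(f"lockdown : peak {lock_peak:.3f} of population on day {lock_day}")
print(f"R0 under lockdown = {beta_lockdown / GAMMA:.2f}")
```

In this toy setting the lockdown run peaks lower and later than the no-action run, which mirrors the behavior reported for Figure 7. The reproductive rate under the rescaled β is still above 1, in line with the conclusion that closures alone slow the epidemic without stopping it.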
The ρ (rho) parameter of the 5th period also decreases accordingly. In Figure 8, the transition parameter from S to I, denoted ρ (rho), does not decrease drastically, which means that the infection rate remains high and we are far from the end of the epidemic. The most important use of ρ is to check the effect of government interventions. Figure 8 also gives the rho parameters for scenario 2, and it is clearly seen that school closures and lockdowns are not enough to stop the epidemic in KSA.

Medicines. New drugs are essential for patients to recover rapidly from the disease. Drug repositioning (i.e., finding successful candidates in the library of existing drugs for other diseases) is used to develop drug candidates to treat COVID-19. The new α and γ parameters are as in equation (11). Figure 9 shows the predicted number of cases with medicine using the SIR-F model for 700 days; the main effect is to reduce the number of fatalities to 52000. The rho, sigma, and $R_t$ ($R_0$) parameters for the case with medicine using the SIR-F model are shown in Figure 10, which indicates that the spread of infection is still very high but that the fatality rate is reduced due to medicine. The predicted numbers for SIR-F with medicine are shown in Table 11. This table shows around 2.5 M infected cases, which shortens the pandemic bell curve.

The primary aim of this study was to evaluate models for providing smart healthcare that are able to predict the onset of pandemics like COVID-19. This study proposes and implements SIR and SIR-F models with ML algorithms and presents both mathematical and numerical analyses and simulation results. The SIR epidemiological model is one of the oldest models and one of the most consequential in the biological sciences.
It captures the most significant features of viral disease transmission, namely "Susceptible," "Infected," and "Recovered" people. The SIR model is applied to predict the data. Based on the SIR model, the pandemic will most probably be controlled by June 2021. The hyperparameters are presented in the various tables, and RMSLE is the primary accuracy metric, quantifying the prediction error between the real data and the SIR models. The reproductive rate was the most significant parameter for forecasting whether there will be a pandemic. Figures 4 and 6 show that despite the reproductive rate being low, the pandemic will grow, and the trend lines show that the interventions by governments or individual isolation are not enough to stop the pandemic. Our proposed model proved to be successful in predicting the peaks and sizes of the COVID-19 outbreaks. Until the end of June 2021, the strategy of early detection and strict monitoring must continue to apply. With clear signs of an epidemic, individuals should be made aware of self-protection measures, including frequent hand washing, either keeping soap on the hands for at least 20 seconds or using a hand sanitizer containing 60% alcohol; avoiding direct contact with sick people; keeping a distance of at least 6 feet from others; covering nose and mouth with a mask; using (and properly disposing of) a tissue to cover sneezes or coughs; and cleaning and sanitizing regularly touched items and surfaces every day. Although the government in a few places has slowly withdrawn lockdown restrictions, there is still a high possibility of outbreaks. The number of new cases of infection is increasing, and people should not let down their guard against this highly contagious disease.

Publicly available datasets were analyzed in this study. These datasets can be found at https://github.com/CSSEGISandData/COVID-19.

The authors declare that they have no conflicts of interest.
A comprehensive review of the COVID-19 pandemic and the role of IoT, drones, AI, blockchain, and 5G in managing its impact
Smart healthcare monitoring: a voice pathology detection paradigm for smart cities
An ISO/IEEE 11073 standardized digital twin framework for health and well-being in smart cities
COVID-19 data with SIR model
PEA: parallel electrocardiogram-based authentication for smart healthcare systems
Architecture of smart health care system using artificial intelligence
Emotion recognition using deep learning approach from audio-visual emotional big data
Formant analysis in dysphonic patients and automatic Arabic digit speech recognition
Methods of moving target detection and behavior recognition in intelligent vision monitoring
Animal image retrieval algorithms based on deep neural network
Arabic sign language recognition and generating Arabic speech using convolutional neural network
Deploying machine and deep learning models for efficient data-augmented detection of COVID-19 infections
A deep learning system for recognizing facial expression in real-time
Cloud-supported cyber-physical localization framework for patients monitoring
Deep features learning for medical image analysis with convolutional autoencoder neural network
Letter: acid secretion by gastric mucous membrane
Cloud-based collaborative media service framework for healthcare
Secure quantum steganography protocol for fog cloud internet of things
Explainable AI and mass surveillance system-based healthcare framework to combat COVID-19 like pandemics
Software defined healthcare networks
Urban healthcare big data system based on crowdsourced and cloud-based air quality indicators
COVID-19 networking demand: an auction-based mechanism for automated selection of edge computing services
Cognitive IoT-cloud integration for smart healthcare: case study for epileptic seizure detection and monitoring
DITrust chain: towards blockchain-based trust models for sustainable healthcare IoT systems
Relational user attribute inference in social media
A new chaotic map with dynamic analysis and encryption application in internet of health things
Applying deep learning for epilepsy seizure detection and brain mapping visualization
Controlled alternate quantum walks based privacy preserving healthcare images in internet of things
The mathematics of infectious diseases
B5G and explainable deep learning assisted healthcare vertical at the edge: COVID-19 perspective
A discrete stochastic model of the COVID-19 outbreak: forecast and control
A time-dependent SIR model for COVID-19 with undetectable infected persons
Emotion-aware connected healthcare big data towards 5G
Modified SEIR and AI prediction of the epidemics trend of COVID-19 in China under public health interventions
Extended SIR prediction of the epidemics trend of COVID-19 in Italy and compared with Hunan, China
The introduction of population migration to SEIAR for COVID-19 epidemic modeling with an efficient intervention strategy
Analysis of COVID-19 spreading in South Korea using the SIR model with time-dependent parameters and deep learning
Population pyramids of the world: Saudi Arabia
The SIR model when S(t) is a multi-exponential function
Comparative prediction of confirmed cases with COVID-19 pandemic by machine learning, deterministic and stochastic SIR models
JHU CSSE COVID-19 dataset. Daily reports