key: cord-0869851-9s7gyocm authors: Tiwari, Dimple; Bhati, Bhoopesh Singh; Al‐Turjman, Fadi; Nagpal, Bharti title: Pandemic coronavirus disease (Covid‐19): World effects analysis and prediction using machine‐learning techniques date: 2021-05-11 journal: Expert Syst DOI: 10.1111/exsy.12714 sha: 6af054bc6fe60ba79ee664675a98a5eb5c55d08c doc_id: 869851 cord_uid: 9s7gyocm

Pandemic novel Coronavirus (Covid‐19) is an infectious disease that spreads primarily through droplets of nasal discharge released when sneezing and saliva released when coughing; it was first reported in Wuhan, China in December 2019. Covid‐19 became a global pandemic with a harmful impact on the world. Academic researchers around the world are proposing many predictive models of Covid‐19 to support key decisions and the enforcement of appropriate control measures. Because accurate Covid‐19 records are lacking and uncertainty is high, standard techniques fail to predict the global effects of the epidemic correctly. To address this issue, we present an Artificial Intelligence (AI)‐based meta‐analysis to predict the worldwide trend of the Covid‐19 epidemic. Three machine learning algorithms, namely Naïve Bayes, Support Vector Machine (SVM) and Linear Regression, were applied to a real time‐series dataset that holds the global record of confirmed, recovered, death and active cases of the Covid‐19 outbreak. Statistical analysis has also been conducted to present various facts regarding observed Covid‐19 symptoms, a list of the Top‐20 Coronavirus‐affected countries and the number of active cases over the world. Among the three machine learning techniques investigated, Naïve Bayes produced the most promising predictions of future Covid‐19 trends, with the lowest Mean Absolute Error (MAE) and Mean Squared Error (MSE). These low MAE and MSE values indicate the effectiveness of the Naïve Bayes regression technique, although the global footprint of this pandemic is still uncertain.
This study demonstrates the various trends and the future growth of the global pandemic to support a proactive response from citizens and governments. This paper sets an initial benchmark that demonstrates the capability of machine learning for outbreak prediction.

Deoke, 2017). Moreover, low platelet counts correspond with higher disease severity scores such as the Multiple Organ Dysfunction Score (MODS), Acute Physiology and Chronic Health Evaluation (APACHE) and Simplified Acute Physiology Score (SAPS) (Vanderschueren et al., 2000). Information Technology and Artificial Intelligence are playing an essential role in the prediction and analysis of Covid-19 trends. Various powerful machine learning algorithms have become handy tools for producing Covid-19 predictions. Mardani et al. (2020) extended the Hesitant Fuzzy Set (HFS) approach with the Weighted Aggregated Sum Product Assessment (WASPAS) and Stepwise Weight Assessment Ratio Analysis (SWARA) methods to rank the issues and challenges of digital technology interventions for controlling the Covid-19 pandemic. Data mining techniques applied to medical science topics have gained popularity because of their performance in predicting outcomes and supporting real-time decisions (Asri et al., 2016). Using various machine learning algorithms and statistical techniques, we try to uncover the hidden trends, unknown facts and their relationships in the real-time time-series dataset of the Covid-19 epidemic. Data mining applications are helpful for making better health policies and preventing hospital errors (Patel et al., 2015). We have selected three algorithms, Naïve Bayes, Support Vector Machine and Linear Regression, to predict the future trends of the spread of Coronavirus in the world on the basis of the current records of this disease. The WHO has maintained a large number of real-time confirmed case records of Covid-19 to discover the unknown facts.
Machine learning techniques can help health care professionals take further decisions for the prevention and control of this pandemic. This paper suggests an intelligent prediction system for the Covid-19 pandemic that incorporates the benefits of (1) real-time Covid-19 pandemic time-series data, (2) visualization of facts related to the pandemic worldwide and (3) automatic future prediction for Covid-19. The major contributions of this work are as follows:
1. A meta-analysis to predict and analyse the trend of the Covid-19 epidemic over the world, with a graphical representation of Covid-19 symptoms, active cases and a list of the top-20 Coronavirus-affected countries.
2. A deep literature survey regarding prediction, screening, contact tracing, forecasting, medication and treatment of Covid-19 using AI techniques.
3. AI-based prediction and forecasting for analysing the trends and growth of the novel Coronavirus.
4. A comparative analysis of Naïve Bayes (NB), Linear Regression (LR) and Support Vector Machine (SVM) techniques on the real-time epidemiological dataset.

The rest of this study is organized as follows: Section 2 presents the literature on Covid-19 and machine-learning-based predictions, Section 3 visualizes the facts related to Covid-19 from a time-series dataset, Section 4 presents the machine learning algorithms applied to analyse the facts and trends of the Covid-19 pandemic, Section 5 presents the methodology, experiments and results of this analytical study and, finally, Section 6 presents the conclusion of this meta-analysis.

The outbreak of the Covid-19 pandemic has generated a need for research in this area, and various researchers have presented their views and ideas on it. Although the outbreak started only at the end of 2019, it has spread across the provinces of many countries, and a number of papers have proposed theories and research related to it.
This section presents research related to Covid-19 and machine-learning-based clinical prediction. A clinical mortality prediction and analysis of Covid-19 was carried out on the records of 150 patients from Wuhan (Ruan et al., 2020). Rothan and Byrareddy (2020) highlighted the transmission, symptoms, epidemiology, pathogenesis and future directions for controlling this epidemic, and concluded that reducing person-to-person transmission is the only solution to control the current outbreak. Kucharski et al. (2020) presented a mathematical model for the early transmission and control of Coronavirus. The model was combined with four datasets of SARS-CoV-2 from within and outside Wuhan to assess the potential for human-to-human transmission of this disease. Yang, Zheng, et al. (2020) presented a meta-analysis of the prevalence of comorbidities and their effects on Covid-19 infected patients and found that the most prevalent symptoms of this disease are fever, cough and fatigue, whereas the most prevalent comorbidities are hypertension and diabetes. Lippi et al. (2020) investigated whether the platelet count in blood samples of non-severe Covid-19 patients differs from that of severely infected patients. Srivastava et al. (2020) predicted the effects of Covid-19 using a parameter estimation method. The effects of lockdown, the speed of Coronavirus spread, the reproduction number and the contact ratio were also analysed. Rahman et al. (2020) proposed a clustering-based framework to analyse the economic impact of the Covid-19 outbreak, using the Malaysian context as a case study to validate the experiments of the proposed algorithm. Karmore et al. (2020) focused on developing a cost-effective Medical Diagnosis Humanoid (MDH) for testing the symptoms of Coronavirus in the human body. Additionally, the relation of thrombocytopenia to severe Covid-19 has been evaluated, and the results showed that low platelet counts correspond to the severity of illness in Covid-19 infected patients.
A systematic review and meta-analysis were performed using three datasets to assess the clinical, laboratory and imaging features of confirmed Covid-19 cases (Rodriguez-Morales et al., 2020). Fang et al. (2020) reported that diabetes and hypertension patients are prone to infection by Coronavirus and suggested that cardiac patients, hypertension patients, diabetic patients and people treated with ACE2-increasing drugs are at higher risk of Covid-19. Alimadadi et al. (2020) suggested that machine learning and artificial intelligence are powerful techniques for fighting the Covid-19 epidemic that can help in prevention, therapeutics, diagnosis and in-hospital operations. Wynants et al. (2020) presented a critical appraisal and systematic review of prediction models for detecting Coronavirus infection and concluded that prediction models have earned a place in the literature for supporting medical decisions. Al-Turjman and Deebak (2020) presented a Privacy-Aware Energy-Efficient Framework (P-AEEF) protocol for securing the information of Covid-19 patients. The proposed protocol improved energy efficiency and security against malicious access. Yang, Zeng, et al. (2020) predicted Covid-19 epidemic trends by integrating data from before and after 23 January 2020 into a Susceptible-Exposed-Infectious-Removed (SEIR) model to generate the epidemic curve. They concluded that the epidemic in China would peak in late February and decline gradually by the end of April. Peng et al. (2020) presented dynamic modelling to analyse the Covid-19 epidemic in China. Researchers have also presented time-series forecasts of Coronavirus trends for different countries. Chimmula et al. Petropoulos and Makridakis (2020) introduced an objective approach for the continuous prediction of Covid-19. The forecast suggests a continuous increase in confirmed Coronavirus cases, with associated uncertainty.
The exponential smoothing family has been used to produce the forecasts; it is well suited to forecasting short-duration patterns with additive and multiplicative combinations. Hu et al. (2020) presented AI-based forecasting of Covid-19 to find the trends and effects of the pandemic in China. It estimates the length, size and ending time of the Coronavirus outbreak across China. A modified stacked encoder was developed for the prediction; it is capable of forecasting confirmed Covid-19 cases in real time. In Ceylan (2020), various ARIMA models were formulated with different parameters. The forecasts and predictions made by the models help in deciding on precautions and formulating policy for the outbreak. Salgotra et al. (2020) provided genetic-programming-based forecasting of Covid-19 trends in India. Various statistical parameters and explicit formulas were used to measure the effectiveness of the forecasting model. They concluded that genetic-programming-based models rely on simple linkage functions and provide highly reliable time-series forecasting results. Lalmuanawma et al. (2020) presented a comprehensive review of the role of AI and machine learning in predicting, forecasting, screening and drug development for Covid-19 and related epidemics. They stated that AI and machine learning have remarkably improved medication, screening, prediction and forecasting for Covid-19 and have reduced human intervention in medical practice. Tuli et al. (2020) applied a machine-learning-based mathematical model to measure the threat of Covid-19 over the world. An iterative-weighting-based generalized framework was developed for real-time prediction of the epidemic. The proposed model achieved higher accuracy and can be helpful in taking Covid-19-related decisions. Vaishya et al. (2020) presented the role of AI as a decisive technology in the fight against Coronavirus.
They concluded that healthcare departments need AI technology to handle the Covid-19 outbreak and require proper real-time suggestions to reduce the spread. Wang, Zheng et al. (2020) integrated the most recent Covid-19 epidemiological dataset and fitted it to a Logistic model to analyse the epidemic trends. The cap value was then fed into the Fbprophet model to draw the pandemic curve and the predictions. The proposed mathematical model estimated that the global pandemic would peak in late October, with approximately 14.12 million people infected cumulatively. Tiwari and Bhati (2020) presented a prediction of Covid-19 for India using Gradient-Boost, Extra-Tree, AdaBoost and Random-Forest and concluded that machine learning is an efficient approach for predicting the outbreak.

Machine learning is a functional and practical tool for prediction and classification problems. It helps decision makers in various fields and also provides strong results in medical diagnosis and the prediction of disease-related facts. A. R. Mishra et al. (2020) proposed a novel approach based on intuitionistic fuzzy sets, with new parametric divergence measures, to assess health-care waste disposal techniques. Asri et al. (2016) used machine learning algorithms for predicting and diagnosing the effects and risk of breast cancer, using the real Wisconsin breast cancer dataset. SVM outperformed Naïve Bayes, k-Nearest Neighbour and Decision Tree, reaching 97.13% accuracy. Kourou et al. (2015) stated that machine learning tools can reveal key features from complex datasets and that a variety of techniques such as Decision Trees (DTs), SVMs, Bayesian Networks (BNs) and Artificial Neural Networks (ANNs) are widely applicable to the prediction and prognosis of disease.
However, it is also evident that ML improves the understanding of cancer detection and results in more effective decision making. Bhatla and Jyoti (2012) analysed heart disease prediction with various machine learning techniques and found that a Neural Network with 15 attributes outperforms the other techniques, while the Decision Tree also provides good accuracy when combined with feature subset selection and genetic algorithms. Książek et al. (2019) proposed a novel machine-learning-based approach to detect hepatocellular carcinoma at an early stage. A 5-fold Genetic Algorithm, SVM, feature selection and normalization were applied to obtain the best prediction results in terms of F1-score, reported as 0.8849 and 0.8762. Long et al. (2015) proposed a heart disease diagnosis system using an Interval Type-2 Fuzzy Logic System (IT2FLS) and a rough-sets-based reduction system that handles the uncertainty and high dimensionality of the dataset. This literature on machine-learning-based prediction in medical diagnosis motivates us to predict the facts, effects and future trends of the Covid-19 outbreak in the entire world using machine learning techniques. Medhekar et al. (2013) presented Naïve Bayes heart disease prediction using five basic categories: low, avg, high, very high and no. It reports accuracies of 88.76, 89.58 and 88.96 along with a heart disease risk prediction. Pattekari et al. (2012) developed a Naïve Bayes-based intelligent system to predict the risk of heart disease, which is capable of answering complex queries related to heart disease diagnosis and can assist medical practitioners in taking decisions. It has been concluded that the Naïve Bayes system is the most

Coronaviruses are a large family of viruses that can affect animals or humans.
In humans, Coronaviruses affect the respiratory system, causing illness that ranges from the common cold to severe diseases such as Middle East Respiratory Syndrome (MERS) and Severe Acute Respiratory Syndrome (SARS). Covid-19 is a recent outbreak, caused by a recently discovered Coronavirus, that has affected the entire world. This novel disease was unknown before it first surfaced in Wuhan city, China in December 2019. In this section, we focus on the symptoms of Covid-19 and on how the disease has affected the entire world in terms of confirmed, recovered, death and active cases. Two real-time datasets were collected from Kaggle.com. The first dataset contains a cumulative count of worldwide recovered, confirmed and death cases of Covid-19 from 22 January 2020 to 19 May 2020, and the second dataset stores the global time-series records of Covid-19 over the same period. Table 1 lists the symptoms usually found in Covid-19 patients, ordered from higher to lower frequency (Coronavirus Symptoms information, 2020). and conjunctival congestion has been found in rare cases in patients with Covid-19. The word cloud (Figure 3) of these symptoms shows the high-frequency words present in the Covid-19 symptoms dataset. Figure 4 depicts the active cases of the Covid-19 pandemic by country from 22 January 2020 to 19 May 2020. Here active cases have been calculated by subtracting the number of recovered cases and the number of death cases from the total number of confirmed cases, and darker shades represent a higher number of active cases. The colours of the geographical map are classified as >1, >200, >400, >600, >800 and >1000, where >1000 marks the high-alert countries of the Covid-19 outbreak. Figure 5 represents the Covid-19 confirmed, recovered, death and active cases of the entire world; these graphs are drawn from the Covid-19 time-series dataset from 22 January 2020 to 19 May 2020.
After that, Figure 6 shows the daily increase and decrease in confirmed, recovered and death cases of the Covid-19 pandemic, based on the time-series dataset from 22 January 2020 to 19 May 2020. Finally, Figure 7 presents a graph of confirmed, recovered and death cases for the Top-5 Covid-19 affected countries.

This section presents the machine learning algorithms that have been used to predict the world effects and trends of the Covid-19 outbreak. Naïve Bayes, SVM and Linear Regression are powerful machine learning algorithms that were used by various researchers for predicting

Naïve Bayes is a simple yet robust prediction algorithm. In machine learning, we are frequently interested in selecting the best hypothesis (h) given the data (d). Naïve Bayes is based on Bayes' Theorem, which provides a way to calculate the probability of a hypothesis given our prior knowledge:

$$P(h \mid d) = \frac{P(d \mid h)\,P(h)}{P(d)}$$

where $P(h \mid d)$ is the probability of hypothesis h given the data d (the posterior), $P(d \mid h)$ is the probability of the data d given that hypothesis h is true, $P(h)$ is the prior probability of hypothesis h and $P(d)$ is the prior probability of the data d. In this way, the posterior probability $P(h \mid d)$ is calculated from $P(h)$, $P(d)$ and $P(d \mid h)$, and predictions can be made for new data using Bayes' Theorem. The mathematics behind Naïve Bayes is fairly deep, but its implementation is relatively simple. The probability of class $C_k$ given the predictor values $X$ is one over $Z$ times the prior probability of class $C_k$, times the product of the probabilities of each predictor value $x_i$ given class $C_k$ (Naïve Bayes information, 2020), where $Z$ is a normalizing constant. Naïve Bayes captures the uncertainty of the model through the probabilities of the outcomes, and it can help solve predictive and diagnostic problems (Medhekar et al., 2013).
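The class-posterior rule described above can be illustrated with a short sketch. This is only a toy illustration in plain Python, not the implementation used in the experiments: the function name `naive_bayes_posterior`, the class labels and all probability values are hypothetical and chosen purely to show how the prior, the per-feature likelihoods and the normalizing constant Z combine.

```python
from math import prod

def naive_bayes_posterior(priors, likelihoods, x):
    """Return P(C_k | x) for every class k.

    priors:      {class: P(C_k)}
    likelihoods: {class: [{value: P(x_i = value | C_k)} for each feature i]}
    x:           observed feature values, one per feature
    """
    # Unnormalized score per class: P(C_k) * prod_i P(x_i | C_k)
    scores = {
        c: priors[c] * prod(likelihoods[c][i][v] for i, v in enumerate(x))
        for c in priors
    }
    z = sum(scores.values())  # normalizing constant Z
    return {c: s / z for c, s in scores.items()}

# Hypothetical toy numbers (not taken from the Covid-19 dataset).
priors = {"rising": 0.6, "falling": 0.4}
likelihoods = {
    "rising":  [{"high": 0.7, "low": 0.3}, {"yes": 0.8, "no": 0.2}],
    "falling": [{"high": 0.2, "low": 0.8}, {"yes": 0.4, "no": 0.6}],
}
posterior = naive_bayes_posterior(priors, likelihoods, ["high", "yes"])
print(posterior)
```

Because every class score is divided by the same Z, the posteriors always sum to one, and the predicted class is simply the one with the largest score.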
$$P(C_k \mid X) = \frac{1}{Z}\,P(C_k)\prod_{i=1}^{n} P(x_i \mid C_k) \tag{3}$$

$$Z = \sum_{k} P(C_k)\prod_{i=1}^{n} P(x_i \mid C_k)$$

SVM is a supervised algorithm that uses a nonlinear mapping to transform the training data into a higher dimension, where it searches for the optimal linear separating hyperplane (Sonavane et al., 2013). The SVM fixes the hyperplane with the help of margins and support vectors. SVM has the advantage that it is less prone to overfitting than other methods and provides a compact description of the learned model. SVM is based on finding the best hyperplane. Hyperplanes are decision boundaries in a multi-dimensional space: in two dimensions the boundary is a line, in three dimensions it is a plane, and in higher dimensions it is called a hyperplane. The function of a line can be formulated as:

$$y = ax + b$$

where x and y are treated as features and renamed $x_1, x_2, \ldots, x_n$. The equation of the hyperplane is then written as:

$$w \cdot x + b = 0$$

SVM works with a hypothesis, and the hypothesis function can be defined as:

$$h(x_i) = \begin{cases} +1 & \text{if } w \cdot x_i + b \ge 0 \\ -1 & \text{if } w \cdot x_i + b < 0 \end{cases}$$

For computing the margin of the hyperplane, the expression is as follows:

$$\frac{1}{n}\sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i\,(w \cdot x_i + b)\bigr)$$

Linear Regression is a popular predictive technique. It first searches for the set of variables best suited to prediction and then for the most informative variables within that set. It is based on the sign and magnitude of the beta estimates; these regression estimates explain the relationship between one dependent variable (y) and one or more independent variables (x). The Linear Regression equation is as follows:

$$y = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n$$

where y is the dependent variable, $x_1, x_2, \ldots, x_n$ are the independent variables, $b_0$ is the intercept, $b_1, b_2, \ldots, b_n$ are the coefficients and n is the number of independent variables. Linear regression models are accessible and practical for solving prediction problems (Aghdaei et al., 2017). When there is a single input variable, the model is called simple linear regression; when there are multiple input variables, it is called multiple regression.
Ordinary Least Squares is a common technique for training the linear regression model.

Experiments were conducted in Python using Jupyter Notebook on the cumulative-count and time-series datasets of the Covid-19 pandemic. Our aim is to evaluate and predict the future cases of Covid-19 from the previous trend using machine learning algorithms. To achieve this goal, the Naïve Bayes, SVM and Linear Regression techniques were applied and comparatively tested; they are among the most potent predictive techniques. The framework is given in Figure 8, which represents the flow and procedure of implementing the prediction models on the Covid-19 pandemic dataset. First, the procedure starts with domain understanding, where the problem is analysed and its objective is discussed. The second phase is data understanding: before implementing any solution, it is necessary to understand the structure of the data. The third phase, feature selection, is very important, because it decides which features of the data the future predictions are based on and which attributes are directly related to the prediction. Before the implementation, the dataset is also pre-processed to obtain effective results; only then is the real-time Covid-19 dataset ready for use. In the fourth phase, the data is split into two parts, a training part and a testing part, where 42% of the data is selected for testing the predictions. Fifth, the prediction algorithms Naïve Bayes, SVM and Linear Regression are applied to the real Covid-19 dataset. The sixth and final phase is the comparative study of the algorithms' predictions of the worldwide spread of Covid-19.

The process of forecasting starts with data collection. An accurate dataset is required for trustworthy forecasting results.
The actual time-series dataset of the Covid-19 outbreak has been used to predict the world effects and trends. The dataset was collected from Kaggle.com, a popular website that provides useful datasets. Several datasets have been used to perform the experiments related to the Covid-19 prediction. Table 3 describes all of them.

Feature selection is the selection of appropriate variables from the dataset. It plays a significant role in boosting the performance and accuracy of prediction techniques. Feature selection is a dimensionality-reduction process that helps extract the needed information from a large dataset and reduces processing time while improving performance. From the Confirmed, Recovered and Death cases datasets, the fourth through the last columns were selected; they hold the daily Covid-19 case counts from the first date, 22 January 2020, to the last date, 19 May 2020. From the Covid-19 world cases dataset, the Country/Region, Confirmed and Deaths columns were selected as key features, whereas active cases were calculated as:

$$\text{Active} = \text{Confirmed} - \text{Recovered} - \text{Deaths}$$

Figure 8: The procedure of the Covid-19 analytical study using machine-learning techniques

The model is trained on the training set; this is known as the learning phase. Once the model has learned the features and attributes of the data, it is applied to the test set for future predictions. Here 42% of the dataset was used for testing and 58% for training, with the aim of obtaining more accurate predictions and results for the Covid-19 outbreak. A larger testing set gives a more reliable estimate of prediction accuracy than a smaller one.

Hyperparameter tuning is required to optimize the performance of AI algorithms. Various hyperparameters have been selected for the Naïve Bayes, SVM and Linear Regression algorithms. Table 4 lists the selected hyperparameters of the applied techniques.
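The active-case calculation (confirmed minus recovered minus deaths, as defined in Section 3) and the 58%/42% train/test split can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions: the function names `active_cases` and `chronological_split` and the toy counts are hypothetical, and holding out the most recent 42% of the series without shuffling is an assumption, since the text does not say how the split was drawn.

```python
def active_cases(confirmed, recovered, deaths):
    """Active = Confirmed - Recovered - Deaths, element-wise per day."""
    return [c - r - d for c, r, d in zip(confirmed, recovered, deaths)]

def chronological_split(series, test_fraction=0.42):
    """Hold out the last `test_fraction` of a time series for testing.

    Assumption: the split keeps time order (no shuffling).
    """
    n_test = round(len(series) * test_fraction)
    cut = len(series) - n_test
    return series[:cut], series[cut:]

# Hypothetical toy counts for three consecutive days.
confirmed = [100, 150, 210]
recovered = [10, 30, 55]
deaths = [5, 10, 15]
print(active_cases(confirmed, recovered, deaths))

# 22 January 2020 to 19 May 2020 spans 119 daily observations.
days = list(range(119))
train_days, test_days = chronological_split(days)
print(len(train_days), len(test_days))
```

For a forecasting task, a time-ordered split of this kind keeps all test observations later than the training observations, so the evaluation resembles predicting genuinely unseen future days.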
The outcomes of these predictive algorithms are measured in terms of Mean Absolute Error (MAE) and Mean Squared Error (MSE). The future prediction of Covid-19 cases around the world is also depicted by graphs of actual versus predicted cases. The absolute error is the difference between the actual value $y_i$ and the predicted value $\hat{y}_i$ with its sign ignored, and it is calculated as:

$$e_i = |y_i - \hat{y}_i|$$

MAE averages this error over all $n$ samples of the dataset, which is represented as (Willmott & Matsuura, 2005):

$$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n} |y_i - \hat{y}_i|$$

MSE is calculated by averaging the squares of the errors; it is the mean squared difference between the actual and predicted values. The MSE of the ensemble mean is never larger than the arithmetic mean of the MSEs of the individual simulators (Rougier, 2016). The equation of MSE is as follows:

$$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

MAE and MSE are calculated for all the prediction algorithms applied in this study to predict the Covid-19 pandemic cases over the world. They show the average difference between the actual and predicted cases of Covid-19 and hence the effectiveness of each prediction model.

The Naïve Bayes prediction algorithm has been applied to predict the future cases of Covid-19, and the best parameters have been selected for the Naïve Bayes experiment. Naïve Bayes produces MAE = 488806.7492 and MSE = 400919367451.7439 on the testing set of the real Covid-19 pandemic dataset, the lowest errors among the three techniques. Figure 9 depicts the graph of test confirmed cases versus Bayesian predictions. Figure 10 represents the total confirmed cases of Covid-19 in the world from 22 January 2020 to 19 May 2020 together with the cases predicted by the Naïve Bayes algorithm, where the x-axis shows the number of cases and the y-axis shows the date of occurrence of the confirmed cases. This graph shows that Naïve Bayes effectively predicted the confirmed cases of the Covid-19 pandemic for the world.
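The two error measures defined earlier in this section can be computed directly from paired lists of actual and predicted values. The sketch below uses plain Python; the function names and the toy numbers are hypothetical and are not taken from the Covid-19 experiments.

```python
def mean_absolute_error(actual, predicted):
    """Average of |y_i - y_hat_i| over all samples."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_squared_error(actual, predicted):
    """Average of (y_i - y_hat_i)^2 over all samples."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical toy values: the errors are 10, 10 and 30.
actual = [100, 200, 300]
predicted = [110, 190, 330]

mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
print(mae, mse)
```

Because MSE squares each error before averaging, a single large miss (the error of 30 here) dominates it far more than it dominates MAE, which is why the two measures can rank models differently.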
Table 5 presents the 10-day prediction of the Covid-19 pandemic for the entire world, from 20 May 2020 to 29 May 2020, made by Naïve Bayes.

SVM works with hyperplanes and support vectors, where the support vectors are the data points closest to the hyperplane; they influence its position and orientation. The main objective of this technique is to find the hyperplane with the maximum margin. We applied this technique to the Covid-19 pandemic dataset with tuned values of the hyperparameters gamma, epsilon, shrinking and degree; tuning these hyperparameters improved the performance of the prediction technique. The SVM experiment on the Covid-19 time-series data gives MAE = 718150.1344 and MSE = 565545811024.1667. Both values are greater than the MAE and MSE of Naïve Bayes, which shows that Naïve Bayes produced better predictions than SVM. Figure 11 represents the graph of the test confirmed cases versus the SVM prediction. Because the MAE and MSE values are greater, the gap between the test data and the SVM prediction is also larger, which shows that the SVM prediction technique is less effective. Figure 12 represents the total number of confirmed Coronavirus cases all over the world versus the SVM prediction, and it also shows a somewhat larger difference between the actual and predicted values.

Figure 9: The test confirmed cases vs Bayesian prediction of Covid-19 in the world

Table 6 presents the 10-day prediction of Covid-19 from 20 May 2020 to 29 May 2020 made with the SVM technique.

The regression technique finds the relation between input (x) and output (y); it is a very popular technique for future forecasting. In our experiment, the Linear Regression technique has been used to predict the future confirmed cases of Covid-19 from the trend of previously confirmed cases.
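A trend-based forecast of this kind can be sketched with simple linear regression fitted by ordinary least squares. The sketch below is a hedged illustration only: it assumes the day index is the single predictor, which the text does not state, and the function names `fit_line` and `forecast` as well as the toy case counts are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = b0 + b1 * x by ordinary least squares (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx              # slope
    b0 = mean_y - b1 * mean_x   # intercept
    return b0, b1

def forecast(b0, b1, future_xs):
    """Evaluate the fitted line at future day indices."""
    return [b0 + b1 * x for x in future_xs]

# Hypothetical toy series: 100 cases on day 0, growing by 20 per day.
days = [0, 1, 2, 3, 4]
cases = [100, 120, 140, 160, 180]

b0, b1 = fit_line(days, cases)
next_two_days = forecast(b0, b1, [5, 6])
print(b0, b1, next_two_days)
```

A straight line extrapolates a constant daily increase, so on a series that grows faster than linearly it will tend to underestimate later values; this is one reason the errors of such a model can be large on epidemic data.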
Linear Regression produces MAE = 648733.0991 and MSE = 913583889578.4996 on the Covid-19 pandemic dataset. Linear Regression thus has a lower MAE than SVM but a higher MSE, and it is less accurate than Naïve Bayes on both measures. Figure 13 depicts the test confirmed cases of Covid-19 versus the Regression predicted cases in the world. Figure 14 represents value than SVM and predicts better than SVM.

The predicted outcomes of Naïve Bayes are almost identical to the actual confirmed cases of Coronavirus. It can therefore be said that the forecasting of future Covid-19 cases by Naïve Bayes is more trustworthy than that by SVM and Regression. Further, a meta-analysis has been presented, which shows various perspectives on the novel Coronavirus. The graphs of Section 3 plot the statistics related to the major symptoms, the active cases and the list of top-20 Covid-19 affected countries up to 19 May 2020. Figure 6 shows the daily increment of Covid-19 confirmed, recovered and death cases from 22 January 2020 to 19 May 2020. The US, Russia, Brazil, the UK and Spain were the top five countries facing the Covid-19 outbreak up to 19 May 2020. This paper also reviews previous research on Covid-19 trend prediction and finds that machine learning and AI are rapidly gaining popularity in forecasting, screening, drug development and contact tracing. AI is not only convenient for treating Covid-19 patients but also helps governments take appropriate decisions. Although most AI techniques are not yet ready to work in real environments, they remain remarkable tools for tackling the outbreak.
REFERENCES
Linear regression models for prediction of annual heating and cooling demand in representative Australian residential dwellings
Prediction of arterial blood gas values from venous blood gas values in patients with acute exacerbation of chronic obstructive pulmonary disease
Artificial intelligence and machine learning to fight COVID-19
Privacy-aware energy-efficient framework using the internet of medical things for COVID-19
Using machine learning algorithms for breast cancer risk prediction and diagnosis
Presumed asymptomatic carrier transmission of COVID-19
An analysis of heart disease prediction using different data mining techniques
Disease prediction by machine learning over big data from healthcare communities
Time series forecasting of COVID-19 transmission in Canada using LSTM networks
Automated diagnosis of coronary artery disease (CAD) patients using optimized SVM
Prediction system for heart disease using Naive Bayes and particle swarm optimization
Are patients with hypertension and diabetes mellitus at increased risk for COVID-19 infection?
Parkinson's disease prediction using machine learning approaches
A robust algorithm for classification and diagnosis of brain disease using local linear approximation and generalized autoregressive conditional heteroscedasticity model
Artificial intelligence forecasting of covid-19 in china
Predicting the morbidity of chronic obstructive pulmonary disease based on multiple locally weighted linear regression model with K-means clustering
IoT based humanoid software for identification and diagnosis of Covid-19 suspects
Thrombocytopenia in critically Ill patients: Clinical and laboratorial behavior and its correlation with short-term outcome during hospitalization
Machine learning applications in cancer prognosis and prediction
A novel machine learning approach for early detection of hepatocellular carcinoma patients
Early dynamics of transmission and control of COVID-19: A mathematical modelling study. The Lancet Infectious Diseases
Applications of machine learning and artificial intelligence for Covid-19 (SARS-CoV-2) pandemic: A review
Thrombocytopenia is associated with severe coronavirus disease 2019 (COVID-19) infections: A meta-analysis
A highly accurate firefly-based algorithm for heart disease prediction
Time series modelling to forecast the confirmed and recovered cases of COVID-19
A novel extended approach under hesitant fuzzy sets to design a framework for assessing the key challenges of digital health interventions adoption during the COVID-19 outbreak
Heart disease prediction system using naive Bayes
Multiple ensemble neural network models with fuzzy response aggregation for predicting COVID-19 time series: The case of Mexico
A novel EDAS approach on intuitionistic fuzzy set for assessment of health-care waste disposal technology using new parametric divergence measures
Dengue disease spread prediction using twofold linear regression
An analytical method for diseases prediction using machine learning techniques
A hybrid intelligent system for the prediction of Parkinson's Disease progression using machine learning techniques
Heart disease prediction using machine learning and data mining technique
Prediction system for heart disease using Naïve Bayes
Epidemic analysis of COVID-19 in China by dynamical modeling
Forecasting the novel coronavirus COVID-19
Data-driven dynamic clustering framework for mitigating the adverse economic impact of Covid-19 lockdown practices
An extended Pythagorean fuzzy complex proportional assessment approach with new entropy and score function: Application in pharmacological therapy selection for type 2 diabetes
Clinical, laboratory and imaging features of COVID-19: A systematic review and meta-analysis
The epidemiology and pathogenesis of coronavirus disease (COVID-19) outbreak
Ensemble averaging and mean squared error
Clinical predictors of mortality due to COVID-19 based on an analysis of data of 150 patients from Wuhan
Time series analysis and forecast of the COVID-19 pandemic in India using genetic programming
A review of coronavirus disease-2019 (COVID-19)
Novel approach for localization of indian car number plate recognition system using support vector machine
A systematic approach for COVID-19 predictions and parameter estimation. Personal and Ubiquitous Computing
A deep analysis and prediction of COVID-19 in India: Using ensemble regression approach
Predicting the growth and trend of COVID-19 pandemic using machine learning and cloud computing
Artificial intelligence (AI) applications for COVID-19 pandemic
Thrombocytopenia and prognosis in intensive care
Liver disease prediction using SVM and Naïve Bayes algorithms
Kidney disease prediction using SVM and ANN algorithms
A novel coronavirus outbreak of global health concern
Prediction of epidemic trends in COVID-19 with logistic model and machine learning technics
Updated understanding of the outbreak of 2019 novel coronavirus (2019-nCoV) in Wuhan
Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance
World Health Organization Situation reports
Prediction models for diagnosis and prognosis of covid-19: Systematic review and critical appraisal
Prevalence of comorbidities and its effects in patients infected with SARS-CoV-2: A systematic review and meta-analysis
AUTHOR BIOGRAPHIES
Dimple Tiwari is a PhD scholar in the Computer Science and Engineering Department at Ambedkar Institute of Advanced Communications Technologies and Research (AIACTR), Delhi, India. Her areas of interest include Sentiment Analysis, Artificial Intelligence, Information Security, Internet of Things (IoT) and Big Data.
She is also a Microsoft certified in
Bhoopesh Singh Bhati is an Assistant Professor in Ambedkar Institute of Advanced Communication Technologies & Research Govt
He is a recognized, active reviewer of various reputed journals of IEEE
He is a leading authority in the areas of smart/intelligent systems. His publication history spans over 250 publications in journals, conferences, patents, books, and book chapters
She has 21 years of teaching experience. Her areas of interest include Sentiment Analysis, Artificial Intelligence, Information Security, Data Mining and Data Warehouse, Internet of Things (IoT) and Big Data. She has published various research papers in reputed International Journals / Conferences and contributed Book Chapters
The authors declare no conflicts of interest.
The data that support the findings of this study are openly available in Kaggle.com at https://www.kaggle.com/aayushiagrawall/novel-dataset (Aghdaei et al., 2017).
https://orcid.org/0000-0001-8476-2798
Fadi Al-Turjman https://orcid.org/0000-0001-5418-873X