key: cord-0787472-ouc1xcts
authors: Mydukuri, Rathnamma V; Kallam, Suresh; Patan, Rizwan; Al-Turjman, Fadi; Ramachandran, Manikandan
title: Deming least square regressed feature selection and Gaussian neuro-fuzzy multi-layered data classifier for early COVID prediction
date: 2021-03-26
journal: Expert Syst
DOI: 10.1111/exsy.12694
sha: 7da16f7859a7d8efbad9e2ffdea2dffaf3f8af3a
doc_id: 787472
cord_uid: ouc1xcts

Coronavirus disease (COVID-19) is a harmful disease caused by the novel SARS-CoV-2 virus. COVID-19 comprises symptoms such as cold, cough, fever, and difficulty in breathing. COVID-19 has affected many countries, and its spread across the world has put humanity at risk. Due to the increasing number of cases and the resulting stress on administrations as well as health professionals, different prediction techniques have been introduced to predict the presence of coronavirus disease in patients. However, accuracy was not improved and time consumption was not minimized during disease prediction. To address these problems, the least square regressive Gaussian neuro-fuzzy multi-layered data classification (LSRGNFM-LDC) technique is introduced in this article. The LSRGNFM-LDC technique performs efficient COVID prediction with better accuracy and lesser time consumption through feature selection and classification. Preprocessing is used to eliminate the unwanted data in the input features and is applied to reduce the time complexity. Next, the Deming least square regressive feature selection (DLSRFS) process is carried out for selecting the most relevant features by identifying the line of best fit. After the feature selection process, the Gaussian neuro-fuzzy classifier in the LSRGNFM-LDC technique performs the data classification process with the help of fuzzy if-then rules. Finally, the fuzzy if-then rules classify the patient data into lower risk, medium risk, and higher risk levels with higher accuracy and lesser time consumption.
Experimental evaluation is performed on the Novel Corona Virus 2019 Dataset using different metrics such as prediction accuracy, prediction time, and error rate. The results show that the LSRGNFM-LDC technique improves accuracy and minimizes time consumption as well as error rate compared with existing works during COVID prediction.

New approaches were introduced in Marmarelis (2020) and Waheed et al. (2020) depending on data-guided detection and infection waves explained by Riccati models with estimated parameters. The designed approach was employed with COVID-19 daily time-series data, leading to an epidemic time-course decomposition. An artificial-intelligence method was introduced in Alazab et al. (2020) depending on a deep convolutional neural network (CNN) to identify COVID-19 patients with real-world datasets. But the computational cost was not minimized by the artificial-intelligence method. A new approach was introduced in Abdel-Basset et al. (2020b). Its conditioner path takes a pair of CT images and their labels as input and produces a meaningful representation of the information, which is transmitted to the segmentation path to segment new images. To support actual interaction between both paths, an adaptive recalibration and recombination (RR) module was proposed that permits concentrated data exchange between the paths with only a trivial increase in computational complexity. A new framework was introduced in Abdel-Basset et al. (2020a) with respect to heart rate and sleep data from wearable devices to estimate the epidemic development of COVID-19 in different cities. However, the computational complexity was not minimized. An objective approach was introduced in Petropoulos and Makridakis (2020) to forecast COVID-19. The data was reliable, and disease detection was performed with an increase in confirmed COVID-19 cases with sizable, connected uncertainty. But the computational cost was not minimized by the objective approach.
A nonlinear machine learning method was introduced in Kavadi et al. (2020) for global pandemic prediction of COVID-19. A progressive partial derivative linear regression model was introduced to identify the best parameters efficiently. However, the prediction accuracy was not improved by the designed method. A new methodology was introduced in Fokas et al. (2020) for predicting the time evolution of individuals infected with SARS-CoV-2 in each country. But the prediction time was not minimized by the designed methodology. An online forecasting mechanism was introduced in Abdulmajeed et al. (2020) to stream the data from the Nigeria Centre for Disease Control. The designed mechanism also updated an ensemble model for providing COVID-19 updates. However, the error rate was not minimized by the online forecasting mechanism. An innovative educational technique was introduced in DeFilippis et al. (2020) to provide experiential learning. Fellows-in-training (FITs) were supported in the early stage of the COVID-19 pandemic, and the challenges in the United States were anticipated. But the space complexity was not minimized by the innovative educational technique. An AI platform was introduced in Ke et al. (2020) to find drugs with anti-coronavirus activities. Though the error rate was minimized, the time consumption for prediction was not minimized. New technology was introduced in Vaishya et al. (2020) to identify clusters of cases and to forecast the virus by collecting and examining all previous data. But the computational cost was not minimized. A new hybrid approach (HSMA_WOA) was introduced in Abdel-Basset et al. (2020c) depending on SMA and WOA to resolve image segmentation problems (ISP) by identifying the optimal threshold values. A novel meta-heuristic algorithm named the slime mould algorithm (SMA) was presented for enhancing Kapur's entropy with the whale optimization algorithm (WOA).
The designed algorithm failed to solve flow shop scheduling issues, DNA fragment assembly issues, and parameter evaluation of the photovoltaic solar cell. A novel IoT-based decision-making model was developed by Abdel-Basset et al. (2020c) and Al-Turjman and Deebak (2020) to find and examine patients with type-2 diabetes. A new decision-making model was introduced on type-2 neutrosophic numbers by the VIKOR method, and a decision support system was introduced for the precise prediction of type-2 diabetes risks for patients. But the accuracy was not improved. Data analytics and visualization were introduced in Abdel-Basset et al. (2019) to examine malignant tumors and find weak spots of the tumor. However, it failed to generate a greater positive impact and influence. A MapReduce framework and fusion algorithm were introduced in Chang (2018a, 2018b) for medical imaging simulations. But more varieties of gene simulations were not considered.

The coronavirus disease 2019 (COVID-19) pandemic has resulted in the proliferation of clinical prediction models for diagnosis, disease severity estimation, and prognosis. COVID-19, which originated from a β-coronavirus, was reported in December 2019 in Wuhan city, China. On March 11, 2020, COVID-19 was declared a public health emergency of international concern by the World Health Organization (WHO). As of August 17, 2020, the Times of India website reported that India's overall coronavirus cases crossed 26 lakh, with 50,921 deaths. Table 1 lists the detailed description of the variables used in this paper. The coronavirus gets directly transmitted through cough, contact transmission, sneezing, and respiratory globules. Patients are infectious before the onset of clinical symptoms, and contagiousness lasts up to three weeks after recovery. Also, patients with mild symptoms were identified as infective. But the existing research failed to predict COVID with higher accuracy and lesser time consumption.
To address the existing problems, the LSRGNFM-LDC technique is introduced. The main objective of the LSRGNFM-LDC technique is to perform efficient COVID prediction with better accuracy and lesser time consumption. The LSRGNFM-LDC technique comprises two processes, namely feature selection and classification. In this technique, preprocessing is first carried out to remove the unwanted data from the Novel Corona Virus 2019 dataset; preprocessing is employed to minimize the time complexity. Next, the DLSRFS process is used in the LSRGNFM-LDC technique for choosing the most relevant features. The data classification is then carried out by applying the neuro-fuzzy classifier to perform the prediction process with the aid of the relevant features. The architecture diagram of the LSRGNFM-LDC technique is illustrated in Figure 1. The preprocessing eliminates the unwanted data with lesser time complexity. After preprocessing, the regression process is carried out to select the most relevant features via identifying the line of best fit. After the feature selection process, fuzzy rules are constructed to perform the classification process, which classifies the patient data using the relevant features. In this way, efficient COVID data prediction is carried out with higher accuracy and lesser time consumption. A brief description of the feature selection and classification processes is given in the sub-sections below.

3.1 | Deming least square regressive feature selection process

The DLSRFS process is introduced in the LSRGNFM-LDC technique to select the most relevant features from the dataset. The DLSRFS algorithm is an errors-in-variables model. The DLSRFS process identifies the line of best fit for performing the relevant feature selection. Unlike ordinary linear regression, the DLSRFS process accounts for errors on both axes. It determines the maximum likelihood estimate of the errors-in-variables model.
The steps involved in the DLSRFS process are illustrated in Figure 2, which shows the step-by-step process of the Deming least-square regressive feature selection analysis. Initially, patient data are collected from the Novel Corona Virus 2019 Dataset. Next, the DLSRFS process is carried out to identify the line of best fit, which is then used to choose the relevant features for COVID disease prediction. Let the patient data be denoted as "P_i = P_1, P_2, ..., P_n" with features "Ft_j = Ft_1, Ft_2, ..., Ft_m", where "n" denotes the number of data points and "m" denotes the number of features in the dataset. The DLSRFS process models the available data (a_j, Ft_j) as measured readings of true values (a*_j, Ft*_j) that lie on the regression line, given by

a_j = a*_j + e_j    (1)
Ft_j = I_0 + s_1 * a*_j + v_j    (2)

From (1) and (2), "e_j" and "v_j" denote the error values on the two axes; the two errors are independent of each other, and the ratio of their variances, δ, is assumed known. The intercept "I_0" and the slope "s_1" are determined as

s_1 = [(S_FtFt - δ S_aa) + sqrt((S_FtFt - δ S_aa)^2 + 4 δ S_aFt^2)] / (2 S_aFt),   I_0 = mean(Ft) - s_1 * mean(a)    (3)

where S_aa, S_FtFt, and S_aFt denote the sample variances and covariance of "a_j" and "Ft_j", and the fitted line yields "F̂t_j" and "â_j", the estimates of the true values of "Ft_j" and "a_j", respectively. With the help of the above equations, the DLSRFS process identifies the best-fit line for every input feature and chooses the most relevant features to effectively predict COVID disease. To achieve better outcomes, the DLSRFS process minimizes the weighted sum of squared residuals. As a result, the DLSRFS process accurately selects the relevant features for disease prediction. The algorithmic process of DLSRFS is given as:

// Deming Least Square Regressive Feature Selection Algorithm
Step 1: Begin
Step 2: For patient data with features 'Ft_j'
Step 3: Apply Deming least square regression analysis
Step 4: Find the best-fit line
Step 5: Select the most relevant features for prediction
Step 6: End For
Step 7: End

Algorithm 1 demonstrates the step-by-step process of the DLSRFS algorithm.
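The Deming fit at the heart of the DLSRFS step can be sketched as follows. This is a generic errors-in-variables estimator, not the authors' implementation; the function name and variable names are illustrative, and `delta` plays the role of the error-variance ratio δ from Equations (1)-(3).

```python
import math

def deming_fit(x, y, delta=1.0):
    """Fit a line with measurement error on both axes (Deming regression).

    `delta` is the assumed ratio of the error variances on the two axes.
    Returns (intercept, slope) of the line of best fit."""
    n = len(x)
    xm, ym = sum(x) / n, sum(y) / n
    # Sample variances and covariance of the two observed variables
    sxx = sum((xi - xm) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - ym) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - xm) * (yi - ym) for xi, yi in zip(x, y)) / (n - 1)
    # Closed-form maximum-likelihood slope of the errors-in-variables model
    slope = ((syy - delta * sxx
              + math.sqrt((syy - delta * sxx) ** 2 + 4 * delta * sxy ** 2))
             / (2 * sxy))
    intercept = ym - slope * xm
    return intercept, slope
```

For exactly collinear data the fit recovers the underlying line, e.g. `deming_fit([1, 2, 3, 4], [2, 4, 6, 8])` returns an intercept of 0 and a slope of 2.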
With the above algorithmic process, the DLSRFS algorithm reduces the time consumption for relevant feature selection from the input Novel Corona Virus 2019 Dataset compared with conventional works. Initially, the patient data are taken from the Novel Corona Virus 2019 dataset as input. Then, the Deming least square regression analysis is used for finding the best-fit line. Finally, the DLSRFS algorithm is employed to choose the most relevant features for COVID disease prediction.

The Gaussian neuro-fuzzy multi-layered data classification (GNFMLDC) process is carried out to classify the data points with higher accuracy and lesser time consumption. The GNFMLDC process combines two components, namely fuzzy logic and a neural network. The fuzzy logic includes three operations, namely fuzzification, inference, and defuzzification, as shown in Figure 3. From Figure 3, the GNFMLDC process initially collects the input "I(t)" and determines the required performance. In the fuzzification operation, the inputs are converted to fuzzy sets. The goal of the inference operation is the efficient conversion of fuzzy input to fuzzy output using if-then rules. The GNFMLDC process is designed with a set of rules termed fuzzy rules, denoted in the form of conditional statements. Finally, the defuzzification operation converts the fuzzy set to the output. A neuro-fuzzy system is defined as a fuzzy system with a learning algorithm that determines the fuzzy sets and fuzzy rules by processing the features of the input data. The learning process operates on local information and results in modification of the underlying fuzzy system while preserving its semantic properties. A neuro-fuzzy system is considered a multi-layer (i.e., three-layer) feedforward neural network.
The first layer denotes the input layer with the input variables. The hidden layer symbolizes the fuzzy rules, and the third layer denotes the output variables. Initially, the most relevant features of the input data are considered as input in the LSRGNFM-LDC technique and given to the input layer:

In(t) = weight_0 * Pa_i + b    (4)

From Equation (4), "In(t)" denotes the input layer output, "Pa_i" denotes the patient data with the most relevant features, "weight_0" represents the initial weight allocated at the input layer, and "b" symbolizes the bias. Then, the input layer result is transmitted to the hidden layer. In that layer, a neuro-fuzzy system is used with the fuzzy rules and fuzzy sets. The fuzzy sets are trained as the connection weights and bias. A fuzzy set is portrayed using a membership function, which classifies each element of the fuzzy set in continuous or discrete form. A Gaussian membership function is a generalization of the characteristic function of a defined subset. Fuzzy if-then rules are employed for taking the correct decision based on the input data. The fuzzy rules depend on the if-then rule condition as described in Equation (5):

if age (μ_1), body temperature (μ_2), travel history (μ_3) and chronic diseases (μ_4), then Y_i = {lower risk, medium risk, higher risk}    (5)

In GNFMLDC, four input membership functions and one output membership function are employed. The four input functions are age, body temperature, travel history, and chronic disease. The output membership function of GNFMLDC takes three values: lower risk, medium risk, and higher risk. After considering the membership functions, the fuzzy rules are formulated, for example, as

if μ_1 > 50 && 37°C < μ_2 < 39°C && 14 < μ_3 < 28 days    (6)
if μ_1 > 50 && μ_2 = 37°C && μ_3 > 28 days && μ_4    (7)

From (6), (7), and (8), the risk level is determined as lower risk, medium risk, or higher risk by comparing each input against its threshold using the if-then conditions.
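A Gaussian membership function and a rule in the spirit of Equation (5) can be sketched as follows. The fuzzy-set centers and widths, and the use of `min` as the fuzzy AND, are illustrative assumptions, not values from the paper; the thresholds in `risk_label` follow the description of the output membership function given for Figure 4.

```python
import math

def gaussian_mf(x, center, sigma):
    """Gaussian membership function: degree in [0, 1] that x belongs
    to the fuzzy set centred at `center` with width `sigma`."""
    return math.exp(-((x - center) ** 2) / (2 * sigma ** 2))

def risk_label(score):
    """Map a defuzzified score in [0, 1] to the three output labels,
    using the thresholds described for Figure 4."""
    if score < 0.25:
        return "lower risk"
    if score <= 0.5:
        return "medium risk"
    return "higher risk"

def rule_fire(age, temp_c, travel_days):
    """Firing strength of one illustrative antecedent of Equation (5)."""
    mu1 = gaussian_mf(age, 65, 15)        # "age is high" (assumed set)
    mu2 = gaussian_mf(temp_c, 39, 1.0)    # "body temperature is high"
    mu3 = gaussian_mf(travel_days, 7, 7)  # "travel history is recent"
    return min(mu1, mu2, mu3)             # min as the fuzzy AND
```

For a 65-year-old with a 39 °C fever and travel seven days ago, each membership degree is 1, so the rule fires fully and `risk_label(rule_fire(65, 39, 7))` yields "higher risk".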
The output Gaussian membership function is described in Figure 4 below. The values of the output Gaussian membership function vary between 0 and 1 to categorize the fuzzy members that belong to the fuzzy set: a value less than 0.25 represents a lower risk, a value around 0.5 denotes a medium risk, and a value greater than 0.5 represents a higher risk. The defuzzification process converts the fuzzy sets into output results. The hidden layer output is given by

Hidden(t) = weight_ih * In(t)    (9)

From Equation (9), "Hidden(t)" denotes the hidden layer result and "weight_ih" denotes the weight allocated between the input layer and the hidden layer. After that, the hidden layer results are transmitted to the output layer. The output of the GNFMLDC classifier renders the classification output for each input patient data. The result of the output layer is formulated as

Ou(t) = weight_ho * Hidden(t)    (10)

From Equation (10), "Ou(t)" denotes the output layer result and "weight_ho" represents the weight allocated between the hidden layer and the output layer. In this way, the patient data is classified into three types (i.e., lower risk, medium risk, and higher risk) for performing the COVID prediction. The algorithmic description of the GNFMLDC process is given below.

Input: Number of patient data with the most relevant features
Output: Improved prediction accuracy with minimal time consumption
// Gaussian Neuro-Fuzzy Multi-Layered Data Classification Algorithm
Step 1: Begin
Step 2: For each input patient data 'Pa_i'
Step 3: Perform the fuzzification process
Step 4: Analyze the fuzzy inference function
Step 5: Determine the defuzzification process
Step 6: Obtain the classification results as lower risk, medium risk, or higher risk patient data
Step 7: End for
Step 8: End

Algorithm 2 explains the step-by-step process of the Gaussian neuro-fuzzy multi-layered data classification algorithm. Initially, the relevant features of each patient data are collected as input.
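The layer computations of Equations (4), (9), and (10) can be sketched as a single forward pass. The paper does not specify the layer shapes or how the fuzzy rules are parameterized, so the weight layouts and the Gaussian form of the hidden units below are illustrative assumptions rather than the authors' implementation.

```python
import math

def forward(features, weight0, b, w_ih, w_ho):
    """One forward pass through a three-layer neuro-fuzzy network in
    the spirit of Equations (4), (9), and (10)."""
    # Input layer, Equation (4): In(t) = weight0 * Pa_i + b
    in_t = [weight0 * f + b for f in features]
    # Hidden layer, Equation (9): fuzzy rules realised here as
    # Gaussian units over weighted sums of the input-layer outputs
    hidden = [math.exp(-0.5 * sum(w * x for w, x in zip(row, in_t)) ** 2)
              for row in w_ih]
    # Output layer, Equation (10): weighted sum of rule activations
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_ho]
```

With a single feature, one hidden unit, and unit output weight, e.g. `forward([1.0], 1.0, 0.0, [[0.0]], [[1.0]])`, the hidden unit fires fully and the output is `[1.0]`.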
After that, the fuzzification operation is carried out to convert the input into fuzzy sets. The inference operation converts the fuzzy input to fuzzy output through the if-then rules; the fuzzy inference operation is used to minimize the prediction time. Finally, the defuzzification operation changes the fuzzy set to the output. With the help of these rules, the patient data is predicted as lower risk, medium risk, or higher risk patient data with maximum accuracy and minimum time consumption. With the above algorithmic process, the LSRGNFM-LDC technique attains better COVID prediction performance through the classification process when compared with traditional works.

Experimental evaluation of the proposed LSRGNFM-LDC technique is performed against three conventional methods, namely data-driven LSTM (Tomar & Gupta, 2020), fuzzy assisted system (Adwibowo, 2020), and disruptive technologies (Abdel-Basset et al., 2020c). For the experiments, the number of patient data is varied from 100 to 1000, and ten runs are carried out for evaluating all the parameters. With these patient data, the classification is done to assess the performance of COVID-19 prediction. The attribute description of the dataset is depicted in Table 2.

• Prediction accuracy

Prediction accuracy is defined as the ratio of the number of patient data whose risk level is correctly predicted through the classification process to the total number of patient data taken. Consequently, the prediction accuracy is formulated as

Prediction Accuracy = (Number of patient data that are correctly predicted the risk level / Number of patient data) * 100    (11)

From (11), the prediction accuracy is calculated and measured in terms of percentage (%). The proposed LSRGNFM-LDC technique performs an accurate COVID prediction with an increasing number of patient data.
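The accuracy metric of Equation (11) is straightforward to compute; a minimal sketch, with illustrative function and parameter names:

```python
def prediction_accuracy(predicted, actual):
    """Equation (11): percentage of patient records whose predicted
    risk level matches the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)
```

For example, if three of four records are labelled correctly, `prediction_accuracy` returns `75.0`.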
The LSRGNFM-LDC technique improves the prediction accuracy by selecting the most relevant features using Deming regression, which finds the line of best fit for performing the relevant feature selection. Then, the Gaussian neuro-fuzzy classifier is employed to classify the patient data for COVID prediction based on the risk levels through fuzzy if-then rules. The patient data is categorized into three types, namely lower risk, medium risk, and higher risk. This helps the LSRGNFM-LDC technique to improve prediction accuracy performance. Therefore, the proposed LSRGNFM-LDC technique shows that prediction accuracy is improved by 9% when compared to Tomar and Gupta (2020), 5% when compared to Adwibowo (2020), and 7% when compared to Abdel-Basset et al. (2020a), respectively.

• Prediction time

Prediction time is defined as the amount of time consumed for predicting the number of patient data as lower risk, medium risk, and higher risk data. It is the product of the number of patient data and the amount of time consumed for predicting one patient data. Therefore, the prediction time is calculated as

Prediction Time = Number of patient data * time consumed for predicting one data    (12)

From (12), the prediction time is determined and measured in terms of milliseconds (abbreviations: LSRGNFM-LDC, least square regressive Gaussian neuro-fuzzy multi-layered data classification; LSTM, long short-term memory). With the help of the obtained experimental values, the graph is illustrated in Figure 5. As described in Figure 5, the prediction time consumed using the proposed LSRGNFM-LDC technique is lesser when compared to the works by Tomar and Gupta (2020), Adwibowo (2020), and Abdel-Basset et al. (2020a), because the classification is performed with the help of fuzzy if-then rules.
Next, the Gaussian neuro-fuzzy classifier is used to categorize the patient data into lower, medium, and higher risk levels. In this way, the LSRGNFM-LDC technique minimizes the prediction time. As a result, the proposed LSRGNFM-LDC technique shows that prediction time is minimized by 41% when compared to Tomar and Gupta (2020), 34% when compared to Adwibowo (2020), and 25% when compared to Abdel-Basset et al. (2020a), respectively.

• Error rate

The error rate is defined as the ratio of the number of patient data that are incorrectly predicted through the classification process to the total number of patient data taken. As a result, the error rate is determined as

Error Rate = (Number of patient data that are incorrectly predicted / Number of patient data) * 100    (13)

From (13), the error rate is determined and measured in terms of percentage (%). The error rate comparison of the data-driven LSTM (Tomar & Gupta, 2020), fuzzy assisted system (Adwibowo, 2020), disruptive technologies (Abdel-Basset et al., 2020c), and the LSRGNFM-LDC technique is illustrated in Table 4. For a better assessment, various numbers of patient data are taken as input, ranging from 100, 200, 300, ..., 1000. For the same number of input data, different error rate results are attained, as shown in Table 4. Let us consider the number of patient data as 100 for conducting the experiments. Among the 100 patient data, six patient data are incorrectly predicted by the LSRGNFM-LDC technique, while 20, 15, and 17 patient data are incorrectly predicted by the data-driven LSTM (Tomar & Gupta, 2020), fuzzy assisted system (Adwibowo, 2020), and disruptive technologies (Abdel-Basset et al., 2020c), respectively. Therefore, the error rates obtained by the LSRGNFM-LDC technique, data-driven LSTM, fuzzy assisted system, and disruptive technologies are 6%, 20%, 15%, and 17%, respectively.
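The error rate of Equation (13) is the complement of the accuracy metric; a minimal sketch, with illustrative names, reproducing the worked example in the text (6 wrong out of 100 records gives 6%):

```python
def error_rate(predicted, actual):
    """Equation (13): percentage of patient records whose predicted
    risk level differs from the true label."""
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return 100.0 * wrong / len(actual)
```

With 6 mispredicted records among 100, `error_rate` returns `6.0`, matching the Table 4 discussion above.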
The statistical discussion confirms that the proposed technique attains a lesser error rate than the existing methods. Table 4 demonstrates the error rate experimental results of the different prediction techniques with respect to the number of data collected from the COVID database. As described in Table 4, the error rate of all the methods increases while increasing the number of patient data. From the obtained values, the error rate of the LSRGNFM-LDC technique is lesser than the other existing works. The reason behind the performance enhancement is the application of feature selection and classification techniques for predicting the COVID patient data. The LSRGNFM-LDC technique uses the Deming regression and the neuro-fuzzy classifier for performing the prediction process. Besides, the DLSRFS process selects the most relevant features for disease prediction by discovering the line of best fit. After that, the neuro-fuzzy classifier is applied to categorize the patient data based on the risk levels through fuzzy if-then rules, which classify the patient data into lower risk, medium risk, and higher risk. In this manner, the proposed technique reduces the error rate during the COVID prediction. Likewise, the comparative analysis indicates that the error rate is minimized by 60%, 50%, and 55% when compared to the three existing classification algorithms Tomar and Gupta (2020), Adwibowo (2020), and Abdel-Basset et al. (2020c), respectively.

• Space complexity

Space complexity is defined as the amount of memory space required for performing COVID prediction. It is the product of the number of patient data and the space required for storing one patient data. It is given by

Space Complexity = Number of patient data * space consumed for storing one data    (14)

From (14), the space complexity is computed and measured in terms of megabytes (MB).
With the attained space complexity values, the graph is shown in Figure 6. In the figure, the red color bar denotes the space complexity of the LSRGNFM-LDC technique, whereas the remaining bars denote the space complexity of the data-driven LSTM, fuzzy assisted system, and disruptive technologies. With the collected patient data, the space complexity of the techniques is determined. From the attained values, the space complexity of the LSRGNFM-LDC technique is lesser than the other conventional works. This is because of applying the feature selection process during the COVID prediction. The LSRGNFM-LDC technique uses the DLSRFS process for eliminating the irrelevant features; the DLSRFS process chooses the most relevant features by finding the line of best fit and thereby effectively predicts the COVID disease. This helps in minimizing the space consumption during the COVID prediction. Similarly, the result analysis denotes that the space complexity is minimized by 57%, 42%, and 51% when compared to the three existing classification algorithms Tomar and Gupta (2020), Adwibowo (2020), and Abdel-Basset et al. (2020c), respectively.

A new technique termed the LSRGNFM-LDC technique is introduced for performing the COVID prediction with better accuracy and lesser time consumption. The preprocessing is applied to each input feature to eradicate the irrelevant data and to minimize the time consumption. The DLSRFS process in the LSRGNFM-LDC technique selects the most relevant features through identifying the line of best fit. In the LSRGNFM-LDC technique, the neuro-fuzzy classifier utilizes the fuzzy if-then rules for performing the prediction process.
The fuzzy if-then rules predict the patient data as lower risk, medium risk, or higher risk in a more accurate manner with higher accuracy and lesser time consumption. A wide-ranging experimental evaluation is performed with the COVID database. The quantitative results are verified in terms of higher prediction accuracy and lesser time as well as space complexity when compared to other related works. In future, the proposed work will also be extended using hybrid optimization approaches such as HSMA_WOA.

REFERENCES
HSMA_WOA: A hybrid novel Slime mould algorithm with whale optimization algorithm for tackling the image segmentation problem of chest X-ray images
FSS-2019-nCov: A deep learning architecture for semi-supervised few-shot segmentation of COVID-19 infection
An intelligent framework using disruptive technologies for COVID-19 analysis
A novel intelligent medical decision support model based on soft computing and IoT
Online forecasting of COVID-19 cases in Nigeria using limited data. Data in Brief
Adwibowo, A. (2020). Fuzzy logic assisted COVID 19 safety assessment of dental care
COVID-19 prediction and detection using deep learning
Privacy-aware energy-efficient framework using the internet of medical things for COVID-19
A predictive model for the evolution of COVID-19
Computational intelligence for medical imaging simulations
Data analytics and visualization for inspecting cancers and genes
Adapting the educational environment for cardiovascular fellows-in-training during the COVID-19 pandemic
Mathematical models and deep learning for predicting the number of individuals reported to be infected with SARS-CoV-2
Prediction of COVID-19 outbreak in China and optimal return date for university students based on propagation dynamics
Modeling and prediction of the Covid-19 cases with deep assessment methodology and fractional calculus
IoT based humanoid software for identification and diagnosis of Covid-19 suspects
Partial derivative nonlinear global pandemic machine learning prediction of COVID-19
Artificial intelligence approach fighting COVID-19 with repurposing drugs
A three layered decentralized IoT biometric architecture for city lockdown during COVID-19 outbreak
Effectiveness of preventive measures against COVID-19: A systematic review of in silico modeling studies in Indian context
Predictive modeling of Covid-19 data in the US: Adaptive phase-space approach
Multiple ensemble neural network models with fuzzy response aggregation for predicting COVID-19 time series: The case of Mexico
Parkinson data analysis and prediction system using multi-variant stacked auto encoder
Forecasting the novel coronavirus COVID-19
COVID-19 future forecasting using supervised machine learning models
A systematic approach for COVID-19 predictions and parameter estimation
A machine learning forecasting model for COVID-19 pandemic in India
Prediction for the spread of COVID-19 in India and effectiveness of preventive measures
Modeling and prediction of COVID-19 in Mexico applying mathematical and computational models
A novel parametric model for the prediction and analysis of the COVID-19 casualties
Artificial Intelligence (AI) applications for COVID-19 pandemic
Covidgan: Data augmentation using auxiliary classifier GAN for improved covid-19 detection
Predicting covid-19 in China using hybrid AI model
Early prediction of the 2019 novel coronavirus outbreak in mainland China based on a simple mathematical model

He has received the Young Scientist Award, the Best Faculty Award, the Best Paper Award in 2005, and the First Prize at the National Paper Presentation in 2008. His research interests include the Internet of Things and big data. He received his Ph.D. in Computer Science and Engineering in 2017 at the School of Computer Science and Engineering. Dr. Rizwan holds ten Indian patents and one USA patent.
Rizwan is a guest editor for the International Journal of Grid and Utility Computing (Inderscience) and Recent Patents on Computer Science. He has received several recognitions and best paper awards at top international conferences. He also received the prestigious Best Research Paper Award from the Elsevier Computer Communications Journal for the period 2015-2018, in addition to the Top Researcher Award for 2018 at Antalya Bilim University, Turkey. Prof. Al-Turjman has led a number of international symposia and workshops in flagship communication society conferences. Currently, he serves as an associate editor and the lead guest/associate editor for several well-reputed journals. He has more than 120 research contributions to his credit, published in refereed and indexed journals, book chapters, and conferences. He is presently working as an Assistant Professor at SASTRA Deemed University, where he has served for the last 14 years. He has delivered many lectures and has attended and presented at international conferences in India and abroad. He has edited more than 80 research articles to his credit, which includes his editorial experience across refereed and indexed journals, conferences, and book chapters at national and international levels. His contemporary research interests include Big Data, Data Analytics, VLSI Design, IoT, and Health Care Applications.

The authors declare there is no conflict of interest. The data that support the findings of this study are openly available in Kaggle at https://www.kaggle.com/sudalairajkumar/novel-corona-virus-

ORCID
https://orcid.org/0000-0003-4878-1988
Fadi Al-Turjman https://orcid.org/0000-0001-6375-4123