key: cord-0059945-7j4udykb authors: Pal, Mahua Nandy; Roy, Shuvankar; Kundu, Supriya; Choudhury, Sasmita Subhadarsinee title: Visualization and Prediction of Trends of Covid-19 Pandemic During Early Outbreak in India Using DNN and SVR date: 2020-07-29 journal: Big Data Analytics and Artificial Intelligence Against COVID-19: Innovation Vision and Approach DOI: 10.1007/978-3-030-55258-9_4 sha: 5aacfb86235d792820a8c114d8fdbe0b8296f757 doc_id: 59945 cord_uid: 7j4udykb

The first known case of Covid-19 was found in Wuhan, China, in December 2019. The virus is novel, its severity is unpredictable, its transmission ability is extremely high, and its incubation period is comparatively long. The Covid-19 pandemic has severely affected world health and the socio-economy. It is therefore necessary to know early whether the situation is continuing to worsen and how to scale up medical facilities such as tracing, testing, treatment and quarantine to fight against it. Early outbreak data for the novel coronavirus in India have been considered for this work. The trends of confirmed cases, recovery cases and deceased cases are modeled using a deep neural network (DNN) and support vector regression (SVR) with Gaussian and exponential kernel functions. A comparative view of the prediction analysis is also presented.

considered as an epidemic throughout China, but later a number of cases appeared throughout the world. Thousands of people were suffering seriously from COVID-19, and the disease began to spread exponentially in different places worldwide. In March 2020 the World Health Organization (WHO) declared the coronavirus outbreak a pandemic, with its spread in more than 168 countries in the world. The situation has been devastating since then, with an increasing number of deaths in countries such as Italy, Spain and the USA, which have world-class medical facilities. In India the number of infected cases has crossed 3000 at the time of writing this article.
The virus has no proper medicine or vaccine to date, and in order to reduce the spread of COVID-19, the Government of India declared a complete lockdown of the entire country, excluding essential services. The coronavirus causes a range of symptoms such as pneumonia, fever, breathing difficulty and lung infection. These viruses are common in animals worldwide, but very few cases had been known to affect humans before the outbreak in late 2019. WHO referred to it as a novel coronavirus that affected the lower respiratory tract of people with pneumonia in Wuhan, China, on 29th December 2019. WHO named the disease coronavirus disease 2019 (COVID-19).

Infection prevention and control (IPC) measures that may reduce the risk of exposure include the following: use of face masks; covering coughs and sneezes with tissues that are then safely disposed of; regular hand washing with soap, or disinfection with hand sanitizer containing at least 60% alcohol if soap and water are not available; avoidance of contact with infected people and maintaining an appropriate distance as much as possible; and refraining from touching eyes, nose and mouth with unwashed hands.

Researchers currently lack adequate data to predict the rapid growth trend of this pandemic using machine learning tools. Prediction nevertheless helps in understanding possible consequences, which may enormously affect socio-economic growth. Visualizing and forecasting the pandemic behavior with AI tools and data analysis can help to prevent a heavier health crisis and socio-economic devastation. This is the motivation of this work. To the best of our knowledge, no work has yet been published that presents either an SVR analysis of Covid-19 data or a DNN model trained on growth values of the available data. The contribution of this work is that quite encouraging prediction results have been achieved with minimal training samples.
The DNN model proposed here is a simple model which takes less training time and is less prone to overfitting. The remaining sections are arranged as follows. Section 2 presents the literature survey, Sect. 3 discusses the deep neural network architecture and support vector regression, Sect. 4 describes the dataset, and Sect. 5 presents the early outbreak scenario of India with respect to the world at that time. Sections 6 and 7 present the representations and predictions using the DNN and SVR models, Sect. 8 covers error computation and Sect. 9 discusses the work. Section 10 concludes the chapter.

A meeting of the WHO Emergency Committee under the International Health Regulations (IHR) (2005) [1] about the epidemic of novel coronavirus 2019 in China took place on 30th January 2020. According to the committee, China rapidly identified the virus and shared its sequence; hence, other countries could diagnose it to protect themselves. According to the mathematical model proposed in The Lancet [2], the growth of the epidemic spreading rate will decrease if the transmission rate of the communicable infection diminishes to 0.25. A composite Monte-Carlo model (CMCM) is proposed in [3], which supports future predictions using non-deterministic data distributions from a deterministic model. Some factors, such as gatherings of people, are non-deterministic and contribute to high uncertainty, while others, such as historical data, are deterministic. The authors claimed that these characteristics are captured most effectively as probabilistic distributions supplied as non-deterministic variables to the MC model. In addition, the sensitivity values obtained from the MC simulation are utilized as corrective feedback to the rules produced by a fuzzy rule induction (FRI) system. According to [4], finding a forecasting model with limited training samples is a huge challenge in the field of machine learning.
For this, three methods have generally been used in the past: (1) augmenting the existing data, (2) using a panel selection to pick the most efficient model from several models, and (3) fine-tuning the parameters of an individual model for maximum achievable accuracy. A methodology that combines these data mining strategies is proposed in [4]. Reference [5] suggests AI-driven tools to forecast the nature of the spread of COVID-19 outbreaks. AI-driven tools are likely to have active learning-based cross-population train/test models that utilize multitudinal and multimodal data. The recent trend is to explore the use of deep learning architectures in different fields of research. References [6, 7] present applications of deep convolutional neural networks. Reference [8] presents a well-established CNN architecture capable of extracting features in biomedical applications. In general, deep learning techniques require a large amount of training data. They are extremely computationally intensive, and the training phase is very time consuming even with expensive GPUs. Covid-19 patient response to treatment based on convolutional neural networks and whale optimization is discussed in [9]. Detection of affected images from lung CT scan images is presented in [10].

An artificial neural network having multiple layers is called a deep neural network. A fully connected (FC) neural network layer is said to be a dense layer. It accepts input from each unit of the previous layer and produces outputs for all of its output units. An activation function helps in learning the characteristic patterns of the data. In this model Leaky ReLU has been used as the nonlinear activation function, which is capable of learning the nonlinear trend of the data. It is defined as

$$f(x) = \begin{cases} x, & x \ge 0 \\ \alpha x, & x < 0 \end{cases}$$

Here α is a hyperparameter set to 0.3. In this paper, a DNN architecture is proposed which consists of only four dense layers combined with Leaky ReLU.
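As a minimal sketch of the Leaky ReLU activation described above, the function below applies the piecewise rule with α = 0.3. The function name `leaky_relu` and the sample inputs are illustrative assumptions and are not taken from the original implementation.

```python
def leaky_relu(x: float, alpha: float = 0.3) -> float:
    """Return x for non-negative inputs and alpha * x for negative inputs."""
    return x if x >= 0 else alpha * x


if __name__ == "__main__":
    # Non-negative inputs pass through unchanged; negative inputs are
    # scaled by alpha instead of being clipped to zero as in plain ReLU.
    for value in (-2.0, -0.5, 0.0, 1.5):
        print(value, "->", leaky_relu(value))
```

Because negative inputs keep a small non-zero slope, a unit with this activation still passes a gradient when its input is negative.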
In comparison with well-established deep architectures, this model is quite lightweight, with fewer trainable parameters, and takes less time to train. As a result, the proposed architecture is also less prone to overfitting. There are 116,161 trainable parameters in total. Table 1 presents the DNN model, giving the layer-wise technical architecture of the network. The input and output shape mapping is visible in Fig. 1.

Support vector machine (SVM) is a popular machine learning tool for classification and regression. Specifying a kernel function facilitates transformation of the input data to a higher dimension. In this work, two kernel functions have been explored to implement the support vector regressor (SVR). Kernel functions shift the data representation to a transformed coordinate system in the higher-dimensional feature space. The Gaussian kernel is represented by

$$k(x, y) = \exp\left(-\frac{\lVert x - y \rVert^{2}}{2\sigma^{2}}\right)$$

and the exponential kernel is represented by

$$k(x, y) = \exp\left(-\frac{\lVert x - y \rVert}{2\sigma^{2}}\right)$$

The parameter σ should be optimized in an application-oriented way.

The dataset [11] is contributed and updated on a daily basis by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE). It is supported by the ESRI Living Atlas Team and the Johns Hopkins University Applied Physics Lab (JHU APL). Johns Hopkins University currently stores its data in a GitHub repository, the 2019 Novel Coronavirus COVID-19 (2019-nCoV) Data Repository.

The following are the representations of Covid-19 confirmed cases, deaths and recoveries during the early stage in India. The data visualization covers the period from 30th January 2020 to 30th March 2020. Figures 2, 3 and 4 present the data of India with respect to the world during that period, and Figs. 5, 6 and 7 present the data of India with respect to the most affected countries during that period.

The period of the early outbreak of Covid-19 in India, from 30th January to 30th March 2020, has been considered.
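The two SVR kernels named in the support vector regression discussion can be sketched in plain Python. This is an illustrative sketch only. It assumes the commonly used forms exp(-‖x - y‖² / (2σ²)) for the Gaussian kernel and exp(-‖x - y‖ / (2σ²)) for the exponential kernel. The function names and the default σ = 1.0 are hypothetical and are not the authors' code or tuned setting.

```python
import math
from typing import Sequence


def _euclidean_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Euclidean norm of the difference between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def gaussian_kernel(x: Sequence[float], y: Sequence[float], sigma: float = 1.0) -> float:
    """Assumed form: exp(-||x - y||^2 / (2 * sigma^2))."""
    distance = _euclidean_distance(x, y)
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


def exponential_kernel(x: Sequence[float], y: Sequence[float], sigma: float = 1.0) -> float:
    """Assumed form: exp(-||x - y|| / (2 * sigma^2)), i.e. the norm is not squared."""
    distance = _euclidean_distance(x, y)
    return math.exp(-distance / (2.0 * sigma ** 2))


if __name__ == "__main__":
    # Hypothetical one-dimensional inputs, e.g. two day indices.
    day_a, day_b = [1.0], [3.0]
    print("gaussian:   ", gaussian_kernel(day_a, day_b, sigma=2.0))
    print("exponential:", exponential_kernel(day_a, day_b, sigma=2.0))
```

Both functions return 1 for identical inputs and decay towards 0 as the inputs move apart; σ controls how quickly that similarity falls off, which is why it has to be tuned for the data at hand.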
The DNN was trained with consecutive growth values instead of the time series data itself. The trained DNN predicts the confirmed, recovery and death case data for the next 15 days. Figures 8, 9 and 10 show the DNN training data plots, whereas Figs. 11, 12 and 13 show the DNN prediction data plots (see Table 2 for the predictions).

Experiments were also carried out using SVR with Gaussian and exponential kernels. Figures 14, 15 and 16 show the mapping of data up to 30th March 2020. Figures 17, 18 and 19 show the predictions for the next 15 days (Tables 3, 4 and 5).

The errors incurred by the three prediction models discussed above have been compared. The absolute prediction error (APE) is computed for each predicted day, and the day-wise APE plots are shown in Fig. 20. To confirm the observations from the day-wise APE plots, the average of absolute prediction error percentages (AAPEP) has been computed, which is defined as

$$\mathrm{AAPEP} = \frac{1}{n}\sum_{i=1}^{n} \mathrm{APE}_i$$

where $\mathrm{APE}_i$ is the absolute prediction error of the $i$-th prediction expressed as a percentage and $n$ is the number of actual prediction observations. In Fig. 21, the AAPEP values obtained from the three models for the three different types of data are presented.

This work attempts to visualize, with respect to India, the recent behaviour of the Covid-19 pandemic and its dynamic nature, with its rapid and wide-ranging changes in trend. Forecasting is very important, as the pandemic threatens life and the social economy. Support vector regression with two different kernel functions and a DNN model are used to represent the growth of the available data. Predictions of future data and the prediction errors are also visualized using those models. The average absolute prediction errors in confirmed, recovery and death case prediction for the DNN and the two SVR models are shown in Table 6. The DNN is most effective in predicting confirmed cases, whereas SVR with the Gaussian kernel is most effective in forecasting death and recovery cases.
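The error measures used in the comparison can be sketched as follows. The sketch assumes that the absolute prediction error of a day is the absolute difference between the actual and the predicted count divided by the actual count, and that AAPEP is the mean of these errors expressed as percentages. The function names and the sample numbers are made up for illustration and are not results from this work.

```python
from typing import List, Sequence


def absolute_prediction_errors(actual: Sequence[float],
                               predicted: Sequence[float]) -> List[float]:
    """Day-wise APE under the assumed form |actual - predicted| / actual.

    Actual counts must be non-zero, otherwise the ratio is undefined.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    return [abs(a - p) / a for a, p in zip(actual, predicted)]


def aapep(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of the day-wise absolute prediction errors, in percent."""
    errors = absolute_prediction_errors(actual, predicted)
    return 100.0 * sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical counts for three predicted days (not data from the chapter).
    actual_counts = [100.0, 200.0, 400.0]
    predicted_counts = [110.0, 180.0, 400.0]
    print("day-wise APE:", absolute_prediction_errors(actual_counts, predicted_counts))
    print("AAPEP (%):  ", aapep(actual_counts, predicted_counts))
```

A relative error of this kind becomes unstable when the actual counts are very small, which may matter for the death series in the earliest days of an outbreak.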
Overall, the DNN is the most efficient option among the three in forecasting confirmed cases and deceased cases, but it is comparatively less efficient than the others in predicting recovery cases. The training data available during the early outbreak period in India were not sufficient for a DNN to learn the trend exactly. An attempt was made to capture the trend by training the DNN on growth data, which are scarcest for deaths during the period considered. This is the reason why the model failed to learn the death trend properly and the error percentage is high in the death case. This may be considered a limitation of the work.

For the general population, there is at this moment no vaccine available for preventing COVID-19. The best way to subdue this pandemic is to avoid being exposed to the virus. Forecasting its behaviour is necessary to analyze whether it is continuing, getting worse day by day, or whether the chain of the outbreak will soon be broken. This will be helpful for timely decision making and remedial action. According to the above analysis, DNN prediction minimizes the error in confirmed case predictions, whereas SVR with the Gaussian kernel exhibits minimal error in recovery and death case predictions. The DNN would have performed better if more training samples had been available. The future scope of this work is to properly analyse how the training set is presented to the model, so that model learning becomes efficient with the limited data available and the trend prediction becomes more powerful.
[1] Emergency Committee Regarding the Outbreak of Novel Coronavirus (2019-nCoV)
[2] Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study
[3] Composite Monte Carlo decision making under high uncertainty of novel coronavirus epidemic using hybridized deep learning and fuzzy rule induction
[4] Finding an accurate early forecasting model from small dataset: a case of 2019-nCoV novel coronavirus outbreak
[5] AI-driven tools for coronavirus outbreak: need of active learning and cross-population train/test models on multitudinal/multimodal data
[6] Retinal vessel segmentation via deep learning network and fully connected conditional random fields
[7] Segmenting retinal blood vessels with deep neural networks
[8] U-net: convolutional networks for biomedical image segmentation
[9] Diagnosis and prediction model for COVID19 patients response to treatment based on convolutional neural networks and whale optimization algorithm using CT image
[10] Harmony-search and Otsu based system for coronavirus disease (COVID-19) detection using lung CT scan images
[11] Novel Coronavirus COVID-19 (2019-nCoV) Data Repository by Johns Hopkins CSSE