key: cord-0191718-e59jzfb3 authors: Bedi, Punam; Shivani,; Gole, Pushkar; Gupta, Neha; Jindal, Vinita title: Projections for COVID-19 spread in India and its worst affected five states using the Modified SEIRD and LSTM models date: 2020-09-07 journal: nan DOI: nan sha: fc42e25c5b5d7a706d891696f48f67a0b7785911 doc_id: 191718 cord_uid: e59jzfb3 The last leg of the year 2019 gave rise to a virus named COVID-19 (Corona Virus Disease 2019). Since the beginning of this infection in India, the government implemented several policies and restrictions to curtail its spread among the population. As the time passed, these restrictions were relaxed and people were advised to follow precautionary measures by themselves. These timely decisions taken by the Indian government helped in decelerating the spread of COVID-19 to a large extent. Despite these decisions, the pandemic continues to spread and hence, there is an urgent need to plan and control the spread of this disease. This is possible by finding the future predictions about the spread. Scientists across the globe are working towards estimating the future growth of COVID-19. This paper proposes a Modified SEIRD (Susceptible-Exposed-Infected-Recovered-Deceased) model for projecting COVID-19 infections in India and its five states having the highest number of total cases. In this model, exposed compartment contains individuals which may be asymptomatic but infectious. Deep Learning based Long Short-Term Memory (LSTM) model has also been used in this paper to perform short-term projections. The projections obtained from the proposed Modified SEIRD model have also been compared with the projections made by LSTM for next 30 days. The epidemiological data up to 15th August 2020 has been used for carrying out predictions in this paper. These predictions will help in arranging adequate medical infrastructure and providing proper preventive measures to handle the current pandemic. 
The effect of different lockdowns imposed by the Indian government has also been used in modelling and analysis in the proposed Modified SEIRD model. The results presented in this paper will act as a beacon for future policy-making to control the COVID-19 spread in India. The world witnessed the COVID-19 (Corona Virus Disease 2019) pandemic that affected the lives of many people across the globe. The spread of this virus has taken an exponential speed. Therefore, on 11 th March 2020, the World Health Organization (WHO) announced COVID-19 as a "global pandemic" [1] . Due to the outbreak of COVID-19, an unavoidable situation was created, whereby the authorities of various countries and continents had to put restrictions on the movement of people and non-essential activities. Some of these restrictions included imposing of lockdowns, maintaining social distancing, work from home in academics and in business continuity plans. Thus, the spread of COVID-19 has left a major impact on the environment as well as on the lifestyle of human beings [2] , [3] . Almost all educational institutions were closed, sports leagues were cancelled, and people were advised to work from home, and perform contactless financial transactions using various digital platforms [4] , [5] . COVID-19 was first reported in the Wuhan city of China on 17 th November 2019 and from there it has spread to the whole world [6] , [7] . Due to the spread of this disease through human to human contact, a large number of cases have been reported worldwide [8] . Recent studies have found that a healthy person can be infected by coming in contact with an infected person or with the surface touched by an infected person to which the Coronavirus got transferred. Also, the symptoms of Coronavirus in an infected person are visible after a certain time period. During this time period, the infected person is a carrier of Coronavirus and is able to infect other healthy persons. 
As of 16 th August 2020, more than 21 million people have been infected and more than 0.7 million people have died from COVID-19 across the globe [9] . Therefore, COVID-19 has become a big threat to people and environment [10] , [11] . To tackle this difficult situation, the first step is to take precautionary measures to prevent the infection and the second step is that infected people must quarantine themselves and get medical help. Taking precautions on an individual level is also required, such as use of sanitizer, use of face mask and maintaining social distancing. These resources namely sanitizers and face masks are the need of time [12] . However, it is equally important to properly dispose-off the used masks so as to protect the environment. Governments have also taken many steps to arrange necessary resources to provide better medical services to the infected people. An estimate of these resources is created so that all the needs of people can be met in time. Different methods are being used by researchers to estimate the resources. For a country like India, having a large population of around 1.38 billion, it is a challenging task to handle this pandemic efficiently [13] . In India, the first COVID-19 positive case was reported in the Thrissur district of Kerala on 30 th January 2020, of a student who returned back from Wuhan university, China. During the initial period, there wasn't a substantial increase in the number of cases in India and by 15 th March, the number of cases barely crossed the figure of 100. As of 16 th August 2020, COVID-19 has spread to 215 countries with more than 6 million active infected cases globally. In India, the number of COVID-19 cases has crossed 2.5 million with more than 0.6 million active infected cases, 1.8 million recovered cases and 50,122 as the total number of deaths, which makes India the worst affected country in Asia. 
The variation in the total number of infected cases across India is evident, with the highest number of reported cases in Maharashtra and the lowest in Mizoram [14]. The number of cases globally as well as in India is increasing at a very rapid rate. As is evident from the data, Maharashtra is the worst-affected state in terms of total cases, accounting for about 23% of the cases in India. The next four worst-affected states/union territories are Tamil Nadu, Andhra Pradesh, Karnataka and Delhi, which together have approximately 38% of the total cases, while the rest of the Indian states/union territories account for the remaining 39%. The north-eastern states of India, such as Mizoram, Sikkim and Meghalaya, are much better off, each having fewer than 1500 cases so far. To understand the future spread of the pandemic and to devise management strategies, various models have been designed, which give information regarding the time of attainment of the infection peak, the number of infected cases and the medical infrastructure required to manage the spread [15], [16]. In this paper, an epidemiological model named the Modified SEIRD (Susceptible-Exposed-Infected-Recovered-Deceased) model has been proposed. It utilises the real data of infections, recoveries and deaths caused by COVID-19 to make predictions. We have used the data of India and its five states having the highest number of total cases to make predictions using the Modified SEIRD model. Considering the data up to 15th August 2020, Maharashtra, Tamil Nadu, Andhra Pradesh, Karnataka, and Delhi are the five states with the highest number of reported cases. The proposed model uses a parameter named epsilon for COVID-19 projections, which takes into account the proportion of the Exposed population that is asymptomatic but infectious, and is unknowingly spreading the infection. This paper predicts the number of cases in the Infected, Exposed, Recovered and Deceased compartments. Student's t-test was used to obtain confidence levels for the time-series data under consideration [17].
The t-test is useful when the sample size is very small compared to the population size. Since the data available is limited, the t-test is an appropriate choice for finding the confidence intervals for COVID-19 predictions. The effect of different lockdowns imposed by the Indian government has also been utilized in modelling using the proposed Modified SEIRD model. Furthermore, this paper utilises the Deep Learning (DL) based Long Short-Term Memory (LSTM) model to perform short-term projections for the next 30 days, i.e. from 16th August 2020 to 14th September 2020. The results obtained by the LSTM model have been compared with the results of the Modified SEIRD model. The epidemiological data up to 15th August 2020 has been used to obtain projections in this paper. The upper and lower estimates of the predictions made by both the models have also been calculated using 90% confidence intervals. These are shown in the graphs corresponding to short-term predictions. They are not shown for long-term predictions because the values are very close to the reported data and are not distinguishable on the scale chosen for these graphs. The rest of the paper is organized as follows: the next section presents the Review of Literature. In Section 3, the models used for projections, namely the proposed Modified SEIRD model along with the SEIR and SEIRD models and the LSTM, are explained. Section 4 presents the Experimental Setup. The Results are discussed in Section 5, which is followed by the Conclusion at the end.

COVID-19 is a communicable disease that has been declared a pandemic by the WHO [18]. Moreover, there is no medicine or vaccine available to cure this infection as of now. Hence, the only way to protect oneself from this pandemic is to avoid contact with any infected person. With the ongoing pandemic threat, researchers started to study the future of COVID-19.
These research groups are mainly divided into two categories, where one tries to find the vaccine and other group tries to predict the damage that can be caused by this disease. Based on the predictions, resources can be prepared to treat people and minimize the fatalities. This paper predicts the future trend of COVID-19 by modelling the effect of different lockdowns imposed by the Indian government. Since the beginning of COVID-19, various researchers have predicted its spreading trend for different countries and their states [19] , [20] . Ahmed [21] has studied the effect of patient age, gender and their geographical location on the infection spread. In his work, the population of India that has arrived from different regions is taken into consideration. These people were divided into six groups to study the regional effect of Corona on a patient. Clustering and Multiple Linear Regression are two techniques that have been used for this study. Clustering has been used to find the similarity between different groups. Multiple Linear Regression has been used to predict the source of infection by assigning the new patient case into one of the above defined groups. In every group, different age distributions have been studied and their recovery rates were calculated. Ceylan [22] utilised Auto-Regressive Integrated Moving Average (ARIMA) models to predict the future trend of the Coronavirus disease in the three worst affected countries of Europe, namely Italy, Spain and France. The author formulated several ARIMA models using different parameter values. The best models were then used for estimating the spread of the disease in each of the three countries. Patrikar et al [23] have used the modified SEIR method in their work to predict the curve of COVID-19 for India. In this modified SEIR, the effect of social distancing has been studied on the COVID-19 and different graphs have been obtained. 
It was also concluded that social distancing is working in India, according to the model given by Poonia et al [24] . In the paper, authors performed short term forecast of COVID-19 for different states of India in worst case scenario. SEIR model was also used to study the effect of temperature on COVID-19 outbreak in China [25] . In the paper, the authors incorporated climatic factors in the original SEIR model to analyse their impact on the spread of COVID-19. Due to the dynamic nature of the Coronavirus spread, many researchers have performed short-term forecasting of the future spread of this deadly virus. Roosa et al [26] performed 5-day and 10-day forecasts of the total confirmed cases for Guangdong and Zhejiang provinces of China. The authors used logistic growth model, Richards growth model and a sub-epidemic wave model to generate predictions. A bootstrap approach was adopted in the paper to compute the uncertainty bounds for predicting cumulative cases in near future. Arora et al [27] made use of different variations of Long Short-Term Memory models for shortterm prediction of COVID-19 cases in India. Deep LSTM, Bi-directional LSTM (Bi-LSTM) and Convolutional LSTM (Conv-LSTM) were used to calculate one-week predictions for different states and union territories of India. Tomar et al [28] also used LSTM along with power-law curve fitting to predict the trend of COVID-19 in India. The authors forecasted the total number of confirmed, recovered and deceased cases for a short time span of next 30 days. In addition, the paper also analysed the effect of different values of transmission rate on the number of predicted infected cases. Pal et al [29] combined the COVID-19 data with the weather statistics of different countries to predict active cases using shallow LSTM model. The authors used Bayesian optimization together with fuzzy rules to predict the future risk of Coronavirus in their work. 
Pandey et al [30] analysed the initial outbreak of COVID-19 in India and forecasted the trend for two weeks in future. The authors used the SEIR model to predict the future trend of the virus in India, with and without interventions. Additionally, Regression model was also used by the authors to predict the change in the number of confirmed and death cases in India. Chakraborty et al [16] proposed a hybrid approach to generate ten-day predictions for United Kingdom, France, India, Canada and South Korea. The model combined Wavelet model and ARIMA models for computing predictions for next ten days. Various studies have been recently conducted by the researchers to understand the dynamics of this pandemic for different regions across the globe [3] , [1] , [31] , [32] , [33] , [34] . The forecast of COVID-19 in Indian context also has been investigated by many researchers using mathematical and epidemiological models, but limited contribution exists for its states [35] , [19] , [36] , [37] , [38] , [23] . In this paper, a compartmental epidemiological model, named Modified SEIRD (Susceptible-Exposed-Infected-Recovered-Deceased) model has been proposed for the projection of COVID-19 spread in India and its five states with the highest number of total cases. Further, short-term predictions for next 30 days have also been computed using the LSTM model and the results are compared with the projections of the proposed Modified SEIRD model. The forecasts are as good as the quality of data available and, therefore, the future spread of the virus may also affect the predictions. The next section presents the proposed Modified SEIRD model and the LSTM model used for projections. This section describes the proposed Modified SEIRD model based on SEIR and SEIRD epidemiology models, and the Long Short-Term Memory Model (LSTM) model, that have been used in this paper for projecting the trend of COVID-19 in India and its five states having the highest number of total cases. 
The proposed SEIRD model for COVID-19 is based on the SEIR and SEIRD epidemiology models, which are first described below. To model the trend of a disease, researchers have used various epidemiology models in the past [35], [38], [39]. One of the most common models used in the literature is the SEIR (Susceptible-Exposed-Infectious-Recovered) model [34], [40]. It is a mathematical modelling technique based on the SIR (Susceptible-Infectious-Recovered) model, which is used for forecasting the spread of an epidemic [41], [42]. The SEIR model is diagrammatically represented in Figure 1. In the SEIR model, the population is divided into four compartments, namely S, E, I, and R [43], [44]. Each of these compartments can take different values with respect to time, representing their dynamic behaviour. These are described as follows:
• S(t) - The number of individuals who are susceptible to the disease, i.e. who are not (yet) infected on day t.
• E(t) - The number of individuals who have been in contact with infected people and are exposed to the disease on day t, but in whom disease symptoms are not yet visible. Such individuals are called asymptomatic.
• I(t) - The number of infected individuals on day t, assumed to be infectious and able to transmit the infection to others.
• R(t) - The number of individuals who were infected, but have recovered by day t and developed immunity. This also includes the people who have died [45], [33], [46]. Some authors call this the Removed compartment.
The assumptions related to the SEIR model are listed below:
• There is no entry into or departure from the population except possibly through death from the disease [35].
• Recovered people become immune to the disease, and can no longer spread the infection.
• The increase in the number of infected people is directly proportional to the number of infected people as well as to the size of the exposed population.
• The number of recovered persons is directly affected by the number of persons being infected.

In the SEIR model, shown in Figure 1, the transmission rate β regulates the spread of infection: it describes the possibility of transmission between a susceptible and an infectious individual and represents the average contact frequency. The parameter α represents the onset rate, where 1/α is the average latent period. Infected individuals leave the infected compartment at a rate γ to join the recovered class, 1/γ being the average infectious period.

The spread of a disease may lead to many casualties. However, this category of people is not included separately in the SEIR model. Therefore, to represent this group, a new compartment 'D' is added to the existing SEIR model, and the resulting model is known as the SEIRD model [47], [48]. The SEIRD model is depicted in Figure 2. Here, D(t) denotes the number of deceased individuals on day t. Infected people become Deceased at a rate μ. Since COVID-19 is a pandemic and the SEIRD model works well for such situations, this paper uses the SEIRD model as its basis and proposes an SEIRD model with infectivity in the exposed population to make predictions. The proposed Modified SEIRD model is described in the next subsection.

The SEIRD model assumes that the exposed population is non-infectious, whereas it has been seen in COVID-19 cases that asymptomatic individuals test positive and are also responsible for spreading the disease. Hence, we have modified the SEIRD model to include infectivity in the exposed population. The modification introduces the parameter 'ε' (epsilon), which accounts for the part of the Exposed population that is asymptomatic but infectious, and hence infecting others [35]. The proposed model is named the Modified SEIRD model and is shown in Figure 3. Here, N = S + E + I + R + D is the total population.
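The text refers to equations (1) to (5) for the compartment dynamics, but the equations themselves did not survive in this copy. A plausible reconstruction, assuming standard mass-action transmission in which both the Infected compartment and the infectious fraction ε of the Exposed compartment contribute to new exposures, is given below. It is a sketch consistent with the parameter definitions in the text, not necessarily the authors' exact formulation.

```latex
\begin{align}
\frac{dS}{dt} &= -\frac{\beta\,(I + \varepsilon E)\,S}{N} \\
\frac{dE}{dt} &= \frac{\beta\,(I + \varepsilon E)\,S}{N} - \alpha E \\
\frac{dI}{dt} &= \alpha E - (\gamma + \mu)\,I \\
\frac{dR}{dt} &= \gamma I \\
\frac{dD}{dt} &= \mu I
\end{align}
```

Under this form the five right-hand sides sum to zero, so N = S + E + I + R + D stays constant, and setting ε = 0 recovers the ordinary SEIRD model with a non-infectious exposed compartment.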
Each person belongs to one of the five compartments, namely Susceptible, Exposed, Infected, Recovered and Deceased. A person can shift from one compartment to another. The parameters used in equations (1), (2), (3), (4) and (5) regulate the shift of people among the different compartments. These parameters are discussed below:
• Beta (β): This parameter represents the transmission rate per capita. It denotes the number of persons who come in contact with an infected person per day.
• Epsilon (ε): This parameter denotes the proportion of exposed people who are infectious and unknowingly infecting other susceptible people.
• Alpha (α): It represents the onset rate, where 1/α is the average latent period.
• Gamma (γ): It represents the recovery rate, where 1/γ is the mean infectious period.
• Mu (μ): This parameter denotes the rate at which infected people become deceased.

To study the growth of infection, researchers have used the basic reproduction number, Ʀ0, and the effective reproduction number, Ʀe, as evaluation metrics in their work [49]. The basic reproduction number Ʀ0 is defined as the average number of secondary infections generated when one infected person is introduced into a host population where everybody is susceptible. Ʀ0 is calculated by equation (6). The effective reproduction number, Ʀe, is the average number of new infections generated by an infectious individual on day t in the partially susceptible population S. It is calculated by equation (7). Equations (6) and (7) show that the exposed infectious population increases the value of Ʀ0 by ε·(β/α), with a corresponding increase in Ʀe. Hence the parameter ε, representing the proportion of the exposed population that is infectious, contributes to the growth of COVID-19. A high value of this parameter indicates that there are more infectious exposed persons and that contact tracing should be done, so that these persons can be isolated to reduce the spread of the disease.
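The role of the five parameters and of the two reproduction numbers can be made concrete with a small simulation. The sketch below assumes mass-action dynamics in which new exposures are β(I + εE)S/N, takes Ʀ0 = β/(γ + μ) + εβ/α (the form obtained from a next-generation argument under that assumption, whose second term matches the ε·(β/α) contribution described above) and Ʀe = Ʀ0·S/N. The function names, the forward-Euler integrator and the parameter values are illustrative choices, not the fitted values or the implementation used in the paper.

```python
def seird_derivatives(state, beta, eps, alpha, gamma, mu):
    """Right-hand side of the assumed Modified SEIRD equations."""
    S, E, I, R, D = state
    N = S + E + I + R + D
    new_exposed = beta * (I + eps * E) * S / N   # infections caused by I and by eps*E
    dS = -new_exposed
    dE = new_exposed - alpha * E                 # exposed become infected at rate alpha
    dI = alpha * E - (gamma + mu) * I            # infected recover (gamma) or die (mu)
    dR = gamma * I
    dD = mu * I
    return (dS, dE, dI, dR, dD)


def simulate(state, params, days, steps_per_day=10):
    """Forward-Euler integration; returns one state per day, including day 0."""
    dt = 1.0 / steps_per_day
    history = [state]
    for _ in range(days):
        for _ in range(steps_per_day):
            derivs = seird_derivatives(state, **params)
            state = tuple(x + dt * dx for x, dx in zip(state, derivs))
        history.append(state)
    return history


def basic_reproduction_number(beta, eps, alpha, gamma, mu):
    """Assumed form: contribution from I plus contribution from infectious E."""
    return beta / (gamma + mu) + eps * beta / alpha


def effective_reproduction_number(r0, S, N):
    """Assumed form: R0 scaled by the susceptible fraction on a given day."""
    return r0 * S / N


# Illustrative (not fitted) parameter values and a population of one million.
params = dict(beta=0.30, eps=0.10, alpha=0.20, gamma=0.10, mu=0.01)
N = 1_000_000.0
initial = (N - 1.0, 0.0, 1.0, 0.0, 0.0)          # S0 = N - I0, with I0 = 1
history = simulate(initial, params, days=30)
r0 = basic_reproduction_number(**params)
re_day30 = effective_reproduction_number(r0, history[-1][0], N)
```

Setting `eps=0.0` in `params` removes the exposed contribution and reduces the sketch to an ordinary SEIRD model, which makes the effect of ε on both the trajectory and Ʀ0 easy to compare.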
Deep Learning (DL) is a branch of Machine Learning inspired by the working of the human brain. Some of the popular DL techniques are the Deep Neural Network (DNN), the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN). RNNs are an extension of DNNs that contain feedback links along with feed-forward connections [50]. Unlike a DNN, an RNN uses the previous output to compute the next output. RNNs can therefore process natural language, recognize speech, and perform image captioning more efficiently than other DL techniques [51]. However, RNNs suffer from the problem of vanishing and exploding gradients [52]. To solve these problems, LSTM networks are used. LSTM is an extension of RNN which allows the network to learn long-term dependencies in the input data. It processes and forecasts time-series data very efficiently. Each LSTM cell comprises three gates: the Input Gate, the Forget Gate, and the Output Gate. Figure 4 depicts these three gates present in each LSTM cell. Equations (8), (9), (10), (11), (12) and (13) are used in the LSTM cell; the weight and bias terms appearing in them are the trainable parameters of the cell. In each LSTM cell, the tanh activation function is used, which distributes the gradients computed in the backpropagation algorithm while training the LSTM network [51]. Hence, it addresses the vanishing and exploding gradient problem of RNN. Since the COVID-19 data is time-series data and LSTM models time-series data very well, we have used an LSTM network to model the COVID-19 data of India and its five states having the highest number of total cases.

In this paper, the experimentation was performed in Python using Jupyter Notebook and the PyCharm Integrated Development Environment. COVID-19 data was collected for India as well as its five states having the highest number of total cases. The epidemiological data till 15th August 2020 was taken from the covid19india.org website [14].
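To make the gate structure of the LSTM cell concrete, the following is a minimal sketch of one forward step of a scalar LSTM cell in the standard textbook formulation: forget, input and output gates, a candidate cell state, the cell-state update and the hidden output. The notation may differ from equations (8) to (13) of the paper, and the function name `lstm_cell_step` and the dictionary keys `'f'`, `'i'`, `'o'`, `'c'` are illustrative assumptions.

```python
import math


def sigmoid(x):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_cell_step(x_t, h_prev, c_prev, w, u, b):
    """One forward step of a scalar LSTM cell (standard formulation).

    w, u, b are dicts keyed by 'f', 'i', 'o', 'c' holding the input weight,
    recurrent weight and bias of the forget gate, input gate, output gate
    and candidate cell state respectively.
    """
    f_t = sigmoid(w['f'] * x_t + u['f'] * h_prev + b['f'])      # forget gate
    i_t = sigmoid(w['i'] * x_t + u['i'] * h_prev + b['i'])      # input gate
    o_t = sigmoid(w['o'] * x_t + u['o'] * h_prev + b['o'])      # output gate
    c_hat = math.tanh(w['c'] * x_t + u['c'] * h_prev + b['c'])  # candidate state
    c_t = f_t * c_prev + i_t * c_hat                            # new cell state
    h_t = o_t * math.tanh(c_t)                                  # new hidden state
    return h_t, c_t


# With all weights and biases at zero every gate equals 0.5 and the candidate
# state is 0, so the cell simply halves its previous cell state.
zeros = {key: 0.0 for key in ('f', 'i', 'o', 'c')}
h_t, c_t = lstm_cell_step(x_t=1.0, h_prev=0.0, c_prev=2.0, w=zeros, u=zeros, b=zeros)
```

In a library such as keras these quantities are vectors and the weights are matrices, but the additive cell-state update `c_t = f_t * c_prev + i_t * c_hat` has the same shape; that additive path is what lets information persist over many time steps.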
For experimentation, two different models have been used to make projections: the proposed Modified SEIRD model, an epidemiological model, and LSTM, a deep learning-based model. Further, the t-test has been used to compute the lower and upper estimates of the 90% confidence interval for the predictions, which are shown in Figures 11 (a-f) to 16 (a-f) for short-term predictions. The experimental setup for both of these models is described in the upcoming sub-sections.

For experimentation, the proposed Modified SEIRD model uses the parameters mentioned in Section 3 above. The values of these parameters, β, ε, α, γ and μ, have been estimated in this paper through curve fitting to actual data, performed using different functions of Python's lmfit library. The optimized values of these parameters have been obtained by the minimize function, which uses the dual annealing method to obtain a global optimal solution. The initial value of infected, I0, is taken as 1 based on the assumption that the infection started from one person. The initial values of exposed, E0, recovered, R0, and deceased, D0, are initialised to 0. The value of S0 is calculated by equation (14), which follows from N = S + E + I + R + D:

S0 = N − E0 − I0 − R0 − D0    (14)

A study of the variation in different parameters of the proposed Modified SEIRD model due to the Lockdowns imposed by the Indian government has also been included in the paper. By using the data of a [...]

The six different time-series of Daily Confirmed, Daily Recovered, Daily Deceased, Total Confirmed, Total Recovered, and Total Deceased for India and its states, obtained after preprocessing, have been used for creating one LSTM for each of these series. These models have been created using the keras API. Since LSTM is very sensitive to the range of the data provided to it as input, pre-processing was required to normalize the input data to [0,1]. Normalization was done by using equation (15) through the MinMaxScaler function of the sklearn library.
x_norm = (x − x_min) / (x_max − x_min)    (15)

After normalization, the input data was split in an 80-20 ratio into training and testing datasets respectively. Further, these datasets were divided into a feature set and an output value set. The dimensionality of the feature set and the output value set depends on the hyper-parameter number_of_previous_days. Its value signifies the number of previous days' outputs used to predict the output for the current day. Let n denote the length of the training set and k the value of number_of_previous_days. The dimensionality of the feature set and the output value set is then (n − k) × k and (n − k) × 1 respectively. Finally, these models were fine-tuned against the hyper-parameter number_of_previous_days for the data of India and its five states having the highest number of total cases. The value of this hyper-parameter ranges between 1 and 90, and the value giving the minimum testing loss was selected. The optimal values of this hyper-parameter for India and its five states having the highest number of total cases are tabulated in Table 1. The results obtained by both the proposed Modified SEIRD model and the LSTM model are presented in the next section.

This section discusses the results obtained by the proposed Modified SEIRD model and the LSTM models. The proposed Modified SEIRD model performs both long-term and short-term projections, whereas LSTM performs only short-term projections. Further, the 90% confidence intervals have been calculated using the t-test. The corresponding results are discussed in detail in the following sub-sections.

India is a geographically diverse nation, consisting of 36 states and union territories. Owing to India's large population and high population density, the Indian government implemented various nationwide lockdowns to curb the spread of Coronavirus.
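The pre-processing described for the LSTM input (min-max normalization to [0,1], the 80-20 split, and the sliding window controlled by number_of_previous_days) can be sketched in a few lines. The version below is a plain-Python stand-in for sklearn's MinMaxScaler and for whatever windowing code the authors used; the function names are hypothetical and the short series is made-up illustrative data.

```python
def min_max_normalize(series):
    """Scale values to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(series), max(series)
    if hi == lo:                      # constant series: avoid division by zero
        return [0.0 for _ in series]
    return [(x - lo) / (hi - lo) for x in series]


def split_series(series, train_fraction=0.8):
    """Chronological split; the first 80% is used for training by default."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def make_windows(series, number_of_previous_days):
    """Build a feature set of shape (n - k) x k and a target set of length n - k.

    Each feature row holds k consecutive days; its target is the next day.
    """
    k = number_of_previous_days
    features = [series[i:i + k] for i in range(len(series) - k)]
    targets = [series[i + k] for i in range(len(series) - k)]
    return features, targets


# Made-up daily counts, for illustration only.
daily_cases = [10, 12, 15, 19, 25, 31, 40, 52, 66, 85]
normalized = min_max_normalize(daily_cases)
train, test = split_series(normalized)
features, targets = make_windows(train, number_of_previous_days=3)
```

With a training series of length n = 8 and k = 3, this yields 5 feature rows of 3 values each and 5 targets. For keras, the feature rows would additionally be reshaped to (samples, time steps, 1) before being passed to an LSTM layer.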
Though the COVID-19 pandemic did not leave any state/ union territory unaffected, some states have been far more affected by this deadly disease, as compared to others. As on August 15, 2020, the top five Indian states with the highest number of Total Confirmed cases are: Maharashtra, Tamil Nadu, Andhra Pradesh, Karnataka and Delhi. The curve for Daily Confirmed cases for India have been shown in Figure 5 (a), while Figure 5 (b) presents the curves for Exposed, Active Infected, Recovered, Deceased and Total Confirmed cases for India. Table 2 shows the parameter values obtained for India by the proposed Modified SEIRD model. The results obtained for India witness a peak of COVID-19 cases for India on 3 rd November 2021 with 8,57,390 daily cases, and the disease may die out towards the end of year 2023. These results are based on the current situation; however, these may vary depending on various decisions taken by the Indian government and the precaution measures taken by people from time to time. It can be seen from Table 2 that the value of β, that represents the infection transmission rate per capita, increased during Lockdown 1.0. This resonates with the decision of the Central government to extend the lockdown. The effect of Lockdowns was also observed on ɛ, whose value reduced as lockdowns were extended and people became aware about the importance of precautionary measures. But during Unlock 1.0, the value of ɛ increased, thus causing a rise in the number of exposed people. It was observed that both the parameters ɛ and μ decreased over time, except during Unlock1.0 when many relaxations were given by the government, more people were exposed and the value of ɛ suddenly increased. The increase in the value of α signifies a reduction in the average latent period and an increased γ depicts a reduction in the time to recover 1/γ, resulting in increase in recovery rate among the Indian population. 
There has also been a reduction in the value of the reproduction number for India: it has reduced from Ʀ0 = 3.028 to Ʀe = 1.163 as on 15th August 2020. This indicates that there has been a reduction in the number of secondary infections produced by an infected person. This positive change in parameter values can be attributed to the restrictions imposed by the Indian authorities.

Next, Table 3 presents the parameter values obtained for the state of Maharashtra, while its forecasting results are shown in Figure 6 (a) and Figure 6 (b). For Maharashtra, the death rate μ has always remained higher than in most other Indian states, and the recovery rate γ has been lower than the national average. Due to the severity of the infection, the Maharashtra government decided to extend the duration of the restrictions imposed on its residents. However, several relaxations were offered during Unlock 3.0, and the impact of this has been captured by the β parameter of the proposed model. Its value shows a decreasing trend till Unlock 2.0, but an increase can be seen during Unlock 3.0. Surprisingly, a fall in the proportion of the Exposed population is observed at the same time. However, under the prevailing circumstances, the proposed model predicts more than 0.8 million COVID-19 deaths in the state by the end of year 2022.

Table 4 shows the parameter values obtained for Tamil Nadu, while its forecasting results are shown in Figure 7 (a) and Figure 7 (b). The parameter values in Table 4, computed from the model, indicate a hike in the recovery rate (γ) as well as the death rate (μ) in the state.

Next, Table 5 presents the parameter values obtained by the proposed model for Andhra Pradesh, while Figure 8 (a) and Figure 8 (b) present its forecasting results. For Andhra Pradesh, it can be seen that the per capita transmission rate of the disease, represented by β, increased significantly during the fourth Lockdown and the first Unlock period.
However, at the time of writing of this paper, Andhra Pradesh is the only state that has passed its COVID-19 peak. The state observed its peak on the first day of August 2020 with the maximum recorded Daily Confirmed cases of 11,122. This can also be verified from Figure 8 (a). There is a significant difference in the shape of the curves obtained for Andhra Pradesh and Tamil Nadu. The main reason behind this is that both these states saw a steep rise in the number of COVID-19 cases around the second half of the year 2020. This scenario can also be seen by observing the trend of their parameter values. The infection curve for Andhra Pradesh has started seeing a downward trend, which is also reflected by the increasing value of the recovery rate. Also, the value of the reproduction number for Andhra Pradesh has gone below 1, which clearly indicates the decline of this deadly disease in the state.

Next, Table 6 presents the parameter values for the state of Karnataka, while Figure 9 (a) and Figure 9 (b) present the forecasting results for the same. The results obtained for Karnataka show that there will be a peak of COVID-19 cases on 7th April 2021 with 1,24,722 Daily infections, and that the disease will die out towards the end of year 2022. The values of Ʀ0 and Ʀe for Karnataka were found to be 9.93 and 1.31 (as on 15th August 2020) respectively, as calculated by the proposed model. An interesting trend was observed in the values of β and ε: when one of these parameters decreases, the other increases, and vice versa. Overall, β has reduced over time, which depicts a reduction in the transmission rate per capita in this state. It was seen that during the first two Unlocks, ε also reduced. However, Unlock 3.0 witnessed a surge in its value, thereby indicating an increase in the number of exposed individuals as more relaxations were offered to the public.
On average, γ, the recovery rate, was found to have increased in the state, leading to a higher number of recovered individuals. The results also show that μ, which denotes the rate of the deceased population, has decreased over time.

Table 7 shows the parameter values for Delhi, and its forecasting results are shown in Figure 10 (a) and Figure 10 (b). Delhi, the national capital of India, has been severely affected by COVID-19 and remained among the top five states till 15th August. However, Delhi's rank has changed drastically in recent months, and it has shifted down from the second spot to the fifth spot. The reproduction number for Delhi has also come down from 3.21 to 1.06 as of 15th August 2020. Despite this, the projections for Delhi indicate that the current scenario is just a local minimum, and that its infection curve will reach its peak on 15th June 2021 with 6249 Daily Confirmed cases and more than 51k Active Infected people. According to the proposed Modified SEIRD model, Delhi will have more than 2 million Total cases by the end of 2022, out of which 98% would have recovered. This can be attributed to the high recovery rate of this union territory. However, the sharp decrease in the number of cases in Delhi has also brought the authorities under the scanner. Multiple news channels have questioned the quality of the tests being conducted for COVID-19 patients.

(16), (17) and (18) respectively. Case Fatality Rate =

The Recovery Rate and Case Fatality Rate on 15th August 2020 and on 14th September 2020 (30 days from now) for both the models are given in Table 8 and Table 9 respectively. Figure 11 (a-f) presents the projections made by both the models for Daily Confirmed cases in India and its five worst-affected states.
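The Recovery Rate and Case Fatality Rate reported in Table 8 and Table 9 can be illustrated with a short sketch. Since the defining equations are not reproduced in this text, the code below assumes the conventional definitions, in which both rates are percentages of the cumulative confirmed cases; the paper's own equations may differ. The function names and the case counts are hypothetical and are used only for illustration.

```python
def recovery_rate(total_recovered: int, total_confirmed: int) -> float:
    """Percentage of cumulative confirmed cases that have recovered (assumed definition)."""
    if total_confirmed <= 0:
        raise ValueError("total_confirmed must be positive")
    return 100.0 * total_recovered / total_confirmed


def case_fatality_rate(total_deceased: int, total_confirmed: int) -> float:
    """Percentage of cumulative confirmed cases that have died (assumed definition)."""
    if total_confirmed <= 0:
        raise ValueError("total_confirmed must be positive")
    return 100.0 * total_deceased / total_confirmed


# Hypothetical cumulative counts, not taken from the paper's data.
confirmed, recovered, deceased = 1000, 700, 20

print(f"Recovery Rate: {recovery_rate(recovered, confirmed):.2f}%")
print(f"Case Fatality Rate: {case_fatality_rate(deceased, confirmed):.2f}%")
```

Applying the same functions to the actual totals and to the 30-day projected totals of each model would give the two sets of values that the tables compare.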
The lower and upper estimates corresponding to the 90% confidence interval (CI) have also been shown in the graphs for the predictions made by both the models.

Figure 11 (a-f): Actual and Predicted Daily Confirmed cases for India and its five worst-affected states

As is evident from Figure 11, the predictions made by both the models were found to be closely similar to each other. Figure 12 (a-f) presents the projections made by both the models for Daily Recovered cases in India and its five worst-affected states, along with their 90% confidence intervals obtained through t-test.

Figure 12 (a-f): Actual and Predicted Daily Recovered cases for India and its five worst-affected states

Figure 13 (a-f) presents the projections made by both the models for Daily Deceased cases in India and its five worst-affected states, along with the lower and upper estimates corresponding to 90% confidence intervals.

Figure 13 (a-f): Actual and Predicted Daily Deceased cases for India and its five worst-affected states

To lessen the impact of any pandemic, it is important to control the number of deceased patients. The trend of Daily Deceased individuals for the next 30 days can be observed in Figure 13 (a-f). Figure 15 (a-f) presents the projections made by both the models for Total Recovered cases in India and its five worst-affected states, along with their confidence intervals obtained through t-test. Figure 16 (a-f) presents the projections made by both the models for Total Deceased cases in India and its five worst-affected states, along with their confidence intervals obtained through t-test.

With the exponential growth in the number of Corona Virus Disease 2019 (COVID-19) cases worldwide, countries need to prepare themselves with appropriate measures to tackle this epidemic. This can be achieved through proper projections that can help governments to take decisions accordingly and create more infrastructure, if required.
This projection is particularly important for a country like India, which ranks second in population size and has a high population density in several states. In this paper, a Modified SEIRD (Susceptible-Exposed-Infected-Recovered-Deceased) model was proposed to perform COVID-19 projections for India and its five states having the highest number of total cases. This model also considers asymptomatic infectious patients for the projections. Further, a Long Short-Term Memory (LSTM) model was also used in this paper to perform short-term projections. To predict the trend of the pandemic, data up to 15th August 2020 was utilised for experimentation purposes. The results obtained by the proposed Modified SEIRD model and the LSTM model were compared for the next 30 days. Lower and upper estimates of the 90% confidence intervals for the predictions were computed using Student's t-test. The effect of the different lockdowns imposed in India was also analysed and modelled using the proposed model.
The authors would like to acknowledge Prof. Daman Saluja and Ms. Apoorva Uboveja (B.R. Ambedkar Centre for Biomedical Research); Dr. Anita Mangla and Dr. Neeru Dhamija (Daulat Ram College), and Dr. Uma Chaudhary (Bhakaracharya College of Applied Sciences) of University of Delhi for the valuable discussion at initial level that created our interest in carrying out this research.