key: cord-1022711-zcevubab authors: Selvamuthu, Dharmaraja; Khichar, Deepak; Kalita, Priyanka; Jain, Vidyottama title: Estimation of Mortality Rate of COVID-19 in India using SEIRD Model date: 2021-09-20 journal: OPSEARCH DOI: 10.1007/s12597-021-00557-x sha: e370183d85c44ae2ed1eeb1d9ce8aee0219b66e7 doc_id: 1022711 cord_uid: zcevubab
In India, the number of infections increased rapidly, with a mounting death toll, during the second wave of coronavirus disease (COVID-19). The mortality rate plays an important role in measuring the severity of the disease. In this research work, the mortality rate of COVID-19 is estimated by using the Susceptible-Exposed-Infected-Recovered-Dead (SEIRD) epidemiological model. As the disease involves a significant amount of uncertainty, a fundamental SEIRD model with minimal assumptions is employed. Further, a basic method is proposed to obtain time-dependent estimates of the parameters of the SEIRD model by using historical data. From the proposed model and the predictive analysis, the number of infections is expected to rise in May 2021 and the mortality rate could go as high as 1.8%. Such high rates of mortality may be used as a measure to understand the severity of the situation.
The number of COVID-19 infections in India has made it the second-worst affected country in the world, with more than 350,000 reported cases. There is a notable imbalance between the supply of and the demand for medicines and oxygen cylinders in the healthcare system. With every passing day the death toll rises, with 3000 deaths reported every day due to the COVID-19 pandemic. This infectious disease is historically different in the following ways. The asymptomatic cases, one of the causes of virus transmission among the healthy population, go practically unnoticed by the healthcare system. Also, the government policies, social movements and actions by healthcare departments of any country are not yet well studied in epidemiology.
In the literature, a variety of stochastic and mathematical models with different complexity levels have been presented and discussed ([1] [2] [3] [4], etc.). A few of them validated the COVID-19 pandemic results and predicted the trajectory of COVID-19 in order to match the available data of the region under consideration ([1] [2] [3], etc.). Note that all these models follow one common approach, i.e., introducing distinguishable compartments for all the possible states of a person living in an affected region. The Stanford Model [5] assumed different compartments such as exposed, symptomatic, pre-symptomatic, hospitalized, and recovered for a COVID-19 patient's journey from beginning to end. This assumption leads to a large number of unpredictable interactions when the parameters of the mathematical models are tuned accordingly. The rapid rise in COVID-19 cases made exact data recording challenging and unreliable. Consequently, mathematical modelling with such data may be questionable and the confidence intervals of the obtained results may be uncertain. Thus, the trade-off between the descriptive detail and the accuracy of the mathematical model should be given the greatest preference. The best way to measure mortality is to calculate the mortality rate rather than to count the number of deaths, as the number of deaths is heavily influenced by the number of people who are at risk of dying. Therefore, the mortality rate is considered the fundamental criterion for assessing the state of any disease. In this research work, the mortality rate is estimated with the historical data. It is observed that the results shown in this work agree closely with the actual data and have a low absolute percentage error. The rest of this paper is organized as follows: Sect. 2 presents a review of the relevant mathematical models developed for the COVID-19 outbreak.
The Susceptible-Exposed-Infected-Recovered-Dead (SEIRD) model is proposed and the mechanism of mortality rate calculation is described in Sect. 3. Further, a case study of the COVID-19 pandemic using India data is discussed in Sect. 4. Finally, concluding remarks are summarized in Sect. 5.
During the COVID-19 pandemic, the prediction and the mortality rate may be estimated by using various epidemiological models, namely, Susceptible-Infected-Recovered (SIR), Susceptible-Exposed-Infectious-Removed (SEIR), Susceptible-Exposed-Infectious-Removed-Death (SEIRD), etc. In the last three decades, much research has been carried out to measure the mortality rate. A few relevant works are as follows. Piccolomini and Zama [6] proposed a forced SEIRD model for analysis and forecast of the COVID-19 spread in the Italian region. By comparing the predicted results with the actual data, it was concluded that the model is quickly adapted to monitor various infected areas at different epidemic stages. To predict the COVID-19 outbreak, its dynamic behaviour should be considered as the major factor. Therefore, Rapolu et al. [7] studied a time-dependent SEIRD model to anticipate the COVID-19 transmission. A SEIRD model based on partial differential equations was studied by Viguerie et al. [8]. They presented a strong qualitative agreement between the simulated data of the spatio-temporal COVID-19 spread and epidemiological data for the Italian region of Lombardy. The identification and estimation of the SEIRD model for COVID-19 are discussed in [9]. The identification of the model from the observed number of deaths and confirmed cases is not up to the mark. Thus, by using the Monte Carlo method, Korolev [9] found some fairly accurate estimates for reproduction numbers. Ala'raj et al. [10] developed a dynamic hybrid model based on SEIRD and studied the properties related to the COVID-19 pandemic.
Their model analyzed the real-time data and provided long-term and short-term forecasts with confidence intervals. He et al. [2] proposed a SEIR model based on the conditions of hospitals, quarantine, and external inputs for COVID-19 spread. Lopez and Rodo [11] presented a modified SEIR compartmental model by incorporating the effects of varying proportions of lockdowns responsible for minimizing the COVID-19 infection. Further, Carcione et al. [12] extended a SEIR epidemiological model to analyze tailored measures of epidemic control, which included group-specific protection and the use of tracing apps. Roda et al. [13] showed, using the Akaike Information Criterion (AIC), that a SIR model performed better than a SEIR model in representing the confirmed-case data for COVID-19. Mondal and Antonopoulos [14] adapted SIR modelling for the prediction of the novel COVID-19 disease and studied its functionality. To track the transmission and recovery rates at time t, Chen et al. [1] developed a time-dependent SIR model for COVID-19. Okuonghae and Omame [15] discussed the impact of various non-pharmaceutical control measures (government and personal) on the population dynamics of COVID-19 for Lagos, Nigeria. Through numerical simulations, they showed the effect of control measures such as common social distancing, use of face masks, and case detection on the dynamics of COVID-19 spread. Several other researchers discussed different approaches to predict the COVID-19 outbreak and to capture the trend of various COVID-19 cases. Singhal et al. [4] studied two different models: the first is a parametric model for various parameters relating to the spread of the virus, while the second is a non-parametric model based on the Fourier decomposition method (FDM), fitted on the available data. They provided the results for India, Italy, and the United States of America (USA).
According to Rader et al. [16], the degree to which COVID-19 cases were compressed into a short period was strongly shaped by population aggregation and heterogeneity. By pairing their estimates with globally comprehensive data on human mobility, they predicted that crowded cities worldwide could experience more prolonged epidemics. Different Auto-Regressive Integrated Moving Average (ARIMA) models were developed to predict the epidemiological behavior of COVID-19 in Italy, Spain, and France [3]. Worldwide mortality during January 2020-June 2021, which was mostly affected by COVID-19, was analyzed by Johns Hopkins University & Medicine [17]. Liang et al. [18] studied the factors related to cross-country variation in COVID-19 mortality. The COVID-19 mortality rate is positively related to the proportion of the population aged 65 or older and to transport infrastructure quality. However, it is negatively related to the COVID-19 test number per 100 people, government effectiveness, and the number of hospital beds. An optimal control model for the spreading of the COVID-19 pandemic was discussed by Dhaiban and Jabbar [19]. The effect of population density on COVID-19 spread and related mortality in India was investigated by Bhadra et al. [20]. By using correlation and regression analysis of infection and mortality rates due to COVID-19, they found a moderate relation between COVID-19 spread and population density.
In this section, a variant of the SEIR epidemiological model is used, namely SEIRD, which computes the number of people infected with (or dead from) a contagious illness in a total population at a given time. The total population, say $N$, is partitioned into five compartments:
• $D_n$: The number of deceased individuals till the nth day.
• $R_n$: The number of recovered individuals till the nth day.
• $I_n$: The number of infectious individuals on the nth day. These individuals may infect others. This compartment is called the infectious individuals compartment.
• $E_n$: The number of exposed individuals on the nth day. These individuals are already infected with the virus but are not yet infectious. This compartment is called the exposed individuals compartment. Over time, these individuals may spread the disease once they transition into the infectious individuals compartment.
• $S_n$: The number of susceptible individuals on the nth day. When a susceptible person comes in close contact with an infectious one, he or she may contract the virus and then transitions into the exposed individuals compartment.
Therefore, for a given nth day, we have $S_n + E_n + I_n + R_n + D_n = N$. While transitioning from one compartment to another, the individuals' behavior is governed by the following system of equations:

$$
\begin{aligned}
S_{n+1} &= S_n - \sum_{i=1}^{I_n} \beta_n^i, \\
E_{n+1} &= E_n + \sum_{i=1}^{I_n} \beta_n^i - \sum_{i=1}^{E_n} \varepsilon_n^i, \\
I_{n+1} &= I_n + \sum_{i=1}^{E_n} \varepsilon_n^i - \sum_{i=1}^{I_n} \gamma_n^i - \sum_{i=1}^{I_n} \delta_n^i, \\
R_{n+1} &= R_n + \sum_{i=1}^{I_n} \gamma_n^i, \\
D_{n+1} &= D_n + \sum_{i=1}^{I_n} \delta_n^i,
\end{aligned}
\tag{1}
$$

where
- $\beta_n^i$ is the number of individuals that are currently in $S_n$ and will be infected by the ith individual of $I_n$ on the nth day. These will transition into $E_{n+1}$.
- $\varepsilon_n^i$ is 1 or 0 depending on whether the ith individual of $E_n$ will go to $I_{n+1}$ (and hence become infectious) or not.
- $\gamma_n^i$ is 1 or 0 depending on whether the ith individual of $I_n$ will transition to $R_{n+1}$ (after recovering from the disease) or not.
- $\delta_n^i$ is 1 or 0 depending on whether the ith individual of $I_n$ will transition to $D_{n+1}$ (after dying from the disease) or not.
Here, for $n, i \in \mathbb{N}$, $\varepsilon_n^i$, $\gamma_n^i$, and $\delta_n^i$ are assumed to be independently and identically distributed random variables. Let $\varepsilon$, $\gamma$, and $\delta$ be the respective expectations of $\varepsilon_n^i$, $\gamma_n^i$, and $\delta_n^i$. Here, $\varepsilon$ represents the rate at which exposed individuals become infectious, $\gamma$ represents the probability of an infectious individual recovering in a day, and $\delta$ represents the probability of an infectious individual dying in a day. Another parameter, $\beta$, represents the number of people with whom an infectious individual interacts per day. The case of $\beta_n^i$ is more subtle than it seems: for all $n, i \in \mathbb{N}$, these are assumed to be independently and identically distributed if the population is infinite. Due to the finite population, however, infectious individuals are bound to make contact with those who have already recovered and hence are immune for the time being.
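One way to read the stochastic system described above is as a daily update in which every individual draws an independent Bernoulli outcome. The following is a minimal sketch under stated assumptions, not the authors' implementation: the names `beta`, `eps`, `gamma` and `delta` (contact rate, exposed-to-infectious probability, daily recovery probability, daily death probability) and the helper functions are illustrative; each susceptible individual is assumed to be exposed with probability `beta * I / N`, so that the expected number of new exposures is `beta * I * S / N`; recovery and death are assumed to be mutually exclusive on a given day; and the initial state and `beta = 0.3` are hypothetical.

```python
import random


def seird_day(state, beta, eps, gamma, delta, rng):
    """Advance (S, E, I, R, D) by one day using Bernoulli transitions."""
    S, E, I, R, D = state
    N = S + E + I + R + D
    # Assumption: each susceptible is exposed with probability beta * I / N,
    # which gives beta * I * S / N new exposures in expectation.
    p_exposed = min(1.0, beta * I / N)
    new_exposed = sum(rng.random() < p_exposed for _ in range(S))
    # Each exposed individual becomes infectious with probability eps.
    new_infectious = sum(rng.random() < eps for _ in range(E))
    # Assumption: an infectious individual recovers, dies, or stays
    # infectious (mutually exclusive outcomes within one day).
    new_recovered = new_dead = 0
    for _ in range(I):
        u = rng.random()
        if u < gamma:
            new_recovered += 1
        elif u < gamma + delta:
            new_dead += 1
    return (
        S - new_exposed,
        E + new_exposed - new_infectious,
        I + new_infectious - new_recovered - new_dead,
        R + new_recovered,
        D + new_dead,
    )


def simulate(state, days, beta, eps=0.2, gamma=0.1, delta=0.0012, seed=0):
    """Return the list of daily states, starting with the initial state."""
    rng = random.Random(seed)
    path = [state]
    for _ in range(days):
        state = seird_day(state, beta, eps, gamma, delta, rng)
        path.append(state)
    return path


# Hypothetical population of 10,000 with 50 exposed and 50 infectious people.
path = simulate((9_900, 50, 50, 0, 0), days=30, beta=0.3)
```

Because individuals only move between compartments, the five counts sum to the same total on every simulated day, which mirrors the constraint that the compartments partition the population.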
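Since $\beta$ can be considered as the number of people with whom an infectious individual interacts per day, and the probability that a contacted individual is susceptible on the nth day is

$$
\frac{S_n}{N}, \tag{2}
$$

the expectation of the random sum of $\beta_n^i$ is, by Wald's equation and the process governing equations, given by [21]

$$
\mathbb{E}\left[\sum_{i=1}^{I_n} \beta_n^i\right] = \beta \, \frac{S_n}{N} \, I_n. \tag{3}
$$

Similarly, the expectations of the random sums of the other random variables are [21]

$$
\mathbb{E}\left[\sum_{i=1}^{E_n} \varepsilon_n^i\right] = \varepsilon E_n, \tag{4}
$$

$$
\mathbb{E}\left[\sum_{i=1}^{I_n} \gamma_n^i\right] = \gamma I_n, \tag{5}
$$

$$
\mathbb{E}\left[\sum_{i=1}^{I_n} \delta_n^i\right] = \delta I_n. \tag{6}
$$

Thus, given the compartment sizes on the nth day, the expected numbers of individuals in the different compartments are as follows [21]:

$$
\mathbb{E}[S_{n+1}] = S_n - \beta \, \frac{S_n}{N} \, I_n, \tag{7}
$$

$$
\mathbb{E}[E_{n+1}] = E_n + \beta \, \frac{S_n}{N} \, I_n - \varepsilon E_n, \tag{8}
$$

$$
\mathbb{E}[I_{n+1}] = I_n + \varepsilon E_n - \gamma I_n - \delta I_n, \tag{9}
$$

$$
\mathbb{E}[R_{n+1}] = R_n + \gamma I_n, \tag{10}
$$

$$
\mathbb{E}[D_{n+1}] = D_n + \delta I_n. \tag{11}
$$

In the proposed model, the values of the parameters are taken as $\varepsilon = 0.2$ and $\gamma = 0.1$, as mentioned in [22]. The parameters $\beta$ and $\delta$ will be estimated by using the historical data in Sect. 4. The mortality rate on the nth day, say $\mathrm{MR}_n$, which depends on the ratio of the number of people who died due to COVID-19 to the number of people who got infected on that day, is calculated as

$$
\mathrm{MR}_n = \frac{\text{number of deaths on the } n\text{th day}}{\text{number of infections on the } n\text{th day}}. \tag{12}
$$

This may provide inaccurate results. The reason is that patients who died on a particular day were infected by the virus roughly two weeks earlier. To obtain better results, this lag should be taken into account. Backer et al. [23] mentioned that the time between the contraction of the virus and the onset of symptoms is approximately 14 days. On the other hand, Huang et al. [24] and Wang et al. [25] reported that the median time from the onset of symptoms to admission into the intensive care unit (ICU) is about ten days. Additionally, the World Health Organization (WHO) mentioned that the time between the onset of symptoms and death ranges from two to eight weeks [26]. Hence, there is a need to introduce a mortality rate based on the ratio of the number of people who died to the number of people who got infected a few days earlier. Therefore, the true mortality rate, say $\mathrm{TMR}_n$, which is obtained by dividing the number of deaths on the nth day by the number of confirmed COVID-19 infections $k$ days before that nth day, is defined as

$$
\mathrm{TMR}_n = \frac{\text{number of deaths on the } n\text{th day}}{\text{number of confirmed infections on the } (n-k)\text{th day}}, \tag{13}
$$

where $k \in \mathbb{N}$. In this work, $k$ is taken as 14, as mentioned in [23]. The model described in Sect.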
3 will be employed to obtain the results for the prediction of the mortality rate with the India data [27]. From March 2021 onwards, the number of infections per day was higher than in the earlier reported cases in India. Consequently, India unexpectedly entered deep into the second wave of coronavirus. In the total tally of infections, India reached second position after the USA. The main cause of this sudden surge was considered to be a mutant of the earlier strain, commonly known as the double mutant. This particular strain was also found in many other countries, such as the UK and South Africa. Though it has been a subject of global interest, it is still not confirmed scientifically. The transmission of the virus from an infected person to a susceptible one is very similar to what happened in the first wave (Oct 2020). Since the spread of this virus depends on location, person, etc., it is difficult to characterize the behavior of this virus uniformly all over India. Another prime factor behind the massive increase in the number of cases is the increased testing drive.
It is very important that parameter values are estimated appropriately and efficiently. The values of the parameters $\varepsilon$ and $\gamma$ are taken as provided by Kuniya et al. [22]. This section is devoted to estimating the values of $\beta$ and $\delta$ by using the historical data for India [28]. These parameters are prone to fluctuate due to their correlation with the political state of the region. Though a lockdown is supposed to be the best way to reduce virus transmission, lifting previously imposed restrictions may increase in-person interaction. Further, it is also observed that these parameters differ between regions. Therefore, the parameters should be estimated from the data to relate the model to the actual scenario. Kuniya et al.
[22] proposed an estimation method in which the parameter is a simple random variable and the estimate is a single value. Since such methods may be inaccurate at times, a basic method to find time-dependent estimates of the parameters $\beta$ and $\delta$ from historical data is presented in this paper to improve the accuracy of the model. The parameter $\delta$ is estimated by taking the average of $\hat{\delta}_n$, where, by using Eq. (11), $\hat{\delta}_n$ is calculated as follows:

$$
\hat{\delta}_n = \frac{\Delta D_n}{I_{n-1}}.
$$

Here, $\Delta D_n$ represents the difference between the number of deceased individuals on the (n − 1)th day and the nth day, i.e., $\Delta D_n = D_n - D_{n-1}$, and $I_{n-1}$ represents the number of infected individuals on the (n − 1)th day. By using the historical data, $\delta$ is estimated as 0.0012. Further, the following steps are used for the estimation of $\beta$, say $\hat{\beta}_n$.
Step 1: By using Eq. (9) and the data on the number of infected individuals till the nth day, the number of exposed individuals may be calculated by

$$
E_{n-1} = \frac{\Delta I_n + (\gamma + \delta) I_{n-1}}{\varepsilon},
$$

where $\Delta I_n$ represents the difference between the number of infected individuals on the (n − 1)th day and the nth day, i.e., $\Delta I_n = I_n - I_{n-1}$, $\gamma = 0.1$ [22] and $\varepsilon = 0.2$ [22].
Step 2: By using Eq. (8), the estimate for $\beta$ is made from the exposed individuals as

$$
\hat{\beta}_n = \frac{N \left( \Delta E_{n-1} + \varepsilon E_{n-2} \right)}{S_{n-2} \, I_{n-2}},
$$

where $\Delta E_{n-1}$ represents the difference between the number of exposed individuals on the (n − 2)th day and the (n − 1)th day, i.e., $\Delta E_{n-1} = E_{n-1} - E_{n-2}$.
Results of the proposed estimation process of $\beta$ and $\delta$ are shown in Figs. 2 and 3, respectively. These estimates are for the first wave in India. Notice that there is an overall downward trend with seasonal fluctuations. Such behavior may be attributed to the incessant efforts of the authorities and the public to keep the virus in check. The second wave, as discussed earlier, is assumed to follow similar transition patterns as the first wave. The next section discusses the predictions made for the second wave. The expected numbers of infectious and deceased individuals are predicted using the model (Eq.
(1)) and data [27] provided in the previous sections. Infectious means the total number of active infections at a point of time, which is equal to the cumulative sum of all daily reported infections minus the cumulative sum of daily removals (removals include both the recovered and deceased individuals). For deceased individuals, the average daily number of deaths is predicted. To run the model and simulate the number of infectious individuals, a starting point is needed. This work makes predictions from April 2021 to August 2021. Naturally, 1st April 2021 is chosen as the initial point and the model is run using Eq. (9). Figure 4 exhibits the predictions for the number of active infections. It is observed from Fig. 4 that in the one-month prediction, i.e., in May 2021, the absolute percentage error for infectious individuals, calculated as

$$
\frac{\left| I_n(\text{actual}) - I_n(\text{predicted}) \right|}{I_n(\text{actual})} \times 100\%,
$$

is less than 20%. After the one-month prediction, the absolute percentage error increases, and from June 15, 2021 onwards it decreases. The second wave of coronavirus in India looks much harsher than the first one. The peak infections are much ahead and the daily impending death toll is creating havoc within the country. From the predictions, it is inferred that the peak of the second wave is expected to occur somewhere in May 2021 with expected active infections of well over 4 million. The second wave is higher than the first one, but the number of infections is expected to return to a manageable scale by August 2021. The predictions for the number of daily deceased individuals are shown in Fig. 5. It is noticed from Fig. 5 that in the one-month prediction, i.e., in May 2021, the absolute percentage error is less than 30%.
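The parameter estimation steps of Sect. 4 and the forward projection described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the index conventions, and the synthetic series used to exercise the functions are all hypothetical, and the updates follow the expected-value form of the compartment equations (new exposures of `beta * I * S / N` per day). The sketch generates a synthetic trajectory with known parameters and then checks that the estimators recover them; with reported data, the series would instead come from the case counts.

```python
def project(state, days, beta, eps, gamma, delta):
    """Iterate the expected-value SEIRD updates for a number of days."""
    S, E, I, R, D = state
    N = S + E + I + R + D
    out = [state]
    for _ in range(days):
        new_exposed = beta * I * S / N
        # The right-hand side is evaluated with the previous day's values.
        S, E, I, R, D = (
            S - new_exposed,
            E + new_exposed - eps * E,
            I + eps * E - (gamma + delta) * I,
            R + gamma * I,
            D + delta * I,
        )
        out.append((S, E, I, R, D))
    return out


def estimate_delta(D, I):
    """Average the daily ratios (D[n] - D[n-1]) / I[n-1]."""
    ratios = [(D[n] - D[n - 1]) / I[n - 1] for n in range(1, len(D)) if I[n - 1] > 0]
    return sum(ratios) / len(ratios)


def estimate_exposed(I, eps, gamma, delta):
    """Step 1: solve the infectious update for the exposed count E[n-1]."""
    return [((I[n] - I[n - 1]) + (gamma + delta) * I[n - 1]) / eps
            for n in range(1, len(I))]


def estimate_beta(E, I, S, N, eps):
    """Step 2: solve the exposed update for beta, one estimate per day."""
    return [N * ((E[m + 1] - E[m]) + eps * E[m]) / (I[m] * S[m])
            for m in range(len(E) - 1)]


def ape(actual, predicted):
    """Absolute percentage error, in percent."""
    return abs(actual - predicted) / actual * 100.0


# Synthetic check with hypothetical values (beta = 0.3 is illustrative).
path = project((9_900.0, 50.0, 50.0, 0.0, 0.0), days=20,
               beta=0.3, eps=0.2, gamma=0.1, delta=0.0012)
S_obs, E_obs, I_obs, R_obs, D_obs = (list(col) for col in zip(*path))
delta_hat = estimate_delta(D_obs, I_obs)
E_hat = estimate_exposed(I_obs, 0.2, 0.1, delta_hat)
beta_hat = estimate_beta(E_hat, I_obs, S_obs, N=10_000.0, eps=0.2)
```

On noise-free synthetic data the daily estimates coincide with the generating parameters; on reported data they fluctuate from day to day, which is why the text averages them or treats them as time-dependent.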
The absolute percentage error for the number of daily deceased individuals is calculated as

$$
\frac{\left| \text{daily deaths (actual)} - \text{daily deaths (predicted)} \right|}{\text{daily deaths (actual)}} \times 100\%.
$$

After the one-month prediction, the absolute percentage error fluctuates slightly but stays below 37%, and from June 23, 2021 onwards it decreases. The daily count of deaths due to COVID-19 is expected to peak near the end of May 2021, with the count going as high as 5000. In the next section, the mortality rate is estimated and predicted using the results obtained here.
As mentioned earlier, the usual mortality rate calculated using Eq. (12) is fundamentally inaccurate because of the lag between the time of infection and the time of death. Figure 6 shows the estimates for the mortality rate. Further, the TMR formula is used to obtain a more accurate mortality rate and the results are shown in Fig. 7. TMR is found to be somewhat higher than the usual one. TMR shows high death rates and an upward trend during the period March-April 2021, which seems realistic. It is claimed that TMR provides a more accurate picture of the disease. Further, Eq. (13) is applied to the predicted numbers of infections and deaths from the model, as depicted in Fig. 8. It predicts that the mortality rate is expected to stay high (more than 1%) for most of the time post-March 2021, while peaking at a value of almost 1.8% near the beginning of May 2021. The absolute percentage error for TMR is calculated as $|\mathrm{TMR(actual)} - \mathrm{TMR(predicted)}| \times 100\%$. Also, in the initial predicted month for TMR, i.e., in May 2021, the absolute percentage error is less than 30%. After the one-month prediction, the absolute percentage
In this paper, the SEIRD model is presented as a tool to predict the dynamic behavior of COVID-19 cases in India using available historical data of the reported cases.
The SEIRD or SEIR model consists of susceptible, exposed, infectious, recovered and dead cases, while the SIR model is restricted to susceptible, infectious and recovered cases. Since the SIR model does not consider the latent stage (the exposed individuals), it would be an inappropriate model to employ in the proposed scenario. Therefore, the SEIR and SEIRD models can provide a better approximation as compared to the SIR model [12]. The mortality rate of COVID-19 is estimated by taking into account the dynamic behavior of the spread of the disease. It was inferred that the infections are expected to peak in May 2021 and the mortality rate could go as high as 1.8%. Such high rates may be used as a measure to understand the severity of the pandemic. An advantage of this predictive methodology is that it can be applied in any region for which historical data is available. This analysis will be useful in making important decisions such as health care policies in India. In future work, the estimated results can be validated by a simulation model. The proposed model can also be extended by obtaining a well-established reproduction rate (R) along with the mortality rate. Also, the effects of scenarios (base, best, worst, etc.) on a monthly basis, which would help with policies on restrictions such as lockdowns, can be incorporated in future.
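The difference between the usual mortality rate and the lagged true mortality rate (TMR) used in this paper can be illustrated with a short sketch. The function names and the synthetic series below are hypothetical and are not taken from the paper's data: daily cases are assumed to grow by 5% per day, and deaths are assumed to equal 1.5% of the cases reported 14 days earlier. Both rates are returned in percent.

```python
def mortality_rate(daily_deaths, daily_cases, n):
    """Same-day ratio of deaths to infections on day n, in percent."""
    return 100.0 * daily_deaths[n] / daily_cases[n]


def true_mortality_rate(daily_deaths, daily_cases, n, k=14):
    """Deaths on day n divided by infections on day n - k, in percent."""
    if n < k:
        raise ValueError("need at least k days of case history before day n")
    return 100.0 * daily_deaths[n] / daily_cases[n - k]


# Hypothetical series: cases grow 5% per day, and 1.5% of the people
# infected on a given day die exactly 14 days later.
cases = [1000.0 * 1.05 ** d for d in range(40)]
deaths = [0.0] * 14 + [0.015 * cases[d - 14] for d in range(14, 40)]

mr = mortality_rate(deaths, cases, 30)
tmr = true_mortality_rate(deaths, cases, 30, k=14)
```

In this constructed example the lagged ratio recovers the assumed 1.5%, while the same-day ratio is roughly half of it, because the denominator has grown during the 14 days between infection and death. This is the effect the paper describes when it finds TMR to be higher than the usual mortality rate during a rising wave.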
A time-dependent SIR model for COVID-19 with undetectable infected persons
SEIR modeling of the COVID-19 and its dynamics
Estimation of COVID-19 prevalence in Italy, Spain, and France
Modeling and prediction of COVID-19 pandemic using Gaussian mixture model
Potential Long-Term Intervention Strategies for COVID-19
Monitoring Italian COVID-19 spread by a forced SEIRD model
A Time-Dependent SEIRD Model for Forecasting the COVID-19 Transmission Dynamics
Simulating the spread of COVID-19 via a spatially-resolved susceptible-exposed-infected-recovered-deceased (SEIRD) model with heterogeneous diffusion
Identification and estimation of the SEIRD epidemic model for COVID-19
Modeling and forecasting of COVID-19 using a hybrid dynamic model based on SEIRD with ARIMA corrections
A modified SEIR model to predict the COVID-19 outbreak in Spain and Italy: simulating control scenarios and multi-scale epidemics
A simulation of a COVID-19 epidemic based on a deterministic SEIR model. Front
Why is it difficult to accurately predict the COVID-19 epidemic?
A SIR model assumption for the spread of COVID-19 in different communities
Analysis of a mathematical model for COVID-19 population dynamics in
Crowding and the shape of COVID-19 epidemics
Mortality Analysis
COVID-19 mortality is negatively associated with test number and government effectiveness
An optimal control model of COVID-19 pandemic: a comparative study of five countries
Impact of population density on Covid-19 infected and mortality rate in India
Introduction to probability and stochastic processes with applications
Prediction of the epidemic peak of coronavirus disease in Japan
Incubation period of 2019 novel coronavirus (2019-ncov) infections among travellers from Wuhan, China
Clinical features of patients infected with 2019 novel coronavirus in Wuhan
Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in Wuhan, China
Ministry of Health and Family Welfare
Introduction to statistical methods, design of experiments and statistical quality control
We thank the editor and the anonymous reviewers for their insightful comments, which helped in improving the paper. We thank Mr. Nimesh Sangwan for preparing the initial draft of this paper. This research work is supported by the Department of Science and Technology, India.
Authors Contribution: All authors confirm the responsibility for the following: study conception and design, data collection, analysis and interpretation of results, and manuscript preparation.
The datasets generated and/or analysed during the current study are available in the following weblink: Ministry of Health and Family Welfare, Government of India. http://nhm.gov.in/
Ethical approval: Manuscripts reporting studies involving human participants, human data or human tissue: Not applicable.
Consent for publication: The manuscript does not contain data from any individual person; hence, it is "Not applicable".