key: cord-1040110-sj714xke authors: Sahoo, Bijay Kumar; Sapra, Professor Balvinder Kaur title: A data driven epidemic model to analyse the lockdown effect and predict the course of COVID-19 progress in India date: 2020-06-20 journal: Chaos Solitons Fractals DOI: 10.1016/j.chaos.2020.110034 sha: eb975841eeba78e0546e026f972f3f54c46bed9c doc_id: 1040110 cord_uid: sj714xke
We propose a data driven epidemic model using real data on infection, recovery and death cases for the analysis of COVID-19 progression in India. The model assumes continuation of existing control measures, such as lockdown and quarantine of suspected and confirmed cases, and does not consider the scenario of a 2nd surge of the epidemic due to any reason. The model is obtained by least square fitting of an epidemic behaviour model, based on a theoretical formulation, to the real data of cumulative infection cases reported between 24 March 2020 and 30 May 2020. The predictive capability of the model has been validated with real data of infection cases reported during June 1-10, 2020. A detailed analysis of model predictions of the future trend of COVID-19 progress, individually in 18 states of India and in India as a whole, has been attempted. The infection rate in India, as a whole, is continuously decreasing with time and had fallen to about one third of the initial infection rate after 6 weeks of lockdown, suggesting the effectiveness of the lockdown in containing the epidemic. Results suggest that India, as a whole, could see the peak and the end of the epidemic in July 2020 and March 2021 respectively as per the current trend in the data. Active infected cases in India may touch 2 lakhs or a little above at the peak time and total infected cases may reach over 19 lakhs as per the current trend. State-wise results are discussed in the manuscript. However, the predictions may deviate, particularly for longer dates, as the assumptions of the model cannot always be met in a real scenario. 
In view of this, a real time application (COV-IND Predictor) has been developed which automatically syncs the latest data from the national COVID19 dashboard on a daily basis and updates the model input parameters and predictions instantaneously. This real time application can be accessed from the link: https://docs.google.com/spreadsheets/d/1fCwgnQ-dz4J0YWVDHUcbEW1423wOJjdEXm8TqJDWNAk/edit?usp=sharing and can serve as a practical tool for policy makers to track the peak time and maximum active infected cases based on the latest trend in the data, for medical readiness and for taking epidemic management decisions.
Coronavirus disease 2019 (COVID-19) is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1, 2]. The disease was first identified in December 2019 in Wuhan, the capital of China's Hubei province. Since then, the disease has spread all over the world. On March 11, 2020, the World Health Organization (WHO) formally declared the outbreak of the novel coronavirus a global pandemic. As of June 01, 2020, a total of 6,152,160 cases had been confirmed in more than 227 countries and 26 cruise ships, with 3,142,964 active cases and 371,700 deaths [3]. The first case of the 2019-20 coronavirus pandemic in India, originating from China, was reported on January 30, 2020, and India has now become the most affected country in Asia. As of June 01, 2020, the Ministry of Health and Family Welfare has confirmed a total of 190,535 cases, 91,819 recoveries and 5,394 deaths in the country. For India, the fatality rate is relatively low at 3.09%, against the global 6.63% as of 20 May 2020. Six cities account for about half of all reported cases in the country: Mumbai, Delhi, Ahmedabad, Chennai, Pune and Kolkata [4, 5]. 
On March 22, 2020, India observed a 14-hour voluntary public curfew, followed by a nationwide lockdown from March 24, 2020, besides several other measures such as quarantine of suspected cases, public health guidelines on social distancing, frequent hand washing and wearing a face mask while stepping out of home for essential services. Modelling and predicting the course of the outbreak in each region is important for the management and containment of the epidemic, and for balancing the impact of the public health crisis against that of the economic crisis. The majority of COVID-19 epidemic models have originated from the SIR (Susceptible, Infected, and Recovered or Removed) model [6], and its many variations have been used in several countries, such as India [7], China [8, 9], Italy [10, 11] and Brazil [12]. These SIR-type models help policy makers to gauge the potential impact of the pandemic and prompt them to take early action to minimise it. However, once the pandemic has broken out, more information is required for detailed planning, such as the arrival time of the peak, the number of hospital beds needed at the peak time, and when to relax or lift the lockdown and finally return to normal living. A recent study published in Nature [13], based on real data from 11 European countries, reveals that major non-pharmaceutical interventions, and lockdown in particular, have a considerable effect on reducing transmission. Several studies are now becoming available on the analysis of various control measures to contain the spread of COVID-19 in various countries [14-17]. In our study, we propose a data driven epidemic model to analyse the lockdown effect in India, the country with the 2nd largest population in the world, and to predict the course of COVID-19 progress for medical readiness using the latest data on cumulative infection cases and on cases removed due to recovery and death. 
The model has the advantage that it does not depend on the susceptible population, a key parameter required for SIR type models. However, it has the disadvantage that it cannot be used when the epidemic has just started and the data are limited. The model has been implemented in a Google sheet for real time analysis of the epidemic trend based on the latest data, and it predicts various important parameters such as the peak time and the number of active infected cases at the peak time. These data will be helpful for the arrangement of medical resources and for taking epidemic related management decisions. The manuscript presents the theoretical formulation and development of the model for cumulative, daily and active infected cases, and its application to the COVID-19 scenario in 18 individual states of India and in India as a whole. It analyses the lockdown effect and predicts the course of epidemic progress as discussed in the following sections. Finally, a link to the real time application (COV-IND Predictor) has been provided for daily updates and predictions.
Let N(t) be the number of total infected cases at time t. The rate of change of total infected cases can be expressed as
$$\frac{dN(t)}{dt} = \lambda_i(t)\, N(t) \tag{1}$$
where λi(t) represents the infection rate at time t. Generally, the infection rate represents the number of contacts per person per unit time, and it decreases with various control measures such as quarantine and lockdown [18, 19]. Let us consider a scenario of continued lockdown until the epidemic comes to a near end. It is also assumed that the infection rate in the population is highest at the start of lockdown, decreases exponentially as the lockdown period increases, and finally approaches zero after a sufficient time. With this, the transient variation of the infection rate subsequent to lockdown can be written as
$$\lambda_i(t) = \lambda_0\, e^{-t/\tau} \tag{2}$$
where λ0 is the initial infection rate at the time of implementing lockdown. 
τ is the characteristic time of decrease, which depends upon societal factors, the extent of implementation of the lockdown in the society, the number of quarantined persons, the number of samples tested etc. Substituting the expression for λi(t), Eq. (1) can be written as
$$\frac{dN(t)}{dt} = \lambda_0\, e^{-t/\tau}\, N(t) \tag{3}$$
If N0 is the number of total infected cases at the time of implementing lockdown (t = t0), the solution to the above equation can be written as
$$N(t) = N_0 \exp\left[\lambda_0 \tau \left(e^{-t_0/\tau} - e^{-t/\tau}\right)\right] \tag{4}$$
This can be re-expressed as
$$\ln N(t) = k_1 - k_2\, e^{-t/\tau} \tag{5}$$
where
$$k_1 = \ln N_0 + \lambda_0 \tau\, e^{-t_0/\tau} \quad \text{and} \quad k_2 = \lambda_0 \tau \tag{6}$$
The number of infected cases, N(t) (using Eqs. 5 and 6), can be expressed as
$$N(t) = \exp\left(k_1 - k_2\, e^{-t/\tau}\right) \tag{7}$$
The model presented in Eq. (7) is also known as the Gompertz function [20], originally proposed to describe the law of human mortality under the assumption that a person's resistance to death decreases exponentially with age. A similar assumption has been made for the variation of the infection rate under the lockdown scenario in the present study. In a real scenario, this assumption cannot always be met, and the trend of the infection rate may change for various reasons (e.g. movement of migrant workers from one state to another, mild relaxation of lockdown rules). In such cases, it is advisable to carry out a trend analysis of the latest data, update the model input parameters and then make the predictions. The best approach, however, is to develop a real time application which fetches data from the national dashboard at regular intervals and makes predictions based on the latest trend analysis. Differentiating Eq. (7) with respect to t, the number of new infected cases per day (i.e. daily new infected cases), Nn(t), can be obtained as follows:
$$N_n(t) = \frac{dN(t)}{dt} = \frac{k_2}{\tau} \exp\left(k_1 - \frac{t}{\tau} - k_2\, e^{-t/\tau}\right) \tag{8}$$
Since those who are admitted to hospital either recover after a hospital stay of T days, or may die after a similar number of days, there should be a delayed relationship between the number of daily new infected cases, Nn(t), and the number of daily removed cases, Nr(t), due to recovery and death.
Hence, Nr(t) can be related to Nn(t) by the following relation
$$N_r(t) = N_n(t - T), \quad t > T \tag{9}$$
where T is the mean recovery time of COVID-19 patients. This can be determined through time-lag correlation analysis between daily new cases and daily removed cases in a region where the epidemic has nearly come to an end. Similarly, the number of active infected cases, a(t), can be estimated by taking the difference between cumulative new infected cases and cumulative removed cases up to time t, i.e.
$$a(t) = \int_0^t N_n(t')\, dt' - \int_0^t N_r(t')\, dt' \tag{10}$$
Simplifying, we get
$$a(t) = \int_{t-T}^{t} N_n(t')\, dt' \quad \text{or, for discrete data,} \quad a(t) = \sum_{i=t-T+1}^{t} N_n(i) \tag{11}$$
This indicates that the number of active infected cases at time t is the sum of the daily new infected cases over the period of T days preceding t. The integral is used for the continuous functions of the model, while the sum is used for the discrete real data. Maximum medical resources are required when the active cases attain their maximum. Hence, predicting the maximum number of active cases and the time when this maximum will be attained is of utmost importance for planning and arranging medical resources such as hospital beds, ventilators and personal protective equipment for health care providers. The rate of change of active infected cases at any time t can be written as
$$\frac{da(t)}{dt} = N_n(t) - N_r(t) \tag{12}$$
Let tp be the time at which the active cases attain their peak; the peak occurs if
$$\left.\frac{da(t)}{dt}\right|_{t = t_p} = 0 \tag{13}$$
This condition leads to
$$N_n(t_p) = N_r(t_p) \tag{14}$$
After the peak is attained, the newly recovered and deceased cases start to exceed the newly infected cases. The demand for medical resources, such as hospital beds, isolation wards and respirators, starts to decrease beyond this peak. Using Eqs. (8) and (9), Nr(t) can be expressed as
$$N_r(t) = \frac{k_2}{\tau} \exp\left(k_1 - \frac{t - T}{\tau} - k_2\, e^{-(t-T)/\tau}\right) \tag{15}$$
Taking the ratio of Nn(t) and Nr(t) as given in Eq. (8) and Eq. (15) respectively, and using the condition given in Eq. (14) for the peak time, tp can be obtained as:
$$t_p = \tau \ln\left[\frac{k_2 \tau \left(e^{T/\tau} - 1\right)}{T}\right] \tag{16}$$
Eq. (16) can be used to obtain the time when active infected cases will attain their peak, given the characteristic time constant (τ), the recovery time (T) and the fitting parameter k2. 
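The model quantities derived above can be sketched in a few lines of Python. This is only an illustration, assuming the Gompertz form N(t) = exp(k1 - k2 exp(-t/τ)), the delay relation Nr(t) = Nn(t - T), and the peak condition Nn(tp) = Nr(tp); the function names and the parameter values are hypothetical and are not taken from Table 1. The helper time_to_fraction, which gives the time at which N(t) reaches a chosen fraction of its saturation value exp(k1), is likewise an assumption about how an "end time" could be computed.

```python
import math

def cumulative_cases(t, k1, k2, tau):
    """Cumulative infected cases: N(t) = exp(k1 - k2 * exp(-t / tau))."""
    return math.exp(k1 - k2 * math.exp(-t / tau))

def daily_new_cases(t, k1, k2, tau):
    """Daily new cases: Nn(t) = dN/dt = (k2 / tau) * exp(-t / tau) * N(t)."""
    return (k2 / tau) * math.exp(-t / tau) * cumulative_cases(t, k1, k2, tau)

def active_cases(t, k1, k2, tau, T):
    """Active cases: integral of Nn over the last T days = N(t) - N(t - T)."""
    return cumulative_cases(t, k1, k2, tau) - cumulative_cases(t - T, k1, k2, tau)

def peak_time(k2, tau, T):
    """Time tp at which Nn(tp) = Nn(tp - T), i.e. active cases peak."""
    return tau * math.log(k2 * tau * (math.exp(T / tau) - 1.0) / T)

def time_to_fraction(fraction, k2, tau):
    """Time at which N(t) reaches `fraction` of its saturation value exp(k1)."""
    return tau * math.log(k2 / -math.log(fraction))

# Hypothetical parameters chosen only for illustration (not from Table 1).
k1, k2, tau, T = 14.0, 20.0, 40.0, 15.0
tp = peak_time(k2, tau, T)
peak_active = active_cases(tp, k1, k2, tau, T)
t99 = time_to_fraction(0.99, k2, tau)
```

Because the integral of Nn(t) is N(t) itself, the active cases are evaluated here as a difference of two cumulative values rather than by numerical integration.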
Once the peak time tp is estimated, the number of active infected cases at t = tp can be estimated using Eq. (11).
The first case of the 2019-20 coronavirus pandemic in India was reported on January 30, 2020. India has observed a nationwide lockdown since March 24, 2020 (the 55th day after the 1st case) to control the epidemic, in addition to several other measures. As of June 01, 2020, the Ministry of Health and Family Welfare has confirmed a total of 190,535 cases, 91,819 recoveries and 5,394 deaths in the country. About 18 states had exceeded 100 confirmed cases as on May 01, 2020 [4, 5]. These states have been considered for model analysis and prediction. The state-wise break up of confirmed cases as on March 24 (start date of lockdown) and as on June 01, 2020, along with statistics of samples tested for these states, is provided in Table 1. The time series data of confirmed cases between March 24 and May 30, 2020 have been converted into logarithmic values as required by the model (Eq. 5) and least square fitted using the 'Origin' software. The inbuilt function (ExpDec1) available in 'Origin' has been selected for the fitting analysis, which is expressed as:
$$Y = Y_0 + A_1\, e^{-x/t_1} \tag{17}$$
where x is taken as time (days) and Y is the logarithmic value of the number of cumulative infected cases up to x days. Comparing Eq. (17) with the model given in Eq. (5), the model parameters k1, k2 and τ can be obtained from the fitting parameters (Y0, A1 and t1) using the following equations
$$k_1 = Y_0, \quad k_2 = -A_1, \quad \tau = t_1 \tag{18}$$
The curve fitted to the data of India, as a whole, is shown in Fig 1. A similar least square fitting exercise has been carried out for the selected states as well. The derived fitting parameters for the selected states and India, as a whole, are also presented in Table 1. Subsequently, the models for each state and for India as a whole have been tested against the real data of confirmed cases reported during June 01-10, 2020 to find the deviation of the predicted values and to test the validity of the model. 
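The fitting step described above can be sketched as follows. This is not the Origin routine used in the study: it is a minimal stand-in, assuming the ExpDec1 form Y = Y0 + A1 exp(-x/t1), that exploits the fact that the model is linear in Y0 and A1 once t1 is fixed. Those two parameters are solved in closed form, and t1 is located by a coarse grid scan followed by a golden-section search. The function name, the search bounds and the synthetic data are all hypothetical.

```python
import math

def fit_exp_decay(x, y, t1_min=5.0, t1_max=500.0, grid_steps=200, iters=100):
    """Least-squares fit of y = y0 + a1 * exp(-x / t1); returns (y0, a1, t1)."""
    n = len(x)

    def solve_linear(t1):
        # For fixed t1, ordinary least squares gives y0 and a1 in closed form.
        e = [math.exp(-xi / t1) for xi in x]
        se, see = sum(e), sum(ei * ei for ei in e)
        sy, sey = sum(y), sum(ei * yi for ei, yi in zip(e, y))
        a1 = (n * sey - se * sy) / (n * see - se * se)
        y0 = (sy - a1 * se) / n
        sse = sum((yi - y0 - a1 * ei) ** 2 for ei, yi in zip(e, y))
        return sse, y0, a1

    # Coarse geometric grid over t1 to bracket the best value.
    grid = [t1_min * (t1_max / t1_min) ** (i / grid_steps)
            for i in range(grid_steps + 1)]
    best = min(range(len(grid)), key=lambda i: solve_linear(grid[i])[0])
    lo, hi = grid[max(best - 1, 0)], grid[min(best + 1, grid_steps)]

    # Golden-section refinement of t1 inside the bracket.
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c, d = hi - g * (hi - lo), lo + g * (hi - lo)
        if solve_linear(c)[0] < solve_linear(d)[0]:
            hi = d
        else:
            lo = c
    t1 = 0.5 * (lo + hi)
    _, y0, a1 = solve_linear(t1)
    return y0, a1, t1

# Synthetic, noise-free log-counts from hypothetical parameters (not real data).
days = list(range(55, 123))
log_cases = [14.0 - 20.0 * math.exp(-d / 40.0) for d in days]

y0, a1, t1 = fit_exp_decay(days, log_cases)
k1, k2, tau = y0, -a1, t1   # mapping of Eq. (18)
lambda0 = k2 / tau          # initial infection rate (per day)
```

With real, noisy counts the recovered parameters carry uncertainty, so the fit would normally be repeated as new data arrive, as the text recommends.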
The maximum percentage deviation of the model predictions is given in Table 1. The results show that the deviations are within 10 % in most of the states, except for a few states such as Assam, Haryana, Karnataka and Odisha, where it appears that a 2nd surge is emerging. In these states, the model predictions may not be accurate with the existing fitting parameters, which need to be updated using the upcoming real data. It is important to note that the model is derived on the basis of certain assumptions, as highlighted in the model formulation section, and does not consider the scenario of a 2nd surge due to any reason. Hence, the model parameters need to be updated in such a changed scenario.
Using Eq. (1), one can express the time-dependent infection rate as:
$$\lambda_i(t) = \frac{1}{N(t)} \frac{dN(t)}{dt} = \frac{d\left[\ln N(t)\right]}{dt} \tag{19}$$
Using Eq. (2), the infection rate (per day) can be estimated from the parameters λ0 and τ. τ is the characteristic time obtained directly from the least square fitting analysis and is given in Table 1. λ0 can be obtained as the ratio of the fitting coefficient k2 and τ. The characteristic time constant, τ, governs the decreasing trend of the infection rate: the higher this value, the slower the decrease in the rate of infection. Table 1 provides the value of the characteristic time constant for various states. Comparing the infection rate of various states and of India as a whole with that of countries such as USA, Italy, Germany and China, one can observe that the infection rate is about 2-3 times lower in India. However, the decreasing trend is not as fast as in USA, Italy, Germany and China [2]. This may be due to the low testing of samples in India, particularly during the initial period of the epidemic.
Now that the epidemic in Kerala appears to have come to an end, the data from this state have been used to perform a cross correlation between daily new cases and cases removed due to recovery and death during the period between March 14, 2020 and April 30, 2020. The plot of the correlation factor, normalised with respect to its maximum value, against the time lag is shown in Fig 3. 
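The time-lag correlation analysis described above can be sketched as follows. The exact correlation estimator used in the study is not stated, so this sketch assumes a plain sliding product sum between the two daily series, normalised by its maximum over the lags tested. The function name, the maximum lag of 30 days and the synthetic series are hypothetical.

```python
import math

def lag_correlation(new_cases, removed_cases, max_lag=30):
    """Correlate daily new cases with daily removed cases shifted by each lag.

    Returns (best_lag, normalised), where normalised[lag] is the sliding
    product sum divided by its maximum over all lags tested.
    """
    n = min(len(new_cases), len(removed_cases))
    corr = [
        sum(new_cases[t] * removed_cases[t + lag] for t in range(n - lag))
        for lag in range(max_lag + 1)
    ]
    peak = max(corr)
    return corr.index(peak), [c / peak for c in corr]

# Synthetic series (not real data): a bell-shaped outbreak whose removals
# repeat the new-case curve 15 days later.
new_cases = [100.0 * math.exp(-((day - 25) / 8.0) ** 2) for day in range(80)]
removed_cases = [new_cases[day - 15] if day >= 15 else 0.0 for day in range(80)]

best_lag, normalised = lag_correlation(new_cases, removed_cases)
```

The lag that maximises the correlation plays the role of the mean recovery time T. This approach only makes sense for a region where the outbreak has largely run its course, so that both series cover the full rise and fall.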
As Fig 3 shows, the correlation attains its maximum when the time lag is 15 days, i.e. the peak of daily removed cases lags the peak of daily new cases by 15 days. This lag is taken as the mean recovery time, T, for COVID-19 patients. Typically, the reported value of the mean recovery time for COVID-19 infection varies from 14 to 16 days [2]. This recovery time of 15 days has been used for the other states and for India as a whole to estimate the peak of active infected cases. Subsequent to the estimation of the mean recovery time and of the model parameters through the least square fitting exercise, predictions have been made for daily, cumulative and active infected cases with time. In this analysis, January 30, 2020 (date of the 1st reported case in India) is designated as time t=1, and accordingly March 24, 2020 (start date of lockdown) is t=55. Predictions have been made only for t>55, until the cumulative infection cases attain saturation. Fig 4 shows the plot of the predicted mean, minimum and maximum of total infected cases, new daily cases and active infected cases in India with time. Fig 5, Fig 6 and Fig 7 show the plots of predicted total infected cases, new daily cases and active infected cases with time, respectively, in selected states of India. One can refer to these plots to find the approximate time of the peak and of the near end of the epidemic, as well as the number of active infected cases at the peak time and the saturation number of cases at the end time, in various states of India and in India as a whole. Table 2 provides state-wise results for the predicted time to reach the peak of the epidemic, the time to attain 99 % of the total infected cases (~ end time of the epidemic), and the number of active cases and the saturation number of total infected cases, with lower and upper bounds that account for the error margin in the derived model input parameters. These results are very useful for the planning and arrangement of medical resources. Results suggest that India, as a whole, could see the peak of the epidemic in the month of July 2020. 
Some states, such as Madhya Pradesh and Punjab, have already seen the peak of the epidemic by this time, while Andhra Pradesh, Assam, Gujarat, Jammu & Kashmir, Jharkhand, Karnataka and Rajasthan may see the peak in the month of June, 2020. Maharashtra, Haryana, Odisha, Telangana, Uttar Pradesh and West Bengal may see the peak of the epidemic in the month of July, 2020, while Delhi, Tamil Nadu and Bihar may see the peak of the epidemic in August, 2020. Results on active infected cases at the peak time suggest that active infected cases for India, as a whole, may go a little above 200K (K stands for thousands, here onwards). The most affected states, Maharashtra and Tamil Nadu, and the national capital Delhi may see active infected cases of up to 57K, 42K and 41K respectively. Gujarat, West Bengal, Uttar Pradesh and Rajasthan may see between 5K and 10K active infected cases at the peak time, while the remaining states may see fewer than 5K active infected cases.
As is well known, model assumptions cannot always be met in real scenarios, and hence the predictions may deviate depending upon changes in the trend of the latest data due to several factors such as movement of migrant workers and relaxation of epidemic control measures. In this context, a real time application (COV-IND Predictor) has been developed by implementing the above model in a Google sheet, which automatically syncs the latest data from the national COVID19 dashboard [5] on a daily basis, updates the model input parameters using the inbuilt forecast function and makes the predictions instantaneously for future dates. 
The application can be accessed from the link: https://docs.google.com/spreadsheets/d/1fCwgnQ-dz4J0YWVDHUcbEW1423wOJjdEXm8TqJDWNAk/edit?usp=sharing While the predictions in this manuscript are based on real data only up to May 30, 2020, it is advisable to check the latest predictions from the above link, which gives a more reliable prediction of the COVID-19 scenario in India, as a whole, and in the individual selected states based on the latest trend in the COVID-19 infected cases. This real-time application can serve as a helping hand for policy makers to track the peak time and maximum active infected cases based on the latest trend in the data, for medical readiness and for taking epidemic management decisions periodically.
We propose a data-driven model to track and predict the course of the epidemic. Many parameters characterizing an epidemic can be determined from the model using the latest available data, and the model can be validated against a few real data sets. Subsequently, the model can be used for predictions. The presented approach can be applied not just to the current COVID-19 epidemic but also, in general, to future epidemics. The model gives its best predictions when used as an online type predictor, utilising the latest data to update the model input parameters periodically and to predict the course of the epidemic for the next two weeks. The model is of special significance for predicting the approximate peak time and end time of the epidemic, so as to keep maximum resources ready during the peak time. The model captures well the observed decrease in the infection rate post lockdown, thus confirming the effectiveness of the lockdown in containing the epidemic. The model has been implemented in a Google sheet which can serve as a practical tool for epidemic management decisions such as medical resource planning, the required number of daily tests and ultimately the relaxation of lockdown rules and regulations in order to balance the impact from the public health vs. 
the economic crisis [21-24].
The authors declare no competing interests.
Coronavirus disease 2019 (COVID-19)-Symptoms and causes
A data driven model for predicting the course of COVID-19 epidemic with application for China
Ministry of Health and Family Welfare, Government of India. www.mohfw.gov.in [Retrieved 1
A contribution to the mathematical theory of epidemics
Age-structured impact of social distancing on the COVID-19 epidemic in India
A conceptual model for the corona virus disease 2019 (COVID-19) outbreak in Wuhan, China with individual reaction and governmental action
Mathematical modelling of the dynamics of the Corona virus COVID-19 epidemic development in China
Analysis and forecast of COVID-19 spreading in China, Italy and France
Chinese & Italian COVID-19 outbreaks can be correctly described by a modified SIRD model 2020
Modelling and forecasting the Covid-19 pandemic in Brazil
Imperial College COVID-19 Response Team Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe
COVID-19: towards controlling of a pandemic
Taking the right measures to control COVID-19
Positive effects of COVID-19 control measures on influenza prevention
COVID 19 in INDIA: Strategies to combat from combination threat of life and livelihood
An updated estimation of the risk of transmission of the novel coronavirus (2019-nCov)
Eric Avila-Vales. A data driven analysis and forecast of an SEIARD epidemic model for COVID-19 in Mexico
On the nature of the function expressive of the law of human mortality and on a new mode of determining the value of life contingencies
Comparing non-pharmaceutical interventions for containing emerging epidemics
When is contact tracing not enough to stop an outbreak?
Evaluation of control measures implemented in the severe acute respiratory syndrome outbreak in Beijing
Quantitative evaluation on control measures for an epidemic: A case study of COVID-19
Acknowledgement: The authors would like to thank Shri Suresh Babu RM, Associate Director, Health, Safety and Environment Group of BARC, for his valuable suggestions and inputs during internal review of the manuscript.