key: cord-201898-d1vbnjff authors: Jha, Vishwajeet title: Forecasting the transmission of Covid-19 in India using a data driven SEIRD model date: 2020-06-08 journal: nan DOI: nan sha: doc_id: 201898 cord_uid: d1vbnjff

The infections and fatalities due to the SARS-CoV-2 virus for cases specific to India have been studied using a deterministic susceptible-exposed-infected-recovered-dead (SEIRD) compartmental model. One of the most significant epidemiological parameters, namely the effective reproduction number of the infection, is extracted from the daily growth rate data of reported infections and is included in the model with a time variation. We evaluate the effect of the control interventions implemented so far and estimate the numbers of infections and deaths averted by these restrictive measures. We further provide a forecast of the extent of future Covid-19 transmission in India and predict the probable numbers of infections and fatalities under various potential scenarios.

Almost every continent of the planet is grappling with a large number of infections caused by the virus called severe acute respiratory syndrome coronavirus 2, SARS-CoV-2 [1].
These infections, which may result in a mild to severe symptomatic disease called coronavirus disease 2019 (COVID-19), were first detected in Wuhan, a city in central China [2, 3]. The infections later spread across the globe and have forced nations to undertake drastic measures to minimize the loss of human lives [4, 5]. For a country like India, which has a large and dense population (≈ 1.4 billion), the cause for concern is particularly high. It is therefore of special importance to study the spread of COVID-19 in India and to make reliable predictions that can help in mitigating its effects. Such timely information may be crucial for devising strategies for containment of infections and for estimating the requirements of medical facilities. In India, an early and complete nation-wide lock-down was imposed from 25th March, when the cumulative number of SARS-CoV-2 infections was around 650. These strict measures prevented a large-scale disaster, slowed the rate of infections in the initial stages and helped in the geographical containment of the epidemic. However, recent days have seen no real decrease in the rate of infections, perhaps due to the gradual weakening of restrictive measures for pragmatic social and economic reasons. From 1st June, India continues to have a complete lock-down only in defined containment zones where the infection rates are high. This gradual easing of the lock-down has been necessitated by the fact that life and livelihoods are intertwined: a complete, extended lock-down cannot be sustained for very long without collateral losses to the most vulnerable sections of society, which calls for more intelligent strategies. Alternative steps based on the isolation of infected patients through lock-down in the containment zones, together with more widespread testing and contact-tracing, are being followed to control the rate of infection.
This represents a transition from a suppression strategy to a mitigation strategy for handling any potential outbreak, but the efficacy of these steps remains to be seen, as the execution of these policies at the ground level is challenging. The transmission dynamics of a viral epidemic in any population is an interplay of various factors related to viral, immunologic, environmental and sociological conditions. A number of mathematical and physical models have been proposed to understand the evolution of epidemics, with the aim of making reliable predictions that help governments formulate proper policies and response plans for effective control of the disease [6-8]. Simple deterministic mathematical models based on differential equations have been used extensively to provide information on the transmission mechanism of various viral epidemics. The SIR model is one of the simplest epidemiological models; it divides the population among three compartments, the susceptible, the infected and the recovered (or deceased) populations, and determines their time evolution [9]. The SEIR model [10] is a simple extension of the SIR model in which an additional compartment of exposed population with a latency period is introduced. It is more appropriate for a COVID-19-like epidemic, which has an inherent latency and asymptomatic transmission [11]. Extended models have also been employed that use separate compartments for various sub-populations, such as the asymptomatic, quarantined or hospitalized, or for variations according to, for example, age or gender [12, 13]. However, this entails the incorporation of many unknown parameters and uncertain initial conditions, for which the information is either not available or carries large uncertainties. In the present article, we employ a dynamic SEIRD model in which the population of deaths is included as a separate compartment in the SEIR model.
arXiv:2006.04464v1 [q-bio.PE] 8 Jun 2020

Several studies have already been performed in the Indian context to explain the COVID-19 dynamics in the initial phase of its transmission [14-18]. We incorporate the crucial contact-rate parameter with a time variation that connects its value at the beginning of the epidemic to the current reduced value. The reduction in the contact rate has been achieved through many isolation measures, primarily the imposed nation-wide lock-down. The time variation in the contact rate β(t) is determined through the effective reproduction number R_i(t), which is in turn related to the doubling time of the infection growth [19-21]. We integrate this parameter in the SEIRD model calculations and estimate the role of interventions in preventing probable infections and deaths up to now. Further, we consider different potential scenarios for the growth rate of infections to make projections of SARS-CoV-2 transmission, and we forecast the probable numbers of infections and fatalities in the coming times. The projections indicate the extent of the suppression and containment strategies that need to be employed to mitigate the impact of Covid-19. It is to be mentioned that the results obtained in this work are to be used for research purposes only. The data for the present study are collected from the repository hosted at the website https://www.worldometers.info/coronavirus [22] for cases specific to India. The epidemiology of the COVID-19 outbreak is studied using a deterministic SEIRD model with five compartments governed by a set of ordinary differential equations (Eq. 1), where S(t) is the susceptible population, E(t) is the exposed population, I(t) is the infectious population, R(t) is the recovered population, D(t) is the number of deaths at any instant t, and N = S + E + I + R + D is the total population.
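For orientation, a standard SEIRD system consistent with the five compartments listed above, and with the rate parameters φ, β, γ and δ that the text introduces next, can be written as follows. This is the usual textbook form and is an assumption here; it is not reproduced from the paper's Eq. 1, and in particular the exact way the death rate δ enters the paper's equations is not confirmed by the text.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta(t)\,\frac{S\,I}{N}, \\
\frac{dE}{dt} &= \beta(t)\,\frac{S\,I}{N} - \phi\,E, \\
\frac{dI}{dt} &= \phi\,E - \gamma\,I - \delta\,I, \\
\frac{dR}{dt} &= \gamma\,I, \\
\frac{dD}{dt} &= \delta\,I .
\end{aligned}
```

In this form the right-hand sides sum to zero, so N = S + E + I + R + D stays constant in time.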
We have not included separate compartments for the asymptomatic, quarantined or hospitalized populations, or for variations according to age or gender, as these increase the number of unknown parameters and therefore lead to large uncertainties in the predictions. In any case, these numbers can be estimated in an average way from their relations to the populations that have been considered. In addition, no re-infection of the recovered population is assumed, as there is no evidence to the contrary. The parameters of the above set of equations are the latent period of being exposed, A = 1/φ, which is related to the incubation period of the virus; the contact period of infection, B = 1/β; the period needed to become free of infection, G = 1/γ, commonly known as the recovery time; and the parameter corresponding to death of the infected population, D = 1/δ (not to be confused with the death compartment D(t)). These parameters determine the transitions that occur across the compartments as time evolves. The parameters A, G and D are specific to the characteristics of SARS-CoV-2 and only weakly correlated with the health responses of a country, and are therefore expected to have similar values across countries. The parameter B represents the strength (speed) of the virus transmission, which is intimately related to the containment measures undertaken by a specific country. Apart from these parameters, the fraction of the total population that is susceptible at the beginning, α = S(0)/N, is a very important parameter. Taking the total population of the region as S(0) may lead to a gross overestimation of case numbers, because part of the population may be inherently immune, less affected by the virus or living in isolated conditions. Furthermore, the extent of the initial exposed (latent) population, defined by the ratio E(0)/I(0), may also be an important parameter, as it indicates the presence of undetected or asymptomatic exposed individuals at the beginning.
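The role of these parameters can be illustrated with a minimal numerical sketch. It assumes the standard SEIRD rate equations (S to E at rate βSI/N, E to I at rate φE, I to R at rate γI, I to D at rate δI) and a simple explicit Euler integrator; neither choice is confirmed as the author's implementation. The values A = 5.1 days, G = 12.7 days, β = 0.167 and α = 0.1 are quoted elsewhere in the text, while `eps` (a hypothetical name for E(0)/I(0)), the demo value of `delta` and the population size are illustrative assumptions.

```python
def seird_rhs(state, beta, phi, gamma, delta):
    """Time derivatives of (S, E, I, R, D) for an assumed standard SEIRD system."""
    S, E, I, R, D = state
    N = S + E + I + R + D
    infection = beta * S * I / N   # S -> E
    onset = phi * E                # E -> I
    recovery = gamma * I           # I -> R
    death = delta * I              # I -> D
    return (-infection, infection - onset, onset - recovery - death, recovery, death)


def simulate(state, beta_of_t, phi, gamma, delta, days, steps_per_day=10):
    """Explicit Euler integration; returns one state tuple per whole day."""
    dt = 1.0 / steps_per_day
    state = tuple(state)
    history = [state]
    for day in range(days):
        for k in range(steps_per_day):
            t = day + k * dt
            rates = seird_rhs(state, beta_of_t(t), phi, gamma, delta)
            state = tuple(x + dt * dx for x, dx in zip(state, rates))
        history.append(state)
    return history


# Demo set-up. alpha, A, G and the constant beta are taken from the text;
# n_total, eps and delta are illustrative assumptions, not fitted values.
n_total = 1.4e9
alpha = 0.1
eps = 3.0                      # hypothetical E(0)/I(0)
i0, rec0, d0 = 88.0, 10.0, 2.0
e0 = eps * i0
s0 = alpha * n_total - e0 - i0 - rec0 - d0
history = simulate(
    (s0, e0, i0, rec0, d0),
    beta_of_t=lambda t: 0.167,  # constant contact rate
    phi=1 / 5.1,
    gamma=1 / 12.7,
    delta=0.0025,               # assumed for the demo only
    days=60,
)
```

Passing a function for `beta_of_t` is a design choice that lets the same integrator handle both a constant and a time-varying contact rate.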
One of the most significant parameters describing the pandemic is the basic reproduction number of the infection, R_0, defined as the number of individuals in an uninfected, susceptible population that are infected by one infected individual under normal conditions [6, 19]. Determining R_0 in terms of the parameters of a deterministic model is challenging, as it requires estimates of model parameters that are uncertain [23]. During the spread of the epidemic one can define an effective reproduction number R_i(t), a time-dependent quantity that changes because of control measures and the depletion of the susceptible population. It provides dynamic information on the strength of the epidemic transmission as time evolves. In general, the infection continues to expand if R_i(t) is greater than 1, while the epidemic eventually stops if R_i(t) is persistently less than 1. The estimation of the effective reproduction number from data is complicated and many models have been proposed for its determination [24, 25]. Here we use a simple method based on fitting the growth rate of the incidence data with a Gaussian-shaped distribution to determine the behaviour of R_i(t). It must be mentioned, however, that the reported data have an inherent delay compared with the instantaneous population numbers required to estimate its actual value. In SIR-type models and their simple extensions, such as the one described above, R_i(t) can be expressed in terms of the rate parameters and the susceptible fraction S(t)/N. In the initial stages of the infection, S(t) ∼ N and R_i = β/γ, since γ ≫ δ. The R_i(t) value can be estimated from the initial doubling time T_d of the number of infections [20]. The T_d value can be determined by fitting the reported growth in the cases of infection, which is exponential at the beginning of the epidemic, with the daily growth rate r(t) determined from the data of reported cases of infections.
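A short sketch of how such quantities are commonly computed from a daily growth rate is given below. It assumes the standard SIR-type relations R_i(t) = β(t) S(t) / [(γ + δ) N], R_i ≈ 1 + r G in the early phase (from r = β − γ), and T_d = ln 2 / ln(1 + r), which reduces to ln 2 / r for small r. These are textbook forms assumed for illustration and are not confirmed as the paper's own equations; the function names are hypothetical.

```python
import math


def doubling_time(r):
    """Doubling time in days for a daily fractional growth rate r (e.g. 0.05 for 5%)."""
    return math.log(2) / math.log(1 + r)


def doubling_time_small_r(r):
    """Small-r approximation, using ln(1 + r) ~ r."""
    return math.log(2) / r


def reproduction_number(r, infectious_period):
    """Early-phase SIR-type estimate R = 1 + r * G (assumed relation)."""
    return 1 + r * infectious_period


def effective_reproduction_number(beta, gamma, delta, susceptible_fraction):
    """Assumed SEIRD form: R_i = beta * (S/N) / (gamma + delta)."""
    return beta * susceptible_fraction / (gamma + delta)
```

As a rough check under these assumptions, a growth rate of about 4.5% per day with G = 12.7 days gives a reproduction number near 1.6 and a doubling time of roughly 16 days, which is close to the values quoted in the text.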
The values of r(t) are extracted from the reported daily growth rate of infections from 14th March to 28th May (day 76) with a 9-day moving average. They are fitted with a Gaussian-shaped function in which a, b, σ and t_0 are the fit parameters. These parameters are determined by a best-fit approach through local minimization of the sum of squared errors. The resulting fit to the daily growth rate is shown in Fig. 1a, along with a band for the standard error on the fit parameters. In addition, projections for the days after 28th May are shown for various probable scenarios as straight lines, which are used to extrapolate the infection growth rate. It is seen from the figure that India had a peak daily growth rate of ∼ 20% at the beginning of the epidemic, which reduced to ∼ 5% one month after the imposition of the lock-down. It is to be noted that the nation-wide lock-down imposed on 25th March has been relaxed continually in a phased manner and, from 1st June, exists only in the containment zones. After the decrease in the growth rate of infections in the initial phase following the lock-down, the cases of infection have continued to grow at a roughly constant rate for a while. The extrapolations for the next 30 days that define the various probable scenarios are approximated as a linear reduction or increase from the present value of the infection growth rate. The quantity r(t) determined from the data is also used to study the evolution of R_i(t) in time, as shown in Fig. 1b. It must be noted that R_i(t) also depends on the period of infection, and we present results for G = 12.7 days and G = 20 days. The R_i(t) values have been extracted both from the r(t) of the reported cases and from the fitted r(t).
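The extraction of a smoothed daily growth rate from cumulative case counts, as described above, can be sketched as follows. The text specifies a 9-day moving average but not whether the window is centered or trailing, nor how the edges are treated; the centered window with symmetric shrinking at the edges used here is an assumption, and the function names are hypothetical.

```python
def daily_growth_rate(cumulative):
    """Fractional day-on-day growth r(t) = (C(t) - C(t-1)) / C(t-1)."""
    return [(today - prev) / prev for prev, today in zip(cumulative, cumulative[1:])]


def moving_average(values, window=9):
    """Centered moving average; the window shrinks symmetrically near the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        k = min(half, i, len(values) - 1 - i)  # largest symmetric half-width available
        chunk = values[i - k : i + k + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Synthetic example: cases growing by exactly 10% per day.
cases = [100 * 1.1 ** day for day in range(20)]
raw_rate = daily_growth_rate(cases)
smooth_rate = moving_average(raw_rate, window=9)
```

For purely exponential synthetic data the smoothed rate stays at the input growth rate, which is a convenient sanity check before applying the same steps to reported counts.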
These values are seen to decrease from a peak of ∼ 4 to ∼ 1.6 for G = 12.7 days, which is still substantially higher than the value of 1 required for the spontaneous disappearance of the infection. The T_d(t) value extracted directly from the data, and also from the fit, shows a constant value of ∼ 16 days. The R_i(t) and T_d values are also shown for one probable scenario in which the growth rate reduces linearly to one-half of its present value. This leads to a moderate reduction in R_i(t) and a significant increase in T_d. The SEIRD model calculations using Eq. 1 have been performed to make comparisons with the data aggregated for India, using the reported cases of infected, recovered and dead populations up to 28th May, and to make forecasts about future scenarios. The contact rate parameter β(t) is taken to be time dependent, with the parameters β_0 and β_0c fixed in accordance with Eq. 6. The parameter A is taken as 5.1 days, which is the mean incubation period and a bit larger than the latency period. The parameter δ, which determines the death population and only very weakly affects the populations in the other compartments, is taken as 0.025. The parameter γ(t) is also taken to be time dependent. The time variation in this parameter, with γ_0 = 0.079 (corresponding to a period of 12.7 days) and κ = 0.01, takes into account the larger value of G ≈ 26 days that is needed to explain the behaviour of the data in the initial stages. As time elapses, a reduction in the recovery period is seen and γ approaches γ_0. The model was applied from the day when the cumulative number of infections was ∼ 100, on 14th March. The fractions of the population in the compartments at day 1 are set as follows: I = 88/1.4e7, R = 10/1.4e7 and D = 2/1.4e7, as provided by the reported data. The other initial conditions, defined by α and the ratio E(0)/I(0), are the unknowns in the model.
We take α = 0.1, which is similar to the value α = 0.08 extracted for European countries in Ref. [26]. The parameter E(0)/I(0) = 3.2A is important for the initial description of the data, but it does not affect the long-time dynamics of the epidemic as predicted by the model. The calculations with these parameters, using the time-varying β(t) determined above, provide a good description of the evolution of the reported numbers of infected, recovered and dead, as shown in Fig. 2a. In addition, calculations have been performed for a constant value β = 0.167, obtained from the best fit to the exponential distribution according to Eq. 4. While these results, shown in Fig. 2b, provide a good description of the initial days, they grossly over-predict the case numbers at later times compared with the reported cases. It is evident from the figure that the time dependence of β(t) is necessary to understand the dynamics of the infection spread in India. The period of infection, related to the recovery time of the infected individuals, is also taken with a time variation. This parameter is primarily a characteristic of the epidemic and is only mildly dependent on the response of the health-care system. In the absence of any effective therapy or cure that may shorten the length of the infection, it is relatively well known and is taken as ∼ 12.7 days. However, a larger value is required to explain the reported data, both with a constant β value [17] and when the time variation of β is taken into account, as found in the present study. The recovery rates are continually improving, a feature also reflected in the reported recovery data. Therefore, a time dependence in the parameter γ(t) is introduced to account for this observation. We show the calculations in Fig. 3. To study the role of interventions, we perform calculations with different values of the contact rate β.
The interventions have led to a decrease in the daily growth rate of infections, which is intimately related to the β value. We use the constant values β = 0.252 and 0.125, which correspond to the peak rate of growth and half its value. The peak β is expected to have prevailed in the early stages of the infection spread, in the absence of interventions such as the lock-down and without enhanced public awareness. In addition, we give results obtained with β = 0.167, β = 0.01 and the time-varying β value. The infected, recovered and death populations for these β values are shown in Fig. 4a, Fig. 4b and Fig. 4c, respectively. From the comparison it is evident that the lock-down and other interventions have prevented a large spread of infections and kept the death numbers particularly low. These interventions could have prevented around 4 million peak infections and 200,000 deaths at the 100-day mark. The lower growth rate also means that the number of active infections is low at any instant, which helps to optimize the response of health-care systems. The rate of infections in India has remained approximately constant for the last several days after the initial reduction. After an extended lock-down, the restrictions have slowly been loosened. We extrapolate the values of β(t) to predict the outcomes of various probable scenarios. The β(t) value corresponding to the growth rate r_pr as on 28th May is varied so as to attain a given value at the end of the next 30 days, assuming a time variation with constant slope. These scenarios are named the best case, the optimistic case, the most likely, current, problematic and alarming scenarios, respectively. The time evolution of the epidemic is studied with these future time variations. The resulting predictions for the infected, recovered and dead populations are shown in Fig. 5a, Fig. 5b and Fig. 5c, respectively.
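The constant-slope scenario construction described above can be sketched as a linear ramp of the growth rate from its present value to a target value over 30 days. The text does not state what happens after the 30-day horizon; holding the target value constant afterwards is an assumption here, and the function name and the example numbers are hypothetical.

```python
def linear_scenario(r_present, r_target, horizon_days=30):
    """Return r(days_ahead): a linear ramp from r_present to r_target, then held constant."""
    def growth_rate(days_ahead):
        if days_ahead <= 0:
            return r_present
        if days_ahead >= horizon_days:
            return r_target          # assumption: value is frozen after the horizon
        fraction = days_ahead / horizon_days
        return r_present + (r_target - r_present) * fraction
    return growth_rate


# Example: the growth rate falls linearly to one-half of an assumed present value of 4.5%.
halving_scenario = linear_scenario(0.045, 0.0225)
projected = [halving_scenario(day) for day in range(0, 46, 15)]
```

Each named scenario then corresponds to a different `r_target`; the resulting rate can be converted to a contact rate and fed to a compartmental model as a time-dependent input.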
A growth rate that stays the same as r_pr may lead to a high number of total infections (∼ 8 million) with fatalities in excess of 300,000. In the scenario that we term the most likely, there can be a total of 3.5 million infected cases with almost 140,000 fatalities over the course of the pandemic. This will correspond to a … In this case, the death figures can be kept substantially low, in the range of 25,000-50,000. In contrast, if the rate of growth were to increase from its present value due to premature lifting of the lock-down in the affected zones and other lapses, the number of deaths can reach 500,000, with a rather alarming number of infected individuals in a short time. Higher rates of growth also mean that a large number of active infected cases appear early, which may stretch the health-care systems to the brink. We have made a detailed comparison of the model predictions with the real data, using the important parameters of contact rate and infection rate derived from the data itself. It must be noted that there is an inherent delay between the reported rate and the instantaneous rate of infection. In addition, the effect of any restrictive measure appears with a delay in the reported rate, which is estimated by the fit parameter t_0 ∼ 15 days in the present case. The model calculations describe very well the reported numbers of infected and recovered populations up to now. The imposed restrictions have led to a reduction in R_i(t) and an increase in T_d as time elapses. A quantitative measure of the intensity of the imposed lock-down, which reduced the growth rate r(t) to almost half its value, is given by the fit parameter σ = 14 days. It is seen from the calculations that a large number of infections and fatalities have been averted due to the imposition of the lock-down.
Some part of this reduction may be ascribed to enhanced public awareness and to growing disease monitoring and testing capabilities with the passage of time. However, the effect of the complete lock-down in reducing the infection rate has been quite significant. After the initial period of 40 days following the complete lock-down, there has not been much further reduction in the infection spread rate in the last 30 days. It is probable that the gradual weakening of the lock-down for socio-economic reasons has offset the gains from the restrictive measures. Nevertheless, the continued restrictions have prevented a rise in the growth rate of infections, which would be expected to rise again in the absence of such measures. Even the growth rate of ∼ 4-5% attained so far implies exponential growth, and it is seen that the epidemic in India is still in its early stages. Current estimates of future trends in new infections in the weeks after 28th May suggest that a more severe outbreak may occur in the coming times, leading to a high number of infections. With the estimates from the most likely scenario, over 450,000 would be clinically diagnosed at the maximum, resulting in ∼ 140,000 total fatalities. The severity of the disease outbreak is quantified through the case fatality ratio (CFR), defined as the ratio of the number of fatalities D(t) to the cumulative number of infections C(t). The CFR has varied from ∼ 3.3% on April 15 to ∼ 2.8% at present, which is less than the global average of ∼ 6.2%. However, some doubts remain about the estimation of the CFR, because it is possible that both the number of fatalities and the number of infections are underestimated. It is more likely that C(t) is underestimated to a greater degree, owing to the presence of a large number of asymptomatic or non-critical infected cases, which leads to an overestimation of the CFR if the reported D(t) is assumed to be true.
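The CFR definition and the effect of under-reported infections can be made concrete with a small sketch. The function names and the `detected_fraction` parameter are hypothetical and introduced only for illustration; the example numbers are the most-likely-scenario totals quoted in the text.

```python
def case_fatality_ratio(deaths, cumulative_cases):
    """CFR in percent: cumulative deaths D(t) divided by cumulative infections C(t)."""
    if cumulative_cases <= 0:
        raise ValueError("cumulative_cases must be positive")
    return 100.0 * deaths / cumulative_cases


def cfr_with_undercounting(deaths, reported_cases, detected_fraction):
    """CFR if only detected_fraction of infections are reported and deaths are fully reported."""
    if not 0 < detected_fraction <= 1:
        raise ValueError("detected_fraction must lie in (0, 1]")
    return case_fatality_ratio(deaths, reported_cases / detected_fraction)


# Most likely scenario from the text: about 140,000 deaths among 3.5 million infections.
scenario_cfr = case_fatality_ratio(140_000, 3_500_000)
# Illustrative assumption: only half of all infections are detected.
adjusted_cfr = cfr_with_undercounting(140_000, 3_500_000, detected_fraction=0.5)
```

If a sizeable share of infections goes undetected while deaths are counted fully, the adjusted ratio is lower than the reported one, which is the overestimation effect discussed above.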
The CFR remains low as long as the health facilities are able to cope with the rate of patients requiring critical care. If the number of active infected cases is as large as predicted by several of the scenarios described above, the requirements for hospitalization and critical-care resources may increase sharply. In such a situation, the health-care system is going to be severely challenged in providing the critical-care facilities needed to prevent fatalities, and the CFR may rise to higher values. Therefore, imposing stricter measures inside the containment zones, together with more extensive testing and contact-tracing, seems to be the only viable preventive option that can lead to a manageable reduction in infected cases and casualties in the absence of any therapies or large-scale immunity. There are limitations to the simplistic model employed here, and therefore the exact numbers presented in this work are only indicative. In the present model, the asymptomatic population is taken into account only indirectly, at the start of the epidemic, through the ratio E(0)/I(0). Including this population as a separate compartment would, however, introduce an extra set of unknown parameters. Further, we have not considered regional and age-specific heterogeneities in the model. While we have made a reasonable assumption for the parameter α = 0.1, implying that 10% of the total population is susceptible, the overall numbers presented in this work may differ if α has a significantly different value. This number is going to be affected by the large-scale migration from infected areas to other areas that the country has seen in recent times, which may increase the pool of susceptible population. Further, we have made the forecasts in this work based on probable daily growth rates.
The determination of the contact rate parameter from the measured growth rate in this simple way is uncertain, owing to stochastic fluctuations in the early stages and to the inaccuracies and time delay of the reported data. Further, there are challenges in designing control mechanisms on the basis of daily growth numbers, as discussed in Ref. [27]. However, this work shows the operational use of the R_i(t) calculated from the instantaneous infection rate to provide a reasonable description of the transmission dynamics. In this article, we have presented the results of SEIRD model calculations to study the role of interventions and to make future projections of the Covid-19 spread in India. To make reliable forecasts, we have determined the time-dependent reproduction number R_i(t) and the contact rate parameter β(t) from the data on the daily rate of increase of infections. It is shown that the timely imposition of the lock-down and other public health interventions has led to a substantial reduction in the effective reproduction number, which decreased from a peak value of ≈ 3.2 to a present value of ≈ 1.6, corresponding to an increase in the doubling time of the infections. Calculations performed using the time-dependent contact rate parameter β(t) in the SEIRD model provide a good description of the numbers of infections, recoveries and deaths. We further make projections for different probable scenarios. In the most likely scenario, the model predicts a peak of active infections around the month of September, with a significant number of fatalities over the course of the epidemic until around the end of November. The results show the impending critical challenges for health-care systems due to the prospective high number of infected people. The salient feature of the simple model employed in this work is its use of a minimal number of uncertain parameters, and therefore, in our opinion, it makes reliable predictions of the infections and fatalities.
The projections of peak infections suggest big challenges for the available critical-care health facilities in the management of the pandemic. New innovative solutions have to be found continuously and intelligent measures have to be implemented effectively if the Covid-19 infections are to be contained at a moderate number and the ensuing fatalities are to be minimized. The most important extension of this study will be to incorporate regional variability by applying this model to the state-wise infection data and making predictions accordingly.

Situation Report 133
Infectious Diseases of Humans
Mathematical Epidemiology of Infectious Diseases: Model Building, Analysis and Interpretation
Mathematical Models in Population Biology and Epidemiology, 2nd edn
Seasonality and period-doubling bifurcation in an epidemic model
Swarnajit Chatterjee, Mintu Karmakar and Raja Paul, medRxiv

We thank V. V. Parkar, D. K. Mishra and G. Chaudhuri for useful discussions and their interest in the work.