key: cord-275261-t39kofet
authors: Ghosal, Samit; Sengupta, Sumit; Majumder, Milan; Sinha, Binayak
title: Prediction of the number of deaths in India due to SARS-CoV-2 at 5–6 weeks
date: 2020-04-02
journal: Diabetes Metab Syndr
DOI: 10.1016/j.dsx.2020.03.017
sha:
doc_id: 275261
cord_uid: t39kofet

Abstract

Introduction and Aims: No valid treatment or preventative strategy has evolved to date to counter the SARS-CoV-2 (novel coronavirus) epidemic, which originated in China in late 2019 and has since wrought havoc on millions across the world through illness, socioeconomic recession and death. This analysis aimed to trace the trend in death counts expected at the 5th and 6th weeks of COVID-19 in India.

Material and methods: A validated database was used to procure global and Indian data related to coronavirus and related outcomes. Multiple regression and linear regression analyses were used interchangeably. Since the week 6 death count data was not correlated significantly with any of the chosen inputs, an auto-regression technique was employed to improve the predictive ability of the regression model.

Results: A linear regression analysis predicted the average week 5 death count to be 211 (95% CI: 1.31–2.60). Similarly, the week 6 death count, in spite of a strong correlation with input variables, did not pass the test of statistical significance. Using the auto-regression technique with the week 5 death count as input, the linear regression model predicted the week 6 death count in India to be 467, bearing in mind the risk of over-estimation inherent in most risk-based models.

Conclusion: According to our analysis, if the situation continues in its present state, the projected number of deaths (n) is 211 and 467 at the end of the 5th and 6th weeks from now, respectively.

The pandemic of COVID-19 (coronavirus disease 2019), caused by SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2), has wrought havoc on human civilization.
Since its appearance in the city of Wuhan (Hubei province) in China, there has been a relentless march of new cases and deaths. [1] What makes it more frightening is the novel strain of the virus and the unknowns associated with it. [2] The present strategy has been to prevent its spread through social isolation and a scientific overdrive to manufacture newer rapid diagnostic kits as well as medications. [

The ability of this family of viruses to readily undergo genetic recombination, not only within the same group but also between groups, makes them readily susceptible to natural selection and to changes in virulence. [8] The most striking feature, however, is their ability to freely cross from one species to another. HCoV 229E belongs to group 1 of the coronavirus family and is thought to be responsible for epidemics of the common cold. [7] Transmission from bats to humans is thought to be the initial transmission process for HCoV 229E, which happened within the last two centuries.

The aim was to identify the top 15 countries, i.e. those most heavily affected and hence able to contribute a substantial quantity of robust data, and to compute a predictive model for India. We considered this of paramount importance, since it would help in understanding as well as planning the future course of action.

India has entered week 4. This analysis was aimed at tracing a trend related to death counts expected at the 5th and 6th week of COVID-19 in India.

Global data was collected from the WHO COVID-19 situation report and the Indian data was updated from the website covid19india.org. Data was collected in a CSV file, uploaded to a Jupyter notebook and analysed with Python 3.8.2. As a re-validation process, and for simplicity of understanding, the data was also analysed in Excel with the XL-STAT statistical software.

Inputs: Total number of infected cases, active cases, recovery numbers.
Outputs: Total deaths and case fatality rates (CFR).

In order to get a good predictive value, data was analysed for the top 15 infected countries, with India as the 16th country.

There was one missing value (NA) in the dataset: the recovery numbers from the US. In view of the heterogeneity of the data and significant outliers, imputation with the mean was ruled out. As a recovery strategy, a correlation analysis was conducted in Python (leaving out the US data), and a strong correlation (r=0.99, P<0.001) was found between the total number of infected cases and recoveries. Utilising this robust association and the formula generated from linear regression (

• Step 1: A correlation analysis was performed to ascertain the presence, and thereafter the strength, of association between the output (week 5 death count) and the inputs from week 4. There was a strong correlation between week 5 deaths and all the input variables. (Table 2)

jump from approximately 10 deaths at week 4 to 211 at week 5 and then 467 by week 6.

We speculate that urgent interventions (which are being taken as of now) are needed to prevent this drastic and sharp rise in death rates, which indirectly also indicates an increase in infection rate.

The main limitation of this analysis is that it takes most input data into consideration without taking into account the logistic actions being taken or not taken during the process. However, the end-of-week results are highly indicative of both the virus-related natural trajectory and the local government's reactions.

Secondly, limiting our analysis to the top 15 most infected countries could lead to an over-estimation of the outcomes. However, faced with a catastrophe of such magnitude, it is better to over-estimate than to under-estimate.

In spite of all the limitations, the biggest strength of this study was the very high adjusted R² found in all the predictive models.
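
The methods above report a correlation between total infected cases and recoveries, a linear regression used to fill the missing US recovery count, and an adjusted R² for the fitted models. The source does not include its code, so the following is only a minimal Python sketch of that kind of workflow, written with the standard library alone. The helper `fit_simple_ols` and every number in it are hypothetical, synthetic values chosen for illustration; they are not the study's data and do not reproduce its results.

```python
from math import sqrt


def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x (hypothetical helper).

    Returns slope, intercept, Pearson r and adjusted R^2.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient

    # Adjusted R^2 with p predictors: 1 - (1 - R^2) * (n - 1) / (n - p - 1)
    p = 1
    adj_r2 = 1 - (1 - r ** 2) * (n - 1) / (n - p - 1)
    return slope, intercept, r, adj_r2


# Synthetic, illustrative values only (NOT the study's data).
total_cases = [1000.0, 2000.0, 4000.0, 8000.0, 16000.0]
recovered = [110.0, 190.0, 420.0, 790.0, 1610.0]

slope, intercept, r, adj_r2 = fit_simple_ols(total_cases, recovered)

# Impute a missing recovery count from a known case count using the fitted line.
missing_cases = 10000.0
imputed_recovered = intercept + slope * missing_cases

print(f"r = {r:.4f}, adjusted R^2 = {adj_r2:.4f}")
print(f"imputed recoveries for {missing_cases:.0f} cases: {imputed_recovered:.0f}")
```

The same helper could, in principle, be applied to the auto-regression step by passing week 5 death counts as `x` and week 6 death counts as `y`; that pairing is an assumption about how the step might be set up, not a description of the authors' code.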
In addition, there was cross-validation with two different software packages, practically ruling out any error creeping in from one mode of analysis.

According to our analysis, if the situation continues in its present state, the projected number of deaths (n) is 211 and 467 at the end of the 5th and 6th weeks from now, respectively. Keeping these projected mortality data in mind, current measures for the containment of COVID-19 must be strengthened or supplemented.

COVID-19 and Italy: what next? The Lancet 2020 Features, Evaluation and Treatment Coronavirus (COVID-19) COVID-19 and the consequences of isolating the elderly. The Lancet COVID-19) technical guidance: Laboratory testing for 2019-nCoV in humans. World Health Organisation Information for Clinicians on Therapeutic Options for COVID-19 Patients. Centres for Disease Control and Prevention The species Severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2. Coronaviridae Study Group of the International Committee on Taxonomy of Viruses Chapter 35 Coronavirus genome structure and replication Fenner and White's Medical Virology The proximal origin of SARS-CoV-2. Coronavirus disease (COVID-2019) situation reports. World Health Organization