key: cord-0742289-muoziy9m authors: Tamrakar, V.; Srivastava, A.; Parmar, M. C.; Shukla, S. K.; Shabnam, S.; Boro, B.; Saha, A.; Debbarma, B.; Saikia, N. title: District level correlates of COVID-19 pandemic in India date: 2020-10-11 journal: nan DOI: 10.1101/2020.10.08.20208447 sha: b40d07fbfd5441c25808a9403dded4e728658ed8 doc_id: 742289 cord_uid: muoziy9m Abstract Background The number of patients with coronavirus infection (COVID-19) has amplified in India. Understanding the district level correlates of the COVID-19 infection ratio (IR) is essential for formulating policies and intervention. Objectives The present study examines the association between socioeconomic and demographic characteristics of India's population and the COVID-19 infection ratio at the district level. Data and Methods Using crowdsourced data on the COVID-19 prevalence rate, we analyzed state and district level variation in India from March 14 to July 31, 2020. We identified hotspot and cold spot districts for COVID-19 cases and infection ratio. We have also carried out a regression analysis to highlight the district level demographic, socioeconomic, infrastructure, and health-related correlates of the COVID-19 infection ratio. Results The results showed that the IR is 42.38 per one hundred thousand population in India. The highest IR was observed in Andhra Pradesh (145.0), followed by Maharashtra (123.6), and was the lowest in Chhattisgarh (10.1). About 80 percent of infected cases and 90 percent of deaths were observed in nine Indian states (Tamil Nadu, Andhra Pradesh, Telangana, Karnataka, Maharashtra, Delhi, Uttar Pradesh, West Bengal, and Gujarat). Moreover, we observed COVID-19 cold-spots in central, northern, western, and north-eastern regions of India. Out of 736 districts, six metropolitan cities (Mumbai, Chennai, Thane, Pune, Bengaluru, and Hyderabad) emerged as the major hotspots in India, containing around 30 percent of confirmed total COVID-19 cases in the country. 
Simultaneously, parts of the Konkan coast in Maharashtra, part of Delhi, the southern part of Tamil Nadu, and the northern part of Jammu & Kashmir were identified as hotspots of COVID-19 infection. A Moran's I value of 0.333 indicated positive spatial clustering of the COVID-19 IR across neighboring districts. Our regression analysis found that district-level population density (β: 0.05, CI: 0.04-0.06), the percent of urban population (β: 3.08, CI: 1.05-5.11), the percent of Scheduled Caste population (β: 3.92, CI: 0.12-7.72), and the district-level testing ratio (β: 0.03, CI: 0.01-0.04) are positively associated with the prevalence of COVID-19. Conclusion COVID-19 cases were heavily concentrated in nine states of India. Several demographic, socioeconomic, and health-related variables are correlated with the COVID-19 prevalence rate. However, after adjusting for socioeconomic and health-related factors, the COVID-19 infection rate was found to be higher in districts with a higher population density, a higher percentage of urban population, a higher percentage of deprived castes, and a higher testing ratio. The hotspots and correlates identified in this study give crucial information for policy discourse. Keywords COVID-19, socioeconomic, co-morbidity, geographical, hot-cold spot, districts, India.

With more than 16,96,962 confirmed cases on July 31, 2020, India ranked third globally in terms of the total number of patients infected with COVID-19 (1). The disease spread slowly in the three months following the first outbreak in Kerala in January 2020, possibly because of the early nationwide lockdown (2-4); widespread coverage of the pandemic in print, electronic, and social media (5); and targeted efforts by the union and state governments on quarantine facilities and travel protocols (6, 7).
With the demarcation of local containment zones, these definitive measures significantly reduced the doubling time (S1 Fig.), although there is no sign of stalling in the infection rates. The number of confirmed COVID-19 cases is increasing rapidly in many districts. India has been recording over 50,000 new cases every day since

Despite such a fast spread of COVID-19, India has a fairly high recovery rate and the lowest fatality rate globally (8). Despite India's advantage of having a young age structure that is less susceptible to COVID-19 related deaths (9), India may face a higher burden of disease in the near future because of other demographic factors (10), such as the enormous population size, high population density, a higher percentage of people living in poverty, lower levels of per capita public health infrastructure, and a high prevalence of co-morbid conditions. Like any other health and demographic indicator, COVID-19 infection varies widely among the states of the country (11, 12). However, the geographical pattern of the COVID-19 infection rate does not coincide with the patterns of demographic and health indicators such as the under-five mortality rate or nutritional status. COVID-19 has been spreading rapidly in urban areas, especially in states with megacities that have densely populated urban slums, such as Delhi, Maharashtra, Tamil Nadu, and West Bengal. The sudden surge of return labour migration to the states of origin (due to the COVID-19 related national lockdown), the state-level health care system, adherence to physical distancing measures, and local government management are other potential community-level factors affecting geographical variation in the spread of COVID-19 in India. Some recent studies have computed composite indices to rank districts by their COVID-19 vulnerability using demographic information and infrastructure characteristics (13-15).
While such analyses are useful for district-level planning and prioritization, they are based on the assumption that vulnerability will decrease as the districts' socio-economic indicators improve. However, such an inverse relationship may not hold in the unusual context of COVID-19; for instance, a higher percentage of urban population may indicate a higher socio-economic status of the district population in a non-COVID situation but may be positively correlated with the spread of COVID-19. COVID-19 is more prevalent in cities and towns than in rural areas or hilly regions (16). Therefore, it is imperative to uncover the empirical relationships between a district's socio-economic and infrastructural characteristics and the COVID-19 infection ratio. To the best of our knowledge, no such study has previously been conducted on COVID-19 in India. This study examined the district-level socio-economic and demographic correlates of the COVID-19 infection ratio in India. Identification of such correlates is crucial for framing health policy and appropriate interventions.

We used crowdsourced district-level data on COVID-19 available in the public domain until July 31, 2020, accessed from the COVID-19 India dashboard (17). The dashboard provides an application programming interface (API) for monitoring COVID-19 cases at the national, state, and district levels. The data compiled in this web portal are based on state bulletins and official handles. The details of the data are available on the website. The portal data are consistent with the data provided by the Ministry of Health and Family Welfare, Government of India (https://www.mohfw.gov.in/) (18). For explanatory variables, we utilized data from the National Family Health Survey of India 2015-16 (NFHS-4), a cross-sectional survey of 601,599 households and 2.87 million individuals from all 29 states and 7 union territories (19).
The survey collected data on various socioeconomic, demographic, health, and family planning indicators, as well as anthropometric and biomarker measures related to anemia, hypertension, and diabetes. The NFHS-4 is the most recent source of such biomarker-based data at the district level in India. We also used some socio-economic and demographic variables from the Census of India (20).

medRxiv preprint doi: https://doi.org/10.1101/2020.10.08.20208447; this version posted October 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

For all the 640 districts in the thirty-five states and eight union territories of India, we defined the outcome variable, the COVID-19 Infection Ratio (IR), as the number of confirmed cases in a given district per 100,000 population. To obtain district-level populations for the year 2020, we projected each district's population using the exponential growth rate between the 2001 and 2011 censuses. The infection ratio was calculated as:

$$\mathrm{IR}_i = \frac{C_i}{P_i} \times 100{,}000$$

where $C_i$ is the number of confirmed cases in the $i$-th district and $P_i$ is the total projected population of the $i$-th district on July 31, 2020.

We performed a bi-weekly trend analysis of COVID-19 cases in India. To examine the district-level correlates of the outcome variable, we carried out a linear regression analysis at the district level. Two separate district-level regression models were fitted.
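The population projection and infection ratio defined above can be illustrated with a minimal Python sketch. This is not the authors' code (they report using Stata); the function names, the example district figures, and the choice of 9 years between the 2011 census and the 2020 reference date are assumptions made purely for illustration.

```python
import math


def project_population(pop_2001, pop_2011, years_after_2011=9.0):
    """Project a population forward assuming constant exponential growth.

    The annual growth rate is estimated from the two census counts, which
    are 10 years apart. `years_after_2011` is an assumed projection horizon.
    """
    growth_rate = math.log(pop_2011 / pop_2001) / 10.0
    return pop_2011 * math.exp(growth_rate * years_after_2011)


def infection_ratio(confirmed_cases, population, per=100_000):
    """Confirmed cases per `per` population (IR_i = C_i / P_i * 100,000)."""
    return confirmed_cases / population * per


# Hypothetical district: 1.0 million people in 2001, 1.2 million in 2011,
# and 500 confirmed cases by the reference date.
pop_2020 = project_population(1_000_000, 1_200_000)
ir = infection_ratio(500, pop_2020)
print(f"projected population: {pop_2020:,.0f}; IR per 100,000: {ir:.2f}")
```

Because the projected 2020 population is larger than the 2011 census count, using the census count directly would overstate the infection ratio of a growing district.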
Model 1 presents the unadjusted effect of each independent variable, without controlling for any other independent variable. Model 2 shows the adjusted effects of the independent variables on the dependent variable. We performed all analyses in the statistical package Stata 14.1. We tested for possible multicollinearity among the independent variables before fitting them in the regression model. We generated descriptive maps of 727 districts in the software package QGIS and later exported the shapefiles to GeoDa to perform the spatial analysis. Using the first-order Queen's contiguity matrix as the weight, we estimated Moran's I and univariate Local Indicators of Spatial Association (LISA). Moran's I is a Pearson-type coefficient of spatial autocorrelation, which measures the degree to which data points are similar or dissimilar to their spatial neighbours (29). The LISA cluster map yields four types of geographical clustering of the variable of interest (30). Here, "high-high" refers to regions with an above-average infection ratio that share boundaries with neighbouring areas that also have above-average infection ratios. On the other hand, "low-high" indicates regions with a below-average value whose surrounding areas have above-average infection ratios, and "high-low" indicates the reverse. The "high-high" regions are also referred to as hot spots, whereas the "low-low" regions are referred to as cold spots.

S1 Table presents the national bi-weekly (14 days) pattern of new confirmed, infected, recovered, and deceased COVID-19 cases in India.
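The global Moran's I and the LISA quadrant labels described in the methods can be sketched in plain Python as follows. This is an illustrative sketch, not the GeoDa implementation: the `neighbors` dictionary is a hypothetical stand-in for a first-order Queen's contiguity structure derived from district shapefiles, row-standardized weights are assumed, and the permutation-based significance test that a LISA cluster map relies on is omitted.

```python
def row_standardized_weights(neighbors):
    """Turn an adjacency dict {unit: [adjacent units]} into weights that sum to 1 per row."""
    return {
        i: {j: 1.0 / len(adjacent) for j in adjacent}
        for i, adjacent in neighbors.items()
        if adjacent
    }


def morans_i(values, neighbors):
    """Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values.values()) / n
    z = {i: v - mean for i, v in values.items()}
    w = row_standardized_weights(neighbors)
    s0 = sum(sum(row.values()) for row in w.values())
    cross = sum(w_ij * z[i] * z[j] for i, row in w.items() for j, w_ij in row.items())
    return (n / s0) * cross / sum(d * d for d in z.values())


def lisa_quadrants(values, neighbors):
    """Label each unit by its own deviation and the weighted mean deviation of its neighbours."""
    mean = sum(values.values()) / len(values)
    z = {i: v - mean for i, v in values.items()}
    labels = {}
    for i, row in row_standardized_weights(neighbors).items():
        lag = sum(w_ij * z[j] for j, w_ij in row.items())
        own = "high" if z[i] > 0 else "low"
        around = "high" if lag > 0 else "low"
        labels[i] = f"{own}-{around}"
    return labels


# Hypothetical example: four districts in a row (A-B-C-D) with rising infection ratios.
ir = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0}
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
print(morans_i(ir, adjacency))
print(lisa_quadrants(ir, adjacency))
```

In this toy layout similar values sit next to each other, so Moran's I is positive, with the two low districts labelled "low-low" and the two high districts "high-high".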
In India, the average bi-weekly new confirmed cases rose 744-fold (from 63 to 46,900), the average recovered cases increased 6,307-fold (from 5 to 31,533), and the average infected cases increased 1,152-fold (from 63 to 72,614). The average deceased cases increased 734-fold (from 1 to 734) between the 1st and the 10th bi-weekly phases, as given in Table A1. COVID-19 cases increased in each of the Indian states until July 31, 2020. Five states (Maharashtra, Andhra Pradesh, Tamil Nadu, Karnataka, and Telangana) accounted for more than half of the country's total cases.

With respect to health-related variables as well, there exists a wide disparity. While the testing ratio ranged from 0 to 11,471 persons per one hundred thousand, the percentage of full immunization among children ranged from 7.14 to 100. Tobacco consumption among women ranged from 0.8 to 88 percent.

The regression analysis is presented in Table 2. District-level population density (β: 0.05, CI: 0.04-0.06), the percentage of urban population (β: 3.08, CI: 1.05-5.11), the percentage of Scheduled Caste population (β: 3.92, CI: 0.12-7.72), and the district-level testing ratio (β: 0.03, CI: 0.01-0.04) were positively associated with the prevalence of COVID-19. All variables are computed at the district level.
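The difference between the unadjusted (Model 1) and adjusted (Model 2) district-level regressions can be illustrated with a small ordinary least squares sketch in plain Python. The authors fitted their models in Stata; the code below is not their implementation, the data are invented for illustration, and confidence intervals are not computed.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(predictors, y):
    """Least-squares fit; returns [intercept, beta_1, beta_2, ...]."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(rows[i][a] * y[i] for i in range(n)) for a in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical districts: density (persons per sq. km) and percent urban.
density = [100.0, 200.0, 300.0, 400.0, 500.0]
urban = [10.0, 30.0, 20.0, 50.0, 40.0]
# Invented infection ratios generated as 2 + 0.05 * density + 3 * urban.
ir = [2 + 0.05 * d + 3 * u for d, u in zip(density, urban)]

model_1 = ols([density], ir)         # unadjusted: density only
model_2 = ols([density, urban], ir)  # adjusted: density and percent urban
print("Model 1 density coefficient:", round(model_1[1], 4))
print("Model 2 density coefficient:", round(model_2[1], 4))
```

In this invented example density and urbanization rise together, so the unadjusted density coefficient absorbs part of the urban effect and is larger than the adjusted one. This is why a coefficient from Model 1 and the corresponding coefficient from Model 2 can differ substantially.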
In terms of the total number of positive cases, India ranked third after the US and Brazil, reporting more than one million COVID-19 cases as of July 31, 2020. A further concern is that India's COVID-19 curve remains on an upward trajectory with no sign of bending like Italy

The apparent concern is that these states also contribute significantly to the Indian economy (32). Another important observation of this study is that districts bordering the six metropolitan cities were observed to be India's highest hot spots, possibly because they contribute the largest share of migrants and commuters to these megacities. This study indicates that, in addition to Maharashtra, Delhi, Tamil Nadu, and Gujarat, states with low socio-economic indicators, such as the Empowered Action Group (EAG) states (Bihar, Chhattisgarh, Jharkhand, Madhya Pradesh, Uttar Pradesh, Rajasthan, Odisha, Uttarakhand), are more affected owing to the lack of health care facilities. Secondly, this study examined the district-level correlates of the COVID-19 infection ratio in India. Our research reveals that a district's COVID-19 infection ratio is associated with various socio-economic variables. However, we observed a statistically significant association for only a limited number of variables.
After adjusting for socio-economic and health-related factors, the COVID-19 infection rate was found to be higher in districts with a higher population density, a higher percentage of urban population, a higher percentage of deprived castes, and a higher testing ratio. High population density can make social distancing difficult, which may explain why districts with higher density have higher infection ratios. Similarly, as the percentage of urban population increases, unavoidable economic activity is likely to increase, which exposes more people to the coronavirus. Previous studies also showed that higher population densities in congested slum areas and large towns accelerated COVID-19 infection and mortality rates (33-35). Congestion, slum concentrations, and inadequate housing and sanitation in poor urban areas may explain such high levels of disease. The positive association between the COVID-19 IR and the testing ratio suggests underreporting of COVID-19 in districts where the testing ratio is low. Studies based on individual data show that older people are more vulnerable to COVID-19 infections (36, 37). These studies also identified that pre-existing diabetes is positively associated with COVID-19 disease (36, 37). In our research, we did not find such associations, possibly because of the study design. Unlike these studies, we identify macro-level correlates of the COVID-19 infection rate. Interestingly, the infection ratio increases with the share of the population belonging to deprived castes such as the Scheduled Castes (SCs); because SCs are more vulnerable and many work as daily laborers, they may face a higher chance of infection.
The COVID-19 pandemic is expected to have a long-term impact on health, the economy, and social processes globally, including in India. Only a clear understanding of the disease's spatial distribution and its determinants will help to formulate policies and interventions. Therefore, the possible risk factors should be included in policy preparedness and implementation during the COVID-19 pandemic. We found that population density, urban residence, Scheduled Caste population, and testing rates are significantly correlated with the infection ratio (IR). Because population density in urban areas is very high and social distancing is difficult to maintain, the role of government is crucial in combating the pandemic. Ensuring health and hygiene-related facilities (providing adequate clean water, adequate sanitation and sewerage facilities, cleaning the city, maintaining quarantine centers and public health care institutions, etc.) and improving the public distribution system to ensure a minimum food supply, especially among the urban poor and other deprived sub-groups, can help to control the spread of COVID-19 infection. More tests are required to identify asymptomatic patients. India has a population of over 1.3 billion, but as of July 31 only approximately 18.8 million tests (equivalent to less than 2 percent of the population) had been carried out. At the same time, people's negligent behavior towards COVID-19 protocols (for example, not following social distancing norms, not wearing a mask in public places, and coughing without covering the mouth) puts them at higher risk. Finally, there is a need to improve infrastructure (hospitals, ventilators, PPE kits) and human resources (doctors, nurses, and frontline workers) in healthcare facilities. Our analysis does have a few limitations. First, there is a possibility of under-reporting of positive and fatal cases due to a lack of testing or social stigma.
Hence, our data give the most conservative estimates of the infection ratio. Second, for most cases, patient-level information (such as age, sex, and co-morbidity) is unavailable. Therefore, we analyzed district-level determinants instead of individual-level determinants. Thus, our results identify the major correlates only at the district level. Finally, we analyzed the number of confirmed cases rather than the number of active cases, as the latter considers the recovery rate and reflects the health services available in a region. We used the number of
1. COVID-19 coronavirus pandemic.
2. Assessing the Impact of Complete Lockdown on COVID-19 Infections in India and its Burden on Public Health Facilities: A Situational Analysis Paper for Policy Makers. International Institute for Population Sciences.
3. COVID-19 in India: Potential Impact of the Lockdown and Other Longer-Term Policies.
4. Comorbidities and multi-organ injuries in the treatment of COVID-19.
5. Retweets of officials' alarming vs reassuring messages during the COVID-19 pandemic: Implications for crisis management.
6. This is how India is reacting to the coronavirus pandemic.
7. These are the coronavirus quarantine facilities in India.
8. Decoding India's Low COVID-19 Case Fatality Rate. NBER Working Paper Series.
9. Demographic science aids in understanding the spread and fatality rates of COVID-19.
10. Besides population age structure, health and other demographic factors can contribute to understanding the COVID-19 burden.
11. Epidemic Trend of COVID-19 Transmission in India During Lockdown-1.
12. Geographical Variation in COVID-19 Cases, Prevalence, Recovery and Fatality Rate by Phase of National Lockdown in India.
13. New Guidelines on the measures to be taken by Ministries/Department of Government of India. 40-3/2020-DM-I (A); 2020. Available from: 164.100.117.97/WriteReadData/userfiles/MHA Order Dt. 1.5.2020 to extend Lockdown period for 2 weeks w.e.f. 4.5.2020 with new guidelines.pdf
14. Singh SS. Corona Virus. The mystery of the low COVID-19 numbers in West Bengal.
15. A vulnerability index for the management of and response to the COVID-19 epidemic in India: an ecological study.
16. COVID-19 and heatwaves: a double whammy for Indian cities.
17. COVID-19-India. COVID-19 India Dashboard.
18. COVID-19 INDIA, Ministry of Health and Family Welfare, Government of India.
19. IIPS. International Institute for Population Science India: National Family Health Survey (NFHS-4).
20. Office of the Registrar General & Census Commissioner, India (ORGI). Provisional Population Totals Paper 2 of India & States/UTs.
21. Disability Divides in India: Evidence from the 2011 Census.
22. Under-Five Child Growth and Nutrition Status: Spatial Clustering of Indian Districts.
23. Excess under-5 female mortality across India: a spatial analysis using 2011 census data.
24. Who is at the highest risk from COVID-19 in India? Analysis of health, healthcare access, and socioeconomic indicators at the district level. medRxiv.
25. Demographic Foundations of Family Change.
26. Changing Demographic Characteristics and the Family Status of Chinese Women.
27. Household projection using conventional demographic data.
28. Disability divides in India: Evidence from the 2011 census.
29. Notes on Continuous Stochastic Phenomena.
30. Hotspots and Coldspots: Household and village-level variation in orphanhood prevalence in rural Malawi.
31. IIPS. International Institute for Population Science India: National Family Health Survey (NFHS-4).
32. Contagion effect of COVID 19 outbreak: Another recipe for disaster on Indian economy.
33. Distribution of COVID-19 Morbidity Rate in Association with Social and Economic Factors in Wuhan, China: Implications for Urban Development.
34. Does Density Aggravate the COVID-19 pandemic?
35. Effect of population density on epidemics.
36. Presenting Characteristics, Comorbidities, and Outcomes Among 5700 Patients Hospitalized With COVID-19 in the New York City Area.
37. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study.

Source: Author's Computation.
Note: Infected cases were computed as $I_i = C_i - R_i - D_i$, where $C_i$, $R_i$, and $D_i$ are the confirmed, recovered, and deceased cases in the $i$-th district. *Infection Rate (IR) computed from data provided by the COVID-19 Dashboard of India; the remaining explanatory variables were calculated from the fourth round of the National Family Health Survey (IIPS). **Mortality estimates (reference period: the five years preceding the survey).

Fig 1. Trend in new cases by number of days in India. Source: Author's calculations.

confirmed cases as the primary indicator of the spread of the infection. Despite these limitations, the study's merit lies in bringing together the spatial-demographic vulnerabilities prevalent across the nation during the pandemic period. To sum up, the study identified the district-level correlates of COVID-19 and their demographic and socio-economic features.

Supporting Information S1