key: cord-0823195-112kxj5s authors: Imdad, Kashif; Sahana, Mehebub; Rana, Md Juel; Haque, Ismail; Patel, Priyank Pravin; Pramanik, Malay title: A district-level susceptibility and vulnerability assessment of the COVID-19 pandemic's footprint in India date: 2020-11-08 journal: Spat Spatiotemporal Epidemiol DOI: 10.1016/j.sste.2020.100390 sha: 21a96f4fd9b126c42fcb7f6d86163dbe3734f5e1 doc_id: 823195 cord_uid: 112kxj5s In this study, we trace the COVID-19 pandemic's footprint across India's districts. We identify its primary epicentres and the outbreak's imprint in India's hinterlands in four separate time-steps, signifying the different lockdown stages. We also identify hotspots and predict areas where the pandemic may spread next. Significant clusters in the country's western and northern parts pose risk, along with the threat of rising numbers in the east. We also perform epidemiological and socioeconomic susceptibility and vulnerability analyses, identifying resident populations that may be physiologically weaker, leading to a high incidence of cases and pinpoint regions that may report high fatalities due to ambient poor demographic and health-related factors. Districts with a high share of urban population and high population density face elevated COVID-19 risks. Aspirational districts have a higher magnitude of transmission and fatality. Discerning such locations can allow targeted resource allocation to combat the pandemic's next phase in India. The emergence of the COVID-19 pandemic from Wuhan, China (Columbus et al. 2020; Lupia et al. 2020; Torales et al. 2020) , in December 2019 and its rapid dispersion across the globe (Bonilla-Aldana et al. 2020; Cruz et al. 2020) , caught most countries and healthcare systems off-guard. Ameliorative measures ranged from initially quarantining patients to progressively containing entire provinces (Harapan et al. 2020) , as the virus' ambit grew beyond political and geographic boundaries. 
However, with the virus' spread continuing unabated and being supplemented by transmissions from pre-symptomatic and asymptomatic individuals (WHO 2020b), partial and complete lockdowns of regions and entire countries were quickly adopted. Nations like India, where the outbreak became potentially threatening after its initial rampage in East Asia and Western Europe, were somewhat quicker to impose such lockdown measures (The Lancet 2020). Yet, despite such restrictions being imposed, many countries, India included, have experienced sharp jumps in cases due to existing gaps in their healthcare systems and vulnerabilities in their socioeconomic and political-administrative setups (McAleer 2020), which exacerbate contamination risk and dampen recovery rates. Furthermore, the closure of offices and factories has already cast a lasting effect on the global economic landscape (Ajami 2020; Gong et al. 2020; McKibbin and Fernando 2020). Large numbers of low-income migrant workers, who mostly live on-site at their workplaces or are heavily dependent on daily incomes/wages for sustenance, have perforce been uprooted. This has, in all probability, further reduced their ability to withstand a viral outbreak due to impaired health from malnutrition/hunger and the need to travel long distances to return to their original homes. Such a situation has especially emerged in India, where thousands of migrant workers, mostly from the southern and western zones of the country, have thronged bus terminals and railway stations or have even attempted to travel across states on foot (Lal 2020; Singh 2020), raising the spectre of further widespread community transmission and the incursion of the virus into especially socioeconomically vulnerable areas (and more importantly into rural hinterlands) that are ill-equipped in terms of healthcare (both at the individual and community level for resident populations and returning migrants; Nacoti et al. 
2020) and economic resources to deal with a surge in infections. We therefore seek to examine a number of aspects in this paper. Firstly, we delve into the pattern of COVID-19 outbreak in India, relating this with air-travel from abroad (as the COVID-19 is essentially an 'imported' virus-WHO 2020b) and those availing the same within the country. Secondly, we track the contagion's spread through the nation during the successive lockdown phases by mapping its geographic trajectories and its potential outreach zones, i.e. the areas where the virus is most likely to spread into. For doing this, we have had to perforce use slightly older data than that which reflects the current situation, due to the intervening lag time for computation, analysis and writing up of the results. Some research accounts have noted that the contagion's virulent nature is dampened and hindered by higher temperatures (Briz-Redon and Serrano-Aroca 2020; Prata et al. 2020; Xie and Zhu 2020) and that it reacts to raised humidity levels (Ahmadi et al. 2020; Luo et al. 2020; Qi et al. 2020 ), conditions which mostly persist across India. Along with this, some studies have reported that hydroxychloroquine or chloroquine-based drugs may be useful in combating this virus (Colson et al. 2020; Gao et al. 2020; , though this finding is disputed as well (Boulware et al. 2020; Hernandez et al. 2020) . India has had a history of chloroquine use and vaccination due to its historical struggles with malaria (Das et al. 2012; Anvikar et al. 2014) , and it may thus be seemingly possible that her citizens are likely more resistant to the COVID-19 threat. Based on the above two aspects, we first estimate the epidemiological susceptibility of a particular location (since this is a medical/health issue foremost), identifying those areas where the disease is most likely to affect people. 
Subsequent to this, we examine the overall susceptibility of a region to a COVID-19 outbreak, factoring in the socioeconomic and health conditions of its residents, which makes them more prone to succumbing to the virus. Finally, we look at the overall socioeconomic vulnerability of the districts (i.e. where deaths are most likely to occur) and its variation across the country. Susceptibility analyses are important to ascertain the likely incidence of a disease and the populations it can affect (Lohmueller et al. 2003) . This also helps health-care providers perceive the risks involved for certain groups in order to better influence their health-related behaviour (BMJ 2020; Brewer et al. 2007) . In relation to the virus, a range of susceptibility studies have been performed to understand how the physiological attributes of patients condition their response to the virus (Rothan and Byrareddy 2020; Shi et al. 2020; Zhao et al. 2020) , and its links with the disease's spread in different areas (Fanelli and Piazza 2020; Karako et al. 2020) . Other studies have compared habits like smoking or ambient environmental conditions (Coccia 2020; with the virus' threat and examined how co-morbidities can occur due to underlying health issues Yang et al. 2020 ). Attempts to estimate this coronavirus' spread have however been comparatively less in the overall Indian context (e.g. Biswas and Sen 2020; Dhanwant and Ramanathan 2020; Pandey et al. 2020) , with some regression models being employed to predict virus outbreaks across the country (Tomar and Gupta 2020; Singh and Adhikari 2020) . Others have tried to estimate the COVID-19's prevalence within smaller regions (e.g. Simha et al. (2020) for the southern Indian state of Karnataka and Kumar (2020) for the western state of Maharashtra), measures of controlling its rampage across the country (Mandal et al. 2020 ) and the debilitating effects it can engender on the nation's overall healthcare system/setup (Chatterjee et al. 
2020) . The other aspect examined in the Indian context has been the efficacy of the implemented lockdown measures (e.g. Paital et al. 2020) , following similar studies that were undertaken elsewhere (e.g. Ibarra-Vega 2020; Tobias 2020). Sardar et al. (2020) in examining the effects of the initial lockdown measures have opined that it would have little effect in the western Indian states that are the most affected by the virus but can prove to be somewhat beneficial in other areas, while Das et al. (2020) have provided guidelines on the critical community size required to make lockdowns effective. However, it has been demonstrated that there exists a great deal of heterogeneity in how people and regions respond to and are affected by this pandemic (Sominsky et al. 2020) , with an urgent need for further science-based assessments of the virus' spread (Bedford et al. 2019) . With India being an extremely diverse country, in geographic as well as socio-demographic aspects, it is very much feasible that the virus' impact and spread will vary markedly across regions. Furthermore, due to the large variations in economic and healthcare attributes across the nation, the physiological response and case-fatalities will also differ significantly across areas. Therefore, we seek to examine the COVID-19's footprint across the nation and link it intricately with the situational realities in different locales, something which is sorely required but has received scarce attention in the current context. The entire analysis has been based on the district-level administrative zones ( Figure 1 ) denoted in the Indian Population Census of 2011, which lists 640 districts nationwide, in order to keep parity with the Census datasets used. The number of district-level COVID-19 cases as reported by the Ministry of Health and Family Welfare (MoHFW), Government of India (GoI) was the principal secondary dataset utilized. 
This data was also used for the identification of COVID-19 hotspots and analysis of its possible outreach. On 24th March 2020, the Government of India (GOI) ordered a nationwide lockdown for 21 days (Phase-I), from 25 March -14 April 2020, further extending this in Phase-II (15 April -3 May) and then to Phase-III (May 3 to May 17). Therefore, when this study was commenced in April, 2020, the district-wise numbers of COVID-19 cases were collected on four dates from the MoHFW, GoI, database to understand the patterns of the virus' spread in each timeslot and feasibly represent each of the lockdown phases. These dates were-a) Pre-lockdown Phase (on 23rd March); b) Early lockdown Phase (29th March); c) Mid-lockdown Phase (10th April) and d) Late lockdown Phase (18th April). Subsequently, the lockdown was extended till 31 May (Phase-IV, though with significant allowances for transport and industries/commercial establishments). To show this evolving situation, we have subsequently also mapped the nationwide situation on 1 June 2020 and again on 15 July 2020 (see Section 4.5). Two sets of indicators were used for the ensuing susceptibility and vulnerability analyses. For selecting these, we considered all the relevant information at the district level that could feasibly be associated with the COVID-19 outbreak (these datasets were collated from information made available by Office of the Registrar General of India (2011), MoHFW (2016), IIPS and ICF (2017) and similar other sources-see Table 1 and Table 2) . The district-level susceptibility is related to the viral transmission, while the vulnerability denotes the risk of fatality after infection. We identified the various potential indicators based on an extensive literature review (e.g. Coudhry and Avindandan 2020; Dowd et al. 2020; França et al. 2009; Rocklov and Sjodin 2020; Sajadi et al. 2020; WHO 2020a; Zhong et al. 
2020) and the parameters finally selected for both the susceptibility and vulnerability analyses encapsulate the district-wise overall socioeconomic, demographic, climatic, health and hygiene conditions. Their descriptions and references are listed in Table 1 and Table 2, respectively, while the conceptual framework devised for the entire workflow is depicted in Figure 2. Since the dispersion of the COVID-19 virus from its source region and its subsequent entry into other locales was mostly via returning air-travellers, the initial spreading centres were most likely to develop around cities that handled large volumes of air traffic. The annual air passenger data for 2018-2019 was collected from the Airports Authority of India (https://www.aai.aero/en) for 54 international airports across the country. The number of COVID-19 cases within a district was used as an attribute of its centroid, and a spatial buffer analysis was done to understand the relation between the distance from the airport and the number of COVID-19 cases. The Inverse Distance Weighting interpolation method was used to represent the ratio between the number of COVID-19 cases and the number of air passengers for each international airport, with higher ratios indicating a greater likelihood of a large number of COVID-19 cases emanating from or being transmitted by passengers who have passed through that particular airport. The district-level numbers of COVID-19 cases in the different time periods were visualised through a series of maps. Similarly, changes in the virus hotspots and its potential outreach were identified for each time step using the Getis-Ord Gi* statistic (Getis and Ord 1992; Ord and Getis 1995). In contrast, the Moran's I values (Moran 1950) were used to distinguish between statistically significant COVID-19 clusters and their relations with other such proximate clusters. 
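To make the hotspot statistic concrete, the following is a minimal sketch of the Gi* z-score of Ord and Getis (1995), written in plain Python. The six-district case counts and the binary adjacency weights are hypothetical toy values, not the study's data, and the function name is ours; the paper does not state which software was used for this step.

```python
import math

def getis_ord_gi_star(values, weights):
    """Return the Gi* z-score for every location.

    values  -- list of case counts, one per district
    weights -- n x n spatial weights (list of lists); the diagonal is
               non-zero because Gi* includes the focal district itself
    """
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation of the attribute
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        w_row = weights[i]
        w_sum = sum(w_row)
        w_sq_sum = sum(w * w for w in w_row)
        local_sum = sum(w * x for w, x in zip(w_row, values))
        numerator = local_sum - mean * w_sum
        denominator = s * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores

# Hypothetical toy data: six districts along a line, each neighbouring
# only the districts directly adjacent to it (plus itself).
cases = [120, 95, 10, 4, 2, 1]
n = len(cases)
W = [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(n)] for i in range(n)]
gi_star = getis_ord_gi_star(cases, W)
```

Large positive scores mark clusters of high case counts (hot spots) and large negative scores mark clusters of low counts (cold spots). In this toy example the first district, which sits next to another high-count district, receives a strongly positive score.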
The COVID-19's potential outreach was assessed using vector-based double and vector-based integer assessments implemented in Python. The Euclidean distance was computed from each district's centroid and the number of COVID-19 cases for that district was used as its mass (cf. Dong et al. 2015; Klobucnik and Malikova 2016). As such, the higher the number of COVID-19 cases in a district, the higher would be its outreach potential towards the surrounding districts, with this following a distance-decay function (e.g. Frolov 1977; Pueyo et al. 2013). The estimated COVID-19 potential value was normalized (cf. Choi et al. 2010) in the range of 0-1 for weighting each conditioning parameter. Subsequently, the derived normalized district potential outreach values were also scaled between 0 and 1 and the natural breaks classification scheme was used to group the districts on the basis of their respective values. The terms susceptibility and vulnerability are often used interchangeably for individuals and communities with excessive health burdens or issues. More specifically, the vulnerability component indicates the external factors (i.e. exposure to a disease), while the susceptibility aspect refers to the inherent (mostly physiological) characteristics/capabilities of individuals and communities to cope with diseases (Kovats et al. 2003). In this study, the COVID-19 susceptibility indicates the efficiency of the disease's spread into regions/communities, whereas the vulnerability implies the degree to which these regions/communities may be unable to cope with the adverse effects of COVID-19 infection and thereby suffer grave health consequences and possible death (Kovats et al. 2003). So, higher values of susceptibility indicate an enhanced risk of the spread of this infection, while higher values of the vulnerability index imply a greater threat of death from the COVID-19 infection. 
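The potential outreach computation described earlier in this section (case counts as mass, Euclidean distance between district centroids, distance decay, rescaling to 0-1) can be illustrated with a short sketch. The exact decay function used in the study is not given, so the inverse-power form, the exponent `beta`, and the choice to count a district's own cases at full weight are all assumptions made for illustration; the coordinates and counts are hypothetical.

```python
import math

def potential_outreach(centroids, cases, beta=1.0):
    """Gravity-style potential per district, min-max scaled to 0-1.

    centroids -- list of (x, y) projected coordinates, one per district
    cases     -- list of case counts used as the 'mass' of each district
    beta      -- assumed distance-decay exponent (hypothetical parameter)
    """
    n = len(cases)
    raw = []
    for i in range(n):
        xi, yi = centroids[i]
        total = float(cases[i])  # assumption: own cases count at full weight
        for j in range(n):
            if j == i:
                continue
            xj, yj = centroids[j]
            d = math.hypot(xi - xj, yi - yj)  # Euclidean centroid distance
            total += cases[j] / d ** beta     # influence decays with distance
        raw.append(total)
    lo, hi = min(raw), max(raw)
    # Min-max rescaling; assumes at least two distinct potential values
    return [(p - lo) / (hi - lo) for p in raw]

# Hypothetical toy data: four districts on a line, coordinates in km
centroids = [(0, 0), (50, 0), (100, 0), (400, 0)]
cases = [100, 10, 5, 0]
scaled = potential_outreach(centroids, cases)
```

In this toy layout the district holding most cases scores 1, its near neighbours inherit part of that potential, and the remote district with no cases scores 0. Grouping the scaled values into classes (natural breaks in the paper) would follow as a separate step.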
The epidemiological susceptibility analysis dataset, therefore, includes the actual number of COVID-19 cases in the country along with other relevant socioeconomic, demographic and climatic factors. The socioeconomic susceptibility analysis dataset, on the other hand, excludes the variable denoting the number of COVID-19 cases in a district. We sought to undertake such a dual analysis in order to represent both, the actual situation which has emerged (keeping in mind data limitations and lags) as well as the situation that can arise (i.e. if all areas are eventually targeted or intruded into by the virus, which ones may succumb more easily). While constructing the epidemiological susceptibility index, the maximum weightage was given to the variable denoting the district-wise number of COVID-19 cases, as the virus' mere presence in a district endangers that area the most and makes it more liable to become a spreading centre than any other ambient socioeconomic, demographic or climatic factor. For the socioeconomic susceptibility index, all variables were given almost equal weights (see Table 1 ). With the widespread transmission of COVID-19 already underway, it is tricky to conduct any susceptibility analysis using real-time information due to the obvious time-lag between information gathering and analysis. In this paper, we have used the then most recent available data for the districts as released by the MoHFW, GoI, when we commenced our analysis in the latter half of April 2020. Similar to the weightage pattern accorded for the socioeconomic susceptibility computation, the index for ascertaining the socioeconomic vulnerability was also developed by assigning equal weightage to all variables ( Table 2) . The above computed socio-economic susceptibility and vulnerability indices have some indicators that are common to both and some that are exclusive to each. The justification for each indicator has been mentioned in the relevant tables. 
For example, some of the indicators which are not related to the spread of infection (and thereby the computation of the susceptibility index) but are associated with fatality (such as old age population and health infrastructure) were considered only while computing the vulnerability index (see Table 2). The susceptibility or vulnerability index is the normalised value of an un-normalised index. For computing the susceptibility and vulnerability indices, the un-normalised index (i.e. the summation of all the weighted indicators) was first computed (Eq. 1). After this, a normalised index was prepared, which ranged between 0 and 100 (Eq. 3). Before beginning the statistical enumeration, all the variables were converted to positive directions (i.e. a higher value would show a greater risk of susceptibility or vulnerability). The index construction procedure can be specified as follows:

$$UI = \sum_{i=1}^{n} w_i z_i \qquad \text{(Eq. 1)}$$

where $w_i$ stands for the weight assigned to the $i$th indicator and $n$ is the number of indicators. The term $z_i$ expresses the scale-free indicator obtained by dividing the original value ($x_i$) by the mean value ($\bar{x}_i$) of that variable (Eq. 2). The rationale of these scale-free indicators has been previously analysed in the literature (Kundu 1984, 2004; c.f. Haque 2016).

$$z_i = \frac{x_i}{\bar{x}_i} \qquad \text{(Eq. 2)}$$

Eventually, the normalised index was computed by rescaling the un-normalised index (Eq. 3). The final output value obtained ranged between 0 and 100 (normalised index), with a higher value representing a greater risk of susceptibility or vulnerability to COVID-19. The assessment of spatial heterogeneity in the relationship between two variables may be useful if the presence/numbers of COVID-19 cases are available for all districts. Due to the unavailability of this data, we have only focussed on eliciting linear and non-linear relationships of the district-wise number of COVID-19 cases with the ambient socioeconomic, demographic and climatic variables. In computing the linear relationship, we obtained the Pearson's correlation plot using the 'corrplot' package in R (Wei and Simko 2017). 
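The index construction described above (scale-free indicators, weighted summation, rescaling to 0-100) can be sketched as follows. The final rescaling formula is not recoverable from the text, so min-max scaling is assumed here; the indicator names, values and weights are hypothetical and do not come from Table 1 or Table 2.

```python
def composite_index(indicators, weights):
    """Weighted composite index per district, rescaled to 0-100.

    indicators -- dict: indicator name -> list of district values, already
                  oriented so that a higher value means greater risk
    weights    -- dict: indicator name -> weight
    """
    names = list(indicators)
    n = len(indicators[names[0]])

    # Scale-free indicator: each value divided by the indicator's mean
    scale_free = {}
    for name in names:
        vals = indicators[name]
        mean = sum(vals) / n
        scale_free[name] = [v / mean for v in vals]

    # Un-normalised index: weighted sum of the scale-free indicators
    raw = [
        sum(weights[name] * scale_free[name][d] for name in names)
        for d in range(n)
    ]

    # Assumed min-max rescaling to the 0-100 range
    lo, hi = min(raw), max(raw)
    return [100.0 * ((r - lo) / (hi - lo)) for r in raw]

# Hypothetical toy data for three districts with equal weights
indicators = {
    "population_density": [200, 800, 500],
    "share_elderly": [6, 12, 9],
}
weights = {"population_density": 0.5, "share_elderly": 0.5}
index = composite_index(indicators, weights)
```

Dividing by the mean removes the measurement units, so indicators as different as persons per square kilometre and percentages can be added together. For the epidemiological susceptibility index the weight on the case-count variable would simply be set higher than the others.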
For the multivariate analysis, the non-linear relationships between COVID-19 cases and their correlates were computed using the Generalized Additive Model (GAM), via the 'gam' package in R (Hastie 2019). COVID-19 cases were first diagnosed in India in late January 2020 and crossed 60,000 by the end of the second week of May 2020 (with numbers in early August 2020 exceeding 19 lakh, i.e. 1.9 million). With this highly contagious disease transmitting primarily through international tourists and returning travellers from abroad, the initially affected sites in India were mostly cities that have international airports (or regions adjoining such places) or are major tourist destinations. This is made apparent by the district-wise spread pattern of the COVID-19 virus (Figure 3). In the first phase (Figure 3A), cases were reported from western India (around Mumbai and Ahmedabad, two of the main commercial hubs of the country), from around New Delhi (the national capital) and Ladakh (popular tourist destination and a prominent Indian Army base) and from the southern states of Kerala (from where many residents migrate/travel for work to the Gulf region), Tamil Nadu, Andhra Pradesh and Karnataka (all of which have major metropolitan centres and commercial hubs: Chennai, Hyderabad and Bengaluru). Further intensification of the above pattern in the next time step of 29th March 2020 was apparent, with the adjoining regions reporting substantial case numbers (Figure 3B). Broad swathes of the eastern part of the country still remained mostly unaffected, except for Kolkata (the major regional metropolitan centre) and its surroundings. In the next time step (10th April 2020), two large contiguous zones had emerged in northern India and western-central-south India. The final time step of 18th April 2020 (Figure 3D) showed the merging of the two entities described above, with infilling of their intervening districts (i.e. 
reporting of cases from previously unaffected areas) and a rise in cases in districts already afflicted. This resulted in a near-continuous stretch, from Kashmir to Kanyakumari, along the central and western corridors and down the eastern and western coastal belts, reporting COVID-19 cases. By now the eastern zone had noted a rise in cases (with its epicentre at Kolkata). This manifested as a narrow line of districts along the densely populated Ganga plains in Uttar Pradesh and Bihar, which merged into the larger/more contiguous zone further west. The above trends highlighted that, as expected, the distance from an international airport and the numbers of COVID-19 cases were characterised by distance-decay [i.e. they were inversely correlated, with case numbers decreasing as distance from the airport increased (Figure 4)]. During the successive lockdown phases, the percentage of cases within 50 km of an airport had decreased slightly, but numbers rose gradually for locations within 50-100 km of an airport. Thus, districts situated within this distance buffer (50-100 km) seemingly had a higher probability of being affected by COVID-19 and thereby swelling the overall case numbers. This indicated that the virus had spread out from its initial centres during the latter part of the lockdown and that its footprint was becoming much more apparent across the hinterland. Of the 50 international airports examined, the ratio of air passengers to COVID-19 cases was far higher for the Mumbai, Delhi, Bengaluru and Hyderabad airports. Hyderabad and Bengaluru (the main information technology hubs of India) are entwined with global markets, while New Delhi (the national capital) is connected by flights worldwide for diplomatic, administrative and tourism purposes. Thus, the initial outbreaks occurred mostly through these four airports. 
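The distance-buffer comparison reported above (share of cases within 50 km, 50-100 km and so on from the nearest international airport) can be approximated with a short sketch. The great-circle (haversine) distance, the band edges beyond 100 km, and all coordinates and case counts below are assumptions for illustration; the study's own buffer analysis may have been carried out differently in GIS software.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def case_share_by_buffer(districts, airports, edges=(50, 100, 200)):
    """Percentage of all cases falling in each distance band.

    districts -- list of (lat, lon, cases) for district centroids
    airports  -- list of (lat, lon) for airports
    edges     -- band boundaries in km; bands are [0, 50), [50, 100), ...
    """
    counts = [0] * (len(edges) + 1)
    for lat, lon, cases in districts:
        nearest = min(haversine_km(lat, lon, a_lat, a_lon)
                      for a_lat, a_lon in airports)
        band = sum(1 for e in edges if nearest >= e)  # index of the band
        counts[band] += cases
    total = sum(counts)
    return [100.0 * c / total for c in counts]

# Hypothetical toy data: one airport and three district centroids
airports = [(19.0, 73.0)]
districts = [(19.0, 73.0, 500), (19.0, 73.7, 120), (19.0, 76.0, 30)]
shares = case_share_by_buffer(districts, airports)
```

Running the same function on the case counts of each time step and comparing the resulting shares would show whether the weight of cases is shifting from the innermost band to the 50-100 km band, which is the pattern described in the text.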
As mentioned before, many residents of Kerala work in the Gulf countries and this state is also a major tourist destination, with numerous flights routed to it through the Mumbai and Bengaluru airports. Therefore, it was also an initial hub of COVID-19 cases. The Getis-Ord Gi* method was used to identify the spatial distribution of potential hot spots (statistically The location of the cold spots remained almost similar to that discerned from the pre-lockdown phase data of 23rd March 2020, with clustering mainly around West Bengal. The mid-lockdown phase data of 10th April 2020 ( Figure 6C ) indicated intense clustering of high values around Maharashtra and Gujarat. However, the initial high values clustering around Kerala started diminishing in this mid-lockdown phase, possibly due to the pre-emptive actions regarding testing and quarantine taken by the Kerala State Government and the Union Government's policies. In fact, Kerala was initially considered as the first Indian state likely to 'beat the curve' when it showed a continuous downtrend/decrease in its number of active COVID-19 cases from 6th April till 10th May (this situation has subsequently changed markedly for the worse and the patterns/trends as of 1st June 2020 and 15th July 2020 for the whole country have also been discussed in Section 4.5). The cold spot intensity had declined in the eastern part, almost throughout West Bengal. The 18th April 2020 dataset ( Figure 6D ) revealed a much more concentrated hot spot with a z-score of 7.84 in Maharashtra and Gujarat. By then, the entirety of Kerala and Andhra Pradesh had ceased to be part of any hot spot region. However, the area and intensity of the cold spot region located in the eastern part of the country had also declined significantly, indicating that the next phase of the virus' spread could target this zone. 
Again, subsequent events and updated news reports about the COVID-19 spread have validated just this, with the eastern region around Kolkata presently emerging as one of the most affected by the pandemic. While the Getis-Ord Gi* analysis of the district-level COVID-19 cases identified the clustering of low and high index values, it only delineated large clusters through neighbourhood analysis and ignored those districts that had high numbers of cases but were surrounded by low-value neighbours or vice versa. Thus, the Moran's I (Jackson et al. 2010) was computed to obtain additional insights into the statistically significant High-High, High-Low, Low-High and Low-Low clusters derived previously for each of the four datasets (Figure 7). Orissa are easily visually correlated as districts with high epidemic susceptibility. The potential gravity of the COVID-19 spread was assessed for the four different time steps to ascertain its likely outreach areas (Figure 8). The potential outreach on 23rd March 2020 (Figure 8A) was mostly around the Delhi NCR (National Capital Region). Some small outreach pockets were seen in Kerala and Maharashtra (e.g. Mumbai and Pune). From the 29th March 2020 dataset (Figure 8B), potential outreach epicentres were found in three significant locations: Delhi-centric, Mumbai-centric and Kerala-centric (due to reasons outlined in Section 4.1). In the 10th April 2020 dataset, Kerala displayed a marked improvement in controlling the outbreak, as discussed previously (Figure 8C). However, rapid growth was observed for Mumbai and its surrounding areas and this scenario continued into the next phase as well (Figure 8D). We thus predicted, based on the potential outreach analysis, that the gravity of the COVID-19 outbreak was likely to be very high in western India, especially in Maharashtra. 
The current situation reflects this, as this state ranks highest within India in COVID-19 incidence (over 1.38 million, i.e. 13.8 lakh, reported cases), in numbers of active cases and in deaths (India Covid-19 Tracker at https://www.covid19india.org/ as on 1st October 2020). Maharashtra is overwhelmingly afflicted in Mumbai and its nearby areas of Thane and Pune. Neighbouring Gujarat has also been hardest hit in Ahmedabad, Surat and Vadodara. As per the performed potential outreach analysis, the North-eastern states (except Assam) and the Himalayan states are relatively less affected by the COVID-19 outbreak. However, the eastern districts of Bihar, Jharkhand and West Bengal are seen to gradually come under the pandemic's grasp over the different time-steps, pointing towards this zone possibly carrying forward the rising patient numbers in the near future, as the spread in western/northern India may start to peak or gradually decline. Most of the COVID-19 afflicted residents in eastern India have a travel history from either Mumbai, Kerala or Delhi. Initially, only a few cases were observed in this region due to its lower volume of international air traffic (as denoted by the lower air passenger to COVID-19 case ratio) but case numbers have risen during the latter lockdown phases and beyond due to inter-state in-migration from the western and southern parts of the country, particularly as a result of large numbers of returning migrant workers (Mullick 2020). Overall, the potential outreach analyses denoted that the virus' epicentres have been mostly concentrated in and around Mumbai in Maharashtra. 
As Kerala and also Karnataka initially recovered quite quickly, the earlier epicentre in southern India was reduced during the subsequent lockdown phases (however, infections have again risen sharply in Karnataka and Tamil Nadu towards the end of the final lockdown phases in May 2020 as restrictions have relaxed and presently, these two states along with Andhra Pradesh occupy the second to fourth positions in the country in terms of COVID-19 incidence). The eastern part of the country now faces the imminent threat of becoming an epicentre if further transmissions into the region occurs from the ongoing return of many currently unemployed migrant labourers from other parts of the country and consequent community transmissions. The comparative outlooks of the undertaken hotspot and the potential outreach analyses differed slightly in perspective. While the hot spot enumeration discerned COVID-19 affected zones based on the actual ground situation/data as ascertained for that point in time, the potential outreach maps denoted the gravity factors for the four different time periods and depicted how conditions may worsen in the major epicentres or have eased off in other areas (based on the then available data). The outreach analysis thus gauges the outbreak's potential across India and denotes zones likely to be affected next. Using the indicators listed in Table 1 and Table 2 , we had developed the area-based composite COVID-19 susceptibility and vulnerability indices at the district level-for India, with a view towards providing policy makers with some indication on which districts are likely to be most susceptible or vulnerable to a COVID-19 outbreak and specifically where should the Government target its resources and accordingly plan a data-driven intervention strategy. The elicited results from these indices are presented below. 
A five-class scheme (ranging from Very High to Very Low) was used to visualise the enumerated district-level epidemic susceptibility values (derived using quintile classes; Table 3). There are significant regional clusters in the northern, eastern, southern and parts of the north-eastern states of India (Figure 9). Large swathes of south-western and northern India, covering most districts of Kerala, Tamil Nadu, Telangana, Andhra Pradesh, Karnataka, Maharashtra, Gujarat, Rajasthan, Delhi, Haryana and Punjab are highly susceptible to this pandemic. Primarily, these were the areas where the initial outbreak occurred and the subsequent transmission has been phenomenal, with people therein being seemingly less able to cope with the COVID-19 virus (i.e. their physiology is more easily affected by it). Possibly, the prevalence of urbanization (which creates congestion), an existing burden of non-communicable diseases (hypertension, diabetes and obesity) and a greater proportion of the elderly population in these regions may have heightened the overall epidemic susceptibility. Ironically, most of the highly urbanized and economically well-off states seem to have reported higher COVID-19 susceptibility, since the initial epicentres of the virus outbreak were in their large cities. Contrarily, almost all districts in the seven north-eastern states are less susceptible to the pandemic. It is interesting that despite being economically poorer and having lower indicators with respect to almost every socioeconomic and health-related parameter than the rest of India, these areas are less susceptible to the pandemic, suggesting that the COVID-19 outbreak might be an erratic phenomenon that cannot be explained solely by traditional socioeconomic theories and which might require further investigation. 
As many districts in the moderate COVID-19 susceptibility class lie in economically poorer zones, this can have important implications in terms of impaired healthcare services and facilities, which may in turn weaken immune response and patient recovery. Furthermore, as discerned in previous sections, India's eastern region is likely to become a secondary virus epicentre. Thus, even though this zone falls in the moderate susceptibility category, its lower socioeconomic standing and its poorer healthcare assets and availability/accessibility accord it high priority for pre-emptive resource allocation, so that the likely forthcoming challenges can be met adequately.
Figure 10 visualizes the socioeconomic susceptibility index for India's districts, following a classification similar to that adopted for Figure 9 (see Table 3). This index addresses the following question: if all districts were evenly infected by COVID-19, what would be the magnitude of, or susceptibility to, further transmission in a given area, given its socioeconomic status? Results reveal that large portions of eastern, northern and north-western India have high to very high socioeconomic susceptibility to this pandemic. These areas are particularly characterized by high population densities, chronic malnutrition, poor health infrastructure, larger family sizes, poor hygiene practices, concentrated poverty and marginalization, including lower health-related knowledge and awareness, which together precipitate such outcomes. On the contrary, most districts in the north-eastern, southern and extreme northern regions (e.g. in the northern part of Rajasthan and in Punjab, Himachal Pradesh, Jammu and Kashmir and Uttarakhand) are found to be socioeconomically less susceptible in terms of COVID-19 transmission.
Interestingly, some areas that initially had higher numbers of COVID-19 cases have emerged in the low susceptibility category (e.g. Kerala), and this merits further explanation. In Kerala, after the initial outbreak was reported, effective State Government measures temporarily lowered further transmissions by facilitating mass testing and awareness creation at the community level, and through stringent physical distancing and lockdown norms. Alongside this, and most importantly, there has been efficient management of both international and inter-state migrants in the state. However, conditions have progressively worsened as restrictions have eased and large numbers of expatriates have returned from the Gulf region and from other parts of the country (Nidheesh 2020). The state has also seen a very sharp jump in the number of reported cases in the aftermath of the annual Onam festival (held from 22 August to 2 September) (The Hindu 2020), with numbers rising from about 50,000 before it to above 190,000 by the end of September 2020 (India Covid-19 Tracker at https://www.covid19india.org/ as on 1st October 2020). With the autumn festival season approaching fast, there are thus grave fears of a significant rise in case numbers across the country (CNA 2020), while such fears have also further affected local economies (Sharma 2020).
Aspirational Districts (ADs) are more susceptible to COVID-19 outbreaks due to their already limited coping capacities, as they are some of the socioeconomically poorest/most backward regions of India. The overall health infrastructure, and particularly the number of primary and community health centres, is also lower in such locales, with limited staff and bare minimum facilities further compounding the issue. However, the incidence of COVID-19 cases in these areas was initially fairly low, being less than 2% (a total of 610 cases as of 6th May 2020) of the total cases nationwide.
Of the 112 ADs affected by the virus on the above date, the worst-hit have been Ranchi (55 cases), Baramulla (62), YSR (55), Nuh (57), Jaisalmer (34) and Kupwara (47), which are all located in the Red Zone (the locales that faced the most stringent lockdown norms) as per the Indian Government's classification. Considering their susceptibilities, the GoI (through the NITI Aayog) has taken steps to ensure appropriate and timely action for resolving supply shortages in test kits, Personal Protective Equipment (PPE) and in providing masks to the respective empowered populations/groups in these districts. We were also able to identify these districts based on their respective susceptibility indicators, e.g. percentage of poor population, household size, impaired child health (i.e. suffering from anaemia, underweight or stunted development), and poor female health (anaemia and underweight). Apart from these ADs, we further identified some other districts that are also quite susceptible due to their socioeconomic background and these areas (most districts of Madhya Pradesh, Orissa, Andhra Pradesh, Bihar, Jharkhand, West Bengal and Uttar Pradesh) need support similar to that being provided to the ADs (i.e. urgent supply of testing kits, PPE and masks). Based on the selected set of indicators, the discerned socioeconomic vulnerability to the COVID-19's impact ( Figure 11 ) is likely to be higher in many districts of Madhya Pradesh, Orissa, Telangana, Andhra Pradesh, Bihar, Jharkhand, West Bengal and Uttar Pradesh and these areas are likely to report higher fatalities. There is a 30% overlap of the places most at risk in these states with those demarcated under the Union Government's Aspirational Districts Programme, i.e. 75 of the discerned 255 highly to very highly vulnerable districts are also ADs. 
The scarcity of healthcare facilities and personnel in these districts remains a major issue, and such areas also have large numbers of inter-state and intra-state migrants, who have steadily returned home during the lockdown. Therefore, it is quite possible that cases of COVID-19 infection will rise sharply in these locations and that the poorer medical infrastructure will increase the infection rate and the incidence of deaths, if adequate measures are not taken. In contrast, this impact is likely to be markedly lower in the states of Kerala, Punjab, Haryana and Himachal Pradesh, in most of the north-eastern states and in the Rann of Kachch (in Gujarat). Some districts of Tamil Nadu, Gujarat, Rajasthan and Maharashtra are moderately vulnerable. The enumerated index values are higher in some districts because these house a greater proportion of people who are more vulnerable to this contagion (i.e. a greater number of children and women who are anaemic and/or underweight, lower female education levels and a higher prevalence of diabetic patients). Therefore, these districts are likely to find it difficult to cope with the COVID-19 threat and its related morbidity. On the other hand, the less vulnerable districts seemingly do have sufficient capability to deal with the threat, as observed in the districts of Kerala, from where very few morbidity cases were initially reported compared to other parts of India, despite this state being an early epicentre of COVID-19.
Figure 12 shows the correlations derived between COVID-19 cases (i.e. the number of cases per district) and socioeconomic, demographic and climatic factors. The number of cases is positively associated with the percentage of urban population and the population density of a district. It is also positively associated with the percentage of women having attained 10th or lower standard of schooling. Furthermore, districts with a higher share of poorer households (40%) had lower viral transmissions.
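A bivariate association of the kind summarised in Figure 12 can be illustrated with a plain Pearson correlation coefficient. The sketch below is only an illustration: whether Pearson's coefficient is the one underlying Figure 12 is an assumption, and the district values are invented for the example.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data for five made-up districts (not study data).
pct_urban = [10.0, 20.0, 30.0, 40.0, 50.0]
cases = [5.0, 9.0, 16.0, 20.0, 30.0]
r = pearson_r(pct_urban, cases)
```

A coefficient close to +1 indicates that cases tend to rise with the urban share across districts. It measures linear association only, so a curved relationship can yield a modest coefficient even when the dependence is strong.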
In the second half of April 2020, although the virus had spread across 377 districts, the gravity of the outbreak was mostly concentrated in districts that were major urban agglomerations, such as Mumbai, Delhi and Hyderabad. People in urban areas are relatively more mobile than rural residents, with resultant greater congestion and crowding on public transport. They may also be forced to maintain smaller residential and social distances due to the higher densities of built-up zones and of population, especially within slum areas where housing shortages are quite severe (Haque et al. 2020). These factors, together with the inherent economic deprivation faced by slum residents, would likely enable a more widespread outbreak in such congested locales (Ahmed et al. 2020), as evidenced by the large numbers of cases reported from the Dharavi slum area of Mumbai, which is one of the largest such entities in Asia. This is also the likely cause of the extremely high numbers of infections subsequently reported from Delhi and its surrounding urban agglomeration and from Chennai and its neighbourhoods. As urbanites generally attain a higher level of education, the pandemic's outbreak is also positively associated with educational status, more so because this 'imported' virus mainly came into the country via air travel (i.e. from the movement of economically well-off sections of the population, who can be expected to have also attained higher education levels).
The above bivariate linear relationships do not capture non-linear associations between variables. For this, the multivariate non-linear association was evaluated via a generalized additive model (GAM). The outcome variable for this model was the cumulative number of COVID-19 cases in the district, i.e. the total tested/confirmed positive cases. The fitted model indicates a flexible, non-linear relationship between the enumerated variables (Figure 13).
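To give a sense of how an additive model captures non-linear effects of several predictors at once, the sketch below fits a simple additive model by backfitting. It is a deliberately simplified stand-in for a GAM and not the model fitted in this study: it uses an identity link and a per-level mean (bin smoother) instead of spline smooths, and the function names, predictor classes and responses are all hypothetical.

```python
def backfit_additive(X, y, n_iter=20):
    """Fit y ~ alpha + sum_j f_j(x_j), each f_j being a per-level mean (bin smoother).

    X: list of rows, each a tuple of discrete (already binned) predictor levels
    y: list of responses
    Returns (alpha, functions), where functions[j] maps a level to f_j(level).
    """
    n, p = len(y), len(X[0])
    alpha = sum(y) / n
    functions = [dict() for _ in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Partial residuals: remove the intercept and every other smooth term.
            sums, counts = {}, {}
            for row, target in zip(X, y):
                others = sum(functions[k].get(row[k], 0.0) for k in range(p) if k != j)
                resid = target - alpha - others
                sums[row[j]] = sums.get(row[j], 0.0) + resid
                counts[row[j]] = counts.get(row[j], 0) + 1
            fj = {level: sums[level] / counts[level] for level in sums}
            # Centre f_j over the observations so the intercept stays identifiable.
            centre = sum(fj[row[j]] for row in X) / n
            functions[j] = {level: value - centre for level, value in fj.items()}
    return alpha, functions


def predict(alpha, functions, row):
    """Evaluate the fitted additive model for one row of predictor levels."""
    return alpha + sum(f.get(level, 0.0) for f, level in zip(functions, row))


# Hypothetical balanced grid: density class (0-2) by urban-share class (0-2).
X = [(d, u) for d in range(3) for u in range(3)]
y = [float(d * d + 3 * u) for d, u in X]  # made-up additive, non-linear response
alpha, functions = backfit_additive(X, y)
max_error = max(abs(predict(alpha, functions, row) - t) for row, t in zip(X, y))
```

Each fitted function can then be inspected on its own, which is the sense in which an additive model reveals the shape of each predictor's effect while holding the others fixed.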
Districts with a population density of 1500-2500 people/km² specifically have a higher risk of the virus spreading within them. Of the 377 districts affected by the virus (as per the last date on which the data for this paper was collected), apart from a few situated in metropolitan cities, all others contain medium to large towns with a moderate level of population density. Highly urban districts face a greater threat of a COVID-19 outbreak, particularly when the urban population share crosses 60%. In the urban agglomerations of India, a significant proportion (17.4%) of residents live in slums (e.g. about half the population of Greater Mumbai lives in such locales). While they play a vital role in the city's functioning, these slum areas are poorly planned and very densely populated, as a result of the ever-accelerating urbanization trend in India. Such sites thus become potential hotspots for infectious diseases like COVID-19. This unplanned urban growth also poses considerable risk in terms of impaired preparedness and ready response to any infectious disease outbreak, since mass quarantining at such close quarters while maintaining social distancing is almost impossible. Hence, the responses of healthcare officials, governments and communities to this ongoing pandemic can be devised towards generating a paradigm shift in how urban spaces and residences are planned and designed, with possible ramifications for future peri-urban transformations, inner-city renewal and slum rehabilitation (Jha 2020; Regmi 2020; Van den Berg 2020). The findings also suggest that although districts with a higher prevalence of children suffering from anaemia and/or being underweight face an elevated risk of this viral transmission, the slope of this relationship is not very significant. Thus, these health indicators are not strongly correlated with COVID-19 cases in the Indian context.
A possible reason for this seemingly discordant finding is that, until the last date of the initial data collection for this paper (i.e. 18th April 2020), the GoI and/or State Governments may not have conducted an adequate number/proportion of population-level tests, instead testing only those who had already developed COVID-19 related symptoms. Therefore, a similar analysis performed on a bigger and possibly more representative sample obtained via large-scale population-level testing may yield better explanations.
Several studies have found that climatic parameters such as mean annual temperature and relative humidity have a crucial role in spreading the COVID-19 virus (Sajadi et al. 2020), with evidence suggesting that a mean temperature of 10°C and a medium to high (60% to 90%) relative humidity range are most suitable for its transmission. The great majority of India's districts have a daily mean temperature higher than 10°C in April and, except for a few districts in the Himalayan and coastal belts, all districts have an average relative humidity below the suitable range. Therefore, unlike other studies, we could not find any significant relationship between the currently prevalent climatic parameters and this viral outbreak.
The lockdown measures in India were extended to 31 May (Phase-IV, although movement and industries/commercial establishments were gradually allowed significant relaxations during this period), with the period from 1 June to 30 June being denoted as Unlock-1 and that from 1 July to 31 July as Unlock-2, as the nation reopened and eased restrictions further. Each of the three attributes (the spread of COVID-19 cases, the clustering of cases and the hot/cold spots) has thus evolved further since this study was commenced in mid-April 2020 with the initial database of the first few lockdown phases.
This section has thus been prepared to show the changed conditions as on 1st June 2020 and on 15th July 2020 in the above aspects, thereby denoting the complete scenario at the end of all the lockdown phases, which effectively ended on 31st May 2020, and the continued evolution of the disease in India during the Unlock-1 (1-30 June) and Unlock-2 (1-31 July) phases. It was apparent that the spread of COVID-19 in India could not be wholly contained simply by the various lockdown measures. This was evidenced by the overall continued rise in case numbers throughout the country [Figure 14]. In the central portion of the country, overall cases were still relatively on the lower side. Hotspots were clustered mostly in a broad swathe along the western and southern regions [Figure 15 (B)]. Cases in the western part of the country were more intensely concentrated in certain districts, as was evident from the low-low and low-high clustering patterns [Figure 15 (C)].
The COVID-19 crisis' impact on society and the global economy has been profound and is likely to be long-lasting. Possibly, governments either underestimated its threat or did not have robust enough socio-political systems and healthcare infrastructure to combat transmissions. While helpful, the lockdown measures have also engendered economic instability, and developing nations have been the worst affected, trying to contain the pandemic while addressing rising unemployment and the threatened livelihoods of a vulnerable population. Thus, it is paramount that correct and sustainable economic/socio-political decisions are taken. For this, reliable predictions of the pandemic's path, pinpointing the areas that it can affect the most, are needed for prudent and targeted resource allocation. Our analysis has highlighted the initial centres of the COVID-19 pandemic in India and its spread. By identifying its hotspots and significant clusters, we pinpointed locales where possible community transmissions have occurred.
Our computations were based on the data garnered during the initial lockdown phases implemented in India. However, subsequent reports of the pandemic's spread have largely validated our earlier estimates of the areas that were most likely to be affected (i.e. the virus gaining a foothold and then spreading in various clusters of Eastern India, around Kolkata). Our analysis thus lays the groundwork for identifying future hot spot zones so that communities and local governments can anticipate and allocate critical resources accordingly. Mass testing of COVID-19 in such hotspots can curb the disease's spread into adjoining areas. By also designating locales that are relatively safer and less susceptible/vulnerable to the pandemic (i.e. cold spots), we also provide a framework for earmarking places where economic activities can be carried out more safely, to boost flagging economies. We have also identified districts that are more vulnerable to the virus. These locations can, if possible, strive to build food stocks to avoid any food-security related issue in case sudden lockdowns are again required if a second wave of the virus arises. In the most vulnerable cities and the suburban areas located within 50-100 km around them, crowds in market places must be stringently restricted, with continued social distancing norms and masks being mandatory. Districts that have high epidemiological susceptibility need enhanced vigilance (possibly through specially constituted rapid-action teams and through detailed co-morbidity surveys) to avoid transmissions and a higher incidence of death. Areas that are socioeconomically susceptible face the greatest threat as they house substantial numbers of people who may not be physiologically capable of coping with the COVID-19 virus. 
Overall, our demarcated hot and cold spot areas and those that are epidemiologically susceptible or socioeconomically vulnerable tally quite well with the red-orange-green containment zones delineated by the GoI (NDMA 2020), providing further validation of the performed analyses. There are still some parts of India where COVID-19 has not been able to intrude significantly, especially in physiographically rugged and isolated tracts and in some rural hinterlands. It is paramount that these locations be guarded against community transmissions while more afflicted areas are dealt with adequately. Though the quickly enforced lockdown no doubt caused much misery, especially for migrant workers far from home, it may have also given the country some time to build up its medical responses and anticipate where the contagion can spread next. We hope that our study has created a framework that enables this to be accomplished fruitfully.
Densely populated areas raise the risk of greater physical contact among people. As SARS-CoV-2 is a highly contagious virus, population density is an unavoidable measure (Rocklöv and Sjödin 2020).
Investigation of effective climatology parameters on COVID-19 outbreak in Iran Why inequality could spread COVID-19 Globalization, the Challenge of COVID-19 and Oil Price Uncertainty Antimalarial drug policy in India: Past, present and future Global database of Plasmodium falciparum and P. vivax incidence records from 1985-2013. Sci. 
Data A new twenty-first century science for effective epidemic response Space-time dependence of corona virus (COVID-19) outbreak Covid-19: risk factors for severe disease and death Coronavirus infections reported by ProMED A randomized trial of Hydroxychloroquine as postexposure prophylaxis for Covid-19 Meta-analysis of the relationship between risk perception and health behavior: the example of vaccination A spatio-temporal analysis for exploring the effect of temperature on COVID-19 early evolution in Spain Estimation of COVID-19 prevalence in Italy Healthcare impact of COVID-19 epidemic in India: A stochastic mathematical model Application of a fuzzy operator to susceptibility estimations of coal mine subsidence in Taebaek City COVID-19 sows dread in India's festival season as infections cross 6 million Factors determining the diffusion of COVID-19 and suggested strategy to prevent future accelerated viral infectivity similar to COVID Chloroquine for the 2019 novel Coronavirus SARS-CoV-2 novel Coronavirus: an emerging global threat Why COVID-19 Outbreak in India's Slums Will Be Disastrous for The Urban Poor COVID-19, a worldwide public health emergency. Spanish Clinical Magazine Malaria in India: The Centre for the Study of Complex Malaria in India Critical community size for COVID-19 -a model based approach COVID-19 associated pneumonia in clinical studies The analysis of spatial association by use of distance statistics A balance act: minimizing economic loss while controlling novel coronavirus pneumonia Infrastructure development and access to basic amenities in Class-I cities of West Bengal, India: Insights from census data Location matters: Unravelling the spatial dimensions of neighbourhood level housing quality in Kolkata Coronavirus disease 2019 (COVID-19): A literature review Generalized Additive Models Hydroxychloroquine or Chloroquine for treatment or prophylaxis of COVID-19: a living systematic review Modeling containing COVID-19 infection. 
A conceptual model National Family Health Survey (NFHS-4) A modified version of Moran's I Observer Research Foundation Analysis of COVID-19 infection spread in Japan based on stochastic transition model The impact of population potential on population redistribution in the long-term historical context: case study of region Stredne Povazie Methods of assessing human health vulnerability and public health adaptation to climate change. World Health Organization, Health Canada, World Meteorological Association, United Nations Environment Programme Predication of Pandemic COVID-19 situation in Maharashtra Measurement of Urban Processes: A Study in Regionalization Towards Building a Composite Index for Asia: Realising the Millennium Development Goals, Vi-29. United Nations Development Programme (UNDP) COVID-19 and India's nowhere people. The Diplomat Prevalence and impact of cardiovascular metabolic diseases on COVID-19 in China Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease The role of absolute humidity on transmission rates of the COVID-19 outbreak novel coronavirus (2019-nCoV) outbreak-A new challenge Prudent public health intervention strategies to control the coronavirus disease 2019 transmission in India: A mathematical model-based approach Prevention Is Better Than the Cure: Risk Management of COVID-19 The Global Macroeconomic Impacts of COVID-19: Seven Scenarios District wise health care infrastructure Notes on continuous stochastic phenomena Covid-19 cases spike in 6 states as migrants return. Hindustan Times, 10th June At the Epicenter of the COVID-19 Pandemic and Humanitarian Crises in Italy: Changing Perspectives on Preparation and Mitigation. Innovations in Care Delivery NDMA (National Disaster Management Authority) (2020) Hotspot' (Red Zone) classification Kerala reopens without communal transmission despite covid rise. livemint, 9th June Census of India 2011. 
Ministry of Home Affairs, Government of India Local Spatial Autocorrelation Statistics: Distributional Issues and an Application Inter nation social lockdown versus medical care against COVID-19, a mild environmental insight with special reference to India SEIR and Regression Model based COVID-19 outbreak predictions in India Climatic factors influence the spread of COVID-19 in Russia Climatic influence on the magnitude of COVID-19 outbreak: a stochastic model-based global analysis Temperature significantly changes COVID-19 transmission in (sub) tropical cities of Brazil Assessment of 21 Days Lockdown Effect in Some States and Overall India: A Predictive Mathematical Study on COVID-19 Outbreak Covid-19 cuts into business of religion in India Host susceptibility to severe COVID-19 and establishment of a host risk score: findings of 487 cases outside Wuhan A simple Stochastic SIR model for COVID 19 Infection Dynamics for Karnataka: Learning from Europe. Populations and Evolution (preprint) Chloroquine and hydroxychloroquine in the treatment of COVID-19 with or without diabetes: A systematic search and a narrative review with a special reference to India and other developing countries Age-structured impact of social distancing on the COVID-19 epidemic in India Explained: Indian migrants, across India. The Indian Express One size does not fit all -Patterns of vulnerability and resilience in the COVID-19 pandemic and why heterogeneity of disease matters The Hindu (2020) Coronavirus | Kerala cases rise to new high after Onam drop. The Hindu The Lancet (2020) Indian under COVID-19 lockdown Evaluation of the lockdowns of the SARS-CoV-2 epidemic in Italy and Spain after one month follow up Prediction for the spread of COVID-19 in India and effectiveness of preventive measures The outbreak of COVID-19 coronavirus and its impact on global mental health How will COVID-19 affect urban planning? 
The CityFix Susceptibility Analysis of COVID-19 in Smokers Based on ACE2 High Temperature and High Humidity Reduce the Transmission of COVID-19 A review of the 2019 novel Coronavirus (SARS-CoV-2) based on current evidence R package "corrplot": Visualization of a Correlation Matrix (Version 0.84) WHO (World Health Organisation) (2020a) (World Health Organisation) (2020) Newsroom/Coronavirus (Covid-19 WHO (World Health Organisation) (2020b) Coronavirus disease 2019 (COVID-19) Situation Report-73. World Health Organisation Exposure to air pollution and COVID-19 mortality in the United States. medRxiv (preprint) Effects of temperature and humidity on the daily new cases and new deaths of COVID-19 in 166 countries Association between ambient temperature and COVID-19 infection in 122 cities from China Prevalence of comorbidities and its effects in patients infected with SARS-CoV-2: a systematic review and meta-analysis Relationship between the ABO Blood Group and the COVID-19 Susceptibility. medRxiv Preprint Knowledge, attitudes, and practices towards COVID-19 among Chinese residents during the rapid rise period of the COVID-19 outbreak: a quick online cross-sectional survey
The authors are grateful to their respective institutions for providing the infrastructure required to conduct this research. The assistance of Ms. Sayoni Mondal, Research Scholar at the Department of Geography, Presidency University, Kolkata, India in arranging the article references is gratefully acknowledged.
None
Percentage of the total population living in urban areas. Urban areas are particularly vulnerable as they are the initial centres of the outbreak. The prevalence of slums within cities especially raises the risk of this viral transmission (Coudhry and Avindandan 2020). Office of the Registrar General of India (2011) 1 7
Percentage of women having 10th standard education and above. Proxy parameter for awareness about the spread of COVID-19. 
It is assumed that if an area has a higher proportion of women with 10 or more years of schooling, then overall household awareness will be higher and this can lower transmission (Zhong et al. 2020). (2017) 1 7
Percentage of households exposed to smoking. Smoking or being exposed to smoking in any form can elevate the risk of infection from COVID-19 (WHO 2020).
Wealth status. The economically wealthier sections can make better arrangements for rigid interventions such as lockdowns, while poor people may be forced to go out to arrange food supplies. Accessing meals from community kitchens, receiving donated food items and buying essential items from government fair price shops may raise the probability of the less economically well-off sections catching the infection.
Percentage of households exposed to smoking. Smoking may not only increase the risk of viral transmission but also raise the probability of dying, as COVID-19 primarily afflicts the pulmonary system.