key: cord-1005866-rkei8ul2 authors: Azad, Sarita; Devi, Sushma title: Tracking the spread of COVID-19 in India via social networks in the early phase of the pandemic date: 2020-08-08 journal: J Travel Med DOI: 10.1093/jtm/taaa130 sha: f6f13707b3f8aedc7ca4c6aceb60b508ee63e366 doc_id: 1005866 cord_uid: rkei8ul2

BACKGROUND: The coronavirus pandemic (COVID-19) has spread worldwide via international travel. This study traced its diffusion from the global to the national level and identified a few superspreaders that played a central role in the transmission of this disease in India.

DATA AND METHODS: We used the travel history of infected patients from January 30 to April 6, 2020, as the primary data source. A total of 1386 cases were assessed, of which 373 were international and 1013 were national contacts. The networks were generated in Gephi software (version 0.9.2).

RESULTS: The largest numbers of connections were established from Dubai (degree 144) and the UK (degree 64). Dubai had the highest eigenvector centrality, which made it the most influential node. The statistical metrics calculated from the data revealed that Dubai and the UK played a crucial role in spreading the disease in Indian states and were the primary sources of COVID-19 importations into India. Based on the modularity class, different clusters were shown to form across Indian states, which demonstrated the formation of a multi-layered social network structure. A significant increase in confirmed cases was reported in states like Tamil Nadu, Delhi, and Andhra Pradesh during the first phase of the nationwide lockdown, which spanned from March 25 to April 14, 2020. This was primarily attributed to a gathering at the Delhi Religious Conference (DRC) known as Tabliqui Jamaat.

CONCLUSIONS: COVID-19 was introduced into Indian states mainly through international travel, with the very first patient travelling from Wuhan, China.
Subsequently, the contacts of positive cases were located, and a significant spread was identified in states like Gujarat, Rajasthan, Maharashtra, Kerala, and Karnataka. The spread of COVID-19 in phase one was traced using the travel history of the patients, and it was found that most of the transmissions were local.

The first case in India, a patient who had travelled from Wuhan, China, was reported on January 30, 2020. In the absence of any cure, this disease could have been catastrophic for a vast country such as India, with its 1.3 billion residents. However, as of April 2020, the COVID-19 infection rate in India was markedly lower than in other affected countries. This slow spread could mainly be due to the prompt 3-week nationwide lockdown from March 25 to April 14, 2020 [4]. To control the pandemic, the Indian government enacted a range of social distancing strategies, such as citywide lockdowns, screening measures at train stations and airports, and isolation of suspected cases. The government also imposed a complete restriction on domestic and international flights during this time. Hence, the travel data available for this study extend only to April 6. The exponential global spread of COVID-19 resulted in a 22% reduction in international travel, followed by a 57% reduction in March 2020. This situation has put 100 to 120 million tourism jobs at risk and severely affected the tourism sector [5]. Many researchers and media outlets also speculated that widespread tuberculosis vaccination or malaria resistance could have helped India remain immune to the pandemic to some extent, possibly slowing the rate of infection [6]. As of April 10, India's five worst-hit states were Maharashtra, Delhi, Tamil Nadu, Rajasthan, and Telangana, which were declared hotspots in terms of the total number of COVID-19 infections [7]. In the initial phase, the transmission of COVID-19 was mainly due to international travel. used to ascertain the complexity of varying network systems.
In a social network, contact patterns can be used to analyse disease dynamics. A network can be characterised through statistical metrics such as degree, modularity, and centrality, which are the essential factors that quantify a network [13] a weak external cohesion (outside the group) [17]. Several well-known methods documented in the literature enable the construction of such communities in the form of clusters, known as modularities [18] [19] [20] [21]. In this study, the network was generated via Gephi software (version 0.9.2), which uses the Louvain method for community detection [22].

The main objective of this study was to determine the social network behind the spread of COVID-19 in India. We traced the situation from the beginning and showed how the outbreak spread throughout Indian states via cluster formation. This work is an essential contribution, as few studies are available on the COVID-19 transmission network as a whole.

The data utilised in this study were obtained from https://www.covid19india.org and include the patient number, the state of residence, the travel history, and the source. Since there was a complete restriction on domestic and international flights after the first lockdown in India, which commenced on March 25, 2020, the travel data available for the present study cover January 30 to April 6. Gephi (version 0.9.2) software was used for network generation and visual exploration. It computes several essential network parameters, which are explained as follows.

Degree centrality is an important parameter that measures the total number of edges attached to a particular node [23]. The node with the highest degree centrality has the most links to other nodes. There are two types of degree centrality: in-degree and out-degree. In a directed graph, the edges that go into a node are in-degree edges and the edges that come out of a node are out-degree edges.
Mathematically, the degree centrality of node $i$ is expressed as:

$$C_D(i) = \sum_{j} X_{ij}$$

where $X_{ij}$ includes both in and out edges.

Closeness centrality measures how close a node is to all other nodes in a given network and can be calculated from the average shortest-path length from one node to every other node [24], expressed as its reciprocal:

$$C_C(i) = \frac{N - 1}{\sum_{j \neq i} d(i,j)}$$

where $d(i,j)$ is the length of the shortest path between nodes $i$ and $j$ in the network, and $N$ is the number of nodes.

The eigenvector centrality of a node is defined as the weighted sum of the centralities of all nodes that are connected to it by an edge, $A_{ij}$:

$$C_i = \frac{1}{\epsilon} \sum_{j} A_{ij} C_j$$

where $C$ is the eigenvector associated with the eigenvalue $\epsilon$ of $A$. Eigenvector centrality measures how influential a node is in a given network, because a node's score depends on the scores of the nodes it is connected to [25].

Clustering and modularity: One of the central objectives of SNA is the identification of communities that are formed during an event. "Clustering" and "modularity" are the two terms used in this context. Clustering is the propensity of two nodes with a common neighbour to be neighbours of each other, whereas modularity is the partitioning of a network into internally well-connected groups. The Newman-Girvan modularity is commonly used for community detection. This method detects communities by gradually removing edges from a given network [26], giving priority to the edges that are "between" communities [27]. Spectral clustering is another community detection method, which uses the eigenvalues of a symmetric matrix. A common requirement of modularity-based methods is that the connections within graph clusters should be dense. In this study, the modularity was calculated via the Louvain method. The Louvain method is a two-step iteration that uses a hierarchical algorithm to detect communities in a given network.
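As an aside on the three centrality measures defined above, the following is a minimal Python sketch on a small toy network. The edge list is hypothetical and is not the study's data; the function names are illustrative, and the study itself computed these metrics in Gephi rather than with this code. Edges are treated as undirected so that in and out edges both count, closeness is taken as the reciprocal of the mean shortest-path length, and eigenvector centrality is approximated by power iteration.

```python
from collections import deque

# Hypothetical toy edge list (source -> destination); not the study's data.
EDGES = [
    ("Dubai", "Kerala"),
    ("Dubai", "Maharashtra"),
    ("Dubai", "Karnataka"),
    ("UK", "Kerala"),
    ("UK", "Delhi"),
]


def build_adjacency(edges):
    """Return undirected neighbour sets, so in and out edges both count.

    Repeated edges between the same pair are collapsed in this sketch.
    """
    adj = {}
    for src, dst in edges:
        adj.setdefault(src, set()).add(dst)
        adj.setdefault(dst, set()).add(src)
    return adj


def degree_centrality(adj):
    """Number of distinct neighbours attached to each node."""
    return {node: len(nbrs) for node, nbrs in adj.items()}


def closeness_centrality(adj):
    """Reciprocal of the mean shortest-path length to all reachable nodes."""
    scores = {}
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search yields shortest path lengths
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        scores[start] = (len(dist) - 1) / total if total else 0.0
    return scores


def eigenvector_centrality(adj, iterations=200):
    """Power iteration on A + I, rescaled so the largest score is 1.

    Adding the identity keeps the iteration from oscillating on bipartite
    graphs and leaves the eigenvectors of A unchanged.
    """
    scores = {node: 1.0 for node in adj}
    for _ in range(iterations):
        updated = {n: scores[n] + sum(scores[m] for m in adj[n]) for n in adj}
        top = max(updated.values())
        scores = {n: value / top for n, value in updated.items()}
    return scores


adjacency = build_adjacency(EDGES)
degree = degree_centrality(adjacency)
closeness = closeness_centrality(adjacency)
eigenvector = eigenvector_centrality(adjacency)

for name in sorted(adjacency, key=degree.get, reverse=True):
    print(f"{name:12s} degree={degree[name]} "
          f"closeness={closeness[name]:.3f} eigenvector={eigenvector[name]:.3f}")
```

In this toy network the node with the most links also receives the highest eigenvector score, which mirrors the way the degree and eigenvector columns are read together in the results.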
Vertices are merged into communities so as to maximise a modularity score, which evaluates how much more densely connected the nodes within a community are than they would be in a random network [28]. This process repeats until the modularity reaches its maximum value [29].

Figure 1 depicts a social network formed by contacts from 10 countries (these contacts may be Indians or foreigners) and 24 Indian states, which mainly took part in the initial disease transmission through international travel. A real network typically comprises nodes (units); in our case, countries were the primary nodes and represented the one-directional flow of information to Indian states. Figure 1 also shows small nodes in which patient numbers are marked; each edge represents the travel details of an infected patient from one of the countries to an Indian state. Figure 1 demonstrates how people travelled from all over the globe to different Indian states and how they formed a large number of clusters. In this network, the term "modularity" is used to measure the number of clusters. To better explain these clusters and how they were counted in a given network, Figure 1(a) shows a magnified view of Figure 1. It shows that Dubai has modularity class 3, which means there are three densely placed clusters, whereas the rest of the nodes are randomly connected. Similarly, the UK has modularity class 7, which shows it formed 7 main clusters (numbers are marked in Figure 1(a)).

A few of the listed countries account for most of the people who travelled to India. Figure 2 shows the number of international travellers to Indian states. The largest numbers of international contacts from different countries were established in Kerala. For example, 90 people travelled from Dubai to Kerala, and 24 travelled from the UAE to Kerala. State-wise data are provided in Figure 2.
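The modularity score that the Louvain method maximises, described earlier in this passage, can be made concrete with a short sketch. The code below is not the Louvain algorithm itself and is not Gephi's implementation; it only evaluates the standard Newman modularity Q of a given partition for an undirected, unweighted graph, on a hypothetical toy graph of two triangles joined by one edge. The function and variable names are illustrative assumptions.

```python
def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    Q sums, over communities, the fraction of edges that fall inside the
    community minus the fraction expected if edges were placed at random
    while keeping node degrees fixed.
    """
    m = len(edges)
    internal = {}    # edges with both endpoints in the same community
    degree_sum = {}  # total degree of the nodes in each community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical toy graph: two triangles (a, b, c) and (d, e, f) joined by c-d.
TOY_EDGES = [
    ("a", "b"), ("b", "c"), ("c", "a"),
    ("d", "e"), ("e", "f"), ("f", "d"),
    ("c", "d"),
]

two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_split = modularity(TOY_EDGES, two_groups)
q_single = modularity(TOY_EDGES, one_group)
print(f"two communities: Q = {q_split:.4f}")
print(f"one community:   Q = {q_single:.4f}")
```

Splitting the toy graph into its two triangles gives a higher Q than lumping all nodes together, which is the kind of comparison a modularity-maximising method makes when deciding whether to merge vertices.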
Figure 2 demonstrates that most people travelled from Dubai and the UK to Indian states. For example, people from the UK travelled to 18 different states, whereas people from Dubai travelled to 15 different states. Table 1 summarises the metrics that quantify the connections in Figure 1. It shows that Dubai and the UK have degrees of 144 and 69, respectively, which indicates that the largest numbers of links to various states were established from these two countries. Table 1 shows that Dubai and the UK had the highest closeness centrality (closeness to all of the other nodes), which means these two countries were the main diffusion points, as they were connected with the maximum number of nodes. Table 1 indicates that the UK had a higher modularity class (the number of clusters is 7) than Dubai, which means that those who travelled from the UK formed a larger number of groups across Indian states. Dubai had the highest eigenvector centrality (0.84), which means it was the most influential node. Therefore, we conclude that the UK and Dubai were the main sources of COVID-19 importations into India.

Further, it has been reported that gatherings at places of worship represent a high risk for disease transmission to potentially large numbers of people from a single case. These gatherings often involve dense mixing of many people in a confined space, sometimes over significant periods [30] to this gathering. Figure 3 shows a social network that demonstrates the spread of COVID-19 across Indian states caused by this religious conference. Table 2 summarises the metrics that quantify the connections in Figure 3. The largest numbers of cases linked to the DRC were traced to Tamil Nadu (degree 385) and Delhi (degree 301), followed by Andhra Pradesh (degree 138), Assam (degree 24), and Uttar Pradesh (degree 11). However, although the degrees were very high in these states due to the DRC, they formed fewer clusters outside of their communities.
For example, Andhra Pradesh's modularity class is one, Delhi's is two, and Tamil Nadu's is three, which shows that the transmission due to the DRC remained confined to a few states. Table 3 shows that although other states had lower degrees (Gujarat, degree 74, and Rajasthan, degree 32), they formed a larger number of clusters (Gujarat, 7, and Rajasthan, 8). Table 3 demonstrates that, in these states, the number of people who returned after attending the DRC was smaller than the number of local connections. From the data, contacts of the first positive cases were located mainly in Gujarat, Kerala, Jammu, and Rajasthan. Infected persons who travelled either locally within the state or interstate came into contact with other people, which is how the virus spread. The first positive cases are marked in Figure 3. A recent study reported that the risk of COVID-19 outbreaks in India increased because of domestic flights. The local spread of COVID-19 was also caused by railway travel, as India has 10 times more train travellers than air travellers. This might have increased the risk of the virus spreading across states [32].

To better illustrate this local and interstate transmission, we magnified part of Figure 3 in Figure 4. When the spreader and their travel history can be traced, the spread is called local transmission. In community transmission, however, it is not possible to detect the origin of the infection. Figure 4 shows the high local transmission in Gujarat (degree 74 = DRC 6 + local 59 + interstate 9). The first person who tested positive in Ramganj, a city in Rajasthan, was located and formed a cluster within the state (degree 32 = DRC 7 + local 13 + interstate 12). The connections from Rajasthan, the DRC, and Karnataka demonstrated a multi-layered social network structure (including local, DRC, and interstate connections). Figure 4 also shows the connections from Maharashtra, Karnataka, Jammu and Kashmir, and Kerala.
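The degree breakdowns quoted above for Gujarat and Rajasthan can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the text for Figure 4; the dictionary layout and function names are illustrative assumptions rather than the authors' data format.

```python
# Connection counts by type, as quoted in the text for Figure 4.
CONNECTIONS = {
    "Gujarat": {"DRC": 6, "local": 59, "interstate": 9},
    "Rajasthan": {"DRC": 7, "local": 13, "interstate": 12},
}


def total_degree(parts):
    """Degree of a state node as the sum of its connection types."""
    return sum(parts.values())


def share(parts, kind):
    """Fraction of a state's connections that are of the given type."""
    return parts[kind] / total_degree(parts)


for state, parts in CONNECTIONS.items():
    print(f"{state}: degree {total_degree(parts)}, "
          f"local share {share(parts, 'local'):.1%}, "
          f"DRC share {share(parts, 'DRC'):.1%}")
```

The totals reproduce the stated degrees of 74 and 32, and the shares make explicit that DRC-linked connections were a minority of the links in both states.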
The modularity classes of Maharashtra, Karnataka, Jammu and Kashmir, and Kerala are 6, 8, 6, and 5, respectively, which means these states formed a large number of clusters, although the number of connections from the DRC was low, as shown in Table 3.

In summary, four distinct stages of COVID-19 have been identified to date [33]. Stage 1 is when the disease is imported from affected countries and has not yet spread locally. Stage 2 is the phase of local transmission, which includes people with a travel history to other already affected countries. Stage 3 is the phase of community transmission, where the source of the disease is untraceable and the infected individual cannot be isolated. Once the population enters Stage 3, individuals contract infections randomly, and it becomes difficult to track the disease. Hence, we concluded that the transmission of COVID-19 in India remained at the local level (Stage 2) as of April 6, 2020. Even with the outbreak caused by the DRC, it did not develop into community transmission (Stage 3) because of the timely isolation of infected cases in Delhi.

Over the past few decades, world connectivity via air travel has increased markedly. Increased air travel has facilitated the geographical spread of infectious diseases. Travel from origins with a high infection rate to destinations with a lower infection rate has a huge impact on disease transmission. Poor testing, poor contact tracing, and fragile health care facilities in some countries will most likely lead to increased global transmission [34].

A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster
Pneumonia of unknown aetiology in Wuhan, China: potential for international spread via commercial air travel
World Health Organization. Novel Coronavirus-China Disease Outbreak News
India coronavirus: Modi announces 21-day nationwide lockdown, limiting movement of 1
Global travel patterns: an overview
Why does India have so few Covid-19 cases and deaths?
India Today. 5 states worst hit by coronavirus in India | A quick look at the numbers. 2020
Zika in Angola and India
Zika in travellers 1947-2017: a systematic review
Ebola virus outbreak in North Kivu and Ituri provinces, Democratic Republic of Congo, and the potential for further transmission through commercial air travel
Measles: a re-emerging problem in migrants and travellers
The reproductive number of COVID-19 is higher compared to SARS coronavirus
Centrality measures for disease transmission networks
Topological dynamics of the 2015 South Korea MERS-CoV spread-on-contact networks
Using social network measures in wildlife disease ecology, epidemiology, and management
Uncovering the overlapping community structure of complex networks in nature and society
Community detection in graphs
Model-based clustering for social networks
The dynamic nature of contact networks in infectious disease epidemiology
Collective dynamics of small-world networks
Complexity: The bigger picture
Community structure in social and biological networks
DyCoNet: A Gephi Plugin for Community Detection in Dynamic Complex Networks
Uncovering space independent communities in spatial networks
An efficient parallel algorithm for computing the closeness centrality in social networks
Eigenvector centrality for characterization of protein allosteric pathways
Modularity and community structure in networks
Multi-Level Algorithms for Modularity Clustering
Scalable community detection with the Louvain algorithm
Parallel Heuristics for Scalable Community Detection
COVID-19 pandemic: it is time to temporarily close places of worship and to suspend religious gatherings
A single mass gathering resulted in massive transmission of COVID-19 infections in Malaysia with further international spread
Estimating COVID-19 outbreak risk through air travel
COVID-19: The 4 Stages Of Disease Transmission Explained 2020