key: cord-0887247-uug9k5p2
authors: Hâncean, Marian-Gabriel; Slavinec, Mitja; Perc, Matjaž
title: The impact of human mobility networks on the global spread of COVID-19
date: 2021-03-07
journal: J Complex Netw
DOI: 10.1093/comnet/cnaa041
sha: e51a8cb8630de0c1e0af3c315742935839e5c4df
doc_id: 887247
cord_uid: uug9k5p2

Human mobility networks are crucial for better understanding and controlling the spread of epidemics. Here, we study the impact of human mobility networks on the COVID-19 onset in 203 different countries. We use exponential random graph models to perform an analysis of the country-to-country global spread of COVID-19. We find that most countries had similar levels of virus spreading, with only a few acting as the main global transmitters. Our evidence suggests that migration and tourism inflows increase the probability of COVID-19 case importations while controlling for contiguity, continent co-location and sharing a language. Moreover, we find that air flights were the dominant mode of transportation while male and returning travellers were the main carriers. In conclusion, a mix of mobility and geography factors predicts the COVID-19 global transmission from one country to another. These findings have implications for non-pharmaceutical public health interventions and the management of transborder human circulation.

The ongoing pandemic of coronavirus disease (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), originated in Wuhan, China. SARS-CoV-2, a coronavirus that infects humans, had spread to 216 countries and territories as of September 2020 [1]. COVID-19 has proved to be a highly contagious pathogenic viral infection, globally amassing more than 30 million confirmed cases and giving rise to approximately 1 million deaths [1] since its detection in December 2019.
Subsequent non-pharmaceutical interventions (NPIs) have been implemented by authorities both in China and worldwide to decrease the number of new daily infections and deaths, as well as to control the demand for healthcare and avoid overwhelming national medical systems. Essentially, many of these interventions have consistently restricted human mobility. Either with the purpose to reduce local transmissions, while for 73 (22.6%) patients there is no available evidence on the sex variable.

Within 3 months, COVID-19 spread to almost all the countries and territories of the globe. Specifically, approximately 67% of all first cases were officially announced during March 2020. Also, 95% of the countries and territories publicly confirmed their first cases between 1 January 2020 and 31 March 2020: 12% in January, 18% in February and 65% in March. Put differently, in 3 months, COVID-19 spread to 208 out of the 219 documented countries and territories.

Figure 1 displays the distribution of the first individual COVID-19 cases by type of traveller and by mode of transportation (where applicable). As shown in Fig. 1(a and c), of all the 323 documented cases, 64% (n = 207) are returning travellers, 24% (n = 77) visitors, 5% (n = 15) local residents and 7% (n = 24) are missing data (i.e. data not available in the country-level official reports). In terms of transportation modes (Fig. 1(b and d)), airplanes were used by 71% (n = 229) of the individual cases, waterborne transport by 4% (n = 12), land transport (bus or car) by 3% (n = 10) and railways by 1% (n = 3). For 21% (n = 67) of the patients, the means of transportation was not public information (i.e. data not available). For 3% (n = 9) of the cases, patients reported not having a recent travel history during the past 20 days.
It is noteworthy that the percentages sum to more than 100% because, in some cases (n = 6), multiple means of transportation were used, for example, railways and bus, airplane and bus or car, maritime and airplane.

Fig. 2. Centrality layouts for visualizing the patterns of (a) COVID-19, (b) incoming migration and (c) inbound tourism ties among countries. Node size is proportional to out-degree centrality. Countries located more centrally in the pictures have higher centrality scores. Core-countries are marked with red. In (a), ties among core-countries are marked in red.

Using out-degree centrality layouts, we illustrate three binary directed graphs: the COVID-19 network (203 nodes, 213 ties, density (D) = 0.005, out-degree centralization (out-deg C) = 0.249, in-degree centralization (in-deg C) = 0.015) (Fig. 2a), the incoming migration network (203 nodes, 571 ties) (Fig. 2b) and the inbound tourism network (203 nodes, 519 ties, D = 0.013, out-deg C = 0.002, in-deg C = 0.321) (Fig. 2c). In Fig. 2, variations in country-level out-degree centrality scores (computed as percentages) are marked by both layout location (states with higher out-degrees are more central) and node size (the larger the node, the higher the centrality score). The centralization of the incoming migration network (both in-degree and out-degree) is influenced by the restrictions on tie elicitation (only the top three migration source countries were taken into account in the analysis). Fitting a core/periphery model [37, 38], we found that China (CHN), the United States of America (USA), France (FRA), Italy (ITA), Great Britain (GRB) and Iran (IRN) represent the core of the COVID-19 network. The core/periphery model has a fit (correlation) of 0.2, for an expected density of core/periphery ties and periphery/core ties of 0.5 (at five random starts and 100 iterations). In Fig. 2, we mark the six core-countries with red in all three networks, for rapid identification of their centrality placement.
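The density values quoted above can be checked directly from the node and tie counts. The following is a minimal sketch, assuming the usual definition of density for a binary directed graph without self-loops (ties divided by n(n − 1)) and out-degree centrality expressed as a percentage of the n − 1 possible targets. The function names and the toy edge list are illustrative only and do not come from the article's replication code.

```python
def directed_density(n_nodes: int, n_ties: int) -> float:
    """Density of a binary directed graph without self-loops."""
    return n_ties / (n_nodes * (n_nodes - 1))


def out_degree_centrality(nodes, edges):
    """Out-degree of each node as a percentage of the n - 1 possible targets."""
    out_degree = {node: 0 for node in nodes}
    for source, _target in set(edges):  # set() ignores repeated ties
        out_degree[source] += 1
    return {node: 100.0 * d / (len(nodes) - 1) for node, d in out_degree.items()}


# Node and tie counts as reported in the text.
covid_density = directed_density(203, 213)
tourism_density = directed_density(203, 519)
print(round(covid_density, 3), round(tourism_density, 3))

# Hypothetical three-country example (not real data).
toy_scores = out_degree_centrality(["A", "B", "C"], [("A", "B"), ("A", "C"), ("B", "C")])
print(toy_scores)
```

Under these assumptions the reported counts reproduce the densities of 0.005 and 0.013 given in the text.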
All three networks fit power-law distributions [39]. In particular, this is indicated by Kolmogorov-Smirnov (KS) tests for the COVID-19 network (KS = 0.07, p = 0.99), the incoming migration network (KS = 0.05, p = 0.99) and the inbound tourism network (KS = 0.04, p = 0.99). Table 1 reports the quadratic assignment procedure (QAP) Pearson's correlation coefficients computed for the square adjacency matrices previously illustrated as networks in Figs. 2 and 3 [38]; the correlations were computed based on 5000 permutations. All coefficients are positive and statistically significant. The pair-wise correlation measurements indicate rather weak relationships between the network variables.

Table 2 presents the most important results of our article: the estimates for the structural and relational attribute effects included in the ERGMs (see Section 4 for the definition of the effects). These models differ in the following way. Model 1 examines purely structural factors (edges, activity spread, indegree of 2, 1 and 0, outdegree of 2, 1 and 0, twopaths, and isolates). Model 2 examines the incoming migration and inbound tourism effects (the predictors or the relational attribute effects of interest), controlling for structural effects. Model 3 (the full model) examines the predictors of interest, while controlling both for the structural effects and for the other three relational attribute effects (contiguity, common language and same continent). We notice, in Table 2, that incoming migration and inbound tourism have a statistically significant positive impact (p = 0.0001) on the process of COVID-19 tie formation (Models 2 and 3). Specifically, COVID-19 case importations are more likely from states that are important sources of migration and tourism.
Model 2 indicates that the odds of receiving a COVID-19 tie are 3.9 times higher (exp(1.36); Est = 1.36, p = 0.0001, SE = 0.22) from a country that is a top source of migration, and 2.5 times higher (Est = 0.93, p = 0.0001, SE = 0.18) from a country that is a top source of tourism. The probabilities of case importations from top sending migration and tourism countries are 0.80 (exp(1.36)/(1 + exp(1.36))) and 0.72, respectively. Model 3 shows that the effects of incoming migration and inbound tourism remain positive and statistically significant (p = 0.0001) while controlling for geographical (contiguity and same continent) and cultural (common language) proximities. Precisely, the corresponding probabilities of a COVID-19 case importation are 0.73 (Est = 0.97, p = 0.0001, SE = 0.23) and 0.68 (Est = 0.77, p = 0.0001, SE = 0.16). We also observe that geographical (contiguity and same continent) and cultural (common language) proximities are statistically significant predictors (p = 0.0057, p = 0.0220 and p = 0.0001, respectively) when accounting for COVID-19 ties between countries. Importing COVID-19 cases from neighbouring countries (contiguity) has a probability of 0.68 (Est = 0.75, p = 0.0057, SE = 0.27), while COVID-19 contagion on the same continent has a probability of 0.59 (Est = 0.37, p = 0.0220, SE = 0.16). Common language multiplies the odds of receiving a COVID-19 tie by 2.1 (Est = 0.73, p = 0.0001, SE = 0.14); the corresponding tie probability is 0.67. Referring to the structural effects, the tendency of forming ties in the network (edges) has statistically significant negative estimates in Models 2 and 3 (Est = −2.09, p = 0.0398, SE = 1.02, and Est = −2.36, p = 0.0159, SE = 0.98, respectively). This indicates that the number of ties in the observed network is lower than expected by chance.
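The odds and probabilities reported above follow from the estimates through the transformations quoted in the text, exp(Est) and exp(Est)/(1 + exp(Est)). The sketch below simply reapplies those two formulas to the reported estimates. The function names are illustrative, and the probability is computed for each coefficient on its own, as in the text, without adding the edges term or any other effect.

```python
import math


def odds_ratio(estimate: float) -> float:
    """Multiplicative change in the odds of a tie for a one-unit change in the effect."""
    return math.exp(estimate)


def tie_probability(estimate: float) -> float:
    """Logistic transform of a single estimate: exp(est) / (1 + exp(est))."""
    return math.exp(estimate) / (1.0 + math.exp(estimate))


# Estimates as reported in the text (Models 2 and 3).
reported = {
    "Model 2, incoming migration": 1.36,
    "Model 2, inbound tourism": 0.93,
    "Model 3, incoming migration": 0.97,
    "Model 3, inbound tourism": 0.77,
    "Model 3, contiguity": 0.75,
    "Model 3, same continent": 0.37,
    "Model 3, common language": 0.73,
}

for label, estimate in reported.items():
    print(f"{label}: odds x{odds_ratio(estimate):.1f}, "
          f"probability {tie_probability(estimate):.2f}")
```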
Activity spread is a statistically significant negative effect in all three models (Est = −4.84, p = 0.0001, SE = 0.38; Est = −3.96, p = 0.0001, SE = 0.39; and Est = −4.30, p = 0.0001, SE = 0.40, respectively). This suggests that the number of network structures with active nodes is lower than expected by chance alone. Put differently, most countries have similar levels of activity in the network (i.e. sending COVID-19 ties). The goodness of fit (GOF) statistics suggest that our models reproduce and fit the data well. Specifically, GOF statistics were less than 2.0 for features not explicitly modelled [40, 41]. Moreover, as shown in Table 2, the estimates do not exceed ±5, while standard errors (S.E.) are no larger than 10 [42].

We start this section by summarizing our main findings. Analysing data on COVID-19 importations in 203 countries, we show that the first cases were preponderantly males and returning travellers, and that the most frequent mode of transportation was air travel. We also illustrate that the early country-to-country COVID-19 transmission network fits a power-law distribution and has six main global transmitters: China, France, Great Britain, Iran, Italy and the USA. Our statistical models indicate that incoming migration and inbound tourism are statistically significant positive predictors (p = 0.0001) for COVID-19 case importations. Specifically, a given country is more likely to receive COVID-19 cases from states that are among its top three sources of international migration and tourism. These predictors remain statistically significant (p = 0.0001) after controlling for contiguity, sharing the same continent and having a common language (which provides empirical support for our hypotheses). Our work is in accord with simulation [29] and observational [28, 30] studies showcasing the risk of COVID-19 outbreak through air travel.
Also, we are in good agreement with previous papers demonstrating that the COVID-19 global [43] or domestic prevalence follows a power-law distribution [44]. Furthermore, we found that the first COVID-19 cases across the globe were mostly males. Contrary to this specific finding, studies on the COVID-19 prevalence in various countries [45] argue that males and females are equally likely to be diagnosed with COVID-19. During the last decades, international tourism has increased at an unprecedented level [46] and the global migration network (i.e. countries connected by migration flows) has evolved into a small-world type of structure, displaying a constant decrease in the average path length [47]. Put differently, the global circulation of people is now more intense than ever, while distances have apparently melted. This entails multiple unquestionable social and economic benefits. However, at the same time, it seems to expose various vulnerabilities, among which we can include the speed at which biological viruses are transmitted. In this context, some studies have implied that the COVID-19 diffusion may be different from other pandemics given migrants' recent possibilities of travelling over long distances [48]. The role of migration corridors (origin and destination places socially connected by migrants) has also been emphasized as a key factor, in addition to the COVID-19 contiguity diffusion [27]. Moreover, it has been claimed that international tourists have an even higher risk of spreading COVID-19 due to their unconstrained and infrequent movement patterns [48]. Evidence suggests that the movement of tourists should not be disregarded as a factor in COVID-19 diffusion, while an abundant literature has so far documented case importations across the world [18–26]. Our study confirms the already reported evidence underlining the role of human mobility networks in the spread of biological viruses, in general.
Moreover, we lend support to previous literature arguing for the impact of migration and tourism on COVID-19, in particular. Despite the preventive measures and NPIs implemented by governments worldwide or the efforts of providing effective medication [49], we show that, within a 3-month interval, COVID-19 spread to 95% of the countries and territories that officially confirmed cases. A global diffusion of this magnitude may indicate that national states lacked efficient coordination in controlling the country-to-country virus spread. A core group of countries seems to have involuntarily acted as hubs in the global virus transmission, confirming one of the well-documented features of complex networks, preferential attachment [50]. The out-degree centrality of these countries in other global structures, such as migration and tourism networks, may have facilitated their acting as main global transmitters. In line with our research objectives stated in Section 1, we described how COVID-19 spread from one country to another and demonstrated the statistically significant effect of migration and tourism inflows (p = 0.0001). One possible implication refers to the utility that models predicting population movements [13, 51] have in preparing responses to potential future pandemics, or even to possible next waves of COVID-19. Modelling global human mobility may be an important input in increasing the international coordination of national authorities for handling cross-border circulation. Additionally, we believe our study may be a valuable contribution to developing global strategies for the early prevention of pandemics. For instance, coordination of information exchange between neighbouring countries as well as between states that share cultural and migration corridors may be of vital importance for case importation detection.
National early warning systems should be internationally synchronized and integrate customized real-time information about global mobility flows as well as about countries that act as global transmitters. Furthermore, international coordination of information streams on people's movement patterns may be a driver for increasing the consistency of governmental preventive measures. It may be the case that the existing uncoordinated variations in the responses of domestic authorities to COVID-19 have facilitated the global virus spread. Lastly, our work may be particularly useful for researchers aiming to compare the non-medical spreading dynamics of COVID-19 with those of other biological viruses, such as severe acute respiratory syndrome (SARS) [48]. On top of that, we hope our results may be beneficial for policymakers and national authorities in their efforts to manage human mobility across borders, economic difficulties and migration-related issues. Rapid action is needed as the COVID-19 pandemic has heavily hit the tourism industry and the air travel sector. Perhaps even more importantly, the global spread of COVID-19 has consistently had a negative impact on migration, exacerbating social vulnerabilities and polarities, increasing stigmatization and exclusion and aggravating migrants' health vulnerabilities.

We are aware that our study may have at least two limitations. The first refers to the official country-level identification of the first COVID-19 cases. Our raw data come from national authorities' public announcements. Theoretically, that does not eliminate the possibility of COVID-19 existing in a country prior to being officially reported. The second concerns the network variables included in our statistical modelling as control variables. The measurement of geographical and cultural proximities was limited to contiguity, co-location on the same continent and language sharing.
This limit is due to the fact that, generally, similar proxy variables at a global level are exceptionally scarce. Despite these limitations, it is noteworthy that the estimates of our statistical models hold for any global COVID-19 network that displays similar structural features. To the best of our knowledge, this is the first study to document the global spread of the COVID-19 outbreak and to assess the effects of human mobility networks. The evidence from this study shows that the global spread of the COVID-19 outbreak was not random but patterned by incoming migration and inbound tourism, with geographical and cultural proximities being controlled for. Further research on this topic is needed to increase our understanding of how today's global human mobility routes affect virus spreading.

In this observational study, we collected empirical data on the global spread of the COVID-19 outbreak at a country level. Before starting the data collection process, we used the information from the World Health Organization [1] to identify the countries and territories with confirmed cases of COVID-19. Afterwards, for each state and territory in the list, we searched for available data on the first officially reported cases. We amassed a tally of 323 cases (patients), for which we identified related attribute information on age, sex, travel mode, confirmation date, type of traveller (e.g. visitor, returning traveller, etc.) and travel information (target and source countries). These country-level first cases of COVID-19 were documented by examining the evidence publicly shared by the national authorities in 219 countries and territories. For some states, the authorities announced more than one person as 'first case of COVID-19'. The available information on the travelling history of the first cases was used to build an origin-destination country matrix. This matrix was subsequently used to construct the COVID-19 global spreading nexus, that is, a binary directed graph.
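As a rough illustration of this step, the sketch below turns a list of first-case travel records into a binary origin-destination matrix, where a record with origin A and destination B yields a single tie from A to B no matter how many patients share that route. The record format, the country labels and the function name are hypothetical; the article's own data files and code are those referenced in [58].

```python
def build_binary_digraph(records):
    """Build a binary origin-destination adjacency matrix.

    records: iterable of (origin, destination) pairs, one per first case;
    origin is None when no travel history is available.
    Returns the sorted country labels and a matrix with matrix[i][j] == 1
    when at least one first case in country j travelled from country i.
    """
    countries = sorted({c for pair in records for c in pair if c is not None})
    index = {country: i for i, country in enumerate(countries)}
    size = len(countries)
    matrix = [[0] * size for _ in range(size)]
    for origin, destination in records:
        if origin is None or origin == destination:
            continue  # no known transmission route: the country may stay an isolate
        matrix[index[origin]][index[destination]] = 1
    return countries, matrix


# Hypothetical records (letters instead of real countries, not real data).
records = [("A", "B"), ("A", "C"), ("D", "B"), ("D", "B"), (None, "H")]
countries, matrix = build_binary_digraph(records)
for country, row in zip(countries, matrix):
    print(country, row)
```

In this toy example the repeated route from D to B still produces one tie, and H ends up as an isolate because its only record has no origin.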
Specifically, in this network, an arrow from country A to country B illustrates that a patient k, returning traveller or visitor from A, was officially announced as the first COVID-19 case in country B (Fig. 4a). Moreover, the isolates in the network (for instance, countries H and G in Fig. 4a) are states for which information on the transmission route is not available due to various reasons (e.g. the patient did not report having a travel history during the last 15 days, national authorities did not provide travelling history information, etc.). According to our research design, the first cases of COVID-19 in a specific country are the patients who were officially identified 'as first COVID-19 cases' by domestic authorities. In addition to the COVID-19 (country-level spreading) network, we built five supplementary networks: the (incoming) migration network, the (inbound) tourism network, the contiguity network, the same continent network and the common language network. Subsequently, we performed a list-wise deletion on the six network variables and kept for the analysis 203 (93%) out of the 219 documented countries and territories. Triads similar to the one displayed in Fig. 4b were used to build the (incoming/inflow) migration network and the (inbound) tourism network. In particular, the top three migration sending countries and the top three tourism sending countries were identified for each of the states included in the COVID-19 network. We used the most recent available data on global migration flows (2017), from the World Bank [52], and on global tourism flows (2016), from the World Tourism Organization [53]. In the contiguity network, two countries are connected by an edge if they share a border. In the common language network, two states are linked if at least 20% of the population in one country speak the language of the other country.
In the same continent network, two states are joined together if located on the same continent (we employed the geo-scheme advanced by the United Nations [54]). We used undirected dyads, similar to the one exhibited in Fig. 4c, to build these networks. The information on the geographical variables is publicly available from the Centre d'Études Prospectives et d'Informations Internationales [55].

We used a suite of software tools for network descriptive statistics and visualization. Precisely, we ran the core/periphery algorithm available with UCINET 6.0 [37, 38] for detecting the core-countries in the COVID-19 network. Also, we performed the Kolmogorov-Smirnov test for power-law distribution fit, using the algorithm available with the NetIndices R-package [56]. For network visualizations, we employed the circular and treemap node layouts available in visone [57]. The dataset files used in this article as well as the code are available for replication [58].

Initially, we performed quadratic assignment procedure (QAP) correlations (with 5000 permutations) to infer association among network variables, at a dyadic level. QAP correlations were useful to identify multi-collinearity, but not to explain patterns of ties. Consequently, we modelled the COVID-19 network using exponential random graph models (ERGMs) [31]. Generally, ERGMs are statistical models for understanding how and why network ties arise in a specific network [41]. In this article, we were interested in examining the generative processes giving rise to the structures or tie patterns in the observed network of COVID-19 countries. In our ERGMs, we estimated two classes of predictors: structural effects and relational attribute effects. The coefficients estimated for each predictor define the probability of each edge in the COVID-19 network and the probability of the entire network [59, 60]. The structural effects refer to specific tie patterns or link configurations within the network [59, 60].
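For readers unfamiliar with the QAP correlations mentioned earlier in this section, the following is a minimal, dependency-free sketch of the general idea: the Pearson correlation between the off-diagonal cells of two adjacency matrices is compared with the correlations obtained after relabelling the nodes of one matrix at random (rows and columns permuted together). It is not the UCINET routine used in the article; the function names, the one-sided p-value and the toy matrices are assumptions made for illustration.

```python
import random


def off_diagonal(matrix):
    """Flatten a square matrix, skipping the diagonal (self-ties)."""
    size = len(matrix)
    return [matrix[i][j] for i in range(size) for j in range(size) if i != j]


def pearson(x, y):
    """Pearson correlation between two equally long, non-constant sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def qap_correlation(a, b, n_permutations=5000, seed=0):
    """Observed correlation and a one-sided permutation p-value (assumed convention)."""
    rng = random.Random(seed)
    size = len(a)
    flat_a = off_diagonal(a)
    observed = pearson(flat_a, off_diagonal(b))
    order = list(range(size))
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(order)
        # Relabel nodes of b: the same permutation is applied to rows and columns.
        permuted = [[b[order[i]][order[j]] for j in range(size)] for i in range(size)]
        if pearson(flat_a, off_diagonal(permuted)) >= observed:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical 4-node directed networks (not real data).
net_a = [[0, 1, 1, 0], [0, 0, 1, 0], [0, 0, 0, 0], [1, 0, 0, 0]]
net_b = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]
print(qap_correlation(net_a, net_b, n_permutations=500))
```

Permuting rows and columns jointly keeps the dependence structure within each network intact, which is why the procedure is preferred over an ordinary significance test on dyads.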
In Fig. 5, we visually display the structural effects introduced in our ERGMs. These images are built on similar visualizations available in the literature [41]. Edges (arcs) refer to the overall tendency for COVID-19 ties to form in the network. Generally, ERGMs include this term by default. The activity spread term controls for the COVID-19 network's tendency to have active nodes (countries that send many arrows). The out-degree (0) or sinks parameter marks nodes with zero out-degree but positive in-degree. The in-degree (0) or sources parameter marks nodes with zero in-degree but positive out-degree. The isolates term defines nodes with both in-degree and out-degree equal to zero [59, 60]. Sinks, sources and isolates are useful model parameters for controlling degree distributions in directed networks [40]. Out-degree (1), (2) and (3) are related to the activity spread, while in-degree (0), (1) and (2) are related to popularity spread, as resulting from the network self-organization. In- and out-degree effects are indicative of the tendencies for centralization in the in- and out-degree distributions [41]. The twopath parameter accounts for the situations wherein actors who receive ties are also sending ties. Its inclusion in the model controls for the association between out-degree and in-degree.

The relational attribute effects in our ERGMs are predicted to affect the probability of a COVID-19 tie. The estimated effects were: ties with top three migration sending countries (incoming migration), ties with top three tourism sending countries (inbound tourism), ties with contiguous countries (contiguity), ties with same continent countries (same continent) and ties with countries wherein the language is spoken by 20% of the population (common language 20%).

We performed a diagnostic goodness of fit examination using the following procedure. We simulated 100 networks based on our model estimates.
Then, we compared the simulated and the COVID-19 networks, examining the difference between the observed and mean scores on dyad-wise shared partners, edge-wise shared partners, degree, in-degree and geodesic distances [31, 40, 59–61]. In addition, we assessed whether the parameter estimates are in the ±10 intervals and whether their standard errors do not exceed 5 [42].

Coronavirus disease (COVID-19). Weekly Epidemiological Update
The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak
Quantifying the association between domestic travel and the exportation of novel coronavirus (2019-nCoV) cases from Wuhan, China in 2020: a correlational analysis
Assessing the impact of reduced travel on exportation dynamics of novel coronavirus infection (COVID-19)
An investigation of transmission control measures during the first 50 days of the COVID-19 epidemic in China
Association between mobility patterns and COVID-19 transmission in the USA: a mathematical modelling study
Lockdown strategies, mobility patterns and COVID-19
The effect of human mobility and control measures on the COVID-19 epidemic in China
Effects of human mobility restrictions on the spread of COVID-19 in Shenzhen, China: a modelling study using mobile phone data
Modeling the worldwide spread of pandemic influenza: baseline case and containment interventions
Modelling disease outbreaks in realistic urban social networks
Forecast and control of epidemics in a globalized world
Understanding individual human mobility patterns
Preparedness and vulnerability of African countries against importations of COVID-19: a modelling study
The first two cases of 2019-nCoV in Italy: where they come from?
First known person-to-person transmission of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in the USA
Early spread of COVID-19 in Romania: imported cases from Italy and human-to-human transmission networks
Routes for COVID-19 importation in Brazil
Covid-19: Italy confirms 11 deaths as cases spread from north
Clinical features of the first cases and a cluster of Coronavirus Disease 2019 (COVID-19) in Bolivia imported from Italy and Spain
Novel coronavirus (2019-nCoV) early-stage importation risk to Europe
Transmission of 2019-nCoV infection from an asymptomatic contact in Germany
Estimation of COVID-19 outbreak size in Italy
A novel coronavirus outbreak of global health concern
Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study
Preliminary estimation of the novel coronavirus disease (COVID-19) cases in Iran: a modelling analysis based on overseas cases and air travel data
Coronavirus and migration: analysis of human mobility and the spread of Covid-19
Potential for global spread of a novel coronavirus from
Estimating COVID-19 outbreak risk through air travel
Impact of international travel and border control measures on the global spread of the novel 2019 coronavirus outbreak
ergm: a package to fit, simulate and diagnose exponential-family models for networks
The role of language in shaping international migration
International migration: a panel data analysis of the determinants of bilateral flows
Travel-style preferences for visiting a novel destination: a conjoint investigation across the novelty-familiarity continuum
Tourists' intention to visit a country: the impact of cultural distance
Modeling determinants of tourism demand in Colombia
Models of core/periphery structures
Ucinet 6 for Windows: Software for Social Network Analysis
Statistical analyses support power law distributions found in neuronal avalanches
Closure, connectivity and degree distributions: exponential random graph (p*) models for directed social networks
Exponential Random Graph Models for Social Networks: Theory, Methods, and Applications
From neighbors to school friends? How adolescents' place of residence relates to same-ethnic school friendships
Power-law distribution in the number of confirmed COVID-19 cases
Strong correlations between power-law growth of COVID-19 in four continents and the inefficiency of soft quarantine strategies
Are men more at risk of infection? Men, Sex, Gender and COVID-19
Global travel patterns: an overview
Effects of non-pharmaceutical interventions on COVID-19 cases, deaths, and demand for hospital services in the UK: a modelling study
Changes in population movement make COVID-19 spread differently from SARS
Topological analysis of SARS CoV-2 main protease
The Matthew effect in empirical data
COVID-19 and SARS-CoV-2. Modeling the present, looking at the future
Migration and remittances data
Yearbook of Tourism Statistics: Data
Standard country or area codes for statistical use (M49)
Notes on CEPII's Distances Measures: The GeoDist Database
Are network indices robust indicators of food web functioning? A Monte Carlo approach
Analysis and visualization of social networks
Data from: the impact of human mobility networks on the global spread of COVID-19
Specification of exponential-family random graph models: terms and computational aspects
A statnet tutorial
Advances in exponential random graph (p*) models applied to a large social network

We are grateful to Ms Laura Trandafir for figure formatting preparations.