key: cord-0753413-x13tcdcq authors: Zhang, Yan; Chen, Nengcheng; Du, Wenying; Yao, Shuang; Zheng, Xiang title: A New Geo-Propagation Model of Event Evolution Chain Based on Public Opinion and Epidemic Coupling date: 2020-12-10 journal: Int J Environ Res Public Health DOI: 10.3390/ijerph17249235 sha: 9aa181b96fe8d11c263bfc4dd40c7ed3b37f71a5 doc_id: 753413 cord_uid: x13tcdcq The online public opinion is the sum of public views, attitudes and emotions spread on major public health emergencies through the Internet, which maps out the scope of influence and the disaster situation of public health events in real space. Based on the multi-source data of COVID-19 in the context of a global pandemic, this paper analyzes the propagation rules of disasters in the coupling of the spatial dimension of geographic reality and the dimension of network public opinion, and constructs a new gravity model-complex network-based geographic propagation model of the evolution chain of typical public health events. The strength of the model is that it quantifies the extent of the impact of the epidemic area on the surrounding area and the spread of the epidemic, constructing an interaction between the geographical reality dimension and online public opinion dimension. The results show that: The heterogeneity in the direction of social media discussions before and after the “closure” of Wuhan is evident, with the center of gravity clearly shifting across the Yangtze River and the cyclical changing in public sentiment; the network model based on the evolutionary chain has a significant community structure in geographic space, divided into seven regions with a modularity of 0.793; there are multiple key infection trigger nodes in the network, with a spatially polycentric infection distribution. China adopts active prevention and control measures to effectively control the development of the Corona Virus Disease 2019 (COVID-19) pandemic [1] . On 22 January 2020, the number of diagnosed cases in Hubei province exceeded 100 in a single day and Hubei Province launched a level II public health emergency response. On 23 January 2020, at 10:00 a.m., Wuhan was under lockdown. Its public transportation such as subways, ferries, long-distance passenger were suspended, and the airport and railway stations were temporarily closed. On 24 January 2020, a total of 830 confirmed cases have been reported nationwide and Hubei Province launched the level I response to public health emergencies. Since then, medical teams all over the country have been sent to aid Hubei, and Wuhan has become the focus of people's concern. After 76 days of active anti-epidemic efforts, Wuhan announced on 8 April 2020 that the control of the exit channel from Wuhan was lifted (www.xinhuanet.com/mrdx/ 2020-04/08/c_138956757.htm), and the prevention and control situation changed positively for the better and achieved important stage results. In view of the fact that the basic reproductive number (R 0 ) of this pandemic event is higher than that of "SARS", it poses a challenge to the development of containment of the pandemic [2] . Based on the social geographic calculation perspective, this paper proposes a new approach to the study of this pandemic based on complex networks. A complex network is a large-scale network with a complex structure, a topological abstract expression of the real system and somewhere between random and regular networks [25] . Its nodes can be arbitrary basic units with specific information content. For traditional "neatly aligned" Euclidean Structure Data, they can be clearly represented as a grid type, where any point in the grid can easily find its neighbor nodes, which can be achieved with classical neural networks (CNN, RNN) . In real life, in addition to "regular" grid structure data, there are Non-Euclidean Structure Data such as social networks, compound structures, biogenetic proteins, and knowledge graphs. A point in Non-Euclidean structural data is difficult to define its neighbor nodes and the number of neighbor nodes is different for different nodes. In addition, complex networks have some concepts such as the small-world effect, degree distributions, clustering, network correlations, random graph models, models of network growth and preferential attachment, and dynamical processes taking place on networks [26] . Research on studying coupled disease-behavior dynamics in complex networks is growing rapidly [27] , and it has many applications in the study of the COVID-19 pandemic. Simulation results for complex network methods show that it is possible to cut off connections between symptomatically infected individuals and their neighbors, and cut off connections between hub nodes and their neighbors [28] . Some authors treat airports as network nodes and air traffic flows as weights of complex networks, but do not consider the spatial interactions between nodes [29] . At the same time, some scholars have used gravity models to predict infection [30] , suggesting that epidemics are spatially linked and exhibit an inverse square relationship [31] . We believe that it is appropriate to consider the influence of the nodes on the surroundings and to combine complex networks with gravitational models. We take the COVID-19 epidemic event as the background to further understand the magnitude and distribution of the spatial impact of this major public security event, and construct a spatial impact evaluation model of the epidemic based on the reciprocal relationship between network activities and the geographic environment, including users' network activities such as check-in information, concern information, interaction information, emotional information. This will help to quantify the spatially coupled impact of public opinion and disasters, detect epidemic hotspots, refine the geographic dissemination processes of major public health events, and establish the relationship between cyber-virtual space and geographic real space. The rest of the paper is organized as follows. We introduce a new model. We discuss event evolution in Section 3.1 and epidemic distribution in Section 3.2. We use the proposed model to perform experiments on two types of data. In Sections 3.3 and 3.4, network construction and structural analysis were performed, respectively. The study area is Wuhan, which is the largest national central city in Central China, and according to the 2018 Wuhan Statistical Yearbook (http://tjj.wuhan.gov.cn/), Wuhan has a total of 13 districts and a resident population of 1121.2 million, of which 7 districts are main urban areas. The coordinate system of all spatial data in this paper is the China Geodetic Coordinate System 2000 (CGCS2000). We plotted the geographic location of Wuhan in Figure 2 . There was a lot of discussion about the COVID-19 outbreak during the epidemic. According to the timeline, the growth rate of confirmed cases slowed down on 16 February 2020, the new confirmed cases "fell back" to three digits on 19 February 2020, and on 21 February 2020, 25,000 hospital beds in cubicles have been built, and the shortage of medical resources had been addressed. In the meantime, the State Council Information Office issued that the epidemic had changed positively and the control work had achieved good results (http://cpc.people.com.cn/n1/2020/0221/c431601-31597371.html). This article covers the period from 10 January to 17 February 2020, from the beginning of the epidemic's development to the beginning of the epidemic's subsidence, and collects all original Weibo related to the epidemic in the past month. Based on the Scrapy crawler framework and the Mongodb database, an advanced search script was developed for Sina Weibo, and keywords such as pneumonia, epidemic, and virus were set to increase the amount of retrieved data. After the final search and de-duplication, a total of 213,869 original Weibo data were retained. The location data contained in the social media data are usually in the form of address descriptions, which we need to geo-decode. Baidu's address parsing function (http://lbsyun.baidu.com/) is used to process the captured check-in address data and perform spatial location statistical inference and address matching. Due to the discrepancy between Baidu coordinates and the CGCS2000 coordinates used in this paper, which may be up to hundreds of meters, we used coordinate correction (from Baidu coordinate system to CGCS200) to solve this problem. In the end, there were about 30,000 check-in data, excluding points outside the study area and those located in a larger range, and about 10,000 Weibos (Just like the tweets on Twitter) containing locations were retained after filtering, with a total of 1033 check-in locations. The fever clinic data and community infection data used in this paper were compiled from the CDC, local health committees, government agencies, social media, and the subject network data. The road network and zoning data are from Open Street Map (OSM) (www.openstreetmap.org), all data until 17 February 2020. A common method of measuring the trend of a group of points or areas is to calculate the standard distance in the x and y directions, respectively. These two measurements can be used to define the axis of an ellipse that contains the distribution of all elements. The ellipse is called a directionally distributed ellipse because the standard deviation of the xand y-coordinates is calculated from the mean center to define the axis of the ellipse. Just as the directionality of an element can be felt by plotting it on a map, calculating standard deviation ellipses makes this tendency more explicit, to see if the distribution of elements has a particular direction. Directionally distributed ellipses are calculated as follows. where x and y are the coordinates of element i; n is the total number of elements. The sample covariance matrix is decomposed into standard form, allowing the matrix to be represented by an intrinsic value and an eigenvector. Thus, the standard deviation of the x and y axes is. The variance is measured by an adjustment factor to produce an ellipse containing the desired percentage of data points. When working with one-dimensional data, the 3σ criterion is a common rule of thumb which conveys that a certain percentage of the data values will fall within one, two, and three standard deviations of the mean. In a normal distribution, this means that 68%, 95%, and 99.7% of the data values will fall within one, two, and three standard deviations, respectively. However, when dealing with spatial data of higher dimensions (x and y), this percentage breakdown is rare. A more applicable rule of thumb, derived from the Rayleigh distribution, states that in a two-dimensional coordinate system (x and y), a one-standard deviation ellipse will cover approximately 63% of the elements; two standard deviations will cover approximately 98% of the elements; and three standard deviations will cover approximately 99.9% of the elements. The criterion for discriminating between spatial clustering and anomalies is based on the Anselin Local Moran Index [32] , which is used to identify local epidemic-opinion hotspots and anomalous regions. In the epidemic-public opinion space, the intensity of public opinion topics and node infection results vary and are correlated with each other in different spaces. A positive index indicates "high-high" or "low-low" clustering of spatial elements, i.e., high or low elements with the same attribute value are adjacent to each other and the element is part of a cluster. A negative value of the partial Moran index indicates "high-low" or "low-high" clustering of spatial elements. Neighboring elements of a spatial element are different values, i.e., the element is an outlier and is divided into a high value "high-low" aggregation surrounded by a low value and a low value "low-high" aggregation surrounded by a high value. The Local Moran Index is defined as follows. where: x i is the attribute of element i; X is the average value of the corresponding attribute of the element; w i,j represents the spatial weight between elements. In this paper, we will study the regional grids, assigning different grids different definitive data based on geographic location relationships, and determining the spatial weight matrix based on the spatial distances between the grids. We regard a residential community as a "cellular", and use community infection data to construct the interaction between cells according to the gravitational model. Afterwards, the interaction edges are filtered by a certain threshold value, and the epidemic community are used as the network nodes, and the adjacencies between the filtered nodes constitute the edges of the network, thus forming a more complex network structure. This network structure may even contain most of the information of a real public health event [33] . The connected edges of the network structure imply the changing trends of the epidemic event (e.g., the shift of the geographic focus of infection and discussion), and more valuable information can be extracted from this structure (geographic division of infected communities, evolutionary trends of opinions, and evolutionary trends of sentiments) [34, 35] . A complex network can be modeled by using a single diagram G = (V, E). V represents the set of all nodes of a complex network and E represents the set of all edges. The mathematical expression for V is as follows. where m is the determined number of nodes, N i,j represents the degree of connectivity from the i spatial node to the j spatial node, and the larger N i,j represents the stronger the connectivity between regions. A typical complex network is composed of several nodes and connected edges with different weights between them, and within the complex network consists of several communities, and the interaction between nodes within the same community can be more frequent than between different communities. In order to construct the edge connections of complex networks, this paper attempts to use the basic gravity model (also known as gravity model and field strength model), which constructs the network edges to represent the mutual induced interactions between the communities. The gravity model is one of the classical models of geography, based on the laws of geography "all things or phenomena are spatially related, but the connection between things or phenomena close to each other is generally closer than the connection between things or phenomena farther away" [36] . The basic idea can be described simply as: the gravitational force between the two places is proportional to the product of a certain scale quantity between the two places, while inversely proportional to the square of the distance between the two places. In different application scenarios, the scale of the two places often have different forms of performance. In the field of tourism [37] , the scale of two places usually represents the level of tourism supply in the destination and the level of tourism demand in the travel destination; in the spatial and trade connection of two places [38, 39] , the scale of two places is usually expressed in the form of GDP and total population of two places; in the study of interest recommendation [40] , the scale of two places is expressed as the number of sign-ups at the point of interest, etc. In this paper, the number of confirmed visits to the community is used as the scale quantity and use Equation (5) to calculate cell connectivity. Among them: parameter A i,j represents the degree of connectivity between community i and community j; parameter K is a constant, where it takes the value of 1; D represents the confirmed infection data of the cells; and d represents the spatial distance between cells. Community structure is an important topological feature of complex networks and the interaction between nodes within the same community will be more frequent than between different communities. In this paper, a modularity Louvain algorithm based on multilevel optimization is used to parse the community structure of complex networks, and it is adopted to detect some small areas with closely connected community or cluster structure, which has the advantages of speed and high efficiency [41] . Using epidemic community data in real space and social media data in virtual space, communities are divided according to two steps: modularity optimization and network aggregation. Modularity is an evaluation indicator that defines the degree of community structure and is mathematically defined as follows: where the degree of connectivity A i,j between communities represents the weight of the edge between the node i and the node j; k i delegates the degree of the node i (the number of bars at the end of the arc plus the number of bars at the beginning of the arc); m represents the total number of nodes in the complex network; and C i represents the community with node i. The δ function is 1 when otherwise it is 0. In the random case, the number of edges between node i and node j is , the closer the modularity to 1 represents the better community partitioning. The gravity model formula is substituted into the modularity formula to obtain a geographic propagation model of the evolutionary chain of typical public health events for the fused gravity model-complex network. The model treats each cell as an independent community with the same initial number of communities as the number of nodes. On this basis, all nodes are fused and coalesced based on the modularity gain until the local maximum of modularity is reached, i.e., no node can increase the network modularity, and the community structure is no longer changed. Finally, the community results are compressed by converting their internal node weights into new nodes and weights, and the weights of the original inter-community edges are converted into inter-node weights [42, 43] . The algorithm is illustrated in Figure 3 . In addition to the module degree, complex networks have the following more important evaluation metrics. Degree of a node: in a directed graph, the number of arc end bars of a node is the out-degree of a node, the number of arc head bars of a node is the in-degree of a node, and the degree o f a node = out − degree + in − degree. The degree of nodality indicates influence in the epidemic network, where the in-degree indicates the causative event that led to the node's infection and the out-degree indicates the infection event triggered by the node, the value of them is determined by the system network topology. The greater the node's out-degree, the more severe the consequences of the node on its neighbors; the greater the in-degree, the more pathways leading to the node, and the more difficult it is to control. Betweenness centrality is an indicator of a node's centrality in a network. It is equal to the number of shortest paths from all vertices to all others that pass through that node. A node with high betweenness centrality has a large influence on the transfer of items through the network, under the assumption that item transfer follows the shortest paths [44] . Closeness centrality (or closeness) of a node is a measure of centrality in a network, calculated as the reciprocal of the sum of the length of the shortest paths between the node and all other nodes in the graph. Thus, the more central a node is, the closer it is to all other nodes [45] . A higher PageRank means that the node is more likely to be accessed, and the number of links pointing to the node has a higher link weight [46] . Average shortest path length reflects the average distance between all nodes and the overall efficiency of the network. Average clustering coefficient reflects the average connection tightness of all nodes in the network. In complex network diagrams, a higher graph density represents a tighter network connection; a higher degree of modularity represents a more pronounced community structure; a smaller network diameter represents better accessibility between points. Generally speaking, the greater the degree of outreach of a node indicates that the event has a more widespread impact in the evolutionary chain network of public health events, the more severe the consequences of the event, and is the central node of the evolutionary network of public health events; the greater the degree of inreach of a node, the greater the number of paths of the event, the greater the difficulty of control, and is a key node in the evolutionary development of public health events. The node with the largest betweenness centrality has the strongest connection to the subsequent induced public health event. As shown in Figure 4 , the evolution of public opinion during the influenza pandemic is divided into eight stages based on the timeline between 10 January and 17 February 2020, when more than 200,000 Weibos are displayed according to their temporal attributes. The period from 10 January to 19 January, during which the number of public opinion changes on Weibos was relatively stable, 2. the period from 20 January to 25 January, during which the number of discussions on Weibos increased sharply, can be described as a period of rapid growth in public opinion discussions, which was related to the phenomenon of "human-to-human transmission" as confirmed by academician Zhong Nanshan, followed by the continued fermentation of the pneumonia outbreak. On 23 January 2020 at 10:00 a.m., Wuhan declared to close the city, which is the first time in human history that the toughest epidemic prevention measures have been taken against a major city of 10 million. From 26 January to 31 January, during the Spring Festival, the number of discussions on social media rapidly declined, and the focus of netizens shifted; 4. from 1 February to 6 February, the number of discussions fluctuated, but still at a low level, during this period, the Huoshenshan Hospital delivered and built 11 new hospitals. 5. From 7 February to 8 February, the number of public opinions on Weibo reached a small peak in Wuhan residents' discussion of the epidemic; 6. from 9 February to 10 February, the heat of the previous discussion had faded, and the medical staff completed the nucleic acid testing of all suspected patients. 7. From 11 February to 13 February, discussions reached another peak due to the closed management of all residential areas; 8. from 14 February to 17 February, related public opinion discussions gradually decreased amid fluctuations. During this period, the closed management of residential areas was further strengthened, and residents' focus was on the official rumor dispelling information in addition to the epidemic itself. Based on natural language processing (NLP) techniques, the collected data were categorized into negative, positive, and neutral, and the daily percentage of negative sentiment Weibos were plotted together as shown in Figure 4 , where the proportion of negative sentiment reached a maximum of nearly 45% before and after the full suspected patient intake policy was implemented, and then it began to decline. According to the above-mentioned stages of public opinion evolution, we calculated the 1 standard deviation directional distribution of Weibos in different stages, as shown in Figure 5 and the graph in the upper right corner of it, where the size of the nodes represents the number of check-ins. Stage 1 shows a northwest-southeast directional distribution, with the geographic discussion center in the Sha lake area of Wuchang District; stage 2, the discussion of the climbing period, shows a shift of the geographic center of gravity towards Jianghan District; stages 3, 4, and 5 show a similar directional distribution as the center of gravity; stage 6, the discussion of the trough period, shows a more scattered distribution and a shift of the center of gravity towards the Qingshan District; stages 7 and 8 show a more compact directional distribution, with the geographic center of gravity shifting across the Yangtze River to the other side area. The check-in information in cyberspace is closely related to the population distribution, and an important time point is around 23 January, when the direction is southeast-northwest and then reverses. Since all suspected patients must be seen at the fever clinic, we hypothesized that the commuting distance between the infection node and the fever clinic has an effect on the outcome of the epidemic infection. We first used OSM road network data, 71 fever clinics, and community with confirmed cases for shortest path analysis. The result shows that Shizishan-Zhangjiawan area, East Lake Development Zone-Optics Valley Software Park, and some suburban areas are located greater than 6 km from the fever clinic, with the farthest nodes more than 10 km from the fever clinic. Using the previously mentioned spatial clustering and anomaly testing methods, the main urban area of Wuhan was divided into 30 × 40 grids, which were processed using the spatial connection tool for the next study. As shown in Figure 6B , the light red grid is the "high-high" clustering area, indicating that there are more confirmed diagnoses in this area and more confirmed diagnoses in the periphery at a confidence level of p = 0.05, while the dark red grid indicates the "high-low" area, indicating that there are more confirmed diagnoses in this area and relatively few confirmed diagnoses in the periphery. There are still some "low-high" clustering and "low-low" clustering. The comparison of Figure 6 shows that under certain conditions, the distance to the fever clinic did not have a strong correlation with the confirmed diagnosis as we imagined, and we can reject the original hypothesis. Based on the gravity model mentioned earlier, we perform complex network modeling of social media data in cyberspace and confirmed outbreak data in real space. In addition, we plot the spatial interactions between nodes at different connectedness scales, as shown in Figure 7 . Due to the large number of interaction edges, we filter the network edges with certain thresholds and list the network states with the degree of connectedness greater than 10, the degree of connectedness greater than 100, and the degree of connectedness greater than 1000 after the threshold comparison. We use a basic hypothesis to deduce that the area with confirmed cases has an impact on the surrounding areas, and this effect increases with the number of confirmed cases, and decreases with the distance between nodes, then the epidemic nodes form a chain of disaster transmission that interacts with each other. In this paper, we use complex network modeling to find the community structure of the epidemic chain. Figure 7A ,B is a schematic representation of our geo-spatial interactions based on two different scales constructed for the city cell (with confirmed cases). As a comparison on an opinion dimension, we use the Weibo check-in data to do a comparative experiment, and the results are shown in Figure 7C ,D. The results of Louvain's algorithm are shown in Figure 8 , node in real space was divided into seven major communities with very distinct structures, namely, Optics valley, Wuchang-South Lake, Yuejiazui-Xudong, QingShan, HanYang, HanKow region along the river, and North Hankow. The geographical barriers such as Yangtze River and East Lake have a clear watershed effect on the community structure, as opposed to administrative divisions, which do not have significant effects. The influence within the community is much higher than that of the nodes between the communities, and the final division of the modularity is 0.793, showing a strong complex network of community structure. According to the statistics, seven community nodes have an in-degree higher than 100, indicating that the infection pathways leading to this part of the node are more difficult to control and are key nodes in the evolution and development of the public health event disaster. Four community nodes have an out-degree higher than 100, indicating that these community nodes have the widest impact in this epidemic evolution chain network, the nodes cause the most severe impact, and are the central nodes of the public health event disaster evolution network. The network structure of the control group as shown in Figure 8B , the community structure along the Yangtze River for segmentation, basically divided into a broad area of Wuhan City along the river, Wuchang-Optics Valley, Qingshan-Hankow, and Hangyang-Qiaokou, the total network modularity of 0.558 less than the real epidemic data network of 0.793. The locations with the largest number of Weibos are Wuhan Union Medical College Hospital, Wuhan Central Hospital and Wuhan Jin-Yin-Tan Hospital, all of which are public designated fever clinics. The number of comments on a single Weibo post at Concordia Hospital was 3138, and such a high level of public attention reflects the heated discussion the epidemic has generated in cyberspace. A Weibo initiative to provide free hotels for medical staff, positioned at Hubei Province Hospital of Traditional Chinese Medicine, has been reposted 3455 times. In addition, a Weibo positioned at Wuhan Union Hospital to cheer up the medical staff has been reposted 2553 times. The epidemic has not only had a huge impact on real space, but has also sparked great concern and discussion in cyberspace, with the large hospital as the frontline of the fight against the epidemic quickly becoming the focus of national attention. The density of the complex network graph constructed using the epidemic data was 0.041, higher than 0.024 for the Weibo check-in network, and the average clustering coefficient was 0.7, higher than 0.49 for the Weibo check-in network. Higher modularity of the epidemic network suggests that CVOID-19 could lead to severe localized infections and that there is a possibility of local concentration (localized concentrations) and cross-effects of infection. In addition, complex networks are also characterized by small world and scale-free [47] , with epidemic data networks having small average path lengths (2.962), larger average clustering coefficients (0.7) and more obvious small world effects; point of interest (POI) data networks have relatively smaller average clustering coefficients (0.49). Therefore, it indicates that there is a diversity of interaction behavior in cyberspace and the overall structure of this network is relatively "loose" in general [48, 49] . We discuss the scale-free characteristics of the two networks in Figure 9 . The scale-free characteristics of the epidemic network are more pronounced, with a fitted R 2 of 0.572, indicating that except for a few communities with more severe impact, the remaining communities are less infected, and the city closure measures have curbed the spread of the virus in time [50, 51] . In contrast, the POI network showed a weaker scale-free feature with a fitted R 2 of 0.204, indicating that the network information spreads more uniformly in space and has a larger range of influence. Thus, there is no clear indication that Wuhan residents in infected areas focus their attention on certain (few) points of interest. We classify all nodes into 5 levels, from low (Level 1) to high (Level 5), according to their degree values. In Figure 10A , we can see that the distribution of nodes in the POI network is hierarchical, with a large number of Level 5 nodes in the northern part of the Han River and the western part of the Yangtze River, in a decreasing ring shape. The mentioned area is densely populated and affluent, and is a well-known commercial and financial center of Wuhan. In addition, there are a number of POIs, such as the infectious disease hospital, which is the first line of defense against the epidemic, and the South China Seafood Market, where the SARS-CoV-2 virus has been found, which generated a great deal of discussions during the study period. Correspondingly, there are multiple clusters in the real space community network ( Figure 10B ), and the degree peaks appear in the cluster center to show multi-community distribution. That is to say, there are multiple regions with degree peaks, which also reveals the geographic transmission characteristics of COVID-19, the multicenter distribution is very likely to trigger mass infection. We mentioned in Secton 3.4 that a node with high betweenness centrality has a large influence on the spread of virus through the network. The more nodes with high betweenness centrality, the easier it is for the virus to spread and the faster the number of people diagnosed will rise. So it is an effective way to stop a new round of virus spread through controlling these geographic nodes. In addition, complex networks are tightly connected within communities and sparsely connected between communities, which opens up possibilities for prevention and control. By controlling a few strong connections between communities, we can effectively prevent cross-infection and transmission between communities as shown in Figure 11 . The 30 communities attribute with the highest degree of outbound and inbound for the real space community nodes are in Tables 1 and 2 . We perform calculations and classifications based on node attributes. We have divided the key nodes into three categories. As shown in Table 2 , Shihua Community has high in-degree, but low betweenness centrality, and it is susceptible to the influence of other nodes but has low importance in the network, making it difficult to induce new community infections in the network. Correspondingly, the Tujia Gou node has high out-degree (ninth), very high in-degree (third), and high betweenness centrality, indicating that the node is very well connected throughout the network, and that poor control (untimely response) is highly likely to trigger a new round of virus spread. In addition, there is another type of node that has high out-degree but low betweenness centrality, such as Huateng Park and Baoli City. While there may be more confirmed cases at these nodes, they have a lower risk of spreading across the network. The risk of inducing disaster: high level ; medium level ; low level . This article describes the epidemic situation in the most affected areas of China. From the perspectives of traditional public opinion analysis and geographic location analysis, we explore the migration of the center of gravity of public opinion on social media, and the clustering/anomalies of epidemic communities. More importantly, we propose an innovative model that combines network science and geoscience. The complex network relationships implied by network public opinion data and node infection data are used to construct directed edges, establish network simulation evolution chains in the context of major public safety events, assess the riskiness of disaster nodes, as well as the entire network. Cities are the primary carriers of human economic activity, and the implementation of a total city closure would result in severe economic losses. The virus will likely continue to spread for some time before a vaccine is used on a large scale. Controlling the critical geographic nodes of disease onset would help slow the spread of the virus. Smartphone developers can embed this model into applications that dynamically calculate the riskiness of different regional nodes based on actual epidemics, guiding people to reduce clustering in high-risk nodes. The limitations of our model are that it only uses data from one point in time, and the risk of network nodes should be dynamically simulated over time. It would be an interesting and valuable direction of research to adjust the networks with incorporate concrete application scenarios. Author Contributions: Conceptualization,Y.Z.; methodology,Y.Z. and N.C.; software, X.Z.; data curation, N.C.; writing-original draft preparation, Y.Z.; writing-review and editing, W.D. and S.Y. All authors have read and agreed to the published version of the manuscript. WHO declares COVID-19 a pandemic The importance of contact tracing when predicting epidemics. arXiv 2020 MHLW COVID-19 Response Team; Suzuki, M. Closed environments facilitate secondary transmission of coronavirus disease 2019 (COVID-19) Risk Assessment of COVID-19 based on multisource data from a geographical viewpoint Virtually perfect? Telemedicine for COVID-19 Modeling of Epidemic Transmission and Predicting the Spread of Infectious Disease Basic reproduction number and predicted trends of coronavirus disease 2019 epidemic in the mainland of China Examining the impact of COVID-19 lockdown in Wuhan and Lombardy: A psycholinguistic analysis on Weibo and Twitter Social media in disaster risk reduction and crisis management Implications of volunteered geographic information for disaster management and GIScience: A more complex world of volunteered geography Rapid assessment of disaster damage using social media activity Spatialities of data: Mapping social media 'beyond the geotag The geography of Twitter topics in London Citizens as sensors: The world of volunteered geography From earth observation to human observation: Geocomputation for social science Geo-parsing messages from microtext Discovery of unusual regional social activities using geo-tagged microblogs. World Wide Web The geography of happiness: Connecting twitter sentiment and expression, demographics, and objective characteristics of place How Urban Factors Affect the Spatiotemporal Distribution of Infectious Diseases in Addition to Intercity Population Movement in China Exploring urban spatial features of COVID-19 transmission in Wuhan based on social media data Travel recommendation using geo-tagged photos in social media for tourist Mining location from social media: A systematic review Characterizing geographical preferences of international tourists and the local influential factors in China using geo-tagged photos on social media The structure and function of complex networks Community detection in graphs Coupled disease-Behavior dynamics on complex networks: A review. Phys. Life Rev A new SAIR model on complex networks for analysing the 2019 novel coronavirus (COVID-19) How did COVID-19 impact air transportation? A first peek through the lens of complex networks Population flow drives spatio-temporal distribution of COVID-19 in China Space-time dependence of corona virus (COVID-19) outbreak. arXiv 2020 Network analysis in public health: history, methods, and applications Complex social contagion makes networks more vulnerable to disease outbreaks Detecting discussion communities on vaccination in twitter On the first law of geography: A reply The impacts of cultural values on bilateral international tourist flows: A panel data gravity model A gravity model of immigration Bayesian model averaging and jointness measures: Theoretical framework and application to the gravity model of trade Enhancing Personalized Trip Recommendation with Attractive Routes Scalable community detection with the louvain algorithm The igraph software package for complex network research Complex network measures of brain connectivity: Uses and interpretations A faster algorithm for betweenness centrality Ranking of closeness centrality for large-scale social networks Weighted pagerank algorithm Complex networks: Small-world, scale-free and beyond Social awareness of crisis events: A new perspective from social-physical network Chinese tourists in Nordic countries: An analysis of spatio-temporal behavior using geo-located travel blog data The effect of human mobility and control measures on the COVID-19 epidemic in China COVID-19 control in China during mass population movements at New Year The numerical calculations in this paper have been done on the supercomputing system in the Supercomputing Center of Wuhan University. Our deepest gratitude goes to the anonymous reviewers for their careful work and thoughtful suggestions that have helped improve this paper substantially. The authors declare no conflict of interest.