key: cord-0566253-dfx9lclz authors: Wang, Songhe; Wei, Kangda; Lin, Lei; Li, Weizi title: Spatial-temporal Analysis of COVID-19's Impact on Human Mobility: the Case of the United States date: 2020-10-08 journal: nan DOI: nan sha: ce4725de864ca90d55e3a9e4466296a019e9e3c8 doc_id: 566253 cord_uid: dfx9lclz COVID-19 has been affecting every aspect of societal life including human mobility since December, 2019. In this paper, we study the impact of COVID-19 on human mobility patterns at the state level within the United States. From the temporal perspective, we find that the change of mobility patterns does not necessarily correlate with government policies and guidelines, but is more related to people's awareness of the pandemic, which is reflected by the search data from Google Trends. Our results show that it takes on average 14 days for the mobility patterns to adjust to the new situation. From the spatial perspective, we conduct a state-level network analysis and clustering using the mobility data from Multiscale Dynamic Human Mobility Flow Dataset. As a result, we find that 1) states in the same cluster have shorter geographical distances; 2) a 14-day delay again is found between the time when the largest number of clusters appears and the peak of Coronavirus-related search queries on Google Trends; and 3) a major reduction in other network flow properties, namely degree, closeness, and betweenness, of all states from the week of March 2 to the week of April 6 (the week of the largest number of clusters). The COVID-19 pandemic has been affecting the world since December, 2019 (1) . As for the United States, the government announced a national emergency on March 13 and attempted to slow down the spread of the virus by issuing social distancing guidelines. Many societal aspects are subsequently affected including human mobility-the driven force for social-economic development. According to Apple COVID-19 Mobility Trends Reports (2) , after declaring national emergency, the traffic flow, including driving and walking, has decreased over 50%. The impact of COVID-19 on the mobility of various social-demographic groups within the United States has been assessed by several studies (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) . To provide some examples, Kabiri et al. (3) study the difference of mobility changes of younger and senior communities. They find that both younger and senior communities have been performing social distancing after the national emergency declaration. The percentage of people staying at home and the social distancing index, a measurement based on the change of a community's daily work trips, show a trend of decreasing until mid-April and then gradual increasing. In addition, they find that people need a certain period of time to react to government guidelines and term this phenomenon social distancing inertia (4). Engle et al. (5) evaluate how local infection rate would lead to mobility reduction. They show that the region with a higher percentage of residents older than 65 has a higher mobility reduction rate. Lee et al. (6) analyze the relationship between mobility and various socio-demographic factors and find that 1) the community with higher income has a higher stay-at-home percentage compared to the community with lower income after the outbreak, and 2) the community with higher population density has shorter travel distance compared to the community with lower population density. Fellows et al. (7) conclude that even though many factors, such as perceived risk, community guidance, and media coverage, could affect people's mobility patterns, the factor that has the most effect is the governmental non-pharmaceutical intervention, e.g., the staying-at-home order. Similar studies have also been conducted in other countries in analyzing the impact of COVID-19 on human mobility patterns (10) (11) (12) . Five months after the declaration of national emergency, many states have moved to various phases of reopening. Some studies have examined this process. For example, Huang et al. (13) study the impact of reopening on mortality, infections, and hospitalizations in Harris county, Texas, United States. As another example, Soltani and Rezazadeh (14) suggest that releasing a small portion of the population may cause a significant increase in confirmed cases. Most previous work focused on different responses from various social-demographic communities and the local factors associated with the mobility patterns. Others study the effect of reopening on the pandemic's evolution. However, these studies do not study the relationship between people's awareness of the situation and the change of mobility patterns, and do not explore the mobility patterns between different states. Our work not only explores the correlation between people's awareness of the pandemic and the mobility patterns, but also uses network analysis and clustering to reveal the underlying mobility structures between different states. To be specific, from the temporal aspect, we analyze the correlation between human mobility and the amount of search queries from Google Trends at the state level. As a result, we find that, on average, people adjust their mobility two weeks after their awareness of the situation. From the spatial aspect, we perform network clustering to reveal hidden mobility structures among all states. The result shows that the connections among states at first decreased and then gradually restored after mid-April. Meanwhile, the states in the same cluster tend to have shorter geographically distances. In addition, we find, again, a two-week delay between the point when the largest number of clusters appears and the point when the peak of the Google search trend regarding Coronavirus occurs, which confirms our temporal analysis results. Lastly, the network analysis shows a reduction of degree, closeness, and betweenness-the properties that describe mobility flows of a state-for all states. In the rest of the paper, we first discuss the data sources used in our analysis. Then, we explain our methodology for this research. Next, we present temporal and spatial analysis results and explain our findings. Finally, we conclude and discuss potential future work. In this section, we provide details of our data sources. Regarding the mobility data, the first dataset that we use is Apple COVID-19 Mobility Trends Report (2) . The data contain traffic flows measured by the number of route changing queries via mobile devices of all states of the U.S., from January 13 to July 20, 2020, The flow of January 13 is set to 100, and the flow of other days is computed relatively to the flow data of January 13. For example, if the traffic flow of a certain day is increased by 20% compared to the traffic flow of January 13, the flow of that day is marked as 120. Some sample data are shown in Table 1 The second dataset is from Google Trends (15), which we use as an indicator of people's awareness of the pandemic. The dataset contains both state-level and nationwide weekly search data regarding Coronavirus. We extract the part of the data from January 13 to July 20, 2020 for our analysis. The data are adjusted: with 0 meaning a very small number of queries was made and 100 representing the largest number of queries made over the past year. A number between 0 and 100 indicates that a week has a specific higher search percentile than all other weeks over the past year. Some sample data are shown in Table 2 . Furthermore, in order to conduct daily analysis, we linearly interpolate the weekly data so that each day in the study period also has the search data metric. The third dataset is the Multiscale Dynamic Human Mobility Flow Dataset (16) , which contains weekly mobility flow data from the week of March 2 to the week of May 11, 2020. Each week's data contain the origin and destination of visitor and population flow, and the longitudes and latitudes of the origin states and destination states. The visitor flow is measured by recording the geographical locations of mobile devices, and the population flow is calculated based on the visitor flow and the population of each state. Regarding the pandemic data, we use the daily reports of COVID-19 state-level confirmed cases from New York Times (17) . We have pre-processed all the data by conducting min-max (15) . 0 meaning the lowest search amount over the past year and 100 meaning the highest search amount over the past year. As a reference, the national emergency is declared on March 13 in the U.S. feature scaling. This operation ensures that the data are on the same scale, providing convenience for our modeling, analysis, and visualization. We hypothesize that there exists a time difference between the change of mobility and people's awareness of the pandemic-which is assumed to be reflected by Google Trends data. In other words, we suspect that the point of people realizing the seriousness of a situation and the point of people taking actions to reflect their awareness may not be well synchronized. To test our hypothesis, we study the correlation between the two time series: the mobility data from Apple (2) and the search trend data of COVID-19 from Google (15) . To be specific, we compute the Pearson's correlation coefficients by fixing the mobility data and shifting the search trend data one day at a time. The study period is from January 13 to July 20, 2020. As a result, the highest correlation coefficients (absolute values) all occur after shifting the search trend data forward in time. The number of days of delay corresponds to the highest correlation coefficients of different states is shown in Figure 1 . The minimum delay of people adjusting their mobility to reflect their awareness of the pandemic is 11 days, the maximum delay is 18 days, and the average delay is 14 days. These observations coincide with the previous finding from Soltani and Rezazadeh (14) , which states that people need a certain amount of time to adjust their mobility to the posed government guidelines. In order to explore possible hidden structures embedded in the mobility flow network among all states (18), we perform network clustering using the Label Propagation algorithm implemented in the Python library NetworkX (19) on the Multiscale Dynamic Human Mobility Flow Dataset (16) . The graph we study contains 50 states as its nodes and the traffic flows among the states as its edges. The algorithm works as follows: first the algorithm assigns each node a distinct label and then recursively assigns the label of each node to be the label that appears most frequently among the labels of the node's neighbors. In the case of multiple labels sharing the highest frequency, one of the labels will be randomly chosen for the node. The algorithm stops when all nodes have the labels that appear most frequently among the labels of their neighbors. Since the Multiscale Dynamic Human Mobility Flow Dataset contains weekly inter-state mobility flows over 11 weeks (from the week of March 2 to the week of May 11, 2020), we have 11 mobility flow networks to analyze. For each week, we run the Label Propagation algorithm 100 times to reveal hidden clusters. During that process, we record the mode, mean, and standard deviation of the number of the clusters of the 11 weeks. These statistics are shown in Table 3 . We take the mode of the number of clusters of each week as the final number of clusters to conduct further analysis. (16) . The larger the number of clusters, the more isolated the states are. The largest number of clusters appears at week 6, i.e., the week of April 6. From the clustering result, we observe that the number of clusters tend to increase and then decrease over the study period of 11 weeks. To be specific, in the first two weeks (before the national emergency declaration on March 13), all states are identified as one cluster. We suspect the reason being that people's awareness of COVID-19 remained low and the staying-at-home order was not implemented. As the pandemic evolves and people become more aware of the situation, the number of clusters increases, which implies people have restricted their travel and the states are more isolated. The maximum number of clusters appeared at week 6 (the week of April 6). This result may correlate with some major events occurred between March 13 and April 6, including the declaration of national emergency, the issuing of stay-at-home orders in all 50 states, and the cancellation of events with crowds. Afterwards, the number of clusters starts to decrease, which implies that people gradually resume their prior-pandemic travel behaviors. Figure 2 TOP showcases the clustering results. To provide some examples, Week 3 (the week of March 16) shows that in the beginning of the pandemic, we only have two clusters, which indicates that mobility flows are more uniformly distributed across the country. Week 6 (the week of April 6) presents the largest number of clusters, i.e., eight clusters, indicating that the mobility flows among states reach the lowest point. Week 11 (the week of May 11) shows that the number of clusters reduces to four, showing that the states are less isolated and the mobility flows start to recover. In order to quantify our results, we further compute two types of distances: cluster average distance and country average distance. Both distances are calculated using the coordinates of the geometric centroids of the states. For computing the cluster average distance, we first empirically pick a state in the cluster as the centroid and then take the average of the distances from the rest of the states in that cluster to the centroid. Next, we compute the country average distance by taking the distances from all states to the centroid. The relative reduction in percentage of cluster average distance compared to the country average distance of all clusters over nine weeks, starting from the week of March 16 and ending at the week of May 11, are shown in Figure 2 . The first two weeks, i.e., the weeks of March 2 and March 9 are exclude since for these two weeks the algorithm reports the whole country as one cluster. As a result, all cluster average distances are much shorter than their corresponding country average distances, which suggests that the states in the same cluster are geographically close to each other. This result confirms that the increase of the number of clusters indeed reflects the isolation of the states and reductions in inter-state mobility flows. Table 4 shows the change of the number of clusters and nationwide weekly Google Trends data over the 11 weeks. A two-week delay between the peak of search trends on Google and the largest number of clusters is again spotted. It is worth noticing that even though we use different mobility data in our mobility adjustment analysis and network analysis, both analyses find the same amount of delay in time between the mobility change and people's awareness of the pandemic. Week The change of the number of clusters and weekly Google Trends data over the 11 weeks. We observe a two-week delay between the peak of Google Trends data and the largest number of clusters. This result coincides with our previous finding, which, on average, it takes 14 days for people to adjust their mobility to reflect their awareness of the pandemic (indicated by the Google Trends data). Because the pandemic may affect different states unequally, we have examined other network flow properties, namely the degree, closeness, and betweenness of each state as a node in the traffic flow network. In particular, the degree of a node is defined as the sum of edge weights to all its neighbors. The change of the degree can inform the change of the flow of a state. The closeness of a node is defined as the average length of the shortest path (computed using the edge weights) from the node to all other nodes in the graph, which can reflect the accessibility of a state. Having the closeness calculated for all nodes, we could study which states tend to be less "attractive" during the pandemic. Lastly, betweenness quantifies the number of times a node that acts as the "bridge" along the shortest path between two other nodes. The betweenness of a node can be used to identify which state is serving as a domestic transportation "hub". The higher the value the more influence the state has on the domestic mobility. Table 5 lists top 10 states with the most reductions in degree, closeness, and betweenness of the week of April 6 (the week has the largest number of clusters and most isolated) compared to the week of March 2 (the beginning of the pandemic). We can see that the average reductions in degree, closeness, and betweenness are 7.7%, 6.9%, and 21.6% respectively. Figure 3 illustrates the change of inter-state flow connections of the week of March 2 and the week of April 6. The size of a node represents the total mobility flow through that state during the week, while each edge indicates 10,000+ mobility flow connecting two states. A significant reduction of the edges (i.e., mobility flow) is observed. The outbreak of COVID-19 has significantly impacted our society and dramatically changed our mobility patterns. We conduct spatial-temporal analysis of the influence of COVID-19 on human mobility in the United States. From the temporal perspective, we find that there exists a delay of, on average, 14 days between people's awareness of the pandemic and the change of mobility patterns. From the spatial perspective, we discover that the states within the same cluster tend to have shorter geographical distances. There also exists a two-week delay between the appearing of the largest number of clusters and the occurrence of the peak of the Google search trend regarding Coronavirus. Lastly, we perform a network analysis and find that there is a reduction of the network flow properties, namely degree, closeness, and betweenness, for all states. For the future research, we are mainly interested in two directions. The first direction concerns a more detailed mobility analysis at the city level. We can first use existing techniques such as Compressed Sensing (20, 21) to interpolate the sparse mobile data used in this work. In combination with other COVID-19 tracker data, we can then potentially estimate the impact of COVID-19 on city-level traffic states and simulate the dynamics of the virus spreading using some existing techniques (22) (23) (24) . Another interesting topic in this direction is to study the change of shared mobility such as bikes (25, 26) . The second research direction concerns future mobility, as the future traffic system is projected to be connected and autonomous (27), we would like to study when mobility patterns have a drastic change, to what extent the connected and autonomous vehicles can be used to supplement the traffic system and fulfill various societal needs. We can leverage recent advances in autonomous driving modeling and simulation to facilitate this research direction (28, 29) . Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. The lancet COVID-19-Mobility Trends Reports How different age groups responded to the COVID-19 pandemic in terms of mobility behaviors: a case study of the United States Observed mobility behavior data reveal social distancing inertia Staying at home: mobility effects of COVID-19 Human Mobility Trends during the COVID-19 Pandemic in the United States The COVID-19 Pandemic, Community Mobility and the Effectiveness of Non-pharmaceutical Interventions: The United States of America Impacts of COVID-19 mode shift on road traffic COVID-19 and Income Profile: How People in Different Income Groups Responded to Disease Outbreak, Case Study of the United States Human Mobility in Response to COVID-19 in France Effects of the covid-19 pandemic on population mobility under mild policies: Causal evidence from sweden Lockdown strategies, mobility patterns and COVID-19 Population stratification enables modeling effects of reopening policies on mortality and hospitalization rates A New Dynamic Model to Predict the Effects of Governmental Decisions on the Progress of the CoViD-19 Epidemic Multiscale Dynamic Human Mobility Flow Dataset in the US during the COVID-19 Epidemic Covid-19) Data in the United States Data mining and complex network algorithms for traffic accident analysis Exploring network structure, dynamics, and function using NetworkX Citywide Estimation of Traffic Dynamics Via Sparse GPS Traces Efficient Data Collection and Accurate Travel Time Estimation in a Connected Vehicle Environment via Real-Time Compressive Sensing Estimating urban traffic states using iterative refinement and Wardrop equilibria Virtualized Traffic at Metropolitan Scales City-Scale Traffic Animation Using Statistical Learning and Metamodel-Based Optimization Predicting Station-Level Bike-Sharing Demands Using Graph Convolutional Neural Network Predicting station-level hourly demand in a large-scale bikesharing network: A graph convolutional neural network approach ADAPS: Autonomous Driving Via Principled Simulations A Survey on Visual Traffic Simulation: Models, Evaluations, and Applications in Autonomous Driving