title: COVID-19 confines social gathering to familiar, less crowded, and neighboring urban areas
authors: Yoon, Jisung; Jung, Woo-Sung; Kim, Hyunuk
date: 2021-09-02

Understanding human urban dynamics is essential but challenging because cities are complex systems in which people and space interact. Using a customer-level data set from a leading Korean accommodation platform, we identify urban hierarchy, geographical distance, and attachment to a location as crucial factors of social gathering behaviors in urban areas. We also introduce a model that incorporates these factors and reconstructs the key characteristics of the data. Our model and analysis show that COVID-19 led to significant changes in social gathering behaviors. After the outbreak, people are more likely to visit familiar places, avoid places at the highest level of the urban hierarchy, and travel shorter distances, while the total number of accommodation reservations does not change much. Interestingly, these changes facilitate social gathering activities only at other high levels, implying that an external shock reduces the centralization of human urban dynamics but worsens the inequality of urban areas at low levels.

Human urban activities are principal elements of social phenomena, including the growth of cities [1, 2], economies [3, 4], and epidemics [5]. They tend to be concentrated in parts of cities and form a hierarchy of geographical areas, where regions at upper levels attract more people than those at lower levels [6-8]. A person may frequently visit a popular region, often referred to as a hotspot [9-11], even though it is far from where the person lives. Strong urban hierarchy raises various concerns during a pandemic [12-14].
The spread of an infectious disease would be broad and prevalent if it originated from a hotspot at the top of the hierarchy [15-18]. The economic impact of a pandemic also differs by hierarchy level. After the COVID-19 outbreak in China, household consumption decreased more in urban regions than in rural regions [19]. The income of people working in the informal economy, which is usually located at low hierarchy levels, was negatively affected by the COVID-19 pandemic [20]. Despite its importance for human activities, the urban hierarchy has rarely been considered when analyzing behavioral changes in response to a pandemic [21-25].

Here, we develop a mechanistic model incorporating urban hierarchy and compare visitation patterns before and after the COVID-19 outbreak using individual-level booking records from a leading Korean accommodation platform. This type of data has revealed the structure of urban systems [26], the emergence of multiple centers in a city [27], and the spatial distribution of socioeconomic factors [28, 29]. Interestingly, in Korea, accommodation bookings are significantly correlated with recreational gatherings (Fig. 1a) that spread infectious diseases [17]. Our analyses show that urban hierarchy, geographical distance, and attachment to a location are important factors of recreational gatherings in Seoul, the largest city of the Republic of Korea.

As many users of the Korean platform book accommodations for intimate relations or rest after overnight partying, we can use the accommodation reservation data (see Methods) as a proxy for recreational gatherings in Seoul. To validate this, we compare reservation counts at the administrative division level with the mobility inflows from Seoul mobility data aggregating GPS locations (see Methods). The rank correlation between the reservation counts and the mobility inflows is significant (correlation coefficient = 0.347, p-value < 0.001; Fig. 1a).
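The validation step can be illustrated with a minimal sketch of a rank correlation between two per-division lists. The text does not state which rank correlation was used, so this sketch assumes Spearman's coefficient with average ranks for ties; the function names are hypothetical and the inputs stand in for reservation counts and mobility inflows.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

A rank-based coefficient only asks whether divisions with more reservations also tend to receive more inflows, so it is insensitive to the very different scales of the two quantities.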
The correlation becomes stronger if we only consider nighttime inflows (from 9 pm to 6 am; correlation coefficient = 0.400, p-value < 0.001).

To understand the effect of COVID-19 on recreational gathering behaviors, we split the data into two periods: pre-COVID-19 (from January 21, 2019 to November 3, 2019) and post-COVID-19 (from January 20, 2020 to November 1, 2020). January 20, 2020 is the day the first COVID-19 infection case was reported in Korea. Both periods start from the fourth week of January and span 286 days. The weekly trend of reservation counts is shown in Fig. 1b.

Figure 1: (a) A comparison of the reservation counts to the mobility inflows in Seoul, Republic of Korea. We aggregate the mobility inflows at the level of the administrative division. Each dot represents an administrative division. The significant correlation supports the use of the reservation history data as a proxy for urban recreational gatherings in Seoul. (b) Weekly reservation counts. Because of data privacy concerns, we normalize the weekly reservation counts by the maximum weekly reservation count.

We assign each accommodation to a Google S2 cell (https://github.com/google/s2geometry). S2 cells tessellate the Earth's surface into cells of similar area, and S2 is known as a robust, flexible spherical geometry library [10, 30, 31]. We use level-14 S2 cells, whose area ranges from 0.19 km² to 0.40 km² (0.32 km² on average). Then, we aggregate the reservation counts by S2 cell and identify a hierarchy of cells by assigning each cell a hotspot level, defined as the inverse decile rank of its aggregated reservation count. Level 1 is the highest, and level 10 is the lowest level. Fig. 2a and Fig. 2b show the urban hierarchy maps for both periods. The assigned hotspot levels are almost consistent across the two periods. Cells with high levels correspond to popular recreational areas in Seoul such as Gangnam, Sinchon, and Yeongdeungpo Time Square (highlighted in Fig. 2a).
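The hotspot level assignment can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes each reservation has already been mapped to an S2 cell identifier by some S2 library, and the function name and the tie-breaking rule (by cell identifier) are choices made here for determinism.

```python
from collections import Counter


def hotspot_levels(reservation_cells):
    """Map each cell id to a hotspot level in 1..10 (1 = busiest decile).

    `reservation_cells` holds one cell id per reservation.
    """
    counts = Counter(reservation_cells)
    # Rank cells from most to least reserved; ties are broken by cell id
    # (an assumption, the source does not describe tie handling).
    ranked = sorted(counts, key=lambda cell: (-counts[cell], cell))
    n_cells = len(ranked)
    # Inverse decile rank: the top 10% of cells get level 1, the next 10%
    # level 2, and so on down to level 10.
    return {cell: (rank * 10) // n_cells + 1 for rank, cell in enumerate(ranked)}
```

With 20 cells, for instance, the two most reserved cells receive level 1 and the two least reserved cells receive level 10.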
To further analyze behavioral changes induced by COVID-19, we take as the focus group the subset of customers who have at least two reservation records in both the pre- and post-COVID-19 periods. This focus group covers 30% of the total customers in the pre-COVID-19 period. Interestingly, COVID-19 affects the inequality of urban areas differently by hierarchy level. The proportion of reservations at the highest level decreases after the COVID-19 outbreak (Fig. 2d inset). However, this share was not redistributed equally across the other levels: people visited levels 2, 3, and 4 rather than low levels ( ≤ 6). We explain the worsening inequality by decomposing individual recreational gathering behaviors in the next section.

Individual records can be converted to sequences of cells and hotspot levels. The arrows in Fig. 3a represent a synthetic journey that consists of visits to urban areas. The cell trajectory of this example contains six visits; note that the same place can appear multiple times. Based on the assigned levels of the cells, the level trajectory is 1 → 1 → 1 → 3 → 1 → 2. For each trajectory constructed from the data, we define the most frequent cell as the recreational home, so the most frequently visited cell is the home in the example.

We construct a flow matrix $T$, where $T_{ij}$ is the number of trips from level $i$ to level $j$ normalized by the total flows (Fig. 3b). Here we exclude self-transitions, that is, trips within the same cell, to focus on the transitions between different cells. The majority of transitions are concentrated at high levels of the hierarchy, and the flow matrix is almost symmetric. To check whether the transition probability from level $i$ to level $j$, $P(j \mid i)$, depends on the origin level $i$, we build a null model [10] in which $P(j \mid i)$ is proportional to the total inflow to the destination level $j$, so that the expected flow is determined by the total outflow from level $i$ and the share of inflows received by level $j$.

The radius of recreational activities (see Methods), a variance of geographical distances from the recreational home of a sequence, quantifies how far apart the places in a trajectory are.
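To make the radius concrete, the sketch below computes it for a trajectory of cell centers given as (latitude, longitude) pairs in degrees. It assumes a root-mean-square form, by analogy with the radius of gyration, and a mean Earth radius of 6371 km; the exact definition is the one given in Methods, and the function names are hypothetical.

```python
from collections import Counter
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def recreational_radius_km(trajectory):
    """Root-mean-square distance of visited cell centers from the home cell.

    The home cell is the most frequent cell; on ties this sketch keeps the
    first one encountered instead of picking one at random.
    """
    home = Counter(trajectory).most_common(1)[0][0]
    mean_sq = sum(haversine_km(p, home) ** 2 for p in trajectory) / len(trajectory)
    return sqrt(mean_sq)
```

A trajectory that never leaves its home cell has radius zero, which matches the observation that many customers stay within a single cell.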
The unit of the radius is the kilometer (km). The distributions of the radius show that the majority of people stay within a single cell without moving to other cells (Fig. 3c).

Attachment to a location is an indicator of customer satisfaction and an important factor for the accommodation business [32, 33]. In Fig. 3d, we show the home ratio, which is the fraction of visits to the most frequent cell in a sequence. Interestingly, the home ratio is about 0.6 regardless of sequence length. The home ratio slightly increases after the COVID-19 outbreak for light users, who booked accommodations no more than 20 times, while there is no difference in the home ratio between the two periods for heavy users, who booked accommodations more than 20 times (the top 10% of users by sequence length).

Our empirical analysis reveals that urban hierarchy, geographical distance, and attachment to a location need to be considered simultaneously to explain recreational gatherings in Seoul, Republic of Korea. Leveraging our key findings on individual movements, we develop a model that reproduces their patterns and detects behavioral changes during the COVID-19 pandemic (Fig. 4a). Our model is motivated by the literature analyzing human mobility [21, 22, 25]. First, an agent starts from an initial cell randomly picked from the cell-level reservation count distribution. Then, at each iteration, the agent either explores places or chooses a previously visited place in proportion to its frequency in the reservation history. The exploration probability is governed by an exploration parameter in [0, 1], which controls the likelihood that the agent decides to explore places, and by the number of iterations, which starts from one; the agent returns to a previously visited place with the complementary probability. As the iteration count increases, the agent is more likely to choose previously visited places.
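The explore-or-return decision described above can be sketched as follows. The exact form of the exploration probability is not recoverable from this text, so the sketch assumes a simple decay, the exploration parameter raised to the power of the iteration number; this is an assumption, as are the names `rho`, `t`, and the `explore` callback, which stands in for the part of the model that picks a new place.

```python
import random
from collections import Counter


def exploration_probability(rho, t):
    """ASSUMED decay: rho ** t, with rho in [0, 1] and t starting from one."""
    return rho ** t


def next_place(history, rho, t, explore, rng=None):
    """Pick the next place given a non-empty visit history.

    With the exploration probability the agent calls `explore(history)`;
    otherwise it returns to a past place chosen in proportion to how often
    that place appears in `history`.
    """
    rng = rng or random
    if rng.random() < exploration_probability(rho, t):
        return explore(history)
    visits = Counter(history)
    places = list(visits)
    weights = [visits[place] for place in places]
    return rng.choices(places, weights=weights, k=1)[0]
```

Under this assumed form, any parameter value below one makes exploration less likely at every iteration, consistent with the agent increasingly choosing previously visited places.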
If the agent decides to explore places, the agent first chooses the hotspot level for the next place.

Through a grid search, we estimate the model parameters that minimize the Jensen-Shannon divergence (JSD) between synthetic sequences and the data for three distributions: the hotspot entropy distribution, the reservation count distribution, and the distribution of the radius of recreational activities. The model has three parameters: the exploration parameter, which controls the likelihood of exploring places; a hierarchy parameter, which governs the tendency to explore high-level places; and a distance parameter, which governs the reluctance to travel far from the recreational home. Note that the exploration and hierarchy parameters affect the reservation count and hotspot entropy distributions, while the distance parameter does not. Taking advantage of this property, we first search jointly for the exploration and hierarchy parameters that minimize the product of the JSDs of the reservation count and hotspot entropy distributions. Next, with these two parameters fixed at their best values, we fit the distance parameter that minimizes the JSD of the radius distribution. For the grid search, we explore the exploration parameter over 51 grid values from 0 to 1 (spacing = 0.02), the hierarchy parameter over 121 grid values from 0 to 3 (spacing = 0.025), and the distance parameter over 201 grid values from 0 to 5 (spacing = 0.025). We repeat the simulation ten times and average the estimated model parameters.

Our model successfully reconstructs the flow matrix, all distributions, and the retention of attachment to a location (Fig. 4b-d). Fig. 4b shows the flow matrix from the model.

We explore the influence of the COVID-19 outbreak on individual movements by comparing the best model result for each period. Fig. 5a shows how the JSDs of the reservation count and hotspot entropy distributions vary with the exploration and hierarchy parameters, and Fig. 5b shows how the JSD of the radius distribution varies with the distance parameter. The overall fitness landscape is similar for both periods, while the optimal point changes. In response to the COVID-19 pandemic, the likelihood of finding new places decreases (exploration parameter: 0.820 pre-COVID-19 > 0.800 post-COVID-19), indicating that people prefer to stay in familiar places. In addition, the tendency to explore a high-level place decreases (hierarchy parameter: 2.075 pre-COVID-19 > 2.025 post-COVID-19), and people become reluctant to travel far from their recreational homes (distance parameter: 1.325 pre-COVID-19 < 1.375 post-COVID-19).

The differences in the estimated parameters between the two periods are not negligible, and the model's goodness of fit is sensitive to parameter changes. In Fig. 5a, if the exploration parameter increases or decreases by 0.02 from its estimated optimum, the product of the two JSDs increases by a factor of 1.058 or 1.080, respectively. If the hierarchy parameter increases or decreases by 0.025 from its estimated optimum, the product increases by a factor of 1.119 or 1.057, respectively. Similarly, if the distance parameter increases or decreases by 0.025 from its estimated optimum, the JSD of the radius distribution increases by a factor of 1.022 or 1.024, respectively. We also note that the standard deviation of the goodness of fit over ten repetitions is an order of magnitude smaller than its average value, implying that our simulation results are robust to random errors.

Intuitively, before the pandemic, people preferred to visit popular places as they had fewer restrictions on geographic distance. During the pandemic, however, people chose familiar, relatively less popular (lower hierarchy parameter), and closer places (larger distance parameter) relative to their recreational homes. Furthermore, these changes reveal the effect of COVID-19 on urban inequality. As people avoid crowded places and choose less popular places, the concentration of activities at the highest level dissolves naturally. However, the number of visits at low hierarchy levels ( ≤ 6) also decreases (Supplementary Fig. S9) because the probability of exploring places quickly converges to zero as the iterations proceed.

In this paper, we quantify the characteristics of recreational visitations with three factors: urban hierarchy, geographical distance, and attachment to a location. Leveraging our findings, we develop a model that successfully reconstructs and explains empirical patterns found in Seoul, Republic of Korea. We show that the COVID-19 pandemic made people less likely to visit different levels in the hierarchy. They prefer familiar, less popular, and closer locations. Furthermore, we suggest a possible mechanism that explains the worsening inequality with the exploration and hierarchy parameters simultaneously.

Our study has several limitations.
First, agents start from an empty reservation history and find places through the model mechanisms. In reality, each individual could have past reservation records and choose a place depending on that history. Second, we use the geographic distance between cells, while urban transportation systems can distort the effective distance. Third, accommodation reservation is a single layer of urban recreational gathering. Diverse layers and their interactions should be considered to understand its dynamics more deeply.

Despite these limitations, our study enhances the understanding of urban human activities and would help design effective public health policies that consider individual movements around home areas. Practically, our model can serve as a simulation tool to prepare for unexpected future events that may affect human behaviors. With our model, academic and industry researchers can also tackle inequality issues stemming from behavioral changes across the urban hierarchy.

We source an accommodation reservation data set from Goodchoice Company LTD, a Korean platform that held 29% of the online market share in 2020 (Wiseapp Report, https://www.dailypop.kr/news/articleView.html?idxno=51946, a news article written in Korean). The data contain anonymized customer-level reservation histories spanning the period from January 2019 to November 2020 and comprising 1,038 unique accommodation locations in Seoul, Republic of Korea. No demographic information is available.

Seoul mobility data were downloaded from the Seoul Open Data Plaza (https://data.seoul.go.kr/dataVisual/seoul/seoulLivingMigration.do, accessed on December 3, 2021). The data contain mobility flows between administrative divisions (425 divisions in total), decomposed by gender, age, time of departure, and time of arrival, obtained by aggregating the mobile phone signals from transceiver stations in Seoul.
Also, for each individual, the data provide an estimated daytime residence (denoted as "W", mostly workplace), nighttime residence (denoted as "H", mostly home), and other classes (denoted as "E"). With these classifications, we can infer the context of urban mobility. For instance, a movement from a workplace to another area to enjoy a recreational gathering is classified as the "WE" type. We use the mobility data spanning the period from January 2020 to October 2020 and focus on the mobility types "WE", "HE", and "EE" to track non-routine mobility patterns for recreational gatherings.

Based on urban hierarchy, we characterize location trajectories with two proposed measures: hotspot entropy and radius of recreational activities. First, the hotspot entropy is the Shannon entropy of hotspot levels in a trajectory. It is defined as

$$h = -\sum_{l=1}^{L} p_l \log p_l,$$

where $L$ is the total number of hotspot levels ($L = 10$) and $p_l$ is the relative frequency of hotspot level $l$ in the trajectory. For example, in Fig. 3a, $p$ is $[\tfrac{2}{3}, \tfrac{1}{6}, \tfrac{1}{6}, 0, 0, 0, 0, 0, 0, 0]$ and the hotspot entropy is $h = -\tfrac{2}{3}\log\tfrac{2}{3} - \tfrac{1}{6}\log\tfrac{1}{6} - \tfrac{1}{6}\log\tfrac{1}{6} = 0.867$.

Second, to quantify how far apart the places in a trajectory are, we define the radius of recreational activities. It is a variance-like measure of geographic distances from the most frequent cell (home cell) in a trajectory (similar to the radius of gyration) and is calculated as

$$r = \sqrt{\frac{1}{N}\sum_{i=1}^{N} d(c_i, c_h)^2},$$

where $N$ is the length of a trajectory, $d$ is the Haversine distance between the centers of two cells, $c_i$ is the $i$-th cell in the trajectory, and $c_h$ is the recreational home cell, that is, the most frequent cell in the trajectory. If there are multiple most frequent cells, we randomly pick one as the home cell.

Supplementary Information: COVID-19 confines recreational gatherings in Seoul to familiar, less crowded, and neighboring urban areas

[Supplementary figure: flow matrix by origin hotspot level, where $L$ is the total number of hotspot levels. The flow matrix is almost symmetric for both periods ($R^2 > 0.999$).]

Figure S5: Urban hierarchy leads people to move farther than expected.
We collect the trajectories whose recreational home cells are near the Gangnam area (red cross, hotspot level 1), which is one of the popular regions in Seoul. Relative visit frequency decays with the geographic distance from the home cell, but the geographic distance cannot explain the pattern near regional hotspots. Because of data privacy concerns, we normalize the reservation count by the maximum reservation count of the actual data set.

The growth equation of cities
Demography and the emergence of universal patterns in urban systems
Mobility and innovation: A cross-country comparison in the video games industry
Global labor flow network reveals the hierarchical organization and dynamics of geo-industrial clusters
Mobility network models of COVID-19 explain inequities and inform reopening
The structure and dynamics of cities
The new science of cities
Urban characteristics attributable to density-driven tie formation
Structure of urban movements: Polycentric activity and entangled hierarchical flows
Hierarchical organization of urban mobility and its connection with city livability
Uncovering the spatial structure of mobility networks
Why inequality could spread COVID-19
Racial residential segregation and economic disparity jointly exacerbate COVID-19 fatality in large American cities
COVID-19 exacerbating inequalities in the US
Error and attack tolerance of complex networks
Epidemics and immunization in scale-free networks. Handbook of Graphs and Networks
Coronavirus disease exposure and spread from nightclubs, South Korea
Spatial structures and scientific paradoxes in the AIDS pandemic
Pandemic, mobile payment, and household consumption: microevidence from China. Emerging Markets Finance and Trade
Policy opportunities and challenges from the COVID-19 pandemic for economies with large informal sectors
Modelling the scaling properties of human mobility
The universal visitation law of human mobility
COVID-19 Pandemic, Geospatial Information, and Community Resilience
Reduction in mobility and COVID-19 transmission
Mobility patterns are associated with experienced income segregation in large US cities
Unraveling the hidden organisation of urban systems and their mobility flows
The spatiotemporal evolution and influencing factors of hotel industry in the metropolitan area: An empirical study based on China
Understanding Airbnb spatial distribution in a southern European city: The case of Barcelona
Impact of the COVID-19 pandemic: Insights from vacation rentals in twelve mega cities
Travel time estimation using spatio-temporal index based on Cassandra. ISPRS Annals of Photogrammetry
The relationship between customer loyalty and customer satisfaction
Customer loyalty in the hotel industry: The role of customer satisfaction and image

We thank Goodchoice Company LTD. for making the accommodation reservation data available for this research. We thank I.

All authors contributed to the work presented in this paper. J.Y. was involved in conceptualization, analysis, and writing; W.-S.J. and H.K. contributed to conceptualization and writing. All authors discussed the results and commented on the manuscript at all stages.

The scripts used in this analysis can be found at https://github.com/jisungyoon/hotspot.

The authors have no competing interests.

Supplementary Information is available for this paper.

Correspondence and requests for materials should be addressed to Dr. Hyunuk Kim.

Due to privacy concerns, the accommodation reservation data we used cannot be shared publicly. The Seoul mobility data are publicly available.