key: cord-0687159-yq0iz6ef
authors: Liu, Yang; Gu, Zhonglei; Liu, Jiming
title: Uncovering transmission patterns of COVID-19 outbreaks: A region-wide comprehensive retrospective study in Hong Kong
date: 2021-06-06
journal: EClinicalMedicine
DOI: 10.1016/j.eclinm.2021.100929
sha: b185e5c8038af210c18fc4c0968c9ace7012fecd
doc_id: 687159
cord_uid: yq0iz6ef

BACKGROUND: Given the dynamism and heterogeneity of COVID-19 transmission patterns, determining the most effective yet timely strategies for specific regions remains a severe challenge for public health decision-makers.

METHODS: In this work, we proposed a spatiotemporal connectivity analysis method for discovering transmission patterns across geographic locations and age-groups throughout different COVID-19 outbreak phases. First, we constructed the transmission networks of the confirmed cases during different phases by considering the spatiotemporal connectivity of any two cases. Then, for each case and those cases immediately pointed from it, we characterized the corresponding cross-district/population transmission pattern by counting their district-to-district and age-to-age occurrences. By summating the cross-district/population transmission patterns of all cases during a given period, we obtained the aggregated cross-district and cross-population transmission patterns.

FINDINGS: We conducted a region-wide comprehensive retrospective study in Hong Kong based on the complete data report of COVID-19 cases, covering all 18 districts between January 23, 2020, and January 8, 2021 (https://data.gov.hk/en-data/dataset/hk-dh-chpsebcddr-novel-infectious-agent). The spatiotemporal connectivity analysis clearly unveiled the quantitative differences among various outbreak waves in their transmission scales, durations, and patterns.
Moreover, statistically similar waves could have quite different cross-district/population transmission patterns (e.g., the cross-district transmission of the fourth wave was more diverse than that of the third wave, while the transmission over age-groups of the fourth wave was more concentrated than that of the third wave). Overall, super-spreader individuals (highly connected cases in the transmission networks) were usually concentrated in only a few districts (2 out of 18 in our study) or age-groups (3 out of 11 in our study).

INTERPRETATION: With the discovered cross-district or cross-population transmission patterns, all of the waves of COVID-19 outbreaks in Hong Kong can be systematically scrutinized. Among all districts, only a few (e.g., the Yau Tsim Mong district) were instrumental in spreading the virus throughout the pandemic. Aside from being exceptionally densely populated, these districts were also socioeconomic centers. With a variety of public venues, such as restaurants and singing/dancing clubs, these districts played host to all kinds of social gathering events, thereby providing opportunities for widespread and rapid transmission of the virus. Thus, these districts should be given the highest priority when deploying district-specific social distancing or intervention strategies, such as lockdown and stringent mandatory coronavirus testing for identifying and obstructing the chain of transmission. We also observed that most of the reported cases and the highly connected cases were middle-aged and elderly people (40- to 69-year-olds). People in these age-groups were active in various public places and social activities, and thus had high chances of being infected by or infecting others.

FUNDING: General Research Fund of the Hong Kong Research Grants Council.

1. Introduction

The COVID-19 pandemic continues.
As of January 8, 2021, the daily increase in new cases has exceeded 0.5 million, resulting in over 87 million confirmed cases with more than 1.9 million deaths across 191 countries/regions [1-3]. To mitigate the spread and outbreak of COVID-19, many countries have implemented various intervention and social distancing strategies, such as quarantining suspected and confirmed cases, implementing school closures and workplace lockdowns, and reducing public transportation. The effects of different strategies have been modeled and evaluated [4-15]. It remains a serious challenge for public health decision-makers to identify the most effective yet timely strategies for specific regions. The transmission of COVID-19 is a complex process governed by many factors, such as demographics, population mobility, and environment [16]. Therefore, the transmission patterns within different countries or regions are highly heterogeneous and vary over time.

To effectively mitigate the spread of COVID-19, answering the following research question is critical: how can the transmission patterns of outbreaks among different age-groups within/across populated geographic locations throughout different outbreak phases be accurately characterized, especially in densely populated cities/regions with high transmission risk? To answer this question, we must answer three sub-questions:

(1) Are there any underlying or hidden transmission networks of COVID-19 outbreaks throughout the different phases and, if so, how can they be uncovered?
(2) What are the structures of such transmission networks and what do they imply?
(3) Do these transmission networks reveal the cross-district or cross-population transmission patterns of outbreaks and, if so, what are the implications for public health policymaking?

To answer these questions, we proposed a spatiotemporal connectivity analysis method for discovering the transmission networks of COVID-19 outbreaks [17, 18]. We conducted a fine-grained retrospective study of one of the world's most densely populated regions, Hong Kong, at both the individual and meta-population levels throughout different outbreak phases. As an international metropolitan city with a population of 7.5 million, Hong Kong has over 7,000 people per square kilometer. In an environment with such a dense population and high connectivity, residents have higher chances of coming into close contact with others. Inevitably, this can trigger and catalyze the rapid spread of infectious disease. Through a comprehensive retrospective study of Hong Kong, we provided a scientifically grounded method of discovering and quantitatively characterizing the transmission dynamics of COVID-19 in terms of age-group, district, and period in a typical metropolis. The proposed spatiotemporal analysis method can help guide efforts to control and prevent the spread of the disease not only in Hong Kong but also in other densely populated international metropolises.

2. Methods

2.1. Data sources and collection

We conducted the study and analysis based on the complete data report of COVID-19 cases in Hong Kong, covering all 18 districts between January 23, 2020, and January 8, 2021. The data were publicly released by the Department of Health, Hong Kong Special Administrative Region Government [19]. (Note that the Government of the Hong Kong Special Administrative Region is the owner of the intellectual property rights to all content available on DATA.GOV.HK, including but not limited to all data.)
For each confirmed case, the data attributes included age, gender, residency, onset date, report date, whether the individual was asymptomatic, the individual's building of residence, and the buildings that the individual visited during the 14 days before the date of case confirmation. We used the latitudes and longitudes of the buildings, identified from the buildings' names by the Google Geocoding API, to group single buildings that appeared under different names (due to inconsistencies in the collected data) as single locations. For simplicity, the term "building" in this study refers to the locations (9,305 locations out of 10,247 buildings) after pre-processing.

Evidence before this study

The spread of COVID-19 is a complex process governed by many disease-related factors, so the transmission patterns within different countries or regions are highly heterogeneous and vary over time. As a result, it remains unclear how to identify the most effective yet timely strategies for specific regions. To effectively mitigate the spread of COVID-19, it is critical to answer the following question: how can the transmission patterns of outbreaks among different age-groups within/across populated geographic locations throughout different outbreak phases be accurately characterized?

We showed that the proposed spatiotemporal connectivity analysis can accurately characterize the cross-district and cross-population interactions, and thus provide clear explanations of the underlying and heterogeneous disease transmission patterns. Through a comprehensive retrospective study of Hong Kong, we provided a scientifically grounded method of discovering and quantitatively characterizing the transmission dynamics of COVID-19 in terms of age-group, district, and periods in a typical metropolis. The proposed spatiotemporal analysis method can help guide efforts to control and prevent the spread of the disease not only in Hong Kong but also in other densely populated international metropolises.

This work touches upon an essential problem at a critical moment in time: determining the most effective and timely strategies for halting the spread of COVID-19 in specific regions, which remains a serious challenge for public health decision-makers. By revealing transmission patterns using spatiotemporal connectivity analysis, we gained insights into the dynamism and heterogeneity of COVID-19 transmission, and into the optimization and implementation of common pharmaceutical (e.g., the vaccination of high-risk populations) and non-pharmaceutical (e.g., the lockdown of high-risk areas) intervention strategies.

2.2. Construction of transmission networks via spatiotemporal connectivity

We constructed the transmission networks of the confirmed cases during different phases by considering the spatiotemporal connectivity of any two cases. Each case represented a node in the network. For the nodes of two cases, we defined the node of the first case as being connected to the node of the second case if (1) the onset date of the first case was within the 14 days before the onset date of the second case (referred to as "temporal connectivity") and (2) the two cases visited the same building within the 14 days before their confirmed diagnosis (referred to as "spatial connectivity"). Based on the above definition, the transmission network can be constructed as a directed graph.
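
The two connectivity rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation (which is described in the Supplementary Materials): the `Case` record, the `build_edges` function, and the sample data are hypothetical, and the choice to exclude pairs with the same onset date is an assumption, since the text does not say how ties are handled.

```python
from dataclasses import dataclass
from datetime import date
from itertools import permutations


@dataclass(frozen=True)
class Case:
    case_id: int
    onset: date
    # Hypothetical field: location IDs visited in the 14 days before confirmation.
    buildings: frozenset


def build_edges(cases, window_days=14):
    """Return directed edges (src_id, dst_id) that satisfy both connectivity rules."""
    edges = set()
    for a, b in permutations(cases, 2):
        gap = (b.onset - a.onset).days
        # Temporal connectivity: a's onset falls within the window before b's onset.
        # Assumption: cases with the same onset date are not linked.
        temporally_linked = 0 < gap <= window_days
        # Spatial connectivity: the two cases share at least one visited building.
        spatially_linked = bool(a.buildings & b.buildings)
        if temporally_linked and spatially_linked:
            edges.add((a.case_id, b.case_id))
    return edges


# Made-up example data, for illustration only.
cases = [
    Case(1, date(2020, 11, 20), frozenset({"B1"})),
    Case(2, date(2020, 11, 25), frozenset({"B1", "B2"})),
    Case(3, date(2020, 12, 20), frozenset({"B2"})),
]
edges = build_edges(cases)
```

In this toy example only case 1 points to case 2: cases 2 and 3 share a building, but their onset dates are more than 14 days apart. The pairwise loop is quadratic in the number of cases, which is acceptable for a few thousand cases; grouping cases by building first would be one way to reduce the work.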
The graph properties of the transmission network were defined as follows:

• Degree of a node: the number of connections (links) a node has to other nodes, denoted as $k$.
• Out-degree of a node: the number of connections that point from a node to other nodes.
• In-degree of a node: the number of connections that point to a node from other nodes.
• Maximum degree: the maximum value among all nodes' degrees, denoted as $k_{\max}$.
• Average degree: the average value of all nodes' degrees, denoted as $k_{\mathrm{aver}}$.
• Eccentricity of a node: the maximum distance from that node to all other nodes in the network.
• Degree distribution: the probability that a randomly selected node in the network has a degree of $k$.
• Power-law-like distribution and degree exponent: a power-law-like distribution has a degree distribution of $P(k) \propto k^{-\gamma}$, where $\gamma$ is the degree exponent.

More technical details of the procedure used to construct the transmission networks were provided in the Supplementary Materials section.

2.3. Discovery of cross-district and cross-population transmission patterns

For each node (case) and those nodes immediately pointed from it, we characterized the corresponding cross-district transmission pattern by counting their district-to-district occurrences. This quantified how frequently the virus was transmitted from one district to another by a specific patient. By summating the cross-district transmission patterns of all of the cases during a given period, we obtained an aggregated cross-district transmission pattern. This pattern indicated the intensity of disease transmission from one district to another over time.
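
As a rough illustration of the aggregation step, the sketch below counts group-to-group occurrences over the directed edges of a transmission network. It assumes that each directed edge contributes one count to the (source group, target group) cell, which is one plausible reading of the description; the exact procedure is in the Supplementary Materials. The function name `aggregate_pattern` and the sample data are hypothetical.

```python
from collections import Counter


def aggregate_pattern(edges, group_of):
    """Count (source group, target group) occurrences over all directed edges.

    `group_of` maps a case ID to its group label, e.g. a district index.
    Summing the per-case patterns is equivalent to one pass over the edges.
    """
    counts = Counter()
    for src, dst in edges:
        counts[(group_of[src], group_of[dst])] += 1
    return counts


# Made-up example: three cases in two districts.
edges = [(1, 2), (1, 3), (2, 3)]
district = {1: "D5", 2: "D5", 3: "D7"}
pattern = aggregate_pattern(edges, district)
```

Here `pattern[("D5", "D7")]` is 2 while `pattern[("D7", "D5")]` is 0, so the resulting matrix is directional rather than symmetric. Passing a mapping from case ID to age-group instead of district gives the corresponding age-to-age counts.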
Similarly, we constructed the cross-population transmission pattern for each case and those cases immediately pointed from it, as well as the aggregated cross-population transmission pattern. Together, these patterns measured the intensity of the disease transmission dynamics from one age-group to another. More technical details of the procedure used to uncover the cross-district and cross-population transmission patterns were provided in the Supplementary Materials section.

2.4. Role of the funding source

The funders of the study had no role in study design, data collection, data analysis, data interpretation, writing of the Article, or the decision to submit for publication. All authors had full access to all the data in the study and were responsible for the decision to submit the Article for publication.

2.5. Adherence to the RECORD guidelines

The manuscript adheres strictly to RECORD guidelines.

3. Results

3.1. Transmission network patterns

Between January 23, 2020, and January 8, 2021, 9,153 cases of COVID-19 infection were reported in Hong Kong. To explore the structures and patterns of the transmission dynamics of COVID-19 outbreaks in Hong Kong at the individual level, we constructed the transmission networks for the major waves of outbreaks using the methods introduced in the Methods section and in the Supplementary Materials section. Fig. 1(A) showed the daily new cases. Fig. 1(B) illustrated the transmission networks of the second, third, and fourth waves of outbreaks. The illustrations of the transmission networks were created using Gephi 0.9.2 [20]. Note that we did not emphasize Hong Kong's first wave of COVID-19 outbreaks, which occurred between late January and mid-February 2020.
Retrospectively, the first wave of outbreaks was much less significant in terms of the scale and duration of COVID-19 infections. Therefore, we focused on the second, third, and fourth waves. In our analysis, the durations of the second, third, and fourth waves were from March 17 to April 12, 2020, July 5 to September 21, 2020, and November 19, 2020, to January 8, 2021, respectively. The start date of a wave was the date from which the number of daily new cases was consistently larger than (or equal to) 10 and increased quickly. The end date of the wave was the date from which the number of daily new cases remained below 10 for at least two weeks and the trend stabilized. The start and end dates of each wave defined here were consistent with the government's announcements [21, 22].

As Fig. 1(B) shows, three individual-level transmission networks were constructed for the second, third, and fourth waves, respectively. All three networks showed patterns of densely connected clusters, indicating the existence of transmission hotspots. Moreover, the node degrees of all three transmission networks displayed power-law-like distributions. Fig. S1 provided more details. We also showed the eccentricity distributions of the transmission networks of these three waves in Fig. S2. Despite these commonalities, the three transmission networks also demonstrated clear quantitative differences in their transmission scales, durations, and patterns. First, the maximum degree $k_{\max}$ and the average degree $k_{\mathrm{aver}}$ were quite different for these three transmission networks: for the second wave, $k_{\max} = 64$ and $k_{\mathrm{aver}} = 3.77$; for the third wave, $k_{\max} = 112$ and $k_{\mathrm{aver}} = 6.00$; and for the fourth wave, $k_{\max} = 127$ and $k_{\mathrm{aver}} = 6.34$. This indicated the much greater severity of the third and fourth waves of outbreaks than the second wave in terms of transmission intensity (measured by the degree of the transmission networks).
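
For readers who want to see how the maximum and average degree relate to the directed network, a minimal sketch follows. It assumes that a node's degree is the sum of its in-degree and out-degree and that isolated cases count as nodes with degree 0; both are assumptions, and the function name `degree_summary` and the toy data are hypothetical.

```python
from collections import Counter


def degree_summary(node_ids, edges):
    """Return (k_max, k_aver, out_degree, in_degree) for a directed edge list."""
    out_deg = Counter(src for src, _ in edges)
    in_deg = Counter(dst for _, dst in edges)
    # Assumption: total degree = in-degree + out-degree; isolated nodes have degree 0.
    degree = {n: out_deg[n] + in_deg[n] for n in node_ids}
    k_max = max(degree.values())
    k_aver = sum(degree.values()) / len(degree)
    return k_max, k_aver, out_deg, in_deg


# Made-up example: four cases, one of them isolated.
nodes = [1, 2, 3, 4]
edges = [(1, 2), (1, 3), (2, 3)]
k_max, k_aver, out_deg, in_deg = degree_summary(nodes, edges)
```

With these three edges among four nodes, the total degree is 6, so the average degree is 1.5 and the maximum degree is 2.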
Moreover, the degree exponent $\gamma$ was also very different for the three transmission networks: for the second, third, and fourth waves, $\gamma = 1.59$, $2.18$, and $1.89$, respectively. This demonstrated that the degree distributions of the transmission networks of the third and fourth waves were steeper than that of the second wave, indicating the existence of highly connected nodes in the third and fourth waves. These highly connected nodes may represent super-spreader individuals who deserve the utmost attention in epidemiological investigations. Furthermore, despite the fourth wave having a lower total number of infections (3,673 cases) than the third wave (3,780 cases), the virus seemed more transmissible than it was during the third wave, as reflected by the maximum and average degrees of the transmission networks. This observation was also consistent with Fig. 1(B), which showed more densely connected clusters in the fourth wave than in the third wave.

To further explore the roles of different building categories in disease transmission, we divided the buildings into residential and non-residential ones. Fig. S3 showed the transmission networks in residential and non-residential buildings of the three major waves of COVID-19 outbreaks in Hong Kong. Although the number of residential buildings (4,294) did not differ greatly from that of non-residential buildings (5,953), the majority of the spatiotemporal connections (35,973 out of 50,789) appeared in non-residential buildings, indicating that non-residential scenarios dominated the disease transmission and contributed most of the highly connected nodes in all three waves. The reason might be that the scale of transmission in the household is generally small, while that in non-residential places such as clubs or restaurants could be very large. Therefore, effective surveillance and tracing of contacts and visits in non-residential buildings, especially public places, are of vital importance in controlling the disease spread.

To make our investigation more comprehensive, we explored various ways to establish the temporal connectivity. In the current analysis, we included both symptomatic and asymptomatic cases to ensure data integrity. For asymptomatic cases, the onset date was replaced by the report date. However, this substitution may be inaccurate, as asymptomatic cases did not develop symptoms. To make the analysis more complete, we also considered an alternative way to construct the temporal connectivity by excluding all asymptomatic cases. The comparison between the degree distribution of the transmission network with asymptomatic cases and that without asymptomatic cases was shown in Fig. S4(A). The degree distributions of both transmission networks demonstrated similar trends. Furthermore, we tested an alternative way to establish the temporal connectivity. Instead of setting a 14-day cutoff, we assumed that the transmission probability with respect to a specific length of serial interval followed a gamma distribution [17]. Then the temporal connectivity of two nodes was determined by this probability distribution. Fig. S4(B) showed the degree distributions of the transmission networks (with/without asymptomatic cases) constructed by the gamma distribution-based connectivity. Moreover, we conducted a sensitivity analysis on the degree distributions with respect to the number of cutoff days (from 1 to 14 days) and the number of days before symptom onset of index cases (from 0 to 5 days) for possible transmission.
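
The gamma-based alternative to the 14-day cutoff can be illustrated with the sketch below. The main text does not give the gamma parameters or state exactly how the probability decides a link, so everything here is an assumption made for illustration: the shape and scale values are placeholders, the link probability is the gamma density of the onset gap rescaled so that it peaks at 1, and a link is drawn by random sampling. The function names are hypothetical.

```python
import math
import random


def gamma_pdf(x, shape, scale):
    """Density of a gamma distribution with the given shape and scale."""
    if x <= 0:
        return 0.0
    return x ** (shape - 1) * math.exp(-x / scale) / (math.gamma(shape) * scale ** shape)


def link_probability(gap_days, shape=2.0, scale=2.5):
    """Relative likelihood of an onset gap, rescaled to lie in [0, 1].

    The default shape and scale are placeholders, not values from the study.
    """
    if shape <= 1:
        raise ValueError("this sketch needs shape > 1 so that the density has an interior mode")
    mode = (shape - 1) * scale
    return gamma_pdf(gap_days, shape, scale) / gamma_pdf(mode, shape, scale)


def temporally_linked(gap_days, rng, shape=2.0, scale=2.5):
    """Assumption: draw the temporal link at random with the rescaled probability."""
    return rng.random() < link_probability(gap_days, shape, scale)


rng = random.Random(0)
linked_same_day = temporally_linked(0, rng)  # gap of 0 days has probability 0
```

Under this sketch, short positive gaps near the mode of the distribution are linked most often, long gaps are linked rarely instead of being cut off sharply at 14 days, and non-positive gaps are never linked.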
From Fig. S5, we can see that the degree distributions with various settings were relatively robust, indicating the consistency of the proposed spatiotemporal analysis.

To examine in more detail the transmission patterns of the outbreaks, we magnified the transmission network of the fourth wave (Fig. 2). The size of a node indicated its degree of connectivity and the different colors indicated the different case clusters identified and reported by the Hong Kong government [19]. For clarity of illustration, we only colored the 11 major clusters in the figure. The confirmed cases clearly formed different clusters/groups. The largest cluster was the Singing and Dancing cluster (colored orange), which included more than 700 confirmed cases from over 20 singing and dancing venues/clubs distributed across 7 of Hong Kong's 18 districts [23]. More importantly, the figure indicated that many other clusters and individual cases were connected with the Singing and Dancing cluster. For instance, the square node of case #6346 was a connector that linked two large clusters: the Singing and Dancing cluster and the Construction Sites at LOHAS Park/Kai Tak cluster (colored teal). The longest transmission path over the network (colored blue) also supported this: it passed through 12 cases (#5446, #5585, #5550, #5588, #6544, #5891, #6827, #7478, #8394, #8579, #8687, and #8824), 5 of which (#5585, #5550, #5588, #6544, and #5891) belonged to the Singing and Dancing cluster. The remaining seven cases belonged to other clusters. All of the observations from the figure were consistent with the Singing and Dancing cluster being the largest super-spreader group triggering Hong Kong's fourth wave of COVID-19 infections [24].
Identifying densely connected clusters can guide the accurate implementation of lockdowns or stringent mandatory coronavirus testing in high-risk groups/areas [25, 26], so as to obstruct the chain of transmission in a timely manner.

Fig. 1 (caption fragment): ... Gephi is an open-source and multiplatform software distributed under the dual license CDDL 1.0 and GNU General Public License v3. Note: Hong Kong's first wave of COVID-19 outbreaks between late January and mid-February 2020 was also shown in (A) but not emphasized in this study. Retrospectively, the first wave of outbreaks was much less significant in terms of the scale and duration of COVID-19 infections than the other waves. Therefore, we focused on the second, third, and fourth waves.

3.2. Cross-district and cross-population transmission patterns throughout different outbreak phases

To facilitate the accurate deployment of control strategies in high-risk locations or age-groups for the effective mitigation of disease spread, it is necessary to discover the cross-district and cross-population transmission patterns throughout different outbreak phases. We divided the entire Hong Kong region into 18 administrative districts (the names and indices of which are listed in Table S1) and its entire population into 11 age-groups (the details of which are listed in Table S2). Using the methods described in the Supplementary Materials section, we then calculated the aggregated cross-district and cross-population transmission patterns of the second, third, and fourth waves of COVID-19 outbreaks in Hong Kong. Fig. 3(B,C) illustrated the heat maps of the constructed cross-district and cross-population transmission patterns, respectively.
The construction procedure and heat maps showed that the matrices were not symmetrical, indicating that our method can appropriately reflect directional transmission from one district (or age-group) to another.

For the second wave of outbreaks, the spread among districts (D1: Central & Western, D4: Wan Chai, and D5: Yau Tsim Mong) and the spread among age-groups (A3-A6: 20-59) were relatively concentrated. Thus, the epidemic was relatively easy to control. In the third wave, the spread among districts remained relatively concentrated, especially in D7 (Kowloon City) and D9 (Wong Tai Sin). However, transmission occurred across almost all age-groups. In this situation, it would be more appropriate to make district-specific control policies (e.g., strengthening social distancing strategies in severely affected districts) than age-specific control policies. For the fourth wave, although the statistical distribution of the daily case numbers was somewhat similar to that of the third wave, the transmission dynamics over districts (and age-groups) were quite different. The cross-district transmission of the fourth wave was more diverse than that of the third wave, with D5 (Yau Tsim Mong) at the center of transmission. Meanwhile, the transmission over age-groups of the fourth wave was more concentrated than that of the third wave. This indicated that implementing age-specific intervention or control strategies (e.g., prioritization of vaccination for high-risk age-groups, or closure or access control of places for age-specific population gathering such as kindergartens, dancing clubs, or elderly care homes) during the fourth wave might have been more effective than doing so during the third wave.

3.3. Overall statistics for and general characteristics of the transmission patterns

To establish an overall understanding of the transmission connectivity among all confirmed cases during the past year, we summarized the distribution of cases in terms of their level of connectivity with other cases (i.e., degrees) in Tables 1 and 2. The out-degree and in-degree distributions were summarized according to gender, age, whether the individual was asymptomatic, residency, case classification, the connection with other cases, the time from onset to case confirmation, and district. To provide a more intuitive picture of the relationships between the out-degree/in-degree distributions and the abovementioned eight variables, we visualized all variable-degree histograms in Fig. S6.

The gender distribution showed more cases among females. The age distribution showed that the reported cases were mainly distributed among individuals 20 to 69 years old (boxed in blue) in all out-degrees and in-degrees. This was consistent with the fact that most people who were active in public places fell into this age range and thus had a higher probability of spreading the virus (corresponding to the out-degree) and becoming infected (corresponding to the in-degree) than children/youths (0-19 years old) and the elderly (≥ 70 years old). Most of the high out-degree and in-degree cases were among 40- to 69-year-olds. The fact that people in these age-groups tended to gather in crowds or participate in activities involving many participants, such as dancing, singing, and Yum Cha (morning tea), explained this finding. Regarding classification, most cases were local or epidemiologically linked with local cases (boxed in green), whereas the impact of imported cases was relatively limited.
The district distribution of the cases showed that Yau Tsim Mong and Wong Tai Sin contributed most of the high-degree cases (boxed in red). As this indicated that these two districts were high-risk regions during outbreaks, they should be considered high-priority regions for disease control.

Fig. 3 (caption fragment): ... Table S1. (C) The heat maps of the cross-population transmission patterns of the second, third, and fourth waves of outbreaks, where A1 to A11 denote the indices of the population's 11 age-groups. More details were provided in Table S2.

Table 1. Comparison of cases with different out-degrees on the transmission network of COVID-19 in Hong Kong between January 23, 2020, and January 8, 2021. The data were represented in the format n (%), where n denotes the number of cases in each category and the value in brackets denotes the corresponding percentage. Furthermore, k denotes the out-degree of cases (nodes) and P-values were calculated based on Pearson's chi-square test and Fisher's exact test.

Table 2. Comparison of cases with different in-degrees on the transmission network of COVID-19 in Hong Kong between January 23, 2020, and January 8, 2021. The data were represented in the format n (%), where n denotes the number of cases in each category and the value in brackets denotes the corresponding percentage. Furthermore, k denotes the in-degree of cases (nodes) and P-values were calculated based on Pearson's chi-square test and Fisher's exact test.

... HKBU12201619). Y. Liu acknowledges the support provided by the HKBU/CSD Departmental Start-up Fund for New Assistant Professors.

Supplementary materials

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.eclinm.2021.100929.

References

• WHO coronavirus disease (COVID-19) dashboard
• Effectiveness of isolation, testing, contact tracing, and physical distancing on reducing transmission of SARS-CoV-2 in different settings: a mathematical modelling study (on behalf of the CMMID COVID-19 working group)
• Dance off: the niche Hong Kong social scene behind city's biggest Covid-19 cluster
• Visualising Hong Kong's biggest Covid-19 superspreader event
• More areas come under stricter COVID testing rules. rthk.hk

4. Discussion

We investigated the transmission patterns and dynamics of COVID-19 outbreaks in Hong Kong using spatiotemporal network analysis at both the individual and meta-population levels. The results offered some implications for the assessment of transmission risk and the deployment of policies and strategies for preventing disease spread.

First, the majority of confirmed cases were local or epidemiologically linked with local cases. Compared with the limited impact of imported cases, local cases dominated the outbreaks. Therefore, timely detection and treatment of the latest local cases is the key to preventing COVID-19 outbreaks, especially when external/international transportation is already under strict control.

Second, among all the districts in our analysis, only a few (e.g., the Yau Tsim Mong district) played a key role in spreading the virus throughout the pandemic, especially during the fourth wave. In fact, the population density in these areas is extremely high. Taking the Yau Tsim Mong district as an example: it is one of the most densely populated districts in Hong Kong, with over 48,000 persons per square kilometer.
Moreover, these areas generally possess a large number of public venues for social gatherings, such as restaurants and singing/dancing clubs, thereby providing opportunities for the widespread and rapid transmission of the virus. Thus, these districts should be given the highest priority when deploying district-specific social distancing or intervention strategies. This implication was consistent with the Hong Kong government's recent lockdown and stringent mandatory coronavirus testing policies in the densely populated areas of Jordan, Mong Kok, and Yau Ma Tei in the Yau Tsim Mong district for identifying and obstructing the chain of transmission [25, 26]. Note that the data we used in our analysis covered January 23, 2020, to January 8, 2021, while the government's policies were implemented on January 23, 2021, about two weeks after the end of our study period. The consistency between our analysis results and the implemented policies indicated that the proposed spatiotemporal connectivity analysis enables us to provide an early warning of the high-priority regions or locations for intervention, demonstrating the potential of the analysis in predicting the trend of the outbreak. Furthermore, the cross-district and cross-population transmission matrices constructed from the spatiotemporal connectivity analysis can be regarded as a generalization of the next-generation matrices of compartmental models for future epidemic forecasting.

Third, the reported cases and highly connected cases were mainly middle-aged and elderly people (e.g., 40- to 69-year-olds). Compared with young people, these individuals may be more active in various social activities and more likely to gather in large public spaces, such as restaurants and singing/dancing clubs, and thus have higher chances of being infected by or infecting others. Therefore, the current policy prohibiting group gatherings implemented by the government can effectively help reduce the transmission risk.
The advantage of implementing age-specific control strategies is that it enables decision-makers to take age-specific susceptibility, transmissibility, mobility patterns, and social habits into consideration in the risk estimation as well as in the corresponding intervention strategies, so that the high-risk age-groups can be identified and treated in a timely and effective way. However, compared to location-specific strategies (e.g., lockdown and mandatory testing in a specific region), age-specific strategies are not as easy to implement because of the mobility and uncertainty of the population in or between different regions/locations. One possible way to benefit from the advantages of both age-specific and location-specific strategies would be to implement the age-specific control or interventions in a location-specific manner. For example, to address the risk associated with the gathering of middle-aged and elderly people (40- to 69-year-olds), we could implement a social gathering ban in places where such high-risk age-groups are most likely to gather, such as singing/dancing clubs. By doing so, the age-specific transmission could be contained in a more informed and effective way.

Finally, all of the waves of COVID-19 outbreaks in Hong Kong, even those with similar statistical distributions (Figs. 2 and 3), can be systematically scrutinized, characterized, and differentiated with respect to their spatiotemporal and population-specific transmission patterns. To achieve effective control of the disease spread, tailor-made strategies must be developed and deployed according to the transmission dynamics rather than a static or overall statistic.
By revealing the transmission patterns of COVID-19 among different age-groups within/across populated geographic districts/locations throughout different outbreak phases, we provided objective and quantifiable guidance for the control and prevention of COVID-19 outbreaks not only in Hong Kong but also in other international metropolises with similar population density, connectivity, and mobility characteristics.

The proposed spatiotemporal analysis method is general at the methodological level and instructive in understanding the COVID-19 transmission patterns among different age-groups within/across populated geographic locations throughout different outbreak phases. However, when applying the method to other cities or different time periods, the location/time-specific settings, such as the data scale and availability, the testing capacity, the case categories, and the distribution of age-specific populations and mobility patterns, should be taken into consideration to make the analysis results more situation-specific and informative. For example, compared to other large cities, Hong Kong had relatively few cases of COVID-19 during its various waves and fairly complete data. Moreover, different cities/regions, during different time periods, may also present heterogeneities in other aspects related to disease transmission, e.g., the testing capacity, the percentages of indigenous and imported cases, and the mobility patterns and social habits of different age-groups. Therefore, the numerical results reported in this study may not be directly applicable to other situations.
To ensure that the analysis is tailor-made and the results are context-aware, specific data, information, and settings in other cities/regions or different time periods should be adaptively used to guide the location/time-specific parameterization of the model.

Note that in this work, the information on activities/clusters (e.g., the singing/dancing clubs) that may have led to the transmission was obtained directly from the data website of the Hong Kong government [27], not inferred from our analysis. What we did was integrate the spatiotemporal connectivity method with the obtained activity/cluster-related information to conduct a retrospective analysis. In fact, the reported information on activities or contacts that led to the transmission could be uncertain because the incubation period for COVID-19 ranges from 1 to 14 days [28], making the accurate determination of infection sources difficult. Therefore, instead of providing a deterministic inference on the activities/contacts that led to each transmission, the analysis conducted in this paper aimed to offer a better understanding of the spatiotemporal dynamics and trends of the disease spread, so as to indicate the potential scenarios that may have a relatively high risk of transmission for targeted control and intervention.

5. Data sharing

The modeling/analytics tools of this study will be made publicly available on: http://aic.hkbu.ai/.

CRediT authorship contribution statement

Yang Liu: Conceptualization, Visualization, Data curation, Formal analysis, Writing - original draft. Zhonglei Gu: Data curation, Formal analysis, Writing - original draft.
Jiming Liu: Conceptualization, Visualization, Data curation, Formal analysis, Writing - original draft.

Declaration of Interests

We declare no competing interests.