key: cord-0940918-hp8pswt7 authors: Liu, Y.; Gu, Z.; Xia, S.; Shi, B.; Zhou, X.-N.; Shi, Y.; Liu, J. title: What are the Underlying Transmission Patterns of COVID-19 Outbreak? – An Age-specific Social Contact Characterization date: 2020-04-18 journal: EClinicalMedicine DOI: 10.1016/j.eclinm.2020.100354 sha: ee90f05aa5a937db108c1c8f8e033e67ed2b4ef9 doc_id: 940918 cord_uid: hp8pswt7 Abstract Background COVID-19 has spread to 6 continents. Now is opportune to gain a deeper understanding of what may have happened. The findings can help inform mitigation strategies in the disease-affected countries. Methods In this work, we examine an essential factor that characterizes the disease transmission patterns: the interactions among people. We develop a computational model to reveal the interactions in terms of the social contact patterns among the population of different age-groups. We divide a city's population into seven age-groups: 0-6 years old (children); 7-14 (primary and junior high school students); 15-17 (high school students); 18-22 (university students); 23-44 (young/middle-aged people); 45-64 years old (middle-aged/elderly people); and 65 or above (elderly people). We consider four representative settings of social contacts that may cause the disease spread: (1) individual households; (2) schools, including primary/high schools as well as colleges and universities; (3) various physical workplaces; and (4) public places and communities where people can gather, such as stadiums, markets, squares, and organized tours. A contact matrix is computed to describe the contact intensity between different age-groups for each of the four settings. By integrating the four contact matrices with the next-generation matrix, we quantitatively characterize the underlying transmission patterns of COVID-19 among different populations. 
Findings We focus our study on 6 representative cities in China: Wuhan, the epicenter of COVID-19, together with Beijing, Tianjin, Hangzhou, Suzhou, and Shenzhen, which are five major cities from three key economic zones. The results show that the social contact-based analysis can readily explain the underlying disease transmission patterns as well as the associated risks (including both confirmed and unconfirmed cases). In Wuhan, the age-groups involving relatively intensive contacts in households and public/communities are dispersedly distributed. This can explain why the transmission of COVID-19 in the early stage mainly took place in public places and families in Wuhan. We estimate that Feb. 11, 2020 was the date with the highest transmission risk in Wuhan, which is consistent with the actual peak period of the reported case number (Feb. 4-14). Moreover, the surge in the number of new cases reported on Feb. 12-13 in Wuhan can readily be captured using our model, showing its ability in forecasting the potential/unconfirmed cases. We further estimate the disease transmission risks associated with different work resumption plans in these cities after the outbreak. The estimation results are consistent with the actual situations in the cities with relatively lenient control policies, such as Beijing, and those with strict control policies, such as Shenzhen. Interpretation With an in-depth characterization of age-specific social contact-based transmission, the retrospective and prospective situations of the disease outbreak, including the past and future transmission risks, the effectiveness of different interventions, and the disease transmission risks of restoring normal social activities, are computationally analyzed and reasonably explained. 
The conclusions drawn from the study not only provide a comprehensive explanation of the underlying COVID-19 transmission patterns in China, but more importantly, offer the social contact-based risk analysis methods that can readily be applied to guide intervention planning and operational responses in other countries, so that the impact of COVID-19 pandemic can be strategically mitigated. Funding General Research Fund of the Hong Kong Research Grants Council; Key Project Grants of the National Science Foundation of China.
To our knowledge, this is the first work that explicitly characterizes and quantifies the underlying transmission patterns among different populations throughout different phases of the COVID-19 outbreak.
We show that the age-specific social contact patterns can accurately characterize the interactions among different groups of people, and thus provide explanations of the underlying disease transmission and associated risks in different phases of the outbreak. We analyze the situations in 6 representative cities in China: Wuhan, the epicenter of the outbreak, and Beijing, Tianjin, Hangzhou, Suzhou, and Shenzhen, five cities situated in three major economic zones. This work touches upon an important problem at a critical moment in time: COVID-19 has spread to 180 countries on 6 continents, and a deeper understanding of what may have happened in the outbreak is now overdue. Addressing this key question enables us to gain insights into the retrospective and prospective situations of the disease outbreak; this in turn will help answer a series of questions in precision control and prevention of the disease, namely, how future risks and trends in different regions may evolve, how effective different intervention strategies can be in controlling the outbreak, and what may happen if people gradually return to schools and workplaces in the later stage of the outbreak. Thus, through the prism of the outbreak in China, this work offers […]

The novel coronavirus disease has spread widely at the global level. 1 We select 6 major cities in China for our study: Wuhan, Beijing, Tianjin, Hangzhou, Suzhou, and Shenzhen; their geographical locations and the disease situations (in terms of total case number from Dec. 2019 to Feb. 2020) are shown in Figure 1. A reason to select these cities for our study is that their populations contain a large number of migrant workers and college students from other cities or provinces.
The frequent human mobility greatly increases the risk of imported cases, posing great challenges to the control and prevention of COVID-19, especially when people are gradually returning to workplaces and schools in a later stage. The data used in our study include: the daily confirmed cases from Dec. 8 […]; and the demographic data of Wuhan, 14 Beijing, 15 Tianjin, 16 Hangzhou, 17 Suzhou, 18 and Shenzhen. 19

The underlying transmission patterns of COVID-19 among different populations are difficult to characterize because they are complex and related to various observations and disease-related factors, including the number of confirmed cases, the potential risks brought by unconfirmed cases, the distribution of different case categories (indigenous/imported) in different regions/cities, the population distribution of different age-groups, the social contact patterns in different settings (e.g., households, schools, workplaces, and public places), and the extent of interventions implemented in different regions/cities. To address this challenging issue in a fundamental way, we examine an essential factor that characterizes the disease transmission patterns: the interactions among people. 20, 21 Specifically, we examine the interactions in terms of the social contact patterns among the population of different age-groups. To characterize the age-specific social contact-based transmission, we divide a city's population into seven age-groups: 0-6 years old (children); 7-14 (primary and junior high school students); 15-17 (high school students); 18-22 (university and college students); 23-44 (young/middle-aged people); 45-64 years old (middle-aged/elderly people); and 65 or above (elderly people). The population in each of the seven groups has its own specific social circles, gathering places, or activity patterns.
Meanwhile, we consider four representative settings of social contacts that may cause the disease spread: (1) individual households, which may lead to the transmission within families; (2) schools, including primary/high schools as well as colleges and universities, which may cause the spread among students and teachers; (3) various physical workplaces, which may affect in-office and outside workers; and (4) public places and communities, such as stadiums, markets, squares, and organized tours, where the spread within a dense population may arise. Let $G_1$ to $G_7$ be the seven age-groups: […] (1) where $C^H_{ij}$, $C^S_{ij}$, $C^W_{ij}$, and $C^P_{ij}$ denote the total number of contacts between individuals from $G_i$ and those from $G_j$ under the settings of Households, Schools, Workplaces, and Public/community, respectively, $P_i$ and $P_j$ denote the populations of $G_i$ and $G_j$, and $C^H$, $C^S$, $C^W$, and $C^P$ denote the 7×7 social contact matrices. In Eq. (1), we use demographic data to calculate $P_i$ ($i = 1, …, 7$). For $C^H$, $C^S$, $C^W$, and $C^P$, as the city-specific data of social contacts between age-groups is unavailable, we adopt a computational method 20 to estimate them. The appropriateness of using such a computational method for social-contact estimation in data-scarce situations has been validated 20 : the estimated $C^H$, $C^S$, $C^W$, and $C^P$ are consistent with the results from a real-world social contact survey 22 in terms of the strong assortativeness and the appearance of similar secondary diagonal contact patterns. Next, we represent the overall age-specific social contact matrix as a linear combination of the above four matrices: 21

$$C = r_H C^H + r_S C^S + r_W C^W + r_P C^P$$

where $r_H, r_S, r_W, r_P \ge 0$ are the weights of matrices $C^H$, $C^S$, $C^W$, and $C^P$, respectively, and satisfy that […] B19.
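The linear combination of the four setting-specific contact matrices can be illustrated with a minimal sketch. The function name `combine_contact_matrices`, the 2×2 toy matrices, and the equal weights below are all hypothetical choices made for illustration; they are not the 7×7 matrices or the weight values used in the study, which are not recoverable from this text.

```python
from typing import Dict, List

Matrix = List[List[float]]


def combine_contact_matrices(matrices: Dict[str, Matrix],
                             weights: Dict[str, float]) -> Matrix:
    """Return the weighted sum of the setting-specific contact matrices."""
    if set(matrices) != set(weights):
        raise ValueError("each setting needs exactly one weight")
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    size = len(next(iter(matrices.values())))
    combined = [[0.0] * size for _ in range(size)]
    for setting, matrix in matrices.items():
        for i in range(size):
            for j in range(size):
                combined[i][j] += weights[setting] * matrix[i][j]
    return combined


# Toy 2x2 matrices (two age-groups) with made-up values; the study uses 7x7.
toy_matrices = {
    "H": [[2.0, 1.0], [1.0, 2.0]],  # households
    "S": [[4.0, 0.0], [0.0, 0.0]],  # schools
    "W": [[0.0, 0.0], [0.0, 2.0]],  # workplaces
    "P": [[1.0, 1.0], [1.0, 1.0]],  # public places / communities
}
# Hypothetical equal weights, not the values used in the paper.
toy_weights = {"H": 0.25, "S": 0.25, "W": 0.25, "P": 0.25}

overall = combine_contact_matrices(toy_matrices, toy_weights)
print(overall)
```

Keeping the weights in a dictionary keyed by setting makes it straightforward to change a single weight, for example to lower the workplace weight while workplaces are closed.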
23 The results in Figure 3 show that our model with the above parameter settings can adequately capture the disease trends in different cities; our sensitivity analysis also confirms that the developed model (to be described below) is relatively robust to the parameter settings. With the overall age-specific social contact matrix $C$, we can characterize the disease transmission pattern using the next-generation matrix $K_t$: 24 […] where $S_t$, $B$, $A$, and $K_t$ are 7×7 matrices and $I_t$ is a 7×1 vector. Specifically, $S_t$, $B$, and $A$ are diagonal matrices, with the diagonal elements $s^t_{ii}$, $b_{ii}$, and $a_{ii}$ ($i = 1, …, 7$) being the size of the susceptible population in $G_i$ at the $t$-th generation of the disease infection, the individual susceptibility in $G_i$, and the infectivity of infected individuals in $G_i$, respectively. The $i$-th element of vector $I_t$ denotes the number of infectious individuals in $G_i$ at the $t$-th generation of the disease infection. Following Li et al.'s work, 6 we set the reproduction number $R_0 = 2.2$. For the recovery rate $\gamma$, we calculate it as follows: First, by definition, 25 the recovery rate is the reciprocal of the duration of being infectious (i.e., $\gamma$ = 1/infectious period). Then, according to Svensson's work, 26 […] 7 indicates that the mean incubation period is 5.2 days for COVID-19. As no precise infection dates are available for those patients to estimate the mean latent period, we use the mean incubation period to approximate the mean latent period. Therefore, with the mean serial interval of 7.5 days 6 and the approximated mean latent period of 5.2 days, the recovery rate is estimated as $\gamma = 1/(7.5 - 5.2) \approx 0.43$ per day. For the infectivity, we set $a_{ii} = 1.0$ (for $i = 1, …, 7$) according to Xia et al.'s work. 21 For the susceptibility $b_{ii}$, as it represents the probability of being infected when a susceptible individual is exposed to infectious contacts, we estimate it as follows: For each $G_i$, we first calculate its infected population ratio $r_i$ by dividing the number of infected cases in $G_i$, $n_i$, by $P_i$, i.e., $r_i = n_i/P_i$.
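As a numerical illustration of the parameter estimates in this part of the Methods, the sketch below computes the recovery rate from the stated 7.5-day serial interval and 5.2-day latent-period approximation, and the susceptibility values obtained by dividing each infected population ratio by the smallest one, as the Methods specify for $b_{ii}$. The function names and the three-group case and population counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import List


def recovery_rate(serial_interval_days: float = 7.5,
                  latent_period_days: float = 5.2) -> float:
    """gamma = 1 / infectious period, with the infectious period taken as
    the serial interval minus the (approximated) latent period."""
    infectious_period = serial_interval_days - latent_period_days
    if infectious_period <= 0:
        raise ValueError("infectious period must be positive")
    return 1.0 / infectious_period


def susceptibility(cases: List[float], population: List[float]) -> List[float]:
    """b_ii = r_i / min(r), where r_i = n_i / P_i for each age-group."""
    ratios = [n / p for n, p in zip(cases, population)]
    smallest = min(ratios)
    if smallest <= 0:
        raise ValueError("every age-group needs a positive infected ratio")
    return [r / smallest for r in ratios]


gamma = recovery_rate()
# Hypothetical counts for three age-groups (the study uses seven).
b = susceptibility(cases=[10, 40, 30], population=[1000, 2000, 1000])

print(round(gamma, 3))
print([round(x, 2) for x in b])
```

Note that this normalization is undefined when an age-group has no reported cases, which is why the sketch raises an error in that situation; how the study handles such groups is not stated in the text.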
With $r_i$ calculated for all 7 age-groups, we then obtain a multiplier, $1/\min\{r_1, …, r_7\}$, by normalizing the smallest $r_i$ to 1, and inflate all other infected population ratios by the same multiplier. We then estimate the susceptibility as $b_{ii} = r_i/\min\{r_1, …, r_7\}$. As different cities have different numbers of infected cases and different population sizes, they […] Similarly, if we consider different work resumption plans, we will need to increase these weights in proportion to the rate of work resumption. Specifically, we reduce $r_W$ from its original value to 0 as of Jan. 23 (the starting date of implementing stringent public health control policies). Moreover, we gradually recover its value from the starting date of our work resumption plans to reflect the effect of "back-to-work" policies. We apply a similar rationale to $r_S$ and $r_P$. For $r_H$, as the public social distancing policies would increase social contacts within households, we increase the value of $r_H$ starting from Jan. 23, and gradually reduce it to its original value once the "back-to-work" policy kicks in.

The funders of the study had no role in study design, data collection, data analysis, data interpretation, writing of the Article, or the decision to submit for publication. All authors had full access to all the data in the study and were responsible for the decision to submit the Article for publication.

As can be seen in Figure 2, the distribution of age-groups involving relatively intensive contacts in households and public/communities is rather scattered, and thus the disease can easily spread among different age-groups in these two settings. This is consistent with the observation that the transmission of COVID-19 in the early stage mainly took place in public places and families. In contrast, the distribution of age-groups with intensive contacts in schools and workplaces is relatively concentrated.
Moreover, the composition of people in these two settings is relatively stable, making the management easier than that in public places or communities. Because most of the schools and workplaces were closed before the Chinese Spring Festival and have not yet reopened, the scale of the COVID-19 outbreak in these two settings is relatively limited. However, if normal educational and economic activities are to be resumed, a large number of students and staff will gather in these two settings, which may bring a real challenge to the control and prevention of COVID-19 infection in these places. For different cities, the transmission patterns of COVID-19 might be different. In Wuhan, the cases were mainly indigenous. In the other 5 cities, the cases might be either indigenous or imported from Hubei. Therefore, for these 5 cities, we need to take both indigenous cases and imported cases into consideration when investigating the transmission patterns among populations. To model the potential local transmission risk caused by the imported cases, we use the following approach: First, for each confirmed case, we identify whether it is imported or indigenous according to the information provided by the Municipal Health Commission. 8-13 If the case is an imported case, we consider its potential risk of causing local transmission. According to Li et al.'s results, 6 the mean serial interval is 7.5 days, so we assume that each imported case, from the day of arrival to the day of hospitalization, could infect 1/7.5 person per day. We apply the same principle to all imported cases to estimate their potential infections. Those cases infected by the imported ones are considered as potential cases in our study. The confirmed cases and potential cases together constitute the disease transmission risk, i.e., the focus of the following retrospective and prospective analyses.
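The potential-case estimate for imported cases described above can be sketched as follows. This is a minimal illustration, assuming that the arrival day is counted and the hospitalization day is not (the text does not specify how the endpoints are treated); the function name `daily_potential_cases` and the two example cases are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta
from typing import Dict, List, Tuple

MEAN_SERIAL_INTERVAL_DAYS = 7.5  # mean serial interval cited in the text


def daily_potential_cases(
    imported: List[Tuple[date, date]]
) -> Dict[date, float]:
    """Spread 1/7.5 potential infections per imported case over each day
    from arrival (inclusive) to hospitalization (exclusive, an assumption)."""
    per_day = 1.0 / MEAN_SERIAL_INTERVAL_DAYS
    totals: Dict[date, float] = defaultdict(float)
    for arrival, hospitalization in imported:
        if hospitalization < arrival:
            raise ValueError("hospitalization cannot precede arrival")
        days_at_large = (hospitalization - arrival).days
        for offset in range(days_at_large):
            totals[arrival + timedelta(days=offset)] += per_day
    return dict(totals)


# Two hypothetical imported cases: (arrival date, hospitalization date).
example_cases = [
    (date(2020, 1, 20), date(2020, 1, 26)),  # 6 days before hospitalization
    (date(2020, 1, 24), date(2020, 1, 27)),  # 3 days before hospitalization
]
potential = daily_potential_cases(example_cases)
total_potential = sum(potential.values())
print(round(total_potential, 2))
```

Summing the daily values gives the total number of potential cases attributed to the imported cases; adding them to the daily confirmed counts yields the transmission risk series the text refers to.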
With the age-specific social contact-based transmission modeling, we are able to describe and explain what may have happened retrospectively and what can be anticipated prospectively in the COVID-19 outbreak. Figure 3 shows the estimated trends of disease infection and the transmission risks associated with different work resumption plans based on the social contact patterns and reported cases. From the results of Wuhan (Figure 3(A)), we can observe that the situation without intervention (the brown line) is estimated to be much more severe than that with interventions (the blue line), indicating the effectiveness of the interventions implemented in Wuhan. Here the interventions refer to various social distancing measures, including quarantine of patients, closure of workplaces and schools, suspension of public transportation, and requirement for people to wear masks. 29, 30, 31 It can also be observed from Figure 3 […]

Figure 3: […] bars denote the newly confirmed cases reported every day, while the light red bars denote the potential cases locally infected by the imported cases, which are estimated according to the mean serial interval of 7.5 days. 6 Plans A1-A3 refer to the plans that start on Feb. 17 (Monday) and finish the resumption within 1 week, half a month, and 1 month, respectively. Plans B1-B3 refer to the plans that start on Feb. 24 (Monday) and finish the resumption within 1 week, half a month, and 1 month, respectively.

The COVID-19 pandemic has taken the global economy by storm. As the public health crisis escalates, we analyze what will happen if social and business activities (including those in public places and workplaces) are gradually restored from strong control and isolation to the normal situation. We analyze the disease transmission risks associated with different work resumption plans in Beijing, Tianjin, Hangzhou, Suzhou, and Shenzhen, respectively.
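The way a work resumption plan feeds into the workplace contact weight can be sketched as a simple time-dependent schedule. The Methods state that $r_W$ drops to 0 as of Jan. 23 and gradually recovers from the plan's starting date; the linear ramp, the 15-day reading of "half a month", the function name `workplace_weight`, and the original weight of 0.2 used below are assumptions made only for illustration.

```python
from datetime import date

CONTROL_START = date(2020, 1, 23)  # start of stringent control policies

# (start date, days to complete resumption); 15 and 30 days are assumed
# readings of "half a month" and "1 month".
PLANS = {
    "A1": (date(2020, 2, 17), 7),
    "A2": (date(2020, 2, 17), 15),
    "A3": (date(2020, 2, 17), 30),
    "B1": (date(2020, 2, 24), 7),
    "B2": (date(2020, 2, 24), 15),
    "B3": (date(2020, 2, 24), 30),
}


def workplace_weight(day: date, original_weight: float, plan: str) -> float:
    """Workplace weight on a given day under a resumption plan, using a
    linear recovery from 0 to the original weight (an assumption)."""
    plan_start, ramp_days = PLANS[plan]
    if day < CONTROL_START:
        return original_weight      # before interventions
    if day < plan_start:
        return 0.0                  # workplaces closed
    elapsed = (day - plan_start).days
    return original_weight * min(1.0, elapsed / ramp_days)


# Hypothetical original weight of 0.2 for the workplace setting.
for name in ("A1", "A2", "A3"):
    print(name, round(workplace_weight(date(2020, 2, 24), 0.2, name), 3))
```

Under these assumptions, one week after the Feb. 17 start Plan A1 has fully restored the workplace weight, while Plans A2 and A3 have restored only part of it; the same schedule shape could be applied to the school and public-place weights.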
Since Wuhan was in a serious situation during the outbreak, its work resumption may take longer than in the other 5 cities; its detailed plans for the resumption and associated risks are discussed in the Supplementary material, as shown in Figure S1. We conduct our prospective study on 2 sets of work resumption plans (Plans A1-A3 and Plans B1-B3) and accordingly examine their associated risks of disease transmission. Plans A1-A3 resume work when the disease transmission is well under control, i.e., the number of new reported cases is about to become zero. Specifically, Plans A1-A3 start on Feb. 17 (Monday) and finish the resumption within 1 week, half a month, and 1 month, respectively. Plans B1-B3 are stricter than Plans A1-A3: they resume work when the number of new reported cases has been zero for three consecutive days. Therefore, Plans B1-B3 start on Feb. 24 (Monday) and, similarly, finish the resumption within 1 week, half a month, and 1 month, respectively. In order to parameterize our model for risk prediction with different work resumption plans, we […] Feb. 11 until its resumption plan begins, the weekly work resumption rate will be 10%. For Suzhou, we use the same resumption settings as Hangzhou, as both are in the Yangtze River Delta Economic Zone. Last but not least, for Shenzhen, according to the municipal government regulations, general enterprises may not resume work before 24:00 on Feb. 9. 36 Therefore, we set Shenzhen's weekly work resumption rate to be 10% from Feb. 10 until its resumption plan begins. As shown in Figures 3(B)-(F), Plans B1-B3 represent a stricter work resumption policy: they start 1 week later than Plans A1-A3. The estimation of disease transmission risks is consistent with actual situations. For example, Beijing implemented a relatively lenient policy on the early resumption of work, and thus has had several new cases reported every day during the past two weeks; this is consistent with our estimated risk trend for Beijing with Plans A1-A3.
In contrast, Shenzhen still strictly controlled the resumption of work, so no new cases were reported during Feb. 24-29. 13 This fact is consistent with our estimation of Shenzhen's risk with the stricter Plans B1-B3. […] possible and complete the resumption as slowly as possible, which corresponds to the least expected GDP growth. Alternatively, it would be practically desirable and feasible to gradually bring work back to normal, while keeping the necessary control measures so as to mitigate the potential transmission to the lowest level possible. In such a case, Plan B1 would be preferred, as it achieves a good balance between a well-controlled risk and an acceptable productivity in the cities.

We conduct a sensitivity study to examine how the analytic results vary with respect to variations in different age-groups and various social contact patterns. Specifically, we analyze the sensitivity of the estimated disease trends with respect to changes in the infectivity matrix $A$, the individual susceptibility matrix $B$, and the contact matrix $C$.

Importantly, it should be noted that, although the numerical results derived from the 6 cities in this study may not be the same as those in other countries, the developed methods are general at the methodological level, and the idea of using age-specific contacts to characterize the disease transmission patterns is instructive for understanding the situations of the disease outbreaks in those countries, and hence for planning corresponding interventions. When applying the developed methodology to the wider global population, country/region-specific scenarios and settings, such as case categories, distribution of age-specific population, working environment and hours, and interventions and work resumption plans, should be incorporated to provide better tailor-made parameterization, thus making the retrospective and prospective analyses more situation-specific and informative.
As COVID-19 is a newly emerging infectious disease, we are still in the process of gaining more knowledge and understanding of its transmission patterns. As a result, the parameters estimated based on the current understanding might not be as adequate or precise as those for some well-understood diseases, such as seasonal influenza. Therefore, one of our future research directions is to continue investigating the characteristics of the disease, from both epidemiological and computational perspectives, so as to parameterize the model more accurately. Further, in this study, we have modeled the underlying transmission of the COVID-19 outbreak by considering the age-specific social contact patterns. It should be pointed out that there also exist other disease-related factors that might affect the disease transmission patterns, such as the cross-region mobility of the population and environmental factors. We plan to incorporate these disease-related factors into the model, thus making our analysis more comprehensive. Moreover, the current study focuses on representative cities in China. However, since the COVID-19 pandemic is becoming more serious around the world, it will be desirable to conduct further analyses on a global scale. In this regard, the general methodology provided in this study can readily be applied, while considering country/region-specific social, demographic, and epidemiological characteristics, such as infection-related social contact patterns. 37 To further generalize and transfer our research, we plan to collaborate with researchers and practitioners around the world to conduct the corresponding analyses for other countries/regions.

[…] Liu wrote the paper.

We declare no competing interests.

All code and data will be made publicly available.
The continuing 2019-nCoV epidemic threat of novel coronaviruses to global health - The latest 2019 novel coronavirus outbreak in Wuhan
World Health Organization, Coronavirus disease 2019 (COVID-19) Situation Report - 69
Strategy and Policy Working Group for NCIP Epidemic Response, Chinese Center for Disease Control and Prevention, Urgent research agenda for the novel coronavirus epidemic: transmission and non-pharmaceutical mitigation strategies
Report of the WHO-China Joint Mission on Coronavirus Disease
Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia
Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study. The Lancet
Inferring the structure of social contacts from demographic data in the analysis of infectious diseases spread
Identifying the relative priorities of subpopulations for containing infectious disease spread
Social contacts and mixing patterns relevant to the spread of infectious diseases
Little Italy: an agent-based approach to the estimation of contact patterns - fitting predicted matrices to serological data
The construction of next-generation matrices for compartmental epidemic models
The mathematics of infectious diseases
A note on generation time in epidemic models
CoronaTracker: worldwide COVID-19 outbreak data analysis and prediction
Time-varying transmission dynamics of novel coronavirus pneumonia in China
Feasibility of controlling COVID-19 outbreaks by isolation of cases and contacts. Lancet Global Health
Chinese Center for Disease Control and Prevention, The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) in China
Estimating the costs of school closure for mitigating an influenza pandemic
A systematic review of social contact surveys to inform transmission models of close-contact infections

The authors would like to specially thank K. Guo from the Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, for her helpful discussions in estimating the