key: cord-0811463-ppcdkfxq
authors: Xu, Xiao-Ke; Liu, Xiao-Fan; Wu, Ye; Ali, Sheikh Taslim; Du, Zhanwei; Bosetti, Paolo; Lau, Eric H Y; Cowling, Benjamin J; Wang, Lin
title: Reconstruction of Transmission Pairs for novel Coronavirus Disease 2019 (COVID-19) in mainland China: Estimation of Super-spreading Events, Serial Interval, and Hazard of Infection
date: 2020-06-18
journal: Clin Infect Dis
DOI: 10.1093/cid/ciaa790
sha: 00a4b80a7e0f35c3f8ba74f5a9eeea73b77bdc3b
doc_id: 811463
cord_uid: ppcdkfxq

BACKGROUND: Knowledge of the epidemiological features and transmission patterns of COVID-19 is accumulating. Detailed line-list data with household settings can advance the understanding of COVID-19 transmission dynamics.
METHODS: A unique database with detailed demographic characteristics, travel history, social relationships, and epidemiological timelines for 1,407 transmission pairs that formed 643 transmission clusters in mainland China was reconstructed from 9,120 COVID-19 confirmed cases reported during January 15 - February 29, 2020. Statistical model fitting was used to identify the super-spreaders and estimate serial interval distributions. Age- and gender-stratified hazards of infection were estimated for household versus non-household transmissions.
RESULTS: There were 34 primary cases identified as super-spreaders, with 5 super-spreading events occurring within households. The mean and standard deviation of serial intervals were estimated as 5.0 (95% CrI: 4.4, 5.5) and 5.2 (95% CrI: 4.9, 5.7) days for household transmissions and 5.2 (95% CrI: 4.6, 5.8) and 5.3 (95% CrI: 4.9, 5.7) days for non-household transmissions, respectively. The hazard of being infected outside of households is higher for people aged between 18 and 64 years, whereas the hazard of being infected within households is higher for young and old people.
CONCLUSIONS: The non-negligible frequency of super-spreading events, short serial intervals, and a higher risk of being infected outside of households for males of working age indicate a significant barrier to the identification and management of COVID-19 cases, which requires enhanced non-pharmaceutical interventions to mitigate this pandemic.

In December 2019, a novel coronavirus disease [...] have also been reported in several recent studies [5-8]. Starting from the last week of January 2020, more than 260 Chinese cities implemented intensive social distancing and confinement policies, which brought the epidemic under control [7-10]. However, the epidemic still caused more than 10,000 confirmed cases in China outside Hubei Province within a month. To enhance public health preparedness and awareness, Chinese health authorities have publicly reported detailed records of confirmed COVID-19 cases since mid-January. This provides a unique resource for studying the transmission patterns, routes, and risk factors of COVID-19.

In mainland China, 27 provincial and 264 urban health commissions publicly posted 9,120 confirmed case reports online during January 15 - February 29, 2020, which account for 72% of all cases confirmed in mainland China outside Hubei Province. We compiled a unique line-list database from these reports, which contains detailed information about demographic features, social relationships, travel history, and key epidemiological timelines (e.g., dates of symptom onset, confirmation, and hospitalization). In contrast to several published COVID-19 data repositories [11-16], which focus on describing individual cases, our database allows transmission pairs and clusters to be reconstructed by inferring potential causal associations among different cases.
The entire dataset of transmission pairs is available at our GitHub (https://github.com/linwangidd/covid19_transmissionPairs_China). See Supplementary Materials for more details.

We reconstructed 1,407 transmission pairs using the epidemiological evidence among reported cases. The section "Reconstruction of transmission pairs" in Supplementary Materials specifies how we identified a pair or a group of confirmed cases using information about their close contacts, stratified transmission pairs into household and non-household settings using information about familial relationships, and determined the direction of transmission between infector and infectee using information about travel histories. For each transmission pair, we term the infector the primary case and the infectee the secondary case. We also consider connected chains of confirmed cases, in which we term the original case the index and the entire chain of cases, including the index, the transmission cluster (Figure 1a).

We categorized each transmission pair by the social relationship between primary and secondary cases (e.g., familial members of the same household, non-household relatives, colleagues, classmates, friends, and other face-to-face contacts). Several billion human movements can occur during the Spring Festival travel season (January 10 - February 18, 2020) because of the Chinese New Year tradition of visiting and living with one's original family. We therefore considered any transmission pair with an immediate familial relationship (e.g., a person's spouse, parents, and children) as a household transmission pair, and any pair with other familial relationships (e.g., a person's siblings older than 17) or close contacts with no familial information (e.g., classmates, colleagues) as a non-household transmission pair. The numbers of household (662) and non-household (745) transmission pairs are nearly equal.
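The grouping of pairs into clusters described above can be illustrated with a minimal sketch. It treats a transmission cluster as a connected component of the pair graph and classifies pairs by relationship label. The case identifiers, the relationship labels, and the function names are hypothetical and are not taken from the published dataset; the classification rule is simplified and ignores the age condition for siblings.

```python
from collections import defaultdict

# Hypothetical labels for immediate family; the study's actual coding may differ.
HOUSEHOLD_RELATIONS = {"spouse", "parent", "child"}


def classify_pair(relationship):
    """Return 'household' for immediate family, otherwise 'non-household'."""
    return "household" if relationship in HOUSEHOLD_RELATIONS else "non-household"


def build_clusters(pairs):
    """Group cases into clusters (connected components) with union-find.

    `pairs` is an iterable of (primary_id, secondary_id) tuples.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for primary, secondary in pairs:
        root_p, root_s = find(primary), find(secondary)
        if root_p != root_s:
            parent[root_s] = root_p  # merge the two components

    clusters = defaultdict(set)
    for case in list(parent):
        clusters[find(case)].add(case)
    return list(clusters.values())


def secondary_counts(pairs):
    """Count how many secondary cases are attributed to each primary case."""
    counts = defaultdict(int)
    for primary, _ in pairs:
        counts[primary] += 1
    return dict(counts)


# Toy data (not from the study): two chains of cases.
toy_pairs = [("A", "B"), ("A", "C"), ("C", "D"), ("E", "F")]
toy_clusters = build_clusters(toy_pairs)
toy_counts = secondary_counts(toy_pairs)
```

On the toy data, cases A to D form one cluster and E and F form another. The per-primary counts are the kind of quantity a super-spreader analysis would start from, although the statistical criterion used in the study is not reproduced here.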
We estimated the age-stratified hazard of infection for household versus non-household transmissions by the ratio between the probability that a secondary case of age group b was infected by a primary case of age group a within the same household and the probability that a secondary case of age group b was infected by a primary case of age group a outside of households. If this ratio exceeds 1, then infection within households carries a higher risk than infection outside of households for secondary cases of age group b being infected by primary cases of age group a. We estimated the gender-specific hazard of infection for household versus non-household transmissions by the ratio between the probability that a secondary case of gender b was infected by a primary case of gender a within the same household and the probability that a secondary case of gender b was infected by a primary case of gender a via non-household transmission.

In total, we reconstructed 643 transmission clusters formed by 1,407 transmission pairs (Figure 1a). The hazard of being infected within households was higher for young (<18) and elderly (>65) people, whereas the hazard of being infected outside of households was higher for age groups between 18 and 64 years (Table 1). Elderly (>65) primary cases were more prone to cause household infections. The hazard of infection between different genders was higher for household than for non-household transmission (Table 2).

We have built a line-list database with detailed demographic information, travel history, epidemiological timelines, and social relationships for 1,407 transmission pairs that formed 643 transmission clusters in mainland China outside Hubei Province. We identified 34 primary cases as super-spreaders.
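The household versus non-household hazard ratio defined in the methods can be sketched as follows. This is an illustration under a stated assumption, not the authors' estimation procedure: each probability is approximated by the share of pairs of type (a, b) among all pairs in the same setting. The function name, the tuple layout, and the toy data are hypothetical.

```python
from collections import Counter


def hazard_ratios(pairs):
    """Ratio of household to non-household probability per (a, b) group pair.

    `pairs` is an iterable of (setting, primary_group, secondary_group),
    where setting is 'household' or 'non-household'.
    Assumption: each probability is the share of (a, b) pairs within a setting.
    Returns None for a group pair never observed outside households.
    """
    household, non_household = Counter(), Counter()
    for setting, a, b in pairs:
        target = household if setting == "household" else non_household
        target[(a, b)] += 1

    n_household = sum(household.values())
    n_non_household = sum(non_household.values())

    ratios = {}
    for key in set(household) | set(non_household):
        p_household = household[key] / n_household if n_household else 0.0
        p_non = non_household[key] / n_non_household if n_non_household else 0.0
        ratios[key] = p_household / p_non if p_non > 0 else None
    return ratios


# Toy data (not from the study): all primary cases are aged 18-64.
toy_pairs = (
    [("household", "18-64", "<18")] * 2
    + [("household", "18-64", "18-64")] * 2
    + [("non-household", "18-64", "<18")] * 1
    + [("non-household", "18-64", "18-64")] * 3
)
toy_ratios = hazard_ratios(toy_pairs)
```

In this toy example the ratio for secondary cases under 18 is above 1 (household infection is relatively more likely) and the ratio for secondary cases aged 18 to 64 is below 1, which mirrors the direction, not the magnitude, of the pattern reported in Table 1.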
The majority of super-spreading events were observed for non-household transmissions, which is consistent with a recent study [21] on transmission settings of COVID-19 (e.g., hospitals, residential care, prisons, boarding schools, cruise ships). This indicates the importance of non-pharmaceutical interventions (e.g., isolation, quarantine, social distancing, and confinement [7, 22-24]) in mitigating the COVID-19 epidemic.

Household studies are helpful to identify risk factors for certain demographic groups [25, 26]. The analysis of the age-stratified and gender-specific hazard of infection suggests a higher risk of infection within households for young (<18), elderly (>65), and female people. The higher risk of being infected outside of households for males aged between 18 and 64 years may indicate their role in driving household secondary infections, perhaps because these were travelers from Wuhan of working age.

We identified 50 transmission pairs (~3.5%) in which the secondary case reported symptom onset earlier than the primary case (i.e., negative-valued serial intervals), which is consistent with recent clinical reports [27, 28] and epidemiological studies [29, 30]. We estimated that the mean serial interval is around 5 days for both household and non-household infections, which is considerably shorter than the mean serial interval estimated for SARS (e.g., 8.4 days [31]) and MERS (e.g., 7.6 days [32]).

Our findings have several limitations. First, the household sizes and primary cases with no secondary infections were not provided in the original public case reports. This may give rise to biased estimates if the household reproduction number and secondary attack rate are estimated from the raw data. Field surveys will be helpful to adjust for such biases. Second, information on nosocomial infections and public gathering settings was not available from the original case reports, so super-spreading events may be under-observed in our dataset.
Third, caution is needed when attempting to generalize the age-stratified hazard of infection to other demographic settings. For example, in our study (Table 1) [...]

Figure 1

detail/30-01-2020-statement-on-the-second-meeting-of-theinternational-health-regulations
World Health Organization. Coronavirus disease 2019 (COVID-19), situation report - 133
Risk for Transportation of 2019 Novel Coronavirus Disease from Wuhan to Other Cities in China
Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study
The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak
An investigation of transmission control measures during the first 50 days of the COVID-19 epidemic in China
The effect of human mobility and control measures on the COVID-19 epidemic in China
Early dynamics of transmission and control of COVID-19: a mathematical modelling study
Effect of non-pharmaceutical interventions for containing the COVID-19 outbreak in China. medRxiv
Open access epidemiological data from the COVID-19 outbreak
Epidemiological data from the COVID-19 outbreak, realtime case information
Evolving epidemiology and transmission dynamics of coronavirus disease 2019 outside Hubei province, China: a descriptive and modelling study
An interactive web-based dashboard to track COVID-19 in real time
Early epidemiological analysis of the coronavirus disease 2019 outbreak based on crowdsourced data: a population-level observational study
The Incubation Period of Coronavirus Disease 2019 (COVID-19) From Publicly Reported Confirmed Cases: Estimation and Application
Superspreading and the effect of individual variation on disease emergence
Serial intervals of respiratory infectious diseases: a systematic review and analysis
Secondary attack rate and superspreading events for SARS-CoV-2
What settings have been linked to SARS-CoV-2 transmission clusters?
Impact assessment of non-pharmaceutical interventions against COVID-19 and influenza in Hong Kong: an observational study. medRxiv
Interventions to mitigate early spread of SARS-CoV-2 in Singapore: a modelling study
The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study
Epidemiological research priorities for public health control of the ongoing global novel coronavirus (2019-nCoV) outbreak
Covid-19 - Studies Needed
Presumed Asymptomatic Carrier Transmission of COVID-19
Asymptomatic cases in a family cluster with SARS-CoV-2 infection
Serial Interval of COVID-19 among Publicly Reported Confirmed Cases
Temporal dynamics in viral shedding and transmissibility of COVID-19
Transmission dynamics and control of severe acute respiratory syndrome
Hospital outbreak of Middle East respiratory syndrome coronavirus
Household size and composition around the World