key: cord-0760981-zv8xt1dd authors: Sun, K.; Wang, W.; Gao, L.; Wang, Y.; Luo, K.; Ren, L.; Zhan, Z.; Chen, X.; Zhao, S.; Huang, Y.; Sun, Q.; Liu, Z.; Litvinova, M.; Vespignani, A.; Ajelli, M.; Viboud, C.; Yu, H. title: Transmission heterogeneities, kinetics, and controllability of SARS-CoV-2 date: 2020-08-13 journal: medRxiv : the preprint server for health sciences DOI: 10.1101/2020.08.09.20171132 sha: 5ec3a6d38cf3ecf0b59898945b3233f0a3715403 doc_id: 760981 cord_uid: zv8xt1dd A long-standing question in infectious disease dynamics is the role of transmission heterogeneities, particularly those driven by demography, behavior and interventions. Here we characterize transmission risk between 1,178 SARS-CoV-2 infected individuals and their 15,648 close contacts based on detailed contact tracing data from Hunan, China. We find that 80% of secondary transmissions can be traced back to 14% of SARS-CoV-2 infections, indicating substantial transmission heterogeneities. Regression analysis suggests a marked gradient in transmission risk, which scales positively with the duration of exposure and the closeness of social interactions, after adjusting for demographic and clinical factors. Population-level physical distancing measures confine transmission to families and households, while case isolation and contact quarantine reduce transmission in all settings. Adjusted for interventions, the reconstructed infectiousness profile of a typical SARS-CoV-2 infection peaks just before symptom presentation, with ~50% of transmission occurring in the pre-symptomatic phase. Modelling results indicate that achieving SARS-CoV-2 control would require the synergistic efforts of case isolation, contact quarantine, and population-level physical distancing measures, owing to the particular transmission kinetics of this virus. 
While the age dependency in clinical severity of COVID-19 has been well documented (1-5), there is limited information on how transmission risk varies with age, clinical presentation, and contact types (6-12). Individual-based interventions such as case isolation, contact tracing and quarantine have been shown to accelerate case detection and interrupt transmission chains (13). However, these interventions are typically implemented in conjunction with population-level physical distancing measures, and their effects on contact patterns and transmission risk remain difficult to separate (14-24). A better understanding of the factors driving SARS-CoV-2 transmission is key to achieving epidemic control while minimizing societal cost, particularly as countries relax physical distancing measures. Hunan, a province in China adjacent to Hubei where the COVID-19 pandemic began, experienced sustained SARS-CoV-2 transmission in late January and early February 2020, but the outbreak was swiftly suppressed thereafter. As in many other provinces in China, epidemic control was achieved by a combination of individual-based interventions targeting cases and their contacts and population-level physical distancing measures. In this study, we reconstruct transmission chains for all identified SARS-CoV-2 infections in Hunan, as of April 3, 2020, based on granular epidemiological information collected through extensive surveillance and contact tracing efforts. We identify the demographic, clinical and behavioral factors that drive transmission heterogeneities and evaluate how interventions modulate the topology of the transmission network. Further, we reconstruct the infectiousness profile of SARS-CoV-2 over the course of a typical infection and estimate the feasibility of epidemic control by individual and population-based interventions. 
We analyze detailed epidemiological records for 1,178 SARS-CoV-2 infected individuals and their 15,648 close contacts, representing 19,227 separate exposure events, compiled by the Hunan Provincial Center for Disease Control and Prevention. Cases were identified between January 16 and April 2, 2020; index cases were captured by passive surveillance and laboratory confirmed by RT-PCR. Individuals who were close contacts of the index cases were followed for at least 2 weeks after the last exposure to the infected individual. Prior to February 7, 2020, contacts were tested if they developed symptoms during the quarantine period. After February 7, 2020, RT-PCR testing was required for all contacts, and specimens were collected at least once from each contact during quarantine, regardless of symptoms. Upon positive RT-PCR test results, infected individuals were isolated in dedicated hospitals, regardless of their clinical severity, while their contacts were quarantined in medical observation facilities. The dataset includes 210 epidemiological clusters representing 831 cases, with an additional 347 sporadic cases (29%) unlinked to any cluster (detailed in Materials & Methods). For each cluster, we stochastically reconstruct transmission chains and estimate the timing of infection most compatible with each patient's exposure history. We analyze an ensemble of 100 reconstructed transmission chains to account for uncertainties in exposure histories (Fig. 1 visualizes one realization of the transmission chains). We observe between 0 and 4 generations of transmission, with the largest cluster involving 20 SARS-CoV-2-infected individuals. The number of secondary infections ranges from 0 to 10, with a distribution of secondary infections best characterized by a negative binomial distribution with mean µ = 0.40 (95% CI, 0.39 to 0.41) and variance µ(1 + µ/k) = 0.95 (95% CI, 0.92 to 0.97), where k = 0.30 (95% CI, 0.29 to 0.30) is the dispersion parameter (Fig. 1). 
This suggests 80% of secondary infection can be traced back to 14% of SARS-CoV-2 infected individuals, indicating substantial transmission heterogeneities at the individual-level. To dissect the individual transmission heterogeneities and identify predictors of transmission, we analyze the infection risk among a subset of 14,622 individuals who were close contacts of 870 SARS-CoV-2 patients. This dataset excludes primary cases whose infected contacts report a travel history to Wuhan. The dataset represents 74% of all SARS-CoV-2 cases in the Hunan epidemic, for whom contacts have been carefully monitored, capturing 17,750 independent exposure events. We start by characterizing variation in transmission risk across the diverse set of 17,750 exposures. We focus on quantifying how the duration, timing and type of contact impact transmission risk, accounting for other factors including age, sex, clinical presentation, travel history of the primary case, as well as age and sex of the contacts. Exposures are grouped into 5 categories based on the type of contact settings, namely: household, extended family, social, community, and healthcare (Table S2) , with the duration of exposure approximated by the time interval between the initial and final dates of exposure. We also stratify exposures by the date of occurrence, with January 25, 2020 marking the beginning of enhanced physical distancing measures in Hunan (based on Baidu Qianxi mobility index (25) , Fig. S1A insert). To address putative variation in infectiousness over the course of infection, we distinguish exposures based on whether the exposure window contains the time of symptom onset of the primary case, a period associated with high viral shedding. We use a mixed effects multiple logistic regression model (GLMM-logit) to quantify the effects of these factors on transmission (see Fig. S1A for regression results, and Table S3 for a detailed definition of all risk factors and summary statistics). 
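The 80%/14% statement that opens this passage can be checked approximately from the reported negative binomial parameters alone. The following is a minimal sketch, not the authors' procedure: it assumes the discrete negative binomial with mean µ = 0.40 and dispersion k = 0.30, ranks individuals by their number of secondary infections, and asks what share of individuals accounts for 80% of all secondary infections. The function names and the truncation point `n_max` are illustrative choices.

```python
def nb_pmf(mean, k, n_max=400):
    """Negative binomial pmf P(0..n_max) parameterized by mean and dispersion k."""
    p = k / (k + mean)              # "success" probability in the standard form
    q = 1.0 - p
    pmf = [p ** k]                  # P(0)
    for n in range(1, n_max + 1):
        # Recurrence: P(n) = P(n-1) * (n - 1 + k) / n * q
        pmf.append(pmf[-1] * (n - 1 + k) / n * q)
    return pmf


def nb_variance(mean, k):
    """Variance of the negative binomial, mu * (1 + mu / k)."""
    return mean * (1.0 + mean / k)


def share_of_infectors(mean, k, target=0.8, n_max=400):
    """Smallest share of individuals accounting for `target` of all secondary cases."""
    pmf = nb_pmf(mean, k, n_max)
    needed = target * mean          # expected secondary cases to account for
    people = 0.0
    cases = 0.0
    # Walk from the most prolific infectors downwards.
    for n in range(n_max, 0, -1):
        contribution = n * pmf[n]
        if cases + contribution >= needed:
            # Only part of this group is required to reach the target.
            people += (needed - cases) / n
            return people
        people += pmf[n]
        cases += contribution
    return people


if __name__ == "__main__":
    print(f"variance: {nb_variance(0.40, 0.30):.2f}")
    print(f"share of infectors for 80% of transmission: {share_of_infectors(0.40, 0.30):.1%}")
```

Under these assumptions the share comes out close to the 14% quoted in the text, and the variance implied by the rounded point estimates is slightly below the reported 0.95, which is consistent with rounding of k.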
We find a marked gradient in transmission risk, which scales positively with closeness of social interactions (Fig. S1A): household contacts pose the highest risk of transmission (see also (12)), followed by contacts in the extended family, social and community settings. Contacts in the healthcare setting have the lowest risk, suggesting that adequate protective measures were adopted by patients and healthcare staff in Hunan, China. Interestingly, the impact of physical distancing differs by transmission setting (Table 1): enhanced physical distancing measures elevate the risk of transmission in the household, likely due to increased contact frequency at home as a result of physical confinement during the "lockdown". In contrast, reduced within-city mobility is associated with a reduction in transmission risk per contact opportunity in the community and social settings, possibly caused by adoption of prudent behaviors such as mask wearing, hand washing and coughing/sneezing etiquette. We also find that a longer exposure window is a significant risk factor, with one additional day of exposure increasing the transmission risk by 10% (95% CI, 5% to 15%). Transmission risk is higher around the time of symptom presentation of the primary case (Table 1). And while susceptibility to SARS-CoV-2 gradually increases with age, we find no statistical support for age differences in infectivity (Fig. S1A), in agreement with previous findings (12). For each of the 17,750 contact exposure events, we estimate the probability of transmission using the point estimate of the baseline odds and odds ratios from the GLMM-logit regression (Fig. S1A). In Fig. 2A, we plot the distribution of transmission probabilities in the household, extended family, social, and community settings separately. 
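The step from regression output to a per-exposure transmission probability can be illustrated with a short sketch. It assumes the usual logistic-regression relationship (odds are the baseline odds multiplied by the applicable odds ratios) and ignores the random effects of the mixed model. The baseline odds value is hypothetical, and treating the reported 10% increase per additional day as an odds ratio of 1.10 is an assumption made here for illustration.

```python
import math


def transmission_probability(baseline_odds, odds_ratios):
    """Convert baseline odds and the applicable odds ratios into a probability."""
    odds = baseline_odds * math.prod(odds_ratios)
    return odds / (1.0 + odds)


# Hypothetical inputs, not estimates taken from the paper's tables.
BASELINE_ODDS = 0.01        # assumed odds for a reference exposure
OR_PER_EXTRA_DAY = 1.10     # assumes the 10% per-day increase is an odds ratio


def probability_by_duration(extra_days):
    """Probability for an exposure lasting `extra_days` beyond the reference."""
    return transmission_probability(BASELINE_ODDS, [OR_PER_EXTRA_DAY] * extra_days)


if __name__ == "__main__":
    for days in (0, 3, 7, 14):
        print(days, round(probability_by_duration(days), 4))
```

Because the odds ratios multiply, the effect of duration compounds: a two-week exposure carries several times the odds of a single-day exposure in this sketch, which is one way the duration gradient across settings can arise.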
The average "per-contact" transmission probability is highest in the household (7.1%, 95% CI, 1.2% to 19.3%), followed by family (1.7%, 95% CI, 0.4% to 5.7%) and social settings (0.9%, 95% CI, 0.3% to 2.7%), and lowest in the community (0.4%, 95% CI, 0.1% to 1.1%). The gradient of transmission probabilities across settings reflects the joint effect of increasing duration of exposure with closeness of social interactions (Fig. 2B), superimposed on setting-specific risk differences (Fig. S1A). It is worth noting that these "per-contact" transmission probabilities were evaluated in a situation of intense intervention measures and high population awareness of the disease, and thus they may not be generalizable elsewhere. The number of contacts is also a key driver of individual transmission potential and varies by transmission setting. Fig. 2C presents the contact degree distribution, defined as the number of unique contacts per individual. We find that the distributions of individual contact degree are overdispersed, with dispersion parameter 0 < k < 1 across all settings. Furthermore, household (k = 0.72) and extended family (k = 0.64) contacts are less dispersed than social (k = 0.19) and community (k = 0.14) contacts, suggesting that contact heterogeneities are inversely correlated with the closeness of social interactions. Fig. 2D visualizes the age-specific contact patterns between the primary cases and their contacts, demonstrating diverse mixing patterns across settings. Specifically, household contacts present the canonical "three-bands" pattern with the diagonal representing age-assortative interactions and the two off-diagonals representing intergenerational mixing (26, 27). Other settings display more diffusive mixing patterns by age. Next, we summarize the overall transmission potential of an individual by calculating the cumulative contact rate (CCR) of the primary case. 
The CCR captures how contact opportunities vary with demography, temporal variation in the infectiousness profile, an individual's contact degree, and interventions. (See Section 4.2 in Materials and Methods for detailed definition). Through regression analysis, we focus on how the overall transmission opportunity of an infected individual is affected by different intervention measures across transmission settings. After adjusting for age, sex, clinical presentation, and travel history to Wuhan, we find that physical distancing measures increase CCRs in the household and extended family and decrease CCRs in social and community settings (Fig. 2E ). In contrast, faster case isolation (measured as the time between isolation, or pre-symptomatic quarantine, and symptom onset) universally reduces CCRs, decreasing transmission opportunities across all settings (Fig. 2E ). We have characterized the SARS-CoV-2 transmission risk factors and have shown that individual and population-based interventions have a differential impact on contact patterns and transmission potential. Next, we use our probabilistic reconstruction of infector-infectee pairs to further dissect transmission kinetics and project the impact of interventions on SARS-CoV-2 dynamics and control. Based on the reconstructed transmission chains, we estimate a median serial interval of 5.7 days, with an inter-quartile range (IQR) of 2.8 to 8.7 days, which represents the time interval between symptom onset of an infector and his/her infectee (Fig. S2B) . The median generation interval, defined as the interval between the infection times of an infector and his/her infectee, is 5.3 days, with an IQR of 3.1 to 8.7 days (Fig. S2A ). We estimate that 63.2% (95% CI, 59.6% to 66.4%) of all transmission events occur before symptom onset, which is comparable with findings from other studies (6-8, 10-13, 19, 28) . However, these estimates are impacted by the intensity of interventions, as we will show later. 
In Hunan, interventions including case isolation, contact tracing, and close-contact quarantine were in place throughout the epidemic. Case isolation and contact quarantine are meant to prevent potentially infectious individuals from contacting susceptible individuals, effectively shortening the infectious period. As a result, we would expect right censoring of the generation and serial interval distributions (29). Symptomatic cases represent 86.5% of all SARS-CoV-2 infections in our data; among these patients, we observe longer generation intervals for cases isolated later in the course of their infection (Fig. 3A). The median generation interval increases from 4.1 days (IQR, 1.9 to 7.2 days) for cases isolated less than 2 days after symptom onset, to 7.0 days (IQR, 3.6 to 11.1 days) for those isolated more than 6 days after symptom onset (p<0.001, Mann-Whitney U test). We observe similar trends for the serial interval distributions (Fig. 3B). The median serial interval increases from 1.7 days (IQR, -1.5 to 4.7 days) for cases isolated less than 2 days after symptom onset, to 7.3 days (IQR, 3.4 to 10.9 days) for those isolated more than 6 days after symptom onset (p<0.001, Mann-Whitney U test). Faster case isolation restricts transmission to the earlier stages of infection, thus inflating the contribution of pre-symptomatic transmission (Fig. 3C). The proportion of pre-symptomatic transmission is estimated at 86.6% (95% CI, 80.8% to 92.3%) if cases are isolated within 2 days of symptom onset, while this proportion decreases to 47.5% (95% CI, 41.4% to 53.3%) if cases are isolated more than 6 days after symptom onset (p<0.001, Mann-Whitney U test). To adjust for censoring due to case isolation and reconstruct the infectiousness profile of a SARS-CoV-2 infection in the absence of intervention, we characterize the changes in the speed of case isolation over time in Hunan. Fig. 
S4 shows the distributions of time from symptom onset to isolation during three different phases of epidemic control, coinciding with major changes in COVID-19 case definition (Phase I: before Jan. 27; Phase II: Jan. 27 to Feb. 4; Phase III: after Feb. 4; Fig. S3) (30). In Phase I, 78% of cases were detected through passive surveillance; as a result, most cases were isolated after symptom onset (median time from onset to isolation 5.4 days, IQR (2.7, 8.2) days, Fig. S4A). In contrast, in Phase III, 66% of cases were detected through active contact tracing, shortening the median time from onset to isolation to -0.1 days, with IQR (-2.9, 1.7) days (Fig. S4C). Phase II is intermediate. We use mathematical models (detailed in Materials and Methods) to dynamically adjust the serial interval distribution for censoring and apply the same approach to the time interval between a primary case's symptom onset and onward transmission (Fig. S6A-B). These censoring-adjusted distributions can be rescaled by the basic reproduction number $R_0$ to reflect the risk of transmission of a typical SARS-CoV-2 case since the time of infection or since symptom onset (Fig. 3D-E). Assuming no interventions were in place, we estimate that infectiousness peaks near the time of symptom onset (Fig. S6B), consistent with our regression estimates that transmission risk is higher if the onset of the primary case occurred within the window of exposure (Table 1). [...] represents a typical scenario of unmitigated SARS-CoV-2 transmissibility in an urban setting. The reconstructed infectiousness profile in the absence of control is shown in solid red lines in Fig. 3D-E, with respect to time of infection and symptom onset respectively. Notably, SARS-CoV-2 infectiousness peaks slightly before symptom onset (-0.1 days on average), with 86% of the overall infectiousness concentrated within ±5 days of symptom onset and 52% of the overall infectiousness in the pre-symptomatic phase (Fig. 3E). 
Next, we evaluate the impact of case isolation on transmission by considering three different intervention scenarios mimicking the speed of isolation in the three phases of the Hunan epidemic control. We first assume that 100% of infections are detected and isolated and that isolation is fully protective (i.e., there is no onward transmission after the patient has been isolated/quarantined). The infectiousness profiles of the three intervention scenarios are shown in dashed lines in Fig. 3D-E. We find that the basic reproduction number decreases in all intervention scenarios, but the projected decrease is not sufficient to interrupt transmission (Fig. 3D; $R$ = 1.77 for Phase I, $R$ = 1.54 for Phase II, and $R$ = 1.10 for Phase III). We further relax the assumption of 100% case detection and isolation and relate changes in the basic reproduction number to the efficacy of surveillance and compliance with case isolation and contact quarantine (measured as the fraction of total infections isolated) as well as the speed of isolation (delay from symptom onset to isolation, phase diagram in Fig. 3F). Dashed lines in Fig. 3F illustrate 30%, 40% and 50% reductions in $R_0$. To halve $R_0$ (the minimum amount of transmission reduction required to achieve control for a baseline $R_0$ ~ 2), 100% of infections would need to be isolated even if individuals are isolated as early as the day of symptom onset. In practice, epidemic control is unrealistic if case isolation and quarantine of close contacts are the only measures in place. Individual-based interventions are unlikely to be the sole mode of SARS-CoV-2 control in the months ahead. Layering additional physical distancing measures (e.g. through increased teleworking, reduced operation in the service industry, or broader adoption of face mask wearing) could provide substantial relief on the burden of case isolation and contact quarantine. 
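The arithmetic behind this argument can be sketched with a crude multiplicative approximation. This is not the authors' model, which uses the full infectiousness profile and the distribution of isolation delays (Fig. 3F), so its numbers will not match the phase diagram exactly. The sketch assumes that isolating a case exactly at symptom onset removes only the post-symptomatic share of infectiousness (1 - 0.52 = 0.48, from the reconstructed profile), and that a population-level measure scales transmission by a constant factor. The function name and the 30%/70% inputs are illustrative.

```python
def controlled_r(r0, isolated_fraction, averted_per_isolated, distancing_reduction=0.0):
    """Approximate reproduction number under isolation plus population-level measures.

    r0                   -- baseline reproduction number
    isolated_fraction    -- fraction of all infections that get isolated
    averted_per_isolated -- share of an isolated case's infectiousness that is prevented
    distancing_reduction -- proportional reduction in transmission from other measures
    """
    return r0 * (1.0 - distancing_reduction) * (1.0 - isolated_fraction * averted_per_isolated)


def required_reduction(r0):
    """Minimum proportional reduction in transmission needed to bring R below 1."""
    return 1.0 - 1.0 / r0


if __name__ == "__main__":
    post_symptomatic_share = 1.0 - 0.52  # from the reconstructed infectiousness profile
    print("reduction needed for R0 = 2:", required_reduction(2.0))
    print("isolation only, everyone isolated at onset:",
          round(controlled_r(2.0, 1.0, post_symptomatic_share), 2))
    print("70% isolated at onset plus 30% population-level reduction:",
          round(controlled_r(2.0, 0.7, post_symptomatic_share, 0.3), 2))
```

In this approximation, isolation at symptom onset alone leaves the reproduction number near 1 even with complete case detection, whereas adding a moderate population-level reduction brings it below 1 with incomplete detection. That is the qualitative point of the passage: pre-symptomatic transmission caps what isolation alone can achieve.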
The synergistic effects of these interventions are illustrated in Fig. 3G. We find that a 30% reduction in transmission from population-level measures would require a 70% case detection rate to achieve epidemic control, assuming that cases can be promptly isolated on average upon symptom presentation. Of note, a 30% reduction in transmission could be achieved in various ways and does not necessarily require physical distancing measures. It could also encompass the benefits of residual population-level immunity from the first wave of COVID-19, especially in hard-hit regions (32, 33). As a sensitivity analysis, we further consider a more optimistic scenario with a lower baseline $R_0$ = 1.59, corresponding to an epidemic growth rate of 0.08 day$^{-1}$ (95% CI, 0.06 to 0.10) in Wuhan (30), which is adjusted for reporting changes. As expected, control is much easier to achieve in this scenario: if detected SARS-CoV-2 infections are effectively isolated on average 2 days after symptom onset, a 25% population-level reduction in transmission coupled with a 43% infection isolation rate is able to achieve control (Fig. 3H). To our knowledge, our study is the most comprehensive analysis of contact tracing data so far. Detailed information on 1,178 SARS-CoV-2 infected individuals along with their 15,648 contacts has allowed us to dissect the behavioral and clinical drivers of SARS-CoV-2 transmission; to evaluate how transmission opportunities are modulated by individual and population-level interventions, and to characterize the typical infectiousness profile of a case. Informed by this understanding, particularly the importance of pre-symptomatic transmission, we have evaluated the plausibility of SARS-CoV-2 control through individual and population-based interventions. Contacts in healthcare settings pose the lowest risk of transmission in Hunan, suggesting that adequate protective measures against SARS-CoV-2 were taken in hospitals and medical observation centers (Table 1). 
The risk of transmission scales positively with the closeness of social interactions, with a lower per-contact risk estimated for community exposures (including contacts in the public transportation system, food and entertainment venues), intermediate risk for social and extended family settings, and highest risk in the household. The transmission risk associated with household exposures is further elevated when intense physical distancing is enforced, and for contacts that last longer. These lines of evidence support that SARS-CoV-2 transmission is facilitated by close proximity, confined settings, and high frequency of contacts. We cannot evaluate the relative risks of transmission in other settings such as schools, workplaces, conferences, prisons, or factories, as no contacts in these settings were reported in the Hunan dataset. Regression analysis indicates a higher risk of transmission when an individual is exposed to a SARS-CoV-2 patient around the time of symptom onset, in line with our reconstructed infectiousness profile that peaks just before symptom onset. These epidemiological findings are in agreement with viral shedding studies (6, 34, 35) . We estimate that overall in Hunan, ~63% of all transmission events were from pre-symptomatic individuals, in line with estimates from other modeling studies (6, 7, 10, 12, 36) . However, this proportion is inflated by case isolation and contact quarantine measures, with right-censoring affecting transmission primarily in the symptomatic phase. We estimate that the relative contribution of pre-symptomatic transmission drops to ~52% in an uncontrolled scenario where case-based interventions are absent. Case isolation reduces the "effective" infectious period of SARS-CoV-2 infected individuals by blocking contacts with susceptible individuals. We observe that faster isolation significantly reduces CCRs across settings (Fig. 2E ). 
We also observe shorter serial and generation intervals and a larger fraction of pre-symptomatic transmission when individuals are isolated faster ( Fig. 3A -C). In contrast, population-level physical distancing measures have differential impacts on CCRs, decreasing CCRs in social and community settings, while increasing CCRs in the household and family. As a result, strict physical distancing confines the epidemic mostly to families and households (see also Fig. S7 ). The precise impact of physical distancing on transmission is difficult to separate from that of individual-based interventions. However, our analysis suggests that physical distancing changes the topology of the transmission network by affecting the number and duration of interactions. Interestingly, the topological structure of the household contact network is highly clustered (37), and high clustering is expected to hinder epidemic spread (38, 39). Thus, these higher-order topological changes could contribute to reducing transmission beyond the effects expected from an overall reduction in CCRs. Observationally, the effectiveness of physical distancing measures on reducing COVID-19 transmission has been demonstrated in China (16, 40) and elsewhere (41). We have explored the feasibility of SARS-CoV-2 epidemic control against two important metrics related to case isolation and contact quarantine: the speed of isolation and the infection isolation proportion (Fig. 3F ). For a baseline transmission scenario compatible with the initial growth phase of the epidemic in Wuhan, we find that epidemic control solely relying on case isolation and quarantine of close contact is difficult to achieve. Layering case isolation and quarantine of close contact with moderate physical distancing makes control more likely over a range of plausible parameters -a situation that could be further improved by residual immunity from the first wave of SARS-CoV-2 circulation (32, 33). 
Successful implementation of contact tracing requires a low level of active infections in the community, as the number of contacts to be monitored is several-fold the number of infections (~13 contacts were being traced per SARS-CoV-2 infected individual in Hunan). The timing of easing of lockdown measures should align with the capacities of testing and contact tracing efforts, relative to the number of active infections in the community. Technology-based approaches could also facilitate intense contact tracing efforts (7, 42). It is important to point out several caveats. Our study is likely underpowered to assess the transmission potential of asymptomatic infections given the relatively small fraction of these infections in our data (13.5% overall and 22.1% of infections captured through contact tracing). There is no statistical support for decreased transmission from asymptomatic individuals (Fig. S1A), although we observe a positive but non-significant trend in transmission risk scaling with disease severity. There is conflicting evidence from viral shedding studies; viral load appears independent of clinical severity in some studies (6, 23, 35, 43) while others suggest faster viral clearance in asymptomatic individuals (44). Before February 7 in Hunan, a fraction of contacts were only tested upon symptom presentation, which may affect our estimates of age-specific susceptibility, as younger individuals are less likely to develop symptoms (45). The rate of asymptomatic infections and their impact on transmission have profound implications for the feasibility of control through individual-based interventions. Careful serological studies combined with virologic testing in households and other controlled settings will be needed to fully resolve the role of asymptomatic infections and viral shedding in transmission. 
In conclusion, detailed contact tracing data illuminate important heterogeneities in SARS-CoV-2 transmission driven by biological and behavioral factors and modulated by the impact of interventions. Crucially, and in contrast to SARS-CoV-1, the ability of SARS-CoV-2 to transmit during the host's pre-symptomatic phase makes it particularly difficult to achieve epidemic control (46) . Our risk factor estimates can provide useful evidence to guide the design of more targeted and sustainable mitigation strategies, while our reconstructed transmission kinetics will help calibrate further modeling efforts. Moving forward, it will be particularly important to intensify collection and analysis of rich contact tracing data to monitor how transmission risk changes over time with growing population immunity, waxing and waning of interventions, and reactive changes in human behavior and contact opportunities. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 13, 2020. Data were deidentified, and informed consent was waived. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Data and materials availability: the datasets generated during and/or analyzed during the current study are not publicly available due to privacy concerns but are available from the corresponding authors on reasonable request. 
(E) Rate ratios of negative binomial regression of the cumulative contact rates (CCRs) against predictors including the infector's age, sex, presence of fever/cough, Wuhan travel history, whether symptom onset occurred before social distancing was in place (before or after Jan. 25, 2020), and time from isolation to symptom onset. CCRs represent the sum of relevant contacts over a one-week window centered at the date of the primary case's symptom onset. Dots and lines indicate point estimates and 95% confidence intervals of the rate ratios; numbers below the dots indicate the numerical value of the point estimates; Ref. stands for reference category; * indicates p-value<0.05, ** indicates p-value<0.01, *** indicates p-value<0.001.
All rights reserved. No reuse allowed without permission.
Mild: Patients with mild symptoms and no radiographic evidence of pneumonia.
Moderate: Patients with fever, or respiratory symptoms, and radiographic evidence of pneumonia.
Severe: Patients who have any of the following: a. respiratory distress, breathing rate ≥30 breaths/min; or b. finger oxygen saturation ≤93% during resting state; or c. PaO2/FiO2 ≤300 mmHg (1 mmHg = 0.133 kPa). Patients whose pulmonary imaging shows obvious progression of lesions (>50%) within 24~48 hours are managed as severe cases.
Critical: Patients who have any of the following: a. respiratory failure that requires mechanical ventilation; or b. shock; or c. other organ failures that require ICU admission.
For each SARS-CoV-2 positive individual in the database, information is compiled on the start/end date of exposure, along with the date of symptom onset (for symptomatic individuals) and laboratory confirmation. Biologically, the time of infection should occur before the onset of symptom or a positive RT-PCR test. Thus, we update the patient's end date of putative exposures in the database as the earliest of the reported exposure end date, date of symptom onset, or date of laboratory confirmation. If the start date of exposure is later than the date of symptom onset or positive RT-PCR test, it likely reflects recall error and we update the exposure start date as missing (1.9% of the records). We collected data on 15,648 individuals in close contact with the 1,178 confirmed SARS-CoV-2 infections identified in Hunan Province, China, representing 19,227 unique exposure events following the national protocol (1). Information included age and sex of the contacts, type of contacts (household, extended family, social, community, and healthcare, see Table S2 for definition), as well as the start and end dates of contact exposure. If the contact was confirmed with SARS-CoV-2 by RT-PCR, a unique identifier mapping the individual to the SARS-CoV-2 patient database was provided.
Household: A household member living with a SARS-CoV-2 infected individual.
Extended family: A family member not residing in the same household but who has been in close contact with the primary SARS-CoV-2 infected individual.
Social: Friends, coworkers and classmates who study, work or are in close contact with the primary infected individual.
Community: Staff who interact with SARS-CoV-2-infected individuals in restaurants, entertainment venues, or other service settings; passengers seated in close proximity to a SARS-CoV-2 infected individual.
Healthcare: Healthcare workers who provide diagnosis, treat or nurse a SARS-CoV-2 patient or other patients and caregivers in the same ward as a SARS-CoV-2 infected individual.
Any individual reporting encounters as described in Table S2 that occurred at a distance of less than 1 m from a SARS-CoV-2 infected individual (irrespective of whether that individual displayed symptoms) was considered a close contact at risk of SARS-CoV-2 infection. All records were extracted from the electronic database managed by the Hunan Provincial Center for Disease Control and Prevention. All individual records were anonymized and de-identified before analysis.

Based on the contact tracing database, we define a SARS-CoV-2 cluster as a group of two or more confirmed SARS-CoV-2 cases or asymptomatic infections with an epidemiological link, i.e. occurring in the same setting (e.g. home, work, community, healthcare, or other) and for which a direct contact between successive cases can be established within two weeks of symptom onset of the most recent case (alternatively, the date of RT-PCR test for asymptomatic infections). In total, there are 210 clusters recorded in the database, comprising 831 SARS-CoV-2 infections. While clusters of cases are grouped together based on shared exposures, a subset of cases also report additional exposures outside the cluster as possible causes of infection. As a result, there can be more than one index case within each cluster. In addition, for cases that only report exposures within the cluster, a unique infector cannot always be identified, given simultaneous SARS-CoV-2 exposures within the same cluster. A sporadic case is defined as a laboratory-confirmed SARS-CoV-2 individual who does not belong to any of the reported clusters (i.e. a singleton with no epidemiological link to other identified infections). In total, there are 347 sporadic cases recorded in the database. Since the source and direction of transmission within a cluster cannot always be determined on epidemiological grounds alone, we next turn to a modeling approach to probabilistically reconstruct possible infector-infectee transmission chains and further evaluate predictors of transmission.
doi: https://doi.org/10.1101/2020.08.09.20171132 (medRxiv preprint)

For each cluster and each patient $i$ in the cluster, the time of infection $t_{i}^{inf}$ is stochastically sampled by drawing from the incubation period distribution and subtracting this value from the reported time of symptom onset, i.e. $t_{i}^{inf} = t_{i}^{sym} - \tau_{i}^{incu}$, where $\tau_{i}^{incu}$ is the sampled incubation period and $t_{i}^{sym}$ the date of symptom onset (2). The incubation period follows a Weibull distribution,

$$f_{incu}(x) = \frac{k}{\lambda} \left(\frac{x}{\lambda}\right)^{k-1} \exp\left[-\left(\frac{x}{\lambda}\right)^{k}\right],$$

with shape parameter $k = 1.58$ and scale parameter $\lambda = 7.11$. The median incubation period is taken to be 5.56 days with IQR (3.14, 8.81) days (2). The sampled time of infection $t_{i}^{inf}$ must satisfy the following constraints:
• $t_{i}^{inf}$ must fall within the start and end dates of the exposures identified by epidemiological investigation.

We remove all singletons from the reconstruction of transmission chains, since they are not epidemiologically linked to other cases, but we consider these singletons when we analyze the distribution of secondary cases and when we represent the transmission network in Fig. 1. Next, we calculate the number of secondary infections for each of the 1,178 SARS-CoV-2 infected individuals based on the 100 reconstructed transmission chains among the 831 cluster cases, and the 347 singletons. The distribution of secondary infections is shown in Fig. 1. We fit a negative binomial distribution to these data using the "pystan" package (version 2.19.1.1) with uniform priors. We estimated a mean of 0.40 (95% CI 0.39 to 0.41) and a dispersion parameter of 0.30 (95% CI 0.29 to 0.30).

The generation interval is defined as the time interval between the dates of infection of the infector and the infectee.
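The infection-time sampling step described above (a Weibull incubation period subtracted from the onset date, subject to the exposure-window constraint) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and arguments are hypothetical, time is measured in days on an arbitrary axis, and enforcing the constraint by rejection sampling is our choice, since the text does not say how the constraints are imposed.

```python
import random

# Weibull incubation period parameters as stated in the text.
SHAPE, SCALE = 1.58, 7.11

def sample_infection_time(onset_day, expo_start=None, expo_end=None,
                          rng=random, max_tries=10_000):
    """Draw t_inf = onset_day - incubation period, rejecting draws that
    fall outside the exposure window (hypothetical helper)."""
    for _ in range(max_tries):
        # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape.
        incubation = rng.weibullvariate(SCALE, SHAPE)
        t_inf = onset_day - incubation
        if expo_start is not None and t_inf < expo_start:
            continue  # infection would precede the first reported exposure
        if expo_end is not None and t_inf > expo_end:
            continue  # infection would follow the last reported exposure
        return t_inf
    raise RuntimeError("no admissible infection time found")
```

Repeating the draw for every case in a cluster, 100 times over, would give the kind of ensemble of sampled infection times on which the reconstructed transmission chains are based.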
We calculate the generation intervals of all the infector-infectee pairs based on 100 realizations of the reconstructed transmission chains. The distribution of the generation interval is shown in Fig. S2A. The observed serial interval is defined as the time interval between the dates of symptom onset of the infector and the infectee. We calculate the serial interval of all the infector-infectee pairs with known dates of symptom onset based on 100 realizations of the reconstructed transmission chains. The distribution of the serial interval is shown in Fig. S2B.

We select all infector-infectee pairs for which the infector has been isolated during the course of his/her infection, the date of symptom onset is available, and the times of infection have been estimated. We stratify the data by the infector's time interval between onset and isolation, $\tau_{iso}$, with $\tau_{iso} \in \{(-\infty, 2), [2, 4), [4, 6), [6, +\infty)\}$ days, and assess how the generation interval and serial interval distributions change with the speed of case isolation (Fig. 3A and Fig. 3B).

As cases are isolated earlier in the course of infection, we expect the contribution of pre-symptomatic transmission to increase. This is because symptomatic transmission occurs after pre-symptomatic transmission, and transmission is blocked after effective isolation. In other words, isolated individuals remain infectious, but they can only effectively transmit before isolation, which under fast isolation falls predominantly in their pre-symptomatic phase. To validate the hypothesis that the contribution of pre-symptomatic transmission is affected by interventions, we first estimate the overall contribution of pre-symptomatic transmission among all reconstructed transmission chains.
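The stratification by isolation delay discussed above can be sketched for a single realization of the transmission chains as follows. This is an illustrative sketch only: the dictionary keys and the function name are hypothetical, and all times are in days on a common axis. A transmission is counted as pre-symptomatic when the infectee's infection time precedes the infector's symptom onset.

```python
from statistics import median

INF = float("inf")
# Onset-to-isolation delay bins (days), as listed in the text.
BINS = [(-INF, 2), (2, 4), (4, 6), (6, INF)]

def summarize_by_isolation_delay(pairs):
    """pairs: list of dicts with hypothetical keys inf_infector,
    onset_infector, iso_infector, inf_infectee, onset_infectee."""
    summary = {}
    for lo, hi in BINS:
        sel = [p for p in pairs
               if lo <= p["iso_infector"] - p["onset_infector"] < hi]
        if not sel:
            continue
        gi = [p["inf_infectee"] - p["inf_infector"] for p in sel]      # generation intervals
        si = [p["onset_infectee"] - p["onset_infector"] for p in sel]  # serial intervals
        pre = sum(p["inf_infectee"] < p["onset_infector"] for p in sel) / len(sel)
        summary[(lo, hi)] = {"median_gi": median(gi),
                             "median_si": median(si),
                             "frac_presymptomatic": pre}
    return summary
```

Applying the same function to each of the 100 realizations and averaging would give the mean and spread of each stratum.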
Let $e_{ij}^{k}$ represent each transmission event from an infector $i$ to an infectee $j$ in realization $k$ of the 100 sampled transmission chains; $e_{ij}^{k} = 0$ indicates that infection of the infectee occurred before the time of symptom onset of his/her infector, denoting pre-symptomatic transmission, while $e_{ij}^{k} = 1$ indicates that the time of infection occurred after the infector's symptom onset (i.e. post-symptomatic transmission). Thus, the overall fraction of pre-symptomatic transmission in realization $k$ is the share of events with $e_{ij}^{k} = 0$:

$$p_{pre}^{k} = 1 - \frac{1}{N^{k}} \sum_{(i,j)} e_{ij}^{k},$$

where $N^{k}$ is the number of transmission events in realization $k$. The mean and 95% CI of $p_{pre}$ can be estimated over the 100 realizations of the reconstructed transmission chains. We further stratify $p_{pre}$ by the time interval between an infector's symptom onset and isolation, considering five categories (days): $(-\infty, 0)$, $[0, 2)$, $[2, 4)$, $[4, 6)$, $[6, +\infty)$. The mean and variance (based on 100 realizations of the sampled transmission chains) of $p_{pre}$ for each category of the isolation interval are shown in Fig. 3C.

In Hunan province, all COVID-19 cases regardless of clinical severity were managed under medical isolation in designated hospitals, while contacts of SARS-CoV-2 infections were quarantined in designated medical observation centers. In Section 4, we estimate that the risk of transmission in the healthcare setting is the lowest among all contact settings; thus case isolation and contact quarantine are highly effective in blocking onward transmission after isolation/quarantine. As a result, the observed serial/generation intervals are shorter than they would be in the absence of case isolation and contact quarantine. The censoring effects are clearly demonstrated in Fig. 3A and Fig. 3B, where we observe that the median generation time drops from 7.0 days for $\tau_{iso} > 6$ days after symptom onset to 4.1 days for $\tau_{iso} < 2$ days. Moreover, the speed of case isolation is not static over time. Fig. S4 shows the distribution of time from symptom onset to isolation in three different phases of epidemic control (Phase I, II, and III), defined by two major changes in the COVID-19 case definition issued by the National Health Commission on Jan. 27 and Feb. 4. The median time from symptom onset to isolation decreases from 5.4 days in Phase I to -0.1 days in Phase III, due to the expansion of the "suspected" case definition (3) and the strengthening of contact tracing efforts (Fig. S3).

Estimating the generation interval distribution in the absence of interventions is important to understand the kinetics of SARS-CoV-2 transmission, as the shape of the generation interval distribution represents the population-average infectiousness profile since the time of infection. To minimize the potential error of flipping the directionality of the infector-infectee relationship during contact tracing, we further limit our analysis to the infector-infectee pairs where the primary case had a travel history to Wuhan (and no other SARS-CoV-2 contact), while the secondary case did not have a travel history to Wuhan but was epidemiologically linked to the primary case. To further reduce potential recall bias in the timing of symptom onset/exposure, we down-sample the outlier incubation periods. To do this in a statistically sound manner, we rely on the independence of the incubation periods of the infector and the infectee, and down-sample infector-infectee pairs whose joint likelihood of the observed incubation period pair is very low. Specifically, we first estimate the joint empirical distribution of the incubation periods of both the infector and infectee using a Gaussian kernel density estimate (4), via the function "scipy.stats.gaussian_kde" in the package "scipy" version 1.5.0 (5).
The joint likelihood of observing the incubation periods of a given infector-infectee pair $(i, j)$ based on the kernel density estimate is denoted as $L_{kde}(\tau_{i}^{incu}, \tau_{j}^{incu})$. Pairs for which this likelihood exceeds the joint likelihood expected from the incubation period distribution are over-represented relative to expectations, and vice versa. We introduce a down-sampling weight in accordance with the incubation period distribution as $w_{incu} = L_{exp}(\tau_{i}^{incu}, \tau_{j}^{incu}) / L_{kde}(\tau_{i}^{incu}, \tau_{j}^{incu})$, where $L_{exp}$ denotes the joint likelihood of the pair expected when the two incubation periods are drawn independently from the incubation period distribution.

To account for the "censoring" of the generation interval distribution due to quarantine/case isolation, we first exclude generation intervals where transmission occurred after isolation of the infector (only 4.3% of the reconstructed transmission events, attesting to the effectiveness of isolation). We then divide the generation intervals into three groups based on whether the infector's date of symptom onset falls within a given phase of epidemic control in Hunan. In Group 1, the illness onset of the infector occurred before Jan. 27 (Phase I); in Group 2, the illness onset of the infector occurred between Jan. 27 and Feb. 4 (Phase II); in Group 3, the illness onset of the infector occurred after Feb. 4 (Phase III). For a given generation interval $\tau_{GI}$ of an infector-infectee pair in each group, we denote:
• The time of symptom onset of the infector as $t_{onset}$.
• The time of case isolation/quarantine of the infector as $t_{iso}$.
• The time of transmission from the infector to the infectee as $t_{inf}$.
The distribution of the time from symptom onset to isolation, $\tau_{iso} = t_{iso} - t_{onset}$, in each phase $p$ is shown in Fig. S4. The corresponding cumulative probability distribution is denoted as $F_{OI}^{p}(\tau_{iso})$. The probability of this infector-infectee pair escaping the "censoring" due to quarantine and case isolation is $1 - F_{OI}^{p}(t_{inf} - t_{onset})$. For every $n$ observations of a generation interval under the interventions in place, there should be $n / [1 - F_{OI}^{p}(t_{inf} - t_{onset})]$ observations of $\tau_{GI}$ in the absence of intervention.
Thus, we denote the sampling weight adjusted for case isolation as $w_{iso} = 1 / [1 - F_{OI}^{p}(t_{inf} - t_{onset})]$, where $F_{OI}^{p}$ is the cumulative distribution of the time from symptom onset to isolation in phase $p$ of epidemic control, $t_{onset}$ is the infector's time of symptom onset, and $t_{inf}$ is the time of transmission. The overall resampling weight of the generation interval between infector $i$ and infectee $j$, considering both the incubation period distribution and censoring due to case isolation, is given by $w_{sample}(i, j) = w_{incu}(i, j) \, w_{iso}(i, j)$.

In contrast to the generation interval distribution, which characterizes the relative infectiousness of a SARS-CoV-2 infection over time with respect to the time of infection, we now focus on the interval between symptom onset and transmission. This shifts the reference point of the infectiousness profile from the time of infection to the time of symptom onset. Namely, the distribution of the time from symptom onset to transmission adjusted for case isolation, $\{\tau_{OT}^{adj.}\}$, represents the population-average relative infectiousness profile over time since the onset of symptoms. Of note, since we observe substantial pre-symptomatic transmission for SARS-CoV-2, negative values of $\tau_{OT}^{adj.}$ are allowed. Similarly to the previous section, we resample from $\{\tau_{OT}(i, j)\}$ with sampling weights $w_{sample}(i, j)$ until a sample of size $n = 10000$ is reached, to obtain the distribution of symptom onset to transmission $\{\tau_{OT}^{adj.}\}$. The resampled distribution represents the infector's relative infectiousness (population average) with respect to the infector's symptom onset (Fig. S6B). The best-fit distribution is a normal distribution.

A recent study (6) estimated the initial growth rate of the epidemic in Wuhan at 0.15 day$^{-1}$ (95% CI, 0.14 to 0.17) ahead of the lockdown. The estimate is based on the daily rise in reported cases by onset date; adjustment for increased reporting due to a broadening case definition places the growth rate at 0.08 day$^{-1}$ (6). The Euler-Lotka equation (7) describes the relationship between the basic reproduction number $R_{0}$, the epidemic growth rate $r$, and the generation interval distribution $g(\tau)$:

$$\frac{1}{R_{0}} = \int_{0}^{\infty} e^{-r\tau} g(\tau) \, d\tau.$$

We assume that no effective intervention had been implemented in Wuhan by the time of the lockdown (Jan. 23).
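The weighted resampling and the Euler-Lotka relation described above can be sketched together as follows. This is a minimal illustration with hypothetical function names, not the authors' implementation: the integral in the Euler-Lotka equation is approximated by a sample average of $e^{-r\tau}$ over the resampled generation intervals, which is a standard Monte Carlo approximation rather than a procedure stated in the text.

```python
import math
import random

def weighted_resample(values, weights, n=10_000, rng=random):
    """Draw n values with replacement, with probability proportional
    to the supplied sampling weights (hypothetical helper)."""
    return rng.choices(values, weights=weights, k=n)

def r0_euler_lotka(generation_intervals, growth_rate):
    """Sample version of 1/R0 = integral of exp(-r*tau) g(tau) dtau:
    R0 = 1 / mean(exp(-r * tau)) over the supplied intervals (days)."""
    m = sum(math.exp(-growth_rate * tau) for tau in generation_intervals)
    return len(generation_intervals) / m
```

As a sanity check of the formula, if every generation interval were exactly 5 days and the growth rate were 0.1 per day, the function would return $e^{0.5}$, i.e. about 1.65; these numbers are purely illustrative and are not estimates from the data.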
Using the generation time distribution adjusted for "censoring" due to quarantine and case isolation, $g_{GI}^{adj.}(\tau)$, described in the previous section, we estimate the basic reproduction number $R_{0}$ in Wuhan during the exponential growth phase. The corresponding basic reproduction number assuming 100% SARS-CoV-2 infection detection rate is given by: Similarly, following Section 3.3.1, $F_{OI}^{p}(\tau)$ gives the probability that transmission is blocked after time $\tau$ since symptom onset in the infector, for the three phases of epidemic control $p \in \{I, II, III\}$, from which the average risk can be estimated. In Fig. 3E, we visualize the transmission profile with respect to symptom onset time under control measures for all three phases of epidemic control (dashed lines).

We start by characterizing the controllability of SARS-CoV-2 (measured as the reproduction number $R$ under control measures) as a function of the infection isolation rate and the speed of case isolation/pre-symptomatic quarantine. In Fig. 3F, we plot the phase diagram of $R$ as a function of the infection detection proportion (fraction of all SARS-CoV-2 infections detected) and the mean time from symptom onset to isolation/quarantine $\tau_{iso}$. Contour lines indicate reductions in $R_{0}$ from baseline non-intervention conditions. It is worth noting that we do not know the precise prevalence of truly asymptomatic infections or their role in transmission. Here we assume that asymptomatic cases have an infectiousness profile over the course of infection similar in shape to that of symptomatic cases, with a peak of infectiousness corresponding to the time of symptom onset in symptomatic cases, as shown in Fig. S6. The corresponding $\tau_{iso}$ for asymptomatic cases is measured as the time from peak infectiousness to isolation.
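The controllability calculation outlined above can be illustrated with a deliberately simplified sketch. It is not the authors' model. It assumes that (i) the infectiousness profile relative to symptom onset is normal, with HYPOTHETICAL placeholder parameters since the fitted values are not reproduced here, (ii) the time from onset (or peak infectiousness) to isolation is normal with a standard deviation of 2 days, as the text goes on to state, (iii) isolation blocks all later transmission, (iv) undetected infections are unaffected, and (v) population-level contact reduction scales transmission proportionally. The function name and arguments are likewise hypothetical.

```python
from statistics import NormalDist

def r_under_control(r0, detect_frac, mean_iso_delay, contact_reduction=0.0,
                    profile_mean=0.0, profile_sd=3.0, iso_sd=2.0):
    """Reproduction number under isolation and distancing (sketch).

    profile_mean and profile_sd are hypothetical placeholders for the
    normal infectiousness profile relative to symptom onset (days).
    """
    # For independent normal transmission and isolation times, the share
    # of transmission falling after isolation is
    # P(T_trans > T_iso) = Phi((mu_trans - mu_iso) / sqrt(sd_trans^2 + sd_iso^2)).
    z = (profile_mean - mean_iso_delay) / (profile_sd ** 2 + iso_sd ** 2) ** 0.5
    blocked = NormalDist().cdf(z)
    return r0 * (1.0 - contact_reduction) * (1.0 - detect_frac * blocked)
```

Scanning `detect_frac` and `mean_iso_delay` (or `detect_frac` and `contact_reduction`) over a grid and marking where the result falls below 1 would produce a phase diagram of the same kind as those described in the text, though not necessarily with the same values.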
Here we assume that the distribution of the time from symptom onset/peak infectiousness to isolation follows a normal distribution with mean $\tau_{iso}$ and a standard deviation of 2 days. We further consider the synergistic effects of layering individual-based interventions (case isolation, contact tracing, and quarantine) with population-based interventions (i.e. physical distancing, measured as a reduction in effective contact rates). In Fig. 3G, we plot the phase diagram of $R$ as a function of the proportion of population-level contact reduction and the infection isolation rate, with an average speed of isolation of 0 days after symptom onset/peak infectiousness and a standard deviation of 2 days. The blue area indicates the region below the epidemic threshold, where control is achieved, and the red area indicates the region above the epidemic threshold. Last, we consider a sensitivity analysis with a lower baseline $R_{0} = 1.59$, using the growth rate of $r = 0.08$ observed in the Wuhan data with adjustment for changes in reporting (Section 3.4). In Fig. 3F, we plot the phase diagram of $R$ as a function of the percentage of population-level contact reduction (i.e. through physical distancing) and the isolation rate, assuming that SARS-CoV-2 infections are isolated 2 days after symptom onset/peak infectiousness on average, with a standard deviation of 2 days. The blue area indicates the region below the epidemic threshold and the red area indicates the region above the epidemic threshold.

In this section, we use a mixed effects multiple logistic regression model to evaluate the risk of SARS-CoV-2 transmission for each exposure reported in the contact tracing database. Each entry in the database represents a contact exposure between a SARS-CoV-2 infected individual and his/her contact.
For individuals who were in contact with a SARS-CoV-2 infected individual, the contact's age, sex, type of contact, start/end dates of exposure, and infection status (whether the exposed individual was eventually infected with SARS-CoV-2) are carefully documented (Section 1.2). All SARS-CoV-2 infected individuals (both primary cases and secondary infections via contact exposures) have unique identifiers that can be mapped to the SARS-CoV-2 patient line-list database, where additional information about the course of infection is also available (see Section 1.1 for detailed information). An individual in the contact tracing database can be exposed to multiple SARS-CoV-2 cases; further, an individual can be exposed to the same SARS-CoV-2 case through multiple independent exposures. All exposures are recorded independently. For each exposure in the contact-tracing database, the regression outcome is coded as 1 if the contact eventually becomes infected and 0 if not. For each exposure, a list of independent variables, their definitions, and corresponding values is shown in Table S3 (fixed effects in the mixed model).

We also introduce random effects for each SARS-CoV-2 case, representing the individual-level heterogeneity in infectiousness that is not explained by the independent variables representing fixed effects. These random effects also take into account the lack of independence of our observations. A contact could report more than one SARS-CoV-2 exposure.
If the contact eventually becomes infected, however, only one of the many exposures will be the actual source of infection. In this case, denoting the contact's number of exposures as $n_{expo}$, we enter each of the contact's $n_{expo}$ exposure entries in the database with two different outcomes: either the contact became infected (1 as regression outcome) with regression weight $1/n_{expo}$, or the contact avoided infection from the same exposure (0 as regression outcome) with regression weight $(n_{expo} - 1)/n_{expo}$. We remove contacts who become infected but also have a travel history to Wuhan (81/15646, <1%), as the infection could possibly originate from exposures in Wuhan in addition to exposure to local cases in Hunan.

A fraction of the regression variables have missing values in the contact-tracing database (see Table S3, column 3). We adopted the "Multivariate Imputation by Chained Equations" algorithm (8) (implemented in the R package "mice" version 3.9.1, https://cran.r-project.org/web/packages/mice/index.html) to impute missing values in the database. All independent variables in Table S3 are used as predictors for data imputation. The number of multiple imputations is set as 10 with each imputation running 10 realizations. For each of the 5 realizations of imputed contact-tracing databases, we independently perform mixed effects multiple logistic regression of the risk of SARS-CoV-2 transmission with all exposures and variables described in Table S3 as covariates. The regression is performed using the function "glmer" of the R package "lme4" (9) version 1.1-23 (https://cran.r-project.org/web/packages/lme4/index.html). The final odds-ratio estimates are pooled from the 5 independent regressions on 5 imputed databases using the "pool" function of the "mice" package, based on Rubin's rule (8). The odds ratios of independent variables, their 95% CIs, and the baseline odds (intercept) are reported in Fig. S1A.
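The weighting of multiple exposures for infected contacts, described at the start of this passage, can be sketched as a row-expansion step applied before a weighted logistic regression. This is an illustrative sketch only: the field names `contact_id` and `infected` and the function name are hypothetical, and the regression itself (fitted in R with "lme4" in the text) is not reproduced.

```python
from collections import Counter

def expand_exposures(exposures):
    """exposures: list of dicts with hypothetical keys 'contact_id' and
    'infected' (bool). Returns (exposure, outcome, weight) rows."""
    n_expo = Counter(e["contact_id"] for e in exposures)
    rows = []
    for e in exposures:
        n = n_expo[e["contact_id"]]
        if e["infected"] and n > 1:
            # Only one of the n exposures is the true source of infection.
            rows.append((e, 1, 1.0 / n))
            rows.append((e, 0, (n - 1.0) / n))
        else:
            # Uninfected contacts, or a single exposure: one row, full weight.
            rows.append((e, int(e["infected"]), 1.0))
    return rows
```

With this expansion, the outcome-1 weights of an infected contact sum to one across all of that contact's exposures, so each infection is counted once in total.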
To examine the model's fit to the data, we explore (i) how well the model reproduces the age profiles of infector-infectee pairs and (ii) whether the model captures the amount of transmission that occurs in different settings (household, family, transportation, etc.). We first randomly choose one of the five imputed contact-tracing databases. For each exposure entry in the imputed contact-tracing database, we calculate the model-predicted risk of infection based on all fixed variables in the regression. We simulate the infection status of the contact according to the predicted risk by drawing from a binomial distribution. We repeat the process for all contacts, and further simulate 100 realizations of projected infection databases to gauge variability. Fig. S1C shows the observed age distribution of the infector-infectee pairs in the original data, and Fig. S1B visualizes the projected age distribution based on the regression model, averaged over 100 realizations. Violin plots in Fig. S1D show the relative fraction (with projection uncertainties) of transmission that is explained by each type of contact, based on the model, while the dots in Fig. S1D represent the empirical observations. We find that the model accurately captures the strong assortativity of transmission in the 30-50 years age group, and the off-diagonals that represent transmission between different generations. Further, the model reproduces the relative contribution of different types of contacts seen in the empirical data (Fig. S1D).

As a sensitivity analysis of imputing missing data (especially addressing the issue of imputing "onset within exposure" for SARS-CoV-2 infected individuals who are asymptomatic), we perform a GLMM-logit regression with entries of missing data removed. We further break up the age bracket of the predictor "Age (case)" (Table S3) into 0-12 years, 12-25 years, 26-64 years, and 65+ years.
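The model check described above (simulating infection status from the predicted risk of each exposure and summarizing the share of transmission by contact type over repeated realizations) can be sketched as follows. This is a minimal illustration with a hypothetical function name and input layout; the predicted risks would come from the fitted regression, which is not reproduced here.

```python
import random
from collections import Counter

def simulate_setting_shares(exposures, n_real=100, rng=random):
    """exposures: list of (contact_type, predicted_risk) tuples.
    Returns, for each realization, the share of simulated infections
    attributed to each contact type."""
    shares = []
    for _ in range(n_real):
        # One Bernoulli draw per exposure, using the predicted risk.
        counts = Counter(ctype for ctype, risk in exposures
                         if rng.random() < risk)
        total = sum(counts.values())
        shares.append({c: k / total for c, k in counts.items()} if total else {})
    return shares
```

The spread of each contact type's share across realizations plays the role of the projection uncertainty that the violin plots summarize.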
We remove the predictor "onset within exposure"; however, for the predictor "clinical severity (case)", we break down the categories "mild & moderate" and "severe & critical" based on whether the onset of the primary case occurred within the exposure time window. "(-)" indicates symptom onset outside the exposure time window, while "(+)" indicates symptom onset within the exposure time window. The results of the regression are shown in Fig. S8.

While Section 4.1 addresses predictors of "per-contact" transmission risk heterogeneity, in this section we aim to characterize variation in the individual contact patterns of SARS-CoV-2 cases by type of contact. We are particularly interested in the impact of both individual-based and population-based interventions on contact rates. Intuitively, the overall transmission rate of an infectious individual can be interpreted as the sum of contact rates across contact categories weighted by the "per-contact" transmission risk. Thus, conditional on all other predictors, higher contact rates translate to higher transmission rates. We use regression analysis to model the individual contact patterns of each symptomatic SARS-CoV-2 case whose contacts are traced and documented in the contact-tracing database. We focus on symptomatic cases (the majority of our data) because we are particularly interested in contacts near the time of symptom onset, having previously shown that transmission risk is highest near symptom onset. We first define a time window $T_{symp.}$ of peak infectiousness as the 5 days before and after each case's symptom onset $t_{symp.}$. This time window accounts for a majority (86%) of the total infection risk of a typical symptomatic SARS-CoV-2 infection (Fig. S6B). In addition, we consider the 4 main contact types separately: community, social, family, and household contacts. For each symptomatic SARS-CoV-2 case $i$ and contact type $s$, we denote the number of contacts on day $t$ as $n_{i}^{s}(t)$.
Here each contact in $n_{i}^{s}(t)$ is weighted by the regression odds ratios of the GLMM-logit model, excluding the effects of the duration of exposure and of whether onset is within the exposure time window. The cumulative contact rate $\mathrm{CCR}_{i}^{s}$ within the time window $T_{symp.}$ for a given case $i$ is computed from these daily contacts together with the infectiousness profile with respect to symptom onset (Fig. S6B). Clearly, case isolation will impact an infected individual's contact rate, irrespective of whether the case is symptomatic. However, here we restrict our analysis to symptomatic cases, as the speed of case isolation and pre-symptomatic quarantine can be quantitatively measured as the time from isolation/pre-symptomatic quarantine to symptom onset.

To quantify the impact of socio-demographic factors and interventions on $\mathrm{CCR}_{i}^{s}$, we consider a negative binomial regression with $\mathrm{CCR}_{i}^{s}$ as the dependent variable and proxies of intervention intensity as independent variables. Specifically, we use a within-city mobility index as a proxy for the intensity of population-level social distancing, while we use the time between isolation and symptom onset to measure the intensity of individual-level interventions (here, case isolation). We also include demographic and clinical predictors as independent variables to adjust for age and sex differences, as well as other changes in contact patterns. A full description of all regression variables is shown in Table S4:
Fever: Whether the SARS-CoV-2 case had "fever". Cases without "fever" are the reference class.
Dry cough: Whether the SARS-CoV-2 case had "dry cough". Cases without the symptom "dry cough", i.e. Dry Cough (N), are the reference class.
Wuhan travel history: Whether the SARS-CoV-2 case reported a travel history to Wuhan. Cases without a travel history to Wuhan are the reference category.
Physical distancing (Before/After Jan. 25): Based on the within-city mobility index (Fig. S1A, inset) provided by Baidu Qianxi (10), we grouped the individual patients into categories depending on whether the patient's symptom onset occurred before or after January 25, 2020, corresponding to weak/strong physical distancing. Onset before Jan. 25 (weak physical distancing) is the reference class.
Time from case isolation to symptom onset: This is used as a proxy for individual-level intervention intensity. The larger the value, the earlier the case is isolated. Positive values indicate isolation before symptom onset; negative values indicate isolation after symptom onset.

The regression is performed using the function "glm.nb" of the R package "MASS" (11) version 7.3-51.6 (https://cran.r-project.org/web/packages/MASS/index.html). The point estimates of the rate ratios along with their 95% CIs for each of the variables are presented in the bottom panels of Fig. 2E. We identify an effect of interventions on contact rates, along with clinical factors; these effects tend to be most intense in the social and transportation settings.

Phase III of epidemic control (after Feb. 4): the distribution of time from onset to isolation has a median of -0.1 days with IQR (-2.9, 1.7) days.
Clinical Characteristics of Coronavirus Disease 2019 in China
Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan, China
Baseline Characteristics and Outcomes of 1591 Patients Infected with SARS-CoV-2 Admitted to ICUs of the Lombardy Region, Italy
Estimates of the severity of coronavirus disease 2019: a model-based analysis
Early epidemiological analysis of the coronavirus disease 2019 outbreak based on crowdsourced data: a population-level observational study
Temporal dynamics in viral shedding and transmissibility of COVID-19
Quantifying SARS-CoV-2 transmission suggests epidemic control with digital contact tracing. Science
Asymptomatic and presymptomatic SARS-CoV-2 infections in residents of a long-term care skilled nursing facility - King County
Presymptomatic transmission of SARS-CoV-2 - Singapore
Serial Interval of COVID-19 among Publicly Reported Confirmed Cases
Suppression of COVID-19 outbreak in the municipality of
Infectivity, susceptibility, and risk factors associated with SARS-CoV-2 transmission under intensive contact tracing in Hunan
Epidemiology and transmission of COVID-19 in 391 cases and 1286 of their close contacts in Shenzhen, China: a retrospective cohort study
Association of Public Health Interventions With the Epidemiology of the COVID-19 Outbreak in Wuhan, China
Impact assessment of non-pharmaceutical interventions against coronavirus disease 2019 and influenza in Hong Kong: an observational study
Changes in contact patterns shape the dynamics of the COVID-19 outbreak in China. Science
Potential roles of social distancing in mitigating the spread of coronavirus disease 2019 (COVID-19) in South Korea
Modeling the impact of social distancing testing contact tracing and household quarantine on second-wave scenarios of the COVID-19 epidemic
Evolving epidemiology and transmission dynamics of coronavirus disease 2019 outside Hubei province, China: a descriptive and modelling study
First-wave COVID-19 transmissibility and severity in China outside Hubei after control measures, and second-wave scenario planning: a modelling impact assessment
The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak. Science
An investigation of transmission control measures during the first 50 days of the COVID-19 epidemic in China
The early phase of the COVID-19 outbreak in Lombardy, Italy. arXiv
Early dynamics of transmission and control of COVID-19: a mathematical modelling study
Within city mobility index
Social Contacts and Mixing Patterns Relevant to the Spread of Infectious Diseases
Reactive school closure weakens the network of social interactions and reduces the spread of influenza
COVID-19 in Europe
Feasibility of controlling COVID-19 outbreaks by isolation of cases and contacts
Clinical and immunological assessment of asymptomatic SARS-CoV-2 infections
The natural history and transmission potential of asymptomatic SARS-CoV-2 infection
Probability of symptoms and critical disease after SARS-CoV-2 infection. arXiv
Factors that make an infectious disease outbreak controllable
Diagnosis and treatment guideline on pneumonia infection with 2019 novel coronavirus
Density estimation for statistics and data analysis
Groothuis-Oudshoorn, mice: Multivariate imputation by chained equations in R
Fitting linear mixed-effects models using lme4

The authors acknowledge Dr Christophe Fraser from the University of Oxford, Dr David Spiro from Fogarty International Center, National Institutes of Health, and Dr Peter Kilmarx from