key: cord-0713558-cm2sx1dm
authors: Hu, Shixiong; Wang, Wei; Wang, Yan; Litvinova, Maria; Luo, Kaiwei; Ren, Lingshuang; Sun, Qianlai; Chen, Xinghui; Zeng, Ge; Li, Jing; Liang, Lu; Deng, Zhihong; Zheng, Wen; Li, Mei; Yang, Hao; Guo, Jinxin; Wang, Kai; Chen, Xinhua; Liu, Ziyan; Yan, Han; Shi, Huilin; Chen, Zhiyuan; Zhou, Yonghong; Sun, Kaiyuan; Vespignani, Alessandro; Viboud, Cécile; Gao, Lidong; Ajelli, Marco; Yu, Hongjie
title: Author Correction: Infectivity, susceptibility, and risk factors associated with SARS-CoV-2 transmission under intensive contact tracing in Hunan, China
date: 2021-04-30
journal: Nat Commun
DOI: 10.1038/s41467-021-23108-w
sha: 5ab5209e4a75e47e8e3b0c5a3991291d2da001d9
doc_id: 713558
cord_uid: cm2sx1dm

Suspected COVID-19 cases

A suspected COVID-19 case is defined as a person who meets three clinical criteria, OR two clinical criteria and one of the epidemiological criteria:
a) Clinical criteria: i) acute respiratory illness; ii) radiographic evidence of COVID-19 viral pneumonia; iii) normal or decreased white blood cell count in the early stage of the disease and normal or decreased lymphocyte count.
b) Epidemiological criteria: i) history of travel to or residence in Wuhan, a domestic location reporting community transmission, or countries/territories/areas overseas reporting widespread SARS-CoV-2 transmission during the 14 days prior to symptom onset; ii) contact with any confirmed case during the 14 days prior to symptom onset; iii) cluster of contact with COVID-19 patients (nucleic acid amplification test positive) within 14 days before symptom onset, or with individuals with fever and/or symptoms of respiratory infection within 14 days.

We categorized confirmed COVID-19 cases according to their clinical severity, i.e., mild, moderate, severe, and critical case-patients. The details are presented in Tab. S1.
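The suspected-case rule stated above (three clinical criteria, or two clinical criteria plus an epidemiological criterion) can be sketched as a small predicate. The function and argument names are illustrative and not part of the original protocol, and reading "two clinical criteria and one epidemiological criterion" as "at least two" and "at least one" is an assumption.

```python
def is_suspected_case(clinical_met: int, epidemiological_met: int) -> bool:
    """Return True if the counts of met criteria satisfy the suspected-case rule.

    Assumption: "two clinical criteria and one epidemiological criterion"
    is read as "at least two" and "at least one".
    """
    # Each criteria list in the text has exactly three items.
    if not 0 <= clinical_met <= 3 or not 0 <= epidemiological_met <= 3:
        raise ValueError("each count must be between 0 and 3")
    all_three_clinical = clinical_met == 3
    two_clinical_plus_epi = clinical_met >= 2 and epidemiological_met >= 1
    return all_three_clinical or two_clinical_plus_epi


# Hypothetical examples: counts of criteria met by three different persons.
for clinical, epi in [(3, 0), (2, 1), (2, 0)]:
    print(clinical, epi, is_suspected_case(clinical, epi))
```

Passing counts rather than individual findings keeps the sketch independent of how each criterion is recorded in a surveillance system.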
An individual with an epidemiologic link is a SARS-CoV-2 infected individual who has either been exposed to a symptomatic or an asymptomatic individual, or had the same exposure as the SARS-CoV-2 infected individuals. Generally, epidemiologically linked cases include, but are not limited to, the household contacts of SARS-CoV-2 infected individuals (i.e., household members regularly living with the case), relatives (i.e., family members who had close contacts with the case but did not live with the case), social contacts (i.e., a work colleague or classmate), and other close contacts (i.e., caregivers and patients in the same ward, persons sharing a vehicle, and those providing a service for the case in public places) who have had close-proximity interactions (within 1 meter) with the index case-patient and have acquired SARS-CoV-2 infection.

The flowchart describing the selection criteria of the analyzed subjects is shown in

Since January 27, the designated hospitals and local Centers for Diseases Prevention and

The open reading frame 1ab gene (ORF1ab) and nucleocapsid gene (N) were amplified and tested. Results were reported positive when both the ORF1ab gene and N gene were positive. Specimens with a Ct-value of >=35 and <39.2 were retested for confirmation; a retest Ct-value of >=39.2 was treated as positive, otherwise negative.

Overall, the dynamics of the epidemic in Hunan followed an exponential growth before January 23, 2020, and a decrease in the number of cases after February 1, 2020.

Cluster size was defined as the total number of COVID-19 symptomatic cases and asymptomatic subjects in a cluster. We characterized 123 clusters with clear evidence of human-to-human transmission, which include 499 of the COVID-19 cases presented in Tab. S2. The cluster size distribution was bimodal, with most clusters comprising between 2 and 4 cases (94/123, corresponding to 76.4%). The largest cluster included 20 cases. The median cluster size was 3 (Tab. S2).

We estimated the time from infection to symptom onset (i.e., the incubation period) based on information about the likely exposure of confirmed COVID-19 cases. Only cluster cases with confirmed human-to-human transmission and no travel history to Wuhan/Hubei were included in the estimation. The rationale for this choice is that in multiple circumstances entire clusters took part in the same trip to/from Wuhan, thus preventing the unambiguous identification of the source of infection and of the transmission chain. Therefore, to provide more robust estimates and avoid multiple sources of bias, we excluded those clusters. The exposure information was provided in the form of a time interval bounded by the dates of the first and last possible exposure. If the exposure start date of a case was missing or preceded that of the first infector, it was replaced by the exposure start date of the first infector. For the remaining cases without a date of first exposure (17 individuals), the date was imputed using random numbers drawn from the gamma distribution that best fitted the observed intervals between first and last exposure. As a sensitivity analysis, the first exposure date of 7 individuals was imputed using the date when their infector came back to Hunan from Wuhan. Another sensitivity analysis was performed by excluding these 17 cases. We fitted the interval-censored exposure data by maximum likelihood and compared three distributions (Weibull, gamma, and lognormal). The goodness of fit was assessed using the Akaike information criterion (AIC). Results are presented in Tab. S3.

a. Sensitivity analysis performed based on 258 cases including 7 individuals for which the first exposure date was imputed using the date when their infector came back to Hunan from Wuhan.
b. Sensitivity analysis performed based on 251 cases (i.e., excluding 17 individuals without first exposure date).
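The interval-censored maximum likelihood fit and AIC comparison described above can be sketched with the Python standard library alone. This is a minimal illustration and not the authors' code: the exposure windows are invented, only the Weibull and lognormal candidates are shown (both have closed-form CDFs), and a coarse grid search stands in for a proper optimizer. All function names are hypothetical.

```python
import math

# Hypothetical incubation-time bounds in days for each case:
# (onset - last possible exposure, onset - first possible exposure).
INTERVALS = [(2.0, 6.0), (3.0, 9.0), (1.0, 5.0), (4.0, 10.0), (5.0, 8.0), (2.0, 12.0)]


def weibull_cdf(x, shape, scale):
    return 1.0 - math.exp(-((x / scale) ** shape)) if x > 0 else 0.0


def lognormal_cdf(x, mu, sigma):
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


def log_likelihood(cdf, params, intervals):
    """Sum of log[F(upper) - F(lower)] over interval-censored observations."""
    total = 0.0
    for lower, upper in intervals:
        mass = cdf(upper, *params) - cdf(lower, *params)
        if mass <= 0.0:
            return -math.inf
        total += math.log(mass)
    return total


def grid(start, stop, steps):
    return [start + i * (stop - start) / (steps - 1) for i in range(steps)]


def fit_by_grid(cdf, grid_a, grid_b, intervals):
    """Return (max log-likelihood, best parameter pair) over a coarse grid."""
    return max((log_likelihood(cdf, (a, b), intervals), (a, b))
               for a in grid_a for b in grid_b)


def aic(loglik, n_params):
    return 2 * n_params - 2 * loglik


weibull_fit = fit_by_grid(weibull_cdf, grid(0.5, 5.0, 46), grid(1.0, 15.0, 57), INTERVALS)
lognormal_fit = fit_by_grid(lognormal_cdf, grid(0.5, 3.0, 51), grid(0.1, 1.5, 29), INTERVALS)

# Lower AIC indicates the better trade-off between fit and complexity.
results = {"weibull": aic(weibull_fit[0], 2), "lognormal": aic(lognormal_fit[0], 2)}
best = min(results, key=results.get)
print(results, "best:", best)
```

Each case contributes the probability mass that the candidate distribution places on its admissible incubation interval, which is what distinguishes an interval-censored fit from fitting point values.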
We analyzed clusters of COVID-19 cases with known epidemiological links and no travel history to Wuhan/Hubei to estimate the interval between onset of symptoms in primary (index) cases and the onset of symptoms in secondary cases generated by these primary cases (i.e., the serial interval).

*Period was defined using the date of symptom onset of the infector in each transmission pair.

Following an approach similar to He, et al 2, and accounting for the correction proposed by Ashcroft, et al 3, we inferred the infectiousness profile (i.e., transmission probability from primary cases to a secondary case) using the serial intervals from confirmed transmission pairs combined with the incubation period distribution fitted in our analysis. We assumed that the infectiousness profile βc(tI - tS1) follows a gamma distribution with a time shift c to allow for the start of infectiousness (tI) c days prior to the date of symptom onset (tS1). The serial interval distribution f(tS2 - tS1) is then the convolution between the infectiousness profile and the incubation period distribution g(tS2 - tI), where tS2 is the date when the secondary case shows symptoms. The parameter vector θ, which includes the shape and scale of the gamma distribution and the time shift c, was estimated using maximum likelihood based on this convolution. The likelihood function allowed for the start of infectiousness to be around symptom onset and took into account the window of symptom onset (tS1l, tS1u). The results of the estimation are presented in the main text.

Generation time, that is, the time interval between infection of the primary case (tI1) and infection of the secondary cases (tI2) generated by such primary case, was inferred using the incubation period data combined with the infectiousness profile estimated in our analysis. We considered that infected cases would show symptoms at a certain time (tS) before or after onset of infectiousness.
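In the notation of the text, the convolution linking the serial interval to the infectiousness profile and the incubation period can be written out as below. The first line follows directly from the definitions given above. The second line is only one plausible way of averaging over the primary case's onset window (tS1l, tS1u); the original likelihood expression is not available in this copy, so its exact form, including the uniform weighting over the window, is an assumption.

```latex
% Serial-interval density as a convolution over the infection time t_I
f(t_{S2} - t_{S1}) = \int_{-\infty}^{\infty} \beta_c(t_I - t_{S1}) \, g(t_{S2} - t_I) \, dt_I

% Assumed (illustrative) likelihood when t_{S1} is only known to lie in (t_{S1l}, t_{S1u})
L(\theta) = \prod_{\text{pairs}} \frac{1}{t_{S1u} - t_{S1l}}
            \int_{t_{S1l}}^{t_{S1u}} f(t_{S2} - t_{S1}; \theta) \, dt_{S1}
```

Because g vanishes for negative arguments and βc vanishes more than c days before onset, the first integral effectively runs over tI between tS1 - c and tS2.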
Assuming that the distribution of generation time follows a gamma distribution φ(tI2 - tI1), the observed distribution of the incubation period g(tS - tI1) can be inferred as the convolution between the infectiousness profile βc(tI2 - tS) and the generation time distribution. We constructed a likelihood function based on this convolution and fitted it to the observed incubation period, with tI1 provided in the form of a time interval bounded by the dates of the first and last possible exposure (tE1, tE2). The shape parameter (α) and rate parameter (β) of the gamma distribution of the generation time were estimated using the maximum likelihood method. The mean generation time was estimated to be 5.7 days (median: 5.5 days, interquartile range: 4.5, 6.7 days) based on a gamma distribution (shape=10.56, rate=1.85).

Other key time-to-event distributions were estimated by using maximum likelihood. In particular, we estimated: i) the time from symptom onset to the date of collection of the first sample for PCR testing and ii) the time from symptom onset to laboratory confirmation. Three distributions (Weibull, gamma, and lognormal) with shift parameters allowing negative intervals were fitted and compared. The goodness of fit was assessed using AIC. As described above, the infectiousness profile peaked before the day of symptom onset. This may be driven by control measures such as the isolation of infectors. We estimated the distribution of the interval from symptom onset to the sampling date of the first PCR test and to laboratory confirmation in order to evaluate the timing of identification, isolation, and diagnosis of infectious individuals. Results are presented in the main text and Tab. S5 (where only the best fitting distribution is shown).

From the analysis of contact tracing records, we identified 8 clusters with evidence of asymptomatic transmission, as shown in Fig. S5.

a Contacts who were exposed to multiple cases of different generations of SARS-CoV-2 transmission.
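The generation time summary reported above can be checked against the stated gamma parameters (shape=10.56, rate=1.85). The sketch below uses only the standard library, with Simpson integration and bisection standing in for a statistics package; the helper names are hypothetical. The results should land close to the reported mean, median and interquartile range, with small differences expected because the published parameters are rounded.

```python
import math

SHAPE, RATE = 10.56, 1.85  # gamma parameters reported in the text


def gamma_pdf(x, shape=SHAPE, rate=RATE):
    if x <= 0:
        return 0.0
    log_pdf = (shape * math.log(rate) + (shape - 1) * math.log(x)
               - rate * x - math.lgamma(shape))
    return math.exp(log_pdf)


def gamma_cdf(x, steps=2000):
    """Composite Simpson's rule on [0, x]; steps must be even."""
    if x <= 0:
        return 0.0
    h = x / steps
    total = gamma_pdf(0.0) + gamma_pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * gamma_pdf(i * h)
    return total * h / 3


def gamma_quantile(p, lo=0.0, hi=50.0, tol=1e-6):
    """Invert the CDF by bisection on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gamma_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


mean = SHAPE / RATE  # mean of a gamma distribution is shape / rate
median = gamma_quantile(0.5)
q1, q3 = gamma_quantile(0.25), gamma_quantile(0.75)
print(f"mean={mean:.2f}, median={median:.2f}, IQR=({q1:.2f}, {q3:.2f})")
```

The mean follows in closed form from shape/rate, while the median and quartiles have no closed form for a gamma distribution and are obtained numerically.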
*Significance was tested using a two-sided Wald test with α=0.05.

We analyzed the odds ratio of SARS-CoV-2 transmission given the characteristics of the infectors and their contacts. To account for the clustering effect of an infector and a cluster, mixed-effect logit models (i.e., generalized linear mixed-effect models, GLMM, for binary data with the logit link) were used to explore potential drivers of the susceptibility and infectivity of SARS-CoV-2. The GLMM specification comprises the following terms:
• a logit link function g;
• the intercept;
• the fixed effect of the age group of the infector in the successful (1) or unsuccessful (0) transmission event;
• the age group of the contact (potential infectee) in the successful/unsuccessful transmission event;
• the type of contact that occurred in the successful/unsuccessful transmission event;
• the generation of the successful/unsuccessful transmission event;
• the number of close contacts of the infector involved in the successful/unsuccessful transmission event;
• the gender of the infector in the successful/unsuccessful transmission event;
• the gender of the contact in the successful/unsuccessful transmission event;
• an indicator of whether the infector involved in the successful/unsuccessful transmission event is symptomatic or asymptomatic;
• the observation period for an infector/contact involved in the successful/unsuccessful transmission event;
• two random effects attributed to an infector and a cluster, respectively.
The mean of the response variable is taken conditional on a given value of the random effects.

The results of the multivariate analysis based on GLMM are presented in Table S9. The results for fixed effects, including 3 age groups for infector's and infectee's age, are presented in Table S10 and Figure S8.
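Because the model equation itself is missing from this copy, the following is a hedged reconstruction of its general shape from the term definitions listed above. The symbol names (the β coefficients, u0, u1, and the covariate labels) are assumptions chosen for readability, and giving each covariate a single coefficient is a simplification, since categorical covariates would expand into several indicator terms.

```latex
% Assumed notation: i indexes a successful (1) or unsuccessful (0) transmission event
g(\mu_i) = \operatorname{logit}(\mu_i) = \beta_0
  + \beta_1\,\mathrm{Age\_infector}_i + \beta_2\,\mathrm{Age\_contact}_i
  + \beta_3\,\mathrm{Contact\_type}_i + \beta_4\,\mathrm{Generation}_i
  + \beta_5\,\mathrm{No\_contacts}_i + \beta_6\,\mathrm{Sex\_infector}_i
  + \beta_7\,\mathrm{Sex\_contact}_i + \beta_8\,\mathrm{Symptomatic}_i
  + \beta_9\,\mathrm{Period}_i + u_0 + u_1,
\qquad
\mu_i = E\left[\, Y_i \mid (u_0, u_1) \,\right]
```

Under this form, the exponential of a fixed-effect coefficient is the odds ratio for that covariate, conditional on the infector-level and cluster-level random effects.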
To evaluate the disaggregated effects of age, we also used log-transformed continuous age variables (i.e., age of infectors and contacts) (Tab. S11). The goodness-of-fit evaluation was based on the estimates provided in Table S12. Model diagnostic measures and residual plots (Fig. S7) were evaluated with DHARMa residual diagnostics for hierarchical models 4. To further explore how the probability of SARS-CoV-2 infection changes with a change in each covariate, the average marginal effects of the age of infectors and contacts and of the type of contact between infector and contact were estimated across all contacts, holding the effects of the other covariates constant (Fig. S9).

In addition, to explore possible non-linearity in the association of age and of the number of contacts with SARS-CoV-2 transmission, we used generalized additive mixed models (GAMM). We used the same specifications as in the GLMM models presented in the main text. The summary of the results of the GAMM models is shown in Fig. S10 and in Tab. S13. The results suggest that the risk of SARS-CoV-2 transmission increases monotonically with the age of contacts and with the number of the infector's contacts. This is consistent with the patterns shown by the GLMM models.

To explore the possible effect of timing on the results of the regression analysis, we also introduced an additional variable identifying the three phases of the epidemic:
1. Before the level 1 emergency response was activated in Hunan province (Jan 24);
2. After the level 1 emergency response activation, but before the growth of cases was reversed (Fig. S2);
3. After the outbreak growth was reversed.

Table S10. Stepwise regression analysis of factors associated with the probability of acquiring SARS-CoV-2 infections in generalized linear mixed models with categorized age of infectors and contacts.
Columns: No. of contact; Step 1-1; Step 1-2; Step 1-3; Step 1-4; Step 1-5; Step 1-6 a; Step 1-7.
c Referring to three phases of epidemic control and major changes in COVID-19 case definition, period of observation was defined using quarantine and isolation date, as well as date of diagnosis.
Note that significance was tested using two-sided Wald test with α=0.05.

Table S11. Stepwise regression analysis of factors associated with the probability of acquiring SARS-CoV-2 infections in generalized linear mixed models with log-transformed age of infectors and contacts.
Columns: No. of contact; Step 2-1; Step 2-2; Step 2-3; Step 2-4; Step 2-5.
a Contacts who were exposed to multiple cases of different generations of SARS-CoV-2 transmission.
c Referring to three phases of epidemic control and major changes in COVID-19 case definition, period of observation was defined using quarantine and isolation date, as well as date of diagnosis.
Note that significance was tested using two-sided Wald test with α=0.05.

(the reference group [red lines] was contacts who were exposed to an index case-patient); (d) The probability of SARS-CoV-2 infections at a given age of contacts in a specific setting. Note that the box plots in panels a to c show the point estimates and 95% confidence intervals of the relative risk of SARS-CoV-2 infections as compared to the reference group. The lines and shaded areas in panel d represent the point estimates and 95% confidence intervals for the probability of SARS-CoV-2 infections, respectively. Note that significance was tested using two-sided Wald test with α=0.05.

Columns: OR (95%CI); P-value.
Note that significance was tested using two-sided Wald chi-square test with α=0

GAMM-predicted non-linear 3-knot splines for age of infector

References
1. National Health Commission of the People's Republic of China. Diagnosis and treatment guideline on pneumonia infection with 2019 novel coronavirus.
2. He, et al. Temporal dynamics in viral shedding and transmissibility of COVID-19.
3. Ashcroft, et al. COVID-19 infectivity profile correction.
4. Residual Diagnostics for Hierarchical (Multi-Level / Mixed) Regression Models.