key: cord-0835752-ygrg6rhf authors: Wang, Kai; Zhao, Shi; Liao, Ying; Zhao, Tiantian; Wang, Xiaoyan; Zhang, Xueliang; Jiao, Haiyan; Li, Huling; Yin, Yi; Wang, Maggie H; Xiao, Li; Wang, Lei; He, Daihai title: Estimating the serial interval of the novel coronavirus disease (COVID‐19) based on the public surveillance data in Shenzhen, China from January 19 to February 22, 2020 date: 2020-05-26 journal: Transbound Emerg Dis DOI: 10.1111/tbed.13647 sha: dee5e89ec7d4c9163192a57901bd9c3fd9584ebf doc_id: 835752 cord_uid: ygrg6rhf BACKGROUND: The novel coronavirus disease (COVID‐19) poses a serious threat to global public health and economics. The serial interval (SI), the time between the symptom onsets of a primary case and a secondary case, is a key epidemiological parameter. We estimated the SI of COVID‐19 in Shenzhen, China based on 27 records of transmission chains. METHODS: We adopted three parametric models (Weibull, Lognormal and Gamma distributions) and an interval censored likelihood framework. The three models were compared using the corrected Akaike information criterion (AICc). We also fitted the epidemic curve of COVID‐19 to the exponential growth to estimate the reproduction number. FINDINGS: Using a Weibull distribution, we estimated the mean SI at 5.9 days (95%CI: 3.9−9.6) and the standard deviation (SD) at 4.8 days (95%CI: 3.1−10.1). Using a logistic growth model, we estimated the basic reproduction number in Shenzhen at 2.6 (95%CI: 2.4−2.8). CONCLUSION: The SI of COVID‐19 is relatively shorter than that of SARS and MERS, two other betacoronavirus diseases, which suggests that the iteration of transmission is rapid. It is crucial to isolate close contacts promptly to control the spread of COVID‐19 effectively.
The coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), had led to 96,274 confirmed cases and 3,304 deaths as of March 5, 2020, since it emerged at the end of 2019 in Wuhan, China (National Health Commission of the People's Republic of China, 2020). The symptomatic cases presented with atypical pneumonia, with a high mortality rate of about 17% among severe cases. Common symptoms of the infection include fever, cough and shortness of breath (Guan et al., 2020; Li et al., 2020). The cases in the epicenter, Wuhan city, and in Hubei province accounted for 60.96% and 83.48% of the national total, respectively.

Mathematical epidemic models have been widely used to estimate the spread and other health outcomes of COVID-19, as well as the disease burden. These models depend on knowledge of key transmission parameters. One of the most important of these parameters is the serial interval (SI), the time between the symptom onsets of the infector and the infectee in a transmission chain. The SI is important because it is commonly used to approximate the generation interval (GI, the time between the infection of a primary case and the infection of a secondary case), which is not observable. Together, the distribution of GI and the time-varying reproductive number (how many secondary cases may be caused by a typical primary case at a given moment) determine the whole course of the epidemic. Conventionally, SI is used to approximate GI, although GI is strictly positive whereas an observed SI need not be. This approximation is widely used in analyzing epidemics of infectious diseases including influenza, measles, Zika virus and Ebola virus (Zhou et al., 2019; Te Beest et al., 2014; Cowling et al., 2009; Black and Ross, 2013; Orellano et al., 2012; Archer et al., 2012; Yom-Tov et al., 2015).

From January 19 to February 22, 2020, the total number of COVID-19 cases in Shenzhen was 417.

This article is protected by copyright.
All rights reserved.

We collected the confirmed COVID-19 cases from the Health Commission of Shenzhen Municipality (Shenzhen Municipal Health Commission, 2020). According to the latest figures released by the Health Commission of Shenzhen Municipality, the total number of confirmed cases in Shenzhen was 417 as of February 22, and remained unchanged up to February 27 (Figure 1). As shown in Figure 2, we examined the transmission events from the publicly released information and identified 27 transmission chains, including 23 infectees matched with only one infector. The transmission events were extracted independently by 2 authors. The final list of included cases was decided by discussion between the authors, with full agreement required prior to inclusion.

The SI data of COVID-19 were fitted to three parametric probability distributions, namely the Weibull, Lognormal and Gamma distributions, with mean μ and standard deviation (SD) σ, denoted by d(•|μ,σ) (Cowling et al., 2009; Zhao et al., 2020a). The interval censored likelihood of SI estimates is defined in the following equation.

$$L(\mu, \sigma) = \prod_{i} \int_{D_i^{\min}}^{D_i^{\max}} u(x)\, d(\delta_i - x \mid \mu, \sigma)\, dx$$

Here, u(•) was the probability density function (PDF) of exposure following a uniform distribution with a range from $D^{\min}$ to $D^{\max}$. For the range of onset dates of multiple infectors linked to the i-th infectee, the terms $D_i^{\min}$ and $D_i^{\max}$ denoted the lower and upper bounds, respectively. The $\delta_i$ was the observed onset date of the i-th infectee. The estimates of μ and σ were calculated using the maximum likelihood estimation approach. Their 95% confidence intervals (95%CI) were calculated using the profile likelihood estimation framework, with the cutoff threshold determined by a Chi-square quantile (Fan and Huang, 2005). The three models were compared using the corrected Akaike information criterion (AICc) to evaluate their fitting performance (Burnham and Anderson, 2004).
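The interval censored likelihood and the AICc comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the `records` array is hypothetical and is not the Shenzhen data, the names `neg_log_lik` and `aicc` are ours, only the Weibull model is fitted, and the integral over a uniform onset window is evaluated in closed form through the Weibull CDF.

```python
import numpy as np
from scipy import optimize, stats

# Hypothetical records, NOT the Shenzhen data: (d_min, d_max, delta) in days.
# d_min..d_max is the infector onset window; delta is the infectee onset day.
records = np.array([
    (0, 0, 4), (1, 1, 3), (2, 5, 9), (3, 3, 13), (0, 2, 6),
    (4, 4, 5), (5, 8, 12), (6, 6, 8), (2, 2, 18), (7, 9, 11),
], dtype=float)


def neg_log_lik(log_params, data):
    """Negative interval-censored log-likelihood for a Weibull SI."""
    shape, scale = np.exp(log_params)  # optimise on the log scale: both > 0
    si = stats.weibull_min(c=shape, scale=scale)
    d_min, d_max, delta = data.T
    width = d_max - d_min
    exact = width == 0
    lik = np.empty(len(data))
    # Exact infector onset: the integral collapses to the density itself.
    lik[exact] = si.pdf(delta[exact] - d_min[exact])
    # Uniform window: integral of u(x) * d(delta - x) dx via the CDF.
    cens = ~exact
    lik[cens] = (si.cdf(delta[cens] - d_min[cens])
                 - si.cdf(delta[cens] - d_max[cens])) / width[cens]
    return -np.sum(np.log(np.maximum(lik, 1e-300)))


def aicc(nll, n_params, n_obs):
    """Corrected AIC from a negative log-likelihood."""
    aic = 2 * n_params + 2 * nll
    return aic + 2 * n_params * (n_params + 1) / (n_obs - n_params - 1)


fit = optimize.minimize(neg_log_lik, x0=np.log([1.5, 6.0]),
                        args=(records,), method="Nelder-Mead")
shape_hat, scale_hat = np.exp(fit.x)
si_hat = stats.weibull_min(c=shape_hat, scale=scale_hat)
mean_si, sd_si = si_hat.mean(), si_hat.std()
aicc_weibull = aicc(fit.fun, n_params=2, n_obs=len(records))
print(f"mean={mean_si:.2f} sd={sd_si:.2f} AICc={aicc_weibull:.1f}")
```

Repeating the fit with `stats.lognorm` and `stats.gamma` in place of `stats.weibull_min` would give the other two AICc values for comparison; the smallest AICc indicates the preferred model.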
With SI estimated, we fitted a logistic growth model to the time series of the number of local COVID-19 cases in Shenzhen (Zhao et al., 2019), from which the epidemic growth rate, denoted by γ, can be estimated. We estimated the basic reproduction number (R0) by using the formula R0 = 1/M(−γ) (Wallinga and Lipsitch, 2007), where M(•) is the moment generating function of the PDF of SI (i.e., a proxy of GI).

Among the 27 confirmed infectees, 11 are male and 16 are female. Of these 27 patients, 17 (63%) were infected by relatives who had been to Wuhan (i.e., transmission within the family). The ages of these infectees have a mean of 37.4, a median of 37.0, an interquartile range (IQR) between 32.0 and 48.5, and a range from 2 to 78 years. The infectees are relatively young, which reflects Shenzhen's relatively young population as a rapidly developed special economic region. The observed SIs of all 27 samples have a sample mean of 5.9, a sample median of 4, an IQR between 2 and 10, and a range from 1 to 16 days. In order to calculate the population mean and population standard deviation, we fitted the sample to three distributions and summarized the results in Table 1. We found that the AICc slightly favored the Weibull distribution model (a smaller AICc implies a better fit). Figure 3 shows the likelihood profiles with respect to the μ and σ of SI under the Weibull distribution model, which was consistent with the estimated SI of influenza in (Zhou et al., 2019). We estimated the mean of SI at 5.9 days (95%CI: 3.9−9.6) and the SD of SI at 4.8 days (95%CI: 3.1−10.1). The fitted Weibull distribution is shown in Figure 4. Combining these findings with SI, we can obtain relatively reliable estimates of the basic reproduction number (R0) in Shenzhen.
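The step from growth rate to R0 can be illustrated with the Wallinga and Lipsitch (2007) relation, in which R0 is the reciprocal of the moment generating function of the SI distribution evaluated at −γ. The sketch below is an assumption-laden illustration rather than the authors' code: it recovers a Weibull shape and scale from the reported mean and SD by moment matching (the paper reports only μ and σ), evaluates the moment generating function by numerical integration, and uses the hypothetical function names `weibull_from_moments` and `r0_from_growth_rate`.

```python
import numpy as np
from scipy import integrate, optimize, special, stats


def weibull_from_moments(mean, sd):
    """Weibull (shape, scale) whose mean and SD match the inputs."""
    cv = sd / mean

    def cv_gap(k):
        # Coefficient of variation of a Weibull with shape k, minus target.
        g1 = special.gamma(1 + 1 / k)
        g2 = special.gamma(1 + 2 / k)
        return np.sqrt(g2 / g1**2 - 1) - cv

    shape = optimize.brentq(cv_gap, 0.2, 20.0)
    scale = mean / special.gamma(1 + 1 / shape)
    return shape, scale


def r0_from_growth_rate(growth_rate, mean, sd):
    """R0 = 1 / M(-growth_rate), with M the MGF of the SI distribution."""
    shape, scale = weibull_from_moments(mean, sd)
    si = stats.weibull_min(c=shape, scale=scale)
    # M(-r) = E[exp(-r * T)], integrated numerically over T >= 0.
    mgf, _ = integrate.quad(
        lambda t: np.exp(-growth_rate * t) * si.pdf(t), 0, np.inf)
    return 1.0 / mgf


# Point estimates reported in the text: growth rate 0.22 per day,
# SI mean 5.9 days, SI SD 4.8 days.
shape, scale = weibull_from_moments(5.9, 4.8)
r0 = r0_from_growth_rate(0.22, 5.9, 4.8)
print(f"Weibull shape={shape:.2f}, scale={scale:.2f}, R0={r0:.2f}")
```

The printed R0 can be compared against the reported estimate of 2.6; small differences are expected because this sketch matches moments instead of reusing the fitted Weibull parameters. A growth rate of zero returns R0 = 1, which is a quick sanity check on the formula.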
In order to obtain the basic reproductive number in Shenzhen, we excluded 297 imported cases. We found that the logistic growth model fitted the observed epidemic curve well, as shown in Figure 5. The intrinsic growth rate (γ) was estimated at 0.22 (95%CI: 0.19−0.26). We estimated R0 at 2.6 (95%CI: 2.4−2.8), which is in line with previous estimates (Li et al., 2020; Zhao et al., 2020b; Wu et al., 2020; Jung et al., 2020). Similarly, Wu et al. estimated that R0 for COVID-19 was 2.68 (95%CI: 2.47−2.86) through a susceptible-exposed-infectious-recovered metapopulation model (Wu et al., 2020).

It is of great importance to estimate the SI of COVID-19 accurately in order to interpret the transmission characteristics of the disease. A shorter SI implies a faster growth rate (for a moderate R0), a steeper epidemic curve, and a higher replacement rate of generations. We estimated that the mean SI of COVID-19 in Shenzhen was 5.9 days (95%CI: 3.9−9.6). Zhao et al. (Zhao et al., 2020) estimated the mean SI at 4.4 days from a Gamma distributed model based on 21 records of transmission chains in Hong Kong SAR, China. Meanwhile, the median serial interval was estimated at 4.6 days by Nishiura et al. (Nishiura et al., 2020) through a Weibull distribution model from 28 records, which is one of the earliest estimates. You et al. (You et al., 2020) found that the serial interval had an average of 4.41 days with a standard deviation of 3.17 days. The serial intervals in the above papers were shorter than that in a published study (Guan et al., 2020), which was based on six known infector-infectee pairs in the early outbreak of COVID-19 and used a gamma distribution (mean = 7.5 days, SD = 3.4 days). Du et al. (Du et al., 2020) obtained data on 468 COVID-19 transmission events reported in mainland China outside of Hubei Province, and 12.6% of the serial intervals in their sample were negative.
However, negative values of the serial interval should be treated with caution. The purpose of estimating the serial interval is to approximate the generation interval, which is not observable but is strictly positive. The generation interval together with the reproductive number governs the progression of an epidemic. The sample size of our study is small, and the serial intervals in our cases were all nonnegative. Indeed, we found some short serial intervals, such as 1 day, in our samples. Nishiura et al. (Nishiura et al., 2020) did not find negative serial intervals among 28 infector-infectee pairs from multiple countries. Negative or very short serial intervals (such as 1 or 2 days) imply pre-symptomatic transmission and highlight the difficulty of control and the rapid spread. Moreover, the serial intervals for COVID-19 estimated by recent studies are evidently shorter than those of MERS (Assiri et al., 2013) and SARS (Lipsitch et al., 2003), which are 7.6 days and 8.4 days, respectively. Timely isolation is essential to epidemic prevention when the serial interval is short, because the shorter the SI is, the faster the disease spreads.

The relatively short SI implies that COVID-19 was transmitted rapidly with a moderate R0. Thus, swift actions in isolation, prevention and control are crucial to cut off the transmission. However, the sample size in this work is relatively small, even though it is already larger than the initial study of six cases.

How generation intervals shape the relationship between growth rates and reproductive numbers
Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study
Estimation of the Time-Varying Reproduction Number of COVID-19 Outbreak in China. medRxiv
Estimating the secondary attack rate and serial interval of influenza-like illnesses using social media
Simple framework for real-time forecast in a data-limited situation: the Zika virus (ZIKV) outbreaks in Brazil from 2015 to 2016 as an example
Estimating the serial interval of the novel coronavirus disease (COVID-19): A statistical analysis using the public data in Hong Kong from
Preliminary estimation of the basic reproduction number of novel coronavirus (2019-nCoV) in China