key: cord-1040061-qi88fun2 authors: Zhang, Lei; Tao, Yusha; Wang, Jing; Ong, Jason J.; Tang, Weiming; Zou, Maosheng; Bai, Lu; Ding, Miao; Shen, Mingwang; Zhuang, Guihua; Fairley, Christopher K title: Early characteristics of the COVID-19 outbreak predict the subsequent epidemic size date: 2020-06-02 journal: Int J Infect Dis DOI: 10.1016/j.ijid.2020.05.122 sha: cd33660078241c5a1338cc9670afd5b1204eed39 doc_id: 1040061 cord_uid: qi88fun2
OBJECTIVES: The largely resolved first wave of the COVID-19 epidemic in China provides a unique opportunity to investigate how the early characteristics of the COVID-19 outbreak predict its subsequent size. METHODS: We collected publicly available COVID-19 epidemiological data from 436 Chinese cities from 16th January to 15th March 2020. Based on 45 cities which reported >100 confirmed cases, we examined the correlation between early-stage epidemic characteristics and subsequent epidemic size. RESULTS: We identified a transition point from a slow- to a fast-growing phase for COVID-19 at 5.5 (95% CI, 4.6-6.4) days after the first report, and 30 confirmed cases marked a critical threshold for this transition. The average time for the number of confirmed cases to increase from 30 to 100 (time from 30-to-100) was 6.6 (5.3-7.9) days, and the average case-fatality rate in the first 100 confirmed cases (CFR-100) was 0.8% (0.2-1.4%). The subsequent epidemic size per million population was significantly associated with both of these indicators. We predicted a ranking of epidemic size in the cities based on these two indicators and found it highly correlated with the actual ranking of epidemic size. CONCLUSIONS: Early epidemic characteristics are important indicators for the size of the entire epidemic.
The recent outbreak of the coronavirus SARS-CoV-2 has led to a worldwide pandemic (Cucinotta and Vanelli, 2020) with substantial social, health and economic costs (UN News, 2020).
The epidemic spread quickly from Wuhan to the neighbouring cities and then to the rest of China, possibly exacerbated by the 'travel rush' for the Lunar New Year. On 23rd January 2020, the Chinese government took the unprecedented step of introducing a 'metropolitan-wide quarantine' of the city of Wuhan, terminating all public transportation in the city and all intercity links (Chen et al., 2020, Zhang Lei et al., 2020). Within days of implementing this quarantine, an additional 12 major cities in Hubei province were similarly quarantined (Sina news, 2020). Within a week of the quarantine, all 31 provinces of mainland China initiated the highest level of public health emergency response (Chinese Center for Disease Control and Prevention, 2020). The strict nationwide control has been highly effective (Chinazzi et al., 2020, Pan et al., 2020, Tian et al., 2020), with the daily number of reported confirmed cases falling from 3000-4000 at its peak to 10 cases or fewer a day in mid-March (World Health Organization, 2020a).
Journal Pre-proof
The COVID-19 epidemic has evolved into a worldwide pandemic. As of 24th May, over 5,200,000 cases have been reported in over 200 countries. The epicentre has moved from China to Europe, to the United States, and now to South America (World Health Organization, 2020a). The epidemic in China preceded epidemics in other countries and has now largely resolved (Kupferschmidt and Cohen, 2020). The Chinese data provides a unique opportunity to understand how the early characteristics of an epidemic may predict its subsequent epidemic size. This may provide a useful guide for understanding the development of the COVID-19 epidemics in cities in other global settings. We hypothesise that the early characteristics of an outbreak would be good indicators to forecast the subsequent size of the epidemic.
Previous studies have demonstrated that early epidemiological indicators, such as the basic reproduction ratio, may determine the peak of an epidemic and the epidemic level in the long term (Holme and Masuda, 2015, Ridenhour et al., 2014). This study aims to investigate the association between the early characteristics of the COVID-19 epidemic and the size of the epidemic assessed in its late stages. This knowledge could provide public health policymakers with an indication of the likely size of an epidemic and, therefore, the urgency of the control measures that may be required.
We collected publicly available data from 436 Chinese cities, inclusive of prefectures and municipalities, that reported on cases of COVID-19 (number of confirmed cases, deaths and recovered cases) from 16th January to 15th March 2020, when the epidemic in China was largely resolved. Our primary data source was Dingxiangyuan (DXY) (https://ncov.dxy.cn/), an online platform built by members of the Chinese medical community, which integrated COVID-19 epidemic information from both local media and government reports. We included a total of 45 cities with more than 100 confirmed cases in our analysis. In contrast, 391 cities reported fewer than 100 cases over the two months, and the epidemics in these cities were presumably controlled by the nationwide emergency public health response. These cities were not included in our analysis. The analysis covered the 31 provinces of mainland China and excluded the two Chinese special administrative regions (Macau and Hong Kong) and Taiwan, because they were under a different epidemic surveillance system and their epidemics underwent major growth after 15th March. We also excluded the city of Jining, Shandong province, as most of its reported cases were due to an outbreak in a prison.
For included cities, we obtained the population size of each city from the statistical yearbooks of these cities or their respective provinces.
For each of the 45 Chinese cities, we used the Joinpoint software (https://surveillance.cancer.gov/joinpoint/) to identify the trend and transition point of the epidemic during its initial phase, based on the first 100 confirmed cases. We allowed a maximum of one joinpoint (corresponding to two time intervals), and the Joinpoint software automatically determined whether a two-phase fit could be obtained (National Cancer Institute, 2020). We identified: (1) the time of the transition point between the two phases; (2) the number of cases at the transition point; (3) the growth rate of the first (slow-growing) phase; and (4) the growth rate of the second (fast-growing) phase. For each model calibration, we estimated the sum of squared errors (SSE) and mean squared error (MSE) to assess the goodness of fit of the joinpoint models (Table S2). Datasets with fewer than six data points were automatically fitted with a single-phase model to avoid over-parameterisation. We found that the majority of transition points occurred below 30 cases (Table 1, Table S1) and hence regarded 30 cases as an important threshold at which the epidemic changed from a slow-growing to a fast-growing phase.
We also estimated three additional predictors based on the first 100 confirmed cases, namely: (1) the days required to increase from 30 to 100 cases (time from 30-to-100); (2) the case-fatality rate among the first 100 confirmed cases (CFR-100); and (3) the case recovery rate among the first 100 confirmed cases. The 'first 100 cases' was taken as the number of confirmed cases on the day the 100th confirmed case was reported.
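The early-stage descriptors defined above can be illustrated with a minimal Python sketch. It is not the Joinpoint software: `two_phase_fit` is a simplified stand-in that tries every split of the series into two separately fitted straight lines and keeps the split with the lowest total SSE, without Joinpoint's constraint that the segments meet or its model-selection tests. All function names, the `min_points` parameter and the example series are hypothetical, and taking CFR-100 as cumulative deaths divided by cumulative cases on the day the count first reaches 100 is our reading of the definition above.

```python
from statistics import mean


def ols_line(xs, ys):
    """Least-squares line y = a + b*x; returns (intercept, slope, SSE)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, sse


def two_phase_fit(days, cum_cases, min_points=2):
    """Simplified stand-in for a one-joinpoint fit (assumption, not Joinpoint)."""
    if len(days) < 6:
        # Mirror the paper's rule: short series get a single-phase model.
        _, slope, sse = ols_line(days, cum_cases)
        return {"transition_day": None, "cases_at_transition": None,
                "slopes": (slope,), "sse": sse}
    best = None
    for k in range(min_points, len(days) - min_points + 1):
        _, slow, sse1 = ols_line(days[:k], cum_cases[:k])
        _, fast, sse2 = ols_line(days[k:], cum_cases[k:])
        if best is None or sse1 + sse2 < best["sse"]:
            # The last point of the first segment is taken as the transition.
            best = {"transition_day": days[k - 1],
                    "cases_at_transition": cum_cases[k - 1],
                    "slopes": (slow, fast), "sse": sse1 + sse2}
    return best


def first_day_reaching(days, cumulative, threshold):
    """First reporting day with a cumulative count >= threshold, else None."""
    for day, count in zip(days, cumulative):
        if count >= threshold:
            return day
    return None


def time_30_to_100(days, cum_cases):
    """Days between first reaching 30 and first reaching 100 confirmed cases."""
    d30 = first_day_reaching(days, cum_cases, 30)
    d100 = first_day_reaching(days, cum_cases, 100)
    return None if d30 is None or d100 is None else d100 - d30


def cfr_100(cum_cases, cum_deaths):
    """Deaths / cases on the day the 100th case was reported (assumed reading)."""
    for cases, deaths in zip(cum_cases, cum_deaths):
        if cases >= 100:
            return deaths / cases
    return None


# Invented series for illustration only; not data from the study.
days = list(range(10))
cases = [2, 5, 9, 14, 22, 31, 48, 70, 104, 150]
deaths = [0, 0, 0, 0, 0, 0, 1, 1, 2, 3]
fit = two_phase_fit(days, cases)
summary = (fit["transition_day"], time_30_to_100(days, cases), cfr_100(cases, deaths))
```

Because daily reports are discrete, the threshold days are the first days on which the cumulative count is at or above 30 and 100, so the result is a whole number of days.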
We defined the outcome indicator as the epidemic size per million population, which was the cumulative number of confirmed cases on 15th March divided by the population size of the corresponding Chinese city, then multiplied by one million. We categorised the Chinese cities into three groups: (1) Wuhan city; (2) 15 neighbouring cities of Wuhan in Hubei province; and (3) 29 cities in the rest of the country. We compared the indicators between two groups using nonparametric Mann-Whitney tests, and indicators across three groups using Kruskal-Wallis one-way analysis of variance (ANOVA).
We used the Spearman correlation test to examine the correlation between the epidemic size per million population of COVID-19 and each of the seven predictors as defined previously. We found that the epidemic size was associated with most of the proposed predictors (Table S3). However, most of these predictors, except the case-fatality rate within the first 100 confirmed cases, were collinear with the time from 30-to-100. Some predictors were not representative of all 45 cities. We, therefore, chose 'time from 30-to-100' and CFR-100 for the subsequent prediction of the ranking of the risk of a major epidemic (details in supplementary materials).
We compared the predicted ranking of the epidemic size per million population based on the 'time from 30-to-100' and CFR-100 with the actual ranking of the epidemic size. We first ranked the two indicators 'time from 30-to-100' and CFR-100 independently, then summed the corresponding indexes of the indicators. We then ranked the summed indexes again to produce a predicted ranking of the epidemic size. This predicted ranking was then compared with the actual ranking of the epidemic size in each city using Wilcoxon matched-pairs signed-rank tests.
We identified a total of 45 Chinese mainland cities that reported more than 100 cases of COVID-19.
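The ranking procedure described in the methods (rank each indicator, sum the ranks, rank the sums) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the rank directions (shortest time from 30-to-100 and highest CFR-100 both receive rank 1), the use of average ranks for ties, and all function names are our assumptions, and the four cities are invented. The agreement measure is a Spearman-style correlation (Pearson correlation of the two rank vectors), used here as a simple stand-in rather than the Wilcoxon matched-pairs signed-rank test reported in the paper.

```python
from math import sqrt


def size_per_million(cum_cases, population):
    """Epidemic size per million: cumulative cases / population * 1,000,000."""
    return cum_cases / population * 1_000_000


def average_ranks(values, descending=False):
    """1-based ranks in the given order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def predicted_ranking(times_30_to_100, cfr_100s):
    """Rank each indicator, sum the two ranks per city, then rank the sums."""
    time_rank = average_ranks(times_30_to_100)           # assumed: shortest -> 1
    cfr_rank = average_ranks(cfr_100s, descending=True)  # assumed: highest -> 1
    summed = [t + c for t, c in zip(time_rank, cfr_rank)]
    return average_ranks(summed)


def rank_correlation(rank_a, rank_b):
    """Pearson correlation of two rank vectors (Spearman-style agreement)."""
    ma, mb = sum(rank_a) / len(rank_a), sum(rank_b) / len(rank_b)
    sab = sum((a - ma) * (b - mb) for a, b in zip(rank_a, rank_b))
    saa = sum((a - ma) ** 2 for a in rank_a)
    sbb = sum((b - mb) ** 2 for b in rank_b)
    return sab / sqrt(saa * sbb)


# Invented cities for illustration only; not data from the study.
times = [2, 4, 8, 6]                  # days from 30 to 100 cases
cfrs = [0.025, 0.018, 0.002, 0.0]     # CFR-100 as a proportion
sizes = [size_per_million(c, p) for c, p in
         [(50000, 11_000_000), (3000, 5_000_000), (400, 8_000_000), (500, 20_000_000)]]

predicted = predicted_ranking(times, cfrs)
actual = average_ranks(sizes, descending=True)  # largest epidemic -> rank 1
agreement = rank_correlation(predicted, actual)
```

Summing ranks rather than raw values keeps the two indicators on a common scale, so neither days nor percentages dominate the combined index.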
Apart from the epicentre Wuhan, there were 15 neighbouring cities in Hubei and 29 cities in other Chinese provinces. Most (n=34) cities demonstrated a successful two-phase fit, whereas 11 cities had only one phase identified (Figure 1, Figure S2). Notably, the number of cases at the phase transition point in the Chinese cities was 17.7 (95% confidence interval, 11.9-23.6; Table 1), and 88.2% (30/34) of Chinese cities had their phase transition points below 30 cases (Figure 1, Figure S2). We regarded 30 confirmed cases as a critical threshold at which the COVID-19 epidemic started to increase rapidly.
Table 1 reports the epidemiological characteristics for the first 100 confirmed cases. In the 45 Chinese cities, the time required for the number of confirmed cases to increase from 30 to 100 was 6.6 (5.3-7.9) days. In Wuhan, the number of days to rise from 30 to 100 cases was two days, compared to 4.4 (3.1-5.7) days for the 15 other cities in Hubei province and 7.8 (6.7-9.0) days for the 29 Chinese cities outside Hubei province. The difference was significant (Kruskal-Wallis one-way ANOVA, p=0.0005). The average case-fatality rate in the first 100 confirmed cases across the 45 Chinese cities was 0.8% (0.2-1.4%). In particular, CFR-100 in Wuhan was 2.5%, followed by 1.8% (0.6-2.9%) in the 15 other cities in Hubei province and 0.2% (0-0.4%) in the 29 Chinese cities outside Hubei province. The slow-growing phase was relatively short (5.5 [4.6-6.4] days) with a growth rate of 3.3 (2.6-4.1) cases/day, whereas the growth rate in the fast-growing phase was about five times higher (16.1 [12.3-19.8] cases/day). Also, the case-fatality rate in the first 100 cases was positively correlated with the epidemic size (Spearman correlation, p=0.0168, r=0.3472).
Other predictors were also significantly associated with the epidemic size, but they were found to be significantly collinear with 'time from 30-to-100' (Table S3). Table 2 demonstrates a highly significant correlation between the predicted ranking of epidemic size per million population based on the 'time from 30-to-100' and CFR-100 and the actual ranking of epidemic size (Wilcoxon matched-pairs signed-rank test, p<0.0001, r=0.7627).
Our study identified that among 45 cities in China, 30 cases might be a critical threshold for switching from a relatively slow-growing phase to a fast-growing phase, which grows five times faster. Of the seven early-stage epidemic characteristics we assessed, we found that the time from the 30th to the 100th case and the case-fatality rate in the first 100 cases were strong indicators of the eventual size of the epidemic. These early-stage 'indicators' may be useful to public health officials in other settings to identify the appropriate control measures.
Our study of Chinese cities provides a unique opportunity to understand the COVID-19 epidemic in cities with quite different reproductive rates when the virus first spread to them. This may provide guidance to other cities with different reproductive rates worldwide. We argue that before the outbreak was detected, it is likely that most Chinese cities had similar reproductive rates, but the nationwide response dramatically lowered the reproductive rates at a time when many Chinese cities were in different stages of the epidemic. This allowed an observational study of many similar cities that had different reproductive rates to determine what factors predicted large epidemics, and therefore allowed the identification of cities that were likely to have large epidemics.
The first 30 cases appear to be an important indicator for the initiation of a fast-growing phase of COVID-19.
It is possible that the detection of 30 cases may represent a time when the epidemic shifts from one associated primarily with imported cases to one primarily driven by local transmission. Once it reaches this critical mass, local transmission starts to dominate, and a large number of locally acquired cases start to surface. This pattern may not be evident if a city is very closely associated with another major outbreak, and this may be the reason that most of Wuhan's neighbouring cities do not have a slow-growing phase. In addition, the duration of the slow-growing phase is only about six days, which is consistent with the incubation period of SARS-CoV-2 infection (5-6 days; Lauer et al., 2020). This suggests that a large number of pre-symptomatic cases in the incubation period have become symptomatic and detectable, which marks the rapid expansion of the number of cases.
Characteristics of the epidemic in the early stage may reflect the potential epidemic development and the response of the healthcare system. The number of days required to increase from 30 to 100 cases represents a rough measure of the initial growth rate of a localised epidemic. A short duration implies a fast and probably uncontrolled expansion of the epidemic, which is likely associated with either high transmissibility of the virus or delayed diagnosis. The high transmissibility is likely related to the absence of effective prevention or intervention strategies at this stage. Moreover, when symptomatic patients present to hospitals in their hundreds, this likely means that many more pre-symptomatic cases are yet to be diagnosed, and the epidemic is far more severe than what has been observed. Consistently, a high case-fatality rate in the first 100 diagnoses also implies a missed opportunity for earlier diagnosis.
Considering the incubation period (5-6 days) and the time from symptom onset to death (2-8 weeks (World Health Organization, 2020b)), a high case-fatality rate in the early stage suggests that the surveillance system was too slow in responding to the epidemic to prevent infected individuals from progressing to severe pneumonia.
Our analysis showed that based on these two simple early-stage indicators, we are able to rank the predicted epidemic size per million population, and this ranking is highly consistent with the ranking of the actual epidemic size at a later stage of the epidemic. This may provide useful insights into the potential severity of the COVID-19 epidemic in its later development. Therefore, a fast-growing phase and a high case-fatality rate are early warning signs for the healthcare system to react to the epidemic accordingly. This is comparable to previous studies in which a novel framework was developed to assess the epidemic severity of influenza based on its transmissibility and clinical severity (e.g. case-fatality rate) (Carrie et al., 2013, Shrestha et al., 2011). However, these studies did not investigate the characteristics of the early epidemic and their implications.
Our findings need to be interpreted with caution. First, our results are not a quantification of the actual size of the epidemic, but rather a comparison between the predicted and actual rank of epidemic size. Second, since China had implemented rigorous control strategies to curb the epidemic in most parts of the country, the epidemic size may be under-represented in comparison with settings in other countries. Third, we regarded Wuhan's epidemic as a major outbreak, yet the epidemics in many cities worldwide are already comparable to or exceed Wuhan's level. The implications of a Chinese city ranking may be interpreted differently in these settings.
Therefore, whether the prediction of ranking is applicable to other countries is uncertain and warrants further investigation.
The first 30 cases may mark an important threshold for the transition from a slow to a fast-growing phase of the COVID-19 epidemic. Early epidemic characteristics may be regarded as important indicators for later epidemic development and size.
Ethical approval was not required.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Novel Framework for Assessing Epidemiologic Effects of Influenza Epidemics and Pandemics
COVID-19 control in China during mass population movements at New Year
The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak
Office of the State Council holds press conference on joint prevention and control of pneumonia epidemic of new coronavirus infection
WHO Declares COVID-19 a Pandemic
The basic reproduction number as a predictor for epidemic outbreaks in temporal networks
The Incubation Period of Coronavirus Disease 2019 (COVID-19) From Publicly Reported Confirmed Cases: Estimation and Application
Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia
Association of Public Health Interventions With the Epidemiology of the COVID-19 Outbreak in Wuhan, China
Unraveling R0: considerations for public health applications
Estimating the burden of 2009 pandemic influenza A (H1N1) in the United States
Hubei province locked down 13 cities
An investigation of transmission control measures during the first 50 days of the COVID-19 epidemic in China
Coronavirus update: COVID-19 likely to cost economy $1 trillion during 2020, says UN trade agency
World Health Organization. Coronavirus disease (COVID-2019) situation reports; 2020a
Report of the WHO-China Joint Mission on Coronavirus Disease 2019 (COVID-19). Geneva: World Health Organization
Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study
Evolving epidemiology and transmission dynamics of coronavirus disease 2019 outside Hubei province, China: a descriptive and modelling study
What is Required to Prevent a Second Major Outbreak of the Novel Coronavirus COVID-19 Upon Lifting the Metropolitan-Wide Quarantine of Wuhan City, China: A Mathematical Modelling Study (3/11/2020)
The work is supported by the Bill & Melinda Gates Foundation. We are thankful for the input and discussion from Guoqiang Li and Xinghui Li.