key: cord-0991812-hk491mr6
authors: Qi, Rui; Ye, Chao; Qin, Xiang-rong; Yu, Xue-Jie
title: Case fatality rate of novel coronavirus disease 2019 in China
date: 2020-02-26
journal: nan
DOI: 10.1101/2020.02.26.20028076
sha: 46634d03aa169aab9c372746ebaf3aaa65d7c7d9
doc_id: 991812
cord_uid: hk491mr6

Abstract

Background: A pandemic of coronavirus disease 2019 (COVID-19), which has infected more than 80 thousand people globally, is still ongoing. This study aims to calculate its case fatality rate (CFR).

Methods: The method, termed converged CFR calculation, was based on dividing the number of known deaths by the number of confirmed cases T days earlier, where T was the average time from case confirmation to death. It was found that if an assumed T was smaller (bigger) than the true T, the calculated CFRs would gradually increase (decrease) over time, approaching the true CFR ever more closely. According to this law, the true T could be determined from the trends of daily CFRs calculated with different assumed T values (increasing for assumed T below the true T, decreasing for assumed T above it). The CFR could then be calculated.

Results: The CFR of COVID-19 in China excluding Hubei Province was 0.8% to 0.9%. So far, this CFR has accurately predicted death numbers for more than 3 weeks. The CFR in Hubei was 5.4%, with which the calculated death number corresponded with the reported number for 2 weeks.

Conclusion: The method can be used to calculate the CFR while a pandemic is still ongoing. Dynamic monitoring of daily CFR trends could give outbreak controllers a clear view of the timeliness of case confirmation.

... should be corrected to the cases T days earlier, where T is the average time from case confirmation to death. This study aims to calculate the CFR of COVID-19 in China by estimating the average time from case confirmation to death.

... it should be recognized that deaths on day X arise, on average, from cases on day X-T rather than day X.
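As a compact restatement of the quantity used throughout, the daily CFR for an assumed T can be written as below. The symbols D(X) and C(X), for the cumulative reported deaths and cumulative confirmed cases on day X, are notation assumed here for illustration and are not the authors' own; the same-day ratio (T = 0) is what the text calls the naive CFR.

```latex
\mathrm{CFR}_T(X) = \frac{D(X)}{C(X - T)},
\qquad
\mathrm{CFR}_{\text{naive}}(X) = \mathrm{CFR}_0(X) = \frac{D(X)}{C(X)}
```

Because a CFR cannot exceed 100%, any admissible T must satisfy D(X) <= C(X - T) for every observed day X.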
Given a T value, a group of CFRs (daily CFRs) can be obtained from different days X. The death number on day X must be less than the case number on day X-T (otherwise the CFR would be greater than 100%, which is illogical). On this basis, the range of T can be narrowed. More importantly, whatever T value is assumed, even one far from the true T, the daily CFRs converge toward the true CFR (approaching it ever more closely without crossing it) as time (X) increases. The following example illustrates this principle (Table 1). Assume CFR = 10% and T = 4 for a disease, with the case number rising from 100 to 10000 over days X (X = 1 to 100); the death number would then be 10, 20 and so on at day X+4 (day 5, 6 and so on). When daily CFRs are calculated from case and death numbers with the formula deaths (X) divided by cases (X-T), law 1 holds: if the assumed T is equal to the true T (4 in the example), the calculated daily CFRs on every day X are constantly the true CFR (0.1); if the assumed T is greater than the true T (5 and 6), the daily CFRs are greater than the true CFR (0.1) and decrease ever closer to it as time (X) increases; ...

The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

A number of T values were selected for screening based on the convergence laws. After different T values were tried, as Figure 1 shows, when the assumed T was 11 the daily CFRs decreased with no pronounced increase; when it was 0 to 7 the daily CFRs showed a pronounced increase after the early period (T > 11 is not shown because of continuously decreasing trends). For some assumed T values (e.g. T = 0), CFRs increased at the later stage as the laws predict, but decreased at the early stage, which seemed not to satisfy the convergence laws. In fact, this was normal.
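The worked example from Methods (true CFR = 10%, true T = 4, cumulative cases growing by 100 per day) can be reproduced with a short simulation. This is a minimal sketch in Python, not the authors' code: the function name `daily_cfrs` and the assumption that both series are cumulative counts are illustrative choices.

```python
TRUE_T = 4    # true delay in days from confirmation to death (Methods example)
DAYS = 100

# Cumulative confirmed cases: 100 on day 1, 200 on day 2, ..., 10000 on day 100.
cases = {day: 100 * day for day in range(1, DAYS + 1)}

# True CFR of 10%: cumulative deaths on a day are one tenth of the cases
# TRUE_T days earlier (integer division is exact, since cases are multiples of 100).
deaths = {day: cases.get(day - TRUE_T, 0) // 10 for day in range(1, DAYS + 1)}


def daily_cfrs(assumed_t):
    """Return {day: deaths(day) / cases(day - assumed_t)} where both are defined."""
    return {
        day: deaths[day] / cases[day - assumed_t]
        for day in range(1, DAYS + 1)
        if (day - assumed_t) in cases
    }


# Print the first and last daily CFR for an assumed T below, at, and above the true T.
for t in (3, 4, 5):
    series = list(daily_cfrs(t).values())
    print(f"assumed T={t}: first={series[0]:.4f}, last={series[-1]:.4f}")
```

In this toy setting the series for the true T is flat, the series for a larger assumed T starts above 0.1 and falls toward it, and the series for a smaller assumed T starts below 0.1 and rises toward it. Real surveillance data are noisier, so the trends are less clean.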
The convergence laws arise because the true CFR pulls the daily CFRs toward itself by dominating the accumulated death numbers. At the early stage, deaths had not yet occurred, so daily CFRs decreased as the case number grew. Thus, exploring T by the convergence laws should rely on the period of death growth.

The results in Figure 1 indicated that the true T should be in the range of 8 to 10. Because the differences between CFRs at the converging stage were too small to compare, and the y-axis scales of the plots in Figure 1 varied greatly, Figure 1 was used only for preliminary exploration of tendencies. The converging-stage CFRs were extracted and plotted with the same y-axis scale to estimate the true T and CFR (Figure 2).

Figure 1. Calculated daily CFRs of non-Hubei regions by Mar 1 when T was assumed from 0 to 11.

As mentioned in Methods, as time increased, even under a false T in the calculation,
the daily CFR could converge toward the true CFR, though more time was needed. If the assumed T was equal to the true T, the calculated daily CFRs would stay constant. As Figure 2 shows, for T = 11 and T = 8, compared with T = 9 and 10, CFRs still had slightly decreasing and increasing trends, respectively. Linear models (blue lines) were generated to analyze the variances and linear trends of these CFR points in each plot. The slopes of the models became flatter and approached 0 as T moved from 8 to 9 and from 11 to 10. The results indicated that the true T should be bigger than 8 and less than 11. When T = 9, the CFRs stayed almost on one line (red dotted line in Figure 2) and increased slightly later. When T = 10, the daily CFRs decreased early but quickly reached a stable stage. So the true T might be between 9 and 10 days. The mean values of the data in plots 9 and 10 of Figure 2 were 0.8% and 0.9%, respectively. The true CFR of COVID-19 in China excluding Hubei Province should fall between 0.8% and 0.9%. The closer an assumed T was to the true T, the earlier the daily CFRs converged to the true CFR. The mean value of CFRs at the later stage of plot 9 or 10 was approximately 0.85%. As shown in Figure 2, if the data had been analyzed before Feb 20, T = 9 (CFR 0.81%) might have been determined as the true T (true CFR). But now the estimate of T has shifted to between 9 and 10 days. The reason was not uncertainty in the method but the long disease course of COVID-19. The time from case confirmation to outcome was longer in some cases than in most, which made the true T bigger and the CFR slightly higher. For ... (only sporadic cases were reported recently in the non-Hubei regions), so far, the naive CFR ...
Figure 2. Converging-stage daily CFRs of non-Hubei regions when T was from 8 to 11. Blue lines represent linear models generated to analyze the variances and linear trends of the data points in the plots. The red dotted line and number are the estimated true CFR.

As shown in Table 2, after Feb 3 the death number (350) exceeded the Jan 21 case number (270); if T were 12 (Feb 3 minus Jan 21), the CFR would be illogically greater than 100%. In other words, only before day 12 were death numbers less than the case number at day 1. So T should be less than or equal to 11 days (12-1). The death number first exceeded the case number at day 2 on Feb 5 (day 15), so T should be less than or equal to 12 (14-2). The rest could be done in the same manner. Finally, the smallest T value (T = 11) was selected as the upper limit for convergence screening.

Figure 3 shows the daily CFRs calculated with assumed T values (0 to 11). When the assumed T was 8 to 11, daily CFRs decreased continuously. When T = 0 and 3, there were increasing trends at the later stage, which meant these values were smaller than the true T. Converging-stage CFR data for T = 4 to 7 were selected for plotting with the same y-axis scale (Figure 4). As it shows, for T = 4 the CFRs had an increasing trend, and for T = 5 the CFRs increased slightly. When T was 7, the CFRs decreased and stabilized at the later stage. When T was 6, the plateau stage appeared earlier than for T = 7.
The slopes of the linear models became flatter and approached 0 as T approached 6. T = 6 was therefore selected as the true T for calculating the true CFR. The true CFR of COVID-19 in Hubei, calculated as the mean of the daily CFRs in plot 6 of Figure 4, was 5.4%. The estimated T was smaller than that of non-Hubei regions. This was not surprising, as the time from case confirmation to death seemed shorter. Previously in Wuhan City of Hubei Province, many patients had not been confirmed and reported in a timely manner because of overwhelmed medical services and a lack of testing kits. The death number (from confirmed and unconfirmed populations) could tend to "select" forward case pools with bigger populations. Thus, to obtain an accurate CFR, the timeliness of case confirmation should not vary too much. The possibility cannot be ruled out that the CFR might increase slightly later, as in non-Hubei regions, because of the long disease course of COVID-19.

Figure 3. Calculated daily CFRs of Hubei Province by Mar 1 when T was from 0 to 11.

True numbers of deaths were compared with numbers estimated by the calculated T and ... As shown in Figure 5-non-Hubei, since Feb 4 the calculated death numbers fit the true death data well. The curves nearly coincided in shape.
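The selection rule described above has three steps: pick the assumed T whose converging-stage daily CFRs have the flattest linear trend, take their mean as the CFR, then predict deaths on day X as the CFR times the cases on day X-T. It can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' code: counts are cumulative lists indexed by day, the converging stage is approximated by the last `window` points, and the helper names (`slope`, `select_t`, `predict_deaths`) and the toy data are hypothetical.

```python
def slope(ys):
    """Ordinary least-squares slope of ys against x = 0, 1, 2, ..."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx


def daily_cfrs(cases, deaths, t):
    """deaths[x] / cases[x - t] for every day x with a nonzero denominator."""
    return [deaths[x] / cases[x - t] for x in range(t, len(deaths)) if cases[x - t] > 0]


def select_t(cases, deaths, candidates, window=10):
    """Pick the candidate T whose last `window` daily CFRs are flattest.

    Returns (T, mean CFR over that window).
    """
    best = None
    for t in candidates:
        tail = daily_cfrs(cases, deaths, t)[-window:]
        score = abs(slope(tail))
        if best is None or score < best[0]:
            best = (score, t, sum(tail) / len(tail))
    _, t, cfr = best
    return t, cfr


def predict_deaths(cases, cfr, t):
    """Predicted cumulative deaths on day x: cfr * cases[x - t] (None before day t)."""
    return [cfr * cases[x - t] if x >= t else None for x in range(len(cases))]


# Toy data (hypothetical, not the paper's): linear case growth, 5% CFR, 6-day delay.
cases = [200 * (d + 1) for d in range(60)]
deaths = [cases[d - 6] // 20 if d >= 6 else 0 for d in range(60)]

t_hat, cfr_hat = select_t(cases, deaths, candidates=range(0, 12))
predicted = predict_deaths(cases, cfr_hat, t_hat)
print(t_hat, round(cfr_hat, 4))
```

Choosing T by the smallest absolute slope is a simplification of the visual comparison of fitted lines across the panels of Figures 2 and 4. With real data, the window over which the slope is measured would need to be restricted to the period of death growth.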
For Hubei (Figure 5-Hubei), the predicted curve was similar in shape to the true death curve; however, from Jan 23 to Feb 10, predicted death numbers were smaller than the true numbers. The predicted curve from Jan 23 to Feb 10 seemed to be shifted to the right by about 2 or 3 days. A subset of data from Jan 21 to Feb 12 was selected to recalculate T, and the result for Hubei was T = 2 days. However, it could be ... go back on production. In summary, as death numbers had been predicted almost accurately by the calculated true CFR for more than 3 weeks, it could be considered the true CFR of COVID-19 in China excluding Hubei Province. For Hubei, the calculated death number corresponded with the reported number for more than 2 weeks.

... Chinese Spring Festival holidays. The Chinese government rapidly isolated Wuhan and took emergency measures nationwide to prevent and control the disease. Non-Hubei regions' response
to COVID-19 could be regarded as timely. The outbreak situations in Hubei and non-Hubei regions were quite different, so CFRs were calculated separately. Diagnosis and confirmation of patients presenting with more severe disease had priority in Hubei, especially Wuhan, because of limited healthcare facilities and testing capacity. Thus, the calculated CFR for Hubei was higher owing to the underdetection of mild or asymptomatic cases.

... Other cases gave a 15% death rate (7). However, regardless of the sample size, these cases were highly biased toward the more severe cases for CFR calculation. Another study reported a CFR of 4.3%, which also had a biased study population (Wuhan hospitalized patients) (8). A recent epidemiological study estimated the CFR at 3.06% (95% CI 2.02-4.59%) from 4,021 cases (9). This study included data from non-Hubei regions, so the CFR should be smaller than that of Wuhan. While an epidemic is still ongoing, the CFR can be estimated by following a cohort; however, this is time-consuming, and it is difficult to include enough representative patients from an unbiased population. Considering the convergence of daily CFRs, estimating the true CFR from population-level big data might be a good approach.

In our study, the calculated T values differed: T was between 9 and 10 in non-Hubei regions but was 6 in Hubei. The time from confirmation to death was shorter in Hubei than in non-Hubei regions. On Feb 13, more than 10 thousand cases were reported in one day, including ...

In conclusion, by the converged CFR calculation method, the true CFR of COVID-19 in China excluding Hubei Province was approximately 0.8% to 0.9%. This calculated CFR could accurately predict death numbers for more than 3 weeks. The CFR in Hubei was 5.4% at the present stage. The method in our study can be used to calculate the CFR while a pandemic is still ongoing and to monitor the case confirmation situation.

References

A Novel Coronavirus from Patients with Pneumonia in China
Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study
Clinical features of patients infected with 2019 novel coronavirus in Wuhan
Potential biases in estimating absolute and relative case-fatality risks during outbreaks
Case fatality rate for Ebola virus disease in west Africa
Heterogeneities in the case fatality ratio in the West African Ebola outbreak 2013-2016
A novel coronavirus outbreak of global health concern