key: cord-278013-0d6o5w8z authors: Omori, Ryosuke; Mizumoto, Kenji; Nishiura, Hiroshi title: Ascertainment rate of novel coronavirus disease (COVID-19) in Japan date: 2020-03-10 journal: nan DOI: 10.1101/2020.03.09.20033183 sha: doc_id: 278013 cord_uid: 0d6o5w8z

We analyzed the epidemiological dataset of confirmed cases with COVID-19 in Japan as of 28 February 2020 and estimated the number of severe and non-severe cases, accounting for under-ascertainment. The ascertainment rate of non-severe cases was estimated at 0.44 (95% confidence interval: 0.37, 0.50), indicating that the unbiased number of non-severe cases would be more than twice the reported count. Conclusions: Severe cases are twice as likely to be diagnosed and reported as other cases. Considering that reported cases are usually dominated by non-severe cases, the adjusted total number of cases is also about double the observed count. Our finding is critical in interpreting the reported data, and it is advised to interpret mild case data of COVID-19 as always under-ascertained.

Keywords: coronavirus

The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

The majority of COVID-19 cases exhibit limited severity; 81% of reported cases in China have been mild and only 16% severe (Guan et al., 2020). It is natural that the ascertainment rate would differ between severe and non-severe cases. The present study aims to estimate the ascertainment rate of non-severe cases, employing a statistical model.

We analyzed the epidemiological dataset of confirmed cases with COVID-19 in Japan as of 28 February 2020.
The confirmatory diagnosis was made by means of reverse transcriptase polymerase chain reaction (RT-PCR). The present study specifically analyzed cases by (i) prefecture, (ii) age, and (iii) severity. A severe case was defined as (i) a case with severe dyspnea that required oxygen support plus pneumonia or intubation, or (ii) a case that required management in an intensive care unit.

We estimated the number of severe and non-severe cases using the ratio of non-severe to severe reported cases (Guan et al., 2020; Novel, 2020). We estimated the ascertainment rate among non-severe cases, $k$, by describing the data-generating process of both severe and non-severe cases as Poisson processes with probabilities $p_{x,a}$ for severe cases and $k f_a p_{x,a}$ for non-severe cases in age group $a$ and prefecture $x$, respectively. Here $f_a$ denotes the ratio of non-severe to severe reported cases in age group $a$, as estimated from the age-specific severity and incidence rate ratio in China (Guan et al., 2020; Novel, 2020). We estimated $k$ and $p_{x,a}$ using the log-likelihood function

$$\log L(k, p_{x,a}) = \sum_{x}\sum_{a}\left[ D_{s,x,a}\log\left(p_{x,a}N_{x,a}\right) - p_{x,a}N_{x,a} + D_{ns,x,a}\log\left(k f_a p_{x,a}N_{x,a}\right) - k f_a p_{x,a}N_{x,a} \right] + \text{const}, \tag{1}$$

where $N_{x,a}$, $D_{ns,x,a}$ and $D_{s,x,a}$ represent the population size and the observed counts of non-severe and severe cases of age group $a$ in prefecture $x$, respectively. Maximum likelihood estimates were obtained by maximizing equation (1), and profile likelihood-based confidence intervals were computed.

The ascertainment rate of non-severe cases, $k$, was estimated at 0.44 (95% confidence interval (CI): 0.37, 0.50). The resulting estimate of non-severe cases is shown in Figure 1A, along with a reasonably good fit to the severe case data in Figure 1B.
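The estimation procedure described in the methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the expected counts are taken to be N p for severe and N k f p for non-severe cases, each p is profiled out in closed form (which makes the population size drop out of the profile for k), the maximum is located by bisection on the score, and the 95% limits use the chi-square threshold with one degree of freedom. All function names and the example counts are hypothetical.

```python
import math

# 95% quantile of the chi-square distribution with 1 degree of freedom.
CHI2_95_1DF = 3.841458820694124


def profile_loglik(k, cells):
    """Poisson log-likelihood in k with every p profiled out.

    Each cell is (f, d_s, d_ns): the assumed non-severe/severe ratio and
    the observed severe and non-severe counts. For fixed k the maximizing
    p*N equals (d_s + d_ns) / (1 + k*f); terms not involving k are dropped.
    """
    return sum(d_ns * math.log(k * f) - (d_s + d_ns) * math.log1p(k * f)
               for f, d_s, d_ns in cells)


def score(k, cells):
    """Derivative of profile_loglik with respect to k."""
    return sum(d_ns / k - (d_s + d_ns) * f / (1.0 + k * f)
               for f, d_s, d_ns in cells)


def bisect(func, lo, hi, tol=1e-10):
    """Root of func on [lo, hi]; func(lo) and func(hi) must differ in sign."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def estimate_k(cells, k_min=1e-6, k_max=100.0):
    """Maximum likelihood estimate of k and profile-likelihood 95% limits."""
    k_hat = bisect(lambda k: score(k, cells), k_min, k_max)
    ll_max = profile_loglik(k_hat, cells)

    def drop(k):
        # Likelihood-ratio statistic minus the 95% threshold.
        return 2.0 * (ll_max - profile_loglik(k, cells)) - CHI2_95_1DF

    return k_hat, bisect(drop, k_min, k_hat), bisect(drop, k_hat, k_max)


# Hypothetical cells (f, severe count, non-severe count), built so that the
# non-severe count equals 0.5 * f * severe count in every cell.
cells = [(10.0, 10, 50), (5.0, 20, 50), (2.0, 30, 30)]
k_hat, lower, upper = estimate_k(cells)
print(f"k = {k_hat:.3f} (95% CI: {lower:.3f}, {upper:.3f})")
```

Because the hypothetical counts satisfy D_ns = 0.5 f D_s exactly, the score vanishes at k = 0.5, which gives a quick check on the sketch. A general-purpose optimizer could replace the bisection; bisection is used here only because the profile log-likelihood is concave in log k, so the score has a single sign change.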
The age-specific pattern of estimated non-severe cases was similar to that among severe cases. The largest estimated numbers of non-severe cases were 80 (95% CI: 63, 98) among those aged 50-59 years and 78 (95% CI: 61, 95) among those aged 60-69 years. Such adjustment gives an adjusted estimate of the total cases by age group.

The present study estimated the ascertainment-adjusted number of cases in Japan, using the age-specific severe fraction of cases. We assumed that the ratio of severe to non-severe cases in a given age group is constant and that the age-independent gap is explained by under-diagnosis and under-reporting, estimating the ascertainment rate among non-severe cases to be 0.44.

As a take-home message, it must be remembered that severe cases are twice as likely to be diagnosed and reported as other cases. Reported cases are usually dominated by non-severe cases, and the adjusted total number of cases is about double the observed count. Our finding is critical in interpreting the reported data, and it is advised to regard the mild case data as always under-ascertained.

In addition to the proposed adjustment, it should be noted that the ascertainment rate of severe cases needs to be additionally estimated, and such estimation requires direct measurement of the total number of cases or infected individuals by means of a seroepidemiological study or other methods that test all samples (Nishiura et al., 2020). That is, the actual total number of cases is greater than the number adjusted in the present study. Using seroepidemiological datasets, we plan to address relevant issues in the future.
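As a worked illustration of the "about double" statement, assume (as a simplification introduced here) that the ascertainment-adjusted non-severe count equals the reported count divided by the estimated ascertainment rate k. The symbols C_adj and C_obs are illustrative and do not appear in the original text. The point estimate and confidence limits reported above then translate into the following multipliers:

```latex
\[
C_{\text{adj}} = \frac{C_{\text{obs}}}{k}, \qquad
\frac{1}{0.44} \approx 2.27, \qquad
\frac{1}{0.50} = 2.00, \qquad
\frac{1}{0.37} \approx 2.70 .
\]
```

Under this assumption the reported non-severe count would be scaled by a factor of roughly 2.0 to 2.7, which is consistent with the unbiased count being more than twice the reported one.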
Other limitations include that (i) we did not explore the detailed natural history, e.g. dynamically changing symptoms over the course of infection, and underlying comorbidities; (ii) we ignored right-censored data, e.g. the time delay from illness onset to severe manifestations, for simplicity, which led us to underestimate the ascertainment rate; and (iii) the data on age-dependent severity employed in our analysis are based only on the observed data in China. Considering the possibility of underreporting or a biased age distribution, the nature of this age distribution may lead to underestimation.

Despite multiple future tasks, we believe that the present study successfully demonstrated that the ascertainment rate can be partly adjusted by examining the age-dependent number of cases, including severe cases. The proposed adjustment should be practiced in other country settings and also for other diseases.

Figure 1. Top: non-severe cases, middle: severe cases, and bottom: total cases. x-marks represent observed counts, while unfilled circles show estimated cases. Whiskers extend to the lower and upper 95% confidence intervals, derived from profile likelihood.

Guan et al. (2020). Clinical Characteristics of Coronavirus Disease 2019 in China.
Nishiura et al. (2020). The rate of underascertainment of novel coronavirus (2019-nCoV) infection: Estimation using Japanese passengers data on evacuation flights.
Multidisciplinary Digital Publishing Institute.
Novel (2020). The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) in China. Zhonghua liu xing bing xue za zhi.
COVID-19 Situation Report - 41.