key: cord-0906698-7zld5s7f authors: Linton, Natalie M.; Akhmetzhanov, Andrei R.; Nishiura, Hiroshi title: Localized end-of-outbreak determination for coronavirus disease 2019 (COVID-19): examples from clusters in Japan date: 2021-03-01 journal: Int J Infect Dis DOI: 10.1016/j.ijid.2021.02.106 sha: 673773f777e4bcfab91c675ed7a1be9b9fa8c942 doc_id: 906698 cord_uid: 7zld5s7f

OBJECTIVES: End-of-outbreak declarations are an important component of outbreak response as they indicate that public health and social interventions may be relaxed or lapsed. The present study aimed to assess end-of-outbreak probabilities for clusters of coronavirus disease 2019 (COVID-19) cases detected during the first wave of the COVID-19 pandemic in Japan.

METHODS: We used a statistical model for end-of-outbreak determination that accounted for the reporting delay for new cases. Four clusters representing different social contexts and time points during the first wave of the epidemic were selected and their end-of-outbreak probabilities were evaluated.

RESULTS: The speed of end-of-outbreak determination was most closely tied to outbreak size. Notably, accounting for underascertainment of cases led to later end-of-outbreak determinations. In addition, end-of-outbreak determination was closely related to estimates of case dispersion k and the effective reproduction number R_e. Increasing local transmission (R_e > 1) leads to greater uncertainty in the probability estimates.

CONCLUSIONS: When public health measures are effective, lower R_e (less transmission on average) and larger k (lower risk of superspreading) will be in effect and end-of-outbreak determinations can be declared with greater confidence.
The application of end-of-outbreak probabilities can help distinguish between local extinction and low levels of transmission, and communicating these end-of-outbreak probabilities can help inform public health decision-making with relation to the appropriate use of resources.

The coronavirus disease 2019 (COVID-19) pandemic is the most fatal and disruptive biological disaster of recent history. As of 24 February 2021, COVID-19 has been diagnosed in over 113 million people and associated with more than 2.5 million deaths. Travel restrictions, school closures, cancellation of public events, and other public health and social measures implemented to curb disease transmission have deeply affected lives and livelihoods worldwide. In the midst of the disaster, mathematical modeling has served prominently in informing pandemic response as outbreaks erupted and grew, 1,2 but the field can also provide insight into the transmission dynamics of outbreaks as they come to an end. 3, 4 Control of COVID-19 has proven difficult, and timely, localized information regarding which outbreaks are growing and which are likely to end is needed to inform control measures. Here, we explain a method to estimate end-of-outbreak probabilities at localized levels, allowing for evidence-based decision-making around the scaling-back of public health and social response measures in real time. We demonstrate the applicability of this method in relation to clusters of cases using several examples from Japan.

Early research into the transmission dynamics of COVID-19 indicated that spread of the causal virus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was highly overdispersed. 5, 6 The degree of overdispersion is quantified by the dispersion parameter k, which describes the variance in the distribution of the number of secondary cases infected by a typical primary case.
Lower values of k represent greater variance and thereby a propensity towards superspreading, the phenomenon by which some infected individuals transmit a pathogen to large numbers of secondary cases while most do not infect others. 6 Larger values of k indicate less dispersion, and a Poisson distribution is obtained as a special case of the negative binomial distribution as k → ∞. 7 During the pandemic, k for COVID-19 has been reported in the range of 0.1-0.6, 5, 8-10 indicating that most secondary infections are caused by a small fraction of primary cases, and therefore that superspreading events can fuel disease transmission.

In Japan, the propensity of COVID-19 towards superspreading was addressed by including the prevention, detection, and suppression of clusters (groups of cases linked by a common place and time) as a key component of national response. 11 Additional public health and social measures took into consideration commonalities in transmission settings between clusters, 12 most prominently in the form of a nationwide messaging campaign encouraging residents to avoid the "Three Cs": closed spaces with poor ventilation, crowded places, and close-contact situations. 11,13 Although Japan had relatively low levels of epidemic growth and fatalities reported during the first six months of the pandemic compared to other developed nations in Europe and North America, it faced a resurgence of cases during the summer months.

Current World Health Organization (WHO) guidance focuses on assigning levels of risk based on epidemiological, health system, and surveillance criteria from trends and other descriptive data to inform the loosening and re-tightening of response activities.
14 The guidance indicates that an outbreak can be considered controlled if the effective reproduction number R_e (the average number of secondary cases produced per primary case in the presence of interventions and immune individuals in a given time period) is maintained below the threshold value of 1 for at least two weeks. While this distinguishes well between epidemic growth and decline, this method of surveillance does not differentiate between an outbreak that will end entirely and one that will continue as stuttering chains of transmission (i.e. 0 < R_e < 1) before potentially resurging (R_e > 1). 15

Other COVID-19 guidance that addresses end-of-outbreak declarations relates to the incubation period-based guidelines used for other directly transmittable diseases such as measles and Ebola virus disease. For those diseases, two times the maximum incubation period (the time from infection to symptom onset) since the last possible date of exposure to a source of infection within the outbreak is used to determine the time until the end of an outbreak can be declared. 16, 17 Despite widespread use, this method has previously been shown to be flawed, and is particularly vulnerable when local surveillance systems are weak. 3, 18

An alternative, statistically based method is to leverage parameters that describe the transmission dynamics underlying the epidemic curve to estimate the probability that one or more cases will be reported going forward in time. 19 This more rigorous basis for end-of-outbreak determination can also be adjusted to account for additional factors such as case underascertainment and changing transmission dynamics. Including this statistical analysis adds to the information available to politicians and public health officials when making decisions regarding response needs. Here, we present examples where this method was used to assess end-of-outbreak probabilities for clusters in Japan during the first wave of the pandemic.
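To make the role of the dispersion parameter concrete, the following minimal sketch evaluates a negative binomial offspring distribution with mean R_e and dispersion k, and shows how the probability that a case infects no one changes with k. The function names and the example value R_e = 0.9 are illustrative assumptions, not taken from the study.

```python
import math


def nb_pmf(y, r_e, k):
    """P(Y = y) for a negative binomial with mean r_e and dispersion k.

    Computed on the log scale for numerical stability.
    """
    log_p = (
        math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
        + k * math.log(k / (k + r_e))
        + y * math.log(r_e / (k + r_e))
    )
    return math.exp(log_p)


def prob_no_secondary_cases(r_e, k):
    """P(Y = 0): probability that a primary case infects nobody."""
    return (1.0 + r_e / k) ** (-k)


# Illustrative (assumed) value of R_e; k spans the reported range 0.1-0.6
# and two larger values that approach the Poisson limit exp(-R_e).
for k in (0.1, 0.6, 10.0, 1e6):
    print(f"k = {k:g}: P(no secondary cases) = {prob_no_secondary_cases(0.9, k):.3f}")
```

For a fixed mean, a smaller k puts more probability on zero secondary cases and more on very large counts, which is the superspreading pattern described above.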
Epidemiological data were collected from the official case reports published online by reporting jurisdictions (prefectures and some cities) within Japan. In some instances, the data were supplemented and additionally verified using information from press briefings from the reporting jurisdictions. The collected data include date of onset, date of report, and linkage information used to inform grouping of cases into clusters. A dataset including dates and cluster information is included in the supplementary materials.

Each cluster is given by the epidemic curve observed at the time of report $t$, with onset dates $t_i \le t$ for an epidemiologically defined group of cases $i \in \{1, \dots, n\}$. The probability that one or more new cases $X(t)$ will be reported on day $t$ is written as follows: 19

$$\mathrm{Prob}(X(t) > 0) = 1 - \prod_{i=1}^{n} \sum_{y=0}^{\infty} p_y \left[ F(t - t_i) \right]^{y} \tag{1}$$

where $p_y$ is the probability that $y$ secondary cases arise from a given primary case $i$, following a negative binomial distribution with mean $R_e$ and variance $R_e(1 + R_e/k)$, with $k$ the dispersion parameter. The function $F(\cdot)$ defines the cumulative distribution function (CDF) of the serial interval $f$, which is backprojected using the time delay from illness onset to report $h$, and defined by the convolution:

$$F(t - t_i) = \int_{0}^{t - t_i} \int_{0}^{s} f(s - \tau)\, h(\tau)\, d\tau\, ds \tag{2}$$

Outbreak extinction is determined once the estimate for the probability of observing one or more additional cases, $\mathrm{Prob}(X(t) > 0)$, falls below a given threshold. The day the probability estimate drops below the threshold is in effect the day the outbreak would be declared over. Selection of a threshold value depends on whether the goal is to minimize the observation period (higher threshold) or to minimize the risk that undetected cases exist and are detected following the end-of-outbreak declaration (lower threshold). Here, we examine a 5% threshold, which translates to a 5% risk that the end-of-outbreak declaration would be premature and case(s) would be detected following the declaration. 4

For some cases, date of onset was unavailable.
Either the case did not give permission for disclosure, the information was not collected, or the case was asymptomatic at the time of detection and did not later report symptoms. The latter scenario may therefore represent either pre-symptomatic cases with no follow-up report or cases who were completely asymptomatic until recovery. We approached missing dates of illness onset in two ways. First, we excluded cases with no reported onset date and calculated the probability of new cases solely based on the existing epidemic curve of illness onsets, hereafter referred to as the "reported" dataset. Second, we sampled the reporting delays and subtracted their values from the known report dates for all cases with no available date of illness onset to obtain a proxy onset date, hereafter referred to as the "imputed" dataset.

Despite efforts to obtain high-quality surveillance data through contact tracing and testing, it is still likely that cases remain underascertained. Previous reports have suggested 9.2-44.4% case ascertainment using data on Japanese evacuees from the original epicenter of Wuhan, China and laboratory testing conducted in Japan during January and February 2020, respectively. 21, 22 For clusters with relatively stable and captive populations (e.g., medical centers and senior homes) ascertainment is likely to be higher due to intensive contact tracing on a focused population, though as chains of transmission move away from the common exposure setting, ascertainment will approach that of the general population. Clusters based on social contact linked to more than one common exposure setting (e.g., multiple restaurants, gyms, or other venues) are more likely to have lower levels of case ascertainment, though in Japan it is expected that ascertainment for cases related to a cluster would be higher than for the general population due to targeted case finding.
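The calculation described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the inner sum over secondary cases is evaluated with the closed-form probability generating function of the negative binomial distribution, the CDF of the serial interval plus reporting delay is approximated by Monte Carlo draws, the serial interval is assumed gamma-distributed with a hypothetical mean of 5.0 days and SD of 2.5 days, and the onset and report days are made up. Only the reporting delay mean and SD (7.2 and 4.7 days) and the 5% threshold come from the text. Days are counted from the last onset date here.

```python
import bisect
import random


def gamma_draw(rng, mean, sd):
    """Draw from a gamma distribution parameterized by mean and SD."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return rng.gammavariate(shape, scale)


def sample_delay_sums(n_draws, rng, si_mean=5.0, si_sd=2.5, rd_mean=7.2, rd_sd=4.7):
    """Sorted draws of serial interval + reporting delay (the convolution).

    si_mean and si_sd are hypothetical; rd_mean and rd_sd follow the text.
    """
    return sorted(
        gamma_draw(rng, si_mean, si_sd) + gamma_draw(rng, rd_mean, rd_sd)
        for _ in range(n_draws)
    )


def empirical_cdf(sorted_draws, x):
    """Fraction of draws that are <= x."""
    return bisect.bisect_right(sorted_draws, x) / len(sorted_draws)


def impute_onsets(report_days, rng, rd_mean=7.2, rd_sd=4.7):
    """Proxy onset day = report day minus a sampled reporting delay."""
    return [day - gamma_draw(rng, rd_mean, rd_sd) for day in report_days]


def prob_more_cases(t, onset_days, r_e, k, sorted_draws):
    """Probability that at least one further case will be reported after day t.

    Uses sum_y p_y * F**y = (1 + r_e * (1 - F) / k) ** (-k), the negative
    binomial probability generating function evaluated at F.
    """
    prob_none = 1.0
    for t_i in onset_days:
        big_f = empirical_cdf(sorted_draws, t - t_i)
        prob_none *= (1.0 + r_e * (1.0 - big_f) / k) ** (-k)
    return 1.0 - prob_none


def first_day_below(threshold, onset_days, r_e, k, sorted_draws, horizon=60):
    """Days after the last onset until the probability drops below threshold."""
    last_onset = max(onset_days)
    for day in range(horizon + 1):
        if prob_more_cases(last_onset + day, onset_days, r_e, k, sorted_draws) < threshold:
            return day
    return None  # threshold not reached within the horizon


# Hypothetical cluster: six known onset days, two cases with report day only.
rng = random.Random(2021)
draws = sample_delay_sums(20000, rng)
onsets = [0, 2, 3, 5, 9, 12] + impute_onsets([10, 15], rng)
for r_e in (0.5, 1.5, 3.0):
    print(r_e, first_day_below(0.05, onsets, r_e, 0.3, draws))
```

Returning `None` when the threshold is never reached mirrors the situation in which a probability estimate does not fall below the threshold within the period examined.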
To address likely case underascertainment, we sampled from a binomial distribution with a probability of success equal to one minus the ascertainment rate. We assume that underascertained cases could only exist within one serial interval (i.e. 5 days 23) from the date of onset of the last reported case because the contact tracing team is unlikely to miss cases from two consecutive generations. The number of unreported cases on a given day can then be inferred from an observational model in which the maximum possible number of unreported cases was set to 50 in our simulations.

The model was applied using data on the epidemic curves for individual clusters, accounting for missing dates of onset and varying levels of case ascertainment as described previously. The applied parameter values are shown in Table 1. R_e was explicitly varied between 0.5, 1.5, and 3, although an estimate of the local time-varying effective reproduction number would be a sensible option when conducting analyses in real time. Otherwise, we accounted for parameter uncertainties via resampling. Estimates for k were drawn from a positive half-normal distribution using mean and SD from published studies. 8, 10, 24 Other distributions were also resampled from their empirical distributions. The serial interval distribution was previously reported by Nishiura et al, 23 and the reporting delay was estimated from all cases reported in Japan through the end of May using doubly interval-censored methods described elsewhere. 23

The present study analyzed publicly available data which were already de-identified upon press release. The present study was approved by the Medical Ethics Board of the Graduate School of Medicine, Kyoto University (R2673).

Characteristics of the four clusters are shown in Table 2. The Aichi fitness gyms cluster had the smallest number of cases (n=40) while the Hokkaido senior care facility cluster was the largest (n=94).
The average age was lowest for the Kyoto cluster, which was associated with parties attended primarily by university students, while the average age was highest for the senior care facility cluster. Three-quarters of the Hokkaido cancer center cluster were female, while the gyms and university-related parties clusters were nearly evenly split between males and females. The proportion of cases with no reported date of onset ranged from 2.5% for the Aichi gyms cluster to 46.5% for the Kyoto university-related parties cluster. The time between first onset and last onset within each cluster ranged from 22 to 43 days. The delay from onset to prefecture report date for cases reported between when the first case was detected in January and May 31, 2020 was estimated at 7.2 days (95% credible interval [CrI]: 7.1-7.3 days) using the best-fit gamma distribution (see Table S1 and Figure S1). Figure 1 depicts the probability distributions of observing additional cases by cluster, varyingly accounting for asymptomatic cases and missing dates of onset (imputed dataset) and underascertainment of cases. indicates that more time is needed to be sufficiently certain that the end of an outbreak is declared appropriately. 3, 4 As well, a larger consistently resulted in slightly longer observation periods compared to smaller (see Table 2 and Figures S2.1-2.4). The CrI were widest for the combination of a large and small , indicating greater uncertainty in whether the outbreak had truly ended. The probabilities for some of the upper 95% CrI never dropped below the threshold values within the 42-day periods examined.

As guidance continues to be developed for COVID-19 response, it is important to incorporate insights from statistical modeling into the declining phases of the pandemic.
27 and , we estimated that end-of-outbreak declarations could be made before 28 days (two times the approximate maximum incubation period for COVID-19) 16, 25 had passed from the date of last report of any case (which is often the same day the case is isolated), and using these estimates could potentially allow for earlier end-of-outbreak declaration, saving resources.

Parameters such as the reporting delay and serial interval may also vary throughout the epidemic. When surveillance is heightened, the reporting delay may be shorter than when surveillance systems are overwhelmed. Similarly, nonpharmaceutical interventions such as contact tracing, isolation, and physical distancing change contact patterns and limit the time during which an infectious case may be in contact with susceptible individuals, shortening the serial interval. 28 Although these possible variations were not accounted for in this study, they can be incorporated if deemed to be of value to inform decisions regarding the continuation of public health and social response measures.

Furthermore, the size and scale of the epidemic curves used in our analyses depend on case, cluster, and outbreak definitions and on case ascertainment by the surveillance system. When a cluster definition is limited to cases directly linked to a location or activity (e.g., a hospital or an event), the cluster size will be smaller than if the cluster includes all secondary infections to household members and other contacts not directly related to the transmission event(s) that defined the cluster. Likewise, when cases are limited to those whose samples test positive (i.e. via polymerase chain reaction [PCR] or antigen testing), the scope and scale of the epidemic curve will be smaller than if other probable cases were included in the outbreak case definition.
The clusters reported here include original cluster cases as well as all subsequent cases in chains of transmission reported by local public health jurisdictions. It is possible that some cases associated with the cluster were missed or incorrectly attributed; however, we have repeatedly reviewed the data to minimize these possibilities. Our analyses accounting for possible underascertainment of cases likewise show that accounting for missed cases extends the observation period before our 5% threshold is reached.

Through the end of May 2020, only PCR-positive cases were included in the case definition for COVID-19 cases in Japan. Infected individuals may not have been tested if they were never suspected of being a case or did not meet testing criteria. 29 In addition, PCR sensitivity is less than perfect, falling to around 70% more than one week after symptom onset, 30 so some infected individuals may have received a false negative test result if their viral load at the time they were tested was insufficient to trigger a positive result. Sequentially repeated PCR testing in Japan for persons with persistent symptoms and/or new onset of symptoms after initially being tested while asymptomatic has identified cases that were initially PCR negative but epidemiologically linked to other cases, as has been seen elsewhere. 31

In addition, importation of cases is not accounted for in this method. Defining outbreaks based on the epidemiological linkage of cases to at least one common source of exposure (i.e. clusters) necessarily precludes inclusion of new sources of infection (i.e. importation).
A new case linked to, for example, a physical location that was a common source of exposure for cases in a cluster may represent an importation event rather than a continuation of the outbreak/cluster unless there is a clear epidemiological link (e.g., close contact or physical proximity during a given timeframe) between the newly detected case and the cluster. When applying this method to outbreaks defined by a geographic region with free-flow borders to other regions with active transmission (as was the case for prefectures in Japan during the first wave of the pandemic), importation of one or more cases before the outbreak ends would simply add to the existing epidemic curve and feed into the end-of-outbreak probability calculations. Likewise, exportation of cases within Japan is implicitly handled by this method, as any case epidemiologically linked to the cluster is included, regardless of geographic boundaries within Japan. However, possible exportation across international borders is not accounted for, and even when examining clusters, some links may be missed if local public health jurisdictions minimize publicly shared information (as was seen in later stages of the pandemic).

Lastly, further analyses describing transmissibility of the virus are needed to improve understanding of the most plausible range for these critical values. In addition, when more information is available regarding the different transmission routes of COVID-19 (i.e. airborne, droplet spread, or contact with contaminated fomites), the model could potentially be updated to account for differences in these routes, as was previously done for modeling the flare-ups of Ebola virus disease due to sexual transmission from male survivors.
4 Other methods for statistical end-of-outbreak determination have recently been proposed, and provide alternative options for examining cases using geographically based outbreak definitions. 32, 33 The focus of this study on clusters may be different from typical geographically based analyses, but we believe focusing on this scale can be meaningful to decision-makers dealing with clusters on an individual basis, such as officials for the involved local health jurisdictions (i.e. cities and prefectures) as well as the facilities (hospitals, senior homes, gyms, schools, etc.) where cases have been identified.

In summary, we have incorporated use of the reporting delay distribution into a model for end-of-outbreak probability estimation and applied this method to clusters in the COVID-19 epidemic in Japan. In doing so, we provide estimates of the probability that the outbreak will continue in real time. Communicating these probabilities can inform public health decision-making around the appropriate use of resources when transmission has declined for a given outbreak.

Each subfigure begins at the last date of onset within the cluster. All plots assume = 0.

Reporting delay: gamma distribution, mean 7.2 (SD 4.7), estimated.

R_e: effective reproduction number. k: overdispersion parameter. SD: standard deviation.

Cluster A: Fitness gyms cluster; Cluster B: University parties cluster; Cluster C: Senior care facility; Cluster D: Cancer center cluster. Observation days are reported relative to the last date of report of a case in the cluster (Day 0). Cells with "-" did not reach the prescribed threshold probability within the 42-day period analyzed.

How will country-based mitigation measures influence the course of the COVID-19 epidemic?
Impact of Non-Pharmaceutical Interventions (NPIs) to Reduce COVID-19 Mortality and Healthcare Demand
Rigorous surveillance is necessary for high confidence in end-of-outbreak declarations for Ebola and other infectious diseases
Sexual transmission and the probability of an end of the Ebola virus disease epidemic
Estimating the overdispersion in COVID-19 transmission using outbreak sizes outside China
Superspreading and the effect of individual variation on disease emergence
Maximum likelihood estimation of the negative binomial dispersion parameter for highly overdispersed data, with applications to infectious diseases
Epidemiology and transmission of COVID-19 in 391 cases and 1286 of their close contacts in Shenzhen, China: a retrospective cohort study
Pattern of early human-to-human transmission of Wuhan
Real-time monitoring the transmission potential of COVID-19 in
COVID-19) response in Japan
Clusters of Coronavirus Disease in Communities
Prime Minister's Office of Japan; Ministry of Health Labour and Welfare. Avoid the "three Cs
Public Health Criteria to Adjust Public Health and Social Measures in the Context of COVID-19
Inference of R0 and transmission heterogeneity from the size distribution of stuttering chains
Public Health Agency of Canada. Interim guidance: Public health management of cases and contacts associated with novel coronavirus disease 2019 (COVID-19)
WHO Recommended Criteria for Declaring the End of the Ebola Virus Disease Outbreak
Recrudescence of Ebola virus disease outbreak in West Africa
Ministry of Health Labour and Welfare (MHLW). COVID-19 clusters in Japan
The rate of underascertainment of novel coronavirus (2019-nCoV) infection: Estimation using Japanese passengers data on evacuation flights
Ascertainment rate of novel coronavirus disease (COVID-19)
Serial interval of novel coronavirus (COVID-19) infections
Evaluating transmission heterogeneity and super-spreading event of COVID-19 in a metropolis of China
Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: A statistical analysis of publicly available case data
R: A Language and Environment for Statistical Computing
Key questions for modelling COVID-19 exit strategies
Serial interval of SARS-CoV-2 was shortened over time by nonpharmaceutical interventions. Science
Clinical sensitivity and interpretation of PCR and serological COVID-19 diagnostics for patients presenting to the hospital
Correlation of chest CT and RT-PCR testing in coronavirus disease 2019 (COVID-19) in China: A report of 1014 cases
An exact method for quantifying the reliability of end-of-epidemic declarations in real time
Accurate forecasts of the effectiveness of interventions against Ebola may require models that account for variations in symptoms during infection
A systematic review of COVID-19 epidemiology based on current evidence

The authors have no conflicts of interest to disclose.