key: cord-0322077-olwrmlrh authors: Parag, K. V. title: Sub-spreading events limit the reliable elimination of heterogeneous epidemics date: 2021-03-13 journal: nan DOI: 10.1101/2021.03.13.21253477 sha: a79f967a54f7bbf8f9556faa256e446c8c3f2feb doc_id: 322077 cord_uid: olwrmlrh

We show that sub-spreading events i.e., transmission events in which an infection propagates to few or no individuals, can be surprisingly important for defining the lifetime of an infectious disease epidemic and hence its waiting time to elimination or fade-out, measured from the time-point of its last observed case. While limiting super-spreading promotes more effective control when cases are growing, we find that when incidence is waning, curbing sub-spreading is more important for achieving reliable elimination of the epidemic. Controlling super-spreading in this low-transmissibility phase offers diminishing returns over non-selective population-wide measures. By restricting sub-spreading we efficiently dampen remaining variations among event reproduction numbers, which minimises the risk of premature and late end-of-epidemic declarations. Because case-ascertainment rates can be modelled in exactly the same way as control policies, we concurrently show that the under-reporting of sub-spreading events during waning phases will engender overconfident assessments of epidemic elimination. While controlling sub-spreading may not be easily realised, the likely neglecting of these events by surveillance systems could result in unexpectedly risky end-of-epidemic declarations. Super-spreading controls the size of the epidemic peak but sub-spreading mediates the variability of its tail.

Here $\Lambda_s := \sum_{u=1}^{s-1} I_{s-u} w_u$, which depends on $I_1^{s-1}$, is known as the total infectiousness. It characterises how many past effective cases contribute to the next observed case-count at $s$.
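As a rough illustration of the total infectiousness defined above, the following Python sketch evaluates the sum for a given time $s$. The function name, the list-based indexing (position 0 holds $I_1$ and $w_1$) and the example numbers are our own assumptions for illustration; they are not taken from the original analysis.

```python
def total_infectiousness(incidence, w, s):
    """Return Lambda_s = sum_{u=1}^{s-1} I_{s-u} * w_u.

    incidence[0] holds I_1 and w[0] holds w_1 (hypothetical layout).
    Lags longer than len(w) are treated as having zero weight.
    """
    if s < 1 or s > len(incidence) + 1:
        raise ValueError("s must lie between 1 and len(incidence) + 1")
    max_lag = min(s - 1, len(w))
    return sum(incidence[s - u - 1] * w[u - 1] for u in range(1, max_lag + 1))


# Illustrative values only: three days of cases and a three-day
# generation time distribution that sums to 1.
incidence = [1, 2, 3]
w = [0.5, 0.3, 0.2]
print(total_infectiousness(incidence, w, 4))  # I_3*w_1 + I_2*w_2 + I_1*w_3
```

Note that $\Lambda_1 = 0$ under this convention, since no past cases exist before the first time point.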
The generation time distribution (which we assume to be equal to the serial interval distribution) is central to defining the impact of each past case [20].

If we describe an epidemic as consisting of a sequence of spreading or transmission events, with the reproduction number of the spreading event at time $s$ as $R_s$, then standard renewal models assume a fixed $R_s$ [21], [22], [23]. This formulation, while useful, does not account for possible heterogeneities in transmission, which are known features of many respiratory diseases such as the SARS and MERS coronaviruses [2]. If we define a distribution over $R_s$ with mean $E[R_s] = \mu_s$ then these models set $P(R_s = \mu_s) = 1$. Heterogeneous transmission is a mean preserving spread of this condition i.e. events with fixed mean $\mu_s$ can have different $R_s$ values.

We define super-spreading events as those driven by $R_s$ significantly larger than $\mu_s$. This characterisation differs slightly from the standard in [2], which directly uses numbers of secondary cases. Since the renewal model has long-term memory (i.e. factors in the age of infections via $\Lambda_s$), using $I_s$ would not be as appropriate here. However, because $I_s$ behaves like a noisy, scaled version of $R_s$ (by the properties of Poisson mixtures [24]), these two definitions are largely consistent. In this work we consider the end of the epidemic, which follows the waning phase of the epidemic. This contrasts with the development in [2], which focusses on the growth phase. We define sub-spreading events as having $R_s$ notably smaller than $\mu_s$. This type of event has received appreciably less attention (than super-spreading ones) and is our main topic of study.

If we make the usual assumption that $R_s$ is gamma (Gam) distributed with shape $k$ and scale $\mu_s k^{-1}$ i.e. the right side of Eq. (1), then we obtain the negative binomial (NB) relation $I_s \sim \text{NB}\left(k, \frac{\mu_s \Lambda_s}{\mu_s \Lambda_s + k}\right)$. This is the most common
method for incorporating heterogeneity within renewal model frameworks [12], [25], [14]. As $k$ gets smaller the likelihood of both super- and sub-spreading events increases [2], [24]. Special cases are at $k \to \infty$, $k = 1$ and $k \to 0$, for which $I_s$ has a Poisson, geometric and logarithmic distribution respectively with mean $\mu_s \Lambda_s$ [18]. Many of the infectious diseases that feature significant heterogeneity have been found to exhibit $k < 1$ [2], [26]. Note that this NB model can also be used to describe reporting noise and other types of heterogeneity.

medRxiv preprint doi: https://doi.org/10.1101/2021.03.13.21253477; this version posted March 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Variation in reproduction numbers and incidence

We explicitly characterise how heterogeneity in transmission can control the incidence of an epidemic and then assess the implications of this observation. Consider any arbitrary distribution over the effective reproduction number [...] (1)), we obtain the straightforward but general Eq. (2) below. [...] Eq. (2) shows that for any renewal model there is a direct relationship between mean incidence and $\mu_s$. Similarly, we apply the law of total variance to get: [...] is simple, its ramifications, which are meaningful, have not been explored.

Understanding how properties of the $R_s$ distribution map onto the related $I_s$ one, which is a mixed Poisson distribution, provides insights into other epidemic properties.
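The mean and variance relations referred to above can be checked numerically for the gamma case. The sketch below rests on our reading of Eq. (1), which is not reproduced here: that $I_s$ given $R_s$ is Poisson with mean $R_s \Lambda_s$. Under that assumption the laws of total expectation and total variance give $E[I_s] = \mu_s \Lambda_s$ and $\text{VM}[I_s] = 1 + \Lambda_s \text{VM}[R_s]$, where VM denotes a variance-to-mean ratio and $\text{VM}[R_s] = \mu_s / k$ for the gamma distribution. The function name and parameter values are hypothetical.

```python
import math


def nb_moments(k, mu, lam, tol=1e-15, max_terms=100000):
    """Mean and variance of I ~ NB(k, p), p = mu*lam / (mu*lam + k).

    The moments are obtained by summing the probability mass function
    directly, stopping once the remaining terms are negligible.
    """
    p = mu * lam / (mu * lam + k)
    mean = 0.0
    second = 0.0
    for i in range(max_terms):
        # log P(I = i) = log Gamma(i+k) - log i! - log Gamma(k)
        #                + k*log(1-p) + i*log(p)
        log_pmf = (math.lgamma(i + k) - math.lgamma(i + 1) - math.lgamma(k)
                   + k * math.log1p(-p) + i * math.log(p))
        pmf = math.exp(log_pmf)
        mean += i * pmf
        second += i * i * pmf
        if i > mu * lam and i * i * pmf < tol:
            break
    return mean, second - mean ** 2


# Illustrative values: k = 0.5, mu_s = 0.5, Lambda_s = 4.
k, mu, lam = 0.5, 0.5, 4.0
mean, var = nb_moments(k, mu, lam)
print(mean, mu * lam)                  # mean incidence vs mu_s * Lambda_s
print(var / mean, 1 + lam * mu / k)    # VM[I_s] vs 1 + Lambda_s * VM[R_s]
```

The two printed pairs should agree up to truncation error, which illustrates how heterogeneity in $R_s$ inflates the variability of incidence without changing its mean.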
From mixed Poisson theory [24] we deduce that (a) if $R_s$ is unimodal and continuous then $I_s$ is also unimodal, (b) the shape of $I_s$ will be similar to that of the mixing distribution describing $R_s$ (e.g. when $R_s$ is exponential, $I_s$ becomes geometrically distributed) and (c) every mixed Poisson distribution corresponds to a unique mixing distribution i.e. multiple $R_s$ distributions cannot map to the same $I_s$ distribution [24]. Properties (a)-(c) establish that much about super- and sub-spreading events can be learned from $R_s$. We will exploit these relationships to better understand epidemic elimination and heterogeneity.

[...] numbers, $\mu_{s+1}^{\infty}$, are known, then we can construct the probability of elimination given some sample [...] As we condition on the sample $R_{s+1}^{\infty}$ and because incidence is non-negative and $I_j$ does not depend on $R_{j+1}^{\infty}$, we can decompose $z_s$ to get Eq. (5). [...] Keeping to convention, we will often present results in terms of $\Delta s$, which is the time relative to that at which incidence was last non-zero, $t_0$. We compute the relative time at which the epidemic is eliminated with $\alpha$% confidence, given the transmission event sample $R_{s+1}^{\infty}$, as $t_\alpha$ in Eq. (7). [...]

Eq. (7) also gives the time that an epidemic, composed of spreading events $R_{s+1}^{\infty}$, can be declared over with at least $\alpha$% confidence [29], [15]. This confidence reflects the remaining variability among possible epidemic trajectories despite conditioning on the fixed future $R_{s+1}^{\infty}$ sample and past observed incidence $I_1^s$.
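Eqs. (5)-(7) are not reproduced above, so the following is only a sketch of how an elimination probability and a declaration time could be computed. It assumes (our assumption, not a statement of the original equations) that, given $R_j$, each future count $I_j$ is Poisson with mean $R_j \Lambda_j$, so the probability of observing no further cases after $s$ is a product of terms $\exp(-R_j \Lambda_j)$ with all incidence after $s$ set to zero. All function names, the callable `r_future` and the example numbers are hypothetical.

```python
import math


def total_infectiousness(incidence, w, s):
    """Lambda_s = sum_{u=1}^{s-1} I_{s-u} * w_u, with incidence[0] = I_1."""
    max_lag = min(s - 1, len(w))
    return sum(incidence[s - u - 1] * w[u - 1] for u in range(1, max_lag + 1))


def elimination_probability(incidence, w, r_future):
    """Probability of no cases after s = len(incidence).

    r_future(j) returns the sampled reproduction number R_j for j > s.
    Only len(w) future steps matter because older cases carry no weight.
    """
    s = len(incidence)
    padded = list(incidence) + [0] * len(w)  # assume zero future cases
    exponent = 0.0
    for j in range(s + 1, s + len(w) + 1):
        exponent += r_future(j) * total_infectiousness(padded, w, j)
    return math.exp(-exponent)


def declaration_time(incidence, w, r_future, alpha=95.0):
    """Smallest number of zero-case days after the last case such that
    the elimination probability is at least alpha percent."""
    last = max(i for i, c in enumerate(incidence) if c > 0) + 1
    for delta in range(len(w) + 1):
        observed = list(incidence[:last]) + [0] * delta
        if elimination_probability(observed, w, r_future) >= alpha / 100.0:
            return delta


# Illustrative values: constant R_j = 0.5 and a two-day generation time.
cases = [2, 0, 1]
w = [0.6, 0.4]
print(elimination_probability(cases, w, lambda j: 0.5))
print(declaration_time(cases, w, lambda j: 0.5))
```

Repeating the calculation over many sampled sequences of future reproduction numbers would give a family of elimination curves, whose spread reflects the heterogeneity of the spreading events.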
If we think of the epidemic as a process that generates infections then its survival function, which captures the probability of the epidemic propagating at least 1 future case after $s$, is precisely $1 - z_s$. [...]

In Eq. (4) we showed how VM (variance-to-mean) ratios of the event reproduction numbers, $R_s$, directly control those of the incidence values, $I_s$. We first verify this relationship on numerous epidemics simulated according to the heterogeneous renewal model of Eq. (1). We consider epidemics characterised by an initial exponential growth followed by drastic control (e.g. a lockdown measure) and compute the VM ratios of both $I_s$ and $R_s$ for all times $s$ of the epidemics. [...]

(iii) Sub-spreading control (size-biased reporting). This is a novel intervention that we introduce here. It focuses on removing the sub-spreading events and is the converse of (ii). The gamma distribution of $R$ is lower-truncated at some minimum value $a$ and rescaled so that $\int_a^\infty P(R)\,dR = 1$ and $\int_a^\infty P(R)\,R\,dR = \rho\mu$. We use $a$ to define sub-spreading events. This scheme is analogous to a size-biased case reporting strategy in which events producing few cases are under-sampled, with $a$ as the left under-reporting point of the $R$ distribution i.e. we never sample $R < a$. As (ii) and (iii) do not admit simple VM expressions, we investigate them through simulation.

As $a \to 0$ and $b \to \infty$ all three control (or reporting) measures (and the scale parameter of their respective gamma $R$ distributions) converge. They also converge as $k \to \infty$ since the $R$ distribution becomes degenerate at $\rho\mu$ under these conditions and there is no heterogeneity.
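To make the sub-spreading scheme (iii) more concrete, the sketch below draws reproduction numbers from a gamma distribution that is lower-truncated at $a$, with its scale chosen so that the truncated mean equals $\rho\mu$. The Monte Carlo bisection used to find the scale is our own simplification for illustration and is not claimed to be the procedure used in the original simulations; the function names, sample size and parameter values are hypothetical.

```python
import random


def sub_spreading_sample(k, target_mean, a, n=20000, seed=1):
    """Sample R from a gamma(shape k) distribution truncated to R >= a,
    with the scale tuned so that the sample mean is close to target_mean.

    Returns the retained samples and the fitted scale.
    """
    if not 0 < a < target_mean:
        raise ValueError("need 0 < a < target_mean")
    rng = random.Random(seed)
    # Unit-scale gamma draws; multiplying by theta gives scale theta.
    base = [rng.gammavariate(k, 1.0) for _ in range(n)]

    def kept(theta):
        return [theta * g for g in base if theta * g >= a]

    def mean_at(theta):
        xs = kept(theta)
        # With nothing retained, use the limiting truncated mean a.
        return sum(xs) / len(xs) if xs else a

    lo, hi = 0.0, 10.0 * target_mean / k
    for _ in range(60):  # bisection on the scale parameter
        mid = 0.5 * (lo + hi)
        if mean_at(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return kept(hi), hi


def vm_ratio(xs):
    """Variance-to-mean ratio of a sample."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs) / m


# Illustrative values: k = 0.5, controlled mean rho*mu = 0.5, a = 0.1.
k, rho_mu, a = 0.5, 0.5, 0.1
samples, scale = sub_spreading_sample(k, rho_mu, a)
print("fitted scale:", scale)
print("VM under sub-spreading control:", vm_ratio(samples))
print("VM under uniform control:", rho_mu / k)  # gamma VM equals its scale
```

In this example the truncated distribution keeps the same mean as uniform control but has a smaller VM ratio, which is the kind of comparison the simulations in the text explore.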
For simplicity, from this point we will usually refer to schemes (i)-(iii) via their control classification, switching to their reporting analogue only later when discussing results. We treat uniform control as a baseline since it ignores the specific form of the $R$ distribution. While preferentially limiting super-spreading is sensible, and has been shown to have superior performance relative to uniform control [2], sub-spreading control has, to our knowledge, not been investigated. This is likely because it seems counter-intuitive to focus on events with low transmission potential.

[...] meaning that most of the variability derives from the sub-spreading events. We confirm this notion in Fig. 3, which examines the cumulative distribution function $F(R)$ at the extreme values of Fig. 2.

Fig. 3: Control strategies shape $R$ distributions. We plot the cumulative distribution function $F(R)$ for uniform (green), super-spreading (red) and sub-spreading (blue) control measures for a renewal model with $k = 0.5$ at large and small $\rho\mu$ from Fig. 2. We find that super-spreading control is best (in terms of VM ratios) at large $\rho\mu$ because there is a notable probability of super-spreading events which becomes truncated under this measure i.e. it rises to 1 first. However, when $\rho\mu$ is small this becomes vastly less important (the super-spreading and uniform controls converge) and the sub-spreading control rises more quickly.

Here we see that, at large $\rho\mu$, limiting super-spreading forces $F(R)$ towards 1 at the fastest rate i.e. the clipping of super-spreading events closes the effective support of the $R$ distribution earlier than the other measures.
[...] in our ability to constrain $z_s$ and the epidemic lifetime and hence to achieve reliable end-of-epidemic declaration times.

Reducing variation among elimination times

We examine two data-justified measures of the end of an epidemic: the mean elimination time, $\bar{t}_\alpha$, and the maximum elimination time, $t_{\alpha,\max}$. Both have previously been used for assessing end-of-epidemic declarations [13], [15]. As is convention, we focus on 95% confidence and set $\alpha = 95$. Times $\bar{t}_\alpha$ and $t_{\alpha,\max}$ are obtained by finding when the mean and minimum of the $z_s$ curves i.e. $\bar{z}_s$ and $z_{s,\min}$, generated from possible future epidemic trajectories, first cross 0.95, respectively. We generate possible $z_s$ by drawing samples from the distributions of $R_{s+1}^{\infty}$ and then computing Eq. (6). All $z_s$ curves are conditioned on some fixed past incidence $I_1^s$.

Fig. 4: Mean and worst case elimination probabilities. We compute the mean ($\bar{z}_s$) and worst case ($z_{s,\min}$) elimination probability curves conditioning on incidence data from the MERS-CoV epidemic in South Korea in 2015 as in [29]. These curves are for various $k$ rising from 0.1 (blue) to 2 (red) under a mean controlled reproduction number of $\rho\mu = 0.5$. We find that increasing heterogeneity (smaller $k$) leads to the earliest mean elimination time (the $\bar{z}_s$ is largest) and the latest maximum elimination time (the $z_{s,\min}$ is smallest). These statistics are obtained over 2000 simulated future epidemic trajectories for each control scenario listed.

In Fig. 4 we present a range of mean and worst case $z_s$ curves for various $k$ with blue indicating the smallest $k$ and red the largest.
All times are given relative to the last observed non-zero case day. For all control measures we [...] control scheme considerably shrinks the variation among $z_s$ curves. This proceeds from the results of the previous section, where we found that, at small mean reproduction numbers (which is realistic when we are near the tail or end of an epidemic), limiting sub-spreading significantly reduces $\text{VM}[R_s]$ and hence $\text{VM}[I_s]$ (see Eq. (4)). We assess this in more detail by examining all the $z_s$ curves, which led to the mean and minima in Fig. 4, and the distributions of the 95% declaration times, $t_{95}$, that result. We provide these in Fig. 5.

As $k$ becomes larger the difference among all control measures shrinks, as expected. For epidemics with significant heterogeneity ($k < 1$) we observe that both uniform and super-spreading control result in large variations in $z_s$ (top red and green curves in Fig. 5, which mostly overlay each other). This manifests in a notable spread of 95% declaration times (bottom red and green bars of Fig. 5, also overlaid). Sub-spreading control is, however, able to suppress much of this variation, yielding more deterministic elimination probabilities and declaration times (blue curves and bars in Fig. 5) and thus minimising the possibility of early or late declarations.

We observe consistency in this trend for both EVD and SARS incidence curves in Fig. A.1 of the appendix. We [...]
Fig. 5: Elimination curves and declaration times for various control strategies. We simulate 2000 future trajectories of zero case-days, conditioned on the incidence data of the MERS-CoV epidemic in South Korea in 2015 [29]. Each trajectory is formed by sampling from the $R_{s+1}^{\infty}$ distributions for uniform (green), super-spreading (red) and sub-spreading (blue) control measures. The top panel shows elimination probabilities ($z_s$) computed under these trajectories for various $k$ and the bottom panel provides corresponding 95% declaration times ($t_{95}$). All times are relative to that of the last observed case and assume $\rho\mu = 0.5$ at every future time. We find that sub-spreading control is most effective at reducing the variability and hence in increasing the reliability of both $z_s$ and $t_{95}$.

[...] effect. Failing to observe sub-spreading events leads to a strongly overconfident view of elimination. The epidemic tail appears far less variable when those events are excluded, leading to risky end-of-epidemic declarations.

[...] (bottom) we simulate $10^4$ trajectories under a step-change in mean effective reproduction number, $\rho\mu$, from 2.5 to 0.6, with $k = 0.25$. At each time point we draw event reproduction numbers from the uniform (green), super-spreading (red) and sub-spreading (blue) distributions with these means. We consistently find, for waning epidemics, that sub-spreading control minimises VM ratios. The mean incidence from all methods is approximately the same.
Last, we comment on differences between our renewal model approach and the Galton-Watson (GW) branching [...] Since elimination measures only become an important consideration after $t_0$ and for epidemics of notable size [16], [29], burn-out due to overdispersion does not benefit our analysis. The extra dynamics beyond $t_0$, which do not exist for GW processes, form our problem of interest. However, the VM ratio behaviour of our control schemes (i)-(iii) is quite general and still holds under GW models. In Fig. 6 we show that the GW process (top [...]

Using our framework, we found that when epidemics are strongly heterogeneous, i.e. $k < 1$, the maximum and mean declaration times, relative to that of the last observed case, depend contrastingly on $k$ (see Fig. 4). Although the average time to declaration decreases as $k$ falls (i.e. as heterogeneity increases), supporting previous work linking heterogeneity to epidemic extinction [14], the concomitant increase in variability means that the safest declaration time actually increases and the risk of early or late declarations can be severely amplified. Variation originating from transmission heterogeneity is therefore not beneficial for achieving safe and reliable end-of-epidemic declarations.

Consequently, we investigated if targeted control can ameliorate this end-of-epidemic volatility. We considered three control schemes (for the same mean level of control $\rho$), which were non-selective or targeted either super- or sub-spreading (see Fig. 2 and Fig. 3). Intriguingly, we found that, because the controlled mean of the event reproduction numbers $\rho\mu_s$ is below 1 at pre-elimination settings, curbing super-spreading only marginally improved on non-selective approaches.
However, sub-spreading appeared to be the main contributor to end-of-epidemic declaration risk, meaning that limiting those events can significantly reduce that risk, forcing the epidemic tail to be more deterministic and increasing the reliability of resulting intervention relaxation policies.

[...] applies to the waning phases of the epidemic, after most drastic changes would likely have already occurred.

Our framework provides a general toolkit for testing hypotheses about targeted controls or case ascertainment schemes and measuring their influence on end-of-epidemic declaration times. It can also be easily extended to include additional factors such as imported cases or to investigate other types of heterogeneity (e.g. age-based reproduction numbers) [15]. As the current COVID-19 pandemic underscores, much still remains unknown about the relative merits of elimination approaches, e.g. "zero COVID" strategies [11], and the influence of heterogeneity [4]. Improved understanding of epidemic dynamics can only aid preparedness and decision-making. We hope that our framework, which exposed unexpected consequences of understudied spreading events [19], can contribute towards this goal and help inform safe intervention relaxation and end-of-epidemic declaration strategies.
[...] and the UK Department for International Development (DFID) under the MRC/DFID Concordat agreement and is also part of the EDCTP2 programme supported by the European Union.

[16] WHO. WHO recommended criteria for declaring the end of the Ebola virus disease outbreak; 2020. Available from: https://www.who.int/who-documents-detail/who-recommended-criteria-for-declaring-the-end-of-the-ebola-virus-disease-outbreak.

[...] [22] and the simulated incidence from for EVD (bottom) [33]. We generate trajectories by sampling from the $R_{s+1}^{\infty}$ distributions for uniform (green), super-spreading (red) and sub-spreading (blue) control measures and plot corresponding 95% declaration times ($t_{95}$). The uniform and super-spreading results largely overlap. All times are relative to that of the last observed case and assume $\rho\mu = 0.5$. As in the main text, we see that sub-spreading control is most effective at reducing the variability among possible $t_{95}$ and hence increases the reliability of any end-of-epidemic declaration.

The Effective Reproduction Number as a Prelude to Statistical Estimation of Time-Dependent Epidemic Trends. In: Mathematical and statistical estimation approaches in epidemiology
Superspreading and the effect of individual variation on disease emergence
Deciphering early-warning signals of the elimination and resurgence potential of SARS-CoV-2 from limited data at multiple scales
Characterizing superspreading of SARS-CoV-2: from mechanism to measurement. medRxiv
Epidemiology of Transmissible Diseases after Elimination
Heterogeneities in the transmission of infectious agents: Implications for the design of control programs
The effect of superspreading on epidemic outbreak size distributions
Identifying and Interrupting Superspreading Events-Implications for Control of Severe Acute Respiratory Syndrome Coronavirus 2
Five challenges for stochastic epidemic models involving global transmission
Improved estimation of time-varying reproduction numbers at low case incidence and between epidemic waves. medRxiv
Elimination may be the optimal response strategy for COVID-19 and other emerging pandemic diseases
Sexual transmission and the probability of an end of the Ebola virus disease epidemic
Objective determination of end of MERS outbreak
A quantitative framework to define the end of an outbreak: application to Ebola Virus Disease
An exact method for quantifying the reliability of end-of-epidemic declarations in real time
Understanding the influence of all nodes in a network
How generation intervals shape the relationship between growth rates and reproductive numbers
A New Framework and Software to Estimate Time-Varying Reproduction Numbers During Epidemics
Adaptive Estimation for Epidemic Renewal and Phylogenetic Skyline Models
Mixed Poisson Distributions
Measuring the path toward malaria elimination
Spatial and temporal dynamics of superspreading events in the 2014-2015 West Africa Ebola epidemic
Using information theory to optimise epidemic models for real-time prediction and estimation
On the estimation of the reproduction number based on misreported epidemic data
Methods to determine the end of an infectious disease epidemic: a short review
Mathematical and statistical modeling for emerging and re-emerging infectious diseases
Quantitative Methods for Investigating Infectious Disease Outbreaks
All of Statistics: A Concise Course in Statistical Inference
On the distribution theory of over-dispersion
outbreaks: A Collection of Disease Outbreak Data
Exhaled aerosol increases with COVID-19 infection, age, and obesity
Viral load and contact heterogeneity predict SARS-CoV-2 transmission and super-spreading events
Statistical physics of vaccination
Serial interval of SARS-CoV-2 was shortened over time by nonpharmaceutical interventions