key: cord-0306545-1ecdj314
authors: Glennon, Emma E; Jephcott, Freya L; Oti, Alexandra; Carlson, Colin J; Bustos Carillo, Fausto A; Hranac, C Reed; Parker, Edyth; Wood, James L N; Restif, Olivier
title: Syndromic detectability of haemorrhagic fever outbreaks
date: 2020-03-31
journal: nan
DOI: 10.1101/2020.03.28.20019463
sha: 27a3cca0f970a1498cd8ea7dea67ca5ca5ea796e
doc_id: 306545
cord_uid: 1ecdj314

Late detection of emerging viral transmission allows outbreaks to spread uncontrolled, the devastating consequences of which are exemplified by recent epidemics of Ebola virus disease. Especially challenging in places with sparse healthcare, limited diagnostic capacity, and public health infrastructure, syndromes with overlapping febrile presentations easily evade early detection. There is a clear need for evidence-based and context-dependent tools to make syndromic surveillance more efficient. Using published data on symptom presentation and incidence of 21 febrile syndromes, we develop a novel algorithm for aetiological identification of case clusters and demonstrate its ability to identify outbreaks of dengue, malaria, typhoid fever, and meningococcal disease based on clinical data from past outbreaks. We then apply the same algorithm to simulated outbreaks to systematically estimate the syndromic detectability of outbreaks of all 21 syndromes. We show that while most rare haemorrhagic fevers are clinically distinct from most endemic fevers in sub-Saharan Africa, VHF detectability is limited even under conditions of perfect syndromic surveillance. Furthermore, even large clusters (20+ cases) of filoviral diseases cannot be routinely distinguished by the clinical criteria present in their case definitions alone; we show that simple syndromic case definitions are insensitive to rare fevers across most of the region.
We map the estimated detectability of Ebola virus disease across sub-Saharan Africa, based on geospatially mapped estimates of malaria, dengue, and other fevers with overlapping syndromes. We demonstrate "hidden hotspots" where Ebola virus is likely to spill over from wildlife and also transmit undetected for many cases. Such places may represent both the locations of past unobserved outbreaks and potential future origins for larger epidemics. Finally, we consider the implications of these results for improved locally relevant syndromic surveillance and the consequences of syndemics and under-resourced health infrastructure for infectious disease emergence.

The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license. This version posted March 31, 2020.

Text S1). we collected data). The expected detectable size of a syndrome n within its local disease context is the minimum cluster size c for which $\bar{p}_{n,c} > 0.5$. Unless otherwise stated, we used

surveillance regimes (i.e., considering all symptoms, typical symptoms, or minimal VHF symptoms; Text S1). Finally, to estimate geospatial variability in the difficulty of detecting Ebola virus outbreaks, we estimated mean detectability of 5 simulated 10-case EVD clusters across sub-Saharan Africa using high-spatial-resolution incidence estimates for malaria [17]
probabilities were consistently higher than expected, while dengue and meningococcal detection probabilities were consistent with expectations (Fig. S6-B). Algorithm performance on typhoid outbreaks was the least accurate, with seven of ten outbreaks predicted to have less than 10% of their expected probability (Fig. S6-B). Typhoid outbreaks with several clinical features reported tended to be misidentified as diarrhoeal diseases with a higher-than-expected probability, and those with few symptoms reported (in several instances, only a low probability of death and/or haemorrhage) tended to be misidentified as lower or upper respiratory infections with a higher-than-expected probability (Fig. S6-C).

syndromes, we estimate the detectable sizes of each syndrome as well as likely misidentifications between syndromes (Fig. 2). We estimate, for example, that Ebola virus disease and Marburg virus disease (MVD) are most likely to be misidentified as dengue haemorrhagic fever, yellow fever, or typhoid fever. The clinical features caused by EVD and MVD outbreaks are more likely to be caused by other diseases until at least 6 and 7 cases occur, respectively.
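The threshold behaviour described above can be illustrated with a toy Bayesian calculation. The sketch below is not the authors' algorithm (which is specified in Text S1 and is not reproduced here); it assumes that all cases in a cluster share one aetiology, that every case shows the same feature profile, and that clinical features are conditionally independent. The function names, syndrome labels, incidences, and feature probabilities are all hypothetical. It also returns a single posterior rather than a mean over simulated clusters, so it only approximates the notion of detectable size used in the text.

```python
from math import exp, log


def cluster_posterior(incidence, feature_prob, observed, cluster_size):
    """Posterior probability of each candidate aetiology for a case cluster.

    Toy assumptions (not from the paper): all cases share one aetiology,
    every case shows the same `observed` feature profile, and features are
    conditionally independent. Probabilities must lie strictly in (0, 1).
    """
    log_scores = {}
    for name, inc in incidence.items():
        per_case = 0.0
        for feature, present in observed.items():
            p = feature_prob[name][feature]
            per_case += log(p if present else 1.0 - p)
        # Prior (relative incidence) times the likelihood of `cluster_size` cases.
        log_scores[name] = log(inc) + cluster_size * per_case
    top = max(log_scores.values())  # subtract the maximum for numerical stability
    weights = {name: exp(s - top) for name, s in log_scores.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def detectable_size(target, incidence, feature_prob, observed, max_size=50):
    """Smallest cluster size at which `target` has posterior probability > 0.5."""
    for size in range(1, max_size + 1):
        if cluster_posterior(incidence, feature_prob, observed, size)[target] > 0.5:
            return size
    return None  # not detectable within max_size cases


# Hypothetical values chosen only for illustration.
incidence = {"common_fever": 0.999, "rare_vhf": 0.001}
feature_prob = {
    "common_fever": {"fever": 0.9, "haemorrhage": 0.05},
    "rare_vhf": {"fever": 0.9, "haemorrhage": 0.6},
}
observed = {"fever": True, "haemorrhage": True}

print(detectable_size("rare_vhf", incidence, feature_prob, observed))  # prints 3
```

With these made-up numbers, one or two haemorrhagic cases are still better explained by the common fever because of its much higher incidence; by the third case the likelihood ratio outweighs the prior and the rare syndrome becomes the most probable explanation.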
Before this detectability threshold, the balance of probabilities suggests a few cases of more common aetiologies are presenting with rare clinical features; after this threshold, the presentation of cases is unusual enough to suggest a common presentation of extremely rare aetiologies (i.e., filoviral disease). We summarise the estimated detectabilities (i.e., probabilities of correct aetiological identification) for each syndrome in

with malaria at small cluster sizes (approx. 1-5 cases) and dengue haemorrhagic fever at larger sizes (approx. 5+ cases) and is not 50% detectable until a cluster size of 25 cases. Considering all 18 clinical features included in our database, most other syndromes reach 50% detectability within the first 5 cases; this includes moderately rare syndromes such as leptospirosis, yellow fever, and typhoid fever. Reducing the range of clinical features considered dramatically reduces identifiability of most syndromes (Fig. 3). Considering only minimal VHF features (i.e., fever, haemorrhage/bleeding, death, hiccups, and jaundice) renders most syndromes undetectable by 20 cases, suggesting that, e.g., case definitions for EVD or MVD are insufficient to detect them within the first 20 cases. However, the addition

Considering endemic context at a higher spatial resolution reveals heterogeneity in detectability and regions where EVD is most able to spread undetected (Fig. 4-A). Where existing risk maps for EVD consider the ecological niche of Ebola virus in reservoir hosts and/or human population density and movement, we add an entirely new layer to the EVD map representing variation in outbreak detectability.
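One simple way to picture how a detectability layer might be overlaid on an existing spillover-risk map is to flag grid cells where risk is high and detectability is low. The sketch below is an assumption-laden illustration, not the mapping procedure used for Figure 4: the cell records, field names, and both thresholds are hypothetical.

```python
def hidden_hotspots(cells, min_spillover_risk, max_detectability):
    """Return the ids of cells with high spillover risk and low detectability.

    `cells` is a list of dicts with hypothetical keys "id", "spillover_risk",
    and "detectability", each value scaled to the range 0-1.
    """
    return [
        cell["id"]
        for cell in cells
        if cell["spillover_risk"] >= min_spillover_risk
        and cell["detectability"] <= max_detectability
    ]


# Hypothetical grid cells for illustration only.
cells = [
    {"id": "A", "spillover_risk": 0.8, "detectability": 0.2},  # risky and hard to detect
    {"id": "B", "spillover_risk": 0.8, "detectability": 0.9},  # risky but detectable
    {"id": "C", "spillover_risk": 0.1, "detectability": 0.1},  # hard to detect, low risk
]

print(hidden_hotspots(cells, min_spillover_risk=0.5, max_detectability=0.3))  # prints ['A']
```

Hard thresholds are used here only to keep the example short; a continuous bivariate score would avoid the arbitrary cut-offs.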
Considering detectability of EVD clusters in the context of existing geospatial estimates of spillover risk demonstrates potential "hidden hotspots" where EVD is both most able to spread undetected by syndromic surveillance and most likely to spill over from wildlife into people (Fig. 4-B).

while we were frequently able to predict the true aetiology from limited dengue and malaria outbreak data, performance was less accurate for typhoid fever. Given more outbreak data (in terms of quantity, syndromes represented, and consistency of clinical feature reporting), we expect to be able to improve the sensitivity, specificity, and accuracy of the algorithm for categorising real outbreak data. As is, however, our mechanistic approach still enables strong insights into the relationships among outbreak detection, case definitions and the values of clinical features for disease identification, and local endemic context.

A central challenge to developing this approach has been the lack of standardised clinical data, especially given the high dimensionality of our model. For example, we have been unable to gather comparable data on non-communicable diseases or toxic aetiologies. Although haemorrhagic and febrile syndromes are unlikely to be misdiagnosed as non-infectious syndromes [5], the paucity of data on such syndromes may prevent such an approach from being used on, for example, syndromes presenting with jaundice or neurological symptoms.
Particularly scant data is also available for many uncommon or seemingly

we nonetheless find that they are important for both the sensitivity and specificity of syndromic surveillance approaches. Additionally, few studies describing the presentation of syndromes describe heterogeneity in their presentation. We accounted for this heterogeneity by introducing wider clinical feature distributions for those syndromes which represent pooled aetiological agents (e.g., ascribing high heterogeneity to "diarrhoeal diseases," which includes a range of pathogens, and moderate heterogeneity to most clinical features of EVD, which may be caused by several distinct filovirus species). However, more consistent and comprehensive reporting of clinical feature prevalence across strains and settings would enable stronger accounting for heterogeneity in disease presentation.

Furthermore, an advantage of our approach is its ability to exploit (rather than work against) the uncertainty inherent in syndromic data from diseases with heterogeneous presentations. Critically, this allows detection of diseases at population levels even before any individual-level diagnosis occurs (e.g., before the development of tests for rare/novel pathogens or in settings with insufficient diagnostic capacity).
outbreak (i.e., when filoviral disease is unlikely to be a common diagnostic consideration) requires even greater sensitivity and specificity to Ebola's clinical manifestations than ETC triage. Furthermore, the variation we demonstrate in the detectability of EVD clusters across West and Central Africa indicates that appropriate case definitions for routine surveillance might require tailoring to local or regional endemic context.

Although a strength of our approach is the use of endemic disease data to understand rare and poorly observed diseases, it is still limited by the availability of appropriate ecological and epidemiological data. The relatively high predicted detectability of EVD outbreaks in Ethiopia, for example, may be an artefact of underestimated incidences of dengue, malaria, and yellow fever in the country (Fig. S7). More broadly, niche maps for filoviral and other zoonotic diseases with wildlife origins rely on incompletely observed human/primate outbreak data for validation [35]. The Ebola virus spillover map used to generate Figure 4

specificity in different contexts.
Furthermore, although it seems unlikely that a healthcare worker would initiate testing for diseases as rare as EVD or MVD outside the context of an ongoing outbreak, the findings from our study do suggest a diagnostic testing algorithm of sorts. For instance, having all samples from suspected yellow fever cases that have tested negative for yellow fever automatically tested for EVD and MVD could improve detection capacity even given diagnostic constraints at the healthcare-facility level. The data limitations encountered in our study highlight the need for standardised clinical data collection, especially for signs and symptoms rarely associated with the syndromes under consideration. Beyond improving syndromic models, such standardisation (and rapid sharing of data where appropriate [61]) could enable algorithmic detection of outbreaks of misdiagnosed diseases
Transmission dynamics and control of Ebola virus disease (EVD): a review
Local, national, and regional viral haemorrhagic fever pandemic potential in Africa: a multistage analysis
Estimating undetected Ebola spillovers
Ebola response impact on public health programs
Facility-based surveillance for emerging infectious diseases; diagnostic practices in rural West African hospital settings: observations from Ghana
Undiagnosed acute viral febrile illnesses
Case definition for Ebola and Marburg haemorrhagic fevers: a complex challenge for epidemiologists and clinicians
Multiple circulating infections can mimic the early stages of viral hemorrhagic fevers and possible human exposure to filoviruses in Sierra Leone prior to the 2014 outbreak
Development of a prediction model for Ebola virus disease: A retrospective study in Nzérékoré Ebola treatment center
Preparing for emerging infections means expecting
Leveraging multiple data types to estimate the true size of the Zika epidemic in the Americas
Impacts of environmental and socio-economic factors on emergence and epidemic potential of Ebola in Africa
Habitat fragmentation, biodiversity loss and the risk of novel infectious disease emergence
Large serological survey showing cocirculation of Ebola and Marburg viruses in Gabonese bat populations, and a high seroprevalence of both viruses in Rousettus aegyptiacus
Seroreactivity against Marburg or related filoviruses in West
Ebola virus antibodies in fruit bats
Multiple Ebola virus transmission events and rapid decline of central African wildlife
Survey of Ebola viruses in frugivorous and insectivorous bats in
Wild animal mortality monitoring and human Ebola outbreaks
Sharing data for global infectious disease surveillance and outbreak detection
Real-time whole-genome sequencing for routine typing, surveillance, and outbreak detection of verotoxigenic Escherichia coli
Changing the paradigm for hospital outbreak detection by leading with genomic surveillance of nosocomial pathogens
Probabilistic reconstruction of measles transmission clusters from routinely collected surveillance data
Towards a genomics-informed, real-time, global pathogen surveillance system
The assessment of Twitter's potential for outbreak detection: Avian influenza case study
A system for automated outbreak detection of communicable diseases in Germany
Addressing social determinants of health and health inequalities
Access to effective antimicrobials: A worldwide challenge
Estimates of the global, regional, and national morbidity, mortality, and aetiologies of diarrhoea in 195 countries: a systematic analysis for the Global Burden of Disease Study
Disease and Injury Incidence and Prevalence Collaborators. Global, regional, and national incidence, prevalence, and years lived with disability for 354 diseases and injuries for 195 countries and territories, 1990-2017: a systematic analysis for the Global Burden of Disease Study