key: cord-0295688-xdce77zm
authors: Burkom, H.; Loschen, W.; Wojcik, R.; Holtry, R.; Punjabi, M.; Siwek, M.; Lewis, S. H.
title: ESSENCE, the Electronic Surveillance System for the Early Notification of Community-Based Epidemics
date: 2020-08-17
journal: nan
DOI: 10.1101/2020.08.14.20175398
sha: d52d7e23fe62e6301b11a25fa5211b15836def72
doc_id: 295688
cord_uid: xdce77zm

The Electronic Surveillance System for the Early Notification of Community-Based Epidemics (ESSENCE) is a secure web-based tool that enables health care practitioners to monitor health indicators of public health importance for detection and tracking of disease outbreaks, consequences of severe weather, and other events of concern. The ESSENCE concept began in an internally funded project at the Johns Hopkins University Applied Physics Laboratory (JHU/APL), advanced with funding from the State of Maryland, and broadened in 1999 as a collaboration with the Walter Reed Army Institute of Research. Versions of the system have been further developed by JHU/APL in multiple military and civilian programs for timely detection and tracking of health threats. Features of ESSENCE include spatial and temporal statistical alerting, custom querying, user-defined alert notifications, geographical mapping, remote data capture, and event communications. These features allow ESSENCE users to gather and organize the resulting wealth of information into a coherent view of population health status and communicate findings among users. The resulting broad utility, applicability and adaptability of this system led to adoption of ESSENCE by the Centers for Disease Control and Prevention (CDC), numerous state and local health departments, and the Department of Defense (DOD) both nationally and globally. With emerging high-consequence communicable diseases and other health conditions, the continued user-requirements-driven enhancements of ESSENCE demonstrate an adaptable disease surveillance capability focused on the everyday needs of public health. The challenge of a live system for widely distributed users with multiple different data sources and high throughput requirements has driven a novel, evolving architecture design.

CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted August 17, 2020. . https://doi.org/10.1101/2020.08.14. CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Inauguration impelled testing of information-sharing, distinct from data-sharing, strategies including an InfoShare tool to enable timely sharing by ANCR users with national level authorities. This effort foreshadowed multiple events or threats in which local restrictions prevented ESSENCE user sites from sharing explicit data, but derived reports, aggregates, data-free query language, or just descriptions could be legally shared. Implementation of ESSENCE features to facilitate such sharing has continued since then.

Working with the Veterans Affairs and DoD, the ESSENCE team investigated the benefits and obstacles of including elements of electronic medical records beyond the demographics and chief complaints. Working with multi-terabyte database sites prepared ESSENCE developers to deal with large datasets in every jurisdiction.

with the goal of enabling public health to find and monitor outbreaks of health events and make decisions. Users of ESSENCE access a secure web-based tool to conduct disease surveillance for the purpose of timely detection, situational awareness, and descriptive epidemiologic analysis of baseline disease patterns and outbreaks. For effective public health response, public health authorities must have the ability to identify the infected population so further spread can be contained. Leveraging the near real-time availability of an increasing number of data sources, ESSENCE analytical and alerting capabilities provide an opportunity for public health users to

The software architecture employed for ESSENCE is a three-tier web application with a presentation layer as a user frontend, a business layer for application of algorithms, and a backend for databases. This architecture runs on modular server configurations, with the number of servers contingent upon the data volume, number of active users, and frequency of required analysis operations. The most common configurations comprise three servers for smaller instances and five servers for larger ones. For systems with larger numbers of data sources and/or data volumes reaching billions of records, the architecture can support additional servers to spread the functional load of the processing. The backend databases are Microsoft SQL Server relational database management systems (RDBMS). Database functions include an ingestion database layer that facilitates extract-transform-load (ETL) operations and performs deduplication and other data cleaning operations. Depending on the user site, the ETL operations are performed by Rhapsody,(6) Mirth,(7) or locally developed scripts to populate the database, and then ESSENCE's Groovy-based data flow management system (8) controls data flow and business logic to transfer data from ingestion to detection to web databases.

A detection database layer holds data and manages cube tables for fast algorithm access and execution. Java-coded algorithms access the data and cubes for efficient signal detection on the detection database to separate algorithm processing from user query management. The web database layer expedites rapid formation and display of interactive screens for visualization and communication. Web applications encoded in Java and JavaScript utilize this database via a Tomcat web application server.(9) For display purposes, mapping and other geographic information system (GIS) operations employ the open source tool GeoServer.(10) Users can access the web application through standard web application displays or via a web service API layer for direct access to ESSENCE data and functionality.

The types of data analyzed in ESSENCE are the prerogative and responsibility of the jurisdiction, though JHU/APL provides capability for basic types. The ESSENCE system is data agnostic: the only requirement for a monitored data type is the inclusion of a date field. Data time resolution is also unrestricted. Data frequencies in ESSENCE have ranged from seconds to years, though daily data have been most common. All but a few users monitor hospital emergency department (ED) data. Most ESSENCE user jurisdictions face the burden of acquiring their data sources, gaining approval for their routine intended use, and extracting features to monitor. That level of effort

In separate studies or projects, individual user jurisdictions or their research partners have also used ESSENCE to analyze records of radiology impressions, genomic sequencing data, zoo animal health, environmental sensor outputs, sales of specific products such as thermometers, orange juice, and tissues, social media posts and searches, and even fantasy sports data. Multiple content

Procedures for managing data quality issues such as deduplication, temporary dropouts of data feeds, data field value validation, and management of free-text or pick-list entries are incorporated in ESSENCE standard business rules. By these rules, ESSENCE does not update records in place but employs a delete-and-replace approach that has proven faster. For data with the functionality enabled, a history system integrates all prior instances of each record to produce a single master record with the relevant fields for each encounter. An extensive set of reference tables and business logic allows conversion of field entries such as patient age, race, ethnicity, and vitals measurements such as temperature to categorical values from standard sources such as the Public Health Information Network Vocabulary Access and Distribution System (PHIN VADS). A "region" data field is used for general spatial aggregation of patient records and is most often employed to combine count data from collections of zip codes to approximate county-level counts when the county field is unavailable. Beyond these features and conventions, ESSENCE includes a substantial website section with guidance and analysis tools dedicated to helping users manage the quality of their data.
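The use of a "region" field to roll zip-code records up to approximate county-level counts can be illustrated with a minimal sketch. This is not ESSENCE code: the lookup table, the zip codes, the region names, the `zip` field name, and the "Unmapped" bucket are all hypothetical placeholders chosen for illustration.

```python
from collections import Counter

# Hypothetical reference table mapping zip codes to regions.
# Real deployments would load this from jurisdiction-specific reference data.
ZIP_TO_REGION = {
    "00001": "Region A",
    "00002": "Region A",
    "00003": "Region B",
}


def region_counts(records, zip_field="zip"):
    """Count records per region, using the zip code as the join key.

    Records whose zip code is missing or not in the reference table are
    counted under "Unmapped" (an assumption made for this sketch).
    """
    counts = Counter()
    for record in records:
        region = ZIP_TO_REGION.get(record.get(zip_field), "Unmapped")
        counts[region] += 1
    return counts


visits = [{"zip": "00001"}, {"zip": "00002"}, {"zip": "00003"}, {"zip": "99999"}, {}]
print(region_counts(visits))
```

Keeping unmapped records in their own bucket, rather than dropping them, makes gaps in the reference table visible as a data quality signal.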

Features

Data sources used or considered for health surveillance include medical encounter billing records, emergency service calls, nurse hotline calls, prescription and over-the-counter remedy sales, absenteeism records, and more recently, social media data such as tweets and web searches. Each data source has its own challenges for user jurisdictions to obtain sustained electronic access from data providers and any requisite government approval. When a data stream of any of these sources is acquired for routine monitoring, an immediate question is how to use the streaming data to track health outcomes of concern. An often applied procedure is to track counts of subcategories of the data expected to correspond to these outcomes. These subcategories are commonly called syndromes, generalizing the medical definition of this term denoting disease-related collections of signs and symptoms. Thus, in the surveillance context a syndrome may refer to grouped hospital visits associated with a fixed collection of symptoms, laboratory tests ordered for certain conditions, web searches containing sets of terms, billing records covering any of a class of remedies, or other subgroups depending on the data source. Syndrome formation is a critical step that may use only a fraction of all streaming data and may produce few or many groups

list of unmodifiable terms. For example, CCP puts a record with a CC of "NAUSEA" or "VOMITING" in the Gastrointestinal (GI) category. The CCP creates a ChiefComplaintsParsed field for use of classification rules after treatment of abbreviations, some misspellings, and other cleanup.(13) These classifications have enabled additional natural language processing and machine learning initiatives by both ESSENCE developers and users, and findings from these initiatives are shared among users with each emerging health threat.(14-16)
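The term-based classification described above can be sketched in a few lines. This is only an illustration of the idea, not the CCP itself: the abbreviation table and the function names are hypothetical, and the only syndrome terms used are the two GI examples given in the text.

```python
import re

# Terms taken from the example in the text; a real term list is far larger.
SYNDROME_TERMS = {"GI": {"NAUSEA", "VOMITING"}}

# Hypothetical abbreviation table, for illustration only.
ABBREVIATIONS = {"N/V": "NAUSEA VOMITING"}


def parse_chief_complaint(raw):
    """Normalize a free-text chief complaint: uppercase, expand known
    abbreviations, and keep alphabetic tokens only."""
    tokens = []
    for token in raw.upper().split():
        token = token.strip(".,;:")
        expanded = ABBREVIATIONS.get(token, token)
        tokens.extend(re.findall(r"[A-Z]+", expanded))
    return " ".join(tokens)


def classify(raw):
    """Return the set of syndrome categories whose terms appear in the
    parsed chief complaint."""
    words = set(parse_chief_complaint(raw).split())
    return {name for name, terms in SYNDROME_TERMS.items() if terms & words}


print(classify("n/v x2 days"))
print(classify("ankle pain"))
```

Separating the parsing step from the classification step mirrors the role the text gives to the ChiefComplaintsParsed field: rules are applied to cleaned text rather than to raw entries.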

As done with diagnosis code-based processing, syndrome groups are tabulated, plotted and monitored each day with statistical alerting algorithms for early potential outbreak indications.

Individual alerting algorithms implemented in ESSENCE are listed and described in the Supporting Materials file "S1 Technical_Details_of_ESSENCE_Alerting_Algorithms.docx".

The following principles were derived with users to guide method selection and to clarify interpretation of results:

General considerations:

• These methods are not intended to positively identify outbreaks without supporting evidence. Their purpose is to direct the attention of a limited monitoring staff with increasingly complex data streams to data features that merit further investigation. They have also been useful for corroboration of clinical suspicions, rumor control, tracking of known or suspected outbreaks, monitoring of special events and health effects of severe weather, and other locally important aspects of situational awareness. Successful users value these methods more for the latter purposes and do not base public health responses solely on algorithm alerts.

• These algorithms are one-sided tests that monitor only for unusually high counts, not low ones. Low counts could result from a critical outbreak situation that prevents data reporting, but there are many more common reasons for low counts (such as unscheduled closings or system problems), so the algorithms do not test for abnormally low counts.

• In addition to data- and disease-specific considerations below, algorithm selection was also driven by system considerations. Users need to monitor many types of data rapidly. External covariates such as climate data or clinic schedules may not be available for prompt analysis. Many methods in the literature, armed with substantial retrospective data of a certain type, depend on analysis of substantial history. Day-to-day users, often with only a small fraction of time available for monitoring, will not wait several minutes for each query. In the absence of data history and data-specific analysis time for each stream, ESSENCE methods have been adapted from the literature and engineered to system requirements.

• If the time series monitored by algorithms represent many combinations of clinical groupings, age groups, and geographic regions, excessive alerting may occur simply because of the number of tests applied. The Summary Alert method was implemented to limit such excessive alerting. This method is based on control of the false discovery rate, i.e. the expected ratio of false alerts to the total alert count, and its statistical implementation in ESSENCE is detailed in the Summary Alert section below. Aside from analytic methods to control alerting, default alert lists should be limited to results from those time series of concern to the user, either by system design or by active specification by the user. For example, one method of reducing the default alert list is to restrict algorithms to all-age time series groupings. Depending on the scope of the user's responsibility, the alert list may also be restricted according to both epidemiological interest and the resources available for investigation. For example, a monitor of a national-level system with algorithms applied to many facilities may be interested only in alerts with at least 5-10 cases. In circumstances of heightened concern, these restrictions can be relaxed, or the user can use ESSENCE advanced querying methods to apply algorithms to age groups and/or
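To make the idea of false discovery rate control concrete, the sketch below applies the widely used Benjamini-Hochberg step-up procedure to a list of p-values, one per monitored time series. This is offered only as a generic illustration: the text does not state that the Summary Alert method uses this exact procedure, and the function name and the target rate `q` are assumptions.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking which tests remain alerts after
    Benjamini-Hochberg control of the false discovery rate at level q."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank whose p-value is
        # at or below its rank-scaled threshold.
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank
    kept = set(order[:cutoff_rank])
    return [i in kept for i in range(m)]


p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p, q=0.05))
```

With ten series tested at a nominal 5% level, five of these p-values would alert individually, whereas the rank-scaled thresholds retain only the two strongest signals.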

• The Time Series View provides a graphical display of the temporal behavior of the data with the ability to stratify by specific parameters, view aggregated counts, and infuse data quality factors to improve understanding of data features.

Store. Each geography system defines a system for geographically filtering your data. "Region" is a generic term for the default geographic unit used to view data. Regions normally map to a set of zip codes that closely approximates a county or health district.


The creating user gives each dashboard a text description and a note that is modifiable by creator or shared user. Once created, each dashboard appears as a separate tab on the myESSENCE webpage.

By default, each dashboard applies to data from the geographic regions selected for each graph when added to the dashboard. The creator or sharing users may change the region, and ESSENCE will change all views on the dashboard to reflect data from the new region. Users may revert to

Applications for mass gathering surveillance

Scheduled mass gathering events such as political conventions and major athletic competitions concern population health monitors because a) such events are bioterrorism opportunities to affect many victims and gain media attention, b) infections through contaminated food or water could spread rapidly through the expanded population, c) those visiting for several days could import infections or take them back to their own cities, and d) a surge of patients could overwhelm local care provider resources. Adequate preparedness and response require coordination across jurisdiction boundaries, but privacy laws often restrict patient-level data-sharing.

Applications for injury and substance abuse surveillance

An unexpected but arguably the most helpful benefit of ESSENCE to health department users has been to facilitate communication and collaboration among agency divisions. An important example in the context of the ongoing opioid overdose crisis has been the strengthening of connections between syndromic surveillance specialists and groups specializing in injury prevention, behavioral health, and drug abuse.

customization with query-building features using both diagnosis codes and free-text has grown along with the sophistication and broadening needs of health department users. Pre-computed, canned analysis products are not found in ESSENCE. However, versatility presents challenges to database design and to the selection and adaptation of statistical analysis tools. Surveillance data evolve with institutional information systems and formats, coding practices, and epidemiological concerns. Users typically cannot wait several minutes for data retrieval and time-consuming model runs. Alerting algorithms applied prospectively to detect disparate events in a wide variety of data types cannot match the detection performance of models developed retrospectively using historical datasets labelled with target events for a particular syndrome. Algorithm baselines in ESSENCE do not reach back for years, not only for storage and computational reasons, but also because for many users' desired data types, stable data or any data are available only within the past year. Hence, ESSENCE alerting algorithms, adapted from published applications of models and control charts in healthcare settings, (23, 67, 68) use rolling baselines of weeks rather than years.

Recently added visualizations and cohort clustering analytic tools for longitudinal assessment allow users to determine categories of patients who use healthcare systems that provide data to

A major data challenge is to integrate increasingly diverse and granular data sources while preserving the essential features of ad hoc user syndrome/case definition and custom real-time analysis and visualization. Public health staff need to extend the power of their systems to all-hazards public health threats and to One Health issues including zoonotic diseases across species, antimicrobial stewardship, and food safety. Users of ESSENCE have long imported non-clinical and non-syndromic data sources such as pharmacy sales and school absentee rates along with traditional medical encounter records.

In expansion to address all-hazards threats, data complexity, categorization, and linkage challenges will multiply as genomic, environmental (including remote sensing) and social media data sources are added.

Significant advances in disease surveillance will also require meeting key analytics challenges. Beyond the monitoring of individual sources and syndromes with data dashboards, combining disparate data types requires statistical and machine learning tools for appropriate weighting and corroboration of evidence from disparate data types. Analytic fusion efforts employing Bayesian networks and other machine learning tools have been applied in both military and civilian ESSENCE systems (72-75). More efficient and explainable fusion methods will be needed to enable operationalizable forecasting and prediction for greater decision-making power and effective planning and response. Efficient analytic methods will also be needed to determine optimal feature extraction and resultant surveillance value of social media and other nontraditional sources.

Lastly, best practices for biosurveillance systems face both human and electronic communication challenges, including interoperability with other electronic systems such as the above-mentioned ASPR DMAT, NPDS, and NEMSIS systems. Despite the preeminence of ESSENCE, as indicated by its adoption as the analytic engine of the CDC BioSense platform and the widespread application discussed in this paper, the limitations described above and the data gaps exposed by the COVID-19 pandemic must be addressed as new threats emerge and data sources proliferate. These

Benefits: The main benefit is avoiding alerting bias resulting from expected data trends. The length for the training baseline is critical. Based on performance comparisons among multiple baseline lengths, it was chosen to be short and recent enough to capture seasonal time series behavior but long enough to smooth out daily fluctuations. Separate multipliers are updated so that a data source with regular but unusual patterns such as high weekend counts will be modeled correctly. While a better fit may often be obtained with a more complex model for a given data stream with a certain syndromic filter for a certain subregion and analysis of sufficient data history, the current regression approach is relatively robust across time series employed in ESSENCE.

Limitations: If this algorithm is applied to a data series without the baseline weekly and seasonal behavior, the model will not explain the data well, and the detection sensitivity and specificity will be decreased. The automated switch in the default method is applied for this reason. There is no claim of optimal modeling for a given time series. This general-use implementation does not assume the availability of a denominator variable such as the total visit count that can be used to adjust the counts to emulate series of rates rather than counts. This adjustment has been

The copyright holder for this preprint this version posted August 17, 2020. . https://doi.org/10.1101/2020.08.14.20175398 doi: medRxiv preprint is subtracted, with a 2-day buffer period to separate the baseline from the counts being tested. 795

The rationale for the baseline length was the same as described above for the regression method 796 above. The test statistic is then (Sk -μk) / σk, where μk and σk are baseline mean and standard 797 deviation. As in the regression method, the hypothesis applied to determine alerting is a 798 Student's t distribution at significance levels of 1% for red alerts and 5% for yellow alerts. The 799 number of degrees of freedom assumed for this distribution is the baseline length + 1. This 800 EWMA implementation is designed for any series that does not fit the characteristic trends, so 801 a couple of safeguards are included. A "zero-filtration" algorithm is implemented for rapid 802 adjustment to and recovery from data dropouts and catch-ups. When counts are sparse but not 803 uniformly zero, a Poisson-based adjustment is added to the standard deviation scale factor to 804 avoid excessive alerts. Purposes: Many researchers and developers have applied complex statistical models to 816 surveillance data for prediction and detection. However, the predictive capability of a model 817 varies according to the specific data stream and how it is filtered and aggregated. This 818 capability may also be affected by data behavior changes that result from seasonal variations, 819 population shifts, and changes in the informatics. To account for such day-to-day changes, 820 ESSENCE automatically monitors its predictive capability of its regression model each day. 821

When this test fails, indicating that the model is not helpful for explaining the data, the system switches to the EWMA adaptation described above. The result is that the regression model is usually applied for the common respiratory and gastrointestinal syndrome classifications applied to county-level data, but EWMA is more commonly applied to rare syndrome data. For situations where less than a week of recent baseline data exists, a simple Poisson detector is applied. Such situations include new start-ups and more common restarts after long (several-week) intervals of missing data.

Technical Details: Details for the separate regression and EWMA methods are given in the preceding pages. The adjusted R² coefficient for the regression is tested each day. This coefficient does not give the quality of regression but is employed here specifically as a measure of daily predictive capability using an empirically derived threshold criterion. When the data pass this test, the model is assumed to have explanatory value, and the regression algorithm is applied. When the data fail this test, the EWMA algorithm is used. The Poisson distribution test is applied when less than a week (3-6 days) of recent data is available. A Poisson distribution is assumed with mean and variance equal to the mean of the recent counts. An alert is issued if the current count exceeds this mean and if the probability that the current count was drawn from this distribution is less than 1% (red alert) or 5% (yellow alert). Practical safeguards for the composite method are as described in the regression and EWMA sections.
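The selection logic and the Poisson fallback described in these technical details can be sketched in Python as follows. This is a minimal illustration, not the ESSENCE code: the function names and the `r2_threshold` parameter are hypothetical, the empirically derived threshold value is not given in the text, and the alert probability is assumed here to be the upper-tail probability P(X ≥ current count).

```python
import math

def choose_method(n_recent_days, adjusted_r2, r2_threshold):
    """Pick the detector for today's series (illustrative names only)."""
    if n_recent_days < 7:
        return "poisson"          # less than a week of recent baseline data
    if adjusted_r2 >= r2_threshold:
        return "regression"       # assumed: passing means at or above threshold
    return "ewma"                 # goodness-of-fit test failed

def poisson_upper_tail(count, mean):
    """P(X >= count) for X ~ Poisson(mean), by summing the lower tail."""
    term = math.exp(-mean)        # P(X = 0)
    cdf = 0.0
    for k in range(count):        # accumulate P(X <= count - 1)
        cdf += term
        term *= mean / (k + 1)
    return max(0.0, 1.0 - cdf)

def poisson_alert(current_count, recent_counts):
    """Return 'red', 'yellow', or None using the 1% and 5% levels."""
    mean = sum(recent_counts) / len(recent_counts)
    if current_count <= mean:
        return None               # alerts require an increase above the mean
    p = poisson_upper_tail(current_count, mean)
    if p < 0.01:
        return "red"
    if p < 0.05:
        return "yellow"
    return None
```

With five recent counts averaging 2 visits per day, for example, `poisson_alert(6, [1, 2, 3, 2, 2])` falls in the yellow band under these assumptions, while a count of 5 does not alert.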

Benefits: This algorithm is the default because it is designed to avoid mismatching the method to the data. The regression model accounts for the expected data trends when they are seen in the baseline. When they are absent because of the case definition used to filter the data, because of the size of the monitored region, or because of data problems, alerting is based on the EWMA algorithm.

Limitations: The goodness-of-fit test occasionally misclassifies the data. The test is set to err toward the more conservative EWMA to avoid misfitting the data model.

included in the ESSENCE suite because of their wide application. While they lack many of the features described above, their simplicity has both benefits and limitations.

Technical Details: The C1 algorithm subtracts the mean of a moving baseline ending the previous day from the daily count. In effect, it then divides this difference by the standard deviation of counts in that baseline. If the result exceeds 3, indicating an increase above the mean of more than 3 standard deviations, an alert is issued. The C2 algorithm does the same calculation but imposes a 2-day buffer between the test day and the baseline. The C3 algorithm is a more sensitive version of C2 that adds the values from the 2 previous days if they do not exceed the threshold. All three algorithms use the same criterion of an increase of at least 3 baseline standard deviations above the sliding baseline mean. An important implementation detail is that ESSENCE does not use the standard 7-day baseline because substantial experience has shown that for many time series, such a short baseline gives an unstable statistic that can lead to a loss of confidence in the results. The implemented baseline is 28 days as in the EWMA and regression methods.(21) There are no other changes to the standard EARS methods, including retention of the flat 3-standard-deviation threshold regardless of the data stream.
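The C1 and C2 calculations described above can be sketched as a single function with an optional buffer. This is an illustrative sketch rather than the ESSENCE or EARS source: the function names are hypothetical, the sample standard deviation is an assumed choice, and the handling of short histories and zero-variance baselines (returning `None`) is a simplification made only for this example. C3 is omitted because the text does not give its full update rule.

```python
import statistics

BASELINE_DAYS = 28   # ESSENCE baseline length stated in the text
THRESHOLD = 3.0      # flat 3-standard-deviation alerting threshold

def ears_statistic(counts, day, buffer_days=0):
    """Z-score of counts[day] against a sliding baseline.

    buffer_days=0 gives C1 (baseline ends the previous day);
    buffer_days=2 gives C2 (2-day gap before the test day).
    Returns None when there is too little history or the baseline
    has zero variance (handling chosen for this sketch only).
    """
    end = day - buffer_days            # baseline covers counts[start:end]
    start = end - BASELINE_DAYS
    if start < 0:
        return None
    baseline = counts[start:end]
    sd = statistics.stdev(baseline)    # sample standard deviation
    if sd == 0:
        return None
    return (counts[day] - statistics.mean(baseline)) / sd

def is_alert(stat):
    """Apply the 3-standard-deviation criterion to a computed statistic."""
    return stat is not None and stat > THRESHOLD

# Example: 30 days alternating between 10 and 12 visits, then a jump to 30.
counts = [10, 12] * 15 + [30]
c1 = ears_statistic(counts, 30)
c2 = ears_statistic(counts, 30, buffer_days=2)
```

Both statistics exceed the threshold for this synthetic series; in real data the buffer in C2 keeps the first days of a growing outbreak out of the baseline.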

Benefits: The methods are easy to understand and widely known.(25-27)

Limitations: Like the EWMA, the methods take no account of systematic data behavior such as day-of-week effects or seasonal trends. C3 is the only one of these methods with sensitivity to gradual outbreak effects, but it is known to produce high alarm rates. For all three methods, threshold data values for alerting may fluctuate noticeably from day to day.

Summary Alerts--adjustment for multiple testing

Multiple testing can lead to uncontrolled alert rates as the number of data streams increases.

For example, suppose that a hypothesis test is conducted on a time series of daily diagnoses of influenza-like illness. In a one-sided test, this test results in a statistic whose value in some distribution yields a probability p that the current count is at least as large as observed. For a desired Type I error probability of α, the probability is then (1 - α) that an alert will not occur in the distribution assumed for background data. Thus, for the parallel monitoring problem of interest here, if such tests are applied to N independent data streams, the probability that no background alerts occur is (1 - α)^N, which decreases quickly for practical error rates α. For a single-test error rate of α = 0.05, for example, the probability of at least one background alert exceeds 0.5 if more than 13 independent tests are applied.
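The arithmetic behind this example can be checked with a few lines of Python. The function name is illustrative, and the calculation assumes fully independent tests, as in the text.

```python
def prob_any_false_alert(alpha, n_tests):
    """Probability of at least one background alert among n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

# Tabulate the growth of the overall false alert probability at alpha = 0.05.
for n in (1, 13, 14, 50):
    print(n, round(prob_any_false_alert(0.05, n), 3))
```

At α = 0.05 the probability is just under 0.5 for 13 tests and just over 0.5 for 14, which is the crossover cited above.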

Technical Details: For N tests, where N is the number of combinations of region, syndrome, age group, and any other covariates affecting the number of tests, let P(1), …, P(N) be the p-values sorted in ascending order, an ordering that puts the smallest and most significant p-value first. The Summary Alert method applies the Simes-Seeger-Eklund criterion to reject the combined null hypothesis of no anomaly for any series.(28) The null hypothesis is rejected if for some j*, j* = 1, …, N, P(j*) < j*α/N. To interpret this condition, note that for the most significant p-value, an alert requires that P(1) < α/N, the strict Bonferroni bound. If α = 0.01 and N = 50, then the condition becomes P(1) < 0.0002. For the least significant p-value, the condition is simply P(N) < α, highly unlikely for the weakest result.

If this condition is satisfied for any j*, then test results are considered alerts for all j < j*.(29)
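A minimal Python sketch of this step-up rule follows. It is not the ESSENCE implementation: the function name is hypothetical, and the sketch flags every rank up to and including the largest j* that meets the condition, which is the usual Benjamini-Hochberg convention and an assumption about how the rule is applied in practice.

```python
def summary_alerts(p_values, alpha):
    """Return indices of series flagged by the step-up rule P(j) < j*alpha/N.

    Finds the largest rank j* meeting the condition and flags ranks 1..j*
    (including j* itself is an assumption of this sketch).
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # ascending p-values
    j_star = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] < rank * alpha / n:
            j_star = rank
    return sorted(order[:j_star])

# Five series tested at alpha = 0.05: the sorted thresholds are 0.01, 0.02, ...
flagged = summary_alerts([0.0001, 0.5, 0.003, 0.04, 0.9], alpha=0.05)
```

Here the two smallest p-values pass their rank-specific thresholds, so the first and third series are flagged, while 0.04 fails its threshold of 0.03 even though it is below the nominal 0.05.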

The Summary Alert is implemented at two levels, FDR and FDR-Major. For the FDR level applied to N time series, the implementation is as above. For a more liberal option appropriate for certain syndromes or scenarios, FDR-Major applies the condition to two sets of N/2 time series.

Benefits: In defining the false discovery rate as the expected ratio of false alerts to the total alert count, Benjamini and Hochberg showed that the Simes-Seeger-Eklund criterion gives an overall error rate of α if the N time series tested are statistically independent.(30) Overall, this criterion avoids the excess alerting resulting from using the nominal threshold α for all data streams and also avoids the loss of sensitivity from using only the Bonferroni bound α/N.

Limitations: If one of the p-values crosses the adjusted threshold, it is not obvious for epidemiological or other reasons which tests to consider anomalous. Most users have followed the natural procedure described by Simes to consider all p-values less than P(j*) as individual alerts. Another limitation is that in general the time series are not statistically independent.

For situations where dependence is known, Hommel recommended the condition P(j) < jα/(CN), where C = Σ 1/j, summed over j = 1, …, N. In ESSENCE applications where many groups of time series may be requested and dependence can change, the above condition with C = 1 is applied.

Technical Details: The null hypothesis is that the set of data subregions (often zip codes) in the recent time interval tested forms a random sample from an expected spatial distribution of cases. The expected distribution is not uniform over subregions but follows a "customary" spatial case spread that reflects urban/suburban case ratios or other factors. The ESSENCE implementation calculates the expected spatial distribution using recent case counts from a sliding baseline interval. In effect, the code is similar to a common application of SaTScan, the space-time permutation scan statistic, restricted to test cases from only the most recent time interval and assuming circular clusters.(32)

As in SaTScan, the method calculates a test statistic for each candidate cluster. The test statistic in the ESSENCE implementation is Kulldorff's Poisson log likelihood ratio. The set of candidate clusters is generated by scanning over a set of cluster center locations, often taken as centroids of all zip codes in the dataset, and considering all circles within a maximum radius of each center, where the number of circles is limited by the number of data subregions within each radius. The maximum test statistic over these candidates is then tested for significance. Statistical significance inference does not depend on a theoretical distribution but on repeated trials on simulated datasets randomly drawn using the baseline distribution. For each such trial, the algorithm uses the same scanning procedure to derive a trial maximum.
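For readers unfamiliar with the statistic, the following Python sketch shows the standard form of Kulldorff's Poisson log likelihood ratio for a single candidate cluster and the maximization over candidates. The function names and the representation of candidates as (observed, expected) pairs are assumptions made for illustration; generating the circles and the expected counts from the sliding baseline is not shown.

```python
import math

def poisson_llr(observed, expected, total):
    """Kulldorff's Poisson log likelihood ratio for one candidate cluster.

    observed: cases inside the cluster in the test interval
    expected: cases expected inside under the baseline spatial distribution
    total:    all cases in the test interval (0 < expected < total assumed)
    """
    if observed <= expected:
        return 0.0                      # only excess counts are of interest
    inside = observed * math.log(observed / expected)
    remaining = total - observed
    outside = 0.0
    if remaining > 0:
        outside = remaining * math.log(remaining / (total - expected))
    return inside + outside

def max_statistic(candidates, total):
    """Largest LLR over (observed, expected) pairs for the candidate circles."""
    return max(poisson_llr(o, e, total) for o, e in candidates)
```

The same `max_statistic` call would be applied to each simulated dataset to obtain the trial maxima used for significance testing.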

For assessing the significance of the maximum test statistic over all observed clusters, the ESSENCE code uses the Gumbel distribution method.(33) The code collects 99 trial maxima, fits a Gumbel distribution to these values, and uses the fitted distribution to assign a p-value to the test statistics of clusters found in the original data. The observed cluster with the maximum test statistic is considered significant if its p-value is below a predetermined threshold, often set to 0.01. This threshold criterion can yield multiple significant clusters in a given run if more than one candidate cluster yields a test statistic whose p-value is below the threshold.

For each significant case cluster, the system shows the location, extent, and degree of significance using the GIS software.
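The Gumbel-based p-value assignment can be sketched as follows. The text does not state how the Gumbel distribution is fitted, so the method-of-moments fit below is an assumption, and the function names are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329

def fit_gumbel(trial_maxima):
    """Method-of-moments Gumbel fit: returns (location mu, scale beta)."""
    beta = statistics.stdev(trial_maxima) * math.sqrt(6.0) / math.pi
    mu = statistics.mean(trial_maxima) - EULER_GAMMA * beta
    return mu, beta

def gumbel_p_value(statistic, mu, beta):
    """Upper-tail probability 1 - exp(-exp(-(x - mu) / beta))."""
    z = (statistic - mu) / beta
    return -math.expm1(-math.exp(-z))   # stable form of 1 - exp(-exp(-z))

# Example with 99 synthetic trial maxima (placeholder values, not real data).
maxima = [1.0 + 0.05 * i for i in range(99)]
mu, beta = fit_gumbel(maxima)
```

A fitted distribution lets the code assign p-values well below 1/100 from only 99 trials, which a simple rank among the trial maxima cannot do.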

Benefits: The ESSENCE Java implementation inherits features that have popularized SaTScan. Potential clusters of interest are localized without bias regarding the center or extent of the cluster as well as the spatial resolution of the data allows. As noted in Kulldorff, Heffernan, et al., the empirical significance testing with many repeated trials takes "into account the multiple testing stemming from the many potential cluster locations and sizes evaluated."(32)

Limitations: The most important limitation, applicable also to SaTScan and to all other spatial or space-time cluster detection methods, is that the usefulness of the method strongly depends on the reliability of the expected spatial distribution. Census-based distributions, insurance eligibility lists, regression models, and other means have been used to derive the expected distribution. The method implemented in ESSENCE infers this distribution from recent data separated from the test date(s) by a 2-day buffer.

Evaluation of statistically significant clusters for epidemiological significance is a nontrivial task which may be exacerbated if the number of significant clusters is misleading or excessive because the expected distribution is unrepresentative or because investigation resources are insufficient.

Prospective use of this popular approach has been criticized despite numerous applications and published real-life successes,(34) and the ESSENCE implementation lacks the prospective adjustment in SaTScan attempting to manage cluster rates for multiple successive runs. The ESSENCE implementation also does not support elliptical cluster shapes, simultaneous clustering of multiple data sources, or test statistics other than the Poisson log likelihood ratio. The user with a sufficiently detailed dataset and an application that requires these extended SaTScan features should be aware of these limitations.

population behavioral health surveillance by using automated diagnostic and pharmacy data systems. MMWR Suppl. 2004;53:166-72.

Disease outbreak detection system using syndromic data in the greater Washington DC area

A systems overview of the Electronic Surveillance System for the Early Notification of Community-Based Epidemics (ESSENCE II)

Bio-ALIRT biosurveillance detection algorithm evaluation

BioSense Platform Quick Start Guide to Using ESSENCE

Impact of the NSSP's transition to ESSENCE on chief complaint field-based syndromes

Overview of Rational Rhapsody Poughkeepsie

Knowledge Center

HealthCare Integrations

Apache Software Foundation. The Apache Groovy programming language Wakefield

Apache Software Foundation

Open Source Geospatial Foundation

Office of the Assistant Secretary for Preparedness and Response

Department of Health and Human Services

American Association of Poison Control Centers

Automated syndromic classification of chief complaint records

Surveillance and Retrospective Identification in ESSENCE-FL

Syndromic Surveillance Identifies Unreported Cases of Zika Virus Disease

Effectiveness of Using a Chief Complaint and Discharge Diagnosis Query in ESSENCE-FL to Identify Possible Tuberculosis Patients and

Online J Public Health Inform

Method selection and adaptation for distributed monitoring of infectious diseases for syndromic surveillance

Modeling emergency department visit patterns for infectious disease complaints: results and application to disease surveillance

Developments in the Roles, Features, and Evaluation of Alerting Algorithms for Disease Outbreak Monitoring

Practical comparison of aberration detection algorithms for biosurveillance systems

Enhancing time-series detection algorithms for automated biosurveillance

Statistical methods for quality improvement

The application of statistical process control charts to the detection and monitoring of hospital-acquired infections

The bioterrorism preparedness and response Early Aberration Reporting System (EARS)

Initial evaluation of the early aberration reporting system--Florida. MMWR Morb Mortal Wkly Rep. 54 Suppl. United States 2005

From Implementation to Automation ---A Step Approach to Developing Syndromic Surveillance Systems from a Public Health Perspective 2020

Evaluation of the Performance of a Dengue Outbreak Detection Tool for China

Simes RJ. An Improved Bonferroni Procedure for Multiple Tests of Significance

A stagewise rejective multiple test procedure based on a modified

Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing

A spatial scan statistic

A space-time permutation scan statistic for disease outbreak detection

Gumbel based p-value approximations for spatial scan statistics

A critical look at prospective surveillance using a scan statistic

Clinical recognition and management of patients exposed to biological warfare agents

syndromic surveillance for influenzalike illness by International Classification of Diseases

The Ratio of Emergency Department Visits for ILI to Seroprevalence of 2009 Pandemic Influenza A (H1N1) Virus Infection

Effective detection of the 2009 H1N1 influenza pandemic in U.S. Veterans Affairs medical centers using a national electronic biosurveillance system

Standardization Effort for the Distribute Emergency Department Surveillance Project

Proceedings of the Joint Statistical Meetings

Recognizing Recreational Water Exposure and Habituating HAB Surveillance in ESSENCE. 10

Using an Emergency Department Syndromic Surveillance System to Evaluate Reporting of Potential Rabies Exposures

Arizona Monitors Transfer of Patients with RMSF from Tribal Lands to Facilities in Maricopa County

Innovative uses for syndromic surveillance

NSSP Knowledge Repository: Syndrome Definition Subcommittee Calls

Tracking Health Effects of Wildfires: The Oregon ESSENCE Wildfire Pilot Project. 9

Monitoring Out-of-State Patients during a 2017 Hurricane Response using ESSENCE

Usefulness of syndromic data sources for investigating morbidity resulting from a severe weather event

Monoxide Poisoning in Miami-Dade County Following Hurricane Irma in 2017

all of whom spent significant time in the development of OPHD's Rhapsody engine and informatics infrastructure

Center Data into Oregon ESSENCE using a Low-Cost Solution. 9

Operational Experience: Integration of ASPR Data into ESSENCE-FL during the RNC

Health Surveillance for Mass Gatherings

Surveillance: New ESSENCE Report and Collaboration Win Gold in OR

Surveillance during the 58th Presidential Inauguration-District of Columbia

Using SAGES OpenESSENCE for Mass Gathering Events

Using ESSENCE to Meet Local Needs for Mental Health Data: Query & Results

Assessment of the use of ED Chief Complaint Data for monitoring Chronic Diseases. 10

Day of Week Analysis of Myocardial Infarctions Using ESSENCE-FL Emergency Department Data

Definitions for ED Visits Related to Falls in Icy Weather

Torgerson A. Using ESSENCE to Detect Bomb-Making Activity: What's Appropriate? 10

Syndromic Surveillance of Emergency Department Visits for Acute Adverse Effects of Marijuana, Tri-County Health Department

Analysis of ED and UCC Visits Related to Synthetic Marijuana in ESSENCE-FL

Using Syndromic Surveillance to Rapidly Describe the Early Epidemiology of Flakka Use in Florida

Local Public Health Surveillance of Heroin-

Using probabilistic matching to improve opioid drug overdose surveillance, New Jersey

Data for Opioid Overdose Surveillance in Utah

Statistical methods for the prospective detection of infectious disease outbreaks: a review

Statistical process control as a tool for research and healthcare improvement. Qual Saf Health Care

Surveillance UCDoHIa. National Syndromic Surveillance Program Update

Evolution of Public Health Surveillance: Status and Recommendations

Continuously rethinking the definition of influenza for surveillance systems: a Dependent Bayesian Expert System

Making

An integrated approach for fusion of environmental and human health data for disease surveillance

Validation of Analytic Methods for Combining Evidence Sources in Biosurveillance

• The algorithm forms 2x2 contingency tables whose entries are:

  A = number of recent chief complaints with the term,
  B = number of recent chief complaints without the term,
  C = number of baseline chief complaints with the term,
  D = number of baseline chief complaints without the term

• For many such tables with small cell counts, Fisher's Exact Test is applied to alert when the probability that the count of the term of interest is ≥ A, given the contingency table's marginal totals, is below a critical p-value.
• If the smaller of B and C is larger than 1000, then a chi-square test is applied to the same contingency table with negligible loss of accuracy. The critical threshold is then applied to half the resultant (two-sided) p-value to determine anomalous terms.
• Results are shown only for candidate terms that have not been previously eliminated because they are stopwords with no informational content (such as "the", "if", "all") or because users have previously added them to a list of terms to be ignored (such as "patient", "complaint", "test").
• Conventions adopted from empirical test results and also to avoid impacting ESSENCE processing are: The test interval is 24 hours, the baseline interval is 30 days, a buffer of 7 days is implemented between test interval and baseline, and the threshold p-value is set at p* = 10^-5 (0.00001).

Benefits: With the above p-value threshold and settings, this method detects as few as three instances of unusual terms (place names, rare signs/symptoms) and unusual concentrations of interesting common terms while averaging from 0-4 anomalous terms per day over small and large hospitals. Inspecting chief complaints containing each of a small number of terms each day and disqualifying some terms from further consideration is a manageable task that can uncover clusters of visits that could be missed by syndromic methods. In testing with historic data, chief complaints containing anomalous terms found with the strict p-value threshold adopted have included small clusters of visits resulting from documented events of food poisoning and heat-related illness. This analysis has also found new abbreviations used by hospitals in their chief complaints. These new abbreviations can then be added to the Chief Complaint Processor to improve syndrome and subsyndrome categorization.
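The one-sided Fisher's Exact Test on the A, B, C, D table described in the technical details can be sketched in Python with exact integer arithmetic. The function names are hypothetical, the chi-square switch for large counts is not shown, and the default `p_star` simply reuses the threshold stated in the text.

```python
import math

def fisher_upper_tail(a, b, c, d):
    """P(X >= a) under the hypergeometric null with the table's margins fixed.

    a, b: recent chief complaints with / without the term
    c, d: baseline chief complaints with / without the term
    """
    recent = a + b                 # row total for the test interval
    with_term = a + c              # column total for the term
    n = a + b + c + d
    denom = math.comb(n, recent)
    numer = sum(
        math.comb(with_term, x) * math.comb(n - with_term, recent - x)
        for x in range(a, min(recent, with_term) + 1)
    )
    return numer / denom

def is_anomalous(a, b, c, d, p_star=1e-5):
    """Flag the term when the upper-tail probability is below p_star."""
    return fisher_upper_tail(a, b, c, d) < p_star
```

With these made-up counts, a term seen 5 times among 100 recent complaints and never among 2,900 baseline complaints is flagged, whereas 3 occurrences in the same setting are not; actual sensitivity depends on the sizes of the test and baseline sets.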

Limitations: While the simplicity of this method avoids impact on daily ESSENCE processing or user investigation of large collections of distributed data streams, the method has no means to interpret anomalous terms, identify phrases with multiple terms, or distinguish topics of concern in chief complaints that may not share specific terms. The only preprocessing of free-text terms is the application of ESSENCE Chief Complaint Processor spell checking and abbreviation/acronym expansion. Interpretation of the anomalies requires a human-in-the-loop, both to evaluate anomalous terms for investigation and to rule out specific terms from future anomalies. The method is subject to effects of changing terminology (street drug names, triage vocabulary and abbreviations), and the user should be aware of current perceived health threats and corresponding emergency/urgent care language.