key: cord-0757680-ggk693oc
authors: Greene, Sharon K.; Peterson, Eric R.; Balan, Dominique; Jones, Lucretia; Culp, Gretchen M.; Fine, Annie D.; Kulldorff, Martin
title: Detecting COVID-19 Clusters at High Spatiotemporal Resolution, New York City, New York, USA, June–July 2020
date: 2021-05-03
journal: Emerg Infect Dis
DOI: 10.3201/eid2705.203583
sha: c26e438039ead77cd73b681adcd1a1e0cc685839
doc_id: 757680
cord_uid: ggk693oc

A surveillance system that uses census tract resolution and the SaTScan prospective space-time scan statistic detected clusters of increasing severe acute respiratory syndrome coronavirus 2 test percent positivity in New York City, NY, USA. Clusters included one in which patients attended the same social gathering and another that led to targeted testing and outreach.

Clinical and commercial laboratories are required to report all severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) molecular test results (positive, negative, indeterminate) for New York state residents to the New York State Electronic Clinical Laboratory Reporting System (9). For NYC residents, this reporting system transmits reports to the NYC Department of Health and Mental Hygiene (DOHMH). Laboratory reports include specimen collection date and patient demographics, including residential address, which we geocoded by census tract. Patient symptoms and illness onset date, if any, are obtained through interviews, although not all patients are interviewed.

To detect emerging clusters, the space-time scan statistic uses a cylinder in which the circular base covers a geographic area and the height corresponds to time (10). This cylinder is moved, or scanned, over space and time to cover different areas and periods. At each position, the number of cases inside the cylinder is compared with the expected count under the null hypothesis of no clusters by using a likelihood function, and the position with the maximum likelihood is the primary candidate for a cluster.
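As a rough illustration of the comparison described above, the following Python sketch computes the standard Poisson log-likelihood ratio for a cylinder and searches a small set of candidate cylinders whose time window ends on the most recent day. It is not SaTScan's implementation. The function names, the input layout (lists of coordinates and per-location daily counts), and the `max_window` and `max_radius` arguments are assumptions made for this example. The expected counts are taken as given, and the step that evaluates statistical significance is omitted.

```python
import math


def poisson_llr(c, e, total):
    """Poisson log-likelihood ratio for a cylinder with c observed and e expected cases.

    Assumes expected counts are scaled to sum to `total`, the number of
    observed cases overall. Only excesses (c > e) count as candidate clusters.
    """
    if e <= 0 or c <= e:
        return 0.0
    llr = c * math.log(c / e)
    if total > c:  # the outside term vanishes when every case is inside
        llr += (total - c) * math.log((total - c) / (total - e))
    return llr


def scan_prospective(coords, cases, expected, max_window, max_radius):
    """Return (best_llr, cylinder) over cylinders ending on the last day.

    coords:   list of (x, y) centroids, one per location (hypothetical layout)
    cases:    cases[j][t] = observed cases at location j on day t
    expected: expected[j][t] = expected cases at location j on day t
    """
    n_days = len(cases[0])
    total = sum(sum(row) for row in cases)
    best_llr, best_cylinder = 0.0, None
    for i, (xi, yi) in enumerate(coords):
        # Sorting locations by distance from the center yields nested circles.
        by_dist = sorted(
            (math.hypot(x - xi, y - yi), j) for j, (x, y) in enumerate(coords)
        )
        for window in range(1, max_window + 1):
            start = n_days - window  # window covers the most recent days
            c = e = 0.0
            members = []
            for dist, j in by_dist:
                if dist > max_radius:
                    break
                members.append(j)
                c += sum(cases[j][start:])
                e += sum(expected[j][start:])
                llr = poisson_llr(c, e, total)
                if llr > best_llr:
                    best_llr = llr
                    best_cylinder = {
                        "center": i,
                        "radius": dist,
                        "days": window,
                        "locations": list(members),
                        "observed": c,
                        "expected": e,
                    }
    return best_llr, best_cylinder
```

Calling `scan_prospective` with three locations and five days of counts returns the cylinder with the largest log-likelihood ratio. In a full analysis, that value would then be compared against its distribution under the null hypothesis.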
The statistical significance of this cluster is then evaluated, adjusting for the multiple testing inherent in the many cylinder positions evaluated. To quickly detect emerging hotspots, prospective analyses are conducted daily (11). To adjust for the multiple testing stemming from daily analyses, recurrence intervals are used instead of p values (12). A recurrence interval of D days means that under the null hypothesis, if we conduct the analysis repeatedly over D days, then the expected number of clusters of the same or larger magnitude is 1.

The space-time scan statistic can be used with different probability models; we used the Poisson model (10), adjusting not for population size (which would fail to account for changing testing rates) but rather for persons tested. Because the goal was to detect newly emerging hotspots rather than areas with consistently high percent positivity, we further adjusted analyses nonparametrically for purely geographic variations that were consistent over time. Fitting a log-linear function, we also adjusted for citywide temporal trends in percent positivity because the goal was to detect local hotspots rather than general citywide trends. For each day and location, the expected count was calculated as the number of persons tested × temporal trend function × a location-specific constant to ensure that, summed over all days in the study period, the location has the same number of observed and expected cases.

To prioritize quickly emerging clusters to identify epidemiologic linkages, we used a short maximum temporal window of 7 days. To detect sustained clusters to inform place-based resource allocation, starting July 15, we also ran secondary analyses with a maximum temporal window of 21 days.

We developed SAS code (SAS Institute, https://www.sas.com; https://github.com/CityOfNewYork/communicable-disease-surveillance-nycdohmh) to generate daily input and parameter files (Table; Appendix Table, https://wwwnc.cdc.gov/EID/article/27/5/20-3583-App1.pdf). The SAS code then invoked SaTScan in batch mode, read analysis results back into SAS for further processing, output files to secured folders (including patient line lists with demographics and map and time-trend visualizations), and sent an investigator notification email.

Table. Analysis parameters with rationale (some parameter names and settings are missing from the extracted text)

Rationale: Necessary to control for spatial and temporal variability in testing access. A census-based population denominator would not control for variable testing uptake because the number of persons tested is not necessarily proportional to population size.

Residents of long-term care facilities, correctional facilities, facilities housing people with developmental disabilities, or homeless shelters; persons whose home address matches selected providers or facilities; persons diagnosed in the 14 d before a more recent case residing in the same building identification number from geocoding; persons with COVID-19 illness onset (where available from patient interview) >14 d before specimen collection.
Rationale: To focus on detecting recent community-based transmission, exclude residents of congregate settings because building-level clusters are detected by using other methods (13), persons whose listed home address is not a residence, >1 case/building, and patients whose diagnosis was made long after illness onset.

Specimen collection date
Rationale: Defining reportable disease clusters according to when patients became ill is preferred, although a large proportion of COVID-19 infections are asymptomatic. Specimen collection date is the earliest date available for the study population of persons tested.

Study period
21 d for analysis to support prioritization of case investigations; since June 1, 2020, for analysis to support place-based resource allocation.
Rationale: Defining a study period >3 times the maximum temporal window helps with statistical power. Extending the study period further may decrease the accuracy of the log-linear temporal trend adjustment but might be of interest for detecting more prolonged clusters. If citywide percent positivity reaches an inflection point (e.g., begins to increase again after a period of decrease), either the study period would need to be temporarily shortened and reset after that inflection point to preserve suitability of a log-linear temporal trend adjustment, or a nonparametric temporal trend adjustment could be used. For a longer temporal window, June 1, 2020, was selected as the earliest date when citywide percent positivity trend seemed stable without an inflection point. After 63 d elapsed from June 1, 2020, switched to 63-d rolling study period until next inflection point was reached.

Lag for data accrual
Rationale: Given lags between specimen collection and report, exclude very incomplete data at end of study period when estimating the temporal trend. Three days is the minimum lag possible to preserve a timely analysis while allowing for at least some data to be reported, geocoded, and analyzed before open of business.

We launched the system on June 11, 2020, and 2 clusters detected by July 31 prompted public health action. The first cluster detected was of 6 patients residing in a 0.6-km radius, all with specimens collected on June 17 (Figure, panel A). Consequently, DOHMH staff interviewed patients to collect and compare potential common exposures, such as attending the same event or visiting the same location. On June 23, a DOHMH surveillance investigator (D.B.)
determined that 2 patients had attended the same gathering, where recommended social distancing practices had not been observed. In response, DOHMH launched an effort to limit further transmission, including testing, contact tracing, community engagement, and health education emphasizing the importance of isolation and quarantine. No other epidemiologic linkages were identified after attempts to investigate ≈65 additional clusters detected through July 2020. Second, detection of a sustained cluster on July 15 (lasting >1 week) with a high percent positivity (Figure, panel B) contributed to geographically targeted testing, outreach, and education, as part of NYC's hyper-local plan to prevent COVID-19 transmission (14).

COVID-19 community clusters detected by SaTScan prompted localized public education, testing, and community engagement (15). In addition, prioritizing interviews of patients in clusters can identify epidemiologic linkages and opportunities for interrupting further transmission, as is done for other reportable diseases (6-8). Identification of only 1 linkage in this study could be attributable to changing cluster investigation protocols, low patient response rates, or transmission occurring diffusely in small gatherings. Because all patients are referred for contact tracing, DOHMH discontinued reactively interviewing cluster patients for linkages and instead used clusters to proactively target resources.

The first limitation in this study was timeliness. Analyses were based on specimen collection date; however, given delays in testing availability and care seeking, these dates did not necessarily represent recent infections. Timeliness was further limited by delays from specimen collection to laboratory testing and reporting. Clusters dominated by asymptomatic patients or patients with illness onset >14 days before diagnosis may not require intervention because positive PCR results indicate presence of viral RNA but not necessarily viable virus.
The second limitation involved the need to geocode for spatial precision. Of unique NYC residents for whom a specimen was collected for SARS-CoV-2 RNA PCR testing during June-July 2020, residential address could not be geocoded for ≈3%, so these residents were excluded. Third, although recurrence interval thresholds can be used to prioritize responding to clusters (6), COVID-19 cluster interpretation can be more complex. Other characteristics for prioritizing COVID-19 clusters, besides statistical significance, include percent positivity, relative risk, case count, epidemic curve trajectory, radius, demographics, and persistence. Prioritization can differ by response activity (e.g., establishing new testing sites, conducting outreach) and how quickly resources can be reallocated. Deciding when and where to initiate interventions in response to COVID-19 clusters cannot be fully automated and requires epidemiologic interpretation.

In summary, our COVID-19 early detection system highlighted areas warranting a rapid response. Targeted, place-based approaches for education and outreach efforts and for localized high transmission warnings could better protect persons at high risk for severe illness and death.

1. Geospatial digital monitoring of COVID-19 cases at high spatiotemporal resolution
2. Clusters of coronavirus disease in communities
3. Daily surveillance of COVID-19 using the prospective space-time scan statistic in the United States
4. Spatial analysis of COVID-19 clusters and contextual factors in New York City
5. A space-time permutation scan statistic for disease outbreak detection
6. Daily reportable disease spatiotemporal cluster detection
7. South Bronx Legionnaires' Disease Investigation Team. A large community outbreak of Legionnaires' disease associated with a cooling tower
8. Salmonellosis outbreak detected by automated spatiotemporal analysis
9. Health advisory: reporting requirements for all laboratory results for SARS-CoV-2, including all molecular, antigen, and serological tests (including "rapid" tests) and ensuring complete reporting of patient demographics
10. Evaluating cluster alarms: a space-time scan statistic and brain cancer
11. Prospective time-periodic geographical disease surveillance using a scan statistic
12. A generalized linear mixed models approach for detecting incident clusters of disease in small areas, with an application to biological terrorism
13. Building level analyses to prospectively detect influenza outbreaks in long-term care facilities
14. Mayor de Blasio expands hyperlocal testing response in
15. How a virus surge among Orthodox Jews became a crisis for

We thank all staff members of the DOHMH Incident Command System Surveillance and Epidemiology Section for processing, cleaning, and managing input data; for conducting patient interviews and cluster investigations; and for logistical support. We also thank the NYC Test and Trace Corps for their assistance in managing the cases and contacts included in and identified by cluster investigations.

S.K.G. and E.R.P. were supported by the Public Health Emergency Preparedness Cooperative Agreement (grant NU90TP922035-01), funded by the Centers for Disease Control and Prevention (CDC). M.K. was supported by the SaTScan Enhancements Project, managed by the Fund for Public Health in New York City and funded by the CDC Foundation, CDC ELC CARES (grant NU50CK000517-01-09), Alfred P. Sloan Foundation, and Open Society Foundations.

Dr. Greene is the director of the Data Analysis Unit at the Bureau of Communicable Disease of the NYC DOHMH, Long Island City, New York. Her research interests include infectious disease epidemiology and applied surveillance methods for outbreak detection.