key: cord-0316063-y0u87mu4
authors: Russell, A.; O'Connor, C.; Lasek-Nesselquist, E.; Plitnick, J.; Kelly, J. P.; Lamson, D. M.; St George, K.
title: Spatiotemporal analyses illuminate the competitive advantage of a SARS-CoV-2 variant of concern over a variant of interest
date: 2021-09-20
journal: nan
DOI: 10.1101/2021.09.14.21262977
sha: 62c5d8d68612f25a3c2ab727d54cff7b4d2043e9
doc_id: 316063
cord_uid: y0u87mu4

The emergence of novel SARS-CoV-2 variants in late 2020 and early 2021 raised alarm worldwide and prompted reassessment of the management, surveillance, and projected future of COVID-19. Mutations that confer competitive advantages by increasing transmissibility or immune evasion have been associated with the localized dominance of single variants. Thus, elucidating the evolutionary and epidemiological dynamics among novel variants is essential for understanding the trajectory of the COVID-19 pandemic. Here we show the interplay between B.1.1.7 (Alpha) and B.1.526 (Iota) in New York (NY) from December 2020 to April 2021 through phylogeographic analyses, space-time scan statistics, and cartographic visualization. Our results indicate that B.1.526 likely evolved in the Bronx in late 2020, providing an opportunity to gain an initial foothold in the heavily interconnected New York City (NYC) region, as evidenced by numerous exportations to surrounding locations. In contrast, B.1.1.7 became dominant in regions of upstate NY where B.1.526 had limited presence, suggesting that B.1.1.7 was able to spread more efficiently in the absence of B.1.526. Clusters discovered from the space-time scan analysis supported the role of competition between B.1.526 and B.1.1.7 in NYC in March 2021 and the outsized presence of B.1.1.7 in upstate NY in April 2021. Although B.1.526 likely delayed the rise of B.1.1.7 in NYC, B.1.1.7 became the dominant variant in the Metro region by the end of the study period.
These results reveal the advantages endemicity may grant to a variant (founder effect), despite the higher fitness of an introduced lineage. Our research highlights the dynamics of inter-variant competition at a time when B.1.617.2 (Delta) is overtaking B.1.1.7 as the dominant lineage worldwide. We believe our combined spatiotemporal methodologies can disentangle the complexities of shifting SARS-CoV-2 variant landscapes at a time when the evolution of variants with additional fitness advantages is impending.

The emergence of a novel SARS-CoV-2 variant, B.1.1.7 (Alpha), in the United Kingdom (UK) in late 2020 raised alarm worldwide and prompted major reassessment of the management, surveillance, and projected future of COVID-19 (1,2). Evidence of increased transmissibility and potential immune evasion prompted the World Health Organization to designate B.1.1.7 a variant of concern (VOC) in December 2020 (3-6). Increased transmissibility of B.1.1.7 is likely due to several mutations, including N501Y, which confers antibody evasion (7,8) and increases spike protein binding to the host cell (9), and del69-70, which enhances infectivity (10). The emergence of B.1.1.7 and additional novel SARS-CoV-2 variants with competitive advantages has resulted in localized dominance of single variants (11) and raised concern for increases in COVID-19 incidence (12).

medRxiv preprint doi: https://doi.org/10.1101/2021.09.14.21262977; this version posted September 20, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

[...] ensuing months (Figure 4A).
Although sampling biases could have influenced the number of introductions assigned to the Bronx, the Domestic category had greater representation in the dataset but led to substantially fewer introductions (Table S2).

The Finger Lakes and Northern NY were well represented in the dataset (20% and 12% of the data, respectively) but contributed substantially less to the distribution of B.1.1.7 (accounting for 7% and 6% of the total number of introductions, respectively) than Domestic sites, which represented 20% of the data and were responsible for the largest share of introductions (~39%, Table S3). Exchange between NYC and Long Island and between NYC and the Hudson Valley was also frequent, but transmission from these regions to Northern NY, Southwestern NY, and the Finger Lakes was substantially more limited (Figure 4B, Table 3 [...]

[...] December and early January followed by a decline through February, and then a smaller peak in late March and early April followed by a decline through April. The peak incidence in late 2020 and early 2021 represents the second major wave of COVID-19 cases in NYC, and the first [...]

There are several limitations of our study, which primarily reflect the inherent limitations of our genomic surveillance program. A degree of selection bias exists within our dataset given that specimens were screened by cycle threshold value and were submitted by a selected group of clinical and commercial labs that cannot perfectly represent all COVID-19 cases in NY.
We were unable to assess the demographic and clinical representativeness of our dataset because these data were not available to us for many specimens. Additionally, the number of specimens sequenced varied over the space and time of the study period, which created small sample sizes within many ZCTA-months. This limitation extended to the multinomial scan statistic, which was run with estimated values for COVID-19 cases attributable to B.1.1.7 and B.1.526, giving all ZCTAs with samples equal weight. However, the spatial scan assesses data according to their proximity to each other. In this context, ZCTAs are analyzed together rather than individually, which has the potential to reduce bias. Another consequence of our limited sampling was that our data exhibited zero samples from many ZCTAs for each month. We addressed this by using [...]

[...] included. Specimens that were sequenced as a result of pre-screening for specific mutations or clinical/epidemiological criteria were removed from the analysis. In the case of duplicate specimens from the same patient, the earliest collected specimen was included, and all other specimens were excluded from the analysis. Only specimens with the ZIP code of the patient's address available were included.

Monthly COVID-19 case counts by ZIP code were obtained from https://github.com/nychealth/coronavirus-data for NYC, and from the NYSDOH Communicable Disease Electronic Surveillance System for the remainder of NY. Reports with a case status of 'confirmed' or 'probable' were included in the case count. Cases were assigned to a month based on date of diagnosis. All ZIP code data were converted to ZIP code tabulation areas (ZCTAs). Incidence was calculated using ZCTA-level population data from the 2019 1-year American Community Survey estimates.
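
The duplicate-specimen rule and the incidence calculation described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' code: the function names, the dictionary keys (`patient_id`, `collected`), and the per-100,000 scaling are all hypothetical, since the source does not state the rate denominator or the data layout.

```python
from datetime import date


def earliest_specimen_per_patient(specimens):
    """Keep only the earliest-collected specimen for each patient.

    `specimens` is an iterable of dicts with the hypothetical keys
    'patient_id' and 'collected' (a datetime.date).
    """
    earliest = {}
    for specimen in specimens:
        kept = earliest.get(specimen["patient_id"])
        if kept is None or specimen["collected"] < kept["collected"]:
            earliest[specimen["patient_id"]] = specimen
    return list(earliest.values())


def incidence_per_zcta_month(case_counts, population, per=100_000):
    """Compute incidence for each (ZCTA, month) pair.

    case_counts: {(zcta, month): reported cases}
    population:  {zcta: resident population}
    per:         rate scaling; 100,000 is an assumption, not from the source
    """
    incidence = {}
    for (zcta, month), cases in case_counts.items():
        pop = population.get(zcta)
        if not pop:
            continue  # skip ZCTAs with missing or zero population
        incidence[(zcta, month)] = cases * per / pop
    return incidence
```

Keying both inputs on ZCTA rather than ZIP code mirrors the conversion step in the text; the ZIP-to-ZCTA crosswalk itself is left out because the source does not say which one was used.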

We utilized the retrospective multinomial space-time scan statistic in SaTScan version 9.6, using the non-ordinal method (26,31). Estimated SARS-CoV-2 variant data used in the multinomial scan statistic were calculated for each ZCTA-month aggregation by multiplying the proportion of B.1.1.7, B.1.526, or "Other" variants in our sample by the total number of COVID-19 cases.

Maximum spatial and temporal cluster size parameters were set a priori to 10% of the population at risk (24) and one month, respectively. Space-time cluster detection in SaTScan has a noted limitation: the size of a cluster cannot change over time (32,33). Given that our data are aggregated to the temporal unit of months (December 2020 to April 2021), setting the maximum temporal cluster size parameter to one month allows clusters to change their shape from month to month by being designated as "new" clusters.
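
The estimated variant case counts described above (sample proportion multiplied by total reported cases, per ZCTA-month) can be illustrated with a short Python sketch. The function name and input layout are hypothetical, and returning `None` for a ZCTA-month with no sequenced specimens is a placeholder choice made for this sketch only; how the authors handled such ZCTA-months is not recoverable from the text.

```python
def estimate_variant_cases(sample_counts, total_cases):
    """Estimate cases attributable to each variant category in one ZCTA-month.

    sample_counts: {category: number of sequenced specimens}, for example
                   counts for "B.1.1.7", "B.1.526", and "Other"
    total_cases:   total reported COVID-19 cases in the same ZCTA-month
    """
    n_sequenced = sum(sample_counts.values())
    if n_sequenced == 0:
        # No sequenced specimens: a sample proportion cannot be computed.
        return None
    return {
        category: total_cases * count / n_sequenced
        for category, count in sample_counts.items()
    }


# Example with invented numbers: 4 sequenced specimens, 200 reported cases.
estimates = estimate_variant_cases({"B.1.1.7": 2, "B.1.526": 1, "Other": 1}, 200)
```

Because the estimates are proportions of the reported total, they sum back to the total case count for the ZCTA-month, which is the form a multinomial case file needs: one count per category per location and time unit.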

1. Public health actions to control new SARS-CoV-2 variants
2. Genetic Variants of SARS-CoV-2-What Do They Mean? JAMA
3. Genomic characteristics and clinical effect of the emergent SARS-CoV-2 B.1.1.7 lineage in London, UK: a whole-genome sequencing and hospital-based cohort study. The Lancet Infectious Diseases
4. Impact of B.1.1.7 variant mutations on antibody recognition of linear SARS-CoV-2 epitopes
5. [...] Is Associated With Significantly Higher Viral Load in Samples [...] TaqPath Polymerase Chain Reaction. The Journal of Infectious Diseases
6. Sensitivity of infectious SARS-CoV-2 B.1.1.7 and B.1.351 variants to neutralizing antibodies
7. [...] Transformations, Lineage Comparisons, and Analysis of Down to Up Protomer States of Variants of the SARS-CoV-2 Prefusion [...]
8. SARS-CoV-2 501Y.V2 escapes neutralization by South African COVID-19 donor plasma
9. Mutation N501Y in RBD of Spike Protein Strengthens the Interaction between COVID-19 and its Receptor ACE2
10. Recurrent emergence of SARS-CoV-2 spike deletion H69/V70 and its role in the variant of concern lineage B.1.1.7. Cell Reports
11. Transmission of SARS-CoV-2 Lineage B.1.1.7 in England: Insights from linking epidemiological and genetic data
12. Emergence and rapid spread of a new severe acute respiratory syndrome-related coronavirus 2 (SARS-CoV-2) lineage with multiple spike mutations in South Africa
13. The localized rise of a B.1.526 SARS-CoV-2 variant containing an E484K mutation
14. Detection and characterization of the SARS-CoV-2 lineage B.1.526 in New York
15. World Health Organization.
Epidemiological update: Variants of SARS-CoV-2 in the [...]
16. SARS-CoV-2 variants, spike mutations and immune escape
17. Emergence and spread of a SARS-CoV-2 variant through Europe in the summer of 2020
18. Rapid Emergence and Epidemiologic Characteristics of the SARS-CoV-2 B.1.526 Variant - New [...]
19. Introductions and early spread of SARS-CoV-2 in the New York City area
20. Sequencing identifies multiple early introductions of SARS-CoV-2 to the New York City Region. medRxiv
21. Phylodynamic Analysis | 176 genomes
22. The emergence of SARS-CoV-2 in Europe and North America
23. Spatial disease clusters: Detection and inference
24. Rapid surveillance of COVID-19 in the United States using a prospective space-time scan statistic: Detecting and evaluating emerging clusters
25. Real time surveillance of COVID-19 space and time clusters during the summer 2020 in Spain
26. A spatial scan statistic for multinomial data
27. Staphylococcus aureus antimicrobial susceptibility trends and cluster detection in Vermont
28. Spatial analysis of suicide mortality in Québec: Spatial clustering and area factor correlates
29. Early introductions and transmission of SARS-CoV-2 variant B.1.1.7 in the United States
30. Neher R. The virus is under increasing selection pressure
31. Software for the spatial and space-time scan statistics
32. Space-time clusters with flexible shapes. MMWR Suppl
33. A flexibly shaped space-time scan statistic for disease outbreak detection and monitoring
34. A two-dimensional interpolation function for irregularly-spaced data
35. Geographical Analysis of Population: With Applications to Planning and [...]
36. MAFFT multiple sequence alignment software version 7: improvements in performance and usability
37. IQ-TREE: A Fast and Effective Stochastic Algorithm for Estimating Maximum-Likelihood Phylogenies
38. Ultrafast Approximation for Phylogenetic [...] Molecular Biology and Evolution
39. TreeTime: Maximum-likelihood phylodynamic analysis