key: cord-0802098-wabruzfs authors: Gu, Wei; Deng, Xianding; Reyes, Kevin; Hsu, Elaine; Wang, Candace; Sotomayor-Gonzalez, Alicia; Federman, Scot; Bushnell, Brian; Miller, Steve; Chiu, Charles title: Associations of Early COVID-19 Cases in San Francisco with Domestic and International Travel date: 2020-05-21 journal: Clin Infect Dis DOI: 10.1093/cid/ciaa599 sha: 8aed4c667874cdd1cde6d6c254f6dcfbc90a4a5c doc_id: 802098 cord_uid: wabruzfs

In early-to-mid March 2020, 20 of 46 (43%) COVID-19 cases at a tertiary care hospital in San Francisco, California were travel-related. Cases were significantly associated with travel to Europe or New York (odds ratio 32.9). Viral genomes recovered from 9 of 12 (75%) cases co-clustered with lineages circulating in Europe.

As of April 4th, 2020, the COVID-19 pandemic, caused by the novel coronavirus SARS-CoV-2 1, has infected more than 1.2 million people worldwide, and the rise in cases has been exponential. In particular, cases in New York in the United States surged from 22 to >10,000 between March 10 and 22 2. By April 4th there were >150,000 cases in New York and nearby New Jersey, threatening to overwhelm hospitals and other regional health care systems in the city.

In San Francisco, we validated a qRT-PCR test to detect SARS-CoV-2 infection from nasopharyngeal swab samples, based on the EUA (Emergency Use Authorization)-approved US CDC assay 3. During the first 10 days after launch, we performed SARS-CoV-2 testing on 947 samples collected from March 10 through March 20 from patients with suspected SARS-CoV-2 infection at University of California, San Francisco. We reviewed the electronic medical records of the first 46 consecutive SARS-CoV-2 positive cases admitted to University of California, San Francisco hospitals or seen in outpatient clinics from March 10 through March 20.
Data from these COVID-19 patients were matched with 102 randomly selected negative controls, patients who tested negative for SARS-CoV-2 over the same time period. Documented history was recorded by a physician or nurse practitioner and included sick contacts, health care worker status, and travel history. Among the 46 COVID-19 positive patients, the median age was 44 years, 46% were female, and 65% were outpatients (Table S1). We noted that a travel history within 2 weeks of symptom onset (median date Mar 11, 2020) was significantly associated with a positive SARS-CoV-2 test (Table S2, S3). The association with travel may be due to direct exposure to SARS-CoV-2 while in high prevalence regions (e.g. NY) or exposure while traveling (close contact with fellow travelers or airport personnel). One cluster of 3 positive cases associated with COVID-19 infection in an airport worker was categorized as community rather than travel-associated transmission. No significant associations were found with close contact with persons known to have COVID-19 or with frontline healthcare worker status. Patients who had no recent travel history, had no close contact who was COVID-19 positive, and were not frontline healthcare workers were categorized as community transmission with an unknown source of infection; they comprised 39% of cases.

We conducted viral genomic sequencing and phylogenetic analysis of SARS-CoV-2 viruses from the 12 of 20 travelers for whom the breadth of coverage of the viral genome was >90% 4-6. These viral genomes were aligned using MAFFT v7.427 7 with 762 high-coverage viral genomes deposited in the GISAID database 8, 9 as of March 20, 2020, in addition to the most recent viral genomes sequenced in California as of May 3, 2020 4, for a total of 983 sequences. A maximum likelihood phylogenetic tree was constructed using IQ-TREE (version 2) with an HKY substitution model 10 (Figure 2). We defined genomic clades using the GISAID nomenclature in use as of March 20, 2020 8, 9.
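The >90% breadth-of-coverage criterion used above to select genomes for phylogenetic analysis can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names are hypothetical, and the sketch assumes each consensus genome is already in reference coordinates (no insertions) with uncovered positions written as `N` or `-`. The default length of 29,903 bases is that of the Wuhan-Hu-1 reference genome (NC_045512.2), which the paper does not explicitly name as its reference.

```python
# Assumed reference: Wuhan-Hu-1 (GenBank NC_045512.2), 29,903 bases.
REFERENCE_LENGTH = 29_903


def breadth_of_coverage(consensus: str, reference_length: int = REFERENCE_LENGTH) -> float:
    """Return the fraction of reference positions with an unambiguous base call.

    Assumes `consensus` is in reference coordinates (no insertions) and that
    uncovered positions are written as 'N' or '-'.
    """
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    # Count only unambiguous calls; N, gaps, and IUPAC ambiguity codes are skipped.
    called = sum(1 for base in consensus.upper() if base in "ACGT")
    return called / reference_length


def passes_coverage_filter(
    consensus: str,
    reference_length: int = REFERENCE_LENGTH,
    threshold: float = 0.90,
) -> bool:
    """Return True when breadth of coverage is strictly greater than `threshold`."""
    return breadth_of_coverage(consensus, reference_length) > threshold
```

A genome would be kept for alignment and tree building only if `passes_coverage_filter` returns `True`. Counting ambiguity codes as uncovered is a conservative choice made for this sketch.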
The majority (9 of 12) of travel cases clustered in the G clade, as defined by the spike protein D614G variant marker (Figure 2, S1, S2), including 3 cases from Europe (UC40, UC45, UC46), 4 cases from New York (UC27, UC36, UC44, UC47), 1 case from Los Angeles (UC26), and 1 case from Chicago (UC48). Viruses in the G clade comprise most of the genomes sequenced from patients in Europe 8, 9, but notably have also been identified in the vast majority of cases associated with the New York SARS-CoV-2 outbreak in March to April of 2020, which occurred after the timeline of this study 11, 12. Viruses from two additional travel-associated cases, from Europe (UC43) and New York (UC41), were mapped to other clades circulating in Europe (Figure 2). The additional case from Europe was found to be part of the V clade, defined by a G251V mutation in the NS3 protein 8, 9.

Limitations of our study include the use of epidemiological data from only the first 10 days of testing at a single institution. Nevertheless, in the setting of an emergent pandemic with shifting epidemiology, the results of our study reached statistical significance over 4 categories of travel (all travel, New York, USA, and Europe), and yielded data that may have presaged the exponential rise of New York cases and subsequent large-scale outbreak in the New York metropolitan area 11, 12.
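The odds ratios behind the travel associations discussed above come from comparing exposure among cases with exposure among controls. The following is a minimal sketch of that calculation for a single 2x2 table, using the standard Woolf (log-normal) approximation for the 95% confidence interval. The text does not state which statistical method the authors used, so the interval method, the function name, and the parameter names are assumptions made for illustration, and no counts from the study are reproduced here.

```python
import math


def odds_ratio_with_ci(
    exposed_cases: int,
    unexposed_cases: int,
    exposed_controls: int,
    unexposed_controls: int,
    z: float = 1.96,
) -> tuple[float, float, float]:
    """Return (odds ratio, lower bound, upper bound) for a 2x2 case-control table.

    The interval uses Woolf's approximation:
    exp(ln(OR) +/- z * sqrt(1/a + 1/b + 1/c + 1/d)).
    """
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if any(count <= 0 for count in cells):
        # The approximation is undefined when any cell is zero.
        raise ValueError("all four cells must be positive")
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    standard_error = math.sqrt(sum(1 / count for count in cells))
    lower = math.exp(math.log(odds_ratio) - z * standard_error)
    upper = math.exp(math.log(odds_ratio) + z * standard_error)
    return odds_ratio, lower, upper
```

Here "exposed" would mean, for example, travel to a given region within 2 weeks of symptom onset. With small cell counts, as is likely in a 10-day single-center sample, an exact method such as Fisher's exact test may be more appropriate than this approximation.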
Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
Real-Time RT-PCR Panel for Detection 2019-nCoV
A Genomic Survey of SARS-CoV-2 Reveals Multiple Introductions into Northern California without a Predominant Lineage
Metagenomic sequencing with spiked primer enrichment for viral diagnostics and genomic surveillance
Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples
MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform
Data, disease and diplomacy: GISAID's innovative contribution to global health
Nextstrain: real-time tracking of pathogen evolution
IQ-TREE: a fast and effective stochastic algorithm for estimating maximum likelihood phylogenies
Introductions and early spread of SARS-CoV-2 in the New York City area
Sequencing identifies multiple, early introductions of SARS-CoV2 to New York City Region
Cryptic transmission of SARS-CoV-2 in Washington State
Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV2)
Epidemiology of Covid-19 in a Long-Term Care Facility